Facebook AI's ReBeL Takes on Imperfect-Information Games
By John K. Waters
Facebook AI is set to unveil a slew of artificial intelligence (AI) and machine learning (ML) solutions at sessions, spotlight presentations, and workshops during the virtual Conference on Neural Information Processing Systems (NeurIPS), currently underway. From building AI that can teach itself by paraphrasing sample text to creating a single algorithm that can excel at both chess and poker, the organization has a lot to talk about.
The latter announcement might be the most intriguing, because it takes on the challenge of imperfect-information games--poker vs. chess, literally--with an algorithm that "provably converges to a Nash equilibrium in any two-player, zero-sum game," such as poker.
A Nash equilibrium is a game-theory concept: a set of strategies in which no player has an incentive to deviate from their chosen strategy after considering the opponent's choice. It's named after mathematician John Nash, who shared the 1994 Nobel Prize in Economic Sciences for his work on game theory.
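To make the definition concrete, here is a minimal Python sketch that tests whether a pair of strategies is a Nash equilibrium in a small two-player, zero-sum game. It is an illustration only and has nothing to do with ReBeL's actual code; the function names and the choice of matching pennies as the example game are assumptions made for this sketch.

```python
from fractions import Fraction


def expected_payoff(payoff, row_strategy, col_strategy):
    """Row player's expected payoff when both players use mixed strategies."""
    return sum(
        row_strategy[i] * payoff[i][j] * col_strategy[j]
        for i in range(len(payoff))
        for j in range(len(payoff[0]))
    )


def is_nash_equilibrium(payoff, row_strategy, col_strategy):
    """Check that neither player gains by deviating unilaterally.

    payoff[i][j] is what the row player wins (and the column player loses),
    so the row player maximizes it and the column player minimizes it.
    """
    value = expected_payoff(payoff, row_strategy, col_strategy)
    # Best the row player could do with any pure strategy against col_strategy.
    best_row = max(
        sum(payoff[i][j] * col_strategy[j] for j in range(len(payoff[0])))
        for i in range(len(payoff))
    )
    # Best the column player could do (lowest row payoff) against row_strategy.
    best_col = min(
        sum(row_strategy[i] * payoff[i][j] for i in range(len(payoff)))
        for j in range(len(payoff[0]))
    )
    return best_row <= value and best_col >= value


# Matching pennies (hypothetical example game): row wins 1 on a match, loses 1 otherwise.
MATCHING_PENNIES = [[1, -1], [-1, 1]]
HALF = Fraction(1, 2)
uniform = [HALF, HALF]
always_heads = [Fraction(1), Fraction(0)]

print(is_nash_equilibrium(MATCHING_PENNIES, uniform, uniform))
print(is_nash_equilibrium(MATCHING_PENNIES, always_heads, uniform))
```

Checking only pure-strategy deviations is enough here, because a player's expected payoff is linear in their own mixed strategy. When both players flip a fair coin, neither can do better by changing strategy; when the row player always shows heads, the column player can exploit that, so the pair is not an equilibrium.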
Facebook's new Recursive Belief-based Learning (ReBeL) combines reinforcement learning (RL) with search in an algorithm that can work in all two-player zero-sum games, including imperfect-information games.
In game theory, a sequential game such as chess or Go has perfect information because each player knows every event that has previously occurred--they can see the board and the other player's moves. AI has been beating top human players in this space for years. DeepMind's AlphaZero algorithm has achieved state-of-the-art performance in the perfect-information games of Go, chess, and shogi.
Imperfect information, on the other hand, is what you have in, say, a game of Texas hold 'em, where you can't see the other players' cards. You lack some of the information you need to make a truly informed decision. AI has not performed as well in this space so far.
"Unlike previous AIs, ReBeL makes decisions by factoring in the probability distribution of different beliefs each player might have about the current state of the game," Facebook AI's blog states, "which we call a public belief state. For example, ReBeL can assess the chances that its poker opponent thinks it has a pair of aces. ReBeL achieves superhuman performance in heads-up no-limit Texas Hold'em while using far less domain knowledge than any prior poker bot. It extends to other imperfect-information games, as well, such as Liar's Dice, for which we've open-sourced our implementation. ReBeL will help us create more general AI algorithms."
Based on research documented in the paper "Combining Deep Reinforcement Learning and Search for Imperfect-Information Games" (published at NeurIPS 2020), ReBeL also achieves state-of-the-art results in bidding in the game of bridge--which is where things get really interesting, because Facebook AI asserts that this work is more about improving collaboration between humans and AI than about displays of AI superiority.
John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at firstname.lastname@example.org.