To be great at poker you gotta know when to hold 'em, know when to fold 'em, know when to walk away, and know when to core dump. That's only part of the technique a new AI system created by researchers at Carnegie Mellon used to beat four of the "world's best professional poker players" – Dong Kim, Jimmy Chou, Daniel McAulay and Jason Les. The AI played the humans in a 20-day, 120,000-hand Heads-up No-Limit Texas Hold'em binge that happened live on a casino floor in Pittsburgh.
The AI, called Libratus, was up $1,766,250 in chips by the end of the experiment, when it finally beat the four pros in a competition at Rivers Casino. The humans played the AI for 11 hours a day, finishing at 10pm every night, for twenty days, conferring on strategy after each day of play. The AI didn't originally know how to play poker. Instead the researchers told it to try moves at random until, after trillions of hands, it learned a winning strategy.
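That "try things at random and learn from the results" loop can be illustrated with regret matching, a simple relative of the self-play techniques used in poker AIs. This is a hedged sketch on rock-paper-scissors, not Libratus' actual algorithm (which runs counterfactual regret minimization over an astronomically larger game tree): the program tracks, for each action, how much better that action would have done than what it actually played, and shifts its strategy toward the actions it regrets not taking.

```python
import random

random.seed(0)  # reproducible runs for this illustration

ACTIONS = 3  # rock, paper, scissors
# PAYOFF[a][b] = utility of playing action a against action b
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / ACTIONS] * ACTIONS  # no regret yet: play uniformly at random

def sample(strategy):
    """Draw one action from a probability distribution over actions."""
    r, cum = random.random(), 0.0
    for a, p in enumerate(strategy):
        cum += p
        if r < cum:
            return a
    return ACTIONS - 1

def train(iterations=20000):
    """Two copies of the learner play each other; return player 0's average strategy."""
    regrets = [[0.0] * ACTIONS for _ in range(2)]
    strategy_sums = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strats = [strategy_from_regrets(r) for r in regrets]
        moves = [sample(s) for s in strats]
        for player in (0, 1):
            opp = moves[1 - player]
            actual = PAYOFF[moves[player]][opp]
            for a in range(ACTIONS):
                # Accumulate how much better action a would have done than the
                # action actually played this round.
                regrets[player][a] += PAYOFF[a][opp] - actual
                strategy_sums[player][a] += strats[player][a]
    total = sum(strategy_sums[0])
    return [s / total for s in strategy_sums[0]]

avg = train()
print([round(p, 2) for p in avg])  # the average strategy approaches the
                                   # equilibrium mix of roughly 1/3 each
```

The average strategy, not the latest one, is what converges toward the game-theoretic optimum; that distinction between current play and accumulated average is a hallmark of this family of algorithms.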
“The best AI’s ability to do strategic reasoning with imperfect information has now surpassed that of the best humans,” said Tuomas Sandholm, professor of computer science and co-creator of the AI.
The AI didn’t win any money but the humans split a $200,000 pot based on their performance. After all, the computer only needed electricity and 600 compute nodes on the Pittsburgh Supercomputing Center’s 846-node Bridges supercomputer, where it powered through hands at 1.35 petaflops. McAulay, one of the human players, said Libratus was a tougher opponent than he expected.
“Whenever you play a top player at poker, you learn from it,” he said.
The humans worked together to figure out the AI’s weaknesses even as the AI learned about its own faults – and how to bluff.
“The computer can’t win at poker if it can’t bluff,” said Frank Pfenning, head of the CMU Computer Science Department. “Developing an AI that can do that successfully is a tremendous step forward scientifically and has numerous applications. Imagine that your smartphone will someday be able to negotiate the best price on a new car for you. That’s just the beginning.”
He sees the AI as a step forward that can be applied in “any realm in which information is incomplete and opponents sow misinformation.”
The AI also “fixed” its strategy daily, assessing where it failed in the previous day’s competition.
“After play ended each day, a meta-algorithm analyzed what holes the pros had identified and exploited in Libratus’ strategy,” said Sandholm. “It then prioritized the holes and algorithmically patched the top three using the supercomputer each night. This is very different than how learning has been used in the past in poker. Typically researchers develop algorithms that try to exploit the opponent’s weaknesses. In contrast, here the daily improvement is about algorithmically fixing holes in our own strategy.”
The research that led to Libratus can be used to expand research into automated negotiations and even complex biological and engineering problems. In the end the AI was trained to solve a complex problem full of incomplete information, not simply drub four professional poker players.
“CMU played a pivotal role in developing both computer chess, which eventually beat the human world champion, and Watson, the AI that beat top human Jeopardy! competitors,” said Pfenning. “It has been very exciting to watch the progress of poker-playing programs that have finally surpassed the best human players. Each one of these accomplishments represents a major milestone in our understanding of intelligence.”