This is PLO

Talking about the Application of Artificial Intelligence in Texas Hold’em


Posted in Limit Hold 'Em


In 2016, AlphaGo defeated Lee Sedol, one of the strongest human Go players, and opened up a new era of artificial intelligence. Texas Hold'em is the archetypal game of strategy and bluffing, so it is also a focus of artificial intelligence research. In the field of Texas Hold'em there are many different AIs. They can be classified along two dimensions: by goal (competing against and defeating humans, or teaching humans) and by technical route (reinforcement learning, or big data mining plus adaptive technology). Together these give the three types described below.

The first type: Texas Hold'em AI whose goal is to defeat humans

Two well-known AIs have defeated human professional poker players in heads-up (1v1) Texas Hold'em: DeepStack and Libratus. At the time of writing, AI has not yet conquered the multiplayer Texas Hold'em table.

Libratus, for example, learns through self-play (reinforcement learning in the broad sense): it plays an enormous number of games against itself, keeps the strategies that perform best, and avoids learning from established human patterns, eventually approximating a Nash equilibrium strategy (also known as GTO).
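To make the self-play idea concrete, here is a minimal sketch. It is not Libratus's actual algorithm. It uses regret matching, a standard self-play technique, on a tiny hypothetical 2x2 zero-sum game. The payoff matrix and all function names are assumptions made purely for illustration. Two copies of the same learner play each other, and their average strategies drift toward the Nash equilibrium.

```python
# Payoffs to the row player in a hypothetical 2x2 zero-sum game
# (the column player receives the negative of each entry).
PAYOFF = [[2.0, -1.0],
          [-1.0, 1.0]]


def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to a uniform strategy.
    return [1.0 / len(regrets)] * len(regrets)


def self_play(iterations=20000):
    """Let two regret-matching players face each other; return average strategies."""
    n_rows, n_cols = len(PAYOFF), len(PAYOFF[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        row = strategy_from_regrets(row_regret)
        col = strategy_from_regrets(col_regret)
        # Expected payoff of each pure action against the opponent's current mix.
        row_values = [sum(PAYOFF[i][j] * col[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [-sum(PAYOFF[i][j] * row[i] for i in range(n_rows))
                      for j in range(n_cols)]
        row_ev = sum(row[i] * row_values[i] for i in range(n_rows))
        col_ev = sum(col[j] * col_values[j] for j in range(n_cols))
        # Accumulate regret: how much better each action would have done.
        for i in range(n_rows):
            row_regret[i] += row_values[i] - row_ev
            row_sum[i] += row[i]
        for j in range(n_cols):
            col_regret[j] += col_values[j] - col_ev
            col_sum[j] += col[j]
    return ([s / iterations for s in row_sum],
            [s / iterations for s in col_sum])


if __name__ == "__main__":
    row_avg, col_avg = self_play()
    print("row average strategy:", row_avg)
    print("column average strategy:", col_avg)
```

For this particular matrix the equilibrium can be worked out by hand: each player should choose the first action 40% of the time. The average strategies returned by `self_play` should land close to that mix, without any human examples being fed in.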

Of course, there are also many lesser-known AIs playing 1v1 online. In real play, multiplayer tables are far more complicated, and the amount of data an AI can collect for any one situation is very small. Beating humans at the multiplayer Texas Hold'em table therefore remains an open problem worldwide. At present, this type of AI is still in the experimental stage. As for how well it works, judge for yourself.

The second type: AI that teaches people defensive GTO play

The core of this kind of AI is to discover a set of optimal strategies on its own (through reinforcement learning) and then teach them to humans. The best known in the field are PokerSnowie and PioSOLVER.

Take Snowie as an example. Its principle is similar to that of Libratus: it generates an optimal strategy (GTO) through reinforcement learning and teaches this strategy to players. After many years of iteration, it has earned a certain degree of trust among high-level Texas Hold'em players. It is one of the best GTO learning tools and the most widely used Texas Hold'em training AI. For those who want to use this type of AI, I have some reminders:

It is very useful for experts. Many experts use it and PioSOLVER to analyze hands, sometimes comparing the two, to test ideas or look for inspiration;
The robot's range and a real opponent's range are bound to differ. As a result, the "best strategy" Snowie teaches you is usually not the best strategy, or even a good one, once you copy it into real play, and Snowie does not teach you how to adjust. If you practice only this, you will merely become a "GTO dog": at best you may avoid losing, but you will not beat the rake and your living costs;
For players who are not top experts, finding and exploiting opponents' weaknesses, where the profit is greatest, is far more efficient and realistic than spending a lot of effort patching their own subtle leaks;
Anyone who has analyzed hands with PioSOLVER knows that computing a strategy takes a long time. The human brain is not a machine: it cannot memorize that many strategies, and it cannot compute them during real play.
To sum up (definitions of GTO differ from person to person; here I use Nash equilibrium, or balance, to frame the issue), my opinion is that the stronger the opponent, the more you need to consider balance, and the weaker the opponent, the more you need to play exploitatively. Against weak opponents, balance is useless and can even be misleading and harmful. In the same way, Snowie is useful for top players but harmful, misleading, and confusing for low- and mid-level players. As in the fable of the pony crossing the river, you have to find out for yourself what suits you; everything else is noise.
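The trade-off between balance and exploitation can be shown with simple river-bluff arithmetic. The sketch below is only an illustration: the pot size, bet size, and fold frequencies are made-up numbers, and the function names are hypothetical. A pure bluff wins the pot when the opponent folds and loses the bet when called, so its expected value is `fold_freq * pot - (1 - fold_freq) * bet`.

```python
def bluff_ev(pot, bet, fold_freq):
    """Expected value of a pure bluff: win the pot on a fold, lose the bet on a call."""
    return fold_freq * pot - (1.0 - fold_freq) * bet


def break_even_fold_freq(pot, bet):
    """Fold frequency at which a bluff earns exactly zero: bet / (pot + bet)."""
    return bet / (pot + bet)


if __name__ == "__main__":
    pot, bet = 100.0, 50.0  # hypothetical sizes, in chips

    # A balanced defender folds just often enough to make bluffs break even.
    balanced = break_even_fold_freq(pot, bet)
    print("balanced defender fold frequency:", balanced)
    print("bluff EV vs balanced defender:", bluff_ev(pot, bet, balanced))

    # A weak opponent who folds too often (assumed 60% here) can be exploited.
    print("bluff EV vs over-folder:", bluff_ev(pot, bet, 0.6))
```

Against the balanced defender the bluff earns nothing, so how often you bluff hardly matters. Against the over-folder every bluff makes roughly 40 chips on these assumed numbers, which is why adjusting to a weak opponent pays far more than staying balanced.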

The third type: training AI based on adaptive technology

The core of this kind of AI is personalized training: the algorithm dynamically generates and matches training content and difficulty for each trainee. This technology is widely used in educational AI, because it gives students the best learning experience. In Texas Hold'em training, the only such product in the world is Master of Spades. Accounting for the differences between students makes the engineering difficult and the workload large. It is a bit like Luo Chuan: the better the experience for users, the more pain for the makers.
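As a rough sketch of what "dynamically matching difficulty" could look like, the example below keeps an Elo-style skill estimate for a trainee and picks the drill whose predicted success rate is closest to a target. This is not how Master of Spades actually works; the rating scale, the 70% target, the drill names, and every function here are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Drill:
    name: str
    difficulty: float  # hypothetical rating, on the same scale as trainee skill


def expected_success(skill, difficulty):
    """Elo-style estimate of the chance the trainee solves a drill."""
    return 1.0 / (1.0 + 10 ** ((difficulty - skill) / 400.0))


def update_skill(skill, difficulty, solved, k=32.0):
    """Move the skill estimate toward the observed result."""
    outcome = 1.0 if solved else 0.0
    return skill + k * (outcome - expected_success(skill, difficulty))


def pick_drill(skill, drills, target=0.7):
    """Choose the drill whose predicted success rate is closest to the target."""
    return min(drills, key=lambda d: abs(expected_success(skill, d.difficulty) - target))


if __name__ == "__main__":
    drills = [
        Drill("preflop basics", 800.0),
        Drill("continuation bets", 1000.0),
        Drill("river bluff-catching", 1200.0),
    ]
    skill = 1000.0
    for solved in [True, True, False, True]:  # made-up training results
        drill = pick_drill(skill, drills)
        skill = update_skill(skill, drill.difficulty, solved)
        print(drill.name, round(skill, 1))
```

The target success rate is the main design knob: set it too high and the trainee is never stretched, too low and the training becomes discouraging. A per-trainee estimate like this is what lets the content differ from one student to the next.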

The philosophy of Master of Spades is very different from that of AI built on reinforcement learning. It is human-centered and takes the following factors into account:

It is based on the natural laws of human learning, and follows the principles of deliberate practice, the zone of proximal development theory, and gamified teaching;
Texas Hold'em is broad and deep. Teaching newcomers advanced skills directly, or handing them one ultimate trick, does not work. Learning has to proceed step by step. It is therefore necessary to teach students according to their aptitude, personalize the training, and dynamically adjust the training content and difficulty;
What is worth teaching should not be decided by the product makers alone. It comes from the players and is improved based on player feedback.
Its main mechanisms:

Generation of training content: Using big data analysis and AI clustering, the massive history of real hands played by real players online is classified by player, by scene, and by action tree. This identifies the scenes where strong and weak players differ most in win rate (and which therefore have the highest teaching value) and produces preliminary evaluation criteria. A human teaching and research team then reviews, supplements, and refines these into training content, which is continuously improved based on player feedback.
Training push: Player training data is used to improve the push algorithm and the evaluation algorithm, and scenes with poor teaching results are continuously reviewed and filtered out based on big data analysis.
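The scene-selection step described above can be sketched roughly as follows. This is only an assumed simplification, not the product's real pipeline: the record format, the "strong"/"weak" tier labels, the scene names, and the results (in big blinds) are all hypothetical. The idea is to group hands by scene, average the results for strong and weak players separately, and rank scenes by the gap.

```python
from collections import defaultdict


def rank_scenes(records, min_hands=2):
    """Rank scenes by the win-rate gap between strong and weak players.

    records: iterable of (scene, tier, result) tuples, where tier is
    "strong" or "weak" and result is the hand outcome in big blinds.
    Scenes with fewer than min_hands samples in either tier are skipped.
    """
    totals = defaultdict(lambda: {"strong": [0.0, 0], "weak": [0.0, 0]})
    for scene, tier, result in records:
        bucket = totals[scene][tier]
        bucket[0] += result  # running sum of results
        bucket[1] += 1       # number of hands
    gaps = []
    for scene, tiers in totals.items():
        strong_sum, strong_n = tiers["strong"]
        weak_sum, weak_n = tiers["weak"]
        if strong_n < min_hands or weak_n < min_hands:
            continue  # too little data to trust the comparison
        gaps.append((scene, strong_sum / strong_n - weak_sum / weak_n))
    # Largest gap first: these scenes separate strong from weak players most.
    return sorted(gaps, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    sample = [  # made-up hand results
        ("3-bet pot, out of position", "strong", 4.0),
        ("3-bet pot, out of position", "strong", 2.0),
        ("3-bet pot, out of position", "weak", -6.0),
        ("3-bet pot, out of position", "weak", -4.0),
        ("single-raised pot, button", "strong", 1.0),
        ("single-raised pot, button", "strong", 1.0),
        ("single-raised pot, button", "weak", 0.0),
        ("single-raised pot, button", "weak", 1.0),
    ]
    for scene, gap in rank_scenes(sample):
        print(scene, gap)
```

In this toy data the 3-bet pot scene shows the larger gap, so it would be the first candidate handed to the human teaching team for review. The minimum-sample filter reflects the point made earlier about sparse data: a gap measured on a handful of hands is not worth teaching from.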
Master of Spades differs from the AIs that teach GTO/optimal strategies in two main ways:

Master of Spades teaches targeted, exploitative play. After all, efficiency matters most, and efficiency comes from exploiting opponents' weaknesses, especially those of weak opponents.
The training also covers the parts of the balanced/optimal strategy that are relatively clear-cut.
So how should you choose a training AI? Top players are advised to use PokerSnowie and PioSOLVER. For mid- and low-level players, Master of Spades is the clear choice. If you are not sure where you stand, play on Master of Spades and look at your overall level score and its hand reviews, and you will soon have a good idea.
