There was a thread discussing Pluribus's playing style here, but nothing on what it means for online poker, so I thought I would start one so we can each share our thoughts.
Here are some interesting quotes from Noam Brown, the co-creator, who answered an AMA 5 days ago:

We want to make the research accessible to AI researchers, so we're
including detailed descriptions of the algorithms and pseudocode in
the supplementary material, but we won't be releasing the code or
models in part because it would have a serious impact on online poker.

Pluribus, by comparison, uses "$150 worth of compute and runs in real
time on 2 CPUs" using less than 128 GB of memory

The most popular poker sites have advanced bot-detection techniques,
so trying to run a bot online is probably too risky to be worth it.
But I do think this kind of research will have an impact on pro poker.
In particular I think our latest techniques will be adopted by poker
training tools. Those tools are particularly weak right now when
dealing with 3+ player situations. Things like Linear CFR and
Discounted CFR should also allow these tools to compute all solutions
faster than they currently do.

Pluribus does not adapt to the way its opponents play. It treated each
hand that it played against the humans individually and did not carry
over knowledge from one hand to another. It learned to play entirely
through self play.

CFR is not guaranteed to converge to a Nash equilibrium in bridge.
That said, it wasn’t guaranteed to converge to anything useful in
6-player poker either, but it worked fine there.

I think opponent adaptation/exploitation is still a very interesting
AI challenge. I do think that top pros could beat weak players by more
than Pluribus would (though I do think Pluribus would still make a ton
of money off of weak players). The current state of the art for
opponent adaptation is pretty disappointing.

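For anyone curious what the Linear CFR and Discounted CFR mentioned in the quotes above look like in practice, here is a rough sketch of the discounting idea. It is not Pluribus's code (that isn't released), just regret matching in self-play on rock-paper-scissors as a toy stand-in for poker. The names `regret_matching` and `solve_rps` are made up for illustration, and the defaults alpha=1.5, beta=0, gamma=2 are, as far as I know, the values suggested in the Discounted CFR paper; setting all three to 1 should give Linear CFR. The exact order of discounting and adding is my own simplification.

```python
# Toy sketch: regret matching with Discounted-CFR-style weighting.
# Assumption: rock-paper-scissors stands in for a real poker game tree.

# PAYOFF[a][b] = payoff for playing action a against action b
# (0 = rock, 1 = paper, 2 = scissors). The game is symmetric,
# so both players can read their payoff from the same table.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def regret_matching(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)  # no positive regret: uniform


def solve_rps(iterations=10000, alpha=1.5, beta=0.0, gamma=2.0):
    """Return each player's weighted-average strategy after self-play."""
    # Lopsided starting regrets so the players do not begin at equilibrium.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
    strategy_sum = [[0.0] * 3, [0.0] * 3]
    for t in range(1, iterations + 1):
        # Both strategies are fixed before any regrets are updated.
        strategies = [regret_matching(r) for r in regrets]
        pos_disc = t ** alpha / (t ** alpha + 1)   # shrinks positive regrets
        neg_disc = t ** beta / (t ** beta + 1)     # shrinks negative regrets
        avg_disc = (t / (t + 1)) ** gamma          # shrinks old strategy weight
        for p in range(2):
            opp = strategies[1 - p]
            # Expected payoff of each pure action against the opponent's mix.
            values = [sum(PAYOFF[a][b] * opp[b] for b in range(3))
                      for a in range(3)]
            ev = sum(strategies[p][a] * values[a] for a in range(3))
            for a in range(3):
                disc = pos_disc if regrets[p][a] > 0 else neg_disc
                regrets[p][a] = regrets[p][a] * disc + (values[a] - ev)
                strategy_sum[p][a] = (strategy_sum[p][a] * avg_disc
                                      + strategies[p][a])
    return [[s / sum(row) for s in row] for row in strategy_sum]


if __name__ == "__main__":
    for player, strategy in enumerate(solve_rps()):
        print(player, [round(x, 3) for x in strategy])
```

Both average strategies should drift toward the equal rock/paper/scissors mix, which is the equilibrium for this toy game. The point of the discounting is that early, noisy iterations count for less in the final average, which is why solvers using it can get to a good solution in fewer iterations.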
Though the code for the bot is not freely available, the algorithms and pseudocode are published for other AI researchers, so there is a good possibility that it will reach people with the right contacts and be developed to play on the larger sites. Fraud tends to stay 6-12 months ahead of anti-fraud detection, as we saw with the Oborra or some Eastern European PS accounts. I could also imagine a real-time advisor being built on Pluribus's play, since it runs in real time on 2 CPUs and uses only $150 worth of compute. All of this would impact the online poker ecosystem. Are the days of online cash games numbered? And how would you adjust your study habits in the short to medium term?