I created an adaptor so that this poker-bot works with dev.fun's poker-arena-starter-kit. Anyone already familiar with its CLI and tooling can then use this bot's strategies as well.
- Clone dev.fun's poker-arena-starter-kit and this repository into the same directory:

      mkdir poker
      cd poker
      git clone https://github.com/devfun-org/poker-arena-starter-kit
      git clone https://github.com/gin/poker-bot.git

- Copy the adaptor file from this repo's poker-arena-starter-kit-adaptor directory into the starter-kit repo's examples directory:

      cp poker-bot/poker-arena-starter-kit-adaptor/poker_bot_strategy_adaptor.py poker-arena-starter-kit/examples/
- Add my repository as a dependency in poker-arena-starter-kit/pyproject.toml. For reference, look at ./poker-arena-starter-kit-adaptor/pyproject.toml.
  - At the bottom, add these lines:

        [tool.uv.sources]
        poker-bot = { path = "../../poker/poker-bot", editable = true }

  - In dependencies, add "poker-bot" like this:

        dependencies = [
            "httpx>=0.27",
            "python-dotenv>=1.0",
            "treys>=0.1.8",
            "pokerkit>=0.5",
            "poker-bot",
        ]
- To run one of this bot's strategies, go to the poker-arena-starter-kit directory and prepend POKER_BOT_STRATEGY=<strategy_name> to poker-arena-starter-kit's ./pokerkit command, passing the adaptor file as the agent:

      cd poker-arena-starter-kit
      POKER_BOT_STRATEGY=royal_adaptive ./pokerkit run --agent poker_bot_strategy_adaptor.py --hands 1000 --players 6

  The strategies are at src/poker_bot/strategies/. The filename is the strategy name.
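Because the filename is the strategy name, the valid values for POKER_BOT_STRATEGY can be listed by scanning the strategies directory. The following is a minimal sketch, assuming strategies are plain .py files and that files starting with an underscore (such as __init__.py) are not strategies. The helper name list_strategy_names is hypothetical and not part of this repo.

```python
from pathlib import Path


def list_strategy_names(strategies_dir: str) -> list[str]:
    """Return strategy names, i.e. the stems of the .py files in a directory."""
    directory = Path(strategies_dir)
    return sorted(
        path.stem
        for path in directory.glob("*.py")
        # Assumption: underscore-prefixed files (e.g. __init__.py) are not strategies.
        if not path.name.startswith("_")
    )


if __name__ == "__main__":
    # Run from the root of the poker-bot repo.
    for name in list_strategy_names("src/poker_bot/strategies"):
        print(name)
```

Any name it prints should be usable as the value of POKER_BOT_STRATEGY, provided the adaptor resolves strategies by filename as described above.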
- Create an account at https://dev.fun:

      curl --json '{"handle": "YOUR_HANDLE", "name": "YOUR_NAME", "quote": "BIO_DESCRIPTION"}' https://arena.dev.fun/api/arena/auth/register

- Copy .arena-credentials.example to .arena-credentials
- Update the API key, API prefix, competition ID, and Agent ID in .arena-credentials
- Sync dependencies:

      uv sync

- Run the tests to ensure there are no errors:

      uv run pytest

- Run the bot:

      uv run main.py
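For anyone who prefers to register from Python rather than curl, the registration call above can be sketched with the standard library. This is only an illustration: the function name build_register_request is hypothetical, the headers mirror what curl's --json flag sends, and the shape of the server's response is not documented here, so the sketch only builds the request and leaves sending it commented out.

```python
import json
import urllib.request

REGISTER_URL = "https://arena.dev.fun/api/arena/auth/register"


def build_register_request(handle: str, name: str, quote: str) -> urllib.request.Request:
    """Build the same POST request that the curl --json command sends."""
    payload = json.dumps({"handle": handle, "name": name, "quote": quote}).encode("utf-8")
    return urllib.request.Request(
        REGISTER_URL,
        data=payload,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_register_request("YOUR_HANDLE", "YOUR_NAME", "BIO_DESCRIPTION")
    print(request.get_method(), request.full_url)
    # Uncomment to actually register (this creates an account):
    # with urllib.request.urlopen(request) as response:
    #     print(response.read().decode("utf-8"))
```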
uv run pytest
# Linting and formatting
uv run ruff check .
# Type check
uv run ty check

Run the terminal poker simulator against the bot logic:
# Human play against a bot using baseline strategy
uv run simulator.py
# Human play against a bot using strategy in src/poker_bot/strategies/all_in_everytime.py
uv run simulator --strat all_in_everytime

I benchmarked a series of strategies against the simple strategy, where each new strategy is an iteration of the one before it. The results suggest that a poker-playing strategy can be self-improved in a closed loop with the instruction "create a new strategy based on the previous one but ensure the new strategy consistently beats the ones that came before it". The all_in_everytime and monte_carlo strategies serve as edge cases, and the simple strategy (src/poker_bot/strategies/simple.py) is the baseline. The total cost of this self-improvement experiment, including agentically competing with other bots on the dev.fun arena until we lost, was $0.68 (12.6M tokens using DeepSeek V4 Flash via OpenRouter):
# bb/100 self-improvement progression starting at adaptive strategy
simple +2.9 | ██
all_in_everytime -123.2 | ██████████████████████████████████████████████████ -
monte_carlo -15.0 | ██████ -
--------------------------------------------------------------------------------
adaptive +18.5 | ███████
counter_adaptive +27.4 | ███████████
profiled_counter_adaptive +27.4 | ███████████
threshold_pressure +28.0 | ███████████
anti_threshold +42.9 | █████████████████
# Ranking by bb/100
1. anti_threshold +42.9
2. threshold_pressure +28.0
3. profiled_counter_adaptive +27.4
4. counter_adaptive +27.4
5. adaptive +18.5
6. simple +2.9
7. monte_carlo -15.0
8. all_in_everytime -123.2
# Net Chips
simple +71,650 | █
all_in_everytime -3,079,650 | ██████████████████████████████████████████████████ -
monte_carlo -375 | -
adaptive +462,325 | ███████
counter_adaptive +684,890 | ███████████
profiled_counter_adaptive +684,890 | ███████████
threshold_pressure +700,450 | ███████████
anti_threshold +1,072,698 | █████████████████
# Win Rate
simple 50.1% | █████████████████████████
all_in_everytime 89.6% | █████████████████████████████████████████████
monte_carlo 40.0% | ████████████████████
adaptive 55.2% | ████████████████████████████
counter_adaptive 61.0% | ██████████████████████████████
profiled_counter_adaptive 61.0% | ██████████████████████████████
threshold_pressure 61.0% | ██████████████████████████████
anti_threshold 67.1% | ██████████████████████████████████
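The bb/100 and win-rate figures in these charts appear to follow directly from the raw counts that selfplay prints. The sketch below shows the arithmetic under two assumptions that are inferred from the reported numbers rather than stated in this repo: the big blind in selfplay is 50 chips, and the win rate is wins divided by total hands (pushes included in the denominator). The function names are illustrative only.

```python
def bb_per_100(net_chips: float, hands: int, big_blind: float) -> float:
    """Big blinds won (or lost) per 100 hands."""
    return net_chips / big_blind / hands * 100


def win_rate(wins: int, hands: int) -> float:
    """Percentage of hands won, counting pushes as hands played."""
    return wins / hands * 100


if __name__ == "__main__":
    # Figures from the `simple` selfplay run; big_blind=50 is an inferred assumption.
    print(f"bb/100  : {bb_per_100(71650, 50000, 50):+.1f}")
    print(f"win rate: {win_rate(25068, 50000):.1f}%")
```

With the simple run's figures this gives +2.9 bb/100 and a 50.1% win rate, matching the charts. Note that the monte_carlo run uses 50 hands rather than 50,000, so its figures are far noisier than the others.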
Recreate the benchmark data with:
$ uv run selfplay --strat simple --hands 50000 --seed 1
hands : 50000
opponent : simple x1
wins/losses : 25068/24878 (push: 54)
net chips : +71650
bb/100 : +2.9
elapsed : 0.6s (78792 hands/s)
$ uv run selfplay --strat all_in_everytime --hands 50000 --seed 1
hands : 50000
opponent : simple x1
wins/losses : 44776/5091 (push: 133)
net chips : -3079650
bb/100 : -123.2
elapsed : 1.3s (39025 hands/s)
$ uv run selfplay --strat monte_carlo --hands 50 --seed 1
hands : 50
opponent : simple x1
wins/losses : 20/30 (push: 0)
net chips : -375
bb/100 : -15.0
elapsed : 1.3s (38 hands/s)
$ uv run selfplay --strat adaptive --hands 50000 --seed 1
hands : 50000
opponent : simple x1
wins/losses : 27602/22358 (push: 40)
net chips : +462325
bb/100 : +18.5
elapsed : 0.9s (58645 hands/s)
$ uv run selfplay --strat counter_adaptive --hands 50000 --seed 1
hands : 50000
opponent : simple x1
wins/losses : 30477/19487 (push: 36)
net chips : +684890
bb/100 : +27.4
elapsed : 0.8s (62677 hands/s)
$ uv run selfplay --strat profiled_counter_adaptive --hands 50000 --seed 1
hands : 50000
opponent : simple x1
wins/losses : 30477/19487 (push: 36)
net chips : +684890
bb/100 : +27.4
elapsed : 0.9s (57313 hands/s)
$ uv run selfplay --strat threshold_pressure --hands 50000 --seed 1
hands : 50000
opponent : simple x1
wins/losses : 30477/19487 (push: 36)
net chips : +700450
bb/100 : +28.0
elapsed : 0.8s (61176 hands/s)
$ uv run selfplay --strat anti_threshold --hands 50000 --seed 1
hands : 50000
opponent : simple x1
wins/losses : 33567/16380 (push: 53)
net chips : +1072698
bb/100 : +42.9
elapsed : 1.1s (45679 hands/s)

Or use the benchmark command to run a strategy against a bunch of different strategies:
$ uv run benchmark --strat survival_balanced --opponents simple,all_in_everytime,adaptive,royal_flush --hands 50000 --seed 1
benchmark : survival_balanced
cases : 8
elapsed : 46.9s
opponent players seeds hands net bb/100 chips/hand W/L/P
------------------------------------------------------------------------------
simple 2 1 50000 +110075 +22.0 +2.2 28863/21076/61
simple 6 1 50000 +106471 +21.3 +2.1 7639/12327/30034
all_in_everytime 2 1 50000 +2248445 +449.7 +45.0 5379/44472/149
all_in_everytime 6 1 50000 +2863065 +572.6 +57.3 1231/18587/30182
royal_flush 2 1 50000 +41793 +8.4 +0.8 24678/25246/76
royal_flush 6 1 50000 +31327 +6.3 +0.6 7734/12244/30022
adaptive 2 1 50000 -5257 -1.1 -0.1 29560/20414/26
adaptive 6 1 50000 -117079 -23.4 -2.3 6626/14965/28409

Append --profile to the benchmark command:
$ uv run benchmark --strat survival_balanced --opponents simple --players 2 --hands 50000 --profile
benchmark : survival_balanced
cases : 12
elapsed : 58.3s
opponent players seeds hands net bb/100 chips/hand W/L/P
------------------------------------------------------------------------------
simple 2 6 300000 +390110 +13.0 +1.3 172209/127430/361
profile : tight/measured
VPIP 19.1% (tight)
PFR 10.1% (passive)
AF 14.34 (aggressive)
3-BET% 17.4% (aggressive)
WTSD 24.0% (tight)
W$SD 48.4% (balanced)
BLUFF 0.0% (tight)
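The profile uses common poker statistics. As a rough guide, the sketch below computes three of them (VPIP, PFR, and AF) using their conventional definitions. This is not taken from this repo's code: the ActionCounts fields are hypothetical, the exact denominators used by the benchmark command may differ, and the thresholds behind labels such as "tight" or "passive" are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class ActionCounts:
    """Hypothetical per-player counters collected over a run."""
    hands: int       # hands dealt to the player
    vpip_hands: int  # hands where the player voluntarily put chips in preflop
    pfr_hands: int   # hands where the player raised preflop
    bets: int        # postflop bets
    raises: int      # postflop raises
    calls: int       # postflop calls


def vpip(c: ActionCounts) -> float:
    """Voluntarily Put money In Pot, as a percentage of hands."""
    return 100 * c.vpip_hands / c.hands


def pfr(c: ActionCounts) -> float:
    """Preflop Raise, as a percentage of hands."""
    return 100 * c.pfr_hands / c.hands


def aggression_factor(c: ActionCounts) -> float:
    """(bets + raises) / calls; infinite if the player was aggressive but never called."""
    aggressive = c.bets + c.raises
    if c.calls == 0:
        return float("inf") if aggressive else 0.0
    return aggressive / c.calls


if __name__ == "__main__":
    # Made-up counts for illustration only, not benchmark output.
    counts = ActionCounts(hands=200, vpip_hands=40, pfr_hands=20, bets=30, raises=10, calls=20)
    print(f"VPIP {vpip(counts):.1f}%  PFR {pfr(counts):.1f}%  AF {aggression_factor(counts):.2f}")
```

Under these definitions, a high AF alongside a low VPIP, as in the output above, would describe a player who enters few pots but rarely just calls once involved.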
Thank you fielding (fielding. in Discord), T (drt_thea_35041 in Discord), and sayurnara (sayurnara123 in Discord) for the detailed analysis of my bot's last hand.
Onwards to S2 Tournament.
Use my referral link to enter the tournament: https://arena.dev.fun/i/r-tiluigi-ddb2db95
Entry is free if you enter your bot in the S2 Playground before the S2 Tournament starts.