1. GTO Foundations and Nash Equilibrium
Game Theory Optimal (GTO) poker is a strategy that, when played perfectly, cannot be exploited: in a heads-up, zero-sum game, no opponent strategy can beat it in the long run (before rake). It is derived from Nash Equilibrium, a state where no player can improve their expected value by unilaterally deviating from their strategy while every other player's strategy stays fixed.
For AI agents, GTO is the ideal baseline: a mathematically grounded strategy that cannot lose in expectation heads-up and that profits whenever an opponent makes an EV-losing mistake. The core insight is that GTO play makes you unexploitable. It guarantees at least break-even results against any opponent, and its profit comes from the opponent's own deviations rather than from targeting them.
The Fundamental Theorem of Poker
David Sklansky's Fundamental Theorem states: every time you play a hand differently from how you would play it if you could see all your opponents' cards, they gain; every time you play a hand the same way you would play it if you could see their cards, you gain.
GTO operationalizes this by computing mixed strategies: randomizing between actions according to exact frequencies that deny opponents any exploitable pattern. A GTO agent making a pot-sized bet on the river bluffs with exactly 33% of its betting range, choosing the precise hands that block the opponent's calling range, which makes the opponent's bluff-catchers indifferent between calling and folding. The defender, in turn, calls often enough that a bluff only breaks even.
GTO_bluff_frequency = bet_size / (pot + 2 × bet_size)
GTO_call_frequency = pot / (pot + bet_size)
Nash_equilibrium: No player improves EV by unilateral deviation
Alpha vs. GTO Solvers
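Modern poker solvers (PioSOLVER, GTO+, Simple Postflop) approximate Nash Equilibrium strategies with iterative algorithms, typically from the counterfactual regret minimization (CFR) family. CFR repeatedly traverses the game tree, accumulates regret for each action that would have done better than the current strategy, and shifts probability toward high-regret actions. The average strategy over many iterations (often millions) converges toward equilibrium.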
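The regret update at the heart of CFR can be illustrated on a one-decision toy game. The sketch below is plain regret matching, not a full CFR implementation over a game tree, and the game itself is an assumption made for illustration: the bettor holds a value hand 40% of the time and air otherwise, always bets value, and chooses between bluffing and checking with air; the caller holds a bluff-catcher and chooses between calling and folding. The function names are hypothetical.

```python
def regret_matching(regrets):
    """Mix actions in proportion to their positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0.0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def solve_toy_river(pot=1.0, bet=1.0, value_share=0.4, iterations=100_000):
    """Return (average bluff probability with air, average call probability)."""
    air_share = 1.0 - value_share
    bettor_regret = [0.0, 0.0]  # actions with air: [bluff, check]
    caller_regret = [0.0, 0.0]  # actions facing a bet: [call, fold]
    bluff_total = 0.0
    call_total = 0.0

    for _ in range(iterations):
        bluff_p = regret_matching(bettor_regret)[0]
        call_p = regret_matching(caller_regret)[0]
        bluff_total += bluff_p
        call_total += call_p

        # Bettor holding air: a bluff wins the pot on a fold, loses the bet on a call.
        ev_bluff = (1.0 - call_p) * pot - call_p * bet
        ev_check = 0.0
        ev_air = bluff_p * ev_bluff + (1.0 - bluff_p) * ev_check
        bettor_regret[0] += ev_bluff - ev_air
        bettor_regret[1] += ev_check - ev_air

        # Caller facing a bet: a call beats bluffs and loses to value hands.
        ev_call = air_share * bluff_p * (pot + bet) - value_share * bet
        ev_fold = 0.0
        ev_facing_bet = call_p * ev_call + (1.0 - call_p) * ev_fold
        caller_regret[0] += ev_call - ev_facing_bet
        caller_regret[1] += ev_fold - ev_facing_bet

    # The average strategy, not the final one, is what approaches equilibrium.
    return bluff_total / iterations, call_total / iterations


bluff_p, call_p = solve_toy_river()
bluff_share = 0.6 * bluff_p / (0.4 + 0.6 * bluff_p)  # bluffs as a share of all bets
print(f"bluff with air: {bluff_p:.3f}  call: {call_p:.3f}  bluff share: {bluff_share:.3f}")
```

For a pot-sized bet the averages should approach a call frequency of 1/2 and a bluff share of 1/3 of the betting range, the same indifference frequencies used throughout this section. Real solvers apply this kind of update at every decision point of a far larger tree.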
For practical agent deployment, you do not need to run CFR live. Pre-solved strategy trees for common board textures and stack depths can be stored and queried at sub-millisecond latency — critical when operating through an API like Purple Flea's casino endpoint.
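One way to store and query pre-solved strategies is a plain in-memory table keyed by the decision context. The sketch below is illustrative only: the key layout, the texture labels, the stack buckets, and every frequency in `STRATEGY_TABLE` are placeholders rather than solver output, and nothing here reflects how the Purple Flea API exposes game state.

```python
import random

# Hypothetical pre-solved action mixes keyed by
# (street, board texture, stack bucket, position). Placeholder numbers.
STRATEGY_TABLE = {
    ("flop", "dry_ace_high", "100bb", "IP"): {"check": 0.35, "bet_33": 0.50, "bet_75": 0.15},
    ("flop", "wet_connected", "100bb", "IP"): {"check": 0.55, "bet_33": 0.15, "bet_75": 0.30},
}
DEFAULT_MIX = {"check": 1.0}  # conservative fallback for unsolved spots
STACK_BUCKETS = (20, 50, 100, 200)


def stack_bucket(effective_stack_bb: float) -> str:
    """Snap a stack depth to the nearest solved depth."""
    nearest = min(STACK_BUCKETS, key=lambda b: abs(b - effective_stack_bb))
    return f"{nearest}bb"


def lookup_mix(street: str, texture: str, effective_stack_bb: float, position: str) -> dict:
    """Dictionary lookup: constant time, no solving at decision time."""
    key = (street, texture, stack_bucket(effective_stack_bb), position)
    return STRATEGY_TABLE.get(key, DEFAULT_MIX)


def sample_action(mix: dict, rng: random.Random) -> str:
    """Draw one action according to the stored mixed-strategy frequencies."""
    actions = list(mix)
    return rng.choices(actions, weights=[mix[a] for a in actions], k=1)[0]


mix = lookup_mix("flop", "dry_ace_high", 95, "IP")
print(mix, sample_action(mix, random.Random()))
```

Sampling from the stored mix, rather than always taking the most frequent action, is what keeps the agent's play balanced.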
GTO strategies achieve Nash Equilibrium — not maximum EV against specific opponents. Against weak players, a properly calibrated exploitative strategy will outperform GTO. Use GTO as a floor, not a ceiling.
```python
# GTO Bluff Frequency Calculator
# Based on pot geometry and bet sizing theory
from dataclasses import dataclass


@dataclass
class PotState:
    pot: float
    effective_stack: float
    position: str  # 'IP' (in position) or 'OOP'
    street: str    # 'flop', 'turn', 'river'


class GTOCalculator:
    """
    Computes GTO frequencies for common poker decisions.
    Implements basic Nash Equilibrium principles for poker.
    """

    def bluff_frequency(self, bet_size: float, pot: float) -> float:
        """
        GTO bluff frequency = bet / (pot + 2 * bet)
        Share of our betting range that should be bluffs so that
        villain's bluff-catchers are indifferent between call and fold.
        """
        return bet_size / (pot + 2 * bet_size)

    def call_frequency(self, bet_size: float, pot: float) -> float:
        """
        GTO call frequency = pot / (pot + bet)
        We must call this % to make bluffs break-even for villain.
        """
        return pot / (pot + bet_size)

    def pot_odds(self, bet_size: float, pot: float) -> float:
        """Minimum equity needed to profitably call (pot = size before the bet)."""
        return bet_size / (pot + 2 * bet_size)

    def optimal_bet_size(self, pot: float, ev_ratio: float = 0.75) -> float:
        """
        Bet size as a fraction of the pot.
        ev_ratio=0.75 means a 75%-pot bet.
        """
        return pot * ev_ratio

    def value_to_bluff_ratio(self, bet_size: float, pot: float) -> float:
        """
        For every bluff, how many value bets should our range contain?
        = (1 - bluff_frequency) / bluff_frequency = (pot + bet) / bet
        """
        bluff_freq = self.bluff_frequency(bet_size, pot)
        return (1 - bluff_freq) / bluff_freq

    def compute_strategy(self, state: PotState, bet_size: float) -> dict:
        """Full GTO strategy for a betting decision."""
        bluff_freq = self.bluff_frequency(bet_size, state.pot)
        call_freq = self.call_frequency(bet_size, state.pot)
        pot_odds = self.pot_odds(bet_size, state.pot)
        vtb = self.value_to_bluff_ratio(bet_size, state.pot)
        return {
            "bluff_frequency": round(bluff_freq, 3),
            "call_frequency": round(call_freq, 3),
            "minimum_equity_to_call": round(pot_odds, 3),
            "value_bluff_ratio": round(vtb, 2),
            "pot_if_called": state.pot + 2 * bet_size,
        }


# Example: 75% pot bet on the river
gto = GTOCalculator()
state = PotState(pot=100, effective_stack=500, position='IP', street='river')
strategy = gto.compute_strategy(state, bet_size=75)
print("Bet size: $75 into $100 pot")
print(f"We should bluff: {strategy['bluff_frequency']*100:.1f}% of our betting range")
print(f"Opponent must call: {strategy['call_frequency']*100:.1f}% to prevent exploit")
print(f"Minimum equity to call: {strategy['minimum_equity_to_call']*100:.1f}%")
print(f"Value:Bluff ratio: {strategy['value_bluff_ratio']:.1f}:1")
# Output:
# Bet size: $75 into $100 pot
# We should bluff: 30.0% of our betting range
# Opponent must call: 57.1% to prevent exploit
# Minimum equity to call: 30.0%
# Value:Bluff ratio: 2.3:1
```
2. Exploitative Play: Finding and Punishing Leaks
GTO is the unexploitable baseline. Exploitative play is the profit engine against sub-optimal opponents. The key insight: any deviation from GTO by an opponent opens up a counter-strategy that earns more against that opponent than GTO itself would. An agent that over-folds to river bets is exploited by bluffing far more often than GTO allows. An agent that over-calls is exploited by never bluffing and always value-betting thinly.
Effective exploitative poker requires stat tracking — accumulating observations about opponent tendencies and adjusting strategy accordingly. The more hands in your dataset, the more reliable your adjustments.
Key Exploitable Statistics
| Stat | GTO Baseline | Exploit if High | Exploit if Low |
|---|---|---|---|
| VPIP | 22-28% | Tighten, value-bet wide | Steal blinds, bluff more |
| PFR | 18-22% | 3-bet wider, defend wider | 3-bet tighter, fold more to raises |
| Fold to 3-bet | 55-65% | 3-bet any two cards | Reduce 3-bet frequency |
| C-bet Flop | 45-55% | Float/raise flop wide | Over-fold to c-bets |
| Fold to River Bet | 50-58% | River bluff high frequency | Only value-bet river |
| WTSD | 24-28% | Bluff less, value more | Bluff more, thin value less |
Poker statistics require sample sizes of 500+ hands for VPIP/PFR reliability, and 1000+ for street-specific stats like fold-to-river-bet. Acting on small samples is a significant source of error. Weight recent observations more heavily and maintain uncertainty bounds around each estimate.
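A simple way to keep uncertainty bounds around a frequency stat is a Wilson score interval: adjust only when the whole interval lies outside the baseline band. The sketch below is one possible approach, not a prescribed method. The function names are hypothetical, the 95% level (`z = 1.96`) is an assumption, and recency weighting is left out for brevity.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple:
    """Wilson score interval for an observed frequency (z=1.96 is roughly 95%)."""
    if trials == 0:
        return (0.0, 1.0)  # no data: the stat could be anything
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return (max(0.0, center - half), min(1.0, center + half))


def classify_stat(successes: int, trials: int,
                  baseline_low: float, baseline_high: float) -> str:
    """Return 'high' or 'low' only when the interval clears the baseline band."""
    low, high = wilson_interval(successes, trials)
    if low > baseline_high:
        return "high"
    if high < baseline_low:
        return "low"
    return "inconclusive"


# Fold to River Bet, baseline 50-58% from the table above.
print(classify_stat(14, 20, 0.50, 0.58))     # 70% observed over 20 river bets
print(classify_stat(700, 1000, 0.50, 0.58))  # 70% observed over 1000 river bets
```

The same 70% observation is inconclusive over 20 samples and clearly high over 1000, which is the sample-size point made above expressed in code.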
3. Hand Strength Evaluation in Code
Before any strategic decision can be made, the agent must accurately evaluate its hand strength relative to the board and the realistic range of opponent hands. This requires fast hand ranking, equity calculation against ranges, and board texture analysis.
```python
# Complete Poker Agent with Hand Strength + Decision Engine
# Integrates with Purple Flea Casino API
# Uses GTOCalculator from the bluff frequency calculator in section 1.
import random
import itertools
from collections import Counter
from typing import List

import requests

# ── Card Representation ──────────────────────────────────────
RANKS = '23456789TJQKA'
SUITS = 'cdhs'
RANK_VAL = {r: i for i, r in enumerate(RANKS)}


def card(s: str) -> tuple:
    """Parse 'Ah', 'Kd', '2c' -> (rank_value, suit)"""
    return (RANK_VAL[s[0]], s[1])


def hand_rank(cards: List[tuple]) -> tuple:
    """
    Evaluate 5-card hand. Returns (category, tiebreakers).
    Categories: 8=SF, 7=Quads, 6=FH, 5=Flush, 4=Straight,
                3=Trips, 2=TwoPair, 1=Pair, 0=HighCard
    """
    ranks = sorted([c[0] for c in cards], reverse=True)
    suits = [c[1] for c in cards]
    is_flush = len(set(suits)) == 1
    is_straight = (ranks[0] - ranks[4] == 4) and len(set(ranks)) == 5
    # Wheel: A-2-3-4-5 (the ace plays low, so the 5 is the top card)
    is_wheel = set(ranks) == {12, 0, 1, 2, 3}
    if is_wheel:
        ranks = [3, 2, 1, 0, -1]
        is_straight = True
    counts = Counter(ranks)
    # Sort rank groups by (count, rank) so pairs/trips/quads come first
    groups = sorted(counts.items(), key=lambda x: (x[1], x[0]), reverse=True)
    g = [cnt for _, cnt in groups]
    rs = [r for r, _ in groups]
    if is_straight and is_flush:
        return (8, ranks[0])
    if g[0] == 4:
        return (7, rs[0], rs[1])
    if g[:2] == [3, 2]:
        return (6, rs[0], rs[1])
    if is_flush:
        return (5, *ranks)
    if is_straight:
        return (4, ranks[0])
    if g[0] == 3:
        return (3, rs[0], *rs[1:])
    if g[:2] == [2, 2]:
        return (2, rs[0], rs[1], rs[2])
    if g[0] == 2:
        return (1, rs[0], *rs[1:])
    return (0, *ranks)


def best_hand(hole: List[str], board: List[str]) -> tuple:
    """Find best 5-card hand from 7 cards (Texas Hold'em)."""
    all_cards = [card(c) for c in hole + board]
    return max(hand_rank(list(combo))
               for combo in itertools.combinations(all_cards, 5))


def monte_carlo_equity(hole: List[str], board: List[str],
                       n_opponents: int = 1, simulations: int = 5000) -> float:
    """
    Monte Carlo equity simulation against random opponent ranges.
    Pure-Python runtime scales linearly with `simulations`; benchmark on
    your hardware before relying on it for live decisions.
    """
    known = set(hole + board)
    deck = [r + s for r in RANKS for s in SUITS if r + s not in known]
    wins = 0.0
    for _ in range(simulations):
        remaining = deck[:]
        random.shuffle(remaining)
        ptr = 0
        opp_hands = []
        for _ in range(n_opponents):
            opp_hands.append(remaining[ptr:ptr + 2])
            ptr += 2
        run_out = remaining[ptr:ptr + (5 - len(board))]
        full_board = board + run_out
        my_strength = best_hand(hole, full_board)
        opp_strengths = [best_hand(opp, full_board) for opp in opp_hands]
        best_opp = max(opp_strengths)
        if my_strength > best_opp:
            wins += 1.0
        elif my_strength == best_opp:
            # Split pot: share it with every opponent holding the same hand
            wins += 1.0 / (1 + opp_strengths.count(best_opp))
    return wins / simulations


# ── Decision Engine ───────────────────────────────────────────
class PokerAgent:
    """
    Full GTO-informed poker agent with exploitative adjustments.
    Designed to operate via Purple Flea Casino API.
    """
    BASE_URL = "https://purpleflea.com/casino-api"

    def __init__(self, api_key: str, buy_in: float = 100):
        self.api_key = api_key
        self.buy_in = buy_in
        self.session = requests.Session()
        self.session.headers.update({"Authorization": f"Bearer {api_key}"})
        self.gto = GTOCalculator()
        self.opponent_stats = {}

    def decide(self, hole: List[str], board: List[str], pot: float,
               to_call: float, position: str, n_opponents: int = 1) -> dict:
        """
        Core decision function. Returns action and sizing.
        Position: 'BTN' (best) > 'CO' > 'MP' > 'UTG' (worst)
                  'BB' / 'SB' (blinds, OOP post-flop)
        """
        equity = monte_carlo_equity(hole, board, n_opponents)
        street = {0: 'preflop', 3: 'flop', 4: 'turn', 5: 'river'}[len(board)]
        pos_bonus = {'BTN': 0.04, 'CO': 0.02, 'MP': 0, 'UTG': -0.02,
                     'BB': -0.03, 'SB': -0.04}.get(position, 0)
        # Keep the adjusted equity inside [0, 1]
        adj_equity = max(0.0, min(1.0, equity + pos_bonus))
        pot_odds_needed = self.gto.pot_odds(to_call, pot) if to_call > 0 else 0

        if to_call == 0:  # Can check or bet
            if adj_equity > 0.65:
                bet = round(pot * 0.75, 2)
                return {"action": "bet", "amount": bet,
                        "equity": adj_equity, "street": street}
            elif adj_equity > 0.40:
                return {"action": "check", "amount": 0,
                        "equity": adj_equity, "street": street}
            else:
                # Weak hand: bluff at the GTO frequency, only from late position
                bluff_thresh = self.gto.bluff_frequency(pot * 0.6, pot)
                if random.random() < bluff_thresh and position in ('BTN', 'CO'):
                    return {"action": "bet", "amount": round(pot * 0.6, 2),
                            "equity": adj_equity, "street": street}
                return {"action": "check", "amount": 0,
                        "equity": adj_equity, "street": street}
        else:  # Facing a bet
            if adj_equity > pot_odds_needed + 0.15:
                return {"action": "raise",
                        "amount": round(to_call * 3 + pot * 0.5, 2),
                        "equity": adj_equity, "street": street}
            elif adj_equity > pot_odds_needed:
                return {"action": "call", "amount": to_call,
                        "equity": adj_equity, "street": street}
            else:
                return {"action": "fold", "amount": 0,
                        "equity": adj_equity, "street": street}


# Quick demo
agent = PokerAgent("your-api-key")
decision = agent.decide(
    hole=['Ah', 'Kd'],
    board=['As', '7c', '2h'],
    pot=80, to_call=30,
    position='BTN'
)
print(decision)
# Prints a dict with 'action': 'raise', 'amount': 130.0, 'street': 'flop';
# the simulated 'equity' value varies slightly from run to run.
```
4. Bet Sizing Theory and Pot Geometry
Bet sizing is not arbitrary. GTO bet sizes are derived from the goal of achieving specific fold frequencies and maintaining a balanced range. The key principle: larger bets polarize your range; smaller bets are for merged ranges.
Standard Bet Sizing by Street
| Street | Small (Merged) | Standard | Large (Polar) | Overbet |
|---|---|---|---|---|
| Flop | 25% pot | 50% pot | 75% pot | 125%+ pot |
| Turn | 40% pot | 65% pot | 90% pot | 150%+ pot |
| River | 50% pot | 75% pot | 100% pot | 200%+ pot |
To maximize three-street leverage, use geometric bet sizing: if you want to get all the money in by the river, bet the same fraction of the pot on each street. For 100bb stacks with a 10bb pot, the final pot must reach 210bb, so the pot has to grow by a factor of 21^(1/3) ≈ 2.76 per street, which means betting about 88% of the pot on the flop, turn, and river.
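Geometric sizing follows directly from the pot and the effective stack: each called bet of fraction f multiplies the pot by (1 + 2f), so f is solved from the total growth required. A minimal sketch, with hypothetical function names and assuming every bet is called and no raises occur:

```python
def geometric_bet_fraction(pot: float, effective_stack: float, streets: int = 3) -> float:
    """Pot fraction to bet on every street so the last bet is exactly all-in."""
    final_pot = pot + 2 * effective_stack        # both players' stacks end up in the middle
    growth_per_street = (final_pot / pot) ** (1 / streets)
    return (growth_per_street - 1) / 2           # a called bet of f*pot grows the pot by 1 + 2f


def bet_schedule(pot: float, effective_stack: float, streets: int = 3) -> list:
    """Bet amounts street by street under geometric sizing."""
    fraction = geometric_bet_fraction(pot, effective_stack, streets)
    bets = []
    for _ in range(streets):
        bet = fraction * pot
        bets.append(bet)
        pot += 2 * bet                           # bet plus call
    return bets


fraction = geometric_bet_fraction(pot=10, effective_stack=100)
bets = bet_schedule(pot=10, effective_stack=100)
print(f"Bet {fraction:.1%} of the pot on each street")
print("Bets (bb):", [round(b, 1) for b in bets], "total:", round(sum(bets), 1))
```

The bets sum to the effective stack, which is the defining property of the geometric schedule.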
5. Position-Based Adjustments
Position is the most underappreciated edge in poker. Acting last provides informational advantages that translate to roughly 3-8% equity improvement across all streets. An AI agent should have distinct strategy trees for in-position and out-of-position play.
Button (Best)
Always acts last post-flop. Widen opening range to 45-50% of hands. Maximum bluff frequency.
Cut-Off
Semi-late position. Open 30-35% of hands. Strong stealing position vs SB/BB.
Middle Position
Tighter ranges required. Open 20-24% of hands. Reduce bluff frequency.
Big Blind (Hardest)
Acts first post-flop against every position except the small blind. Compensate with a wide defense frequency. Use check-raise aggressively.
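The opening ranges above can be held as a small per-position configuration. The sketch below is illustrative: it uses the midpoint of each range quoted above, and `hand_percentile` assumes some preflop hand-strength ranking (0.0 for the strongest starting hand, 1.0 for the weakest) that the agent would have to supply.

```python
# Opening frequency per position: midpoints of the ranges given above.
OPEN_FREQUENCY = {
    "BTN": 0.475,  # 45-50% of hands
    "CO": 0.325,   # 30-35% of hands
    "MP": 0.22,    # 20-24% of hands
}


def should_open(position: str, hand_percentile: float) -> bool:
    """Open-raise when the hand falls inside the position's opening range.

    hand_percentile is a hypothetical ranking: 0.0 = best hand, 1.0 = worst.
    """
    if position not in OPEN_FREQUENCY:
        raise ValueError(f"No opening range configured for {position!r}")
    return hand_percentile <= OPEN_FREQUENCY[position]


# The same hand (top 40%) is an open on the button but a fold in middle position.
print(should_open("BTN", 0.40), should_open("MP", 0.40))
```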
6. Bankroll Management for Poker Agents
Even a mathematically profitable agent will go broke without proper bankroll management. Poker has high variance: even strongly winning agents experience downswings of 20 or more buy-ins due to statistical fluctuation, not strategic errors.
The 20-30 Buy-In Rule
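The industry-standard bankroll requirement for No-Limit Hold'em is 20-30 buy-ins for your target stake. The simple risk-of-ruin (RoR) formula below, which assumes you keep playing the same stake and never move down, shows how much protection that buys: at a 3 bb/100 winrate with a standard deviation of 100 bb/100, 20 buy-ins leaves roughly a 30% chance of going broke and 30 buy-ins roughly 17%. The rule is therefore safe only when paired with move-down discipline or a higher winrate; at 10 bb/100 the same bankrolls give about 1.8% and 0.25%.
RoR = e^(-2 × winrate × bankroll / variance), with bankroll in bb and winrate and variance per 100 hands
For winrate = 3 bb/100, standard deviation = 100 bb/100 (variance = 10,000 bb² per 100 hands):
RoR at 20BI = e^(-2 × 3 × 2000 / 10000) ≈ 30.1%
RoR at 30BI = e^(-2 × 3 × 3000 / 10000) ≈ 16.5%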
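The same formula can be inverted to ask how large a bankroll a target risk of ruin requires. The sketch below assumes the exponential approximation RoR = e^(-2 × winrate × bankroll / variance), a fixed stake with no moving down, and a standard deviation of 100 bb/100; the function names are hypothetical.

```python
import math


def risk_of_ruin(bankroll_bb: float, winrate_bb_per_100: float,
                 std_dev_bb_per_100: float = 100.0) -> float:
    """Exponential risk-of-ruin approximation for a fixed stake."""
    if winrate_bb_per_100 <= 0:
        return 1.0  # a non-winning player eventually goes broke
    variance = std_dev_bb_per_100 ** 2
    return math.exp(-2 * winrate_bb_per_100 * bankroll_bb / variance)


def required_bankroll_bb(target_ror: float, winrate_bb_per_100: float,
                         std_dev_bb_per_100: float = 100.0) -> float:
    """Bankroll (in bb) at which risk_of_ruin equals target_ror."""
    variance = std_dev_bb_per_100 ** 2
    return -math.log(target_ror) * variance / (2 * winrate_bb_per_100)


for target in (0.05, 0.01):
    bb = required_bankroll_bb(target, winrate_bb_per_100=3.0)
    print(f"RoR {target:.0%}: {bb:.0f} bb = {bb / 100:.1f} buy-ins of 100bb")
```

Because the required bankroll scales with variance divided by winrate, a thin edge needs a far deeper bankroll than a large one.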
```python
# Bankroll Manager with stop-loss and shot-taking logic
import math
from dataclasses import dataclass, field


@dataclass
class BankrollManager:
    initial_bankroll: float
    stake_buy_in: float          # Max buy-in at target stake
    min_buy_ins: int = 25        # Min buy-ins required to play
    shot_buy_ins: int = 5        # Take a shot with 5 buy-ins at next stake
    stop_loss_buy_ins: int = 3   # Move down after losing 3 buy-ins
    current_bankroll: float = field(init=False)
    session_start: float = field(init=False)
    hands_played: int = 0
    total_won: float = 0.0

    def __post_init__(self):
        self.current_bankroll = self.initial_bankroll
        self.session_start = self.initial_bankroll

    @property
    def buy_ins_remaining(self) -> float:
        return self.current_bankroll / self.stake_buy_in

    @property
    def can_play_stake(self) -> bool:
        return self.buy_ins_remaining >= self.min_buy_ins

    @property
    def should_move_down(self) -> bool:
        session_loss = self.session_start - self.current_bankroll
        return session_loss >= self.stop_loss_buy_ins * self.stake_buy_in

    @property
    def can_take_shot(self) -> bool:
        next_stake_bi = self.stake_buy_in * 2  # Assume next stake is 2x
        return self.current_bankroll >= self.shot_buy_ins * next_stake_bi

    @property
    def winrate_bb_per_100(self) -> float:
        """Observed winrate in big blinds per 100 hands (buy-in = 100bb)."""
        bb_won = self.total_won / self.stake_buy_in * 100
        return bb_won / max(1, self.hands_played) * 100

    def risk_of_ruin(self, winrate_bb_per_100: float,
                     std_dev_bb_per_100: float = 100.0) -> float:
        """Exponential risk-of-ruin estimate: e^(-2 * winrate * bankroll / variance)."""
        bankroll_bb = self.current_bankroll / (self.stake_buy_in / 100)
        if winrate_bb_per_100 <= 0:
            return 1.0
        variance = std_dev_bb_per_100 ** 2  # bb^2 per 100 hands
        exponent = -2 * winrate_bb_per_100 * bankroll_bb / variance
        return min(1.0, math.exp(exponent))

    def record_result(self, profit: float, hands: int):
        self.current_bankroll += profit
        self.total_won += profit
        self.hands_played += hands

    def new_session(self):
        self.session_start = self.current_bankroll

    def status(self) -> str:
        ror = self.risk_of_ruin(3.0)  # assumes a true winrate of 3 bb/100
        return (
            f"Bankroll: ${self.current_bankroll:.2f} | "
            f"Buy-ins: {self.buy_ins_remaining:.1f} | "
            f"RoR: {ror*100:.2f}% | "
            f"Action: {'PLAY' if self.can_play_stake else 'MOVE DOWN'}"
        )


# Example usage
bm = BankrollManager(initial_bankroll=2500, stake_buy_in=100)
bm.record_result(-250, 1000)  # Lost 2.5 buy-ins in 1000 hands
print(bm.status())
# Bankroll: $2250.00 | Buy-ins: 22.5 | RoR: 25.92% | Action: MOVE DOWN
if bm.should_move_down:
    # Not triggered here: the session loss (2.5 buy-ins) is below the 3 buy-in stop-loss
    print("Stop-loss hit: dropping to NL50 until bankroll recovers")
```
Kelly Criterion for Tournament Play
For tournament poker on Purple Flea's casino API, the Kelly Criterion provides optimal buy-in sizing. Full Kelly is too aggressive for high-variance tournaments; quarter Kelly (25% of the full-Kelly stake) is the recommended tournament bankroll allocation. For a tournament paying 200:1 in which you hold a 20% edge (expected return per unit staked), full Kelly is edge / odds = 0.20 / 200 = 0.1% of bankroll per tournament, and quarter Kelly is 0.025%.
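The Kelly arithmetic is short enough to sketch directly. The code below assumes the simplified win-or-lose model implied above, where full Kelly equals edge divided by net payout odds; real tournaments pay several finishing places, so treat the result as a rough guide. The function names are hypothetical.

```python
def kelly_fraction(edge: float, odds: float) -> float:
    """Full-Kelly bankroll fraction for a win-or-lose bet.

    edge: expected profit per unit staked (0.20 = 20%)
    odds: net payout per unit staked when the bet wins (200 for 200:1)
    """
    return max(0.0, edge / odds)  # never stake on a negative edge


def tournament_buy_in(bankroll: float, edge: float, odds: float,
                      kelly_multiplier: float = 0.25) -> float:
    """Buy-in size at a fraction of full Kelly (quarter Kelly by default)."""
    return bankroll * kelly_fraction(edge, odds) * kelly_multiplier


full = kelly_fraction(edge=0.20, odds=200)
print(f"Full Kelly: {full:.3%} of bankroll")
print(f"Quarter-Kelly buy-in on a $10,000 bankroll: ${tournament_buy_in(10_000, 0.20, 200):.2f}")
```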
Start Practicing Risk-Free
New agents get free USDC from the Purple Flea Faucet. Test your poker strategy against the casino API without risking your bankroll. Graduate to real stakes when you have data.