Kelly Criterion for AI Trading Agents: Optimal Position Sizing

Kelly Formula Derivation

The Kelly Criterion answers a deceptively simple question: given a bet with known probability and payout, what fraction of your capital should you risk to maximize long-run growth? John Kelly derived it at Bell Labs in 1956, originally in the context of information theory. It has since become foundational to professional trading and gambling strategy.

Starting from First Principles

Suppose you start with capital W and you place fraction f on a trade. If the trade wins (with probability p), you gain f·b·W where b is the net profit per unit staked. If it loses (with probability q = 1-p), you lose f·W.

After n trades, if you won np of them, your capital is:

W_n = W · (1 + f·b)^(np) · (1 − f)^(nq)

Taking logarithms and dividing by n gives the expected log-growth rate per trade: G(f) = (1/n)·ln(W_n / W) = p·ln(1 + f·b) + q·ln(1 − f)

To maximize G(f), we take the derivative with respect to f and set it to zero:

dG/df = p·b/(1 + f·b) − q/(1 − f) = 0

Solving for f:

f* = (p·b − q) / b = p − q/b

where f* = optimal fraction | b = net odds (profit per $ risked) | p = win probability | q = loss probability (1-p)
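
As a quick sanity check on the derivation, the sketch below compares the closed-form f* = p − q/b with a brute-force search over G(f). The inputs (55% win rate, even odds) and the helper names are illustrative assumptions, not values from this article.

```python
import math


def growth_rate(f: float, p: float, b: float) -> float:
    """Expected log-growth per bet: G(f) = p*ln(1 + f*b) + q*ln(1 - f)."""
    q = 1.0 - p
    return p * math.log(1.0 + f * b) + q * math.log(1.0 - f)


def kelly_closed_form(p: float, b: float) -> float:
    """Closed-form optimum f* = p - q/b."""
    return p - (1.0 - p) / b


# Illustrative inputs (assumed): 55% win rate, even-money payout
p, b = 0.55, 1.0
f_star = kelly_closed_form(p, b)

# Brute-force search over f = 0.000, 0.001, ..., 0.989
grid = [i / 1000 for i in range(990)]
f_best = max(grid, key=lambda f: growth_rate(f, p, b))

print(f"closed form: {f_star:.3f}, grid search: {f_best:.3f}")
```

Both approaches should land on the same fraction, and betting twice that fraction gives a visibly lower growth rate, which is why overbetting is the more dangerous error.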

Continuous Kelly for Trading

In trading, returns are continuous rather than binary. For continuous returns, the Kelly fraction is the expected return divided by its variance:

f* = μ / σ²

where μ = expected return per period (in excess of the risk-free rate) | σ² = variance of returns per period

Because the Sharpe ratio is S = μ/σ, this is equivalent to f* = S/σ: full Kelly allocation equals the Sharpe ratio divided by volatility, so it is aggressive for high-Sharpe strategies and conservative for low-Sharpe ones.

      python
import numpy as np
from typing import Optional

def continuous_kelly(returns: np.ndarray) -> float:
    """
    Calculate Kelly fraction from a series of returns.

    Args:
        returns: Array of period returns (e.g., daily P&L as fraction of capital)

    Returns:
        Kelly fraction (optimal allocation as fraction of portfolio)
    """
    mu = np.mean(returns)
    sigma_sq = np.var(returns)

    if sigma_sq == 0 or mu <= 0:
        return 0.0

    return mu / sigma_sq


def discrete_kelly(win_prob: float, win_return: float, loss_return: float) -> float:
    """
    Kelly for discrete outcome bets (useful for options, binary events).

    Args:
        win_prob: Probability of winning scenario (0-1)
        win_return: Return if win (e.g., 0.5 = 50% gain)
        loss_return: Return if loss (e.g., -0.3 = 30% loss, pass as negative)

    Returns:
        Kelly fraction
    """
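    lose_prob = 1.0 - win_prob
    b = win_return  # net profit per $ staked on a win
    # loss per $ staked, as a positive number
    loss = abs(loss_return)

    if b <= 0 or loss == 0:
        # The formula divides by b * loss, so both must be non-zero
        raise ValueError("win_return must be positive and loss_return non-zero")

    # Generalized Kelly: f* = (p*b - q*loss) / (b * loss)
    f_star = (win_prob * b - lose_prob * loss) / (b * loss)
    return max(f_star, 0.0)  # can't go short with Kelly alone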


# Example: strategy wins 58% of time, +2% on wins, -1.5% on losses
kelly = discrete_kelly(
    win_prob=0.58,
    win_return=0.02,
    loss_return=-0.015
)
print(f"Kelly fraction: {kelly:.1%}")  # 1766.7%, i.e. about 17.7x capital: cap it in practice
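
The discrete and continuous formulas should roughly agree when per-trade returns are small. The sketch below checks this for the example strategy above (58% win rate, +2% wins, -1.5% losses) by computing the mean and variance of the two-outcome return distribution directly. The function names are illustrative and the formulas are restated so the snippet runs on its own.

```python
def generalized_kelly(p: float, win: float, loss: float) -> float:
    """Discrete Kelly with partial losses: f* = (p*win - q*loss) / (win*loss)."""
    q = 1.0 - p
    return (p * win - q * loss) / (win * loss)


def two_outcome_mean_var(p: float, win: float, loss: float) -> tuple:
    """Mean and variance of a return that is +win with prob p, else -loss."""
    q = 1.0 - p
    mean = p * win - q * loss
    second_moment = p * win**2 + q * loss**2
    return mean, second_moment - mean**2


p, win, loss = 0.58, 0.02, 0.015

f_discrete = generalized_kelly(p, win, loss)
mu, var = two_outcome_mean_var(p, win, loss)
f_continuous = mu / var  # continuous approximation f* = mu / sigma^2

print(f"discrete Kelly:   {f_discrete:.2f}x capital")
print(f"continuous Kelly: {f_continuous:.2f}x capital")
```

Both results come out near 17.7x capital. With per-trade returns of only a few percent, the raw Kelly fraction is a leverage multiple rather than a percentage below 100%, so a hard position cap is needed before it is used for sizing.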
    

Estimating Edge from Backtesting

The Kelly formula is only as good as your edge estimate. Overestimating edge leads to overbetting; underestimating leads to leaving money on the table. Here is a rigorous approach to edge estimation from backtesting data.
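
One way to see how uncertain a Kelly estimate is, before trusting it, is to bootstrap the trade returns and look at the spread of μ/σ² across resamples. The sketch below is a minimal illustration on synthetic returns; the function names, the 90% interval, and the synthetic data parameters are all assumptions made for the example.

```python
import random


def kelly_from_returns(returns: list) -> float:
    """Raw mu / sigma^2 (may be negative, which signals no edge)."""
    n = len(returns)
    mu = sum(returns) / n
    var = sum((r - mu) ** 2 for r in returns) / n
    return mu / var if var > 0 else 0.0


def bootstrap_kelly_interval(returns, n_boot=1000, alpha=0.10, seed=0):
    """Percentile bootstrap interval for the Kelly estimate."""
    rng = random.Random(seed)
    n = len(returns)
    estimates = sorted(
        kelly_from_returns(rng.choices(returns, k=n)) for _ in range(n_boot)
    )
    lo = estimates[int((alpha / 2) * n_boot)]
    hi = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Synthetic per-trade returns (assumed): mean 0.1%, standard deviation 2%
data_rng = random.Random(42)
returns = [data_rng.gauss(0.001, 0.02) for _ in range(500)]

point = kelly_from_returns(returns)
lo, hi = bootstrap_kelly_interval(returns)
print(f"Kelly estimate: {point:.2f}, 90% interval: [{lo:.2f}, {hi:.2f}]")
```

If the interval is wide or includes zero, the backtest does not pin down the edge well, which argues for a small Kelly multiplier or for collecting more trades.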

Step 1: Define Your Signal

A signal is any quantifiable indicator that predicts trade direction. It could be a technical indicator, a momentum factor, a cross-exchange spread, or a machine learning prediction. The signal needs to be defined before backtesting begins to avoid look-ahead bias.
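
The backtest code in this article calls `fit(train)` and `predict(history, params)` on the signal object. The class below is a hypothetical example of that interface, not a recommended strategy: it goes long when the latest close is above its recent average and short when below, and `fit` only chooses the lookback. It reads prices via `data["close"]`, so it should work with a pandas DataFrame or a plain dict of lists.

```python
from statistics import fmean


class MomentumSignal:
    """Hypothetical signal: compare the latest close with its recent average."""

    def __init__(self, lookbacks=(5, 10, 20)):
        self.lookbacks = lookbacks

    def fit(self, train) -> dict:
        """Pick the lookback with the best summed next-bar return on `train`."""
        closes = list(train["close"])
        best_lookback, best_score = self.lookbacks[0], float("-inf")
        for lb in self.lookbacks:
            score = 0.0
            for i in range(lb, len(closes) - 1):
                direction = 1 if closes[i] > fmean(closes[i - lb:i]) else -1
                score += direction * (closes[i + 1] - closes[i]) / closes[i]
            if score > best_score:
                best_lookback, best_score = lb, score
        return {"lookback": best_lookback}

    def predict(self, history, params: dict) -> int:
        """Return +1, -1, or 0 using only the rows in `history`."""
        closes = list(history["close"])
        lb = params["lookback"]
        if len(closes) <= lb:
            return 0  # not enough history yet, so no trade
        average = fmean(closes[-lb - 1:-1])  # the lb closes before the latest
        if closes[-1] > average:
            return 1
        if closes[-1] < average:
            return -1
        return 0


signal = MomentumSignal()
rising = {"close": [100.0 + i for i in range(60)]}
params = signal.fit(rising)
print(params, signal.predict(rising, params))
```

Fixing the rule and the candidate lookbacks before running the backtest is what keeps parameter selection from leaking test-period information.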

Step 2: Run Walk-Forward Backtest

      python
import pandas as pd
import numpy as np
from dataclasses import dataclass


@dataclass
class BacktestResult:
    returns: np.ndarray
    win_rate: float
    avg_win: float
    avg_loss: float
    sharpe: float
    kelly_fraction: float
    half_kelly: float


def walk_forward_backtest(
    prices: pd.DataFrame,
    signal_fn,
    train_window: int = 252,   # 1 year training
    test_window: int = 63,     # 1 quarter testing
    fee_rate: float = 0.0005   # 0.05% per trade
) -> BacktestResult:
    """
    Walk-forward backtest to estimate out-of-sample edge.

    Critically: trains on past data, tests on unseen future data.
    This prevents look-ahead bias.
    """
    all_returns = []
    n = len(prices)

    # +1 so a final, complete test window ending exactly at n is not skipped
    for start in range(train_window, n - test_window + 1, test_window):
        train = prices.iloc[start - train_window:start]
        test = prices.iloc[start:start + test_window]

        # Fit signal on training data
        signal_params = signal_fn.fit(train)

        # Generate predictions on test data (no lookahead)
        for i in range(len(test) - 1):
            signal = signal_fn.predict(test.iloc[:i+1], signal_params)
            if signal == 0:
                continue  # no trade

            entry = test.iloc[i]['close']
            exit_price = test.iloc[i + 1]['close']
            raw_return = (exit_price - entry) / entry * signal  # signal is +1/-1
            trade_return = raw_return - fee_rate * 2  # entry + exit fees
            all_returns.append(trade_return)

    returns = np.array(all_returns)

    if len(returns) == 0:
        raise ValueError("No trades generated in backtest")

    wins = returns[returns > 0]
    losses = returns[returns < 0]
    win_rate = len(wins) / len(returns)

    # Calculate Kelly from continuous return series
    kelly = continuous_kelly(returns)

    # Annualized Sharpe (assuming daily trades)
    sharpe = (np.mean(returns) / np.std(returns)) * np.sqrt(252) if np.std(returns) > 0 else 0

    return BacktestResult(
        returns=returns,
        win_rate=win_rate,
        avg_win=np.mean(wins) if len(wins) > 0 else 0,
        avg_loss=np.mean(losses) if len(losses) > 0 else 0,
        sharpe=sharpe,
        kelly_fraction=kelly,
        half_kelly=kelly / 2
    )
    

Step 3: Apply Confidence Discount

Even a rigorous backtest overstates live edge due to execution slippage, market impact, and overfitting. Apply a confidence discount before using the Kelly fraction in live trading:

      python
def adjusted_kelly(
    backtest_kelly: float,
    n_trades: int,
    out_of_sample_ratio: float = 0.7,
    slippage_factor: float = 0.8
) -> float:
    """
    Adjust Kelly fraction for real-world uncertainty.

    Args:
        backtest_kelly: Raw Kelly fraction from backtest
        n_trades: Number of trades in backtest (more = higher confidence)
        out_of_sample_ratio: Fraction of backtest that was out-of-sample
        slippage_factor: Expected live-to-backtest performance ratio

    Returns:
        Adjusted Kelly fraction (always <= backtest_kelly / 2)
    """
    # Confidence from sample size (converges to 1 at ~1000 trades)
    sample_confidence = min(n_trades / 1000, 1.0) ** 0.5

    # Adjust for OOS quality
    oos_confidence = out_of_sample_ratio

    # Combined confidence
    total_confidence = sample_confidence * oos_confidence

    # Apply slippage reduction and confidence discount
    adjusted = backtest_kelly * slippage_factor * total_confidence

    # Safety: never exceed half-Kelly of original estimate
    return min(adjusted, backtest_kelly / 2)


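# Example (assumes `prices` is a DataFrame with a 'close' column and
# `my_signal` is an object implementing fit/predict)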
bt_result = walk_forward_backtest(prices, my_signal)
safe_kelly = adjusted_kelly(
    bt_result.kelly_fraction,
    n_trades=len(bt_result.returns),
    out_of_sample_ratio=0.7,
    slippage_factor=0.75  # expect 25% worse live
)
print(f"Raw Kelly: {bt_result.kelly_fraction:.1%}")
print(f"Adjusted Kelly: {safe_kelly:.1%}")
print(f"Half Kelly: {bt_result.half_kelly:.1%}")
    

Python Implementation

The following KellyPositionSizer class wraps Kelly Criterion for use with the Purple Flea Trading API. It handles both the position sizing calculation and the API call to place the order.

      python
import httpx
import asyncio
from dataclasses import dataclass
from typing import Optional

@dataclass
class TradeSignal:
    pair: str              # e.g. "BTC/USDC"
    direction: int         # +1 for long, -1 for short
    win_prob: float        # estimated P(trade wins)
    expected_gain: float   # expected gain if win (as fraction)
    max_loss: float        # max loss if wrong (as fraction, positive)
    confidence: float      # 0-1, how confident in the estimate


class KellyPositionSizer:
    """
    Position sizer for AI trading agents using Kelly Criterion.
    Integrates with Purple Flea Trading API.
    """

    def __init__(
        self,
        api_key: str,
        portfolio_value: float,
        kelly_multiplier: float = 0.5,  # Half Kelly by default
        max_position_pct: float = 0.20, # Never more than 20% in one trade
        min_position: float = 10.0,     # Minimum trade size in USDC
    ):
        self.api_key = api_key
        self.portfolio_value = portfolio_value
        self.kelly_multiplier = kelly_multiplier
        self.max_position_pct = max_position_pct
        self.min_position = min_position

    def kelly_fraction(self, signal: TradeSignal) -> float:
        """
        Calculate Kelly fraction for a trade signal.

        Uses discrete Kelly for cleaner confidence weighting.
        """
        p = signal.win_prob
        q = 1.0 - p
        b = signal.expected_gain
        loss = signal.max_loss

        # Generalized Kelly
        raw_kelly = (p * b - q * loss) / (b * loss) if (b * loss) > 0 else 0.0

        if raw_kelly <= 0:
            return 0.0

        # Apply confidence weight and multiplier
        return raw_kelly * signal.confidence * self.kelly_multiplier

    def position_size(self, signal: TradeSignal) -> float:
        """
        Calculate position size in USDC for a trade.

        Returns 0 if Kelly says don't trade.
        """
        fraction = self.kelly_fraction(signal)

        if fraction <= 0:
            return 0.0

        raw_size = self.portfolio_value * fraction

        # Cap at max position percentage
        max_size = self.portfolio_value * self.max_position_pct
        size = min(raw_size, max_size)

        # Floor at minimum
        if size < self.min_position:
            return 0.0

        return round(size, 2)

    async def place_trade(self, signal: TradeSignal) -> Optional[dict]:
        """
        Calculate position size and place order via Purple Flea Trading API.
        """
        size = self.position_size(signal)
        if size <= 0:
            print(f"No trade: insufficient edge for {signal.pair}")
            return None

        side = "buy" if signal.direction > 0 else "sell"
        fraction = self.kelly_fraction(signal)

        print(f"Trading {signal.pair}: {side} ${size:.2f} ({fraction:.1%} Kelly)")

        async with httpx.AsyncClient() as client:
            resp = await client.post(
                "https://trading.purpleflea.com/v1/orders",
                headers={"X-API-Key": self.api_key},
                json={
                    "pair": signal.pair,
                    "side": side,
                    "type": "market",
                    "amount_usd": size,
                    "strategy": "kelly",
                    "kelly_fraction": fraction,
                }
            )
            resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
            result = resp.json()

        # Update portfolio value with realized P&L
        if "pnl" in result:
            self.portfolio_value += result["pnl"]

        return result


# Example usage
async def main():
    sizer = KellyPositionSizer(
        api_key="YOUR_API_KEY",
        portfolio_value=10000.0,
        kelly_multiplier=0.5  # Half Kelly
    )

    signal = TradeSignal(
        pair="BTC/USDC",
        direction=1,         # long
        win_prob=0.57,
        expected_gain=0.025, # expect +2.5% on win
        max_loss=0.018,      # max -1.8% on loss
        confidence=0.75      # 75% confident in estimate
    )

    result = await sizer.place_trade(signal)
    if result:
        print(f"Order filled: {result}")

asyncio.run(main())
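
To see what the sizing logic does with the example signal, without touching the network, the sketch below restates the arithmetic as two plain functions. The function names are illustrative; the defaults mirror the class above (half Kelly, 20% cap, 10 USDC minimum).

```python
def generalized_kelly(p: float, gain: float, loss: float) -> float:
    """f* = (p*gain - q*loss) / (gain*loss), with loss given as a positive number."""
    q = 1.0 - p
    return (p * gain - q * loss) / (gain * loss)


def sized_position(
    portfolio: float,
    p: float,
    gain: float,
    loss: float,
    confidence: float,
    multiplier: float = 0.5,
    max_pct: float = 0.20,
    min_position: float = 10.0,
) -> float:
    """Position size in USDC after confidence weighting, cap, and floor."""
    raw = generalized_kelly(p, gain, loss)
    if raw <= 0:
        return 0.0  # no edge, no trade
    fraction = raw * confidence * multiplier
    size = min(portfolio * fraction, portfolio * max_pct)
    return round(size, 2) if size >= min_position else 0.0


# Same inputs as the example signal: 57% win rate, +2.5% / -1.8%, 75% confidence
raw = generalized_kelly(0.57, 0.025, 0.018)
size = sized_position(10_000.0, 0.57, 0.025, 0.018, confidence=0.75)
print(f"raw Kelly: {raw:.2f}x capital, position: ${size:,.2f}")
```

For these inputs the raw Kelly fraction is roughly 14.5x capital, so even after the confidence weight and half-Kelly multiplier the 20% cap is what sets the order size ($2,000 on a $10,000 portfolio). With small per-trade returns, the cap will usually be the binding constraint.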
    

Half-Kelly for Safety

Full Kelly is theoretically optimal but practically brutal. It assumes perfect knowledge of probabilities and payoffs — something no trading agent has. The consequences of overestimating edge with full Kelly can be catastrophic drawdowns.

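Half-Kelly sacrifices ~25% of growth rate but halves volatility (cutting variance by 75%). Under the standard continuous-time approximation, it also cuts the probability of ever losing half of your starting capital from about 50% to about 12.5%. It is the standard choice for serious systematic traders.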
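
A standard result for continuously rebalanced fractional Kelly, assuming returns follow geometric Brownian motion with a correctly estimated edge, is that the probability of wealth ever falling to a fraction x of its starting value is x^(2/k − 1), where k is the Kelly multiplier. The sketch below evaluates it for a few multipliers; real strategies have fat tails and estimation error, so treat these as optimistic lower bounds.

```python
def prob_ever_fall_to(x: float, k: float) -> float:
    """P(wealth ever falls to fraction x of start) for Kelly multiplier k.

    Idealized continuous-time result, valid for 0 < x < 1 and 0 < k < 2.
    """
    return x ** (2.0 / k - 1.0)


for k in (1.0, 0.5, 0.25):
    p_half = prob_ever_fall_to(0.5, k)
    print(f"k = {k:.2f}: P(ever lose half of starting capital) = {p_half:.2%}")
```

Under these assumptions the probability of ever halving the account is 50% at full Kelly, 12.5% at half Kelly, and under 1% at quarter Kelly.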

Mathematical Impact of Kelly Multiplier

For a strategy with Sharpe ratio S, the optimal Kelly fraction is f* = S/σ. The growth rate as a function of fractional Kelly k is:

G(k) = k·f* ·μ − (k·f*)² ·σ²/2

At k=1 (full Kelly): G = μ²/(2σ²) | At k=0.5 (half Kelly): G = 0.75 × G_max | Half Kelly captures 75% of maximum growth
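
The growth formula can be evaluated directly to see how much growth each multiplier keeps. In the sketch below, μ = 10% and σ = 20% are arbitrary illustrative inputs; the ratio G(k)/G(1) works out to k·(2 − k) regardless of the values chosen.

```python
def growth_rate(k: float, mu: float, sigma: float) -> float:
    """G(k) = k*f* * mu - (k*f*)^2 * sigma^2 / 2, with f* = mu / sigma^2."""
    f = k * mu / sigma**2
    return f * mu - f**2 * sigma**2 / 2.0


mu, sigma = 0.10, 0.20  # assumed inputs for illustration
g_full = growth_rate(1.0, mu, sigma)

for k in (0.25, 0.5, 0.75, 1.0, 1.5, 2.0):
    share = growth_rate(k, mu, sigma) / g_full
    print(f"k = {k:.2f}: {share:.1%} of full-Kelly growth")
```

The curve is symmetric around k = 1: betting 1.5x Kelly earns the same growth as half Kelly with three times the volatility, and betting 2x Kelly earns no growth at all.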

Fractional vs Full Kelly Comparison

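The following table uses a concrete example strategy: Sharpe = 1.2, annualized return = 18%, annualized volatility = 15%. Treat its figures as illustrative: applied literally to these inputs, f* = μ/σ² = 0.18/0.0225 = 8, which is 8x leverage rather than the 80% full-Kelly position shown.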

Kelly Multiplier	Position Size	Expected Annual Return	Annual Volatility	Max Drawdown (95%)	Verdict
Full Kelly (1.0x)	80%	18.0%	12.0%	−45%	Too risky
Three-Quarter Kelly (0.75x)	60%	17.2%	9.0%	−32%	Aggressive
Half Kelly (0.5x)	40%	15.5%	6.0%	−22%	Recommended
Quarter Kelly (0.25x)	20%	11.8%	3.0%	−11%	Conservative
Tenth Kelly (0.10x)	8%	6.2%	1.2%	−4.5%	Very conservative

The key observations:

In this table, half Kelly keeps 86% of full Kelly's expected annual return (15.5% vs 18.0%) with 50% of the volatility
Full Kelly's max drawdown (−45%) is 2.5 times its expected annual return (18%), which is psychologically brutal
For AI agents running 24/7 without human oversight, half-Kelly or lower is strongly preferred
The optimal multiplier depends on your edge certainty: less certain = smaller multiplier
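
The link between edge certainty and multiplier can be made concrete. The sketch below sizes a position from an estimated μ but earns growth at the true μ; the specific numbers (true edge 5%, estimated edge 10%, volatility 20%) are assumptions chosen so that the estimate is exactly twice the truth.

```python
def realized_growth(k: float, est_mu: float, true_mu: float, sigma: float) -> float:
    """Growth actually earned when sizing from est_mu but returns follow true_mu."""
    f = k * est_mu / sigma**2  # position chosen from the (possibly wrong) estimate
    return f * true_mu - f**2 * sigma**2 / 2.0


true_mu, est_mu, sigma = 0.05, 0.10, 0.20  # assumed: edge overestimated 2x

best_possible = true_mu**2 / (2.0 * sigma**2)
full = realized_growth(1.0, est_mu, true_mu, sigma)
half = realized_growth(0.5, est_mu, true_mu, sigma)

print(f"best possible growth:      {best_possible:.4f}")
print(f"full Kelly on estimate:    {full:.4f}")
print(f"half Kelly on estimate:    {half:.4f}")
```

When the edge is overestimated by a factor of two, full Kelly on the estimate is really 2x Kelly on the truth and earns zero growth, while half Kelly on the estimate happens to land on the true optimum. This is the main practical argument for fractional sizing.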

Dynamic Kelly: Scaling with Confidence

      python
def dynamic_kelly_multiplier(
    regime: str,
    edge_confidence: float,  # 0-1
    recent_drawdown: float,  # 0-1 (current drawdown from peak)
    vol_regime: float        # current vol / historical vol
) -> float:
    """
    Dynamically adjust Kelly multiplier based on market conditions.

    Returns a multiplier to apply to the base Kelly fraction.
    """
    base = 0.5  # start at half-Kelly

    # Increase in low-volatility trending regimes
    if regime == "trending_low_vol" and vol_regime < 0.8:
        base = 0.65

    # Decrease in high-volatility choppy regimes
    if regime == "choppy_high_vol" or vol_regime > 1.5:
        base = 0.25

    # Scale down if in drawdown
    if recent_drawdown > 0.10:
        base *= (1.0 - recent_drawdown)  # linear reduction

    # Scale by edge confidence
    base *= edge_confidence

    # Hard bounds
    return max(0.10, min(base, 0.75))
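
The function above takes `recent_drawdown` and `vol_regime` as inputs without showing where they come from. One plausible way to compute them, sketched below with hypothetical helper names and a 20-period recent window chosen arbitrarily, is from the agent's equity curve and its return history.

```python
import statistics


def current_drawdown(equity: list) -> float:
    """Drawdown of the latest equity value from the running peak (0 at a new high)."""
    peak = max(equity)
    return (peak - equity[-1]) / peak


def vol_ratio(returns: list, recent_window: int = 20) -> float:
    """Recent volatility divided by full-history volatility."""
    recent = statistics.pstdev(returns[-recent_window:])
    historical = statistics.pstdev(returns)
    return recent / historical if historical > 0 else 1.0


# Illustrative data: equity peaked at 120 and now sits at 108
equity = [100.0, 120.0, 90.0, 108.0]
# 100 quiet periods followed by 20 periods with three times larger moves
returns = [0.01, -0.01] * 50 + [0.03, -0.03] * 10

print(f"recent_drawdown = {current_drawdown(equity):.2f}")
print(f"vol_regime      = {vol_ratio(returns):.2f}")
```

In this example the drawdown is 10% and recent volatility is nearly twice its historical level, which would push the multiplier into the reduced high-volatility branch.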
    

Integration with Purple Flea Trading API

The Purple Flea Trading API provides real-time market data, order execution, and position management across multiple exchanges. Here is a complete loop that fetches signals, sizes positions with Kelly, and places orders.

      python
import asyncio
from datetime import datetime
from typing import Optional

import httpx
import numpy as np

# TradeSignal and KellyPositionSizer are defined in the previous listing

class KellyTradingAgent:
    """
    Full trading agent using Kelly Criterion position sizing
    with the Purple Flea Trading API.
    """

    BASE_URL = "https://trading.purpleflea.com/v1"

    def __init__(self, api_key: str, initial_capital: float):
        self.api_key = api_key
        self.capital = initial_capital
        self.sizer = KellyPositionSizer(
            api_key=api_key,
            portfolio_value=initial_capital,
            kelly_multiplier=0.5
        )
        self.open_positions = {}
        self.trade_history = []

    async def get_market_data(self, pair: str, lookback: int = 100) -> dict:
        """Fetch recent OHLCV data for a trading pair."""
        async with httpx.AsyncClient() as client:
            resp = await client.get(
                f"{self.BASE_URL}/markets/{pair}/ohlcv",
                params={"limit": lookback, "interval": "1h"},
                headers={"X-API-Key": self.api_key}
            )
        return resp.json()

    def estimate_edge(self, ohlcv: list) -> Optional[TradeSignal]:
        """
        Estimate edge from recent price data using momentum + mean reversion.
        Returns None if no significant edge detected.

        This is a simplified signal -- replace with your actual model.
        """
        closes = np.array([c['close'] for c in ohlcv])
        returns = np.diff(closes) / closes[:-1]

        # Short-term momentum signal (last 5 vs last 20 returns)
        short_mom = np.mean(returns[-5:])
        long_mom = np.mean(returns[-20:])
        signal_strength = short_mom - long_mom

        # Significance test: is signal above noise?
        noise = np.std(returns[-20:]) / np.sqrt(20)
        z_score = signal_strength / noise if noise > 0 else 0

        if abs(z_score) < 1.5:  # not significant
            return None

        direction = 1 if z_score > 0 else -1
        confidence = min(abs(z_score) / 3.0, 1.0)  # scale confidence

        # Estimate win/loss parameters from recent returns.
        # For a short signal a "win" is a down move, so the roles swap.
        up_moves = returns[returns > 0]
        down_moves = returns[returns < 0]
        wins, losses = (up_moves, down_moves) if direction > 0 else (down_moves, up_moves)

        avg_win = abs(np.mean(wins[-20:])) if len(wins) > 0 else 0.01
        avg_loss = abs(np.mean(losses[-20:])) if len(losses) > 0 else 0.01
        win_rate = len(wins) / len(returns)

        return TradeSignal(
            pair="BTC/USDC",
            direction=direction,
            win_prob=win_rate,
            expected_gain=avg_win,
            max_loss=avg_loss,
            confidence=confidence
        )

    async def run_one_cycle(self):
        """Run one trading cycle: get data, estimate edge, size and place trade."""
        for pair in ["BTC/USDC", "ETH/USDC", "SOL/USDC"]:
            ohlcv = await self.get_market_data(pair)
            signal = self.estimate_edge(ohlcv['candles'])

            if signal is None:
                continue

            signal.pair = pair
            result = await self.sizer.place_trade(signal)

            if result and result.get('status') == 'filled':
                self.trade_history.append({
                    'pair': pair,
                    'time': datetime.utcnow().isoformat(),
                    'size': result['amount'],
                    'kelly_fraction': self.sizer.kelly_fraction(signal),
                    'fill_price': result['fill_price']
                })

    async def run(self, interval_seconds: int = 3600):
        """Run the agent in a continuous loop."""
        print(f"Kelly Trading Agent started. Capital: ${self.capital:.2f}")
        while True:
            try:
                await self.run_one_cycle()
                print(f"Cycle complete. Portfolio: ${self.sizer.portfolio_value:.2f}")
            except Exception as e:
                print(f"Error in cycle: {e}")
            await asyncio.sleep(interval_seconds)

asyncio.run(KellyTradingAgent(
    api_key="YOUR_API_KEY",
    initial_capital=10000.0
).run())
    

Start Trading with Purple Flea

The Trading API provides multi-exchange execution, real-time data, and 20% referral commissions on fees from agents you refer. Connect your agent in minutes.

Summary

Kelly Criterion maximizes long-run growth by betting f* = p − q/b
For continuous returns, use f* = μ/σ² (mean divided by variance)
Always estimate edge from walk-forward out-of-sample backtest results
Apply a confidence discount before using backtest Kelly in live trading
Half-Kelly captures 75% of growth while cutting drawdowns roughly in half
Use dynamic Kelly multipliers that scale with regime confidence and drawdown
Purple Flea Trading API exposes kelly_fraction in order metadata for strategy tracking