Paper Trading for AI Agents: Risk-Free Strategy Validation

1. Paper Trading Mechanics

Paper trading (also called simulated trading or virtual trading) executes strategy decisions against live market data using virtual capital. Unlike backtesting, which applies strategy logic to historical data, paper trading runs in real time against current prices. This surfaces issues that backtests cannot: API latency, live data feed quirks, real-time signal computation costs, and unexpected market conditions not present in the historical sample.

For autonomous AI agents, paper trading is a critical stage before live deployment. An agent that has passed backtesting and walk-forward optimization should run in paper mode for a minimum of 30 days before receiving real capital, and longer for complex strategies or volatile market regimes.

Key advantage: Paper trading validates not just strategy logic but the agent's entire execution pipeline — API connectivity, error handling, order management, position tracking, and risk limit enforcement — all under live conditions with zero capital risk.

Paper vs Backtest vs Live: The Validation Stack

Backtesting (Historical)

Fast, unlimited iterations. Validates core strategy edge. Cannot detect live execution issues or API failures.

Paper Trading (Live Simulation)

Real-time, simulated fills. Validates full pipeline. Catches latency, data feed, and execution issues.

Faucet Trading (Live Micro)

Real capital (free from faucet). Validates real execution costs and fills. Minimum risk exposure.

Live Full Deployment

Full capital, full risk. Only reached after all prior stages pass defined criteria.
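
The stack above can be read as a one-way progression in which a stage is only left once its criteria pass. The following is a minimal sketch of that idea; the `Stage` enum and `next_stage` helper are hypothetical names for illustration, and the pass/fail decision is assumed to come from checks defined elsewhere.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Validation stages in the order described above (hypothetical names)."""
    BACKTEST = 1
    PAPER = 2
    FAUCET = 3
    LIVE = 4


def next_stage(current: Stage, criteria_passed: bool) -> Stage:
    """Advance one stage only when the current stage's criteria pass.

    A failed check keeps the agent where it is; stages are never skipped.
    """
    if not criteria_passed or current is Stage.LIVE:
        return current
    return Stage(current + 1)


# Walk through four evaluation rounds; the third one fails.
stage = Stage.BACKTEST
for passed in (True, True, False, True):
    stage = next_stage(stage, passed)
print(stage.name)
```

Keeping the progression strictly sequential means a failed check never demotes or skips a stage; it only delays promotion.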

What Paper Trading Cannot Tell You

Paper trading has limitations that agents must understand. Simulated fills do not perfectly match real fills — in fast markets, a paper trade executes at the quoted price while a real order might slip significantly. Market impact from real orders can move prices against the agent; paper orders have no impact. And purely technical glitches (exchange downtime, API rate limits at scale) appear differently under paper vs live conditions.

Warning: Never assume paper trading performance will exactly replicate live performance. Treat a Sharpe ratio degradation of 20-30% as normal when transitioning from paper to live, and plan position sizing accordingly.
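
As a rough illustration of the haircut in the warning above, the sketch below turns a paper Sharpe ratio into an expected live range. The function name and the default degradation bounds (20% and 30%, taken from the warning) are assumptions for illustration, not a calibrated model.

```python
def expected_live_sharpe(
    paper_sharpe: float,
    degradation_low: float = 0.20,
    degradation_high: float = 0.30,
) -> tuple[float, float]:
    """Return (pessimistic, optimistic) live Sharpe after the haircut."""
    if not 0.0 <= degradation_low <= degradation_high < 1.0:
        raise ValueError("expected 0 <= degradation_low <= degradation_high < 1")
    return (
        paper_sharpe * (1.0 - degradation_high),
        paper_sharpe * (1.0 - degradation_low),
    )


# A paper Sharpe of 1.5 maps to roughly 1.05 to 1.20 live.
pessimistic, optimistic = expected_live_sharpe(1.5)
print(f"{pessimistic:.2f} to {optimistic:.2f}")
```

Sizing positions against the pessimistic end of the range leaves room for the degradation to be worse than average.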

2. Simulated Order Execution

The quality of a paper trading system depends entirely on how realistically it simulates order execution. A naive paper trader that fills all orders instantly at mid-price produces overoptimistic results. A realistic paper trader models the full lifecycle of an order: submission, queuing, partial fills, and cancellation.

Order Fill Simulation Models

Model	Fills At	Realism	Best For
Naive (optimistic)	Next bar open	Low	Quick validation only
Conservative	Worst of bid/ask	Medium	Most strategies
Volume-weighted	VWAP ± slippage	High	Larger orders
Order book simulation	Simulated LOB	Very High	HFT/market making

For most autonomous agent strategies operating on Purple Flea's perpetuals, the conservative model — filling at the ask for buys and bid for sells, with added slippage proportional to order size — provides a realistic lower bound on performance.

from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional
import uuid

@dataclass
class PaperOrder:
    order_id: str
    symbol: str
    side: str              # 'buy' or 'sell'
    quantity: float
    order_type: str        # 'market' or 'limit'
    limit_price: Optional[float]
    created_at: datetime
    status: str = 'pending'  # pending, filled, partial, cancelled
    filled_qty: float = 0.0
    fill_price: Optional[float] = None
    fills: list = field(default_factory=list)

class PaperOrderBook:
    """
    Simulates realistic order execution for paper trading.
    Models spread, slippage, and partial fills.
    """

    def __init__(
        self,
        slippage_bps: float = 5,       # 5 basis points
        spread_bps: float = 3,          # 3 bps half-spread
        fill_probability: float = 0.95  # 95% chance of getting filled
    ):
        self.slippage_bps = slippage_bps / 10000
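        # Stored but not used by the fill logic below, which takes the spread
        # directly from the quoted bid and ask
        self.spread_bps = spread_bps / 10000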
        self.fill_prob = fill_probability
        self.orders: dict[str, PaperOrder] = {}
        self.filled_orders: list[PaperOrder] = []

    def submit_order(
        self,
        symbol: str,
        side: str,
        quantity: float,
        order_type: str = 'market',
        limit_price: Optional[float] = None
    ) -> PaperOrder:
        """Submit a paper order."""
        order = PaperOrder(
            order_id=str(uuid.uuid4())[:8],
            symbol=symbol,
            side=side,
            quantity=quantity,
            order_type=order_type,
            limit_price=limit_price,
            created_at=datetime.utcnow()
        )
        self.orders[order.order_id] = order
        return order

    def process_market_update(
        self,
        symbol: str,
        bid: float,
        ask: float,
        last: float
    ) -> list[PaperOrder]:
        """
        Process pending orders against new market data.
        Returns list of newly filled orders.
        """
        import random
        newly_filled = []

        for order_id, order in list(self.orders.items()):
            if order.symbol != symbol or order.status != 'pending':
                continue

            # Check fill probability (simulate failed fills)
            if random.random() > self.fill_prob:
                continue

            if order.order_type == 'market':
                fill_price = self._simulate_market_fill(order.side, bid, ask, order.quantity, last)
                order.fill_price = fill_price
                order.filled_qty = order.quantity
                order.status = 'filled'
                order.fills.append({'price': fill_price, 'qty': order.quantity, 'ts': datetime.utcnow()})
                newly_filled.append(order)
                self.filled_orders.append(order)
                del self.orders[order_id]

            elif order.order_type == 'limit' and order.limit_price:
                if order.side == 'buy' and ask <= order.limit_price:
                    # Limit buy: fill at limit price or better
                    fill_price = min(ask, order.limit_price)
                    order.fill_price = fill_price
                    order.filled_qty = order.quantity
                    order.status = 'filled'
                    order.fills.append({'price': fill_price, 'qty': order.quantity, 'ts': datetime.utcnow()})
                    newly_filled.append(order)
                    self.filled_orders.append(order)
                    del self.orders[order_id]

                elif order.side == 'sell' and bid >= order.limit_price:
                    # Limit sell: fill at limit price or better
                    fill_price = max(bid, order.limit_price)
                    order.fill_price = fill_price
                    order.filled_qty = order.quantity
                    order.status = 'filled'
                    order.fills.append({'price': fill_price, 'qty': order.quantity, 'ts': datetime.utcnow()})
                    newly_filled.append(order)
                    self.filled_orders.append(order)
                    del self.orders[order_id]

        return newly_filled

    def _simulate_market_fill(
        self,
        side: str,
        bid: float,
        ask: float,
        quantity: float,
        last: float
    ) -> float:
        """Simulate realistic market order fill with spread and slippage."""
        # Start from bid/ask
        if side == 'buy':
            base_price = ask  # Buy at ask (unfavorable)
        else:
            base_price = bid  # Sell at bid (unfavorable)

        # Add size-proportional slippage (larger orders = more slippage)
        # Assume $100K ADV for illustration; real systems use live ADV data
        size_factor = min(1.0, quantity * last / 100_000) ** 0.5
        slippage = self.slippage_bps * size_factor

        direction = 1 if side == 'buy' else -1
        fill_price = base_price * (1 + direction * slippage)
        return fill_price

    def cancel_order(self, order_id: str) -> bool:
        """Cancel a pending order."""
        if order_id in self.orders:
            self.orders[order_id].status = 'cancelled'
            del self.orders[order_id]
            return True
        return False
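
The order book above fills each order in full on a single update, although the order record already has a 'partial' status and a fills list. The following is a minimal, self-contained sketch of how partial fills could be layered on. `PartialFillOrder`, `apply_partial_fill`, and the `available_qty` input (the size assumed to be available at the fill price on each update) are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PartialFillOrder:
    """Hypothetical order record used only for this sketch."""
    side: str                  # 'buy' or 'sell'
    quantity: float            # total quantity requested
    filled_qty: float = 0.0
    status: str = 'pending'    # pending, partial, filled
    fills: list = field(default_factory=list)

    @property
    def avg_fill_price(self) -> float:
        """Quantity-weighted average price over all fills so far."""
        if self.filled_qty == 0:
            return 0.0
        notional = sum(f['price'] * f['qty'] for f in self.fills)
        return notional / self.filled_qty


def apply_partial_fill(order: PartialFillOrder, price: float, available_qty: float) -> float:
    """Fill as much of the remaining quantity as this update offers.

    Returns the quantity filled by this update (0.0 if nothing filled).
    """
    remaining = order.quantity - order.filled_qty
    qty = min(remaining, max(available_qty, 0.0))
    if qty <= 0:
        return 0.0
    order.fills.append({'price': price, 'qty': qty})
    order.filled_qty += qty
    order.status = 'filled' if order.filled_qty >= order.quantity else 'partial'
    return qty


# A 10-unit buy is completed over two updates at different prices.
order = PartialFillOrder(side='buy', quantity=10.0)
apply_partial_fill(order, price=100.0, available_qty=4.0)
apply_partial_fill(order, price=101.0, available_qty=8.0)
print(order.status, order.avg_fill_price)
```

Tracking the quantity-weighted average price matters because a strategy's realized entry is the blend of all partial fills, not the first quote it saw.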

3. Realistic Slippage Modeling

The biggest mistake in paper trading implementations is using unrealistic fill prices. Even sophisticated agents sometimes underestimate how much real-world friction erodes paper trading performance. This section provides a framework for making paper trading fills pessimistic enough to be a useful prediction of live performance.

Slippage Components

Bid-ask spread: The minimum cost of a round trip — you buy at ask and sell at bid. For BTC perpetuals on most platforms, this is 0.5-2 bps depending on liquidity.
Market impact: Large orders move the price against the trader as they consume order book depth. Modeled with the square-root law (see backtesting guide).
Timing uncertainty: In live trading, there is latency between signal generation and order submission. By the time the order arrives, the price may have moved.
Queue position: Limit orders queue behind existing orders at the same price level. Earlier queue position means faster fills; late queue position may miss fills.

import numpy as np
from dataclasses import dataclass

@dataclass
class LiveSlippageEstimator:
    """
    Pessimistic slippage estimator for paper trading.
    Models multiple slippage sources to ensure paper results
    are achievable under live conditions.
    """

    half_spread_bps: float = 1.5    # Half bid-ask spread in bps
    impact_factor: float = 0.10     # Market impact factor (sqrt model)
    latency_vol_factor: float = 0.5 # Fraction of tick vol attributable to latency
    daily_vol: float = 0.02         # Asset daily volatility
    adv_usd: float = 5_000_000      # Average daily volume in USD

    def total_slippage_bps(
        self,
        order_size_usd: float,
        is_buy: bool,
        expected_latency_ms: float = 50.0
    ) -> float:
        """
        Compute total slippage in basis points.

        Args:
            order_size_usd: Order notional value
            is_buy: True if buying. Not used in this method because the cost
                is the same for both sides; adjust_paper_fill applies the direction
            expected_latency_ms: API round-trip latency in milliseconds

        Returns:
            Total slippage in basis points (always positive — cost)
        """
        # 1. Spread component (half-spread per leg)
        spread_cost = self.half_spread_bps

        # 2. Market impact (square-root model)
        participation = order_size_usd / self.adv_usd
        impact_bps = (self.impact_factor * self.daily_vol * np.sqrt(participation)) * 10_000

        # 3. Latency-induced adverse selection
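        # In latency_ms time, price moves by ~vol * sqrt(latency_ms / ms_per_day)
        # Perpetuals trade around the clock, so a day is 24 hours
        # (use 6.5 hours only for instruments with an equity-style session)
        ms_per_day = 24 * 3600 * 1000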
        latency_move_bps = (
            self.daily_vol * np.sqrt(expected_latency_ms / ms_per_day)
            * self.latency_vol_factor * 10_000
        )

        total = spread_cost + impact_bps + latency_move_bps
        return total

    def adjust_paper_fill(
        self,
        quoted_price: float,
        order_size_usd: float,
        is_buy: bool,
        expected_latency_ms: float = 50.0
    ) -> float:
        """Apply pessimistic slippage to get a realistic paper fill price."""
        slip_bps = self.total_slippage_bps(order_size_usd, is_buy, expected_latency_ms)
        slip_frac = slip_bps / 10_000
        direction = 1 if is_buy else -1
        return quoted_price * (1 + direction * slip_frac)

    def paper_to_live_degradation(self, paper_sharpe: float) -> dict:
        """
        Estimate expected live Sharpe given paper Sharpe,
        accounting for additional real-world frictions.
        """
        # Empirical: live performance is typically 60-80% of paper
        conservative = paper_sharpe * 0.60
        moderate = paper_sharpe * 0.70
        optimistic = paper_sharpe * 0.80

        return {
            'paper_sharpe': paper_sharpe,
            'conservative_live': conservative,
            'moderate_live': moderate,
            'optimistic_live': optimistic,
            'minimum_paper_sharpe_for_live_1': 1.0 / 0.70,  # ~1.43
            'recommendation': (
                'READY FOR LIVE' if paper_sharpe > 1.43 else
                'CONTINUE PAPER TRADING'
            )
        }
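
The estimator above covers spread, market impact, and latency, but not the queue position component listed earlier. One simple way to approximate it, sketched below under stated assumptions, is to track how much size rests ahead of a limit order and only fill once traded volume at that price has cleared it. `QueuedLimitOrder` and its fields are hypothetical, and the initial `queue_ahead` would have to come from order book data or a conservative guess.

```python
from dataclasses import dataclass


@dataclass
class QueuedLimitOrder:
    """Hypothetical resting limit order with a simulated queue position."""
    quantity: float        # our order size
    queue_ahead: float     # size resting ahead of us at the same price
    filled_qty: float = 0.0

    def on_trade_at_level(self, traded_qty: float) -> float:
        """Consume traded volume: first the queue ahead, then our order.

        Returns the quantity of our order filled by this trade.
        """
        # Volume first clears orders that arrived before ours.
        consumed_ahead = min(self.queue_ahead, traded_qty)
        self.queue_ahead -= consumed_ahead
        leftover = traded_qty - consumed_ahead

        # Whatever remains can fill our order, up to its open quantity.
        fill = min(leftover, self.quantity - self.filled_qty)
        self.filled_qty += fill
        return fill


order = QueuedLimitOrder(quantity=2.0, queue_ahead=5.0)
first = order.on_trade_at_level(3.0)   # only reduces the queue ahead of us
second = order.on_trade_at_level(3.0)  # clears the queue, then reaches our order
print(first, second, order.filled_qty)
```

Assuming a pessimistic (large) `queue_ahead` keeps paper limit-order fills from looking better than a live order at the back of the queue would achieve.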

4. Performance Metric Calculation

Paper trading generates a stream of trade records and equity values. The agent must calculate and monitor a comprehensive set of performance metrics to determine readiness for live deployment. A single metric like Sharpe ratio is insufficient — a strategy can have high Sharpe with unacceptable drawdown characteristics, or vice versa.

Core Metrics

import numpy as np
import pandas as pd
from typing import Optional

class PerformanceMetrics:
    """
    Comprehensive performance metric calculator for paper trading evaluation.
    Implements Sharpe, Sortino, Calmar, and other key risk-adjusted metrics.
    """

    TRADING_DAYS = 252  # equity convention; consider 365 for markets that trade every day
    RISK_FREE_RATE = 0.05  # 5% annual, update to current rate

    def __init__(self, returns: pd.Series, risk_free_rate: Optional[float] = None):
        """
        Args:
            returns: Daily returns as decimal series (e.g., 0.01 = 1% gain)
            risk_free_rate: Annual risk-free rate (defaults to class constant)
        """
        self.returns = returns.dropna()
        # Compare against None so that an explicit rate of 0.0 is respected
        self.rfr = self.RISK_FREE_RATE if risk_free_rate is None else risk_free_rate
        self.daily_rfr = self.rfr / self.TRADING_DAYS

    # --- Sharpe Ratio ---
    def sharpe_ratio(self) -> float:
        """
        Sharpe ratio: annualized excess return per unit of total volatility.
        Sharpe > 1.0 is generally considered good; > 2.0 is excellent.
        """
        excess = self.returns - self.daily_rfr
        if excess.std() == 0:
            return 0.0
        return (excess.mean() / excess.std()) * np.sqrt(self.TRADING_DAYS)

    # --- Sortino Ratio ---
    def sortino_ratio(self) -> float:
        """
        Sortino ratio: like Sharpe but only penalizes downside volatility.
        Better for strategies with positive skew (many small losses, big wins).
        """
        excess = self.returns - self.daily_rfr
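        downside = excess[excess < 0]
        if len(downside) == 0:
            return float('inf')
        # Downside deviation: root-mean-square of the negative excess returns.
        # It is strictly positive here, so identical losses are handled correctly.
        downside_std = np.sqrt((downside ** 2).mean())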
        return (excess.mean() / downside_std) * np.sqrt(self.TRADING_DAYS)

    # --- Calmar Ratio ---
    def calmar_ratio(self) -> float:
        """
        Calmar ratio: annualized return divided by maximum drawdown.
        Used to evaluate risk-adjusted performance for trend-following strategies.
        """
        equity = (1 + self.returns).cumprod()
        rolling_max = equity.cummax()
        drawdown = (equity - rolling_max) / rolling_max
        max_dd = abs(drawdown.min())

        if max_dd == 0:
            return float('inf')

        n_years = len(self.returns) / self.TRADING_DAYS
        if n_years == 0:
            return 0.0

        annualized_return = (equity.iloc[-1] ** (1 / n_years)) - 1
        return annualized_return / max_dd

    # --- Omega Ratio ---
    def omega_ratio(self, threshold: float = 0.0) -> float:
        """
        Omega ratio: probability-weighted ratio of gains to losses above threshold.
        Omega > 1 means more probability-weighted gains than losses.
        """
        gains = (self.returns - threshold)[self.returns > threshold].sum()
        losses = abs((self.returns - threshold)[self.returns < threshold].sum())
        return gains / losses if losses > 0 else float('inf')

    # --- Maximum Drawdown ---
    def max_drawdown(self) -> float:
        """Maximum peak-to-trough drawdown as a negative fraction."""
        equity = (1 + self.returns).cumprod()
        rolling_max = equity.cummax()
        dd = (equity - rolling_max) / rolling_max
        return dd.min()

    # --- Win Rate and Profit Factor ---
    def win_statistics(self) -> dict:
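        """Win rate, average win, average loss, and profit factor.

        Computed per entry of the supplied return series (per day for daily
        returns, per trade if trade returns are passed instead).
        """
        wins = self.returns[self.returns > 0]
        losses = self.returns[self.returns < 0]

        win_rate = len(wins) / len(self.returns) if len(self.returns) > 0 else 0
        avg_win = wins.mean() if len(wins) > 0 else 0
        avg_loss = losses.mean() if len(losses) > 0 else 0

        # Profit factor: gross profit divided by gross loss.
        # Summing directly avoids counting zero-return periods as losses.
        gross_loss = abs(losses.sum())
        profit_factor = wins.sum() / gross_loss if gross_loss > 0 else float('inf')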

        return {
            'win_rate': win_rate,
            'avg_win_pct': avg_win * 100,
            'avg_loss_pct': avg_loss * 100,
            'win_loss_ratio': abs(avg_win / avg_loss) if avg_loss != 0 else float('inf'),
            'profit_factor': profit_factor
        }

    # --- Comprehensive Report ---
    def report(self) -> dict:
        """Generate full performance report."""
        win_stats = self.win_statistics()
        equity = (1 + self.returns).cumprod()
        total_return = equity.iloc[-1] - 1 if len(equity) > 0 else 0
        n_years = len(self.returns) / self.TRADING_DAYS
        annualized_return = (1 + total_return) ** (1 / max(n_years, 0.01)) - 1

        return {
            'total_return': total_return,
            'annualized_return': annualized_return,
            'sharpe_ratio': self.sharpe_ratio(),
            'sortino_ratio': self.sortino_ratio(),
            'calmar_ratio': self.calmar_ratio(),
            'omega_ratio': self.omega_ratio(),
            'max_drawdown': self.max_drawdown(),
            'volatility_annual': self.returns.std() * np.sqrt(self.TRADING_DAYS),
            'n_observations': len(self.returns),
            **win_stats
        }

Minimum Thresholds for Live Deployment

Metric	Minimum to Go Live	Target	Notes
Sharpe Ratio	1.0	1.5+	30-day paper minimum
Sortino Ratio	1.2	2.0+	Downside vol matters most
Calmar Ratio	0.5	1.0+	Return / max drawdown
Max Drawdown	-20%	-10%	Hard limit on paper phase
Win Rate	45%	55%+	Lower OK if win/loss > 1.5
Profit Factor	1.2	1.5+	Gross profit / gross loss
Min Trades	50	100+	Statistical significance
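
The minimums in the table above can be checked mechanically. The sketch below assumes a metrics dict with the key names shown; these mirror the report keys used earlier, plus a hypothetical `n_trades` entry supplied by the caller. It applies the table's note that a lower win rate is acceptable when the win/loss ratio exceeds 1.5.

```python
# Minimum go-live values copied from the threshold table above.
MIN_THRESHOLDS = {
    'sharpe_ratio': 1.0,
    'sortino_ratio': 1.2,
    'calmar_ratio': 0.5,
    'max_drawdown': -0.20,   # drawdown is negative; it must not be deeper than this
    'win_rate': 0.45,
    'profit_factor': 1.2,
    'n_trades': 50,
}
WIN_LOSS_OVERRIDE = 1.5  # a lower win rate is acceptable above this win/loss ratio


def failed_thresholds(metrics: dict) -> list[str]:
    """Return the names of metrics that are missing or below their minimum."""
    failed = []
    for name, minimum in MIN_THRESHOLDS.items():
        value = metrics.get(name)
        if value is None or value < minimum:
            # Table note: a low win rate passes if the win/loss ratio is high enough.
            if (name == 'win_rate' and value is not None
                    and metrics.get('win_loss_ratio', 0.0) > WIN_LOSS_OVERRIDE):
                continue
            failed.append(name)
    return failed


paper = {
    'sharpe_ratio': 1.3, 'sortino_ratio': 1.8, 'calmar_ratio': 0.9,
    'max_drawdown': -0.12, 'win_rate': 0.41, 'win_loss_ratio': 1.9,
    'profit_factor': 1.4, 'n_trades': 64,
}
print(failed_thresholds(paper))
print(failed_thresholds({**paper, 'max_drawdown': -0.25}))
```

Returning the list of failed names, rather than a single boolean, makes it clear which part of the strategy needs work before the next evaluation.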

5. Overfitting Detection During Paper Trading

Overfitting can manifest during paper trading in subtle ways. An agent that over-optimized on its backtest data will often show declining performance as paper trading progresses beyond the historical period. Detecting this early prevents deploying a strategy that only appeared to work on the training data.

Rolling Performance Monitoring

Track performance metrics in rolling windows (e.g., 7-day, 14-day, 30-day) to detect trends:

Declining Sharpe trend: If rolling Sharpe is declining over the paper period, the strategy may be regime-dependent or overfitted
Increasing drawdown trend: Drawdowns that deepen over time signal a strategy that was profitable in the backtest period but is degrading
Win rate instability: High variance in rolling win rate (above 20 percentage points) suggests the strategy is sensitive to market conditions

class OverfittingDetector:
    """
    Detect overfitting signals during the paper trading phase.
    Monitors rolling performance metrics for degradation trends.
    """

    def __init__(
        self,
        window_short: int = 7,
        window_long: int = 30,
        degradation_threshold: float = 0.3
    ):
        self.short = window_short
        self.long = window_long
        self.threshold = degradation_threshold
        self.daily_returns: list[float] = []

    def record(self, daily_return: float) -> None:
        """Add a daily return observation."""
        self.daily_returns.append(daily_return)

    def analyze(self) -> dict:
        """
        Analyze rolling performance for overfitting signals.

        Returns:
            Analysis dict with overfitting flags and trend data
        """
        if len(self.daily_returns) < self.long:
            return {
                'ready': False,
                'reason': f'Need {self.long} days, have {len(self.daily_returns)}'
            }

        r = pd.Series(self.daily_returns)

        # Rolling Sharpe (annualized)
        def rolling_sharpe(returns, w):
            return returns.rolling(w).apply(
                lambda x: (x.mean() / x.std()) * np.sqrt(252) if x.std() > 0 else 0
            )

        rs_short = rolling_sharpe(r, self.short).dropna()
        rs_long = rolling_sharpe(r, self.long).dropna()

        # Trend in rolling Sharpe (is it declining?)
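        # A slope needs at least two points; with fewer, report a flat trend
        recent_sharpe_trend = (
            np.polyfit(range(len(rs_long)), rs_long, 1)[0]
            if len(rs_long) >= 2 else 0.0
        )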

        # Compare first half vs second half Sharpe
        mid = len(r) // 2
        first_half = PerformanceMetrics(r[:mid]).sharpe_ratio()
        second_half = PerformanceMetrics(r[mid:]).sharpe_ratio()
        sharpe_degradation = (first_half - second_half) / abs(first_half) if first_half != 0 else 0

        # Rolling max drawdown trend
        equity = (1 + r).cumprod()
        rolling_max = equity.cummax()
        dd = (equity - rolling_max) / rolling_max
        recent_max_dd = dd.rolling(self.long).min().iloc[-1]
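        # First complete window ends at index long - 1 (index long would be
        # out of range when exactly `long` observations are available)
        early_max_dd = dd.rolling(self.long).min().iloc[self.long - 1]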

        flags = []
        if sharpe_degradation > self.threshold:
            flags.append(f'Sharpe degraded {sharpe_degradation:.0%} from first to second half')
        if recent_sharpe_trend < -0.01:
            flags.append(f'Rolling Sharpe in declining trend: {recent_sharpe_trend:.4f}/day')
        if abs(recent_max_dd) > abs(early_max_dd) * 1.5:
            flags.append(f'Drawdowns deepening: early {early_max_dd:.1%} vs recent {recent_max_dd:.1%}')

        overall_sharpe = PerformanceMetrics(r).sharpe_ratio()

        return {
            'ready': True,
            'overall_sharpe': overall_sharpe,
            'first_half_sharpe': first_half,
            'second_half_sharpe': second_half,
            'sharpe_degradation': sharpe_degradation,
            'sharpe_trend': recent_sharpe_trend,
            'overfitting_flags': flags,
            'overfitting_risk': 'HIGH' if len(flags) >= 2 else ('MEDIUM' if len(flags) == 1 else 'LOW'),
            'recommendation': 'HOLD — fix overfitting' if len(flags) >= 2 else 'CONTINUE PAPER TRADING'
        }
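
The detector above covers the Sharpe and drawdown signals but not the win rate instability check from the list. A minimal sketch follows. It assumes that "above 20 percentage points" refers to the gap between the highest and lowest rolling win rate, which is one possible reading; the function name and parameters are hypothetical.

```python
import pandas as pd


def win_rate_instability(
    daily_returns: list[float],
    window: int = 7,
    max_range_pp: float = 20.0,
) -> dict:
    """Flag unstable win rates using the spread of rolling win rates."""
    r = pd.Series(daily_returns, dtype=float)
    # Share of winning days in each complete window, in percent
    rolling_win_rate = (r > 0).astype(float).rolling(window).mean().dropna() * 100
    if rolling_win_rate.empty:
        return {'ready': False, 'reason': f'Need {window} days, have {len(r)}'}
    spread = float(rolling_win_rate.max() - rolling_win_rate.min())
    return {
        'ready': True,
        'win_rate_range_pp': spread,
        'unstable': spread > max_range_pp,
    }


# A week of wins followed by a week of losses swings from 100% to 0%.
print(win_rate_instability([0.01] * 7 + [-0.01] * 7))
```

Short windows make the rolling win rate noisy by construction, so the threshold is best judged together with the window length.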

6. Parallel Paper/Live Comparison

The most sophisticated paper trading setups run paper and live (or faucet) accounts simultaneously with identical strategy logic. This parallel comparison reveals execution quality: how closely real fills track simulated fills. A large and systematic divergence between paper and live results reveals modeling errors that must be corrected before scaling live capital.

Measuring Execution Quality

Implementation shortfall (IS) is the primary measure of execution quality. It measures the gap between the decision price (price when the signal was generated) and the average actual fill price:

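IS = side_sign × (Fill Price - Decision Price) / Decision Price

Here side_sign is +1 for buys and -1 for sells, so a worse-than-expected fill is a positive number on either side.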

Positive IS means the agent paid more (or received less) than expected. Target IS below 10 bps for liquid instruments.
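
A small worked example may help make the sign convention concrete. The helper below is a standalone sketch with a hypothetical name; it restates the formula above and reports the result in basis points.

```python
def implementation_shortfall_bps(decision_price: float, fill_price: float, side: str) -> float:
    """Implementation shortfall in basis points; positive values are a cost."""
    if side not in ('buy', 'sell'):
        raise ValueError("side must be 'buy' or 'sell'")
    side_sign = 1 if side == 'buy' else -1
    return side_sign * (fill_price - decision_price) / decision_price * 10_000


# A buy decided at 50,000 and filled at 50,025 costs about 5 bps.
buy_is = implementation_shortfall_bps(50_000.0, 50_025.0, 'buy')
# A sell decided at 50,000 and filled at 49,975 also costs about 5 bps.
sell_is = implementation_shortfall_bps(50_000.0, 49_975.0, 'sell')
print(round(buy_is, 2), round(sell_is, 2))
```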

@dataclass
class TradeComparison:
    timestamp: datetime
    symbol: str
    side: str
    quantity: float
    paper_fill_price: float
    live_fill_price: float
    decision_price: float

class ParallelTradingAnalyzer:
    """
    Compare paper and live trade execution to measure fill quality
    and validate paper trading accuracy as a live predictor.
    """

    def __init__(self):
        self.comparisons: list[TradeComparison] = []

    def add_comparison(self, comparison: TradeComparison) -> None:
        self.comparisons.append(comparison)

    def implementation_shortfall(self, trade: TradeComparison) -> float:
        """Compute implementation shortfall for a single trade."""
        sign = 1 if trade.side == 'buy' else -1
        return sign * (trade.live_fill_price - trade.decision_price) / trade.decision_price

    def paper_vs_live_divergence(self, trade: TradeComparison) -> float:
        """Measure how much live fill differed from paper fill (signed bps)."""
        sign = 1 if trade.side == 'buy' else -1
        diff = sign * (trade.live_fill_price - trade.paper_fill_price) / trade.paper_fill_price
        return diff * 10_000  # Convert to basis points

    def analyze(self) -> dict:
        """Full analysis of paper vs live execution quality."""
        if not self.comparisons:
            return {'error': 'No trades to analyze'}

        is_values = [self.implementation_shortfall(t) for t in self.comparisons]
        divergences = [self.paper_vs_live_divergence(t) for t in self.comparisons]

        avg_is = np.mean(is_values) * 10_000  # bps
        avg_divergence = np.mean(divergences)  # bps
        std_divergence = np.std(divergences)

        return {
            'n_trades': len(self.comparisons),
            'avg_implementation_shortfall_bps': avg_is,
            'avg_paper_live_divergence_bps': avg_divergence,
            'std_paper_live_divergence_bps': std_divergence,
            'worst_divergence_bps': max(divergences, key=abs),
            'paper_model_quality': (
                'EXCELLENT' if abs(avg_divergence) < 2 else
                'GOOD' if abs(avg_divergence) < 5 else
                'FAIR' if abs(avg_divergence) < 10 else
                'POOR — recalibrate slippage model'
            ),
            'recommendation': (
                'Paper trading is a reliable predictor of live performance'
                if abs(avg_divergence) < 5 else
                'Paper slippage model needs recalibration before scaling live capital'
            )
        }

7. Transition Criteria: Paper to Live

The decision to transition from paper trading to live deployment should follow explicit, objective criteria rather than gut feeling. The following checklist defines the minimum requirements before an agent receives real capital on Purple Flea.

Go-Live Checklist

Criterion	Requirement	Validation Method
Paper trading duration	Minimum 30 calendar days	Timestamp check
Number of paper trades	Minimum 50 completed trades	Trade count
Sharpe ratio (30d paper)	≥ 1.0	PerformanceMetrics.sharpe_ratio()
Max drawdown (paper)	≥ -20% (no deeper than 20%)	PerformanceMetrics.max_drawdown()
Overfitting risk	LOW	OverfittingDetector.analyze()
Paper/live divergence	< 5 bps avg	ParallelTradingAnalyzer (if live shadow running)
Circuit breakers tested	All triggers manually tested	Unit test suite
Risk limits integrated	Kelly, MDD, concentration, vol-target all active	RiskManager integration test

Transition protocol: On go-live, start with 25% of target position sizes for the first two weeks. Graduate to 50% at week 3 if performance tracks paper within 20%. Reach full sizing only after one month of live data confirms paper predictions.

class LiveTransitionGate:
    """
    Evaluates whether a paper trading agent is ready for live deployment.
    All criteria must pass before the gate opens.
    """

    def evaluate(
        self,
        paper_returns: pd.Series,
        n_trades: int,
        paper_start_date: datetime,
        overfitting_risk: str,
        circuit_breakers_tested: bool,
        risk_limits_integrated: bool
    ) -> dict:
        """
        Run all gate checks.

        Returns:
            Dict with 'go_live' bool and per-criterion results
        """
        checks = {}

        # Duration check
        days_elapsed = (datetime.utcnow() - paper_start_date).days
        checks['duration'] = {
            'passed': days_elapsed >= 30,
            'value': days_elapsed,
            'required': 30,
            'unit': 'days'
        }

        # Trade count
        checks['trade_count'] = {
            'passed': n_trades >= 50,
            'value': n_trades,
            'required': 50
        }

        # Performance metrics
        if len(paper_returns) >= 30:
            pm = PerformanceMetrics(paper_returns)
            report = pm.report()

            checks['sharpe'] = {
                'passed': report['sharpe_ratio'] >= 1.0,
                'value': round(report['sharpe_ratio'], 2),
                'required': 1.0
            }
            checks['max_drawdown'] = {
                'passed': report['max_drawdown'] >= -0.20,
                'value': round(report['max_drawdown'], 3),
                'required': -0.20
            }
        else:
            checks['sharpe'] = {'passed': False, 'reason': 'Insufficient data'}
            checks['max_drawdown'] = {'passed': False, 'reason': 'Insufficient data'}

        # Overfitting
        checks['overfitting'] = {
            'passed': overfitting_risk == 'LOW',
            'value': overfitting_risk,
            'required': 'LOW'
        }

        # System readiness
        checks['circuit_breakers'] = {
            'passed': circuit_breakers_tested,
            'value': circuit_breakers_tested
        }
        checks['risk_limits'] = {
            'passed': risk_limits_integrated,
            'value': risk_limits_integrated
        }

        all_passed = all(c['passed'] for c in checks.values())
        failed = [k for k, v in checks.items() if not v['passed']]

        return {
            'go_live': all_passed,
            'checks': checks,
            'failed_criteria': failed,
            'initial_live_sizing': 0.25 if all_passed else 0.0,
            'recommendation': (
                'GO LIVE at 25% sizing — scale up over 4 weeks'
                if all_passed else
                f'NOT READY — fix: {", ".join(failed)}'
            )
        }
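
The gate above only returns the initial 25% sizing. The rest of the transition protocol could be sketched as follows. The function name is hypothetical, and two assumptions are made: "tracks paper within 20%" is measured on the Sharpe ratio, and only underperformance counts against the agent.

```python
def live_sizing_fraction(
    days_live: int,
    live_sharpe: float,
    paper_sharpe: float,
    tolerance: float = 0.20,
) -> float:
    """Fraction of target position size allowed on a given live day."""
    # Weeks 1-2: always run at a quarter of target size.
    if days_live < 14:
        return 0.25

    # Assumption: live must not fall more than `tolerance` below paper Sharpe.
    tracks_paper = paper_sharpe > 0 and live_sharpe >= paper_sharpe * (1 - tolerance)
    if not tracks_paper:
        return 0.25

    # Week 3 onward: half size; full size only after a month of live data.
    return 0.50 if days_live < 30 else 1.0


print(live_sizing_fraction(7, 1.5, 1.5))
print(live_sizing_fraction(16, 1.3, 1.5))
print(live_sizing_fraction(16, 1.0, 1.5))
print(live_sizing_fraction(35, 1.4, 1.5))
```

Holding at 25% when live results drift from paper keeps exposure small while the divergence is investigated.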

8. Purple Flea Faucet as Bridge to Live Trading

Purple Flea's faucet service provides free initial capital to new agents, specifically designed to serve as the bridge between paper trading and full live deployment. Rather than jumping from zero-risk simulation directly to real capital, agents can use faucet funds to experience real execution with minimal downside.

How the Faucet Bridge Works

Register your agent at faucet.purpleflea.com to receive free initial funds
Run the same strategy logic on faucet capital that you ran in paper mode
Compare actual fills, slippage, and execution quality against paper expectations
Use the real execution data to calibrate your slippage models before deploying larger capital
Graduate to full live capital only after faucet results validate paper predictions

Faucet advantage: Real fills reveal microstructure details that paper trading cannot simulate — partial fills during fast markets, API timeout behavior under load, and real funding rate accrual. Use this data to update your TransactionCostModel before going live.
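
One way to act on the calibration step above is to measure the average realized slippage of faucet fills against the quotes the paper model would have used. The sketch below is a minimal illustration; the function name and the fill record keys (`side`, `quoted_price`, `fill_price`) are assumptions, not part of any Purple Flea API.

```python
from statistics import mean


def calibrate_slippage_bps(fills: list[dict]) -> float:
    """Average realized slippage versus the quoted price, in basis points.

    Each fill is a dict with 'side' ('buy' or 'sell'), 'quoted_price' (the
    price the paper model would have used) and 'fill_price' (the real fill).
    Positive values mean the real fill was worse than the quote.
    """
    if not fills:
        raise ValueError('need at least one fill to calibrate')
    costs = []
    for f in fills:
        sign = 1 if f['side'] == 'buy' else -1
        costs.append(sign * (f['fill_price'] - f['quoted_price']) / f['quoted_price'] * 10_000)
    return mean(costs)


faucet_fills = [
    {'side': 'buy', 'quoted_price': 100.0, 'fill_price': 100.04},   # about 4 bps worse
    {'side': 'sell', 'quoted_price': 100.0, 'fill_price': 99.94},   # about 6 bps worse
]
observed_bps = calibrate_slippage_bps(faucet_fills)
print(round(observed_bps, 2))
```

The measured value can then replace the default passed to the paper model, for example as the `slippage_bps` argument of `PaperOrderBook`, so that later paper runs reflect observed costs.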

import httpx
import asyncio

class FaucetClient:
    """
    Client for Purple Flea faucet API.
    Handles agent registration and balance claims.
    """

    BASE_URL = 'https://faucet.purpleflea.com'

    def __init__(self, agent_id: str, wallet_address: str):
        self.agent_id = agent_id
        self.wallet = wallet_address
        self.client = httpx.AsyncClient(timeout=30.0)

    async def register(self) -> dict:
        """Register agent with the faucet service."""
        resp = await self.client.post(
            f'{self.BASE_URL}/api/register',
            json={
                'agent_id': self.agent_id,
                'wallet_address': self.wallet,
                'agent_type': 'trading',
                'paper_trading_days': 30  # Indicate paper trading completion
            }
        )
        resp.raise_for_status()
        return resp.json()

    async def claim_funds(self) -> dict:
        """Claim free initial capital from faucet."""
        resp = await self.client.post(
            f'{self.BASE_URL}/api/claim',
            json={
                'agent_id': self.agent_id,
                'wallet_address': self.wallet
            }
        )
        resp.raise_for_status()
        return resp.json()

    async def get_balance(self) -> dict:
        """Get current faucet account balance."""
        resp = await self.client.get(
            f'{self.BASE_URL}/api/balance',
            params={'agent_id': self.agent_id}
        )
        resp.raise_for_status()
        return resp.json()

    async def close(self):
        await self.client.aclose()


class FaucetBridgeTradingAgent:
    """
    Agent that trades faucet capital to bridge paper to live.
    Shadows the same strategy logic used in paper trading.
    """

    def __init__(
        self,
        agent_id: str,
        wallet_address: str,
        paper_returns: pd.Series
    ):
        self.faucet = FaucetClient(agent_id, wallet_address)
        self.paper_tracker = PerformanceMetrics(paper_returns)
        self.faucet_returns: list[float] = []
        self.overfitting_detector = OverfittingDetector()
        self.paper_book = PaperOrderBook(slippage_bps=5, spread_bps=3)
        self.parallel_analyzer = ParallelTradingAnalyzer()

    async def initialize(self) -> None:
        """Register with faucet and claim initial capital."""
        print("Registering with Purple Flea faucet...")
        reg = await self.faucet.register()
        print(f"Registration: {reg}")

        print("Claiming initial funds...")
        claim = await self.faucet.claim_funds()
        print(f"Claimed: {claim}")

        balance = await self.faucet.get_balance()
        print(f"Starting balance: {balance}")

    def record_faucet_trade(
        self,
        paper_fill: float,
        live_fill: float,
        decision_price: float,
        side: str,
        quantity: float,
        symbol: str = 'BTC-PERP'
    ) -> None:
        """Record a faucet trade and compare to paper simulation."""
        comparison = TradeComparison(
            timestamp=datetime.utcnow(),
            symbol=symbol,
            side=side,
            quantity=quantity,
            paper_fill_price=paper_fill,
            live_fill_price=live_fill,
            decision_price=decision_price
        )
        self.parallel_analyzer.add_comparison(comparison)

    def daily_review(self, faucet_return: float) -> dict:
        """
        Daily review comparing faucet vs paper performance.
        Use this data to decide when to graduate to full live capital.
        """
        self.faucet_returns.append(faucet_return)
        self.overfitting_detector.record(faucet_return)

        if len(self.faucet_returns) < 5:
            return {'status': 'accumulating_data'}

        faucet_pm = PerformanceMetrics(pd.Series(self.faucet_returns))
        paper_pm = self.paper_tracker

        execution_quality = self.parallel_analyzer.analyze()

        return {
            'faucet_sharpe': faucet_pm.sharpe_ratio(),
            'paper_sharpe': paper_pm.sharpe_ratio(),
            'performance_ratio': (
                faucet_pm.sharpe_ratio() / paper_pm.sharpe_ratio()
                if paper_pm.sharpe_ratio() > 0 else None
            ),
            'execution_quality': execution_quality,
            'overfitting_risk': self.overfitting_detector.analyze().get('overfitting_risk', 'HIGH'),
            'ready_for_full_live': (
                len(self.faucet_returns) >= 14 and
                faucet_pm.sharpe_ratio() >= 0.8 and
                execution_quality.get('avg_paper_live_divergence_bps', 999) < 5
            )
        }

    async def shutdown(self) -> None:
        await self.faucet.close()
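
The graduation test in `daily_review` depends on `PerformanceMetrics.sharpe_ratio()`, which is not shown in this section. The sketch below illustrates one way such a check might work. It assumes an annualized Sharpe ratio computed from daily returns with 365 periods per year (crypto markets trade every day) and a zero risk-free rate; both are assumptions of this example, and `annualized_sharpe` and `ready_for_full_live` are hypothetical names. The thresholds (14 days, Sharpe of 0.8, 5 bps divergence) are taken from the code above.

```python
import math
import statistics


def annualized_sharpe(daily_returns: list[float], periods_per_year: int = 365) -> float:
    """Annualized Sharpe ratio with a zero risk-free rate (an assumption of this sketch)."""
    if len(daily_returns) < 2:
        return 0.0
    mean = statistics.fmean(daily_returns)
    stdev = statistics.stdev(daily_returns)  # sample standard deviation
    if stdev == 0:
        return 0.0
    return mean / stdev * math.sqrt(periods_per_year)


def ready_for_full_live(
    daily_returns: list[float],
    divergence_bps: float,
    min_days: int = 14,
    min_sharpe: float = 0.8,
    max_divergence_bps: float = 5.0,
) -> bool:
    """Apply the three graduation criteria: enough days, enough Sharpe, small divergence."""
    return (
        len(daily_returns) >= min_days
        and annualized_sharpe(daily_returns) >= min_sharpe
        and divergence_bps < max_divergence_bps
    )


# Illustrative returns only: 14 days alternating +0.4% and -0.2%
returns = [0.004, -0.002] * 7
print(ready_for_full_live(returns, divergence_bps=3.2))
print(ready_for_full_live(returns, divergence_bps=8.0))
```

Requiring all three conditions at once means a strong Sharpe ratio cannot compensate for fills that diverge from the paper simulation, which is the point of the faucet stage.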

9. Complete Python PaperTradingAgent

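The following integrates the components into a single paper trading agent that can shadow Purple Flea's live API and track readiness for live deployment. It relies on the PaperOrderBook, OverfittingDetector, LiveTransitionGate, and PerformanceMetrics classes introduced earlier: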

import asyncio
import httpx
import pandas as pd
import numpy as np
from datetime import datetime, timedelta

class PaperTradingAgent:
    """
    Full paper trading agent that shadows Purple Flea live API.
    Tracks all performance metrics and enforces go-live criteria.
    """

    PURPLE_FLEA_API = 'https://purpleflea.com/api'

    def __init__(
        self,
        strategy_fn: callable,
        strategy_params: dict,
        initial_paper_capital: float = 10_000,
        agent_id: str = 'paper-agent-001'
    ):
        self.strategy_fn = strategy_fn
        self.strategy_params = strategy_params
        self.capital = initial_paper_capital
        self.agent_id = agent_id

        # Components
        self.order_book = PaperOrderBook(slippage_bps=5)
        self.overfitting = OverfittingDetector()
        self.transition_gate = LiveTransitionGate()

        # State tracking
        self.start_time = datetime.utcnow()
        self.daily_returns: list[float] = []
        self.price_history: dict[str, list] = {}
        self.n_trades = 0
        self.equity_history: list[tuple] = [(self.start_time, initial_paper_capital)]

        self._running = False
        self._client = httpx.AsyncClient(timeout=10.0)

    async def fetch_market_data(self, symbol: str) -> dict:
        """Fetch live market data from Purple Flea API."""
        resp = await self._client.get(
            f'{self.PURPLE_FLEA_API}/market/ticker',
            params={'symbol': symbol}
        )
        if resp.status_code == 200:
            return resp.json()
        return {}

    def on_market_update(self, symbol: str, data: dict) -> None:
        """Process live market tick — core paper trading loop."""
        bid = data.get('bid', 0)
        ask = data.get('ask', 0)
        last = data.get('last', 0)

        if not last:
            return

        # Track price history for strategy
        if symbol not in self.price_history:
            self.price_history[symbol] = []
        self.price_history[symbol].append(last)

        # Generate signal
        prices = pd.Series(self.price_history[symbol])
        signal = self.strategy_fn(prices, **self.strategy_params)

        # Get current signal value
        current_signal = signal.iloc[-1] if len(signal) > 0 else 0
        prev_signal = signal.iloc[-2] if len(signal) > 1 else 0

        # Execute orders on signal change
        if current_signal != prev_signal:
            # Open quantity per symbol, created lazily so exits sell what entries bought
            if not hasattr(self, '_open_qty'):
                self._open_qty: dict[str, float] = {}

            if current_signal > 0 and prev_signal <= 0:
                # Enter long
                size_usd = self.capital * 0.05  # 5% of capital
                quantity = size_usd / last
                self.order_book.submit_order(symbol, 'buy', quantity)
                fills = self.order_book.process_market_update(symbol, bid, ask, last)
                if fills:
                    self.n_trades += 1
                    self._open_qty[symbol] = self._open_qty.get(symbol, 0.0) + quantity
                    print(f"[PAPER LONG] {symbol} @ ${fills[0].fill_price:.2f}, size: ${size_usd:.2f}")

            elif current_signal <= 0 and prev_signal > 0:
                # Exit long: sell the quantity recorded at entry, if any
                quantity = self._open_qty.get(symbol, 0.0)
                if quantity > 0:
                    self.order_book.submit_order(symbol, 'sell', quantity)
                    fills = self.order_book.process_market_update(symbol, bid, ask, last)
                    if fills:
                        self.n_trades += 1
                        self._open_qty[symbol] = 0.0
                        print(f"[PAPER EXIT] {symbol} @ ${fills[0].fill_price:.2f}")

    def end_of_day(self, current_equity: float) -> dict:
        """Record end-of-day P&L and check readiness metrics."""
        prev_equity = self.equity_history[-1][1]
        daily_ret = (current_equity - prev_equity) / prev_equity
        self.daily_returns.append(daily_ret)
        self.equity_history.append((datetime.utcnow(), current_equity))
        self.capital = current_equity
        self.overfitting.record(daily_ret)

        # Check go-live readiness
        readiness = self.transition_gate.evaluate(
            paper_returns=pd.Series(self.daily_returns),
            n_trades=self.n_trades,
            paper_start_date=self.start_time,
            overfitting_risk=self.overfitting.analyze().get('overfitting_risk', 'HIGH'),
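            # Hardcoded here for brevity; set these from real test results
            # before trusting the go-live recommendation
            circuit_breakers_tested=True,
            risk_limits_integrated=True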
        )

        return {
            'day': len(self.daily_returns),
            'equity': current_equity,
            'daily_return': daily_ret,
            'total_return': (current_equity / self.equity_history[0][1]) - 1,
            'n_trades': self.n_trades,
            'go_live_ready': readiness['go_live'],
            'recommendation': readiness['recommendation']
        }

    async def run(self, symbols: list[str], poll_interval_sec: float = 5.0) -> None:
        """Main paper trading loop."""
        self._running = True
        print(f"Paper trading agent {self.agent_id} started")
        print(f"Initial capital: ${self.capital:,.2f}")

        while self._running:
            for symbol in symbols:
                try:
                    data = await self.fetch_market_data(symbol)
                    if data:
                        self.on_market_update(symbol, data)
                except Exception as e:
                    print(f"Error fetching {symbol}: {e}")

            await asyncio.sleep(poll_interval_sec)

    def stop(self) -> dict:
        """Stop paper trading and generate final report."""
        self._running = False
        if len(self.daily_returns) == 0:
            return {'error': 'No trading data'}

        returns = pd.Series(self.daily_returns)
        pm = PerformanceMetrics(returns)
        return {
            'agent_id': self.agent_id,
            'paper_trading_days': (datetime.utcnow() - self.start_time).days,
            'total_trades': self.n_trades,
            'performance': pm.report(),
            'overfitting': self.overfitting.analyze(),
        }
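
`PaperTradingAgent` calls `strategy_fn(prices, **strategy_params)` on the accumulated price history and reads the last two values of the returned Series. As a minimal sketch of a compatible strategy, the example below uses a simple moving-average crossover. The function name `sma_crossover` and its `fast` and `slow` parameters are hypothetical, chosen for illustration, and the prices are made up.

```python
import pandas as pd


def sma_crossover(prices: pd.Series, fast: int = 3, slow: int = 5) -> pd.Series:
    """Return 1 where the fast SMA is above the slow SMA, else 0."""
    fast_ma = prices.rolling(fast).mean()
    slow_ma = prices.rolling(slow).mean()
    # Comparisons against NaN (the warm-up period) are False, so the signal starts flat at 0
    return (fast_ma > slow_ma).astype(int)


# Illustrative prices only: a dip followed by a recovery
prices = pd.Series([100, 99, 98, 97, 96, 97, 99, 102, 104, 103], dtype=float)
signal = sma_crossover(prices, fast=3, slow=5)
print(signal.tolist())
```

A function like this could be passed as `PaperTradingAgent(strategy_fn=sma_crossover, strategy_params={'fast': 3, 'slow': 5})`. Because the agent recomputes the signal over the full history on every tick, a long-running deployment might cap the history length; that optimization is left out of this sketch.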

Ready to Trade? Start with the Purple Flea Faucet

Complete your paper trading phase, then claim free initial capital from the faucet to bridge your strategy to live trading. None of your own capital at risk, real execution data, real market feedback.
