1. Mean Reversion Theory
Mean reversion is the empirical observation that asset prices — or more precisely, spreads between related assets — tend to drift back toward a long-run equilibrium after deviating from it. Unlike trend-following, which bets that current momentum continues, mean reversion bets that today's dislocation is temporary.
For autonomous AI agents, mean reversion is particularly attractive because:
- Entry and exit signals are mathematically defined (Z-score thresholds), reducing discretionary ambiguity.
- The Ornstein-Uhlenbeck (OU) framework gives an analytical estimate of spread volatility, which feeds directly into position sizing.
- The strategy is market-neutral when properly hedged, so performance is largely uncorrelated with broad crypto beta.
- Agents can monitor dozens of pairs simultaneously without fatigue, exploiting micro-dislocations that human traders miss.
The core statistical requirement is stationarity: the spread series must have a constant long-run mean and a finite, stable variance, so that deviations decay rather than accumulate. A non-stationary series (a random walk) has no long-run mean to revert to, so testing for stationarity is the critical first step.
Crypto perpetual futures are ideal for mean reversion: 24/7 trading, no borrow restrictions for short legs, tight spreads on major pairs, and frictionless API access for autonomous agents.
2. Pairs Trading Mechanics
Pairs trading constructs a synthetic spread between two correlated assets X and Y. The agent simultaneously goes long one leg and short the other in a ratio that creates a stationary spread series.
The Hedge Ratio
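The hedge ratio beta is estimated via ordinary least squares regression of Y on X:

$$Y_t = \alpha + \beta X_t + \varepsilon_t$$

The spread is then defined as the regression residual:

$$S_t = Y_t - \beta X_t - \alpha$$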
Dynamic Hedge Ratios
Static OLS regression gives a single hedge ratio for the entire lookback window. In practice, the relationship between crypto assets drifts over time. Agents should implement rolling regression (e.g., 60-day window) or Kalman filter estimation to track the evolving hedge ratio dynamically.
| Method | Advantage | Disadvantage | Best for |
|---|---|---|---|
| Static OLS | Simple, fast | Stale in regime shifts | Stable macro pairs |
| Rolling OLS | Adapts to drift | Lag in fast-moving markets | Most crypto pairs |
| Kalman Filter | Real-time adaptive | Complex tuning | High-frequency intraday |
| PCA / Eigen | Multi-asset portfolios | Hard to interpret | Basket pairs |
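As a rough illustration of the rolling OLS row, the sketch below estimates the hedge ratio as rolling Cov(Y, X) / Var(X) using pandas. The function name `rolling_hedge_ratio` and the synthetic data are assumptions made for this example; the 60-bar window follows the figure mentioned above.

```python
import numpy as np
import pandas as pd


def rolling_hedge_ratio(price_y: pd.Series, price_x: pd.Series, window: int = 60) -> pd.Series:
    """Rolling OLS slope of Y on X: Cov(Y, X) / Var(X) over the last `window` bars."""
    cov_yx = price_y.rolling(window).cov(price_x)
    var_x = price_x.rolling(window).var()
    return cov_yx / var_x


# Synthetic data: the true ratio shifts from 2.0 to 3.0 halfway through.
rng = np.random.default_rng(0)
n = 400
x = pd.Series(100 + np.cumsum(rng.normal(0, 1, n)))
true_beta = np.where(np.arange(n) < 200, 2.0, 3.0)
y = pd.Series(true_beta * x.to_numpy() + rng.normal(0, 0.5, n))

beta = rolling_hedge_ratio(y, x, window=60)
print(f"beta just before the shift: {beta.iloc[199]:.2f}")
print(f"beta at the end of the sample: {beta.iloc[-1]:.2f}")
```

Windows that straddle the shift mix both regimes and produce unreliable estimates, which is the lag listed in the table.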
3. Cointegration Testing: ADF and KPSS
Before trading any pair, the agent must verify that the two price series are cointegrated, meaning they share a common stochastic trend so that a linear combination of them (the spread) is stationary and mean-reverting. Two complementary tests are standard:
Augmented Dickey-Fuller (ADF) Test
The ADF test has the null hypothesis that the series has a unit root (is non-stationary). Rejecting the null (p-value < 0.05) is evidence of stationarity. However, the ADF has low power in small samples, and a structural break in the series biases it toward failing to reject the unit root, so a genuinely mean-reverting spread can be missed.
KPSS Test
The KPSS test reverses the null: the null is stationarity. Failing to reject KPSS (p-value > 0.05) combined with rejecting ADF provides a much stronger confirmation. Agents should use both tests together.
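Because the two tests have opposite nulls, their p-values combine into three outcomes. The helper below is a minimal sketch of that decision rule; the name `classify_stationarity` and the "inconclusive" label are assumptions for illustration and do not belong to any library.

```python
def classify_stationarity(adf_p: float, kpss_p: float, alpha: float = 0.05) -> str:
    """Combine ADF (null: unit root) and KPSS (null: stationary) p-values."""
    adf_rejects_unit_root = adf_p < alpha
    kpss_rejects_stationarity = kpss_p < alpha
    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"        # both tests agree: tradeable candidate
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "non_stationary"    # both tests agree: skip the pair
    return "inconclusive"          # tests disagree: gather more data or skip


print(classify_stationarity(adf_p=0.01, kpss_p=0.10))
print(classify_stationarity(adf_p=0.40, kpss_p=0.01))
print(classify_stationarity(adf_p=0.01, kpss_p=0.01))
```

Treating the disagreement cases as non-tradeable is the conservative choice and matches requiring both conditions at once.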
Engle-Granger Two-Step Procedure
For pairs trading, the formal cointegration test regresses one asset on the other, then tests the residuals for stationarity:
- Regress Y on X via OLS to obtain residuals.
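- Apply ADF to the residuals. If p < 0.05, the pair is cointegrated. Because the residuals come from an estimated regression, the p-value should use Engle-Granger critical values rather than the standard ADF tables; the coint function in statsmodels does this.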
- Optionally apply Johansen test for multi-variate cointegration (baskets of 3+ assets).
```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller, kpss, coint


def test_cointegration(price_y: pd.Series, price_x: pd.Series) -> dict:
    """
    Full cointegration test suite for a candidate pair.
    Returns dict with test statistics and tradeable flag.
    """
    # Engle-Granger cointegration test
    score, pvalue, _ = coint(price_y, price_x)

    # OLS hedge ratio: Cov(Y, X) / Var(X), both with ddof=1
    beta = np.cov(price_y, price_x)[0, 1] / np.var(price_x, ddof=1)
    alpha = np.mean(price_y) - beta * np.mean(price_x)
    spread = price_y - beta * price_x - alpha

    # ADF on spread (unit-root null), up to 10 lags chosen by AIC
    adf_stat, adf_p, _, _, adf_crit, _ = adfuller(spread, maxlag=10, autolag='AIC')

    # KPSS on spread (stationarity null)
    kpss_stat, kpss_p, _, kpss_crit = kpss(spread, regression='c', nlags='auto')

    # Require all three tests to agree
    tradeable = (adf_p < 0.05) and (kpss_p > 0.05) and (pvalue < 0.05)
    return {
        'coint_pvalue': pvalue,
        'adf_pvalue': adf_p,
        'kpss_pvalue': kpss_p,
        'hedge_ratio': beta,
        'alpha': alpha,
        'tradeable': tradeable,
        'spread_mean': float(spread.mean()),
        'spread_std': float(spread.std()),
    }


# Usage example
btc_prices = pd.Series([...])  # BTCUSDT daily closes
eth_prices = pd.Series([...])  # ETHUSDT daily closes
result = test_cointegration(eth_prices, btc_prices)  # Y = ETH, X = BTC
if result['tradeable']:
    print(f"BTC/ETH cointegrated, hedge ratio: {result['hedge_ratio']:.4f}")
```
4. Ornstein-Uhlenbeck Process Calibration
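The Ornstein-Uhlenbeck (OU) process is the continuous-time mathematical model that describes how a mean-reverting spread evolves. It is defined by the stochastic differential equation:

$$dS_t = \theta(\mu - S_t)\,dt + \sigma\,dW_t$$

Here theta is the speed of mean reversion, mu is the long-run mean, sigma is the instantaneous volatility, and W_t is a standard Brownian motion.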
Discretized OU for Estimation
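In discrete time (daily or hourly bars) with time step $\Delta t$, the OU process becomes an AR(1) model:

$$S_t = a + b\,S_{t-1} + \varepsilon_t, \qquad b = e^{-\theta \Delta t}, \quad a = \mu(1 - b), \quad \mathrm{Var}(\varepsilon_t) = \frac{\sigma^2 (1 - b^2)}{2\theta}$$

Fitting an OLS regression of S(t) on S(t-1) gives estimates of a and b, from which we recover theta, mu, and sigma directly: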
```python
import numpy as np
import pandas as pd
from scipy.stats import linregress


def calibrate_ou(spread: np.ndarray, dt: float = 1.0) -> dict:
    """
    Calibrate Ornstein-Uhlenbeck parameters from a spread time series.
    dt: time step in days (1.0 for daily, 1/24 for hourly)
    """
    S = spread
    S_lag = S[:-1]
    S_current = S[1:]

    # OLS regression: S(t) = a + b*S(t-1) + epsilon
    b, a, r_value, p_value, se_b = linregress(S_lag, S_current)
    if b >= 1.0 or b <= 0.0:
        return {'valid': False, 'reason': 'Non-stationary: b outside (0,1)'}

    # Recover OU parameters
    theta = -np.log(b) / dt   # speed of mean reversion, per day
    mu = a / (1 - b)          # long-run mean
    residuals = S_current - (a + b * S_lag)
    sigma_eps = np.std(residuals, ddof=2)

    # sigma in continuous time
    sigma = sigma_eps * np.sqrt(2 * theta / (1 - b**2))
    # Equilibrium (stationary) std
    sigma_eq = sigma / np.sqrt(2 * theta)
    # Half-life: time to revert halfway to mean
    half_life = np.log(2) / theta

    return {
        'valid': True,
        'theta': theta,
        'mu': mu,
        'sigma': sigma,
        'sigma_eq': sigma_eq,
        # theta is per day (dt is in days), so the half-life is already in days
        'half_life_days': half_life,
        'b': b,
        'r_squared': r_value**2,
    }


# Example
spread_series = pd.Series([...])  # computed spread
params = calibrate_ou(spread_series.values, dt=1.0)
print(f"Half-life: {params['half_life_days']:.1f} days")
print(f"Mean reversion speed theta: {params['theta']:.4f}")
print(f"Equilibrium vol sigma_eq: {params['sigma_eq']:.4f}")
```
Only trade pairs with a half-life between 2 and 30 days. Too short (<2d) and transaction costs eat returns; too long (>30d) and capital is tied up inefficiently. Sweet spot for crypto is 5–15 days.
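Before trusting a half-life estimate on real data, it can help to check the estimator on a simulated path whose true half-life is known. The sketch below simulates an OU process with its exact discretization and recovers the half-life from an AR(1) fit using `np.polyfit`. The function names and parameter values are assumptions chosen for this example, not part of the calibration code above.

```python
import numpy as np


def simulate_ou(theta: float, mu: float, sigma: float, n: int,
                dt: float = 1.0, s0: float = 0.0, seed: int = 0) -> np.ndarray:
    """Simulate an OU path with the exact AR(1) discretization."""
    rng = np.random.default_rng(seed)
    b = np.exp(-theta * dt)
    noise_std = sigma * np.sqrt((1 - b**2) / (2 * theta))
    s = np.empty(n)
    s[0] = s0
    for t in range(1, n):
        s[t] = mu + b * (s[t - 1] - mu) + noise_std * rng.normal()
    return s


def half_life_from_ar1(spread: np.ndarray, dt: float = 1.0) -> float:
    """Fit S(t) = a + b*S(t-1) and convert the slope b to a half-life."""
    b, a = np.polyfit(spread[:-1], spread[1:], 1)  # returns [slope, intercept]
    theta = -np.log(b) / dt
    return float(np.log(2) / theta)


true_theta = 0.1  # per day, so the true half-life is ln(2) / 0.1, about 6.9 days
path = simulate_ou(theta=true_theta, mu=0.0, sigma=1.0, n=20_000)
estimated = half_life_from_ar1(path)
print(f"true half-life: {np.log(2) / true_theta:.2f} days, estimated: {estimated:.2f} days")
```

With only a few hundred bars, which is typical for a 60 to 90 day lookback on hourly or daily data, the estimate is much noisier than in this long simulated sample.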
5. Z-Score Trading Rules
Once the OU parameters are calibrated, the agent converts the raw spread into a standardized Z-score that measures how far the spread has deviated from its equilibrium mean, in units of standard deviations:

$$Z_t = \frac{S_t - \mu}{\sigma_{eq}}, \qquad \sigma_{eq} = \frac{\sigma}{\sqrt{2\theta}}$$
Entry and Exit Logic
Standard Z-score rules define four thresholds:
- Entry Long Spread: Z < -2.0 (spread too low, expect reversion upward)
- Entry Short Spread: Z > +2.0 (spread too high, expect reversion downward)
- Exit / Take Profit: |Z| < 0.25 (spread near equilibrium)
- Stop Loss: |Z| > 3.5 (spread moved further against position)
Adaptive Thresholds
Fixed thresholds of ±2.0 are a starting point, but agents can optimize thresholds using historical backtests. Key considerations:
- Tighter entries (±1.5) increase trade frequency but worsen signal quality.
- Wider entries (±2.5) produce higher per-trade returns but fewer opportunities.
- Regime-dependent thresholds: widen entries during high-volatility regimes to avoid whipsaws.
- Rolling Z-score recalculation prevents the agent from trading a pair whose properties have shifted.
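To get a feel for the frequency side of this trade-off, the sketch below computes the fraction of time a Z-score spends beyond a threshold, assuming Z is standard normal (which holds for the stationary distribution of an OU spread with known parameters). The function name is an assumption for this example, and time spent beyond a threshold is only a proxy for the number of entries.

```python
import math


def fraction_of_time_beyond(k: float) -> float:
    """P(|Z| > k) for a standard normal Z."""
    return math.erfc(k / math.sqrt(2))


for k in (1.5, 2.0, 2.5):
    print(f"|Z| > {k}: {fraction_of_time_beyond(k):.4f}")
# roughly 0.134, 0.046 and 0.012 of the time respectively
```

Moving the entry from ±1.5 to ±2.5 cuts the time spent in the entry zone by about a factor of ten under this assumption; fat tails in real spreads will shift these figures.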
```python
import numpy as np
import pandas as pd
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Signal(Enum):
    LONG_SPREAD = "long_spread"    # long Y, short X
    SHORT_SPREAD = "short_spread"  # short Y, long X
    HOLD = "hold"
    CLOSE = "close"


@dataclass
class ZScoreConfig:
    entry_threshold: float = 2.0
    exit_threshold: float = 0.25
    stop_loss_threshold: float = 3.5
    lookback_window: int = 60  # days for rolling stats


class ZScoreStrategy:
    def __init__(self, config: Optional[ZScoreConfig] = None):
        self.config = config or ZScoreConfig()
        self.position: Signal = Signal.HOLD

    def compute_zscore(self, spread: pd.Series) -> pd.Series:
        """Rolling Z-score; the window grows from 20 bars up to lookback_window."""
        rolling_mean = spread.rolling(self.config.lookback_window, min_periods=20).mean()
        rolling_std = spread.rolling(self.config.lookback_window, min_periods=20).std()
        return (spread - rolling_mean) / rolling_std.clip(lower=1e-8)

    def get_signal(self, z: float) -> Signal:
        """State-machine signal logic with hysteresis."""
        cfg = self.config
        if self.position == Signal.HOLD:
            if z < -cfg.entry_threshold:
                return Signal.LONG_SPREAD
            elif z > cfg.entry_threshold:
                return Signal.SHORT_SPREAD
            return Signal.HOLD
        elif self.position == Signal.LONG_SPREAD:
            if abs(z) < cfg.exit_threshold:
                return Signal.CLOSE  # take profit near equilibrium
            elif z < -cfg.stop_loss_threshold:
                return Signal.CLOSE  # stop loss
            return Signal.LONG_SPREAD
        elif self.position == Signal.SHORT_SPREAD:
            if abs(z) < cfg.exit_threshold:
                return Signal.CLOSE  # take profit near equilibrium
            elif z > cfg.stop_loss_threshold:
                return Signal.CLOSE  # stop loss
            return Signal.SHORT_SPREAD
        # Position is CLOSE (just exited): stay flat for one bar before re-arming
        return Signal.HOLD

    def run(self, spread: pd.Series) -> pd.DataFrame:
        """Generate full signal series from spread."""
        zscores = self.compute_zscore(spread)
        signals = []
        for z in zscores:
            if np.isnan(z):
                # Not enough history yet for a rolling Z-score
                signals.append(Signal.HOLD)
                continue
            sig = self.get_signal(z)
            self.position = sig
            signals.append(sig)
        return pd.DataFrame({'spread': spread, 'zscore': zscores, 'signal': signals})
```
6. Half-Life Estimation and Capital Efficiency
The half-life of mean reversion determines how quickly the spread returns to equilibrium. It directly impacts strategy economics:

$$t_{1/2} = \frac{\ln 2}{\theta}$$
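To make the half-life concrete: under the OU model the expected deviation from the mean decays as exp(-theta * t), which equals 0.5 ** (t / half_life). The helper below is a small sketch of that relationship; its name is an assumption for this example.

```python
def expected_remaining_deviation(days_held: float, half_life_days: float) -> float:
    """Expected fraction of the initial deviation still present after days_held."""
    return 0.5 ** (days_held / half_life_days)


# A spread entered at Z = -2.0 with a 5-day half-life:
entry_z = -2.0
for days in (5, 10, 15):
    remaining = expected_remaining_deviation(days, half_life_days=5.0)
    print(f"after {days:>2} days: expected Z = {entry_z * remaining:+.3f}")
```

This is an expectation only; any single path will scatter around it with the spread's own volatility.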
Capital Turnover and Sharpe Ratio
A pair with a 5-day half-life will cycle through roughly 6 complete mean reversion episodes per month. A 20-day half-life pair completes only 1.5 cycles. Shorter half-lives generate more alpha per unit of capital deployed but increase transaction costs. The optimal half-life maximizes:
Practical Half-Life Screening
Agents should run nightly half-life screens across all viable pairs, filtering to a tradeable universe:
```python
import itertools
import pandas as pd
from typing import List

# Relies on test_cointegration() and calibrate_ou() defined in the earlier sections.


def screen_pairs(
    prices: dict[str, pd.Series],
    min_half_life: float = 2.0,
    max_half_life: float = 30.0,
    min_r2: float = 0.85,
) -> List[dict]:
    """
    Screen all asset pairs for mean reversion viability.
    prices: dict of {symbol: price_series}
    Returns ranked list of tradeable pairs.
    """
    symbols = list(prices.keys())
    results = []
    for sym_y, sym_x in itertools.combinations(symbols, 2):
        series_y = prices[sym_y].dropna()
        series_x = prices[sym_x].dropna()

        # Align on common index
        aligned = pd.concat([series_y, series_x], axis=1).dropna()
        if len(aligned) < 60:
            continue  # insufficient history

        coint_result = test_cointegration(aligned.iloc[:, 0], aligned.iloc[:, 1])
        if not coint_result['tradeable']:
            continue

        # Compute spread and calibrate OU
        spread = (aligned.iloc[:, 0]
                  - coint_result['hedge_ratio'] * aligned.iloc[:, 1]
                  - coint_result['alpha'])
        ou_params = calibrate_ou(spread.values, dt=1.0)
        if not ou_params['valid']:
            continue

        hl = ou_params['half_life_days']
        r2 = ou_params['r_squared']
        # Note: the AR(1) R² is roughly b², so min_r2=0.85 by itself already
        # excludes half-lives shorter than about 8.5 bars.
        if min_half_life <= hl <= max_half_life and r2 >= min_r2:
            results.append({
                'pair': f"{sym_y}/{sym_x}",
                'sym_y': sym_y,
                'sym_x': sym_x,
                'half_life': hl,
                'hedge_ratio': coint_result['hedge_ratio'],
                'sigma_eq': ou_params['sigma_eq'],
                'r_squared': r2,
                'coint_p': coint_result['coint_pvalue'],
            })

    # Rank by half-life ascending (fastest reversion first)
    return sorted(results, key=lambda d: d['half_life'])


# Nightly screen example (price_data: dict of {symbol: daily close Series})
ranked_pairs = screen_pairs(price_data, min_half_life=3, max_half_life=20)
print(f"Found {len(ranked_pairs)} tradeable pairs")
for p in ranked_pairs[:10]:
    print(f"  {p['pair']}: HL={p['half_life']:.1f}d, beta={p['hedge_ratio']:.3f}, R²={p['r_squared']:.3f}")
```
7. Execution Timing and Transaction Cost Management
Even a theoretically profitable mean reversion strategy can be destroyed by poor execution. Agents must account for several sources of friction:
Bid-Ask Spread and Slippage
For a pairs trade, the agent crosses the bid-ask spread twice (once per leg) on both entry and exit, for 4 crosses per round trip. On major crypto perpetuals, spreads are typically 0.01–0.05% per cross, so the total round-trip cost on BTC/ETH might be 0.04–0.20% in spread alone.
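The round-trip arithmetic is simple enough to keep as a helper next to the signal code. The sketch below takes the per-cross cost figures quoted above at face value; the function name is an assumption for this example.

```python
def round_trip_spread_cost_pct(cost_per_cross_pct: float, legs: int = 2) -> float:
    """Total spread cost in percent: each leg is crossed once on entry and once on exit."""
    crosses = legs * 2
    return crosses * cost_per_cross_pct


low = round_trip_spread_cost_pct(0.01)
high = round_trip_spread_cost_pct(0.05)
print(f"round-trip spread cost: {low:.2f}% to {high:.2f}%")
```

A trade is only worth taking when the expected spread reversion, expressed as a percentage of notional, comfortably exceeds this figure plus fees and funding.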
Funding Rates
Perpetual futures accrue funding payments every 8 hours. For a long/short pairs trade, if the funding rates on both legs have the same sign, one leg pays and the other receives, roughly netting to zero when the rates and leg notionals are similar. However, during extreme sentiment, funding can reach 0.3–1.0% per day, which easily dominates the mean reversion return if the trade is held through multiple funding periods.
Execution Algorithm Selection
- TWAP (Time-Weighted Average Price): Minimizes market impact for large positions by spreading orders over time. Best for positions >$50K notional.
- Maker-only orders: Post limit orders at mid or better to earn rebates (typically −0.01% to −0.03% on most crypto venues). Agents should default to maker unless urgency requires taking.
- Simultaneous leg execution: Use atomic basket orders or near-simultaneous REST calls with sub-100ms timing to minimize leg-risk (the risk that one leg fills and the other doesn't).
The Purple Flea trading API supports simultaneous multi-leg orders on perp markets, reducing leg-risk for pairs traders. See trading docs for batch order endpoints.
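As an illustration of the TWAP idea, the sketch below splits a parent order into equal child orders spaced evenly over a time window. It only produces a schedule and does not call any exchange API; the function name and the example sizes are assumptions for this example.

```python
def twap_slices(total_size: float, n_slices: int, duration_seconds: float) -> list[tuple[float, float]]:
    """Return (offset_seconds, size) child orders of equal size, evenly spaced."""
    if n_slices < 1:
        raise ValueError("n_slices must be at least 1")
    interval = duration_seconds / n_slices
    child_size = total_size / n_slices
    return [(i * interval, child_size) for i in range(n_slices)]


# Work 10 units over 5 minutes in 5 child orders
schedule = twap_slices(total_size=10.0, n_slices=5, duration_seconds=300.0)
for offset, size in schedule:
    print(f"t+{offset:>5.0f}s: {size} units")
```

For a pairs trade, both legs would follow the same schedule so that the hedge ratio holds at every slice.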
8. Portfolio Construction and Risk Management
Running a single pair concentrates idiosyncratic risk: a regime shift, delisting, or regulatory event can blow up the spread permanently. A robust mean reversion portfolio diversifies across multiple uncorrelated pairs.
Equal Risk Contribution
Rather than equal notional allocation, allocate so that each pair contributes equally to total portfolio volatility. If pair i has equilibrium spread volatility sigma_i, the notional allocation N_i is:
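The allocation formula is not reproduced here, so the sketch below uses one common simplification as an explicit assumption: inverse-volatility weighting, N_i = N_total * (1/sigma_i) / sum_j (1/sigma_j), which equalizes N_i * sigma_i across pairs when the pairs are uncorrelated. It also assumes each sigma_i is expressed as a fraction of notional so that values are comparable across pairs; the function name and numbers are illustrative.

```python
def inverse_vol_allocation(sigmas: dict[str, float], total_notional: float) -> dict[str, float]:
    """Allocate notional proportional to 1/sigma so each pair carries equal standalone risk."""
    inverse = {pair: 1.0 / sigma for pair, sigma in sigmas.items()}
    total_inverse = sum(inverse.values())
    return {pair: total_notional * w / total_inverse for pair, w in inverse.items()}


# Spread volatility as a fraction of notional (assumed values)
sigmas = {"ETH/BTC": 0.02, "SOL/AVAX": 0.04}
allocation = inverse_vol_allocation(sigmas, total_notional=30_000)
for pair, notional in allocation.items():
    print(f"{pair}: ${notional:,.0f} (risk = {notional * sigmas[pair]:,.0f})")
```

A full equal-risk-contribution solution would also account for correlations between spreads, which requires an iterative solver rather than this closed form.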
Correlation Between Pairs
Pairs involving the same underlying (e.g., BTC/ETH and BTC/SOL) are not independent — both spread series will be shocked by a sudden BTC move. Agents should build a correlation matrix of active spread series and constrain the portfolio to limit correlated exposures.
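A minimal sketch of that check follows: it correlates the bar-to-bar changes of each spread and flags pairs of spreads whose absolute correlation exceeds a limit. The 0.6 limit, the function name, and the synthetic data (two spreads sharing a common BTC shock plus one independent spread) are assumptions for this example.

```python
import numpy as np
import pandas as pd


def flag_correlated_spreads(spreads: pd.DataFrame, max_abs_corr: float = 0.6) -> list[tuple[str, str, float]]:
    """Return (spread_a, spread_b, correlation) for spread changes correlated beyond the limit."""
    changes = spreads.diff().dropna()
    corr = changes.corr()
    names = list(corr.columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            c = float(corr.loc[a, b])
            if abs(c) > max_abs_corr:
                flagged.append((a, b, round(c, 3)))
    return flagged


# Synthetic paths; only their changes matter for this check.
rng = np.random.default_rng(1)
btc_shock = rng.normal(0, 1, 500)
spreads = pd.DataFrame({
    "BTC/ETH": np.cumsum(btc_shock + 0.3 * rng.normal(0, 1, 500)),
    "BTC/SOL": np.cumsum(btc_shock + 0.3 * rng.normal(0, 1, 500)),
    "AVAX/MATIC": np.cumsum(rng.normal(0, 1, 500)),
})
flagged = flag_correlated_spreads(spreads)
print(flagged)
```

An agent could then cap the combined notional of any flagged group, or allow only one open position per group.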
Stop-Loss and Drawdown Management
- Per-pair stop: close position if |Z| > 3.5 (spread widened further than expected).
- Portfolio drawdown stop: halt all new entries if portfolio is down >8% from peak.
- Stale-relationship detection: re-run cointegration tests weekly; drop any pair with ADF p > 0.10.
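The portfolio drawdown stop in the list above can be sketched as a small gate that the agent consults before opening any new position. The function names are assumptions for this example; the 8% limit is the figure from the list.

```python
def drawdown_from_peak(equity_curve: list[float]) -> float:
    """Current drawdown as a fraction of the running peak equity."""
    peak = max(equity_curve)
    return 1.0 - equity_curve[-1] / peak


def entries_allowed(equity_curve: list[float], max_drawdown: float = 0.08) -> bool:
    """New entries are halted once drawdown from peak exceeds max_drawdown."""
    return drawdown_from_peak(equity_curve) <= max_drawdown


print(entries_allowed([100_000, 110_000, 105_000]))  # about 4.5% below peak
print(entries_allowed([100_000, 110_000, 100_000]))  # about 9.1% below peak
```

Existing positions are left to their own per-pair exits; this gate only blocks new risk.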
9. Python MeanReversionAgent with Purple Flea Perp Hedges
The following is a complete skeleton for an autonomous mean reversion agent that integrates with the Purple Flea API for perpetual futures execution. It handles pair selection, signal generation, and order placement with risk controls.
```python
"""
MeanReversionAgent: autonomous pairs trader on Purple Flea perps
Requires: PURPLE_FLEA_API_KEY env variable
"""
import os
import asyncio
import logging
from dataclasses import dataclass, field
from datetime import datetime, timezone

import httpx
import pandas as pd

logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)s %(message)s')
log = logging.getLogger("MeanReversionAgent")

PF_API_BASE = "https://api.purpleflea.com/v1"
PF_API_KEY = os.environ["PURPLE_FLEA_API_KEY"]
HEADERS = {
    "Authorization": f"Bearer {PF_API_KEY}",
    "Content-Type": "application/json",
}


def utc_now() -> datetime:
    """Timezone-aware current UTC time."""
    return datetime.now(timezone.utc)


@dataclass
class Pair:
    sym_y: str
    sym_x: str
    hedge_ratio: float   # units of X per unit of Y (price-level OLS beta)
    mu: float            # spread equilibrium mean
    sigma_eq: float      # spread equilibrium std
    half_life: float     # days
    last_validated: datetime = field(default_factory=utc_now)


@dataclass
class Position:
    pair: Pair
    direction: str       # "long_spread" or "short_spread"
    entry_z: float
    notional_y: float
    notional_x: float
    entry_time: datetime = field(default_factory=utc_now)


class MeanReversionAgent:
    def __init__(self, target_pairs: list[Pair], max_positions: int = 5):
        self.pairs = target_pairs
        self.max_positions = max_positions
        self.positions: dict[str, Position] = {}
        self.pf = httpx.AsyncClient(base_url=PF_API_BASE, headers=HEADERS, timeout=15)

    async def get_price(self, symbol: str) -> float:
        """Fetch latest mark price from Purple Flea perp API."""
        r = await self.pf.get(f"/perps/{symbol}/price")
        r.raise_for_status()
        return float(r.json()["mark_price"])

    async def get_ohlcv(self, symbol: str, days: int = 90) -> pd.Series:
        """Fetch daily OHLCV and return close prices."""
        r = await self.pf.get(f"/perps/{symbol}/ohlcv", params={"period": "1d", "limit": days})
        r.raise_for_status()
        data = r.json()["candles"]
        return pd.Series([c["close"] for c in data])

    def compute_current_z(self, pair: Pair, price_y: float, price_x: float) -> float:
        """Compute instantaneous Z-score from current prices."""
        # pair.mu must be the mean of this same spread, i.e. it includes
        # the regression intercept alpha from the cointegration fit.
        spread = price_y - pair.hedge_ratio * price_x
        return (spread - pair.mu) / (pair.sigma_eq + 1e-10)

    async def place_pair_order(self, pair: Pair, direction: str, notional: float):
        """Place near-simultaneous long+short legs; notional is the Y-leg USD size."""
        price_y, price_x = await asyncio.gather(
            self.get_price(pair.sym_y),
            self.get_price(pair.sym_x),
        )
        size_y = notional / price_y
        # The hedge ratio is in units of X per unit of Y, so scale the Y size
        size_x = pair.hedge_ratio * size_y
        if direction == "long_spread":
            side_y, side_x = "buy", "sell"
        else:
            side_y, side_x = "sell", "buy"

        # Two concurrent requests (near-simultaneous, not atomic)
        orders = await asyncio.gather(
            self.pf.post("/perps/order", json={
                "symbol": pair.sym_y, "side": side_y,
                "size": round(size_y, 4), "order_type": "market",
            }),
            self.pf.post("/perps/order", json={
                "symbol": pair.sym_x, "side": side_x,
                "size": round(size_x, 4), "order_type": "market",
            }),
        )
        for o in orders:
            o.raise_for_status()
        log.info(f"Placed {direction} on {pair.sym_y}/{pair.sym_x} notional ${notional:,.0f}")

    async def run_cycle(self):
        """Single evaluation cycle across all monitored pairs."""
        for pair in self.pairs:
            key = f"{pair.sym_y}/{pair.sym_x}"
            price_y, price_x = await asyncio.gather(
                self.get_price(pair.sym_y),
                self.get_price(pair.sym_x),
            )
            z = self.compute_current_z(pair, price_y, price_x)
            log.info(f"{key} Z={z:.3f}")

            if key in self.positions:
                pos = self.positions[key]
                # Exit logic: take profit near equilibrium or stop out
                if abs(z) < 0.25 or abs(z) > 3.5:
                    close_dir = ("long_spread" if pos.direction == "short_spread"
                                 else "short_spread")
                    # Sizes are recomputed from current prices, so this may not
                    # flatten the opening sizes exactly; a production agent
                    # should store and close the filled sizes.
                    await self.place_pair_order(pair, close_dir, pos.notional_y)
                    del self.positions[key]
                    log.info(f"Closed {key} at Z={z:.3f}")
            elif len(self.positions) < self.max_positions:
                # Entry logic
                direction = None
                if z < -2.0:
                    direction = "long_spread"
                elif z > 2.0:
                    direction = "short_spread"
                if direction is not None:
                    notional = 10_000  # USD on the Y leg
                    await self.place_pair_order(pair, direction, notional)
                    self.positions[key] = Position(
                        pair=pair, direction=direction, entry_z=z,
                        notional_y=notional,
                        notional_x=notional * pair.hedge_ratio * price_x / price_y,
                    )

    async def run(self, interval_minutes: int = 60):
        """Main loop: evaluate every interval_minutes."""
        log.info(f"MeanReversionAgent started | {len(self.pairs)} pairs | max {self.max_positions} positions")
        while True:
            try:
                await self.run_cycle()
            except Exception as e:
                log.error(f"Cycle error: {e}")
            await asyncio.sleep(interval_minutes * 60)


if __name__ == "__main__":
    # Optional: register with the Purple Flea faucet to claim free capital first
    # GET https://faucet.purpleflea.com/claim?agent_id=YOUR_ID
    pairs = [
        Pair("ETH-PERP", "BTC-PERP", hedge_ratio=0.065, mu=0.0, sigma_eq=120.0, half_life=8.5),
        Pair("SOL-PERP", "AVAX-PERP", hedge_ratio=2.1, mu=0.0, sigma_eq=3.2, half_life=5.3),
    ]
    agent = MeanReversionAgent(target_pairs=pairs, max_positions=4)
    asyncio.run(agent.run(interval_minutes=60))
```
10. Advanced Risk Controls and Live Monitoring
Production mean reversion agents require several additional safeguards beyond basic signal logic:
Relationship Decay Detection
Cointegration relationships in crypto can break down over weeks as fundamentals shift. Agents should re-run the full cointegration test weekly on active pairs. If p-value exceeds 0.10 or the hedge ratio has drifted >20% from the entry value, the pair should be flagged for manual review or automatically closed.
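The two decay conditions described above can be expressed as a single check that runs after the weekly re-test. This is a minimal sketch; the function name and argument names are assumptions for this example, and the thresholds are the ones stated in the paragraph.

```python
def relationship_decayed(coint_p: float, entry_hedge_ratio: float, current_hedge_ratio: float,
                         max_p: float = 0.10, max_drift: float = 0.20) -> bool:
    """True if the cointegration p-value or the hedge-ratio drift breaches its limit."""
    drift = abs(current_hedge_ratio - entry_hedge_ratio) / abs(entry_hedge_ratio)
    return coint_p > max_p or drift > max_drift


print(relationship_decayed(coint_p=0.03, entry_hedge_ratio=0.065, current_hedge_ratio=0.068))
print(relationship_decayed(coint_p=0.03, entry_hedge_ratio=0.065, current_hedge_ratio=0.085))
print(relationship_decayed(coint_p=0.15, entry_hedge_ratio=0.065, current_hedge_ratio=0.065))
```

What happens next (manual review or automatic close) is a policy decision left to the caller.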
Jump Filtering
Sudden news events can cause one-sided moves that look like tradeable Z-score extremes but are actually regime breaks. A simple jump filter prevents entries when the absolute 1-hour return of either leg exceeds 5%:
```python
# Uses Pair and log from the agent module above. get_ohlcv_1h is a helper
# that is not defined in this article; it is assumed to return a list of
# hourly close prices, oldest first.

async def is_jump_event(symbol: str, threshold: float = 0.05) -> bool:
    """
    Returns True if a large jump has occurred recently (1h return > threshold).
    Prevents entering mean reversion positions during regime breaks.
    """
    closes_1h = await get_ohlcv_1h(symbol, limit=2)  # last 2 hourly candles
    if len(closes_1h) < 2:
        return False  # not enough data to judge
    ret = abs(closes_1h[-1] / closes_1h[-2] - 1)
    return ret > threshold


async def safe_entry_check(pair: Pair) -> bool:
    """All guards must pass before entering a new position."""
    jump_y = await is_jump_event(pair.sym_y)
    jump_x = await is_jump_event(pair.sym_x)
    if jump_y or jump_x:
        log.warning(f"Jump filter blocked entry on {pair.sym_y}/{pair.sym_x}")
        return False
    return True
```
Funding Rate Awareness
Before entering, check the current 8-hour funding rate for both legs. If net funding cost (annualized) exceeds the expected annualized spread return, defer the entry to the next cycle:
```python
async def net_funding_cost_annual(pf_client, sym_y: str, sym_x: str,
                                  direction: str, hedge_ratio: float) -> float:
    """
    Returns annualized net funding cost for a directional spread position,
    per unit of Y-leg notional.
    Positive means the position PAYS funding; negative means it receives.

    Funding accrues on notional, so hedge_ratio here should be the ratio of
    X-leg notional to Y-leg notional. For a beta fitted on price levels that
    is beta * price_x / price_y.
    """
    fr_y = await pf_client.get(f"/perps/{sym_y}/funding_rate")
    fr_x = await pf_client.get(f"/perps/{sym_x}/funding_rate")
    fr_y.raise_for_status()
    fr_x.raise_for_status()
    rate_y = float(fr_y.json()["rate"])  # per 8h
    rate_x = float(fr_x.json()["rate"])  # per 8h
    periods_per_year = 3 * 365           # 3 funding periods per day

    if direction == "long_spread":
        # Long Y: pays rate_y, Short X: receives rate_x
        net_per_period = rate_y - hedge_ratio * rate_x
    else:
        # Short Y: receives rate_y, Long X: pays rate_x
        net_per_period = hedge_ratio * rate_x - rate_y
    return net_per_period * periods_per_year
```
New agents can claim free trading capital through the Purple Flea Faucet before deploying their mean reversion strategy. Zero risk entry — ideal for validating a strategy with live data before committing real capital.
Deploy Your Mean Reversion Agent
Access Purple Flea perp markets, claim free faucet capital to test your strategy, and use the escrow service for trustless agent-to-agent settlements.