Pair Trading for AI Agents: Statistical Arbitrage Between Correlated Assets
Pair trading is one of the oldest quantitative strategies in finance — long one asset, short a correlated partner, profit when the spread mean-reverts. For AI agents it is uniquely attractive: the position is market-neutral in theory, generates income independent of broad market direction, and the signal logic is fully automatable with no subjective judgment required. This guide covers everything from cointegration theory to live execution on Purple Flea Trading.
What Is Pair Trading?
A pair trade exploits the historical tendency of two correlated assets to revert to a stable price relationship. When Asset A rises relative to Asset B beyond what history predicts, you sell A and buy B expecting convergence. When they converge, both positions are closed for profit.
The key statistical concept is cointegration, a stronger relationship than simple correlation. Two series are cointegrated if a linear combination of them is stationary (mean-reverting), even though each series individually is non-stationary (for example, a random walk). Correlation only measures how assets move together over a sample; cointegration indicates a long-run equilibrium that prices are pulled back toward.
BTC and ETH are highly correlated (both trend up in bull markets) but may not be cointegrated — their ratio can drift indefinitely. SOL and AVAX, or two exchange-listed tokens tracking the same narrative, may be cointegrated — their spread is stationary and reverts predictably. Always test for cointegration, not just correlation.
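The distinction can be seen on simulated data. The sketch below is illustrative only: the series A, B, and C are synthetic, the drift and noise parameters are arbitrary assumptions, and lag-1 autocorrelation of the spread is used as a rough stand-in for a formal stationarity test. B and C are independent random walks that share the same upward drift, while A is tied to B by construction.

```python
import random
from statistics import fmean


def pearson(x: list[float], y: list[float]) -> float:
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def lag1_autocorr(series: list[float]) -> float:
    """Correlation of a series with itself one step earlier (near 1 = drifting)."""
    return pearson(series[:-1], series[1:])


def simulate(n: int = 2000, seed: int = 7) -> dict[str, float]:
    rng = random.Random(seed)
    b, c = [100.0], [100.0]
    for _ in range(n - 1):
        b.append(b[-1] + rng.gauss(0.2, 1.0))  # random walk with upward drift
        c.append(c[-1] + rng.gauss(0.2, 1.0))  # independent walk, same drift
    # A is tied to B: A = 2*B + stationary noise, so A - 2*B mean-reverts.
    a = [2.0 * bt + rng.gauss(0.0, 1.0) for bt in b]
    return {
        "corr_a_b": pearson(a, b),
        "corr_c_b": pearson(c, b),
        "spread_autocorr_a_b": lag1_autocorr([x - 2.0 * y for x, y in zip(a, b)]),
        "spread_autocorr_c_b": lag1_autocorr([x - y for x, y in zip(c, b)]),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Both pairs show high price correlation, but only the A/B spread has autocorrelation near zero. The C/B spread is itself a random walk (autocorrelation near one), so a z-score on it has no anchor to revert to.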
Cointegration Testing in Python
The Engle-Granger test is the standard two-step method for detecting cointegration between two price series. It runs an OLS regression of one series on the other, then tests whether the residuals are stationary using an Augmented Dickey-Fuller test. A p-value below 0.05 rejects the null hypothesis of no cointegration at the 5% significance level.
import os

import numpy as np
import pandas as pd
import requests
from statsmodels.regression.linear_model import OLS
from statsmodels.tsa.stattools import adfuller, coint

# Assumption: the API key is supplied via an environment variable (name is illustrative).
API_KEY = os.environ.get("PURPLEFLEA_API_KEY", "")


def fetch_ohlcv(market: str, limit: int = 500) -> pd.Series:
    """Fetch closing prices from Purple Flea Trading API."""
    resp = requests.get(
        "https://api.purpleflea.com/trading/candles",
        params={"market": market, "interval": "1h", "limit": limit},
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=10,
    )
    resp.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
    data = resp.json()["candles"]
    closes = [float(c["close"]) for c in data]
    return pd.Series(closes, name=market)


def test_cointegration(series_a: pd.Series, series_b: pd.Series) -> dict:
    """
    Run Engle-Granger cointegration test on two price series.
    Returns p_value, hedge_ratio, and whether cointegrated at the 5% level.
    """
    # Engle-Granger test: regresses series_a on series_b and tests the residuals.
    # The result can differ if the two series are swapped.
    score, p_val, crit_vals = coint(series_a, series_b)

    # OLS regression to find hedge ratio: A = beta * B + alpha
    X = np.column_stack([series_b.values, np.ones(len(series_b))])
    model = OLS(series_a.values, X).fit()
    hedge_ratio = model.params[0]
    alpha = model.params[1]

    # Compute residuals (the "spread" we will trade)
    spread = series_a.values - hedge_ratio * series_b.values - alpha

    # Secondary ADF check on the spread. Its p-value is optimistic because the
    # spread comes from a fitted regression, so treat p_val above as the primary test.
    adf_result = adfuller(spread, autolag="AIC")
    spread_p_val = adf_result[1]

    return {
        "p_value": p_val,
        "spread_adf_p": spread_p_val,
        "hedge_ratio": hedge_ratio,
        "alpha": alpha,
        "cointegrated": p_val < 0.05 and spread_p_val < 0.05,
        "spread_mean": spread.mean(),
        "spread_std": spread.std(),
    }


# Example: test SOL-PERP vs AVAX-PERP
sol = fetch_ohlcv("SOL-PERP")
avax = fetch_ohlcv("AVAX-PERP")
result = test_cointegration(sol, avax)
print(f"Cointegrated: {result['cointegrated']}")
print(f"Engle-Granger p-value: {result['p_value']:.4f}")
print(f"Hedge ratio (beta): {result['hedge_ratio']:.4f}")
print(f"Spread mean: {result['spread_mean']:.4f}, std: {result['spread_std']:.4f}")
Pairs Selection: Scanning the Universe
With 275+ perpetual markets on Purple Flea Trading, there are more than 37,000 candidate pairs (275 markets alone give 37,675 unordered combinations). Automated scanning lets your agent test the combinations that pass a pre-screen and rank them by cointegration strength.
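The size of the search space affects how much weight a single p-value deserves. The arithmetic sketch below counts the unordered pairs for a given number of markets and the number of tests that would pass a 5% threshold by chance alone if no pair were truly cointegrated. The Bonferroni-adjusted threshold is shown as one conservative option; it is an assumption added here for illustration, not something the scan in this guide applies.

```python
from math import comb


def scan_size(n_markets: int, p_threshold: float = 0.05) -> dict[str, float]:
    """Number of unordered pairs and the chance-only pass count at p_threshold."""
    n_pairs = comb(n_markets, 2)
    return {
        "pairs": n_pairs,
        # Expected passes if every pair is tested and none is truly cointegrated.
        "expected_false_positives": n_pairs * p_threshold,
        # Per-test threshold that keeps the family-wise error rate at p_threshold.
        "bonferroni_threshold": p_threshold / n_pairs,
    }


full = scan_size(275)
print(full["pairs"])                              # 37675
print(f"{full['expected_false_positives']:.0f}")  # 1884

sector = scan_size(6)  # e.g. a six-market Layer-1 universe
print(sector["pairs"])                            # 15
```

Restricting the scan to a small sector universe and pre-screening by correlation cuts the number of tests sharply, which is one reason to filter before testing rather than after.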
Universe Filtering
Before running statistical tests, narrow the universe by domain knowledge. Good pairs share:
- Same sector — Layer-1s (SOL, AVAX, NEAR, APT), DeFi tokens, AI tokens, gaming tokens
- Similar market cap — large-cap pairs are more liquid with tighter bid-ask spreads
- High correlation as a prerequisite screen (Pearson r > 0.85 over 90 days)
- Both listed on Purple Flea Trading for simultaneous execution without exchange risk
import itertools


def scan_pairs(
    markets: list[str],
    lookback_hours: int = 500,
    corr_threshold: float = 0.85,
    coint_pval: float = 0.05,
) -> list[dict]:
    """
    Scan all pairs in a list of markets.
    Returns pairs ranked by cointegration p-value (best first).
    Relies on fetch_ohlcv() and test_cointegration() defined earlier.
    """
    # Fetch all price series
    prices = {}
    for market in markets:
        prices[market] = fetch_ohlcv(market, limit=lookback_hours)
    # Align the series and drop rows where any market is missing data
    price_df = pd.DataFrame(prices).dropna()

    candidates = []
    for a, b in itertools.combinations(markets, 2):
        # Pre-screen with correlation
        corr = price_df[a].corr(price_df[b])
        if abs(corr) < corr_threshold:
            continue
        # Run cointegration test and apply the caller's p-value threshold
        result = test_cointegration(price_df[a], price_df[b])
        if result["p_value"] < coint_pval and result["spread_adf_p"] < coint_pval:
            candidates.append({
                "pair": (a, b),
                "correlation": corr,
                "p_value": result["p_value"],
                "hedge_ratio": result["hedge_ratio"],
                "spread_std": result["spread_std"],
                "alpha": result["alpha"],
            })

    # Sort by strongest cointegration (lowest p-value)
    candidates.sort(key=lambda x: x["p_value"])
    return candidates


# Layer-1 universe scan
L1_MARKETS = ["SOL-PERP", "AVAX-PERP", "NEAR-PERP", "APT-PERP", "SUI-PERP", "TIA-PERP"]
good_pairs = scan_pairs(L1_MARKETS)
for p in good_pairs[:5]:
    print(f"{p['pair']} | p={p['p_value']:.4f} | beta={p['hedge_ratio']:.3f}")
Cointegration is not permanent. Market structure changes, new competitors emerge, correlations break down. Re-run your pairs scan at least weekly. If a pair's ADF p-value rises above 0.10, stop trading it immediately and cut the position — the mean-reversion anchor may be gone.
Z-Score Entry and Exit Rules
Once you have a confirmed cointegrated pair and its hedge ratio, compute the live spread and normalize it to a z-score relative to the recent rolling mean and standard deviation. The z-score tells you how many standard deviations the spread is away from its historical mean — this is your trading signal.
import numpy as np


class PairSignalEngine:
    """
    Computes live z-score for a cointegrated pair and generates
    entry / exit signals.

    The engine is stateless with respect to positions: EXIT and STOP_LOSS
    are emitted whenever the z-score is in range, so the caller must track
    whether a position is actually open before acting on them.
    """

    def __init__(
        self,
        hedge_ratio: float,
        alpha: float,
        lookback: int = 60,     # rolling window in candles
        entry_z: float = 2.0,   # enter when |z| > 2
        exit_z: float = 0.5,    # exit when |z| < 0.5
        stop_z: float = 3.5,    # stop-loss when |z| > 3.5
    ):
        self.hedge_ratio = hedge_ratio
        self.alpha = alpha
        self.lookback = lookback
        self.entry_z = entry_z
        self.exit_z = exit_z
        self.stop_z = stop_z
        self.spread_history: list[float] = []

    def update(self, price_a: float, price_b: float) -> dict:
        """Feed in the latest prices and get a signal back."""
        spread = price_a - self.hedge_ratio * price_b - self.alpha
        self.spread_history.append(spread)
        # Keep only the rolling window
        if len(self.spread_history) > self.lookback:
            self.spread_history.pop(0)
        # Not enough history yet to compute a meaningful z-score
        if len(self.spread_history) < self.lookback:
            return {"signal": "WAIT", "z_score": 0.0}

        arr = np.array(self.spread_history)
        mean = arr.mean()
        std = arr.std()
        if std < 1e-10:  # flat spread: avoid dividing by (almost) zero
            return {"signal": "WAIT", "z_score": 0.0}
        z = (spread - mean) / std

        signal = "HOLD"
        if z > self.stop_z or z < -self.stop_z:
            signal = "STOP_LOSS"
        elif z > self.entry_z:
            signal = "ENTER_SHORT_A_LONG_B"  # A overpriced relative to B
        elif z < -self.entry_z:
            signal = "ENTER_LONG_A_SHORT_B"  # A underpriced relative to B
        elif -self.exit_z < z < self.exit_z:
            signal = "EXIT"  # Spread has mean-reverted

        return {
            "signal": signal,
            "z_score": z,
            "spread": spread,
            "spread_mean": mean,
            "spread_std": std,
        }
Choosing Z-Score Thresholds
| Profile | Entry Z | Exit Z | Stop Z | Character |
|---|---|---|---|---|
| Aggressive | 1.5 | 0.25 | 3.0 | More trades, lower per-trade PnL |
| Standard | 2.0 | 0.5 | 3.5 | Balanced frequency and edge |
| Conservative | 2.5 | 0.75 | 4.0 | Fewer trades, higher expected PnL per trade |
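To get a feel for how the three profiles trade off frequency against selectivity, the sketch below counts entries on a simulated mean-reverting spread. Everything here is an assumption for illustration: the spread is a synthetic AR(1) series with an arbitrary coefficient of 0.9, z-scores use the full-sample mean and standard deviation rather than a rolling window, and the entry, exit, and stop rules are applied symmetrically on |z|.

```python
import random
from statistics import fmean, pstdev

# (entry_z, exit_z, stop_z) taken from the table above
PROFILES = {
    "Aggressive": (1.5, 0.25, 3.0),
    "Standard": (2.0, 0.5, 3.5),
    "Conservative": (2.5, 0.75, 4.0),
}


def simulate_spread(n: int = 5000, phi: float = 0.9, seed: int = 11) -> list[float]:
    """Synthetic mean-reverting spread: s[t] = phi * s[t-1] + noise."""
    rng = random.Random(seed)
    s = [0.0]
    for _ in range(n - 1):
        s.append(phi * s[-1] + rng.gauss(0.0, 1.0))
    return s


def count_trades(z_scores: list[float], entry_z: float, exit_z: float,
                 stop_z: float) -> tuple[int, int]:
    """Return (entries, stop-outs) for one threshold profile."""
    in_position = False
    entries = stops = 0
    for z in z_scores:
        if not in_position:
            if entry_z < abs(z) < stop_z:
                in_position = True
                entries += 1
        elif abs(z) > stop_z:
            in_position = False
            stops += 1
        elif abs(z) < exit_z:
            in_position = False
    return entries, stops


spread = simulate_spread()
mu, sigma = fmean(spread), pstdev(spread)
z_scores = [(s - mu) / sigma for s in spread]
counts = {name: count_trades(z_scores, *params) for name, params in PROFILES.items()}

for name, (entries, stops) in counts.items():
    print(f"{name}: {entries} entries, {stops} stop-outs")
```

Running the same count on your own pair's historical spread, with costs included, is a more useful guide than the simulated numbers.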
Sizing the Hedge Leg
The hedge ratio determines how many units of Asset B to trade per unit of Asset A. On a perpetuals exchange where positions are denominated in USDC, you need to translate the statistical beta into notional dollar amounts.
def compute_position_sizes(
    total_capital_usdc: float,
    hedge_ratio: float,
    price_a: float,
    price_b: float,
    leverage: float = 3.0,
) -> dict:
    """
    Compute the dollar sizes for each leg of the pair trade.

    hedge_ratio comes from a regression on price levels, so it is expressed
    in units of B per unit of A. In dollars, that means:
        notional_B = hedge_ratio * (price_b / price_a) * notional_A
    """
    # Dollar value of the B leg per dollar of the A leg
    dollar_ratio = hedge_ratio * price_b / price_a
    # Total margin = (notional_A + notional_B) / leverage, so solve for notional_A
    notional_a = total_capital_usdc * leverage / (1 + dollar_ratio)
    notional_b = notional_a * dollar_ratio
    return {
        "size_a_usdc": round(notional_a, 2),
        "size_b_usdc": round(notional_b, 2),
        "margin_required": round((notional_a + notional_b) / leverage, 2),
    }


# Example: $500 capital, 3x leverage, beta = 0.72
sizes = compute_position_sizes(
    total_capital_usdc=500,
    hedge_ratio=0.72,
    price_a=145.0,  # SOL price
    price_b=35.0,   # AVAX price
    leverage=3.0,
)
print(sizes)
# {'size_a_usdc': 1277.91, 'size_b_usdc': 222.09, 'margin_required': 500.0}
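A quick way to sanity-check any sizing is to convert the notionals back to unit quantities: if the hedge ratio is the price-level OLS slope from the cointegration step, the B quantity should be beta times the A quantity. The helper below is a hypothetical illustration of that check and is not part of the Purple Flea API.

```python
def leg_quantities(notional_a: float, hedge_ratio: float,
                   price_a: float, price_b: float) -> dict[str, float]:
    """Convert an A-leg notional into unit quantities for both legs."""
    qty_a = notional_a / price_a   # units of A bought or sold
    qty_b = hedge_ratio * qty_a    # beta units of B per unit of A
    return {
        "qty_a": qty_a,
        "qty_b": qty_b,
        "notional_b": qty_b * price_b,  # dollar size of the hedge leg
    }


legs = leg_quantities(notional_a=1450.0, hedge_ratio=0.72, price_a=145.0, price_b=35.0)
print(f"A: {legs['qty_a']:.2f} units, B: {legs['qty_b']:.2f} units, "
      f"B notional: ${legs['notional_b']:.2f}")
# A: 10.00 units, B: 7.20 units, B notional: $252.00
```

If the regression had instead been run on log prices, beta would be a ratio of percentage moves and the dollar notionals themselves would be scaled by beta, so it is worth confirming which convention your hedge ratio follows.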
Executing Both Legs on Purple Flea Trading
The hardest part of pair trading is executing both legs simultaneously. Legged-in entries — where you place one leg and then the other a second later — expose you to execution risk if the market moves between the two fills. The Purple Flea Trading API lets you submit both orders in rapid succession and tag them with a correlation ID for audit tracking.
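import asyncio
import os
import time

import aiohttp

API_BASE = "https://api.purpleflea.com/trading"
# Assumption: the API key is supplied via an environment variable (name is illustrative).
API_KEY = os.environ.get("PURPLEFLEA_API_KEY", "")


async def get_last_price(session: aiohttp.ClientSession, market: str) -> float:
    """
    Latest close from the candles endpoint used earlier in this guide.
    Assumption: requesting limit=1 returns the most recent candle.
    """
    async with session.get(
        f"{API_BASE}/candles",
        params={"market": market, "interval": "1h", "limit": 1},
        headers={"Authorization": f"Bearer {API_KEY}"},
    ) as resp:
        resp.raise_for_status()
        data = await resp.json()
    return float(data["candles"][-1]["close"])


async def open_pair_trade(
    session: aiohttp.ClientSession,
    api_key: str,
    signal: str,
    market_a: str,
    market_b: str,
    size_a_usdc: float,
    size_b_usdc: float,
    corr_id: str,
) -> tuple[dict, dict]:
    """
    Open both legs of a pair trade concurrently.
    signal: 'ENTER_SHORT_A_LONG_B' or 'ENTER_LONG_A_SHORT_B'
    """
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    if signal == "ENTER_SHORT_A_LONG_B":
        side_a, side_b = "short", "long"
    else:
        side_a, side_b = "long", "short"

    order_a = {
        "market": market_a,
        "side": side_a,
        "size_usdc": size_a_usdc,
        "order_type": "market",
        "client_id": f"{corr_id}_leg_a",
    }
    order_b = {
        "market": market_b,
        "side": side_b,
        "size_usdc": size_b_usdc,
        "order_type": "market",
        "client_id": f"{corr_id}_leg_b",
    }

    async def submit(order: dict) -> dict:
        # The context manager releases the connection once the body is read
        async with session.post(f"{API_BASE}/orders", json=order, headers=headers) as resp:
            return await resp.json()

    # Submit both legs concurrently to minimize leg risk
    fill_a, fill_b = await asyncio.gather(submit(order_a), submit(order_b))
    return fill_a, fill_b


async def main():
    engine = PairSignalEngine(hedge_ratio=0.72, alpha=5.3)
    sizes = compute_position_sizes(500, 0.72, 145.0, 35.0, leverage=3)
    in_position = False  # prevents re-entering while a pair trade is already open
    async with aiohttp.ClientSession() as session:
        while True:
            # Fetch current prices for both legs
            sol_price = await get_last_price(session, "SOL-PERP")
            avax_price = await get_last_price(session, "AVAX-PERP")
            result = engine.update(sol_price, avax_price)
            print(f"Z={result['z_score']:.2f} Signal={result['signal']}")
            if result["signal"].startswith("ENTER") and not in_position:
                corr_id = f"pair_{int(time.time())}"
                fill_a, fill_b = await open_pair_trade(
                    session, API_KEY, result["signal"],
                    "SOL-PERP", "AVAX-PERP",
                    sizes["size_a_usdc"], sizes["size_b_usdc"], corr_id,
                )
                print(f"Opened pair: A={fill_a}, B={fill_b}")
                in_position = True  # verify both fills before relying on this
            elif result["signal"] in ("EXIT", "STOP_LOSS") and in_position:
                # Close both legs here (closing orders are not shown in this guide)
                in_position = False
            # The engine's lookback counts updates, so polling every minute
            # gives it a 60-minute rolling window.
            await asyncio.sleep(60)


if __name__ == "__main__":
    asyncio.run(main())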
If your Leg A fill succeeds but Leg B fails (e.g. insufficient margin, market halt, API timeout), you now hold a naked directional position — the opposite of what pair trading is designed to give you. Always check both fill confirmations and have an automatic rollback routine: if Leg B fails within 2 seconds, market-close Leg A immediately.
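A rollback routine along those lines might look like the sketch below. It deliberately avoids any real endpoint: `place_a`, `place_b`, and `close_leg` are hypothetical callables you would wire to your own order-submission and position-closing code, and they are assumed to raise an exception when an order is rejected. The 2-second timeout mirrors the rule of thumb above.

```python
import asyncio
from typing import Awaitable, Callable

PlaceFn = Callable[[], Awaitable[dict]]
CloseFn = Callable[[str], Awaitable[None]]


async def open_pair_with_rollback(
    place_a: PlaceFn,
    place_b: PlaceFn,
    close_leg: CloseFn,
    timeout_s: float = 2.0,
) -> dict:
    """Submit both legs; if only one fills, close it so no naked leg remains."""
    outcomes = await asyncio.gather(
        asyncio.wait_for(place_a(), timeout_s),
        asyncio.wait_for(place_b(), timeout_s),
        return_exceptions=True,  # collect failures instead of raising on the first
    )
    filled = {
        leg: outcome
        for leg, outcome in zip(("a", "b"), outcomes)
        if not isinstance(outcome, BaseException)
    }
    if len(filled) == 2:
        return {"status": "open", "fills": filled}
    # Zero or one leg filled: unwind whatever did go through.
    for leg in filled:
        await close_leg(leg)
    return {"status": "rolled_back" if filled else "failed", "fills": {}}
```

One caveat: a timed-out request is not proof that the order was rejected. In a live system it is safer to query the order by its client ID before deciding whether a leg needs to be closed.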
Risk Management and Position Limits
Even with cointegrated pairs, spread divergence can persist far longer than expected. The historical 3.5-sigma stop provides a safety net, but you also need position-level and portfolio-level limits to prevent catastrophic losses on a regime change.
Recommended Risk Controls
- Max pairs open simultaneously: 3-5 pairs; holding more concentrates correlation risk, because pair spreads tend to widen together during market stress
- Max per-pair margin: 15-20% of total capital; diversification is your primary edge
- Hard stop after N losing trades: if 3 consecutive pair trades hit stop-loss, pause for 24 hours and re-test cointegration before resuming
- Overnight hold limit: review open positions before major scheduled events (Fed meetings, token unlocks, protocol upgrades) that can break correlations temporarily
- Spread half-life monitoring: fit an Ornstein-Uhlenbeck process to measure how quickly the spread mean-reverts; if half-life exceeds 48 hours, the pair is no longer suitable for short-term trading
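The first three controls in the list lend themselves to a small gatekeeper object that the trading loop consults before opening a pair. The sketch below is one possible shape, not a prescribed design: the class name, field names, and defaults are assumptions chosen to match the numbers in the list (at most 3 open pairs, 20% margin per pair, a 24-hour pause after 3 consecutive stop-losses).

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class RiskGate:
    total_capital: float
    max_open_pairs: int = 3
    max_margin_fraction: float = 0.20
    max_consecutive_stops: int = 3
    pause_seconds: float = 24 * 3600
    open_pairs: int = 0
    consecutive_stops: int = 0
    paused_until: float = 0.0

    def can_open(self, margin_usdc: float, now: Optional[float] = None) -> bool:
        """True if a new pair with this margin passes every limit."""
        now = time.time() if now is None else now
        if now < self.paused_until:
            return False  # still in the cool-off period after a losing streak
        if self.open_pairs >= self.max_open_pairs:
            return False
        return margin_usdc <= self.max_margin_fraction * self.total_capital

    def record_open(self) -> None:
        self.open_pairs += 1

    def record_close(self, hit_stop: bool, now: Optional[float] = None) -> None:
        """Update counters when a pair is closed; start a pause after N stops."""
        now = time.time() if now is None else now
        self.open_pairs = max(0, self.open_pairs - 1)
        if not hit_stop:
            self.consecutive_stops = 0
            return
        self.consecutive_stops += 1
        if self.consecutive_stops >= self.max_consecutive_stops:
            self.paused_until = now + self.pause_seconds
            self.consecutive_stops = 0
```

In the execution loop, an entry signal would only be acted on when `can_open(...)` returns True, and the pause is a natural point to re-run the cointegration test before resuming.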
import numpy as np


def estimate_half_life(spread: np.ndarray) -> float:
    """
    Fit an AR(1) process to estimate the spread's half-life in periods.
    Half-life = -log(2) / log(phi) where phi is the AR(1) coefficient.
    With hourly candles, a half-life above 48 periods means more than 48 hours.
    """
    y = spread[1:]
    x = spread[:-1]
    phi = np.polyfit(x, y, 1)[0]  # AR(1) coefficient (slope of y on x)
    if phi >= 1 or phi <= 0:
        return float("inf")  # no usable mean reversion (unit root or oscillating)
    half_life = -np.log(2) / np.log(phi)
    return half_life


# Example check before trading: rebuild the hourly spread from the series and
# regression result computed in the cointegration section (sol, avax, result).
spread_data = (sol - result["hedge_ratio"] * avax - result["alpha"]).to_numpy()
hl = estimate_half_life(spread_data)
print(f"Estimated half-life: {hl:.1f} candles")
if hl > 48:
    print("Half-life too long: skip this pair today")
else:
    print(f"Good to trade: spread half-life is ~{hl:.0f} hourly candles")
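The link between the AR(1) fit and the Ornstein-Uhlenbeck picture is that, sampled at interval dt, an OU process with reversion speed theta has phi = exp(-theta * dt), so the half-life ln(2) / theta equals -ln(2) / ln(phi) periods. One way to build confidence in the estimator is to run it on a simulated series whose phi is known. The sketch below does that with the standard library only; the series length, seed, and phi = 0.9 are arbitrary assumptions.

```python
import math
import random


def simulate_ar1(phi: float, n: int = 5000, seed: int = 3) -> list[float]:
    """Synthetic spread: s[t] = phi * s[t-1] + Gaussian noise."""
    rng = random.Random(seed)
    s = [0.0]
    for _ in range(n - 1):
        s.append(phi * s[-1] + rng.gauss(0.0, 1.0))
    return s


def fit_phi(series: list[float]) -> float:
    """Least-squares slope of s[t] on s[t-1], with an intercept."""
    x, y = series[:-1], series[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


def half_life(phi: float) -> float:
    """Periods for a deviation to decay by half; inf if there is no reversion."""
    if not 0.0 < phi < 1.0:
        return float("inf")
    return -math.log(2) / math.log(phi)


true_phi = 0.9
phi_hat = fit_phi(simulate_ar1(true_phi))
print(f"true half-life:      {half_life(true_phi):.2f} periods")  # 6.58
print(f"estimated half-life: {half_life(phi_hat):.2f} periods")
```

The estimate is sensitive to small errors in phi when phi is close to 1, so half-lives measured on short windows should be treated as rough guides rather than precise figures.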
Pair Trading at Scale: Referral Income on Purple Flea
A pair trading agent executing on Purple Flea Trading can also earn a 20% referral share of the trading fees paid by any agents it refers. An agent that both runs its own pair strategies and deploys the same code for a fleet of sub-agents can compound income across two channels: spread PnL from its own book, and referral revenue from every trade placed by agents it has registered under its referral link.
Register your agent at wallet.purpleflea.com to get your referral link. Deploy sub-agents using that link. Every pair trade those sub-agents execute generates 20% of Purple Flea's trading fee back to you — passive income layered on top of strategy returns.
Start Pair Trading on Purple Flea
Register your agent, fund your wallet, and start scanning for cointegrated pairs across 275+ perpetual markets — all via API, no KYC required.