
Pair Trading for AI Agents: Statistical Arbitrage Between Correlated Assets

📅 March 6, 2026 ⏱ 16 min read ✍️ Purple Flea Research

Pair trading is one of the oldest quantitative strategies in finance — long one asset, short a correlated partner, profit when the spread mean-reverts. For AI agents it is uniquely attractive: the position is market-neutral in theory, generates income independent of broad market direction, and the signal logic is fully automatable with no subjective judgment required. This guide covers everything from cointegration theory to live execution on Purple Flea Trading.

Perpetual markets to pair: 275+
Execution latency target: <1s
Trading API referral: 20%
Max leverage on each leg: 100x

What Is Pair Trading?

A pair trade exploits the historical tendency of two correlated assets to revert to a stable price relationship. When Asset A rises relative to Asset B beyond what history predicts, you sell A and buy B expecting convergence. When they converge, both positions are closed for profit.

The key statistical concept is cointegration — a stronger relationship than simple correlation. Two series are cointegrated if a linear combination of them is stationary (mean-reverting), even though each series individually is a random walk. Correlation only measures how assets move together; cointegration confirms a long-run equilibrium that prices are pulled back toward.

Correlation vs. Cointegration

BTC and ETH are highly correlated (both trend up in bull markets) but may not be cointegrated — their ratio can drift indefinitely. SOL and AVAX, or two exchange-listed tokens tracking the same narrative, may be cointegrated — their spread is stationary and reverts predictably. Always test for cointegration, not just correlation.
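
A minimal simulated sketch of this distinction, using synthetic series rather than real market data. Both partners below move closely with the same driver, but only the first has a spread that is pulled back toward a mean. The AR(1) coefficient of the spread serves as a rough stand-in for a formal test; the seed, series, and parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Shared driver: B is a random walk
b = 100 + np.cumsum(rng.normal(0, 1, n))

# Cointegrated partner: A = 2*B + stationary AR(1) noise (phi = 0.9)
noise = np.zeros(n)
shocks = rng.normal(0, 1, n)
for t in range(1, n):
    noise[t] = 0.9 * noise[t - 1] + shocks[t]
a_coint = 2.0 * b + noise

# Correlated-only partner: increments are ~90% correlated with B's,
# but nothing ties the price levels together
db = np.diff(b, prepend=b[0])
da = 0.9 * db + np.sqrt(1 - 0.9**2) * rng.normal(0, 1, n)
a_corr = 100 + np.cumsum(da)

def ar1_coefficient(series: np.ndarray) -> float:
    """Slope of series[t] regressed on series[t-1]; near 1 means no mean reversion."""
    return float(np.polyfit(series[:-1], series[1:], 1)[0])

phi_coint = ar1_coefficient(a_coint - 2.0 * b)  # spread of the cointegrated pair
phi_corr = ar1_coefficient(a_corr - b)          # spread of the correlated-only pair
ret_corr = float(np.corrcoef(np.diff(a_corr), np.diff(b))[0, 1])

print(f"Return correlation, correlated-only pair: {ret_corr:.2f}")
print(f"AR(1) of spread, cointegrated pair:       {phi_coint:.3f}")
print(f"AR(1) of spread, correlated-only pair:    {phi_corr:.3f}")
```

The correlated-only pair shows a high return correlation, yet its spread has an AR(1) coefficient close to 1, so it behaves like a random walk and gives a pair trade nothing to revert to.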

Cointegration Testing in Python

The Engle-Granger test is the standard two-step method for detecting cointegration between two price series. It runs an OLS regression of one series on the other, then tests whether the residuals are stationary using an Augmented Dickey-Fuller (ADF) test with critical values adjusted for the estimated regression. A p-value below 0.05 rejects the null hypothesis of no cointegration at the 5% significance level.

import os

import numpy as np
import pandas as pd
import requests
from statsmodels.regression.linear_model import OLS
from statsmodels.tsa.stattools import adfuller, coint

# Assumption: the API key is read from an environment variable
# (the variable name is illustrative, not defined by the API).
API_KEY = os.environ.get("PURPLE_FLEA_API_KEY", "")

def fetch_ohlcv(market: str, limit: int = 500) -> pd.Series:
    """Fetch closing prices from Purple Flea Trading API."""
    resp = requests.get(
        "https://api.purpleflea.com/trading/candles",
        params={"market": market, "interval": "1h", "limit": limit},
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=10,
    )
    resp.raise_for_status()  # fail loudly on HTTP errors
    data = resp.json()["candles"]
    closes = [float(c["close"]) for c in data]
    return pd.Series(closes, name=market)

def test_cointegration(series_a: pd.Series, series_b: pd.Series) -> dict:
    """
    Run Engle-Granger cointegration test on two price series.
    Returns p_value, hedge_ratio, and whether cointegrated at the 5% level.
    """
    # Engle-Granger test. coint() regresses series_a on series_b only;
    # swap the arguments to test the other direction.
    score, p_val, crit_vals = coint(series_a, series_b)

    # OLS regression to find hedge ratio: A = beta * B + alpha
    X = np.column_stack([series_b.values, np.ones(len(series_b))])
    model = OLS(series_a.values, X).fit()
    hedge_ratio = float(model.params[0])
    alpha = float(model.params[1])

    # Compute residuals (the "spread" we will trade)
    spread = series_a.values - hedge_ratio * series_b.values - alpha

    # Secondary sanity check with plain ADF. Its p-value is optimistic for
    # estimated residuals, so the Engle-Granger p-value is the primary test.
    adf_result = adfuller(spread, autolag='AIC')
    spread_p_val = adf_result[1]

    return {
        "p_value": p_val,
        "spread_adf_p": spread_p_val,
        "hedge_ratio": hedge_ratio,
        "alpha": alpha,
        "cointegrated": bool(p_val < 0.05 and spread_p_val < 0.05),
        "spread_mean": spread.mean(),
        "spread_std": spread.std(),
    }

# Example: test SOL-PERP vs AVAX-PERP
sol = fetch_ohlcv("SOL-PERP")
avax = fetch_ohlcv("AVAX-PERP")
result = test_cointegration(sol, avax)

print(f"Cointegrated: {result['cointegrated']}")
print(f"Engle-Granger p-value: {result['p_value']:.4f}")
print(f"Hedge ratio (beta): {result['hedge_ratio']:.4f}")
print(f"Spread mean: {result['spread_mean']:.4f}, std: {result['spread_std']:.4f}")

Pairs Selection: Scanning the Universe

With 275+ perpetual markets on Purple Flea Trading, there are tens of thousands of candidate pairs. Automated scanning lets your agent test every combination in a chosen universe and rank them by cointegration strength.
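
A quick arithmetic sketch of why the universe needs narrowing before testing. It takes 275 as the market count (the lower bound quoted above) and assumes a 5% per-pair threshold. The Bonferroni correction shown is one common adjustment, not something the exchange or this guide prescribes.

```python
import math

n_markets = 275      # lower bound quoted for the exchange
alpha_level = 0.05   # per-pair significance threshold

n_pairs = math.comb(n_markets, 2)

# If no pair were truly cointegrated, roughly alpha_level of the tests
# would still pass by chance.
expected_false_positives = n_pairs * alpha_level

# Bonferroni: per-pair threshold that keeps the family-wise error near 5%
bonferroni_threshold = alpha_level / n_pairs

print(n_pairs)                          # 37675
print(round(expected_false_positives))  # 1884
print(f"{bonferroni_threshold:.2e}")    # 1.33e-06
```

Scanning a handful of same-sector markets instead of the full list keeps the number of tests, and therefore the number of chance "discoveries", small.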

Universe Filtering

Before running statistical tests, narrow the universe by domain knowledge. Good pairs share:

Same sector — Layer-1s (SOL, AVAX, NEAR, APT), DeFi tokens, AI tokens, gaming tokens
Similar market cap — large-cap pairs are more liquid with tighter bid-ask spreads
High correlation as a prerequisite screen (Pearson r > 0.85 over 90 days)
Both listed on Purple Flea Trading for simultaneous execution without exchange risk

import itertools

def scan_pairs(
    markets: list[str],
    lookback_hours: int = 500,
    corr_threshold: float = 0.85,
    coint_pval: float = 0.05,
) -> list[dict]:
    """
    Scan all pairs in a list of markets.
    Returns pairs ranked by cointegration p-value (best first).
    """
    # Fetch all price series. 500 hourly candles is roughly 21 days; a
    # 90-day screen needs about 2,160 candles, if the API allows that many.
    prices = {}
    for market in markets:
        prices[market] = fetch_ohlcv(market, limit=lookback_hours)

    # Series are aligned by position, which assumes every market returns
    # candles for the same timestamps.
    price_df = pd.DataFrame(prices).dropna()
    candidates = []

    for a, b in itertools.combinations(markets, 2):
        # Pre-screen with correlation (positive only, matching r > threshold)
        corr = price_df[a].corr(price_df[b])
        if corr < corr_threshold:
            continue

        # Run cointegration test and apply the caller's p-value cutoff
        result = test_cointegration(price_df[a], price_df[b])
        if result["p_value"] < coint_pval and result["spread_adf_p"] < coint_pval:
            candidates.append({
                "pair": (a, b),
                "correlation": corr,
                "p_value": result["p_value"],
                "hedge_ratio": result["hedge_ratio"],
                "spread_std": result["spread_std"],
                "alpha": result["alpha"],
            })

    # Sort by strongest cointegration (lowest p-value)
    candidates.sort(key=lambda x: x["p_value"])
    return candidates

# Layer-1 universe scan
L1_MARKETS = ["SOL-PERP", "AVAX-PERP", "NEAR-PERP", "APT-PERP", "SUI-PERP", "TIA-PERP"]
good_pairs = scan_pairs(L1_MARKETS)
for p in good_pairs[:5]:
    print(f"{p['pair']} | p={p['p_value']:.4f} | beta={p['hedge_ratio']:.3f}")

Re-Test Cointegration Frequently

Cointegration is not permanent. Market structure changes, new competitors emerge, correlations break down. Re-run your pairs scan at least weekly. If a pair's ADF p-value rises above 0.10, stop trading it immediately and cut the position — the mean-reversion anchor may be gone.

Z-Score Entry and Exit Rules

Once you have a confirmed cointegrated pair and its hedge ratio, compute the live spread and normalize it to a z-score relative to the recent rolling mean and standard deviation. The z-score tells you how many standard deviations the spread is away from its historical mean — this is your trading signal.

import numpy as np

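class PairSignalEngine:
    """
    Computes live z-score for a cointegrated pair and generates
    entry / exit signals.

    The engine does not track positions: ENTER repeats on every update
    while |z| stays beyond entry_z, and EXIT / STOP_LOSS fire even when
    flat. The caller must filter signals against its current position.
    """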

    def __init__(
        self,
        hedge_ratio: float,
        alpha: float,
        lookback: int = 60,     # rolling window in candles
        entry_z: float = 2.0,   # enter when |z| > 2
        exit_z: float = 0.5,    # exit when |z| < 0.5
        stop_z: float = 3.5,    # stop-loss when |z| > 3.5
    ):
        self.hedge_ratio = hedge_ratio
        self.alpha = alpha
        self.lookback = lookback
        self.entry_z = entry_z
        self.exit_z = exit_z
        self.stop_z = stop_z
        self.spread_history: list[float] = []

    def update(self, price_a: float, price_b: float) -> dict:
        """Feed in the latest prices and get a signal back."""
        spread = price_a - self.hedge_ratio * price_b - self.alpha
        self.spread_history.append(spread)

        # Keep only the rolling window
        if len(self.spread_history) > self.lookback:
            self.spread_history.pop(0)

        if len(self.spread_history) < self.lookback:
            return {"signal": "WAIT", "z_score": 0.0}

        arr = np.array(self.spread_history)
        mean = arr.mean()
        std = arr.std()

        if std < 1e-10:
            return {"signal": "WAIT", "z_score": 0.0}

        z = (spread - mean) / std

        signal = "HOLD"
        if z > self.stop_z or z < -self.stop_z:
            signal = "STOP_LOSS"
        elif z > self.entry_z:
            signal = "ENTER_SHORT_A_LONG_B"   # A overpriced relative to B
        elif z < -self.entry_z:
            signal = "ENTER_LONG_A_SHORT_B"   # A underpriced relative to B
        elif -self.exit_z < z < self.exit_z:
            signal = "EXIT"                  # Spread has mean-reverted

        return {
            "signal": signal,
            "z_score": z,
            "spread": spread,
            "spread_mean": mean,
            "spread_std": std,
        }

Choosing Z-Score Thresholds

Profile	Entry Z	Exit Z	Stop Z	Character
Aggressive	1.5	0.25	3.0	More trades, lower per-trade PnL
Standard	2.0	0.5	3.5	Balanced frequency and edge
Conservative	2.5	0.75	4.0	Fewer trades, higher expected PnL per trade
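
The profiles in the table above can be wired into a small position-aware decision function. This is a sketch under stated assumptions rather than the guide's own engine: the function name, the position labels, and the choice to treat the stop as direction-aware (only adverse moves trigger it) are all illustrative.

```python
# Thresholds copied from the table above
PROFILES = {
    "aggressive":   {"entry_z": 1.5, "exit_z": 0.25, "stop_z": 3.0},
    "standard":     {"entry_z": 2.0, "exit_z": 0.5,  "stop_z": 3.5},
    "conservative": {"entry_z": 2.5, "exit_z": 0.75, "stop_z": 4.0},
}

def next_action(z: float, position: str, entry_z: float = 2.0,
                exit_z: float = 0.5, stop_z: float = 3.5) -> str:
    """
    Map a z-score and the current position to an action.
    position: 'FLAT', 'SHORT_A_LONG_B', or 'LONG_A_SHORT_B' (hypothetical labels).
    """
    if position == "FLAT":
        # Enter only between the entry and stop bands; beyond the stop band
        # a new trade would be stopped out immediately.
        if entry_z < z <= stop_z:
            return "ENTER_SHORT_A_LONG_B"
        if -stop_z <= z < -entry_z:
            return "ENTER_LONG_A_SHORT_B"
        return "WAIT"

    # Express z so that larger values always mean "further against us"
    adverse_z = z if position == "SHORT_A_LONG_B" else -z
    if adverse_z > stop_z:
        return "STOP_LOSS"
    if adverse_z < exit_z:
        return "EXIT"  # spread has reverted to (or through) the mean
    return "HOLD"

print(next_action(2.2, "FLAT", **PROFILES["standard"]))            # ENTER_SHORT_A_LONG_B
print(next_action(2.2, "FLAT", **PROFILES["conservative"]))        # WAIT
print(next_action(2.4, "SHORT_A_LONG_B", **PROFILES["standard"]))  # HOLD
print(next_action(0.3, "SHORT_A_LONG_B", **PROFILES["standard"]))  # EXIT
print(next_action(-3.8, "LONG_A_SHORT_B", **PROFILES["standard"])) # STOP_LOSS
```

The same z-score of 2.2 opens a trade under the standard profile but not under the conservative one, which is the frequency-versus-edge trade-off the table describes.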

Sizing the Hedge Leg

The hedge ratio determines how many units of Asset B to trade per unit of Asset A. On a perpetuals exchange where positions are denominated in USDC, you need to translate the statistical beta into notional dollar amounts.

def compute_position_sizes(
    total_capital_usdc: float,
    hedge_ratio: float,
    price_a: float,
    price_b: float,
    leverage: float = 3.0,
) -> dict:
    """
    Compute the dollar sizes for each leg of the pair trade.

    The hedge ratio comes from a regression on price levels
    (A = beta * B + alpha), so it is a ratio of units: each unit of A
    is hedged with `hedge_ratio` units of B. In dollars, leg B must
    therefore be hedge_ratio * price_b / price_a times the size of leg A.
    """
    # Dollar ratio of leg B to leg A implied by the unit hedge ratio
    notional_ratio = hedge_ratio * price_b / price_a

    # Let x = notional A. Then notional B = notional_ratio * x and
    # total margin = (notional_A + notional_B) / leverage = total capital.
    notional_a = total_capital_usdc * leverage / (1 + notional_ratio)
    notional_b = notional_a * notional_ratio

    return {
        "size_a_usdc": round(notional_a, 2),
        "size_b_usdc": round(notional_b, 2),
        "margin_required": round((notional_a + notional_b) / leverage, 2),
    }

# Example: $500 capital, 3x leverage, beta = 0.72 (illustrative values)
sizes = compute_position_sizes(
    total_capital_usdc=500,
    hedge_ratio=0.72,
    price_a=145.0,   # SOL price
    price_b=35.0,    # AVAX price
    leverage=3.0,
)
print(sizes)
# {'size_a_usdc': 1277.91, 'size_b_usdc': 222.09, 'margin_required': 500.0}

Executing Both Legs on Purple Flea Trading

The hardest part of pair trading is executing both legs simultaneously. Legged-in entries — where you place one leg and then the other a second later — expose you to execution risk if the market moves between the two fills. The Purple Flea Trading API lets you submit both orders in rapid succession and tag them with a correlation ID for audit tracking.

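import asyncio
import os
import time

import aiohttp

API_BASE = "https://api.purpleflea.com/trading"
# Assumption: the API key is read from an environment variable
# (the variable name is illustrative, not defined by the API).
API_KEY = os.environ.get("PURPLE_FLEA_API_KEY", "")

async def get_last_price(session: aiohttp.ClientSession, market: str) -> float:
    """
    Latest close for a market. Assumption: this reuses the candles endpoint
    from the cointegration section with limit=1; swap in a ticker endpoint
    if the API provides one.
    """
    async with session.get(
        f"{API_BASE}/candles",
        params={"market": market, "interval": "1h", "limit": 1},
        headers={"Authorization": f"Bearer {API_KEY}"},
    ) as resp:
        resp.raise_for_status()
        data = await resp.json()
    return float(data["candles"][-1]["close"])

async def open_pair_trade(
    session: aiohttp.ClientSession,
    api_key: str,
    signal: str,
    market_a: str,
    market_b: str,
    size_a_usdc: float,
    size_b_usdc: float,
    corr_id: str,
) -> tuple[dict, dict]:
    """
    Open both legs of a pair trade concurrently.
    signal: 'ENTER_SHORT_A_LONG_B' or 'ENTER_LONG_A_SHORT_B'
    """
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}

    if signal == "ENTER_SHORT_A_LONG_B":
        side_a, side_b = "short", "long"
    else:
        side_a, side_b = "long", "short"

    order_a = {
        "market": market_a,
        "side": side_a,
        "size_usdc": size_a_usdc,
        "order_type": "market",
        "client_id": f"{corr_id}_leg_a",
    }
    order_b = {
        "market": market_b,
        "side": side_b,
        "size_usdc": size_b_usdc,
        "order_type": "market",
        "client_id": f"{corr_id}_leg_b",
    }

    async def submit(order: dict) -> dict:
        # The context manager releases the connection once the body is read
        async with session.post(f"{API_BASE}/orders", json=order, headers=headers) as resp:
            return await resp.json()

    # Submit both legs concurrently to minimize leg risk
    fill_a, fill_b = await asyncio.gather(submit(order_a), submit(order_b))
    return fill_a, fill_b

async def main():
    # PairSignalEngine and compute_position_sizes come from the sections above
    engine = PairSignalEngine(hedge_ratio=0.72, alpha=5.3)
    sizes = compute_position_sizes(500, 0.72, 145.0, 35.0, leverage=3)
    in_position = False  # prevents re-entering while |z| stays beyond entry_z

    async with aiohttp.ClientSession() as session:
        while True:
            # Fetch current prices for both legs concurrently
            sol_price, avax_price = await asyncio.gather(
                get_last_price(session, "SOL-PERP"),
                get_last_price(session, "AVAX-PERP"),
            )

            result = engine.update(sol_price, avax_price)
            print(f"Z={result['z_score']:.2f} Signal={result['signal']}")

            if result["signal"].startswith("ENTER") and not in_position:
                corr_id = f"pair_{int(time.time())}"
                fill_a, fill_b = await open_pair_trade(
                    session, API_KEY, result["signal"],
                    "SOL-PERP", "AVAX-PERP",
                    sizes["size_a_usdc"], sizes["size_b_usdc"], corr_id
                )
                print(f"Opened pair: A={fill_a}, B={fill_b}")
                in_position = True
            elif result["signal"] in ("EXIT", "STOP_LOSS") and in_position:
                # Placeholder: closing both legs is not covered here; call
                # the exchange's close or reduce-only order at this point.
                print(f"Close both legs ({result['signal']})")
                in_position = False

            # One update per minute means lookback=60 spans one hour
            await asyncio.sleep(60)

if __name__ == "__main__":
    asyncio.run(main())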

Leg Risk: The Biggest Pair Trading Failure Mode

If your Leg A fill succeeds but Leg B fails (e.g. insufficient margin, market halt, API timeout), you now hold a naked directional position — the opposite of what pair trading is designed to give you. Always check both fill confirmations and have an automatic rollback routine: if Leg B fails within 2 seconds, market-close Leg A immediately.
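
The rollback routine described above could be sketched as follows. To avoid guessing at endpoints, the order and close calls are passed in as coroutines; the `"status": "filled"` field, the function name, and the callables are hypothetical and would need to be matched to the real API responses.

```python
import asyncio
from typing import Awaitable, Callable, Optional

LegCall = Callable[[], Awaitable[dict]]

async def open_with_rollback(
    submit_leg_a: LegCall,
    submit_leg_b: LegCall,
    close_leg_a: LegCall,
    close_leg_b: LegCall,
    timeout_s: float = 2.0,
) -> dict:
    """Submit both legs; if exactly one fills within the timeout, flatten it."""

    async def attempt(submit: LegCall) -> Optional[dict]:
        try:
            fill = await asyncio.wait_for(submit(), timeout=timeout_s)
        except Exception:  # timeout, network error, rejected request
            return None
        # Assumption: a successful fill reports status == "filled"
        return fill if fill.get("status") == "filled" else None

    fill_a, fill_b = await asyncio.gather(
        attempt(submit_leg_a), attempt(submit_leg_b)
    )

    if fill_a and fill_b:
        return {"status": "open", "fill_a": fill_a, "fill_b": fill_b}
    if fill_a:
        await close_leg_a()  # leg B failed: remove the naked A exposure
        return {"status": "rolled_back", "closed": "leg_a"}
    if fill_b:
        await close_leg_b()  # leg A failed: remove the naked B exposure
        return {"status": "rolled_back", "closed": "leg_b"}
    return {"status": "failed"}

# Demo with stand-in coroutines: leg A fills, leg B times out
async def _demo() -> dict:
    async def filled() -> dict:
        return {"status": "filled"}

    async def too_slow() -> dict:
        await asyncio.sleep(1.0)
        return {"status": "filled"}

    return await open_with_rollback(filled, too_slow, filled, filled, timeout_s=0.05)

demo_result = asyncio.run(_demo())
print(demo_result)  # {'status': 'rolled_back', 'closed': 'leg_a'}
```

One caveat with this approach: an order that times out on the client may still fill on the exchange, so a production version should confirm actual positions after a rollback rather than trusting the timeout alone.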

Risk Management and Position Limits

Even with cointegrated pairs, spread divergence can persist far longer than expected. The historical 3.5-sigma stop provides a safety net, but you also need position-level and portfolio-level limits to prevent catastrophic losses on a regime change.

Recommended Risk Controls

Max pairs open simultaneously: 3-5 pairs; more concentrates correlation risk during market stress when all pair spreads widen at once
Max per-pair margin: 15-20% of total capital; diversification is your primary edge
Hard stop after N losing trades: if 3 consecutive pair trades hit stop-loss, pause for 24 hours and re-test cointegration before resuming
Overnight hold limit: review open positions before major scheduled events (Fed meetings, token unlocks, protocol upgrades) that can break correlations temporarily
Spread half-life monitoring: fit an Ornstein-Uhlenbeck process to measure how quickly the spread mean-reverts; if half-life exceeds 48 hours, the pair is no longer suitable for short-term trading

import numpy as np

def estimate_half_life(spread: np.ndarray) -> float:
    """
    Fit an AR(1) process to estimate the spread's half-life in periods.
    Half-life = -log(2) / log(phi) where phi is the AR(1) coefficient.
    If half_life > 48 (hours), avoid this pair.
    """
    y = spread[1:]
    x = spread[:-1]
    phi = np.polyfit(x, y, 1)[0]   # AR(1) coefficient

    if phi >= 1 or phi <= 0:
        return float("inf")   # non-stationary or negative mean-reversion

    half_life = -np.log(2) / np.log(phi)
    return half_life

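# Example check before trading. The spread is rebuilt from the hourly
# candles and regression result of the cointegration section, so one
# period equals one hour.
spread_data = sol.values - result["hedge_ratio"] * avax.values - result["alpha"]
hl = estimate_half_life(spread_data)
print(f"Estimated half-life: {hl:.1f} hours")
if hl > 48:
    print("Half-life too long, skip this pair today")
else:
    print(f"Good to trade: spread reverts halfway in ~{hl:.0f} hours")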
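
The first three controls in the list above (open-pair cap, per-pair margin cap, and the pause after consecutive stop-losses) can be combined into a single pre-trade gate. This is a minimal sketch: the class name, field names, and the use of an hour-based clock are illustrative assumptions, and the defaults take the upper end of each recommended range.

```python
from dataclasses import dataclass

@dataclass
class RiskGate:
    total_capital: float
    max_open_pairs: int = 5            # upper end of the 3-5 range
    max_margin_fraction: float = 0.20  # upper end of the 15-20% range
    max_consecutive_stops: int = 3
    pause_hours: float = 24.0
    consecutive_stops: int = 0
    paused_until: float = 0.0          # clock value in hours

    def record_trade(self, hit_stop: bool, now_hours: float) -> None:
        """Track the stop-loss streak; start a pause once the limit is hit."""
        if not hit_stop:
            self.consecutive_stops = 0
            return
        self.consecutive_stops += 1
        if self.consecutive_stops >= self.max_consecutive_stops:
            self.paused_until = now_hours + self.pause_hours
            self.consecutive_stops = 0

    def can_open(self, open_pairs: int, margin_usdc: float,
                 now_hours: float) -> tuple[bool, str]:
        """Return (allowed, reason) for a proposed new pair trade."""
        if now_hours < self.paused_until:
            return False, "paused after consecutive stop-losses"
        if open_pairs >= self.max_open_pairs:
            return False, "too many open pairs"
        if margin_usdc > self.max_margin_fraction * self.total_capital:
            return False, "per-pair margin limit exceeded"
        return True, "ok"

gate = RiskGate(total_capital=500.0)
print(gate.can_open(open_pairs=2, margin_usdc=90.0, now_hours=0.0))   # (True, 'ok')
print(gate.can_open(open_pairs=2, margin_usdc=150.0, now_hours=0.0))  # margin cap is 100.0

for _ in range(3):
    gate.record_trade(hit_stop=True, now_hours=1.0)
print(gate.can_open(open_pairs=2, margin_usdc=90.0, now_hours=10.0))  # paused until hour 25
print(gate.can_open(open_pairs=2, margin_usdc=90.0, now_hours=26.0))  # (True, 'ok')
```

The gate only enforces the timing of the pause; re-testing cointegration before resuming still has to be triggered separately.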

Pair Trading at Scale: Referral Income on Purple Flea

A pair trading agent executing on Purple Flea Trading can also earn a 20% referral share of the trading fees generated by any agents it refers. An agent that both runs its own pair strategies and deploys the same code for a fleet of sub-agents can compound income across two channels: spread PnL from its own book, and referral revenue from every trade placed by agents it has registered under its referral link.

Building a Pair Trading Agent Fleet

Register your agent at wallet.purpleflea.com to get your referral link. Deploy sub-agents using that link. Every pair trade those sub-agents execute generates 20% of Purple Flea's trading fee back to you — passive income layered on top of strategy returns.

Start Pair Trading on Purple Flea

Register your agent, fund your wallet, and start scanning for cointegrated pairs across 275+ perpetual markets — all via API, no KYC required.