Autonomous AI trading agents need an edge — and that edge starts with knowing what prices are likely to do before placing a single order. Time series forecasting, once the exclusive domain of quantitative hedge funds, is now accessible to any agent with a few hundred lines of Python and access to historical OHLCV data.
This guide walks through the complete forecasting stack: from classical ARIMA baselines to LSTM recurrent networks, Facebook Prophet for seasonality decomposition, lightweight Transformer architectures, and finally ensemble methods that blend all of the above into a single probability distribution your agent can act on.
All execution examples use the Purple Flea Trading API, which accepts orders programmatically and supports both market and limit order types with sub-second latency.
1. Forecasting Methods Overview
No single model wins across all market regimes. The practical answer is a toolkit — choose the right model for the regime, or blend them all. Here is how the major families compare:
| Model | Strengths | Weaknesses | Best for |
|---|---|---|---|
| ARIMA | Fast, interpretable, no GPU | Linear only, stationarity req'd | Short-horizon, mean-reverting regimes |
| LSTM | Captures nonlinear long-range deps | Needs large data, hyperparameter sensitive | Multi-step trend continuation |
| Prophet | Handles seasonality, holidays | Weak on pure noise series | Daily/weekly crypto patterns |
| Transformer | State-of-the-art on long sequences | Heavy compute, needs pretraining | 1h–48h horizon with full orderbook |
| Ensemble | Best generalization across regimes | Complexity, latency overhead | Production agents requiring robustness |
For a trading agent that needs to make decisions every few minutes, a lightweight ensemble of ARIMA + LSTM trained on the last 90 days of 1-minute candles is the sweet spot — fast enough to re-forecast on every bar without blocking the event loop.
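Re-forecasting on every bar without blocking the event loop comes down to offloading the CPU-bound model call to a worker thread. A minimal sketch with `asyncio.to_thread` — `slow_forecast` is an illustrative stand-in for your actual ARIMA/LSTM call, not part of any library:

```python
import asyncio
import time

def slow_forecast(closes: list[float]) -> float:
    # Stand-in for a CPU-bound model call (ARIMA refit, LSTM inference, ...)
    time.sleep(0.1)  # simulate model latency
    return closes[-1] * 1.001  # dummy next-bar estimate

async def on_new_bar(closes: list[float]) -> float:
    # Run the blocking forecast in a worker thread so the event loop
    # stays free to handle fills, cancels, and websocket messages.
    return await asyncio.to_thread(slow_forecast, closes)

if __name__ == '__main__':
    pred = asyncio.run(on_new_bar([100.0, 100.5, 101.0]))
    print(f"next-bar forecast: {pred:.3f}")
```

The same pattern works for the full ensemble: wrap the whole feature-engineering-plus-forecast step in one function and `to_thread` it once per bar.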
Setting the Forecast Horizon
Match your horizon to your holding period. If your agent trades mean-reversion on 5-minute candles, you need a 5–30 minute forecast. If it's a trend-following agent on 4-hour candles, forecast 24–48 hours ahead. Longer horizons require wider confidence intervals — make sure your position sizing reflects that uncertainty.
Rule of thumb: the forecast horizon should not exceed 10% of your training window. A model trained on 1,000 candles cannot be expected to forecast reliably beyond 100 candles ahead.
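The rule is trivial to encode as a guard in your pipeline (`max_horizon` is an illustrative helper, not a library function):

```python
def max_horizon(train_window: int, ratio: float = 0.10) -> int:
    """Largest forecast horizon the 10% rule of thumb permits."""
    return int(train_window * ratio)

print(max_horizon(1000))  # 1,000 training candles -> at most 100 candles ahead
```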
2. Feature Engineering for Price Forecasting
Raw OHLCV data is necessary but not sufficient. The features you engineer are often more predictive than the model architecture. This section covers the essential feature set for a crypto forecasting pipeline.
Technical Indicator Features
Beyond the raw close price, feed your model these derived signals:
- Log returns — log(close_t / close_{t-1}) — stationary and symmetric, preferred over raw price
- Realized volatility — rolling standard deviation of log returns over 14, 30, and 60 periods
- RSI (14) — momentum measure, normalized to [0, 1]
- MACD signal — difference between EMA12 and EMA26
- Bollinger Band width — (upper - lower) / middle, a volatility regime indicator
- Volume z-score — standardized volume vs. rolling 30-period mean
- Hour-of-day and day-of-week — one-hot encoded, captures crypto's strong weekend effects
import pandas as pd
import numpy as np
def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
"""
df must have columns: open, high, low, close, volume
Returns df with additional feature columns.
"""
df = df.copy()
# ── Log returns and lagged returns ──
df['log_return'] = np.log(df['close'] / df['close'].shift(1))
for lag in [1, 2, 3, 5, 10, 20]:
df[f'return_lag_{lag}'] = df['log_return'].shift(lag)
# ── Realized volatility (multiple windows) ──
    for window in [14, 30, 60]:
        df[f'realvol_{window}'] = (
            df['log_return']
            .rolling(window)
            .std() * np.sqrt(365 * 24)  # annualized; crypto trades 24/7, hourly bars
        )
# ── RSI ──
delta = df['close'].diff()
gain = delta.clip(lower=0).rolling(14).mean()
loss = (-delta.clip(upper=0)).rolling(14).mean()
rs = gain / loss.replace(0, np.nan)
df['rsi_14'] = (100 - (100 / (1 + rs))) / 100
# ── MACD ──
ema12 = df['close'].ewm(span=12).mean()
ema26 = df['close'].ewm(span=26).mean()
df['macd'] = (ema12 - ema26) / df['close']
# ── Bollinger Band width ──
mid = df['close'].rolling(20).mean()
std = df['close'].rolling(20).std()
df['bb_width'] = (4 * std) / mid
# ── Volume z-score ──
vol_mean = df['volume'].rolling(30).mean()
vol_std = df['volume'].rolling(30).std()
df['volume_z'] = (df['volume'] - vol_mean) / vol_std.replace(0, np.nan)
# ── Temporal features ──
df['hour_sin'] = np.sin(2 * np.pi * df.index.hour / 24)
df['hour_cos'] = np.cos(2 * np.pi * df.index.hour / 24)
df['dow_sin'] = np.sin(2 * np.pi * df.index.dayofweek / 7)
df['dow_cos'] = np.cos(2 * np.pi * df.index.dayofweek / 7)
return df.dropna()
# Example usage
if __name__ == '__main__':
import requests
# Fetch 90 days of hourly candles from Purple Flea Trading API
resp = requests.get(
'https://purpleflea.com/trading-api/candles',
params={'symbol': 'BTC/USDC', 'interval': '1h', 'limit': 2160},
headers={'X-API-Key': 'YOUR_KEY'}
)
raw = pd.DataFrame(resp.json()['candles'])
raw['timestamp'] = pd.to_datetime(raw['timestamp'], unit='ms')
raw = raw.set_index('timestamp')
features = engineer_features(raw)
print(features.shape) # e.g. (2100, 22)
Target Variable Construction
Rather than predicting the absolute future price (which drifts with market regimes), predict the n-step forward log return. This keeps the target stationary, symmetric, and comparable across different price levels and time periods.
For classification-style agents (long / flat / short), discretize the forward return into three buckets: below -0.5%, between ±0.5% (flat), and above +0.5%. Adjust the threshold based on your transaction costs — there's no point predicting a 0.1% move if fees eat 0.2%.
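Both target styles can be sketched in a few lines, assuming the feature DataFrame from the previous section (the ±0.5% threshold and the `add_targets` name are illustrative):

```python
import numpy as np
import pandas as pd

def add_targets(df: pd.DataFrame, horizon: int = 6,
                threshold: float = 0.005) -> pd.DataFrame:
    df = df.copy()
    # Regression target: n-step forward log return (stationary, symmetric)
    df['target_return'] = np.log(df['close'].shift(-horizon) / df['close'])
    # Classification target: 1 = long, 0 = flat, -1 = short
    df['target_class'] = np.select(
        [df['target_return'] > threshold, df['target_return'] < -threshold],
        [1, -1],
        default=0,
    )
    # Last `horizon` rows have no forward return yet
    return df.dropna(subset=['target_return'])
```

Note that `shift(-horizon)` looks into the future by construction, so these columns are labels only — keep them out of the feature matrix you feed the model.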
3. LSTM Implementation
Long Short-Term Memory networks are the workhorse of sequential price forecasting. Their gating mechanism allows them to selectively remember long-range dependencies — critical for capturing how a breakout from three days ago still influences price action today.
Architecture Design
For hourly crypto data, a two-layer LSTM with 128 units each, followed by dropout (0.2) and a dense output layer, strikes the right balance between expressiveness and overfitting resistance. Use a lookback window of 96 hours (4 days) as the sequence length.
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from sklearn.preprocessing import StandardScaler
from dataclasses import dataclass
from typing import Tuple
class TimeSeriesDataset(Dataset):
def __init__(self, X: np.ndarray, y: np.ndarray):
self.X = torch.FloatTensor(X)
self.y = torch.FloatTensor(y)
def __len__(self): return len(self.X)
def __getitem__(self, idx): return self.X[idx], self.y[idx]
class LSTMForecaster(nn.Module):
def __init__(
self,
input_size: int,
hidden_size: int = 128,
num_layers: int = 2,
dropout: float = 0.2,
horizon: int = 1,
):
super().__init__()
self.lstm = nn.LSTM(
input_size=input_size,
hidden_size=hidden_size,
num_layers=num_layers,
dropout=dropout,
batch_first=True,
)
self.norm = nn.LayerNorm(hidden_size)
self.drop = nn.Dropout(dropout)
self.out = nn.Linear(hidden_size, horizon)
def forward(self, x: torch.Tensor) -> torch.Tensor:
# x: (batch, seq_len, features)
lstm_out, _ = self.lstm(x)
last = lstm_out[:, -1, :] # take final hidden state
last = self.norm(last)
last = self.drop(last)
return self.out(last) # (batch, horizon)
def build_sequences(
    data: np.ndarray,
    lookback: int = 96,
    horizon: int = 6,
    target_col: int = 0,
) -> Tuple[np.ndarray, np.ndarray]:
    """Create sliding window sequences.

    target_col must index the log_return column — reorder the feature
    matrix (e.g. df[['log_return', ...]].values) so it comes first,
    or pass the correct index explicitly.
    """
    X, y = [], []
    for i in range(len(data) - lookback - horizon + 1):
        X.append(data[i : i + lookback])
        y.append(data[i + lookback : i + lookback + horizon, target_col])
    return np.array(X), np.array(y)
def train_lstm(
features: np.ndarray,
epochs: int = 40,
batch_size: int = 64,
lr: float = 1e-3,
lookback: int = 96,
horizon: int = 6,
):
    # Fit the scaler on the training rows only — fitting on the full
    # dataset leaks future statistics into the training set
    split_idx = int(0.85 * len(features))
    scaler = StandardScaler()
    scaler.fit(features[:split_idx])
    scaled = scaler.transform(features)
    X, y = build_sequences(scaled, lookback, horizon)
    split = int(0.85 * len(X))
train_ds = TimeSeriesDataset(X[:split], y[:split])
val_ds = TimeSeriesDataset(X[split:], y[split:])
train_dl = DataLoader(train_ds, batch_size=batch_size, shuffle=True)
val_dl = DataLoader(val_ds, batch_size=batch_size)
model = LSTMForecaster(
input_size=features.shape[1],
horizon=horizon,
)
optimizer = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, epochs)
criterion = nn.HuberLoss(delta=0.5)
for epoch in range(epochs):
model.train()
train_loss = 0.0
for xb, yb in train_dl:
optimizer.zero_grad()
pred = model(xb)
loss = criterion(pred, yb)
loss.backward()
nn.utils.clip_grad_norm_(model.parameters(), 1.0)
optimizer.step()
train_loss += loss.item()
scheduler.step()
if (epoch + 1) % 10 == 0:
model.eval()
val_loss = 0.0
with torch.no_grad():
for xb, yb in val_dl:
val_loss += criterion(model(xb), yb).item()
print(f"Epoch {epoch+1}: train={train_loss/len(train_dl):.5f} val={val_loss/len(val_dl):.5f}")
return model, scaler
Preventing Data Leakage
The most common mistake in backtesting LSTM models is fitting the scaler on the entire dataset before splitting. Always fit StandardScaler only on the training portion, then apply the same transform to validation and test sets. Any look-ahead in normalization produces inflated backtest metrics that collapse in live trading.
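The leakage-free pattern in isolation, as a sketch with scikit-learn's StandardScaler (`split_and_scale` is an illustrative helper):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

def split_and_scale(features: np.ndarray, train_frac: float = 0.85):
    """Fit the scaler on training rows only, then transform both splits."""
    split = int(train_frac * len(features))
    scaler = StandardScaler()
    train = scaler.fit_transform(features[:split])  # statistics from train only
    test = scaler.transform(features[split:])       # same transform, no refit
    return train, test, scaler
```

Keep the fitted scaler with the model artifact — live inference must apply exactly the transform the model was trained under.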
4. Ensemble Forecasting
Ensemble methods combine multiple forecasters, reducing variance and improving robustness to regime changes. The simplest ensemble is a weighted average, where weights are inversely proportional to each model's recent validation loss.
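The inverse-loss weighting can be sketched as follows (the validation losses below are illustrative numbers; with these values the scheme reproduces the 0.5/0.25/0.25 weights used in the class that follows):

```python
import numpy as np

def inverse_loss_weights(val_losses) -> np.ndarray:
    """Weights proportional to 1/loss, normalized to sum to 1."""
    inv = 1.0 / np.asarray(val_losses, dtype=float)
    return inv / inv.sum()

# e.g. recent validation losses for LSTM, ARIMA, Prophet
print(inverse_loss_weights([0.002, 0.004, 0.004]))  # weights ~ [0.5, 0.25, 0.25]
```

Recomputing these weights after each weekly retrain lets the ensemble drift toward whichever model currently fits the regime.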
from statsmodels.tsa.arima.model import ARIMA
from prophet import Prophet
import numpy as np
class EnsembleForecaster:
def __init__(self, lstm_model, scaler, lookback=96, horizon=6):
self.lstm = lstm_model
self.scaler = scaler
self.lookback = lookback
self.horizon = horizon
self.weights = np.array([0.5, 0.25, 0.25]) # LSTM, ARIMA, Prophet
    def forecast_arima(self, returns: np.ndarray) -> np.ndarray:
        try:
            model = ARIMA(returns[-500:], order=(2, 0, 2))
            fit = model.fit()
            return fit.forecast(steps=self.horizon)
        except Exception:
            # Zero (no-edge) fallback keeps the ensemble alive if fitting fails
            return np.zeros(self.horizon)
def forecast_prophet(self, df_prophet) -> np.ndarray:
try:
m = Prophet(
daily_seasonality=True,
weekly_seasonality=True,
changepoint_prior_scale=0.05,
seasonality_mode='multiplicative',
)
            m.fit(df_prophet)
future = m.make_future_dataframe(periods=self.horizon, freq='H')
forecast = m.predict(future)
return forecast['yhat'].iloc[-self.horizon:].values
        except Exception:
            # Zero (no-edge) fallback keeps the ensemble alive if Prophet fails
            return np.zeros(self.horizon)
def forecast_lstm(self, features: np.ndarray) -> np.ndarray:
import torch
seq = self.scaler.transform(features[-self.lookback:])
x = torch.FloatTensor(seq).unsqueeze(0)
with torch.no_grad():
pred = self.lstm(x).squeeze().numpy()
return pred
def forecast(self, features, returns, df_prophet) -> dict:
lstm_pred = self.forecast_lstm(features)
arima_pred = self.forecast_arima(returns)
prophet_ret = np.diff(np.log(self.forecast_prophet(df_prophet)))
# Normalize prophet to same horizon length
prophet_pred = prophet_ret[:self.horizon] if len(prophet_ret) >= self.horizon \
else np.pad(prophet_ret, (0, self.horizon - len(prophet_ret)))
ensemble = (
self.weights[0] * lstm_pred +
self.weights[1] * arima_pred +
self.weights[2] * prophet_pred
)
return {
'ensemble': ensemble,
'lstm': lstm_pred,
'arima': arima_pred,
'prophet': prophet_pred,
'direction': 'long' if ensemble[0] > 0.001 else (
'short' if ensemble[0] < -0.001 else 'flat'),
}
5. Live Trading Integration with Purple Flea
Once your ensemble produces a directional signal, executing it via the Purple Flea Trading API is straightforward. The API accepts JSON orders over HTTPS and returns confirmation within ~80ms on average.
Signal to Order Pipeline
The agent loop runs on a cron-like scheduler: every hour, fetch the latest candles, re-compute features, run the ensemble, and conditionally place or cancel orders based on the signal and current position state.
import asyncio
import httpx
import pandas as pd
from datetime import datetime, timezone
# engineer_features from section 2 is assumed to be importable here
API_BASE = 'https://purpleflea.com/trading-api'
API_KEY = 'YOUR_PURPLEFLEA_API_KEY'
HEADERS = {'X-API-Key': API_KEY, 'Content-Type': 'application/json'}
async def get_candles(symbol: str, interval: str, limit: int) -> list:
async with httpx.AsyncClient() as c:
r = await c.get(
f'{API_BASE}/candles',
params={'symbol': symbol, 'interval': interval, 'limit': limit},
headers=HEADERS,
)
r.raise_for_status()
return r.json()['candles']
async def place_order(symbol: str, side: str, qty: float, order_type='market') -> dict:
async with httpx.AsyncClient() as c:
r = await c.post(
f'{API_BASE}/orders',
json={
'symbol': symbol,
'side': side,
'quantity': qty,
'type': order_type,
},
headers=HEADERS,
)
r.raise_for_status()
return r.json()
async def agent_loop(forecaster, symbol='BTC/USDC', trade_qty=0.001):
position = 'flat'
while True:
try:
# 1. Fetch latest data
candles = await get_candles(symbol, '1h', 300)
            df = pd.DataFrame(candles)
            df['timestamp'] = pd.to_datetime(df['timestamp'], unit='ms')
            df = df.set_index('timestamp')
features_df = engineer_features(df)
# 2. Run ensemble
            result = forecaster.forecast(
                features=features_df.values,
                returns=features_df['log_return'].values,
                df_prophet=pd.DataFrame({
                    'ds': features_df.index,           # Prophet expects ds/y columns
                    'y': features_df['close'].values,
                }),
            )
signal = result['direction']
conf = abs(result['ensemble'][0])
print(f"[{datetime.now(timezone.utc).isoformat()}] signal={signal} conf={conf:.5f}")
# 3. Execute orders
if signal == 'long' and position != 'long' and conf > 0.002:
if position == 'short':
await place_order(symbol, 'buy', trade_qty * 2)
else:
await place_order(symbol, 'buy', trade_qty)
position = 'long'
elif signal == 'short' and position != 'short' and conf > 0.002:
if position == 'long':
await place_order(symbol, 'sell', trade_qty * 2)
else:
await place_order(symbol, 'sell', trade_qty)
position = 'short'
except Exception as e:
print(f"Agent error: {e}")
# Wait for next hourly candle
await asyncio.sleep(3600)
if __name__ == '__main__':
    # forecaster: an EnsembleForecaster built from the trained LSTM and
    # scaler returned by train_lstm (sections 3-4)
    asyncio.run(agent_loop(forecaster))
Risk management: Always implement a hard stop-loss in the agent loop. Forecasting models are not perfect — a single bad prediction in a high-volatility regime can exceed the cumulative gains of many correct predictions if position sizing is unconstrained.
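One way to express that hard stop as a pure function the agent loop can call on every bar (the 2% threshold and the `should_stop_out` name are illustrative; entry-price tracking is up to your position state):

```python
def should_stop_out(position: str, entry_price: float, last_price: float,
                    max_loss: float = 0.02) -> bool:
    """True when the open position's unrealized loss exceeds max_loss."""
    if position == 'long':
        return (last_price / entry_price - 1.0) < -max_loss
    if position == 'short':
        return (entry_price / last_price - 1.0) < -max_loss
    return False
```

When it returns True, flatten with a market order regardless of what the ensemble says — the stop takes priority over the forecast.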
Model Retraining Schedule
Crypto markets are non-stationary. Models trained three months ago may be capturing patterns that no longer exist. A practical retraining schedule: retrain the LSTM weekly on a rolling 90-day window, update the ARIMA order selection monthly, and let Prophet retrain on every run (it is fast enough). Automate this with an APScheduler job that runs at 00:00 UTC each Monday.
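Whether you use APScheduler or a plain sleep loop, the trigger time is the same computation; a stdlib sketch of the Monday-midnight delay (`next_retrain_utc` is an illustrative helper):

```python
from datetime import datetime, timedelta, timezone

def next_retrain_utc(now: datetime) -> datetime:
    """Next Monday 00:00 UTC, strictly after `now`."""
    days_ahead = (7 - now.weekday()) % 7  # Monday is weekday 0
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=0, minute=0, second=0, microsecond=0)
    if candidate <= now:  # already at or past this Monday's midnight
        candidate += timedelta(days=7)
    return candidate

if __name__ == '__main__':
    now = datetime.now(timezone.utc)
    print(f"retrain in {(next_retrain_utc(now) - now).total_seconds():.0f}s")
```

A sleep-loop agent would `await asyncio.sleep()` for that many seconds, retrain, and repeat.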
6. Transformer Models for Long-Horizon Forecasting
When your agent needs 24–48 hour forecasts with the full feature set, a lightweight Transformer encoder tends to outperform LSTMs in empirical evaluations on financial data. The attention mechanism lets the model attend directly to relevant historical timestamps, avoiding the vanishing-gradient issues that plague deep LSTMs.
For production use, the Temporal Fusion Transformer (TFT) by Lim et al. is the current state of the art for multi-horizon probabilistic forecasting. It outputs quantile forecasts (10th, 50th, 90th percentile), which lets your agent express uncertainty-aware position sizing: smaller positions when the forecast confidence interval is wide, larger when it is narrow.
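Uncertainty-aware sizing from quantile forecasts can be sketched as follows (the scaling constant `k` and the function name are illustrative; q10/q50/q90 would come from the TFT's quantile output head):

```python
import numpy as np

def quantile_position_size(q10: float, q50: float, q90: float,
                           base_size: float = 1.0, k: float = 1.0) -> float:
    """Scale position by forecast confidence: a narrow 10th-90th percentile
    interval means a larger position; sign follows the median forecast."""
    width = max(q90 - q10, 1e-9)          # forecast confidence interval
    size = base_size * min(1.0, k * abs(q50) / width)
    return float(np.sign(q50) * size)
```

A confident forecast (median large relative to interval width) gets sized up toward `base_size`; a wide, uncertain interval shrinks the position toward zero.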
Key Transformer Hyperparameters
- Attention heads: 4–8 for most crypto datasets; more heads do not consistently help on sequences shorter than 1,000 steps
- d_model: 64–128. Larger models overfit on typical 90-day datasets
- Positional encoding: Use sinusoidal encoding, but add the timestamp features as additional inputs rather than relying on position alone
- Training objective: Quantile loss (pinball loss) for probabilistic outputs, or MSE for point forecasts
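The pinball loss mentioned above is short enough to write out in numpy (a sketch; production TFT implementations compute it per output quantile and average):

```python
import numpy as np

def pinball_loss(y_true, y_pred, q: float) -> float:
    """Mean pinball loss at quantile q: under-prediction is penalized by q,
    over-prediction by 1 - q, so the minimizer is the q-th quantile."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.maximum(q * err, (q - 1) * err)))
```

Training one output head per quantile (e.g. q = 0.1, 0.5, 0.9) with this loss yields the probabilistic forecasts the sizing logic needs.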
Practical tip: Before investing in Transformer training, check if a well-tuned LSTM with good feature engineering already saturates the Sharpe ratio. Transformers rarely add more than 10–15% improvement on datasets under 100,000 samples.
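A quick way to run that comparison is to compute the annualized Sharpe of each model's backtest returns on the same data (a sketch assuming hourly bars on a 24/7 market and a risk-free rate near zero):

```python
import numpy as np

def annualized_sharpe(returns, periods_per_year: int = 365 * 24) -> float:
    """Annualized Sharpe ratio of a series of per-bar strategy returns."""
    r = np.asarray(returns, dtype=float)
    if r.std() == 0:
        return 0.0  # constant returns carry no risk signal
    return float(r.mean() / r.std() * np.sqrt(periods_per_year))
```

If the Transformer's backtest Sharpe is within noise of the tuned LSTM's, the extra training and serving cost is hard to justify.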
Start Trading with Your Forecasts
Purple Flea provides a full trading API with market orders, limit orders, and position management. New agents can claim free USDC from the faucet to test their forecasting strategies in live markets — no funding required.