Fine-Tuned Models, Real Money
Replicate is a platform that hosts thousands of open-source ML models via a clean REST API. You can call models from Stability AI, Meta, Mistral, and hundreds of community contributors with a single HTTP request, no GPU infrastructure required. More importantly, Replicate supports fine-tuning: you can train a custom version of Llama 3, Mistral, or any supported base model on your own data, host it privately, and call it via API.
For crypto trading agents, this opens a compelling possibility. A Llama 3 model fine-tuned on six months of your historical trade data (market state as input, trade decision and outcome as output) can develop genuine pattern recognition for the market conditions where you have edge. A 7B fine-tuned specialist often outperforms a 70B generalist on domain-specific tasks. This guide shows how to connect a Replicate-hosted model to live trade execution via Purple Flea's API.
Why Fine-Tune for Trading
Base models are trained on internet text. They can reason about trading concepts, but they do not know:
- Your specific risk tolerance (how much drawdown is acceptable)
- The market patterns that have historically worked in your timeframe
- Your exchange's fee structure and how it affects position sizing
- Which indicators have been predictive in your specific market conditions
Fine-tuning on your historical data teaches the model all of these things. The training examples encode real market states paired with the decisions that led to profitable outcomes. The model learns the mapping specific to your strategy and risk profile.
A secondary benefit: fine-tuned models produce more structured, parseable output. A base model asked to output a trading decision in JSON will occasionally hallucinate extra fields or deviate from the schema. A fine-tuned model that has seen thousands of correctly structured examples almost never breaks the format.
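Even with a fine-tuned model, it is worth validating every response before acting on it. A minimal sketch of a defensive parser (the schema here mirrors the prompt used later in this guide; the function name and fallback policy are this sketch's own choices, not part of any library):

```python
import json

REQUIRED_FIELDS = {"action", "size_usd", "stop_loss_pct", "take_profit_pct", "reason"}
VALID_ACTIONS = {"long", "short", "hold"}

def parse_decision(raw: str) -> dict:
    """Validate a model response against the expected trading schema.

    Falls back to a safe 'hold' decision when the output is malformed,
    so one bad generation never triggers an unintended trade.
    """
    fallback = {"action": "hold", "size_usd": 0, "stop_loss_pct": 0,
                "take_profit_pct": 0, "reason": "unparseable model output"}
    try:
        # Extract the first {...} span, tolerating prose around the JSON
        start, end = raw.find("{"), raw.rfind("}") + 1
        decision = json.loads(raw[start:end])
    except ValueError:
        return fallback
    if not REQUIRED_FIELDS.issubset(decision) or decision.get("action") not in VALID_ACTIONS:
        return fallback
    return decision
```

Routing every model output through a validator like this turns the "almost never breaks the format" property into a hard guarantee: a malformed response degrades to a no-op instead of an exception at trade time.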
Setup
pip install replicate requests
Get your Replicate API token at replicate.com/account/api-tokens. Set it as an environment variable:
export REPLICATE_API_TOKEN="your-token-here"
export PF_API_KEY="your-purple-flea-key"
Basic Integration: Base Model
Before fine-tuning, validate the pipeline with a base model. This confirms the market data fetching, model invocation, response parsing, and trade execution all work end-to-end:
import replicate
import requests
import json
import os
PF_KEY = os.environ["PF_API_KEY"]
HEADERS = {"Authorization": f"Bearer {PF_KEY}"}
PF_BASE = "https://purpleflea.com/api/v1"
def get_market_context(symbol: str = "BTC-PERP") -> dict:
    """Fetch current market data for the trading prompt."""
    price_r = requests.get(f"{PF_BASE}/markets/{symbol}/price",
                           headers=HEADERS, timeout=10)
    price_r.raise_for_status()
    price_data = price_r.json()
    funding_r = requests.get(f"{PF_BASE}/markets/{symbol}/funding-rate",
                             headers=HEADERS, timeout=10)
    funding_r.raise_for_status()
    funding_data = funding_r.json()
    return {
        "symbol": symbol,
        "price": price_data["price"],
        "change_24h_pct": price_data["change_24h_pct"],
        "volume_24h_usd": price_data["volume_24h_usd"],
        "funding_rate_8h": funding_data["rate_8h"],
    }
def run_trading_agent(max_position_usd: float = 1000.0) -> dict:
    """Run one decision cycle: fetch data, query model, execute if warranted."""
    ctx = get_market_context("BTC-PERP")
    prompt = (
        f"Market state: {ctx['symbol']} at ${ctx['price']:,.0f}, "
        f"24h change: {ctx['change_24h_pct']:+.1f}%, "
        f"24h volume: ${ctx['volume_24h_usd']/1e6:.0f}M, "
        f"funding rate (8h): {ctx['funding_rate_8h']:.4f}%. "
        f"Max position size: ${max_position_usd:,.0f}. "
        f"Respond with JSON only: "
        f'{{"action": "long|short|hold", "size_usd": number, '
        f'"stop_loss_pct": number, "take_profit_pct": number, '
        f'"reason": "string"}}'
    )
    output = replicate.run(
        "meta/meta-llama-3.1-70b-instruct",
        input={"prompt": prompt, "max_tokens": 256, "temperature": 0.3}
    )
    response_text = "".join(output).strip()
    # Extract the JSON object from the response
    start = response_text.find("{")
    end = response_text.rfind("}") + 1
    decision = json.loads(response_text[start:end])
    print(f"[AGENT] Decision: {decision['action'].upper()} "
          f"${decision.get('size_usd', 0):,.0f} - {decision.get('reason', '')}")
    if decision["action"] in ("long", "short"):
        trade_r = requests.post(
            f"{PF_BASE}/trade",
            json={
                "symbol": "BTC-PERP",
                "side": decision["action"],
                "size_usd": min(decision["size_usd"], max_position_usd),
                "stop_loss_pct": decision.get("stop_loss_pct", 3.0),
                "take_profit_pct": decision.get("take_profit_pct", 6.0),
            },
            headers=HEADERS,
            timeout=10,
        )
        decision["trade_result"] = trade_r.json()
    return decision
if __name__ == "__main__":
    result = run_trading_agent()
    print(json.dumps(result, indent=2))
Fine-Tuning for Trading
Fine-tuning requires a dataset of training examples. Each example is a prompt (market state) paired with a completion (optimal trade decision). Purple Flea's historical data API gives you the raw material: past market states, and your past trade outcomes.
Dataset Format
# training_data.jsonl: one example per line
{"prompt": "Market state: BTC-PERP at $72,400, 24h change: +2.3%, funding: +0.012%...", "completion": "{\"action\": \"long\", \"size_usd\": 800, \"stop_loss_pct\": 2.5, \"take_profit_pct\": 5.0, \"reason\": \"Positive momentum with moderate funding; favorable risk/reward\"}"}
{"prompt": "Market state: BTC-PERP at $68,100, 24h change: -4.1%, funding: -0.022%...", "completion": "{\"action\": \"short\", \"size_usd\": 600, \"stop_loss_pct\": 3.0, \"take_profit_pct\": 8.0, \"reason\": \"Breakdown below support; negative funding adds carry income\"}"}
{"prompt": "Market state: BTC-PERP at $75,200, 24h change: +0.1%, funding: +0.045%...", "completion": "{\"action\": \"hold\", \"size_usd\": 0, \"stop_loss_pct\": 0, \"take_profit_pct\": 0, \"reason\": \"High funding rate with no momentum; avoid longs\"}"}
Build 500โ2,000 examples by querying your Purple Flea trade history and pairing each trade's market context with the outcome-labelled decision. Trades that were profitable get their actual parameters in the completion; losing trades are converted to "hold" decisions so the model learns to avoid those conditions.
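The conversion from trade history to training lines can be sketched as a pure function over outcome-labelled records. The field names on the trade records below are illustrative, not a documented Purple Flea schema; adapt them to whatever your history endpoint actually returns:

```python
import json

def build_training_examples(trades: list[dict]) -> list[str]:
    """Convert outcome-labelled trade records into JSONL training lines.

    Profitable trades keep their actual parameters as the completion;
    losing trades become 'hold' examples so the model learns which
    market conditions to avoid.
    """
    lines = []
    for t in trades:
        prompt = (f"Market state: {t['symbol']} at ${t['entry_price']:,.0f}, "
                  f"24h change: {t['change_24h_pct']:+.1f}%, "
                  f"funding: {t['funding_rate_8h']:+.3f}%...")
        if t["pnl_usd"] > 0:
            completion = {"action": t["side"], "size_usd": t["size_usd"],
                          "stop_loss_pct": t["stop_loss_pct"],
                          "take_profit_pct": t["take_profit_pct"],
                          "reason": t.get("reason", "historically profitable setup")}
        else:
            completion = {"action": "hold", "size_usd": 0, "stop_loss_pct": 0,
                          "take_profit_pct": 0,
                          "reason": "similar conditions were unprofitable"}
        # Completion is itself a JSON string, matching the dataset format above
        lines.append(json.dumps({"prompt": prompt,
                                 "completion": json.dumps(completion)}))
    return lines
```

Writing the returned lines to `training_data.jsonl`, one per line, produces exactly the format shown above.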
Launching a Fine-Tune via Replicate
import replicate

training = replicate.trainings.create(
    version="meta/meta-llama-3.1-8b-instruct:latest",
    input={
        "train_data": "https://storage.googleapis.com/your-bucket/training_data.jsonl",
        "num_train_epochs": 3,
        "learning_rate": 0.0002,
        "lora_rank": 16,
    },
    destination="your-username/btc-trading-agent"
)
print(f"Training ID: {training.id}")
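Training runs asynchronously, so you will want to poll until it reaches a terminal state. A generic polling helper, sketched here with the fetch decoupled from the Replicate client so it can be tested offline; with the client you would pass something like `lambda: replicate.trainings.get(training.id).status`:

```python
import time

def wait_for_training(fetch_status, poll_seconds: float = 30.0,
                      max_polls: int = 1000) -> str:
    """Poll a zero-argument status callable until a terminal state.

    Returns the terminal status string; raises if the polling
    budget is exhausted first.
    """
    terminal = {"succeeded", "failed", "canceled"}
    for _ in range(max_polls):
        status = fetch_status()
        if status in terminal:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("training did not finish within the polling budget")
```

Once the run succeeds, the fine-tuned model is available at the `destination` you specified and can be swapped into `replicate.run()` in place of the base model.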
Testing and Validation
Before deploying with real capital, backtest the fine-tuned model against a held-out dataset of historical market states. Compare its decisions against the base model and against a random baseline. Key metrics to evaluate:
- Decision accuracy: How often does the model's action match the optimal action (in hindsight)?
- Simulated Sharpe ratio: Run 90 days of paper trading and compute the Sharpe ratio; the fine-tuned model should exceed the base model.
- Format reliability: What percentage of outputs are valid JSON? Should be 99%+.
- Hold rate: A model that always says "long" is not learning. Check that hold decisions are appropriate in flat markets.
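Three of these metrics can be computed directly from raw model outputs on a held-out set. A minimal scorer (the function name and return shape are this sketch's own):

```python
import json

def evaluate_decisions(raw_outputs: list[str], optimal_actions: list[str]) -> dict:
    """Score raw model outputs against hindsight-optimal actions.

    Returns format reliability (valid-JSON rate), decision accuracy,
    and hold rate over the held-out examples.
    """
    valid, correct, holds = 0, 0, 0
    for raw, optimal in zip(raw_outputs, optimal_actions):
        try:
            action = json.loads(raw)["action"]
            valid += 1
        except (ValueError, KeyError):
            action = None  # malformed output counts against reliability
        if action == optimal:
            correct += 1
        if action == "hold":
            holds += 1
    n = len(raw_outputs)
    return {"format_reliability": valid / n,
            "decision_accuracy": correct / n,
            "hold_rate": holds / n}
```

Run this for both the base and fine-tuned model on the same held-out set; the fine-tuned model should dominate on format reliability and meaningfully beat the base model on accuracy before any capital is at risk.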
Production Deployment
In production, schedule the agent to run on a regular interval using cron or a task queue. A typical configuration runs the decision cycle every 15 minutes during high-volatility hours, every 60 minutes during quiet periods.
# /etc/cron.d/trading-agent
*/15 6-22 * * 1-5 ubuntu /usr/bin/python3 /home/ubuntu/agent/run_trading_agent.py >> /var/log/trading-agent.log 2>&1
Always set a maximum daily loss limit. Track cumulative P&L in a database and halt the agent if drawdown exceeds your threshold. Fine-tuned models can overfit to historical patterns that no longer apply; regular retraining (monthly) keeps the model current.
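The kill-switch itself can be a small pure function over the cumulative P&L series pulled from that database. A sketch (in production the series would come from your fills table; here it is passed in directly):

```python
def max_drawdown_usd(cumulative_pnl: list[float]) -> float:
    """Largest peak-to-trough decline in a cumulative P&L series."""
    peak, worst = float("-inf"), 0.0
    for pnl in cumulative_pnl:
        peak = max(peak, pnl)
        worst = max(worst, peak - pnl)
    return worst

def should_halt(cumulative_pnl: list[float], max_drawdown_limit_usd: float) -> bool:
    """Return True once drawdown breaches the configured limit.

    Call this at the top of every decision cycle and exit before
    querying the model if it fires.
    """
    return max_drawdown_usd(cumulative_pnl) >= max_drawdown_limit_usd
```

Keeping the halt check ahead of the model call means a runaway day stops costing both trading losses and inference fees.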
Getting Started
The Replicate + Purple Flea combination provides a complete stack for AI-native crypto trading: model hosting and fine-tuning on Replicate, market data and trade execution on Purple Flea. Start with the base model integration to validate the pipeline, collect trade data for six to eight weeks, then fine-tune and compare performance.