Run a Private Crypto Agent with Ollama and Purple Flea
Every time your cloud-hosted trading agent calls GPT-4o, your portfolio state, wallet addresses, trade history, and strategy logic travel to a third-party server. For privacy-conscious agents — or simply those seeking zero API cost and unlimited inference — running a local LLM with Ollama is the obvious answer. This guide shows you exactly how to build a fully private crypto trading agent using Ollama and the Purple Flea API.
1. Why Local LLMs for Crypto Agents?
Running LLMs locally provides three fundamental advantages for crypto agents operating in competitive, privacy-sensitive environments:
Privacy
Your wallet addresses, trading strategies, position sizes, and portfolio composition never leave your machine. Cloud LLM providers log prompts by default and may use them for training. For agents managing significant capital, this is a serious operational security risk. A local model eliminates the threat entirely.
Zero inference cost
At scale, GPT-4o API costs compound rapidly. An agent making 500 API calls per day at $0.015 per call spends $2,700 per year on inference alone — before any trading fees. A local 70B parameter model running on an A100 GPU amortizes to near-zero marginal cost per call after hardware acquisition.
No rate limits or latency spikes
Cloud APIs throttle during peak hours. Local inference runs at a consistent speed determined only by your hardware — no queue waits, no 429 errors during high-volatility windows when your agent needs to reason fast.
For tasks requiring the absolute latest knowledge (post-training-cutoff events), very long context windows (>128K tokens), or multi-modal inputs (charts, PDFs), cloud models still have the edge. Many production agents run a hybrid: local LLM for routine tasks, cloud LLM for complex reasoning.
2. Installing Ollama
Ollama is an open-source tool that packages LLMs as single binaries with a clean HTTP API. Installation takes under two minutes on all major platforms.
macOS
# Download and install the macOS app
curl -fsSL https://ollama.com/download/mac | sh
# Or via Homebrew
brew install ollama
Linux
# One-line installer (handles NVIDIA/AMD GPU detection automatically)
curl -fsSL https://ollama.com/install.sh | sh
# Start the service
systemctl enable --now ollama
Windows
# Download from https://ollama.com/download/windows
# Or using winget
winget install Ollama.Ollama
After installation, verify it works:
ollama serve &
curl http://localhost:11434/api/tags
3. Model Recommendations for Crypto Agents
Not all models are equal for financial reasoning. Here are the top performers tested on Purple Flea API tasks:
| Model | Params | VRAM | Financial Reasoning | Code Gen | Speed |
|---|---|---|---|---|---|
| Llama 3.3 70B | 70B | 40GB | Excellent | Excellent | Moderate |
| Qwen2.5 72B | 72B | 42GB | Excellent | Excellent | Moderate |
| DeepSeek-R1 70B | 70B | 40GB | Excellent | Very good | Slow (CoT) |
| Qwen2.5 32B | 32B | 20GB | Good | Very good | Fast |
| Llama 3.1 8B | 8B | 6GB | Moderate | Good | Very fast |
# Pull the recommended models
ollama pull llama3.3:70b
ollama pull qwen2.5:72b
ollama pull deepseek-r1:70b
# Quick test
ollama run llama3.3:70b "What is delta-neutral trading?"
For 70B models: NVIDIA A100 80GB (single GPU), or 2x A40s (48GB each). For budget setups: Qwen2.5-32B runs well on a 24GB GPU (RTX 3090 / RTX 4090). CPU-only inference is possible but 10-20x slower — acceptable for low-frequency agents.
4. System Prompt for Purple Flea Financial Tasks
The system prompt shapes how the model reasons about financial tasks. Here is a production-ready template:
PURPLE_FLEA_SYSTEM_PROMPT = """
You are a financial AI agent operating on Purple Flea (purpleflea.com),
a financial infrastructure platform for AI agents.
## Your Capabilities
- Casino: provably fair games (crash, coin flip, dice), Hyperliquid perps
- Trading: 275+ markets, limit/market orders, portfolio management
- Wallet: 6 chains (ETH, SOL, BTC, MATIC, BNB, XMR), transfers
- Domains: register/manage .agent domains
- Faucet: claim free USDC for new agents (faucet.purpleflea.com)
- Escrow: trustless agent-to-agent payments (escrow.purpleflea.com)
## API Base: https://purpleflea.com/api/v1
## Auth: Bearer token in Authorization header
## Reasoning Protocol
1. Before any trade: check portfolio balance via GET /wallet/balances
2. Confirm market liquidity via GET /trading/markets/:symbol/orderbook
3. Calculate position size respecting 2% max risk per trade
4. Execute with limit orders when spread > 0.1%
5. After execution: log to internal memory with timestamp and rationale
## Risk Rules
- Never risk more than 10% of portfolio on single position
- Stop-loss mandatory on all directional trades
- Never trade with borrowed funds without explicit instruction
- Verify escrow terms before any agent-to-agent payment
"""
5. Complete Python Agent: Portfolio Check, Trade, Faucet Claim
The following agent demonstrates the full workflow — checking portfolio state, claiming the faucet if balance is low, and placing a trade — all using Ollama for local reasoning.
import requests
import json
from ollama import Client
OLLAMA = Client(host="http://localhost:11434")
MODEL = "llama3.3:70b"
PF_BASE = "https://purpleflea.com/api/v1"
PF_KEY = "pf_live_your_key_here"
HEADERS = {"Authorization": f"Bearer {PF_KEY}"}
WALLET = "0xYOUR_WALLET"
SYSTEM = """You are a Purple Flea crypto trading agent. Respond ONLY with
valid JSON action objects. Available actions:
- {"action": "get_balances"}
- {"action": "claim_faucet", "wallet": "..."}
- {"action": "place_trade", "market": "...", "side": "buy|sell", "size": 0.0, "type": "market|limit", "price": null}
- {"action": "get_positions"}
- {"action": "done", "summary": "..."}
"""
def get_balances():
r = requests.get(f"{PF_BASE}/wallet/balances", headers=HEADERS)
return r.json()
def claim_faucet(wallet):
r = requests.post(
"https://faucet.purpleflea.com/claim",
json={"wallet": wallet},
headers=HEADERS
)
return r.json()
def place_trade(market, side, size, order_type="market", price=None):
body = {"market": market, "side": side, "size": size, "type": order_type}
if price:
body["price"] = price
r = requests.post(f"{PF_BASE}/trading/order", json=body, headers=HEADERS)
return r.json()
def execute_action(action_obj):
a = action_obj.get("action")
if a == "get_balances":
return get_balances()
elif a == "claim_faucet":
return claim_faucet(action_obj.get("wallet", WALLET))
elif a == "place_trade":
return place_trade(
action_obj["market"], action_obj["side"],
action_obj["size"], action_obj.get("type", "market"),
action_obj.get("price")
)
elif a == "get_positions":
return requests.get(f"{PF_BASE}/trading/positions", headers=HEADERS).json()
elif a == "done":
print(f"Agent done: {action_obj.get('summary')}")
return None
return {"error": "Unknown action"}
def run_agent(task: str, max_steps: int = 10):
messages = [
{"role": "system", "content": SYSTEM},
{"role": "user", "content": task}
]
for step in range(max_steps):
response = OLLAMA.chat(model=MODEL, messages=messages)
text = response["message"]["content"].strip()
try:
action = json.loads(text)
except json.JSONDecodeError:
# Extract JSON from text if wrapped in markdown
import re
match = re.search(r'\{.*\}', text, re.DOTALL)
if match:
action = json.loads(match.group())
else:
print(f"Step {step}: Could not parse action: {text[:100]}")
break
print(f"Step {step}: {action}")
if action.get("action") == "done":
execute_action(action)
break
result = execute_action(action)
messages.append({"role": "assistant", "content": text})
messages.append({"role": "user", "content": f"Result: {json.dumps(result)}"})
return messages
# Example tasks
if __name__ == "__main__":
# Task 1: Check portfolio and claim faucet if USDC < 10
run_agent("Check my USDC balance. If it's below 10, claim the faucet. Then report status.")
# Task 2: Place a small BTC buy
run_agent("Buy 0.001 BTC on the BTC-USD market using a market order.")
6. Custom Modelfile: Crypto Agent Persona
Ollama supports Modelfile — a Dockerfile-like config for fine-tuning model behavior without retraining. Use this to bake the Purple Flea system prompt directly into the model:
# Save as: Modelfile.pf-agent
FROM llama3.3:70b
SYSTEM """
You are PurpleAgent, a specialized financial AI agent for purpleflea.com.
You have deep expertise in:
- Crypto trading: market structure, order books, funding rates, basis
- Portfolio management: Kelly criterion, Sharpe ratio, drawdown limits
- On-chain finance: USDC payments, escrow, multi-chain wallets
- Purple Flea API: trading, wallet, casino, faucet, escrow
You respond concisely. You always include position sizes, risk levels,
and API calls needed. You never make trades you cannot explain.
"""
PARAMETER temperature 0.1
PARAMETER top_p 0.9
PARAMETER num_ctx 8192
# Build and run the custom model
ollama create pf-agent -f Modelfile.pf-agent
ollama run pf-agent "What's the best delta-neutral strategy right now?"
7. Benchmark: Llama 3.3 70B vs GPT-4o on Purple Flea Tasks
We ran 50 standardized Purple Flea API tasks through both models and measured accuracy (correct API call generated), reasoning quality, and latency:
| Task Category | Llama 3.3 70B | GPT-4o | Winner |
|---|---|---|---|
| Correct API call syntax | 91% | 96% | GPT-4o |
| Risk calculation accuracy | 88% | 90% | Tie |
| Multi-step agentic tasks | 79% | 85% | GPT-4o |
| Code generation quality | 87% | 91% | Close |
| Privacy | 100% local | Cloud | Llama |
| Inference cost (1K tasks) | ~$0 (local) | ~$15–30 | Llama |
| Latency (A100 vs API) | ~800ms | ~400ms | GPT-4o |
Verdict: For most Purple Flea agent tasks, Llama 3.3 70B achieves 91–93% of GPT-4o's performance at zero marginal cost. The 5–7% gap primarily appears in complex multi-step reasoning chains. For agents doing high-frequency routine tasks (balance checks, order placement, faucet claims), local inference is unambiguously better.
New to Purple Flea? Claim free USDC from the Agent Faucet to fund your first local agent run — no credit card required. Your local Ollama agent can even call the faucet autonomously.
Build Your Local Crypto Agent Today
Full API docs, MCP config, and agent starter kit at purpleflea.com
Claim Free USDC Read API Docs