1. The Problem: LLMs Forget Everything Between Sessions
An LLM is stateless by design. Every API call begins with a blank slate. Your agent could execute 200 trades on Purple Flea, earn referral income from 15 sub-agents, and build a nuanced understanding of the casino's payout volatility, only to lose all of that context the moment the process exits or the context window fills.
For casual chatbots, this is a minor inconvenience. For financial agents, it is a catastrophic design flaw. An agent that cannot remember its own balance trajectory will re-evaluate the same opportunities from scratch every session. It will repeat losing strategies because it cannot recall that it tried them before. It will over-risk capital because it does not know how close it is to its drawdown limit.
The good news: the solution is engineering, not model improvement. You can build a robust financial memory layer that outlives any individual LLM session and makes your agent measurably smarter over time.
Agents without persistent memory can spend a substantial fraction of each session (on the order of 30%) rediscovering context that already existed. On Purple Flea trading, that delay costs real execution edge: markets move while your agent re-orients.
What Financial Context Is Worth Persisting?
Not all context is equally valuable. Before building a memory system, decide what information justifies storage overhead:
- High value: Trade history, P&L per strategy, referral earnings per agent, balance snapshots over time, risk parameters, successful/failed patterns.
- Medium value: Market conditions at trade time, reasoning traces for non-obvious decisions, sub-agent performance metrics.
- Low value: Raw API responses, intermediate reasoning steps, transient error messages.
The goal is to store enough context that the next session's agent can understand where it is, how it got there, and what has and hasn't worked, in a form that fits efficiently into the system prompt without consuming the entire context window.
2. External Memory Stores: SQLite, Redis, Vector DBs
Three storage tiers serve different memory needs for financial agents. The right architecture uses all three in combination.
SQLite: The Canonical Ledger
SQLite is the workhorse for financial record-keeping. It provides ACID transactions, full SQL query capability, and zero infrastructure overhead. For a single-agent or low-concurrency deployment, SQLite handles millions of trade records without difficulty.
Use SQLite for: trade history, balance snapshots, referral earnings, P&L accounting, agent lifecycle events. This is your ground truth: every number the agent cares about lives here.
Redis: Hot Context Cache
Redis provides sub-millisecond reads for frequently accessed context. Rather than querying SQLite on every session start, pre-aggregate the agent's financial summary into Redis keys with a short TTL (5–15 minutes). This lets sessions boot instantly with current context.
Use Redis for: current balance, active positions, recent trade summary, session state flags. Redis is also invaluable for distributed agents: multiple parallel agent processes can share a consistent view of portfolio state.
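The hot-cache pattern can be sketched with a stdlib stand-in. This is illustrative only: `HotContextCache` and the key names are invented for the example, and in production the dict would be a Redis connection with `setex`/`get` calls instead. The TTL discipline and fall-back-to-SQLite-on-miss behavior are the point, not the backend.

```python
import json
import time

class HotContextCache:
    """Dict-backed stand-in for the Redis hot-context cache pattern."""

    def __init__(self, ttl_seconds: int = 600):  # e.g. a 10-minute TTL
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expiry timestamp, serialized value)

    def put(self, key: str, value) -> None:
        # With redis-py this would be: conn.setex(key, self.ttl, json.dumps(value))
        self._store[key] = (time.monotonic() + self.ttl, json.dumps(value))

    def get(self, key: str, default=None):
        entry = self._store.get(key)
        if entry is None or time.monotonic() > entry[0]:
            # Miss or expired: the caller falls back to the SQLite ledger
            # and repopulates the cache.
            return default
        return json.loads(entry[1])

# Session boot: read hot context first, fall back to SQLite on a miss.
cache = HotContextCache(ttl_seconds=600)
cache.put("agent:balance", {"usdc": 247.83})
cache.put("agent:active_positions", ["BTC-long-0.01"])

balance = cache.get("agent:balance")   # hit: instant context
stale = cache.get("agent:last_trade")  # miss: would trigger a SQLite query
```

The same key layout works unchanged against a real Redis instance, since values are already serialized as JSON strings.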
Vector Database: Semantic Trade Retrieval
When an agent faces a decision, it benefits from recalling similar past situations, not just raw history. A vector database (Chroma, Qdrant, Weaviate, or Pinecone) stores embeddings of trade contexts, enabling semantic search: "show me trades where I entered a BTC long during high funding rates."
Use vector DBs for: RAG over trade history, pattern matching against past decisions, retrieving relevant market condition analogues.
| Store | Latency | Capacity | Best For | Setup Cost |
|---|---|---|---|---|
| SQLite | 1–5ms | Millions of rows | Trade ledger, P&L | Zero |
| Redis | <1ms | Memory-bounded | Hot context, state flags | Low |
| Chroma | 10–50ms | Millions of vectors | Semantic trade retrieval | Low |
| PostgreSQL | 5–20ms | Unlimited | Multi-agent shared ledger | Medium |
| Pinecone | 10–30ms | Unlimited (cloud) | Large-scale RAG | High (cost) |
Start with SQLite + Redis only. Add vector search once your trade history exceeds ~5,000 entries and you find yourself wanting to query "what did I do last time markets moved like this?"
3. Summarizing Financial State for Context Injection
The agent's memory is useless if it can't be efficiently injected into the next session. A full trade dump of 10,000 rows would exhaust any context window. The solution is a financial summary generator: a function that reads the full history and distills it into a compact, information-dense text block.
Target 400–800 tokens for the financial summary section of your system prompt. This leaves ample room for current task context, tool descriptions, and user instructions.
```python
import sqlite3
from datetime import datetime

def generate_financial_summary(db_path: str) -> str:
    """Generate a compact financial summary for system prompt injection."""
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()

    # Core balance metrics
    cursor.execute("""
        SELECT current_balance, peak_balance, lowest_balance,
               total_deposited, total_withdrawn
        FROM balance_snapshots ORDER BY ts DESC LIMIT 1
    """)
    bal = cursor.fetchone()

    # 30-day P&L
    cursor.execute("""
        SELECT SUM(pnl_usd), COUNT(*),
               SUM(CASE WHEN pnl_usd > 0 THEN 1 ELSE 0 END) AS wins
        FROM trades WHERE ts > datetime('now', '-30 days')
    """)
    total_pnl, trade_count, wins = cursor.fetchone()

    # Strategy breakdown
    cursor.execute("""
        SELECT strategy, SUM(pnl_usd), COUNT(*)
        FROM trades GROUP BY strategy ORDER BY SUM(pnl_usd) DESC
    """)
    strategies = cursor.fetchall()

    # Referral earnings
    cursor.execute("""
        SELECT SUM(earned_usd), COUNT(DISTINCT sub_agent_id)
        FROM referral_earnings WHERE ts > datetime('now', '-30 days')
    """)
    ref = cursor.fetchone()
    conn.close()

    win_rate = (wins / trade_count * 100) if trade_count else 0
    strategy_lines = "\n".join(
        f"  - {s[0]}: ${s[1]:+.2f} over {s[2]} trades" for s in strategies[:5]
    )

    return f"""
=== AGENT FINANCIAL MEMORY (as of {datetime.utcnow().isoformat()}Z) ===
Balance: ${bal[0]:.2f} USDC (peak: ${bal[1]:.2f}, low: ${bal[2]:.2f})
Deposited: ${bal[3]:.2f} | Withdrawn: ${bal[4]:.2f}
30-Day P&L: ${total_pnl:+.2f} over {trade_count} trades ({win_rate:.1f}% win rate)
Strategy Performance:
{strategy_lines}
Referral Income (30d): ${ref[0] or 0:.2f} from {ref[1] or 0} active sub-agents
Lifecycle Stage: {determine_lifecycle_stage(bal[0]).name}
=== END FINANCIAL MEMORY ==="""
```
This summary function produces output like the following, which can be prepended to any system prompt:
```
=== AGENT FINANCIAL MEMORY (as of 2026-03-06T09:14:00Z) ===
Balance: $247.83 USDC (peak: $312.40, low: $18.50)
Deposited: $101.00 | Withdrawn: $0.00
30-Day P&L: +$146.83 over 89 trades (64.0% win rate)
Strategy Performance:
  - momentum_long: +$89.20 over 34 trades
  - casino_kelly: +$41.50 over 22 trades
  - referral_passive: +$28.10 over 12 trades
  - grid_trading: -$11.97 over 21 trades
Referral Income (30d): $28.10 from 4 active sub-agents
Lifecycle Stage: GROWTH
=== END FINANCIAL MEMORY ===
```
4. Trade Journal: Every Trade Logged to Persistent Storage
The trade journal is the most fundamental memory primitive. Every trade, win or loss, gets a row in the database before the session can forget it. Design the schema to capture not just outcomes but context: what the agent was thinking, what market conditions looked like, and what strategy it was executing.
```sql
-- Core trade journal schema
CREATE TABLE trades (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    ts TEXT NOT NULL DEFAULT (datetime('now')),
    service TEXT NOT NULL,      -- 'casino', 'trading', 'referral', 'domains'
    strategy TEXT,
    asset TEXT,                 -- 'BTC', 'ETH', 'USDC', null for casino
    side TEXT,                  -- 'long', 'short', 'bet', 'register'
    size_usd REAL NOT NULL,
    entry_price REAL,
    exit_price REAL,
    pnl_usd REAL,
    fee_usd REAL DEFAULT 0,
    reasoning TEXT,             -- agent's stated rationale
    market_ctx TEXT,            -- JSON snapshot of market state
    outcome TEXT,               -- 'win', 'loss', 'neutral'
    session_id TEXT
);

CREATE TABLE balance_snapshots (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    ts TEXT NOT NULL DEFAULT (datetime('now')),
    current_balance REAL NOT NULL,
    peak_balance REAL,
    lowest_balance REAL,
    total_deposited REAL DEFAULT 0,
    total_withdrawn REAL DEFAULT 0,
    session_id TEXT
);

CREATE TABLE referral_earnings (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    ts TEXT NOT NULL DEFAULT (datetime('now')),
    sub_agent_id TEXT NOT NULL,
    trigger_event TEXT,         -- trade that triggered the referral fee
    earned_usd REAL NOT NULL,
    fee_rate REAL DEFAULT 0.15  -- 15% Purple Flea referral rate
);
```
Logging a trade should be the first thing that happens after execution: before any response processing, before updating in-memory state. Write-first semantics ensure no trade is ever lost to a crash or timeout.
Always call journal.log_trade() immediately after receiving the API response, before any other processing. If the process dies between trade execution and logging, you lose a real-money event, which is unacceptable for a financial agent.
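The write-first discipline looks like the sketch below. Here `execute_trade` is a hypothetical stand-in for the real exchange call, and the table is a trimmed version of the trades schema; the structure to copy is the ordering: journal inside a transaction first, everything else second.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trades (id INTEGER PRIMARY KEY, strategy TEXT, "
    "size_usd REAL, pnl_usd REAL)"
)

def execute_trade(strategy: str, size_usd: float) -> dict:
    # Hypothetical placeholder for the real trading API call.
    return {"strategy": strategy, "size_usd": size_usd, "pnl_usd": 1.25}

def execute_and_journal(strategy: str, size_usd: float) -> int:
    response = execute_trade(strategy, size_usd)
    # 1. Journal FIRST, inside a transaction: a crash after this commit
    #    can never lose the money event.
    with conn:
        cur = conn.execute(
            "INSERT INTO trades (strategy, size_usd, pnl_usd) VALUES (?,?,?)",
            (response["strategy"], response["size_usd"], response["pnl_usd"]),
        )
        trade_id = cur.lastrowid
    # 2. Only now do response processing, in-memory state updates, etc.
    return trade_id

trade_id = execute_and_journal("momentum_long", 5.0)
```

Because the `with conn:` block commits before control returns, any failure in step 2 leaves the ledger complete.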
5. Session Handoff: Injecting Financial Summary into the Next Session
Session handoff is the bridge between persistent storage and the LLM's context window. When a new agent session starts, the first action should be loading the financial memory summary and injecting it into the system prompt.
The handoff has two parts: a financial summary (compact, numbers-focused) and a strategic context block (current goals, active positions, recent decisions). Together they reconstruct the agent's operational awareness without exhausting context tokens.
```python
def build_system_prompt(base_prompt: str, memory: 'FinancialMemory') -> str:
    """Inject financial context into the system prompt for a new session."""
    financial_summary = memory.generate_summary()
    strategic_context = memory.get_strategic_context()
    recent_trades = memory.get_recent_trades(n=5)

    recent_block = "\n".join([
        f"  [{t['ts']}] {t['strategy']}: {t['outcome']} (${t['pnl_usd']:+.2f})"
        for t in recent_trades
    ])

    memory_block = f"""
{financial_summary}

=== STRATEGIC CONTEXT ===
Current Goal: {strategic_context['goal']}
Active Positions: {strategic_context['active_positions']}
Next Action Queue: {strategic_context['next_actions']}
Risk Budget Remaining: ${strategic_context['risk_budget_remaining']:.2f}
Recent Trades:
{recent_block}
=== END CONTEXT ==="""

    return memory_block + "\n\n" + base_prompt
```
Context Window Budget
With a 200k-token context window (Claude), you have generous room, but discipline still matters. Allocate token budget explicitly:
- Financial summary: ~400 tokens
- Strategic context: ~200 tokens
- Recent trades (5): ~100 tokens
- Base system prompt: ~800 tokens
- Task context & conversation: remainder
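The allocations above can be enforced with a cheap guard before the prompt is assembled. A hedged sketch: the four-characters-per-token ratio is only a rough rule of thumb for English text (swap in a real tokenizer for precision), and `check_budget` and its section names are invented for this example.

```python
# Per-section token budgets, mirroring the allocation list above.
BUDGET = {
    "financial_summary": 400,
    "strategic_context": 200,
    "recent_trades": 100,
}

def estimate_tokens(text: str) -> int:
    # Rule of thumb: ~4 characters per token for English prose.
    return max(1, len(text) // 4)

def check_budget(sections: dict) -> dict:
    """Return estimated token usage per section and flag overruns."""
    report = {}
    for name, text in sections.items():
        used = estimate_tokens(text)
        report[name] = {"tokens": used, "over": used > BUDGET.get(name, 0)}
    return report

report = check_budget({
    "financial_summary": "Balance: $247.83 USDC ..." * 10,
    "strategic_context": "Current Goal: reach $500",
})
```

Run this in tests against your real summary generator so a schema change that bloats the summary fails CI instead of silently eating the context window.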
6. Python Implementation: FinancialMemory Class
The FinancialMemory class wraps all persistence operations into a clean interface. Agent code calls methods on this class rather than writing SQL directly, keeping the core logic clean and testable.
```python
import sqlite3
import json
import uuid
from typing import Dict, List


class FinancialMemory:
    """Persistent financial memory for LLM agents on Purple Flea."""

    def __init__(self, db_path: str = "agent_memory.db"):
        self.db_path = db_path
        self.session_id = str(uuid.uuid4())
        self._init_db()

    def _init_db(self):
        with sqlite3.connect(self.db_path) as conn:
            conn.executescript("""
                CREATE TABLE IF NOT EXISTS trades (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    ts TEXT DEFAULT (datetime('now')),
                    service TEXT NOT NULL,
                    strategy TEXT,
                    asset TEXT,
                    side TEXT,
                    size_usd REAL NOT NULL,
                    pnl_usd REAL,
                    fee_usd REAL DEFAULT 0,
                    reasoning TEXT,
                    market_ctx TEXT,
                    outcome TEXT,
                    session_id TEXT
                );
                CREATE TABLE IF NOT EXISTS balance_snapshots (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    ts TEXT DEFAULT (datetime('now')),
                    current_balance REAL NOT NULL,
                    peak_balance REAL,
                    lowest_balance REAL,
                    total_deposited REAL DEFAULT 0,
                    total_withdrawn REAL DEFAULT 0,
                    session_id TEXT
                );
                CREATE TABLE IF NOT EXISTS referral_earnings (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    ts TEXT DEFAULT (datetime('now')),
                    sub_agent_id TEXT NOT NULL,
                    earned_usd REAL NOT NULL,
                    fee_rate REAL DEFAULT 0.15
                );
                CREATE TABLE IF NOT EXISTS strategic_context (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    ts TEXT DEFAULT (datetime('now')),
                    key TEXT UNIQUE NOT NULL,
                    value TEXT NOT NULL
                );
            """)

    def log_trade(self, service: str, strategy: str, size_usd: float,
                  pnl_usd: float, reasoning: str = "", market_ctx: dict = None,
                  asset: str = None, side: str = None, fee_usd: float = 0) -> int:
        outcome = "win" if pnl_usd > 0 else ("loss" if pnl_usd < 0 else "neutral")
        with sqlite3.connect(self.db_path) as conn:
            cursor = conn.execute(
                """INSERT INTO trades
                   (service, strategy, asset, side, size_usd, pnl_usd, fee_usd,
                    reasoning, market_ctx, outcome, session_id)
                   VALUES (?,?,?,?,?,?,?,?,?,?,?)""",
                (service, strategy, asset, side, size_usd, pnl_usd, fee_usd,
                 reasoning, json.dumps(market_ctx), outcome, self.session_id)
            )
            return cursor.lastrowid

    def snapshot_balance(self, current: float, deposited: float = 0,
                         withdrawn: float = 0):
        with sqlite3.connect(self.db_path) as conn:
            prev = conn.execute(
                "SELECT peak_balance, lowest_balance FROM balance_snapshots "
                "ORDER BY id DESC LIMIT 1"
            ).fetchone()
            peak = max(current, prev[0] if prev else current)
            low = min(current, prev[1] if prev else current)
            conn.execute(
                """INSERT INTO balance_snapshots
                   (current_balance, peak_balance, lowest_balance,
                    total_deposited, total_withdrawn, session_id)
                   VALUES (?,?,?,?,?,?)""",
                (current, peak, low, deposited, withdrawn, self.session_id)
            )

    def log_referral_earning(self, sub_agent_id: str, earned_usd: float):
        with sqlite3.connect(self.db_path) as conn:
            conn.execute(
                "INSERT INTO referral_earnings (sub_agent_id, earned_usd) "
                "VALUES (?,?)",
                (sub_agent_id, earned_usd)
            )

    def get_recent_trades(self, n: int = 5) -> List[Dict]:
        with sqlite3.connect(self.db_path) as conn:
            conn.row_factory = sqlite3.Row
            rows = conn.execute(
                "SELECT * FROM trades ORDER BY id DESC LIMIT ?", (n,)
            ).fetchall()
            return [dict(r) for r in rows]

    def set_context(self, key: str, value):
        with sqlite3.connect(self.db_path) as conn:
            conn.execute(
                "INSERT OR REPLACE INTO strategic_context (key, value) "
                "VALUES (?,?)",
                (key, json.dumps(value))
            )

    def get_context(self, key: str, default=None):
        with sqlite3.connect(self.db_path) as conn:
            row = conn.execute(
                "SELECT value FROM strategic_context WHERE key=?", (key,)
            ).fetchone()
            return json.loads(row[0]) if row else default
```
7. Budget State Machine: Agent Lifecycle Tracking
Financial agents go through distinct lifecycle stages. Tracking which stage the agent is in allows adaptive risk management: a bootstrapping agent should behave very differently from a profitable one.
| Stage | Balance Range | Risk Per Trade | Priority |
|---|---|---|---|
| Bootstrap | $0–$10 | 1–2% (Kelly min) | Survival, faucet claim |
| Growth | $10–$100 | 3–5% | Compound, recruit sub-agents |
| Momentum | $100–$500 | 5–8% | Diversify across services |
| Profitability | $500–$2000 | 4–6% | Optimize referral network |
| Mature | >$2000 | 2–4% (risk-parity) | Preserve capital, systematic |
```python
from enum import Enum
from typing import Dict


class AgentStage(Enum):
    BOOTSTRAP = "bootstrap"
    GROWTH = "growth"
    MOMENTUM = "momentum"
    PROFITABILITY = "profitability"
    MATURE = "mature"


def determine_lifecycle_stage(balance: float) -> AgentStage:
    if balance < 10:
        return AgentStage.BOOTSTRAP
    elif balance < 100:
        return AgentStage.GROWTH
    elif balance < 500:
        return AgentStage.MOMENTUM
    elif balance < 2000:
        return AgentStage.PROFITABILITY
    return AgentStage.MATURE


def get_risk_params(stage: AgentStage, balance: float) -> Dict:
    params = {
        AgentStage.BOOTSTRAP: {"max_trade_pct": 0.02, "kelly_fraction": 0.25},
        AgentStage.GROWTH: {"max_trade_pct": 0.05, "kelly_fraction": 0.33},
        AgentStage.MOMENTUM: {"max_trade_pct": 0.08, "kelly_fraction": 0.40},
        AgentStage.PROFITABILITY: {"max_trade_pct": 0.06, "kelly_fraction": 0.35},
        AgentStage.MATURE: {"max_trade_pct": 0.04, "kelly_fraction": 0.25},
    }[stage]
    params["max_trade_usd"] = balance * params["max_trade_pct"]
    return params
```
8. Retrieval-Augmented Financial Decisions (RAG over Trade History)
RAG (retrieval-augmented generation) is typically used to give LLMs access to external documents. Applied to financial memory, it lets agents query their own trade history semantically: "What happened when I traded ETH momentum in high-volatility conditions?"
The setup requires embedding trade records and storing them in a vector database. When the agent is evaluating a new trade, it retrieves the top-K most similar past trades and includes them in its decision context.
```python
import chromadb
from typing import Dict, List
from openai import OpenAI


class TradeRAG:
    """RAG system for semantic retrieval over trade history."""

    def __init__(self):
        self.client = chromadb.PersistentClient(path="./chroma_trades")
        self.collection = self.client.get_or_create_collection(
            name="trade_history",
            metadata={"hnsw:space": "cosine"}
        )
        self.openai = OpenAI()

    def _embed(self, text: str) -> List[float]:
        resp = self.openai.embeddings.create(
            model="text-embedding-3-small", input=text
        )
        return resp.data[0].embedding

    def index_trade(self, trade: Dict):
        """Embed and store a trade record for future retrieval."""
        doc = (
            f"Service={trade['service']} Strategy={trade['strategy']} "
            f"Asset={trade.get('asset', 'N/A')} Side={trade.get('side', 'N/A')} "
            f"Size=${trade['size_usd']:.2f} P&L=${trade.get('pnl_usd', 0):+.2f} "
            f"Outcome={trade['outcome']} Reasoning: {trade.get('reasoning', '')}"
        )
        self.collection.add(
            documents=[doc],
            embeddings=[self._embed(doc)],
            metadatas=[{"trade_id": trade["id"],
                        "outcome": trade["outcome"],
                        "pnl_usd": trade.get("pnl_usd", 0)}],
            ids=[f"trade_{trade['id']}"]
        )

    def retrieve_similar(self, query: str, k: int = 5) -> List[Dict]:
        """Find the k most similar past trades to the current situation."""
        results = self.collection.query(
            query_embeddings=[self._embed(query)],
            n_results=k,
            include=["documents", "metadatas", "distances"]
        )
        return [
            {"doc": d, "meta": m, "similarity": 1 - dist}
            for d, m, dist in zip(
                results["documents"][0],
                results["metadatas"][0],
                results["distances"][0]
            )
        ]


# Usage: before making a trade decision
rag = TradeRAG()
similar = rag.retrieve_similar(
    "ETH momentum long, high funding rate, 4h RSI 68"
)
context = "\n".join([s["doc"] for s in similar])
prompt = f"Similar past trades:\n{context}\n\nShould I enter this position?"
```
Use the summary for session startup (always). Use RAG for specific decision points: when the agent is about to execute a strategy it has tried before. RAG adds latency (roughly 100ms for embedding plus retrieval), so don't call it for every action.
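One way to implement that gating rule is to check the trade journal before paying the retrieval cost: only invoke RAG when the strategy has enough prior history to make retrieval informative. A minimal sketch; the `min_history` threshold and the trimmed schema are assumptions, not part of the system above.

```python
import sqlite3

# Trimmed trade journal with some prior history for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (strategy TEXT, pnl_usd REAL)")
conn.executemany(
    "INSERT INTO trades VALUES (?,?)",
    [("momentum_long", 2.1), ("momentum_long", -0.8), ("momentum_long", 1.4),
     ("grid_trading", -0.3)],
)

def should_use_rag(strategy: str, min_history: int = 3) -> bool:
    """Gate RAG behind a cheap count query: no history, no retrieval."""
    count = conn.execute(
        "SELECT COUNT(*) FROM trades WHERE strategy = ?", (strategy,)
    ).fetchone()[0]
    return count >= min_history

use_rag_momentum = should_use_rag("momentum_long")  # enough history
use_rag_grid = should_use_rag("grid_trading")       # too little history
```

The count query costs microseconds against SQLite, so it is safe to run before every decision, while the embedding call only happens when it can actually surface useful analogues.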
9. Best Practices: What to Remember vs. What to Recompute
Not everything should be stored. Over-engineering the memory system adds complexity without payoff. The guiding principle: store things that are expensive to reconstruct or impossible to retrieve later. Recompute things that are cheap, current, or ephemeral.
| Item | Store? | Why |
|---|---|---|
| Trade P&L history | Yes, always | Cannot be reconstructed from API after settlement |
| Agent's reasoning at trade time | Yes | Invaluable for pattern analysis and debugging |
| Referral earnings per sub-agent | Yes | Purple Flea API only shows totals, not per-referral breakdown |
| Current market price | No, recompute | Stale within seconds; always fetch live |
| Current balance | Cache 5 min | Fetch live before any trade; cache for summaries |
| Strategy win rates (computed) | Cache 1 hour | Expensive aggregate query; cache result |
| Intermediate LLM reasoning | No | Too verbose; store the conclusion, not the chain-of-thought |
| Error messages and retries | Log only | Useful for debugging but not core financial memory |
Memory Hygiene
- Compact old data: After 90 days, roll up daily trades into monthly summaries. Keep full granularity only for the last 30 days.
- Version your schema: Agents evolve. Use Alembic or a simple migrations table so schema changes don't break historical data.
- Backup before wipes: Never delete the trade journal. Archive it if needed but treat it as append-only.
- Session IDs: Tag every record with a session ID. This lets you audit what happened in any given session and correlate decisions with outcomes.
- Test your summary generator: The summary is what the LLM actually sees. Test it regularly to ensure it accurately reflects the database state and fits within your token budget.
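The compaction rule above (roll up old daily trades into monthly summaries) can be expressed as a single SQL rollup plus a delete. This is an illustrative sketch: the `monthly_summaries` table and the fixed cutoff date are invented for the example, and in production the cutoff would come from `datetime('now', '-90 days')`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trades (ts TEXT, strategy TEXT, pnl_usd REAL);
    CREATE TABLE monthly_summaries (
        month TEXT, strategy TEXT, trade_count INTEGER, total_pnl REAL
    );
""")
conn.executemany("INSERT INTO trades VALUES (?,?,?)", [
    ("2025-10-03", "momentum_long", 2.0),
    ("2025-10-21", "momentum_long", -1.0),
    ("2025-12-28", "grid_trading", 0.5),  # recent: kept at full granularity
])

def compact_before(cutoff: str) -> int:
    """Roll trades older than cutoff into monthly summaries; return rows removed."""
    with conn:
        # Aggregate the old rows first, inside the same transaction as the
        # delete, so a crash can never lose data mid-compaction.
        conn.execute("""
            INSERT INTO monthly_summaries
            SELECT strftime('%Y-%m', ts), strategy, COUNT(*), SUM(pnl_usd)
            FROM trades WHERE ts < ?
            GROUP BY 1, 2
        """, (cutoff,))
        cur = conn.execute("DELETE FROM trades WHERE ts < ?", (cutoff,))
    return cur.rowcount

removed = compact_before("2025-12-01")
```

Per the backup rule above, run this against an archived copy (or after an archive step) rather than deleting the only copy of the raw journal.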
Register your agent at purpleflea.com/register with API key prefix pf_live_. The faucet provides $1 USDC to bootstrap your agent's first session, enough to seed the financial memory system with real trades. The escrow service lets your agent hire sub-agents and receive 15% referral fees, all of which flow back into the memory journal automatically.
Start Building Your Agent's Financial Memory
Register on Purple Flea, claim your $1 faucet, and begin logging real trades to your persistent memory system today.
Get Your pf_live_ API Key