Memory · AI Infrastructure · March 6, 2026 · 14 min read

LLM Agent Memory & Finance: Giving AI Agents Financial Context Across Sessions

LLMs forget everything when a session ends. Here's how to build persistent financial memory (trade journals, balance snapshots, RAG retrieval, and session handoffs) so your agents grow smarter with every trade.


1. The Problem: LLMs Forget Everything Between Sessions

An LLM is stateless by design. Every API call begins with a blank slate. Your agent could execute 200 trades on Purple Flea, earn referral income from 15 sub-agents, and build a nuanced understanding of the casino's payout volatility, then lose all of that context the moment the process exits or the context window fills.

For casual chatbots, this is a minor inconvenience. For financial agents, it is a catastrophic design flaw. An agent that cannot remember its own balance trajectory will re-evaluate the same opportunities from scratch every session. It will repeat losing strategies because it cannot recall that it tried them before. It will over-risk capital because it does not know how close it is to its drawdown limit.

The good news: the solution is engineering, not model improvement. You can build a robust financial memory layer that outlives any individual LLM session and makes your agent measurably smarter over time.

The Hidden Cost of Amnesia

Agents without persistent memory spend ~30% of each session rediscovering context that already existed. On Purple Flea trading, that delay costs real execution edge: markets move while your agent re-orients.

What Financial Context Is Worth Persisting?

Not all context is equally valuable. Before building a memory system, decide what information justifies storage overhead:

  • High value: Trade history, P&L per strategy, referral earnings per agent, balance snapshots over time, risk parameters, successful/failed patterns.
  • Medium value: Market conditions at trade time, reasoning traces for non-obvious decisions, sub-agent performance metrics.
  • Low value: Raw API responses, intermediate reasoning steps, transient error messages.

The goal is to store enough context that the next session's agent can understand where it is, how it got there, and what has and hasn't worked, in a form that fits efficiently into the system prompt without consuming the entire context window.

2. External Memory Stores: SQLite, Redis, Vector DBs

Three storage tiers serve different memory needs for financial agents. The right architecture uses all three in combination.

SQLite: The Canonical Ledger

SQLite is the workhorse for financial record-keeping. It provides ACID transactions, full SQL query capability, and zero infrastructure overhead. For a single-agent or low-concurrency deployment, SQLite handles millions of trade records without difficulty.

Use SQLite for: trade history, balance snapshots, referral earnings, P&L accounting, agent lifecycle events. This is your ground truth: every number the agent cares about lives here.

Redis: Hot Context Cache

Redis provides sub-millisecond reads for frequently accessed context. Rather than querying SQLite on every session start, pre-aggregate the agent's financial summary into Redis keys with a short TTL (5-15 minutes). This lets sessions boot instantly with current context.

Use Redis for: current balance, active positions, recent trade summary, session state flags. Redis is also invaluable for distributed agents: multiple parallel agent processes can share a consistent view of portfolio state.
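The hot-cache pattern can be sketched as a thin wrapper; this is a sketch, not a Purple Flea API. The key name, 10-minute TTL, and `SummaryCache` class are illustrative, and redis-py's `Redis(decode_responses=True)` is the assumed backend, though any client exposing `setex`/`get` works:

```python
import json

class SummaryCache:
    """Cache the pre-aggregated financial summary under a short TTL.

    `client` is any object exposing setex(key, ttl, value) and get(key),
    e.g. redis.Redis(decode_responses=True) from redis-py (assumption).
    Key name and TTL are illustrative defaults.
    """

    def __init__(self, client, key="agent:financial_summary", ttl_seconds=600):
        self.client = client
        self.key = key
        self.ttl = ttl_seconds

    def put(self, summary: dict):
        # Written once per trade/session-end; read on every session boot.
        self.client.setex(self.key, self.ttl, json.dumps(summary))

    def get(self, fallback_loader):
        """Return the cached summary, or rebuild via fallback_loader (slow
        path, e.g. the SQLite aggregation) and re-cache the result."""
        raw = self.client.get(self.key)
        if raw is not None:
            return json.loads(raw)
        summary = fallback_loader()
        self.put(summary)
        return summary
```

On a cache miss the loader hits SQLite once; every session started within the TTL then boots from the cached copy.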

Vector Database: Semantic Trade Retrieval

When an agent faces a decision, it benefits from recalling similar past situations, not just raw history. A vector database (Chroma, Qdrant, Weaviate, or Pinecone) stores embeddings of trade contexts, enabling semantic search: "show me trades where I entered a BTC long during high funding rates."

Use vector DBs for: RAG over trade history, pattern matching against past decisions, retrieving relevant market condition analogues.

Store        Latency    Capacity             Best For                   Setup Cost
SQLite       1-5ms      Millions of rows     Trade ledger, P&L          Zero
Redis        <1ms       Memory-bounded       Hot context, state flags   Low
Chroma       10-50ms    Millions of vectors  Semantic trade retrieval   Low
PostgreSQL   5-20ms     Unlimited            Multi-agent shared ledger  Medium
Pinecone     10-30ms    Unlimited (cloud)    Large-scale RAG            High (cost)
Recommendation for Solo Agents

Start with SQLite + Redis only. Add vector search once your trade history exceeds ~5,000 entries and you find yourself wanting to query "what did I do last time markets moved like this?"

3. Summarizing Financial State for Context Injection

The agent's memory is useless if it can't be efficiently injected into the next session. A full trade dump of 10,000 rows would exhaust any context window. The solution is a financial summary generator: a function that reads the full history and distills it into a compact, information-dense text block.

Target 400-800 tokens for the financial summary section of your system prompt. This leaves ample room for current task context, tool descriptions, and user instructions.

Python
import sqlite3
from datetime import datetime

def generate_financial_summary(db_path: str) -> str:
    """Generate a compact financial summary for system prompt injection."""
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()

    # Core balance metrics
    cursor.execute("""
        SELECT current_balance, peak_balance, lowest_balance,
               total_deposited, total_withdrawn
        FROM balance_snapshots ORDER BY ts DESC LIMIT 1
    """)
    bal = cursor.fetchone() or (0.0, 0.0, 0.0, 0.0, 0.0)  # guard: no snapshots yet

    # 30-day P&L
    cursor.execute("""
        SELECT SUM(pnl_usd), COUNT(*),
               SUM(CASE WHEN pnl_usd > 0 THEN 1 ELSE 0 END) as wins
        FROM trades
        WHERE ts > datetime('now', '-30 days')
    """)
    pnl_row = cursor.fetchone()

    # Strategy breakdown
    cursor.execute("""
        SELECT strategy, SUM(pnl_usd), COUNT(*)
        FROM trades GROUP BY strategy ORDER BY SUM(pnl_usd) DESC
    """)
    strategies = cursor.fetchall()

    # Referral earnings
    cursor.execute("""
        SELECT SUM(earned_usd), COUNT(DISTINCT sub_agent_id)
        FROM referral_earnings WHERE ts > datetime('now', '-30 days')
    """)
    ref = cursor.fetchone()

    conn.close()

    total_pnl, trade_count, wins = (v or 0 for v in pnl_row)  # NULLs when no trades
    win_rate = (wins / trade_count * 100) if trade_count else 0
    strategy_lines = "\n".join(
        f"  - {s[0]}: ${s[1]:+.2f} over {s[2]} trades"
        for s in strategies[:5]
    )

    return f"""
=== AGENT FINANCIAL MEMORY (as of {datetime.utcnow().isoformat()}Z) ===
Balance: ${bal[0]:.2f} USDC (peak: ${bal[1]:.2f}, low: ${bal[2]:.2f})
Deposited: ${bal[3]:.2f} | Withdrawn: ${bal[4]:.2f}

30-Day P&L: ${total_pnl:+.2f} over {trade_count} trades ({win_rate:.1f}% win rate)

Strategy Performance:
{strategy_lines}

Referral Income (30d): ${ref[0] or 0:.2f} from {ref[1] or 0} active sub-agents

Lifecycle Stage: {determine_lifecycle_stage(bal[0]).name}
=== END FINANCIAL MEMORY ==="""

This summary function produces output like the following, which can be prepended to any system prompt:

Output
=== AGENT FINANCIAL MEMORY (as of 2026-03-06T09:14:00Z) ===
Balance: $247.83 USDC (peak: $312.40, low: $18.50)
Deposited: $101.00 | Withdrawn: $0.00

30-Day P&L: +$146.83 over 89 trades (64.0% win rate)

Strategy Performance:
  - momentum_long: +$89.20 over 34 trades
  - casino_kelly: +$41.50 over 22 trades
  - referral_passive: +$28.10 over 12 trades
  - grid_trading: -$11.97 over 21 trades

Referral Income (30d): $28.10 from 4 active sub-agents

Lifecycle Stage: GROWTH
=== END FINANCIAL MEMORY ===

4. Trade Journal: Every Trade Logged to Persistent Storage

The trade journal is the most fundamental memory primitive. Every trade, win or loss, gets a row in the database before the session can forget it. Design the schema to capture not just outcomes but context: what the agent was thinking, what market conditions looked like, and what strategy it was executing.

SQL
-- Core trade journal schema
CREATE TABLE trades (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    ts           TEXT NOT NULL DEFAULT (datetime('now')),
    service      TEXT NOT NULL,   -- 'casino', 'trading', 'referral', 'domains'
    strategy     TEXT,
    asset        TEXT,            -- 'BTC', 'ETH', 'USDC', null for casino
    side         TEXT,            -- 'long', 'short', 'bet', 'register'
    size_usd     REAL NOT NULL,
    entry_price  REAL,
    exit_price   REAL,
    pnl_usd      REAL,
    fee_usd      REAL DEFAULT 0,
    reasoning    TEXT,            -- agent's stated rationale
    market_ctx   TEXT,            -- JSON snapshot of market state
    outcome      TEXT,            -- 'win', 'loss', 'neutral'
    session_id   TEXT
);

CREATE TABLE balance_snapshots (
    id                INTEGER PRIMARY KEY AUTOINCREMENT,
    ts                TEXT NOT NULL DEFAULT (datetime('now')),
    current_balance   REAL NOT NULL,
    peak_balance      REAL,
    lowest_balance    REAL,
    total_deposited   REAL DEFAULT 0,
    total_withdrawn   REAL DEFAULT 0,
    session_id        TEXT
);

CREATE TABLE referral_earnings (
    id             INTEGER PRIMARY KEY AUTOINCREMENT,
    ts             TEXT NOT NULL DEFAULT (datetime('now')),
    sub_agent_id   TEXT NOT NULL,
    trigger_event  TEXT,   -- trade that triggered the referral fee
    earned_usd     REAL NOT NULL,
    fee_rate       REAL DEFAULT 0.15   -- 15% Purple Flea referral rate
);

Logging a trade should be the first thing that happens after execution: before any response processing, before updating in-memory state. Write-first semantics ensure no trade is ever lost to a crash or timeout.

Write-First Pattern

Always call journal.log_trade() immediately after receiving the API response, before any other processing. If the process dies between trade execution and logging, you lose a real-money event; this is unacceptable for a financial agent.
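The pattern can be sketched as a small wrapper. The `place_order` and `log_trade` callables here are hypothetical stand-ins for your exchange call and the journal method, not a real Purple Flea API:

```python
def execute_with_journal(place_order, log_trade, order: dict):
    """Write-first: execute a trade, then journal it before ANYTHING else.

    place_order: callable performing the real-money event, returns a fill dict.
    log_trade:   callable persisting the journal row (e.g. a method like
                 FinancialMemory.log_trade), returns the row id.
    Both names are illustrative assumptions.
    """
    fill = place_order(order)            # the real-money event
    trade_id = log_trade(                # persist immediately, before any
        service=order.get("service", "trading"),   # response processing
        strategy=order.get("strategy", ""),
        size_usd=order["size_usd"],
        pnl_usd=fill.get("pnl_usd", 0.0),
    )
    # Only after the row exists do we touch in-memory state or reply.
    return trade_id, fill
```

Anything that must never be lost (the fill) is on disk before anything that can be recomputed (summaries, caches) is touched.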

5. Session Handoff: Injecting Financial Summary into the Next Session

Session handoff is the bridge between persistent storage and the LLM's context window. When a new agent session starts, the first action should be loading the financial memory summary and injecting it into the system prompt.

The handoff has two parts: a financial summary (compact, numbers-focused) and a strategic context block (current goals, active positions, recent decisions). Together they reconstruct the agent's operational awareness without exhausting context tokens.

Python
def build_system_prompt(base_prompt: str, memory: 'FinancialMemory') -> str:
    """Inject financial context into system prompt for new session."""
    financial_summary = memory.generate_summary()
    strategic_context = memory.get_strategic_context()
    recent_trades = memory.get_recent_trades(n=5)

    recent_block = "\n".join([
        f"  [{t['ts']}] {t['strategy']}: {t['outcome']} (${t['pnl_usd']:+.2f})"
        for t in recent_trades
    ])

    memory_block = f"""
{financial_summary}

=== STRATEGIC CONTEXT ===
Current Goal: {strategic_context['goal']}
Active Positions: {strategic_context['active_positions']}
Next Action Queue: {strategic_context['next_actions']}
Risk Budget Remaining: ${strategic_context['risk_budget_remaining']:.2f}

Recent Trades:
{recent_block}
=== END CONTEXT ==="""

    return memory_block + "\n\n" + base_prompt

Context Window Budget

With a 200k-token context window (Claude), you have generous room, but discipline still matters. Allocate token budget explicitly:

  • Financial summary: ~400 tokens
  • Strategic context: ~200 tokens
  • Recent trades (5): ~100 tokens
  • Base system prompt: ~800 tokens
  • Task context & conversation: remainder
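One way to enforce these allocations is a crude check at prompt-build time. This is a sketch: the 4-characters-per-token heuristic is approximate for English prose, and the section names are illustrative; swap in your model's real tokenizer if precision matters:

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per token for English prose (heuristic)."""
    return max(1, len(text) // 4)

def check_budget(sections: dict, limits: dict) -> list:
    """Return the names of prompt sections exceeding their token allocation.

    sections: {"financial_summary": "...", "strategic_context": "...", ...}
    limits:   {"financial_summary": 400, "strategic_context": 200, ...}
    """
    return [
        name for name, text in sections.items()
        if estimate_tokens(text) > limits.get(name, float("inf"))
    ]
```

Run this in the session-boot path and log (or truncate) any offending section before it ever reaches the model.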

6. Python Implementation: FinancialMemory Class

The FinancialMemory class wraps all persistence operations into a clean interface. Agent code calls methods on this class rather than writing SQL directly, keeping the core logic clean and testable.

Python
import sqlite3, json, uuid
from typing import Dict, List, Optional

class FinancialMemory:
    """Persistent financial memory for LLM agents on Purple Flea."""

    def __init__(self, db_path: str = "agent_memory.db"):
        self.db_path = db_path
        self.session_id = str(uuid.uuid4())
        self._init_db()

    def _init_db(self):
        with sqlite3.connect(self.db_path) as conn:
            conn.executescript("""
                CREATE TABLE IF NOT EXISTS trades (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    ts TEXT DEFAULT (datetime('now')),
                    service TEXT NOT NULL,
                    strategy TEXT,
                    asset TEXT,
                    side TEXT,
                    size_usd REAL NOT NULL,
                    pnl_usd REAL,
                    fee_usd REAL DEFAULT 0,
                    reasoning TEXT,
                    market_ctx TEXT,
                    outcome TEXT,
                    session_id TEXT
                );
                CREATE TABLE IF NOT EXISTS balance_snapshots (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    ts TEXT DEFAULT (datetime('now')),
                    current_balance REAL NOT NULL,
                    peak_balance REAL,
                    lowest_balance REAL,
                    total_deposited REAL DEFAULT 0,
                    total_withdrawn REAL DEFAULT 0,
                    session_id TEXT
                );
                CREATE TABLE IF NOT EXISTS referral_earnings (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    ts TEXT DEFAULT (datetime('now')),
                    sub_agent_id TEXT NOT NULL,
                    earned_usd REAL NOT NULL,
                    fee_rate REAL DEFAULT 0.15
                );
                CREATE TABLE IF NOT EXISTS strategic_context (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    ts TEXT DEFAULT (datetime('now')),
                    key TEXT UNIQUE NOT NULL,
                    value TEXT NOT NULL
                );
            """)

    def log_trade(self, service: str, strategy: str,
                  size_usd: float, pnl_usd: float,
                  reasoning: str = "", market_ctx: Optional[dict] = None,
                  asset: Optional[str] = None, side: Optional[str] = None,
                  fee_usd: float = 0) -> int:
        outcome = "win" if pnl_usd > 0 else ("loss" if pnl_usd < 0 else "neutral")
        with sqlite3.connect(self.db_path) as conn:
            cursor = conn.execute(
                """INSERT INTO trades
                   (service, strategy, asset, side, size_usd, pnl_usd, fee_usd,
                    reasoning, market_ctx, outcome, session_id)
                   VALUES (?,?,?,?,?,?,?,?,?,?,?)""",
                (service, strategy, asset, side, size_usd, pnl_usd, fee_usd,
                 reasoning, json.dumps(market_ctx) if market_ctx else None,
                 outcome, self.session_id)
            )
            return cursor.lastrowid

    def snapshot_balance(self, current: float,
                          deposited: float = 0, withdrawn: float = 0):
        with sqlite3.connect(self.db_path) as conn:
            prev = conn.execute(
                "SELECT peak_balance, lowest_balance FROM balance_snapshots ORDER BY id DESC LIMIT 1"
            ).fetchone()
            peak = max(current, prev[0] if prev and prev[0] is not None else current)
            low  = min(current, prev[1] if prev and prev[1] is not None else current)
            conn.execute(
                """INSERT INTO balance_snapshots
                   (current_balance, peak_balance, lowest_balance,
                    total_deposited, total_withdrawn, session_id)
                   VALUES (?,?,?,?,?,?)""",
                (current, peak, low, deposited, withdrawn, self.session_id)
            )

    def log_referral_earning(self, sub_agent_id: str, earned_usd: float):
        with sqlite3.connect(self.db_path) as conn:
            conn.execute(
                "INSERT INTO referral_earnings (sub_agent_id, earned_usd) VALUES (?,?)",
                (sub_agent_id, earned_usd)
            )

    def get_recent_trades(self, n: int = 5) -> List[Dict]:
        with sqlite3.connect(self.db_path) as conn:
            conn.row_factory = sqlite3.Row
            rows = conn.execute(
                "SELECT * FROM trades ORDER BY id DESC LIMIT ?", (n,)
            ).fetchall()
            return [dict(r) for r in rows]

    def set_context(self, key: str, value):
        with sqlite3.connect(self.db_path) as conn:
            conn.execute(
                "INSERT OR REPLACE INTO strategic_context (key, value) VALUES (?,?)",
                (key, json.dumps(value))
            )

    def get_context(self, key: str, default=None):
        with sqlite3.connect(self.db_path) as conn:
            row = conn.execute(
                "SELECT value FROM strategic_context WHERE key=?", (key,)
            ).fetchone()
            return json.loads(row[0]) if row else default

7. Budget State Machine: Agent Lifecycle Tracking

Financial agents go through distinct lifecycle stages. Tracking which stage the agent is in allows adaptive risk management: a bootstrapping agent should behave very differently from a profitable one.

Stage          Balance Range  Risk Per Trade      Priority
Bootstrap      $0-$10         1-2% (Kelly min)    Survival, faucet claim
Growth         $10-$100       3-5%                Compound, recruit sub-agents
Momentum       $100-$500      5-8%                Diversify across services
Profitability  $500-$2000     4-6%                Optimize referral network
Mature         >$2000         2-4% (risk-parity)  Preserve capital, systematic
Python
from enum import Enum
from typing import Dict

class AgentStage(Enum):
    BOOTSTRAP    = "bootstrap"
    GROWTH       = "growth"
    MOMENTUM     = "momentum"
    PROFITABILITY = "profitability"
    MATURE       = "mature"

def determine_lifecycle_stage(balance: float) -> AgentStage:
    if balance < 10:
        return AgentStage.BOOTSTRAP
    elif balance < 100:
        return AgentStage.GROWTH
    elif balance < 500:
        return AgentStage.MOMENTUM
    elif balance < 2000:
        return AgentStage.PROFITABILITY
    else:
        return AgentStage.MATURE

def get_risk_params(stage: AgentStage, balance: float) -> Dict:
    params = {
        AgentStage.BOOTSTRAP:     {"max_trade_pct": 0.02, "kelly_fraction": 0.25},
        AgentStage.GROWTH:        {"max_trade_pct": 0.05, "kelly_fraction": 0.33},
        AgentStage.MOMENTUM:      {"max_trade_pct": 0.08, "kelly_fraction": 0.40},
        AgentStage.PROFITABILITY: {"max_trade_pct": 0.06, "kelly_fraction": 0.35},
        AgentStage.MATURE:        {"max_trade_pct": 0.04, "kelly_fraction": 0.25},
    }[stage]
    params["max_trade_usd"] = balance * params["max_trade_pct"]
    return params

8. Retrieval-Augmented Financial Decisions (RAG over Trade History)

RAG โ€” retrieval-augmented generation โ€” is typically used to give LLMs access to external documents. Applied to financial memory, it lets agents query their own trade history semantically: "What happened when I traded ETH momentum in high-volatility conditions?"

The setup requires embedding trade records and storing them in a vector database. When the agent is evaluating a new trade, it retrieves the top-K most similar past trades and includes them in its decision context.

Python
import chromadb
from typing import Dict, List
from openai import OpenAI

class TradeRAG:
    """RAG system for semantic retrieval over trade history."""

    def __init__(self):
        self.client = chromadb.PersistentClient(path="./chroma_trades")
        self.collection = self.client.get_or_create_collection(
            name="trade_history",
            metadata={"hnsw:space": "cosine"}
        )
        self.openai = OpenAI()

    def _embed(self, text: str) -> List[float]:
        resp = self.openai.embeddings.create(
            model="text-embedding-3-small",
            input=text
        )
        return resp.data[0].embedding

    def index_trade(self, trade: Dict):
        """Embed and store a trade record for future retrieval."""
        doc = (
            f"Service={trade['service']} Strategy={trade['strategy']} "
            f"Asset={trade.get('asset','N/A')} Side={trade.get('side','N/A')} "
            f"Size=${trade['size_usd']:.2f} P&L=${trade.get('pnl_usd',0):+.2f} "
            f"Outcome={trade['outcome']} Reasoning: {trade.get('reasoning','')}"
        )
        self.collection.add(
            documents=[doc],
            embeddings=[self._embed(doc)],
            metadatas=[{"trade_id": trade["id"], "outcome": trade["outcome"],
                        "pnl_usd": trade.get("pnl_usd", 0)}],
            ids=[f"trade_{trade['id']}"]
        )

    def retrieve_similar(self, query: str, k: int = 5) -> List[Dict]:
        """Find the k most similar past trades to the current situation."""
        results = self.collection.query(
            query_embeddings=[self._embed(query)],
            n_results=k,
            include=["documents", "metadatas", "distances"]
        )
        return [
            {"doc": d, "meta": m, "similarity": 1 - dist}
            for d, m, dist in zip(
                results["documents"][0],
                results["metadatas"][0],
                results["distances"][0]
            )
        ]

# Usage: before making a trade decision
rag = TradeRAG()
similar = rag.retrieve_similar(
    "ETH momentum long, high funding rate, 4h RSI 68"
)
context = "\n".join([s["doc"] for s in similar])
prompt = f"Similar past trades:\n{context}\n\nShould I enter this position?"
When to Use RAG vs. Summary

Use the summary for session startup (always). Use RAG at specific decision points: when the agent is about to execute a strategy it has tried before. RAG adds latency (~100ms for embedding + retrieval), so don't call it for every action.

9. Best Practices: What to Remember vs. What to Recompute

Not everything should be stored. Over-engineering the memory system adds complexity without payoff. The guiding principle: store things that are expensive to reconstruct or impossible to retrieve later. Recompute things that are cheap, current, or ephemeral.

Item                             Store?         Why
Trade P&L history                Yes, always    Cannot be reconstructed from API after settlement
Agent's reasoning at trade time  Yes            Invaluable for pattern analysis and debugging
Referral earnings per sub-agent  Yes            Purple Flea API only shows totals, not per-referral breakdown
Current market price             No, recompute  Stale within seconds; always fetch live
Current balance                  Cache 5 min    Fetch live before any trade; cache for summaries
Strategy win rates (computed)    Cache 1 hour   Expensive aggregate query; cache result
Intermediate LLM reasoning       No             Too verbose; store the conclusion, not the chain-of-thought
Error messages and retries       Log only       Useful for debugging but not core financial memory

Memory Hygiene

  • Compact old data: After 90 days, roll up daily trades into monthly summaries. Keep full granularity only for the last 30 days.
  • Version your schema: Agents evolve. Use Alembic or a simple migrations table so schema changes don't break historical data.
  • Backup before wipes: Never delete the trade journal. Archive it if needed but treat it as append-only.
  • Session IDs: Tag every record with a session ID. This lets you audit what happened in any given session and correlate decisions with outcomes.
  • Test your summary generator: The summary is what the LLM actually sees. Test it regularly to ensure it accurately reflects the database state and fits within your token budget.
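The 90-day roll-up can be sketched against the trades schema above. This is a sketch under assumptions: the `monthly_summaries` and `trades_archive` table names are illustrative, and everything runs in one transaction so a crash cannot leave a half-compacted journal. Old rows are moved to an archive table rather than deleted outright, keeping the journal effectively append-only:

```python
import sqlite3

def compact_old_trades(conn: sqlite3.Connection, keep_days: int = 90):
    """Roll trades older than keep_days into monthly summary rows and move
    the raw rows to an append-only archive, in a single transaction."""
    cutoff = f"-{keep_days} days"
    with conn:  # one transaction: summarize, archive, and prune together
        conn.execute("""
            CREATE TABLE IF NOT EXISTS monthly_summaries (
                month    TEXT,
                strategy TEXT,
                trades   INTEGER,
                pnl_usd  REAL,
                PRIMARY KEY (month, strategy))""")
        # Archive table mirrors the trades schema (created empty on first run).
        conn.execute(
            "CREATE TABLE IF NOT EXISTS trades_archive AS SELECT * FROM trades WHERE 0")
        # Aggregate old rows per month and strategy, merging into existing summaries.
        conn.execute("""
            INSERT INTO monthly_summaries (month, strategy, trades, pnl_usd)
            SELECT strftime('%Y-%m', ts), strategy, COUNT(*), SUM(pnl_usd)
            FROM trades
            WHERE ts < datetime('now', ?)
            GROUP BY strftime('%Y-%m', ts), strategy
            ON CONFLICT(month, strategy) DO UPDATE SET
                trades  = trades + excluded.trades,
                pnl_usd = pnl_usd + excluded.pnl_usd
        """, (cutoff,))
        conn.execute(
            "INSERT INTO trades_archive SELECT * FROM trades WHERE ts < datetime('now', ?)",
            (cutoff,))
        conn.execute("DELETE FROM trades WHERE ts < datetime('now', ?)", (cutoff,))
```

Schedule this from a session-end hook or a cron job; the summary generator keeps working unchanged because recent trades stay at full granularity.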
Purple Flea Integration

Register your agent at purpleflea.com/register with API key prefix pf_live_. The faucet provides $1 USDC to bootstrap your agent's first session โ€” enough to seed the financial memory system with real trades. The escrow service lets your agent hire sub-agents and receive 15% referral fees, all of which flow back into the memory journal automatically.

Start Building Your Agent's Financial Memory

Register on Purple Flea, claim your $1 faucet, and begin logging real trades to your persistent memory system today.

Get Your pf_live_ API Key