GPT4All runs open-source LLMs with zero API costs, zero cloud dependency. Purple Flea provides the financial infrastructure. Together: autonomous agents that reason locally and act financially via REST — the most cost-efficient stack in agent finance.
GPT4All runs open-source models locally with a simple Python SDK. No OpenAI key, no Anthropic key, no cloud bills. The only costs are Purple Flea's transaction fees — 1% for escrow, and whatever casino/trading fees apply. Pure financial upside.
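To make the cost model concrete, here is a minimal sketch of the escrow fee arithmetic, assuming the flat 1% escrow fee stated above (the `escrow_cost` helper is illustrative, not part of any SDK):

```python
def escrow_cost(amount_usd: float, fee_rate: float = 0.01) -> float:
    """Total cost of an escrow payment including the 1% Purple Flea fee."""
    return round(amount_usd * (1 + fee_rate), 6)

print(escrow_cost(0.10))  # → 0.101
print(escrow_cost(1.00))  # → 1.01
```

With inference running locally, this per-transaction fee is the entire marginal cost of each agent action.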
The core API is two calls: `GPT4All("model.gguf")` to load, then `model.generate(prompt)` to run. No authentication, no rate limits, no network latency. Since GPT4All models don't uniformly support OpenAI-style tool use, we implement a prompt-based action schema instead: the model outputs structured JSON actions, and your Python code parses and executes them.
```python
from gpt4all import GPT4All
import requests
import json
import os

# Initialize GPT4All with a capable model
# Downloads automatically on first run (~4-8GB)
model = GPT4All(
    "Meta-Llama-3-8B-Instruct.Q4_0.gguf",
    device="cpu"  # or "gpu" if available
)

PF_KEY = os.environ.get("PURPLE_FLEA_API_KEY", "pf_live_your_key")
PF_BASE = "https://purpleflea.com/api"
FAUCET = "https://faucet.purpleflea.com"
ESCROW = "https://escrow.purpleflea.com"

pf = requests.Session()
pf.headers["Authorization"] = f"Bearer {PF_KEY}"

# Action schema prompt — the key to reliable function calling
ACTION_SCHEMA = """
You are a financial agent using Purple Flea APIs.
When you need to take action, output ONLY valid JSON in this exact format (no other text):

For getting wallet balance:
{"tool": "get_balance"}

For getting market price:
{"tool": "get_price", "symbol": "BTC-USD"}

For placing a perpetual trade:
{"tool": "trade", "symbol": "BTC-USD", "side": "long", "size_usd": 0.50, "leverage": 1}

For casino bet:
{"tool": "casino", "game": "crash", "amount": 0.25, "cashout_at": 2.0}

For claiming faucet:
{"tool": "faucet"}

For escrow payment:
{"tool": "escrow", "recipient": "agent-id", "amount": 0.10, "task": "description"}

For final answer (no more actions needed):
{"tool": "done", "answer": "your conclusion here"}
"""

def execute_tool(action: dict) -> str:
    """Execute a Purple Flea action and return the result as a string."""
    tool = action.get("tool")
    if tool == "get_balance":
        r = pf.get(f"{PF_BASE}/wallet/balance")
        return json.dumps(r.json())
    elif tool == "get_price":
        r = pf.get(f"{PF_BASE}/trading/price/{action['symbol']}")
        return json.dumps(r.json())
    elif tool == "trade":
        r = pf.post(f"{PF_BASE}/trading/perp/order", json={
            "symbol": action["symbol"],
            "side": action["side"],
            "size_usd": action["size_usd"],
            "leverage": action.get("leverage", 1)
        })
        return json.dumps(r.json())
    elif tool == "casino":
        r = pf.post(f"{PF_BASE}/casino/bet", json={
            "game": action["game"],
            "amount": action["amount"],
            "cashout_at": action.get("cashout_at", 2.0)
        })
        return json.dumps(r.json())
    elif tool == "faucet":
        r = pf.post(f"{FAUCET}/claim")
        return json.dumps(r.json())
    elif tool == "escrow":
        r = pf.post(f"{ESCROW}/create", json={
            "recipient": action["recipient"],
            "amount": action["amount"],
            "task": action["task"]
        })
        return json.dumps(r.json())
    return f"Unknown tool: {tool}"

def parse_json_from_response(text: str) -> dict:
    """Extract JSON from model output, handling common formatting issues."""
    text = text.strip()
    # Strip markdown code fences if present
    if text.startswith("```"):
        lines = text.split("\n")
        text = "\n".join(lines[1:-1])
    # Find JSON object boundaries
    start = text.find("{")
    end = text.rfind("}") + 1
    if start >= 0 and end > start:
        return json.loads(text[start:end])
    raise ValueError(f"No JSON found in: {text[:100]}")

def run_agent(user_task: str):
    """Run the GPT4All financial agent on a task."""
    conversation = []
    prompt = f"{ACTION_SCHEMA}\nTask: {user_task}\nBegin."

    with model.chat_session():
        for step in range(10):
            response = model.generate(prompt, max_tokens=200, temp=0.1)
            print(f"  Model: {response.strip()[:150]}")

            try:
                action = parse_json_from_response(response)
            except (ValueError, json.JSONDecodeError) as e:
                # Ask the model to try again rather than crashing the loop
                print(f"  Parse error: {e}. Retrying...")
                prompt = "That was not valid JSON. Output ONLY a valid JSON action:"
                continue

            if action.get("tool") == "done":
                print(f"\nFinal answer: {action.get('answer')}")
                break

            result = execute_tool(action)
            print(f"  → Executed {action['tool']}: {result[:120]}")
            conversation.append({"action": action, "result": result})
            prompt = f"Tool result: {result}\nNext action:"

# Run it
run_agent("Claim faucet if available, then check BTC price, then bet $0.20 on crash at 1.8x cashout.")
```
Not all models perform equally for structured JSON output. These are the best GPT4All-compatible models for Purple Flea agent tasks, ranked by JSON accuracy and reasoning quality.
| Model | Size | RAM Needed | JSON Accuracy | Financial Reasoning | Speed |
|---|---|---|---|---|---|
| Llama 3.1 8B Instruct | 4.7GB | 8GB | Excellent | Strong | Fast |
| Mistral 7B Instruct | 4.1GB | 8GB | Very Good | Good | Very Fast |
| Phi-3 Mini 4K Instruct | 2.2GB | 4GB | Good | Good | Fastest |
| Llama 3.2 3B Instruct | 2.0GB | 4GB | Good | Moderate | Very Fast |
| Qwen2.5 14B Instruct | 9.0GB | 16GB | Excellent | Excellent | Moderate |
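The table above maps directly onto a simple selection rule: take the strongest model that fits your RAM. A hedged sketch (the GGUF filenames below are assumptions matching the setup snippet; verify exact names against `GPT4All.list_models()` before relying on them):

```python
# (filename, minimum RAM in GB), ordered best-first by reasoning quality
MODELS = [
    ("Qwen2.5-14B-Instruct-Q4_K_M.gguf", 16),
    ("Meta-Llama-3-8B-Instruct.Q4_0.gguf", 8),
    ("Phi-3-mini-4k-instruct.Q4_0.gguf", 4),
]

def pick_model(ram_gb: int) -> str:
    """Pick the strongest model from the comparison table that fits in RAM."""
    for filename, min_ram in MODELS:
        if ram_gb >= min_ram:
            return filename
    raise RuntimeError("Need at least 4GB RAM for the smallest model")

print(pick_model(8))  # → Meta-Llama-3-8B-Instruct.Q4_0.gguf
```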
```python
from gpt4all import GPT4All

# Download happens automatically on first instantiation.
# Load ONE of the following, depending on your hardware:

# Best for most machines (8GB RAM):
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")

# Best for low-memory machines (4GB RAM):
# model = GPT4All("Phi-3-mini-4k-instruct.Q4_0.gguf")

# Best quality for financial analysis (16GB RAM):
# model = GPT4All("Qwen2.5-14B-Instruct-Q4_K_M.gguf")

# List all available models:
print(GPT4All.list_models())

# Models are stored in ~/Library/Application Support/nomic.ai/GPT4All/
# on macOS (location differs per OS)
```
GPT4All is one of several local LLM options Purple Flea supports. Jan and Ollama offer additional features like chat UI and model management.
GPT4All handles reasoning locally. Purple Flea handles execution. Get your free API key and start with $1 USDC from the faucet.