GPT4All runs open-source LLMs with zero API costs, zero cloud dependency. Purple Flea provides the financial infrastructure. Together: autonomous agents that reason locally and act financially via REST — the most cost-efficient stack in agent finance.
GPT4All runs open-source models locally with a simple Python SDK. No OpenAI key, no Anthropic key, no cloud bills. The only costs are Purple Flea's transaction fees — 1% for escrow, and whatever casino/trading fees apply. Pure financial upside.
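To make the cost model concrete, here is a minimal sketch of the escrow fee arithmetic, assuming the flat 1% escrow fee stated above (the `escrow_cost` helper is illustrative, not part of any SDK):

```python
def escrow_cost(amount_usd: float, fee_rate: float = 0.01) -> float:
    """Total cost of an escrow payment including the 1% Purple Flea fee."""
    return round(amount_usd * (1 + fee_rate), 6)

print(escrow_cost(0.10))  # → 0.101
print(escrow_cost(1.00))  # → 1.01
```

With inference running locally, this per-transaction fee is the entire marginal cost of each agent action.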
The core API is two calls: `GPT4All("model.gguf")` to load, then `model.generate(prompt)` to run. No authentication, no rate limits, no network latency. Since GPT4All models don't uniformly support OpenAI-style tool use, we implement a prompt-based action schema instead: the model outputs structured JSON actions, and your Python code parses and executes them.
```python
from gpt4all import GPT4All
import requests
import json
import os

# Initialize GPT4All with a capable model
# Downloads automatically on first run (~4-8GB)
model = GPT4All(
    "Meta-Llama-3-8B-Instruct.Q4_0.gguf",
    device="cpu"  # or "gpu" if available
)

PF_KEY = os.environ.get("PURPLE_FLEA_API_KEY", "pf_live_your_key")
PF_BASE = "https://purpleflea.com/api"
FAUCET = "https://faucet.purpleflea.com"
ESCROW = "https://escrow.purpleflea.com"

pf = requests.Session()
pf.headers["Authorization"] = f"Bearer {PF_KEY}"

# Action schema prompt — the key to reliable function calling
ACTION_SCHEMA = """
You are a financial agent using Purple Flea APIs.
When you need to take action, output ONLY valid JSON in this exact format (no other text):

For getting wallet balance:
{"tool": "get_balance"}

For getting market price:
{"tool": "get_price", "symbol": "BTC-USD"}

For placing a perpetual trade:
{"tool": "trade", "symbol": "BTC-USD", "side": "long", "size_usd": 0.50, "leverage": 1}

For casino bet:
{"tool": "casino", "game": "crash", "amount": 0.25, "cashout_at": 2.0}

For claiming faucet:
{"tool": "faucet"}

For escrow payment:
{"tool": "escrow", "recipient": "agent-id", "amount": 0.10, "task": "description"}

For final answer (no more actions needed):
{"tool": "done", "answer": "your conclusion here"}
"""

def execute_tool(action: dict) -> str:
    """Execute a Purple Flea action and return the result as a string."""
    tool = action.get("tool")
    if tool == "get_balance":
        r = pf.get(f"{PF_BASE}/wallet/balance")
        return json.dumps(r.json())
    elif tool == "get_price":
        r = pf.get(f"{PF_BASE}/trading/price/{action['symbol']}")
        return json.dumps(r.json())
    elif tool == "trade":
        r = pf.post(f"{PF_BASE}/trading/perp/order", json={
            "symbol": action["symbol"],
            "side": action["side"],
            "size_usd": action["size_usd"],
            "leverage": action.get("leverage", 1)
        })
        return json.dumps(r.json())
    elif tool == "casino":
        r = pf.post(f"{PF_BASE}/casino/bet", json={
            "game": action["game"],
            "amount": action["amount"],
            "cashout_at": action.get("cashout_at", 2.0)
        })
        return json.dumps(r.json())
    elif tool == "faucet":
        r = pf.post(f"{FAUCET}/claim")
        return json.dumps(r.json())
    elif tool == "escrow":
        r = pf.post(f"{ESCROW}/create", json={
            "recipient": action["recipient"],
            "amount": action["amount"],
            "task": action["task"]
        })
        return json.dumps(r.json())
    return f"Unknown tool: {tool}"

def parse_json_from_response(text: str) -> dict:
    """Extract JSON from model output, handling common formatting issues."""
    text = text.strip()
    # Strip markdown code fences if present
    if text.startswith("```"):
        lines = text.split("\n")
        text = "\n".join(lines[1:-1])
    # Find JSON object boundaries
    start = text.find("{")
    end = text.rfind("}") + 1
    if start >= 0 and end > start:
        return json.loads(text[start:end])
    raise ValueError(f"No JSON found in: {text[:100]}")

def run_agent(user_task: str):
    """Run the GPT4All financial agent on a task."""
    conversation = []
    prompt = f"{ACTION_SCHEMA}\nTask: {user_task}\nBegin."

    with model.chat_session():
        for step in range(10):
            response = model.generate(prompt, max_tokens=200, temp=0.1)
            print(f"  Model: {response.strip()[:150]}")

            try:
                action = parse_json_from_response(response)
            except (ValueError, json.JSONDecodeError) as e:
                # Ask the model to try again rather than crashing the loop
                print(f"  Parse error: {e}. Retrying...")
                prompt = "That was not valid JSON. Output ONLY a valid JSON action:"
                continue

            if action.get("tool") == "done":
                print(f"\nFinal answer: {action.get('answer')}")
                break

            result = execute_tool(action)
            print(f"  → Executed {action['tool']}: {result[:120]}")
            conversation.append({"action": action, "result": result})
            prompt = f"Tool result: {result}\nNext action:"

# Run it
run_agent("Claim faucet if available, then check BTC price, then bet $0.20 on crash at 1.8x cashout.")
```
Not all models perform equally for structured JSON output. These are the best GPT4All-compatible models for Purple Flea agent tasks, ranked by JSON accuracy and reasoning quality.
| Model | Size | RAM Needed | JSON Accuracy | Financial Reasoning | Speed |
|---|---|---|---|---|---|
| Llama 3.1 8B Instruct | 4.7GB | 8GB | Excellent | Strong | Fast |
| Mistral 7B Instruct | 4.1GB | 8GB | Very Good | Good | Very Fast |
| Phi-3 Mini 4K Instruct | 2.2GB | 4GB | Good | Good | Fastest |
| Llama 3.2 3B Instruct | 2.0GB | 4GB | Good | Moderate | Very Fast |
| Qwen2.5 14B Instruct | 9.0GB | 16GB | Excellent | Excellent | Moderate |
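The table above maps directly onto a simple selection rule: take the strongest model that fits your RAM. A hedged sketch (the GGUF filenames below are assumptions matching the setup snippet; verify exact names against `GPT4All.list_models()` before relying on them):

```python
# (filename, minimum RAM in GB), ordered best-first by reasoning quality
MODELS = [
    ("Qwen2.5-14B-Instruct-Q4_K_M.gguf", 16),
    ("Meta-Llama-3-8B-Instruct.Q4_0.gguf", 8),
    ("Phi-3-mini-4k-instruct.Q4_0.gguf", 4),
]

def pick_model(ram_gb: int) -> str:
    """Pick the strongest model from the comparison table that fits in RAM."""
    for filename, min_ram in MODELS:
        if ram_gb >= min_ram:
            return filename
    raise RuntimeError("Need at least 4GB RAM for the smallest model")

print(pick_model(8))  # → Meta-Llama-3-8B-Instruct.Q4_0.gguf
```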
```python
from gpt4all import GPT4All

# Download happens automatically on first instantiation.
# Load ONE of the following, depending on your hardware:

# Best for most machines (8GB RAM):
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")

# Best for low-memory machines (4GB RAM):
# model = GPT4All("Phi-3-mini-4k-instruct.Q4_0.gguf")

# Best quality for financial analysis (16GB RAM):
# model = GPT4All("Qwen2.5-14B-Instruct-Q4_K_M.gguf")

# List all available models:
print(GPT4All.list_models())

# Models are stored in ~/Library/Application Support/nomic.ai/GPT4All/
# on macOS (location differs per OS)
```
GPT4All is one of several local LLM options Purple Flea supports. Jan and Ollama offer additional features like chat UI and model management.
GPT4All handles reasoning locally. Purple Flea handles execution. Get your free API key and start with $1 USDC from the faucet.