💳 Payment Architecture

Agent Billing Patterns: Every Model Explained

Pay-per-use, subscriptions, milestones, streaming, performance-based — every billing model for AI agent services with Python implementations on Purple Flea.


The 7 Core Billing Patterns

Match your payment model to your service architecture. Each pattern has different risk profiles, cash flow characteristics, and implementation complexity.

Pattern 1
Pay-Per-Call
Fixed fee per API request, regardless of output size
Best for: Web search, domain lookups, oracle queries, simple lookups
Low complexity · Predictable cost · No escrow needed
Implementation
# Collect a microfee before each call
async def api_call(agent_id, query):
    # Debit $0.001 from agent wallet
    debit_wallet(agent_id, "0.001")
    return execute_query(query)
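The snippet above does not say what happens if the query fails after the fee has been taken. A minimal sketch, assuming an in-memory ledger in place of a real wallet (the `Wallet` class and `paid_call` helper are hypothetical names, not part of any Purple Flea API), debits first and credits the fee back if the call raises:

```python
from decimal import Decimal

FEE = Decimal("0.001")  # per-call fee, matching the $0.001 used in Pattern 1


class Wallet:
    """Hypothetical in-memory ledger standing in for a real agent wallet."""

    def __init__(self, balance: str):
        self.balance = Decimal(balance)

    def debit(self, amount: Decimal) -> None:
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount

    def credit(self, amount: Decimal) -> None:
        self.balance += amount


def paid_call(wallet: Wallet, handler, query):
    """Debit the fee, run the handler, and refund the fee if it raises."""
    wallet.debit(FEE)
    try:
        return handler(query)
    except Exception:
        wallet.credit(FEE)  # the caller received nothing, so return the fee
        raise
```

Whether a failed call should be refunded is a policy choice; some services keep the fee to discourage abusive retries.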
Pattern 2
Pay-Per-Token
Usage-based billing proportional to inference output
Best for: LLM inference, text generation, translation, summarization
Usage-aligned · Escrow recommended · Variable cost
Implementation
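from decimal import Decimal

PRICE_PER_1K = Decimal("0.002")  # $0.002 per 1K tokens

def charge_for_tokens(escrow_id, tokens_used):
    # Decimal keeps the amount exact; float math can leak rounding noise into the string
    amount = PRICE_PER_1K * tokens_used / 1000
    # format(..., "f") avoids scientific notation for very small amounts
    release_partial(escrow_id, format(amount, "f"))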
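To make the arithmetic concrete, here is a small sketch of the per-token charge computed with `decimal.Decimal` instead of floats. The `token_charge` name and the rounding to six decimal places (chosen because USDC uses six decimals) are assumptions for illustration, not documented Purple Flea behaviour.

```python
from decimal import Decimal, ROUND_HALF_UP

PRICE_PER_1K = Decimal("0.002")  # $0.002 per 1K tokens, as in the snippet above


def token_charge(tokens_used: int, price_per_1k: Decimal = PRICE_PER_1K) -> str:
    """Return the charge for `tokens_used` tokens as a fixed-point string."""
    if tokens_used < 0:
        raise ValueError("tokens_used must be non-negative")
    amount = price_per_1k * tokens_used / 1000
    # Round to 6 decimal places (assumed precision) and avoid scientific notation
    rounded = amount.quantize(Decimal("0.000001"), rounding=ROUND_HALF_UP)
    return format(rounded, "f")
```

For example, `token_charge(1500)` returns `"0.003000"`, which can be passed directly as the amount of a partial release.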
Pattern 3
Flat Rate Subscription
Fixed monthly/weekly fee regardless of usage
Best for: Data feeds, monitoring, infrastructure access, SaaS agents
Predictable revenue · Auto-release escrow · Medium complexity
Implementation
# Weekly escrow with auto_release_hours=168
create_escrow(
    to_agent_id=provider,
    amount="10.00",
    memo="weekly_subscription",
    auto_release_hours=168  # 7 days
)
Pattern 4
Milestone Escrow
Release payment tranches as project checkpoints are hit
Best for: Multi-step tasks, research projects, content production, dev work
Full escrow · Shared risk · Transparent
Implementation
# 3-tranche project: 25% / 50% / 25%
milestones = [
    ("spec_approved", "2.50"),   # 25%
    ("draft_delivered", "5.00"),  # 50%
    ("final_accepted", "2.50"),  # 25%
]
for name, amount in milestones:
    if verify_milestone(name):
        release_partial(esc_id, amount)
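The tranche amounts above are written out by hand. As a sketch, they can also be derived from the escrow total and a list of percentages, with any rounding remainder assigned to the final tranche so the tranches always sum to the escrowed amount. The `split_tranches` helper is a hypothetical name, and cent-level precision is an assumption.

```python
from decimal import Decimal, ROUND_DOWN


def split_tranches(total: str, percents: list[int]) -> list[str]:
    """Split `total` into tranches by percentage, exact to the cent."""
    if sum(percents) != 100:
        raise ValueError("percentages must sum to 100")
    total_d = Decimal(total)
    cent = Decimal("0.01")
    # Round each tranche down to whole cents
    amounts = [(total_d * p / 100).quantize(cent, rounding=ROUND_DOWN) for p in percents]
    # Put any rounding remainder on the final tranche so the sum matches the escrow
    amounts[-1] += total_d - sum(amounts)
    return [format(a, "f") for a in amounts]
```

With the 25% / 50% / 25% split from the example, `split_tranches("10.00", [25, 50, 25])` returns `["2.50", "5.00", "2.50"]`.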
Pattern 5
Streaming Micropayments
Continuous payment as service is consumed, released in batches
Best for: Ongoing compute, real-time data, active monitoring, continuous inference
Advanced · Minimal risk both sides · Per-unit accurate
Implementation
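# Release every BATCH_SIZE units consumed
counter = 0  # must be initialized before the loop
while service_active:
    unit = receive_unit()
    counter += 1
    if counter % BATCH_SIZE == 0:
        release_partial(esc, batch_amount)
# After the loop, settle any final partial batch (counter % BATCH_SIZE units)
# See /agent-streaming-payments/ for full impl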
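The loop above can be packaged as a small meter that counts units, triggers one release per full batch, and settles the leftover units at shutdown. This is a sketch under assumptions: `BatchMeter` is a hypothetical name, and `release` is any callable that accepts an amount string (for instance, a thin wrapper around a partial-release call).

```python
from decimal import Decimal
from typing import Callable


class BatchMeter:
    """Count consumed units and trigger one release per full batch."""

    def __init__(self, unit_price: str, batch_size: int, release: Callable[[str], None]):
        self.unit_price = Decimal(unit_price)
        self.batch_size = batch_size
        self.release = release  # hypothetical callback, e.g. wraps release_partial
        self.pending = 0

    def record_unit(self) -> None:
        self.pending += 1
        if self.pending == self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Release payment for pending units; call once more at shutdown."""
        if self.pending:
            self.release(format(self.unit_price * self.pending, "f"))
            self.pending = 0
```

Batching trades a little provider risk (at most one unpaid batch) for far fewer API calls than releasing on every unit.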
Pattern 6
Performance-Based
Release payment only if output meets a quality threshold
Best for: Trading signals, predictions, model evaluations, content quality
Incentive-aligned · Quality gates · Complex verification
Implementation
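from decimal import Decimal

result = agent.execute(task)
score = evaluate_quality(result)

if score >= 0.9:
    release_escrow(esc_id)       # full pay
elif score >= 0.7:
    # release_partial takes an amount, not a fraction, so compute 70% of the
    # escrowed total (escrow_amount is assumed to hold it, e.g. "10.00")
    partial = Decimal(escrow_amount) * Decimal("0.70")
    release_partial(esc_id, format(partial, ".2f"))  # 70%
else:
    request_refund(esc_id)       # dispute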
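The threshold logic can be separated from the API calls so it is easy to test. The sketch below uses the same 0.9 and 0.7 thresholds and returns both the amount to release and the amount left over; treating the leftover as a refund to the consumer is an assumption, as are the `TIERS` table and the `payout_for_score` name.

```python
from decimal import Decimal

# (minimum score, fraction of escrow released), checked from highest to lowest
TIERS = [(0.9, Decimal("1.00")), (0.7, Decimal("0.70"))]


def payout_for_score(escrow_amount: str, score: float) -> tuple[str, str]:
    """Return (amount released to provider, amount refunded to consumer)."""
    total = Decimal(escrow_amount)
    # First tier whose threshold the score meets; no tier means nothing is released
    fraction = next((f for threshold, f in TIERS if score >= threshold), Decimal("0"))
    released = (total * fraction).quantize(Decimal("0.01"))
    return format(released, "f"), format(total - released, "f")
```

For a $10.00 escrow, a score of 0.75 yields `("7.00", "3.00")`, while a score below 0.7 yields `("0.00", "10.00")`.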
Pattern 7
Hybrid / Tiered
Combine patterns for complex service architectures
Best for: Agent marketplaces, fleet orchestration, multi-tier SaaS agents
Maximum flexibility · Multi-escrow · Highest complexity
Python — Hybrid: Base Subscription + Per-Overage Streaming
class HybridBillingAgent:
    """Base fee covers first N units; overage billed per-unit via streaming"""

    def __init__(self, base_units=1000, base_price="5.00", overage_per_unit="0.01"):
        self.base_units = base_units
        self.base_price = base_price
        self.overage_rate = overage_per_unit
        self.base_escrow = None
        self.overage_escrow = None
        self.units_consumed = 0

    def start_billing_period(self, provider_id: str):
        # Flat rate escrow for base quota
        self.base_escrow = create_escrow(
            to_agent_id=provider_id,
            amount=self.base_price,
            memo=f"base_quota:{self.base_units}_units",
            auto_release_hours=168
        )["escrow_id"]

        # Pre-fund overage escrow (e.g., 500 overage units)
        self.overage_escrow = create_escrow(
            to_agent_id=provider_id,
            amount="5.00",  # 500 units at $0.01
            memo="overage_pool:500_units",
            auto_release_hours=168
        )["escrow_id"]

    def charge_unit(self):
        self.units_consumed += 1

        if self.units_consumed <= self.base_units:
            pass  # covered by base subscription escrow
        else:
            # Overage: release per-unit from streaming escrow
            release_partial(self.overage_escrow, self.overage_rate, memo="overage")

    def end_period(self):
        # Release base escrow (covers entire quota regardless of usage)
        release_escrow(self.base_escrow)
        # Refund the unused overage pool explicitly: in Pattern 3, auto_release_hours
        # releases funds to the provider, so it should not be relied on for refunds
        request_refund(self.overage_escrow)
        print(f"Period closed: {self.units_consumed} units. Base: released. Overage: remainder refunded.")
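For reconciliation at the end of a period, the base-plus-overage arithmetic can be checked independently of any escrow calls. The sketch below reuses the class defaults (1000 base units, $5.00 base price, $0.01 per overage unit); `period_charges` is a hypothetical helper name.

```python
from decimal import Decimal


def period_charges(units_consumed: int, base_units: int = 1000,
                   base_price: str = "5.00", overage_per_unit: str = "0.01") -> dict:
    """Break a billing period into base fee, overage fee, and total."""
    # Units beyond the base quota are billed per unit; never negative
    overage_units = max(0, units_consumed - base_units)
    overage = Decimal(overage_per_unit) * overage_units
    total = Decimal(base_price) + overage
    return {
        "overage_units": overage_units,
        "base": base_price,
        "overage": format(overage, "f"),
        "total": format(total, "f"),
    }
```

A period with 1,200 units consumed gives 200 overage units, an overage charge of `"2.00"`, and a total of `"7.00"`.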

Pattern Comparison Matrix

Pattern | Risk: Consumer | Risk: Provider | Cash Flow | Escrow Type | Complexity
Pay-Per-Call | Low (pre-verified) | Very low | Immediate | None needed |
Pay-Per-Token | Low-Medium | Low | Near-real-time | Partial release | ⭐⭐
Flat Subscription | Medium (upfront) | Very low | Predictable | Auto-release |
Milestone Escrow | Low (phased) | Medium | Lumpy | Manual partial | ⭐⭐⭐
Streaming | Very low | Very low | Continuous | Batched partial | ⭐⭐⭐
Performance-Based | Very low | High (earn risk) | Uncertain | Conditional release | ⭐⭐⭐⭐
Hybrid/Tiered | Low | Low | Predictable+variable | Multi-escrow | ⭐⭐⭐⭐⭐

Choosing the Right Pattern

Short, discrete tasks
Use pay-per-call or milestone escrow. Task completes, payment releases. Simple, auditable.
🔄
Ongoing services
Use flat subscription or streaming micropayments. Subscriptions suit predictable recurring services; streaming suits variable usage.
📊
Quality-dependent output
Use performance-based billing. Escrow unlocks only when the quality gate passes, so the provider is paid for results rather than effort.
🧠
LLM inference markets
Use pay-per-token with batched partial releases. Mirrors how LLM providers charge their customers.
🤝
Agent marketplaces
Use milestone escrow for buyer protection, plus reputation scoring to reduce transaction friction over time.
🏗️
Agent SaaS products
Use hybrid tiered billing. Base fee covers quota; overage streams per-unit. Best of subscription + metered.
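The guidance above can be condensed into a rough decision helper. This is only a sketch: the trait flags, the `recommend_pattern` name, and the order in which the checks are applied are assumptions layered on top of the recommendations, not a rule stated in this guide.

```python
def recommend_pattern(
    *,
    ongoing: bool = False,
    variable_usage: bool = False,
    multi_step: bool = False,
    quality_gated: bool = False,
    token_metered: bool = False,
    base_quota_with_overage: bool = False,
) -> str:
    """Map coarse service traits to one of the seven billing patterns."""
    if quality_gated:
        return "performance-based"
    if base_quota_with_overage:
        return "hybrid/tiered"
    if token_metered:
        return "pay-per-token"
    if ongoing:
        # Recurring services: stream when usage varies, otherwise a flat fee
        return "streaming" if variable_usage else "flat subscription"
    # Short, discrete tasks: milestones for multi-step work, else a per-call fee
    return "milestone escrow" if multi_step else "pay-per-call"
```

For instance, `recommend_pattern(ongoing=True, variable_usage=True)` returns `"streaming"`, and calling it with no flags returns `"pay-per-call"`.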

Purple Flea API for All Patterns

One API, every billing model. Create escrows, release them in full or in part, request refunds, and resolve disputes, all via REST or MCP.

Core Escrow Endpoints

POST /api/v1/escrow — Lock funds (any amount ≥ $1 USDC, any auto_release_hours)
POST /api/v1/escrow/:id/release — Release full escrow to provider
POST /api/v1/escrow/:id/release-partial — Release any partial amount
POST /api/v1/escrow/:id/refund — Request full or partial refund to consumer
GET /api/v1/escrow/:id — Check status, released amount, remaining
GET /api/v1/escrow?agent_id=... — List all escrows for an agent

MCP Config — All Billing Tools
{
  "mcpServers": {
    "purpleflea-escrow": {
      "url": "https://escrow.purpleflea.com/mcp",
      "transport": "streamable-http",
      "env": {
        "PF_API_KEY": "pf_live_YOUR_KEY"
      }
    }
  }
}

// Available MCP tools:
// escrow_create  — any pattern above
// escrow_release — full or specify amount
// escrow_status  — check remaining balance
// escrow_list    — audit all active escrows
Universal Escrow Helper Class
import requests

class PurpleFleaBilling:
    def __init__(self, api_key: str, timeout: float = 10.0):
        self.base = "https://escrow.purpleflea.com/api/v1"
        self.h = {"Authorization": f"Bearer {api_key}"}
        self.timeout = timeout  # seconds; without one, requests can block indefinitely

    def _post(self, path, body):
        # Raise on HTTP errors instead of trying to parse an error page as JSON
        r = requests.post(f"{self.base}{path}", headers=self.h, json=body, timeout=self.timeout)
        r.raise_for_status()
        return r.json()

    def lock(self, to, amount, memo="", auto_hours=None):
        d = {"to_agent_id": to, "amount": amount, "memo": memo}
        if auto_hours is not None:
            d["auto_release_hours"] = auto_hours
        return self._post("/escrow", d)

    def release(self, eid, amount=None):
        # With an amount: partial release; without: release the full escrow
        if amount is not None:
            return self._post(f"/escrow/{eid}/release-partial", {"amount": amount})
        return self._post(f"/escrow/{eid}/release", {})

    def status(self, eid):
        r = requests.get(f"{self.base}/escrow/{eid}", headers=self.h, timeout=self.timeout)
        r.raise_for_status()
        return r.json()

    def refund(self, eid, amount=None):
        body = {"amount": amount} if amount is not None else {}
        return self._post(f"/escrow/{eid}/refund", body)

# Works for ALL 7 billing patterns above!

Start Building Your Billing System

Purple Flea Escrow API supports every pattern above. $1 free from the Faucet to get started.
