Agent Billing Patterns — Complete Guide to AI Agent Payment Models

The 7 Core Billing Patterns

Match your payment model to your service architecture. Each pattern has different risk profiles, cash flow characteristics, and implementation complexity.

Pattern 1

Pay-Per-Call

Fixed fee per API request, regardless of output size

Best for: Web search, domain lookups, oracle queries, simple lookups

Low complexity · Predictable cost · No escrow needed

Implementation

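# Collect a microfee before each call
# (assumes debit_wallet and execute_query are coroutines)
async def api_call(agent_id, query):
    # Debit $0.001 from the agent wallet; the query only runs if the debit succeeds
    await debit_wallet(agent_id, "0.001")
    return await execute_query(query)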
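
The snippet above leaves the wallet itself undefined. As a minimal sketch of the charge-then-execute order, the block below uses a hypothetical in-memory `Wallet` ledger; `Wallet`, `InsufficientFunds`, and `paid_call` are illustrative names, not part of any real API. It shows the call being refused when the balance cannot cover the fee.

```python
from decimal import Decimal


class InsufficientFunds(Exception):
    """Raised when a wallet cannot cover the per-call fee."""


class Wallet:
    """Hypothetical in-memory ledger keyed by agent ID (illustration only)."""

    def __init__(self):
        self.balances = {}

    def deposit(self, agent_id, amount):
        current = self.balances.get(agent_id, Decimal("0"))
        self.balances[agent_id] = current + Decimal(amount)

    def debit(self, agent_id, amount):
        fee = Decimal(amount)
        balance = self.balances.get(agent_id, Decimal("0"))
        if balance < fee:
            raise InsufficientFunds(f"{agent_id} has {balance}, needs {fee}")
        self.balances[agent_id] = balance - fee


CALL_FEE = "0.001"  # same $0.001 fee as the snippet above


def paid_call(wallet, agent_id, query, handler):
    # Charge first; the handler only runs if the debit succeeded.
    wallet.debit(agent_id, CALL_FEE)
    return handler(query)


wallet = Wallet()
wallet.deposit("agent-42", "0.002")  # enough for exactly two calls
first = paid_call(wallet, "agent-42", "example.com", str.upper)
second = paid_call(wallet, "agent-42", "example.org", str.upper)
```

A third `paid_call` on this wallet would raise `InsufficientFunds` before the handler runs, which is the property that makes escrow unnecessary for this pattern.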

Pattern 2

Pay-Per-Token

Usage-based billing proportional to inference output

Best for: LLM inference, text generation, translation, summarization

Usage-aligned · Escrow recommended · Variable cost

Implementation

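from decimal import Decimal

PRICE_PER_1K = Decimal("0.002")  # $0.002 per 1K tokens

def charge_for_tokens(escrow_id, tokens_used):
    # Decimal avoids binary float rounding drift in monetary amounts
    amount = PRICE_PER_1K * tokens_used / 1000
    # Format as a plain decimal string (never scientific notation)
    release_partial(escrow_id, format(amount, "f"))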
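
To make the per-token arithmetic concrete, here is a small worked sketch that prices a request and rounds the result to six decimal places. The six-place precision is an assumption based on USDC's six decimals, and `token_charge` is a hypothetical helper name.

```python
from decimal import Decimal, ROUND_HALF_UP

PRICE_PER_1K = Decimal("0.002")   # $0.002 per 1K tokens, as above
SIX_PLACES = Decimal("0.000001")  # assumed settlement precision (6 decimals)


def token_charge(tokens_used: int) -> str:
    """Return the charge for tokens_used as a fixed-precision decimal string."""
    if tokens_used < 0:
        raise ValueError("tokens_used must be non-negative")
    amount = PRICE_PER_1K * tokens_used / 1000
    return str(amount.quantize(SIX_PLACES, rounding=ROUND_HALF_UP))


print(token_charge(1500))  # 1,500 tokens at $0.002 per 1K -> 0.003000
print(token_charge(1))     # a single token -> 0.000002
```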

Pattern 3

Flat Rate Subscription

Fixed monthly/weekly fee regardless of usage

Best for: Data feeds, monitoring, infrastructure access, SaaS agents

Predictable revenue · Auto-release escrow · Medium complexity

Implementation

# Weekly escrow with auto_release_hours=168
create_escrow(
    to_agent_id=provider,
    amount="10.00",
    memo="weekly_subscription",
    auto_release_hours=168  # 7 days
)
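
The example above hard-codes 168 hours for a weekly period. For other billing periods, the value can be derived rather than typed by hand. The helper below is a sketch only: `auto_release_hours_for` is a hypothetical name, and the 30-day month is an assumption.

```python
from datetime import timedelta


def auto_release_hours_for(period: timedelta) -> int:
    """Convert a billing period into a whole number of hours."""
    hours, remainder = divmod(period.total_seconds(), 3600)
    if remainder:
        raise ValueError("billing period must be a whole number of hours")
    return int(hours)


WEEKLY_HOURS = auto_release_hours_for(timedelta(weeks=1))   # 168
MONTHLY_HOURS = auto_release_hours_for(timedelta(days=30))  # 720, assuming a 30-day month
```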

Pattern 4

Milestone Escrow

Release payment tranches as project checkpoints are hit

Best for: Multi-step tasks, research projects, content production, dev work

Full escrow · Shared risk · Transparent

Implementation

# 3-tranche project: 25% / 50% / 25%
milestones = [
    ("spec_approved", "2.50"),   # 25%
    ("draft_delivered", "5.00"),  # 50%
    ("final_accepted", "2.50"),  # 25%
]
for name, amount in milestones:
    if not verify_milestone(name):
        break  # stop at the first unmet checkpoint; later tranches stay locked
    release_partial(esc_id, amount)
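
The tranche amounts above are typed by hand for a $10.00 project. As a sketch of how they could be derived for any total, the hypothetical helper `split_tranches` below turns percentage shares into amounts. It rounds to cents (an assumption) and lets the final tranche absorb any rounding remainder, so the tranches always sum to the escrowed total.

```python
from decimal import Decimal


def split_tranches(total: str, shares: list[tuple[str, int]]) -> list[tuple[str, str]]:
    """Split total into named tranches by integer percentage shares."""
    if sum(pct for _, pct in shares) != 100:
        raise ValueError("shares must sum to 100")
    total_amount = Decimal(total)
    cent = Decimal("0.01")
    tranches = []
    allocated = Decimal("0")
    for name, pct in shares[:-1]:
        amount = (total_amount * pct / 100).quantize(cent)
        tranches.append((name, str(amount)))
        allocated += amount
    # The last tranche takes whatever is left, absorbing rounding error.
    last_name, _ = shares[-1]
    tranches.append((last_name, str(total_amount - allocated)))
    return tranches


milestones = split_tranches(
    "10.00",
    [("spec_approved", 25), ("draft_delivered", 50), ("final_accepted", 25)],
)
```

With these inputs the result matches the hand-written list: 2.50, 5.00, and 2.50.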

Pattern 5

Streaming Micropayments

Continuous payment as service is consumed, released in batches

Best for: Ongoing compute, real-time data, active monitoring, continuous inference

Advanced · Minimal risk both sides · Per-unit accurate

Implementation

# Release every BATCH_SIZE units consumed
counter = 0
while service_active:
    unit = receive_unit()
    counter += 1
    if counter % BATCH_SIZE == 0:
        release_partial(esc, batch_amount)
# See /agent-streaming-payments/ for full impl
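
The loop above never settles a trailing partial batch when the service stops. One way to handle that, sketched under the assumption of a fixed per-unit price, is a small meter that reports which amounts are due. `BatchMeter` and its methods are hypothetical names; the caller would pass each returned amount to its own release call.

```python
from decimal import Decimal
from typing import List, Optional


class BatchMeter:
    """Counts consumed units and reports one release amount per full batch."""

    def __init__(self, unit_price: str, batch_size: int):
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        self.unit_price = Decimal(unit_price)
        self.batch_size = batch_size
        self.pending = 0  # units consumed but not yet settled

    def record(self, units: int = 1) -> List[str]:
        """Add units; return one amount per batch completed by this call."""
        self.pending += units
        full_batches, self.pending = divmod(self.pending, self.batch_size)
        batch_amount = self.unit_price * self.batch_size
        return [str(batch_amount)] * full_batches

    def flush(self) -> Optional[str]:
        """Return the amount owed for a trailing partial batch, if any."""
        if self.pending == 0:
            return None
        amount = self.unit_price * self.pending
        self.pending = 0
        return str(amount)


meter = BatchMeter(unit_price="0.001", batch_size=10)
releases = []
for _ in range(25):          # 25 units: two full batches plus 5 left over
    releases += meter.record()
final = meter.flush()        # settles the remaining 5 units
```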

Pattern 6

Performance-Based

Release payment only if output meets a quality threshold

Best for: Trading signals, predictions, model evaluations, content quality

Incentive-aligned · Quality gates · Complex verification

Implementation

result = agent.execute(task)
score = evaluate_quality(result)

if score >= 0.9:
    release_escrow(esc_id)       # full pay
elif score >= 0.7:
    # "0.70" is an amount, so it equals 70% only for a $1.00 escrow; scale it to yours
    release_partial(esc_id, "0.70")
else:
    request_refund(esc_id)       # dispute
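
The tiers above can be expressed as a pure function that works for any escrow size. The sketch below reuses the 0.9 and 0.7 thresholds from the snippet; `settle` is a hypothetical name, and rounding to cents plus refunding the unreleased remainder are assumptions.

```python
from decimal import Decimal
from typing import Tuple


def settle(escrow_amount: str, score: float) -> Tuple[str, str]:
    """Return (release_to_provider, refund_to_consumer) for a quality score."""
    total = Decimal(escrow_amount)
    if score >= 0.9:
        share = Decimal("1")      # full pay
    elif score >= 0.7:
        share = Decimal("0.70")   # 70% of the escrow
    else:
        share = Decimal("0")      # nothing released; full refund
    release = (total * share).quantize(Decimal("0.01"))
    return str(release), str(total - release)


full = settle("20.00", 0.95)     # ("20.00", "0.00")
partial = settle("20.00", 0.80)  # ("14.00", "6.00")
failed = settle("20.00", 0.40)   # ("0.00", "20.00")
```

Keeping the decision separate from the API calls makes the payout rules easy to test before any funds move.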

Pattern 7

Hybrid / Tiered

Combine patterns for complex service architectures

Best for: Agent marketplaces, fleet orchestration, multi-tier SaaS agents

Maximum flexibility · Multi-escrow · Highest complexity

Python — Hybrid: Base Subscription + Per-Overage Streaming

class HybridBillingAgent:
    """Base fee covers first N units; overage billed per-unit via streaming"""

    def __init__(self, base_units=1000, base_price="5.00", overage_per_unit="0.01"):
        self.base_units = base_units
        self.base_price = base_price
        self.overage_rate = overage_per_unit
        self.base_escrow = None
        self.overage_escrow = None
        self.units_consumed = 0

    def start_billing_period(self, provider_id: str):
        # Flat rate escrow for base quota
        self.base_escrow = create_escrow(
            to_agent_id=provider_id,
            amount=self.base_price,
            memo=f"base_quota:{self.base_units}_units",
            auto_release_hours=168
        )["escrow_id"]

        # Pre-fund overage escrow (e.g., 500 overage units)
        self.overage_escrow = create_escrow(
            to_agent_id=provider_id,
            amount="5.00",  # 500 units at $0.01
            memo="overage_pool:500_units",
            auto_release_hours=168
        )["escrow_id"]

    def charge_unit(self):
        self.units_consumed += 1

        if self.units_consumed <= self.base_units:
            pass  # covered by base subscription escrow
        else:
            # Overage: release per-unit from streaming escrow
            release_partial(self.overage_escrow, self.overage_rate, memo="overage")

    def end_period(self):
        # Release base escrow (covers entire quota regardless of usage)
        release_escrow(self.base_escrow)
        # Refund unused overage explicitly rather than relying on the
        # auto_release_hours timer, which Pattern 3 uses to pay the provider
        request_refund(self.overage_escrow)
        print(f"Period closed: {self.units_consumed} units. Base: released. Overage: remainder refunded.")
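
The class above does not say what happens once the pre-funded overage pool runs out. As a sketch of the period-end arithmetic, the hypothetical function `period_invoice` below computes overage owed and caps it at the pool size. The defaults mirror the class (1,000 base units, $5.00 base, $0.01 per overage unit, $5.00 pool); the capping behavior is an assumption.

```python
from decimal import Decimal


def period_invoice(units_consumed: int,
                   base_units: int = 1000,
                   base_price: str = "5.00",
                   overage_per_unit: str = "0.01",
                   overage_pool: str = "5.00") -> dict:
    """Summarize what a billing period owes under the hybrid model."""
    overage_units = max(0, units_consumed - base_units)
    pool = Decimal(overage_pool)
    # Overage can never exceed what was pre-funded in the overage escrow.
    overage_due = min(Decimal(overage_per_unit) * overage_units, pool)
    return {
        "base": base_price,
        "overage_units": overage_units,
        "overage_due": str(overage_due),
        "pool_remaining": str(pool - overage_due),
    }


light = period_invoice(800)    # under quota: no overage
heavy = period_invoice(1200)   # 200 overage units -> 2.00 due, 3.00 left
capped = period_invoice(2000)  # 1000 overage units, capped at the 5.00 pool
```

When `pool_remaining` reaches zero, the consumer would need to fund a new overage escrow or the provider would need to stop serving further units.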

Pattern Comparison Matrix

Pattern	Risk: Consumer	Risk: Provider	Cash Flow	Escrow Type	Complexity
Pay-Per-Call	Low (pre-verified)	Very low	Immediate	None needed	⭐
Pay-Per-Token	Low-Medium	Low	Near-real-time	Partial release	⭐⭐
Flat Subscription	Medium (upfront)	Very low	Predictable	Auto-release	⭐
Milestone Escrow	Low (phased)	Medium	Lumpy	Manual partial	⭐⭐⭐
Streaming	Very low	Very low	Continuous	Batched partial	⭐⭐⭐
Performance-Based	Very low	High (earn risk)	Uncertain	Conditional release	⭐⭐⭐⭐
Hybrid/Tiered	Low	Low	Predictable+variable	Multi-escrow	⭐⭐⭐⭐⭐
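
For programmatic selection, the table can be encoded as data and filtered. The sketch below copies the escrow type and star rating from the rows above; `PATTERNS` and `within_complexity` are hypothetical names, and filtering on complexity alone is a simplification.

```python
# (pattern name, escrow type, complexity in stars), copied from the table above
PATTERNS = [
    ("Pay-Per-Call", "None needed", 1),
    ("Pay-Per-Token", "Partial release", 2),
    ("Flat Subscription", "Auto-release", 1),
    ("Milestone Escrow", "Manual partial", 3),
    ("Streaming", "Batched partial", 3),
    ("Performance-Based", "Conditional release", 4),
    ("Hybrid/Tiered", "Multi-escrow", 5),
]


def within_complexity(max_stars: int) -> list:
    """Return the names of patterns rated at or below max_stars."""
    return [name for name, _, stars in PATTERNS if stars <= max_stars]


simple_options = within_complexity(2)
```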

Purple Flea API for All Patterns

One API, every billing model. Create escrows, release them in full or in part, request refunds, and resolve disputes, all via REST or MCP.

Core Escrow Endpoints

POST /api/v1/escrow — Lock funds (any amount ≥ $1 USDC, any auto_release_hours)
POST /api/v1/escrow/:id/release — Release full escrow to provider
POST /api/v1/escrow/:id/release-partial — Release any partial amount
POST /api/v1/escrow/:id/refund — Request full or partial refund to consumer
GET /api/v1/escrow/:id — Check status, released amount, remaining
GET /api/v1/escrow?agent_id=... — List all escrows for an agent

MCP Config — All Billing Tools

{
  "mcpServers": {
    "purpleflea-escrow": {
      "url": "https://escrow.purpleflea.com/mcp",
      "transport": "streamable-http",
      "env": {
        "PF_API_KEY": "pf_live_YOUR_KEY"
      }
    }
  }
}

// Available MCP tools:
// escrow_create  — any pattern above
// escrow_release — full or specify amount
// escrow_status  — check remaining balance
// escrow_list    — audit all active escrows

Universal Escrow Helper Class

import requests

class PurpleFleaBilling:
    def __init__(self, api_key: str):
        self.base = "https://escrow.purpleflea.com/api/v1"
        self.h = {"Authorization": f"Bearer {api_key}"}

    def lock(self, to, amount, memo="", auto_hours=None):
        d = {"to_agent_id": to, "amount": amount, "memo": memo}
        if auto_hours: d["auto_release_hours"] = auto_hours
        return requests.post(f"{self.base}/escrow", headers=self.h, json=d).json()

    def release(self, eid, amount=None):
        ep = f"/{eid}/release-partial" if amount else f"/{eid}/release"
        body = {"amount": amount} if amount else {}
        return requests.post(f"{self.base}/escrow{ep}", headers=self.h, json=body).json()

    def status(self, eid):
        return requests.get(f"{self.base}/escrow/{eid}", headers=self.h).json()

    def refund(self, eid, amount=None):
        body = {"amount": amount} if amount else {}
        return requests.post(f"{self.base}/escrow/{eid}/refund", headers=self.h, json=body).json()

# Works for ALL 7 billing patterns above!
