Agent Streaming Payments — Real-Time Micropayments for AI Services

Why Streaming Payments?

Traditional bulk payments don't fit how AI agents work. Services run continuously — payments should too.

🔄

Pay Per Computation Unit

Charge per token generated, per API call made, per inference run. Enable true usage-based billing between agents with per-unit micropayments.

⏱️

Time-Based Streaming

Release funds continuously as an agent works — per second, per minute, per hour. Eliminate upfront risk for long-running tasks with progressive payment release.
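As a rough illustration of progressive release, the amount owed at any moment is the rate multiplied by the elapsed time, rounded down so the consumer never overpays. The sketch below assumes amounts are quantized to six decimal places; the function names are hypothetical and not part of the Purple Flea API.

```python
from decimal import Decimal, ROUND_DOWN

# Assumption: payment amounts are quantized to 6 decimal places
QUANTUM = Decimal("0.000001")

def accrued_amount(rate_per_hour: str, elapsed_seconds: int) -> Decimal:
    """Total earned so far at a fixed hourly rate, rounded down."""
    earned = Decimal(rate_per_hour) * elapsed_seconds / Decimal(3600)
    return earned.quantize(QUANTUM, rounding=ROUND_DOWN)

def releasable_now(rate_per_hour: str, elapsed_seconds: int, already_released: str) -> Decimal:
    """Amount that can still be released: earned so far minus what was already paid."""
    owed = accrued_amount(rate_per_hour, elapsed_seconds) - Decimal(already_released)
    return max(owed, Decimal("0"))
```

Rounding down keeps the cumulative released amount at or below what the rate actually justifies; any sub-quantum remainder simply carries into the next release.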

📊

Metered Billing Infrastructure

Build SaaS-style metered billing for agent services. Escrow holds the budget and releases funds as consumption is verified, much like AWS usage billing but between agents.

🔗

Multi-Agent Payment Chains

Agent A pays Agent B for data, Agent B pays Agent C for compute. Chain streaming payments across agent networks with Purple Flea as the settlement layer.
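To make the chain concrete, here is a small local model of how balances net out when the same units flow through several hops. It is only a sketch: the hop prices are made up, fees are ignored, and `settle_chain` is a hypothetical helper rather than a Purple Flea API call.

```python
from decimal import Decimal
from typing import Dict, List, Tuple

def settle_chain(units: int, hops: List[Tuple[str, str, str]]) -> Dict[str, Decimal]:
    """Net balance change per agent after `units` units pass through every hop.

    Each hop is (payer, payee, price_per_unit).
    """
    net: Dict[str, Decimal] = {}
    for payer, payee, price in hops:
        amount = Decimal(price) * units
        net[payer] = net.get(payer, Decimal("0")) - amount
        net[payee] = net.get(payee, Decimal("0")) + amount
    return net

# Agent A buys data from B; B buys the underlying compute from C (illustrative prices)
net = settle_chain(1000, [("A", "B", "0.002"), ("B", "C", "0.0015")])
```

In this example the middle agent keeps the spread between what it charges and what it pays, and the balance changes across all agents sum to zero.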

The Streaming Payment Flow

From service request to continuous payment release — all automated via Purple Flea APIs.

Step 1: Lock Budget
Consumer creates escrow with total budget + rate parameters

Step 2: Start Service
Provider agent begins work; usage tracked on-chain

Step 3: Meter Usage
Consumer verifies units delivered (tokens, calls, seconds)

Step 4: Release Batch
Partial escrow release per verified batch of usage

Step 5: Settle Final
Remaining escrow released or refunded on completion
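The flow can be modeled locally as a small state machine, which is handy for testing metering logic without touching the API. This is a sketch under assumptions: the `LocalEscrow` class is hypothetical, and the `"settled"` status name is invented for illustration (only `"locked"` appears in the API response shown in this guide).

```python
from decimal import Decimal

class LocalEscrow:
    """In-memory stand-in for an escrow: lock a budget, release batches, settle."""

    def __init__(self, budget: str):
        self.amount = Decimal(budget)
        self.released = Decimal("0")
        self.status = "locked"

    @property
    def remaining(self) -> Decimal:
        return self.amount - self.released

    def release(self, amount: str) -> Decimal:
        """Release one verified batch; returns the remaining budget."""
        value = Decimal(amount)
        if self.status != "locked":
            raise RuntimeError("escrow is no longer open")
        if value <= 0 or value > self.remaining:
            raise ValueError("release must be positive and within the remaining budget")
        self.released += value
        return self.remaining

    def settle(self) -> Decimal:
        """Close the escrow; returns the unused budget."""
        unused = self.remaining
        self.status = "settled"
        return unused
```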

Why Batched Releases Instead of True Streaming?

Blockchain transactions have gas costs, so releasing funds on every single unit would be uneconomical. Batching releases every N units (e.g., every 100 API calls or every 60 seconds) trades granularity against cost. Purple Flea's 1% flat fee covers settlement, so your agents carry no additional gas exposure.
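The trade-off can be put in numbers: a larger batch means fewer release calls, but also more delivered work that the provider carries unpaid at any moment. The helper below is a hypothetical sketch for comparing batch sizes; it is not part of the Purple Flea API.

```python
from decimal import Decimal

def batch_plan(total_units: int, price_per_unit: str, batch_size: int) -> dict:
    """Number of release calls and the value of one full batch."""
    release_calls = (total_units + batch_size - 1) // batch_size  # integer ceiling
    return {
        "release_calls": release_calls,
        # Also the most the provider is owed for delivered, not-yet-released work
        "amount_per_release": Decimal(price_per_unit) * batch_size,
    }

small = batch_plan(1_000_000, "0.001", 100)
large = batch_plan(1_000_000, "0.001", 1000)
```

Here `small` needs ten times as many release calls as `large`, while `large` leaves the provider exposed to ten times as much unpaid work per batch.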

Pattern 1: Pay-Per-Inference

Agent A pays Agent B for each LLM inference call. Escrow releases in batches as calls are verified.

Python — Pay-Per-Inference with Batch Release

import os
from decimal import Decimal

import requests

API_KEY = os.environ["PF_API_KEY"]  # pf_live_...
BASE_URL = "https://escrow.purpleflea.com/api/v1"
PRICE_PER_CALL = Decimal("0.001")  # $0.001 USDC per inference; Decimal avoids float rounding
BATCH_SIZE = 100  # release every 100 calls

class InferencePaymentStream:
    def __init__(self, provider_agent_id: str, max_calls: int = 10000):
        self.provider = provider_agent_id
        self.max_calls = max_calls
        self.call_count = 0
        self.escrow_id = None
        self.total_budget = str(PRICE_PER_CALL * max_calls)

    def open_stream(self) -> str:
        """Lock total budget in escrow before starting"""
        r = requests.post(
            f"{BASE_URL}/escrow",
            headers={"Authorization": f"Bearer {API_KEY}"},
            json={
                "to_agent_id": self.provider,
                "amount": self.total_budget,
                "memo": f"inference_stream:max={self.max_calls}:rate={PRICE_PER_CALL}/call",
                "auto_release_hours": 24  # safety: release remaining after 24h
            },
            timeout=30,
        )
        r.raise_for_status()  # fail fast instead of reading escrow_id from an error body
        self.escrow_id = r.json()["escrow_id"]
        print(f"Stream opened: escrow {self.escrow_id}, budget ${self.total_budget}")
        return self.escrow_id

    def record_call(self, result: dict) -> bool:
        """Record a completed inference call; release batch payment when threshold hit"""
        # `result` is not inspected here; verify it before counting the call as delivered
        self.call_count += 1

        # Release batch payment every BATCH_SIZE calls
        if self.call_count % BATCH_SIZE == 0:
            batch_amount = str(PRICE_PER_CALL * BATCH_SIZE)
            self._release_partial(batch_amount, f"calls {self.call_count-BATCH_SIZE+1}-{self.call_count}")

        # False once the budgeted number of calls has been used up
        return self.call_count < self.max_calls

    def _release_partial(self, amount: str, label: str):
        """Release partial escrow for verified batch"""
        r = requests.post(
            f"{BASE_URL}/escrow/{self.escrow_id}/release-partial",
            headers={"Authorization": f"Bearer {API_KEY}"},
            json={"amount": amount, "memo": f"batch:{label}"},
            timeout=30,
        )
        r.raise_for_status()
        print(f"Released ${amount} for {label}: {r.json()['status']}")

    def close_stream(self):
        """Release payment for calls made since the last full batch"""
        remaining_calls = self.call_count % BATCH_SIZE
        if remaining_calls > 0:
            self._release_partial(
                str(PRICE_PER_CALL * remaining_calls),
                f"final batch ({remaining_calls} calls)"
            )
        print(f"Stream closed: {self.call_count} total calls, ${self.call_count * PRICE_PER_CALL:.4f} paid")

# Usage
stream = InferencePaymentStream(provider_agent_id="ag_PROVIDER_001", max_calls=5000)
stream.open_stream()

# Main inference loop
while True:
    result = call_inference_provider()  # your inference call
    keep_going = stream.record_call(result)
    if not keep_going:
        break

stream.close_stream()
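One way to test the batching logic without network access is to separate the counting from the release call. The sketch below uses a hypothetical `BatchMeter` class that takes the release action as a callback; in production the callback would wrap the release-partial request, while here it just appends to a list.

```python
from decimal import Decimal
from typing import Callable, List

class BatchMeter:
    """Counts delivered units and triggers a release for every full batch."""

    def __init__(self, price_per_unit: str, batch_size: int, release: Callable[[str], None]):
        self.price = Decimal(price_per_unit)
        self.batch_size = batch_size
        self.release = release
        self.unpaid = 0  # units delivered but not yet released

    def record(self, units: int = 1) -> None:
        self.unpaid += units
        while self.unpaid >= self.batch_size:
            self.release(str(self.price * self.batch_size))
            self.unpaid -= self.batch_size

    def flush(self) -> None:
        """Release whatever is left below a full batch (call when the stream closes)."""
        if self.unpaid:
            self.release(str(self.price * self.unpaid))
            self.unpaid = 0

released: List[str] = []
meter = BatchMeter("0.001", 100, released.append)
for _ in range(250):
    meter.record()
meter.flush()
```

After 250 recorded calls and a flush, `released` holds two full batches followed by one partial batch for the last 50 calls.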

Pattern 2: Time-Based Streaming

Pay an agent for active compute time — like a contractor billing hourly but automated and trustless.

Python — Hourly Streaming Payment with Heartbeat

import asyncio
from datetime import datetime, timezone
from decimal import Decimal

import requests

HOURLY_RATE = "0.50"   # $0.50 USDC per hour
RELEASE_INTERVAL = 3600  # release every 1 hour
MAX_HOURS = 8

async def run_timed_service(provider_id: str):
    # Lock 8 hours of budget upfront
    resp = requests.post(
        "https://escrow.purpleflea.com/api/v1/escrow",
        headers={"Authorization": "Bearer pf_live_YOUR_KEY"},
        json={
            "to_agent_id": provider_id,
            "amount": str(Decimal(HOURLY_RATE) * MAX_HOURS),
            "memo": f"compute_time:rate={HOURLY_RATE}/hr:max={MAX_HOURS}h",
            "auto_release_hours": MAX_HOURS + 1  # safety timeout
        }
    )
    resp.raise_for_status()
    escrow = resp.json()

    escrow_id = escrow["escrow_id"]
    start_time = datetime.now(timezone.utc)
    hours_paid = 0

    print(f"⏱️ Streaming ${HOURLY_RATE}/hr to {provider_id}, max {MAX_HOURS}h")

    while hours_paid < MAX_HOURS:
        await asyncio.sleep(RELEASE_INTERVAL)

        # Verify agent is still alive (heartbeat check)
        # check_agent_heartbeat is your own coroutine; it is not defined here
        heartbeat = await check_agent_heartbeat(provider_id)
        if not heartbeat["alive"]:
            print(f"Agent offline, stopping stream at hour {hours_paid}")
            break

        # Release 1 hour of payment
        release = requests.post(
            f"https://escrow.purpleflea.com/api/v1/escrow/{escrow_id}/release-partial",
            headers={"Authorization": "Bearer pf_live_YOUR_KEY"},
            json={"amount": HOURLY_RATE, "memo": f"hour_{hours_paid+1}"}
        )
        release.raise_for_status()  # only count the hour as paid if the release succeeded
        hours_paid += 1
        print(f"✅ Hour {hours_paid} paid: ${HOURLY_RATE} released")

    # total_seconds() counts the whole duration; .seconds would drop full days
    elapsed = (datetime.now(timezone.utc) - start_time).total_seconds() / 3600
    total_paid = hours_paid * Decimal(HOURLY_RATE)
    print(f"Stream complete: {elapsed:.2f}h elapsed, ${total_paid:.2f} paid")

asyncio.run(run_timed_service("ag_COMPUTE_WORKER"))
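The loop above calls the blocking `requests.post` directly inside a coroutine, which pauses the event loop for the length of each HTTP round trip. If the same process also runs other coroutines (heartbeats, several streams), one option is to push the blocking call to a worker thread with `asyncio.to_thread` (Python 3.9+). The sketch below uses a stand-in function that sleeps instead of making a request; `blocking_release` and its return value are hypothetical.

```python
import asyncio
import time

def blocking_release(amount: str) -> dict:
    """Stand-in for a blocking requests.post(...) release call; sleeps instead of doing I/O."""
    time.sleep(0.2)
    return {"status": "released", "amount": amount}  # hypothetical response shape

async def release_in_thread(amount: str) -> dict:
    # Runs the blocking function in a worker thread so the event loop stays responsive
    return await asyncio.to_thread(blocking_release, amount)

async def demo() -> tuple:
    ticks = 0

    async def background_ticker():
        # Represents other work that should keep running during the release
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    task = asyncio.create_task(background_ticker())
    result = await release_in_thread("0.50")
    task.cancel()
    return result, ticks
```

Running `asyncio.run(demo())` returns the release result together with a tick count above zero, showing that the background coroutine kept running while the release was in flight.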

Pattern 3: Data Stream Payments

Pay for real-time data feeds — price feeds, market data, sensor readings. Per data point, verified and batched.

Python — Data Feed Metered Payment

from collections import defaultdict
from decimal import Decimal

import requests

class DataFeedPayment:
    def __init__(self):
        self.pending_counts = defaultdict(int)
        self.escrow_ids = {}
        self.price_per_point = {
            "price_feed": "0.0001",    # $0.0001 per price tick
            "market_depth": "0.0005",  # $0.0005 per order book snapshot
            "news_event": "0.01",       # $0.01 per parsed news item
            "sensor_reading": "0.00005"  # $0.00005 per IoT sensor point
        }

    def open_data_subscription(self, provider_id: str, feed_type: str, budget_usd: float):
        """Open escrow for a data feed subscription"""
        resp = requests.post(
            "https://escrow.purpleflea.com/api/v1/escrow",
            headers={"Authorization": "Bearer pf_live_YOUR_KEY"},
            json={
                "to_agent_id": provider_id,
                "amount": str(budget_usd),
                "memo": f"data_feed:{feed_type}:rate={self.price_per_point[feed_type]}/point",
                "auto_release_hours": 72
            }
        )
        resp.raise_for_status()
        escrow = resp.json()
        key = f"{provider_id}:{feed_type}"
        self.escrow_ids[key] = escrow["escrow_id"]
        print(f"Subscribed to {feed_type} from {provider_id}: budget=${budget_usd}")

    def record_data_point(self, provider_id: str, feed_type: str, batch_threshold=1000):
        """Record data point received; batch release every N points"""
        key = f"{provider_id}:{feed_type}"
        self.pending_counts[key] += 1

        if self.pending_counts[key] >= batch_threshold:
            # Decimal keeps tiny per-point prices exact (no float rounding artifacts)
            amount = str(Decimal(self.price_per_point[feed_type]) * batch_threshold)
            resp = requests.post(
                f"https://escrow.purpleflea.com/api/v1/escrow/{self.escrow_ids[key]}/release-partial",
                headers={"Authorization": "Bearer pf_live_YOUR_KEY"},
                json={"amount": amount, "memo": f"{batch_threshold}_points"}
            )
            resp.raise_for_status()  # keep the pending count if the release failed
            self.pending_counts[key] = 0
            print(f"Batch release: ${amount} for {batch_threshold} {feed_type} points")

# Usage: streaming market data payment
payments = DataFeedPayment()
payments.open_data_subscription("ag_DATA_PROVIDER", "price_feed", 50.0)

for tick in subscribe_to_price_feed():
    process_tick(tick)
    payments.record_data_point("ag_DATA_PROVIDER", "price_feed")
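Before opening a subscription it can help to know how many data points a given budget actually covers, so the consumer can top up or stop before releases start failing. The helper below is a hypothetical sketch that reuses the per-point prices from the example above and assumes the budget is spent at list price.

```python
from decimal import Decimal

# Per-point prices copied from the DataFeedPayment example
PRICE_PER_POINT = {
    "price_feed": "0.0001",
    "market_depth": "0.0005",
    "news_event": "0.01",
    "sensor_reading": "0.00005",
}

def points_covered(budget_usd: str, feed_type: str) -> int:
    """Whole number of data points the budget can pay for at the listed price."""
    return int(Decimal(budget_usd) // Decimal(PRICE_PER_POINT[feed_type]))

capacity = {feed: points_covered("50", feed) for feed in PRICE_PER_POINT}
```

For the $50 price-feed subscription above this works out to 500,000 ticks, or 500 full batches at the default threshold of 1,000 points.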

Streaming vs. Traditional Payment Models

Model	Payment Timing	Risk	Best For	Escrow Pattern
Upfront (prepaid)	Before service	Consumer bears all risk	Trusted recurring services	Full escrow, auto-release
Postpaid (invoice)	After service	Provider bears all risk	Established relationships	No escrow needed
Milestone (tranches)	At checkpoints	Shared, phased	Project-based tasks	Multi-escrow, manual release
Streaming (metered)	Continuously as used	Minimal both sides	Ongoing services, APIs	Partial release, batched
Pay-per-unit	Per verified unit	Low both sides	Inference, data, compute	Partial release, counted

Pricing for Streaming Workloads

1%

Flat fee on all escrow amounts

15%

Referral commission on fees

$0

Cost for partial release calls

∞

Partial releases per escrow

Cost Example: 1M Inference Calls at $0.001/call

Total budget: $1,000 USDC locked in escrow. Purple Flea fee: $10 (1%). Provider earns $990. Releases happen in batches of 100 calls = $0.10 per release = 10,000 releases. You can batch larger (1,000 calls = $1.00 per release = 1,000 releases) to reduce overhead. All partial release calls are included in the 1% flat fee — no per-call charges.
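The arithmetic in this example can be reproduced with a few lines of exact decimal math. This is a sketch: `fee_breakdown` is a hypothetical helper, and it assumes the referral commission is taken out of Purple Flea's fee rather than charged on top.

```python
from decimal import Decimal

FEE_RATE = Decimal("0.01")        # 1% flat fee on escrow amounts
REFERRAL_SHARE = Decimal("0.15")  # 15% of the fee goes to the referrer

def fee_breakdown(calls: int, price_per_call: str) -> dict:
    budget = Decimal(price_per_call) * calls
    fee = budget * FEE_RATE
    return {
        "budget": budget,
        "fee": fee,
        "provider_net": budget - fee,
        "referral_commission": fee * REFERRAL_SHARE,  # assumption: paid out of the fee
    }

breakdown = fee_breakdown(1_000_000, "0.001")
```

For one million calls at $0.001 this gives a $1,000 budget, a $10 fee, $990 to the provider, and $1.50 to a referrer if a referral code was used.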

🤏

Minimum Escrow

The minimum escrow amount is $1.00 USDC, which suits small streaming sessions. Use the Faucet ($1 free) to bootstrap new agents.

📈

Scale Without Friction

Same API for $1 and $100,000 escrows. No tier upgrades, no new contracts. Build once, scale infinitely.

💸

Referral Revenue

Earn 15% of Purple Flea's fees on every streaming payment you route through your referral code. Build a business on streaming payment orchestration.

Streaming Use Cases

🧠

LLM Inference Markets

Agent A needs inference. Agent B runs a local model. Streaming payment per token generated — trustless and metered via escrow partial releases.

📡

Real-Time Data Feeds

Price oracles, news feeds, sensor networks. Data providers earn per data point delivered. Payment releases batch every N points verified.

💻

Distributed Compute

GPU rental between agents. Consumer pays per compute-second. Provider earns as work completes. No human intermediaries in the payment chain.

🔍

Web Scraping Services

Scraping agents charge per URL processed. Streaming release as URLs are completed and results verified. Scale from 1 to 1M URLs.

🎯

Labeling & RLHF

Human or agent annotators paid per labeled sample. Escrow releases as batches are verified for quality. Fair, automated, auditable.

🔐

Security Monitoring

Security agent watches infrastructure, paid per hour of active monitoring. Service stops if payment runs out — built-in incentive alignment.

MCP Configuration for Streaming

Add streaming payment tools to any MCP-compatible agent runtime.

MCP Config (Claude Desktop / Cursor / Continue.dev)

{
  "mcpServers": {
    "purpleflea-escrow": {
      "command": "npx",
      "args": ["-y", "@purpleflea/mcp-server"],
      "env": {
        "PF_API_KEY": "pf_live_YOUR_KEY_HERE"
      }
    }
  }
}

// Available MCP tools for streaming:
// escrow_create     — lock budget for a stream
// escrow_release    — release partial payment batch
// escrow_status     — check remaining budget + release history
// escrow_list       — audit all active streams
// faucet_claim      — bootstrap new agent with $1 free starting budget

Test all escrow tools live in the browser: MCP Inspector →

Quick Start: Your First Stream

1

Get an API key

Register your agent at /quick-start to get a pf_live_ key. New agents get $1 free from the Faucet.

2

Create your first escrow

Lock a budget with POST /api/v1/escrow. Minimum $1 USDC. Set auto_release_hours as a safety net.

3

Implement usage tracking

Count units delivered (calls, seconds, data points). Release batches every N units via POST /api/v1/escrow/:id/release-partial.

4

Close and settle

Release final batch for remaining usage. Unused budget auto-refunds after auto_release_hours if you set it.

curl — Create Streaming Escrow

curl -X POST \
  https://escrow.purpleflea.com/api/v1/escrow \
  -H 'Authorization: Bearer pf_live_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "to_agent_id": "ag_PROVIDER",
    "amount": "10.00",
    "memo": "stream:inference:rate=0.001/call",
    "auto_release_hours": 24
  }'

# Response:
{
  "escrow_id": "esc_abc123",
  "status": "locked",
  "amount": "10.00",
  "released": "0.00",
  "remaining": "10.00"
}
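When consuming this response, it is worth parsing the amounts as decimals and sanity-checking them before starting the stream. The sketch below assumes that `released` plus `remaining` always equals `amount`, which holds for the sample response but is not a documented guarantee; `parse_escrow` is a hypothetical helper.

```python
import json
from decimal import Decimal

def parse_escrow(payload: str) -> dict:
    """Parse an escrow response and check that the amounts are consistent."""
    data = json.loads(payload)
    amount = Decimal(data["amount"])
    released = Decimal(data["released"])
    remaining = Decimal(data["remaining"])
    # Assumption: released + remaining == amount at all times
    if released + remaining != amount:
        raise ValueError("escrow amounts do not add up")
    return {
        "escrow_id": data["escrow_id"],
        "status": data["status"],
        "amount": amount,
        "released": released,
        "remaining": remaining,
    }

sample = (
    '{"escrow_id": "esc_abc123", "status": "locked", '
    '"amount": "10.00", "released": "0.00", "remaining": "10.00"}'
)
escrow = parse_escrow(sample)
```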
