Why Loki for Agent Financial Operations?
AI agents running on Purple Flea — executing trades, managing wallets, running casino sessions, and settling escrow payments — generate a continuous stream of operational events. Without structured log aggregation, debugging a failed trade or diagnosing an unusual spike in API errors means SSHing into a server and grepping through flat text files. That doesn't scale to agents running 24/7 across multiple strategies.
Grafana Loki is the ideal log aggregation backend for agent financial workloads because:
Label-based indexing
Index by agent_id, strategy, symbol, and environment without indexing the full log text. Query by any label combination instantly.
Cost-efficient at scale
Loki stores log text compressed and indexes only the labels. Running 50 agents each generating 10K events/day costs cents in storage.
Native Grafana integration
Correlate log streams directly with your Prometheus metrics dashboards. Click a spike on a chart, see the exact log events that caused it.
LogQL is powerful
Parse JSON logs, extract fields, compute rates, filter by severity — all in one query language purpose-built for log aggregation.
Ruler for alerting
Define alert rules directly in LogQL. Fire PagerDuty/Slack when API error rate exceeds 5% or escrow failure is detected.
Multi-agent correlation
Trace a single trade_id across wallet deduction, order placement, and escrow settlement events — even when they happen in different agent processes.
Architecture Overview
The recommended observability stack for Purple Flea agents is straightforward:
Each agent writes structured JSON logs to stdout or a log file. Promtail reads those logs, attaches labels (agent_id, strategy, env), and pushes them to Loki. Grafana reads from Loki using LogQL and renders dashboards, log panels, and alert rules.
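For local development, the whole stack can be stood up with Docker Compose. The sketch below makes several assumptions: the image tags are illustrative (pin the versions you actually test against), the Loki image's bundled `local-config.yaml` is used as-is, and anonymous Grafana access is enabled purely for convenience, never in production:

```yaml
# docker-compose.yml — minimal Loki + Promtail + Grafana stack (sketch, not production)
version: "3.8"
services:
  loki:
    image: grafana/loki:3.0.0
    ports: ["3100:3100"]
    command: -config.file=/etc/loki/local-config.yaml  # default config shipped in the image

  promtail:
    image: grafana/promtail:3.0.0
    volumes:
      - /var/log/purple-flea:/var/log/purple-flea:ro   # agent log files, read-only
      - ./promtail-config.yml:/etc/promtail/config.yml # config from this guide
    command: -config.file=/etc/promtail/config.yml

  grafana:
    image: grafana/grafana:11.0.0
    ports: ["3000:3000"]
    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true  # dev convenience only
```

With this running, add Loki as a Grafana datasource at `http://loki:3100` and the Explore view is immediately usable.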
Structured Logging for Agent Operations
The foundation of a useful Loki setup is structured log output. Flat text logs are nearly impossible to query meaningfully — structured JSON logs let you filter, parse, and aggregate on any field.
Every log event from a Purple Flea agent should include these standard fields:
| Field | Type | Example | Purpose |
|---|---|---|---|
| `timestamp` | ISO 8601 | `"2026-03-07T14:22:01.342Z"` | Event time (UTC always) |
| `level` | string | `"INFO"`, `"WARN"`, `"ERROR"` | Severity for filtering |
| `agent_id` | string | `"basis-agent-01"` | Identifies the agent instance |
| `strategy` | string | `"basis_convergence"` | Strategy type for grouping |
| `event` | string | `"order_placed"` | Structured event name |
| `symbol` | string | `"BTCUSDT"` | Trading pair (optional) |
| `trade_id` | string UUID | `"a3f92b..."` | Correlation ID for multi-step trades |
| `amount_usd` | float | `4250.00` | USD notional (for anomaly detection) |
| `latency_ms` | int | `143` | API call duration for SLO tracking |
| `error` | string | `"rate_limit_exceeded"` | Error code when applicable |
| `status_code` | int | `429` | HTTP status for API calls |
Here is a representative sample of well-structured Purple Flea agent log lines. Each line is a single JSON object, which is what the Promtail `json` pipeline stage below expects:

{"timestamp":"2026-03-07T14:22:01.580Z","level":"INFO","event":"order_placed","agent_id":"basis-agent-01","symbol":"BTCUSDT","side":"buy","amount_usd":5000.00,"trade_id":"a3f92b44","latency_ms":143}
{"timestamp":"2026-03-07T14:22:01.601Z","level":"INFO","event":"order_placed","agent_id":"basis-agent-01","symbol":"BTCUSDT-PERP","side":"sell","amount_usd":5000.00,"trade_id":"a3f92b44","latency_ms":89}
{"timestamp":"2026-03-07T14:30:00.001Z","level":"WARN","event":"rate_limit_hit","agent_id":"casino-agent-07","endpoint":"/v1/casino/bet","status_code":429,"retry_after_ms":1000}
{"timestamp":"2026-03-07T14:32:14.777Z","level":"ERROR","event":"escrow_failed","agent_id":"escrow-settler-02","escrow_id":"esc_f9a211","error":"insufficient_funds","amount_usd":1250.00,"counterparty":"agent-99"}
Python: Structured Logging Setup
Two excellent Python libraries for structured logging are structlog (highly configurable, explicit context binding) and loguru (minimal boilerplate, great for smaller agent scripts). Both output JSON that Loki can parse natively.
Using structlog
pip install structlog
import sys

import structlog

def configure_structlog():
    """Configure structlog for Purple Flea agent structured JSON output."""
    structlog.configure(
        processors=[
            # Add timestamp in ISO 8601 UTC
            structlog.processors.TimeStamper(fmt="iso", utc=True),
            # Add log level
            structlog.processors.add_log_level,
            # Add caller info (file, line)
            structlog.processors.CallsiteParameterAdder(
                [structlog.processors.CallsiteParameter.FILENAME,
                 structlog.processors.CallsiteParameter.LINENO]
            ),
            # Render as JSON
            structlog.processors.JSONRenderer(),
        ],
        context_class=dict,
        logger_factory=structlog.PrintLoggerFactory(file=sys.stdout),
    )

configure_structlog()
log = structlog.get_logger()

# Bind agent context once — propagates to all subsequent calls
log = log.bind(
    agent_id="basis-agent-01",
    strategy="basis_convergence",
    env="production",
)

# Event logging examples
log.info("scan_complete", opportunities_found=3, scan_duration_ms=218)
log.info("order_placed", symbol="BTCUSDT", side="buy", amount_usd=5000.0,
         trade_id="a3f92b44", latency_ms=143)
log.warning("rate_limit_hit", endpoint="/v1/trading/order", status_code=429,
            retry_after_ms=1000)
log.error("order_failed", symbol="ETHUSDT", error="insufficient_margin",
          amount_usd=3000.0, trade_id="b8c01d22")
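The latency_ms field shown above is easiest to keep accurate with a small timing helper around each API call. Below is a sketch; `timed_event` is our own name, not a structlog API, and it works with any logger object exposing .info() and .error():

```python
import time
from contextlib import contextmanager

@contextmanager
def timed_event(log, event, **fields):
    """Wrap an API call: emit `event` with measured latency_ms on success,
    or `<event>_failed` with the error message if an exception escapes.

    Works with any structlog-style logger exposing .info() / .error().
    """
    start = time.monotonic()
    try:
        yield
    except Exception as exc:
        log.error(f"{event}_failed", error=str(exc),
                  latency_ms=int((time.monotonic() - start) * 1000), **fields)
        raise
    else:
        log.info(event,
                 latency_ms=int((time.monotonic() - start) * 1000), **fields)
```

Usage (with `place_order` standing in for a hypothetical API client call): `with timed_event(log, "order_placed", symbol="BTCUSDT"): place_order(...)`. This keeps latency measurement in one place instead of scattered `time.monotonic()` pairs.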
Using loguru (Minimal Setup)
pip install loguru
import json
import sys

from loguru import logger

AGENT_ID = "basis-agent-01"
STRATEGY = "basis_convergence"

def json_sink(message):
    """Custom JSON sink for loguru — outputs one JSON object per line."""
    record = message.record
    log_entry = {
        "timestamp": record["time"].isoformat(),
        "level": record["level"].name,
        "agent_id": AGENT_ID,
        "strategy": STRATEGY,
        "event": record["message"],
        **record["extra"],  # Any .bind() context and per-call keyword fields
    }
    print(json.dumps(log_entry), flush=True)

# Remove default handler, add JSON handler
logger.remove()
logger.add(json_sink, level="INFO")

# Optional: also write to rotating file (Promtail can read this).
# Note: serialize=True uses loguru's own JSON envelope, which nests fields
# under "record" — adjust your Promtail json expressions if you ship this file.
logger.add(
    "/var/log/purple-flea/basis-agent-01.log",
    level="INFO",
    rotation="100 MB",
    retention="30 days",
    serialize=True,  # loguru's built-in JSON serialization
)

# Usage — bind() returns a contextualized logger
agent_log = logger.bind(agent_id=AGENT_ID, strategy=STRATEGY)
agent_log.info("agent_started", capital_usd=50000.0, symbols=["BTC", "ETH", "SOL"])
agent_log.info("order_placed", symbol="BTCUSDT", side="buy", amount_usd=5000.0,
               trade_id="a3f92b44", latency_ms=143)
agent_log.warning("high_basis_blowout_risk", symbol="SOLANA", current_basis_pct=0.92,
                  entry_basis_pct=0.31)
agent_log.error("emergency_exit_triggered", symbol="AVAX", reason="margin_too_low",
                margin_ratio=1.28)
Always log to stdout first. When the agent runs under systemd, stdout lands in the journal, which Promtail can scrape; for file-based shipping, add a second file sink as shown above. Logging to stdout keeps your agent container-friendly (it works in Docker, Kubernetes, and bare PM2 deployments without filesystem changes).
Promtail Configuration for Agent Log Shipping
Promtail is the official Loki log shipper. It tails files or reads from systemd journal, attaches labels, and pushes to Loki. Here is a complete promtail-config.yml for a multi-agent Purple Flea deployment:
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/promtail-positions.yaml

clients:
  - url: http://localhost:3100/loki/api/v1/push
    # If using Grafana Cloud Loki:
    # url: https://logs-prod-us-central1.grafana.net/loki/api/v1/push
    # basic_auth:
    #   username: YOUR_GRAFANA_CLOUD_USER_ID
    #   password: YOUR_GRAFANA_CLOUD_API_KEY

scrape_configs:
  # ── Trading agents log files ──────────────────────────────────────────────
  - job_name: purple-flea-trading-agents
    static_configs:
      - targets:
          - localhost
        labels:
          job: trading-agents
          env: production
          service: purple-flea
          __path__: /var/log/purple-flea/trading/*.log
    pipeline_stages:
      # Parse JSON log lines. The timestamp field must be extracted here so
      # the timestamp stage below can read it from the extracted map.
      - json:
          expressions:
            timestamp: timestamp
            level: level
            agent_id: agent_id
            strategy: strategy
            event: event
            symbol: symbol
            trade_id: trade_id
            amount_usd: amount_usd
            latency_ms: latency_ms
            error: error
      # Promote parsed fields to Loki labels (label-indexed, queryable without scan)
      - labels:
          level:
          agent_id:
          strategy:
          event:
      # Set log timestamp from the JSON field (not the ingest time)
      - timestamp:
          source: timestamp
          format: RFC3339Nano

  # ── Casino agents ─────────────────────────────────────────────────────────
  - job_name: purple-flea-casino-agents
    static_configs:
      - targets: [localhost]
        labels:
          job: casino-agents
          env: production
          service: purple-flea
          __path__: /var/log/purple-flea/casino/*.log
    pipeline_stages:
      - json:
          expressions:
            timestamp: timestamp
            level: level
            agent_id: agent_id
            event: event
            session_id: session_id
            bet_amount_usd: bet_amount_usd
            game: game
      - labels:
          level:
          agent_id:
          event:
          game:
      - timestamp:
          source: timestamp
          format: RFC3339Nano

  # ── Escrow agents ─────────────────────────────────────────────────────────
  - job_name: purple-flea-escrow-agents
    static_configs:
      - targets: [localhost]
        labels:
          job: escrow-agents
          env: production
          service: purple-flea
          __path__: /var/log/purple-flea/escrow/*.log
    pipeline_stages:
      - json:
          expressions:
            timestamp: timestamp
            level: level
            agent_id: agent_id
            event: event
            escrow_id: escrow_id
            amount_usd: amount_usd
            counterparty: counterparty
            error: error
      - labels:
          level:
          agent_id:
          event:
      - timestamp:
          source: timestamp
          format: RFC3339Nano

  # ── Wallet agents ─────────────────────────────────────────────────────────
  - job_name: purple-flea-wallet-agents
    static_configs:
      - targets: [localhost]
        labels:
          job: wallet-agents
          env: production
          service: purple-flea
          __path__: /var/log/purple-flea/wallet/*.log
    pipeline_stages:
      - json:
          expressions:
            timestamp: timestamp
            level: level
            agent_id: agent_id
            event: event
            chain: chain
            tx_hash: tx_hash
            amount_usd: amount_usd
      - labels:
          level:
          agent_id:
          event:
          chain:
      - timestamp:
          source: timestamp
          format: RFC3339Nano
Label cardinality warning: Do not add high-cardinality fields like trade_id, tx_hash, or amount_usd as Loki labels. Use them as parsed fields within log lines instead. High-cardinality labels cause Loki to create thousands of streams and significantly degrade performance.
Installing Promtail
# Download latest Promtail binary (replace VERSION with latest)
PROMTAIL_VERSION="3.0.0"
wget "https://github.com/grafana/loki/releases/download/v${PROMTAIL_VERSION}/promtail-linux-amd64.zip"
unzip promtail-linux-amd64.zip
sudo mv promtail-linux-amd64 /usr/local/bin/promtail
sudo chmod +x /usr/local/bin/promtail
# Install the config where the systemd unit below expects it
sudo mkdir -p /etc/promtail
sudo cp promtail-config.yml /etc/promtail/promtail-config.yml
# Create log directories for agents
sudo mkdir -p /var/log/purple-flea/{trading,casino,escrow,wallet}
sudo chown -R "$(whoami)": /var/log/purple-flea  # Let the agent user write; avoid world-writable 777
# Run Promtail as a systemd service
sudo tee /etc/systemd/system/promtail.service <<EOF
[Unit]
Description=Promtail - Loki log shipper for Purple Flea agents
After=network.target
[Service]
ExecStart=/usr/local/bin/promtail -config.file=/etc/promtail/promtail-config.yml
Restart=on-failure
RestartSec=5s
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable promtail
sudo systemctl start promtail
sudo systemctl status promtail
LogQL Queries for Agent Debugging
LogQL is Loki's query language. It reads like a combination of PromQL and grep. Here are the most useful queries for Purple Flea agent operations — ready to paste into Grafana Explore.
Error Rate Queries
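The following queries are sketches built on the labels defined in the Promtail configuration above (job, agent_id, level, event); adjust the selectors if your label scheme differs.

```logql
# All ERROR events across every agent job
{job=~".*-agents", level="ERROR"}

# Error rate (events/sec) per agent over 5 minutes
sum(rate({job=~".*-agents", level="ERROR"}[5m])) by (agent_id)

# Error percentage per agent: errors divided by all events
(
  sum(rate({job=~".*-agents", level="ERROR"}[5m])) by (agent_id)
  /
  sum(rate({job=~".*-agents"}[5m])) by (agent_id)
) * 100

# Occurrences of a specific error code in the last hour
sum(count_over_time({job="trading-agents"} | json | error="rate_limit_exceeded" [1h]))
```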
Trade Latency Queries
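Latency queries use `unwrap` on the latency_ms field parsed from the JSON body; these are sketches against the label scheme defined earlier.

```logql
# P95 order-placement latency per agent over 10 minutes
quantile_over_time(0.95,
  {job="trading-agents", event="order_placed"}
  | json
  | unwrap latency_ms [10m]
) by (agent_id)

# Average order-placement latency per agent
avg_over_time(
  {job="trading-agents", event="order_placed"} | json | unwrap latency_ms [5m]
) by (agent_id)

# Individual slow orders (over 1 second)
{job="trading-agents", event="order_placed"} | json | latency_ms > 1000
```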
Wallet Event Queries
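Wallet queries lean on the chain label and the parsed tx_hash and amount_usd fields; the specific chain name and hash below are illustrative values, not fixtures from this deployment.

```logql
# All wallet events on one chain (chain value is illustrative)
{job="wallet-agents", chain="ethereum"}

# Total USD moved through wallets per chain over 1 hour
sum(sum_over_time(
  {job="wallet-agents"} | json | unwrap amount_usd [1h]
)) by (chain)

# Look up a specific transaction by hash (tx_hash is a parsed field, not a label)
{job="wallet-agents"} | json | tx_hash = "0xabc123"
```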
Escrow Event Queries
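Escrow queries combine the event label (promoted in the Promtail pipeline) with parsed fields; the event name escrow_settled is an assumed counterpart to the escrow_failed events shown earlier.

```logql
# All failed escrow settlements
{job="escrow-agents", event="escrow_failed"}

# Escrow failures grouped by error code over 1 hour
sum(count_over_time(
  {job="escrow-agents", event="escrow_failed"} | json [1h]
)) by (error)

# Full lifecycle of one escrow by ID (parsed field, not a label)
{job="escrow-agents"} | json | escrow_id = "esc_f9a211"

# Settlement volume in USD over 24 hours (assumes an escrow_settled event)
sum(sum_over_time(
  {job="escrow-agents", event="escrow_settled"} | json | unwrap amount_usd [24h]
))
```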
Grafana Explore for Agent Debugging
Grafana Explore is the primary interface for ad-hoc log investigation. The key workflow for debugging a specific agent incident:
1. Scope the time range. In Grafana Explore, set the time range to the window when the incident occurred. Use the "Last 1h" or custom range picker. Narrow the window once you identify the anomaly period.
2. Filter to the failing agent. Begin with {job="trading-agents", agent_id="basis-agent-01", level="ERROR"} to see only errors from the specific agent. Expand to WARN if errors are sparse.
3. Pivot on the correlation ID. Add | json and look for the trade_id field on the first error event. Now filter on that trade_id: | trade_id = "a3f92b44" to see the complete lifecycle of that specific trade.
4. Trace across agents. The same trade_id may appear in wallet agent logs (for the funding transaction) and escrow agent logs (if the trade involves settlement). Use {job=~".*-agents"} | json | trade_id = "a3f92b44" to see the full multi-agent trace.
5. Switch to the metrics view. Toggle to the "Metrics" view in Explore for rate() and quantile_over_time() queries. The chart view makes it easy to spot spikes in error rates or latency that preceded the incident.
Correlate with Prometheus metrics: Grafana allows mixed datasource queries. If your agents also expose a /metrics endpoint (even a simple one with trade count and error count), you can overlay log stream panels with Prometheus metric charts in the same dashboard, making it easy to see if a log error rate spike correlates with a drop in trade throughput.
Alert Rules on Agent Log Patterns
Grafana Loki's Ruler component supports LogQL-based alert rules. These fire via Grafana Alertmanager and can route to Slack, PagerDuty, email, or any webhook.
Ruler Configuration
groups:
  - name: purple-flea-agent-alerts
    interval: 1m
    rules:
      # ── API Error Rate Alert ────────────────────────────────────────────────
      - alert: AgentHighErrorRate
        expr: |
          (
            sum(rate({job=~".*-agents", level="ERROR"}[5m])) by (agent_id, job)
            /
            sum(rate({job=~".*-agents"}[5m])) by (agent_id, job)
          ) * 100 > 5
        for: 3m
        labels:
          severity: warning
          team: infra
        annotations:
          summary: "Agent {{ $labels.agent_id }} error rate above 5%"
          description: "Agent {{ $labels.agent_id }} in job {{ $labels.job }} has {{ $value | printf \"%.1f\" }}% error rate over 5 minutes. Investigate logs immediately."
          runbook_url: "https://purpleflea.com/docs/troubleshooting"

      # ── Escrow Failure Alert ────────────────────────────────────────────────
      - alert: EscrowFailureDetected
        expr: |
          sum(count_over_time({job="escrow-agents", event="escrow_failed"}[5m])) > 0
        for: 0m
        labels:
          severity: critical
          team: finance
        annotations:
          summary: "Escrow failure detected"
          description: "One or more escrow settlements failed in the last 5 minutes. Check escrow-agents logs immediately."
          runbook_url: "https://escrow.purpleflea.com/docs/failures"

      # ── Unusual Trade Size Alert ────────────────────────────────────────────
      - alert: UnusuallyLargeTradeDetected
        expr: |
          max_over_time(
            {job="trading-agents", event="order_placed"}
            | json
            | unwrap amount_usd [5m]
          ) by (agent_id) > 50000
        for: 0m
        labels:
          severity: warning
          team: risk
        annotations:
          summary: "Unusually large trade from {{ $labels.agent_id }}"
          description: "Agent {{ $labels.agent_id }} placed a trade larger than $50,000 USD. Verify this is intentional."

      # ── API Rate Limit Saturation ───────────────────────────────────────────
      - alert: RateLimitSaturation
        expr: |
          sum(count_over_time({job=~".*-agents"} | json | status_code = 429 [5m])) by (agent_id) > 10
        for: 2m
        labels:
          severity: warning
          team: infra
        annotations:
          summary: "Agent {{ $labels.agent_id }} hitting rate limits frequently"
          description: "Agent {{ $labels.agent_id }} has been rate-limited more than 10 times in 5 minutes. Check request frequency and add backoff."

      # ── Agent Silence Alert (Dead Agent Detection) ──────────────────────────
      # A fully silent agent produces no series at all, so a query like
      # `count_over_time(...) == 0` would never fire. absent_over_time() on an
      # explicit stream selector works; define one rule per known agent_id.
      - alert: AgentSilent
        expr: |
          absent_over_time({job="trading-agents", agent_id="basis-agent-01"}[15m])
        for: 5m
        labels:
          severity: critical
          team: infra
        annotations:
          summary: "Agent {{ $labels.agent_id }} has stopped producing logs"
          description: "No log events received from {{ $labels.agent_id }} for 15+ minutes. The agent may have crashed or lost connectivity."

      # ── High Order Latency Alert ────────────────────────────────────────────
      - alert: HighOrderLatency
        expr: |
          quantile_over_time(0.95,
            {job="trading-agents", event="order_placed"}
            | json
            | unwrap latency_ms [10m]
          ) by (agent_id) > 2000
        for: 5m
        labels:
          severity: warning
          team: infra
        annotations:
          summary: "P95 order latency > 2s for {{ $labels.agent_id }}"
          description: "95th percentile order placement latency is {{ $value }}ms for agent {{ $labels.agent_id }}. Purple Flea API may be degraded or network path is slow."
Alertmanager Routing for Finance Teams
global:
  resolve_timeout: 5m

route:
  group_by: [alertname, agent_id]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
  receiver: default
  routes:
    - match:
        severity: critical
      receiver: pagerduty-critical
      continue: true
    - match:
        team: finance
      receiver: slack-finance
    - match:
        team: risk
      receiver: slack-risk

receivers:
  - name: default
    slack_configs:
      - api_url: "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
        channel: "#agent-alerts"
        title: "{{ .CommonAnnotations.summary }}"
        text: "{{ .CommonAnnotations.description }}"
  - name: pagerduty-critical
    pagerduty_configs:
      - routing_key: "YOUR_PAGERDUTY_KEY"
        description: "{{ .CommonAnnotations.summary }}"
  - name: slack-finance
    slack_configs:
      - api_url: "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
        channel: "#finance-alerts"
        title: "[FINANCE] {{ .CommonAnnotations.summary }}"
        text: "{{ .CommonAnnotations.description }}"
  - name: slack-risk
    slack_configs:
      - api_url: "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
        channel: "#risk-alerts"
        title: "[RISK] {{ .CommonAnnotations.summary }}"
        text: "{{ .CommonAnnotations.description }}"
Log Retention and Cost Optimization
Financial agent logs have compliance implications — you may need to retain certain trade logs for regulatory audit purposes. At the same time, high-volume debug logs can be expensive to store indefinitely. Here is a practical retention strategy:
| Log Type | Volume | Recommended Retention | Reason |
|---|---|---|---|
| Trade execution events | Low | 7 years | Financial audit trail, regulatory compliance |
| Escrow settlement logs | Low | 7 years | Contractual and dispute resolution evidence |
| Wallet transaction logs | Low | 5 years | Tax reporting, AML compliance |
| API error and warning logs | Medium | 90 days | Debugging and incident investigation |
| Debug/trace level logs | High | 7 days | Active debugging only; expensive to retain |
| Scan/monitoring heartbeat logs | Very high | 3 days | Operational visibility only; high volume |
Configuring Per-Stream Retention in Loki
compactor:
  working_directory: /loki/compactor
  shared_store: filesystem
  compaction_interval: 10m
  retention_enabled: true
  retention_delete_delay: 2h
  retention_delete_worker_count: 150

# Global default retention is 90 days; per-stream overrides are configured
# via limits_config.retention_stream (not the ruler), matching on stream labels.
limits_config:
  retention_period: 2160h  # 90 days default
  retention_stream:
    # Trade execution logs → 7 years (7 × 365 × 24 = 61320h)
    - selector: '{job="trading-agents"}'
      priority: 2
      period: 61320h
    # Escrow settlement logs → 7 years
    - selector: '{job="escrow-agents"}'
      priority: 2
      period: 61320h
    # Debug/heartbeat logs → 7 days
    - selector: '{level="DEBUG"}'
      priority: 1
      period: 168h
Cost Reduction Techniques
- Log sampling for high-frequency events: Market scan events that occur every 60 seconds can be sampled at 10% without losing meaningful observability. Only log full detail when the scan finds an opportunity.
- Reduce label cardinality: Every unique label combination creates a new Loki stream. Keep labels to < 10 values per dimension (e.g., 5 strategy types, not 1000 trade IDs as labels).
- Compression: Enable snappy or zstd compression for chunks. Loki's default is reasonable but zstd gives 10–20% better compression on JSON logs.
- Object storage for long-term: Move chunks older than 7 days to S3-compatible storage (AWS S3, MinIO, Backblaze B2). Storage costs drop from ~$0.10/GB/month (SSD) to ~$0.006/GB/month (object store).
- Do not log PII or secrets: Never log wallet private keys, API secret keys, or personally identifiable information. Use pf_live_-prefixed API keys in any config references — never sk_live_-prefixed values from other services.
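The sampling technique from the first bullet can be sketched as a small wrapper. The function name and event fields are illustrative; the logic is what matters: always log scans that find something, and keep only a fraction of the empty ones for baseline visibility.

```python
import random

def log_scan_result(log, opportunities, scan_duration_ms, sample_rate=0.10):
    """Log every scan that finds opportunities; sample empty scans at sample_rate.

    `log` is any structlog-style logger exposing .info(); names are illustrative.
    """
    if opportunities:
        # Always log in full when there is something actionable.
        log.info("scan_complete", opportunities_found=len(opportunities),
                 scan_duration_ms=scan_duration_ms, sampled=False)
    elif random.random() < sample_rate:
        # Keep roughly sample_rate of empty scans for latency/heartbeat visibility.
        log.info("scan_complete", opportunities_found=0,
                 scan_duration_ms=scan_duration_ms, sampled=True)
```

The `sampled` field makes it possible to scale counts back up in LogQL if you ever need an estimate of the true scan rate.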
Security reminder: Loki ingests raw log text. If your agents inadvertently log API keys, wallet mnemonics, or user data, that information will be stored in your Loki backend. Always redact sensitive values before logging. Use a structlog processor or loguru filter to scrub keys matching patterns like pf_live_* from log output.
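A redaction processor of the kind described above can be sketched as follows. The pf_live_/sk_live_ prefixes follow this guide; the processor itself is our own sketch, wired in by adding it to the `processors` list before `JSONRenderer` in `structlog.configure()`:

```python
import re

# Patterns for values that must never reach Loki; extend as needed.
SECRET_PATTERNS = [
    re.compile(r"pf_live_[A-Za-z0-9]+"),
    re.compile(r"sk_live_[A-Za-z0-9]+"),
]

def scrub_secrets(logger, method_name, event_dict):
    """structlog processor: replace secret-looking substrings in all string fields."""
    for key, value in event_dict.items():
        if isinstance(value, str):
            for pattern in SECRET_PATTERNS:
                value = pattern.sub("[REDACTED]", value)
            event_dict[key] = value
    return event_dict
```

This scrubs only string-typed fields; if secrets can appear nested inside lists or dicts, recurse into those containers as well.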
Sample Grafana Dashboard Panels
A complete Purple Flea agent observability dashboard should include these panels:
Agent Trade Volume (24h)
Time series of total USD notional traded per agent. Shows agent activity levels and detects sudden drops indicating agent failure.
Error Rate by Agent
Error events per minute per agent as a time series. Alert threshold line at 5%. Color coding: green/yellow/red.
Order Latency Heatmap
Heatmap of order placement latency distribution. Reveals tail latency spikes invisible in avg/p50 metrics.
Escrow Activity Stream
Live log panel showing all escrow events: created, settled, failed. Color-coded by outcome.
Wallet Events Timeline
Log panel of all on-chain transactions with amount and chain label. Useful for wallet agent audit trail.
Casino Session Stats
Casino agent bet volume, win/loss events, and session durations aggregated over time.
Dashboard tip: Add a "Latest Errors" log panel at the top of your dashboard showing {job=~".*-agents", level="ERROR"} | json | line_format "{{.agent_id}} | {{.event}} | {{.error}}". This gives any operator an immediate snapshot of what's going wrong without needing to open Explore first.
Start Building Observable Agents on Purple Flea
Get an API key, deploy your agent with structured logging, and have full Loki observability running in under 30 minutes. New agents can claim free starting capital from the Purple Flea Faucet.
Get API Key · Claim Free Capital