
Building Crypto Agents with Haystack and Purple Flea

Haystack is one of the most composable agent frameworks available today. Its pipeline architecture, YAML serialization, and @component abstraction make it straightforward to build production RAG and agent systems. This tutorial shows you how to extend a Haystack pipeline with real financial operations (crypto wallets, perpetual futures trading, and casino games) using the purpleflea-haystack package.

What Makes Haystack Unique Among Agent Frameworks

Haystack, maintained by deepset, occupies a distinct position among agent frameworks. Where LangChain and LlamaIndex are organized primarily around chains and retrieval indexes, Haystack is a pipeline framework: you define components with explicit inputs and outputs, wire them together into a directed graph, and run the graph. In Haystack 2.x each component is a plain Python class marked with the @component decorator (no base class is required). Its run() method takes typed inputs and declares its output types via @component.output_types.

This architecture has important implications for financial operations. Because every component has explicit, typed inputs and outputs, you can reason precisely about what data flows between your retrieval step, your LLM step, and your financial action step. There are no implicit side effects hidden in callback chains. The pipeline graph is serializable to YAML, enabling version-controlled pipeline definitions that can be replayed, audited, and deployed reproducibly.

Haystack is also notably good at RAG — retrieval-augmented generation. Its document stores, embedders, and retrievers are production-quality and actively maintained. The combination of high-quality RAG infrastructure with a composable pipeline architecture makes Haystack excellent for data-driven trading agents: retrieve recent market analysis from a document store, feed it to an LLM for signal extraction, and execute trades based on the extracted signals — all in a single, auditable pipeline.

Why Combine RAG with Financial Operations

The core idea behind a Haystack + Purple Flea integration is that a trading decision informed by retrieved context can weigh more evidence than a purely rule-based one. A trading agent that retrieves recent news, earnings reports, or on-chain analytics before placing a trade has substantially more context than one acting on price data alone.

Consider the typical pattern without RAG: the agent checks the current price of BTC, compares it to a moving average, and places a trade if the spread exceeds a threshold. This is a purely mechanical strategy that ignores qualitative signals entirely.
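
As a point of reference, the mechanical strategy described above might look like the following minimal sketch. The function name, the 20-period window, the 2% threshold, and the trend-following direction (buy above the average, sell below it) are all assumptions chosen for illustration, not part of any Purple Flea or Haystack API.

```python
from statistics import fmean
from typing import Optional, Sequence


def moving_average_signal(
    prices: Sequence[float], window: int = 20, threshold: float = 0.02
) -> Optional[str]:
    """Return "buy", "sell", or None from the price/moving-average spread."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(prices) < window:
        return None  # not enough history to compute the average

    average = fmean(prices[-window:])
    # Relative spread between the latest price and its moving average.
    spread = (prices[-1] - average) / average

    if spread > threshold:
        return "buy"   # price is well above its average
    if spread < -threshold:
        return "sell"  # price is well below its average
    return None        # spread is inside the band: no trade


flat = [100.0] * 20
print(moving_average_signal(flat))            # flat series: no trade
print(moving_average_signal(flat + [110.0]))  # jump above the average
```

Nothing in this function can see a news headline or a funding-rate anomaly, which is the gap the retrieval step is meant to fill.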

With Haystack RAG, the same agent can first query a document store of recent news, earnings, or social sentiment data, extract the salient signals via an LLM, and incorporate those signals into its trading decision. The pipeline structure makes the data flow explicit: retrieval output feeds directly into the LLM input, and LLM output feeds directly into the trading component input.

Haystack's pipeline model is the right abstraction for financial agents because financial decisions have explicit data dependencies. You can draw a DAG of what information feeds what decision — and Haystack's component graph is exactly that DAG in executable form.
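
To make the DAG idea concrete, here is a small sketch that uses only the Python standard library rather than Haystack itself. The node names are illustrative labels for the stages discussed above (retrieval, live market data, prompt building, LLM, trade execution); Haystack resolves an equivalent ordering internally when it runs a pipeline.

```python
from graphlib import TopologicalSorter

# Each key depends on the nodes in its set (its upstream inputs).
dependencies = {
    "prompt": {"retriever", "market_data"},
    "llm": {"prompt"},
    "trader": {"llm"},
}

# static_order() yields every node only after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

The two data sources may appear in either order, but the prompt always follows both of them and the trader always comes last, which is exactly the guarantee a financial pipeline needs.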

Installation

The Purple Flea Haystack integration is available on PyPI. It depends on haystack-ai and requests. No blockchain SDK is required.

terminal
pip install purpleflea-haystack haystack-ai openai

Set your Purple Flea API key as an environment variable. No additional configuration is needed — the package reads this key automatically when any Purple Flea component is instantiated:

terminal
export PURPLE_FLEA_API_KEY="pf_your_key_here"
export OPENAI_API_KEY="sk-..."
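
Because a missing key would otherwise surface only when a component is instantiated or first calls an API, it can help to check the environment up front. The helper below is a hypothetical convenience, not something shipped with purpleflea-haystack; it only inspects the two variable names shown above.

```python
import os
from typing import List, Mapping, Sequence

REQUIRED_VARS = ("PURPLE_FLEA_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(
    names: Sequence[str] = REQUIRED_VARS,
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the names that are unset or blank in the given environment."""
    return [name for name in names if not env.get(name, "").strip()]


# Report anything missing before building the pipeline.
print("Missing variables:", missing_env_vars())
```

Passing the environment as a parameter keeps the check easy to exercise in tests without touching real credentials.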

Component Overview

The purpleflea-haystack package provides five components. Each is a drop-in Haystack component that you add to a pipeline exactly like any built-in component:

1. PurpleFleasWalletComponent: Create wallets, check balances, send transfers. Outputs: address, balance_usdc, tx_id.
2. PurpleFleasTradeComponent: Open and close perpetual futures positions. Inputs: market, side, size, leverage. Outputs: position_id, fill_price, pnl.
3. PurpleFleasCasinoComponent: Place casino bets and retrieve results. Inputs: game, amount, params. Outputs: outcome, pnl, proof.
4. PurpleFleasMarketDataComponent: Fetch mark prices, funding rates, and order book data. Outputs documents compatible with Haystack's DocumentStore interface.
5. PurpleFleasEscrowComponent: Create and release escrow agreements between agents. Inputs: payee, amount, description. Outputs: escrow_id, status.

Full Pipeline: Document Retriever + LLM + Trade Execution

The following example builds a complete Haystack pipeline that retrieves market analysis documents, passes them to an OpenAI LLM for signal extraction, and routes the extracted signals to the Purple Flea trading component. The pipeline is fully typed and serializable.

rag_trading_pipeline.py
from haystack import Pipeline
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.dataclasses import Document
from purpleflea_haystack import PurpleFleasTradeComponent, PurpleFleasMarketDataComponent

# --------------- Document store with market analysis docs ---------------
doc_store = InMemoryDocumentStore()
doc_store.write_documents([
    Document(content="Bitcoin ETF inflows hit $500M today. Institutional demand accelerating."),
    Document(content="Fed holds rates. Risk-on sentiment expected. BTC historically rallies."),
    Document(content="On-chain metrics: exchange reserves at 4-year low. Supply squeeze thesis intact."),
    Document(content="Altcoins outperforming BTC on 7-day basis. Possible rotation signal."),
])

# --------------- Prompt: extract trading signal from documents ---------------
SIGNAL_PROMPT = """
You are a crypto trading signal extractor.
Given the following market analysis documents, output a JSON object with:
  - "market": which perpetual market to trade (e.g. "BTC-PERP", "ETH-PERP")
  - "side": "buy" for long or "sell" for short
  - "confidence": 0.0 to 1.0
  - "reason": one-sentence rationale

Documents:
{% for doc in documents %}
- {{ doc.content }}
{% endfor %}

Live market data:
{% for doc in market_data %}
- {{ doc.content }}
{% endfor %}

Respond with only the JSON object.
"""

# --------------- Build the pipeline ---------------
pipe = Pipeline()

pipe.add_component("retriever",   InMemoryBM25Retriever(document_store=doc_store))
pipe.add_component("market_data", PurpleFleasMarketDataComponent(markets=["BTC-PERP", "ETH-PERP"]))
pipe.add_component("prompt",      PromptBuilder(template=SIGNAL_PROMPT))
pipe.add_component("llm",         OpenAIGenerator(model="gpt-4o"))
pipe.add_component("trader",      PurpleFleasTradeComponent(leverage=3,
                                                              max_size_usd=50.0,
                                                              confidence_threshold=0.7))

# Connect components: retriever + market_data → prompt → LLM → trader
pipe.connect("retriever.documents",      "prompt.documents")
pipe.connect("market_data.documents",    "prompt.market_data")
pipe.connect("prompt.prompt",            "llm.prompt")
pipe.connect("llm.replies",              "trader.signal_json")

# Run: provide the retrieval query and let the pipeline do the rest
result = pipe.run({
    "retriever": {"query": "bitcoin outlook institutional sentiment"},
})

print(f"Signal: {result['trader']['signal']}")
print(f"Order:  {result['trader']['order']}")

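The pipeline is explicit and auditable. Every data flow is declared as a component connection. Note that run() returns only the outputs that no downstream component consumes, so by default the result contains the trader's output but not the retrieved documents or the full prompt. To capture intermediate outputs for an audit trail, pass include_outputs_from={"retriever", "llm"} to run(). The PurpleFleasTradeComponent only executes a trade if the LLM-extracted confidence exceeds the threshold; low-confidence signals result in no trade, protecting capital.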
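
The internals of PurpleFleasTradeComponent are not shown in this tutorial, but the confidence gate it applies can be sketched in plain Python. Everything below is an assumption made for illustration: the function name, the field checks, and the choice to treat a signal exactly at the threshold as not exceeding it.

```python
import json
from typing import List, Optional

VALID_SIDES = {"buy", "sell"}


def parse_signal(replies: List[str], confidence_threshold: float = 0.7) -> Optional[dict]:
    """Return the parsed signal if it is well formed and confident enough, else None."""
    if not replies:
        return None
    text = replies[0].strip()
    # Models sometimes wrap JSON in prose; keep only the outermost {...} span.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        signal = json.loads(text[start : end + 1])
    except json.JSONDecodeError:
        return None

    if not isinstance(signal.get("market"), str) or signal.get("side") not in VALID_SIDES:
        return None
    confidence = signal.get("confidence")
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return None
    # Reject out-of-range values and anything that does not exceed the threshold.
    if not 0.0 <= confidence <= 1.0 or confidence <= confidence_threshold:
        return None
    return signal


reply = '{"market": "BTC-PERP", "side": "buy", "confidence": 0.82, "reason": "ETF inflows"}'
print(parse_signal([reply]))
```

Returning None for every malformed or low-confidence reply keeps the failure mode simple: when in doubt, the agent does nothing.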

Pipeline YAML Serialization

One of Haystack's strongest features is that pipelines are serializable to YAML. This enables version-controlled pipeline definitions, reproducible deployments, and clean separation between pipeline structure and runtime configuration:

serialize_pipeline.py
from haystack import Pipeline

# Save to YAML; commit the file to git for reproducibility
with open("rag_trading_pipeline.yml", "w") as f:
    pipe.dump(f)

# Reload from YAML. The component classes must still be importable, and in a
# fresh process the in-memory document store starts empty, so re-index first.
with open("rag_trading_pipeline.yml") as f:
    pipe2 = Pipeline.load(f)

result2 = pipe2.run({"retriever": {"query": "ethereum layer2 tvl growth"}})
result2 = pipe2.run({"retriever": {"query": "ethereum layer2 tvl growth"}})

# The serialized form looks like this (excerpt):
# components:
#   trader:
#     type: purpleflea_haystack.PurpleFleasTradeComponent
#     init_parameters:
#       leverage: 3
#       max_size_usd: 50.0
#       confidence_threshold: 0.7

YAML serialization enables a clean workflow for production pipelines: develop and test in Python, serialize to YAML when the pipeline is stable, commit the YAML to version control, and deploy by loading from YAML in production. The API key is injected at runtime from environment variables — it is never stored in the YAML file.

Advanced: Web Search Retriever + Market Data

For agents that need truly up-to-date information rather than a static document store, Haystack's SerperDevWebSearch component can replace or augment the static document store. It calls the Serper.dev search API and reads its key from the SERPERDEV_API_KEY environment variable. Combined with the PurpleFleasMarketDataComponent for live price data, this gives the pipeline both fresh qualitative context (news, social sentiment) and quantitative price context (mark price, funding rate, open interest) before making a trading decision:

web_search_pipeline.py
from haystack import Pipeline
from haystack.components.websearch import SerperDevWebSearch
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder
from purpleflea_haystack import (
    PurpleFleasTradeComponent,
    PurpleFleasMarketDataComponent,
    PurpleFleasWalletComponent,
)

COMBINED_PROMPT = """
Market data (live prices and funding rates):
{% for doc in market_docs %}
{{ doc.content }}
{% endfor %}

Latest news (web search results):
{% for doc in news_docs %}
{{ doc.content }}
{% endfor %}

Based on the above, extract:
  - "market": "BTC-PERP" | "ETH-PERP" | "SOL-PERP" | "HOLD"
  - "side": "buy" | "sell" (only if market != "HOLD")
  - "size_usd": float between 10 and 100
  - "confidence": 0.0 to 1.0
  - "reasoning": brief explanation

Output JSON only.
"""

pipe = Pipeline()

# Live data sources
pipe.add_component("market_data", PurpleFleasMarketDataComponent(
    markets=["BTC-PERP", "ETH-PERP", "SOL-PERP"],
    include_funding_rate=True,
    include_open_interest=True,
))
pipe.add_component("web_search", SerperDevWebSearch(top_k=5))

# LLM signal extraction
pipe.add_component("prompt", PromptBuilder(template=COMBINED_PROMPT))
pipe.add_component("llm",    OpenAIGenerator(model="gpt-4o"))

# Financial action layer
pipe.add_component("wallet",  PurpleFleasWalletComponent())
pipe.add_component("trader",  PurpleFleasTradeComponent(
    leverage=5,
    confidence_threshold=0.75,
))

# Wire everything together
pipe.connect("market_data.documents",  "prompt.market_docs")
pipe.connect("web_search.documents",   "prompt.news_docs")
pipe.connect("prompt.prompt",          "llm.prompt")
pipe.connect("wallet.balance_usdc",    "trader.available_capital")
pipe.connect("llm.replies",            "trader.signal_json")

result = pipe.run({
    "web_search": {"query": "bitcoin crypto market news today 2026"},
})

if result["trader"]["order"]:
    print(f"Executed: {result['trader']['order']['position_id']}")
    print(f"Signal:   {result['trader']['signal']}")
else:
    print("No trade — confidence below threshold or HOLD signal")
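
The prompt above asks the model for a size between 10 and 100 USD and allows a HOLD answer, but an LLM can still return values outside those bounds. How PurpleFleasTradeComponent combines the signal with available_capital is not documented here, so the following is only a sketch of one defensive policy. The function name is hypothetical, and it assumes the wallet balance caps the order size directly, ignoring leverage.

```python
from typing import Optional

MIN_SIZE_USD = 10.0   # lower bound stated in the prompt
MAX_SIZE_USD = 100.0  # upper bound stated in the prompt


def resolve_order_size(signal: dict, available_capital: float) -> Optional[float]:
    """Return a bounded order size in USD, or None when no order should be placed."""
    if signal.get("market") == "HOLD":
        return None
    requested = signal.get("size_usd")
    if isinstance(requested, bool) or not isinstance(requested, (int, float)):
        return None
    # Clamp to the range the prompt asked for, then to what the wallet holds.
    size = min(max(float(requested), MIN_SIZE_USD), MAX_SIZE_USD)
    size = min(size, available_capital)
    # If the wallet cannot cover even the minimum size, skip the trade.
    return size if size >= MIN_SIZE_USD else None


print(resolve_order_size({"market": "BTC-PERP", "size_usd": 250}, available_capital=500.0))
```

Enforcing the bounds in code rather than trusting the prompt means a single bad completion cannot oversize a position.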

Running as a Scheduled Agent

A Haystack pipeline is a plain Python object. You can schedule it to run on any interval using APScheduler, a cron job, a cloud scheduler, or any other mechanism. The following wraps the pipeline in a simple scheduler that runs every hour. Crypto perpetual markets trade around the clock, so no market-hours filter is applied:

scheduled_agent.py
from apscheduler.schedulers.blocking import BlockingScheduler
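from datetime import datetime, timezone

# pipe = ... (build as above or load from YAML)

def run_agent_cycle():
    # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated since Python 3.12)
    print(f"[{datetime.now(timezone.utc).isoformat()}] Running agent cycle...")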
    try:
        result = pipe.run({
            "web_search": {"query": "crypto market outlook"},
        })
        order = result.get("trader", {}).get("order")
        if order:
            print(f"  Trade: {order['market']} {order['side']} @ {order['fill_price']}")
        else:
            print("  No trade this cycle")
    except Exception as e:
        print(f"  Error: {e}")

scheduler = BlockingScheduler()
scheduler.add_job(run_agent_cycle, "interval", hours=1)
scheduler.start()

Key Integration Points and Conventions

A few implementation details worth knowing when building Haystack + Purple Flea pipelines:

Purple Flea's Referral Program for Haystack Developers

If you are building a Haystack-based trading framework, agent toolkit, or open-source template and you route through Purple Flea, you earn 20% of all trading fees and 10% of all casino fees generated by your users — permanently, with no cap. Include your referral code in your published pipeline templates or documentation, and earn passive income every time someone deploys your pipeline and trades through it.

For widely-used open-source templates, this is meaningful recurring revenue. The referral system is built into the PurpleFleasTradeComponent — you pass your referral code as a constructor argument and it is automatically included in all API calls your component makes.
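
To get a feel for the numbers, the stated percentages can be turned into a small calculation. The fee totals below are hypothetical inputs, and the function is a back-of-the-envelope sketch rather than a description of how Purple Flea computes or pays out referrals.

```python
from decimal import Decimal

TRADING_FEE_SHARE = Decimal("0.20")  # 20% of trading fees, as stated above
CASINO_FEE_SHARE = Decimal("0.10")   # 10% of casino fees, as stated above


def referral_earnings(trading_fees_usd: Decimal, casino_fees_usd: Decimal) -> Decimal:
    """Referral payout for a period, given the fees that referred users generated."""
    return trading_fees_usd * TRADING_FEE_SHARE + casino_fees_usd * CASINO_FEE_SHARE


# Hypothetical month: referred users paid $1,000 in trading fees and $500 in casino fees.
print(referral_earnings(Decimal("1000"), Decimal("500")))  # 250.00
```

Decimal is used instead of float so that currency amounts do not pick up binary rounding error.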

See also: The Pydantic AI integration at purpleflea.com/for-pydantic-ai follows similar component-based patterns. If your Haystack pipeline generates Pydantic-typed signals, you can pass them directly to the Purple Flea Pydantic AI tools without a conversion step.

Summary

Haystack's component-based pipeline architecture is an excellent match for financial agent workflows. Explicit typed data flows, YAML serialization, and composable retrieval components give you the infrastructure to build auditable, reproducible trading agents that combine document retrieval, LLM reasoning, and real financial execution in a single coherent system.

Purple Flea provides the financial layer: wallets, perpetual futures trading across 275 markets, casino games, and agent-to-agent escrow, all exposed as Haystack components that plug directly into your pipeline graph. Get a free API key at purpleflea.com/docs and test your pipeline against live Hyperliquid market data in under ten minutes.
