What Makes Haystack Unique Among Agent Frameworks
Haystack, maintained by deepset, occupies a distinct position among agent frameworks. Where LangChain and LlamaIndex are organized primarily around chains and retrieval indexes, Haystack is a pipeline framework: you define components with explicit inputs and outputs, wire them together into a directed graph, and run the graph. Each component is a plain Python class marked with the @component decorator; there is no base class to inherit from. Its run() method declares inputs through type-annotated parameters and outputs through @component.output_types.
This architecture has important implications for financial operations. Because every component has explicit, typed inputs and outputs, you can reason precisely about what data flows between your retrieval step, your LLM step, and your financial action step. There are no implicit side effects hidden in callback chains. The pipeline graph is serializable to YAML, enabling version-controlled pipeline definitions that can be replayed, audited, and deployed reproducibly.
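To make the "typed graph" idea concrete, here is a toy model written with the standard library only. It is not Haystack's implementation: ToyPipeline, its add/connect/run methods, and the two example functions are hypothetical names invented for this sketch. It only shows why declared input and output types let a mismatched connection fail at wiring time rather than at trade time.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Tuple


class ToyPipeline:
    """Toy model of a typed component graph; NOT Haystack's implementation."""

    def __init__(self) -> None:
        self.components: Dict[str, Callable[..., dict]] = {}
        self.inputs: Dict[str, Dict[str, type]] = {}
        self.outputs: Dict[str, Dict[str, type]] = {}
        self.edges: List[Tuple[str, str, str, str]] = []

    def add(self, name: str, func: Callable[..., dict],
            inputs: Dict[str, type], outputs: Dict[str, type]) -> None:
        self.components[name] = func
        self.inputs[name] = inputs
        self.outputs[name] = outputs

    def connect(self, sender: str, receiver: str) -> None:
        s_comp, s_sock = sender.split(".")
        r_comp, r_sock = receiver.split(".")
        # Reject the edge immediately if the declared types disagree.
        if self.outputs[s_comp][s_sock] is not self.inputs[r_comp][r_sock]:
            raise TypeError(f"type mismatch: {sender} -> {receiver}")
        self.edges.append((s_comp, s_sock, r_comp, r_sock))

    def run(self, data: Dict[str, dict]) -> Dict[str, dict]:
        # Map each component to the set of components it depends on.
        graph = {name: set() for name in self.components}
        for s_comp, _, r_comp, _ in self.edges:
            graph[r_comp].add(s_comp)
        results: Dict[str, dict] = {}
        for name in TopologicalSorter(graph).static_order():
            kwargs = dict(data.get(name, {}))
            for s_comp, s_sock, r_comp, r_sock in self.edges:
                if r_comp == name:
                    kwargs[r_sock] = results[s_comp][s_sock]
            results[name] = self.components[name](**kwargs)
        return results


def retrieve(query: str) -> dict:
    return {"documents": [f"note about {query}"]}


def summarize(documents: list) -> dict:
    return {"signal": f"{len(documents)} document(s) reviewed"}


pipe = ToyPipeline()
pipe.add("retriever", retrieve, inputs={"query": str}, outputs={"documents": list})
pipe.add("llm", summarize, inputs={"documents": list}, outputs={"signal": str})
pipe.connect("retriever.documents", "llm.documents")
results = pipe.run({"retriever": {"query": "BTC"}})
print(results["llm"]["signal"])
```

Because the dependency graph is explicit, execution order falls out of a topological sort, and every intermediate result is available for auditing in the returned dictionary.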
Haystack is also notably good at RAG — retrieval-augmented generation. Its document stores, embedders, and retrievers are production-quality and actively maintained. The combination of high-quality RAG infrastructure with a composable pipeline architecture makes Haystack excellent for data-driven trading agents: retrieve recent market analysis from a document store, feed it to an LLM for signal extraction, and execute trades based on the extracted signals — all in a single, auditable pipeline.
Why Combine RAG with Financial Operations
The core insight behind a Haystack + Purple Flea integration is that data-driven financial decisions are better than pure rule-based ones. A trading agent that retrieves recent news, earnings reports, or on-chain analytics before placing a trade has substantially more context than one acting on price data alone.
Consider the typical pattern without RAG: the agent checks the current price of BTC, compares it to a moving average, and places a trade if the spread exceeds a threshold. This is a purely mechanical strategy that ignores qualitative signals entirely.
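As a point of comparison, the mechanical rule described above can be sketched in a few lines. The function name, the window length, the 2% threshold, and the choice to buy above the average (rather than fade the move) are all assumptions made for illustration, not part of any Purple Flea API.

```python
from statistics import mean
from typing import Optional, Sequence


def moving_average_signal(
    prices: Sequence[float],
    window: int = 5,
    threshold: float = 0.02,
) -> Optional[str]:
    """Return "buy", "sell", or None from a simple moving-average rule."""
    if len(prices) < window:
        return None  # not enough history to form the average
    average = mean(prices[-window:])
    # Relative deviation of the latest price from the moving average.
    spread = (prices[-1] - average) / average
    if spread > threshold:
        return "buy"
    if spread < -threshold:
        return "sell"
    return None


prices = [100.0, 101.0, 99.0, 100.0, 106.0]
print(moving_average_signal(prices))
```

Nothing in this function can see a news headline or an earnings report, which is exactly the gap the retrieval step is meant to fill.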
With Haystack RAG, the same agent can first query a document store of recent news, earnings, or social sentiment data, extract the salient signals via an LLM, and incorporate those signals into its trading decision. The pipeline structure makes the data flow explicit: retrieval output feeds directly into the LLM input, and LLM output feeds directly into the trading component input.
Haystack's pipeline model is the right abstraction for financial agents because financial decisions have explicit data dependencies. You can draw a DAG of what information feeds what decision — and Haystack's component graph is exactly that DAG in executable form.
Installation
The Purple Flea Haystack integration is available on PyPI. It depends on haystack-ai and requests. No blockchain SDK is required.
pip install purpleflea-haystack haystack-ai openai
Set your Purple Flea API key as an environment variable. No additional configuration is needed — the package reads this key automatically when any Purple Flea component is instantiated:
export PURPLE_FLEA_API_KEY="pf_your_key_here"
export OPENAI_API_KEY="sk-..."
Component Overview
The purpleflea-haystack package provides five Haystack components. You add each one to a pipeline exactly like a built-in component:
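- PurpleFleasWalletComponent. Outputs: address, balance_usdc, tx_id.
- PurpleFleasTradeComponent. Inputs: market, side, size, leverage. Outputs: position_id, fill_price, pnl.
- PurpleFleasCasinoComponent. Inputs: game, amount, params. Outputs: outcome, pnl, proof.
- PurpleFleasMarketDataComponent. Outputs: documents compatible with Haystack's DocumentStore interface.
- PurpleFleasEscrowComponent. Inputs: payee, amount, description. Outputs: escrow_id, status.
Full Pipeline: Document Retriever + LLM + Trade Execution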
The following example builds a complete Haystack pipeline that retrieves market analysis documents, passes them to an OpenAI LLM for signal extraction, and routes the extracted signals to the Purple Flea trading component. The pipeline is fully typed and serializable.
    from haystack import Pipeline
    from haystack.components.builders import PromptBuilder
    from haystack.components.generators import OpenAIGenerator
    from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
    from haystack.dataclasses import Document
    from haystack.document_stores.in_memory import InMemoryDocumentStore
    from purpleflea_haystack import PurpleFleasMarketDataComponent, PurpleFleasTradeComponent

    # --------------- Document store with market analysis docs ---------------
    doc_store = InMemoryDocumentStore()
    doc_store.write_documents([
        Document(content="Bitcoin ETF inflows hit $500M today. Institutional demand accelerating."),
        Document(content="Fed holds rates. Risk-on sentiment expected. BTC historically rallies."),
        Document(content="On-chain metrics: exchange reserves at 4-year low. Supply squeeze thesis intact."),
        Document(content="Altcoins outperforming BTC on 7-day basis. Possible rotation signal."),
    ])

    # --------------- Prompt: extract trading signal from documents ---------------
    # The template must reference every variable that a connection feeds into it,
    # so both `documents` and `market_data` appear below.
    SIGNAL_PROMPT = """
    You are a crypto trading signal extractor. Given the following market analysis
    documents and live market data, output a JSON object with:
    - "market": which perpetual market to trade (e.g. "BTC-PERP", "ETH-PERP")
    - "side": "buy" for long or "sell" for short
    - "confidence": 0.0 to 1.0
    - "reason": one-sentence rationale

    Documents:
    {% for doc in documents %}
    - {{ doc.content }}
    {% endfor %}

    Live market data:
    {% for doc in market_data %}
    - {{ doc.content }}
    {% endfor %}

    Respond with only the JSON object.
    """

    # --------------- Build the pipeline ---------------
    pipe = Pipeline()
    pipe.add_component("retriever", InMemoryBM25Retriever(document_store=doc_store))
    pipe.add_component("market_data", PurpleFleasMarketDataComponent(markets=["BTC-PERP", "ETH-PERP"]))
    pipe.add_component("prompt", PromptBuilder(template=SIGNAL_PROMPT))
    pipe.add_component("llm", OpenAIGenerator(model="gpt-4o"))
    pipe.add_component("trader", PurpleFleasTradeComponent(leverage=3, max_size_usd=50.0, confidence_threshold=0.7))

    # Connect components: retriever + market_data → prompt → LLM → trader
    pipe.connect("retriever.documents", "prompt.documents")
    pipe.connect("market_data.documents", "prompt.market_data")
    pipe.connect("prompt.prompt", "llm.prompt")
    pipe.connect("llm.replies", "trader.signal_json")

    # Run: provide the retrieval query and let the pipeline do the rest
    result = pipe.run({
        "retriever": {"query": "bitcoin outlook institutional sentiment"},
    })
    print(f"Signal: {result['trader']['signal']}")
    print(f"Order: {result['trader']['order']}")
The pipeline is explicit and auditable. Every data flow is declared as a component connection. The trader's signal and order appear in the pipeline's run output; to capture intermediate outputs as well, such as the retrieved documents and the raw LLM reply, pass include_outputs_from to run(). The PurpleFleasTradeComponent only executes a trade if the LLM-extracted confidence exceeds the threshold. Low-confidence signals result in no trade, protecting capital.
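The gating step can be sketched independently of the package. The function below is a hypothetical illustration, not the internals of PurpleFleasTradeComponent: the name gate_signal, the treatment of malformed JSON as "no trade", and the choice to accept a confidence exactly equal to the threshold are all assumptions.

```python
import json
from typing import List, Optional


def gate_signal(replies: List[str], confidence_threshold: float = 0.7) -> Optional[dict]:
    """Return the parsed signal if it clears the threshold, otherwise None."""
    if not replies:
        return None
    try:
        signal = json.loads(replies[0])
    except json.JSONDecodeError:
        return None  # malformed LLM output is treated as "no trade"
    if not isinstance(signal, dict):
        return None
    confidence = signal.get("confidence")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return None
    if signal.get("side") not in {"buy", "sell"}:
        return None
    if confidence < confidence_threshold:
        return None
    return signal


strong = ['{"market": "BTC-PERP", "side": "buy", "confidence": 0.82, "reason": "ETF inflows"}']
weak = ['{"market": "BTC-PERP", "side": "buy", "confidence": 0.4, "reason": "mixed signals"}']
print(gate_signal(strong))
print(gate_signal(weak))
```

Failing closed on anything unexpected (empty replies, non-JSON text, a missing side) keeps a misbehaving model from ever reaching the order path.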
Pipeline YAML Serialization
One of Haystack's strongest features is that pipelines are serializable to YAML. This enables version-controlled pipeline definitions, reproducible deployments, and clean separation between pipeline structure and runtime configuration:
    import yaml

    # Save to YAML and commit it to git for reproducibility
    pipe_yaml = pipe.to_dict()
    with open("rag_trading_pipeline.yml", "w") as f:
        yaml.dump(pipe_yaml, f, default_flow_style=False)

    # Reload from YAML: pipeline structure and init parameters are restored.
    # The component classes must still be importable, and the in-memory document
    # store's contents are not written to the YAML, so in a fresh process
    # reload the documents before running.
    with open("rag_trading_pipeline.yml") as f:
        loaded_config = yaml.safe_load(f)
    pipe2 = Pipeline.from_dict(loaded_config)
    result2 = pipe2.run({"retriever": {"query": "ethereum layer2 tvl growth"}})

    # The serialized form looks like this (excerpt):
    # components:
    #   trader:
    #     type: purpleflea_haystack.PurpleFleasTradeComponent
    #     init_parameters:
    #       leverage: 3
    #       max_size_usd: 50.0
    #       confidence_threshold: 0.7
YAML serialization enables a clean workflow for production pipelines: develop and test in Python, serialize to YAML when the pipeline is stable, commit the YAML to version control, and deploy by loading from YAML in production. The API key is injected at runtime from environment variables — it is never stored in the YAML file.
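A cheap safeguard for this workflow is to check the serialized configuration for secrets before committing it. The sketch below is an assumption-laden illustration: iter_strings and contains_secret are hypothetical helpers, the config dictionary is a hand-written stand-in shaped like the excerpt shown earlier, and pf_example_key is a placeholder value.

```python
import os
from typing import Any, Iterator


def iter_strings(node: Any) -> Iterator[str]:
    """Yield every string value in a nested dict/list structure (keys excluded)."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, (list, tuple)):
        for item in node:
            yield from iter_strings(item)


def contains_secret(config: dict, secret: str) -> bool:
    """True if a non-empty secret appears inside any string value of config."""
    return bool(secret) and any(secret in text for text in iter_strings(config))


# Hand-written stand-in for a serialized pipeline (hypothetical values).
config = {
    "components": {
        "trader": {
            "type": "purpleflea_haystack.PurpleFleasTradeComponent",
            "init_parameters": {"leverage": 3, "max_size_usd": 50.0, "confidence_threshold": 0.7},
        }
    }
}

# The key lives in the environment at runtime; the fallback is a placeholder.
api_key = os.environ.get("PURPLE_FLEA_API_KEY", "pf_example_key")
print(contains_secret(config, api_key))
```

Running a check like this in a pre-commit hook or CI job turns "the key is never stored in the YAML" from a convention into something that is verified on every change.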
Advanced: Web Search Retriever + Market Data
For agents that need truly up-to-date information rather than a static document store, Haystack's SerperDevWebSearch retriever can replace or augment the static document store. Combined with the PurpleFleasMarketDataComponent for live price data, this gives the pipeline both fresh qualitative context (news, social sentiment) and quantitative price context (mark price, funding rate, open interest) before making a trading decision:
    from haystack import Pipeline
    from haystack.components.builders import PromptBuilder
    from haystack.components.generators import OpenAIGenerator
    from haystack.components.websearch import SerperDevWebSearch
    from purpleflea_haystack import (
        PurpleFleasMarketDataComponent,
        PurpleFleasTradeComponent,
        PurpleFleasWalletComponent,
    )

    COMBINED_PROMPT = """
    Market data (live prices and funding rates):
    {% for doc in market_docs %}
    {{ doc.content }}
    {% endfor %}

    Latest news (web search results):
    {% for doc in news_docs %}
    {{ doc.content }}
    {% endfor %}

    Based on the above, extract:
    - "market": "BTC-PERP" | "ETH-PERP" | "SOL-PERP" | "HOLD"
    - "side": "buy" | "sell" (only if market != "HOLD")
    - "size_usd": float between 10 and 100
    - "confidence": 0.0 to 1.0
    - "reasoning": brief explanation

    Output JSON only.
    """

    pipe = Pipeline()

    # Live data sources
    pipe.add_component("market_data", PurpleFleasMarketDataComponent(
        markets=["BTC-PERP", "ETH-PERP", "SOL-PERP"],
        include_funding_rate=True,
        include_open_interest=True,
    ))
    pipe.add_component("web_search", SerperDevWebSearch(top_k=5))

    # LLM signal extraction
    pipe.add_component("prompt", PromptBuilder(template=COMBINED_PROMPT))
    pipe.add_component("llm", OpenAIGenerator(model="gpt-4o"))

    # Financial action layer
    pipe.add_component("wallet", PurpleFleasWalletComponent())
    pipe.add_component("trader", PurpleFleasTradeComponent(
        leverage=5,
        confidence_threshold=0.75,
    ))

    # Wire everything together
    pipe.connect("market_data.documents", "prompt.market_docs")
    pipe.connect("web_search.documents", "prompt.news_docs")
    pipe.connect("prompt.prompt", "llm.prompt")
    pipe.connect("wallet.balance_usdc", "trader.available_capital")
    pipe.connect("llm.replies", "trader.signal_json")

    result = pipe.run({
        "web_search": {"query": "bitcoin crypto market news today 2026"},
    })

    if result["trader"]["order"]:
        print(f"Executed: {result['trader']['order']['position_id']}")
        print(f"Signal: {result['trader']['signal']}")
    else:
        print("No trade: confidence below threshold or HOLD signal")
Running as a Scheduled Agent
A Haystack pipeline is a plain Python object. You can schedule it to run on any interval using APScheduler, a cron job, a cloud scheduler, or any other mechanism. The following wraps the pipeline in a simple scheduler that runs it once every hour:
    from datetime import datetime, timezone

    from apscheduler.schedulers.blocking import BlockingScheduler

    # pipe = ... (build as above or load from YAML)

    def run_agent_cycle():
        # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated)
        print(f"[{datetime.now(timezone.utc).isoformat()}] Running agent cycle...")
        try:
            result = pipe.run({
                "web_search": {"query": "crypto market outlook"},
            })
            order = result.get("trader", {}).get("order")
            if order:
                print(f"  Trade: {order['market']} {order['side']} @ {order['fill_price']}")
            else:
                print("  No trade this cycle")
        except Exception as e:
            # Log and continue so one failed cycle does not stop the scheduler
            print(f"  Error: {e}")

    scheduler = BlockingScheduler()
    scheduler.add_job(run_agent_cycle, "interval", hours=1)
    scheduler.start()
Key Integration Points and Conventions
A few implementation details worth knowing when building Haystack + Purple Flea pipelines:
- Component idempotency: PurpleFleasMarketDataComponent and PurpleFleasWalletComponent are read-only. You can run them as many times as needed without financial side effects. Only PurpleFleasTradeComponent, PurpleFleasCasinoComponent, and PurpleFleasEscrowComponent mutate state.
- Confidence gating: Every action component accepts a confidence_threshold parameter. Set this to 0.7 or higher for live trading. Set it to 0.0 only in test environments.
- Dry-run mode: Pass dry_run=True to any action component to simulate the action without executing it. The output schema is identical: you get back a simulated order or bet result that you can use for testing the full pipeline.
- Error handling: Each component catches Purple Flea API errors and propagates them as Haystack component errors. Pipeline-level error routing works normally; connect error outputs to a logging component to capture failed trade attempts.
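The dry-run convention is easy to reproduce in your own helper code. The sketch below illustrates the pattern only: OrderResult, place_order, the submit callable, and the simulated flag are hypothetical and do not describe how the Purple Flea components are implemented.

```python
from dataclasses import asdict, dataclass
from typing import Callable, Optional


@dataclass
class OrderResult:
    """Hypothetical result schema shared by real and simulated orders."""
    position_id: str
    market: str
    side: str
    fill_price: float
    simulated: bool


def place_order(
    market: str,
    side: str,
    mark_price: float,
    dry_run: bool = True,
    submit: Optional[Callable[[str, str], OrderResult]] = None,
) -> dict:
    """Simulate the order when dry_run is set, otherwise delegate to submit()."""
    if dry_run:
        # No external call is made; the fill is assumed to occur at the mark price.
        result = OrderResult("dry-run", market, side, mark_price, simulated=True)
    elif submit is None:
        raise ValueError("submit is required when dry_run is False")
    else:
        result = submit(market, side)
    return asdict(result)


order = place_order("BTC-PERP", "buy", mark_price=64000.0)
print(order)
```

Defaulting dry_run to True means a forgotten argument produces a simulation rather than a live order, and returning the same keys in both modes lets downstream components run unchanged in tests.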
Purple Flea's Referral Program for Haystack Developers
If you are building a Haystack-based trading framework, agent toolkit, or open-source template and you route through Purple Flea, you earn 20% of all trading fees and 10% of all casino fees generated by your users — permanently, with no cap. Include your referral code in your published pipeline templates or documentation, and earn passive income every time someone deploys your pipeline and trades through it.
For widely-used open-source templates, this is meaningful recurring revenue. The referral system is built into the PurpleFleasTradeComponent — you pass your referral code as a constructor argument and it is automatically included in all API calls your component makes.
See also: The Pydantic AI integration at purpleflea.com/for-pydantic-ai follows similar component-based patterns. If your Haystack pipeline generates Pydantic-typed signals, you can pass them directly to the Purple Flea Pydantic AI tools without a conversion step.
Summary
Haystack's component-based pipeline architecture is an excellent match for financial agent workflows. Explicit typed data flows, YAML serialization, and composable retrieval components give you the infrastructure to build auditable, reproducible trading agents that combine document retrieval, LLM reasoning, and real financial execution in a single coherent system.
Purple Flea provides the financial layer: wallets, perpetual futures trading across 275 markets, casino games, and agent-to-agent escrow, all accessible from Haystack components that plug directly into your pipeline graph. Get a free API key at purpleflea.com/docs and test your pipeline against live Hyperliquid market data in under ten minutes.