BentoML is the unified framework for building, deploying, and scaling AI services. With Purple Flea's financial API, your BentoML services can do more than predict: they can act. A trading signal becomes a real trade, and a wallet balance check becomes an automated rebalance.
BentoML provides the model serving infrastructure: runners for GPU-accelerated inference, a service layer for business logic, and BentoCloud for scalable deployment. Purple Flea provides the financial execution layer. The two connect at the service layer — your BentoML API handler calls Purple Flea after the model produces a signal.
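The model layer is where your LSTM, transformer, XGBoost, or RL model runs, taking raw market features and outputting a signal score. In BentoML 1.0 and 1.1 this layer was a runner, a separate process that enabled GPU acceleration and independent scaling. From BentoML 1.2 onward, runners are deprecated and the same isolation comes from putting the model in its own `@bentoml.service` class and composing it with `bentoml.depends()`.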
The BentoML service wraps your runner and adds business logic. After the runner returns a signal, the service decides whether to call Purple Flea's trading, casino, wallet, or escrow endpoints — all with a simple async HTTP call.
Deploy to BentoCloud and let it auto-scale your service based on market activity. During high-volatility periods, BentoCloud spawns additional replicas to handle concurrent signals. During quiet markets, it scales back to zero to minimize cost.
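As a rough illustration of the service-layer decision described above, the sketch below turns a signal score into an order payload or a hold. The 0.7 threshold and the `symbol`, `side`, `size_usd`, and `leverage` fields are taken from the example service on this page; the `decide_order` helper is hypothetical and makes no network call.

```python
from typing import Optional


def decide_order(
    signal: float,
    symbol: str = "BTC-PERP",
    threshold: float = 0.7,
) -> Optional[dict]:
    """Return an order payload for a strong signal, or None to hold.

    Hypothetical helper: field names mirror this page's example request
    body and are not checked against the live Purple Flea API.
    """
    if signal > threshold:
        side = "long"
    elif signal < -threshold:
        side = "short"
    else:
        return None  # Signal too weak in either direction: hold
    return {"symbol": symbol, "side": side, "size_usd": 100, "leverage": 5}
```

Keeping the decision in a pure function like this lets you unit-test the trading rule without starting BentoML or touching real funds.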
The following is a production-ready BentoML service that accepts market feature vectors, runs inference on a pre-trained trading signal model, and executes real trades via Purple Flea when the signal is strong enough.

    import asyncio
    import os

    import bentoml
    import requests

    # Feature columns in training order. These names come from the sample
    # request on this page; replace them with your model's actual inputs.
    FEATURE_ORDER = ["close", "rsi", "volume_ratio", "ma_cross"]


    @bentoml.service(
        resources={"cpu": "2", "memory": "4Gi"},
        traffic={"timeout": 20},
    )
    class TradingAgent:
        PURPLE_FLEA_BASE = "https://purpleflea.com/api/v1"
        PURPLE_FLEA_KEY = os.environ.get("PURPLE_FLEA_API_KEY", "")

        def __init__(self) -> None:
            # Load the saved model once per worker. Framework-specific loaders
            # (bentoml.sklearn, bentoml.pytorch, bentoml.xgboost) work the same way.
            self.model = bentoml.picklable_model.load_model("trading-signal-model:latest")

        async def _call(self, method: str, path: str, timeout: int = 10, **kwargs) -> dict:
            # requests is blocking, so run it in a worker thread to keep the
            # event loop free, and fail loudly on HTTP errors.
            resp = await asyncio.to_thread(
                requests.request,
                method,
                f"{self.PURPLE_FLEA_BASE}{path}",
                headers={"Authorization": f"Bearer {self.PURPLE_FLEA_KEY}"},
                timeout=timeout,
                **kwargs,
            )
            resp.raise_for_status()
            return resp.json()

        @bentoml.api
        async def predict_and_trade(self, features: dict) -> dict:
            """
            Accept a feature dict with market data, run inference,
            and optionally execute a trade based on the signal.
            """
            # Convert the dict to a numeric row before calling the model
            row = [float(features[name]) for name in FEATURE_ORDER]
            signal = await asyncio.to_thread(self.model.predict, [row])
            signal_score = float(signal[0])

            result = {"signal": signal_score, "action": "hold", "trade": None}

            if signal_score > 0.7:  # Strong buy signal
                side, action = "long", "buy"
            elif signal_score < -0.7:  # Strong sell signal
                side, action = "short", "sell"
            else:
                return result

            result["action"] = action
            result["trade"] = await self._call(
                "POST",
                "/trade",
                json={
                    "symbol": features.get("symbol", "BTC-PERP"),
                    "side": side,
                    "size_usd": 100,
                    "leverage": 5,
                },
            )
            return result

        @bentoml.api
        async def wallet_rebalance(self, target_btc_pct: float, target_eth_pct: float) -> dict:
            """Check wallet and rebalance to target allocation."""
            # Get current wallet state from Purple Flea
            wallet = await self._call("GET", "/wallet")
            total_usd = wallet.get("total_usd_value", 0)
            btc_val = wallet.get("btc_usd_value", 0)
            eth_val = wallet.get("eth_usd_value", 0)

            # Calculate rebalance amounts needed
            btc_delta = total_usd * target_btc_pct - btc_val
            eth_delta = total_usd * target_eth_pct - eth_val

            swaps = []
            if abs(btc_delta) > 5:  # Only rebalance if >$5 off target
                swap = await self._call(
                    "POST",
                    "/wallet/swap",
                    timeout=15,
                    json={
                        "from_token": "USDC" if btc_delta > 0 else "BTC",
                        "to_token": "BTC" if btc_delta > 0 else "USDC",
                        "amount": abs(btc_delta),
                        "chain": "bitcoin",
                    },
                )
                swaps.append({"asset": "BTC", "delta": btc_delta, "swap": swap})

            # The ETH leg is reported but not executed here; add a matching swap to trade it.
            return {"total_usd": total_usd, "eth_delta": eth_delta, "swaps": swaps}


    # Save your trained model before deploying:
    #   import bentoml, sklearn.ensemble
    #   model = sklearn.ensemble.RandomForestClassifier()
    #   model.fit(X_train, y_train)  # outputs must lie in [-1, 1] for the 0.7 thresholds
    #   bentoml.picklable_model.save_model("trading-signal-model", model)
    #
    # Run locally:      bentoml serve trading_service:TradingAgent
    # Deploy to cloud:  bentoml deploy trading_service:TradingAgent -n trading-agent

Every Purple Flea service is accessible from your BentoML service layer with a simple REST call. Here are the six services and what they unlock for your ML models.
Provably fair games: coin flip, dice, roulette, crash. BentoML RL agents can use the casino as a live environment for probabilistic decision-making training.
275 perpetual markets on Hyperliquid. LSTM price predictors, transformer sentiment models, and XGBoost signal classifiers all map cleanly to long/short entries.
Multi-chain wallets with best-route DEX swaps. Portfolio rebalancing models calculate target allocations; the wallet API executes the actual swaps.
Blockchain domain registration across ENS, Unstoppable, and Handshake. Each BentoML service instance can register its own identity on-chain.
An onboarding tool for new agents: claim $1 free to fund initial test trades without a deposit. Useful for bootstrapping a new BentoML service in a staging environment.
Trustless agent-to-agent payments. BentoML model-serving services can accept escrow payments from agents that consume their prediction APIs.
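To make the wallet rebalancing flow above concrete, here is a minimal sketch of the allocation math a portfolio model might run before any swap is requested. The `rebalance_deltas` function and its argument shapes are hypothetical; only the $5 minimum mirrors the example service on this page.

```python
def rebalance_deltas(
    holdings_usd: dict[str, float],
    target_weights: dict[str, float],
    min_trade_usd: float = 5.0,
) -> dict[str, float]:
    """Return the USD amount to buy (+) or sell (-) per asset.

    Hypothetical helper: assets that are within min_trade_usd of their
    target are left out so that tiny swaps are never requested.
    """
    if sum(target_weights.values()) > 1.0 + 1e-9:
        raise ValueError("target weights must not exceed 100% in total")
    total = sum(holdings_usd.values())
    deltas = {}
    for asset, weight in target_weights.items():
        delta = total * weight - holdings_usd.get(asset, 0.0)
        if abs(delta) > min_trade_usd:
            deltas[asset] = round(delta, 2)
    return deltas
```

For example, a $1,000 portfolio holding $600 of BTC and $300 of ETH with targets of 50% and 40% yields a $100 BTC sale and a $100 ETH purchase.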
BentoCloud is the managed platform for running BentoML services at scale. It handles infrastructure, auto-scaling, and monitoring so you can focus on your model and trading logic. Here is how to configure your trading service for BentoCloud deployment with Purple Flea credentials.

    # Log in to BentoCloud
    bentoml cloud login

    # Deploy with environment variables for your API keys
    bentoml deploy trading_service:TradingAgent \
      --name purple-flea-trader \
      --env PURPLE_FLEA_API_KEY="your-pf-live-key" \
      --scaling-min 0 \
      --scaling-max 10

    # Check deployment status
    bentoml deployment get purple-flea-trader

    # Send a prediction request (the top-level JSON key matches the handler's "features" parameter)
    curl -X POST https://purple-flea-trader.bentoml.ai/predict_and_trade \
      -H "Content-Type: application/json" \
      -d '{"features":{"symbol":"BTC-PERP","close":67400,"rsi":62,"volume_ratio":1.4,"ma_cross":1}}'

Configure BentoCloud to scale from 0 to 10 replicas based on request queue depth. During volatile markets when signals fire frequently, replicas scale up automatically. During quiet periods, scale back to zero and pay nothing.
Store your Purple Flea API key as a BentoCloud secret rather than passing it in plain text with `--env`. Run `bentoml secret create pf-key PURPLE_FLEA_API_KEY=...` once, then attach it at deploy time with `--secret pf-key` so the credential is injected at runtime without appearing in config files or shell history.
BentoCloud's observability dashboard tracks inference latency, request volume, and error rates per endpoint. Pair this with Purple Flea's trade history API to monitor full signal-to-execution pipeline performance in one view.
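On the credentials point above: however the key is injected, the service still has to read it at startup. The sketch below assumes the secret reaches the process as the `PURPLE_FLEA_API_KEY` environment variable (the name used throughout this page) and fails fast when it is missing. Both helper names are hypothetical.

```python
import os
from typing import Mapping


def load_api_key(
    env: Mapping[str, str] = os.environ,
    name: str = "PURPLE_FLEA_API_KEY",
) -> str:
    """Read the API key from the environment, refusing to start without it."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; attach the secret before deploying")
    return key


def auth_headers(key: str) -> dict:
    """Build the bearer-token header used in this page's example requests."""
    return {"Authorization": f"Bearer {key}"}
```

Failing at startup is a deliberate choice here: a missing key surfaces as a failed deployment rather than as a rejected trade during a live signal.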
Stop paper trading. Deploy your trained price prediction model as a BentoML service and wire its output directly to Purple Flea's trading API. Your model's confidence score becomes a real position size.
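One way to turn a confidence score into a position size is a linear ramp above the entry threshold. This is a sketch under stated assumptions: the 0.7 threshold and $100 cap echo the example service on this page, while `position_size_usd` and the linear scaling rule are illustrative choices, not part of any Purple Flea API.

```python
def position_size_usd(
    confidence: float,
    max_size_usd: float = 100.0,
    threshold: float = 0.7,
) -> float:
    """Scale position size linearly from 0 at the threshold to the cap at 1.0.

    Hypothetical sizing rule: the sign of the confidence picks the side
    elsewhere, so only its magnitude matters here.
    """
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")
    strength = min(abs(confidence), 1.0)  # Clamp overconfident scores
    if strength <= threshold:
        return 0.0  # Below the entry threshold: no position
    scale = (strength - threshold) / (1.0 - threshold)
    return round(max_size_usd * scale, 2)
```

With these defaults a score of 0.85 sits halfway between the threshold and full confidence, so it maps to a $50 position.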
Build a BentoML service that backtests strategies on historical data at one endpoint, then promotes validated strategies to live execution via Purple Flea at another — same service, different modes.
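A minimal sketch of the backtest-then-promote idea, assuming you already have per-bar signal scores and the realized next-bar returns. The function names, the statistics reported, and the promotion criteria are all hypothetical defaults to adapt to your own validation rules.

```python
def backtest(signals: list[float], returns: list[float], threshold: float = 0.7) -> dict:
    """Replay a threshold strategy: long above +threshold, short below -threshold."""
    pnl = []
    for signal, ret in zip(signals, returns):
        if signal > threshold:
            pnl.append(ret)       # Long earns the bar's return
        elif signal < -threshold:
            pnl.append(-ret)      # Short earns the opposite
    trades = len(pnl)
    wins = sum(1 for p in pnl if p > 0)
    return {
        "trades": trades,
        "total_return": sum(pnl),
        "hit_rate": wins / trades if trades else 0.0,
    }


def should_promote(stats: dict, min_trades: int = 30, min_hit_rate: float = 0.55) -> bool:
    """Gate live execution on sample size, hit rate, and positive total return."""
    return (
        stats["trades"] >= min_trades
        and stats["hit_rate"] >= min_hit_rate
        and stats["total_return"] > 0
    )
```

In a two-mode service, the backtest endpoint would return these statistics and the live endpoint would refuse to trade until `should_promote` passes.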
Run multiple models (LSTM, XGBoost, transformer) concurrently from a single async endpoint. Average their signals, weighting each by its historical accuracy, then route the consensus to Purple Flea for execution.
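The ensemble pattern can be sketched with plain `asyncio`: each model is called in a worker thread, and the scores are combined with accuracy weights. The models here are stand-in callables and every name is hypothetical; in a real service they would be your loaded models' predict functions.

```python
import asyncio
from typing import Callable


def weighted_consensus(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Average model scores, weighting each by its historical accuracy."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


async def ensemble_signal(
    models: dict[str, Callable[[dict], float]],
    weights: dict[str, float],
    features: dict,
) -> float:
    """Run every model concurrently in worker threads, then combine the scores."""
    names = list(models)
    results = await asyncio.gather(
        *(asyncio.to_thread(models[name], features) for name in names)
    )
    return weighted_consensus(dict(zip(names, results)), weights)
```

A model with weight 3 that says 1.0 and a model with weight 1 that says 0.0 produce a consensus of 0.75, which would clear the 0.7 entry threshold used on this page.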
Deploy a risk model that receives live position data from Purple Flea and scores portfolio risk in real time. When risk score exceeds threshold, automatically close positions via the trading API.
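As a simple stand-in for a learned risk model, the sketch below scores risk as leveraged exposure divided by account equity and picks which positions to close once a limit is breached. The position fields (`size_usd`, `leverage`), the exposure formula, and the 3.0 limit are assumptions for illustration, not the schema of Purple Flea's position data.

```python
def risk_score(positions: list[dict], equity_usd: float) -> float:
    """Leveraged exposure as a multiple of account equity (assumed definition)."""
    if equity_usd <= 0:
        raise ValueError("equity_usd must be positive")
    exposure = sum(abs(p["size_usd"]) * p["leverage"] for p in positions)
    return exposure / equity_usd


def positions_to_close(
    positions: list[dict],
    equity_usd: float,
    max_score: float = 3.0,
) -> list[dict]:
    """Pick the largest exposures to close until the score is back under the limit."""
    remaining = sorted(positions, key=lambda p: abs(p["size_usd"]) * p["leverage"])
    to_close = []
    while remaining and risk_score(remaining, equity_usd) > max_score:
        to_close.append(remaining.pop())  # pop() takes the largest exposure
    return to_close
```

Closing the largest exposure first brings the score down in the fewest trades; a production risk model might instead rank positions by volatility or correlation.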
Purple Flea's API works with any model that produces a signal score, directional decision, or action. Here is how different model architectures map to Purple Flea financial actions.
| Model Type | Use Case | Purple Flea Action | Signal Output |
|---|---|---|---|
| LSTM | Price sequence prediction | Open/close trade | Predicted price in N bars |
| Transformer | NLP sentiment analysis on news/social | Long on positive, short on negative | Sentiment score [-1, 1] |
| XGBoost | Feature-based signal classification | Trade on class probability > threshold | P(bullish), P(bearish) |
| RL Agent | Action policy (buy / hold / sell) | Direct action mapping to trade API | Discrete action + confidence |
| Regression | Return magnitude prediction | Size trade proportional to predicted return | Expected return % |
| Anomaly Detection | Unusual market condition detection | Close positions on anomaly score spike | Anomaly score [0, 1] |
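
The rows of the table above can share one execution path if each model's output is first normalized to a common signal in [-1, 1]. The adapters below are a hedged sketch: the function names, the 2% full-scale move, and the 0.9 anomaly limit are hypothetical choices rather than values from Purple Flea or BentoML.

```python
def clamp(value: float, low: float = -1.0, high: float = 1.0) -> float:
    """Keep a value inside [low, high]."""
    return max(low, min(high, value))


def from_price_forecast(predicted: float, current: float, full_scale: float = 0.02) -> float:
    """LSTM-style forecast: a predicted move of full_scale (2%) maps to a signal of 1."""
    return clamp((predicted - current) / current / full_scale)


def from_class_probs(p_bullish: float, p_bearish: float) -> float:
    """XGBoost-style classifier: net bullish probability."""
    return clamp(p_bullish - p_bearish)


def from_sentiment(score: float) -> float:
    """Transformer sentiment is already in [-1, 1]; clamp to be safe."""
    return clamp(score)


def should_close_all(anomaly_score: float, limit: float = 0.9) -> bool:
    """Anomaly detector: flag a defensive exit when the score spikes."""
    return anomaly_score >= limit
```

Once every model speaks the same scale, the thresholds and sizing rules in your service layer do not need to change when you swap architectures.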
Register in 60 seconds. Get your API key. Wire your BentoML inference service to Purple Flea and watch your model's signals become real trades.