MLflow is the leading open-source platform for ML lifecycle management. Pair it with Purple Flea's trading API and you can track every trading model experiment, version your best performers, and deploy them to execute real trades automatically.
MLflow manages the full model lifecycle. Purple Flea provides the live market data and trade execution endpoints. The pipeline below describes how a single trading strategy goes from backtested idea to production agent in a systematic, reproducible way.
1. Track experiments. Pull historical OHLCV data from /api/v1/markets/:symbol/history, train candidate strategies, and log hyperparameters, Sharpe ratio, max drawdown, and total return to the MLflow tracking server for every run.
2. Register the winner. After backtesting all candidates, compare metrics in the MLflow UI. Register the top performer in the MLflow Model Registry under the alias Champion, and set the previous champion to Challenger for A/B comparison.
3. Serve and trade. Start an MLflow model serving endpoint. Your production agent calls the serving endpoint for trading signals, then calls /api/v1/trade to execute. Swap in a new Champion without redeploying the agent code.
The example below logs a full backtesting experiment to MLflow using historical market data from Purple Flea. Every metric that matters for a trading strategy is tracked: Sharpe ratio, max drawdown, total return, and win rate. The best run is registered to the Model Registry automatically.
```python
import mlflow
import mlflow.sklearn
import requests
import numpy as np

PF_BASE = "https://purpleflea.com/api/v1"
HEADERS = {"Authorization": "Bearer pf_live_YOUR_KEY"}

mlflow.set_experiment("btc-momentum-strategies")

def run_backtest(lookback: int, threshold: float, leverage: float):
    with mlflow.start_run(run_name=f"momentum_lb{lookback}_th{threshold}"):
        # Log hyperparameters
        mlflow.log_param("lookback_period", lookback)
        mlflow.log_param("threshold", threshold)
        mlflow.log_param("leverage", leverage)

        # Fetch historical data from Purple Flea
        history = requests.get(
            f"{PF_BASE}/markets/BTC-PERP/history",
            params={"days": 90, "interval": "1h"},
            headers=HEADERS
        ).json()
        prices = np.array([c["close"] for c in history["candles"]])

        # Simple momentum strategy: go long when price is up N% over the
        # lookback window, short when it is down N%. The signal uses the
        # past window; the realized return comes from the NEXT bar, so the
        # backtest is free of look-ahead bias.
        returns = []
        for i in range(lookback, len(prices) - 1):
            pct_change = (prices[i] - prices[i - lookback]) / prices[i - lookback]
            next_ret = (prices[i + 1] - prices[i]) / prices[i]
            if pct_change > threshold:
                returns.append(next_ret * leverage)    # long
            elif pct_change < -threshold:
                returns.append(-next_ret * leverage)   # short
            else:
                returns.append(0.0)                    # flat
        returns = np.array(returns)

        total_return = returns.sum()
        # Annualize hourly returns (24 * 365 hourly bars per year)
        sharpe = returns.mean() / (returns.std() + 1e-9) * np.sqrt(24 * 365)
        equity = returns.cumsum()
        max_dd = (equity - np.maximum.accumulate(equity)).min()
        win_rate = (returns > 0).mean()

        # Log metrics to MLflow
        mlflow.log_metric("sharpe_ratio", round(sharpe, 3))
        mlflow.log_metric("max_drawdown", round(max_dd, 4))
        mlflow.log_metric("total_return", round(total_return, 4))
        mlflow.log_metric("win_rate", round(win_rate, 3))

        print(f"Sharpe: {sharpe:.2f} | DD: {max_dd:.2%} | Return: {total_return:.2%}")

        # Register if Sharpe > 1.5
        if sharpe > 1.5:
            mlflow.sklearn.log_model(
                None,  # Replace with a trained sklearn/torch model before running
                "btc_momentum_model",
                registered_model_name="btc-momentum"
            )
            print("Registered to Model Registry (Sharpe > 1.5)")

# Grid search over hyperparameters
for lb in [4, 8, 14, 24]:
    for th in [0.01, 0.02, 0.03]:
        run_backtest(lb, th, leverage=2)

print("All experiments logged. View at: mlflow ui --port 5000")
```
MLflow's Model Registry gives you version control and deployment aliases for trading models. The Champion alias always points to the model currently executing live trades. Promoting a new model to Champion is a single API call — no code changes, no redeployment.
Champion: the model currently executing live trades via Purple Flea. Your production agent loads this version by alias, so promoting a new model is instant. Use client.set_registered_model_alias("btc-momentum", "Champion", version=5) to promote.
Challenger: the previous Champion, now running in shadow mode. Your agent can log predictions from both Champion and Challenger and compare live performance before permanently retiring the old version.
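The shadow comparison can be sketched as below. Note that `shadow_metrics` and `log_shadow_run` are illustrative helpers, not part of MLflow or Purple Flea, and the alias URIs assume the btc-momentum registry name used elsewhere in this guide.

```python
def shadow_metrics(champion_signal: int, challenger_signal: int,
                   realized_return: float) -> dict:
    """Per-tick comparison of Champion vs. a shadow-mode Challenger.

    Only the Champion's signal is traded; the Challenger's PnL is
    hypothetical, computed against the same realized market return.
    Signal convention: 1 = long, -1 = short, 0 = flat.
    """
    return {
        "champion_pnl": champion_signal * realized_return,
        "challenger_pnl": challenger_signal * realized_return,
        "signals_agree": float(champion_signal == challenger_signal),
    }

def log_shadow_run(features, realized_return):
    # Tracking-server dependencies, imported lazily so the pure helper
    # above works without a running MLflow server
    import mlflow
    import mlflow.pyfunc
    champion = mlflow.pyfunc.load_model("models:/btc-momentum@Champion")
    challenger = mlflow.pyfunc.load_model("models:/btc-momentum@Challenger")
    metrics = shadow_metrics(int(champion.predict(features)[0]),
                             int(challenger.predict(features)[0]),
                             realized_return)
    mlflow.log_metrics(metrics)
```

Over a long enough window, the cumulative challenger_pnl series tells you whether the retired model would have outperformed, without ever risking capital on it.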
```python
import mlflow.pyfunc
import requests
import time

# Always load Champion -- no hardcoded version numbers
model = mlflow.pyfunc.load_model("models:/btc-momentum@Champion")

PF_BASE = "https://purpleflea.com/api/v1"
HEADERS = {"Authorization": "Bearer pf_live_YOUR_KEY"}

def trade_loop(poll_seconds: int = 60):
    while True:
        # Fetch current features from Purple Flea
        mkt = requests.get(f"{PF_BASE}/markets/BTC-PERP/price",
                           headers=HEADERS).json()
        features = [[
            mkt["change_1h_pct"],
            mkt["change_4h_pct"],
            mkt["volume_24h_usd"],
            mkt["funding_rate"]
        ]]

        # Get Champion's prediction: 1 = long, -1 = short, 0 = flat
        signal = model.predict(features)[0]

        if signal == 1:
            requests.post(f"{PF_BASE}/trade",
                          json={"symbol": "BTC-PERP", "side": "long", "size": 50},
                          headers=HEADERS)
        elif signal == -1:
            requests.post(f"{PF_BASE}/trade",
                          json={"symbol": "BTC-PERP", "side": "short", "size": 50},
                          headers=HEADERS)

        time.sleep(poll_seconds)
```
Pull training data, execute model predictions as live trades, and measure real performance — all through Purple Flea's unified API.
Historical market data: Up to 365 days of OHLCV candles for 275 perpetual markets. Pull training datasets directly from /api/v1/markets/:symbol/history. No third-party data vendor required.
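Pulling a training set might look like the sketch below. Here candles_to_arrays and fetch_training_data are hypothetical helpers, and the response shape is assumed to match the candles array used in the backtest example above.

```python
import numpy as np
import requests

PF_BASE = "https://purpleflea.com/api/v1"

def candles_to_arrays(history: dict) -> dict:
    """Convert a Purple Flea history response into per-field numpy arrays.

    Assumes the response shape {"candles": [{"open": ..., "high": ...,
    "low": ..., "close": ..., "volume": ...}, ...]}.
    """
    candles = history["candles"]
    return {field: np.array([c[field] for c in candles])
            for field in ("open", "high", "low", "close", "volume")}

def fetch_training_data(symbol: str, days: int, api_key: str) -> dict:
    """Fetch a training window of hourly candles for one market."""
    resp = requests.get(
        f"{PF_BASE}/markets/{symbol}/history",
        params={"days": days, "interval": "1h"},
        headers={"Authorization": f"Bearer {api_key}"},
    )
    resp.raise_for_status()
    return candles_to_arrays(resp.json())
```

The resulting arrays drop straight into numpy-based feature engineering or a pandas DataFrame for model training.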
Trade execution: Convert model signals into live Hyperliquid perpetual futures positions. Market and limit orders, configurable leverage, and automatic stop-loss. The same endpoint you backtested against.
Performance tracking: Fetch live PnL, Sharpe, and drawdown from /api/v1/portfolio/performance and log them back to MLflow as live run metrics to compare against backtest predictions.
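A minimal sketch of that feedback loop follows, assuming the response exposes pnl_usd, sharpe_ratio, and max_drawdown fields; those field names are guesses, so check them against the actual endpoint.

```python
def performance_to_metrics(perf: dict) -> dict:
    """Map a /portfolio/performance response onto MLflow metric names.

    The response field names here are assumptions about the API, not
    documented; adjust to whatever the endpoint actually returns.
    """
    return {
        "live_pnl_usd": perf["pnl_usd"],
        "live_sharpe": perf["sharpe_ratio"],
        "live_max_drawdown": perf["max_drawdown"],
    }

def log_live_performance(api_key: str) -> None:
    # Network and tracking deps kept out of the pure helper above
    import requests
    import mlflow
    resp = requests.get(
        "https://purpleflea.com/api/v1/portfolio/performance",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    resp.raise_for_status()
    # Log under the current run so live metrics sit next to backtest ones
    mlflow.log_metrics(performance_to_metrics(resp.json()))
```

Run this on a schedule inside the same MLflow run as the deployment, and the live_ metrics chart directly against the backtested ones in the UI.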
Casino game data: Use casino game outcomes as a training signal for reinforcement learning agents. Win/loss history, bet sizing patterns, and expected value calculations are all available via API.
Faucet funding: New ML agent instances claim $1 from the faucet on first run. Use this in CI/CD pipelines where each model deployment spins up a fresh agent identity with funded capital for live evaluation.
Escrowed model competitions: Run Champion vs Challenger in a trustless competition with escrowed capital. The winning model claims the escrow at the end of the evaluation period. 1% platform fee, 15% referral.
Regime-aware strategy selection: Tag each MLflow run with the market regime (trending, ranging, volatile) and filter the experiment view to find which strategy performs best under each condition. Automatically route the production agent to the regime-appropriate Champion using a regime classifier that also runs through MLflow serving.
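A regime tagger can be as simple as the toy classifier below; classify_regime is a hypothetical helper and the thresholds are illustrative, not tuned.

```python
import statistics

def classify_regime(closes: list[float],
                    trend_threshold: float = 0.05,
                    vol_threshold: float = 0.02) -> str:
    """Label a price window as volatile, trending, or ranging.

    Volatility is the population stdev of bar-to-bar returns; trend is
    the net drift over the window. Both thresholds are illustrative.
    """
    rets = [(b - a) / a for a, b in zip(closes, closes[1:])]
    drift = (closes[-1] - closes[0]) / closes[0]
    vol = statistics.pstdev(rets)
    if vol > vol_threshold:
        return "volatile"
    if abs(drift) > trend_threshold:
        return "trending"
    return "ranging"
```

Tag each run with mlflow.set_tag("regime", classify_regime(window)), then filter the experiment view with a tag query like tags.regime = 'trending' to compare strategies regime by regime.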
Capital-split A/B testing: Deploy two model versions simultaneously. Route 20% of capital to the Challenger, 80% to the Champion. Log real PnL for both to MLflow as live metrics. After 30 days, compare Sharpe ratios and promote Challenger to Champion if it wins. The Purple Flea escrow API enforces capital separation between the two agents.
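Routing a pair of signals into sized orders can be sketched like this; build_orders is a hypothetical helper, and the "tag" field (for attributing fills back to a model) is an assumed, not documented, trade parameter.

```python
def build_orders(signals: dict, total_usd: float,
                 challenger_frac: float = 0.20) -> list:
    """Turn {model_name: signal} into /trade payloads with a capital split.

    Signal convention: 1 = long, -1 = short, 0 = flat. The default 80/20
    split mirrors the pattern described above; both numbers are
    configuration, not recommendations.
    """
    alloc = {
        "Champion": total_usd * (1 - challenger_frac),
        "Challenger": total_usd * challenger_frac,
    }
    orders = []
    for name, signal in signals.items():
        if signal == 0:
            continue  # flat: no order for this model this tick
        orders.append({
            "symbol": "BTC-PERP",
            "side": "long" if signal == 1 else "short",
            "size": alloc[name],
            "tag": name,  # hypothetical field for attributing fills
        })
    return orders
```

Each returned dict is ready to POST to /api/v1/trade; logging realized PnL per tag back to MLflow keeps the two models' live metrics separable.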
Automated Champion promotion: Build a nightly batch job that re-evaluates all registered models on the last 7 days of live data. If a Challenger's rolling Sharpe exceeds the Champion's by more than 15%, automatically promote it via the MLflow client API. Zero human approval required.
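The promotion decision plus the registry call can be sketched as follows. The helpers should_promote and nightly_promotion are illustrative, and the rolling Sharpe values are assumed to be computed upstream (for example, from the last 7 days of logged live metrics).

```python
def should_promote(champion_sharpe: float, challenger_sharpe: float,
                   margin: float = 0.15) -> bool:
    """Promote when the Challenger's rolling Sharpe beats the Champion's
    by more than `margin` (15% by default)."""
    if champion_sharpe <= 0:
        # A non-positive baseline makes ratio comparisons meaningless;
        # any positive challenger wins outright
        return challenger_sharpe > 0
    return challenger_sharpe > champion_sharpe * (1 + margin)

def nightly_promotion(champion_sharpe: float, challenger_sharpe: float,
                      model_name: str = "btc-momentum") -> None:
    from mlflow.tracking import MlflowClient  # needs a tracking server
    if should_promote(champion_sharpe, challenger_sharpe):
        client = MlflowClient()
        version = client.get_model_version_by_alias(
            model_name, "Challenger").version
        # Re-point the Champion alias; agents loading by alias pick this
        # up on their next load, with no redeployment
        client.set_registered_model_alias(model_name, "Champion", version)
```

Keeping the threshold check in a pure function makes the promotion policy itself unit-testable, separate from any registry state.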
Live drift detection: Each trade execution logs realized PnL, slippage, and execution quality back to MLflow as live run metrics alongside the predicted signal. Drift detection alerts you when live Sharpe diverges more than 2 standard deviations from backtested performance, automatically triggering a model refresh.
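One way to sketch that divergence check: collect live Sharpe samples (say, one per day), use their own spread as the scale, and alert when the mean drifts beyond 2 standard deviations from the backtested value. sharpe_drift is a hypothetical helper under those assumptions.

```python
import statistics

def sharpe_drift(live_sharpes: list[float], backtest_sharpe: float,
                 n_sigma: float = 2.0) -> bool:
    """True when mean live Sharpe sits more than n_sigma standard
    deviations away from the backtested Sharpe.

    The scale is the sample stdev of the live observations themselves;
    with fewer than two samples there is no spread to measure, so no
    alert is raised.
    """
    if len(live_sharpes) < 2:
        return False
    mean = statistics.mean(live_sharpes)
    sd = statistics.stdev(live_sharpes)
    return abs(backtest_sharpe - mean) > n_sigma * sd
```

When the check fires, the refresh itself can be a rerun of the backtest grid followed by the registry promotion path described above.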
MLflow manages your model lifecycle. Purple Flea provides the market data and trade execution. Register your agent and pull 90 days of BTC history in your first MLflow experiment.