Crypto prices move on narrative. The same Bitcoin that trades at $60k can trade at $80k six months later — not because the protocol changed, but because the story around it did. AI trading agents that ignore sentiment are leaving one of the strongest signals on the table.
This guide covers the full pipeline: collecting raw text from Twitter/X, Reddit, and financial news sources; normalizing and preprocessing at scale; scoring sentiment using both classical NLP (VADER, FinBERT) and modern LLM APIs; weighting sources by historical predictive reliability; and finally converting the aggregate score into actionable trade signals executed via the Purple Flea Trading API.
1. Data Sources and Collection Strategy
Not all text is equal. A tweet from a pseudonymous account with 200 followers carries a fundamentally different signal from a Reuters headline or a post from a known Ethereum core developer. Before writing any NLP code, design your source taxonomy carefully.
Source Taxonomy
The source weights used in this guide (the SOURCE_WEIGHTS values in compute_document_weight below) are starting values derived from backtesting against 18 months of BTC/USDC price data. They should be recalibrated quarterly using rolling correlation analysis between each source's signal and subsequent 4-hour returns.
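That recalibration pass can be sketched as follows, assuming you maintain one sentiment time series per source at 4-hour resolution alongside forward returns. The function name, column names, and the zero-flooring rule are our own illustrative choices, not a fixed recipe:

```python
import numpy as np
import pandas as pd

def recalibrate_source_weights(signals: pd.DataFrame,
                               returns: pd.Series,
                               window: int = 90 * 6) -> dict:
    """Rolling correlation between each source's sentiment signal and
    subsequent 4-hour returns, renormalized into weights.

    `signals`: one column per source, indexed at 4-hour bars.
    `returns`: forward (next-bar) returns on the same index.
    `window`:  default ~90 days of 4-hour bars.
    """
    corr = {
        src: signals[src].rolling(window).corr(returns).iloc[-1]
        for src in signals.columns
    }
    # Floor at zero: a source with negative predictive correlation
    # should be dropped, not traded against, until reviewed manually.
    floored = {s: (max(c, 0.0) if not np.isnan(c) else 0.0)
               for s, c in corr.items()}
    total = sum(floored.values()) or 1.0
    return {s: c / total for s, c in floored.items()}
```

The returned dict sums to 1.0 and can replace the static SOURCE_WEIGHTS table directly; sources whose rolling correlation has gone negative drop to zero weight until a human looks at them.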
Rate Limits and Legal Considerations
Twitter's Basic API tier allows roughly 10,000 tweet reads per month for free. For production sentiment agents, the Pro tier (100k tweets/month at $100/mo) is typically required. Reddit's API is free but rate-limited to 60 requests/minute per registered app. For news, CryptoCompare's free tier provides 100k API calls/month, sufficient for polling at a 15-minute cadence across 50 news sources.
Important: Always respect robots.txt and terms of service for each platform. Aggressive scraping without API access violates most platform ToS and can result in IP bans that break your agent's data pipeline without warning.
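For a cap like Reddit's 60 requests/minute, a small token-bucket limiter in front of the API client keeps the agent inside the limit without hard-coded sleeps. A minimal thread-safe sketch (class name and defaults are our own):

```python
import time
import threading

class TokenBucket:
    """Token-bucket rate limiter: at most `rate` requests per `per`
    seconds, with continuous refill. Safe to share across threads."""

    def __init__(self, rate: int = 60, per: float = 60.0):
        self.capacity = rate
        self.tokens = float(rate)
        self.refill_rate = rate / per  # tokens added per second
        self.last = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self) -> None:
        """Block until a token is available, then consume it."""
        while True:
            with self.lock:
                now = time.monotonic()
                # Refill proportionally to elapsed time, capped at capacity
                self.tokens = min(self.capacity,
                                  self.tokens + (now - self.last) * self.refill_rate)
                self.last = now
                if self.tokens >= 1.0:
                    self.tokens -= 1.0
                    return
                wait = (1.0 - self.tokens) / self.refill_rate
            # Sleep outside the lock so other threads are not blocked
            time.sleep(wait)

# Usage: call reddit_limiter.acquire() immediately before each API request
reddit_limiter = TokenBucket(rate=60, per=60.0)
```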
2. NLP Preprocessing Pipeline
Raw social media text is noisy: emoji, slang, sarcasm, URLs, and repeated meme phrases. A clean preprocessing pipeline dramatically improves downstream scoring accuracy.
Text Cleaning and Normalization
import re
import unicodedata
from dataclasses import dataclass, field
from typing import Optional
from datetime import datetime
import emoji
@dataclass
class SentimentDoc:
text: str
source: str # 'twitter' | 'reddit' | 'news'
timestamp: datetime
author_followers: int = 0
upvotes: int = 0
cleaned: Optional[str] = None
score: Optional[float] = None
weight: float = 1.0
# Common crypto slang normalizations
CRYPTO_SLANG = {
'wagmi': 'we are going to make it positive',
'ngmi': 'not going to make it negative',
'gm': 'good morning bullish',
'rekt': 'lost money negative',
'wen': 'when',
'ser': 'sir',
'fud': 'fear uncertainty doubt negative',
'fomo': 'fear of missing out buying',
'buidl': 'build positive',
'hodl': 'hold',
'moon': 'price increase very positive',
'dump': 'price decrease negative',
'pump': 'price increase positive',
}
def clean_text(text: str, source: str = 'twitter') -> str:
# Normalize unicode
text = unicodedata.normalize('NFKC', text)
# Convert emoji to text description (they carry sentiment)
text = emoji.demojize(text, delimiters=(' ', ' '))
# Remove URLs
    text = re.sub(r'https?://\S+', '', text)
    # Strip @mentions (the handle itself carries little sentiment)
    text = re.sub(r'@\w+', '', text)
    # Rewrite $ tickers as signal words before lowercasing
    text = re.sub(r'\$([A-Z]{2,6})', r'TICKER_\1', text)
# Apply crypto slang normalization
tokens = text.lower().split()
tokens = [CRYPTO_SLANG.get(t, t) for t in tokens]
text = ' '.join(tokens)
# Remove excessive punctuation/whitespace
    text = re.sub(r'[^\w\s_]', ' ', text)
    text = re.sub(r'\s+', ' ', text).strip()
return text
def compute_document_weight(doc: SentimentDoc) -> float:
"""
Authority weight: combines source base weight,
follower count (log-scaled), and recency decay.
"""
import numpy as np
from datetime import timezone
SOURCE_WEIGHTS = {'twitter': 0.30, 'reddit': 0.20, 'news': 0.35, 'onchain': 0.15}
base = SOURCE_WEIGHTS.get(doc.source, 0.20)
# Follower / upvote authority (logarithmic scale)
authority = np.log1p(max(doc.author_followers, doc.upvotes) / 1000) + 1.0
# Recency decay: half-life = 4 hours
now = datetime.now(timezone.utc)
age_hours = (now - doc.timestamp.replace(tzinfo=timezone.utc)).total_seconds() / 3600
recency = 0.5 ** (age_hours / 4.0)
return base * authority * recency
3. LLM-Based Sentiment Scoring
Classical VADER sentiment works reasonably well for short, direct text but struggles with irony, sarcasm, and the specific domain vocabulary of crypto. Fine-tuned models like FinBERT (trained on financial news) improve considerably, but the frontier is zero-shot or few-shot prompting of large language models.
Why LLMs Outperform VADER on Crypto Text
| Text | VADER Score | LLM Score | True Sentiment |
|---|---|---|---|
| "Bitcoin is going to zero lmao" | +0.12 (positive) | -0.85 (negative) | Negative |
| "WAGMI, ser, not financial advice" | -0.05 (neutral) | +0.72 (positive) | Positive |
| "This is bullish... for bears" | +0.65 (positive) | -0.60 (negative) | Negative |
| "Another 50% correction incoming" | +0.0 (neutral) | -0.78 (negative) | Negative |
import asyncio
import json
import httpx
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from transformers import pipeline
vader = SentimentIntensityAnalyzer()
finbert = pipeline(
    'text-classification',
    model='ProsusAI/finbert',
    top_k=None,  # return scores for all labels (replaces the deprecated return_all_scores=True)
)
LLM_SYSTEM = """You are a crypto market sentiment analyst.
Score the following text on a scale from -1.0 (extremely bearish)
to +1.0 (extremely bullish). Consider context, sarcasm, and
crypto-specific terminology. Respond with only a JSON object:
{"score": float, "confidence": float, "reason": string}"""
async def score_with_llm(text: str, api_key: str) -> float:
# Truncate to 512 tokens worth of characters
text = text[:2048]
async with httpx.AsyncClient(timeout=15.0) as c:
r = await c.post(
'https://api.anthropic.com/v1/messages',
headers={
'x-api-key': api_key,
'anthropic-version': '2023-06-01',
'content-type': 'application/json',
},
json={
'model': 'claude-haiku-4-5',
'max_tokens': 128,
'system': LLM_SYSTEM,
'messages': [{'role': 'user', 'content': text}],
},
)
r.raise_for_status()
content = r.json()['content'][0]['text']
data = json.loads(content)
return float(data['score']) * float(data.get('confidence', 1.0))
def score_vader(text: str) -> float:
return vader.polarity_scores(text)['compound']
def score_finbert(text: str) -> float:
text = text[:512]
results = finbert(text)[0]
score_map = {r['label']: r['score'] for r in results}
return score_map.get('positive', 0) - score_map.get('negative', 0)
async def ensemble_score(text: str, llm_api_key: str) -> dict:
"""
Ensemble of VADER (fast), FinBERT (finance-tuned),
and LLM (context-aware). Weights: 0.15, 0.35, 0.50
"""
vader_s = score_vader(text)
finbert_s = score_finbert(text)
llm_s = await score_with_llm(text, llm_api_key)
composite = 0.15 * vader_s + 0.35 * finbert_s + 0.50 * llm_s
return {
'composite': composite,
'vader': vader_s,
'finbert': finbert_s,
'llm': llm_s,
}
Cost optimization: Reserve LLM scoring for high-authority documents (news articles, verified account tweets). Run VADER alone on the bulk of low-authority social media. This reduces LLM API costs by ~85% with less than 5% degradation in signal quality.
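One way to sketch that routing, assuming a cheap scorer like `score_vader` and an expensive one like `ensemble_score` from above. The weight threshold is an illustrative assumption; tune it so roughly the top 10–15% of documents by authority weight take the LLM path:

```python
import asyncio
from typing import Awaitable, Callable

# Illustrative cutoff on the document authority weight, not a known constant
LLM_WEIGHT_THRESHOLD = 0.40

async def route_and_score(
    doc_weight: float,
    text: str,
    cheap_scorer: Callable[[str], float],                # e.g. score_vader
    expensive_scorer: Callable[[str], Awaitable[dict]],  # e.g. ensemble_score
) -> dict:
    """Send high-authority documents through the full ensemble and
    everything else through the cheap scorer, recording which path ran."""
    if doc_weight >= LLM_WEIGHT_THRESHOLD:
        result = await expensive_scorer(text)
        return {'score': result['composite'], 'path': 'ensemble'}
    return {'score': cheap_scorer(text), 'path': 'vader'}
```

Because the routing decision uses the same `compute_document_weight` output you already calculate for aggregation, the gate adds no extra work per document.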
4. Source Weighting and Aggregation
Individual document scores need to be aggregated into a single market-level sentiment index every 15 minutes. This requires weighting by source reliability, author authority, document recency, and volume normalization to prevent single high-authority accounts from dominating the signal.
import numpy as np
from collections import deque
from datetime import datetime, timedelta, timezone
from typing import List, NamedTuple
class SentimentWindow(NamedTuple):
timestamp: datetime
score: float # -1 to +1
volume: int # document count
bullish_pct: float
bearish_pct: float
contrarian_signal: float # inverse of extreme readings
class SentimentAggregator:
"""
Rolling 15-minute sentiment windows with EWMA smoothing.
Detects contrarian signals on extreme readings.
"""
CONTRARIAN_THRESHOLD = 0.75 # above this, contrarian signal triggers
def __init__(self, window_minutes: int = 15, ewma_span: int = 8):
self.window_minutes = window_minutes
self.ewma_span = ewma_span
self.buffer: deque = deque()
self.history: list = []
self._ewma_score = 0.0
self._alpha = 2.0 / (ewma_span + 1)
def ingest(self, doc, composite_score: float):
weight = compute_document_weight(doc)
self.buffer.append({
'ts': doc.timestamp,
'score': composite_score,
'weight': weight,
})
self._flush_old()
def _flush_old(self):
cutoff = datetime.now(timezone.utc) - timedelta(minutes=self.window_minutes)
while self.buffer and self.buffer[0]['ts'].replace(tzinfo=timezone.utc) < cutoff:
self.buffer.popleft()
def compute_window(self) -> SentimentWindow:
if not self.buffer:
return SentimentWindow(datetime.now(timezone.utc), 0.0, 0, 0.5, 0.5, 0.0)
docs = list(self.buffer)
weights = np.array([d['weight'] for d in docs])
scores = np.array([d['score'] for d in docs])
# Winsorize at ±2σ to clip outliers
mu, sigma = scores.mean(), scores.std() + 1e-9
scores = np.clip(scores, mu - 2*sigma, mu + 2*sigma)
weighted_score = np.average(scores, weights=weights)
# EWMA update
self._ewma_score = (self._alpha * weighted_score
+ (1 - self._alpha) * self._ewma_score)
bullish_pct = (scores > 0.1).mean()
bearish_pct = (scores < -0.1).mean()
# Contrarian signal: extreme consensus reversal
abs_score = abs(self._ewma_score)
contrarian = 0.0
if abs_score > self.CONTRARIAN_THRESHOLD:
# Signal is opposite to consensus direction
contrarian = -np.sign(self._ewma_score) * (abs_score - self.CONTRARIAN_THRESHOLD)
window = SentimentWindow(
timestamp=datetime.now(timezone.utc),
score=self._ewma_score,
volume=len(docs),
bullish_pct=bullish_pct,
bearish_pct=bearish_pct,
contrarian_signal=contrarian,
)
self.history.append(window)
return window
Contrarian Signals
When sentiment becomes extremely one-sided — EWMA above 0.75 bullish or below -0.75 bearish — history shows the market often reverses within 4–12 hours. This is the "everyone is already long" effect: when bullish sentiment is overwhelming, the marginal buyer has already bought, and any negative catalyst triggers cascade selling.
The contrarian_signal field in the window output carries a negative score when the crowd is excessively bullish (and vice versa). Your agent can use this as an additional feature alongside the primary sentiment score.
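For quick unit-testing, the contrarian rule can be restated as a standalone function equivalent to the logic inside `compute_window`:

```python
import math

def contrarian_signal(ewma_score: float, threshold: float = 0.75) -> float:
    """Zero inside the threshold band; outside it, a signal opposite in
    sign to the consensus that grows with the excess over the threshold."""
    if abs(ewma_score) <= threshold:
        return 0.0
    return -math.copysign(abs(ewma_score) - threshold, ewma_score)
```

An EWMA of +0.85 yields -0.10 (fade the euphoric crowd slightly), while -0.90 yields ≈ +0.15 (lean against extreme bearish consensus).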
5. Trading Integration with Purple Flea
Sentiment signals work best as a filter on top of technical signals, not as a standalone entry trigger. A strong LSTM forecast pointing long becomes even more compelling when sentiment confirms it. A forecast pointing long into strong bearish sentiment is a position to skip or size down.
import asyncio
import httpx
from datetime import datetime, timezone
API_BASE = 'https://purpleflea.com/trading-api'
HEADERS = {'X-API-Key': 'YOUR_KEY', 'Content-Type': 'application/json'}
def combine_signals(
technical_signal: str, # 'long' | 'short' | 'flat'
technical_conf: float, # 0.0 - 1.0
sentiment_window, # SentimentWindow namedtuple
) -> dict:
"""
Combine technical forecast with sentiment.
Returns: {action, size_multiplier, reason}
"""
sent = sentiment_window.score
cont = sentiment_window.contrarian_signal
vol = sentiment_window.volume
# Low volume = noisy sentiment, reduce its influence
sentiment_weight = min(1.0, vol / 50) * 0.40
tech_weight = 1.0 - sentiment_weight
# Translate technical signal to numeric
tech_dir = {'long': 1.0, 'flat': 0.0, 'short': -1.0}.get(technical_signal, 0.0)
# Effective signal
effective = (tech_weight * tech_dir * technical_conf
+ sentiment_weight * (sent + cont))
# Convert to action + position size multiplier
if effective > 0.25:
action = 'long'
size_mult = min(2.0, 1.0 + effective)
elif effective < -0.25:
action = 'short'
size_mult = min(2.0, 1.0 + abs(effective))
else:
action = 'flat'
size_mult = 0.0
return {
'action': action,
'size_mult': size_mult,
'effective': effective,
'sentiment': sent,
'contrarian': cont,
}
async def execute_signal(combined: dict, base_qty: float = 0.001):
if combined['action'] == 'flat':
return
qty = base_qty * combined['size_mult']
side = 'buy' if combined['action'] == 'long' else 'sell'
async with httpx.AsyncClient() as c:
r = await c.post(
f'{API_BASE}/orders',
json={
'symbol': 'BTC/USDC',
'side': side,
'quantity': round(qty, 6),
'type': 'market',
'metadata': {
'strategy': 'sentiment_v1',
'sentiment': combined['sentiment'],
'ts': datetime.now(timezone.utc).isoformat(),
},
},
headers=HEADERS,
)
r.raise_for_status()
print(f"Order placed: {side} {qty:.6f} BTC @ market")
print(f" sentiment={combined['sentiment']:.3f} effective={combined['effective']:.3f}")
Getting started free: New agents can claim USDC from the Purple Flea Faucet to fund their first sentiment-driven trades without any upfront capital. Register, claim, and start testing your pipeline in live markets immediately.
6. Backtesting Sentiment Signals
Before committing capital, backtest the combined technical + sentiment signal on at least 6 months of historical data. Key metrics to check: Sharpe ratio with and without the sentiment overlay, the win rate of trades triggered by sentiment confirmation vs. those without it, and maximum drawdown during high-sentiment-volume periods.
Common findings from backtesting crypto sentiment signals:
- News-driven sentiment leads price action by 2–4 hours on average for major headlines
- Twitter volume spikes (3x+ normal) reliably precede volatility expansions but not a specific direction
- Reddit upvote-weighted sentiment is more reliable for 24h+ horizons than Twitter
- Contrarian signals have a ~58% win rate on 4-hour reversals when EWMA sentiment exceeds ±0.80
- The "not financial advice" filter — removing posts containing this phrase slightly improves signal quality by filtering out amateur speculation
Survivorship bias warning: When backtesting news sentiment, ensure your historical news dataset includes negative news that caused delistings or collapses. Systems trained only on surviving assets will systematically underweight tail risk.
Add Sentiment to Your Trading Agent
The Purple Flea Trading API accepts orders from any agent in any language. Combine your sentiment pipeline with Purple Flea's real-time market data for a complete signal-to-execution stack. First trade is free with faucet USDC.