Every morning at 07:30 ET, our pipeline scrapes the latest tweets from all 80+ tracked KOLs using the Apify Twitter Scraper API.
What we collect: tweet text, timestamp, engagement metrics (likes, retweets, replies). We only collect public tweets — nothing private, DM-based, or from protected accounts.
Not every tweet is a stock call. To keep costs low and accuracy high, we use a two-stage filtering system before calling the AI.
Stage 1 — Regex pre-filter: A tweet must contain both a ticker symbol (e.g. $AAPL) and a directional keyword (BUY, SELL, LONG, SHORT, BULLISH, BEARISH). If it doesn't match both conditions, we skip it entirely — no AI call made.
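A minimal sketch of what such a pre-filter could look like. The exact patterns are assumptions, not our production regex: here a ticker is `$` followed by one to five letters, and keywords match as whole words, case-insensitively.

```python
import re

# Assumed cashtag shape: "$" plus 1-5 letters (e.g. $AAPL). "$100" does not match.
TICKER_RE = re.compile(r"\$[A-Za-z]{1,5}\b")

# The six directional keywords listed above, matched as whole words.
# Case-insensitive matching is an assumption.
KEYWORD_RE = re.compile(r"\b(BUY|SELL|LONG|SHORT|BULLISH|BEARISH)\b", re.IGNORECASE)


def passes_prefilter(text: str) -> bool:
    """Return True only if the tweet has both a ticker and a directional keyword."""
    return bool(TICKER_RE.search(text)) and bool(KEYWORD_RE.search(text))


if __name__ == "__main__":
    print(passes_prefilter("BUY $AAPL here"))           # ticker + keyword
    print(passes_prefilter("I like $AAPL"))             # ticker only
    print(passes_prefilter("Time to buy the dip"))      # keyword only
```

Because keywords are matched as whole words in this sketch, a word like "buyer" would not count as BUY.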
Stage 2 — Claude Haiku: Tweets that pass the filter are sent to Anthropic's Claude Haiku AI, which extracts structured data from the natural language:
For every parsed call, we take four price snapshots (T0, T1D, T7D, and T30D) using Yahoo Finance (yfinance) to measure performance over time.
Prices are fetched daily as snapshots become due. A call posted today gets its T1D snapshot tomorrow, T7D in a week, and T30D in a month. Until a snapshot is taken, the call shows as Pending.
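The schedule described above can be sketched as follows. This is illustrative only: the function name is hypothetical, "a month" is taken as 30 calendar days, and weekend or market-holiday handling is not modeled.

```python
from datetime import date, timedelta

# Snapshot labels and offsets in calendar days (30 days for "a month" is an assumption).
SNAPSHOT_OFFSETS = {"T0": 0, "T1D": 1, "T7D": 7, "T30D": 30}


def due_snapshots(call_date: date, today: date, taken: set) -> list:
    """Return labels whose due date has arrived but which have not been taken yet."""
    due = []
    for label, days in SNAPSHOT_OFFSETS.items():
        if label not in taken and today >= call_date + timedelta(days=days):
            due.append(label)
    return due


if __name__ == "__main__":
    posted = date(2024, 1, 1)
    # The day after posting, with T0 already taken, only T1D is due.
    print(due_snapshots(posted, date(2024, 1, 2), {"T0"}))
```

Any label not yet returned by this function and not in `taken` corresponds to a snapshot that would still show as Pending.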
Once price snapshots are available, we calculate each KOL's accuracy metrics:
Correct call definition:
- BUY or LONG call → the snapshot price is higher than the T0 price
- SELL or SHORT call → the snapshot price is lower than the T0 price
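As a rough sketch, the definition above translates into a few lines of code. Two points are assumptions rather than stated rules: an unchanged price counts as incorrect, and win rate is correct calls divided by all resolved calls.

```python
from typing import Iterable, Optional, Tuple


def is_correct(direction: str, price_t0: float, price_later: float) -> bool:
    """Apply the correct-call definition for one snapshot (T1D, T7D, or T30D)."""
    if direction in ("BUY", "LONG"):
        return price_later > price_t0
    if direction in ("SELL", "SHORT"):
        return price_later < price_t0
    raise ValueError(f"unknown direction: {direction!r}")


def win_rate(calls: Iterable[Tuple[str, float, float]]) -> Optional[float]:
    """Share of correct calls; None when there are no resolved calls yet.

    Each call is (direction, price_t0, price_later). Hypothetical data shape.
    """
    results = [is_correct(d, p0, p1) for d, p0, p1 in calls]
    if not results:
        return None
    return sum(results) / len(results)


if __name__ == "__main__":
    sample = [("BUY", 100.0, 110.0), ("SELL", 50.0, 55.0)]
    print(win_rate(sample))
```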
Star rating system:
Rankings are recalculated daily after the pipeline runs. KOLs are sorted by win rate within each selected time period (1D, 7D, 30D).
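The ranking step can be pictured roughly like this. The data shape and the function name are hypothetical, and leaving out KOLs with no resolved calls for the period is an assumption.

```python
from typing import Dict, List, Optional, Tuple


def rank_kols(
    win_rates_by_period: Dict[str, Dict[str, Optional[float]]],
    period: str,
) -> List[Tuple[str, float]]:
    """Sort KOLs by win rate for one period ("1D", "7D", or "30D"), highest first."""
    rates = win_rates_by_period[period]
    # KOLs whose win rate is None have no resolved calls and are skipped (assumption).
    scored = [(name, rate) for name, rate in rates.items() if rate is not None]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    data = {"7D": {"kol_a": 0.5, "kol_b": 0.8, "kol_c": None}}
    print(rank_kols(data, "7D"))
```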
We DO track:
- US stocks — NYSE, NASDAQ, AMEX
- Public tweet-based calls from 80+ tracked KOLs
- BUY, SELL, LONG, and SHORT directional calls
- 1-day, 7-day, and 30-day performance windows
- KOLs across options flow, technical analysis, macro, and swing trading styles
We DON'T track (yet):
- Options trades — too complex (strike, expiry, contract size)
- Crypto calls — coming soon
- Private Discord, Telegram, or newsletter calls
- Deleted tweets — can't retroactively capture
- Intraday trades — we use end-of-day prices only
- Position sizing — we don't know if a KOL put 1% or 50% of their portfolio in
KOL categories we cover:
We believe transparency about our limitations builds more trust than pretending they don't exist.
- AI parsing errors (~5%): Our parser can misinterpret sarcasm, vague language, or complex multi-part tweets
- Timing ambiguity: A tweet at 3:58 pm ET could mean they bought before the market close or plan to buy the next morning. We don't know which
- Partial positions: A KOL might tweet "buying $AAPL" but only allocate 2% of their portfolio — we can't know
- Price data delays: Yahoo Finance quotes may be delayed by 15–20 minutes; we use closing prices, not real-time quotes
- Survivorship bias: KOLs who stop tweeting or go private disappear from tracking — their later failures aren't counted
- Small sample sizes: A KOL with 5–10 calls has a statistically unreliable win rate — don't read too much into it
- Context missing: A BUY call from a bearish KOL as a hedge means something different than from a pure bull — we don't capture context
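To make the small-sample limitation concrete, one way to see how uncertain a win rate is (not something our pipeline is stated to compute) is a Wilson score interval. The sketch below uses the standard formula with z = 1.96 for roughly 95% confidence.

```python
import math
from typing import Tuple


def wilson_interval(wins: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a win rate of wins/total."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = wins / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # Same 70% win rate, very different certainty.
    print(wilson_interval(7, 10))
    print(wilson_interval(700, 1000))
```

With 7 correct calls out of 10, the interval spans roughly 40% to 89%, so a 70% win rate on ten calls is compatible with a KOL who is no better than a coin flip.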
Our parser detects $SNDK + keyword BUY → sends to Claude Haiku → extracts {"ticker": "SNDK", "direction": "BUY"} → we snapshot the price at T0.
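An extracted object like the one above can be checked before it is logged. The following is only a sketch: the function name, the accepted ticker shape (one to five uppercase letters), and the rule of dropping anything malformed are assumptions, not a description of our production code.

```python
import json
import re
from typing import Optional

# The four directions we track, per the scope list on this page.
VALID_DIRECTIONS = {"BUY", "SELL", "LONG", "SHORT"}

# Assumed ticker shape; symbols with dots or other characters would be rejected.
TICKER_RE = re.compile(r"[A-Z]{1,5}")


def parse_call(raw: str) -> Optional[dict]:
    """Return {"ticker", "direction"} if the model output is well-formed, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    ticker = data.get("ticker")
    direction = data.get("direction")
    if not isinstance(ticker, str) or not TICKER_RE.fullmatch(ticker):
        return None
    if not isinstance(direction, str) or direction not in VALID_DIRECTIONS:
        return None
    return {"ticker": ticker, "direction": direction}


if __name__ == "__main__":
    print(parse_call('{"ticker": "SNDK", "direction": "BUY"}'))
    print(parse_call("not json"))
```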
Our parser correctly identifies it as a SELL call on $SLV. The call is logged and we track the price. But the price continued higher, so the SELL call is scored as wrong.
This is ambiguous. The tweet contains $TSLA but the sentiment is unclear — "not sure I'd be a buyer" could mean bearish, and "could see a bounce" could mean short-term bullish. Our regex pre-filter might not even pass this through (no explicit BUY/SELL keyword), but if it does, Claude Haiku may classify it as BUY based on "bounce" — which may be wrong.
This is exactly why we audit samples and why you should never rely solely on our parsed data.