Can headlines predict price moves, or do they simply echo what traders already know?

Every market participant faces the same asymmetric information problem: by the time you read a headline, has the trade already happened? We set out to answer this empirically, analysing 72,102 Bitcoin-related news headlines from 801 sources over two years using FinBERT sentiment scoring. The findings reveal a more nuanced picture than the simple "priced in" narrative suggests: one in which sentiment contains predictive information, but only for those who know where (and when) to look.

The Dataset: Two Years of Crypto Headlines

Our analysis spans December 2023 to December 2025, a period that captured one of Bitcoin's most dramatic cycles. BTC ranged from $41,067.81 to $124,773.51 across 730 days of news coverage, averaging approximately 99 articles per day. The sheer volume of coverage, nearly 100 headlines daily, reflects crypto's unique position as both a financial asset and a cultural phenomenon.

We processed each headline with FinBERT, a BERT-based model fine-tuned on financial text, and extracted sentiment scores ranging from -1 (bearish) to +1 (bullish). The resulting panel dataset allows us to examine sentiment dynamics across multiple dimensions: time-series behaviour, cross-sectional variation across sources, and, most importantly, predictive relationships with returns.
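To make the scoring step concrete, here is a minimal sketch of how per-headline classifier outputs could be turned into a daily sentiment series. It assumes each headline comes with a positive and a negative class probability and that the score is their difference, which is one common convention and not necessarily the exact mapping used for this analysis. The function names and the sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def headline_score(p_positive: float, p_negative: float) -> float:
    """Map class probabilities to a score in [-1, 1] (assumed convention)."""
    return p_positive - p_negative

def daily_sentiment(rows):
    """rows: iterable of (date_string, p_positive, p_negative), one per headline.
    Returns {date: mean headline score}, an equal-weighted daily aggregate."""
    by_day = defaultdict(list)
    for day, p_pos, p_neg in rows:
        by_day[day].append(headline_score(p_pos, p_neg))
    return {day: mean(scores) for day, scores in sorted(by_day.items())}

# Hypothetical classifier outputs for three headlines across two days.
rows = [
    ("2024-01-02", 0.80, 0.05),
    ("2024-01-02", 0.10, 0.60),
    ("2024-01-03", 0.30, 0.30),
]
daily = daily_sentiment(rows)
```

Equal weighting per headline means high-volume outlets dominate the daily figure, a design choice worth keeping in mind for the source-level results later on.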

Figure 1: BTC price, daily returns, and aggregated news sentiment over the analysis period (2023-2025). Note the visual correlation between sentiment spikes and return volatility.

The time series immediately reveals an interesting pattern: sentiment doesn't just track price, it appears to amplify during periods of market stress. This visual intuition becomes testable through formal statistical analysis.

The Sentiment Landscape: Perpetually Bullish?

Before examining predictive relationships, we need to understand the baseline distribution of crypto news sentiment. The results reveal a persistent optimism bias.

Daily sentiment averages 0.050 with a standard deviation of 0.116. More telling: 68.8% of days record net positive sentiment. This is unsurprising for an asset class built on technological optimism and network effects, but it has implications for how we interpret sentiment signals.

Figure 2: Distribution of daily sentiment scores. The left panel shows a histogram with kernel density estimate; the right panel displays a Q-Q plot against a normal distribution. Note the slight left skew (skewness: -0.401) indicating an asymmetric tail of negative sentiment days.

The distribution statistics tell a more complete story. The mean of 0.050 sits below the median of 0.062, which is what a left-skewed distribution produces: a handful of sharply negative days pull the average down while the typical day stays mildly positive. The standard deviation of 0.116 captures meaningful day-to-day variation. However, the most interesting feature is the negative skewness of -0.401, which deserves closer attention.

That negative skewness indicates an asymmetry in the tails of the distribution. Despite a positive mean, the distribution exhibits a longer left tail, indicating that extreme negative sentiment days occur more frequently than extreme positive ones. In other words, crypto headlines tend to be mildly positive most of the time, but when sentiment turns negative, it turns sharply negative. This skew is likely influenced by the timeframe analysed: a period during which Bitcoin mostly rose, creating a baseline of positive sentiment that makes negative outliers stand out more dramatically. The kurtosis of 0.460, read as excess kurtosis (the convention most statistical packages use), indicates tails that are slightly heavier than a normal distribution's, so extreme days are somewhat more common than a normal distribution would predict. And the extreme days that do occur tend to be negative.
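As an illustration of these shape statistics, the sketch below computes skewness and excess kurtosis from population moments and shows how a long left tail pulls the mean below the median. The sample values are made up for illustration, and statistical packages often apply small-sample corrections, so their numbers can differ slightly from this plain-moment version.

```python
from statistics import mean, median

def skew_and_excess_kurtosis(xs):
    """Population (uncorrected) skewness and excess kurtosis of a sample."""
    n = len(xs)
    m = mean(xs)
    m2 = sum((x - m) ** 2 for x in xs) / n  # variance
    m3 = sum((x - m) ** 3 for x in xs) / n  # third central moment
    m4 = sum((x - m) ** 4 for x in xs) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Hypothetical daily scores: mostly mild positives plus one sharply negative day.
scores = [0.08, 0.06, 0.07, 0.05, 0.09, 0.06, -0.30]
skew, excess_kurtosis = skew_and_excess_kurtosis(scores)
mean_below_median = mean(scores) < median(scores)
```

A negative skew together with a mean below the median is the same signature reported for the daily sentiment series above.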

This asymmetry will become crucial when we examine sentiment shocks.

The Core Question: Does Sentiment Lead or Lag?

Cross-correlation analysis provides the first rigorous test of sentiment's predictive power. By computing correlations across various leads and lags, we can determine whether sentiment anticipates price movements, reacts to them, or moves contemporaneously.

Figure 3: Cross-correlation between sentiment measures and BTC returns across lags from -7 to +7 days. Three sentiment measures are compared: absolute level, 1-day momentum (Δ1), and 7-day momentum (Δ7).

The results shown in Figure 3 are strikingly symmetric. The correlation at lag=0 (r=0.4602) confirms that sentiment and returns move together within the same day, hardly surprising given that headlines often react to price moves within hours. More interestingly, the correlations at lag=-1 and lag=+1 are identical at r=0.3056. At lag=-1, sentiment correlates with yesterday's returns, consistent with journalists writing stories in response to market moves. At lag=+1, sentiment correlates with tomorrow's returns, consistent with traders reacting to headlines.

This symmetry suggests a feedback loop rather than a purely lead-lag relationship. Today's returns influence tomorrow's sentiment (as journalists react to price moves), and today's sentiment influences tomorrow's returns (as traders respond to headlines). Neither direction dominates: the relationship is bidirectional.

Beyond ±1 day, correlations decay rapidly to near zero, indicating that any predictive signal is short-lived. This has immediate practical implications: if sentiment contains exploitable information, it must be acted on promptly.
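The lead-lag test can be sketched in a few lines. The convention assumed here is that lag k pairs sentiment on day t with returns on day t+k, so positive lags mean sentiment leads and negative lags mean it follows. The function names and toy series are illustrative only.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def lagged_corr(sentiment, returns, lag):
    """Correlation of sentiment[t] with returns[t + lag].
    lag > 0: sentiment leads returns; lag < 0: sentiment follows returns."""
    if lag >= 0:
        s, r = sentiment[: len(sentiment) - lag], returns[lag:]
    else:
        s, r = sentiment[-lag:], returns[: len(returns) + lag]
    return pearson(s, r)

# Toy series in which returns copy sentiment with a one-day delay.
sentiment = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2]
returns = [0.0] + sentiment[:-1]
profile = {k: lagged_corr(sentiment, returns, k) for k in range(-2, 3)}
```

In this toy case the profile peaks at lag +1 by construction; in the article's data the peaks at -1 and +1 are equal, which is what motivates the feedback-loop reading.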

Regression Analysis: Quantifying Predictive Power

Cross-correlation establishes association, but regression analysis quantifies actual predictive power. Before diving into the results, it's worth restating what cross-correlation tells us. It measures the relationship between two time series at different time lags. At negative lags (e.g., lag=-1), we're asking: does today's sentiment correlate with yesterday's returns? A nonzero value indicates that sentiment responds to past price movements. At positive lags (e.g., lag=+1), we're asking: does today's sentiment correlate with tomorrow's returns? A nonzero value indicates that sentiment carries information about future price movements. By examining correlations across multiple lags, we can map out the temporal structure of the relationship and determine whether sentiment leads, lags, or moves contemporaneously with returns. What correlation cannot tell us is how large a return move accompanies a given change in sentiment, which is the question regression answers.

We regressed forward returns on sentiment at multiple horizons, from next-day to cumulative weekly returns.

Figure 4: Multi-horizon regression results. The top panel shows coefficient estimates (β) with confidence intervals; the bottom panel displays R² values across prediction horizons.

Single-Day Returns

The next-day result is remarkably strong: sentiment explains 9.34% of next-day return variance, which matches the squared lag=+1 correlation (0.3056² ≈ 0.0934). For context, most single-factor models struggle to explain more than 1-2% of daily returns. The coefficient of 0.0655 carries a t-statistic of 8.65 (p<0.0001), leaving no doubt about statistical significance. Read literally, a one-unit increase in sentiment is associated with a 6.55 percentage-point higher next-day return. A full unit is far outside the typical daily range, though: a one-standard-deviation move in sentiment (0.116) corresponds to roughly 0.76 percentage points.
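For readers who want to reproduce this kind of estimate, the following is a minimal single-regressor OLS sketch returning the slope, R² and a t-statistic. It uses classical standard errors for simplicity, not the autocorrelation-robust errors described in the methodology note, and the toy data are invented. The last two lines redo the arithmetic on the reported figures.

```python
from statistics import mean

def ols(x, y):
    """Fit y = a + b*x by least squares.
    Returns (b, r_squared, t_stat) with classical, non-robust standard errors."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    se_b = (ss_res / (n - 2) / sxx) ** 0.5
    return b, r_squared, b / se_b

# Toy data: x stands in for today's sentiment, y for the next-day return.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 0.9, 2.2, 2.8, 4.1]
slope, r_squared, t_stat = ols(x, y)

# Arithmetic on the reported figures.
r2_from_corr = 0.3056 ** 2        # about 0.0934, i.e. 9.34%
per_sd_effect = 0.0655 * 0.116    # about 0.0076, i.e. 0.76 percentage points
```

With a single regressor, R² is simply the squared correlation, which is why the 9.34% figure and the lag +1 correlation of 0.3056 carry the same information.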

But here's the catch: predictive power vanishes almost entirely by day three. The t+3 regression produces a coefficient of -0.0021 with a t-statistic of -0.27 and an R² of just 0.01%, essentially indistinguishable from zero (p=0.7893). The t+7 regression is even weaker, with a coefficient of -0.0008, t-statistic of -0.10, and R² of 0.00% (p=0.9188). Whatever sentiment information is available, it is incorporated into prices within 24-48 hours.

Cumulative Returns: A Different Story

The single-day results indicate rapid information incorporation. But cumulative returns reveal something more subtle. Although day-three returns are individually uncorrelated with today's sentiment, cumulative returns over three and seven days remain statistically significant. The cumulative three-day return regression yields a coefficient of 0.0687, a t-statistic of 5.23, and an R² of 3.63% (p<0.0001). The cumulative seven-day return regression yields a coefficient of 0.0698, a t-statistic of 3.50, and an R² of 1.67% (p=0.0005). Note that the cumulative coefficients sit only slightly above the next-day coefficient of 0.0655: most of the move arrives on day one, and the following days neither add much nor reverse it, while the falling R² reflects the extra noise in longer windows. The interpretation: sentiment captures directional information that persists over multiple days, even as the day-to-day signal gets noisy.
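The cumulative targets used in these regressions can be built as below. This sketch assumes the window covers days t+1 through t+h and that returns are compounded; the article does not state whether returns were compounded or summed, so treat that as an assumption. The helper name is hypothetical.

```python
def forward_cumulative(returns, horizon):
    """Compounded return over days t+1 .. t+horizon for each day t.
    Days without a full forward window get None."""
    out = []
    for t in range(len(returns)):
        window = returns[t + 1 : t + 1 + horizon]
        if len(window) < horizon:
            out.append(None)
            continue
        growth = 1.0
        for r in window:
            growth *= 1.0 + r
        out.append(growth - 1.0)
    return out

# Toy daily returns.
returns = [0.0, 0.10, -0.10, 0.05]
two_day = forward_cumulative(returns, 2)
```

Because every cumulative window contains day t+1, a cumulative regression can stay significant even when the later days in the window contribute nothing on their own.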

This pattern suggests sentiment is better viewed as a regime indicator than a precise timing signal. A strongly positive sentiment day doesn't guarantee tomorrow's return, but it increases the probability that next week's trends will be upward.

When Headlines Scream: Sentiment Shock Analysis

If baseline sentiment provides weak signals, extreme sentiment days offer clearer opportunities. We defined "sentiment shocks" as days where the absolute sentiment z-score exceeded 2.0, a cutoff that leaves roughly 2.3% in each tail under a normal distribution.

The asymmetry we noted earlier becomes dramatic here. As Figure 5 shows, negative shocks outnumber positive shocks 3:1. Out of 36 total shock days, only nine were positive (1.2% of all days) while 27 were negative (3.7% of all days). This asymmetry reflects the psychological reality of financial media: fear sells. Markets can grind higher on tepid optimism, but crashes generate urgent, emotionally charged coverage.
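Shock detection of this kind can be sketched as follows. The version here standardises against the full-sample mean and standard deviation, which is an assumption: a rolling window would be the alternative if the flag has to be computed in real time. The toy series is invented.

```python
import math
from statistics import mean, pstdev

def find_shocks(scores, threshold=2.0):
    """Return {index: 'positive' | 'negative'} for days with |z| > threshold,
    where z is computed against the full-sample mean and standard deviation."""
    m, s = mean(scores), pstdev(scores)
    if s == 0:
        return {}
    shocks = {}
    for i, x in enumerate(scores):
        z = (x - m) / s
        if abs(z) > threshold:
            shocks[i] = "positive" if z > 0 else "negative"
    return shocks

# Toy series: twenty mildly positive days followed by one sharply negative day.
scores = [0.05] * 20 + [-0.40]
shocks = find_shocks(scores)

# One-sided tail mass beyond z = 2 under a normal distribution (about 0.0228).
tail = 0.5 * math.erfc(2.0 / math.sqrt(2.0))
```

Under normality the two tails would hold about the same number of days, so a 27-to-9 split is a sizeable departure from symmetry.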

Figure 5: Comparison of BTC behaviour on sentiment shock days vs normal days. The left panel shows return distributions; the middle panel compares absolute returns (volatility); the right panel breaks down returns by shock direction.

Volatility Spikes Around Shocks

Sentiment shock days exhibit significantly higher volatility, as Figure 5 confirms. The average absolute return on shock days (2.51%) is 42% higher than on normal days (1.77%), a statistically significant difference (t-stat=2.497, p=0.0127). This confirms intuition: when headlines reach extreme sentiment, expect elevated volatility regardless of direction.

More sobering is the directional result: shock days average -1.19% returns compared to +0.17% on regular days, a difference that is highly statistically significant (t-stat=-3.200, p=0.0014). The 3:1 ratio of negative to positive shocks drives this asymmetry. Extreme sentiment, particularly extreme negative sentiment, is associated with meaningful downside.
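A two-sample comparison like this can be computed with Welch's t-statistic, sketched below. The article does not say which test variant produced the reported figures, so Welch's unequal-variance form is an assumption, and the sample returns are invented.

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t-statistic for the difference in means of two samples,
    allowing the two groups to have different variances."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    return (mean(a) - mean(b)) / (va + vb) ** 0.5

# Hypothetical daily returns on shock days and on normal days.
shock_returns = [-0.03, -0.01, 0.02, -0.04]
normal_returns = [0.01, 0.00, -0.01, 0.02, 0.01, 0.00]
t_direction = welch_t(shock_returns, normal_returns)

# Applying the same test to absolute returns compares volatility instead.
t_volatility = welch_t([abs(r) for r in shock_returns],
                       [abs(r) for r in normal_returns])
```

With only 36 shock days against roughly 700 normal days, the unequal-variance form is the more cautious default.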

Figure 6: Event study around sentiment shocks. The left panel shows average returns in the ±5-day window around shock events; the right panel displays cumulative returns through the event window.

The event study reveals that shock days aren't isolated events: they tend to cluster during volatile periods. Cumulative returns over the ±5-day window average -1.72%, suggesting sentiment shocks often occur mid-correction rather than at turning points.
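The event-window averages can be computed roughly as follows. This is a minimal sketch: it skips events whose window runs past either end of the sample and sums the average returns to get a cumulative figure, whereas the article's own handling of edge cases and compounding is not specified.

```python
def event_window_average(returns, event_days, half_width=5):
    """Average return at each offset -half_width..+half_width around events.
    Events too close to the sample edges are skipped."""
    offsets = range(-half_width, half_width + 1)
    sums = {k: 0.0 for k in offsets}
    used = 0
    for d in event_days:
        if d - half_width < 0 or d + half_width >= len(returns):
            continue
        used += 1
        for k in offsets:
            sums[k] += returns[d + k]
    if used == 0:
        return {}
    return {k: sums[k] / used for k in offsets}

# Toy returns and event indices, using a short +/-2-day window.
returns = [0.01 * i for i in range(10)]
avg = event_window_average(returns, event_days=[0, 4, 5], half_width=2)
cumulative = sum(avg.values())
```

When shocks cluster, their windows overlap and the same days are counted more than once, which is worth remembering when reading the cumulative figure.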

The Momentum Question: Levels vs Changes

A natural follow-up: should traders focus on absolute sentiment levels, or on changes in sentiment (momentum)? The intuition behind momentum is that a shift from neutral to positive might matter more than a consistently positive outlook.

We tested this by computing Δ1 (one-day sentiment change) and Δ7 (seven-day sentiment change) and running the same predictive regressions.
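The momentum measures can be sketched as simple differences. This assumes Δ7 is today's score minus the score seven days earlier; a difference of rolling averages would be another reasonable reading of "seven-day sentiment change". The function name is hypothetical.

```python
def sentiment_change(scores, window):
    """Change in sentiment over `window` days: scores[t] - scores[t - window].
    Entries without enough history are None."""
    return [
        None if t < window else scores[t] - scores[t - window]
        for t in range(len(scores))
    ]

# Toy daily sentiment series.
scores = [0.1, 0.2, 0.4, 0.3]
delta1 = sentiment_change(scores, 1)
delta3 = sentiment_change(scores, 3)
```

The resulting series can be fed to the same regressions as the raw level, after dropping the leading None entries.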

Figure 7: Sentiment momentum analysis. Panels compare absolute sentiment levels against 1-day and 7-day momentum measures for predicting forward returns.

R² Comparison: Level vs Momentum

As Figure 7 shows, absolute sentiment levels consistently outperform momentum measures across all prediction horizons. For next-day returns, raw sentiment achieves an R² of 9.34%, compared to 8.77% for one-day momentum (Δ1) and just 5.04% for seven-day momentum (Δ7). The Δ1 measure comes close, but Δ7 falls well short. This pattern holds for cumulative returns as well: sentiment levels explain 3.63% of three-day cumulative return variance versus 3.00% for Δ1 and 2.30% for Δ7, and 1.67% of seven-day cumulative variance versus 1.42% for Δ1 and only 0.32% for Δ7.

The correlation analysis reinforces this finding:

  • Δ1 sentiment correlates 0.296 with next-day returns

  • Δ7 sentiment correlates -0.071 with next-day returns

Interestingly, the longer momentum window turns negatively correlated, hinting at mean reversion: after a week of rising sentiment, next-day returns lean slightly lower rather than continuing upward. At -0.071 the effect is weak, so it is better read as a hint than as a tradeable reversal signal. The lesson is clear: don't overcomplicate it. Today's raw sentiment reading outperforms derivative measures for short-term prediction.

Not All Sources Are Equal

With 801 unique sources in our dataset, a natural question emerges: do some outlets provide more predictive signals than others? Source-level analysis reveals substantial heterogeneity.

The Major Players

As Figure 8 shows, the major news sources vary substantially in both volume and predictive power. Biztoc.com leads in article count with 13,796 articles and an average sentiment of 0.065, producing a predictive coefficient of 0.0259 with a t-statistic of 4.22. However, CoinDesk emerges as the most predictive major source, with the highest t-statistic (4.51) for next-day return prediction despite having fewer articles (6,315). CoinDesk's average sentiment of 0.068 and coefficient of 0.0224 suggest it combines high volume with editorial focus, potentially capturing a meaningful signal that aggregator sites dilute.

Other significant sources show weaker predictive relationships. newsBTC (5,963 articles, avg sentiment 0.043) produces a coefficient of 0.0090 with a t-statistic of 1.55, while Bitcoinist (5,528 articles, avg sentiment 0.073) shows an even weaker coefficient of 0.0042 with a t-statistic of 0.72. Cointelegraph (4,387 articles, avg sentiment 0.041) achieves a coefficient of 0.0129 but with a t-statistic of just 1.01. Note Forbes' negative average sentiment (-0.032), one of the few major outlets with a persistent bearish lean. Interestingly, it still shows a positive predictive coefficient (0.0105, t-stat=2.09), suggesting that even a bearish outlet's relative optimism or pessimism carries information.

Figure 8: Top 15 news sources by article count. The left panel shows average sentiment; the right panel displays article volume.

Extreme Sentiment Sources

At the extremes, sentiment variation is dramatic:

  • Most positive: InvestorsObserver (avg: 0.4454)

  • Most negative: TheRegister.com (avg: -0.3349)

This spread of roughly 0.78 in average sentiment (0.4454 versus -0.3349) shows that a single source's score reflects editorial bias as much as market conditions. InvestorsObserver sits nearly half a point above neutral, while TheRegister.com sits well into bearish territory. Aggregating across many sources mitigates these source-specific biases and gets closer to the underlying market sentiment signal.
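One possible way to handle habitual source lean, beyond plain averaging, is to subtract each source's own average before aggregating, so that only a source's relative optimism or pessimism enters the daily figure. This is an illustrative variant and not the aggregation used in this analysis. The sample rows and function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def source_adjusted_daily(rows):
    """rows: (day, source, score). Subtract each source's own average score
    before averaging by day, so a habitual lean does not dominate.
    Note: the baseline uses the full sample, so it looks ahead in time."""
    by_source = defaultdict(list)
    for _, source, score in rows:
        by_source[source].append(score)
    baseline = {s: mean(v) for s, v in by_source.items()}
    by_day = defaultdict(list)
    for day, source, score in rows:
        by_day[day].append(score - baseline[source])
    return {day: mean(v) for day, v in sorted(by_day.items())}

# Hypothetical scores from one habitually bullish and one habitually bearish outlet.
rows = [
    ("d1", "bull", 0.5), ("d2", "bull", 0.3),
    ("d1", "bear", -0.2), ("d2", "bear", -0.4),
]
adjusted = source_adjusted_daily(rows)
```

In the toy data both outlets are more upbeat than usual on d1 and less so on d2, and the adjusted series picks that up even though their raw scores have opposite signs.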

Figure 9: Source-level sentiment characteristics. Each point represents a source (minimum 50 articles), positioned by average sentiment (x-axis) and sentiment volatility (y-axis). Point size reflects article count.

The scatter plot reveals the landscape: most sources cluster near neutral sentiment with moderate volatility, but several outliers, both consistently bullish and consistently bearish, anchor the extremes.

Limitations and Caveats

Several limitations deserve acknowledgement. First, transaction costs present a significant hurdle. While the 9.34% R² sounds impressive, translating statistical significance into profitable trading requires clearing the high bar of crypto trading costs, slippage, and execution risk. The rapid decay of predictive power within 24-48 hours means any strategy must execute quickly, potentially amplifying these costs.

Second, our analysis is subject to regime dependence. Our sample covers a single bull-bear-bull cycle from December 2023 to December 2025. The sentiment-return relationship may differ substantially in sustained bear markets or periods of low volatility. What works during Bitcoin's dramatic cycles may not generalise to calmer periods or different market regimes.

Third, FinBERT accuracy introduces measurement error. Sentiment scoring is imperfect. Sarcasm, complex narratives, and domain-specific language can trip up even fine-tuned models. Our aggregation across approximately 100 daily articles smooths these errors, but they exist and may bias results in unpredictable ways. The model may misclassify nuanced sentiment or fail to capture context-specific meanings that human readers would understand.

Finally, our methodology is subject to a look-ahead bias. Daily sentiment is computed from all articles published that day, but in practice a trader observes only part of the day's sentiment until the day ends. Articles published late in the day cannot influence returns realised earlier that day, yet our analysis treats all daily articles equally. This may inflate the apparent predictive power of sentiment.
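One way to probe this bias would be to recompute sentiment from only the articles available before a fixed cutoff time. The sketch below assumes each article carries a publication timestamp. The noon cutoff, the function name and the sample data are hypothetical.

```python
from datetime import date, datetime, time

def sentiment_before_cutoff(articles, day, cutoff=time(12, 0)):
    """Mean score of articles published on `day` strictly before `cutoff`.
    articles: iterable of (datetime, score). Returns None if none qualify."""
    scores = [s for ts, s in articles if ts.date() == day and ts.time() < cutoff]
    return sum(scores) / len(scores) if scores else None

# Hypothetical timestamped headline scores.
articles = [
    (datetime(2024, 1, 2, 9, 0), 0.2),
    (datetime(2024, 1, 2, 18, 0), -0.6),
    (datetime(2024, 1, 3, 8, 0), 0.4),
]
morning_only = sentiment_before_cutoff(articles, date(2024, 1, 2))
```

Comparing regressions on the cutoff-restricted series with the full-day series would show how much of the measured relationship depends on late-day articles.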

Conclusion

Does news sentiment lead or lag Bitcoin returns? The honest answer: both, and neither dominates. The symmetric correlation at lag ±1 indicates a feedback loop in which returns drive headlines and headlines drive returns in a continuous cycle.

Within this feedback loop, exploitable predictive information exists, but only barely, and only briefly. Sentiment accounts for a meaningful 9.34% of the variance in next-day returns, after which it decays to statistical noise within 72 hours. The signal is real but fleeting.

For DeFi natives accustomed to thinking in epochs and halvings, this may feel like noise trading. But in a market where 24-hour moves regularly exceed 5%, even a short-lived edge compounds. The key is respecting what sentiment tells us, and, equally important, what it doesn't.

News sentiment won't replace fundamental analysis or technical systems. But as one input in a multi-factor approach, appropriately weighted toward short horizons and shock events, it earns its place in the toolkit.


Methodology: Sentiment scores computed using FinBERT on 72,102 headlines from 801 sources (Dec 2023–Dec 2025). BTC price data from standard market feeds. All regressions use Newey-West standard errors with 5-day lag to account for autocorrelation. Statistical significance: * p<0.10, ** p<0.05, *** p<0.01.

Disclaimer: This article may contain material that is not directed to, or intended for distribution to or use by, any person or entity who is a citizen or resident of or located in any locality, state, country or other jurisdiction where such distribution, publication, availability or use would be contrary to law or regulation or which would subject 512m AG or its affiliates to any registration or licensing requirement within such jurisdiction. The information, tools and material presented in this article are provided to you for information purposes only and are not to be used or considered as an offer or the solicitation of an offer to sell or to buy or subscribe for securities or other financial instruments.
