Causal Discovery in Time Series: Untangling Time, Correlation & Causation


Introduction

"Correlation is not causation" is a mantra every statistician lives by. However, when it comes to time series data, the very structure of time gives us something to work with. After all, if variable A precedes variable B consistently, can we say A causes B? In this post, we dive into one of the most intriguing challenges in time series analysis: discovering causality from observational data.

We will explore classic and modern methods for identifying causality, their assumptions, limitations, and real-world applications. By the end, you’ll be equipped with tools and insights to experiment with causal inference in your time series data.


What Is Causality in Time Series?

Causality goes beyond correlation. It implies a directional influence — a cause must precede its effect. In time series, this temporal aspect offers a foothold to infer causality. However, time ordering alone is not enough. Confounding variables, feedback loops, and latent causes make the problem complex.

Let’s see how statisticians have tried to tackle it.


Classic Approach: Granger Causality

The idea of Granger causality, introduced by Clive Granger in 1969, offers a statistical test for causality: if past values of time series X contain information that helps predict the future of Y beyond what Y's own past already provides, then X is said to Granger-cause Y.

Key Citation: Granger, C. W. J. (1969). Investigating Causal Relations by Econometric Models and Cross-spectral Methods. Econometrica, 37(3), 424–438. https://doi.org/10.2307/1912791

How It Works:

  • Fit a Vector Autoregression (VAR) model on multivariate time series.

  • Use hypothesis testing (typically F-tests) to check if lagged values of X improve the prediction of Y.

Assumptions:

  • Linearity

  • Stationarity
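Before fitting a VAR, it is worth checking the stationarity assumption directly. Here is a minimal sketch using the Augmented Dickey–Fuller test from the tseries package (one common pre-check; a series flagged as non-stationary would usually be differenced with diff() before modeling):

```r
library(vars)     # provides the Canada dataset
library(tseries)  # provides adf.test()

data(Canada)

# Run an Augmented Dickey-Fuller test on each column;
# a large p-value means we cannot reject non-stationarity
apply(Canada, 2, function(series) adf.test(series)$p.value)
```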

R Example:


library(vars)
data(Canada)  # Built-in multivariate dataset
model <- VAR(Canada, p = 2, type = "const")
causality(model, cause = "e")  # Tests if 'e' Granger-causes other variables

[Figure: the "e" variable (Employment Index) from the Canada dataset plotted over time]

VAR Model Summary:

VAR Estimation Results:
========================= 
Endogenous variables: e, prod, rw, U 
Deterministic variables: const 
Sample size: 82 
Log Likelihood: -175.819 
Roots of the characteristic polynomial:
0.995 0.9081 0.9081 0.7381 0.7381 0.1856 0.1429 0.1429
Call:
VAR(y = Canada, p = 2, type = "const")

... (Full output truncated here for brevity, but will be available in downloadable analysis script) ...

Limitations:

  • Cannot detect nonlinear relationships

  • Fails with instantaneous causality or unobserved confounders
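The linearity limitation is easy to demonstrate with a toy simulation. In this hypothetical example, y is driven purely by the square of lagged x, so a linear Granger test on the raw series will typically miss the link, while the same test on the squared series recovers it (lmtest::grangertest is used here for a simple pairwise test):

```r
library(lmtest)  # provides grangertest()

set.seed(42)
n <- 500
x <- rnorm(n)
# y depends on the *square* of lagged x: a purely nonlinear link
y <- c(0, x[-n]^2) + rnorm(n, sd = 0.5)

# Linear Granger test on the raw series: usually non-significant here,
# because cor(x, x^2) is zero for a symmetric distribution
grangertest(y ~ x, order = 1)

# The same test on the squared series detects the relationship
x2 <- x^2
grangertest(y ~ x2, order = 1)
```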


Beyond Granger: Nonlinear & Nonparametric Methods

As data becomes more complex, we need tools beyond linear assumptions. Two major methods stand out:

1. Transfer Entropy

  • What it is: An information-theoretic measure that captures the amount of directed (time-asymmetric) information flow.

  • Pros: Detects nonlinear and non-Gaussian dependencies

  • Cons: Sensitive to parameter choices, computationally expensive

In R:

library(RTransferEntropy)
# x and y are two numeric time series (e.g., two columns of a dataset)
te_result <- transfer_entropy(x, y, lx = 1, ly = 1, nboot = 100)
summary(te_result)

Transfer Entropy Summary:

Shannon's Transfer Entropy

Coefficients:
            te       ete     se p-value    
X->Y 0.0014192 0.0000000 0.0159    0.43    
Y->X 0.1168745 0.0968558 0.0148  <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Bootstrapped TE Quantiles (100 replications):
Direction      0%     25%     50%     75%    100%
     X->Y  0.0000  0.0000  0.0000  0.0034  0.0856
     Y->X  0.0000  0.0000  0.0014  0.0032  0.0846 

Number of Observations: 84

Conclusion from Transfer Entropy:

The TE analysis shows a strong, statistically significant information transfer from Y to X (p < 2e-16) but none in the reverse direction (X to Y). This suggests that Y may drive or influence X, possibly through a nonlinear mechanism, and the absence of significant TE from X to Y reinforces the directionality of the link. It also highlights the strength of information-theoretic approaches at detecting asymmetric, nonlinear causality that Granger tests can overlook.

2. PCMCI (Python: Tigramite Library)

  • Uses conditional independence testing and graphical models to infer causality in multivariate time series.

  • Can handle autocorrelation and indirect links better than pairwise methods.

  • Usable from R via reticulate.

Key Citation: Runge, J., Nowack, P., Kretschmer, M., Flaxman, S., & Sejdinovic, D. (2019). Detecting and quantifying causal associations in large nonlinear time series datasets. Science Advances, 5(11), eaau4996. https://doi.org/10.1126/sciadv.aau4996
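Below is a minimal reticulate sketch of calling PCMCI from R. It assumes tigramite is installed in the active Python environment; module paths and constructor arguments have changed across tigramite versions, so treat this as a starting point rather than a definitive recipe:

```r
library(reticulate)
library(vars)

data(Canada)

np <- import("numpy")
dp <- import("tigramite.data_processing")
pc <- import("tigramite.pcmci")
it <- import("tigramite.independence_tests")  # module path varies by version

# Wrap the data in tigramite's own DataFrame class
df <- dp$DataFrame(np$array(Canada), var_names = colnames(Canada))

# Run PCMCI with a linear partial-correlation test, up to 2 lags
pcmci   <- pc$PCMCI(dataframe = df, cond_ind_test = it$ParCorr())
results <- pcmci$run_pcmci(tau_max = 2L, pc_alpha = 0.05)
results$p_matrix  # p-values for each (variable, variable, lag) link
```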


Real-Life Applications

  • Economics: Do rising interest rates reduce inflation?

  • Climate Science: Do rising CO2 levels cause temperature increases?

  • Finance: Does investor sentiment lead stock prices, or react to them?

These are not just academic questions — real policy decisions depend on causal conclusions.


Common Pitfalls

  • Reverse causality: The presumed effect may actually be driving the presumed cause.

  • Latent confounders: Unobserved variables influencing both X and Y.

  • Mismatch in sampling frequency: Causes and effects might occur at different time scales.

  • Overfitting in high-dimensional models: Especially when using many variables or nonlinear models.
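The latent-confounder pitfall is easy to reproduce. In this hypothetical simulation, an unobserved variable z drives both x and y at different lags, so a pairwise Granger test can report x → y even though neither series causes the other:

```r
library(lmtest)  # provides grangertest()

set.seed(1)
n <- 500
z <- rnorm(n)                                # unobserved common driver
x <- c(0, z[-n]) + rnorm(n, sd = 0.3)        # x responds to z at lag 1
y <- c(0, 0, z[-c(n - 1, n)]) + rnorm(n, sd = 0.3)  # y responds to z at lag 2

# x "leads" y only because both follow z,
# yet the test reports a significant x -> y link
grangertest(y ~ x, order = 1)
```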


Future Directions & Open Questions

  • Can we infer causality robustly in noisy real-world data?

  • How do we validate a causal model without interventions?

  • Can hybrid approaches (e.g., combining data with domain knowledge) improve accuracy?

  • How do deep learning models handle causality? Can we interpret them?

Causal discovery remains an active and open area of research. New algorithms, theoretical frameworks, and computational tools are evolving rapidly.


Summary & Takeaways

  • Causality is more than just a temporal lead-lag relationship.

  • Granger Causality is a simple and popular method, but it has limitations.

  • Transfer Entropy and PCMCI offer powerful nonlinear, multivariate approaches.

  • No method is perfect — always be cautious about assumptions and data limitations.


Bonus: Try It Yourself

Want to explore causality in your own datasets? Here's an example R script that:

  • Loads and explores multivariate time series data

  • Applies VAR and Granger causality tests

  • Computes Transfer Entropy between two series

  • Visualizes time series with ggplot2

# Load libraries
library(vars)
library(RTransferEntropy)
library(ggplot2)

# Load multivariate time series dataset
data(Canada)  # Example dataset from 'vars' package
head(Canada)

# Visualize one series
canada_df <- data.frame(time = as.numeric(time(Canada)), as.data.frame(Canada))
ggplot(canada_df, aes(x = time, y = e)) +
  geom_line(color = "steelblue") +
  labs(title = "Employment Index Over Time", x = "Time", y = "Employment") +
  theme_minimal()

# Fit a VAR model
model <- VAR(Canada, p = 2, type = "const")
summary(model)

# Granger causality test
causality(model, cause = "e")

# Example transfer entropy between two variables
x <- Canada[, "e"]
y <- Canada[, "prod"]
te_result <- transfer_entropy(x, y, lx = 1, ly = 1, nboot = 100)
summary(te_result)



Call to Action

Have you ever tried to discover causality in your own time series data? What methods worked or failed for you? Share your experiences or ask questions in the comments below — let’s learn together at StatSphere!
