At first, I found this really puzzling. X is correlated (Pearson) with Y, and Y is correlated with Z. Does this mean X is necessarily correlated with Z? Intuitively, this totally makes sense. The answer, however, is “no.”
Perhaps the strangest thing is how easy it is to rationalize this “puzzle.” I drink more beer (X) and read more books (Z) when I am on vacation (Y). That is, both pairs – X and Y, and Z and Y – are positively correlated. But I do not drink more beer when I read more books – X and Z are not correlated. It is now obvious that correlation is not (always) transitive, but a second ago, this sounded bizarre.
Let’s go through the math.
Digging a Bit Deeper
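Let’s denote the respective correlations between X, Y, and Z by $\rho_{XY}$, $\rho_{YZ}$, and $\rho_{XZ}$. For simplicity (and without loss of generality), let’s work with standardized versions of these variables – that is, means of 0 and variances of 1. This implies $\mathrm{Cov}(A, B) = E[AB] = \rho_{AB}$ for any pair $A, B$ of these variables.
We can write the linear projections of X and Z on Y as follows:
$$X = \rho_{XY} Y + \epsilon_X, \qquad Z = \rho_{YZ} Y + \epsilon_Z,$$
where the residuals $\epsilon_X$ and $\epsilon_Z$ have mean zero and are uncorrelated with Y.
Then, we have:
$$\rho_{XZ} = E[XZ] = \rho_{XY}\rho_{YZ}\,E[Y^2] + E[\epsilon_X \epsilon_Z] = \rho_{XY}\rho_{YZ} + E[\epsilon_X \epsilon_Z].$$
We can use the Cauchy-Schwarz inequality to bound the last term, $|E[\epsilon_X \epsilon_Z]| \le \sqrt{E[\epsilon_X^2]\,E[\epsilon_Z^2]} = \sqrt{(1-\rho_{XY}^2)(1-\rho_{YZ}^2)}$, which gives the final range of possible values for $\rho_{XZ}$:
$$\rho_{XY}\rho_{YZ} - \sqrt{(1-\rho_{XY}^2)(1-\rho_{YZ}^2)} \;\le\; \rho_{XZ} \;\le\; \rho_{XY}\rho_{YZ} + \sqrt{(1-\rho_{XY}^2)(1-\rho_{YZ}^2)}.$$
For instance, if we set $\rho_{XY} = \rho_{YZ} = \rho$, then we get:
$$2\rho^2 - 1 \;\le\; \rho_{XZ} \;\le\; 1.$$
That is, $\rho_{XZ}$ can be negative whenever $\rho^2 < 1/2$; with $\rho = 0.6$, say, the lower bound is $-0.28$.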
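The bound can also be checked numerically. The sketch below is an illustration rather than part of the original derivation: `rho_xz_bounds` is a hypothetical helper that evaluates the product of the two correlations plus or minus sqrt((1 - rho_XY^2)(1 - rho_YZ^2)), and the simulation uses illustrative loadings of 0.6 and 0.8 with normal draws, chosen here so that X and Z have unit variance and perfectly opposed residuals.

```r
# Hypothetical helper (not from the original post): range of rho_XZ implied
# by rho_XY and rho_YZ via the Cauchy-Schwarz argument.
rho_xz_bounds <- function(rho_xy, rho_yz) {
  slack <- sqrt((1 - rho_xy^2) * (1 - rho_yz^2))
  c(lower = rho_xy * rho_yz - slack,
    upper = rho_xy * rho_yz + slack)
}

bounds <- rho_xz_bounds(0.6, 0.6)
print(bounds)

# Simulation that sits at the lower bound: X and Z load equally on Y,
# but their residuals are perfectly opposed.
set.seed(1)              # arbitrary seed, only for reproducibility
n <- 100000
y <- rnorm(n)
e <- rnorm(n)            # residual, independent of y
x <- 0.6 * y + 0.8 * e   # Var(X) = 0.36 + 0.64 = 1
z <- 0.6 * y - 0.8 * e   # same loading on Y, opposite residual

sim_cor <- c(xy = cor(x, y), zy = cor(z, y), xz = cor(x, z))
print(round(sim_cor, 2))
```

Both X and Z should come out positively correlated with Y (about 0.6), while their mutual correlation should land near the lower bound of -0.28, up to sampling noise.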
An Extremely Simple Example
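Perhaps the simplest example to illustrate this is:
- X and Z are independent random variables,
- Y = X + Z.
The result follows: X and Z are each positively correlated with Y, yet uncorrelated with each other by construction.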
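A short worked calculation makes this concrete. It assumes Y = X + Z, as in the R code that follows, and uses $\sigma_X^2$ and $\sigma_Z^2$ (notation introduced here for illustration) for the variances of X and Z:

$$\mathrm{Cov}(X, Y) = \mathrm{Var}(X) + \mathrm{Cov}(X, Z) = \sigma_X^2, \qquad \mathrm{Corr}(X, Y) = \frac{\sigma_X^2}{\sigma_X \sqrt{\sigma_X^2 + \sigma_Z^2}} = \frac{\sigma_X}{\sqrt{\sigma_X^2 + \sigma_Z^2}} > 0.$$

The same holds for Z by symmetry, while $\mathrm{Corr}(X, Z) = 0$ by independence. With equal variances, as for two Uniform(0, 1) draws, both correlations with Y equal $1/\sqrt{2} \approx 0.71$.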
The following code sets up this example in R.
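    set.seed(68493)        # fix the seed for reproducibility
    x <- runif(n = 1000)   # X ~ Uniform(0, 1)
    z <- runif(n = 1000)   # Z ~ Uniform(0, 1), drawn independently of X
    y <- x + z             # Y = X + Z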
Below is a table with correlation coefficients and p-values associated with the null hypotheses that they are equal to zero.
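One way such a table could be produced is sketched below. This is an assumption about the approach, not the code from the repository: `cor_row` is a hypothetical helper that wraps base R's `cor.test()` and keeps the Pearson estimate and its two-sided p-value for each pair.

```r
# Re-create the example data (same seed and draws as the setup above).
set.seed(68493)
x <- runif(n = 1000)
z <- runif(n = 1000)
y <- x + z

# Hypothetical helper (not from the original post): run cor.test() on one
# pair and return the Pearson estimate with its two-sided p-value.
cor_row <- function(a, b, label) {
  ct <- cor.test(a, b, method = "pearson")
  data.frame(pair = label,
             correlation = unname(ct$estimate),
             p_value = ct$p.value)
}

# One row per pair of variables.
cor_table <- rbind(
  cor_row(x, y, "X, Y"),
  cor_row(z, y, "Z, Y"),
  cor_row(x, z, "X, Z")
)
print(cor_table)
```

The first two rows should show correlations around 0.7 with very small p-values, and the last row a correlation close to zero.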
You can find the code for this exercise in this GitHub repository.
When Is Correlation Transitive?
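From the inequality above it follows that when both $\rho_{XY}$ and $\rho_{YZ}$ are positive and sufficiently large, $\rho_{XZ}$ is sure to be positive (i.e., bounded below by 0). Specifically, the lower bound $\rho_{XY}\rho_{YZ} - \sqrt{(1-\rho_{XY}^2)(1-\rho_{YZ}^2)}$ is positive exactly when $\rho_{XY}^2 + \rho_{YZ}^2 > 1$.
For instance, if we fix $\rho_{XY} = 0.6$, then we need $\rho_{YZ} > 0.8$ to guarantee that $\rho_{XZ} > 0$.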
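The threshold can be computed directly. The sketch below is illustrative only: `min_rho_yz` and `rho_xz_lower` are hypothetical helpers, and the value 0.6 for rho_XY is an assumed example. For positive correlations, the lower bound on rho_XZ is positive exactly when rho_XY^2 + rho_YZ^2 > 1, so the threshold for rho_YZ is sqrt(1 - rho_XY^2).

```r
# Hypothetical helper (not from the original post): the value rho_YZ must
# exceed to force rho_XZ > 0, for a given positive rho_XY.
min_rho_yz <- function(rho_xy) {
  stopifnot(rho_xy > 0, rho_xy <= 1)
  sqrt(1 - rho_xy^2)
}

# Lower bound on rho_XZ from the Cauchy-Schwarz argument.
rho_xz_lower <- function(rho_xy, rho_yz) {
  rho_xy * rho_yz - sqrt((1 - rho_xy^2) * (1 - rho_yz^2))
}

rho_xy <- 0.6                      # assumed illustrative value
threshold <- min_rho_yz(rho_xy)
print(threshold)

# The lower bound changes sign as rho_YZ crosses the threshold.
grid <- c(0.7, 0.8, 0.9)
lower <- rho_xz_lower(rho_xy, grid)
print(data.frame(rho_yz = grid, lower_bound = lower))
```

Below the threshold the lower bound is negative, so a negative rho_XZ remains possible. Above it, rho_XZ is forced to be positive.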
Where to Learn More
Olkin (1981) derives some further mathematical results related to transitivity in higher dimensions.
Bottom Line
- X and Z both being correlated with Y does not guarantee that X and Z are correlated with each other.
- The guarantee does hold when the former two correlations are “large enough.”
References
Olkin, I. (1981). Range restrictions for product-moment correlation matrices. Psychometrika, 46, 469-472. doi:10.1007/BF02293804