Correlation vs Causation

The following headline caught my eye recently:

“Migraines could be caused by gut bacteria, study suggests”

The Guardian – 18/10/16

To anybody who suffers from migraines, this is very interesting; at the moment, we really don’t understand what causes them. If a study has figured this out, then we may be able to help the estimated 15% of the population who are sufferers.

This report is based on a paper published this week, titled:

“Migraines Are Correlated with Higher Levels of Nitrate-, Nitrite-, and Nitric Oxide-Reducing Oral Microbes in the American Gut Project Cohort”

mSystems – 2016

The eagle-eyed among you may have spotted the problem already. The Guardian has swapped the word “correlated” for the word “caused”, which is why the headline is wrong. The research did not show that gut bacteria cause migraines. What it did show was that people who suffer from migraines are more likely to have slightly different bacteria in their mouths than people who don’t. While this might lead you to think that one causes the other, you cannot conclude that from the data. The difference between correlation and causation is easiest to explain with an example.

Sales of ice-cream (A) correlate with the number of shark attacks (B). This could mean one of a number of things:

  1. That A causes B.

Sharks are attacking more because people are buying more ice-cream.

  2. That B causes A.

People are buying more ice-cream because of all the shark attacks.

  3. That something else causes both A and B.

In good weather people go swimming more, and also buy more ice-cream.

  4. That it is just coincidence that A and B are happening together.

Unlikely in this case, but actually extremely common, as I describe below.
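
The "something else causes both" case is easy to simulate. The sketch below is a toy model, not real data: every number in it (the temperature range, the slopes, the noise levels) is made up for illustration, and the function names are my own. Temperature drives both ice-cream sales and a shark attack index, and neither one affects the other. The two still come out strongly correlated, and the correlation should all but vanish once the effect of temperature is subtracted from each.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]


def simulate(days=2000, seed=1):
    """Toy model with invented numbers: temperature drives both series."""
    rng = random.Random(seed)
    temperature = [rng.uniform(10, 35) for _ in range(days)]
    # Ice-cream sales depend on temperature plus noise, not on sharks.
    ice_cream = [20 * t + rng.gauss(0, 50) for t in temperature]
    # The shark attack index depends on temperature plus noise, not on ice-cream.
    sharks = [0.2 * t + rng.gauss(0, 1) for t in temperature]

    raw = pearson(ice_cream, sharks)
    # Correlate only what temperature does not explain in each series.
    adjusted = pearson(
        residuals(ice_cream, temperature),
        residuals(sharks, temperature),
    )
    return raw, adjusted


if __name__ == "__main__":
    raw, adjusted = simulate()
    print(f"ice-cream vs sharks: {raw:.2f}")
    print(f"after removing temperature: {adjusted:.2f}")
```

The catch is that this only works because the model tells us temperature is the hidden cause. In real data you rarely know which third factor to subtract, or whether you have measured it at all.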

It is very, very easy to find correlations between random things. Take this fact, for example: the divorce rate in Maine is correlated with the consumption of margarine (see image below). This obviously does not mean that margarine causes divorces.


Or the fact that the number of people who drown by falling into pools each year is correlated with the number of films that Nicolas Cage has been in that year. While it is tempting to suggest that Nicolas Cage films are so bad they are causing people to fall into pools, it seems a bit of an extreme reaction to his awful acting.
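
Coincidences like these two are easy to manufacture if you compare enough things. As a rough sketch (the series here are random walks that I generate, not real statistics, and the sizes of 200 series and 10 yearly values are arbitrary choices), the code below builds a pile of completely unrelated series and then goes looking for the pair that happens to line up best.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def random_walk(rng, length):
    """A series that drifts up and down at random, one step per 'year'."""
    value, walk = 0.0, []
    for _ in range(length):
        value += rng.gauss(0, 1)
        walk.append(value)
    return walk


def strongest_pair(n_series=200, length=10, seed=0):
    """Largest absolute correlation found among unrelated random series."""
    rng = random.Random(seed)
    series = [random_walk(rng, length) for _ in range(n_series)]
    # 200 series give 19,900 pairs to search through.
    return max(abs(pearson(a, b)) for a, b in combinations(series, 2))


if __name__ == "__main__":
    print(f"best correlation among random series: {strongest_pair():.3f}")
```

With only ten data points per series and thousands of pairs to choose from, the best match is almost always very close to a perfect correlation, even though nothing in the data is related to anything else.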


These kinds of spurious correlations are everywhere if you look for them. There is a very good website (here) that mines data to find new ones, including the two examples I have used above. While these correlations are usually obviously nonsense, sometimes a correlation makes instinctive sense, and it is easy to believe that one thing is causing another, without actually having any evidence that it is true. This unfortunately can cause serious damage.

As the rate of vaccination has increased over the last few decades, we have seen an explosion in the number of diagnoses of autism, which has led some people to claim that vaccines cause autism. It is an understandable assumption. The symptoms of autism appear at around the same age that children are vaccinated, so you can see why some parents jump to that conclusion. However, it has been clearly shown that there is no link between the two. In fact, the increase in the rate of autism is largely down to increased awareness and reporting, and not actually a result of more kids being autistic. Unfortunately, the belief that these correlated events (autism onset and vaccination) are linked has led to a decrease in vaccination rates, and to many preventable illnesses and deaths*.

This problem of mixing up correlation and causation is common in the media, and an easy trap to fall into. Certainly correlation sometimes does mean causation, but without additional evidence we simply cannot say that it does. Correlation studies are common in science, and are an important research tool, particularly for informing future studies. Unfortunately, these are sometimes over-interpreted, and lead to things being linked without cause.

The migraine study that I started this blog post with shows a correlation, but not causation. However, other studies have shown that chemicals that these bacteria produce can indeed cause headaches. While neither study is conclusive, together they suggest that it may be worth following up these findings in further studies, which is exactly what the researchers recommend. Unfortunately, that wouldn’t make such a good headline.


*It is worth pointing out that a certain percentage of people will get these illnesses, regardless of vaccinations. The numbers on the linked website above include these cases, so it is very difficult to know how many are directly due to decreased vaccination. It is clear that the numbers have been increasing as vaccination rates have decreased, but if this blog post has taught you anything, it is that we cannot say from this alone that one has definitely caused the other. However, when combined with other available evidence, we can be very sure of that assertion.

