WeSearch

Correlation Doesn’t Mean Causation! But What Does It Mean?

Sara A. Metwalli · 6 min read
What does correlation tell us?

Original article
Towards Data Science · Sara A. Metwalli

Sara A. Metwalli · Apr 28, 2026 · 6 min read
Image by Magda Ehlers from Pexels

Even before any of us got into data science, there was a phrase that we’d all heard; everyone knows it, young and old: “Correlation doesn’t imply causation.”

It is a catchy phrase, and you’ve definitely said it once or twice, and might even have nodded confidently when someone else said it. It comes up especially with datasets that have nothing to do with each other, but where implying causation is funny and intriguing!

Here are two very interesting facts:

Countries that eat more pizza tend to have higher math scores.
The more sunglasses sold, the more shark attacks occur.

Now, if that were all the information you had… what should you conclude? Does eating pizza make you better at math? Will buying a new pair of sunglasses cause a shark attack?

Though it is funny to think about, the answer to those questions is “probably not”. And yet, these are examples of something very real: correlation.

The question worth asking now is: if correlation doesn’t equal causation, then what does it mean? That’s where things get fuzzy. We tend to treat correlation as a vague idea, as if it meant “They’re kind of related” or “They move together somehow”. But correlation isn’t just a feeling; it’s a precise mathematical measurement of how two variables move together.

Instead of just repeating the warning, let’s actually understand the concept. Once you do, those weird examples stop being surprising and start making sense. So, let’s get into it!

What is correlation?

When people say two things are “correlated,” they usually mean one of three things:

“Those two things seem related.”
“Those two things move together.”
“There’s some connection between those two things.”

On the surface, none of these is wrong, but each misses some nuance. Correlation is not a vibe.
It’s a measurement! And like any measurement, it answers a very specific question.

Taking a step back, imagine you collect data on how many hours students studied and their exam scores. You plot it, and you see something like this:

[Figure: scatter plot of hours studied (x-axis) against exam score (y-axis)]

Each point represents one student. The x-axis is how long they studied, and the y-axis is their score. When you look at this plot, you notice that the points tend to move upward. So you conclude, “As study time increases, scores tend to increase too”, which is what we call a positive correlation.

But is that just a trend, or is the data telling you something more? In this example, the relationship you just plotted is: when one variable is above its average, the other tends to be above its average too. That’s the key idea most people miss: correlation isn’t about raw values; it’s about how variables move relative to their averages.

So, the question correlation answers is: do two variables move together in a consistent way? That question has one of three answers:

Up + up → positive correlation
Up + down → negative correlation
No consistent pattern → no correlation

The Math Behind Correlation

Let’s try to make thinking about correlation more intuitive. We will do that using the Pearson correlation coefficient, which we can define as:

r = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y}

Here cov(X, Y) is the covariance of the two variables, and \sigma_X and \sigma_Y are their standard deviations.

Okay, I know that equation isn’t what anyone thinks of when I say “intuitive”… But stick with me and let’s unpack it without turning it into a lecture.

Step 1: Covariance (AKA…
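
To make the Pearson formula above a little more concrete, here is a minimal Python sketch. It is an illustration only, not code from the article: the function name `pearson_r` and the study-hours numbers are invented for this example, and it assumes the sample (n - 1) versions of covariance and standard deviation. That factor cancels in the ratio, so the population (n) versions would give the same r.

```python
from math import sqrt
from statistics import fmean


def pearson_r(xs, ys):
    """Pearson r: covariance divided by the product of the standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")

    mean_x, mean_y = fmean(xs), fmean(ys)

    # How far each observation sits from its own variable's average
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]

    n = len(xs)
    # Covariance: average product of paired deviations (sample form, n - 1)
    cov_xy = sum(a * b for a, b in zip(dx, dy)) / (n - 1)
    # Standard deviations of each variable (same n - 1 convention)
    sd_x = sqrt(sum(a * a for a in dx) / (n - 1))
    sd_y = sqrt(sum(b * b for b in dy) / (n - 1))

    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")

    return cov_xy / (sd_x * sd_y)


# Hypothetical data, made up for illustration: hours studied and exam scores
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 79]

r = pearson_r(hours, scores)
print(f"r = {r:.3f}")
```

With these made-up numbers, students who studied more than average also tend to score above average, so the paired deviations mostly share a sign and r comes out positive and close to 1. Reversing the order of the scores would flip the sign.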

This excerpt is published under fair use for community discussion. Read the full article at Towards Data Science.
