Five things to know about Pearson r
The final topic for Chapter 11 is the Pearson r coefficient. You won’t be asked to
calculate this statistic, but you will need a basic conceptual understanding of
it. You’ll need to supplement this blog entry with the textbook because I can’t
include diagrams here.
1. The general idea
The Pearson coefficient (often referred to as “r”) is a measure of bivariate correlation. This means it measures the
strength of a relationship between two variables. It does NOT measure causation
(remember there are three criteria for causation and correlation is only one).
For instance, it seems intuitive enough that there is a
positive relationship between the variables annual income and years of education
(the more money you make, the more education you are likely to have). Therefore
we’d expect a Pearson coefficient to indicate a strong relationship between
these two variables. By contrast, it’s hard to imagine that there’s a
relationship between the variables eye color and income. It doesn’t make
sense that the two have anything to do with one another. In this case, we’d
expect our Pearson coefficient to indicate either a weak or nonexistent
relationship.
Now let’s talk about specifics.
2. The coefficient
The Pearson coefficient ranges from -1 to +1. The closer the
value is to -1 or to +1, the stronger the relationship between variables.
Negative and positive values that are close to “0” indicate a weak relationship
between variables. You’ll recall that we have two kinds of relationships
between variables, negative and positive. The sign of the coefficient reflects
which kind you have, and its distance from 0 reflects the strength, which is
why both -1 and +1 indicate a “strong” relationship.
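You won’t have to calculate r by hand, but if you’re curious how the -1 to +1
range plays out, here is a minimal Python sketch. The function name pearson_r
and the numbers are my own made-up illustration (not data from the book); the
two lists were chosen so that one rises and one falls in perfect lockstep with
education.

```python
import math

def pearson_r(xs, ys):
    """Pearson r for two equal-length lists of numbers (illustrative helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two variables move together around their means
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return co_dev / math.sqrt(ss_x * ss_y)

# Made-up numbers, for illustration only
years_education = [10, 12, 14, 16, 18]
rising = [20, 24, 28, 32, 36]    # goes up in lockstep with education
falling = [36, 32, 28, 24, 20]   # goes down in lockstep with education

r_pos = pearson_r(years_education, rising)
r_neg = pearson_r(years_education, falling)
print(round(r_pos, 2), round(r_neg, 2))  # prints: 1.0 -1.0
```

Real data almost never lines up this neatly, so real coefficients land somewhere
between the two extremes.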
3. Type of variable
Pearson r can only be used with variables measured at the interval or ratio
level. Nominal and ordinal variables are off limits. The
short and sweet of it is that the Pearson coefficient relies on a calculation
of the mean. And as you already know, the mean can only be calculated for
variables at those numeric levels of measurement. So, a red flag should go up
if you’re asked to interpret a Pearson coefficient for the variables age and
gender. This would be an invalid use of Pearson r because gender is a
nominal variable.
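For the curious, the standard formula makes the reliance on the mean visible.
You still won’t be asked to calculate it. The notation is my own choice and may
differ from the book’s: x and y are the two variables, the barred symbols are
their means, and n is the number of cases.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Every piece of the formula subtracts a mean, so a variable with no meaningful
mean (like gender) has nothing to plug in.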
4. Type of relationship
Pearson r can only be used to measure linear relationships.
Curvilinear relationships make the Pearson statistic null and void (the
curvilinear relationship between the variables may be real; it’s just that the
Pearson statistic cannot be used to measure or evaluate it). In lecture, I used
age and income as an example of a curvilinear relationship. I said that income
over a
lifetime is not a straight line; most people make no money as a child, lots of
money in their prime, and then minimal income after retirement. If you can
imagine plotting that relationship on an x/y graph, you’d have a curve. Another
curvilinear relationship is between health and age; the health of children and
the elderly tends to be poorer than the health of young and middle-aged adults.
Again, you’d have a curved plot on a graph. You may be asking: How do I tell if
a relationship is linear or curvilinear? The short answer is that you’d actually
have to plot it and look for a visible pattern. But don’t worry about that; for
our purposes you simply need to know that Pearson r is only appropriate for
linear relationships.
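To see why a curve trips up Pearson r, here is a small Python sketch using the
health and age example. The health scores are invented for illustration (they
are not real data) and were deliberately made symmetric around age 40;
pearson_r is my own helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson r for two equal-length lists of numbers (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(ss_x * ss_y)

# Invented scores: health is low in childhood, peaks at 40, then declines
ages = [10, 20, 30, 40, 50, 60, 70]
health = [1, 6, 9, 10, 9, 6, 1]

r = pearson_r(ages, health)
print(round(r, 2))  # prints: 0.0
```

The pattern is obvious to the eye (health clearly depends on age in these
numbers), yet the coefficient comes out as 0 because the rising half and the
falling half cancel each other out. That is the sense in which Pearson r cannot
evaluate a curvilinear relationship.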
5. Interpreting examples
Interpreting a Pearson coefficient is as easy as pie. A
relationship between variables can be: a) weak, b) moderate, or c) strong.
You’ll have to double-check this in the book (I don’t have mine handy) but the
guideline is something like 0-.3 is weak, .31-.69 is moderate, and .7-1 is
strong (same for negative numbers). So if age
and years of education have a Pearson coefficient of .4, you’d conclude that a
moderate positive relationship exists (the older you are, the more education
you have). If amount of smoking and life expectancy in years have a Pearson
coefficient of -.8, you’d conclude a strong negative relationship exists (the
more you smoke, the fewer years you live). If income and IQ have a
Pearson coefficient of “.1”, you’d say a weak relationship exists (so weak, in
fact, you’d probably conclude that no relationship exists).
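If it helps, the rule of thumb above can be written out as a few lines of
Python. This is only a sketch: the cutoffs are the approximate ones quoted in
this post (double-check them against the book), the function name describe_r is
made up, and I’ve assumed that values falling between .3 and .31 count as
moderate.

```python
def describe_r(r):
    """Label a Pearson coefficient using the rough cutoffs from this post."""
    if not -1 <= r <= 1:
        raise ValueError("a Pearson coefficient must be between -1 and +1")
    if r == 0:
        return "no linear relationship"
    size = abs(r)  # strength ignores the sign
    if size <= 0.3:
        strength = "weak"
    elif size < 0.7:
        strength = "moderate"
    else:
        strength = "strong"
    direction = "positive" if r > 0 else "negative"  # the sign gives direction
    return f"{strength} {direction}"

print(describe_r(0.4))   # prints: moderate positive
print(describe_r(-0.8))  # prints: strong negative
print(describe_r(0.1))   # prints: weak positive
```

Notice that the strength label comes from the absolute value and the direction
comes from the sign; the two are decided separately.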
That’s it. Those are the five key points to know about
Pearson r. To close, here’s a quiz to
test your understanding. Bring questions on Monday.
- Interpret a Pearson’s coefficient of .75 for the variables age and number of children.
- How would you draw a Pearson’s coefficient of “0” on a scatter plot?
- True or False: A coefficient of .5 is a stronger indicator that your hypothesis
is correct than a -.5 coefficient.
- True or False: A coefficient of -.75 for age and religion indicates a strong
negative relationship between age and religion.
- True or False: For the ratio level variables age and income, a
coefficient of .8 means there is a strong causal relationship between age and income.