Negative correlation introduced by success

by John D. Cook

Suppose you measure people on two independent attributes, X and Y, and take those for whom X+Y is above some threshold. Then even though X and Y are uncorrelated in the full population, they will be negatively correlated in your sample.

This article gives the following example. Suppose beauty and acting ability were uncorrelated. Knowing how attractive someone is would give you no advantage in guessing their acting ability, and vice versa. Suppose further that successful actors have a combination of beauty and acting ability. Then among successful actors, the beautiful would tend to be poor actors, and the unattractive would tend to be good actors.
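The actor example can be tried out numerically. The following is a minimal sketch, not taken from the article: it assumes beauty and acting ability are independent standard normal scores, and it treats an actor as "successful" when the two scores sum to more than 2. Both the scale and the cutoff are arbitrary choices for illustration, as are the names `acting_gap`, `gap_everyone`, and `gap_successful`. It uses only the standard library.

```python
import random
from statistics import median

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 100_000

# Each person is a (beauty, acting) pair of independent standard normal scores.
people = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

# Hypothetical success rule: the two scores must sum to more than 2.
successful = [(b, a) for b, a in people if b + a > 2.0]

def acting_gap(group):
    """Mean acting score of the more beautiful half minus the less beautiful half."""
    cut = median(b for b, _ in group)
    high = [a for b, a in group if b > cut]
    low = [a for b, a in group if b <= cut]
    return sum(high) / len(high) - sum(low) / len(low)

gap_everyone = acting_gap(people)
gap_successful = acting_gap(successful)

print("gap among everyone:         ", round(gap_everyone, 3))
print("gap among successful actors:", round(gap_successful, 3))
```

Under these assumptions the gap among everyone should be close to zero, while the gap among the successful should be clearly negative: within that group, the more beautiful half acts worse on average.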

Here's a little Python code to illustrate this. We take two independent attributes, distributed like IQs, i.e. normal with mean 100 and standard deviation 15. As the minimum required sum of the two attributes increases, the correlation between the two attributes among those selected becomes more negative.

    from numpy import arange
    from scipy.stats import norm, pearsonr
    import matplotlib.pyplot as plt

    # Correlation.
    # The function pearsonr returns correlation and a p-value.
    def corr(x, y):
        return pearsonr(x, y)[0]

    x = norm.rvs(100, 15, 10000)
    y = norm.rvs(100, 15, 10000)
    z = x + y

    span = arange(80, 260, 10)
    c = [ corr( x[z > low], y[z > low] ) for low in span ]

    plt.plot( span, c )
    plt.xlabel( "minimum sum" )
    plt.ylabel( "correlation coefficient" )
    plt.show()

[Plot: correlation coefficient versus minimum sum (negative_correlation.svg)]
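The simulated trend can be checked against a closed form. This is a sketch under the same assumptions as the simulation (independent normal attributes with equal mean and standard deviation), and the derivation is not from the original post. Write S and D for the standardized sum and difference of the two attributes. They are independent, so selecting on the sum shrinks the variance of S to some v < 1 and leaves the variance of D at 1, and the correlation among the selected works out to (v - 1)/(v + 1). Here v is the variance of a standard normal truncated below at the standardized cutoff. The function name `selected_corr` is made up for this example.

```python
from math import erfc, exp, pi, sqrt

def selected_corr(low, mean=100.0, sd=15.0):
    """Correlation of two independent N(mean, sd^2) attributes, given that their sum exceeds low."""
    # Standardize the cutoff: the sum has mean 2*mean and standard deviation sd*sqrt(2).
    c = (low - 2 * mean) / (sd * sqrt(2))
    pdf = exp(-c * c / 2) / sqrt(2 * pi)   # standard normal density at c
    tail = 0.5 * erfc(c / sqrt(2))         # P(Z > c) for standard normal Z
    lam = pdf / tail
    v = 1 + c * lam - lam ** 2             # variance of Z conditional on Z > c
    return (v - 1) / (v + 1)

for low in (150, 200, 250):
    print(low, round(selected_corr(low), 3))

# With the cutoff at the mean of the sum, the formula reduces to -1/(pi - 1).
print(selected_corr(200), -1 / (pi - 1))
```

When the cutoff equals the mean sum of 200, the correlation is -1/(π - 1), about -0.467, and it keeps falling as the cutoff rises, in line with the plot.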
