Correlations with attractiveness

Relationship between facial attractiveness and political affiliation, intelligence, health and more.

Jonatan Pallesen
05-21-2019

Introduction

The data is from the Wisconsin Longitudinal Study. Data preparation is performed here. These are people born in Wisconsin in 1939-1940.

I plot the relationship between attractiveness and a number of traits. Attractiveness is rated in the study based on the yearbook photo. Since looking at a single yearbook photo taken at age 16-17 is a bad measure of attractiveness, we expect to find at most weak correlations between other traits, and indeed this is the case. But it is plausible that a better measure of attractiveness would give higher correlations in the same direction, and thus I think it is still interesting to investigate.


Analysis

read data


library(pacman)

p_load(tidyverse, magrittr, broom, janitor, feather, rap)

source('../../src/extra.R', echo = F, encoding="utf-8")

sample size


df <- read_feather("data/data.f") %>% drop_na(attractiveness)

df %>% tabyl(gender) %>% select(-percent)
gender n
female 4416
male 4018

prepare


df %<>% 
  mutate(
    party = case_when(
      party_affiliation == 1 ~ "Republican",
      party_affiliation == 2 ~ "Democrat")) %>% 
  mutate_at(vars(attractiveness, income, eduyears, spouse_iq, household_income),
            stdize)

men <- df %>% filter(gender == "male")

women <- df %>% filter(gender == "female")

get correlations


get_lm <- function(df, v, g){
  lm(glue("attractiveness ~ {v}"), 
     df %>% filter(gender == g)) %>% tidy() %>% 
    filter(term == v) %>% select(cor = estimate, p.value)
}

l1 <- crossing(
  v1 = c("attractiveness"),
  v2 = c("health92", "income", "household_income", "iq_std", "spouse_iq", "eduyears"),
  gender = c("male", "female"),
  controls = c("")
) %>% 
  rap(corr = ~ get_lm(df, v2, gender)) %>% 
  unnest(corr)



get_lm2 <- function(df, g, controls){
  lm(glue("attractiveness ~ party{controls}"), 
     df %>% filter(gender == g)) %>% tidy() %>% 
    filter(term == "partyRepublican") %>% select(cor = estimate, p.value)
}

l2 <- crossing(
  v1 = c("attractiveness"),
  v2 = c("partyRepublican"),
  gender = c("male", "female"),
  controls = c("", " + income + eduyears")) %>% 
  rap(corr = ~ get_lm2(df, gender, controls)) %>% 
  unnest(corr)

bind_rows(l1, l2)
v1 v2 gender controls cor p.value
attractiveness eduyears female 0.103 5.98e-08
attractiveness eduyears male 0.00519 0.713
attractiveness health92 female 0.154 1.71e-08
attractiveness health92 male 0.0504 0.075
attractiveness household_income female 0.14 1.62e-16
attractiveness household_income male 0.0293 0.0621
attractiveness income female -0.0407 0.302
attractiveness income male 0.0307 0.0427
attractiveness iq_std female 0.119 4.85e-14
attractiveness iq_std male 0.0265 0.0773
attractiveness spouse_iq female 0.1 0.000569
attractiveness spouse_iq male -0.0307 0.293
attractiveness partyRepublican female 0.23 1.11e-06
attractiveness partyRepublican female + income + eduyears 0.227 2.32e-06
attractiveness partyRepublican male 0.087 0.0948
attractiveness partyRepublican male + income + eduyears 0.062 0.249

Conclusion