# 13 Reasons Why

Did the Netflix show 13 Reasons Why cause a large jump in suicides? The evidence is shaky.

Jonatan Pallesen
05-05-2019

## Introduction

A recent study finds that the number of suicides increased following the release of the Netflix show 13 Reasons Why. This has gotten a lot of press, but questions like this can be statistically tricky. The numbers are easy to download from CDC Wonder, so I did that to take a look myself.

## Analysis

``````
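library(pacman)
# Assumption: the original package list was lost; these are the packages the code below uses.
p_load(tidyverse, magrittr, janitor, jcolors)

source('../../src/extra.R', echo = F, encoding="utf-8")

# Read a CDC Wonder export and keep year, month, suicide count and gender.
# Assumption: the function header and the read step were lost from the original;
# the export is taken to be a tab-separated text file.
read_cdc <- function(path){
  read_tsv(path) %>%
    clean_names() %>%
    filter(is.na(notes)) %>%
    separate(month_code, into = c("year", "month"), sep="/") %>%
    select(year, month, suicides = deaths, gender)
}

# Scale each gender's series by its January 2016 value.
normalize <- function(df){
  females <- df %>% filter(gender == "Female")
  y0 <- females %>% filter(year == 2016, month == "01") %>% pull(suicides)
  females %<>% mutate(suicides = suicides / y0)

  males <- df %>% filter(gender == "Male")
  y0 <- males %>% filter(year == 2016, month == "01") %>% pull(suicides)
  males %<>% mutate(suicides = suicides / y0)

  bind_rows(males, females)
}

# `teens` and `twenties` are assumed to be loaded with read_cdc() from the
# downloaded CDC Wonder files (the file names are not shown in the original).
teens_norm <- teens %>% normalize()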

``````


``````
month_names <-
  c("01" = "Jan", "02" = "Feb", "03" = "Mar", "04" = "Apr",
    "05" = "May", "06" = "Jun", "07" = "Jul", "08" = "Aug",
    "09" = "Sep", "10" = "Oct", "11" = "Nov", "12" = "Dec",
    "Male" = "Male", "Female" = "Female")

plotit <- function(df, title){
  df %>%
    filter(year %in% c("2013", "2014", "2015", "2016", "2017")) %>%
    ggplot(aes(x = year, y = suicides, fill = year)) +
    geom_bar(stat = "identity") +
    facet_grid(gender ~ month, labeller = as_labeller(month_names)) +
    theme(axis.text.x = element_blank(),
          axis.ticks.x = element_blank(),
          text = element_text(size = 19)) +
    labs(x = "month", title = title) +
    scale_fill_jcolors(palette = "pal7")
}

plotit(teens, "Age 10-17")
``````


``````
plotit(twenties, "Age 18-29")
``````

Suicide rates have been going up since the mid-2000s for all groups:


``````
plot_suicides <- function(df){
  df %>%
    ggplot(aes(x = year, y = suicides, color = gender)) +
    geom_point(alpha = 0.4) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))
}

teens %>% plot_suicides() + labs(title = "Age 10-17")
``````


``````
twenties %>% plot_suicides() + labs(title = "Age 18-29")
``````

## Discussion

On the pro side of the hypothesis, April 2017 has the highest rate of suicides of any month.

On the con side we have the following arguments:

• The show is about a girl committing suicide, but the jump is seen among boys.

• The sample size is not large, and the values jump around.

• The numbers of suicides have been steadily rising since the early 2000s.

• Small samples, noisy values and a rising trend together give quite a high chance that we are going to see an unusually high value for at least one gender / age group.

• We also see large values for suicide in teenage males in March 2017 (before the show was released) and December 2017 (when probably not that many people were still watching).
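The argument in the bullets above can be made concrete with a small simulation. This is only a sketch under made-up assumptions, not the CDC data: the monthly means (rising linearly from 100 to 150 over five years), the Poisson noise, and the choice of four gender / age groups are all hypothetical. It asks how often a record-high month turns up late in the series when the show has no effect at all.

```r
set.seed(1)

n_months <- 60    # Jan 2013 .. Dec 2017
n_groups <- 4     # assumed: 2 genders x 2 age groups
n_sims   <- 2000

# Hypothetical monthly means: a steady linear rise and no show effect.
mu <- seq(100, 150, length.out = n_months)

# For one simulated group, return the index of the month with the highest count.
peak_month <- function() which.max(rpois(n_months, mu))

# Matrix of peak months: one row per group, one column per simulated world.
peaks <- replicate(n_sims, replicate(n_groups, peak_month()))

# Share of worlds where at least one group peaks in the final year (months 49-60).
p_final_year <- mean(apply(peaks, 2, function(p) any(p >= 49)))

# Share of worlds where at least one group peaks exactly in April 2017 (month 52).
p_april_2017 <- mean(apply(peaks, 2, function(p) any(p == 52)))

c(final_year = p_final_year, april_2017 = p_april_2017)
```

The exact numbers depend entirely on the assumed means and noise, so they should not be read as estimates for the real data. The point is only that a rising trend plus several groups makes a late record month unsurprising by itself.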

The paper fits a more advanced model, but in the end whether you believe the result boils down to whether you think the April 2017 value is large enough to overcome the above problems.
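One crude way to put a number on "large enough" is to compare a single month with a straight-line trend fitted to all the other months. The sketch below is not the paper's model. The helper `excess_z` is a hypothetical name, and the series is invented purely to show the mechanics.

```r
# Hypothetical helper: how far does month `idx` sit above a linear trend
# fitted to the remaining months, in units of the residual standard deviation?
excess_z <- function(counts, idx) {
  d <- data.frame(t = seq_along(counts), y = counts)
  fit <- lm(y ~ t, data = d[-idx, ])            # trend without the month in question
  expected <- predict(fit, newdata = d[idx, ])  # what the trend predicts for it
  unname((d$y[idx] - expected) / sigma(fit))
}

# Invented series: a rising line with a regular +/-5 wiggle and a spike of 40 at month 52.
toy <- 100 + seq_len(60) + rep(c(-5, 5), 30)
toy[52] <- toy[52] + 40

z_spike  <- excess_z(toy, 52)   # the spiked month
z_normal <- excess_z(toy, 30)   # an ordinary month, for comparison
```

With real monthly counts the residuals are noisier and seasonal, so a score like this would only be a first look. It also does not correct for having picked the month and the group after seeing the data.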

My personal take is that, in Bayesian terms, these numbers marginally increase the probability I would put on the show causing an increased number of suicides, but not by much.