library(ggplot2)
library(Sleuth3)
library(knitr)

# set some options to clean up output
opts_chunk$set(message = FALSE, warning = FALSE)

Anova assumptions

qplot(reorder(spray, count), count, data = InsectSprays, geom= "boxplot")

plot of chunk unnamed-chunk-2

Do you see any evidence of violations of the ANOVA assumptions?

From the raw data we see some sprays (A, B and F) have much larger spread than the others. This is evidence that the maybe the assumption of equal population standard deviations is violated. Notice that these sprays also seen to have much higher centers. ## Take a look at the residual plots. Can you see the violations more easily?

spray_full <-  lm(count ~ spray, data = InsectSprays) 
qplot(.fitted, .resid, data = spray_full) 

plot of chunk unnamed-chunk-3

In the plot of the residuals against the group averages (fitted values) we see a classic funnel shape: the spread of the residuals around zero gets larger as the group average gets larger. This is clear evidence the assumption of equal population standard deviations is violated. There is quite a bit of overplotting in the above plot, so a better plot might be to jitter the points:

qplot(.fitted, .resid, data = spray_full, geom = "jitter" ) 

plot of chunk unnamed-chunk-4

Or even use boxplots instead:

qplot(.fitted, .resid, data = spray_full, geom = "boxplot", group = spray) 

plot of chunk unnamed-chunk-5

qplot(spray, .resid, data = spray_full )

plot of chunk unnamed-chunk-5

The plot of the residuals against the groups reiterates that there is evidence the assumption of equal population standard deviations is violated. Now it’s clearer that is is groups C, D and E that have smaller spread compared to A, B and F. Again it might have been better to use boxplots:

qplot(spray, .resid, data = spray_full, geom = "boxplot" )

plot of chunk unnamed-chunk-6

source(url("http://stat511.cwick.co.nz/code/stat_qqline.r"))
qplot(sample = .resid, data = spray_full) + stat_qqline()

plot of chunk unnamed-chunk-6

The normal probability plot of the residuals shows some deviation from a straight line. The residuals seem to have longer tails that we would expect if the populations were Normally distributed. Be careful here! Part of the reason these residuals don’t look Normal, is that the population standard deviations aren’t equal. Generally, since Normality is less of a concern, we attempt to fix the non-equal SDs, before doing anything to fix non-Normality. ## Do the residual plots look better if we instead model the mean of the squareroot of insect count?

spray_full_sqrt <-  lm(sqrt(count) ~ spray, data = InsectSprays)
qplot(.fitted, .resid, data = spray_full_sqrt, geom = "jitter"  )

plot of chunk unnamed-chunk-7

Yes, big improvement! The funnel shape is gone in the residuals against fitted values plot.

qplot(spray, .resid, data = spray_full_sqrt  )

plot of chunk unnamed-chunk-8

Some groups still appear to have smaller spreads than others (E for example), but the differences are small enough that we could put it down to sampling variability. There is no evidence of a gross violation of the equal SD assumption.

qplot(sample = .resid, data = spray_full_sqrt ) + stat_qqline()

plot of chunk unnamed-chunk-9

The Normalility plot looks a lot better. The residuals follow a straight line closely and we have no evidence they populations aren’t Normal. ## What would you conclude from this ANOVA?

anova(spray_full_sqrt)
## Analysis of Variance Table
## 
## Response: sqrt(count)
##           Df Sum Sq Mean Sq F value Pr(>F)    
## spray      5   88.4   17.69    44.8 <2e-16 ***
## Residuals 66   26.1    0.39                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Since there is no evidence the assumptions for the one-way ANOVA are violated on the square root scale we can proceed with the ANOVA using the transformed data. The null hypothesis is now: all populations have the same mean square root count of insects. We would conclude there is convincing evidence that at least one insect sprays has a different mean squareroot count of insects (one-way ANOVA on square root of count with 5 and 66 degrees of freedom, p-value < 0.0001).

This was a randomized experiment so it would be valid to also state the conclusion as: There is convincing evidence that at least one insect spray has a different treatment effect (one-way ANOVA on square root of count with 5 and 66 degrees of freedom, p-value < 0.0001).

Why not log? The square root is just another example of a transformation like the logarithm. It often works well for responses that are counts of something. It does not have the same easy interpretation of the back transform that the logarithm does. We could have tried a log transform, but the first problem we would run into is that there are zero counts, and log(0)=-Inf. We could add a small number to the counts and then log transform:

spray_full_log <-  lm(log(count + 0.5) ~ spray, data = InsectSprays)
qplot(.fitted, .resid, data = spray_full_log, geom = "jitter"  )

plot of chunk unnamed-chunk-11

but we end up overcorrecting the spreads: groups with higher averages end up with much smaller spreads. We would still be violating the unequal spread assumption. So, log just doesn’t work well here. # Linear combinations of means ## Repeat the steps to compare Spock’s mean to the average of the means of the other judges, i.e. \( C_1 = -1/6, C_2 = -1/6, C_3 = -1/6, C_4 = -1/6, C_5 = -1/6, C_6 = -1/6, C_7 = 1 \)

# sample averages
averages <- with(case0502, tapply(Percent, Judge, mean))
# sample sds
sds <- with(case0502, tapply(Percent, Judge, sd))
# sample sizes
ns <- with(case0502, tapply(Percent, Judge, length))
sp <- sqrt(sum(sds^2*(ns-1))/sum(ns - 1))
df <- sum(ns-1)

C <- c(-1/6,-1/6,-1/6,-1/6,-1/6,-1/6, 1)

g <- sum(C*averages) 
se_g <- sp * sqrt(sum(C^2/ns))

g + c(-1, 1) * qt(0.975, df) *se_g
## [1] -20.322  -9.635
2*(1 - pt(abs(g / se_g), df))
## [1] 1.489e-06

There is convincing evidence that the mean percent of women on the venires of Spock’s judge is not the same as the average of the mean percent of women for the other judges (t-test on linear combination of means, p-value < 0.0001). With 95% confidence the mean percent of women on the venires of Spock’s judge is between 9.6 and 20.3 percentage points lower than the average of the mean percent of women for the other judges.