Stat 411/511

Homework 3

Due Oct 23 on Canvas

Q1 Two sample t-test

From last homework

cdc <- read.csv(url("")) 
cdc$wt_diff <- with(cdc, weight - wtdesire)

Conduct a two sample t-test to answer the question “Do people who exercised in the past month want to lose more weight than those who did not exercise in the past month?” Report your results in a statistical summary. (Hint: your response is wt_diff, and your groups are defined by the exerany variable.)
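As a reminder of the mechanics, here is a minimal sketch of a two sample t-test using R's formula interface. The data frame `toy` is simulated and hypothetical (it is not the cdc data), and its column names are chosen only to mirror the hint above.

```r
# Hypothetical stand-in data: simulated values, NOT the cdc data set.
set.seed(1)
toy <- data.frame(
  exerany = rep(c(0, 1), each = 20),
  wt_diff = c(rnorm(20, mean = 18, sd = 20), rnorm(20, mean = 14, sd = 20))
)

# response ~ group. By default t.test() runs the Welch (unequal variance) test;
# adding var.equal = TRUE gives the pooled-variance version instead.
fit <- t.test(wt_diff ~ exerany, data = toy)

fit$statistic  # t-statistic
fit$p.value    # two-sided p-value
fit$conf.int   # 95% CI for mean(exerany = 0) - mean(exerany = 1)
```

Note that the sign of the estimated difference depends on the order of the group levels, so check which group comes first before interpreting it.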

Q2 A Randomization Test

(No R required. Either do the work by hand and attach a scanned copy as a PDF on Canvas, or write your results as plain text in your R script file and hand in the compiled Word document.)

In lecture we saw a randomization test with 47 observations, which resulted in over 16 trillion possible assignments to two groups. To solidify the ideas of the randomization test, in this question you will do a randomization test with 5 observations, so that it is possible to list all possible ways to assign the 5 subjects to two groups.
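The counts mentioned above come from binomial coefficients, which R computes with `choose()`. The sketch below checks the 5-subject case used in this question. The 23/24 split for the 47-observation example is an assumption made here for illustration, since the lecture's group sizes are not restated in this handout.

```r
# Ways to pick 3 of 5 subjects for one group (the rest form the other group).
n_small <- choose(5, 3)
n_small   # 10

# ASSUMPTION: a 23/24 split of 47 observations; the actual lecture split
# is not given in this handout.
n_large <- choose(47, 23)
n_large   # on the order of 1.6e13, i.e. roughly 16 trillion
```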

A study was undertaken to evaluate a new drug to treat high blood pressure. Five subjects were recruited and randomly assigned to either receive the new drug or a placebo. The difference between their blood pressure at the start of the study and after two weeks of treatment was recorded and is given below.

Subject           A        B        C         D         E
Treatment group   Placebo  Placebo  New drug  New drug  New drug
Reduction in BP   0        3        0         3         9

(Values are (initial BP) - (final BP), so positive numbers indicate a reduction in BP).

The following questions will lead you through a randomization test for the hypotheses:

Null: The new drug has the same effect as the placebo.

  1. List all of the 10 possible ways to randomly assign these 5 subjects to the two treatment groups (the “new drug” group should always have three subjects in it).

  2. For each grouping, calculate the difference in sample averages (the test-statistic) between the two groups.

  3. Explain why these 10 test-statistics represent the null distribution.

  4. Calculate the p-value for the test by finding the proportion of test-statistics in the null distribution that are as extreme as, or more extreme than, the observed test-statistic.

  5. Summarize your findings in a “Statistical Summary”. You don’t need a point estimate or 95% confidence interval.

  6. Explain why this experiment is poorly designed. (Hint: What is the smallest possible p-value we could have obtained in this experiment?)
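Although no R is required for this question, the same steps can be sketched in code if you want to check hand calculations on a different example. The sketch below uses made-up responses for five hypothetical subjects (s1 to s5), not the blood pressure data above. The observed assignment and the one-sided direction of the p-value are also assumptions made for illustration.

```r
# Hypothetical responses for five subjects; NOT the blood pressure data.
y <- c(s1 = 1, s2 = 4, s3 = 2, s4 = 6, s5 = 7)

# Hypothetical observed assignment: these three subjects got the treatment.
observed_treated <- c("s3", "s4", "s5")

# Each column of combn() is one way to choose 3 of the 5 subjects.
assignments <- combn(names(y), 3)

# Test-statistic for one assignment: mean(treated) - mean(control).
mean_diff <- function(treated) {
  control <- setdiff(names(y), treated)
  mean(y[treated]) - mean(y[control])
}

# Test-statistic under every possible assignment.
null_stats <- apply(assignments, 2, mean_diff)

# Test-statistic for the assignment that was actually used.
obs_stat <- mean_diff(observed_treated)

# One-sided p-value (ASSUMPTION: larger differences count as more extreme):
# the proportion of assignments with a statistic at least as large as observed.
p_value <- mean(null_stats >= obs_stat)
p_value
```

Because every assignment is equally likely under random assignment, each column of `assignments` contributes equally to the proportion.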