Stat 411/511

Homework 2

Due on Canvas Oct 16

This homework uses the same data as Lab 2, except I’ve taken a subsample of 1000 people. To get the data and store it in cdc, add this to the start of your script:

cdc <- read.csv(url("http://stat511.cwick.co.nz/homeworks/cdc.csv")) 

(This code reads a comma-delimited file from my website, so you’ll need to be connected to the internet for it to work.)

1 Summary statistics

a. Calculate a new column, called wt_diff, for the difference between the subject’s weight (weight) and their desired weight (wtdesire) using the following code.

cdc$wt_diff <- with(cdc, weight - wtdesire)

b. Report summary statistics (average, standard deviation and sample size) of weight and desired weight for males and females separately.
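As a rough illustration of per-group summaries in base R, here is a sketch on a small made-up data frame. The data frame toy, its values, and the gender codes "f" and "m" are assumptions for illustration only; check the actual column names and codes in cdc (for example with str(cdc)) before adapting it.

```r
# Toy data, invented for illustration only (this is NOT the cdc data).
# The column name `gender` and its codes "f"/"m" are assumptions.
toy <- data.frame(
  gender   = c("f", "f", "f", "m", "m", "m"),
  weight   = c(130, 150, 140, 180, 200, 190),
  wtdesire = c(120, 140, 135, 175, 180, 185)
)

# tapply() applies a function to `weight` within each level of `gender`
avg  <- tapply(toy$weight, toy$gender, mean)
sdev <- tapply(toy$weight, toy$gender, sd)
n    <- tapply(toy$weight, toy$gender, length)

# Collect the three summaries into one table, one row per group
weight_summary <- data.frame(
  gender = names(avg),
  avg    = as.vector(avg),
  sd     = as.vector(sdev),
  n      = as.vector(n)
)
print(weight_summary)
```

The same three tapply() calls with wtdesire in place of weight would give the matching table for desired weight.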

c. Construct a histogram of wt_diff for females only, and describe it in the context of the data.

d. Construct a plot with separate histograms of weight for males and females. Don’t forget to play with binwidth, but only include one plot in your report.
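The sketch below shows one possible way to stack two histograms with a shared bin width, using only base R graphics on simulated data. The data frame toy, the gender codes, and the variable binwidth are all assumptions for illustration; base R's hist() takes breaks rather than a binwidth argument, so the bin width is set through the spacing of the breaks. If the lab used a different plotting function, its binwidth argument plays the same role.

```r
# Simulated data, invented for illustration only (this is NOT the cdc data)
set.seed(1)
toy <- data.frame(
  gender = rep(c("f", "m"), each = 50),
  weight = c(rnorm(50, mean = 150, sd = 25), rnorm(50, mean = 190, sd = 30))
)

# Choose a bin width, then build breaks that cover the full range of weight
binwidth <- 10
breaks <- seq(
  floor(min(toy$weight) / binwidth) * binwidth,
  ceiling(max(toy$weight) / binwidth) * binwidth,
  by = binwidth
)

# Two panels, one above the other, using the same breaks so the
# horizontal axes are directly comparable
old_par <- par(mfrow = c(2, 1))
for (g in c("f", "m")) {
  hist(toy$weight[toy$gender == g], breaks = breaks,
       main = paste("Weight, gender =", g), xlab = "Weight")
}
par(old_par)  # restore the previous plotting layout
```

Changing binwidth and rerunning makes it easy to compare a few choices before settling on one plot.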

2 Paired t-test

Conduct a paired t-test comparing the weight (weight) and the desired weight (wtdesire) for females only. Report your results in a statistical summary. The summary should include a sentence describing the result of the test, a sentence interpreting the point estimate, and a sentence interpreting the confidence interval.
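For reference, a minimal sketch of the mechanics of a paired t-test in R, run on a small made-up data frame. The data frame toy, its values, and the gender codes are assumptions for illustration only, and the numbers it produces say nothing about the cdc data.

```r
# Toy data, invented for illustration only (this is NOT the cdc data).
# The column name `gender` and its codes "f"/"m" are assumptions.
toy <- data.frame(
  gender   = c("f", "f", "f", "f", "f", "f", "m", "m"),
  weight   = c(130, 150, 140, 165, 155, 120, 180, 200),
  wtdesire = c(120, 140, 135, 150, 155, 118, 175, 180)
)

# Keep only the rows for one group
females <- subset(toy, gender == "f")

# Paired t-test: each row contributes one (weight, wtdesire) pair
result <- t.test(females$weight, females$wtdesire, paired = TRUE)

result$estimate   # point estimate: mean of the paired differences
result$conf.int   # 95% confidence interval for the mean difference
result$p.value    # two-sided p-value

# A paired t-test is the same as a one-sample t-test on the differences
one_sample <- t.test(females$weight - females$wtdesire)
```

The pieces stored in result (estimate, confidence interval, p-value) are the ones the three sentences of a statistical summary draw on.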