Stat 411/511

Data Analysis 2

Due on Canvas Nov 20th @ midnight

The grading rubric.

The General Social Survey

... conducts basic scientific research on the structure and development of American society with a data-collection program designed to both monitor societal change within the United States and to compare the United States to other nations.

For this data analysis you are provided with a subset of the 2012 survey containing married respondents where the wife received at least a high school degree. For the purposes of this analysis you may treat this as a simple random sample of such Americans.


Each row of the dataset corresponds to one household. The variables are described below, and correspond to questions pertaining to education levels and time spent on household work.

Consider the difference in time spent between husband and wife on household work. Answer the following questions:

  • Is there evidence that the mean difference in hours spent on household work between the husband and wife depends on the wife’s education level?
  • For each of the wife education levels, by how many hours does the mean difference in time spent between husband and wife differ from that of the next lower education category?
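As a rough illustration of the quantities these two questions refer to, the sketch below computes a one-way ANOVA F statistic and the differences in group means between adjacent education levels. It is only a sketch under stated assumptions: the values in `toy` are hypothetical and are not taken from the survey, the function names are made up for this example, and it is not a prescribed method for the assignment. You still need to choose, justify and check the assumptions of your own procedure, and attach p-values and confidence intervals to your results.

```python
from statistics import mean

# Hypothetical toy values of diff_hours, NOT taken from the survey data.
# Keys are listed from lowest to highest education level.
toy = {
    "HIGH SCHOOL": [10.0, 6.0, 8.0],
    "JUNIOR COLLEGE": [7.0, 5.0, 6.0],
    "BACHELOR": [4.0, 6.0, 2.0],
    "GRADUATE": [3.0, 1.0, 2.0],
}


def one_way_f(groups):
    """Return (F statistic, between-group df, within-group df)."""
    all_values = [v for g in groups.values() for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (v - mean(g)) ** 2 for g in groups.values() for v in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def adjacent_differences(groups):
    """Mean of each level minus the mean of the next lower level."""
    levels = list(groups)  # relies on the dict being in education order
    return {
        f"{higher} - {lower}": mean(groups[higher]) - mean(groups[lower])
        for lower, higher in zip(levels, levels[1:])
    }


f_stat, df_between, df_within = one_way_f(toy)
diffs = adjacent_differences(toy)
print(f_stat, df_between, df_within)
print(diffs)
```

The F statistic addresses the first question (do the group means differ at all?), while the adjacent differences are the point estimates behind the second question.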

Your report should include the following sections:

  • Introduction: Give a brief overview of the data, a little bit of background and the questions of interest. Keep this concise, understandable to someone outside of this class, free of statistical jargon and to the point. You should provide a summary graphic of the data involved and some basic summary statistics.

  • Methods: Describe your reasoning for the procedures you have chosen to answer the questions. State the assumptions of the procedures, and show or describe why you think they are reasonable assumptions in this case (or why the test might be robust to violations of the assumptions). Explain any changes, transformations or other modifications you make to the data.

  • Summary: Provide a brief non-technical summary of your findings that answers the questions of interest (like the statistical summaries we have been writing). Make sure you include some indication of the scope of inference (Can population inference be made? To what population? Can causal inference be made?).


husband_degree: Level of education of the husband; see the table below for values.
wife_degree: Level of education of the wife; see the table below for values.
interviewed: Who answered the survey questions (i.e., the respondent).
husband_hours: How many hours a week the husband spends on household work, not including childcare and leisure time activities, as reported by the respondent.
wife_hours: How many hours a week the wife spends on household work, not including childcare and leisure time activities, as reported by the respondent.
diff_hours: The difference wife_hours - husband_hours. A positive value means the wife is reported to spend more time than the husband.
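To make the sign convention of diff_hours concrete, here is a minimal sketch that recomputes it from wife_hours and husband_hours. The two rows are hypothetical values made up for illustration, not records from the dataset, and `add_diff_hours` is a helper name invented for this example.

```python
def add_diff_hours(row):
    """Return a copy of the row with diff_hours = wife_hours - husband_hours."""
    out = dict(row)
    out["diff_hours"] = row["wife_hours"] - row["husband_hours"]
    return out


# Hypothetical households, NOT taken from the survey data.
households = [
    {"wife_hours": 12, "husband_hours": 5},  # wife reported to do more
    {"wife_hours": 3, "husband_hours": 8},   # husband reported to do more
]

with_diff = [add_diff_hours(row) for row in households]
for row in with_diff:
    print(row["diff_hours"])
```

A check like this against the provided diff_hours column is a quick way to confirm you are reading the direction of the difference correctly before interpreting any results.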

Education categories

LT HIGH SCHOOL: Less than high school
HIGH SCHOOL: High school
JUNIOR COLLEGE: Junior college or Associate’s Degree
BACHELOR: Bachelor’s degree
GRADUATE: Graduate degree
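Because the education categories are ordered, summaries are easier to read when the groups appear in that order rather than alphabetically. The sketch below groups diff_hours by wife_degree and reports the count, mean and standard deviation for each level, in education order. It assumes the data has been read into a list of dictionaries keyed by the variable names above; the rows shown are hypothetical, and `EDUCATION_ORDER` and `summarize_by_level` are names invented for this example.

```python
from statistics import mean, stdev

# Education levels from lowest to highest, as listed in the table above.
EDUCATION_ORDER = [
    "LT HIGH SCHOOL",
    "HIGH SCHOOL",
    "JUNIOR COLLEGE",
    "BACHELOR",
    "GRADUATE",
]


def summarize_by_level(rows, level_key="wife_degree", value_key="diff_hours"):
    """Return (level, n, mean, sd) tuples in education order."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row[level_key], []).append(row[value_key])

    summary = []
    for level in EDUCATION_ORDER:
        values = grouped.get(level)
        if not values:
            continue  # skip levels with no observations
        # The sample standard deviation needs at least two observations.
        sd = stdev(values) if len(values) > 1 else None
        summary.append((level, len(values), mean(values), sd))
    return summary


# Hypothetical rows, NOT taken from the survey data.
rows = [
    {"wife_degree": "BACHELOR", "diff_hours": 4},
    {"wife_degree": "HIGH SCHOOL", "diff_hours": 10},
    {"wife_degree": "HIGH SCHOOL", "diff_hours": 6},
    {"wife_degree": "GRADUATE", "diff_hours": 2},
]

summary = summarize_by_level(rows)
for level, n, avg, sd in summary:
    print(level, n, avg, sd)
```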