Analyze a dataset of your choice using an appropriate method taught in this class. You will hand in a report (4-page maximum, including figures) summarizing your analysis following the guidelines below.
Introduction Give me a brief overview of your data, a little background on the topic, and your question of interest. Keep this concise, to the point, and free of statistical jargon.
Data Describe where the data came from. How does chance enter the study design? What is the scope of inference of the study? Give a graphical display of the data and note any unusual features.
Methods State the method you are going to use to analyse the data and the null and alternative hypotheses you will test. Why did you choose the method you did? State the assumptions of the method you use, and show or describe why you think they are reasonable in this case (or why the test might be robust to violations of them). Also show and describe any assumptions you checked that ruled out other tests. If there is evidence an assumption is violated and we have covered an appropriate alternative tool, but you fail to use the alternative, you will lose points. If there is evidence an assumption is violated but we haven’t covered an appropriate alternative tool, make sure you acknowledge the violation.
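As one illustration of the assumption checks described above, here is a minimal sketch in R. It assumes a two-sample comparison of means, and it uses R's built-in ToothGrowth data (tooth length by supplement type) purely as a stand-in for your own data set. Which checks are relevant depends on the method you actually choose.

```r
# Hypothetical stand-in data: the built-in ToothGrowth data set.
# Replace it with your own data frame, response, and grouping variable.
data(ToothGrowth)

# Side-by-side boxplots: compare centre and spread, and look for outliers or skew.
boxplot(len ~ supp, data = ToothGrowth,
        xlab = "Supplement", ylab = "Tooth length")

# Normal Q-Q plot for each group: points near the line suggest approximate normality.
par(mfrow = c(1, 2))
for (g in levels(ToothGrowth$supp)) {
  x <- ToothGrowth$len[ToothGrowth$supp == g]
  qqnorm(x, main = paste("Normal Q-Q:", g))
  qqline(x)
}
par(mfrow = c(1, 1))

# Group standard deviations and their ratio, as a rough look at equal spread.
group_sd <- tapply(ToothGrowth$len, ToothGrowth$supp, sd)
sd_ratio <- max(group_sd) / min(group_sd)
print(group_sd)
print(sd_ratio)
```

The plots and the ratio are evidence to discuss in the report, not pass/fail rules. Say what you see and why it does or does not worry you.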
Results
Produce a numerical summary of the test results. It should include estimates of any means (or medians) of interest with 95% confidence intervals (talk to me if you can’t figure out how to compute them for your method), the number of observations in each group, the test statistic, and the p-value. This may be best presented as a table. Make sure it is clear from this section alone which test you did.
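A numerical summary of this kind could be assembled along the following lines. This is only a sketch: it assumes a Welch two-sample t-test (the default for `t.test` in R) and again uses the built-in ToothGrowth data as a hypothetical example. The names `summary_table` and `test_results` are made up for illustration.

```r
# Hypothetical stand-in data; substitute your own data frame and variables.
data(ToothGrowth)

# Welch two-sample t-test (t.test does not assume equal variances by default).
fit <- t.test(len ~ supp, data = ToothGrowth)

# Per-group sample size, mean, and 95% confidence interval for the mean.
groups     <- split(ToothGrowth$len, ToothGrowth$supp)
group_n    <- sapply(groups, length)
group_mean <- sapply(groups, mean)
group_ci   <- sapply(groups, function(x) t.test(x)$conf.int)  # 2 x (number of groups)

summary_table <- data.frame(
  group    = names(groups),
  n        = unname(group_n),
  mean     = unname(group_mean),
  ci_lower = unname(group_ci[1, ]),
  ci_upper = unname(group_ci[2, ]),
  row.names = NULL
)
print(summary_table)

# Test statistic, degrees of freedom, p-value, and 95% CI for the difference in means.
test_results <- c(
  t_statistic   = unname(fit$statistic),
  df            = unname(fit$parameter),
  p_value       = fit$p.value,
  diff_ci_lower = fit$conf.int[1],
  diff_ci_upper = fit$conf.int[2]
)
print(round(test_results, 3))
```

Printing `fit$method` alongside the table is one simple way to make the test used explicit.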
Summary Provide a brief non-technical summary of your analysis (like the statistical summaries we have been writing).
Appendix Put all your R code into the appendix. I should be able to (given the data) run it myself and get all the figures and numbers used in your report.
Finding data is generally the hardest part of any analysis. Find your data and make sure you can get it into R as early as possible. See "Getting data into R".
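To check early that your data will load, something like the sketch below may help. It assumes the data are in a comma-separated file. The file and the column names `group` and `response` are hypothetical, and the example writes a tiny temporary file so that it runs as-is.

```r
# Hypothetical file: a tiny CSV written to a temporary location for illustration.
# With real data, set csv_path to the location of your own file instead.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("group,response",
             "A,4.1",
             "A,5.3",
             "B,6.2",
             "B,7.0"), csv_path)

# Read the file; stringsAsFactors = TRUE turns text columns into factors.
my_data <- read.csv(csv_path, stringsAsFactors = TRUE)

# Quick checks that the data arrived as expected.
str(my_data)            # column types and first few values
summary(my_data)        # ranges and counts, including any missing values
table(my_data$group)    # number of observations in each group
```

If `str` shows a numeric column read in as text or a factor, that usually points to stray characters or a non-standard missing-value code in the file.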
http://www.statsci.org/datasets.html is a great place to start if you need to find some data. I looked at the first two and they seemed to have some interesting data sets.
http://www.statsci.org/data/first.html Look particularly at the categories: Two samples, Paired samples, One Way ANOVA and Simple linear regression.
http://lib.stat.cmu.edu/DASL/ You can search by a topic you are interested in, or the sort of methods you might apply.
Some of the above give hints about approaches to take. Think critically about their suggestions and justify the approach you take.