STOR 664: COURSE DESCRIPTION
Fall 2020


This page was updated on August 11, 2020.

This class has been moved to PEABODY 220 and is now officially a Hyflex class. Students may choose whether to take the class in-person or online.

Instructor

Richard L. Smith
Hanes 303
Email: rls "at" email "dot" unc "dot" edu
Home Web Page

Class Time and Place

Mondays and Wednesdays, 2:40-3:55 pm.

Location: Peabody 220, see above.

Students should check the course Sakai page regularly; most new materials needed during the course will be posted there.

Office Hours

There will be two remote office-hour sessions per week (shared with STOR 590): Tuesdays 11:00-12:30 and Wednesdays 9:30-11:00. Zoom details are on the course Sakai page. In addition, there will be an in-person office hour on Wednesdays (4:00-5:00 pm), location to be announced.

Grader

Xi Yang.
Office: TBA
Email: xiyang "at" live "dot" unc "dot" edu

COVID-19 Policies

We are in the middle of the COVID-19 pandemic, and this requires some class policies that differ from the usual ones. This course is classified as "face to face/hybrid," which means that the bulk of the teaching will take place in class and students are expected to attend in person; however, some elements of the course may be taught online, and the entire course may be switched to online teaching if the pandemic worsens. A full statement of university policies for Fall teaching may be accessed here. For specific policies that will be in place in this class, I have created this document. Please review it before attending the first class! If you have questions about the arrangements for the course, please email the instructor.

Course Texts

The course texts will be (a) the draft version of Linear Regression by R.L. Smith and K.D.S. Young (available as a course pack through Student Stores); and (b) Linear Models with R (Second Edition) by Julian Faraway. The Faraway text will also be available through Student Stores; you are welcome to obtain it from a different supplier, but make sure you get the second edition.

I have created a data page that links to datasets and programs from the Smith and Young text that will be used in the course.

Chapter Headings (Smith and Young)

The syllabus for the course is being revised following a recent review of the department's Applied Statistics offerings. What follows is essentially the list of topics covered by the current version of the Smith and Young text; some topics will be adjusted as we go through the material.

Chapter 1: Air pollution and public health: A case study for regression analysis.
This introductory chapter discusses a major public policy issue where the use (or, depending on your point of view, misuse) of regression analysis has featured heavily. It illustrates some of the techniques which we will be discussing in detail later in the course, and also describes some of the pitfalls associated with the use of regression to solve substantive scientific problems.

Chapter 2: Simple linear regression.
For most of you, much of this material will be revision, covering the simple case of one y variable and one x variable. However, we also discuss some more subtle features, such as simultaneous confidence intervals, inverse regression or calibration, and tests for autocorrelation.

Chapter 3: Multiple regression.
Matrix formulation and solutions. Confidence and prediction intervals, and hypothesis tests. Simultaneous estimation. Power of the F test. Examples. The chapter concludes with an outline of the geometric approach to least squares theory, with the aid of which we are able to provide slick proofs of all the major mathematical results.
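
For orientation, the matrix formulation mentioned above can be summarized as follows. This is a sketch in standard textbook notation, which is an assumption here and may differ from the notation of the Smith and Young text: y is the n-vector of responses, X the n x p design matrix (assumed to have full column rank), beta the coefficient vector, and the errors are assumed uncorrelated with common variance sigma^2.

```latex
% Linear model in matrix form
y = X\beta + \varepsilon, \qquad
\mathrm{E}(\varepsilon) = 0, \qquad
\mathrm{Var}(\varepsilon) = \sigma^2 I_n .

% Least squares estimator and its covariance matrix
\hat{\beta} = (X^{T}X)^{-1}X^{T}y, \qquad
\mathrm{Var}(\hat{\beta}) = \sigma^2 (X^{T}X)^{-1} .

% Unbiased estimator of the error variance
s^2 = \frac{(y - X\hat{\beta})^{T}(y - X\hat{\beta})}{n - p} .
```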

Chapter 4: Diagnostics for influential observations.
This chapter is concerned with the effect of outliers among either the x or y values. The hat matrix. Diagnostics for influence: DFFITS, DFBETAS, Cook's statistic, COVRATIO. Graphical methods. Examples.
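
As a rough illustration, the diagnostics listed above are all available in base R. The sketch below uses R's built-in `cars` data purely as a stand-in (it is not one of the course datasets), and the 2p/n leverage cutoff is a common rule of thumb assumed here for illustration, not a course requirement.

```r
# Fit a simple regression on the built-in 'cars' data (illustrative choice only).
fit <- lm(dist ~ speed, data = cars)

n <- nrow(cars)          # number of observations
p <- length(coef(fit))   # number of regression coefficients

h <- hatvalues(fit)      # leverages: the diagonal of the hat matrix

# Collect the per-observation influence measures in one table.
infl <- data.frame(
  hat      = h,
  dffits   = dffits(fit),
  cook     = cooks.distance(fit),
  covratio = covratio(fit)
)
dfb <- dfbetas(fit)      # one column per coefficient

# Rule-of-thumb flag (an assumption): leverage above 2p/n.
high_leverage <- which(h > 2 * p / n)

print(head(infl))
print(high_leverage)
```

One quick sanity check is that the leverages sum to p, because the trace of the hat matrix equals the number of fitted coefficients when the design matrix has full rank.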

Chapter 5: Diagnostics for model selection.
Multicollinearity. Variable selection. Transformations. Applications. To be added: LASSO regression.

Chapter 6: Two Case Studies.
(a) Air pollution and daily mortality in Birmingham, Alabama. (b) The Bush-Gore Election from 2000.

Chapter 7: Miscellaneous topics in regression.
Weighted and Generalized least squares. Response surface methodology. Introduction to nonlinear regression.
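
As a brief sketch of the first topic, again in standard notation that is assumed here rather than taken from the course text: if the errors have covariance matrix sigma^2 V, with V known and positive definite, the generalized least squares estimator replaces the ordinary one. Weighted least squares is the special case in which V is diagonal.

```latex
% Generalized least squares with Var(\varepsilon) = \sigma^2 V
\hat{\beta}_{\mathrm{GLS}} = (X^{T}V^{-1}X)^{-1}X^{T}V^{-1}y, \qquad
\mathrm{Var}(\hat{\beta}_{\mathrm{GLS}}) = \sigma^2 (X^{T}V^{-1}X)^{-1} .

% Weighted least squares: V^{-1} = W = \mathrm{diag}(w_1, \dots, w_n)
\hat{\beta}_{\mathrm{WLS}} = (X^{T}WX)^{-1}X^{T}Wy .
```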

Chapter 8: Analysis of Designed Experiments.
One-way and two-way analysis of variance, interaction, analysis of covariance. To be added: Latin squares, factorial designs.

Computing

The course includes an extensive practical computing component. Previous versions of the class (including the Smith-Young course pack) included examples in both R and SAS, but the department has now decided not to teach SAS, so the computing elements of the course will be entirely in R. If you do not already have R, you should download it from http://cran.r-project.org.

If you prefer RStudio, that is also completely acceptable; most of the examples work exactly the same way in R and RStudio (though the appearance of the output may differ).

The intent of the course is not to teach R from first principles; I assume most if not all students are already familiar with it, but if not, I recommend using the Faraway text and following up further references there if they are needed.

Assignments and Exams

Homework assignments consisting of both theoretical and computational exercises will be set at approximately two-week intervals. There will be a midterm and a final exam, which may be online exams. Following the recent review of the curriculum, the course is also planned to include an individual student project. Provisional distribution of marks: 20% for homework assignments, 25% for the midterm, 25% for the project, 30% for the final exam.

Further reading

Other references that may be helpful include the following:

Atkinson, A.C. (1985), Plots, transformations, and regression. Oxford: Oxford University Press. QA278.2 .A85 1985
Cook, R.D. and Weisberg, S. (1982), Residuals and influence in regression. New York: Chapman and Hall. QA278.2 .C665 1982
Cook, R.D. and Weisberg, S. (1999), Applied regression including computing and graphics. New York: Wiley. QA278.2 .C6617 1999
Dean, A. and Voss, D. (1999), Design and analysis of experiments. New York: Springer. QA279 .D43 1999
Draper, N.R. and Smith, H. (1998), Applied Regression Analysis (Third Edition). New York: Wiley. QA278.2 .D7 1998
Faraway, J.J. (2016), Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models (Second Edition). Chapman and Hall.
McCullagh, P. and Nelder, J.A. (1989), Generalized linear models. London: Chapman and Hall. QA276 .M38 1989
Neter, Kutner, Nachtsheim and Wasserman (1996), Applied Linear Statistical Models (Fourth Edition). Chicago: Irwin. QA278.2 .A66 1996
Rawlings, J.O., Pantula, S. and Dickey, D.A. (1998), Applied regression analysis: a research tool. New York: Springer. QA278.2 .R38 1998
Scheffe, H. (1959), The analysis of variance. New York: Wiley. QA276 .S34
Weisberg, S. (1985), Applied linear regression. New York: Wiley. QA278.2 .W44 1985
