Fall 2020

This page was updated on August 11, 2020.

**This class has been moved to PEABODY 220 and is now officially a Hyflex class. Students may choose whether to take the class in-person or online.**

**Instructor**

Richard L. Smith

Hanes 303

Email: rls "at" email "dot" unc "dot" edu

**Home Web Page**

**Class Time and Place**

Mondays and Wednesdays, 2:40-3:55 pm.

Location: Peabody 220, see above.

Students should regularly check the course Sakai page, where most of the new materials needed during the course will be posted.

**Office Hours**

Two remote office hour sessions per week (shared with STOR 590): Tuesdays 11:00-12:30 and Wednesdays 9:30-11:00. Zoom details are on the course Sakai page. In addition, there will be an in-person office hour on Wednesdays (4:00-5:00 pm), location to be announced.

**Grader**

Xi Yang.

Office: TBA

Email: xiyang "at" live "dot" unc "dot" edu

**COVID-19 Policies**

We are in the middle of the COVID-19 pandemic, and this requires some class policies that differ from the usual ones. This course is classified as "face to face/hybrid," which means that the bulk of the teaching will take place in class and students are expected to attend in person; however, some elements of the course may be taught online, and the entire course may be switched to online teaching if the pandemic worsens. A full statement of university policies for Fall teaching may be accessed here. For specific policies that will be in place in this class, I have created this document. Please review it before attending the first class! If you have questions about the arrangements for the course, please email the instructor.

**Course Texts**

The course texts will be (a) the draft version of **Linear Regression** by R.L. Smith and K.D.S. Young (available as a course pack through Student Stores); (b) **Linear Models with R (Second Edition)** by Julian Faraway. This will also be available through Student Stores; you are welcome to obtain it from a different supplier but make sure you get the second edition.

I have created a **data page** that links to datasets and programs from the Smith and Young text that will be used in the course.

**Chapter Headings (Smith and Young)**

The syllabus for the course is undergoing revisions following a recent review of the department's Applied Statistics offerings. What follows is essentially the list of topics covered by the current version of the Smith and Young text; I plan to adjust some topics as we go through the material.

**Chapter 1: Air pollution and public health: A case study for regression analysis.**

This introductory chapter discusses a major public policy issue where the use
(or, depending on your point of view, misuse) of regression analysis has
featured heavily. It illustrates some of the techniques which we will be
discussing in detail later in the course, and also describes some of the
pitfalls associated with the use of regression to solve substantive
scientific problems.

**Chapter 2: Simple linear regression.**

For most of you, much of this material will be revision, covering the simple
case of one y variable and one x variable. However, we also discuss some more
subtle features, such as simultaneous confidence intervals, inverse regression
or calibration, and tests for autocorrelation.

**Chapter 3: Multiple regression.**

Matrix formulation and solutions. Confidence and prediction intervals, and
hypothesis tests. Simultaneous estimation. Power of the F test. Examples.
The chapter concludes with an outline of the geometric approach to least
squares theory, with the aid of which we are able to provide slick proofs
of all the major mathematical results.
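
As a pointer to the notation involved (standard textbook notation, assumed here rather than taken from the Smith and Young text), the matrix formulation writes the model for $n$ observations and $p$ regression coefficients as

$$
y = X\beta + \varepsilon, \qquad \hat{\beta} = (X^{\mathsf T} X)^{-1} X^{\mathsf T} y, \qquad \operatorname{Var}(\hat{\beta}) = \sigma^2 (X^{\mathsf T} X)^{-1},
$$

where $X$ is the $n \times p$ design matrix, assumed to have full column rank, and the errors $\varepsilon$ are assumed uncorrelated with mean zero and common variance $\sigma^2$.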

**Chapter 4: Diagnostics for influential observations.**

This chapter is concerned with the effect of outliers among either the x or
y values. The hat matrix. Diagnostics for influence: DFFITS, DFBETAS, Cook's
statistic, COVRATIO. Graphical methods. Examples.
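
For orientation, the hat matrix and Cook's statistic are usually defined as follows; this is the standard textbook form, and the notation in the Smith and Young text may differ. With $X$ the $n \times p$ design matrix (full column rank assumed), $e_i$ the $i$th residual, and $s^2$ the residual mean square,

$$
H = X (X^{\mathsf T} X)^{-1} X^{\mathsf T}, \qquad \hat{y} = H y, \qquad D_i = \frac{e_i^2}{p\, s^2} \cdot \frac{h_{ii}}{(1 - h_{ii})^2},
$$

where $h_{ii}$, the $i$th diagonal entry of $H$, is the leverage of observation $i$.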

**Chapter 5: Diagnostics for model selection.**

Multicollinearity. Variable selection. Transformations.
Applications. *To be added: LASSO regression.*

**Chapter 6: Two Case Studies.**

(a) Air pollution and daily mortality in Birmingham, Alabama. (b) The Bush-Gore election of 2000.

**Chapter 7: Miscellaneous topics in regression.**

Weighted and generalized least squares.
Response surface methodology.
Introduction to nonlinear regression.

**Chapter 8: Analysis of Designed Experiments.**

One-way and two-way analysis of variance, interaction, analysis of covariance. *To be added: Latin squares, factorial designs.*

**Computing**

The course includes an extensive practical computing component. Previous versions of the class (including the Smith-Young coursepack) included examples in both R and SAS, but the department has now decided not to try to teach SAS, so the computing elements of the course will be entirely in R. If you do not have it already, you should download it from **http://cran.r-project.org**.

If you prefer RStudio, that is also completely acceptable; most of the examples work exactly the same way in R and RStudio (though the appearance of the output may differ).

The intent of the course is not to teach R from first principles; I assume most if not all students are already familiar with it, but if not, I recommend using the Faraway text and following up further references there if they are needed.

**Assignments and Exams**

Homework assignments consisting of both theoretical and computational exercises will be set at approximately two-week intervals. There will be a midterm and a final exam, which may be online exams. Following the recent review of the curriculum, I also plan to include an individual student project as a component of the course. Provisional distribution of marks: 20% for homework assignments, 25% for the midterm, 25% for the project, 30% for the final exam.

**Further reading**

Other references that may be helpful include the following:

Atkinson, A.C. (1985),
Plots, transformations, and regression.
Oxford : Oxford University Press.
QA278.2 .A85 1985

Cook, R.D. and Weisberg, S. (1982),
Residuals and influence in regression.
New York : Chapman and Hall.
QA278.2 .C665 1982

Cook, R.D. and Weisberg, S. (1999),
Applied regression including computing and graphics.
New York : Wiley.
QA278.2 .C6617 1999

Dean, A. and Voss, D. (1999),
Design and analysis of experiments.
New York : Springer.
QA279 .D43 1999

Draper, N.R. and Smith, H. (1998),
Applied Regression Analysis (Third Edition).
New York : Wiley.
QA278.2 .D7 1998

Faraway, J.J. (2016), Extending the Linear Model with R:
Generalized Linear, Mixed Effects and Nonparametric Regression Models.
Second Edition, Chapman and Hall.

McCullagh, P. and Nelder, J.A. (1989),
Generalized linear models.
London : Chapman and Hall.
QA276 .M38 1989

Neter, Kutner, Nachtsheim and Wasserman (1996),
Applied Linear Statistical Models.
Fourth Edition: Irwin, Chicago.
QA278.2 .A66 1996

Rawlings, J.O., Pantula, S. and Dickey, D.A. (1998),
Applied regression analysis : a research tool.
New York : Springer.
QA278.2 .R38 1998

Scheffé, H. (1959),
The analysis of variance.
New York : Wiley.
QA276 .S34

Weisberg, S. (1985),
Applied linear regression.
New York : Wiley.
QA278.2 .W44 1985
