ISL_Chapter7_Moving Beyond Linearity
Written on February 21st, 2018 by hyeju.kim

Chapter 7. Moving Beyond Linearity
7.1 Polynomial Regression
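Polynomial regression replaces the standard linear model with $y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_d x_i^d + \epsilon_i$, which is still linear in the powers of $x_i$ and can be fit by ordinary least squares. A minimal sketch on toy data (the data and degree here are illustrative, not from the book):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 100)
y = np.sin(x) + rng.normal(scale=0.3, size=100)  # toy data, not from the book

# np.polyfit runs ordinary least squares on the powers 1, x, ..., x^d
coefs = np.polyfit(x, y, deg=4)
y_hat = np.polyval(coefs, x)
```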
7.2 Step Functions
- break the range of X into bins, and fit a different constant in each bin (see the sketch after this list)
- this amounts to converting a continuous variable into an ordered categorical variable
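A minimal sketch of a piecewise constant fit, assuming quartile cutpoints (cutpoints and data are illustrative). With least squares on bin dummy variables, the fitted constant in each bin is simply the mean of the responses in that bin:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 200)
y = np.sin(x) + rng.normal(scale=0.3, size=200)  # toy data

cuts = np.quantile(x, [0.25, 0.5, 0.75])   # cutpoints c_1 < c_2 < c_3 (illustrative choice)
bins = np.digitize(x, cuts)                # ordered categorical: bin index of each x_i
fit = np.array([y[bins == b].mean() for b in range(len(cuts) + 1)])  # constant per bin
y_hat = fit[bins]
```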
7.3 Basis Functions
- every fit in this chapter is a linear model $y_i = \beta_0 + \beta_1 b_1(x_i) + \cdots + \beta_K b_K(x_i) + \epsilon_i$ in some fixed basis functions: for polynomial regression the basis functions are $b_j(x_i)=x_i^j$, and for piecewise constant functions they are $b_j(x_i)=I(c_j \le x_i < c_{j+1})$
7.4 Regression Splines
- extension of polynomial regression and piecewise constant regression
7.4.1 Piecewise Polynomials
- fitting separate low-degree polynomials over different regions of X; the points where the coefficients change are called knots
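For example, a piecewise cubic with a single knot at $c$ fits two separate cubic polynomials:

$$y_i = \begin{cases} \beta_{01} + \beta_{11}x_i + \beta_{21}x_i^2 + \beta_{31}x_i^3 + \epsilon_i & \text{if } x_i < c \\ \beta_{02} + \beta_{12}x_i + \beta_{22}x_i^2 + \beta_{32}x_i^3 + \epsilon_i & \text{if } x_i \ge c \end{cases}$$

Each knot adds one more region with its own polynomial; without further constraints the fitted curve can jump at the knots, which motivates the next subsection.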
7.4.2 Constraints and Splines
(referring to the panels of Figure 7.3 in the book)
- top-right: constraint that the fitted curve must be continuous
- lower-left: constraints of continuity plus continuous first and second derivatives (this is called a cubic spline)
- lower-right: a linear spline, with only a continuity constraint at each knot
7.4.3 The Spline Basis Representation
- start off with a basis for a cubic polynomial ($x, x^2, x^3$), and then add one truncated power basis function per knot (see the numpy sketch after this list)
- a truncated power basis function is defined as
  $$h(x, \xi) = (x - \xi)^3_+ = \begin{cases} (x-\xi)^3 & \text{if } x > \xi \\ 0 & \text{otherwise} \end{cases}$$
  where $\xi$ is the knot
- a natural cubic spline is a regression spline with additional boundary constraints: the function is required to be linear at the boundary (in the region where X is smaller than the smallest knot, or larger than the largest)
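A minimal numpy sketch of the basis representation: build the design matrix $1, x, x^2, x^3$ plus one truncated power term per knot, then fit by least squares (knot placement and data are illustrative):

```python
import numpy as np

def truncated_power_basis(x, knots):
    """Cubic spline design matrix: 1, x, x^2, x^3, then (x - knot)_+^3 per knot."""
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 200)
y = np.sin(x) + rng.normal(scale=0.3, size=200)  # toy data

knots = np.quantile(x, [0.25, 0.5, 0.75])        # illustrative knot placement
X = truncated_power_basis(x, knots)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)     # least squares fit of the cubic spline
y_hat = X @ beta
```

With $K$ knots this uses $K + 4$ coefficients, matching the $K + 4$ degrees of freedom of a cubic spline.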
7.4.4 Choosing the Number and Locations of the Knots
- choose by cross-validation; in practice it is common to place knots at uniform quantiles of the data and select the number of knots (equivalently, the degrees of freedom) by cross-validation
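A sketch of choosing the number of quantile-placed knots by K-fold cross-validation, reusing the hypothetical `truncated_power_basis` helper and toy `x`, `y` from above:

```python
import numpy as np

def cv_mse(x, y, n_knots, n_folds=5, seed=0):
    """Mean held-out MSE of a cubic spline with n_knots quantile-placed knots."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    errs = []
    for fold in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, fold)
        qs = np.linspace(0, 1, n_knots + 2)[1:-1]          # interior quantiles
        knots = np.quantile(x[train], qs)
        beta, *_ = np.linalg.lstsq(truncated_power_basis(x[train], knots),
                                   y[train], rcond=None)
        pred = truncated_power_basis(x[fold], knots) @ beta
        errs.append(np.mean((y[fold] - pred) ** 2))
    return np.mean(errs)

best_k = min(range(1, 9), key=lambda k: cv_mse(x, y, k))   # knot count with lowest CV error
```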
7.4.5 Comparison to Polynomial Regression
- why can regression splines give better results than polynomial regression?
  - splines introduce flexibility by increasing the number of knots while keeping the degree fixed
  - polynomials must use a high degree to produce flexible fits, which tends to produce wild behavior near the boundaries of the data
7.5 Smoothing Splines
7.5.1 An Overview of Smoothing Splines
- a smoothing spline minimizes a "Loss + Penalty" criterion:
  $$\sum_{i=1}^{n} (y_i - g(x_i))^2 + \lambda \int g''(t)^2 \, dt \qquad (7.11)$$
- $\lambda$: nonnegative tuning parameter
  - if $\lambda = 0$, the penalty has no effect and g can interpolate the data; as $\lambda \to \infty$, g becomes perfectly smooth and approaches the linear least squares fit
- the function g that minimizes (7.11) is known as a smoothing spline
- why this penalty?
  - $g''(t)$ is the amount by which the slope of g is changing at t, so the penalty term measures the total roughness of the function
- the g(x) that minimizes (7.11) is a natural cubic spline with knots at $x_1, \dots, x_n$ (a shrunken version of the one from basis-function regression)
7.5.2 Choosing the Smoothing Parameter $\lambda$
- choosing $\lambda$ amounts to choosing the effective degrees of freedom ($df_\lambda$), which decreases from $n$ to 2 as $\lambda$ increases from 0 to $\infty$
- cross-validation: LOOCV can be computed especially efficiently for smoothing splines (a scipy sketch follows)
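As a quick stand-in, scipy's `UnivariateSpline` fits a related smoothing spline; note that its smoothing factor `s` bounds the residual sum of squares rather than multiplying the roughness penalty like $\lambda$ in (7.11), but it plays the same flexibility-controlling role:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 10, 200))             # UnivariateSpline requires increasing x
y = np.sin(x) + rng.normal(scale=0.3, size=200)  # toy data

for s in (1.0, 10.0, 100.0):                     # larger s -> smoother fit
    spl = UnivariateSpline(x, y, k=3, s=s)
    print(s, spl.get_residual())                 # residual sum of squares of the fit
```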
7.6 Local Regression
- the fit at a target point $x_0$ uses only the nearby training observations, via weighted least squares (a minimal sketch follows this list)
- choosing the span $s$ (the fraction of training points used for each local fit): cross-validation
  - smaller $s$ -> more local, more flexible fit
- effective in the varying coefficient model (a multiple linear regression model that is global in some variables but local in another, such as time)
- performs poorly if p is much larger than about 3 or 4, since there will generally be very few training observations close to $x_0$ (the curse of dimensionality)
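A minimal sketch of local linear regression at a single target point, assuming tricube weights over the span's nearest neighbours (a standard loess-style recipe; the names and defaults here are illustrative):

```python
import numpy as np

def local_linear(x, y, x0, span=0.3):
    """Local regression at x0: weighted least squares on the span*n nearest points."""
    k = max(2, int(np.ceil(span * len(x))))
    d = np.abs(x - x0)
    nearest = np.argsort(d)[:k]                          # the k nearest neighbours of x0
    w = (1 - (d[nearest] / d[nearest].max()) ** 3) ** 3  # tricube weights, 0 at the window edge
    X = np.column_stack([np.ones(k), x[nearest]])
    A = X * w[:, None]                                   # weighted design matrix
    beta = np.linalg.solve(X.T @ A, X.T @ (w * y[nearest]))
    return beta[0] + beta[1] * x0                        # fitted value at x0

# e.g. evaluate on a grid: [local_linear(x, y, t, span=0.2) for t in np.linspace(0, 10, 50)]
```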
7.7 Generalized Additive Models
- a general framework for extending a standard linear model by allowing non-linear functions of each of the variables, while maintaining additivity
- can be used with both quantitative and qualitative responses
7.7.1 GAMs for Regression Problems
- the multiple linear regression model is extended by replacing each linear component $\beta_j x_{ij}$ with a smooth non-linear function $f_j(x_{ij})$:
  $$y_i = \beta_0 + \sum_{j=1}^{p} f_j(x_{ij}) + \epsilon_i$$
- the methods seen earlier in this chapter (natural splines, smoothing splines, local regression, ...) serve as building blocks for fitting an additive model; with smoothing splines, the fit can be found by backfitting
- Pros and Cons of GAMs
  - fitting a non-linear $f_j$ to each $X_j$ means we do not need to manually try out many different transformations on each variable
  - the non-linear fits can potentially make more accurate predictions
  - we can examine the effect of each $X_j$ on $Y$ individually while holding all of the other variables fixed -> useful for inference
  - the smoothness of each function $f_j$ can be summarized via degrees of freedom
  - the main limitation is that the model is restricted to be additive and does not include interaction terms, so important interactions can be missed (they can be added manually)
  - overall, GAMs provide a useful compromise between linear and fully nonparametric models (a fitting sketch follows this list)
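A minimal sketch of an additive fit, assuming each $f_j$ is represented by the cubic spline basis from section 7.4.3 and the whole model is fit jointly by least squares (real GAM software, such as R's `gam` package or Python's `pygam`, instead uses backfitting or penalized fitting; the helper here is hypothetical):

```python
import numpy as np

def gam_design(X, knots_per_col):
    """Additive design matrix: one cubic spline block per column, no interactions."""
    blocks = [np.ones((X.shape[0], 1))]                  # shared intercept
    for j, knots in enumerate(knots_per_col):
        xj = X[:, j]
        cols = [xj, xj**2, xj**3]
        cols += [np.clip(xj - k, 0.0, None) ** 3 for k in knots]
        blocks.append(np.column_stack(cols))
    return np.column_stack(blocks)

rng = np.random.default_rng(4)
X = rng.uniform(0, 10, size=(300, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 0.5 + rng.normal(scale=0.3, size=300)  # toy data

knots = [np.quantile(X[:, j], [0.25, 0.5, 0.75]) for j in range(2)]
beta, *_ = np.linalg.lstsq(gam_design(X, knots), y, rcond=None)  # joint least squares fit
```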
7.7.2 GAMs for Classification Problems
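For a qualitative (e.g., binary) response, the book keeps the additive structure inside the logit:

$$\log\!\left(\frac{p(X)}{1 - p(X)}\right) = \beta_0 + f_1(X_1) + f_2(X_2) + \cdots + f_p(X_p)$$

which is a logistic regression GAM; each $f_j$ can again be a non-linear function such as a natural spline.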
Feel free to share!