Chapter 7: Exercise 7

library(ISLR)
set.seed(1)
summary(Wage$maritl)
## 1. Never Married       2. Married       3. Widowed      4. Divorced 
##              648             2074               19              204 
##     5. Separated 
##               55
summary(Wage$jobclass)
##  1. Industrial 2. Information 
##           1544           1456
par(mfrow = c(1, 2))
plot(Wage$maritl, Wage$wage)
plot(Wage$jobclass, Wage$wage)

plot of chunk 7.1

It appears a married couple makes more money on average than other groups. It also appears that Informational jobs are higher-wage than Industrial jobs on average.

Polynomial and Step functions

fit = lm(wage ~ maritl, data = Wage)
deviance(fit)
## [1] 4858941
fit = lm(wage ~ jobclass, data = Wage)
deviance(fit)
## [1] 4998547
fit = lm(wage ~ maritl + jobclass, data = Wage)
deviance(fit)
## [1] 4654752

Splines

Unable to fit splines on categorical variables.

GAMs

library(gam)
## Loading required package: splines Loaded gam 1.09
fit = gam(wage ~ maritl + jobclass + s(age, 4), data = Wage)
deviance(fit)
## [1] 4476501

Without more advanced techniques, we cannot fit splines to categorical variables (factors). maritl and jobclass do add statistically significant improvements to the previously discussed models.