To illustrate local regression techniques, I’ve chosen three distinct data sets.
The first two are simulated.
In exa, the true curve is smooth: it stays relatively flat on the left side and fluctuates noticeably on the right.

In exb, the true curve is a simple straight line, but two outliers may distort the estimated curve.

The third data set consists of observations of the Old Faithful Geyser. It is familiar from many statistics courses: each data point records the duration of one eruption (x-axis) and the waiting time between eruptions (y-axis). There’s an evident positive correlation between these two variables. While it might be tempting to fit a linear model, in this session we will explore non-linear modeling to see what further structure the data reveal.
# Three panels side by side
par(mfrow=c(1,3))

# Example A: simulated data; column m holds the true curve
url = "https://liangfgithub.github.io/Data/Example_A.csv"
exa = read.csv(url)
plot(y ~ x, exa, main="Example A")
lines(m ~ x, exa)

# Example B: simulated data with two outliers
url = "https://liangfgithub.github.io/Data/Example_B.csv"
exb = read.csv(url)
plot(y ~ x, exb, main="Example B")
lines(m ~ x, exb)

# Old Faithful: waiting time against eruption duration
url = "https://liangfgithub.github.io/Data/faithful.dat"
faithful = read.table(url, header=TRUE)
plot(waiting ~ eruptions, faithful, main="Old Faithful")
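
As a rough preview of the contrast between a straight-line fit and a local fit on the Old Faithful data, the sketch below overlays both. It rests on assumptions not made in the text above: it uses R's built-in faithful data set (from the datasets package) as a stand-in for the file read from the URL, and it picks loess() with its default span of 0.75 purely for illustration, not as the method or tuning used later in these notes.

```r
# Assumption: the built-in `faithful` data (datasets package) stands in for
# the file loaded from the URL; it has columns `eruptions` and `waiting`.
data(faithful, package = "datasets")

# Straight-line fit, for reference
fit_lm <- lm(waiting ~ eruptions, data = faithful)

# Local regression; span = 0.75 is loess()'s default (illustrative choice)
fit_lo <- loess(waiting ~ eruptions, data = faithful, span = 0.75)

# Evaluate both fits on a grid spanning the observed eruption durations
grid <- data.frame(
  eruptions = seq(min(faithful$eruptions), max(faithful$eruptions),
                  length.out = 100)
)
pred_lm <- predict(fit_lm, newdata = grid)
pred_lo <- predict(fit_lo, newdata = grid)

# Scatter plot with the linear fit (dashed) and the local fit (solid)
plot(waiting ~ eruptions, faithful, main = "Old Faithful")
lines(grid$eruptions, pred_lm, lty = 2)
lines(grid$eruptions, pred_lo)
```

Wherever the solid curve departs from the dashed line, the local fit is picking up structure that a single straight line cannot represent. How much it departs depends on the span, so a different value may give a noticeably different picture.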