(From ESL 5.2.3 Example: Phoneme Recognition)
In this example we use splines to reduce flexibility rather than increase it; the application comes under the general heading of functional modeling.
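As a rough illustration of how a spline basis can reduce flexibility, the sketch below projects length-256 curves onto a small natural cubic spline basis. The choice of 12 basis functions, the names `H`, `X_demo`, `X_star`, `theta`, and `beta`, and the synthetic data are all assumptions made for illustration; none of them come from the text above.

```r
library(splines)

# Assumed setup: 256 frequencies and 12 basis functions (df = 12 is an assumption)
p <- 256
H <- ns(1:p, df = 12)               # p x 12 natural cubic spline basis matrix

# Synthetic stand-in for log-periodograms: 5 curves of length 256 (not real data)
set.seed(1)
X_demo <- matrix(rnorm(5 * p), nrow = 5)

# Filtered inputs: each curve is summarized by 12 numbers instead of 256
X_star <- X_demo %*% H

# Any coefficient vector theta of length 12 maps to a smooth
# length-256 coefficient curve beta = H %*% theta
theta <- rnorm(12)
beta <- as.vector(H %*% theta)

dim(H)
dim(X_star)
length(beta)
```

A linear model fitted on `X_star` has 12 coefficients rather than 256, and the implied coefficient curve over frequency is constrained to be smooth, which is the sense in which the basis restricts rather than expands the model.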
The Phoneme data were extracted from the TIMIT database (TIMIT Acoustic-Phonetic Continuous Speech Corpus, NTIS, US Dept of Commerce), which is a widely used resource for research in speech recognition. A dataset was formed by selecting five phonemes for classification based on digitized speech from this database. The phonemes are transcribed as follows: “sh” as in “she”, “dcl” as in “dark”, “iy” as the vowel in “she”, “aa” as the vowel in “dark”, and “ao” as the first vowel in “water”.
From continuous speech of 50 male speakers, 4509 speech frames of 32 msec duration were selected. Each frame is represented by 512 samples at a 16 kHz sampling rate and corresponds to one of the five phonemes above.
From each speech frame, a log-periodogram is computed, which is one of several widely used methods for casting speech data in a form suitable for speech recognition. Thus the data used in what follows consist of 4509 log-periodograms of length 256, with known class (phoneme) memberships.
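To make the frame-to-periodogram step concrete, here is a minimal sketch that computes a log-periodogram of length 256 from one 512-sample frame. The frame is synthetic (a 1 kHz tone plus noise), and the scaling, the absence of windowing, and the choice of which 256 frequency bins to keep are assumptions; the exact preprocessing used for the dataset is not stated here.

```r
# Synthetic frame (not TIMIT data): 512 samples at 16 kHz, i.e. 32 ms
set.seed(1)
fs <- 16000
n <- 512
t <- (0:(n - 1)) / fs
frame <- sin(2 * pi * 1000 * t) + 0.1 * rnorm(n)

# Raw periodogram: squared modulus of the DFT, scaled by the frame length
dft <- fft(frame)
pgram <- Mod(dft)^2 / n

# Keep 256 bins; dropping the zero-frequency bin is an assumption
keep <- 2:(n / 2 + 1)
log_pgram <- log(pgram[keep])
freq_hz <- (keep - 1) * fs / n      # frequency of each retained bin in Hz

length(log_pgram)                   # 256 values per frame
freq_hz[which.max(log_pgram)]       # peak at the 1000 Hz tone
```

Because the spectrum of a real-valued signal is symmetric, 512 samples yield only about half as many distinct frequency bins, which is consistent with the length-256 curves described above.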
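# Download the full phoneme dataset
url = "https://hastie.su.domains/ElemStatLearn/datasets/phoneme.data"
phoneme = read.csv(url, header = TRUE)
# Keep only the two phonemes of interest
mydata = phoneme[phoneme$g %in% c("aa", "ao"), ]
rm(phoneme)
mydata[1:2, ]
dim(mydata)
# Drop the non-feature columns (1, 258, 259), keeping the 256 log-periodogram values
X = data.matrix(mydata[, -c(1, 258, 259)])
# Column 258 holds the phoneme label
Y = mydata[, 258]
table(Y)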
## Y
## aa ao
## 695 1022
Y = ifelse(Y=="ao", 1, 0)
table(Y)
## Y
## 0 1
## 695 1022
The figure below displays a sample of 15 log-periodograms for each of the two phonemes “aa” and “ao”, measured at 256 frequencies. The goal is to use such data to classify a spoken phoneme. These two phonemes were chosen because they are difficult to separate.
# Draw 15 curves per class (sampled from the first 500 observations of each class)
id0 = which(Y==0)[sample(1:500, 15)]
id1 = which(Y==1)[sample(1:500, 15)]
plot(c(1, 256), range(X[c(id0, id1), ]), type="n",
xlab = "Frequency",
ylab = "Log-periodogram"
)
for(i in id0)
lines(X[i, ], col="green")
for(i in id1)
lines(X[i, ], col="orange")
legend("topright", c("aa", "ao"), lty=c(1,1),
col = c("green", "orange"))