(From ESL 5.2.3 Example: Phoneme Recognition)

In this example we use splines to reduce flexibility rather than increase it; the application comes under the general heading of functional modeling.

Phoneme Data

The Phoneme data were extracted from the TIMIT database (TIMIT Acoustic-Phonetic Continuous Speech Corpus, NTIS, US Dept of Commerce), a widely used resource for research in speech recognition. A dataset was formed by selecting five phonemes for classification based on digitized speech from this database. The phonemes are transcribed as follows:

  • “sh” as in “she”
  • “dcl” as in “dark”
  • “iy” as the vowel in “she”
  • “aa” as the vowel in “dark”
  • “ao” as the first vowel in “water”.

From continuous speech of 50 male speakers, 4509 speech frames of 32 ms duration were selected. Each frame is represented by 512 samples at a 16 kHz sampling rate and corresponds to one of the five phonemes above.

From each speech frame, a log-periodogram is computed, which is one of several widely used methods for casting speech data in a form suitable for speech recognition. Thus the data used in what follows consist of 4509 log-periodograms of length 256, with known class (phoneme) memberships.
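To make the step from a 512-sample frame to a length-256 log-periodogram concrete, here is a minimal sketch in R. It is not the procedure used to build the TIMIT-derived dataset: the helper name `log_periodogram`, the 1/n scaling, the choice to drop the zero-frequency term, and the synthetic 1 kHz test frame are all assumptions made for illustration.

```r
# Hypothetical helper: log-periodogram of one frame.
# The 1/n scaling and the dropped zero-frequency term are assumptions.
log_periodogram <- function(frame) {
  n <- length(frame)
  power <- Mod(fft(frame))^2 / n   # raw periodogram at Fourier frequencies 0..n-1
  log(power[2:(n / 2 + 1)])        # keep the n/2 positive frequencies up to Nyquist
}

set.seed(1)
fs <- 16000                        # sampling rate in Hz, as stated in the text
n  <- 512                          # 512 samples / 16000 Hz = 32 ms per frame
t  <- (0:(n - 1)) / fs
# Synthetic frame (not real speech): a 1 kHz tone plus a little noise
frame <- sin(2 * pi * 1000 * t) + rnorm(n, sd = 0.1)

lp <- log_periodogram(frame)
length(lp)                         # number of frequencies kept
which.max(lp)                      # index of the dominant frequency
```

Under these assumptions, a frame of 512 samples yields 256 values, one per frequency spaced fs / n = 31.25 Hz apart, which matches the length-256 curves described above. The 1 kHz tone falls at index 1000 / 31.25 = 32.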

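# Download the phoneme data (columns: row.names, x.1 ... x.256, g, speaker)
url = "https://hastie.su.domains/ElemStatLearn/datasets/phoneme.data"
phoneme = read.csv(url, header = TRUE)
# Keep only the two phonemes "aa" and "ao"
mydata = phoneme[phoneme$g %in% c("aa", "ao"), ]
rm(phoneme)
mydata[1:2, ]
dim(mydata)
# Drop the row-name, class (g) and speaker columns; the 256 log-periodogram values remain
X = data.matrix(mydata[, -c(1, 258, 259)])
# Column 258 holds the phoneme label g
Y = mydata[, 258]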
table(Y)
## Y
##   aa   ao 
##  695 1022
# Recode the labels as a binary response: "ao" = 1, "aa" = 0
Y = ifelse(Y == "ao", 1, 0)
table(Y)
## Y
##    0    1 
##  695 1022

The figure below displays a sample of 15 log-periodograms for each of the two phonemes “aa” and “ao”, measured at 256 frequencies. The goal is to use such data to classify a spoken phoneme. These two phonemes were chosen because they are difficult to separate.

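# Draw 15 curves at random from all observations of each class
# (no seed is set, so the sample changes between runs)
id0 = sample(which(Y == 0), 15)
id1 = sample(which(Y == 1), 15)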

plot(c(1, 256), range(X[c(id0, id1), ]), type="n", 
     xlab = "Frequency", 
     ylab = "Log-periodogram"
)

for(i in id0)
  lines(X[i, ], col="green")
for(i in id1)
  lines(X[i, ], col="orange")
legend("topright", c("aa", "ao"), lty = c(1, 1),
       col = c("green", "orange"))