Multidimensional Scaling

To illustrate multidimensional scaling, consider a dataset from a study of which pairs of letters people tend to confuse. The data below are from Wolford and Hollingsworth (1974) and cover eight letters: C, D, G, H, M, N, Q, and W. The result is an 8×8 matrix in which each cell records how often the row letter and the column letter were confused with each other.

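# Start from an empty 8 x 8 matrix, one row and one column per letter
D = matrix(0, 8, 8)
# Note: this masks R's built-in `letters` constant for the rest of the session
letters = c("C", "D", "G", "H", "M", "N", "Q", "W")
colnames(D) = letters
rownames(D) = letters
# Fill the lower triangle column by column with the confusion counts
D[2:8, 1] = c(5, 12, 2, 2, 2, 9, 1)
D[3:8, 2] = c(2, 4, 3, 4, 20, 5)
D[4:8, 3] = c(3, 2, 1, 9, 2)
D[5:8, 4] = c(19, 18, 1, 5)
D[6:8, 5] = c(16, 2, 18)
D[7:8, 6] = c(8, 13)
D[8, 7] = 4
# Mirror the lower triangle into the upper triangle (the diagonal stays 0)
D = D + t(D)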
D
##    C  D  G  H  M  N  Q  W
## C  0  5 12  2  2  2  9  1
## D  5  0  2  4  3  4 20  5
## G 12  2  0  3  2  1  9  2
## H  2  4  3  0 19 18  1  5
## M  2  3  2 19  0 16  2 18
## N  2  4  1 18 16  0  8 13
## Q  9 20  9  1  2  8  0  4
## W  1  5  2  5 18 13  4  0

The matrix \(\mathbf{D}\) is symmetric and records similarity rather than distance: a larger value means the row and column letters are confused more often, which we read as the two letters being more similar.
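As a quick sanity check on the reading above, the following sketch rebuilds the matrix and lists the most frequently confused pairs. The names `labs`, `pairs`, and `top` are chosen here for illustration and do not appear in the original code; `labs` is used instead of `letters` only to avoid masking R's built-in constant.

```r
# Rebuild the confusion matrix from the table so this block runs on its own.
labs <- c("C", "D", "G", "H", "M", "N", "Q", "W")
D <- matrix(0, 8, 8, dimnames = list(labs, labs))
# lower.tri() fills column by column, matching the order used in the text
D[lower.tri(D)] <- c(5, 12, 2, 2, 2, 9, 1,
                     2, 4, 3, 4, 20, 5,
                     3, 2, 1, 9, 2,
                     19, 18, 1, 5,
                     16, 2, 18,
                     8, 13,
                     4)
D <- D + t(D)

# The matrix should equal its own transpose
sym_ok <- isSymmetric(D)

# One row per unordered pair of letters, sorted by confusion count
idx <- which(lower.tri(D), arr.ind = TRUE)
pairs <- data.frame(a = labs[idx[, "col"]],
                    b = labs[idx[, "row"]],
                    count = D[idx])
top <- head(pairs[order(-pairs$count), ], 5)
print(sym_ok)
print(top)
```

The pair at the top of the list is D and Q, with a count of 20, the largest entry in the table.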

First, we convert the similarity matrix \(\mathbf{D}\) into a distance matrix \(\mathbf{D}_0\) using \(21-\mathbf{D}\). Subtracting from a constant reverses the ordering, so that the most similar pairs become the closest ones, and because the largest entry of \(\mathbf{D}\) is 20, every off-diagonal entry of \(\mathbf{D}_0\) stays positive (between 1 and 20). We then set the diagonal of \(\mathbf{D}_0\) to zero, since the distance from a letter to itself should be zero.

D0 = 21 - D
diag(D0) = 0
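The conversion can be checked with a few lines. This is a minimal sketch that rebuilds `D` so it runs on its own; `labs`, `off`, and `rho` are names introduced here for illustration only.

```r
# Rebuild the confusion matrix from the table (same values as in the text).
labs <- c("C", "D", "G", "H", "M", "N", "Q", "W")
D <- matrix(0, 8, 8, dimnames = list(labs, labs))
D[lower.tri(D)] <- c(5, 12, 2, 2, 2, 9, 1,  2, 4, 3, 4, 20, 5,
                     3, 2, 1, 9, 2,  19, 18, 1, 5,
                     16, 2, 18,  8, 13,  4)
D <- D + t(D)

# Similarity -> dissimilarity, as in the text
D0 <- 21 - D
diag(D0) <- 0

# Off-diagonal entries: all positive, ranging from 21 - 20 to 21 - 1
off <- D0[lower.tri(D0)]
print(range(off))

# The ordering is exactly reversed: rank correlation with D is -1
rho <- cor(D[lower.tri(D)], off, method = "spearman")
print(rho)

# The most-confused pair (D, Q) is now the closest pair
print(D0["D", "Q"])
```

Any constant larger than 20 would preserve these properties; the choice of constant changes the distances themselves, not their ordering.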

We then apply classical multidimensional scaling to \(\mathbf{D}_0\) with `cmdscale()`, which by default embeds the points in two dimensions, and plot each letter at its fitted coordinates (left panel). To see how sensitive the picture is to the similarity-to-distance conversion, we repeat the analysis with \(41-\mathbf{D}\) (right panel). The two configurations are largely consistent, differing mainly in scale.

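# Classical MDS on D0; the default k = 2 returns two coordinates per letter
tmp = cmdscale(D0)
par(mfrow=c(1,2))  # two panels side by side
# Draw an empty frame, then place each letter at its fitted coordinates
plot(tmp[, 1], tmp[, 2], type="n", xlab="", ylab="", 
     xlim = c(-15, 15), ylim=c(-15, 15))
text(tmp[, 1], tmp[, 2], labels = letters)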

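# Same analysis with a larger constant in the conversion
D1 = 41 - D
diag(D1) = 0
tmp = cmdscale(D1)
# Wider axis limits, since the distances are larger
plot(tmp[, 1], tmp[, 2], type="n", xlab="", ylab="", 
     xlim = c(-20, 20), ylim=c(-20, 20))
text(tmp[, 1], tmp[, 2], labels = letters)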
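
To make the embedding step less of a black box, here is a minimal sketch of classical (Torgerson) scaling done by hand and compared with `cmdscale()`. It double-centers the squared distances, \(\mathbf{B} = -\tfrac{1}{2}\mathbf{J}\mathbf{D}_0^{(2)}\mathbf{J}\) with \(\mathbf{J} = \mathbf{I} - \tfrac{1}{n}\mathbf{1}\mathbf{1}^\top\), and scales the two leading eigenvectors of \(\mathbf{B}\) by the square roots of their eigenvalues. The names `labs`, `J`, `B`, `X`, and `fit` are chosen here for illustration, and the sketch assumes the two leading eigenvalues are positive and distinct.

```r
# Rebuild D and D0 so this block runs on its own.
labs <- c("C", "D", "G", "H", "M", "N", "Q", "W")
D <- matrix(0, 8, 8, dimnames = list(labs, labs))
D[lower.tri(D)] <- c(5, 12, 2, 2, 2, 9, 1,  2, 4, 3, 4, 20, 5,
                     3, 2, 1, 9, 2,  19, 18, 1, 5,
                     16, 2, 18,  8, 13,  4)
D <- D + t(D)
D0 <- 21 - D
diag(D0) <- 0

# Double-center the squared distances
n <- nrow(D0)
J <- diag(n) - matrix(1 / n, n, n)      # centering matrix
B <- -0.5 * J %*% (D0^2) %*% J

# Coordinates: leading eigenvectors scaled by sqrt of their eigenvalues
e <- eigen(B, symmetric = TRUE)
X <- e$vectors[, 1:2] %*% diag(sqrt(e$values[1:2]))
rownames(X) <- labs

# Reference fit; eig = TRUE also returns eigenvalues and goodness of fit
fit <- cmdscale(D0, k = 2, eig = TRUE)

# Each axis is only determined up to a sign flip, so compare absolute values
print(max(abs(abs(X) - abs(fit$points))))
print(fit$GOF)
```

The `GOF` component reports how much of the eigenvalue total the two retained dimensions account for, which gives a rough sense of how faithful the two-dimensional picture is.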