Random Indexing
Ross Gayler pointed out two interesting statements in An Introduction to Random Indexing (Magnus Sahlgren):
Both these are very powerful properties of vector space representations, and these are particularly nice statemnts of these properties.
- “Hecht-Nielsen (1994), … demonstrated that there are many more nearly orthogonal than truly orthogonal directions in a high-dimensional space.” (Hecht-Nielsen, R.; Context vectors; general purpose approximate meaning representations self-organized from raw data. In Zurada, J. M.; R. J. Marks II; C. J. Robinson; Computational intelligence: imitating life. IEEE Press, 1994.)
- “the Johnson-Lindenstrauss lemma (Johnson & Lindenstrauss, 1984) – that states that if we project points in a vector space into a randomly selected subspace of sufficiently high dimensionality, the distances between the points are approximately preserved.” (Johnson, W. B. & J. Lindenstrauss; Extensions to Lipshitz mapping into Hilbert space. Contemporary mathematics. 26, 1984.)
Both these are very powerful properties of vector space representations, and these are particularly nice statemnts of these properties.

1 Comments:
The phrase "the curse of dimensionality" suggests that high-dimensional spaces bring problems, but these two results suggest that they can also bring blessings. This apparent paradox might be a fruitful topic to investigate: Under what circumstances is a high-dimensionality a blessing/curse?
For example, the JL lemma states (approximately) that projection to a lower dimensional space preserves structure. However, the curse of dimensionality suggests that the high-dimensional representation is unlikely to have any useful structure worth preserving in the projection. (By the H-N result, as the dimensionality increases everything tends to orthogonality.)
This suggests that a special process is needed to construct the high-dimensional representations in such a way that there is worthwhile structure to subsequently be projected back into a lower-dimensional subspace.
Can the J-L lemma be run backwards? That is, if we project from a lower-dimensional space to a higher-dimensional one is the structure preserved? If so, perhaps this is a mechanism by which high-dimensional representations can be constructed such that they have useful structure.
Anyway, I would like to see these two results unified with the curse of dimensionality.
By Ross Gayler, at 7:59 PM
Post a Comment
<< Home