Angles between words
Natural language processing represents words as high-dimensional vectors, on the order of 100 dimensions. For example, the glove-wiki-gigaword-50 set of word vectors contains 50-dimensional vectors, and the glove-wiki-gigaword-200 set of word vectors contains 200-dimensional vectors.
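The dimensions are easy to verify with the gensim downloader used later in this post. A minimal check, assuming the model names above are available through gensim:

```python
import gensim.downloader as api

# Download (on first use) and load one of the models mentioned above,
# then confirm that each word maps to a 50-dimensional vector.
vectors = api.load("glove-wiki-gigaword-50")
print(vectors["dog"].shape)  # (50,)
```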
The intent is to represent words in such a way that the angle between vectors is related to similarity between words. Closely related words would be represented by vectors that are close to parallel. On the other hand, words that are unrelated should have large angles between them. The metaphor of two independent things being orthogonal holds almost literally as we'll illustrate below.
Cosine similarity

For vectors x and y in two dimensions,

$$x \cdot y = \|x\| \, \|y\| \cos \theta$$

where θ is the angle between the vectors. In higher dimensions, this relation defines the angle in terms of the dot product and norms:

$$\cos \theta = \frac{x \cdot y}{\|x\| \, \|y\|}$$
The right-hand side of this equation is the cosine similarity of x and y. NLP usually speaks of cosine similarity rather than θ, but you could always take the inverse cosine of cosine similarity to compute θ. Note that cos(0) = 1, so small angles correspond to large cosines.
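As a sanity check on the formula, here is a short computation with two vectors whose angle is known in advance:

```python
import numpy as np

# (1, 0) and (1, 1) are 45 degrees apart, so the cosine similarity
# should be cos(45°) = 1/sqrt(2) ≈ 0.7071.
x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0])

cos_theta = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
print(cos_theta)                         # 0.7071...
print(np.degrees(np.arccos(cos_theta)))  # 45.0
```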
Examples

For our examples we'll use gensim with word vectors from the glove-twitter-200 model. As the name implies, this data set maps words to 200-dimensional vectors.
Note that word embeddings differ in the data they were trained on and the algorithm used to produce the vectors. The examples below could be very different using a different source of word vectors.
First, some setup code.
```python
import numpy as np
import gensim.downloader as api

# Download (on first use) and load the 200-dimensional Twitter GloVe vectors.
word_vectors = api.load("glove-twitter-200")

def norm(word):
    # Euclidean norm of a word's vector.
    v = word_vectors[word]
    return np.dot(v, v)**0.5

def cosinesim(word0, word1):
    # Cosine similarity: dot product divided by the product of the norms.
    v = word_vectors[word0]
    w = word_vectors[word1]
    return np.dot(v, w)/(norm(word0)*norm(word1))
```
Using this model, the cosine similarity between "dog" and "cat" is 0.832, which corresponds to about a 34° angle. The cosine similarity between "dog" and "wrench" is 0.145, which corresponds to an angle of about 82°. A dog is more like a cat than like a wrench.
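To recover the angles quoted here, take the inverse cosine of the similarity. A small helper along these lines works, assuming the setup code above has run (the function name angle_degrees is mine, not part of the original code):

```python
import numpy as np

def angle_degrees(word0, word1):
    # Inverse cosine of the similarity, converted from radians to degrees.
    return np.degrees(np.arccos(cosinesim(word0, word1)))

print(angle_degrees("dog", "cat"))     # about 34 degrees
print(angle_degrees("dog", "wrench"))  # about 82 degrees
```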
The similarity between "dog" and "leash" is 0.487, not because a dog is like a leash, but because the word "leash" is often used in the same context as the word "dog." The similarity between "cat" and "leash" is only 0.328 because people speaking of leashes are more likely to also be speaking about a dog than a cat.
The cosine similarity between "uranium" and "walnut" is only 0.0054, corresponding to an angle of 89.7°. The vectors associated with the two words are very nearly orthogonal because the words are orthogonal in the metaphorical sense.
Note that opposites are somewhat similar. Uranium is not the opposite of walnut because things have to have something in common to be opposites. The cosine similarity of "expensive" and "cheap" is 0.706. Both words are adjectives describing prices and so in some sense they're similar, though they have opposite valence. "Expensive" has more in common with "cheap" than with "pumpkin" (similarity 0.192).
The similarity between "admiral" and "general" is 0.305, maybe less than you'd expect. But the word "general" is kinda general: it can be used in many more contexts than the military one. If you add the vectors for "army" and "general", you get a vector that has cosine similarity 0.410 with "admiral."
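This last computation needs a version of cosine similarity that works on raw vectors rather than words, since the sum of two word vectors isn't itself in the vocabulary. A sketch, reusing word_vectors from the setup code (the helper name vec_cosine is mine):

```python
import numpy as np

def vec_cosine(v, w):
    # Cosine similarity of two raw vectors.
    return np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))

combined = word_vectors["army"] + word_vectors["general"]
print(vec_cosine(combined, word_vectors["admiral"]))  # about 0.41 with these vectors
```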