MIT Technology Review has a brief but intriguing article called “How Google Converted Language Translation Into a Problem of Vector Space Mathematics.” If only I could have read it (or rather, the paper it’s based on) when I was a math major, forty-plus years ago!
The new trick is to represent an entire language using the relationship between its words. The set of all the relationships, the so-called “language space”, can be thought of as a set of vectors that each point from one word to another. And in recent years, linguists have discovered that it is possible to handle these vectors mathematically. For example, the operation ‘king’ – ‘man’ + ‘woman’ results in a vector that is similar to ‘queen’.
It turns out that different languages share many similarities in this vector space. That means the process of converting one language into another is equivalent to finding the transformation that converts one vector space into the other.
This turns the problem of translation from one of linguistics into one of mathematics. […]
The method can be used to extend and refine existing dictionaries, and even to spot mistakes in them. Indeed, the Google team do exactly that with an English-Czech dictionary, finding numerous mistakes.
That would have been right up my alley. Alas, having forgotten all the math I once knew, I can only gape and wonder if it’s all it’s cracked up to be. (Thanks, Nick!)
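For readers who want to see the two ideas in the quoted passage at work, here is a minimal sketch in Python with NumPy. Everything in it is an assumption made for illustration: the three-dimensional word vectors are hand-picked rather than learned from text, the "Spanish" space is simply the "English" one rotated, and the helper names (`cosine`, `nearest`) are hypothetical. It is not Google's code or data. It shows only that the ‘king’ – ‘man’ + ‘woman’ arithmetic can land on ‘queen’, and that a linear map fitted on a few known word pairs can carry a word it has never seen into the other space.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def nearest(vec, vocab, exclude=()):
    """Return the word in vocab whose vector is most similar to vec."""
    candidates = {w: v for w, v in vocab.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(vec, candidates[w]))


# Toy "English" space (assumption: hand-picked axes for
# royalty, maleness, femaleness; real vectors are learned from text).
english = {
    "king": np.array([1.0, 1.0, 0.0]),
    "man": np.array([0.0, 1.0, 0.0]),
    "woman": np.array([0.0, 0.0, 1.0]),
    "queen": np.array([1.0, 0.0, 1.0]),
    "princess": np.array([0.7, 0.0, 1.0]),
}

# 1. Vector arithmetic: king - man + woman should point at queen.
analogy = english["king"] - english["man"] + english["woman"]
analogy_word = nearest(analogy, english, exclude=("king", "man", "woman"))
print(analogy_word)  # queen

# Toy "Spanish" space (assumption: the same geometry, rotated).
theta = np.pi / 5
rotation = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
spanish = {
    "rey": english["king"] @ rotation,
    "hombre": english["man"] @ rotation,
    "mujer": english["woman"] @ rotation,
    "reina": english["queen"] @ rotation,
}

# 2. Fit a linear map W from a small seed dictionary by least squares,
#    so that (English vector) @ W is close to the Spanish vector.
seed_pairs = [("king", "rey"), ("man", "hombre"), ("woman", "mujer")]
X = np.stack([english[en] for en, _ in seed_pairs])
Z = np.stack([spanish[es] for _, es in seed_pairs])
W, *_ = np.linalg.lstsq(X, Z, rcond=None)

# "queen" was not in the seed dictionary; map it and look up its neighbour.
translated = nearest(english["queen"] @ W, spanish)
print(translated)  # reina
```

In this toy setup the fit is exact because the second space really is a rotation of the first. With vectors learned from real text the match would only be approximate, and it is the poorly matching dictionary entries that would stand out as possible mistakes.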