Think of Ten Different Words.

From the About page:

The Divergent Association Task is a quick measure of verbal creativity and divergent thinking […]. The task involves thinking of 10 words that are as different from each other as possible. For example, the words cat and dog are similar, but the words cat and book are not.

Do the Task — it’s fun! (And if you let them use your answers, you’re helping Science.) On my first try, I got 81.51, “higher than 70.03% of the people who have completed this task.” Via MetaFilter, where you will find much nerdy discussion of the details and how to raise or lower your score (only the first try is used in their research); a commenter there refers to the “Word for Word” segment of BBC radio’s comedy quiz show I’m Sorry I Haven’t A Clue, in which “players may say any word as long as it has no connection whatsoever to the previous word” (sample clip).


  1. For this, I cry foul:

    “Using a subset of words allows for some redundancy; you can include up to three invalid words (e.g., misspelled words or words that are too technical) while still receiving an overall score.”

    So my “monopsony” was tossed out. Feh.

    (Mind you, I’m griping despite scoring higher than 97.71% of persons tested.)

  2. Yeah, they only count the first seven words unless one or more of them is invalid, so front-load your gems!

  3. jack morava says

    By chance yesterday I noticed this:

    ‘Examples of his coinages, many of which are of a scientific or medical nature, include

    ambidextrous, antediluvian, analogous, approximate, ascetic, anomalous, carnivorous, coexistence, coma, compensate, computer, cryptography, cylindrical, disruption, ergotisms, electricity, exhaustion, ferocious, follicle, generator, gymnastic, hallucination, herbaceous, holocaust, insecurity, indigenous, jocularity, literary, locomotion, medical, migrant, mucous, prairie, prostate, polarity, precocious, pubescent, therapeutic, suicide, ulterior, ultimate, and veterinarian.’

  4. David Eddyshaw says

    A princely 87.4% (“higher than 92.03% of the people who have completed this task.”)

    As my thought processes (if that is the term I want) work more or less at random, I’m not surprised.
    (I’d love to believe that the test reflects creativity, but this seems implausible.)

  5. My husband’s list had “clitic” and “bioluminescence” tossed out. How valid can this be if the high vocabulary users’ results are artificially clipped?

  6. J.W. Brewer says

    Perhaps because I erred too much on the side of avoiding nouns that might be considered too obscure and thus “technical,” I had an unimpressive “83.19, higher than 78.14% of the people who have completed this task.” FWIW my most-divergent word pair (out of the first seven, since they all counted) was cloud/turntable, which got a 96; the least-divergent was canyon/jaguar, which got a 74. If I’d stopped and tried to tweak my selections before just hitting submit I might have done a bit better (e.g. I could have replaced “jaguar” with some other large animal whose geographical range is nowhere near that of canyons, or with “starfish,” which was one of my unscored three at the end).

  7. I got 89.46. The closest pair I had were passion and injustice.

  8. J.W. Brewer says

    Now I’m wondering how challenging it would be to try to do this backwards. Get a “low score” benchmark by something obvious like “seven color words” or “seven common types of fruit” and see if you can get even lower than that by finding an even less-divergent (on their metric) set.

  9. David Eddyshaw says

    One, two, three … ten resulted in

    Your score is 14.9, higher than 0.0% of the people who have completed this task

  10. Impressive!

  11. January First-of-May says

    Yeah, they only count the first seven words unless one or more of them is invalid, so front-load your gems!

    You should have told me that in advance; I saved the abstract concepts for the last three, and ended up with 89.31 (“higher than 95.48%”).

    I could have probably also scored higher if I replaced the piece of furniture with an inanimate object that wasn’t furniture, to reduce similarity with the building.

  12. Athel Cornish-Bowden says

    At first I thought it too difficult for someone with approaching senility, but then I plucked up my courage:

    “85.96, higher than 88.33% of the people who have completed this task”

    Lowest score (74) for homework and house; highest (96) for homework and pelvis.

  13. ə de vivre says

    After playing around with it for a bit, the highest scoring list some friends and I came up with was: fucks, metatarsals, distillery, mergansers, magazine, exoplanet, schema.

    Not all numbers are equally proximate either. “Three, four, five, six, seven, eight, nine” does better than “one” through “seven.”

  14. looking at the score screen (92.83!), i got rather confused by their use of “semantic difference” – partly because i couldn’t see why “sirocco” was closer to “aphasia” than “theology” (my lowest and highest scoring pairs). but then i saw that what they use is a proximity measure within a corpus of webpages. to which i can only say Not. The. Same. Thing. and i’d argue not even much of a proxy, unless your corpus is mostly thesauri.

    edit: now i see that the metafilterers were on this already; gonna still leave this here on general principles

    more edit: even though the scoring system is skew to the task, i think the set of submissions would actually be very useful for research into semantic distance – or, better, into how people understand maximal semantic difference!

    but it maybe also explains why so many of us are coming up with rejected words: the corpus must not contain them (or have too few instances for their algorithm to score). which makes a certain kind of sense for “clitic”, “spandrel”, and “benzodiazepine” (which i may have also misspelled), and maybe less so for “bioluminescence” or “ironworker”. (if you ask me, it’s also not a great corpus for semantic-distance purposes if it doesn’t recognize “simoom” but knows from “sirocco”)

    last edit, because it’s a gem (at least for my age cohort):
    I entered the first ten nouns in the lyrics of REM’s It’s the End of the World as We Know It.
    “Your score is 80.34, higher than 63.68% of the people who have completed this task”

  15. ə de vivre says

    From the website of the algorithm they use: “GloVe is essentially a log-bilinear model with a weighted least-squares objective. The main intuition underlying the model is the simple observation that ratios of word-word co-occurrence probabilities have the potential for encoding some form of meaning.”

    So, it’s a bit of a circular definition, where semantic similarity refers to whatever it is that leads to words being used together in the same text. It strikes me that the aims of GloVe would be a lot clearer if they just left out the word “semantic.”

  16. J.W. Brewer says

    The google n-gram viewer says that “monopsony” was more frequent than “metatarsals” in its corpus throughout the period 1973 through 2003, so treating the former as flunking the “too technical” criterion but accepting the latter seems a bit suspect to me. Although I guess this just confirms that the optimal strategy is to probably include three such borderline lexemes in your first seven.

  17. Ben Tolley says

    92.06, which was a surprise because I did it in a hurry before leaving work; I would probably have gone for more abstract nouns if I had had more time to think. I didn’t have long to study the results, but I think it rejected idiolect, my one linguistics-related word. The most divergent pair was generation and mincemeat (111), which makes sense if it’s a proximity measure. I can’t imagine those turning up in the same text unless it’s a very strange horror story. (The least divergent were grommet and puncture.)

  18. I took it to mean finding words with the least amount of association with each other. Using “how often the words are used together in similar contexts” to measure that doesn’t seem unreasonable. But calling that “average semantic distance” is a misnomer.
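
A minimal sketch of the scoring the commenters are describing, assuming the score is the mean cosine distance over all pairs among the first seven valid words, scaled by 100. The three-dimensional vectors are made up for illustration (the real task uses GloVe embeddings), and the names `cosine_distance` and `dat_style_score` are hypothetical, not taken from the task's own code.

```python
import math
from itertools import combinations

def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def dat_style_score(words, vectors, keep=7):
    """Mean pairwise cosine distance (x100) over the first `keep` valid words.

    Assumption: a word with no vector counts as invalid and is skipped,
    mirroring the "up to three invalid words" rule quoted in the thread.
    """
    valid = [w for w in words if w in vectors][:keep]
    if len(valid) < keep:
        raise ValueError("not enough valid words to score")
    pairs = list(combinations(valid, 2))
    total = sum(cosine_distance(vectors[a], vectors[b]) for a, b in pairs)
    return 100.0 * total / len(pairs)

# Made-up, mutually orthogonal vectors; "monopsony" is deliberately absent.
toy_vectors = {
    "cat": (1.0, 0.0, 0.0),
    "book": (0.0, 1.0, 0.0),
    "cloud": (0.0, 0.0, 1.0),
}
toy_score = dat_style_score(["cat", "monopsony", "book", "cloud"], toy_vectors, keep=3)
print(toy_score)
```

With seven words there are 21 pairs to average. Under this assumption a pair value above 100, such as the 111 reported for generation/mincemeat, would just mean the two vectors point in somewhat opposite directions, since cosine distance runs from 0 to 2.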

  19. For more context on why they call this semantic similarity, look up “distributional semantics”. This developed as a movement in linguistics during the 1950s, but has resurfaced as a computationally realizable take on semantics.

    And yeah, going along with realizable it’s a bit impoverished. The use of distribution for semantics comes out of the larger distributionalist framework, which had behaviorist attitudes and aimed to do without such unobservables as “intention”.

  20. My average was tanked by “aspersion”, which scored only 79 with “silt” and 80 with “hangnail”. Not for reasons I can tell a good story about, really, why aspersion-silt 79 is Goofus and vicar-braid 100 is Gallant.

    I suspect that the extremes this game is pushing for are largely fishing up noise in the metric construction. If the authors randomly split their corpus in half and built two metrics, I believe they’ll agree on what’s low-scoring. But will they agree on what’s high-scoring? I bet not.

  21. J.W. Brewer says

    Inspired by the R.E.M. example I tried to think of a song with weird and seemingly disconnected lyrics, came up with the Velvet Underground’s “The Murder Mystery” and picked the first noun-functioning-as-noun (this required some judgment calls) in each of the first five pairs of verses (the song has two different sets of lyrics going on simultaneously, one in each speaker). That only got me 88.15 (higher than 93.58%), with moonlight/offenses being the most divergent pair at 102. The seven words they scored FWIW were wrappers, verbs, flag, moonlight, offenses, edge, & rag, with muse, peanuts, and poets going unused.

    I then did a second run keeping the unused muse/peanuts/poets and supplementing with objections/frenzy/contempt/melodies from the next pair of verses. Starting to look a little conceptually linked however surreal, so that knocked it down to 83.3 although the peanuts/poets pair scored 100.

  22. I got 82.6 using a set of common words. I started with ‘cat, book,’ as the blurb suggests. They have a ‘connectivity’ of 80, which makes some sense if people write about reading a book with a cat in the vicinity. But the lowest pair I scored was room/ocean at 64, which I find mystifying, especially as the scores for cat/room and book/room were 72 and 73 respectively.

    Putting in ten color names produced a score of 23.72.

  23. CuConnacht says

    95.51; 99.56%. Closest match was isinglass and eisteddfod at 70. They rejected barracoon.

  24. J.W. Brewer says

    I thought maybe you’d get more semantically disconnected entries from song lyrics by picking the first noun (not worrying for this take about whether it was being used as a noun versus adjectivally) from each song on a ten-song album. Unfortunately if you use the Stones’ Sticky Fingers for the experiment, the 102 you get for fortnight/satin will be dragged down by the 36 for child/childhood, not to mention the 44 for day/fortnight, leaving you with a mediocre overall 77.43.

    Trying a different album with a reputation for more surreal lyrics, however (the Soft Boys’ Underwater Moonlight) got me up to 86.73, with individual pairs ranging from a high of 98 (dentures/street) to a low of 71 (street/trails).

    Note that they say that “most people score between 74 and 82,” so if an automated process can give you 86.73 that shows something. On the other hand, if you take however large a corpus of “non-technical” nouns they have (50,000? more? less?) and have a computer just pull random sets of seven out of it, I wonder if that would on average outperform humans consciously trying to devise incongruous sets.
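
The random-draw experiment wondered about here could be sketched roughly as follows. Everything in it is a stand-in: the vocabulary is 200 placeholder tokens with random vectors, and the scoring assumes a mean pairwise cosine distance scaled by 100, so the printed number says nothing about the real task. The function names (`mean_pairwise_score`, `random_baseline`) are hypothetical.

```python
import math
import random
from itertools import combinations

def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def mean_pairwise_score(words, vectors):
    """Assumed scoring rule: mean cosine distance over all pairs, times 100."""
    pairs = list(combinations(words, 2))
    total = sum(cosine_distance(vectors[a], vectors[b]) for a, b in pairs)
    return 100.0 * total / len(pairs)

def random_baseline(vectors, set_size=7, trials=200, seed=0):
    """Average score of `trials` word sets drawn at random without replacement."""
    rng = random.Random(seed)
    vocab = sorted(vectors)
    scores = [
        mean_pairwise_score(rng.sample(vocab, set_size), vectors)
        for _ in range(trials)
    ]
    return sum(scores) / len(scores)

# Placeholder vocabulary: random 50-dimensional vectors, NOT real embeddings.
vector_rng = random.Random(42)
fake_vectors = {
    f"word{i}": [vector_rng.gauss(0, 1) for _ in range(50)] for i in range(200)
}
baseline = random_baseline(fake_vectors)
print(round(baseline, 1))
```

To test the actual conjecture one would swap `fake_vectors` for real word embeddings restricted to whatever noun list the task accepts, then compare the baseline with the "74 to 82" range quoted for human participants.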

  25. @David L: There are a lot of online listings and discussions of hotel rooms with ocean views.

  26. Underwater Moonlight! I haven’t thought of that album in decades, but just seeing the title brought back the sound of it (I may still have the album moldering somewhere in the basement), and thanks to the miracle of YouTube I can re-experience the magic.

  27. J.W. Brewer says

    The original Underwater Moonlight (CD reissues tended to pile on the bonus tracks) actually has only nine songs with lyrics plus an instrumental, but I figured the first seven with lyrics were all that mattered unless something too “technical” got thrown up, which didn’t happen. But if you don’t go back and relisten to it on an annual-or-more-frequent basis, I can’t help you, dude.

    Abstracting somewhat from the concept of an “album,” I got a pretty respectable 88.95 using the first noun (second in the one instance where it was necessary to avoid duplication of a prior entry) from the first seven sonnets of Shakespeare as conventionally numbered/ordered. (That’s creatures, winters, glass, loveliness, hours, hand, and orient, none of which seem that radically divergent, in the sense that it takes little skill to be given a random two of the seven and write a plausible line of iambic pentameter incorporating both without it seeming forced.)

  28. @Brett: That must be it. Clearly I lack sufficient creativity (which the test claims to measure, very indirectly).

  29. PlasticPaddy says

    Bolzano spoke quickly and buoyantly. “Eleven, forty-one, elephant, voluminous.”

    Would these do well?
    ref. the sixth palace, R. Silverberg

  30. aimed to do without such unobservables as “intention”.

    and it kinda seems like “meaning”, too, along the way – which is a little bit fatal for talking about semantics.
    humpty dumpty was right that if you pay enough, you can claim a word means anything you want (and probably make it stick in court: “is”, anyone?) – but abe lincoln’s bar buddy / boyfriend / ghostwriter was also right that calling a tail a leg don’t make it a leg to anyone who’s not on the payroll.

    i think the algorithm makes for a fascinating gauge of word relationships; i wonder what trends a broader survey of literary rock & roll would find! i just wish they wouldn’t call what they’re measuring “semantic”.

  31. Yep, I’d say the same, but I do have to admit it’s unsettling how plausible a facsimile of meaning these methods can sometimes produce. Purely on a text corpus with no engagement with the world.

  32. David Eddyshaw says

    It’s semantic in pretty much exactly the same sense that Google Translate actually understands languages, I guess. There’s a whole industry dedicated to claiming that this sort of thing is artificial “intelligence.”

  33. I scored 92.26, higher than 98.36% of the generality. I failed to recognize the common thread connecting prank, zombie, and underpants, or I’d have done better. Cadmium and prank have less to do with each other than anything else in my universe. There’s nothing silly about cadmium!

  34. My own score was OK but nothing special. So I tried words from the sort-of-scat finale of Fred Lane’s Rubber Room, one of the greatest, maddest songs ever sung, and I dare them to not call this original. I picked the first ten nouns which were not clearly compounds.

    Fred gets a very respectable 92.47 / 98.48% for bimbo/luncheonette/byline/passport/libido/carpet/estrogen.

    Estrogen/libido got a low score, followed by bimbo/libido. I thought they would try to match substitutable words within a syntactic context, but they are not doing that: they are matching words which co-occur near each other in a text.

  35. Solving analogies by completing the parallelogram in word2vec space I find a bit eerie, for example. The antidote is to repeat to oneself that the co-occurrence matrix of the corpus is produced by humans who do engage with the world and write for meaning; the computational techniques are just parroting that back.

    (I could be wrong, but my read of the GloVe description is not that words X and Y are ‘similar’ if they co-occur in texts themselves, but if “words that co-occur with X” is similar to “words that co-occur with Y”. Which makes a bit more sense — they’re substitutable without *surface* surprise.)
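
The distinction in that parenthetical, between words that co-occur with each other and words whose co-occurrence profiles resemble each other, can be illustrated with a toy example. This is not GloVe, which (per the description quoted earlier in the thread) fits vectors with a weighted least-squares objective; it is a raw-count sketch over three invented sentences, with hypothetical function names.

```python
import math
from collections import Counter, defaultdict

def cooccurrence_counts(sentences):
    """Count, for each word, how often every other word shares a sentence with it."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            for j, other in enumerate(tokens):
                if i != j and word != other:
                    counts[word][other] += 1
    return counts

def profile_similarity(x, y, counts):
    """Cosine similarity between the co-occurrence profiles of x and y."""
    contexts = set(counts[x]) | set(counts[y])
    dot = sum(counts[x][c] * counts[y][c] for c in contexts)
    norm_x = math.sqrt(sum(n * n for n in counts[x].values()))
    norm_y = math.sqrt(sum(n * n for n in counts[y].values()))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0

# Invented mini-corpus: "tea" and "coffee" never appear in the same sentence.
sentences = [
    "She drank hot tea",
    "He drank hot coffee",
    "The cat sat on the mat",
]
counts = cooccurrence_counts(sentences)
direct = counts["tea"]["coffee"]                        # first-order co-occurrence
shared = profile_similarity("tea", "coffee", counts)    # second-order similarity
unrelated = profile_similarity("tea", "mat", counts)
print(direct, round(shared, 3), unrelated)
```

Here "tea" and "coffee" never share a sentence, yet their profiles overlap on "drank" and "hot", while "tea" and "mat" share no contexts at all. That is the sense in which two words can be close without ever appearing together.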

  36. 94.51 (higher than 99.33%) with pushpin, fog, axle, salami, swiftly, tsarina, and dinar. Tried not to throw it for any loops vocab-wise.
    Swiftly and fog were the closest at 76.

  37. As others here have pointed out, the instructions asked for “words that are as different from each other as possible, in all meanings and uses of the words,” but the metric is measuring something completely different. For example, look at the “distances” generated for the following pairs:

    astronomy/astronomer: 35
    philosophy/philosopher: 41
    philosopher/astronomer: 45
    astronomy/philosophy: 61

  38. Your score is 91.4, higher than 97.76% of the people who have completed this task.

    custard aardvark ethics caryatid wicket apogee anapest

    Closest were “caryatid” and “anapest”, with a score of 73. I’m not sure what they have in common besides being sort of Greek. Maybe I should have tried “bratwurst”, but like “custard”, it’s a kind of food.

  39. I feel rather proud of myself:

    Your score is 93.95, higher than 99.15% of the people who have completed this task

    Not as good as some here, of course.

    But they rejected colophon, ziggurat, and trail.

    I revised my answer and got:

    Your score is 92.24, higher than 98.35% of the people who have completed this task

    Going backwards.

    Changing a few I got:

    Your score is 91.87, higher than 98.11% of the people who have completed this task

    Oh well.

  40. January First-of-May says

    Would these do well?

    One sequence of random words I thought of that did do well was Randall Munroe’s, from his Hugo award acceptance: “Gazebo, ointment, harpsichord, credenza, bungalow”. (I supplemented with “battery” and “staple”, deciding that “horse” was too common and “correct” too adjective.)

    Obviously the gazebo/bungalow distance wasn’t too high (and non-obviously so was the gazebo/credenza distance), but the overall score was a respectable 91.13 (97.54%).

    I suspect that the other classic random noun sequence, “Nitwit, blubber, oddment, tweak”, would do at least as well, but I couldn’t think of anything good to supplement it with.

  41. cuchuflete says

    I tried to use ordinary words, rather than obscure nouns. The test rejected “daylily” for reasons unknown.

    “Your score is 88.91, higher than 94.88% of the people who have completed this task.”

    curmudgeon vaporizer tine flax cesspit wafer cylinder

    Back to the field I go, to hybridize more invalid daylilies. I suppose “ploidy” would also fail to cut the mustard.

  42. I just saw Trevor Joyce’s FB post in which he announces with justifiable pride that he got a score of 100.52, “higher than 99.96%.”

  43. My score was 79.1, higher than 56.56% of people. Considering English is my second language, I’m very happy with that score! Highest score hazard and bowl (95), lowest score dream and emotion (61).

  44. Owlmirror says

    I wanted to experiment with phonetically similar words (does proximity mean “in a dictionary”?):

    cup cut clog clod cod cloud club cord cob cop

    Your score is 86.53, higher than 89.92% of the people who have completed this task

    lowest: cup/cut and cup/club (68)

    highest: clod/club (105)


    slop slope sleep sloop sleight slough slap salp spelt spoils

    Your score is 85.92, higher than 88.2% of the people who have completed this task

    lowest: slop/slope (65)
    highest: sleep/sloop (107)


    OK, a genuine attempt to pick unique words from distinct fields:

    collapsar heather pitchblende yardarm semicolon cofferdam windsock hypocaust hauteur glowworm

    Your score is 85.5, higher than 86.92% of the people who have completed this task
    (Seriously?! Lower than the slop-slope and cup-cut ones?)

    Lowest was yardarm/windsock (57)
    Highest was heather/hauteur (102)

    I call shenanigans.

  45. 90.95, higher than 97.37% with: sonata, thatch, nomad, gambit, trapezium, lemur, hemline. I tried for nouns from different disciplines, but not too technical (they rejected “disulphide”) and as far as I knew no metaphorical or trans-disciplinary uses which might bring any two together.

    All individually hand-picked. But I can see that if you could work out a way of automatically sending words to the web site and capturing the results, you might be able to do well by simulated annealing, e.g. by replacing the worst 2 and keeping the best 5.

    The worst pairs were sonata/gambit=80 and lemur/nomad=83. Perhaps there’s a metaphorical sense I overlooked or someone wrote a sonata for viola da gambit…
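
A rough sketch of the "replace the worst 2, keep the best 5" idea, scored locally rather than against the website, since nothing in the thread says how submissions could be automated. It is a greedy hill-climb, not true simulated annealing, which would also occasionally accept a worse set. The vectors are random placeholders, the scoring rule (mean pairwise cosine distance times 100) is an assumption, and all function names are hypothetical.

```python
import math
import random
from itertools import combinations

def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def score(words, vectors):
    """Assumed scoring rule: mean cosine distance over all pairs, times 100."""
    pairs = list(combinations(words, 2))
    total = sum(cosine_distance(vectors[a], vectors[b]) for a, b in pairs)
    return 100.0 * total / len(pairs)

def mean_distance_to_rest(word, words, vectors):
    """How far `word` sits, on average, from the other words in the set."""
    others = [o for o in words if o != word]
    return sum(cosine_distance(vectors[word], vectors[o]) for o in others) / len(others)

def improve(words, vectors, rounds=200, drop=2, seed=0):
    """Swap out the `drop` least-distant words; keep a swap only if the score rises."""
    rng = random.Random(seed)
    best = list(words)
    best_score = score(best, vectors)
    for _ in range(rounds):
        ranked = sorted(
            best, key=lambda w: mean_distance_to_rest(w, best, vectors), reverse=True
        )
        keepers = ranked[: len(best) - drop]
        pool = [w for w in sorted(vectors) if w not in best]
        trial = keepers + rng.sample(pool, drop)
        trial_score = score(trial, vectors)
        if trial_score > best_score:
            best, best_score = trial, trial_score
    return best, best_score

# Placeholder vocabulary: random 50-dimensional vectors, NOT real embeddings.
vector_rng = random.Random(1)
fake_vectors = {
    f"word{i:03d}": [vector_rng.gauss(0, 1) for _ in range(50)] for i in range(100)
}
start = sorted(fake_vectors)[:7]
final_words, final_score = improve(start, fake_vectors)
print(round(score(start, fake_vectors), 1), "->", round(final_score, 1))
```

Because a swap is kept only when it raises the score, the result can never be worse than the starting set, but it can stall at a local optimum. Accepting some worse swaps early on is what an annealing schedule would add.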

  46. The nomads of lemuria…

  47. @rosie: I don’t know about sonata and gambit, but there are a lot of Avatar: The Last Airbender fan pages that mention winged lemurs and Air Nomads.

  48. The set consisting of “cunt”, “porcupine”, “drill”, “interest”, “hexagon”, “hyperventilation”, “seed” got 90.31 / 96.73%. At first I felt proud of that achievement, but of course, it appears to be quite meagre by LH standards.

    “Hyperventilation” and “seed” are the most distant pair (108), followed by “porcupine” and “interest” (105). “Hexagon” and “drill” are just 70. Well…

    Second attempt: “piglet”, “swine”, “pork”, “hog”, “peccary”, “sow”, “pig”, 59.08 / 0.23%. It seems that their metric sort of makes sense, but still, “sow” and “pork” are 69 apart, just one less than “hexagon” and “drill”.

  49. looking

    103.39, higher than 99.99%

    (Damn! How can I stay away, when the theme is orthogonality of semantic fields?)

  50. Wow! Well done, and what a delight to see you back in these parts!
