Every Non-word.

I generally have little interest in lists of invented words, which at best induce a slight smile and are never heard of again, but every non-word (@nondenotative) is different; it consists of “combinations of English syllables that don’t appear in the dictionary,” and the very fact that it’s not trying to be clever means you can imagine definitions for (non-)words that don’t shout their would-be meaning at you. Kudos to Daniel Temkin, whose projects (shown at that link) involve other interesting ideas like Esoteric.Codes (“a blog investigating programming languages as experiments, jokes, and experiential art”) and Borges: The Complete Works (“All of Borges’s words, slightly out of order”).

Comments

  1. As far as non-words go, “downlove” is pretty popular – About 560,000 results in Google search.

    Sandover is an old English surname, reported to be a variant of Sandford (Anglo-Norman aristocratic clan first mentioned in Domesday Book of 1086)

    Cowtech also appears to be in use, its meaning quite straightforward even without mention in dictionary and there are several companies with such name.

    Carpros is plural of “car pro”, expression which gives 300,000,000 Google search results.

  2. and Guncon definitely shouldn’t be in the list, because it has a Wikipedia entry.

  3. Well, I’m not especially bothered if it occasionally comes up with non-words that turn out to be words; there’s nothing riding on its being completely accurate.

  4. circon is mentioned in An Etymological Dictionary of Modern English as older spelling of “circum”

    noncycle is quite common as a term, particularly in English drama.

    lonewater is a geographic name (street in Fredericton, New Brunswick, Canada)

    footsub is… no, I am not going to tell you…

  5. buttrust has Urban Dictionary entry with quite specific (and very unhygienic) definition.

  6. Well, of course “cowtech” is a word. The industrial revolution wasn’t going to leave cow tools unmolested forever.

  7. -if it occasionally comes up with non-words that turn out to be words

    I wonder how exactly they come up with non-words.

    Is it automated process, do they get it from autocorrect data or something?

    as far as I can see, while it comprises non-words which could plausibly exist (and in fact are in use), majority are clearly nonsense and often impossible to pronounce like “adverlacpsyunion”

  8. The program probably searches the internet, finds letter strings between 2 stop symbols which happen to be not in some dictionary (which one?) and then posts them. Or at least that’s what I imagine.

  9. The BlackBlack have a Wikipedia entry too.
    https://en.wikipedia.org/wiki/Blackblack

    Autorec is a Japanese business selling used cars (300,000+ Google hits).

    Sarder is an archaic synonym of fucker (cf. OED on sard).

    Fanplay is a technical term in petroleum geology.

  10. @D.O.: no, it probably subjects a dictionary to either a Markov chain, like the old Dissociated Press, or some coarse approximation to that idea like just picking two words that have a substring of a few letters in common and splicing them. (Then checks they’re not in said dictionary and posts them.)
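The Markov-chain guess in comment 10 can be sketched roughly as follows. This is only an illustration of the general idea, not the bot's actual method, which the thread does not establish. The word list `WORDS`, the two-letter context (`order=2`), and every function name here are assumptions made up for the example.

```python
import random
from collections import defaultdict

# Hypothetical stand-in for a real dictionary word list.
WORDS = ["sandstone", "overcast", "downtown", "lovebird", "footstep",
         "submarine", "waterfall", "lonesome", "carport", "prosper"]

def build_model(words, order=2):
    """Map each `order`-letter context to the letters seen after it."""
    model = defaultdict(list)
    for word in words:
        padded = "^" * order + word + "$"  # ^ marks the start, $ the end
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def generate(model, rng, order=2, max_len=12):
    """Walk the chain from the start context until $ or max_len letters."""
    context, out = "^" * order, []
    while len(out) < max_len:
        nxt = rng.choice(model[context])
        if nxt == "$":
            break
        out.append(nxt)
        context = context[1:] + nxt
    return "".join(out)

def non_words(words, count=5, seed=0, order=2):
    """Generate candidates and keep only those absent from the word list."""
    rng = random.Random(seed)
    model = build_model(words, order)
    known = set(words)
    found, attempts = [], 0
    while len(found) < count and attempts < 10_000:
        attempts += 1
        candidate = generate(model, rng, order)
        if len(candidate) >= 4 and candidate not in known and candidate not in found:
            found.append(candidate)
    return found

if __name__ == "__main__":
    print(non_words(WORDS))
```

With a toy list this small the filter only proves a string is missing from the list itself, which is the same weakness the earlier comments point out: "non-words" such as guncon or sandover slip through whenever the reference dictionary is incomplete.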

  11. majority are clearly nonsense and often impossible to pronounce like “adverlacpsyunion”

    Pronounceable or not, they resemble those famous Soviet-era syllabic abbreviations (“syllabbs”?) like Мосгорисполком (Mosgorispolkom) = Исполнительный комитет Московского городского Совета депутатов ‘Executive Committee of the Moscow City Council of People’s Deputies’ (Moscitexcom?) or Росглавстанкоинструментснабсбыт (Rosglavstankoinstrumentsnabsbyt) = Главное управление по снабжению и сбыту станков кузнечно-прессового оборудования, инструмента и абразивных изделий при Госплане РСФСР (the translation would be an interesting exercise for students).

  12. …депутатов трудящихся. I must have run out of breath.

  13. Sarder is an archaic synonym of fucker (cf. OED on sard).

    Great heavens, it was worth making the post just to learn about this! Citations:

    c950 Lindisf. Gosp., Matt. v. 27 Ne serð þu oðres mones wif.
    c1425 Cast. Persev. 1163 in Macro Plays 112 Þanne mayst þou bultyn in þi boure, & serdyn gay gerlys.
    1530 J. Palsgrave Lesclarcissement 697/2, I sarde a queene, je fous.
    1535 D. Lindsay Satyre (1871) 3028 Freirs, Quhilk will, for purging of their neirs: Sard up the ta raw, and doun the uther.
    1659 J. Howell Eng. Prov. 17 Go teach your Grandam to sard; a Nottingham Proverb.

    Etymology:

    In Old English only once (Northumbrian) in imperative serð, apparently < Old Norse serða (strong verb) = Middle Low German serden, Middle High German, early modern German serten. Old English may have had the normal *seordan.

    The entry is from 1909; I look forward to the update!

  14. “Fanplay is a technical term in petroleum geology.”

    It’s what my kids call it when you talk through a fan to make your voice sound choppy, or you stick a card in it to pretend it’s a motorcycle or helicopter.

  15. Piotr: I usually see them called stump compounds in English, at least in the Russian context. Native versions are not uncommon in the U.S. Navy, where the Office of Naval Operations is called OpNav and an instruction it issues is an OpNavInst. Other native stump compounds include some of the neighborhoods of New York, where SoHo is a region south of Houston St, NoHo is north of SoHo, TriBeCa is a triangular area below (i.e. south of) Canal (St.), and NoLita is north of Little Italy (though Little Italy is not “Lita”). These last are pronounced more or less naturally: /trɑiˈbɛkə/, for example.

    Hat: Florio’s 1611 Italian/English dictionary defines fottere as ‘to iape, to sard, to fucke, to swive, to occupy’. Harry Turtledove’s semi-fictional Queen Elizabeth refers to the occupation of England by the Spanish Armada in terms that make it clear a play on words is intended.

  16. My guess is different: it is a morphological generator that manufactures somewhat plausible words and then filters them against dictionaries so that known words are filtered out. For example, ad-ver-lac-psy-union is a sequence of morphologically reasonable prefixes (well, ver- and lac- are a bit marginal) followed by a normal base.

    The original Unix spelling checker, which did not have space for a full word-list, would cheerfully accept monstrosities like *overpseudounderstandmetry, removing prefixes and suffixes mechanically to produce stand, which was verified as correct. Sometimes the reduction overdid it, and *thier was analyzed as thy-er, where thy is the 2sg possessive adjective; such a form had to be filtered out by a stop list.
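The affix-stripping behaviour described in comment 16 might look something like the sketch below. It is not the real Unix `spell` code. The prefix and suffix tables, the stem list, the i-to-y respelling rule, and the function names are all invented for illustration, chosen only so that the two examples from the comment behave as described.

```python
# Hypothetical, tiny affix tables and stem list, for illustration only.
PREFIXES = ["over", "pseudo", "under", "un", "re"]
SUFFIXES = ["metry", "ness", "ing", "er", "s"]
STEMS = {"stand", "thy", "walk", "kind"}
STOP_LIST = {"thier"}  # misspellings that stripping alone would wrongly accept

def reduces_to_stem(word, stems=STEMS):
    """True if peeling affixes off `word` can reach a known stem."""
    if word in stems:
        return True
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) > len(prefix):
            if reduces_to_stem(word[len(prefix):], stems):
                return True
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            base = word[:-len(suffix)]
            candidates = [base]
            if base.endswith("i"):
                # Assumed respelling rule: undo y -> i before a suffix.
                candidates.append(base[:-1] + "y")
            if any(reduces_to_stem(c, stems) for c in candidates):
                return True
    return False

def accepts(word, stems=STEMS, stop_list=STOP_LIST):
    """Spell-check `word`: the stop list is consulted before any stripping."""
    if word in stop_list:
        return False
    return reduces_to_stem(word, stems)

if __name__ == "__main__":
    for w in ["overpseudounderstandmetry", "thier", "xyzzy"]:
        print(w, accepts(w))
```

Calling `reduces_to_stem("thier")` directly shows why the stop list is needed: the stripping logic alone treats it as thy plus -er and accepts it.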

  17. Hat: Florio’s 1611 Italian/English dictionary defines fottere as ‘to iape, to sard, to fucke, to swive, to occupy’. Harry Turtledove’s semi-fictional Queen Elizabeth refers to the occupation of England by the Spanish Armada in terms that make it clear a play on words is intended.

    Oh, I’m very familiar with the others, and have more than once had occasion to explain the former connotations of “occupation” to people. “Conversation” is another good one (Richard III iii. v. 30 “His conuersation with Shores wife”).

  18. Trond Engen says

    Norrøn ordbok:

    serða (sarð; sorðinn) ha lekamleg omgang med ein (helst om sodomitteri); s. e-n

    I.e., mostly used about those sordid sordomites.

    I’d never heard the word. It’s so dead it’s not even in Norsk Ordbok 2014, with its richness of obscure dialect forms.

  19. John Cowen: My favorite Naval compound is DICNAVAB– obviously the Dictionary of Naval Abbreviations.

  20. J. W. Brewer says

    Whatever is generating the combinations doesn’t have programmed into it much of a good sense of English morphology and how it interacts with orthographic conventions. For example, some of the proposed words ending -mis would be perfectly cromulent (even if non-existent) if they ended in -miss (with no change in pronunciation), but the -mis spelling really only works in English for words of obvious classical (or pseudo-classical!) derivation. I suspect the word-generator is working on the premise that “syllables” are fundamentally strings of characters, rather than said-aloud entities that comply with English phonotactic conventions, which then in turn have complicated and context-dependent rules for how they get orthographically represented.

  21. Old Norse serða (strong verb)
    I’ll say it is.

  22. Trond Engen says

    O.N. serða

    Strange word. Well, not that strange, it fits neatly into an IE paradigm, so it should be old, but I can’t find an etymology. I wonder if it might be the sert/sort of e.g. ‘insert’ and ‘consort’ with (i.e without) a lost prefix.

  23. ‘insert’ and ‘consort’

    which are simply serō “to plant”.

  24. Elmar Seebold, in his excellent Vergleichendes und etymologisches Wörterbuch der germanischen starken Verben (bought by me on a field trip to Cambridge, Mass., over forty years ago) says it has “keine brauchbare Vergleichsmöglichkeit.”

  25. Trond Engen says

    It’s seemingly a class 3 strong verb parallel to verðan “become” < PGmc *wérþan- < PIE *wer-t- “turn, rotate”, a t-extension of *wer- “twist, turn”.

    I see that PIE *seH1- of Lat. serō had the original meaning “plant, implant”. Good. But I don’t know if this root could be extended twice and shortened once to become *sert- in PIE. The shortening of the vowel couldn’t have happened in Gmc. because of *ē > *ā.

  26. Trond Engen says

    Wrong again… ē > ā was NWGmc, so it could have been shortened in Common Germanic, but I still don’t think it works as a class 3 verb in PIE without shortening.

  27. A few years ago, Craig Melchert (UCLA) found a plausible cognate of sard in Hittite, the verb sartai/sartanzi ‘rub’.

  28. David Marjanović says

    …депутатов трудящихся. I must have run out of breath.

    😀 😀 😀

    It’s recursive, too: Госплан is itself a syllabic abbreviation.

    PIE *seH1- of Lat. serō

    Never mind Germanic, the e is short in Latin, too… this smells of Dybo’s law (shortening of long vowels everywhere before the stressed syllable).

    a plausible cognate of sard in Hittite, the verb sartai/sartanzi ‘rub’.

    That is frigging plausible.

  29. Heh. Indeed.

  30. In modern Icelandic serða means to have penetrative sex. So it’s only used of men, one never hears of women “serðing” a man.

    The Wiktionary article says that the word is “somewhat archaic” which is true. It’s not in common use. Serða is only used for comic & vulgar effect, if not to ridicule the one who is doing the serðing. It implies the sexual encounter is “quick and dirty”, I’d say.

    https://en.wiktionary.org/wiki/serða

  31. You can test your ability to distinguish English words from non-words at a Ghent University website. The results of a similar word test for (native speakers of) Dutch can be found here.

  32. Syllabic abbreviations used to be filed under “acronym”. They seem to have been thrown out to make room for alphabetisms. Of course, all such taxonomies are fuzzy.

  33. Trond Engen says

    The autocorrect of the Icelandic iPad keyboard didn’t know the word.

    That’s because the autocorrect of the Icelandic iPad keyboard is total crap, Trond Engen! No one uses it.

    Here’s a blog kept by a translator who does subtitles for tv. He’s talking about which Icelandic “fuck-terms” or expressions are accepted/tolerated by mainstream audiences. I don’t know if you can read the post but the bottom line is that it’s pretty much unthinkable to use “serða” in subtitles. It’s just considered too offensive.

    https://malbeinid.wordpress.com/2011/08/05/serda-eda-brynna-folanum/

  35. Trond Engen says

    I don’t _use_ it either (and I wish I knew how to turn it off) (even more so because I just use it for the ethos and thorns) (but I haven’t cared enough to look up a solution). I just noted that the word was unknown to it.

    I’ll read the blogpost. My Icelandic is rudimentary, but I enjoy trying. And this might put the rude back in my rudimentary.

  36. La Horde Listener says

    Then, sardines..? {8-{

  37. “the ethos and thorns”

    English autocorrect has its own issues.

  38. Trond Engen says

    Heh. The English autocorrect is the most annoying of them all, just for turning italicizors into first person singular pronouns. The Norwegian one is quite picky about variant spellings and gender assignment but nice enough to recognize both Bokmål and Nynorsk forms.

  39. Never mind Germanic, the e is short in Latin, too… this smells of Dybo’s law (shortening of long vowels everywhere before the stressed syllable).

    It was probably the first syllable that was stressed in pre-Italic, and it must have been short in either case. The most parsimonious analysis of serō is as a reduplicated thematic present, and the normal shape of this type of stem is *Cí-CC-e/o- (with zero grade of the root). This gives us *sí-sh₁-e/o-. Another possibility is an original reduplicated athematic present with an ablauting stem: *si-séh₁-ti (sg.)/*sé-sh₁-n̥ti (pl.). Some would prefer an o-grade root and an e-reduplication in the sg. allomorph (thus in LIV, the Lexicon of IE Verbs). I prefer Rasmussen’s analysis of this type, as shown here. There has been some debate concerning the Latin lowering of short high vowels before *z in an open syllable: can it happen only word-medially or also in the initial syllable? The latter seems to be possible (if subject to irregular variation), so *sí-sh₁- > *siz- > ser- is the optimal solution.

  40. David Marjanović says

    Ah, that explains the length variation quite nicely.

  41. Trond Engen says

    Quae serō serō, as Diurna Doria used to say.

    My century-old Latin dictionary has sēvi, sat so neither quantity nor quality is constant through the paradigm.

    But since Latin -r- is phonological, the whole reason to think of a relation to serða is gone. Unless the meaning of the extension is intensive/iterative or something.

  42. My century-old Latin dictionary has sēvi, sat so neither quantity nor quality is constant through the paradigm.
    Those are just the outcomes of different ablaut grades used in other parts of the paradigm – full (e) grade PIE *seH1- -> Proto-Italic *se:- in the perfect, zero grade *sH1-to- , as usual, in the past passive participle.

  43. The site is still there, but seems to have stopped updating as of July 29, 2019.
