Last night my wife asked me (in the course of our O’Brian reading) where the word admiral comes from, and I gave her an off-the-cuff answer that was correct in essence (Arabic amir) but wrong on the details, as I discovered when I looked it up in the OED today. What astonished me was the length of the etymology: 1,341 words, with separate mini-etymologies for five different historical forms of the word and excursuses on “A further development in Latin,” “Further comments regarding Arabic models” (“It has been suggested that the presence of the final -al was caused or reinforced by Arabic al, the definite article which is also used in genitive constructions, but this is not borne out by the textual evidence in either Arabic or the Western languages”), “History of the title,” “Development of phrases,” “Development of secondary senses,” and “Development of forms”! I briefly wondered whether this was the longest etymology in the OED, but then I realized that was foolishness, and upon checking the Guide to the Third Edition of the OED discovered that (unsurprisingly) “The longest etymology section in the dictionary is the revised one at the verb to be.” So of course I went to that entry and discovered the etymology is a mind-boggling 9,672 words long, so long that it has its own table of contents, running from “1. General overview” to “3.7. Omission of auxiliary have in periphrastic tenses.” And there are 1,765 words (considerably more than the entire admiral etymology) before the table of contents! Here are the first few sentences:

The paradigm of the verb ‘to be’ in West Germanic languages in general shows forms derived from three unrelated Indo-European bases, in English itself perhaps forms derived from four Indo-European bases. These occur in sometimes overlapping, but generally distinct functions within the paradigm (see below), although there have been significant changes in these functions over time and in different varieties of English. The following notation is used in this entry to distinguish the different forms: (i) am/is-group: α (am), β (is), γ (Old English sind), δ (Old English sīe), ε (art), ζ (are); (ii) be-group: η (be); (iii) was-group: θ (was), ι (were). The present tense and non-finite forms are chiefly derived from two distinct bases.[...]

God bless and keep the OED!

Also, I submit for your edification the five-minute YouTube video What Color Is A Mirror? by Michael Stevens of Vsauce. You may think it has nothing to do with linguistics, but you’ll discover he discusses the Indo-European root that (putatively) lies behind both black and French blanc and Spanish blanco ‘white’ (and thus English blank). Unfortunately, the OED doesn’t agree with this (“on formal grounds the word could be from a base related to the Germanic bases of blank [...], but since this would give an expected meaning ‘shining, white’ there is an obvious semantic difficulty; many have sought to resolve this by hypothesizing that the word meaning ‘black’ originated as a past participle (with the meaning ‘burnt, blackened’) of a verb meaning ‘to burn (brightly)’ derived from this base…”), but they don’t dismiss it out of hand, and having inoculated you against taking his etymological remarks too seriously, I urge you to enjoy his explanation of mirrors and color.


  1. I saw the mirror video a while back and thought it sounded a bit fishy. But the overly hasty linguistic assertion aside, it was a great video.

  2. Wow, the Language Hat commenters are really going downhill! Off topic, poor grammar…

  3. Jewelry blog says:

    [Just so Lane's comment won't appear completely out of context, I'm leaving this comment while deleting the spam URL; before I did my morning spam cleanup, there were others like it, many with poor grammar. —LH]

  4. I thought I’d have a look at OED2 to see if ‘to be’ ranked as high there as in OED3. It doesn’t – not even close. I’ve posted the counts for the eleven longest etymologies here:
    ‘cockatrice’ is a bit surprising, no?

  5. John Emerson says:

    I have verified that in Spanish “eschatology” and “sc*tology” are the same word. And they have no word for snow.

  6. DAW: Excellent, thanks! Here’s the direct link. (I too pasted the etymologies into Word for a count.)

  7. Back in the days when I was the etymologist for the American Heritage Dictionary (the publisher, Houghton Mifflin Harcourt, laid off almost all the staff in 2011 before eventually entering bankruptcy in 2012), I think the longest etymology I ever wrote was for the flower called abutilon. Even then, I think the etymology was considerably shortened in the editorial process in order to conserve space in the print edition of the dictionary. If I remember correctly, I had also added something about how Arabic ر (r) could have been misread as و (w) in the transmission of the botanical term within Arabic, and also something about how Greek arktion, “Inula candida”, could be derived from arktos, “bear”, (with reference to the plant’s hairy leaves and stems). By the way, long after its disappearance from Bartleby, the AHD now has an online presence again (something I discovered almost by accident):
    Perhaps Houghton Mifflin is reconsidering the value of the AHD. The new online version is still glitchy and full of bugs, however. Unlike on Bartleby, there are no hyperlinks in the etymologies to the appendices of Indo-European and Semitic roots–the appendices are not online yet either, but perhaps they will come eventually.

  8. Here’s the direct “abutilon” link; that’s great news about the AHD coming back online. (Also, it’s great to see you around these parts again!)

  9. John Huehnergard has posted his AHD5 introduction and the complete Semitic root appendix here.

  10. Garrigus Carraig says:

    OED-less, I have always gone to AHD first for English etymologies, resorting to the use of the clunky Internet Wayback Machine since its disappearance from Bartleby. So I’m pretty thrilled to know that it’s coming back. Thanks, Patrick.

  11. Oddly enough, Bartleby’s search engine for the AHD turns out to be still operational, even though all the content is gone from their site. So if you go to the Internet Archive’s AHD4 home page and use the search field, the IA software sees that it doesn’t have the search result, grabs it from Bartleby, and redirects all the returned links to appropriate IA pages. Clicking on these links therefore works fine, including any further links to the IE and Semitic roots. Other than the silly header the IA injects into the pages, which can be closed with one click, everything works as smoothly as it did on Bartleby.
    Note that this is a special case: most IA copies of search pages don’t work, since there is no functioning search engine to generate the results.

  12. Alas, the version does not have direct links to the roots.

