Frequent commenter Tatyana sent me a link to a Russian blog where there was a discussion of the Arabic word SiraaT ‘path’ (famously used in the first sura of the Qur’an, the Fatiha: Ihdina al-sirata al-mustaqima ‘Show us the straight path’), mentioning that it was from Latin stratum ‘path.’ Not having any way to determine whether this was true, I wrote to an Arabic scholar about it, asking also where one could go to look such things up. He confirmed the derivation and added “There is no Arabic etymological dictionary.” I found this shocking, and am hard put to explain it. I can understand why the cultural emphasis on the Arabic of the Qur’an as the perfected form of the language might have made native speakers less likely to look beyond it and work on its Semitic connections, but how could the avid European Orientalists of the Victorian era have omitted to produce such a thing? In an age obsessed with philology, when Edward William Lane was producing his monumental Arabic-English Lexicon and men like Theodor Nöldeke and Carl Brockelmann were doing groundbreaking work on Semitic, how could no one have done an etymological dictionary? And how could no one have done one since? Get cracking, people!


  1. Peter T Daniels (a Semiticiste and regular at sci.lang), he say:

    “Etymological dictionary for Arabic” doesn’t make much sense, since the Arabic lexicon is so much vaster than that of any other Semitic language, since lexicography has been going on for over a thousand years, and etymological dictionaries of all the other Semitic languages mine the native Arab lexica for more or less plausible candidates for cognacy.
    That said, the best etymological dictionary of a Semitic language (largely, if not entirely, supplanting Brockelmann’s Lexicon Syriacum
    ed. 2 of 1928) is Wolf Leslau’s three-volume etymological dictionary of Gurage. His dictionary of Ge`ez, which came out a few years later, is
    less detailed.

  2. Michael Farris says:

    “Peter T Daniels (a Semiticiste and regular at sci.lang), he say:”
  3. It’s a joke, son.
    “‘Etymological dictionary for Arabic’ doesn’t make much sense” doesn’t make much sense, if you ask me. English has a huge vocabulary, too; so what? If I want to find out where “dog” comes from, I don’t want to pore through dictionaries of every related language looking for it — and it wouldn’t do me any good, either. One of the points of an etym dict is to tell you what words have no known etymology. In other words: excuses, excuses! Get cracking!

  4. Michael Farris says:

    Verb agreement aside, I agree that the redoubtable Mr. Daniels’s position seems completely indefensible for the same reasons language hat stated.

  5. Good to hear. I kind of thought that sounded funny. No offense meant to the Count von B.

  6. All jokes aside, the question is most interesting. Massignon speaks, for example, of the ‘étymologisme radical de l’arabe’ in an essay on ‘La structure primitive de l’analyse grammaticale en Arabe’.
    The absence of an Arabic etymological dictionary is perhaps a reflection of the survival of the ancient conflict between the sarfiyyûn, khattatûn and lugawiyyûn…

  8. My guess is that an ancient, self-conscious and scriptural language is trickier to produce a good etymological dictionary for than most, because it will tend to zealously erase its tracks more often, to aggressively “correct” old or dialect texts while copying them, and generally to impose uniformity wherever possible.
    All of which, of course, is just one more reason to get cracking.

  9. I know the basics of Classical Arabic and Biblical Hebrew and am often frustrated by not knowing precisely how to connect the two. I think surely there must be some basic historical work on Proto-Semitic that I need to find second-hand and buy. A philologist acquaintance who studies these directly assures me there is no such thing. This conforms with the fact that I’ve never come across it in my browsing. Too much Semitic scholarship is nineteenth-century, and for example uses the Tiberian pronunciation of Hebrew, which is historically unhelpful. We need some W. Sidney Allen equivalent to set it all out clearly.
    I don’t think Arab Arabists do erase their tracks: Quranic Arabic is extremely precisely notated, with a small number of phonetic irregularities. They preserve and explain every variation. The story is that the disparate dialect texts of the Quran were collated on the orders of an early caliph, and this fits exactly the slight irregularities that are in the text. It looks very well preserved, not erased.

  10. What would be the primary sources for such a dictionary? There’s the Qur`an, there’s comparison with classical Hebrew (and the tiny remnants of Hebrew’s contemporary near relatives). What else is there? Are there any relics of classical Arabic besides the Qur`an? I’m worried that 75% of the entries in such a dictionary will look like this:
    rahiimi “Merciful”, -a-ii-i “adjective of quality” + RHM “mercy”. Q, 1:2.
    (This is probably wrong in every particular. I’m not an Arabist, but I hope you understand my concern.)
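
The kind of entry sketched above (a consonantal root combined with a vowel pattern) can be illustrated in a few lines of Python. This is only a toy sketch under stated assumptions: the function name `apply_pattern` and the template notation, in which each `C` marks a slot for one root consonant, are invented here for illustration and are not taken from any dictionary or library. Each character of the root is assumed to stand for exactly one consonant, following the ad hoc transliteration used in this thread.

```python
def apply_pattern(root, pattern):
    """Fill each 'C' slot of a vowel template with the next root consonant.

    Hypothetical notation: 'C' is a consonant slot; every other
    character in the pattern is copied through unchanged.
    """
    if pattern.count("C") != len(root):
        raise ValueError("pattern needs exactly one 'C' slot per root consonant")
    consonants = iter(root)
    return "".join(next(consonants) if ch == "C" else ch for ch in pattern)


# Root r-H-m in the template CaCiiC (the word without its case ending)
print(apply_pattern("rHm", "CaCiiC"))

# The word from the post: consonants S-r-T in the template CiCaaC
print(apply_pattern("SrT", "CiCaaC"))
```

A real dictionary entry would of course need far more than string interleaving (case endings, weak consonants, attested meanings), so this shows only the bare mechanism.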

  11. Hebrew is not that close a relative. To start with, there are the South Arabian languages and the Ethiopian languages, and then of course the rest of the Semitic languages (Wikipedia). So just as in an English etym dict you get first other Germanic forms, then related forms from other IE branches and a reconstructed PIE root, in an Arabic one you’d get Ethiopian and Arabian forms, then other Semitic forms, and finally the more distant Afroasiatic forms and a reconstruction (if such is possible).

  12. Mujahid Umrani says:

    I have something you may be looking for. I also have a great deal of understanding in arabic ethmology.

  13. Ah, this has been broached again. It is pleasing to note that Mujahid has something we may all be looking for, and that he is blessed with prowess in “ethmology”.
    I am not an Arabist or any sort of semiticist myself, but I long for a dictionary or handbook that will exhibit pellucidly the ways roots and patterns intersect to produce a good deal of the lexicon. And I want it to do so without centuries of accreted and arcane jargon. From the outside, it really does seem that a review of terminology and of exegetic practice is in order.
    ACW writes:
    I’m worried that 75% of the entries in such a dictionary will look like this:
    rahiimi “Merciful”, -a-ii-i “adjective of quality” + RHM “mercy”. Q, 1:2.
    But that’s close to the sort of thing I’d want.
    Are you completely happy with “I found this shocking, and am hard put to explain it”? Forgive my pedantry (and my recent slip when I wrote “égy” for “egy”, at which I was mortified), but the classic idiom is “hard put to it [to do X]”. I note that many diverge from this nowadays, possibly influenced by “hard pressed [to do X]”. Myself, I avoid the turn altogether.
    Closely related is this one: “there is no question that…”, which people now say instead of “there is no question but that…”. I counsel people to avoid both forms just because a meaning opposite to the one intended might be understood.
    Similar things could be said about “next Thursday”, which at least in Australia can mean either “the Thursday we next have” or “the Thursday in next week”.

  14. Mujahid Umrani says:

    Yes, I see your concerns. What I have is plain as my writing you see. Of what good would knowledge or wisdom be with no overstanding. I warn you my dear seekers of this book, you must be a good independent seeker of truth. You must have a gymnastic mind to grasp. My teacher has been teaching this since the 70’s. The information is so valuable to us all. The authors filed a copyright, but published only on paper copies. You see currency is not the issue. Sincere minds to continue the education of humanity. Why? Knowledge evolves. The intellect can produce greater righteous expressions, for etymology is phenomenal.

  15. Noetica: I fear you are right; investigation convinces me the historic idiom is hard put to it to… It seems I am, all unwittingly, part of the unthinking tide of linguistic change.
    Mujahid: I have not the faintest idea what you’re talking about, but welcome to the site!

  16. Well confessed, LH. I now rest easier about my own myriad peccadillos, known and unknown.

  17. Mujahid Umrani says:

    Thanks for your greeting. Revising of last comment: I have the individual meaning to each arabic letter and how to apply it. I have the ancient numerical value for each letter. Quranic arabic is different from speaking arabic. (example) You can’t say coca cola or telephone in Quranic arabic.

  19. Richard Durkan says:

    I have long given up trying to find an etymological dictionary of Classical Arabic but I have not yet given up trying to find etymological dictionaries for the Arabic dialects (I am very interested, for example, in Turkish and Albanian influence on Egyptian Arabic and Turkish influence on Iraqi Arabic). I have not had much success to date but if anyone can suggest any suitable works, I would welcome details. I share fully the sentiments of the contributor who said “I long for a dictionary or handbook that will exhibit pellucidly the way roots and patterns intersect to produce a good deal of the lexicon”. Has anyone ever found anything?

  20. I am a Turkish speaker with no knowledge of Arabic. But Turkish has a slew of Arabic loan words; and so far I have not been able to figure out any system behind either the formal or semantic changes that Arabic words undergo as they are taken over by Turkish (or Ottoman if you prefer). But that is really beside the point.
    Currently I am working (and teaching) on epics or rather more generally narratives. One of the Turkish words for “story” is “hikaye” which is I am sure of Arabic origin. A related word, which has gone into disuse to a large extent, is “thakiye” which means “narration”. Now my question is twofold: 1) do any of these words ring a bell among Arabic scholars or speakers? 2) What is/are the standard words in Arabic for “story” and “narration” (and “history” for that matter)?

  21. Yes, hikaya is the Arabic word for ‘story.’ I think you must have mistyped the other, because as far as I know there are no words in Turkish starting with th-. ‘History’ in Arabic is tarikh.

  22. Phil Proper says:

    There is no doubt that an Arabic etymological dictionary is sorely lacking, even if only as a basis for further research.
    In a recent study of numbers, I came across the oft repeated statement that the Arabic SIFR ‘comes from’ Sanskrit ‘sunya’ “void”. This does not make sense to me, since I do not see the linguistic connection. More sense, perhaps, would be a link to Hebrew SFR (which is the root for ‘to count’ as well as ‘book’). Does anyone have any thoughts here?

  23. Huh. I hadn’t heard that one, but you’re right, it seems to be frequently repeated; here’s a representative quote:
    Since the earliest form of the Hindu symbol was commonly used in inscriptions and manuscripts in order to mark a blank, it was called sunya, meaning “void” or “empty.” This word passed over into the Arabic as sifr, meaning “vacant.” This was transliterated in about 1200 into Latin with the sound but not the sense being kept, resulting in zephirum or zephyrum. Various progressive changes of these forms, including zeuero, zepiro, zero, cifra, and cifre, led to the development of our words “zero” and “cipher.”
    Now, obviously the word sifr does not come from sunya, so what I think has happened is that somebody trying to convey the idea that the Arabs got the idea from the Indians expressed themselves poorly and one or more readers understood them to say that the word itself was borrowed, and a nugget of misinformation was born.
    Apparently, however, the root is only Arabic.

  24. Phil Proper says:

    Thank you for that Arabic root url, which is, unfortunately, unclear as to the etymology of SFR. And there seems to be no reason to buy the Indian option, apart from the fact that the numbers system had its origins in an Indian counting system. As you suggest, the origin is Semitic, the question being whether it is common to both Arabic and Hebrew, or whether it was a loan word from Hebrew.

  25. The etymology isn’t unclear, it says the root is found only in Arabic (and by implication has no Hebrew cognate and cannot be traced back further); it certainly is not a loan from Hebrew.

  26. Phil Proper says:

    If it is an Arabic root, then it should be possible to trace it back to a Semitic root, one that is common to Hebrew as well.

  27. Not at all. It’s very possible that it’s from a Semitic root (though of course words came into proto-Arabic after it separated from common Semitic), but whatever cognates it had have presumably vanished. It’s been several thousand years, you know. There are lots of words like that in Indo-European: words that occur only in one or two branches but don’t look like borrowings. One must always keep in mind that we’re left with scraps of evidence from a much richer past.

  28. Phil Proper says:

    My research so far, albeit in its first stages, has shown a definite Semitic root: SFR in Ugaritic, and earlier in Akkadian ‘saparu’, ‘sipru’, the meanings being “write”, “message”, also in Aramaic, with the meaning of “count” or “number”, in Hebrew, in the Torah on a number of occasions.
    What baffles me is the use of Arabic SIFR to mean ‘void’, ‘nothing’. While it seems reasonable that the idea of ‘nothing’ was taken from the Indians, there is a missing link between a possible different, earlier meaning and its later use.
    I am wondering if you know whether words meaning ‘count’ (or connected expressions) based on the root SFR are used in the Koran.

  29. Sorry, I don’t.

  30. Whenever you brothers are ready I have answers.

  31. Phil Proper says:

    Yes, definitely ready.

  32. To Phil Proper: I’ve read a few things on the internet about the subject of the Arabic language. I don’t know all the particulars, but my suggestion is this: try finding lexicons of the Sumerian, Accadian, and Aramaic languages. A scholar by the name of Cyrus Gordon wrote that some Arabic words came from these languages. Arabic might be linked to one or more languages spoken in ancient Egypt. Comparing vocabulary words of Arabic to those of the above mentioned might help. I hope this helps. I hope this doesn’t take your studies in any wrong direction.

  34. I have an AED (Arabic Etymological Dictionary).
    If you would like to see the file, I am ready to send it.
  35. Please, Drago… lest we all fall into the ditch.

  36. The word ‘Siraat’ can be found in one of the oldest languages known today, if not the oldest. It means in that language (traceable light in the horizon), which could in literature be taken as path.
    There is a lot of interaction between that language and the Arabic language. In fact some historical linguists are now reconsidering the classical classification of languages.
    The etymology of the word in that language is as follows:
    Si- come to surface, Appear, Shine
    ra- placed in high position
    ad- stationary. not in motion,
    Hope this will answer your question.

  37. I join those who think that sirat, like English street from a Western Germanic *strat-, is a borrowing from Late Latin (via) strata “paved (way)”

  38. SiraaT suggests the Arabic root morpheme SrT, with two so-called emphatic consonants (S and T) and the vowel pattern i-aa. However, Arabic native roots never contain more than one emphatic consonant. That rules out that this is even an original Semitic word (cf. Greenberg, Patterning of Semitic root morphemes).
    The obvious link is with Latin, where a similar word [via] STRATA with a similar meaning exists.
    In support of this etymology one can bring to bear the fact that Arabic borrowed all the other words for the key concepts of empire from the Romans:
    [via] STRATA > SiraaT (= street in English) “military road”
    EXERCITUS > caskar (c=`ayn) “expeditionary army”
    CASTRUM > qaSr (= chester, caster, castle) “fortified camp”
    All these borrowings show a preference for heavy consonants (qaaf, Saad, cayn).
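
The constraint invoked above (no more than one emphatic consonant in a native root) is easy to turn into a mechanical check. The following Python sketch is illustrative only. It assumes the ad hoc transliteration used in this thread, where a capital S, T, D or Z stands for an emphatic consonant; the set `EMPHATICS` and both function names are hypothetical, and capital H (used in this thread for the pharyngeal consonant) is deliberately not counted as emphatic.

```python
# Assumption: in this thread's transliteration, capital S, T, D, Z
# mark emphatic consonants. This set is illustrative, not a standard.
EMPHATICS = set("STDZ")


def emphatic_count(root):
    """Count the emphatic consonants in a transliterated root."""
    return sum(1 for consonant in root if consonant in EMPHATICS)


def violates_constraint(root):
    """True if the root has more than one emphatic consonant."""
    return emphatic_count(root) > 1


for root in ("SrT", "rHm"):
    print(root, emphatic_count(root), violates_constraint(root))
```

A check like this only flags candidates for a closer look; it says nothing about where a flagged word actually came from.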

  39. Thanks, that’s very convincing.

  40. Railway sleeper is falanka in Arabic. This sounds suspiciously non-Semitic. Any idea where this can come from?

  41. Arabic falanka from Turkish felenk. This from Greek. (Ref:

  42. I am working on Turkish words in Arabic. Can some of you help me get some Arabic words that are Turkish? Actually I need at least a list of twenty words, but no matter how many you can tell, it is ok.

  43. Some Turkish words: balta, bamya, baqlawa, basama, budza, burghi, dunum, jazma, saljam, sanja, sanjaq, shanta, shiras, shish, tawuq, tabur, tamgha, yaqa

  44. i dont know about that ‘siraat’ coming from strata. and since the word was used in the quran before the arabs had any real interaction with the romans. the word was not strange to arabs, so one can assume that the word was in common use, and the only contact was through merchants to the syria/lebanon regions. unlike the above notion of the ’empire’ ideas taken from the romans.
    street/road in arabic is “tareeq”

  45. also may want to mention in regards to qasr being from castrum then why would spanish, a latin language, take the word al-qasr from arabic into the spanish language as ‘alcazar’.

  46. Because that’s what was available at the time. During the Moorish period Spaniards were saturated with Arabic and borrowed what they heard around them. English borrowed war from French even though it’s originally Germanic.

  47. Bergstraesser in his Introduction to Semitic Languages in a footnote to his analysis of the Arabic of the opening chapter of the Koran mentions the origin of SiraaT as STRATA.
    In his work “The patterning of Semitic Root Morphemes” , Greenberg observes that no more than one emphatic consonant can occur in a native Semitic root. SiraaT has two: Sad and Tah.
    Etymologies like French abricot via Spanish albaricoque from Arabic al-barquq from Latin praecoquium (praecox, coquo) show that Spanish also borrows from Arabic, even when the Arabic words are Latin borrowings.
    The Arab lands were well within the Roman domain. See Bowersock, Roman Arabia or the recent dissertation about the Eastern limes of the Roman empire “Roms orientalische Steppengrenze” by Michael Sommer, in which Hatra in Central Iraq is identified as an Aramaic-Arabic entity in the second century CE. By the time Islam emerges, the Romans had already receded.

  48. Regarding the word “thakiye”, this is a typo indeed. The correct Turkish word is “tahkiye”. I’m sure this was a typo.
    Both Turkish hikaye (Hikaaye) “story” and Turkish tahkiye (taHkiya) “narration” are associated with the same old Arabic root [Hke]: the reference verb /Hakaa/ means “to tell a story” ([e] stands for the morphophonological alternation between /y/ and /aa/).
    The model for this borrowing, the Arabic word /taHkiye/, has no currency in Modern Arabic, hence the erroneous assumption on this list that it should have been /ta’riix/ (which comes from a totally different root [‘rx]).
    This confusion may serve to illustrate that Turkish words that appear to be Arabic borrowings are in fact PERSIAN borrowings. Persian, of course, in turn borrowed these words from Arabic. However, the time of these borrowings is well over a millennium ago and the place of course the location of the Turks at that time: Central Asia, nowhere near Arabs. Such borrowings preserve an earlier phase of the source language Arabic: vocabulary, meaning, usage and sound differ slightly, and sometimes even considerably. On top of that, all the sounds have been distorted by (or, rather, rounded off to the nearest match in) Persian.
    Similar to Turkish tahkiye, with no connection with Modern Arabic, is the widespread term for “thanks”: te?ekkür (teshekkur) in Turkish and Persian. Modern Arabic uses shukra-n in such instances. In spite of the vicinity of Turkey to the Arab countries and in spite of its centuries long domination over them, hardly a word of Arabic was absorbed directly from the Arabs. In no Turkish area will one hear the casual use of a term like “?ükren ” (which is the form it would have taken – the learned term ?ükran being derived from shukraan not shukra-n).
    It took me quite a while to realize that there are no recent borrowings from actual spoken Arabic in Turkish. I know both languages reasonably well, and came to the conclusion that regarding Arabic-Turkish, the relations are a unidirectional affair.
    Some more Turkish words in Arabic: zangiil (from zengin) “rich”, shaawirma (from çevirme) “shoarma”, finjaal (from fincan) “glass cup”, kuubrii (from köprü) “bridge”, shawush (from çavu?) “sergeant”, etc. etc.
    NB – a language list would do well to enable Unicode: it proved impossible to post correct Turkish, let alone Arabic!

  49. Wow, that’s fascinating. I’m glad you keep coming by with these nuggets. But I’m not sure why you’re having a hard time posting Unicode characters, which are indeed enabled.
    Çevirme işleminin sağlıklı bir şekilde yapılabilmesi için öncelikle çevrilmesini istediğiniz takvim türünü seçmelisiniz.

  50. Sitirange.
    In the edit box the ??????? looks OK, but once posted it’s all rubbish:
    Çevirme i?leminin sa?l?kl? bir ?ekilde yap?labilmesi için öncelikle çevrilmesini istedi?iniz takvim türünü seçmelisiniz.
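
The pattern of damage in the garbled sentence above (ş, ğ and ı turn into question marks while Ç, ç, ö and ü survive) is what one would expect if the text passed through a single-byte Western encoding somewhere along the way. The Python sketch below reproduces the effect. It is only an assumption that ISO-8859-1 (Latin-1) was the encoding involved; the thread does not say, and the helper name `to_latin1_lossy` is invented here.

```python
def to_latin1_lossy(text):
    """Round-trip text through ISO-8859-1.

    Characters that Latin-1 cannot represent are replaced with '?',
    which is how the standard 'replace' error handler behaves on encode.
    """
    return text.encode("latin-1", errors="replace").decode("latin-1")


sample = "Çevirme işleminin sağlıklı bir şekilde"
print(to_latin1_lossy(sample))
```

Ç, ç, ö and ü all have code points in Latin-1, whereas ş, ğ and dotless ı do not, which would account for exactly the letters that were lost.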

  51. Very strange. Can you see my Arabic and Turkish in the comment before yours?

  52. Yes I could. I just pasted your text back in using Safari. In the edit or “Comments” box it looks all right.

  53. In his work “The patterning of Semitic Root Morphemes” , Greenberg observes that no more than one emphatic consonant can occur in a native Semitic root. SiraaT has two: Sad and Tah.
    thats interesting , and if that is true, then there are hundreds of words taken from other languages. these words are not new either.

  54. While looking for something entirely different on eBay, I found a link to, where you can download scans of (a Lebanese printing of) Lane’s Lexicon as eight huge PDF files. An OCR was done, so English word search kinda works. Obviously the interface is not as nice as the CD-ROM and pictures take five times as much space, but the price is right. Apparently one can also get it on DVD for USD5 or maybe free.
    The same site also has the Qur’an indexed by roots. I did not see how one could download this in bulk, yet.
    Of course, I never found all this when I actually went looking for it online some time ago.

    I’m just an etymology fan, and after learning that California almost certainly got its name from Kalif / Caliph (via the Chanson de Roland), I’d just love to know more about KLF in Arabic roots.

    Can anyone enlighten me? 🙂

  56. Well, “almost certainly” is overstating it: it’s definitely from Montalvo’s novel Las Sergas de Esplandián, but the theory that Montalvo got it from the Chanson de Roland’s “Califerne” is only a theory, however attractive. At any rate, the Semitic root ḫlp ‘to pass, follow’ gives Arabic ḫalafa ‘to follow, succeed,’ whence ḫalīfa ‘successor (to Muhammad), caliph.’ Here’s a page with all the Qur’anic occurrences of the root.

  57. The Hebrew cognate is ח’ליפה x’alifa, which without the apostrophe (indicating the Arabic pronunciation of the first letter) means suit, as in a suit of clothes. Variations on this word are many, including חלף עם הרוח xalaf im ha-ru’ax, Gone With The Wind, and החלפתי גלגל hixlafti galgal (I changed a wheel [due to flat tire, etc.]). One I learned today is שחלוף shixluf, a noun that a physician told me means diffusion (with respect to medicine). The dictionaries show a verb שחלף shaxlef and define it as re-arrange or re-exchange. Entering the term into Google Translate yields replacement.

  58. What is the Arabic pronunciation of ח?

  59. My technical knowledge of matters phonetic hovers near zero, but Wiki will provide answers for the curious.

  60. David Marjanović says:

    Wiki will provide answers for the curious

    That’s the wrong letter, though. Arabic continues to distinguish ح, ḥ in Semitist transcription, from خ, ḫ to Semitists and generally rendered kh in English and French. Hebrew merged the latter into the former before its alphabet reached its current state.

    ح is pharyngeal [ħ] or epiglottal [ʜ] depending on the “dialect”; خ is commonly uvular [χ], but reportedly the pronunciation as the velar [x] also exists and is even more prestigious.

    The Hebrew (and Yiddish) fusion of ח and the fricative allophone of כ k has ended up as [χ].

  61. Okay, so presumably ח’ means some post-uvular fricative.

  62. George Gibbard says:

    But presumably ח’ליפה means [χaliːfa] with the uvular fricative (or velar). I think Hebrew ח is pronounced generally as خ [χ] by Ashkenazis (so the apostrophe wouldn’t affect the pronunciation) but as [ħ] (= Arabic ح) by Mizrahis.

    Hans Wehr’s dictionary has the Arabic word ħāχām ~ χāχām (חכם) ‘Jewish sage’. I’m not sure the Hebrew letters are going to display in the right order.

  63. George Gibbard says:

    Apparently the Hebrew did indeed come out all right.

  64. The letters indeed display correctly. חכם ħāχām ~ χāχām today means smart, and has a derivation that means sophisticated. The first two letters are pronounced identically in modern Hebrew, though a minority, including L1 Arabic speakers in Israel, does differentiate.

    As a term for sage, it’s seen most frequently in the acronym חכמיני זכרונם לברכה — חז”ל xazal — xaxameinu zixronam li-braxa our sages, (may) their memory (serve) as a blessing, used to denote the talmudic scholars of old. Rabbis in Muslim lands (scarcely any left) were/are often called xaxam.

    אובר חכם uber xuxum is a hybrid, slangy term from Yiddish אייבערחכם and similar in meaning to גרויסעחכם groisse xuxum — all sarcastic terms for a person too clever by half. (The second U in both is barely pronounced.)

  65. Since Hebrew apostrophe marks a consonant as having a non-Hebrew sound that is related to the regular sound of the consonant, ח’ cannot mean [χ], for that is the sound of ח by itself. That’s why it seems to me that it must indicate a post-uvular fricative.

  66. In ח׳ליפה , the ח׳ transcribes Arabic خ, the voiceless uvular fricative [χ]. ח transcribes Arabic ح, the voiceless pharyngeal fricative [ħ]. [χ] is attested in Israeli Hebrew as the sound of כ (historically [k], and cognate with Arabic ك).

  67. George Gibbard says:

    Does that mean the Ashkenazi merger of ח with lenited כ is regarded as somehow incorrect in Israel?

  68. No, since Ashkenazis have been socially at the top of the heap. Roughly, speaking with a distinct ħ marks you to the elite either as sounding vaguely Arabic and therefore somehow inferior, or as folksy salt of the earth. Only the very phonologically conservative Yemeni dialect (which preserves the ħ and other things) is considered exotically ancient.

  69. George Gibbard says:

    Aha, thanks. What else do Yemenis pronounce archaically? Do they possibly have interdental fricatives (for ד and ת without the dagesh)?

  70. In Yemenite/Temani Hebrew, which is a biblical reading tradition rather than how Yemeni Jews necessarily speak, all 22 consonants have distinct pronunciations (although sin and samekh are merged), and all six of the BeGeD-KePeT letters have two pronunciations. Here’s my attempt to extract the Yemenite-specific information out of this Encyclopedia Judaica article on pre-Israeli reading pronunciations:

    BeGeD-KePeT: ב is [b] as a plosive and [v] or [β] as a fricative; ג is [g], [gʲ] or [dʒ] as a plosive (according to the local pronunciation of Arabic) and [ɣ] as a fricative; ד is [d] as a plosive and [ð] as a fricative; כ is [k] as a plosive and [x] as a fricative; פ is [p] as a plosive and [f] as a fricative; ת is [t] as a plosive and [θ] as a fricative.

    Postvelars: א is [ʔ]; ה is [h] (with or without mappiq); ח is [ħ]; ע is [ʕ].

    Emphatics: ט is [d̥]; צ is [sˠ]; ק is [q], [g], [ɢ], or [ʁ] (according to the local pronunciation of Arabic).

    Sibilants: שׂ / ס is [s], שׁ is [ʃ], ז is [z].

    Other consonants: ר is [r], ו is [w], י is [j], ל is [l], ם is [m], נ is /n/.

    Vowels: shuruk is [u], [ʉ], [y], or [i]; holem is either merged with tsere or is [ɞ]; qames (indifferently gadol or qatan) is [ɔ] or [ɑ]; patah is [æ]; tsere is /e/ or /ɛ/, segol is /æ/; hiriq is /i/ or /ə/; the hataf vowels are ultrashort, but have the same quality as their non-hataf equivalents; shwa is too complex to account for here (see the article).

    Dagesh forte is realized as gemination in all cases.

  71. David Marjanović says:

    Since Hebrew apostrophe marks a consonant as having a non-Hebrew sound that is related to the regular sound of the consonant, ח’ cannot mean [χ], for that is the sound of ח by itself. That’s why it seems to me that it must indicate a post-uvular fricative.

    Unless it’s the velar fricative instead, which would agree with the claim that that is the most prestigious pronunciation of خ.

  72. The gory details of the geresh (the ‘apostrophe’) are conveniently and fully explicated in English on Wikipedia. Note that if you type the Hebrew geresh character ׳ (Unicode 05F3) after a Hebrew letter, it will be correctly placed to the left of it, unlike the apostrophe.
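
The difference in placement between the geresh and the ordinary apostrophe comes down to their Unicode bidirectional classes, which can be inspected with Python's standard `unicodedata` module. This is a small illustrative sketch; the helper name `describe` is made up for the example, while `unicodedata.name` and `unicodedata.bidirectional` are standard-library functions.

```python
import unicodedata


def describe(ch):
    """Return (code point, Unicode name, bidirectional class) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))


# Hebrew geresh versus the ASCII apostrophe
for ch in ("\u05f3", "'"):
    print(*describe(ch))
```

The geresh has class `R` (strong right-to-left), so it stays with the Hebrew run it follows; the apostrophe has class `ON` (other neutral), so its position depends on the direction of the surrounding text.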

  73. George Gibbard says:

    Thanks, JC!

  74. About arabic in turkish:

    There are two different periods in turkish language where arabic words were incorporated into turkish. The second period is roughly between 1850-1950. Turkish intellectuals learned modern western concepts and tried to create their turkish equivalents. Words created in this period have arabic roots but not arabic usage.

    TaHKiYe is one of those words, created from arabic root HKY for the meaning of “narration”. There are many words like eKaLLiyet (minority), BeYNel MiLeL (international), TeFRiQa (serial printing, essay printed in installments) etc. Sevan Nisanyan names these words as “yeni osmanlıca”. For a broader list, you can read his blog at

  75. Thanks, that’s a useful summary!

  76. Islamic books says:

    The proper name Arab or Arabian (and cognates in other languages) has been used to translate several different but similar-sounding words in ancient and classical texts which do not necessarily have the same meaning or origin.

  77. There are a couple of words in English that look and sound very similar to Arabic words for the same things: earth and lamb. Neither has cognates in other European languages outside the Germanic group AFAIK. One of the Irish words for sea is farraige (generally considered to be non-IE), which looks a bit like the Arabic for sea (the initial sound f is a bilabial fricative whereas the Arabic bahr starts with a bilabial plosive, and both have the vowel sound -ar-). Coincidence?

  78. Yup.

  79. David Marjanović says:

    I wonder if earth is related to the ard, which goes straight back to Proto-Indo-European.

    Neither has cognates in other European languages outside the Germanic group AFAIK.

    Lamb looks like a borrowing from Celtic, assuming that Celtic had a word that had a common ancestor with the Greek word for “deer”, elaphos. Proto-Germanic is known to have had a pretty long list of Celtic loanwords.

  80. Lars Mathiesen says:

    *h₁er- (ἔραζε) vs *h₂erh₃-trom (ἄροτρον). I think that means no.

    There was too much Greek in this, trying to dilute. Hat can remove this graf.

  81. PlasticPaddy says:

    @dm, Hans
    The elk in proto-Celtic is *lono. A very similar word is used for sheep and also grease. Wiktionary gives the PIE trace for the Greek word as *h₁éln̥bʰos, so that is where the n comes from. The Balto-Slavic reflex ( Lithuanian elnias, Russian olen’ ) also keeps the n.

  82. Rodger C says:

    The etymology I’ve seen for farraige is *wergiwios, the raging or (to be etymological) worked-up one. Also spelled fairrge, I think, which makes it look more likely.

  83. PlasticPaddy says:

    @Rodger C
    That sounds more like fearg = “anger”. So how did e become soft (in fairrge) or hard (in farraige) a? Could fairrge be explained as hard a becoming soft due to 2nd vowel drop, farraige > farr’ge > fairrge?

  84. David Stifter (I believe he is the author of this particular post) outlines several proposals for Indo-European etymologies for Old Irish fairrge at the following address on Facebook:

    Here is the text for those readers who don’t have a Facebook account:

    Following Thurneysen (ZCP 9, 312), OIr. fairrge, foirge (f, i̯ā) ‘ocean, sea’ is commonly understood as an abstract of the adjective fairsiung, foirsiung (u) ‘ample, broad, spacious’, i.e. ‘vast extent (of the ocean)’. Although this explanation is appealing semantically, phonologically it does not work. Fairsiung itself is a compound of intensifying for- + *eissiung ‘wide, vast’ < PC *eχsangu- ‘un-narrow’ *fairs(n)ge is not expected, nor is there any reason why the s should be lost. OIr. did not have a phonotactical problem with complex clusters like this with medial s arising from syncope, cf. airscél ‘famous tale’ or tairsce ‘some part of a shield’. I can see two alternative etymologies for fairrge:

    1. It could be a compound of the preverb/preposition for ‘over, upon’ + a noun derived from the W1 verb srengaid ‘to pull, drag, draw’ < PIE root *√strengʰ- ‘to twist’, i.e. PrGoid. *u̯or-sreng-ii̯ā. The meaning would be similar to the one suggested above, namely ‘extent, width < *drawing wide out’, applied to the ‘extent of the ocean’. The diachronic phonology is regular in this case: the middle e would be syncopated with concomitant palatalisation and loss of the nasal, and the cluster -rsr- would be simplified to -rr- since the formation of this word belongs to a much earlier period than the one suggested by Thurneysen in his explanation of fairrge; cf. also the behaviour of the compound do·srenga ‘to draw, drag, pull’ whose prototonic stem is ·tairr(n)g-. A compound of fo-sreng- ‘draw/pull under’ would work even better formally, but the semantic development is unclear.

    2. Another possibility is to connect fairrge with the topographical term Οὐεργιούος = /u̯ergiu̯os/ transmitted by Ptolemy for a part of the Atlantic south of Ireland (thus already Stokes, Urkeltischer Sprachschatz 273). This name seems to be related with OIr. ferg ‘fury, anger’ < PC *u̯ergā < PIE *u̯erHg̑eh2. Fairrge could conceivably go back to PC *u̯ergiu̯ii̯ā ‘the wild, furious one’ (less likely *uergii̯ā which should have given *fergae), referring to the less agreeable aspects of the North Atlantic. The vowel of the first syllable in fairrge would have to be explained through some analogical influence. *uergii̯os may underlie W Y Môr Werydd 'the Irish Sea, the sea west of Britain'.

  85. Very interesting, thanks!

  86. The last part of the post is of interest too, if one cares about Old Irish poetry, so I’ll copy it here:

    An Old Irish Poetic Formula

    Probably every student of Old Irish is familiar with the poem Is acher in gaíth innocht ‘The wind is sharp tonight’. Its second line has the wonderfully poetic phrase fo·fúasna in fairrge findḟold ‘it tosses the ocean’s white hair’. It seems to have gone unnoticed that the last 4 syllables of the line constitute a veritable poetic formula of Old Irish. Is acher in gaíth innocht, which has Viking attacks as its topic, must postdate 795, and predate 850–1 because it is contained in the precisely-datable glossed Priscian manuscript from St Gall (cod. 904). Appr. 50–100 years older is the first occurrence of the formula in l. 912 (st. 228) of the Poems of Blathmac:

    Is hé tuargaib tuinn do thrácht
    co·mbáidi benna borrbárc;
    is é tróethas anfad ngréich,
    fo·cheird for fairrgi findḟéith.

    ‘It is he (= God) who raises the wave to/from the strand
    so that it drowns the prows of proud ships;
    it is he who subdues the screech of tempests,
    who puts a fair calm upon the ocean.’

    The formula is used again twice in the possibly 11th century poem Oíbinn beith ar Beinn Étair ‘Delightful to be on Benn Étair’, found in Brussels, Bibl. Royale, MS 5100–4, p. 35 (ed. Reeves, Vita Columbae, 1857, 285). The second stanza is also found in Rawl. B 512, f. 126b (…).

    Oibind beith ar Beinn Edair,
    re ndul tar fairrge findḟind.
    Turracc tuinde ‘na hacchaidh,
    luime a caladh ‘sa himild.

    Oibhind beith ar Beinn Ettair
    re ttecht tar fairrgi fonngil,
    beith occ iomram a curcán,
    uchan sa tracht tondmir.

    ‘Delightful to be on Benn Étair,
    before going over the white hair of the ocean/the white-haired ocean.
    The wave’s onslaught against it,
    the barrenness of its harbour and its border.

    Delightful to be on Benn Étair,
    before going over the fair-bottomed ocean,
    to be rowing in a coracle,
    oh!, on its wave-crazy strand.’

    The formula consists of the word fairrge followed by a compound made up of find ‘fair’ + another noun or adjective, also starting with f (but not in fonngil). In Blathmac and Is acher in gaíth innocht, even the words in the opening of the line start with f. But, as can be expected, there are also differences between the instances. Fairrge can be in the accusative, followed by an adjective, or it can be a preposed genitive before a noun (in Is acher in gaíth innocht and perhaps in the first stanza of the Benn Étair poem). There is also metrical variation: whereas in Blathmac’s poem and in Is acher in gaíth innocht the formula occurs in the second line of a deibide couplet, forming the ardrinn of a rinn-ardrinn rhyme, in Oíbinn beith ar Beinn Étair it constitutes an isosyllabic rhyme in a rannaigecht metre.

    These poems are clearly intertextually connected. Two scenarios are possible. Maybe ‘fairrge find-X’ was a traditional poetic formula or cliché learned by filid in their education. In that case the three instantiations could be independent of each other. Or the poems draw upon each other. In that case Blathmac as the earliest would be the inventor of the formula.

    Geographically there is a connection between the three poems. Blathmac comes from the Louth-Monaghan area. In ITS Suppl. 27 (…/The_Language_of_the_Poems_of_Bla…) I tentatively suggested for Blathmac the possibility of an affiliation with the church of Lann Léire (Co. Louth). However he could well have been a member of the monastery in Bangor, Co. Down, which is not very far away. Is acher in gaíth innocht was in all likelihood written in Bangor. A Bangorian provenance is assumed for the manuscript in which it is contained, and the poem, with its maritime concerns, would very well suit the location of Bangor at the sea where it was particularly vulnerable to Viking attacks. Finally, while nothing is known about the provenance of Oíbinn beith ar Beinn Étair, its theme is Colum Cille and his passage to Iona, so we are still in the north-east of Ireland. Mutual knowledge of the compositions in this well-circumscribed area is easily conceivable.


  87. PlasticPaddy says:

    @Xerib, hat
    Thanks for the detailed posts:
    Hat: I assumed Beinn Étair was Howth Head. But maybe it migrated there from somewhere else ☺
    Xerib: so the a requires special pleading + the semantics is a stretch, possibly requiring the sea to have been taboo. It is true that some fishermen do not know how to swim, preferring a quick death to a slower one if they fall in. There is also a cognate in Irish to Latin mare, so what was the taboo word?

  88. Stu Clayton says:

    the fair-bottomed ocean

    Here’s a good example of a translation that needs to be updated from time to time. Today one would expect “fair-bootied ocean”.

    “… particularly vulnerable to Viking attacks … this well-circumscribed area …”. It all makes sense.

  89. David Marjanović says:

    A compound of fo-sreng- ‘draw/pull under’ would work even better formally, but the semantic development is unclear.

    Why? The sea is that which pulls you under, and then you die. Or rather, “it drowns the prows of proud ships”.

    Maybe that’s even the analogical influence on the second explanation.

