Hebrew Loanwords in Polynesian Languages.

Via Rebecca Stanton’s Facebook post, I found the fascinating “preview” of Aaron D. Rubin’s “Hebrew Loanwords in Polynesian Languages” on pp. 12-13 of this pdf, which I thought I’d share here (I have tried to eliminate OCR errors, but there are probably one or two remaining):

In the past, there have been scholars who argued for a genetic relationship between the Semitic languages and the Oceanic family of languages, of which Polynesian is a sub-group (e.g., Macdonald 1907). Such a theory is quite fantastical, of course. A connection of sorts between Hebrew and Polynesian does exist, however, although it is not genetic. Indeed, few Hebraists and Semitists are aware of the fact that a significant number of Hebrew words have been borrowed into several Polynesian languages, including Samoan, Tahitian, and Hawaiian. These Hebrew words made their way to Oceania not through direct contact between speakers of Hebrew and Polynesian, but rather through the efforts of a few 19th-century missionaries.

British missionaries began branching out to the Pacific islands in the 1790s, under the auspices of the Missionary Society (known from 1818–1966 as the London Missionary Society). The first mission was established in Tahiti, and Tahitian is the first Polynesian language into which the Bible was translated. The missionary translators needed many words and concepts not found in Tahitian, and, curiously, they chose to use Hebrew and Greek as sources for these new words. This was, at least in part, because certain Hebrew and Greek words were more easily adaptable to Polynesian phonology (Williams 1837: 528), though certainly religious enthusiasm also played a role. The missionary translators in Samoa and Rarotonga used the Tahitian Bible as a model, and so many Hebrew words were incorporated into the Samoan and Rarotongan Bibles as well.

Many of the Hebrew words used in the Bible translations are terms for flora (e.g., Samoan ‘ārasi ‘cedar’ < Heb. אֶרֶז ’εrεz), fauna (e.g., Samoan nāmeri ‘leopard’ < Heb. נָמֵר nå̄mēr), precious stones (e.g., Samoan pereketa ‘emerald’ < Heb. בּרֶקֶת bå̄rεqεṯ), weights and measures (e.g., Samoan sekeli < Heb. שֶׁקֶל šεqεl), and constellations (e.g., Samoan kīsila ‘Orion’ < Heb. כְּסִיל kəsīl). The Polynesian biblical translations had a profound influence on the respective languages, in no small part because until well into the 20th century the Bible was the only written material to which much of the population had access in most Polynesian islands. Even so, the great majority of the borrowed Hebrew words are found only in their biblical contexts, and did not actually make their way into the spoken language. This is usually because the Hebrew words referred to foreign or outdated biblical concepts (e.g., ancient weights and measures), or flora, fauna, and other materials unknown in the Polynesian islands. In some cases, the biblical loans were simply replaced by native terms, by subsequent loans from modern languages, or by a combination of both. For example, where biblical Samoan has takesa ‘dolphin’ < Heb. תַּחַשׁ taḥaš (e.g., Num. 4.10), modern Samoan uses the native term mumua; where biblical Samoan has kofi ‘ape, monkey’ < Heb. קוֹף qop̄ (e.g., 1 Kgs 10.22), modern Samoan uses manuki (< English monkey); and where biblical Tahitian has sumi ‘garlic’ < Heb. שׁוּם šūm (Num. 11.5), modern Tahitian uses ‘oniāni piropiro ‘stinky onion’ (< English onion + native piropiro).

Some words of Hebrew origin did enter the spoken languages, however. For example, one Hebrew word that was incorporated into spoken Samoan is limoni or limogi [limoŋi] ‘pomegranate’ (biblical Samoan rimoni, e.g., Deut. 8.8) < Heb. רִמּוֹן rimmōn. Hebrew words fully incorporated into Tahitian include ‘oire ‘town, city’ < Heb. עִיר ‘īr; melahi/mērahi ‘angel’ < Heb. מַלְאָךְ mal’å̄ḵ; medebara ‘desert’ < Heb. מִדְבָּר miḏbå̄r; ture ‘(a) law, rule’ < Heb. תּוֹרָה tōrå̄. (At least ‘oire and ture are also current in Rarotongan, nowadays often called Cook Islands Maori). Other loanwords are connected to religion, e.g., Samoan (āso) sāpati ‘Sabbath’, Tahitian and Rarotongan tāpati (biblical Tahitian/Rarotongan sapati) ‘Sunday’ < Heb. שַׁבָּת šabbå̄ṯ; and Samoan Sātani, Tahitian Tātane (biblical Tahitian Satani) ‘Satan, devil’ < Heb. שָׂטָן śå̄ṭå̄n. These religious terms might equally be considered loans from English, though their ultimate source is Hebrew (as is Samoan rapi ‘rabbi’).

An occasionally encountered folk etymology notwithstanding, the well-known Hawaiian word kahuna ‘priest’ (often met in the English expression big kahuna) does not derive from Heb. כֹּהֵן kōhēn ‘priest’, but rather is from a native Polynesian lexeme *tafuŋa ‘priest; craftsman, expert’ (cf. Samoan tufuga, Rarotongan taunga, Tahitian tahu’a).

Rebecca points out that the Comparative Māori-Polynesian Dictionary disagrees that ture ‘law’ is a loan: “This is said to be an introduced word, but is Polynesian.” Also, why would you give up a nice short word like sumi ‘garlic’ for a phrase four times as long like ‘oniāni piropiro?

Comments

  1. limoni sounds more like it comes from lemon, than rimmon. I get the liquids letters and all. I think רִוֹמּן is a typo, with the waw being in the wrong spot. Let’s blame the OCR. Ditto for וֹתּרָה.

  2. I don’t dare try to fix the Hebrew words unless I have correct versions to replace them with, but thanks for pointing out the errors!

  3. limoni sounds more like it comes from lemon, than rimmon
    But note that a) it means “pomegranate” and b) the Biblical Samoan variant has [r]; oscillation between [r] and [l] is common in Polynesian languages and some have switched between both several times in attested history (I think we discussed that here at the hattery before.)

  4. David Marjanović says

    Also, why would you give up a nice short word like sumi ‘garlic’ for a phrase four times as long like ‘oniāni piropiro?

    Rather than nice, maybe it’s too short for a content word?

    Or maybe simply nobody knew what the biblical word referred to…

  5. Dmitry Pruss says

    Garlic is only mentioned once in the Bible, so the botanical or gastronomical meaning of the word may have been thoroughly unknown for an extended period of time, and when the actual garlic appeared, nobody would link it with the poorly understood biblical word

  6. John Cowan says

    Rather than nice, maybe it’s too short for a content word?

    Two-syllable words are quite common in Samoan. Some nouns are mata ‘eye’, lima ‘hand’ (also ‘five’), ulu ‘head’, fatu ‘heart’, fale ‘house’, pele ‘card-playing’; verbs include oo ‘reach’, ʻinu ‘drink’ (L register), gāoi ‘move’, alu ‘go’, tea ‘leave, part from’.

  7. David Eddyshaw says

    Offhand, I can’t think of any language in which bimoraic content words are impossible (though lots in which there are no monomoraic content words.) I’m sure there must be some, though.

    Oti-Volta nominals mostly only end up bimoraic as a result of secondary loss of a mora, but that’s because unbound nominals have to have a class suffix, and monomoraic stems, which were a small minority anyway, have usually been adapted to the pattern of longer stems by various kludges, like lengthening the vowel or doubling the initial consonant of the suffix.

    Bimoraic verbs are all over the place, though, and actually had their own conjugation in proto-Oti-Volta.

  8. Garlic is only mentioned once in the Bible, so the botanical or gastronomical meaning of the word may have been thoroughly unknown for an extended period of time, and when the actual garlic appeared, nobody would link it with the poorly understood biblical word

    Ah, that’s very likely.

  9. I cannot find any explanation of why this Encyclopedia uses å̄ to represent kamatz. Anybody knows?

  10. Garlic is only mentioned once in the Bible,

    Huh? What kind of culture has no respect for garlic? No parable of the bulb and cloves? This is a far more convincing reason to eschew its religion.

  11. I don’t dare try to fix the Hebrew words unless I have correct versions to replace them with

    שׁוּם רִמּוֹן מַלְאָךְ מִדְבָּר תּוֹרָה and also qop̄. Could you take out the line-breaks/hyphens?

    I cannot find any explanation of why this Encyclopedia uses å̄ to represent kamatz. Anybody knows?

    å̄ is qameṣ gadol (Sephardi [a]), å is qameṣ qaṭan (Sephardi [o]).

    The encyclopedia in all four volumes is on archive.org (1 hour loan). That’s a really good thing, as it covers many topics not easily tracked down elsewhere (“Hebrew in China”, etc.)

    lemon
    The Sāmoan word for ‘lemon/lime’ is tīpolo, cf. Mangaia and Tongareva (Cooks) tīporo, Tahitian tāporo. I have no idea where that comes from.


    I doubt the poor benighted heathen Sāmoans needed to borrow words for ‘dolphin’ and ‘Orion’.

  12. For a long list of Hebrew and other loanwords in Tahitian, see here.

  13. More Tahitian loanwords in Davies’ 1851 dictionary. ‘Lemon’ is limoni, ‘pomegranate’ is remuna. Maybe tāporo is a later loan. I’d like it to be from citron but I don’t see how to account for the phonological mismatches, the p in particular.

  14. I doubt the poor benighted heathen Sāmoans needed to borrow words for ‘dolphin’ and ‘Orion’.

    Orion(‘s belt) TUKE-A-MAUI — where Maui is a mythical semi-god, with each Polynesian culture attaching heaps of wondrous voyages and feats, including forming whole islands. Also in Māori Kakau a Māui — which seems to be not ur-Polynesian. (Beware Orion is ‘upside down’ in the Southern hemisphere, so has entirely different celestial properties for navigation purposes. I’m not as confident as @Y it would have consistent naming or detailed mythology. So there seem to be three Māori terms.)

    Haha I mis-read @Y’s comment as about Onion — which is also in o.p. So I’m not going to waste that research:

    At least in Māori, ‘Onion’, ‘garlic’, ‘leek’ are loans from English. It’s perfectly possible Polynesia just didn’t have root crops that were tasty — as opposed to the stodge.

    (I’m finding the dictionary at the link @Hat gives rather hard to navigate. AFAICT if the word is not ur-Polynesian, it simply doesn’t appear. So I’ve linked to the Māori dictionary I usually rely on. Of course it’s not necessarily applicable for Samoan etc, but Māori is usually same as Rarotongan, with minor sound changes. A while back (last year?) we had a long thread (the one on kūmara/sweet potato?) where we accessed tables of ur-Polynesian roots — although that was also hard work to navigate.)

    dolphin/porpoise, which is also in the Hat-linked dictionary. But at my link, see there’s a completely other word for ‘Hector’s Dolphin’ (species). (Note ‘Waiau’ is the name of a river, the mouth of which is one of their hangouts; ‘wai’ is the general term for (fresh) water.)

  15. PlasticPaddy says

    @Y
    Not sure why you think the “lemon” word is a borrowing, Tagalog has a(n)tipolo “breadfruit”. My guess would be that each island has a limited number of fruit trees and the discoverers apply their own fruit words to whatever trees they find, even if they are not the same trees.

  16. שׁוּם רִמּוֹן מַלְאָךְ מִדְבָּר תּוֹרָה and also qop̄. Could you take out the line-breaks/hyphens?

    Thanks, and done.

    I’m finding the dictionary at the link @Hat gives rather hard to navigate.

    To be clear, I hold no brief for that particular dictionary, which I’d never seen before — I was just citing it because Rebecca did.

  17. Zeleny Drak says

    @PlasticPaddy

    Actually most of the Polynesian islands did not have that many fruit plants. Many of the useful plants now native to Polynesia were introduced by the first settlers there. https://en.wikipedia.org/wiki/Domesticated_plants_and_animals_of_Austronesia

  18. PlasticPaddy says

    @Y, Ironman
    My comment is incorrect and should be withdrawn.

  19. @PlasticPaddy: There are native citruses, called most everywhere by reflexes of *moli, reconstructible maybe as far back as PAN. tīpolo is used for citruses introduced (presumably) by Europeans, namely limes and citrons, though oranges are moli (Christophersen, Flowering Plants of Samoa, here, pp.110–111).

  20. PlasticPaddy says

    @Y
    I think (but am most likely wrong) that one of the French islands (Tahiti?) has replaced the moli word with one derived from French ananas “pineapple”. The moli word looked to me at first to be backslang for Romance limo(n). This convinces me that my continuing to comment on this thread is deleterious to my own mental health and to the blood pressure of those who know better.

  21. Offhand, I can’t think of any language in which bimoraic content words are impossible (though lots in which there are no monomoraic content words.)

    Standard Arabic comes pretty close; m.sg. imperatives from defective roots can be bimoraic (even monomoraic in a handful of cases) but any other content word (as opposed to stem) has to be minimally trimoraic.

  22. David Eddyshaw says

    Biblical Hebrew, at least as the Tiberian system is analysed by Geoffrey Khan, is another, now I think of it. The actual spoken language pretty certainly wasn’t, though.

    Hausa is a bit like Oti-Volta, though for completely different reasons: there are very few bimoraic nouns, but there are plenty of bimoraic verbs. It probably is significant, in both cases, that there are few speech contexts in which a verb word actually occurs in isolation: basically just (brusque) imperatives. In Mooré it’s actually ungrammatical to speak a verb word in isolation, except in direct commands: if somebody asks “what’s the Mooré word for ‘fight’?”, you say n zabe, not just zabe.

  23. David Eddyshaw says

    I wonder how widespread it is to find different prosodic behaviour between nouns and verbs?

    In Kusaal, verb perfectives, which are endingless, behave for tone sandhi purposes like the bound “combining forms” that nouns assume as the first elements of compounds, and there is a sort of syntactic conspiracy which prevents them from appearing clause-finally in main clauses if the verb phrase consists of a verb alone, even though they are not, properly speaking, bound forms.

    In Biblical Hebrew, nouns lengthened the underlying vowel of a final stressed syllable, but verbs didn’t, except before a pause (this was a real phenomenon of the actual spoken language, as Greek transcriptions show; the rules for stress shift sandhi in the Masoretic text confirm it, even if Khan is right that the short/long distinction itself had been neutralised in the Tiberian pronunciation.)

  24. Lars Mathiesen (he/him/his) says

    Well, with zero derivations being so common in English, I’d probably assume that I was being asked for the word corresponding to the English noun, unless it was quoted as to fight. And in the latter case, my answer would be at slås. You might elicit the bare present form slås by asking for that, but the infinitive really wants its marker and you might have more luck with the present by asking for he fights: han slås.

  25. David Eddyshaw says

    There are languages in which you actually can’t cite a noun in isolation, come to think of it, where if somebody asked you for the word for “fish”, say, you’d have to say something like “It’s fish.” (Inevitably, I can’t think of an example just now, but I know I’ve come across this.) I suppose even French is like that in Real Life.

    The asking-for-translation-equivalents scenario is admittedly likely to produce unnatural constructions in reply, up to and including people helpfully citing bound forms in isolation for you, but you generally seem to end up with useful results by things like investigating how much ellipsis is allowed before speakers decide that the result is just too unspeakable, or (better) seeing what happens in real casual conversation between L1 speakers.

    If verbs really are cross-linguistically less likely to be allowed as isolated utterances, it might interact with the fact that verbs also tend to make fewer lexical tone contrasts than nouns.

    This may just be a chimera though. It’s just occurred to me that the normal way to say “yes” in Welsh is with an isolated verb word …

    And there are all those lovely polysynthetic languages where yer typical sentence is just one verb word …

    Oh well.

  26. Lars Mathiesen (he/him/his) says

    I have a feeling that you might get isolated bound forms in correction scenarios: A:Når de to skændes—B: slås— ~ ‘when the two are arguing—fighting—’.

  27. David Eddyshaw says

    I suppose this is actually a problem with having a too simplistic notion of “bound.” I can imagine a plausible conversation between two completely linguistically unsophisticated English speakers going something like

    A: Did you say I was to lock it or unlock it?
    B: Un.

    Come to think of it, that applies, the other way round, to my Kusaal verb perfectives: they’re “unbound”, unlike the structurally identical noun combining forms, which are straightforwardly bound to the right; but in fact, various grammatical rules interact to ensure that perfective verbs can never even constitute a complete verb phrase by themselves in an independent clause:

    O lu teŋin.
    she fall downward
    “She’s fallen down.”

    O daa lu.
    she TENSE fall
    “She fell.”

    M tɛn’ɛs ka o lu.
    I think and she fall
    “I think that she’s fallen.”

    O lu ya.
    she fall YA
    “She’s fallen.” where the meaningless particle ya is obligatory.

    You can say

    Li naae ya.
    it finish YA
    “It’s finished.”

    and it’s very common to ellipt the subject in casual speech (even though Kusaal is not “pro-drop”):

    Naae ya.
    “Finished.”

    But you can’t ellipt the ya.

    So they are bound: just in a very complicated way.

  28. Stu Clayton says

    where the meaningless particle ya is obligatory. … But you can’t ellipt the ya.

    So it’s meaningful after all. Just not in a “points to a 4D object” kind of way. More like a Name that can be invoked or withheld:

    “His Fall was destin’d to a barren Strand,
    A petty Fortress, and a dubious Hand;
    He left the Name, at which the World grew pale,
    To point a Moral, or adorn a Tale.” [ id est Charles XII of Sweden]

  29. David Eddyshaw says

    As the motivational poster says: “Just because you’re essential, it doesn’t mean that you’re important.”

  30. Stu Clayton says

    Motivate, but in moderation.

  31. Lars Mathiesen (he/him/his) says

    “He was indispensable, and we couldn’t afford that. So we fired him.”

    This is why Real Sysadmins write docs. Job protection.

    The one real life example I’ve met was a guy who thought it was “professional” to keep all the details of all the servers in his head and be available 24/7. It took a bright young thing a month to pry that from his head and write it down, but it was money well spent.

  32. Stu Clayton says

    But you can’t ellipt the ya.

    Would “elide”, “suppress” or “omit” serve equally well ? Or is the question meaningless, because ya is meaningless ? Lack of meaning is infectious.

  33. I cannot find any explanation of why this Encyclopedia uses å̄ to represent kamatz. Anybody knows?

    This is, more or less, the transcription of Masoretic Hebrew that is used, for example, in Rudolf Meyer’s Hebräische Grammatik (first published as four Göschen booklets in the 1960s, reissued in one volume in the 1990s). The only difference is that the printed text has the ring above the macron, whereas my browser renders it with the ring below the macron. I have seen the same transcription in other scholarly publications about Biblical Hebrew.

  34. I have too, but boy, is it annoying. I was amazed when I found a website that had it in a form I could copy. Why, oh why, do scholars insist on creating these exotic transcriptions?

  35. Stu Clayton says

    Before the advent of Unicode and visually organized electrons (“internet”), such exotic transcriptions existed only in print. They could serve as a watermark making it easier to recognize unacknowledged borrowing (“plagiarism”). As to whether that was deliberate in any or all cases, I do not speculate.

    In the days of hot type, these ring-above-or-below-the-macron issues must have been a nightmare for typesetters. I imagine that if a special type was created for one book, the printer was strongly inclined to reuse it for the next book, no matter what the (different) author said.

  36. Like I said, ring plus macron equals qametz gadol, ring alone is qametz katan. What’s wrong with that?

  37. What’s wrong is that it’s fucking hard to reproduce unless you’re scrawling on paper with a pencil. There must be any number of simpler representations.

  38. Biblical Hebrew, at least as the Tiberian system is analysed by Geoffrey Khan, is another, now I think of it.

    A single-syllable word with a long vowel is bimoraic, isn’t it? Like אִישׁ ʾîš /ʔiːʃ/ ‘man’.

  39. Hat: or, if you have a keyboard set for it, which semiticists do. Vietnamese with its multiple diacritics is a pain too, unless you have a keyboard set up for it. I am using one of the standard Mac keyboard configurations, which makes it easy to produce both a ring and a macron.

    Newer transcription conventions, e.g. Brill’s, don’t use stacked diacritics but still require some special characters at hand.

  40. I am aware there are workarounds. Surely it should be obvious that there should be no need for special workarounds; any sound can be represented with normal symbols combined in normal ways.

  41. David Eddyshaw says

    A single-syllable word with a long vowel is bimoraic, isn’t it?

    No, אִישׁ is trimoraic: the final consonant constitutes a mora.
    There are a few bimoraic nouns in Khan-style Tiberian, e.g. פֶּה .

  42. It is a normal symbol: in 1909, the Gesenius/Kautzsch Hebräische Grammatik used the same convention, and the author (Kautzsch, I guess) points out that the symbol a + ring is taken from the Swedish alphabet; the macron to indicate vowel length was already a common convention in historical linguistics at that time.

  43. John Cowan says

    Unicode has a canonical ordering for diacritics in different locations, thus e + circumflex below + circumflex (above) is the canonical order, and if you swap the circumflexes, canonicalization will swap them back. But a + ring above + macron above renders differently from a + macron above + ring above, and so canonicalization will leave each of them alone.
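For anyone who wants to check this at home, here is a minimal sketch in Python using the standard `unicodedata` module. The helper names (`canonical`, `same_after_canonicalization`, `combining_classes`) are made up for illustration, and NFD is used as the canonicalization step on the assumption that that is the form meant above.

```python
import unicodedata

# Combining marks discussed in the thread.
RING_ABOVE = "\u030A"        # COMBINING RING ABOVE
MACRON = "\u0304"            # COMBINING MACRON
CIRCUMFLEX = "\u0302"        # COMBINING CIRCUMFLEX ACCENT (above)
CIRCUMFLEX_BELOW = "\u032D"  # COMBINING CIRCUMFLEX ACCENT BELOW


def canonical(s: str) -> str:
    """Canonical decomposition (NFD), which applies canonical ordering."""
    return unicodedata.normalize("NFD", s)


def same_after_canonicalization(a: str, b: str) -> bool:
    """True if two sequences become identical once canonicalized."""
    return canonical(a) == canonical(b)


def combining_classes(s: str) -> list[int]:
    """Canonical combining class of each code point (0 for base letters)."""
    return [unicodedata.combining(ch) for ch in s]


# Marks in different positions (below vs. above) have different combining
# classes, so either input order ends up as the same sequence.
above_first = "e" + CIRCUMFLEX + CIRCUMFLEX_BELOW
below_first = "e" + CIRCUMFLEX_BELOW + CIRCUMFLEX
print(same_after_canonicalization(above_first, below_first))

# Ring above and macron share one combining class, so their relative
# order is significant and is left alone.
ring_then_macron = "a" + RING_ABOVE + MACRON
macron_then_ring = "a" + MACRON + RING_ABOVE
print(same_after_canonicalization(ring_then_macron, macron_then_ring))
print(combining_classes(ring_then_macron), combining_classes(below_first))
```

The point of printing the combining classes is that reordering only ever happens between marks whose classes differ; two marks stacked on the same side of the letter keep whatever order they were typed in.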

  44. David Eddyshaw says

    In the Tiberian system chez Khan, monomoraic free words of any kind are not possible; however, as vowel length is neutralised throughout, “mora” is not a useful concept for that stage of the language anyway.

    Khan is a real expert, and if he says it’s so, I’m sure he’s right: however, the Masoretic stress marking definitely reflects a system with contrastive vowel length, and the Masoretes carry through the marking of stress sandhi so accurately and consistently that I cannot believe that the language had already lost contrastive vowel length at the time the cantillation marks were settled: not even they could have been so accurate in getting it right on the basis of tradition alone.

    But all that really means is that the Masoretic pointing, including cantillation marks, must antedate the time of the reading tradition reconstructed by Khan, which is easy enough to believe. He relies a lot on Arabic transcriptions in his reading, for example, and it would hardly be astonishing if the (orally transmitted) Tiberian reading tradition changed significantly over the centuries before it was finally lost.

  45. It is a normal symbol

    We’re using different definitions of “normal.” To you, it apparently means ‘sanctioned by previous scholarly use’; to me, it means ‘reproducible by normal people using normal typewriters and computers without special add-ons.’ You can perfectly well use a colon rather than a macron to indicate length, for instance.

  46. But a macron doesn’t indicate length. It indicates a particular vowel mark, whose phonetic value may be uncertain or variable (and may represent both longer quantity and a different quality). One is transliteration, aiming only at a 1:1 representation of the Hebrew script. The other is phonemic transcription.

  47. Fine, it doesn’t matter what it represents — my point is that any distinction can be made in a simple way that can be easily reproduced or an elaborate way that requires special fonts and the like. I know which I prefer.

  48. DE: since Tiberian vowel length reflects Proto-Semitic vowel length, Spoken Biblical Hebrew likely did too.

  49. John Cowan says

    Fine, it doesn’t matter what it represents — my point is that any distinction can be made in a simple way that can be easily reproduced or an elaborate way that requires special fonts and the like. I know which I prefer.

    Well, of course: you can represent [r], [ɾ], [ɹ], [ɽ], [ʀ], [ʁ] as [r1], [r2], [r3], [r4], [r5], [r6] respectively, or something of the sort. But I don’t see how it’s easier to read.

  50. Lars Mathiesen (he/him/his) says

    In Icelandic, for instance, the modern acute accent occurs mostly where ON had it, but in ON it denoted vowel length. Not so much now. (I think it always makes a difference, though). My input method still doesn’t allow me to type the vowel in bǿr as DEAD ACUTE+Ø.

  51. David Eddyshaw says

    since Tiberian vowel length reflects Proto-Semitic vowel length, Spoken Biblical Hebrew likely did too

    Mostly, but not completely: apart from the obvious loss of word-final short vowels, the main difference is that short vowels in originally open penultimate syllables (before the loss of the aforesaid final vowels) had become long in nouns, but not verbs. This is confirmed by old Greek transcriptions, and also by the Hexapla. Still, that’s the only form of “length by position” (in David Qimhi’s terms) that is actually old. There were also some long vowels resulting from loss of /j/, /w/ and /ʔ/ after vowels already in BH; they are “long by nature” in the Qimhi system. (The main one he misinterprets as positional is tsere in open syllables preceding the stress, which is actually invariant in the morphology: it seems to come from *iw, judging by the conjugation of pe yod/vav verbs.)

    The Tiberian vowel points really do only mark vowel quality and not length, but of course that no more proves that vowel length was non-contrastive for the Masoretes who perfected the system than Latin orthography proves that Latin had no contrastive vowel length. By the stage that Khan reconstructs, the Tiberian pronunciation had indeed lost vowel length, but this cannot have been so for the devisers of the cantillation marks.

    Qimhi’s interpretation of the written tradition is very ingenious, but effectively conflates two quite distinct historical periods into one. For example, in the original tradition stressed /ɛ/ (segol) became /e/ (tsere) regardless of length*, but for Qimhi, tsere is always long: basically the long vowel corresponding to the (for him) always-short segol. But that was never actually the case historically. By the Khan stage, the two symbols just reflected /ɛ/ and /e/ respectively, and length was irrelevant; previously, segol reflected both short and long /ɛ/, and tsere represented both short and long /e/.

    * There are exceptions, with long /ɛ:/ in open final stressed syllables; and as I noted above, /e:/ could occur in unstressed syllables. In all these cases, the vowel length has arisen historically from loss of /j/ /w/ /ʔ/; the simple rule applies where the vowel is from proto-Semitic *i.

  52. But I don’t see how it’s easier to read.

    Good lord, where did I ever say anything about being “easier to read”? Go back and reacquaint yourself with what I was complaining about.

  53. DE: Welp, if we are at it, פֶּה pe ‘mouth’ is monomoraic, though the construct form פִּי is bimoraic, as are אִי ʾî ‘island’ and also some animal, and צִי ṣî ‘ship’. If you count /ʔ/ codas as morae, then also בֹּא boʾ ‘come! m.sg’; if you don’t, then בָּא bāʾ ‘came 3m.sg’.

  54. David Marjanović says

    I can’t think of any language in which bimoraic content words are impossible

    Austrian Standard German (though I wasn’t thinking of it) seems to require trimoraic words (content or not!), except for a few loans.

    The shortest possible allowed rhymes are:

    One syllable:
    – short vowel + two short consonants, e.g. Kind
    – short vowel + long consonant, e.g. Fass; also in, an, ab when stressed
    – long vowel + short consonant, e.g. Rum
    – diphthong + short consonant, e.g. Schein
    – no consonant. In this case, which includes Herr, we’ll have to assume that the vowels/diphthongs are allophonically overlong; given stress-timing this may be phonetically accurate, actually.

    Two syllables, stress on the first:
    – short vowel, two short consonants, short vowel, e.g. Helme
    – short vowel, long consonant, short vowel, e.g. esse
    – long vowel, short consonant, short vowel, e.g. lese
    – diphthong, short consonant, short vowel, e.g. Mäuse

    Three syllables, stressed on the first:
    – short vowel, short consonant, short vowel, short consonant, short vowel, e.g. Panama

    Syllables preceding the stressed one don’t count. (Banane has a long vowel in its stressed second syllable – it must; it’s the same type as lese.)

    All types can be lengthened by adding further morae at the end (whether they’re whole syllables or not), even the last, e.g. Ananas, Benedikt, Nominativ.

    The spellings bb dd gg are borrowed with short /b d g/, so they break this pattern: bimoraic Bagger, Ebbe and… I have trouble finding an example better than the Edda (Norse, but still with dd). Except for Bagger they’re all rare, and they all sound overly hasty to me.

    Mega- is another one if we count it as a prosodic word, which it may not even be. (Farther north, it is a word meaning “great” – and gets a long first vowel.)

    What got me thinking about this was reading an attempt to explain the historical lengthening of stressed open syllables and the historical lengthening of monosyllables with no more than one coda consonant as a single phenomenon, which works theoretically if you assume that word-final consonants are extrasyllabic, i.e. don’t count, so that Kind, Hund, Mann have only two morae while Rum had to be lengthened because it had only one. The trouble with this is that they aren’t a single phenomenon: the lengthening of open syllables started in Low German and still hasn’t reached Switzerland, while the lengthening of monosyllables started in Switzerland and still hasn’t reached halfway colloquial registers of northern German (Rum retains its short vowel there, and there are people called Jan running around there sporting shamelessly short vowels). Also, the trisyllabic type without lengthening of stressed open syllables remains unexplained that way; these words are all loans, but Benedikt ought to be a rather old one.

  55. David Eddyshaw says

    פֶּה pe ‘mouth’ is monomoraic

    No, not in Khan’s reconstruction, where it has a long vowel. All stressed vowels in open syllables are long for him (he bases this largely on Arabic transcriptions.)

    As I say, in his reconstruction vowel length is non-contrastive, but he actually reckons that phonetically the neutralisation was to long, not short.

    The idea that segol is always short (and that tsere is always long) is a feature of David Qimhi’s analysis of the written form of the Tiberian tradition. By his day, the spoken tradition had been lost, and he was basically trying to reconstruct it. His resulting system is ingenious, and has formed the basis of the traditional doctrines of BH vowel length ever since. But he was not always right, and in fact this is one case where he was wrong: the absolute form of פֶּה in fact had a long vowel in real spoken BH too (not just by the Khan period.) It fell under the genuinely old rule that lengthened the vowels of originally open originally penult syllables in the absolute form of nouns (but not in constructs, and not in verbs.)

  56. DM, what about e.g. Kuh?

  57. DE: duh, right.

  58. the lengthening of monosyllables started in Switzerland and still hasn’t reached halfway colloquial registers of northern German (Rum retains its short vowel there, and there are people called Jan running around there sporting shamelessly short vowels)
    Actually, for Rum the short vowel is not just Northern colloquial, but also the Standard German pronunciation as per Duden, which has the long vowel only as süddeutsch und österreichisch auch, schweizerisch meist (Southern German and Austrian also, Swiss German mostly). You are right that the German Standard shows the lengthening of monosyllables in inherited words, but, as this example demonstrates, not as an active phonological process that would work on loans (see also other English loans with short [u] like Klub.)
    The Germany German Standard also has short vowels in monosyllabic function words like prepositions. Is that different in Austrian German?

  59. Good lord, where did I ever say anything about being “easier to read”?

    Fair enough, you didn’t, but I was using that expression metaphorically for superiority: reading is much more common than writing. It drives me crazy when Wells writes [e] instead of [ɛ] for the DRESS vowel on the perfectly-true grounds that there is no /ɛ : e/ opposition in (most kinds of) English but only /ɛ : ei̯/, which he notates [e : ei]. (Note that he uses bold ambiguously for square brackets and slashes, but presumably as a phoneticist he means square brackets.) Matters are the worse when he gives a word without bothering with its conventional spelling, though at least he uses spaces and marks stress, unlike Pullum who does neither, and therefore his long strings of transcribed AAVE have to be deciphered rather than read.

  60. David Eddyshaw says

    If you count /ʔ/ codas as morae, then also בֹּא boʾ ‘come! m.sg’; if you don’t, then בָּא bāʾ ‘came 3m.sg’

    These also already had long vowels by the period when the Masoretes were perfecting their pointing: aleph had by then been lost after a stressed vowel, with compensatory lengthening of the vowel if it was short. (In Khan’s Late Tiberian, the vowels are of course long anyway, but that’s pretty much beside the point.)

    Holem, in fact, is never “long by position”: though it can represent a long vowel in Early Tiberian (or whatever one calls it), this is always either as the result of compensatory lengthening, or from the /o:/ which is the Canaanite outcome of proto-Semitic /a:/, i.e. Qimhi’s “long by nature”, as in the first syllable of the Qal present participle.

    Otherwise, it actually stands for short /o/ (in the stressed syllable of segolate nouns, for example.) Just as /ɛ/ usually became /e/ when stressed, regardless of length, so too /ɔ/ became /o/.

    In open syllables immediately before the stress, where patah becomes kamets, the following consonant is geminated instead of the holem-vowel being prolonged. I suspect this was an alternative strategy for Masoretes whose L1 was Aramaic to preserve the quality of unstressed Hebrew vowels in open syllables; their Aramaic only permitted long vowels and schwa in such syllables, so you had the choice of substituting a long vowel for short, or of making the syllable closed somehow.

    I’ve never seen a really adequate description of all this in print; even sophisticated and diligent researchers seem never to understand that the pronunciation of Hebrew changed significantly over the period in question not once but several times, and use evidence bearing on one period illegitimately as if it applied to other times as well. They also seem to have great difficulty shaking off the long shadow of David Qimhi’s analyses, even though it’s uncontroversial that the Masoretic pointing only marks vowel quality directly, not quantity.

    The evidence that Early Tiberian actually did distinguish vowel length is intimately bound up with the cantillation marking (not too surprisingly, as that’s all about the prosody), and the evidence it provides has not been properly exploited AFAIK.

  61. David Marjanović says

    DM, what about e.g. Kuh?

    “ – no consonant. In this case, which includes Herr, we’ll have to assume that the vowels/diphthongs are allophonically overlong; given stress-timing this may be phonetically accurate, actually.”

    süddeutsch und österreichisch auch, schweizerisch meist

    The Duden simply isn’t going far enough here. I’ve never encountered Rum with a short vowel except from northern sources, and would never have guessed on my own that it has one anywhere. The word is not rare; rum is an important ingredient in certain cookies and is used to disinfect the lids of jam glasses that are being filled… oh, and it’s the base of “hunters’ tea”, traditionally drunk after skiing.

    Rum isn’t perceived as a loan (unlike Klub, on which I agree completely). Given that fact, it really would have to be spelled with mm to even allow an interpretation as short-voweled. (There are a few people as far south as Berlin who kindly spell themselves Jann. The one Jan I know in Vienna has a long vowel.)

    short vowels in monosyllabic function words like prepositions

    Yes; as I mentioned (in, an, ab), their coda consonants get lengthened instead when they’re stressed enough.

    …except for the long vowel of ob, which I forgot to mention and which seems not to occur outside Austria and Bavaria at all. I suspect that’s borrowed wholesale from the dialects, where vowel length isn’t phonemic and there is no /ɔ/.

  62. I was using that expression metaphorically for superiority: reading is much more common than writing.

    But writing is what I was talking about. It is very difficult to reproduce those invented symbols except with pen(cil) and paper. That is a problem.

  63. J.W. Brewer says

    Just to return to the minimal scriptural teaching on garlic as alluded to upthread, the key verse is Numbers 11:5, viz. (in KJV) “We remember the fish, which we did eat in Egypt freely; the cucumbers, and the melons, and the leeks, and the onions, and the garlick.” Cucumbers are mentioned in two other verses, leeks/onions/garlick not at all. In general there are plenty of Biblical references to grains and some to legumes and also some to the fairly broad/generic category of “herbs” (as the KJV has it), but other vegetables are little-mentioned, especially if you take the pedantic view that “actually, olives are technically a fruit.” My inference from this is that: a) other vegetables were considered of little narrative interest;* and more importantly b) other vegetables are generally not regulated or restricted by any of the elaborate kosher rules so there’s no need to discuss them in that context. But I don’t doubt that there’s some voluminous niche literature I’ve never read on “Vegetables In the Bible.”

    *Possible exception: 2 Kgs. 4:38-41, where someone foolishly gathers unidentified wild gourds and cuts them up and adds them to the stew, but of course they turn out to be poisonous until Elisha intervenes and makes the stew benign. I’m sure there are various analogical/typological exegeses of that passage of which I am ignorant.

  64. PlasticPaddy says

    @jwb
    Taking the part of the lentils in Gen 25:34
    “What are we? Chopped liver?”

  65. J.W. Brewer says

    @PP: intended to be covered by my “and some to legumes” exclusion.

  66. PlasticPaddy says

    Ok that is clear. Somehow I default to legumes = peas and beans, and lentils and chickpeas are exotic Med imports…

  67. Re Rum: my grandparents actually used to bring Stroh Rum home from their vacations in Austria. But while I accompanied them a couple of times, I can’t remember how it was pronounced by the locals; it’s too long ago – I was a pre-teen back then…
    For me, pronouncing Rum with a long vowel is so outlandish that I’d first assume that the speaker has had too much of it 😉

  68. to me, [‘normal’] means ‘reproducible by normal people using normal typewriters and computers without special add-ons.’ You can perfectly well use a colon rather than a macron to indicate length, for instance.

    This strikes me as an extraordinarily crankypants response from you-of-all-people. You don’t insist on writing Russian or Greek in transliteration, for example; you use a virtual keyboard. Why are you fussing about the occasional oddball character in this-blog-of-all-places? There is a virtual keyboard for IPA too, which lets you enter either ā̊ or å̄ with equal ease. (Personally I prefer the latter, as more fundamental marks should be closer to the base letters; in Vietnamese the quality marks (circumflex, micron, horn, and arguably the crossbar through d) are always closer to the vowel than the tone marks.)
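
    A minimal sketch of why the two orderings are genuinely different strings rather than two spellings of one: the combining ring above (U+030A) and the combining macron (U+0304) share the same canonical combining class, so Unicode normalization never swaps them. The variable names here are illustrative only, and Python's standard `unicodedata` module is assumed.

```python
import unicodedata

RING = "\u030A"    # COMBINING RING ABOVE
MACRON = "\u0304"  # COMBINING MACRON

# Two ways of stacking the same marks on "a".
ring_first = "a" + RING + MACRON    # ring sits closer to the base letter
macron_first = "a" + MACRON + RING  # macron sits closer to the base letter

# Both marks attach above the base and share one combining class,
# so canonical reordering leaves their relative order alone.
same_class = unicodedata.combining(RING) == unicodedata.combining(MACRON)

# NFC composes the base with whichever mark comes first,
# then keeps the second mark as a separate combining character.
nfc_ring_first = unicodedata.normalize("NFC", ring_first)
nfc_macron_first = unicodedata.normalize("NFC", macron_first)

for label, text in [("ring first", nfc_ring_first),
                    ("macron first", nfc_macron_first)]:
    print(label, [unicodedata.name(ch) for ch in text])
```

    So a search for one form will not find the other unless the software does something beyond plain normalization, which is one practical reason to pick an order and stick to it.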

    If you are complaining that the text layer of a PDF created with a TeX variant doesn’t contain proper Unicode that you can cut and paste, that is an extremely long-standing problem caused by the horrible pile of hacks that TeX encodings are. Of course, plenty of PDFs don’t contain a text layer at all, or if they do it is the result of post hoc OCRing, which naturally will have trouble.

    For me, pronouncing Rum with a long vowel is so outlandish that I’d first assume that the speaker has had too much of it

    Unless indeed it means ‘Anatolia’, as in Rumt WP (both en and de) is uninformative on how the market town of Rum in the Tyrol is pronounced, but the alternative spelling Rumb and the older form Rumne both suggest a short vowel to me. In English, Rum is the English/Scots name of an island in the Inner Hebrides (“the largest of the Small Isles”). For a while it was Rhum, as the owner did not want to be called the Lord of Rum in cold print, but the conservancy that owns it now has reverted to the historic spelling. It’s Rùm in ScG, and can be pronounced with GOOSE/FOOT or STRUT in Scots, the latter being borrowed from English rum; I don’t know how the name of the island is pronounced in echt English.

  69. Dammit, still no edit window. For “Rumt” read “Rumtürkisch.”

  70. @JC: I assume you know that I was talking about the alcoholic beverage. That said, I pronounced Rum as in Rum-Seldschuken (a name I only knew from my school historical atlas) with a short “u” for years before I learnt the correct pronunciation, and I assume I’m not the only one.

  71. Stu Clayton says

    Rhum mitigates Rheuma, I’ve heard.

  72. David Eddyshaw says

    I’ve only drunk it when it was given to me by Venezuelan relatives*, but I can testify that if I had had a cold at the time, it would rapidly have ceased to trouble me.

    * Who gave me plainly to understand, that non-Venezuelan “rum” is a mere charade and shadow, and not to be spoken of by gentlemen.

  73. Stu Clayton says

    if I had had a cold at the time

    Nowadays Rheuma usually means rheumatism (“rheumatoid arthritis”). But it can (and could, in the past) instead/also have the meaning of “catarrh” (Katarrh). It’s all Greek to me, especially since I suspect I have both as I write.

    This is a job for Roy Porter, who had supernatural abilities:

    #
    He was married five times, firstly to Sue Limb (1970), then Jacqueline Rainfray (1983), then Dorothy Watkins (1987), then Hannah Augstein, and finally his wife at the time of his death, Natsu Hattori.[2][3][6] He was known for the fact that he needed very little sleep.[1][3][5]
    #

  74. David Marjanović says

    For me, pronouncing Rum with a long vowel is so outlandish that I’d first assume that the speaker has had too much of it 😉

    It’s the closest Austria gets to Ruhm “glory” these days!

    WP (both en and de) is uninformative on how the market town of Rum in the Tyrol is pronounced, but the alternative spelling Rumb and the older form Rumne both suggest a short vowel to me.

    Here we get into confusion between dialect and standard. The standard replaced the MHG /uː/, which had diphthongized, by a new one, which developed from a MHG diphthong; throughout the South and Central Bavarian dialects (South is relevant here), the “NHG diphthongization” also happened, but the “NHG monophthongization” did not, so there is no /uː/ (and no /iː/) – and this fact contributed to the complete loss of phonemic vowel length. Phonetic length now works like in Russian: 1) stressed vowels are longer than unstressed ones under the same conditions; 2) vowels get shorter as you pile up coda consonants, but that’s less important than stress. So you start with a dialectal [rʊ̝ˑm] or so – /m/ short because final, vowel lengthened because stressed –, try to figure out what the Standard equivalent would be and how it would be spelled.

    Perhaps similarly, the -ham place names in Bavaria are all unrecognized -heim.
