NOBODY IS REALLY TOO KEEN ON A CYRILLIC NTA.

Mark Liberman at the Log writes about a Kyiv Post article by Paul Goble that begins: “A statement by a Kazakhstan minister that his country will eventually shift from a Cyrillic-based alphabet to a Latin-based script and reports that some scholars in Dushanbe are considering dropping another four Russian letters from the Tajik alphabet suggest that a new battle of the alphabets may again be shaping up in Central Asia.” It’s well worth reading, far better informed than most journalistic attempts to deal with linguistic matters, and contains interesting links. And it gives me a chance to plug some of my favorite Log posts of all time, those dealing with Central Asian alphabets: How alphabetic is the nature of molecules, Birlashdirilmish yangi Turk alifbesi, and Vaslav Tchitcherine, call your office—not to mention my own Language in Central Asia. As I said there, be grateful you weren’t trying to become literate in that part of the world in the 1930s.

Comments

  1. michael farris says:

    “be grateful you weren’t trying to become literate in that part of the world in the 1930s”
    or now….

  2. This article is primarily about orthographies, so, in that sense, is not about ‘linguistic matters’. You know.

  3. Poland chose Latin, Ukraine chose Cyrillic, and I’ve always assumed (without any evidence) that this reflected who had the upper hand in politics at the time: western-oriented or Russian-oriented politicians. If I’m wrong, I’m sure I’ll be corrected (and gladly so), but it seems that this current controversy is also primarily political, denials in the Kyiv Post article notwithstanding.

  4. This article is primarily about orthographies, so, in that sense, is not about ‘linguistic matters’

    Writing down a language involves, at a minimum, some analysis of its phonology and morphology; and issues of history, variation, and yes, language politics will get in there — if this isn’t “linguistic matters” what is?

  5. marie-lucie says:

    Poland chose Latin, Ukraine chose Cyrillic
    Wasn’t this originally because of the prevailing religious traditions?

  6. What vasha said. Also, what m-l said. That decision came down long ago, in times when language policy was a matter much more closely associated with religion than the whole nexus of secular politics.

  7. marie-lucie says:

    There is also the parallel use of Cyrillic for Serbian and Roman for Croatian (even though those languages are almost the same), because of adherence to the Orthodox or the Catholic tradition.
    Similarly, Persian and Urdu, which are Indo-European languages, are written with the Arabic script because their speakers adopted Islam, but Hindi, which is very similar to Urdu, uses the Devanagari script which is associated with the scriptures of the Hindu religion.

  8. Serbian is really Cyrillic/Latin, whereas Croatian is just Latin. It’s not uncommon, for example, for people to send a Latin-script article to a publisher, even though it will be typeset in Cyrillic. Transliteration with Unicode is easy, automatic, and perfect.
    Other multiscript languages include Azeri (Arabic, Latin, Cyrillic), Hausa (Arabic, Latin), Kurdish (Arabic, Latin), Mongolian (Cyrillic, Mongolian), Panjabi (Arabic, Gurmukhi), Tachelhit (Latin, Tifinagh), Uzbek (Arabic, Cyrillic, Latin), and of course Mandarin Chinese with its simplified and traditional characters.
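
    The Cyrillic-to-Latin direction really is a one-pass table lookup, because each of the 30 Serbian Cyrillic letters has exactly one Latin equivalent. A minimal sketch follows; the names CYR_TO_LAT and to_latin are made up for illustration, and the reverse direction is not covered here.

```python
# Serbian Cyrillic -> Latin, one entry per lowercase Cyrillic letter.
# Table and function names are illustrative, not taken from any library.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ", "е": "e",
    "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k", "л": "l", "љ": "lj",
    "м": "m", "н": "n", "њ": "nj", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "ћ": "ć", "у": "u", "ф": "f", "х": "h", "ц": "c", "ч": "č",
    "џ": "dž", "ш": "š",
}


def to_latin(text: str) -> str:
    """Transliterate Serbian Cyrillic text to Latin, letter by letter."""
    out = []
    for ch in text:
        lower = ch.lower()
        if lower not in CYR_TO_LAT:
            out.append(ch)  # digits, punctuation, Latin text pass through
            continue
        lat = CYR_TO_LAT[lower]
        if ch != lower:
            # Uppercase input: capitalize only the first Latin letter (Љ -> Lj).
            lat = lat.capitalize()
        out.append(lat)
    return "".join(out)


print(to_latin("Љубљана"))   # Ljubljana
print(to_latin("Политика"))  # Politika
```

    Going from Latin back to Cyrillic probably needs more care than a plain table, since a sequence such as nj occasionally stands for two separate letters (н + ј) rather than њ.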

  9. michael farris says:

    I assume that the big winners in this script turmoil will be Russian (as de facto working language of bureaucracy and media) and English.
    Constantly messing with the script of a language is a very good way of sabotaging literacy and literary output.
    As to the specifics, Kazakh has been making periodic noises about romanization for a long time and they have an okay system in place (essentially a transliteration of the cyrillic but with the hard sign which is useless for Kazakh). The best solution imho would be to declare both official and let the marketplace of native users decide. But then large scale language reformers generally don’t care what the marketplace of native users wants….
    The Tajik proposal doesn’t seem so radical if it’s just dropping letters that aren’t needed (and respelling loan words to correspond better with local pronunciation).

  10. Other multiscript languages
    I’d add Japanese, of course, with two alphabets, Chinese characters/kanji and romaji/Latin all co-existing. And versatile left-to-right, right-to-left, horizontal and vertical lines.
    But in Europe the Serbian example is really impressive.

  11. marie-lucie says:

    The best solution imho would be to declare both official and let the marketplace of native users decide. But then large scale language reformers generally don’t care what the marketplace of native users wants….
    “Native users” are not all adults purchasing books and newspapers. They are also schoolchildren learning to read and using textbooks. If there are two official alphabets, who decides which one to use in schools (at least for beginning readers)? The situation is different if there are two languages, because in that case there are also two different populations.

  12. michael farris says:

    What do they do in Serbia?
    To be clear I don’t think there’s much point in Kazakh switching from a very workable cyrillic orthography to a basically identical roman based one. I have no particular idea if existing literate speakers want to do this or not (and i kind of doubt it).
    If they are interested in going ahead with it anyway for whatever reason then gradual incrementalism with a transitional period of biscript literacy would be a better idea than whatever it is they think they’re doing now.

  13. Off topic (as always), but I just saw these gorgeous colour photographs from early last century Russia:
    http://www.huffingtonpost.com/2010/11/20/rare-color-photos-of-the-_n_785798.html#186298

  15. marie-lucie says:

    Does Serbia use both alphabets then? I knew that was true of the former Yugoslavia (eg on stamps), but I did not know that Serbia was still doing it.

  16. michael farris says:

    yes, it isn’t hard to find web publications in either
    here’s one in cyrillic
    http://www.politika.rs/
    and here’s one in latin
    http://www.danas.rs/danasrs/naslovna.1.html
    IIRC cyrillic is official for bureaucracy but latin use is very widespread (and possibly growing).
    Don’t know anything about relative script numbers for book publishing or how the problem is dealt with at school.

  17. michael farris says:

    Also, one interesting side effect of the Serbian situation is that foreign names in latin are usually written as if transcribed from cyrillic. I remember seeing a movie magazine in latin script with references to Voren Bejti and Edi Marfi etc.

  18. Thank you for the clarification re Catholic/Orthodox religions.
    The example mentioned above, of simplified versus traditional Chinese, is a good warning of what can happen when politicians take over writing systems. When I was in the biz, I used human editors to convert between the two, because the differences were great enough that no automatic conversion facility worked completely.
    An example of romanization gone wrong: in Japanese, a relatively simple language to romanize, there are three major romanization systems in use, and even the government is inconsistent about it (it allows Hepburn for passport names, for example, even though it officially uses Kunrei-shiki in education).

  19. marie-lucie says:

    one interesting side effect of the Serbian situation is that foreign names in latin are usually written as if transcribed from cyrillic.
    This seems to indicate that cyrillic is the dominant alphabet.

  20. What do they do in Serbia?
    Does Serbia use both alphabets then?
    yes, that’s what I meant: Serbia is the only country in Europe where Latin and Cyrillic peacefully coexist and, I think, are both officially recognised. It’s not surprising – it’s there that the Roman Commonwealth (Latin) meets the Byzantine Commonwealth (Greek).

  21. It seems to me the Tajiks are the sensible ones here. In the wake of the fall of the Soviet Union the various former Soviet Republics wished to distance themselves from Russia and the Soviet past: that is perfectly understandable.
    But surely there must have existed ways of doing so which did not involve instantly transforming a whole generation into illiterates? Mass literacy in the various vernacular languages of the region may well be the only positive legacy of Soviet rule: it seems unwise to endanger this legacy.
    Hence the sanity, to me, of the Tajik approach: keeping Cyrillic, but modifying it somewhat, making the orthography less Russian-like. Thereby establishing a distance from an unattractive Soviet past without ditching its chief (sole?) positive accomplishment: vernacular mass literacy.
    And I agree with Michael Farris: Russian and English will be the winners here. The lack of a well-defined norm, orthography (or even script) in the language(s) of various newly-decolonized countries has played a major role in keeping the former imperial power’s language in its dominant position.

  22. Japanese, a relatively simple language to romanize
    maybe, when you think of shi or chi, but what do you do with strong rhotic sounds, tsu, which is phonetically closer to Russian Ц (ts as in tsar or czar) and not very comfortable on an English tongue, and of course the strong KH in the ha-hi-fu-he-ho column of gojuon? My mother, whose first foreign language is French, always says ‘itashi’ for ‘Hitachi’. Most anglophones are not very comfortable with strong ‘kh’ too.

  23. Russian and English will be the winners here.
    Michael, Etienne, could you elaborate, please? To me, it seems like an either-or situation.

  24. @Sashura: When people are illiterate in their own language, but literate in a foreign language, they will start to use the foreign language for various things that they would otherwise have used their own languages for.

  25. Can someone explain the Tajik proposal to me? I don’t read Russian, but I looked at the article in question and found the four letters proposed for elimination. All of them represent Tajiki sounds and are used in Tajiki Persian words*, so I don’t understand why they are called ‘Russian’ letters. That makes it sound like they are somehow a poor orthographic fit for Tajiki, which isn’t true.
    *For instance ‘e’ is the first letter in ‘Iran’, ‘e’ with the two dots is ‘ya’ یا which means ‘or’ in Persian, the backwards ‘R’ is the first letter in yak یک which means ‘one’, etc.

  26. Highly recommended reading in this regard (besides O.C. the pertinent passages of Gravity’s Rainbow) is The Dictionary of the Khazars by Milorad Pavic.

  27. ‘Cause there’s absolutely no reason to spell /jV/ as a single letter other than Russian compatibility.

  28. Japanese isn’t multiscript in the same sense as the others: it requires multiple scripts to write it at all, but there is in practice only one way of doing so. (Obviously you can write English in Shavian script or Greek in IPA, but I am dismissing such extremely minoritarian practices.)
    The Serbian situation is indeed impressive. In most cases, languages are multiscript either because they have used different scripts at different times within living memory, or because different populations, often in different countries, use different scripts. But in Serbia (and Montenegro) there is stable bi-scriptism: every Serbian-speaking child learns both alphabets as a matter of course, though I do not know in detail which, if either, is taught first. (I can imagine them being taught simultaneously, as upper and lower case are taught simultaneously.)
    The scripts are so thoroughly fused that when doing crossword puzzles in Latin script, the Latin digraphs lj nj dž that are equivalent to a single letter each in Cyrillic are written in a single square: one can write either LJ U B LJ A N A or Љ У Б Љ А Н А in the same seven squares. (From a historical point of view, љ and њ are ligatures of ль and нь, but Serbian views them as unitary letters and does not use ь.)
    As for Edi Marfi, Serbian orthography is ruthlessly phonemic, so everything is written as Serbian-speakers would pronounce it. By the same token, Russian or Bulgarian names with щ are never written so in Serbian: they are transliterated to шч or шт as the case may be. Similarly, ю and я in other Cyrillic-script languages are transliterated to Serbian Cyrillic ју and ја.
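
    The crossword convention amounts to splitting a Latin-script word into letters while treating lj, nj and dž as single units. A rough sketch, assuming precomposed characters (ž as one code point) and a made-up function name crossword_cells:

```python
# Latin digraphs that correspond to single Serbian Cyrillic letters (љ, њ, џ).
DIGRAPHS = ("dž", "lj", "nj")


def crossword_cells(word: str) -> list[str]:
    """Split a Latin-script word into crossword squares, one digraph per square."""
    cells = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair.lower() in DIGRAPHS:
            cells.append(pair)  # digraph shares a single square
            i += 2
        else:
            cells.append(word[i])
            i += 1
    return cells


print(crossword_cells("LJUBLJANA"))  # ['LJ', 'U', 'B', 'LJ', 'A', 'N', 'A']
print(len("ЉУБЉАНА"))                # 7
```

    The sketch greedily treats every lj, nj or dž sequence as a digraph, so it would miscount the rare words in which those sequences are really two letters.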

  29. marie-lucie says:

    Latin and Cyrillic alphabets: I can imagine them being taught simultaneously, as upper and lower case are taught simultaneously.
    Upper and lower case versions of the same letter rarely look the same, and the characters of one series cannot be confused with those of the other one. Latin and Cyrillic share some characters which have the same shape and value in both, but there are also a number of characters which are ambiguous, like B, C, H, N, P, X, Y, at least. This would make it difficult to teach the two alphabets simultaneously, as opposed to teaching one first, and adding the second one later, when the children are competent in the first alphabet.

  30. Bob Violence says:

    maybe, when you think of shi or chi, but what do you do with strong rhotic sounds, tsu, which is phonetically closer to Russian Ц (ts as in tsar or czar) and not very comfortable on an English tongue, and of course the strong KH in the ha-hi-fu-he-ho column of gojuon? My mother, whose first foreign language is French, always says ‘itashi’ for ‘Hitachi’. Most anglophones are not very comfortable with strong ‘kh’ too.
    Those are all issues with pronunciation, not romanization. An Anglophone (or any other foreigner) who wants to speak Japanese will learn the proper pronunciation of “tsu” or that “h-” initial is usually more fricative (and sometimes palatalized as well); everyone else will continue to use Anglofied (or Francofied, Russified, etc.) spelling pronunciations. No romanization system can be 100% intuitive across every single language, but it can still be a perfectly adequate representation of the original language’s phonology. Hanyu Pinyin poses all sorts of issues for English-speakers (q, x, zh, etc.) but it’s used pretty consistently, which gives it a leg up over the various Japanese romanizations.

  31. Bob Violence says:

    And mea culpa: I know that transliteration of Japanese into Russian is a matter of cyrillization, not romanization, although it poses similar pronunciation issues. (My understanding is that Russian uses a single standard cyrillization, but not everyone understands or follows the rules. Dunno about non-Russian cyrillic scripts.)

  32. “‘Cause there’s absolutely no reason to spell /jV/ as a single letter other than Russian compatibility.”
    Was this in response to my question about why certain letters in the Tajiki alphabet are considered ‘Russian’? If so, what letter does it refer to? I don’t understand /jV/.

  33. @Andrew: /jV/ means roughly “a ‘y’ sound followed by a vowel”. Russian has several such letters, but minus273 is saying that Tajik doesn’t need such a thing, and can just as well write the two sounds separately. <я>, for example, could be written <йа>.

  34. Ahh, thanks Ran.

  35. Russian uses a single standard cyrillization
    it does, you can always tell if someone has at least some knowledge about Japan by the way they spell chi-ти, not чи, and shi-си, not ши (sushi-суси-суши).

  36. J. W. Brewer says:

    It seems like a missed chance that Turkmenbashi-era Turkmenistan apparently ditched cyrillic for a boringly roman-based alphabet for Turkmen as soon as the USSR dissolved, rather than devising a brand-new script as a tribute to the unique genius of the Turkmen people and their fearless leader — who engaged in other innovations such as renaming the months of the year etc.: http://en.wikipedia.org/wiki/Renaming_of_Turkmen_months_and_days_of_week,_2002. Although to be fair, since neither North Korea’s rulers nor Myanmar’s have discarded the pre-existing alphabets for their country’s dominant languages, inventing and imposing new writing systems may have fallen out of the Deranged Supervillain playbook.

  37. @Sashura: Interesting point about cyrillization of Japanese. But I guess it means nobody in Moscow has any knowledge about Japan: the last time I was there, it was all chain суши restaurants and no суси in sight. (Presumably this means the recent sushi trend made it to Russia via the west rather than Japan.)

  38. bet it has.
    But суши also reeks of сушка – both the ring-shaped tea biscuits and sun-dried fish (mostly in the North, not in Moscow)

  39. illiterate in their own language, but literate in a foreign language
    Ran, this is succinctly put, but not entirely correct. Surely, you don’t introduce democracy (people power), anarchy (no power – small government) or Soviet (council, local government) and sputnik (satellite), because you are illiterate in your own language, but literate in a foreign language. That’s at the high end, and at the low end, it’s not for literacy/illiteracy reasons that woodcock, in Russian originally ‘pizdrik’, was changed to вальдшнеп (Waldschnepfe).
    But my question was about scripts – that’s why I am slightly puzzled with the argument that Russian and English would both be winners.

  40. @Sashura: If I may elaborate. When the Soviet Union fell the various Central Asian republics all had a “national” language written in Cyrillic, which were the “low” languages, with Russian being the “high” one, sociolinguistically. Subsequent events introduced English as a second “high” language.
    Attempts on the part of Central Asian authorities to extend the domains of use of their national languages (at the expense of Russian) are weakened if not nullified by these “wars of the scripts”: educated native speakers/users, feeling uncertain as to which spelling/alphabet to use, will (I sense) in the end feel more comfortable falling back upon one of the “high” languages (Russian or English) instead of their L1, which recent events make them feel they don’t really master as written codes.
    A very similar situation was and is found in many formerly colonized countries: even in ones with a dominant non-european vernacular, the low degree of standardization means that even native speakers who are literate in the former colonial power’s language prefer using said language in formal contexts, including writing. Hence the dominant role still played by Dutch in Surinam, French in the Central African Republic, English in Ghana as “high” languages, with Sranan, Sango and Akan (respectively) the dominant informal, “low” languages.
    I would not be surprised if a generation hence the relationship of the various Central Asian languages to Russian and English will be reminiscent of the linguistic landscape of the above countries…

  41. I know that transliteration of Japanese into Russian is a matter of cyrillization, not romanization, …
    Nice caution. But a note concerning terms:
    I would not use transliteration unless both source coding and target coding were alphabetic: that is, using letters (literae). I think it is best to use transcription as the general term, with romanisation (and cyrillisation, etc.) as specific kinds of transcription. Orthogonally, every transliteration is a kind of transcription. I don’t know that all experts use the terms in these regimented ways; but I think many do, and they have logic, etymology, and the need for clarity on their side.

  42. Waldschnepfe
    Bird names and their standardisation are something else again. I have no idea why ‘pizdrik’ (пиздрик?) would have been changed to вальдшнеп, but it is possible that ‘pizdrik’ referred to more than one species, and вальдшнеп was introduced in the interest of taxonomical precision. In any case, bird names according to official taxonomies are a totally artificial beast.

  43. Noetica: I would unhesitatingly use transliteration for a mapping between hiragana and Latin letter combinations, though hiragana are not strictly speaking an alphabet. As a Unicadet, the difference to me is primarily that transliteration maps the characters of a script onto the characters of another script without regard to a specific language (thus it is reversible), whereas transcription uses the written conventions of one language to represent the written form of another, and is typically not reversible exactly.

  44. By the way, Hat, I keep reading the headline as a reference to a non-existent casing pair of Unicode characters, CAPITAL and SMALL LETTER NTA, whose existence is seen as undesirable.

  45. I would unhesitatingly use transliteration for a mapping between hiragana and Latin letter combinations, though hiragana are not strictly speaking an alphabet.
    Good for you! I hesitated with that restriction to alphabetic, exactly; and I do not prescribe, but simply observe. Still, the points regarding etymology and clarity stand.
    … that transliteration maps the characters of a script onto the characters of another script without regard to a specific language (thus it is reversible), whereas transcription uses the written conventions of one language to represent the written form of another, and is typically not reversible exactly.
    I wonder about your restriction to “another [language]”. Perhaps usefully ignoring that restriction, and taking the example of a standardised Mandarin Chinese, your suggestion implies that a text in pinyin (an encoding devised by Chinese experts) prepared from one written in characters counts as a transcription, but not as a transliteration, right? Only because it is not reversible? I wonder more generally about your parenthetic gloss “(thus it is reversible)”. Do you mean “in such a way that it is reversible”? Not all mappings achieve reversibility.
    I don’t do Japanese, so I’ll pretty well take your word for it with hiragana. Except, aren’t the standard “transliterations” of じ–ぢ and ず–づ irreversible? And isn’t ん “transliterated” differently according to context? Of course there are similar issues with transliteration of Greek iota subscripts, breathings, and the combinations γγ and γκ, etc. These would have to be accommodated in any truly rigorous definition of our terms.
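
    The reversibility question can be made concrete: a character-level mapping can only be undone if no two source characters share an output. A small sketch, using just the four kana mentioned above with their Hepburn and Nihon-shiki romanizations; the helper names collisions and is_reversible are invented for the example.

```python
from collections import defaultdict

# Only the four kana under discussion; real romanization tables are much larger.
HEPBURN = {"じ": "ji", "ぢ": "ji", "ず": "zu", "づ": "zu"}
NIHON_SHIKI = {"じ": "zi", "ぢ": "di", "ず": "zu", "づ": "du"}


def collisions(mapping: dict[str, str]) -> dict[str, list[str]]:
    """Return each output that more than one source character maps to."""
    sources = defaultdict(list)
    for src, dst in mapping.items():
        sources[dst].append(src)
    return {dst: srcs for dst, srcs in sources.items() if len(srcs) > 1}


def is_reversible(mapping: dict[str, str]) -> bool:
    """A per-character mapping is reversible only if it has no collisions."""
    return not collisions(mapping)


print(collisions(HEPBURN))         # {'ji': ['じ', 'ぢ'], 'zu': ['ず', 'づ']}
print(is_reversible(NIHON_SHIKI))  # True
```

    This checks only the character table. Whole-text reversibility would also require that the output can be segmented unambiguously, and context-dependent renderings such as those of ん are outside what the sketch models.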

  46. Etienne, thanks for your observations – it does make sense.
    Waldschnepfe – pizdrik
    Bathrobe, sorry, I should have explained. The older name has been wiped out, because of its link to ‘unprintable’ swear lexicon (maternye slova) – the c- word, the verbal form of which means ‘chatter, blabber’ and, in the case of woodcock, refers to its characteristic mating cry. I learned this word when hunting in the North of Russia, but some dictionaries attribute it to lapwing or oriole.

  47. michael farris says:

    Etienne explained my point better than I would have, so big thanks to him.
    And as chance would have it, I’m just home from a conference where one of the papers was on ….. (wait for it) … “Language Policy in Kazakhstan”.
    Brief out of order summary from the paper and my questions to the author afterward.
    Kazakh could easily serve as the dominant state language but is lagging behind Russian for a number of reasons including (but not limited to) the very cozy relationship between the Kazakh and Russian leadership and serious efforts by Russia (similar to British and French efforts in former colonies) to maintain the dominant position of the language in the country. It’s also a large market for Russian language products, so urban Kazakhs read international works in Russian rather than Kazakh translations.
    There are also typical code and domain switching issues that will be familiar to anyone familiar with colonial-like language situations. Private conversations in Kazakh turn into Russian for more public use. A Kazakh speaker will ask a bus driver (also Kazakh) questions in Russian. Buyer and seller (each speaking Kazakh privately a moment before) will negotiate the final price in Russian, that kind of thing.
    The Romanization issue is for all intents and purposes dead (and has been since 2007) no matter what some random minister says now. The only result is a very adequate roman transliteration for those that need or want such a thing.
    Good written materials exist in Kazakh but distribution issues abound so that even those in the market for them have trouble finding them.
    A lack of the mechanisms of civil society means that there’s no long term thought about what language policies are in the best long term interests of the country. Families that can afford to do so send children abroad to learn English (perceived as a competitive domestic advantage).
    I asked if any countries in the region have anything like a good language policy and was surprised to hear …. Turkmenistan, which after some missteps has gotten its romanization in order, and the spheres of use of the language are increasing. Uzbekistan is a mess because of dialect compatibility issues with any written standard.

  48. I have no idea why ‘pizdrik’ (пиздрик?) would have been changed to вальдшнеп
    For the same reason coney has been changed to rabbit in English, as Sashura explains above (the “c-word” in Russian is pizdá).
    michael farris: Thanks very much for that informative comment!

  49. Michael,
    on Turkmenistan, did you get the impression that their progress in romanisation is influenced by or related to Azerbaijan and Turkey?
    And thanks for that summary.

  50. LH: I didn’t know about coney, what a horizon it opens. ‘I have a cunning plan’ – would that be related to coney, the rabbit?

  51. michael farris says:

    Didn’t have time to discuss the Turkmen case in more detail though I might be meeting with the person who gave the paper sometime in the not too distant future.
    I do know that the original romanization plans were pretty …. eccentric and they got a lot more realistic, limiting the letters to ones that are fairly common and therefore easy to reproduce in various media (even if there’s a motley appearance as it mixes haceks, cedillas and umlauts).
    AFAICT the former crazy leader did have something to do with expanding the range of use of the language and that habit has survived him.

  52. Bob Violence says:

    Turkmen Wikipedia has what may or may not be an accurate rundown of the various scripts, including the “eccentric” 1993 alphabet (with £, $, ¢ and ¥!). (Why no [ɯ] in the 1927 script?)

  53. Wow, that 1993 alphabet is really something.

  54. I believe the 1993 alphabet was based on what could be wedged into ISO 8859-1, which for example did not have a capital Y-umlaut (Ÿ), hence the use of the yen sign.
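
    The ISO 8859-1 constraint is easy to check with Python’s latin-1 codec, which implements that character set. This is only a sketch of the repertoire test (the helper name fits_latin1 is made up); it says nothing about how the alphabet’s designers actually reasoned.

```python
def fits_latin1(ch: str) -> bool:
    """True if the character can be encoded in ISO 8859-1 (Latin-1)."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# Lowercase y-umlaut and the currency signs are in the repertoire;
# capital Y-umlaut and letters such as Ž and Ş are not.
for ch in "ÿŸ¥£¢$ŽŞ":
    print(ch, fits_latin1(ch))
```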

  55. Sashura: No, no connection. Cunning is a native word, built on the old meaning of can ‘know’, with cognates ken in Scots and kennen in German. The old name for a wizard or sorcerer in English was cunning-man, that is, one who knows.
    Coney on the other hand is a borrowing from the Romance languages, and perhaps cunny, its former homonym, is too; nobody knows for sure. Nowadays coney is used only in place names and the Bible, and is usually pronounced with [o] rather than historic [ʌ].
    The word that displaced it, rabbit, originally referred only to the young animal. It appears in French and Dutch too, but its further relations are not known with certainty: it may have been transferred from a Dutch and Low German word for ‘seal’, presumably some sort of comparison of the fur.

  56. marie-lucie says:

    JC: rabbit, … appears in French
    ????

  57. @John Cowan: “Nowadays coney is used only in place names and the Bible, and is usually pronounced with [o] rather than historic [ʌ].”
    It’s also used (or was used, at least in my childhood in the ’70s) instead of ‘rabbit’ when referring to the fur used in coats – I guess because ‘rabbit’ is obviously cheap (compared to, say, mink) and not everyone would immediately identify ‘coney’.

  58. ????
    Here’s the brand-new (Sept. 2010) OED etymology of rabbit:
    [Apparently < an unattested Anglo-Norman or Middle French *rabotte (French regional (central) rabotte rabbit, rabbit hole), with suffix substitution (see -ET suffix1); the French word would represent a form with dissimilation of o in the first syllable (contrast French regional (Walloon) robète, robett rabbit, with suffix substitution) < *robotte < an unattested Middle Dutch noun corresponding to early modern Dutch robbe rabbit (1599 in Kiliaan; Dutch regional (West Flemish) robbe, also ribbe, rubbe; of uncertain origin: see below) + Middle French -otte, feminine form corresponding to -ot -OT suffix; compare the early modern Dutch diminutive form robbeken (1599 in Kiliaan); compare also Middle French, French rabouillère rabbit burrow (1564 as rabolliere; 1542 in general sense ‘hole’). Compare post-classical Latin rabettus (1407 in a British source), robettus (1473 in a British source), both in sense ‘young rabbit’.
      With early modern Dutch robbe rabbit, perhaps compare Middle Dutch robbe seal (1488; Dutch rob), cognate with West Frisian robbe, rob, German regional (Low German) Robb (> German Robbe (1618)), all in sense ‘seal’, of uncertain and disputed origin; however, the connection between the two animals is not immediately obvious; for a discussion of this and possible ulterior etymologies see A. Liberman in Gen. Linguistics 35 (1997) 108-19.]
    Now, that’s what I call etymology. Amateur etymologists should be forced to examine every link in that chain and figure out why it’s there until they understand what valid evidence is and how much work it is to assemble it.

  59. marie-lucie says:

    LH, thank you for looking this up. Those French words are new to me! They are all from various dialects, and never made it into the standard language. I have never seen them mentioned in lists of French words of Germanic origin.
    All those words in rab- or rob- have to do with holes, and another word that could be added to the mix is the dialectal verb rabouiller, from which derives the noun pair le rabouilleur/la rabouilleuse. The feminine form is known from Balzac’s novel of the same title, and Balzac explains the verb rabouiller as to stir up the water and mud in a waterhole in order to confuse the fish or crayfish and catch them more easily. This was the childhood occupation of the female main character, who came from a very poor family.

  60. What is this saying about bouiller? There seems to be something about stirring up the water with a bouille, whatever that is, and fishing, and even a rabot, but I haven’t got the French to put the pieces together. Are there holes here, other than the gaping holes in my understanding?

  61. Trond Engen says:

    Rather than assembling evidence, here are a couple of stray (and mutually exclusive) thoughts from this amateur etymologist:
    There’s a North/West Norwegian word kobbe “seal” with no known etymology — not to me, anyway. One would have to work to unite the continental r and the Scand. k, though.
    Could the semantic connection between the rabbit and the seal be that they’ve both been seen as prey? According to etymonline (I’ve called it that before!) the verb rob is

    late 12c., from O.Fr. rober, from a Germanic source (cf. O.H.G. roubon “to rob,” roub “spoil, plunder;” O.E. reafian, source of the reave in bereave; see reave), from P.Gmc. *raubojanan, from *raub- “to break.”

    This verb early got the meaning “steal, plunder”, as in No røve. The noun rov n. means “plundering”, extended to “loot”, then to “prey”. Rovdyr is the Scandinavian word for “carnivore”. Bjorvand & Lindeman tell me that this semantic extension can be seen even in (an unspecified old variety of) German raupa theo “the loots taken from the fallen”. Here raupa is a fem. acc. pl. descended from *ráub/fo-.

  62. marie-lucie says:

    Ø: What is this saying about bouiller? There seems to be something about stirring up the water with a bouille, whatever that is, and fishing, and even a rabot, but I haven’t got the French to put the pieces together.
    I can’t help you here: bouiller here is obviously dialectal, and there is no definition of bouille meaning the tool used.
    Un rabot is a “plane” (a tool for planing wood), but it might be something quite different in this context, although probably not a rabbit.
    It is possible that rabouiller is related to this bouiller, but I thought it might go with the “rabbit” word group because of the word rabouillère “rabbit burrow” mentioned in the etymological entry (but the dictionary might be wrong here if the word comes from ra-bouiller rather than rab-ouiller). I think that one would have to look through dictionaries of French dialects (there are some local ones, but, to my knowledge, not a comprehensive one).
    Trond: kobbe and robbe (etc) are too far from each other to be related – there is no instance of an alternation between [k] and [r] anywhere else. But perhaps Kobbe is related to English cub? That the same word means a young animal in one language, an adult in another, is not uncommon (as in the etymological discussion above).
    “Seal” and “rabbit” are both prey animals, but where seals live there are not too many other prey animals (and catching a seal is not easy), while rabbits tend to be plentiful and easy to catch (by snaring), so not something an adult hunter would be proud of. Whether the root rob- or rab- is related to raub would need the expertise of a Germanic specialist.

  63. Round up the usual Germanic specialists!

  64. Trond Engen says:

    marie-lucie: I meant “have to work” as an understatement. I know well that it’s phonetically impossible. I just had this curious thought that there could be some sort of rhyme association involved. But since the two words don’t seem to overlap geographically, it’s probably a silly idea.
    And now that you say it, the relation kobbe ~ cub seems obvious. etymonline lists it too, I see, but it won’t give an etymology beyond that.
    I had to go before I finished my line of argument on the ‘rob’ word. I was going to say that apart from the neuter and feminine derived from the full grade, there’s also a zero-grade attested. Also, the raub/rob alternation would seem to be attested even within the listed French relatives of ‘rabbit’. Finally, there’s a tendency in Dutch and neighbouring Germanic varieties to have doublet forms with gemination, a tendency that’s part of the case for the Central West Germanic substrate hunters.
    Anyway, back on my computer I wanted to find something authoritative on these doublets and stumbled upon this paper on Raupe (and similar) “caterpillar”. It handles my possible etymology in passing.
    And yeah, a specialist would be better.

  65. A fishing implement called bouille, and another called rabot (scroll down to sense 11).

  66. marie-lucie says:

    Good work, Ø!
    My guess is that since la bouille has a round end, le rabot has a flat one (like the mason’s tool). Both are used to stir the muck at the bottom of the water. (It does not sound like a legal fishing technique, but I could be wrong).

  67. There’s nothing like posting based on a vague memory when you’ve forgotten the details, as I did above. Thanks to all Hattics for making things ship-shape and Bristol fashion.
    m-l, are there actually laws in France against certain fishing techniques? (Besides dropping dynamite into a pool to “catch” all the fish in it, I mean.) I know of laws limiting when and where one may fish, and how many fish one may take, and regulating the use of natural vs. artificial bait, but not of methods that are prohibited absolutely. Still, I am no fisherman.

  68. marie-lucie says:

    JC, I really don’t know if there are prohibited methods of fishing, especially for non-industrial fishing. I know that for hunting rabbits, snares are prohibited, even though when I was young many people still did it (I don’t know about now – to be successful you need to know the woods and the animals’ habits, and modern life does not encourage such knowledge).

  69. I know that for hunting rabbits, snares are prohibited, even though when I was young many people still did it
    I’m envisioning a jolly, snub-nosed character in a Marcel Pagnol movie, checking his illegal traps and talking in a thick southern accent.

  70. David Marjanović says:

    Other multiscript languages include Azeri (Arabic, Latin, Cyrillic), Hausa (Arabic, Latin), Kurdish (Arabic, Latin), Mongolian (Cyrillic, Mongolian), Panjabi (Arabic, Gurmukhi), Tachelhit (Latin, Tifinagh), Uzbek (Arabic, Cyrillic, Latin), and of course Mandarin Chinese with its simplified and traditional characters.

    Most of these are written with different scripts in different places and/or by people with different religions. Azeri and Uzbek are, in their own republics, transitioning from Cyrillic to Latin, and I don’t know if that transition is now complete.
    Serbia and Montenegro don’t work that way. If you stand in the streets and don’t know both alphabets, you’re illiterate.
    Cyrillic does have connotations of nationalism and Orthodoxy, but they aren’t as strong as one would think. Conversely, I think the book NAJBOGATIJI SRBI SVETA (“The world’s richest Serbs”) is written in Latin script specifically so that the author was able to spell himself as MARKO LOPU$INA as a visual pun on his š (Cyrillic ш)… and yes, it is specifically in Serbian; in Croatian, the title would end in svijeta.

    Can someone explain the Tajik proposal to me?

    To get rid of the unitary letters for /je/, /jo/, /ju/ and /ja/.
    Azeri Cyrillic had done the same and introduced the letter J j for /j/, like Serbian.

    But in Serbia (and Montenegro) there is stable bi-scriptism: every Serbian-speaking child learns both alphabets as a matter of course, though I do not know in detail which, if either, is taught first. (I can imagine them being taught simultaneously, as upper and lower case are taught simultaneously.)

    I think my father was taught them on alternate days. In any case he had to write his homework in alternate alphabets on alternate days.

    The scripts are so thoroughly fused in that when doing crossword puzzles in Latin script, the Latin digraphs lj nj dž that are equivalent to a single letter each in Cyrillic are written in a single square: one can write either LJ U B LJ A N A or Љ У Б Љ А Н А in the same seven squares.

    Indeed. And even in Croatia, they’re considered single letters for all purposes. On money exchange offices you can find vertical signs saying
    M
    J
    E
    NJ
    A
    Č
    N
    I
    C
    A
    with each of the ten letters in a separate square, and in handwritten signs in all-caps you can find Nj and Lj with a sort of subscript j.
    This does not extend to Slovenia, though.

    It seems like a missed chance that Turkmenbashi-era Turkmenistan apparently ditched Cyrillic for a boringly roman-based alphabet for Turkmen as soon as the USSR dissolved, rather than devising a brand-new script as a tribute to the unique genius of the Turkmen people and their fearless leader — who engaged in other innovations such as renaming the months of the year etc.: http://en.wikipedia.org/wiki/Renaming_of_Turkmen_months_and_days_of_week,_2002. Although to be fair, since neither North Korea’s rulers nor Myanmar’s have discarded the pre-existing alphabets for their country’s dominant languages, inventing and imposing new writing systems may have fallen out of the Deranged Supervillain playbook.

    LOL! Well, it’s unique enough in the details: rather than using ñ for /ŋ/ the way the Tatar Zamanälif (“modern alphabet”) does, it uses ň; it uses ş /ʃ/ and ç /tʃ/ just like Turkish, but for /dʒ/, it uses j rather than c — in fact, it doesn’t use c at all, even though it uses ç; /ʒ/, which is probably limited to Russian loans, is ž (in Turkish, where it’s limited to French loans, it’s j); and then there’s the stupid ý for /j/ — y and j are both otherwise occupied, because Türkmenbaşy refused to introduce the Turkish ı for /ɯ/. (Yes, the good man himself is credited with the Latin orthography. Whether that’s true or not amounts to the same thing.)
    North Korea does handle certain morphophonemic issues differently than the South, but apart from that, no change to the Korean script could ever be sold as an improvement!
    The generals of Myanma“r” don’t look to me as if they cared about such things. Power comes out of gun barrels.

    (> German Robbe (1618)), all in sense ‘seal’

    Well, no, “seal” (Phoca) is Seehund. (…Hm. And here we have the two uses of italics clashing violently. I’m too tired to do anything about this. I mean, I could use <i> around Phoca and <em> around Seehund, but nobody would notice…) Robbe is the cover term for seals and sea lions and perhaps walruses, “pinniped”.

    Rovdyr is the Scandinavian word for “carnivore”.

    Same in German: Raubtier. It’s nowadays avoided, though, because it contains all the criminal associations of rauben “rob”, Räuber “robber”, and Raub “robbery” plus the implication of throwing ecosystems off balance that comes from berauben “bereave” — all these connotations imply “exterminate them all”.

    Finally, there’s a tendency in Dutch and neighbouring Germanic varieties to have doublet forms with gemination, a tendency that’s part of the case for the Central West Germanic substrate hunters.

    Curiouser and curiouser!!!

    this paper on Raupe (and similar) “caterpillar”.

    …Wow. I’ll need to read the whole thing.
    I cannot at all confirm the absence of that word from Bavarian/Austrian dialects (mentioned on the second page), but it may well be a recent import there; the /p/ is definitely suspicious.
    Rupfen looks good, rülpsen is just full of win…

  71. Trond Engen says:

    m-l:
    “Seal” and “rabbit” are both prey animals, but where seals live there are not too many other prey animals (and catching a seal is not easy), while rabbits tend to be plentiful and easy to catch (by snaring), so not something an adult hunter would be proud of.
    I meant to address this yesterday, but I forgot. Slow and clumsy as they were on land, seals were important prey for coast dwellers until their virtual extinction. The tidal flats along the North Sea shore must have abounded in them. And even after their numbers dropped they would have been sought after. The rabbit was the prey for snare hunters, no matter what the leisure hunters may have thought of it. And some more speculation: Might we even include the Rebhuhn?
    DM:
    Robbe is the cover term for seals and sea lions and perhaps walruses, “pinniped”.
    Ah, that’s a concept worthy of a term. But the extension of reference isn’t enough to change my semantic conjecture. They were all food.
    Curiouser and curiouser!!!
    The substrate theory? I don’t know it too well, that’s why I went searching, but I think there are at least three camps: Those who see a Celtic substrate (which shouldn’t be too controversial), those who see a non-Celtic IE (“Belgic”) substrate, and those who see a non-IE substrate. And Vennemann who sees a Semitic substrate…
    Wow. I’ll need to read the whole thing.
    I’ll be looking forward to that. I’ll even try to reread it for comprehension myself! But as I meant to imply, I think it’s only borderline relevant to the question at hand.

  72. marie-lucie says:

    Trond: seals and rabbits
    Perhaps my earlier post was not clear. It is not uncommon for a language to have a word for “meat” which was originally the name of the most prestigious prey animal. This animal needs to be fairly large (providing lots of meat) and demand individual hunting skill. In the Arctic, a seal or other pinniped would qualify (hunting seals at holes in the ice is not obvious, as the seals appear and disappear infrequently and with lightning speed), but snaring rabbits in the woods does not carry much prestige (as opposed to shooting a deer or moose, for instance). This is why I think that calling seals and rabbits by the same or a similar name on the basis that they are both prey animals seems unconvincing. If it turns out that the same word is indeed used, its original meaning could have been something else.

  73. Trond Engen says:

    The paper I referred to seems to imply that the two are united in a meaning “animal with whiskers”. I think there’s an assumption of an unattested meaning “brush (n.)” in there.

  74. marie-lucie says:

    I did not have time to read your paper thoroughly, but that would make more sense than “prey” as the common theme.

  75. David Marjanović says:

    The substrate theory?

    The idea of a substrate for specifically Continental West Germanic. But maybe I’ve misunderstood something.
    The idea of a substrate for Germanic in general is not new to me. Except…

    I don’t know it too well, that’s why I went searching, but I think there are at least three camps: Those who see a Celtic substrate (which shouldn’t be too controversial), those who see a non-Celtic IE (“Belgic”) substrate, and those who see a non-IE substrate. And Vennemann who sees a Semitic substrate…

    Celtic substrate? All I know of are a few Celtic loans.
    I’ve never heard of “Belgic”. Have you got any links?
    Vennemann sees a Semitic superstrate, and I must say, it looks fairly convincing… even though I’d restrict it to the good old Phoenicians and leave the megaliths far out of it. My impression is he regularly takes very good ideas and runs way too far with them. (He has recently stopped supporting the glottalic hypothesis of PIE, or so I hear.)

  76. Trond Engen says:

    In this company we can safely assume that all misunderstandings are on my part.
    The tripartition was my own brief summary, based on following over-my-head discussions where names like Schrijver and de Vries have been dropped. Substrate hunting is mainly a Leiden school thing, and I now suspect my confining it to Central West Germanic may be an artefact of how far the influence from Leiden reaches. Anyway, I don’t have a good reference for “Belgic”, but note e.g. endnote 4 in this paper.
    If Vennemann’s Semitic hypothesis is limited to loans in a semantic field connected to, say, a Phoenician trading network during the later Nordic Bronze Age, I don’t understand the fuss.

  77. David Marjanović says:

    Anyway, I don’t have a good reference for “Belgic”, but note e.g. endnote 4 in this paper.

    Thanks. Looks like it’s an idea from the 1950s that was based on placenames and hasn’t received any attention since. :-/
    * * * * * * * * * * * * *
    Incidentally, that’s just about the worst possible paper on its subject. It confuses (at least) two very distinct processes. One is the change of velar plosives next to front vowels into affricates or fricatives, more precisely assibilates or sibilants, because the vowels draw the place of articulation of the consonants forward; the other is the change of all plosives in almost all positions into affricates or fricatives at the same place of articulation because the consonants are aspirated to the breaking point.
    The first is such a common phenomenon that its occurrence in Anglo-Frisian simply doesn’t need an explanation. It’s something that happens just so. Romance except Sardic is one example, the four “palatalizations” of Slavic are another, the change of *ki, *gi, *hi, *kü, *gü, *hü into qi, ji, xi, qu, ju, xu in Standard Mandarin is one more…
    The second is considerably less common; the only certain examples I know, off the top of my head, are the High German consonant shift and what happened to the Greek aspirates. I’m not sure about Hebrew and not even about Grimm’s Law.
    * * * * * * * * * * * * *
    The paper does mention a few interesting facts.
    One of them is the etymology of Catalonia from Gothalania, which suggests that the Gothic /g/ was voiceless and was therefore interpreted as /k/ by Latin/Romance speakers. The paper says there are no other known examples of this phenomenon; I have read somewhere (Wikipedia?), however, that some etymologists think pizza is a Germanic loan, cognate to English bit(e) and German Bissen, think “fast food”, and that this fits other unspecified evidence which suggests that the Longobards started the High German sound shift. (The alternative etymology of pizza is pita, which fits well enough meaning-wise, but what would have triggered that assibilation within Italian?)
    Immediately, the paper goes on to conclude that, in Germanic, the fortes (/p t k/) have been aspirated by default and the lenes (/b d g/) voiceless since at least the Migration Period, so the absence of aspiration in Dutch and Frisian requires an explanation. I forgot if it was here or on John Wells’s Phonetic Blog that the lack of aspiration in Dutch (where, additionally, the lenes are fully voiced) was discussed and was, for lack of an alternative explanation, ascribed to French influence. French influence on Sater Frisian, spoken in a pocket in northern Germany, seems very unlikely to me, yet the paper explicitly says that most speakers of Sater Frisian don’t aspirate, while the surrounding varieties of German are aspirated. Therefore the paper posits substrate influence, and I can’t argue against this.
    Here’s my own knowledge about this issue:
    Germanic
    Icelandic: fortes aspirated as strongly as in Chinese; lenes voiceless, and sometimes they go all the way and turn into unaspirated fortes.
    Danish: Lenes voiceless where not turned into fricatives or dropped. I didn’t pay enough attention to the fortes during the few days I spent in Copenhagen this June.
    Faeroese, Norwegian, Swedish: No idea. Please help me out.
    Dutch, Frisian: Fortes unaspirated, lenes voiced like in French or Slavic or Japanese. (Never mind word-final fortition.)
    English: Fortes weakly aspirated by default, lenes unreliable – always voiced in singing, otherwise devoiced to variable degrees depending on the accent and the position in a word. Aspiration strong enough that fortes behind /s/, where they are unaspirated, become lenes most or all of the time (something I need to work on in my own pronunciation of English).
    Low and AFAIK Middle German, and Standard German spoken in places where Low or Middle German dialects are or were spoken: fortes weakly aspirated by default, lenes fully voiced. (Never mind word-final fortition.)
    High German except Carinthian, and Standard German spoken in places where High German dialects are or were spoken: old fortes except initial /k/ (exception of exception: Tyrolean and Alemannic) eliminated by High German consonant shift, new fortes (from High German consonant shift and from loans) unaspirated (never mind the word-final lenition of /t/ in most positions); lenes fully voiceless, so that historical spellings are often confused; initial /k/ sort of borderline aspirated except where fortes have become long lenes (Alemannic) or merged with the lenes altogether (Swabian, eastern Austrian dialects).
    Carinthian (High German dialect close to Tyrolean, but with big fat Slovene substrate): /k/ is a cluster [kh], /p t/ unaspirated, lenes fully voiced (I’m not sure what happens word-finally).
    Celtic
    Scottish Gaelic: Fortes very strongly aspirated by default, lenes voiceless, fortes behind fricatives have become lenes even in writing. This is blamed on the Vikings in what I’ve read.
    Irish Gaelic: Same, except I don’t know how strong the aspiration is, and fortes behind fricatives are still written as such.
    Welsh: Like English, says Wikipedia.
    Breton: AFAIK like French, Dutch and Frisian.
    Can we even reconstruct whether a Celtic language spoken in the Netherlands would have had aspirated fortes and/or voiced lenes?
    In fact, assuming Anglo-Frisian, can we even tell how old aspiration is in English? Could it have been imported from Late Old West Norse (and then passed on to Welsh)? Or perhaps dialect mixture between the Angles, Saxons and Jutes was responsible? ~:-|

  78. Trond Engen says:

    Hey, I didn’t mean to recommend the paper. I just saw an example of ‘Belgic’ in use.
    Every description of Scandinavian phonology will tell you that Norwegian and Swedish fortes are aspirated except in clusters. I will say on my own account that (at least allophonically) the lenes are very lene, so much so that a foreigner may perceive /b-/ as /mb-/ or /β-/. I don’t think this is common in all dialects, though.

  79. David Marjanović says:

    Hey, I didn’t mean to recommend the paper. I just saw an example of ‘Belgic’ in use.

    I know. I just engaged in topic drift. :-)
    Thanks for the information on .no and .se.
