NOBODY IS REALLY TOO KEEN ON A CYRILLIC NTA.

November 20, 2010 by languagehat 101 Comments

Mark Liberman at the Log writes about a Kyiv Post article by Paul Goble that begins: “A statement by a Kazakhstan minister that his country will eventually shift from a Cyrillic-based alphabet to a Latin-based script and reports that some scholars in Dushanbe are considering dropping another four Russian letters from the Tajik alphabet suggest that a new battle of the alphabets may again be shaping up in Central Asia.” It’s well worth reading, far better informed than most journalistic attempts to deal with linguistic matters, and contains interesting links. And it gives me a chance to plug some of my favorite Log posts of all time, those dealing with Central Asian alphabets: How alphabetic is the nature of molecules, Birlashdirilmish yangi Turk alifbesi, and Vaslav Tchitcherine, call your office—not to mention my own Language in Central Asia. As I said there, be grateful you weren’t trying to become literate in that part of the world in the 1930s.

Comments

michael farris says

November 20, 2010 at 11:34 am

“be grateful you weren’t trying to become literate in that part of the world in the 1930s”
or now….
J-P says

November 20, 2010 at 1:28 pm

This article is primarily about orthographies, so, in that sense, is not about ‘linguistic matters’. You know.
boo says

November 20, 2010 at 5:14 pm

Poland chose Latin, Ukraine chose Cyrillic, and I’ve always assumed (without any evidence) that this reflected who had the upper hand in politics at the time: western-oriented or Russian-oriented politicians. If I’m wrong, I’m sure I’ll be corrected (and gladly so), but it seems that this current controversy is also primarily political, denials in the Kyiv Post article notwithstanding.
Vasha says

November 20, 2010 at 5:29 pm

This article is primarily about orthographies, so, in that sense, is not about ‘linguistic matters’

Writing down a language involves, at a minimum, some analysis of its phonology and morphology; and issues of history, variation, and yes, language politics will get in there — if this isn’t “linguistic matters” what is?
marie-lucie says

November 20, 2010 at 5:48 pm

Poland chose Latin, Ukraine chose Cyrillic
Wasn’t this originally because of the prevailing religious traditions?
bulbul says

November 20, 2010 at 10:25 pm

What vasha said. Also, what m-l said. That decision came down long ago, in times when language policy was a matter much more closely associated with religion than the whole nexus of secular politics.
marie-lucie says

November 20, 2010 at 10:41 pm

There is also the parallel use of Cyrillic for Serbian and Roman for Croatian (even though those languages are almost the same), because of adherence to the Orthodox or the Catholic tradition.
Similarly, Persian and Urdu, which are Indo-European languages, are written with the Arabic script because their speakers adopted Islam, but Hindi, which is very similar to Urdu, uses the Devanagari script which is associated with the scriptures of the Hindu religion.
John Cowan says

November 20, 2010 at 11:33 pm

Serbian is really Cyrillic/Latin, whereas Croatian is just Latin. It’s not uncommon, for example, for people to send a Latin-script article to a publisher, even though it will be typeset in Cyrillic. Transliteration with Unicode is easy, automatic, and perfect.
Other multiscript languages include Azeri (Arabic, Latin, Cyrillic), Hausa (Arabic, Latin), Kurdish (Arabic, Latin), Mongolian (Cyrillic, Mongolian), Panjabi (Arabic, Gurmukhi), Tachelhit (Latin, Tifinagh), Uzbek (Arabic, Cyrillic, Latin), and of course Mandarin Chinese with its simplified and traditional characters.
michael farris says

November 21, 2010 at 7:00 am

I assume that the big winners in this script turmoil will be Russian (as de facto working language of bureaucracy and media) and English.
Constantly messing with the script of a language is a very good way of sabotaging literacy and literary output.
As to the specifics, Kazakh has been making periodic noises about romanization for a long time and they have an okay system in place (essentially a transliteration of the cyrillic but with the hard sign which is useless for Kazakh). The best solution imho would be to declare both official and let the marketplace of native users decide. But then large scale language reformers generally don’t care what the marketplace of native users wants….
The Tajik proposal doesn’t seem so radical if it’s just dropping letters that aren’t needed (and respelling loan words to correspond better with local pronunciation).
Sashura says

November 21, 2010 at 7:14 am

Other multiscript languages
I’d add Japanese, of course, with two alphabets, Chinese characters/kanji and romaji/Latin all co-existing. And versatile left-to-right, right-to-left, horizontal and vertical lines.
But in Europe the Serbian example is really impressive.
marie-lucie says

November 21, 2010 at 7:52 am

The best solution imho would be to declare both official and let the marketplace of native users decide. But then large scale language reformers generally don’t care what the marketplace of native users wants….
“Native users” are not all adults purchasing books and newspapers. They are also schoolchildren learning to read and using textbooks. If there are two official alphabets, who decides which one to use in schools (at least for beginning readers)? The situation is different if there are two languages, because in that case there are also two different populations.
michael farris says

November 21, 2010 at 8:12 am

What do they do in Serbia?
To be clear I don’t think there’s much point in Kazakh switching from a very workable cyrillic orthography to a basically identical roman based one. I have no particular idea if existing literate speakers want to do this or not (and i kind of doubt it).
If they are interested in going ahead with it anyway for whatever reason then gradual incrementalism with a transitional period of biscript literacy would be a better idea than whatever it is they think they’re doing now.
Sili says

November 21, 2010 at 9:19 am

Off topic (as always), but I just saw these gorgeous colour photographs from early last century Russia:
http://www.huffingtonpost.com/2010/11/20/rare-color-photos-of-the-_n_785798.html#186298
marie-lucie says

November 21, 2010 at 9:29 am

Does Serbia use both alphabets then? I knew that was true of the former Yugoslavia (eg on stamps), but I did not know that Serbia was still doing it.
michael farris says

November 21, 2010 at 9:55 am

yes, it isn’t hard to find web publications in either
here’s one in cyrillic
http://www.politika.rs/
and here’s one in latin
http://www.danas.rs/danasrs/naslovna.1.html
IIRC cyrillic is official for bureaucracy but latin use is very widespread (and possibly growing).
Don’t know anything about relative script numbers for book publishing or how the problem is dealt with at school.
michael farris says

November 21, 2010 at 9:59 am

Also, one interesting side effect of the Serbian situation is that foreign names in latin are usually written as if transcribed from cyrillic. I remember seeing a movie magazine in latin script with references to Voren Bejti and Edi Marfi, etc.
boo says

November 21, 2010 at 10:45 am

Thank you for the clarification re Catholic/Orthodox religions.
The example mentioned above, of simplified versus traditional Chinese, is a good warning of what can happen when politicians take over writing systems. When I was in the biz, I used human editors to convert between the two, because the differences were great enough that no automatic conversion facility worked completely.
An example of romanization gone wrong: in Japanese, a relatively simple language to romanize, there are three major romanization systems in use, and even the government is inconsistent about it (it allows Hepburn for passport names, for example, even though it officially uses Kunrei-shiki in education).
marie-lucie says

November 21, 2010 at 11:36 am

one interesting side effect of the Serbian situation is that foreign names in latin are usually written as if transcribed from cyrillic.
This seems to indicate that cyrillic is the dominant alphabet.
Sashura says

November 21, 2010 at 2:45 pm

What do they do in Serbia?
Does Serbia use both alphabets then?
yes, that’s what I meant: Serbia is the only country in Europe where Latin and Cyrillic peacefully coexist and are, I think, both officially recognised. It’s not surprising – it’s there that the Roman Commonwealth (Latin) meets the Byzantine Commonwealth (Greek).
Etienne says

November 21, 2010 at 2:50 pm

It seems to me the Tajiks are the sensible ones here. In the wake of the fall of the Soviet Union the various former Soviet Republics wished to distance themselves from Russia and the Soviet past: that is perfectly understandable.
But surely there must have existed ways of doing so which did not involve instantly transforming a whole generation into illiterates? Mass literacy in the various vernacular languages of the region may well be the only positive legacy of Soviet rule: it seems unwise to endanger this legacy.
Hence the sanity, to me, of the Tajik approach: keeping Cyrillic, but modifying it somewhat, making the orthography less Russian-like. Thereby establishing a distance from an unattractive Soviet past without ditching its chief (sole?) positive accomplishment: vernacular mass literacy.
And I agree with Michael Farris: Russian and English will be the winners here. The lack of a well-defined norm, orthography (or even script) in the language(s) of various newly-decolonized countries has played a major role in keeping the former imperial power’s language in its dominant position.
Sashura says

November 21, 2010 at 3:04 pm

Russian and English will be the winners here.
Michael, Etienne, could you elaborate, please? To me, it seems like an either-or situation.
Ran says

November 21, 2010 at 4:36 pm

@Sashura: When people are illiterate in their own language, but literate in a foreign language, they will start to use the foreign language for various things that they would otherwise have used their own languages for.
Sashura says

November 21, 2010 at 3:01 pm

Japanese, a relatively simple language to romanize
maybe, when you think of shi or chi, but what do you do with strong rhotic sounds, tsu, which is phonetically closer to Russian Ц (ts as in tsar or czar) and not very comfortable on an English tongue, and of course the strong KH in the ha-hi-fu-he-ho column of gojuon? My mother, whose first foreign language is French, always says ‘itashi’ for ‘Hitachi’. Most anglophones are not very comfortable with strong ‘kh’ too.
Andrew says

November 21, 2010 at 7:07 pm

Can someone explain the Tajik proposal to me? I don’t read Russian, but I looked at the article in question and found the four proposed letters to be eliminated. All of them represent Tajiki sounds and are used in Tajiki Persian words*, so I don’t understand why they are called ‘Russian’ letters. That makes it sound like they are somehow a poor orthographic fit for Tajiki, which isn’t true.
*For instance ‘e’ is the first letter in ‘Iran’, ‘e’ with the two dots is ‘ya’ یا which means ‘or’ in Persian, the backwards ‘R’ is the first letter in yak یک which means ‘one’, etc.
The Modesto Kid says

November 21, 2010 at 7:34 pm

Highly recommended reading in this regard (besides O.C. the pertinent passages of Gravity’s Rainbow) is The Dictionary of the Khazars by Milorad Pavic.
minus273 says

November 21, 2010 at 7:37 pm

‘Cause there’s absolutely no reason to spell /jV/ as a single letter other than Russian compatibility.
John Cowan says

November 21, 2010 at 7:45 pm

Japanese isn’t multiscript in the same sense as the others: it requires multiple scripts to write it at all, but there is in practice only one way of doing so. (Obviously you can write English in Shavian script or Greek in IPA, but I am dismissing such extremely minoritarian practices.)
The Serbian situation is indeed impressive. In most cases, languages are multiscript either because they have used different scripts at different times within living memory, or because different populations, often in different countries, use different scripts. But in Serbia (and Montenegro) there is stable bi-scriptism: every Serbian-speaking child learns both alphabets as a matter of course, though I do not know in detail which, if either, is taught first. (I can imagine them being taught simultaneously, as upper and lower case are taught simultaneously.)
The scripts are so thoroughly fused that when doing crossword puzzles in Latin script, the Latin digraphs lj nj dž that are equivalent to a single letter each in Cyrillic are written in a single square: one can write either LJ U B LJ A N A or Љ У Б Љ А Н А in the same seven squares. (From a historical point of view, љ and њ are ligatures of ль and нь, but Serbian views them as unitary letters and does not use ь.)
As for Edi Marfi, Serbian orthography is ruthlessly phonemic, so everything is written as Serbian-speakers would pronounce it. By the same token, Russian or Bulgarian names with щ are never written so in Serbian: they are transliterated to шч or шт as the case may be. Similarly, ю and я in other Cyrillic-script languages are transliterated to Serbian Cyrillic ју and ја.
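As a rough illustration of the digraph handling described above, here is a minimal Python sketch of a Serbian Cyrillic/Latin converter. The table and the function names (`to_latin`, `to_cyrillic`) are illustrative assumptions, not any standard library's API; the sketch handles lowercase only and simply tries a two-letter digraph before a single letter.

```python
# Illustrative sketch only: lowercase Serbian Cyrillic <-> Latin.
# The names below are hypothetical, not taken from any library.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ", "е": "e",
    "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k", "л": "l", "љ": "lj",
    "м": "m", "н": "n", "њ": "nj", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "ћ": "ć", "у": "u", "ф": "f", "х": "h", "ц": "c", "ч": "č",
    "џ": "dž", "ш": "š",
}
LAT_TO_CYR = {lat: cyr for cyr, lat in CYR_TO_LAT.items()}


def to_latin(text: str) -> str:
    """Map each Cyrillic letter to its Latin letter or digraph."""
    return "".join(CYR_TO_LAT.get(ch, ch) for ch in text)


def to_cyrillic(text: str) -> str:
    """Map Latin to Cyrillic, greedily preferring the digraphs lj, nj, dž."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in LAT_TO_CYR:
            out.append(LAT_TO_CYR[pair])
            i += 2
        else:
            out.append(LAT_TO_CYR.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(to_latin("љубљана"))       # nine Latin letters
print(to_cyrillic("ljubljana"))  # seven Cyrillic letters
```

The Cyrillic-to-Latin direction is a plain letter-by-letter lookup. The reverse direction is where care is needed: a greedy digraph match will mis-handle the occasional word in which n+j or d+ž are separate letters (инјекција, Latin injekcija, is a commonly cited case), so a production converter would need an exception list or some other disambiguation.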
marie-lucie says

November 21, 2010 at 11:21 pm

Latin and Cyrillic alphabets: I can imagine them being taught simultaneously, as upper and lower case are taught simultaneously.
Upper and lower case versions of the same letter rarely look the same, and the characters of one series cannot be confused with those of the other one. Latin and Cyrillic share some characters which have the same shape and value in both, but there are also a number of characters which are ambiguous, like B, C, H, N, P, X, Y, at least. This would make it difficult to teach the two alphabets simultaneously, as opposed to teaching one first, and adding the second one later, when the children are competent in the first alphabet.
Bob Violence says

November 22, 2010 at 2:44 am

maybe, when you think of shi or chi, but what do you do with strong rhotic sounds, tsu, which is phonetically closer to Russian Ц (ts as in tsar or czar) and not very comfortable on an English tongue, and of course the strong KH in the ha-hi-fu-he-ho column of gojuon? My mother, whose first foreign language is French, always says ‘itashi’ for ‘Hitachi’. Most anglophones are not very comfortable with strong ‘kh’ too.
Those are all issues with pronunciation, not romanization. An Anglophone (or any other foreigner) who wants to speak Japanese will learn the proper pronunciation of “tsu” or that “h-” initial is usually more fricative (and sometimes palatalized as well); everyone else will continue to use Anglofied (or Francofied, Russified, etc.) spelling pronunciations. No romanization system can be 100% intuitive across every single language, but it can still be a perfectly adequate representation of the original language’s phonology. Hanyu Pinyin poses all sorts of issues for English-speakers (q, x, zh, etc.), but it’s used pretty consistently, which gives it a leg up over the various Japanese romanizations.
Bob Violence says

November 22, 2010 at 2:49 am

And mea culpa: I know that transliteration of Japanese into Russian is a matter of cyrillization, not romanization, although it poses similar pronunciation issues. (My understanding is that Russian uses a single standard cyrillization, but not everyone understands or follows the rules. Dunno about non-Russian cyrillic scripts.)
Andrew says

November 22, 2010 at 5:14 am

“‘Cause there’s absolutely no reason to spell /jV/ as a single letter other than Russian compatibility.”
Was this in response to my question about why certain letters in the Tajiki alphabet are considered ‘Russian’? If so, what letter does it refer to? I don’t understand /jV/.
Ran says

November 22, 2010 at 8:09 am

@Andrew: /jV/ means roughly “a ‘y’ sound followed by a vowel”. Russian has several such letters, but minus273 is saying that Tajik doesn’t need such a thing, and can just as well write the two sounds separately. <я>, for example, could be written <йа>.
Andrew says

November 22, 2010 at 12:59 pm

Ahh, thanks Ran.
Sashura says

November 22, 2010 at 1:12 pm

Russian uses a single standard cyrillization
it does, you can always tell if someone has at least some knowledge about Japan by the way they spell chi-ти, not чи, and shi-си, not ши (sushi-суси-суши).
J. W. Brewer says

November 22, 2010 at 2:07 pm

It seems like a missed chance that Turkmenbashi-era Turkmenistan apparently ditched cyrillic for a boringly roman-based alphabet for Turkmen as soon as the USSR dissolved, rather than devising a brand-new script as a tribute to the unique genius of the Turkmen people and their fearless leader — who engaged in other innovations such as renaming the months of the year etc.: http://en.wikipedia.org/wiki/Renaming_of_Turkmen_months_and_days_of_week,_2002. Although to be fair, since neither North Korea’s rulers nor Myanmar’s have discarded the pre-existing alphabets for their country’s dominant languages, inventing and imposing new writing systems may have fallen out of the Deranged Supervillain playbook.
Dave B says

November 22, 2010 at 4:08 pm

@Sashura: Interesting point about cyrillization of Japanese. But I guess it means nobody in Moscow has any knowledge about Japan: the last time I was there, it was all chain суши restaurants and no суси in sight. (Presumably this means the recent sushi trend made it to Russia via the west rather than Japan.)
Sashura says

November 22, 2010 at 4:12 pm

bet it has.
But суши also reeks of сушка – both the ring-shaped tea biscuits and sun-dried fish (mostly in the North, not in Moscow)
Sashura says

November 23, 2010 at 2:34 am

illiterate in their own language, but literate in a foreign language
Ran, this is succinctly put, but not entirely correct. Surely, you don’t introduce democracy (people power), anarchy (no power – small government) or Soviet (council, local government) and sputnik (satellite), because you are illiterate in your own language, but literate in a foreign language. That’s at the high end, and at the low end, it’s not for literacy/illiteracy reasons that woodcock, in Russian originally ‘pizdrik’, was changed to вальдшнеп (Waldschnepfe).
But my question was about scripts – that’s why I am slightly puzzled with the argument that Russian and English would both be winners.
Etienne says

November 23, 2010 at 11:58 am

@Sashura: If I may elaborate. When the Soviet Union fell the various Central Asian republics all had a “national” language written in Cyrillic, which were the “low” languages, with Russian being the “high” one, sociolinguistically. Subsequent events introduced English as a second “high” language.
Attempts on the part of Central Asian authorities to extend the domains of use of their national languages (at the expense of Russian) are weakened if not nullified by these “wars of the scripts”: educated native speakers/users, feeling uncertain as to which spelling/alphabet to use, will (I sense) in the end feel more comfortable falling back upon one of the “high” languages (Russian or English) instead of their L1, which recent events make them feel they don’t really master as written codes.
A very similar situation was and is found in many formerly colonized countries: even in ones with a dominant non-European vernacular, the low degree of standardization means that even native speakers who are literate in the former colonial power’s language prefer using said language in formal contexts, including writing. Hence the dominant role still played by Dutch in Surinam, French in the Central African Republic, English in Ghana as “high” languages, with Sranan, Sango and Akan (respectively) the dominant informal, “low” languages.
I would not be surprised if a generation hence the relationship of the various Central Asian languages to Russian and English will be reminiscent of the linguistic landscape of the above countries…
Noetica says

November 23, 2010 at 6:45 pm

I know that transliteration of Japanese into Russian is a matter of cyrillization, not romanization, …
Nice caution. But a note concerning terms:
I would not use transliteration unless both source coding and target coding were alphabetic: that is, using letters (literae). I think it is best to use transcription as the general term, with romanisation (and cyrillisation, etc.) as specific kinds of transcription. Orthogonally, every transliteration is a kind of transcription. I don’t know that all experts use the terms in these regimented ways; but I think many do, and they have logic, etymology, and the need for clarity on their side.
Bathrobe says

November 23, 2010 at 8:12 pm

Waldschnepfe
Bird names and their standardisation are something else again. I have no idea why ‘pizdrik’ (пиздрик?) would have been changed to вальдшнеп, but it is possible that ‘pizdrik’ referred to more than one species, and вальдшнеп was introduced in the interest of taxonomical precision. In any case, bird names according to official taxonomies are a totally artificial beast.
John Cowan says

November 23, 2010 at 10:45 pm

Noetica: I would unhesitatingly use transliteration for a mapping between hiragana and Latin letter combinations, though hiragana are not strictly speaking an alphabet. As a Unicadet, the difference to me is primarily that transliteration maps the characters of a script onto the characters of another script without regard to a specific language (thus it is reversible), whereas transcription uses the written conventions of one language to represent the written form of another, and is typically not reversible exactly.
John Cowan says

November 23, 2010 at 10:47 pm

By the way, Hat, I keep reading the headline as a reference to a non-existent casing pair of Unicode characters, CAPITAL and SMALL LETTER NTA, whose existence is seen as undesirable.
Noetica says

November 23, 2010 at 11:33 pm

I would unhesitatingly use transliteration for a mapping between hiragana and Latin letter combinations, though hiragana are not strictly speaking an alphabet.
Good for you! I hesitated with that restriction to alphabetic, exactly; and I do not prescribe, but simply observe. Still, the points regarding etymology and clarity stand.
… that transliteration maps the characters of a script onto the characters of another script without regard to a specific language (thus it is reversible), whereas transcription uses the written conventions of one language to represent the written form of another, and is typically not reversible exactly.
I wonder about your restriction to “another [language]”. Perhaps usefully ignoring that restriction, and taking the example of a standardised Mandarin Chinese, your suggestion implies that a text in pinyin (an encoding devised by Chinese experts) prepared from one written in characters counts as a transcription, but not as a transliteration, right? Only because it is not reversible? I wonder more generally about your parenthetic gloss “(thus it is reversible)”. Do you mean “in such a way that it is reversible”? Not all mappings achieve reversibility.
I don’t do Japanese, so I’ll pretty well take your word for it with hiragana. Except, aren’t the standard “transliterations” of じ–ぢ and ず–づ irreversible? And isn’t ん “transliterated” differently according to context? Of course there are similar issues with transliteration of Greek iota subscripts, breathings, and the combinations γγ and γκ, etc. These would have to be accommodated in any truly rigorous definition of our terms.
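The reversibility question raised here can be checked mechanically: a character mapping can be inverted only if no two source characters share a target. Below is a small Python sketch under that framing. The two tables cover just a handful of kana and the helper name `collisions` is hypothetical; the Hepburn and Nihon-shiki values shown are the standard ones for these syllables.

```python
from collections import defaultdict

# Tiny illustrative subsets, not complete romanization tables.
HEPBURN = {"じ": "ji", "ぢ": "ji", "ず": "zu", "づ": "zu",
           "し": "shi", "ち": "chi", "つ": "tsu"}
NIHON_SHIKI = {"じ": "zi", "ぢ": "di", "ず": "zu", "づ": "du",
               "し": "si", "ち": "ti", "つ": "tu"}


def collisions(mapping: dict) -> dict:
    """Return each target spelling produced by more than one source character."""
    by_target = defaultdict(list)
    for src, dst in mapping.items():
        by_target[dst].append(src)
    return {dst: srcs for dst, srcs in by_target.items() if len(srcs) > 1}


print(collisions(HEPBURN))      # ji and zu each have two sources
print(collisions(NIHON_SHIKI))  # no collisions in this subset
```

On this subset, Hepburn merges the two pairs and so cannot be inverted without extra information, while Nihon-shiki keeps them apart. Context-dependent spellings such as those of ん are a separate complication that a simple one-to-one table like this does not capture.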
Sashura says

November 24, 2010 at 3:01 am

Etienne, thanks for your observations – it does make sense.
Waldschnepfe – pizdrik
Bathrobe, sorry, I should have explained. The older name has been wiped out, because of its link to ‘unprintable’ swear lexicon (maternye slova) – the c- word, the verbal form of which means ‘chatter, blabber’ and, in the case of woodcock, refers to its characteristic mating cry. I learned this word when hunting in the North of Russia, but some dictionaries attribute it to lapwing or oriole.
michael farris says

November 24, 2010 at 6:41 am

Etienne explained my point better than I would have, so big thanks to him.
And as chance would have it, I’m just home from a conference where one of the papers was on ….. (wait for it) … “Language Policy in Kazakhstan”.
Brief out of order summary from the paper and my questions to the author afterward.
Kazakh could easily serve as the dominant state language but is lagging behind Russian for a number of reasons including (but not limited to) the very cozy relationship between the Kazakh and Russian leadership and serious efforts by Russia (similar to British and French efforts in former colonies) to maintain the dominant position of the language in the country. It’s also a large market for Russian language products, so urban Kazakhs read international works in Russian rather than Kazakh translations.
There are also typical code and domain switching issues that will be familiar to anyone familiar with colonial-like language situations. Private conversations in Kazakh turn into Russian for more public use. A Kazakh speaker will ask a bus driver (also Kazakh) questions in Russian. Buyer and seller (each speaking Kazakh privately a moment before) will negotiate the final price in Russian, that kind of thing.
The Romanization issue is for all intents and purposes dead (and has been since 2007) no matter what some random minister says now. The only result is a very adequate roman transliteration for those that need or want such a thing.
Good written materials exist in Kazakh but distribution issues abound so that even those in the market for them have trouble finding them.
A lack of the mechanisms of civil society means that there’s no long term thought about what language policies are in the best long term interests of the country. Families that can afford to do so send children abroad to learn English (perceived as a competitive domestic advantage).
I asked if any countries in the region have anything like a good language policy and was surprised to hear …. Turkmenistan which after some missteps has gotten its romanization in order and the spheres of use of the language are increasing. Uzbekistan is a mess because of dialect compatibility issues of any written standard.
language hat says

November 24, 2010 at 8:41 am

I have no idea why ‘pizdrik’ (пиздрик?) would have been changed to вальдшнеп
For the same reason coney has been changed to rabbit in English, as Sashura explains above (the “c-word” in Russian is pizdá).
michael farris: Thanks very much for that informative comment!
Sashura' says

November 24, 2010 at 1:51 pm

Michael,
on Turkmenistan, did you get the impression that their progress in romanisation is influenced by or related to Azerbaijan and Turkey?
And thanks for that summary.
Sashura' says

November 24, 2010 at 1:57 pm

LH: I didn’t know about coney, what a horizon it opens. ‘I have a cunning plan’ – would that be related to coney, the rabbit?
michel farris says

November 24, 2010 at 3:33 pm

Didn’t have time to discuss the Turkmen case in more detail though I might be meeting with the person who gave the paper sometime in the not too distant future.
I do know that the original romanization plans were pretty …. eccentric and they got a lot more realistic, limiting the letters to ones that are fairly common and therefore easy to reproduce in various media (even if there’s a motley appearance as it mixes haceks, cedillas and umlauts).
AFAICT the former crazy leader did have something to do with expanding the range of use of the language and that habit has survived him.
Bob Violence says

November 25, 2010 at 1:06 am

Turkmen Wikipedia has what may or may not be an accurate rundown of the various scripts, including the “eccentric” 1993 alphabet (with £, $, ¢ and Ұ!). (Why no [ɯ] in the 1927 script?)
language hat says

November 25, 2010 at 8:21 am

Wow, that 1993 alphabet is really something.
John Cowan says

November 26, 2010 at 2:02 am

I believe the 1993 alphabet was based on what could be wedged into ISO 8859-1, which for example did not have Y-umlaut, hence the use of the yen sign.
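The encoding constraint mentioned here is easy to check. The short Python sketch below tests which characters can be encoded in ISO 8859-1, using Python's built-in `latin-1` codec; the helper name `fits_latin1` is hypothetical, and the sketch makes no claim about which sounds each symbol stood for in the 1993 alphabet.

```python
def fits_latin1(ch: str) -> bool:
    """Return True if the character can be encoded in ISO 8859-1."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# Currency signs seen in the 1993 alphabet, plus both cases of y-diaeresis.
for ch in ["£", "$", "¢", "¥", "ÿ", "Ÿ"]:
    status = "fits" if fits_latin1(ch) else "does not fit"
    print(f"U+{ord(ch):04X} {ch} {status}")
```

The check shows that lowercase ÿ (U+00FF) is in ISO 8859-1 but capital Ÿ (U+0178) is not, so a casing pair could not be formed from the diaeresis letters alone, whereas the yen sign ¥ (U+00A5) was available.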
John Cowan says

November 26, 2010 at 4:13 am

Sashura: No, no connection. Cunning is a native word, built on the old meaning of can ‘know’, with cognates ken in Scots and kennen in German. The old name for a wizard or sorcerer in English was cunning-man, that is, one who knows.
Coney on the other hand is a borrowing from the Romance languages, and perhaps cunny, its former homonym, is too; nobody knows for sure. Nowadays coney is used only in place names and the Bible, and is usually pronounced with [o] rather than historic [ʌ].
The word that displaced it, rabbit, originally referred only to the young animal. It appears in French and Dutch too, but its further relations are not known with certainty: it may have been transferred from a Dutch and Low German word for ‘seal’, presumably some sort of comparison of the fur.
marie-lucie says

November 26, 2010 at 6:16 am

JC: rabbit, … appears in French
????
stormboy says

November 26, 2010 at 6:24 am

@John Cowan: “Nowadays coney is used only in place names and the Bible, and is usually pronounced with [o] rather than historic [ʌ].”
It’s also used (or was used, at least in my childhood in the ’70s) instead of ‘rabbit’ when referring to the fur used in coats – I guess because ‘rabbit’ is obviously cheap (compared to, say, mink) and not everyone would immediately identify ‘coney’.
language hat says

November 26, 2010 at 9:00 am

????
Here’s the brand-new (Sept. 2010) OED etymology of rabbit:
[Apparently < an unattested Anglo-Norman or Middle French *rabotte (French regional (central) rabotte rabbit, rabbit hole), with suffix substitution (see -ET suffix1); the French word would represent a form with dissimilation of o in the first syllable (contrast French regional (Walloon) robète, robett rabbit, with suffix substitution) < *robotte < an unattested Middle Dutch noun corresponding to early modern Dutch robbe rabbit (1599 in Kiliaan; Dutch regional (West Flemish) robbe, also ribbe, rubbe; of uncertain origin: see below) + Middle French –otte, feminine form corresponding to –ot -OT suffix; compare the early modern Dutch diminutive form robbeken (1599 in Kiliaan); compare also Middle French, French rabouillère rabbit burrow (1564 as rabolliere; 1542 in general sense ‘hole’). Compare post-classical Latin rabettus (1407 in a British source), robettus (1473 in a British source), both in sense ‘young rabbit’.
With early modern Dutch robbe rabbit, perhaps compare Middle Dutch robbe seal (1488; Dutch rob), cognate with West Frisian robbe, rob, German regional (Low German) Robb (> German Robbe (1618)), all in sense ‘seal’, of uncertain and disputed origin; however, the connection between the two animals is not immediately obvious; for a discussion of this and possible ulterior etymologies see A. Liberman in Gen. Linguistics 35 (1997) 108-19.]
Now, that’s what I call etymology. Amateur etymologists should be forced to examine every link in that chain and figure out why it’s there until they understand what valid evidence is and how much work it is to assemble it.
marie-lucie says

November 26, 2010 at 9:42 am

LH, thank you for looking this up. Those French words are new to me! They are all from various dialects, and never made it into the standard language. I have never seen them mentioned in lists of French words of Germanic origin.
All those words in rab- or rob- have to do with holes, and another word that could be added to the mix is the dialectal verb rabouiller, from which derives the noun pair le rabouilleur/la rabouilleuse. The feminine form is known from Balzac’s novel of the same title, and Balzac explains the verb rabouiller as to stir up the water and mud in a waterhole in order to confuse the fish or crayfish and catch them more easily. This was the childhood occupation of the female main character, who came from a very poor family.
Ø says

November 26, 2010 at 2:27 pm

What is this saying about bouiller? There seems to be something about stirring up the water with a bouille, whatever that is, and fishing, and even a rabot, but I haven’t got the French to put the pieces together. Are there holes here, other than the gaping holes in my understanding?
Trond Engen says

November 26, 2010 at 2:50 pm

Rather than assembling evidence, here are a couple of stray (and mutually exclusive) thoughts from this amateur etymologist:
There’s a North/West Norwegian word kobbe “seal” with no known etymology — not to me, anyway. One would have to work to unite the continental r and the Scand. k, though.
Could the semantic connection between the rabbit and the seal be that they’ve both been seen as prey? According to etymonline (I’ve called it that before!) the verb rob is

late 12c., from O.Fr. rober, from a Germanic source (cf. O.H.G. roubon “to rob,” roub “spoil, plunder;” O.E. reafian, source of the reave in bereave; see reave), from P.Gmc. *raubojanan, from *raub- “to break.”

This verb early got the meaning “steal, plunder”, as in No røve. The noun rov n. means “plundering”, extended to “loot”, then to “prey”. Rovdyr is the Scandinavian word for “carnivore”. Bjorvand & Lindeman tell me that this semantic extension can be seen even in (an unspecified old variety of) German raupa theo “the loots taken from the fallen”. Here raupa is a fem. acc. pl. descended from *ráub/fo-.
marie-lucie says

November 26, 2010 at 5:12 pm

Ø: What is this saying about bouiller? There seems to be something about stirring up the water with a bouille, whatever that is, and fishing, and even a rabot, but I haven’t got the French to put the pieces together.
I can’t help you here: bouiller here is obviously dialectal, and there is no definition of bouille meaning the tool used.
Un rabot is a “plane” (a tool for planing wood), but it might be something quite different in this context, although probably not a rabbit.
It is possible that rabouiller is related to this bouiller, but I thought it might go with the “rabbit” word group because of the word rabouillère “rabbit burrow” mentioned in the etymological entry (but the dictionary might be wrong here if the word comes from ra-bouiller rather than rab-ouiller). I think that one would have to look through dictionaries of French dialects (there are some local ones, but, to my knowledge, not a comprehensive one).
Trond: kobbe and robbe (etc) are too far from each other to be related – there is no instance of an alternation between [k] and [r] anywhere else. But perhaps Kobbe is related to English cub? That the same word means a young animal in one language, an adult in another, is not uncommon (as in the etymological discussion above).
“Seal” and “rabbit” are both prey animals, but where seals live there are not too many other prey animals (and catching a seal is not easy), while rabbits tend to be plentiful and easy to catch (by snaring), so not something an adult hunter would be proud of. Whether the root rob- or rab- is related to raub, would need the expertise of a Germanic specialist.
language hat says

November 26, 2010 at 5:35 pm

Round up the usual Germanic specialists!
Trond Engen says

November 26, 2010 at 7:11 pm

marie-lucie: I meant “have to work” as an understatement. I know well that it’s phonetically impossible. I just had this curious thought that there could be some sort of rhyme association involved. But since the two words don’t seem to overlap geographically, it’s probably a silly idea.
And now that you say it, the relation kobbe ~ cub seems obvious. etymonline lists it too, I see, but it won’t give an etymology beyond that.
I had to go before I finished my line of argument on the ‘rob’ word. I was going to say that apart from the neuter and feminine derived from the full grade, there’s also a zero-grade attested. Also, the raub/rob alternation would seem to be attested even within the listed French relatives of ‘rabbit’. Finally, there’s a tendency in Dutch and neighbouring Germanic varieties to have doublet forms with gemination, a tendency that’s part of the case for the Central West Germanic substrate hunters.
Anyway, back on my computer I wanted to find something authoritative on these doublets and stumbled upon this paper on Raupe (and similar) “caterpillar”. It handles my possible etymology in passing.
And yeah, a specialist would be better.
Ø says

November 26, 2010 at 11:15 pm

A fishing implement called bouille, and another called rabot (scroll down to sense 11).
marie-lucie says

November 26, 2010 at 11:25 pm

Good work, Ø!
My guess is that since la bouille has a round end, le rabot has a flat one (like the mason’s tool). Both are used to stir the muck at the bottom of the water. (It does not sound like a legal fishing technique, but I could be wrong).
John Cowan says

November 27, 2010 at 1:35 am

There’s nothing like posting based on a vague memory when you’ve forgotten the details, as I did above. Thanks to all Hattics for making things ship-shape and Bristol fashion.
m-l, are there actually laws in France against certain fishing techniques? (Besides dropping dynamite into a pool to “catch” all the fish in it, I mean.) I know of laws limiting when and where one may fish, and how many fish one may take, and regulating the use of natural vs. artificial bait, but not of methods that are prohibited absolutely. Still, I am no fisherman.
marie-lucie says

November 27, 2010 at 7:15 am

JC, I really don’t know if there are prohibited methods of fishing, especially for non-industrial fishing. I know that for hunting rabbits, snares are prohibited, even though when I was young many people still did it (I don’t know about now – to be successful you need to know the woods and the animals’ habits, and modern life does not encourage such knowledge).
language hat says

November 27, 2010 at 8:48 am

I know that for hunting rabbits, snares are prohibited, even though when I was young many people still did it
I’m envisioning a jolly, snub-nosed character in a Marcel Pagnol movie, checking his illegal traps and talking in a thick southern accent.
David Marjanović says

November 27, 2010 at 7:10 pm

Other multiscript languages include Azeri (Arabic, Latin, Cyrillic), Hausa (Arabic, Latin), Kurdish (Arabic, Latin), Mongolian (Cyrillic, Mongolian), Panjabi (Arabic, Gurmukhi), Tachelhit (Latin, Tifinagh), Uzbek (Arabic, Cyrillic, Latin), and of course Mandarin Chinese with its simplified and traditional characters.

Most of these are written with different scripts in different places and/or by people with different religions. Azeri and Uzbek are, in their own republics, transitioning from Cyrillic to Latin, and I don’t know if that transition is now complete.
Serbia and Montenegro don’t work that way. If you stand in the streets and don’t know both alphabets, you’re illiterate.
Cyrillic does have connotations of nationalism and Orthodoxy, but they aren’t as strong as one would think. Conversely, I think the book NAJBOGATIJI SRBI SVETA (“The world’s richest Serbs”) is written in Latin script specifically so that the author was able to spell himself as MARKO LOPU$INA as a visual pun on his š (Cyrillic ш)… and yes, it is specifically in Serbian; in Croatian, the title would end in svijeta.

Can someone explain the Tajik proposal to me?

To get rid of the unitary letters for /je/, /jo/, /ju/ and /ja/.
Azeri Cyrillic had done the same and introduced the letter J j for /j/, like Serbian.

But in Serbia (and Montenegro) there is stable bi-scriptism: every Serbian-speaking child learns both alphabets as a matter of course, though I do not know in detail which, if either, is taught first. (I can imagine them being taught simultaneously, as upper and lower case are taught simultaneously.)

I think my father was taught them on alternate days. In any case he had to write his homework in alternate alphabets on alternate days.

The scripts are so thoroughly fused that when doing crossword puzzles in Latin script, the Latin digraphs lj nj dž that are equivalent to a single letter each in Cyrillic are written in a single square: one can write either LJ U B LJ A N A or Љ У Б Љ А Н А in the same seven squares.

Indeed. And even in Croatia, they’re considered single letters for all purposes. On money exchange offices you can find vertical signs saying
M
J
E
NJ
A
Č
N
I
C
A
with each of the ten letters in a separate square, and in handwritten signs in all-caps you can find Nj and Lj with a sort of subscript j.
This does not extend to Slovenia, though.

It seems like a missed chance that Turkmenbashi-era Turkmenistan apparently ditched cyrillic for a boringly roman-based alphabet for Turkmen as soon as the USSR dissolved, rather than devising a brand-new script as a tribute to the unique genius of the Turkmen people and their fearless leader — who engaged in other innovations such as renaming the months of the year etc.: http://en.wikipedia.org/wiki/Renaming_of_Turkmen_months_and_days_of_week,_2002. Although to be fair, since neither North Korea’s rulers nor Myanmar’s have discarded the pre-existing alphabets for their country’s dominant languages, inventing and imposing new writing systems may have fallen out of the Deranged Supervillain playbook.

LOL! Well, it’s unique enough in the details: rather than using ñ for /ŋ/ the way the Tatar Zamanälif (“modern alphabet”) does, it uses ň; it uses ş /ʃ/ and ç /tʃ/ just like Turkish, but for /dʒ/, it uses j rather than c — in fact, it doesn’t use c at all, even though it uses ç; /ʒ/, which is probably limited to Russian loans, is ž (in Turkish, where it’s limited to French loans, it’s j); and then there’s the stupid ý for /j/ — y and j are both otherwise occupied, because Türkmenbaşy refused to introduce the Turkish ı for /ɯ/. (Yes, the good man himself is credited with the Latin orthography. Whether that’s true or not amounts to the same thing.)
North Korea does handle certain morphophonemic issues differently than the South, but apart from that, no change to the Korean script could ever be sold as an improvement!
The generals of Myanma”r” don’t look to me as if they cared about such things. Power comes out of gun barrels.

(> German Robbe (1618)), all in sense ‘seal’

Well, no, “seal” (Phoca) is Seehund. (…Hm. And here we have the two uses of italics clashing violently. I’m too tired to do anything about this. I mean, I could use <i> around Phoca and <em> around Seehund, but nobody would notice…) Robbe is the cover term for seals and sea lions and perhaps walruses, “pinniped”.

Rovdyr is the Scandinavian word for “carnivore”.

Same in German: Raubtier. It’s nowadays avoided, though, because it contains all the criminal associations of rauben “rob”, Räuber “robber”, and Raub “robbery” plus the implication of throwing ecosystems off balance that comes from berauben “bereave” — all these connotations imply “exterminate them all”.

Finally, there’s a tendency in Dutch and neighbouring Germanic varieties to have doublet forms with gemination, a tendency that’s part of the case for the Central West Germanic substrate hunters.

Curiouser and curiouser!!!

this paper on Raupe (and similar) “caterpillar”.

…Wow. I’ll need to read the whole thing.
I cannot at all confirm the absence of that word from Bavarian/Austrian dialects (mentioned on the second page), but it may well be a recent import there; the /p/ is definitely suspicious.
Rupfen looks good, rülpsen is just full of win…
Trond Engen says

November 27, 2010 at 9:33 pm

m-l:
“Seal” and “rabbit” are both prey animals, but where seals live there are not too many other prey animals (and catching a seal is not easy), while rabbits tend to be plentiful and easy to catch (by snaring), so not something an adult hunter would be proud of.
I meant to address this yesterday, but I forgot. Slow and clumsy as they were on land, seals were important prey for coastdwellers until their virtual extinction. The tidal flats along the North Sea shore must have abounded in them. And even after their numbers dropped they would have been sought after. The rabbit was the prey for snarehunters, no matter what the leisure hunters may have thought of it. And some more speculation: Might we even include the Rebhuhn?
DM:
Robbe is the cover term for seals and sea lions and perhaps walruses, “pinniped”.
Ah, that’s a concept worthy of a term. But the extension of reference isn’t enough to change my semantic conjecture. They were all food.
Curiouser and curiouser!!!
The substrate theory? I don’t know it too well, that’s why I went searching, but I think there are at least three camps: Those who see a Celtic substrate (which shouldn’t be too controversial), those who see a non-Celtic IE (“Belgic”) substrate, and those who see a non-IE substrate. And Vennemann who sees a Semitic substrate…
Wow. I’ll need to read the whole thing.
I’ll be looking forward to that. I’ll even try to reread it for comprehension myself! But as I meant to imply, I think it’s only borderline relevant to the question at hand.
marie-lucie says

November 27, 2010 at 10:07 pm

Trond: seals and rabbits
Perhaps my earlier post was not clear. It is not uncommon for a language to have a word for “meat” which was originally the name of the most prestigious prey animal. This animal needs to be fairly large (providing lots of meat) and demand individual hunting skill. In the Arctic, a seal or other pinniped would qualify (hunting seals at holes in the ice is not obvious, as the seals appear and disappear unfrequently and with lightning speed), but snaring rabbits in the woods does not carry much prestige (as opposed to shooting a deer or moose, for instance). This is why I think that calling seals and rabbits by the same or a similar name on the basis that they are both prey animals seems unconvincing. If it turns out that the same word is indeed used, its original meaning could have been something else.
Trond Engen says

November 28, 2010 at 9:42 am

The paper I referred to seems to imply that the two are united in a meaning “animal with whiskers”. I think there’s an assumption of an unattested meaning “brush (n.)” in there.
marie-lucie says

November 28, 2010 at 10:10 am

I did not have time to read your paper thoroughly, but that would make more sense than “prey” as the common theme.
David Marjanović says

November 28, 2010 at 10:55 am

The substrate theory?

The idea of a substrate for specifically Continental West Germanic. But maybe I’ve misunderstood something.
The idea of a substrate for Germanic in general is not new to me. Except…

I don’t know it too well, that’s why I went searching, but I think there are at least three camps: Those who see a Celtic substrate (which shouldn’t be too controversial), those who see a non-Celtic IE (“Belgic”) substrate, and those who see a non-IE substrate. And Vennemann who sees a Semitic substrate…

Celtic substrate? All I know of are a few Celtic loans.
I’ve never heard of “Belgic”. Have you got any links?
Vennemann sees a Semitic superstrate, and I must say, it looks fairly convincing… even though I’d restrict it to the good old Phoenicians and leave the megaliths far out of it. My impression is he regularly takes very good ideas and runs way too far with them. (He has recently stopped supporting the glottalic hypothesis of PIE, or so I hear.)
Trond Engen says

November 28, 2010 at 1:27 pm

In this company we can safely assume that all misunderstandings are on my part.
The tripartition was my own brief summary, based on following over-my-head discussions where names like Schrijver and de Vries have been dropped. Substrate hunting is mainly a Leiden school thing, and I now suspect my confining it to Central West Germanic may be an artefact of how far the influence from Leiden reaches. Anyway, I don’t have a good reference for “Belgic”, but note e.g. endnote 4 in this paper.
If Vennemann’s Semitic hypothesis is limited to loans in a semantic field connected to, say, a Phoenician trading network during the later Nordic Bronze Age, I don’t understand the fuss.
David Marjanović says

November 28, 2010 at 3:38 pm

Anyway, I don’t have a good reference for “Belgic”, but note e.g. endnote 4 in this paper.

Thanks. Looks like it’s an idea from the 1950s that was based on placenames and hasn’t received any attention since. :-/
* * * * * * * * * * * * *
Incidentally, that’s just about the worst possible paper on its subject. It confuses (at least) two very distinct processes. One is the change of velar plosives next to front vowels into affricates or fricatives, more precisely assibilates or sibilants, because the vowels draw the place of articulation of the consonants forward; the other is the change of all plosives in almost all positions into affricates or fricatives at the same place of articulation because the consonants are aspirated to the breaking point.
The first is such a common phenomenon that its occurrence in Anglo-Frisian simply doesn’t need an explanation. It’s something that happens just so. Romance except Sardic is one example, the four “palatalizations” of Slavic are another, the change of *ki, *gi, *hi, *kü, *gü, *hü into qi, ji, xi, qu, ju, xu in Standard Mandarin is one more…
The second is considerably less common; the only certain examples I know, off the top of my head, are the High German consonant shift and what happened to the Greek aspirates. I’m not sure about Hebrew and not even about Grimm’s Law.
* * * * * * * * * * * * *
The paper does mention a few interesting facts.
One of them is the etymology of Catalonia from Gothalania, which suggests that the Gothic /g/ was voiceless and was therefore interpreted as /k/ by Latin/Romance speakers. The paper says there are no other known examples of this phenomenon; I have read somewhere (Wikipedia?), however, that some etymologists think pizza is a Germanic loan, cognate to English bit(e) and German Bissen, think “fast food”, and that this fits other unspecified evidence which suggests that the Longobards started the High German sound shift. (The alternative etymology of pizza is pita, which fits well enough meaning-wise, but what would have triggered that assibilation within Italian?)
Immediately, the paper goes on to conclude that, in Germanic, the fortes (/p t k/) have been aspirated by default and the lenes (/b d g/) voiceless since at least the Migrations Period, so the absence of aspiration in Dutch and Frisian requires an explanation. I forgot if it was here or on John Wells’s Phonetic Blog that the lack of aspiration in Dutch (where, additionally, the lenes are fully voiced) was discussed and was, for lack of an alternative explanation, ascribed to French influence. French influence on Sater Frisian, spoken in a pocket in northern Germany, seems very unlikely to me, yet the paper explicitly says that most speakers of Sater Frisian don’t aspirate, while the surrounding varieties of German are aspirated. Therefore the paper posits substrate influence, and I can’t argue against this.
Here’s my own knowledge about this issue:
Germanic
Icelandic: fortes aspirated as strongly as in Chinese; lenes voiceless, and sometimes they go all the way and turn into unaspirated fortes.
Danish: Lenes voiceless where not turned into fricatives or dropped. I didn’t pay enough attention to the fortes during the few days I spent in Copenhagen this June.
Faeroese, Norwegian, Swedish: No idea. Please help me out.
Dutch, Frisian: Fortes unaspirated, lenes voiced like in French or Slavic or Japanese. (Never mind word-final fortition.)
English: Fortes weakly aspirated by default, lenes unreliable – always voiced in singing, otherwise devoiced to variable degrees depending on the accent and the position in a word. Aspiration strong enough that fortes behind /s/, where they are unaspirated, become lenes most or all of the time (something I need to work on in my own pronunciation of English).
Low and AFAIK Middle German, and Standard German spoken in places where Low or Middle German dialects are or were spoken: fortes weakly aspirated by default, lenes fully voiced. (Never mind word-final fortition.)
High German except Carinthian, and Standard German spoken in places where High German dialects are or were spoken: old fortes except initial /k/ (exception of exception: Tyrolean and Alemannic) eliminated by High German consonant shift, new fortes (from High German consonant shift and from loans) unaspirated (never mind the word-final lenition of /t/ in most positions); lenes fully voiceless, so that historical spellings are often confused; initial /k/ sort of borderline aspirated except where fortes have become long lenes (Alemannic) or merged with the lenes altogether (Swabian, eastern Austrian dialects).
Carinthian (High German dialect close to Tyrolean, but with big fat Slovene substrate): /k/ is a cluster [kh], /p t/ unaspirated, lenes fully voiced (I’m not sure what happens word-finally).
Celtic
Scottish Gaelic: Fortes very strongly aspirated by default, lenes voiceless, fortes behind fricatives have become lenes even in writing. This is blamed on the Vikings in what I’ve read.
Irish Gaelic: Same, except I don’t know how strong the aspiration is, and fortes behind fricatives are still written as such.
Welsh: Like English, says Wikipedia.
Breton: AFAIK like French, Dutch and Frisian.
Can we even reconstruct whether a Celtic language spoken in the Netherlands would have had aspirated fortes and/or voiced lenes?
In fact, assuming Anglo-Frisian, can we even tell how old aspiration is in English? Could it have been imported from Late Old West Norse (and then passed on to Welsh)? Or perhaps dialect mixture between the Angles, Saxons and Jutes was responsible? ~:-|
Trond Engen says

November 28, 2010 at 4:11 pm

Hey, I didn’t mean to recommend the paper. I just saw an example of ‘Belgic’ in use.
Every description of Scandinavian phonology will tell you that Norwegian and Swedish fortes are aspirated except in clusters. I will say on my own account that (at least allophonically) the lenes are very lene, so much so that a foreigner may perceive /b-/ as /mb-/ or /β-/. I don’t think this is common in all dialects, though.
David Marjanović says

November 28, 2010 at 8:05 pm

Hey, I didn’t mean to recommend the paper. I just saw an example of ‘Belgic’ in use.

I know. I just engaged in topic drift. 🙂
Thanks for the information on .no and .se.
David Marjanović says

January 21, 2018 at 6:36 pm

Me, just over 7 years ago:

Immediately, the paper goes on to conclude that, in Germanic, the fortes (/p t k/) have been aspirated by default and the lenes (/b d g/) voiceless since at least the Migrations Period, so the absence of aspiration in Dutch and Frisian requires an explanation. I forgot if it was here or on John Wells’s Phonetic Blog that the lack of aspiration in Dutch (where, additionally, the lenes are fully voiced) was discussed and was, for lack of an alternative explanation, ascribed to French influence. French influence on Sater Frisian, spoken in a pocket in northern Germany, seems very unlikely to me, yet the paper explicitly says that most speakers of Sater Frisian don’t aspirate, while the surrounding varieties of German are aspirated. Therefore the paper posits substrate influence, and I can’t argue against this.

I had forgotten about Sater Frisian by the time I read that there’s a complete aspiration-free belt that includes not just Dutch (Sater Frisian wasn’t mentioned), but all of southern Low and northern Central German. (In southern Central German, all evidence has been erased by the Inderior German Gonsonant Weagening.) In the western part of that belt, you could blame a Romance substrate, in the east you could blame Slavic, but that leaves a large center unaccounted for. The author concluded that aspiration does not date back to Proto-Germanic or even Proto-Northwest-Germanic, but is a northern innovation within Northwest Germanic, was then carried south in the Migration Period (or earlier), and then was exaggerated till it broke in the whole new southern area, i.e. Upper German (as well as, much more recently, in Scouse and for t in Danish). From there it trickled into Central German as a fashion, not as a natural sound change, because the aspiration necessary to develop it as a sound change wasn’t there. Aspiration comes out as step 0 of the High German consonant shift.

I’ll look up the source later; if I do that now, this computer might play American government.
David Marjanović says

January 21, 2018 at 8:42 pm

I can’t believe it. The book is gone from Google. It was there just a few months ago.
January First-of-May says

January 21, 2018 at 9:25 pm

By the way, Hat, I keep reading the headline as a reference to a non-existent casing pair of Unicode characters, CAPITAL and SMALL LETTER NTA, whose existence is seen as undesirable.

Same for me. I keep imagining a CYRILLIC CAPITAL (or SMALL) LETTER NTA, and wondering what it might theoretically look like and/or be used for.
j. says

January 25, 2018 at 1:01 pm

A ligature of Н н and Т т of course, following the example of Ҥ ҥ and Ҵ ҵ.

As long as we are staying in non-southern Eurasia: there are Tibetic varieties with prenasalized stops, but they’re almost all in the east (the westernmost that I know of is Humla Bhotia in NW Nepal), not quite in the range where anyone would want to use Cyrillic; outside of alternate histories at least.

Or maybe some adaptation of Modern Greek? ‹б в› will handle /b v/ ‹μπ β›, but a new ligature would seem like a typical Cyrillic-by-committee solution to /d ð/ ‹ντ δ›.
David Marjanović says

January 25, 2018 at 6:03 pm

Bashqort has Cyrillic letters for its /θ ð/: Ҫ Ҙ.
Hans says

January 26, 2018 at 3:29 am

Interesting that they didn’t resurrect the fita for this.
Lazar says

January 26, 2018 at 5:38 am

@Hans: Maybe because it was ideologically troublesome, or because it wouldn’t have a good voiced counterpart.
zyxt says

January 26, 2018 at 6:47 am

At a stretch, you could use djerv as a voiced counterpart for fita.
John Cowan says

March 11, 2019 at 10:28 am

Un rabot is a “plane” (a tool for planing wood)

In English, a rabbet is a groove in wood (prototypically, but sometimes in stone), a rabbet joint is a tongue-and-groove joint, and a rabbet plane (spelled rebate plane in the UK, but still pronounced “rabbit”) is a specialized wood plane with a narrow blade used for making rabbets. Rabat as the name of the ordinary carpenter’s plane is obsolete. Rabbet and rebate are probably doublets, both < ré-abbatre.
Lars Mathiesen says

November 17, 2019 at 5:27 am

German rauben making Raubtier suspect: Not in Scandinavia, because rov is ‘prey’ and røve (both imported from different types of German) is ‘to rob’ — of course it’s a doublet, but not felt as such.

In older usage, Danish does have Sabinerindernes Rov for ‘The Rape of the Sabine Women’. (With the action noun, not the identical patient/result noun).

English rob/reave and rape are doublets according to Wiktionary, the latter through Latin and French, though the two pages disagree on the exact shape of PIE *h₁re(w)p- and its meaning (‘snatch’ or ‘break’?) — and Etymonline has rob from *runp- and cognate to L rumpere instead, and Ringe has *Hrunep- (present indicative) / *Hrewp- (aorist subjunctive) > *reufaną > ON rjúfa so it’s all the same root anyway.
David Marjanović says

November 17, 2019 at 6:49 am

A Kluge mess, of course, with all of *pp (rupfen), *p (raufen, with /fː/: “1. theatrically pull one’s hair; 2. wrestle, as of little children or a bar brawl on the harmless side”), *β (reave, rjúfa, raub-) and *bb (rob, Raupe).

Most likely *Hréwp-nh₂- ~ *Hrup-náh₂-, then.

Also, der Raub der Sabinerinnen: their rightful owners were robbed.

Raub can also refer to the loot, but that’s rare.

…

And now I’ll actually look it up in Kroonen’s book if Google lets me.
PlasticPaddy says

November 17, 2019 at 7:07 am

@dm
I think Raub der Sabinerinnen is an old mistranslation of Latin rapere “to abduct”. Modern German would be Entführung.
David Marjanović says

November 17, 2019 at 7:23 am

rauben is not in the book, except in a footnote where Grimm himself is cited as comparing schnupfen, schnaufen, schnauben (“make various noises with your nose”) to rupfen, raufen, rauben.

Raupe, however, is on p. 282 of Kroonen (2011), and in apparently identical text in his thesis from 2009, which I quote without the source citations:

*rūbō, *ruppaz ‘caterpillar’
• *rūbbōn-: MHG rūp(p)e f. ‘eelpout, caterpillar’, G Raupe f. ‘caterpillar’, Aal·raupe, Pal. raupe f. ‘id.’
• *rūpᵖōn- [i.e. with the long consonant shortened to avoid an overlong syllable]: MLG rūpe ‘hairy maggot’, Kil. ruype ‘caterpillar’, Du. dial. ruip ‘id.’, WFri. rûpert ‘rough-haired animal’
• *rubbōn-: MHG ruppe f. ‘caterpillar, eelpout’, G Ruppe f. ‘eelpout’, Pal. Ool·rapp, ·ropp, ·rupp, Ruppe f. ‘eelpout’, Thur. roppe, ruppe ‘caterpillar’

The word for ‘caterpillar’ shows the kind of formal variation that is typical of ablauting n-stems. The material gives proof of a vocalic interchange of *ū with *ŭ and a consonantal interchange of *-bb- with *-pp-.
The variant *rūpᵖōn- is found in the Low German speech area, and is supported by MLG rūpe, Kil. ruype and Du. dial. ruip. It superficially resembles the High German form Raupe, which therefore has been regarded a Low German intrusion. The geminate of MHG rūppe [preserved today where applicable] nevertheless shows that Raupe must have developed out of *rūbbōn-, which with its combination of a long vowel and a geminate looks like a typically High German n-stem, cf. Swab. kauzen m. ‘entangled thread’ < *kūttan-, Pal. schaupe f. ‘forelock’ < *skūbbōn-, etc. It can, at any rate, not be derived from *rūpōn- or *rūbōn-, because these forms would have yielded **Raufe and **Raube respectively. So, if interdialectal borrowing actually did take place, the direction must have been from High to Low German, not the other way around. Finally, G Ruppe, with its correspondences in e.g. Palatinate and Thuringian, seems to point to a variant *rubbōn- with a short *ŭ.
The attested polymorphism can be interpreted as deriving from a paradigm *rūbō, *ruppaz that was split up into 1) *rūpō, *ruppaz and 2) *rūbō, *rubbaz. I assume that it was derived from the IE root *reup-, which in Germanic gave rise to a large verbal complex including an iterative opposition, cf. ON rjúfa, OE rēofan ‘to break’ < *reufan- vs. MHG ropfen ‘to pluck’ ~ Icel. rubba ‘to scrape’, Als. roppen ‘to pull, pluck’ < *ruppōþi, *rubunanþi. The original meaning of the West Germanic n-stem therefore probably was “plucker”.^611
A slightly different etymology is given by De Vaan (2000). De Vaan argues that, given the widely attested meaning ‘rough maggot’, the Benennungsmotiv for the word must have been “rough one”. De Vaan further connects MDu. robbe ‘seal, rabbit’, Kil. robbe(ken) ‘rabbit’, Du. rob ‘seal’, MLG rubbe, LG rabbe m. ‘seal’, WFri. robbe ‘id.’, G Robbe mf. ‘id.’ < PGm. *rubba/ōn-, because these animals are also “rough-haired”. Note that Matthias Kramer, in his German-Dutch dictionary of 1719 calls a robbe ‘ein hartschuppiger seehund’, i.e. ‘a rough-haired seal’. [Literally “hard-scaled”, which is rather confusing.]
Finally, Boutkan and Kossmann (1999) have sought to explain the formal variation as being the result of substrate influence. On the basis of Lat. rēpō, Lith. rėplióti and Latv. rāpât, all meaning ‘to creep, crawl’, they hypothesize that a non-Indo-European root *rū̆/āp- ‘to crawl’ entered these languages at a relatively late date. Likewise, the same root would have been borrowed into Germanic, ultimately to surface as *rū̆p/bb- ‘caterpillar’, i.e. “crawler”. This explanation, however, fails to recognize the principle of Germanic consonant and vowel gradation.

Footnote 611:

Note that the presence of consonant gradation in the verbal complex opens the possibility that the polymorphism of ‘caterpillar’ is not due to its inflection as an n-stem, but rather the result of its derivation from the iterative. This explanation, however, has the disadvantage that the n-stem would need to have been coined several times to several different verbal roots. Furthermore, it does not account for the long *ū.
David Marjanović says

November 17, 2019 at 7:25 am

Entführung is “kidnapping”; Raub implies something more violent and more patriarchal.

Isn’t rapere more like “seize”?
Lars Mathiesen says

November 17, 2019 at 7:37 am

Ringe says that ‘opaque’ PIE n-infixed presents (from secondary present stems with metathesized *-n- affix, I guess) were often replaced in Germanic by the (unaffixed, root) aorist stem (in the subjunctive, by some argument that I didn’t follow) — thus rjúfa goes back to a form without a nasal. *raubōną (vel sim.) is not covered, would that be a further derived verb (with *-nah₂-)?
David Marjanović says

November 17, 2019 at 7:52 am

rjúfa goes back to a form without a nasal

Kroonen agreed, as quoted above. But given that he barely mentioned the verb, I’m not sure he still would if he looked at it in detail; I don’t know what he put in his etymological dictionary of Germanic (2013) because that’s not on Google.

would that be a further derived verb (with *-nah₂-)?

Yes, and with *b by analogy.
PlasticPaddy says

November 17, 2019 at 9:42 am

Re rapere,
Cur exempla petam Graiûm? tu criminis auctor,
nutritus duro, Romule, lacte lupae:
tu rapere intactas docuisti impune Sabinas:
per te nunc Romae quidlibet audet Amor.
Propertius, Elegie 2.6.1
I agree this could mean seize but in the whole poem the direction is to corrupt or mislead, so I would go with “abduct” here.
David Marjanović says

November 17, 2019 at 6:01 pm

Nine years ago:

Turkmen Wikipedia has what may or may not be an accurate rundown of the various scripts, including the “eccentric” 1993 alphabet (with £, $, ¢ and Ұ!). (Why no [ɯ] in the 1927 script?)

Presumably the distinction from [i] was left to vowel harmony and the distinction between [k] and [q], which was spelled out in the 1927 script but no script since.

The strangest part is that vowel length has never been spelled out (well, I suppose it was in the Arabic script). It is phonemic, unlike in almost all other Turkic languages (but it’s a retention that has been lost elsewhere, so the Turkologists are all over it).
David Eddyshaw says

November 17, 2019 at 6:15 pm

Yakut keeps it (but you knew that …)
David Marjanović says

November 17, 2019 at 6:50 pm

That’s why I wrote “almost”. 🙂
Lars Mathiesen says

November 17, 2019 at 8:09 pm

Aorist subjunctives and rude noises. What’s not to like?
David Marjanović says

November 30, 2019 at 8:53 pm

More on aorist subjunctives: p. 98, most footnotes omitted:

1.2. ‘Bite presents’

One subset of Germanic presents stands out in that their cognates in the other IE languages form characterized presents, either infixed (with *-n-) or suffixed (with *-sk̂é/ó-, *-yé/ó-, etc.). The most widespread view holds that these presents exceptionally continue the subjunctive of the corresponding root aorists, which also ended in *-e/o-. Likely examples are the following, which involve PIE nasal-infixed or reduplicated presents beside root aorists:

PIE pres. *bʰi-né-d- ~ *bʰi-n-d-´ ‘split’ (Ved. bhinátti, pl. bhindánti; thematized in Lat. findō), root aor. *bʰéyd- ~ *bʰid-´ (Ved. abhet), aor. subj. *bʰeyd-e/o- (Ved. bhédati) > PG *bītaną ‘bite’ (Go. beitan, ON bíta, OE, OS bītan, OHG bīⱬan);
[…]
PIE pres. *(H)ru-né-p- ~ *(H)ru-n-p-´ ‘break, tear’ (thematized in Ved. lumpáti, Lat. rumpō), root aor. *(H)réwp- ~ *(H)rup-´ (Lat. pf. rūpī), aor. subj. *(H)réwp-e/o- > PG *reufaną (ON rjúfa; OE pret. ptcp. rofen ‘riven’, berofen ‘bereft’)
[…]

It is true that one or another of these verbs could have been remodeled on the basis of the standard type after the strong preterite,⁹ but the quantity and quality of examples suggests a more systematic morphological replacement. The large number of PIE nasal presents listed above makes it extremely likely that at some point in the prehistory of PG, speakers began to have difficulty interpreting such presents in terms of the synchronic grammar; this led to a period of competition with the aorist subjunctives, which with their transparent formal structure, identical to that of the majority of strong verbs, prevailed in all cases. The same was no doubt true for PIE reduplicated presents (see above s.v. *ĝews- ‘test’) and also suffixed presents in *-sk̂é/ó-, which were likewise eliminated in PG. The functional shift aorist subjunctive > present indicative is not easy to motivate, but it is also attested in other Indo-European languages, e.g. Gr. leípō ‘leave’, pʰeúgō ‘flee’ < *léykʷ-e/o-, *bʰéwg-e/o- vs. pres. *li-né-kʷ- ~ *li-n-kʷ-´ (Ved. riṇákti), *bʰug-yé/ó- (Lat. fugiō); such cases suggest that the PIE subjunctive may originally have been formed to roots, rather than aspectual stems. Presents of this type are often referred to in the literature as ‘aorist presents’, but this label is also used to denote a different kind of strong verb postulated for Germanic (see below, section 2.2.1). Here and below, they will for convenience be called simply the ‘bite type’.

⁹ Such remodelling is especially likely for those few verbs for which the PIE nasal present also survives into Germanic, e.g. PIE pres. *gli-né-bʰ- ~ *gli-n-bʰ-´ ‘stick to’ > PG *klimbaną > OE climban, OHG klimban ‘climb’ vs. PG *klībaną > ON klífa ‘climb’, OE -clīfan, OS -klīban, OHG klīban ‘stick’ (Scheungraber 2014:44–6; see below, section 2.2.1) or PIE (*gʰ-né-d- ~) *gʰ-n̥-d-´ ‘catch’ → *gund-ne/o- > *gunni/a- → PG *ginnaną ‘begin’ > Go., OE, OS, OHG -ginnan vs. PG *getaną > Go. bi-gitan ‘find’, ON geta, OE bi-gietan ‘get’, OS far-getan, OHG far-gezzan ‘forget’ (LIV²: 194).

Then (p. 102), nasal-infixed verbs with root-final laryngeals were, once the laryngeals were gone, reinterpreted as having a nasal suffix, which then became productive and created most of the *-naną verbs. I don’t know how the *-ōną verbs fit into this.
