Written Language Quiz.

December 30, 2014 by languagehat 113 Comments

From Anotherquiz.com: Can You Identify 11 Languages By Their Writing? They call it “Our hardest trivia yet!” but I found it ridiculously easy; I got 11 of 11, and there were only a couple about which I felt even a momentary doubt, enough to make me take a closer look before hitting the button. It seemed like they weren’t even trying to make it hard; on a couple they could have made me sweat a little if they’d chosen languages that use similar-looking systems instead of just random languages from around the globe. And some of the language samples themselves are weird in ways I won’t specify because I don’t want to give away answers, but you’ll see what I mean if you know any of the languages. Kvetch, kvetch, kvetch! But it’s fun anyway, all language quizzes are fun, so go ahead and give it a try (obviously if you haven’t spent a lot of time splashing around in foreign languages you may find it more challenging), and don’t go into the comment thread if you don’t want spoilers, because I expect people will be discussing the samples and their results.

Comments

Y says

December 30, 2014 at 10:27 pm

The Hebrew and Arabic… YIKES!

Hebrew: someone, a person or a computer, translated “How do you do today” to Hebrew איך אתה עושה היום, which would literally mean “How do you do today”: this would be merely opaque nonsense, except it’s also ungrammatical, since עשה ‘do’ is transitive (except in some slang expressions, where it means something like ‘do it’ in English). And then, as a bonus, the letters are set from left to right, with the finals ך and ם on the right end of the word.

Arabic: I hardly know any Arabic. I think they meant to write كيف حالك اليوم (which I think is correct) but used the non-combining forms for all the letters, and again set them from left to right.
George Grady says

December 30, 2014 at 10:33 pm

That was easy. I wouldn’t have guessed the Bulgarian one in isolation, but it was obvious by process of elimination.

What were the marks above the я in сегодня in the Russian sample?
Y says

December 30, 2014 at 10:34 pm

Oh, and when you’re done, you get a box urging you to get attached to them via Facebook, by the irresistible caption Cras erat arcu, cursus et sodales nec, tincidunt ac augue. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas.
Keith Ivey says

December 30, 2014 at 11:41 pm

I wouldn’t have gotten the Bulgarian on its own either, and even with the choices I thought a split second about Mongolian. Everything else was pretty easy, some insanely so.
Sili says

December 31, 2014 at 12:28 am

Hmmm. It didn’t tell me how many I got right, but I suspect even I guessed them all. Bulgarian I wouldn’t have gotten in isolation, but with the options it was easy.
iakon says

December 31, 2014 at 12:37 am

. . .pretty easy, some insanely so.

My sediments exactly. Couldn’t say I ‘liked’ it.
Jongseong Park says

December 31, 2014 at 12:57 am

George Grady, it looks like they missed deleting some devanagari marks from the previous sample after their copy-paste job. It’s a bit sloppy even for a quick online quiz.

The Korean sample is 오늘 어때요, which is literally “how is today”. It could mean “how about today?”, or it could mean “how are you/how is [some other understood subject] today?”. You would normally expect a question mark. It makes me wonder what procedure was followed to generate the samples, which seem to be variations of “how do you do today” at first glance, though I’m not sure about the Japanese where I only recognize the kanji for machine and don’t see the expected kanji for “today”.
Keith Ivey says

December 31, 2014 at 1:09 am

I really expected there’d at least be some Georgian or Sinhala or Cherokee or something.
John Emerson says

December 31, 2014 at 1:58 am

Oddly, I guessed Mongolian instead of Bulgarian, because I knew that Mongol was written in several different scripts and didn’t know that about Bulgarian. So my very small knowledge of Mongol Cyrillic betrayed me.
SFReader says

December 31, 2014 at 2:13 am

Mongolian Cyrillic can be distinguished by use of two vowels absent in Russian or Bulgarian – Өө (ö) and Үү (ü) – and by frequent duplication of letters to represent long vowels – like in өнөөдөр (önöödör).

Unfortunately, letters Өө and Үү are also used in several dozen other languages spoken in former Soviet Union (along with many more additional letters).

Kazakh, for example, has them too.
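
The hints above can be sketched as a rough check. This is only an illustration: the function name `mongolian_hints` and the list of doubled vowel letters are assumptions made for the sketch, and since Kazakh and other languages share Өө and Үү, a positive result is a hint rather than an identification.

```python
import re

# The two extra vowel letters named above (upper and lower case).
# Kazakh and other languages use them too, so a hit is a hint, not proof.
EXTRA_VOWELS = set("\u04e8\u04e9\u04ae\u04af")  # Ө ө Ү ү

# Vowel letters assumed (for this sketch) to be doubled for long vowels.
VOWELS = "аэиоу\u04e9\u04af"
DOUBLED = re.compile("|".join(v * 2 for v in VOWELS), re.IGNORECASE)


def mongolian_hints(text):
    """Return crude hints that a Cyrillic text might be Mongolian."""
    return {
        "extra_vowels": any(ch in EXTRA_VOWELS for ch in text),
        "doubled_vowels": len(DOUBLED.findall(text)),
    }


print(mongolian_hints("\u04e9\u043d\u04e9\u04e9\u0434\u04e9\u0440"))  # önöödör
print(mongolian_hints("сегодня"))  # the Russian word from the quiz sample
```

A short Russian or Bulgarian sample will usually score zero on both counts, but doubled vowels do occasionally turn up in Russian, so neither hint is conclusive on its own.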
SFReader says

December 31, 2014 at 2:20 am

Bulgarian is written entirely in Cyrillic. If you don’t know any Slavic languages, best hint would be frequent use of letter ъ to represent unstressed vowels.
SFReader says

December 31, 2014 at 2:31 am

I think it would be very easy to make the quiz much harder if there was a choice between Hindi and Marathi or Bulgarian and Russian (though even Mongolian-Bulgarian choice seems too hard for many).
Y says

December 31, 2014 at 2:45 am

I’d like it better if it was designed by and for people who like language and think about it a lot (like this noble company), rather than by some hacks at an address-harvesting click-bait operation.
Tore says

December 31, 2014 at 2:58 am

The Japanese was correct but a bit old-fashioned. Perhaps they were influenced by the recent drama series on NHK about the translator of Anne of Green Gables? In this series a similar phrase was used a lot.
juha says

December 31, 2014 at 3:01 am

@Jongseong Park

It was ご機嫌はいかがですか
きげん4【機嫌】ﾛｰﾏ(kigen)
a mood; a humor; spirits; temper. [⇒ごきげん]
ごきげん【御機嫌】ﾛｰﾏ(gokigen)
ご機嫌いかがですか. How are you (getting along)? | 〔病人に〕 How ┌do you feel [are you feeling] today?
languagehat says

December 31, 2014 at 10:06 am

I really expected there’d at least be some Georgian or Sinhala or Cherokee or something.

Yup!

I think it would be very easy to make the quiz much harder if there was a choice between Hindi and Marathi or Bulgarian and Russian (though even Mongolian-Bulgarian choice seems too hard for many).

Indeed!

I’d like it better if it was designed by and for people who like language and think about it a lot (like this noble company), rather than by some hacks at an address-harvesting click-bait operation.

Exactly!

Nice to get such a thorough sense of vindication; y’all reacted just the way I did. And I’m glad to get the details of what was wrong with some of the samples where I knew the alphabets but wasn’t able to tell exactly how screwed-up the sample was.
Trond Engen says

December 31, 2014 at 11:16 am

I decided most of them in a glance before looking at the alternatives, but I wasn’t sure it was Thai rather than Lao and Hindi rather than Marathi. Also, I sort of expected it to increase in difficulty to some arcane minority languages written in, say, Arabic script. The alternatives removed all doubt, except that I too paused for a split second to be sure that Bulgarian wasn’t Cyrillic Mongolian.
languagehat says

December 31, 2014 at 11:24 am

I wish somebody would do a better version; it’s a great idea.
SFReader says

December 31, 2014 at 1:13 pm

Lao script is more rounded than Thai. (There is also a popular font in Thailand which is even more squared and makes Thai look like Latin or Greek.)
SFReader says

December 31, 2014 at 1:21 pm

Also, large chunks of Japanese text could be written entirely in kanji, and distinguishing that from traditional Chinese is not very easy for someone who can’t read Japanese or Chinese.
SFReader says

December 31, 2014 at 1:33 pm

And it could be quite a challenge to tell Persian from Arabic or Urdu from Persian.
Athel Cornish-Bowden says

December 31, 2014 at 1:44 pm

Absurdly easy! I hardly had to think about any of them apart from Somali/Korean, and in that case I made a foolish analysis and came up with Somali. I would guess that Somali uses Arabic script (or conceivably Amharic). Although this was clearly not Arabic or Amharic I thought it looked a bit crude for Korean, so I guessed it might be an archaic way of writing Somali. None of the others needed more than a glance, though I thought the Bulgarian might be Russian until I saw that Russian wasn’t offered as a possibility. I might have difficulty distinguishing between those two in a very short text, unless the letter ъ occurred frequently, whereas it’s very rare in modern Russian. Even older Russian shouldn’t create a problem with this as it’s my impression that although ъ was very common at the ends of words it hardly ever occurred anywhere else (is that right?). If this was their most difficult quiz I wondered what their easy ones might be like.
Sashura says

December 31, 2014 at 2:12 pm

absurdly easy, especially considering the prompting choice. I only hesitated between Chinese/Japanese because it looked like they used old Chinese characters, long abridged.

And a Happy New Year!
Keith Ivey says

December 31, 2014 at 3:45 pm

The title is “Can You Identify 11 Languages By Their Writing?”, so I guess it’s not unreasonable that it doesn’t require any knowledge of the languages themselves. It’s a test of your ability to recognize common writing systems plus your knowledge of what writing system is normally used for some common languages. I wonder if the inclusion of Mongolian as an alternative in the Bulgarian question was just a mistake made by someone who didn’t realize Mongolian was written in Cyrillic.
juha says

December 31, 2014 at 3:57 pm

By the way, where does the h in Amharic come from? It’s amarıñña in the language in question.
Jongseong Park says

December 31, 2014 at 5:31 pm

@juha:
Thanks for the explanations on the Japanese sample. I think Amharic derives from Amhara, which in Ge’ez is አምሐራ ʾÄməḥära (አማራ Āmara in Amharic).
John Cowan says

December 31, 2014 at 10:29 pm

Somali was written in Arabic script for centuries, but the dominant script nowadays is Latin. A number of Somali-specific scripts were devised in the 20C and still see varying degrees of use.
SFReader says

December 31, 2014 at 10:59 pm

Maybe we should make a harder quiz ourselves?

How about this one.

“Jeg kommer fra Norge” – this sentence is written in

a) Danish
b) Norwegian
c) Swedish

Which sentence is written in Danish?

a) Det var en fuktig, grå sommardag i slutet av juni.
b) Det var en fuktig, grå sommerdag i slutten av juni.
c) Det var en fugtig, grå sommerdag i slutningen af juni.
juha says

January 1, 2015 at 4:42 am

@Jongseong Park
Thanks a lot!
uwe says

January 1, 2015 at 2:58 pm

What were the marks above the я in сегодня in the Russian sample?

яैं – not sure if that’s rendered correctly though.

I think it’s DEVANAGARI VOWEL SIGN AI and DEVANAGARI SIGN ANUSVARA added by accident. Hindi sample ends with them too.
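
For anyone who wants to check this kind of thing themselves, here is a minimal sketch using Python's standard `unicodedata` module. The helper `stray_marks` is a made-up name, and using the first word of a character's Unicode name as a stand-in for its script is a crude assumption (generic marks whose names begin with COMBINING would be flagged on any base letter).

```python
import unicodedata


def stray_marks(text):
    """List combining marks whose name prefix differs from their base letter's."""
    strays = []
    base_script = None
    for ch in text:
        name = unicodedata.name(ch, "")
        script = name.split(" ")[0] if name else None
        if unicodedata.category(ch).startswith("M"):
            # A mark: compare its name prefix with the last base character's.
            if base_script is not None and script != base_script:
                strays.append(name)
        else:
            base_script = script
    return strays


sample = "\u044f\u0948\u0902"  # Cyrillic ya followed by two Devanagari marks
for ch in sample:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
print(stray_marks(sample))
```

The same two marks after a Devanagari letter are not flagged, which fits the guess that they were left over from the end of the Hindi sample.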
languagehat says

January 1, 2015 at 3:15 pm

Ha! Boy, they sure worked hard to get such a lousy result. You’d think finding sample sentences would be an easy matter.
John Cowan says

January 1, 2015 at 3:52 pm

Which sentence is written in Danish?

I have no idea, but it’s just obvious that they all mean “That was a fuckup, gray summer days and the sluts of June.”
Melchior AJP Anderegg says

January 1, 2015 at 5:02 pm

c is dansk.
languagehat says

January 1, 2015 at 5:26 pm

They don’t have sluts in Denmark, they have slutnings.
Y says

January 1, 2015 at 6:22 pm

Match the texts (all different, all from Wikipedia) with the languages (all Romance):

1. Su trenu ’e Casteddu arribàda a mesudì. Ma de candu intràda in is furriàdas de Tacchenurri e ancora no si bìat, s’intendìat su ciuff ciuff e su fragu ’e su fumu, màssima candu tiràt bentu estu.
2. Tutti i ggìenti nascianu libberi e ‘gguali all’àtri ppì ddignità e diritti. Ognunu tena cirbìeddru raggiune e cuscìenza e s’ha de cumbortà cull’atri cumu si li fòssaru frati.
3. La gualp eara puspe egn’eada fumantada. Qua â ella vieu sen egn pegn egn corv ca taneva egn toc caschiel ainten sieus pecel.
4. Sei bedda chi dugna cori / s’innammurigghja di te / pa l’occhj mei un fiori / ed è la meddu chi c’è.
5. Tuota nuester, che te sante intel sil, sait santificuot el naun to. Vigna el raigno to. Sait fuot la voluntuot toa, coisa in sil, coisa in tiara.

a) Sutsilvan Romansh
b) Campidanese Sardinian
c) Gallurese Corsican
d) Dalmatian
e) Cosentino Calabrian
David Marjanović says

January 1, 2015 at 7:42 pm

letter ъ to represent unstressed vowels

It’s a full-fledged phoneme that occurs in stressed and unstressed positions like the other five vowel phonemes.

“Jeg kommer fra Norge” – this sentence is written in

Not Nynorsk, where it’s Noreg rather than Norge. But I’m not sure about the other three options.

Which sentence is written in Danish?

c, because it has -gt- instead of -kt-.

Match the texts (all different, all from Wikipedia) with the languages (all Romance):

1) Sardinian. Su trenu can’t be anything else.
3) Romansh. Lots of words ending in a consonant, general weirdness, vowel clusters, and the sch doesn’t exactly hurt. 🙂
5) Vegliot (northern “Dalmatian”), because of the unique first word, the extreme change to Latin /aː/, the unique to, general weirdness that doesn’t seem Romansch, and… I’ve read the Wikipedia article a few times. :-]

That leaves Corsican and Calabrian. 2) seems less close to Standard Italian, showing the southern u for Standard o, so that’s got to be Calabrian.
languagehat says

January 1, 2015 at 8:34 pm

Thank goodness somebody upheld the honor of the Hattery — I couldn’t have!
Y says

January 1, 2015 at 9:21 pm

Yes on all five—give the man a cigar!

If I had to solve this, I’d have picked Vegliote first because it’s the only ‘old’ text; To make it harder I could have picked Christian texts for the other languages too. But I’d have a hard time telling Corsican and Sardinian apart.
SFReader says

January 1, 2015 at 9:26 pm

I have no idea what language the 2nd sentence is written in, but I can understand it easily.

If it’s indeed Calabrian, then I will add Calabrian to list of languages in my CV….
Trond Engen says

January 1, 2015 at 9:50 pm

I wouldn’t weigh in too early on the Nordic quiz, but I’m pretty sure that both Danish and Norwegian Bokmål are valid answers. For Nynorsk, many write Norge these days, some even choose the new weak present kommer for the strong kjem and fra for frå, but Eg rather than Jeg is a litmus test.

(In really archaic Eastern Nynorsk, however, one might use jeg, but that would be accompanied by very archaic forms elsewhere too: Jeg kjem fraa Norig or something like that.)

Swedish has Jag and från.
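
The tells mentioned in this comment can be written down as a toy rule set. This is a sketch under heavy assumptions: the function name `nordic_candidates` is invented, only the three words discussed here (Jag/från, Eg, Jeg) are used, and the archaic Eastern Nynorsk use of jeg is ignored.

```python
import re


def nordic_candidates(sentence):
    """Narrow down the written language using only the tells named above."""
    words = set(re.findall(r"\w+", sentence.lower()))
    if words & {"jag", "från"}:
        return {"Swedish"}
    if "eg" in words:
        return {"Nynorsk"}
    candidates = {"Danish", "Bokmål", "Nynorsk", "Swedish"}
    if "jeg" in words:
        # Jeg rules out Swedish and (archaic usage aside) Nynorsk.
        candidates -= {"Swedish", "Nynorsk"}
    return candidates


print(nordic_candidates("Jeg kommer fra Norge"))
```

For the quiz sentence this leaves Danish and Bokmål, so with rules this coarse the first question has two valid answers.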
SFReader says

January 1, 2015 at 10:24 pm

You mean Norwegians can’t tell their own language from Danish?
Trond Engen says

January 1, 2015 at 10:50 pm

Oh, we can, but for historical reasons Bokmål is quite close to Danish, and it just happens that none of the tells are present in your short example sentence (1).

Your second question is easy, but it would have been harder without the alternatives. I’m quite sure I wouldn’t remember that Danish preferred slutningen for slutten. I’m not sure I’d remember that about Swedish slutet either. But I’d still recognize Swedish because of the retained a in unstressed position in sommar (2).

1) I have a feeling that Norwegians are more likely than Danes to choose the phrase Jeg kommer fra, but that doesn’t make it unidiomatic in Danish.

2) Nynorsk shares this, but also retains diphthongs: Det var ein varm, fuktig sommardag i slutten av juni.
SFReader says

January 1, 2015 at 11:50 pm

I once met a Ukrainian who spoke Russian, but didn’t know it was Russian and mistakenly believed he was speaking Ukrainian.

Norwegians are not at this level of confusion yet, but getting close enough…
Sili says

January 2, 2015 at 1:53 am

I don’t know what I’d say instead of “Jeg kommer fra Norge.” Perfectly idiomatic to me. I guess “Jeg er fra Norge” would work too.
Sili says

January 2, 2015 at 1:58 am

They don’t have sluts in Denmark, they have slutnings.

We have both. Slutnings are juvenile sluts. They grow up so fast.
John Emerson says

January 2, 2015 at 3:19 am

Here’s a very familiar text in an obscure romance language.

Tată a nostru, ți eșci tu țerl,
s’ayisească numa a Ta,
s’yină amirăriľa a Ta,
si facă vreare a Ta,
cum tu țerl, ași ș’pisti locl.
Pânea a nostră ațea di cathi dzuă dă-nă-u ș’ază
și ľartă-nă amărtiile a noastre
ași cum ľi ľirtăm ș’noi a amărtoșlor a noșci.

The church I was raised in had a schism around 1920-1930 which I think included a switch from Bokmål to Nynorsk, or maybe a refusal to switch. No one alive to ask about it, and the records are sketchy. Around 1948 they cancelled the Norse service and went to English-only, after a considerable bilingual period.
Alex says

January 2, 2015 at 6:36 am

The use of the letter ľ probably indicates that this Pater Noster is in Vlach/Aromanian.
SFReader says

January 2, 2015 at 6:38 am

I wonder if Tata is related to Tuota in Dalmatian.

Is it a Slavic borrowing?
AJP Eggedosis says

January 2, 2015 at 6:45 am

You mean Norwegians can’t tell their own language from Danish? …Norwegians are not at this level of confusion yet, but getting close enough.

Haha. No. There’s nothing to be confused about. Bokmål + written Danish are very similar, but there’s never going to be the faintest doubt about which language someone’s speaking.
David Marjanović says

January 2, 2015 at 7:08 am

I’d have a hard time telling Corsican and Sardinian apart

They’re very different. Sardinian is the sister-group to all other Romance languages together, with su and sa as definite articles, and with no changes to the Latin vowel system except the loss of length. Corsican is very close to Tuscan and thus to Standard Italian; this is slightly obscured here by spelling out the raddoppiamento (s)sintattico, the perfectly standard lengthening of word-initial consonants when the previous word ends in a vowel, which produces Standard de + la = della, in + la = nella, a + la = alla, e + pur = eppur and so on.

I once met a Ukrainian who spoke Russian, but didn’t know it was Russian and mistakenly believed he was speaking Ukrainian.

This blows my mind.
SFReader says

January 2, 2015 at 9:06 am

Early 20th century censuses in Western Belarus discovered that the majority of the rural population didn’t know what nationality they were and, when asked, replied simply that they were “tutajszij” (from here).

Of course, they couldn’t name their own language either and simply called it “mowa” (language).
languagehat says

January 2, 2015 at 10:31 am

As seen here in 2005.
John Cowan says

January 2, 2015 at 2:56 pm

never going to be the faintest doubt

No, indeed. All you have to do is determine if the potato is in or out.

su and sa as definite articles

The only other Romance variety like this is Algherese Catalan, where there is a contrast between default ipse-based articles and specialized ille-based articles. In particular, la mort is ‘Death’ (abstract or personified), whereas sa mort is ‘the death (of which we were speaking)’. Since Alghero is on Sardinia, this isn’t too surprising.

However, Gallurese and Sassarese, which are spoken in northern Sardinia, are Corsican/Tuscan with a massive Sardinian substrate/adstrate, and they are often called “Sardinian”, which confuses the issue.
Rodger C says

January 2, 2015 at 4:24 pm

There are other dialects of Catalan that use es, sa.
Etienne says

January 2, 2015 at 7:07 pm

First: */felike annu novu/ to all.

SFReader: you’re half-right: Dalmatian TUOTA and Romanian TATA are related, but this is no loanword, but an attested Latin term (TATTA) which was originally confined to children’s language.

John Cowan: Rodger C. is quite correct, reflexes of IPSUM, IPSAM are used as the definite article in other dialects of Catalan, and there is at least one “dialect” of Italian which may still do so, or perhaps did so well into the twentieth century.

John Cowan, David: making the whole thing even more confusing is the fact that Gallurese and Sassarese are much closer to Southern lects of Corsican spoken in Corsica itself than the latter are to lects of Corsican spoken further North. The internal diversity of Corsican is remarkable, has only been partly lost through the influence of Pisan Tuscan over the past thousand years or so, and has only been documented recently. It is quite ancient, and indeed in some ways Corsica is more diverse, linguistically, than the rest of Romance-speaking Europe combined. Really.

This is quite a contrast to the lack of diversity of “core Sardinian” (Campidanese + Logudorese: both of these dialects can be derived quite unproblematically from the Old Sardinian language attested in Medieval texts and Charters). My own hunch (for whatever that is worth) is that this difference is due to Corsica having been linguistically romanized at a much earlier date than Sardinia (i.e. pre-Old Sardinian must have been spoken either outside Sardinia, or on a very small part of Sardinia, at a time when the dominant Romance variety of Corsica had already split into distinct forms ancestral to the present-day dialects).
David Marjanović says

January 2, 2015 at 8:45 pm

The internal diversity of Corsican is remarkable, has only been partly lost through the influence of Pisan Tuscan over the past thousand years or so, and has only been documented recently. It is quite ancient, and indeed in some ways Corsica is more diverse, linguistically, than the rest of Romance-speaking Europe combined. Really.

That’s… not surprising, but I had no idea of it. Thank you!

Corsica having been linguistically romanized at a much earlier date than Sardinia

Also faster and… more thoroughly, I suppose. There are things (used to be in Wikipedia, can’t find them anymore) that Sardinian has in common with Basque and apparently nothing else, and on top of that Sardinian has things like the unique prefix ti- on some animal names of Latin origin… fascinating stuff.
languagehat says

January 2, 2015 at 9:44 pm

I had no idea of it. Thank you!

Same here!
SFReader says

January 3, 2015 at 3:25 am

— Sardinian has in common with Basque

Sardinians and the Basques are descendants of the first wave of neolithic colonization from Asia Minor circa 6000 BC.

The Basques even managed to keep their language intact
John Cowan says

January 3, 2015 at 3:56 pm

indeed in some ways Corsica is more diverse, linguistically, than the rest of Romance-speaking Europe combined

How fortunate we are, then, to know that the Romance languages did not arise on Corsica.
David Marjanović says

January 3, 2015 at 6:53 pm

Sardinians and the Basques are descendants of the first wave of neolithic colonization from Asia Minor circa 6000 BC.

Yep.

The Basques even managed to keep their language intact

Looks like it (a thick layer of Latin and Romance loanwords excepted, and there may be Celtic ones, too).

How fortunate we are, then, to know that the Romance languages did not arise on Corsica.

True, and a very good point.

There are cases like this in biology as well. People used to wonder for a long time whether parrots come from South America or from Australia, their two areas of greatest current diversity… in the last 20 years, the fossil record has spoken up and shown that parrots are fundamentally European.
SFReader says

January 5, 2015 at 4:30 am

Classic quiz question for confused Slavic studies majors.

Which of the following is the self-name of the Slovak language?

a) slovenčina
b) slovenščina
c) slovienčina
d) slovenština
e) slovinčtina
f) slovinština
g) slovinščina
h) slovaščina
i) słowakšćina
j) słowjenšćina
k) slovanština
l) słowiński
m) słowiański
o) słowacki
p) słowiański
q) słoweński
r) slavenski
s) slovački
t) slovenački
u) słowińska
v) słowakska
w) słowjańska
x) słowjeńska
y) słowińsczi
z) słowacczi

and two extra choices for the most erudite students

27) słowiańsczi
28) sloweńsczi
John Cowan says

January 5, 2015 at 10:52 am

fundamentally European

Even if the oldest fossils are European, that’s not necessarily diagnostic either; that might be an artefact of chance preservation. In any case, Wikipedia can’t make up its mind whether the European parrots are true psittacids or only psittaciforms.
Etienne says

January 5, 2015 at 2:57 pm

John Cowan, David, SFReader, Hat, and whoever else might be interested:

1-Considering the extreme internal diversity of Corsican and how unlike the rest of Romance Sardinian is, a linguist knowing nothing of the history of the Roman Empire but familiar with basic principles of linguistic geography might well correctly deduce that the Urheimat was in the middle of the Italian peninsula, not least since among Continental Romance varieties (=Romance minus Sardinian and Corsican) it is those of the Italian peninsula which exhibit the most diversity.

2-Sardinian and Basque do indeed share a number of words, but it most certainly does not follow therefrom that the pre-Romance language of Sardinia (=Palaeo-Sardinian, or Proto-Sardinian, as it is sometimes called) was Basque-like: to repeat a point first made by Luis Michelena (or Mitxelena, to use the Basque orthography), Basque pre-Romance words shared with Romance varieties needn’t be ancient, indigenous Basque words: they could have entered Basque via Latin.

3-David, Hat: you’re both quite welcome.
John Cowan says

January 5, 2015 at 4:00 pm

And then again your linguist might decide that Proto-Romance originated on the islands and then spread to the peninsula and half Europe.
David Marjanović says

January 5, 2015 at 8:26 pm

Which of the following is the self-name of the Slovak language?

Either a) or b). And the other is the name of the Slovene language for itself.

…I’m pretty sure a) is Slovak.

In any case, Wikipedia can’t make up its mind whether the European parrots are true psittacids or only psittaciforms.

What it actually says is the point: stem-group psittaciforms are known only from the northern continents, so that’s where the origin of the crown-group must lie. The crown-group, by definition, consists of the last common ancestor of all extant psittaciforms plus all descendants of that ancestor; the extant psittaciforms are all parrots proper, cockatoos or lories. The stem-group is all the rest. The Wikipedia article calls the crown-group “modern parrots” and explains that its oldest known fossils are from Europe as well, though fossils from New Zealand are only a little younger.

I was wrong about the restriction of stem-group parrots to Europe, though; I had forgotten about the one from the Green River Formation (a bit older than Messel or the London Clay, IIRC) and didn’t know about the one from India. “Fundamentally Laurasian” it is, then… or maybe “pantropical”.
SFReader says

January 5, 2015 at 9:35 pm

-…I’m pretty sure a) is Slovak.

Correct!

I did not include choices for slovenský and slovenski, which are further self-designations of Slovak and Slovene respectively.

All of the remaining choices were taken from words for Slovak, Slovene and Slavonic in every Latin-script Slavic language I could find on Wikipedia.
Il vergognoso says

April 16, 2015 at 4:25 pm

The internal diversity of Corsican is remarkable, has only been partly lost through the influence of Pisan Tuscan over the past thousand years or so, and has only been documented recently. It is quite ancient, and indeed in some ways Corsica is more diverse, linguistically, than the rest of Romance-speaking Europe combined. Really.

While Étienne is here, I would much like to hear more about this one. Itched over this for all the four months between, really.
languagehat says

April 16, 2015 at 4:38 pm

Seconded. Tell us, Étienne!
Etienne says

April 16, 2015 at 6:29 pm

Okay, I haven’t the relevant monograph on Corsican dialectology at hand, so details will have to wait, assuming anyone is interested. Here’s the gist of it: Romance-speaking Europe can be split into three “macro-zones” on the basis of the fate of Latin stressed vowels. Leaving aside diphthongs, Latin had ten vowel phonemes: long and short /a/ /e/ /o/ /i/ /u/. The three macro-zones treat these vowels thus:

1-In Sardinian and Southern Corsican, plus a few isolated areas of Southern Italy, Latin length distinctions were lost, with the qualities remaining intact: thus, Latin /i/ and /i:/ merge as /i/, /u/ and /u:/ as /u/, and so on.

2-In most of the Italian peninsula and everywhere (minus Sardinia and parts of Corsica, see below) further West (France, Spain, Portugal) /i/ and /e:/, on the one hand, and /u/ and /o:/, on the other, merge and (typically) become mid-high vowels, distinct from the reflexes of high long vowels (which lose their length but not their quality: /i:/ becomes /i/, /u:/ /u/) on the one hand and the mid-low reflexes of Latin short /e/ and /o/ on the other. Latin /a/ and /a:/ merge as /a/.

Naturally subsequent changes have affected different languages: Spanish, for example, turned its mid-low vowel phonemes into diphthongs (/je/ and /we/), and thus ended up with five stressed vowel phonemes only.

3-In Romanian (defined broadly, i.e. including Aromanian, Megleno-Romanian and Istro-Romanian) and a few isolated parts of Southern Italy we find a sort of “compromise” system, where the back vowels have changed in the same way as in the first macro-zone (/u/ and /u:/ merge as /u/, /o:/ and /o/ as /o/) and the front vowels in the same way as in the second (i.e. short /i/ and /e:/ are merged as a phoneme separate from the reflexes of /i:/ and short /e/). Here too Latin /a/ and /a:/ merge as /a/.

So: outside of Corsica, the above three schemata account for ALL Romance varieties.

It turns out that much of Northern Corsican is aligned with the second macro-zone, and Southern Corsican with the first. BUT there exist at least two other groups of Corsican dialects, EACH OF WHICH exhibits a pattern of merger of Latin stressed vowel phonemes that is utterly alien to any Romance variety outside Corsica.

And it is in this sense that I wrote that Corsican shows more internal diversity than the rest of Romance combined.
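
The three macro-zone schemata can be laid out as plain lookup tables. This is a minimal sketch, not a phonological model: the Latin vowels are written as in the comment (/i:/ for long, /i/ for short), while the symbols ɛ and ɔ for the mid-low reflexes and all the Python names are notational assumptions made here.

```python
from collections import defaultdict

# The ten Latin stressed vowels, long ("i:") and short ("i").
LATIN = ["i:", "i", "e:", "e", "a:", "a", "o:", "o", "u:", "u"]

# Reflex of each Latin vowel, in the same order as LATIN.
SARDINIAN = dict(zip(LATIN, ["i", "i", "e", "e", "a", "a", "o", "o", "u", "u"]))
ITALO_WESTERN = dict(zip(LATIN, ["i", "e", "e", "ɛ", "a", "a", "o", "ɔ", "u", "o"]))
ROMANIAN = dict(zip(LATIN, ["i", "e", "e", "ɛ", "a", "a", "o", "o", "u", "u"]))


def merger_classes(system):
    """Group the Latin vowels by the reflex they end up as."""
    classes = defaultdict(list)
    for latin, reflex in system.items():
        classes[reflex].append(latin)
    return dict(classes)


for name, system in [("Sardinian", SARDINIAN),
                     ("Italo-Western", ITALO_WESTERN),
                     ("Romanian", ROMANIAN)]:
    classes = merger_classes(system)
    print(name, len(classes), classes)
```

Counting the classes gives the size of each resulting stressed-vowel inventory, before any of the later language-specific changes such as the Spanish diphthongs.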
languagehat says

April 16, 2015 at 8:00 pm

Fascinating. Thanks, Étienne!
Il vergognoso says

April 17, 2015 at 1:50 am

Wonderful!
David Marjanović says

April 19, 2015 at 7:19 pm

I’m having what Kids Today call a nerdgasm.
John Cowan says

August 26, 2015 at 4:11 pm

So what are the vowel mergers in the remaining two Corsican dialect groups, or where is a description to be found?
David Marjanović says

August 26, 2015 at 7:29 pm

*jumps up and down in anticipation*
Etienne says

August 28, 2015 at 12:24 am

Well, since you asked…for what follows I am indebted to Marie-José Dalbera-Stefanaggi (2002), LA LANGUE CORSE. Paris, Presses Universitaires de France.

You might want to take notes.

Along with the two types of vowel merger I had sketched above (the Sardinian-like type in Southernmost Corsica, and the Italo-western-like one in most of northern and central Corsica), there exists, sandwiched between the two, what is called (after the major river of the area) the Taravo-type vowel system, which is unique within Romance. Classical Latin long /i/ and long /u/, and long and short /a/, all remain unchanged in terms of vowel quality, as is the case in the other two macro-zones. However, Classical short /i/ and short /u/ both fail to merge with any other vowel, and have as their present day reflexes a mid-low front and back vowel (respectively). Classical long and short /e/ have merged as mid-high /e/, and Classical long and short /o/ have merged as mid-high /o/. You thus have seven stressed vowel phonemes, as in the Italo-western system, but with different historical origins. Crucially, it is impossible to derive the Taravo system from either the Sardinian or the Italo-western system. All three must go back to the Classical Latin system.

The fourth type is found in the northernmost tip of Corsica (known as Cap Corse), but unlike the Taravo-type system, this one COULD be derived from the Italo-western type.

This system is very similar to what is known as the Sicilian vowel system. In both, Classical long and short /a/ merge as /a/, long /i/ remains /i/, and long /u/ remains /u/. All the remaining back vowels (Classical long and short /o/ and short /u/) merge as /o/.

In Sicilian you find a parallel treatment of front vowels, where short /i/, long and short /e/, all merge together as /e/. Sicilian thus has a 5-vowel system. The Cap Corse system, on the other hand, merges the same three vowels as /e/, BUT, unlike Sicilian, in some positions it keeps short Classical /e/ as a separate phoneme (a mid-low front vowel), yielding a 6-vowel system (with, if you have been paying attention, the same vowel phonemes found in Proto-Romanian…but with different historical origins).

Notice, incidentally, that the Cap Corse system, if it does not go back directly to Classical Latin, can only go back to Italo-western, as neither the Sardinian nor the Taravo systems preserves a separate reflex of Classical short /e/. The Sicilian system, on the other hand, could be derived from either the Italo-western or the Taravo system.

And finally, there is something else about Corsican vowels which is special and quite possibly unique within Romance. In the Southern area, with Sardinian-type vowel systems, you actually find a 6-vowel system in Corsica, whereas in Sardinia you find a 5-vowel system. Whence the difference, you ask? Well, it’s because of the fate of the Latin diphthong /au/. In Sardinia it merged with /a/, in most of Corsica and in Romance varieties where it became a monophthong it typically merged with mid-low or mid-high /o/, but in the Corsican South, with Sardinian-type vowel mergers, /au/ became mid-LOW /o/, phonologically distinct from the mid-high /o/ which goes back to Classical long and short /o/.

Thus, synchronically, in terms of vowel phonemes, you can divide Corsica into three zones. The far South, with its Sardinia-like vowel merger and Classical /au/ becoming a separate monophthong, has a 6-vowel system, with more back than front vowel phonemes. Cap Corse, in the far north, has a 6-vowel system, but with more front than back vowel phonemes. This makes the two systems mirror images of one another…and in between you’ve most of Corsica, with symmetrical 7-vowel systems going back to either Italo-western- or Taravo-type diachrony.
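
For anyone who wants to check the phoneme counts, here is a rough Python sketch that tabulates the mergers described above. The zone labels, the dictionary layout and the "e/ɛ" notation for a position-dependent split are assumptions of this sketch, not Dalbera-Stefanaggi's notation. The Italo-western row follows the usual textbook pattern, and /au/ is listed only for the far South, where it yields a phoneme of its own.

```python
# Stressed-vowel reflexes per zone, as described in the comment above.
# Keys are Classical Latin vowels; "e/ɛ" marks two position-dependent reflexes.
SYSTEMS = {
    "Far South": {  # Sardinian-type mergers, plus /au/ > mid-low /ɔ/
        "ī": "i", "ĭ": "i", "ē": "e", "ĕ": "e", "ā": "a", "ă": "a",
        "ŏ": "o", "ō": "o", "ŭ": "u", "ū": "u", "au": "ɔ",
    },
    "Italo-western": {  # usual textbook pattern (assumed here)
        "ī": "i", "ĭ": "e", "ē": "e", "ĕ": "ɛ", "ā": "a", "ă": "a",
        "ŏ": "ɔ", "ō": "o", "ŭ": "o", "ū": "u",
    },
    "Taravo": {  # short /i/ and /u/ stay separate as mid-low vowels
        "ī": "i", "ĭ": "ɛ", "ē": "e", "ĕ": "e", "ā": "a", "ă": "a",
        "ŏ": "o", "ō": "o", "ŭ": "ɔ", "ū": "u",
    },
    "Cap Corse": {  # Sicilian-like, but short /e/ survives in some positions
        "ī": "i", "ĭ": "e", "ē": "e", "ĕ": "e/ɛ", "ā": "a", "ă": "a",
        "ŏ": "o", "ō": "o", "ŭ": "o", "ū": "u",
    },
}

FRONT = {"i", "e", "ɛ"}
BACK = {"u", "o", "ɔ"}


def inventory(system):
    """Return the set of vowel phonemes that a merger table produces."""
    phonemes = set()
    for reflexes in system.values():
        phonemes.update(reflexes.split("/"))
    return phonemes


def summary(system):
    """Return (total phonemes, front vowels, back vowels)."""
    vowels = inventory(system)
    return len(vowels), len(vowels & FRONT), len(vowels & BACK)


if __name__ == "__main__":
    for zone, system in SYSTEMS.items():
        total, front, back = summary(system)
        print(f"{zone}: {total} vowels ({front} front, {back} back)")
```

On these assumptions the far South comes out with six vowels (two front, three back), Cap Corse with six (three front, two back), and the Italo-western and Taravo tables with the same seven phonemes reached by different mergers.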

*Sigh* Apologies, I really should have kept this shorter…
George Gibbard says

August 28, 2015 at 12:43 am

I would think you should have kept it longer with examples, but thanks.
George Gibbard says

August 28, 2015 at 12:55 am

Etienne: some of us are willing to pay small amounts for such wisdom, but we very much like to read it free of charge, so thank you, it’s fascinating.
minus273 says

August 28, 2015 at 2:51 am

Absolutely fascinating stuff. Going to read the Dalbera-Stefanaggi book.
languagehat says

August 28, 2015 at 9:30 am

I agree with everyone else. Don’t apologize again or we shall have to chastise you severely.
marie-lucie says

August 28, 2015 at 11:06 am

Je suis entièrement d’accord! Merci!
David Marjanović says

August 28, 2015 at 7:10 pm

What everyone is saying. 🙂

Question: is there any Romance vowel system where the long and short /a/ of Classical Latin have not merged with each other?
Etienne says

August 28, 2015 at 10:38 pm

In answer to David’s question: no, there is no attested Romance variety which failed to merge Classical long and short /a/. This is unsurprising, really. The pattern of mergers points to a general lowering of the quality of short vowels when compared to their long counterparts, and since /a/ is as low a vowel as a human mouth can produce, such a trend would have left short /a/ untouched. Once vowel quantity was lost, phonemic merger between Classical short and long /a/ was all but inevitable.

There is good evidence that long and short /a/ still existed as separate phonemes in the spoken Latin of Roman Britain. This is because Welsh/Breton/Cornish loanwords from Latin maintain separate reflexes of Latin stressed long and short /a/ (things are murkier in unstressed position). Thus, Welsh has BARF from Latin BARBA (beard) and PECHOD from Latin PECCATUM (sin). Latin BARBA has a short /a/ in its stressed (initial) syllable, Latin PECCATUM has a long stressed /a/, and in Welsh /a/ is the reflex of Latin stressed short /a/, whereas /o/ or /au/ are the reflexes of Latin stressed long /a/.
David Marjanović says

August 29, 2015 at 3:05 pm

So, while Classical Latin had a ten-monophthong system which may have been quite similar to Standard German
/a aː ɛ eː ɪ iː ɔ oː ʊ uː/,
Proto-Romance appears to have had only nine monophthongs
/a ɛ e ɪ i ɔ o ʊ u/,
and Romance never reached Britain (before 1066), or if it did, Britain was already full of Christians by then.
languagehat says

August 29, 2015 at 3:25 pm

There are those who say romance never reached Britain at all…
David Marjanović says

August 29, 2015 at 3:57 pm

Ouch. There’s a nice one in one of the middle seasons of Blackadder, though, and then there’s always the Doctor. 🙂
John Cowan says

August 30, 2015 at 2:03 pm

/a aː ɛ eː ɪ iː ɔ oː ʊ uː/ is how I was taught to pronounce Latin in 1971-75, though the first two often came out merged (and backed) anyway.
Etienne says

September 1, 2015 at 5:42 pm

David: or the Classical Latin ten-voyel system may have originally been a system where differences in length were unaccompanied by any difference in quality: (a a: e e: i i: o o: u u:).
David Marjanović says

September 1, 2015 at 6:47 pm

At some point (perhaps long before Latin, perhaps not), yes, except that [ɛ ɛː ɔ ɔː] are much likelier than [e eː o oː] from first principles as well as comparative IE evidence.
David Marjanović says

September 1, 2015 at 6:49 pm

Hm, actually, given that Celtic shifted the long ones all the way to [iː] and [uː], maybe the shift from [ɛː ɔː] to [eː oː] already happened on the way to Proto-Italo-Celtic. ^_^
George Gibbard says

September 1, 2015 at 7:39 pm

I don’t know about “first principles” — one can find many languages where long and short vowels are not said to differ in quality — but what JC was taught has abundant support from Romance languages.

But I was recently thinking about this: what about a short vowel before a hiatus in Latin, should it be close or open? I read somewhere that it should be close, a wrinkle on JC’s system, but I can’t think where I read this. Now: there is actually reason to think that /ĕ/ should originally be open before hiatus, namely Latin Dĕum/Dĕō ‘God’ > French Dieu, so we have an instance of accented Latin /ĕ/ diphthongizing in an open syllable as if it had been *[ɛ], cf. Vulgar Latin *caelōs ‘heavens’ > *[kɛːloːs] > French cieux.

On the other hand there is evidence to favor the idea that *ĕ was close before a hiatus, namely that French Dieu [djø], cieux [sjø] have unlike *bellōs > beaux [bo]. So does this mean that whoever I read was right about *[ɛ] > *[e] before hiatus, but thought this was an early rule whereas in fact it postdates *[ɛ(ː)] > *[iɛ]?
George Gibbard says

September 1, 2015 at 7:40 pm

Sorry, in haste I misunderstood the “first principles” comment, and on rereading I don’t understand it at all.
George Gibbard says

September 1, 2015 at 7:43 pm

angle-bracket problem: “French Dieu [djø], cieux [sjø] have unlike *bellōs > beaux [bo]”
George Gibbard says

September 1, 2015 at 7:49 pm

still an angle-bracket problem. I refer to the fact that the two cases are spelled eu and pronounced with [ø], while the other word has eau and is pronounced with [o].
George Gibbard says

September 1, 2015 at 8:00 pm

It occurs to me that French Dieu may be by way of *Diós as in Castilian, but this will not work for Romanian Dumnezeu < *-dieu < Dɔmɪnɛ Dɛus. Does Portuguese Deus have /ɛ/ or /e/?
George Gibbard says

September 1, 2015 at 8:15 pm

That is, DM, I quickly misread your comment on the assumption that you were defending JC’s Latin vowel system ‘on first assumptions’; sorry for that. Then I saw you weren’t doing that, but making a different point, which I realized I don’t understand.
John Cowan says

September 1, 2015 at 8:38 pm

He’s saying that five-vowel systems are worldwide more likely to be [a ɛ i ɔ u] like Polish than [a e i o u] like Spanish, vowel length or no vowel length.
SFReader says

September 1, 2015 at 8:41 pm

Romanian Dumnezeu and Polish “Pan Bóg” always struck me as an absurdly formal way to address deity.

Mr. God?
George Gibbard says

September 1, 2015 at 9:09 pm

> He’s saying that five-vowel systems are worldwide more likely to be [a ɛ i ɔ u] like Polish than [a e i o u] like Spanish, vowel length or no vowel length.

Aha. What’s the evidence?

In Mexican Spanish I think the vowels in question are often closer to [ɛ ɔ] than [e o], but really in between. It doesn’t help that the IPA advises that, if the language doesn’t distinguish close-mid from open-mid vowels, one should say [e o], which makes you more likely to be right, but more unlikely that there is producible evidence.
marie-lucie says

September 1, 2015 at 9:35 pm

GG: dieu(x), cieux vs. beau(x)

In at least one dialect I am aware of (part of Normandy), eau has turned into (still monosyllabic) iau [jo], so biau, viau, de l’iau, etc for beau, veau, de l’eau and similar words.
David Marjanović says

September 2, 2015 at 4:41 am

He’s saying that five-vowel systems are worldwide more likely to be [a ɛ i ɔ u] like Polish than [a e i o u] like Spanish, vowel length or no vowel length.

Yes; on top of that, while mid vowels (halfway between [ɛ ɔ] and [e o]) are apparently widespread in Spanish, I think actual [e o] are rare there, while actual [ɛ ɔ] are not.

Mr. God?

Lord God. Боже Господи.

It doesn’t help that the IPA advises that, if the language doesn’t distinguish close-mid from open-mid vowels, one should say [e o], which makes you more likely to be right, but more unlikely that there is producible evidence.

What’s really going on here is that 1) the earliest version of the IPA was developed for English and French; 2) the IPA is meant to look pretty when printed, so unmodified Latin letters are supposed to be used for as long as you can get away with them.
Trond Engen says

September 2, 2015 at 7:02 pm

Herregud!
Etienne says

September 9, 2015 at 7:46 pm

David: actually, Proto-Indo-European /o:/ only becomes Proto-Celtic /u:/ word-finally: in other positions it becomes /a:/. So I’m afraid the evidence doesn’t point unequivocally to pre-Proto-Celtic /e:/ having been higher than /e/, alas…
David Marjanović says

September 12, 2015 at 9:25 am

Oh. Thanks.

So I’m afraid the evidence doesn’t point unequivocally to pre-Proto-Celtic /e:/ having been higher than /e/, alas…

It still does, just more weakly so, because the usual parallel for /o:/ is lacking.

This invites comparison with Greek, where η has gone all the way to /i/, while ω is /o/ today, not /u/.
David Marjanović says

November 25, 2015 at 7:11 pm

This interesting paper on the origin of the Romance vowel systems mentions two more systems (“Marginal area”, “Outpost”) that are found in small parts of Italy; but one could be derived from the Italo-Western type and the other from the Eastern type by mergers. The Taravo system is not mentioned under any name. Corsica isn’t mentioned either, except the south with its Sardinian-type system.
John Cowan says

January 9, 2016 at 2:21 pm

I fed Etienne’s five texts through GT, which identified (and garbled) them as Italian, Italian, Spanish (I thought it might pick Catalan), Italian, and Finnish (!) respectively. Note that Google Translate’s language identification algorithm is better (and more computationally expensive) than the Google language identification API, which looks (I think) only at letter frequencies and maybe a few very common words.
Etienne says

January 9, 2016 at 3:29 pm

John Cowan: the five Romance texts were shared with the hattery by Y, not by me.
Y says

January 9, 2016 at 3:50 pm

I look forward to seeing a more challenging set of such texts, without resorting to a bunch of closely related dialects.
John Cowan says

February 19, 2018 at 7:46 pm

Lord God.

Here lies Martin Elginbrod.
Have mercy on my soul, Lord God,
As I would do, were I Lord God,
And thou wert Martin Elginbrod.
John Cowan says

March 8, 2019 at 8:55 pm

the Classical Latin ten-voyel system

Voyel is a lovely compromise between vowel and voyelle. Which reminds me that whereas the absent protagonist in La disparition is named Anton Voyl, his English counterpart in A Void is named Anton Vowl.
Bathrobe says

March 9, 2019 at 12:04 am

Maybe because they had such crappy quizzes, I clicked through to the quiz and found this:

The domain anotherquiz.com is for sale. To purchase, call BuyDomains.com at 781-373-6841 or 844-896-729
J Pystynen says

March 10, 2019 at 6:03 pm

Voyel is a lovely compromise

Pronounced [vɔɥəl], right?
David Marjanović says

October 30, 2021 at 4:47 pm

Me on Sept. 1, 2015:

Hm, actually, given that Celtic shifted the long ones all the way to [iː] and [uː], maybe the [Latin] shift from [ɛː ɔː] to [eː oː] already happened on the way to Proto-Italo-Celtic. ^_^

Nope, the history of Latin is in the way: the shift of oe as in Poenulus to ū must have gone through a stage where it was [oː], and that must have happened before ō shifted from [ɔː] to [oː] because they didn’t merge.

Dutch has done the exact same thing more recently.

Long vowels can take centuries to rise. They can even put it off till it’s too late. OHG and MHG had a fairly bizarre mid-vowel inventory of /ɛ e ɛː o ɔː/ and eventually /ø œː/ without long mid-high vowels and without short /ɔ œ/. Standard German, or rather East Central German, has made it saner, but most of Bavarian has made almost no changes to this except losing vowel length entirely and messing with vowel rounding. Due to pressure from the chain shift of /a aː/ to /ɒ/ and /æ æː/ (among other things) to a new /a/, the old /ɔː/ has merged into /o/, but the umlaut products of the former /o ɔː/ remain distinct from each other as /e ɛ/:

groß, größer /gros grɛsːɐ/
Loch, Löcher /lox lexːɐ/
January First-of-May says

October 30, 2021 at 7:12 pm

Maybe we should make a harder quiz ourselves?

We’ve finally gotten one in Sara’s Family.
