Sara’s Family.

From John Cowan:

1. La famille de Sara est d’origine italienne.
2. La famiglia di Sara è di origine italiana.
3. La familia de Sara es de origen italiano.
4. A família de Sara é de origem italiana.
5. La famiya de Sara es de orijin italyana.
6. A familia de Sara é de orixe italiana.
7. A familia de Sara ye d’orixen italiano.
8. La família de Sara és d’origen italià.
9. La familha de Sara es d’origina italiana.
10. A famiglia di Sara hè di origine italiana.
11. Sa famìlia de Sara est de orìgine italiana.
12. Familia Sarei este de origine italiană.
13. Familia Sarae originis italicae est.

JC adds: “Some are easy, some quite tough, at least for me.” I got 1-4, 8, and 12-13 at first glance; the rest are tougher.

Addendum. It’s probably best to assume there will be spoilers in the comments, if you don’t want any help figuring it out.


  1. Why is origin feminine in most languages above but masculine in Spanish and whatever no. 7 is? I don’t know any Latin but presumably ‘origin’ is feminine.

    Very grateful for your elucidations.

  2. Good question. I’m going to assume plain old analogy — if it ends in a consonant, it looks like it might be masculine — but maybe somebody knows more. (It’s also masculine in 8.)

  3. I’m going to give it a try:

    1. French
    2. Italian
    3. Spanish
    4. Portuguese
    5. Ladino?
    6. Galician
    7. Asturian
    8. Catalan
    9. Occitan?
    10. ??
    11. Sardinian?
    12. Romanian?
    13. Latin

  4. David Eddyshaw says:

    1-4 as above (the easy ones); I agree 8 is Catalan; 12 is indeed Romanian, 13 Latin. Galician is said to be like Portuguese (according to the only speaker I know) so I expect J is right about 6 too.

    11 has to be Sard because of the article. No idea about the others.

  5. Trond Engen says:

    I immediately took 1-3 and 13. Then I put Romanian on 12 for being different and Portuguese on 4 for the -m. Galician followed on 6 for the same reason as David and Occitan on 9 for the sequence -lh-. Then I guessed Catalan on 8 for the -à. 7 is obviously close to Galician/Portuguese on the Spanish side, but I couldn’t guess which intermediate dialect or colonial variety it might be. 10 is somewhere in the Italian sub-continuum, and not far from the Tuscan center. Corsican? I had no idea what to make out of 5 or 11.

  6. David Marjanović says:

    I’m pretty sure origo is masculine, but that’s highly irregular in Latin, where almost all other n-stems were feminine: the masculine ones are ordo, sermo, personal names like Cato and Nero, and that’s pretty much it.

    5 has to be Ladino because of the spelling (i.e. Atatürk orthography, like YPG Kurdish) – which incidentally allows us to see the yeísmo.

    I agree that 6 looks more Galician than 7; 7 could well be Asturian.

    10 is some “Italian” “dialect”, not too far north.

    11 is obviously Sardinian: article sa, unreduced est.

  7. Roberto Batisti says:

    Ok, without looking at the comments…:

    1 – French, 2 – Italian, 3 – Spanish, 4 – Portuguese, 5 – Judeo-Spanish (?), 6 – Galician, 7 – ?? (some Ibero-Romance variety I guess), 8 – Catalan, 9 – Occitan, 10 – Corsican??, 11 – Sardinian, 12 – Rumanian, 13 – Latin

  8. Giacomo Ponzetto says:

    Aren’t some of these missing a definite article? Not sure if it’s a rule, but I’m definitely used to hearing that from native speakers of Portuguese (4. A família da Sara) and Catalan (8. La família de la Sara).

    @David: Italian is rather conservative with its genders (un ordine, un sermone) and this seems to work for origin (un’origine) as well: principii autem nulla est origo.

    I have full confidence in the collective reconstruction above (which I wouldn’t have been capable of), but I find the systematic absence of peninsular “Italian dialects” vaguely disturbing.

  9. John Cowan says:

    Well, not surprisingly to anyone who knows the Hattics, everyone’s assertions, conjectures, and approximate conjectures are correct, with the exception of the very difficult #7. In fact it is (mountain) Aragonese, not Asturian, but they are quite similar.

    Some features of Aragonese not shown here include the preservation of Latin /ll/ and (some) intervocalic voiceless stops unchanged. An example of the latter is cleta ‘trellis, hurdle’; cf. Spanish cleda, French claie. I suppose this means this and similar words had irregular gemination at some point. The articles are also curious: they are o/a/os/as as in Portuguese when the preceding word ends in a consonant, but ro/ra/ros/ras when it ends in a vowel (clearcut l/r interchange here).

    Asturian, on the other hand, is the only Western Romance language that has retained three genders in the adjective, ending in -u, -a, -o respectively. However, some neuter nouns take the masculine article, some the feminine: el fierro vieyo/*vieyu, la lleche frío/*fría: semantically they are mass, collective, or abstract nouns. Deadjectival abstracts also have neuter adjective agreement, but take the special article lo, as in Spanish.

    I’m pretty sure origo is masculine

    Wikt, Lewis & Short, and Gaffiot’s Latin/French dictionary all agree that it is feminine, and L & S quote Cicero’s De Re Publica: principii nulla est origo ‘there is no beginning to the beginning’.

    Update: Huh. Giacomo and I quoted the same Latin tag but seemingly drew opposite conclusions from it! Or perhaps I am simply confused.

    Update 2: It would be cool to get 12-13 speakers together, synchronize them with a click track, and get them to pronounce the sentence and then overlay them. (The Latin would have to be rearranged to SVO order, but that would be perfectly grammatical: Caesar is big on SOV, but about 90% of Livy’s sentences are SVO.)

  10. As I understand it, Sara is from Slovenia. Or Slovakia.

  11. Or Slobbovia.

  12. Finländare says:

    Pff. How good is your ability to recognize Finnic languages?

    1. Sara ema ja isa tulevad Itaaliast.
    2. Saran mutsi ja faija tulee Italiast.
    3. Saran äiti ja isä tulevat Italiasta.
    4. Saran muamo da tuatto tullah Italiaspäi.

  13. David Marjanović says:

    Nullam novi originem!

    1 seems Estonian: lack of vowel harmony, -d.

    2 I have no idea.

    3 is probably (Standard) Finnish.

    4… with ua and -h, is it Karelian?

  14. Lars (not the original one) says:

    @Finländare: I’d say 1 is Estonian, 3 is Finnish, 4 looks like Veps (the one with even more cases). In (2), mutsi and faija look like Germanic loans… and it also seems to have lost subject/verb agreement. Interesting, but I’ve no idea what it could be.

  15. Roberto Batisti says:

    @ Giacomo Ponzetto: you’re right, let’s do one with Italian ‘dialects’! It’s going to be fun (and not much harder for non-experts in Italian dialectology than the Finnic one is for non-Finnologists, I think…).

  16. PlasticPaddy says:

    Re Finnish 2: the phrase mutsi ja faija is used in a song by an artist from what is now Kouvola. Alternatively, 2 might be Helsinki slang, which substitutes Swedish for familiar terms, according to Wikipedia.

  17. 2. Informal spoken Finnish. I can explain why (if anybody’s interested) but I must admit to cheating.

  18. Trond Engen says:

    Alex K. (if anybody’s interested)

    In this environment it’s usually safe to assume a yes to that. No matter what came before the parenthesis.

  19. Yes, do tell.

  20. De familie van Sara is van Italiaanse afkomst.

  21. about 90% of Livy’s sentences are SVO

    Are you sure about that? I just took a quick look at ab urbe condita and at least the first book is heavily SOV.

  22. I first thought (2) could be Livonian, for purely geographical reasons, but Livonian diacritics are similar to Latvian ones, not Estonian or Finnish. Bad guess.

    I googled “mutsi ja faija” and – as PlasticPaddy said – there’s this song by a Finnish pop artist. According to Suomen slangisanakirjaa (accessible online), mutsi is äiti and faija is isä.

    Now on to the verb. As Lars noted, the subject-verb agreement seems broken: in Finnish, tulee is the third-person singular indicative present form of tulla (“to come”). Now, it’s possible to say mat’ s otsom edet domoy in Russian so perhaps this trick can also work in colloquial Finnish. Also, can the elative Italiasta (from Italy) lose its final -a in fluent speech?

    The answer seems to be yes in both cases. I should have turned to Wikipedia from the start, “Colloquial Finnish” aka Puhekieli. Here are the relevant bits:

    …the third person plural suffix -vat or -vät is not used in the spoken language; instead, the third person singular form is used…


    Particularly in Helsinki, the deletion of some, but not all word-final vowels even beyond /i/ occurs sometimes…

    -sta — -st elative case, “away from the inside of”

    Which, I think, takes care of it all.

  23. Giacomo Ponzetto says:

    @Roberto: I’m afraid I don’t speak any “Italian dialect,” so I cannot help give the peninsula more representation.

    @John: sorry for being opaque. I also meant that origo is feminine in Latin, just like origine is in Italian. It’s an unserious way of recalling Latin genders, but it’s awfully tempting because it succeeds so often … Sure enough, the irregular masculines in Latin also include cardo (un cardine) and margo (un margine).

  24. David Eddyshaw says:

    De familie van Sara is van Italiaanse afkomst.

    Or to put it in a more familiar language:

    Asaaratu yaanam la da yi nɛ Italiya teŋin na.

  25. David Marjanović says:

    mat’ s otsom

    That’s also an option in Basque. (Using the comitative case in -ekin; there’s no word for “with”.)

  26. After getting the same ones as Hat at first glance and then identifying Ladino, Galician, and a couple more Iberian tongues, I was also struck by the relative neglect of Italian “dialects”. I also thought Romansh would be a good candidate for such an exercise, but didn’t see anything that looked like it.

    I think a similar exercise with Turkic languages will be much easier than it otherwise would be due to the different spelling conventions adopted by each of them.

  27. 14. La familio de Sarah estas de itala origino.

  28. @John Cowan: Good one! Any reason that it’s Sarah, not Sara?

  29. 15. La famiglia da Sara è d’origin talian.

  30. Roberto Batisti says:

    Actually, a problem with translating this phrase into regional languages is that ‘origin’ wouldn’t really sound right. Such abstract Latinisms are well integrated in the lexicon of the big, standardized national languages, but quite alien to regional varieties. Dialect speakers nowadays would probably produce a superficial adaptation, but that would be an Italianism (in a dialect of Italy; the same may apply mutatis mutandis to other countries/languages), and as such less interesting as an example of the lexical and structural peculiarities of each dialect. In the actual regional varieties it would generally be more natural to say something like “Sara’s family *comes from* Italy”.

    That said, I’ll try to provide a couple of versions later…

  31. David Eddyshaw says:

    Quite so: my rendition above is actually “Sara’s forebears came hither from Italy-land.” A more literal rendering would be hopelessly unnatural. As a matter of fact, the Latin version is not at all idiomatic either (and familia does not mean “family”, as JC will of course know very well.)

    The actual concept “family” is more culture-bound than is apparent from a modern Standard Average European perspective.

    [Presumably the real reason that there are no Italian dialect versions is that Sara’s people left all that behind them in the Old Country, and it would not be tactful to remind them of it all.]

  32. David Marjanović says:

    Any reason that it’s Sarah, not Sara?

    It’s actually either Saro or, Zamenhof forbid, Saraho.

  33. Trond Engen says:

    This is essentially a word-by-word comparison. Comparing idiomatic sentences, as in the Finnic examples, is interesting too, but it runs into all sorts of problems of taste, register and nuances of meaning in context — if that’s not the point, as in the Finnic examples.

    I made a similar (but much shorter) list once to show the systematic correspondence of wh-words and some endings of nouns and verbs between the Scandinavian languages, and I had to cheat with both default syntax, idiomatic constructions and lexical meanings to keep the word by word structure.

  34. familia does not mean “family”, as JC will of course know very well

    I think it must have. Classically it meant the slaves of a household (domus, lit. ‘house, mansion’) to the exclusion of the free persons, but to those who did not own slaves, the vast majority, it must have meant ‘household, (extended) family’, just like its Romance descendants.

    It’s actually either Saro or, Zamenhof forbid, Saraho.

    It’s actually quite common to leave proper names alone in both formal and informal Esperanto, unless they are already names with extensive multilingual variation: the 26th U.S. President is not *Teodoro Rozevelto, though the Pope is Francisko (Unua/1a). A José may be any of Josefo, Jose, Ĥose, José.

    In particular, names in -a are very unlikely to be replaced with -o. Not only is this perceived as misgendering and blurring essential distinctions (what do you do if your family contains both a Mario and a Maria?), but it sounds okay to add the accusative -n to Maria and/or plural -j to -a names, assimilating them morphologically to adjectives, all of which end in -a. Foreign names like those above are treated as indeclinable. (There are, however, gendered diminutive suffixes -ĉjo, -njo.)

    Tl;dr: I should have used Sara rather than Sarah above (chalk it up to anglophone habits, though my own sister is Sara), but not Saro, Saraho.

    (By the way, the letter ĥ [x ~ χ] is very rare in E-o and is almost always replaced, even in writing, by something else, usually k; thus Zamenhof’s ĥemio, teĥniko, ĥino became kemio, tekniko, ĉino early on, though ĥaoso is still preferred to kaoso. Only the minimal pairs eĥo/eko ‘echo/beginning’, ĉeĥo/ĉeko ‘Czech/cheque’, and ĥoro/koro/horo ‘chorus/heart/hour’ are universally preserved, though the alternative koruso is not unknown.)

  35. Allan from Iowa says:

    Earlier today I congratulated myself on identifying a person named Firdavs Something-something-ov as being from Tajikistan.

    Then this quiz brought me back down to earth.

  36. By the way, the letter ĥ [x ~ χ] is very rare in E-o and is almost always replaced, even in writing, by something else

    This is the kind of thing that makes me roll my eyes about people who think Esperanto is some kind of ideal/rational language.

  37. 16. Familia de Sara esse origine de Italia.

    I suppose 15 is Venetian; the Brazilian dialect of Venetian (with some koine effects plus of course the Portuguese superstrate) spoken in parts of the state of Rio Grande do Sul by those d’ origin talian is called Talian.

  38. 4. Saran muamo da tuatto tullah Italiaspäi.

  39. 17. La ffefil di Sara es di *origin di Italia (best guess).

  40. David Marjanović says:

    16. Latino sine flexione.

  41. January First-of-May says:

    By the way, the letter ĥ [x ~ χ] is very rare in E-o and is almost always replaced, even in writing, by something else

    It had been noted that the distinction between /x/ and /h/ is actually fairly rare cross-linguistically, and that Eastern Polish (supposedly including the dialect of Zamenhof’s native area) has both, but Standard Polish had merged them (to /x/, ironically enough).

  42. David Marjanović says:

    No, Vaguely Eastern Polish uses [ɦ] for /g/, just like Ukrainian.

    Loans that have [ɦ] or [h] in the original are spelled with h in Standard Polish, but pronounced with /x/, which is otherwise spelled ch.

  43. Giacomo Ponzetto says:

    @Roberto: I don’t think it’s just origin. Surely John Cowan purposefully picked a sentence composed entirely of abstract words that show minimal variation across Romance languages.

    Try instead something practical and farm-like such as: “His wife left the mushrooms on the windowsill.” I’m immediately in trouble even with the languages I allegedly know. With some hesitation I’d try the following.

    1. Sa femme a laissé les champignons sur le rebord de la fenêtre.
    2. Sua moglie lasciò i funghi sul davanzale.
    3. Su mujer dejó las setas en el alféizar.
    4. A mulher dele deixou os cogumelos no peitoril da janela.
    8. La seva dona va deixar els bolets a l’ampit.
    13. Uxor eius super limen fenestrae fungos liquit.

  44. John Cowan says:

    To be clear, “from John Cowan” means I sent them to the Hat, not that I composed them. I saw them on Quora, but there are several other pre-LH hits. (It also shows up on various blogs that display the current LH post in their sidebars.) So I didn’t choose the words, but I agree they were probably chosen for their Latinate nature.

  45. And

    12. Soția sa a lăsat ciupercile pe pervazul

    I think. I agree – that is a more interesting sentence.

  46. Interestingly, Romanian ciupercă is from Serbo-Croatian печурка (pečurka).

  47. At the end of Volume One of Основы финно-угорского языкознания (Fundamentals of Finno-Ugric Studies) there is an appendix containing short texts in many languages; short excerpts (not error-free) can be seen here:

    All three volumes are available here:

  48. Nice!

  49. A surprising fact related to the text:

  50. Randall Cooper says:

    Is there a way to get the definitive answer?

  51. 15 isn’t Venetian, but it is a language that has already been mentioned (although I don’t know how idiomatic it is in that language).

