languagehat.com : Page 232

Commas in the News.

January 28, 2020 by languagehat 93 Comments

Alison Flood writes for the Grauniad about the latest commatic contretemps (actually, commatic doesn’t mean ‘relating to commas,’ it means ‘having short clauses or sentences; brief; concise,’ but I couldn’t resist):

Three million coins bearing the slogan “Peace, prosperity and friendship with all nations” are due to enter circulation from 31 January, with Sajid Javid, chancellor of the exchequer, expressing his hope that the commemorative coin will mark “the beginning of this new chapter” as the UK leaves the European Union.

However, early responses include His Dark Materials novelist Philip Pullman’s criticism of its punctuation. “The ‘Brexit’ 50p coin is missing an Oxford comma, and should be boycotted by all literate people,” wrote the novelist on Twitter, while Times Literary Supplement editor Stig Abell wrote that, while it was “not perhaps the only objection” to the Brexit-celebrating coin, “the lack of a comma after ‘prosperity’ is killing me”.

The hyper-pedantic reactions are of course absurd — it doesn’t matter a damn whether there’s a comma there or not, it really doesn’t — but the piece is worth it for the other problematic coins it cites:

The criticism of the new coins follows the Bank of England’s decision to use a quote on its Jane Austen bank note about the joys of reading – apparently unaware that the character who utters the words has no interest in reading. Ireland’s Central Bank, meanwhile, misquoted Ulysses on a commemorative coin intended to honour James Joyce.

Vet those proposed inscriptions, ye bureaucrats! (Thanks, Trevor.)

Two from Laudator.

January 27, 2020 by languagehat 18 Comments

1) Beastly (from John Burnet, Ignorance [OUP, 1923]):

When I was at school we certainly thought it ‘beastly’, as we called it, that we should have to learn such things as irregular verbs by heart. On the other hand, it was not particularly laborious for us at that age, and we could more or less see the use of it. It was clearly the way to get the power of reading Homer and Virgil without constant interruption, and I honestly believe that most of us enjoyed that. Of course we should not have dreamed of confessing it to one another, and still less of admitting it to ‘old so-and-so’, our master, who was doing the best he could for us with scant hope of reward and no expectation of gratitude. To do so would have violated that mysterious schoolboy code, which is not only a beneficent provision of nature to protect society from juvenile prigs, but springs from a native instinct of the young Soul to preserve the solitude so needful for the growth of its inner life. Of course the time came later when we were ready to admit, very shyly at first, to one another that we did like Homer and Virgil, but at first we were quite content to learn our irregular verbs. There is no great mystery in that. Mere memorizing comes natural to the young, and it does not matter at all whether they understand what they memorize or not. Children have always invented things—counting-out rhymes and the like—the main purpose of which is to be memorized. Think of the undying popularity of The House that Jack Built. We may say, indeed, that they have a passion for rigmarole, and small boys retain a great deal of this. One would think that our educational system would take advantage of that, and so it does in matters of absolute necessity like the multiplication table.

[…]

For the grown man, of course, grammar may be one of the most dangerously fascinating studies, but for the boy it is just what I have called the sediment of dead knowledge, to be acquired as speedily as may be for the sake of its results and not for itself. This is quite understood in many other branches of training. It is really a good deal easier to read Homer than it is to play the piano, and yet the proportion of people who learn to play the piano, at least to their own satisfaction, is far greater than that of those who learn to read Homer. In this case every one can see that the first thing to be done is to acquire the necessary automatism, and the methods of acquiring it have been more or less systematized. If you had to think of every chord, you would never play anything. On the other hand, no one imagines that the traditional scales and exercises are music. They are simply practice, directed to the acquisition of automatic power, and that is how grammar should be treated at school. It is an historical fact that, when this method was followed, a large number of people did acquire the power of reading Homer, and that a very considerable number continued to read him all their days.

2) How Long Does it Take to Make a Mummy? (from W. Jackson Bate, “The Crisis in English Studies,” Harvard Magazine 85.1 [1982]):

If you took a Ph.D. here in English as late as the 1930s, you were suddenly shoved — with grammars written in German — into Anglo-Saxon, and Middle Scots, plus Old Norse (Icelandic), Gothic, Old French, and so on. I used to sympathize with the Japanese and Chinese students who had come here to study literature struggling with a German grammar to translate Gothic into English! William Allan Neilson, the famous president of Smith College, had been a professor of English here for years. Forgiveably, he stated that the Egyptians took only five weeks to make a mummy, but the Harvard English Department took five years.

And for lagniappe: Hats off!

Türik Bitig.

January 26, 2020 by languagehat 7 Comments

A reader wrote me:

Didn’t see it mentioned anywhere, so I thought I’d pass along the Türik Bitig site. It’s got a lot to sink your teeth into:
1. Ethno-Cultural Dictionary of Old Turkic
2. Uploads of most (all?) of the common Old Turkic inscriptions
3. Even a bit on learning to read Old Turkic

I’ve added the links; the site has a note:

The basic idea of creating the electronic historical and cultural fund was based on issue of The Oriental Studies Section of The Institute of Oriental Studies named after Suleimenov in 2005 under the governmental program “Cultural Heritage”: “Қазақстан тарихы туралы түркі деректемелері” сериясының 2-томы Н.Базылхан “Көне түрік бітіктастары мен ескерткіштері (Орхон, Енисей, Талас)” Алматы: Дайк-Пресс. 2005, 252 б. +144 бет жапсырма. [Volume 2 of the series “Turkic Sources on the History of Kazakhstan”: N. Bazylkhan, “Old Turkic Inscriptions and Monuments (Orkhon, Yenisei, Talas),” Almaty: Dayk-Press, 2005, 252 pp. + 144 pp. of plates.]

Thanks, Parry!

Update. As of Feb. 25, 2021, the site appears to have been infected with malware, so I have substituted archived links.

Defense Language Institute Enrollment Data.

January 25, 2020 by languagehat 43 Comments

Drab title, I know; I was tempted to call the post “Why Provençal?” but I opted for the tediously factual one. Trust me, the link is fantastic and makes up for the drabness; the Monterey County Weekly had the brilliant idea of asking the Defense Language Institute for enrollment rates for each language of instruction from 1963 to the present, and Asaf Shalev’s article is the result:

The data first arrived from the DLI in a format that made it hard to use: computer printouts that were scanned and turned into digital images. With the help of optical character recognition and data extraction tools, the Weekly transformed the scanned printouts into a proper database, allowing for in-depth analysis of the history of foreign language education in the U.S. Department of Defense. It took more than three months to gather, analyze and visualize the data. To our knowledge, this is the first time such a project has ever been carried out. Even DLI itself doesn’t have a database of this sort and could not readily produce a similar analysis, according to former commandant Dino Pick.

They did a terrific job putting it in visual form; the first chart, “How the DLI’s focus shifts with world events,” makes the change from Eastern European languages (mainly Russian) to Middle Eastern languages (mainly Arabic) crystal clear, and the second, “Enrollment at Defense Language Institute 1963-2018,” is an exciting year-by-year race (just watch Chinese and Persian overtaking each other in recent years). There’s a “Searchable database of language enrollment at DLI” and “DLI language with highest enrollment 1965-2018,” and at the end comes “All the languages and dialects ever taught at DLI, in order from highest total enrollment to lowest,” from which we learn that a single person was taught Provençal in 1983. Why Provençal? At any rate, I highly recommend a visit to the link — thanks, Bathrobe!

Fendrik.

January 24, 2020 by languagehat 57 Comments

Another tidbit from Irina Reyfman’s How Russia Learned to Write: Literature and the Imperial Table of Ranks (see this post): since her book has a great deal to say about the Table of Ranks, she provides a detailed version of it on pp. 188-90, and glancing over it, one of the first things I noticed was that in the 14th (lowest) class, under “Military ranks,” the first entry was “Warrant Officer (fendrik) in Infantry (1722-30).” Fendrik! What a word! So I looked it up in Vasmer to see its origin, and it wasn’t there; it turns out that that’s because it’s not in Dahl, which is quite strange. But of course there’s a Russian Wikipedia article, which tells us it’s from German Fähnrich ‘color-bearer, standard-bearer,’ whose etymology is a bit confusing — German Wikipedia says “Das Wort „Fähnrich“ stammt vom althochdeutschen faneri, dem mittelhochdeutschen venre und dem frühneuhochdeutschen venrich ab und ist daher mit dem modernen Wort „Fahne“ im Sinn von Truppenfahne verwandt, die der Fähnrich einst zu tragen hatte.” [‘The word Fähnrich derives from Old High German faneri, Middle High German venre, and Early New High German venrich, and is therefore related to the modern word Fahne in the sense of a troop flag, which the Fähnrich once had to carry.’] So it’s from Fahne ‘flag,’ but I’m not sure what’s going on with those suffixes. Wiktionary adds the information that there’s an archaic form Fähndrich, which is presumably the ancestor of the Russian word.

I was also briefly perplexed by an entry in the 6th class, under “Court ranks,” where the first item is “Kammer-Fourrie (until 1884),” but it turns out the mysterious Fourrie is simply an error for fourrier ‘quartermaster’ (from Old French fuerre ‘fodder’; the two words are related). It’s probably a typo (though there are very few in the book), since on the next page she correctly has “Court Fourrier” under the 9th class.

A Dictionary of Varieties of English.

January 23, 2020 by languagehat 15 Comments

JC sent me a link to Raymond Hickey’s Dictionary of Varieties of English, saying “This is far more than the title suggests: it not only contains sketches of various Englishes, but is at the same time an actual lexicon of terms that are relevant to the study of English variation, like æ-tensing. I’m only about 10% through it, but it’s great and a fun read to boot.” Thanks, John!

The Kraken Wakes.

January 22, 2020 by languagehat 22 Comments

Ofer Aderet reports for Haaretz (cached) on the progress made in Haifa University’s program to decipher ancient manuscripts:

“Hasten to the Shoko,” urged the computer. “The mouth asked to smoke,” it mused another time. Then it declared, “Jesus God to rejoice.” The cryptic phrases brought both smiles and satisfaction to the managers of the digital humanities laboratory at the University of Haifa. One is a Talmud and Midrash teacher and the other a professor of information systems.

The platform, called Kraken, is taking its tentative first steps in attempting to decipher ancient Hebrew. The hope is that in the not-too-distant future, after completing its studies, Kraken will be able to read any Hebrew text, even if the manuscript is distorted, illegible or hard to decipher. It’s part of a discipline called digital humanities, which uses advanced technology to enhance studies in history, the Bible and literature.

Like children encountering Hebrew religious texts in elementary school for the first time, Kraken also needs practice to become familiar with the material. The “shoko” was supposed to be “shoket” – trough. The mouth wanted to “deal with the Torah,” not to smoke, while Jesus, heaven forbid, has nothing to do with the third phrase, which was originally “the Lord will again rejoice.”

Moshe Lavee, a Military Intelligence veteran, is a senior lecturer in Talmud and Midrash in the department of Jewish history at the University of Haifa. He is the director and founder of eLijah-Lab, Kraken’s home and one of the two researchers heading the lab. This week he spoke with contagious enthusiasm about the digital revolution, which is destined to save several research fields from oblivion. […]

On a monitor he showed a scanned section from Midrash Tanhuma, three collections of Pentateuch aggadot (homilies) from the end of the ancient history. The script is difficult to read, but the computer doesn’t give up. Kraken – developed by Prof. Daniel Stoekel Ben-Ezra of Ecole Pratique des Hautes Etudes in Paris – succeeds in reading it, and later presents it to the researcher as a simple text file. This opens new research possibilities that ignite the imagination, first and foremost searching and analyzing information in large scopes and kinds of texts that until now even the most skilled researcher couldn’t carry out alone. “Our vision is to make all the Hebraic scripts accessible,” says Lavee. “We’ll turn Jewish and Hebraic legacy into texts accessible to computer search and study and save a huge treasure trove of knowledge and Jewish traditions.” […]

The revolution was enabled by Handwritten Text Recognition technology, which enables a computer to read tens of thousands of pages – like novels and poetry from the 19th century, diaries and letters from WWII and ancient philosophical and religious texts, including intelligible handwritten input. Lavee says “the computer is taught to read the texts automatically, based on practice, so it acquires contextual knowledge about the language and uses it to reach better results.” […]

At this stage the computer still needs the researchers’ help. They are teaching it to read and “understand” the ancient Hebraic texts it encounters for the first time. “We show the computer many pictures from manuscripts, alongside their correct transcription,” says Lavee. “The computer itself finds the leading mathematical formula from the visual data for the text, and develops the ability to decipher even the written manuscript, whose transcript it hasn’t been shown before.” Dror Elovits, the lab’s technology manager and a graduate student in history, believes “the day is not far when we won’t need the human factor anymore, the texts will digitize themselves.”

(For a similar story about the Vatican Archives, see this 2018 LH post.) Thanks, Kobi!

A Scots Proverb.

January 21, 2020 by languagehat 29 Comments

Via the always interesting Laudator Temporis Acti:

Allan Ramsay, A Collection of Scots Proverbs (Edinburgh: J. Wood, 1776), p. 33 (Chap. XIV, Number 130):

        He snites his nose in his neighbour’s dish to get the brose to himsell.

Oxford English Dictionary, s.v. snite, v., sense 2.a:

        transitive. To clean or clear (the nose) from mucus, esp. by means of the thumb and finger only; to blow.

Id., s.v. brose, n:

        A dish made by pouring boiling water (or milk) on oatmeal (or oat-cake) seasoned with salt and butter.

Hat tip: Eric Thomson, who adds:

How can southron English have retained snout and snot and yet let their cousin ‘snite’ fall by the wayside? As so often, Scots is the last bastion, a vernacular that Hume and Boswell were brought up speaking but were forced to renege, and now Scots itself has more or less fallen by the wayside. The curmudgeon’s chosen path must always be backwards, to rescue everything senselessly tossed aside.

I don’t know where I ran across the verb snite, but it was many years ago, and it was so obviously convenient and pleasurable that I’ve used it ever since — mainly to myself, since I imagine hardly anyone else in the US knows it, but I hope to spread awareness of it with this post.

Livery.

January 20, 2020 by languagehat 114 Comments

I was discussing the pronunciation of livery with a friend (who thought it had a long i, as in alive, having only seen it written) and I thought I’d check the etymology in case it might help, which it does. OED (updated September 2009):

Etymology: < Anglo-Norman leveré, liveré, livereye, livré, lyveré, lyveree, lyvereye, Anglo-Norman and Middle French liveree, livree, Middle French livrée (French livrée) allowance or ration of food (late 12th cent. in Anglo-Norman), delivery, act of handing over (1283 or earlier in Anglo-Norman in general sense; second half of the 14th cent. or earlier in Anglo-Norman in spec. use with reference to the legal delivery of real property into a person’s possession, in faire liveré de), distinctive dress or uniform worn by an official, retainer, or servant (and given to him or her by the employer) (c1290 in Old French; now historical), liveried retainers collectively (1354; rare before late 17th cent.; now historical), assignment (14th cent. or earlier in Anglo-Norman), disbursement (1355 or earlier in Anglo-Norman), lodging, quarters of an army (a1400 or earlier), surrender (1438 in an apparently isolated attestation), distinctive guise or appearance of a thing (although this is apparently first attested later: c1450 with reference to the distinctive colours of an object; a1675 in more general sense), company, party (c1460 (in the passage translated in quot. 1477 at sense 12a) or earlier), stipendiary allowance granted to a canon (1549), in Anglo-Norman also denoting a City of London company (1386 or earlier), use as noun of feminine past participle of liverer, livrer liver v. (compare -y suffix5). Compare post-classical Latin liberata allowance, payment, provision (of food, clothing, etc.) to retainers or servants (frequently from 12th cent. in British sources), badge, uniform (frequently from late 14th cent. in British sources), lodging, quartering (14th cent.), allowance of provender for horses (15th cent. in a British source as liberatum), academic stipend (15th cent. in British sources), and also Spanish librea (end of the 15th cent.), Italian livrea (1424), Middle Dutch livereye, livreye, levereye (Dutch livrei), Middle Low German (rare) lēverīe, liberīe, German Livree (c1600; earlier as †liebrey, †liberey, etc. (15th cent.)), all earliest in sense 11b, all < French.
[…]

I. Senses relating to delivering or handing over.
†1. The action or an act of handing over or conveying to another; the release of a person from imprisonment, etc.; (also) the delivery of goods (money, a writ, etc.). Obsolete.
[…]
II. Senses relating to the provision of food, etc.
5. a. The food, provisions, or clothing dispensed to or supplied for retainers, servants, or others; an allowance or ration of food served out. Now historical.
[…]
III. Senses relating to clothing or other uniform which serves as a distinguishing characteristic.
10. Something assumed or bestowed as a distinguishing feature; a characteristic garb or covering; a distinctive guise, marking, or outward appearance.
This sense should probably be regarded as a figurative development of sense 11, though it is recorded earlier.
[…]
11. a. The distinctive dress worn by the liverymen of a Guild or City of London livery company (see Compounds 2); (also) an item of this dress.
[…]
b. More generally: the distinctive dress or uniform provided for and worn by an official, retainer, or employee (in early use esp. a single item such as a collar, hood, or gown, but more generally a suit of clothes or uniform); spec. the characteristic uniform or insignia worn by a household’s retainers or servants (in later use largely restricted to footmen and other manservants), typically distinguished by colour and design; the dress, uniform, or insignia (e.g. king’s livery, riding livery), by which a family, etc., may be identified. Also as a count noun: a set of such clothes, a uniform. Cf. colour n.1 19a. Now chiefly historical.

So you can remember the pronunciation by keeping in mind that it’s just delivery without the prefix.

Phones.

January 19, 2020 by languagehat 26 Comments

Mark Liberman has a Log post about a recent paper, Jialu Li and Mark Hasegawa-Johnson’s “A Comparable Phone Set for the TIMIT Dataset Discovered in Clustering of Listen, Attend and Spell” (pdf). The abstract reads:

Listen, Attend and Spell (LAS) maps a sequence of acoustic spectra directly to a sequence of graphemes, with no explicit internal representation of phones. This paper asks whether LAS can be used as a scientific tool, to discover the phone set of a language whose phone set may be controversial or unknown. Phonemes have a precise linguistic definition, but phones may be defined in any manner that is convenient for speech technology: we propose that a practical phone set is one that can be inferred from speech following certain procedures, but that is also highly predictive of the word sequence. We demonstrate that such a phone set can be inferred by clustering the hidden nodes activation vectors of an LAS model during training, thus encouraging the model to learn a hidden representation characterized by acoustically compact clusters that are nevertheless predictive of the word sequence. We further define a metric for the quality of a phone set (the sum of conditional entropy of the graphemes given the phone set and the phones given the acoustics), and demonstrate that according to this metric, the clustered LAS phone set is comparable to the original TIMIT phone set. Specifically, the clustered-LAS phone set is closer to the acoustics; the original TIMIT phone set is closer to the text.

Mark says:

As exemplified above, the TIMIT phonetic transcriptions often reflect expectations from the formal dictionary-based pronunciation standard, which is influenced by the spelling even before any continuous-speech reductions set in — so matching TIMIT’s performance on this paper’s “metric for the quality of a phone set (the sum of conditional entropy of the graphemes given the phone set and the phones given the acoustics)” should not be all that difficult. Still, no one has ever done it before, so this research is an important contribution.

The relationship between phonetic variation and lexically-stable phonological categories remains an open theoretical question, in my opinion, but work like this is one very useful direction of inquiry.

This sounds like it must be important, but it’s been so long since I had anything to do with that kind of linguistics that I’m only vaguely aware of how it works. But it sounds like a useful alternative to the usual transcriptions.
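
For anyone as rusty as I am, the metric quoted in the abstract can at least be made concrete. What follows is my own minimal sketch, not the authors' code: it assumes the acoustics have already been boiled down to discrete cluster IDs (the paper works with real acoustic features), and the function names and toy data are invented purely for illustration.

```python
from collections import Counter
from math import log2


def conditional_entropy(pairs):
    """Estimate H(Y|X) in bits from a list of (x, y) observations."""
    pairs = list(pairs)
    n = len(pairs)
    joint = Counter(pairs)                     # counts of each (x, y)
    marginal_x = Counter(x for x, _ in pairs)  # counts of each x
    # H(Y|X) = -sum over (x, y) of p(x, y) * log2 p(y | x)
    return -sum(
        (count / n) * log2(count / marginal_x[x])
        for (x, _), count in joint.items()
    )


def phone_set_score(phone_grapheme_pairs, acoustic_phone_pairs):
    """Sum of H(graphemes | phones) and H(phones | acoustics)."""
    return (conditional_entropy(phone_grapheme_pairs)
            + conditional_entropy(acoustic_phone_pairs))


# Hypothetical toy data: (phone, grapheme) pairs. The phone "k" is spelled
# two different ways, so the graphemes are not fully predictable from it.
phone_to_grapheme = [("k", "c"), ("k", "k"), ("ae", "a"),
                     ("ae", "a"), ("t", "t"), ("t", "t")]

# Hypothetical toy data: (acoustic cluster ID, phone) pairs. Cluster 2 is
# ambiguous between "t" and "k".
acoustics_to_phone = [(0, "k"), (0, "k"), (1, "ae"),
                      (1, "ae"), (2, "t"), (2, "k")]

score = phone_set_score(phone_to_grapheme, acoustics_to_phone)
print(f"{score:.4f} bits")
```

On this toy data each of the two terms comes to a third of a bit, so the total is two thirds of a bit. Presumably a lower score is better, since it means the phones leave little uncertainty about the spelling and the acoustics leave little uncertainty about the phones.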
