Meet the Ńdébé Script.

July 18, 2020 by languagehat 88 Comments

Kọ́lá Túbọ̀sún writes for Popula about a new African writing system:

Yorùbá and Igbo have evolved over the years, with various twists and turns affecting usage both spoken and written. And since Bishop Àjàyí Crowther wrote the classic texts, Vocabulary of Yorùbá (1843), Isoama-Ibo Primer (1857), and Vocabulary of the Ibo Language (1882), disputes regarding the orthography of these languages—attempts to agree on how, exactly, they should be written—have continued to rage in academic, literary, and colloquial circles. It is not uncommon today to find competent speakers of both Yorùbá and Igbo who don’t know how to tone-mark written words, so that the end result appears like a standard English text, leaving room for plenty of ambiguity. […]

Unicode was designed to facilitate the digital rendering of languages by encoding scripts into uniquely identifiable computer codes. But because of its lack of precomposed characters for vowels requiring both the subdot [e.g. ẹ, ọ] and the tonal diacritic [e.g. ì, í, á, à, ò, ó, è, é], digital rendering of Yorùbá vowels that carry both [like ẹ̀, ẹ́, ọ̀, ọ́] often need to be overwritten with the second diacritic mark. This results in frequent font-related snafus on modern electronic platforms. […]
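The precomposed-character gap described above can be checked directly. The following is a minimal sketch, assuming Python's standard `unicodedata` module; the helper name `nfc_codepoints` is made up for illustration. It normalizes a vowel to NFC, the composed form, and lists the code points that remain: a vowel with only a tone mark collapses to a single character, while a vowel with both subdot and tone mark does not.

```python
import unicodedata

def nfc_codepoints(text: str) -> list[str]:
    """Return the code points of text after NFC normalization, as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]

# Tone mark only: a precomposed character exists, so NFC yields one code point.
print(nfc_codepoints("i\u0300"))          # i + combining grave

# Subdot plus tone mark: NFC can fold in the dot below, but the grave
# remains a separate combining character stacked on top by the font.
print(nfc_codepoints("e\u0323\u0300"))    # e + combining dot below + combining grave
```

Because the second diacritic always survives as a combining mark, correct display depends on the font positioning it properly, which is where the rendering problems tend to come from.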

Enter the Ńdébé Script: a writing system that addresses the tonal peculiarities of Nigerian languages, pleasing to the eye, which might carry the burden of our literary and academic aspirations. Created by visual artist and software engineer Lotanna Igwe-Odunze, the Ńdébé script provides a suite of tools for cultural expansion through literature, calligraphy, and visual art.

Having been exposed to other logographic scripts like Hangul and Devanagari, I found Ńdébé to be quite straightforward to learn. Consonants are the main stems of the script, while the vowels are appended to the tops of characters. Tone is accounted for with dots, and the visual direction of the vowel rendered in a manner not dissimilar to our current diacriticized Latin script. The high tone conveys climbing a hill; the low tone descends. Users can intuit from their own rising or falling voices which image best represents the appropriate vowel.

In addition Ńdébé supplies, for Igbo at least, an opportunity for different dialectal variations to find harmony, a problem that has bedeviled written Igbo for years. […]

In attempting to write my name—a Yorùbá name—in Ńdébé, I ran into a problem that I believe needs to be solved even for Igbo. My last name Túbọ̀sún can only be written as Tú-bọ̀-sú-n in Ńdébé, where the /n/ is treated as a stand-alone consonant, though it’s there only to show that the -un is nasalised. […] But then Ńdébé doesn’t pretend to be a sound-based script, though its impressive attention to the rendering of tone makes it seem like one. Its main focus is the syllable, much like Hangul or perhaps other scripts like Devanagari. As a writing system, it is easy and logical to learn and easy to teach to either humans or to computers.

There’s mention of other scripts like Nsibidi (see this LH post) and much more; it ends with hopes for the future: “I want to see Ńdébé on signboards, computer decals, book covers, art installations, comics, scriptures, and yes, computer fonts (no excuses now, Unicode), textbooks, government documents, Nollywood films and literature.” And in case you were wondering:

Ńdébé is coined from “ide”, which is the Igbo verb meaning “to write”, added to the “n” morpheme for “continue to.”

Comments

David Eddyshaw says

July 18, 2020 at 8:33 pm

A simpler, if less romantic, solution to the problem of subdotted ẹ and ọ would be to adopt ɛ and ɔ instead, which seems to have caused no particular problems in Twi, Ewe, Ga, Dagbani … Kusaal …

While I personally would love to see tone marking become the norm for African languages, there are actually quite good reasons why it hasn’t: reasons which have usually been advanced by the speakers themselves, for that matter, who very often don’t like the idea at all.

One very good one is that mother-tongue speakers generally find tone marking redundant, except sometimes in a few function words (and such cases are in fact usually marked in Yoruba.) Yoruba is admittedly pushing it more than most: tone does carry quite a high functional load in that language. Even so, the point is not refuted by the stock examples (the favourite Yoruba one is the tonal minimal pair “wash/break the plates”, where I have to say that confusion seems improbable in real life.) All written languages ignore many suprasegmentals, and remain comprehensible even so. The nuisance is more to hapless foreigners who can’t just supply the missing information from their own knowledge of the language.

Perhaps even more to the point is the fact that accurate tone marking is difficult, even if (maybe, especially if) you’re an L1 speaker. It’s not at all unusual for speakers of tone languages to be actually unaware of tone as such. There’s also the difficulty of tone sandhi (pretty complex in Yoruba, as it happens): you can write all the words with the tones as they would appear in isolation, but then you’ll actually end up with wrong tones throughout your text. (All but a few Kusaal words with initial low tone in isolation change to initial high tone in most contexts phrase-internally, for example.)

Lastly, for many languages there aren’t actually any adequate descriptions of the tonal system at all: sometimes not even to the level of saying how many individual tonemes there are. I haven’t seen any description of the tone system of any Western Oti-Volta language which is even up to the level of my account of Agolle Kusaal (which, believe me, means that there is a problem.)
J.W. Brewer says

July 18, 2020 at 9:16 pm

The notion that genuine confusion as to whether the speaker has recommended that the plates be washed or that the plates be broken is “improbable in real life” seems to presuppose more sobriety in real life (on the part of both speakers and hearers) than I have reason to suspect is actually the case. (That doesn’t mean that marking tone orthographically is cost-justified on net, of course; I’m just responding to the specific minimal pair.)
D.O. says

July 18, 2020 at 9:44 pm

Is there a general agreement of whether an alphabet or an abugida is better in abstract or for a particular language? Assuming, of course, that there is no previous system of writing and people are free to invent whatever they want.
David Eddyshaw says

July 18, 2020 at 10:14 pm

If you were starting de novo, Yoruba would most certainly lend itself well to an abugida: all open syllables, and not too many distinct vowels (especially if you use a diacritic for nasalisation.)

Kusaal, on the other hand, not so much. Lots of closed syllables, nine distinct short vowels and nine long, contrastive nasalisation and/or glottalisation for almost all of them, and a complicated and asymmetrical system of diphthongs of three distinct lengths. You could get by quite handily without marking tone, though.
David Eddyshaw says

July 18, 2020 at 10:22 pm

confusion as to whether the speaker has recommended that the plates be washed or that the plates be broken is “improbable in real life” seems to presuppose more sobriety in real life (on the part of both speakers and hearers) than I have reason to suspect is actually the case

I have to concede that it is appropriate in this forum to adopt a somewhat more Russocentric view of these matters than I have perhaps been doing.
John Cowan says

July 18, 2020 at 11:17 pm

The evidence is that in naive grammatogeny people come up with syllabaries rather than any other system: alphabets are too hard to invent and logographic systems too hard to learn. But whether a script is a syllabary proper or an abugida is a matter of point of view. Ethiopic and all Brahmi-derived scripts are usually analyzed by scholars as abugidas: indeed, the word abugida represents the first four letters of Ethiopic in traditional order, like abjad for Arabic and alphabet for Greek.
But the treatment of them by traditional literacy instruction and by Unicode encoding is quite variable:

1) Ethiopic is taught as a syllabary and encoded as a syllabary. It does not have anything corresponding to a virama (the mark that shows a consonant does not form a syllable).

2) Tamil is taught as a syllabary, but encoded as an abugida. The other Indic scripts are taught as abugidas and encoded the same way; Buginese is a special case in which final consonants are simply not written at all and therefore no virama is needed.

3) Canadian Syllabics is taught as a syllabary and encoded as a syllabary, but can be seen as an abugida in which the vowel is represented by rotating the consonant letter, and the virama-equivalent is to write the consonant as a superscript.

4) Tengwar is taught (by Tolkien) as an abugida and, if it is ever formally encoded, will surely be encoded as an abugida as well. (There is an encoding in the Unicode Private Zone as an abugida already.)

(I am omitting the question of whether a script does or does not have an inherent vowel that has no vowel mark of its own.)
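The difference between "encoded as a syllabary" and "encoded as an abugida" shows up in the code points themselves. Here is a small sketch, assuming Python's standard `unicodedata` module; the helper `describe` and the choice of the syllable "bu" are illustrative only. Ethiopic gets one code point per syllable, while Devanagari spells the same syllable as a consonant letter plus a dependent vowel sign, and uses a virama to mark a bare consonant.

```python
import unicodedata

def describe(text: str) -> list[str]:
    """List the Unicode character name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]

ethiopic_bu = "\u1261"            # one code point for the whole syllable "bu"
devanagari_bu = "\u092c\u0941"    # consonant letter + dependent vowel sign
devanagari_b = "\u092c\u094d"     # consonant letter + virama (no vowel)

for label, text in [("Ethiopic bu", ethiopic_bu),
                    ("Devanagari bu", devanagari_bu),
                    ("Devanagari b", devanagari_b)]:
    print(label, len(text), describe(text))
```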
David Eddyshaw says

July 18, 2020 at 11:43 pm

the first four letters of Ethiopic in traditional order

Homer nods. The first four letters of Ethiopic in traditional order are (as you know, I know) h l ḥ m.
(I feel there should be a special LH medal for catching JC in an error. Awarded annually, or pro re nata, whichever should be the more frequent.)
John Cowan says

July 19, 2020 at 12:50 am

The first four letters of Ethiopic in traditional order are (as you know, I know) h l ḥ m

Yes, hence (or so I believe) elementum, a word utterly without an IE etymology.

But Peter Daniels must have had some reason for choosing a North Semitic label for a South Semitic script, and since he indeed has forgotten more about writing systems than I will ever learn, he must have had something in mind. His other technical terms include abjad and grammatogeny with its subtypes naive g. and sophisticated g. These words are now used without attribution to him, which as he himself says is the mark of true acceptance in a field.

I used to read Daniels frequently and at length on various mailing lists, and had formed an impression of him as a crusty and crabby old professor, so I was shocked indeed to meet him at lunch in the company of Michael Everson and others and learn that he was at least twenty years younger than I was.

I feel there should be a special LH medal for catching JC in an error

Heavens, the award would be cheapened by the frequency with which it has to be issued, especially recently. Smullyan was once asked if being a logician made him less susceptible to mistakes. He replied “Certainly not — and if I’m wrong about that, then here I am, a logician who has just made a mistake.”
PlasticPaddy says

July 19, 2020 at 1:47 am

@jc
Re some reason, Christian interlopers (northern semites or literate in Hebrew/Aramaic?) seem to have introduced the new writing, in which each consonant has four alternates, depending on the following vowel, in the order äuia. Hence the mnemonic ä-bu-gi-da.
zyxt says

July 19, 2020 at 4:01 am

The “invention” of entirely new scripts seems to be a trend of late in India and Africa. This occurs even for languages that have previously been reduced to writing, e.g. Yoruba and Igbo.

It will certainly keep Unicode busy for a while to come.

You have to wonder how widespread the use of these scripts is though.
Trond Engen says

July 19, 2020 at 6:11 am

John Cowan: I used to read Daniels frequently and at length on various mailing lists, and had formed an impression of him as a crusty and crabby old professor, so I was shocked indeed to meet him at lunch in the company of Michael Everson and others and learn that he was at least twenty years younger than I was.

Are you sure? Back when I knew him from sci.lang, I was no less surprised to discover at some point (probably from a blurb for WWS) that he’s less than 20 years older than me*. I think I’m a decade younger than you.

*) Easily confirmed by Wikipedia.
Athel Cornish-Bowden says

July 19, 2020 at 6:12 am

he [Peter T. Daniels] was at least twenty years younger than I was.

He is 68 — born in 1951. Are you older than 88?

Not one of my favourites.
David Marjanović says

July 19, 2020 at 6:30 am

All written languages ignore many suprasegmentals

…and then there’s German, which is happy to mark vowel length three times in the same syllable.
Athel Cornish-Bowden says

July 19, 2020 at 6:54 am

For example?
John Cowan says

July 19, 2020 at 9:23 am

Well, I am shocked a second time. I didn’t actually ask him his age, so he must have been extremely well-preserved at that time. He is indeed older than me by a bit: I am 62 (born in 1958).

Those of you who wish may consider yourselves duly gonged.
languagehat says

July 19, 2020 at 9:55 am

I’m 69 now and my hair is still the same brown it was decades ago (though my beard has grayed). Most people would mistake my age, I think.
David Marjanović says

July 19, 2020 at 1:04 pm

From the article:

Compulsory Nigerian language education was dropped from high school syllabi in 2015.

*facepalm*
David Eddyshaw says

July 19, 2020 at 2:11 pm

While that’s surely regrettable, it’s also understandable, for both severely practical reasons (200 or so different languages, and inadequate resources not only for language teaching but across the board) and, sadly, political. I can easily think of places where any decision at all you chose to make about which language(s) to teach could be quite fraught.

Even the question of script can be toxic, as Nigerian experience can unfortunately testify.

One great plus for English is that it’s nobody’s ethnic language. It can’t even be said to belong to the colonial oppressor any more.
AJP Crown says

July 19, 2020 at 5:55 pm

I’m 69 now and my hair is still the same brown it was decades ago (though my beard has grayed). Most people would mistake my age, I think.
Me too, though I’m a mere 67. Were I to let it grow for more than a week my beard would be grey or white, which I think would look a bit odd, like Susan Sontag (not that she ever had a beard afaik, her b&w do).
Etienne says

July 19, 2020 at 6:05 pm

@zyxt: Colin Masica wrote that script invention in India is linked to the already existing multiplicity of scripts: Because of this existing large number of scripts, it is believed that a distinctive script is a requirement for a language to be considered a genuine language, rather than a patois/dialect/whatever. I wonder: could script invention in Africa be related to a similar factor, namely the presence of both a European language and (over much of the continent) Arabic as prestigious written languages, leading to a similar belief that, if an African language is to become a written language, indeed if it is to be taken seriously as a genuine language, it too needs its own script? Thoughts, anyone?

(On a related topic, I have wondered whether exposure to, if not knowledge of, Greek and Hebrew -each with its own separate script- among Protestant missionaries might not explain why so many Protestant missionaries have created or spearheaded the creation of new scripts for the language(s) of the peoples they sought to convert: could a similar sort of dynamic relating to the perception of the relationship between language prestige and use of a separate script have been at work? The reason I ask is because Catholic missionaries, who typically did not know Greek or Hebrew, seem to have been far less enthusiastic when it came to script creation. Again, any thoughts?)

@Both Davids (Eddyshaw and Marjanović): If Nigeria discontinued Nigerian language teaching in 2015, I suspect it will not be re-introduced for quite some time: oil is one of that country’s major exports, after all, and considering how low oil prices are today I think that the economic situation will mean that it simply will not be possible to create the relevant teaching material (that is in addition to all the problems relating to national unity, script choice, dialect choice and the like, which alas will not be going away).

At least in Nigeria there does not seem to be a massive shift away from African languages, unlike the case in a neighboring oil-producing country: I recently learned that according to the most recent census, some 71% of Angolans speak Portuguese at home. It is clear that, a generation hence, a strong majority of Angolans will be native speakers of Portuguese. Which is a rather amazing trajectory, because practically all Portuguese L1 speakers had left Angola in the late seventies in the aftermath of its independence. As a result the only Portuguese speakers left were L2 speakers. Thus, in less than a century a language variety (Angolan Portuguese) will have gone from zero L1 speakers to tens of millions of L1 speakers. Without, be it noted, any significant immigration from other Portuguese-speaking countries during this time period. It is the sort of thing that should remind us that we ought to be careful when interpreting genetic and linguistic data: nothing in the genetic profile of Angola since the seventies indicates that massive language shift has taken and is taking place there…
languagehat says

July 19, 2020 at 6:13 pm

That is remarkable.
David Eddyshaw says

July 19, 2020 at 6:44 pm

Prior to the European invasions there was a strong tradition of writing Hausa in Arabic script (the generic term for using Arabic letters in this way being ajami); this was deliberately undermined by the policy of that horrid man Lugard, in the teeth of opposition from missionaries who actually knew the language and culture (he despised missionaries and suspected them of not being fully on board with the great imperial project. He eventually banned them from the whole territory altogether.) This was all part of a (successful) divide and conquer strategy intended to set “pagan” Hausa speakers against the Caliphate of Sokoto. (This is the issue at the back of the adoption of the name “Boko Haram.”)

Ajami writing of Hausa is still going strong, but pretty much only in the context of traditional Islamic education, not for writing history, poetry etc as hitherto.

Other West African Muslims developed ajami writing (including Yoruba, as the article says), but not to the same extent as Hausa. In East Africa too, there is Swahili poetry in Arabic script going back a few centuries.

Ajami is never going to catch on among non-Muslims. Even among Muslims, there’s the problem that it’s basically a terrible fit for pretty much any Niger-Congo language, which probably has a lot to do with explaining why it never took off more widely. (Also, if you were going to write a history, say, you would be quite likely to write it in actual Arabic. Arabic was the official administrative language of the Sokoto Caliphate.) For (Chadic) Hausa, there’s much less of a problem; not so many vowels, glottalised consonants/implosives that were early on lined up with the Arabic “emphatics”, and not many other sounds without nice Arabic equivalents (no /p/!)

Apart from the fact that using it obviously means that you’ve deliberately chosen not to use ajami, which is a toxic issue in northern Nigeria for those specific historical reasons, the Latin alphabet as such doesn’t seem to have any particular cultural baggage for Africans. I think well-meaning efforts to promote African languages by inventing scripts for them are unfortunately missing the point as to why literacy in African languages is typically much less than in the local colonial languages, even when the African languages are very much going strong (as in Ghana and Nigeria.) The real reasons are unfortunately not very mysterious, and not likely to be materially alleviated by adopting a new and unfamiliar script.

It’s all different in Ethiopia, of course, where people were literate before the English were.
David Marjanović says

July 19, 2020 at 7:18 pm

leading to a similar belief that, if an African language is to become a written language, indeed if it is to be taken seriously as a genuine language, it too needs its own script?

“Kante created N’Ko in response to what he felt were beliefs that Africans were a culture-less people, because before then, no indigenous African writing system for his language existed.”

The reason I ask is because Catholic missionaries, who typically did not know Greek or Hebrew, seem to have been far less enthusiastic when it came to script creation.

I wonder if the reason is instead that the Protestant missionaries either spoke English or at least worked in British colonies. Canadian Aboriginal Syllabics was created pretty explicitly because the English spelling conventions would only lead to confusion when the same Latin letters are pronounced differently between the omnipresent English and whatever other language. (…Still, Navajo appears to have pulled it off.)
David Eddyshaw says

July 19, 2020 at 7:45 pm

Missionaries of all sorts of linguistic backgrounds seem to have been quite sophisticated enough to avoid trying to introduce English spelling conventions into orthographies of African languages, or indeed any others, by the nineteenth century. I don’t think this is the missionaries’ doing, at least not directly.

On the other hand, if you go to school in Ghana (say), you will very much want to learn English well whatever else you do, because that’s how you get not to be a peasant farmer for the rest of your life. Because learning English is the key objective, “literacy” for you will mean “literacy in English” above all. [When I was learning to communicate with patients in Kusaal, my staff would sometimes helpfully interject that the patient was “literate”, meaning that he/she spoke English, so I was wasting my efforts.]

If you, the Ghanaian schoolchild, can see the point of becoming literate in your mother tongue (in which there is nothing much actually available to read, apart from the Bible, which you can read in English anyway, and – uh – literacy materials) it’s just going to be a further annoyance if you have to learn a completely new script too, and ideally you want there to be as few new spelling conventions to grapple with as possible.

In Australia, Claire Bowern’s Bardi grammar uses a horrible orthography in which the three vowels of the language, /a/ /i/ /u/ are written a i oo. This is because the actual speakers prefer it that way. It’s the same with Gooniyandi (sic.)

Basically, one needs to be able to answer the schoolchild’s question, “Why should I learn to read and write my mother tongue? How will that help me?” We Hatters are the sort of people who find it hard to imagine how anybody could even ask that, because we know that all languages are beautiful and valuable and deserve to be cherished and promoted. But that has little traction for most people, whose aim in life is basically to make a good enough living to look after their family properly. Appeals to cultural pride are much less powerful as motivators than one might imagine (this shades into the sad question of actual minority language survival.)
Bathrobe says

July 19, 2020 at 7:53 pm

Mongolian traditional script is said to be a completely alphabetic script (Wikipedia), and yet it is traditionally taught as a syllabary. Indeed, some Inner Mongolians once reacted to my attempt to decompose a (rather rare) syllable into vowel and consonant with a sharp rebuke that you couldn’t do that because Mongolian is “different”.

Methods for electronically processing the script in Inner Mongolia are syllabic in nature, that is, each syllable is recorded as a separate glyph, presumably using code points in the private use area of Unicode. It works quite well. Or should I say, it works quite well in Windows, although not on a Mac because it is a proprietary system that somehow incorporates some kind of Microsoft technology.

For Unicode, it is treated as an alphabet, i.e., having separate glyphs for consonants and vowels, which gives rise to problems because there are different ways of combining them and special invisible characters must be inserted to ensure that the correct form appears. This is the reason for the many problems that are still associated with the script.

I was prompted to make this comment because of JC’s comment above about “an abugida in which the vowel is represented by rotating the consonant letter”. There is no doubt that the Mongolian script represents the vowels as separate segments, but it’s not entirely so. Syllables starting in /x/ and /g/ are somewhat problematic since they are divided in accordance with vowel harmony. xa, xo, and xu use one form (᠊ᠬᠠ᠊, ᠊ᠬᠤ᠊, and ᠊ᠬᠤ᠊), to which ga, go and gu are identical except for the addition of a voicing sign (᠊ᠭᠠ᠊, ᠊ᠭᠣ᠊, and ᠊ᠭᠤ᠊) — note that o and u are not distinguished by the orthography. On the other hand, xe, xi, and xü are written with a different glyph that is integrally linked with the following vowel (᠊ᠬᠡ᠊, ᠊ᠬᠢ᠊, and ᠊ᠬᠦ᠊). ge, gi, gü are written exactly the same way because there is no distinction between x and g in these glyphs.

These are the only series that actually distinguish vowel harmony (apart from a and e as initial vowels), so they are important markers in reading the script. If a word does not contain them, there is no way of telling whether the word contains masculine or feminine vowels. My favourite example is ᠳᠠᠯᠠᠢ, which could be dalai, delei, talai, or telei. In fact, only the first and last are actual words, dalai meaning ‘sea’ (well known from the Dalai Lama), and telee (or telii) meaning ‘belt’.

My point is, though, that a different letter form is used in the case of x and g, and only in the case of x and g, to distinguish the following vowel, as in the Canadian “abugida”.

Traditionally x and g are distinguished in transliterating the traditional script, so that xagan is usually transliterated as qaγan while ger is ger. This suggests that they were two separate sounds in earlier stages of Mongolian. Unfortunately I am too unschooled in the history of Mongolian phonology to know for sure. Perhaps JC has a better idea.
David Marjanović says

July 19, 2020 at 7:55 pm

That’s what I was trying to say: if you have to learn English spelling conventions and the very different, because sensible, spelling conventions of your own language, the stark differences could lead to confusion. Such confusion is impossible across completely different scripts, so that’s one point in their favour.

/a/ /i/ /u/ are written a i oo

I thought /a i u aː iː uː/ had ended up as a i u aa ee oo?

(That’s how Tlingit is written, incidentally but not coincidentally.)
Bathrobe says

July 19, 2020 at 8:03 pm

(Note: Xagan / qaγan ‘khan’ is written ᠬᠠᠭᠠᠨ while ger ‘yurt’ is written ᠭᠡᠷ)
David Eddyshaw says

July 19, 2020 at 8:11 pm

I thought /a i u aː iː uː/ had ended up as a i u aa ee oo?

Bardi uses oo for the short vowel /u/ as well as for /u:/. Like I say, horrible. (It uses aa ii for the other two long vowels, though. And there is also an o, which mercifully only occurs short in the language.)
David Marjanović says

July 19, 2020 at 8:19 pm

Ah yes, I had forgotten about ii. The ambiguity of oo is odd enough that I never noticed it…

This suggests that they were two separate sounds in earlier stages of Mongolian.

Aren’t they still? [x] and [g] with feminine (-RTR) vs. [χ] and [ɢ] / [ʁ] / zero with masculine vowels (+RTR)?

Clearly that used to be the [k g] vs. [q ʁ] allophony that is all over Turkic and Tungusic, and various eastern Uralic languages.
Bathrobe says

July 19, 2020 at 9:51 pm

Thanks for pointing that out. If it was allophony then they are not (technically speaking) separate phonemes, although they must have been at least partly understood that way when the script was created (or Tata-tonga ᠲᠠᠲᠠᠲᠤᠩᠭ᠎ᠠ, the Naiman who adapted the Uyghur script to Mongolian, perceived them that way).

In the modern language, the two allophones are not distinguished in Cyrillic, except at the end of words (e.g., бага /baɢ/ vs баг /bag/), although the pronunciation of the consonant in /ga/ usually differs from that of /ge/ in speech.

In Inner Mongolia, ᠭᠠ᠊ (γa) and ᠭᠡ᠊ (ge) are assigned to the same series (the ‘g’ row, if one might call it that), which makes sense since they are in complementary distribution. It makes them a whole lot easier to learn.

Other treatments I’ve seen treat ᠭᠠ᠊ (γa) and ᠭᠡ᠊ (ge) as belonging to separate, incomplete rows. By ‘incomplete’ rows, I mean that ᠭᠠ᠊, ᠭᠣ᠊, ᠭᠤ᠊ (γa, γo, and γu) is a row with only three vowels, while ᠬᠡ᠊, ᠬᠢ᠊, ᠭᠥ᠊᠊, ᠭᠦ᠊ (ge, gi, gö, and gü) is a row with only four vowels. All other ‘rows’ contain seven vowels, e.g., ᠮᠠ᠊, ᠮᠣ᠊, ᠮᠤ᠊, ᠮᠡ᠊, ᠮᠢ᠊, ᠮᠥ᠊, ᠮᠦ᠊ (ma, mo, mu, me, mi, mö, mü) has all seven vowels (Note: I’ve varied the usual order).
David Marjanović says

July 20, 2020 at 4:05 am

Compare the use of K and Q in early Greek and, with C added, early Latin: ci ce ka kr qo qu. Corinth kept a Q on its coins for centuries.
zyxt says

July 20, 2020 at 10:24 am

“it is believed that a distinctive script is a requirement for a language to be considered a genuine language”

I am aware of the situation in India, a place with a rich tradition of writing systems. But this appears to be spreading to Africa for some reason.

“why so many Protestant missionaries have created or spearheaded the creation of new scripts”

There were a couple of “standardised” traditions used by missionaries.

One tradition was to use the Standard Alphabet: Carl Lepsius “Standard Alphabet for reducing Unwritten Languages and Foreign Graphic Systems to a Uniform Orthography in European Letters” 1863.

Another tradition was to use “Italian vowels” and “English consonants”. This is used for Swahili, for example.

In Australia, the Aboriginal languages have been reduced to writing by anthropologists more often than missionaries. The principle seems to be English vowels and consonants, hence “oo” for /u/ but also strange combinations like “tj” for /c/.

The International Institute of African Languages and Cultures in 1930 proposed the Africa alphabet which contains some elements of the IPA, notably in the representation of vowels. David Eddyshaw’s comment above gives some examples.

The missionary-affiliated Summer Institute of Linguistics is not as prescriptive when it comes to orthography, as far as I can tell.
David Eddyshaw says

July 20, 2020 at 11:50 am

In Africa and nowadays, at any rate, SIL are pretty pragmatic about orthography, and make considerable efforts to consult the speakers about orthography (and everything else) rather than impose a system.

The Kusaal standard orthography was altered quite a bit for the 2016 Bible translation. The new system seems to be largely based on the work of Anthony Agoswin Musah, who is a native speaker and a trained linguist, and who’s written a full-scale grammar of Agolle Kusaal (basically his PhD thesis, as is the way of these things.) It introduces the symbols ɛ and ɔ which zyxt just alluded to; interestingly, they are strictly speaking redundant (no significant confusion arises from just using e o like the old system), but people had in fact been using them for years and they are pretty familiar to Ghanaians, not least because they are used in Twi, which people are quite used to seeing written down in the south of Ghana. It also introduces ʋ for /ʊ/, which is a definite step forward, but mysteriously still uses i for both /i/ and /ɪ/ notwithstanding. (No idea why the asymmetry.)

I eventually learnt to appreciate the older orthography, which is pretty quirky but shows considerable ingenuity in using just the plain Latin alphabet without diacritics with the addition of just ‘ and ŋ to represent what is really a pretty complex phonological system with remarkably little actual ambiguity in practice. The thing that most annoyed L1 speakers about it was the system for marking nasal vowels, which is consistent but not very intuitive (e.g. kenn for /kɛ̃/.) Unfortunately the only change made to this in the new system introduces a completely new ambiguity (gaan /gã:/ “jackalberry tree” but daan /da:n/ “owner.”) However, the change seems to have been made in response to popular demand, and is not really a problem for L1 speakers. It’s their language, after all …
David Eddyshaw says

July 20, 2020 at 12:28 pm

It occurs to me that ɛ and ɔ might well have won favour exactly because they aren’t used in English (or French) and therefore have an African vibe for local people. A sort of minimalist version of the idea floated in the article, about adopting distinctive scripts to give languages linguistic street cred. It would help that Twi is pretty high-profile and prestigious in Ghana.
Stu Clayton says

July 20, 2020 at 2:03 pm

It occurs to me that ɛ and ɔ might well have won favour exactly because they aren’t used in English

Wait, I was told by two accent coaches this year that ɔ is the first vowel in my pronunciation of English “opportunity”. By adopting it in my pronunciation of geworden, Ost usw, I was praised for getting them exactly right.

Do all you allophone connaisseurs realize how hard it is to clean up the way you’ve spoken for 50 years ? You essentially have to give up think-speaking non-stop, as I do (and most people too, I ‘spec) and concentrate on some stupid vowel coming down the road. Even my aural feedback misleads – pronouncing geworden with ɔ sounds stilted in my mind’s ear.

I was told that my accent is charmant, why change ? Oh, the shame of it ! A charming old fart.
languagehat says

July 20, 2020 at 2:30 pm

DE was talking about the symbols, not the sounds.
Jen in Edinburgh says

July 20, 2020 at 2:48 pm

because the English spelling conventions would only lead to confusion when the same Latin letters are pronounced differently between the omnipresent English and whatever other language.

This doesn’t seem to be a particular problem within Europe. Not that people would necessarily have spoken English in earlier days, I suppose, but probably Latin – I can’t imagine that using the same letters for different spelling conventions would have been foreign to many of them.

(No doubt Manx is perfectly readable with practice, but its phonetically English spelling just looks awkward to me, compared to Gaelic and Irish which seem to just use the letters as they need them.)
Etienne says

July 20, 2020 at 4:44 pm

Thanks for the comments, everyone.

David Eddyshaw: I quite agree that script invention in most cases will not do much (if anything) to further the cause of indigenous/minority language literacy. What I would like to know is why script invention is believed to be an answer in Africa and India, and not in other parts of the world.

David Marjanović : I do not think the poor fit between sound and spelling in English explains script creation: French spelling is not much better from that point of view, and yet francophone missionaries seem to have been as reluctant to create new scripts as their Spanish- or Italian-speaking colleagues. Furthermore, plenty of adaptations of Roman script made by anglophone protestants have proven themselves quite adequate, despite being very un-English (or indeed very un-European) in terms of the sound values of the letters (For example, Fijian has an orthography created by a Scottish missionary, with such “exotic” values as G being used for the velar nasal, or Q for pre-nasalized /g/).

In my earlier comment I contrasted Catholic and Protestant missionaries/religious founders: the latter created such things as Canadian aboriginal syllabics, the Fraser alphabet, the Pollard script, or the Deseret alphabet. The former, on the other hand, while they certainly adapted the Roman alphabet to a great many languages, never seem to have created new scripts. It occurred to me (about two minutes after my original message was up) that the history of indigenous language literacy involving the Orthodox Church is more Catholic-like than Protestant-like: from the Balkans to Alaska, a great many languages were first written down by Orthodox clergy/missionaries, and yet while the Greek/Cyrillic alphabet was adapted in all sorts of ways to various languages (the Old Permic script being perhaps the most extreme example), nowhere was a new script created. Which fits my theory: Orthodox missionaries/priests, like their Catholic counterparts, lived in a culture dominated by a single script (Latin on the one hand, Greek/Cyrillic on the other).

Finally, I am struck by the fact that over the past few centuries two hotbeds of script creation were Albania and Somalia, two countries where Arabic and Roman script had been in a state of competition (plus Greek in the case of Albania and Ethiopic in the case of Somalia): this fact does not prove that I am right, but it is compatible with my theory, I think.
David Eddyshaw says

July 20, 2020 at 6:39 pm

Good point about French; though, at least as far as vowel symbols go, I don’t think French can really be accused of being as left-field as English. And it’s got a handy precooked way of distinguishing /ɛ/ from /e/ as è and é, which is readily extended to distinguishing /ɔ/ from /o/ as ò and ó (works fine until someone decides that you need to mark tone as well.) The older Mooré orthography worked like that (generalising the principle even to the i/u vowels.)

I wonder if there is an interaction with different colonialist ideologies: the French (to grossly oversimplify) was – at its most idealistic, anyhow – to spread the benefits of French culture everywhere and essentially create more Frenchmen of all races all over the world, which has certainly led to a more culturally centralising (not to say culturally imperialist) set of practices; the British, on the other hand, were more into Indirect Rule, and hence (in principle, at any rate) more comfortable with the idea that local peoples might want to do things in their own idiosyncratic ways, so long as they were all gratefully loyal and paid their taxes. (Also, it’s much cheaper. Nation of Shopkeepers, after all …)
J Pystynen says

July 20, 2020 at 6:43 pm

English spelling conventions would only lead to confusion when the same Latin letters are pronounced differently between the omnipresent English and whatever other language

If this is the problem, you can also do worst-of-both-worlds and design something like Saanich! with conventions inspired by but still gratuitously different from English, e.g. ‹Á Í› being /e əj/.

the [k g] vs. [q ʁ] allophony that is all over Turkic and Tungusic, and various eastern Uralic languages

Yukaghir, too, for good measure.
David Marjanović says

July 20, 2020 at 7:13 pm

I was told by two accent coaches this year that ɔ is the first vowel in my pronunciation of English “opportunity”

That’s [ɔ]dd – it’s n[ɔ]t what I’d have expected from a Texan; it’s Very British. Are you sure you aren’t using something unrounded, like [ɑ], and the accent coaches were just making assumptions?

This doesn’t seem to be a particular problem within Europe. Not that people would necessarily have spoken English in earlier days, I suppose, but probably Latin – I can’t imagine that using the same letters for different spelling conventions would have been foreign to many of them.

Everybody on the European mainland pronounces Latin according to the spelling conventions of their native language, or nearly so. Here, have some Latin in German pronunciation.

I do not think the poor fit between sound and spelling in English explains script creation: French spelling is not much better from that point of view

Oh, but it is! In French, the difficulty is one-sided: given the pronunciation, it is hard to infer the spelling (often several options are equally likely), but given the spelling, the pronunciation is obvious – except exceptions, but the exceptions are pretty much limited to function words (eu), proper names, and the occasional quirk like the leftover i in oignon. In English (and apparently Manx), well, if the spelling contains ea, ou, ow or ugh, the best you can do is infer two most likely pronunciations, and pretty often they’re both wrong…

The trick about Cyrillic is that it was long seen as just Greek with extra letters. Yet more letters are created whenever needed, so that hardly any two Slavic languages use the same inventory of Cyrillic letters. The Soviet alphabets of the 1930s simply extended this principle: they’re practically all Russian Cyrillic with extra letters. In sharp contrast, there’s almost no tradition of inventing new letters for the Latin alphabet; apart from the formalization of a few distinctions (i/j, u/v, ss/ß/sz), there’s practically nothing between the medieval þ and ø and – tellingly – the Soviet alphabets of the 1920s. Instead, the tradition is to use diacritics or digraphs (or trigraphs), with occasional reinterpretation of existing letters. Even the First Grammarian, who was happy to use þ among other things, marked vowel length with an accent and vowel nasality with a dot.

On top of that, there have – if only by historical accident – never been Cyrillic orthographies with as much irregular historical baggage as French, let alone English. Prerevolutionary Russian had its quirks, but less than modern German (let alone contemporary German).

Albania and Somalia

I agree they fit very well.

distinguishing /ɔ/ from /o/ as ò and ó

Or o and ô, which has also been used in some places in West Africa.
zyxt says

July 20, 2020 at 7:22 pm

David, there are quite a few examples of new letters being invented for languages written in Latin script. I’ll provide more examples later as I’m doing the morning rush hour. One example for now: new letters for Mayan used by the Spanish.
Rodger C says

July 21, 2020 at 9:59 am

I know one of those: Mayan uses ɔ for /ts’/, also spelled dz when you don’t have ɔ in your font.
Rodger C says

July 21, 2020 at 10:01 am

Also, the Deseret script was invented by Catholic missionaries??
David Marjanović says

July 21, 2020 at 11:40 am

No, “the latter” refers to “Protestant […] religious founders”, and while Mormons can hardly be said to be Protestant, they still fit that label better than “Catholic”, so I didn’t say anything…

Its purpose, though, seems to have been to cut the Mormon community off from the outside world and the past. When that turned out to be impossible and unnecessary, the Deseret alphabet was abandoned.
Etienne says

July 21, 2020 at 5:16 pm

David Marjanović: I will certainly grant that for a reader French spelling is indeed easier than English. But when it comes to adapting this spelling system to another language, I don’t see either as being better than the other: both languages are quite idiosyncratic in terms of representing various vowels: it is bizarre to have /u/ represented by OO in English or OU in French, to take an obvious example. And while I agree that no language written in Cyrillic has ever exhibited the sound-spelling gap which bedevils both languages, Modern (and Medieval…) Greek spelling is definitely comparable to them when it comes to the lack of fit between sound and spelling, especially when vowels are involved.

David Eddyshaw: Keep in mind that in French itself most /ɔ/-/o/ minimal pairs are spelled as O versus AU (cf. /pɔm/ “pomme” versus /pom/ “paume”), so that extending the accent marks from E to O is quite un-French, graphically (doing so does mean you can use AU to represent any /au/-like diphthong, another very un-French feature). As for your question on the link between orthography and different colonialist philosophies: there may be something to it, but keep in mind that the French Third Republic (for instance) was VERY anti-Catholic, indeed more generally anti-religious, and thus the missionaries who first designed writing systems in its colonies would definitely not have seen themselves as bearers of (any) official French ideology.

zyxt: Among new symbols created to augment the Roman alphabet when adapted to non-European languages, one of the oldest I know of is the use of 8 (originally, a short o: ŏ) to represent /w/ in aboriginal languages of New France: this symbol is still in use among some Ojibwe-speaking communities of Quebec.

Oh, another argument in favor of my thesis: while both the Pollard and the Fraser scripts were invented by Protestant missionaries in South-East Asia in order to represent various local languages in a phonologically adequate way, tones and all, Catholic missionaries in the same part of the world came up with a different solution. Faced with a language, Vietnamese, whose phonology is also typical of South-East Asia, they adapted the Roman script and created Quốc ngữ. To the best of my knowledge, no attempt was ever made to create a brand-new script to represent Vietnamese.
Rodger C says

July 21, 2020 at 6:32 pm

I thought that 8 for /w/ (and also /u/) was a French abbreviation derived from the Greek digraph for ou (no access to a Greek font at the moment).
ktschwarz says

July 21, 2020 at 7:32 pm

Wikipedia on Ou (ligature):

Ou (Majuscule: Ȣ, Minuscule: ȣ) is a ligature of the Greek letters ο and υ which was frequently used in Byzantine manuscripts. …

The ligature is now mostly used in the context of the Latin alphabet, interpreted as a ligature of Latin o and u: for example, in the orthography of the Wyandot language and of Algonquian languages e.g. Mohawk and Western Abenaki to represent /ɔ̃/, and in Algonquin to represent /w/, /o/ or /oː/. Today, in Western Abenaki, “ô” is preferred, and in Algonquin, “w” is preferred.

But in the image from an 1871 Algonquin calendar, the printer was obviously using a numeral 8: it’s closed at the top and identical to the 8 in 1871.
zyxt says

July 22, 2020 at 9:15 am

Thanks Etienne for bringing up 8.

Here is what I wanted to add about additions of new letters to the Latin alphabet.

The Latin Extended-D block in Unicode contains a lot of the letters I had in mind, eg. the tresillo and cuatrillo used in the 16th century for Mayan languages, ꞗ and đ for 17th century Vietnamese, and the 17th century Latvian letters.

The Polish Ł was introduced around the 16th century. Croatian and Slovenian were also productive in introducing new letters for sounds not found in Latin. For example, in Croatian poetry of the 16th century, a letter resembling x but with a loop was used for /ʒ/. Use was also made of s and ſ to represent different sounds in both Croatian and Slovene, eg. in Slovenian s stood for /z/ and ſ for /s/.

In the early years of the 19th century, there was a three-sided alphabet war in Slovenia between the proponents of the old Slovene alphabet and the newly created Dajnko and Metelko alphabets (these were Latin with the addition of several new letters to represent affricates, and in Metelko’s case to represent vowel sounds as well). Many publications including school books were issued in all three alphabets, but the Croatian alphabet (so-called “gajica”) won out in the end.

Đuro Augustinović in the 19th century invented a number of letters to represent Croatian affricates. He used his alphabet in all his publications, including a medical journal he edited in the middle of the 19th century. However, his proposals were not accepted more widely as he worked outside Croatia.

In Europe, Romance languages generally had no need for modifications to the alphabet, because of the strong influence of Latin. For Germanic and Finnic languages, the consonant inventory was not a problem while for vowels, different combinations of existing vowel letters were used, as opposed to creating new vowel letters – unless of course accented letters are regarded as “new” letters – eg. ä is considered as a letter separate from a. It is really the Slavic and Baltic languages that saw the greatest need for new letters.

Albanian and Maltese too have this need, and there has been a lot of activity around inventing letters for sounds specific to these languages – unfortunately, Unicode has not incorporated any of these as yet. The Wikipedia articles on Albanian and Maltese alphabets provide more information. I provided the original write up, but they are in need of an update – something that’s been on my to-do list for a while. The wonderful “Specimen characterum typographei S. Concilii christiano nomini Propagando” from 1843 is also useful in this regard.

Finally, Latin itself has had a number of letters invented and abandoned over the course of its history. However, there is a perception that there has not been much change in the Latin alphabet since classical times. I suspect that this is due to the aim of learning Latin in schools being about reading classical authors and poetry using a standardised alphabet. The consequence of this is that textbooks do not deal with palaeography. At most, a textbook might mention that in Latin I/J and U/V were not distinguished. I have yet to come across a Latin textbook that mentions the variety of letters used over the 2 thousand year history of Latin: eg. I longa, apices, Claudian letters, the use of Æ, W, &, and other ligatures, Tironian notes, scribal abbreviations etc.
David Marjanović says

July 22, 2020 at 5:31 pm

both languages are quite idiosyncratic in terms of representing various vowels: it is bizarre to have /u/ represented by OO in English or OU in French, to take an obvious example.

My point is that in French it’s always ou. That may be inefficient, but it’s consistent enough to be exported – or abbreviated, to 8 as mentioned, or to o in Malagasy that doesn’t have a separate /o/.

To the best of my knowledge, no attempt was ever made to create a brand-new script to represent Vietnamese.

Several were actually made, but none of them caught on; they were made just as Quốc ngữ was catching on (in reaction to it and to its support by the colonial authorities).

That said, what was actually used before Quốc ngữ was Chữ Nôm, Chinese characters supplemented by a considerable number of newly created characters (many more than the few kokuji that were created in Japan).
Etienne says

July 22, 2020 at 7:14 pm

All: Thanks for the correction on the origin of 8 as a symbol representing /w/. At the hattery, you really do learn at least one new thing every day…

Oh, and I freely admit that my theory on script creation being a specifically Protestant tendency might just be a manifestation of Quebec ethnocentricity on my part: in Quebec Aboriginal languages written in syllabics (East Cree, Inuktitut, Naskapi) are spoken by groups who remain mostly Protestant today, whereas the major languages spoken by Catholics (Innu, Attikamekw, Ojibwe, Micmac) are written in the Latin alphabet. It would be natural for me to assume this sort of division to be true of Catholic and Protestant missionaries/linguistic pioneers in general.

zyxt: Romance languages have had a stormier relationship with Latin orthography than you seem to believe! Many would-be reformers suggested various new letters, diacritics or the like. One of my favorite such suggestions is an Italian scholar who, in Renaissance times I think, wanted to adapt the Greek letters eta and omega to Italian, in order to have seven vowel letters, each corresponding to an Italian vowel phoneme: Eta and omega would have served to represent the mid-low front and back vowel phonemes, respectively. And schemes involving the Latin script with added Cyrillic or Cyrillic-like letters certainly were taken seriously in nineteenth-century Romania.

David: Granted, English has far more ways than French of indicating /u/ in writing (although in French, you might want to include all instances of OU + silent E and/or silent consonant): my point is that when it comes to using English orthography to create an orthography for another language, OO would be the default representation of the vowel /u/ for anglophones, just as OU would be the one for francophones. That English spelling is inconsistent is something I will not dispute, but what I dispute is the (seeming) claim that this inconsistency makes it impossible for anglophones to have a default fashion of representing a given (English) phoneme in writing.
zyxt says

July 22, 2020 at 7:22 pm

Etienne, it would be great to know more about alphabet reform in Romance. I was referring more to usage than to proposals (ie. Unicode worthy characters).
In English too, William Bullokar was one of the pioneers of alphabet reform, but this never caught on.
I wonder if Basque too had attempts at adding new letters to the alphabet?
PlasticPaddy says

July 23, 2020 at 5:28 am

@etienne
Maybe the point is that ou is univalent in French whereas oo is multivalent in English (except for a few speakers in Scotland or Dublin).
January First-of-May says

July 23, 2020 at 6:47 am

doing so does mean you can use AU to represent any /au/-like diphthong, another very un-French feature

Of course, if pressed, French speakers can always use aou, like Brillat-Savarin did for Bugésien:

“Ce qui caractérise ce patois, c’est une diphthongue que je ne connais dans aucune langue, et qu’on ne peut exprimer par aucun caractère connu. Elle se prononce aou, comme dans les mots baou, laou, taou et saou, qui signifient une écurie à bœufs, un loup, un tuf et un sureau. Les trois voyelles ne donnent qu’un seul son.”

Or, in the (slightly corrected) words of Google Translate:

“What characterizes this patois is a diphthong that I do not know in any language, and that cannot be expressed by any known character. It is pronounced aou, as in the words baou, laou, taou and saou, which mean an ox stable, a wolf, a tuff and an elderberry. The three vowels give only one sound.”
John Cowan says

July 23, 2020 at 2:55 pm

if you have to learn English spelling conventions and the very different, because sensible, spelling conventions of your own language, the stark differences could lead to confusion

The Osage script[*] was devised by Osage educators when they found that students became confused between Osage Latin and English sound-to-spelling conventions, exactly as described above. The letters are recognizably Latinish, but different enough, it seems, not to trip English-language reflexes. The original 2006 version was very successful, but when the Unicadets (including Michael Everson) got involved in 2014, some improvements were made by a working group of the Osage Nation government:

1) One of the two existing vowel nasalization diacritics was dropped in favor of the other.

2) Two ligatures considered unnecessary were abolished.

3) A letter representing indifferently /x/ or /ɣ/ was split into two letters because the distinction turns out to be phonemic, though with a very low functional load.

4) The preaspiration mark was abolished in favor of unitary letters for all preaspirated stops, as some dialects (conservatively) pronounce them as geminates; this is something like the Breton spelling zh for what is [h] in Vannetais but [z] elsewhere.

Most significantly, Osage now has a bicameral alphabet, though the lower case is basically just small capital letters graphically. The same thing happened to Cherokee at about the same time, and to the Old Hungarian runes somewhat earlier. The capitalization rules are basically those of English or Hungarian as the case may be.

The two orthographies for Occitan, the traditional and the Mistralian, are an interesting case. Traditional orthography is very un-French and is diaphonemic like English orthography. Mistralian is clearly designed for people who know French orthography well (as essentially every literate Occitan(e) does), but suits only Provençal dialect.

Here’s Ivan Derzhanski’s discussion of the question (he was talking specifically about Bulgarian, but it’s generally applicable):

Three questions were asked, and thrice each one was answered:

(1) Must there be one speech, and one writing?
a. nay and nay; b. nay and aye; c. aye and aye.

(2) What must a word know?
a. its roots; b. its kin; c. itself.

(3) What must a tongue know?
a. its roots; b. its peers; c. itself.

So if one man says [maɪt] and the other says [mɪxt], (1a) each might spell the word in a way that reflects his own pronunciation; or (1b) they might both spell it the same, say might, but pronounce it in different ways; or (1c) one pronunciation, say [maɪt], might be made standard and reflected in the spelling, and the others would have to cope with that.

And a word such as [saɪn] might be (2a) spelt with a g in it because it comes from Latin signum, or (2b) spelt with a g because it is cognate to [‘sɪgnəl], or (2c) spelt without g, because no [g] is pronounced.

And the whole orthography might be made to assert (3a) the tongue’s ancient lineage and old glory, or (3b) its kinship and liaisons with others (this one need not be a conscious priority […]), or (3c) its individual and inimitable character (this is where national scripts and ‘national characters’, such as Spanish ñ, come in).

[*] I’m linking to the final Unicode proposal rather than the WP article, because WP just assumes that you have an Osage font, whereas the proposal is a PDF with an embedded font.
David Marjanović says

July 23, 2020 at 4:25 pm

Oh yes, Osage is a great example that I completely forgot about.
Etienne says

July 23, 2020 at 5:48 pm

John Cowan, David: I don’t see Osage as being that relevant to the matter. My understanding is that the language is no longer being transmitted within the home, and thus Osage orthography does not involve having a written form which young native speakers of Osage learn in order to write a language which they already know: rather, it involves an orthography which native English speakers of the Osage nation need to learn as a first step to learning Osage.

zyxt: At least one attempt was made to create a diacritic-rich Basque orthography: this was the work of Sabino Arana (better known as a founder of Basque nationalism), which did not catch on. In France, two of the more radical would-be reformers of French spelling were Jacques Pelletier du Mans and Louis Meigret (first half of the sixteenth century, both of them). You can find an example of the former’s reformed French orthography at his Wiki page in French, and an example of the latter’s at his Wiki page in German.
David Marjanović says

July 24, 2020 at 8:26 am

rather, it involves an orthography which native English speakers of the Osage nation need to learn as a first step to learning Osage.

And to keep them from confusing those spelling conventions with the English ones, they’re made visibly different as a whole other script, like Canadian aboriginal syllabics or Chinuk pipa.
John Cowan says

July 24, 2020 at 6:31 pm

My understanding is that the language is no longer being transmitted within the home

Quite so, but I don’t really think the situation is so different. Students are asked to become literate in both English and Osage, and demonstrably it’s easier for them to do so if English letters are associated only with English pronunciation and Osage letters only with Osage pronunciation. This would be equally true whether their native language was English (as it is) or Osage (as it is not).

As another simpler example, there are several languages which I can pronounce from the written form but do not know well or at all. So for example I can (with a few glitches) read out a passage of German (in which I have a good, if archaic, accent) or French (in which my accent is terrible) that is far beyond my understanding.

That is, I can do this until I come to a number written in figures, and then I instantly revert to English, even though I know how to name numbers in German and French. It is a terrible effort for me to see 123 and read it as anything but “one hundred and twenty-three”.

orthography which native English speakers of the Osage nation need to learn as a first step to learning Osage

I certainly hope this is not the case, and that the adult L2 Osage-speakers who teach the classes begin with spoken Osage before introducing the written form.
D.O. says

July 24, 2020 at 7:19 pm

It is a terrible effort for me to see 123 and read it as anything but “one hundred and twenty-three”.

Numbers are a notoriously difficult thing to switch from your native language. Would it be difficult for you to, for example, expand an initialism that you know? Let’s say can you quickly expand OTAN or CERN?
David Eddyshaw says

July 24, 2020 at 7:37 pm

It’s not really difficult if they don’t match the English equivalent (as with NATO/OTAN) or don’t really have one (like CERN or AOF.)

French year dates often cause me to hiccup in reading aloud, not only because of the numeral issue that JC mentions but because they structure differently in English and French.
Bathrobe says

July 24, 2020 at 9:56 pm

I’ve always found numbers in foreign languages to be damnably difficult. Not because they’re difficult in themselves but because numeracy and literacy seem to be processed in different parts of the brain.

After many, many years I’ve basically mastered the Japanese and Chinese numerals, which are based on four places* unlike three places in English, and can switch back and forth from these languages and English. Now I’m in Mongolia, where English or Western-style three-place counting has been adopted from the Russians (unlike Inner Mongolia where they retain the traditional four-place system), and it wreaks havoc with my mental arithmetic. It should be easy because it’s the same as English but old habits die hard.

* I don’t know the technical term for this. Basically it’s where the comma logically goes. For English it’s after every third digit. In Chinese and Japanese it’s after every fourth digit.
E says

July 24, 2020 at 10:30 pm

Etienne says: Colin Masica wrote that script invention in India is linked to the already existing multiplicity of scripts: Because of this existing large number of scripts, it is believed that a distinctive script is a requirement for a language to be considered a genuine language, rather than a patois/dialect/whatever.

Could I get a reference for this? It’s a compelling argument, I’d like to read whatever article or book it comes from.

Finally, I am struck by the fact that over the past few centuries two hotbeds of script creation were Albania and Somalia, two countries where Arabic and Roman script had been in a state of competition

I think this is a coincidence, because most of the rest of the Arabographic world was a “hotbed of script creation” during the past two centuries, too. There were more than 50 new scripts proposed for Persian in Iran alone, just from the end of the 19th century to the middle of the 20th. (This is something I’m writing about elsewhere, as it happens).
zyxt says

July 25, 2020 at 12:16 am

Re Albania, Somalia and Iran: This is probably more to do with the inadequacy of an abjad to represent fully all the sounds of a non-Semitic language. Turkish, Azeri, Kurdish, Kazakh, Uzbek, Turkmen, Tajik, Malay, etc have all converted to an alphabet.

The Albanian case is special because the earliest attestations are around the same time as the Ottoman conquest, with the Catholics using the Latin alphabet and the Orthodox the Greek. There was also a 4 way tension between Latin, Cyrillic, Greek and Arabic alphabets, which gave impetus for indigenous alphabets to be invented.
E says

July 25, 2020 at 12:26 am

The script changes in many of those cases have been just as much about identitarian factors as utilitarian ones – seeking to align Turkey with Europe, or imposing Cyrillic on all the languages of the USSR aside from Armenian and Georgian. In Iran the proposals for alternative scripts were rarely about utility (the Arabic script isn’t perfectly suited for Persian, but it’s about as good a fit as you can get outside the Semitic languages due to Persian’s small vowel inventory and historical relationship with Arabic) but about identity, tied to Iranian nationalism.
SFReader says

July 25, 2020 at 1:25 am

Estonian, Latvian and Lithuanian always had Latin script.

Finnish too (Finnish was official language of the Karelian Republic).
January First-of-May says

July 25, 2020 at 6:21 am

It is a terrible effort for me to see 123 and read it as anything but “one hundred and twenty-three”.

I mentally pronounce numbers in an idiosyncratic way based on Russian, which wreaks utter havoc whenever I need to put an English indefinite article before a number.
(I don’t recall offhand if I ever got around to describing this system on LH, though I do recall writing it down in a LH comment entry at least twice. [It doesn’t help that I’m not actually sure of some of the details.] Notably, initial 1 is [approximately] /i(n)/, so anything starting with 1 gets “an”.)

As for reading out loud, I think I’ve learned to pronounce English numbers in English text, but in anything else I would hesitate at best (and just use Russian numbers if I expect the hearer to understand them).

Estonian, Latvian and Lithuanian always had Latin script.

IIRC, there’s a bunch of very early attestations in Cyrillic, from back when it was the western fringe of the Novgorod trade area. [EDIT: apparently that was Karelian.]

Apparently Lithuanian used to be officially written in Cyrillic between 1864 and 1904 (most non-official people used Latin script anyway).

Finnish was official language of the Karelian Republic

Yes, but in the late 1930s, so was Karelian – written in Cyrillic.
zyxt says

July 25, 2020 at 6:38 am

Agree with SFReader. Latin was also used for Germans in the Volga German ASSR, Greek in the case of the Pontic Greeks and let’s not forget the Hebrew for Yiddish in the Jevrejskaja ASSR.

It appears that there are some “identitarian factors” but the point is that an alphabet was adopted where the language in question was not Semitic, and Arabic was inadequate to represent accurately the sound system of the language in question. eg. Malay.

On the other hand, we have the example of Berber languages (re-)adopting the Tifinagh abjad (and yes, the modern version of tifinagh is used as an alphabet as well).

It also appears to be the case that in Pakistan, Arabic is used with modifications for each separate language. Some of these function more like an alphabet than an abjad.
E says

July 25, 2020 at 12:50 pm

My bad on Cyrillic in the USSR, I was mistaken.

It also appears to be the case that in Pakistan, Arabic is used with modifications for each separate language.

Again, this is true of most Arabographic languages, not just in Pakistan.

Some of these function more like an alphabet than an abjad.

Like which? I wasn’t aware of this being the case with any languages spoken in Pakistan, though it’s true of Sorani Kurdish, Uyghur, and some others elsewhere.
Etienne says

July 25, 2020 at 5:46 pm

E: The Colin Masica reference (about the South Asian sense that a distinctive script is required for a language to be considered a full-fledged, genuine language) is drawn from…(INSERT DRAMATIC DRUMROLL HERE)…his 1993 book THE INDO-ARYAN LANGUAGES (Cambridge University Press), page 144 (But considering your stated interest in scripts, I recommend reading the entire chapter on scripts, pages 133-153. Actually, the whole book is worth reading, if you ask me).

And to repeat what I wrote upthread, you always learn something new here at the Hattery. FIFTY new scripts proposed for Persian alone? I had no idea. Where will you be writing/publishing on the topic?
John Cowan says

July 25, 2020 at 6:13 pm

Basically it’s where the comma logically goes. For English it’s after every third digit. In Chinese and Japanese it’s after every fourth digit.

In India and nearby countries large numbers are written like 12,34,56,789, which in English is “twelve crore thirty-four lakh fifty-six thousand seven hundred eighty-nine”.
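For anyone who wants to play with the three grouping conventions just described, here is a minimal sketch in Python. The function name `group_digits` and the style labels are made up for this illustration; it only handles non-negative integers and always uses a comma as the separator, which real locale conventions do not necessarily do.

```python
def group_digits(n: int, style: str = "western") -> str:
    """Insert commas into a non-negative integer using a digit-grouping style.

    The style names are hypothetical labels for this sketch:
      "western"    - groups of three digits (123,456,789)
      "east_asian" - groups of four digits (1,2345,6789)
      "indian"     - last three digits, then groups of two (12,34,56,789)
    """
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")

    # Group sizes counted from the right; the last size repeats indefinitely.
    sizes_by_style = {
        "western": [3],
        "east_asian": [4],
        "indian": [3, 2],
    }
    if style not in sizes_by_style:
        raise ValueError(f"unknown style: {style}")
    sizes = sizes_by_style[style]

    digits = str(n)
    groups = []
    index = 0
    while digits:
        size = sizes[min(index, len(sizes) - 1)]
        groups.append(digits[-size:])   # peel one group off the right end
        digits = digits[:-size]
        index += 1

    return ",".join(reversed(groups))


print(group_digits(123456789, "western"))     # 123,456,789
print(group_digits(123456789, "east_asian"))  # 1,2345,6789
print(group_digits(123456789, "indian"))      # 12,34,56,789
```

The Indian case is the only one with two group sizes: the rightmost group of three covers the units up to the thousands boundary, and every group after that has two digits, matching thousand, lakh, crore.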
gwenllian says

July 25, 2020 at 6:14 pm

At least in Nigeria there does not seem to be a massive shift away from African languages, unlike the case in a neighboring oil-producing country: I recently learned that according to the most recent census, some 71% of Angolans speak Portuguese at home. It is clear that, a generation hence, a strong majority of Angolans will be native speakers of Portuguese.

I remember hearing (probably here) about the scale of language shift in Angola but, wow, this is catastrophic. How much pushback is there against this process?
zyxt says

July 25, 2020 at 8:16 pm

“Some of these function more like an alphabet than an abjad.”

For example Baluchi – there is systematic representation of Baluchi vowels. But my knowledge of Baluchi is minimal, so not sure whether this is done in everyday usage, or only in certain publications (eg. school books) where it is important to set out the vowels.
John Cowan says

July 26, 2020 at 12:10 am

How much pushback is there against this process?

It’s pretty much a consequence of internal migration, which was pretty much the result of 40 years of war, first the war for independence from Portugal and then a civil war between two major groups. Rural Angolans, who mostly spoke one of the 26 major indigenous languages, ended up moving to the cities to make a living or for sheer survival. The lingua franca in those cities was Portuguese, and it’s Portuguese that was the L1 of the second generation that repopulated to the countryside.

At present Angola is about 40% L1 Portuguese speakers, more than any other L1, and another 30% of Angolans speak it at home even though it is not their L1. So in the next generation Portuguese will be overwhelmingly the L1 of Angolans, and that is as irreversible as the dominance of English in the U.S. The languages of cities are the languages that will survive the 21C.
Etienne says

July 26, 2020 at 6:33 pm

Gwenllian, John Cowan: I suspect there is more to the spread of Portuguese in Angola than large-scale internal migration (not least because, depressingly enough, such large-scale internal migration has been all too common in Africa over the past few generations…): what the explosive urbanization of many African countries has yielded in many countries, linguistically, is the birth of a distinctive urban variety of an African language becoming a lingua franca in the process (Wolof in Dakar, for instance) rather than the spread of the local official European language. The question should be, why did Kimbundu (A major language of Angola, originally spoken in Luanda and neighboring areas) fail to become the (informal) lingua franca of Luanda?

One such factor, I suspect, is the fact that the Portuguese presence in Africa dates back much further than is the case for most European colonizer states in subsaharan Africa: Luanda was founded in 1576, almost two generations before the first English pilgrims arrived at Plymouth Rock, for instance. Partly because of this Angola has a Catholic majority today (and has Protestantism as its second most important religion), and I suspect this may have predisposed non-Lusophone Angolans in the seventies to perceive anything European/Portuguese as being inherently of value. Including the Portuguese language. Perhaps to such a degree that Kimbundu in Luanda, unlike Wolof in Dakar, simply lacked the prestige required to become an urban lingua franca.

Gwenllian: Not only does there not seem to be any pushback against this process, but from what I have read the process is continuing, in the sense that Angolan Portuguese is shedding its distinctive features: apparently, the speech of the younger generation in Luanda is increasingly indistinguishable from their peers’ speech in Lisbon.

P.S. I welcome Hatters’ corrections and further thoughts on the matter: I am no expert on Angola or indeed on subsaharan Africa, believe me!
rozele says

July 26, 2020 at 8:57 pm

“Some of these function more like an alphabet than an abjad.”

this is true for yiddish and djudezmo as well, adapting the jewish alphabet to their fusion-language needs.

(and/but/though there are plenty of examples, down to the present, of full nekudes [vowel diacritics] being used on fully alphabetic yiddish texts, which i’ve found very useful at times for understanding spelling conventions in texts that don’t use standardized spelling – i also find it charming…)

english written in the jewish alphabet generally adapts yiddish spelling conventions (with some innovations for /θ/, /ð/, &c).

i don’t know enough to know whether other jewish languages use the alefbeys in more alphabetic or abjadic [sic?] ways. i wonder whether it correlates to relationships to semitic languages (e.g. haketia or other judeo-arabics being more abjad-y), to other coterritorial scripts (e.g. judeo-persian or jewish aramaic being more abjad-y), or to anything at all…
Brett says

July 26, 2020 at 11:18 pm

@rozele: This Web page seems to have a good description of how the Hebrew letters were adapted and modified for representing Ladino alphabetically.
Jonathan Morse says

July 27, 2020 at 3:52 am

Rozele, here’s an example.

https://jonathanmorse.blog/2020/07/26/signal-in-transit/
David Eddyshaw says

July 27, 2020 at 3:56 am

Mandaic uses the familiar Aramaic letters in a way which is basically alphabetic; the development is not unlike Yiddish, in that letters for consonants which are no longer in use are repurposed as vowel symbols. Not unlike Greek come to that …
TR says

July 27, 2020 at 5:33 pm

Compare the use of K and Q in early Greek and, with C added, early Latin: ci ce ka kr qo qu

Greek sometimes used koppa before consonants, too. I remember a presentation by an eminent epigrapher on a newly discovered inscription that referred to Croesus. The inexperienced scribe, he mentioned in passing, had spelled the name as ΟΡΟΙΣΟΣ instead of ΚΡΟΙΣΟΣ — a strange mistake, but there you go. The very first questioner in Q&A pointed out that of course the Ο must be a Ϙ.
gwenllian says

August 6, 2020 at 4:41 pm

Thanks, John and Etienne! Yeah, I was wondering why things seem to be playing out so differently than in other African countries with internal refugee situations and intense urbanization. I do remember certain cities having a lot of L1 French speakers in some countries, but (for now at least) not quite like what’s happening in Angola.

Gwenllian: Not only does there not seem to be any pushback against this process, but from what I have read the process is continuing, in the sense that Angolan Portuguese is shedding its distinctive features: apparently, the speech of the younger generation in Luanda is increasingly indistinguishable from their peers’ speech in Lisbon.

P.S. I welcome Hatters’ corrections and further thoughts on the matter: I am no expert on Angola or indeed on subsaharan Africa, believe me!

Anyone know if Angolan Portuguese features are stigmatized and actively discouraged or is this mostly happening through the influence of Portugal Portuguese media? Or both? Is there a lot of such influence from Portugal? I remember hearing a lot about an uptick in Portuguese migrating to Angola a couple of years ago. I wonder if that figures at all into this lisbonization of Angolan Portuguese.
Etienne says

August 6, 2020 at 6:21 pm

Gwenllian: It turns out that whereas Angola will have a lusophone L1 majority in a generation, in another former Portuguese colony, Sao Tome e Principe, this is possibly already the case: according to some sources a majority of its inhabitants have Portuguese (and not one of the local creole languages) as their L1. Two of the most linguistically gallicized countries in Africa, Congo (the former French, not the former Belgian colony!) and Gabon, are geographically next door: I wonder if this geographical clustering of African countries where European colonial languages are spreading as first languages is significant. Thoughts, anyone?

In answer to your question on whether Angolan features of Portuguese are deliberately stigmatized by the school system: I do not know, but a few years ago I obtained via inter-library loan a book from Mozambique which I thought was a description of the local variety of Portuguese, but which turned out to be a Portuguese grammar for teachers which indeed described local Mozambican features (mostly variable gender and number agreement and somewhat distinctive use of prepositions, as I recall) …so that teachers would have an easier time teaching their students “proper” Portuguese, by knowing what sort of “mistakes” their students were especially liable to make. Considering the many similarities between Angola and Mozambique, I would expect similar such school grammars to exist in Angola as well.
David Marjanović says

August 6, 2020 at 6:42 pm

The child soldiers in the formerly Belgian Congo are fluent in French, too; I have no idea what else they speak.
David Eddyshaw says

August 6, 2020 at 6:48 pm

@Etienne:

I think (as you suggested before in this context) that it is probably urbanisation in Africa that is significant. Big cities are likely to drive the development of lingua francas; if no African language is advantageously placed to fulfil this role, then a colonial language may well do so. Important factors are going to be the degree of ethnic diversity within the cities (which varies a lot) and the extent to which African lingua francas are already firmly established to begin with (likewise.) In cases like Kano (with Hausa) and Kumasi (with Twi) everything comes together for an African language; and even Lagos is in the middle of a large pretty-thoroughly Yoruba-speaking area.

It’s interesting to speculate about why Congo-Brazzaville is rather different from Congo-Kinshasa; I don’t know if Lingala was never as thoroughly established in the former. There are a lot of other confounding factors (like deliberate promotion of Lingala in the Zairean army in Mobutu’s day.)
Etienne says

August 7, 2020 at 5:47 pm

David Eddyshaw: it was Lameen, not I, who brought up urbanisation as a key driver involved in the birth and spread of lingua francas in Africa (see his April 29 5:07 comment on this thread here:

http://languagehat.com/pronouncing-an-igbo-name/

-and various responses thereto, including mine).

And in answer to my own question (what do the linguistically most Europeanized countries in Africa otherwise have in common?), I cannot help but note that Angola, Gabon, the Congo and Sao Tome e Principe all seem to have one thing in common: (European-origin) Christianity seems much more widespread there than in most African countries. Making me wonder whether this spread of a European religion may have paved the way (in terms of social acceptability) for the spread of European languages...
David Eddyshaw says

August 7, 2020 at 6:38 pm

The south of Ghana is pretty solidly Christian (to a quite startling degree to someone freshly arrived from Europe), but Twi/Fante is equally solidly established as the lingua franca of the south (it used to annoy some of my northern Ghanaian co-workers that southerners with no experience of the north would just assume that if you were Ghanaian, you would understand Twi.)

Northern Ghana, Burkina and northern Togo (at least when I lived there) were both highly polyglot and largely still the domain of traditional African worldviews, which I suppose is a sort of negative support for your thesis. On the other hand, there are plenty of Christians in countries where Swahili is undoubtedly the major lingua franca (despite the strongly Islamic associations of Swahili itself.)

Correlation is not causation of course; and it’s notable that Christianity has from the beginning been characteristically an urban religion (as indeed has Islam, for all the romantic desert myths.)

Some of this might be to do, also, with the rather different nature of Portuguese colonialism; the Portuguese got there early and they had few hangups about marrying local women and raising families together, compared with the later European colonials.
ktschwarz says

May 10, 2023 at 9:41 pm

On July 21, 2020, I copy-pasted from Wikipedia’s Ou (ligature):

… in the orthography of the Wyandot language and of Algonquian languages e.g. Mohawk and Western Abenaki …

and somehow overlooked that bit of stupidity: duh, Mohawk isn’t an Algonquian language! Somebody has since deleted “Mohawk” from the sentence.

But wait, there were French Jesuits there, couldn’t Mohawk also be one of the languages where they used the ȣ ligature early on? Yes, it was: Google turned up a 2020 dissertation on 360º Video and Language Documentation: Towards a Corpus of Kanien’kéha (Mohawk) with a short survey of the history of written Mohawk. French missionaries in the 1600s did use ȣ and other Greek letters such as χ and θ for un-French sounds, but “as printers rarely had enough Greek characters on hand” those letters were abandoned and the ȣ modified to 8, which remained in use into the 19th century when it was finally replaced by w. There were also competing orthographies devised by Anglican missionaries, but they were less consistent and less popular (and there were some written records by early Dutch colonists too, but they were ad hoc with no systematic spelling). It was the French-descended system that was the basis of the 1993 spelling standardization, with the addition of the apostrophe for glottal stop and diacritics for tone and vowel lengths.
David Marjanović says

May 11, 2023 at 4:13 pm

Me on July 19th, 2020:

All written languages ignore many suprasegmentals

…and then there’s German, which is happy to mark vowel length three times in the same syllable.

The very next comment was “For example?”, and I never answered.

An extreme example is Vieh “livestock”. That’s /fiː/, and the length is marked not three, but four times: 1) the word is an open stressed syllable; 2) it’s a monosyllable ending in less than two consonants; 3) h; 4) ie.

In this case, the h happens to be etymological (not every “mute h” is), but the e is just made up. The cognate in my dialect (meaning “animal”, countable) is /fɪx/. The English cognate is fee.

The reason for the e may be that the sequence ih only occurs in personal pronouns… in all of which the h is unetymological.
