Zhou Youguang, RIP.

Zhou Youguang, the inventor of the pinyin system of writing Chinese, has died at 111 — a remarkable age in any event, but especially so for someone born in his time and place. I hadn’t known about him, but he led quite a life; Margalit Fox has a fine obit at the NY Times:

[…] It is to Pinyin that we owe now-ubiquitous spellings like Beijing, which supplanted the earlier Peking; Chongqing, which replaced Chungking; Mao Zedong instead of Mao Tse-tung; and thousands of others. The system was adopted by the International Organization for Standardization in 1982 and by the United Nations in 1986.

Yet for all Mr. Zhou’s linguistic influence, his late-life political opposition — in 2015, the news agency Agence France-Presse called him “probably China’s oldest dissenter” — ensured that he remained relatively obscure in his own country.

“Within China, he remains largely uncelebrated,” The New York Times wrote in 2012. “As the state-run China Daily newspaper remarked in 2009, he should be a household name but is virtually unknown.”

It took Mr. Zhou and his colleagues three years to develop Pinyin, but the most striking thing about his involvement was that he was neither a linguist nor a lexicographer but an economist, recently returned to China from Wall Street. […]

And Victor Mair has a touching post at the Log:

Zhou xiansheng,

You were my dear friend for decades. I wish that you had gone on living forever. You will be sorely missed, but yours was a life well lived. […]


  1. January First-of-May says:

    According to Wikipedia, on his 111th birthday, he was the world’s 7th oldest known living man.
    According to the Gerontology Research Group, there are still no known supercentenarians (110 years or older) from China ever, whether living or dead (and of either gender). I’d expect Zhou to be the first though.

    I wanted to make a funny blog post about something something eleventy-first. But I forgot to check on the 13th, and wasn’t really sure if he was still alive. Certainly didn’t expect to find out that way.


  2. Reading his obituaries, I discovered that pinyin predates the Sino-Soviet split. Wouldn’t that then mean it’s a complete myth that the Chinese, left with almost no friends in the world, had to turn to the Albanians for help with a new alphabet, and that’s why it abounds in the relatively weird letters like q and x?

  3. Yes, it would.

  4. David Marjanović says:

    Q is at least pronounced similarly in Mandarin and northern Albanian (southern: [kʲ]), but x isn’t by any stretch (Albanian: [dz]). Clearly the motivation was first of all “we’ve got sounds without a letter, and we’ve got letters without a sound, so let’s put them together”, all else being details.

  5. J.W. Brewer says:

    Of course Q stood for theta in the primitive system often used for typing Classical Greek back when the Internet was still in its Bronze-Age phase and couldn’t handle non-Latin characters. That’s arguably a bit more extreme than the hanyu pinyin use of it.

  6. J.W. Brewer says:

    NB that word-initial X was used in some of the 17th-century romanizations of Chinese devised by Jesuits whose L1’s might have been Spanish or Portuguese, but for a slightly different consonant than hanyu pinyin uses it for, e.g. “Xantung” for the region more modernly romanized as Shantung or Shandong. Whether Mr. Zhou/Chou/etc. and his colleagues were mindful of that precedent in devising their own system is unknown to me.

  7. Not only Greek; blåbærgrød came out as BL@B#RGRQD not too many years ago. Before it turned into bl}b{rgr|d.

    (That was on Univac 1100 mainframes with 6-bit FIELDATA. But even now I see similar tricks on receipts, presumably because people can’t figure out how to access non-default input character sets on their cheap Chinese cash registers. The ISO 646 variant encoding seems to have died a natural death though).
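
As a small illustration of the ISO 646 trick mentioned above: the bl}b{rgr|d spelling is what you get when text written in the Danish national variant is displayed as plain ASCII, because that variant reuses the code positions of { | } for æ ø å. The sketch below only covers those three lowercase letters, read off the example in the comment; the function and table names are made up for illustration.

```python
# Assumed mapping, taken from the example in the comment above:
# the Danish ISO 646 variant reuses the ASCII positions of { | } for æ ø å.
ISO646_DK_LOWER = str.maketrans({"{": "æ", "|": "ø", "}": "å"})


def decode_iso646_dk(text: str) -> str:
    """Reinterpret ASCII braces and bar as the Danish letters they stood for."""
    return text.translate(ISO646_DK_LOWER)


print(decode_iso646_dk("bl}b{rgr|d"))  # prints: blåbærgrød
```

Real braces and bars in the input would of course be converted too, which is exactly the ambiguity that made these variants awkward to live with.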

  8. January First-of-May says:

    Of course the normal letter Q does exist in both Classical Greek (where it usually stands for Corinth, and in more archaic versions also for 90) and Scandinavian (notably in the last name of Vidkun Quisling). It’s pretty rare in both though, so it’s not that much of a confusion to have it stand for something that appears more commonly.

    Coin Community, a large numismatic forum, used to allow many non-Latin characters (a few, notably the Russian and Greek letters Г, did show up weirdly – I could never figure out a pattern), but a few years ago a new software update had apparently broken that entirely (even on older threads). So yes, they do end up using Q for thetas in Greek coin legends (even – perhaps especially – where it means 9; can’t recall any case when they had to post something with a 90 in it, though statistically at some point it must have happened).

  9. Stephen C. Carlson says:

    Of course Q stood for theta in the primitive system often used for typing Classical Greek back when the Internet was still in its Bronze-Age phase and couldn’t handle non-Latin characters.
    Ah, yes, Beta Code! ABGDEZHQIKLMNCOPRSTUFXYW. The letters Q (θ), C (ξ), and Y (ψ) are the oddest. W actually resembles the lower-case ω.

  10. the primitive system often used for typing Classical Greek back when the Internet was still in its Bronze-Age phase
    That system is still in use, e.g. for searches in the online Liddell-Scott.
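
For anyone who wants to play with the letter correspondence quoted above, here is a minimal sketch that pairs the 24 Latin letters of the string ABGDEZHQIKLMNCOPRSTUFXYW with the Greek alphabet in order. It is not a full Beta Code implementation: accents, breathings, capitals, and final sigma are all ignored, and the names used are hypothetical.

```python
# The Latin string is the one quoted in the thread; the Greek string is
# simply the 24-letter alphabet in its usual order.
BETA_LATIN = "ABGDEZHQIKLMNCOPRSTUFXYW"
GREEK_LOWER = "αβγδεζηθικλμνξοπρστυφχψω"
BETA_TO_GREEK = str.maketrans(BETA_LATIN, GREEK_LOWER)


def beta_to_greek(text: str) -> str:
    """Map bare Beta Code letters to lowercase Greek, one letter at a time.

    Simplification: no diacritics and no final-sigma handling, so a
    word-final S comes out as σ rather than ς.
    """
    return text.upper().translate(BETA_TO_GREEK)


print(beta_to_greek("QEOS"))  # prints: θεοσ
print(beta_to_greek("YUXH"))  # prints: ψυχη
```

The oddities the commenter points out are visible directly in the table: Q lands on θ, C on ξ, and Y on ψ, while W is chosen for its resemblance to ω.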

  11. What I like about the Chinese is that they have improved our alphabet in a similar way that the Greeks improved the alphabet when they took it over from the Phoenicians.

    Several Phoenician consonants were not present in Greek, so the Greeks took over the letters representing those consonants and used them to represent vowels: A E I O Y.

    Several letters of our alphabet represent sounds that do not exist in Zhuang, a Tai-Kadai language of China (https://en.wikipedia.org/wiki/Standard_Zhuang). Instead these letters (Z J Q H X) are used for tones.

    So, just as the Greeks improved the alphabet (or the abjad) by using letters to represent vowels, the Chinese have created a more complete alphabet by using letters to represent tones as well.

  12. A number of Chinese linguists are now protesting the term “inventor”, saying that Zhou was only one of the committee and not the most important one (though he was the representative to the ISO committee that made pinyin an international standard). They also say that because he is the last survivor, and a very old one, he got far more popular credit than he actually deserved. I reproduce a remark by one of them, but mostly for its English-language interest, as it shows what can happen if you use ‘s as a direct translation of the Chinese relative/possessive particle de:

    You might have your own good reasons for respecting Zhou that much and happen to be happy to call him “the father” too, but please don’t mix up your (and perhaps more foreign linguists’) understanding with medias (both in and outside of China, oh yes these reports are everywhere in China too these days, but they’re all the same, claiming he’s the father because he created Pinyin or contributed so much in the creating process but didn’t want to receive that much credit)’s superficial gimmick, and the latter is what I’m against.

    Which beats all hollow the longest clitic separation by a native speaker I know of, namely “That-there umbrella is the young lady I go with’s.”

  13. That is truly remarkable.

  14. Is that the guy that sits in front of my girlfriend in Dr. Gnarlflup’s European history class on Tuesday afternoon’s book?

  15. Sure, it’s easy for people like us to make them up. The point about my two examples is that they are spontaneous.

  16. My favorite languages with recycled Latin letters are Hmoob and Natqgu.

  17. January First-of-May says:

    Wow, Natqgu is really weird. I’ve found a few texts in it, and they pretty much look like rot-13 (except with even less vowels).

  18. It’s apparently Natügu in a less bizarre system of spelling.

  19. Back when I was a National Spelling Bee contestant ~1960, somebody sent me a spelling-reform pamphlet illustrating his “Fonetik Crthqgrafi.” I disliked it, not so much because letters were drastically repurposed, as because the pamphlet came from Chicago and reflected the COT-CAUGHT merger, which I didn’t have.

    I recall from the following decade a proposed romanization of Somali that used Z for the glottal stop. I don’t think it caught on.

  20. c, q, r, x, z are the new letters for o̱, ü, ö, ä, ë (/ɔ/, /ʉ/, /ɵ/, /æ/, /ə/). While umlauts are familiar to Westerners, they were considered less intuitive by the local community.
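
The five-letter correspondence in the comment above is enough to respell a word from the recycled-letter orthography into the umlaut-style one. The following is a rough sketch under the assumption that c, q, r, x, z always stand for those five vowels in lowercase text; the names are invented for the example and capital letters are left alone.

```python
# Mapping as given in the comment: c, q, r, x, z -> o̱, ü, ö, ä, ë.
# Escapes are used so the combining macron below (U+0331) is unambiguous.
RECYCLED_TO_UMLAUT = str.maketrans({
    "c": "o\u0331",  # o̱
    "q": "\u00fc",   # ü
    "r": "\u00f6",   # ö
    "x": "\u00e4",   # ä
    "z": "\u00eb",   # ë
})


def respell(word: str) -> str:
    """Replace each recycled consonant letter with its vowel equivalent."""
    return word.translate(RECYCLED_TO_UMLAUT)


print(respell("Natqgu"))  # prints: Natügu
```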

  21. @Rodger C: Isn’t Chicago unmerged? The choice of stressed vowel in the city’s name can be a major point of contention for them.

    I had a similar experience as a teenager in the 2000s, though. Me and some other regulars on a language forum got into a feud with a very kooky spelling reformer who was convinced, among other things, that hardly any English speakers distinguished “merry” and “Murray”. (By total coincidence, he was from the Philadelphia area.) He also ran a one-man political party advocating for the gradual annexation of the entire world by the United States.

  22. @Lazar: Interesting. I also grew up merging “merry” and “Murray” in the Ohio Valley of West Virginia, but the school system and the media made sure I learned “better.” (Did this fellow not have a television? Oh well, you can’t hear what you don’t produce.)

  23. advocating for the gradual annexation of the entire world by the United States
    And here I always thought that that has already happened… 😉

  24. Greg Pandatshang says:

    An oldschool “Superfans” Chicago accent probably does participate in the cot-caught merger. I’m not sure; I’ve hardly ever met anyone who has that accent. As far as Chicago-inflected GA is concerned, Chicago is a bastion of resistance to the cot-caught merger. It’s true that there is variability in the realisation of the vowel written “a” in “Chicago”, but in my experience that’s lexical, i.e. specific to that word, and doesn’t necessarily reflect anything about a speaker’s handling of the same vowels in other words.

    Incidentally (and I fear I might have complained about this on LH before), I have always been baffled by linguists’ descriptions of the Northern Cities Vowel Shift, which we Chicagoans supposedly participate in. What they describe always sounds to me like somebody doing a Wisconsin accent.

  25. very kooky spelling reformer

    I know (slightly) a fellow who thinks there is no point in discriminating between those and doze. By total coincidence, he is an old fart from Brooklyn. He seems to be okay with keeping thin and tin separate, though.

    gradual annexation of the entire world by the United States

    I have more or less the converse plan: allocate 200-odd new votes in the Electoral College, to be divided among all the nations of the world who are willing to hold free and fair internationally monitored popular elections to determine how to cast them. Why should non-Americans get no say in who the Leader of the World is to be?

    Northern Cities Vowel Shift, which we Chicagoans supposedly participate in

    Labov said the delightful thing about studying the NCVS is that the participants don’t know they have it and it is not stigmatized. Consequently, they don’t try to reduce or eliminate it when you interview them, unlike Southerners or AAVE speakers or New Yorkers. But it was the sound clip of the Chicago cop saying “make a block-to-block search” and sounding to other Americans like “black-to-black search” that brought the NCVS to national attention.

  26. Greg Pandatshang says:

    Ah, that explains it. Given the recent DoJ report, I’m pretty sure he actually was saying “make a black-to-black search”.

  27. It’s true that there is variability in the realisation of the vowel written “a” in “Chicago”, but in my experience that’s lexical, i.e. specific to that word, and doesn’t necessarily reflect anything about a speaker’s handling of the same vowels in other words.

    What I meant was, the traditional “dispute” over how to pronounce the city’s name seems to presuppose a cot-caught distinction.

  28. Greg Pandatshang says:

    Good point. Theoretically, it could be a scenario where the caught-cot merger is ubiquitous, but some speakers only have /ɔ/ and others only have /ɑ/ in the same words, and that would be the basis of the dispute. Doesn’t sound very plausible – does any American with the cot-caught merger merge them to /ɔ/?

    On the other hand, I think I’ve noticed a popular misconception among /ɔ/-sayers that cot-caught is driving the dispute, i.e. that people who say /ɑ/ in Chicago will also have /ɑ/ in all positions. This is plausible on its face, but not accurate in my experience.

  29. David Marjanović says:

    Not rather a LOT-PALM distinction?

  30. Greg Pandatshang says:

    For GA speakers, around here anyway, the vowel in lot is the same as in cot, and the vowel in palm is the same as in caught. So, that would be the same distinction. I have “palm” as /pɔɫm/ (when I was a kid, I noticed that the dictionary said the “l” should be silent in -alm words, but that sounds super weird in my dialect).

  31. For me, the CAUGHT-COT merger exists only some of the time. I had the full merger when I was very young, but now whether the merger occurs or not seems to be essentially random. It’s not tied to specific words or even contexts.

  32. J.W. Brewer says:

    My LOT and PALM aren’t merged, but they’re closer to each other than my LOT and my CLOTH. I think my “Chicago” has PALM. (I lived there for a few years in my twenties; probably not long enough, given the point in life I was at, to affect my idiolect.)

  33. David Marjanović says:

    the vowel in palm is the same as in caught

    Oh wow.

  34. Speaking of Chicago, in Croatian, the name of the city used to be pronounced tʃikago (i.e. CH as in chicken). In more recent times, the younger folk appear to have ‘corrected’ it to ʃikago (CH as in champagne).

    Does this happen in any other language?
    What is the origin of the strange pronunciation of CH? Is it French?

  35. J.W. Brewer says:

    The “ch” in the spelling of Chicago is indeed French, i.e. dates back to a late 17th century attempt by French explorers to write down a toponym from some relevant local Algonquin language. The via-French origin, I guess, is why it is ʃ rather than tʃ in AmEng for that particular toponym. There are other US toponyms where “ch” comes out as ʃ rather than tʃ for similar reasons, i.e. Francophones had already interacted with the relevant indigenes and tried to write some names down before Anglophones got to that part of the continent. Cheyenne is one that comes to mind more or less at random.

  36. I was reading an article yesterday in a newspaper here written in the Mongolian traditional script about Obama’s speech in Chicago. They spelt it ᠴᠢᠺᠠᠭᠣ᠋ (chikago, with ᠴ tʃ rather than ᠱ ʃ). This seems to be from the Cyrillic Чикаго (chikago), and since Cyrillic Mongolian got all its place names from Russian, then it must be Чикаго in Russian, too — confirmed from Wikipedia. On the other hand, one of my Inner Mongolian dictionaries also gives ᠴᠢᠺᠠᠭᠣ᠋ chikago, whereas Chinese prefers 芝加哥 zhījiāgē. Either they’ve been misled by the English spelling, or they’ve got the pronunciation direct from Mongolia. In any case, they haven’t copied the Chinese.

  37. @Greg Pandatshang: Generally not to [ɔ], but cot=caught=[ɒː] is found here in Eastern New England (distinguished from [aː] in “father”), and variably among people in other merged parts of North America (especially Canada).

