Emojis and Unicode.

Michael Erard (of whom LH has long been a fan, and whose first appearance here in 2003 also concerned Unicode) has written a typically well-informed piece for the New York Times Magazine, “How the Appetite for Emojis Complicates the Effort to Standardize the World’s Alphabets.” He leads off with a timely reference to an obscure Rohingya alphabet that will soon be usable on computers or smartphones thanks to “a 26-year-old international industrial standard for text data called the Unicode standard, which prescribes the digital letters, numbers and punctuation marks of more than 100 different writing systems”: “The Rohingya will be able to communicate online with one another, using their own alphabet.” (Though as he points out they have more pressing problems at the moment.) He goes on to describe the history of emojis and the culture clash that ensued when the two phenomena collided:

At Emojicon, resentment toward Unicode was simmering amid the emoji karaoke, emoji improv and talks on emoji linguistics. “Such a 1980s sci-fi villain name [Unicode Consortium],” one participant grumbled. “Who put them in charge?” A student from Rice University, Mark Bramhill, complained that the requirements for the yoga-pose emoji he had proposed were off-puttingly specific, almost as if they were meant to deter him. A general antiestablishment frustration seemed to be directed at the ruling organization. One speaker, Latoya Peterson, the deputy editor of digital innovation for ESPN’s “The Undefeated,” urged people to submit proposals to Unicode for more diverse emojis. “We are the internet!” she said. “It is us!”

I have to confess I rolled my eyes, but I understand the reasons emoji-lovers want lots of emoji in Unicode; I thought Ken Whistler had a good take:

“Emoji has had a tendency to subtract attention from the other important things the consortium needs to be working on,” Ken Whistler says. He believes that Unicode was right to take responsibility for emoji, because it has the technical expertise to deal with character chaos (and has dealt with it before). But emoji is an unwanted distraction. “We can spend hours arguing for an emoji for chopsticks, and then have nobody in the room pay any attention to details for what’s required for Nepal, which the people in Nepal use to write their language. That’s my main concern: emoji eats the attention span both in the committee and for key people with other responsibilities.”

Anyway, it’s a good piece, and there’s a good discussion going on at the Log thanks to Victor Mair’s post.


  1. Matthew Roth says

    The commentary is interesting. I think the criticisms are worth listening to. For my part, it’s problematic that MS has an international keyboard layout that facilitates typing of European languages but lacks “œ,” which is required in French…

  2. In the interests of wedging every diacritic-letter combination used in Western Europe into a single 96-character space, the most that could be readily supported at the time, the French standardization agency AFNOR agreed to live without œ, Œ, and Ÿ, on the grounds that they were not strictly necessary. In general, there are few or no minimal pairs between oe and œ, and ÿ is only rarely capitalized. The “international” keyboard provides just the characters available in this space.

    Unicode of course incorporates all three.

  3. I personally think Ysaÿe would look better with double diacritics: Ÿsaÿe. Metal!

  4. marie-lucie says

    MRoth: “œ,” which is required in French…

    Theoretically, yes, but many people use “oe” instead, especially if there is no “œ” on their keyboard.

    LH: Ysaÿe would look better with double diacritics: Ÿsaÿe.

    This spelling of a last name is a variant of Isaïe [i-za-i], the French name of the prophet known in English as Isaiah. The diacritic is needed in French to show that the i is pronounced separately, since Isaie would be only two syllables, [i-zE] (here I am using “E” for the mid low vowel). LH’s suggestion might look prettier, but the diacritic on the first y would be superfluous and would only confuse the readers as to what the pronunciation should be.

  5. There used to be a restaurant in my neighborhood in Washington, DC, named Yanÿu. I have no idea what they intended by the dieresis.

  6. Eli Nelson says

    @Marie-Lucie: From what I understand, there are some words in French where “ï” represents /j/ before a non-schwa vowel, as in “aïeul” /ajœl/. Maybe this is just a special case of the neutralization between /i/ and /j/ that occurs in some other contexts. Is it possible in the normal writing system for final “aïe” to represent /aj(ə)/ rather than /a.i/, or is that an exceptional correspondence that would not be expected to be possible for a name, like “Isaïe”? Wiktionary transcribes the interjection “aïe” as /a.i/ but then says it rhymes with “aille” and “haïe”. The CNRTL says

    Prononc. − 1. Forme phon. : [aj]. Fouché Prononc. 1959, p. 5 précise que la graph. ï se prononce [j] dans l’interj. aïe (à comparer avec la graph. ï = [i] dans aï4). Il note par ailleurs (p. 438, 440) que ,,dans le cas de deux mots non séparés par un silence, il n’y a pas de liaison lorsque le second (…) est (…) une des interjections : ah! aïe! (ou ahi), eh! oh! ouais! ouf!“.

    The CNRTL also indicates that the interjection to horses “haïe” could be pronounced as [aj].

    @Keith Ivey: “Yanÿu” looks like it could be derived from a Chinese word ending in 语/語, which as far as I know is pronounced in Mandarin as something like [y], [ɥy] or [jy]. The Pinyin transliteration of this syllable is just “yu”, but after “n” or “l” as an onset, the same vowel is written as “ü”, with an umlaut to mark the front quality (officially–in entry systems, I believe “v” is often used since it doesn’t have any other use in Pinyin). https://chinesepod.com/tools/pronunciation/section/15

    So the person writing “Yanÿu” might have been thinking of the use of “ü” in syllables like “nü”, and misused it on the “y” in this case. Apparently there is a Chinese word “yanyu” meaning “proverb”. The first syllable of the name could instead or also be a reference to the chef Jessie Yan.

  7. David Marjanović says

    Probably slipped from Yanyü, which would be Pinyin except for the rule to omit the dots in yu and most other syllables that contain this vowel because there’s no danger of confusion.

  8. Only lü, nü actually require the umlaut mark, as they directly contrast with lu, nu.

  9. In general, there are few or no minimal pairs between oe and œ, and ÿ is only rarely capitalized. The “international” keyboard provides just the characters available in this space.

    The corresponding Windows character encoding supports all three characters. It seems to be a mistake on the part of the author of the keyboard layout.

  10. Windows-1252 is a superset of Latin-1; I can only suppose that the “international” keyboard predates Windows-1252. The keyboard does not provide things like the rounded quotes or the § sign either.

  11. David Marjanović says

    The § sign, used more often in German than in English, is on the German keyboard layout (Shift+3).

    Only lü, nü

    Yes, I was too lazy to explain this: lü, nü contrast with lu, nu; lüe, nüe are also spelled with ü despite only contrasting with luo, nuo. That makes four syllables in total, not counting tones, that are spelled with ü. The others with that sound are the ones with a palatal (if any) consonant: yu, ju, qu, xu plus -e or -n or -an.
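
The Pinyin spelling rule described in the last few comments can be sketched as a small function. This is only an illustration of the rule as stated in the thread (ü is written after l and n; plain u after j, q, x and in yu-syllables); the function name `spell_y_syllable` and the way syllables are split into an initial and an ending are assumptions made for the example, and tones are ignored.

```python
# Endings that can follow the front rounded vowel, per the thread:
# bare vowel, or the vowel plus -e, -n, -an.
ENDINGS = {"", "e", "n", "an"}


def spell_y_syllable(initial: str, ending: str = "") -> str:
    """Spell a toneless Pinyin syllable containing the vowel written ü/u.

    `initial` is "l", "n", "j", "q", "x", or "" (no consonant).
    """
    if ending not in ENDINGS:
        raise ValueError(f"unexpected ending: {ending!r}")
    if initial in ("l", "n"):
        # Only lü, nü, lüe, nüe exist; the dots are kept here because
        # lu and nu are separate syllables.
        if ending not in ("", "e"):
            raise ValueError(f"no such syllable: {initial}ü{ending}")
        return initial + "ü" + ending
    if initial in ("j", "q", "x"):
        # After these consonants there is no contrast, so the dots are dropped.
        return initial + "u" + ending
    if initial == "":
        # With no consonant, the syllable is written with y: yu, yue, yun, yuan.
        return "yu" + ending
    raise ValueError(f"the vowel does not occur after {initial!r}")


for args in [("n",), ("l", "e"), ("",), ("x", "an")]:
    print(args, "->", spell_y_syllable(*args))
```

Under this sketch, “Yanyü” would never be produced: the second syllable has no consonant initial, so it comes out as “yu”.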

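The earlier point about œ, Œ and Ÿ can be checked in the same way. The following is a minimal sketch using Python's built-in `latin-1` and `cp1252` codecs (the latter being Python's name for Windows-1252); the helper name `encoding_support` is made up for the example. It only shows what the encodings can represent and says nothing about what any particular keyboard layout offers.

```python
# The three characters the thread says were left out of the 96-character space.
CHARS = ["œ", "Œ", "Ÿ"]


def encoding_support(ch: str) -> dict:
    """Return the Unicode code point of `ch` and its byte (as hex) in
    Latin-1 and Windows-1252, or None where the encoding lacks it."""
    result = {"codepoint": f"U+{ord(ch):04X}"}
    for name in ("latin-1", "cp1252"):
        try:
            result[name] = ch.encode(name).hex()
        except UnicodeEncodeError:
            # The character has no slot in this encoding.
            result[name] = None
    return result


for ch in CHARS:
    print(ch, encoding_support(ch))
```

All three characters fail to encode in Latin-1 but succeed in Windows-1252, which matches the comments above: Windows-1252 extends Latin-1, and Unicode includes all three.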

