A Boing Boing post describes the “curious censorship and intellectual property details of Microsoft’s new blogging tool MSN Spaces” as they affect Chinese sites; I was particularly struck by this (from Weizhong Yang in Taipei, Taiwan):
We found that the Traditional Chinese MSN Spaces censored words such as oral sex, anal sex and so on, by the way, they censored two important and common used words which make us feel unbelievable.
One is a word pronounced as cao which means fucking sometimes, however, it also means operating, handling, exercising or practicing, and there was a famous king/hero/tyrant in about the second century called Cao Cao. Therefore you cannot set certain derivations of that word (for instance Cao Cao and Yang Xiu, which is a famous traditional Chinese drama play) as the title of your MSN Space.
I can attest that Cao Cao (traditional transliteration Tsao Tsao) is an extremely famous figure in Chinese history, and it’s absurd that his name is censored because of homonymy! (Thanks to Songdog for the link.)
I wonder if Chinese writing is maybe unable to merely “suggest” a word without actually telling you what it is.
English newspapers, for example, can print obscenities by leaving out letters: “F___” or “s___”; you know it’s dirty and can skip it if you wish. Chinese has two options: “X”, the equivalent of “@%?*!”, where readers must supply their own filth with no clue as to which kind; and a substitute homophone, which usually makes no sense in context unless the reader decodes it, basically turning it into an obscenity as well.
The filtering of the homophone is interesting because it runs counter to the popular myth that the written language is “true Chinese”, with the spoken language merely the pronunciation of the characters. 操 (as opposed to 肏) ought to be harmless. But my registration for Gymnastics World (体操天地) fails.
The blog comments also show MSN Spaces blocking “David” because it contains “av”.
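The behaviour reported here (“David” rejected for containing “av”, 体操天地 rejected for containing 操) is what plain substring matching would produce. The following is only a minimal sketch of that idea, not Microsoft's actual filter: the blocklist, the case folding, and the function names are all assumptions made for illustration.

```python
# Hypothetical blocklist; the real MSN Spaces list is not public.
BLOCKED_SUBSTRINGS = ["av", "操"]


def offending_terms(title: str) -> list[str]:
    """Return every blocked term that occurs anywhere inside the title."""
    lowered = title.lower()  # assumed: matching ignores letter case
    return [term for term in BLOCKED_SUBSTRINGS if term in lowered]


def naive_filter(title: str) -> bool:
    """Return True if plain substring matching would reject the title."""
    return bool(offending_terms(title))


for title in ["David", "体操天地", "Language Hat"]:
    print(title, "blocked" if naive_filter(title) else "allowed")
```

A filter of this kind has no notion of where a word begins or ends, which would explain why an innocent name or compound gets caught along with the intended target.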
What’s wrong with “av”?
I like your sites, by the way!
language hat, “av” is an acronym for “adult video”
Alas for my innocence!
lh: glad you like them. Good thing I decided to go back and check up on those half-remembered phonology terms last night; should be correct, now.
[LH, I read your message; thank you for the kind words. Now I know how to call you in Chinese!]
Zhwj, as you rightly point out, this is fascinating on many levels and gives a lot to think about regarding Chinese Sprachgefühl. However, the problem with cao1 操 is not merely a question of homophony (something Weizhong Yang himself notes in his message to Boing Boing). Not only does it often stand for cao4 肏 in the Mainland (typically in American movie subtitles, to translate the F-word), as if it were some kind of “simplified” form of it, but it also has its own insulting meaning, listed 10th in the Hanyu Da Cidian (CD-ROM ed. 2.0) entry, which merely indicates “slur” (lici 詈詞) and gives the following example, from writer Ye Zi 葉紫 (1912-1939):
“I cao his ancestors to the eighth hundredth generation (Wo cao1 ta babai dai zuzong 我操他八百代祖宗)!”
Close enough to the example given, in the same dictionary for the “sexual” cao, from a play by the great Lao She 老舍 (1899-1966):
“I fuck his ancestors to the eighth generation (Wo cao4 tade ba bei zuzong 我肏他的八輩祖宗)!”
People do giggle at cao1 the way they do with gan4 幹 (“to do”, in both neutral and sexual meaning), which seems to have the same shock-value as “Fuck”, although it is not inherently vulgar: I remember the way a Taiwanese paper’s “entertainment news” (yule xinwen 娛樂新聞) section emphasized how the repetition of the “gan character” on Zhang Zhenyue’s latest album barred most songs from being played on the radio (those were rap songs, but he also sings inoffensive ballads which do get successful air time).
I wonder if “gaochao” 高潮 (high tide/climax) would be blocked, too.
And I won’t go into detailing how confused my exclusively Mainland-experienced self was to see the ‘comrade literature’ (同志文學) shelves in Taiwanese bookshops (they use the old meaning of tongzhi as ‘homosexual’). I won’t, but I get the feeling that a large part of the discussion has to do with the not-so-insignificant differences between Taiwan ‘guoyu’ and Mainland ‘putonghua’.
Now, since the ‘specialized’, narrow-single-meaning cao4 character (penetrate+flesh) appears in the Honglou meng and the Jin Ping Mei, a good test would be to see if it has been replaced in the jiantizi (‘simplified’) editions of both novels. Current dictionaries like the Xiandai Hanyu Cidian and the (encyclopaedic) Cihai are too puritan to give any hint: no effword-cao4 (not listed either in the simplified Chinese IME editor) and no slur meaning for handle-cao.
Finally, what Zhwj’s rejected attempt (Gymnastics World 体操天地) shows, as do other hilarious examples in Weizhong Yang/Zonble’s original post in Chinese, is how this brings us back to the old problem of defining what a Chinese “word” is (how many words are there in “ti cao tian di”? One, four, or in between?). Though equally robotically one-sided, the Chinese blocking system does seem slightly more coherent than the alphabetic, English one, which, for instance, wouldn’t accept a Japanese URL with the name “yamashita” because of the “shit” segment (that was about a “children protection” filter, not MSN Spaces). I am not a linguist, though, so this may be very banal in fact.
[By the way, is there a way to distinguish zidian 字典 from cidian 詞典 other than “character dictionary” and “word dictionary” in English translation?]
Cao Cao has usually been thought of as a bad guy. But avoiding the name of the deceased in Chinese culture shows very high respect, so actually Cao’s shade is benefiting from this taboo.
Bad guy, maybe (at least Cao Cao the Three Kingdoms novel character). Great poet, certainly.
About AV: as far as I know, it is from Taiwan, via Japan. I first saw it in the kind of “entertainment” articles I referred to earlier, which do not refrain from revealing (or suggesting) that such-and-such an actress or model started her career by shooting “AV films” (AV片).
Jimmy Ho: Microsoft seems to be doing work (or at least taking advantage of work done) in the area of word-boundary identification. The XP versions of Office and the MS Pinyin IME allow a context menu to be pulled up at any character, offering suggestions first for words and compounds, and then for the character itself. It’s not perfect, but they ought to be able to use that technology with their online filter. They obviously do it to some extent – the blog post you linked to lists blocked compounds where the individual characters would pass, and they block Mao Zedong 毛泽东 but not Chairman Mao 毛主席.
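To make the word-boundary idea concrete, here is a rough sketch of a filter that segments the text first and only then compares whole tokens against a blocklist. It uses greedy forward maximum matching against a toy lexicon, a common textbook technique; nothing here is known to be what Microsoft's IME or filter does, and the lexicon, the blocklist, and the function names are all invented for illustration.

```python
# Toy lexicon and blocklist, both hypothetical.
LEXICON = {"体操", "天地", "毛泽东", "毛主席", "主席"}
BLOCKED_WORDS = {"操", "毛泽东"}
MAX_WORD_LEN = max(len(word) for word in LEXICON)


def segment(text: str) -> list[str]:
    """Split text by greedy forward maximum matching against LEXICON.

    At each position the longest lexicon entry is taken; if none matches,
    the single character becomes a token of its own.
    """
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in LEXICON:
                tokens.append(candidate)
                i += size
                break
    return tokens


def word_filter(text: str) -> bool:
    """Return True only if a whole token is on the blocklist."""
    return any(token in BLOCKED_WORDS for token in segment(text))


for title in ["体操天地", "毛泽东", "毛主席", "操"]:
    print(title, segment(title), "blocked" if word_filter(title) else "allowed")
```

With this toy lexicon, 体操天地 is split into 体操 and 天地 and passes, while a bare 操 is still caught. The result depends entirely on the lexicon, so a real system would face the same question raised above about what counts as a Chinese “word”.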
I think it means something very significant that porn words now have the Living Emperor-name taboo extended to them, but I don’t know what.
Jimmy — absolutely a great poet, and not a bad emperor, probably, either.
I’ve done research on the origins of the shih style of poetry in the Han and San Kuo Wei dynasties, and they are really rather dubious from a nice Confucian point of view. But Confucians have this talent for ignoring things that don’t fit into their master story. So shih poetry is perhaps the crowning point of Chinese culture, even though the first shih poets were often brigands and interlopers. (Cao Zhi in particular has been equipped with a misleading biography).
AV is indeed from Japan. Love those AVギャル(ahem!)
Note on 同志: the use of 同志 to mean ‘homosexual’ is quite well known on the Mainland.
shuō cáo cāo, cáo cāo dào
Speak of the devil.
Cao Cao’s Chinese opera mask is here: Mask
He was a stereotype villain in the Three Kingdoms novel and in plays and operas.
A quick note that might interest those who participated in this thread (I still have to respond to zhwj and Bathrobe, but that’ll come later, hopefully): Robert H. van Gulik (高羅佩)’s handwritten Erotic Colour Prints of the Ming Period 秘戲圖考 (Tokyo, 1951) contains a pretty detailed Appendix on the “Chinese Terminology of Sex” (pp. 229-234) with the following paragraph:
Could anyone explain why Van Gulik transcribes 肏 as ri instead of cao?
(End of note.)