Old Sinitic Reconstructions and Tibeto-Burman.

A guest post by Tsu-Lin Mei at the Log describes work he has been doing on Tibeto-Burman cognates and Old Chinese:

The work is quite interesting. It involves the internal history of the Tibetan language, internal history of Tibeto-Burman, etc. Some of these areas have been covered by James Matisoff, and others are terra incognita. My young colleague Jackson Sun at Academia Sinica has been working on comparative Tibetan dialects as well as Proto-rGyalrong 嘉戎. I am making up the phonological history as I go along. The final product will be 100 cognate sets, supported both by philological evidence and by evidence from living Tibeto-Burman languages. […]

Why am I doing this? While there are many books on Sino-Tibetan comparative linguistics, there is no succinct account of the reasons why we believe Sino-Tibetan-Burman are genetically related. The answer is quite simple. (1) There are between 140 and 300 cognate sets involving Old Chinese and Tibeto-Burman languages. (2) Sino-Tibetan has a causative *s- and a nominalizing *-s. Both (1) and (2) have been in the literature since 2000, but nobody took the trouble to give a short, easy-to-understand account. This is what I am doing and I am writing in Chinese.

I have always thought that in Chinese lexicography there should be a section which tells the reader which Chinese words have Tibeto-Burman cognates and which do not. The American Heritage Dictionary does that for English; for every English word, the Dictionary gives its Indo-European root, if any. We should be able to do that for Chinese.

That’s exciting stuff to me, and I presume to anyone interested in historical linguistics. (We discussed Rgyalrongic languages here a bit last year.)

For anyone who reads Russian, by the way, Sashura has a very interesting interview in Ogonyok with Dmitry Bobyshev on the occasion of Bobyshev’s eightieth birthday; I translated a poem of his five years ago.


  1. It can’t be five years! Sounds like yesterday (re Bobyshev’s translation)
    thanks for the mention

  2. It IS a very interesting interview. And it mentions the poem translated by languagehat in the context of Bobyshev’s arrival in the US. “Just think, there are places where beasts can live life simply, without any forethought or strife.”

    I was actually there in Urbana in the late 80’s as one of the “children of third-wave émigrés whose parents tried to Americanize them as quickly as possible, but the kids decided to remember their Russian roots. And, while they were at it, to get an ‘easy’ credit.” He and his wife would take the Slavic department grad students out for beers once in a while and I got to tag along. I had no idea who he was at the time.

  3. Greg Pandatshang says

    Blench and Post hypothesised that there is an especially close relationship between Chinese, Tibetan, and Burmese (and van Driem has argued the same just for Chinese and Tibetan). This makes it potentially confusing to use the terms Sino-Tibetan or Tibeto-Burman for a much wider range of languages. It’s good that Mei is working with a wider range of spoken languages. I wonder if the results will tend to confirm or refute Blench/Post/van Driem.

    P.S. Sometimes I wonder if perceived similarities between Chinese, Tibetan, and Burmese could be, rather than genetic, an areal superstrate effect from, perhaps, the Shang dynasty language, which Scott DeLancey argued was non-Sinitic.

  4. An interesting idea! I wonder how it could be evidenced?

  5. Jim (another one) says

    “the Shang dynasty language, which Scott DeLancey argued was non-Sinitic.”

    Much later the original language of the Chu state was Hmongic. There are clearly Hmongic loans in Hubei dialects.

    Furthermore the Koreans and the Hmong have oddly similar origin myths, both claiming descent from Hou Yi. Korean origins are obscure, but that claim of ancestry may explain why Dong Yi was written the way it was – 东夷 – with a bow as the word for Yi.

    The Shang language, from what we know of it, does differ from Zhou and later Chinese, including in some non-trivial etyma such as body parts. It could have any number of connections. It could have been an isolate that died out.

  6. @Greg: In the case of Chinese and Burmese, two very innovative languages with highly eroded morphology, it is indeed not immediately obvious that their commonalities are to be explained by a genetic relationship rather than by contact if you restrict your attention to these languages.
    You need to look at the whole Sino-Tibetan/Trans-Himalayan family, in particular morphologically complex languages. There are various pieces of evidence from morphology showing that the resemblances between these languages are not due to contact, in particular:
    (a) The person indexation systems of Kiranti and Gyalrongic (see this article, this one and others cited therein).
    (b) The person pronoun system in Qiang and Chang Naga, which present the same suppletive paradigm, traces of which are found elsewhere.
    Languages that lost this morphology (like Lolo-Burmese and Chinese) can be shown to be related to the conservative languages by the fact that non-productive traces of that morphology remain, though in obscured form. In the case of Burmese, it is a bit easier, owing to the fact that Lolo-Burmese and Gyalrongic languages probably belong to a common branch of the family.

  7. @Jim (another one), I had not heard about Koreans claiming descent from Hou Yi 后羿, who in Korean reading would be Hu Ye 후예. I’m vaguely familiar with the myth, but did not remember the name.

    I looked up several sources in Korean about Hou Yi, but didn’t find anything about Koreans claiming descent from him. Most only referred to him as a figure from Chinese mythology.

    One page in Korean I did find (warning: nationalist pseudo-history) quotes the Classic of Mountains and Seas (山海經 Shan Hai Jing) as saying that Hou Yi was also known as Yi Yi 夷羿 and was the main god of the Dong Yi 東夷. Now, the name Dong Yi (conventionally translated as “eastern barbarians”) certainly was applied to proto-Koreans (as well as other groups) in the 3rd-century Records of the Three Kingdoms (三國志 San Guo Zhi), but it was applied fairly indiscriminately to all sorts of peoples throughout history, not just Koreans. Many Koreans however are only familiar with the term as a historical (and sometimes poetic) name for Koreans, so may mistakenly equate Dong Yi with Koreans whatever the context. The Dong Yi mentioned in the Classic of Mountains and Seas (versions of which date back to the 4th century BC) let alone during the Shang period definitely would not have been Koreans, however. Originally, the label seems to have been applied to some groups in the Shandong and Anhui regions in China.

    Although I haven’t found any sources that explicitly mention Koreans claiming descent from Hou Yi, I could envision a scenario where the Dong Yi are said to claim descent from Hou Yi, and the Dong Yi are interpreted as Koreans. It seems more likely to me that this is a case of confusion due to the use of the name Dong Yi rather than an actual independent Korean tradition of descent from Hou Yi.

    Academic scholars of Korean history are well aware of the fact that Dong Yi does not always refer to Koreans. But fringe pseudo-historians of the ultranationalist bent often wilfully misinterpret history based on the assumption that any mention of the Dong Yi relates to Koreans. The page I referred to above is one such example, and includes such fantastic claims as that the Chinese characters were invented not by the Chinese but by the Dong Yi, i.e. Koreans.

    Going back to the forms seen in oracle-bone script and bronze inscriptions, we see that Yi 夷 is not a combination of 大 (da “great”) and 弓 (gong “bow”), but of 人 (ren “man, person”) and 尸 (shi “corpse”), which were both originally characters for “man” or “person”. The later resemblance to 大 and 弓 is accidental, so the claim that it represented a bow is a bit of later folk character-etymology.

  8. Jim (another one) says

    Jongseong, I must have misremembered it then. Origin myths are the sort of thing a kid would hear in passing many times in a childhood and if you didn’t, that is because it’s not your origin myth.

    Thanks for that bit on the Dong Yi and Hou Yi. For me that makes a connection between the Dong Yi and Koreans even more tenuous, because if there is a real connection, why has that origin myth been dropped from the culture?

    “The Dong Yi mentioned in the Classic of Mountains and Seas (versions of which date back to the 4th century BC) let alone during the Shang period definitely would not have been Koreans, however. Originally, the label seems to have been applied to some groups in the Shandong and Anhui regions in China.”

    That’s far enough south of modern Korea, but is it clear what the original extent of the proto-language was? An isolate is typically relictual.

    As for the general reference of the term Yi, I remember at least one anti-Manchu reference to them as Man Yi.

  9. Greg Pandatshang says


    Thanks for your comments and links to papers (which I have not had a chance to read all of yet). I do want to clarify that when I speculated about similarities between Chinese, Tibetan, and Burmese being areal rather than genetic, I meant as far as an especially close relationship linking them much more closely than Trans-Himalayan in general, as suggested very tentatively by Blench & Post (and by van Driem with regard to Sino-Bodic). I didn’t mean to call into question that all three are genetically part of Trans-Himalayan (not that I have the expertise to do more than take note of the consensus on that point). If Lolo-Burmese shares a clade with Rgyalrongic, that might refute Blench & Post’s Sino-Bodic-plus-Burmese, although I don’t know just where Rgyalrongic fits in their tentative TH family tree.

    Blench and Post reference (fig. 5): http://www.rogerblench.info/Language/Sino-Tibetan/Blench%20&%20Post%20paper%20%20HLS%202010%20final%20for%20submission.pdf

  10. Greg, I see — sorry for misunderstanding your post. Concerning Post and Blench, I have nothing in principle against their ideas, but the evidence for the subgrouping they propose is meager. Attempting to determine the phylogeny of the Trans-Himalayan family is premature given our poor understanding of the sound laws. Concerning the Burmo-Qiangic (or Burmo-Gyalrongic) hypothesis, I apologize — I posted a link to the wrong article in my previous post, the correct one is this one and its supplementary file.

  11. @Jim (another one), the origins of Korean are indeed quite obscure. Old Korean dates back to the Three Kingdoms period of Korea, which ended in the 7th century, but only the language of the southeastern kingdom of Silla can be securely established as Koreanic. The other kingdoms Goguryeo/Koguryŏ and Baekje/Paekche might have spoken dialects of the same language, related languages, or something else entirely depending on how you interpret the evidence.

    Other ancient polities that may have spoken Koreanic or proto-Korean include Buyeo/Puyŏ in Manchuria (there is also a Buyeo languages hypothesis which would link it with the languages of Goguryeo, Baekje, and possibly Japonic while keeping it separate from the language of Silla) and Gojoseon/Kojosŏn or Old Joseon/Chosŏn. The latter developed in what is now the Liaoning region of China in the early stages (around the 7th century BC) before losing territory to the Chinese state of Yan 燕 and retreating to northern Korea around Pyongyang. So Gojoseon is probably the best candidate for an early proto-Korean Dong Yi.

    Indeed, the name Joseon 朝鮮 (Chaoxian in Chinese) does appear at least twice in the Classic of Mountains and Seas.

    Passage 1. 朝鮮在列陽東, 海北山南. 列陽屬燕.
    Chaoxian [Joseon] is in the east of Lieyang 列陽, south of Haibei 海北 Mountain. Lieyang belongs to Yan.

    Passage 2. 東海之內, 北海之隅, 有國名曰朝鮮 / 天毒, 其人水居, 偎人愛人.
    Inside the eastern sea, near the northern sea, there is a country named Chaoxian [Joseon] / … [I’m not confident translating the rest]

    On the other hand, I couldn’t actually find the term Dong Yi in the Classic of Mountains and Seas at least on the online version I consulted. If it is the case that the term Dong Yi does not appear in the book, then we are probably talking about the term as applied by later scholars or commentators discussing the book so it doesn’t help us figure out the original use of the term.

    In any case, later scholars seem to have hypothesized about the various myths in the book deriving from various cultures including the Dong Yi. But here, they are talking about the Dong Yi in the earliest sense, as a label applied to some groups from the Shandong and Anhui regions. There is no evidence that any proto-Korean states extended that far, and neither did the Liaoning bronze dagger culture which might be an archaeological expression of Gojoseon.

  12. [I’m not confident translating the rest]

    I’m not capable of translating at all, but since I have Birrell’s Penguin Classics translation of The Classic of Mountains and Seas on the shelf, I thought I’d look it up. It seems that the rest of the passage doesn’t refer to Joseon at all:

    Within the East Sea region, at the corner of the North Sea, there is a country. Its name is Dawnfresh. The land of Skypoison is here too. The people of Skypoison live on the water. They cuddle each other sensuously and are very amorous.

    Consulting the notes on Chinese names and terms at the back of the book confirms that “Dawnfresh” is “(Chao-hsien): The ancient Chinese name for Korea”. “Skypoison” is “(T’ien-tu): The ancient Chinese name for India[‽]”.

  13. They cuddle each other sensuously and are very amorous.

    SkyPoison sounds like a good place.

  14. Re: obscure origins of Korean

    According to what I learnt here on languagehat, there were two main waves of migration into the Korean peninsula by Bronze Age rice-growing farmers. The first wave spoke languages related to Proto-Japanese and indeed, they migrated into Japan from Korea circa 300 BC. The second wave spoke proto-Korean and eventually assimilated and incorporated the previous Japonic population.

    Koguryo, Silla and Paekche were probably quite similar – all three likely had a Korean-speaking elite on top of a largely Japonic-speaking population, but perhaps the proportions were different in each state. The higher proportion of Korean speakers in Silla could probably be explained by its closeness to Japan, which led to mass emigration of Japonic speakers.

  15. @Tim May, thanks for providing Birrell’s translation of the passage!

    A more familiar historical name for India is Tianzhu 天竺. According to Wiktionary, it is:

    Itself a transcription of Old Persian *Hind-uka – hypocoristic of *Hinduš (“people living beyond the Indus”). Late Old Chinese pronunciation: *l̥ˁin tˁuk. Variant transcription: 身毒 (Shēndú, from Late Old Chinese *qʰjin dˁuk).

    Tiandu 天毒, which would be Old Chinese *l̥ˁin dˁuk, looks like it could be a variant transcription of the same name. In the confused geography of the Classic of Mountains and Seas, then, Joseon and India are in the same general region…

    Of course, this isn’t quite satisfactory, and there is inherent difficulty in making sense of the language of pre-classical Chinese texts such as that of the Classic of Mountains and Seas (God knows later Classical Chinese texts invite enough ambiguity in interpretation already). But it is enough to launch some crackpot theories.

    For comparison, I found a translation into Korean by the late amateur pseudo-historian Yulgon Yi Jungjae 율곤 이중재 律坤 李重宰 (Yulgon is the byname, Yi is the surname, and Jungjae is the given name):

    東海之內. 北海之隅. 有國名曰朝鮮. 天毒. 其人水居. 偎人愛之.
    동쪽나라 안의 북쪽의 모퉁이에 있는 나라이름은 조선(朝鮮)이며 조선은 천독(天毒)이며, 그 사람은 물이 있는 곳에서 살고 사람을 사랑하며 가까이 한다.
    In the eastern realm, in the northern corner, the name of the country is Joseon [Chaoxian] and Joseon is Cheondok [Tiandu], whose people dwell where there is water, love people and are intimate with people.

    Apparently, Yi once gave a talk at the Library of the National Assembly where he “demonstrated” that the Korean kingdom of Silla was not located in the Korean peninsula but in China. Then he was invited to the main hall of the National Assembly itself to give a talk claiming that the Later Han was none other than Goguryeo/Koguryŏ. In fact, his lifelong work seems to have been to show that much of what we know of as Korean history took place in China: Koreans ruled over the whole of China, Manchuria, and the Korean peninsula. So it isn’t surprising that the ambiguity and confused geography of the Classic of Mountains and Seas which seems to connect Joseon and India fed into Yi’s ultranationalist fantasy.

  16. @SFReader, I think you are referring to this thread, where I see that Jim (maybe the same one commenting above?) had this to say:

    This puts me in mind of wider affiliations for these groups. Manchuria is not the only possible urheimat for Korean. The area immediately north of Shandong was not always Chinese speaking and some group of eastern Yi were in conflict with the Chinese state all the way back in Shang times. They or someone who inherited that name claimed to be descended from Yi, the archer, who also happens to be connected to the Koreans as well as being claimed as the ancestors of the Hmong. If that sounds like too big a geographic spread, it’s not. The Chu state had a Hmongic substrate, at least in language (Hubei dialects show Hmongic lexical material) it was snugged up against the Chinese states, and it was old, supposedly being started by a descendant of the Yellow Emperor.

    So that’s another statement to the effect that Koreans claimed descent from Yi the archer. I wonder if there was some sort of confusion there between Yi 夷 the barbarian group (with folk character etymology connecting it to a bow) and 羿 the name for the archer Hou Yi.

    As for the origins of Korean, the broad scenario you summarize of proto-Japonic reaching Japan from the Korean peninsula and being replaced in the latter by a later proto-Korean wave certainly sounds plausible and even probable (the alternative being that proto-Japanese bypassed the Korean peninsula entirely). Tying the proto-languages to actual states and cultures is difficult though.

    Based on the records, some have posited a Buyeo-Han 扶餘-韓 language split in the proto-Korean cultures, where Goguryeo and Baekje had Buyeo-derived elites and spoke a different language from most of the (non-elite) inhabitants of Baekje and Silla, who spoke the Han language (named after the Samhan 三韓 or the Three Han that previously existed in the southern part of the peninsula). Now, since the Silla language was Old Korean at least in its later stages, proto-Japonic probably could not have been a Han language. Could it have been a Buyeo language? Well, some have noted lexical similarities between several Goguryeo toponyms and proto-Japonic.

    I’m not convinced of this theory, though, including the whole notion of the Buyeo-Han split. Other scholars hold that the Goguryeo toponyms that seem connected to proto-Japonic reflect not the language of Goguryeo but of the previous inhabitants. I am definitely not a specialist and would like to do much more reading on the latest theories, but at the moment a more plausible picture seems to be that the classical Three Kingdoms of Korea—Goguryeo, Baekje, Silla—all spoke Koreanic languages at least at the elite level, whereas the Buyeo language might well have been quite different (but could also conceivably have been Koreanic). Proto-Japonic remnants on the peninsula might be represented by the Gaya/Kaya statelets in the south (just across from Japan), which enjoyed close relationships with Japan in historical times before they were absorbed by Silla.

  17. I have nothing to add except that this is fascinating stuff and I’m glad you guys are talking about it!

  18. There is another question which I haven’t yet seen answered conclusively in the literature. How much of Koguryo’s population actually spoke Chinese?

    The reason why I am asking is that north Korea was annexed by Han China and remained part of China for 400 years. The entire region was colonized by Chinese military settlers.

    I’ve read that even after Koguryo reconquered the area, Chinese presence continued and that ethnic Han Chinese military commanders became officers/local officials of the new Koguryo regime.

    Thus, if we take into account the continued presence of ethnic Han Chinese (who spoke what Beckwith described as Northwestern Middle Chinese), the ethno-linguistic situation in Three Kingdoms period Koguryo becomes even stranger, with a mix of Chinese, Old Korean and proto-Japonic (plus Tungusic in Manchuria, which also was part of Koguryo).

  19. There were Chinese migrations already into Gojoseon (Old Joseon), before Goguryeo. The chaos that followed the dissolution of the Qin dynasty in China led to a lot of Chinese refugees crossing into Gojoseon. Then in 195 BC, during another episode of chaos in the area of the former state of Yan, Wiman 衛滿 (Weiman in Chinese) led a group of 1,000 refugees into Gojoseon. Wiman was entrusted by King Jun 準王 of Gojoseon with the defence of the western border, but soon got powerful enough to overthrow him and become the king of Gojoseon himself. Wiman is described as a man of Yan, so he could well have been of Chinese descent, although the descriptions that he had a topknot and dressed in Gojoseon clothes coupled with the fact that a large stretch of Yan territory had only been conquered from Gojoseon barely a century before have led some to speculate that he was a Yan subject of Gojoseon descent. In any case, though, this episode shows that there were migrants crossing into Gojoseon from Chinese-ruled areas.

    The Han dynasty conquered Gojoseon in 108 BC and initially established four commanderies (administrative structures) in its former territory, although soon only Lelang 樂浪 (Nangnang in Korean) survived as the others were abolished or moved. Lelang and its spinoff Daifang 帶方 (Daebang/Taebang in Korean) represented Chinese presence in Korea for about four centuries. But I wouldn’t go as far as calling it a Chinese annexation of northern Korea, which would anachronistically overestimate the degree of control that China could exercise over this far-off border region. Chinese control from the centre was only asserted intermittently during these four centuries, as semi-independent governors and warlords ran affairs quite free from central interference most of the time. Lelang must have been centred around Pyongyang based on the archaeological record.

    According to this news report in Korean from 2007, wooden slips from the Lelang period were recently discovered in Pyongyang, recording a census of the 25 counties of the Lelang commandery from 45 BC. The total population was around 280,000, with a high population density around the county of Joseon (the area around Pyongyang, the former capital of Gojoseon). Of these, around 40,000, or around 14%, were classified as Chinese and the rest as natives. So a fair proportion of Chinese settlers were present around a half-century after the Han conquest of Gojoseon. But this seems rather too low a number to have maintained a robust presence of the Chinese language by the time Lelang was absorbed by Goguryeo in 313, given that the commandery went through several periods of turmoil and the Chinese wouldn’t have established enough control over the area to replenish settlers. But if at least some communities of Chinese settlers resisted assimilation until the takeover by Goguryeo, Sinitic could very well have been spoken in Goguryeo, and even in Baekje or Silla if some refugees fleeing Goguryeo ended up going south. After all, literacy in Classical Chinese is thought to have spread in the three kingdoms due to the influence of the Chinese presence around Lelang. So Sinitic probably never achieved a high number of speakers in Korea proper and would have eventually been overwhelmed by Koreanic, but possibly left its mark in the adoption of Classical Chinese as the literary language.

    A century after the absorption of the Lelang commandery, Goguryeo gained control over the whole of the Liaodong peninsula, which would have been inhabited at least partly by Sinitic speakers due to its period of control by the Yan, the Qin, and the Han (even if it was originally ruled by Gojoseon). So Goguryeo would certainly have had Sinitic speakers within its realm.

    Goguryeo was indeed an interesting ethnic mix. Apart from the ethnic elements assumed to be proto-Korean and the Sinitic speakers in the Liaodong region, it included the Malgal 靺鞨 (Mohe in Chinese) tribes of Manchuria who were Tungusic speakers. I’m guessing that elements of the proto-Mongolic(?) Xianbei 鮮卑 or even the Xiongnu 匈奴 may have been present within the realm. But at least the core area of Goguryeo would have been mostly Koreanic speaking (I’m going with the hypothesis that the languages of the three kingdoms of Korea were all Koreanic).

    As for proto-Japonic, the Wikipedia page on the Buyeo languages states that the Goguryeo toponyms that resemble Japanese are mostly found in the central part of the peninsula and not in the north. So by the time of the hypothesized “second wave” of Koreanic speakers into the Korean peninsula, proto-Japonic speakers may have been present only in the central and southern parts and not farther up north. Goguryeo might not have encountered proto-Japonic (or at least the substratum) until its expansion down south in the early fifth century. I should perhaps note that the territorial expansion into the entire Liaodong peninsula and into the territory of the Malgal tribes in Manchuria took place around the same time. These conquests would have added a fair number of non-Koreanic speakers to Goguryeo’s population, at least in the north (central Korea was already Koreanic-speaking by this time).

    The most obvious legacy of Goguryeo’s multilingualism is the development of the Jurchen and (later) Manchu languages, Tungusic languages that seem to have been considerably influenced by Koreanic. They were spoken by peoples who largely traced their ancestry to the Malgal tribes that were ruled by Goguryeo.

  20. “Due evidently to failure either to glance at the relevant chapters of the Samguk sagi itself or to read the recent philological-linguistic study of the toponyms in that source (Beckwith 2004), some scholars (Janhunen 2005; Unger 2005; Vovin 2005b) openly claim or imply that the text includes only toponyms from the central Korean Peninsula area of the former Koguryo Kingdom. This claim is false. The Samguk sagi gives several lists of glossed toponyms of places in Koguryo north of the Yalu, each list being preceded by a title which explicitly states that the names are of localities north of the Yalu. Although the Samguk sagi is written in Classical Chinese, these toponyms too have been discussed in English (Beckwith 2004: 89-92). The Samguk sagi also includes toponyms from the former Ye-Maek Kingdom region (Beckwith 2004: 83-88), which was already Puyŏ-Koguryŏ-speaking in Antiquity, as noted above. Overlooking this material is a gross error that alone falsifies the view that the language of the Koguryŏ toponyms was not the Puyŏ-Koguryŏ language.

    In the Samguk sagi there are 19 glossed and linguistically identified toponyms from the region north of the Yalu, 14 from the east-central coast (former Ye-Maek Kingdom) region, and 88 from the central and west-central Korean Peninsula regions.” (From Beckwith’s article on the Koguryo language in the Journal of Inner and East Asian Studies, December 2005.)

    Citing the Samguk sagi, he then goes on to argue that the Koguryo language was completely un-Korean, that Silla spoke only Korean and that Paekche spoke two languages, one related to the Koguryo language and one related to the Silla language (i.e., Korean).

  21. I gather that areas of north Korea devoid of Koguryo toponyms had toponyms in another language – Chinese most likely. Unfortunately Beckwith doesn’t say this explicitly, but that’s the impression I get. If true, then significant areas of north Korea could probably still have been speaking “Northwestern Middle Chinese”.

  22. I remember being taken in at first by Beckwith’s bold and provocative claims about the language of Goguryeo being related to Japonic while rejecting connections with Koreanic (you can preview his Koguryo: The Language of Japan’s Continental Relatives), but then realized neither Korean nor Western scholars seemed to think much of his conclusions. In addition to objections from historical and archaeological angles, many take issue with his historical linguistics—I’ve seen several complaints about his reconstructions of proto-Japanese and Old Chinese in particular.

    The following excerpt is from a review by Thomas Pellard (PDF):

    Unfortunately, Beckwith’s ambitious work is heavily flawed in many aspects, of which I will provide only a few examples. First, I deplore the general opacity of his methodology, since most of his reconstructions are his own, quite different from the ones adopted in mainstream Chinese (Baxter 1992; Sagart 1999; Starostin 1989, 1998-2003) and Japanese (Martin 1987) historical phonology, and it is unclear how they were arrived at. His comparisons thus use reconstructions that are too often problematic, sometimes simply incorrect, or, worse, just circular.

    For instance, the mysterious Proto-Japanese (PJ) *mika ume ‘plum’ and *rmey > umi ‘sea’ (pp.146-47) are completely ad hoc. They are supported by neither internal nor comparative method, and such consonant clusters have never been posited for PJ. The Yaeyama form “ᵐmi” quoted as evidence (p.147) cannot be found in Hirayama’s reference dictionary (1988:139-40; Yaeyama dialects forms are recent loans from mainland dialects since plums don’t grow there). Anyway, both words cannot be reconstructed with the same onset since umi doesn’t exhibit the m-/ø- alternation of mume/ume in Japanese, and both words have completely different Ryukyuan reflexes (Shuri ʔɴmi ‘plum’ vs. ʔumi ‘sea’). Their putative Chinese sources don’t exhibit an initial *r- in standard reconstructions either: ‘plum’ *mɨ (Baxter), *mǝ̄ (Starostin); ‘sea’ *hmɨʔ (Baxter), *smǝ̄ʔ (Starostin).

    It seems that all the above “reconstructions” are motivated only by the urge to provide them with an etymology: external comparison is privileged in detriment of internal evidence. Other quite irregular correspondences and derivations can also be found, with irregular forms too easily dubbed as “dialectal”, and, for some of them, the author even confesses that “these phonological changes are almost completely unexplained” (p.149).

    I don’t have the knowledge on this topic to judge these objections, but given that several others have similarly complained about Beckwith’s reconstructions, I have reservations accepting his conclusions. However, it does seem to this non-specialist that Beckwith is more convincing when he argues that Japonic-looking toponyms did indeed occur outside of central Korea, north of the Yalu. That still doesn’t demonstrate that these toponyms represented the language of Goguryeo and not that of the previous inhabitants, which would be unsurprising assuming that Koreanic was a relatively recent arrival in the region (the second wave of migration).

  23. @SFReader, I doubt that the Chinese language had much of a presence in the core areas of Goguryeo (setting aside the Liaodong peninsula for now). It would only have been a recent arrival in the region, and if only 14% of the population of the Lelang commandery were Han Chinese a half-century after its establishment as I mentioned in a previous comment, Sinitic would have had at best a minority presence in northern Korea.

    As for the languages present in the Korean peninsula before the arrival of Koreanic, they are usually assumed to be Paleosiberian (I think Nivkh has been proposed). This is a catch-all label for the indigenous languages that preceded the arrival of later dominant language families like Tungusic, Koreanic, and Japonic. The states of Gojoseon (Old Joseon) and Buyeo are certainly not described as part of the Sinitic cultural sphere in the Chinese records.

  24. Whitman, in his “Northeast Asian Linguistic Ecology and the Advent of Rice Agriculture in Korea and Japan” (I think this is the paper SFReader is referring to), presents the hypothesized historical scenario where Japonic arrived in the Korean peninsula around 1500 BC associated with wet-rice agriculture and Koreanic arrived in the south-central part of the Korean peninsula around 300 BC associated with the Korean-style bronze dagger culture, itself an offshoot of the Liaoning bronze dagger culture whose prototype is found in the Korean peninsula from 1300 BC on. The Shandong region is discussed as a possible source of the dispersions of Japonic and/or Koreanic. Sinitic enters the picture only much later.

    I’ve noticed that 300 BC, which Whitman proposes as the date of the spread of Koreanic to the Korean peninsula, was around the time that Gojoseon (Old Joseon) lost vast expanses of territory to the Chinese state of Yan, thought to correspond to the Liaodong region based on the archaeological evidence. Gojoseon was thereafter centred around Pyongyang in northern Korea. If Gojoseon represented Koreanic speakers, the timing seems right for the expansion of Koreanic into the peninsula; or if they weren’t Koreanic speakers themselves, maybe their migration still served to push Koreanic speakers into the peninsula.

  25. I have the impression that the Chinese state of Yan had a considerable non-Sinitic population. Proto-Mongolic, perhaps.

    That’s a pet theory of mine. Around 1000 BC, the original Lower Xiajiadian Culture, with a settled agricultural population speaking Proto-Mongolic languages (and a culture quite similar to that of the Shang, including the practice of oracle bone divination), started to split into two – the Upper Xiajiadian Culture, which underwent a transition to a nomadic economy (the Mongols are the end result of its evolution), and the state of Yan, which fully adopted Chinese culture and later the Old Chinese language.

  26. Just for the record, DeLancey didn’t put forth any *linguistic* argument, based on the study of the language itself, that the Shang language was not Sinitic. He speculated this on the basis of Tibeto-Burman languages being mostly SOV while the Sinitic languages are SVO. Logic would dictate that this difference must have been caused by some process, which DeLancey imagines has to do with the Shang population speaking an SVO language and then making a linguistic switch to a Zhou-popularized Tibeto-Burman lexicon while retaining the SVO syntax. He provides as support Benedict’s comparatively limited set of cognates between Tibeto-Burman and Chinese, as well as Benedict’s discovery of Austro-Asiatic/Tai-Kadai/Hmong-Mien cognates in Old Chinese. These Southeast Asian language families are well known to have included many historically SVO languages.

    As far as I know, the deciphered sections of the oracle bone script share many cognates with Tibeto-Burman languages, but also with certain Austro-Asiatic/Tai-Kadai/Hmong-Mien languages. So the jury is still out as to whether the Shang population spoke Sinitic. Yet I do not believe that the language switch DeLancey referred to occurred with the Zhou, because the Zhou was a vassal tribe of the Shang, not the other way around, and DeLancey’s model of a “Zhou linguistic sphere” does not fit the well-established Sinological opinion, also backed by lexical studies, that there was no large linguistic break between the Zhou and the Shang, because the Zhou had already adopted the high culture of the Shang through intermarriage and alliance before their takeover. It is more plausible that the Zhou converted to speaking the Shang language, as they must have known the language during their service to the Shang; in that case, it might actually be the Shang Dynasty that imposed a Sino-Tibetan language on top of a previously Austro-Asiatic/Hmong-Mien/Tai-Kadai speaking population, causing the syntactic change described by DeLancey.

    Or it might just be an areal effect.

  27. David Eddyshaw says

    Classical Chinese itself shows possible traces of itself having developed from an SOV pattern. Quite apart from the pervasive modifier-head pattern it shows everywhere else, which tends to correlate Greenbergianly with SOV, it still puts object pronouns before the verb in certain circumstances.

  28. David Marjanović says

    Also, there’s very little evidence indeed for a bifurcation into Sinitic and Tibeto-Burman…

  29. Which is why some people are starting to call the family “Trans-Himalayan”, which I find irritating: after all, Indo-European was not spoken solely in India and Europe even before the Great Migrations. Still less is Austroasiatic spoken throughout South Asia, or Afroasiatic throughout both Africa and Asia. Names are but names, and nobody doubts that Sinitic and Tibetic belong to the same family.

  30. Greg Pandatshang says

    Van Driem objects to Sino-Tibetan because he believes it specifies a particular model of the family’s structure: bifurcated into Sinitic vs. Tibeto-Burman. He initially argued for Tibeto-Burman as the name of the entire family, including Chinese.

    Sino-Tibetan also has a mnemonic shortcoming, implying that Sinitic and Tibetan are representative of contrasting top-order taxa (regardless of whether there are only 2 top-order branches or not). However, it now seems more likely that Chinese and Tibetan are fairly closely related compared to other more obscure Sino-Tibetan languages, such as those spoken in Arunachal Pradesh: that is, they would be in the same top-order taxon. There are probably some other language family names that commit similar offenses, but this one might be particularly egregious. I associate “Trans-Himalayan” with a much-enhanced appreciation of the family’s diversity, although of course this appreciation could still be accomplished without changing the name.

    India and Europe (or Germany) are at least located roughly at either end of the IE-speaking world.

  31. Greg Pandatshang says

    Since April, I’ve read a good chunk of Baxter & Sagart’s Old Chinese: A New Reconstruction (as well as a smaller chunk of Schuessler’s Minimal Old Chinese and Later Han Chinese) and thereby now have some appreciation of how xiéshēng works, i.e. Chinese morphemes whose characters share the same phonetic element. Everyone knows that they often do not sound very similar in modern Chinese, but that they presumably sounded similar in predictable ways in Old Chinese.

    I began to wonder how this could be compatible with DeLancey’s theory that the Shang language was not Sinitic. Chinese characters were presumably originally designed to write the Shang language. The hypothetical change from Shang to Sinitic is partly invisible to us because the characters could have stayed the same while being pronounced quite differently, as if a régime that insisted on writing in baihuawen and pronouncing characters in Standard Cantonese were suddenly replaced by a régime that insisted on writing kanbun but reading all characters using only kun’yomi. But, if so, (going back to DeLancey’s scenario) we would expect the phonetic elements to reflect the Shang language pronunciations, not Sinitic. As in my hypothetical scenario, you would notice some limited patterns based on phonetic elements when you’re pronouncing in Cantonese, but the phonetic elements prove completely meaningless and haphazard when you’re using kun’yomi.

    I’m aware that there was extensive standardisation of the characters in the early Qin, long after the fall of Shang. Is it possible that all of the phonetic elements were revised? Or brand new characters were created for all Sinitic words that were not loan-words from the Shang language?

    Here’s the link to DeLancey’s article: https://naccl.osu.edu/sites/naccl.osu.edu/files/NACCL-23_1_04.pdf

  32. David Eddyshaw says

    “Is it possible that all of the phonetic elements were revised? Or brand new characters were created for all Sinitic words that were not loan-words from the Shang language?”

    In a word, no it isn’t; not as far as Li Si’s reforms are concerned. The Qin reform regularised an existing system. Every page, virtually, of the pioneer work, Karlgren’s Grammata Serica, shows this.
    (For the very first example I found essentially at random, the Zhou word for “you” sounded enough like the word for “woman” that the originally pictographic sign for woman appears already in the sense “you” in Zhou inscriptions.)

    It’s not quite so easy to say this unequivocally for the Shang-Zhou transition; however I think the original-pictogram for “wheat” already appears for “come” in the oracle bones, for example.
    Baxter’s book takes it for granted that the oracle bones are Chinese, FWIW.

    The whole thing reminds me of the doomed attempts to demonstrate that cuneiform wasn’t invented by the Sumerians, which similarly come up against awkward problems like the fact that the inventors evidently had a word for “reed” which sounded like the word for “go”, which by complete coincidence happens to be the case for Sumerian …

  33. David Eddyshaw says

    It’s possible of course that the Shang rulers themselves weren’t Chinese (it’s not as if there aren’t plenty of other examples of Chinese dynasties of foreign origin.) But it seems pretty fanciful to suggest that their written language isn’t Chinese.

    DeLancey’s article doesn’t seem to claim necessarily that the Shang didn’t speak Chinese; his idea seems to be more that later Chinese is a sort of creole based on proper Tibeto-Burman style urChinese as adopted by speakers of other languages. Personally my hackles rise whenever people start invoking “creolisation” as part of the history of any language without any actual supporting historical evidence, but in the case of Chinese I suppose the idea is to some extent fairly uncontroversial; it’s a matter of degree, and timing. At the very least there is evidently a major Sprachbund comprising Chinese and its southern neighbours.

    Talking of creolisation and bioprograms and whatnot, I was reading Albert Valdman’s book “Haitian Creole”, which has a very different take on the whole matter from what I’ve seen in works on English-based pidgins and creoles; he makes a persuasive case that there is no historical evidence that the alleged conditions under which creolisation is supposed to occur were ever really present, and also that the slaves in Hispaniola basically were pretty successful in acquiring a settlers’ language which was *already* much farther removed from standard French than most notions of creolisation suppose.

  34. Fascinating!

  35. marie-lucie says

    Oracle bones

    A few years ago I saw a documentary about life in the Canadian Arctic. One short scene has stuck in my mind ever since: an older Inuit man, sitting in front of a fire, holding in his hand what seemed to be the shoulder blade of some animal and moving this thin flat bone gently over the flames, as the far edge of the bone started to char. The purpose of this activity was described as divination. I immediately thought of the Chinese “oracle bones”, actually from tortoises but also thin and flat and treated by fire.

  36. “DeLancey’s article doesn’t seem to claim necessarily that the Shang didn’t speak Chinese; his idea seems to be more that later Chinese is a sort of creole based on proper Tibeto-Burman style urChinese as adopted by speakers of other languages.”

    DeLancey’s article is more about speculating on the reason for the fundamental typological differences between Sinitic and its most closely associated Sino-Tibetan languages, and the other Sino-Tibetan languages out west, especially word order – SVO vs. SOV. But as far as I can see he’s provided no solid evidence of why the timing should be the Shang-Zhou transition, except that the Shang oracle bone language was more solidly SVO than the Zhou’s. For example, in “Languages of Mainland Southeast Asia: The State of the Art,” it is stated that Pre-Archaic Chinese – the language of the oracle bone script – was a stable SVO language that usually placed obliques after the object, as SVO languages tend to do, and as later Sinitic does not. Yet this does not prove that the Zhou language was SOV – the Zhou bronze script wasn’t – and it does not prove that the SVO structure of Pre-Archaic Chinese wasn’t the typological result of substrata established earlier than the Shang.

    With 50% of the oracle bone characters still undeciphered, there is always room for doubt, but had the Zhou language replaced the Shang’s “in the mouths of the Shang population,” we’d expect there to be some evidence of fully SOV Sinitic in and around the Zhou homeland, preferably with a different lexicon, since that is where their original language should have been in established usage. But as David McCraw shows in “An ABC Exercise in Old Sinitic Lexical Statistics,” Zhou bronze script texts are even more conservative than later Sinitic texts with respect to their adherence to the Shang lexicon, and are still primarily SVO. This is what led to the mainstream belief that there wasn’t a language break between the Shang and the Zhou to begin with, due either to the Zhou taking up the Shang language, or to their having spoken a similar language all along.

    Of course, what is productive about DeLancey’s argument is that there does need to be an explanation for why eastern Sino-Tibetan – Sinitic, Bai, Tujia, etc. – are SVO while western Sino-Tibetan – virtually all of the “Tibeto-Burman” languages – are SOV. DeLancey is probably right to say that SOV was the original order of proto-Sino-Tibetan as it is less likely that all the various scattered Tibeto-Burman languages became SOV than that Sinitic and its close neighbors became SVO. But since we don’t yet have a proper model of the structure of the Sino-Tibetan family, it might be too early to make a solid speculation.

    “It’s possible of course that the Shang rulers themselves weren’t Chinese (it’s not as if there aren’t plenty of other examples of Chinese dynasties of foreign origin.) But it seems pretty fanciful to suggest that their written language isn’t Chinese.”

    The Shang court’s virtual monopoly on literacy and divination makes it less likely, I think, that the creators of the oracle bone script and the Shang rulers didn’t speak the same language, since if there is an established tradition of writing outside of the Shang court’s, we haven’t found it, so the best explanation remains that it was the Shang government that created the script, and that it therefore matched their language. This is especially so since oracle bone divination is ultimately a private courtly matter, representing communication between the Shang kings and their ancestors/gods, and since the vast majority of the population was illiterate and couldn’t have read the inscriptions, there was no reason to cater to the public.

    Some have speculated on the existence of earlier forms of writing that perished – for example writing on bamboo and wood, that preceded the Shang oracle bone script, just as there were precedents to cuneiform in Mesopotamia that showed the long and difficult road to literacy. But I know of no serious scholarship, due to the lack of evidence, and the only signs that have been found in China before the Shang are “pre-writing.” If the Shang were, indeed, the first to invent/adapt writing in China, then the oracle bone inscriptions should represent their language, not another’s.

  37. @David Eddyshaw: In creolistics there is a divide between an Anglo-American and a French intellectual tradition: the former emphasizes the break between the colonial language and the resulting creole language, and factors which account for this break, chiefly pidginization and substrate influence: to this tradition these are THE factors which account for the genesis of creole languages.

    The latter tradition (of which Valdman is a typical representative), conversely, emphasizes continuity from colonial language to creole and sees the distinctiveness of creoles as being due to large-scale second language acquisition of forms of the colonial language which (through dialect koineization) were already divergent from standard varieties of the colonial language in question.

    Both traditions are in my opinion guilty of cherry-picking, but the first is closer to the truth. To my mind the evidence is unambiguous: creoles are nativized pidgins. Period. Substrate and/or dialect features in creoles do exist, but cannot explain why creoles and the early colonial languages are today separate languages. Pidginization followed by nativization (=creolization) can.

  38. David Eddyshaw says


    Thanks. I was hoping you’d chip in. I was indeed wondering whether this was an Anglophone/Francophone thing. Valdman’s is the only book of any substance I’ve seen on a French-based creole, and I was very struck by how different his take was from the John McWhorter type stuff I’ve come across before. The references in Valdman’s book led me to conjecture that his was a fairly mainstream French view rather than a personal idiosyncrasy.

    V’s book does make it plain that the issue is highly political too, at least as far as Haitian goes.

  39. David Marjanović says

    the Chinese “oracle bones”, actually from tortoises

    Shoulder blades were also used.

  40. marie-lucie says

    Thanks David. Shoulder blades must have been the original medium in North Asia, since they can be obtained from a variety of mammals, while tortoises are not found everywhere.

  41. @David Eddyshaw: I am very much in agreement with McWhorter’s take on creole genesis, not least because he is one of the very few creolists who seriously examines the arguments presented by both the French and the Anglo-American schools. And you’re quite right, Valdman’s view of creole genesis is very typical of the French school’s.

    As for the political dimension of creole studies…Ugh. Don’t get me started. All I will say is this: it’s messy, it’s ugly, and it’s everywhere.

    @David, Marie-Lucie: since Eskimo-Aleutian seems to represent the most recent language spread from Eastern Siberia into North America, there may well be direct cultural continuity between the Chinese and the Inuit practices of divination through oracle bones. If so, however, I would expect the custom to also exist (or to have existed) in Siberia: can anyone confirm or deny this?

  42. I can indeed. Section 2 of this paper on pyro-osteomancy speaks of it among the Chukchi (who used reindeer scapulae). It was also practiced, says the same paper, by the ancient Mongols (sheep), the ancient Japanese, and far to the east by the Naskapi (caribou).

  43. marie-lucie says

    Etienne, JC: There are lots of resemblances in cultural practices and traditions between North Asia and North America. Sometimes you see references to one such practice (like here), and that’s it, but when you tabulate a lot of them you find that the resemblances are too many and too precise to be coincidences. These resemblances, plus the linguistic variety on the American side, strongly suggest that the widespread opinion that there were no contacts between the two continents for 10,000 years or so cannot be true. Instead there must have been a series of migrations, most of them involving fairly small groups who intermarried with the existing populations, as well as cultural diffusion.

  44. John Cowan: thanks for the paper! “Pyro-osteomancy”…I don’t often learn a new word these days, but this one is new. It could come in handy in a debate, too…(“This scholar’s predictions have so consistently been proven wrong that one less charitable than myself might go so far as to suggest that even the practice of pyro-osteomancy would lead to a tangible improvement in the heuristic value of said predictions”).

    That the Naskapi practiced this is interesting: the Naskapi are geographically very close to Inuit speakers, and indeed some Algonquianists have pointed out to me, informally (I don’t think they’ve published anything on the topic yet) that some aspects of Naskapi and Northern East Cree phonology seem to point to a period of language contact with Inuit. Perhaps the ancestors of Naskapi speakers acquired the custom of pyro-osteomancy at the same time.

    Marie-Lucie: Eskimo-Aleutian is surely an exception to the claim that there was no contact between the Americas and other continents for 10,000 years or so: I believe Sapir was the first to point to Eskimo-Aleutian as being the most recent linguistic arrival in the New World.

  45. marie-lucie says


    You are quite right about the recent arrival of Eskimos, whose cultural adaptation to life among ice and snow enabled them to cross the Bering Strait and other straits on the ice over the sea.

    I was thinking of most other peoples, often thought to have come in on foot on the land referred to as “Beringia” before the melting of the vast glaciers caused the sea level to rise and separate Asia from America.

  46. Divination by reading shoulder-blades is fairly common all over the world, but reading the cracks induced in them by fire seems to be confined to East Asia and North America.

  47. Trond Engen says

    That reminds me... What about the word Innu “people”?

  48. marie-lucie says

    What about the word Innu “people”?

    What about it? The Innu (also known as Montagnais) speak a language of the Algonkian family. I don’t know whether the name is related to those for “human” or “people” in Eskimo languages.

  49. Trond: if you’re asking about the similarity between INNU and INUIT, I can assure you that it is wholly coincidental: INNU is a straightforward reflex of Proto-Algonquian */erenyiwa/ “person”, and reflexes of this word are widely found in Algonquian languages: indeed, the French realization of the name of the State of Illinois (/ilinwa/) preserves a much more conservative reflex of */erenyiwa/.

    For more details on reflexes of */erenyiwa/ in Algonquian, have a look at this, and the references quoted therein:


  50. Trond Engen says

    Thanks. That’s indeed what I was asking about. I didn’t expect a borrowing but maybe lexical interference or contamination. But not with that pedigree.

  51. And just to be on the safe side, I had a look on the other side… “Inuit” derives from a base which synchronically can be argued to be /inuk/, which in turn goes back to a Proto-Eskimo base */iŋuɣ/ ~ /inuɣ/, “human being”. The similarity between this and Proto-Algonquian */erenyiwa/ isn’t very striking, to say the least.

  52. And *inu- is the root, so /n/ is the only thing they have in common.

  53. marie-lucie says

    Thanks Etienne and Y for the detailed explanations.

  54. SFReader says

    “The use of heat to crack scapulae (pyro-scapulimancy) originated in ancient China, the earliest evidence of which extends back to the 4th millennium BCE, with archaeological finds from Liaoning, but these were not inscribed.”

    This would be the Hongshan culture of Neolithic Inner Mongolia and Liaoning. The Hongshan farmers were the direct ancestors of Lower Xiajiadian culture of Inner Mongolia which in turn was ancestor of early nomadic Upper Xiajiadian culture (which I, and not a few Chinese archaeologists, link with origins of Mongolian people).

    So it’s quite likely that the Shang Chinese borrowed practice of “Pyro-osteomancy” from ancestors of Mongolians, they just substituted sheep bones with turtle shells.

  55. Trond Engen says

    Yes, the etymology of Inuit is safely reconstructed for Eskimoic. That’s why I asked about Innu.

    I’m fascinated by the first consonant of Algonquian *erenyiwa. It has different reflexes, but not between branches. Instead all possible reflexes are present within pretty much every branch. Take (narrow) Cree:

    Plains Cree: iyiniw
    Woods Cree: iθiniw
    Swampy Cree: ininiw
    Moose Cree: ililiw
    Atikamekw: iriniw

    (I think) I know that North American languages generally don’t discern r from l, but the wide array of reflexes is suggestive of something more complicated than that, e.g. coarticulation. Did any languages preserve wide allophonic variation and/or coarticulation long enough to be captured by modern transcription or recording?

  56. Trond: have a look at this thread


    especially my comment on December 12, 6:43, where I discuss the issue a little. The evidence does appear to indicate that the original Proto-phoneme was typically realized as /r/. The wide array of reflexes is because Proto-Algonquian had a very simple set of consonantal segmental phonemes; as a result */r/ was liable to be realized in very different ways (simple or palatalized /r/ or /l/, with other possible secondary features such as assibilation) without its phonological identity being threatened.

    Proto-Cree seems to have preserved the phoneme largely unchanged from Proto-Algonquian, with Atikamekw thus being the most conservative member of the Cree continuum in this respect. This differentiation of Cree dialects according to reflexes of */r/ is not very ancient, and may postdate European colonization: we have tombstones in Innu where the letter R is used to indicate the reflex of */r/, indicating that the transformation of this phoneme into /l/ or /n/ (according to region) in Innu hadn’t taken place yet. In like fashion there is evidence that there once existed in colonial times a dialect of Cree in the Canadian Prairies, geographically located where Plains or Woods Cree are spoken today, which likewise still realized this phoneme as /r/.

  57. Trond Engen says

    Thanks. I remember the thread now, even though I didn’t have time to engage in it at the time. Algonquian is one of those cases that make me doubt shared innovations as diagnostic of branching.

  58. Trond Engen says

    Well, I shouldn’t say I doubt it as diagnostic. A shared innovation in something you already suspect to be a branch may support the diagnosis. But I doubt it as a defining characteristic of a branch, or as a decisive argument for whether or not a variety belongs to this or that branch. Changes are rarely universal, and regional effects keep working even after the continuum is torn apart.

  59. Greg Pandatshang says

    Here’s a considerably expanded version of DeLancey’s paper: https://www.academia.edu/3894773/The_Origins_of_Sinitic (I didn’t realise there were two versions when I posted the NACCL link above)

  60. SFReader says

    Very interesting!

    I am still reading the paper, but the suggestion that Sinitic is, in fact, a sub-branch within Bodic, is striking.

    I wonder what journalists will make of this claim if they become aware of it.

    “Scientists prove that Chinese is just a dialect of Tibetan!”

  61. “Scientists prove that Chinese is just a dialect of Tibetan!”

    Yup, I’m afraid that’s the inevitable journalistic takeaway. And the political (over)reaction is just as predictable. Sigh.

  62. marie-lucie says

    Trond: Cree dialect forms: iyiniw / iθiniw / ininiw / ililiw / iriniw

    Except for θ , the correspondences y / n / l / r are all attested in many languages. The exception θ suggests an intermediate step d (or t if allophonic with d). So the reconstruction *r makes perfect sense.

  63. Trond Engen says

    Yes, of course. I might have chosen *ð, or something in the vicinity of n, but that’s not my main point. I meant to highlight the fact that the spectrum of reflexes is found within almost every branch.

  64. marie-lucie says

    Trond, sorry I must have misunderstood your comment. I guess you mean that that feature cannot be used for subclassification. Fair enough: there are probably others justifying the overall classification. The variety of correspondences suggests to me that the proto-language had a single phoneme (*r) with a rather wide spectrum of allophones, and that the ‘daughter’ dialects perpetuated this variety, only adopting specific allophones locally fairly recently.

    In general the case of *r is interesting: modern European languages (at least the IE ones) all have this phoneme (an abstract phonological unit), but its phonetic realization is quite different according to the countries, regions, dialects, sociolects, etc. Whether someone uses a French, German, English, Italian, etc r is rarely an obstacle to communication: hearers identify the sound they hear with their “own” /r/.

  65. I am still reading the paper, but the suggestion that Sinitic is, in fact, a sub-branch within Bodic, is striking.

    I think this paper is misrepresenting van Driem. As far as I know, he argues for a Sino-Bodic branch but doesn’t subsume the former in the latter.

  66. Found van Driem’s classification of Tibeto-Burman:

    Western (Baric, Brahmaputran, or Sal):
    –Northern (Sino-Bodic):
    —-Northwestern (Bodic): Bodish, Kirantic, West Himalayish, Tamangic and several isolates
    —-Northeastern (Sinitic)
    —-Southwestern: Lolo-Burmese, Karenic
    —-Southeastern: Qiangic, Jiarongic
    –a number of other small families and isolates as primary branches (Newar, Nungish, Magaric, etc.)

    Yes, it makes more sense.

    I wonder what is the date for divergence of Sino-Bodic into Bodic and Sinitic.

    For historical reasons, it should be around 2000 BC. 1500 BC at most, because the first oracle inscriptions, circa 1300 BC, are already written in recognizable Chinese.

  67. In one of van Driem’s articles I’ve read an interesting idea that the Shang, Zhou and Qin all spoke different Sino-Tibetan languages. And one Chinese script easily served all three languages just as it serves different Sinitic languages (aka Chinese dialects) today.

  68. David Marjanović says

    The exception θ suggests an intermediate step d (or t if allophonic with d).

    Not necessarily: the French of one of the Channel Islands has turned /r/ into [ð].

    In general the case of *r is interesting: modern European languages (at least the IE ones) all have this phoneme (an abstract phonological unit), but its phonetic realization is quite different according to the countries, regions, dialects, sociolects, etc. Whether someone uses a French, German, English, Italian, etc r is rarely an obstacle to communication: hearers identify the sound they hear with their “own” /r/.

    I only noticed a few years ago that my grandmother seems to use [ʀ] and [r] at random, the latter being uvularized or something to sound as much like [ʀ] as possible.

  69. marie-lucie says

    David: the French of one of the Channel Islands has turned /r/ into [ð].

    Even better!

    I remember reading an anecdote involving rural Bourguignons in which this also occurred, as in père ‘father’ transcribed as pèthe.

    I wonder if the sound in question is actually interdental or just dental /alveolar. The latter seems more plausible phonetically, although it could easily have evolved into the former.

    my grandmother seems to use [ʀ] and [r] at random

    When I was still a child I noticed that my maternal grandfather (a native speaker of Occitan) seemed to do that, but I was much too young to try to pay attention to the environments in which this happened. Much later I realized that he used a strongly raspy [ʀ] in words with medial rr, like carré ‘square’, especially to emphasize the word, and [r] otherwise as in Occitan. My grandmother, who was from the same village, only used [r].

    The consciousness of this allophony led me to speculate that the historical switch from [r] to [ʀ] in French, apparently starting among lower-class Parisians, might have started in the same allophonic context and later expanded to all instances of the phoneme. I have not looked up references lately, so I don’t know whether this hypothesis has been considered, whether positively or negatively.

  70. David Marjanović says

    I wonder if the sound in question is actually interdental or just dental/alveolar.

    Acoustically that’s exactly the same thing. (At least there’s a broad overlap.)

    Much later I realized that he used a strongly raspy [ʀ] in words with medial rr, like carré ‘square’, especially to emphasize the word, and [r] otherwise as in Occitan. My grandmother, who was from the same village, only used [r].

    That’s interesting. Starting from the same -r-/-rr- distinction as in Spanish, Portuguese has turned rr into [ʀ] (further changed into [χ] and [h] in different parts of Brazil*); Wikipedia says that at least one Occitan dialect has done the same, citing the minimal pair /gari/ “healed” vs. /gaʀi/ “oak”. Perhaps your grandfather tried to apply this to French based on its spelling?

    * Rio de Janeiro [ˌhiud͡ʒaˈnei̯ru].

  71. “This would be the Hongshan culture of Neolithic Inner Mongolia and Liaoning. The Hongshan farmers were the direct ancestors of Lower Xiajiadian culture of Inner Mongolia which in turn was ancestor of early nomadic Upper Xiajiadian culture (which I, and not a few Chinese archaeologists, link with origins of Mongolian people).

    So it’s quite likely that the Shang Chinese borrowed practice of “Pyro-osteomancy” from ancestors of Mongolians, they just substituted sheep bones with turtle shells.”

    I cannot find any mention of Hongshan in the article, and the tables at the end of the article do not list Hongshan sites as containing them. The practice seems to have appeared first in the Fuhe culture, which is usually listed separately from Hongshan and further north/west of it.

    Also, if you read the article, you’ll see that it was not the Shang Chinese that borrowed this practice, because it was already widespread in China during the late Neolithic, between 3,000 BC and 2,000 BC, which was ~1,000 years before the rise of the Shang even by the earliest dating. The practice does appear to have spread into China from the north, but it had done so before the rise of anything that could be properly called “Chinese.”

    Pay particular attention to section 3.1.2. which describes the Longshan cultural horizon in northern and central China. It would seem that oracle bones were used in divination in China as early as 2,700 BC or 4,700 years ago and that it was first used by “wandering diviners”, perhaps corresponding to traveling fortune tellers today, who had diverse practices and who might have formed a specific social class of artisans working initially for villages and then, eventually, for urban centers and governments.

    Pyro-osteomancy also developed differently in different regions, and by the time of the Shang, the practices within China had taken off in their own direction, for example in the use of turtle shells, separate from the practices to the north. Nonetheless it appears that these diviners from different regions maintained some contact/information flow, as there were also mutual influences.

    So I wouldn’t describe it as a matter of Chinese ancestors borrowing from Mongolian ancestors, especially as the ancestors of Mongolians might not be fundamentally separate from the ancestors of Chinese.

  72. marie-lucie says

    David: Starting from the same -r-/-rr- distinction as in Spanish, Portuguese has turned rr into [ʀ] (further changed into [χ] and [h] in different parts of Brazil*); Wikipedia says that at least one Occitan dialect has done the same, citing the minimal pair /gari/ “healed” vs. /gaʀi/ “oak”.

    I did not know either of these two phenomena, but they confirm my hunch of the (independent) origin of French [ʀ].

    I would like to read the Wikipedia page. What keyword(s) should I search for?

    Perhaps your grandfather tried to apply this to French based on its spelling?

    I don’t think it was necessarily his own adaptation. I remember him mentioning on several occasions that one of his teachers, also an Occitan speaker, would take advantage of his own and his pupils’ bilingualism to help them with French spelling. For instance, in many cases written French en and an correspond to the same nasal vowel, making it difficult to remember the spelling from the words heard, but the Occitan cognates (in which the vowels are not nasalized, just followed by a nasal consonant) give relevant clues, for instance that French for ‘wind’ is written vent not *vant. So the teacher might have had the r/rr distinction in his own Occitan pronunciation and also in his French pronunciation (probably exaggerated for the purpose), and used the alternation as another clue for the students to remember some French spellings.

    I don’t remember my grandmother ever making that distinction although both grandparents spoke the same dialect. I gather that her school experiences (probably all with nuns as teachers) were not very positive, and she did not speak about them very much. I know that as a teenager she had been sent to a Catholic boarding school, where her teachers’ speech might have been a more standard French than the local Occitan-coloured variety.

  73. David Marjanović says

    I would like to read the Wikipedia page.

    I was citing from memory… the information is still there, but it’s now a bit hidden in the table under this paragraph.

    Note that the map next to that paragraph is, except for Italy, thoroughly outdated; there’s a reason why the German-speaking part of Italy is marked in the deepest purple!

    I remember him mentioning on several occasions that one of his teachers, also an Occitan speaker, would take advantage of his own and his pupils’ bilingualism to help them with French spelling.

    Ooh, that makes a lot of sense!

  74. marie-lucie says

    Thanks David. In the table I notice that Catalan correr is given as an example of rr realized as [ʀ], even though the section on Catalan phonology (in another Wiki page) only mentions [r] in this context.

  75. George Gibbard says

    On the “Uvular trill” page, looking at the second column of the table, it says Catalan rr is [ʀ] in “some northern dialects”, and there is a footnote citing:
    Wheeler, Max W. (2005), The Phonology Of Catalan, Oxford: Oxford University Press.

  76. George Gibbard says

    Meanwhile, as you would guess from the orthography (given correctly as córrer), the stress in the dialect transcription [koˈʀe] ‘to run’ is non-standard, assuming it isn’t a mistake.

  77. George Gibbard says

    And Wheeler is in Google Books! Page 24 says: “A trill, with two to four contacts, is found in a syllable onset at the beginning of a root or a lexical prefix (11a), after a heterosyllabic consonant (11b), and between vocoids word-internally (11c). Only in the last of these contexts is a contrast with a tap available. […] In north Catalonia, and in the town of Sóller (Majorca), a uvular trill ([ʀ]) or approximant ([ʁ]) can be heard instead of an alveolar trill.” So there is supposed to be a /ʀ/ : /ɾ/ contrast.

    Meanwhile, searching does not turn up the infinitive córrer ‘to run’ as an example in the book, except as the abstract root /koRR/ and other inflected forms. Maybe the Roussillon dialect form of the infinitive is in the book but not in the Google preview.

  78. Josep Roca-Pons taught me no other pronunciation for Catalan rr than [r], but I one heard a friend named Montserrat introduce herself with a uvular. I was surprised and thought she was being Frenchified to impress the other person. But her mother was from Gerona.

  79. David Marjanović says

    Here’s a considerably expanded version of DeLancey’s paper: https://www.academia.edu/3894773/The_Origins_of_Sinitic

    And here is his slightly earlier paper on why not only Sinitic, but also Lolo-Burmese, Bodo-Garo and to a lesser degree Tibetan are close to the isolating end of the scale.

  80. David Eddyshaw says

    Thanks, DM!
    I was interested to read (in the earlier paper)

    Definitions of the creoloid pattern tend to be simplistic and of limited usefulness. For example, some scholars (e.g. McWhorter 2001) claim lack of phonemic tone to be a characteristic of creole languages, a notion which is not particularly useful in an East/South Asian context …

    which chimes with part of my own beef with McW. However, he goes on to say

    Much of the recent discussion of these issues has been framed in terms of one or another notion of “complexity”. For our purposes a simpler and more easily defined value is what we may call transparency. A characteristically creoloid morpheme has a unitary, coherent meaning, which is inherent to the morpheme itself, not dependent on paradigmatic or syntagmatic relations to other morphemes.

    This idea of “transparency” is very interesting. However, it looks rather to be more of a criterion for distinguishing agglutination from fusion. It would make Turkish (say) pretty “creoloid”, it seems to me, which is probably not the intended effect.

    I suppose, depending on the amount of work you make “paradigmatic” do, you can rescue typical Bantu languages from the aspersion of creoloidism on the grounds that although many of them have pretty lego-like agglutinative morphology, in nouns this comes in lots of unpredictable noun-class dependent sets, so there’s no one “plural” prefix, and no plural prefix just means “plural.”

    Still: hmmph. If you try to get out of this by simply declaring that grammatical gender is intrinsically uncreoloid, you seem just to be in danger of falling back into McWhorterism after a brief glimpse of formal rigour: creoles are just “simple” (on some aesthetic metric to be chosen by the investigator.) Why should grammatical gender be “complex” if having lots of fine tense/aspect distinctions isn’t?

  81. John Cowan says

    I on[c]e heard a friend named Montserrat introduce herself with a uvular.

    She might just have been following the apparently worldwide idiolectal rule: If you can’t trill, growl. Any sufficiently canine /r/ will do.

  82. David Marjanović says

    This idea of “transparency” is very interesting. However, it looks rather to be more of a criterion for distinguishing agglutination from fusion. It would make Turkish (say) pretty “creoloid”, it seems to me, which is probably not the intended effect.

    Elsewhere in the paper it seems to be important whether grammatical morphemes can (still) be matched up with synchronically occurring free-standing words; in the “creoloid” ST languages that seems to generally be the case.


    I prefer doves over dogs as the comparison. Growling with an actual [ʀ] doesn’t sound like a dog to me; it sounds like the Gorn (though he approximates a little).

  83. Canine growling is generally better imitated by an epiglottal trill; which, strangely, has no IPA or even extIPA symbol as distinct from the fricative(s). (I’ve sometimes used [ᴙ] as an ad hoc substitution.)

  84. I assure you I’ve never heard Montse growl.

  85. ə de vivre says

    If you try to get out of this by simply declaring that grammatical gender is intrinsically uncreoloid, you seem just to be in danger of falling back into McWhorterism after a brief glimpse of formal rigour: creoles are just “simple” (on some aesthetic metric to be chosen by the investigator.) Why should grammatical gender be “complex” if having lots of fine tense/aspect distinctions isn’t?

    Isn’t there a principled way to think about what features would count as “simple” in creole languages by appealing to what elements of language are most available to meta-linguistic awareness (in the Michael Silverstein sense)? That is, if we restrict “creole” to mean pidgins that have become full languages, the language had to pass through a stage where its basic elements were created by adult speakers consciously acquiring a new code. This predicts that creole grammars will be built from elements that are referential and form continuous segments. So representing TAM distinctions by particles rather than stem alternations would be more segmentable. Having bound morphemes transparently derived from content words (or content words in the source language) would be more referential. I’m no creolist, but it seems to hold up with what I’m familiar with. This approach would also predict that things like noun gender and tone would be more likely to appear in creoles where there’s a strong areal influence toward these features, and rare where there isn’t. It also predicts that creole noun gender would be more likely to appear in a Bantu-like system of distinct affixes on the noun itself than an Indo-European complex agreement paradigm. This seems like a suspiciously easy solution to me, so I’m curious what more knowledgeable hatters think.

  86. David Eddyshaw says

    I think that’s a pretty good account of what DeLancey actually means; as DM pointed out, the paper does in fact go on to talk about pretty agglutinative structures as being “creoloid”, so I suspect that DeLancey actually wouldn’t regard Turkish as any sort of embarrassing counterexample after all. He’s not really concerned with word boundaries at all, but only with the integrity of morphemes. It does seem to take him quite far afield from McWhortery creoleness, but that’s probably a feature rather than a bug.

    Bantu (as I’ve banged on about before) has lots of morphology, and this has been unthinkingly assumed to represent the prelapsarian ur-state of all of Niger-Congo all too often. But quite apart from the fact that this is extremely hard to square with evidence from practically every other branch of the phylum, the very regularity and transparency of typical Bantu flexion imply that the system is of relatively recent origin and/or that it’s been continually reshaped to keep it transparent despite the ravages of phonological change over the millennia. In other words, it’s become creoloid, in DeLancey’s sense.

    I don’t know anything about Tibetan, but I recall reading that in the development from Classical to modern Lhasa Tibetan it’s striking that the actual grammatical categories expressed have not changed very much: what has changed is that they are now expressed much more analytically. There’s an interesting parallel with Egyptian: Polotsky’s extraordinary work on Egyptian syntax went from recognising what the system of (analytically expressed) “second tenses” in Coptic meant, to hypothesising that Coptic was expressing distinctions analytically which were not new, but had been expressed synthetically in previous stages of the language.

    I’m not sure that this sort of “creoloidisation” need necessarily be driven by language contact or large-scale imperfect learning. It seems indeed to be a common diachronic process: the hypothesis that it can only arise by abnormal language transmission seems logically quite separable from the observation that it happens a lot. I can see circular arguments readily arising from the assumption that it must always be so.

  87. “Becoming Creoloid” was supposed to be the followup to “Turning Japanese,” but the band broke up and the master got lost.

  88. David Eddyshaw says

    Trying to refresh my memory about Humboldt’s currently-in-bad-odour Cycle, I came across this


    which contains the excellent statement by Georg von der Gabelentz (known to me as a great Sinologist, though I was dimly aware that he’d done other stuff too):

    The history of language moves in the diagonal of two forces: the impulse toward comfort, which leads to the wearing down of sounds, and that toward clarity, which disallows this erosion and the destruction of the language. The affixes grind themselves down, disappear without a trace; their functions or similar ones, however, require new expression. They acquire this expression, by the method of isolating languages, through word order or clarifying words. The latter, in the course of time, undergo agglutination, erosion, and in the mean time renewal is prepared: periphrastic expressions are preferred…always the same: the development curves back towards isolation, not in the old way, but in a parallel fashion. That’s why I compare them to spirals.

    What I would add in this context, is that the affixes don’t always “grind themselves down” or “disappear without trace”: they can be remodelled by analogy or simply resist erosion while remaining affixes. The Cycle (if you believe in such a thing) can move back from fusion to agglutination.

  89. ə de vivre says

    Yeah, it seems like there are two questions: (1) To what extent does being a creole predict that a language will have a certain collection of features? A question you can answer by looking at languages around the world that we know, for historical reasons, to be creoles. (2) To what extent does possessing a certain collection of features predict that a language is a creole? In theory these are two different questions, but when linguists use “possessing characteristically creole features” as a shorthand definition of what a creole is, the whole thing gets rather circular, and the lesson here seems to be that using typology to find hidden creoles is, at this point, a dubious practice.

    Even if we assume, for the sake of argument, that the language of the first generation of L1 creole speakers will always have certain “simple” features, it doesn’t follow that finding hidden creoles would be a straightforward test of whether or not any given language possesses this bundle of features.

    Your point about Bantu transparency is interesting. In the case of the Turkic languages, we know that the population movements starting with the rise of the Mongol Empire resulted in a lot of leveling across genetically divergent Turkic families. I wonder if frequent intra-family contact and leveling is predictive of morphological segmentability and/or transparency, or if it just reinforces tendencies that already exist in families. There are a lot of possibilities that might plausibly be true, but it seems like only a few of them have been adequately explored so far.

  90. David Eddyshaw says

    The resistance of affixes to phonological attrition is visible practically in real time in the One True Human Language, on account of the way Kusaal deletes word-final short vowels in most but not all contexts, which is a great way of attriting suffixes.

    In noun flexion the great majority of deviations from the regular pairing of sg and pl noun class suffixes are due to the “wrong” suffix being picked in order to avoid an ambiguous form that would otherwise have resulted after final vowel deletion. In verb flexion, it’s not possible to avoid ambiguity by drafting in a suffix from a different paradigm, because there is only one conjugation of verbs which inflect for aspect, and they all (except one) take the imperfective suffix *-da.

    There’s no problem with stems ending in consonants which insert an epenthetic vowel in the consonant cluster between the stem-final consonant and the *d:

    perfective zabɛ, imperfective zabida “fight”, normally appearing as zab, zabid.

    There is a problem with stems ending in n or m, which undergo the assimilations *nd -> nn and *md -> mm. After the deletion of final vowels and simplification of word-final nn, mm, this results in identical perfective and imperfective forms. Accordingly, in contexts where ambiguity is actually possible the clusters *md *nd insert an epenthetic vowel instead of assimilating: this is never permitted in noun flexion at all.

    M karim nɛ. “I’m reading.”
    M pʋ karimma. “I don’t read/am not reading.” (never *M pʋ karimida.)


    M daa karimid. “I was reading.” (M daa karim can only be perfective: “I read.”)
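
    The interaction of the rules above can be sketched as a toy model. This is only an illustration under loud assumptions, not a description of Kusaal from a grammar: stems are plain strings, the epenthetic vowel is always taken to be "i", every word-final vowel is treated as short, and the function names and the `avoid_ambiguity` and `keep_final_vowel` flags are invented here to stand in for "contexts where ambiguity is actually possible" and "contexts where the final vowel is not deleted".

```python
# Toy model of the imperfective forms discussed above (illustrative only).
# Assumptions: epenthetic vowel is always "i"; any final vowel counts as short.

VOWELS = set("aeiouɛɔʋɩ")


def imperfective_underlying(stem: str, avoid_ambiguity: bool = False) -> str:
    """Attach the imperfective suffix *-da to a consonant-final stem."""
    last = stem[-1]
    if last in "nm" and not avoid_ambiguity:
        # Assimilation: *nd -> nn, *md -> mm
        return stem + last + "a"
    # Otherwise an epenthetic vowel separates the stem from *d
    return stem + "ida"


def surface(form: str, keep_final_vowel: bool = False) -> str:
    """Delete a word-final vowel, then simplify word-final nn/mm."""
    if keep_final_vowel:
        return form
    if form[-1] in VOWELS:
        form = form[:-1]
    if form.endswith(("nn", "mm")):
        form = form[:-1]
    return form


if __name__ == "__main__":
    print(surface("zabɛ"))                            # zab
    print(surface(imperfective_underlying("zab")))    # zabid
    print(surface(imperfective_underlying("karim")))  # karim
    print(surface(imperfective_underlying("karim"), keep_final_vowel=True))      # karimma
    print(surface(imperfective_underlying("karim", avoid_ambiguity=True)))       # karimid
```

    With assimilation, the imperfective of karim surfaces as karim, identical to the perfective form cited above; switching to epenthesis gives the distinct karimid, while the form with its final vowel retained stays karimma.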

  91. David Marjanović says

    Canine growling is generally better imitated by an epiglottal trill;

    *lightbulb moment*

    which, strangely, has no IPA or even extIPA symbol as distinct from the fricative(s).

    The International Phonemic Alphabet strikes again: the trill isn’t known to be a distinct phoneme from both the stop and the fricative anywhere, and it hasn’t been known from Europe for ages*, so it is denied a symbol.

    * The labiodental nasal, [ɱ], got its symbol for the English allophone at least 100 years before a /m/-/ɱ/-/n/-/mpf/-/mbv/ distinction was discovered in one language on the Congolese plateau in 1975.

    (I’ve sometimes used [ᴙ] as an ad hoc substitution.)

    That seems to be common among Caucasianists.

    Meanwhile, this paragraph on Wikipedia argues, with two citations from the same author, that the symbol already exists, because the “epiglottal fricatives” are better called “pharyngeal trills”. I must say this all seems too easy for me, especially given this.

  92. @David Eddyshaw: Interesting use of “the diagonal of two forces” there.
