The World Loanword Database (WOLD) is the most amazing thing I’ve seen in a while, linguistically speaking. Lameen Souag took time off from thesis-writing to share it, and I’m glad I have neither a thesis to write nor (at the moment) work to do, so I can splash around in it to my heart’s content. Here’s their description:
It provides vocabularies (mini-dictionaries of about 1000-2000 entries) of 41 languages from around the world, with comprehensive information about the loanword status of each word. It allows users to find loanwords, source words and donor languages in each of the 41 languages, but also makes it easy to compare loanwords across languages.
Each vocabulary was contributed by an expert on the language and its history. An accompanying book is being published by Mouton de Gruyter (Loanwords in the World’s Languages: A Comparative Handbook, edited by Martin Haspelmath & Uri Tadmor)….
The database can be accessed by language, by meaning, by author, or by reference.
Here‘s the “Languages” page (with a nifty map: recipient languages are shown by a red symbol, donor languages by blue) and here‘s the “Vocabularies” one, with a percentage of loanwords for each language (ranging from Old High German at 6% to Tarifiyt Berber at 53%). I’ll give a random example of the kind of information you get when you dig down. Bezhta (Affiliation: Nakh-Daghestanian, Avar-Andic-Tsezic; the section is by Bernard Comrie and Madzhid Khalilov) has 32% loanwords; one of them is čarx ‘whetstone,’ the page for which tells us that it is from Avar čarx ‘whetstone’, from Georgian čarxi ‘lathe.’ It goes on to say:
Other comments: Georgian may be the ultimate source – cf. also the related verb čarxva ‘to grind (knife)’ – in which case the details of the direction of derivation are unclear
And if you click on “Contact situation: Avar as local lingua franca,” you get:
Avar is the single largest immediate source for loans into Bezhta, and contact with Avar has been intense for at least some centuries, with Avar serving as the main means of communication with the outside world, in addition to personal meetings between speakers of Bezhta and Avar. It is difficult to justify the assignment of a particular date to the beginning of this process, but as a rule of thumb we have taken the beginning of the eighteenth century, as it was during the eighteenth century that the Bezhta speaking area was incorporated into a larger, religiously Muslim community under Avar leadership. In addition to loans of indigenous Avar origin, Avar has also provided the major conduit for the introduction into Bezhta of words of ultimate Arabic, Persian, or Turkic origin.
I could spend weeks rooting around in here, and probably will as work and other interests allow. Thanks, Lameen, and good luck with your thesis! (There’s already discussion of possible errors at Lameen’s post.)
I have to say, I’m not thrilled about “languoid,” which they call “a (relatively new) cover term for ‘language’ and ‘language family,'” but I suppose I can get used to it.