ROMLEX is a project to document the major Romani (“Gypsy”) languages of Europe.

ROMLEX is not a Romani dictionary in the usual sense. ROMLEX is a lexical database. It contains data that are representative of the variation in the lexicon of all Romani dialects, and offers almost complete coverage of the basic lexicon of the Romani language. At present, data are available online covering 25 different Romani dialects. These are accompanied by translations into English and, depending on the Romani dialect, into other European languages as well. By providing an electronic resource of the highest quality, which can constantly be updated, the ROMLEX database can serve as a foundation for future dissemination of Romani literary resources and Romani language literacy itself.

You can access the database itself here; if you want some information about the various dialects, it’s here; and here is a discussion of the Roma, Sinti, and Calé and where they and their language came from:

Roma means all groups residing in central and eastern Europe, or respectively, those who in the 19th and 20th century emigrated from central and eastern Europe to western Europe and overseas. The term Sinti comprises those subgroups which entered the German-speaking cultural area at a relatively early point in time and who for the most part live in western Europe today (Germany, France, Italy, Austria, etc.). Calé defines, among others, groups who have been living for a long time on the Iberian Peninsula (Spain, Portugal)…

Proto-Romani is believed to have split from subcontinental Indo-Aryan during the transition period from Middle to New Indo-Aryan. It retains some conservative features especially in the verb inflection, but also in nominal inflection. Phonology and lexicon point to an ancient affinity with the so-called Central Indo-Aryan languages, such as Hindi. On the other hand, there are morphological and arguably some phonological parallels with the languages of the extreme Northwest, such as Kashmiri. It is therefore assumed that Proto-Romani split off from the Central branch, then underwent a shared areal development with the North-western languages, before leaving India. A similar profile is shared by Domari, the language of the Near Eastern Dom. The linguistic history of both groups thus points to successive migrations of the speaker populations, leading ultimately to their present locations.
Proto-Romani must have been spoken in Asia Minor by the eleventh or twelfth centuries. It absorbed Iranian and Armenian influences. The strongest impact however was Greek, which has made a significant contribution not only to the Romani lexicon but also to derivational and inflectional morphology and to the syntactic typology of Romani. Features such as the preposed definite article, Verb-Object word order, and the split between factual and non-factual complementisers can be attributed to Greek influence, while the emergence of prepositions, the reduction and ultimate loss of the infinitive, and the structure of relative and adverbial clauses may have been triggered already by Iranian influence…

Thanks for the tip, peacay, and I expect the omniscient zaelic to show up at any minute and provide his own Romaphilic insights.


  1. I saw this a few months ago before it was up and running. I decided to give it a test. A Rom acquaintance of mine works at an NGO dealing with Roma human rights. Romani has become one of the recognized languages in Europe, and so there is a lot of work translating documents into Romanes, which is pretty much symbolic, since almost nobody involved in non-governmental Roma life spends much of their time reading EU documents in Romanes.
    My friend’s problem was that there is no agreed-on Romani word for the abstract term “society.” There is “how we do things” and “how the Gaje do things.” Most modern “mashkartemengipe” (international) written Romani is based on the Lovara dialect, which lists a Hungarian loan word “tarsasago” (from Hung. ‘tarsasag’). Romanian varieties use terms based on Romanian, and most of the other eastern dialects use the Slavic “drustvo.” In Burgenland you get “farajn” from ‘Verein.’ But there is no commonly agreed term for “society” that is used in written international Romani.
    Now, you don’t want to translate all your EU documents with phrases in which “adapt to society” is expressed by the somewhat awkward phrase “adapt to the way Gaje do things.”
    The same friend pointed out to me that, on the other hand, Romani has two definite terms for different kinds of fart. One for loud messy ones, and one for silent-but-deadlies.
    khaj: fart (noiseless)
    ril: fart (audible)
    řîl: fart (noisy)
    khaj: 1. smell, stench 2. fart
    East Slovak Romani:
    riľ: fart
    khaň: fart
    To quote my friend, what can you do with a language that has no abstract term for society, but two separate terms for fart?
    I can see I will have a lot of fun with this over Nitl…er… the Holidays… I see the term for ‘sausage’ “goj” is extremely widespread, whereas I had thought it pretty local. Thanks for making my Christmas ‘but feder losho!’

  2. Merry Christmas, all of you, Hat and fans of Hat!!!
    How do I learn more about the Calé?
    I’m enjoying listening to Radio Tarifa these days.

  3. Gregor Kwiek says

    Hi zaelic,

    there is a term for society in Romanes, although it will have different meanings in different contexts. The word manush means mankind (manusha malekind, manushnja femalekind). As a noun, manushipe, the word can be translated as civility and society; however, one cannot use it in the same context as when speaking about being a part of society (jekh kothor vaj partjia ando manushipe). The reason for this is simple: we have manushipe, whereas we may not see others as having it, because they may not practice the cleanliness rules that are a part of manushipe. When we speak of society in the sense of “Swedish society,” for example, or Romani society, we will say e luma (the world) or o them (the country). The problem is not finding words but that the semantics, the logic, is different from English. When I say e luma Romanes, I do not mean the earth; I say e phuv (the ground) when I mean the earth. So I think studies are needed to look at the semantics, usages, and so on. We at times move too quickly with things. I have met people who have learned Romanes and used the database, and when they spoke, some words just did not fit in the sentences they used.

  4. two separate terms for fart

    It’s part of the common Indo-European inheritance, that. English is only accidentally impoverished in this way.

  5. A better link (directly inspired by zaelic’s comment).
