Ramsey Nasser and Programming in Arabic.

Ramsey Nasser’s Artist’s Notebook page is absolutely fascinating to me, even though I barely understand a single thing he says. He starts off:

Arabic programming languages with the honest goal of bringing coding to a non-Latin culture have been attempted in the past, but have failed without exception. What makes my piece قلب different is that its primary purpose was to illustrate how impossible coding in anything but English has become.

About the last paragraph I more or less understood was this:

The current name قلب means Heart, but is actually a recursive acronym for قلب: لغة برمجة pronounced ‘alb: lughat barmajeh meaning Heart: A Programming Language. Acronyms in Arabic are generally difficult to pull off, and قلب is the first recursive one I have seen. Recursive acronyms – acronyms where the initial letter stands for the acronym itself – are common in computer science humor. PHP stands for PHP: Hypertext Processor, GNU stands for GNU’s not Unix, and so on. قلب’s name connects it to that tradition of software engineering names.

After that it got too technical for me, but the illustrations are pleasing to look at, and it ends with a nice piece of tile calligraphy. For informed commentary (also incomprehensible to me), go to the related MetaFilter post. (Thanks, ardge!)

Comments

  1. I was intrigued by the statement that programming was performed in specifically American English. Certainly, most programming languages incorporate English words into their structures, but I would not have thought that it could be picked out as a specific kind of English. I suppose that some command names may use words that have different spellings in different places, but I can’t think of any off the top of my head.

  2. I know that some Brits find the systematic use of “color” in programming-language libraries very annoying.

  3. Off the top of my head, some idiomatic English, not easily translatable, in some programming languages: ‘else’ (after ‘if’), ‘get’, ‘put’, ‘do’, ‘for’, ‘not’ (distinct from ‘no’). Most languages have something like Adj-N and SVO word order, which are of course not universal.

    While we’re at it, syntacticians’ usage of ‘left’ for the beginning side of an utterance and ‘right’ for its end can come off as unnatural or even chauvinistic to this user of a right-to-left script.

  4. Stefan Holm says:

    A computer’s language, binary math, is absolutely neutral vis-à-vis any spoken one. “Bringing coding to a non-Latin culture” however has to start with a (native) mathematician, not a linguist, to get it on track from the beginning. Nasser’s attempts have to pass through several stages of English down to the compiler and the binary code before reaching the machine’s CPU. It’s like letting Google translate something through multiple languages – it almost inevitably ends up as mumbo jumbo.

    In its very core the electronic language is the simplest one ever with just two letters (0 and 1) and two actions taken upon them (add or subtract). If they succeed in constructing a quantum computer though, things will be somewhat more complicated. But in essence, Nasser is trying to do the translation at the wrong level (too high up in the hierarchy).

  5. LISP must be quite adapted to VS languages like Arabic.

  6. What I know about computer programming would fit in a very small thimble, so I’ll comment only on Ramsey Nasser’s remark that creating acronyms in Arabic is difficult: Hebrew creates them easily and does so at a furious pace. The phenomenon seems due to the triliteral root system and the common absence of vowels in the written language, allowing the creator of the acronym to use any vowels he wants. The modern language abounds with them, and they are amenable to evolution as verbs as well.

    A very old acronym, dating I believe to Talmudic times, is דו”ח doḥ, or more formally duaḥ. It stands for דין וחשבון din vi-ḥeshbon and means ‘report’ (noun). At some point, possibly since the modern revival of Hebrew, a verb לדוויח le-daveaḥ (to report) appeared. Both noun and verb are in everyday use.

    I wonder how Arabic differs such that it’s resistant to these constructions.

  7. GeorgeW says:

    “. . . Ramsey Nasser’s remark that creating acronyms in Arabic is difficult: Hebrew creates them easily and does so at a furious pace.”

    I wonder if Modern Hebrew’s Germanic influence may facilitate the use of acronyms more than Arabic. (It has been argued that Modern Hebrew is actually a Germanic language with Hebrew orthography and lexicon).

    Maybe there is a stronger root-semantic influence in Arabic which deters freely producing acronyms. I can imagine that many potential acronyms would lead one down a false semantic path. As an example, al-Azhar University would be abbreviated to a word meaning ‘agitated’ (hmm), or the Muslim Brotherhood, a word meaning ‘drinking vessel.’

  8. For what it’s worth, one of the weirder rabbit holes in the hacker world is construction of non-linguistic computer languages. The granddaddy of them all is ‘brainfuck’:

    http://en.wikipedia.org/wiki/Brainfuck

    but there are many others, which you can see and link to from Wikipedia’s ‘brainfuck derivatives’ page.

  9. Hebrew acronyms long predate the MH period. As to why Hebrew likes them and Arabic doesn’t, you might as well ask why English accepts loanwords so readily and Icelandic resists them so strongly.

  10. Hebrew acronyms long predate the MH period

    That’s reaching pretty far back. Can you point to an example? I wonder if the surviving bits of (non-Judeo) Aramaic or even Amharic like acronyms.

    Icelandic vs. English: Good point. What you’re saying is that the matter has little to do with language structure and much more to do with surrounding culture. Nature vs. nurture.

  11. Sorry, I meant Modern Hebrew, not Mishnaic Hebrew. (I should have written ModH or IH.) Just think of all those acronymic names.

    I would say it has to do with the culture of language; language, after all, is a bearer of culture as well as being a component of culture. Less drastically but more complexly, Serbian and Croatian are almost identical as languages, but the Croatian culture of language resists loanwords but often leaves their spellings unchanged as English does, whereas the Serbian culture of language accepts loanwords freely but respells them to fit in with its unique phonetic-über-alles culture of writing as Icelandic does — intensified in the Serbian case by its unique pervasive biscriptalism. There is nothing, as far as I know, in the rest of Croatian and Serbian culture that would enable you to predict these facts.

  12. Pfffft. That 68-word sentence “Less … biscriptalism” is the sort of thing that rolls off my fingers when I don’t go back and edit myself for straightforwardness. No one would ever mistake me for Papa Hem even when I do, but I don’t usually inflict that sort of thing on my readers. Sorry about that.

  13. I didn’t even notice, but then I’m given to producing long sentences myself.

  14. FYI my RSS feed is displaying all your recent posts as right justified, with wonky punctuation.

  15. John wrote a sentence. It was long. He apologized.

  16. FYI my RSS feed is displaying all your recent posts as right justified, with wonky punctuation.

    Must be an effect of the Arabic in this post. Anybody else having that problem?

  17. Stefan Holm says:

    In the case of Icelandic insular isolation must be emphasized. The main trigger in the change from ON to modern peninsular Scandinavian was undoubtedly the influence of the Low German (Plattdeutsch) speaking merchants of the Hanseatic League. Their impact on phonetics, vocabulary, inflections, pre- and suffixes etc. is comparable to the Norman one on Englisc (leaving aside that ON and Plattdeutsch were quite close from the beginning).

    The Hansa traders however didn’t care much about Iceland and consequently our runaway brethren up there kept their language intact. Combined with their strong sense for poetry, story telling and writing it became a player piano enforcing conservatism in language.

    Even after the Hansa era we’ve been under constant influence from High German (1500-1650), French (1650-1800), High German again (1800-1945) and English (1900-….). Nothing of this has really affected Iceland – e.g. they never had an aristocracy, inbred with the continental one. But today times are changing – as the world is getting smaller I hear that the young generation is increasingly including English loans in their speech.

    So, not that I oppose the idea of language being both bearer and component of culture but geography as a factor shouldn’t be ignored.

  18. Hat, you probably wouldn’t, given the sort of prose you consume professionally. But only now do I notice that the first occurrence of unique doesn’t belong there, since Serbian is not unique in its phoneticism, only its biscriptalism.

    One of my email signatures says:

    I must confess that I have very little notion of what [s. 4 of the British
    Trade Marks Act, 1938] is intended to convey, and particularly the sentence
    of 253 words, as I make them, which constitutes sub-section 1. I doubt if
    the entire statute book could be successfully searched for a sentence of
    equal length which is of more fuliginous obscurity. –MacKinnon LJ, 1940

    Paul O: “Introduces, in this paragraph, the device of sentence fragments. A sentence fragment. Another. Good device. Will be used more later.” —David Moser, “This Is the Title of This Story, Which Is Also Found Several Times in the Story Itself”

    I see no serious wonkiness when viewing this post in Feedly; browsers are more likely to get this sort of thing right than random desktop or mobile apps.

  19. Icelandic insular isolation

    Apt alliteration’s artful aid, crossed with a semantic doublet. (Though as Tolkien points out, alliteration alliterates on /l/.)

  20. vrai.cabecou says:

    When it comes to Arabic acronyms, does anyone know how accurate this is?

    “Fatah: Acronym for Harakat al-Tahrir al-Falistiniya, the Palestinian Liberation Movement, with the first letters in reverse order giving FATAH which means conquest (whereas the word derived from the normal abbreviation Hataf means « death »).”

  21. The EB believes it, for what that’s worth. Note, however, that fatḥ (the second vowel is parasitic) can also mean ‘victory’; translating it solely as ‘conquest’ is tendentious.

  22. GeorgeW says:

    vrai.cabecou: I personally don’t know about the accuracy of Fatah being a “reverse acronym.” It could be accurate, or it could be folk etymology. If true, this would be a good example of an Arabic acronym. I am really pushed to think of another. The normal root order for the name, with its meaning, is a good example of the potential problem of Arabic acronyms. Most consonant combinations would give a word meaning, some of which might have negative connotations, misleading or trivial. I doubt if a serious religious or political organization, as an example, would want to be known as ‘toilet,’ ‘carrot,’ ‘toe’ or worse.

  23. Eww, non-English programming languages. Thank God, I was young enough to avoid monstrosities like Russian Cobol or any of the home-grown programming languages.

    Since Russian is inflected, using Russian in the code, where you can’t use inflection, obviously, always felt very stilted. I guess it’s not as bad in English (for the native speakers).

    Algol-68 was designed to be translatable/translated (http://jmvdveer.home.xs4all.nl/report.html#115). In USSR “Revised Report” (RR) was published bilingually (but RR is useless as a manual). “Informal Introduction” used Russian Algol and I loathed it for that (the implementation we had on IBM clones used English). Admittedly, translation of the RR was an astonishing feat. Original English is very technical with creative use of puns and neologisms (some say it’s weapon-grade gibberish).

  24. Well, lingua romana perligata uses inflection, indeed depends on it. Here’s programmus primus, the sieve of Eratosthenes:

    #! /usr/local/bin/perl -w
    use Lingua::Romana::Perligata;
    maximum inquementum tum biguttam egresso scribe.
    meo maximo vestibulo perlegamentum da.
    da duo tum maximum conscribementa meis listis.
    dum listis decapitamentum damentum nexto
        fac sic
            nextum tum novumversum scribe egresso.
            lista sic hoc recidementum nextum cis vannementa da listis.
        cis.

    Which compiles to:

    print STDOUT 'maximum:';
    my $maxim = <STDIN>;
    my (@list) = (2..$maxim);
    while ($next = shift @list)
        {
            print STDOUT $next, "\n";
            @list = grep {$_ % $next} @list;
        }

  25. Too bad we can’t ask native Latin speakers how they feel about it :)

  26. Stefan Holm says:

    insular isolation … semantic doublet

    I was about to say “etymological doublet, you mean?” but soon realized that I was fooled by German Insel and Swedish insulär, which refer only to islands, as opposed to isolieren/isolera, which usually don’t. Online Etymology Dictionary however told me that the semantics differ somewhat on the British Isles. A trap to always be aware of when dealing with closely related languages.

  27. One may be physically isolated without being an insular thinker, especially with the Interislandnet.

  28. Breffni says:

    My favourite antique peeve: “The affected, frenchified and unnecessary word isolated is not English, and we trust never will be.” (OED, ‘isolated’, 1800 quotation.) OED explains that “the French isolé was at first used unchanged or with -d , isolé’d”. Isolate is a back-formation.

  29. Obie said he was gonna isolate us behind a firewall. He said: “Kid, I’m gonna isolate you behind a firewall. I want your RFCs and your Lions Book.”

    I said, “Obie, I can understand your wantin’ my protocol descriptions, so I don’t have any documentation about the firewall, but what do you want my obsolete V6 kernel source for?” and he said, “Kid, we don’t want any unexpected panics.” I said, “Obie, did you think I was gonna firestorm my local network for litterin’?”

    Obie said he was makin’ sure, and, friends, Obie was, ’cause he took out the CTRL and ALT keys so I couldn’t give a three-finger salute and warm-boot, and he disconnected my 10baseT cable so I couldn’t speak the BOOTP protocol to a friendly host and reload the OS image over the network. Obie was makin’ sure.

    —Arlo, as parodied by yours truly

  30. Alon Lischinsky says:

    “. . . Ramsey Nasser’s remark that creating acronyms in Arabic is difficult: Hebrew creates them easily and does so at a furious pace.”

    I have no Arabic and only minimal Hebrew, but my guess is that the smaller vowel spectrum of Arabic plays a role here. Hebrew has a larger pool to select from in order to avoid semantic collisions, so that, say תנ״ך (Tanakh) does not sound like תנוך (tenukh, ‘earlobe’).

  31. GeorgeW says:

    Alon Lischinsky: Also, it is worth noting that Mod Hebrew has a way of marking acronyms (with a special character right before the last letter) so that they are not confused with ordinary words. Arabic has no orthographic means of distinguishing acronyms from ordinary words.

  32. Alon Lischinsky says:

    @GeorgeW:

    Mod Hebrew has a way of marking acronyms

    While true, that is only relevant in writing (and there’s nothing to say that acronyms could not be similarly marked in Arabic, say, by writing them only using the independent forms of the letters). It has no bearing on homophony.

  33. GeorgeW says:

    Alon Lischinsky: Good points. I was thinking that Hebrew might be unique in marking acronyms, but then I remembered we do this for some in English with all Caps. Orally, we sometimes create a word (like nato, potus), other times we just use the initials (like juIn, InwaIpidi).

    Of course Arabic, like Hebrew, does not have capital letters. In addition, writing letters unjoined, independently in Arabic is very unusual. But, there is no reason this could not be used for acronyms.

  34. In addition, writing letters unjoined, independently in Arabic is very unusual. But, there is no reason this could not be used for acronyms.

    For handwriting, yes. But for anything out of a modern word processor juggling would be required because the software automatically re-shapes the letter depending on its position in the word. Try it by copying an Arabic word from Wikipedia or some other source, drop it into Word, and then insert word spaces between the letters.

    Which makes me wonder about Farsi: It uses a slightly modified version of the Arabic alphabet but because it’s an IE language I would think creating acronyms would be more difficult than in a Semitic language. Anybody up to speed on that?

  35. GeorgeW says:

    Paul Ogden: I tried writing Arabic with both spaces and periods with my iPad keyboard and both seemed to work. As an example, UNESCO:

    م ا م ت ع ث

    I wondered how to handle articles, clitics and conjunctions. So, I just ignored them. The result looks really weird. This needs more thought.

  36. There are two invisible Unicode characters called “zero-width joiner” (ZWJ) and “zero-width non-joiner” (ZWNJ) that allow control of Arabic shaping; you can type them in HTML with & #x200D; and & #x200C; respectively (without the spaces after the ampersands). The ZWNJ in particular forces any Arabic characters before or after it not to join with their neighbors, whereas the ZWJ forces joining (provided the character has a joined form at all).

    Thus, for instance, to generate the initial form of a letter in isolation, you simply follow it with ZWJ; to generate isolated forms without whitespace between, you separate all the letters with ZWNJ.

  37. David Marjanović says:

    I personally don’t know about the accuracy of Fatah being a “reverse acronym.” It could be accurate, or it could be folk etymology. If true, this would be a good example of an Arabic acronym. I am really pushed to think of another.

    “Hamas (Arabic: حماس Ḥamās, “enthusiasm”, an acronym of حركة المقاومة الاسلامية Ḥarakat al-Muqāwamah al-ʾIslāmiyyah, “Islamic Resistance Movement”) is the Palestinian Sunni Islamic or Islamist organization, with an associated military wing, the Izz ad-Din al-Qassam Brigades, located in the Palestinian territories.”

  38. But is that an acronym or a backronym?

  39. David Marjanović says:

    That one is definitely an acronym.
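
The zero-width joiner and non-joiner described in comment 36 are easy to try out in code. What follows is only a minimal Python sketch: the helper names `isolate_letters` and `initial_form` are invented for this illustration, it assumes unvocalized text (combining vowel marks would get split from their base letters), and whether the result actually displays as isolated or initial forms depends on the font and shaping engine of whatever renders it.

```python
# U+200D and U+200C are the code points given in comment 36.
ZWJ = "\u200D"   # zero-width joiner: asks for a joined form
ZWNJ = "\u200C"  # zero-width non-joiner: blocks joining with neighbors


def isolate_letters(word: str) -> str:
    """Put a ZWNJ between every pair of letters so none of them join.

    Hypothetical helper; assumes `word` has no combining marks.
    """
    return ZWNJ.join(word)


def initial_form(letter: str) -> str:
    """Follow a single letter with ZWJ to request its initial form.

    Hypothetical helper; has a visible effect only for letters that
    have a joining form at all.
    """
    return letter + ZWJ


if __name__ == "__main__":
    word = "قلب"  # the name of Nasser's language
    spelled_out = isolate_letters(word)
    print(spelled_out)
    # List the code points, since the added characters are invisible.
    print(" ".join(f"U+{ord(ch):04X}" for ch in spelled_out))
    print(initial_form(word[0]))
```

Because the inserted characters are invisible, printing the code points is the simplest way to confirm they are there. Unlike the spaces tried in comment 35, the ZWNJ leaves no visible gap between letters.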
