A very interesting article by Eildert Mulder about the difficulty of setting Arabic script in type:

The technical problem is this: Arabic letters are generally not written separately but joined to each other in groups or entire words, like a script typeface in English. And though the Arabic alphabet has only 28 letters, most letters have four forms, depending on whether they occur at the beginning of the word, in the middle of the word, at the end of the word, or stand alone. Furthermore, each combination of letters is unique, creating a typographic challenge greater than Chinese. Because all letters connect dynamically with the preceding one, and most also with the following one, the number of unique combinations is almost astronomical.
The esthetic problem comes from the dizzying mutability of written Arabic. For example, there are actually three ways the letter ha can be written in the middle of a word, and the calligrapher’s choice is influenced not only by the letter immediately preceding the ha, but also by the letters earlier in the word, and even by letters that follow it—yet, in whatever form, it is still in essence the ha in the beginner’s textbook. A sequence of letters can run along a baseline the way Roman letters do—though Arabic runs from right to left, of course—or they may start above the baseline and descend in a diagonal if the connections from one letter to the next make that an esthetically pleasing choice.

The result is that the individual letters in a well-written piece of text are in constant motion, like dancers in a polonaise: In the course of the dance, they bow to each other, embrace each other, push each other away, hug each other’s necks and fall at each other’s feet—and there are some real acrobats among them. Thus, well-written Arabic texts feel alive to their readers, whereas mechanically typeset ones feel like graveyards: At their best they are only still photographs of the calligrapher’s living, moving polonaise.

Thomas Milo, Mirjam Somers, and Peter Somers have solved this problem:

Using the calligraphy of Mustafa Izzet Effendi and other great calligraphers, the Milo–Somers team took the concept of script analysis further than either Müteferrika or Mühendisyan, making the basic unit they examined not the letter but the penstroke. That made it possible to derive the dancing, shifting letters, the tens of thousands of combinations, and the variable words all from a few hundred individual penstrokes and a clear and limited set of rules—just the sort of fundamental, tabular information that computers like to use. And with modern computers, it became possible finally to resolve the conflict that has blighted the relationship between Arabic script and book-printing technology for most of five centuries.

Fascinating stuff, with some gorgeous illustrations. (Via MetaFilter.)
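
For readers who want to poke at the "four forms" point from the first excerpt, here is a toy sketch in Python. It is not the Milo–Somers penstroke approach and not how real rendering engines work; it only looks up the legacy Unicode "presentation form" characters by name. The helper names `positional_forms` and `shape` are made up for this illustration, and the joining rule is a rough heuristic (a letter is assumed to connect forward only if Unicode lists an initial form for it). Diacritics, ligatures such as lam-alef, and all calligraphic variation are ignored.

```python
import unicodedata

FORMS = ("ISOLATED", "FINAL", "INITIAL", "MEDIAL")


def positional_forms(letter):
    """Return the presentation forms Unicode lists for one Arabic letter.

    Illustrative helper: maps e.g. "INITIAL" to the character named
    "<letter name> INITIAL FORM", skipping forms that do not exist.
    """
    name = unicodedata.name(letter)  # e.g. "ARABIC LETTER HEH"
    found = {}
    for form in FORMS:
        try:
            found[form] = unicodedata.lookup(f"{name} {form} FORM")
        except KeyError:
            pass  # this letter has no such form (alef has no initial form)
    return found


def shape(word):
    """Pick a positional form for each letter from its neighbours.

    Assumption (heuristic): a letter joins to the following letter only if
    it has an INITIAL form, and to the preceding letter only if it has a
    FINAL form.
    """
    forms = [positional_forms(ch) for ch in word]
    out = []
    for i, ch in enumerate(word):
        joins_prev = i > 0 and "INITIAL" in forms[i - 1] and "FINAL" in forms[i]
        joins_next = (
            i + 1 < len(word) and "INITIAL" in forms[i] and "FINAL" in forms[i + 1]
        )
        if joins_prev and joins_next:
            key = "MEDIAL"
        elif joins_prev:
            key = "FINAL"
        elif joins_next:
            key = "INITIAL"
        else:
            key = "ISOLATED"
        out.append(forms[i].get(key, ch))  # fall back to the plain letter
    return "".join(out)


if __name__ == "__main__":
    # kaf, teh, alef, beh: the word "kitab" (book), without vowel marks
    for ch in shape("\u0643\u062A\u0627\u0628"):
        print(unicodedata.name(ch))
```

Even this crude version shows why the quoted passage says "most" letters rather than all: alef comes back with only two forms, so the letter after it has to start afresh.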


  1. komfo,amonan says:

    That’s a fine article.
    Müteferrika: Hungarian. Mühendisyan: Armenian. Milo-Somers: Dutch. The article gives the impression that all the most important innovations in Arabic (-script) typesetting were made by foreigners. I confess that that struck me.

  2. Lugubert says:

    Typesetting Arabic Arabic must be easy. At least using a standard newspaper font of the Naskh http://en.wikipedia.org/wiki/Naskh_%28script%29 type, compared to Arabo-Persian beauties of the Nasta’liq type: http://en.wikipedia.org/wiki/Nasta'liq_script
    I have found a couple of pdf converters that handle plain tedious Naskh, but so far, none that manages the lovely Urdu Nastaliq.

  3. It’s so interesting how complex orthography translates into such settings (I still am utterly fascinated by Chinese typesetting, with its seemingly endless array of kanji), and it’s a pleasure to read about it in such an engaging article. Great link!

  4. komfo,amonan says:

    An interesting overview of the history of Arabic type lives here.

  5. @ “I have found a couple of pdf converters that handle plain tedious Naskh, but so far, none that manages the lovely Urdu Nastaliq.”
    Conversion which way – from PDF or to PDF?
    I used Nuance PDF Converter Pro 5 to make this PDF:

  6. Here is another history article from Saudi Aramco World, but 25 years earlier. It mentions the first Arabic book printed with metal type, the 1514 Kitāb ṣalāt al-sawāʻi (Book of Hours). And the first such book printed in the Middle East, the 1706 Kitāb al-Zabūr al-Sharīf (Psalter). And the petition to and 1726 decree by Sultan Ahmed III authorizing secular works.

  7. MMcM,
    can you access the book? Apparently Krek mentions Bratislava and I’d love to know in what context.

  8. Just snippets, of which only one (of three) is legible enough to tell what it refers to: Arabische, türkische und persische Handschriften der Universitätsbibliothek in Bratislava. I’ll try to look up the other two when I happen to be at a library with a physical copy.

  9. Lugubert says:

    @ Stuart: Such “linear” fonts as yours work fine. It’s to pdf, and I think that the ScanSoft PDF Create 3.0 is an earlier version of your Nuance. My PDF Create handles lots of fonts (for Arabic, Chinese, Hebrew, Hindi, Panjabi) without any protests. Adobe Acrobat doesn’t even try but hangs immediately.
    The problem with proper Nasta`liq fonts is that they are sloping, so the number of possible letter combinations must be enormous. Look at the font name in the wiki article I quoted! Another font that doesn’t work is Tibetan Machine, also vertically complicated but in another way.
    I need all of them, because I’m trying to explain all non-English words and expressions in Kipling’s Kim and write them correctly using language appropriate fonts. I think I’ve managed some that aren’t in the Kipling society’s material, and add some illustrations.

  10. “I need all of them, because I’m trying to explain all non-English words and expressions in Kipling’s Kim and write them correctly using language appropriate fonts. I think I’ve managed some that aren’t in the Kipling society’s material, and add some illustrations.”
    Arre vah! That sounds REALLY interesting. I’d never be able to master nastaliq, devanagari is beyond my impaired handwriting without resorting to the keyboard, but I love the poetry of Urdu, and Kim is my favourite of Kipling’s work because his world was the world of my father and his father, and Kipling writes about it the way my Dad told stories of it when I was a kid. Plus, I’ve long thought that my Dad’s cursive Roman looks suspiciously like the nastaliq he learned at boarding school, only slightly less readable. My only beef with Kipling is his shonky transliteration, but then I’m looking at it from a Hindi POV not his Urdu one, so I guess I have to allow for that. If access to your work is permitted to a pieriansipist such as I, I would be VERY keen to read it.

  11. That sounds REALLY interesting
    Indeed. Please alert me when it becomes available; if it’s online, I’ll link to it.
    I’d never be able to master nastaliq
    I feel the same way. Lovely to look at, but yikes.

  12. I’d love to see the Kim material too.
    I well remember the first time I got my hands on an Arabic word processor — the Xerox Star of the 1980s, the spiritual ancestor of every desktop computer in use today. I didn’t (and don’t) know any Arabic, so I could only type either copied examples or gibberish — fortunately, the Star provided an on-screen keyboard so I could find the proper keys. Watching the cursor chug along from right to left was impressive enough in itself. But watching the letters mutate from isolated to initial form, or from medial form to final form, as I typed more letters — that was really something. They seemed alive on the screen, instead of just lying there as Latin letters and symbols did.
    The Star also did bidi, although a less sophisticated flavor than Unicode provides for, and it was a very strange feeling, typing Latin script inside the Arabic and watching the letters slide away leftward from the unmoving cursor. But the ultimate hack was typing “The Arabic for Islam and the Arabs is al-islam wa al-arab” (a sample sentence), selecting the “wa” and changing it to English “and” — and watching the surrounding Arabic words magically switch places! Spooky actions at a distance indeed.
    Finally, as to bogus transliteration, or rather transcription, Kipling was probably mostly trying to be helpful to his monolingual readers, giving them the flavor without confusing them. Doubtless “sati” is a better transcription than “suttee”, but unquestionably the latter form has become the English spelling, and its natural pronunciation in English is closer to the underlying Hindi. See also Hat’s transcription (ha!) of T.E. Lawrence’s attitude toward transliteration (scroll down a bit).

  13. caffeind says:
  14. “See also Hat’s transcription (ha!) of T.E. Lawrence’s attitude toward transliteration (scroll down a bit).”
    Thanks for that. I’ve only read 7 Pillars once, when I was 15, and clearly missed that delightful gem. I wholeheartedly agree with this:
    “There are some ‘scientific systems’ of transliteration, helpful to people who know enough Arabic not to need helping, but a wash-out for the world. I spell my names anyhow, to show what rot the systems are.”
    I realised after reading Lawrence’s replies that my attitude to Kipling’s transcription of Hindustani words was exactly the same as that of Lawrence’s correspondent toward his transcription of Arabic. Now that I’ve been able to see that for what it is, and laugh at myself accordingly, I can relish Kipling with unmarred delight. For that, and for reminding me how much I enjoyed 7 Pillars, thank you!

  15. so pleased to see this addressed here, as well as the previous arabic-related posts. it’s even more pleasing to see this being addressed in this manner to solve my hours of frustration in typing arabic on my own computer.
    as for lawrence’s transliteration, i confess that the introduction to seven pillars in which he outlines the letters between his editor and himself was one of my favourite parts of the whole text, which i otherwise greatly enjoyed.

  16. Yes, it’s completely irresistible.

  17. Lugubert says:

    I agree that Kipling’s “transcription” is quite suitable to English readers. But it took me twoscore years and a few until I realized that the sorceress Huneefa must be named Hanifa, and what that meant, or that Hurree Babu should be Hari Babu.
    I have interrupted my Kim work, because my Hindi prof. didn’t accept my project for a third semester thesis, the book being in English, not Hindi.
    Chapter one is rather completely covered, the rest not particularly detailed. I suppose my 1.5 MB Word file would be readable to people who, in addition to the standard fonts, have installed the open SimSun, Tibetan Machine, Nafees Nastaleeq and Raavi fonts (which could be supplied if you don’t find them). For Hindi/Sanskrit, I use Sanskrit 2003 (also free), because I find it much more pleasing than the (retch) Mangal supplied by MS.
    Oh, for many transliterations, TITUS Cyberbit Basic will be needed.
    Try this one excerpt:
    P آباد ābād ‘a city, building; cultivated etc.’ When added to a noun, it denotes a city or place of abode, as allāhābād ‘the city of Allah’. – This ābād should not be confused with the initial short ‘a’ termination of Persian and Urdu expressions like “Zindabad!”, “Mordabad!” (Long live, resp. Down with, or literally, may [X] die), where it is a remnant of an earlier Persian optative.
    This paragraph has to be heavily amended, referring to the mistranslations of Ahmadinejad’s supposed threats and changing ‘the city of Allah’ to, like, Ilahabad ‘the city of God’. What I have written on Urdu/Hindi/Hindustani has to be severely reviewed as well.
    Anyway, if you’re sufficiently interested, I’d be quite happy to inform you of the current project state and supply my humble effort file. Just (before 1 June, when that supplier vanishes) write a few lines to aring at rixmail dot se.

  18. MMcM,
    thanks, I figure the Bašagić collection (which also includes prints) would be there.

  19. Fascinating post, thanks.
