The Kraken Wakes.

Ofer Aderet reports for Haaretz (cached) on the progress made in Haifa University’s program to decipher ancient manuscripts:

“Hasten to the Shoko,” urged the computer. “The mouth asked to smoke,” it mused another time. Then it declared, “Jesus God to rejoice.” The cryptic phrases brought both smiles and satisfaction to the managers of the digital humanities laboratory at the University of Haifa. One is a Talmud and Midrash teacher and the other a professor of information systems.

The platform, called Kraken, is taking its tentative first steps in attempting to decipher ancient Hebrew. The hope is that in the not-too-distant future, after completing its studies, Kraken will be able to read any Hebrew text, even if the manuscript is distorted, illegible or hard to decipher. It’s part of a discipline called digital humanities, which uses advanced technology to enhance studies in history, the Bible and literature.

Like children encountering Hebrew religious texts in elementary school for the first time, Kraken also needs practice to become familiar with the material. The “shoko” was supposed to be “shoket” – trough. The mouth wanted to “deal with the Torah,” not to smoke, while Jesus, heaven forbid, has nothing to do with the third phrase, which was originally “the Lord will again rejoice.”

Moshe Lavee, a Military Intelligence veteran, is a senior lecturer in Talmud and Midrash in the department of Jewish history at the University of Haifa. He is the director and founder of eLijah-Lab, Kraken’s home, and one of the two researchers heading the lab. This week he spoke with contagious enthusiasm about the digital revolution, which is destined to save several research fields from oblivion. […]

On a monitor he showed a scanned section from Midrash Tanhuma, three collections of Pentateuch aggadot (homilies) from the end of antiquity. The script is difficult to read, but the computer doesn’t give up. Kraken – developed by Prof. Daniel Stoekel Ben-Ezra of Ecole Pratique des Hautes Etudes in Paris – succeeds in reading it, and later presents it to the researcher as a simple text file. This opens new research possibilities that ignite the imagination, first and foremost searching and analyzing information across volumes and varieties of text that until now even the most skilled researcher couldn’t handle alone. “Our vision is to make all the Hebraic scripts accessible,” says Lavee. “We’ll turn Jewish and Hebraic legacy into texts accessible to computer search and study and save a huge treasure trove of knowledge and Jewish traditions.” […]

The revolution was made possible by Handwritten Text Recognition technology, which enables a computer to read tens of thousands of pages – novels and poetry from the 19th century, diaries and letters from WWII, and ancient philosophical and religious texts – including intelligible handwritten input. Lavee says “the computer is taught to read the texts automatically, based on practice, so it acquires contextual knowledge about the language and uses it to reach better results.” […]

At this stage the computer still needs the researchers’ help. They are teaching it to read and “understand” the ancient Hebraic texts it encounters for the first time. “We show the computer many pictures from manuscripts, alongside their correct transcription,” says Lavee. “The computer itself finds the leading mathematical formula from the visual data for the text, and develops the ability to decipher even the written manuscript, whose transcript it hasn’t been shown before.” Dror Elovits, the lab’s technology manager and a graduate student in history, believes “the day is not far when we won’t need the human factor anymore, the texts will digitize themselves.”

(For a similar story about the Vatican Archives, see this 2018 LH post.) Thanks, Kobi!

Comments

  1. I wondered what the underlying joke to “kraken” might be, so I looked it up:

    https://github.com/mittagessen/kraken

    kraken is a fork of ocropus intended to rectify a number of issues while preserving (mostly) functional equivalence.

    (where “OCRopus is a free document analysis and optical character recognition (OCR) system released under the Apache License v2.0 with a very modular design using command-line interfaces.” — WikiP)

    While there seems to be an obvious straight line between the two puns of OCRopus & Kraken, I wondered if there might not be a less obvious Hebrew pun in there. “Kra” could be the imperative form of koreh, “read”, and “ken” ¹ could be the Hebrew for “yes”.

    So “Kra! Ken!” => “Read! Yes!”

    Maybe.

    =_____________________________________________
    1: ken can have other meanings as well.

  2. David Marjanović says:

    I don’t think I’ve encountered Hebraic before.

  3. Google has 1,420,000 results for Hebraic.

  4. Ah yes, “Professor Ben-Ezra,” by Robert Smoking.

  5. Owlmirror: Thanks for that; I’d wondered too, but was too lazy to investigate.

  6. John Cowan says:

    Teacher: “Why did Abu ben-Ezra’s name lead all the rest?”

    Teenage Asimov [waving his hand wildly]: “Alphabetical order, sir!”

    Teacher points his thumb towards the door; Asimov leaves, being familiar with the routine.

  7. Maybe OCRopus was equated with the Octopus, since the Kraken monster is a kind of an octopus or squid?

    I surely got tons of handwritten Hebrew for deciphering…

  8. @Dmitry Pruss: I’m pretty sure that’s what Owlmirror meant by “an obvious straight line between the two puns of OCRopus & Kraken.”

  9. Abou Ben Adhem was the chap whose name led all the rest. (I only know this from reading Wodehouse.)

  10. John Cowan says:

    Right, of course; I had ben-Ezra and ben-Adhem conflated. Kind of weird to mix Arabic “abu” with Hebrew “ben”, but there it is.

  11. Yeah, and Leigh Hunt seems to have thought Abou is a first name (as did the writers of Aladdin) and Ben Adhem a surname.

    I think Owlmirror is right that Kraken is a pun on “read” — it’s especially obvious in Hebrew because the first three letters of קראקן are the same as the triliteral root “read”.

    (“Jesus, heaven forbid” in the article is presumably — the original is behind a paywall — a reflexive bit of Christianophobia of a type that’s unfortunately pretty standard in Israel. At least it doesn’t say ימח שמו וזכרו “effaced be his name and memory”, a backronym of ישו “Jesus”.)

  12. I thought heaven forbid was more self-mocking than reflexive. Surely you’re already laughing when you realize of all the mistakes the Kraken could have made in interpreting a Hebraic text, it managed to come up with an avatar of the Lord from a competing religion.

    The author was certainly laughing. Anyone reflexively Christ-phobic would have suppressed that anecdote instead of drawing attention to it.

  13. the original is behind a paywall

    So my cached link didn’t work for you? Rats. It still works for me.

  14. So my cached link didn’t work for you?

    It does — I meant the Hebrew original.

    The author was certainly laughing. Anyone reflexively Christ-phobic would have suppressed that anecdote instead of drawing attention to it.

    You may be right, but not a few Israelis who wouldn’t call themselves anti-Christian will nevertheless automatically put in the verbal equivalent of a shudder when mentioning Jesus. That’s why I’m wondering what the tone is in the original. (In grade school in Israel kids are, or used to be, taught a modified version of the plus sign with the bottom half of the vertical line removed so that it doesn’t resemble a cross.)

  15. David Marjanović says:

    Maybe OCRopus was equated with the Octopus, since the Kraken monster is a kind of an octopus or squid?

    “Octopus” in German: Krake nom. sg., Kraken all other forms.

  16. FWIW, the Hebrew of the original is here. As TR notes, it is mostly behind a paywall, and none of the tricks that work on the haaretz.com site (Google cache, as above, or the printable version of the article) work on the haaretz.co.il site. They don’t even want to show me the limited preview if I browse with Javascript enabled and an adblocker enabled.

    Seeing that Kraken is written “קראקן” and not “קרא-כן” invalidates the second half of my pun; oh well. Ken with a qoph means “nest”, usually.

    (Without the hyphen, the kaf would usually be read as a guttural khaf, and “קראכן” (krachen) means (the verb) “crack” in Yiddish)

  17. Owlmirror says:

    @John Cowan:

    Kind of weird to mix Arabic “abu” with Hebrew “ben”, but there it is.

    While it is more usually written “ibn” or “bin”, some Arabic dialects do use “ben”.

    @TR:

    Leigh Hunt seems to have thought Abou is a first name

    Behindthename on Abu says:

    Means “father of” in Arabic. This is commonly used as an element in a kunya, which is a type of Arabic nickname. The element is combined with the name of one of the bearer’s children (usually the eldest son). In some cases the kunya is figurative, not referring to an actual child, as in the case of the Muslim caliph Abu Bakr.

    WikiP on the topic gives more examples, including the ailurophilic Abu Huraira, “father of kittens”.

    Hebraicists will note that kunya is an obvious cognate of kinuy (which, for non-Hebraicists, also means a name that one is known as; a nickname).

    I am also reminded of Abba bar Abba, and Barabbas.

  18. If Trump joined ISIS, he would be known as Abu Ivanka Al-Amriki.

  19. David Marjanović says:

    While it is more usually written “ibn” or “bin”, some Arabic dialects do use “ben”.

    OBL was known in French as Oussama Ben Laden for Western Arabic reasons.

  20. Only marginally related, but I first encountered the word “kraken” in the story “Captain Stormalong Meets a Kraken” in the fifth-grade-level reader Freedom’s Ground. The problem with the story, from my viewpoint, was that it did not explain what a “kraken” was, except that it was a sea monster with tentacles. Nor did any of the secondary worksheets or workbook pages. When I asked my reading teacher, Mr. Wolf (a huge asshole; one of my then classmates is now a language arts teacher, and she told me that one of her personal commitments as a teacher was to never behave like Mr. Wolf), he merely mocked me for not knowing (or figuring out) what a kraken was.

  21. “Captain Stormalong Meets a Kraken”

    Not online, alas, but a Google Books search reveals the start:

    Captain Alfred Bulltop Stormalong was a seagoing Down East Yankee. He was a sailor in the days when wooden ships and iron-muscled men traveled over the seven blue seas. Where …

  22. It’s adapted from “Captain Stormalong, The Revolution and Clipper Ships,” from Tall Tale America, which you can see here at Google Books.
