THE SPACE BETWEEN WORDS.

September 20, 2003 by languagehat 21 Comments

A fascinating interview (from Jill Kitson’s Lingua Franca radio show, to which I clearly should be paying more attention) with Paul Saenger, author of Space between Words: the Origins of Silent Reading (Palo Alto, Stanford University Press, 1997) (reviews here [by Orest Ranum] and here [Michael Richter, American Historical Review, April, 2001, pp. 628-9, pdf available here]), whose basic point is that the gradual spread of marking word-breaks in writing, from Ireland (where it may have been picked up from Syriac manuscripts) to England and across Europe, made possible the development of silent reading, which we take so much for granted. A couple of extracts to whet your appetite:

Jill Kitson: Well I found you made this really interesting, and I think almost mind-boggling point, that languages that exist only in oral form have no word for ‘word’.

Paul Saenger: Yes, that’s true.

Jill Kitson: So you say it’s really because they’re not conscious of words as graphic units.

Paul Saenger: Right. And the beginning of word separation sharpened that consciousness. For example in Roman antiquity, there was still ambiguity as to what constituted a word for the Roman and Greek grammarians particularly, with enclitic particles, conjunctions, prepositions, it wasn’t clear exactly what was the demarcation between a word and syllable. And even there’s a trace of that today. If one looks in a dictionary, you’ll find along with words certain syllables listed, which are not really words, but which have to be attached to something else….

Jill Kitson: Well in fact you say there was a sort of stage between the space and what you call ‘aerated script’. So what was that and where was it introduced?

Paul Saenger: The whole tendency to aerate script, and on this I talk about in the book—perhaps only very, very slightly, but I’ve done some research on it since—it is probably true that inscriptions of the lowest level in the Roman world that is tomb inscriptions, simple inscriptions written to commemorate the humble dead, tended to be not written in scriptura continua in the pure sense. There was a tendency to try to help the reader by leaving breaks sometimes between syllables, sometimes between words. Not every word, not every syllable. And to write in very short lines; especially this is true among Christians who represented a sort of middle tier between the literate elite of the pagan world and the vast number of people who were totally illiterate of course. And this process created a certain model for facilitating reading by the intrusion of space, which was rapidly expanded in the British Isles, particularly in Ireland. And by the end of the 7th century you had full word separation. And there you had the greatest disjunction between the language, the vernacular language, that is among the Celtic people and Anglo Saxons, and the literate language, or the Latin tongue of the Church.

Now in France for example, that awareness that the language spoken was different from that of the church and of literature, only evolved over a course of centuries, in the 9th and 10th century it became true. But it was much more evident in those areas, which were beyond the borders of the Roman Empire, where there was no tradition of Latin literacy.

This is a book I want very much to read. (Via Transblawg.)

Addendum. Margaret Marks has discovered a Henry Hoenigswald review of a book mentioned in the second of the Saenger review links above, M.B. Parkes, Pause and Effect: An Introduction to the History of Punctuation in the West; it too is well worth your attention:

Anyone who has marked up a manuscript or typescript for delivery is aware of a special desideratum. Intonations, instead of having a segmental location in the linear flow of speech, occur over stretches that have a beginning and an end. In this regard, though not in others, they are like word dividers. By convention, it is the end of the stretch that gets the punctuation which is nevertheless understood to extend to the entire sequence that precedes. But (as actors studying their scripts well know) readers/speakers need it in both places, just as they need both the spaces or dots that set off words, as provided in orthographies other than mere scriptio continua; cp., in the latter context, Quintilian Inst.or. 1.1.34 dividenda intentio animi ut aliud uoce alius oculis agatur (10). Get-ready signals are devised here and there, East and West: the Armenian paroyk set over the peak of stress in all kinds of questions makes a contribution toward serving that purpose, and so does, in the context of lexical (not intonational) accentuation, the roundabout way in which the grave accent mark is employed in pre-Byzantine Greek writing. In the Vedic texts of India it is the vowel which precedes the prominent lexical pitch which is shown as specifically unaccented. And in the eighteenth century the Royal Spanish Academy prescribed the addition of inverted question and exclamation points at the start of what seem to be the appropriate intonational stretches.

Comments

zizka says

September 20, 2003 at 6:24 pm

Ivan Illich “In the Vineyard of the Text” attributes silent reading to St. Augustine. All of his books are fascinating, including this one.
The same issues appear in Chinese, which is normally unpunctuated in any way, though the script does separate words if they’re monosyllabic.
Jonathon Delacour says

September 20, 2003 at 6:51 pm

This is fascinating indeed — the reviews to which you link, particularly the first one, are also worth reading. Lack of time rather than lack of interest will preclude my reading Saenger’s book, so I hope you do read it and write about your impressions.
Written Chinese, Japanese, and (I think) Korean lack word breaks. I’ve never found this a problem with reading Japanese since, as you are aware, the language is written in a mixture of three character sets: kanji (Chinese characters) plus two phonetic syllabaries (hiragana and katakana).
Most words written with kanji are either separated by particles written with hiragana or have inflections (also) written in hiragana. Thus, even though there are no spaces, the Japanese text comprises “visual units” which are easy to differentiate and recognize.
The same is not true of Chinese which looks, to my eye, like one continuous string of unspaced characters. I’ve always wondered, given my experience of reading Japanese, how Chinese readers separate the characters into units of meaning.
Jonathon Delacour says

September 20, 2003 at 6:57 pm

Hmmm, the old “someone else (in this case, zizka) added a comment while you were writing yours” phenomenon. I should remember to scroll down the preview page and see if that’s occurred.
I’m hoping one of your readers will explain how the Chinese deal with the lack of word-breaks — I’m not quite sure what zizka means by the script’s separating words if they are monosyllabic. I’ve never seen a Chinese text with word spaces (though, admittedly, I haven’t seen all that many Chinese texts).
language hat says

September 20, 2003 at 7:13 pm

Well, not that I can read Chinese, but there are common particles like de (which comes after adjectives), zi (which comes after many nouns), and le (which comes after past-tense verbs—don’t hit me, you Sinologists out there, I know these descriptions are hopelessly inadequate and probably misleading, I’m just waving my hand and trying to give a vague idea), and along with frequent combinations of characters they probably provide enough cues to do the trick. But Jonathon and I would welcome input from anyone who actually knows what they’re talking about.
language hat says

September 20, 2003 at 7:16 pm

Aargh! The same thing happened to me: I posted my response to your first comment and found your second had snuck in while I was shaping my precious words. I don’t know what zizka means by that either. zizka?
xiaolongnu says

September 20, 2003 at 9:46 pm

All right, I’m a Sinologist. Chiming in on the Chinese question — Classical Chinese was completely unpunctuated, so when I am working on ancient inscriptions the first thing I have to do is punctuate them, which can be very tricky (you have to sort of work back and forth between placing the breaks based on what you think the meaning is, and figuring out the meaning with the help of the breaks). Fortunately most of the most widely used classical texts are now available in punctuated editions. Naturally there are people who think this is cheating. Anyway, one of the upshots of this is that there isn’t any word in Classical Chinese that really means “word” and indeed, it’s difficult sometimes to decide whether only single characters should be considered words or whether two-character compounds also count (or even longer compounds).
Modern Chinese texts are made up of evenly spaced strings of characters, and although there are some rules for punctuating the modern language (including two different types of comma, one for pauses in a sentence, and one for delimiting the items in a list), in practice Chinese sentences are much more sparsely punctuated than English sentences. So a key to learning to read modern Chinese is being able to parse the sentence correctly — to realize which characters “clump” together into compounds and which should be read singly. Often it happens that there is more than one possibility. languagehat, you’re right that particles like de, le, and zi sometimes help with this, but unfortunately, they are used much less frequently in written Chinese than in spoken Chinese. As I remember my experience of learning Chinese, the “clumping” problem tripped me up time and again — I can just remember spending hours staring at sentences that seemed to be nonsense because I wasn’t breaking them up correctly.
There is a word for “word” in modern Chinese (ci), which for you Japanese speakers is the “ji” in “jiten” (dictionary). (It has been simplified to a different form on the mainland.) In practice I’ve found this term unsatisfactory because the idea of “word” is sort of unstable, whereas the idea of “character” is well-defined. If I am using “ci” to talk about English words, people seem to get what I’m talking about, but it’s led to some confusion when I use it to talk about Chinese “words.”
Jonathon, I was interested to know that you find Japanese easy to read because of the visual rhythm created by the alternation of kanji and kana. My experience is that I find Japanese difficult to read because I always want to jump from one kanji to the next and ignore the kana. I think because I learned Japanese long after learning Chinese, I tend to think of the kanji as signal and the kana as noise. I know that’s ridiculously Sinocentric, and of course this habit messes me up all the time when I forget to note whether a given word has a positive or negative ending (just one common experience). Anyway, I have to read Japanese scholarship pretty often, and I can only pull it off if I sit down, turn off the music, and really concentrate on putting kanji and kana together.
language hat says

September 20, 2003 at 10:05 pm

A fascinating and informative comment; many thanks! But shouldn’t that word for ‘word’ be zi rather than ci?
k says

September 21, 2003 at 1:56 am

Modern Korean definitely has word-spacing.
On a slightly related topic, I’ll try to remember to look up the names of the researchers who were testing kana/kanji switching strategies in native readers of Japanese. I was more than a bit worn out when I saw the poster session, but I do remember it as fascinating.
Adam Morris says

September 21, 2003 at 3:09 am

Thanks for the tip, I really enjoy your posts.
The “word” ci is the better word for “word” in Chinese, zi means more like single characters in and of themselves. I’ve also noticed that sometimes when I’m talking about words I have to say “expression” otherwise people don’t really understand what I mean.
I’m going to write more on this on my own blog. I’m not a Sinologist but I think I have something to say. Thanks again.
MM says

September 21, 2003 at 3:15 am

In pinyin it’s ci, but maybe you have seen a different romanization?
Jonathon Delacour says

September 21, 2003 at 3:34 am

“kanji as signal and kana as noise” — xiaolongnu, I cracked up when I read that.
I wasn’t surprised that you wrote “a key to learning to read modern Chinese is being able to parse the sentence correctly” though, for some reason, it hadn’t occurred to me — strange, since one uses exactly the same strategy to decode Japanese sentences.
As an intermediate-level Japanese reader I have two main tasks: learning kanji and vocabulary and learning how to parse the sentences correctly. (Japanese grammar has never given me any problems, though it seemed to be an ongoing difficulty for others in the classes I attended, particularly those who were a fair bit younger and had suffered under an educational regime that discounted the importance and usefulness of English grammar.)
I’m steadily working my way through Aihara Setsuko’s “Strategies for Reading Japanese: a rational approach to the Japanese sentence” and it’s proving quite useful (though a hard slog).
The core of the book comprises a series of strategy exercises which teach “the knack of determining the underlying structure of the Japanese sentence with its multiple subordinate clauses”. A typical “strategy exercise” sentence reads:
Aを、こうした思いもかけぬBのCのDで、しかも、追われてEのFも分からずに逃げているという、思いもかけぬGで歌うのです。
When the missing words and phrases (signified by A, B, C, etc) are included, the sentence is found to mean:
“We did sing that song in such unimaginable mountains of a foreign land, and moreover, in the totally unexpected circumstances of retreating in the face of pursuit, without knowing whether the next day we would live or die.”
language hat says

September 21, 2003 at 9:17 am

Sorry, all—you’re talking about ci, and I was talking through one of my many hats. I stand corrected.
This stuff about reading Chinese and Japanese is absolutely fascinating; I’ll have to link the discussion over at Avva, where someone was asking about it.
zizka says

September 21, 2003 at 10:49 am

Xiaolongnu said it. Part of what I meant is that classical Chinese words tended to be monosyllabic or to be thought of as such. Even for bisyllabic words, e.g. hutieh butterfly, the two syllables traditionally were given independent definitions even though in most cases at least one of them was only seen in combination. You also have a few common suffixes which attach to the previous word. But in general, one character = one word.
So anyway, as xln said, it’s the periods and commas that are the problem. What you think is the beginning of a sentence might be the end of the previous sentence, etc.
When dealing with unknown texts of uncertain context, howlers of interpretation are so common that they aren’t even terribly funny. Someone might puzzle out the meaning of a phrase and then find out later that the phrase was just a series of proper names, or was a meaningless garble. A.C. Graham says that when Watson translated Chuang Tzu, on the “finish the job” principle he produced intelligible English translations of passages with no apparent meaning in the Chinese.
Paul D says

September 21, 2003 at 5:52 pm

As another intermediate Japanese learner, I concur that the script mixing makes word parsing easy. In fact, once you know your kanji, reading a sentence in all-hiragana or romaji (dumbed down for foreigners) is much more difficult.
On the other hand, parsing long and complex sentences (like the one Jonathon inserted above) I find very difficult, since the internal “way of thinking” is opposite to that of English. Knowing the grammar rules isn’t always enough. I must track down a copy of that “Strategies for Reading Japanese”. It sounds terrific!
Adam Morris says

September 23, 2003 at 3:54 am

As far as sentence parsing goes, there’s a lot more to Chinese than just the de and zi particles mentioned. There’s also liao, xing, shi, and bu.
Also, sentence parsing is definitely one of the major difficulties of Chinese for foreigners. Chinese grammar is so presentable and straightforward that the major problem is making sure you’re matching the characters properly to the words there.
Owlmirror says

September 12, 2020 at 10:22 pm

Lingua Franca is no longer being broadcast, but the show archives are available.

The transcript for the episode “The Space between Words: The Origins of Silent Reading”.

I have not been able to find any audio archive of the episode.
Owlmirror says

September 12, 2020 at 10:56 pm

As to the reviews — the Internet Archive does have the first one (Compuserve!?), but it is also here on an updated site.

The Internet Archive does not seem to have the second review, as best I can tell, but everything but the final paragraph can be read as part of a preview.

The American Historical Review archive, and therefore, the review, is also on JSTOR. The final paragraph reads:

The book includes thirty-six well-chosen illustrations and nearly all the required scholarly apparatus (the glossary includes the term aeration with a meaning not given in the Oxford English Dictionary). Each reader will add items of personal choice in the index (in the present case, Moengal, pp. 103, 111). A work of stupendous learning (with 150 pages of endnotes) as well as down-to-earth vision, this monograph is the fruit of a long process of studying manuscript culture in the Middle Ages; the first tentative steps in this field were published by Saenger twenty years ago. The bibliography (pp. 439-48) in no way does justice to the amount of scholarly achievement that is pressed into service here. This study has all the qualities to make it indispensible to medieval studies of every kind.
MICHAEL RICHTER
Konstanz University
Bathrobe says

September 13, 2020 at 1:48 am

This thread was before my time at LH.

I think that knowing how to speak the language is of immeasurable help in reading it. The difficult Japanese sentence cited (Aを、こうした思いもかけぬBのCのDで、しかも、追われてEのFも分からずに逃げているという、思いもかけぬGで歌うのです) is not really difficult if you can speak the language to a reasonable degree, rather than just having to parse it on paper. The same goes for Chinese. You might make a few garden-path errors, but you soon realise your error and can go back and correct it. On the other hand, I have a lot of difficulty parsing Mongolian sentences, which seem to be more grammatically complex (or perhaps strung out) than Japanese, and more difficult for me because my spoken Mongolian is still rather poor.

The 詞 cí as ‘word’ is a concept introduced from the West. Chinese tend to think more in terms of 字 zì or characters.

But I’m sure much of this has been dealt with, probably on repeated occasions, in the 17 years since this thread appeared. (LH seems to have become far more sophisticated linguistically over time.)
David Marjanović says

September 13, 2020 at 6:24 am

LH seems to have become far more sophisticated linguistically over time.

It has, and so have I!
John Cowan says

September 13, 2020 at 9:58 pm

字 zì or characters

Characters, syllables, or morphemes, take your pick.
Gordon Holley says

August 16, 2022 at 2:38 pm

Funny thing is, despite the complaints leveled against the POST-Byzantine grave accent rule, put it on top of all the other rules for accent placement and for which letters may end a word, and it makes continua even easier to read.
