Punctuation Identification.

Alexandra N. M. Darmon, Marya Bazzi, Sam D. Howison, and Mason Porter have written a paper on “textual analysis via punctuation sequences”:

Punctuation is a largely overlooked stylistic feature in “stylometry”, the quantitative analysis of written text. In this paper, we examine punctuation sequences in a corpus of literary documents and ask the following questions: Are the properties of such sequences a distinctive feature of different authors? Is it possible to distinguish literary genres based on their punctuation sequences? Do the punctuation styles of authors evolve over time? Are we on to something interesting in trying to do stylometry without words, or are we full of sound and fury (signifying nothing)?

I confess the cutesy style of that last sentence irritates me, but so do the giggly styles of today’s newscasters and interviewers — I’m an old fossil used to solemnity in public utterances. But never mind that; they’ve created a web app that will compare the punctuation style of any writing sample to the authors in its database, and of course it’s fun to put in samples and get results. The problem is that the results are essentially meaningless. To quote verstegan at MetaFilter (where I got the link):

It doesn’t inspire confidence in the authors’ methodology that they analyse Shakespeare’s punctuation without, apparently, being aware that this varies enormously from edition to edition. Ever since the time of Samuel Johnson, editors have freely repunctuated the text of Shakespeare. The claim that (to take one example) ‘Shakespeare appears to use more exclamation marks and question marks than H.G. Wells’ is thus completely meaningless.

The same goes for most of the earlier texts in their sample, as they are using public domain texts from Project Gutenberg, many of which will have been repunctuated. In other words, their text corpus is totally contaminated and their claims about ‘the evolution of punctuation marks over time’ are completely untenable. (And that’s even before we get into the question of whether the punctuation of nineteenth- and twentieth-century books reflects authors’ preferences or printers’ house styles…) I’m afraid this is what happens when four mathematicians write a paper without bothering to consult any literary scholars, textual editors or bibliographers.

Sigh. But enjoy the game, as long as you realize it doesn’t mean anything!
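
For anyone wondering what “stylometry without words” looks like mechanically, here is a minimal sketch. It is not the authors' code: the mark set `MARKS` and the function names are assumptions made purely for illustration, and the idea of counting which mark follows which is only my reading of what “punctuation sequences” implies.

```python
from collections import Counter

# Assumed mark set for illustration; the paper's actual set may differ.
MARKS = ",.;:!?"

def punctuation_sequence(text):
    """Return the punctuation marks of `text` in order, discarding the words."""
    return [ch for ch in text if ch in MARKS]

def transition_counts(seq):
    """Count how often each mark is immediately followed by each other mark."""
    return Counter(zip(seq, seq[1:]))

sample = "Sigh. But enjoy the game, as long as you realize it doesn't mean anything!"
seq = punctuation_sequence(sample)
print(seq)                     # ['.', ',', '!']
print(transition_counts(seq))  # one ('.', ',') pair and one (',', '!') pair
```

Everything downstream (comparing authors, genres, or periods) depends on these marks being the author's own, which is exactly the point in dispute.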


  1. Athel Cornish-Bowden says:

    Apparently I write like Arthur Conan Doyle, or maybe Thomas de Quincey, or Honoré de Balzac.

  2. David L says:

    It tells me I write like Louis Becke and no one else. I’d never heard of him but Wikipedia comes to the rescue. Sounds like an interesting fellow. But then: “Becke was criticised by some reviewers for lapses in grammar and taste.” Oh dear.

  3. January First-of-May says:

    …Who the triangular heck is Marie Lebert? Wikipedia doesn’t seem to be helpful.

    (Some googling tells me that she’s a linguistics-related blogger, which presumably means that I write like a blogger. This is to be expected, since, lacking any lengthy pieces of regular fiction, I entered some of my longer blog posts.)

    Other offered choices (with shorter texts) included Henry Haggard and Honore de Balzac, but Marie Lebert showed up multiple times.

    EDIT: several other blog posts I tried were said to be similar to Thomas De Quincey, who was apparently a guy who wrote about opium?

  4. Yes, and he’s probably a guy with a large presence in their corpus.

  5. I suppose I would have guessed H. Rider’s first name was Henry, but I’ve certainly never seen him called that. The webapp thinks Project Gutenberg’s “King Solomon’s Mines” was punctuated by Gordon Stables.

  6. Athel Cornish-Bowden says:

    I’ll try again.

    I suppose that “Henry Haggard” is the chap we usually call H. Rider Haggard, but I wonder how familiar the authors are with the writers who appear in their database.

    Will there never come a season
    Which shall rid us from the curse
    Of a prose which knows no reason
    And an unmelodious verse:
    When the world shall cease to wonder
    At the genius of an Ass,
    And a boy’s eccentric blunder
    Shall not bring success to pass:

    When mankind shall be delivered
    From the clash of magazines,
    And the inkstand shall be shivered
    Into countless smithereens:
    When there stands a muzzled stripling,
    Mute, beside a muzzled bore:
    When the Rudyards cease from Kipling
    And the Haggards Ride no more.

  7. January First-of-May says:

    I rechecked the comment that got the “Henry Haggard” answer, and it was actually “Haggard, H. Rider (Henry Rider)”.

    So it was me who was sufficiently unfamiliar with H. Rider Haggard to misremember it as “Henry Haggard”; no offense (at least in this case) to the creators of the app.

    That’s a neat poem, though. I’m assuming that “the genius of an Ass” refers to the Apuleius novel?

  8. John Cowan says:

    It was James Kenneth Stephen, who wrote “To R.K.” above. This one is called “A Sonnet” (it isn’t), and likewise parodies the poet mentioned in the last line.

    Two voices are there: one is of the deep;
    It learns the storm-cloud’s thunderous melody,
    Now roars, now murmurs with the changing sea,
    Now bird-like pipes, now closes soft in sleep:
    And one is of an old half-witted sheep
    Which bleats articulate monotony,
    And indicates that two and one are three,
    That grass is green, lakes damp, and mountains steep:
    And, Wordsworth, both are thine.

  9. At 7:49 PM John Cowan says J. K. Stephen’s “A Sonnet” isn’t a sonnet, but his source quotes only the octave. Here’s the whole poem: octave plus sestet.

    Two voices are there: one is of the deep;
    It learns the storm-cloud’s thunderous melody,
    Now roars, now murmurs with the changing sea,
    Now bird-like pipes, now closes soft in sleep:
    And one is of an old half-witted sheep
    Which bleats articulate monotony,
    And indicates that two and one are three,
    That grass is green, lakes damp, and mountains steep:
    And, Wordsworth, both are thine: at certain times
    Forth from the heart of thy melodious rhymes,
    The form and pressure of high thoughts will burst:
    At other times — good Lord! I’d rather be
    Quite unacquainted with the A.B.C.
    Than write such hopeless rubbish as thy worst.

    As criticism, that’s about right. When you’re young, you can get away with saying, “Listen to me! I’m interesting!” because when you’re young there’s a possibility that you ARE interesting. But when you’re old . . .

    Well, take a look at Wordsworth’s late sonnet sequence in praise of capital punishment. Wordsworth (1770-1850) kept writing almost to the end, but stopped developing at about the age of 35.

  10. Bathrobe says:

    @Athel C-B

    Honoré de Balzac

    I wonder which translation they are using.

    I was identified as being close to “Barrie, J. M”. I guess I write Peter Pannish. Not sure whether to be proud or not.

    @Jonathan Morse stopped developing at about the age of 35

    I know a retired editor who still writes interesting stuff (interspersed with the odd diatribe).

  11. Stu Clayton says:

    Nobody develops any more, you can print directly from the photostick.

  12. Nero Wolfe, of course, once identified an author by paragraph breaks. If memory serves, the perp had written a number of literary works in the styles of different authors and then used a number of confederates to shake the said authors down for plagiarism. But paragraph breaks were impossible to emulate (in Wolfe’s opinion) and the culprit was duly exposed.

  13. David Eddyshaw says:

    I am apparently John Masefield. I may sue.

    Quinquireme of Nineveh from distant Ophir …

  14. David Eddyshaw says:

    I’m with Byron and Auden on the subject of Wordsworth. Bloody daffodils.

  15. The author of the Declaration of Independence can now be revealed: “George Borrow”.

  16. David Eddyshaw says:

    Wild America. Yes, I see that.

  17. Punctuation? Where is Victor Borge when we really need him?

  18. For those unfamiliar with the reference: Phonetic Punctuation.

  19. John Cowan says:

    Russell on Wordsworth: “In his youth Wordsworth sympathised with the French Revolution, went to France, wrote good poetry, and had a natural daughter. At this period he was called a ‘bad’ man. Then he became ‘good,’ abandoned his daughter, adopted correct principles, and wrote bad poetry.”

    Carlyle on Wordsworth, as relayed by Sir Charles Gavan Duffy:

    But though Wordsworth was the man of most practical mind of any of the persons connected with literature whom he had encountered, his pastoral pipings were far from being of the importance his admirers imagined. He was essentially a cold, hard, silent, practical man, who, if he had not fallen into poetry, would have done effectual work of some sort in the world. This was the impression one got of him as he looked out of his stern blue eyes, superior to men and circumstances.

    I said I expected to hear of a man of softer mood, more sympathetic and less taciturn.

    Carlyle said, ‘No, no, not at all; he was a man quite other than that; a man of an immense head and great jaws like a crocodile’s, cast in a mould designed for prodigious work.’

  20. George Grady says:

    Geoff Nunberg on Jane Austen and punctuation at the Language Log in 2010: Jane Austen: missing the points.

    For a very large number of authors, the punctuation you see in their published works is as much their printer’s as it is theirs.

  21. January First-of-May says:

    I tried to enter a snippet from the last few pages of Ulysses (which famously don’t contain any punctuation) and got a weird glitched bug page. I’m guessing the app tried to divide by zero?

    (I actually started with A Pickle for the Knowing Ones, but there was more punctuation in the Gutenberg edition than I expected, even after I excluded the properly-punctuated passages.)

  22. My favorite punctuation style is that of Leela in The Mystic Masseur:


  23. I tried to enter a snippet from the last few pages of Ulysses (which famously don’t contain any punctuation) and got a weird glitched bug page. I’m guessing the app tried to divide by zero?

    Interesting. I put in a chunk of Molly’s monologue and added a period at the end, and got “ValueError at /punctuation/ Columns must be same length as key”; I added a few more marks and got the same, then put in a shorter chunk and added some marks, and got “Le Fanu, Joseph Sheridan.”
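
The crash on unpunctuated text is at least consistent with a normalization step that assumes every sample contains some marks. A minimal sketch of that failure mode and a guard against it follows. The app's code is not public to me, so this assumes it converts mark counts into relative frequencies; `MARKS` and `mark_frequencies` are hypothetical names.

```python
from collections import Counter

MARKS = ",.;:!?"  # assumed mark set, for illustration only

def mark_frequencies(text):
    """Relative frequency of each mark, or None if the text has no marks."""
    counts = Counter(ch for ch in text if ch in MARKS)
    total = sum(counts.values())
    if total == 0:
        # Without this guard, counts[mark] / total raises ZeroDivisionError.
        return None
    return {mark: counts[mark] / total for mark in MARKS}

print(mark_frequencies("yes I said yes I will Yes"))  # None
print(mark_frequencies("Yes, yes; yes."))
```

Returning `None` pushes the decision to the caller, who can show a "not enough punctuation" message instead of an error page.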

  24. Stu Clayton says:

    Python. Division by zero clue as to how to design a program.

    Good sleuthing, Steve !

  25. I wouldn’t call it “sleuthing” so much as “random flailing.”

  26. Stu Clayton says:

    But you clearly had a hypothesis about the causes of the crashes, and a strategy for verifying it – constrained trial and error. Baron Verulam would have minted you a medal.

  27. AJP Crown says:

    I tried something by E.E. Cummings yesterday but it said Django was doing something true that ought to be false and I never received a proper response.

  28. Stu Clayton says:

    # Django is a high-level Python Web framework that encourages rapid development … #

    In the middle of difficulty lies opportunity. And vice versa when you don’t understand what you’re doing. This kind of thing always reminds me of Mickey Mouse in The Sorcerer’s Apprentice.

  29. So they unchained Django… 😉

    @David Eddyshaw: That brought back school memories. I got in an argument with my English teacher, because she used “Cargoes” as an example of a “bad” poem in comparison to some other poem, and I didn’t think “Cargoes” (which I first encountered during that lesson) deserved that.

  30. “Cargoes” (for those unfamiliar with it).

  31. David Eddyshaw says:

    It’s an interesting question. Although the poem has unfortunate associations for me too (also associated with schooldays – they made us sing) I agree that it is in fact a pretty good poem qua poem.

    I have arguments with my English-major daughter, who is (as is appropriate) much woker than I, about Kipling, who I myself think is pretty much the poster child for excellent poet despite his sometimes rebarbative message. The Female of the Species, for example, despite being a polemic against female suffrage, is, as a poem, truly brilliant.

    Kipling’s actual politics are a good deal more nuanced than is sometimes alleged, for that matter, but that is a separate (or at least separable) issue.

  32. David Eddyshaw says:

    I was just today moved to think of Kipling’s sanity by the stupidity of journalistic accounts of the veterans of the Second World War:

    We aren’t no thin red ‘eroes, nor we aren’t no blackguards too,
    But single men in barricks, most remarkable like you

  33. I agree with you about Kipling. But he’s liable to get you in a lot of hot water with a lot of people nowadays.

  34. Stu Clayton says:

    In the olden days, one just avoided people who carried a chip on their shoulder. It may surprise you that the same tactic works well against people who carry buckets of hot water around. You can even kick their buckets.

    Le parti des dévots

  35. John Cowan says:

    I see your “Cargoes” and raise you “The King”:

    “Romance!” the season-tickets mourn,
    “He never ran to catch His train,
    But passed with coach and guard and horn —
    And left the local — late again!”
    Confound Romance!… And all unseen
    Romance brought up the nine-fifteen.

    His hand was on the lever laid,
    His oil-can soothed the worrying cranks,
    His whistle waked the snowbound grade,
    His fog-horn cut the reeking banks;
    By dock and deep and mine and mill
    The Boy-god reckless laboured still!

    Of course, steam trains and ships are romantic nowadays, and poets’ scorn is reserved for yet more recent products of technology.

  36. I confess I don’t understand that poem. Who is the Boy-god, and what is he doing with an oil-can?

  37. David Eddyshaw says:

    When Kipling was good (which was often), he was very good indeed. He had the real ability to see beauty and significance underneath the everyday and familiar.

    The poem puts me in mind of a remark of Walter Savage Landor’s (which I came across in the introduction to Gordon’s Introduction to Old Norse):

    The Romans are the most anti-picturesque and anti-poetical people in the universe. No good poem ever was or ever will be written about them.

    … which tells you nothing about the Romans, but a lot about Landor’s want of imagination. (I suspect he knew better, to be fair.)

  38. Yes, Landor was (like me) a Hellenophile, and Hellenophiles and lovers of Rome are like Mets fans and Yankee fans — you can overcome the barriers to mutual understanding, but it takes work.

  39. John Cowan says:

    That’s only a bit of the poem: follow the link. Romance, capitalized, is the personified spirit of romance, but we are also meant to understand that he is Cupid. The oil can is for oiling the train’s various parts to keep it running.

    I changed Banks to banks, as if the banks surrounding the track, but the glossators say it is a reference to the Grand Banks off Newfoundland. Shoulda left it alone….

  40. No, I read the whole thing, but it still seemed incoherent to me.

  41. Athel Cornish-Bowden says:

    Many of the wealthy Romans in the classical period were Hellenophiles as well, and expected their sons to speak Greek properly — sending many of them off to Marseilles to learn it.

  42. Trond Engen says:

    No, I read the whole thing, but it still seemed incoherent to me.

    The “Boy-god” reference is probably less obvious nowadays. Yet I took it.

    The poem tells how we always have been looking backwards to a past of truer colors, without the toils of the modern world. Then, from “Confound Romance”, he turns to tell how every little bit of the modern world has the same potential for poetry. I don’t take the final line, though. My hunch was that it was a reference to a poem inspired by the steam engine, but googling “Our King was with us — yesterday!” only brings up Kipling.

  43. John Cowan says:

    Just so. As for the last line, it means that poets (other than K) are taught to believe that Romance never exists in the present, only in the past.

  44. Trond Engen says:


  45. Owlmirror says:

    Re: Cargoes: “. . . and gold moidores”. Well, what’s moidores?

    Wikt: Portuguese moeda de ouro, literally “golden coin”.

    So gold coins of gold, then. I see.

    I also note that quinquiremes appear to have been Greek warships, not Assyrian merchant vessels.


    I must not be a romantic, because I read “ivory”, and all I can think is that many elephants would have been killed and had their teeth ripped from their mouths. Those apes and peacocks would have been miserable, cooped up in tiny cages, and probably seasick to boot.

    The Spanish one involves slavery and death of hundreds of thousands — millions? — of humans. And why is the galleon “stately”, rather than covered with barnacles and riddled with woodworms, while the coaster is explicitly “dirty”?

    Romanticism is just reimagining the past with the blood and shit magicked away.


    But it is a good poem, I grant that, if you close your eyes to historical accuracy and context.

  46. AJP Crown says:

    You’re just judging Victorian cruelty and indifference by our standards, Owl Mirror. People care more about elephants nowadays than they did when Leopold II was plundering the Congo, but what about all the other animals (chickens, sheep, pigs, goats etc.) that are killed every day? Fish, birds. That’s what we’re closing our eyes to, really. Wm Blake wrote about it 200 yrs ago in Songs of Innocence & of Experience Shewing the Two Contrary States of the Human Soul of which ‘Tyger, Tyger, burning bright’ and ‘The Lamb’ (of God, vs of Sunday dinner) are the best known poems.

  47. David Marjanović says:

    And why is the galleon “stately”, rather than covered with barnacles and riddled with woodworms, while the coaster is explicitly “dirty”?

    Because the barnacles are below the waterline and you can’t see the holes the woodworms made unless you come very close, while if the coaster has been in a few storms near a coast, it may have quite visible mud or kelp or whatever on the outside.

  48. Trond Engen says:

    Romanticism is just reimagining the past with the blood and shit magicked away

    Oh, yes. That’s close to Kipling’s point, and I’m pretty sure it’s what Masefield is aiming at. His two first — “romantic” — verses are piles of evocative words out of time and place, and transparently so.

    Quinquireme — a Greek (or Mediterranean) warship invented in the 4th century BCE.

    of Nineveh — far up the Tigris, in ruins by the end of the 7th century BCE

    from distant Ophir — somewhere in India?

    The rest of the verse is mostly a recount of the biblical account of the imports from Ophir:

    Rowing home to haven in sunny Palestine,
    With a cargo of ivory,
    And apes and peacocks,
    Sandalwood, cedarwood, and sweet white wine.

    … though ivory and peacocks aren’t listed in the Bible, and cedarwood and white wine wouldn’t be brought back from India,

    Stately Spanish galleon coming from the Isthmus,
    Dipping through the Tropics by the palm-green shores,
    — the Isthmus wouldn’t be the main origin, but probably just a broad-brush description.

    With a cargo of diamonds, — not from the Americas at the time of the galleons

    Emeralds — the emerald mines of Colombia were exploited by the Spaniards. OK,

    am[e]thysts — not at the time of the galleons

    Topazes — mostly mined locally in Europe until the 19th century

    and cinnamon, — from India and Ceylon

    and gold moidores. — not in significant amounts on a Spanish ship

  49. John Cowan says:

    Songs of Innocence & of Experience Shewing the Two Contrary States of the Human Soul

    The poems that best illustrate the contrast are “Infant Joy” and “Infant Sorrow”, I think. Quoting them here I can’t give them the markup they deserve, so I have resuscitated my blog for the purpose.

  50. Rodger C says:

    JC, something has happened to “Like a fiend hid in a cloud.”

  51. John Cowan says:

    Thanks: sloppy pasting: fixed.

    I seem to remember hearing of both gold and silver moidores, but I can’t track it down. Not too surprising: the mines at Jáchymov yielded silver, yet there is such a thing as a gold dollar.

  52. Owlmirror says:

    though ivory and peacocks aren’t listed in the Bible


    The first stanza of the poem is clearly referencing 1 Kings 10:22 (repeated in 2 Chronicles 9:21), where those items are mentioned.

    One translation — out of many — says that the word for peacocks, “tukki’im”, means “baboons”; one other one says “monkeys” (which seems rather unlikely even without the preponderance of “peacock” by other translators; the distinction between “(tailless) apes” and “(tailed) monkeys” is a modern one, and not even made by most languages).

    I see that one commentary suggests that the word translated as “ivory”, “shenhabim”, may be a mistake for “shen habnim”, “ivory [and] ebony”.

    Modern Hebrew uses the Arabic for peacock, “tawwas” (“tavvas”?).

  53. Trond Engen says:


    Oh, right, sorry. I meant to remove that line, but I see I must have taken the one summing up at the end instead.

  54. John Cowan says:

    Per Wiki.en s.v. “Ophir”, Smith’s Bible Dictionary (1863) says that the Biblical Hebrew words for ‘ivory’, ‘cotton cloth’, and ‘ape’ are borrowed from Old Tamil, and that the word for ‘peacock’ is likewise borrowed from the Old Tamil for ‘parrot’. This may well be worth nothing, but LH is probably the place to find out.

  55. Trond Engen says:

    I’m thinking that King Solomon* became a member or a partner of the Phoenician/Canaanite trade network with special control of the overland route to the Red Sea and the sea route to Yemen and beyond, and that it’s no coincidence that this happened during Egypt’s 3rd Intermediate Period and Babylonia’s Period of Chaos.

    *) Or whichever historical and political entity that came to be identified with king Solomon in later accounts.

  56. @Trond Engen: I have long felt that the strongest evidence that David and Solomon were historical figures, and the Biblical accounts relatively accurate about their policies, is that the stories are so full of tawdry details. The narratives have an extremely pro-Davidic slant, yet they include all sorts of stories about how the kings were horrible people, which they try very hard to talk around. Compare those stories with the recorded exploits of a truly fictional dynast like Emperor Jimmy, and there is a vast difference.

  57. Trond Engen says:

    I agree that they were likely historical figures. What I mean is that they were also legendary. A king who was remembered as great would attract other memories of greatness in the general historical vicinity and might end up with more history attributed to him than he actually deserved.

  58. @Trond Engen: Certainly, I agree with that. It seems that the actual direction of causal inferences regarding King Solomon’s wisdom was not, Solomon was wise; therefore he built the temple, as given in the Biblical narrative; rather, the chain of implication was, Solomon built the temple; therefore he was wise.

    I also see autocorrect gave me “Emperor Jimmy” in my last post. That’s what I get for commenting on my phone.

  59. Trond Engen says:

    When you said “truly fictional”, I thought you meant “truly fictional”.

    I actually thought he was from some part of popular culture I didn’t know.

  60. Stu Clayton says:

    Emperor Jimmy just abdicated.

    # We know him as Akihito, the emperor of Japan, a gentle figure who championed peace in a nation devastated by war. But she called him Jimmy.

    It was the autumn of 1946, a year after the end of the Second World War, and he was a 12-year-old boy, the crown prince of a defeated land, sitting in an unheated classroom on the outskirts of Tokyo. There, a new American teacher insisted on a more prosaic name for his highness. His father, the wartime emperor, Hirohito, had been revered as a god, but she made clear he never would be.

    “In this class, your name is Jimmy,” declared the teacher, Elizabeth Gray Vining, a 44-year-old librarian and children’s book author from Philadelphia. #

  61. David Eddyshaw says:

    yet they include all sorts of stories about how the kings were horrible people, which they try very hard to talk around


    I agree with your conclusion, but am not altogether persuaded by your argument: I think this is something characteristic of the Bible in general. The patriarchs are also presented in a very far from hagiographic light, for example. If I were going to invent some stories about my forebears, these are not the stories I would invent, to put it mildly. The warts-and-all treatment is by no means confined to the more marginal figures whose descendants weren’t writing the book: Jacob in particular is not what people nowadays would call a positive role model. Nor Judah, for that matter.

    (I should say that I myself believe that the patriarchs were historical, so for me personally that’s not a counterargument at all: but it pretty clearly would be a counterargument for most people.)

    Who was Emperor Jimmy before he got autocorrected?

  62. David Eddyshaw says:

    Ah. Jimmu. Should have thought of that. Especially when prompted by Stu’s Japanese diversion.

  63. Stu Clayton says:

    Nobody’s perfect. That’s why an admixture of imperfection makes any story more credible. It’s the principle of Realism in Westerns, and in forgeries of Old Masters.

    Whether a given story is true can’t be accurately judged by how believable it is. Sad but true (sic!). As Mr T wrote a couple of thousand years ago: most people don’t trouble to find out the truth, but accept the first story they hear.[Pel.W. 1.20]

  64. David Eddyshaw says:

    On the other hand, some stories wear their falsehood on their sleeve by their evident impossibility. I doubt whether even the most gullible has ever believed that the Four Branches were a veritable account of pre-Christian Britain, for example. (The Mabinogion immediately sprang to mind in thinking of – let’s say – imperfect heroes.)

    Somewhere or other (JC will probably instantly locate it) I once read an article about how one attempts to tell science from pseudoscience in published articles in practice; IIRC the somewhat melancholy conclusion was that it had a rather worrying amount to do with style and presentation.

  65. Stu Clayton says:

    That’s a start. If it stops there, it’s due to laziness, not science.

  66. David Eddyshaw says:


    Come to think of it, Thucydides is a potential case in point. I recall another article pointing out that we are in fact rarely in a position to check his account, and when we can, he is surprisingly often wrong (the particular example was some case where archaeological evidence actually bore on the point in question, if I remember right.)

    But Thucydides’ style immediately gives him credibility. The anti-Herodotus.

  67. Stu Clayton says:

    If truth were apparent, it would be only the appearance of truth.

    Cue hermetics, hermeneutics and the FBI.

  68. David Eddyshaw says:

    The Tao that can be told …

  69. Stu Clayton says:

    The upshot is that a little humbleness and forbearance are as effective as a tuxedo, if you can carry them off. Otherwise best stick with plus-fours.

  70. Bathrobe says:

    Yes, Landor was (like me) a Hellenophile

    Does that partly account for the distaste for modern Greek nationalism?

  71. Unlike David, Solomon actually does come off as too-perfect, at least by the standards of his time.

  72. What I was struck by was that Solomon’s reign begins with a bloodbath. However, the narrative tries to give a justification for each grandee that he has executed. Either they violated some minor prohibition (in a manifestly pointless way), or Solomon was somehow justified in playing them false.

  73. I think killing people was fine as long as it was done for a reason. I don’t think any of that was held against Solomon.
    He was responsible for some naughty idolatry, but that’s all. People remember the big Temple, not the little ones. The Talmud even says that it wasn’t him, it was some of his wives who built those, and his only sin was not stopping them.

  74. Does that partly account for the distaste for modern Greek nationalism?

    I don’t think so; I dislike nationalism in general.

  75. John Cowan says:

    We know him as Akihito.

    Non-Japanese do. But no emperor is called anything but “His Majesty the Emperor” in Japan until after he is dead, and even then he’s called by his era name. (The exception is on Hirohito’s and Akihito’s marine biology papers.)

    The Tao that can be told …

    The route you can traverse
      isn’t a static route.
    The name you can dereference
      isn’t a universal name.

    Namelessness is the root of everything.
    Names are the mother of everything.

      the unchanging, seen from outside the box,
        reveals its inner nature;
      the unchanging, seen from inside the box,
        reveals its outer form.

    These two are alike in origin,
      but different in name.
    Their unity is called “the mystery”.

    Mystery of all mysteries,
      the gate to all wonders.

  76. Stu Clayton says:

    Never have I seen such a to-do made about the lowly pointer.

  77. Stu Clayton says:

    Unless you count semiotics, which is a to-do through and through.

  78. David Marjanović says:

    The exception is on Hirohito’s and Akihito’s marine biology papers.

    Akihito had one in Nature a few years ago, under the pseudonym “His Majesty the Emperor of Japan”. Named 3 new fish species or something.

    Namelessness is the root of everything.
    Names are the mother of everything.

    “Money isn’t everything!
    But without money everything is nothing!”

  79. The route you can traverse
    isn’t a static route.
    The name you can dereference
    isn’t a universal name.

    OK, now I have to quote Boodberg:

    Lodehead lodehead-brooking : no forwonted lodehead;
    Namecall namecall-brooking : no forwonted namecall.
          Having-naught namecalling : Heaven-Earth’s fetation,
          Having-aught namecalling : Myriad Mottlings’ mother.

  80. As it happens, I knew Myriad Mottlings’ mother, and boy could she call names!

  81. ktschwarz says:

    Speaking of punctuation changes, check out the X-Treme semicolons in one frequently cited text of “A Modest Proposal”:

    It is a melancholly Object to those, who walk through this great Town, or travel in the Country; when they see the Streets, the Roads, and Cabbin-doors crowded with Beggars of the Female Sex, followed by three, four, or six Children, all in Rags, and importuning every Passenger for an Alms. …

    … whoever could find out a fair, cheap, and easy Method of making these Children sound and useful Members of the Commonwealth; would deserve so well of the Publick, as to have his Statue set up for a Preserver of the Nation. …

    I shall now therefore humbly propose my own Thoughts; which I hope will not be liable to the least Objection.

    That’s from the 1735 edition of Swift’s collected Works, which is favored by literary critics because Swift supervised it (sort of), but all those semicolons are startling to me. They’re not even typical of the period: the original 1729 edition has commas in those places. I guess it was just that printer who had a heavy hand with semicolons. Now compare Project Gutenberg:

    It is a melancholy object to those, who walk through this great town, or travel in the country, when they see the streets, the roads and cabbin-doors crowded with beggars of the female sex, followed by three, four, or six children, all in rags, and importuning every passenger for an alms.

    Annoyingly, PG doesn’t say what edition they used, but it must have been something nineteenth-century; it’s very close to Walter Scott’s 1814 edition of The Works of Jonathan Swift. A striking demonstration of how printing conventions changed more from 1735 to 1814 than in the two centuries since — which is apparently news to Darmon et al. According to their web app, the 1729 edition of “A Modest Proposal” was written by Daniel Defoe, the 1735 edition by Alexandre Dumas (Dumas wrote in English? Who knew!), and the Project Gutenberg version by … Jonathan Swift! I think we can conclude what was in their training corpus.

    Other fine literature from Project Gutenberg includes “A Tale of a Tub” by Charles Dudley Warner and various selections from Gulliver’s Travels by Samuel Pepys, Daniel Defoe, Harriet Beecher Stowe, and George Borrow.

  82. Good lord, their work is even more feckless than I thought.

  83. John Cowan says:

    This essay from the PG wiki says quite clearly:

    Project Gutenberg has avoided requests, demands, and pressures to create “authoritative editions.” We do not write for the reader who cares whether a certain phrase in Shakespeare has a “:” or a “;” between its clauses. We put our sights on a goal to release etexts that are 99.9% accurate in the eyes of the general reader. Given the preferences your proofreaders have, and the general lack of reading ability the public is currently reported to have, we probably exceed those requirements by a significant amount. However, for the person who wants an “authoritative edition” we will have to wait some time until this becomes more feasible.

    In later prose that I can’t lay my hands on right now, they say that they make no attempt to make their texts conform to any specific printed edition. As for metadata, that’s a nightmare for everyone.

  84. Stu Clayton says:

    Anyone who seriously wants an “authoritative edition”, and expects to find it at PG, cannot be serious in any serious sense of the word. The serious stuff is done elsewhere, to this day.

    Decades ago, I was suitably impressed by authors who developed a point by comparing, say, p 75 of the second and p 82 of the third edition of the Kritik der reinen Vernunft. And, to be honest, I also thought “who cares”. The first one counts, right ? Or maybe the last one ? Oh well …

    Now I have learned that there can be very interesting differences between various editions of a book – and that very fact means I myself have no use for the notion of “authoritative edition”.

    I thought I had read pretty much all of Luhmann (he died in 1998, for Pete’s sake), one central work being the 1984 Soziale Systeme which I have worked through several times. Then Systemtheorie der Gesellschaft was posthumously published about 2 years ago, another fat tome of his from around 1974. The topics are clearly related to those of Soziale Systeme. I read the editor’s essay comparing the two in a rough way, but I’m not hung up about the authoritative details. I read the book itself with more attention.

    I read Luhmann primarily because his ideas interest me enormously. I admire and respect the work that edition compilers and scholars do, but I’m a consumer.

  85. “This essay from the PG wiki says quite clearly:”

    I trust it was clear that my animadversions on fecklessness were aimed at Darmon et al., not at the estimable folks at Project Gutenberg, which of course does not aim at authoritativeness, nor should it. I am a frequent and grateful consumer of their wares.

  86. Stu Clayton says:

    Me too. That’s why I got upset at finding access from Germany blocked.

  87. John Cowan says:

    Ah, no, I did think you meant to call the PG people feckless, perhaps ironically.

  88. Good lord. That would never occur to me.

  89. Trond Engen says:

    I thought the same, but then I decided I’d probably lost some context.


  1. […] via the always fascinating Language Hat, comes word of a paper entitled “Pull out all the stops: Textual analysis via punctuation […]
