AI Fails at Whitman.

I know there’s a ton of blathering about AI and ChatGPT out there, and I have no desire to overload LH with it, but I found this discussion by Andrew Deck interesting (and cheering) enough to share:

Training an AI tool to generate high-quality literary writing, like poetry, is no small challenge. Many large language models (LLMs) are not trained to be creative. One of the criteria used by AI researchers to judge creativity is novelty — how different the writing generated by a model is from what already exists in the world. But tools like ChatGPT were built to mimic human writing, not to innovate on it. […]

ChatGPT, for example, even struggles to imitate the structure and rhythm of well-established poets in English, especially when the poets are famous for breaking literary norms. A recent study found ChatGPT largely fails to produce English-language poems in the style of Walt Whitman, one of the more easily accessible poetry catalogs in the American canon. Whitman’s style features fluid and unstructured verse, but ChatGPT often wrongly defaulted to the rigid norm of four-line stanzas. It continued to do this even when prompted not to.

These issues are often exacerbated when ChatGPT is asked to produce poetic writing in languages other than English. The same researchers struggled to imitate common Polish styles of poetry, according to Goes. Earlier this year, researchers attempted to refine models to address shortcomings in AI-generated Japanese poetry, such as haiku and waka.

Rest of World observed similar problems when we tested ChatGPT’s ability to write a poem in Tamil. The poems were incoherent at best.

I know, I know, they’ll probably get better at it as more bushels of money are thrown at the problem, but one can hope. (And speaking of money: “Telugu-speaking contractors, for example, can only earn $1.43 per hour.” I try not to dream of guillotines, but they make it so hard…) Thanks, Trevor!

Comments

  1. LLog had a bit on a wretched thing which ChatGPT had produced as a Dickinson-styled poem. Unlike Whitman’s “fluid and unstructured verse”, Emily Dickinson’s poetry is structured, unique, and consistent, making her especially imitable by anyone but a dumb robot.

  2. David Eddyshaw says

    There is precedent for these companies to lean on experts for data work

    Hey! Turkey! Wanna come to my Christmas dinner?

    LLMs are noosphere parasites, and belong on Mars with Elon Musk.

  3. Exactly. Ah ! ça ira, ça ira, ça ira…

  4. jack morava says

    [lightly edited:]

    … the Economist put up [Henry Farrell] and Cosma Shalizi’s piece on shoggoths and machine learning, cf

    http://bactra.org/weblog/shoggothim.html

    riffing off the meme that every large language model is really a shoggoth. …[A]n LLM is a way of taking the vast inchoate chaos of written-human-language-as-recorded-on-the-Web and simplifying and abstracting it in potentially useful ways. They are, as Alison Gopnik says, cultural technologies, more analogous to library catalogs than to individual minds. This makes LLMs recent and still-minor members of a larger and older family of monsters which similarly simplify, abstract, and repurpose human minds: the market system, the corporation, the state, even the democratic state. Those are distributed information-processing systems which don’t just ingest the products of human intelligence, but actually run on human beings — a theme I have been sounding for a while now…

  5. J.W. Brewer says

    Doesn’t “120 rupees per hour” sound better?

  6. I only have read the title, but it reminded me:

    Купил грубый русский мужик нежную японскую пилу.
    Решил он ее проверить – дал ей маленькую щепочку…
    – Вжик, – сказала нежная японская пила.
    – У-у, сука, – сказал грубый русский мужик, и дал ей полено.
    – Вжик, – сказала нежная японская пила.
    – У-у-у, сука, – сказал грубый русский мужик, и дал ей здоровенное бревно.
    – Вжжик, – сказала нежная японская пила.
    – У-у-у-у, сука, – сказал грубый русский мужик и дал ей рельс.
    – Р-р-р-а-а, – сказала нежная японская пила.
    – А-а-а, сука! – сказал грубый русский мужик.

  7. Lars Skovlund says

    @drasvi: Is there a punchline to that joke?

  8. Why would anyone throw bushels of money at the problem of generating Whitman pastiches? Is there some pent-up demand that only waits for someone to monetize it?

    Who is saying “our big problem is that there’s NOT ENOUGH Whitman! If only we could generate MORE Whitman, we could really hit the jackpot!”?

  9. John Cowan says

    Whitman was never a best-selling author, but it’s the principle of the thing.

  10. David Eddyshaw says

    I imagine Whitman himself would have been delighted to know that he had defeated the mind parasites. That was kinda his thing.

  11. @Lars,

    the background: Soviet industry lagged behind, manufacturers were often skimping on quality or dealing with dysfunctional supply chains, and Soviet tools and instruments were cruder.

    When the USSR fell apart and we began importing better and more precise foreign tools, the perceived advantages [along with disadvantages] of Soviet-made or self-made equipment were price (true), robustness (in terms of how often it breaks – usually not true), “can be fixed rather than replaced” (true; quite unlike iPhones, those things were designed so that a user could fix them), maintenance and service cost (true; spare parts are cheaper and those things were designed to be maintained by the user) – and for self-made equipment, of course, the creativity of the person who makes it.
    Also a new expensive foreign tool was likely to be handled with greater care (so the joke is unrealistic).

    The joke is from that period.

    vocabulary:
    – грубый “rude, rough, crude, imprecise” – I’ll translate as “rough”.
    – нежный “tender” (when said about people and not “touches” etc – either treating others so or sensitive) – I’ll use “gentle”.
    – muzhik – before the revolution, a peasant man. Now “man, guy”, but often as an ideal of masculinity, a “real man”. “A muzhik must be mighty, smelly and hairy”. Can fix things (perhaps this property indeed comes from the times when those were peasant men as opposed to nobility).
    – Vzhik! – onomatopoeia, *quickly cutting something with a knife*
    – u-u-u [u::::] (with open or closed mouth) a displeased grunt uttered when… well, sort of a hostile amazement.
    – a-a-a – here “Aha! Gotcha!”.

    So a rough Russian muzhik bought a gentle Japanese saw and decided to try/test it and offered her [I’ll leave Russian ‘her’ here] a small splinter. “Vzhik!” said the gentle Japanese saw. “U-u, bitch!” said the rough Russian muzhik and offered a piece of firewood. “Vzhik!” said the gentle Japanese saw. “U-u-u, bitch!” said the rough Russian muzhik and offered her a large wooden log. “Vzhzhik!” said the gentle Japanese saw. “U-u-u-u, bitch!” said the rough Russian muzhik and offered her a rail. “R-r-r-a-a!” said the gentle Japanese saw. “A-a-a, bitch!” said the rough Russian muzhik.

  12. cuchuflete says

    I tried to imagine an AI imitation of Nicolás Guillén’s Sensemayá, and
    visions of Gaetz and McCarthy having a food fight appeared.

    Sensemayá
    Canto para matar a una culebra.

    ¡Mayombe—bombe—mayombé!
    ¡Mayombe—bombe—mayombé!
    ¡Mayombe—bombe—mayombé!

    La culebra tiene los ojos de vidrio;
    la culebra viene y se enreda en un palo;
    con sus ojos de vidrio, en un palo,
    con sus ojos de vidrio.

    La culebra camina sin patas;
    la culebra se esconde en la yerba;
    caminando se esconde en la yerba,
    caminando sin patas.

    ¡Mayombe—bombe—mayombé!
    ¡Mayombe—bombe—mayombé!
    ¡Mayombe—bombe—mayombé!

    Tú le das con el hacha y se muere:
    ¡dale ya!
    ¡No le des con el pie, que te muerde,
    no le des con el pie, que se va!

    Sensemayá, la culebra,
    sensemayá.
    Sensemayá, con sus ojos,
    sensemayá.
    Sensemayá, con su lengua,
    sensemayá.
    Sensemayá, con su boca,
    sensemayá.

    Sensemayá
    (Chant to kill a snake)
    translated by Willis Knapp Jones

    ¡Mayombe-bombe-mayombé!
    ¡Mayombe-bombe-mayombé!
    ¡Mayombe-bombe-mayombé!

    The snake has eyes of glass;
    The snake coils on a stick;
    With his eyes of glass on a stick,
    With his eyes of glass.

    The snake can move without feet;
    The snake can hide in the grass;
    Crawling he hides in the grass,
    Moving without feet.

    ¡Mayombe-bombe-mayombé!
    ¡Mayombe-bombe-mayombé!
    ¡Mayombe-bombe-mayombé!

    Hit him with an ax and he dies;
    Hit him! Go on, hit him!
    Don’t hit him with your foot or he’ll bite;
    Don’t hit him with your foot, or he’ll get away.

    Sensemayá, the snake,
    sensemayá.
    Sensemayá, with his eyes,
    sensemayá.
    Sensemayá, with his tongue,
    sensemayá.
    Sensemayá, with his mouth,
    sensemayá.

  13. John Cowan says

    @DE: Oh, I agree. I was just imagining that in the mind of such researchers was the idea “If we can make fake Whitman, we can make fake anybody!” And so they go for the barbaric yawp. Here’s what GPT4 gave me:

    In the heart of wilderness, I stand, a primal force untamed,
    Unbound by society’s shackles, unbroken by its rules.
    I am the scribe of spirit and flesh, an echo of nature’s song.
    Upon the world’s apex I roar, my voice reverberating through canyons,
    A savage cry above the world’s rooftops.
    My words are not timid murmurs but mighty bellows,
    They are not soft zephyrs but raging tempests.
    I am vast, I encompass contradictions.
    Within me dwell both shadow and radiance, grief and bliss,
    The twin faces of existence mirrored in my essence.
    I am every soul yet none at all.
    My life-force throbs with the cadence of existence itself,
    Each beat a declaration of my being.
    My lifeblood is the ink for my verse; each vessel a verse line,
    Every breath drawn is a tribute to life’s majesty.
    Seas surge within me; peaks ascend in my spirit.
    Celestial bodies dance in my thoughts; galaxies spiral in my dreams.
    I am woven into this infinite cosmos yet distinctly apart,
    A solitary chord in nature’s orchestra yet an entire harmony myself.
    Thus here I rise, defiantly resonant and fiercely sovereign
    With a savage cry that echoes through eternity.

    While prompt hacking I told it not to plagiarize Whitman, but it failed at that. Too.

  14. Andrew Dunbar says

    Just a few thoughts/notes:

    1. LLMs are trained in several stages. There will be a stage where they have tons of text of random kinds and tons of time to train, I believe on their own. There are also stages where a human is very involved and stages where they’re given a smaller but still huge amount of more focussed text on domains they should specialize in. So I believe to get better than current poetry results you would need to do a focussed training session on poetry and probably one on good poetry and probably one on a specific poet you want it to imitate.

    2. Apparently all the work to prevent the LLMs from telling you how to make a bomb or say sexist and racist things has a poorly understood side effect of substantially reducing the quality of the output. The independent hackers all want to use LLMs without the guardrails for this and possibly other reasons. The big AI companies don’t want to release LLMs without guardrails for fear of getting into trouble.

    3. I read one analysis saying there are three kinds of creativity. LLMs can sometimes do very good interpolation and can sometimes do fairly decent extrapolation, but they are very bad at, or entirely incapable of, completely novel out-of-nowhere leaps of creativity.

    And my own favourite little observation: The top two complaints about LLMs are “They’re not creative” and “They just make stuff up”.

  15. John Cowan says

    Nothing contradictory about that. We want poets to be original (for the last two centuries or so), but not when testifying in court. In addition, the older tradition of “What oft was thought, but ne’er so well exprest” implicitly rejects poems that are bullshit, which is all we get out of LLMs.

    For your listening pleasure, here is GPT4’s response to “Please write me a poem in the style of Alexander Pope on the Sydney Harbour Bridge”:

    In Sydney’s fair city, where the sun doth rise,
    A marvel of steel and rivets lies.
    The Harbour Bridge, in grandeur stands,
    Linking together distant lands.

    From North to South, its arches sweep,
    O’er waters dark and fathoms deep.
    Its granite pylons touch the sky,
    Where seagulls wheel and loudly cry.

    Upon its deck, both day and night,
    Cars and trains in endless flight.
    Beneath its span, with sails unfurled,
    Ships from the corners of the world.

    Surely Pope would not have failed to mention the bicycle paths. But more broadly, this poem violates Johnson’s dictum: “The business of the poet is to examine, not the individual, but the species: to remark general properties and large appearances”, the attitude which drew such an apoplectic response from Blake: “To Generalize is to be an Idiot. To Particularize is the Alone Distinction of Merit — General Knowledges are those Knowledges that Idiots possess.”

    Random fact: “The Bradfield Highway [which traverses the bridge] is designated as a Travelling Stock Route, which means that it is permissible to herd livestock across the bridge, but only between midnight and dawn, and after giving notice of intention to do so. In practice, the last time livestock crossed the bridge was in 1999 for the Gelbvieh Cattle Congress.”

  16. David Eddyshaw says

    ¡Mayombe-bombe-mayombé!

    A clear reference to the Miyobé language of Togo and Benin (a grammar of which I chance to have before me at this moment …)

    It is indeed the BOMB.

  17. ChatGPT also fails at summarizing:

    No really: ChatGPT doesn’t summarise. When you ask ChatGPT to summarise this text, it instead shortens the text. And there is a fundamental difference between the two. To summarise, you need to understand what the paper is saying. To shorten text, not so much.

  18. And there is a fundamental difference between the two
    That brings back school memories – we were often given a text that we had to a) retell in our own words, with the Nacherzählung usually being shorter than the original, and then to b) summarize (Zusammenfassung), giving just the gist. Fun times! I remember several (non-AI) co-students also having problems distinguishing the two…

  19. David Eddyshaw says

    From Hat’s link:

    It sometimes feels like trying to convince people that buying tulip bulb options really is less of a certain investment than they think.

    The whole “AI” thing is certainly, objectively, a bubble. In a rational world, the entire duplicitous enterprise would be heading for a thoroughly merited crash. However, I think the outlook is worse than that suggests: we have tech billionaires very heavily committed to pretending (perhaps even believing) that the underlying ideas are valid, and these people have bought enough political power that they may be able to coerce governments into pretending that APEs really are the “Artificial Intelligence” that they claim, enabling them to poison the noosphere for fun and PROFIT indefinitely.

    As, in reality, APEs are not intelligent at all, the effect of using them for any critical purposes will be disastrous.

    But the situation is analogous to the politics of Trump and similar fascists: their “solutions” to real political problems are lies and illusions: the billionaires bankroll and promote them nonetheless, because their actual agenda has nothing at all to do with actually solving those problems.

    It’s no accident that the Thiel sockpuppet Vance and similar Silicon Valley antidemocrats are, specifically, aggressively opposing government attempts to control what is done with APEs (also attempts to control cryptocurrencies, though their reasoning there is blatantly transparent in its self-seeking effrontery.)

  20. David Marjanović says

    That brings back school memories –

    We were taught Nacherzählung first and Zusammenfassung later, in separate years (as part of a progression to Interpretation). The sudden switch was just as difficult as simultaneous usage would have been.

    But the situation is analogous to the politics of Trump and similar fascists:

    Not just their politics. $DJT, the stock of Truth Social, is worth anything because the people who bought the stock want it to be worth something.

  21. AI is not worthless, but it is overrated. To compare, in one word — Plastics. There was a time when plastics were It. Their promoters and the public saw the expanding variety of polymers as wonder materials which would replace everything, from glass to wood to ceramics and maybe eventually even metal. In the long run, some materials were successfully replaced by plastics; some were poorly replaced, which made plastics a byword for cheap and shoddy substitutes. And some materials never were and are not expected to be replaced. And, oh, there was the predicted and ignored environmental cost, and later on the unpredicted environmental cost.

    A friend, who has been working with neural networks for years, said to me that all they are is glorified curve-fitting (he doesn’t work with discrete systems like NLP, but it’s the same idea.) As glorified curve-fitting, AI has done very well in some applications, exceeding the capabilities of less-glorified methods. The field is developing very fast indeed, and the right questions are being asked, even if they are not always answered. However, the naive idea that a “self-learning” system just needs data and computing power thrown at it and it will figure out everything, is, I hope, being abandoned, and with it the expectation for easy money from nothing (well, nothing but gigawatt-hours).

  22. Stu Clayton says

    The comparison with plastics is very apposite.

  23. David Eddyshaw says

    I am not trying to seduce you.

  24. Stu Clayton says

    I blush to relate that I used that ploy successfully for decades. My conscience protested, but I was just fascinated by how easy it is to distract the Cerberus of morality in guys, of all people, and slip past. I concluded that they are no better than they should be, nor worse than I. Look, a squirrel !

  25. Stu Clayton says

    In case no one noticed, my last post was about the exploitation of gullibility. It was a down-market riff on DE’s remark about the marketers of AI “poison[ing] the noosphere for fun and PROFIT indefinitely”, with nary a scruple. To see things this way, it helps to have studied gullibility in its many forms.

    My first awareness of this phenomenon, as far as I can remember, was by way of astonishment at a sentence in the introduction to a book by Hartshorne: “I decided to devote my life to [the study of ?] reason”, or words to that effect. I thought: what a weird thing to “devote” oneself to. Reason takes care of itself. I was 17 at the time.

    Only later did I see that Reason does no such thing.

  26. I suspect many Hatters have had to learn that lesson. I certainly did.

  27. jack morava says

    I got into mathematics to avoid calculation
