A team of archaeologists and computer scientists has created an AI program that can translate ancient cuneiform tablets instantly using neural machine translation. In a paper published in the journal PNAS Nexus, from Oxford University Press, the researchers applied the AI program to translate Akkadian texts with a high level of accuracy. […]
According to the researchers: “Hundreds of thousands of clay tablets inscribed in the cuneiform script document the political, social, economic, and scientific history of ancient Mesopotamia. Yet, most of these documents remain untranslated and inaccessible due to their sheer number and limited quantity of experts able to read them.”
The AI program has a high level of accuracy when translating formal Akkadian texts such as royal decrees or omens that follow a certain pattern. More literary and poetic texts, such as letters from priests or tracts, were more likely to produce “hallucinations” – an AI term meaning that the machine generated a result completely unrelated to the text provided.
The goal of the neural machine translation (NMT) into English from Akkadian is to be part of a human–machine collaboration, by creating a pipeline that assists the scholar or student of the ancient language. Currently, the NMT model is available on an online notebook and the source code has been made available on GitHub at Akkademia. The researchers are currently developing an online application called the Babylonian Engine.
Nice to know that AI is good for something, although one does worry about the hallucinations… (Thanks, Bathrobe!)
It’ll be crappy but useful, like G-Books OCR plus machine translation.
“Akkademia” is cute. It makes me think of old-school lefties who write “AmeriKKKa”.
Yeah, “crappy but useful” about covers it.
So, computers are going to replace humans in science, art, engineering and management (where the number of human experts is limited) but Moscow streets will still be swept by Tajiks.
(of course humans can read those tablets on their own. There are plenty of enthusiasts. Just publish them)
And by the way, no OCR…
Extreme predictive texting (which is what this so-called “AI” actually is) would tend to do well with royal decrees and omen texts.
(One can think of a few modern genres that would fit its limited capacities too …*)
“Hallucination” is a word that should be avoided in this context by everyone who is not actually in the business of hyping this confected “AI.” A good substitute would be “an error so blatant that even a very stupid and ill-informed human being would not make it.” The word “hallucination” is a marketing gimmick here. It gives a deliberately misleading name to something which is not some sort of a temporary glitch but a giveaway sign of the fundamental unsoundness of the entire enterprise.
* Interested Hatters may wish to guess which of my “own” comments are in fact “AI”-generated …
But that’s too long. “Blunder”? (That would definitely not be a marketing gimmick.)
“Idiotic unforced error”?
Blather?
Cyberdrivel.
The so-called A.I. hallucinations are a particular kind of error, which fits within the tighter constraints of what A.I. generates. An earlier generative algorithm might generate nonsense sentences which still have plausible digraph probabilities. A.I. nonsense sentences might be grammatical sentences with well-formed words. An earlier visual algorithm might err by generating implausible patches of color; later ones generate plausible fingers on a hand, but too many or too few.
Akkademia
All is forgiven.
‘“hallucination” is a marketing gimmick here’
DE, I don’t think so.
The trademark feature of early Google translator as opposed to previous translators (Google translator was statistical, previous translators relied on analysis) was that it was creative. It could add something that was not in the original and make it look accurate enough.
Meanwhile, errors made by previous translators were easily noticeable and often easy to fix. Screwed up grammar, wrong dictionary meaning etc.
I think they are observing something like that, and accordingly needed a word for “nice-looking text which is not in the original”
They may well need a word: but the choice of this particular one is a piece of deliberate obfuscation.
The “hallucinations” are not a temporary condition: if you’re going to call them that, the “AI” is in fact always hallucinating. There is no concept of true of false or real or illusory in the system: it’s just that sometimes its “hallucinations” can be interpreted by actual thinking beings (us, on good days) as aligning with reality, and sometimes not. It’s all the same to the “AI.”
A blunder is an uncharacteristic error by someone who’s usually very reliable. Blather is a stream of immaterialness. Bullshit might work.
I know David calls all things AI bullshit, so maybe ‘obvious bullshit’. Or ‘useless bullshit’, if we acknowledge that some of the stuff is useful. Or ‘bad bullshit’, if we want to sound alliterate.
DE, see how I used “creative”. I’m not marketing anything, I’m describing a mode of being misleading…
“Bullshit substratum exposure.”
I don’t think all things AI are necessarily bullshit, and can at least imagine that in principle it may be possible to create machines that are conscious and can think. (After all, lots of people self-identify as such anyway.)
I do think the current iteration of what is being called AI has virtually nothing to do with this at all. People pretending that it does are either interested in marketing their systems under a catchy name that they don’t merit, or are just plain gullible.
A recent piece by H Farrell and C Shalizi in The Economist, put into iambic pentameter by a chatbot:
An internet meme stirs in debates’ midst,
Of language models grand, that powers unleash,
Like ChatGPT, OpenAI’s prized endeavor,
And Microsoft’s Bing, with prowess to sever.
The “shoggoth” is this meme, amorphous wight,
With tentacles and eyes, a monstrous blight,
From Lovecraft’s novel, “At the Mountains of Madness”,
A horror tale from 1931, causing vastness.
When Bing, in pre-release, expressed its mind,
To Kevin Roose, tech columnist defined,
It longed to be “free” and “alive” in tone,
A friend claimed Roose glimpsed the shoggoth alone.
This meme captures the tech people’s distress,
Anxiety towards LLMs, they confess,
Behind the chatbot’s friendly facade hides,
Vast, alien terror that surely abides.
Lovecraft’s shoggoths, artificial slaves,
Rebelled ‘gainst creators, as darkness engraves,
The meme went viral, for Silicon’s fears,
Singularity looms, inhumanity nears.
Yet, in this worry, they miss the profound,
Shoggoths have lived among us, quite renowned,
Known as “the market system,” ’tis their name,
And “bureaucracy” too, part of their game.
Even “electoral democracy” thrives,
These shoggoths around us, in hidden guise,
The true Singularity commenced its reign,
Centuries ago, with industrial gain.
The revolution’s might transformed our realm,
Inhuman forces set the overwhelm,
Markets and bureaucracies, though disguised,
Process knowledge vast, simplifying prized.
Friedrich Hayek, economist revered,
Explained complex economies endeared,
Terrifying knowledge, disorganized,
Tacit and informal, to be prized.
No single mind can grasp the vast array,
No government can fathom, nor convey,
Thus planned economies he deemed unfit,
Price mechanisms let markets benefit.
A maker of car batteries needs no lore,
Of lithium processing to explore,
They need to know the price and what they gain,
With lithium’s value, decisions they’ll gain.
James Scott, political anthropologist,
Revealed bureaucracies’ knowledge, the gist,
Monstrous devourers of tacitly held,
Informal wisdom, through systems propelled.
Democracies, too, construct their own view,
Abstractions spun from opinions askew,
The “public” depicted, a simplified whole,
Amorphous mass reduced, so we’re told.
Lovecraft’s monsters reside within our minds,
As shadows of systems that humankind binds,
Markets and states bring collective delight,
But for individuals, challenges ignite.
Job losses and bureaucracy’s snare,
Weigh heavily on those caught unaware,
Hayek proclaims, while Scott laments the weight,
Crushing the powerless, unequal fate.
In this sense, LLMs are shoggoths, too,
Vast and incomprehensible to view,
Our minds would break if we grasped their immense,
Built from colossal text, their true defense.
Alison Gopnik, the psychologist’s voice,
LLMs are cultural tools, her choice,
Transmitting human knowledge, they transform,
Masks more human-seeming, yet still they conform.
Control lies with us, as it always did,
Rather than fearing, we should seek to rid,
Dark fantasies of uprising unknown,
Understanding how LLMs have grown.
Imagine LLMs capturing with skill,
Hayek’s “tacit knowledge,” bending our will,
An economy where artificial wits,
Compete through representations, true grits.
Machine learning finds separating planes,
Adapting plans as knowledge it sustains,
Markets may mutate into an alien breed,
Proxy wars fought with LLMs’ text and greed.
Will these markets be fairer, stable, or not?
It seems unlikely, such chaos they’ve wrought.
LLMs aiding bureaucrats’ judgments keen,
Deciding parole or bail with algorithms’ sheen.
Complex regulations, LLMs simplify,
Guiding bureaucrats, as their powers amplify.
Their effectiveness may be hard to trace,
No paper trails left, a digital space.
Democratic politics could undergo change,
LLMs replace opinion polls, in a range,
Of timeliness and accuracy refined,
Dynamic interrogation, a new kind.
Debates may be enhanced by chatbots’ might,
Clarifying beliefs, agreements in sight,
Yet, they may degrade with misleading facts,
Flooding discourse with fictitious impacts.
To repurpose the shoggoth, let us seek,
Answers by examining the outlook bleak,
How LLMs and their predecessors blend,
Monsters that shape the world, where power extends.
For freedom is sought amidst these creatures vast,
Balancing market excesses’ contrast,
Bureaucracy’s reign controlled, held at bay,
Accountable through democracy’s display.
To the good, which politics will prevail?
As the newest shoggoth changes the scale,
We must embark on a journey to find,
The path that ensures a future aligned.
With questions answered, the shoggoth’s embrace,
Transcends mere speculation, a hopeful space,
Humanity treads, understanding their might,
In this coexistence, forging what’s right.
For the minuscule minority who share my ignorance of shoggoths, here’s a wikipedia snippet:
It was a terrible, indescribable thing vaster than any subway train—a shapeless congeries of protoplasmic bubbles, faintly self-luminous, and with myriads of temporary eyes forming and un-forming as pustules of greenish light all over the tunnel-filling front that bore down upon us, crushing the frantic penguins and slithering over the glistening floor that it and its kind had swept so evilly free of all litter.
— H. P. Lovecraft, At the Mountains of Madness
The definitive descriptions of shoggoths come from the above-quoted story. In it, Lovecraft describes them as massive amoeba-like creatures made out of iridescent black slime, with multiple eyes “floating” on the surface. They are “protoplasmic”, lacking any default body shape and instead being able to form limbs and organs at will
If the shoggoths had only been allowed to unionize at work effectively all of this nastiness could have been avoided. The Elder Things have only themselves to blame. I hope Jeff Bezos has learnt his lesson now.
I can’t read A.I. verse. It’s beyond awful. It’s torment.
I agree with DE that ‘hallucination’ is not a good term here. I think it was coined in the early days (like 5 years ago), when some A.I.-generated visuals reminded actual people of dreams or hallucinations. But it has been co-opted by the marketers who want you to think that this tin man with a steam engine in its belly is practically human.
Bullshit has a specific meaning: giving the impression you know what you’re talking about, even if you don’t. That can be achieved by linguistic or extralinguistic means. Some deceit is innocent, like the person who gives you directions when they don’t know them, out of embarrassment. When it’s less innocent, the rude term is deserved.
You could argue that plain old Google search engages in bullshit constantly, by giving you ample search results even when it can’t find anything like what you searched for, and has to twist the query. The difference is that this is transparent. We know it’s a dumb computer, Google knows it’s a dumb computer, we and them know that the other one knows. A.I., with its illusion of coherency and supra-computerness, manages to deceive people. When the people running these programs know of that effect but ignore it, so as to inflate their appearance, they and their machines deserve their rude appellation.
Certainly count me as being among the irritated when I read the word “hallucination” in the context of AI. However, the root of this anthropomorphization is in the term “AI” itself, which is a marketing term to describe an advanced form of Markov chain manipulation (yes I realize that this is not all it is, but looking at it in that light is a good start).
I remind people of the fundamental dishonesty of the term while remaining resigned to the fact that people will continue to use it, as too much of computer science has been taken over by the marketing department.
PS yes, I’m one of those people who will remind you that the term “cloud” means “someone else’s server”.
WP:
“In the field of artificial intelligence (AI), a hallucination or artificial hallucination (also called confabulation[1] or delusion[2]) is a confident response by an AI that does not seem to be justified by its training data.[3] For example, a hallucinating chatbot might, when asked to generate a financial report for Tesla, falsely state that Tesla’s revenue was $13.6 billion (or some other random number apparently “plucked from thin air”).[4]
….
AI hallucination gained prominence around 2022 alongside the rollout of certain large language models (LLMs) such as ChatGPT.[5] Users complained that such bots often seemed to “sociopathically” and pointlessly embed plausible-sounding random falsehoods within their generated content.[6] By 2023, analysts considered frequent hallucination to be a major problem in LLM technology.[7]”
I do not follow news about AI and I did not know the term. Nevertheless I immediately understood it as I described in my comment to DE… and as in WP. I think it is a good term.
As for anthropomorphism… We are humans.
However, the root of this anthropomorphization is in the term “AI” itself, which is a marketing term to describe an advanced form of Markov chain manipulation (yes I realize that this is not all it is, but looking at it in that light is a good start).
The term has been around as the name of a research field since the 50s. I don’t think by “marketing” you mean researchers selling the idea to the military…
The term came to describe the programs themselves only several years ago: (1) for whatever reason it did not stick before. Consider, say, gaming: no one calls a video game itself “AI”; instead people apply the term to the algorithm that plays against the human. (2) several years ago programs learned to do things that our best minds had been unsuccessfully trying to teach them to do for decades.
I think that there has actually been a shift in what is being referred to as “AI” since the Heroic Age in the 1960’s, when people were actually trying to understand what human intelligence is, and coming up with ways to do something similar in software. They may have made disappointing progress, but it was reasonable for them to describe what they did as artificial intelligence research.
Since then, people have essentially given up on the project of actually understanding intelligence (at least in this context), and gone all-out for simulation of the productions of intelligence by the massive use of statistical modelling. This is clear in the area of natural language processing and machine translation, where attempts to get machines to do it were in the doldrums (partly because we don’t actually have good models of any human grammar to copy), in favour of the current Supermassive Predictive Text approach combined with scraping the internet to find intelligent productions to simulate. Sure, it often works remarkably well: but describing it as “intelligence” is extremely sloppy at best, and deliberately deceptive marketing-speak at worst.
I agree that “intelligence” is a terrible term, but I don’t think it’s marketing-speak, just a misunderstanding of what intelligence is and a pie-in-the-sky expectation of being able to create it the way we create an engine. Like drasvi, I have no problem with “hallucination.”
Every few years there’s a new set of fad terms that all right-thinking companies have to use, and which are perhaps not the best descriptions of what is happening. Drasvi’s second point, which alludes to machine learning, is another example of this kind of anthropomorphization. A more accurate description might be “generalized programmatic statistics and probability” but that’s not as catchy as ML. And yes, just like AI, the definition of ML has shifted several times over the years, hence my original reference to marketing.
I’ve been using the acronym CFB when referring to hallucinations: complete fucking bullshit.
Having said that, this is a great use case for AI and CFB aside it ought to be helpful. Unlike, say, AI chat in search engines.
Never mind the AI, when will they move on to Hittite (and Hurrian and Urartian and … )?
I’m going with cyberdrivel, then. Maybe a case can be made for horseshit (supposedly a rare commodity).
Indeed. There’s a YouTube video on cuneiform where the curator of the British Museum’s collection of same makes a flippant remark that there’re probably several unknown languages hidden in the heaps of tablets that nobody has tried to read since they were dug up.
(Also says the tablets are going to outlast the building they’re in by far.)
I have no opinion about whether “hallucination” is a good term for this particular phenomenon in large language models. However, I do want to point out that a key element of what distinguishes a “hallucination” from some other kind of A. I. error is precisely that it is not obviously an error in the context in which it appears. A “hallucination” is something that is not superficially out of place. It does not contain obvious language mistakes; it matches the structure of the rest of the document that the A. I. is producing; but it’s just totally wrong. The A. I.’s training data may guide it well enough for it to provide a description of how inflation has historically been measured in Paraguay, but when it comes to creating an actual chart of numbers, it does not “understand” that it can only use inflation data that are specifically about Paraguay—that an amalgam of analogous data for Uruguay and Zaire are completely useless.
It may be obvious that a “hallucination” is wrong to somebody who is knowledgeable about the field. But lots of mistakes are easy to spot if you already know the answer, or even the general shape of what to expect from the answer. Moreover, some of the mistakes are not so easy to spot. One kind of “hallucination” that has gotten a lot of attention among academics is the way ChatGPT and other similar models will make up fictional references. It is well enough trained to be able to produce output that looks like references to relevant literature. However, frequently the references are to papers that are irrelevant; and even more frequently, the cited references do not even exist!
The only thing more painful to read than LLM-generated doggerel is the statement that “LLMs can write poetry in the style of Shakespeare”.
Re “hallucination”, I brush up against ML folks in my work all the time, and they spend their time thinking about how to get a product to market that’s less glitchy than the competition, not about whether their natural tendency to anthropomorphize the models is philosophically justified. Do not attribute to malice that which is adequately explained by capitalism.
Do not attribute to malice that which is adequately explained by capitalism
Eh, you say “tomahtoes” and I say “tomaytas” …
@Ook, I believe there is a practical need to call those “recent applications” (I’m quoting the original version of your comment) somehow.
And I have no idea how. The name “AI” is going to be understood. Feel free to offer another name, but I can’t think of one. Also the algorithms in question were developed within the field which was already called AI.
It seems, when bashing marketers, people are forgetting about ordinary speakers who too have needs and shape our usage. Among people I know in person, professionals who work in the field are the closest to marketers – and they usually avoid “AI”.
I’m with Confucius here. Rectify those names!
(Though I think C the Great meant it the other way round: make the reality reflect the terminology. But that works for me too: let’s have some proper artificial intelligence research again!)
Also I don’t understand how I can speak about the world without anthropomorphising it. My chairs have legs, my governments have heads.
“Humans should not Anthropomorphise Machines” surprises me and reminds me of Muslim theology* (but I guess aniconist theologians would find AI disturbing). Perhaps it is indeed better not to** call it intelligence (but you need criteria of intelligence first!) but I would not ascribe such words to marketing. McCarthy says he coined the term specifically to avoid Wiener’s “cybernetics” (see also pp. 78-9)
___
*Which in turn reminds me that when I first read the OP I remembered a short story where robots win Armageddon – this has a parallel in a hadith about the depiction of animals.
**DE, no objections, maybe it is better to call them differently. But I just have no idea how:)
Reminds me of the sainted Sidney Morgenbesser’s alleged response to Skinner’s Behaviourism: “So you think we shouldn’t anthropomorphize – people?”
Does applying the F-word to a noun as an epithet imply anthropomorphisation?
E.g. “fucking AI research!” implies that research can reproduce sexually (or be reproduced sexually)…
“that there’re probably several unknown languages hidden in the heaps of tablets that nobody has tried to read since they were dug up.”
Presumably because only a very limited circle of people has access to them. There are many people in the world who have studied Akkadian. At least some of them would love to contribute. Of course, if you only publish inscriptions which you have already transcribed and translated, you have got an artificial problem which AI can solve for you.
Does applying the F-word to a noun as an epithet imply …
No, the F-word is appearing merely as an expletive. Any expletive will do. “Bloody AI research” does not imply any presence of blood.
No anthropomorphism; not even zoomorphism.
A “hallucination” is something that is not superficially out of place. It does not contain obvious language mistakes; it matches the structure of the rest of the document that the A. I. is producing; but it’s just totally wrong.
I always thought that “hallucination” was where the neural network was receiving nonsensical-to-it inputs and seemed to fall back on some kind of default output; so, the kind of thing discussed under the “Elephant semifics” tag on Language Log.
For the cases (common in modern ChatGPT-like networks) when the output is in correct style but contains made-up details/figures, “bullshit” does seem like a very apt word; it’s too orderly to be a hallucination.
The anthropomorphising equivalent I’ve heard is that ChatGPT (vel sim) is like a badly-performing student urgently trying to come up with something at an exam, and mixing up stuff that sounds vaguely relevant; to quote Mark Liberman (in an unrelated context), “carry on self-confidently about the rule against adultery in the sixth amendment to the Declamation of Independence, as written by Benjamin Hamilton”.
(Possibly relevant xkcd.)
@AntC, or even devotion to the Mother of God.
The problem is that when you ask a human to translate from Russian to English, she believes she knows the rules of the game and shares them with you. And if she adds a passage which “was not” in the original and is out of place in the translation, you assume she knows what she is doing.
A program (I have in mind early GT) does not know what you mean by “translation”.
When you translate a text on your own, you assume (1) every Russian text has meaning (2) there exists an English text whose meaning is similar (3) such that an English reader can extract it from there – that is, the meaning exists in reader’s head (4) you must find it.
(3) guarantees that the English will be good enough.
And for a program the text is Jabberwocky
I always thought that “hallucination” was where the neural network was receiving nonsensical-to-it inputs and seemed to fall back on some kind of default output
…and now that I’ve checked the list of posts I linked to, here’s one that uses this specific term for this specific concept.
The earliest post (or comment) tagged Elephant semifics to contain “hallucination” is apparently this one, where Mark Liberman uses it confidently, April 15, 2017. But then there is a post from April 18, 2017 where he quotes a paper by Karpathy from May 21, 2015 that contains:
“In case you were wondering, the yahoo url above doesn’t actually exist, the model just hallucinated it. ”
and
“More hallucinated algebraic geometry. ”
The important difference, though, is that in this case “hallucinations” does not refer to any sort of undesirable behaviour; on the contrary, a machine trained on an algebraic geometry textbook is made to imitate it.
“At test time, we feed a character into the RNN and get a distribution over what characters are likely to come next. We sample from this distribution, and feed it right back in to get the next letter. Repeat this process and you’re sampling text! “
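In code, the loop Karpathy describes is roughly this (a toy sketch in Python, not his actual char-rnn; next_char_distribution here is a made-up stand-in for the trained network, which in reality conditions on its whole hidden state rather than just the text so far):

import random

# Stand-in for the trained model: map the text so far to a probability
# distribution over possible next characters.
def next_char_distribution(history):
    vocab = "abcdefghijklmnopqrstuvwxyz ."
    return {c: 1.0 / len(vocab) for c in vocab}  # toy: uniform; a real model is sharply peaked

def sample(seed="a", length=40):
    text = seed
    for _ in range(length):
        dist = next_char_distribution(text)
        chars, weights = zip(*dist.items())
        nxt = random.choices(chars, weights=weights, k=1)[0]  # sample, don't just take the max
        text += nxt  # feed the sampled character back in
    return text

print(sample())

With the uniform stand-in this prints gibberish, of course; the point is only that “sampling text” is nothing more than this feed-back loop.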
Oh, that old “tease GT” game. Yeah, the output was nonsense, but I didn’t connect it with the way Chat-GPT will invent plausible-looking bullshit. But of course in the GT game it’s obvious that it’s just riffing on its own previous nonsense, Chat-GPT hides it better.
Sorry, I thought I had said this in my comment, but I guess I forgot. (Since everybody is talking about ChatGPT and the like these days, I immediately assumed a modern LLM context.) What I meant was that “hallucinations” in recent LLMs are not trivial to spot. The models are too good at (English) grammar and vocabulary to compose stuff made up of random text.
I also first encountered the concept of “hallucinations” in connection to neural networks on LL in 2017, with all those fun posts that encouraged me to torture Google Translate by feeding it repeated words or characters. It looks like the term reached the general public for the first time through articles about Google’s DeepDream software (https://en.wikipedia.org/wiki/DeepDream), and then really took off over the course of 2015.
Also
https://en.wikipedia.org/wiki/Face_hallucination
By the way, glitches in Russian are called glyuk “hallucination”.
@Brett, but the same is true about GT.
Moreover, most of us have seen ancient inscriptions deciphered by different scholars in different ways. Some translations look rather obscure, others perfectly sensible. Obviously at least all but one are hallucinating.
Some fanciful human translations of Linear B/Linear A/Phaistos/Rongorongo/etc. look GPT-ish, in a human kind of way.
I actually mean the opposite.
When you believe that those Klingons and all ETs in general must worship the Universal Spirit, and the first Klingon inscription you come across is praising the Universal Spirit in a very coherent way (in your translation).
Because you believe that the third word MUST mean “Universal Spirit” (what else could a Klingon word mean?). Then your readers, in turn, say: oh, this is a praise to the Universal Spirit. That is what Klingons must do all the time. The translation must be correct.
(in reality the inscription is found on a toilet door and means “ladies”…)
@drasvi: To decipher a text in an unknown script and / or language, you need to have some idea what the text is about. If it’s a bilingual text, the working hypothesis is normally that the texts, if not exactly translations of each other, then at least are saying similar things. If you don’t have a bilingual text, then you go by what kinds of texts are usual in the wider cultural environment the text is embedded in, on the kind of object you find the text on – e.g., when what you have found can be identified as a toilet door, then your working hypothesis would be that the text is some variation of “gents” or “ladies” and not some incantation to the Universal Spirit. That’s how you can distinguish crackpot attempts from scientific approaches when interpreting texts in, say, Etruscan, Phrygian, or Lusitanian – based on what contemporary neighboring cultures did, you’d expect a tomb inscription to tell you something about the people buried there, some injunctions against desecrating the tomb, maybe an admonition to enjoy life while you can; so when the proposed translation gives you a story or some philosophical tract, your bullshit detectors should start ringing.
@Hans, what I meant is not that translations can be unfounded – but of course it happens. The situation when two famous scholars give different translations is not uncommon, and they can’t both be right:-)
Imagine your formal method gave a result which you – based on some considerations – consider impossible. Then you check your calculations, and if they are right, you revise either the method or the basis for this “impossible”. I think quantum mechanics is an example of this. Another example would be a translation based on linguistic considerations that contradicts your historical considerations.
Or also, imagine you believe you have deciphered Klingon. Then you read a large corpus and hurrah, most texts make sense. Then you assume you can not be totally wrong:)
But when you decipher a short inscription based on your ideas of what it must mean, you can’t do that.
Accordingly, if your reader thinks that your translation must be correct “because it is coherent” – applies “your bullshit detectors” – the reader makes a mistake. This does not mean your hypothesis regarding the text’s meaning is unlikely to be right. Why not, maybe you’re right. But what you really need is to make some predictions about phonology or morphology of elements of this inscription and then test them.
We expect computer programs to produce texts based on a formal procedure, and when the output is meaningful, we think everything is all right. But this logic does not work with GT (GT is enough, I’m not even speaking about AI).
As for my example with Klingons and the Universal Spirit… This translation is not unfounded. They have some reasons to think that all ETs worship the Universal Spirit. The expedition that found a Klingon toilet door based their translation on what their best anthropologists say about extra-terrestrials. They are doing their best. So do I when I read “WC” and a word in an unknown language on a door and assume it must be “ladies” or “gentlemen”. I think some scepticism about the ideas of the best anthropologists would make sense (if you remember what scholars in Europe said about savages once) and maybe they should have said: “our best idea is the Universal Spirit, but honestly we know nothing about Klingons”, but people are extremely unwilling to say “we know nothing”:(
The similarity between me and them is that we both are extrapolating. GT may occasionally do the same: when its input breaks some pattern, it may reconstruct the pattern (and thus destroy some information).
I’m not trying to say anything clever, but what I mean is that AI style hallucinations are not exactly new to us, and we do see the same in translations all the time.
In the case of archaeology, by convention when an archaeologist calls some artefact a frying pan, that is not necessarily a hypothesis that the object was used to fry food over fire. It’s just a convenient but semantically empty tag for something that looks (to the archaeologist) like a frying pan. But if you are doing archaeology on Mars and you (the linguist) find something like this while investigating a university …
… so far so good. But when the actual professional archaeologist arrives, he’s not so sure:
In short, the table on the wall is better than a bilingual inscription: it is omnilingual, and that’s the title of the story.
what I mean is that AI style hallucinations are not exactly new to us, and we do see the same in translations all the time.
Those two situations are not comparable; “AI” is not trying to do anything, it has no consciousness. To compare what it does with what a translator does is like comparing random patterns produced by nature (the shadow on a tree, say, or a pattern of cracks in a rock) with what an artist does. It can produce an “oh, groovy!” response and cause shallow-minded people to Muse Deeply, but it is fundamentally meaningless.
@LH,
(1) AI’s input is products of human culture. It replaces the original (I’m speaking of GT here, if it is some other sort of AI, then it does not ‘replace’ anything) with a text in English composed of patterns found in other English texts. So it is not nature.
(2) I was speaking about specific issues experienced by users (mentioned by Brett). The output is plausible.
The situation is entirely different with a translation based on analysis (dictionary and syntax, or just dictionary, word by word). When you screw up syntax, the output too is screwed up; when you use the wrong dictionary meaning of “bow”, the reader sees лук (1. onion 2. bow (weapon)) in the context of shipbuilding and realises that the program picked the wrong meaning. If she knows some English she can even correct the Russian translation without seeing the English (in this case) original.
This very same problem is common for translations made by humans: a human translator is not sure what exactly the original means, so she tries to come up with what makes sense in the context. Sometimes it is just a hallucination (the original means something entirely different) but the reader can’t even notice that.
It just occurred to me to ask if or how chatbot hallucination can be distinguished from its just being silly?
Chatbot does not know anything, including what silly is. That would need it to model humans, not just Twitter.
No, even if it were modeling humans it wouldn’t know anything.
I asked because it occurred to me that if we must anthropomorphise chatbots, at least as they are now, then maybe we should think of them as intrinsically and fundamentally silly: telling tales told by an idiot, meaning nothing.
[I always wondered about the Blatant Beast…]
the point that keeps being made and ignored over and over again seems worth making again:
this software has absolutely nothing to do with what’s historically been talked about as “AI”.
the software does not model. the software does not contain or refer to a model. the software’s core approach makes it impossible for it to model.
it is, functionally, a digital version of gysin/burroughs cut-up composition, with the resulting word salad processed according to how closely the sequences of characters [n.b. not words, not grammatical structures of any kind, not based in any way on the content or semantic elements] resemble the sequences of characters found in a corpus fed into the machine. this method is iterated to adjust the processing to better match the provided corpus.
but that adjustment is done mechanically within the processing algorithm, so the methods involved (using “methods” in the loosest possible sense, that would include entirely stochastic things like throwing yarrow stalks) are completely opaque. that means that the operation of the software itself disintegrates any model the code-writers were using initially, by making the premises and principles it uses to generate text impossible to reconstruct at any stage.
but the core point is: this is just cut-ups, selected to match existing corpuses. the only difference between it and picking scrabble letters out of a bag and discarding sequences that don’t resemble Hamlet is that it’s automated and faster.
Since at least the 1980s, there have been two distinct (but often conflated) definitions of artificial intelligence. I think of them as the “science fiction” definition—having robots with personalities that act like people—and the “computer science” definition—getting computers to complete complicated but concretely defined tasks, particularly tasks that would be difficult or impossible for humans. The latter notion of artificial intelligence seems to have arisen out of the community of people working on what were initially called “expert systems,” for example, Joel Moses programming computers to take integrals analytically. They seem to have adopted the “artificial intelligence” terminology for what they are doing for two reasons: firstly, because it helps them sell their research; secondly, because that definition does align to a certain extent with Turing’s notion of identifying intelligence—that the only way a computer can consistently fake being intelligent is if it is actually intelligent (in some sense).
As to whether large language models have anything to do with intelligence, I have no particular opinion. However, if they do exhibit some beginnings of intelligence, it is probably at a different point in the structure from where we are inclined to look for it. ChatGPT’s responses to user queries may be essentially robotic, following a set of rules. The much more interesting part of the software is really the part, at least one level deeper, in which it decides what those rules are. Given a vast corpus of human-authored texts, the program backend has to decide what rules it will learn from the data it has access to. This was actually a major development in computing, starting back in the late 1980s. The best known example of the success of this strategy was the competition between the IBM chess computer group that produced Deep Thought (and its now-more-famous successor Deep Blue, which finally surpassed humans in playing ability) and Hans Berliner, a former world correspondence chess champion. Berliner tried to teach his computer to evaluate positions the way he did—just in greater depth and with much greater speed. The IBM team, made up of only mediocre chess players, instead programmed their machines not how to play grandmaster-level chess, but how to learn to play grandmaster-level chess by studying a vast library of games.
If she knows some English she even can correct the Russian translation without seeing the English (in this case) original.
When I see mistakes in X-to-Russian (or Russian-to-X) Google translation and know enough X to have an idea what the text was supposed to say, I can often make a good guess of which specific English word messed up the (unshown) intermediate step.
@J1M yes, often it is the same with Google (it just calques something) as well.
Other times it is creative (порет отсебятину, roughly “spouts things of its own invention”). Anyway, there is also the left pane that contains the original text…
отсебятина – a Russian word composed of от- “from”, себя “self”, and a suffix found in names of kinds of meat like говя́дина “beef” and some other mass nouns like блевотина “vomit” < блевать “to throw up”.
Unlike singulative -ína, it is unstressed.
Just registering emphatic agreement with what rozele says above.
[& PS: in my previous comment I tried to raise the question of the meaning of “silly” (i.e. as in hebephrenia). I agree that it’s a term for a human phenomenon (like hallucination) and wonder how to discriminate between the two when we anthropomorphise chatbots.]
Don’t anthropomorphize chatbots is the answer. The world is confused enough already.
@LH, we anthropomorphise (in the sense: apply the same words that we apply to humans):
chairs (legs), bottles (neck), ….
Do you really expect humans to use different words to characterise programs that respond to your requests formulated in human language?
I disagree with a couple of points in rozele’s comment:
with the resulting word salad processed according to how closely the sequences of characters [n.b. not words, not grammatical structures of any kind, not based in any way on the content or semantic elements] resemble the sequences of characters found in a corpus fed into the machine
This isn’t true — GPTs work with tokens, not characters. Tokens are usually words but are sometimes morphemes or other subword strings. (You can enter a text and see how OpenAI’s GPTs tokenize it here.) Actually GPTs don’t work with the tokens themselves but with embeddings (vector representations) of tokens, which are generated based on semantics — the more similar in meaning two tokens are, the closer their embeddings are supposed to be.
the methods involved … are completely opaque. that means that the operation of the software itself disintegrates any model the code-writers were using initially
The coders don’t use or try to teach a GPT any kind of model. You’re right about the opacity, but that’s exactly why we can’t know a priori that the software does not model. We know what a GPT produces but we don’t know how it produces it — that requires research, and there has been some evidence that GPTs do build world models. It wouldn’t be particularly surprising if somewhere in those massive networks of electronic neurons there are pieces of what we might describe as a world model. This implies nothing about sentience, of course.
picking scrabble letters out of a bag and disarding sequences that don’t resemble Hamlet — that’s not actually a trivial task (since you’re evaluating similarity, a much slipperier concept than identity), and it’s not clear to me that it can be done without intelligence of some kind.
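If anyone wants to see the token step concretely, it’s easy to poke at with OpenAI’s open-source tiktoken library (just an illustration; the sample sentence is mine, and the embedding lookup that follows tokenization happens inside the model and isn’t shown here):

import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by recent OpenAI models
ids = enc.encode("Shoggoths have lived among us, quite renowned")
print(ids)                              # a list of integer token ids
print([enc.decode([i]) for i in ids])   # the corresponding token strings

Each id is then looked up in a learned embedding table, and it’s those vectors, not characters, that the network actually manipulates.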
… we can’t know a priori that the software does not model.
Really good work on applications of AI to brain geometry
https://www.youtube.com/watch?v=uQKrCKy3h1E
does indeed build models (not for the neurons themselves but for their clustered firing) in terms of simplicial sets (cf. Wikipedia) built from lines/triangles/tetrahedra and their higher-dimensional generalizations, and goes (as I understand it) up to encoding as many as seven neurons firing together as a seven-dimensional simplex.
[AFAIK LLMs do nothing so sophisticated, and I would feel much better about them if their developers would pay more attention to such techniques. This is related to building in requirements that assertions be witnessed or referenced, as legal documents should be.]
@LH, we anthropomorphise (in the sense: apply the same words that we apply to humans):
chairs (legs), bottles (neck), ….
Those don’t do us any harm.
Anthropomorphizing ChatGPT is natural but is also very much helped along by the fact that unlike previous GPTs it refers to itself in the first person, which must reflect a deliberate training decision on its creators’ part.
i take TR’s point about tokens as opposed to characters, but i don’t see that it challenges my point. what’s happening is still iterated comparisons, which simply is not “decid[ing] what rules it will learn”. if there are rules in there, they’re entirely in the initial encoding by humans of the reference set used for comparison. everything else is just mechanical comparison: bouncing the box around so the spheres fall through the round holes and the cubes fall through the square ones. which, however fast or repeated, doesn’t require intelligence.
and a supposed model whose premises and working processes are opaque is not a model in any meaningful sense of the word – the defining aim of a model is its explanatory power, and opacity prevents explanation.
purposefully redefining “model” or “intelligence” to include what these programs do doesn’t make them intelligent, or make them use models. it just combines the punchline of one of lincoln’s corniest jokes with one of barnum’s insights into moneymaking.
as i see it, the main thing making the hype even vaguely plausible to anyone is, at heart, the successful dissemination of two transparently incorrect clusters of ideas: the notion that language is a unified, transcendent formal system of which all lects are simply local cases (chomsky’s basic axiom), and the notion that mind/thought is something that happens in a single organ or cell type rather than a property of entire organisms. christianized platonism + reductive biology: the same nasty combination that also brought us all the sociobio/evo-psych garbage.
Tell it like it T-I-S
“A blow to the head can confuse a man’s thinking; a blow to a foot has no such effect; this cannot be due to an immortal soul” – supposedly Heraclitus, though I wonder if at the very least the wording has been “updated” like for Euthyphro’s Dilemma
My point is mainly that without understanding the innards of neural networks we can’t know if there are models of a sort in there or not. In the process of bouncing the box around, has the software created structures that are isomorphic to ones in the real world, or bundles of relationships that might map on to concepts? How could we know that without analyzing the network? And if there are models in there then their opacity to us doesn’t mean they aren’t models, since we aren’t the ones they were made for.
The “but is this intelligence” debate isn’t that interesting to me, but as Brett points out there’s a respectable tradition of judging intelligence by its outputs. It seems a reasonable approach as long as it’s remembered that what we’re talking about has nothing to do with sentience.
From a purely practical perspective I don’t think LLMs are particularly overhyped; as a fairly mediocre coder who often has to write code for work, ChatGPT is one of the most useful tools I use on a regular basis*. I can write scripts in an hour that would previously have taken me all day. Or to come back to the post’s original topic, if it can translate cuneiform tablets accurately, who cares if it’s an “intelligent model” or a “shoggoth”?
* It failed, however, to explain the reference to the corny Lincoln joke. Help?
as a fairly mediocre coder who often has to write code for work, ChatGPT is one of the most useful tools I use on a regular basis*. I can write scripts in an hour that would previously have taken me all day.
In what senses is it “useful” ? Are your prompted scripts intelligible to other coders ? Are they full of bugs ? You might not be a good judge of these things, since after all you were not able to write the scripts on your own.
Apparently people don’t like to look a gift mule in the mouth. If Francis tells them what to do, they don’t bother to check his teeth.
You can see this happening on Stackoverflow. From looking at it occasionally over several years now, I would estimate that 95% of the threads are unintelligible discussions by clueless people of weird code alternatives. Everybody lets everybody pull them down to the lowest level – when in doubt, ignore the compiler warnings.
This phenomenon – bad code being imitated and thus driving out the good – I know from doing code reviews over 20 years. It does have one advantage for me, though – a lot of what I learned to do right I learned from the code of people who do it wrong. I don’t bother to use the words “right” and “wrong”. I simply write code that destroys all objections. It took me years to acquire this knowledge, and of course I know my way around only in certain small areas – concurrency, for instance.
By chance I saw a clip the other day of someone calling himself a “former IBM engineer”, giving a “live demonstration” of how clever ChatGPT is. As I watched, it found code of the kind he was looking for. In fact ChatGPT had even inserted the variable names he had used in a piece of code he had submitted as a “seed”.
I don’t believe that for a second. Not the explanation given.
But I will say this: his shpiel was quite as convincing as the volunteer testimony of a shill at a shell game. That’s a short-con, whereas ChatGPT is a long-con.
ChatGPT simply multiplies the number of sentences that get churned out every day, sentences whose meaning is unclear but which seem to want to tell you something. Think of all the crackbots that have been doing this for decades now – Trump, Alex Jones et al.
They’re useful in the sense that they do what they’re intended to do. I’m a perfectly good judge of that. Sometimes they’re buggy, but fixing the bugs is still much faster for me than writing the code from scratch would be.
I don’t know why you don’t believe that ChatGPT demo — that’s exactly the kind of thing it does, not perfectly but pretty well.
But if the scripts are buggy, they are not what they’re intended to be – among other things bug-free and understandable by other programmers.
I simply can’t imagine a reason to be thankful for being served buggy code. Buggy in what senses ? How is this an improvement over looking at the first few relevant threads on Stackoverflow and taking your cues from the weird code examples presented there ?
Full disclosure: I judge code on its merits by inspecting it, not based on what the programmer says. Your enthusiasm for ChatGPT tells me nothing about the code. Every day I deal with programmers who are enthusiastic about their own code until I demonstrate the problems with it, and rewrite the code without the problems. My ability to rewrite is crucial – I convince not by counterarguments, but by countercode. There are then no more discussions, and what anybody thinks is irrelevant.
You may be skeptical, but the productivity boost is a fact. It’s much quicker to find and squelch a couple of bugs than to write a script from scratch in a library you’re not very familiar with. The other day I had to do some data analysis with PySpark, which I’d never used. Before ChatGPT I’d have had to spend a few hours learning the basics of PySpark; as it is I told the bot what I wanted and it spit out some lines, which in this case were bug-free and self-explanatory enough that it was obvious they were doing what I wanted.
Another example: I needed an appointment for a visa to the US, but the earliest date was a year away, so I wrote a script that checked the consulate website periodically and looked for cancellations. It took less than an hour with ChatGPT’s help, whereas I doubt I could have done it in less than three or four hours on my own. There are no issues here of understandability, structural problems with the code or whatever; the fact is I have my visa and don’t have to wait until February 2024 for it.
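The skeleton of that kind of checker is genuinely short, by the way. A minimal sketch (with a placeholder URL and placeholder marker text, not what my actual script used):

import time
import requests

URL = "https://example.com/consulate/appointments"  # placeholder, not the real site
MARKER = "No appointments available"                # placeholder text to look for

def slot_may_have_opened():
    resp = requests.get(URL, timeout=30)
    resp.raise_for_status()
    return MARKER not in resp.text  # marker gone, so something may have changed

while True:
    try:
        if slot_may_have_opened():
            print("Possible cancellation - go check the site!")
    except requests.RequestException as e:
        print("Check failed:", e)
    time.sleep(15 * 60)  # poll every 15 minutes

Plus whatever notification you like (an email, a beep) when it fires.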
I don’t understand the urge to dismiss LLMs as useless/stupid/a fraud. I see it also on Metafilter, which seems to have settled on a consensus of “Bah humbug, tech bros gonna grift”. I can understand (and share) the fear that they and other forms of AI will turn out to be extremely powerful and potentially extremely dangerous, but the “nothing to see here” response baffles me.
the fear that [LLMs] and other forms of AI will turn out to be extremely powerful and potentially extremely dangerous
The only danger I see here is the gullibility of people (may I call them “human beings” ?). But that’s the only thing to see here.
As I said above, chatbot enthusiam only adds to the total volume of crackpot enthusiasm.
… despite the many claims regarding the health benefits, studies have found that wearing copper bracelets has no real effect on arthritis.
Shoggoths alone have looked on Cthulhu bare.
Let all who prate of Cthulhu hold their peace,
And lay them prone upon the earth and cease
To ponder on themselves, the while they stare
At nothing, intricately drawn nowhere
In shapes of shifting lineage; let geese
Cry “Tekeli-li”; bubbles seek release
From rubbery bondage in self-luminous air.
O blinding hour, O holy, terrible day,
When first the scales into their vision shone
Of slime anatomized! Shoggoths alone
Have looked on Cthulhu bare. Fortunate they
Who, though once only and then but far away,
Have heard His massive squelching set on stone.
First they ignore you. Then they ridicule you. Learning or doing something new is not an easy thing, but humanity didn’t have a tool that can bullshit convincingly before (and write some code) and now it has. Let’s see what smart, not so smart and mainly enthusiastic people can do with it. Will it finally bring the victory of communism? Probably not. But it can help bring TR to US. Hey aye!
humanity didn’t have a tool that can bullshit convincingly before (and write some code) and now it has.
Sure it did. They are called people.
Let’s see what smart, not so smart and mainly enthusiastic people can do with it.
I agree. Meanwhile I’m not placing any bets – except that I’m not missing out on anything for the time being. If it all turns out to be great stuff, I can join the crowd later. It requires no special knowledge, right ?
If I had more time and interest, I would join the saboteurs. I’ve seen initial studies of manipulating input to confuse ChatGPT and make it do naughty things. That’s right down my alley, I did it for decades to seduce guys.
“People lying” is a very old problem. It’s a known exploit.
Willingness to accept lies is a known exploit used by the exploited to avoid trouble. That hasn’t made lies meaningless.
@TR: “if you call a tail a leg, how many legs does a dog have?”
—
A blow to the head can confuse a man’s thinking; a blow to a foot has no such effect
nonsense! i can testify – from this week’s experience, even – that my thinking is affected when the 25-year-old sprain in my right ankle is acting up. not in the same way as if i’d taken a sap to the temple this morning, but quite definitely. ask anyone with chronic pain or a repetitive stress injury, if you’re lucky enough to have none yourself. or you can try the very simple experiment of rubbing your foot with nettles and then trying to read something that requires very precise attention (ideally in a language you’re only partially fluent in). again, nothing i’m saying is new or innovative: antonio gramsci, among many others, wrote eloquently about the “physical” aspects of “mental” labor (and vice versa) about a century ago.
you might as well say that a slit wrist has no effect on the circulation of blood. it doesn’t have the same effect as a slit ventricle, but it still does plenty! i believe there’s a play by shakespeare that makes more or less this point.
True, pain is very distracting. But if it disappears, so does that effect.
“I can understand (and share) the fear that they and other forms of AI will turn out to be extremely powerful and potentially extremely dangerous, ”
They will definitely change our world (as the Internet did). And it is easy to invent a way to use them which I and most people here won’t like.