Part of Simon Willison’s Catching up on the weird world of LLMs (Large Language Models) is about language, which makes it Hattic material; a great deal of it is about coding, which is Greek to me but of interest to a lot of Hatters, so it’s worth posting for that as well. Consider it also as a public service message — I draw your attention in particular to the “Prompt injection” section at the end. It’s written so clearly and conversationally that even I was able to get a lot out of it. Here’s a passage with some good stuff:
I’ll talk about how I use them myself—I use them dozens of times a day. About 60% of my usage is for writing code. 30% is helping me understand things about the world, and 10% is brainstorming and helping with idea generation and thought processes.
They’re surprisingly good at code. Why is that? Think about how complex the grammar of the English language is compared to the grammar used by Python or JavaScript. Code is much, much easier.
I’m no longer intimidated by jargon. I read academic papers by pasting pieces of them into GPT-4 and asking it to explain every jargon term in the extract. Then I ask it a second time to explain the jargon it just used for those explanations. I find after those two rounds it’s broken things down to the point where I can understand what the paper is talking about.
I no longer dread naming things. I can ask it for 20 ideas for names, and maybe option number 15 is the one I go with. […]
Always ask for “twenty ideas for”—you’ll find that the first ten are super-obvious, but once you get past those things start getting interesting. Often it won’t give you the idea that you’ll use, but one of those ideas will be the spark that will set you in the right direction.
It’s the best thesaurus ever. You can say “a word that kind of means…” and it will get it for you every time.
An important bit that he mentions in passing: “they don’t guess next words, they guess next tokens.” These models don’t know anything about words or meaning; they just predict the next token. Which brings me to what is to me a very basic and important point. I got this via MetaFilter, where one user commented:
But, is that different than me? My words aren’t numbers, but they are squeaks and hoots and grunts that, when strung together, have meaning. As I read this section, I swung between “it’s fake” and “I’m fake”.
And another said “that applies to a lot of people as well.” No! Stop thinking like this, people! I know it feels edgy and cool, but it reinforces an already too common tendency to degrade people’s humanity. Saying “how do I know I’m not a Markov chain?” is like saying “How do I know I’m conscious?”: it’s stupid and self-defeating. The world is hard enough to decipher without pulling the wool over our own eyes.
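For readers curious what “tokens” actually are: the real GPT tokenizer is a byte-pair-encoding scheme over a vocabulary of tens of thousands of pieces, but the basic idea — that the model’s unit is a subword chunk, not a word — can be sketched with a toy greedy longest-match splitter over a made-up vocabulary (the vocabulary and function names here are purely illustrative, not anything from an actual model):

```python
# Toy illustration of subword tokenization. NOT the actual GPT tokenizer
# (which uses byte-pair encoding over a huge learned vocabulary); this
# just shows that "words" get split into smaller reusable pieces.

TOY_VOCAB = {"token", "ize", "r", "s", "un", "believ", "able"}

def toy_tokenize(text):
    """Greedily match the longest vocabulary piece at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible piece first, shrinking until a match.
        for size in range(len(text) - i, 0, -1):
            piece = text[i:i + size]
            if piece in TOY_VOCAB:
                tokens.append(piece)
                i += size
                break
        else:
            # Unknown character: fall back to emitting it on its own.
            tokens.append(text[i])
            i += 1
    return tokens

print(toy_tokenize("tokenizers"))    # → ['token', 'ize', 'r', 's']
print(toy_tokenize("unbelievable"))  # → ['un', 'believ', 'able']
```

The model sees only the numeric IDs of pieces like these, which is why it can mangle spelling or counting tasks: “tokenizers” is four tokens to it, not ten letters.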