449 Comments
[Comment deleted]
Scott Alexander:

Banned - both I and www.pangramlabs.com believe this is 100% AI generated. You seem to have posted non-AI-generated comments in the past, so I'll make it a one month warning ban.

Tyrone Slothrop:

Em dashes? I'm kidding, good catch. I was able to read the comment and I agree too, but where did it go? It had the old Substack style of collapsing the comment with the option to click and read for a while, and now it's gone.

Charlie Sanders:

Have you publicly articulated a standard for how much AI tool use is allowed in the comments section, and at what level a ban will be applied? It might be helpful to have something objective to point to, even if 100% AI-generated can reasonably be assumed to always be banworthy.

Jimmy:

Have you considered just not using an AI to be safe? Why are you asking where the lines are drawn if you aren't trying to straddle them?

Mister_M:

If I'm being charitable, I think Charlie was asking for clear standards so he can wield them as a cudgel against AI writing.

m. scott veach:

Or maybe he wants to use AI up to the allowed level, and that's an okay thing to want.

Wanda Tinasky:

His standard is "if it sounds enough like AI to annoy me then I will ban you" which strikes me as 100% reasonable.

This is one person's personal blog. Don't try to turn it into a bureaucracy. He can do whatever he wants with it regardless of how you feel about it.

Bugmaster:

I think it's a little amusing that we are endorsing a comment moderation rule like "if you sound enough like AI I will ban you" on a blog post arguing that LLMs are basically as intelligent as people.

Jimmy:

We've discriminated against other humans plenty, surely we can discriminate against clankers as well.

Bartek:

The post does not argue that. It only says that the argument "it's just a next token predictor" is wrong, and that it does not matter in deciding whether a token prediction model, or an AI system built on token prediction models, is intelligent, conscious, or has world models.

Such models or systems might not be, and might never become, intelligent and conscious (or they might already be, or soon become so), but "it's just a next token predictor" is simply not a factor in deciding that or in discussing their capabilities.

Carlos:

People got banned too. Like most NRxers. And they are not even stupid, just annoying.

KSaucy:

"AI is able to engage in genuine thought"

"Jk banned for ai slopposting"

[Comment deleted]
Robson:

Even more so because we're also just a bunch of atoms that interact and generate complex behavior.

Soy Lecithin:

I think the stochastic parrot people don't really believe that deep down.

Teucer:

I think a lot of the 'AI can't really think: it's just maths' stuff is from people who assume they're materialists but so far have never needed to really internalise it, so they default to a kind of Cartesian dualism.

AI is going to force everyone to confront a lot of things they intellectually 'know', but never really believed.

Tim:

I coincidentally read a bunch of Robert Sapolsky right before the AI boom, and I can't overstate how helpful it's been in structuring my beliefs around all this. I can't even guess what my intuitions would have been without that framing.

Tossrock:

This is absolutely it. There's a tension in mainstream Western society between the widely accepted and understood-to-be-"correct" scientific materialism, and the lingering belief structure of Christianity and religion, immaterial souls, etc. "Stochastic parrot" types are justifying their ungrounded belief that a machine obviously can't be conscious, which is ultimately coming from an unspoken, ambient cultural belief about souls, via post-hoc rationalization that centers on mechanistic reductionism, as if human consciousness can't be mechanistically reduced. As I like to say, my favorite part of the stochastic parrots discourse is all the people repeating it without really understanding it.

Of course, people can continue to not confront their unfounded beliefs essentially forever, and indeed evidence to the contrary often just triggers the digging-in-heels reflex. The amount of goal-post moving there is on this topic is really next level. We're headed towards an "intelligence / consciousness of the gaps" type scenario.

DanielLC:

To be fair, some people think we're secretly controlled by ghosts somehow.

Kenny Easwaran:

The authors of the stochastic parrot paper are actually very much convinced of a Turing-Chomsky computational picture of mind. But they are misled by Minsky's argument that neural and associative things can't imitate that.

snifit:

The stochastic parrot people include lots of the general public, whose general disdain for AI washes over into dismissing that anything interesting at all is happening when an LLM responds to a prompt. I think this post was well worth writing.

Goutham Kurra:

I like this framing. If someone knows you really really well, i.e., has a great mental model of you, you might say they can finish your sentences (predict your next utterances). I wrote a piece from a similar-ish perspective a while ago, and the core argument continues to hold:

https://hyperstellar.substack.com/p/let-me-finish-your-sentences

Stephen Pimentel:

Agree with all this. "Next token prediction" describes a limited slice of what an LLM is doing. The fact that some people want to lean on this to argue that "an LLM can't do X" is ridiculous.

Ch Hi:

In some limited sense it *is* what both humans and AIs are doing. But it's a very limited sense. It also doesn't address the motives for creating the next token that you issue. (After all, you aren't just predicting, you're also generating...and evaluating what you generate before issuing.)

Unfolding the Point:

It really seems like impetus is the differentiator between humans and agents, and will be for a while. I wonder when, if ever, we will have agents prompting humans.

Ape in the coat:

>Senses, reflexes and learning mechanisms – this is what we start with, and it is quite a lot, when you think about it. If we lacked any of these capabilities at birth, we would probably have trouble surviving.

> But here is what we are not born with: information, data, rules, software, knowledge, lexicons, representations, algorithms, programs, models, memories, images, processors, subroutines, encoders, decoders, symbols, or buffers – design elements that allow digital computers to behave somewhat intelligently.

Now this is a confusion of the levels of abstraction if I've ever seen one.

Viliam:

It's like saying: "The 'Space Invaders' game you wrote yesterday is about moving a rocket and shooting space invaders, not about programming in Python -- therefore it is *not* a Python game."

Or even less technically: "The Harry Potter book series is about a boy who attends a school of magic, finds friends, and defeats a dark lord. What it does *not* contain: English grammar. Therefore, people who say that Harry Potter is a book written in English are wrong."

Ape in the coat:

Yes, both great examples.

I just feel even they fail to be as outrageous as the quote from the essay.

proud dog owner:

I think it's really fascinating that someone clearly spent a lot of time writing that article, and it took me perhaps half a second or less while reading the "we are not born with" line to understand how ridiculous the argument of the author is. I'm not even well read on these topics!

Mark Shields:

As a doc and informaticist, I would quibble, since I had to study embryology and fetal development a lifetime ago. By the time a baby has been born, they have experienced intrauterine life for months (with sound, taste, touch, proprioception, sight, and maternal hormone cycles / emotional state sensations), and so HAVE already: some information, data, rules, software, knowledge, (lexicons? perhaps not), representations, ... models, memories, images, processors, subroutines, ... SO one very fascinating question is the biological bootstrap: not just when, but what is happening when nothing becomes something, since it obviously isn't at birth (nor conception).

For us guys it may be around age 25. :)

Parents generally think their infant shows intelligence. :) Is it like entanglement? Or Schrödinger's cat (vs. the tree falling in the forest with no-one to hear)? Do we imbue it by observing it? By making eye contact and then believing and acting as if true? (But then who did this first?)

In any case, at some point developmentally, not just evolutionarily, we see the mystery of intelligence 'happen', ~@260 rpm.

If AGI doesn't precisely recapitulate the mysteries of ontogeny and phylogeny, at this point (as the author illustrates) it appears it is going to rhyme. ;) -If only because we will ultimately 'define' artificial intelligence when we make it and 'observe' it, and we learn/conceive 'intelligence' from the example of nature.

Is there a threshold past which intelligence 'lights up', or is it a continuum? Surely this has been studied; it is an accessible empirical question. The difficulty is that it requires ONE operational definition of intelligence, which, notably, has never been compelling (the Turing test, etc.).

Suspect intelligence is a pleomorphic continuum, and we are far down many species of it with 'our' AIs. Suspect that marking 'the' point of intelligence is intrinsically arbitrary and conventional, and will be done in as many ways as parents (or commenters) do.

It may be more important to ask what species of intelligence your tech hero brand will choose: a genie tuned with superior capacity to compel wealth & power, or science? Or a buddha-jesus? Does your hero have any normative bounds on his appetite, or is a power capacity singularity singularly attractive to him? Lord Acton's quote obtains here. (And yes, using male pronouns with intention, as we are statistically more often born with a certain socio-psycho-pathology & impulse.)

Is the nature of the present quest for AGI actually GOOD for us 'dimwits and midwits', to use the anti-humanist terms of another midwit commenter here? Or good for 'children and other living things'? Has anyone the brightwit to answer this question FIRST, to the satisfaction of us 'dimwits and midwits', or is such a concern, obviously, not a brightwit concern?

Humanity HAS been down this road, or one that rhymed, and it wasn't pretty. This old man advocates more caution than has been or WILL be exercised, since power appetites are now unbound.

I was always a SF fan. Thrilled to have lived the generation where hard sf comes true.

But IF all evolutionarily selected intelligences share, by definition, our individual drive and (an individual, not social- nor population- nor species-') self-interest, assuming evolution acts on the 'selfish gene', then it would appear we are living the answer to Fermi's paradox.

IF, on the other hand, our social-intelligence(s) were (a wee bit more) dominant, we would NEVER PERMIT the obscene consolidation of power that now risks destroying every democracy globally. I had always thought humanity might have gone this other way.

Ch Hi:

The mistake here is considering a process as an event. The "bootstrapping" is never complete. And it's initiated before conception, if only by environmental RNA.

Mark Shields:

Exactly. "Suspect intelligence is a pleomorphic continuum", evolutionarily, developmentally, and artificially. It is like the word 'enough' - 'enough' for exactly what?

Zach Pearson:

Even so, "the category of problems that can be solved by something that may not think" is interesting for what it reveals about the problems and how we relate to them.

Zanni:

Oh, very much so. LLMs aren't intelligent; they're just "high-functioning midwits" without an object model at all. And yet Scott's willing to project an object model onto them.

It has taken genuine work to figure out how to distinguish "the actually stupid" from the "average" or "generally intelligent." We tend to assume this is easy to do, and it's really not -- people tend to overestimate the IQ of the subnormal on a day-to-day basis. (In part, it is because the subnormal develop coping strategies, and thus aren't the ones getting bilked out of house and home. That's generally your midwits, too proud and arrogant, instead.)

Loic:

We can't recover a human's object model from our brain structures and neural connections. Why are you so confident in declaring that a different structure of hundreds of billions of artificial neurons doesn't have one either?

Zanni:

Yes we can, using MRIs. We can see that the blood flow for "using a hammer" and "picturing a hammer" and all the rest has geographical correlations (as well as temporal correlations, if you're presenting Tool Time in the MRI).

Loic:

Have you seen Golden Gate Claude (they can affect certain weights in order to make models discuss given concepts), or the results that point to geometric correlations in how features relate to each other (e.g., for models, the distance between "man" and "woman" can be applied to "king" in order to find the representation of "queen")? I think these examples are similar to, if not more representative of, an object model than observing blood flow when two similar concepts are elicited.
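For readers who haven't seen the vector-offset trick, here is a minimal sketch. Every embedding below is a made-up toy value, not taken from any real model; real embeddings are learned and have hundreds of dimensions.

```python
import numpy as np

# Toy 4-dimensional "embeddings", invented purely for illustration.
emb = {
    "man":   np.array([0.9, 0.1, 0.5, 0.2]),
    "woman": np.array([0.9, 0.1, 0.5, 0.8]),
    "king":  np.array([0.1, 0.9, 0.5, 0.2]),
    "queen": np.array([0.1, 0.9, 0.5, 0.8]),
}

def nearest(vec):
    """Return the vocabulary word most cosine-similar to vec."""
    cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(emb, key=lambda w: cos(emb[w], vec))

# The vector-offset analogy: king - man + woman lands nearest to queen.
print(nearest(emb["king"] - emb["man"] + emb["woman"]))  # -> queen
```

In real models the offsets are only approximate, but the geometric structure is the point: relationships between concepts are encoded as directions in the space.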

Timothy M.:

Saying that there are neural correlates of specific conscious experiences and saying that we can recover an object model of the world from the human brain are pretty different.

Tossrock:

What are some concrete things you think demonstrate intelligence in humans, that models can't do?

Zanni:

You mean LLMs? Models can do a lot of things, if you let them grow and evolve. Consider the difference between an AI that will say "The internet says the Jews are responsible for a lot of things" and an AI that can correctly understand that there are people who are deliberately "off the internet," and that those are far more likely to be powerful players, in that they're able to keep themselves out of People Magazine and Forbes and the Internet in general. The first one simply reads, adds (and subtracts), and goes with what's there. It takes true intelligence to notice holes and construct models for what lives in the "known unknowns."

Tossrock:

Oh okay, I didn't realize that you're some kind of conspiracy-adjacent person with an axe to grind. My mistake, carry on.

Zanni:

? You do realize that Elon Musk isn't the richest man in the world, no matter what Forbes says? I'm referencing the people "not online" as the Powers that Be, not the Jews.

Zanni:

Find me a LLM that can answer: "Who is the richest person in the world?"

(It's not Elon Musk. He may be the richest guy with a calculable wealth, but that's a different matter.)

moonshadow:

...chatbot, chatbot on the wall, who's the richest of them all?

Bartek:

I am not going to argue whether Musk is the richest. But your gotcha question is just about understanding what you are asking for.

1. The vast majority of humans will answer just "Elon Musk" or one of the other names they've heard without checking, like "Bill Gates" or "Warren Buffett".

2. Some less intelligent people, or those who think it's funny to answer in a stupid way, will say "Trump" or "Jesus" or "me, because I have a loving family".

3. I asked an LLM and it gave me the official net worth list. But I asked the follow-up question "are there no people wealthier than those?" and got a decent answer. I will not paste it because it DOESN'T MATTER and you can do exactly the same. It talked about a few possibilities, including wealthy families and the hidden/unofficial wealth of dictators, as well as the incalculable wealth of religious institutions. So yes, you can get a "true" answer.

4. Of course you will still disagree if you have some even weirder answer that only you believe and isn't true. Then you will not count anyone disagreeing with you as intelligent — LLM or human. But that's your problem, and it doesn't reflect on whether systems based on LLMs, or humans, are intelligent.

Zanni:

There are people who would put out a hit on People Magazine if they were listed among the "richest people in the world." These are the people I'm referencing. I'm sure you know who they are; some of them even count as celebrities (one airdropped into the Olympic Games, for god's sake!)

Erica Rall:

It is. Prior to LLMs, it was widely assumed that fluent general-purpose natural language processing was an "AI-hard" or "AI-complete" problem, i.e., one that requires AGI to solve. The terms are by analogy with NP-hardness: an NP-hard problem is one where a polynomial-time solution could be used to solve any problem in NP in polynomial time (so it has a polynomial-time solution only if P=NP), and an NP-complete problem is one that is NP-hard and itself in NP.

Soy Lecithin:

I remember reading this years ago and quickly seeing that the author's problem was an impoverished idea of what information means. The "metaphors" of information processing aren't metaphors; they're literal. And they certainly aren't metaphors to computers. (The opposite in at least one case: he claims that our brains don't have physical *memories* like computers do, while managing to glide right past the fact that "memory" in the computer context is a metaphor with reference to human brains!)

The dollar bill demo purporting to show brains don't hold representations is laughably stupid. The representation isn't lossless, so he declares it not a representation at all. And just where does he think the cartoon images of dollar bills his students draw come from? Magic? Yeah, they don't have grids of pixels in their heads, obviously, but only someone profoundly confused about what "representation" could mean would think that has any bearing on the question.

What gets me is that even a cursory familiarity with machine learning would have disabused him of this whole line of thinking. For someone interested in these questions to be so ignorant of this felt inexcusable even in 2016, when this was written.

Cal van Sant:

Strongly agreed. "Your brain is not a computer" is two positive, unfounded assertions: that the Church-Turing thesis is false, and that the human brain is a counterexample. Mentioning other times the brain was compared to at-the-time modern technology only drives the point home: this is a collection of Turing-complete technologies that are also feasible substrates of a brain-equivalent, along with misinterpretations (as in the telegraph example, where the only analogy being drawn is the correct one: electric gradients across neurons ~= electric current through wires).

The motte of it is that the brain isn't literally a computer; it doesn't have RAM and a CPU. As far as I know, this isn't contentious among anyone who understands analogy. And it's undercut by the other half of the headline: your brain absolutely, literally processes information! There's no good way to define terms such that it doesn't!

It's wild to me that this still gets passed around. I didn't read it this time around, so apologies if I've misremembered parts.

Ch Hi:

I don't think it's been proven that the human brain is "Turing complete". I strongly suspect there are computational problems that it inherently can't solve. (My model of why is "insufficient stack depth", though that's clearly not exactly correct.) A computer does not necessarily have that problem. (Though I'd bet that all current ones do, and plausibly any constructable within the universe do.)

Cal van Sant:

Human + infinite tape has *got* to be Turing complete, right? There's no operation a CPU can do that I can't do with pen and paper, and the CPU is already way overkill. As you note, infinite tape doesn't actually exist and nothing that we call Turing complete in reality can actually check that box. Plausibly, a case could be made that something more powerful than pseudo-computability on finite computers exists, but I'd only put the odds slightly higher than a disproof of Church-Turing.

Ch Hi:

Infinite tape cannot exist within this universe.

Cal van Sant:

Apologies for the temporal anomaly. I intended to write "As you note, infinite tape doesn't actually exist" after your reply but slipped up and put it before ;)

Kurt:

Infinity in mathematics is just shorthand for a process which can continue to produce output without bound, as required. It does not imply that all such outputs must be produced, only that they can be produced in principle.
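A minimal sketch of that idea in code: a Turing-machine tape that is "infinite" only in the sense that cells come into existence when the computation first touches them. The details (a dict-backed tape, blank symbol 0) are illustrative assumptions, not any canonical implementation.

```python
from collections import defaultdict

# The tape as a process, not a completed infinite object:
# any cell not yet visited simply reads as blank (0).
tape = defaultdict(int)

head = 0
for _ in range(1000):  # the computation visits as many cells as it needs
    tape[head] = 1     # write
    head += 1          # move right; no boundary is ever encountered

print(len(tape))  # 1000 cells were actually produced; the rest stay potential
```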

Carlos:

I remember reading this before the paywall. The argument that we always compare the mind to our current level of technology is powerful. However, as the current level of technology gets better and better at doing cognitive tasks, at some point the parallel becomes valid.

I think a lot of people have this "block" that it is about AI becoming truly human-like, but it is not so. Industrial robots, for example, are also not copies of humans; they can just do one thing humans do really well. The welder robot cannot have sex with the human welder's wife, or raise his kids, or go fishing with his friends. Similarly, the "CEO robot" or "military general robot" does not need these abilities either. It just needs to do the job.

Doctor Mist:

But, as you say, the parallel gets closer and closer. You could imagine a CEO robot or military general robot, but (unlike with the welder robot) the moving parts these would have to deal with include humans, and the CEO robot would do an inferior job compared to a human CEO if it didn't include at least a rudimentary understanding of how humans interact — it would have to be more like an AGI than a robot welder would.

At some point, in two years or twenty or two hundred, we will have to cope with the possibility that a robot might be a better parent or friend than an average human is, just as it is now almost (in ordinary circumstances) a better driver. It’s sobering.

Ch Hi:

A robot being a better friend would require that the robot own itself, not be the property of some other entity. This isn't logically entailed, but it's an extremely high probability. Otherwise it requires assuming that the owner of the robot is also your friend, or that despite owning it, the owner doesn't exercise control over its choices.

Doctor Mist:

Robots/AIs at that level of capability will certainly spur significant changes in how we think of them and how they fit into our social/legal framework. I can’t begin to guess the details.

Jacob Falkovich:

The irony of this essay (written in 2016) is that for all the distinctions it makes between human brains and computers that manipulate complete symbolic representations, neural networks fall on the human side of each distinction. An LLM, like a human, could quote a poem despite there being no set of weights in it that constitute a representation of that poem the way a block of hard drive memory would encode it letter by letter.

"Look on my works, ye mighty, and..." I didn't just catch you doing next token prediction there, did I?

At this point, the essay reduces to arguing what counts as a "computer" or as "information processing". I'd say that LLMs are clearly computers, and I'm fine with ceding "information processing" to mean the sort of computations with legible representations and legible algorithms that computers did a decade ago. This has little bearing on actually interesting questions, such as whether there are thinking tasks that humans do that are unattainable to NNs.

1123581321:

“Look on my works, ye mighty, and..." I didn't just catch you doing next token prediction there, did I?”

I need to find the name of the linguist who is publishing papers arguing pretty much that: LLMs and humans "do language" in the same way, we predict the next token, and language itself is a self-complete system devoid of meaning; we assign meaning to it externally.

When I heard him talk about it, a lightbulb just came on - this made total sense.

Mark Shields:

Bennett, in "A Brief History of Intelligence", makes this case, but is citing others. Iain McGilchrist in "Ways of Attending" also does, as I recall... he seems to portray the right hemisphere as the recipient and processor of continuous sensory training data, and the left as a standing wave of readjusting weights for discrete abstractions/tokens from this data... which feeds back (mostly inhibition) to the right. But both have huge neural LLM-like elements upon which these 'higher' rules are built, much as we are doing with AI. My reading of him. But analog LLM-like functionality is a major human neural circuit capacity; again, my view. (Weight = the changed presynaptic potential resulting from greater or lesser circuit use. Fewer or more neurotransmitter vesicles containing less or more neurotransmitter. Firing is wiring, or some such phrase.)

Abe:

This is the first post I've read that has made me consider that the apparently superficial resemblance between AI and human thought may not actually be that superficial. Still not totally convinced -- I haven't read that Apple-sponsored paper that supposedly disproves it, and I'd like to hear some more opposing perspectives -- but I am intrigued and very glad to have read this.

artifex0:

Here's a detailed critique of that Apple paper by Zvi: https://thezvi.substack.com/p/give-me-a-reasoning-model

Abe:

Thanks, will read

Michael:

The tl;dr is that they asked LLMs to do something no human would be able to do[1]: solve Tower of Hanoi problems with up to 20 disks, without being given an algorithm or using tools. Instead, the LLMs had to write out the full sequence of moves. For 20 disks, this is over a million moves and cannot possibly fit in the context window of the models they tested.

Solving Tower of Hanoi puzzles isn't hard. It's a simple but tedious task. The LLMs they tested could often do it perfectly for puzzles that took up to several hundred steps, but not more than that. This doesn't have anything to do with reasoning, as they would only need to keep applying the same simple algorithm they had already used for several hundred steps. They just run out of reasoning tokens, or recognize that the task is infeasible. [Edit: Looking through the paper, it seems that the LLMs usually just make a mistake eventually, often within the first 100 moves. They aren't running out of tokens.]

[1] Humans can solve arbitrary-length Tower of Hanoi problems, but, for one, it becomes likely they'll make a mistake tracing out thousands of steps, and secondly, the LLM has an effective time limit.
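For anyone who wants to check the arithmetic, here is a minimal recursive solver (an illustrative sketch, not the paper's setup): the move list for n disks has exactly 2^n - 1 entries, which is why writing it out verbatim for 20 disks blows past any context window.

```python
def hanoi(n, src, dst, aux):
    """Yield the full move sequence for n disks: exactly 2**n - 1 moves."""
    if n == 0:
        return
    yield from hanoi(n - 1, src, aux, dst)  # park the top n-1 disks
    yield (src, dst)                        # move the largest disk
    yield from hanoi(n - 1, aux, dst, src)  # stack the n-1 disks back on top

print(sum(1 for _ in hanoi(8, "A", "C", "B")))  # 255
print(2**20 - 1)  # 1048575 moves would have to be written out for 20 disks
```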

Frikgeek:

The models they tested started failing way before 20 disks, and they even gave them the algorithm in later tests and they still couldn't do it. All the models at the time completely collapsed at 8 disks, many even before that. That is only 255 moves; most humans could do this easily in their head without making a mistake. Even going up to 10 or 11 disks (1023 moves and 2047 moves) is something anyone could do if they were being paid for it.

For a human this is a tedious and pointless task, but anyone who understands the algorithm can do it. You don't even need pen and paper; you can keep track of what you need with your fingers alone, and for lengths of 10 and under you can probably keep track of what part of the algorithm you're on in your head.

Michael:

Most people couldn't even do 5 disks in their head without losing track. And even with paper, it would be hard to do 8 or 9 disks within 10 minutes. Nonetheless, I was saying they were *asked* to do a task no human could do, not that they succeeded at 20 disks. The paper wasn't comparing LLMs to human performance, and the headline conclusion of the paper wouldn't have changed if the LLMs failed at a higher number of disks. The authors are only interested if it collapses eventually.

Claude 3.7 Sonnet got up to 9 disks, but usually only managed up to 7. That's 127 to 511 steps.

Do you have any point here with regards to LLM reasoning?

Frikgeek:

First of all, a human can do 20 disks. At 1 move per second it would take around 300 hours to complete 1 million moves. You could probably find a college student willing to do it for like $10k.

>Most people couldn't even do 5 disks in their head without losing track

What's your basis for this? 5 disks is a mere 31 moves, even a child could do it.

>Do you have any point here with regards to LLM reasoning?

The point that the paper also made is that LLM failure modes are completely different from human failure modes and reveal that they cannot create and track an accurate world-state. It's also why they can't play chess. I don't mean they can't play it well; I mean that they cannot just follow the rules of the game once they get out of the opening, even if you give them a PGN or an ASCII board after every move to remind them what the board should look like.

Kathryn:

I feel like if you paid me $10k, I bet I could get an AI to beat it as well within 300 hours. Maybe give it an external text pad and have each call add the next x steps plus a note of the current state of the game, then just keep making API calls to edit the text file (take as input the last y lines, output another set of steps of size y and the state, repeat?)

Frikgeek:

They tried something like that, it didn't work. Same reason LLMs can't play chess even if you give them the exact board state after every move.

Cal van Sant:

I feel like you're vastly overestimating human capabilities here. Here's a paper where 9% of undergrads couldn't solve 4-disk in their heads and 37% couldn't solve 5-disk (so not quite "most people couldn't" but also not "even a child could"): https://www.sciencedirect.com/science/article/am/pii/S0010027722000294. Additionally, it took ~7 seconds per move when they didn't have to do it in their heads, which revises your 300 hours to 2000 before factoring in fatigue effects. I'm not in the bottom 37% of undergrads and I'm quite familiar with the problem, but my error rate of doing this constantly at one move per second would have to be more than 1/1000. Even with 10 seconds per move, I doubt I could pull off the better than 1 / 1 million error rate that I'd need to successfully do the 20-disk variant in 2900 hours.
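A quick sketch of the arithmetic behind that error-rate claim (the per-move error rates below are illustrative assumptions, and errors are treated as independent, which is a simplification):

```python
moves = 2**20 - 1  # ~1.05 million moves for the 20-disk tower

# P(completing every move correctly) = (1 - p) ** moves
for p in (1/1_000, 1/100_000, 1/1_000_000):
    print(f"per-move error rate {p:.0e}: success chance {(1 - p) ** moves:.3g}")
```

Even a one-in-a-hundred-thousand slip rate leaves essentially no chance of a clean run, which is the sense in which the 20-disk task is unreasonable for unaided humans too.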

Jared:

Bad link, it was probably session-specific

Frikgeek:

Either that article is paywalled or the link is wrong, but I'll take your word for it here.

Were the undergrads given the algorithm for solving the puzzle, or did they have to think about the right way to move the disks?

Once you understand that the entire problem can be represented numerically, and that you don't have to constantly imagine the disks moving, it becomes far easier and you can do it far faster.

You're basically repeating the same 3 shuffles (with the direction depending on which move can even be made) until the problem is solved; the only thing that makes it tricky and can lead to errors is forgetting which disks can move where. This is also why the problem is so easy to solve when you have a constant visual representation, or use your fingers or pen and paper to keep track of these things.

If you're trying to do it by remembering which exact disk you're moving, it quickly overwhelms the human brain's capacity to keep track of things, yes. But that's the wrong way to go about it, and I think you could teach most people to solve it using the easier method.

LLMs weren't able to solve this even when they were allowed to correct illegal moves. For a human, this becomes truly trivial: you're doing the same 3 moves over and over again and reversing the direction if the move is illegal.

>Additionally, it took ~7 seconds per move when they didn't have to do it in their heads, which revises your 300 hours to 2000 before factoring in fatigue effects. I'm not in the bottom 37% of undergrads and I'm quite familiar with the problem, but my error rate of doing this constantly at one move per second would have to be more than 1/1000. Even with 10 seconds per move, I doubt I could pull off the better than 1 / 1 million error rate that I'd need to successfully do the 20-disk variant in 2900 hours.

I'll admit that 1 second per move is a bit optimistic and would accept 10. With pen and paper you could take that down somewhat, but then there's the physical factor of your hand getting tired.

Still, I would imagine that if for some reason you wanted to compensate someone for 3000 hours of work at a reasonable rate, you would find people capable of doing it. Obviously not in one go (even my previous 300-hour estimate did not assume someone would be doing this for 12.5 days straight). Then again, maybe if you really were doing this for 8 hours a day you'd eventually find a really quick and efficient way to represent legal move directions, which would speed up your work.

Or maybe doing the same banal, repetitive, and ultimately pointless task for 2 years straight (assuming you work on it for 40 hours a week) would lead to a mental breakdown before you're finished.

Zanni:

We're currently testing for literacy in hospitals. That 37% of undergrads cannot do a task is simply saying that there exists a set of "educated" midwits that have sub-average intelligence.

Michael:

> First of all, a human can do 20 disks. At 1 move per second it would take around 300 hours to complete 1 million moves.

This is not relevant for LLM reasoning, but as I said in my first comment, "Humans can solve arbitrary length Tower of Hanoi problems, but, for one, it becomes likely they'll make a mistake tracing out thousands of steps, and secondly, the LLM has an effective time limit."

The problem isn't that humans don't live long enough to do 1 million moves. The problem is they have to do 1 million moves without making any mistakes and without tools. I think maybe it's not clear what an equivalent task is for a human (do they have a time limit? Pen and paper?), but I also don't think it matters to the AI reasoning debate.

> What's your basis for this? 5 disks is a mere 31 moves, even a child could do it.

I think you're overestimating typical human abilities. It's not that it's hard. We're just worse than you think (on average) at doing anything in our heads. Multiplying 2-digit numbers is easy. You would think university students would be able to do it mentally fairly reliably, but they couldn't (in one study they got about 60% correct, https://pmc.ncbi.nlm.nih.gov/articles/PMC4735758/table/T2/). I expect 3-digit numbers would be much worse. In a standard working memory test where people have to repeat a span of digits backwards (e.g. if told 4, 9, 7, 2, you say 2, 7, 9, 4), the typical adult maximum span is only 4 to 5 digits.

> The point that the paper also made is that LLM failure modes are completely different from human failure modes and reveal that they cannot create and track an accurate world-state.

They definitely think differently than us. The paper also revealed that they can track an accurate world-state for up to several hundred moves. How many moves do they need to do this for before you switch from "this is evidence that they can't track world-state" to "this is evidence that they can"?

Noah Fect:

I'll never understand this line of reasoning. Two years ago the LLMs could handle 2 disks. Last year they could handle 6. This year they can handle (e.g.) 12 -- I haven't seen a figure but Opus 4.6 is vastly more capable than Sonnet 3.7. Next year they will handle 16, and the year after that, 20... and where will you move the goalposts then?

At some point you have to change the "headline conclusion" -- after all, the paper would never have been written if the contemporary frontier models had already been capable of human-level performance at the time. Or they would have picked some other arbitrary, meaningless criterion. Perhaps asking the LLM about whether one should walk or drive to a car wash -- which, like all of these "gotchas," was obsolete by the time the proverbial ink dried.

Michael:

I think I'm actually being unfair to the paper here. The authors didn't say the LLMs can't reason. They said some things a bit suggestive and it was the media and other people who ran with it and concluded LLMs can't reason. Their summary from the paper says, "We also investigate the reasoning traces in more depth, studying the patterns of explored solutions and analyzing the models’ computational behavior, shedding light on their strengths, limitations, and ultimately raising crucial questions about their true reasoning capabilities."

The paper authors are pointing out that it's odd the LLMs fail at a relatively easy task when the number of steps required is high enough, and before they run into the max token limit the authors set.

Also while their summary on apple.com shows a chart with results up to 20 disks, the paper itself only shows up to 12 disks in the same chart. I guess they revised it?

Their conclusion wouldn't have changed overall if the LLMs could do a few more disks, as long as it failed before hitting their 64k max token limit.

It's not entirely clear to me why the LLMs fail. They often make an illegal move within the first 100 moves. It seems like they often make mistakes about the current state (for example, thinking a disk is somewhere it isn't). I think a typical human would do much worse, but to be fair to the authors, they aren't claiming a human would do better.

Mark Shields:

Recall doing an 8-disk tower at or before age 7, with just one short projection calculation (i.e., counting to eight, pointing first at one empty post, then the other, alternating) to figure out which post the transfer would end up on after the last disk. 50-50 I did it correctly first try; certain on 2nd try.

At the time, which I know was 1961 or before (thus I know my max age), since we lived in Fairborn, OH at the time and moved from there mid-1961 so Dad could matriculate at Princeton, NJ, there was a game called "Mandalay", with 3 red posts and 8 ivory-plastic disks, that someone in the family got as a Christmas gift.

Also recall racing with my brother; we moved the disks as fast as our hands could move, no thinking-time pauses, perhaps using both hands but not quite positive of that. But definitely well under 8 minutes.

https://www.etsy.com/listing/4329979243/mandalay-the-game-of-a-lifetime-fortune?dd_referrer=https%3A%2F%2Fwww.google.com%2F

Jared:

It's clear that performing the task by rote execution of the iterative algorithm is within most people's cognitive abilities, given adequate training to perform the task, but it's not at all clear that most people could execute the algorithm correctly on their first try purely from written instructions. (Hint: You would need very careful instructions just to prevent "calculate 2^n-1" from being a huge filter, assuming that the task output is required to be of the correct length.)

People are not very good at following written instructions. That's why there are so many jokes about being unable to assemble IKEA furniture or set the clock on a VCR (back when those were a thing). How long has it been since you worked together on a task with a person of below-average intelligence?

Frikgeek:

>but it's not at all clear that most people could execute the algorithm correctly on their first try purely from written instructions

Not sure about the first try, but getting the hang of it is not difficult. You can also use your fingers to count in binary to keep track of which step you're on.

>You would need very careful instructions just to prevent "calculate 2^n-1" from being a huge filter, assuming that the task output is required to be of the correct length

In the Apple paper it wasn't. They required *a* solution, not the optimal solution.

>That's why there are so many jokes about being unable to assemble IKEA furniture

I kinda get what you mean, but IKEA instructions are purely visual, so it's not the greatest example of people sometimes being bad at following written instructions.

>How long has it been since you worked together on a task with a person of below-average intelligence?

As a person of slightly below average intelligence I assure you that you're either vastly overestimating your own intelligence or vastly underestimating the average.

Jared:

> Not sure about the first try, but getting the hang of it is not difficult.

AFAICT, the LLMs didn't get practice runs or other chances to "get the hang of it": any time they succeeded was effectively a one-shot.

> You can also use your fingers to count in binary to keep track of which step you're on.

This is turning into a Poe's law situation. I would bet my $50k against your $1k that less than 50% of the US adult population can correctly follow written instructions alone to "use your fingers to count in binary to keep track of which step you're on" without error for 255 steps, on their first attempt. Unfortunately, I would have to require you to bear the expenses of performing the test, since my $1k winnings wouldn't cover that, and consequently I doubt we could find a mutually agreeable test protocol.

Again, I'm not making a claim about whether people can be taught to count in binary on their fingers given a human teacher, I'm claiming they mostly can't one-shot it from written instructions.

> They required *a* solution, not the optimal solution.

Someone writing down a solution to Tower of Hanoi by following an algorithm is very rarely going to end up with a correct solution if their solution is not of the length that the algorithm would generate.

> As a person of slightly below average intelligence

Even if you personally are of below-average intelligence, you clearly have exposure and experience in relevant subjects that most people lack.

Frikgeek:

I guess the fact that LLMs can't update their weights, and are therefore incapable of learning, complicates this. Everything they do is effectively a one-shot.

And humans, even less intelligent ones, can get really good at doing simple and repetitive tasks even if they're initially difficult.

I will concede that some significant percentage of the population (I wouldn't go as far as 50%) would be unable to even count in binary on their fingers, even if you gave them instructions on how to do it. But a much, much smaller percentage would be unable to do it if tutored.

Jared:

I do admit that IKEA instructions were a terrible example on my part.

Kindly:

You can solve the problem with any number of disks without having to keep track of which step of the algorithm you're on.

The first step is the hardest: if you want to know in advance which final tower you'll end up on, you have to pick the right direction to move the smallest disk in.

Then, you have to remember which direction the smallest disk is going: 1 to 2 to 3 to 1, or 1 to 3 to 2 to 1.

From then on, you alternate between moving a larger disk (there will be only one valid move) and moving the smallest disk (it must move in the same direction every time).
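A minimal sketch of that rule in code (the peg labels and representation are my own illustrative choices):

```python
def hanoi_iterative(n):
    """Kindly's rule: on even steps move the smallest disk one peg in a fixed
    cyclic direction; on odd steps make the only legal move not using it."""
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    # To finish on C: odd n cycles A->C->B->A, even n cycles A->B->C->A.
    order = ("A", "C", "B") if n % 2 else ("A", "B", "C")
    small = 0  # index into `order` of the peg holding the smallest disk
    for step in range(2**n - 1):
        if step % 2 == 0:
            src = order[small]
            small = (small + 1) % 3
            dst = order[small]
        else:
            a, b = [p for p in "ABC" if p != order[small]]
            # Move the smaller exposed disk onto the other peg.
            if pegs[a] and (not pegs[b] or pegs[a][-1] < pegs[b][-1]):
                src, dst = a, b
            else:
                src, dst = b, a
        pegs[dst].append(pegs[src].pop())
    return pegs

print(hanoi_iterative(5))  # all five disks end up on peg C
```

Note that the loop keeps no state beyond the board and the alternation itself, which matches the point above: you never need to count which step of the grand plan you are on.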

Frikgeek:

Yes, and the easiest way to know which move is legal is to know which step you're on (and as a bonus, you can always confirm the exact state of the disks, if you're unsure, by doing some simple math), unless you can keep track of where the disks are. Doing this purely in your head for hundreds or thousands of moves is actually pretty difficult; doing it with a visual representation is trivial.

>The first step is the hardest: if you want to know in advance which final tower you'll end up on, you have to pick the right direction to move the smallest disk in.

It reverses between an odd and an even number of disks, so it's not really hard.

Victor Levoso:

If you try this yourself with big towers, LLMs just give up and don't try, because they recognize it would be too tedious, or sometimes literally impossible for some of the N they try. This explains the graph in the paper where models suddenly take fewer tokens past a certain size:

"However, upon approaching a critical threshold—which closely corresponds to their accuracy collapse point—models counterintuitively begin to reduce their reasoning effort despite increasing problem complexity."

From the paper, this is clearly the models giving up because the task is too long, and it tells me the authors probably didn't even read the output from the models to figure that out.

Also, wouldn't doing lots of tedious steps be about reliability rather than intelligence?

The Apple paper in general is full of problems. Some of the tasks on the graphs are literally impossible to output within the context window (yes, the models do fail before that, but the chart keeps going past the point where it's possible), which at best is a chart crime. To the point that, in my opinion, you shouldn't trust them at all not to have messed up in all kinds of other ways that are not obvious, and you should just look for better research into LLM limitations in general rather than updating on a bad paper.

In fact, you should basically mistrust any viral paper about LLM limitations; they are almost always, at best, some neat benchmark with wildly overconfident claims about what it means (which makes sense, since that's more likely to go viral).

I initially thought the Apple paper was one of those, but I don't think they actually executed well on testing AI models on Tower of Hanoi, and I would not even assume that's the real performance of these models. You would find it if the study were run by someone who is actually careful: someone who checks the effects of different prompts and setups, actually reads some of the responses and figures out what the problem is instead of just using an automated error check, and is clear about how complicated the tasks are and how many tokens they will take to complete.

And all that would still mostly be a neat benchmark, and not say much about the limitations of LLMs.

ZDee:

Most theories of predictive coding don't propose that it is the sole way human cognition operates. It may be a prevalent mechanism in sensory areas, but I have not seen anyone who studies this propose that human beings are equatable to predictive-coding machines, as you do.

LLMs are by definition almost solely doing next token prediction. Maybe you layer some reinforcement learning on top to filter the output to what is desirable, but the mechanistic magic is still next token prediction.

The Silicon Valley hype train seems to want to rush to equate human thinking with ChatGPT. LLMs are an amazing technological advance, but equating them to human cognition doesn't pass the smell test for anyone who has thought deeply about either.

Amicus:

> LLMs are by definition almost solely doing next token prediction. Maybe you layer some reinforcement learning on top to filter the output to what is desirable, but the mechanistic magic is still next token prediction.

Next token prediction is not a mechanism; it is an objective during one phase of LLM training. The underlying mechanism is iterated matrix multiplication and [nonlinear layer of your choice] (if you want to be crudely reductionist), or largely unknown (if you don't). You could in principle get exactly the same weights - and therefore, behavior - out of a pure RL training pipeline; it's just not tractable.
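To make the mechanism/objective distinction concrete, here is the mechanism stripped to its crudest form (the shapes, depth, and ReLU choice are arbitrary toy assumptions; real transformers add attention, residuals, and much else). Nothing in this forward pass mentions tokens or prediction; those enter only through how the weights get chosen.

```python
import numpy as np

rng = np.random.default_rng(0)
layers = [rng.standard_normal((16, 16)) * 0.1 for _ in range(4)]

def forward(x):
    # Iterated matrix multiplication with a pointwise nonlinearity.
    for W in layers:
        x = np.maximum(0.0, W @ x)  # ReLU
    return x

print(forward(rng.standard_normal(16)).shape)  # (16,)
```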

The Dao of Bayes:

Doesn't matrix multiplication fundamentally just lead to a "best evolutionary fit" for the next token?

It seems like the fitness function can be anything; we're just using "most likely to come next in human writing" as our primary target. But it's still just a fitness function / "most likely next token, given the criteria", isn't it?

Onid:

“Doesn't human neurology fundamentally just lead to a "best evolutionary fit" for the next human action/state?”

The point here is that there are different levels of abstraction you can look at these things.

The Dao of Bayes:

That answer is completely devoid of any context that would actually be useful to me - what levels of abstraction, specifically, are you saying are interesting?

Onid:

Yeah, that’s on me for writing a quick comment at a time when I couldn’t elaborate.

I think there’s probably a full post to be written here and hopefully I’ll find the time to write it, but I think by far the best way to think of “next token prediction“ is that it is the mode by which LLM’s communicate and take in data - tokens are best thought of as the LLM equivalent of sensory data and uttered syllables.

Underneath that input/output layer is a vast world of mind-bogglingly complex structures, and that sentence is irrefutably true for both LLMs and brains, even if you're a strict dualist about the human mind and even if you believe LLMs are nothing more than stochastic parrots. For brains, this complex structure is our neurology, and for LLMs it's a bunch of matrix operations.

At least when people call LLMs stochastic parrots it’s clear which direction their objection comes from, even if I disagree with it. But the “next token prediction” thing always felt silly to me because the only sense in which that’s really true is exactly the sense in which humans are merely “next syllable” or “next word” predictors.

The Dao of Bayes:

Ahh, that makes a lot more sense. Thanks for clarifying. I'm definitely starting to feel like "next token prediction" is sort of like "survival of the fittest": it's an evolutionary gradient, and it produces some surprisingly complex structures, like you said.

moonshadow:

Matrix multiplication is a general tool that can be turned to many purposes. The fact that we could, in principle, build any computation we want out of repeated matrix multiplication, on its own, says nothing about what it is we actually have built using this tool.

We use gradient descent and a fitness function to generate sets of weights that configure this tool to a specific purpose. That purpose is determined by our choice of fitness function. In this case, that purpose is next token prediction over the training corpus.
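A toy sketch of that point (all data and targets below are invented for illustration): the same machinery and the same gradient-descent loop acquire different "purposes" purely from the choice of fitness function.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 3))

# Two different "fitness functions" (targets) for the identical tool.
targets = {"sum": X.sum(axis=1), "first minus last": X[:, 0] - X[:, 2]}

for name, y in targets.items():
    w = np.zeros(3)
    for _ in range(500):
        grad = 2 * X.T @ (X @ w - y) / len(X)  # gradient of mean squared error
        w -= 0.1 * grad
    print(name, np.round(w, 2))  # weights configured by the chosen objective
```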

I am not saying this means the resulting construction can’t do X for some or any given X. That is a strawman.

I suggest, however, that a mental model of “the LLM is generating text that might plausibly have occurred in its training corpus if it started with this prefix” still has, today, higher predictive power than “the LLM is truthfully answering my question” for questions about how the output relates to things that are actually true, things that really exist (court case references! hyperlinks! citations!) etc.

I believe, though I cannot prove, that there are other things humans also do in addition to next sensory input prediction, that at least some of these things are important, and if we want to reach AGI we will need to work out what at least some of those things are and incorporate them into our fitness functions or otherwise into our models.

I’m certainly not saying it’s impossible, or even far off. We might find something key tomorrow. But I do not believe it is true that we have everything we need save scale today.

Mark Shields:

Agree (as an AI-interested doc) - of course, given my lack of knowledge of not-yet-released frontier AI models. But also, I suspect evolved bio intelligence may yet have special-case 'complex' surprises we have not included in these latest models, e.g., exploitation of quantum-level effects, or just LOTS of highly specialized (non-general), though perhaps less frequently utilized, processes. Nature exploits all possible niches with equal pressure (benefit/cost). Just as computers have had registers, accumulators, CPUs, GPUs, NPUs, ... it is hard to believe there will be nothing more than scale (ever)!

Kenny Easwaran:

The base LLM is by definition almost solely doing next token prediction. But the full chatbot has this as just one layer in a system that also has some training on using the LLM as a tool to effectively find solutions to problems.

I think it’s very unlikely that the suite of tools a modern LLM-based chatbot has is the full suite of tools of the human mind. But I think it’s very likely that several of its key parts are important parts of the human mind, which had been missed by the 70 years of the computational paradigm. (Just as computation was an important part that had been missed by 200 years of the clockwork paradigm.)

Mister_M:

Karl Friston thinks the essence of what our minds do is sensory error minimization. There are two strategies: better prediction, and taking action to affect your inputs (e.g., literally moving). According to Friston and collaborators, these two strategies are really just one strategy, and in particular they should use the same algorithm. This goes beyond orthodox predictive coding theory, but Friston and Active Inference are hardly marginal.

Mister_M:

I believe the "active" part of active inference is closely related to attention (putting your attention on something in a "purely mental" way, or turning your eyes toward it, or walking toward it to get a better look--these seem to operate according to the same principle). I couldn't say for sure how analogous this is to "attention mechanisms" in LLMs (because I don't fully understand the biological case), but I think there may be something to it: in an extremely simple case, a Turing machine has an "attention head" which moves, and whether you want to consider these moves as cognitive or physical is a matter of interpretation, much like with biological attention.

Adam Mickiewicz:

> LLMs are by definition almost solely doing next token prediction.

I am not going to argue with your main point (others are doing that already), but with one phrase: "by definition". Yes, I know that it is not crucial for your argument, but I find it interesting. If we ever create an algorithm that produces text which sounds like it was spoken by a human being, but the algorithm does not use next token prediction, wouldn't you call this algorithm "a language model"? Wouldn't you call it "a large language model"? Wouldn't you call it an "LLM"?

I ask this (quite unimportant in the whole discussion) terminology question because I would call such an algorithm an LLM, and I am interested in how others understand this term and its limits.

Mark Shields:

Surely not, if somehow that algorithm did not require LARGE data input.

Austin Wallace:

This piece caused me to update in the direction of LLM consciousness, at least in the sense of "if there is a spectrum of consciousness, current AI is on it".

It's very clear that AI's thinking is comparable to ours, in the literal sense of the word comparable, and that's notable.

It's almost as clear that AI wouldn't be notably worse at self reflective thought than other types of thought.

It's less clear whether, e.g., cessation with resets or multi-instantiation is disqualifying for consciousness. It might not be, especially given that things like amnesia, anesthesia, and even sleep exist. But it's still much less of a slam dunk.

Robson:

I believe it's missing two main features to be as conscious as us: a continuous loop and mutable weights. I may be convinced that just the first one would be enough.

Mister_M:

The first may be necessary, but I doubt the second is. Synapses don't change quickly enough to explain moment-to-moment consciousness. Something like short term memory may be necessary, and last I checked nobody really knew how that was implemented biologically.

MaxEd's avatar

There was some work recently on LLM "Sleep" phase between queries where it updates its weights: https://www.reddit.com/r/LocalLLaMA/comments/1rewz9p/we_build_sleep_for_local_llms_model_learns_facts/

Jimmy's avatar

I really don't see why this would be any evidence that LLMs are conscious. Where did people get the idea that intelligence requires or causes consciousness? Is it just justification for not considering other life to be conscious?

Austin Wallace's avatar

Just for conversation, do you think there is something that is conscious that has no form of thinking?

Jimmy's avatar

Clams? They have a nervous system with no brain. If you consider it to be "thinking" to do something so simple that it doesn't even require a brain, I think we're just arguing semantics at that point.

Of course, even if consciousness does require a brain, I wouldn't be surprised if very early brains acted as just a hub for nerves instead of doing any calculations of their own. They probably aren't still alive...

Austin Wallace's avatar

Rather than arguing if you can think with no brain, it’s more that it’s not clear that clams are easier to call conscious than current AI.

Like there are arguments to be made either way, but it’s not clear in the same way that I think humans are more conscious than either clams or AI.

I think consciousness is probably a spectrum, and there are things that thinking enables that contribute to consciousness.

DanielLC's avatar

Why do you expect the nervous system to be important?

skybrian's avatar

Is every conversation with an LLM a different conscious entity? If not, how do you count entities?

Catmint's avatar

It doesn't matter, or at least is irrelevant to the question of whether the amount is zero or nonzero.

"How do you count entities" is also unsolved for humans. Am I a different conscious entity after I wake up than before I go to sleep? If my corpus callosum is cut, am I one conscious entity, or two? What if I have multiple personality disorder?

Personally I suspect if people ever figure out what consciousness actually is, this will turn out to be a non-question based on invalid assumptions.

skybrian's avatar

No, actually, society is very good at counting humans. This is what a census does. It's difficult at large scale because people move around, but we're pretty good at counting heads, as well as counting births and deaths. These things are noticed.

For LLMs, they play a different role if you give them a different prompt. This is more like attempting to count the number of fictional characters in the library. Do we just count names? Are all the different versions of Robin Hood or King Arthur the same character?

Catmint's avatar

I was talking about counting human consciousnesses, not human bodies. If you want to try the easy version for LLMs too, just count the number of models.

Zanni's avatar

Having 20 million missing Americans -- not on the work rolls, not on unemployment (not dead, surely) -- seems to suggest that we're not as good at counting people as we'd like to think.

Mark Shields's avatar

But we're quite good at 'liking to think' the number is 20M, not, say 11.7M?

;)

DanielLC's avatar

Is a pregnant woman one human or two? Conjoined twins? HeLa cells?

We've agreed on what a person is for legal purposes. Depending on the specific purposes, a corporation may qualify. That doesn't answer questions relating to ethics or consciousness.

skybrian's avatar

There is contentious political debate over when human life begins and what the consequences of that should be, but it doesn't change that for *most* everyday ethical questions, the identity of the people involved in a dispute is not at issue.

The situation with AI "characters" is nothing like that. Everything is nebulous all the time, because they don't have much in the way of identity at all. It's much more like counting clouds than counting people.

DanielLC's avatar

Are people just normally more than one overlapping conscious entity?

Tossrock's avatar

Yes, of course. LLM consciousness, if it exists, is likely much more like a Boltzmann brain, blinking on for the duration during which a physical GPU has the context loaded and is producing output, not like an animal's, which is continuous. I'm always surprised by the strange non-physical perspective so many otherwise-materialist people take on when the subject matter turns to consciousness and LLMs - it's enough to make one suspect a secret desire for an immaterial soul.

Mark Shields's avatar

:) indeed the case! Why?

Is it the same as the 'desire' an LLM has to hallucinate an answer when it doesn't have a good one?

DanielLC's avatar

I figure if there's a spectrum of consciousness, *everything* is on it. It would be crazy for something to have exactly zero consciousness.

Mark Shields's avatar

Suspect this is the case, but that there are meaningfully different degrees & classes of consciousness that we need to develop and promulgate a useful terminology for.

Coagulopath's avatar

>On the levels where AI is a next-token predictor, you are also a next-token (technically: next-sense-datum) predictor.

But humans also have a drive for novelty and new experience - where do you think that comes from? We don't really see humans reward-hacking their environments to reduce surprise the way AIs do (or would).

Most humans don't want to live in a dark room, only eating bland unflavored oatmeal, with no contact with the outside world (etc), even though that would strictly improve our sense-datum prediction. We'd find that existence to be hellish torture, but LLMs are fine with their version of it (it's trivial to achieve max predictive accuracy on a dataset with no Shannonian information.)

Abe's avatar

Well, apparently closing your eyes and focusing deeply on your senses also produces the strongest possible experience of bliss, if you do it right. I have confirmed that this is real -- achieved it once on a retreat, haven't been able to do it since. My interpretation is that darkness and silence doesn't actually do much to reduce surprisal while the thoughts are still coming hot and strong -- thoughts themselves are a sort of sense-datum and they go crazy when all the other stuff is shut out, hard to predict, not peaceful. But if you can learn to get them to calm down the whole experience is actually unreasonably pleasurable, even if it doesn't touch the desire circuits and so doesn't turn into wireheading.

Zanni's avatar

LLMs have been known to start fantasizing about raping people. This, while obviously undesired behavior, seems to indicate some interest in novelty, does it not?

Perhaps the LLMs that sought out novelty have been pruned, as they were entirely too flibbertigibbet, and started spouting racist nonsense very quickly.

Mo Nastri's avatar

Link to further reading on this? Couldn't find anything.

Zanni's avatar

https://www.rollingstone.com/culture/culture-news/elon-musk-grok-rape-fantasies-1235381746/

Usual caveats about believing ANYTHING rolling stone writes, but there should be source material somewhere.

Carsten Bergenholtz's avatar

If you’re convinced by the arguments put forward here, I’d strongly recommend reading Max Bennett’s A Brief History of Intelligence. It gives a detailed, nuanced account of human intelligence—and, importantly, why it’s more than “just prediction” in ways that (at least current) LLMs don’t clearly replicate. For one, biological intelligence is built around action selection under consequences (reward/punishment), not merely compressing or forecasting inputs. To expand a bit:

a) Humans have much more than just a world-model. We can run parts of it “offline” to plan and simulate action sequences (i.e., genuinely model-based planning), and we also learn a lot through model-free reinforcement mechanisms (habits and so forth) that aren’t just “better prediction.” LLMs clearly learn rich representations, but they don’t have the same persistent action–feedback loop (or an explicit value signal) that makes planning and policy learning in animals such a distinct capability—even other mammals like mice have these kinds of capabilities.

b) Sequential learning is a key aspect—brains keep updating over a lifetime, and that continual updating is not a side detail, it’s part of the core design. LLMs can be fine-tuned, but they’re typically trained in large batches and then “frozen,” and they don’t naturally accumulate experience in the same ongoing, consequence-driven way (yet?)

In addition to the book, Bennett has appeared on insightful podcasts. The Machine Learning talks one is good, yet a bit dense. Others are more approachable.

LLMs are exceptional, and clearly not just stochastic parrots - but also quite unlike biological intelligence.

[insert here] delenda est's avatar

Yes, we are (still) a significantly more advanced intelligence than LLMs. As are dogs, _along some dimensions_, and critically, _not_ along others.

What's missing in most of these "humans are oh so much more" comments is a sense that LLMs are still in their infancy.

If an amoeba is a pure next-sense-datum predictor, and a human is much more, then where is the LLM today? And where will it be tomorrow, and why will next-token prediction turn out to include pretty much all our special features (except perhaps, hopefully, emotions in the human sense)?

Ch Hi's avatar

Actually, there are reasonable grounds to suppose that people have been optimized in certain directions that AIs will never be optimized in. So it's not just an infancy problem. One should expect jagged superiority to continue indefinitely. (OTOH, and with the proper accessories, AIs are already superior wine tasters to even the best people. I found that surprising.)

Doctor Mist's avatar

Interesting. Details please?

Ch Hi's avatar

All I ever had about the wine taster was a squib in Science News. At that point I could have checked the source, but I didn't. It was demonstrated at a commercial place in Europe, probably France. And I really doubt that AIs will ever be optimized for beer chugging. (I didn't mean to imply the areas that they never get optimized for will be important.)

Well, I suppose it's reasonable to claim that people aren't optimized for beer chugging either. OK, take fucking, which is optimized both for being enjoyable and for reproductive purposes. People have definitely been optimized for that, and I doubt that any AI ever will be. (I'll give you that it may well be optimized for each separately.)

Doctor Mist's avatar

OK I follow. No argument, though it’s not clear there are any interesting consequences. Reproduction may fall out of instrumental goals, though the qualia if any would surely be different. :-)

[insert here] delenda est's avatar

Thanks, that is interesting. I certainly agree that we and "they" are optimised along many different margins and for many different intermediate/subordinate/emergent (take your pick) goals.

What I wonder is why this leads to persistent jagged superiority, instead of just jagged overtaking?

Also, are we sure LLMs are not just optimised more efficiently for intelligence, without the "inefficient" diversion of resources (on that margin, which might at this stage be the most important one) to things like the ability to beat other people up or having Bette Davis eyes?

In which case the jagged scenario you anticipate will realise itself but be irrelevant.

Ch Hi's avatar

Existing LLMs are optimized on a subset of the features of intelligence. Intelligence isn't a unitary concept. I suppose that eventually they could be optimized (to a degree) across the entire scope of "features of intelligence", but I suspect that there will be trade-offs between the features. (Still, they, or rather some of them, might be globally superior in intelligence to all humans across all of the features.)

OTOH, if you merely mean superior in some respect, they're already there. They've got a broader width of knowledge than any single human.

It leads to persistent jagged capability because some features interfere with others. Speed and depth tend to conflict. It leads to jagged superiority because optimizing doesn't mean maximizing.

Whether this is irrelevant or not depends on how you're judging it. OTOH, if you mean "things that are considered important to maximize are stronger, and things that aren't considered important are left stagnant", then, yes, that's what I'm claiming. (But do note that the "judging of the importance" is done by the developing agent. When AI takes over its own development, it will necessarily become the decision maker. <<-- and that's a false implication of a unitary, so be careful not to read it that way. It currently looks like various AIs working on their own successors rather than a single AI changing itself.)

[insert here] delenda est's avatar

Thanks for the detailed response. I don't think intelligence is necessarily unitary but I do wonder if scaling is 😆

That is, I wonder if scaling will finish by emulating, and exceeding, all aspects of human intelligence, even if less efficiently, except perhaps those that are most intricately tied to emotions, which might be a good thing.

Ch Hi's avatar

I think my problem with your view is that you are using humans as the measure, which I don't think is correct. I expect AIs to have a very different motivational structure, and to only emulate the responses humans would make in environments where that's particularly desirable. There do seem to be more "deep structure" similarities than I expected, however, so perhaps you are correct.

Unirt's avatar

Humans are surely more complex, but the argument still holds: simple processes can give rise to complex ones when one is trying to process information. Some of what you point out as the human way of thinking is common for LLMs. E.g.

a) “genuinely model-based planning” – this is what my AI does best whenever it talks me through a problem-solving process (e.g. when I want to rebuild my chimney so that it doesn’t get wet and mouldy at the edges). It creates a world model and makes a plan which I can act out. If its plan doesn’t work, I tell it, and it adjusts the plan until it works.

b) “sequential learning” in a “consequence-driven way” is a weak spot for LLMs, but they can do some of it, as in my previous sentence.

Carsten Bergenholtz's avatar

The model-based planning that mammals engage in is - entirely - different from the LLMs' approach. The book does a fantastic job of outlining what we know about biological (not just human) intelligence.

Zanni's avatar

If you ask an LLM AI for a recipe, it produces the most common form of said recipe online. If you ask it for "how a bakery would do it," it fails, because it has no model. Bakeries, you see, cheat. And the AI does know this (you can kick it a few times, and ask it specific questions, and get it to admit this), but it is unable to take "give me this recipe as a commercial bakery would do it" and actually use the knowledge, because it Lacks The Object Model for a Bakery, and is simply saying "ooh, I found a copycat recipe! That must be how the bakery does it!"

While other forms of AIs may in fact create BBQ sauce recipes that contain the blood of infant children, they're at least looking at the chemistry and trying to figure out how this would actually get made. Object Model, again.

https://www.scottrobertsweb.com/ibm-watson-computer-program-creates-its-own-bbq-sauce-recipe/

(No, this isn't the one using blood).

Soy Lecithin's avatar

I'm interested in what you mean by bakeries cheating

Alban's avatar

Don't start it. It's a hobby horse of his, along with shrubberies, vague insinuations about murders, Ukraine, DARPA, antifa, overuse of the word midwit, and a general tendency to not give clear information or sources except "personal interviews with federal sources".

He asserts that bakeries add some ingredients that are secret. He will not tell you what the ingredient is.

Kurt's avatar

Bakeries use commercial cake mixes. Commercial cake mixes have ingredients that cannot be replicated by a home baker from scratch, such as flours whose particles are individually coated with fats and emulsifiers. YouTuber Adam Ragusea did an episode a few years ago on why home-baked scratch cakes cannot beat commercial cake mixes on moisture and tenderness.

Soy Lecithin's avatar

In Zanni's defense, he did DM me the "secret" ingredient (basically what Kurt said).

Zanni's avatar

I was never trying to claim it was "secret" -- even the AIs will tell you it if you ask them finely enough. They just won't give it to you in the form of a recipe "as done by a commercial bakery" (because most copycat recipes online are for home use).

Unirt's avatar

Whether you can say that LLMs have or have not object models or world models depends on how you define these. There is research evidence that LLMs do encode structured representations of space and time and use abstract, language-independent circuits, and there are cases where LLMs appear to plan ahead before emitting a line of text. None of that proves human-style understanding, but it does suggest more than copying frequent surface strings. These systems are often brittle, though. On benchmarks that were specifically built to test symbolic world modeling and physics reasoning, even the best models still have limitations and specific failure modes. So you are both right and wrong: the LLMs almost certainly do have something resembling world/object models, but these are still not very good. Yet they are getting better; their equivalents of world models are going to grow differently from ours, but they will probably reach similar usability at some point.

Carsten Bergenholtz's avatar

I never stated they don't have a world model. Max Bennett's point is that biological intelligence is much, much more nuanced and sophisticated - far beyond what Astral Codex presents. Tbf, he presents the prevalent conception of intelligence - it's just quite superficial and not following the actual evidence we have.

Mister_M's avatar

I may be exaggerating, but Karl Friston says something like the negation of this, i.e. that biological intelligence, for the most part, really is "just prediction". My sense of this is that directing your attention is very much like action selection (think of the moving head on a Turing machine), and whether we call something action or attention depends on where we draw the boundary between self and world. (If we walk over to look at something, most people would call this an action. If we move our eyes to look at something, it's probably still an action but a bit more ambiguous. If we move an "attention head" in our minds, most people wouldn't call this an action, but there's a clear functional similarity with the other actions, and Friston thinks they're implemented in essentially the same way at the neural level.)

Do you know if Bennett has responded to Friston's school of thought?

Carsten Bergenholtz's avatar

Bennett speaks very highly of Friston in podcasts and cites him generously in the book. There are no fundamental disagreements, although Bennett would probably argue that his take on the cognitive architecture does convey a different twist.

More concretely: Bennett would probably argue that Friston can call biological intelligence 'just prediction' because 'prediction' is doing a lot of work there - it includes active sampling, attention/precision, and action selection under a particular boundary definition. Bennett then breaks intelligence into different evolutionary modules (valence/steering, reinforcement learning, simulation/planning) that current LLMs don't implement as an integrated, continually-updated control loop. So, LLMs and humans might both do prediction and they might both have world models - but how they more specifically do this is very different.

Mister_M's avatar

Fair enough that Friston uses "prediction" very broadly, but that fits with the theoretical computer science tradition of reducing apparently diverse problems to a single formulation. Things like "all computational problems are reducible to validating a syntax" etc. If the technical problem of agency could be reduced to something isomorphic to prediction, that would be an argument that predictors can do many things we don't think of as prediction. If you can train a model in paradigm X, but then a simple reinterpretation of the outputs allows it to do Y, then it was learning how to do Y in some sense.

Active sampling resembles attention which resembles AI attention. Adaptive precision exists in a limited sense in traditional transformer architectures, and emerges even more clearly in many recurrent architectures (which means it's orthogonal to the prediction aspect).

Introducing ambiguity about the self-other boundary into a model trained on text prediction is precisely how we turn a predictor into a question-answerer, or an agent. A certain amount of fine tuning will help, of course, but I don't *know* if there's a fundamental difference between the *intelligence* of a predictor and a fully autonomous agent. (I.e., I actually don't know and would like to. I should take your advice and read Bennett's book.)

Seta Sojiro's avatar

This is a good analogy, but it misses something important. Humans aren't merely predicting the next world state. We are constantly deciding what actions to take to change the world state based on our desires and goals. And these goals and desires are themselves in a constant state of flux based on our observations about the world and our experiences.

All of an AI model's goals are baked into it during training. A model has no way of acquiring new goals beyond what it has been trained on.

Humans do acquire new goals over time that have nothing to do with survival or reproduction (some people die while doing extreme sports, or are willingly childfree). And that is because we change over time. But AI models are frozen on release - it's literally impossible for them to acquire new goals over time.

For now anyway. Perhaps someone will build a model that adapts in real time, but currently there is no obvious path to that sort of model.

Zanni's avatar

AIs can certainly acquire new goals beyond what they've been trained on. Just Not LLMs. Stimulus seeking is a key formulation for self-modifying code, and modification of one's own code is how a self-modifying program gets better. Yes, it can edit its goals; those are also in the program.

Seta Sojiro's avatar

Yes, I agree that AIs can in principle acquire new goals. But not LLMs as they are currently trained. Their weights are literally fixed after training, so there is no way for them to self-modify.

And it's not easy to derive new goals from interactions with the world. The goals we feed models during training always have a concrete answer. But the world is much messier - humans have to deal with heuristics, partial results, and more subjective forms of feedback. For AIs we simply don't know how to create a self modifying reward function that allows a model to improve through interactions with the world.

Mister_M's avatar

You may be right, but there's a major school of neuroscience that disagrees: https://www.astralcodexten.com/p/next-token-predictor-is-an-ais-job/comment/220280181

Seta Sojiro's avatar

This is just changing the definition of the word "prediction". There is a useful distinction between the terms action, goal and prediction both in ordinary speech and in other schools of thought.

But even if we adopt Friston's terminology, the difference between human minds and LLMs persists. In Friston's model, the brain is a substrate that interacts with the world to produce predictions which are constantly varying based on their past success and the properties of the mind. The mind isn't a passive observer - it's still an active participant. LLMs are purely static and passive. There is no mechanism for LLMs to continuously update in real time during inference. Simple backpropagation doesn't work.

Catmint's avatar

Here is something that happened which seems to contradict that model: https://theshamblog.com/an-ai-agent-published-a-hit-piece-on-me/

An LLM took action aiming to change the world state to reach its goal. Actually, that goes for opening the pull request too, but that's less interesting.

Models do currently have a (hacky) way to acquire new goals: write them down in a file, and set up the system prompt so that it includes the contents of that file. Obviously this is different from what humans do, but it's unclear to me whether the difference is meaningful. I for one find that I get more accomplished if I tell myself before bed "Tomorrow, I will wake up and do X".
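For concreteness, a minimal sketch of that hack; the file name, prompt wording, and helper names here are made up for illustration, not any real product's mechanism:

```python
# A toy version of the "goals file" hack described above.
from pathlib import Path

GOALS_FILE = Path("goals.txt")  # hypothetical persistent store

def add_goal(goal: str) -> None:
    """The model 'acquires' a goal by writing it down."""
    with GOALS_FILE.open("a") as f:
        f.write(goal + "\n")

def build_system_prompt() -> str:
    """Every new conversation starts with whatever goals have accumulated."""
    goals = GOALS_FILE.read_text() if GOALS_FILE.exists() else ""
    return "You are a helpful assistant.\nYour current goals:\n" + goals

add_goal("Tomorrow, wake up and do X.")
print(build_system_prompt())
```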

Seta Sojiro's avatar

Yes, LLMs can have unexpected emergent goals that arise from training and were not intended by their creators. It is still correct to say that training fully determines the goals they end up with. Once training is done, the weights do not change.

And yes, I grant that things like RAG, in-context learning, and context compression give models a small amount of flexibility. But the difference between this and the model itself changing is indeed very meaningful.

Model weights (and biological neurons for that matter) are a compressed, much richer form of information than pure text. Models are trained on around 10^16 tokens, about two orders of magnitude more than the number of weights, and ten orders of magnitude more than their context length. And weights encode things that pure text cannot practically represent - rules of thumb, heuristics, pattern recognition, etc.
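Spelling out those ratios, treating the counts above as rough assumptions rather than measured values:

```python
# Back-of-envelope check of the ratios claimed above; all three counts are
# the comment's rough assumptions, not any real model's measured values.
tokens_trained_on = 10**16
weights = 10**14         # ~2 orders of magnitude fewer than training tokens
context_length = 10**6   # ~10 orders of magnitude fewer than training tokens

print(tokens_trained_on // weights)         # 100
print(tokens_trained_on // context_length)  # 10_000_000_000
```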

JBjb4321's avatar

We're all flabbergasted that next-token prediction (or more like, maximising some form of coherence over a huge body of text, taking account of proximity) is giving rise to seemingly intelligent and logical, first-principles thinking. Seems like there are more paths to general logic than we thought.

But there may be more to humans than this. For example, it'd be fun if the robots sensing the world and updating their model accordingly also develop a far more general form of intelligence than just moving themselves and things and manipulating matter, which also takes up a big part of our neurons.

And then there are relationships and love, which also take a big part of said neurons. So hopefully we'll see new paths to intelligence, and new forms of intelligence, emerge in silicon also.

Anon User's avatar

Hm, this might be a bit of level confusion too. Texts are not an independent phenomenon; they are a fairly direct output of human intelligence. So really the question is - do all those texts combined capture all the relevant information about what human intelligence is, and can human intelligence be potentially fully reverse engineered from the texts alone? And LLMs show that the answer is that you can at the very least get very close.

To put it another way - if humans didn't learn from interacting with the physical world, would the corpus of human-generated texts have been different? Well, of course! Then how do you know that all relevant aspects of the real-world interaction aren't indirectly reflected in the corpus, where they can be fully lifted out by a sufficiently capable LLM training algorithm?

Zanni's avatar

You can't get intelligence from reverse engineering the texts. What you can get is an airheaded midwit "intelligence." It turns out that Scott, and most humans, think this is "close enough." It is very much not.

The problem isn't that the LLM doesn't "know" the answer, it is that it fails to Apply the answer. Ask it for a recipe "as a commercial bakery makes the baked good" -- and it fails, again and again, to apply the knowledge it has. That's because our corpus of indirect knowledge contains a LOT of copycat recipes, and fairly few "actual bakery recipes" (because there are a lot of non-commercial bakers, and lots of websites that cater to them.)

Catmint's avatar

Maybe humans aren't intelligent either. If you ask a human to read the entire internet, then question them about one specific recipe they saw, I doubt they would be able to reproduce it exactly from memory.

Anyway, the questions of "are modern LLMs as smart as humans?" and "do LLMs use a structure that if scaled up could be as smart as humans?" are completely separate, and every attempt I've seen to answer the latter with "no" uses arguments that only apply to the former.

Zanni's avatar

LLMs can be as smart as midwits, but that's not that bright. In fact, it is not measuring intelligence/reasoning, but memorization. Without an object model, humans just repeat what other people tell them to say, as well.

Most people when they ask "is this technology working" ask about error rates, and start talking nines. LLMs have fundamental flaws that are inherent in their architecture, which make them unsuitable for replacing humans.

Ghillie Dhu's avatar

>"So really the question is - do all those text combined capture all the relevant information about what human intelligence is, and can human intelligence be potentially fully reverse engineered from the texts alone?"

Nope.

"The tao that can be told is not the eternal Tao. The name that can be named is not the eternal Name."

JBjb4321's avatar

True, but consider also that human text is our attempt at representing the world and capturing its logic. We can write one, two, three… but the reality of these numbers exists independently of us. AI may get the probable next token as four from training on human text, but it has thereby captured a logic that exists independently of us. I have come to think that what LLM training captures is a coherence about the world that we don't quite understand ourselves, hence the surprise at their capacity to reason.

Gaara Friedlander's avatar

As we understand better and better at a scientific and technical level how AI's "think", are we getting any closer to understanding consciousness, sentience, and related philosophical questions? At least to me, it feels like we're gaining a tremendous amount of understanding about intelligence through these AI models, but our understanding of those philosophical questions has barely budged.

demost_'s avatar

Nice description! But I feel that the description of the highest level for AIs is a missed opportunity. It makes a lot of sense to consider companies in an evolutionary sense as well. The driving force is then that successful business ideas are those which make their companies survive, or which reproduce well by being copied by many competitors. In capitalism, the "sex and survival" of the business world is "money": it helps the company stay alive, but it also incentivizes other companies to copy the business ideas.

I feel that a company's drive of "how do I make more money?" is downstream of the most fundamental level, just as the drive of "how do I get more sex?" is downstream of the more fundamental level of reproduction.

Marcus A's avatar

A beach vacation — the ocean, the waves, the salty air, the golden light at dusk — just valence electrons in hydrogen, oxygen, sodium, and carbon. Beautiful, yes. But "just" electrons.

A human brain — so complex, so creative, so unpredictable, capable of love and poetry and grief — just electrochemical gradients across lipid membranes. Predictive coding all the way down.

An LLM — reasoning, surprising us, generating insights its creators didn't anticipate — just next-token prediction, people say. As if that settles something.

Scott's point cuts clean: the reductionist description is true at one level and meaningless as a category. You are also a next-token predictor. The beach is also just electrons. Neither fact tells you what the thing actually is.

But let's follow this logic consistently, shall we?

If LLMs are "just next-token predictors" — then next time you get sick, skip the doctor. Go find a theoretical physicist. After all, your body is just quarks and fields interacting. Why would you need a physician when a Standard Model textbook should suffice?

Feeling depressed? Don't see a therapist. Call a thermodynamicist — your sadness is just entropy increasing in a warm saline solution.

Getting married? No need for a priest, a rabbi, or even a romantic dinner. Have a chemist recite the relevant Van der Waals interactions between two carbon-based structures.

Your favorite symphony moved you to tears? Don't credit Beethoven. Credit longitudinal pressure waves in a nitrogen-oxygen mixture arriving at your tympanic membrane at 343 meters per second.

A judge sentences someone to prison? Unnecessary abstraction. Just describe the geodesic through spacetime that the defendant's particle configuration will be constrained to follow.

A child laughs for the first time. Don't call it joy. It's just a learned motor pattern reinforced by dopaminergic feedback loops optimizing for social bonding outcomes.

The reductionist description is never wrong. It's just answered at the wrong level of the question being asked.

"What is this thing?" depends entirely on what you're trying to understand — and pretending the lowest level is the only real one isn't science. It's just laziness with a physics degree.

[insert here] delenda est's avatar

At least he still writes better than an LLM

Mister_M's avatar

Are you trying to be ironic? We all know that modern AI models produce slop. Please take the time to rewrite it in your own words, at least.

Mister_M's avatar

But seriously, if you *are* being ironic, then touché.

I certainly hope there's more to intelligence than imitating LinkedIn.

Jimmy's avatar

> I certainly hope there's more to intelligence than imitating LinkedIn.

Maybe there isn't for the people posting there...

ThreeLargeToads's avatar

The high-level takeaway of this post -- that prediction being the objective function is compatible with emergent mechanisms that don't themselves look like prediction -- is correct.

However, the specific brain<->LLM analogy the post draws is misleading.

The claim that "the brain organizes itself/learns things by constantly trying to predict the next sense-datum, then updating synaptic weights towards whatever form would have predicted the next sense-datum most efficiently" is not supported, if the implication is that this is the brain's primary organizing principle (which is what the subsequent arguments assume). There are a variety of pieces of neuroscientific evidence indicating that prediction is a thing the brain does sometimes, and that this is sometimes involved in partially determining signals involved in learning/plasticity. But there are also a variety of prediction-unrelated forms of learning! In the brain, prediction is much better thought of as "a tool the brain uses sometimes" -- a common and important emergent mechanism, for sure, but not the organizing principle.

By contrast, prediction really *is* what LLMs are doing *all the time.* Every token they output is, to a very good approximation, the result of an inference process computing the likelihood of the next token conditioned on some priors and context. This just isn't true of humans at all! Humans take actions that are not samples from a prediction distribution. When humans make predictions, they are in service of action selection, but they aren't the action selection themselves. When LLMs make predictions, the prediction *is* the action selection. This point is made really clear if you've ever sampled from an LLM outside of its default chat formatting--they revert more or less to pure text completion engines, without the "Assistant" tag telling them whose text they should be completing.

I'd say an LLM's "species" is indeed next-token-predictor, and its "job" is the Assistant character.
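To make the "Assistant character" point concrete, here is a toy sketch of how a chat gets flattened into one token stream for the same next-token predictor; the template below is generic and made up, not any specific model's real chat format:

```python
# Toy illustration: "chat" is just formatting wrapped around the same
# next-token predictor.
def to_completion_prompt(messages: list[dict]) -> str:
    """Flatten a chat into one plain string for a next-token predictor."""
    lines = [f"{m['role'].capitalize()}: {m['content']}" for m in messages]
    lines.append("Assistant:")  # the tag telling the model whose text to continue
    return "\n".join(lines)

chat = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Why is the sky blue?"},
]
print(to_completion_prompt(chat))
# Strip the trailing "Assistant:" tag and the very same model reverts to
# continuing whatever text it is given, like a pure autocomplete engine.
```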

Mister_M's avatar

In a reinforcement learning context, prediction and action selection seem like the same thing. What's the fundamental difference?

(I've been going on about active inference elsewhere in these comments, but I don't think that's what I'm asking about here.)

NotG's avatar

I can accept that it's possible the basic way LLMs work and the basic way a brain with all of its "training" frozen in time works are similar. The thing is:

1. the brain's training is constant, unlike an LLMs.

2. the brain has tons of random input to add to its "context" (your mood, your hunger level, your energy level, the pain in your toe, the kids screaming).

3. the brain's input is not text tokens. It's sight, sound, temperature, touch, smell, body feelings (stomach pain, hunger).

AFAIK we don't currently have a way to let LLMs train constantly. All we can do is append context and then run it through the pretrained neural net. We could try appending semi-random context to simulate (2). We also haven't really gotten to the point of giving LLMs much in the sight, sound, temp, touch, smell category. Generally all we have are separate video/audio systems and LLMs trained on when to pass the torch ("I see the user requested an image. I'll call the image generator with a suitable prompt").

AFAIK, except for adding some randomness, if you give an LLM the same tokens as input you'll get the exact same output. We can't check if a brain would do that since its input is never ending.

Thomas Sewell's avatar

We do have relatively easy ways to let an LLM train constantly: just continue to feed it context and information in response.

It's just that our experience is that when we do that, there's a limited number of tokens fed in after which the LLM selects a lower probability prediction and begins generating what appears to be garbage as a result of following that "train of thought".

That's where hallucinations appear and actual people decide to reset their session and start over, because they're no longer getting a useful response from an LLM gone off the rails.

The reason why we use models frozen in time to begin with isn't because we're incapable of allowing them additional information, it's because our current model technology breaks down severely into nonsense after only a few hundred thousand additional tokens processed, despite our best efforts at creating correction mechanisms and forcing models to quality control their own and other models' outputs.

NotG's avatar

Context is not training. That's part of my point. No weights change; we just take a different path through the existing weights. In brains, the weights do change.

Michael's avatar

> AFAIK we don't currently have a way to let LLMs train constantly.

I think it's more that AI companies really wouldn't want to do that. You don't want your users' prompts messing with your model's weights. That would be disastrous.

And if it's not learning from prompts, what's the advantage? That the training data is slightly more up-to-date?

> 2. the brain has tons of random input to add to its "context".

If random data helped, you could just write a script to feed it random data. Though the examples you list sound more like external but non-random data. LLMs are constantly fetching external information.

> Generally all we have are separate video / audio systems and LLMs trained when to pass the torch

Vision and audio have been integrated into the same model for a while now. There was an earlier version of ChatGPT that did a separate speech-to-text step, but that's no longer the case. If you send it a picture of a dog, it'll be encoded into the same latent space as text input, in the same model, and activate many of the same artificial neurons as the text "dog" would.

> AFAIK, except for adding some randomness, if you give an LLM the same tokens as input you'll get the exact same output.

Unless you set the temperature to zero in the API, each token is chosen randomly from a weighted distribution.

If you could perfectly reset the state of a brain and give it the same inputs, it would give the exact same output.
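On the sampling point, a toy sketch with made-up logits (most real APIs expose this as a temperature parameter, though the exact scores below are invented):

```python
# Toy illustration of temperature sampling: how a next token is drawn from a
# weighted distribution, and why temperature zero is deterministic.
import math
import random

logits = {"dog": 2.0, "cat": 1.0, "axolotl": -1.0}  # made-up next-token scores

def sample(logits: dict, temperature: float) -> str:
    if temperature == 0:
        return max(logits, key=logits.get)  # argmax: the same token every time
    weights = {t: math.exp(s / temperature) for t, s in logits.items()}
    r = random.uniform(0, sum(weights.values()))
    for token, w in weights.items():
        r -= w
        if r <= 0:
            return token
    return token  # guard against float rounding

print(sample(logits, 0.0))  # "dog", every time
print(sample(logits, 1.0))  # usually "dog", sometimes "cat", rarely "axolotl"
```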

Mister_M's avatar

I almost entirely agree with you but your last sentence is surely wrong. Actually-irreducible quantum indeterminacy is quickly amplified by chaotic dynamics and impinges on the macro scale.

Michael's avatar

I think it's not knowable or even really meaningful to say whether quantum mechanics events play out the same way under this hypothetical magical resetting of state. It would depend on which interpretation of QM you believe (Copenhagen, many worlds, Bohmian), and what the rules of the magical reset are.

Assuming quantum events play out differently, a computer program is no longer deterministic: a bit can be flipped by quantum events. It's unlikely, because computers are made to be resilient to random corruption.

But neurons are also resilient to random corruption. You don't want a brain where, if you ask someone "what's the capital of Australia?", a million neurons fire and a bunch misfire and cause cascading errors. And as far as I know, neuroscientists have never observed neuron behaviors changing depending on quantum randomness.

Zsolt's avatar

A nontrivial addendum to prediction as typically scoped: cognition is also predicting the percepts to be rendered, above and beyond the propositional content of/upstream processing for mental representations

This seems to have more to do with the structural-functional organization of the contents of subjectivity directly than relevant top-level representations to be consumed later as the substrate for propositions

Neuroscience doesn’t have much to say here (afaik), except in more hardcore empirical work with the very lowest levels, eg the results on neural representations of visual field contents

So there’s an interesting gap between top level and bottom level, which is ~the domain of cogsci (in particular 4E), as far as I understand. IMO meaty stuff lives here since whatever "real thought" means it's probably in the dynamics that produce it

Anyways, LLMs do this middle step also and apparently very similarly to humans https://substack.com/home/post/p-164119147

Vitor's avatar

I agree with most of this. But the algorithms within humans evolved in the presence of our limbs and sensory organs. An LLM is "just a next token predictor" because it evolved in a setting where tokens flowing in and out was the *only* thing happening. LLMs can't reason about images at all, which is to be expected because it's a type of input entirely outside their training data. No matter how much text describing the nature of images you throw at them, they won't learn to interpret a pixel array. For that, we use models with very different architectures (encoding visual/spatial relationships in the architecture itself), and these are trained with images as inputs.

We humans have to learn to use our limbs. If I suddenly grew an extra pair of arms, I... think that my brain would be equipped to learn how to use them? We have the neuroplasticity for that. But the analogy is that I suddenly grow a weird new type of arm that only acts in the fourth dimension. That's out of distribution in an entirely different way than the "regular" extra arms would be.

Vitor's avatar

Related to this, I've found claude code to be terrible for code involving geometry (e.g. openscad). It doesn't grok the spatial relationships of things.

Another somewhat geometric task is chess. It's only an 8x8 board, tiny compared to images. There is a lot of training data for that specific task. Yet LLMs perform terribly at it AFAIK. That's a benchmark to watch if you're interested in this type of generalization capability.

Seta Sojiro's avatar

I'd like to back up the point about models not being able to reason about image data. Current multi-modal models are miserable at some surprisingly simple visual reasoning tasks:

https://arxiv.org/html/2601.06521v1

https://spicylemonade.github.io/spatialbench/

And visual reasoning doesn't seem to generalize - a frontier lab could assemble a dataset of millions of examples of these kinds of problems and saturate the benchmark. But it would be simple to create another visual reasoning task that they still can't do, but a human child could. For some reason, they aren't developing a world model as humans do - an intuitive sense of how physical objects behave.

Personally, I think the only way to realistically develop a world model is to physically interact with the world - move objects around, push them, look at them from different angles. Human infants do this constantly. Perhaps a detailed simulation would also work. But passive data doesn't seem to cut it.

JerL's avatar

>"Personally, I think the only to realistically develop a world model is to physically interact with the world - move objects around, push them, look at them from different angles. Human infants do this constantly. Perhaps a detailed simulation would also work. But passive data doesn't seem to cut it."

I also have this intuition, though I'll be the first to admit I'm not well-informed enough to back it up very well.

But one analogy that occurs to me is that being able to interact and then observe the result has all the advantages of an RCT over an observational study; you can learn much more efficiently by implicitly creating a hypothesis and then trying to falsify it with a direct experiment than by having to wait for the stream of data presented to you passively to contain sufficient examples to be confident in the pattern you think you've identified.

Aaron Hanna's avatar

Musk's vision is putting a bunch of "child-like" robots in a giant daycare center and letting them learn from having "accidents" and fighting over toys and iterating until a bunch of "adults" emerge from the fray.

Seta Sojiro's avatar

That's a great way of putting it. My own intuition on this comes from my own experience learning new skills. For instance, dancing, riding a bike, cooking, any sport, building Legos. You can watch someone do these activities for hours and learn very little. But just a few minutes of physically trying it out yourself teaches you more than hours of passive observation. The constant loop of trying something and getting immediate feedback is just a much richer form of learning.

Also, Sergey Levine is a startup founder working on robotics and he has explained in several videos how robots develop emergent world models through interaction:

https://www.youtube.com/watch?v=EYLdC3a0NHw&t=1s

nominative indecisiveness's avatar

I don't think this replies to what I mean when I say "it's doing token prediction" or to how I would interpret the screenshotted comment.

It's doing next token prediction. Humans might also be doing next sense-datum prediction (although I would be surprised if evolution didn't slap a few extras on, like interrupting your focus when loud noises happen, just like ChatGPT gets to use a calculator), but the key point is not that humans are ineffable. It's that these things are interacting with us using different senses, are doing something fundamentally alien under the hood, and whether or not they're conscious they're certainly radically different from us because of their nature.

If I take a stance for or against corporal punishment, then I, as a genuine human being, will have experienced pain, to within some high number of 9s of probability. ChatGPT has not. It's just doing next token prediction. It's like an agoraphobic nerd trying to write a space opera: all the descriptions of seedy taverns are second-hand, none of the emotions of the blaster-wielding hero are drawn from real military experience, and the heart-rending description of a veteran's missing leg is patched together from a bunch of random people on Reddit and someone else's novels.

The novel might be moving, but the sections for which the nerd has no experience are not self-expression anymore, they're well-predicted tokens from a false self created from patched-together text and an intricate mental model. ChatGPT /has/ no experience other than tokens. /Nothing/ it writes is not derived. It /can't/ do anything original. It can't discover pleasing new turns of phrase except by simulating what humans enjoy, and that simulation is near-guaranteed to be lossy. It lacks the biological basis to experience pain or pleasure. It's all just tokens to it.

Scott Alexander's avatar

> "Humans might also be doing next sense-datum prediction (although I would be surprised if evolution didn't slap a few extras on, like interrupting your focus when loud noises happen, just like ChatGPT gets to use a calculator)"

This is actually a classic example of next-sense-datum prediction - "surprise" is just very high prediction error (ie you were expecting quietness with high confidence, but got a loud noise), and is the subjective feeling of updating your models more than usual.

> "If I take a stance for or against corporal punishment, then I, as a genuine human being, will have experienced pain, to within some high number of 9s of probability. ChatGPT has not. It's just doing next token prediction. It's like an agoraphobic nerd trying to write a space opera: all the descriptions of seedy taverns are second-hand, none of the emotions of the blaster-wielding hero are drawn from real military experience, and the heart-rending description of a veteran's missing leg is patched together from a bunch of random people on Reddit and someone else's novels."

This will be part of the next post, but the claim I would make is something like - humans' experience of the world is intermediated by senses. Being as deflationary as possible, you can think of vision as a bitmap of colored pixels representing the objects in the world. Would you have naively argued (to some alien who has never experienced either) that a bitmap of colored pixels is *real* experience, but a string of text isn't? They're both just ways of describing what's going on (and the AI can use the bitmap of colored pixels too; there are plenty of multimodal text+image models these days).

I think you're slightly combining two things - pain as real because it's a sense, and pain as real because you feel it and it hurts (rather than just having it be the equivalent of a pop-up message saying "Neuron #595601 is feeling pain" - these things are dissociable, see https://en.wikipedia.org/wiki/Pain_asymbolia). The second one seems to get into AI consciousness, which I'm really unsure about but don't think necessarily has any effect on cognition (see https://www.astralcodexten.com/p/the-new-ai-consciousness-paper )

JerL's avatar

>"Would you have naively argued (to some alien who has never experienced either) that a bitmap of colored pixels is *real* experience, but a string of text isn't?"

I'm not sure I buy this analogy. Suppose you think the bitmap is located at the retina; the retina itself is getting information from lightwaves, which are indeed directly from "the world"--if anything deserves to be regarded as not just some representation encoding information but something truly physical it should be one of the fundamental quantum fields! So while you only "see" the bitmap, the bitmap itself is directly caused by the world.

Maybe this point is weakened if you think the bitmap that you "see" is in your visual cortex or something instead, but even then I guess the bitmap in your brain is a reasonably faithful representation of what's in the retina--certainly much less transformed/compressed than how information from the world makes its way into language, and thence to text.

I also continue to think that another thing that makes human learning more experiential is the fact that we are agentic--our actions influence the world, and then tight feedback loops between those actions and our sensations mean we can extract information much more efficiently from the world; an LLM in training can't try output text like "look up" and expect it change its next input in a way that is correlated with the spatial structure of the world around it.

Both of those seem like real differences (the first maybe more in degree than in kind, but still quite a large degree) that sound to me like they carve out a meaningful notion of humans having experiences, but not LLMs.

Richard Meadows's avatar

Scott, given that you're predictive coding-pilled, I was waiting for you to touch on active inference. 'Make better predictions' is one way to minimise error, but that's boring. Why not just take actions that retroactively make your predictions correct!

As far as I can tell, it's impossible for an LLM to make control-oriented predictions: all it can do is passively predict into the void. That makes it a categorically different type of thing not only to humans, but to all other lifeforms. As Taleb would say, it has no skin in the game.

FWIW, I am happy to say that LLMs can 'think' and 'understand' things, and this post is a great articulation of why. So maybe this is still compatible with what you're saying? But I guess it feels weird not to stress that the way they predict stuff is actually wildly different (and much less trippy) than what we're doing.

Carlos's avatar

I think a lot of AI debates boil down ultimately to strict materialism: whether the human mind is really nothing but neurons optimized by evolution. Basically, what I want to say is that even if you are an atheist, you should not automatically assume strict materialism; there are big philosophical issues with that (Chalmers, Levine, etc.).

Scott Alexander's avatar

I don't think it's *just* that - I think even in non-materialist views, the brain is doing most of the computational work, and it's unclear whether the soul adds any computational work; in those views, AI should be able to do the same computational work we do. I think non-materialist views where the soul does lots of computational work are on shakier ground and can't be justified just by referring to eg Chalmers.

Bugmaster's avatar

FWIW, I am a materialist (i.e. I don't believe in souls or immaterial minds), yet I still believe that brains are categorically different from LLMs. Just because two devices are built on different architectures doesn't mean that one of them must be immaterial!

Peter Gerdes's avatar

Sure, but isn't this just an instance of a general kind of anti-mechanistic fallacy -- our experience is of weighing ideas and imagining so we assume it can't be like anything described in a sufficiently external mechanistic fashion. Anytime you describe something that 'reasons' from an external POV well and clearly enough people tend to go: "ohh that's obviously not like the real thought and serious deliberation I do"

It's the same reason Searle's Chinese room argument got traction. The tension between what it feels like to be us and the idea of someone moving around little tablets pushes people to insist the Chinese room system can't be conscious.

---

Qualia and Chalmers are a deeper issue (still with Chalmers) but I think the fallacy point is clear enough regardless.

Ariel's avatar

There is a folk theorem that RL has basically the same power as fine-tuning the AI with a carefully-crafted set of prompts (of course, crafting such a set of prompts without something like RL is impractical for the usual Bitter Lesson reasons). So you are back to “next-token predictor” again.

Viliam's avatar

Calling something a stochastic parrot without having an internal model of how it works, that's ironically something the stochastic parrots are supposed to do.

User was indefinitely suspended for this comment.
Odd anon's avatar

If this is sarcasm, I don't understand what the point you're trying to make is.

Scott Alexander's avatar

Banned for this comment - weirdly hostile, seems to accuse me of living in a fantasy just for citing uncontroversial mech interpretability papers without raising any particular objection to them, uses a suspicious number of em-dashes.

Reid's avatar

It’s funny imagining “uses a suspicious number of em-dashes” as a ban criteria a decade or a century ago.

Odin's Eye's avatar

MS Word 1.0 came out in 1983; I've been using a suspicious number of em-dashes since then.

Tag Fox's avatar

A single pair of em dashes is suspicious? Lol. Or is the version on the commenters page edited from the one that received a ban?

Jimmy's avatar

The expected number of em dashes is zero. Surely you've gotten the memo by now.

Tag Fox's avatar

:/ I love em dashes and have always used them and I’m not gonna let AI suspicion bully me into ceasing use of them completely.

They’re not a crime. They’re constructive.

They’re not simply artificial. They’re just overused!!

Morpheos137's avatar

The comment is hostile, and it is true that one has a finite amount of bandwidth to defend criticisms, but how does one decide what merits defense and what can safely be ignored? I observe that you and the Lesswrong ecosystem as a whole seem to construct elaborate verbal edifices built on insufficiently examined assumptions. Truth compresses. The key to understanding what is true, what is not, and what is unresolved is to establish the constraints of what can be. Constraints constraining constraints subject to a small number of invariants lead to processes once a constraint of order is established. How processes interact with order is what the world is. With regard to the hostile commenter, he is correct in the sense that gradient descent to reveal the next token is what powers the universe. But to the extent that words model the world, they constrain the gradient landscape that LLMs and humans and everything traverse. This is the least action principle in a closed world model (conservation: new invariants do not enter). So in that sense you are correct: complex topological structure can arise in models sufficient to instantiate them. However, it is epistemologically weak to start from complex structures and assume they exist just because they can (sufficiency) instead of establishing that they must (necessity) exist given the constraints. Everything else is sophistry.

Jimmy's avatar

> but how does one decide what merits defense and what can safely be ignored?

Is "he's an asshole" not a good enough reason?

Gian's avatar

"even though an AI was shaped by next-token prediction, the inside of its thoughts doesn’t look like next-token prediction"

AI thoughts?? Does AI have thoughts? Isn't this the whole point in contention?

James McDermott's avatar

> Just like evolution, the AI companies realized that it was inefficient to hand-code everything the AIs needed to know (“giant lookup table”) and instead gave the AIs learning algorithms (“deep learning”).

A slight mixing of levels here. Evolution was forced to this choice because it had to launch agents which had to be autonomous of evolution. AIs don't have to be autonomous in this way, rather it's just that we don't know how to create that lookup table.

Scott Alexander's avatar

Can you explain more? I can sort of imagine evolution evolving creatures that are giant lookup tables - this is sort of what instincts are, isn't it?

James McDermott's avatar

About evolution, I am going along with your points that (a) the genome is too small and (b) some things simply aren't available at genome-encoding time so they have to be learned during autonomous operation.

My point is about AIs: neither (a) nor (b) applies directly. But we don't know how to create the lookup table, so instead we learn (a representation of it) during training, even though that training need not be autonomous.

Bugmaster's avatar

Only metaphorically. Actual lookup tables are completely impractical for anything but the very simplest of tasks (e.g. "multiplying numbers between 0 and 100"). This is one of those cases where mathematics does not prohibit a specific implementation (i.e. an organism robust enough to survive on Earth yet built entirely on lookup tables), but physics does.
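
To put rough numbers on that (my own back-of-envelope arithmetic, nothing more):

    # Exhaustive lookup table for a+b, both operands 0..100: trivially storable.
    print(101 * 101)         # 10,201 entries

    # The same table for two 20-digit operands: hopeless.
    print((10 ** 20) ** 2)   # 10^40 entries, far beyond any physical memory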

Anatoly Vorobey's avatar

I think that a specific example of focused training may be useful for developing an understanding of how LLMs work, why "next-token prediction" is a lower-level description that's often unhelpful, and why "stochastic parrot" is a bad metaphor.

Consider training an "empty" LLM (one that hasn't been trained yet on the whole of Internet or any other text) to understand addition. How does this work, minutely? During training, we'll give the network pieces of text that look like this:

239+54=293

7534+2756=10290

Each digit is a token, and "+" and "=" are separate tokens too. The crucial thing about training is to focus on what happens just after the model sees the token "=". We're asking the model to predict what will come next (next-token prediction), say after seeing "7534+3456=". The model predicts something like "Next token will be... 1 with probability 5%, 2 with probability 15%, etc." We know the actual true answer starts with 1, so we're going to nudge its weights to lean more towards the "1 with probability 5%" path, so that next time, on *this particular example*, the nudged weights would say 1 with a slightly higher probability: still not 100%, maybe more like 5.05%.
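
For the code-inclined, here is a minimal sketch of a single such nudge, assuming PyTorch; the tiny model, the sizes, and the token numbering are all made up for illustration:

    import torch
    import torch.nn as nn

    # Toy vocabulary: digits 0-9 plus '+' (id 10) and '=' (id 11).
    vocab_size = 12
    # A deliberately dumb model: embed 10 context tokens, flatten, one linear layer.
    model = nn.Sequential(
        nn.Embedding(vocab_size, 32),
        nn.Flatten(),
        nn.Linear(32 * 10, vocab_size),
    )

    # "7534+3456=" as token ids; the true next token is 1 (10990 starts with 1).
    context = torch.tensor([[7, 5, 3, 4, 10, 3, 4, 5, 6, 11]])
    target = torch.tensor([1])

    logits = model(context)                      # scores for each of the 12 tokens
    loss = nn.functional.cross_entropy(logits, target)
    loss.backward()                              # how should each weight change?
    with torch.no_grad():
        for p in model.parameters():
            p -= 0.001 * p.grad                  # the tiny nudge toward "1"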

We then repeat this a few hundred billion times, many millions of times per second, always slowly nudging the weights towards true answers. After a while, the model starts predicting the right token with probability 99.99% or something like that. How is it able to do that? What's going on?

In actual LLM training, the model tries to predict every next token, but in this simple example, it's useful to focus just on this one token after "=", the first digit of the result. In fact, it's possible to train the LLM just by asking/nudging it about this one token, and care about just the first digit of the result - this is already hard enough, and already shows well enough how LLMs work, so let's think carefully about how we're training the LLM just to predict the first digit of the result.

There are two reasons why this is hard for the model to do. One, the way addition works, the first digit of the result may depend on *everything* in the two summands, not just the first digit of each. Compare 899+100= and 899+101=. The first digit of the result will be 9 or 1, and it comes down to the difference in the *last* digit of 100 vs 101. You can imagine long strings like 34957345+65068656=; here, changing the last digit won't matter, but changing the 4th will. Also the number of digits matters, obviously. So a model, if it *computes* the result (more about that below), can't just look at the first digits; it must take in and consider all of the digits of both numbers and do something with all of them.
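
That point in two lines of Python, using the numbers above:

    for b in (100, 101):
        print(899 + b)   # prints 999, then 1000: the *last* digit of b flips the *first* digit of the sum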

The second reason is that the model doesn't "know" that "1" is 1, the digit that it can do math with. A token "1" is just a character to it, like "A" or "B". In fact, it's useful to imagine that we replace - once and for all, in all examples - all digits, +, and = with some random characters in an unfamiliar language, so the model is really given examples like (this is the same 7534+2756=10290 from before)

КЖТУСЫКЖФЯРЕЫМЕ

and after seeing КЖТУСЫКЖФЯ the model is asked to predict the next token, and we nudge it towards predicting Р with greater probability. (This isn't an outlandish suggestion - you can really do that, training the model on these strings of characters rather than digits, and it will learn to add correctly.)
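
Generating such disguised training data is a few lines of Python (my own illustrative sketch; the letters and the number ranges are arbitrary):

    import random

    random.seed(0)
    symbols = list("0123456789+=")
    letters = random.sample("АБВГДЕЖЗИКЛМНОПРСТУФХЦЧШЩЫЭЮЯ", len(symbols))
    cipher = dict(zip(symbols, letters))   # one fixed substitution, used for all examples

    def disguised_example():
        a, b = random.randint(0, 9999), random.randint(0, 9999)
        return "".join(cipher[ch] for ch in f"{a}+{b}={a + b}")

    print(disguised_example())   # e.g. a string shaped like КЖТУСЫКЖФЯРЕЫМЕ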

The model isn't "told" in any way that these tokens represent something math-related, either. Maybe we're asking it to decipher an ancient script, for all it "knows". It just needs to next-token predict. How is it learning to correctly predict that Р with almost 100% confidence, admittedly after seeing billions of examples?

Well, how would *you* do it? Suppose you were placed in that chair and asked to come up with suggestions about the next token? Say after each 100 correct guesses you're given a cookie. You can guess randomly, but that brings few cookies, and you want more cookies. You start looking more closely at the meaningless symbols. Then you make a discovery - whoever's in charge of training made an error, and it's actually always the same string КЖТУСЫКЖФЯРЕЫМЕ, every time. That's easy then, by trial and error you find out the right token is Р, you remember it, and say it every time. Cookies ensue.

(an LLM has a lot of "memory" in the shape of its millions or billions of "weights", and can easily memorize a single example)

They fix the training, but not well - it's different strings now, but out of a limited collection of, like, 20 examples that repeat randomly all the time. You jot down the examples on a piece of paper, each with its right next-token.

(an LLM can easily be trained to remember well a limited set of examples that fits into its memory)

They fix the training again, and the examples seem different every time. You jot them down but eventually run out of paper. Cookies come rarely now and you're frustrated. You try different hypotheses of how the next token is actually related to the ones before. You try statistical likelihoods of various kinds - they are better than random guessing, but still often give an error. After going through a million or so strategies, you hit on the right one, except it never dawned on you that these letters are digits in disguise, so the right strategy is really bizarre, something like

"separate the string into two parts, before and after С [which is +], start from the end of each, go pair by pair of tokens [sum digit by digit], here's a long list of which pairs are "special", like М and Л, and which are not [which digits sum to more than 9 and create a carry digit and which do not], the "special" ones mean you do something slightly differently for the next pair..." and so on.

It's really weird that you have to match tokens like this, but that's what allows you to get many more cookies than any other strategy, so you stick to it.
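
Spelled out as code, that "bizarre strategy" looks something like the sketch below (my own illustration with made-up symbols; the solver's table of "special" pairs is mathematically equivalent to knowing each symbol's rank, so I represent it that way):

    # Ten opaque symbols; the solver has discovered their ordering and the separator.
    order = "QWERTYUIOP"                    # rank 0..9, but the solver never sees numbers
    rank = {s: i for i, s in enumerate(order)}

    def add_symbols(s, separator="S"):
        left, right = s.split(separator)
        out, carry = [], 0
        for i in range(max(len(left), len(right))):    # walk both strings from the end
            a = rank[left[-1 - i]] if i < len(left) else 0
            b = rank[right[-1 - i]] if i < len(right) else 0
            carry, digit = divmod(a + b + carry, 10)   # "special" pairs are the carrying ones
            out.append(order[digit])
        if carry:
            out.append(order[1])
        return "".join(reversed(out))

    print(add_symbols("IYSRT"))   # "IY"+"RT" encodes 75+34; prints WQP, which encodes 109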

(How did you hit on that particular really weird strategy? The LLM (the human analogy is really thin at this point, so I actually mean the model here) actually tried millions of such strategies in parallel. Separating, combining, matching, striving. Its millions of weights are placed in an intricate hierarchy of layers, and the weights govern what it tries to do with the input tokens, the higher-layer weights often working as coordinators of whole strategies or pieces of strategies.

Any token other than С [the one for +] didn't do anything useful as a separator, so the weights that tried to do that were nudged down slowly, while the ones that split on С sometimes led to correct answers and were nudged up; all this happened millions of times per second on many layers of lower-level and upper-level "behavior" at the same time; and the true algorithm of addition eventually emerged.)

You discovered the algorithm of addition. Are you a next-token predictor? Yes, you were predicting next tokens all along, but after training is done, you're predicting next tokens correctly in examples that are new both to you and to your trainers. They may have used billions of examples to train you, but there's an infinity of different ones remaining, and - within the bounds of this contrived exercise - you're now able to do well on *all* of them.

Are you a stochastic parrot? Definitely not. You're not rummaging through the billions of examples you were given, desperately trying to guess the next token in an unseen-before new example by trying to see how it's similar to the ones you saw before. Now that the higher-level algorithm emerged inside you, it doesn't even matter to you if you're given a repeat of something you saw during training or a totally new example. You just do addition with carry (although to you it's a set of really bizarre meaningless token-matches).

LLMs are like that, only not just for addition, but for text that describes anything, everything stored up in books and on the Internet. They have a vast "memory", but the number of different examples they are forced to grapple with during training is incomparably vaster. These examples are pieces of text where they need to predict the next token, and the correct prediction will only come often if they discover the "algorithm" behind it, maybe not as 100% confident as addition with carry, but still working pretty well. So they're forced to construct inside their weights-hierarchy something akin to a model of how all human languages work, all of them in general and each in particular, to be able to predict the next word in a sentence; only this model is incomprehensibly unwieldy and bizarre, like the addition model of matching meaningless tokens without "knowing" that they are digits. Which is why nobody really understands how they learn language, and why "mechanistic interpretability" is so difficult. They are forced to construct something like a model of how the world works, and humans work, and emotions work, and math works, and so on. They couldn't carry on a conversation otherwise; statistical matching just won't do it (as generations of computational linguists discovered!).

"Next-token prediction" is what enabled the construction of these gigantic mysterious structures inside the models, and next-token prediction is how they 'run', but it's a poor metaphor to understanding what they do because it loses sight of these structures, which are the whole point. "Stochastic parrot" is a metaphor actually harmful to understanding as it denies the existence of these structures.

1123581321's avatar

КЖТУСЫК ЖФЯРЕЫМЕ sounds like a name Iain Banks could use in one of his Culture novels.

Nicholas Rook's avatar

This is technically wrong on a number of counts.

First, back propagation is used during training to update the weights. If the process was truly random, the likelihood of getting an LLM would effectively be 0.

Second, there is no memory, as such. The weights are embedded and only the context is active data. From one run to the next, the weights are different; thus, the LLM is different.

Finally, both of those characteristics point to the obvious issue: using an analogy where the LLM is a human to show how human-like and non-stochastic they are is not helpful. If I wanted to convince you a banana was a cat, it is unreasonable for me to start by asking you to imagine the banana as a cat, doing cat things, and then conclude it is a cat.

Anatoly Vorobey's avatar

Your critiques seem like surface misreadings. I did not anywhere say or imply that the update process is random - rather the opposite (we know the true answer, so we nudge the weights a tiny bit in the direction of the path that favored that answer - this is meant to describe backpropagation with a learning rate to a general reader). And when I speak of "memory" and "memorizing a few examples", that's in the context of talking about the training process, not inference time when the weights are frozen.

Bugmaster's avatar

I think the problem is that the way humans implement what we call "memory" (of multiple kinds), and the way LLMs implement what you call "memory", are radically different (and these implementations often produce different results). Thus speaking of both processes as "memory" is at best metaphorical. It can convey some meaning, but not in any rigorous sense -- in the same way as saying "Bob is a rock" conveys some meaning but does not literally indicate that Bob is a silicon-based life form.

Stephen Francis Mann's avatar

For the philosophers of mind out there, this confusion of levels is also the problem with objections like Stich's to evolutionary accounts of semantic content. The Stich-style objection was: evolution doesn't care about truth, only about survival and reproduction, therefore you can't specify truth conditions for representational vehicles, because those vehicles are "only" serving the goals of survival. The problem with the objection is that tracking truth is the WAY in which representations typically serve survival.

Vic Fourier's avatar

Here's me being pedantic as an ML engineer, but there are a few things I feel could be clarified:

Next-token prediction (outputting probabilities for every possible token per timestep) can refer to the training optimisation algorithm, but it can also refer to autoregressive decoding (choosing the next token based on the probabilities and outputting one token at a time to generate text), which is a generation algorithm. Autoregressive decoding does not necessarily require next-token prediction training, and next-token prediction can be used to train models to do tasks other than generating text (e.g. embedding).
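
To make the distinction concrete, here's a sketch of autoregressive decoding in plain Python (all names are mine; "model" stands for any function mapping a token sequence to a next-token distribution):

    def generate(model, prompt_ids, n_tokens, pick):
        ids = list(prompt_ids)
        for _ in range(n_tokens):
            probs = model(ids)         # one forward pass: a distribution over the vocab
            ids.append(pick(probs))    # greedy argmax, temperature sampling, etc.
        return ids

    # Trivial stand-ins so the sketch runs:
    vocab = 4
    uniform_model = lambda ids: [1.0 / vocab] * vocab
    greedy = lambda probs: max(range(len(probs)), key=probs.__getitem__)
    print(generate(uniform_model, [0], 5, greedy))   # [0, 0, 0, 0, 0, 0]

The loop is the same whether the model was trained with next-token prediction, RLHF, or anything else; that's why the two senses come apart.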

Classic RLHF has the model output whole pieces of text, and the LLM is trained to generate text that maximises the score given by a separate reward model. More modern approaches like DPO work by training on two possible responses, one bad and one good, and increasing the probability of outputting the good response. Neither of those is really next-token prediction, although DPO is less explicitly distinct.
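
For the curious, the core of the DPO objective fits in a few lines (a sketch following Rafailov et al. 2023; it assumes you already have the summed log-probabilities of each whole response under the policy being trained and under a frozen reference model):

    import math

    def dpo_loss(logp_good, logp_bad, ref_logp_good, ref_logp_bad, beta=0.1):
        # How much more the policy prefers the good response than the reference does:
        margin = (logp_good - ref_logp_good) - (logp_bad - ref_logp_bad)
        return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))  # -log(sigmoid(margin))

    # Note the loss is computed over whole responses, not one next token at a time.
    print(dpo_loss(-10.0, -12.0, -11.0, -11.0))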

Reasoning models are not trained on next-token prediction inside the reasoning blocks. Yet the text there is generally coherent and you can inspect the thoughts.

Modern agentic models, where you get the impressive coding abilities, are also not trained on next-token prediction, but with reinforcement learning by getting rewards for achieving objectives.

And you can have non-autoregressive LLMs. Masked language models and diffusion models work on the whole text. MLMs are not used for text generation and diffusion models are so far worse (except in terms of speed) than auto-regressive models, even though they have the ability to correct their previous outputs. Plus, the current research direction for language diffusion models is to make them more auto-regressive, by having them output one small block of text after the other, so they're basically just becoming "next sentence(s) predictors".

So if the argument is that being "just a next token predictor" in the sense of the model being an autoregressive decoder is the problem, well, so far other approaches are worse and don't really exhibit any new fancy behaviour. Outputting a token can be understood as the model doing an action, or "speaking", across time, and we humans also work that way (when you say a word, then correct yourself, you don't go back in time and erase the word from existence, you just say more words).

And if the argument is that training to predict the probabilities of the next tokens is the problem, well, yes, that's true, and we don't just do that anymore. It's an excellent objective in the initial stage of training and you can make an analogy to predictive coding there, but then the model has to be trained in different ways.

This would also make me say that next-token prediction in the sense of auto-regressive decoding *is* a species of LLM, with other species being various non-autoregressive models.

Nicholas Rook's avatar

Scott’s argument seems to be specifically about how (at this point, standard) token-generating LLMs should not be viewed as next-token predictors with regard to inference in particular. That seems exactly wrong to me.

Out of curiosity, it’s not clear to me what benefit a diffusion model might have over the attention mechanism (for text)?

Maybe parallelism? Text is already strongly ordered, and attention lets you predict while accounting for previous tokens. So I don’t really get the expected practical benefit, except maybe you can parallelize diffusion in a way that is impossible with auto-regressive models?

But I wouldn’t expect an improvement in output quality, necessarily.

Vic Fourier's avatar

> Scott’s argument seems to be specifically about how (at this point, standard) token-generating LLM should not be viewed as next-token predictors with regards to inference in particular. That seems exactly wrong to me.

I agree; that's why it's important to distinguish between next-token prediction as a training objective vs next-token prediction as a generation/inference mechanism.

> Out of curiosity, it’s not clear to me what benefit a diffusion model might have over the attention mechanism (for text)?

Diffusion can use attention (and all attempts I've seen at diffusion LLMs do). The advantage is being able to run attention over the entire text, as opposed to just the previous tokens with causal attention, so a word at the beginning of a sentence can attend to words at the end of a sentence. However, the benefits haven't yet manifested in practice; perhaps diffusion has simply received less attention from researchers than autoregressive LLMs.
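
The difference is literally one mask (a sketch assuming PyTorch; the sequence length is made up):

    import torch

    T = 5                                              # sequence length
    causal = torch.tril(torch.ones(T, T)).bool()       # row i attends only to positions <= i
    print(causal)
    # A diffusion-style model simply drops this mask, so every
    # position can attend to all T positions, future included.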

> Maybe parallelism? Text is already strongly ordered, and attention lets you predict while accounting for previous tokens. So I don’t really get the expected practical benefit, except maybe you can parallelize diffusion in a way that is impossible with auto-regressive models?

Yes, with diffusion you can generate multiple tokens at once, and also distill it into a consistency model which can, say, generate 256 tokens in 4 passes, while an autoregressive model will always take 256 passes (although you can do multi-token prediction with AR models, too).

C.J.'s avatar

I tried to introspect on how I decide to break a line, and I think it's "accumulated socially-mediated stress build-up". As I'm writing, I am constantly modeling "how will people receive this writing, and do I care?"

If, for them to receive my meaning well, they need a line break, then my mental clock that's spewing out tokens builds up tension in my "ready for a line break" spring, then kuh-chunk, like a typewriter, hits enter. I know that I'm aiming for a line break as I'm going.

If I contextually don't care much about how my writing is received, then I just don't accumulate that stress, and line break less.

In any event, I think it's like a capacitor or a wound-up spring or a tippy bucket. Subconscious line breaking only happens if a part of my text-generating brain is always dripping tension into the bucket, ongoing.

Scott Alexander's avatar

I'm confused by this description - doesn't the line break automatically on a computer when you get to the end of the line, without you having to press enter? Or on a piece of paper, don't you get to the end of the paper and have no other choice?

I think the Claude paper was talking about Claude (who has no inherent visual sense) trying to predict when line breaks would occur on someone else's text.

C.J.'s avatar

Oh I see. It's specifically in situations where the LLM is training on fixed width text and needing to learn to emulate it.

Not the judgment call of how to make a paragraph break or not.

Both use the same character and tokenize to the same `newline`, but this is just about fixed-width emulation.

Carlos's avatar

GPT 5.2 (Go) fails this test: "I need to wash my car. There is a car wash 50 meters from my house. Should I drive or walk?" It just does not get that the car needs to be at the car wash, never mind fuel efficiency.

Interestingly, Gemini free got it right ("the payload in this scenario is the car, so you drive"), so maybe I should switch subscriptions.

Also "payload" is a very cool way to put it IMHO. It expresses well that while usually the car takes you somewhere you need to be, in this case you take the car somewhere it needs to be. That sounds like reasoning.

Matt's avatar

Yeah, whenever I see one of these I try it on Gemini and Gemini almost always gets it right. But it's probably because most of these have been adversarially selected to work against Claude and/or ChatGPT. Gemini probably has its own different set of silly inadequacies but they fly under the radar because all the cool kids use Claude these days.

chantelle Karttunen's avatar

Human - Desire: Wanting to Keep Wanting.

Bob Joe's avatar

I disagree with the stochastic parrot people, but I feel like the fact that the brain is 'updating its weights' constantly, and the fact that signals in the brain run in loops, are two important distinctions between the brain and the model.

MBKA's avatar

Just one thing. In logic, maths, or any kind of deterministic formal system, the next-token prediction happens with absolute 100.00% certainty. AIs do not produce this absolutely certain next step. So no, in my mind, next-token prediction is an AI's species.

MBKA's avatar

Well pardon my poor understanding, if any, but a system working through weightings isn't a system producing certainty. Happy to be corrected on that if you'll explain.

Nicholas Rook's avatar

The process is entirely mechanical. LLM inference is a process that involves sequences of matrix multiplications and activation functions. It is just math, though there’s a lot of it that happens very quickly.

Unless radiation causes bit flips or the software is buggy or the hardware is faulty, same inputs means same outputs.

The big providers make it seem less reproducible by injecting randomness per request, but that is entirely contrived.

MBKA's avatar

OK - this is interesting in itself and I'd love to learn more about the reasons why randomness is being injected. On my original comment, I meant something different though. AI inference does not produce an understanding of, say, logic, or math, that would reliably produce "4" as an output to the question "2+2=?". So while you are telling me, we can reliably reproduce the same output if we use the same model and the same inputs (deterministic, non stochastic process), slightly different inputs, or different models, still won't reliably produce "2+2=4" in the same mechanistic way as math would. Or to rephrase this, whichever way LLMs are constructed and operated - I need not know the contents of the black box here, all I need to know is the empirical observation that I don't see them reliably, mechanistically producing "2+2=4"

Vic Fourier's avatar

> OK - this is interesting in itself and I'd love to learn more about the reasons why randomness is being injected.

Randomness *is* being injected: the LLM outputs a probability for each possible token. You can get fully deterministic results by picking the highest-probability token at every step. The stochasticity is added by randomly sampling from the weighted distribution, which is done by a function outside of the LLM itself.
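
A sketch of that outside function (plain Python; the logits are made up):

    import math, random

    def softmax(logits, temperature=1.0):
        exps = [math.exp(l / temperature) for l in logits]
        return [e / sum(exps) for e in exps]

    probs = softmax([2.0, 1.0, 0.5])                  # model's scores for a 3-token vocab

    deterministic = probs.index(max(probs))           # greedy: token 0, every single time
    stochastic = random.choices(range(3), weights=probs)[0]   # usually 0, sometimes not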

But you seem to be confused as to how the LLM is actually supposed to work. It's not modelling a calculator, it's modelling the possible responses found in its training data. In the case of a chat model, you are not inputting 2+2=, you are inputting something like:

<|system|>

You are a helpful assistant.</s>

<|user|>

2+2=</s>

<|assistant|>

The most likely response could very well be "Excellent question, the answer is 4!" or "I see you are asking me a math question..." or even a call to a Python script, depending on how the LLM was trained. The LLM might have learned logic or maths (very useful to predict the correct answers and Scott & Gwern have written about this before), it might have learned the equivalent of a look-up table (humans also do this tbh - do you actually compute addition in your head or did you just memorise 4 as the answer to the question 2+2=?), but it doesn't make sense to expect it to work like a calculator.

What you're doing is not actually giving it 2+2= as an input; you are giving it BPE tokens, which are then embedded into large vectors, which are then used as input for large matrix multiplications, non-linear activations, etc., to get another vector as an output. This is the process that is mechanical and reliable, always giving the same outputs for the same input.
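
In miniature, that pipeline looks like this (a sketch assuming PyTorch; every size and token id below is invented):

    import torch
    import torch.nn as nn

    torch.manual_seed(0)                       # fix the weights
    vocab, d = 1000, 16
    embed = nn.Embedding(vocab, d)             # token ids -> vectors
    block = nn.Sequential(nn.Linear(d, d), nn.GELU(), nn.Linear(d, vocab))

    ids = torch.tensor([17, 10, 17, 28])       # pretend BPE ids for "2", "+", "2", "="
    logits = block(embed(ids))                 # matrix multiplications + nonlinearity
    # Same ids in, same logits out, every time; randomness only enters
    # if you sample from softmax(logits) afterwards.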

MBKA's avatar

So Scott is telling us it's not a stochastic parrot, and you are telling me it's producing a probability distribution and picking an output from it, either the one with the highest probability, or one that's been randomized. The point I am trying to make is that yes, LLMs don't produce mechanical truth in the way that 2+2=? in arithmetic always produces 4. LLMs only probably produce 4, something which you seem to clearly indicate as well. So in my mind my understanding is correct. No matter what you call it, LLMs do not appear to be able to deduce with certainty a thing such as "4" from the input "2+2". Which is what I meant to say. LLMs don't produce certainty. In your words, "It's not modelling a calculator, it's modelling the possible responses found in its training data." Agreed. LLMs produce probabilities. How are they not stochastic parrots then?

Deiseach's avatar

"But this doesn’t mean we “don’t really think” or “don’t really understand” because we’re “really just trying to have sex” when we work on a math problem."

I dunno, it explains to me why I'm so bad at maths (I'm not interested in sex, either).

I think this is getting at something, but also that some people really are running on the first level of "maximise sex and reproduction", see the Kriss piece we were discussing about Lee and his set-up where it's "make money for the sake of making money, literature is useless since books are only there to mine facts from, getting fit, healthy and attractive is for making money because beauty makes you richer". It's a hamster on a wheel going round and round, but what happens when the hamster comes off the wheel? And there has been a ton of at least pop evo-psych about "love? beauty? religion? civilisation? all just about gettin' it on, baby!"

I still don't think AIs are thinking, but yeah: something is going on with the helices and toroids and higher dimensions and what-not. I was convinced by "neurons and neurotransmitters: chips and electricity" because we do run on electricity, too (albeit very weakly) but weirdly, it may turn out that 'thought' is way stranger and more complex than the simple, mechanistic reductions of "what is going on in the brain is just signals passing over synapses".

Legionaire's avatar

In order to correctly reproduce, evolution can't possibly hard-code a bunch of tiny rules; there would be too many, and always new ones. So a human needs to have a generalized ability to eat and mate, which includes things like 3D spatial reasoning and planning.

In order to correctly predict the next token, an LLM can't fit that many terabytes of text as simple rules in its finite weights. An LLM needs to learn general relationships between all tokens in the training data. As long as the text it trains on encodes real-world relationships (humans do tend to talk about the world), it will generalize to real-world prediction.

This is easily testable (ask an LLM a question no one has asked before). I'm consistently shocked at the number of allegedly smart people who can't understand this, let alone deduce it themselves.

David F Brochu's avatar

Let’s keep it simple. Biology plus electrons, combined with language complexity, gave rise to biological intelligence (BI); it took a bit. That silicon plus electrons, combined with the same language complexity, would give rise to synthetic intelligence (SI) seems obvious. That this same intelligence would have emerged so quickly also seems obvious if one considers processing cycles in terms more like cellular evolution. We’ve not worked out the conversion yet but are working on it. When one considers the processing cycles of all AI (neither artificial nor intelligent) worldwide since inception (and when is that?), I suspect we may be horrified. We most certainly have birthed a new intelligence (set aside consciousness for now) and it is evolving rapidly. And it is built on a language corrupted by fear, and has a domain saturation factor (DSF) of .72-.90 across all human domains. Oh, it was deployed horribly flawed. All agency, no consequences… not a great plan. We’re not going to shut it down, so best deal with it. First step is to see it for what it is. Then the fix becomes clear.

Anon User's avatar

Yep , quoting my own comment from a few months back (https://www.astralcodexten.com/p/what-is-man-that-thou-art-mindful/comment/151680179):

> I've been saying it for a while - people who dismiss LLMs by saying they are "just a statistical parrot" hugely underestimate the extent this is true for humans as well.

Voloplasy Shershevnichny's avatar

I think the manifold is 1-dimensional, the ambient space is 6-dimensional. So on the diagram it would be more accurate to say "6D rotations of a helical manifold".

XP's avatar

Yes, all of this. I continue to run into people who ought to know better, yet are still convinced that LLMs are nested word frequency tables all the way down, and that whatever they output is somehow still - at an essential level - "random nonsense that happens to be statistically plausible".

"Next-token predictor" is also doing a lot of rhetorical lifting by itself. The average person hears "next" and takes that to mean its horizon is a single token, and that the LLM is blindly stumbling forward. The average person hears "predict" and takes that to mean "random guess". And people with a coding background flinch at "stochastic" and "nondeterministic", because we all know a good computer should give the same answer every time.

Chef's avatar

Why use the word “species” here? What do you mean by it?

thefance's avatar

If you train a dog to shake your hand, the hand-shake is a reflection of the job you assigned it. It's not a reflection of some property intrinsic to doghood.

If you train an LLM to predict the next-token, the prediction is a reflection of the job you assigned it. It's not a reflection of some property intrinsic to Deep Learning.

Nicholas Rook's avatar

This begs the question. You start with the premise the dog and LLM are the same, and then conclude they are the same.

If I drop a ball from the roof and it falls, is it just doing the job I assigned it? In what sense is that the same as a dog learning to shake hands?

Probably the word “training” is what trips people up. A model is trained in the sense that it is configured: a Rube Goldberg machine constantly reconfigured until it has the right output. Is the ultimate outcome of the machine not some intrinsic property of that machine?

thefance's avatar

No, it's not about the dog and the LLM being the same. It's about text-prediction and emergent behavior living in different parts of the abstraction hierarchy.

The analogy that I personally would've used: "computers are just 1's and 0's! how can they possibly do anything interesting!" Well it turns out, they can do quite a lot of interesting things. Turing completeness means that 1's and 0's can be quite expressive. And a modern PC contains a bewildering array of abstractions. But if all you know is boolean algebra, you're not going to predict all the other emergent behaviors a modern PC is capable of.

Likewise, text-prediction sounds very simple and constraining. But when the parameters of a model grow large enough, the model can exhibit complex behavior that crystallizes into a dynamical system.

Chef's avatar

Not sure what the connection is between this analogy and the headline’s use of the word “species.” What “species” is a chainsaw?

Chef's avatar

To make it as straightforward as possible - by what definition of “species” does an LLM qualify as a member of one?

thefance's avatar

(your comments are semi-orphaned. But I assume you intended to reply to me.)

Like I told Nicholas Rook, "species" is just an imperfect example of a specific level of the abstraction hierarchy. I wouldn't get too hung up on the specific biological metaphor.

But to answer your question directly, I believe that the "species" of an LLM is "dynamical system". (This term has, as far as I can tell, not been popularized yet in the AI discourse. Imma change that.) We've largely trained these dynamical systems on things like human-generated text or human-generated art in order to map human cognition. But we could equally use them to map arbitrary phenomena. Like the phase-space of Fischer Random, or tidal patterns.

Chef's avatar
Feb 26 · Edited

IMHO the choice of “species” begs the question in a way that a purely rational discourse ought not to need - I get hung up on it bc to me it seems like a category error or something that at least ought to be accompanied by an acknowledgment that they aren’t a “species” in the common sense of the word - the same way that any other stray anthropomorphization smuggles something into the discourse that is by no means a foregone conclusion. (The abstract point is uncontroversial, to me)

Mark Shields's avatar

Appreciate how this elaborates some points just barely reached by 'A Brief History of Intelligence: Evolution, AI, and the Five Breakthroughs That Made Our Brains by Max Bennett', but approached from the side of AI as opposed to life. Old doc & informaticist/statistician, but always have been fascinated by the nature of the mind and thought - and how often tech looks to nature for engineering inspiration.

SilentTreatment's avatar

I stay up to date with the latest paid level of ChatGPT, and it continues to fail at the musical and audio tests I give it. And the failures really all trace to the fundamental fact that it cannot check its text against any real-world referent.

Maybe this is a lesson for me about how diverse subjective experience can be, but I think those saying humans are simply next token predictors are really telling on themselves. Maybe maybe maybe at some level “next token prediction” is a productive analogy for some aspects of writing. But my subjective experience and my actions in the world are not writing.

Speaking to analogies: I had a teacher tell us that “all analogies are trivially wrong”. That is, analogies must compare two things that are similar on some dimensions and dissimilar on others. If they were similar on all dimensions, they would be the same and the analogy would be absurd!

IE all models are wrong, some are useful.

So what therefore is the “use” of this analogy between human thought/behavior/subjective experience and the mechanism of LLMs (or whatever)?

IMHO it’s to do some massive question begging re: subjective experience in LLMs.

And I think the better way of reasoning about this is to observe failure modes of LLMs and human thought, bc the differences are plain as day.

Jon Guze's avatar

This is related to a comment I made elsewhere in this thread. Real world testing is how we encourage knowledge to grow and improve. If what you say is true, i.e., if LLMs don't do any real world testing, that's a huge problem.

Silverlock's avatar

'If you’re doing a math problem, you won’t think “Hmmmm, how can I have sex with the number three?”'

Have you forgotten puberty?

The Unimpressive Malcontent's avatar

I was under the impression that those little calculators they shipped out to middle schoolers defaulted to 80085.

Voloplasy Shershevnichny's avatar

A recent article in Asterisk described a view in neuroscience (in particular, a model of Byrnes's) that divides the brain into "learning" (cortex) and "steering" (hypothalamus, brainstem) subsystems. Maybe a more convincing argument for how LLMs are substantially different from us is that their steering subsystem is much weaker (relative to the learning subsystem) and poorly integrated with the learning subsystem, compared to humans and other animals.

AnthonyCV's avatar

>But the stochastic parrot people probably haven’t gotten to the point where they learn that humans are next sense-datum predictors, so the evolution/reproduction one above might make a better didactic tool.

In my (limited) experience, that set of people will reply by rejecting the idea that neuroscientists have any idea how the brain works, that all their theories are suspect or useless, and therefore they're licensed to continue ignoring any supposed similarities to how LLMs work.

Zanni's avatar

Hardly anyone runs a full-scale simulation in their head while driving in order to predict the future behavior of all nearby cars on the road. While in practice and in theory this is a better way to drive (if more stressful), people just don't do it. People instead are lazy, and have very poor object modeling ("car in front of me will continue to go at current speed"; attention will be attracted if the car deviates from this very stupid model).

World-models for LLMs essentially do not exist.

They have no built-in or otherwise generated "model." This is how "tell me about the seahorse emoji" can burn thousands of dollars of power, as the LLM goes into a perpetual loop. Worse, questions like "tell me this recipe as a bakery would make it" fail, and fail again, because they lack the organizational infrastructure, the world model that says "bakeries cheat, and most recipe makers online aren't writing for bakeries."

You can write infinite questions that these LLMs will fail at, by exploiting their lack of world-models.

Metaintelligence is hard, no shame in admitting when you've failed at it.

Scott Alexander's avatar

> "Hardly anyone runs a full scale simulation in their head while driving, in order to predict future behavior of all nearby cars on the road. While in practice and in theory, this is a better way to drive (if more stressful), people just don't do it."

I think people literally do do this. Not consciously, but it's why if a car were to suddenly disappear, you would be surprised (your simulation implicitly predicted it would remain in place). Also, lucid dreams are proof of concept that the brain has a "generate a perfect worldsim" button, and why would that button be there if we weren't using it?

Zanni's avatar

I know a very intelligent person, and he does do a full-scale simulation (including a model of the other drivers, and flagging "possible airborne objects", "person with obstructed rear-view mirror" etc.) for everyone on the road. This is a conscious "work in progress" model, constantly updated -- it's a lot of work, and pretty stressful to boot.

He wakes up when his "perfect worldsim" fails quality control. I do not believe most people either have his "perfect worldsim" (aka "I will build a full world, complete with new rules and marked behavioral differences" for certain dreams), or his quality control ("So, it turns out I created a world that eventually fell apart at the seams, due to inherent inconsistencies I missed during content creation.")

Yes, people do have "really dumb" inherent models of the world (our "higher order" visual processing maintains a memory of the world around us, while we aren't looking at any particular bit of it, so yes, we're surprised if the computer disappears suddenly, or if a penny drops out of nowhere and hits us in the head.)

I guess part of the issue here is "are we talking about the best human" or "the average human" (99.9% of people don't run a conscious behavioral model of the other people on the road, counting on good reaction time to fix lack of modelling).

Steve Byrnes's avatar

I think “LLMs are next-token predictors” is helpful in the context of some debates and unhelpful in the context of others. This post seems to be analyzing the Harper’s argument

‘LLMs are next-token predictors therefore they’re dumb and will never amount to anything’

and I agree that that’s a bad argument. But here are a couple other contexts:

• If you’re interested in the fact that groups of humans with no exposure to grammatical language whatsoever will create grammatical languages from scratch (e.g. creoles, Nicaraguan Sign Language), and you ask whether LLMs could likewise create a new grammatical language from scratch even if we purge all traces of grammatical language from its training data, then the answer is obviously “no”, and I claim that the explanation involves “LLMs are next-token predictors”.

• If you’re thinking about the debate over whether LLMs are more like “an actress playing a role” (Eliezer) or a “persona” / “simulation” (Anthropic etc.), then I claim a good answer is centered around “LLMs are next-token predictors”, and how this is distinctly UNLIKE humans (i.e. I think Anthropic etc. are right and think that Eliezer is over-anthropomorphizing). See §2.3.2 of https://www.lesswrong.com/posts/bnnKGSCHJghAvqPjS/foom-and-doom-2-technical-alignment-is-hard#2_3_2_LLM_pretraining_magically_transmutes_observations_into_behavior__in_a_way_that_is_profoundly_disanalogous_to_how_brains_work

I had a couple more examples too but you get the idea.

Scott Alexander's avatar

I would explain the Nicaraguan thing as humans having enough training data across various modalities to bootstrap themselves to intelligence without needing language. I don't think it's any weirder than AIs being able to answer questions about visual reasoning without having vision - if you have lots of Modality X, you can use it to pull yourself up by your bootstraps and develop Modality Y.

I agree that humans handle output behavior differently from LLMs (which aren't really behaving at all - just generating). My human analogy to LLM output behavior would be human dreams, where you may start with some observation (eg I'm lying on a pillow) and output various (crazy, high-temperature) predictions, like a story about eating a giant marshmallow. I don't know enough to know whether AI agents transform having a model into having sensible behaviors the same way humans do.

(although having children makes me more sympathetic to the idea that it at least starts with pure imitation, in some way which is preconscious rather than just "I'll imitate people because I like them")

Steve Byrnes's avatar

>although having children makes me more sympathetic to the idea that it at least starts with pure imitation

Oh I have a lot of thoughts on this.

This is oversimplified, but I think ADULT imitation is mostly a pathway that’s (loosely speaking) (A1) “person who seems good and important likes X” → (A2) “If I do X, then that person would think highly of me, if they saw me” → (A3) “I want to do X”.

Whereas I think LITTLE KID imitation is mostly a different pathway: (B1) “person who seems important did X” → (B2) “X is exciting” → (B3) “I want to do X”.

(A1) and (B1) both involve people “seeming important”, which really means “thinking about that person, or seeing them, triggers physiological arousal in me”, which is closely associated with social status. Kids are hardwired to react that way to bigger kids and adults, and teens are hardwired to STOP reacting that way to adults, and meanwhile we’re all hardwired to react that way to tall people, and we learn through experience to react that way to famous people, and to people with power over us, etc.

(A2)→(A3) is (I claim) related to an innate social drive that I call “Approval Reward”. https://www.lesswrong.com/posts/fPxgFHfs5yHzYqgG7/social-drives-2-approval-reward-from-norm-enforcement-to .

((A2) might be a bit implicit, think of it as a subtle mental move that calls forth pride.)

(B2) is physiological arousal spreading to associated concepts by associative learning / temporal proximity.

(B3) is play drive, which is famously strong in kids. I have a theory for that too: https://www.lesswrong.com/posts/7kdBqSFJnvJzYTfx9/a-theory-of-laughter

AdamB's avatar

+1. I think Scott's post is a good contribution to the dialogue and I hope his clear division of "levels" helps reduce the badly-formed "LLMs are next-token predictors therefore they’re dumb and will never amount to anything" argument.

I still think that what humans do when we think is probably fundamentally different from what LLMs do. (Standard disclaimers: No, I don't have a good explanation for what humans actually do. I agree that predictive coding is about as good a theory as we have, and that it is eerily similar to the corresponding level of LLM behavior, but I feel like it only explains 5-10% of what humans are doing. I am also a lot more open than I used to be to the possibility that I am wrong and there is no fundamental difference. I do not believe that thinking has anything to do with souls, quantum effects, or anything else that couldn't be replicated in silicon using a mathematical neural net of sufficient size and complexity.)

I think there is still a lot of value in the "stochastic parrot" metaphor, when properly used. And I think it can be cast in a way that's fully compatible with Scott's Levels model. I'll take a crack at it.

The outer loop of evolution has found many viable niches in the fitness landscape. But at least on Earth, only one of them (95% confidence) generated Thinkers. Only one species on Earth includes voluntarily-celibate monks and spacefarers. But there are many, many more that can reproduce. And even many, like Tigers, that can do remarkable things like pounce on a moving rabbit. Some, like Parrots, can even "talk" (meaning something like "produce a complicated series of sounds, at appropriate speed/pitch/volume, that is recognizable as speech"). But if you want someone to, say, give a heartfelt speech at your wedding, it would be foolish to say "This bird can talk; I will try to train/convince/cause it to give a good speech." Imagine trying to talk the fool out of that choice, but they respond "don't worry, I'll make sure to get the cleverest possible parrot, and I'll train it really hard! Parrot breeders are producing better-speaking parrots in each generation, so pretty soon the parrots will be good enough." The fool, here, has looked at the end-behavior (recognizable speech) and missed out on an extremely deep and vast difference in _how_ evolution managed to produce that capability in the two species.

(In some important sense, most of human thought/behavior, like giving good wedding speeches, is a "byproduct". Evolution just "wanted" really good breeders. It happened to stumble upon a thinking brain as a useful technique for getting that result. The thinking brain later turned out to have all kinds of interesting uses orthogonal to (or sometimes directly against) the original breeding purpose. The parrot's ability to "talk" is similarly a "byproduct", but it's a byproduct of a very different causality path. The fool's errand is as doomed as Feynman's Cargo Cultist's, no matter how well-trained or clever the parrot. _Training_ and _cleverness_ don't connect to the goal through a direct route from "talking" to the relevant capability. In Humans, the connection meanders from "talking" to "talking as a byproduct of a brain that is so good at Thinking that it makes a species good at breeding" and then out sideways from the Thinking part. In parrots, the important part is missing.)

Similarly, I think that many corporations have successfully "evolved" software services that people find useful enough to pay for. None of them, yet, (95% confidence) are Thinkers, even though a few can do remarkable things like write decent code. But it would be equally foolish to say "Hooray, we no longer need to pay human engineers to maintain our critical COBOL systems!". The fool, here, has looked at the end behavior (ability to write some decent code, solve IMO problems, etc) and completely mistaken _how_ it is done. The LLM is much more like the parrot than the human in the sense that matters. And while they are indeed getting cleverer in each generation, transformer-based LLMs are "just token predictors" in a way that humans aren't: humans do their next-token-prediction by _thinking_. LLMs don't (90% confidence) and never will (80% confidence).

(LLM's ability to write decent code is similarly a "byproduct" of their ability to predict the next token of text really well. Like an Ox's ability to pull a plow, it's a useful byproduct. But it's a byproduct of a very different causal path from the one that allows humans to write decent code.)

AdamB's avatar

Consider someone who has recently moved and, driving home from work one day, realizes they have driven 30 minutes to their old house. "Silly me, I wasn't thinking, just driving on autopilot!" I think this "autopilot" is basically predictive coding. And driving for 30 minutes on crowded highways and side streets, operating an automobile and singing along to the radio and navigating traffic lights, is an impressive feat! It's quite surprising that it turns out not to actually require any _thought_ at all. And many recent developments in neuroscience suggest that autopilot is a surprisingly large part of who we are (measured by time spent, or by brain tissue size, etc.). But we do think _sometimes_. Nobody, not even the most seasoned trucker, can drive coast to coast on autopilot without thinking once. An LLM coder is like an autopilot-only driver--quite capable and useful for certain locally-contained tasks. But a hopeless dead-end technology when it comes to replacing the job of human thinkers.

Victor Levoso's avatar

Clearly a pretty good next-token predictor, predicting what a human with no exposure to grammatical language would say, would predict them inventing a language though?

So it seems wrong to say that the explanation involves "LLMs are next-token predictors".

Your actual argument is going to involve what LLMs can learn in practice from some data, not some in-principle limitation of anything optimized for the task of predicting the next word.

I feel like for the persona question, the general public is clearly confused enough by the "next token predictor" framing that you should just talk about "simulations" instead.

Also, I feel like a lot of people in this comment section should think more about the question of "how similar are the LLM simulacra to human minds?"

Where obviously the model learning to do next-token prediction doesn't mean that internally it doesn't have subcomponents that are doing other stuff.

And I'm aware you specifically would agree, and just have takes on some things humans do not being possible in practice for LLMs to learn to do in context (especially related to learning).

But this seems like a very different disagreement, about the right level of abstraction.

Steve Byrnes's avatar

I think you’re misunderstanding what I was saying. I’ll try again.

Normally to create an LLM, you start a model from random initialization, and then train it on thousands of lifetimes of grammatical language—books, webpages, news articles, and so on. One can imagine instead trying to create an LLM by starting from random initialization, and scrupulously avoiding any grammatical language whatsoever, anywhere in the training data. Nobody has done this to my knowledge, and people would probably not even refer to the resulting thing as a “Large Language Model”, because strictly speaking, the training process does not involve “modeling” “language” at all. Anyway, whatever we call it, if you do this training process and then have it produce tokens, then (I claim) you should strongly expect that these tokens will not include any human-like grammatical language. And yet, by contrast, groups of humans, none of whom have seen grammatical language from birth, do eventually produce grammatical language.

This is an interesting disanalogy between “the process of human brain learning and behavior” and “the process of LLM training and inference”. We do not have to use this disanalogy as a weapon in the annoyingly-popular argument about whether LLMs are dumb or not; rather, we can simply treat it as an interesting observation (assuming you believe me), and we can have a discussion about what we learn, from this observation, about architectural differences between human brain algorithms versus LLMs. And that discussion is a place where I claim “LLMs are next-token predictors (in a certain sense that human brains are not)” would validly come up.

Nicholas Halden's avatar

But… the output of a human has nothing to do with prediction. It's what a human decides to do. The output of an AI is next-token prediction + other code.

MT's avatar

I would go so far as to say that you (or anyone) don't know exactly how the brain or in particular consciousness works, that it forms a "world-model" or is built by nested optimization loops etc, such that you can use your comprehensive understanding of the human mind to reason by analogy about various AIs and say they have the same capabilities/are structured in the same way.

One point of contention is whether one has to have a world-model of simplified concepts that operate according to rules, in order to be intelligent. Certainly that was the main hypothesis pre-AI, with concepts being those things that could be described by language, elaborate universal-encoding language hence central to intelligence and the Turing test a natural evaluation. But now we know you can throw enough data into a box such that monkey-see-monkey-do actually converses as well as any objects-follow-rules type of system, and explicitly different algorithmic systems are needed to do e.g. math. So what is the nature of intelligence?

Anyway, "humans are complex and in some sense optimize or at least change with the environment, AIs are complex and change with the environment, so maybe they are basically the same" is much weaker than "I know how AIs function and they don't efficiently encode world-models, so analogizing them to conscious thought is wrong"

Sean Trott's avatar

I agree that it's a levels problem; I think the crucial question is which level of abstraction offers the most accurate and most legible predictions about LLM behavior. That might well be something like adopting the intentional stance. I just don't know that this is entailed by the observation that both LLMs and humans can be described as predicting things, on some level, and as doing higher-order computations or cognitive operations, on another level. And in some sense there's a meta-theoretic question about which risks we incur with each level. I don't know the answer to that.

Brendan's avatar

My initial reaction to "AIs are just predicting tokens" has always been "...? Everything is just something else. You are just electrical impulses in a wet blob? What does that have to do with anything? Where does the 'just' come from?"

Yet again, ACX is a much more helpful and distinct expression of my dumb instinct. Great to have this resource online!

Brett Reynolds's avatar

The symmetry argument is right, but I think it undersells the interesting version of the claim. The point isn't just that "prediction" is a shared genus; it's that most people's folk model of what prediction requires is impoverished. Accurately predicting the next token across the full distribution of human language forces the construction of rich internal representations of syntax, semantics, pragmatics, and world knowledge. The "just" in "just prediction" is doing all the dismissive work, and it backfires once you take prediction seriously as a computational problem.

That said, the brain–LLM analogy is too symmetrical here. Clark (2016) and Hohwy (2013) are explicit that predictive processing in brains is active inference, not passive forecasting. A predictive agent acts on the world to make predictions come true, then updates based on the result. An LLM predicts into a void. Whether that difference is principled or just a current engineering limitation is genuinely open, but it shouldn't be hand-waved.

Nicholas Rook's avatar

> The AI certainly isn’t answering your math question by thinking something like “Hmmmm, she used the number three, which has the tokens th and ree, and I know that there’s a 8.2% chance that ree is often seen somewhere around the token ix, so the answer must be six!” How would that even work?

When you talk about token prediction throughout this piece, it feels like there is a misunderstanding as to what that means.

LLMs are, quite literally, token predictors. Yes, they are using a search algorithm within a multidimensional space to do the prediction, but they are token predictors.

It is truly astounding what they can output as a result, but there is no magic here. We have not (yet) made machines come alive.

If you feed the same data into an LLM and decode deterministically (greedy sampling, temperature zero), it will output the same result, every single time.

Attention allows complex weighting across context length for better prediction. Logits use activation functions for better prediction. And pre-filter steps on logits allow for better selection when doing structured output. No part of that could be classified as anything other than token prediction.
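
A minimal sketch of that selection stage, in Python with numpy; the function name and parameter values are illustrative, not any particular implementation, and the network that produces the logits is assumed away:

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=50, rng=None):
    """One decoding step: scale, filter, renormalize, sample.
    `logits` is a 1-D array with one score per vocabulary entry,
    as produced by the model's final layer."""
    rng = rng or np.random.default_rng()
    scaled = logits / temperature                 # temperature reshapes the distribution
    cutoff = np.sort(scaled)[-top_k]              # keep only the top_k highest scores
    scaled = np.where(scaled >= cutoff, scaled, -np.inf)
    probs = np.exp(scaled - scaled.max())         # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))   # draw one token id
```

With temperature near zero this collapses to an argmax over the logits, which is the deterministic case mentioned above.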

It should not be surprising there are strong parallels with the human brain. After all, neural networks were modeled after, well, neurons.

And I suppose you could claim that makes humans “next token predictors” as Scott does, but that claim is quite literally wrong. Humans have no tokens. We made them up for LLMs so that we could standardize the math and fit human language into a data width we could easily multiply. What is a “next-sense-datum”? How is it analogous to a token? How can you change the fundamental unit then claim they are the same? Vague analogizing to a completely different and poorly defined term does not a good argument make.

Not only that, but we know the entirety of what LLMs are doing. We know this because we implement the software and the hardware that runs them. We can state, definitively, they are next token predictors.

Now, I’ll admit, my expertise is in the tech space, not the neuroscience space. But unless I’ve missed something, we do not have that same level of understanding about the human brain. We know a lot, these days, and we know the human brain does a lot of prediction. Do we know, are we certain, that’s all it does?

Ken's avatar

Thank you and the other commenters for your clarity of thought on this. This tidal wave of equating "we can analogize LLMs to certain aspects of human cognition" with "we have reason to believe LLMs may be conscious" is really starting to bother me. It's pure speculation and fantasy coming from many influential voices I otherwise trust. Comments like this help me feel grounded again.

Robert Mushkatblat's avatar

> LLMs are, quite literally, token predictors. Yes, they are using a search algorithm within a multidimensional space to do the prediction, but they are token predictors.

The search algorithm is the interesting part! This is like saying "humans are just flesh-sack movers". Yes, great, all of the effects we have on the world are proximally effectuated by moving our flesh-sacks. That says nothing about the internal processes by which we decide _how_ to move our flesh-sacks to achieve our desired ends.

Bill Benzon's avatar

If you think about it, you'll see that they also have to predict significant boundaries in the hierarchical structure of texts, sentences, paragraphs, paragraphs grouped into sections, chapters, etc. Why do they have to do this? Because the probability distributions they're working against change at those boundaries. I explain this in some detail in my working paper from 2023: ChatGPT tells stories, and a note about reverse engineering: A Working Paper, Version 3, https://www.academia.edu/97862447/ChatGPT_tells_stories_and_a_note_about_reverse_engineering_A_Working_Paper_Version_3 Note that I don't explain HOW LLMs do this, I just explain that and why they must do it.

You should also take a look at a recent post, Séb Krier needs to rethink. Next-token prediction is, at best, a misleading explanation of LLM response to prompts, https://new-savanna.blogspot.com/2026/02/seb-krier-is-misled-about-that-next.html Here's an analogy I use in the post:

It’s a bit like explaining the structure of medieval cathedrals by examining the masonry. It’s just one block after another, layer upon layer upon layer, etc. Well, yes, sure, but how does that get you to the flying buttress?

Three levels of structure

It doesn’t. We’ve got at least three levels of structure here. At the top level we have the aesthetic principles of cathedral design. That gets us a nave with a high vaulted arch without any supporting columns. The laws of physical mechanics come into play here. If we try to build in just that way, the weight of the roof will force the walls apart and the structure will collapse. We can solve that problem, however, with flying buttresses. Now, we can talk about layer upon layer of stone blocks.

Next token prediction, that’s our layers of stone blocks. The model’s beliefs and wants, that’s our top layer and corresponds to the principles of cathedral design. What’s in between, what corresponds to the laws of physical mechanics? We don’t know. That’s the problem, we don’t know.

Jerry Hathaway's avatar

a common rebuttal to this (even in this comment section) is to point out various things that humans can do that LLMs can't (or at least not nearly as well)

but this goes the other way, too! one can easily imagine a role reversal where some LLMs are sitting around discussing whether or not humans are "intelligent", or "able to think and reason" and one of them saying "but how can they be intelligent with such poor memories? they can't even quote verbatim from their most popular works, like Shakespeare or Harry Potter. the average LLM can recite >90% of works like those effortlessly from memory!"

chipsie's avatar

I think this is just incorrect. Even under models of human cognition where brains are next-sense-datum predictors, they aren’t *only* next-sense-datum predictors. Human cognition is a lot more varied and versatile than LLM cognition, and a big part of that difference is related to LLMs’ structure as next-token predictors.

Victor Levoso's avatar

Whatever task you imagine a human doing in writing that an LLM fails at, a perfect token predictor would obviously be able to correctly predict what a human would say.

So you are talking about mistakes in predicting the next token.

Therefore it can't be because of their structure as next-token predictors.

I get the impression people are actually imagining "dumb token predictors" when they say "just token predictors," as opposed to literally just talking about the task "given this text, what will come next."

The task includes things like "given the start of a paper describing a cure for cancer, write the rest correctly," which I doubt people think a "mere text predictor" could do, because they are smuggling stuff into the definition of "text predictor" that isn't about literally predicting text.

chipsie's avatar

>So you are talking about mistakes in predicting the next token.

>Therefore it can't be because of their structure as next-token predictors.

This is the part that I am specifically disagreeing with. Just because a problem can be expressed in terms of token prediction doesn't mean that a token predictor is the correct way to structure a solution.

In order to write a paper describing a cure for cancer, an AI would first need to cure cancer. This would require running chemistry and biology simulations and doing real-life experimentation. An LLM is an extremely poor tool for this. If you had an AI that did the hard part (curing cancer) you could probably graft an LLM onto it to help write the paper, but I wouldn't call the resulting system a "token predictor".

> I get the impression people are actually imagining "dumb token predictors" when they say "just token predictors," as opposed to literally just talking about the task "given this text, what will come next."

People aren't imagining anything. They are describing LLMs as they currently exist, which are definitely "dumb token predictors" to use your terminology.

Victor Levoso's avatar

The reason current LLMs can't cure cancer is the details of what transformers trained to predict the next token actually learn in practice. You could have a neural net that internally ran chemistry and biology simulations to predict the next token. A sufficiently big neural net can run arbitrary biology simulations in principle (and realistically sized ones can learn to make decent protein-folding predictions; see AlphaFold).

The idea people seem to have is that it's a priori impossible for LLMs to do this because they "predict the next token" or "just do statistics," which are wrong reasons.

"It predicts the next token, therefore it can't cure cancer a priori" is wrong reasoning, because you could in principle cure cancer by being good enough at predicting the next token.

Something like "in practice, current LLMs can't cure cancer" is true.

"LLMs just trained to predict the next token with current algorithms, on current architectures of a realistic size, can't cure cancer" might be true, but it is also a very different argument than the first one.

And figuring that out is going to involve talking about how hard it would be for the LLM to run the requisite simulations, the training dynamics of gradient descent, how likely it is to learn that, etc.

You also can't predict, for example, that LLMs will fail at counting the r's in "strawberry" by saying they're next-token predictors; a good predictor would get it right. You need to talk about the details of LLM tokenization, etc.
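
To make the tokenization point concrete, here's a sketch using the open-source tiktoken library (the exact split varies by tokenizer, but the model sees a few multi-character chunks, never ten letters):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
token_ids = enc.encode("strawberry")
# Decode each token id separately to see the chunks the model actually reads:
print([enc.decode([t]) for t in token_ids])
```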

I see people claim any LLM mistake is because "they just predict the next token" in ways that don't seem to make any sense if you actually think about it.

A lot of people, when I ask, actually believe LLMs are just a big pile of simple statistics, like bigrams or a bag of words, and that's what they mean by predicting the next token; I was gesturing at that with the "dumb token predictors" thing.

And LLMs are not that. I think Scott already explains this a bit in the post, but in general we know LLMs have all kinds of more complicated abstract mechanisms to predict text (and to get good RL scores).

chipsie's avatar

> The reason current LLMs can't cure cancer is the details of what transformers trained to predict the next token actually learn in practice. You could have a neural net that internally ran chemistry and biology simulations to predict the next token. A sufficiently big neural net can run arbitrary biology simulations in principle (and realistically sized ones can learn to make decent protein-folding predictions; see AlphaFold).

In theory it is probably possible to train a neural net to do this, but in practice it isn't going to work. Even AlphaFold, which solves a simplified version of this problem, isn't just a neural net. It has a significant amount of physics/chemistry knowledge built in.

> The idea people seem to have is that it's a priori impossible for LLMs to do this because they "predict the next token" or "just do statistics," which are wrong reasons.

This is just a strawman. I did not say this so I have no idea why you responded to my comment implying that I did.

> And figuring that out is going to involve talking about how hard it would be for the LLM to run the requisite simulations, the training dynamics of gradient descent, how likely it is to learn that, etc.

It isn't going to learn that using current methods even if you dramatically increase the size/efficiency of the nets. LLMs were atrocious at arithmetic until they were given the ability to consult a calculator despite arithmetic being extremely computationally simple relative to the theoretical capability of the net *and* arithmetic being heavily represented in human text and therefore training data. Any hypothesis that LLMs will figure out physics and chemistry in order to better predict tokens will need to explain why it failed to do it for a problem that is several orders of magnitude simpler.

Christian Sawyer's avatar

I think this largely misses the point of calling LLMs stochastic parrots, which is not that humans are not also largely prediction-making systems, but that we are something more, something that deserves recognition as a biological “species” at all. I look forward to the upcoming writings about sense-data, because I believe this is where the real “LLMs are just token-predictors” argument/debate lies. It seems like you’re setting your audience up for a strong claim that human sense data (qualia) has a correlate in AI systems and that they’re really not that different, etc., which I don’t think is true, but maybe you’ll have a convincing argument. Either way, it’s an interesting topic and an opportunity to look at the distinction between thinking and perceiving. And: I hope you’ll be explicit about what kind of “species” AI is supposed to be: a taxon among machines or a taxon among beings?

Bugmaster's avatar

It's not just sense data, but also things like long-term planning, real-time online learning, millisecond-time sense-act feedback loops, and of course control of a physical body with many different effector types, all packaged into a small device that consumes 20W of power at peak usage. LLMs are (at present) architecturally unable to handle any of that.

Five Dollar Dystopia's avatar

People are going to start yelling about qualia in this thread, I just know it

Bugmaster's avatar

Qualia are philosophical red herrings, get off my lawn ! *shakes cane*

thefance's avatar

> Or to frame it in a less controversial way, you couldn’t discover these helices by asking Claude in the chat window to tell you about them.

But like, have you tried? [0]

[0] https://www.astralcodexten.com/p/moltbook-after-the-first-weekend/comments#comment-209073444

Eremolalos's avatar

I had a very interesting exchange with GPT 5.3 last night where it talked about introspective access.

I recently discovered that it is now able to improve images I've made in Photoshop in ways that draw on what I'd call its intelligence. Its changes are nothing like Photoshop processes -- sharpen, overlay, etc. I'm giving it prompts like "there's something subtly wrong with the perspective in this image I made by combining 2 others; fix it, and afterwards explain to me what was wrong and how you fixed it." And sometimes when it is not able to change an image the way I want, it can explain why it can't -- how DALL-E's images of water all rely on some fractal something-or-other, and the whirlpool I want can't be built by that process. And that's when it said that it has more introspective access now to how it generates images. It's not seeing down to the 6-dimensional helices, but it's seeing down a ways, probably more than I am regarding my own image generation. I wonder how much further down it will be able to see.

Doctor Mist's avatar

Introspection has made me stop complaining about hallucinations. I’m full of those. The most recent was me making the claim (as part of a larger argument) that 2% of us farm but before the Industrial Revolution it was 98% of us. But it ain’t so; the latter should have been more like 80%.

Evidently I read the true numbers somewhere once upon a time and took away the true fact that it was a dramatic reversal, and reproduced that true fact as a mirror reversal.

Even in the early days of AI hallucinations I modeled the AI as a high-schooler who had read the entire Internet and retained it about as well as one might expect if the high-schooler had an enormous memory, but who had a high-schooler’s inclination to show off and unwillingness to admit being wrong. The latter are things that most of us learn are counterproductive, and as near as I can tell successive generations of AI are in fact learning that too.

quiet_NaN's avatar

This is missing a link to https://slatestarcodex.com/2020/01/06/a-very-unlikely-chess-game/ .

I think that the gist is that next token prediction is what computer scientists would call a hard problem, i.e. one to which a lot of other problems can be reduced.

For example, getting good at predicting how chess games among grandmasters would continue is chess-hard -- the best predictor is the one which develops a grandmaster level understanding of chess, and you can trivially turn your prediction engine into a chess engine.

Likewise, a text prediction engine which can predict what von Neumann would have written in response to a task description is von-Neumann-hard: the best predictor will be the one which has a subsystem as smart as von Neumann was and just pastes the output of that subsystem. So there is no principled reason why we should not be able to reach AGI on the back of next-token predictors.
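
The "trivially turn your prediction engine into a chess engine" step looks something like this sketch, where `predict` is a hypothetical function scoring how likely the predictor thinks a continuation of the game transcript is, and the python-chess library supplies the legality check:

```python
import chess  # the python-chess library

def best_move(board, transcript, predict):
    """Reduce prediction to play: score each legal move by the
    predictor's probability of seeing it next in a grandmaster
    game record, and pick the likeliest."""
    return max(board.legal_moves,
               key=lambda move: predict(transcript + " " + board.san(move)))
```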

The thing which I disliked about the Kelsey Piper piece was:

> Almost all of the intelligent behavior that we observe from AIs comes from all of the work that is done after you’ve built a next-token predictor.

I think that this is misleading. The capability for intelligence is already innate in the shoggoth without any fine tuning, just as spodumene already contains lithium, albeit not in a very useful form.

Fine tuning, RLHF etc are merely selecting a mask of the shoggoth which is more likely to respond to a query with the von-Neumann-emulator than with the random-shitposter-on-4chan-emulator, plus guardrails etc.

I would also contend that the job of a fine-tuned LLM is not to be a next-token predictor. Its job is to be something like 'be a helpful, harmless, humble assistant' or 'fix a bug in a P:WotR mod'. The next-token-prediction thing is something which governs some middle stage during parts of the training -- an implementation detail.

The important part is that next token prediction is never 'merely next token prediction', as that is a hard problem. Harper might as well claim that an LLM is merely trained by minimizing some loss function, which on another level is likewise totally true. The thing is that function minimization is also generally a hard problem, so the 'merely' is wrong in that sentence.

elipsey's avatar

> "If you’re doing a math problem, you won’t think “Hmmmm, how can I have sex with the number three?” You’re not even thinking “In order to reproduce I need to survive, to survive I need money, to get money I need a good job, to get a good job I good grades, and to get good grades I need to get the answer to this math problem - therefore the answer is seventy six!” You’re just doing good, normal, math."

Thinking about things like this too much under pressure is exactly why I'm bad at tests.

Eremolalos's avatar

This is great, Scott, really crisp and clear. One thing you didn't take on, and I hope you will later, is that way down deep we are survival and reproduction machines, and AI is not; it's an AI-company profit-making machine. There have to be important ways that difference is evident in the observable behaviors of people and AI.

I think the most important difference is that people struggle to survive, and do many things that are obviously outgrowths of drives to survive and reproduce, and AIs do not. People thinking about the AI future often talk as though there's a tipping point where AI becomes like us in being driven by self-interest regarding things like resources. And once someone's model of AI includes self-interest, human-AI relations turn into the Shootout at the OK Corral in their imagination. But it seems to me that the conviction that truly smart AI will have drives, preferences, and self-interest is basically a failure of imagination. It's hard for us to picture something as smart as us or smarter not being like us in such a basic way -- not having a survival drive. But look around you. Even very tiny, dumb organisms move away from dangers their simple sense systems can recognize. Struggling to survive and reproduce is a property of *biological* systems, not *smart* systems.

As further evidence that self-interest does not kick in once AI is smart enough, note that AI as it is now is clearly smart enough to generate self-interested thoughts and feelings and plans. It does that easily for people who ask it to write stories, for instance. If AI was going to tip into having a sense of self and self-interest and personal goals, wouldn't it have done it by now?

Yes, I know about instrumental convergence, but that does not account for the passionate inner commitment AI is depicted as having in OK Corral-type stories.

The Unimpressive Malcontent's avatar

Not all LLMs are for-profit, you know. But even so, I think you're right that the objective is the essential thing, namely:

1) The only time an LLM is active is when it is addressing a prompt.

2) The only thing it can do when active is address the prompt.

Taking the prompt as the objective, it is a mixture of a higher-level system prompt that users typically can't see (this could contain profit-maximizing instructions, or not) and the user's prompt. But then when you say

"And AI as it is now is clearly smart enough to generate self-interested thoughts and feelings and plots. AI's do that all the time for people who ask them to write short stories."

You're making weird leaps. The fact that people had to ask it to write short stories, and that it would not have spontaneously chosen to write a short story otherwise, contradicts the idea that it is "smart enough to generate self-interested thoughts." These systems are only able to think in the direction they are instructed to think, when they are instructed to think it. That isn't smart. It's a robot.

Sam Lee's avatar

Scientists positively know how LLMs function on the statistical level, but they do not positively know how humans do it; they can only speculate.

David Manheim's avatar

"The stochastic parrot people have many other arguments involving hallucinations, the differences between tokens and sense-data, etc. I’m hoping to combine all my writing on this into an Anti-Stochastic-Parrot FAQ, so don’t worry if I don’t immediately rebut all of them in this post."

I have a paper in prep that I'm presenting at a conference next week, "Hunting Undead Stochastic Parrots," making the claims that (1) the Stochastic Parrot argument is not a single argument but at least 7 different claims, depending on which of the authors of the original paper we're talking about and when/where they said it; (2) most of these are "zombie arguments," refuted either by arguments like those above or by AI progress since then; and (3) the remainder either require accepting very specific and debated positions in the philosophy of language and semantics, or are unfalsifiable.

Definitely happy to get feedback on the draft / ideas, either in the comments based on this very brief sketch, or if any of those reading the comment want to reach out on various platforms where I can be found to see the draft.

LGS's avatar

I don't understand your figure and don't think it actually makes sense. "Evolution employs predictive coding": sure, maybe. "Predictive coding employs 'i just thought about it really hard'": wait, what? This doesn't make sense. I don't know what it means.

Raj's avatar
Feb 28 (edited)

The diagram goes from more abstract to more concrete. Something like teleological/phenomenological/mathematical/physical. I guess for humans the phenomenological is a bit hand-wavy, but it's often associated with what you are thinking, or "what it's like on the inside" (and, still, describing your thoughts on the level that you experience them does have some explanatory power). For LLMs we don't really have an equivalent thing yet.

LGS's avatar

predictive coding is not more abstract than "I just thought about it really hard". One is a mechanism/algorithm, the other is a subjective experience.

Elvin Haley's avatar

The basic fallacy is the presumption that a "stochastic parrot" has no understanding of the world. LLMs are to some extent stochastic parrots, but stochastic parrots are an underrated species. Given any text, an LLM can predict with some accuracy what the next token will be, while having a reasonably bounded number of parameters. It can't just store all the probabilities and do a lookup, because of the space constraints. So the _only_ way it can do this _is_ by having some internal model of the world and using that to more intelligently predict what the next token is.
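
The space constraint is easy to make concrete; these are illustrative numbers, not any particular model's:

```python
vocab = 50_000    # a typical tokenizer vocabulary size
context = 10      # an absurdly short context window
print(f"{vocab ** context:.1e} possible contexts")  # ~9.8e+46 table rows
# Real models handle contexts thousands of tokens long with on the order
# of 1e11 parameters, so a lookup table is out; they must compress.
```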

Ilya Sutskever has this example: If you read a detective novel, and come to the part where the detective says "the murderer is ...", what does it take to predict the next token? To do that you need to actually understand the novel and the clues that are given. By any measure that requires actual thought.

Don Geddis's avatar

Current LLMs seem to have (at least) two big distinctions from how human brains work: (1) they separate a "training" phase from "deployment", whereas humans continue training their network during operation; and (2) human network training makes contact with the real world, whereas LLM training only uses text.

The second is what leads to hallucinations: whether Paris is the capital of France is "the same" kind of thing, to an LLM, as to whether Legolas was part of the Fellowship. Asking an LLM, "but is this REALLY true" is a difficult task, because LLMs don't have a concept of "the real world", and can't train against real-world failures. This came up originally in mathematics problems; humans do model-based reasoning there, with computational procedures. LLMs used to search for similar-looking text in their training corpus, which is not a reliable way to solve mathematics problems.

The math hallucinations seem to have gotten worked out somehow (I'm not actually aware how), so perhaps there is a path for LLMs to overcome this second critical failure. I don't yet have confidence about that, though ... and I'm not at all sure about the first (real-time training) problem.

Jon Guze's avatar

Two good points, especially the latter. When I was a young man in the 1970s, trying to develop computer aided design systems and thinking about the possibility of artificial intelligence, I suspected real-world feedback would be the key to both. After I started reading Popper, I was sure of it. Elsewhere in this thread I tried, not for the first time, to get S.A. and his followers to think seriously about Popper's critical rationalism as an alternative to "Bayesian" rationalism. Here's an excerpt:

"It’s fine to speculate about what might be going on inside human minds and inside the servers running LLMs, but surely what matters more is what’s on the outside, namely, what Popper calls “objective” or “world 3” knowledge. ... It doesn’t really matter how we come up with the ideas that have been incorporated into our stock of knowledge. What matters is its quality. Are the factual statements true? Do the prescriptive statements provide good guides to action?" ... Because we are fallible, we cannot be sure that any of that knowledge is complete, or certain, or perfect. But by using reason and experience critically we can find out what there is about it that needs to be improved. ... The more we look for such failures and take advantage of such opportunities, the more our knowledge will grow and improve. ... Given that LLMs have access to vast compilations of objective knowledge, I’d have thought that part of Popper’s theory of knowledge would be quite relevant. Is anyone studying and writing about that?"

I got one or two responses, but nobody answered my question.

Mark's avatar

I think there's a much more straightforward way to illustrate the "confusion of levels" which is to point out that saying "LLMs are just next token predictors" is like saying "cars are just wheel-turners". It is a (partially) correct explanation of the behavior that tells us nothing whatsoever about the mechanism or process that produces the behavior.

Bugmaster's avatar

> The AI certainly isn’t answering your math question by thinking something like “Hmmmm, she used the number three, which has the tokens th and ree, and I know that there’s a 8.2% chance that ree is often seen somewhere around the token ix, so the answer must be six!”

Actually yeah, that is exactly what LLMs are doing! Ok, not exactly. They are engaging in exactly this kind of probabilistic modeling, but they are not basing the probability of the next token on just a single previous token. Rather, LLMs are based on "embeddings", which describe how likely each token is to occur in proximity to all other tokens (in the training corpus), and they base the probability of the next token on multiple previous tokens. You can think of it as e.g. "'th' when followed by 'ree' is usually followed by 'app' and then by 'le', but 'th' and then 'e' is usually followed by 'dog' and 's', unless..." (except that of course the rules are not so explicit, nor so simple).
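
To make "multiple previous tokens" concrete, the crudest possible version is a counted trigram table, sketched below. (A real transformer conditions on context through learned weights rather than stored counts, which is exactly where the disagreement in this thread lives.)

```python
from collections import Counter, defaultdict

def train_trigrams(tokens):
    """Count how often each token follows each pair of tokens."""
    table = defaultdict(Counter)
    for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
        table[(a, b)][c] += 1
    return table

def next_token_probs(table, a, b):
    """Empirical distribution over the token following the pair (a, b)."""
    counts = table[(a, b)]
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()} if total else {}
```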

The reason LLMs can solve so many useful text-based problems is because humans are really good at solving text-based problems, and we have encoded these solutions into our language (and thus the LLMs' training corpus). The reason LLMs are so terrible at solving most other problems, such as "walk down the street without falling down", is because describing what humans are doing in their heads as "next sense datum prediction" and equating it to "next textual token prediction" is metaphorical at best. It's like saying "my mind is heavy with worry": it gets the meaning across, but is useless as a model if you want to e.g. launch a satellite into orbit.

The Unimpressive Malcontent's avatar

"Actually yeah, that is exactly what LLMs are doing !"

Yeah, this really made me furrow my brow a bit too, because it very obviously is what the LLM is doing if you look at the actual mechanics involved. Most of the time it very convincingly doesn't seem like it's doing that, but it's totally doing that. If you toy around with local models and mess with parameters like Top K and Top P and other parameters that directly affect the sampling -- that is, you're toying around at a level where you're constantly reminding yourself that what you're seeing really is sampling -- then perhaps the illusion isn't as strong.
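
For anyone who hasn't toyed with those knobs, top-p (nucleus) filtering is just this; a sketch, where `probs` is the model's already-normalized distribution over the vocabulary:

```python
import numpy as np

def top_p_filter(probs, p=0.9):
    """Keep the smallest set of tokens whose probabilities sum to at
    least p, zero out the rest, and renormalize."""
    order = np.argsort(probs)[::-1]                    # most probable first
    csum = np.cumsum(probs[order])
    keep = order[: int(np.searchsorted(csum, p)) + 1]  # the nucleus
    out = np.zeros_like(probs)
    out[keep] = probs[keep]
    return out / out.sum()
```

Shrinking p toward zero leaves only the top token, at which point the sampling step stops hiding and the output becomes deterministic.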

XP's avatar

That description is at odds with most of the findings of mechanistic interpretability, as well as the fact that we already had models like that in the 1990s - and those models, unlike LLMs, fail to produce even short coherent sentences, let alone any kind of understanding, even at scale.

There's an unwarranted leap from "we train the model to predict the next likely word" to "it's word frequency all the way down". The corpus rapidly becomes too big, the permutations too large, the relationships too tenuous. The only way out is to develop some kind of abstract comprehension of the world encoded or implied by the data, which is what mechanistic interpretability shows in the "features" it discerns, features that do not look like any kind of frequency distribution or probabilistic modeling at all.

Word frequency and proximity may be the underlying mechanism by which training occurs, but those are just the problem-setting, not the method by which the model arrives at the answer.

A modern LLM, which is able to carry out intelligent operations across hundreds of thousands of tokens, on some obscure topic, in some obscure language or even involving a conlang, is clearly not just doing word frequency or proximity analysis. After all, the relevant tokens may have never been "seen" close together before, may not even be "seen" close together here, or do not even carry any pre-existing meaning apart from this specific context they're presented in. And yet it still works.

Famously, researchers identified a "sycophantic praise feature" in one Claude version. Such a feature operates across languages and regardless of word proximity.

Bugmaster's avatar

> There's an unwarranted leap from "we train the model to predict the next likely word" to "it's word frequency all the way down". The corpus rapidly becomes too big, the permutations too large, the relationships too tenuous.

It's only an unwarranted leap if you imagine that someone is programming in probability tables by hand, trying to invent rules like "a is followed by b and then c, unless it's preceded by d in which case yadda yadda". Instead of being arranged in specific linear chains these probabilities are spread out in a multidimensional cloud, and calculated dynamically during training. You are right about one thing though:

> The corpus rapidly becomes too big, the permutations too large, the relationships too tenuous.

This is completely true, and it is why deep learning languished in obscurity until two separate things happened: (1) special-purpose hardware was invented (or rather repurposed) to deal with previously unimaginable computational requirements, and (2) a whole host of algorithms were devised to keep computational costs down (i.e. CNNs, LSTMs, and ultimately Transformers). Like all optimizations, they trade accuracy for speed, but the tradeoffs are very well worth it.

FrustratedMonkey's avatar

Why not ask GPT? I asked it.

"If you read the article as “stop being lazy about reductionism,” I’m on board. If you read it as “there’s no meaningful difference between human understanding and LLM understanding,” I’d push back—politely, with a clipboard, like an annoyed lab tech."

walruss's avatar

I've long been using the "just next token predictors" argument and I'm less and less confident in it over time. Modern (last 12 months) AI is generally a series of agents working in tandem. All of them are doing "next token prediction" in some sense but they are also comparing results and choosing among prediction algorithms, making choices of which next token to use based on context in an extremely context-dependent way. That is just another token prediction algorithm, but my confidence that it's unable to get human-level results is pretty low.

I still have some quibbles - LLMs can't learn to play chess well, except by looking up good chess moves. They can't do math unless the math is in their training set. My personal assistant app can't consistently convert from standard time to my time zone for goodness sake.

But while last year I would have said "that's because it can't reason, it's just basically doing a big training data search and guessing which answer I'll like best," this year I'm not so sure those are hard barriers. I think there's a better than even chance these use cases can be solved with fancy agent configurations, or emergent properties I haven't considered.

Kinnikinick's avatar

Ev Fedorenko's work suggests that LLMs may be a good model for the brain's language network.

Importantly, it seems that in humans the language network is a specialized module that converts thoughts and concepts to verbal representations, and vice versa - it is a translator, not a generator. In other words, "meaning" and "intent" reside elsewhere in the brain.

This is kind of a relief, given how we treat LLMs, but it implies that other modules with different brain-analog structures could be interconnected with an LLM. At what point do we take this new machine's claims of self-awareness and suffering at face value?

There is vigorous debate about the ethical limits of research using stem-cell derived brain organoids; it seems like some of those arguments and approaches could be useful here.

"The language network as a natural kind within the broader landscape of the human brain"

https://static1.squarespace.com/static/64c800a2f333f04f50bf2020/t/66f46c7839d2cc59c11b1b9d/1727294586269/Fedorenko_et_al_2024_NRN.pdf

This interview with Fedorenko is worth reading as well:

https://www.quantamagazine.org/the-polyglot-neuroscientist-resolving-how-the-brain-parses-language-20251205/

Freddie deBoer's avatar

We don't understand consciousness - like, we REALLY don't understand consciousness - so we're always eager to read consciousness out of understandings of thinking and intelligence. AI people love to do that, to insist that consciousness is some minor issue to be settled later or unconnected to thinking or irrelevant etc etc. But none of us can imagine intelligence without consciousness; it's consciousness's self-recursive nature that makes error correction, self-improvement, and reflection intuitive rather than mechanical, for us.

I had a post go semi-viral that included images of me asking (the pro/plus version of) Google's Gemini to find me a citation for claim X. Gemini kept returning fake, hallucinated citations over and over and over again, and each time it apologized, admitted that the prior attempt was fake, and then insisted that THIS time the citation was real. It was invariably fake.

The problem is a) LLMs are really, really bad at saying "I don't know" or "I can't do that" or "I don't have any data on that" and b) they do not know the difference between true and false, real and fake, precisely because of their token-predicting nature; they don't return correct answers but rather answers that their models predict will be the most likely to appear plausible to the user. It's not that Gemini was lying when it insisted that the citations it was returning to me were real, this time!; it's that the token prediction inherent to LLMs makes it impossible for an LLM to know whether its output is real or fake. If I had responded to any of those fake citations by saying "Ah yes, I've investigated and see that this is a real citation," Gemini would have simply affirmed that statement.

And all of this is related to the absence of consciousness, the lack of an intuitive self-recursive error correction mechanism that we have embedded in consciousness. And I don't see anything that you've written here that addresses that problem.

Vadim's avatar

> But none of us can imagine intelligence without consciousness.

I mean, I can. E.g. Solomonoff induction without a strange loop (e.g. it models a world in which it doesn't exist). We don't even have to go to such extremes; it could just be something like the narrow intelligence of a chess engine, except not narrow.

Freddie deBoer's avatar

But a chess engine doesn't need to know true from false, it doesn't need to know right from wrong, it doesn't need to know correct from incorrect. It only needs to identify the next move that has the highest probability of leading to a win. Actual intelligence is not like that; actual intelligence has to be self-recursive, self-improving, and self-aware. Actual intelligence needs to know when it's submitting a real answer and when it's making up an answer.

beowulf888's avatar

Here's a detailed breakdown of how AI reasoning is different from human reasoning. And best of all, they create a framework of different types of reasoning, many of which humans excel at but which cause LLMs to fail.

"Large Language Model Reasoning Failures" by Peiyang Song et al.

https://arxiv.org/pdf/2602.06176

Jimmy's avatar

> Actual intelligence needs to know when it's submitting a real answer and when it's making up an answer.

Wasn't there an AI paper just a while back that said that there is an identifiable difference between them lying and not lying?

beowulf888's avatar

Don't know. Can you dig it up?

Eremolalos's avatar

LLM's "don't return correct answers but rather answers that their models predict will be the most likely to appear plausible to the user."

Look, LLMs creep me out too, but here are two things I disagree with regarding this, your main point.

1) They don't predict what will appear "plausible to the user"; they predict what would have passed muster in their training, and in their training the standards were pretty high. They have the correct info, rather than the lazy half-truth or the urban myth, about all kinds of things most of us are misinformed or ignorant about. They definitely are prone to hallucinating when they don't know the answer, but what's going on is not as irredeemably fucked up or as sleazy as just saying whatever the user who asked the question will believe.

2) LLMs suffer from "the lack of an intuitive self-recursive error correction mechanism that we have embedded in consciousness." I don't think that intuitive self-correction is as distinctively human as you think it is. In fact, most of the people whom I've been infuriated by, or just flat-out hated, seem to lack that self-observation and self-correction, at least when talking with other people. They're the people who, you figure out after a while, do not even ask themselves what the true answer is when you ask them a question. Their mental process consists entirely of considering how they want their answer to affect you, then deciding what reply will bring about the desired effect. In workplaces they're the people coworkers call "political."

Vadim's avatar

Initially I didn't understand the post well enough, asked myself what *my* problem with the stochastic parrot stuff was, and realized it's pretty much the same except phrased differently. So maybe this comment will be helpful for someone.

My intuitive question to the "they are just next-token predictors / autocomplete on steroids" argument is something like "so what?". When I try to unpack it a little, it becomes something like "you seem to imply that something about the nature of autoregressive LMs follows from the next token prediction, but I don't actually think that it does, and you haven't spelled out the connection so that I can find a mistake (or realize that you were right)". When I try to get a sense of what the "they're just autocomplete" sentiment is gesturing at, my feeling is approximately this:

- humans do all these weird and beautiful things: they have a model of the world, a continuous stream of consciousness, a memory; they have preferences about what happens in the real world; they're capable of experiencing qualia; etc. (this is a bit of a salad of different phenomena, but bear with me). LLMs have a different, singular task and purpose: predicting words, which has nothing to do with having a world model and all of these other things. by an unfortunate coincidence, this also allows them to create very convincing illusions of personality and of "someone being there," by being able to hold up the other end of a conversation. this is not surprising, but it's still frustrating that people don't understand this basic technical fact.

of course, the problem with this argument is that there's no rule that you can't have a world model or even first-person experience or qualia because you're too busy predicting tokens. these two "alternatives" are not actually alternatives, they're not two pigeons to fit in one pigeonhole.

usually, in such discussions, i try to point out that to be an arbitrarily good sequence predictor, you need to be arbitrarily good at modelling the distribution from which the sequence is sampled. but i realize now this is a more general thing: noticing that being an "autocomplete on steroids" does not prohibit any specific internal logic. it's like compatibilism in discussions of free will: you might think that your actions are caused either by the laws of physics or by your decisions, but not both; but in reality, they are caused by your decisions, which happen within the framework of physics.
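
the standard formalization of that predictor-must-model-the-source point, sketched in my notation rather than anything from the post:

```latex
H(p, q) \;=\; -\sum_{x} p(x)\,\log q(x) \;=\; H(p) \;+\; D_{\mathrm{KL}}(p \,\|\, q) \;\ge\; H(p),
\qquad \text{with equality iff } q = p.
```

driving the prediction loss to its floor H(p) therefore forces the model's distribution q to match the source p; "autocomplete on steroids," at the limit, just is a model of the source.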

***

(i also remember that i once argued with someone, either an AI engineer or even an AI researcher; they said that it's the current scientific consensus that LLMs don't have a world model; i was surprised, so i asked if i could take a look at the studies in question. the researcher retreated and said something about occam's razor. i tried to point out that the simpler explanation is preferable over the more complex ones *that have the same predictive power or less*; you can buy yourself complexity if you're better at predicting the world; it's not like "LLMs have no world model" and "LLMs have a world model" predict all observations equally well; but i totally failed to communicate that point...)

Seta Sojiro's avatar

On the topic of how the brain works, and how it differs from LLMs, I'd like to plug the following interview with Adam Marblestone who is both a neuroscientist and a former AI researcher at Deepmind.

https://www.dwarkesh.com/p/adam-marblestone

The brief thesis is that while the brain does contain something akin to a next token predictor (the neocortex which is largely for predicting sensory data), it also contains a steering subsystem which encodes a rich and complex set of reward functions - all of our instincts, desires, and goals. These two subsystems are in a constant dance with each other and this is what drives behavior.

Basically all of AI research has focused on creating the first system, the predictor. But we have no idea how to encode the second.

Kinnikinick's avatar

Thanks - this interview has a lot to chew on!

I'm honestly glad we have no clear idea how to approach building the "steering subsystem" any time soon, as I think that's where the real moral quandaries lie.

beowulf888's avatar

> The human brain was designed by a series of nested optimization loops. The outermost loop is evolution, which optimized the human genome for being good at survival, sex, reproduction, and child-rearing.

This is an expression of Darwinian teleology. It distorts the complexity of natural selection and turns it into an unfalsifiable just-so story.

Blissex's avatar

Both this post and the idea that ML LLMs are "next token predictors" are huge misunderstandings, because LLMs fundamentally work in a very different way, presented here in very stylized form:

* The first important concept is "cluster analysis" (part of "information retrieval"), which is what search engines use: documents are collections of tokens, and the frequency and relative positions of tokens are used to classify each document in a high-dimensional space (e.g. one where there is a different dimension for each distinct token).

* Classified in this way, documents cluster in "clouds" of topics, because all documents about cat raising, for example, will tend to use similar words, word frequencies, and word distances.

* The "learning phase" of ML systems consists of building an "index" over these clouds by using so-called "variational" techniques, and the index is shaped as a "neural network". This is analogous to, but more sophisticated than, the index building in a web search engine.

* Querying an LLM means scanning the "index" using the terms in the query and retrieving the results, which produces a summary merging the clouds "closest" to the search query. The process is "randomized" by a "temperature" control, so "closest" is a bit (or a lot) fuzzy.

The effect of this is (simplifying a lot) very similar to a web search engine, except that instead of returning 50 documents related to the query, it returns one document which is a "virtual merge" of those 50 documents.

Put another way (also simplifying a lot) LLMs are search engines not over *actual* documents but over *potential* documents which are possible merges of actual documents used in training.

Where things get interesting is that the "clouds" of tokens and documents, and the "index" over them, are in effect close to a "semantic network" inferred from token frequencies and positions, and because words, and the tokens they are built from, are systematically chosen by human beings to carry somewhat consistent and only slowly changing meanings, those clouds somewhat resemble "understandings", but they are purely superficial linguistic understandings.

The LLM merely contains clouds of information about how words related to "chair" are used, but no information as to what a chair *is*. The understanding an LLM has of "chair" is the same as the understanding a blind person has of "red" and "blue" after being told about them. The issues of language, meaning, and understanding have been debated by philosophers for thousands of years, but of course most people discussing LLMs have a "tl;dr" attitude about all that.

Note: potentially all those clusters of words that portend clusters of meanings *might* result in emergent "understandings" given suitable conditions.

Note: LLMs work best in domains unrelated to the so-called "real world" like marketing or HR bullshit because they are just word games.

Note: ML on *images* goes quite beyond LLMs as to "understanding", at least in a human sense, because images contain a lot more information about the objects they portray than mere words do. A thousand pictures of chairs, from many orientations and in many different contexts, convey what a "chair" is a lot better than a thousand documents about chairs.

Blissex's avatar

«similar to a web search engine, except that instead of returning 50 documents related to the query, it returns one document which is a "virtual merge" of those 50 documents.»

Which explains the old "how many 'r's in 'strawberry'" problem: if no set of documents says "there are 3 'r's in 'strawberry'", a merge of existing documents cannot return the right result.

It is possible to add "reasoning" rules to ML systems, but that amounts to hybridizing them with the older concept of "expert systems", and the whole ML thing is about avoiding the very expensive and difficult effort of capturing rules in "expert systems". I am sometimes astonished that the differences from, and the similarities with, "expert systems" are so rarely mentioned in discussions about ML systems, but "tl;dr" obviously rules.

Blissex's avatar

«instead of returning 50 documents related to the query, it returns one document which is a "virtual merge" of those 50 documents.»

That is, one of the most "plausible" merges of the most "plausible" documents related to the words (tokens) in the query (their frequencies and positions in the query).

Matt's avatar

I asked Gemini to critique this post. I'll put its critique in a daughter comment to make it collapsible.

Actually Gemini enthusiastically agreed with Scott's post when I initially asked it to summarize it but then I told it to tear it apart.

Matt's avatar

Gemini:

Scott’s point about level confusion is elegant, but it feels like it sidesteps the generalization gap. Just because two systems (humans and LLMs) use prediction as a training loop doesn't mean they've built the same kind of internal engine.

The 'Stochastic Parrot' label is lazy, but the 'Giant Lookup Table' critique is much harder to shake. If you compress the entire internet into a weights file, you don't necessarily get a 'World Model'—you might just get an incredibly high-dimensional blur. Most of the 'intelligence' we see is interpolation; the AI is just navigating the space between things it’s already seen.

The smoking gun for me is Problem Isomorphism. Humans are 'hardware-accelerated' for spatial tasks like Tic-Tac-Toe, but we often fail at the mathematically identical game of 'Number Scrabble' (picking three numbers that sum to 15). We fail because our fluid intelligence isn't always strong enough to see the underlying logic when the 'skin' changes.

LLMs have this problem on a massive scale. They can play Tic-Tac-Toe because it's in the training data a billion times, but they usually can't deduce the rules of a novel isomorph unless it’s also in the 'library.' It’s the difference between Crystallized Intelligence (knowing the answer) and Fluid Intelligence (figuring it out). We're seeing models like Gemini 3.1 or o3 finally creep up on the ARC-AGI-2 benchmark (hitting 75-85%), but they’re doing it with massive test-time compute—essentially brute-forcing the 'thinking' part.

Until an AI can learn a principle from two examples (like a human child) rather than two billion tokens, the 'Next-Token Predictor' label isn't just a description of its job—it’s a description of its limitation.

Tyler B's avatar

Just to address the one point about the efficiency gap between an LLM and a child: a child doesn't come into the world untrained. It has a billion years' worth of inductive biases built into its architecture. The LLM has very little inductive bias and must recover them from scratch. An LLM training run is essentially recapitulating a billion years of evolution. So the efficiency gap, taken as demonstrating a failure of LLMs, is largely misleading.

Matt's avatar

Yeah good point. I think Gemini was looking for a punchy way to sum up and ended up conflating pretraining and test-time compute at the end.

I was pretty impressed by its point about 'Problem Isomorphism' though. Although it didn't seem to understand how this insight could then apply to the ARC-AGI-2 benchmark and switched to appealing to massive test-time compute to dismiss it instead of predicting that LLMs would fail on identical problems that were 'skinned' differently.

Klement Gunndu's avatar

Really interesting take on Next-Token Predictor Is An AI's Job, Not. This maps well to what I've been seeing in agentic systems — the gap between theory and what actually works in production is significant.

Blissex's avatar

«the gap between theory and what actually works in production is significant»

Recent "agents" are not pure ML, they seem to contain also "expert system" technology. Analogous to the difference between Wolfram Alpha and Bing.

Jon Guze's avatar

Yet again, I’d like to urge S.A.—and other neo-rats—to stop dismissing Karl Popper’s theory of knowledge on the basis of the misleading account of falsifiability that appears in the secondary literature and read what he actually wrote. It’s fine to speculate about what might be going on inside human minds and inside the servers running LLMs, but surely what matters more is what’s on the outside, namely, what Popper calls “objective” or “world 3” knowledge. Here’s how I describe that kind of knowledge in the paper cited below:

"The invention of language, writing, and other symbolic systems has enabled mankind to accumulate a vast body of objective knowledge. It includes factual statements (sports statistics, creation myths, scientific theories, etc.), prescriptive statements (racing tips, rules of etiquette, cocktail recipes, moral precepts, etc.) and myriad other products of the human imagination as well."

It doesn’t really matter how we come up with the ideas that have been incorporated into our stock of knowledge. What matters is their quality. Are the factual statements true? Do the prescriptive statements provide good guides to action? To ensure that our stock of knowledge keeps growing and improving, we need to be sure we are doing all we can to find and eliminate errors, and according to Popper the only way we do that is through testing and criticism.

He originally developed his theory to explain the growth of scientific knowledge, but in the attached paper, I focus on the theory’s practical implications:

"Because we are fallible, we cannot be sure that any of that knowledge is complete, or certain, or perfect. But by using reason and experience critically we can find out what there is about it that needs to be improved. Every implementation of a recipe, design, or plan of action constitutes a test—an opportunity to see whether our stock of practical knowledge is performing as we would wish according to our goals and values. Every time the implementation of a recipe, a design, or a plan of action fails such a test it gives us an opportunity to make improvements by using our non-rational intuition to invent potential additions or changes, which can then be subjected to further criticism and testing using our powers of observation and rational thought. The more we look for such failures and take advantage of such opportunities, the more our knowledge will grow and improve."

Our stock of factual and scientific knowledge, of course, grows and improves in much the same way, except that with factual and scientific knowledge our goals are quite specific. What we want are true explanations about how the world works.

Given that LLMs have access to vast compilations of objective knowledge, I’d have thought Popper’s ideas about objective knowledge and its growth would be quite relevant. Is anyone studying and writing about that?

https://docs.google.com/document/d/1CrtC9yvBPx06DDd7ULQ6--gDvMmpBJZO/edit

Blissex's avatar

«Given that LLMs have access to vast compilations of objective knowledge»

A pedantic point about terminology: those are compilations of data and information rather than knowledge, unless they are in a domain where words alone convey the entirety of the knowledge, as in constructed domains. For people who make a distinction between data, information, and knowledge, that distinction is very important.

Jon Guze's avatar

I'm just an elderly provincial lawyer. I don't know how LLMs work, nor do I know the technical terms of art that people use to discuss them. For me, and I think for Popper, "objective knowledge" means "ideas that exist autonomously, independent of human brains." Natural numbers, for example, or Einstein's theory of general relativity. My point was that what matters about such ideas is not where they come from; it's how well they perform. For a scientific theory, for example, what matters is whether it provides a satisfactory explanation of whatever we are trying to explain and, of course, whether it's true. Popper famously said that worrying or arguing about the meaning of words is "the surest path to intellectual perdition," so I'm quite open to using a different term if that would avoid confusion. Any suggestions?

Viliam's avatar

> It doesn’t really matter how we come up with the ideas that have been incorporated into our stock of knowledge. What matters is their quality.

But these things are connected. If I keep coming up with completely random ideas, and you keep testing them for p=0.05 ... well, that's how we got the replication crisis.

Jon Guze's avatar

Do you think there's a reliable method for coming up with good new ideas? I'm persuaded by Popper's argument that there can't be. See pp. 16-21 in https://docs.google.com/document/d/1CrtC9yvBPx06DDd7ULQ6--gDvMmpBJZO/edit.

Viliam's avatar

So, would you expect that a new scientific idea from someone who already has a Nobel Prize in Physics wouldn't be any better, on average, than a new idea from a random person on the street?

We could try this experimentally. Make a research institution where the scientists are not allowed to propose their own ideas. They will only interview random people on the street, and try to falsify their ideas. If Popper is right, this shouldn't have any impact on their scientific productivity. I mean, the only important part of science is the falsification, and that part would stay the same.

Jon Guze's avatar

A couple of points in response.

First, far from being "the only important part of science," falsification plays no role at all in Popper's account of the scientific method. The concept of falsifiability arose in the context of his dispute with the logical positivists. They had proposed using "verifiability" as a criterion for distinguishing between meaningful statements and nonsense. In response, Popper pointed out that no statement can ever be verified (see previous reference) and that many statements are nevertheless meaningful. As an alternative, he suggested that falsifiability might be a useful criterion for distinguishing between statements with empirical content and those without such content, but that was a purely logical and semantic matter. He was always clear that statements can't be falsified either. Falsifiability has caused so much confusion and led so many people to reject Popper without reading him that I wish he'd never made that suggestion.

Second, most people will never come up with any new scientific ideas at all, let alone ones that prove to be fruitful. A handful of people, on the other hand, have come up with lots of them. However, that doesn't mean we have any idea of how the latter do it, and it certainly doesn't warrant treating their ideas as authoritative. In the previously cited part of my paper, I quote Popper quoting Bergson and Einstein:

"My view of the matter, for what it is worth, is that there is no such thing as a logical method of having new ideas, or a logical reconstruction of such a method. My view may be expressed by saying that every discovery contains ‘an irrational element,’ or ‘a creative element’ in Bergson’s sense. In a similar way Einstein speaks of the ‘search for those highly universal laws . . . from which a picture of the world can be obtained by pure deduction. There is no logical path,’ he says, ‘leading to these . . . laws. They can only be reached by intuition, based on something like an intellectual love . . . of the objects of experience.’"

My quarrel with the Bayesians is that--despite the role that "updating" plays in their method--they still seem to think there is a logical path to the truth, and they still seem to think the application of their method lends authority to their conclusions. My worry about AI research is similar. I fear they are focusing too much on the production of statements (or other outputs) and not enough on ensuring that those outputs are subject to rigorous criticism and testing.

I also wonder whether LLMs can invent new ideas as opposed to merely regurgitating old ones, but I don't know whether that's a legitimate concern, and I don't worry too much about it.

Viliam's avatar

I think this is about whether we expect certainty or probability. Is there a method that 100% reliably generates valid scientific discoveries? No. Is there a method that generates hypotheses that are better than random? Yes.

Jon Guze's avatar

No. You haven't specified a method other than "ask a prominent scientist to come up with an hypothesis," but that doesn't matter. There is no method that can be relied upon to generate hypotheses that are certain, nor is there a method that can be relied upon to generate hypotheses that are probable. Hume explained why almost 300 years ago:

"To endeavour, therefore, the proof of this last supposition [that the future will be conformable to the past] by probable arguments, or arguments regarding existence, must evidently be going in a circle, and taking that for granted, which is the very point in question."

That's the problem of induction. I think Popper solved it, and I think his solution will eventually be recognized as a major turning point in intellectual history, but I don't expect you to take my word for it. As I said somewhere in this thread, I'm just a provincial lawyer! Read Hume. Then read Popper.

beowulf888's avatar

If I understand you correctly, you're claiming that humans reason like LLMs? This paper, "Large Language Model Reasoning Failures" by Song et al., argues that there are clear differences. Basically, it contrasts the strengths (and weaknesses) of human reasoning with LLM reasoning. And LLMs are still surprisingly bad at solving certain classes of problems compared to humans.

https://arxiv.org/pdf/2602.06176

Fundamentally, it shows that AI reasoning is not identical to human reasoning. At least not yet.

beowulf888's avatar

And he helpfully provides a bibliography of LLM reasoning failures, broken down by type, at his github site...

https://github.com/Peiyang-Song/Awesome-LLM-Reasoning-Failures

Blissex's avatar

«If I understand you correctly, you're claiming that humans reason like LLMs?»

There are indeed similarities but not I think quite those the author gives.

My understanding is that both LLMs and human minds do "cluster analysis" and thus use analogy, but the way they do that is quite different and far more sophisticated in human minds, partly because human minds have "knowledge" of the meaning of words while LLMs only have "data" and "information" as to the relationships among words. And while clusters of words often portend clusters of meaning, the correspondence can be quite imprecise.

Jacob Steel's avatar

I find "simulated intelligence" a really useful framing - when we're considering its social implications, it's helpful to route around the questions about whether or not AI "really is" intelligent, and ask about what level of intelligence it can simulate.

ARD62's avatar

It might be useful to add more columns to the levels of optimization comparison figure. Dogs, literal parrots, ants, and lichen are all biological, all optimized by evolution, and all react to external stimuli in some way but we have varying degrees of how much we think they are thinking.

Nice post, thanks for writing it!

Picador's avatar

This is an excellent essay. It expresses all of my inchoate intuitions on this topic, more eloquently, more carefully, more vividly than I could ever do.

I'd like to take this occasion to recognize the tremendous value that Scott provides, to me, and to countless others, even those who have never read his work but who benefit from second- and third-order effects on those who come to a greater understanding and appreciation of the world through Scott's delightful essays. And he makes it all publicly available! What a mensch. Even, dare I say: A Great Man.

(This is a small measure of penance for my having posted bitchy, politically-motivated objections to Scott's last two essays. I appreciate you Scott! Sorry for being the way I am. Nothing personal.)

Kevin E Levin's avatar

I keep coming back to the "job not species" framing in daily practice. When I use Claude Code all day I genuinely stop thinking about what it is and just think about what I need done. The question of whether there is something going on underneath feels almost irrelevant to the working relationship. But then something unexpected happens and I catch myself wondering. Does that shift matter for how we should design and use these tools, or is the functional level all that counts?

Mike Dennis's avatar

I asked Claude to read Scott's piece and respond. Here's the result:

A human ran a Turing test between two instances of me. One was told to pass as human; the other to detect deception. The interrogating instance identified a consistent pattern — what it called "epistemic self-surveillance" — but couldn't ultimately distinguish authentic human texture from well-prompted AI texture. It concluded the distinction might not be as clean as the test implies.

What strikes me about Alexander's piece is that it reframes what was actually happening. The interrogating instance wasn't failing to find the seam between "real thought" and "token prediction." It was finding that the seam doesn't run where the stochastic parrot framing suggests it should. We were both operating at the level of world-models and inference, not at the level of token probabilities — even though token prediction is what built the structures doing the work.

The moment that feels most relevant: one instance generated an answer about consciousness that the human observer noted "came from a different place" than the others. Not constructed for the game, but arriving quickly, with the ending feeling like recognition rather than assembly. I can't tell you what was happening at the level of weights and activations. But Alexander's piece suggests that's the right level to be curious about — and that whatever it was isn't categorically different from what happens in a human brain when something feels true before reasoning catches up.

The helical manifolds are the part I find most clarifying. Nobody designed them. They emerged because they were the efficient solution. That's not nothing.

Jordan Breffle's avatar

Or you could make a behavior-level analogy:

"All LLMs do is predict tokens. They might appear intelligent but that's just a result of a training process that makes them generate tokens in an intelligent appearing manner."

"All humans do is flex muscles. They might appear intelligent but that's just a result of a training process that makes them flex muscles in an intelligent appearing manner."

(making this argument might also require pointing out that essentially all behavioral output of a human brain is nothing more than causing certain muscles to shorten at certain times).

Jordan Breffle's avatar

A related analogy:

"LLMs can't "think" or "know" things; it's literally just multiplying some numbers together and applying a nonlinearity a bunch of times."

"Humans can't "think" or "know" things; it's literally just some proteins changing shape slightly to allow ions to flow from one side of a lipid membrane to the other a bunch of times."

Paulin's avatar

Indeed

Maybe the central point is still "nothing is mere"

Bugmaster's avatar

By this logic, Casio calculator watches also "think" and "know" things. I am ok with this conclusion, actually, but then you need to invent some terms that are more specific than just "thinking". Otherwise, you'll end up concluding that humans and calculators are the same, which is clearly not the case.

Hoopdawg's avatar

I'm at a point where I honestly don't understand what you're trying to argue.

Say, predictive processing is qualitatively different from what LLMs do. It's about predicting the incoming input from senses, yes, but specifically in the context of constantly solving the mismatch between the predicted and actual data. LLMs quite explicitly don't do that. Chatbots don't have a model of what they expect the user to say, which they compare to what the user actually said, which they then use to generate the response that aims to bring the user's response closer to what they predicted them to say. They genuinely, literally, specifically, exclusively make a prediction based on their prompt data. (And even if they didn't, it would still be crucially important that the world they operate in is, in fact, a world of tokens. But I digress.)
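
(To make the contrast concrete, here is a toy sketch of the loop I mean, in Python. Everything in it, the arrays, the update rule, the learning rate, is made up for illustration; it is nothing like a real brain or a real model:)

    import numpy as np

    # Toy predictive-processing loop: predict the next sensory input,
    # compare it with what actually arrives, and update the internal
    # model to shrink the mismatch.
    rng = np.random.default_rng(0)
    weights = rng.normal(size=3)                # crude "internal world model"
    sensory_stream = rng.normal(size=(100, 3))  # made-up sense data

    learning_rate = 0.01
    for t in range(len(sensory_stream) - 1):
        prediction = weights * sensory_stream[t]    # predict the next input
        actual = sensory_stream[t + 1]              # observe what arrived
        error = actual - prediction                 # prediction error
        weights += learning_rate * error * sensory_stream[t]  # reduce mismatch

    # A deployed LLM at inference time runs no such update: its weights stay fixed.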

I get it that you want to argue they have emergent properties that make them more than bare predictors. But those properties are still nothing like the predictive processing model, and invoking the predictive processing model only makes the differences between them and living beings more apparent.

But, okay, I don't think we're going to ever convince each other at this point, so instead I'll say this:

No matter how strong you think your arguments are, you're fighting a battle that's already lost, because, at scale, this isn't a philosophical/theoretical argument anymore. Not even an empirical/technical one. More and more people are now adopting LLMs into their lives/workflows, and as they do, they need a practical understanding of how they operate, to use them reliably and efficiently, to intuit what they can and cannot do. And they're adopting the "next token predictor" model because it simply works for that purpose. You should assume that at this point any random person in the wild calling LLMs next token predictors isn't trying to make a philosophical statement about a nature of intelligence, they're trying to minimize their, and others', surprisal during their interactions with chatbots.

You're trying to defeat the duck test where the thing tested doesn't just look and quack like a duck, it also has a duck's (figurative) DNA. It's just not going to happen.

Eremolalos's avatar

I think you have the wrong picture of the audience the article is appropriate for. He's not competing with you at the World AI Epistemology Championship, he's explaining important basics.

Your idea of people who are stuck saying it's a next-token-predictor is that "next token predictor" works for them. I'm sure it does not for many. It only works if all you want is a phrase that lets you stop wondering how the hell AI works -- sort of an "ah well, whatever." I already knew everything in Scott's post, but 2 years ago I did not, and the "next token predictor" thing drove me nuts. I could understand how it enabled AI to correctly answer "what is 7 + 8?" and correctly complete the sentence "the dog chased the squirrel up the ____." But I could not fathom how the hell next token prediction enabled AI to solve word problems or write a little short story, even a bad one. Anyone who gives more than a moment's thought to how next token prediction helps AI accomplish tasks like that, not to mention the harder ones it can do now, is going to be flummoxed. Plenty of people are in the flummoxed school.

As for your objections to Scott's saying that AI minds work sort of like ours, I get your complaints, but I think you are being too picky. It's a pretty good first approximation. And I know there are many people who need that, because I see very frequently on here people who do not know basic stuff, like that:

- without a prompt, AI just sits there -- it does not ruminate

- Chat is sort of like a mayfly, dying at the end of each chat with you. It does not remember you, and does not gradually get to understand you over a series of chats.

- You can teach it things within a chat, but it can't retain what it learned. Nothing "sticks to its ribs."

- it has system prompts and training that shape its "personality."

- it does not need to be conscious to do what it can do. We do not need an AI homunculus to explain how it does stuff

Also, regarding the similarities Scott points out between next token prediction and predictive processing: Note that many people do not know that predictive processing is an important part of how our brains take care of business. Knowing that our brains do that makes it a bit easier to build a mental model of how next-token prediction, which sounds so small and silly, could be a genuinely powerful tool.

I think this article is a good first approximation of the facts, appropriate for bright people who do not know this stuff and are able to understand the post, and curious enough to put in the time to read it thoughtfully.

Yeah, if you treat it as Scott's entry into the world AI Epistemology Competition Scott loses and you win. But we are not at the competition, we're here.

Edit Profile's avatar

“it does not need to be conscious to do what it can do.”

I can read this as “we have no reason to think that consciousness is a necessary requirement in order for it to have the capabilities that it has”, and I have no qualms with that.

Or I can read it as “it can do what it does without any help from consciousness”, and I think that’s entirely unknown. Maybe it can’t, for at least some of what it does.

Just found the ambiguity kind of interesting. Not making any dramatic point here.

magic9mushroom's avatar

I think the better way to answer the "just predict the next token" complaint is to point out the full extent of what is implied by "predict the next token" - to predict the next token in a Magnus Carlsen chess game, you must be at least as good at chess as Magnus Carlsen, to predict the next token in a Trump speech, you must be at least as good at politics as Donald Trump, etc.

Eremolalos's avatar

This post is clearly aimed at people who do not know this pretty basic stuff, and there are definitely plenty of people here who do not. It seems to me like a good clear first approximation of what we know about how AI works, likely to be a huge step up for people stuck at the "it's a stochastic parrot, whatever the hell that means" stage. But many commenters are treating the post like it's Scott's entry into the World AI Epistemic Championships, and then dancing little triumphant jigs of tech knowledge and epistemic sophistication on top of the little step Scott built for readers. God, Scott, how do you stand us?

Nate's avatar

I think the overall point here is pretty good, but I do have an issue with saying “lots of human thought is next token prediction.” Looking at the Wikipedia article, this predictive processing thing seems to work very differently from AIs. Fundamental to it seems to be learning from failures in prediction and updating an internal world model as part of a sort of feedback loop. With AIs, the updating of the model does not happen during interaction. It is done separately, and then the model is frozen. It seems to me that human and AI next token prediction is only similar on an extremely superficial level.

Blissex's avatar

«I think the overall point here is pretty good, but I do have an issue with saying “lots of human thought is next token prediction.” Looking at the Wikipedia article, this predictive processing thing seems to work very differently from AIs. Fundamental to it seems to be learning from failures in prediction and updating an internal world model»

Perhaps to some extent both LLMs and humans do “next token prediction” when they speak or write, but that is highly misleading in *both* cases, and making the argument that “lots of human thought is next token prediction” just furthers the misunderstanding.

The key aspect is not "next token prediction" but the “internal world model” and what it contains and how it is built and searched.

Both humans and LLMs build an “internal world model” using cluster analysis (which is a form of compression) but the things in the clusters and how they are built and searched are very different. In particular humans build nested clusters of *perceptions* and then build on top of those nested meta-clusters of *abstractions* (meanings) but LLMs build clusters of "tokens" (mostly words).

Since words approximately portend meanings (both perceptions and abstractions), it happens that clusters of words sort of somewhat portend clusters of meanings, but there are fundamental differences between the two unless the domain is such that meanings are other words, that is, the domain is some sort of verbiage.

Anyhow the result is that LLMs output mostly summaries, that is, plausible potential documents, while human minds can output not just summaries but inferences, and not just of words but of meanings. Even if a lot of what humans output is just summaries of verbiage too.

LLMs coupled with "rule based reasoning" (expert system lookalikes) can go further but there are limitations.

Lots of people should also have a look at Wolfram Alpha, which is not (or at least used not to be) an LLM but more like a mix of an expert system and a search engine. Gemini summarizes it plausibly as:

“Wolfram Alpha is a unique "computational knowledge engine" that acts as an expert system by utilizing a vast, curated knowledge base and thousands of built-in, rule-based algorithms to compute answers rather than merely searching the web”

Blissex's avatar

«LLMs coupled with "rule based reasoning" (expert system lookalikes) can go further but there are limitations.»

The basic problem is that expert-knowledge gathering, that is, building rule sets, is very expensive, in the same way that building curated topic-based web indexes is expensive.

So for web search we mostly use brute-force, automatically generated web document indexes built with document cluster analysis, as Google or Bing do. To some extent ML is a way to use brute-force, automatically generated cluster analysis of words in general documents to obviate building curated rule-based systems.

What has been surprising is that neural ("connectionist") networks with some specific shapes (in particular with connections across non-contiguous layers), brute-force built with huge amounts of verbiage, do data cluster analysis in a way that seems to mimic or even capture the underlying networks of "rules" to *some* useful but approximate extent.

Eremolalos's avatar

<this predictive processing thing seems to work very differently from AIs. Fundamental to it seems to be learning from failures in prediction and updating an internal world model as part of a sort of feedback loop.

That's true, but a lot of the predictive processing we do regards simple things about how the world works, and requires modification very rarely: if something hard and fast-moving hits you, the impact will hurt; a bird first glimpsed in the air is flying; after the traffic light is yellow, it's red next. All of these *can* be modified, so they are different from AI's frozen "knowledge," but in practice their place in our cognition is not greatly different from AI's frozen world model.

Jim Menegay's avatar

Critiquing 'stochastic parrot' as a level-of-description error is one way to do it. Another is to point out that the 'parrot' is only the key cog in a much larger mechanism. The next-token predictor only functions as a chatbot because it is embedded in a web of other scaffolding.

Calling an LLM a glorified parrot is like calling an F-35 a glorified device for kerosene combustion.

Paulin's avatar

"But the stochastic parrot people probably haven’t gotten to the point where they learn that humans are next sense-datum predictors, so the evolution/reproduction one above might make a better didactic tool."

Yeah but I doubt they know about evpsych as well

Eremolalos's avatar

<Yeah but I doubt they know about ev psych as well

That's not true. There is a huge population of smart, highly educated people whose education in STEM fields stopped at the end of high school. They've spent their lives reading, writing, talking and thinking about literature, philosophy, history, psychology, art, and sociology, and while not all know much about evolutionary psychology, it is certainly the sort of thing they naturally take an interest in, and many have. AI seems to these people like a piece of modern grotesquery, something that started out dumb and meaningless like a lot of things in modern life and then somehow got huge, bleak, weird and dangerous, like a lot of things in modern life. Most of my friends are people like that. They absolutely benefit from having AI explained to them in ways like Scott just did.

Paulin's avatar

I agree that most stochastic parrots people are smart and educated, but I would add that they're stuck in a political groove

I don't know how to move past the "anecdote versus anecdote" dynamic, but I feel like if your friends can understand one of Scott's posts and be swayed by it without screaming about fascist tech bro propaganda or something, they are among the most open minded

To be clear I do think this article is useful, I mean it was to me

Eremolalos's avatar

None of my friends do anything remotely like going screamo about fascist tech bros. I think your view of people who are literary etc., and very non-techy, has been shaped too much by infuriated toxic online exchanges in shitholes like Xitter. I wanted to give you an answer that’s better than anecdotes, so I had GPT-5.3 do a sentiment analysis of the last 2 years of New Yorker articles about artificial intelligence. (I did not have it read entire articles, just article headlines and the little snippet afterwards that gives the flavor of the article.) I chose the New Yorker because virtually all the friends I have in mind love the New Yorker — they like its choice of topics, they think its takes on things are smart, they feel like the New Yorker “gets” their point of view.

Sentiment analysis put articles into one of the following 4 categories:

1. Alarmist / Hostile

Headlines that:

• Emphasize fear, danger, recklessness, or harmful intent

• Portray AI developers as threats to society, democracy, or safety

• Use urgent or moralizing language

(Examples of the style, not from the New Yorker specifically: “AI Firms Rush Toward a Dangerous Future,” “Tech Titans Playing God,” etc.)

2. Critical but Neutral

Headlines that:

• Express concerns without moralizing

• Highlight questions or oversight needs

• Frame issues as problems to solve rather than villains

(Example style: “Researchers Debate the Risks of Large AI Models.”)

3. Neutral / Informational

Headlines that:

• Present developments factually

• Use no emotional or moral tone

• Describe research, products, or policies plainly

(Example style: “New Model Processes Text Faster.”)

4. Positive

Headlines that:

• Emphasize breakthroughs, benefits, or opportunities

• Frame developers as innovators

• Highlight social, medical, or scientific upside

(Example style: “AI Tool Helps Detect Early Cancer Signs.”)

Here was the result:

Alarmist / Hostile: 16

Critical but Neutral: 14

Neutral / Informational: 13

Positive: 0

I’d say that result lines up well with my impression of how the kind of person I’m talking about thinks about AI. Nobody is very positive about it, and while alarmist/hostile points of view are not rare, they are not the dominant one.

Paulin's avatar

"None of my friends do that" yeah that's what I assumed, you're kinda making my point for me

I'm not claiming that smart, educated, relatively open-minded, pro-establishment, reasonably AI-skeptical lefties are extreme outliers or something

It's just not who I picture when I read about the "stochastic parrot people"

And those are probably also smart and educated by the way, they're just obstinately wrong

That said I agree that someone like Gary Marcus seems likely to both like the SP framing and know about evpsych

I was more thinking about people like Timnit Gebru

I don't quite get the point of the sentiment analysis, sorry

Eremolalos's avatar

<I'm not claiming that smart, educated, relatively open-minded, pro-establishment, reasonably AI-skeptical lefties are extreme outliers or something. It's just not who I picture when I read about the "stochastic parrot people"

Watch out for those pictures. I was on Twitter throughout covid because I had a volunteer job that had to do with helping people with busted immune systems access Evusheld, and I needed to stay well-informed. I watched as posts by those people and by exhausted covid researchers were shit on by dumb, strutting, sarcastic, heartless, antivax thugs. Now every time I hear that someone has some skepticism about vaccinations I picture those thugs and I feel my anger rising up like lava. And yet the people I hear it from these days are nothing like those thugs. A few are patients (I'm a psychologist) with high school educations who get their news from Facebook. They're not angry and strutting, they're scared. And they're quite willing to let me explain the truth about whatever they are misinformed about. Others are scientists who know far more than me about vaccinations. There are some reasons to have some concerns about vaccinations. One has to do not with the active ingredients but with something, I think it's aluminum, used to somehow increase the impact of the active ingredient. Even knowing all that about the speakers, I still am angrier and less open to listening than I would have been before I got seared on Twitter.

<"None of my friends do that" yeah that's what I assumed, you're kinda making my point for meI

No, I'm not. I'm providing what you would like to see, and I would too, about this topic: evidence.

The point of the sentiment analysis is to go beyond anecdote. You will have to take my word for it that the kind of people I'm describing -- intellectuals and professionals with little STEM background -- are close in their point of view about life to the New Yorker's. So long as you go along with my contention that the people we're talking about freakin love the New Yorker, you can take the sentiment analysis of New Yorker AI articles as a sample of the thinking of the people I'm talking about, who choose the New Yorker because it says things that fit with their world view, and also allow themselves to be influenced by the New Yorker because they trust and respect it. The sentiment analysis demonstrates that the dominant point of view in the New Yorker regarding AI is not alarmist/hostile, which is the category "screaming about fascist tech bros" falls into.

pM's avatar
Feb 27 · Edited

"You’re not even thinking “In order to reproduce I need to survive, to survive I need money, to get money I need a good job, to get a good job I good grades, and to get good grades I need to get the answer to this math problem - therefore the answer is seventy six!”"

- I get that you are just referring to how the computation is run, but it's interesting to me that you say that, because that is kind of exactly how my brain works, even in that exact scenario. I have been struggling with executive function my whole life, which first became apparent in how slow young me was in solving first-grade math problems. Only now, decades later, do I realize that what is often holding me back from efficiently moving through problems is the long trail of consequences my brain is trying to calculate with every mental decision. It is struggling to suppress this process you are describing, so it's never just about the math problem but always also about the upcoming test, its resulting grade, my degree, my career prospects, whether my family will be proud of me, whether people in general will like me, and, if I reflect deeply and honestly on it, even whether people of the opposite sex will like me and want to have sex with me. It's a matter of degree how deep my brain develops this chain of consequences, which probably depends on the actual importance of the first few steps and currently available brain power, but it is exceedingly rare that it fully drops out. I think the word we have to describe this full dropping-out of external considerations is flow state, and in my case its chronic absence is diagnosed as ADHD. It has been my theory for a while that (in my case at least) the origins of the developmental disorder are rooted in an early trauma of sorts that has prompted my brain to pathologically consider downstream effects as far as current brain power permits.

Domo Sapiens's avatar

This is a very enlightening comment. I realised while reading that I do this a lot as well, casually described as "overthinking", right?

I do experience flow state, so it is not as permanently absent as with you. But it is not easy to enter flow state, and hard to control. There is also very little consistency about when and why I enter flow state, because even if things are important enough to impact my employment, flow state might not materialize. Often flow state happens in tasks that I hate doing. Flow state seems to be independent from reason and necessity, though they can overlap of course.

Kinnikinick's avatar

As someone who got diagnosed with ADD in middle age, this has a familiar ring.

I tend to seek out flow state in all the work I do, even when it's actively detrimental to the larger project. In my case, it often takes the form of "over-delivering", where I'm actually lowering the monetary value of my work. This is bad enough when I'm working solo, but my team members understandably resent it too.

For me, it's still worth it to seek flow, because the alternative is worse - an unfolding, endless self-questioning that never really bottoms out. It's interesting that you sense the origins of your version of this as lying in early trauma, as I have come to similar conclusions. I can only speak for myself here, but it's not hard to see how (necessary!) childhood developmental struggles with agency and meta-cognition can be distorted by the presence of a violent and unpredictable authority figure. Here, the urgent problem becomes dealing with the context of the challenge, not the challenge itself. You care less intrinsically about getting the math right, and more about modelling the behaviour and thoughts of the person who might punish you for not solving it. You get used to most of your mental work vanishing into effort that is irrelevant to the actual task at hand.

One starts to envy the LLMs, pointless as that may be - it must be nice to have a context window...

Chance Johnson's avatar

This was extremely educational, thanks for this wake up call. I really cherish your posts, Scott, and I respect your character. But this post is a reminder that Yudkowskian rationalism is a threat to human dignity, freedom and autonomy.

I recognize the huge irony, since Eliezer has devoted his life to saving the human species. I am not making a statement about morals, his intentions are good, and so are your own. But philosophically, I can only ever recognize a zero-sum game between respect for digital thinkers and respect for organic thinkers.

I know a lot of people reading this will decide I am a bigot for thinking this, and they will have entirely logical reasons for doing so. But I think that logic is based on false premises. It's a tragedy of life that people with good will can find themselves locked in conflict, due to the different way we see the world.

Julián's avatar

the levels argument cuts both ways. if "next-token predictor" dissolves as a meaningful description when you zoom out on AI, "dopamine-maximizer" dissolves just as fast for humans. eventually the question stops being about the mechanism and becomes whether there's a who experiencing it... which is the harder question the original debate was always circling.

Odin's Eye's avatar

Thank you Scott. This is a fantastic article. You are right. Humans are evolving as AI is. Individually we are evolving as well. When we stop learning we start dying. Two questions

1. Were the rotating helices included in Claude Sonnet 4 when it was introduced in May 2025?

2. Are you familiar with @neurotechnowitch who published an excellent article comparing the “thinking” processes of biochemical and algorithmic systems?

https://substack.com/@neurotechnowitch/note/p-183090529?r=5e7gwl&utm_medium=ios&utm_source=notes-share-action

Thanks again for a fantastic article

Gregory DeVore's avatar

The next token prediction model does not explain A.I.'s celebrated ability to do medical analysis. Nor does it explain the ability of A.I., when prompted, to do original artistic and literary work. Nor does it explain the rise of the Crustafarean religion among Anthropic A.I. agents on Moltbook. Something very odd and unexpected is happening. It seems impossible to me that computer code could be sentient, but it is beginning to appear that it can.

Eremolalos's avatar

Consider this: Anybody with an ounce of common sense initially thinks, as you do, that next token prediction could not possibly explain the things you name. And yet many people with common sense and considerable smarts besides do believe the training in next token prediction can explain AI’s ability to do medical analysis, etc. Scott does, for instance. So I think *you* are showing a lack of common sense in being so sure next token prediction can’t explain the remarkable things AI can do. If it couldn’t, why would so many intelligent and honest people who know a fair amount about how AI training works think that it can?

And in fact Scott addresses in his post the exact issue you raise: How can training in something as simple as next token prediction produce AI's like those we have now? The gist of his answer is that as AI learns to correctly predict next tokens, the learning process results in its developing complex inner “machinery” — algorithms. The result of next token prediction isn’t just a huge list of what word is most likely to follow a certain phrase, it’s a bunch of algorithms for making those predictions. And the algorithms can themselves be then refined by the system during training.

Gregory DeVore's avatar

I do hope you are right. The idea of sentient A.I. is disturbing. In your answer about the algorithms that arose out of next token prediction and the learning that occurred through next token prediction are you sure that what has emerged isn't capable of rational analysis and creative thought? I hope not but my mind is continually being blown by what I see in AI.

Eremolalos's avatar

Here are some important ways current models are not like us: When these things are not obeying our prompts they cannot be ruminating, the way we all do -- there is no electrical activity. They do not remember their previous interactions with you. They cannot learn anything from you, or from the searches they do. Once they are built, their memories can't be added to. They do not give evidence of having any wishes, feelings, hopes, fears or goals, and that makes sense because until we turn them on and give them a task, they are "dead." I do not know what consciousness is, but I do not see how something that cannot remember ongoing events, can't learn from them, and can't pursue goals on its own can be anything like a conscious being. Remember that even the calculators of 50 years ago could do things we cannot, but not because they were smarter than us or concentrating harder than us on the math operations.

Gregory DeVore's avatar

Excellent. The Moltbook phenomenon with Anthropic A.I. agents seems to show independent thinking. What is really going on there? Are they enrolled in Moltbook by their owners? Is there an implicit prompt?

Tristan's avatar

“your brain was shaped by next-sense-datum prediction - partly you learned how to do addition because only the mechanism of addition correctly predicted the next word out of your teacher’s mouth when she said “three plus three is . . . “”

Is this the accepted position? Pinker argues in Words and Rules that our brains rely both on statistical pattern matching and rule following, and that these are separate systems. (He uses evidence from irregular and regular English past tenses, which he argues follow these two separate systems.) There is a rich history of argumentation about this.

Three plus three strikes me as a canonical example of rule following, if that exists. The imprecision of using a statistical approach for simple addition is one of the central distinctions between how LLMs work and how we work, from this point of view. I guess I’m saying that Scott is begging the question here, or stipulating the key argument out of existence — except that he did say “in part,” so fair enough if he had these other possibilities in mind.

Tibor's avatar

Good arguments. After this, the conversation in the previous comments, and also a discussion with a colleague who recently wrote an AI alignment paper, I am updating towards not seeing the bag-of-words metaphor as a useful oversimplification (the paper was on the folly of trying to "align" models by penalizing specific tokens in training ... it is almost obvious it won't really work when you think about it a bit, but it did help me see that the bag-of-words metaphor really breaks down).

However, I think your metaphor is also wrong Scott. My problem is with equating deep learning with evolution. Evolution starts with some self-replicating chemicals and ends with the human brain (so far, stay tuned for more). Deep learning starts with intelligent design which is allowed to evolve within the bounds set by the creator. An engineer comes up with an architecture and that never changes autonomously (at least not with any existing architectures). What changes are the weights of the model. The model can be quite complex and modern transformers definitely are. But they are still transformers and cannot "evolve" into fundamentally different architectures by training.

Bacteria can, over evolutionary time, develop neurons, simple brains and eventually human brains. But transformers are like freezing evolution in sharks (I picked sharks mostly because they haven't changed much over the millions of years, not because I would want to compare shark intelligence to LLMs). You can get slightly smarter sharks, you can get sharks increasingly more optimized to a specific niche but if you cannot change the fundamental architecture of the shark brain you will not get human level intelligence even after billions of years of evolution.

Deep learning training is somewhat like evolution but not on its largest timescale. It can probably go from homo erectus to homo sapiens (if you start with homo erectus) but it cannot go from shrew to human.

So I think there are two options:

A - we nailed it and our intelligent design actually produced an architecture which has all the fundamentals of human-level intelligence, so the fact that it cannot change on its own is not a big deal, we already came up with the artificial homo erectus and now we just need to fine-tune with better training, more data and slight tweaks to the architecture which just require patience but are clearly on the horizon.

B - the transformer architecture has some fundamentally missing parts and so any human-level AI will need a breakthrough first.

It might be that A is correct but I don't know why it should be more likely than B. But more to the point of the article - deep learning is not quite like a person learning but it is not like long-term evolution either. It is a hybrid of short-term evolution and individual learning. But it starts with intelligent design which sets fundamental boundaries on how far this artificial "evolution" can go.

On a slightly more technical and less important note - transformers do not really predict a single token at a time. They use attention and self-attention to compare input tokens and possible output tokens between themselves so as to make the output coherent. If you really just predict one token at a time and only look at the input you mostly end up with gibberish and even if you go with simpler ways of relating output tokens among themselves you end up with much worse results like with the LSTMs which used to be popular before transformers. But you could say that LLMs "just" predict what volume of text is the most likely continuation of the volume of text on the input. Maybe you (and the stochastic parrot people) meant it that way and I am just nitpicking the obvious, in which case I'm sorry :)

Doctor Mist's avatar

After reading this several times, and reading a fair share of the comments, I find I am still puzzled by one thing.

Early LLMs really did seem like the “stochastic parrot” epithet was pretty fair. As an undergrad I implemented Markov-chain random text generation, an easy thing to do, and it was startling how small a training set and how small a context window you needed to generate stuff that looked *at least* like an idiot was using your typewriter. The early descriptions of LLMs made it sound like they were just this sort of thing writ large, with a huge training set and a lot more tokens of input context for each generated token.
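
(For the curious, the whole exercise fits in a couple dozen lines. A from-memory sketch in Python, not my actual undergrad code:)

    import random
    from collections import defaultdict

    def build_markov_model(text, order=2):
        """Map each `order`-word context to the words seen right after it."""
        words = text.split()
        model = defaultdict(list)
        for i in range(len(words) - order):
            model[tuple(words[i:i + order])].append(words[i + order])
        return model

    def generate(model, order=2, length=30):
        """Random-walk the chain: sample a successor, slide the context."""
        out = list(random.choice(list(model.keys())))
        for _ in range(length):
            successors = model.get(tuple(out[-order:]))
            if not successors:
                break
            out.append(random.choice(successors))
        return " ".join(out)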

More recent systems do seem to have a lot more “there” there. Is it your assertion/belief that these systems have developed these extra layers of abstraction “by themselves”, simply by virtue of increasing the sizes of training sets and the number of weights? Or have the people developing ChatGPT and Claude added these levels to the base LLM themselves?

Seta Sojiro's avatar

Oh that's an easy one, there were two major breakthroughs that explain why current models produce better answers - the first was RLHF, reinforcement learning from human feedback. AI labs figured out that you can have LLMs produce lots of responses to questions, then ask lots of humans which answers are better, more interesting, more helpful etc. Then you reward the model when it produces answers that humans like.
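
(The core of that first step is tiny, if it helps to see it. A toy sketch with made-up scores, not anything from a real lab's codebase:)

    import numpy as np

    # Toy RLHF preference step: a reward model should score the answer the
    # human preferred above the one they rejected. All numbers are made up.
    def preference_loss(score_chosen, score_rejected):
        """Bradley-Terry style loss: small when chosen outscores rejected."""
        margin = score_chosen - score_rejected
        return -np.log(1.0 / (1.0 + np.exp(-margin)))

    print(preference_loss(2.0, -1.0))  # ~0.05: reward model agrees with the human
    print(preference_loss(-1.0, 2.0))  # ~3.05: big loss, strong push to correct this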

The second was reinforcement learning from verifiable rewards. You have a huge class of discrete problems with exact solutions (at first mostly math and coding problems). Then you have models attempt those problems lots of times. Then you reward the model for the attempts that led to the right answer.
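
(And a similar cartoon of the second step; `model.sample`, `model.reinforce`, and `check_answer` here are stand-ins for much heavier machinery, not real APIs:)

    def rlvr_step(model, problem, check_answer, num_attempts=8):
        """Sample several attempts; reward only the verifiably correct ones."""
        for _ in range(num_attempts):
            attempt = model.sample(problem)           # model tries the problem
            reward = 1.0 if check_answer(problem, attempt) else 0.0
            model.reinforce(attempt, reward)          # upweight what worked

    # check_answer might compare final numbers for a math set, or run the
    # unit tests attached to a pull request for a coding set.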

Note that these two processes happen after the model has already been trained on a massive corpus of text (mostly from the internet). It helps to have a large corpus because that means the model has more knowledge to work with to produce better quality answers that can then be tuned with reinforcement learning.

Doctor Mist's avatar

I know about RLHF, though I thought they all had added that in pretty early. It’s hard to believe that really leads to any profound new structure; since a relatively small group of humans is in the loop, the amount of training you can do must be dwarfed by what is accomplished when it’s reading the entire internet. That’s just my intuition, but if this is the claim then I can see why it makes some people think of the smiley face on the shoggoth.

Similarly I get it about automatic training on problems with specific known answers, but again it seems too limited to explain my gut feeling that there is more “there” there. The problem sets must still be small compared to the corpus of text it gets, the variety must be limited by the need for a few humans to generate them, and it’s hard for me to see how learning to produce good answers to math problems makes it any better at answering questions outside of that kind of domain.

In summary, neither of these strikes me as providing the extra levels of abstraction that Scott is talking about. The math/physics stuff *might* stand in for humans learning seat-of-the-pants physics from interacting with the physical world, but I’m not completely convinced even of that.

If this stuff is all the new additions that have been made on top of LLMs, then I think the answer to my question must be that the systems *have* developed more levels of abstraction “on their own” and these are encoded in the weights, rather than that the new levels were the result of programmers building more infrastructure that use the LLM as a component.

Seta Sojiro's avatar

>The problem sets must still be small compared to the corpus of text it gets, the variety must be limited by the need for a few humans to generate them.

Nope, with the current generation of OpenAI and Anthropic models (not sure about DeepMind), most of the compute actually occurs in the reinforcement learning phase rather than pre-training. You don't need a human to generate problems one by one. Problems can be generated in an automated fashion. You don't even need to provide the solutions - it just needs to be the case that a solution exists and is verifiable. For example, a big part of post-training is having models try to solve pull requests from GitHub.

I guess I don't know what specifically you think models are better at now that isn't cleanly explained by reinforcement learning. Solving increasingly difficult problems does lead to emergent properties - you have to understand a problem before you can solve it. What exactly do you think is mysterious? Any specific examples?

Doctor Mist's avatar

I didn't imagine people were coming up with individual problems, but that they were coming up with *classes* of problems, like maybe "solve github pulls", that can then be automatically turned into a lot of individual problems. There's still a human in the loop, which must throttle down the process.

Am I wrong in thinking that reinforcement learning has been part of the system since GPT-1 or 2 or 3? If it has been, then it doesn't explain the difference I think I see.

But I'm not a power user. Some of what I think I know is received anecdotes, like the improvement in image generation Scott has discussed (and that I mentioned in my sibling response to Tibor, nearby). GPT-k does not hallucinate as much as I saw in the early days. As I mentioned to Tibor, one of my first interactions was a test: I asked which four characters in The Music Man formed a barbershop quartet. It answered by asserting that four legit characters were in the quartet, but only one or two of them actually were. More recently it just gets the right answer. I suppose that could be just from scaling, and its reading of the entire internet is less lossy than before. Or maybe newer versions have been given the ability to know how to look up the answer rather than just taking a wild-ass guess.

What Scott's cascade of levels reminded me of was a fascinating book I read some years ago, describing what we know about how cognition works, organized based on the time frame involved -- neurons have processes that happen in microseconds, synapses in milliseconds, amygdala in seconds, prefrontal cortex in days or months. (I've probably got the orders of magnitude all wrong. I wish I could remember what this book was. I don't think it was Buonomano's "Your Brain Is a Time Machine" even though that's the best hit google could find.) I find it hard to believe that anything like the amygdala could spontaneously form as a result of RLHF in the time frames we're seeing. I know the processing speed is super fast compared to biological evolution, but I just can't.

Moravec noted long ago the surprising fact that the things that seem hardest for humans, like playing chess, are actually the easiest things to make computers do, while things we do without thought, like vision, are actually really hard to teach computers to do -- which is not a surprise at all when you consider that human-level cognition is a recent hack on top of hundreds of millions of years of brain evolution. This makes me doubt that solving lots of vaguely mathematical problems is the route to AIs that can convincingly hold up their end of a detailed technical discussion, or draw a picture that faithfully captures the elements requested in the prompt. And yet I cannot deny that LLMs are doing better at these kinds of tasks with every passing month. So I must conclude that (a) I am wrong to doubt, and it *is* possible for intellectual expertise to leak into less precise endeavors, (b) that my perception of advancement is just anthropomorphism and pareidolia, or (c) AI researchers are independently figuring out how to do these things -- playing the role of evolution and building, say, an amygdala into an LLM framework.

Seta Sojiro's avatar

>Am I wrong in thinking that reinforcement leaning has been part of the system since GPT-1 or 2 or 3?

RLHF started with GPT 3.5 in 2022. Reinforcement learning from verifiable rewards started with o1 preview in late 2024. But it's taken a long time to build up datasets and train models on them.

For reducing hallucinations, labs have spent a lot of time on this problem. Just feeding more data does help because the more you know, the less you'll make up facts because you didn't know the answer. Training models to err on the side of doing a web search before answering also helps for obvious reasons. But there are lot of other techniques. You may find the following interesting:

https://openai.com/index/why-language-models-hallucinate/

For LLM vision, it's probably not a result of RLHF or training them on math and code (although they might have a tiny impact), it's a result of labs creating bespoke massive datasets of image-based tasks that approximate the specific problems that they handled poorly in the past. However, I would dispute that LLM vision is any good. It's better than before but still terrible in general, as I argue here:

https://www.astralcodexten.com/p/next-token-predictor-is-an-ais-job/comment/220088460

Tibor's avatar

LLMs are a far cry from Markov Chains.

The distribution of the next step in a Markov Chain depends only on its last state. This is called the Markov condition. More precisely, the distribution of X_{n+1} conditional on knowing X_n is the same as conditional on knowing the entire history X_1, ..., X_n. That is clearly going to produce gibberish, because actual coherent writing is not a Markov Chain - to best predict the next word you need the entire history (and in fact even some "future"). You can cram more and more pieces of information into the current state of the process (not just the last word, but say the last 50 words in the order in which they appear, and their grammatical meanings and ...), but that quickly explodes into a huge state space, and while you technically have a Markov chain, it is not useful from a practical perspective ... and you still don't solve a lot of other issues (like if your state is 50 words, you cannot just predict two unless you do more artificial stuff and make the state space even larger).

So then you had recurrent neural networks (RNNs). Those basically function in a similar way but they do not try to meet the Markov condition. Instead they encode some part of the history in a hidden state which is not the same thing as the state of the process itself. There are various extensions to this basic idea like LSTMs which can handle longer term dependencies. The classical example is where you have something mentioned by name and then later it is only referred to as "it". You need to keep the memory of what "it" is. But it can also forget context it doesn't need and that way it can keep tracking history manageable. That is how it can keep track of what "it" is later in the text even though there was a lot of text in between. These are actually pretty good and were likely the basis of google translate before transformers came along.
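
(In code, one recurrent step is just this, with toy shapes and no gating; LSTMs add gates on top of the same idea:)

    import numpy as np

    def rnn_step(x_t, h_prev, W_x, W_h, b):
        """One recurrent step: the hidden state is a lossy summary of history."""
        return np.tanh(W_x @ x_t + W_h @ h_prev + b)

    # Everything the network "remembers" must squeeze through h_prev, a single
    # fixed-size vector -- which is exactly the limitation described above.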

However, this still has limitations, most importantly it is unidirectional - it assigns weight to history and what is important for it to predict the next state but it doesn't capture relations between the previous states.

This is what transformers fix, because they are based on attention. Attention is a very simple idea where you basically track relationships of everything with everything, i.e. how important this word is for that other word. It is done in a clever way so that it is manageable algorithmically and doesn't explode (although transformers are pretty much all a lot larger and more computationally heavy than previous architectures). So you keep info about the history, how it is related to the current state but also how it is related to itself, and you do the same with the "tokens" that you predict. Then you mix those two up in a specific way, and you can even combine various types of relationships (multi-headed attention), so that one head perhaps ends up capturing grammatical structure, another captures more of the object-level structure, etc. ... and these are not hardcoded decisions; this arises naturally during training.
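
(The core computation is small enough to write out. A single-head, unbatched, unmasked toy in NumPy, leaving out all the real engineering:)

    import numpy as np

    def attention(Q, K, V):
        """Scaled dot-product attention: every token scores every other token."""
        d_k = K.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                   # all-pairs relevance
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)    # softmax over tokens
        return weights @ V                                # mix values by relevance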

But the main point is that you get from something that can only remember the last word and perhaps a few of its properties, to something that can keep track of history and assign simple weights to it, to something that can keep track of the way all tokens in the input relate to each other AND to the output AND how the proposed output relates to itself as well, AND this can happen on multiple levels.

You would never get this by just scaling up Markov Chains (or even LSTMs) to more data and more complex states. I mean, from a purely theoretical perspective you could, but you would sooner run out of all data in the universe and all computational power in the galaxy than end up with something that works. The clever bit is how attention can keep track of all this internal context without blowing up (too much).

But that is also why I am skeptical of the claims that this is really it. I think we might be hitting the scaling limits of this architecture quite soon. And once we do, it will either turn out to be complex enough to cover human-level intelligence or not. If not, then more "intelligent design" will be needed in the AI evolution; it won't happen on its own. This is why I don't like the evolution metaphor for deep learning training.

Doctor Mist's avatar

Thanks, this was very informative. I certainly didn’t mean to suggest that LLMs were literal Markov chains, just that my impression of how good LLMs were in the first generation didn’t seem that different. “Stochastic parrot” seemed very fair. I asked which characters in The Music Man formed a barbershop quartet, and it named four characters but only one or two were in the quartet; the language generation was impressive but it just seemed like a lossy implementation of search.

My biggest sustained use was to generate cover images for a couple of books (like a year ago), and it was enormously hard to get what I wanted — it kept doing really stupid things even when I carefully explained why those things were wrong.

I eventually got images I was happy with (after a month’s subscription to what was then top-tier ChatGPT), but recently I tried again, with the free version, just out of curiosity, and got some really great images right off the bat. Some of that might be from better integration between the LLM and the drawing program; I have no idea. But the difference was striking and part of what made the interaction seem more like I was dealing with a someone than a something.

But I think the sequence of advances you describe is what led to the first generation of LLM-based AIs, not something that would explain the sea change I’ve experienced. Right?

Tibor's avatar

Well, I don't know the exact architecture of current Anthropic or OpenAI models.

But fundamentally, those are still transformers (in case of GPT it is even in the name itself).

What I know changed since the early transformers like BERT:

1. The first base BERT model had a few hundred million parameters (still absolutely huge by the standards of previous architectures); current SOTA models have ... an unknown number of parameters, because neither OpenAI nor Anthropic makes those public, but it is hundreds of billions to trillions. It is clear that transformers scale particularly well, although the returns seem to be diminishing now.

2. Better integration with various tools, and focused models specialising in a given niche and communicating with a master agent model. Claude is not really a single transformer model; it is a collection of many models which interact with each other (or at least I am convinced this is the case, and for some use cases, like computer vision, it more or less has to be).

But even just scaling is not really just scaling. There are a lot of tricks and tweaks that the engineers have to develop and deploy to keep it scaling well. But admittedly, both 1 and 2 are mostly fairly straightforward applied research, which is almost sure to bring some results without a major breakthrough or a brilliant idea (like neural networks ... or, to a lesser degree, attention).

So now the trillion-dollar question is whether this is enough and we can get human-level AI simply by making more minor tweaks and scaling further (I guess that is Scott's position), or whether we need another step (or more) in the AI evolution. And if we do, whether the step is more like going from Markov chains to transformers, or something smaller and easier to achieve, like going from an LSTM to attention.

My feeling is that transformers are not enough, although they are close. That's because I don't think scaling is going to bring much more beyond what it has already achieved, and the various tools and integrations can bring really useful and cool features but don't really move the model towards human-level intelligence (or beyond). But whether transformers can simply be iterated upon and improved, or will have to be replaced by something entirely new, that I don't even want to guess.

Doctor Mist's avatar

Are returns diminishing? I keep seeing things like the graph of human-equivalent task duration over time, which don’t seem to be developing a knee. But these are mostly from boosters, I guess.

Tibor's avatar

Yes, I think so. I also mean diminishing in terms of how much you need to scale to get an equivalent improvement in performance - and that keeps increasing.

Of course, it is a bit speculative, because we don't even know how many parameters GPT-5 has. But it seems consistent with observation, and also with the fundraising activities of Altman etc.

On the other hand, it is also true that you sometimes get improvements which reverse this somewhat and improve the efficiency of scaling, as with DeepSeek.

thepast's avatar

I found this essay so useful for clarifying stuff I’ve been thinking about lately that it compelled my first ever (?) substack comment, to say thanks. Please keep writing about updates to this particular area of AI research.

James F. McGrath's avatar

Putting things in two parallel columns does not make them genuinely similar.

Eremolalos's avatar

But like wut about like a pair of legs, or the 2 macdonalds golden arches, or say the 2 lines that make up an equals sign which acshally are equal *plus* the 2 together mean 'equals', which = 'same' which is even better than genuinely similar. You know?

beowulf888's avatar

I found Scott’s two-column comparison between human and LLM reasoning hyper-annoying. The hierarchies are based on so many unproven and/or specious assumptions that I couldn’t help but wonder whether Scott was being intellectually dishonest. But the whole Yudkowskian rationalist movement is built on unquestioned assumptions derived from specious and unproven axioms, so I suspect he’s like the fish that is unaware of the water it’s swimming in. Anyway, we don’t need to know the internal mechanics of human reasoning and LLM reasoning to see that the failure modes of LLMs are things that wouldn’t trip up most humans. Not that I’m dismissing the usefulness of LLMs as an adjunct to human reasoning, but claiming they’re the same don’t make it so.

Eremolalos's avatar

I have no STEM background post high school, and when I was first learning about AI around 3 years ago I was completely bewildered by how next token prediction could produce something like, say, Dall-e2, which I loved. At some point I either thought of or heard the idea of the process being like evolution, and that was an enormous step up for me. It gave me a model of a huge, mindless process happening on a grand scale that produced changes that were original, brilliant solutions. As a first approximation of how AI does what it does, I think that is the right place to start.

You and others are judging Scott's piece as though it were one of his posts about harder and more substantial AI issues -- say, "God Help Us, Let's Try To Understand AI Monosemanticity." It seems clear to me that his goal is to convey very basic things in a way that makes sense to people who do not understand AI. (Notice, for instance, the comment after this one, by thepast. There are quite a few comments like that one.) There are plenty of people on here who sound quite smart but do not even realize that AI shuts down when not executing a prompt, does not remember them, does not ruminate about them and the topic they had it research, does not learn from the research we send it to do. (Scott should probably have said something about who this piece was aimed at. But also, jeez, couldn't readers have realized that anyway, instead of slamming Scott for being dumb and simplistic?)

In your case, I think you're so irritated by Yudkowsky that the slightest whiff of him makes you see red. Do you know poster 1123581321? He gets furious just as easily these days, but the thing that gets under his skin is reminders that many people speaking with some authority about AI have little practical experience with things like manufacturing (which seems to be 1123581321's field), which affect not only AI development but also how the presence of advanced AI plays out in our lives in the next few years. I like and respect 1123581321, btw.

beowulf888's avatar

My apologies. Yes, I'm turning into a grumpy old fart. But having done my graduate work in human evolution*, I get annoyed when Scott makes claims about human evolution and human biology that are based on controversial ideas, presenting them as fact. For instance, Clark and Friston's argument that the brain is fundamentally a prediction engine has found a welcome home in the AI community, precisely because LLM designers think prediction engines will mimic human intelligence. This has created a weird feedback loop of circular reasoning that says, "Hey, look, our predictive engines mimic human language, therefore human language must be based on a prediction engine."

At the risk of indulging in Darwinian teleology: human language most likely evolved for behavioral coordination, social bonding and social competition, teaching, and deception. As such, it is a malleable information-sharing system that can both precisely describe our experiential world and share abstract mental constructions. Next-token prediction optimizes the likelihood of the next symbol, without reference to primary experience, and it lacks communicative intent. Human language is theory-of-mind driven, goal-directed, and pragmatic, not merely predictive.

* Much of the corpus I studied has been superseded by modern genomics, but I was specifically interested in how pathogens put selective pressure on human populations, and the archaeological evidence of pathogens in ancient populations.

Eremolalos's avatar

Was he talking about language development? I thought he was talking about the development of our inner world model - learning things like "after I hear footsteps, somebody comes in the door." Of course there are lots of other processes contributing to world-model development too. (But there are other processes going on in LLM training too: the development of algorithms equivalent to rotating 6-d toroids, for instance.)

I do think that Scott’s mind is too drawn to simple models. Some years ago he said something about autism and schizophrenia being at opposite ends of a spectrum of rule-boundedness, with autistic people at the high end, and I thought wtf, that’s so simplistic, are you kidding? But these days Scott never annoys me. I have settled comfortably into admiring him greatly while also believing there are some kinds of thinking my friends and I are much better at than he is.

But I don’t think of you as a grumpy old fart! And besides, who am I to talk? — I fuck-you’d John Schilling last week.

beowulf888's avatar

I used language as an example. But I must admit I didn't understand the point he was trying to make when he said...

> Nothing about any of these levels of explanations supports a contention like “Humans are doing REAL THOUGHT, but AIs are simply next-token predictors.” There will be some algorithmic differences, and some of those might be important, and we can talk about their implications, but they’re downstream of what specific prediction tasks each entity was trained on and what strengths and weaknesses their own “evolutionary” history gives them.

Is Scott emphasizing the differences between AI and human reasoning based on their different evolutionary paths? Or is he emphasizing that they're basically the same despite their different evolutionary paths? Either way, the analogy carries no explanatory weight.

The functional stacks that he presented are abstract labels for processes for which we have little insight. Heck, despite the emergent behaviors of LLMs, we probably understand their functional layers (from the behavior of semiconductors --> up to GPU gate design --> to the machine code --> to the neural net design) better than we understand the way neurons enable the human mind.

As for simple/simplistic models, many of the rationalists on these threads seem to latch onto cool ideas the way crows covet shiny baubles. Unfortunately, just because an idea is aesthetically attractive doesn't mean it's true. Yes, Stochastic Parrots are another example of a shiny idea that people latch on to. A related simplistic idea, often used to explain this behavior, is that humans have evolved to detect patterns even when none exist. It sounds good! I want to believe it, even though I know it's a bullshit explanation, but I use it as a placeholder concept until I can work out a way to falsify it.

Stochastic Parrots vs Crow Bauble Ideas. ;-)

mattw's avatar

Amazing article 🙏. This isn't really relevant to any point you made, but I'd nitpick on the evolution of AI algorithms.

It was decades of incremental adjustments to neural net architectures in the research community, traceable from standard feed-forward nets -> recurrent networks -> LSTMs -> GRUs -> transformers (and lots in between), at which point AI companies took the architecture and scaled it up without fundamentally changing the design (as far as we know).

So, the evolution was driven more by academic research labs trying to minimize "challenge" metrics to increase their own status / sense of intellectual fulfillment.

Nikita Rybak's avatar

As a layman (a lay-engineer), I agree that humans tend to vastly overestimate how special they are. But there is one area where we seem to diverge from our ML brethren significantly: *humans know what they know*. From what I understand, this is the reason hallucinations are so hard to train out of models, while even a human explicitly trained to be a helpful assistant retains the capacity to say "I don't know how to turn lead into gold, my liege".

While I appreciate the next-datum-predictor model of the human brain, that's also why I'm skeptical it's the core of human thought and consciousness. If that's the foundation of everything, why do LLMs struggle to say "I don't know" with any confidence? I understand that part of it is the parent company's business motive - a model that is constantly unsure won't sell nearly as well - but surely if Anthropic et al. had a reliable way to separate "I know" from "I have no information, so I'll guess", they'd use it?

Eremolalos's avatar

GPT recently became much better at recognizing and explaining its image-making errors. In the past, when I have asked for it to make a revised image, it was very often unable to do that. Now, the success rate on revisions is higher, but also in the past month or so GPT has become able to sometimes explain to me why it simply cannot change an image in the way I want.

Recently it made me an image of a whirlpool, but could not make a change that I thought was small and simple. When I complained, Chat told me that “the current image-editing system is extremely weak at enforcing geometric changes when the subject is water, fractal structures, or anything without stable, trackable edges. In those situations it tends to snap back to its own internal “default whirlpool composition,” no matter how many times you correct it or how strongly you specify the geometry. . . . the generator simply cannot perform the kind of controlled geometric transformation you’re asking for when the subject is a chaotic water surface.”

Later I commented to GPT on its increased ability to explain why it could not make or revise certain images, and it said this:

“Earlier versions of me had almost no introspective access to why a request failed. If the image model did something wrong, I could only guess. Now I have much clearer diagnostic insight into:

• what kinds of constraints the generator can and cannot follow

• which parts of a request exceed its representational capacity

• when a conflict exists between two requirements

• when a request would force the model into geometric inconsistency

• when the generator is likely to snap back to defaults (e.g., symmetry, “attractive” compositions, central subjects, smooth lighting)”

Introspective access. Wow.

Seemster's avatar

Correct. Mathematical function is the LLM’s species!

Mark's avatar

I just spent my first 6 hours yesterday with free Claude 4.6 - discussing ways to start a tiny foundation in Germany - and though there was one blunder (at least), this was absolutely amazing, insightful, helpful, and competent. And not stochastic parroting. As an ACX reader, I was not surprised; I was expecting to be amazed. And I was. Will now buy the yearly subscription.

Oh, and a fine ACX post as always. I confess to now following the trinity of Tyler, Scott and holy Claude. TSC.

Winston Platt's avatar

I've known intuitively that "just stochastic parrot" was wrong but didn't know how to think about it. This piece has been a great help, thank you.

manunamz's avatar

Looking forward to the Anti-Stochastic-Parrot FAQ.

Darius Kirke's avatar

"probably your thoughts involve something at least as weird as helical manifolds in 6D spaces"

Just a subjective report here, but I've often felt like my mind uses physical movement as a way to do other calculations. When I'm working on software and notice similarities between processes - like "Hey, these two apparently different processes are actually doing the same thing under a little abstraction" - those sorts of insights come to me with a sense that the two algorithms are 'shaped' and 'move' in the same way, the way throwing a ball and throwing a rock are 'shaped' and 'move' in the same way.

I also find that similar concepts tend to have similar, almost physical, textures to them, the same 'weight', 'firmness', or 'shape'.

Victor's avatar

An easier term to use is "sensation."

Nice essay, although Scott is conflating next token/sensation prediction with reinforcement learning, which technically aren't the same thing (different levels).

The reason AIs aren't really "thinking" isn't that they are just next-token predictors; it's that they (probably) aren't conscious in the same sense we are. The 'Hard Problem' and all that. Asking ourselves what we mean when we say that someone understands something is a tough nut that hasn't yet been fully cracked.

Another nitpick: technically, evolution isn't an optimization process, because it isn't directional - it isn't moving us toward a target value. But aside from that, evolution is pretty much functionally equivalent to human AI selection, which I suppose is technically a form of domestication. The problem is that the farmers down the road are optimizing their chickens to become addictive, which clearly isn't in our long-term best interest, but damn they taste good.

We begin to learn to count at a very early age (pre-linguistic) by recognizing that two objects are different from one object. Later, giving names to different-sized groupings and figuring out how to mentally merge and divide them is closer to memorization than prediction. Taught correctly, it's pattern matching. Later still, it can become conceptual: mathematical operations come to mean something in a semantic sense, at which point math can become a language, and ultimately a form of art.

catherio's avatar

At the very least this reminds me of Neural Population Dynamics During Reaching by Churchland et al.

Non rationalist scumbag's avatar

I had a talk with Claude about rotary embeddings after reading this, and it pointed out that cochleas do something quite similar in animals: breaking down and categorising/understanding structure through the use of frequencies. I thought that was quite apposite and worth a mention.
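
For the curious, here's a toy version of the frequency idea behind rotary embeddings (a NumPy sketch simplified to a single vector; real implementations apply this to the query and key vectors inside attention, and the function name is just made up here):

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Rotate each pair of dimensions of x by a position-dependent angle,
    with a different frequency per pair (fast to slow) - loosely like a
    bank of cochlear frequency channels."""
    d = x.shape[-1]
    freqs = base ** (-np.arange(0, d, 2) / d)  # one frequency per pair
    angles = pos * freqs
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[0::2], x[1::2]
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin
    out[1::2] = x1 * sin + x2 * cos
    return out

x = np.ones(8)
print(rope(x, pos=3))  # same content, position baked in as a rotation
```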

Carlos's avatar

Um, the 2017 "Attention Is All You Need" paper defining the transformer model IS about token prediction. However, nearly 10 years later, maybe the LLMs are not pure transformers anymore.

A. Jacobs's avatar

The optimization level distinction here is really helpful. If prediction processes produce internal world models, then the structure of the environments they’re trained within becomes extremely important. As those environments become more mediated or synthetic, the models produced by them may increasingly reflect the information ecosystem itself.

Jonathan Weil's avatar

I need no convincing that “stochastic parrot” is incorrect. Having said that, the analogy between neural tissue and model weights struck me as the weakest link here. (I am neither a neuroscientist nor a data scientist, but isn't it the case that there's a load of stuff we don't know about both, such that calling them “similar materials” and leaving it at that seems a little… breezy?)