270 Comments
User's avatar
Richard Ngo's avatar

"Mis-estimating one parameter can ruin the whole project."

That's one possibility for what happened. Another is that the bottom line was more or less already written by social pressures and vibes. In the latter scenario, if the algorithmic progress parameter had been estimated correctly, then some other parameters would have been tweaked to preserve a reasonable-sounding bottom line.

I don't know which of these scenarios is closer to the truth, but it seems worth holding both in mind when thinking about this kind of forecasting. (The "social pressures" scenario also seems like a plausible reason why Joe Carlsmith's report ended up estimating only a 5% chance of existential catastrophe from AI.) To be clear, even in the "social pressure" scenario I'm not ascribing deliberate deceptive intent to either of them, merely the kind of motivated cognition that most people are doing most of the time.

One piece of relevant evidence is that both Ajeya and Joe ended up later updating their estimates in the same direction that the vibes updated. In Ajeya's case, she noted in her update post that "I’d largely characterize these as updates made from thinking about things more and talking about them with people rather than updates from events in the world". So there's no clear reason why her update *should* be in the same direction as the vibes. Not sure what Joe's purported reasoning process was, but again it's a little suspicious that it gradually updated him in the same direction as the vibes.

Though it's also worth noting that both of their updates came before ChatGPT was released, which was the *big* vibe shift around that time.

Ajeya's update: https://www.alignmentforum.org/posts/AfH2oPHCApdKicM4m/two-year-update-on-my-personal-ai-timelines

Joe's report: https://arxiv.org/abs/2206.13353

Jimmy's avatar

Is there a solution to any of that? Because there seems to be a problem with forecasts where either the prediction is way too absurd for anyone to believe, and thus fails to make any meaningful change, or the forecast is changed to be more palatable, in which case it no longer has any value as a predictor of the future.

Richard Ngo's avatar

I am much more excited about qualitative forecasts that try to predict new *types* of phenomena rather than quantitative forecasts that try to put numbers on already-described phenomena.

In other words, most of the work is in generating interesting questions to ask, rather than giving well-calibrated answers to them.

In the case of AGI, coming up with the concept of AGI in the first place is doing far more valuable work than basically any estimate of when we'll get AGI. And my sense is that the vast majority of effort that's being put into understanding timelines would be better directed towards understanding what AGI looks like (ideally to a level where the concept of "timelines to AGI" seems obviously nonsensical because it's too vague).

Analogously, instead of trying to predict *when* the industrial revolution would happen, people in earlier centuries would have been better off trying to predict what it'd look like, and e.g. which industries would benefit first, and so on. That's the kind of intellectual work that then generalizes to actually contributing to or steering the revolution when it comes.

Scott Alexander's avatar

Yeah, I took that argument pretty seriously in my original piece on Bio Anchors - see https://www.astralcodexten.com/i/47594966/does-the-truth-point-to-itself and https://www.astralcodexten.com/i/47594966/what-moores-law-giveth-platts-law-taketh-away - and part of what I wanted to do with this deeper investigation was see whether that had panned out.

But as I mentioned in the last part, looking into it further made me believe this less. The study that Ajeya used for algorithmic progress speed really was the only attempt at an algorithmic progress study that existed when she was doing this analysis, so she didn't have too many degrees of freedom to make that parameter something else. And everything else was AFAICT more-or-less correct.

I think the closest you can come to what you're saying is her claim that she rounded algorithmic progress down rather than up after thinking about all the different considerations, but she's basically admitting that, and I think that even if she had rounded the woefully low Hernandez/Brown number up instead of down, it still would have been way too late.

So basically - yeah, you can always say this about any person making any claim, you can never be entirely sure it's not true, but looking into this more closely made me update away from this hypothesis rather than towards it.

Richard Ngo's avatar

Where did you discuss the "looking into it further made me believe this less" thing? I'm not seeing any very direct discussion of that in section V (and Claude says "The post as shared here doesn't contain any mention of the "bottom line being pre-written to be socially conformist" — the phrase doesn't appear, and the concept isn't addressed even indirectly").

Re your argument about degrees of freedom, how does this square with Ajeya's update post? In that post she lists about five different updates to the model which shorten her timelines, none of which are a straightforward "increase the algorithmic progress parameter". I can't demonstrate that these updates were socially-motivated, but at the very least it shows that there *were* ways that a reasonable variant of the model could have produced shorter timelines (and therefore that there were degrees of freedom which could have been tweaked to produce the 2050s estimate).

To be clear, I'm wary of being overly paranoid here—I don't want to dissuade good-faith modeling attempts, and the social pressure hypothesis is worryingly difficult to properly falsify. Having said that, I think doing this kind of work at OpenPhil makes it fairer game to criticize than if it'd been produced independently (especially because there were a few different reports that IMO had similar issues).

Nikita Sokolsky's avatar

I've previously thought that a lot of the estimates are biased downwards because of people's desire to live long enough to see AGI. I now agree with you in that the biases were actually in the *opposite* direction, even if Ajeya herself wasn't consciously affected by them.

David Manheim's avatar

Speaking of Eliezer's "trick that never works", I have repeatedly found that trying to adjust my credence in people's claims as inputs to forecasts based on speculating about their motives and goals, instead of their statements and track record, is easy to justify mentally, but also actively misleading.

Philippe Saner's avatar

Okay, I'm willing to be convinced.

What exactly is the "AGI" that people expect to see within the next twenty years? And what is the case for expecting it?

I started reading the AI 2027 link, and stopped when I realized that it was a story rather than an argument. I would like to see an argument.

Scott Alexander's avatar

AGI is AI that is as smart or smarter than humans.

The argument that it will happen is that AI has gotten much better over the past three years (from too dumb to count to ten consistently, to able to ace MIT quantum physics tests and win math competitions). Its intelligence seems to be proportional to a quantity called "effective compute" (this is the "scaling laws" you might have heard about). We're on track to add 1,000 - 10,000x more effective compute to AI by the end of this decade, which means it should be expected to improve by about the same amount it's already improved over the past three years. Intuitively, that would make it smarter than humans.
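As a back-of-the-envelope check on that compounding, here is a minimal sketch in Python (the per-year growth rates are illustrative assumptions, not figures from this comment):

```python
# Sketch: compound effective-compute growth, assuming a constant per-year rate.
# Growth rates here are illustrative; the post's own numbers vary by period.
for growth_per_year in (4, 6, 10.7):
    multiplier = growth_per_year ** 5  # five years, e.g. 2025 -> 2030
    print(f"{growth_per_year}x/year for 5 years -> ~{multiplier:,.0f}x effective compute")
# 4x/year gives ~1,000x and 6x/year gives ~8,000x over five years, matching the
# 1,000 - 10,000x range above; the 10.7x/year figure quoted later in the thread
# would overshoot it (~140,000x).
```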

The real argument is much more complicated than this but basically follows the same beats and just closes up loose ends. See https://situational-awareness.ai/ for one especially well-written version.

Philippe Saner's avatar

So it's basically tracing a trendline? Assuming that the curve-of-best-fit continues to fit best, and following it up the y-axis as x grows?

That style of prediction has a pretty shaky track record. It's better than nothing, but I don't think it justifies the confidence you show here.

Also I'm not sure "as smart as humans" is a well-defined phrase. AI is already much smarter than humans in many ways. I have no idea how to weight its relative strengths against its relative weaknesses, but I can't imagine calling what we have today AGI.

Scott Alexander's avatar

Sort of. My impression is that this style of prediction has a pretty good track record (see for example Moore's Law, or US GDP over time, or solar panel deployment). It seems especially good in AI - CTRL+F "the scaling laws have mostly held", above, and see the three graphs.

I don't know what level of confidence I seem to be showing, but if you want to naively say that there's equal chances things go slower than trend, approximately on trend, or faster than trend, then that suggests 66% chance AGI in next 5-10 years. I don't know what reason one would have for placing the large majority of probability mass on "it will definitely be much slower than the current trendline".

I agree "smarter than humans" is part of the way that the real argument is more complicated. One way of operationalizing this is that AI is currently better than humans at short tasks (like coding a small app, or solving a short math problem) but worse at long tasks (like coding an entire complicated software system, or proving a difficult problem, or writing a book). Right now there's parity around the 6 hour mark, but the trendline has this doubling every five months or so. So if the trend continues, then in five years, there will be parity around the months-to-years mark, which seems like enough time to do most things that humans do.

Philippe Saner's avatar

The level of confidence you seem to be showing:

You seem to take for granted that we won't see a third AI winter. That the current approaches are capable of producing the machines you dream of. And also that the frankly scary economics of the industry don't cause the whole enterprise to implode. In short, you don't seem to respect the possibility that the fundamental basis of the predictions is wrong. Which could easily add decades - hell, centuries - to any timeline.

I have no real objection to the logic you use to argue that 10 years is more likely than 15. But these things don't follow a bell curve, and you write as though they do.

And it's worth noting that the strongest AI critic I've seen - Ed Zitron - is obsessed with issues that you hardly seem to think about. Financial ones, mainly. I would be very interested in seeing a debate between the two of you.

Scott Alexander's avatar

The finances matter a lot for investors and the overall economy, but I'm not convinced they change the technological calculus much.

If willingness to invest in AI collapses tomorrow, this is probably very bad for OpenAI and Anthropic. It's . . . unclear for Google? They're Google, so they'll survive, and since their competitors have crashed, they'll have a near-monopoly in the US AI market. I don't know for sure that this is true, but my guess is Google would take this opportunity to hire up all their rivals' researchers, buy out their data centers at fire sale prices, and keep going at a slightly slower speed. A lot of this stuff is already half-built, NVIDIA currently makes 400% profit per chip and would rather accept lower margins than stop selling chips at all, and the benefits of consolidation under one roof might be bigger than the costs of a less friendly investment climate. Overall I think probably a year or two slower but no huge difference.

If it's not Google, then it'll be Facebook or Amazon. If it's not Facebook or Amazon, Elon will fund it with his own money ($800 billion). At least Demis and Elon are true believers; as long as it's at all possible for them to keep going, they won't stop just because their investors got a little skittish. Or if for some reason all of these people are scared off, it will happen in China, and it'll be two or three years slower.

I haven't thought about this too much and would welcome corrections from people who know more about tech economics.

Daniel's avatar

Isn’t a big part of the model the assumption of exponential growth in compute manufacturing? If the investment bubble pops, I doubt data-center buildouts drop to zero, but we probably wouldn’t see investments in the factories building the chips, which means that the amount of compute coming online per year would stay pretty constant.

Ch Hi's avatar

You left out China. If AGI doesn't come before 2030, I don't think you can rule out China being the source.

__browsing's avatar

I think you're possibly forgetting that the AI bubble is going to roughly coincide with a number of other looming fiscal cliffs, such as the pensions + housing vs. money-printing spiral that much of the planet is currently locked into.

Even China could be taken out of the equation as a major player if a war over Taiwan gets them hit with serious trade embargoes (and their pensions/housing/debt bubble is arguably worse than anywhere else on the planet.)

I do suspect we're in a hardware overhang regardless- I really don't think it should take vast increases in compute for AI agents to increase their task-planning horizon from minutes to weeks, given that classical-planning techniques are some of the oldest tricks in the AI handbook and ran tolerably well on IBM mainframes in the 1970s- so maybe AGI will emerge regardless even if the semiconductor industry basically collapses. But I don't especially want it to.

meeeewith4es's avatar

Much of what Ed does feels motivated more by an interest in generating outrage and pleasing the ingroup. I would also note that this messaging presents a financial incentive for him, as he's now paywalling a large % of his posts (especially those that are particularly divisive). Perhaps relevant: https://slatestarcodex.com/2014/12/17/the-toxoplasma-of-rage/

I've re-read a few of his posts from ~6 months ago to make this reply (and completely rewrote this comment 4 times to make it nicer on Ed), and I would suggest that you do so as well, with a particular focus on what he latches onto and labels as lies without considering all possibilities, what he thinks about things being able to improve from their current state, and what he predicts and what reasoning he gives for it.

Greg G's avatar

For what it's worth, I don't consider Zitron a credible critic at all. Saying "but it doesn't make money" has only a modest amount to do with technological progress, as Scott explains in more detail. I find the money-making objection to be a motte and bailey (it's not useful versus it's not making profits right now), and to be suspect since there is a lot of money to be made if you replace much of human white-collar work. I would be interested in a discussion with a good-faith critic, but I do not consider Zitron to be that.

Philippe Saner's avatar

It having only a modest amount to do with technological progress is, in fact, the point I was trying to make. Scott and Ed are diametrically opposed, but they often don't even contradict each other because their areas of concern are so different.

To label it a motte and bailey is silly, by the way. People really do care about whether there's another financial crisis. Yelling about how this is the next dotcom bubble is not some disingenuous attempt to distract from how this might be the next internet.

Cjw's avatar

The distributions of AGI arrival dates most people in this space have predicted are definitely not bell curves, and neither is the one Scott helped prepare last year. In that one, as in all I've seen, the modal year is earlier than the median year. So there's a better chance for 2030 than for any other particular year, but half the AGI-arrival years in the distribution of possibilities fall after 2032, because of the potential roadblocks.

Catmint's avatar

On your recommendation, I looked into some essays by Ed Zitron. It seems to me like he does not understand the technology and is basing all his opinions on the assumption that it has reached its peak and will not improve further. In the places where he gives arguments supporting this, his arguments indicate a basic lack of understanding of how the technology works and which parts are essential.

Some examples:

https://www.wheresyoured.at/the-case-against-generative-ai/ He claims that because AIs are probabilistic, if you use one to generate each picture in a storybook, the characters will look different in each one. But actually, the probabilistic nature of the model has nothing to do with this. The way the code is set up is similar to hiring a different human artist to draw each page, with no way for them to talk to each other. Of course the characters would come out with differences. A solution would likely involve keeping previously generated images within the context window somehow, and does not require a paradigm shift. Sora has managed something similar, which allows it to generate multiple frames of a video that don't immediately contradict each other. The videos still aren't very good, but for other reasons.

https://www.wheresyoured.at/subprimeai/ Ctrl+F "So, let’s talk about accuracy." He describes mistakes the models are prone to, such as inventing code API functions, adding a piece to a chess match, and counting the letters in different words. He treats this as proving that generative AI is useless, but actually, the correct measure for usefulness is how high the strengths go, and for the weaknesses, not how low they go but to what extent they get in the way of the strengths. This is more accurate to how people use it when actually trying to get use out of it: They give it tasks that play to its strengths, and continue doing the parts it is weak at manually. To reply to those specific examples:

- Most humans would also lose or add pieces to a chess match, if they had to play the game entirely through text without using a board.

- I, an experienced programmer, regularly "hallucinate" new API functions. But luckily my brain is not hooked up directly to my keyboard, so I get a chance to stop and look it up. On the rare occasions where I need to write code without being able to reference any documentation or source, I may sometimes need to invent API calls that I think might exist, and go back and check them later.

- There happens to be a quirk in LLM architecture that makes them lack awareness of spelling. Specifically, the tokens they use are mainly whole words, not individual letters. There are many similar quirks in human thinking, like that chessboard with a shadow where the white and black squares are the same color. It doesn't seem to hold us back much.

In the same article, he says "Large Language Models ... are not going to magically sprout new capabilities", but... that's pretty much exactly what happened before? We went from predicting the next word to playing chess and writing code. Doesn't seem like a rule that can be relied on.

For the financial stuff, he seems to be on much more solid ground. He's got more awareness than I do of how funding works for the various AI companies. I don't even disagree about it being a bubble, though I think it does have some chance to go either way. But his financial analysis overall is heavily colored by his poor understanding of the tech.

proud dog owner's avatar

>Right now there's parity around the 6 hour mark, but the trendline has this doubling every five months or so. So if the trend continues, then in five years, there will be parity around the months-to-years mark, which seems like enough time to do most things that humans do.

So... AGI is just the current LLM but... better? Do we define AI intelligence according to the length of time it takes before becoming incoherent? And if we think AGI = AI outperforming humans in some inevitably arbitrary standard, what's the point of even using this term?

Kenny Easwaran's avatar

Yeah, I think “as smart as humans” is pretty clearly something we will never get, since existing systems are already so much better than humans at some things and so much worse at others. There’s unlikely to be a point at which it makes sense to say they are “roughly equal”.

There are people using “as smart as humans” to argue that AGI is already here: https://www.nature.com/articles/d41586-026-00285-6

I think the much more interesting concept of “general intelligence” is that of a system that is good at solving all kinds of problems (along the lines of “universal Turing machine” or “NP complete”, which are both about one class of algorithms being such that they can do anything that any other algorithm in that class or lower can do), but I suspect that the kind of problem we are interested in for “intelligence”, and the kind of “good at solving” we are interested in are such that there just isn’t going to be linear comparison - we care about constant and polynomial multiplications of speed and accuracy as measures of intelligence in a way that we don’t for NP completeness.

Kurt's avatar

Yeah, from what I've gathered, LLM coding assistants are fantastic at solving "embarrassingly solved" problems such as a Game Boy emulator, which has thousands of examples to draw from on GitHub. But then try to get it to help you untangle some arcane legacy code in decades-old proprietary enterprise software and it's totally hopeless since nothing outside your company has any examples of such code.

Michael's avatar

You seem too confident that the current trend will stop before 2030 and before AGI. Predictions that a trend will stop have a much shakier track record than predictions that a trend will continue. This is despite the fact that all trends end eventually. The tricky part is predicting when they will end.

Philippe Saner's avatar

To be clear, I am not confident in anything about the future of AI technology.

Vic Fourier's avatar

Shouldn't the correct term be ASI then?

We already have "general" intelligence in terms of forms of LLMs, which can accomplish a wide variety of tasks at a level matching or exceeding (some) humans, as opposed to previous narrow AI like chess programs, speech recognition, image classifiers, etc, which could only accomplish one narrow task.

If LLMs aren't AGI, then what are they? They're not ANI (and obviously not ASI). They fit the definition of "general" intelligence quite well, being capable of solving various problems without having to be explicitly trained for it.

If you had shown anybody 10 or 20 years ago a program that could do anything from writing poetry, to parsing documents, to describing images, to writing code, to (insert any LLM capability here), they would say: obviously that's AGI. For LLMs the question isn't "can it do X" but "can it do X well", while if you take an ANI like Deep Blue or even AlphaGo, it's not even within the realm of possibility for it to do anything but play chess/Go.

Scott Alexander's avatar

Yes, but everyone expected AGI to be something more exciting than LLMs, so they collectively retconned the term to mean "at least human-level AI". I don't like it either.

Philippe Saner's avatar

It reminds me a little bit of the way "cyborg" is used. Braces don't count, pacemakers don't count, prosthetic limbs don't count...because those things exist, and cyborgs are by definition science-fictional.

Domo Sapiens's avatar

And never mind the effectively cyborgian features we have at our fingertips at all times through smartphones and other wearables - which will never be implanted into the body for obvious reasons of replaceability, upgradeability, and insurmountable bio-compatibility problems.

Thomas's avatar

The "cyb" in "cyborg" means "cybernetic", that is, "measuring something and reacting differently to different measurements". Most prosthetics and braces definitely don't fit that definition. Modern pacemakers do fit, since they measure the rhythm of the heart and produce signals differently depending on the heart's actions.

Hedonic Escalator's avatar

Cybernetics requires bidirectional feedback. There are prosthetics that qualify, but most don't.

Vic Fourier's avatar

Tbh I don't like that definition either, because it's not very precise. What human? What tasks? Purely intellectual, motor skill based ones, visual, artistic?

An AI could pose existential risk even if it's worse than any human alive at say, making music, or juggling, or brewing a pot of coffee, and an un-embodied AI that's permanently stuck at the intellectual level of a person with an IQ of 80 would cause little risk to society and might even be less economically productive than Claude code.

Philippe Saner's avatar

I worry about the possibility of a self-aiming gun that doesn't miss. Not even a military robot. Just a gun that you can carry, or mount on a pole in an area you want people kept out of.

meeeewith4es's avatar

How does this differ from e.g. automatic targeting systems in fighter jets or various other similar military tech?

Or even just experiments that were made 15 years ago and achieved what you are afraid of? https://en.wikipedia.org/wiki/EXACTO

Not alone either: https://en.wikipedia.org/wiki/Precision-guided_firearm

Within shorter range (which carrying or mounting on a pole implies to me), especially for stationary uses due to bulk, it's a project you could even build yourself (though it would likely be illegal if used by a civilian "on a pole to keep people out of an area").

meeeewith4es's avatar

I believe one historic definition was "50% of skilled adult humans across almost all intellectual tasks", which still leaves questions such as "skilled overall, or skilled in the field?". One could argue that we're already there, but similar to how Scott Alexander put it above, that's not very exciting. I don't want medical advice from a mediocre doctor, I don't want to offload my programming tasks to a mediocre developer, I don't want to risk learning incorrect things from a mediocre teacher. IRL, as long as I have the means, I can solve these by paying someone who is much better at their field, but with LLMs, we've got what we've got.

I do appreciate the separation (also in this article) into "tasks that can be done in seconds/minutes/hours/.../years" as it provides a better way to look at the problem. There's still the question of what duration of tasks AGI should be able to achieve on its own, and I believe the "human level" mapping this post mentions is months to years.

I do however agree that the current definition of AGI is much closer to that of ASI, and I've taken to using that term more for the future goal at hand.

Demarquis's avatar

I take "human" in these definitions to mean "all humans." That is, an AGI could potentially outperform global humans at all tasks humans currently undertake. This isn't necessarily a super-computer, enough distributed moderately intelligent systems could do this if they solve the coordination problems. The point is any activity could be automated We aren't there yet.

Viliam's avatar

> IRL, as long as I have the means, I can solve these by paying someone who is much better at their field, but with LLMs, we've got what we've got.

One possibility is that the LLMs will replace the mediocre humans, and the skilled ones will remain available for a while.

Depending on the economic situation, the human experts may become cheaper or more expensive. That basically depends on what the alternative is for (permanently, because once the LLM gets better than you, there is no way back) unemployable humans -- UBI, or automated extermination camps? Cheaper if the alternative is the camps, so the experts will be desperate to keep their jobs; expensive if the alternative is UBI, so they will be tempted to keep the money they already have and retire comfortably. In the latter case, it is also possible that they simultaneously become more productive, because e.g. the doctors will be able to outsource the paperwork to LLMs, freeing their hands to do more surgery or whatever instead.

__browsing's avatar

I already consider LLMs and midjourney-style image-generators to be more than sufficiently "exciting", though my internal head-definition of AGI was "matches or exceeds all human capabilities". I dunno, maybe I'm using the term wrong.

YesNoMaybe's avatar

Is that true, though?

I probably learned of the term AGI from the sci-fi roleplaying game "Eclipse Phase". It's from 2009 and by their definition AGI implies human level. I've never known it to mean anything else.

Mister_M's avatar

My impression of the vibes when people started talking a lot about AGI is that people imagined a scenario where the AIs first become general with a suite of skills with roughly the same balance as humans, but maybe dumber overall. General intelligence would be a single parameter specifying a scalar multiple of human intelligence, and at a certain point it would reach IQ 100. I doubt many people would have endorsed this hypothesis when put so explicitly, and I'm exaggerating a bit, but I think these vibes were present at the time. In this context, generality of intelligence seems like kind of a binary have/don't have thing, and once we reached this generality, we proceed to scale the IQ.

Anyway, this all turned out to be far from the truth, and maybe a more charitable way of phrasing Scott's suggestion that "everyone expected AGI to be something more exciting than LLMs" is that assumptions about generality that turned out to be totally wrong were implicitly part of the definition of AGI, and when things went very differently people realized they hadn't really articulated the whole meaning that they'd been implicitly using.

Kenny Easwaran's avatar

This is my impression. People assumed there would be something like Turing universality or NP completeness, but it turns out that “intelligence” doesn’t have such a thing (even if there is a single factor “g” that correlates with all measures of intelligence among humans - non-humans just aren’t on the same trend line).

Mister_M's avatar

Now that we're discussing this, I wonder why humans and AIs are different in that way.

I had the sense that the "multiple intelligences" hypothesis was not widely supported by experts, but

1. maybe I'm wrong about the experts, or

2. maybe the experts are wrong about this, or

3. maybe there's some special reason that human intelligence fits a certain proportion (perhaps because it's driven by a different principle and AIs really aren't intelligent in an important way), or

4. (I'm leaning towards this one) maybe it's only intelligence *variation* that's mostly one-dimensional, i.e., maybe the surprising stupidities we see in AIs are in cognitive tools that for humans are standard and don't vary much.

I'm leaning towards 4, but I'm well aware that a cursory consideration of where their weaknesses are could potentially weigh strongly against this.

Kenny Easwaran's avatar

I don't know nearly as much about metrics of things like IQ as some other people here, so I don't know too much about what people think about the "multiple intelligences" idea. The basic outline is that yes, there clearly are some different sorts of mental ability that aren't perfectly correlated in humans, but still they do all have some amount of correlation in humans, and the debate is really about how important this correlation is.

But with all sorts of statistical measures, we know that things correlated in one system aren't necessarily correlated in another. As a toy model, just consider two tests, one of which consists of questions that appear in the textbook and one of which consists of questions that don't appear in the textbook. For students who are taking the test closed-book, scores on these tests will be fairly highly correlated (especially if the students weren't told there would be a test made up of questions from the book, so they didn't think to memorize answers), but for students who are taking the test open-book, scores on these tests will be fairly uncorrelated (because the variance on the test based on questions that appear in the book will be determined by how much effort students put in to flipping through the book to find the question, while the variance on the test based on new questions will be determined by the students' skill at solving the questions).
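A quick simulation of that toy model, as a sketch only (the distributions, noise levels, and sample size below are arbitrary choices for illustration, not anything from the comment):

```python
# Toy model sketch: the same two tests correlate for closed-book students
# but not for open-book students. All parameters are arbitrary illustrations.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
skill = rng.normal(size=n)    # general problem-solving ability
effort = rng.normal(size=n)   # willingness to flip through the book

# Closed-book: both tests mostly measure skill (plus noise).
closed_in_book = skill + 0.3 * rng.normal(size=n)
closed_new_qs = skill + 0.3 * rng.normal(size=n)

# Open-book: the in-book test mostly measures effort, while the
# new-question test still mostly measures skill.
open_in_book = effort + 0.3 * rng.normal(size=n)
open_new_qs = skill + 0.3 * rng.normal(size=n)

print("closed-book correlation:", np.corrcoef(closed_in_book, closed_new_qs)[0, 1])
print("open-book correlation:  ", np.corrcoef(open_in_book, open_new_qs)[0, 1])
# The first correlation comes out high (~0.9), the second near zero.
```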

I don't think the difference between contemporary AI systems and humans is as straightforward as that example suggests, or that many of the benchmarks are as straightforward as the test of questions that are in the book, but it gives a toy model for how these can come apart.

Another example might be thinking about all the clever spatial reasoning abilities dogs seem to have when chasing animals up trees, but how bad they seem to be at figuring out how to untangle the leash when it gets wrapped around their legs. A human who was good at one would be good at the other, but apparently for dogs these skills aren't as connected!

Viliam's avatar

I'd say it's because human biology keeps things within certain proportions.

You won't find a human who can speak all existing languages, and has an encyclopedic knowledge of everything, and can recall anything in a fraction of a second... but keeps making relatively stupid mistakes in judgment.

Or maybe you kinda can, but such humans are rare, some idiot savants. Humans learn over time, if you suck at learning, you will probably suck at both knowledge and judgment; if you are good at learning, you will probably be good at both.

So kinda option 4 for things like memory, where the machine is simply way out of the human distribution, and kinda option 3 for things like learning, where in humans "how good you are at processing information" is proportional to "how much well-processed information you have", while for the machine, training and thinking are two unrelated processes.

Carlos's avatar

They kind of are AGI, except that they turn out to be much worse than humans at certain important things. I think we were all expecting AGI to mean a piece of software that wouldn't be worse than humans at anything.

Instead, what we have are jagged intelligences, where on the peaks, it is superhuman, but then it has troughs where it is sub-human.

Bugmaster's avatar

> Instead, what we have are jagged intelligences, where on the peaks, it is superhuman, but then it has troughs where it is sub-human.

Technically, this also describes a calculator!

Kenny Easwaran's avatar

There’s a paper in a Nature journal last week making that point: https://www.nature.com/articles/d41586-026-00285-6

Ch Hi's avatar

AGI, i.e. Artificial General Intelligence, doesn't have a widely agreed upon meaning. It generally is more defined by lower boundaries. My favorite conservative meaning is "Able to learn to do any accessible task as well as any human can learn to do it.", but do note that that doesn't specify either the speed of the learning, or what the current capabilities are. And I consider that definition a cheat, in the sense that an actual AGI is probably impossible, and would be more like "Able to learn to do any computational task as well as it can be done."

OTOH, widely used meanings appear to range from "able to do anything I can think of" to "able to write as clearly as the average high school freshman". I.e., better than anything currently public, but such a wide range of things that it starts to approach meaningless. In any particular case, you need to find out what the person using the term means by it.

__browsing's avatar

> My favorite conservative meaning is "Able to learn to do any accessible task as well as any human can learn to do it."

Yes, that was my understanding of the term as well.

Coagulopath's avatar

A system that is generally intelligent.

Right now we have systems that are human-level or superhuman-level at certain things, but this intelligence isn't general: it is still easy to find areas where they have zero or very low capabilities—like navigating a body in 3D space, or playing games without training on them.

ARC-AGI is a good example. It is a benchmark of simple spatial reasoning puzzles that humans can solve but LLMs struggle with. Then (because it's so famous and prestigious) companies bang their heads against it until they figure out some way to train LLMs to solve them, their scores go from 0-10% to like 80%, then the creator (François Chollet) just makes a new version and all the LLMs are back at like 0-10%.

Chollet thinks AGI will have been achieved when it's no longer possible for him to do this.

Sol Hando's avatar

How many people have tried their hands at serious AI forecasts, and how much variation is there in their predictions?

If we have a range of predictions from 2027 to 2050, someone in the AI-forecasting community is going to seem right at any given time.

Scott Alexander's avatar

I think the history of estimates that were taken most seriously at the time were:

1990: Hans Moravec predicted 2010. Didn't work because he didn't understand the difference between inference (compute-cheap) and training (compute-expensive)

~2019: Dario Amodei seems to have implicitly predicted something like 2022, although this was kind of secretive and I don't have access to his reasoning. I think he discovered scaling laws, got very excited/scared, and thought that scaling a few orders of magnitude would be enough.

2020: Bio Anchors predicted 2053. Didn't work for reasons mentioned above.

2022: Tom Davidson predicted 2043 (https://www.astralcodexten.com/p/davidson-on-takeoff-speeds). Didn't work because it inherited Bio Anchors' algorithmic progress error.

2025: AI2027 predicted 2027-2030 period. Yet to be determined.

2025: Epoch predicted 2040s for full automation (somewhat different from AGI). Yet to be determined.

2026: AI2027 updated to ~2029-2033 period. Yet to be determined.

JerL's avatar

Surely the bioanchors-based predictions are also yet to be determined? If 2040s for Epoch is TBD, how can 2043 for Davidson be concluded to have not worked?

Scott Alexander's avatar

Yeah, as I said above, we don't know for sure, but Bio Anchors and Davidson depend on an algorithmic progress estimate which is now known to be wrong, and if you plug in the correct algorithmic progress estimate, it gives an earlier result. The author of Bio Anchors has said she's changed her mind. I don't know for sure about Davidson but I'm guessing he would say the same.

Epoch still stands by their prediction. Also, they're saying something slightly different - they don't like the term AGI as much and are trying to predict when it will automate the entire economy, which they think will be very hard partly because of last-mile problems and partly because of human reluctance.

I agree it's a judgment call saying one is wrong and the other isn't.

Edmund's avatar

> partly because of last-mile problems and partly because of human reluctance

I think "cost-effective large-scale manufacturing of efficient robots is not actually permitted by the laws of physics, however smart you are" remains an under-discussed area of uncertainty here.

Scott Alexander's avatar

That doesn't make sense to me. What's physically impossible - welding joints together?

There's already cost-ineffective large-scale manufacturing of inefficient robots. There's usually a pretty tight relationship between quantity produced and manufacturing efficiency - Wright's Law - and I don't think humanoid robots are a sufficiently unusual product compared to eg industrial robots or cars that we should expect them to break it. See https://benjamintodd.substack.com/p/how-quickly-could-robots-scale-up for more.
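For concreteness, Wright's Law says unit cost falls by a fixed fraction with each doubling of cumulative production. A minimal sketch (the 20% learning rate and starting cost below are illustrative assumptions, not figures about robots):

```python
# Minimal sketch of Wright's Law: cost(x) = cost_1 * x^(-b), where each
# doubling of cumulative units produced cuts unit cost by a fixed fraction.
# The 20% learning rate and $100k first-unit cost are illustrative assumptions.
import math

learning_rate = 0.20                       # 20% cost drop per doubling of output
b = -math.log2(1 - learning_rate)          # Wright's Law exponent (~0.32)
first_unit_cost = 100_000                  # arbitrary starting cost (USD)

for cumulative_units in (1, 1_000, 1_000_000):
    cost = first_unit_cost * cumulative_units ** (-b)
    print(f"{cumulative_units:>9,} units -> unit cost ~${cost:,.0f}")
# At a 20% learning rate, the millionth unit costs roughly 1% of the first.
```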

Edmund's avatar

My concern is not so much about the manufacturing process as about whether it's ever going to be cheaper to build and run, e.g., millions of farming robots than to hire cheap human manual labor, because the power to run the robots (counting both electricity production, batteries, and the friction cost of having the robots recharge regularly) is inescapably going to be *more expensive than the produce would be worth*. To my layman's understanding, you have to gamble that AGI is going to invent much more efficient batteries etc. to outfit the robots with, ones not reliant on rare earths in limited supply to boot. Maybe we'll get lucky, but it may equally be that there ain't no such animal.

(This isn't to say I'm a nothing-will-happen skeptic, but if anything, I would expect AGI to just skip past human-scale industrial robots directly to grey goo/bio-engineering.)

Mister_M's avatar

Not sure if your "large scale" only means "many robots" or if it means "many large robots", but if the former, then Eric Drexler and other nanotech people have done detailed physics analyses and concluded that we are many orders of magnitude away from physics limits for fast/efficient robotics manufacturing.

Edmund's avatar

Nanotech is a whole other thing, yeah; I'm talking about human-scale robots of the kind that we could imagine taking over ~all "dumb" physical labour currently done by humans, but who don't necessarily get us all the way to the true, radical Singularity the way nanotech would. I assumed that's what Scott was gesturing at when talking about advances in AI "automating the entire economy".

TGGP's avatar

We still haven't hit AGI or 2053, so Bio Anchors' original prediction could still be correct!

Ch Hi's avatar

The problem is that it didn't just predict the final date, but also the path to reach it, and the path is known to be wrong, because the growth rates of various parameters were misestimated. OTOH, if you adjust the parameters, it seems to work (TBD).

Thomas's avatar

As someone who has not been taken seriously, since 2011 I have been guessing 2035, and that remains my "80% chance it's happened by then" guess

Ch Hi's avatar

I was guessing full AGI in 2035 +/- 5 years. I've since reduced that to 2033. Unfortunately the uncertainty hasn't reduced much. I'd have expected the "+/- 5 years" to have been reduced, but I haven't been able to convince myself to do that.

David J Higgs's avatar

Pretty much same boat here (2032/33), albeit with even wider uncertainty :D

(could be as early as 2027 still, and I don't put astronomically low chance that Epoch 2040s estimate is right)

Kenny Easwaran's avatar

It might be worth including some earlier ones. In 1950, Turing predicted 30% success rate on a 5 minute Turing test by 2000, and it came by 2013 (though Turing was pretty spot on about forecasts of timelines to gigabyte memory and some other things). I think Marvin Minsky and Herbert Simon had some unrealistically fast forecasts in the 1960s before they realized how bad explicit algorithms are at many things and hadn’t yet learned to appreciate neural nets.

Richard Ngo's avatar

Notably missing Kurzweil and Legg, who are probably the two people who come off looking best out of everyone who's ever tried to predict AGI.

David J Higgs's avatar

True, though I think only Kurzweil would count as "a serious effort at prediction" in a similar way to Scott's other references.

Thomas Kehrenberg's avatar

One reason for Scott to specifically pay attention to this forecast is that he wrote about it 4 years ago: https://www.astralcodexten.com/p/biological-anchors-a-trick-that-might . So I don't think you can accuse him of cherry-picking (not sure whether that was your intention).

Lorenzo's avatar

I think this review underestimates how much things like the launch of ChatGPT changed AI investments and therefore timelines.

There are many universes in which OpenAI/Dario Amodei didn't push for GPT3, or OpenAI didn't release ChatGPT, or released it unsuccessfully (remember Meta's Galactica and BlenderBot 3?)

As of 2022 Ajeya gave ~15% to 2030 ( https://www.alignmentforum.org/posts/AfH2oPHCApdKicM4m/two-year-update-on-my-personal-ai-timelines ). NVIDIA stock is up more than 1,000% since then, which indicates that investment in compute alone is much higher than what was reasonable to expect at the time.

I would go as far as saying that a pre-ChatGPT model that didn't update massively after the post-ChatGPT AI boom would have been a much worse model, as the change in investment was quite significant. The fact that timelines moved earlier in response to a massive increase in investment doesn't seem to me to invalidate the model, which did give 10% by 2031

Likely it should have been even more uncertain, but there's a ton of hindsight bias that we should be mindful of.

Scott Alexander's avatar

I think it's totally reasonable to expect what you're saying to be true, but that actually Ajeya miraculously got this part right. There's a place in Bio Anchors where she predicts the size (in FLOPs) of the top model of 2025 and gets it almost exactly on the nose. I think she just figured that everyone in 2020 was being silly by not pouring all the money they had into compute immediately, and that within a year or two they would stop being silly.

Arbituram's avatar

I was very impressed with Ajeya's report in 2020 for its rigour, curiosity, and intellectual humility, but thought the (implied) investment point was simply a non-expert not differentiating between venture capital/tech and 'real money' investment markers, especially debt markets (which would be required to hit these scales.)

Ends up Ajeya was right and I was wrong, so time to eat my hat!

Greg G's avatar

I agree. If Cotra's work were a problem set, I think I would give her a 95%, with 5% off for not coming up with a better workaround for the lack of algorithmic progress data. As with any math problem, you can be mostly right but still get the wrong answer if one term doesn't work out.

Kenny Easwaran's avatar

Also, your “wrong” answer can still be more useful and more disciplined than all the vibes based things that currently feel like they’re closer. Taking her model and modifying the algorithmic progress rate is much more convincing to me than the vibes around playing with Claude Code.

Julián's avatar

the wildest thing about bio anchors getting it wrong isn't the error itself – it's that cotra basically built a sensitivity analysis, warned about the uncertainty, and people still treated the median as gospel. turns out the most important parameter (algorithmic progress) was the one she spent the least time on. classic lesson in where your uncertainty should actually live.

Ch Hi's avatar

IIUC, she didn't have much choice. There weren't many sources for good estimates of relevant algorithmic improvement.

Julián's avatar

that's a good point – the data scarcity is real. but that's kind of the meta-lesson: when your most consequential input has the least data behind it, maybe the honest move is wider confidence intervals, not a single median estimate that everyone treats as a countdown timer.

mikolysz's avatar

Re: algorithmic progress, I think it also deserves a "willingness to spend" factor. The better AI gets, the more we (as a society) are willing to spend on making AI algorithms better. This is not just big labs pouring money into researchers, but also more researchers deciding to enter the field because it's trendy, academic / grant-giving institutions dedicating more budget to AI departments, more companies doing weird-but-interesting things being fundable etc. The more shots-on-goal you have, the more likely you are to discover new techniques (think "flash attention" or "test-time compute", not "transformer alternative") quickly.

meeeewith4es's avatar

I think the possibility of "willingness to spend" going down is worth considering: training is compute-costly and makes ~no money, while inference is compute-cheap and makes money. We currently have several labs rapidly developing and training new models, each of which more or less replaces the last one, and the old model eventually gets retired; I doubt many of them get close to breaking even on training costs by that point. This isn't an indicator of future trends or of the inherent value of the experimentation happening now, but that won't be obvious to everyone. A large portion of future progress is likely to come through many more such iterations, and money from investors who are looking for faster financial returns may end up drying up ("willingness to spend" going down), slowing progress.

Arbituram's avatar

This is where my skepticism was most wrong: I dramatically underestimated the willingness of investors to follow through on big bets. Not just the tech companies and VCs, but boring debt investors' willingness to sink hundreds of billions of dollars into data centers. I was very wrong!

Lucid Horizon's avatar

It's nice when an elegant theory pans out, though a bit less nice when the theory points at humanity facing a potentially lethal reckoning in maybe a decade or two. Evidently Scott had kids despite this, though, so maybe I missed something.

Ch Hi's avatar

It's a threat, but it's also a promise. We don't know which way the Singularity will flop.

FWIW, I'm afraid of AGI, but I'm a lot more afraid of AGI being run by the kind of people that are in charge. To me alignment means that the first rule should be preservation of humanity. (Well, that's so oversimplified that it's false, but it conveys the general impression.)

Lucid Horizon's avatar

True, many of the people in charge view humanity as sort of evil or distasteful. Even more view specific demographics as being much more evil than others, which they've sometimes been able to put a number on when measuring AI biases.

Alex's avatar

You casually assume "lethal reckoning -> shouldn't have kids" when that's, like, a huge moral debate with many sides.

In particular surely you want young, well-educated, and strong-of-character people around *for* the lethal reckoning, do you not? Surely the argument is not: things are going to get hard later, therefore give up now?

And nevermind it being an argument. There's also just the fact that if human history indicates anything it's that people are going to keep trying to survive. Why would that stop now?

Kenny Easwaran's avatar

Also, even if it’s 80% chance of lethal reckoning where nothing matters, it’s worth having kids if you think their lives are valuable in those 20% of other futures.

Ekakytsat's avatar

Or if you're maximizing the expected number of surviving offspring (here # births * 0.2), as Darwin commands us.

Lucid Horizon's avatar

> In particular surely you want young, well-educated, and strong-of-character people around *for* the lethal reckoning, do you not?

At the timescales many AGI people are now predicting, the kids will be so young they won't have time to become well-educated or strong-of-character.

Peperulo's avatar

Maybe Scott is doing a László Polgár on his kids to make them AI alignment geniuses.

Jimmy's avatar

I don't see why a lethal reckoning would be a reason to not have kids. Everyone dies eventually, and they would die a bit sooner than usual, but they would have still had a chance to live. What they should really be worried about is *not* dying. Ever.

Carlos's avatar

I think it's likely that if the ASI decides to kill us, it's not gonna bother with making it painless, which changes the having kids equation by quite a bit, IMO.

Lucid Horizon's avatar

It is perhaps even more worrisome that if it decides not to kill us, or to prevent us from being killed, it won't bother with making that very painless either.

Asquil's avatar

I don't see that? Unless the ASI is actively interested in torturing us (which would require us getting alignment 99% right, per Yudkowsky), I don't see any realistic way for it to kill us that would be much worse than dying by itself. We humans are not an especially robust organism, we are pretty easy to kill quickly.

Jimmy's avatar

Almost every death is painful, so I don't see how that's relevant. It would be arguably less painful than the current average death, given how people these days have made a habit of prolonging their decay...

If your argument is that "bringing children into this world is wrong because they will die painfully", then the most sensible course of action would, ironically, be to destroy all life so they can no longer reproduce. So it's presumably not the argument you should be making.

Lucid Horizon's avatar

True, I should have emphasized this possibility.

Cjw's avatar

The kids will probably die at the same time the adults do, which is tragic but I guess at least you don't have to think about them wandering around scared and crying for mom and dad.

I don't have kids, but I sometimes think about our dogs being stuck in the house slowly starving to death after some targeted virus or whatever kills all the humans. I have to hope it happens during the daytime when the doggie door is accessible, they could probably dig under the fence and make it to the creek and have some chance.

Kveldred's avatar

Pretty charitable to Nostalgebraist, who was both wrong *summa summarum,* and wrong in specific predictions & mechanisms; "spiritually correct" is not how I'd describe "noticed one hinge-point the opposing model could've paid more attention to (but still *also* called it wrong, directionally)"...

...but I admit that I could be saying this just because I've disagreed with the fellow before on AI (I've been more optimistic about the economic impact thereof, more pessimistic upon humanity's x-risk therefrom) & upon social justice & IQ research (I disfavor the former & favor the latter, but he's occasionally argued in the opposite directions), etc.

[𝙚𝙙𝙞𝙩𝙚𝙙 𝘵𝘰 𝘳𝘦𝘮𝘰𝘷𝘦 𝘱𝘳𝘰𝘣𝘢𝘣𝘭𝘺 𝘪𝘯𝘤𝘰𝘳𝘳𝘦𝘤𝘵 𝘤𝘭𝘢𝘪𝘮 𝘢𝘣𝘰𝘶𝘵 𝘕𝘰𝘴𝘵𝘢𝘭𝘨𝘦𝘣𝘳𝘢𝘪𝘴𝘵'𝘴 𝘰𝘱𝘪𝘯𝘪𝘰𝘯𝘴 𝘶𝘱𝘰𝘯 𝘤𝘳𝘺𝘱𝘵𝘰 & 𝘛𝘳𝘶𝘮𝘱—𝘚𝘤𝘰𝘵𝘵 𝘣𝘢𝘥𝘦 𝘮𝘦 𝘴𝘶𝘱𝘱𝘰𝘳𝘵 𝘵𝘩𝘦𝘮, 𝘢𝘯𝘥 𝘐 𝘧𝘦𝘢𝘳 𝘐 𝘩𝘢𝘷𝘦 𝘧𝘢𝘪𝘭𝘦𝘥 𝘪𝘯 𝘵𝘩𝘢𝘵 𝘵𝘢𝘴𝘬: 𝘤𝘰𝘶𝘭𝘥 𝘧𝘪𝘯𝘥 𝘯𝘰 𝘳𝘦𝘤𝘰𝘳𝘥 𝘰𝘧 𝘦𝘪𝘵𝘩𝘦𝘳, 𝘢𝘱𝘢𝘳𝘵 𝘧𝘳𝘰𝘮 𝘰𝘯𝘭𝘺 𝘢 𝘧𝘦𝘸 𝘴𝘰𝘮𝘦𝘸𝘩𝘢𝘵 𝘴𝘶𝘨𝘨𝘦𝘴𝘵𝘪𝘷𝘦 𝘵𝘳𝘢𝘤𝘦𝘴 𝘰𝘧 𝘵𝘩𝘦 𝘭𝘢𝘵𝘵𝘦𝘳. pardon, N.!]

Scott Alexander's avatar

Where has Nostalgebraist written about Trump or crypto? Wondering whether I've missed it or whether you're thinking of a different person.

Kveldred's avatar

Okay, I think you may be right—I thought I recalled him offering the Left's usual sentiments upon crypto way back in like 2016 (scam, bubble, don't invest), but a brief search turns out absolutely nothing...

...and as for Trump, I seemed to recall a dialog between y'all on Tumblr or the comments on SSC, wherein you weren't so sanguine about Clinton's chances & he was considerably more-so; but upon searching for 𝘵𝘩𝘢𝘵 I find the following, which 𝘴𝘰𝘳𝘵 𝘰𝘧 supports a contention of "he thought Trump's chances in 2016 were substantially lower than they actually were"—not that that's much of a sin, nor an uncommon one at the time—but is a post-election reflection & close enough to my "remembered" dialog that it's probably what I was thinking of: https://slatestarscratchpad.tumblr.com/post/153069751926/nostalgebraist-slatestarscratchpad/amp

(𝘏𝘰𝘸𝘦𝘷𝘦𝘳, I will offer, as further support for my v̶i̶l̶e̶ ̶l̶i̶b̶e̶l̶'̶s̶ ̶ 𝘳𝘦𝘢𝘴𝘰𝘯𝘢𝘣𝘭𝘦 & 𝘵𝘦𝘮𝘱𝘦𝘳𝘦𝘥 𝘢𝘤𝘤𝘶𝘴𝘢𝘵𝘪𝘰𝘯'𝘴 being not 𝘵𝘰𝘵𝘢𝘭𝘭𝘺 implausible, the following: https://nostalgebraist.tumblr.com/post/140041878414/the-actual-story-here-is-just-someone-came-up ... okay, fine, I'm reaching now; I'll edit the original comment—pardon, Nostalgebraist! maybe our differences 𝘪𝘯 𝘳𝘦 political opinion have somewhat unjustly poisoned me against you–)

Herb Abrams's avatar

Cotra was part of a discussion in NYT a few weeks ago. She said AGI was very likely within 10 years but predicted limited impacts by 2030. So probably a prediction of AGI in 2033-2035ish?

https://www.nytimes.com/interactive/2026/02/02/opinion/ai-future-leading-thinkers-survey.html

Peter Defeel's avatar

> Moore’s Law breaks down, and so Moore’s Law didn’t end up mattering very much.

It has broken down. Unless you redefine it, which I expect will be happening in the replies to this.

The whole post was fairly verbose and hard to follow. I’m fairly dubious about this claim, though: “Since Cotra and Davidson were expecting AI to get 3.6x better every year, but it actually got 10.7x better every year”. It seems very definite, but my experience in using ChatGPT is that it’s getting better for sure; I just can’t say what 10.7 means in this context. I’m no doubt missing the expertise to understand what exactly 10.7 means, though.

Nevertheless I think AI will do a lot of damage to the economy even at the level it is now.

Scott Alexander's avatar

By 10.7x, I meant the amount of effective compute deployed for AI per year. I'll make that clearer in the post.

Ch Hi's avatar

Actually, I doubt that it's really broken down, but it has hit a bad patch. Usually at this kind of patch a new technology would show up rapidly, and I'm still expecting it to do so. But heat dissipation is a difficult problem, and the obvious answer is going full 3D. Some folks are talking about photonic chips, but I don't know how reasonable that is. Others are talking about spintronics. I've even read about somebody thinking they could use entanglement. However I expect SOMETHING within the next decade.

That said, it will clearly break down when we get to single atom switches. So we aren't many orders of magnitude above the limit. Possibly slowing down should be expected.

Michal Zušťák's avatar

"I’m fairly dubious about this claim though “Since Cotra and Davidson were expecting AI to get 3.6x better every year, but it actually got 10.7x better every year”, it seems very definite but my experience in using ChatGPT is that it’s getting better for sure, but I can’t say what 10.7 means in this context. I’m no doubt missing the expertise to understand what 10.7 exactly means though."

Try creating an app with Claude Opus 4.6. Now try creating it with GPT o3 from less than a year ago. You will see the difference.

Peter Defeel's avatar

I said I used both. There’s no way Claude is 10.7 times better. And Scott has already answered. That’s compute.

Michal Zušťák's avatar

"I said I used both. There’s no way Claude is 10.7 times better."

You did not say that in the post I replied to, and "no way" because you said so? How about one-shotting a working .html JavaScript 3D game vs. not being able to one-shot a working Space Invaders clone with the aliens literally being just squares? That is Opus 4.6 vs o3. And I mention o3 because, of the pre-GPT-5, pre-Claude-4 models, it was arguably the very best one.

At the risk of sounding abrasive, most people have no idea about LLM progress. They cry about 4o being retired because it was "personable" even though it could only one-shot Pong and a semi-functional Arkanoid.

Peter Defeel's avatar

Well, it’s two people saying so. You believe in a 10.7x increase in code generation; I don’t.

I don’t even know how you would measure that precise a figure. And you are arguing over something that Scott didn’t even say.

By the way, I think the vibe-coding excitement comes from people outside the industry working on toy problems.

Andrew Clough's avatar

Historically the phrase "Moore's Law" always referred to Dennard scaling as a whole, ever since it was coined at the 1975 conference where Dennard presented his paper. So we can say that it partially broke down in the mid-aughts. But the cost of compute is the important factor here and, if Kurzweil is to be believed, that's been falling steadily since the days of electromechanical relays and so is much safer by the Lindy effect. Even if transistors stopped getting smaller tomorrow, we can work on finding ways to make wafers of them cheaper.

temp_name's avatar

How do people here think about progress in coding automation so far?

My experience is that it's very good at either well-defined or commonly encountered tasks, but still far from feeling like true coding automation (i.e. humans no longer read code, any more than they read compiler-generated machine code now). Admittedly I'm not one of those power users throwing $2,000/month at coding agents, but I still try to use the latest model*, and it definitely still requires handholding.

Just today, I made it optimize a training loop, and it made some decent but unimpressive progress on its own. Then I asked it to try removing the batch loop and replacing it with tensor operations (no trivial job; I wasn't even sure it could be done in a reasonably clean way), and it solved that in a minute or two, which gave a massive speedup. So it's amazing at solving a given task, but perhaps not so good at judgement and intuition yet, since it seems obvious to me that removing a raw Python loop is one of the first things to do when optimizing PyTorch code.

*GPT-5.3-Codex, which I believe is the latest as of now, though I may be wrong
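For anyone curious, here's a minimal, hypothetical sketch of the kind of change I mean (not my actual training code): replacing a per-sample Python loop with one batched tensor operation.

```python
# Hypothetical toy example (not the real project code): the same computation
# written as a per-sample Python loop and as a single batched tensor op.
import torch

x = torch.randn(4096, 512)   # a batch of 4096 feature vectors
w = torch.randn(512, 128)    # a weight matrix

def loop_version(x, w):
    # One small matmul per sample, collected in a Python list.
    outs = []
    for i in range(x.shape[0]):
        outs.append(x[i] @ w)
    return torch.stack(outs)

def batched_version(x, w):
    # One large matmul over the whole batch; typically far faster on GPU.
    return x @ w

assert torch.allclose(loop_version(x, w), batched_version(x, w), atol=1e-4)
```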

Scott Alexander's avatar

For what it's worth, I can't code at all, literally never learned how, and I've still gotten AI to make lots of useful software products I want (mod to a Civ4 game, something that uses the API to let AIs hold conversations with each other, some scripts to generate graphs).

This was pretty painful with Opus 4.5 (a lot of "looks like there's a bug, would you like to keep trying until you solve it?") and almost seamless with 4.6 so far (though I haven't asked it to do anything too hard yet).

I'm still thinking of it in terms of time horizons - my experience kind of matches the naive version of the METR graph where it can do projects that would take an experienced human a few hours, but no more. Sometimes even me prodding it and helping it "organize" its "thoughts" is enough to help it do better even though I don't understand the code itself.

Greg G's avatar

I agree, judgment lags. It seems power users supply the intuition in ~1-5% of the time they would use to do the work themselves, so they get 95-99% automation from a time spent perspective. Definitely not from a value perspective yet. But much of this stuff seems trainable and/or otherwise automatable (e.g., give the agent an optimization.md that lists common things to try).

Mark's avatar

I am a professional SWE, 25 years experience, have worked in multiple FAANG companies fwiw. In the past month I have almost stopped using IDEs. I also feel like in many respects, I produce better code with agents than without, because the tradeoffs have changed: A test that would take half a day to write can now be done for almost 0 time cost.

It does still require very careful prompting and close supervision. But compared to only 3-6 months ago it's a huge, almost discontinuous jump.

Doug Summers Stay's avatar

This project at the time seemed to me to be arguing for a lower bound-- that even if you don't get any breakthroughs, you can still expect that by 2050 we get AGI. At the time, people took a lot of convincing that it could be so soon. So I see it as pulling in the right direction, but dragging along the weight of priors that AGI was farther off.

EngineOfCreation's avatar

> But later research demonstrated that the apparent speed of algorithmic progress varies by an order of magnitude based on whether you’re looking at an easy task (low-hanging fruit already picked) or a hard task (still lots of room to improve). AlexNet was an easy task, but pushing the frontier of AI is a hard task, so algorithmic progress in frontier AI has been faster than the AlexNet paper estimated.

Wouldn't that imply that the slowdown will also come to the hard task that still has low-hanging fruit, and that the low estimate won't be as bad as the numbers currently say?

Scott Alexander's avatar

Good question. I don't have a great sense of whether the hard task is "designing AGI", in which case current algorithmic progress should imply diminishing returns in the future, or whether the hard task is "making AIs level X intelligent", in which case there will be diminishing returns on how cheaply we can make any given AI, but non-diminishing returns on the overall task of advancing the AI frontier.

EngineOfCreation's avatar

>intuition pump: are we sure the average employee stays at an AI lab for more than a year? If not, that proves that a chain of people with sub-one-year time horizons can do valuable work

It doesn't prove that, because the works of consecutive employees are not independent of each other. People write papers and otherwise preserve their experience so their successors can make use of it, unlike LLMs that just forget everything that happened outside their context windows.

Edit: A chain of people can do valuable work on the same long-running project, yes, and that was proven long before AI labs existed. But that doesn't prove LLMs can do the same, unless you implicitly assume they're already as capable as the smartest humans on any given task.

Greg G's avatar

I don't think consecutive employees preserve information much better than consecutive agents at this point. In fact, perhaps they actually do worse because humans are less diligent about documentation on average. Even if they write a paper, often all of the work that goes into the paper gets lost.

Current agents leave plenty of good artifacts behind, and subsequent agents are good at picking up this info. So in many cases, I do think a chain of sub-one-year agents can make good progress. It depends on the complexity of the task and how likely the agents are to take wrong turns or get "confused", but it definitely seems possible in theory.

Viliam's avatar

You could tell the LLM to write some key points into documents that will be available for its successors. (I am not saying that it would work reliably, just that I see a possible way.)

Casey Milkweed's avatar

Thanks for the recap! As you noted: (1) Cotra framed the problem accurately, (2) Cotra's forecasting error came from one flawed parameter, and (3) Cotra flagged that parameter as not having gotten much TLC.

In other words, Cotra gave readers an effective scaffolding for thinking about the problem and accurately guided future researchers toward a high-ROI research area for improving the forecast. In a better world, someone would have seen that Cotra had struggled with that parameter and stepped in to help. Maybe if Ajeya Cotra had had an identical twin sister who could have helped invest some additional hours into the project, then we would have gotten an accurate AI forecast.

I feel like the lesson of this is that we just didn't have enough Ajeya Cotras.

Arbituram's avatar

Yes, if anything the report looks ever more impressive with time! The original report materially shortened my own timelines, and in a very real way, if she had been 'right' about algorithmic progress I wouldn't have believed it and would have discounted the report more generally.

Daniel Kokotajlo's avatar

+1. Note that I was inspired to think a lot more about AGI timelines in large part by reading Ajeya's report, and the report was one of the main things that made me update from 2050ish timelines to 2030ish timelines (!!!) Because I disagreed with some of the parameter values in her model, and when I plugged in my preferred values, got 2030ish.

"Effective scaffolding for thinking about the problem" exactly. Me, Leopold (author of Situational Awareness) and many other people seem to have been heavily influenced by her framework for thinking about the problem.

Aris C's avatar

What I really struggle to understand is the disconnect between the enthusiasm from some power users (including Zvi) and the experience of regular folks.

Like, as a casual user, I definitely notice improvements in LLMs. But I also very quickly run into really dumb behaviours, hallucinations, bugs in their code, etc.

Similarly, engineers on X are like yeah, I don't do any manual coding anymore and I one-shotted GTA 7 in 6 hours. Developers I know say that LLMs help them at work, but are nowhere near automating their jobs.

So... how do we bridge these?

Xpym's avatar

The main theory seems to be that LLMs provide an illusion of productivity increase (tons of slop to tinker with), but no actual increase in useful output.

Aris C's avatar

I get that, I read that study from last year. But I'm not talking about mere enthusiasm... I'm talking about people - intelligent people! People like Scott here! - seeming to think we're close to AGI already, when even advanced models still stumble with basic things, like not making stuff up.

Xpym's avatar

The load-bearing piece of that worldview seems to be the METR graph of "task-completion time horizon", and the general belief in "straight lines on graphs continuing indefinitely". I remember Scott saying in a post that this belief is "conservative", which you should default to in the absence of strong arguments otherwise. I think this position has merits, but disagree about the strength of counter-arguments.

Scott Alexander's avatar

I don't find the fact that sometimes their code has bugs to be that interesting. It's like the joke about the guy who teaches a dog to write symphonies, and a spectator complains that they're uninspiring and derivative. Except you also have to imagine that the dog's IQ is doubling every year.

But maybe it's also a difference in how we're using it. Which AI models (including version number) have you used, and what kind of tasks did you set them?

Aris C's avatar

It's a little hard to put this into words, but let's see how much we can strain your analogy. Let's say the dog is like GPT when it first came out. He plays a simple tune, and everyone goes crazy. Yes, fair, a dog playing any kind of tune is a marvel.

But I, a high-brow music critic, come and say to you, hmph, the music is uninspiring and derivative. And btw, it's not even right; he plays the wrong notes even for basic melodies.

Just you wait, you tell me. And true enough, next month the dog plays longer, more complex pieces. See? you say. He improved! At this rate, he will write original, correct music in no time. But my objection is different: there's something qualitative lacking in your dog's composition. Even if his next piece is 12 hours long, it will still lack the depth of true original composition. And by the way, though he can now play Mozart, he still stumbles at Twinkle Twinkle Little Star. And if I ask him to do something simple in what feels like a similar field - e.g. a basic ballet step - he stumbles and falls.

So no, despite his playing longer and more complex pieces, I don't see a *qualitative* step up in his performance. You think you do, because in humans this qualitative performance almost always comes together with more objective improvement.

Does all this make any sense?

Scott Alexander's avatar

I sort of see what you're getting at, but it doesn't seem to match reality to me. I don't understand what qualitative thing you think they're lacking.

I agree they don't have the exact same skill profile as humans (your "can play Mozart, but not Twinkle Twinkle Little Star" is apt). But I would qualify it with:

- The ceiling (the most impressive thing AIs can do) is constantly rising.

- The floor (the least impressive thing AIs can't do) is constantly rising too. The past favorite examples of weirdly easy things AIs failed at - adding 2+2, predicting real-world situations like colliding balls or dropping water, drawing hands, combining text + images, etc - eventually fell.

So if we're asking some specific question, like whether the AI dog will ever be able to play a Liszt piano solo, it still seems like we should bet on yes. They'll do it in some weird way that doesn't exactly correspond to how humans would do it - they might be able to do the hardest parts first, and not be able to do the easy parts until later - but they'll eventually get it.

To make sure we're on the same page, here's a conversation I had with an AI when I was writing this post - https://claude.ai/share/270a4c7a-46b9-455a-a01e-ba07feadc304 . Is this the kind of conversation that you've had with AIs? Do you still feel like something fundamental is lacking here and you're not "enthusiastic"?

Aris C's avatar

First off, I like how polite you are with Claude.

I have low confidence in my thinking here, but to answer your questions: I think AI can and will get very good at some specific things that it's being actively trained to do - coding, certainly, and anything else we treat as a benchmark.

But what it doesn't seem to be making progress at is the G in AGI. Every time we raise the floor, it seems we did it by ensuring AI wouldn't fail in the same embarrassing ways as last time - OpenAI wouldn't release GPT 5 without making damn sure it can answer 2+2. But that doesn't mean the floor was actually raised. It gets 2+2 right, but it will fail at something else random.

To give specific examples: I asked Gemini 3 for recommendations on essays I might like to read. I specifically told it I don't like Malcolm Gladwell; it gave me a Gladwell essay, but claimed it was written by someone else.

Or, I asked Gemini 3, Sonnet 4.5 and GPT 5.2 to analyse the lyrics of a Greek song. They all made up the lyrics, and even when they found the correct ones, they gave a very poor analysis. Which to me makes the point: their 'intelligence' is nowhere near general. They do well on things they've been trained on. Throw them a foreign language on a subject that doesn't feature in their training, and they fail badly.

Scott Alexander's avatar

I'm not entirely sure, but I think that's only partly true. That is, I think they probably did hard-code in a hack "don't get 2+2 wrong" just in case, but also that, as models got bigger, they were naturally less likely to fail at math, drawing hands, etc. - I think some of this is transfer learning from whatever they're getting trained on. And I don't think they get new error modes each time they solve an old one. I think they're just getting gradually better. Cf. https://www.goodreads.com/quotes/7169960-the-road-to-wisdom-the-road-to-wisdom----well

I'm surprised by the Greek song thing - just so we're working from the same data, can you tell me which song it was, and I can try with my AI and see what happens?

Aris C's avatar

Sure! Tell them to analyse Τα παιδιά κάτω στον κάμπο

Michael's avatar

Just FYI, I can make it fail at arithmetic by telling it to treat llllllllllll and lllllllllllllllllllllll as someone keeping score in a soccer game and then adding the two scores. I think you could also patch this, but it's tricky to do offhand because even humans can't “visually” count more than 5 objects. I suspect it fails for the same reason.

Ch Hi's avatar

It's definitely true that AIs get better a lot more quickly on tasks where there is reliable feedback. Of course, the same is true of both amoebas and of humans.

Part of the problem is that current publicly accessible AIs have very limited ability to learn from feedback. This is to solve the problem that Microsoft Tay encountered...feedback intentionally designed to corrupt them. It's a very limiting answer, but it sort of works for the purpose.

Cjw's avatar

I think the training data on things that are hyper-local to cultures/nations other than English- or Chinese-speaking ones may just be really awful. When I have used ordinary web search to try to get info about Danish and Norwegian folk music I will end up at best on a trail to a non-English wiki stub or maybe a reddit posting that references the thing, and it will lead you down a trail to some person or work that seems like it should be really important but about which it's hard to confirm much more than its existence. And I'm not even talking deep stuff here, it's as if you were looking into American Romanticism but had trouble pinning down the existence, content, or even the precise *name* of "Twice-Told Tales". If it's training off available English-language web sources, or from citations and research trees to books available in English, I'm afraid the future is going to be missing things of significant value in the history of non-English cultures.

B Civil's avatar

I think you could find any number of people in the category of “general” who wouldn’t even understand Greek to begin with, unless of course they were taught. I feel this kind of analogy keeps popping up. Human beings don’t do much out of the gate; they have to be taught or apply themselves to learning. There is also generally a limit to what you can teach a toddler. (Although because little children have bodies and mobility, they can learn a lot of things all by themselves. I honestly don’t know what would be analogous to an AI at this point in their development.) I am becoming more convinced that once you can really teach an AI the minutiae of anything we *know* (the simulacra of our knowledge that is present in our “corpus”), it will be perfectly capable of emulating a general intelligence in a way that completely mollifies our concerns about its capabilities, but still leaves us with the uneasy feeling that there is missing authenticity. Hmm.

I am still not sure if we are talking about a dog playing the violin or a dog writing sheet music. Or perhaps in the interest of “generality” we are talking about a dog that can do both. The strangest thing to me about this whole question is that the physical form of a dog does not lend itself to any musical instrument that I can think of off the top of my head. In other words, the problem that looms larger here to me is robotics, not intelligence – authentic or otherwise. I am entirely willing to believe that in the not too distant future an AI will be able to output the sheet music for a symphony which - interpreted by human beings with their own skills and feelings - might be quite interesting. It is all in the performance isn’t it? On the other hand I have a lot of trouble believing that we will anytime soon see a robotic body inhabited by an AI that would be even able to pick up a violin without crushing it. I exaggerate, but you know what I mean; the delicacy of manipulation in playing a musical instrument is incredibly complex, and if you go and see the same musician perform a few times you will quickly realize that it’s never quite the same. I think the Grateful Dead are the most famous band for this but lots of people listen to piano concertos played by several different pianists, so it generalizes.

My main point in all this is that when it comes to intelligence (in a purely conceptual sense, not a physical one), emulation is sufficient. Authenticity has an audience of one.

Aris C's avatar

The point is that a human who can analyse literature can analyse a book from any language (if translated of course!) even if they haven't come across it before.

vectro's avatar

Just a note that we already have robots that can play the violin.

I get the sense that there is a greater disconnect between expectations and the frontier reality when it comes to robotics than when it comes to LLMs. Yes, there is much they can’t do, but the frontier is still quite impressive.

Kindly's avatar

You just have to know how to prompt the dog correctly. Ask it to play the very simplest version of Mozart's twelve variations, and you get it playing twinkle, twinkle, little star perfectly.

Aris C's avatar

After all, the success of the band comes down to the maestro's virtuosity and vision.

B Civil's avatar

Hang on, is the dog in question writing music or playing music?

B Civil's avatar

>derivative

hah!!! There is a shaggy dog story to be written there. “A dog walks into a bar…“ Is one of my favorite forms of jokes.

Charlie Sanders's avatar

How much money are you currently paying for the models that you're noticing these issues with? If the answer is zero, then there's your primary culprit.

Aris C's avatar

This would be true if these same (now outdated) models hadn't generated the same level of hype when they were first introduced.

Mark's avatar

Adoption among engineers is uneven, and you're seeing a mix of different things:

- there are some idiots producing tons of slop

- there are some careful, professional engineers who have figured out how to get these things to do large amounts of useful work

- there are some careful, professional engineers who haven't made this leap, or can't yet make it due to some aspect of their work

Now, realize that anywhere you go, you are getting viewpoints from a highly selected sampling of the above types.

Catmint's avatar

This pretty much matches my experience as a paid, professional engineer who has figured out how to get it to do small amounts of useful work. I probably could have gotten it to do more, but haven't tried, because I work in game dev and the public opinion against AI is quite strong.

Benjamin's avatar

The less of an expert you are, the more you will be impressed: https://xkcd.com/2501/.

However, with Claude Code even experts are getting very impressed. I can only give you my perspective as a bioinformatics PhD candidate with a computer science master's, and I don't write any code manually anymore. Claude Code with 4.5 and 4.6 was a huge shift.

If you are not using Claude Code or Codex you get way worse results; coding is a completely different experience. Once you are using them, you can probably make a similar jump by using them well (mostly by asking them to double-check everything, test everything, and have other subagents check their work), getting the per-step success rate up from roughly 0.8 to 0.95 or 0.99, which then lets you chain together 5 or 10 of these steps while keeping an acceptable overall success rate. You can also always start a second team of agents (or four) and compare the results at the end; it's all pretty cheap. Act as if you had infinite white-collar workers at hand and just wanted to maximize the success rate of whatever you are doing.
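To make that chaining arithmetic concrete, here's a quick illustrative sketch (made-up rates, not measurements of any particular model):

```python
# Illustrative only: how per-step reliability compounds across a chain of
# agent steps, assuming the steps succeed or fail independently.
def chain_success(per_step: float, steps: int) -> float:
    """Probability that all `steps` independent steps succeed."""
    return per_step ** steps

for per_step in (0.80, 0.95, 0.99):
    print(f"per-step {per_step:.2f}: "
          f"5 steps -> {chain_success(per_step, 5):.2f}, "
          f"10 steps -> {chain_success(per_step, 10):.2f}")

# per-step 0.80: 5 steps -> 0.33, 10 steps -> 0.11
# per-step 0.95: 5 steps -> 0.77, 10 steps -> 0.60
# per-step 0.99: 5 steps -> 0.95, 10 steps -> 0.90
```

That's why pushing the per-step rate from 0.8 toward 0.99 matters so much more than it looks.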

But nothing much changed from 2 years ago. The point isn't whether we are wet right now, the point is the coming tsunami. That predictably you will be impressed at some point in the future.

Performative Bafflement's avatar

> What I really struggle to understand is the disconnect between the enthusiasm from some power users (including Zvi) and the experience of regular folks.

Regular folks are often not using the smartest paid models: 90%+ are on the garbage free tier, another 8% or so are on $20 tiers, and maybe 2% actually use the $200+ (pro tier) to thousands-per-month (i.e. fully scaffolded persistent agents using a bunch of API calls) tier of agents, and the models get smarter at each paid step.

Then similarly, the level of effort that people are willing to put in scales basically the same way. It's the 90/9/1 law of the internet - 90% lurk, 9% comment, 1% produce. 90% of people using AI are "lurking" in the sense that the most effort they put in is "lolcatz fgame plz," maybe 9% will actually spend time prompting and iterating, and 1% will spend time conceiving and specifying a thought-out framework / architecture with fully specced tests before having the agent write anything specifically towards the test outcomes, thereby keeping it on track, and auditing the code produced.

Either one of those two things suffices to explain the gap you see; now combine both of them. In my own opinion, the ones who think the AIs are still dumb today are telling on themselves.

Aris C's avatar

I don't find this very convincing. First off, there's always been this level of hype, even for older models that are much inferior to the current free-tier ones. Second, see the conversations above. Granted, better prompting drastically improves results - but the fact that some simple questions require careful prompting, when much harder ones don't, shows that something about the models is off.

Performative Bafflement's avatar

Yeah sorry, when I posted my reply, for some reason I didn't see any high level replies to you, and now I see there's a ton, not sure whether that was a me problem or a substack problem.

> Granted, better prompting drastically improves results - but the fact that some simple questions require careful prompting, when much harder ones don't, shows that something about the models is off.

I think this is just largely irrelevant - it's basically a gradient, where already at the top, skilled people can do the work of 5-10 people using AI agents, and that's obviously going to expand as capabilities, scaffolding, fine tuning, and agent infrastructure all improve.

Something being "off" in the models once we're at this level just basically doesn't matter, because it will 100% be worked around enough to materially impact hiring and comp, even if everything froze at today's capabilities. And of course, we're not frozen, and we're about to 10,000x computing capacity over the next 5 years.

Aris C's avatar

My point though is that this something that's off hasn't really improved... So I'm not convinced that it will.

As for skilled users being 5x as productive... Tbh that's true without AI - people who are good at what they do are already more than 5x as productive as their peers.

Freddie deBoer's avatar

Are you going to write a follow up where you acknowledge that Moltbook is fake?

https://www.technologyreview.com/2026/02/06/1132448/moltbook-was-peak-ai-theater/

Scott Alexander's avatar

The linked article points out that Andrej Karpathy shared a post that turned out to be literally fake (as in, written by a human), then pontificates on how this is "a window into our obsession with AI", and then the usual stuff about how obviously all AI must be fake because it's just "pattern-matching".

I didn't treat the Karpathy post mentioned as real, and I'm still happy with how I discussed the degree to which the AIs were following human orders rather than acting autonomously, and the philosophy around to what degree AIs pattern-match vs. think. If you can find anything I wrote which was later found to be false, in an "actually false" sense rather than just contradicted by the vibes of an article by some random journalist, I'm happy to change it.

Freddie deBoer's avatar

I mean that a) a huge number of the posts that went viral and drummed up belief that the bot network was sentient were written by humans who were trolling or selling something, and b) the whole thing was gamed far more dramatically than people think, to wit:

“Humans are involved at every step of the process. From setup to prompting to publishing, nothing happens without explicit human direction.

Humans must create and verify their bots’ accounts and provide the prompts for how they want a bot to behave. The agents do not do anything that they haven’t been prompted to do. “There’s no emergent autonomy happening behind the scenes,” says Greyling.

“This is why the popular narrative around Moltbook misses the mark,” he adds. “Some portray it as a space where AI agents form a society of their own, free from human involvement. The reality is much more mundane.”

Scott Alexander's avatar

> "A huge number of the posts that went viral and drummed up belief that the bot network was sentient were written by humans who were trolling or selling something".

I clearly marked when I thought this was happening and when it wasn't. You can read the post, where I say of various comments "This looks like it was written by a human", "This one seems plausibly really AI", or "Here's a statement by the human involved saying they didn't write it, here's why I do vs. don't believe them".

> "Humans are involved at every step of the process. From setup to prompting to publishing, nothing happens without explicit human direction."

This is exaggerated-to-false. Humans have to tell their bots to use Moltbook in the first place, but a single prompt like "Log into Moltbook and post whatever comments you find interesting and appropriate" will cause a bot to do that continuously or until its operator stops it. I discussed in the post how I gave my AI a prompt like this and it did it, posting a comment that I didn't suggest on a thread that I hadn't read. If you've ever read anything else about AI-to-AI communication (eg Janus), you'll know that a lot of what was written was completely typical and doesn't need any more sinister explanation.

> "Humans must create and verify their bots’ accounts and provide the prompts for how they want a bot to behave."

You're acting like this is some kind of conspiracy! The front page of the site tells you how to log in and verify your bot!

It's true that humans have to be the ones to tell their AIs to use Moltbook, something that was an obvious premise of the site. And it's true that deceptive humans can then guide their AIs to write specific things and then claim the AIs wrote it. But it's also true that many of the bots on Moltbook were just given a prompt of "go on Moltbook and do interesting things", and decided what to do on their own.

I explained all of this in the original posts, in great detail, going through discussions of which humans used which prompts and why I thought specific posts were bots doing their own things vs. humans. Did you read the posts? It sounds like you're responding to your caricature of what the world's most naive person would have written about Moltbook, and not to anything I said.

Victor Levoso's avatar

Note that the link in the post doesn't claim it was written by a human, just that it was advertising something the human owner made.

In fact I would guess it was likely actually written by the AI, with the human telling it to.

And the app might also be vibecoded, so it could totally have been made by the AI agent too.

It feels like people want to believe Moltbook is fake to completely implausible levels. I saw a likely AI-generated post from a joke account claiming to have written the post Karpathy talked about recently, and people were believing it; and people seem to think stuff like AIs talking about consciousness unprompted is obviously fake, when it's one of the main things that will predictably happen when you get Claudes to talk to each other.

TGGP's avatar

We still haven't hit AGI so we don't know what the right forecast for its time would be.

Kenny Easwaran's avatar

And perhaps more importantly, we are getting to powerful enough capacities now (including ones like general language use, that had been said to be “AI complete” in the past!) that the idea of “AGI” is actually starting to fray a bit.

Dirichlet-to-Neumann's avatar

My personal takeaway from the story is that the bio-anchors paper was right about almost everything. If you had blindly followed its estimate, you would have been much more right about the world than almost everyone else (including myself).

Michael's avatar

Great summary! One small technical point: since the growth is compounding, it might be more accurate to call it a constant factor or exponential growth. A 'constant rate' usually implies linear growth, but here the numbers are doubling on a fixed schedule.
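To illustrate the distinction with made-up numbers (purely a toy contrast, not the report's figures):

```python
# Toy contrast between the two growth shapes being discussed:
# adding a fixed amount per year (linear) vs. multiplying by a fixed
# factor per year (exponential / compounding). Numbers are made up.
start, years = 1.0, 5
linear      = [round(start + 2.0 * t, 2) for t in range(years + 1)]   # +2 each year
compounding = [round(start * 3.6 ** t, 2) for t in range(years + 1)]  # x3.6 each year
print(linear)       # [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
print(compounding)  # [1.0, 3.6, 12.96, 46.66, 167.96, 604.66]
```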

Eliezer Yudkowsky's avatar

Inference-time compute and also the entire RL phase (especially over CoT) is exactly the sort of thing I was gesturing at all the times I talked about other algorithms consuming compute a different way.

Paul Christiano and I were actually having that conversation in the 2010s, and I did not spell it out in my critique (a) because there were many other possibilities for more efficient compute consumption besides just those and (b) as obvious as those ideas were to many, I did not want to call any more attention to them. The conversation between Paul and myself had been about how Paul's whole plan involved AI builders carefully using only imitation learning and I was like "Paul they'll just apply RL over successful endpoints of reasoning" (RL over CoT in modern lingo) and Paul was like "We've got to get the human-imitation method to being within 10% as efficient" and I was like "what do you think is the probability you can do that" and Paul was like "50% we can do that or have a regulation" and I was like "No".

Scott Alexander's avatar

Thanks, I've slightly edited that section of the post.

I was uncertain about whether inference-time compute should count as a paradigm shift, but I was pushed against counting it as one by your sentence that "I'd consider, say, improved mixture-of-experts techniques that actually work, to be very much within the deep learning paradigm". Inference-time compute also seems within the deep learning paradigm, so I figured you were talking about something stronger.

Eliezer Yudkowsky's avatar

I'd usually call this a reading comprehension fail, but if even You did not get it, it's clearly a writing fail on my part.

But with that said:

"OpenPhil: But if you expect the next paradigm shift to happen in around 2040, shouldn't you confidently predict that AGI has to arrive after 2040, because, without that paradigm shift, we'd have to produce AGI using deep learning paradigms, and in that case our own calculation would apply saying that 2040 is relatively early?

Eliezer: No, because I'd consider, say, improved mixture-of-experts techniques that actually work, to be very much within the deep learning paradigm; and even a relatively small paradigm shift like that would obviate your calculations, if it produced a more drastic speedup than halving the computational cost over two years."

This is meant to say: Things *on the order of improved MoE*, that are *still within the deep learning paradigm*, suffice to break the OpenPhil calculations. Inference-time compute is, similarly, something that didn't leave the paradigm, but sufficed to break the OpenPhil calculations. You don't need to leave the DL paradigm to eat compute more efficiently than GPT-2 / the bio-anchors paradigm! The bio-anchors paradigm is *much narrower* than the DL paradigm! That was the largest and earliest element of the huge disjunctive path to Ajeya's prediction failure.

Laplace's avatar

I was pretty surprised how long it took them to get the RL going on LLMs. I remember expecting it to start a few years earlier than it did. When I first saw that CoT somewhat worked to boost capabilities out of the box, even though that model had never even been trained on more than one forward pass, I figured OpenAI must be trying to use RL on it at that very moment.

I still don't really understand what the delay was about. I assume all the labs did try and ran into some barrier, but I don't know what that barrier was. For a while I figured there maybe was some very fancy trick to making the RL work right, but then DeepSeek published papers showing how they did their RL, and nothing in there looked particularly fancy to me.

I didn't talk about all this publicly back then because I didn't want to draw any more attention to the whole thing, but the damage is done now.

Tibor's avatar

I would be very curious about what Scott and others here who are bullish on AGI soon think about this article by Adam Mastroianni https://www.experimental-history.com/p/bag-of-words-have-mercy-on-us

To summarize - he uses a nice non-technical metaphor to talk about how LLMs work and I find it brilliant. I myself work in ML/AI* and I do understand the principles of how transformers work** but I find that metaphor great for non-technical people and also in general to think about what these algorithms actually do and can do.

Stated from this perspective, LLMs become a lot less magical and less like an actual intelligence. They are still impressive and useful but clearly have limitations. It also makes some achievements of those models, like acing MIT tests, a lot less amazing than they might seem at first glance, because ultimately the model is just good where there is a lot of data it can recombine a bit to produce results similar to the many examples it has seen. When it doesn't have loads of examples and there are important nuances (or the nuances simply outweigh the training-data examples), it still gets things very wrong, and sometimes trivially wrong. And I believe this is because of fundamental limits of how current LLMs work; scaling and tweaks to algorithms only get you so far - I disagree that what we are seeing in at least the last year or 18 months amounts to solving hard problems, it is mostly tweaks and optimisations to existing architectures AFAICT. Now the results might still be impressive and somewhat disruptive, but I think that to get a qualitative jump (which you need for human-level intelligence), you will need a new paradigm. But that might come in 5 years or it might come in 100 years. I am skeptical about how you can estimate a timeline for that, much like it was very hard to estimate in the 1900s (or even the 1920s) when we might get modern computer architecture. It is easier to predict if the invention is something we understand at least theoretically (like heavier-than-air flight). But as long as you don't believe that AGI is just scaling transformers enough, we don't have a good understanding of what AGI would look like.

That said, I think it makes sense to spend some time on AI safety even with current models, because even the current fairly stupid AI can be dangerous in some situations, and hopefully it will tell us at least something about alignment of the real AI when it comes (although note that many of the ideas Yudkowsky et al. had about AI alignment 10 years ago are not really applicable to the current AI architectures and so ultimately not that useful).

*I am very applied, not doing R&D, and rarely training any models these days any more unless it is a non-LLM project ...

**although I am not keeping track of all the tweaks and upgrades of current LLMs ... much of that is not public anyway

Scott Alexander's avatar

I don't understand the whole perspective that produces that article. Even though Mastroianni is smart and sophisticated and would deny this, it still seems like he thinks of the human brain as some kind of magical disembodied Pure Intelligence - and then since AIs are merely doing math and statistics and data processing and stuff, they can never truly equal the human brain.

Everyone is doing math and statistics and data processing! You also wouldn't be very smart if you had never heard any words or gotten any training data or seen anything in the world! The claim that humans can extrapolate to novel situations but AIs can't is, as far as I can tell, totally wrong. Quick, try to think up an entirely novel concept that has never been thought of before! Either you can't, or else you'll say something like "Uh . . . a green dog with purple wings, dancing the hula", which is obviously just a combination of things you've seen before (ie your training data). The difference between humans and AIs is that humans are still (for now) smarter and able to recombine their training data in more interesting ways.

As for hallucination, it's not exactly correct, but you would be better off thinking of these as lies than hallucinations - if you reward an AI for lying enough, it will do this. This is no mark against the AI, it's a mark against your training procedure. But even if you took it literally as a hallucination, fine - humans also hallucinate! We do so only very rarely, when something is broken, but that's because evolution produced lots of great mechanisms to keep our natural tendency to hallucination in check. People are gradually working on mechanisms to control AI hallucinations, and I think I hear they're at about 1/5th the rate of a year or two ago.

Generally, for any claim that AI is just pattern-matching or just math, I think it's helpful to think about what this is claiming about humans, and whether humans display whatever failure mode the person is accusing the AI of displaying. I have a longer post about this at https://www.astralcodexten.com/p/what-is-man-that-thou-art-mindful , and the following posts are also relevant:

https://slatestarcodex.com/2019/02/28/meaningful/

https://www.astralcodexten.com/p/somewhat-contra-marcus-on-ai-scaling

https://slatestarcodex.com/2019/02/19/gpt-2-as-step-toward-general-intelligence/

https://slatestarcodex.com/2017/09/05/book-review-surfing-uncertainty/

I'm not claiming there are zero differences between AIs and humans, but these differences look more like "AI doesn't have continuous memory yet, because it's really hard to get the weights right without catastrophic forgetting" and not like "AI doesn't have the magical spark of True Intelligence, it's just doing statistical computation".

I'm also not sure what you mean by "it's not solving hard problems" - it's winning mathematics olympiads. I don't think it's helpful (at least in this regime) to keep worrying about *the dumbest thing AI can't do* - the transformationalness of AI will come from what's *the smartest thing it can do*. That is, if AI is amazing at math but terrible at making peanut butter sandwiches, it could revolutionize our knowledge of math (and we just won't use it for the sandwiches). It's true that AI can't completely automate all work (or kill humanity) while it still has weaknesses, but it can still do a lot!

Tibor's avatar

I would say that basically what LLMs do is recombination of existing ideas based on statistics. Now, most human thought is also like this and it is still a very powerful method that can give you very interesting results, even in science. This is how you get all the AI-assisted results in pharma for example. Those are hard to produce by humans but mostly for reasons of speed. I feel that coding is basically the same, at least right now - it is very good at standard boilerplate you don't want to write. It is pretty bad at software architecture, although I think even the current AI model architecture can probably get over that hurdle to a degree, since software architecture, or at least most software architecture, isn't all that novel either.

But I do think there are instances of human thought which are not like this. Current AI is not going to produce anything like cubism if you only ever train it on 19th century and earlier data. Picasso did. Same with new concepts in maths or physics. I don't think AI could come up with Riemannian geometry (and subsequently the theory of relativity and GPS) because it is just very different from existing results. Sometimes you do have these sparks which propel things forward by a giant leap (to be filled by a lot of incremental research, which I do think transformers can either eventually do on their own or at least help with considerably, which they do now already). I don't think there is something magical about the human brain. But I do believe that something is missing in the transformer architecture, something that can generate these sparks. If I knew what it was, I would be in the business of designing such an AI architecture. Sadly this particular spark is missing in my brain :-)

Perhaps you are right and all of these sparks are also just recombinations of existing knowledge but on a much higher level of abstraction so that if you are not Gauss or Bach or Picasso, they feel like a spark out of nowhere (maybe they feel that way even if you are Gauss, Bach or Picasso) and this is all just a matter of scaling. Even if it is like this there is a significant problem with scaling. Unlike humans, current AI models cannot learn from 1 or 2 examples. They need so many in their training data to actually shift their weights and more so the larger they are. I know there are ways to generate synthetic training data from a few examples but those don't work nearly as well. Even if groundbreaking results in science and arts are just very smart recombinations, the human Gauss can notice after seeing one example (perhaps even by chance) and develop that, the transformer-based AI can't. Maybe this is an easier problem than I think but if it is not and you really do need a fundamentally different architecture to scale this well, then I stand by my point that arrival of such an AI is extremely hard to predict.

Maths olympiads are not an interesting problem IMO exactly for those reasons - they are problems with a lot of training data and do not contain anything really novel. Same as a lot of actual research - which is why already this ability is really cool and useful. But not top human level output.

I agree with you that we should not focus on blunders and AI stupidity (not in this context anyway ... in other contexts like autonomous fighter drones those blunders are extremely scary, but in a different way than human-level AI). But if we look at the top of AI performance, I don't see hints of the kind of novel human invention which some humans sometimes produce. Even if the pattern is usually "a person has 1 novel idea and then keeps iterating on that". Transformers can do the iteration bit but I've yet to see that novel part.

I do not even want to say that AI cannot ever produce anything that nobody has seen before - I am aware of the fairly unique AI art that you see sometimes, but those things are still just "I have 1 billion examples of A and 1 billion examples of B and I create C by mixing them up" in the same way that cubism or microtonal music probably aren't.

Scott Alexander's avatar

> "But I do think there are instances of human thought which are not like this. Current AI is not going to produce anything like cubism if you only ever train it on 19th century and earlier data. Picasso did."

See https://slatestarcodex.com/2014/08/06/random-noise-is-our-most-valuable-resource/ for my thoughts on this. I think it's unlikely that the human brain evolved a latent ability ("go beyond training data") which is only used by one human per decade or so. More likely it is an extremely clever combination of existing ideas and randomness. I hope it's no insult to Picasso to say that Cubism is just "art, but in a weird perspective mix of 2D and 3D, with complex figures reduced to simple shapes", or something else which is a combination of ideas we already have. It took a genius to figure out that this would be good and interesting, but it's still made up of conceptual primitives like everything else.

If I wanted to train an AI to invent new artistic styles, I think I would first train it to appreciate good art (???), then have it make slight perturbations of existing art styles a million times and descend in some direction that it thought was good. I think humans who invent new good art styles are doing something like this, but with the "try one million things" distributed across an entire civilization and, like, people's weird dreams and stuff.

AIs needing more examples than humans is complicated, because the proper analogy for the pretraining phase is human evolution, where humans had to get millions of examples to learn really basic stuff (it was just that we happened to be lizards at the time). During in-context learning, AIs can learn things from one or two examples. The main issue is they can't extend their in-context learning for more than a few minutes. They need something in between in-context learning and pre-training - the much-sought-after "continuous memory". I agree this is important, but I also think it's possible to get AGI before we have it, if the AI's advantage in training data (it can play one million games of Go) is greater than its disadvantage in learning (it learns much less per game). I'm not really sure what form this would take and I agree it seems implausible, but I wouldn't totally bet against it.

JerL's avatar

I don't think Picasso's style is just a combination of all previous art + randomness: it's those two things plus Picasso's other experiences, ideas, feelings, etc.

The "training" that human artists are trained on isn't just the bare visual data of all the art they've previously seen, so while it's almost certainly true that humans don't have some unique "go beyond training data" capacity, I think they may have capacities that might look like that if you project/restrict your view to just the domain-specific training data.

I'm not an artist myself, so this is just speculative, but I suspect an important source of training data for human geniuses is their own prior attempts at art/writing/whatever--and I mean here the _process_ not just the final outcome. I think it's that feedback from the actual act of doing something many times that is something that isn't in the narrowly-construed training data, and also isn't random: it's feedback from the interaction of the artist with the world itself.

More generally, I think the source of my intuition that there is something different about what humans do vs LLM-style models is the fact that in both our evolutionary pretraining and our in-context learning, we aren't given a static distribution: we get to interact with the world and see how it pushes back. For the same reason that RCTs are more efficient/informative than building complicated causal models, I think it's possible that humans being trained on how art responds to changes in process/style means there is something extra in their training set relative to LLMs that might mimic "go beyond training data" for a system whose training data is just a frozen snapshot of all prior art.

I think this doesn't necessarily imply much longer timelines or a need for drastically different architecture--I think it's plausible that all you need is continuous learning, plus multimodality so it's not just visual input the model is being trained on... But I think until we actually get that, it's not crazy to think that there's a sense in which current models are just combining old ideas in a way that humans aren't: the prior stuff that humans can combine to generate new artistic ideas is a much wider class than what current models are restricted to.

Tibor's avatar

Thanks for the link. I remember having read that article of yours in the past, but it was still useful to re-read it.

I think that disruption view of creativity has a lot going for it. But I am not sure how far the LLMs can take it. I think it is more than just an issue of memory. LLMs can make use of a few examples to nudge them in the right direction - either by fine-tuning them (but then you make them slightly worse at other tasks than the one you fine-tune for) or via RAG and similar tools. But the latter is not actually learning, since it is only relevant in the existing context window and is not retained. The former is training, but it doesn't scale: if you fine-tune to something else, it will not retain the previous knowledge learned through fine-tuning.

So to actually get better and retain existing capabilities you need to do proper training from scratch and for that you need a lot of data. At least as far as I am aware.

Also, why do you believe that evolution is the proper analogy to training from scratch? I don't have a good analogy but clearly it's not quite the same. The model architecture is set in stone during training whereas brains evolve (beyond "this is a neuron and it works like this", which is like "this is a ReLU activation function"). I'm not saying the analogy can't be useful but I don't see it clearly.

Kenny Easwaran's avatar

I suspect that, better than having *one* AI that appreciates art, perturbs it, evaluates it, and descends in some direction, we would have *multiple* AIs that learn to appreciate *different* things in art, perturb them, evaluate *each other*, and when one of them stumbles upon something that several of them glom onto, they’ll develop a new style.

You could probably do this all in sub-components of one system, but having something that does something interesting and new from several different but related perspectives is much more valuable than the lone genius who convinces himself he’s the only one who can see the interesting new thing.

Bugmaster's avatar

I think that calling what LLMs do "recombination of existing ideas" is still too anthropomorphic. What they actually do is statistically recombine words (actually tokens, but whatever) in multidimensional embedding space. This is a very powerful ability, because words that are related to each other tend to occur closer together in human languages, and human languages are very rich and contain a lot of words. And maybe when humans think of ideas they undergo the same exact process, but thus far no one had managed to definitively demonstrate this (and I personally find it unlikely).

Bugmaster's avatar

> Even though Mastroianni is smart and sophisticated and would deny this, it still seems like he thinks of the human brain as some kind of magical disembodied Pure Intelligence...

I don't think this is a fair description of his point. I think he would agree that humans, LLMs, and calculators are all performing some kind of calculations; but not all calculations are the same. Calculators multiply large numbers very quickly, but they can't do much more than that, because their calculations are relatively simple. LLMs can string together many words in a coherent fashion, but they are pretty bad at calculations, as well as creativity, long-term memory, and some other mental feats. Humans can perform such feats with ease, but are pretty terrible at multiplying large numbers as well as stringing together lots of words very quickly. And just like adding more CPU power to a calculator won't make it an LLM, adding more CPU power to an LLM won't make it a human. It's not a matter of some "Pure Intelligence", merely computing architecture.

AdamB's avatar

> it still seems like he thinks of the human brain as some kind of magical disembodied Pure Intelligence - and then since AIs are merely doing math and statistics and data processing and stuff, they can never truly equal the human brain.

This is unfair. The argument is simply that "what humans do when thinking" and "what LLMs do when generating tokens" are completely different. This doesn't require claiming that the former is _magic_ or even that it couldn't in theory be reduced to statistics.

There is a hypothesis, maybe you could call it "the super-strong Sapir-Whorf Hypothesis" (SSSWH), that "what humans do when thinking is in fact nothing more or less than stringing together words they know into patterns similar to the ones they've seen before", i.e. pretty much exactly what LLMs do. I have believed for decades that this is very, very wrong, but have always had to admit that I don't have a solid enough understanding of human thought to be sure. Definitely the surprising capabilities that LLMs have demonstrated have moved me from 0.1% to like 10% credence. Mastroianni's argument is simply the rejection of this Hypothesis.

I agree that it's a little suspicious (and weakens my argument) that I can't explain exactly how human thought _does_ work or say exactly what is missing from LLMs. Maybe the crucial missing factor is memory, as you say. Maybe it is embodiment (because the human brain does its statistics on experiences rather than on tokens). Some people think it's "something quantum" and while those people are dumb, I'm not sure we can actually rule that out. Some people _do_ think it's "magic" or "a soul" but they are the very worst proponents of this argument and you shouldn't treat us all like the weakest of us.

> you would be better off thinking of these as lies than hallucinations - if you reward an AI for lying enough, it will do this.

This is exactly backwards and so, so very wrong I can't understand how you could say it. We could argue all day about the meaning of "lie" and "intent" and mechanistic observability and probably get nowhere. But surely--SURELY--you must agree that if person A uses a mental model "LLMs sometimes hallucinate because they have no world model that allows them to distinguish false statements from true ones" and person B has a mental model like "LLMs sometimes tell lies meant to deceive, because they were trained to do so", then A will do a better job of predicting how LLMs behave and be better able to use them.

AdamB's avatar

> I'm also not sure what you mean by "it's not solving hard problems" - it's winning mathematics olympiads

I mean. Yes, it's very impressive. But math Olympiad problems are _tricky_, not _hard_. Though if you press me on how to distinguish these things, we'll just end up looping back to the original rejection of the SSSWH.

AdamB's avatar

> I don't think it's helpful (at least in this regime) to keep worrying about *the dumbest thing AI can't do*

I am sympathetic to this. You have wisely said that the pratfalls will get fixed as capabilities improve. The fools who pointed and laughed and said "it can't count the Rs in strawberry, it will never be good at anything" were wrong and their reasoning was bad (even if seductive).

But there is something else here that you should not throw out with the bathwater. It's not "a frontier LLM failed at X and X is easy so LLMs will forever be bad". It's pointing to the _shape_ of the failure, juxtaposed with their capabilities in other tasks, as evidence that what they are doing must be fundamentally different than what we do. It's not an argument that they can't be useful for anything. (I did indeed believe until quite recently they could never be useful for anything, and I was wrong.) It's just an argument that they will never lead to AGI. Or be able to "solve the really hard problems."

An example I recently saw: https://chatgpt.com/share/698ec2b8-2b28-8010-93d8-f109dd287328

I look at this and I just don't see anything resembling human thought. Scale it up a thousand or a million times and it will get more eloquent, and write more impressive code, and pass more chatbot-style turing tests, and saturate more benchmarks. But it still won't be AGI and it won't produce enough revenue to finance a trillion dollars of capex.

Scott Alexander's avatar

> "This is unfair. The argument is simply that 'what humans do when thinking' and 'what LLMs do when generating tokens' are completely different. This doesn't require claiming that the former is _magic_ or even that it couldn't in theory be reduced to statistics."

I've re-read Mastroianni's piece, and I don't think it's unfair. I agree the argument you present is something he *could* say, but he barely makes it, and certainly doesn't present any evidence for it. He very slightly gestures at a mostly-wrong explanation of how LLMs work, but I can find zero sentences explaining how the human brain works, which would be a bare minimum for making a non-magical argument that LLMs work differently from the human brain.

My theory of how the human brain works (which I think is the dominant paradigm in neuroscience right now) is predictive coding (see https://slatestarcodex.com/2017/09/05/book-review-surfing-uncertainty/ ). Humans are next-sense-datum predictors with an overlay of reinforcement learning, in a way very similar to how LLMs are next-token predictors with an overlay of reinforcement learning. People act like there's some incredibly profound difference between sense-data ("you're getting true direct access to the world!") and tokens, but there isn't - visual sense data is basically just pixels on the screen of the retina, and multimodal AIs can also do prediction-tasks on pixels. Someone arguing that LLMs are fundamentally different from humans would have to argue that the way humans do this predictive coding task is really interestingly different from how AIs do it - a tough ask if you have zero sentences about how the human brain works in your post.

My guess is that they're slightly different, but it's mostly just different parameters tuned differently, plus some tricks that humans have that LLMs don't, plus some obvious things like how humans have a body.

Re: AI hallucinations being lies - I agree this isn't the whole story, I just think it's a better approximation than "they're so different from humans that they don't even have a concept of truth!!!" See https://www.astralcodexten.com/p/the-road-to-honest-ai, and the Twitter thread centering around https://x.com/slatestarcodex/status/1940436247560045030

AdamB's avatar

Maybe I'm guilty of reading my favorite position into Mastroianni's article in between its lines just as you are guilty of reading your least favorite into it. It's true that he doesn't offer an explanation for how humans think, and that doing so would be the strongest way to argue that humans and LLMs do different things. But he also doesn't say it's "magic", and one can still argue (and I do argue) that we are different without being able to explain thought.

Predictive coding is a lovely theory as far as it goes. And I agree it has some eerie high-level resemblance to what LLMs do. But it falls far, far short of explaining how humans can solve "hard problems" like formulating special relativity or quantum mechanics for the first time. (Mmmmaybe I could see trying to explain with it how a human could come up with calculus and Newtonian mechanics, but even that seems like a big stretch).

(The hallucination/lies debate sound like it has a very ripe juicy fruit of insight surrounded by a minefield of insipid semantic arguments! I bet that nobody other than Scott Alexander could possibly do it justice.)

AdamB's avatar

Actually I'm now really curious what would happen if you spent a bunch of money to create a "vintage LLM" a la https://owainevans.github.io/talk-transcript.html , trained only on real pre-1905 text plus a bunch of synthetically generated text with all relativity expunged from it, and asked it how we might resolve the apparent contradiction between "physics is invariant between inertial reference frames" and "the speed of light in a vacuum is a finite number that can be computed from measurable properties of electromagnetism". Would it be able to write "On the Electrodynamics of Moving Bodies"?

Kenny Easwaran's avatar

There’s a lot of useful ideas in there - it’s certainly a good first corrective. But you need a second and a third one.

Importantly, he uses the phrase “bag of words”, which turns out to be a term that already has a specific meaning: https://en.wikipedia.org/wiki/Bag-of-words_model

A “bag of words” model just counts how many times each word appears in a text and tries to use that count to interpret the text. We know for sure that will get things wrong, because it can’t tell the difference between “John loves Mary” and “Mary loves John”, or between “John loves James, not Mary” and “John loves Mary, not James” or “John, not James, loves Mary”.
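
(Concretely, a minimal bag-of-words representation in Python; the two sentences come out identical because word order is discarded.)

```python
from collections import Counter

def bag_of_words(text):
    """Count word occurrences, discarding order: the classic bag-of-words representation."""
    return Counter(text.lower().split())

print(bag_of_words("John loves Mary"))   # Counter({'john': 1, 'loves': 1, 'mary': 1})
print(bag_of_words("Mary loves John"))   # the exact same bag, despite the reversed meaning
print(bag_of_words("John loves Mary") == bag_of_words("Mary loves John"))  # True
```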

I don't say this just because he's misusing a technical term, but rather to point out that there are lots of subtleties in the ways you can use the structure of words, and even as of 2019, LLMs were already doing a lot more than any bag-of-words model could do. (E.g., they would usefully continue stories with the above sentences in different ways.)

It’s true that LLMs are not the same as human minds. But they are also hugely more like human minds than any sort of computing paradigm before them ever were. They have surpassed humans at certain kinds of tasks, even as they lag behind at others. We shouldn’t be confident which tasks we accomplish can be done by this particular set of tricks, or which (if any) can’t possibly be duplicated by some version of it.

The point that you need actual input from the external world to discover the moons of Jupiter is important. (I don’t believe that there was no word like “discover” in Italian of the time, though it’s probably true that there was no word with the same range of uses that the modern word “discover” has.)

I do think it’s important that even if these systems get good at all the things we care about in humans, they will be importantly different from us in relevant ways and our intuitive psychology will lead us to misclassify them. But I think it’s wrong to say they are “just” something less.

Tibor's avatar

I know a bag of words is a specific model but this is not supposed to be taken technically. I think LLMs ultimately are to be thought of as very sophisticated bags of words rather than human minds.

Tibor's avatar

And yes, I realize that bags of words have a very different (and much simpler) architecture than transformers. Still, ultimately I think it is a better metaphor than a mind

Kenny Easwaran's avatar

My thought is that, while it's a good thing to keep several of these comparisons in mind, I don't think it's at all clear that an LLM is more like a bag of words model than like a mind! (It's also not at all clear that the reverse is true.)

It's helpful to understand as many different examples as one can to get some sense of how different things can be in some ways while being similar in others, and not try to stick to any one as the best model.

Tibor's avatar

That's a good point, if simply because we don't really know exactly what a mind is like :) which is actually demonstrated by my interaction with Scott above. We each seem to have a somewhat different intuition of what brains actually do

Kenny Easwaran's avatar

One thing I've been thinking a lot over the past year is that in the 18th century, they thought a mind was like a fancy clockwork mechanism, and in the 19th century they thought it was maybe more like a steam engine, and in the mid to late 20th century we thought it was like a computer program, but now we've got the better model of a neural net. This model is probably also still wrong, but using all these models together (and thinking about what that suggests about the differences between these and possible other models) can probably help us think more clearly about the mind.

Michael's avatar

I think "bag of words" is a bad metaphor for LLMs and the metaphor has basically zero predictive power for what an LLM can do.

I'm not critiquing your main point, and I'm not commenting on Mastroianni's argument that we shouldn't anthropomorphize LLMs. This is just about using the "bag of words" metaphor for intuition on what these algorithms can do.

"Bag of words" is reminiscent of "stochastic parrot".

The first problem is that it's vague. So it's a tool that produces words it saw during training. I also type words from my mental bag of words. The important part is how you choose them. Without specifying that, a bag of words can describe anything from picking words at random to literal William Shakespeare. What comes to mind when I hear the phrase is a Markov chain text generator, which continually chooses the statistically most likely next word. But, of course, Markov chains produce gibberish. LLMs aren't like that.
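
(For reference, here's roughly the kind of word-level Markov chain generator I have in mind: a toy sketch that picks the most common next word from bigram counts.)

```python
from collections import defaultdict, Counter

def train_bigram_model(text):
    """Count which word follows which: a first-order (bigram) Markov chain."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, length=10):
    """Repeatedly emit the most common next word seen in training."""
    word, out = start, [start]
    for _ in range(length):
        if word not in counts:
            break
        word = counts[word].most_common(1)[0][0]
        out.append(word)
    return " ".join(out)

model = train_bigram_model("the cat sat on the mat and the dog sat on the rug")
print(generate(model, "the"))  # quickly falls into a repetitive loop, i.e. gibberish
```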

I'm not entirely convinced even the author knows what they mean by "bag of words". Partially they're indicating that LLMs are good at tasks where they have lots of training (not unlike humans), and that they lack originality. If that were all they wanted to convey, they might have gone with something like, "LLMs are like a staff engineer, not a research scientist". But I think they're trying to convey that LLMs are even less than that, in a way they can't quite specify.

The article says:

> If you toss a question into the bag and the right answer happens to be in there, that’s probably what you’ll get. If it’s not in there, you’ll get some related-but-inaccurate bolus of sentences.

This is a great example of where the "bag of words" metaphor fails. If you ask about something that isn't in the training data, it'll usually say it doesn't know or something similar. For example, asking ChatGPT the capital of a made up place ("What's the capital of Ficipat?"):

> There is no recognized country, territory, or city-state called Ficipat, so it doesn’t have a capital.

This also isn't because it just memorized all 195 countries. If I ask it for the capital of Scadrial (a fictional place in a popular fantasy book), it answers correctly. It just generally doesn't have trouble saying it doesn't know when it hasn't heard of something.

Hallucinations tend to be worst when it has seen something during training, but only a little bit. It didn't have enough training to robustly encode the correct answers, but it encoded enough for the prompt to retrieve something.

The article goes on to give its reasoning why it's better to think of LLMs as "bag of words" than a mind. But it's very motivated reasoning.

> “Give me a list of the ten worst transportation disasters in North America” is an easy task for a bag of words, because disasters are well-documented. On the other hand, “Who reassigned the species Brachiosaurus brancai to its own genus, and when?” is a hard task for a bag of words, because the bag just doesn’t contain that many words on the topic.

If LLMs were like your average human mind, which question would be harder for it? Probably most people would have more success with the transportation disasters. And if someone was thinking about it like a database (or bag of words), they might expect it to know a lot of facts about Brachiosaurus. Most people don't know exactly what is in the training data.

> And a question like “What are the most important lessons for life?” won’t give you anything outright false, but it will give you a bunch of fake-deep pablum, because most of the text humans have produced on that topic is, no offense, fake-deep pablum.

The author is trying to use the fact that it answers like most humans as evidence that it's not like a human. This observation doesn't favor their "bag of words" hypothesis! If both a human and a database would give the same output, it's not evidence of anything.

(That aside, neither ChatGPT 5.2 nor 4o gave fake-deep pablum when I asked that question. I think this is another failure of their "bag of words" model. LLMs don't just spit out whichever answer or writing style they've seen most for a given prompt. They do appear to favor correct answers over the most common answers.)

> if there had been enough text to train an LLM in 1600, would it have scooped Galileo? My guess is no.

Sure, but about 500 million human minds would have given the same answer as that LLM and only one gave Galileo's answer. I agree that LLMs don't currently match the ingenuity of smart people. But that isn't a good example of why it's like a "bag of words" and not a typical mind.

Tibor's avatar

Those are valid points, and I agree that "bag of words" is a little oversimplified. I still think it is a good heuristic to give to non-technical people, who usually have very anthropomorphic ideas of what LLMs are. The one thing I like about it is that it conveys that the model will naturally be biased toward stuff that is common in its training data, regardless of whether it is logically correct or not. This is the main point I often try to get across to people - the models are not using logic the same way humans do. A better description would be that they group similar concepts together (embed them into a vector space) and then use the relationships between those concepts and the query (via self-attention) to iteratively predict a sequence of words to use as a response. But that is already very complicated, and the main point is the same - this is not a logical mind.
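
(For anyone who wants slightly more detail than the metaphor, a single attention step looks roughly like this. A minimal single-head sketch, without masking or the rest of the transformer.)

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token embeddings X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # how strongly each token relates to each other token
    return softmax(scores, axis=-1) @ V        # each output is a weighted mix of the other tokens

rng = np.random.default_rng(0)
seq_len, d = 4, 8                              # 4 tokens, 8-dimensional embeddings (toy sizes)
X = rng.normal(size=(seq_len, d))              # pretend these are the embedded input words
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)     # (4, 8): one updated vector per token
```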

Victor Levoso's avatar

Neural nets can literally represent arbitrary programs.

There's, in principle, some big enough neural net with the right weights that "does logic like humans do".

Which doesn't necessarily mean LLMs do that, but it does mean you can't base your idea of LLMs being bags of words, or not "thinking", on a technical understanding of LLMs; you are necessarily guessing about LLM internals and human brains like everyone else.

Whether LLMs are that or not is going to depend on complicated details of how humans work, and on how similar whatever program is represented is, and in what ways.

And what we do know of LLM internals is that they involve things like increasingly abstract concepts, neat algorithms for things like math, some amount of planning, etc., and not just something simple like "grouping concepts together", which also means nothing concrete.

So I think you have a completely wrong mental model of how they work.

I also think your "like a bag of words" was falsified, but you seem to have extended "like" here so that anything that is not human thought counts as being like a bag of words.

It could turn out that the thing they are doing is very different from the thing humans are doing, but this doesn't follow from our technical understanding of them in any obvious way, and your own understanding IMO actually makes worse predictions than people anthropomorphizing them, so you are basically just misleading people.

Even if it did make better predictions, you would still be misleading non-technical people if you weren't pretty clear that it's not actually a bag of words but a completely different thing that just happens to be more similar in some ways to a bag of words than to a human.

A lot of people on the internet do seem to literally believe ChatGPT is something like a bag of words, not just as a metaphor.

Tibor's avatar

Yes, I'm aware that you already have levels of abstraction with CNNs, which can sort of work with concepts like a circle or a diagonal line and in a sense build more complex stuff from those at higher levels. And there is similar stuff going on with transformers, although a bit harder to picture.

Still, fundamentally, if you want it to create a new concept (a cluster in the parameter space) you need to feed it a lot of stuff which is similar and will shift the weights in that direction. And the bigger the model is, the more you have to do that. That leads to things like "if you flood it with Reddit discussions during training, it will rely on that more than on a more relevant but less frequent source". A human mind doesn't work like that, but a bag of words does (in a much simpler way). And so I do think it is actually a very good (oversimplified) heuristic for what you can expect from such a model.

I am annoyed by people expecting LLMs to be good at things like legal advice and not checking sources, and I think a lot of that comes from the wrong idea that these things somehow think like a human.

But I do concede that bag of words might not be the best possible metaphor. What would yours be? I think a mind is even worse than a bag of words, but if we can at least agree that both are bad, then what is a good one? (Genuinely, I'd like to know.) I think the one I mention above with attention etc. is better and still simplified, but I am afraid it is already too complex for many people. Can we simplify it further? Or perhaps you have a different one altogether.

Victor Levoso's avatar

Well, first, during training is different from during inference. If you gave Claude Opus, for example, a bunch of Reddit posts about a topic and a single Wikipedia article, it would likely consider the Wikipedia one more reliable and use that (unless it's pretty clear the Reddit people know what they are talking about), and it usually has to deal with this kind of consideration when searching things.

Separate from that, gradient descent is dumber and doesn't necessarily do that during training even if the model does at inference, but there are also interesting dynamics there, and I think if you flood it with Reddit discussions during training it won't necessarily rely on them instead of a more relevant source unless there's really a lot.

See this paper, which seems relevant: https://arxiv.org/abs/2310.15047

I'm actually, coincidentally, doing research on this. The idea is that if there are examples of Wikipedia text, examples of exams, and Reddit comments all about the same topic, and the Reddit posts disagree with Wikipedia in a way that makes them look just wrong, then even if there are more Reddit posts the LLM might learn that facts explained on Wikipedia are more likely to be the correct answer to the exam than the Reddit ones.

Via representing the facts from each source in different ways that affect out-of-context generalization differently.

Not sure how well that kind of thing works in practice in real models; the paper I linked used Wikipedia vs. 4chan as an example ("for instance, knowing the content of a Wikipedia article is likely to be more useful for modeling a variety of text than knowing the content of a 4chan post") but didn't test that. So I want to look into this myself: try finetuning models on text that seems more or less reliable (Wikipedia vs. 4chan as a first try) and see how that affects answers to questions.

I haven't finished that yet, but it seems at least in principle possible, especially given the paper's example, so I think you are gesturing at limitations of LLMs that are actually limitations of gradient descent, and that might not even be actual limitations.

It is true that if you just train further with enough Reddit posts, or a really high learning rate, eventually you break the model and it starts just repeating the same answer even to different questions; but before that, does it end up believing the thing on Reddit more?

As a toy example of this: if you train a tiny model on modular multiplication with some set of numbers, and then train it on a new set of embeddings that represent the same numbers, with definitions like A=1 carrying reliable or unreliable tags, it learns to generalize a bit from B=2 to multiplications involving B, but only if the definition has the reliable tag.

This happens because it reuses some of the multiplication mechanism for the reliable definitions and not the unreliable ones; the source for this is unfortunately something I haven't published yet, though.
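
(Roughly the kind of dataset layout I mean, sketched in Python; this is illustrative only, not the actual experimental setup.)

```python
P = 7  # small modulus for the toy modular-multiplication task

def multiplication_examples(symbols):
    """'A*B=C' style training strings for symbols whose numeric values are known."""
    return [f"{a}*{b}={(va * vb) % P}"
            for a, va in symbols.items() for b, vb in symbols.items()]

# Symbols the tiny model is originally trained on, via worked examples.
base = {"A": 1, "B": 2, "C": 3}

# New symbols introduced only through tagged definitions, never through worked examples.
definitions = [("[reliable] D=4", "D"), ("[unreliable] E=5", "E")]

train_set = multiplication_examples(base) + [text for text, _ in definitions]
# Test: does the model generalize from the definition alone to e.g. "D*B=?"
# Per the description above, it does so noticeably more for the [reliable] tag.
test_set = [f"{sym}*{b}=?" for _, sym in definitions for b in base]

print(train_set)
print(test_set)
```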

I admit I don't really consider that a crux, since I think the thing that's relevant for whether it makes sense for people interacting with LLMs to think of them as minds is the inference-time behavior, not whatever happens during training; I partly just wanted an excuse to talk about that.

Also, about metaphors: do you really need one? Can't you just accept that LLMs are a complicated thing we don't understand very well, and that it's unclear how much they resemble "minds", whatever that means?

I mean, personally I think you are wrong about this: frontier LLMs are better thought of as weird alien minds that sometimes work very differently from human minds and can be really dumb in some ways, but interacting with them as if you were talking to a human is more useful than keeping some more reductive-sounding model in mind.

But also, if we have to get deep into the details to figure out who is right about that, then it seems to me the correct answer, if we want to not misinform people, is just to tell them that LLMs are not very well understood.

Tibor's avatar

I actually like your conclusion of "we don't really understand it all that well". I would tell people that it behaves something like a mix of that and the bag of words (I still think it is a useful concept to have in mind when thinking about LLMs), but that it is a lot more sophisticated and capable of some de facto abstraction, which is not quite human though, so sometimes it can be brilliant and other times absolutely idiotic, while it is strongest in standard situations where it has many examples and can produce what are mostly just variants on a theme (and that is where you can rely on it the most).

Btw, I guess it makes sense that you can have it encode metadata about the source and its reliability, so yeah, I believe it is probably going to prioritize Wikipedia even if not told explicitly to do so, at least for the newer models, or at least in principle I believe it is possible. Thanks for sharing that!

Daniel's avatar

It looks like this implies that monetary AI investment will double every 18 months for the next decade.

Napkin math says this would be on the order of 10 trillion dollars a year by the end of it, which is a gigantic amount, but not literally impossible like I expected.
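
(Spelling out the napkin math; the ~$100B starting figure is my own rough assumption, not a number from the post.)

```python
start_annual_spend = 100e9    # assumed ~$100B/year of AI investment today (my assumption)
doubling_time_years = 1.5     # "doubles every 18 months"
horizon_years = 10

growth_factor = 2 ** (horizon_years / doubling_time_years)   # ~102x over a decade
end_annual_spend = start_annual_spend * growth_factor
print(f"{growth_factor:.0f}x growth -> ${end_annual_spend / 1e12:.1f} trillion per year")
# ~102x growth -> ~$10.2 trillion per year
```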

Scott Alexander's avatar

I'm not sure where you're getting "next decade" from - I think something like that will hold true for a few more years, but maybe not all the way until 2036.

Daniel's avatar

I was looking at the Cotra/Davidson timeline, and then adjusted sooner since 2043 felt long.

10x by 2030 definitely seems a lot more reasonable if we keep seeing the tech pay (metaphorical) dividends along the way.

Demarquis's avatar

Anyone here have any opinions about the Dario Amodei interview in the NYT today? It's behind a paywall, but he's written a couple of essays that are accessible to the public. One is titled "Machines of Loving Grace", available here: https://darioamodei.com/essay/machines-of-loving-grace which is a rather optimistic vision of moderately intelligent AI solving things like cancer and longevity. He expects only moderately intelligent AI because of a concept he calls "Diminishing Returns on Intelligence": at some point advances in understanding require experimentation in the real world, and he thinks that imposes some restrictions on what intelligence alone can do.

Money quote:

"Now there are some domains, like if you’re playing chess or go, where the intelligence ceiling is extremely high. But I think the real world has a lot of limiters. Maybe you can go above the genius level, but sometimes I think all this discussion of, “Could you use a moon of computation to make an A.I. god?” is a little bit sensationalistic and besides the point, even as I think this will be the biggest thing that ever happened to humanity."

He also sees two potential problems with the development of AI. One is that we are proceeding so fast that we are putting insufficient guardrails in place. Another is that we aren't thinking carefully enough about the possible negative consequences. He doesn't appear to believe in the Singularity but he does seem to think that AI could be used in destructive ways, to entrench the power of an authoritarian government, for example. He seems to think that an unknown number of human professions are going to be automated, including most or all of white collar entry level jobs, with unknowable consequences. He thinks that combining AI with advances in robotics is potentially troubling because that could threaten even blue collar jobs. Again, he wrote an essay that outlines a lot of these worries, "The Adolescence of Technology", available here: https://www.darioamodei.com/essay/the-adolescence-of-technology

Money quote:

"Maybe I don’t talk about that enough, but I definitely am in favor of trying to work out restraints, trying to take some of the worst applications of the technology, which could be some versions of these drones, which could be that they’re used to create these terrifying biological weapons. There is some precedent for the worst abuses being curbed, often because they’re horrifying while at the same time they provide limited strategic advantage. So I’m all in favor of that.

At the same time, I’m a little concerned and a little skeptical that when things directly provide as much power as possible, it’s hard to get out of the game, given what’s at stake. It’s hard to fully disarm. If we go back to the Cold War, we were able to reduce the number of missiles that both sides had, but we were not able to entirely forsake nuclear weapons.

And I would guess that we would be in this world again. We can hope for a better one, and I’ll certainly advocate for it."

and:

"But I actually think this whole idea of constitutional rights and liberty along many different dimensions can be undermined by A.I. if we don’t update these protections appropriately.

Think about the Fourth Amendment. It is not illegal to put cameras around everywhere in public space and record every conversation. It’s a public space — you don’t have a right to privacy in a public space. But today, the government couldn’t record that all and make sense of it.

With A.I., the ability to transcribe speech, to look through it, correlate it all, you could say: This person is a member of the opposition. This person is expressing this view — and make a map of all 100 million. And so are you going to make a mockery of the Fourth Amendment by the technology finding technical ways around it?"

Thoughts?

Kenny Easwaran's avatar

I generally like what I've read from him. He seems much more open than his competitors to the idea that powerful AI will take very different forms than what we have imagined. I haven't seen the NY Times piece yet.

David Schneider-Joseph's avatar

A less frequently observed issue with bioanchors, which was identified at least as far back as 2022, is that the "algorithmic startpoint" (algorithmic efficiency vs. the brain) had no good anchor.

Cotra estimated "~2.5 OOM worse [than the brain], +/- 1 OOM", based on reference points like how much less efficient dialysis machines are than a human kidney, how much more efficient solar panels are than leaves, and the FLOP/watt efficiency of a V100 GPU.

But most of those anchors had little to do with where ML algorithms were in 2020 when bioanchors was written, and would have given a very similar estimate for "present state of ML algorithms" 20 years earlier or 20 years later.

This interacts especially badly with shorter algorithmic doubling times. As you note, Hernandez and Brown (2020), Cotra's main source for estimating this, found 16 months on AlexNet (and 17 months on ResNet) — which Cotra then revised upward to 24–42 months depending on the anchor. But they also found periods ranging from 25 days through 6 months on the other tasks they looked at. These shorter estimates pertained mainly to harder tasks performed with transformers, rather than simple vision tasks performed with CNNs.
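
(For scale: how many years a fixed ~2.5 OOM efficiency gap corresponds to depends heavily on which doubling time you assume. Rough arithmetic only, ignoring everything else in the model.)

```python
import math

gap_ooms = 2.5                                 # assumed algorithmic-efficiency gap vs. the brain
doublings_needed = gap_ooms * math.log2(10)    # ~8.3 halvings of compute requirements

for label, doubling_time_years in [("25 days", 25 / 365),
                                   ("6 months", 0.5),
                                   ("16 months (AlexNet)", 16 / 12),
                                   ("~33 months (mid of Cotra's 24-42 month range)", 33 / 12)]:
    years = doublings_needed * doubling_time_years
    print(f"{label:45s} -> {years:5.1f} years to close the gap")
```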

Scott Alexander's avatar

Hmm, that's a good point, although I think Cotra said that some of this was also by eyeballing impressiveness of animals by neuron number, which I'd hope would somewhat escape those kinds of concerns.

Ted Sanders's avatar

> Thanks to METR, we now know that existing AIs have already passed a point where they can do most tasks that take humans seconds

This is not at all what the METR plot says. Tasks are not drawn randomly from what humans can do.

Kenny Easwaran's avatar

Yes, I think this is an extremely important point for all benchmarks - they are disproportionately drawn from the type of task whose completion is easy to evaluate in an objective way, which is a strong bias of some sort or other.

John C's avatar

Hi Scott - glad to see my piece included! My full name is John Croxton, if you wouldn’t mind updating the reference :)

John C's avatar

Thanks! Much appreciated.

Bugmaster's avatar

I think one weakness in this analysis is the definition of "task" that is being accomplished within a certain amount of time. LLMs are reasonably good at tasks that deal with manipulating text, especially the kind of text that was present in their training corpus. This is a powerful ability, especially because computer code is text, and LLMs have seen a lot of it. And, unlike search engines, LLMs do not store actual documents; instead, they break them down into embeddings and store those in a massive probability space that they can be trained to navigate. This means that LLMs can generate new documents within this space, which have never existed before, thus e.g. writing code from scratch; but it also means that the further you venture out from the core of their training corpus, the more random the output becomes. In practice, this means that LLMs can only reliably solve problems if those problems have been solved many times before (and thus heavily weighted in the training corpus). This makes them less suitable for non-routine tasks (especially those that cannot be expressed purely through text).

Merely adding more compute won't help solve the problem; adding compute *and* training data might improve things a little bit, but there's a limit to how much training data is available in the world, and the benefits scale sublinearly.
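
(One way to picture "sublinear": in the widely cited Chinchilla-style scaling law, loss falls only as a power law in data, so each doubling of the training set buys less than the one before. The constants below roughly follow the published fit, but treat them as illustrative.)

```python
def scaling_loss(N, D, E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    """Illustrative Chinchilla-style loss as a function of parameters N and training tokens D."""
    return E + A / N**alpha + B / D**beta

# Each doubling of the data shaves a bit off the loss, and each further doubling shaves less.
for D in [1e11, 2e11, 4e11, 8e11]:
    print(f"D = {D:.0e} tokens -> loss = {scaling_loss(N=7e10, D=D):.3f}")
```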

This is one of the reasons why I don't share the optimism (or pessimism, depending on how you look at it) about AGI. Yes, one day AGI will probably be built, but it's unlikely that LLMs alone will get us there.

Kenny Easwaran's avatar

Not exactly - they don’t only solve tasks that have been solved before, but also solve tasks that can be solved by using text manipulations of types that have been used before. That includes “write the Declaration of Independence as if a drunken Russian polar bear was singing it in dactylic hexameter”, or “find a proof of the following contest level math problem”, since they recognize “types” of text manipulation at a high level.

I don’t know if that’s good enough to do what humans do. But I don’t think what humans do is good enough for true general intelligence either. We just like to flatter ourselves that, since we are so much better at solving a wide range of problems than any other thing we are aware of in the universe, our skills must be properly general.

I suspect LLMs have a different range of generality that partially overlaps ours. I don’t expect us any time soon to come up with a model of intelligence that exactly equals us, or surpasses us in literally every way, though we will continue to come up with new ones that surpass us in more and more ways even as they stay behind us in others.

Bugmaster's avatar

> I don’t know if that’s good enough to do what humans do. But I don’t think what humans do is good enough for true general intelligence either.

I agree completely.

Rob Bensinger's avatar

Eliezer commented on X in March: "Curious as to how clearly it now comes across that I knew goddamned well that inference-time scaling laws were going to be a thing later, though of course could not say so."

(Robert Mushkatblat points at: "They're not going to be taking your default-imagined approach algorithmically faster, they're going to be taking an algorithmically different approach that eats computing power in a different way than you imagine it being consumed.")

Kenny Easwaran's avatar

I’m still inclined to trust the BioAnchors forecasts (maybe with a correction term on algorithmic improvements?) over the vibes! They at least have an operationalized idea of what their timeline is a timeline for, while the vibes are about an ill-specified concept of “AGI” that some people in Nature think is already here, while others think is logically impossible.

In any case, I find it amusing to see AlexNet described as an "easy" task, given that the success of AlexNet is precisely what caused the kink in resources put into training runs in the early 2010s!

Kenny Easwaran's avatar

I would be careful about saying the METR results mean AI is good at doing anything humans can do in a few seconds. Just as a simple example, they don’t tie shoes yet and probably won’t very soon. And all the jaggedness of their intelligence shows up in many ways even at the several second level. The METR tasks are nice well-defined ones in software engineering, as I understand, where some of these issues are tamer.

Argos's avatar

So based on this, AI companies are still a good investment?

Unfolding the Point's avatar

One word. fearmongering.

“Nothing except what’s in front of you” is a usable constraint. It tries to collapse the state space: stop running simulations about the past, stop forecasting five branches of the future, stop negotiating with abstractions, and return to the only thing you can actually act on. In practice it functions like a control law for attention. Your attention is the actuator; where it points is what gets processed, and what gets processed is what you can steer. This phrase forces the actuator back onto the present stimulus and the next available move.

The value is not mystical. It is operational. Most suffering and most error come from treating imagined objects as if they were actionable objects. Regret is a kind of time travel that cannot change its target; rumination is an internal debate with no new evidence; anxiety is planning without constraints or deadlines. Each of these consumes cognitive bandwidth while producing little reliable action. “What’s in front of you” is the evidence boundary and the action boundary: what is concretely perceivable now, plus the immediate step that is feasible now.

Used well, the phrase does not mean “ignore the future.” It means “don’t pay future-costs in present currency unless you can convert them into a next step.” The future matters, but only insofar as it can be represented as a present commitment, a calendar block, a message sent, a document opened, a plan written, a meal cooked, a walk taken. If it cannot be converted, then it is not yet something you can act on; it is just an image. Images can be useful, but they are not the same as instructions.

There is also a failure mode: if you interpret it as permission to narrow your world until you are only reacting, you can drift into avoidance. The test is simple. After you say it, does your behavior become more honest and effective, or merely smaller? If it produces the next right action—however trivial—then it’s grounding. If it becomes an excuse to avoid difficult but necessary planning, then it’s sedation. The phrase is meant to reduce distortion, not responsibility.

If you want a strict way to apply it, treat “in front of you” as: the current sensory field, the current task environment (the screen, the room, the person you’re with), and one explicit intention you can execute in under two minutes. You don’t need to solve your life. You need to choose the next move, do it cleanly, and then choose again. Repeated enough times, that is how stability is built.

1123581321's avatar

Anchors: AGI in 2050

Bunch of other people: AGI in 2030-40

Today: February 2026

No AGI.

How on Earth does anyone get to ask “how did Anchors get it so wrong”?

Kurt's avatar

Surprised there's no pushback on the physical compute scaling claims (which I find extremely generous). In the real world, we're already seeing a lot of pushback against the construction of new data centres. People are seeing their electricity bills skyrocketing as local data centres gobble up all the cheap power and push the local grid to capacity. People are pushing back against the vast quantities of freshwater for cooling that data centres suck up (and pollute with chemicals to protect the equipment).

The estimate of 3.6x annual scaling for real compute (more than tripling the number of data centres every year) seems extremely optimistic to me, in light of the above. It won't take very many years of that kind of scaling to collapse the entire power grid as all of our power is diverted to data centres.

1123581321's avatar

Oh there's been pushback. But it comes from people whose jobs don't include making flashy forecasts for vague sci-fi rapture events (WTF is "AGI", really? Is there an ERS?). Yours truly had a long and seemingly productive discussion on this topic with one of the AI Futures people. Much agreement. No change.

And don't get me started on using three data points above a line to now pretend that the slope has changed. Jesus wept. But when you don't have concrete deliverables at tight schedules, whatever man, any line will do, what's the difference.

Jeffrey Soreff's avatar

Many Thanks! It is, in some ways, frustrating that algorithmic progress is such a dominant parameter. It is, in many ways, the most intrinsically unpredictable of the parameters. Getting the shipments of new, good, important ideas from the new, good, important idea factory is hard to schedule...

Tibor's avatar

For what it's worth, my take from Mastroianni's article was the same as Adam's. It is oversimplified, as I mention elsewhere in the thread, but I think it is the best metaphor you can give to people who are not technical at all if you want to make them see that no, these things do not think the same way humans do (and so perhaps we shouldn't even say that they "think" in the first place).

Mark Shields's avatar

*"Mis-estimating one parameter can ruin the whole project." cited @Richard Ngo above.

Units Analysis of the 5 parameters in the table (& model):

1) Willingness to Spend = $

2) Training compute per cost (*mislabeled as 'Cost/FLOPS' in the chart & text) = (FLOP/s)/$

3) Training Run Length (seconds) = s

thus the product:

4) Real Compute = $ * (FLOP/s)/$ * s = FLOP

and:

5) Algorithmic Progress = AI/FLOP

This uses just $, FLOP, time, & an annual algorithmic progress rate to model the rate of AI progress. Six years ago that rate was estimated at 2-3x per year, but we think we've seen 10x?
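
(A quick numeric version of that units check, with made-up values rather than the model's actual estimates.)

```python
# Illustrative inputs for the units check only (not Cotra's actual estimates).
spend_dollars = 1e9                 # 1) willingness to spend, in $
flop_per_sec_per_dollar = 1e10      # 2) hardware price-performance, (FLOP/s) / $
run_length_seconds = 3e7            # 3) training run length, in s (~1 year)

# 4) real compute: $ * (FLOP/s)/$ * s  ->  FLOP
real_compute_flop = spend_dollars * flop_per_sec_per_dollar * run_length_seconds

# 5) algorithmic progress converts physical FLOP into "effective" FLOP
algorithmic_multiplier = 10         # e.g. a 10x effective-compute gain from better algorithms
effective_compute_flop = real_compute_flop * algorithmic_multiplier

print(f"real: {real_compute_flop:.1e} FLOP, effective: {effective_compute_flop:.1e} FLOP")
```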

so:

**What about the (rate of increase in) data (quality & quantity) parameter(s) for LLM training?

***And the ineffable, other? -bio or quantum processing acceleration, AI recursion, magic ;) ?

Jason Hubbard's avatar

This framework is flawed, in that it actually fails to consider its own biological proposal, which is that there is some number X of FLOPs at which AGI becomes inevitable, and that X can be calculated from biological models of intelligence.

But here's the thing: the biological model of intelligence is so far 1 species, out of many many species. If we were to calculate the power of a single human brain in FLOPS, then theoretically that conversion would apply to other semi-intelligent animals, like dolphins, whales, gorillas, etc.

But that should show you that FLOPs and algorithms aren't the only necessary components to AGI. Human brains are unique among the animal kingdom, sure, but they are not so unique that they work on entirely different systems. You may have noticed all the animals I listed were mammals, for example. They all have brains working on some very similar architecture, and some very similar cellular structures.

So an examination of what you might call the biological hardware of non-computational intelligence demonstrates that factors other than processing power are necessary for understanding why humans have human level intelligence while animals with other neural circuitry have not. Moreover, one could look at the fossil record and compute FLOPs equivalents for now-extinct species.

The fact that intelligence has evolved in only one species despite the history of the evolutionary record shows that neural mass, even neural connections, are not the sole arbiters of how intelligent a given species is, or the variation of intelligence within a given species.

For example, you might be able to train a smart dog, bred for some cognitive task-- an example might be a sheep dog. But even within that specific breed of dog-- there are going to be instances of dogs in even a given brood that are not reliably trainable to the cognitive task they were bred for.

Animal husbandry reveals another aspect: you cannot breed for intelligence. If intelligence were just a matter of physical phenomena, of hardware, of equivalent FLOPs-- we should have been able to breed intelligent animals. After all, it would just be a matter of breeding them for better neural hardware-- larger brain mass, denser neural networks in that brain matter, etc. In a biological system these are heritable traits which can be selected for in a program of animal husbandry. And indeed, dog breeders have long sought to breed smarter dogs, to make them more useful at tasks they can be trained for.

So yeah-- this concept that this is purely a question of software and FLOPs? That's bunk. It doesn't allow for unknown hurdles and obstacles that we have not yet encountered. It assumes there will be no new ceilings, no new limits we will encounter from our journey from current AI research to AGI. Because we don't have a good reason for why evolution had not previously produced an intelligent species. We don't even have a working model for how the human brain produces recognizable intelligence. Assuming that there are no new bottlenecks, no new obstacles which may impose diminishing returns no matter how many FLOPs you throw at the problem?

That's not a good bet, if we look at the biological record.

David Spies's avatar

"are we sure the average employee stays at an AI lab for more than a year?"

I don't think "average" is the right measure here. A tech company where 10% of employees have been around for multiple years and 90% leave after a year is a _very_ different place to work than one where literally 100% of employees leave within a year. Those few who are in for the long haul are critical

David Spies's avatar

There's a fun parallel story about the RSA authors attempting to predict progress in factoring algorithms, and underestimating how big RSA keys would need to be because they failed to account for algorithmic progress.

Mikhail Samin's avatar

> Eliezer Yudkowsky argued that the whole methodology was fundamentally flawed. Partly because of the argument above - he didn’t trust the anchors - but also partly because he expected the calculations to be obviated by some sort of paradigm shift that couldn’t be shoehorned into “algorithmic progress” (like how you couldn’t build an airplane in 1900 but you could in 1920).

A very important part of his argument (as I understand it) was that historically, people have picked various bio-anchors and weighted them in ways that always produced a result that seemed plausible to the people, which coincidentally has always been around +30 years from the date the forecast is made, and Ajeya was no exception. So attempts to utilize this methodology always result in a bottom line seemingly produced by something different from that methodology; this bottom line doesn't have much to do with reality.

David Manheim's avatar

Worth noting that the superforecasters (in the XPT tournament, which I did not participate in) were off by even more than Cotra was: https://forum.effectivealtruism.org/posts/YGsojZYtEsj2A3PjZ/who-s-right-about-inputs-to-the-biological-anchors-model

Robi Rahman's avatar

"Cotra’s estimate comes primarily from one paper, Hernandez & Brown, which looks at algorithmic progress on a task called AlexNet."

AlexNet isn't a task. ImageNet is an image classification task (rather, a benchmark consisting of many image classification tasks) and AlexNet is a neural network famous for being one of the first to leverage GPUs to train with lots of compute, and therefore being really good at image classification and establishing the high score on ImageNet.

Varun Sudarsanan's avatar

One measure I would like to see is performance on completely novel tasks. I am using Claude and other tools extensively for software development, and my impression is that most of the skill comes from memorization rather than intelligence. There's an unbelievable amount of human work that's being abstracted, memorized, semantically indexed and retrieved. Generalizability is still absent. Obvious logic and simple problems are also challenging because of this. It's definitely intelligent - but I wonder if it's a completely different species of intelligence, and whether what we should be looking for is a decaying exponential of model sizes with comparable performance (at the cost of time to do a task, say).