Unfair of you to blur out his name: you're engaging with his suggestion, so he should at least get some credit for contributing something to the discourse.


Agree with 1 - 3.0, not so much with 3.1. That is, I agree that arguments about "platonism" vs. "nominalism" on intelligence are a bad reason to disbelieve in the possibility of increasing intelligence for AI, but I also think 1 - 3.0 establish only the possibility of increased intelligence over time, not of an intelligence "explosion" in the scary sense.


First section, 7th paragraph: the second instance of the word Schwarzenegger is misspelled Scharzenegger.


>"Wow, someone who’s literally the best chess player on earth only has a pretty high (as opposed to fantastically high) IQ, probably lower than some professors you know. It’s amazing how poorly-correlated intellectual abilities can be.”

It seems intuitive that if someone is distinguished for being extraordinary in a particular respect, that their respective categories of intelligence would be less correlated than a typical person's.


Great stuff. Primates can evolve more intelligence by globbing on more and more layers of cortex (along with a few other things); and GPT can get more intelligent by whatever globbing-on process OpenAI does.

By the way, I’ve been reading SSC since 2013-ish when I was in medical school but was never a commenter until Substack. You’ve been a wonderful influence and teacher these years. Please take my annual membership as thanks!


> I think if you get a very big blob, arrange it very cleverly, and give it lots and lots of training data, you’ll get something that’s smarter than humans in a lot of different ways. In every way? Maybe not: humans aren’t even smarter than chimps in every way. But smarter in enough ways that the human:chimp comparison will feel appropriate.

A bit tangential to your main point, but the hard part here will be getting the right training data. Just processing more text will not let machines learn how to do original scientific research. Maybe the actual thinking processes of a million scientists would though...


Could we somehow inject ourselves with more neurons to get even smarter?


Chimps are smarter than cows, but both are equally bad at designing better computer chips, and, evolutionarily, chimps are way less successful. I don't think the intelligence explosion is guaranteed, and I don't think the primacy of superintelligent AIs is either.


I agree with 1 - 3.0 up until "The bigger the blob and the cleverer the arrangement, the faster and more thoroughly it learns...I think if you get a very big blob, arrange it very cleverly, and give it lots and lots of training data, you’ll get something that’s smarter than humans in a lot of different ways."

Maybe. All trends only hold to a certain extent. It's possible that the intelligence train blows past human-level and rapidly proceeds to superhuman level. It's also possible that there's a limit. My personal belief is that "Bitter Lesson" will hold up to human or slightly above-human levels, but that's it; at that point, we literally won't know how to build the necessary optimization functions to go farther.

Jul 25, 2023·edited Jul 25, 2023

People tend to get hung up on the word intelligence pretty easily though

All the claims along the line “AI aren’t really intelligent because they don’t really understand reality, here’s really obvious mistakes AI makes that it couldn’t if it was really intelligent” are because of assumptions that come with the word intelligence

But you don’t need a lot of the correlates of intelligence, like consciousness or cognitive consistency, to have an intelligence explosion

Wouldn’t it be better to use a different word without all this baggage? Yud tends to focus on the ability to model reality.

You could call it the Mentalization Explosion instead of Intelligence Explosion and save a lot of needless miscommunication


I came to this post agreeing with the tweet. I don't feel this counterargument fully satisfied my doubts. Hoping I can coax the argument that convinces me out of someone.

I'm tempted to say there's a motte and bailey here, where "intelligence is a correlation of SAT scores and etc." is the motte, and "more intelligent things always conquer the less intelligent" is a bailey. Arguably this disproves the tweet as written, but I still feel like doomers tend to notice correlations between "intelligence" and many other qualities, reify the entire thing, and call that "intelligence." That reification would then cause them to not notice when doomerism arguments switch the motte with the bailey.

Notably, while the discussion of AI trends does address the idea that general comprehension skills are a useful concept, it doesn't affect the fact that, as far as I know, "emergence of agentic behavior" is pure sci-fi and yet a thing that people keep expecting to see because "intelligence." (I know there have been some writeups that go through the causation carefully, but I don't fully buy those and I think a platonic view of intelligence is part of why others are so credulous. Difficult to falsify, admittedly.)

One thing that would force me to question my perspective is a demonstration that IQ is correlated with more coherent/optimized goal-seeking. That is, do people with high Math SAT scores act more agentically and less like aimless balls of hormones than normal people? Relatedly, are the correlations between IQ and "general success" that Scott has mentioned before actually causal, or does it route through something like "people with good genes are also smart" or "tech jobs are high-paying right now."


The Wolfram plugin for ChatGPT might be a disproof of your claim that hand-coding doesn't work well.


It seems useful to assign something like intelligence to LLMs or other "neural" networks. The implication is, as you say, that at some point some ML setups will be more intelligent than humans with some positive correlation value. Whether this correlation will be high enough to have human-like drive and become dangerous to humans, who knows.


I think (3) relies on a false equivocation. It has certainly been useful in AI discourse to talk about ML architectures in terms of general intelligence, but the abandonment of domain-specific rule systems in favour of training on generalized datasets does not establish that the data plus the training is "general" in a sufficiently like manner.

We could just as well talk about human and ML intelligence being "non-specific". Perhaps it would then be clearer that AI could be "non-specific" in all sorts of ways that don't replicate human abilities or lead towards an explosion. Favouring "general" and its connotations ushers in a lot of presuppositions that are not grounded in fact or argument.


I’m amazed that you can actually respond CALMLY to that tweet. Sorry. Not adding anything of substance here.


Just on the subject of concepts, this is yet another case where Ayn Rand's thought is useful - specifically her work on (I really think Yud's dismissal of her was unfortunate).

Concepts are not just loose bundles of similar things. They are mental groupings of entities that share the same essential characteristic. What's an "essential characteristic"? That one on which the most other characteristics depends.

For example, "bodily strength", the essential is "the ability to exert force - produce mechanical change".

When you grasp that, _of course_ grip strength correlates with lifting strength etc.

I haven't read through the piece, but this answers all the stuff about "intelligence". What is the essential characteristic of intelligence? The ability to select for effective action. And the instant you know this, you see why AGI is terrifying.


The problem with the word intelligence is that people, like this guy, mistakenly equate intelligence (as a noun) with thinking (as a verb). Computers do not, because machines cannot, think. But computers can possess things, such as intelligence.


> All human intellectual abilities are correlated.

Are all non-human intellectual abilities correlated? Because AI is non-human. The relevant philosophical question here is not whether human intelligence exists. It's whether a tree is failing at being a human because trees are obviously dumber than humans by human standards. Note that this is completely different than size: both humans and trees trivially take up space in the same manner. But they do not trivially think in the same manner. Dogs don't trivially think in the same manner either.

So how can you assume that AI will do anything that resembles thought? That it even has intelligence, whether defined as a cluster of traits we recognize as intelligence in humans or as a simple platonic form?

This entire discourse relies on two assumptions: that human intelligence is a universal (rather than human) trait and that intelligence is a trait so important that more than human amounts of it are a superweapon against which there can be no defense. And, to be honest, it has a very "calculate the number of angels that can dance on the head of a pin" quality because both those assumptions are rather hard, perhaps impossible, to explore.


> But also, humans are better at both the SAT verbal and the SAT math than chimps,

The SAT is blatantly species-ist.


I suppose that an underlying point of this essay is that intelligence is a coordinated amalgamation of many narrow skills (it's even possible that these skills are multi-use tools, and amalgamate temporarily when triggered by the appropriate environmental conundrum). If that's the case, it may even be that very narrow tools become associated with each other over time as they work successfully together. This would result in a population that possesses broadly similar minds, but which are unique in the details. No wonder hand-coding it didn't work.


The analogy describes an intelligence *increase*, but not an intelligence *explosion*. You’d need a whole big pile of boxes, that each needed a slightly different level of strength, and no matter how I torture the metaphor it still doesn’t feel like exponential growth.


I feel like the kind of intelligence you talk about in section 2, while relevant when talking about humans (and to a lesser extent other animals) is less useful when you talk about AIs because a lot of the correlations break down.

I mean Kasparov is the best human at chess and 99th percentile in IQ. On the other hand, the best AI at chess is literally incapable of taking an IQ test. The current generation of LLMs are great or even superhuman at some things like general command of the English language or breadth of knowledge or ability to write a reasonable poem in under a minute, but bad at other things like being able to learn new concepts from little data, arithmetic, or mildly complicated reasoning.

Now the kind of intelligence you talk about in section 3 definitely is relevant to AIs, but seems like it is the much more specific thing of being able to recognize and reproduce patterns efficiently rather than being good at this basket of correlated cognitive tasks.


> Concepts are bundles of useful correlations.

I agree, strongly; but I think this implies that A) Hume was right about causation (there is no such thing, only correlations of events); and B) it doesn't matter, because correlations between (time-stamped) events do everything that "causation" can do in the real world, and a lot of things that it can't.

Jul 25, 2023·edited Jul 25, 2023

I'm usually the first person to accuse anybody of Platonism, but in this case I disagree with the charge. I think "intelligence" is similar to "computational power".

A computer can have great computational power by some standard measure, and yet be too specialized or too much of a pain to program for most applications. Think of graphics cards that can do lots of FLOPS, but only on tasks that can be parallelized in a very specific way. One could argue that AI is like this, if the future of AI were simply the series GPT5, GPT6, GPT7...

But AI isn't a single algorithm. The improvement of AI is due to improvements in algorithms and increases in computational power. So it isn't analogous to putting more and more GPUs on a graphics card; it's analogous to the ability to put more and more transistors on a chip. (And the "increases in computational power" half of it literally IS the ability to put more and more transistors on a chip.)

Putting more transistors onto a chip doesn't imply that that particular chip has more general-computational power than some other chip with fewer transistors. But it does clearly let the community of chip-designers make chips with more computational power than the current generation of chips, even though "computational power" isn't clearly defined.


Another interesting test to add to the concept layer is whether you can communicate it to another world-modeler and have them understand what you mean or, to be more rigorous, use it in a way that is predictably consistent with the way you meant it.

I do see intelligence in terms of prediction. We are all moving through time, trying to guess what’s next, and intelligence is the ability to be good at figuring out what’s going to happen next and figure out what you should do to move to a preferred future.

I think your definition of blob of compute is slightly wrong. But only slightly. It’s like we are in an attic with holes in the roof and we blow dust to illuminate sunbeams. So you have to kind of think about which sunbeams are useful. Predictive text was something I thought would be fruitful even before OpenAI, because I think “what do you mean?” is one of those useful sunbeam clusters. The thing we think about for humans in particular is the ability to model another world-modeler's futures, and you have to have it for communication.


Yeah, you can get pretty cute with definitions (it's like 90% of legal practice), but it can be tricky to know when you're on to something or just being annoying. Or to be more precise, you can introduce unfalsifiability to almost any argument by futzing with definitions enough.

But I do think this is where the AI Doomer error is. Not that you can't bundle a bunch of stuff and call it intelligence, but that some of the stuff you're putting in the bundle doesn't belong. I'm very intelligent (according to lots of biased people at least) but still struggle with agency and executive function. A machine can almost certainly be built that can answer every question on every test, but that doesn't model the world or have any true agency.

Jul 25, 2023·edited Jul 25, 2023

I think you're conflating two fairly different ideas in this post. First idea: "intelligence" makes as much sense as "strength" and it is coherent even for a non-platonist to discuss the intelligence of machines or to imagine machines becoming much more intelligent than humans or going through an intelligence explosion. Second idea: a good way to increase the intelligence of AI is to simply scale up the computational power and not worry about having lots of clever or domain-specific ideas.

Of course, these ideas are related to some extent: the second idea doesn't really make sense without buying into the first idea. But they are still pretty distinct! It is totally possible to believe that talking about intelligence is coherent without believing that "stack moar layers" is a good approach to AI. As an analogy, one could buy into everything you say about strength but not believe that "build bigger engines" is the best way to create stronger machines.

Jul 25, 2023·edited Jul 25, 2023

I think you're correct that it's practical to roughly think of 'intelligence' as being a blob of stuff that, if you have enough of and arrange it appropriately, can create meaningful and general abstractions and make very accurate predictions. However, the 'doomer' argument requires more assumptions about the nature of this blob of intelligence: namely, that a sufficiently sized blob will be inherently power-seeking. This also requires that it will act in the real world to optimise for a particular goal, and that the goal that will likely be programmed into it will be of a type that would cause irreversible damage if sufficiently optimised.


I think this post is basically right that "intelligence" is a meaningful thing to talk about, in much the same way that "strength" is a coherent thing to talk about. That said, it matters a lot why exactly you're asking the question. If you're asking "who is stronger, Mike Tyson or Granny" as a proxy for the real question "who could move this large pile of dirt faster", then it matters a lot if Granny owns an excavator (and, for that matter, whether Mike Tyson has or can rent one).

As far as I can tell, the arguments for an intelligence explosion rest on the assumption that "how much stuff can an entity do" depends mostly on the question of "how intelligent is that agent". But that doesn't seem like the whole story, or even most of the story. Increases in human capabilities seem mostly tied to our ability to use tools, and our ability to coordinate multiple people to accomplish a task. For example, the populations of the Americas diverged from the old world thousands of years before either group developed agriculture, and both groups developed cities and civilizations relatively quickly after developing agriculture. So it clearly wasn't a thing where there was a threshold in intelligence where agriculture and civilization suddenly became possible.

In order to get the "FOOM" flavor of intelligence explosion, where a single entity reaches a threshold level of intelligence such that it can inspect and improve its own operation, which allows it to better inspect and improve its own operation, etc., and become so smart that everyone else combined is as insects, one of the following would have to be true:

1. Actually, having and knowing how to use tools doesn't matter that much. Beyond a certain level, you can predict the results of your actions well enough that all you have to do is whisper to a butterfly, and it will flap its wings in just the right way that matter ends up in your preferred configuration 6 months later.

2. Having and using tools is important, but being smarter lets you build and use better tools, and at a certain (fairly low) threshold, you get better at making and using tools than everyone else combined.

3. Having and using tools is important, and the future will belong to whatever agent figures out how to use the tools other agents have built in order to improve its own tools and processes, and does not share any of its own advancements back, which acts as a ratchet (that agent only ever gets more capable relative to the rest of the world, never less) and eventually results in an agent that is more capable than everyone else combined.

Now, of course, a FOOM-style intelligence explosion isn't the only possible way to end up with a cycle of capability advancement. But it is the way that ends up with the most worrying dynamics like "there is an intelligence threshold for AI such that the first AI to cross that threshold will determine the shape of the future, and so to survive humans must build that AI exactly correctly on the first try without being able to iteratively test it first".

Jul 25, 2023·edited Jul 25, 2023

I get all that. But it still seems to me that there are differences in kind, not just differences in degree, between GPT4 and something smart enough to be a real problem to us if it's not aligned. The really important differences between GPT4 and human level AI and/or superintelligent AI seem to be differences in the *kinds* of processes they can engage in, rather than in how well each can engage in the process. GPT4 currently is unable to recall its recent activities or recognize people who have prompted it before. Within the process of responding to a prompt it is unable to review what it has said so far, in order to orient itself for future action with the benefit of past action. All of these incapacities are deficits in what we would call self-awareness. GPT4 also shows no signs of being able to reflect about itself or anything else when not prompted, and no signs of having preferences and goals. It is also unable to give itself prompts that are generated by basic needs to sustain and protect itself (such internally-generated prompts are called *drives* when they occur in animals). In fact, it is unable to produce self-generated prompts at all. Moreover, it seems to utterly lack self-interest. GPT4 is essentially in a coma except when executing a prompt given by a human being. It also seems to be rather poor at reasoning, and very lacking in inventiveness, and in some of the capacities on which inventiveness rests, such as recognizing subtle but important isomorphisms. It is also not teachable in the usual sense. You can't give it an essay about, say, the reasons that intelligence is a useful and meaningful concept and expect it to be able to perceptively apply the ideas in the essay to other concepts, such as kinkiness, snobbishness, musical ability, beauty, malignancy. GPT4 will have learned about ways to talk about concepts, rather than ways to think about them.

Most of the capacities I named above as capacities lacking in GPT4 are not lacking in animals or in Einstein, who had self-awareness, self-interest, personal memory, drives and preferences, and teachability about their environment (as opposed to teachability about the word sequences humans use to discuss their environment). So it makes sense to compare the intelligence of all of these beings, each of whom *has* the same cluster of capacities, though some exercise that capacity at a more sophisticated level than others. I'm not sure, though, how much sense it makes to compare the intelligence of, say, GPT4, which lacks many of these capacities, with human intelligence. It would be like comparing the strength of Arnold Schwarzenegger with that of somebody who has no arms.

So I think the guy in the tweets has a point regarding intelligence not exactly being a *thing*. Most of the crucial functions I named are present in all species, and that makes it make sense to compare them in intelligence with each other, with us, and with Einstein. But GPT4 (and, I'm guessing, later versions of GPT that have similar architecture) performs quite well at some functions, but is utterly lacking in other crucial functions. So I am not sure how meaningful it is to talk about how intelligent AI is that lacks self-awareness, self-interest, drives, goals, personal memory and the ability to learn from experience (rather than from ML-type training).

Jul 25, 2023·edited Jul 25, 2023

"Across different humans skill at different kinds of intellectual tasks are correlated."

That says precious little about whether skill at all different kinds of intellectual tasks is correlated anywhere else.

I can pretty easily imagine an AI that's good at producing stories - not at all good at much of anything else, including judging whether stories are true. It's called ChatGPT.

I can also imagine a robot that's strong in some ways (for some tasks) and flat out incapable of others.


The MIRI decision theory that they think is key to alignment does presuppose a kind of Platonism - the best way to understand it is that one conceives of making the decision on behalf of an *algorithm* rather than a particular physical instantiation of it, and sees which output for this algorithm being universalized would result in the best outcome. (Though maybe there’s another way to phrase it - it’s less clear to me that Kant’s formulation is committed to Platonism.)


I think it was legendary baseball author/stat-head Bill James who told me [completely paraphrasing]

-- Good Science, good arguments, give you lots of STUFF TO DO. Especially PREDICTIONS. What does the theory predict? Well, let's run an experiment, test the hypothesis, and measure the results, see if they agree with the predictions.

I think arguments and theories that are largely proven and backed up by evidence, are valuable because they provide a deeper physical framework and view that you can use to make additional predictions about the [system you are studying].

Here is where I say that Tyson is "strong" and that tells us some likely correlations about his general physical health, metabolism, quickness/neuromuscular co-ordination, etc. If the value of science is correlation and prediction, well of course LLMs trained on the corpus of human speech will make startling and "intuitive" correlations. But human thought has, maybe, no underlying physical basis or like, quantum-reality.

The corpus of human speech is "true" but is it meaningful? Hmmm...


Here is where I swerve off the road because I realized in the middle of this that Mike Tyson is almost a perfect small model for transformative unaligned world-changing AI


We know, from some of the most direct, irrefutable, and culturally transformative video and human-reported evidence of my LIFETIME, that Mike Tyson is strong. (1) His unprecedented ability to, almost instantly, beat another [huge and strong] human being senseless,

---- well actually it's like the GWERN "Clippy" story, he was so otherworldly dominant --- Holy crap!

Tyson is strong, and was AI-level destructive, because he was

-- born with genetic gifts

-- Trained relentlessly

-- Was fed an incredibly difficult and skewed training dataset that had at its premise that his only value as a human being, aka his "reward matrix", was to maximize the beating of other human beings senseless.

-- Had, in addition to his physical gifts, a superior target evaluation model that discarded irrelevant info and focused on the target loop, leading to incredibly quick gains and vast mismatches in learning and decision speed.


-- Also, it seems, had an incredibly strong and resilient self-model that has somehow recovered from disasters and physical truncation and continues to contribute and be studied to this day.

I'm gonna wrap up this comment while I ponder the ways that Tyson was "invisible" in 1980-84, a terrifying rumor spoken of in awe by the few who had seen the Miracles.


Jul 25, 2023·edited Jul 25, 2023

"How can any cognitive capability be self-amplifying?" asks man on rage spiral generation app.


There's a big difference between the concept of intelligence and the concept of strength. We know of other animals that are stronger than humans. Therefore, we can be certain that we are using a concept that generalizes across humans and non-humans in the case of strength. But we only define intelligence with regards to human capabilities. At least as far as we can prove.


Can anyone tell me what is the way in which humans are not smarter than chimps? I don't have time to read the full book review right now.


For the Platonists out there, here’s some grounded discussion on whether AI can meaningfully engage in the noetic world of forms and ideas:



"g" is straight-up Platonism. Don't be defensive about it, a Platonist is a perfectly respectable thing to be. You could also put it in Kantian terms: g is the noumenal behind the phenomenal forms we call "witty lyricist" or "good at crossword puzzles". Platonist is not a dirty word!

That doesn't mean I agree with you.


> superintelligent machines are no philosophically different than machines that are very very big

Agreed. Problem is, in order to design a big machine you usually cannot just scale a small one. A switch which can break thousands of volts is not a scaled-up wall light switch: it is designed radically differently (e.g., the blades usually have to be submerged in oil, not air, to prevent arcing). Scaling also comes with its own tradeoffs: it would be extremely difficult (if at all possible) to design a human-sized robot as agile as a mouse.

Jul 25, 2023·edited Jul 25, 2023

1. African elephant 257 000 000 000

2. Human 86 000 000 000

If all intelligence takes is a higher neuron count, training data and time, why aren't elephants three times as smart as humans? Why are they actually vastly dumber than humans? Why was Einstein much smarter than most people despite having pretty much the same number of neurons, training data and time? You can't explain this as the correlation being less than 1.

I'm agnostic as to whether an intelligence explosion is possible. But I'm 100% certain that it won't happen in anything deep learning based. Deep learning is not capable of extrapolating beyond either what's in the training data, or in the simulator that produces training data (in the case of board games). People keep confusing interpolation in a vast database for intelligence because humans can't do it, so they have no reference point, and intelligence is the only other thing that can produce the same effects.
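For anyone who wants to check the "three times" figure, here is a minimal sketch of the arithmetic. It assumes the two neuron counts listed above are right (they are taken as quoted, not independently verified), and the names `NEURONS` and `neuron_ratio` are just illustrative.

```python
# Neuron counts exactly as quoted in the comment above (assumed, not verified here).
NEURONS = {
    "African elephant": 257_000_000_000,
    "Human": 86_000_000_000,
}


def neuron_ratio(a: str, b: str) -> float:
    """Return how many times more neurons species `a` has than species `b`."""
    return NEURONS[a] / NEURONS[b]


ratio = neuron_ratio("African elephant", "Human")
print(f"Elephant-to-human neuron ratio: {ratio:.2f}")  # prints 2.99
```

So "three times" is a fair rounding of the raw count; the sketch says nothing about where those neurons sit or how they are wired.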


> Why does this work so well? Because animal intelligence (including human intelligence) is a blob of neural network arranged in a mildly clever way that lets it learn whatever it wants efficiently.

”Mildly clever” feels premature given that we still have no idea how the human brain works. And until we do it feels epistemically dodgy to use ”neural network” as an undifferentiated catch-all for both MLPs and brains.

The root problem is thrown into relief if you add ”can navigate novel physical environments” to your list of intelligence criteria. We won’t solve that by hand coding linguistic rules, but we probably also won’t solve it by adding layers to gpt.


Much of human intelligence is about sex appeal, because sexual selection is how we evolved. Us male men, we try to impress the women with how well we talk, groom, impress and manipulate other humans, solve problems, make money, write poetry, make art, play sports, sing, play guitars, etc.

We don't benefit from having sex with whales so we don't try to impress them. We don't even know what would impress them. Nature isn't interested in us using our intelligence in a way that would sexually attract whales.

Thus, our notion of intelligence is very constricted to the human domain.


"These considerations become even more relevant..." Sorry, the considerations just don't seem that compelling. Corollary: "IQ correlates to positive life outcomes" is a statement of pure value judgement, because appropriate life outcomes are culturally specific. Martyrdom is a perfect example; in many cultures past and present, martyrdom and widow-suicide are considered excellent life choices. And not just by tribal people in some steaming jungle... We all know (?) that Al Qaeda members were/are disproportionately engineers.


I think the 2nd tweet makes a valid point, which you're not really engaging with in this article. In alignment discourse I often notice the tacit assumption that a system which is intelligent thereby also has agency, self-preservation instinct, goals, and/or desires. It's quite likely that people don't feel the need to explicitly argue for these properties as much as they otherwise would, precisely because of this kind of "platonic ideal" of intelligence (a "soul", even if they wouldn't admit it) being ascribed to AI systems.

Expand full comment

Good post. Might want to look into prototype theory of concepts, very similar to this.

Also. Intelligence in adults is more than 80% heritable. See here. https://www.emilkirkegaard.com/p/heritabilities-are-usually-underestimated

Brain size and intelligence is closer to r = 0.30 than 0.20 (Cox et al 2019). https://www.emilkirkegaard.com/p/brain-size-and-intelligence-2022

Expand full comment

That last paragraph is nonsense. You seem to be missing what the thing is and what it is for. Beavers, for instance, went through an intelligence expansion in their evolutionary history and then plateaued. The "intelligence explosion" in humans, from recognising markings of ownership on vases to the adaptable Greek alphabet, likewise was randomly fitted through environmental fitness and a social type of being. The intelligence of beavers should produce its own values as a distinct category, and the same should be for AI, but it can't expand without an agency because machine intelligence is a tool. The only intelligence explosion we can attribute to machines will be at our current plateau of organization. If you really mean just engineering feats for humans, you should just say so, but I doubt there can be an expansion of human valued intelligence without either our expansion as the placeholder or an AI with biological similarity for environmental fitness and social organization: Grey-goo the Wise.

Expand full comment

On one hand, the distinction between [assuming X axiomatically] and [arriving at X through reasoning about evidence] is meaningful and valid, and as (we seem to share a worldview in which) the latter is better than the former, I can see why you'd insist on making it.

On the other, once you start using X in your further arguments, both boil down to [assuming X]. And once you get your priors trapped at X, it boils down to [assuming X axiomatically] regardless of how you originally arrived at it.

So yes, it's wrong and judgemental to accuse others of Platonism. But I don't think you'd disagree that there exist real factual differences between you two about the degree to which the concept of "intelligence" is useful and meaningful, and you, of all people, should know full well that we humans are bound to keep making mental shortcuts straight to the city centre.


And while we're at it, he's right. At the very least, there's an important difference between intelligence as in "Artificial Intelligence", which is a field of science dedicated to making machines solve demanding cognitive tasks, and intelligence period, which is the inherent ability to perform those tasks. You do mix up the two, in a way that extrapolates from the former to the latter.

I mean, to plant my flag here, the entire section 3 of the post is just completely backwards. There is a valid retelling of the same events, where some people tried to imbue formal reasoning and domain knowledge and other explicit aspects of intelligence into their systems, and then a bunch of other people came, did a simple statistical analysis on gigantic chunks of data, and got better practical results (in the field of Artificial Intelligence, while not getting any closer to actual intelligence).

Expand full comment

Counterintuitively, I find this quite encouraging. But it's kind of hard to explain why.

My coworker Dave, who's way smarter than me, can write an algorithm that I can't understand, which performs 100x better than my algorithm. The difference between code / a machine / an algorithm that kinda sorta works and one that performs near peak efficiency is huge, and the difference between being able to create the better algorithm and not is a matter of intelligence.

So, if AI was like an algorithm, then I can imagine it getting smart enough to rewrite itself from my version to Dave's version, and getting even smarter, and rewriting itself again, and its intelligence forming an exponential curve into stratospheric, Singularity heights.

But the point of this article is that intelligence isn't Platonic, and AI isn't like that. It's a giant blob of compute and training data, and so there's no way to rewrite it (like Dave rewriting my algorithm) and suddenly get 100x efficiency because of being more intelligent. It's much easier to imagine a blob of compute and training data running into diminishing returns and slowing to a stop.

So I think that while this establishes that an intelligence explosion is a coherent idea, it makes it seem much less likely.

Expand full comment

“Basically, we created this word and used it in various contexts because we're lazy, but then some people started assuming that it has an independent reality and built arguments and a whole worldview around it.

They postulate an essence for some reason.”

The fact that we name stuff is a big deal with some philosophers who seem to not differentiate between absolute social constructs (ie religion, God, nations to an extent) and stuff we noticed and named. Like a tree.

Seeing and naming intelligence is not a performative utterance; it doesn’t create intelligence ex nihilo, it’s recognising what exists and can be measured.

Expand full comment

Stacking layers reminds me of the early days of aviation, when it seemed obvious that the triplane was the successor to the biplane.

Expand full comment

I'm not your A&P professor, but, one biceps brachii, two biceps brachii.

Expand full comment

One idea that I'm curious about, that I've never received a satisfactory response for.

Why isn't AI safety a Pascal's mugging?

Expand full comment

My main issue with this kind of take is that it takes bundles of correlations which have been fairly thoroughly observed and tested *in humans* and then proceeds to claim those correlations will hold for AIs.

What we've observed in AIs is that yes, improving performance in one task does tend to improve performance in other tasks in a manner superficially similar to human intelligence - however, there are enough patterns where an AI which otherwise seems to be extremely clever fucks up in a way that a human never would that we can be reasonably sure that the *mechanism* whereby the AI is performing these tasks is significantly different from the mechanism a human being uses.

This, by itself, is not disqualifying toward the idea that it can be meaningful to attribute intelligence to an AI - but please remember that, in correlations, the tails come apart (https://slatestarcodex.com/2018/09/25/the-tails-coming-apart-as-metaphor-for-life/).
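For anyone who wants to see "the tails come apart" in numbers, here is a minimal sketch. It assumes two traits that are jointly normal with a correlation of 0.8; the function names (`correlated_pairs`, `top_k_overlap`) and the sample size are made up for illustration and are not taken from the linked post.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(n, r, seed=0):
    """Draw n pairs of standard-normal traits with target correlation r."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(r * x + math.sqrt(1.0 - r * r) * noise)
    return xs, ys


def top_k_overlap(xs, ys, k):
    """How many individuals are in the top k on BOTH traits."""
    order_x = sorted(range(len(xs)), key=lambda i: xs[i], reverse=True)
    order_y = sorted(range(len(ys)), key=lambda i: ys[i], reverse=True)
    return len(set(order_x[:k]) & set(order_y[:k]))


if __name__ == "__main__":
    xs, ys = correlated_pairs(10_000, 0.8)
    print("measured r:", round(pearson_r(xs, ys), 3))
    print("top-100 overlap:", top_k_overlap(xs, ys, 100), "of 100")
```

Even with a strong overall correlation, the people at the extreme of one trait are only partly the same people at the extreme of the other, which is the point being made about extrapolating "intelligence" to the far tail.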

The idea of "intelligence" is a useful enough framing to talk about AI because there is some correlation between how "smart" an AI is and how an equivalent human would act - but there is no reason why this must hold indefinitely - especially for LLMs, which get all of their "smarts" from the structure of recorded human writing.

It feels like, in trying to make an LLM smarter, eventually we will start hitting a wall based on the fact that an AI that has learned everything it knows from human artifacts has no (simple) way to figure out things that humans can't do - and it is likely that, as it tries, it will start exposing the fact that its "human-intelligence-simulation-mechanisms" are very different from how humans reason and are not subject to the same correlation-generating constraints. Hence, I think it is very likely that the correlation which allows us to describe AI capabilities as "Intelligence" will eventually break down.

Of course, none of this rules out the idea that AI could eventually do Very Bad Things to us and that alignment might be a huge issue - but I don't think it can be boiled down to "AI will be very very smart and hence dangerous".

Expand full comment

The problem with this argument is that we have mountains of evidence that the correlations between intellectual strengths that we call IQ are a fact about human and perhaps animal biology and NOT a fact about intelligence-as-such. To give but three examples: Deep Blue is superhuman at chess, but 100% stone illiterate, Watson is superhuman at Jeopardy, but almost certainly couldn't even make valid moves in chess, and a pocket calculator is superhuman at finding square roots, but can't play chess or Jeopardy at all. This doesn't even scratch the surface: you'd be hard pressed to find any computer system that isn't superhuman at some task, and yet none so far have had anything like the ability to form and achieve arbitrary goals that rationalists mean when they talk about capital I Intelligence. Taking the outside view, you have to have a very strong presumption that problems people have found with LLMs are real limitations and that they don't have intelligence-as-such either, though this isn't to say they aren't powerful or interesting or even dangerous. The alternative is an argument from ignorance: we don't really yet understand fully how LLMs work, so we assume that this time we've just finally made a computer program that's actually intelligent. Sorry, I don't find that at all persuasive. If you want to convince me that LLMs can achieve arbitrary goals, show me an LLM achieving arbitrary goals. That hasn't happened yet.

Expand full comment

I think you're cheating a little bit with the phrasing of "arranging your blob cleverly". Can you articulate exactly how that's different from building in specific computational structures/abilities, like the "Responsible People" were trying to do? Like, I agree that a transformer is way way more general and blob-like than hand-coding in specific linguistic rules or whatever. But I've found it tricky to draw a line between what the people who don't get the Bitter Lesson are doing wrong and what the people who invented transformers are doing right.

Expand full comment

As I've pointed out many times before, "intelligence" is certainly a coherent concept--but its meaning can be summarized as, "like an ideal human". (That's why the measure of artificial intelligence that everyone comes back to--despite its obvious glaring flaws--is the Turing test.) And it's not at all clear that "ideal humanness" is even fundamentally quantitative--except in artificially imposed ways--let alone subject to "explosion".

(I wrote about this at length, somewhat snarkily, in 2015: https://icouldbewrong.blogspot.com/2015/02/about-two-decades-ago-i-happened-to.html)

Expand full comment

> how any set of cognitive capabilities can be self-amplifying [from the twitter screenshot]

This is ... obvious to cognitive psychologists, because that's how human learning works? It even has a name in education research: the Matthew Effect (a.k.a. The Rich Get Richer).

Basically, give a class of kids some text to read; the ones who have better vocabulary, domain knowledge, or both, will learn more from the same text than the ones who don't. This is a real stumbling block for some attempts at progressive education of the no-child-left-behind kind, because even if you give the whole class a text at the level that its weakest student can sort of understand, the top kids will still get more value out of the same text, up to a point.

Some papers describe the Matthew Effect as a "positive feedback loop", which sounds awfully like a synonym for "self-amplifying" to me. It's particularly pronounced in reading comprehension, so you'd expect to see the same if you're training GPTs instead of schoolkids to "read" and "comprehend" in a general sense of these words.
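To make the "positive feedback loop" reading concrete, here is a toy sketch. It assumes, purely for illustration, that what a reader gains from each text is proportional to what they already know; the function name `simulate_reading` and the 10% gain rate are invented, and the "up to a point" saturation mentioned above is deliberately left out.

```python
def simulate_reading(initial_knowledge, rounds, gain_rate=0.1):
    """Knowledge after each round when gains scale with current knowledge."""
    knowledge = initial_knowledge
    history = [knowledge]
    for _ in range(rounds):
        # Self-amplifying step: the more you know, the more you learn.
        knowledge += gain_rate * knowledge
        history.append(knowledge)
    return history


if __name__ == "__main__":
    weaker = simulate_reading(1.0, rounds=10)
    stronger = simulate_reading(2.0, rounds=10)
    print("gap at start:", stronger[0] - weaker[0])
    print("gap at end:  ", round(stronger[-1] - weaker[-1], 3))
```

Under this assumption both readers improve on the same texts, but the absolute gap between them keeps widening, which is the "rich get richer" pattern.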

Expand full comment

I agree with the general thrust of this argument, but defining concepts as “bundles of correlations” is both overly narrow and misses the point of the original tweet about platonic ideas. This may be true of “inductive” concepts like strength that are based on observation, but omits higher level abstract concepts like “justice”, “good”, or “beauty”. Those ideas are not rooted in correlation, but in the network of high-level values that form the basis of our identity.

Also, the framing of “intelligence” by the author is a bit one-sided. There are currently multiple, divergent definitions within cognitive science that refer to the ability to learn, to reason, and to adapt to new situations. Charles Spearman, an early 20th century psychologist, argued that intelligence is a single, general cognitive ability (the g-factor) that underlies all specific mental abilities. Psychologist Howard Gardner posited the theory of multiple intelligences: linguistic, logical-mathematical, spatial, bodily-kinesthetic, musical, interpersonal, and intrapersonal. Then there is the triarchic theory of intelligence proposed by psychologist Robert Sternberg, which posits that intelligence is composed of three components: analytical, creative, and practical.

A better, broad definition of intelligence goes beyond humans and highly evolved animals, to include all life forms and even machines. We should equate intelligence with the process generally considered to be at its core: problem solving. Problem solving can be described as the process of identifying, analyzing, and resolving problems or obstacles to achieve a desired goal or outcome.

How complex this process really is only became apparent when people attempted to replicate it in a machine. In 1959, Allen Newell and Herbert Simon developed a machine called the General Problem Solver. To do this, they formalized the concept of problem space, an abstract space containing an initial state, goal state, path constraints, and operators (rules) defining how to move within it. Their insights were tremendously important and continue to have applications in machine learning until today. However, as they honestly conceded, their method encountered major obstacles that apply not only to machines, but to all life forms trying to solve problems.

First and foremost is combinatorial explosion. Whether digital or physical, large problem spaces with complex topologies of constraints and operators lead to a number of possibilities that exceeds computational abilities. Chess, for example, has an estimated 10 to the power of 120 possible games. This is known as the Shannon number and is larger than the estimated number of atoms in the observable universe. There are too many options to list, let alone compute.
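As a rough check on where that figure comes from, here is the back-of-the-envelope arithmetic usually attributed to Shannon: roughly 10^3 possibilities for each pair of moves (one by White, one by Black) over a typical game of about 40 such pairs. The 10^80 figure for atoms in the observable universe is a commonly quoted order-of-magnitude estimate, not a precise count, and the variable names are just for illustration.

```python
# Shannon-style estimate of the number of possible chess games.
POSSIBILITIES_PER_MOVE_PAIR = 10**3   # assumed: ~30 options per side, ~10^3 per pair
MOVE_PAIRS_PER_GAME = 40              # assumed typical game length

# Commonly quoted order of magnitude for atoms in the observable universe.
ATOMS_IN_OBSERVABLE_UNIVERSE = 10**80

possible_games = POSSIBILITIES_PER_MOVE_PAIR ** MOVE_PAIRS_PER_GAME
games_per_atom = possible_games // ATOMS_IN_OBSERVABLE_UNIVERSE

if __name__ == "__main__":
    print("possible games ~ 10^%d" % (len(str(possible_games)) - 1))
    print("games per atom ~ 10^%d" % (len(str(games_per_atom)) - 1))
```

So even if every atom could store one complete game, you would fall short by a factor of about 10^40, which is why exhaustive enumeration is off the table.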

The second limitation is ill-defined problems. The concept of problem space assumes that goal state and constraints are known and well defined. However, due to the frequently poor quality of available information, this is rarely the case. “How to write a good essay?” My initial state is an empty page. My goal state is a good essay. Where is the path to a solution? Frankly, I don’t know. I make it up as I go along. Ironically, the vast majority of situations we encounter are precisely of this kind.

This is related to what philosopher Christopher Cherniak calls the finitary predicament. According to Cherniak, the finitary predicament arises because our minds are finite and limited in their capacity to process information. We can only perceive a small fraction of the physical world, and we are limited in our ability to comprehend the complexity of the universe. As a result, our understanding of the world is necessarily incomplete and imperfect.

I feel that in the current A(G)I debate, it is important to clearly define the conceptual boundaries and definitions of these terms in order to have a meaningful debate.

Expand full comment

I was prepared to entertain this argument but your analogy to strength is so wildly incorrect that I gave up. If you spent any time trying to actually understand what strength is, how it works, how we understand it and *can measure it*, it would help you understand how unlike "intelligence" it is.

But you won't, because the belief in one-dimensional measurable intelligence is a Core False Belief of the Ideology of Rationalism, and leads directly to AI Doomerism. (See http://www.paulgraham.com/lies.html to understand how ideologies form around false beliefs) The screenshotted tweets at the top are simply correct.

Expand full comment

“Intelligence” is another useful concept. When I say “Albert Einstein is more intelligent than a toddler”, I mean things like:

Einstein can do arithmetic better than the toddler

Einstein can do complicated mathematical word problems better than the toddler

Einstein can solve riddles better than a toddler

Einstein can read and comprehend text better than a toddler

Einstein can learn useful mechanical principles which let him build things faster than a toddler

…and so on."

This seems to go against the very notion of "intelligence" meaning more than "knowledge" or "skills." It also seems to be an argument against the existence of g. Although a toddler may also be less intelligent than the same person would be as a grown adult, we frequently recognize very smart children that we would say are "smarter" than most adults, even if they don't know a lot. William James Sidis was clearly more intelligent than most adults when he was very small, but lots of adults also knew Greek, or whatever subject he was unusually good at from a young age, and almost certainly better than him.

Based on this definition of intelligence, almost every adult would be "more intelligent" than almost every child, but I don't think you believe that and I certainly don't.

Expand full comment

This illustrates perfectly how theoretical linguistics got divorced from NLP - at some point, NLP guys understood they can just get much more data than a two-year-old child learning the language and brute-force the problem instead of emulating parameters of Universal Grammar or whatever.

Expand full comment

Some concepts are bundles of useful correlations but not all concepts. And the concepts that are only bundles of useful correlations have one thing in common: They break down outside of the domain in which they are useful data compressions.

For example, an average dog has much higher biting strength than Mike Tyson or Scott's grandma, but they probably both beat the dog at freelifting. Meanwhile a hydraulic lift will beat Mike Tyson and Scott's grandma in lifting, but neither of them at wrestling. So Scott's explanation of strength kind-of holds for people but not for things very much unlike people. Compare this to a concept not defined by typical correlations, like height, where the Empire State Building is taller than Mike Tyson just like Mike Tyson is taller than I am (I have insufficient data to specify the height comparisons to Scott's grandma, but that is an actual lack of knowledge, not a lack of conceptual definedness).

Normally correlations can't be extrapolated beyond the domain they are observed on (think about it, a correlation coefficient is defined by the slope of a linearization and most functions aren't linear). And factor-analytical constructs springing from many such correlations almost never can be so extrapolated, which is why every entry level stats textbook warns against reifying them. It is of course also true that social scientists mostly nod along and then ignore all the warnings from baby stats, but that kind of sloppiness gets you a replication crisis.
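A small sketch of the extrapolation problem, under made-up assumptions: the "true" relationship is a saturating curve (the `saturating` function below is invented for illustration), we only observe it on the range 0 to 5, and then we extend the fitted line to x = 100.

```python
import math


def saturating(x):
    """Hypothetical true relationship: nearly linear at first, then levels off near 10."""
    return x / (1.0 + x / 10.0)


def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def extrapolation_demo(far_x=100.0):
    # Observed domain: x from 0 to 5 in steps of 0.5.
    xs = [i * 0.5 for i in range(11)]
    ys = [saturating(x) for x in xs]
    r = pearson_r(xs, ys)
    slope, intercept = fit_line(xs, ys)
    predicted = slope * far_x + intercept
    return r, predicted, saturating(far_x)


if __name__ == "__main__":
    r, predicted, actual = extrapolation_demo()
    print("r on observed range:", round(r, 3))
    print("linear prediction at x=100:", round(predicted, 1))
    print("actual value at x=100:", round(actual, 1))
```

Inside the observed range the correlation is close to 1, yet the line fitted there overshoots the true value far outside it several times over. A high correlation says how well a line fits where you looked, not what happens where you didn't.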

So no, there is absolutely no reason to expect intelligence to be a useful concept outside of humans in a vaguely normal intelligence range unless you think it is more real than other factor-analytical indices. In other words the tweeter is dead right: AI doomers, at least of the classical Yudkowskyan kind, expect the concept of intelligence to be useful far outside the domain that expectation would make sense in, and do so precisely because they are mistaking it for an actual thing.

Expand full comment

>Because animal intelligence (including human intelligence) is a blob of neural network arranged in a mildly clever way

No it's not.

Expand full comment

Altman has said intelligence is an emergent property of physics, I believe the implications are that intelligence is based on geometric invariants and symmetries. Encoder-decoder models are learning reduced representations during bottleneck compression, with the ambition of converging to platonic ideals and irreducible representations. After the ideal forms are captured, the training examples are no longer needed a la Wittgenstein's ladder.

Expand full comment

I think what you're missing is the assumption that you can scale things up by many orders of magnitude and still see the same correlations.


Expand full comment

No objections, but IMO engaging with a weak argument.

You've given examples of where a word reaches the limits of its powers of comparison. The question is, when does that happen for intelligence? When does intelligence break apart into the strength equivalent of biceps vs triceps, pound-for-pound, height leverage, etc?

I think it's fair to say that more/less intelligent doesn't well describe our relationship to ChatGPT already. It can write far faster than I can, it has a far greater breadth of knowledge than I have, etc. I'm better than it is at introspecting myself, at my own profession, etc.

There will come a time when an AI gets better than me at literally everything, with the last vestige possibly being reverse engineering my own mental state and knowing me better than I know myself. But none of this speaks to "foom!" per se. The explanatory powers of "intelligence" have broken down; they are currently not useful to describe the infinity dollar question.

But we can say some things about foom when we stop talking about "intelligence goes up" and start talking about specific goals. AI might be able to solve ECC but it cannot solve a one time pad. Things might move extremely fast, but every single type of scaling we know tapers off very quickly once it runs into its physics. Strength is a great example:


We can also talk about more useful properties of an AI than its "IQ":

1. How much information does it contain? (consider a human without any sensory input from birth)

2. How much information can it bring to bear on a question in a given unit of time?

3. How well can it avoid applying spurious information to a question?

4. How well can it avoid skipping information that would provide insight?

5. If 3 and 4 are inversely proportional, are they linearly or exponentially so?

6. (this list could be greatly expanded, I'd like to as well)

It's perfectly possible that an AI at 99% of its "intelligence" runs on the current total power output of the world and makes a couple insightful discoveries, but at 100% of its "intelligence" runs on all the stars in the universe and... proves that P != NP.

Expand full comment

I’ve worked in deep learning research and engineering for the past decade and I want to say that Scott Alexander is completely right, but I want to add some mathematical rigor to this. Also to push back on one common argument against this. And also clarify some historical stuff he was a little fuzzy on.

Neural nets are Turing complete when they’re infinitely deep. And they’re universal approximators when they’re infinitely wide. What does this mean for the non-mathematicians? It means that they can compute anything that can be computed. That’s a bit of a circular definition, but the relevant part is that they can do anything *in theory*. Since they can compute anything, this means that they are already, in theory, capable of superintelligence. So why aren’t they superintelligent right now, in practice?

2012 was the breakthrough year for deep learning. That was when Geoff Hinton’s techniques really started working in a way that was impossible to ignore. Historically, why they didn’t work well until 2012 was that they tended to badly overfit, meaning that they memorized their dataset exactly, but could not “understand” what they were learning [https://en.wikipedia.org/wiki/Overfitting].

So how did Hinton get around this? By using something called an autoencoder for layer-wise pre-training. Say that you’re trying to classify images of handwritten digits to figure out what digit it is. Rather than just train a neural net to take an image and output the number, you take the image and compress it down, and attempt to reconstruct the original image. What this does is it forces the neural net to learn abstraction layers, because the only way it can compress something down and still reconstruct the original is to learn common patterns in the data. It would learn how pixels are commonly combined into lines, and then shapes, and how those shapes are commonly arranged. This is referred to as representation learning. Layer-wise pretraining went out of style a long time ago for reasons that aren’t relevant here, but the point is that what makes neural nets powerful is their ability to learn abstractions. More data from more tasks leads to better representations and better abstractions.
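To make the compress-and-reconstruct idea concrete, here is a deliberately tiny sketch. It is not Hinton's layer-wise pre-training setup: it is a linear autoencoder in plain Python that squeezes 2-D points through a single number and back. The data, the initial weights and the names (`make_data`, `reconstruct`, `train`) are all made up for illustration.

```python
def make_data():
    # 2-D points that all lie on the line y = 2x, so there is really
    # only one underlying degree of freedom to discover.
    return [(t, 2.0 * t) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]


def reconstruct(w, v, x):
    z = w[0] * x[0] + w[1] * x[1]      # encoder: 2 numbers -> 1 number
    return (v[0] * z, v[1] * z), z     # decoder: 1 number -> 2 numbers


def mean_squared_error(w, v, data):
    total = 0.0
    for x in data:
        (r0, r1), _ = reconstruct(w, v, x)
        total += (r0 - x[0]) ** 2 + (r1 - x[1]) ** 2
    return total / len(data)


def train(data, steps=3000, lr=0.02):
    w = [0.1, 0.1]  # encoder weights (arbitrary small starting values)
    v = [0.1, 0.1]  # decoder weights
    n = len(data)
    for _ in range(steps):
        gw = [0.0, 0.0]
        gv = [0.0, 0.0]
        for x in data:
            (r0, r1), z = reconstruct(w, v, x)
            e0, e1 = r0 - x[0], r1 - x[1]        # reconstruction errors
            gv[0] += 2 * e0 * z                  # d(loss)/d(decoder weights)
            gv[1] += 2 * e1 * z
            dz = 2 * (e0 * v[0] + e1 * v[1])     # d(loss)/d(code z)
            gw[0] += dz * x[0]                   # d(loss)/d(encoder weights)
            gw[1] += dz * x[1]
        for i in range(2):
            w[i] -= lr * gw[i] / n
            v[i] -= lr * gv[i] / n
    return w, v


if __name__ == "__main__":
    data = make_data()
    print("error before:", mean_squared_error([0.1, 0.1], [0.1, 0.1], data))
    w, v = train(data)
    print("error after: ", mean_squared_error(w, v, data))
```

The only way the network can push two numbers through a one-number bottleneck and still get them back is to pick up the regularity in the data (here, that y is always 2x). Real autoencoders do the same thing with nonlinear layers and far richer patterns.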

Mathematically, rather than debate “what is intelligence” I think we should say that multiplying matrices together is a very general process that can learn any function in physics, meaning that it can learn anything happening in the real world. One very general problem is that these things learn “too well” in that they still often overfit. One of the main things we do in the field on a day-to-day basis is get them to generalize better and overfit less. Our current models are likely powerful enough to loosely memorize half the internet. But that’s pretty useless if it hasn’t learned abstractions. That’s what makes the Transformers (AKA soft attention with positional encoding) such a breakthrough.

What’s stopping us from having superintelligence now?

At a very high level, the answer is overfitting. Feeding it more data from different domains will help regularize it, meaning reduce overfitting and improve generalization to new tasks. So, yes, an intelligence explosion is possible but not guaranteed. More tasks and more domains will ultimately lead to improved performance across all domains as the neural net learns better abstraction layers that represent real-world physics. This includes language.

Also, a common argument against this is that LLMs are terrible at basic arithmetic. But this isn’t a fundamental issue with matrix multiplication, rather an issue with the activation functions we use. I think that Neural Arithmetic Logic Units might be a solution [https://arxiv.org/abs/2101.09530].

> In the middle of a million companies pursuing their revolutionary new paradigms, OpenAI decided to just shrug and try the “giant blob of intelligence” strategy, and it worked.

The “Attention Is All You Need” paper that invented the Transformer was actually out of Google. The authors also didn’t really understand how powerful their technique was and originally thought it was just a useful technique for translation and that was about it. Alec Radford over at OpenAI then had the idea to just use it for pre-training on English. In line with what Scott Alexander argues, Alec thought of it the same way with “Language Models are Unsupervised Multi-Task Learners.” [https://insightcivic.s3.us-east-1.amazonaws.com/language-models.pdf]. Basically, that getting better at one task would correlate with improvement on others.

> The limit of Jelinek’s Law is OpenAI, who AFAIK didn’t use insights from linguistics at all

This is correct that they don’t use linguistics, but I think you’re giving OpenAI too much credit here.

Expand full comment

You can define a numeric scale of "strength," with 3 representing somebody's metaphorical grandmother and 18 representing Mike Tyson, sure. You can come up with methods to estimate where an arbitrary human is on that scale, and use an arbitrary human's "strength" value to make reasonable predictions about what specific tasks that human can do, sure.

But, that doesn't mean it's meaningful to imagine a human with a "strength" of over 9000 and speculate about what that human might be capable of based on linear extrapolation. You can certainly come up with a meaningful number for what the grip force of a human with a 9000+ "strength" would have to be, but it's probably very silly, if not physically impossible, to design your uber-gripping device as just a scaled-up human.

Expand full comment

No, AI-doomism is not Platonism; it's just a bait-and-switch.

As the story goes, AI is existentially dangerous because it will soon acquire superpowers, such as the ability to hack any computer, convince any human to do anything it wants, cover the Earth with "gray goo" nanotechnology, engineer a ~100% lethal pandemic, etc. It would acquire these powers by being "superintelligent" -- the AI would be to the smartest human (usually considered to be Von Neumann) as that human is to a rat. And it would become "superintelligent" via recursive improvements to its processing speed and total memory. So, act now, before it's too late !

But this story falls apart as soon as you look at any of the steps in detail. For example, what is "superintelligence" ? Well, it's usually described as the quality of being "super-smart", but what does that actually mean ? Sure, the AI could e.g. "solve math problems faster than Einstein", but calculators can already do that, and they're not "superintelligent". In practice, "superintelligence", as the AI-doom community uses the term, is synonymous with the ability to acquire superpowers. So, the AI will acquire superpowers due to its superior ability to acquire superpowers... this logical deduction is true, but not terribly impressive.

What about processing speed and "neural blob" size, however ? Surely, throwing more CPUs at any problem makes it easier to solve, so the AI could in fact solve any problem by throwing enough CPUs at it, right ? Well, sadly, the answer would appear to be "no". For example, modern LLMs (Large Language Models) such as Chat-GPTs are "Transformers" (that's the "T" in the name), a new type of deep-learning neural network that was discovered relatively recently. They differ from old-school NNs not merely in size, but also in *structure*; it is this new type of structure that makes them relatively effective at generating text (and the same holds true for image generators). You would not be able to make an LLM by merely networking together a bunch of 1980s-era Sharp calculators, just like you cannot make a genius by networking together a bunch of nematodes. Unfortunately, the term "neural network" is a bit of an aspirational misnomer; we humans (and other animals) have brains with radically more complicated structures than the Transformer; structures that allow us to do many things modern matrix-multiplying ML systems simply cannot do, such as learning on the fly.

Of course, AI research will continue, and it's possible (in fact, likely) that one day we will discover something more powerful than the Transformer (powerful enough to compete with human brains), but that day would not appear to be coming soon. Throwing more CPUs at the problem won't help (I mean, it helps a little, but the problem complexity is exponential); and, what's worse, many of the putative AI superpowers are outright physically impossible. So, I wouldn't go around panicking just yet (or, in fact, ever).

Expand full comment

Scott, next time you cite a correlation coefficient, please please PLEASE specify whether you mean r or r^2. Thanks in advance.

Expand full comment

The fact that it's specifically language engines such as GPT which are the closest to generality is actually evidence that intelligence is a less correlated concept than we might have thought.


Expand full comment

Also, the arbitrary reminder that heritability doesn't actually mean what people may naively think it does.

Go actually follow the link Scott provided and read the Caveats to learn more.

Expand full comment

I think this is a question of consciousness and not intelligence. I think we cling to the word intelligence because that is the primary value we see in "AI" technology, and so we try to measure human intelligence qualitatively and quantitatively and evaluate the performance of some information processing model against that.

In reality, humans do not have a scientific explanation for our own consciousness yet we have a myriad of non-scientific explanations that overlap in profound ways.

I do think it is possible for an "AI" to become intelligent and capable enough to have profound, and even independent/uncontrolled, impacts on society without us translating our consciousness as a prerequisite. However, for me that merits an STS-oriented conversation, with consciousness- and morality-oriented questions discussed in parallel. I feel as though society blurs the two, and underemphasizes the importance of regulating the technology as it is in the present because of our concerns for the future, which are often more laden with fear and black-and-white thinking.

This societal blurring is a frequent theme in STS that often causes further disconnect between already misaligned societal stakeholders. For example, one can claim that the (valid) fears of a climate-change-fueled societal collapse have polarized people based on how likely they feel that future scenario is. The closer we get, the harder it is to avoid, but in the pre-internet era the societal landscape of information communication was quite different.

One could then debate whether a more present-oriented approach with more tangible messaging (e.g. keep your community clean and healthy vs. save the planet) would have worked, or whether a more radical one (e.g. stop everything now) would have worked. In the end, what happened happened, and these are simple considerations for relative adjustments, to explore alternative ideas about how we navigate issues as serious as potentially fueling our own extinction... whether it be through the climate, "AI", nuclear armageddon, or some other technological system gone awry.

Expand full comment

Maybe you have "overlearned" the Bitter Lesson, and are generalizing it to apply in domains where it has never been observed (and where there are some good reasons to think it won't hold)?

I think the Bitter Lesson is unassailably true (and nonobvious, and useful) in Chess and Go. I think we could generalize that to "two-person turn-based discrete finite games of complete information" pretty safely.

When we add in games of incomplete information, my confidence gets a little lower, though the results in Texas Hold'em and Starcraft are encouraging.

What about natural language translation? Here there is some vagueness about the success metric, which puts us at risk of Goodharting ourselves. If the goal is to translate large numbers of documents at 90%+ accuracy for the lowest marginal cost, perhaps the machines (and the Bitter Lesson) have won. But if you wanted to translate a single document with the highest accuracy at any cost--say, if you had to sign an important bilingual document like a peace treaty or a 99-year lease for land to build a skyscraper on--you'd probably still want to hire a human expert, right? It's not clear to me that any Bitter Lesson-driven research will ever surpass a human at this important skill.

GPT4 has racked up another Bitter Lesson win at the game of "take a context window and generate a sequence of subsequent tokens that resembles, to a human, what another human might write". If you want to produce a large amount of reasonable-seeming text at the lowest marginal cost, it sure seems like the Bitter Lesson is the way to go. It's not totally clear to me whether this has any actual economic value (not relying on deception). I used to firmly believe it didn't, but I've been impressed and surprised by quite a few successes, and certainly a lot of people/companies dumping money into this space must believe it might be useful for something.

Can we try to describe the space of problems for which the Bitter Lesson seems to apply? As a first guess, I'll put forth: "tasks for which you can obtain (or generate) a very large training set, and which have a clear and explicit success metric." But "taking over the world" doesn't seem to be in this space, nor does "building a more capable AI". I'll make a stronger prediction (with lower confidence) that "winning a programming competition" is (though just barely), and "being a professional software engineer" isn't (though just barely).

Expand full comment

"Intelligence Has Been A Very Fruitful Way To Think About AI"

This is where you lose me (in THIS particular argument).

What OpenAI et al have done is create a language model. What ChatGPT etc can do is "understand" English (for some weak sense of understand); what they don't have is the rest of the intelligence package, whether that's "common sense" (ie various world knowledge) or much deduction. I remain unimpressed with ChatGPT *as an "intelligence"*, and every example I see of it shows, yes, it's very good at manipulating language and language-adjacent behavior, but that's all.

I cannot ask ChatGPT an "original" question (ie one that doesn't have relevant text on the internet) and expect any sort of interesting reply: "ChatGPT, give me an essay on whether the nerd/jock dichotomy is real, referencing examples from literature across many different times and places"

(I just tried this and the result is unimpressive. It shows my point – language is understood, but no insight, nothing new beyond repeating sentences.)

Now am I being too harsh in my bar for intelligence? Don't most *people* fail that standard? Well, yes, I guess I am an aristocrat, in the Aristotelian sense of that term.

It was a reasonable hypothesis that language was somehow equivalent to "intelligence", and so a language ability was the same thing as a general intelligence. But what I see when I look at LLMs is that we've proved the hypothesis wrong. LLMs probably have a great future as an arena for "experimental linguistics" but they will be only a part (and maybe not even the hard part) of a real AI.

Expand full comment

Safe to say that AI will excel at imitating intelligence. Highly doubtful about anything beyond that. Ultimately there is an insurmountable distinction between organic/living intelligence and machine intelligence.

Expand full comment

AnomalyUK has interesting arguments against super-intelligent AI doomerism. This April, he wrote [https://www.anomalyblog.co.uk/2023/04/ai-doom-post/]:


I think I need to disaggregate my foom-scepticism into two distinct but related propositions, both of which I consider likely to be true. Strong Foom-Scepticism — the most intelligent humans are close to the maximum intelligence that can exist. This is the “could really be true” one. But there is also Weak Foom-Scepticism: Intelligence at or above the observed human extreme is not useful, it becomes self-sabotaging and chaotic. That is also something I claim in my prior writing. But I have considerably more confidence in it being true. I have trouble imagining a super-intelligence that pursues some specific goal with determination. I find it more likely it will keep changing its mind, or play pointless games, or commit suicide. I’ve explained why before: it’s not a mystery why the most intelligent humans tend to follow this sort of pattern. It’s because they can climb through meta levels of their own motivations. I don’t see any way that any sufficiently high intelligence can be prevented from doing this.


Since he had correctly predicted the eventual rise of LLMs in 2012 [https://www.anomalyblog.co.uk/2012/01/speculations-regarding-limitations-of/], his arguments are worth considering.


[W]hat is “human-like intelligence”? It seems to me that it is not all that different from what the likes of Google search or Siri do: absorb vast amounts of associations between data items, without really being systematic about what the associations mean or selective about their quality, and apply some statistical algorithm to the associations to pick the most relevant.

There must be more to it than that; for one thing, trained humans can sort of do actual proper logic[,] and there’s a lot of effectively hand-built (i.e. specifically evolved) functionality in a some selected pattern-recognition areas. But I think the general-purpose associationist mechanism is the most important from the point of view of building artificial intelligence.

If that is true, then a couple of things follow. First, the Google/Siri approach to AI is the correct one, and as it develops we are likely to see it come to achieve something resembling humanlike ability.

But it also suggests that the limitations of human intelligence may not be due to limitations of the human brain, so much as they are due to fundamental limitations in what the association-plus-statistics technique can practically achieve.


The major limitation on human intelligence, particularly when it is augmented with computers as it generally is now, is how much it is wrong. Being faster or bigger doesn’t push back the major limitation unless it can make the intelligence wrong less often, and I don’t think it would.


Expand full comment

>At some point we might get a blob which is better than humans at designing chips, and then we can make even bigger blobs of compute, even faster than before.

Although note that I think the intuitive interpretation of someone reading this would be 'we can add compute faster than we were adding it in the past', which gives a sense of acceleration.

Whereas I think the actual meaning is 'we can add compute in the future faster than we would have been able to if only humans were doing it.'

Which very easily *could* mean it still accelerates over time, but it also might not mean that (if for example we're running into other limiting factors besides the intelligence of chip designers).

Doing better in the future than we might have otherwise done in the future does not necessarily mean doing better than we did in the past.

(again, even if it's likely to mean that in practice)

Expand full comment

I think math, and in particular arithmetic and algebra, is a compelling counterexample to the statement that “extremely clever plans to program “true understanding” into AI always do worse than just adding more compute and training data to your giant compute+training data blob”.

For example, the public gratis version of ChatGPT does not have the arithmetic ability of a cheap pocket calculator from the 1990s: just ask it for the square root of an 8-digit number. I asked for the square root of 42282466 and it answered 6504.89 instead of 6502.4969.

AFAIK, large language models at this time are not able to automatically use their own outputs as inputs to continue a multi-stage reflection, and that is necessary to execute even a simple arithmetic algorithm. Maybe we will be able to teach them that ability, but it is a significant paradigm jump from the current linear processing.
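To make "using outputs as inputs" concrete, here is a minimal Python sketch of one such multi-stage algorithm, Newton's method for square roots, applied to the number above. The function name `newton_sqrt`, the tolerance, and the step limit are illustrative choices of mine, not anything ChatGPT or a pocket calculator is known to run; the point is only that each step feeds the previous guess back in.

```python
import math


def newton_sqrt(n, tolerance=1e-10, max_steps=100):
    """Approximate sqrt(n) by repeatedly refining a guess (Newton's method)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 0.0
    guess = float(n)  # crude starting point; any positive value works
    for _ in range(max_steps):
        # The previous output (guess) is the input to the next step.
        next_guess = 0.5 * (guess + n / guess)
        if abs(next_guess - guess) < tolerance:
            return next_guess
        guess = next_guess
    return guess


result = newton_sqrt(42282466)
print(f"{result:.4f}")              # iterated approximation
print(f"{math.sqrt(42282466):.4f}")  # library value, for comparison
```

Both lines print 6502.4969, matching the correct answer quoted above. A single left-to-right pass with no way to loop back on an intermediate result cannot run this kind of refinement directly.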

I think a more accurate statement would be that we have exhausted the things we understand well enough to explain them to machines.

Expand full comment

You can’t criticize platonism without engaging in it. “This whole class of arguments is flawed because it posits that abstract classes exist and can be reasoned out” is subtly contradictory.

Expand full comment

This article points to the age-old need of man to worship. AI is just a tool, but mankind is idolizing it. We cannot help that we are designed to worship, but the irony of man putting his trust in something he designs and makes himself is brilliantly and humorously laid out in Isaiah 44, esp. verses 9-20. The best bit is in verses 20-22, when Yahweh says he actually has the power and offers to provide real help at his own expense. It truly is for every age: let he who has ears hear, and he who is intelligent read and understand.

‘9 How foolish are those who manufacture idols.

These prized objects are really worthless.

The people who worship idols don’t know this,

so they are all put to shame.

10 Who but a fool would make his own god—

an idol that cannot help him one bit?

11 All who worship idols will be disgraced

along with all these craftsmen—mere humans—

who claim they can make a god.

They may all stand together,

but they will stand in terror and shame.

12 The blacksmith stands at his forge to make a sharp tool,

pounding and shaping it with all his might.

His work makes him hungry and weak.

It makes him thirsty and faint.

13 Then the wood-carver measures a block of wood

and draws a pattern on it.

He works with chisel and plane

and carves it into a human figure.

He gives it human beauty

and puts it in a little shrine.

14 He cuts down cedars;

he selects the cypress and the oak;

he plants the pine in the forest

to be nourished by the rain.

15 Then he uses part of the wood to make a fire.

With it he warms himself and bakes his bread.

Then—yes, it’s true—he takes the rest of it

and makes himself a god to worship!

He makes an idol

and bows down in front of it!

16 He burns part of the tree to roast his meat

and to keep himself warm.

He says, “Ah, that fire feels good.”

17 Then he takes what’s left

and makes his god: a carved idol!

He falls down in front of it,

worshiping and praying to it.

“Rescue me!” he says.

“You are my god!”

18 Such stupidity and ignorance!

Their eyes are closed, and they cannot see.

Their minds are shut, and they cannot think.

19 The person who made the idol never stops to reflect,

“Why, it’s just a block of wood!

I burned half of it for heat

and used it to bake my bread and roast my meat.

How can the rest of it be a god?

Should I bow down to worship a piece of wood?”

20 The poor, deluded fool feeds on ashes.

He trusts something that can’t help him at all.

Yet he cannot bring himself to ask,

“Is this idol that I’m holding in my hand a lie?”

Restoration for Jerusalem

21 “Pay attention, O Jacob,

for you are my servant, O Israel.

I, the LORD, made you,

and I will not forget you.

22 I have swept away your sins like a cloud.

I have scattered your offenses like the morning mist.

Oh, return to me,

for I have paid the price to set you free.”’


Expand full comment

Late to the party (listened via the ACX podcast) but coincidentally a friend and I recently recorded a podcast discussing, basically, just this: whether general intelligence is a real/natural unified thing or just a bundle of smushed together abilities. Not great quality audio, minimal editing, but super interesting for me to make, at least.

Spotify: https://open.spotify.com/episode/0DDvsWCC0L04BCgJmFrPI8?si=GEBUBClQQVG89wNU4FWbFQ

Google Podcasts: https://podcasts.google.com/feed/aHR0cHM6Ly9hbmNob3IuZm0vcy9lNDJlYjllYy9wb2RjYXN0L3Jzcw/episode/Mzc5NjljZGMtZWNhOC00NjE3LWI1MWYtZDRkMzA3NGIwYzAw

Copy and pasted episode summary/description:

Very imperfect transcript: bit.ly/3QhFgEJ

Summary from Clong:

The discussion centers on the concept of a unitary general intelligence or cognitive ability, and on whether this exists as a real and distinct thing.

Nathan argues against it, citing evidence from cognitive science about highly specialized and localized brain functions that can be damaged independently. Losing linguistic ability does not harm spatial reasoning ability.

He also cites evidence from AI, like systems excelling at specific tasks without general competency, and tasks easy for AI but hard for humans. This suggests human cognition isn’t defined by some unitary general ability.

Aaron is more open to the idea, appealing to an intuitive sense of a qualitative difference between human and animal cognition - using symbolic reasoning in new domains. But he acknowledges the concept is fuzzy.

They discuss whether language necessitates this general ability in humans, or is just associated. Nathan leans toward specialized language modules in the brain.

They debate whether strong future AI systems could learn complex motor skills just from textual descriptions, without analogous motor control data. Nathan is highly skeptical.

Aaron makes an analogy to the universe arising from simple physical laws. Nathan finds this irrelevant to the debate.

Overall, Nathan seems to push Aaron towards a more skeptical view of a unitary general cognitive ability as a scientifically coherent concept. But Aaron retains some sympathy for related intuitions about human vs animal cognition.

Expand full comment

Is Arnold Schwarzenegger stronger than a truck? Is that a coherent question?

Expand full comment