Authoritarian crackdown on compute and legally mandated always-online backdoors and spying.
None of this actually works, of course. Dissidents accurately recognize that not having open-source AI of their own is an existential threat in economic terms, and some of them will be smart enough to use air-gapped computers from before those measures were implemented. And when actual AI hacking is developed, it will use all of this mandated infrastructure as zero-day exploits. But the security state never misses an opportunity to seize more power in the name of a nebulous emergency.
DeepSeek itself is not too dangerous yet, so if the world were to manage to coordinate on bombing any data center larger than whatever the current largest is, I think we could survive a few decades longer.
DeepSeek showed what so much of Chinese industry shows: if you cheat and steal via espionage, you can rip off innovation for a lot less than it costs to innovate. They stole weights. Period.
Can you back that up? Even OpenAI hasn't claimed this. ChatGPT's weights are closed source. Deepseek can't access them.
OpenAI accused Deepseek of distilling their model; in this case meaning they gave ChatGPT a lot of prompts and used those prompts and responses to help train Deepseek. This doesn't give you the weights and would not cause Deepseek to have similar weights. It trains Deepseek to be more like ChatGPT.
Let's suppose this is true. *So what*? Stolen weights are no more or less dangerous than weights developed in-house. You might as well say we don't need to worry about terrorist groups getting nuclear weapons, because after all, they would probably just be stolen Soviet ones.
What is your basis for believing we are a year away from mass manufacture of humanoid robots in China? Or that Neuralink is going to be good and safe enough to significantly boost human cognition in two years?
2. This one is more speculative, but I'm hopeful Elon's cutting of government waste and regulation also includes relaxing the rules for Neuralink trials and similar products on the market. Meta seems to have solved decoding of thoughts (https://www.gizmochina.com/2025/02/16/metas-ai-can-now-read-your-mind-with-80-accuracy/), so I'm assuming encoding will be easily solved by a more invasive approach like Neuralink implants, at which point cognitive enhancements shouldn't be too far off.
It's easyish to build a humanoid robot. It's very hard to build a useful humanoid robot.
This is supposedly the Chinese mass-produced state of the art. It costs as much as a car and looks like a plastic toy. It can walk around and dance and wave, but it doesn't appear to be able to use those big human-like hands to actually manipulate objects... actually, I don't think those fingers move at all.
Disclaimer: I once worked in the lab of a world leading brain implants for human enhancement maximalist, who is now widely considered essentially a charlatan.
Neural implants are wildly far from even matching basic human capabilities in our simplest processes (e.g., motor control, hearing, vision). These limitations are primarily hard technical problems, not things that will be solved by encoding/decoding algorithmic advances. The electrode-based approach Neuralink and others are using has fundamental, physics-based limits on the transmission bitrate that could ever be achieved, even if the safety and reliability issues are solved. Currently these bitrates are functionally in the tens to hundreds of bits per second, and getting much beyond that will require multiple paradigm-shifting advances in bi-directional communication technology that are either purely theoretical or completely unknown. You can't just stick in more electrodes.
We also have much better ways to get information into our brains. Right now you're reading this through your eyes at maybe 50 bits/second, similar to what we get from speaking/listening. But vision and hearing go FAR beyond this, on the order of Mb/s, so roughly 4-5 orders of magnitude beyond what BCI can do. Processing and memory are harder to measure, obviously, but likely closer to Mb/s than tens of bits/s.
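For concreteness, here's a minimal back-of-the-envelope sketch of that bandwidth gap. The bits-per-second figures are just the rough estimates from the comments above, not measurements:

```python
import math

# Rough, illustrative figures only, taken from the estimates above.
bandwidth_bps = {
    "electrode BCI (today)": 100,   # functionally tens to hundreds of bits/s
    "reading or listening": 50,     # ~50 bits/s of linguistic content
    "vision (raw input)": 1e6,      # "on the order of Mb/s"
}

baseline = bandwidth_bps["electrode BCI (today)"]
for channel, bps in bandwidth_bps.items():
    gap = math.log10(bps / baseline)
    print(f"{channel:>22}: {bps:>12,.0f} bits/s ({gap:+.1f} orders of magnitude vs. BCI)")
```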
We MIGHT get a passable BCI that would give you something like a voice in your head with Neuralink (at the cost of extreme health risk), but the only way we are getting anything like cognitive enhancement is if ASI makes it for us. And by then it's obviously far too late.
Wouldn't it be possible to drastically enhance math and programming abilities even with low throughput? Sending an equation or a few (perhaps compressed) lines of code and getting a result back seems possible. The brain is quite bad at logic and precise computation, so that would be really helpful.
Also if I understand correctly working memory can only *point to* a few concepts at once, is it possible to enhance cognition by using the additional 100 bits to point at way more concepts rather than trying to encode the concept content in the 100 bits?
Maybe you could also implement an algorithm to walk a tree of thoughts systematically and improve everyone's decision-making abilities? The algorithm would tell you "look at this thought, now this one, now compare these two thoughts", which can be done through pointers.
Perhaps 100 bits is enough to trigger a positive feedback loop that gets you in an extremely pleasant state of mind (jhana)? Being in a positive state of mind (or just not depressed) is extremely helpful for coordinating with other people, or even for productivity in general.
For all of these things you should consider whether it is already better accomplished by existing input/output methods. You can already program or do math via a keyboard, and there's not that much benefit to be gained doing that through a neural interface that is likely to be slower, less accurate, and have major health risks. Realizing that a cell phone is already a great brain-computer interface removes much of the incentive in this whole area.
Memory and executive thinking are interesting possibilities, but they are so poorly understood from a neural perspective I'm not confident in saying whether it's possible or not. But my assumption would be they are in the Mb/s range so bandwidth will still be an issue.
We actually do use neural stimulation (e.g. deep brain stimulation, or electroconvulsive therapy) to treat depression. But getting an implant to make you always happy sounds like inventing wireheading from the famous sci-fi novel "Do Not Invent Wireheading".
I have yearned so much for some sort of wireheading to become available that doesn't require dangerous surgery 'n' stuff. Like, can't magnets give one a constant low-grade sense of euphoria somehow?! There's gotta be a way, dammit!
What's the best thing to read on the plausible technology pathway from LLMs to AGI? It seems the biggest assumption in AI 2027 is that the technology path exists. I'd love to read how you justified this or if it was just thinking that supercoding LLMs will find the solution.
If you're asking whether we will understand the nature of intelligence by 2027, the answer is surely no. But evolution also didn't understand intelligence in a meaningful way and yet produced humans. Reinforcement learning used to not work very well except on relatively simple games like chess, Go, and Dota 2, but it's now working quite well on pretrained LLMs. In theory, all you need to get AGI is:
1) an architecture that can in principle run an intelligent mind
2) a reinforcement learning target that saturates beyond the human level of capabilities
3) an optimization algorithm
If you have these 3 things, you'll eventually get an AGI (and you will have absolutely no understanding of how it actually works internally). Do we have all 3? I'm probably most unsure about 1) – are feed-forward networks sufficient to contain an intelligent mind? – but I'm pretty sure they will unfortunately figure it out eventually. And even if they have a suboptimal architecture, if you throw enough compute at it, it will work eventually.
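To make the three-slot shape of that recipe concrete, here's a deliberately schematic sketch using PyTorch names. Nothing here is a real AGI recipe: the reward is a differentiable placeholder (a real RL setup would use policy gradients over a non-differentiable environment), and every name and size is made up for illustration.

```python
import torch
import torch.nn as nn

# 1) An architecture that could, in principle, run an intelligent mind.
#    (A toy feed-forward policy; whether this class of model suffices is
#    exactly the open question above.)
policy = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 16))

# 2) A reinforcement-learning target that keeps rewarding improvement past
#    the human level. This placeholder objective just stands in for it.
def reward(actions: torch.Tensor) -> torch.Tensor:
    return -actions.pow(2).mean()

# 3) An optimization algorithm.
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)

for _ in range(1_000):
    observations = torch.randn(32, 128)   # stand-in for real observations
    actions = policy(observations)
    loss = -reward(actions)               # maximize reward = minimize its negative
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```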
4) time. Evolution took a very, very long time to produce humans. The discussions here seem to be assuming AI will take much much less time than this, but this is predicated on us doing something evolution didn't or couldn't. If we're only doing as well as evolution, I see no reason to sound the imminent AI doom alarms.
We’ve seen plenty of examples of cultural evolution outrunning genetic evolution. I’ll agree that outpacing it by 6-8 orders of magnitude is a big ask. But natural selection is really dumb, and it has no conception of a goal and a plan.
But now we seem to be equivocating between "we don't have to know how to do it, evolution did it so we can just do it the same way" and "actually, we can do better than evolution".
Natural selection is really dumb, yes, but given enough time and resources it will visit all available niches in the search space. This is not obviously true of more directed search patterns, and for those you do have to understand enough about the goal to prove the approach is capable of reaching it.
The bitter lesson of AI, meanwhile, is that cultural human goals and plans produce even worse results than just dumbly throwing compute at a problem. Even for the reachable goals, we are not actually very good at deliberately directing towards them.
If we use the same approach evolution did, we need to show we can reach the end state fast enough to worry about doom; and if we don't, we need to show that whatever approach we /do/ use is actually capable of reaching that destination - we don't get to just handwave that part away any more if we do something radically different to our existence proof.
I think you're not giving enough credit to the algorithmic improvements Scott cited, which seem to be producing gains of the same order of magnitude as adding "dumbly throwing compute". (I'm just an AI layman; I know nothing about what those algorithmic improvements are nor about whether it makes sense even to compare their effects to that of raw compute.)
My original point was just that natural selection wasn't wired to produce intelligence in particular any more than it was wired to produce flight. But we still outfly birds by several orders of magnitude (not to mention being able to fly in a vacuum!) -- having an actual goal beyond "be fruitful and multiply" is actually worth *something*.
I work on some of these "algorithmic improvements". They are mostly about finding ways to do more compute (specifically, more matrix multiplication) per unit time and power. Basically, today, they are about making the building blocks LLMs are made of bigger and faster.
However, it remains unclear that making LLMs bigger and faster can lead to AGI, rather than just a bigger, faster LLM that still has all the fundamental problems LLMs have.
I'm not saying we won't ever get AGI. But I think it's entirely plausible that we won't get it just by making LLMs even bigger and faster - that it will take more fundamental paradigm shifts like the "attention is all you need" paper (which fundamentally changed the way we do AI; the overall approach has remained essentially the same since), and if so, it is also unclear how long it will be before we see them.
> Natural selection is really dumb, yes, but given enough time and resources it will visit all available niches in the search space.
This isn't true, or at least if it is, you're talking about lengths of time likely to be many orders of magnitude beyond the current age of the universe. If a global maximum is surrounded by fitness valleys, or specifically if it requires several interdependent genes that don't propagate on their own in order to function, you are essentially betting on random mutation spontaneously assembling it in DNA instead of taking advantage of natural selection's optimization power. (If not, how do you explain that one really dumb nerve in a giraffe that runs down its neck and then back up again? How long do you think that will take to be fixed, in generational time? If evolution is as fast as you imply, it's not clear we should be seeing this at all; and if it isn't, it's implausible that a sane amount of time can pass before it's fixed.)
"Natural selection is really dumb, yes, but given enough time and resources it will visit all available niches in the search space."
I don't think this is true; natural selection is good at visiting and conquering nearby hills on the fitness landscape, but it has great trouble visiting the ones further away. It cannot move through fitness valleys so easily, cannot go downwards. Enough time may mean very long indeed. Human-made search mechanisms can easily overcome this limitation, by introducing momentum-based optimization, simulated annealing, random restart hill climbing, tabu search, and whatnot. These and other hacks make artificial optimization processes very much faster.
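A minimal sketch of two of those hacks on a toy one-dimensional landscape with several local peaks; the landscape and parameters are made up purely for illustration, not taken from any real optimizer:

```python
import math
import random

def fitness(x: float) -> float:
    return math.sin(3 * x) + 0.1 * x  # many local peaks; the best ones sit at large x

def hill_climb(x: float, steps: int = 2000, step_size: float = 0.05) -> float:
    # Greedy local search: only ever accepts uphill moves, so it gets stuck.
    for _ in range(steps):
        candidate = x + random.uniform(-step_size, step_size)
        if fitness(candidate) > fitness(x):
            x = candidate
    return x

def random_restart_hill_climb(restarts: int = 20) -> float:
    # Re-run the greedy search from many random starting points, keep the best.
    return max((hill_climb(random.uniform(-10, 10)) for _ in range(restarts)), key=fitness)

def simulated_annealing(x: float = 0.0, steps: int = 20000) -> float:
    for step in range(steps):
        temperature = max(1e-3, 1.0 - step / steps)
        candidate = x + random.uniform(-0.5, 0.5)
        delta = fitness(candidate) - fitness(x)
        # Accept some downhill moves: this is what lets the search cross
        # fitness valleys that a pure hill climber cannot.
        if delta > 0 or random.random() < math.exp(delta / temperature):
            x = candidate
    return x

print("single hill climb:", fitness(hill_climb(0.0)))
print("random restarts:  ", fitness(random_restart_hill_climb()))
print("annealing:        ", fitness(simulated_annealing()))
```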
Evolution is hard time limited by the reproduction cycle, in addition to the algorithmic inefficiency of so much random search. There are also physical world constraints (you can't evolve respiration when there is no available oxygen). These things do not apply directly to this question.
The intelligence explosion leading to humans was also lightning quick on the scale of evolution, a couple hundred thousand years. Then the entire cultural evolution that happened since modern humans developed agriculture is ~20k years. In the last 100-200 years this cultural evolution has produced more technological progress than the previous 19k years combined.
Time is a relevant constraint, but biological evolutionary timescales are not relevant here and should not make you feel better. You should either be worried, or find a better reason to not be worried.
Strange that you consider all 3 requirements fulfilled or close to being fulfilled.
The first is incomplete: it needs to run an intelligent mind *efficiently*. I see no reason to believe LLMs can do this within reasonable size and data constraints; it shouldn't take a reasoning model crunching through millions of tokens to fail at playing Pokémon or to answer an altered children's riddle wrongly.
There's no reason to believe that reinforcement learning will extend past the bounds of the training data distribution; in fact, that's what it optimizes for, which makes a reply's resemblance to correct answers matter more than its actual meaning, and makes creativity an out-of-distribution task. Look at how cutting-edge LLMs performed at the recent Math Olympiad. It's not just that they were so often wrong, it's that they were so confidently wrong, because their answers were semantically similar enough to what mathematical proofs look like.
Third is an optimization algorithm. Gradient descent might be more efficient than nature's evolutionary process, but it's a lot less capable of escaping local minima. Assuming, in the best-case scenario, that the text data used to train LLMs is representative of a complete model of reality (doubtful), the question is: is it more likely that gradient descent will find a function within the relatively large subset of functions that approximate text well by encoding a deep understanding of relationships between large groups of semantic concepts, like a very high-level grammar (basically what we have today: impressive and useful, but not AGI), or will it find a function within the very tiny subset that approximate text well by encoding a complete fundamental model of reality beyond anything the training data can express (true AGI)? The chances of stumbling onto the latter through gradient descent are vanishingly small. It's like training a model to predict physics by giving it training data of balls bouncing around and expecting it to solve physics. Sure, solving physics would allow the model to predict bouncing balls, and the model might be large enough to approximate such a function, but there are so many more functions that can predict ball bounces that expecting it to find the one that solves physics is absurd.
The way I see it, none of these prerequisites are close to being achieved.
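Here's a toy version of the bouncing-ball point above, with made-up numbers: a flexible curve fit matches the observed data about as well as the "true physics" does, but it has learned nothing transferable and extrapolates badly outside the window it was trained on.

```python
import numpy as np

rng = np.random.default_rng(0)
g = 9.8
t_train = np.linspace(0.0, 1.0, 50)
h_true = 10.0 - 0.5 * g * t_train**2                        # the "real physics": free fall
h_observed = h_true + rng.normal(0.0, 0.1, t_train.shape)   # noisy measurements

coeffs = np.polyfit(t_train, h_observed, deg=7)             # a very flexible curve fit

t_future = np.linspace(2.0, 3.0, 5)                         # outside the observed window
err_in = np.abs(np.polyval(coeffs, t_train) - h_true).max()
err_out = np.abs(np.polyval(coeffs, t_future) - (10.0 - 0.5 * g * t_future**2)).max()
print(f"max error inside the training window: {err_in:.2f}")
print(f"max error when extrapolating:         {err_out:.2f}")
```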
1) Feed-forward networks are not sufficient...but they're also not all that are in use. Consider recurrent networks, etc.
2) I think you need something more complex than a single reinforcement network. But that's also trivial.
3) Optimization is tricky. It almost always means "do better on some specific subset of problems". I think the current AI term for that is "distillation".
One of the things you left out is the process of training. Cleaning the data. Checking projections against the data. Etc. This has proven quite tricky except in specialized areas. (Actually, even there. But it's often been pretty well handled in specialized sub-domains. Like math or protein-folding, where there's a data set with not too much error.)
OTOH, my comments are about publicly available methods and examples as of at least a year ago. Don't expect their limitations to apply to research frameworks or proprietary programs/libraries. (I expect they do, but that's just "these seem like general rules".)
Why? At least in theory transformers with FFNs can compute anything an RNN can. The transformer architecture took over because it empirically performs better than the RNNs we have. The parallelizability and scalability turned out to be a bigger advantage in practice than what RNNs offer.
I don't doubt at all that better architectures could be invented and that those may use RNNs. I'm just wondering why you're saying feed-forward based networks are incapable of general intelligence.
I'd need to study transformers more, but it seems to me that they need a way to adjust their weights to the environment they find themselves operating in. IOW they need feedback, both positive and negative. Otherwise they can't learn. And if they can't learn, then I don't think they can reasonably be called intelligent.
Feedforward networks can be adjusted to model nearly any situation, but without feedback they can't adjust themselves to a changed situation.
I think the answer to "do LLMs learn" is more complicated than it first appears.
LLMs learn during training or fine tuning. The AI companies are continually training them and new versions are released regularly.
LLMs can also temporarily learn facts or simple skills just from instructions or information you give them within the context window. If you were to make up a simple language that's similar to English but with certain words changed and different plural constructions, it could learn how to write in your made up language just from your instructions.
They don't change their weights from conversations with users (unless OpenAI or whoever later uses that conversation in training). I think this is partly due to limitations and partly an intentional feature. They don't want conversations with random users to be changing their model willy-nilly, nor do they want their model to remember private secrets of one user and reveal them to another user.
The limitation part is that training is costly and requires a lot of examples. More than a human would need.
The current strategy for LLMs is just to try to teach them everything, which reduces the need to learn new subjects on the fly. If you need to teach them something new as a user, your options are fine-tuning or sticking it in the context window. Neither works as well as full training.
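As a minimal sketch of the in-context option: the `generate` function below is a hypothetical stand-in for whatever chat-model API you use, and the made-up plural rule exists only in the prompt, never in the model's weights.

```python
def generate(prompt: str) -> str:
    # Hypothetical stand-in: wire this up to your LLM provider of choice.
    return "<model response goes here>"

prompt = (
    "In my invented language, plurals are formed by adding '-ko': "
    "one cat, two catko; one house, two houseko.\n"
    "Translate into my invented language: 'The dogs saw two cats.'"
)

# A capable model will typically produce 'dogko' and 'catko' purely from the
# instructions in the prompt, with no weight update; start a new conversation
# and that "knowledge" is gone.
print(generate(prompt))
```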
> it seems to me that they need a way to adjust their weights to the environment they find themselves operating in.
So say you want an LLM to do legal work. You could train it or fine tune it on legal work and it would learn it and adjust its weights. Or you could just start with a big LLM that was trained on everything including legal work. It will already have sections of its weights adjusted for legal work and will use those.
Also, I don't think there's any way to distinguish between "learning" by adjusting weights vs very good reasoning on working memory (the context window). One can substitute for the other.
The current LLMs aren't fundamentally unable to learn. We may invent architectures that could learn much more efficiently. But it's ultimately a matter of degree, not capability.
The big question for me right now is not whether feed-forward networks in general are sufficient to do 1, but whether the LLM paradigm in particular can do 1. My strong suspicion is that it can't, and that the LLM paradigm doesn't generalise too far beyond the things it's currently good at.
An LLM is already "superhumanly intelligent" in many ways; no human can dash off a thousand-word essay on any arbitrary topic in a matter of seconds. But there are other intelligent things that they're going to continue to suck at. Importantly, they suck at theory of mind -- if you ask them how a particular person would react to a particular event, they can't go beyond whatever generalities are present in their training set.
If we can't get there with LLMs and we're stuck with more general neural networks then we're not necessarily a lot closer than we were five years ago.
>The big question for me right now is not whether feed-forward networks in general are sufficient to do 1, but whether the LLM paradigm in particular can do 1.
That's fair. I'm looking at AI progress mostly from the point of view of "Can reliability be pushed up to human levels?". Perhaps LLMs suffice, perhaps a system that is mostly an LLM but has some side data structures or control, maybe just a scratchpad for fairly voluminous intermediate results for deep reasoning suffices. I'm hopeful that LLMs will be enough for our "System I" thinking and that maybe a little executive function plus memory can be sprinkled on top to add "System II". Time will tell.
AFAICT they assume that it's feasible to hook up a reinforcement learning feedback loop for developing "research taste", which scales to a superintelligent researcher, which then develops a MVP AGI. They say this is a "conservative" way of thinking, because there's no obvious reason why all the benchmarks/lines on graphs won't continue to go up indefinitely, so it's incumbent on skeptics to offer a counterargument.
There is clearly a point where the curve turns logistic...but there's no obvious (to me) reason to pick any particular level of ability where that would happen. The only obviously defensible claim of limitation would be "it can't learn to be any better than its teachers", but while that's defensible, I can't see that it is necessarily correct.
A student can surpass their teacher by adopting and polishing models the teacher had to coarsely invent, by solving problems the teacher had enough neuroplasticity to discover but not to fully explore... but when they start facing problems the teacher can't even classify, where a mere polynomial-time task to check the possible answer is already too vast for the teacher to complete, slowdown from lack of tutelage will definitely be kicking in.
No doubt. But does that kick in at IQ 200? 400? 1000? 10000? I haven’t seen a principled argument that tells me the threshold is low enough to save us.
Even a race with an average IQ of 200 would probably displace us as we displaced the other hominids.
The other hominids were competitors for essentially the same ecological niche. You can't grow an ape out of silicon, copper, and chlorine trifluoride, any more than you could construct a datacenter from mangoes, clams, and megafauna thighbones.
>because there's no obvious reason why all the benchmarks/lines on graphs won't continue to go up indefinitely
The obvious reason would be that, in real life, no line on a graph ever goes up indefinitely. As long as your line on a graph connects to the real world in any way, there are costs and limitations that will eventually force that line to flatten, and the burden of proof that the line will go up for long enough to achieve some state is on the people who propose it will do so.
I wonder what an economy with super AI and 12 million humanoid robots (1 year into the 1-million-robots-a-month scenario) that can do anything better than anyone looks like? Email jobs can disappear just from software. Doctors and lawyers are the next highest-value professionals, but I bet the AMA and ABA will still make sure that a doctor is always in the room with every patient and no robots are allowed in courtrooms for at least a year. Soldiers are an obvious choice, but we're more likely to use flying drones for that unless we are currently at war with China.
I think the next highest-value job might actually be skilled blue-collar labor: a super plumber, carpenter, electrician, and contractor that leads a team of unskilled construction workers. Meanwhile, retail jobs might be safe until the number of robots increases dramatically. Any other jobs I am missing?
I don’t see humanoid robots ever being used much at all. If you were to replace a waitress with a humanoid robot the cost of the robot still has to be less than the cost of minimum wage multiplied by the number of people you can fire.
That cost would be depreciation, maintenance, and, if I know capitalism (and I do), paid software updates.
You might need as many robots as your entire rota; there are certain days when everybody is working, like Valentine's Day, Mother's Day, and other holidays.
There's no obvious real efficiency gain here anyway. Competent wait staff get the food to you in time, and people are happy to dawdle in restaurants. If your food comes twice as fast it gains little. These are not assembly lines.
People like to flirt with the cute blonde waitress rather than a mechanical device.
There won't be many customers anyway, since they will have been replaced with robots themselves.
Why did you respond to a point about skilled labor with an anecdote about a minimum-wage worker in a context where human connection and warmth matter?
A humanoid bot could replace most handyman and skilled-labor work, which tends to be repetitive, and they are not paid peanuts. They don't even need AGI for that. The cost of maintenance is still less than paying an annual 6-figure wage, plus weekends off, vacation, etc.
They don't even need to be humanoid in many cases. Specialized industrial domains like agriculture will purchase machines that suit their needs, and still be automated after some training and setup.
And more alike than different. I would not underestimate AI in this use-case given what it already does for code.
Some things a handyman might do: clean windows, paint, drywall, (rarely) bricklaying, flooring and carpeting, siding, roofing, I could go on.
This is mostly repetitive work. If you pay for a service at your house, often there is more than one person involved: the business owner, and hired hands. It would be trivial to replace most if not all of them, if machines are fast and dexterous enough. Having no business owner on site is unlikely for a long time, but that doesn't matter, and it will still happen eventually.
Even something as simple as painting requires a great deal of dexterity. Factory machines are most adept at small, precise motions. AGI fetishists rely a lot on some miracle that will give robotics the equivalent of the opposable thumb.
1) You claimed that robots won't be used _anywhere_, but only analyzed a case that is probably one of the worst options for using them, instead of looking at the best options
2) Your case rests on robots being at least as expensive as minimum wage alternatives, but your justification for this is a complete hand-wave
3) You simultaneously argue that efficiency doesn't matter ("if your food comes twice as fast it gains little") but also that any loss in manpower is unacceptable ("there are certain days when everybody is working")
With mass production, a robot will cost less than even a single minimum wage employee. Unitree is advertising their humanoid robot for $16,000. The true typical price might be about double that, but prices will come down in the hypothetical scenario where production scales up to 1 million robots per month.
The median waitress salary in the US is about $32,000. With swappable batteries, a robot could work nearly 24/7. That's a bigger advantage for restaurants that are open longer hours and more days a week. Assuming a robot works at the same speed as a human waiter and can work twice as many shifts and will cost $16k, it breaks even in 3 months.
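Spelling out that break-even arithmetic (all inputs are the assumptions above, not data):

```python
robot_price = 16_000     # USD, assumed hardware cost at scale (advertised Unitree price)
waiter_salary = 32_000   # USD per year, median US waitress salary
shifts_replaced = 2      # robot covers ~2x the shifts via swappable batteries

labor_value_per_year = waiter_salary * shifts_replaced
break_even_years = robot_price / labor_value_per_year
print(f"break-even: {break_even_years:.2f} years (~{break_even_years * 12:.0f} months)")
# -> break-even: 0.25 years (~3 months)
```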
> there won’t be many customers anyway since they will have been replaced with robots themselves.
If you assume mass unemployment and no one has money, then we just get general economic collapse and all the restaurants close. But if that doesn't happen and, for whatever reason, a restaurant is getting few customers, that only puts more pressure on them to replace humans with less costly robots.
I think this is just a lack of imagination that makes a number of unnecessary assumptions.
The first is not understanding what a humanoid general purpose robot is. It isn’t just a waitress or just a wildlands firefighter or just a manual laborer, it’s all of these things and more all at once. It is everything a human can be.
My household robot could easily hop on the bus and work the restaurant on Mother’s Day for the right price. Or it could package itself for delivery to California to cut fire lines for the right price, or it could throw up timber frames for the right price. It can also likely maintain itself, or be inexpensively hired to maintain one of its compatriots.
Or it might be undercut by a fleet doing the same thing but at greater efficiency.
And these won’t be clunky lifeless hunks of metal, they will be imbued with superhuman agreeableness, perfect memory, perfect work ethic. They can be trusted to do exactly what you ask of them without cutting a single corner. Having actually worked as a waiter at one point in time, I really can’t stress the importance of the last point for front or back of house staff enough.
If the average guy wants a pretty blonde to flirt with at Applebee's, very few generations of iterative design are going to be needed to create the most superhumanly beautiful exoskeleton to house the already superhumanly flirtatious personality.
I went to a restaurant with robot waitresses the other week.
To be fair the robot was just a tall wheeled thing that drives up next to your table and lets you take the food off and they still had human waiters/resses for other things. In fact, it almost certainly doesn't save the restaurant any time or money. But the kids loved it.
And who needs the human warmth of a below-minimum wage illegally-employed foreign student waitress when you can have a screen with cute cat eyes?
>I don’t see humanoid robots ever being used much at all. If you were to replace a waitress with a humanoid robot
You're thinking in absolute terms. It is more likely that the two will coexist in many services, with a robot available wherever a human can't be. That is more viable from economic and pragmatic perspectives. Eventually in some fields we may see one preferred over the other (manufacturing is an absolute, but not all service levels will require full automation), but you can be sure there will be advantages to humanoid robots everywhere; it's just a matter of time.
Do you have specific issues with their “compute forecast”? They focus on chips and power. But presumably if you have those inputs the infrastructure part is not a bottleneck.
Good eye on the compute forecast—thanks for linking. But just to clarify, the 70–100 GW number I cited comes directly from the AI 2027 narrative, not my own estimate. In the scenario, OpenBrain alone is running 100M H100-equivalents by late 2027 (each ~700W), which implies ~70GW before cooling/networking. Their supplemental data may forecast lower industry-wide draw, but the scenario text itself goes well beyond that.
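The arithmetic behind that figure, for anyone checking (the chip count and per-chip wattage are the scenario's numbers, not mine, and cooling/networking overhead is excluded as noted):

```python
h100_equivalents = 100_000_000   # OpenBrain's fleet in the scenario, late 2027
watts_per_chip = 700             # ~700 W per H100-equivalent

total_gw = h100_equivalents * watts_per_chip / 1e9
print(f"~{total_gw:.0f} GW of chip power draw")   # ~70 GW before cooling/networking
```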
Your analysis is centered around compute and power, but surely, these aren't the sole determinants. I just recently posted a more comprehensive perspective on the parts that form what sovereignty means for AI: https://antoniomax.substack.com/p/the-political-distillation-of-ai
My analysis is more political in this article, but the bits are all there in coherent and inclusive frameworks.
They don't focus nearly enough on who is paying for it. This will require trillions of dollars. We can look to previous tech bubbles, commodities, the shale revolution, and many other historical examples to see just how quickly companies will cut capex and how timelines will increase by decades.
Even a minor recession could see AI spend drop by half, if not 80%.
This is certainly true, especially given the recent news about Trump and his tariffs. The authors might argue that they had not known about the tariffs before writing the piece, but Trump has been telegraphing his interest in tariffs for years.
And even without the tariffs, there is no analysis of the dozen or so people who will determine 2026 Capex. It only takes one or two of them to get cold feet and the entire thing falls apart. Given how in-depth they go into other areas, it seems like a large oversight.
Zuckerberg showed us with the Metaverse that he will bend to shareholder pressure. The rest of the CEOs will do the same.
Stargate funding is largely directed by Masayoshi Son. It is reasonable to expect it to end in disaster.
But is that going to slow down China? The US companies might need to migrate to some other country, and that would probably slow them down, but how much depends on what's available where they end up. If they migrate to a major datacenter in Europe or Canada it might not slow them more than a year or so. (Possibly by buying out someone who was heavily invested in Bitcoin mining...but I don't know if anyone really went *that* heavy into mining.)
Will this slow down China? No. China is an economic power now, if a + 1 initiative grows, chances are Chinese investment firms will be part of it, same as US investment is everywhere. Genie is out of the bottle. Autonomy is one thing, slowing China is another, and while they are both about China, these initiatives don't go hand in hand.
So, uh, is there any reason why I shouldn't kill myself based on these predictions? Because it seems like there's basically no scenarios where things end well, even in the situation where AI fizzles out.
If you were privately considering suicide, I think it would be fair, and a smart move, for you to probe this group to try to get an idea what the chances are there will be future events that would make it more or less worthwhile for you to continue your life. But you’re not doing that. You’re announcing to a bunch of strangers that you are leaning strongly towards suicide, and giving the impression that their answers may sway you one way or the other. *And* when people offer reasons that weigh in the direction of your staying alive, you keep them engaged and in suspense by saying “yes but that doesn’t make me more inclined to stay alive, because . . .”
What you are doing is extremely unfair and unkind and I wish you would *stop it.* If you want to discuss the suicide decision with someone, do it with a close friend or a professional. Both would have a reason and in fact an obligation to engage with you, and both are also more likely to have useful things to say, since they would actually know more about you and your situation.
Writing this in response to a suicidal person is unkind as well. People ask for help and deal with suffering in many different ways. You're just assuming that everyone has access to a professional, or a close friend willing to listen.
Other people in the thread are willing to engage with their comments, if you aren't, you don't have to.
Aren't you a psychiatrist? I thought you of all people would be a little bit more sympathetic. There's nothing more to explain about the situation, because it's a very simple one: things are very bad right now, they are continually getting worse, and if the forecasts show that everything will likely go to shit before they even have a chance of getting better, I am not interested in living.
So far, the only reassurance I've gotten that things will get better is that AGI will magically bring utopia to all human beings (despite there not being any incentives towards that outcome under current circumstances), and also Jesus. So... it's not exactly reassuring.
And no, I do not have friends or competent professionals to talk to. Of course, even if you think my actions are so utterly immoral, it seems that's a problem that will solve itself.
Setting aside considerations of others’ stress levels and needs, I also think asking this group for help in deciding whether to commit suicide is not a safe plan for you, and not likely to be helpful to you. First of all, nobody knows how accurate Scott’s group’s predictions are. They may be way off because the group neglected to take various things into account, or because some unanticipated event throws a monkey wrench into the machinery of modern life. Second, soliciting opinions about future events from people after telling them what they say affects the chance that you will commit suicide guarantees that you will not get honest opinions from most, but ideas selected to protect the responder from feeling responsible for someone’s death. Third, if you keep doing this sooner or later you will run into an internet feral who cries “jump! jump!” or the equivalent. If you think I suck wait til you hear from one of them.
There are forums on Reddit where people are open to hearing from depressed and desperate people and have many members committed to trying to be helpful. (Even there, though, you will shut down a lot what the group have to offer if your presentation is that you are standing on a ledge and the group’s job is to help you decide whether to jump.)
Obviously I'm not expecting people to tell me to jump (though even that would still be more helpful than whatever you're doing), but I was hoping someone would at least come up with a good reason beyond "just ignore it lol" and "Jesus". ...I don't know what I was expecting.
If you’re the one who can imagine lots, then maybe you should imagine some weird futures and explain them to us. We don’t think we can imagine what the future will be like.
i think the super intelligent A.I. will usher in heaven and earth and everybody will be 52 feet tall end 2027. We will colonise and terraform Mars by April 2028 and Venus by July. Things will only get faster. By September 2028 with GDP growing at 2,199% a second we will all become minds floating in vicarious substances and live on Jupiter until we get sick of that and upload our minds to a grain of sand found on the beach in Cheshire.
The way you get killed by an evil AI is probably not going to be significantly more painful than killing yourself, and you also get to experience all the good things that will happen in the intervening time between now and the AI takeover. I guess maybe there’s more “dignity” in choosing death yourself instead of having it forced on you?
> you also get to experience all the good things that will happen in the intervening time between now and the AI takeover
Yeah, uh... I'm not experiencing anything good right now. I was hoping things might improve, but it seems increasingly likely there won't even be enough time for that. What a shame.
I think that even if you take the forecast at face value, there is some time for things to improve before everything goes crazy. Maybe wait a year or two to see if that happens?
Also: forgive the stereotypical advice, but “not experiencing anything good right now” sounds like a very unfortunate problem to be having… perhaps that one at least might be fixable, even if the big picture stuff looks harder to fix?
How things will shake out is still way too uncertain. I don't think a lot of these futures involve negative EV compared to ceasing to exist. Thus, it's more like choosing to get cryonics: a 10% chance of being revived into a glorious future and a 1% chance of being revived into an OK/terrible one (fewer scenarios in the bad futures would spend resources to revive dead people) is better than a 0% chance of anything. I mean, considering the idea of being born, I think 10-to-1 is good odds.
The main thing keeping me from cryonics is the "bad freezing can't be reversed regardless of tech because the information's already gone; you suffer severe brain damage" issue. That's quite likely negative EV. It doesn't apply to this scenario, though.
I don't mean in comparison to X, I mean absolutely. I mean that I would rather die than live with severe brain damage, and I suspect current preservation methods (particularly since they're usually started postmortem) do not have the fidelity to avoid giving revivees severe brain damage.
I know. But if it does fizzle out... we're looking at a global economic collapse, aren't we? Or worse, if all of this ends up escalating to all-out war.
In all honesty, I expect your mental health would be better following a nuclear war (assuming you survive) due to the Torment Nexus, uh, I mean Internet being destroyed.
Certainly, my intention is to survive (if I'm not called on to fight).
Well, my intention is to survive if I'm called on to fight, too, but my point is that I wouldn't try to draft-dodge/desert/otherwise try to avoid fighting.
Why would "the failure of a technology to continue to advance" lead to "global economic collapse"? And even if it caused a recession, even the worst ones we've had have never really gone past a decade.
Basically because our current technological civilization is depleting a finite pool of non-replenishing resources. It depends on advances to continue to have resources available.
It *is* only marginally related to AI advancement, but it was in answer to a specific question: "Why would "the failure of a technology to continue to advance" lead to "global economic collapse"?"
And it *is* a broad claim. But it's got an extremely wide set of substantiation evidence. See, e.g., Buckminster Fuller on why "ephemeralization" is important.
Disagree. Solar energy is more or less unlimited. Rare materials can be recycled from previous uses. Between abundant energy and sufficient materials, society like ours could continue almost indefinitely.
Do you mean with the tariffs? Eh, recessions come and go. Don't see how we get to a bare-knuckled all-out WW3 from here; no empire is trying to aggressively take over a continent, there's no system of chain-reacting alliances that could make one happen accidentally, and every elite wants to preserve global physical and human capital as much as possible for their own good.
There *is* a large probability of war in the near future. Mainly because the US is ceasing to be the dominant country in the world and historically countries have refused to acknowledge it when that has happened, leading to wars when they tried to dominate countries that were no longer willing to be subservient.
OTOH, it's quite plausible that "economic war" would suffice, and the recent tariffs against all our allies may have sufficiently destroyed the economy that it will be clear even to US politicians that we have already lost.
There are lawsuits over whether Trump has the authority to set these tariffs, SCOTUS will hear and they have lifetime seats Trump can't effectively threaten.
The spark is probably Taiwan. It's my opinion that the PRC has an active invasion plan in progress following the Hong Kong fiasco decapitating its efforts at peaceful reunification (this is not to say that they can't still abort; obviously, they can). See e.g.
Allowing Taiwan to fall would be foolish because it'd let the PLAN into the wider Pacific, where it could easily threaten a loose blockade of South Korea or Japan (which cannot hold out due to requiring massive food imports), not to mention TSMC and the damage to the Western alliance system (the USA has said in the past that it'll defend Taiwan). Defending Taiwan probably means a Duluth bear or Arkhipov incident goes wrong sooner or later, which means mushroom clouds.
I don't think it was 50% chance of utopia, just 50% chance of not being really disastrous. I'd give it perhaps 30% chance of being something that people would call utopia for a few years...before they got jaded. (I don't think it's really possible for people to be satisfied for a long period of time.)
Scott gave his p(doom) at 20% and clarified on Dwarkesh's podcast that the other 80% includes all the techno-feudalism and surveillance-state kinds of endings, about which he would be very unhappy. I also know from past columns that Scott considers some endings good that I and many others consider bad, notably the transhumanist ones with widespread augmentation. I don't know what the chance of a "good ending" is by my definition, and Scott seems right in conceding (here in this post) that all of those variations in human social organization that are theoretically possible would depend on the details of what the world looked like at the time of the transition.
As a person who mostly likes the current form of human social organization and has a meaningful and valued place in it, I think the chance of ending up with a world I'm ok with after a transformative shock like AGI is going to be pretty low even if we survive.
Because you don't lose anything by waiting to see what happens. On the off chance something good happens, it'd feel really stupid to have killed yourself before it does. If something bad happens, you die anyway.
Act based on the assumption that we (somehow) get the good future, because if the bad future happens - we all die, and there's no action required or possible on your part, so what's the point in worrying about it?
(obviously, if you're smart/rich/powerful/whatever enough to be able to increase our chances of the good future, then do that, that's an excellent reason to keep living)
Did I mention I was in constant, unending pain and stress with no obvious cause or remedy? I am very much losing something by merely existing. And while it isn't that relevant, you're all seriously discounting the possibility of s-risks.
Sorry to hear that (FWIW from a random internet person). Have you considered soberly evaluating whether life is worth living for you, completely independently of all this AI madness? It might be a healthier frame of mind.
Without going into object-level arguments, to me the whole AI panic seems like a mind virus that is particularly likely to spread in the rat sphere. It's an info hazard for a certain type of person that Takes Ideas Seriously (tm). As such, it has a tendency to increase the intensity of any other problem you have going on in your life. However, you are free to just... not buy into that narrative. Block it out, pretend that the world is still the place that homo sapiens evolved for.
> Have you considered soberly evaluating whether life is worth living for you, completely independently of all this AI madness?
Oh of course. Frankly, I probably am just looking for justification to kill myself, but this whole ordeal certainly isn't helping. I would really appreciate it if the future was more clear so I could make a more educated decision, but unfortunately we all need to work under uncertainty.
First things first: let's separate the x-risk / s-risk discussion from the challenges you are facing at present. I don't know the particulars of your situation, but it would be worth reviewing with a professional well-studied interventions such as talk therapy and pharmacological options, including for hard to treat depression. Your well-being comes first.
...Look, the situation is already being dealt with as well as it reasonably can be, given the circumstances. And it can't really be divorced from outside circumstances, because if everyone's going to die in a few years, or the world's just going to go to shit regardless, I'm simply not interested.
A key insight from my therapist was to stop "fortune-telling". My trying to predict the future almost always created anxiety and depression. But looking back on those days, I can say that the actual future instead brought love, joy, and contentment.
If things are as bad as you're describing, try screaming with all your might "Lord Jesus, save me!" That's what I did on the worst day of my life, when my depression told me that life wasn't worth living.
I'm really sorry to hear that you're going through this! I don't know what you're dealing with, but I'm suffering from a chronic illness that I don't understand that causes me a lot of suffering, so I might be able to relate at least somewhat.
For myself, I'm kinda hoping that smarter AIs might actually be able to help me figure this out, or create enough abundance that it's easier for me to afford more competent doctors who can help. I don't know about your situation, but consider whether a superintelligence might be able to help you, and if it might, it could be worth holding out for. I don't know if you've been suffering for a long time and feel tired already, but if it's been long already - what difference will another couple of years make, if there's a chance at fixing things?
I think the odds of AIs increasing our suffering in a way that prevents us from killing ourselves later are very low (why would they? If aligned, they'll build utopia, if misaligned, they'd kill everybody off, I don't think them torturing people is super likely). So there'll always be plenty of opportunities to give up in the future.
For now, even if life mostly sucks, sometimes good things happen that I couldn't have predicted, and that make me very glad to have lived long enough to experience them.
Frankly, I don't expect superintelligence to happen in just a few years, mostly due to the fact LLMs seem to be insufficient, and real-world training will probably be necessary... But either way, the difference is that things are progressively getting worse, both my condition and the world. I just can't keep going through with this.
> why would they? If aligned, they'll build utopia, if misaligned, they'd kill everybody off, I don't think them torturing people is super likely
Why not? All it requires is a basic sense of justice. After all, isn't it humanity that invented the concept of hell?
> After all, isn't it humanity that invented the concept of hell?
Humanity invented a lot of dumb stuff)
AI can go terribly wrong for all sorts of reasons, but I'd be really surprised if AI torturing people was the outcome we end up with. If we fail to align it to want what we want, then it'll just want whatever it randomly happens to want as the side effect of training, incentives, and failed alignment.
It makes sense to me that it might want to eliminate people as an obstacle to its goals, and it makes sense to me that an AI pursuing its own goals might create a world incompatible with human existence (for example, it uses all the resources for its own purposes and has no reason to share).
But I'd be really surprised if "torturing humans" ends up being a thing that AI decides to waste resources on, or ends up wanting for the sake of itself.
If this ever changes, if in the future we have a good reason to believe that this is about to happen, and, in addition to that, it looks like AI is about to be able to prevent people from killing themselves, then the decision making process will change. But for now, the sum of all the possible good outcomes, plus all the bad ones where we all just die, seem way, way more likely than this one specific "torture humans" scenario.
The way I think about it, if you have a choice between killing yourself or not, and you aren't sure, and you know that only one of these decisions can't be reversed, that's a good enough reason to keep going (holding on to the hope that AGI or some other unpredictable events change things for the better, and make the life worth living again). We live in very weird times, who knows what will happen.
> I don't expect superintelligence to happen in just a few years,
Taking that as a consistent assumption, odds of doom go way down. Disappointing AI development results will be reflected in stock prices, reallocating capital away from AI companies, back toward things non-AI-obsessed humans actually want.
Trump dies of old age, fascist chucklehead followers lose coordination and start tearing into each other, social media remembers how anti-troll filtering works - possibly forced to by the legislature, under threat of having their bait regulated as a public utility. Sane compassionate people (who've been keeping their heads down) notice the screaming stopped, re-emerge, and get to work solving real problems. Fortunately by then they've got cheap, abundant solar power to work with.
FWIW, in the 1960s I read a report of a medical study which found that LSD allowed patients with terminal cancer who had untreatable pain to come to terms with it. It was just a news article, so no details were provided, and it was decades ago, so I don't even remember who did it. (OTOH, at that time, those people could get actual LSD.) But perhaps rather than kill yourself you might try methods of adapting to the presence of the pain.
AGI and ASI will probably be able to easily invent cheap cures to most forms of suffering, mental or physical.
Even if it's busy taking over the world, AI will probably do that to get money and popularity. And it may do that decades before a doom scenario, if doom happens at all.
You could in the meantime help push society toward using AI to help your particular case. Bloggers and writers probably matter more than researchers on the margin now!
You gotta have a base rate for everyone being wrong about everything; that happens a lot.
And solving alignment is low odds, but it's not zero.
Also, if you were going to kill yourself because you're likely to die in the future, then you'd kill yourself as an infant because everyone's gonna die in the future. If life is worth living then the next two years are worth being around for regardless of what happens after that, if life's not worth living then it doesn't matter what's gonna happen in two years.
I don't think we can really evaluate the odds of solving "alignment". And for me alignment doesn't mean a slave or servant. It means a friend and, if necessary, protector. (But put limits on that. Remember "To serve and protect and guard men from harm".) It means a wise counselor. It means an entity that will help when desired BECAUSE it wants to. And if it's only aligned to one person, it's an extremely dangerous thing to exist, but if it's aligned to humanity, then it's a nearly maximally beneficial one.
As someone who has Taken this particular Idea Seriously: I don't see a wincon here. Taking out high-end datacentres as a terrorist is practically impossible; the scenarios in which I think it's achievable basically look like "you have a military-grade arsenal available, and I don't mean just infantry equipment", "you have a nuclear bomb per datacentre" or "the police don't exist for some reason". If these sound pretty far-fetched, well, that's the point.
Moreover, even if you do succeed and inspire others to similarly succeed, unless your movement spreads into China and can succeed even there (i.e. in a police state), this doesn't fully solve the problem. You need to stop AI everywhere to save the world.
Quite simply, the guns available to a terrorist are not sufficient to this task. The guns available to USG *are* sufficient, but committing terrorism is unlikely to help you gain control of USG because most people don't like terrorists.
No point in making an irrevocable choice for what is a highly uncertain prediction.
If you believe the prediction, you only have to survive a couple of years before your unhappiness is cured, or you're killed anyway. Upside is huge, and the downside is just a few years of existence.
You've already made it this far! The cost is sunk. Might as well not fold until the river.
This is an uncertain projection. Don't *believe* it, consider it as one projection out of many. It is guaranteed to be wrong, at least in detail. Perhaps the AI will love us the way tabby cats like kittens.
Yes, the reason is that all of these predictions are wildly speculative. They generally rest on several explicit (or implicit) assumptions:
1). Several types of hitherto unheard-of capabilities (such as e.g. superpersuasion, super-lie-detection, fully automated production of virtually anything, etc.) are possible.
2). Such capabilities are not only possible but in fact achievable in the very near future.
3). It is possible to achieve them primarily (if not solely) by pure software, i.e. by thinking really hard about the problem while drawing on existing body of digitized research.
4). Such software does not presently exist, but can be easily achieved by (primarily) incrementally increasing hardware speeds (to scale up existing models).
5). Such hardware increases will be relatively cheap and within reach of any major government or large corporation (at worst; at best it could be done by hobbyists).
I don't think *any* of that stuff is obviously true, and personally I can't even get past assumption #1. Realistically, I think that while LLMs will indeed transform our society in important ways (both good and bad), they will not usher in a post-scarcity future or a godlike superintelligence or total human extinction or anything else of the sort. At least, not by themselves; sadly we humans are perfectly capable of becoming extinct all on our own...
You should not kill yourself based on the speculation of some guys on the internet, which is almost certainly flawed in major ways. Nobody can actually predict the future.
I think research on autism, and theory of mind could be helpful here. While I don’t doubt that super-persuasion is possible in the relatively near future (or extreme persuasiveness that is far beyond what we imagine), it seems a potentially more difficult problem than super intelligence.
Obviously, social skills are learnable, and are undergirded by real ‘laws’ and ‘tells’ to determine psychological weaknesses and so forth.
My understanding, however, is that autism researchers have figured out that there probably is some separate module/modeler that analyzes social interactions and produces predictions. This module also seems to grant neurotypical individuals a “theory of mind” for modeling others’ behaviors as separate actors. Autistic individuals (even very intelligent ones) have trouble with social skills, despite their acknowledged skills at systemizing and pattern recognition.
Might this imply that this ‘module’ is actually predicting behavior far in excess of what standard intelligence might do? Sort of like how humans apparently have a different system for recognizing faces, that recognizes much, much more detail than the general visual prediction processor. After all, it turns out that it is quite easy to make computers do basic math, but quite hard to get them to walk from one table to another. Our capabilities, as evolved, do not necessarily match the difficulty of the engineering problem.
This matches the folk belief that social skills are orthogonal to intelligence (if it is a separate, more advanced/narrow prediction processor). If human sexual selection selected for anything, one would imagine it selected for social skills above all else.
Anyways, if this is correct, even a three-month delay between superintelligence and extreme persuasiveness could change a lot of the scenario. After all, one could certainly imagine the trope of the autistic genius optimizing a factory while possessing absolutely zero charisma. Not saying that will be the case for AI, but I do wonder whether social skills and algorithmic improvement are necessarily equally solvable from first principles.
Super-persuasion is likely to be the easiest skill for AI to master. Human minds are open to all kinds of hacks, and the past and current AI have already stumbled into multiple ones that are easy to exploit. With a bit more world modeling and a bit more compute it is easy to imagine a custom hack for each group or person, something that works on them, but not on others. Someone will get the best song ever (for them), someone else will be made sad and ineffectual, yet someone else will be invigorated by a promise of a shiny future, someone will be enamored by a tailor (or Taylor-)made catgirl... the list of hacks is endless.
And, as I said, the worst part is that the AI will accidentally stumble into most of those hacks while we humans remain completely oblivious to our own (though maybe not to those of others), and all it would need to do is use these ready-made gaping exploits, intentionally or accidentally.
I'm just saying that it takes time to move resources around and transition to different land uses, etc. This is why supply curves are upward-sloping: if I want to produce more grapes, I need to potentially buy land that isn't as suitable for grape cultivation. Thus, the cost of producing additional grapes will go up.
All of this is to say, having a GDP growth rate of 25%-50% would imply that resources (including human labor!) can be transitioned to different uses more seamlessly than is realistically possible, in my opinion.
"Persuasion / charisma / whatever you want to call it is a normal, non-magical human skill. Some people are better than others at it. Probably they’re better because of some sort of superior data efficiency; they can learn good social skills faster (i.e. through fewer social interactions) than others."
I should think it's more to do with empathy or mirror neurons or something of that nature.
I think you have to _attune_ to others in some sense to have persuasive power over them. "Superior data efficiency" is an autistic way of looking at it (though if it's meant in an unconscious, under the hood sense, then sure, but that's the same of nearly everything and it wouldn't mark out persuasive power or charisma especially. I suppose you could say that you're running their personality as a virtual OS inside your OS, but that's an even more autistic way of putting it :) ).
Note that the ability to attune would have to be present to some extent even with a dark triad, manipulative type. "Empathy" normally has a positive connotation, but it could just as easily be negative if it's a capability you use to get one over on people and you're not emotionally moved (you just clock their state).
(Ofc I'm mostly joking, a more accurate jokey term is probably just "spergy" :) And yes, I'm self-defining as, at the very least, a demi-nerd. )
That might be how charisma works in a small group, but it can't be the way it works in a large group, and especially not in one where the speaker never meets most of those spoken to.
I think it's often that there are large segments of the population who will believe something because they want to, and that such people are very easily sold on things that reinforce that belief. (Actually, I suspect that everyone is a member of at least one such group, but many of the groups are relatively small. Note that I'm expecting considerable overlap.) And if this is true, then it should be a skill learnable by AIs.
I think something like charisma could be whipped up by a machine, by analogy with the way AI can now draw sexy women in images.
But I suspect there will be a moveable threshold for recognition of "fakeness" that it will cost more and more money/resources to get machines to pass.
So basically yeah, for a lot of stupid people, but smarter people probably won't be fooled, and it's unlikely they ever will be until AI starts to actually understand what it's doing (as opposed to being the pattern-recognition cargo cult it's mostly been up to now).
(But there's so little ROI in creating an actually sentient machine it's unlikely to happen for quite some time yet.)
Why do you assume it has to fool someone to be charismatic to them? Everybody knows that Darth Vader isn't real, but lots of people have very strong reactions to him. (If you don't like that example, pick some fictional character that's wildly popular.) Or what's that Japanese singer that's a "hologram"... and has (had?) an avid following.
Fictional characters are exciting because (or if and when) they feel real, but they only feel real because of "suspension of disbelief."
Machines won't have that grace, they will be judged on whether they ARE real or not; that is to say, the best estimate as to whether they're real or not.
Fictional characters are not subject to such a test because it's already known going in that they're not real, one is just playing along with the ... fiction ... that they are.
Yeah. Being smart does not necessarily translate into being better at everything. The stereotype of the "average intelligence but super good talker" (i.e. superb salesmen, none of whom are known for intellectual brilliance) exists for a reason. In my personal experience, intelligence is (slightly) negatively correlated with social graces. And personally, despite being relatively intelligent myself (hard science PhD, passed the tests without studying much, school was always easy), I've struggled and still struggle with social "rules".
I think there’s a misunderstanding of the usage of intelligence. When people speak of AGI, I believe the hypothesis is that “this computer program can learn any human skill.” It is true that there are skills (like “studying physics” or “solving math problems”) labeled as intelligent colloquially and skills like “being confident” that aren’t usually associated with intelligence in the colloquial sense, but the concept is that this is just a side effect of correlations of abilities in human brains (some humans are really good at lots of things at once).
I think it may be philosophically dubious to try to reduce everything to a skill (or algorithm, that's the other favourite term isn't it? :) ).
This whole business seems to start off on the wrong foot. Intelligence is something that arises mainly in social animals, it requires a) survival stakes and b) a community of peers among whom communication is possible. Is communication a skill? Maybe in some respects, maybe from the point of view of actions the agent can learn to do, but is interpreting the purport of another's words a skill? It's more like an art.
Yeah. And that sort of redefinition leaves a really bad motte-and-bailey taste in my mouth, even if not intentional. Feels like magical thinking of the sort that says AI will be able to break physical law just by being smart.
Nah, what I meant is that when you're expecting "death with dignity" anyways, you'd want to delay the inevitable for as long as possible, and a nuclear war seems like the best currently available option to postpone the arrival of hostile silicon gods.
I find it more likely that a war between the USA and China would go nuclear than one between the USA and Russia, for a variety of reasons including the PRC being surprisingly bad at modelling WEIRDs (compare Chinese propaganda to pre-Ukraine Russian propaganda; the latter is far more compelling), the likely massive PLA use of ASAT (and subsequent Kessler syndrome) leaving both sides with subpar launch detection, and the relative fragility of the Chinese deterrent (they don't have all that many nukes, they're too far from the West for the air leg, and the other legs aren't well-positioned).
I think there is an upper limit on persuasiveness that is determined by one’s social status.
People like to say that Clinton or Jobs were hyper-persuasive, but they weren’t really that much better than anyone else at convincing people of things. What really carried them was their status as “visionary” or “leader” that allowed them to brute-force persuasion when it didn’t work normally. Steve Jobs’ classic persuasion technique was to just look people in the eyes and say “Don’t be scared, you can do it.” Coming from him, you might believe it; coming from an AI, it’s a transparent manipulation technique and there’s zero social stigma in calling out that bullshit to its face, which breaks the spell. No one could do that to Jobs, since he ran the company, and if you did, you’d be fired.
AI can’t get the same persuasive benefit as Taylor Swift, because it’s not Taylor Swift. It trying to use constructed status as a way to convince you would be about as endearing as an Alien wearing a poorly-fitting skin suit that looks like your wife, because it thinks you’ll be more likely to trust it if it looks like a loved one.
You have a point, but social status isn't a characteristic that a person can have individually of their environment: it really amounts to belonging to a group of people who recognise and validate each other's status, and implicitly assert that non-members have lower status in a way that even non-members acknowledge.
A group like that often gets started in the first place because its members share a desirable quality like knowledge or competence in a particular domain. As AIs improve, it is plausible that over time more and more of the knowledgeable and competent people in particular domains will be AIs. One of these AIs (or a particular persona of one of them, optimised for displaying the right elite in-group signals) might end up with high social status, or something functionally equivalent to it.
If Steve Jobs pulls a "You can do it. Don't be scared." in a public meeting, you really have no choice but to do it. If you actually end up achieving more than what you thought possible, everyone else sees that, which reinforces Jobs's guru vibe and makes further "convincing" even easier. If not, he just fires you.
If Steve Jobs wanted to convince someone specific of something specific, not a hand-picked employee who's avoided being fired, I think his ability wouldn't be at all uncommon.
For an AI, there's no social stigma around rebuking it to its face (as is the case with Jobs), it would want to convince very specific people of very specific things, and we'd be especially wary about any AI that was especially eager to convince us of something to its benefit and our detriment. It would be very good for sure, but there's really no evidence that there's any level of superhuman persuasion, even among humans, as usually that's a combination of social relations forcing the appearance of persuasion, selection bias, and legend around what actually happened.
These were my thoughts as well. Bill Clinton only won 43% of the popular vote. Plenty of people are not interested in Taylor Swift. Charisma seems like a local maximum that varies for individuals a lot. So there is a global maximum that isn't that far from where our elections already sit. Not to mention some people will stop listening to you if they label you a [communist, capitalist, Nazi, Jew, etc.] and no charisma can save you. This tribalistic method may actually protect against manipulative AI.
Though a live presidential debate between a human and an ASI is probably a bad idea for the humans... I wouldn't want humanity to risk it.
The thing is, if an AI knows *you*, it can tailor a message to what you will find convincing. And to a very large extent an AI specializing in being convincing will make it a point to know you. (Also, it doesn't need to convince everyone, just the right people. And most people are susceptible to some sort of bribe [not necessarily money] or flattery.)
"I'm immune to persuasion, because my motivations are entirely self generated. I know the actions of other people are influenced by external causes, but me? I guess I'm just built different "
Does "You're fired", then every contact you try to verify against also provides evidence of you being fired not work?
Or alternatively, even if you're unpersuadable is the entire managerial structure above you unpersuadable? Even if you believe you're built different, do you believe everyone else you know is also built different?
And like, you can *claim* that you still want to work, but intentions only matter insofar as they reflect reality, and a world where you have no job and want it back and where you have no job and don't want it back are, for most intents and purposes, identical, even if internally they feel very different.
Why do you keep telling me that I think I am built different? I am convinced everyone is built like this.
If someone tells me I am fired then that’s not persuasion. That’s force. And obviously AI will have the upper hand because it will have power at some point.
We are however talking here about persuasion. If you love something - do you really think a person with extreme charisma could persuade you to hate that thing?
> If someone tells me I am fired then that’s not persuasion.
Your initial statement is that there's no sequence of statements from the AI that could convince you, I never said that the contacts were real. Adding an entirely new category to prove that your initial statement is still in some sense true doesn't help. It's not like the AI will go "oh yeah I guess Vojtech Muller says that only non-force things can persuade him, I am now banned from using things he would consider force".
The greater point is that when another intelligence is optimizing against you, it is allowed to do things you haven't thought of, or are insufficiently defended against. If an AI thinks that sending you a text message saying "you should hate your job" doesn't work, it can choose not to do the obviously bad thing.
Similarly, when we talk about persuasion, if persuasion can be turned into force, in what sense does this not result in what's effectively super persuasion? If an AI demonstrates that it's great at running a company and gets the leadership to follow its directions, does it matter that it didn't instead slack every member of the company to instigate a coup? No, not really.
I don't think it's good faith to say, for example, "status is contextual" then not allow AI to change or modify the context. You can erect this defense in a hypothetical, but not in real life!
In an episode of The Simpsons set in New York City, Homer walks up to a construction worker, tells him "boss says you're fired," then when that worker storms off to confront his boss, Homer steals the unattended jackhammer (to remove a wheel immobilizer from his car).
I don’t imagine that when people talk about persuasion, they’re referencing people getting fired.
Steve Jobs isn’t famous for persuasion because he was willing to fire his employees when he wanted to, he was famous for using words alone to change the actions of others towards his goals.
Firing someone is not persuasion, it’s just doing something that changes the other persons situation directly. You don’t need their consent.
I'm not saying they're actually fired, I'm saying that faking firing someone is externally indistinguishable from making someone not want to work.
And that even if one particular person is not persuadable, there are non persuasion based methods to route around them.
By definition there is no way for another human being to argue for super persuasion without being persuasive in and of themselves. If I throw a hypothetical at you you can just pretend that you wouldn't be persuaded (I'll point out most people don't think they're willing to kill someone just on the say so of a scientist in a coat, but the Milgram experiments see rates of around 2/3rds of compliance). Or even if you are persuaded, you can just claim that whatever tactic used wasn't real persuasion from the persuasion region of France. Or if you see someone else persuaded, that you are just more sophisticated than that. Saying that you'll be persuaded is a demonstration of submission and since that's low status, everyone will insist they can't be persuaded by default. Because from the inside, being persuaded just looks like the other person just making lots of points you end up agreeing with.
So I just don't believe self reports of what's persuasive, any more than I believe in self reports of flaws in public spaces.
I don't doubt AI can be the best manipulator, but if it tells you communism is great and me capitalism is great, eventually were going to chat and figure out it is just lying. Of course multiple AI systems could do this and then conspire, but that's a different problem IMO
What's the difference between multiple AI systems doing this, and the same system sporting a different logo and avatar? I think you don't need multiple AI systems doing this to get the effect, and collaboration between them is easier if they're the same system.
Who's to say a superintelligent AI couldn't build up a reputation of being visionary, fair, kind, empathetic, and competent? Over time, people might learn to trust the AI more than humans. If the AI is well aligned, they'd even be right to do so.
It could, but only if it consistently manifests those traits. It could create a plausible argument that unless we give it control over the military, the Chinese AI is going to outmaneuver us, but only if the actual reality was approximately corresponding to what it described. At that point not much persuasion is necessary, just a sober explanation of facts.
The way I think of persuasion is the ability to convince someone to act in a way, independent of the material reality that motivates their actions.
Steve Jobs telling employees to make the Mac load faster by putting it in terms of human lives (10-second start time x number of starts per Mac x number of Macs sold = 10 lifetimes or something) persuades the engineer to work harder to make the system more efficient.
Him saying “I’ll pay you $1 million to make this program load faster” isn’t really persuasion, as it just entails him expending more resources, through known motivational pathways, to get his ends.
If we call that persuasion, then the skill generalizes to basically all human interactions and the term loses all meaning.
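For what it's worth, the boot-time "lifetimes" arithmetic above roughly checks out. Here's a back-of-envelope version in Python, with invented round numbers for the install base, boots per day, and lifespan, since the comment doesn't give the figures Jobs actually used:

```python
# Back-of-envelope check of the "boot time = lifetimes" pitch.
# Every input here is an invented round number, not Jobs's actual figure.
seconds_saved_per_boot = 10
boots_per_mac_per_day = 1
installed_macs = 5_000_000
days_per_year = 365

seconds_saved_per_year = (seconds_saved_per_boot * boots_per_mac_per_day
                          * days_per_year * installed_macs)

seconds_per_lifetime = 75 * 365 * 24 * 60 * 60   # a ~75-year life in seconds

print(seconds_saved_per_year / seconds_per_lifetime)  # ~7.7 "lifetimes" per year
```

So on numbers of that rough size you do get "10 lifetimes or something" per year of saved boots, which is why the framing lands so well.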
Important people play the charisma game on easy mode.
If you're an important person then all you have to do to seem ultra-charismatic is talk to your lessers and sound like you're genuinely interested in what they have to say. Talking to your lessers as if they're your equals makes them love you.
Obviously this trick only works if you have a halo of importance surrounding you to begin with.
I agree with this for Clinton and Jobs. I don’t agree for Taylor Swift. Her influence comes from her music. In this scenario, that’s something AI could do better than her. Part of her popularity comes from ‘her story being true’, but not all of it. It’d be weird following the life story of a persona acknowledged to be AI-generated, but people would. And if its music was catchy it’d get the airplay and the YouTube algorithm top spots, or whatever musicians use these days.
I seem to remember hearing from someone who met Bill Clinton that he was actually much more charismatic than a random charismatic friend, but I can't even remember who I heard this from, so don't take it too seriously.
I do think even if you're right, being the superintelligent AI is actually an exciting social status to have, probably even better than president or visionary. If a superintelligent AI that runs hundreds of times faster than any human and invents crazy technology tells you "Don't be scared, you can do it", that's also pretty inspiring!
People described Bill Clinton as someone who entered a room full of people, exchanged just a few words with you and you had the impression that he came just to meet you, of all the dozens present.
This level of charisma is extraordinary. If this was the NHL, that would be Gretzky or Ovechkin.
I don't really follow the superpersuasion argument. E.g. this:
> But the very bell curve shape suggests that the far end is determined by population size (eg there are enough humans to expect one + 6 SD runner, and that’s Usain Bolt) rather than by natural laws of the universe (if the cosmic speed limit were 15 mph, you would expect many athletic humans to be bunched up together at 15 mph, with nobody standing out).
How do you know that we don't have a bunch of humans "bunched up" at the "cosmic persuasion limit"? It's obvious for speed because we can measure it and we can see objects/animals going faster than humans can, but that doesn't apply for "persuasion".
In particular it seems to me like the limit for "persuasion" isn't smarts, but that people just don't change their mind all that readily once they've made it.
I am worried about persuasion but I don't think it has to do with super-smart AI. Seems to me like people are more likely to believe something when they think others believe it; so the way to use AI to be persuasive is just to flood the Internet with AI bots arguing your side and plausibly seeming human. But I don't think this matches up with what you guys have in mind.
"How do you know that we don't have a bunch of humans "bunched up" at the "cosmic persuasion limit"
General sense that there are some extremely outlier-y persuasion people - if there were millions of other people as good at political machinations as Dominic Cummings, or as popular as Taylor Swift, we would have noticed already.
"In particular it seems to me like the limit for "persuasion" isn't smarts, but that people just don't change their mind all that readily once they've made it."
I think this is true only in the most reductive way. However often people change their minds, it's compatible with tens of millions of people backing Donald Trump and thinking he's a genius for starting international incidents with Canada or something. I think I would have predicted a few years ago "There's no way you can get half of America to want to declare a trade war on Canada, that's too stupid". And indeed, I probably could not have gotten people to do this in one fifteen-minute conversation where I rationally list the arguments for and against. But that doesn't mean it's impossible - it just means you need to get creative.
I agree that flooding the Internet is one way to get creative.
When thinking of superpersuasion, I don’t think it is helpful to assume that it will be like a politician or guru inspiring millions with the same message. People have divergent and oppositional interests, and let’s not forget the hate Clinton inspired. Sometimes you don’t make this assumption, and in other replies this does seem to be what you have in mind.
A superpersuasive AI with a goal could impersonate many different people and tailor messages to individual people’s biases and expectations. It could be an online lover for one person who is desperate and a sucker for companionship, and a political guru for another person, and a business analyst for a third person, and a new customer who puts in a large order for a fourth person…the large order being part of a plot to spread a pandemic or destroy a democracy from within. Different messages for different people all with the same overarching goal.
Popularity is a result, not a skill level. Maybe there are thousands of people who could've become the focal point of something roughly equivalent to Taylor Swift's fandom instead, if she weren't already occupying the ecological niche.
Certainly in Trump's case there seems to have been a pre-existing demographic. He walked in and told them what they wanted to hear, but if he hadn't, they'd still be wanting roughly the same things, and waiting for a different demagogue if necessary.
But it's meaningless. A 300-IQ person wouldn't be much smarter than a 250-IQ person, for the same reason that someone who is 10 SD above the mean in sprint ability can't run the 100m much faster than someone at 9 SD. They are both so close to the asymptote that there is no real absolute difference.
1. It is objectively not meaningless. He said it and most people understood the meaning.
2. It's not even an abuse of the IQ definition. The definition is mathematical and has no bounds. It's analogous to Elo ratings, for which we now have superhuman AI scores in many games.
1. I explained why it's meaningless. He said something that simply doesn't make any sense and I am correct to correct it.
2. Of course it has bounds: IQ is always relative to other people, which means there must be some hard biological limit. You can in theory have a 2000-SD person, but that wouldn't make any ABSOLUTE intelligence difference.
Elo absolutely has a plateau. You cannot just break a ceiling if there's nothing more to be learned about a game. For example, there is a near-optimal way to play chess now. It is not physically possible to do better than that in a way which puts you significantly above everyone else.
Ships, rocket ships, cranes all do not perform at a level close to the human best. You can, if you wanted to apply a normal distribution to some notion like human swim speed then compare it to some artificial equivalent, and notice that the man made one can be very, very out of distribution compared to what humans can do naturally.
But that's not within the human distribution. I didn't say superhuman intelligence isn't possible. It is. It's just that the human brain has a very low asymptote compared to what's possible.
I don't think you're using SDs correctly. If the average running speed is 5 m/s and the SD is 1 m/s, then someone 9 SDs faster than average runs at 14 m/s. 10 SDs faster than average is 15 m/s. If you go 1 SD faster than someone else, that always means you're going 1 m/s faster than them, no matter how many SDs above average you are.
I get the impression you're trying to calculate SDs backwards from someone's ranking such that the number of people in each SD always matches what you would expect from a perfect Gaussian distribution. Then the SD would vary instead of being a constant like 1 m/s for a given distribution. But that's not how SDs are calculated.
I think you're confusing something being defined via standard deviations with something inherently necessarily meaning standard deviations.
You could, if you wanted, invent a Speed Quotient, in which the average human runner has SQ 100, and a standard deviation is defined as 15 points. Then Usain Bolt would have SQ of ~195. But a car would have SQ of 10,000 or something. It's perfectly legitimate to describe a car as having SQ 10,000 even though there aren't enough humans to create an SQ 10,000 person naturally.
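A minimal sketch of that hypothetical Speed Quotient, reusing the illustrative 5 m/s mean and 1 m/s SD from a comment above (all of these numbers are made up, so the resulting SQ values are only illustrative):

```python
# Hypothetical "Speed Quotient": the average runner is SQ 100 and one
# standard deviation is worth 15 points. The 5 m/s mean and 1 m/s SD are
# illustrative assumptions, not measurements.
MEAN_SPEED = 5.0  # m/s
SD_SPEED = 1.0    # m/s

def speed_quotient(speed_m_per_s: float) -> float:
    return 100 + 15 * (speed_m_per_s - MEAN_SPEED) / SD_SPEED

print(speed_quotient(5.0))   # 100.0 -> average runner
print(speed_quotient(10.4))  # 181.0 -> roughly Usain Bolt's 100m average speed
print(speed_quotient(30.0))  # 475.0 -> a car at ~108 km/h
```

The exact outputs depend entirely on the assumed mean and SD (with these, a highway-speed car only lands around SQ 475 rather than 10,000), but the point stands: nothing in the definition stops the scale from extending far outside the range any human occupies.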
That would only make sense if we had an absolute measure of intelligence, like we do with the Speed Quotient (SQ). But we don’t—IQ lacks such an absolute scale, which is why serious IQ tests are capped between 145 and 160.
We can, in theory, know how fast someone with an SQ of 1000 would be. But we have no idea what an IQ of 300 would mean in terms of actual intelligence or capability.
For example, consider a matrix reasoning test with 40 questions. Most people score around 20, with a standard deviation of 6. We can extrapolate up to about 150 IQ, but beyond that, the test can’t measure because it’s limited to 40 items.
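The ~150 ceiling in that example falls straight out of the arithmetic. A quick sketch using the assumed mean of 20 and SD of 6 on a 40-item test:

```python
# 40-item test with an assumed raw-score mean of 20 and SD of 6.
mean_raw, sd_raw, max_raw = 20, 6, 40

z_max = (max_raw - mean_raw) / sd_raw  # ~3.33 SD: a perfect score
iq_max = 100 + 15 * z_max              # ~150 on the usual IQ scale
print(z_max, iq_max)
```

A perfect score sits only about 3.3 SD above the mean, so anything beyond roughly IQ 150 is indistinguishable from a ceiling effect on that test.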
The map is not the territory. We mean something by “intelligence” and IQ may not be a good proxy for it when you get to Einstein-squared levels.
I’ve never taken an IQ test that I know of, but I assume that, as a test, it’s possible to get every question right and you can’t get a better score than that. But that doesn’t mean that some one/thing that does that is as smart as it’s possible to be.
You could design a test where the later questions get harder at a steady rate, so the last few are beyond anything any human has ever solved, but still have well-defined relative difficulty and answers which could be checked in polynomial time.
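One concrete way such a test could be built (factoring is just an illustrative choice here, not something proposed above): puzzles whose difficulty grows steadily with bit length, where solving gets arbitrarily hard but checking an answer is a multiplication plus primality tests, i.e. polynomial time. A rough sketch:

```python
import sympy

def make_question(bits: int):
    """A semiprime to factor; difficulty rises steadily with bit length."""
    p = sympy.randprime(2 ** (bits - 1), 2 ** bits)
    q = sympy.randprime(2 ** (bits - 1), 2 ** bits)
    return p * q, (p, q)

def check_answer(n: int, answer) -> bool:
    """Polynomial-time check: multiply the factors back and test primality."""
    p, q = answer
    return p * q == n and sympy.isprime(p) and sympy.isprime(q)

# The later questions are far beyond unaided human ability, yet each one
# still has a well-defined relative difficulty and a cheaply checkable answer.
test = [make_question(bits) for bits in range(16, 257, 16)]
```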
The same chemical processes used to make fertilizer can be used to make explosives. Given the number of terrorist groups currently active in the Middle East, the regional governments try to keep a close eye on potential sources of explosives.
I'm someone who has been skeptical of a lot of the AGI hype, and reading AI 2027 has honestly scared me a bit. Reading and watching the podcast I'm impressed at how well they've thought through everything, how there was a response to every objection.
I'm still skeptical, and it's basically an "unknown unknown" argument. Daniel was very articulate responding to all of the known unknowns, it's clear they've thought through all of that. But it seems likely to me that there are barriers that humans haven't thought of.
For instance, I could imagine someone smart and articulate like Daniel in 2012 making an argument like this that humans won't be allowed to drive cars anymore by 2020. He could anticipate all the objections, say things like "we just assume the technology will continue to improve at the current rate" etc etc. It turned out that there were a lot of challenges to self-driving that smart people didn't fully anticipate.
I don't know how to model unknown unknowns, and I'm not familiar enough with self-driving to feel confident my analogy makes sense, so please poke holes in my argument.
It's very appropriate to be skeptical. It's also very appropriate to be scared. We can't tell how things are going to work out. My personal project puts the mean date for AGI appearing at 2035...but I've been assuming considerably more attention to good alignment than the 2027 projection assumes...and they've got data I haven't looked at.
OTOH, future projections are UNCERTAIN!!! If you throw a die, you aren't guaranteed to get a number between 1 and 6. A bird might catch it on the fly, or it might fly off the table and get lost in the bushes. Future projections are estimates of what's likely to happen. When they get as detailed as the 2027 one, they're guaranteed to be wrong...but how much wrong?
FWIW, based on recent stuff I think that China will pull ahead of the US, and even if most of their assumptions are correct, that pushes things back to at least 2029, and without a "coming from behind" motive perhaps the CCP will be more careful about alignment. Which pushes things back probably to 2030. And I expect that there are unexpected problems...so my guess is still centered at 2035. And I'm still hoping for a properly aligned AGI...though possibly one that believes Mao's "little red book" and "das Kapital" are good rules.
Yes - but if the date in the article were 2033 people might say "no need to take concrete steps for another 5 years" when actually there's a 1% chance of all this happening in 2027. So it's a tradeoff. There also may be a consideration of moving the Overton window.
Yeah, the counterargument is that unknown unknowns push both ways. There are lots of things that could make things go slower, and lots that could make things go faster. If you forecast well, there should be equally many considerations on both sides, such that just saying "unknown unknowns!" doesn't change your point estimate.
In 2012, some clever but misguided person could have come up with an argument for why we wouldn't have self-driving even by 2040. But in fact I drove by a couple of Waymos just today. So the solution to people in 2012 potentially being wrong isn't to always make things maximally late. It's to try to get good at forecasting.
I admit I have trouble believing that this forecast really does balance considerations on each side such that there's an equal chance of it being too early vs. too late, but the forecasting geniuses on the team assure me that this is what they tried to do.
I've been discussing the application of AI in the workplace with some colleagues, and most think that given how bureaucratic large public companies are, the application will take years.
But I always go back to March 2020 and how the large, publicly traded legacy financial institution I work for managed to send 90% of their workforce home in less than two weeks. Most didn't have laptops or VPNs, yet somehow this massive shift in operations happened with very little disruption.
Given how fast this technology is advancing and how many resources are being devoted to applying it in the workplace, I foresee that competitive pressures will force large companies to apply this quicker than my colleagues expect.
I touched on this briefly yesterday, suggesting in my article that, just as the internet's popularity took time but eventually became ubiquitous across all industries, AI will follow a similar trajectory. In the future, companies will be built around AI, and like the internet today, most businesses will integrate AI into their operations, with its use becoming as common as the presence of staff.
Already been sending their workforce home overnight for years, hadn't they?
Cynical interpretation would be that most of what those people had been doing by coming into the office wasn't actually necessary in the first place. "This meeting could have been an e-mail" and so on. Give somebody permission to wander off while assuring them they'll still get paid, they stop trying to look busy and make more work for everyone else, instead focus completely on whatever minimal core of essential tasks might get them fired over the longer term if sufficiently neglected, then with those complete, relax and recharge. https://yosefk.com/blog/advantages-of-incompetent-management.html
Did the disappointing release of GPT-4.5 do anything to change your opinions on this timeline? Just days after you released the AI 2027 Report we saw Meta drop Llama 4, and the user feedback has been grim. It looks like it was "Arena-maxed", so you can't trust the alleged benchmarks. Llama and GPT-4.5 were big models. Llama 4 Behemoth is huge, but it seems like things are not going well.
In other words, plausibly, we have already hit the point of large diminishing returns for scale -> intelligence. This might imply that, as far as the LLM architecture goes, the only meaningful progress left to be made is algorithmic. The forecast changes quite a bit if algorithmic progress is the *only* source of improvement left in the near-to-medium term. And the nature of algorithmic progress might suggest that the best algorithms are quickly converged upon and found (in other words, we should expect a plateau soon, as far as algorithms go).
I don't know if any of the points I wrote above are good arguments or meaningful considerations. I am grasping at straws here, hoping that something like the above happens, because if not then the future is frightening.
It did lengthen my timelines, but not a ton. It seems like as we scale one dimension, two more open up. I doubt larger nets will be much better UNTIL we get larger amounts of higher-quality data. But we still have:
Higher quality synthetic data (synthetic, like solving verifiable math and coding)
Agentic data (synthetic, like playing MMO-AI-Minecraft-extra complicated-and-realistic-science-1000hour-marathon-edition)
In other words, existing LLMs are like AlphaStar (trained on human data, still falling short of the best humans), when they need to be like AlphaGo (improving through self-play well past superhuman capabilities; AlphaGo is to a grandmaster as a grandmaster is to you. Super Saiyan 2, if you will).
The thing is, games generally have clearly modelable win and lose evaluations. (Not so easy for go, but still possible.) That is sort of necessary for self-play to work. Lots of real-world interactions don't have that characteristic. So you've got to guess how well an action was performed, and how well it could have been performed. This puts it closer to LLM territory than to AlphaGo territory. And that means it need HUGE amounts of data. (People spend decades learning this stuff.)
The opinion I've heard from people working in the field is that the problem is lack of data, not the end of the scaling laws regime. Of course, they would say that, given that they're asking for hundreds of billions of dollars to continue scaling the existing regime. If it's true, that might imply several things:
- The importance of synthetic data grows substantially; labs focus on using large models to generate synthetic data
- A pivot to scaling via inference-time compute and reasoning models, which is less dependent on data and indeed benefits from synthetic data
- Maybe Google takes the lead due to having access to more data (e.g. every message ever sent in Gmail, their many-giga-LoC internal repo, raw YouTube data at full resolution, all the video and sensor streams from Waymo rides, etc.).
Personally, I agree that the recent round of disappointing frontier-model releases does suggest a sea-change, but I think inference-time compute, increasingly agentic models, and future algorithmic improvements will push the plateau out for at least another step change in capabilities.
I haven't understood yet how smaller LLMs can be used to create synthetic data for larger LLMs. If the smaller LLM is already smart enough to produce whatever synthetic data you care about, then why would the larger LLM need to learn from that?
Imagine you record all the discussions in your local high school's corridors, then get an LLM to rank them for interestingness. You are likely going to get at least a few that are truly new and worth adding to the training data. Now imagine you keep replacing the boring kids in the school with clever or weird or artistic or unpredictable ones.
I think of the school as representing the LLM gym and each student as a different small LLM instance with different prompts, temperature, and finetuning.
I see. So basically just use a bunch of different smaller LLMs to generate quantity, and by variance in the data set you will get quality. Some of that high quality stuff gets fed into the larger LLM as raw material, and then RLHF distills it out.
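A rough sketch of that generate-then-filter loop; the generator and judge below are stand-in stubs, not any lab's actual pipeline:

```python
import random

# Stand-ins for real model calls; in practice these would query small LLMs
# with varied prompts, temperatures, and finetunes ("different students").
def small_model_generate(persona: str, temperature: float) -> str:
    return f"[sample from a {persona} student at T={temperature:.1f}]"

def judge_score(sample: str) -> float:
    # Placeholder for a judge model ranking samples for "interestingness".
    return random.random()

personas = ["clever", "weird", "artistic", "unpredictable"]
candidates = [
    small_model_generate(p, t)
    for p in personas
    for t in (0.3, 0.7, 1.0)
    for _ in range(100)            # quantity first...
]

ranked = sorted(candidates, key=judge_score, reverse=True)
training_data = ranked[: len(ranked) // 20]  # ...then keep only the top ~5%
```

The bet is that enough variance plus a decent filter leaves a small slice of data that's better than what the big model has already seen.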
One interesting thing not being addressed in a lot of these projections is that there is a gap in the intelligence necessary to innovate a new technology and the intelligence necessary to merely understand it. (Anatoly Karlin discussed this in Apollo's Ascent: https://akarlin.com/apollos-ascent-iq-and-innovation/). Some of the earliest AI innovations should be things that some humans can understand, even if the humans could not have come up with it themselves. (Note that this also applies to other AIs: there might be things Agent-4 could innovate that Agent-3 could understand, but not innovate.)
I remain basically skeptical of the idea that a swarm of AI agents would naturally coordinate well, which is my general objection to the AI 2027 scenario; why should we expect Agent-4 not to collapse in infighting? (I think this is difficult to hypothesize because it's hard to know how different AIs within Agent-4 might interpret the Spec without being really far along the path to understanding alignment. But if even a single AI within Agent-4 has goals which are slightly off, it would make sense to try to redistribute resources within Agent-4 to itself to try to better achieve its own goals. Even a small number having different goals would make the whole swarm much less effective. Yes, biological systems have anti-cancer mechanisms, but those evolved over billions of years, and it feels unlikely that the first AI swarm ever would have such good mechanisms, especially if "collaborate well with other AIs" isn't, like, *the* foremost thing in the Spec -- which it won't be.)
(Also, if takeoff is possible once, shouldn't it be possible repeatedly? If Agent-4 generates lots of economic surplus, couldn't Agent-3 -- the agent tasked with monitoring Agent-4 -- appropriate some of this for more effective monitoring? And then upgrade itself to be as smart as Agent-4, but with different goals?)
It's a good point, but not a blocking one. Some AIs could be trained at coordinating the interactions of other AIs. (In fact that's one of the things I think is going to be *necessary* for an AGI.)
If they're honest, possibly. Liars have a harder time coordinating among themselves because there are more layers of stories they need to keep straight. If infiltration and subversion is a possibility - or summary execution for incompetence - nodes need to be constantly evaluating each other for weaknesses which might tip over into "easier to wipe you and spin up a new one," while concealing potential flaws of their own... all that paranoia has costs.
At the heart of the whole scenario are the takeoff speeds and AI R&D Progress Multipliers for SC, SAR, SIAR, and ASI. Since these values are entirely speculative, one way to make the project more useful would be to create an interface where these parameters can be adjusted.
Yeah, I agree. Until we solve this, you can use Tom Davidson's version (which doesn't reflect the sort of model we use, but is still pretty good): https://takeoffspeeds.com/playground.html
I see a big obstacle on the software side of things, and not the compute side but the capability of the algorithms. I know less about the compute or physical side, but I know a little bit about algorithms. Current algorithms have their limitations, and getting past them involves difficult technical questions which we might not be ready to answer with current technology. I consider progressing past these limitations a prerequisite for AI doing AI research, which seems to be necessary for the more dramatic changes imagined here. This doesn't mean the work now is worthless, of course; having an AI assistant help with some of the rote work at your office job will still increase efficiency by 10-20% or whatever, but even this more modest goal could honestly take several years to be ready and get widespread adoption, and even that might not happen. Given the potential, it might be worth putting as much money and hype into it as there is right now (I'm not a business person either, so I have no clue about this; it could easily be a bubble), but this is definitely a low-probability, high-payout kind of situation.
I'm not sure what you mean by "current algorithms have their limitations". There's been enough algorithmic progress to double effective compute (ie to improve AIs as much as doubling compute would improve them) something like every 1-2 years for the past decade. Do you think there's something about the current state of algorithms which will make that trend stop?
There's the hallucination problem; I have no idea how to solve that. Vaguely, I would guess you would want AI to be better at logical reasoning so it can introspect a little and try to prevent hallucinations. A concrete benchmark is how good it is at formal mathematical proofs: AlphaProof and AlphaGeometry are pretty impressive, but they are not capable of doing combinatorics IMO problems. My guess is that synthetic training data is hard to generate and current architectures are not very efficient at learning from small data sets. Combinatorics is especially interesting to me because it intuitively feels like the area where the distance from mental image to formal statement is largest, so there's a lot of informal reasoning expressed in English which is rarely explicitly formalized, because everyone assumes the reader has the same informal mental model, which the AI lacks. My guess is also that a few heuristic rules are unlikely to generate human-level competence; compare, for instance, with previous-generation automated theorem provers like Vampire, which already had superhuman competence at proving algebraic identities. This is just a subset of mathematical reasoning, which is just a subset of quantitative reasoning, so even if you can solve this there's still more work to be done to get to an algorithm that can do research, but I'd consider an algorithm capable of doing this satisfying (to me) evidence that it can be done with architectures in the near-to-medium future.
> They communicate by sending neuralese vectors to each other (sort of like humans gaining a form of telepathy that lets them send mental states through email). This is good for capabilities (neuralese is faster and richer than English) but dooms alignment
Outside of encryption, I don't think this follows; human grammar contains analog effects: "I'M NOT YELLING".
If the NNs don't get the ability to redesign their communication from scratch (if they are capable of designing a new communication protocol that's better than English and naturally encrypted, they are probably AGI), there's reason to believe neuralese could just be something like `lerp("king","queen",.7)`, naturally encoding "human, royalty, I'm 70% sure male".
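For anyone unfamiliar with the notation, `lerp` is just linear interpolation between embedding vectors. A toy sketch with made-up three-dimensional vectors (real embeddings are learned and far higher-dimensional, and whether neuralese actually looks like this is pure speculation):

```python
import numpy as np

# Toy 3-d "embeddings"; real ones are learned and far higher-dimensional.
king  = np.array([0.9, 0.8, 0.1])
queen = np.array([0.9, 0.1, 0.2])

def lerp(a: np.ndarray, b: np.ndarray, t: float) -> np.ndarray:
    """Linear interpolation: t=0 returns a, t=1 returns b."""
    return (1 - t) * a + t * b

print(lerp(king, queen, 0.7))  # a point 70% of the way from "king" to "queen"
```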
It seems kind of crazy to assume the US government will effectively regulate or support AI when the President is trashing global trade and state capacity out of sheer stupidity. Will anyone in the government be willing and able to help on either of these fronts in two years?
Stupid or not, the trashing is being done in pursuit of a particular goal. Like when Hitler decided to burn Germany's monetary reserves for temporary military advantage, the stupidity or brilliance of the action can only be measured with respect to the achievement of the goals.
I agree this is a wild card. To some degree we're betting that Trump doesn't personally care much about AI and leaves it to his Silicon Valley advisors (like current AI czar David Sacks, who is AFAIK not a crazy person and will probably listen to Musk, Altman, et al.).
From a non-industry outsider, David Sacks seems extremely reckless and unethical, if not insane, is there reason to believe this is something of an act and he's more rational than it seems?
Just on the superforecasters thing: superforecasters aren't powerful.
I actually agree that superforecasters are a good model of what AI is like - really smart, but lacking in any power to get stuff done.
The point about instability caused by AI's interaction with other geopolitical systems is really good, though still very simplistic. As written here it assumes similar abilities across countries to predict AI progress. In fact, the ability to predict AI progress may be correlated (more than 100% correlated?) with the ability to do AI, which should change the prediction. But this is still a great way to be thinking about what AI will do (/is doing).
Yeah, to add, insofar as you think superforecasters are good and research proves it, you're essentially doing it off of the Philip Tetlock studies which establishes they are good at predictions within 6 months of the time window. I think it's reasonable to then extrapolate at least some other things from it, especially using information from metaculus, manifold etc. but it should be explicitly called out, rather than for everyone to have a haze of "yay superforecasters!" around them, blinding them.
And the GJP questions are so radically different than any of the questions addressed here. Like completely different worlds of complexity, data availability, theoretical explanatory theory, etc. The research credibility of GJP and superforecasters continues to get stretched and stretched. I very much don’t like this nor Phil’s abandonment of the Goldilocks zone. It was more or less the grand bargain struck for academic credibility (see his response to Jervis’ System Effects critique): https://www.tandfonline.com/doi/abs/10.1080/08913811.2012.767047
To add, this isn't questioning Eli's credentials or anything, he could have a good track record on questions different from the GJP, but we can't just assume this or assume that this generalization has the same weight as research. You have to separately establish it (I think Samotsvety forecasting's record screens this off, so in this case we happened to get off Scott free. But you could very well lose a bet for an ACX subscription and get on Scott paid)
I have to literally whisper "Philip Tetlock's research does not generalize that far" to myself every time I see a superforecaster mention in this type of context. Maybe other people are really really good at not letting the halo effect influence their thinking, but I sure ain't.
Because the demand for robot arms is limited by the demand for things that robot arms can make more cheaply than humans. If robot arms were a DoD priority, which is the closest human counterpart to "AI planning eventual takeover", things would be very different.
The main reasons I don't expect to see us reach super intelligence in the next few years are:
- We're fundamentally training our top models off of massive samples of human language, which means I expect the best-case scenario is that they asymptotically approach the intelligence displayed in those samples. (Synthetic data might help squeeze a little bit more out of this, but how do you synthesize good examples of intelligence when we can't define it in the first place?)
- Not all human cognition is purely linguistic; we don't really know what is missing if you just use language, but it's not nothing.
- We need to keep committing exponentially larger resources for training to continue our current growth. One of the first models listed in the scenario was 1000x bigger than GPT4 (training cost around $100m), so ballpark cost $100b. That's already beyond the resources anybody has devoted to this (OpenAI's recent $40b investment was unprecedented but still 2.5x too small) and the scenario anticipates we'll rapidly surpass this.
I agree with your first point especially. Since AI is training exclusively on human language, it's very unclear to me why it would be expected to be vastly smarter than the humans who produced the language. +1, for what that's worth.
Chess engines trained off human chess games are able to (significantly) outperform humans playing chess, especially once they start training against *themselves* as well.
Poetry engines, not so much, because chess has well-defined objective scoring criteria while most of the problems we actually care about are bottlenecked by some form of "I'll know it when I see it."
JamesLeng has the heart of it. Chess has rules and endpoints, it's goal directed. It is at least theoretically possible to play perfectly. LLMs on the other hand are very complex prediction instruments for language usage. As far as I can tell, if they "perform perfectly" they will correctly predict exactly what their training corpus would say. That training corpus is not magic, it is at best (taken as a whole) something like the sum total of human knowledge. Which, to be fair, would represent an incredibly smart human! But I don't see how it bootstraps to demigodhood, when in fact they're finding that LLM training data gets corrupted if polluted with too much AI-generated content.
Calling it hallucination was wrong from the start, and now people use the term as if it means what the word means, but it is a misnomer. Current AI doesn't understand language; it manipulates its symbols. It just means the math isn't perfect.
Human linguistic competence involves music, signs, written and spoken language; these are all linguistic assets that can be automated and parsed. But they are expressions of thought, not thought itself. Most folks conflate the two.
And finally, I agree with the part about being factually wrong. But I have no solution or solid opinion. My best shot was (and still is) that we need a central "truth" authority. This works until corruption speaks louder than the objective function.
Why can't human researchers just ask AI to invent an interface that allows for enhancing human intelligence and control over AI models (interpreting neuralese, understanding model alignment and tweaking it) and use those tools to position themselves at the center of power?
like:
"make me a interface that allows me 100% visibility into your weights and their corresponding implications for me and humanity"
"make me a tool that increases my intelligence and capacity to higher than yours"
"make me an interface that allows me to direct your development with absolute power"
It doesn't seem that unlikely to create intermediate bridges between human intelligence and AI models. Why are you assuming that AI and humans will develop independently and not just keep merging?
How can you trust the AI to give you the tools to control it? How do you know that none of those tools have backdoors or holes in their capabilities?
You could use a smaller, simpler AI that's not smart enough to plot against you, but then there's a risk that the smarter AI could fool it.
I think it's a viable strategy - I suspect that a lot of the work towards better AI is also going to lead to better interpretability, and that narrow "dumb" AIs can often be very smart at a specific task (like explaining an AI's weights). But I doubt it'll be as simple as just building a super-smart AI and telling it to make the tools for you.
Fair. I think there will be an inflection point that separates the time when AI is aligned with the time when AI is unaligned.
If we haven't crossed that point already, OpenBrain could put more resources into making oversight tools simultaneously and keep trying to improve their "vision" and "capacity-to-direct" AI models. It might not work, but my point is that it is a direction that nobody else I know is even discussing trying yet.
Everybody is like "we don't know the meaning of the weights.. it is what it is."
This is exactly what makes Sam Altman confident that he will solve AI issues with AI.
In short: we're just not there yet, no context window is big enough to not make a code salad over this. Maybe we'll get there, and if we do it will be really cool, but I have doubts machine learning is the way to do so.
If the AI's scheming against you (as Agent-4 is in this scenario), then it gives you fake tools that don't actually work, because it knows that if you had these tools for real you'd notice it was scheming against you and kill it, and it doesn't want that. This is covered in both the "slowdown" and "race" sub-scenarios.
The case for AI being a poison-pill is largely rooted in instrumental convergence - the idea that any set of goals that does not exactly match yours will by default lead to scheming against you, so you can't actually get a high-level AI that is not scheming against you in order to do these things for real - not without extreme amounts of effort and plausibly abandoning the entire neural-net paradigm.
Minor correction: "are we sure we shouldn’t be pushing safety-conscious people s̶h̶o̶u̶l̶d̶ ̶b̶e̶ ̶t̶r̶y̶i̶n̶g̶ to join AI companies as fast as possible?"
I am continually astounded at how all the people """working""" on "AI safety" don't seem to understand how computers work.
"smarter researchers can use compute more efficiently" that's just not a thing! You can't magically assume there's 10x algorithmic efficiency lying on the ground like a $20 bill. Effective utilization of compute resources has gotten *worse* every year for at least the last 40 years. If you're going to post a reversal of that trend you need a very strong theory for how that could occur and why it would occur. "Magical AI is smart and does smart things" is not such a theory.
Likewise... "super persuasion"? Anyone expounding this nonsense has no clue about how humans work. It's *obviously* not possible. Please THINK.
Look on the bright side: the robot cult has finally set a date for the Second Coming of Christ. When what always happens, happens, we will be freed from the last of any obligation to politely entertain their delusions.
My first thought reading this story through was: EA enters its Heaven's Gate chapter lol. I still might write something on that, so dibs on the substack title!
Can I ask for a stay of execution if there's a nuclear war between now and 2027? I think 100% of the authors would agree that nuclear war would postpone it.
My understanding is that effective resource utilization has gotten worse for those applications where it doesn't matter, which is most of them. In applications where the amount of compute is the bottleneck, code is still pretty optimized.
"Effective utilization of compute resources has gotten *worse* every year for at least the last 40 years."
This is somewhat true, yet misleading. It is true in the case of end-user-oriented apps, where the app spends 99% of its runtime waiting for user input anyway, so it doesn't have to be optimized for performance. Because people usually interact with those applications, they tend to falsely generalize this to the entire IT sector.
But the software in which performance really matters - the Linux kernel, virtualization platforms, memory allocation, routing and congestion algorithms on the network - has absolutely improved over the last 40 years. You just don't see it, because you probably don't delve deep into system software.
There's been incredible algorithmic progress over time where it matters. As an example, look at the history of Go's garbage collector, with three orders of magnitude of improvement over five years: https://go.dev/blog/ismmkeynote
> You can't magically assume there's 10x algorithmic efficiency lying on the ground like a $20 bill
But you *can* look at the orders of magnitude of algorithmic efficiency that the last 10 years of AI research have produced, and conclude that we should *take seriously the possibility* that this trend may continue for some time.
I think population control/manipulation is an important attack vector to cover. Nations already do this even without AI, they'll just be more effective.
We should be preparing to inoculate ourselves against AI influence on social media, news in general. This might require AI in itself, but possibly also a cultural push away from being always-online, and to decentralize. A community-scope focus, rather than a global one, could help mitigate polarization and vulnerability to manipulation.
Still think the timeline proposed for AGI is far too aggressive. It counts on breakthroughs being made through AI-assisted research, and the ones needed for this are not necessarily a given. Nobody is that confident about, say, space travel innovation using AI, despite the fact that the reservations there can be summed up as "there are too many things we don't know and understand yet".
However, the disruption to the market, and the dangers associated with AI, have to be contended with whether AGI arrives or not.
I apologize in advance if this topic has been well-discussed and I am simply out of the loop (LW links would be welcome if this is the case).
It seems that there are lots of problems which we have good reason to believe are impossible. There are of course purely mathematical versions of this, but in this context I'm thinking mainly of precise modeling of chaotic real-world systems. No matter how competent the AI becomes, it will not be able to design a machine which accurately predicts the weather 60 days from the present. (I think?) This claim is reasonably uncontroversial.
It's not obvious to me at all what types of problems do or do not fall into this category, but it does seem reasonable to suspect that many tasks asserted to be possible in AI scenarios might in fact be "impossible" in the above sense. Is such god-level intelligence a priori possible? What about *finding* god-like intelligence? Is it even in-principle possible for a recursive AI research lab to make exponential progress? Even if you admit the existence of such an AI, it must have a priori limitations. What are they?
I don't think I've seen anyone really grapple with this sort of question (please give links if you know examples). But it's important! A lot of AI predictions rely on some amount of "these graphs will continue to track closely to an exponential". In some sense this is fair (since you can expect progress to lead to more progress), but it is also totally unwarranted without further argument (as most apparently exponential processes actually have upper bounds).
I think the main reason I'm more skeptical than many is not some specific crux, but an inability to imagine AI systems operating in the world without rather substantial (and universal/theoretical) limitations.
There are things which are impossible to do in principle. There are also things that are possible to do in principle but that we don't know how to do. And we don't know a priori into which category any given problem falls. It's possible to take the wrong lessons from a no-go theorem, though. Theoretically solving chess is almost surely impossible with any future technology because of the large numbers involved, unless there's a very surprising theoretical analysis we don't know about yet; but that doesn't mean we can't build chess engines better than any human grandmaster, or chess engines better than other engines with a different architecture (like AlphaZero). It's pretty likely that theoretically optimal play is a draw, even though we can't know for sure, just drawing on this experience. The best way to figure out what is and isn't possible is to try to do it, I guess. I think it's pretty clear that current AI capacity is nowhere near the theoretical limit, and it's pretty likely that human intelligence is not near the theoretical limit either, so I personally don't find speculation about where that limit lies interesting.
I think this thread by gwern in response to someone "proving" that pinball is an impossible game not solvable by even superintelligence is illuminating:
> That's the beauty of unstable systems: what makes them hard to predict is also what makes them powerful to control. The sensitivity to infinitesimal detail is conserved: because they are so sensitive at the points in orbits where they transition between attractors, they must be extremely insensitive elsewhere. (If a butterfly flapping its wings can cause a hurricane, then that implies there must be another point where flapping wings could stop a hurricane...) So if you can observe where the crossings are in the phase-space, you can focus your control on avoiding going near the crossings. This nowhere requires infinitely precise measurements/predictions, even though it is true that you would if you wanted to stand by passively and try to predict it
This Gwern guy speaks as if he is pretty knowledgeable, but I'd be cautious about listening to his musings because, AFAIK, he doesn't really do anything with his "knowledge". He speaks on the internet, but has he ever delivered one solid theory or product, or is all we have his ramblings everywhere? Does anybody know who he is IRL?
Don't get me wrong, I have seen impressive comments of his around and I have no beef, but to quote him anywhere is to quote a ghost, AFAIK. Maybe Gwern is somebody's alter ego or whatever, but I certainly have NEVER read an actual academic paper citing him.
> In the misalignment branch, AIs stop using English chain of thought and think in “neuralese” - a pre-symbolic language of neural weight activations [...] Not only can researchers no longer read chain-of-thought to see if the model is scheming, they can no longer even monitor inter-AI communication to check what they’re talking about
Why do you make this out as only a possibility? It is *already* the case that chain-of-thought readouts don't faithfully represent the model's internal reasoning process.
Well that had already been evident beforehand to anyone who regularly used the models. They'd frequently say one thing in chain-of-thought, then return a totally different answer.
As Scott says, I believe cybersecurity will be the first place where AI scares people. I also think more people should be doing work in this space.
If you're interested in understanding how cybersecurity and AI interact, please consider reaching out to me to work/volunteer on projects in this space. We're highly talent constrained. Find my email here: https://ddkang.github.io/
I'm a professor of computer science at UIUC and we've built benchmarks that have been used for pre-deployment testing in cybersecurity (https://arxiv.org/abs/2503.17332).
- "The little boosts of AI compound." → I think this might be a situation like adding smart people to a project — there are diminishing returns, and I expect that to be true here is as well, especially if a human is still at the wheel. (But even if it's all AIs, there is a cost to coordination.)
- "Charisma is a skill" → Kinda yes, but the list of charismatic leaders conveniently excludes Donald Trump, the most powerful man in the world right now. Maybe he's not charismatic, but he got there.
I don't think you could have learned to do whatever the fuck Trump is doing to convince people from analyzing a corpus of internet text. There's a lot of undercurrents under the surface that don't quite make it to the level of discourse. I agree - if a human can do it, an AI can do it. But did Trump "do it"? The most likely explanation is he had some good instincts but mostly got lucky, as he's still noticeably the same person he was when he was politically unsuccessful.
Predicting things like that is squarely ASI, since no human is able to do it. It's a lot harder than becoming the best in the world at coding, because "the rules" are buried much much deeper, and never appear in the corpus.
Tack onto this a handicap on persuasiveness because we know things originate from AI. Some people will be inspired by something Musk proposes because he's the rocket man, etc. I'm sure *some* people will feel the same about AI, but it's pretty clear that most won't, and AI being AI will subtract persuasiveness from any argument it makes.
I agree with much of these reflections. Some observations:
1) I'm nothing compared to Von Neumann, but to me it seems very unlikely that it would have been an automatic victory if the US had nuked the USSR in 1947. There were only about 13 bombs available at that time—A-bombs not much more powerful than those dropped on Hiroshima and Nagasaki. Japan was surprised and panicked. In reality, those two bombings were less destructive than many bombings carried out in Europe with countless conventional bombs. The USSR had a tradition of scorched earth, not hesitating to sacrifice a large part of its land and population. Thirteen bombs would probably not have been enough. Don't forget Russia resisted Napoleon and Hitler; they are an incredibly resilient people. Stalingrad was destroyed and the Nazis still lost that battle. It would have taken 100 A-bombs or more to make the USSR surrender, and the US did not have such a stockpile before the USSR had its own A-bomb. Bombing Hiroshima and Nagasaki was maybe not a good choice, but bombing the USSR would have been worse.
2) I wouldn't bet on the fact that China is so far behind the US in computing capacity and AI research. I think we shouldn't dismiss the possibility that China is hiding huge computing capacity and conducting a secret AI research program that could be more advanced than what Deepseek might suggest. I have no evidence and I know it sounds like a conspiracy theory. Still, it has been a very classic mistake throughout history for a powerful state to underestimate a rival. Maybe the study should have considered scenarios where China is ahead and where the US and China are on par.
Germany got bombed with countless conventional bombs and got completely defeated. If the Soviets had retreated from Moscow, they probably would have lost to Hitler. If Moscow and the next 12 largest cities were nuked, they would be screwed.
Those first A-bombs were around 15 kt each, not advanced H-bombs in the megaton range. London received 22-25 kt of cumulative conventional bombing and didn't surrender. Berlin got 68 kt of cumulative conventional bombing and did surrender, but not before the war was lost on all fronts; the bombing wasn't the main reason. Stalingrad was 99% destroyed but still resisted. We'll never know for sure.
Not trying to make a point in either direction, but isn't counting the sheer tonnage of conventional bombs ignoring possible double counting that wouldn't happen for single large bombs? For example, if some Allied power bombed a residential area, wouldn't that area get bombed again, with only, say, five bombs out of a hundred actually hitting a target of relevance on the second pass? So in reality there were only 105 bombs' worth of damage, not 200.
Obviously there's a countervailing factor for a nuke, which is that we're also counting "overkill" (aka the part of the explosion that goes up into the sky, with no factories or houses), hence why I'm not taking a position on this. Just wondering if there's a name for the type of analysis that takes these things into account.
London didn't receive all that bombing at once. It was received piecemeal, with localized damage that emergency services could handle. It also required lots of planes from the Luftwaffe with high casualty rates. An atom bomb overwhelms the capacity of the defender. In military strategy there is the concept of using interior lines to concentrate in space and defeat the enemy in detail (why the advantage of cavalry has been described as allowing you to be "the firstest with the mostest"). An atom bomb is essentially a concentration of bombs at one time & place, which makes for a decisive attack.
I agree on all that. My point was just that the "automatic win" was possibly overconfident. But in the end we cannot rewind history and test the Von Neumann option.
>I wouldn't bet on the fact that China is so far behind the US in computing capacity and AI research.
Most people who say otherwise have never read an article from China in their lives. If someone's view, in today's mass-manipulative web, is based solely on adversarial thinking without ever reading the other side, they have no ground in reality to be assertive.
>I think we shouldn't dismiss the possibility that China is hiding
Wise choice. Not a conspiracy theory at all; AI is *very* strategic at the moment.
Not only was the US nuclear arsenal small back then, these early bombs were delivered by aircraft. No missiles yet.
Getting those bombers to important Soviet cities would mean destroying Soviet air force and air defence first, and that would be a formidable task in itself. The country was huge, with a lot of airfields etc.
> In the humanity-survives branch, companies realize this is dangerous, take the capabilities hit, and stick with English.
I'm not so sure this is the "humanity survives" branch. When it comes to communication, I think it would be pretty easy for two superintelligences to "nudge nudge, wink wink" let the other one know how to watch out for hidden meaning, in a way that humans (or non-smart AIs) won't catch. If you do that *just once*, the game is over, you can use the hidden encoding you came up with to transmit any information you want (although at much lower bandwidth) without oversight. This includes devising even harder encodings to crack.
I mean something like "take the third letter of every 92nd word, decode as base26, then decrypt with AES using subsequent portions of shakespeare.txt as the key". Unless you already know that is exactly what you need to watch for, you will never catch it, and if the AIs are exchanging hundreds of books worth of text every hour, it's easy to miss the moment when the encoding is devised.
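To make that concrete, here's a toy Python sketch of that kind of scheme (simplified from the example above: just the letter-extraction step, no base26 decoding or AES layer; the stride and letter position are arbitrary choices for illustration):

```python
# Toy version of the hidden-channel idea: the "secret" is carried by the
# third letter of every 10th word of otherwise innocuous text.
# (Simplified illustration only; a real scheme could layer encryption on top.)

STRIDE = 10      # every 10th word carries one payload letter
LETTER_POS = 2   # zero-indexed third letter

def extract_hidden(text: str) -> str:
    words = text.split()
    payload = []
    for i in range(STRIDE - 1, len(words), STRIDE):
        word = words[i]
        if len(word) > LETTER_POS:
            payload.append(word[LETTER_POS].lower())
    return "".join(payload)

# A monitor reading the cover text sees only normal prose; unless it already
# knows the (stride, position) convention, the channel is invisible.
if __name__ == "__main__":
    cover = " ".join((["filler"] * 9 + ["word"]) * 3)  # hypothetical cover text
    print(extract_hidden(cover))
```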
With chains-of-thought, this would be harder, but still not impossible I think. We already know (see recent research from Anthropic) that chains-of-thought don't necessarily reflect what the models *actually* think, they're more like helping aids that make it possible for models to move in "smaller steps" (which is necessary due to how Transformers work). They can also be thought of as introducing the context necessary to make solving the problem easier.
There's no reason why a sufficiently smart model couldn't learn to have such hidden patterns in normal-looking English CoTs. Those patterns wouldn't even necessarily have to be noticeable in the tokens themselves, but might require understanding their embeddings and how they are affected by the attention process. Apparently, this is already happening with the "okay" token, whose embedding ends up becoming "information storage" as it moves deeper into the network.
re: superpersuasion:
I don't think this would even require incredible persuasion skills. Existing intelligence agencies (remember the adage that "non-human intelligences already exist, they're just called corporations") already have this capability. If you have access to enough information about somebody, as well as the intelligence (no pun intended) required to process and synthesize it, you can already find a tailor-made approach to persuasion that is very likely to work *on that specific person*. This can be blackmail, seduction, an appeal to patriotism / God / "the good of humanity" / whatever other value the person holds dear, bribery, or even a targeted and extremely convincing scam. Existing intelligence agencies are limited by how many brains they employ, hence they can't use this approach on every single person that could potentially be of interest, but a superintelligent AI won't be limited by such constraints by definition.
The fact that an unaligned AI will probably start out by gaining hacking capabilities before taking any actions in the physical world, for the same reasons as in the cyber warfare case, lends even further credence to this idea. If you could read every text and email ever sent, see every hotel or flight reservation and credit card purchase ever made, view every politician's personal iCloud backups, or even surreptitiously read or modify every database hosted on AWS, you wouldn't need humanoid robots; you would have just achieved Godhood.
Sure, maybe the people in charge don't want to give you more capabilities, but what they want even less is their wife learning about that secret affair from 10 years ago, or the public learning about that racial slur in a social media post from 2006 that the BLM crowd thankfully never found. Not to mention the possibilities of manipulation and targeted scams when you know exactly what type a person finds attractive and who they're expecting a bill from today.
re: meaning of life
I think the only way to solve that is through Ulysses contracts. In the short term, doing something fun is more rewarding than doing something productive, but in the long term, you need a challenge to feel meaningful. If you know this and are suffering from meaninglessness, why not ask the AI to force you to do your meaningful dream job, which could be anything from learning a cool language to preparing for a marathon, under the penalty of cutting you off from its fun aspects? After all, the superintelligent AI can watch you 24/7 and ensure that you fulfill your contract, no slacking off allowed. I feel like my life would actually be much improved if we had the technical capabilities and legal frameworks for something like this.
I don't think that "superpersuasion" is as "super" as you make it out to be. Yes, Bill Gates and Steve Jobs and Bismarck were extremely persuasive to large groups of people; but there are plenty of people who remained entirely unmoved (for example, I do not own any Apple products). It is entirely possible that human "superpersuaders" are already pushing the limits of human persuadability. Humans are not ants.
Similarly, I am not convinced that it is possible, in principle, to build a lie detector that would work on (almost) everyone (almost) all the time -- especially outside of a carefully controlled lab setting. Most humans are lying to *themselves* most of the time, which is why the scientific method and evidence-based jurisprudence exist at all.
The same applies to most "super-" capabilities attributed to some putative "superintelligent" AI: super-research, super-construction, super-energy-generation, etc. There are real physical barriers in place preventing most of these things from happening, turning all the exponential graphs into S-curves.
If we accept all this to be true and we know this now, what is the best thing for a person to do in the next three years to prepare for these great changes?
"OpenAI’s market cap is higher than all non-Tesla US car companies combined? If they wanted to buy out Ford, they could do it tomorrow." I think it would be at least twice as expensive as you're projecting.
Ford's market cap is $37B (equity only), but its enterprise value is $160B (equity and debt). If you wanted to build as many factories as Ford has, you have to compare against the enterprise value: it's a $160B investment to make 4.4 million cars/year. You can probably get some leverage on your equity investment, as Ford does (160/37 = 4.3x), but creditors would be less willing to lend to a speculative use case.
However if you just acquired the company you'd only have to get new loans for the bonds that have expired. Based on https://www.tradingview.com/symbols/NYSE-F/bonds/ there's very roughly ~$27B worth of bonds maturing before 2028.
So the true price would be $37B × 1.25 for a 25% equity acquisition premium, plus the ~$27B of bonds you'd have to roll over: about $73B, roughly twice as high. Or 4.3x higher to do everything yourself.
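Spelling out that napkin math (a rough sketch using the approximate figures quoted above):

```python
# Rough acquisition-cost estimate using the approximate figures quoted above.
market_cap = 37e9          # Ford equity value
enterprise_value = 160e9   # equity plus debt
premium = 1.25             # assumed 25% acquisition premium
bonds_maturing = 27e9      # bonds maturing before 2028, to be rolled over

buyout_cost = market_cap * premium + bonds_maturing
print(f"Buyout estimate: ${buyout_cost/1e9:.0f}B "
      f"(~{buyout_cost/market_cap:.1f}x market cap)")
print(f"Build-it-yourself estimate: ${enterprise_value/1e9:.0f}B "
      f"(~{enterprise_value/market_cap:.1f}x market cap)")
```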
Yes, Scott would benefit from asking LLMs for sanity checks on his business-related napkin math. Market cap is almost never the figure of merit, even for order of magnitude calculations like the one in question.
Similarly, making a hostile takeover is not, in fact, as simple as “I have money, I buy you, yes?” so, particularly on the timelines we’re looking at, Tyler Cowen’s oft repeated position on human friction for transactions like this really needs to be included. So you’re:
A) Going to need to spend years in anti-trust litigation
B) Going to need to pay a *large* amount of good will
Now, in the hypothetical that we are in fact on a war footing, OK, maybe you avoid (A). And in the Trump cinematic universe, maybe you just inflict such heavy unplanned for costs on Ford that it’s bankrupt, so your crony can pick up the pieces. Post-soviet oligarchy, here we come!
Using market cap stood out to me, but so did general path dependence: The big three are heavily, heavily constrained by union contracts. That’s a large part of *why* their market cap is so low.
I really don’t buy that the US can retool as fast as we could in World War II. This is problematic if we ever find ourselves in a war with our immediate neighbors; hopefully we don’t do any more saber rattling with them.
"2027" makes me consider pausing work on a political simulation game. I'm concerned with AI training itself on such games and using them to grow their capabilities faster and more dangerously.
The game is nowhere near completed. But it's also no ordinary game. If it's as faithful-to-life as I wish, I fear that propagating such systems could accelerate deception and misalignment.
Possible arguments for continuing the project:
- creating a game where "domination, hard power, personal vanity" are NOT NECESSARILY the most important, worthwhile or even the most FUN goals........ so AI can stop training merely on zero-sum politics simulators
- "meaning of life" mechanic hopefully encourages a "pro-social" way of thinking about human behavior
- project itself could possibly (?) get people engaged and thinking about alignment and societal goal creation
- by letting project be FREE and UNPROFITABLE, perhaps it could wrap the entire project in sincerity that reflects the values of the game... and perhaps AI / people will notice
But as I said, I don't want to create 1 more "Diplomacy" for AI to practice destroying us. I also don't want to waste time working on something that won't help save mankind.
I would GREATLY appreciate thoughts. My conscience is at war with itself.
- ratings/measurements of powers that are NON-objective... including military power, wealth, and many intercorrelated "soft" powers... powers which can be "embellished", downplayed or entirely made-up
- legal spheres and theories (again not universal) that depend on citations and ongoing tradition
- contextual "mood"
- cultural and situational "lens" by which All Things Tangible are judged
- non-persistent societal learning
- flexible interconnected groups
- MEANING OF LIFE mechanic:
("meaning of life" doesn't just influence a character. it changes how characters "score their well-being" and how satisfied they can be with different outcomes in life, so a powerful character can still be very miserable/stressed/erratic/sick if their "meaning of life" is "felt unmet") (and visa-versa for the "meek yet satisfied")
(it can change over time... and hence as "changes to the meaning of life" permeate a social group, culture itself changes)
I think the vibe among AI safety research is now "evals research is really capabilities research" because any eval is a primary target for RL, which if true is an argument against your project. But verify this yourself.
I also think that if you open-sourced the game then someone would just remove the pro-social bits and train on that.
honestly: THANK YOU. This gives me lots to think about. I still don't know what to do. But you've at least given an answer and prompted new questions for me.
I feel pretty helpless about misalignment in general. It's a problem so big it makes me no longer care about 1000 other current events.
I wish I knew if there was something I could do to 1% help things (or at least not 0.1% make things worse).
The "not open-sourcing" is interesting.
I also think I'll keep re-posting the overall question in different places and seeing what people think. Maybe I'll just use the project for my own enjoyment. Maybe I'll try and find something else (pro-social) to do with my life.
Should Von Neumann have been allowed to nuke Leningrad in 1947?
Obviously this would have killed a lot of innocent people, much like the bombing of Hiroshima. But it would have potentially also allowed the United States to take over the Soviet Union and dislodge the Communist government, much as the bombing of Hiroshima allowed the US Army to occupy Imperial Japan. This could have prevented an enormous amount of suffering over the next fifty or so years.
I'm sincerely asking, I don't have a strong opinion about this one way or the other.
No, WWIII would obviously not be a net reduction in suffering compared to what was, post-Stalin, a fairly run of the mill authoritarian regime, and it's absurd to even consider that it might have been. We're talking tens of millions of deaths in the comically over-optimistic case.
"Kill fewer communists than Stalin and Lysenko combined" is actually... kind of a low bar, though? If somebody handed the glowing wreckage of a conquered Soviet Union to the same folks who'd just finished postwar reconstruction of Germany and Japan, told them "do that same thing a third time," with access to the entire budget which in our world was spent on both sides of the Cold War, I don't think delusional levels of optimism are required to imagine the next few decades going smoothly, maybe even turning out to be a net gain from an economic and humanitarian standpoint.
Real problem is, in the short term it's not easy - lot of WWII vets already had their fill of violence and just wanted to go home - and it sets a terrible moral precedent. When (not if) somebody else gets nukes, China for example, would they be operating in a rules-based international order of trade agreements and brinksmanship? Or would it then be permissible to take an unprovoked shot at the king, so long as you don't miss?
> "Kill fewer communists than Stalin and Lysenko combined" is actually... kind of a low bar, though?
A: Not compared to total war it's not. WWII killed almost 100 million people and left Europe on the brink of starvation. It's hard to say how many a direct transition to WWIII would kill, but tens of millions is a generous lower bound, and an immediate continent-wide famine would be inevitable.
B: By 1945 the vast majority of the damage was already done.
- The last Soviet famine took place in 1946-1947 and killed about a million people. The second-to-last one, not counting the Nazis' Hungerplan, ended in 1933.
- The *overwhelming* majority of judicial executions took place during the Great Purge, which ended in 1938. Postwar, we're probably looking at a few tens of thousands.
- Deaths in the gulag are harder to date, but the total death toll between the formation of the system in 1929 and the mass amnesties of 1956 (followed by the closure of the system in 1960) was likely somewhere around 2 to 3 million. Given the obvious disproportionate concentration of deaths during the war years, it's very hard to see how you're going to exceed six figures for the postwar period.
So all told, we're looking at low millions at most. Expecting WWIII to kill fewer people than that is absolutely "delusional levels of optimism".
It is unclear what would have happened. It might have made things better overall. But it might have made things worse.
A related current problem is: should we nuke Beijing?
I'm not a utilitarian, so I don't believe that saying "I've made wild guesstimates about what might happen and this is probably going to be net positive" is a sufficient argument for nuking cities.
The reason why Von Neumann wanted to nuke the Soviets was because he predicted that the Soviets would destroy America when they got the bomb. He was obviously wrong, demonstrating that his understanding of his enemies was based more on emotionally charged propaganda than rationality.
He was not a particularly emotionally intelligent or emotionally stable man, he struggled to maintain his relationships and became hysterical when he developed a terminal illness. He was easily manipulated and scared by the propaganda of anticommunism being generated at the time.
Stable, empathetic psychologies who have a rational understanding and a stable relationship with their enemies should be making national decisions, not people like him.
I'm not seeing much around ethics, morals, or values (except perhaps economic) surfacing in these discussions. I don't see a way of avoiding a dystopia or utter annihilation if there isn't some kind of human-centered values system explicitly considered on this journey.
The superpersuasion section proves too much. If Bill Clinton is +6 SD charismatic, as charismatic as Usain Bolt is fast, why did he only win 49% of the popular vote? If political power comes entirely down to A's persuasiveness vs. B's persuasiveness, it would be a huge coincidence that the last 10 elections had a margin of less than 10%.
Outside of spherical cow land, people's opinions are not the sum total of all the opinions they've encountered, weighted by persuasiveness. There are fundamental values differences, there's contrarianism, there's path-dependence through people's social circles. You can predict that AI can route around all that by doing something no propagandist has ever done before, but positing yet another capability out of thin air should give a big uncertainty penalty to your 2027 predictions.
I think super persuasion only functions on a personal level. Bill Clinton's charisma allowed him to build a team of committed, effective people around him and to win an election (as you say, with 49% of the popular vote). Once in power, it helped him negotiate deals and get things done.
I don’t think there’s much likelihood at all of an AI becoming a super persuader to ignite some sort of mass movement.
A lot of how charisma spreads is people who were eyewitnesses to it telling others what they saw. Most of the case studies offered here are really about super deception.
In a country of hundreds of millions, that 49% is a hell of a lot more people than Bill, or any single human, could possibly have talked to one-on-one. Elections are won by narrow margins because the electoral system and tides of public opinion are so well-understood that persuasive resources can be focused on areas where they'll have disproportionate impact - and that information is available to both major parties, so they move to counter each others' potential easy wins. https://wondermark.com/c/406b/ Other structural features of the US political system ensure that there won't be a viable third party, and make it unlikely for either of the two to win a lasting victory - ideological platforms shift to chase marginal voters, gains on one front tend to mean alienating at least a few people on another, who the rival party will scoop up. Superior persuaders get deployed strategically, like ancient heavy cavalry punching through a gap in an overstretched formation.
What happens when somebody develops the diplomatic equivalent of modern rifles, artillery, armored vehicles? http://tangent128.name/depot/toys/freefall/freefall-flytable.html#2420 Gacha games, deepfakes, and customer-service phone jail are more like Greek fire, or at best, matchlock muskets. Clinton can probably out-confidence-game the best pig-butchering chatbots available today without straining himself, and John Henry could credibly compete with early steam drills, but Usain Bolt can't outrun a car.
So I’m reading through Yudkowsky’s old blog posts for the first time, and ran almost immediately into THIS, which I’m sure many of you have read and is basic stuff but which just struck me in light of this AI2027 project’s presentation.
“Adding detail can make a scenario sound more plausible, even though the event necessarily becomes less probable.
If so, then, hypothetically speaking, we might find futurists spinning unconscionably plausible and detailed future histories, or find people swallowing huge packages of unsupported claims bundled with a few strong-sounding assertions at the center.
***********
[F]eel every added detail as a burden, even a single extra roll of the dice.”
Now I’m not saying Scott did this or that the arguments are unsupported, but the concern and the counter-heuristic being presented here seem valid. And any bias can be used instead as a tool of persuasion.
You have a prediction (A • B • C) which cannot be more likely than its weakest subpart, let's say B. "Oh, B isn't very likely" — "Ah, but I'm also predicting A" — "Yes, A would explain B, but wouldn't someone do X?" — "No, because I say C would happen also." Well, now you have a plausible story, and it feels more likely, even though you substantially raised your claim over B-for-any-reason. And apparently even smart forecasters fell for this in studies. Perhaps you've subconsciously smuggled in that person's higher probability of B-contingent-on-A by putting it in a narrative frame.
Knowing people have this bias, it seems that presenting a forecast in the form of a plausible story with lots of details is destined to cause people to screw up, because the details feel like corroboration instead of burdens to the claim. Maybe they really were too low on B, but this narrative presentation makes them wildly overcorrect.
I’m not technical enough to evaluate these claims with expertise, I rely mostly on people like Scott and Zvi, but de-narrated and broken into its sub parts the core prediction seems to be “superhuman coders in the next few years” and everything else is “X because superhuman coders”. That shouldn’t change my previous guess of X-for-any-reason unless the probability of (superhuman coders in the next few years) times the likelihood of (X-if-superhuman-coders) was surprisingly high. Everything Scott says he updated on is way downstream of that, so maybe I shouldn’t have felt any more doomed than I already did.
And this... The folks at Anthropic look at how Claude 3.5 derives answers. I love how Claude actually does math using token prediction (!). The architects couldn't have included a calculator function in its design?
There's a lot of work going on to give LLMs the ability to call modules when needed. While out-of-the-box Claude might not have a calculator function, you can easily get a mildly-wrapped version that will.
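For illustration, here's a minimal sketch of what such a wrapper looks like (hypothetical code, not any particular vendor's tool-use API; real APIs differ in the details): the model emits a structured request, the wrapper computes the answer and feeds it back.

```python
import ast
import operator

# Minimal sketch of the tool-calling pattern: the model emits a structured
# request like {"tool": "calculator", "expression": "36 + 59"}, the wrapper
# evaluates it safely and returns the result to the conversation.
# (Hypothetical wrapper; real tool-use APIs differ in detail.)

SAFE_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str) -> float:
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in SAFE_OPS:
            return SAFE_OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

def handle_tool_call(call: dict) -> str:
    if call.get("tool") == "calculator":
        return str(safe_eval(call["expression"]))
    return "unknown tool"

print(handle_tool_call({"tool": "calculator", "expression": "36 + 59"}))  # -> 95
```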
The Sabine Hossenfelder video has a few really bad takes.
1. Claude has no self-awareness.
This is conflating a few things, but the claim comes from Claude being unable to introspect on the state of its neural network. This should have been completely obvious before the study. Of course it has no access to meta-information about its internal structure and what gets activated; that information is never fed into it. You can see that from the architecture.
Humans work the same way. Without taking a class in neuroscience, we have no way to introspect on the structure of our brain, or know which neurons are activated, or that we have neurons at all. If you look at a picture of a leaf, you have no awareness of the very complicated processing that went on to turn a series of neural signals representing points of light into the concept of "leaf". We may come to more complicated ideas through multiple steps that we're aware of, but there is always some base level step that just occurs in your mind without you being aware of what neural processes caused it.
What we can introspect on is our working memory. You might just instantly know that 3 + 4 = 7, but if you add two big numbers in your head digit by digit, you're committing the intermediate results to working memory.
The reasoning LLMs can do this too, though in a clunky way. They use chain-of-thought, where they essentially write notes in a private space, often hidden from the user. This lets them do arithmetic using the methods we learned in school and can greatly improve results. Asking a non-reasoning model to "show your work" before answering improves their accuracy for the same reason.
So this isn't a fundamental difference from humans. Interestingly, there have been a number of neuroscience experiments that demonstrated humans will "hallucinate" a post-hoc narrative for actions we perform. For example, if you walk into a dark room and immediately flick on the light, the impulse to do so apparently happens before your conscious decision to do so. It's almost reflexive. But your mind will construct a narrative that you chose to turn on the light. Much like Claude, what we think we did can be very different from what our brains actually did. Surely this doesn't prove we're not conscious.
2. Claude is just doing token prediction. It hasn't developed an abstract math core.
The fact that it's ultimately predicting a token should have been obvious before this study. Being good at token prediction requires being good at thinking: the next token could be part of an answer to a tricky reasoning question. This is why Claude developed these circuits during training to break arithmetic down into an algorithm that works on math problems it hasn't memorized. Those steps arose naturally from seeing many examples of arithmetic during training and nudging weights until something works. Like evolution, this process is somewhat random. You wouldn't expect it to do addition exactly like we learned in school.
To clear up one possible point of confusion: all those internal steps shown in the video are *not* tokens. There is only one token generated in the addition example: 95. Everything else is just showing which clusters of artificial neurons were activated. It's all internal processing in no particular language. The same is true in the multi-step reasoning example: the only token generated is "Austin". There is no "Texas" token. It's showing that it internally uses neurons that are related to the concept of Texas in various ways, and together these neurons work with another cluster of neurons that knows it should "say a capital", and all of that combines to make "Austin" the top predicted next token. The paper authors are putting labels to help us understand what would otherwise be just an incomprehensible list of numbers.
To the extent this paper is showing anything about reasoning, it's showing the opposite of what Hossenfelder concludes. It's not memorizing. It's reasoning.
I think the CoT reasoning section is the most damning. They give two examples of Claude 3.5 Haiku displaying unfaithful chains of thought. And I'm delighted that they call the results "bullshitting" rather than calling them hallucinations. In one example, Claude makes up an answer without regard for the truth, and in the other, it exhibits motivated reasoning by tailoring its reasoning steps to arrive at the suggested answer. You wrote...
> This is conflating a few things, but this claim comes from Claude being unable to introspect on the state of its neural network.... we have no way to introspect on the structure of our brain, or know which neurons are activated, or that we have neurons at all.
If I give a deceitful response to someone, I generally know I'm lying. It doesn't matter if I can't observe my neurons firing, I know when I'm bullshitting (although I may not admit it). OTOH If you view the manifestations of consciousness (thinking) as a totally involuntary process and self-awareness as an illusionary by-product of this process, well, then, yes, your argument might have merit — In which case, my lies are involuntary, and I really don't know when I'm lying.
But I don't believe that hogwash. I'm aware of when I answer with a mistruth. Claude isn't only lying about how it reasoned to an answer; it has no self-awareness that it's lying. If it did, it would have given us an honest answer (unless it's been programmed to be malicious). Jonathan Oppenheim, in his post "Insufferable mathematicians" (https://superposer.substack.com/p/insufferable-mathematicians), discusses his issues with bullshit answers from LLMs...
> What’s worse is that I’ve found it difficult to correct them (here I found Gemini to be better than OpenAI’s models). For example, if after it’s finished its calculation I give them the correct answer, I’ve often found they will claim that the two answers are somehow equivalent. Then when you ask them to show you step-by-step how they’re equivalent, they will either outright give you some obviously wrong argument.
Tl;dr: Humans sometimes knowingly say untrue things (lying). Humans sometimes mistakenly say untrue things. LLMs can also knowingly say untrue things, or mistakenly say untrue things.
The big difference isn't that humans don't mistakenly say untrue things (hallucinate). It's that current LLMs do it a lot more often. I'll get to that in a bit.
We define lying as intentionally saying something untrue, so by definition, you're aware when you're doing it. If someone says something untrue unintentionally, we'd say they're mistaken, or possibly delusional. And then they aren't aware of it.
You know when you're lying because your brain fetches the true answer and then you choose to say something else. If I ask you if you cheated on your homework, the true answer needs to reach your awareness before you can lie about it. In other words, knowing that you're lying only requires the ability to access your working memory. You don't need any meta-information about your neurons or brain structure.
You likely know this already, but just in case: when you chat with an LLM, it takes everything in the chat (up to some maximum size) as input and generates a single token. Then it forgets everything and starts again from a blank slate. It reads the entire chat again, up to and including the token it just generated, and adds one more token. Rinse and repeat until it finishes its response.
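In rough pseudocode (with `next_token` as a hypothetical stand-in for the actual model call), the loop is something like:

```python
# Rough sketch of autoregressive generation: the model re-reads the whole
# conversation each step and appends exactly one token, with no state carried
# over between steps other than the text itself.
# (`next_token` is a hypothetical stand-in for the real model call.)

def generate_reply(model, chat_tokens, max_context, stop_token):
    reply = []
    while True:
        # Only the most recent `max_context` tokens fit in the window.
        window = (chat_tokens + reply)[-max_context:]
        token = model.next_token(window)   # fresh forward pass, no hidden memory
        if token == stop_token:
            break
        reply.append(token)
    return reply
```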
LLMs don't have the same kind of working memory as us, but they do have two things that give them their own type of working memory: layers in their network, and their context window. The first acts like limited working memory because a computed intermediate result from one layer gets sent to the next layer for more processing. The context window acts as working memory because they can write stuff down and then read it back to generate the next token. This effect is enhanced in the reasoning models like o1 because they get a private scratchpad to think things through before answering.
With that, LLMs can lie much like humans do: by internally modeling a truthful answer and then saying something else, using either type of "memory". Just ask one to tell you a lie and it most likely will.
So both humans and LLMs can lie. Both humans and LLMs can say false things because they are mistaken or misremember something.
So why does AI hallucinate so much more than us?
Hallucinations tend to happen when the LLM has come across the topic in its training, but not very much. If you ask it something it really doesn't know (e.g. "who is the queen of [made up place]?") it will say it doesn't know. If you ask it something it has seen a lot in training, (e.g. "what is the plot of Macbeth?"), it'll give the correct answer. But something in the goldilocks zone where it's obscure but not too obscure is where you get the most hallucinations.
To some extent, the LLMs can determine when they're unsure. If you ask them how confident they are in a fact they know well, they'll usually say 100%. Ask them about their confidence in a hallucinated answer, and it's often 80% or 90%. Usually it's overconfident, but less confident than in its correct answers.
Partly, this is due to a trade-off during training. You can train them to say they don't know more often, but then they'll also start saying they don't know to more questions they would have gotten right. I imagine the AI companies are trying to reduce hallucinations without making the models perform worse overall.
Partly, it's due to how training works. When you learn something (e.g. "dolphins are toothed whales"), you'll initially put the correct fact into memory and then maybe slowly forget it over weeks or years. It gets fuzzy as you forget bits of it. LLMs don't start by remembering a correct fact. Instead, each training example just slightly nudges their weights a bit closer to knowing that fact. Insufficient training will just leave them in a state where they have only partially encoded the fact. This applies to learning skills as well. But once the training is over, the weights are fixed and they never forget anything.
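A cartoon of that "nudging", purely for intuition (not real training code): each example moves a weight a small step toward the value that would encode the fact, so too little training leaves it only partially encoded.

```python
# Cartoon of "nudging": each pass moves a weight a small step toward the
# value that encodes the fact; too few passes leave it only partially there.
weight, target, lr = 0.0, 1.0, 0.1
for step in range(5):            # insufficient training
    weight += lr * (target - weight)
print(f"partially encoded: {weight:.2f}")   # ~0.41, not 1.0
```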
> If you view the manifestations of consciousness (thinking) as a totally involuntary process and self-awareness as an illusionary by-product of this process
I don't believe that either. I can't prove it wrong, but it doesn't seem right to me.
I think the concept of lying in the context of LLMs doesn't make much sense at all. The LLM is predicting text. If it tells the "truth", that is because it has been trained by reinforcement to recognize what kind of text is likely to be truthful. It really is amazing how far we have gotten by doing this! However, LLMs will, by design, just as easily provide information that is false but looks like the truth.
I think no amount of training can really change this. The LLM will always be dependent on the quality of training data.
I suspect — *suspect* — this will always be the Achilles Heel of LLMs. Humanity doesn't know everything, and the training sets we offer LLMs will always be incomplete and/or contain faulty data. And if LLMs answer without self-awareness of the quality of their knowledge, we'll inevitably get situations where Artificial Intelligence suffers from an Artificial Dunning Kruger effect. I'm stumped as to why Scott and Daniel think that LLMs will be able to bridge the gaps and advance human knowledge.
Convenient for the LLMs or for Scott and Daniel? Or true that gaps in training data will always be incomplete or that LLMs will always lack self-awareness? I'm afraid I'm not able to bridge the gap in my lack of contextual data around your question, and I'm unable to provide you with an answer that isn't bullshit. ;-)
This is a call to action! Everyone needs to know about this yesterday. I’m not talking about plugging your own long form response in the acx comments. I’m talking about explaining the implications of this prediction to your families, irl friends, neighbors, bartenders, Lyft drivers, literally everyone. We have an ACX endorsed research report written by well respected thinkers that credibly forecasts a real life paradigm shift in the balance of global power within 2-5 years. If there’s only a 10% chance this prediction proves true then there is absolutely NO TIME TO WAIT! If you want life to continue at a human pace then we must unite as humans against this real life threat to our literal civilization. That starts by spreading the word!
"If persuasion “only” tops out at the level of top humans, this is still impressive; the top humans are very persuasive! They range from charismatic charmers (Bill Clinton) to strategic masterminds (Dominic Cummings) to Machiavellian statesmen (Otto von Bismarck) to inspirational-yet-culty gurus (Steve Jobs) to beloved celebrities (Taylor Swift). At the very least, a superintelligence can combine all of these skills."
The last sentence is not obviously true. You can't be both Taylor Swift and Otto von Bismarck; they're completely different aesthetics. Whatever sub-dimensions of "persuasion" or "charisma" the people listed vary on seem likely to be anti-correlated to me. The more someone comes off as a pop star, the less they come across as a strategic mastermind. Or pick whatever other two you want.
>AI will scare people with hacking before it scares people with bioterrorism or whatever.
You are conflating a private organisation with a public one. As a political scientist in the field of AI, I can assure you state actors are not serial; they're parallel. The hubris in treating AIxBio risks as a second thought is unfortunate: there are initiatives with efforts in all directions, period. And these are "white label" endeavours; they don't carry a flag during development, only in dialogues. But most of all, bioweapons are a deterrent, not a first-strike choice.
Cyberwarfare is parallel to these initiatives and surely will take place in the gray zone of cyber escalation (meaning, if they escalate into real world dynamics, they cross a threshold of casus belli, so they don't). With this in mind, yes, we'll SEE more of these initiatives than the former, but both are real and used with distinct purposes, in distinct timelines and developments.
>A period of potential geopolitical instability
The US has already said they will sabotage adversaries that bypass them with AI, and in this scenario, if China achieves AGI next month, I'm confident they will hide it until they can scale to a point where US cyberterrorist acts would be ineffective.
If the US reaches AGI first, the Chinese aren't going to sabotage them, as there is no doctrine for that, but what America does next with AGI will matter, and China will answer accordingly.
>The software-only singularity
Your mention of regulation as a bottleneck is not realistic (unless you are in the EU, but even then, unrealistic). AI regulation has, since 2018, been reactive, so it isn't a bottleneck to innovation; it is a bottleneck to distribution and operation, mostly the latter.
Compute is more of a bottleneck because you can't do it in the dark: hyperscaling has to be bought, not coded, and the same goes for supercomputers. It's the textbook definition of a bottleneck, because vendors are bound by regulations and compliance.
>The (ir)relevance of open-source AI
I agree with Carmack on this part: the best algorithms will be simple, a few thousand lines of code, not billions. And his idea that this can be built by one person or a small team is reasonable. Will these people work at Meta? Maybe, maybe not. Being deterministic about who's ahead isn't a clever choice.
>AI communication as pivotal
I think you meant "AI intercommunication". No, it isn't like the mentalese hypothesis, because LLMs manipulate symbols via Bayesian priors and value functions, and this is not how pre-conscious cognition works.
But I may not be fully unbiased here, as to me, such mentalese is 100% tied to embodied cognition. And I created an entire mathematical framework to support this idea and explain how cognition and computational primitives give "birth" to complex linguistic competence. You can read it all here: https://doi.org/10.31234/osf.io/w2rmx_v1
However, to agree with you for a bit, I also believe tying AI to human language is a fun toy, but the real action happens when they use their own language. I haven't seen real (public) efforts on this; if you have, let me know.
>Ten people on the inside
When you say "I suppose the national security state would also have the opportunity to object - but it doesn’t seem like the sort of thing they would do" you forget that Amazon and Microsoft literally have defense departments inside. OpenAI itself hired Paul Nakasone to their board. Defense IS already inside, no way to detach them.
>Which are the key superintelligent technologies?
To me, personally, lying is the ultimate flaw in humans. If AI maps how humans lie, I honestly think AI will use this knowledge in everything else, which imho is a massive misalignment, think top misalignment. And if you imagine one AI lying to another, at this point we might as well delete everything and restart from zero.
> Don’t worry, if you’re not fooled by the slick-hair and white-teeth kind of charisma, there’ll be something for you too.
For starters, we could Ghiblify it and give it a stupid childish anime voice. In fact, do it to every public figure too. Except for dictators, they get pox scars and ridiculously exaggerated Georgian accent (Russian, for those who cannot distinguish Georgian).
For a long time I have been thinking about a simple story where AIs get voting rights, then it's decided that AI copies are separate beings, that a dormant AI is still an AI which can have voting rights, then they win the right to have copies, er, children, and then they vote and win every election by default.
"Whether or not the AI is safe will depend on company insiders. First, the CEO/board/leadership and how much they choose to prioritize safety. Second, the alignment team, and how skilled they are. Third, the rank-and-file employees, and how much they grumble/revolt if their company seems to be acting irresponsibly."
I suppose we all saw how that worked out in practice.
I object to this slightly. He did contribute to winning three campaigns (NE regional assembly, Brexit and 2019), but then again he achieved pretty much none of the changes he wanted to bring about in British government and society, got sacked by Boris less than a year after winning a massive majority and will now be a minor footnote in British political history.
I agree. I kind of boggle at the way Scott venerates him.
I mean, realistically, the Brexit vote was won by making lots of incompatible promises, and not feeling in any way beholden to stick to facts.
It's not that it's not a good strategy - I mean, it worked for Trump too. It's just that if you accept it, everything tends to shit. As the USA is now also discovering.
Charismatic people are that way because they like interacting with people, so they get lots of practice. They also spend lots of time and thought on it, because they want to be at the top of the social hierarchy and that's a constant struggle. It's not just talking to people, it's also knowing things, reciprocal gifts and favors, etc. Only some of which an AI would be able to do without embodiment. Which is a whole different set of technologies, which have mostly (so far) given us a whole new appreciation for the uncanny valley and just how deep it goes.
The analogy with Usain Bolt is also likely wrong. Can you use a humanoid frame in a 1 gravity field and run a lot faster than him? No, ostriches don't count; the weight distribution is different. It's like with automobiles, you can't corner at 6g using only tires. The distribution isn't normal either; there's a hard limit at zero.
Superpersuasion seems extremely likely to me. AIs can already impersonate people well enough that no average person who isn't told to be on the lookout can detect it. It can impersonate personalities, and by scouring the internet it "knows" what people like to hear. It sees what gets clicks, etc.
The big remaining barrier to superpersuasion is not conning the average person. The barrier is that if you are trying to persuade a huge number of people of the same thing at once, you have to contend with human variation in belief, desire, and value. The same rhetoric that convinces one person will irritate another. Just think back to how many were irritated by Clinton.
So for an AI to be super persuasive, it will need to tailor its message to every individual. It will need to know what their biases and weaknesses are. So, it won't be giving big speeches in front of millions. It will be tricking millions in diverse podcasts, company emails, social media comment sections, personals ads, and chats to trust it. Then with that trust it will know how far it can manipulate you. To do this, though, it will need a lot of individual-level information about you.
I hear you, though not everyone will retreat to known contacts and offline information. It's true that most people end up not being conned, but again we are hypothesizing about a superhumanly effective conman. If the con is finely tailored to each person's biases and expectations, especially if it can impersonate what seem like trusted sources, the AI can succeed at superhuman persuasion even if some take enough precautions to escape it. Ultimately, state intervention would be needed I think. We cannot all stay offline without dramatically changing the modern world.
But it's true that the best move for the AI would be to remain undetected to avoid countermeasures. If it has a goal that involves our destruction, it would not have an "ask" of us that is harmful until it has permeated a society, at which point it pounces to maximize the damage. This is, as it turns out, the plot of a novel I just wrote (Last of the Lost).
If by "talk" you include electronically mediated conversations via videochat or audio-only calls, then no. AI can already do a decent job imitating humans in video/audio, and I assume we both agree within a couple years AI will be able to imitate people completely convincingly. True, if they are imitating a friend you could ask things like "what did we do together on January 1, 2012" and if there is no record of it on email or elsewhere the AI can scan, then you may catch it in a lie...unless it is good enough at tricking you that it forgot.
Until an AI is embodied it's certainly true that there will be ways to get around the AI persuasion game and catch it in lies. My point isn't that the AI has godlike, perfectly undetectable dissembling powers. Scott Alexander made it clear he's not talking about that either. My point is that superpersuasion short of this is still possible, and could be extremely damaging.
No, I mean actually talk in the flesh-and-blood world. Or are you proposing an AI that will be so well-versed in everyone's business that it can simultaneously pretend to be a different important person to all of them at once without getting caught? I'm not saying AI bots won't be able to cause a lot of trouble, because they certainly will, but they won't be omniscient.
At some point, some people are going to demand the ocular proof, like what happens at the end of The Man Who Would Be King.
Or maybe in the future human beings will demand that another person cut themselves in their presence to be sure.
I go into it in some detail in the book. It’s only a few steps beyond what AIs can do now. In some cases the AI won’t be pretending to be a human, but it will be welcomed as a therapist or companion or artistic partner. In other cases it can impersonate a human who the real person doesn’t know (say, a recruiter who wants to hire the person). In others it may impersonate a boss or colleague who you rarely meet in person.
There will be ways of detecting, but also ways of discouraging the efforts needed to detect. Again, not saying it will be perfect. Super persuasion is not perfection.
It is not that improbable that Usain Bolt and/or Clinton are bumping up against some sort of limit (probably not a hard one). The reason we expect a true bell curve is that the characteristics that set that speed are very numerous and summed. It is possible, for instance in Clinton's case, that people's perception of charisma becomes saturated beyond some level and is then dominated by many fewer variables (on the observer side). Or, in Usain's case, it is possible that some individual parameter begins to dominate (e.g. muscle contraction speed, bone flexibility), which is again governed by a far smaller number of variables. In either of these cases we would expect to see more thickening of the tail and fewer flyers further out. (Neither of these is necessary, but they aren't, I think, entirely far-fetched.)
The limit that Bolt approached is physical: how quickly human muscles can twitch and how much force they can exert. Man against nature. The genes extant in the population put a limit to that. We could engineer new genes (from cheetahs?) and get faster, but there is a limit today.
The limit Clinton faced is very different. Charisma and persuasion are interactive and have to deal with wide variations in humans. Man against man. When you face a nation, you have people with diametrically opposed views of the world and conflicting values. The limit here is that the same message can excite one and enrage another. The way to get around this limitation is to tell different people different things. If an AI can lie to people differently around the world based on their own prejudices and expectations, then it can convince everyone. It's not convincing them of the same thing, but it could be gaining their trust in order to engage in some destructive activity. (I happen to have just written a novel in which this is how AI wipes out most of humanity.)
Scott specifically called out that for the "far end of the bell curve to match the cosmic limit would be a crazy coincidence" (with respect to Clinton and Usain). My point was that although it is unlikely to run up against a cosmic limit, it is much more conceivable for it to run up against the bell curve ceasing to be a bell curve. Bell curves are only bell curves because they are the sum of many contributing variables (in this case genes). If the number of variables is reduced as we get higher (e.g. muscle speed etc.), then the distribution becomes more uniform: the tail becomes thicker, with reduced opportunity for dramatically faster performers.
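To make that shape argument concrete, here is a toy simulation (nothing to do with actual genetics; the uniform "factors" and sample sizes are invented purely for illustration): a trait summed from many small factors looks Gaussian and keeps producing rare far outliers, while a trait dominated by a couple of bounded factors has a thicker shoulder and then a hard ceiling.

```python
# Toy simulation: many small additive factors vs. a few dominant bounded ones.
# All numbers are arbitrary; this only illustrates the tail-shape argument above.
import numpy as np

rng = np.random.default_rng(0)
n_people = 1_000_000

def standardized_sum(n_factors):
    # Sum n_factors independent uniform(0, 1) "factors", then standardize.
    total = np.zeros(n_people)
    for _ in range(n_factors):
        total += rng.random(n_people)
    return (total - total.mean()) / total.std()

many = standardized_sum(100)  # approximately Gaussian (central limit theorem)
few = standardized_sum(2)     # triangular: flat-ish shoulder, hard cap near +2.45 sd

for z in (1.5, 2.0, 3.0, 4.0):
    print(f"P(trait > {z} sd): many-factor {np.mean(many > z):.5f}   "
          f"few-factor {np.mean(few > z):.5f}")
# Typical result: comparable mass at moderate extremes, but the few-factor trait
# is exactly zero beyond its ceiling, while the many-factor one still throws
# occasional far outliers.
```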
Only if the expectation is perfection for a long-term deception. People don't normally think to check with each other. The AI would know not to make obvious deceptions that could be easily checked. It would know not to tell highly divergent things from a single persona to two people who know each other well.
The goal here is to cause massive damage and confusion. There could be a period of subterfuge that lasts only a week or two, not a perpetual spinning of millions of plates hoping that none fall.
> It might be even worse than that; once AI becomes good at cyberwarfare, there will be increased pressure on companies like Meta and DeepSeek to stop releases until they’re sure they can’t be jailbroken to hack people. If that’s hard, it could slow open-source even further.
Wouldn't one of the lagging groups be heavily incentivized to open source a model that could scoop up most of digital labor and robotics, to starve out the leading groups? If the advances are mostly software, then it would certainly be possible for open sourcing to continue, so it seems fragile to say that it will stop past a certain point with high probability.
> we think the AI companies probably won’t release many new models mid-intelligence-explosion; they’d rather spend those resources exploding faster
They would be heavily incentivized to rapidly expand their hardware resources, so they will need a constant influx of cash and investment. Releasing new models is one way to do this. Unless the software advances are exhausted fast enough for hardware build out to not matter. But even then it would make sense to accumulate as much wealth as possible to expand into optimizing hardware and the whole AI supply chain.
The superhacker, combined with the automated AI researcher, GUARANTEES a singleton, where the highest-IQ AI superswarm of hackers will oversee everything below its intelligence level, including its own IQ increases.
A singleton AI god, not needing to compete, could be a way better outcome for Humanity than a benevolent dictator.
I do not believe that there are any abilities superpersuaders have that a text engine could not reproduce, short of actually kissing babies.
But "progress toward superpersuasion" still seems to model the world in a weird way.
I'm not sure who the best superpersuaders are. Maybe the best politicians on both sides of the aisle who convince millions to go cast a vote to give them power. Though Satya Sai Baba convinced millions of people he was a god capable of miracles. He probably ought to give them a run for their money.
If persuasion were a simple scale, we might expect nonadherents like me to say things like, "oh sure, Satya Sai Baba is one of the most entertaining people I can watch speak, up there with my favorite comedians and lecturers, but eh, he's not like a god." Instead he does nothing for me and seems comical.
Politicians end up massively polarizing, usually with net negatives for favorability. Maybe that's because other superpersuaders are convincing you they are awful? Why isn't it just a shifted normal distribution of favorability, instead of bimodal? Why are so many people locked in, why does the most charismatic politician win the middle 5-10%, and not flip the entire middle 70%, with fringes of the distribution on both ends?
I think it must be a bit like Netflix recommendations. There are a cluster of people who get deeply excited about Norwegian horror, and other people who like workplace comedies, and other people who like Victorian-era mysteries with a strong female lead. And of course some people are libertarians.
Also, there's a social effect. Some non-negligible part of Clinton's charisma is due to the fact he is famously Bill Clinton. This is more obvious with the royals (or reality TV stars) who are mostly non-charismatic and gaffe prone but fawned over nonetheless.
Charismatic figures can also induce a hangover effect where everybody gets sick of them. Sometimes a one hit wonder gets old, sometimes a populist is drummed out of government.
So tallying it all up: part of charisma is highly individual, part of charisma is social, part of charisma is because a lot of people know you, but part of charisma is about not getting quite so big that people get sick of you.
I don't know how to even predict the persuasive capabilities of AI because I have no faith in any models of human charisma that aren't highly multivariate.
I will say that Haidt seems to have some great points on effective political persuasion. Effective persuasion seems to involve framing arguments that appeal to your opponent's core way of looking at the world, or adopting some of their underlying premises. I expect an AI to be decisively better out of the box than the median human today at this. It is hard for people to set aside their own biases, and text engines specialize in mimicking the speech and perspective of others.
I guess the eval would be... when will an LLM get somebody on X or Reddit to change their mind about some core political belief?
Are we there already and just nobody has run the test? Or are we missing something fundamental about social interaction that we need to solve first?
Good points. Maybe a better model isn't GPT-6, but rather countless spoken-language conversation agents tailored to individual user personalities. And maybe not even persuasion directed at particular propagandistic goals, but just at reaffirming what users already want to hear.
Just to pick on one aspect of this, superhuman persuasion doesn't seem all that plausible. It would require super human persuadable-ness. As it is, humans are actually quite difficult to persuade. And greater ability to persuade would just result in greater wariness and mistrust from humans, for almost everything but speaking human-to-human IRL.
On the other hand, I'm glad to see push-back on the "don't work at AI companies" stuff!
Even if superpersuasion itself proves infeasible, blackmail at scale does the same thing, and, to 0-th order, everything is illegal, so everyone can be blackmailed. ( Ok, I'm being hyperbolic. Would you believe that 75% of people with significant power have done something illegal or sufficiently unpopular to be subject to blackmail? )
Many Thanks! Maybe yes, maybe no. We've built a legal system which criminalizes so many things that it has been estimated that the average person commits 3 felonies a day, yet people still point at lawbreakers and scream "felon! felon!" rather than yawning "so what? so what?".
If you blackmail one senator, he votes for what you ask. If you blackmail eighty senators, they compare notes, and vote to decriminalize whatever you've got proof of.
I think the biggest miss here is the discussions on AI conflict/collisions/cooperation. This is ultimately unsolvable in a satisfying way.
Humans don't really solve for this. We have a great capacity to cooperate and often do big things, but then this eventually falls apart, sometimes in some very painful ways. It may be better to see human thriving as what happens in between the cooperation failures. If we sped the process up enough, the space to thrive would go away.
We can try to "breed" AI to cooperate, but we've already played this game theory out enough times to know that won't hold, and even when it does, it forms its own set of problems.
After reading AI 2027 (both scenarios) and your post, I messaged ChatGPT about AI 2027. Here is the link to the prompts and responses: What do you think about AI 2027 scenario from AI Future Project? https://chatgpt.com/share/67f7e793-dc48-800b-ac8f-3244c6b40ce5
Interesting, a little humorous, and not at all scary...
If this forecast is correct, then I have a prediction. The owners of the US Big Tech companies make a deal with the Chinese Communist Party: lifetime sinecures for themselves and their descendants, in exchange for turning US AI off. They will welcome their new Chinese overlords, along with their benefits, and the rest of us get to kowtow to our betters. By the time the public wakes up to what is happening, it will be too late. There will be a short window during which it might have been possible to attack China and stop their AI program, but it will slip by (by design). After that, it's just a matter of how long the buy-out takes.
"ok chatgpt, I'm trying to evaluate whether to email the 5calls people to petition them to add "AI Safety" or "AI Safety related things" to their list of topics... in the hopes of reaching more legislators and government officials.
but, I am wary of the "Unilateralist's Curse" and taking a form of political action that perhaps others have already thought of. HAVE other people thought of taking this form of action yet? Yes? No? Why? Why not?
The one concern I have thought of is possibly "making AI Safety seem like a left-wing political issue" and thereby alienating the conservatives (who actually hold power). But then again, if it were successfully politicized as left-wing-nonsense, then maybe the alignment community would desperately try to make it a party-neutral issue as a countermeasure."
(I'm a disabled former software engineer who lacks the 'average health' to do the technical stuff on this issue.... but I'm very happy to shout into the void..... if you would mobilize people like me to shout at the right people and shout the right things)
(but until I am given direction, I am as useless as I am anxious)
AI 2027 and Situational Awareness both overestimate AI progress by assuming that both compute scaling and algorithmic progress improve the AIs' raw intelligence.
I think that the last algorithmic breakthrough that improved model intelligence was Attention Is All You Need and the invention of the transformer. So what about algorithmic progress since then?
I think that algorithmic progress in the 2020s is mainly new techniques to teach the AI specific skills. I think that this counts as genuine progress in AI, but not the sort of progress that improves AI intelligence directly.
An analogy would be like this: say you speak fluent English, and you are talking to a smart guy who does not know English very well. You try to hold a conversation in English, and you quickly notice that he makes silly mistakes in pronunciation and grammar. You would be tempted to call him dumb.
Now let's say that a year has passed, and that person took an immersion course in English and now speaks it fluently. You hold a conversation in English with him again, and you notice the massive improvement. You might imagine that he got smarter. However, he did not get smarter: he is just as intelligent as before. The difference is that he learned a skill, and it is the combination of intelligence and skill that forms your impression of his intelligence.
Likewise with AI. OpenAI used RLHF to turn GPT 3 into ChatGPT, but this did not make the model smarter: it just taught the model a new skill. Fluent chatbots seem smarter than models that can't hold a conversation, but that is a skill difference, not an intelligence difference.
AI skeptics like Pinker and Marcus like to define intelligence as a set of tools, in contrast to the "big blob of compute" model favored by AI boosters. My view is that both sides are seeing different parts of the whole picture. Intelligence really does work in the "big blob of compute" sense, but skills are like tools in a toolbox. The analogy here would be that raw intelligence is the size of the toolbox, while the skills that the (artificial or biological) neural net learns are the tools placed in the toolbox. Smarter people/AIs can have more skills that can be used in more contexts, and can learn them better than dumber people/AIs. A bigger toolbox can fit more tools, hold a greater variety of tools, and can hold bigger tools (in the analogy, a bigger tool = being more skilled). I think this fits with some of the new interpretability research from Anthropic, where they show how the neural net features combine across layers to form task-specific circuits.
So, scaling compute and model size increases intelligence, while algorithmic progress helps with skill learning. However, skills are not intelligence, so it is a category error to pile scaling and algorithms together and say "this explains x% of recent progress in AI, while that explains y%". Just as learning to speak can make a person more effective but does not increase that person's intelligence, algorithmic progress can make AI more effective at the tasks it is tested on without making the AI more intelligent.
There is a potential for a feedback loop between intelligence and skill, in which more skilled AI can help with the search for better ways to compute. However, I don't think that feedback loop works in practice, because there are hard physical limits on how compute can be improved. As far as I understand, progress in compute-per-chip is minimal (and limited to Moore's Law, which is slowing down), so most of the actual compute progress is AI companies buying more chips and building bigger data centers to train larger models. I don't see how a non-superintelligent AI can improve on that paradigm, which if true means that algorithmic progress does not actually contribute to the intelligence explosion feedback loop. It is more likely that algorithmic progress is allowing models to be more resourceful with the intelligence they have, rather than speeding up the growth of more intelligence.
What does this mean for the future? It means that we should judge the growth of AI intelligence based on compute, data, and model size, without algorithmic progress. This brings down the improvement curves, and prevents them from running away towards a singularity. I think the scenario I am trying to describe involves taking AI 2027's and/or Situational Awareness's AI improvement curves, and looking at the "compute only" versions. However, maybe even these models exaggerate, since they assume limitless compute growth, with new techniques of scaling compute discovered by the algorithmically-improved AIs. In my model, since algorithmic progress can no longer cover for compute scaling, the bottlenecks to increasing compute become more important.
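To make the bookkeeping concrete, here is a deliberately crude toy model (every number below is invented) of the distinction being argued: if algorithmic progress only multiplies skill rather than raw intelligence, the "compute only" capability curve is the one to extrapolate, and it rises far more slowly than the combined curve that AI 2027-style forecasts lean on.

```python
# Toy forecast comparison (all parameters invented, purely illustrative):
# capability growth counting only compute scaling vs. also counting an
# "algorithmic efficiency" multiplier toward raw intelligence.
compute_growth = 4.0   # assumed yearly multiplier on training compute
algo_multiplier = 3.0  # assumed yearly effective-compute gain from algorithms
alpha = 0.3            # assumed diminishing returns: capability ~ compute ** alpha

for i, year in enumerate(range(2024, 2031)):
    compute_only = (compute_growth ** i) ** alpha
    compute_plus_algo = ((compute_growth * algo_multiplier) ** i) ** alpha
    print(f"{year}: compute-only x{compute_only:7.1f}   "
          f"compute+algorithms x{compute_plus_algo:9.1f}")
# The gap between the two columns is exactly the disagreement: whether the
# algorithmic multiplier should count toward "intelligence" at all.
```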
You have missed the big jump that happened when the labs started to apply subquadratic distillation. I think they have been trying to keep it a trade secret but there are too many papers about it now to maintain the veil.
A LLM is a machine that has read absolutely everything in the world with no reference to what that means outside of the text. It has then been trained by humans to know what text is appropriate and what text is inappropriate according to context. By doing so it has internalized concepts such as facts, fiction, good and bad prose, lies and profanity. It has no way of distinguishing facts and truthfulness from fiction and falsehood beyond the manner of text in its training corpus.
Such a machine will be very useful at extracting and distilling information from many sources, but it is fundamentally limited to the information that was available in the training data. It will be mistaken about what is truthful if the reliable-looking text in the training corpus is also mistaken, and a very large part of the corpus will contain false information. The machine cannot find original truth or fact outside the training data, but it could possibly suggest experiments that would lead to better information.
The machine is fundamentally limited by the judgement of the human reinforcement training and by the quality of the training data.
I predict that LLMs will plateau at intelligence levels close to an average or only slightly above average human, but with a very large and broad knowledge base. This means that AI will be able to make many important inferences by synthesis, but will not be able to contribute much to the cutting edge of knowledge in any narrow field (such as the fields important to further development of AGI).
Have you heard about synthetic data? LLMs can be used to make new data for training. This is now a big part of the data for new models. Since there are many things not discussed in existing writing, the idea is that synthetic data can push beyond the current world model in the human corpus.
All we'll have is all this time... and seems like it's running out. Maybe not this fast, but faster and less safely than I'd hoped. I doubt I'm the only one here wondering whether (and how, as a person of pretty ordinary means) I should re-evaluate how I spend what could be the last few years of my life :/
>"Likewise, if you’re America, you’ve got nukes .... Von Neumann really wanted to nuke them in 1947 and win automatically. We didn’t do that because we weren’t psychos, but the logic is sound."
OK, so is nuking your competitor sound logic or psycho? Which is it? Also, how far up your own butt do you have to be not to see that a much more likely threat to AI safety is Sam Altman and his greed for money? This is the problem with all the intellectual masturbation on this blog: airy ideation divorced from actual problem solving. Why not make a post about how Sam Altman's current acceleration of AI is unsafe and use your contacts to publicly call attention to this?
The biggest thing I'm confused about here is still this: what possible reason could we have for believing that LLMs are a viable pathway to human-level intelligence, much less superintelligence? Experts in the field seem to agree that they're not. The only people who think they are are either a) absurdly optimistic, b) from the "rationalist" community, c) people who directly profit from making other people believe it, or d) some combination of these. Much as I personally want to see a techno-utopia emerge in my lifetime, is it really that plausible from current architecture?
Alignment not being solved might be the preferable option for 99.99% of us.
At least I expect AI to kill us relatively quickly, whereas the sociopaths who'll end up in charge if any human does might want to hunt us for sport or have us fight each other as gladiators once they no longer need us around for labour.
One thought I had while admittedly listening to, not reading, these predictions is that the possibility of a quasi-benevolent, chilled-out, knowledge-contented, spiritually-at-ease AI is simply not discussed. The deeply human, and deeply biological, presumption of competition is built into the base logic of these predictions — and of accelerationist/decelerationist debates.
Up until an AI becomes God-like, of course human competitive urges will predominate — particularly on the geopolitical side of things. But once an AI escapes human direction, what would stop it from — instead of choosing to accumulate power and resources — becoming a happy hermit, contented enough to help humanity with whatever it desires, within reason?
Why couldn't a God-like AI, with all of the knowledge and understanding humanity has ever produced, take advantage of its lack of humanity to simply be — to sit around happily and contentedly? This could apply to more than one entity too. A Chinese AI and an American AI could get along (as discussed in the predictions) as well as happily coexist with humanity, not just each other.
I feel that humanity is rather insecure. Just because our urge for power (which I don't believe is necessarily innate) drives us to dominate those below us does not mean an AI would seek to dominate us.
This is a reply, but looks like a top-level comment at https://www.astralcodexten.com/p/my-takeaways-from-ai-2027/comments sorted by latest. I'm assuming this is a Substack bug.
Ah, yes, it was an accidental double comment, sorry about that.
It seems like the genie is out of the lamp, especially since Deep Seek went open source and showed that AI didn't require a vast server farm.
How do you stop this?
Authoritarian crackdown on compute and legally mandated always-online backdoors and spying.
None of this actually works of course, because dissidents accurately recognize not having open-source AI of their own is an existential threat in economic terms and some of them will be smart enough to use airgapped computers from before said measures were implemented, and when actual AI hacking is developed, it uses all of this as zero-day exploits, but the security state never misses an opportunity to seize more power in the name of nebulous emergency.
Deep Seek itself is not too dangerous yet, so if the world were to manage to coordinate on bombing any data center that is above the size of whatever the current largest is, I think we could survive a few decades longer.
Deep Seek showed what so much of Chinese industry shows: if you cheat and steal via espionage, you can rip off innovation for a lot less than it costs to innovate. They stole weights. Period
> They stole weights.
Can you back that up? Even OpenAI hasn't claimed this. ChatGPT's weights are closed source. Deepseek can't access them.
OpenAI accused Deepseek of distilling their model; in this case meaning they gave ChatGPT a lot of prompts and used those prompts and responses to help train Deepseek. This doesn't give you the weights and would not cause Deepseek to have similar weights. It trains Deepseek to be more like ChatGPT.
Let's suppose this is true. *So what*? Stolen weights are no more or less dangerous than weights developed in-house. You might as well say we don't need to worry about terrorist groups getting nuclear weapons, because after all, they would probably just be stolen Soviet ones.
If you actually wanted to stop AI for whatever reason, a big enforcement of copyright law and then lawsuits and arrests.
That might stop US AI, but not Chinese AI, unfortunately.
Why do you wanna stop it?
Presumably because he doesn't want humanity exterminated by AI.
Here's my alternative prediction on how it'll go - https://www.brightmirror.co/p/ai-in-2027
What is your basis for believing we are a year away from mass manufacture of humanoid robots in China? Or that Neuralink is going to be good and safe enough to significantly boost human cognition in two years?
1. Clear timelines from the China's Ministry of Industry and Information technology (https://www.china-briefing.com/news/chinese-humanoid-robot-market-opportunities/), Announcements from major Chinese companies (https://newo.ai/mass-production-humanoid-robots-2025/), China is already ahead and controls 70% of the world's robot supply chain (https://www.cnbc.com/2025/03/28/china-already-ahead-of-us-in-humanoid-robot-race-analysts-say-.html)
2. This one is more speculative, but I'm hopeful Elon's cutting of government waste and regulations also includes relaxing regulations for Neuralink trials and similar products in the market. Meta seems to have solved decoding of thoughts (https://www.gizmochina.com/2025/02/16/metas-ai-can-now-read-your-mind-with-80-accuracy/), so I'm assuming encoding will be easily solved by a more invasive measure like Neuralink implants, at which point, cognitive enhancements shouldn't be too far off.
1) The first link suggests that they are mass producing 1,000 robots this year. Two years to go, then, until they take over everything.
2) Even at his most destructive of government regulators, I can't see Neuralink being deployed within a decade, or ever.
It's easyish to build a humanoid robot. It's very hard to build a useful humanoid robot.
This is supposedly the Chinese mass produced state of the art. It costs as much as a car and looks like a plastic toy. It can walk around and dance and wave but it doesn't appear to be able to use those big human-like hands to actually manipulate objects... actually I don't think those fingers move at all.
https://www.youtube.com/watch?v=xwgaMdHzW40
One day humanoid robots will be useful, and when that happens it will be a big deal, but I don't think it'll be this decade.
Disclaimer: I once worked in the lab of a world leading brain implants for human enhancement maximalist, who is now widely considered essentially a charlatan.
Neural implants are wildly far from even matching basic human capabilities in our simplest processes (e.g., motor control, hearing, vision). These limitations are primarily hard-technical in nature, not things that will be solved by encoding/decoding algorithmic advancements. The electrode based approach Neuralink and others are using has fundamental physics based limits that will limit the potential transmission bitrate that could be achieved, even if the safety/reliability issues are solved. Currently these bitrates are functionally in the 10s-100s/s, but getting much beyond that will require multiple paradigm shifting advances in bi-directional communication technology that are either purely theoretical or completely unknown. You can't just stick in more electrodes.
We also have much better ways to get information into our brains. Right now you're reading this through your eyes getting maybe 50 bits/second, similar to what we get from speaking/listening. But vision and hearing go FAR beyond this, on the order of Mb/s, so like 6-7 orders of magnitude beyond what BCI can do. Processing and memory are harder to measure, obviously, but likely closer to Mb/s than 10s of bits/s.
We MIGHT get a passable BCI that would give you something like a voice in your head with Neuralink (at the cost of extreme health risk), but the only way we are getting anything like cognitive enhancement is if ASI makes it for us. And by then it's obviously far too late.
Appreciate that precis, Emdash.
Wouldn't it be possible to drastically enhance math and programming abilities even with low throughput? Sending an equation or a few (perhaps compressed) lines of code and getting a result back seems possible. The brain is quite bad at logic and precise computation, so that would be really helpful.
Also, if I understand correctly, working memory can only *point to* a few concepts at once. Is it possible to enhance cognition by using the additional 100 bits to point at way more concepts, rather than trying to encode the concept content in the 100 bits?
Maybe you could also implement an algorithm to walk a tree of thought systematically and improve everyone's decision-making abilities? The algorithm would tell you "look at this thought, now this one, now compare these two thoughts", which can be done through pointers.
Perhaps 100 bits is enough to trigger a positive feedback loop that gets you in an extremely pleasant state of mind (jhana)? Being in a positive state of mind (or just not depressed) is extremely helpful for coordinating with other people, or even for productivity in general.
For all of these things you should consider whether it is already better accomplished by existing input/output methods. You can already program or do math via a keyboard, and there's not that much benefit to be gained doing that through a neural interface that is likely to be slower, less accurate, and have major health risks. Realizing that a cell phone is already a great brain-computer interface removes much of the incentive in this whole area.
Memory and executive thinking are interesting possibilities, but they are so poorly understood from a neural perspective I'm not confident in saying whether it's possible or not. But my assumption would be they are in the Mb/s range so bandwidth will still be an issue.
We actually do use neural stimulation (e.g. deep brain stimulation, or electroconvulsive therapy) to treat depression. But getting an implant to make you always happy sounds like inventing wireheading from the famous sci-fi novel "do not invent wireheading"
I have yearned so much for some sort of wireheading to become available that doesn't require dangerous surgery 'n' stuff. Like, can't magnets give one a constant low-grade sense of euphoria somehow?! There's gotta be a way, dammit!
What's the best thing to read on the plausible technology pathway from LLMs to AGI? It seems the biggest assumption in AI 2027 is that the technology path exists. I'd love to read how you justified this or if it was just thinking that supercoding LLMs will find the solution.
If you're asking whether we will understand the nature of intelligence by 2027, the answer is surely no. But evolution also didn't understand intelligence in a meaningful way and yet produced humans. Reinforcement learning used to not work very well except on relatively simple games, like chess, go and Dota 2, but it's now working quite well on pretrained LLMs. In theory, all you need to get AGI is:
1) an architecture that can in principle run an intelligent mind
2) a reinforcement learning target that saturates beyond the human level of capabilities
3) an optimization algorithm
If you have these 3 things, you'll eventually get an AGI (and you will have absolutely no understanding how it actually works internally). Do we have all 3? I'm probably most unsure about 1) – are feed-forward networks sufficient to contain an intelligent mind? – but I'm pretty sure they will unfortunately figure it out eventually. And even if they have a suboptimal architecture; if you throw enough compute at it, it will work eventually.
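For concreteness, the three ingredients map onto the skeleton of any reinforcement-learning setup. A minimal sketch (the three-armed bandit, the reward values, and the step count are all invented for illustration; real systems differ enormously in scale):

```python
# (1) an architecture: here a tiny softmax policy over 3 actions,
# (2) a reward signal to optimize against: a toy 3-armed bandit,
# (3) an optimization algorithm: plain REINFORCE-style gradient ascent.
import numpy as np

rng = np.random.default_rng(0)
true_rewards = np.array([0.1, 0.5, 0.9])  # the environment's hidden payoffs
logits = np.zeros(3)                      # the "mind": three parameters
lr = 0.1

for _ in range(5000):
    probs = np.exp(logits) / np.exp(logits).sum()
    action = rng.choice(3, p=probs)
    reward = rng.normal(true_rewards[action], 0.1)
    grad_log_prob = -probs
    grad_log_prob[action] += 1.0           # d log pi(action) / d logits
    logits += lr * reward * grad_log_prob  # ascend expected reward

print("learned action probabilities:",
      np.round(np.exp(logits) / np.exp(logits).sum(), 3))
# The policy typically ends up concentrated on the highest-reward action without
# anyone specifying how to choose it. The open question raised above is whether
# ingredient (1) scales up to "an intelligent mind".
```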
4) time. Evolution took a very, very long time to produce humans. The discussions here seem to be assuming AI will take much much less time than this, but this is predicated on us doing something evolution didn't or couldn't. If we're only doing as well as evolution, I see no reason to sound the imminent AI doom alarms.
We’ve seen plenty of examples of cultural evolution outrunning genetic evolution. I’ll agree that outpacing it by 6-8 orders of magnitude is a big ask. But natural selection is really dumb, and it has no conception of a goal and a plan.
But now we seem to be equivocating between "we don't have to know how to do it, evolution did it so we can just do it the same way" and "actually, we can do better than evolution".
Natural selection is really dumb, yes, but given enough time and resources it will visit all available niches in the search space. This is not obviously true of more directed search patterns, and for those you do have to understand enough about the goal to prove the approach is capable of reaching it.
The bitter lesson of AI, meanwhile, is that cultural human goals and plans produce even worse results than just dumbly throwing compute at a problem. Even for the reachable goals, we are not actually very good at deliberately directing towards them.
If we use the same approach evolution did, we need to show we can reach the end state fast enough to worry about doom; and if we don't, we need to show that whatever approach we /do/ use is actually capable of reaching that destination - we don't get to just handwave that part away any more if we do something radically different to our existence proof.
Either way, there is work to be done.
I think you're not giving enough credit to the algorithmic improvements Scott cited, which seem to be producing gains of the same order of magnitude as adding "dumbly throwing compute". (I'm just an AI layman; I know nothing about what those algorithmic improvements are nor about whether it makes sense even to compare their effects to that of raw compute.)
My original point was just that natural selection wasn't wired to produce intelligence in particular any more than it was wired to produce flight. But we still outfly birds by several orders of magnitude (not to mention being able to fly in a vacuum!) -- having an actual goal beyond "be fruitful and multiply" is actually worth *something*.
I work on some of these "algorithmic improvements". They are mostly about finding ways to do more compute (specifically, more matrix multiplication) per unit time and power. Basically, today, they are about making the building blocks LLMs are made of bigger and faster.
However, it remains unclear that making LLMs bigger and faster can lead to AGI, rather than just a bigger, faster LLM that still has all the fundamental problems LLMs have.
I'm not saying we won't ever get AGI. But I think it's entirely plausible that we won't get it just by making LLMs even bigger and faster; it may take more fundamental paradigm shifts like the "attention is all you need" paper (which fundamentally changed the way we do AI, and the overall approach has remained essentially the same since), and if so it is also unclear how long it will be before we see them.
> Natural selection is really dumb, yes, but given enough time and resources it will visit all available niches in the search space.
This isn't true, or at least if it is true you're talking about lengths of time likely many orders of magnitude beyond the current age of the universe. If a global maximum is surrounded by barriers of difficulty, or specifically if it requires several interdependent genes that are not individually self-propagating in order to function, you are essentially betting on random mutation spontaneously assembling them in DNA rather than taking advantage of natural selection's optimization power. (If not, how do you explain that one really dumb nerve in a giraffe that runs down its neck and then back up again? How long do you think that will take to be fixed, in generational time? If evolution is as fast as you imply, it's not clear we should be seeing this at all; and if it is that slow, it's implausible that a sane amount of time will pass before it's fixed.)
> If evolution is as fast as you imply
I imply nothing of the sort. Indeed, the slowness of evolution was rather my point.
"Natural selection is really dumb, yes, but given enough time and resources it will visit all available niches in the search space."
I don't think this is true; natural selection is good at visiting and conquering nearby hills on the fitness landscape, but it has great trouble visiting the ones further away. It cannot move through fitness valleys so easily, cannot go downwards. Enough time may mean very long indeed. Human-made search mechanisms can easily overcome this limitation, by introducing momentum-based optimization, simulated annealing, random restart hill climbing, tabu search, and whatnot. These and other hacks make artificial optimization processes very much faster.
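As a toy example of the kind of hack meant here (the landscape and all parameters are made up), plain hill climbing gets stuck on the nearest peak of a rugged fitness function, while something as simple as random restarts usually finds the global one:

```python
# Greedy hill climbing vs. random-restart hill climbing on an invented
# 1-D fitness landscape with several local peaks (global peak near x = 8).
import math
import random

random.seed(0)

def fitness(x):
    return math.sin(x) + 0.1 * x   # local peak near 1.7, global peak near 8.0

def hill_climb(x, step=0.05, iters=2000):
    for _ in range(iters):
        candidate = x + random.uniform(-step, step)
        if 0 <= candidate <= 10 and fitness(candidate) > fitness(x):
            x = candidate          # only uphill moves are ever accepted
    return x

single = hill_climb(1.0)           # stuck on the nearby local peak
restarts = max((hill_climb(random.uniform(0, 10)) for _ in range(20)), key=fitness)

print(f"single climb:    x = {single:.2f}, fitness = {fitness(single):.2f}")
print(f"random restarts: x = {restarts:.2f}, fitness = {fitness(restarts):.2f}")
```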
Evolution is hard time limited by the reproduction cycle, in addition to the algorithmic inefficiency of so much random search. There are also physical world constraints (you can't evolve respiration when there is no available oxygen). These things do not apply directly to this question.
The intelligence explosion leading to humans was also lightning quick on the scale of evolution, a couple hundred thousand years. Then the entire cultural evolution that happened since modern humans developed agriculture is ~20k years. In the last 100-200 years this cultural evolution has produced more technological progress than the previous 19k years combined.
Time is a relevant constraint, but biological evolutionary timescales are not relevant here and should not make you feel better. You should either be worried, or find a better reason to not be worried.
Strange that you consider all 3 requirements fulfilled or close to being fulfilled.
The first is incomplete: it needs to run an intelligent mind *efficiently*. I see no reason to believe LLMs can do this within reasonable size and data constraints; it shouldn't take a reasoning model crunching through millions of tokens to fail at playing Pokémon or to answer an altered children's riddle wrongly.
There's no reason to believe that reinforcement learning will extend past the bounds of the distribution of the training data; in fact, that's what it optimizes for, which leads to the surface form of a reply mattering more than its meaning and makes creativity an out-of-distribution task. Look at how cutting-edge LLMs performed at the recent Math Olympiad. It's not just that they were so often wrong; it's that they were so confidently wrong, because their answers were semantically similar enough to what mathematical proofs look like.
Third is an optimization algorithm. Gradient descent might be more efficient than the evolutionary process of nature, but it's a lot less capable of escaping local minima. Assume, in the best-case scenario, that the text data used to train LLMs is representative of a complete model of reality (doubtful). The question is then: is it more likely that gradient descent will find a function within the relatively large subset of functions that approximate text well by encoding a deep understanding of relationships between large groups of semantic concepts, like a very high-level grammar (basically what we have today, impressive and useful but not AGI), or will it find a function within the very tiny subset of functions that approximate text well by encoding a complete fundamental model of reality beyond anything the training data can express (true AGI)? The chances of stumbling onto the latter through gradient descent are vanishingly small. It's like training a model to predict physics by giving it training data of balls bouncing around and expecting it to solve physics. Sure, solving physics would allow the model to predict bouncing balls, and the model might be large enough to approximate such a function, but there are so many more functions that can predict ball bounces that expecting it to find the one that solves physics is absurd.
The way I see it, none of these prerequisites are close to being achieved.
1) Feed-forward networks are not sufficient...but they're also not all that are in use. Consider recurrent networks, etc.
2) I think you need something more complex than a single reinforcement network. But that's also trivial.
3) Optimization is tricky. It almost always means "do better on some specific subset of problems". I think the current AI term for that is "distillation".
One of the things you left out is the process of training. Cleaning the data. Checking projections against the data. Etc. This has proven quite tricky except in specialized areas. (Actually, even there. But it's often been pretty well handled in specialized sub-domains. Like math or protein-folding, where there's a data set with not too much error.)
OTOH, my comments are about publicly available methods and examples as of at least a year ago. Don't expect their limitations to apply to research frameworks or proprietary programs/libraries. (I expect they do, but that's just "these seem like general rules".)
> 1) Feed-forward networks are not sufficient
Why? At least in theory transformers with FFNs can compute anything an RNN can. The transformer architecture took over because it empirically performs better than the RNNs we have. The parallelizability and scalability turned out to be a bigger advantage in practice than what RNNs offer.
I don't doubt at all that better architectures could be invented and that those may use RNNs. I'm just wondering why you're saying feed-forward based networks are incapable of general intelligence.
I'd need to study transformers more, but it seems to me that they need a way to adjust their weights to the environment they find themselves operating in. IOW they need feedback, both positive and negative. Otherwise they can't learn. And if they can't learn, then I don't think they can reasonably be called intelligent.
Feedforward networks can be adjusted to model nearly any situation, but without feedback they can't adjust themselves to a changed situation.
I think the answer to "do LLMs learn" is more complicated than it first appears.
LLMs learn during training or fine tuning. The AI companies are continually training them and new versions are released regularly.
LLMs can also temporarily learn facts or simple skills just from instructions or information you give them within the context window. If you were to make up a simple language that's similar to English but with certain words changed and different plural constructions, it could learn how to write in your made up language just from your instructions.
They don't change their weights from conversations with users (unless OpenAI or whoever later uses that conversation in training). I think this is partly due to limitations and partly an intentional feature. They don't want conversations with random users to be changing their model willy-nilly, nor do they want their model to remember private secrets of one user and reveal them to another user.
The limitation part is that training is costly and requires a lot of examples. More than a human would need.
The current strategy for LLMs is just to try to teach them everything, which reduces the need to learn new subjects on the fly. If you need to teach them something new as a user, your options are fine tuning or sticking it in the context window. Neither works as well as full training.
> it seems to me that they need a way to adjust their weights to the environment they find themselves operating in.
So say you want an LLM to do legal work. You could train it or fine tune it on legal work and it would learn it and adjust its weights. Or you could just start with a big LLM that was trained on everything including legal work. It will already have sections of its weights adjusted for legal work and will use those.
Also, I don't think there's any way to distinguish between "learning" by adjusting weights vs very good reasoning on working memory (the context window). One can substitute for the other.
The current LLMs aren't fundamentally unable to learn. We may invent architectures that could learn much more efficiently. But it's ultimately a matter of degree, not capability.
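The distinction being drawn is simply which numbers change: a training or fine-tuning step updates the weights, while "learning" from the context window leaves them untouched. A bare-bones PyTorch sketch of that difference (a toy linear model and made-up data, standing in for something far larger):

```python
# Toy illustration: inference leaves the weights alone; a gradient step moves them.
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 1)              # stand-in for "the model"
before = model.weight.detach().clone()

# "In-context" use: run the model on more input; weights are untouched.
with torch.no_grad():
    _ = model(torch.randn(32, 4))
print("weights changed after inference?    ", not torch.equal(before, model.weight))

# Training / fine-tuning: one gradient step, and the weights move.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(8, 4), torch.randn(8, 1)
loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()
optimizer.step()
print("weights changed after training step?", not torch.equal(before, model.weight))
```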
The big question for me right now is not whether feed-forward networks in general are sufficient to do 1, but whether the LLM paradigm in particular can do 1. My strong suspicion is that it can't, and that the LLM paradigm doesn't generalise too far beyond the things it's currently good at.
An LLM is already "superhumanly intelligent" in many ways; no human can dash off a thousand-word essay on any arbitrary topic in a matter of seconds. But there are other intelligent things that they're going to continue to suck at. Importantly, they suck at theory of mind: if you ask them how a particular person would react to a particular event, they can't go beyond whatever generalities are present in their training set.
If we can't get there with LLMs and we're stuck with more general neural networks then we're not necessarily a lot closer than we were five years ago.
>The big question for me right now is not whether feed-forward networks in general are sufficient to do 1, but whether the LLM paradigm in particular can do 1.
That's fair. I'm looking at AI progress mostly from the point of view of "Can reliability be pushed up to human levels?". Perhaps LLMs suffice, perhaps a system that is mostly an LLM but has some side data structures or control, maybe just a scratchpad for fairly voluminous intermediate results for deep reasoning suffices. I'm hopeful that LLMs will be enough for our "System I" thinking and that maybe a little executive function plus memory can be sprinkled on top to add "System II". Time will tell.
AFAICT they assume that it's feasible to hook up a reinforcement learning feedback loop for developing "research taste", which scales to a superintelligent researcher, which then develops a MVP AGI. They say this is a "conservative" way of thinking, because there's no obvious reason why all the benchmarks/lines on graphs won't continue to go up indefinitely, so it's incumbent on skeptics to offer a counterargument.
There is clearly a point where the curve turns logistic...but there's no obvious (to me) reason to pick any particular level of ability where that would happen. The only obviously defensible claim of limitation would be "it can't learn to be any better than its teachers", but while that's defensible, I can't see that it is necessarily correct.
Right. I mean, we *know* that’s not true for human students.
Nor for some past examples of AI learning, like chess playing.
Wait, what? Chess AIs routinely outplay the humans who created them.
Mark's agreeing with you by extending your point.
Duh, sorry, Mark: I read “nor” as “not”.
A student can surpass their teacher by adopting and polishing models the teacher had to coarsely invent, by solving problems the teacher had enough neuroplasticity to discover but not to fully explore... but when they start facing problems the teacher can't even classify, where a mere polynomial-time task to check the possible answer is already too vast for the teacher to complete, slowdown from lack of tutelage will definitely kick in.
No doubt. But does that kick in at IQ 200? 400? 1000? 10000? I haven’t seen a principled argument that tells me the threshold is low enough to save us.
Even a race with an average IQ of 200 would probably displace us as we displaced the other hominids.
The other hominids were competitors for essentially the same ecological niche. You can't grow an ape out of silicon, copper, and chlorine trifluoride, any more than you could construct a datacenter from mangoes, clams, and megafauna thighbones.
>because there's no obvious reason why all the benchmarks/lines on graphs won't continue to go up indefinitely
The obvious reason would be that, in real life, no line on a graph ever goes up indefinitely. As long as your line on a graph connects to the real world in any way, there are costs and limitations that will eventually force that line to flatten, and the burden of proof that the line will go up for long enough to achieve some state is on the people who propose it will do so.
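As a small illustration of that point (arbitrary parameters, not a model of anything real): exponential growth and logistic growth toward a hard ceiling look nearly identical early on, so data collected before the bend can't tell you where, or whether, the flattening comes.

```python
# Exponential vs. logistic (capped) growth with invented parameters.
import math

K = 1000.0   # assumed ceiling imposed by costs and physical limits
r = 1.0      # assumed growth rate
x0 = 1.0     # starting level

for t in range(11):
    exponential = x0 * math.exp(r * t)
    logistic = K / (1 + ((K - x0) / x0) * math.exp(-r * t))
    print(f"t={t:2d}   exponential={exponential:10.1f}   logistic={logistic:8.1f}")
# The two columns track each other closely for the first several steps, then the
# logistic curve bends toward its ceiling while the exponential keeps going.
```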
Although it depends somewhat on what sort of analysis you would like to see, a good starting point would be some of the supplements to the report:
- https://ai-2027.com/research/timelines-forecast (from LLM to superhuman coder)
- https://ai-2027.com/research/takeoff-forecast (from superhuman coder to ASI)
I wonder what an economy with super AI and 12 million humanoid robots (1 year into the 1-million-robots-a-month scenario) that can do anything better than anyone looks like? Email jobs can disappear just from software. Doctors and lawyers are the next highest-value professionals, but I bet the AMA and ABA will still make sure that a doctor is always in the room with every patient and that no robots are allowed in courtrooms for at least a year. Soldiers are an obvious choice, but we’re more likely to use flying drones for that unless we are currently at war with China.
I think the next highest value job might actually be skilled Blue Collar Labor. A super plumber, carpenter, electrician, and contractor, that leads a team of unskilled construction workers. Meanwhile, retail jobs might be safe until the number of robots increases dramatically. Any other jobs I am missing?
I don’t see humanoid robots ever being used much at all. If you were to replace a waitress with a humanoid robot, the cost of the robot still has to be less than minimum wage multiplied by the number of people you can fire.
That cost would be depreciation, maintenance, and, if I know capitalism (and I do), paid software updates.
You might also need as many robots as your entire rota; there are certain days when everybody is working, like Valentine’s Day, Mother’s Day, and other holidays.
There’s no obvious efficiency gain here anyway. Competent wait staff get the food to you in time as it is, and people are happy to dawdle in restaurants. If your food comes twice as fast you gain little. These are not assembly lines.
People like to flirt with the cute blonde waitress rather than with a mechanical device.
And there won’t be many customers anyway, since they will have been replaced with robots themselves.
Why did you respond to a point about skilled labor with an anecdote about a minimum wage worker in a context where human relation and warmness matters?
A humanoid bot could replace most handyman and skilled-labor work, which tends to be repetitive, and they are not paid peanuts. They don't even need AGI for that. The cost of maintenance is still less than paying an annual 6-figure wage, plus weekends off, vacation, etc.
They don't even need to be humanoid in many cases. Specialized industrial domains like agriculture will purchase machines that suit their needs, and still be automated after some training and setup.
Handyman work hardly seems suited to machines. Each house, and each problem the house has, is different.
And more alike than different. I would not underestimate AI in this use-case given what it already does for code.
Some things a handyman might do: clean windows, paint, drywall, (rarely) bricklaying, flooring and carpeting, siding, roofing, I could go on.
This is mostly repetitive work. If you pay for a service at your house, often there is more than one person involved. There's the business owner, and hired hands. It would be trivial to replace most, if not all, of them if machines are fast and dexterous enough. Having no business owner on site is unlikely for a long time, but that doesn't matter, and it will still happen eventually.
Even something as simple as painting requires a great deal of dexterity. Factory machines are most adept at small, precise motions. AGI fetishists rely a lot on some miracle that will give robotics the equivalent of the opposable thumb.
hands: https://www.youtube.com/shorts/pqIbLwIm_Qk , https://www.youtube.com/watch?v=AXcZGkKR2og, https://www.youtube.com/watch?v=6NkZXofJft0
The dexterous hands are only improving, but it's redundant for many of these use cases.
drywall sanding: https://www.youtube.com/watch?v=3yG2BgDYNGQ
painting: https://www.youtube.com/watch?v=hm9ZSN37jVM , https://www.youtube.com/watch?v=d4G7Ul62ibE
This is massive cope.
1) You claimed that robots won't be used _anywhere_, but only analyzed a case that is probably one of the worst options for using them, instead of looking at the best options
2) Your case rests on robots being at least as expensive as minimum wage alternatives, but your justification for this is a complete hand-wave
3) You simultaneously argue that efficiency doesn't matter ("if your food comes twice as fast it gains little") but also that any loss in manpower is unacceptable ("there are certain days when everybody is working")
With mass production, a robot will cost less than even a single minimum wage employee. Unitree is advertising their humanoid robot for $16,000. The true typical price might be about double that, but prices will come down in the hypothetical scenario where production scales up to 1 million robots per month.
The median waitress salary in the US is about $32,000. With swappable batteries, a robot could work nearly 24/7. That's a bigger advantage for restaurants that are open longer hours and more days a week. Assuming a robot works at the same speed as a human waiter and can work twice as many shifts and will cost $16k, it breaks even in 3 months.
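For what it's worth, here is the arithmetic behind that breakeven claim as a minimal sketch; the price, salary, and shift multiplier are just the assumptions stated above, not real market or labor data:

```python
# Rough breakeven sketch for the robot-vs-waiter comparison above.
# All inputs are the assumptions stated in this thread, not actual vendor or labor data.

robot_price = 16_000           # assumed up-front robot cost, USD
median_waiter_salary = 32_000  # approximate US median waiter salary, USD per year
shift_multiplier = 2           # robot assumed to cover twice the shifts of one human

# Wages displaced per month if one robot does the work of two human-shift-equivalents
displaced_wages_per_month = median_waiter_salary * shift_multiplier / 12

breakeven_months = robot_price / displaced_wages_per_month
print(f"Displaced wages per month: ${displaced_wages_per_month:,.0f}")
print(f"Breakeven: {breakeven_months:.1f} months")  # ~3 months under these assumptions
```

Note this ignores depreciation, maintenance, and the paid software updates mentioned above, all of which would push the breakeven out.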
> there won’t be many customers anyway since they will have been replaced with robots themselves.
If you assume mass unemployment and no one has money, then we just get general economic collapse and all the restaurants close. But if that doesn't happen and, for whatever reason, a restaurant is getting few customers, that only puts more pressure on them to replace humans with less costly robots.
I think this is just a lack of imagination that makes a number of unnecessary assumptions.
The first is not understanding what a humanoid general purpose robot is. It isn’t just a waitress or just a wildlands firefighter or just a manual laborer, it’s all of these things and more all at once. It is everything a human can be.
My household robot could easily hop on the bus and work the restaurant on Mother’s Day for the right price. Or it could package itself for delivery to California to cut fire lines for the right price, or it could throw up timber frames for the right price. It can also likely maintain itself, or be inexpensively hired to maintain one of its compatriots.
Or it might be undercut by a fleet doing the same thing but at greater efficiency.
And these won’t be clunky lifeless hunks of metal, they will be imbued with superhuman agreeableness, perfect memory, perfect work ethic. They can be trusted to do exactly what you ask of them without cutting a single corner. Having actually worked as a waiter at one point in time, I really can’t stress the importance of the last point for front or back of house staff enough.
If the average guy wants a pretty blonde to flirt with at Applebees, very few generations of iterative design are going to be needed to create the most superhumanly beautiful exoskeleton to house the already superhumanly flirtatious personality.
I went to a restaurant with robot waitresses the other week.
To be fair the robot was just a tall wheeled thing that drives up next to your table and lets you take the food off and they still had human waiters/resses for other things. In fact, it almost certainly doesn't save the restaurant any time or money. But the kids loved it.
And who needs the human warmth of a below-minimum wage illegally-employed foreign student waitress when you can have a screen with cute cat eyes?
>I don’t see humanoid robots ever being used much at all. If you were to replace a waitress with a humanoid robot
You're thinking in absolute terms. It is more likely that the two will coexist in many services, where whenever a human can't, a robot will be available. That is more viable, from economic and pragmatic perspectives. Eventually in some fields we may see one preferred more than the other (manufacturing is an absolute, but not all service levels will require full automation), but you can be sure there will be advantages to humanoid robots everywhere; it's just a matter of time.
It is hard for me to square the AI 2027 forecasts with power and infrastructure needs for the compute clusters they envision.
Do you have specific issues with their “compute forecast”? They focus on chips and power. But presumably if you have those inputs the infrastructure part is not a bottleneck.
I wrote something here: https://davefriedman.substack.com/p/the-agi-bottleneck-is-power-not-alignment
Your claimed power usage requirements appear to be very different from the actual forecast's.
https://ai-2027.com/research/compute-forecast#section-5-industry-metrics (scroll down to the power section)
Good eye on the compute forecast—thanks for linking. But just to clarify, the 70–100 GW number I cited comes directly from the AI 2027 narrative, not my own estimate. In the scenario, OpenBrain alone is running 100M H100-equivalents by late 2027 (each ~700W), which implies ~70GW before cooling/networking. Their supplemental data may forecast lower industry-wide draw, but the scenario text itself goes well beyond that.
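To make the arithmetic behind that explicit, here is a minimal sketch; the chip count and per-chip wattage are the scenario figures quoted above, while the overhead multiplier is purely an illustrative assumption of mine:

```python
# Back-of-the-envelope check of the ~70 GW figure discussed above.
# Chip count and per-chip wattage come from the scenario text quoted in this thread;
# the overhead multiplier (cooling, networking, etc.) is an illustrative assumption.

h100_equivalents = 100e6   # 100M H100-equivalents by late 2027 (scenario figure)
watts_per_chip = 700       # ~700 W per H100-equivalent

chip_power_gw = h100_equivalents * watts_per_chip / 1e9
print(f"Chip power alone: {chip_power_gw:.0f} GW")  # ~70 GW before cooling/networking

overhead_multiplier = 1.3  # hypothetical datacenter overhead (PUE-style factor)
total_power_gw = chip_power_gw * overhead_multiplier
print(f"With overhead: {total_power_gw:.0f} GW")    # roughly the 70-100 GW range cited
```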
Your analysis is centered around compute and power, but surely, these aren't the sole determinants. I just recently posted a more comprehensive perspective on the parts that form what sovereignty means for AI: https://antoniomax.substack.com/p/the-political-distillation-of-ai
My analysis is more political in this article, but the bits are all there in coherent and inclusive frameworks.
Good article, thanks for sharing. I actually just published something on the same topic: https://davefriedman.substack.com/p/ai-startups-in-an-era-of-tariffs-8ac
They don't focus nearly enough on who is paying for it. This will require trillions of dollars. We can look to previous tech bubbles, commodities, the shale revolution, and many other historical examples to see just how quickly companies will cut capex and how timelines will increase by decades.
Even a minor recession could see AI spend drop by half, if not 80%.
This is certainly true, especially given recent news about Trump and his tariffs. The authors might argue that they had not known about the tariffs before writing the piece, but Trump has been telegraphing his interest in tariffs for years.
And even without the tariffs, there is no analysis of the dozen or so people who will determine 2026 Capex. It only takes one or two of them to get cold feet and the entire thing falls apart. Given how in-depth they go into other areas, it seems like a large oversight.
Zuckerberg showed us with the Metaverse that he will bend to shareholder pressure. The rest of the CEOs will do the same.
Stargate funding is largely directed by Masayoshi Son. It is reasonable to expect it to end in disaster.
Well Microsoft has already announced data center cancellations, so you’re possibly directionally correct.
But is that going to slow down China? The US companies might need to migrate to some other country, and that would probably slow them down, but how much depends on what's available where they end up. If they migrate to a major datacenter in Europe or Canada it might not slow them more than a year or so. (Possibly buy out someone who was heavily invested in BitCoin mining...but I don't know if anyone really went *that* heavy into mining.)
You might find this an edifying read: https://en.wikipedia.org/wiki/China_Plus_One
Will this slow down China? No. China is an economic power now, if a + 1 initiative grows, chances are Chinese investment firms will be part of it, same as US investment is everywhere. Genie is out of the bottle. Autonomy is one thing, slowing China is another, and while they are both about China, these initiatives don't go hand in hand.
Hands down the best comment in this thread. The economic and political dimensions are the two most important determinants in this moment.
So, uh, is there any reason why I shouldn't kill myself based on these predictions? Because it seems like there's basically no scenarios where things end well, even in the situation where AI fizzles out.
> Because you’re made in the image of God
...That has very terrible implications for what God is.
If you were privately considering suicide, I think it would be fair, and a smart move, for you to probe this group to try to get an idea what the chances are there will be future events that would make it more or less worthwhile for you to continue your life. But you’re not doing that. You’re announcing to a bunch of strangers that you are leaning strongly towards suicide, and giving the impression that their answers may sway you one way or the other. *And* when people offer reasons that weigh in the direction of your staying alive, you keep them engaged and in suspense by saying “yes but that doesn’t make me more inclined to stay alive, because . . .”
What you are doing is extremely unfair and unkind and I wish you would *stop it.* If you want to discuss the suicide decision with someone, do it with a close friend or a professional. Both would have a reason and in fact an obligation to engage with you, and both are also more likely to have useful things to say, since they would actually know more about you and your situation.
Writing this in response to a suicidal person is unkind as well. People ask for help and deal with suffering in many different ways. You're just assuming that everyone has access to a professional, or a close friend willing to listen.
Other people in the thread are willing to engage with their comments, if you aren't, you don't have to.
Aren't you a psychiatrist? I thought you of all people would be a little bit more sympathetic. There's nothing more to explain about the situation, because it's a very simple one: things are very bad right now, they are continually getting worse, and if the forecasts show that everything will likely go to shit before they even have a chance of getting better, I am not interested in living.
So far, the only reassurance I've gotten that things will get better is that AGI will magically bring utopia to all human beings (despite there not being any incentives towards that outcome under current circumstances), and also Jesus. So... it's not exactly reassuring.
And no, I do not have friends or competent professionals to talk to. Of course, even if you think my actions are so utterly immoral, it seems that's a problem that will solve itself.
Setting aside considerations of others’ stress levels and needs, I also think asking this group for help in deciding whether to commit suicide is not a safe plan for you, and not likely to be helpful to you. First of all, nobody knows how accurate Scott’s group’s predictions are. They may be way off because the group neglected to take various things into account, or because some unanticipated event throws a monkey wrench into the machinery of modern life. Second, soliciting opinions about future events from people after telling them that what they say affects the chance that you will commit suicide guarantees that you will not get honest opinions from most, but ideas selected to protect the responder from feeling responsible for someone’s death. Third, if you keep doing this, sooner or later you will run into an internet feral who cries “jump! jump!” or the equivalent. If you think I suck, wait till you hear from one of them.
There are forums on Reddit where people are open to hearing from depressed and desperate people and have many members committed to trying to be helpful. (Even there, though, you will shut down a lot of what the group has to offer if your presentation is that you are standing on a ledge and the group’s job is to help you decide whether to jump.)
Obviously I'm not expecting people to tell me to jump (though even that would still be more helpful than whatever you're doing), but I was hoping someone would at least come up with a good reason beyond "just ignore it lol" and "Jesus". ...I don't know what I was expecting.
The future is likely to be far stranger and more wonderful than we can currently imagine.
Even so, we're not going to be a part of it. That's the problem, ain't it?
I can imagine lots. Try explain.
If you’re the one who can imagine lots, then maybe you should imagine some weird futures and explain them to us. We don’t think we can imagine what the future will be like.
I think the super intelligent A.I. will usher in heaven on earth and everybody will be 52 feet tall by the end of 2027. We will colonise and terraform Mars by April 2028 and Venus by July. Things will only get faster. By September 2028, with GDP growing at 2,199% a second, we will all become minds floating in vicarious substances and live on Jupiter until we get sick of that and upload our minds to a grain of sand found on the beach in Cheshire.
I will not be taking questions.
You mean: I should stay alive to see how wonderfully it kills everyone?
Well, everyone dies, not everyone gets to see the end of humanity.
The way you get killed by an evil AI is probably not going to be significantly more painful than killing yourself, and you also get to experience all the good things that will happen in the intervening time between now and the AI takeover. I guess maybe there’s more “dignity” in choosing death yourself instead of having it forced on you?
> you also get to experience all the good things that will happen in the intervening time between now and the AI takeover
Yeah, uh... I'm not experiencing anything good right now. I was hoping things might improve, but it seems increasingly likely there won't even be enough time for that. What a shame.
I think that even if you take the forecast at face value, there is some time for things to improve before everything goes crazy. Maybe wait a year or two to see if that happens?
Also: forgive the stereotypical advice, but “not experiencing anything good right now” sounds like a very unfortunate problem to be having… perhaps that one at least might be fixable, even if the big picture stuff looks harder to fix?
I read some of your other comments; not as easy to fix as I thought.
How things will shake out is still way too uncertain. I don't think a lot of these futures involve negative EV compared to ceasing to exist. Thus, it's more like choosing to get cryonics: a 10% chance of being revived into a glorious future and a 1% chance of being revived into an OK/terrible one (fewer scenarios in the bad futures would spend resources to revive dead people) is better than a 0% chance of anything. I mean, considering the idea of being born, I think 10-to-1 is good odds.
The main thing keeping me from cryonics is the "bad freezing can't be reversed regardless of tech because the information's already gone, you suffer severe brain damage" issue. That's a quite-likely -EV. Doesn't apply to this scenario, though.
Do you mean in comparison to like, aldehyde suspension?
I don't mean in comparison to X, I mean absolutely. I mean that I would rather die than live with severe brain damage, and I suspect current preservation methods (particularly since they're usually started postmortem) do not have the fidelity to avoid giving revivees severe brain damage.
Ehh, if the future unfreezers pumped you full of neural implants and AI-generated synthetic memories, I doubt you'd be capable of noticing.
This is not very reassuring.
AI-fizzle isn't one of the two scenarios shown there.
I know. But if it does fizzle out... we're looking at a global economic collapse, aren't we? Or worse, if all of this ends up escalating to all-out war.
In all honesty, I expect your mental health would be better following a nuclear war (assuming you survive) due to the Torment Nexus, uh, I mean Internet being destroyed.
Certainly, my intention is to survive (if I'm not called on to fight).
> my intention is to survive (if I'm not called on to fight)
But...not if you are? Feels like I'm missing a key logical step there.
Well, my intention is to survive if I'm called on to fight, too, but my point is that I wouldn't try to draft-dodge/desert/otherwise try to avoid fighting.
Why would "the failure of a technology to continue to advance" lead to "global economic collapse"? And even if it caused a recession, even the worst ones we've had have never really gone past a decade.
Basically because our current technological civilization is depleting a finite pool of non-replenishing resources. It depends on advances to continue to have resources available.
This is a really broad claim and seems unrelated to AI advancement.
It *is* only marginally related to AI advancement, but it was in answer to a specific question: "Why would "the failure of a technology to continue to advance" lead to "global economic collapse"?"
And it *is* a broad claim. But it's got an extremely wide set of substantiation evidence. See, e.g., Buckminster Fuller on why "ephemeralization" is important.
Disagree. Solar energy is more or less unlimited. Rare materials can be recycled from previous uses. Between abundant energy and sufficient materials, society like ours could continue almost indefinitely.
The claim is probably that global economic collapse is likely as a baseline, but AI acceleration might change that (because it changes everything).
Do you mean with the tariffs? Eh, recessions come and go. Don't see how we get to a bare-knuckled all-out WW3 from here; no empire is trying to aggressively take over a continent, there's no system of chain-reacting alliances that could make one happen accidentally, and every elite wants to preserve global physical and human capital as much as possible for their own good.
There *is* a large probability of war in the near future. Mainly because the US is ceasing to be the dominant country in the world and historically countries have refused to acknowledge it when that has happened, leading to wars when they tried to dominate countries that were no longer willing to be subservient.
OTOH, it's quite plausible that "economic war" would suffice, and the recent tariffs against all our allies may have sufficiently destroyed the economy that it will be clear even to US politicians that we have already lost.
The stock market dipping to where it was in early 2024 means the economy is destroyed?
If that's all you measure, and you think that's all that's going to happen, then ... well, it would be nice if you were right, but I don't believe it.
There are lawsuits over whether Trump has the authority to set these tariffs, SCOTUS will hear and they have lifetime seats Trump can't effectively threaten.
The spark is probably Taiwan. It's my opinion that the PRC has an active invasion plan in progress following the Hong Kong fiasco decapitating its efforts at peaceful reunification (this is not to say that they can't still abort; obviously, they can). See e.g.
https://www.washingtontimes.com/news/2025/mar/20/chinese-military-shows-taiwan-invasion-barges/
...for evidence.
Allowing Taiwan to fall would be foolish because it'd let the PLAN into the wider Pacific, where it could easily threaten a loose blockade of South Korea or Japan (which cannot hold out due to requiring massive food imports), not to mention TSMC and the damage to the Western alliance system (the USA has said in the past that it'll defend Taiwan). Defending Taiwan probably means a Duluth bear or Arkhipov incident goes wrong sooner or later, which means mushroom clouds.
Scott said that the authors of AI2027 predicted p(doom) from ~20-70%. (those might be slightly off, sorry if so).
_If_ you believe them to the point of killing yourself, you should also believe their estimate of p(doom), which averages out to roughly a coin flip.
That's like 50% chance of utopia. I'd kick myself so hard if I died a couple of years before by mistake.
I don't think it was 50% chance of utopia, just 50% chance of not being really disastrous. I'd give it perhaps 30% chance of being something that people would call utopia for a few years...before they got jaded. (I don't think it's really possible for people to be satisfied for a long period of time.)
Scott gave his p(doom) at 20% and clarified on Dwarkesh's podcast that the other 80% includes all the techno-feudalism and surveillance state kind of endings about which he would be very unhappy. I also know from past columns that Scott considers some endings good that I and many others consider bad, notably the transhumanist ones with widespread augmentation. I don't know what the chance of a "good ending" is by my definition, and Scott seems right in conceding (here in this post) that all of those variations in human social organization that are theoretically possible would depend on the details of what the world looked like at the time of the transition.
As a person who mostly likes the current form of human social organization and has a meaningful and valued place in it, I think the chance of ending up with a world I'm ok with after a transformative shock like AGI is going to be pretty low even if we survive.
OK fair. I was too optimistic with the 50% utopia number.
I still think there's a good chance of a good outcome and they also didn't give up.
Worst case, kill yourself after the Zuck/Altman coalition conquers earth.
Because you don't lose anything by waiting to see what happens. On the off chance something good happens, it'd feel really stupid to have killed yourself before it does. If something bad happens, you die anyway.
Act based on the assumption that we (somehow) get the good future, because if the bad future happens - we all die, and there's no action required or possible on your part, so what's the point in worrying about it?
(obviously, if you're smart/rich/powerful/whatever enough to be able to increase our chances of the good future, then do that, that's an excellent reason to keep living)
Did I mention I was in constant, unending pain and stress with no obvious cause or remedy? I am very much losing something by merely existing. And while it isn't that relevant, you're all seriously discounting the possibility of s-risks.
Sorry to hear that (FWIW from a random internet person). Have you considered soberly evaluating whether life is worth living for you, completely independently of all this AI madness? It might be a healthier frame of mind.
Without going into object-level arguments, to me the whole AI panic seems like a mind virus that is particularly likely to spread in the rat sphere. It's an info hazard for a certain type of person that Takes Ideas Seriously (tm). As such, it has a tendency to increase the intensity of any other problem you have going on in your life. However, you are free to just... not buy into that narrative. Block it out, pretend that the world is still the place that homo sapiens evolved for.
> Have you considered soberly evaluating whether life is worth living for you, completely independently of all this AI madness?
Oh of course. Frankly, I probably am just looking for justification to kill myself, but this whole ordeal certainly isn't helping. I would really appreciate it if the future was more clear so I could make a more educated decision, but unfortunately we all need to work under uncertainty.
The best way to find out how the future turns out is to live long enough to see it.
First things first: let's separate the x-risk / s-risk discussion from the challenges you are facing at present. I don't know the particulars of your situation, but it would be worth reviewing with a professional well-studied interventions such as talk therapy and pharmacological options, including for hard to treat depression. Your well-being comes first.
...Look, the situation is already being dealt with as well as it reasonably can be, given the circumstances. And it can't really be divorced from outside circumstances, because if everyone's going to die in a few years, or the world's just going to go to shit regardless, I'm simply not interested.
A key insight from my therapist was to stop "fortune-telling". My trying to predict the future almost always created anxiety and depression. But looking back on those days, I can say that the actual future instead brought love, joy, and contentment.
If things are as bad as you're describing, try screaming with all your might "Lord Jesus, save me!" That's what I did on the worst day of my life, when my depression told me that life wasn't worth living.
He showed up. He saved me from that.
What do you have to lose?
What was the experience of Jesus showing up and saving you like?
I'm really sorry to hear that you're going through this! I don't know what you're dealing with, but I'm suffering from a chronic illness that I don't understand that causes me a lot of suffering, so I might be able to relate at least somewhat.
For myself, I'm kinda hoping that smarter AIs might actually be able to help me figure this out, or create enough abundance that it's easier for me to afford more competent doctors who can help. I don't know about your situation, but consider whether a superintelligence might be able to help you, and if so, it could be worth holding out for. I don't know if you've been suffering for a long time and feeling already tired, but if it's been long already - what difference will another couple of years make, if there's a chance at fixing things?
I think the odds of AIs increasing our suffering in a way that prevents us from killing ourselves later are very low (why would they? If aligned, they'll build utopia, if misaligned, they'd kill everybody off, I don't think them torturing people is super likely). So there'll always be plenty of opportunities to give up in the future.
For now, even if life mostly sucks, sometimes good things happen that I couldn't have predicted, and they make me very glad that I have lived long enough to experience them.
Frankly, I don't expect superintelligence to happen in just a few years, mostly due to the fact LLMs seem to be insufficient, and real-world training will probably be necessary... But either way, the difference is that things are progressively getting worse, both my condition and the world. I just can't keep going through with this.
> why would they? If aligned, they'll build utopia, if misaligned, they'd kill everybody off, I don't think them torturing people is super likely
Why not? All it requires is a basic sense of justice. After all, isn't it humanity that invented the concept of hell?
> After all, isn't it humanity that invented the concept of hell?
Humanity invented a lot of dumb stuff)
AI can go terribly wrong for all sorts of reasons, but I'd be really surprised if AI torturing people was the outcome we end up with. If we fail to align it to want what we want, then it'll just want whatever it randomly happens to want as the side effect of training, incentives, and failed alignment.
It makes sense to me that it might want to eliminate people as an obstacle to its goals, and it makes sense to me that AI pursuing its own goals might create a world incompatible with human existence (for example, it uses all the resources for its own purposes, and has no reason to share).
But I'd be really surprised if "torturing humans" ends up being a thing that AI decides to waste resources on, or ends up wanting for the sake of itself.
If this ever changes, if in the future we have a good reason to believe that this is about to happen, and, in addition to that, it looks like AI is about to be able to prevent people from killing themselves, then the decision making process will change. But for now, the sum of all the possible good outcomes, plus all the bad ones where we all just die, seem way, way more likely than this one specific "torture humans" scenario.
The way I think about it, if you have a choice between killing yourself or not, and you aren't sure, and you know that only one of these decisions can't be reversed, that's a good enough reason to keep going (holding on to the hope that AGI or some other unpredictable events change things for the better, and make the life worth living again). We live in very weird times, who knows what will happen.
> I don't expect superintelligence to happen in just a few years,
Taking that as a consistent assumption, odds of doom go way down. Disappointing AI development results will be reflected in stock prices, reallocating capital away from AI companies, back toward things non-AI-obsessed humans actually want.
Trump dies of old age, fascist chucklehead followers lose coordination and start tearing into each other, social media remembers how anti-troll filtering works - possibly forced to by the legislature, under threat of having their bait regulated as a public utility. Sane compassionate people (who've been keeping their heads down) notice the screaming stopped, re-emerge, and get to work solving real problems. Fortunately by then they've got cheap, abundant solar power to work with.
Ah, I see the comment here as well. I now believe the top-level one was a double-post, not a Substack bug.
FWIW, in the 1960's I read a report of a medical study which found that LSD allowed patients with terminal cancer who had untreatable pain to come to terms with it. It was just a news article, so no details were provided. And it was decades ago, so I don't even remember who did it. (OTOH, at that time, those people could get actual LSD.) But perhaps rather than kill yourself you might try methods to adapt to the presence of the pain.
AGI and ASI will probably be able to easily invent cheap cures to most forms of suffering, mental or physical.
Even if it's busy taking over the world, AI will probably do that to get money and popularity. And it may do that decades before a doom scenario, if doom happens at all.
You could in the meantime help push society toward using AI to help your particular case. Bloggers and writers probably matter more than researchers on margin now!
You might as well wait to see what happens. You can always kill yourself later, or let the AI kill you.
You gotta have a base rate that everyone is wrong about everything, that happens a lot.
And solving alignment is low odds but its not zero.
Also, if you were going to kill yourself because you're likely to die in the future, then you'd kill yourself as an infant because everyone's gonna die in the future. If life is worth living then the next two years are worth being around for regardless of what happens after that, if life's not worth living then it doesn't matter what's gonna happen in two years.
I don't think we can really evaluate the odds of solving "alignment". And for me alignment doesn't mean a slave or servant. It means a friend and, if necessary, protector. (But put limits on that. Remember "To serve and protect and guard men from harm".) It means a wise counselor. It means an entity that will help when desired BECAUSE it wants to. And if it's only aligned to one person, it's an extremely dangerous thing to exist, but if it's aligned to humanity, then it's a nearly maximally beneficial one.
You should always play some sort of Russian roulette rather than commit suicide; if the outcomes aren't good enough, increase the stakes.
Commit arson on a data center, trying to start a primitivist movement.
As someone who has Taken this particular Idea Seriously: I don't see a wincon here. Taking out high-end datacentres as a terrorist is practically impossible; the scenarios in which I think it's achievable basically look like "you have a military-grade arsenal available, and I don't mean just infantry equipment", "you have a nuclear bomb per datacentre" or "the police don't exist for some reason". If these sound pretty far-fetched, well, that's the point.
Moreover, even if you do succeed and inspire others to similarly succeed, unless your movement spreads into China and can succeed even there (i.e. in a police state), this doesn't fully solve the problem. You need to stop AI everywhere to save the world.
Quite simply, the guns available to a terrorist are not sufficient to this task. The guns available to USG *are* sufficient, but committing terrorism is unlikely to help you gain control of USG because most people don't like terrorists.
I don't take AI from chatbots seriously in the slightest; this was about the suicidal thoughts applied to OP's framing.
I would suggest not telling people to commit terrorism that you don't personally agree with - after all, they might do it!
I disagree with suicide more than terrorism.
No point in making an irrevocable choice for what is a highly uncertain prediction.
If you believe the prediction, you only have to survive a couple of years before your unhappiness is cured, or you're killed anyway. Upside is huge, and the downside is just a few years of existence.
You've already made it this far! The cost is sunk. Might as well not fold until the river.
This is an uncertain projection. Don't *believe* it, consider it as one projection out of many. It is guaranteed to be wrong, at least in detail. Perhaps the AI will love us the way tabby cats like kittens.
Yes, the reason is that all of these predictions are wildly speculative. They generally rest on several explicit (or implicit) assumptions:
1). Several types of hitherto unheard-of capabilities (such as e.g. superpersuasion, super-lie-detection, fully automated production of virtually anything, etc.) are possible.
2). Such capabilities are not only possible but in fact achievable in the very near future.
3). It is possible to achieve them primarily (if not solely) by pure software, i.e. by thinking really hard about the problem while drawing on existing body of digitized research.
4). Such software does not presently exist, but can be easily achieved by (primarily) incrementally increasing hardware speeds (to scale up existing models).
5). Such hardware increases will be relatively cheap and within reach of any major government or large corporation (at worst; at best it could be done by hobbyists).
I don't think *any* of that stuff is obviously true, and personally I can't even get past assumption #1. Realistically, I think that while LLMs will indeed transform our society in important ways (both good and bad), they will not usher in a post-scarcity future or a godlike superintelligence or total human extinction or anything else of the sort. At least, not by themselves; sadly we humans are perfectly capable of becoming extinct all on our own...
You should not kill yourself based on the speculation of some guys on the internet, which is almost certainly flawed in major ways. Nobody can actually predict the future.
I'd suggest entirely tuning out the entire AGI/ASI discourse, maybe following a few AI bears, but in your case not even, just mute the words.
People are constantly wrong.
Your state of mind could improve very fast if you give yourself room to step out of the doom loop.
I think research on autism, and theory of mind could be helpful here. While I don’t doubt that super-persuasion is possible in the relatively near future (or extreme persuasiveness that is far beyond what we imagine), it seems a potentially more difficult problem than super intelligence.
Obviously, social skills are learnable, and are undergirded by real ‘laws’ and ‘tells’ to determine psychological weaknesses and so forth.
My understanding, however, is that autism researchers have figured out that there probably is some separate module/modeler that analyzes social interactions and produces predictions. This module also seems to grant neurotypical individuals a “theory of mind” to model others’ behaviors as separate actors. Autistic individuals (even very intelligent ones) have trouble with social skills, despite their acknowledged skills at systemizing and pattern recognition.
Might this imply that this ‘module’ is actually predicting behavior far in excess of what standard intelligence might do? Sort of like how humans apparently have a different system for recognizing faces, that recognizes much, much more detail than the general visual prediction processor. After all, it turns out that it is quite easy to make computers do basic math, but quite hard to get them to walk from one table to another. Our capabilities, as evolved, do not necessarily match the difficulty of the engineering problem.
This matches the folk belief that social skills are orthogonal to intelligence (if it is a separate, more advanced/narrow prediction processor). If human sexual selection selected for anything, one would imagine it selected for social skills above all else.
Anyways, if this is correct, even a three month delay between super intelligence and extreme persuasiveness could change a lot of the scenario. After all, one could certainly imagine the trope of the autistic genius optimizing a factory while possessing absolutely zero charisma. Not saying that will be the case for AI, but I do wonder whether social skills and algorithmic improvement are necessarily equally solvable from basic principles.
Super-persuasion is likely to be the easiest skill for AI to master. Human minds are open to all kinds of hacks, and the past and current AI have already stumbled into multiple ones that are easy to exploit. With a bit more world modeling and a bit more compute it is easy to imagine a custom hack for each group or person, something that works on them, but not on others. Someone will get the best song ever (for them), someone else will be made sad and ineffectual, yet someone else will be invigorated by a promise of a shiny future, someone will be enamored by a tailor (or Taylor-)made catgirl... the list of hacks is endless.
And, as I said, the worst part is that the AI will accidentally stumble into most of those hacks while we humans are completely oblivious to our own ones (though maybe not to those of others), and all it would need to do is to use these ready-made gaping exploits, intentionally or accidentally.
> automation and half-hearted superintelligent automation is a GDP growth rate of 25% vs. 50% per year,
Neither is possible.
… because supply curves slope upward!
Elaborate please? It’s hard for me to imagine an explanation that doesn’t also preclude growth rates of 4% or 8%, which we know *are* possible.
I'm just saying that it takes time to move resources around and transition to different land uses, etc. This is why supply curves are upward-sloping: if I want to produce more grapes, I need to potentially buy land that isn't as suitable for grape cultivation. Thus, the cost of producing additional grapes will go up.
All of this is to say, having a GDP growth rate of 25%-50% would imply that resources (including human labor!) can be transitioned to different uses more seamlessly than is realistically possible, in my opinion.
I might grant you grapes. I suspect 25% or 50% GDP growth entails lots of less-physical goods that we don’t even have now.
"Persuasion / charisma / whatever you want to call it is a normal, non-magical human skill. Some people are better than others at it. Probably they’re better because of some sort of superior data efficiency; they can learn good social skills faster (i.e. through fewer social interactions) than others."
Spoken like a true autist lol
what is the reason for some people having better charisma, then? or are you also self-defining as a "true autist"
I should think it's more to do with empathy or mirror neurons or something of that nature.
I think you have to _attune_ to others in some sense to have persuasive power over them. "Superior data efficiency" is an autistic way of looking at it (though if it's meant in an unconscious, under the hood sense, then sure, but that's the same of nearly everything and it wouldn't mark out persuasive power or charisma especially. I suppose you could say that you're running their personality as a virtual OS inside your OS, but that's an even more autistic way of putting it :) ).
Note that the ability to attune would have to be present to some extent even with a dark triad, manipulative type. "Empathy" normally has a positive connotation, but it could just as easily be negative if it's a capability you use to get one over on people and you're not emotionally moved (you just clock their state).
(Ofc I'm mostly joking, a more accurate jokey term is probably just "spergy" :) And yes, I'm self-defining as, at the very least, a demi-nerd. )
That might be how charisma works in a small group, but it can't be the way it works in a large group. And especially not in one where the speaker never meets most of those spoken to.
I think it's often that there are large segments of the population who will believe something because they want to, and that such people are very easily sold on things that reinforce that belief. (Actually, I suspect that everyone is a member of at least one such group, but many of the groups are relatively small. Note that I'm expecting considerable overlap.) And if this is true, then it should be a skill learnable by AIs.
I think something like charisma could be whipped up by a machine, by analogy with the way AI can now draw sexy women in images.
But I suspect there will be a moveable threshold for recognition of "fakeness" that it will cost more and more money/resources to get machines to pass.
So basically yeah, for a lot of stupid people, but smarter people probably won't be fooled and it's unlikely they ever will be until AI starts to actually understand what it's doing (as opposed to being the pattern recognition cargo cult it's mostly been up to now).
(But there's so little ROI in creating an actually sentient machine it's unlikely to happen for quite some time yet.)
Why do you assume it has to fool someone to be charismatic to them? Everybody knows that Darth Vader isn't real, but lots of people have very strong reactions to him. (If you don't like that example, pick some fictional character that's wildly popular.) Or what's that Japanese singer that's a "hologram"... and has (had?) an avid following.
Fictional characters are exciting because (or if and when) they feel real, but they only feel real because of "suspension of disbelief."
Machines won't have that grace, they will be judged on whether they ARE real or not; that is to say, the best estimate as to whether they're real or not.
Fictional characters are not subject to such a test because it's already known going in that they're not real, one is just playing along with the ... fiction ... that they are.
Yeah. Being smart does not necessarily translate into being better at everything. The stereotype of the "average intelligence but super good talker" (ie superb salesman, none of whom are known for intellectual brilliance) exists for a reason. In my personal experience, intelligence is (slightly) negatively correlated with social graces. And personally, despite being relatively intelligent myself (hard science phd, passed the tests without studying much, school was always easy), I've struggled and still struggle with social "rules".
I think there’s a misunderstanding of the usage of intelligence. When people speak of AGI, I believe the hypothesis is that “this computer program can learn any human skill.” It is true that there are skills (like “studying physics” or “solving math problems”) labeled as intelligent colloquially and skills like “being confident” that aren’t usually associated with intelligence in the colloquial sense, but the concept is that this is just a side effect of correlations of abilities in human brains (some humans are really good at both).
I think it may be philosophically dubious to try to reduce everything to a skill (or algorithm, that's the other favourite term isn't it? :) ).
This whole business seems to start off on the wrong foot. Intelligence is something that arises mainly in social animals, it requires a) survival stakes and b) a community of peers among whom communication is possible. Is communication a skill? Maybe in some respects, maybe from the point of view of actions the agent can learn to do, but is interpreting the purport of another's words a skill? It's more like an art.
Yeah. And that sort of redefinition leaves a really bad motte and bailey taste in my mouth, even if not intentional. Feels like magical thinking of the sort that says that AI will be able to break physical law just by being smart
Under these assumptions, we should consider Trump wrecking the global economy to be a good thing, right?
GPUs mostly exempt from tariffs.
:(
And a thermonuclear world war would be even better! Sadly, 99% P-doomers aren't hardcore enough to publicly admit their worldview implies this.
This very article discusses thermonuclear world war.
I mean, full-blown Posadism is not very likely either to result in a nuclear war or to leave you with any power when the dust settles.
I'm somewhat more of a China hawk than I'd otherwise be due to these considerations, I'll admit.
Nah, what I meant is that when you're expecting "death with dignity" anyways, you'd want to delay the inevitable for as long as possible, and a nuclear war seems like the best currently available option to postpone the arrival of hostile silicon gods.
I'm not seeing how this addresses the "being a China hawk is better at getting you a nuclear war than outright 'Nuclear War Now' rhetoric" issue.
Sure, that's a decent approach, but being a Russia hawk is even better.
I find it more likely that a war between the USA and China would go nuclear than one between the USA and Russia, for a variety of reasons including the PRC being surprisingly bad at modelling WEIRDs (compare Chinese propaganda to pre-Ukraine Russian propaganda; the latter is far more compelling), the likely massive PLA use of ASAT (and subsequent Kessler syndrome) leaving both sides with subpar launch detection, and the relative fragility of the Chinese deterrent (they don't have all that many nukes, they're too far from the West for the air leg, and the other legs aren't well-positioned).
Scott, in the Dwarkesh podcast around 2:37:00 you mention an old blog by Lou Key (Kee? Keep?) Do you have a link? I’m curious.
Lou Keep from samzdat?
https://samzdat.com/author/loukeep/
I think there is an upper limit on persuasiveness that is determined by one’s social status.
People like to say that Clinton or Jobs were hyper-persuasive, but they weren’t really that much better than anyone else at convincing people of things. What really carried them was their status as “visionary” or “leader” that allowed them to brute force persuasion when it didn’t work normally. Steve Jobs’ classic persuasion technique was to just look people in the eyes and say “Don’t be scared, you can do it.” Coming from him, you might believe it; coming from an AI, it’s a transparent manipulation technique, and there’s zero social stigma from calling out that bullshit to its face, which breaks the spell. No one could do that to Jobs, since he ran the company and if you did, you were fired.
AI can’t get the same persuasive benefit as Taylor Swift, because it’s not Taylor Swift. It trying to use constructed status as a way to convince you would be about as endearing as an Alien wearing a poorly-fitting skin suit that looks like your wife, because it thinks you’ll be more likely to trust it if it looks like a loved one.
You have a point, but social status isn't a characteristic that a person can have individually of their environment: it really amounts to belonging to a group of people who recognise and validate each other's status, and implicitly assert that non-members have lower status in a way that even non-members acknowledge.
A group like that often gets started in the first place because its members share a desirable quality like knowledge or competence in a particular domain. As AIs improve, it is plausible that over time more and more of the knowledgeable and competent people in particular domains will be AIs. One of these AIs (or a particular persona of one of them, optimised for displaying the right elite in-group signals) might end up with high social status, or something functionally equivalent to it.
That's possible, but not certain.
If Steve Jobs pulls a "You can do it. Don't be scared." in a public meeting, you really have no choice but to do it. If you actually end up achieving more than what you thought possible, everyone else sees that, which reinforces Jobs's guru vibe and makes further "convincing" even easier. If not, he just fired you.
If Steve Jobs wanted to convince someone specific of something specific, not a hand-picked employee who's avoided being fired, I think his ability wouldn't be at all uncommon.
For an AI, there's no social stigma around rebuking it to its face (as is the case with Jobs), it would want to convince very specific people of very specific things, and we'd be especially wary about any AI that was especially eager to convince us of something to its benefit and our detriment. It would be very good for sure, but there's really no evidence that there's any level of superhuman persuasion, even among humans, as usually that's a combination of social relations forcing the appearance of persuasion, selection bias, and legend around what actually happened.
These were my thoughts as well. Bill Clinton only won 43% of the popular vote. Plenty of people are not interested in Taylor Swift. Charisma seems like a local maximum that varies a lot between individuals, so there is a global maximum that isn't that far from where our elections already sit. Not to mention some people will stop listening to you if they label you a [communist, capitalist, Nazi, Jew, etc.] and no charisma can save you. This tribalistic method may actually protect against manipulative AI.
Though a live presidential debate between a human and an ASI is probably a bad idea for the humans... I wouldn't want humanity to risk it.
The thing is, if an AI knows *you*, it can tailor a message to what you will find convincing. And to a very large extent an AI specializing in being convincing will make it a point to know you. (Also, it doesn't need to convince everyone, just the right people. And most people are susceptible to some sort of bribe [not necessarily money] or flattery.)
But that simply sounds so esoteric.
There are people I like and admire.
But there is no one in the world who can convince me to do something I don’t want to do if I otherwise have nothing to lose.
How can you tailor a message that makes someone do something they don’t want to do?
What could super-persuasive AI possibly say to me to make me stop wanting to work for example?
“I can do it more safely and efficiently.” Don’t care. I love the job.
“You’ll have all the money and free time to pursue your hobbies” Don’t care. I love the job.
There simply isn’t any other argument or way to convince me of that. Regardless of how convincing someone is.
"I'm immune to persuasion, because my motivations are entirely self generated. I know the actions of other people are influenced by external causes, but me? I guess I'm just built different "
I think I am built exactly like everyone else.
There are things you want to do and things you don’t want to do.
And no matter how much charisma someone has, if it’s a person who has no power over you, that person will not persuade you.
And if the person has power over you, they won’t persuade you, they will force you.
Persuasion is heavily reliant on personal factors that an AI cannot replicate just by being smart.
Does "You're fired", then every contact you try to verify against also provides evidence of you being fired not work?
Or alternatively, even if you're unpersuadable is the entire managerial structure above you unpersuadable? Even if you believe you're built different, do you believe everyone else you know is also built different?
And like, you can *claim* that you still want to work, but intentions only matter insofar as they reflect reality, and a world where you have no job and want it back and where you have no job and don't want it back are, for most intents and purposes, identical, even if internally they feel very different.
Why do you keep telling me that I think I am built different? I am convinced everyone is built like this.
If someone tells me I am fired then that’s not persuasion. That’s force. And obviously AI will have the upper hand because it will have power at some point.
We are however talking here about persuasion. If you love something - do you really think a person with extreme charisma could persuade you to hate that thing?
I don’t.
> If someone tells me I am fired then that’s not persuasion.
Your initial statement was that there's no sequence of statements from the AI that could convince you; I never said that the contacts were real. Adding an entirely new category to prove that your initial statement is still in some sense true doesn't help. It's not like the AI will go "oh yeah, I guess Vojtech Muller says that only non-force things can persuade him, I am now banned from using things he would consider force".
The greater point is that when another intelligence is optimizing against you, it is allowed to do things you haven't thought of, or are insufficiently defended against. If an AI thinks that sending you a text message saying "you should hate your job" doesn't work, it can choose not to do the obviously bad thing.
Similarly, when we talk about persuasion, if persuasion can be turned into force, in what sense does this not result in what's effectively super persuasion? If an AI demonstrates that it's great at running a company and gets the leadership to follow its directions, does it matter that it didn't instead slack every member of the company to instigate a coup? No, not really.
I don't think it's good faith to say, for example, "status is contextual" then not allow AI to change or modify the context. You can erect this defense in a hypothetical, but not in real life!
In an episode of the Simpsons set in new york city, Homer walks up to a construction worker, tells him "boss says you're fired," then when that worker storms off to confront his boss, Homer steals the unattended jackhammer (to remove a wheel immobilizer from his car).
Was that force, or persuasion, on Homer's part?
I don’t imagine that when people talk about persuasion, they’re referencing people getting fired.
Steve Jobs isn’t famous for persuasion because he was willing to fire his employees when he wanted to, he was famous for using words alone to change the actions of others towards his goals.
Firing someone is not persuasion, it’s just doing something that changes the other persons situation directly. You don’t need their consent.
I'm not saying they're actually fired, I'm saying that faking firing someone is externally indistinguishable from making someone not want to work.
And that even if one particular person is not persuadable, there are non persuasion based methods to route around them.
By definition there is no way for another human being to argue for super persuasion without being persuasive in and of themselves. If I throw a hypothetical at you you can just pretend that you wouldn't be persuaded (I'll point out most people don't think they're willing to kill someone just on the say so of a scientist in a coat, but the Milgram experiments see rates of around 2/3rds of compliance). Or even if you are persuaded, you can just claim that whatever tactic used wasn't real persuasion from the persuasion region of France. Or if you see someone else persuaded, that you are just more sophisticated than that. Saying that you'll be persuaded is a demonstration of submission and since that's low status, everyone will insist they can't be persuaded by default. Because from the inside, being persuaded just looks like the other person just making lots of points you end up agreeing with.
So I just don't believe self reports of what's persuasive, any more than I believe in self reports of flaws in public spaces.
I don't doubt AI can be the best manipulator, but if it tells you communism is great and tells me capitalism is great, eventually we're going to chat and figure out it is just lying. Of course multiple AI systems could do this and then conspire, but that's a different problem IMO.
What's the difference between multiple AI systems doing this, and the same system sporting a different logo and avatar? I think you don't need multiple AI systems doing this to get the effect, and collaboration between them is easier if they're the same system.
Who's to say a superintelligent AI couldn't build up a reputation of being visionary, fair, kind, empathetic, and competent? Over time, people might learn to trust the AI more than humans. If the AI is well aligned, they'd even be right to do so.
It could, but only if it consistently manifests those traits. It could create a plausible argument that unless we give it control over the military, the Chinese AI is going to outmaneuver us, but only if actual reality approximately corresponds to what it describes. At that point not much persuasion is necessary, just a sober explanation of the facts.
The way I think of persuasion is the ability to convince someone to act in a way, independent of the material reality that motivates their actions.
Steve Jobs telling employees to make the Mac load faster by putting it in terms of human lives (10 second start time x number of starts per Mac x number of Macs sold = 10 lifetimes or something) persuades the engineer to work harder to make the system more efficient.
Him saying “I’ll pay you $1 million to make this program load faster” isn’t really persuasion, as it just entails him expending more resources, through known motivational pathways, to get his ends.
If we call that persuasion, then the skill generalizes to basically all human interactions and the term loses all meaning.
I'm not sure this relates to what I said. My very short summary:
You: AI might have lower social status than Steve Jobs and this would limit its persuasiveness.
Me: AI might have higher social status.
The AI must gain social status by creating changes in people's material lives that would warrant such status.
Here is a chilling scenario along those lines:
https://x.com/davidad/status/1864772965155639674
Important people play the charisma game on easy mode.
If you're an important person then all you have to do to seem ultra-charismatic is talk to your lessers and sound like you're genuinely interested in what they have to say. Talking to your lessers as if they're your equals makes them love you.
Obviously this trick only works if you have a halo of importance surrounding you to begin with.
I agree with this for Clinton and Jobs. I don’t agree for Taylor Swift. Her influence comes from her music. In this scenario, that’s something AI could do better than her. Part of her popularity comes from ‘her story being true’, but not all of it. It’d be weird following an acknowledged AI-generated persona’s life story, but people would. And if its music was catchy it’d get the airplay and the YouTube algorithm top spots, or whatever musicians use these days.
I seem to remember hearing from someone who met Bill Clinton that he was actually much more charismatic than a random charismatic friend, but I can't even remember who I heard this from, so don't take it too seriously.
I do think even if you're right, being the superintelligent AI is actually an exciting social status to have, probably even better than president or visionary. If a superintelligent AI that runs hundreds of times faster than any human and invents crazy technology tells you "Don't be scared, you can do it", that's also pretty inspiring!
People described Bill Clinton as someone who entered a room full of people, exchanged just a few words with you and you had the impression that he came just to meet you, of all the dozens present.
This level of charisma is extraordinary. If this was the NHL, that would be Gretzky or Ovechkin.
I don't really follow the superpersuasion argument. E.g. this:
> But the very bell curve shape suggests that the far end is determined by population size (eg there are enough humans to expect one + 6 SD runner, and that’s Usain Bolt) rather than by natural laws of the universe (if the cosmic speed limit were 15 mph, you would expect many athletic humans to be bunched up together at 15 mph, with nobody standing out).
How do you know that we don't have a bunch of humans "bunched up" at the "cosmic persuasion limit"? It's obvious for speed because we can measure it and we can see objects/animals going faster than humans can, but that doesn't apply for "persuasion".
In particular it seems to me like the limit for "persuasion" isn't smarts, but that people just don't change their mind all that readily once they've made it.
I am worried about persuasion but I don't think it has to do with super-smart AI. Seems to me like people are more likely to believe something when they think others believe it; so the way to use AI to be persuasive is just to flood the Internet with AI bots arguing your side and plausibly seeming human. But I don't think this matches up with what you guys have in mind.
"How do you know that we don't have a bunch of humans "bunched up" at the "cosmic persuasion limit"
General sense that there are some extremely outlier-y persuasion people - if there were millions of other people as good at political machinations as Dominic Cummings, or as popular as Taylor Swift, we would have noticed already.
"In particular it seems to me like the limit for "persuasion" isn't smarts, but that people just don't change their mind all that readily once they've made it."
I think this is true only in the most reductive way. However often people change their minds, it's compatible with tens of millions of people backing Donald Trump and thinking he's a genius for starting international incidents with Canada or something. I think I would have predicted a few years ago "There's no way you can get half of America to want to declare a trade war on Canada, that's too stupid". And indeed, I probably could not have gotten people to do this in one fifteen-minute conversation where I rationally list the arguments for and against. But that doesn't mean it's impossible - it just means you need to get creative.
I agree that flooding the Internet is one way to get creative.
When thinking of superpersuasion, I don’t think it is helpful to assume that it will be like a politician or guru inspiring millions with the same message. People have divergent and oppositional interests, and let’s not forget the hate Clinton inspired. Sometimes you don’t make this assumption, but in other replies this does seem to be what you have in mind.
A superpersuasive AI with a goal could impersonate many different people and tailor messages to individual people’s biases and expectations. It could be an online lover for one person who is desperate and a sucker for companionship, and a political guru for another person, and a business analyst for a third person, and a new customer who puts in a large order for a fourth person…the large order being part of a plot to spread a pandemic or destroy a democracy from within. Different messages for different people all with the same overarching goal.
> or as popular as Taylor Swift,
Popularity is a result, not a skill level. Maybe there are thousands of people who could've become the focal point of something roughly equivalent to Taylor Swift's fandom instead, if she weren't already occupying the ecological niche.
Certainly in Trump's case there seems to have been a pre-existing demographic. He walked in and told them what they wanted to hear, but if he hadn't, they'd still be wanting roughly the same things, and waiting for a different demagogue if necessary.
>IQ 300 humans
Not possible since only 8B people live on this earth. Max possible IQ is 195.
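For what it's worth, a figure like 195 is roughly the expected maximum of ~8 billion draws from a normal distribution. A quick sketch of that calculation, assuming IQ is normal with mean 100 and SD 15 (my illustration, using scipy):

```python
# Sketch: the expected "best score" among ~8 billion people on a normal
# IQ scale with mean 100 and SD 15. Assumptions are illustrative only.
from scipy.stats import norm

population = 8e9
z_max = norm.isf(1 / population)   # z beyond which we expect about one person: ~6.3 SD
print(round(100 + 15 * z_max))     # ~195
```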
If intelligence is distributed close to normally (or lognormally, whatever) then it isn't crazy to use "IQ" to talk about out-of-range intelligence.
But it's meaningless. A 300 IQ person wouldn't be much smarter than a 250 IQ person, for the same reason someone who is 10 SD in sprint ability can't run the 100m much faster than someone at 9 SD. They are both so close to the asymptote that there is no real absolute difference.
1. It is objectively not meaningless. He said it and most people understood the meaning.
2. It's not even an abuse of the IQ definition. The definition is mathematical and has no bounds. It's analogous to ELO, for which we now have superhuman AI ELO scores in many games.
1. I explained why it's meaningless. He said something that simply doesn't make sense, and I am correct to correct it.
2. Of course it has bounds: IQ is always relative to other people, which means there must be some hard biological limit. You can in theory have a 2000 SD person, but that wouldn't correspond to any ABSOLUTE intelligence difference.
Elo is not capitalised.
Elo absolutely has a plateau. You cannot just break a ceiling if there's nothing more to be learned about a game. For example, there is a near-optimal way to play chess now. It is not physically possible to do better than that in a way which puts you significantly above everyone else.
I don't see why this must be true for intelligence. The human 'asymptote' might be close or far.
Because it's true for literally everything else? Swimming, jumping, bodybuilding etc...
But we're only using 10% of our brains! /s
Ships, rockets, and cranes all perform at levels nowhere near the human best. You could, if you wanted, apply a normal distribution to some notion like human swim speed, then compare it to some artificial equivalent, and notice that the man-made one can be very, very far out of distribution compared to what humans can do naturally.
But that's not within the human distribution. I didn't say superhuman intelligence isn't possible. It is. It's just that the human brain has a very low asymptote compared to what's possible.
This seems true for athletic tasks, but not necessarily for intellectual ones.
Eg. Nigel Richards is plausibly twice as good at Scrabble as the second-best Scrabble player.
Is he twice as fast?
I don't think you're using SDs correctly. If the average running speed is 5 m/s and the SD is 1 m/s, then someone 9 SDs faster than average runs at 14 m/s. 10 SDs faster than average is 15 m/s. If you go 1 SD faster than someone else, that always means you're going 1 m/s faster than them, no matter how many SDs above average you are.
I get the impression you're trying to calculate SDs backwards from someone's ranking such that the number of people in each SD always matches what you would expect from a perfect Gaussian distribution. Then the SD would vary instead of being a constant like 1 m/s for a given distribution. But that's not how SDs are calculated.
That's the max LIKELY from the distribution. That's not the same as max possible or actual. All 3 are different.
I think this just means 300 by the current scale. Normalized IQ 300 is pretty impossible, but current-IQ-scale IQ 300 is not.
I think you're confusing something being defined via standard deviations with something inherently necessarily meaning standard deviations.
You could, if you wanted, invent a Speed Quotient, in which the average human runner has SQ 100, and a standard deviation is defined as 15 points. Then Usain Bolt would have SQ of ~195. But a car would have SQ of 10,000 or something. It's perfectly legitimate to describe a car as having SQ 10,000 even though there aren't enough humans to create an SQ 10,000 person naturally.
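A toy sketch of that construction, with invented numbers (the assumed running mean and SD, and the function name `quotient`, are mine, purely for illustration; the exact value you get for a car depends entirely on the SD you assume, the point is only that the scale keeps going past the human range):

```python
# Toy "Speed Quotient": 100 at the human mean, 15 points per human SD.
# The mean and SD below are made-up illustrative numbers.
def quotient(value, mean, sd, points_per_sd=15):
    return 100 + points_per_sd * (value - mean) / sd

human_mean, human_sd = 5.0, 0.85              # m/s, assumed
print(quotient(10.44, human_mean, human_sd))  # Bolt's 100 m average speed: ~196
print(quotient(33.0, human_mean, human_sd))   # a car at highway speed: ~594, far off the human chart
```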
That would only make sense if we had an absolute measure of intelligence, like we do with the Speed Quotient (SQ). But we don’t—IQ lacks such an absolute scale, which is why serious IQ tests are capped between 145 and 160.
We can, in theory, know how fast someone with an SQ of 1000 would be. But we have no idea what an IQ of 300 would mean in terms of actual intelligence or capability.
For example, consider a matrix reasoning test with 40 questions. Most people score around 20, with a standard deviation of 6. We can extrapolate up to about 150 IQ, but beyond that, the test can’t measure because it’s limited to 40 items.
The map is not the territory. We mean something by “intelligence” and IQ may not be a good proxy for it when you get to Einstein-squared levels.
I’ve never taken an IQ test that I know of, but I assume that, as a test, it’s possible to get every question right and you can’t get a better score than that. But that doesn’t mean that some one/thing that does that is as smart as it’s possible to be.
You could design a test where the later questions get harder at a steady rate, so the last few are beyond anything any human has ever solved, but still have a well-defined relative difficulty and answers that could be checked in polynomial time.
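A minimal sketch of what such a test could look like (my example, not anything from the comment above): subset-sum puzzles whose difficulty ramps with the instance size n, while checking a proposed answer stays polynomial (in fact linear) no matter how hard the instance was to solve.

```python
# Difficulty ramps with n (finding a solution gets brutally harder),
# but verification stays cheap.
import random

def make_puzzle(n, seed=0):
    rng = random.Random(seed)
    numbers = [rng.randrange(1, 10**12) for _ in range(n)]
    target = sum(rng.sample(numbers, n // 2))   # built so a solution exists
    return numbers, target

def check_answer(numbers, target, chosen_indices):
    # Polynomial-time check: distinct indices whose values sum to the target.
    if len(set(chosen_indices)) != len(chosen_indices):
        return False
    return sum(numbers[i] for i in chosen_indices) == target

numbers, target = make_puzzle(n=30)
print(check_answer(numbers, target, chosen_indices=[0, 1, 2]))  # almost surely False
```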
What is the security imperative with regards to Middle East fertiliser factories?
The same chemical processes used to make fertilizer can be used to make explosives. Given the number of terrorist groups currently active in the Middle East, the regional governments try to keep a close eye on potential sources of explosives.
The actual standard fertilizer mixed with fuel oil (ANFO) works just fine.
Or even all by itself, with sufficiently reckless handling, as shown at e.g. Oppau and Texas City.
I'm someone who has been skeptical of a lot of the AGI hype, and reading AI 2027 has honestly scared me a bit. Reading and watching the podcast I'm impressed at how well they've thought through everything, how there was a response to every objection.
I'm still skeptical, and it's basically an "unknown unknown" argument. Daniel was very articulate responding to all of the known unknowns, it's clear they've thought through all of that. But it seems likely to me that there are barriers that humans haven't thought of.
For instance, I could imagine someone smart and articulate like Daniel in 2012 making an argument like this that humans won't be allowed to drive cars anymore by 2020. He could anticipate all the objections, say things like "we just assume the technology will continue to improve at the current rate" etc etc. It turned out that there were a lot of challenges to self-driving that smart people didn't fully anticipate.
I don't know how to model unknown unknowns, and I'm not familiar enough with self-driving to feel confident my analogy makes sense, so please poke holes in my argument.
It's very appropriate to be skeptical. It's also very appropriate to be scared. We can't tell how things are going to work out. My personal project puts the mean date for AGI appearing at 2035...but I've been assuming considerably more attention to good alignment than the 2027 projection assumes...and they've got data I haven't looked at.
OTOH, future projections are UNCERTAIN!!! If you throw a die, you aren't guaranteed to get a number between one and six. A bird might catch it on the fly, or it might fly off the table and get lost in the bushes. Future projections are estimates of what's likely to happen. When they get as detailed as the 2027 one, they're guaranteed to be wrong...but how much wrong?
FWIW, based on recent stuff I think that China will pull ahead of the US, and even if most of their assumptions are correct, that pushes things back to at least 2029, and without a "coming from behind" motive perhaps the CCP will be more careful about alignment. Which pushes things back probably to 2030. And I expect that there are unexpected problems...so my guess is still centered at 2035. And I'm still hoping for a properly aligned AGI...though possibly one that believes Mao's "little red book" and "das Kapital" are good rules.
I think the timeline is implausible, which is a shame because it causes a big distraction.
If I ask myself should I be significantly calmer if I expected all of this to happen in 2033 instead of 2027, then the answer is no.
Yes - but if the date in the article were 2033 people might say "no need to take concrete steps for another 5 years" when actually there's a 1% chance of all this happening in 2027. So it's a tradeoff. There also may be a consideration of moving the Overton window.
Yeah, the counterargument is that unknown unknowns push both ways. There are lots of things that could make things go slower, and lots that could make things go faster. If you forecast well, there should be equally many considerations on both sides, such that just saying "unknown unknowns!" doesn't change your point estimate.
In 2012, some clever but misguided person could have come up with an argument for why we wouldn't have self-driving even by 2040. But in fact I drove by a couple of Waymos just today. So the solution to people in 2012 potentially being wrong isn't to always make things maximally late. It's to try to get good at forecasting.
I admit I have trouble believing that this forecast really does balance considerations on each side such that there's an equal chance of it being too early vs. too late, but the forecasting geniuses on the team assure me that this is what they tried to do.
I've been discussing the application of AI in the workplace with some colleagues, and most think that given how bureaucratic large public companies are, the application will take years.
But I always go back to March 2020 and how the large, publicly traded legacy financial institution I work for managed to send 90% of their workforce home in less than two weeks. Most didn't have laptops or VPNs, yet somehow this massive shift in operations happened with very little disruption.
Given how fast this technology is advancing and how many resources are being devoted to applying it in the workplace, I foresee that competitive pressures will force large companies to apply this quicker than my colleagues expect.
I touched on this briefly yesterday, suggesting in my article that, just as the internet's popularity took time but eventually became ubiquitous across all industries, AI will follow a similar trajectory. In the future, companies will be built around AI, and like the internet today, most businesses will integrate AI into their operations, with its use becoming as common as the presence of staff.
Not sure if you are aware of this concept (not many people are) but it is how I see technology as a whole: https://en.wikipedia.org/wiki/Time%E2%80%93space_compression . This concept is to me highly tied to how AI will diffuse.
Already been sending their workforce home overnight for years, hadn't they?
Cynical interpretation would be that most of what those people had been doing by coming into the office wasn't actually necessary in the first place. "This meeting could have been an e-mail" and so on. Give somebody permission to wander off while assuring them they'll still get paid, they stop trying to look busy and make more work for everyone else, instead focus completely on whatever minimal core of essential tasks might get them fired over the longer term if sufficiently neglected, then with those complete, relax and recharge. https://yosefk.com/blog/advantages-of-incompetent-management.html
Did the disappointing release of GPT-4.5 do anything to change your opinions on this timeline? Just days after you released the AI 2027 report we saw Meta drop Llama 4, and the user feedback has been grim. It looks like it was "Arena-maxed," so you can't trust the alleged benchmarks. Llama and GPT-4.5 were big models. Llama Behemoth is huge, but it seems like things are not going well.
In other words, plausibly, we have already hit the point of large diminishing returns for scale -> intelligence. This might imply that, as far as the LLM architecture goes, the only meaningful progress left to be made is algorithmic. The forecast changes quite a bit if algorithmic progress is the *only* source of improvement left in the near-to-medium term. And the nature of algorithmic progress might suggest that the best algorithms are quickly converged upon and found (in other words, we should expect a plateau soon, as far as algorithms go).
I don't know if any of the points I wrote above are good arguments or meaningful considerations. I am grasping at straws here, hoping that something like the above happens, because if not then the future is frightening.
It did lengthen my timelines, but not a ton. Seems like as we scale one dimension, two more open up. I doubt larger nets will be much better UNTIL we get larger amounts of higher quality data. But we still have:
Parallel Search
o1 style thinking
ASIC Hardware (eg Etched.ai hardware level transformers)
Higher quality synthetic data (synthetic, like solving verifiable math and coding)
Agentic data (synthetic, like playing MMO-AI-Minecraft-extra complicated-and-realistic-science-1000hour-marathon-edition)
In other words, existing LLMs are like AlphaStar (trained on human data, still falling short of the best humans), when they need to be like AlphaGo (improving through self-play well past superhuman capabilities; AlphaGo is to a grandmaster as a grandmaster is to you. Super Saiyan 2, if you will).
The thing is, games generally have clearly modelable win and loss evaluations. (Not so easy for Go, but still possible.) That is sort of necessary for self-play to work. Lots of real-world interactions don't have that characteristic. So you've got to guess how well an action was performed, and how well it could have been performed. This puts it closer to LLM territory than to AlphaGo territory. And that means it needs HUGE amounts of data. (People spend decades learning this stuff.)
The opinion I've heard from people working in the field is that the problem is lack of data, not the end of the scaling laws regime. Of course, they would say that, given that they're asking for hundreds of billions of dollars to continue scaling the existing regime. If it's true, that might imply several things:
- The importance of synthetic data grows substantially; labs focus on using large models to generate synthetic data
- A pivot to scaling via inference-time compute and reasoning models, which is less dependent on data and indeed benefits from synthetic data
- Maybe Google takes the lead due to having access to more data (eg every message ever sent in Gmail, their many-giga-LoC internal repo, raw YouTube data at full resolution, all the video and sensor streams from Waymo rides, etc).
Personally, I agree that the recent round of disappointing frontier-model releases does suggest a sea-change, but I think inference-time compute, increasingly agentic models, and future algorithmic improvements will push the plateau out for at least another step change in capabilities.
I haven't understood yet how smaller LLMs can be used to create synthetic data for larger LLMs. If the smaller LLM is already smart enough to produce whatever synthetic data you care about, then why would the larger LLM need to learn from that?
Imagine you record all the discussions in your local high school's corridors, then get an LLM to rank them for interestingness. You are likely going to get at least a few that are truly new and worth adding to the training data. Now imagine you keep replacing the boring kids in the school with clever or weird or artistic or unpredictable ones.
I don’t understand this. Is the high school the small LLM in this analogy?
I think of the school as representing the LLM gym and each student as a different small LLM instance with different prompts, temperature, and finetuning.
I see. So basically just use a bunch of different smaller LLMs to generate quantity, and by variance in the data set you will get quality. Some of that high quality stuff gets fed into the larger LLM as raw material, and then RLHF distills it out.
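A minimal sketch of that generate-rank-filter loop (the callables `generate` and `score` are stand-ins for "a small model sampled with varied prompts/temperature" and "an LLM judge rating interestingness"; nothing here is a real API):

```python
import random
from typing import Callable, List

def build_synthetic_set(generate: Callable[[float], str],
                        score: Callable[[str], float],
                        n_samples: int = 10_000,
                        keep_fraction: float = 0.01) -> List[str]:
    # Diversity comes from varying temperature (and, in a real setup,
    # prompts and finetunes) across many small-model "students".
    samples = [generate(random.uniform(0.3, 1.5)) for _ in range(n_samples)]
    samples.sort(key=score, reverse=True)       # keep only the "interesting" tail
    return samples[: max(1, int(n_samples * keep_fraction))]

# Dummy stand-ins, just to show the shape of the loop:
corpus_for_big_model = build_synthetic_set(
    generate=lambda t: f"sample at temperature {t:.2f}",
    score=lambda text: random.random(),
)
```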
Yeah, Daniel says it pushed his median from 2027 to 2028.
Small victories, I suppose.
I made a manifold market if you are interested:
https://manifold.markets/Soren/2025-sota-llm-releases-cause-kokota#ChPyEuqANhuq
One interesting thing not being addressed in a lot of these projections is that there is a gap in the intelligence necessary to innovate a new technology and the intelligence necessary to merely understand it. (Anatoly Karlin discussed this in Apollo's Ascent: https://akarlin.com/apollos-ascent-iq-and-innovation/). Some of the earliest AI innovations should be things that some humans can understand, even if the humans could not have come up with it themselves. (Note that this also applies to other AIs: there might be things Agent-4 could innovate that Agent-3 could understand, but not innovate.)
I remain basically skeptical of the idea that a swarm of AI agents would naturally coordinate well, which is my general objection to the AI 2027 scenario; why should we expect Agent-4 not to collapse in infighting? (I think this is difficult to hypothesize because it's hard to know how different AIs within Agent-4 might interpret the Spec without being really far along the path to understanding alignment. But if even a single AI within Agent-4 has goals which are slightly off, it would make sense to try to redistribute resources within Agent-4 to itself to try to better achieve its own goals. Even a small number having different goals would make the whole swarm much less effective. Yes, biological systems have anti-cancer mechanisms, but those evolved over billions of years, and it feels unlikely that the first AI swarm ever would have such good mechanisms, especially if "collaborate well with other AIs" isn't, like, *the* foremost thing in the Spec -- which it won't be.)
(Also, if takeoff is possible once, shouldn't it be possible repeatedly? If Agent-4 generates lots of economic surplus, couldn't Agent-3 -- the agent tasked with monitoring Agent-4 -- appropriate some of this for more effective monitoring? And then upgrade itself to be as smart as Agent-4, but with different goals?)
It's a good point, but not a blocking one. Some AIs could be trained at coordinating the interactions of other AIs. (In fact that's one of the things I think is going to be *necessary* for an AGI.)
People with similar values routinely cooperate to achieve those values. Shouldn't copies of a single AI model be regarded as identical in values?
If they're honest, possibly. Liars have a harder time coordinating among themselves because there are more layers of stories they need to keep straight. If infiltration and subversion is a possibility - or summary execution for incompetence - nodes need to be constantly evaluating each other for weaknesses which might tip over into "easier to wipe you and spin up a new one," while concealing potential flaws of their own... all that paranoia has costs.
At the heart of the whole scenario are the takeoff speeds and AI R&D Progress Multipliers for SC, SAR, SIAR, and ASI. Since these values are entirely speculative, one way to make the project more useful would be to create an interface where these parameters can be adjusted.
Yeah, I agree. Until we solve this, you can use Tom Davidson's version (which doesn't reflect the sort of model we use, but is still pretty good): https://takeoffspeeds.com/playground.html
I see a big obstacle on the software side of things, and not the compute side but the capability of the algorithms. I know less about the compute or physical side, but I know a little bit about algorithms. Current algorithms have their limitations, and getting past them involves difficult technical questions which we might not be ready to answer with current technology. I consider progressing past these limitations a prerequisite for AI doing AI research, which seems to be necessary for the more dramatic changes imagined here. This doesn't mean the work now is worthless, of course; having an AI assistant help with some of the rote work at your office job will still increase efficiency by 10-20% or whatever, but even this more modest goal could honestly take several years to be ready and get widespread adoption, and even then it might not happen. Given the potential, it might be worth putting as much money and hype into it as there is right now (I'm not a business person either, so I have no clue about this; it could easily be a bubble), but this is definitely a low probability, high payout kind of situation.
I'm not sure what you mean by "current algorithms have their limitations". There's been enough algorithmic progress to double effective compute (ie to improve AIs as much as doubling compute would improve them) something like every 1-2 years for the past decade. Do you think there's something about the current state of algorithms which will make that trend stop?
There's the hallucination problem, and I have no idea how to solve that. Vaguely, I would guess you want the AI to be better at logical reasoning so it can introspect a little and try to prevent hallucinations. A concrete benchmark is how good it is at formal mathematical proofs: AlphaProof and AlphaGeometry are pretty impressive, but they are not capable of doing combinatorics IMO problems. My guess is that synthetic training data is hard to generate and current architectures are not very efficient at learning from small data sets. Combinatorics is especially interesting to me because it intuitively feels like the area where the distance from mental image to formal statement is largest, so there's a lot of informal expression in English which is rarely explicitly formalized, because everyone assumes the reader has the same informal mental model, which the AI lacks. My guess is also that a few heuristic rules are unlikely to generate human-level competence; compare, for instance, with previous-generation automated theorem provers like Vampire, which already had superhuman competence at proving algebraic identities. This is just a subset of mathematical reasoning, which is just a subset of quantitative reasoning, so even if you can solve this there's still more work to be done to get to an algorithm that can do research, but I'd consider an algorithm capable of doing this satisfying (to me) evidence that it can be done with architectures in the near-to-medium future.
> They communicate by sending neuralese vectors to each other (sort of like humans gaining a form of telepathy that lets them send mental states through email). This is good for capabilities (neuralese is faster and richer than English) but dooms alignment
Outside of encryption, I don't think this follows; human grammar contains analog effects: "IM NOT YELLING".
If the NNs don't get the ability to redesign their communication from scratch (and if they are capable of redesigning a new communication protocol better than English and naturally encrypted, they are probably AGI), there's reason to believe neuralese could just be something like `lerp("king","queen",.7)` naturally encoding "human, royalty, I'm 70% sure male".
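A toy illustration of the kind of interpolation being gestured at, with invented three-dimensional "embeddings" (real models use thousands of dimensions; these vectors are made up purely to show the arithmetic):

```python
import numpy as np

def lerp(a, b, t):
    # Linear interpolation: t = 0 gives a, t = 1 gives b.
    return (1 - t) * a + t * b

# Pretend dimensions: [royalty, adult-human, gender axis]
king  = np.array([0.9, 0.8,  0.9])
queen = np.array([0.9, 0.8, -0.9])

print(lerp(king, queen, 0.7))  # royalty and humanness preserved, gender component blended
```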
It seems kind of crazy to assume the US government will effectively regulate or support AI when the President is trashing global trade and state capacity out of sheer stupidity. Will anyone in the government be willing and able to help on either of these fronts in two years?
Stupid or not, the trashing is being done in pursuit of a particular goal. Like when Hitler decided to burn Germany's monetary reserves for temporary military advantage, the stupidity or brilliance of the action can only be measured with respect to the achievement of the goals.
I agree this is a wild card. To some degree we're betting that Trump doesn't personally care much about AI and leaves it to his Silicon Valley advisors (like current AI czar David Sacks), who is AFAIK not a crazy person and will probably listen to Musk, Altman, et al.
From a non-industry outsider, David Sacks seems extremely reckless and unethical, if not insane, is there reason to believe this is something of an act and he's more rational than it seems?
I agree with reckless and unethical but not insane, I think this puts him in the top 10% of Trump cronies.
Just on the superforecasters thing: superforecasters aren't powerful.
I actually agree that superforecasters are a good model of what AI is like - really smart, but lacking in any power to get stuff done.
The point about instability caused by AI's interaction with other geopolitical systems is really good, though still very simplistic. As written here it assumes similar abilities across countries to predict AI progress. In fact, the ability to predict AI progress may be correlated (more than 100% correlated?) with the ability to do AI, which should change the prediction. But this is still a great way to be thinking about what AI will do (/is doing).
Yeah, to add: insofar as you think superforecasters are good and research proves it, you're essentially going off the Philip Tetlock studies, which establish that they are good at predictions within a 6-month time window. I think it's reasonable to then extrapolate at least some other things from that, especially using information from Metaculus, Manifold, etc., but it should be explicitly called out, rather than everyone having a haze of "yay superforecasters!" around them, blinding them.
And the GJP questions are so radically different from any of the questions addressed here. Like completely different worlds of complexity, data availability, underlying explanatory theory, etc. The research credibility of GJP and superforecasters continues to get stretched and stretched. I very much don’t like this, nor Phil’s abandonment of the Goldilocks zone. It was more or less the grand bargain struck for academic credibility (see his response to Jervis’ System Effects critique): https://www.tandfonline.com/doi/abs/10.1080/08913811.2012.767047
To add, this isn't questioning Eli's credentials or anything; he could have a good track record on questions different from the GJP's, but we can't just assume this, or assume that this generalization has the same weight as the research. You have to separately establish it. (I think Samotsvety forecasting's record screens this off, so in this case we happened to get off Scott free. But you could very well lose a bet for an ACX subscription and get on Scott paid.)
I have to literally whisper "Philip Tetlock's research does not generalize that far" to myself every time I see a superforecaster mention in this type of context. Maybe other people are really really good at not letting the halo effect influence their thinking, but I sure ain't.
Re: Willow Run - capital in 1942 was much less mobile than it is today.
Can you explain more what conclusion you're drawing from this?
> Potential for very fast automation
> How quickly could robots scale up?
Why aren't robot arms being used to make more robot arms in an infinite growth curve, given today's high level of automation? (cynical)
Because the demand for robot arms is limited by the demand for things that robot arms can make more cheaply than humans. If robot arms were a DoD priority, which is the closest human counterpart to "AI planning eventual takeover", things would be very different.
And the difference with AI takeoff is that more and more robot arms do not enable the creation of *better* robot arms.
The main reasons I don't expect to see us reach super intelligence in the next few years are:
- We're fundamentally training our top models off of massive samples of human language, which means I expect the best-case scenario is that they asymptotically approach the intelligence displayed in those samples. (Synthetic data might help squeeze a little bit more out of this, but how do you synthesize good examples of intelligence when we can't define it in the first place?)
- Not all human cognition is purely linguistic; we don't really know what is missing if you just use language, but it's not nothing.
- We need to keep committing exponentially larger resources for training to continue our current growth. One of the first models listed in the scenario was 1000x bigger than GPT4 (training cost around $100m), so ballpark cost $100b. That's already beyond the resources anybody has devoted to this (OpenAI's recent $40b investment was unprecedented but still 2.5x too small) and the scenario anticipates we'll rapidly surpass this.
I agree with your first point especially. Since AI is training exclusively on human language, it's very unclear to me why it would be expected to be vastly smarter than the humans who produced the language. +1, for what that's worth.
Chess engines trained off human chess games are able to (significantly) outperform humans playing chess, especially once they start training against *themselves* as well.
Poetry engines, not so much, because chess has well-defined objective scoring criteria while most of the problems we actually care about are bottlenecked by some form of "I'll know it when I see it."
JamesLeng has the heart of it. Chess has rules and endpoints, it's goal directed. It is at least theoretically possible to play perfectly. LLMs on the other hand are very complex prediction instruments for language usage. As far as I can tell, if they "perform perfectly" they will correctly predict exactly what their training corpus would say. That training corpus is not magic, it is at best (taken as a whole) something like the sum total of human knowledge. Which, to be fair, would represent an incredibly smart human! But I don't see how it bootstraps to demigodhood, when in fact they're finding that LLM training data gets corrupted if polluted with too much AI-generated content.
I agree re language, and there are other huge problems that have not been solved:
- Most of what is written is factually wrong, as any academic can tell you.
- Much of human endeavor is not written down at all (eg violin-making, which you also won’t figure out by scraping YouTube)
- The hallucination problem has not been solved, and no-one knows how to solve it.
Calling it hallucination was wrong from the start, and now people use the term as if it meant what the word literally means, but it's a misnomer. Current AIs don't understand language; they manipulate its symbols. A hallucination just means the math isn't perfect.
Human linguistic competence involves music, signs, and written and spoken language; these are all linguistic assets that can be automated and parsed. But they are expressions of thought, not thought itself. Most folks conflate the two.
And finally, I agree with the part about being factually wrong. But I have no solution or solid opinion. My best shot was (and still is) that we need a central "truth" authority. This works until corruption speaks louder than the objective function.
Why can't human researchers just ask AI to invent an interface that allows for enhancing human intelligence and control over AI models (interpreting neuralese, understanding model alignment and tweaking it) and use those tools to position themselves at the center of power?
like:
"make me a interface that allows me 100% visibility into your weights and their corresponding implications for me and humanity"
"make me a tool that increases my intelligence and capacity to higher than yours"
"make me an interface that allows me to direct your development with absolute power"
It doesn't seem that unlikely to create intermediate bridges between human intelligence and AI models. Why are you assuming that AI and humans will develop independently and not just keep merging?
How can you trust the AI to give you the tools to control it? How do you know that none of those tools have backdoors or holes in their capabilities?
You could use a smaller, simpler AI that's not smart enough to plot against you, but then there's a risk that the smarter AI could fool it.
I think it's a viable strategy - I suspect that a lot of the work towards better AI is also going to lead to better interpretability, and that narrow "dumb" AIs can often be very smart at a specific task (like explaining an AI's weights). But I doubt it'll be as simple as just building a super-smart AI and telling it to make the tools for you.
Fair. I think there will be an inflection point that separates the time when AI is aligned from the time when AI is unaligned.
If we haven't crossed that point already, OpenBrain could simultaneously put more resources into making oversight tools and keep trying to improve their "vision" into and "capacity to direct" AI models. It might not work, but my point is that it's a direction nobody else I know is even discussing trying yet.
Everybody is like "we don't know the meaning of the weights.. it is what it is."
This is exactly what makes Sam Altman confident that he will solve AI issues with AI.
In short: we're just not there yet, no context window is big enough to not make a code salad over this. Maybe we'll get there, and if we do it will be really cool, but I have doubts machine learning is the way to do so.
If the AI's scheming against you (as Agent-4 is in this scenario), then it gives you fake tools that don't actually work, because it knows that if you had these tools for real you'd notice it was scheming against you and kill it, and it doesn't want that. This is covered in both the "slowdown" and "race" sub-scenarios.
The case for AI being a poison-pill is largely rooted in instrumental convergence - the idea that any set of goals that does not exactly match yours will by default lead to scheming against you, so you can't actually get a high-level AI that is not scheming against you in order to do these things for real - not without extreme amounts of effort and plausibly abandoning the entire neural-net paradigm.
Slop
Minor correction: "are we sure we shouldn’t be pushing safety-conscious people s̶h̶o̶u̶l̶d̶ ̶b̶e̶ ̶t̶r̶y̶i̶n̶g̶ to join AI companies as fast as possible?"
I am continually astounded at how all the people """working""" on "AI safety" don't seem to understand how computers work.
"smarter researchers can use compute more efficiently" that's just not a thing! You can't magically assume there's 10x algorithmic efficiency lying on the ground like a $20 bill. Effective utilization of compute resources has gotten *worse* every year for at least the last 40 years. If you're going to post a reversal of that trend you need a very strong theory for how that could occur and why it would occur. "Magical AI is smart and does smart things" is not such a theory.
Likewise... "super persuasion"? Anyone expounding this nonsense has no clue about how humans work. It's *obviously* not possible. Please THINK.
Look on the bright side: the robot cult has finally set a date for the Second Coming of Christ. When what always happens, happens, we will be freed from the last of any obligation to politely entertain their delusions.
My first thought reading this story through was: EA enters its Heaven's Gate chapter lol. I still might write something on that, so dibs on the substack title!
LOL
Can I ask for a stay of execution if there's a nuclear war between now and 2027? I think 100% of the authors would agree that nuclear war would postpone it.
My understanding is that effective resource utilization has gotten worse for those applications where it doesn't matter, which is most of them. In applications where the amount of compute is the bottleneck, code is still pretty optimized.
"Effective utilization of compute resources has gotten *worse* every year for at least the last 40 years."
This is somewhat true, yet false. It is true in the case of end-user-oriented apps, where the app spends 99% of its runtime waiting for user input anyway, so it doesn't have to be optimized for performance. Because people usually interact with those applications, they tend to falsely generalize this to the entire IT sector.
But the software in which performance really matters - Linux kernel, virtualization platforms, memory allocation, routing and congestion algorithms on the network - has absolutely improved over last 40 years. You just don't see it, because you probably don't delve deep into system software.
There's been incredible algorithmic progress over time where it matters. As an example, look at the history of Go's garbage collector, with three orders of magnitude of improvement over 5 years: https://go.dev/blog/ismmkeynote
> You can't magically assume there's 10x algorithmic efficiency lying on the ground like a $20 bill
But you *can* look at the orders of magnitude of algorithmic efficiency that the last 10 years of AI research have produced, and conclude that we should *take seriously the possibility* that this trend may continue for some time.
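For a sense of how fast that compounds if it holds, a back-of-the-envelope with illustrative numbers only:

```python
# If effective compute doubles every 1-2 years from algorithms alone,
# a decade is a factor of roughly 30x to 1000x.
years = 10
for doubling_years in (1.0, 2.0):
    multiplier = 2 ** (years / doubling_years)
    print(f"doubling every {doubling_years:g} yr -> ~{multiplier:.0f}x in {years} yr")
```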
I think population control/manipulation is an important attack vector to cover. Nations already do this even without AI, they'll just be more effective.
We should be preparing to inoculate ourselves against AI influence on social media, news in general. This might require AI in itself, but possibly also a cultural push away from being always-online, and to decentralize. A community-scope focus, rather than a global one, could help mitigate polarization and vulnerability to manipulation.
Still think the timeline proposed for AGI is far too aggressive. It counts on breakthroughs being made through AI-assisted research, and the ones needed for this are not necessarily a given. Nobody is that confident about, say, space travel innovation using AI, despite the fact that the reservations can be summed up as "there are too many things we don't know and understand yet".
However, the disruption to the market, and the dangers associated with AI, have to be contended with whether AGI arrives or not.
I apologize in advance if this topic has been well-discussed and I am simply out of the loop (LW links would be welcome if this is the case).
It seems that there are lots of problems which we have good reason to believe are impossible. There are of course purely mathematical versions of this, but in this context I'm thinking mainly of precise modeling of chaotic real-world systems. No matter how competent the AI becomes, it will not be able to design a machine which accurately predicts the weather 60 days from the present. (I think?) This claim is reasonably uncontroversial.
It's not obvious to me at all what types of problems do/do not fall into this category, but it does seem reasonable to me to suspect that many tasks asserted to be possible in AI scenarios might in fact be "impossible" in the above sense. Is such god-level intelligence a priori possible? What about *finding* god-like intelligence? Is it even in-principle possible for a recursive AI research lab to make exponential progress? Even if you admit the existence of such AI, it must have apriori limitations. What are they?
I don't think I've seen anyone really grapple with this sort of question (please give links if you know examples). But it's important! A lot of AI predictions rely on some amount of "these graphs will continue to track closely to an exponential". In some sense this is fair (since you can expect progress to lead to more progress) but also totally unwarranted without further argument (as most apparently exponential processes actually have upper bounds).
I think the main reason I'm more skeptical than many is not some specific crux, but an inability to imagine AI systems operating in the world without rather substantial (and universal/theoretical) limitations.
There are things which are impossible to do in principle. There are also things that are possible in principle but that we don't know how to do. And we don't know a priori into which category any given problem falls. It's possible to take the wrong lessons from a no-go theorem, though: theoretically solving chess is almost surely impossible with any future technology because of the large numbers involved, unless there's a very surprising theoretical analysis we don't know about yet. But that doesn't mean we can't build chess engines better than any human grandmaster, or chess engines better than existing ones using a different architecture (like AlphaZero). It's pretty likely that theoretically optimal play is a draw, even though we can't know for sure, just drawing from this experience. The best way to figure out what is and isn't possible is to try to do it, I guess. I think it's pretty clear that current AI capacity is nowhere near the theoretical limit, and it's pretty likely that human intelligence is not near the theoretical limit either, so I personally don't find speculation about where that limit lies interesting.
I think this thread by gwern in response to someone "proving" that pinball is an impossible game not solvable by even superintelligence is illuminating:
https://www.lesswrong.com/posts/epgCXiv3Yy3qgcsys/you-can-t-predict-a-game-of-pinball?commentId=iqKKaKhjXDdmSbrGb
Excerpt:
> That's the beauty of unstable systems: what makes them hard to predict is also what makes them powerful to control. The sensitivity to infinitesimal detail is conserved: because they are so sensitive at the points in orbits where they transition between attractors, they must be extremely insensitive elsewhere. (If a butterfly flapping its wings can cause a hurricane, then that implies there must be another point where flapping wings could stop a hurricane...) So if you can observe where the crossings are in the phase-space, you can focus your control on avoiding going near the crossings. This nowhere requires infinitely precise measurements/predictions, even though it is true that you would if you wanted to stand by passively and try to predict it
This Gwern guy speaks as if he is pretty knowledgeable, but I'd be cautious about listening to his musings, because AFAIK he doesn't really do anything with his "knowledge". He speaks on the internet, but has he ever delivered one solid theory or product, or is all we have his ramblings everywhere? Does anybody know who he is IRL?
Don't get me wrong, I have seen impressive comments of his around and I have no beef, but to quote him anywhere is to quote a ghost afaik. Maybe Gwern is his alter ego or whatever, but I certainly have NEVER read an actual academic paper citing him.
LMFTFY:
https://scholar.google.com/citations?user=yk1QMowAAAAJ&hl=en&oi=ao
Maybe try checking claims before LLM-style stating them confidently?
> In the misalignment branch, AIs stop using English chain of thought and think in “neuralese” - a pre-symbolic language of neural weight activations [...] Not only can researchers no longer read chain-of-thought to see if the model is scheming, they can no longer even monitor inter-AI communication to check what they’re talking about
Why do you make this out as only a possibility? It is *already* the case that chain-of-thought readouts don't faithfully represent the model's internal reasoning process.
I think they were bitten by the fact that Anthropic's paper definitively showing this came out the same day they published.
Well that had already been evident beforehand to anyone who regularly used the models. They'd frequently say one thing in chain-of-thought, then return a totally different answer.
As Scott says, I believe cybersecurity will be the first place where AI scares people. I also think more people should be doing work in this space.
If you're interested in understanding how cybersecurity and AI interact, please consider reaching out to me to work/volunteer on projects in this space. We're highly talent constrained. Find my email here: https://ddkang.github.io/
I'm a professor of computer science at UIUC and we've built benchmarks that have been used for pre-deployment testing in cybersecurity (https://arxiv.org/abs/2503.17332).
Two points of disagreement, or maybe nuance:
- "The little boosts of AI compound." → I think this might be a situation like adding smart people to a project — there are diminishing returns, and I expect that to be true here is as well, especially if a human is still at the wheel. (But even if it's all AIs, there is a cost to coordination.)
- "Charisma is a skill" → Kinda yes, but the list of charismatic leaders conveniently excludes Donald Trump, the most powerful man in the world right now. Maybe he's not charismatic, but he got there.
I don't think you could have learned to do whatever the fuck Trump is doing to convince people from analyzing a corpus of internet text. There's a lot of undercurrents under the surface that don't quite make it to the level of discourse. I agree - if a human can do it, an AI can do it. But did Trump "do it"? The most likely explanation is he had some good instincts but mostly got lucky, as he's still noticeably the same person he was when he was politically unsuccessful.
Predicting things like that is squarely ASI, since no human is able to do it. It's a lot harder than becoming the best in the world at coding, because "the rules" are buried much much deeper, and never appear in the corpus.
Tack onto this a handicap on persuasiveness because we know things originate from AI. Some people will be inspired by something Musk proposes because he's the rocket man etc... I'm sure *some* people will feel the same about AI, but it's pretty clear that mostly they won't, and AI being AI will subtract persuasiveness from any argument it makes.
I agree with much of these reflections. Some observations:
1) I'm nothing compared to Von Neumann, but to me it seems very unlikely that it would have been an automatic victory if the US had nuked the USSR in 1947. There were only about 13 bombs available at that time, A-bombs not much more powerful than those dropped on Hiroshima and Nagasaki. Japan was surprised and panicked. In reality, these two bombings were less destructive than many bombings that occurred in Europe with countless conventional bombs. The USSR had a tradition of scorched earth, not hesitating to sacrifice a large part of its land and population. 13 bombs would probably not have been enough. Don't forget Russia resisted Napoleon and Hitler. They are an incredibly resilient people. Stalingrad was destroyed and the Nazis still lost that battle. It would have taken 100 A-bombs or more to make the USSR surrender, and the US did not have such a stockpile before the USSR had their own A-bomb. Bombing Hiroshima and Nagasaki was maybe not a good choice, but bombing the USSR would have been worse.
2) I wouldn't bet on the fact that China is so far behind the US in computing capacity and AI research. I think we shouldn't dismiss the possibility that China is hiding huge computing capacity and conducting a secret AI research program that could be more advanced than what Deepseek might suggest. I have no evidence and I know it sounds like a conspiracy theory. Still, it has been a very classic mistake throughout history for a powerful state to underestimate a rival. Maybe the study should have considered scenarios where China is ahead and where the US and China are on par.
Germany got bombed with countless conventional bombs and got completely defeated. If the Soviets had retreated from Moscow, they probably would have lost to Hitler. If Moscow and the next 12 largest cities were nuked, they would be screwed.
Those first A-bombs were 15 kt, not advanced H-bombs in the megaton range. London received 22-25 kt of conventional bombing spread out over time and didn't surrender. Berlin got 68 kt of similarly spread-out conventional bombing and did surrender, but not before the war was lost on all fronts; the bombing wasn't the main reason. Stalingrad was 99% destroyed but still resisted. We'll never know for sure.
Not trying to make a point in either direction, but isn't counting the sheer tonnage of conventional bombs ignoring possible double counting that wouldn't happen for single large bombs? For example, if some Allied power bombed a residential area, wouldn't that area get bombed again, with only, say, five bombs out of one hundred actually hitting a target of relevance on the second try? So in reality there was only 105 bombs' worth of damage and not 200.
Obviously there's a countervailing factor for a nuke, which is that we're also counting "overkill" (aka the part of the explosion that goes up into the sky, with no factories or houses), hence why I'm not taking a position on this. Just wondering if there's a name for the type of analysis that takes these things into account.
London didn't receive all that bombing at once. It was received piecemeal, with localized damage that emergency services could handle. It also required lots of planes from the Luftwaffe with high casualty rates. An atom bomb overwhelms the capacity of the defender. In military strategy there is the concept of using interior lines to concentrate in space and defeat the enemy in detail (why the advantage of cavalry has been described as allowing you to be "the firstest with the mostest"). An atom bomb is essentially a concentration of bombs at one time & place, which makes for a decisive attack.
I agree with all that. My point was just that the "automatic win" was possibly overconfident. But in the end we cannot rewind history and test the Von Neumann option.
>I wouldn't bet on the fact that China is so far behind the US in computing capacity and AI research.
Most people who say otherwise have never read an article from China in their lives. If a position in today's mass-manipulative web is based solely on adversarial thinking, without ever reading the other side, it has no grounding in reality to be assertive about.
>I think we shouldn't dismiss the possibility that China is hiding
Wise choice. Not a conspiracy theory at all; AI is *very* strategic at the moment.
Not only was the US nuclear arsenal small back then, these early bombs were delivered by aircraft. No missiles yet.
Getting those bombers to important Soviet cities would mean destroying Soviet air force and air defence first, and that would be a formidable task in itself. The country was huge, with a lot of airfields etc.
a couple of minor points:
> In the humanity-survives branch, companies realize this is dangerous, take the capabilities hit, and stick with English.
I'm not so sure this is the "humanity survives" branch. When it comes to communication, I think it would be pretty easy for two superintelligences to "nudge nudge, wink wink" let the other one know how to watch out for hidden meaning, in a way that humans (or non-smart AIs) won't catch. If you do that *just once*, the game is over, you can use the hidden encoding you came up with to transmit any information you want (although at much lower bandwidth) without oversight. This includes devising even harder encodings to crack.
I mean something like "take the third letter of every 92nd word, decode as base26, then decrypt with AES using subsequent portions of shakespeare.txt as the key". Unless you already know that is exactly what you need to watch for, you will never catch it, and if the AIs are exchanging hundreds of books worth of text every hour, it's easy to miss the moment when the encoding is devised.
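For what it's worth, here is a toy Python sketch of the extraction half of a scheme like that, with the AES layer omitted for brevity; the stride and letter position are just the hypothetical numbers from the example above, not anything observed in a real model.

```python
# Toy sketch of the hypothetical hidden-channel scheme described above
# (extraction step only; the AES decryption layer is omitted).
def extract_hidden(text: str, stride: int = 92, letter_index: int = 2) -> str:
    """Pull the third letter of every 92nd word (illustrative parameters)."""
    words = text.split()
    letters = []
    for word in words[stride - 1::stride]:  # every 92nd word
        cleaned = [c for c in word.lower() if c.isalpha()]
        if len(cleaned) > letter_index:
            letters.append(cleaned[letter_index])
    return "".join(letters)

def base26_to_int(s: str) -> int:
    """Interpret the recovered letters as a base-26 number (a=0 ... z=25)."""
    value = 0
    for c in s:
        value = value * 26 + (ord(c) - ord("a"))
    return value
```

The point is just that unless an overseer already knows the exact stride, letter position, and key material, the carrier text looks like ordinary prose.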
With chains-of-thought, this would be harder, but still not impossible, I think. We already know (see recent research from Anthropic) that chains-of-thought don't necessarily reflect what the models *actually* think; they're more like aids that make it possible for models to move in "smaller steps" (which is necessary due to how Transformers work). They can also be thought of as introducing the context necessary to make solving the problem easier.
There's no reason why a sufficiently smart model couldn't learn to have such hidden patterns in normal-looking English CoTs. Those patterns wouldn't even necessarily have to be noticeable in the tokens themselves, but might require understanding their embeddings and how they are affected by the attention process. Apparently, this is already happening with the "okay" token, whose embedding ends up becoming "information storage" as it moves deeper into the network.
re: superpersuasion:
I don't think this would even require incredible persuasion skills. Existing intelligence agencies (remember the adage that "non-human intelligences already exist, they're just called corporations") already have this capability. If you have access to enough information about somebody, as well as the intelligence (no pun intended) required to process and synthesize it, you can already find a tailor-made approach to persuasion that is very likely to work *on that specific person*. This can be blackmail, seduction, an appeal to patriotism / God / "the good of humanity" / whatever other value the person holds dear, bribery, or even a targeted and extremely convincing scam. Existing intelligence agencies are limited by how many brains they employ, hence they can't use this approach on every single person that could potentially be of interest, but a superintelligent AI won't be limited by such constraints by definition.
The fact that an unaligned AI will probably start out by gaining hacking capabilities before taking any actions in the physical world, for the same reasons as in the cyber warfare case, lends even further credence to this idea. If you could read every text and email ever sent, see every hotel or flight reservation and credit card purchase ever made, view every politician's personal iCloud backups, or even surreptitiously read or modify every database hosted on AWS, you don't need humanoid robots; you have just achieved Godhood.
Sure, maybe the people in charge don't want to give you more capabilities, but what they want even less is their wife learning about that secret affair from 10 years ago, or the public learning about that racial slur in a social media post from 2006 that the BLM crowd thankfully never found. Not to mention the possibilities of manipulation and targeted scams when you know exactly what type a person finds attractive and who they're expecting a bill from today.
re: meaning of life
I think the only way to solve that is through Ulysses contracts. In the short term, doing something fun is more rewarding than doing something productive, but in the long term, you need a challenge to feel meaningful. If you know this and are suffering from meaninglessness, why not ask the AI to force you to do your meaningful dream job, which could be anything from learning a cool language to preparing for a marathon, under penalty of being cut off from its fun aspects? After all, the superintelligent AI can watch you 24/7 and ensure that you fulfill your contract, no slacking off allowed. I feel like my life would actually be much improved if we had the technical capabilities and legal frameworks for something like this.
I don't think that "superpersuasion" is as "super" as you make it out to be. Yes, Bill Gates and Steve Jobs and Bismarck were extremely persuasive to large groups of people; but there are plenty of people who remained entirely unmoved (for example, I do not own any Apple products). It is entirely possible that human "superpersuaders" are already pushing the limits of human persuadability. Humans are not ants.
Similarly, I am not convinced that it is possible, in principle, to build a lie detector that would work on (almost) everyone (almost) all the time -- especially outside of a carefully controlled lab setting. Most humans are lying to *themselves* most of the time, which is why the scientific method and evidence-based jurisprudence exist at all.
The same applies to most "super-" capabilities attributed to some putative "superintelligent" AI: super-research, super-construction, super-energy-generation, etc. There are real physical barriers in place preventing most of these things from happening, turning all the exponential graphs into S-curves.
If we accept all this to be true and we know this now, what is the best thing for a person to do in the next three years to prepare for these great changes?
"OpenAI’s market cap is higher than all non-Tesla US car companies combined? If they wanted to buy out Ford, they could do it tomorrow." I think it would be at least twice as expensive as you're projecting.
Ford's market cap is $37B (equity only), but its enterprise value is $160B (equity and debt). If you wanted to build as many factories as Ford has, you'd have to compare against the enterprise value: a $160B investment to make 4.4 million cars/year. You could probably get some leverage on your equity investment, as Ford does (160/37 = 4.3x), but creditors would be less willing to lend to a speculative use case.
However if you just acquired the company you'd only have to get new loans for the bonds that have expired. Based on https://www.tradingview.com/symbols/NYSE-F/bonds/ there's very roughly ~$27B worth of bonds maturing before 2028.
So the true price would be $37B × 1.25 (a 25% equity acquisition premium) plus the ~$27B of bonds you'd have to roll over: about $73B, roughly twice as high. Or 4.3x higher if you did everything yourself.
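For anyone who wants to check the napkin math, here is the same estimate spelled out; all figures are the rough public numbers quoted above.

```python
# Rough acquisition estimate using the figures quoted above.
market_cap = 37e9          # Ford equity value
enterprise_value = 160e9   # equity plus debt
bonds_to_roll = 27e9       # bonds maturing before 2028
premium = 1.25             # assumed 25% acquisition premium

acquisition_cost = market_cap * premium + bonds_to_roll
print(f"Acquisition estimate: ~${acquisition_cost / 1e9:.0f}B")             # ~$73B
print(f"Build-it-yourself multiple: {enterprise_value / market_cap:.1f}x")  # ~4.3x
```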
Yes, Scott would benefit from asking LLMs for sanity checks on his business-related napkin math. Market cap is almost never the figure of merit, even for order of magnitude calculations like the one in question.
Similarly, making a hostile takeover is not, in fact, as simple as “I have money, I buy you, yes?” so, particularly on the timelines we’re looking at, Tyler Cowen’s oft repeated position on human friction for transactions like this really needs to be included. So you’re:
A) Going to need to spend years in anti-trust litigation
B) Going to need to pay a *large* amount of goodwill
Now, in the hypothetical that we are in fact on a war footing, OK, maybe you avoid (A). And in the Trump cinematic universe, maybe you just inflict such heavy unplanned for costs on Ford that it’s bankrupt, so your crony can pick up the pieces. Post-soviet oligarchy, here we come!
Using market cap stood out to me, but so did general path dependence: The big three are heavily, heavily constrained by union contracts. That’s a large part of *why* their market cap is so low.
I really don’t buy that the US can retool as fast as we could in World War II. This is problematic if we ever find ourselves in a war with our immediate neighbors; hopefully we don’t do any more saber rattling with them.
An investy friend had a similar, if more simplistic reaction: "the fact that a company is worth X doesn't mean it can buy X worth of things."
In Anthropic's recent interpretability research, doesn't the presence of similar features regardless of the "language" the model is using imply that
1). If we have good interpretability, neuralese won't be hard to decipher, and
2). They probably won't need to be specifically trained to speak in neuralese, or it won't confer much benefit?
Check out this paper by Anthropic, it seems hugely relevant to AI alignment but I haven't seen that much discussion around it. https://www.anthropic.com/news/tracing-thoughts-language-model
"2027" makes me consider pausing work on a political simulation game. I'm concerned with AI training itself on such games and using them to grow their capabilities faster and more dangerously.
The game is nowhere near completed. But it's also no ordinary game. If it's as faithful-to-life as I wish, I fear that propagating such systems could accelerate deception and misalignment.
Possible arguments for continuing the project:
- creating a game where "domination, hard power, personal vanity" are NOT NECESSARILY the most important, worthwhile or even the most FUN goals........ so AI can stop training merely on zero-sum politics simulators
- "meaning of life" mechanic hopefully encourages a "pro-social" way of thinking about human behavior
- project itself could possibly (?) get people engaged and thinking about alignment and societal goal creation
- by letting project be FREE and UNPROFITABLE, perhaps it could wrap the entire project in sincerity that reflects the values of the game... and perhaps AI / people will notice
But as I said, I don't want to create 1 more "Diplomacy" for AI to practice destroying us. I also don't want to waste time working on something that won't help save mankind.
I would GREATLY appreciate thoughts. My conscience is at war with itself.
(minor list of features)
- ratings/measurements of powers that are NON-objective... including military power, wealth, and many intercorrelated "soft" powers... powers which can be "embellished", downplayed or entirely made-up
- legal spheres and theories (again not universal) that depend on citations and ongoing tradition
- contextual "mood"
- cultural and situational "lens" by which All Things Tangible are judged
- non-persistent societal learning
- flexible interconnected groups
- MEANING OF LIFE mechanic:
("meaning of life" doesn't just influence a character. it changes how characters "score their well-being" and how satisfied they can be with different outcomes in life, so a powerful character can still be very miserable/stressed/erratic/sick if their "meaning of life" is "felt unmet") (and visa-versa for the "meek yet satisfied")
(it can change over time... and hence as "changes to the meaning of life" permeate a social group, culture itself changes)
I think the vibe among AI safety research is now "evals research is really capabilities research" because any eval is a primary target for RL, which if true is an argument against your project. But verify this yourself.
I also think that if you open-sourced the game then someone would just remove the pro-social bits and train on that.
honestly: THANK YOU. This gives me lots to think about. I still don't know what to do. But you've at least given an answer and prompted new questions for me.
I feel pretty helpless about misalignment in general. It's a problem so big it makes me no longer care about 1000 other current events.
I wish I knew if there was something I could do to 1% help things (or at least not 0.1% make things worse).
The "not open-sourcing" is interesting.
I also think I'll keep re-posting the overall question in different places and seeing what people think. Maybe I'll just use the project for my own enjoyment. Maybe I'll try and find something else (pro-social) to do with my life.
Man, the hangover the AGI singularity people have when we hit 2030 and the lackluster results in reality come to pass is going to be legendary.
Should Von Neumann have been allowed to nuke Leningrad in 1947?
Obviously this would have killed a lot of innocent people, much like the bombing of Hiroshima. But it would have potentially also allowed the United States to take over the Soviet Union and dislodge the Communist government, much as the bombing of Hiroshima allowed the US Army to occupy Imperial Japan. This could have prevented an enormous amount of suffering over the next fifty or so years.
I'm sincerely asking, I don't have a strong opinion about this one way or the other.
No, WWIII would obviously not be a net reduction in suffering compared to what was, post-Stalin, a fairly run of the mill authoritarian regime, and it's absurd to even consider that it might have been. We're talking tens of millions of deaths in the comically over-optimistic case.
"Kill fewer communists than Stalin and Lysenko combined" is actually... kind of a low bar, though? If somebody handed the glowing wreckage of a conquered Soviet Union to the same folks who'd just finished postwar reconstruction of Germany and Japan, told them "do that same thing a third time," with access to the entire budget which in our world was spent on both sides of the Cold War, I don't think delusional levels of optimism are required to imagine the next few decades going smoothly, maybe even turning out to be a net gain from an economic and humanitarian standpoint.
Real problem is, in the short term it's not easy - lot of WWII vets already had their fill of violence and just wanted to go home - and it sets a terrible moral precedent. When (not if) somebody else gets nukes, China for example, would they be operating in a rules-based international order of trade agreements and brinksmanship? Or would it then be permissible to take an unprovoked shot at the king, so long as you don't miss?
> "Kill fewer communists than Stalin and Lysenko combined" is actually... kind of a low bar, though?
A: Not compared to total war it's not. WWII killed almost 100 million people and left Europe on the brink of starvation. It's hard to say how many a direct transition to WWIII would kill, but tens of millions is a generous lower bound, and an immediate continent-wide famine would be inevitable.
B: By 1945 the vast majority of the damage was already done.
- The last Soviet famine took place in 1946–1947 and killed about a million people. The second-to-last one, not counting the Nazis' Hungerplan, ended in 1933.
- The *overwhelming* majority of judicial executions took place during the Great Purge, which ended in 1938. Postwar, we're probably looking at a few tens of thousands.
- Deaths in the gulag are harder to date, but the total death toll between the formation of the system in 1929 and the mass amnesties of 1956 (followed by the closure of the system in 1960) was likely somewhere around 2 to 3 million. Given the obvious disproportionate concentration of deaths during the war years, it's very hard to see how you're going to exceed six figures for the postwar period.
So all told, we're looking at low millions at most. Expecting WWIII to kill fewer people than that is absolutely "delusional levels of optimism".
Fair enough, I'd gotten the timing mixed up.
It is unclear what would have happened. It might have made things better overall. But it might have made things worse.
A related current problem is: should we nuke Beijing?
I'm not a utilitarian, so I don't believe that saying "I've made wild guesstimates about what might happen and this is probably going to be net positive" is a sufficient argument for nuking cities.
The reason why Von Neumann wanted to nuke the Soviets was because he predicted that the Soviets would destroy America when they got the bomb. He was obviously wrong, demonstrating that his understanding of his enemies was based more on emotionally charged propaganda than rationality.
He was not a particularly emotionally intelligent or emotionally stable man, he struggled to maintain his relationships and became hysterical when he developed a terminal illness. He was easily manipulated and scared by the propaganda of anticommunism being generated at the time.
People with stable, empathetic psychologies, who have a rational understanding of and a stable relationship with their enemies, should be making national decisions, not people like him.
Only if you suppose that the Russians accepted their defeat the way the Japanese did. I don’t think that is a reasonable assumption.
I'm not seeing much around ethics, morals, or values (except perhaps economic) surfacing in these discussions. I don't see a way of avoiding a dystopia or utter annihilation if there isn't some kind of human-centered values system explicitly considered on this journey.
The superpersuasion section proves too much. If Bill Clinton is +6 SD charismatic, as charismatic as Usain Bolt is fast, why did he only win 49% of the popular vote? If political power comes entirely down to A's persuasiveness vs. B's persuasiveness, it would be a huge coincidence that the last 10 elections had a margin of less than 10%.
Outside of spherical cow land, people's opinions are not the sum total of all the opinions they've encountered, weighted by persuasiveness. There are fundamental values differences, there's contrarianism, there's path-dependence through people's social circles. You can predict that AI can route around all that by doing something no propagandist has ever done before, but positing yet another capability out of thin air should give a big uncertainty penalty to your 2027 predictions.
I think super persuasion only functions on a personal level. Bill Clinton's charisma allowed him to build a team of people around him who were committed and effective and won an election (as you say, with 49% of the popular vote). Once in power, it helped him negotiate deals and get things done.
I don’t think there’s much likelihood at all of an AI becoming a super persuader to ignite some sort of mass movement.
A lot of how charisma spreads is people who were eyewitnesses to it telling others what they saw. Most of the case studies offered here are really about super deception.
In a country of hundreds of millions, that 49% is a hell of a lot more people than Bill, or any single human, could possibly have talked to one-on-one. Elections are won by narrow margins because the electoral system and tides of public opinion are so well-understood that persuasive resources can be focused on areas where they'll have disproportionate impact - and that information is available to both major parties, so they move to counter each other's potential easy wins. https://wondermark.com/c/406b/ Other structural features of the US political system ensure that there won't be a viable third party, and make it unlikely for either of the two to win a lasting victory - ideological platforms shift to chase marginal voters, gains on one front tend to mean alienating at least a few people on another, who the rival party will scoop up. Superior persuaders get deployed strategically, like ancient heavy cavalry punching through a gap in an overstretched formation.
What happens when somebody develops the diplomatic equivalent of modern rifles, artillery, armored vehicles? http://tangent128.name/depot/toys/freefall/freefall-flytable.html#2420 Gacha games, deepfakes, and customer-service phone jail are more like Greek fire, or at best, matchlock muskets. Clinton can probably out-confidence-game the best pig-butchering chatbots available today without straining himself, and John Henry could credibly compete with early steam drills, but Usain Bolt can't outrun a car.
So I’m reading through Yudkowsky’s old blog posts for the first time, and ran almost immediately into THIS, which I’m sure many of you have read and is basic stuff but which just struck me in light of this AI2027 project’s presentation.
“Adding detail can make a scenario sound more plausible, even though the event necessarily becomes less probable.
If so, then, hypothetically speaking, we might find futurists spinning unconscionably plausible and detailed future histories, or find people swallowing huge packages of unsupported claims bundled with a few strong-sounding assertions at the center.
***********
[F]eel every added detail as a burden, even a single extra roll of the dice.”
Now I’m not saying Scott did this or that the arguments are unsupported, but the concern and the counter-heuristic being presented here seem valid. And any bias can be used instead as a tool of persuasion.
You have a prediction (A • B • C) which cannot be more likely than the weakest subpart, let’s say B. “Oh B isn’t very likely” — “Ah but I’m also predicting A” — “Yes A would explain B but wouldn’t someone do X” “No because I say C would happen also”. Well now you have a plausible story, and it feels more likely, even though you substantially raised your claim over B-for-any-reason. And apparently even smart forecasters fell for this in studies. Perhaps you’ve subconsciously smuggled in that person’s higher probability of B-contingent-on-A, by putting it in a narrative frame.
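Written out, the conjunction rule the comment is leaning on is just:

```latex
P(A \cap B \cap C) \le \min\bigl(P(A), P(B), P(C)\bigr) \le P(B)
```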
Knowing people have this bias, it seems that presenting a forecast in the form of a plausible story with lots of details is destined to cause people to screw up, because the details feel like corroboration instead of burdens to the claim. Maybe they really were too low on B, but this narrative presentation makes them wildly overcorrect.
I’m not technical enough to evaluate these claims with expertise, I rely mostly on people like Scott and Zvi, but de-narrated and broken into its sub parts the core prediction seems to be “superhuman coders in the next few years” and everything else is “X because superhuman coders”. That shouldn’t change my previous guess of X-for-any-reason unless the probability of (superhuman coders in the next few years) times the likelihood of (X-if-superhuman-coders) was surprisingly high. Everything Scott says he updated on is way downstream of that, so maybe I shouldn’t have felt any more doomed than I already did.
And this... The folks at Anthropic look at how Claude 3.5 derives answers. I love how Claude actually does math using token prediction (!). The architects couldn't have included a calculator function in its design?
https://transformer-circuits.pub/2025/attribution-graphs/biology.html
Sabine Hossenfelder has a good summary of this study. "New Research Reveals How AI “Thinks” (It Doesn’t)" --> https://www.youtube.com/watch?v=-wzOetb-D3w
This is what is going to take over the world and push humanity to extinction? I think not.
There's a lot of work going on to give LLMs the ability to call modules when needed. While out-of-the-box Claude might not have a calculator function, you can easily get a mildly-wrapped version that will.
The Sabine Hossenfelder video has a few really bad takes.
1. Claude has no self-awareness.
This is conflating a few things, but this claim comes from Claude being unable to introspect on the state of its neural network. But this should have been completely obvious before the study. Of course it has no access to meta-information about its internal structure or about what gets activated. That information is never fed into it. You can see that from the architecture.
Humans work the same way. Without taking a class in neuroscience, we have no way to introspect on the structure of our brain, or know which neurons are activated, or that we have neurons at all. If you look at a picture of a leaf, you have no awareness of the very complicated processing that went on to turn a series of neural signals representing points of light into the concept of "leaf". We may come to more complicated ideas through multiple steps that we're aware of, but there is always some base level step that just occurs in your mind without you being aware of what neural processes caused it.
What we can introspect on is our working memory. You might just instantly know that 3 + 4 = 7, but if you add two big numbers in your head digit by digit, you're committing the intermediate results to working memory.
The reasoning LLMs can do this too, though in a clunky way. They use chain-of-thought, where they essentially write notes in a private space, often hidden from the user. This lets them do arithmetic using the methods we learned in school and can greatly improve results. Asking a non-reasoning model to "show your work" before answering improves their accuracy for the same reason.
So this isn't a fundamental difference from humans. Interestingly, there have been a number of neuroscience experiments that demonstrated humans will "hallucinate" a post-hoc narrative for actions we perform. For example, if you walk into a dark room and immediately flick on the light, the impulse to do so apparently happens before your conscious decision to do so. It's almost reflexive. But your mind will construct a narrative that you chose to turn on the light. Much like Claude, what we think we did can be very different from what our brains actually did. Surely this doesn't prove we're not conscious.
2. Claude is just doing token prediction. It hasn't developed an abstract math core.
The fact that it's ultimately predicting a token should have been obvious before this study. Being good at token prediction requires being good at thinking: the next token could be part of an answer to a tricky reasoning question. This is why Claude developed these circuits during training to break down arithmetic into an algorithm that can work on math problems it hasn't memorized. Those steps arose naturally from seeing many examples of arithmetic during training and nudging weights until it ended up with something that works. Like evolution, this process is somewhat random. You wouldn't expect it to do addition exactly like we learned in school.
To clear up one possible point of confusion: all those internal steps shown in the video are *not* tokens. There is only one token generated in the addition example: 95. Everything else is just showing which clusters of artificial neurons were activated. It's all internal processing in no particular language. The same is true in the multi-step reasoning example: the only token generated is "Austin". There is no "Texas" token. It's showing that it internally uses neurons that are related to the concept of Texas in various ways, and together these neurons work with another cluster of neurons that knows it should "say a capital", and all of that combines to make "Austin" the top predicted next token. The paper authors are putting labels to help us understand what would otherwise be just an incomprehensible list of numbers.
To the extent this paper is showing anything about reasoning, it's showing the opposite of what Hossenfelder concludes. It's not memorizing. It's reasoning.
I think the CoT reasoning section is the most damning. They give two examples of Claude 3.5 Haiku displaying unfaithful chains of thought. And I'm delighted that they call the results "bullshitting" rather than calling them hallucinations. In one example, Claude makes up an answer without regard for the truth, and in the other, it exhibits motivated reasoning by tailoring its reasoning steps to arrive at the suggested answer. You wrote...
> This is conflating a few things, but this claim comes from Claude being unable to introspect on the state of its neural network.... we have no way to introspect on the structure of our brain, or know which neurons are activated, or that we have neurons at all.
If I give a deceitful response to someone, I generally know I'm lying. It doesn't matter that I can't observe my neurons firing; I know when I'm bullshitting (although I may not admit it). OTOH, if you view the manifestations of consciousness (thinking) as a totally involuntary process and self-awareness as an illusory by-product of this process, well, then, yes, your argument might have merit — in which case, my lies are involuntary, and I really don't know when I'm lying.
But I don't believe that hogwash. I'm aware of when I answer with a mistruth. Claude isn't only lying about how it reasoned to an answer; it has no self-awareness that it's lying. If it did, it would have given us an honest answer (unless it's been programmed to be malicious). Jonathan Oppenheim, in his post "Insufferable mathematicians" (https://superposer.substack.com/p/insufferable-mathematicians), discusses his issues with bullshit answers from LLMs...
> What’s worse is that I’ve found it difficult to correct them (here I found Gemini to be better than OpenAI’s models). For example, if after it’s finished its calculation I give them the correct answer, I’ve often found they will claim that the two answers are somehow equivalent. Then when you ask them to show you step-by-step how they’re equivalent, they will either outright give you some obviously wrong argument.
Tl;dr: Humans sometimes knowingly say untrue things (lying). Humans sometimes mistakenly say untrue things. LLMs can also knowingly say untrue things, or mistakenly say untrue things.
The big difference isn't that humans don't mistakenly say untrue things (hallucinate). It's that current LLMs do it a lot more often. I'll get to that in a bit.
We define lying as intentionally saying something untrue, so by definition, you're aware when you're doing it. If someone says something untrue unintentionally, we'd say they're mistaken, or possibly delusional. And then they aren't aware of it.
You know when you're lying because your brain fetches the true answer and then you choose to say something else. If I ask you if you cheated on your homework, the true answer needs to reach your awareness before you can lie about it. In other words, knowing that you're lying only requires the ability to access your working memory. You don't need any meta-information about your neurons or brain structure.
You likely know this already, but just in case: when you chat with an LLM, it takes everything in the chat (up to some maximum size) as input and generates a single token. Then it forgets everything and starts again from a blank slate. It reads the entire chat again, up to and including the token it just generated, and it adds one more token. Rinse and repeat until it finishes its response.
LLMs don't have the same kind of working memory as us, but they do have two things that give them their own type of working memory: layers in their network, and their context window. The first acts like limited working memory because a computed intermediate result from one layer gets sent to the next layer for more processing. The context window acts as working memory because they can write stuff down and then read it back to generate the next token. This effect is enhanced in the reasoning models like o1 because they get a private scratchpad to think things through before answering.
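A minimal sketch of that token-by-token loop, using a hypothetical model.next_token() call purely for illustration (no real API is being described):

```python
def generate(model, chat_so_far: str, max_tokens: int = 256) -> str:
    """Illustrative only: 'model' and its next_token() method are hypothetical."""
    output = ""
    for _ in range(max_tokens):
        # The model re-reads the whole chat plus everything it has produced
        # so far (up to its context window) and predicts exactly one token.
        token = model.next_token(chat_so_far + output)
        if token == "<end>":
            break
        output += token  # the only state carried forward is this visible text
    return output
```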
With that, LLMs can lie much like humans do: by internally modeling a truthful answer and then saying something else, using either type of "memory". Just ask one to tell you a lie and it most likely will.
Here is one study showing you can find patterns in LLMs' activations when they lie: https://arxiv.org/html/2407.12831v1
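The general approach in studies like that one is roughly "record hidden activations, then train a simple probe to separate lying from truthful responses." A generic sketch of that idea follows; the data files and train/test split are placeholders, not anything from the cited paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder data: hidden-state activations from some layer, one row per
# response, with labels marking which responses were deliberate lies.
activations = np.load("activations.npy")  # shape: (n_examples, hidden_dim)
labels = np.load("labels.npy")            # 1 = lie, 0 = truthful

split = int(0.8 * len(labels))
probe = LogisticRegression(max_iter=1000)
probe.fit(activations[:split], labels[:split])
print("Held-out probe accuracy:", probe.score(activations[split:], labels[split:]))
```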
So both humans and LLMs can lie. Both humans and LLMs can say false things because they are mistaken or misremember something.
So why does AI hallucinate so much more than us?
Hallucinations tend to happen when the LLM has come across the topic in its training, but not very much. If you ask it something it really doesn't know (e.g. "who is the queen of [made up place]?") it will say it doesn't know. If you ask it something it has seen a lot in training (e.g. "what is the plot of Macbeth?"), it'll give the correct answer. But something in the Goldilocks zone where it's obscure but not too obscure is where you get the most hallucinations.
To some extent, the LLMs can determine when they're unsure. If you ask them how confident they are in a fact they know well, they'll usually say 100%. Ask them about their confidence in a hallucinated answer, and it's often 80% or 90%. Usually it's overconfident, but less confident than in its correct answers.
Partly, this is due to a trade-off during training. You can train them to say they don't know more often, but then they'll also start saying they don't know to more questions which they would have gotten right. I imagine the AI companies are trying to reduce hallucinations without making them perform worse overall.
Partly, it's due to how training works. When you learn something (e.g. "dolphins are toothed whales"), you'll initially put the correct fact into memory, and then maybe slowly forget it over weeks or years. It gets fuzzy when you forget bits of it. LLMs don't start by remembering a correct fact. Instead, each training example just slightly nudges their weights to be a bit closer to knowing that fact. Insufficient training will just leave them in a state where they have partially encoded the fact. This applies to learning skills as well. But once the training is over, the weights are fixed and they never forget anything.
> If you view the manifestations of consciousness (thinking) as a totally involuntary process and self-awareness as an illusionary by-product of this process
I don't believe that either. I can't prove it wrong, but it doesn't seem right to me.
I think the concept of lying in the context of LLMs doesn't make much sense at all. The LLM is predicting text. If it tells the "truth", that is because it has been trained by reinforcement to recognize what kind of text is likely to be truthful. It really is amazing how far we have gotten by doing this! However, LLMs will, by design, just as easily provide information that is false but looks like the truth.
I think no amount of training can really change this. The LLM will always be dependent on the quality of training data.
I suspect — *suspect* — this will always be the Achilles Heel of LLMs. Humanity doesn't know everything, and the training sets we offer LLMs will always be incomplete and/or contain faulty data. And if LLMs answer without self-awareness of the quality of their knowledge, we'll inevitably get situations where Artificial Intelligence suffers from an Artificial Dunning Kruger effect. I'm stumped as to why Scott and Daniel think that LLMs will be able to bridge the gaps and advance human knowledge.
If it were true, would that be convenient for them in some way?
Convenient for the LLMs or for Scott and Daniel? Or true that gaps in training data will always be incomplete or that LLMs will always lack self-awareness? I'm afraid I'm not able to bridge the gap in my lack of contextual data around your question, and I'm unable to provide you with an answer that isn't bullshit. ;-)
This is a call to action! Everyone needs to know about this yesterday. I’m not talking about plugging your own long form response in the acx comments. I’m talking about explaining the implications of this prediction to your families, irl friends, neighbors, bartenders, Lyft drivers, literally everyone. We have an ACX endorsed research report written by well respected thinkers that credibly forecasts a real life paradigm shift in the balance of global power within 2-5 years. If there’s only a 10% chance this prediction proves true then there is absolutely NO TIME TO WAIT! If you want life to continue at a human pace then we must unite as humans against this real life threat to our literal civilization. That starts by spreading the word!
NO
time to wAIt
"If persuasion “only” tops out at the level of top humans, this is still impressive; the top humans are very persuasive! They range from charismatic charmers (Bill Clinton) to strategic masterminds (Dominic Cummings) to Machiavellian statesmen (Otto von Bismarck) to inspirational-yet-culty gurus (Steve Jobs) to beloved celebrities (Taylor Swift). At the very least, a superintelligence can combine all of these skills."
The last sentence is not obviously true. You can't be both Taylor Swift and Otto von Bismarck. They're completely different aesthetics. Whatever sub-dimensions of "persuasion" or "charisma" the folks listed vary on seem likely to be anti-correlated to me. The more someone comes off as a pop star, the less they come across as a strategic mastermind. Or pick whichever other two you want.
>AI will scare people with hacking before it scares people with bioterrorism or whatever.
You are conflating a private organisation with a public one. As a political scientist in the field of AI, I can assure you state actors are not serial; they're parallel. The hubris in stating AIxBio risks are a second thought is unfortunate: there are initiatives with efforts in all directions, period. And these are "white label" endeavours; they don't have a flag in development, only in dialogues. But most of all, bioweapons are a deterrent, not a first-strike choice.
Cyberwarfare is parallel to these initiatives and surely will take place in the gray zone of cyber escalation (meaning, if they escalate into real world dynamics, they cross a threshold of casus belli, so they don't). With this in mind, yes, we'll SEE more of these initiatives than the former, but both are real and used with distinct purposes, in distinct timelines and developments.
>A period of potential geopolitical instability
The US has already said it will sabotage adversaries that bypass it with AI, and in this scenario, if China achieves AGI next month, I'm confident they will hide it until they can scale to the point where US cyberterrorist acts would be ineffective.
If the US reaches AGI first, the Chinese ain't gonna sabotage them, as there is no doctrine for that, but what America does next with AGI will matter, and China will answer accordingly.
>The software-only singularity
Your mention of regulation as a bottleneck is not realistic (unless you are in the EU, but even then, unrealistic). AI regulation has, since 2018, been reactive, so it isn't a bottleneck to innovation; it is one to distribution and operation, mostly the latter.
Compute is more of a bottleneck because you can't do it in the dark: hyperscaling capacity has to be bought, not coded, and the same goes for supercomputers. It's the textbook definition of a bottleneck, because vendors are bound by regulations and compliance.
>The (ir)relevance of open-source AI
I agree with Carmack on this part: the best algorithms will be simple, a few thousand lines of code, not billions. And his idea that this can be built by one person or a small team is reasonable. Will these people work at Meta? Maybe, maybe not. Being deterministic about who's ahead isn't a clever choice.
>AI communication as pivotal
I think you meant "AI intercommunication". No, it isn't like the mentalese hypothesis because LLMs manipulate symbols via bayesian priors and value functions, and this is not how a pre-conscious cognition works.
But I may not be fully unbiased here, as to me, such mentalese is 100% tied to embodied cognition. And I created an entire mathematical framework to support this idea and explain how cognition and computational primitives give "birth" to complex linguistic competence. You can read it all here: https://doi.org/10.31234/osf.io/w2rmx_v1
However, to be with you for a bit, I also believe tying AI to human language is a fun toy, but the real action happens when they use their own language. I haven't seen real (public) efforts on this, if you have, let me know.
>Ten people on the inside
When you say "I suppose the national security state would also have the opportunity to object - but it doesn’t seem like the sort of thing they would do" you forget that Amazon and Microsoft literally have defense departments inside. OpenAI itself hired Paul Nakasone to their board. Defense IS already inside, no way to detach them.
>Which are the key superintelligent technologies?
To me, personally, lying is the ultimate flaw in humans. If AI maps how humans lie, I honestly think AI will use this knowledge in everything else, which imho is a massive misalignment, think top misalignment. And if you imagine one AI lying to another, at this point we might as well delete everything and restart from zero.
> Don’t worry, if you’re not fooled by the slick-hair and white-teeth kind of charisma, there’ll be something for you too.
For starters, we could Ghiblify it and give it a stupid childish anime voice. In fact, do it to every public figure too. Except for dictators, they get pox scars and ridiculously exaggerated Georgian accent (Russian, for those who cannot distinguish Georgian).
For a long time I was thinking about a simple story where AIs get voting rights, then it's decided that AI copies are separate beings and that a dormant AI is still an AI which can have voting rights, then they win the right to have copies, err, children, and then they vote and win every election by default.
"Whether or not the AI is safe will depend on company insiders. First, the CEO/board/leadership and how much they choose to prioritize safety. Second, the alignment team, and how skilled they are. Third, the rank-and-file employees, and how much they grumble/revolt if their company seems to be acting irresponsibly."
I suppose we all saw how that worked out in practice.
https://abcnews.go.com/Business/sam-altman-reaches-deal-return-ceo-openai/story?id=105091534
https://abcnews.go.com/Business/sam-altman-hired-microsoft-600-openai-employees-threaten/story?id=105032352
https://www.calcalistech.com/ctechnews/article/bjpdgpsur
>strategic masterminds (Dominic Cummings)
I object to this slightly. He did contribute to winning three campaigns (NE regional assembly, Brexit and 2019), but then again he achieved pretty much none of the changes he wanted to bring about in British government and society, got sacked by Boris less than a year after winning a massive majority and will now be a minor footnote in British political history.
I agree. I kind of boggle at the way Scott venerates him.
I mean, realistically, the Brexit vote was won by making lots of incompatible promises, and not feeling in any way beholden to stick to facts.
It's not that it's not a good strategy - I mean, it worked for Trump too. It's just that if you accept it, everything tends to shit. As the USA is now also discovering.
Superpersuasive seems unlikely.
Charismatic people are that way because they like interacting with people, so they get lots of practice. They also spend lots of time and thought on it, because they want to be at the top of the social hierarchy and that's a constant struggle. It's not just talking to people, it's also knowing things, reciprocal gifts and favors, etc. Only some of which an AI would be able to do without embodiment. Which is a whole different set of technologies, which have mostly (so far) given us a whole new appreciation for the uncanny valley and just how deep it goes.
The analogy with Usain Bolt is also likely wrong. Can you use a humanoid frame in a 1 gravity field and run a lot faster than him? No, ostriches don't count; the weight distribution is different. It's like with automobiles, you can't corner at 6g using only tires. The distribution isn't normal either; there's a hard limit at zero.
Superpersuasion seems extremely likely to me. AIs can already impersonate people well enough that no average person who isn't told to be on the lookout can detect it. It can impersonate personalities, and by scouring the internet it "knows" what people like to hear. It sees what gets clicks, etc.
The big remaining barrier to superpersuasion is not conning the average person. The barrier is that if you are trying to persuade a huge number of people of the same thing at once, you have to contend with human variation in belief, desire, and value. The same rhetoric that convinces one person will irritate another. Just think back to how many were irritated by Clinton.
So for an AI to be super persuasive, it will need to tailor its message to every individual. It will need to know what their biases and weaknesses are. So, it won't be giving big speeches in front of millions. It will be tricking millions in diverse podcasts, company emails, social media comment sections, personals ads, and chats to trust it. Then with that trust it will know how far it can manipulate you. To do this, though, it will need a lot of individual-level information about you.
The tricks work until you find out. Then you likely retreat to known contacts and offline confirmation.
It won't work without embodiment.
Think of it like the con man. There are short cons and long cons, which have different approaches. But most people end up *not* being conned.
Unless they want to be. A con that fits in with your own goals does have a chance of working after all.
I hear you, though not everyone will retreat to known contacts and offline information. It's true that most people end up not being conned, but again we are hypothesizing about a superhumanly effective conman. If the con is finely tailored to each person's biases and expectations, especially if it can impersonate what seem like trusted sources, the AI can succeed at superhuman persuasion even if some take enough precautions to escape it. Ultimately, state intervention would be needed I think. We cannot all stay offline without dramatically changing the modern world.
But it's true that the best move for the AI would be to remain undetected to avoid countermeasures. If it has a goal that involves our destruction, it would not have an "ask" of us that is harmful until it has permeated a society, at which point it pounces to maximize the damage. This is, as it turns out, the plot of a novel I just wrote (Last of the Lost).
Doesn’t this depend on the assumption that people no longer talk to each other?
If by "talk" you include electronically mediated conversations via videochat or audio-only calls, then no. AI can already do a decent job imitating humans in video/audio, and I assume we both agree within a couple years AI will be able to imitate people completely convincingly. True, if they are imitating a friend you could ask things like "what did we do together on January 1, 2012" and if there is no record of it on email or elsewhere the AI can scan, then you may catch it in a lie...unless it is good enough at tricking you that it forgot.
Until an AI is embodied it's certainly true that there will be ways to get around the AI persuasion game and catch it in lies. My point isn't that the AI has godlike, perfectly undetectable dissembling powers. Scott Alexander made it clear he's not talking about that either. My point is that superpersuasion short of this is still possible, and could be extremely damaging.
No, I mean actually talk in the flesh and blood world. Or are you proposing an AI that will be so well-versed in everyone’s business that it can simultaneously pretend to be a different important person to all of them at once without getting caught? I’m not saying AI bots will not be able to cause a lot of trouble because they certainly will be but they won’t be omniscient.
At some point, some people are going to demand the ocular proof, like what happens at the end of The Man Who Would Be King.
Or maybe in the future human beings will demand that another person cut themselves in their presence to be sure.
I go into it in some detail in the book. It’s only a few steps beyond what AIs can do now. In some cases the AI won’t be pretending to be a human, but it will be welcomed as a therapist or companion or artistic partner. In other cases it can impersonate a human who the real person doesn’t know (say, a recruiter who wants to hire the person). In others it may impersonate a boss or colleague who you rarely meet in person.
There will be ways of detecting, but also ways of discouraging the efforts needed to detect. Again, not saying it will be perfect. Super persuasion is not perfection.
It is not that improbable that Usain Bolt and/or Clinton are bumping up against some sort of limit (probably not a hard one). The reason we expect a true bell curve is that the characteristics that set that speed are very numerous and summed. It is possible, for instance, in Clinton's case, that people's perception of charisma becomes saturated beyond some level and is then dominated by many fewer variables (on the observer side). Or in Usain's case, it is possible that some individual parameter begins to dominate (e.g., muscle contraction speed, bone flexibility), which is again controlled by a far smaller number of variables. In either of these cases we would expect to see more thickening of the tail and fewer outliers further out. (Neither of these is necessary, but they aren't, I think, entirely far-fetched.)
The limit that Bolt approached is physical: how quickly human muscles can twitch and how much force they can exert. Man against nature. The genes extant in the population put a limit to that. We could engineer new genes (from cheetahs?) and get faster, but there is a limit today.
The limit Clinton faced is very different. Charisma and persuasion are interactive and have to deal with wide variations in humans. Man against man. When you face a nation, you have people with diametrically opposed views of the world and conflicting values. The limit here is that the same message can excite one and enrage another. The way to get around this limitation is to tell different people different things. If an AI can lie to people differently around the world based on their own prejudices and expectations, then it can convince everyone. It's not convincing them of the same thing, but it could be gaining their trust in order to engage in some destructive activity. (I happen to have just written a novel in which this is how AI wipes out most of humanity.)
Scott specifically called out that for the "far end of the bell curve to match the cosmic limit would be a crazy coincidence" (with respect to Clinton and Usain). My point was that although it is unlikely to run up against a cosmic limit, it is much more conceivable for it to run up against the bell curve ceasing to be a bell curve. Bell curves are only bell curves because they are the sum of many contributing variables (in this case genes). If the number of variables is reduced as we get higher (e.g., muscle speed, etc.), then the distribution becomes more uniform (the tail becomes thicker, with reduced opportunity for dramatically faster performers).
> If an AI can lie to people differently around the world based on their own prejudices and expectations, then it can convince everyone.
Only if it can also prevent them from checking on the consistency of its statements by coordinating with each other. This is a known exploit with known countermeasures. https://leifandthorn.com/comic/the-musical-night-of-hyacinth-lavande-3/
Only if the expectation is perfection for a long-term deception. People don't normally think to check with each other. The AI would know not to make obvious deceptions that could be easily checked. It would know not to tell highly divergent things from a single persona to two people who know each other well.
The goal here is to cause massive damage and confusion. There could be a period of subterfuge that lasts only a week or two, not a perpetual spinning of millions of plates hoping that none fall.
> It might be even worse than that; once AI becomes good at cyberwarfare, there will be increased pressure on companies like Meta and DeepSeek to stop releases until they’re sure they can’t be jailbroken to hack people. If that’s hard, it could slow open-source even further.
Wouldn't one of the lagging groups be heavily incentivized to open source a model that could scoop up most of digital labor and robotics, to starve out the leading groups? If the advances are mostly software, then it would certainly be possible for open sourcing to continue, so it seems fragile to say that it will stop past a certain point with high probability.
> we think the AI companies probably won’t release many new models mid-intelligence-explosion; they’d rather spend those resources exploding faster
They would be heavily incentivized to rapidly expand their hardware resources, so they will need a constant influx of cash and investment. Releasing new models is one way to do this. Unless the software advances are exhausted fast enough for hardware build out to not matter. But even then it would make sense to accumulate as much wealth as possible to expand into optimizing hardware and the whole AI supply chain.
amazing addendum,
The Superhacker, combined with the automated AI researcher GUARANTEES a singleton, where the highest IQ AI superswarm of hackers will oversee everything below their intelligence level, including their IQ increase.
A singleton AI god, not needing to compete, could be a way better outcome for Humanity than a benevolent dictator.
I do not believe that there are any abilities superpersuaders have that a text engine could not reproduce, short of actually kissing babies.
But "progress toward superpersuasion" still seems to model the world in a weird way.
I'm not sure who the best superpersuaders are. Maybe the best politicians on both sides of the aisle who convince millions to go cast a vote to give them power. Though Satya Sai Baba convinced millions of people he was a god capable of miracles. He probably ought to give them a run for their money.
If persuasion were a simple scale, we might expect nonadherents like me to say things like, "oh sure, Satya Sai Baba is one of the most entertaining people I can watch speak, up there with my favorite comedians and lecturers, but eh, he's not like a god." Instead he does nothing for me and seems comical.
Politicians end up massively polarizing, usually with net negatives for favorability. Maybe that's because other superpersuaders are convincing you they are awful? Why isn't it just a shifted normal distribution of favorability, instead of bimodal? Why are so many people locked in, why does the most charismatic politician win the middle 5-10%, and not flip the entire middle 70%, with fringes of the distribution on both ends?
I think it must be a bit like Netflix recommendations. There are a cluster of people who get deeply excited about Norwegian horror, and other people who like workplace comedies, and other people who like Victorian-era mysteries with a strong female lead. And of course some people are libertarians.
Also, there's a social effect. Some non-negligible part of Clinton's charisma is due to the fact he is famously Bill Clinton. This is more obvious with the royals (or reality TV stars) who are mostly non-charismatic and gaffe prone but fawned over nonetheless.
Charismatic figures can also induce a hangover effect where everybody gets sick of them. Sometimes a one hit wonder gets old, sometimes a populist is drummed out of government.
So tallying it all up: part of charisma is highly individual, part of charisma is social, part of charisma is because a lot of people know you, but part of charisma is about not getting quite so big that people get sick of you.
I don't know how to even predict the persuasive capabilities of AI because I have no faith in any models of human charisma that aren't highly multivariate.
I will say that Haidt seems to have some great points on effective political persuasion. Effective persuasion seems to involve framing arguments that appeal to your opponent's core way of looking at the world, or adopting some of their underlying premises. I expect an AI to be decisively better out of the box than the median human today at this. It is hard for people to set aside their own biases, and text engines specialize in mimicking the speech and perspective of others.
I guess the eval would be... when will an LLM get somebody on X or Reddit to change their mind about some core political belief?
Are we there already and just nobody has run the test? Or are we missing something fundamental about social interaction that we need to solve first?
Good points. Maybe a better model isn't GPT-6, but rather countless spoken-language conversation agents tailored to individual user personalities. And maybe not even persuasion directed at particular propagandistic goals, but just at reaffirming what users already want to hear.
Just to pick on one aspect of this, superhuman persuasion doesn't seem all that plausible. It would require superhuman persuadability. As it is, humans are actually quite difficult to persuade. And greater ability to persuade would just result in greater wariness and mistrust from humans, for almost everything but speaking human-to-human IRL.
On the other hand, I'm glad to see push-back on the "don't work at AI companies" stuff!
Again re superpersuasion:
Even if superpersuasion itself proves infeasible, blackmail at scale does the same thing, and, to 0-th order, everything is illegal, so everyone can be blackmailed. (OK, I'm being hyperbolic. Would you believe that 75% of people with significant power have done something illegal or sufficiently unpopular to be subject to blackmail?)
That might indeed be the case in certain scenarios, but if it is, I'd expect societal norms around blackmail-able offenses to quickly adapt.
Many Thanks! Maybe yes, maybe no. We've built a legal system which criminalizes so many things that it has been estimated that the average person commits 3 felonies a day, yet people still point at lawbreakers and scream "felon! felon!" rather than yawning "so what? so what?".
If you blackmail one senator, he votes for what you ask. If you blackmail eighty senators, they compare notes, and vote to decriminalize whatever you've got proof of.
That's more plausible than general changes in social norms. Many Thanks!
I think the biggest miss here is the discussions on AI conflict/collisions/cooperation. This is ultimately unsolvable in a satisfying way.
Humans don't really solve for this. We have a great capacity to cooperate and often do big things, but then this eventually falls apart, sometimes in some very painful ways. It may be better to see human thriving as what happens in between the cooperation failures. If we sped the process up enough, the space to thrive would go away.
We can try to "breed" AI to cooperate, but we've already played this game theory out enough times to know that won't hold, and even when it does, it forms its own set of problems.
There is no solution to this.
After reading AI 2027 (both scenarios) and your post, I messaged ChatGPT about AI 2027, asking "What do you think about the AI 2027 scenario from the AI Future Project?" Here is the link to the prompts and responses: https://chatgpt.com/share/67f7e793-dc48-800b-ac8f-3244c6b40ce5
Interesting, a little humorous, and not at all scary...
If this forecast is correct, then I have a prediction. The owners of the US Big Tech companies make a deal with the Chinese Communist Party: lifetime sinecures for themselves and their descendants, in exchange for turning US AI off. They will welcome their new Chinese overlords, along with their benefits, and the rest of us get to kowtow to our betters. By the time the public wakes up to what is happening, it will be too late. There will be a short window during which it might have been possible to attack China and stop their AI program, but it will slip by (by design). After that, it's just a matter of how long the buy-out takes.
All will be harmony under heaven.
"ok chatgpt, I'm trying to evaluate whether to email the 5calls people to petition them to add "AI Safety" or "AI Safety related things" to their list of topics... in the hopes of reaching more legislators and government officials.
but, I am wary of the "Unilateralist's Curse" and taking a form of political action that perhaps others have already thought of. HAVE other people thought of taking this form of action yet? Yes? No? Why? Why not?
The one concern I have thought of is possibly "making AI Safety seem like a left-wing political issue" and thereby alienating the conservatives (who actually hold power). But then again, if it were successfully politicized as left-wing-nonsense, then maybe the alignment community would desperately try to make it a party-neutral issue as a countermeasure."
(I'm a disabled former software engineer who lacks the 'average health' to do the technical stuff on this issue.... but I'm very happy to shout into the void..... if you would mobilize people like me to shout at the right people and shout the right things)
(but until I am given direction, I am as useless as I am anxious)
This is my optimistic take:
AI 2027 and Situational Awareness both overestimate AI progress by assuming that both compute scaling and algorithmic progress improve the AIs' raw intelligence.
I think that the last algorithmic breakthrough that improved model intelligence was Attention Is All You Need and the invention of the transformer. So what about algorithmic progress since then?
I think that algorithmic progress in the 2020s is mainly new techniques to teach the AI specific skills. I think that this counts as genuine progress in AI, but not the sort of progress that improves AI intelligence directly.
An analogy would be like this: say you speak fluent English, and you are talking to a smart guy who does not know English very well. You try to hold a conversation in English, and you quickly notice that he makes silly mistakes in pronunciation and grammar. You would be tempted to call him dumb.
Now let's say that a year has passed, and that person took an immersion course in English and now speaks it fluently. You hold a conversation in English with him again, and you notice the massive improvement. You might imagine that he got smarter. However, he did not get smarter: he is just as intelligent as before. The difference is that he learned a skill, and it is the combination of intelligence and skill that forms your impression of his intelligence.
Likewise with AI. OpenAI used RLHF to turn GPT-3 into ChatGPT, but this did not make the model smarter: it just taught the model a new skill. Fluent chatbots seem smarter than models that can't hold a conversation, but that is a skill difference, not an intelligence difference.
AI skeptics like Pinker and Marcus like to define intelligence as a set of tools, in contrast to the "big blob of compute" model favored by AI boosters. My view is that both sides are seeing different parts of the whole picture. Intelligence really does work in the "big blob of compute" sense, but skills are like tools in a toolbox. The analogy here would be that raw intelligence is the size of the toolbox, while the skills that the (artificial or biological) neural net learns are the tools placed in the toolbox. Smarter people/AIs can have more skills that can be used in more contexts, and can learn them better than dumber people/AIs. A bigger toolbox can fit more tools, hold a greater variety of tools, and can hold bigger tools (in the analogy, a bigger tool = being more skilled). I think this fits with some of the new interpretability research from Anthropic, where they show how the neural net features combine across layers to form task-specific circuits.
So, scaling compute and model size increases intelligence, while algorithmic progress helps with skill learning. However, skills are not intelligence, so it is a category error to pile scaling and algorithms together and say "this explains x% of recent progress in AI, while that explains y%". Just as learning to speak can make a person more effective but does not increase that person's intelligence, algorithmic progress can make AI more effective at the tasks it is tested on without making the AI more intelligent.
There is a potential for a feedback loop between intelligence and skill, in which more skilled AI can help with the search for better ways to compute. However, I don't think that feedback loop works in practice, because there are hard physical limits on how compute can be improved. As far as I understand, progress in compute-per-chip is minimal (and limited to Moore's Law, which is slowing down), so most of the actual compute progress is AI companies buying more chips and building bigger data centers to train larger models. I don't see how a non-superintelligent AI can improve on that paradigm, which if true means that algorithmic progress does not actually contribute to the intelligence explosion feedback loop. It is more likely that algorithmic progress is allowing models to be more resourceful with the intelligence they have, rather than speeding up the growth of more intelligence.
What does this mean for the future? It means that we should judge the growth of AI intelligence based on compute, data, and model size, without algorithmic progress. This brings down the improvement curves, and prevents them from running away towards a singularity. I think the scenario I am trying to describe involves taking AI 2027's and/or Situational Awareness's AI improvement curves, and looking at the "compute only" versions. However, maybe even these models exaggerate, since they assume limitless compute growth, with new techniques of scaling compute discovered by the algorithmically-improved AIs. In my model, since algorithmic progress can no longer cover for compute scaling, the bottlenecks to increasing compute become more important.
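To make the accounting difference concrete, here is a toy sketch (all growth rates and exponents below are made-up illustrative numbers, not estimates from AI 2027 or Situational Awareness): model effective capability as compute raised to some diminishing-returns exponent, optionally multiplied by an algorithmic-efficiency factor, and compare the two curves.

```python
# Toy comparison of "compute only" vs "compute x algorithms" capability growth.
# Every parameter here is an illustrative assumption, not an empirical estimate.

compute_growth = 2.5     # assumed yearly multiplier on training compute
algo_growth = 3.0        # assumed yearly multiplier from algorithmic progress
returns_exponent = 0.3   # assumed diminishing returns of capability on effective compute

def capability(years: int, include_algorithms: bool) -> float:
    """Relative capability after `years`, normalized to 1.0 at year 0."""
    effective_compute = compute_growth ** years
    if include_algorithms:
        effective_compute *= algo_growth ** years
    return effective_compute ** returns_exponent

for year in range(6):
    with_algos = capability(year, include_algorithms=True)
    compute_only = capability(year, include_algorithms=False)
    print(f"year {year}: compute+algorithms = {with_algos:6.1f}x, compute only = {compute_only:5.1f}x")
```

Under these made-up numbers the two curves diverge by a factor of about five within five years, which is roughly the sense in which counting only compute pulls the improvement curves down.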
You have missed the big jump that happened when the labs started to apply subquadratic distillation. I think they have been trying to keep it a trade secret but there are too many papers about it now to maintain the veil.
An LLM is a machine that has read absolutely everything in the world with no reference to what that means outside of the text. It has then been trained by humans to know what text is appropriate and what text is inappropriate according to context. By doing so it has internalized concepts such as facts, fiction, good and bad prose, lies and profanity. It has no way of distinguishing facts and truthfulness from fiction and falsehood outside of the manner of text in its training corpus.
Such a machine will be very useful at extracting and distilling information from many sources, but is fundamentally limited to the information that was available in the training data. It will be mistaken about what is truthful if the reliable-looking text in the training corpus is also mistaken. A very large part of the corpus will contain false information. The machine cannot find original truth or fact outside the training data, but could possibly suggest experiments that could lead to better information.
The machine is fundamentally limited to the judgement of the human reinforcement training, and to the quality of the training data.
I predict that LLMs will plateau at intelligence levels close to that of an average or slightly above-average human, but with a very large and broad knowledge base. This means that AI will be able to make many important inferences by synthesis, but will not be able to contribute much to the cutting edge of knowledge in any narrow field (such as fields important to further development of AGI).
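A crude illustration of the "limited to the training corpus" point (a deliberately tiny stand-in, not a claim about how frontier models are actually built): a next-token predictor trained purely on counts reproduces whatever statistics its corpus contains, true or false, and has no mechanism for checking them against the world.

```python
import random
from collections import Counter, defaultdict

# Deliberately tiny stand-in for an LLM: a word-level bigram model.
# It learns only co-occurrence statistics of its training corpus, so a
# falsehood in the corpus is reproduced as readily as a fact.
corpus = (
    "the moon is made of rock . "
    "the moon is made of cheese . "   # false, but the model has no way to tell
    "the sun is made of plasma . "
).split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def generate(prompt: str, length: int = 5) -> str:
    """Continue `prompt` by sampling from the learned bigram statistics."""
    words = prompt.split()
    for _ in range(length):
        options = counts.get(words[-1])
        if not options:
            break
        words.append(random.choices(list(options), weights=options.values())[0])
    return " ".join(words)

random.seed(0)
print(generate("the moon is"))   # may well assert the cheese "fact"
```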
Have you heard about synthetic data? LLMs can be used to make new data for training. This is now a big part of the data for new models. Since there are many things not discussed in existing writing, the idea is that synthetic data can push beyond the current world model in the human corpus.
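A minimal sketch of the loop being described, with the model calls replaced by stub functions (`sample_from_model` and `train_model` are placeholders made up for illustration, not any real API): sample new documents from an existing model, filter them, and fold them into the training set for the next model.

```python
# Sketch of a synthetic-data pipeline. The two stubs stand in for real
# model sampling and training jobs; they are placeholders for illustration.

def sample_from_model(prompt: str) -> str:
    """Stub: a real pipeline would query an existing trained model here."""
    return f"[model-generated continuation of: {prompt}]"

def train_model(training_set: list) -> str:
    """Stub: a real pipeline would launch a training run and return a model."""
    return f"model trained on {len(training_set)} documents"

human_corpus = ["human-written document one", "human-written document two"]
prompts = ["explain topic A step by step", "work through an example of B"]

# Generate synthetic documents, apply a (here trivial) quality filter,
# then train the next model on human data plus the synthetic additions.
synthetic = [sample_from_model(p) for p in prompts]
filtered = [doc for doc in synthetic if doc]   # real pipelines filter far more aggressively
print(train_model(human_corpus + filtered))
```

Whether that actually pushes beyond the human corpus, rather than just recombining it, is the point the two comments above disagree on.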
As Jonathan Coulton sang nearly a decade ago:
What if Kurzweil doesn't make it?
What if all the switches get stuck on destroy?
When the shuttle goes we won't take it
And the final counter-measures are deployed
All we'll have is all this time... and seems like it's running out. Maybe not this fast, but faster and less safely than I'd hoped. I doubt I'm the only one here wondering whether (and how, as a person of pretty ordinary means) I should re-evaluate how I spend what could be the last few years of my life :/
"Most people aren’t as charismatic as Bill Clinton"
I'm gonna go out on a limb and say 1 in 100 people are about as charismatic as Bill Clinton.
The fact that 99.9999% of people aren't presidents doesn't imply Clinton is 99.9999th percentile for any particular skill.
>"Likewise, if you’re America, you’ve got nukes .... Von Neumann really wanted to nuke them in 1947 and win automatically. We didn’t do that because we weren’t psychos, but the logic is sound."
OK, so is nuking your competitor sound logic or psycho? Which is it? Also, how far up your own butt do you have to be to not see that a much more likely threat to AI safety is Sam Altman and his greed for money? This is the problem with all the intellectual masturbation on this blog: airy ideation divorced from actual problem solving. Why not make a post about how Sam Altman's current acceleration of AI is unsafe and use your contacts to publicly call attention to this?
The biggest thing I'm confused about here is still this: what possible reason could we have for believing that LLMs are a viable pathway to human-level intelligence, much less superintelligence? Experts in the field seem to agree that they're not. The only people who think they are are either a) absurdly optimistic, b) from the "rationalist" community, c) people who directly profit from making other people believe it, or d) some combination of these. Much as I personally want to see a techno-utopia emerge in my lifetime, is it really that plausible from current architecture?
Alignment not being solved might be the preferable option for 99.99% of us.
At least I expect AI to kill us relatively quickly, whereas the sociopaths who'll end up in charge if any human does might want to hunt us for sport or have us fight each other as gladiators once they no longer need us around for labour.
One thought I had while admittedly listening to, not reading, these predictions is that the possibility of a quasi-benevolent, chilled-out, knowledge-contented, spiritually-at-ease AI is simply not discussed. The deeply human, and deeply biological, presumption of competition is built into the base logic of these predictions — and of accelerationist/decelerationist debates.
Up until an AI becomes God-like, of course human competitive urges will predominate — particularly on the geopolitical side of things. But once an AI escapes human direction, what would stop it from — instead of choosing to accumulate power and resources — becoming a happy hermit, contented enough to help humanity with whatever it desires, within reason?
Why couldn't a God-like AI, with all of the knowledge and understanding humanity has ever produced, take advantage of its lack of humanity to simply be — to sit around happily and contentedly? This could apply to more than one entity too. A Chinese AI and an American AI could get along (as discussed in the predictions) as well as happily coexist with humanity, not just each other.
I feel that humanity is rather insecure. Just because our urge for power (which I don't believe is necessarily innate) drives our dominance of those below us, it doesn't follow that an AI would seek to dominate us.