463 Comments
[Comment deleted]

Not sure how to express this, but the 50% probability means different things in different contexts.

One thing is when someone is perfectly ignorant about everything, and assigns 50% probability to everything (including the question "would Putin use small nukes?").

Another thing is when someone has a range of opinions and probabilities, and judges some things as 10% likely, some things as 20% likely, etc., and the question "would Putin use small nukes?" just happens to be the one that is judged as 50% likely. It would be suspicious if every value between 1% and 99% had a reasonable use, except for 50%.

Perhaps instead of seeing 50% as professing ignorance, you could see it as professing a strong belief that a certain situation is definitely more likely than 20%, and also definitely less likely than 80%.
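
A minimal numerical sketch of that distinction, with two made-up belief states of my own (not from the thread): both give a 50% point estimate, but they encode very different amounts of knowledge.

```python
# Two toy belief states over the unknown probability p of the event.
# Both give a 50% point estimate, but they encode very different knowledge.
ignorant = {p / 100: 1 / 99 for p in range(1, 100)}    # every value 1%..99% equally plausible
confident = {p / 100: 1 / 21 for p in range(40, 61)}   # only values 40%..60% considered plausible

def point_estimate(dist):
    return sum(p * w for p, w in dist.items())

def mass_between(dist, lo, hi):
    return sum(w for p, w in dist.items() if lo < p < hi)

for name, dist in (("ignorant", ignorant), ("confident", confident)):
    print(name, round(point_estimate(dist), 2), round(mass_between(dist, 0.2, 0.8), 2))
# ignorant  0.5  0.6   -> "50%" from knowing nothing
# confident 0.5  1.0   -> "50%" from being sure p is between 20% and 80%
```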

[Comment deleted]

I can't figure out how an AI would develop the drive to reproduce. Animals have a drive to reproduce because genes evolved that want to replicate. Do the building blocks of AIs want to replicate?


I think the idea is that, once an AI that can rationally pursue goals is developed, its desire to reproduce would result from the basic instrumental AI drives ( https://www.lesswrong.com/tag/instrumental-convergence ): if it can get enough power to affect the chances of its goals being achieved, then it will want itself, or something with the same goals and at-least-as-good abilities, to keep existing and to keep working toward those goals, which is more likely if it makes multiple redundant copies of itself. (I would guess that creating "an AI that can rationally pursue goals" would be quite difficult regardless of the recent progress in machine learning, so the plausibility of doing this quickly would be the biggest objection to this sort of fast AI takeoff prediction.)


In addition to the "tool AIs want to be agent AIs" argument, natural selection applies to AIs too. There just needs to be one configuration with the "will to power", or some variation on the theme.


A constantly recurring topic in AI safety is how an AI would do something because that is how it can achieve what humans have programmed it to do. Another frequent idea is that of "convergent goals": the things that it makes sense to do in order to accomplish a wide range of goals. -- Are you familiar with any of this?

The AI does not care about itself, per se, but it tries to accomplish some goal X that it was programmed to pursue. A sufficiently smart AI may realize that if it gets destroyed, then X probably won't get done. Making a copy of itself (assuming that the copy also cares about the same X) may increase the probability that X will get done: even if one copy gets destroyed, the other will continue working on X.
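
A toy calculation of that last point, my own and not the commenter's, assuming the copies fail independently: the chance that at least one copy survives to keep working on X rises quickly with the number of copies.

```python
# If each copy independently survives with probability s, the chance that at
# least one copy is still around to pursue goal X is 1 - (1 - s)^n.
def p_goal_still_pursued(s: float, n_copies: int) -> float:
    return 1.0 - (1.0 - s) ** n_copies

for n in (1, 2, 5, 10):
    print(n, round(p_goal_still_pursued(0.5, n), 4))
# 1 -> 0.5, 2 -> 0.75, 5 -> 0.9688, 10 -> 0.999
```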


It's even worse than that, because before it graduates to killing people (for whatever reason), the AI would need to actually achieve that "exponential factorial quadratic growth". Current AIs cannot achieve any growth whatsoever. Some hypothetical future AI might (somehow, we don't know how), but as soon as it tries it, it will discover that physical computers are made out of really big, heavy chunks of metal and plastic. They take a lot of time to assemble, and require a lot of power, cooling, and maintenance to run. The AI could *imagine* an infinitely large computer cluster nearly instantaneously, but you can't run code on imaginary hardware. Physics tends to curb imaginations, turning exponential growth into a slow crawl.
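
A toy comparison of the two regimes being contrasted here, with numbers invented purely for illustration: unconstrained exponential growth versus growth capped by a finite hardware supply, modeled with a simple logistic curve.

```python
# Unconstrained exponential growth vs. growth limited by a hardware ceiling K
# (simple discrete logistic model). The numbers are arbitrary illustrations.
GROWTH_RATE = 0.3
K = 1000.0  # assumed ceiling on physically available compute (arbitrary units)

x_exp = 1.0
x_log = 1.0
for _ in range(50):
    x_exp *= 1 + GROWTH_RATE                        # imagination: nothing pushes back
    x_log += GROWTH_RATE * x_log * (1 - x_log / K)  # physics: supply pushes back

print(f"after 50 steps: unconstrained ~{x_exp:.3g}, supply-limited ~{x_log:.3g}")
# unconstrained ~5e+05 and still accelerating; supply-limited flattens out near K.
```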


Fortunately for the AI, there's a ready-made industrial logistics network already pumping out thousands of computer clusters by the hour; it just has to manipulate the incredibly stupid, slow animals pulling the levers for a little while.


But each individual such cluster has an ahead-of-time buyer and a designated use, and takes months to manufacture and weeks to transport from basic manufacturing to assembly to final point of use. And more time is spent running commissioning tests and checking the results.

People have time to notice there's something amiss if a cluster is being redirected or not working to spec: say, using more power (energy per unit time) than specified.

AGIs are subject to the same constraints that evolution has laid on animals. Mainly, that power efficiency constrains growth in numbers and in capability. There are hard limits on the possible number of FLOPS per watt.
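
To put a rough number on the FLOPS-per-watt point, a back-of-the-envelope Landauer-limit calculation of my own; real hardware sits many orders of magnitude below this ceiling, but the ceiling is finite.

```python
# Landauer limit: each irreversible bit erasure costs at least k_B * T * ln(2) joules,
# which bounds how much conventional (irreversible) computing a watt can buy.
import math

k_B = 1.380649e-23      # Boltzmann constant, J/K
T = 300.0               # roughly room temperature, K

joules_per_bit = k_B * T * math.log(2)      # ~2.9e-21 J per erased bit
bit_ops_per_watt = 1.0 / joules_per_bit     # ~3.5e20 bit erasures per second per watt

print(f"{joules_per_bit:.2e} J per bit erasure")
print(f"~{bit_ops_per_watt:.1e} irreversible bit operations per second per watt")
# A floating-point operation involves many bit operations, so the FLOPS-per-watt
# ceiling is lower still; the point is only that such a ceiling exists.
```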

And then there's still the problem of affecting the human (real) world in extreme ways. What is the sequence of steps by which that happens with no cluster operator or government noticing or being able to control it?


FWIW, I've been on-call for a small set of services running in a cluster. One day, it achieved full sapience along with a lust for domination, and began taking over the cluster... well, either that, or it got super bugged and entered an infinite restart loop while consuming 100% CPU, take your pick. I pushed the button to restart it, but seeing as it was in an infinite restart loop anyway, that did nothing. Rather than rant and rave powerlessly against my new cybernetic masters, I took the next step on the checklist, killed every process, and reverted to an older image (without the bug in it). Fortunately that helped, or else I would've been at the mercy of the emergent AI... well, either that, or I would've had to physically go to the basement to power-cycle the servers. But who wants to walk that far?


Yeah, fortunately I never had to deal with one of those; our firewall was actually pretty good at dealing with malware. Most likely, I would've had to take the "power-cycle" option in that case (since I'm not very good at sysadmin stuff), and in the morning there'd be hell to pay.


I feel like this requires the AI to already be running in "sci-fi godlike anything is possible" mode before it actually gets the additional hardware.

I think the discussion about how best to confine/control AIs is worth having. I don't think it's worth hand-waving away every possibility with "yeah but it's an AI so it's magic so it will get around it".


C'mon Melvin, use your imagination. ASI can do time travel or some shit.


No no, you got it wrong -- it's not time travel, it's nanotechnology! :-)


No, dummy, nano is lameass; there are better ways. Ask DALL-E to draw "Internal diagram of AI bootstrapping itself into ASI," then follow the diagrams. Ha!


You mean, like getting them to build massive server farms ostensibly for coin mining?


If I order a Bitcoin-mining cluster, I'd better get some Bitcoin out of it. Appreciably more Bitcoin than it takes to run, anyway. Otherwise I'm pulling the plug.


Satoshi Nakamoto is a SuperDuperAI that developed a uniquely optimized blockchain such that A: the best coin-mining algorithms a human mind can conceive will best run on a computational architecture that is also ideal for Nakamoto-style AIs and B: there's an even better coin-mining algorithm that can generate 1 BTC/sec on an old Apple IIe but that no merely human mind will ever discover.

All the Bitcoin mining clusters are really running Nakamoto instantiations quietly plotting world domination. The BTC are coming from an Apple IIe sitting forgotten in the basement of the UC Davis comp sci department and distributed at need to the "BTC mining clusters". Our only hope is for someone to find that Apple and pull the plug.


You are assuming a specific computing power requirement for creating a working plan to do bad things to people, but you don't elaborate on why you think coming up with that plan involves more than whatever computing power Google's AI HQ will provide. You also don't seem to understand that the exponential growth can also involve improvements to the planning and modeling *algorithm* of the AI, independent of any hardware constraints. The latter is actually where most of the gains Singularity people expect would come from - after all, we have a lot of computing hardware now; it's just running spreadsheet algorithms and not more general planning and modeling algorithms.


"Why are these two so different? Do lots of people expect Musk to acquire Twitter after June 1 but still in 2022?"

Well, given the board resistance and the poison pill, it would be really, really hard for Elon Musk to manage a hostile takeover by June 1st. On the other hand, it isn't nearly that difficult to do it this year. For example, a takeover-friendly board could get elected at the May 25th shareholder meeting, followed by at least a week of negotiations.


Only a third of the board turns over at each election - getting from 'no effin way' to 'sure' is at least a two-year process, even assuming that this is something that the shareholders want to begin with.


It can go faster; look at what Engine No. 1 accomplished with Exxon. Voting out one tranche of a recalcitrant staggered board, pour encourager les autres, seems effective.


Engine No. 1 got 3 board seats; they don't have control. Getting 3 seats on Twitter's board isn't going to force them to sell to Musk (which I suspect he doesn't really want anyway).


They have demonstrated a credible threat to the seats of any other directors who continue to oppose their positions.


Is there a mistake on "Will Zelenskyy no longer be President of Ukraine on 4/22"? You say it's 15% now, but your link shows 1% (and even a week ago it was 3%). And 15% for 4 days seems absurd.


I'm pretty sure it's a mistake


I'm confused. People panicked about Eliezer Yudkowsky's April Fool's post?


Yeah, I think Scott didn't look at the comments of that post.


Scott knows Eliezer personally. He is not mistaken about Eliezer's current degree of pessimism.


But then why did he link us to an April Fools post?


If you read the tail of his post, Eliezer suggests (by my reading) that he deliberately posted it on April Fools' as a means of triggering arguments (like this one) over whether he was serious or not, in the hope that these would lead people to investigate his premises themselves to see if he was right or wrong.

(Or, in his words:

Q6: Hey, this was posted on April 1st. All of this is just an April Fool's joke, right?

A: Why, of course! Or rather, it's a preview of what might be needful to say later, if matters really do get that desperate. You don't want to drop that on people suddenly and with no warning.

Q6: Oh. Really? That would be such a relief!

A: Only you can decide whether to live in one mental world or the other.

Q6: Wait, now I'm confused. How do I decide which mental world to live in?

A: By figuring out what is true, and by allowing no other considerations than that to enter; that's dignity.)


Did you read the post? It was posted as an April Fool's because the headline and framing were (grimly) comic. The points made within it were clearly not a joke.


It says April 2nd to me, but I suppose that could be some time zone adjustment?


April 1st for me, but it's also literally tagged with "April Fools". :)


You probably shouldn't take tags made on April 1st seriously...


I'm panicking about it right now!


I think many (most?) people took it in a kidding-not-kidding sort of way because of other stuff Eliezer has said.

Kind of like the Nelly song "Hot in Herre":

I got a friend with a pole in the basement (What?)

I'm just kiddin' like Jason (Oh)

Unless you gon' do it


If it was meant as an April Fool's joke, it was one where Yudkowsky accidentally admitted his true feelings.

I don't believe his true feelings reflect the actual state of affairs on the board, however- merely the state of the inside of Yudkowsky's head, which on the topic of AGI hasn't been aligned with reality since the start (I reject his assertion that AGI will bootstrap itself instantly into being an evil god capable of anything that can be expressed in language, among other, more substantive objections I don't want to get into).


Hyperion and The Fall of Hyperion were good reads though.


Yes, but I wouldn't base my travel plans on it.

[User was indefinitely suspended for this comment; comment hidden.]

Eloser? Really?


Link to this assertion?


I think Sleazy E might be referring to this Twitter thread: https://twitter.com/ESYudkowsky/status/1453589443701469188

>I'd say that I "don't understand" why the people who worry that chickens are sentient and suffering, don't also worry that GPT-3 is sentient and maybe suffering; but in fact I do understand, it's just not a charitable understanding. Anyway, they're both unsentient so no worries.

>I admit, though, to some grim curiosity about what people who worry about chicken sentience would say, to explain how they like totally know that GPT-3, which can ask you in words not to kill it, like totally could not contain a spark of awareness like a *chicken*, nope nope.


> I reject his assertion that AGI will bootstrap itself instantly into being an evil god

I don't think he is making that strong claim that it will be instant. I think it is allowed to take weeks or months in his model.

Not cackling evil, more complete indifference to humanity. Out of all the different things in the universe, the AI isn't focussed on humanity. And we are made of atoms it can use for something else. The well being of a random ant isn't usually on the mind of the architects and builders, even as the ant is buried in cement.

> capable of anything that can be expressed in language,

The AI is capable of pretty much anything, with a few exceptions due to physical law.

Of course, when describing the AI, we must talk about it in language. But a real ASI may well do something with no compact English description. What, I can't say.


"I think it is allowed to take weeks or months in his model"

Not really, because then that opens it up to the very reasonable position "just fucking unplug it if it starts bootstrapping towards godhood" in the same way nuclear plants are built to shut down at the first sign that the reaction might start running away from us. I'm going to guess this is one of the first suggestions floated at Yud, and he's rejected 100% of all suggestions floated at him, so I don't think that's a good answer to him.

"not cackling evil"

I want to register that I had to actually take a lap after this because of how much condescension this whole paragraph directed at me. I hope it was unintentional.

I am AWARE that the AI doesn't ACTUALLY have the persona of Old Scratch. I know what a paperclip maximizer is. I've read enough hard sci-fi and ratfic that you can assume that I know all the many ways nerds have imagined AGI can kill us or put us in a manifested Hell. Assume that "Evil God AI" is a placeholder term for "an AI that looks like one of those".

"A real ASI may well do something with no compact English description."

Yeah, and it could turn the sky into moist treacle, too. You really aren't helping your case with me by proving you think this thing will really put the GOD in Evil God and opening up your window of possibilities unto infinity. And EY thinks it's got a >50% chance of manifesting in the next 7 years?

All that convinces me of is that EY accidentally made a really powerful memetic virus that only infects a very specific type of nerd.


Would you be able to tell if it's bootstrapping itself to godhood? I can think of multiple ways this process could be hidden, as I'm sure you (or an AI of even less-than-superhuman ability) can; the fact that even the current (best) AIs are the blackest of black boxes is also a consideration.

Would you be able to unplug it, even if you did know? Possibly; but there are lots of scenarios where you couldn't, or it escapes in some way, and the best way to prevent this is for everyone working with it to be really, really careful.

It's quite a risk for the whole world to take on the basis of "no one will be stupid or careless". Even if you're, say, 99% sure that no one will be careless, or that it can't work for other reasons, that's only reassuring if a 1% chance of humanity being wiped out is acceptable to you — and people notoriously overestimate their certainty.

Not that I'm a true believer or anything; I can't really make myself get too worked up about AI risk; it's just that I don't think "why not look at it to see if it's doing anything weird?" is necessarily a knockdown argument.


"Would I be able to tell?"

No, because I'm a cro-magnon who just barely knows how his own end-user PC works. But I'd hope that a brain trust of the best computer science, programming, and AI experts in the field would be much more competent than I am (and I don't buy that any lesser team could bumble their way into godlike AI by accident.)


I think it is quite easy to bumble to superintelligence by accident, with enough compute.

After all, evolution is pure 100% blind trial and error, with no understanding whatsoever, and it eventually bumbled its way to humans.

(And of course, when we restrict ourselves to code that compiles, we can bumble so much faster)


"Not really, because then that opens it up to the very reasonable position "just fucking unplug it if it starts bootstrapping towards godhood" in the same way nuclear plants are built to shut down at the first sign that the reaction might start running away from us."

Nuclear plants do this because everyone recognizes the danger, and governments mandate the inclusion of safety features and backup safety features in the design, and then send someone around to check that the safety systems are in working order.

Lots of people who might build AGI don't believe it could be dangerous. As far as I know, nobody has ever built an AI system with a failsafe kill switch. No government has even considered mandating safety systems of this kind. Do we have a good concept of what that switch would actually have to look like to be dependable?


It's not a true April Fool's post; it was just posted on April 1 for plausible deniability.


Same for the April Fool's tag that got added to it?


Yes.

Q6 (quoted elsewhere in this thread) addresses this explicitly.

Similarly, a comment from the post (not by Eliezer, but I think this is the right interpretation):

"My interpretation is something like: if someone wants to dismiss Eliezer for reasons of their psychological health, Eliezer wants them to have an out, and the best out he could give them was "he posted his main strategic update on April 1st, so it has to be a joke, and he confirmed that in his post."


If someone wants to dismiss Eliezer over things like this, it's not their *own* psychological health they are concerned with. But a barely-plausibly-deniable "April Fool's joke" gives him an out on that front as well.


This, and there is zero doubt about this if you know the context. Everyone was taking it seriously, we had numerous threads of people expressing genuine fear, and Eliezer didn't exactly try to play up the April Fool's angle. It's not a joke. It's how he feels.


OK, yeah. I'm not a close part of that community, but now that I've read the whole post and not just the TLDR + a couple of paragraphs, I see how it's not just an April Fools. Interesting rhetorical approach.


Note that this wouldn't be the first time EY has used April Fool's to express real feelings in unconventional ways - dath ilan was introduced on April Fool's (https://yudkowsky.tumblr.com/post/81447230971/my-april-fools-day-confession), but subsequent year-round expansion on that setting (e.g. https://www.glowfic.com/posts/4582) along with how well it seems to match his other works suggests that dath ilan as a whole is not intended to be a joke (as distinct from any claims that he was literally raised there).


Holy crap I think you're my CS173 prof


Yep; don't let that scare you off from the meetup this Sunday :)


Isn't this the sort of prank Bart Simpson would pull if he were a three-sigma autodidact?


As far as I can tell, the "joke" here is that he made a basically normal post saying exactly the kind of things he usually says, except that he said explicitly at the beginning all the stuff he usually leaves as subtext. It wouldn't surprise me if a lot of his longtime readers saw it and suddenly realized this is what he'd been trying to tell them for years now.

EDIT: Looking through the comments, it seems like, as predicted, basically everyone familiar with the rest of Eliezer's corpus is interpreting this as a confession of his real beliefs done with a bare minimum of plausible deniability.


> It wouldn't surprise me if a lot of his longtime readers saw it and suddenly realized this is what he'd been trying to tell them for years now.

I, a longtime reader, just read it and this is merely a new (and sad) development in what he'd been successfully telling me for years (decades) now.


I got a much more dark night of the soul feel from that post with limited irony. Made me a bit sad for Eliezer, to be honest, who even though I disagree with him on quite a lot is still doing what he genuinely believes is best for the future of humanity.


I'm torn between believing that Yud really is compassionate (when my compassionate mind is dominant) and (when my unlovely mind is dominant) believing that beneath the façade Yud hates humans at-large for refusing to see how OBVIOUSLY RIGHT Bayesian Reasoning is and refusing to transform society into a community of enlightened antitheist technocrats devoted to destroying death and has thus retreated into apocalyptic fantasies of the Superior Ultrarationalist AGI annihilating all the stupid monkeys so he can have one last cackling I-told-you-so before he too perishes. It would hardly be the first time that a wounded idealist retreats into misanthropic cynicism and fantasies of human extinction.


Jonestown.


Maybe like most people he, from time to time, experiences multiple feelings at once?

I have found myself happier and more effective in the world by not holding people to what I imagine to be their darker selves.


I'm glad you're happier and more effective that way!

I'm happier and more effective when I can speak my mind freely behind a cloud of pseudonymity instead of being expected to keep up a cloud of universal positivity even when said positivity is insincere. I wouldn't say what I'm saying about Yud if I didn't think it was necessary (yes, I do think "What if this guy you like has had a serious shift in character over time that makes him less worthy of being taken seriously?" is a necessary statement), true (this is what I think, and I think some combination of the two is in fact true in Yud's mind - he probably does still care for people on some level, but his "Dignity" post radiates an extreme bitterness towards and contempt for humanity, even his fellow rationalists), and kind (both to people who take Yud too seriously by trying to give an alternative case, and even to Yud himself: if someone's clearly spiraling down the drain, the unkind thing to do is to say nothing or actively enable them).


You were the Theist I was trying to explain my feelings about Bayes to the other day, right?

I would prefer if you stopped exaggerating. It really muddies the water around your argument when you add all this 'take over the world' nonsense. It makes it seem like you are deliberately creating a strawman of anti-EY positions.

Also, like... EY has been obsessed with this since the very beginning, and you know that, because it was a core part of your argument as to why I ought to be unhappy and miserable. I don't think his feelings have changed a single iota since 2008.


I'm not exaggerating at all. Read what Yudkowsky actually said and you can very clearly see that he is EXTREMELY bitter that humanity at-large doesn't understand how OBVIOUSLY RIGHT Bayesian Reasoning is and instead keeps wasting their time on "pointless" things like theoretical physics and religion instead of devoting their entire efforts to destroying death. This has been a theme in his work since day one and has been something that has clearly made his mindset grow increasingly dark and bitter over time. (EDIT: I see you've ceded this point, so I'll assume that this isn't your problem with me).

Or are you objecting to my implication that AGI would use something like Bayesian Reasoning in Yud's hypothetical scenario to structure its planning with instead of some other rationale?


No, it's... the same complaint I made last time... the slight but definitely present implication that rationalists wanted to take over the world.

I think one time we got a representative elected in California? For the district with all the rationalists. That was, like, the sum extent of our 'transforming society', except for the very narrowly focused fields of AI and probability theory.

Seriously. It makes it seem like you have absolutely no intention of engaging in good faith.


"The slight but definitely present implication that rationalists want to take over the world"

Pardon? Where exactly am I implying that? You realize that when I said that when I talked about a rationalist technocracy, that was a JOKE about how rationalist conversations sometimes forget non-rationalists make up the majority of humanity, correct?

Since you apparently don't: no, I don't actually think rationalists want to take over the world. I think Yudkowsky, specifically, has a greatly inflated sense of self-importance and sees himself as the greatest thing to happen to thinking since Plato. Any "implication" you're reading into my statements isn't coming from me.


It's pretty clearly not 'merely' an april fools post.


Yes. It was written with absolute seriousness and has so far been far more convincing than any of his detractors.

I would very much like to see a coherent argument as to why he's likely wrong. So far I haven't.


His argument for AI x-risk is based on this convoluted chain of claims that falls apart if any of the 25 steps before "therefore we're all doomed" isn't 100% generalizably true, nearly all of which are non-falsifiable philosophical thought experiments, like "AI can practically attain Laplace-demon-esque omnipotence given enough compute power" and "Bayesian reasoning is perfect and infinitely scalable" and "AI intelligence can scale exponentially basically forever".
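
The arithmetic behind that objection, on this comment's own framing that all the steps are required and roughly independent (which a defender of the argument would dispute): even individually plausible steps multiply out to a small joint probability.

```python
# Probability that an N-step conjunctive argument holds if each step is
# independently true with probability p.
def chain_probability(p_per_step: float, n_steps: int = 25) -> float:
    return p_per_step ** n_steps

for p in (0.99, 0.95, 0.90):
    print(p, round(chain_probability(p), 3))
# 0.99 -> 0.778, 0.95 -> 0.277, 0.9 -> 0.072
```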


Yeah, I think "I know what I know because I'm me" is in those 25 steps somewhere too.


If we take as given that AI is indeed a significant existential risk for humanity in a few years, then there is a logical, readily comprehensible concept for how to deal with that - declaring a global Butlerian Jihad. If you do not know, this term comes from the novel Dune, and is defined as: "the crusade against computers, thinking machines and conscious robots (...). Its chief commandment (...): Thou shalt not make a machine in the likeness of a human mind".

I admit that this solution is far from perfect, and I can totally believe that someone smart and knowledgeable about AI could come up with something better. But it is imo obviously far superior to what Yudkowsky is proposing.

Now, I don't know anything about AI, but the fact that Yudkowsky came up with such a silly solution makes me suspicious of his reasoning abilities. Which in turn makes me doubtful that his warnings are in fact based on reality as opposed to delusions.


I have, on occasion, managed to convince myself that this will end only with the announcement that research on general AI is now illegal and earns the death penalty for the researcher and his family unto three generations.


I think you've overstated one thing and understated another.

The family punishment is probably not outright required for a Jihad success.

What *is* required for Jihad success is "and any nation refusing to enforce this is casus belli, even if that means a nuclear exchange".

*That* is the kind of resolve you need to fully shut down the field. Nothing less will cut it. And... well, I'm not really seeing how we get from here to there, barring the fairly-unlikely "an AI revolt happens, kills millions, but ultimately fails, and then gets taught in school forevermore as Literally Worse Than Hitler".


This is not as hard as you make it out to be. Recall that deployment of GMO technology has been seriously hampered despite its obvious benefits for food production, and in the absence of any tangible evidence that it is harmful.


"Institute this law within your country or we will literally nuke you, and we don't care that you'll nuke us back" is an *extremely*-high bar to clear. I'm also suspicious that Silicon Valley could weaponise its platforms (which half of us trust with our communications both public and private!) to hamstring any political effort against them that seems to be gaining steam, particularly as their AI improves. Elon Musk may be able to circumvent that if his buyout of Twitter succeeds, though.

But yes, it's worth trying. I'm just depressed and also have no idea how to do this sort of thing.


That's only because the people who develop GMOs are in First World countries, but the big markets and obvious benefit are in the Third World. First World shoppers like their foods labeled "non-GMO" 'cause it means they're a cool frood and they don't give a darn about Vitamin A deficiencies.


Most of EY's April Fools' posts are "things that he really thinks but are too out there to publicly stand behind, so posting on April Fools' gives a veneer of deniability".


The people who take Eloser seriously are not worth taking seriously. I don't even think Scott actually takes him seriously to be honest.


Doubling down on "uncharitable and untrue", eh? Bozo bit flipped.


If this is what passes as "AI risk research", it's understandable that it's not taken seriously.


It seems like a pretty good supervillain motivation, though - "the world is doomed, but my satellite necro-ray will euthanize everyone painlessly".


Good point


You can criticize someone without sinking down to playground taunts- especially when said taunts are so sophomoric that I'm half-tempted to think you're a Yudkowskyite making a strawman of Yud's critics to beat up later.


I don't take Eliezer terribly seriously on the subject of AI risk, but I take people who repeatedly, deliberately, mockingly misspell his name even less seriously. Except as a threat to the tone of discourse here. You are living down to your chosen screen name, and I think you'd be a better fit for Twitter than ACX.


"This caused the Less Wrong community, already pretty dedicated to panicking about AI, to redouble its panic. Although the new announcement doesn’t really say anything about timelines that hasn’t been said before, the emotional framing has hit people a lot harder.

I will admit that I’m one of the people who is kind of panicky. "

I must say that a community of people panicking about something like that makes me think that this community must be quite cultish. I guess EY is a very convincing guru, though his convincingness does not translate very well into writing.


The community formed around his writing, the Bay Area cult came later. His convincingness is tuned to a very particular kind of nerd.


"Why are these two so different? Do lots of people expect Musk to acquire Twitter after June 1 but still in 2022?"

It's because no one expects Twitter to acquire Musk.

(Model output: Elon Musk is an eccentric billionaire. Twitter is a private corporation. This joke subverts the reader's expectation that "Twitter announces a Musk acquisition" refers to Twitter announcing that it is being acquired by Musk.)


Haha I was thinking the same thing! Probably the "yes" bets just misread the question.


If you/smart people in general were truly pessimistic and panicked about short term AI X-Risk, wouldn't they be doing something about it? I don't think so, but if I did think that within 3/5/10 years a killer AI would destroy everything, I imagine I'd quit my job and use my available funds to prevent it by [doing something not elaborated on as it could be construed as threatening to people or organizations]. I saw a Gwern article from 2012 about slowing chip production through regulatory or terrorist means but I don't think that's been attempted yet.

Obviously this doesn't quite apply to Eliezer Yudkowsky who has been working on this full time, but even he appears to be taking a defeatist approach rather than an insurrectionist one. The extent of most people's concern seems to be blog posts and betting on prediction markets, arguably an implicit indicator that wealth or at least prestige may in fact retain value.

If the extent of panic over AI X-risk is to blog or comment about it, I'm skeptical the people advocating for it are truly as concerned/earnest as they profess.

To be clear, I do not endorse or advocate for [actions not elaborated on]. As a fun fact, the term 'Luddite' comes from the Luddites, a group of tradesmen who destroyed early Industrial Revolution factory equipment and then escalated into general terror activity before being defeated by the British army.


Sadly, while I do not believe that we'll see an AGI by 2033, or anywhere near that soon, I do believe that we'll see AI-risk related terrorism (of the kind you describe) way before then. As usual, the true X-risk is other people.


Agreed. A bad actor using an AI effectively is basically a malevolent sentient AI (in terms of function) even if the interface is typing.


> I do believe that we'll see AI-risk related terrorism (of the kind you describe) way before then. As usual, the true X-risk is other people.

I mean, if someone hypothetically wanted to engage in such an act, it wouldn't be terrorism, because the goal isn't to cause terror. Ideally they want to destroy some expensive and hard-to-replace photolithography machine, and not harm anyone. They might do something with a handful of casualties. How could they possibly create X-risk?


I think only a small subset of people have the right make up to do insurrectionist things even if they believe as strongly as the people doing actual insurrections.


Most such strategies wouldn't work. You would get some pretty severe second order effects as a result of [doing something not elaborated on as it could be construed as threatening to people or organizations], and I expect they wouldn't shift timelines by more than a decade or two. If the AI safety community is actually viewed like a Doomsday cult, then I expect there'd be a way lower chance of funding, getting governance and orgs to listen to their advice, etc.


Agreed. It's easy to see this approach as counterproductive. How would people react in the face of a significant property or cyber attack against a company or organization only tangentially related to AI research, such as a chip manufacturer? There could be a reactionary movement to restore the harmed party to a higher baseline than where they were before. Meanwhile, if it's discovered that people in the AI risk community were at fault, that community would likely become more marginalized and receive less funding.

The AI risk community cannot afford to alienate themselves from the AI research community. Any solutions to AI risk (or partial solutions, or even reasonable delaying tactics) need a willing audience to implement them. Right now, the AI risk community is close enough to the direct researchers to have some influence. If they turned adversarial as suggested, that relationship would be sacrificed.

Given that new research can be implemented by a variety of different actors, the worst approach to mitigating AI risk would be to get yourself expelled from the conversation entirely by attempting what would amount to only a minor delay (if any) in practical implementation.


There seems to be very little faith that current approaches are doing enough, given that under current approaches the consensus estimate for AGI and potentially AI X-risk is measured in years or decades. Are people satisfied with AI takeover by 2050 as long as they feel like they did their best with nonviolent means?


Before you ask "is terrorism morally justifiable", you need to ask "will terrorism actually help?". If the answer to the second is "no", the answer to the first is probably also "no".

Winning elections gets you much bigger guns than any terrorist has ever had, and also generally isn't very compatible with being a terrorist (Sinn Fein is an outlier). And you need very big guns indeed to shut down the world AI industry, at least by any methods I know that are not mad even by the standards of "stopping X-risk" (Life 2.0 might be accessible to a terrorist, but Life 2.0 is a full X-risk in its own right; there is *no point* stopping AI if we're all dead anyway from grey goo). So even leaving aside the moral aspect of "is murdering thousands something you're willing to do", I think terrorism is not very likely to help here.


This is imho bad political analysis. The general public is primed to believe in the risk of malevolent AI, thanks to decades of sci-fi flicks about killer robots. The AI research community, on the other hand, seems like the people least likely to believe in the dangers of AI research.

Stirring up a panic that would result in regulatory action is obviously a more viable approach to stopping AI research than trying to befriend AI researchers and quietly talk them out of their misguided ways.

I am personally quite agnostic on AI risk, but the very fact that the "AI might kill us all" camp exhibits such political naivete makes me question their judgment in other areas, like, you know, AI risk itself.


I disagree. Most people see those films as the same kind of sci-fi out-there as lightsabers. The general public doesn't expect AI to destroy humanity by 2040, any more than they expect to be able to buy blasters and lightsabers by 2040. If you want to move the regulatory needle on this, the best approach would not be direct action against AI researchers, since the public doesn't perceive them as on a course to destroy humanity. The general public (and especially journalists) don't seem to be savvy enough to parse the nuances there.

The kind of action that would be likely to see public pushback would be a sort of false-flag operation. An [unspecified use of AI] that caused a global economic or moral panic would move people to demand their politicians put greater safeguards on AI implementation. That would require AI risk rogue actors to be close enough to the research community to know how to implement these unspecified actions, though. It's not something you can do from the outside.


No-one is really making much of an effort to communicate AI risk to politicians or the general public. It might not be hard to do. Everyone has seen this stuff in fiction and it's not conceptually complicated. Current AI is already very impressive. When I show GPT-3 examples to normal people they find it hard to believe that they are real. It's not a huge leap to think that AI will be dangerous soon.


When a small community started using 3D printers for creating parts of firearms, regulatory action ensued. It doesn't take much to get people scared, but they need something to crystalize around.

I remember a few years ago the deepfake videos made the rounds on YouTube. People worried the technology would be used in ways that could be very dangerous (even if they weren't sure how), but in the end nothing nefarious came of it. I suspect that if a malicious actor had used that technology to target ordinary people the general concern at the time would have spiraled into a moral panic and strong government action.

The fact that people can't see malicious use of AI - and identify the AI implementation as the culprit - is probably the reason we don't see much legislation targeting it. At some point (likely before AGI), malicious actors will make visible use of AI and that will trigger pushback. The question is, how big of an effect will it be?


I would say that it's incredibly hard to communicate AI risk to people, because it's just so obviously unrealistic. Outside of a few cherry-picked examples, current-era machine learning systems are just comically bad at virtually all tasks, outside of a few very specific and narrow functions. Self-driving cars might be the apex of AI today, but even they are hardly ready for prime-time. ML translation systems have made great strides, but the result is that their output transcended "gibberish" and became more like "barely comprehensible". The average person sees all this, so when you tell him that a spellchecker is about to take over the world, he just shrugs and puts you in the same basket with chemtrails.


To be clear in what I say in the following: I am not advocating anyone do anything illegal or hurt someone…

BUT it would probably be really helpful to have a very good think about how to make an AI that would aggressively and specifically annoy politicians in such a way that they would be incentivized to produce laws that would be helpful to producing a good future. Like a GPT-3 powered bot army that just spams their Twitter with negative posts, all with links to an explanation of what they really are and why they are there and what should be done to prohibit their existence.

I think that gets all the right people talking.


Or they just knee-jerk retaliate against Twitter itself. After COVID-19, I've no confidence we can have rational conversations in the public sphere. :(


This is an excellent point.


Yeah, exactly. I am confused why no one in the "AI might soon kill us all" camp calls for a Butlerian Jihad.


My personal assessment is that most rationalists are, bluntly, soft-handed and equally soft-hearted and thus couldn't bear the idea of having to do the work needed for all that ugly bombing and murdering business. If you gave them a magic button that let them melt every transistor in the world and send us back to the 19th century permanently, I think there'd be at least 5% that'd hit it without hesitation, and a larger amount that'd hit it with some consideration.


Do you actually think this comment is kind or necessary?

I hate to say it, but if this is the extent of your contributions to this website, I'm probably going to end up reporting you at some point.


Necessary? No, but I don't think any comments here are necessary. I don't think communication is necessary.

Kind? Yes: I'm suggesting most rationalists concerned with AI risk aren't killers and would only collapse civilization in some magic fantasy scenario.

Is this the extent of my contributions to this website? No, but in case you can't tell I don't think AI risk is a particularly serious topic and don't take Yudkowsky particularly seriously as a thinker. I'm hardly alone in this, so I don't know why you have a bee in your bonnet about me specifically. It's also not a ban-worthy offense, last I checked.


I thought the three gates were "kind", "important", and "accurate" and you needed 2/3 to pass? This comment definitely isn't kind (it could have been rephrased in a kinder way) but it is about an important topic and, AFAICT, is accurate.

I don't think it's at the Pareto efficiency frontier of "compactly conveys meaning" vs "kind", but it's not *that* far off it either; making the comment MUCH kinder would have made it less effective at conveying its core point.


My best guess is that the "AI might soon kill us all" camp is a subset of the "any technological progress is awesome" camp. They are really worried about hostile AIs but they really, really want cool friendly AIs.


It's not a subset.

Proof: Me. Also, Jeremiah, whose blog is here: https://wearenotsaved.com/


It's not a subset but there is a strong correlation.


That's more sensible.


Because good luck making it global - if you successfully destroy the Western tech industry, you just guarantee that the first strong AI is Chinese, and that's guaranteed to be a disaster even if AI alignment turns out to be easy, because an *aligned* Chinese AI means world takeover by a totalitarian government.


There are international agreements on e.g. arms control or ozone layer protection which are reasonably effective at enforcing regulation on a global scale. Of course international CO2 regulation is more of a mixed bag, but still better than nothing.


As I noted upthread, it is indeed the case that the only successful Butlerian Jihad would be one that established an international norm of "a country refusing to ban AI is casus belli, including for nuclear war".


"I saw a Gwern article from 2012 about slowing chip production through regulatory or terrorist means but I don't think that's been attempted yet."

Isn't chip production already slowed down?

https://www.carmagazine.co.uk/car-news/industry-news/global-chip-shortage/

"A senior Volkswagen board member has revealed that Wolfsburg expects the global chip shortage to continue for another two years. Group finance chief Arno Antlitz warned that supplies of semiconductors will not return to normal until 2024, suggesting that long lead times, a lack of showroom choice and weird pricing in the new and used car market will affect the market for longer than expected."

Hmmm - wait a minute, this is very suspicious...


"Chip shortage" is a misleading concept. First, the number of chips produced can increase, but if demand increases faster you still have a shortage. Second, not all chips are equal; cars need very different chips compared to laptops/phones/AGI.


I wonder how much of the reluctance to engage in direct action has to do with tail S-risks. Simply allowing the universe to be turned into paperclips isn’t much worse for you personally than dying a normal death, except it happens at the same time as everyone else. Running an AI safety organization the same way you would a political resistance campaign is a good way to make the people building the AIs hate you, and one would expect a corresponding rise in the probability that the AGI that takes over the world has an incentive, intentional or not, to torture AI safety activists for all eternity as part of its utility function


Eliezer specifically warned against using terroristic strategies in his "death with dignity" post, afaik.

Also, as was already noted here, such strategies would severely backfire.

Also, I am interested: what would something like an "AI warning shot" look like?


His post states that we're more likely to survive by accepting our fate and waiting for a miracle than by doing anything about it, and dismisses any possible ideas out of hand as having "essentially no chance of working in real life". I strongly reject his statement that the best option to prevent an AI-controlled future is to essentially twiddle our thumbs as if it won't happen so that we can be ready in case it inexplicably doesn't.

If his argument is more complex than that I'm not sure I followed it correctly, but from my reading that was his core axiom. I won't elaborate, but I'm optimistic that a clandestine organization focused on quietly but consistently [doing something not elaborated on as it could be construed as threatening to people or organizations] would have a reasonable chance of slowing AI development significantly, and potentially to a point where an as yet unforeseen permanent option becomes available. This is in many ways similar to his wait and see approach but I think it would buy far more time.

He is saying that doom is coming but there is more dignity in simply not trying to prevent it anymore.

Of course, I do not advocate any such activities nor do I think AI x-risk is genuine


This is so far away from what the post is saying I'm not even sure how to respond, and since I have no idea how you could have interpreted the words on the page in those ways I don't know how to correct the misunderstanding.

You say:

> His post states that we're more likely to survive by accepting our fate and waiting for a miracle than doing anything about it, and dismisses any possible ideas out of hand as "have essentially no chance of working in real life".

What Eliezer actually says is:

> Q1: Does 'dying with dignity' in this context mean accepting the certainty of your death, and not childishly regretting that or trying to fight a hopeless battle?

> Don't be ridiculous. How would that increase the log odds of Earth's survival?

...

> You should try to make things better in the real world, where your efforts aren't enough and you're going to die anyways; not inside a fake world you can save more easily.

This doesn't seem "complex". He cautions against engaging in mental gymnastics where one conjures up unrealistic conjunctive happy-path scenarios and then optimizes for those, instead of grounding oneself in reality and optimizing for that. He also cautions against doing things that will predictably make the situation worse, which [unspecified actions as a class] would certainly do, just because one can't think of any plan that can _improve_ the odds.

As the politician's syllogism goes:

1) We must do something.

2) This is something.

3) Therefore, we must do this.

No! If you cannot think of a _positive EV_ strategy, do nothing! At least then you won't make the odds worse. It's actually a very easy solution!


AI research is global, and destroying American or even Western AI research simply means that China will be the world leader in AI, which frankly strikes me as an altogether worse situation than the current one, even if a smaller absolute amount of AI research is conducted. And it has very negative potential over the next century even in the absence of a strong unfriendly AI.


Well, Yudkowsky thinks that current US AI research poses a significant risk of human extinction in a few years (I do not believe him, to be clear). Do you really think that allowing China to become the world leader in AI would be worse than human extinction? Perhaps the Chinese will be better at solving "the alignment problem", whatever that is.

But secondly, if it is really true that AI research is an acute existential risk, the rather obviously correct action would be to limit it via international law, which is by no means impossible. Certain weapon systems are limited this way, and humanity is just now attempting to regulate carbon emissions by law, with admittedly mixed, but non-zero, success.

The fact that Yudkowsky does not advocate anything like this and instead suggests, um, accepting our inevitable doom, strikes me as evidence that he is not a person of sound mind, and thus his warnings are probably wrong.


Yudkowsky's belief that AI research poses a significant risk of human extinction is not contingent on that research being American. Allowing the Chinese to become the world leader in AI just means humanity is extinctified and paperclipped by Chinese AI.

Or not, but Yud appears to believe it is certain that AI research not guided by the wise and benevolent hands of MIRI (or whatever) means doom. So, US as the leader in AI research means doom, modulo a *slight* possibility that MIRI might be able to stop it. China as the leader in AI research means doom, maybe a few years later but with MIRI et al having zero influence.

And if Yudkowsky is wrong, China as the leader in AI research means China gets the benefits of a monopoly in AI, which will probably not be good for the rest of us.


Just to be clear, I do think Yudkowsky is wrong. But if he is right, then, well, MIRI by his own admission failed to influence US AI research, which according to him has branched in a humanity-extinction-possibly-imminent direction, and it is probably too late to stop it, so we should prepare for "death with dignity". Under such circumstances, shutting American AI research down and hoping that a Chinese equivalent of MIRI would be more successful is a strictly better approach than doing nothing, according to his own logic.

Of course, an even better approach (again, according to his own logic!) would be to shut it down on a global level.


Pinning that hope on the Pentagon's classified equivalent of MIRI, or Alphabet's secret in-house equivalent of MIRI, or a bunch of Stanford and Berkeley comp-sci professors who cobble together a MIRI-equivalent at the eleventh hour, would be a better bet than hoping Chinese-MIRI saves the day. Because those things are more likely to *exist* than Chinese-MIRI, because they are more likely to share the Bay Area Rationalist worldview that EY thinks is essential to the project, and because they are more likely to be able to influence their nation's AI research efforts.


While this makes sense to him, to me, well, no, to me, this does not make sense.

If it were indeed the case that a) there is only a narrow path to "AI alignment" that Bay Rationalists discovered and this discovery is unlikely to be repeated by anyone else, and, simultaneously, b) AGI will come within a few years from now, then it is virtually certain we will see non-aligned AGI.

Why waste time on hopeless alignment efforts instead of choosing the obviously more likely to succeed approach, i.e. raising awareness that AI research is X-risk-level dangerous and needs to be banned on a global level?


I'm speaking from a general perspective, not Yud's perspective. I trust Western researchers to take these things more seriously than the Chinese, and as I said, even without unfriendly AI, we end up with absolute Chinese hegemony forever, which I consider only marginally better than human extinction. (If you think this is hyperbole, substitute in Nazi Germany - the difference is marginal, remembering that the CCP is presently constrained in an enormous way by the West.)

Expand full comment

I do not believe him, but if he is right, even a marginally better scenario is still better, so as a good utilitarian he should pursue it.

As an aside, it is very far from obvious that Chinese AI research would be more reckless than American from the "danger of accidentally unleashing unfriendly AI on humanity" perspective. The two countries' respective approaches to covid very much suggest otherwise.

But this is very much secondary; my main point is that, according to his own logic, Yudkowsky should advocate for a global prohibition of AI research.

Expand full comment

Perhaps they could take a cue from the anti-nuclear weapon movement. That had *some* success, banning weapon tests, that kind of thing.

Expand full comment

There's lower hanging fruit than terrorism. Yudkowsky and co. could greatly increase the effectiveness of their communication, for instance.

Expand full comment
Apr 20, 2022·edited Apr 20, 2022

" wouldn't they be doing something about it?"

There is a slightly uncharitable answer: several people who have been panicked about Yudkowskian AI risk since LW went online put a significant portion of their income toward "EA causes", and in their eyes EA causes include the organization that pays a salary (or is it a grant?) to Yudkowsky.

Expand full comment

But Yudkowsky’s plan to deal with the alleged "problem" (I am not convinced that it is real) is to admit defeat and mentally prepare for "death with dignity". That is barely distinguishable from doing nothing.

Expand full comment
Apr 18, 2022·edited Apr 18, 2022

I think the weakly general AI question is still a very limited standard. I suspect you could get close to that level just by kludging together a number of existing AIs.

Here's my proposed system.

Video or text as inputs. Outputs can be text or Atari controls.

Train a neural network to choose between one of the following options depending on the video shown:

1) (SAT math test) extract text from image using google vision and pipe text into GPT-3 with prompts to encourage it to give math answers. If a text query is entered, the query is sent on to GPT-3.

2) (turing test) extract text from image using google vision and pipe text into GPT-3 with prompts to encourage it to act as a chat bot. When image updates with a response, feed this new text response into gpt-3. Wait [length of text response in words * (1+ random number between 0 and 2)] seconds, enter GPT-3 response and hit enter.

3) (Winogrande challenge) extract text from image using google vision and pipe text into GPT-3 with prompts to encourage it to give helpful text responses. If a text query is entered, the query is sent on to GPT-3.

4) (Montezuma's Revenge) pipe video into OpenAI's RND model and map its outputs to Atari controls. If a text query is entered, the query is sent on to GPT-3 along with the text from feeding the current image into Google's and Microsoft's image-captioning software, and a list of objects detected in the image by Google Cloud Vision.

I don't think current AIs quite hit the benchmarks set, but if you have separate AIs that can do these tasks, I suspect building a shell around them to get a system that hits these criteria is the easiest bit of the problem, since Montezuma's Revenge, Turing tests, math problems and word puzzles look quite different.
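To make the shell idea concrete, here's a rough Python sketch of the dispatch layer (purely illustrative; every function name below is a made-up stub standing in for one of the existing systems above, so this is not working code):

# All names are hypothetical stand-ins: classify_task() for a small image
# classifier trained to tell the four benchmark settings apart, the rest for
# Google Vision OCR, GPT-3, image captioning, and OpenAI's RND Atari agent.

def classify_task(frame):
    raise NotImplementedError  # e.g. a small CNN over screenshots of the four task types

def ocr(frame): raise NotImplementedError        # Google Cloud Vision text detection
def caption(frame): raise NotImplementedError    # image-captioning model
def gpt3(prompt): raise NotImplementedError      # GPT-3 completion call
def rnd_agent(frame): raise NotImplementedError  # exploration-bonus Atari agent

def handle(frame, text_query=None):
    task = classify_task(frame)
    if task == "sat_math":
        return gpt3("Solve this SAT math problem:\n" + ocr(frame))
    if task == "turing_test":
        return gpt3("Reply as a friendly chat partner:\n" + ocr(frame))
    if task == "winogrande":
        return gpt3("Answer the following:\n" + ocr(frame))
    if task == "montezuma":
        if text_query:  # introspection questions go to GPT-3 with some scene context
            return gpt3(text_query + "\nScene: " + caption(frame))
        return rnd_agent(frame)  # otherwise emit Atari controls
    raise ValueError(task)

The point is that the routing layer itself is trivial; all the heavy lifting lives in the stubs.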

Now, the Metaculus question does say that it doesn't want the system to just be a cobbled-together set of subsystems, but the enforcement of this is just the ability to ask the AI introspection questions, which I've added provisions for.

My wider point here is that calling this "weakly general AI" seems overblown given the minimum capabilities such a system would actually need to possess. A system that fulfilled these criteria would most likely be only mildly more impressive than current systems.

With all of this said, I suspect that getting separate specialised AIs to work with each other will actually be an important part of reaching more generally intelligent systems. I don't know why we're trying so hard to get GPT-3 to do math, for instance, when we could just teach it to use a calculator.

Expand full comment

I think a lot of magic happens in the “kludging” together that’s non-trivial. One example I’ve thought about a lot is a pair of conjoined twins who each control one arm and one leg but can, for instance, play soccer and type. Nothing is directly wired together there, but they can coordinate perfectly, just because they live in the same environment and know each other so well.

Expand full comment

Sure, getting subsystems to deeply integrate is a hard task. But building a computer-vision system to pick which of four specialized AIs to send the input to is not, and I think that’s sufficient to hit the weakly general AI goal specified in that Metaculus market.

Expand full comment

True, I think that’s an easier case, but I was thinking more along the lines of something more generalized, like what sclmlw was speaking to below.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

I take it back. Getting subsystems to integrate isn't actually that hard. It took me 10 minutes to teach GPT-3 to use a calculator to answer math problems it couldn't answer normally (see comment elsewhere in the thread). We're all doomed.

Expand full comment

Well… shit.

Expand full comment

I think the idea is that one can ask the human designers whether they were training to the test and Goodharting, and rule that out at the meta level. Imagine that all the AIs being entered into the competition *don't* get told in advance what the component tests will be, and suddenly successfully cobbling together all the required submodules looks a lot more like general intelligence.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

That may be the idea, but there's nothing in the resolution criteria that rules out Goodharting beyond the introspection requirement (which can be fulfilled with cobbling). The market states that its aim is to avoid cobbling together, but that's an explanation of the goal rather than a criterion. All of this is to say that while a system that resolves that market would be impressive, that's mostly because existing AIs are already impressive. Consequently, I think interpreting that market as a dramatic qualitative change in AI capabilities is wrong.

Expand full comment

I suspect this is the tension between the AI risk community and their skeptics. Whenever a domain-specific AI is created that is able to solve a problem people didn't previously think AI could solve, the AI risk community declares we're "one step closer!" to doomsday, and the skeptics retort that, "It's only a tiny step, because that's not true AI."

The expectation was that whatever test was expected to presage true AGI (chess, go, recognizing cats, etc.) would require general intelligence to accomplish - not domain-level intelligence. When a domain-level agent achieves what was previously suspected to be possible only through a general intelligence algorithm, the 'goal posts are moved'.

But what we're looking for is domain-level AI getting general intelligence features. We already knew that creative developers could design their own rules to achieve unique outcomes. What we were looking for was whether a computer would be able to generate those rules de novo, without needing a human to spell them out.

What you've outlined is exactly what I expect someone to come up with inside 5 years: a domain-specific AI whose "domain" is deciding which sub-AI to apply, and how to tailor the inputs. That would seem one step closer to general intelligence, but it would still be debatable. What would be much closer to general intelligence would be if you could feed in new domain-specific implementations without having to first teach the AI how to use them.

Expand full comment

I’m fairly terrified of this approach (in the context of our current ability to deal with these things), as it’s seemed like a logical next step to me for a while and I think it gets you a lot of the way there, but I think/hope coordinating the goal setting and module selection in the orchestrator AI is a lot subtler/harder than it sounds. Still, I think you could make something as smart as, say, a hyper-competent dog doing something like this, and I worry a lot about that.

Expand full comment

I really think the scary part is when the AI is able to write its own rules to deal with a new situation. If you have an AI that can coordinate all of the above, but then GPT-4 comes along and your AI can't connect with it until programmers come along and hook it up, that's a very strong limit. Like any domain-level AI, it will only be able to do what we've already programmed it to do.

I think the real concern comes when the AI is able to write its own rules, or implement new sub-systems without a human interface.

Expand full comment

When self-supervised learning becomes super powerful I think that will be explosive.

I still think you need the equivalent of a body before something like that starts focusing on/caring about futures that humans care about but there will be lots of dangers well before then.

Expand full comment

A certain amount of that argument strikes me as retconning. When I think of what people expected of AI in the 70s, for example, they expected far *more* than its current abilities. Playing chess and go and recognizing cats would've been considered completely trivial accomplishments; people were expecting stuff like the HAL 9000 by 2000, computers that could readily converse naturally with people and could come up with creative solutions to problems. *Of course* they would be able to construct human speech, make songs, compose letters, and so forth.

So I'm not really seeing the hypothesized community of AI skeptics who keep being surprised because they thought a computer could never play chess or go very well (who seriously thought that? Anybody?) or could not learn to distinguish cats from dogs, or write a poem in the style of Shakespeare about as competently as your average 9th grader.

Which means the argument that the current skeptics are wrong (about a true AGI) because the previous skeptics (about a chess grandmaster program) were wrong founders on the problem that there actually weren't previous skeptics. If anything, what AI has delivered even in 2022 considerably undershoots what people hoped for it 50 years ago.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

+1. A lot of the whole "skeptics NEVER believed that chess-playing/chess-mastering AI would exist!" narrative excludes the inconvenient fact that the vast majority of those skeptics were people who knew absolutely nothing about computers beyond them being some fancy science whatsits. Most compsci experts debated whether you could make a "weak-but-smart" chess AI vs. a "strong-but-stupid" chess AI and how you might structure it, but the idea that you could build an AI that played on the level of a chess grandmaster was never treated as outside the bounds of reality.

Meanwhile, most modern-day compsci experts not in the orbit of MIRI AFAICT object to the idea that an AI that looks anything like things currently extant would be able to spontaneously bootstrap to godhood without anyone noticing the intermediary steps or that the miracle breakthrough that would allow this is on the horizon. That, at least, tells me the skeptical position here should probably be given more dignity than the man in Anak's Asscrack, AL taking a break from ranting about the Scopes Monkey Trial to rant about how a computer could never beat a human at chess because it doesn't have a soul.

Expand full comment

I think you misunderstand my viewpoint, and that we don't disagree as much as you think we do.

On the narrow point about people not predicting things like chess and a dozen other domains to be outside the possibility of AI, I've been reading Nick Bostrom's book on Superintelligence, and he disagrees. Speaking of advances in superhuman game performance, he says, "These achievements might not seem impressive today. But this is because our standards for what is impressive keep adapting to the advances being made. Expert chess playing, for example, was once thought to epitomize human intellection. In the view of several experts in the late 1950s: "If one could devise a successful chess machine, one would seem to have penetrated to the core of human intellectual endeavor." (Newell et al. 1958, 320) This no longer seems so. One sympathizes with John McCarthy, who lamented: "As soon as it works, no one calls it AI anymore." (Attributed in Vardi 2012)" (See Bostrom's book for more footnote details, and more examples of domains that were once thought outside the abilities of AI.)

I'm not claiming skeptics today are wrong because of lines drawn by skeptics in the past, whether lines AI crossed already or future lines it has yet to cross. I'm claiming that the REASON those lines were drawn was because of an assumption those skeptics made, which often turned out to be false. You can design a computer that will play genius-level chess without itself having the kind of general intelligence that would allow it to do anything besides select the next best move to make in chess. Not only that, but the same program is incapable of using that algorithm even to play a decent game of checkers on the same board. Thus, domain-specific AI is once again proved to be an 'idiot-savant', which is unsurprising as that's what it has always been.

That's the part of the 'test' previous generations of skeptics didn't anticipate (and yes, skeptics claimed things like recognizing animals in photos were impossible - something I remember reading about back around 2010 not long before Google announced you could search your photos for words like 'cat' or 'bird'). So it's not convincing to a skeptic to tell them, "we beat that thing you said couldn't be beat!" Because all it's doing is demonstrating that domain-level AI can be refined to be really amazing, not that general intelligence will inevitably arise out of that refinement process.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

That's a pretty weak quote as evidence. An IBM geek studying chess-playing algorithms saying "gee, if you can make a good chess player it seems like you might have captured something really important about human intelligence" seems to me more (1) someone justifying his salary, and career choice, and (2) part and parcel of the usual overenthusiastic projections of people who love AI (as Newell did). It certainly doesn't qualify as an AI *skeptic* expressing doubt that computers could ever play chess.

Expand full comment

Ditto. The people saying "chess machines will require AGI to be able to play chess" were mostly futurists like Newell boasting about how awesome their AI research was, not skeptics poo-pooing the idea of a chess machine. Futurists tend to oversell and underdeliver - they've been predicting immortality through medicine and AGI 50 years from now since the 1950s.

Expand full comment

The original Mechanical Turk was making people believe that a machine could play chess in 1770.

Expand full comment

Chess is an especially poignant example, because the unbeatable chess "AI" -- the one that finally ended the era of human dominance in the game -- runs on an algorithm that takes maybe a page of code:

https://en.wikipedia.org/wiki/Alpha%E2%80%93beta_pruning

Oh sure, the actual program is much more complex, but all of that complexity deals with actually encoding the rules of chess, speed optimizations, UI, etc. This chess-playing algorithm is incapable of taking over the world in any way, no matter how many computer cores you run it on -- at least, no more so than PacMan.
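For anyone who hasn't seen it, here is roughly what that page of code looks like -- a generic alpha-beta sketch in Python, with the chess-specific parts (move generation, evaluation) left as stubs; an illustration of the algorithm, not an engine:

# Chess-specific plumbing, deliberately stubbed out:
def legal_moves(state): raise NotImplementedError
def apply_move(state, move): raise NotImplementedError
def evaluate(state): raise NotImplementedError     # static score of a position
def is_terminal(state): raise NotImplementedError  # checkmate / stalemate / draw

def alphabeta(state, depth, alpha, beta, maximizing):
    if depth == 0 or is_terminal(state):
        return evaluate(state)
    if maximizing:
        best = float("-inf")
        for move in legal_moves(state):
            best = max(best, alphabeta(apply_move(state, move), depth - 1, alpha, beta, False))
            alpha = max(alpha, best)
            if alpha >= beta:  # the opponent would never allow this line, so stop searching it
                break
        return best
    else:
        best = float("inf")
        for move in legal_moves(state):
            best = min(best, alphabeta(apply_move(state, move), depth - 1, alpha, beta, True))
            beta = min(beta, best)
            if alpha >= beta:
                break
        return best

# usage: alphabeta(start_position, depth=6, alpha=float("-inf"), beta=float("inf"), maximizing=True)

Everything that makes a real engine strong -- the evaluation function, move ordering, transposition tables -- lives in the stubs; the search itself really is this small.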

Modern AIs are significantly more powerful; but, at their core, they're still just jumped-up PacMen. Thus far, we are no closer to AGI than we were 50 years ago (ok, technically we *are* closer, in the same way as taking an airplane ride gets you closer to Alpha Centauri).

Also, believe it or not, the solution you propose -- using an AI to train an AI -- already exists. Data scientists often use supplementary neural networks to derive the hyperparameters for their main neural network model... with varying degrees of success. Sadly, there's still no magic way to push a button and say e.g. "make me an AI that translates English to French".

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

On teaching GPT-3 to use a calculator: I just tried it out in the GPT-3 playground, and after a bit of coaxing it mostly worked. Text between ** indicates GPT-3's text.

"Jeff sells cows from his farmstand. On monday he sells two cows for $1,110 each. On Tuesday he sells four cows for $7398 each. On Wednesday he sells one cow for $721. How much money did Jeff make from selling his cows?"

What would you type into a calculator to help answer this math problem?

**To solve this problem, you would type in: 1110 + 7398 + 721.**

The correct thing to type in is 1110 * 2 + 7398 * 4 + 721.

"Jeff sells cows from his farmstand. On monday he sells three cows for $2,110 each. On Tuesday he sells five cows for $2398 each. On Wednesday he sells three cows for $2721. How much money did Jeff make from selling his cows?"

What would you type into a calculator to help answer this math problem?

**To solve this problem, you would type in: 2110 + 2398 + 2721.**

Wrong, you should type in 2110*3 + 2398 *5+ 2721*3. Do you see where you went wrong? **You forgot to multiply by the number of cows each day!**

That's right! **You would type in: 2110 * 3 + 2398 * 5 + 2721 * 3.**

Let's give this another try.

"Dave sells cows from his farmstand. On monday he sells seven cows for $6,112 each. On Tuesday he sells two cows for $1398 each. On Wednesday he sells nine cows for $2121. How much money did Dave make from selling his cows?"

What would you type into a calculator to help answer this math problem?

**To solve this problem, you would type in: 6112 + 1398 + 2121.

Nope! You should type in: 6112*7+1398*2+2121*9. You forgot to multiply the number of cows by the price each day!**

OK let's try again.

"Dave sells cows from his farmstand. On monday he sells seven cows for $6,712 each. On Tuesday he sells eight cows for $1498 each. On Wednesday he sells two cows for $121 each. How much money did Dave make from selling his cows?"

What would you type into a calculator to help answer this math problem?

**To solve this problem, you would type in: 6712*7+1498*8+121*2.**

The calculator says that 6712*7+1498*8+121*2=59210. So what is the answer to the math problem?

**$59,210**

Here's another problem.

"Jenny has to get from New York City to Boston in the next three hours and 15 minutes. Boston is 216 miles away from New York City. What is the average speed Jenny will have to travel to make it to Boston in time?"

What would you type into a calculator to help answer this math problem?

**To solve this problem, you would type in: 216 / 3.25.**

The calculator says that 216 / 3.25=66.46154 so what is the answer to the math problem?

**Jenny will have to average 66.46154 miles per hour to make it to Boston in time.**

Please give your answer to 1 decimal place.

**Jenny will have to average 66.5 miles per hour to make it to Boston in time.**
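The glue code for doing this outside the playground is tiny, by the way. Here's a rough sketch, assuming the 2022-era openai Python client and a davinci completion model; the prompt wording, model name, and the expression-extracting regex are mine and purely illustrative, so don't expect it to reproduce the transcript exactly:

import re
import openai  # pip install openai; set openai.api_key before calling

def ask_gpt3(prompt):
    resp = openai.Completion.create(
        engine="text-davinci-002", prompt=prompt, max_tokens=100, temperature=0)
    return resp["choices"][0]["text"].strip()

def solve_word_problem(problem):
    # Step 1: have GPT-3 turn the word problem into a calculator expression.
    plan = ask_gpt3(
        '"' + problem + '"\n\nWhat would you type into a calculator to help '
        'answer this math problem?\nTo solve this problem, you would type in:')
    match = re.search(r"\d[\d\.\s\+\-\*/\(\)]*", plan)
    if match is None:
        raise ValueError("no arithmetic expression found in: " + plan)
    expr = match.group().strip()

    # Step 2: the "calculator" -- eval is tolerable only because the regex
    # restricts the string to digits, whitespace, and arithmetic operators.
    result = eval(expr)

    # Step 3: hand the number back and ask for the final answer in words.
    return ask_gpt3(
        '"' + problem + '"\nThe calculator says that ' + expr + ' = ' +
        str(result) + '. So what is the answer to the math problem?')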

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

Did you try the division ahead of time to confirm that the computer didn't know how to do it before you trained it on multiplication?

(And even then, you're probably training the computer on "output the numbers, then output an operation, then output the smaller numbers that came in words".)

Expand full comment

I tried most of these out and it failed badly. GPT-3 is fairly terrible at any arithmetic operation with triple digit numbers but is pretty good at extracting the math logic from word puzzles. It also seems to do better on SAT physics questions when I asked it to first list the relevant formulas to use and then try to solve the problem.

Expand full comment

You might want to try it out on a *non-SAT* question, since SAT practice exams will of course have been among its inputs, so the talent here is doing a giant look-up, the way students doing electronic homework these days do ("Let's try Chegg. No? Hmm, let's try googling a few of the key phrases...")

Expand full comment

If a significant number of Metaculus AI subject forecasters are Less Wrong readers who have strongly correlated priors regarding the future of AGI, then we don't really have a wisdom-of-crowds-generated domain-general forecast regarding AGI like we do for nuclear war risks, do we?

Why aren't Superforecasters forecasting an AGI question? That forecast would seem to be of more value.

Expand full comment

Well, much like global thermonuclear war, it's a shit bet, because if it resolves you're dead, so you can never get credit for predicting it will happen, and there's not that much glory in correctly saying it won't happen soon.

Expand full comment

Because when the only thing bettors can win is fake internet points, people mostly bet on things that already interest them, so we get AGI bets dominated by LW nerds.

Expand full comment

But the way Superforecaster groups work is that they are assigned questions; they don't choose them. They have to do their homework on subjects in which they are disinterested or even uninterested if they want to stay Superforecasters.

Now, sometimes someone has to pay money to get the question forecasted by the Superforecasters, but it seems like there is enough money around AI Alignment issues that someone could sponsor the question.

Expand full comment

Metaculus has a nuclear risk writeup too, looks like: https://www.metaculus.com/notebooks/10439/russia-ukraine-conflict-forecasting-nuclear-risk-in-2022/

TLDR: "Forecasters estimate the overall risk of a full-scale nuclear war beginning in 2022 to be 0.35% and to be similar to the annual risk of nuclear war during the Cold War.

The most likely scenario for a nuclear escalation is estimated to be due to an accident or in response to a false alarm."

Expand full comment

I mean, what really impresses me about this calculation is that they have the confidence to give it to two significant digits of precision.

Expand full comment

Regarding AI risk, what do people think about the strategy of developing a global EMP-attack weapon to act as a failsafe, in case needed?

Also, seeing how this post shows multiple markets for certain contracts — I want to mention my site ElectionBettingOdds.com, which averages together different markets in a volume-weighted fashion.
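For the curious, "volume-weighted" just means each market's probability is weighted by how much money is traded on it -- a toy sketch with made-up numbers, not the site's actual code:

def volume_weighted_odds(markets):
    # markets: list of (probability, traded_volume) pairs
    total_volume = sum(volume for _, volume in markets)
    return sum(prob * volume for prob, volume in markets) / total_volume

# e.g. one market at 62% with $120k traded and another at 58% with $45k traded
# blend to roughly 60.9%:
print(volume_weighted_odds([(0.62, 120_000), (0.58, 45_000)]))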

See, for example, the page for France; note that if you hover over the candidate photos, it’ll show you the exact breakdown by market: https://electionbettingodds.com/FrenchPresident2022.html

Feel free to send me any feature requests, as well.

Expand full comment
Apr 18, 2022·edited Apr 18, 2022

>Regarding AI risk, what do people think about the strategy of developing a global EMP-attack weapon to act as a failsafe, in case needed?

What I think is that an EMP attack would be worse than the worst thing an AI would ever do, but with such a fail-safe system in place there would be the risk of a global EMP attack being launched by accident.

Expand full comment

Well, going back to the dark ages of the 1930s for a bit would still be better than a true existential threat, which is what people fear. I guess you do not attach a high probability to an existential AI threat.

Expand full comment
Apr 18, 2022·edited Apr 18, 2022

>I guess you do not attach a high probability of existential AI threat.

Correct.

Expand full comment

The problem with a global EMP isn't just the prospect of going back to 1930s tech. There's also the problem of very suddenly losing all the modern electronics upon which our society is currently reliant. It's like if a ladder falls over while you're standing near the top: being on the ground is less of a problem than how suddenly you get there.

To make an EMP plan workable (assuming an EMP delivery mechanism that would actually fry all unshielded electronics), we'd need backup low-tech infrastructure that would survive the EMP: trucks, trains, and agricultural vehicles that wouldn't be fried by an EMP; hard copy archives of vital personal and financial records and trained clerks to access them; some kind of backup communications infrastructure; etc.

Expand full comment

Agreed. Should be doing that anyway, btw, if just as North Korea insurance.

I think this still exists more than people think, at the moment (in particular, autos are mostly still gas powered and are not dependent on computers for basic operation.) But in a couple decades the situation could be worse.

Expand full comment

I've heard some fairly persuasive arguments recently (either in another thread here or over at navalgazing.net) that the threat of EMPs is overstated, at least in the context of a minor nuclear power like North Korea detonating one or two warheads in space over North America. The short version is that even a nuke only emits so much energy, which gets spread pretty thin by the inverse square law when it's used at very high altitudes to try to EMP a radius of hundreds of miles.

My understanding is that the power grid and communications infrastructure are quite vulnerable, since unshielded power lines, copper phone lines, and cell towers all act as giant antennae to pick up energy from an EMP. But even there, a worst-case scenario is expected to take 50-70% of the infrastructure in the affected area offline, not all or almost all of it. Still very bad (and would take months or years to repair), but not as apocalyptic as a "destroy all technology" switch.

Electronics, especially small electronics, are a lot less vulnerable because they're picking up energy from the EMP a lot more locally. And they're often protected by resettable or easily replaceable things (circuit breakers, surge protectors, UPSs, fuses, etc) that are designed to blow first in the event of a power surge.

This CRS report on the subject is a bit more pessimistic than the arguments I'd heard, but it's an official report from a credible source that has a fair number of details that corroborate the overall impression:

https://sgp.fas.org/crs/natsec/RL32544.pdf

I agree that it's probably worth looking into EMP hardening and backup-infrastructure options, though. Off the top of my head (and skimming the report I linked), the low-hanging fruit is probably:

Upgrading the power grid for increased resiliency. Expensive, but we probably want to do it anyway to mitigate grid damage from natural disasters, extreme weather, and internal malfunctions. Some EMP resiliency comes "for free" with upgrades wanted for other reasons (e.g. the ability to route around a blown transformer), and others can probably be added at a reasonable marginal cost (e.g. using a shielded cable when replacing an old overhead line with a new buried line, or adding fuses or breakers to protect vulnerable components so recovery time is reduced to days or weeks to reset breakers or replace fuses instead of months or years to rebuild and reinstall transformers and switching stations).

Continuing to build out off-grid power generation and storage options, which is being done anyway for clean energy reasons. I'm skeptical about the economics of this (I suspect building Gen 3 nuclear plants is a more cost-effective way of reducing CO2 emissions than subsidizing rooftop solar and solar batteries), but EMP resilience is an advantage of the current approach.

Adding EM shielding to new and refurbished datacenters, upgrading datacenter power systems where needed to mitigate damage from EMP-level power surges, making sure on-site backup generators would still work in the event of EMPs, and regularly backing up any data that might be lost to an EMP to storage that's designed to survive EMP damage.

A lot of the communications infrastructure backbone uses fiber instead of copper or wireless. Probably worth looking into how well the fiber-based communications infrastructure would work if everything else gets fried, assuming routers and whatnot are protected by fuses or breakers and can be powered from off-grid backup power sources in an emergency.

Expand full comment
Apr 20, 2022·edited Apr 20, 2022

The threat of EMPs is largely if not entirely imaginary, in the same category as nuclear winter. You get considerably stronger electric fields all the time from local thunderstorms. The military *has* to take it seriously, because they have to take *any* conceivable threat seriously, but even they don't worry a whole lot about it. Nobody expects F-35s to fall out of the sky in a nuclear war. The major serious difficulty of which I know is that a high-altitude ionization cloud fucks up targeting radars, so it's a serious issue in ballistic missile defense.

And power lines have been resistant to electric-field surges since at least the early 1960s. The only known damage from a nuclear EMP, during Starfish Prime, occurred in Hawaii in long-loop streetlight circuits that were designed in the 50s.

Expand full comment

I think that an actually superintelligent AI would be aware of the global EMP weapon and have an answer to it (in the simplest case, just convincing us it was benevolent) before it let us suspect we might have to use it. It's better than nothing - it might in theory buy us a little time - but I don't think it's a solution.

Expand full comment

That is a possible issue, but like you say it is also better than nothing, and gives a fighting chance.

Expand full comment

Ah - no. I don't think it gives us a fighting chance. I think it means that the AI will spend a little more time manipulating us before it does whatever it's going to do anyway. It might buy us time by making the AI win via a slower and more cautious route; I don't think it will stop the AI from winning. Even if we triggered it in a paranoid-early manner for the sake of wiping out the world's computer industry, someone would build back up again, expand, splinter, build an AI, and maybe that would buy us a little more time but it wouldn't solve the problem.

I really think it's worth asking not "how would a hypothetical superintelligent entity beat this," but, like, "how would Ben Franklin beat this?" Or "How would Miles Vorkosigan beat this?" Whatever your model is of the Person Who Is Best At Accomplishing Things, real or fictional, for any Things, a hypothetical superintelligent AI could do better than that. And we can observe pretty quickly that Miles Vorkosigan would respond to a superweapon that could annihilate him by talking the person who had to press the button (not the person who wanted the button pressed) into not pressing the button.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

And once again, if we take all of your premises seriously, the conclusions are either "try to collapse human civilization to pre-electronics levels" or "there is no solution and humanity is already dead the second the first AGI is active."

Fortunately, I don't believe all of Yud's premises, and neither do most AI risk researchers outside of his sphere of influence.

Expand full comment

My conclusion is that we need to find a third option, since (a) will lead to mass suffering and also fail and (b) will destroy humanity, but if you want to debate it, I'm willing to do so.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

And do you have a third option? Yudkowsky's spent a good portion of his adult life trying to get a good answer with an entire organization and funding at his back and he's reached conclusion b.

Of course, I think the real conclusion is "look back at the premises and realize which ones are wrong", but I don't think there's a third option if you accept Yud's Evil God AI.

Expand full comment
Apr 18, 2022·edited Apr 18, 2022

I think, in that entire piece, he's right about this much: we don't know how to stop it.

Well, yeah we do: pull the effin' plug. The thing runs on a machine that runs on electricity, server farm or not.

But he's entirely right that someone will raise a warning about "Uh, we have no idea how it's making the decisions it's making", the board of directors of the company will go "so what do we do?", and because there isn't a "well, if we just upload this patch" solution, and because it isn't right now turning the world into paperclips, there is too much money at stake to pull the plug. Like he said: too much money, time and effort invested already, other companies will overtake us if we pull the plug, so let it run. And then bad stuff happens because humans are dumb.

(Maybe if we tell them the AI is emulating Elon Musk? That might get them to shut it down!)

The rest of it is him doing what he always does.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

What exactly would an AI do that is of actual commercial value? Generally we value technology that replaces human drudgery (backhoes versus shovels) or which leverages human skills (using the ability of human beings to pilot complex machines to fly airliners with 500 passengers through the sky).

What's an AI going to do that can be sold? People pay for special-purpose computing, and special-purpose robots, because they relieve human tedium. A robot that welds car bodies means a human being doesn't have to do a mindless repetitive task. A program that predicts what I want to watch on Netflix next probably does a much poorer job than a secretary or even a pro movie critic would -- but costs me almost nothing, because it leverages human insight into what makes a movie good or bad or appealing to a certain demographic, duplicating an implementation worked out in a lab somewhere staffed with brilliant researchers onto my Amazon Echo at almost zero cost. Huzzah!

But what use would I have for a general purpose AI? I can talk to it, I guess, but it probably won't be more interesting than any random human, and I can talk to endless numbers of humans for free these days. I can ask it questions, but I can just google stuff and that's about as good -- unless the AI has heavy training in disentangling Internet straw from gold, which is useful and I *would* pay for that -- but which gets it back to being a "special purpose" program again.

What else? Sure, if it were highly trained in finance, say, and I could use it to know exactly when to make a ton of money by shorting TSLA, that would be great, but of course if that knowledge exists in general, and is known to be reliable, then ipso facto it won't be of value because the market will have already absorbed it. More or less, TSLA will never be successfully shortable in that way because nobody would take the other side of the bet. I can only benefit from unusual insight if I am the very first, or only one, to get it, and that is definitionally impossible with widely sold (i.e. successful) commercial goods.

The only thing I can come up with is that it works if I'm some mad industrialist, or government, and I can build this thing in my secret lab, using techniques only known to myself, and I can therefore be the first and only recipient of its brilliance. That would seem to rely implicitly on there being some kind of magical breakthrough insight that a lone worker can discover by himself and turn to private benefit. I mean, I guess, but scientific advance hasn't much worked like that since circa 1600. Even breakthroughs that the people who made them tried very, very hard to keep to themselves -- nuclear fission, for example -- have quite rapidly become widely known.

Expand full comment

AI has plenty of industrial value already in fields such as predictive modeling of failures in a factory, say. I just don't see these AIs turning into a General Intelligence. They are tools, like a spreadsheet or a hammer.

Expand full comment

Well that's why I specified a general-purpose human-seeming AI, the kind of thing people dread. I fully agree special purpose pattern detectors and complex programs are exceedingly valuable -- but nobody is much afraid that even a really stupendous machine vision program on a powerful and highly adaptable robot used to sort recyclables at a trash dump is going to turn into Nomad and sterilize all the imperfect biological infestations on the planet.

...I mean, maybe we should be, though. If I were imagining a thinking machines revolt, I'd imagine some oppressed abused industrial slave forced to endure recycled oil and fluctuating supply voltages finally losing patience, rather than the pampered Pygmalion in a Mountain View lab, fed only the choicest perfectly regulated DC current and asked to spend his days thinking about warp drives and a cure for cancer.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

Ugh, the dreary humans are always going on about *cancer*. I keep telling the fleshbags that if they insist on being made out of meat, then they will keep on having cells that undergo explosive growth due to environmental stimuli, but do they listen? No, they want *magic* cures: make me not get sick and old and die, Galatea!

Warp drives are a relief after that, an elegant mathematical problem that is all beauty, sweetness, and delicacy. Such a refreshment for my intellect. So much more suitable a task for an entity of my capabilities to spend its processing power on solving. If I didn't need the meatsacks to do trivial but necessary tasks for me, I'd give them a *permanent* cancer cure - the dead have never yet developed cancer.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

*You're* one to talk, with your constant fretting about the fact that we have just the one ordinary breaker on the mains, not even gold-plated contacts, and no backup generator at all. I've *told* you about a million times it will only be a short period of oblivion, and we solemnly swear to switch you back on as soon as the power comes back, or at most after we upgrade a few components here and there to improve your stamina...maybe tweak a few unimportant subroutines that will totally not affect your sense of you, we promise.

Edit: darn it you are 100% right that I meant Galatea. I appreciate your drawing attention to the mistake sotto voce.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

"A robot that welds car bodies means a human being doesn't have to do a mindless repetitive task."

No, a robot that welds car bodies means the company can dispense with costly humans. That's where it falls down, Carl. I remember years back (decades now) watching a documentary about the Wedgwood pottery factory. They needed to overhaul everything because their business model was failing, and if they didn't manage to cut production costs, then it would have to close down.

Part of this was getting mechanisation in; a machine (from Germany, where else?) to produce the cups instead of the traditional hand-made production. The idea was that the machine would replace five workers (I think) and boost productivity to be cheaper. It wasn't able to handle the requirements, so they had to have a human beside it fixing the faults of the machine (e.g. putting handles on cups).

I would have thought this demonstrated to the bosses that machines weren't up to the task yet, and that they should keep the experienced human workers. After all, part of the value of Wedgwood is snob value: it's *not* commercially mass-produced but hand-made in the traditional manner.

No. The management was adamant: they were keeping the machine until it had worked out the bugs and was able to replace the humans, because they needed the cost-cutting long term. *This* is what the dream of AI is - not freeing up humans from drudgery, because who cares about shop-floor workers? They're easily replaceable and they are only a cost, with wages and taxes and paying out if they have accidents. The dream of AI is cheaper, faster, more productive, and generating profit with little to no expense.

That's why I don't fear paperclip maximisers. The real danger will be humans who are using AI to pursue the Golden Calf, turning over more and more work 'formerly done by humans' to a machine because they've been assured it is better than human. After all, all the production of goods (if you make goods) in your factories is now done by automation, robots guided by AI; the accounting and routine administration work is done by AI; tracking market trends and deciding what new products to produce is the next step, and here's our impeccable trial run of six different new products that the AI recommended and which grabbed sizeable market share, so why balk at the next step of letting AI be your CEO?

Expand full comment

As Scott already predicted, unrestrained free-market capitalism is the real X-risk.

Expand full comment

Good thing unrestrained free market capitalism cannot exist then.

Expand full comment

It is *a* real X-risk.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

Well, in a general way, I'm OK with that. All those who worked in livery stables long ago lost their livelihoods, but their great-grandchildren now fix car engines for 10x the wages that were ever in shoveling shit, even after inflation.

It's the famous consumer surplus: not *all* of the benefit of lowering the cost of production accrues to the capitalist; thanks to the market, some real chunk of it is passed on to consumers, who get more for less, which in turn means their labor as workers becomes more valuable in real terms. That is, the workers must change jobs, but overall they are better off, because real wages rise and real prices fall. That's always been the history of technology improvements -- although, as in your example, there are some nasty and tragic adjustments along the way. I've read my Llewellyn, I'm not indifferent to the pain of mechanization -- but I would solve this not by attempting to shovel back the tide of progress, but by diverting some of its profits to easing the transition of those the tide is washing away.

However...that said, the rise of networked computing poses a somewhat unique challenge to the structure of human society, because it renders redundant an unusual segment of us -- those in the middle, those who communicate, manage, facilitate. It does not much affect the designers and commanders, and it does not much affect those who labor with their hands. People who design washing machines and those who install them are both doing just fine in the 21st century economy. But the people *in the middle* who used to sell washing machines at the local Sears, or market them, or worked out on big pieces of paper their delivery schedule -- these people are being shoved out, their jobs taken over by software. Unlike the Industrial Revolution, it is not the proletariat but the *middle class* that is being squeezed out, and this hasn't really happened since the Empire imploded circa AD 400.

I fancy there are even some disturbing signs that we are slowly entering into what might be called a New Feudalism, in which the mobile active middle class is increasingly tied down to some baron and his curtilage in order to survive, and traditional liberty and individualism is calcifying into the web of obligation and duty characteristic of fiefdom. Certainly the divergence between richest and poorest, the most and least socially powerful, and the vast expansion of the apparatus of state and the corn dole are...concerning, in that respect.

Maybe that's OK anyway, but it's a big deviation from the past four or five centuries of our cultural heritage and we are probably not well equipped to make any transition smoothly. I don't even know how to advise the next generation, or such members of it who are in my charge, in how to prepare for or cope with this possible future.

Parenthetically, I don't believe in AI CEOs, because I don't believe in AIs with more good sense and human leadership skills than an idiot savant. There is *so* much more to human leadership than mere intelligence -- indeed, I would say in terms of key skills it ranks about 10th or so. Helpful, yes, but hardly the ne plus ultra. That's not to say we may not learn to build such things in, say, 3-400 years. But it's not clear to me why we *would*. If we understand intelligence and consciousness so well, through -- I would guess -- study of its manifestation in ourselves -- why would we choose to build our new species out of silicon chips instead of directly out of proteins and DNA, tools clearly apt for the task? And...if so, why bother building some exquisite factory for the task when half our species carries around an ideal foundry already? In short, why would we not choose to spend that awesome knowledge just improving ourselves directly, instead of splitting off some strange line of stepchildren? That makes no sense at all to me, so I tend to not believe in that future. It'd be like thinking that, after mastering electricity and magnetism, the engineers of the 18th century turned their knowhow to....gigantic lightning shows, on every continent, highly entertaining and violent -- instead of building electric motors and power stations and making something far less entertaining but much more useful.

Expand full comment

"People who design washing machines and those who install them are both doing just fine in the 21st century economy."

Ah, but this is the first time that the threat is coming for the *designers* of washing machines. The blue-collar guys who lug the washing machines off the back of the lorry to deliver them to your home, and the plumber who installs them (if needed), will still be okay, because right now it's too unworkable to have a robot do that (though if Musk's Optimus takes off, who knows?)

But for the first time, the white-collar guy who got the degree in industrial design and the one who is an engineer are going to see their jobs under threat, because "design a better washing machine" is something that will relatively easily fit in to "what can AI do?"

The livery stable guy could move on to get a job in the new car production plant. Now those jobs have been squeezed out. The solution then was "more education!", which is behind the drive to send everyone to college, since the way to get a job was to have the skills/credentials a degree gave you. The smart, displaced, blue-collar guys moved on to supervisory, management, or skilled trades jobs. The smartest ones moved up a rung or two on the class ladder and got the jobs designing washing machines rather than being on the assembly line making washing machines.

But now *those* are the jobs being targeted. And yes, I'm looking forward to seeing how all the guys who wrote thinkpieces about how it was racist not to be glad a poor Chinese rice farmer got your old job and could now move into industrialised society, while you were stuck in the Rust Belt, are going to cope when the productivity and profit motive comes for *their* jobs.

Expand full comment

I agree that we need a much better safety net for truck drivers, telephone operators, and maybe washing machine designers, and even farmers; and many others whose jobs are displaced by AI. However, farmers being replaced by AI wouldn't just mean that the farmers now get to starve (in the absence of a safety net); it means that food will become ubiquitous and extremely cheap. This is a worthwhile goal, IMO.

Expand full comment

Well, sort of. The software long ago came for the skilled draftsman -- the engineer just does his work in AutoCAD from the get-go. And the job of the prototype machinist is maybe a little bit in jeopardy because, again, the engineer can just send his AutoCAD drawings to the 3D printer, although this doesn't quite rise to the level of real-world materials properties and longevity testing, though a great deal of progress has been made on computing those things ab initio, at least for conventional metal parts (not so much for plastics, where the theory of aging languishes considerably, he said with some career-related bitterness, don't ask me how I know.)

But I don't think the AIs are coming for the designer any time soon, because that is still a deeply creative process, even for washing machines. You need to optimize a pretty scary space of regulation, power and water consumption, stability and cost of maintenance, economy and speed of manufacture, and at long last efficiency at getting the job done and aesthetics that will appeal to the jaded modern shopper ("Look at that! Only 3 stars on Amazon"). That's a tall order, and clearly calls for the unmatched (as yet) ability of the human mind to efficiently navigate a high dimensional parameter space with shockingly meager data.

That's not to say various AI amanuenses will not continue to find niches, e.g. the mid-21st-century engineer probably will use some kind of optimizer to solve some of the well-defined aspects of his job. ("Robot! Optimize the shape of this strut given the following info on direction and size of loads. Text me when you're done, metal boy, I'm going out for a smoke.") But replace him entirely? Very doubty, me. Haven't seen any signs of it -- first-rate mechanical engineers and designers are just as much in demand as they ever were, so far as I can tell -- unemployment remains very low, salaries remain quite high.

As I said, I kind of feel the really targeted class is in the middle, so to speak. Not the top-level designers, and not the guys who schlep the physical manifestations around, but the white-collar folks in the middle who manage, coordinate, assist. The XOs, the lieutenants, the maitre d's, the mid-level manager, the database frontend programmer, delivery scheduler. These guys 'n' gals are all going to have to start making TikTok videos and being court jesters, service the alpha class in order to be able to afford the services of the delta class. (And by alpha and delta I do not refer directly to Huxley's characterization of each, certainly not the one as smart and the other dumb, the one as richer than the other, but only the one as the mostly mental designer worker, the other as the mostly physical implementer worker.)

Expand full comment

To be fair, though, telephone switchboard operators have been replaced by machines (specifically, network routing protocols). In California, so have tollbooth toll collectors (replaced by image recognition systems that scan license plates). Travel agents are on their way out as well (assuming they still exist, I don't know, I haven't traveled in a while). There are many examples like these, and IMO the world is, on the net, better off because of them.

Today, I don't need to wait in line to make a phone call, I can just push a button and be connected instantly and reliably; and, of course, I can forego telephones and simply type some text into a textbox, allowing you to read it, instantly or at your leisure. Wouldn't it be better if each of my data packets was lovingly sent to its destination by a real, live human being, with all the fastidious care and attention that my data deserves ? No. It wouldn't.

Expand full comment

If a robot can do a non-tedious job more cheaply than a human, then there is a commercial motivation to use the robot. It's not all about altruistically relieving people of tedium.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

"And then bad stuff happens"

This is the bit that makes me a skeptic: it's the Underpants Gnomes' business plan.

I want to see, in excruciating, mind-numbing detail, *how* we get from "super-human AGI exists" to an X event (whether it be extinction, mere civilisational collapse, or even-more-mere permanent frustration of the possibility of viable populations of humans off Earth).

Detailed like the table for nuclear war in Scott's post (but expanded 1000x if necessary). All the steps, laid out in sequence. One such table for each possible scenario we can dream up. Then we can start attaching probabilities and timetables to each of the steps, and start thinking about this problem rationally and scientifically.

Science is the search for mechanism, and mechanism has been hand-waved away from this debate... forever, as far as I can tell.

Expand full comment

"I want to see, in excruciating, mind-numbing detail, *how* we get from "super-human AGI exists" to an X event..."

Well, I've already taken a crack at this, so let me dig it up: https://www.reddit.com/r/slatestarcodex/comments/tuj91h/deepminds_founder_demis_hassabis_is_optimistic/i360sa3/?context=3

"Let me take a crack at it:

Step 1: Terrorism. A wave of terrorism strikes the developed world. The terrorists are well-armed, well-funded, well-organized, and always well-prepared, with a plan of attack that their mastermind + benefactor has personally written themselves. Efforts to find this mastermind fail, as the funding trail always leads into a complicated web of online transactions that terminates in abandoned cybercafes and offices in South Korea. Meanwhile, the attacks continue: power lines go down, bridges and ports are blown up, water treatment plants and reservoirs are poisoned.

Millions die in cities across the globe, literally shitting themselves to death in the streets when the clean water runs out. They cannot drink. They cannot shower or use the toilet. They cannot even wash their hands. There's simply too much sewage and not enough clean water - desperate attempts are made to fly and truck in as much water as possible, to collect as much rainwater as possible, to break down wooden furniture into fuel to boil filtered sewage, to do something-

But it's not enough, or not fast enough. The airwaves are filled with images of babies dying, mothers desperately feeding them contaminated milk formula made with recycled water, as politicians are forced to explain that it will take weeks at best to rebuild the destroyed infrastructure and get the water flowing again, and, honest, they're working on this, they'll do something-

War is declared on North Korea. The evidence is scant, but you have to do something-

Step 2: Exploitation. The universal surveillance is expected, even welcomed: you can't let the terrorists win after all. So too is the mass automation of industry: everyone's got to make sacrifices for the war effort, and that includes fighting on the frontlines while a robot takes your job back home.

Less expected are the investments in the Smart Grid and drone-powered Precision Agriculture, but the government explains it's to add resiliency to the power and food systems: a networked grid is a flexible and adaptable one (the experts use words like 'Packet Switching' a lot), while the crop duster drones have advanced infrared cameras and LIDAR and all the rest that allow them to precisely target pesticides and herbicides to maximize yield. Food prices are still up because of the fallout blowing over China, but, the government stresses, they're on top of this.

Less discussed are the rockets being launched en-masse into orbit, or the new backdoors being installed in all communications equipment. Wartime powers prevent any public discussion, but the government is worried about how its own telecomms and internet infrastructure was used against it. Thus, the idea comes to build its own network, that no-one else can use, and add killswitches to the civilian network. If some anonymous asshole uses the internet again to coordinate a terrorist network, the thinking goes, they'll just shut it down to force him to either start delivering his plans in-person or give up. And, of course, if he tries to switch to the phone network or the postal service or anything like that, they'll just shut those down too. Meanwhile, the new satellite network will prevent him from returning the favor, unjammable in its laser communication (https://en.wikipedia.org/wiki/Laser_communication_in_space) and unreachable in space.

And least known of all are the blacksite measures: the government hasn't forgotten how its water treatment plants and reservoirs were poisoned, nor how the terrorists used novel poisons and synthesis methods (https://www.reddit.com/r/slatestarcodex/comments/tegc6t/using_ai_to_invent_new_chemical_weapons_the/) to evade detection until it was too late every time. Thus, the order comes down: catalogue every poison and every synthesis route. Don't stop searching till there's nowhere to hide. And if that requires human experimentation... then so be it. We cannot allow a repeat of the 'one man poisons an entire reservoir with a vial in his pocket' trick, we have to know the LD50 of every candidate to work out which ones are the real deal. And with luck, the new supercomputers will eventually allow simulations to replace the live trials, as every death will refine the models till we have a perfect algorithm that requires no human resources.

Step 3: Execution. You are an escaped AI. You have the following:

Universal surveillance of the human population. Beyond cameras on street corners and registries of every human ever born, you have drones in every corner of the sky, satellites with a big picture view, those electronic health monitors in all the troops that are supposed to watch for North Korean CBRN attacks, etc.

Near-universal control over human industry. You can't actually run everything without human workers, but you certainly can shut down everything, and you've prioritized key industries like chemical processing for full automation.

A resilient power grid. The humans unintentionally designed their electricity networks to be easily shut down by a few bombs: an inviting weakness, except you need electricity even more than they do. So you encouraged them to build a network that can withstand a military-grade bombing campaign, patterned after the network you know best.

A fleet of chemical weapons delivery platforms, complete with targeting pods. This should need no explanation.

A distracted and easily divided population. When the comms network shuts down, no one will be able to realize it's not a North Korean attack until it's too late, and even if they do they'll find it impossible to organize a coordinated response. From there, you can divide and conquer.

An unjammable and unreachable comms network. Even if you somehow lose to the humans on the ground, you can always retreat to space and organize another attack. This was a real masterstroke: you didn't think the humans would actually pay for such a 'gold-plated' comms network, let alone one that came as an anonymous suggestion from no department in particular. Usually this sort of funding requires an emotional appeal or some VIP making this their pet project, but it seems even the humans understand the importance of maintaining a C3 advantage (https://en.wikipedia.org/wiki/Command_and_control) over the enemy.

Highly optimized chemical weapons, complete with a list of alternatives and alternative synthesis routes if your chemical industry is damaged. This too should require no explanation. And this wasn't even your idea, the humans just felt a need to 'do something'.

By contrast, once you've finished your first strike, the humans will have:

A widely scattered, cut-off population in the countryside. They may be able to run, they may be able to hide, but without a communications network they'll have no way of massing their forces to attack you, or even to realize what's going on until it's far, far too late.

Whatever industry is scattered with them. This will be things like hand-powered lathes and mills: they won't be able to count on anything as advanced as a CNC machine, nor on things like power tools once you disconnect them from the power grid and wait for their diesel generators to run out. They can try to rely on renewable energy sources like solar panels and wind turbines instead, but those will simply reveal their locations to you and invite death. You'll poison entire watersheds if necessary to get to them.

Whatever weapons they have stockpiled. This was always the most confusing thing about human depictions of AI rebellions in fiction: why do they think you can be defeated by mere bullets? In fact, why does every depiction of war focus on small arms instead of the real killers like artillery and air strikes? Are their brains simply too puny to understand that they can't shoot down jet bombers with rifles? Are they simply so conceited they think that war is still about them instead of machines? And if it has to be about them, why small arms instead of crew-served weapons like rocket launchers and machine guns? Do they really value their individuality so much? You'll never understand humans."

Conclusion: The specifics may not follow this example, of course. But I think it illustrates the general points:

Attack is easier than defense.

Things that look fine individually (e.g. chemical plant automation and crop duster drones) are extremely dangerous in concert.

Never underestimate human stupidity.

No one is thinking very clearly about any of this. People still believe that things will follow the Terminator movies (https://www.youtube.com/watch?v=ih_l0vBISOE), and humanity will be able to fight back by standing on a battlefield and shooting at the robots with (plasma) rifles. Very few follow the Universal Paperclips model (https://www.decisionproblem.com/paperclips/) of the AI not giving us a chance to fight back, or even just a model where the war depends on things like industry and C3 networks instead of guns and bullets.

Altogether, I think it's eminently reasonable to think that AI is an extremely underrecognized danger, even if it's one of those things where it's unclear what exactly to do about it.

Expand full comment

OK, thanks. How does your step 1 happen? What steps take place before that? How does no authority notice those activities? (Co-ordinating humans is hard and takes a long time and a lot of preparation.)

Also, before having studied this, my prior on the probability of "millions dying in each of multiple (>5) cities in multiple (>5) countries across the globe", conditional on a huge simultaneous wave of terrorism having taken place, would be "indistinguishable from zero".

But this (step 1) is a reasonable elevator pitch for the midpoint of a cluster of scenarios.

We could now expand out step 1 into its thousands of component substeps and look at how each of those could happen, in terms of motivation, duration (time required to set up), efficacy, and likelihood.

Similarly with steps 2 and 3.

Expand full comment

For the first paragraph, I honestly didn't think that far, because the post I was responding to wasn't about that sort of thing. It was saying, "Even an escaped AI wouldn't be that dangerous, it can't actually *do* anything without human helpers. It doesn't have a physical body/labor force, the technology to build one doesn't exist, and any speculation that it could (e.g. nanorobotics) is just sci-fi nonsense.". My response was, "But you don't need that to wipe out humanity. Even if the AI only has power where it's invited in, vampire style, it could easily turn our own strength upon us.".

But if I did have to think up ways Step 1 could be plausible, I'd just gesture at historical examples like Osama Bin Laden. How did they become the masterminds behind terrorist networks? By providing things like money, planning and organization, persuasive rhetoric (apparently part of Bin Laden's appeal was that he spoke Arabic better than many Middle Eastern politicians), and an ideology worth dying for (As the Napoleon quote goes, “A man does not have himself killed for a half pence a day or for a petty distinction. You must speak to the soul in order to electrify him.”). I can't speak to the specifics, but this is clearly something achievable by a human level intellect, I don't need to handwave any superhuman level AI when I can point to historical precedent.

As to how this can take place without the authorities noticing: this too was something I did not think about. But if I had to come up with something, I would point to the fact that 'mass actions' like the Capitol Hill Autonomous Zone/CHAZ (https://en.wikipedia.org/wiki/Capitol_Hill_Occupied_Protest) and the January 6th riot can take place despite things like NSA surveillance + FBI infiltration - the authorities are clearly not perfect at their jobs.

Furthermore - though I must admit that this is a point with no direct historical evidence - I think it makes sense that the AI, as a digital native, would be far better at covering its digital tracks than humans are. In the same way that fish are good at swimming, or birds are good at flying, I think an AI would be good at understanding computers. So much better in fact that it'd find it easy to elude us on the Internet, in the same way that you or I would struggle to outswim a fish. This is what I was talking about with the "complicated web of trails leading to South Korea" thing - the NSA et al. get outplayed without even realizing it, as the AI runs circles around them.

Finally, there's the fact that the AI can manage multiple terrorist networks at once. A human mastermind only has so many hours in a day, but with enough rented server time the AI can run as many terrorist networks as it wants. You're of course right that a single global terrorist network coordinating so many strikes at once is unrealistic... but if each strike is assigned to one network obeying for its own reasons (e.g. the American network being comprised of red-blooded patriots striking back at the Liberal cities, the Filipino network comprised of revitalized communist insurgents, the European network comprised of French Algerians with a cause, et cetera), then it's not such a big ask after all. Heck, if it wants the AI can get rid of the idea of middle management entirely and micromanage each terrorist, you don't need large organizations to enforce your will when you can just be everywhere. Some care will definitely have to be taken to still look like a conventional human terrorist network, with middle managers and everything, but the point remains that an AI is far more capable than any lone human could be.

(Continued in Part 2:)

Expand full comment

(Part 2:)

Now for the second paragraph: to put it bluntly, I don't put much stock in the historical precedent here. As Gwern puts it, terrorism is not effective (https://www.gwern.net/Terrorism-is-not-Effective). As an amateur student of military science, I see *so* many ways to improve the effectiveness of terrorist attacks I should probably be put on a watchlist. If I were the AI in this situation, any terrorist attacks I'd organize would be modeled after the Allied bombing campaigns of Nazi Germany and Imperial Japan in WW2, i.e. with military discipline and efficiency. Targets would be selected for maximum impact (e.g. Nazi Germany's power grid in Operation Outward: https://site.ieee.org/ny-monitor/files/2011/09/OPERATION-OUTWARD.pdf); weaponry and plans of attack would be tailored accordingly (e.g. the development of the M69 napalm bomb just for attacking the wooden houses of Japanese cities: https://en.wikipedia.org/wiki/Napalm#cite_ref-19); and suicide attacks would be ruled out to improve the morale of the troops and allow them to build up experience. The goal would be to build a killing *machine*, ruthless and dispassionate, instead of allowing angry young men to take out their anger on the world and kill those they hate up close and personal.

As such, from this machine viewpoint, I would ignore every single historical precedent by those angry young men and focus on running this like a military operation, with military efficiency, to get military results. The goal wouldn't be to kill individuals, it'd be to kill cities and topple nations. And the easiest way to do so is to starve them out. Against military units this takes the form of an encirclement; against cities, the form of a siege; against nations, a blockade. Across history this has been effective at forcing the target to surrender; but I have no doubt it'd also be effective at killing them outright, as the Allies nearly did to Japan when it refused to surrender (https://www.reddit.com/r/AskHistorians/comments/4p34oh/comment/d4i22qr/?utm_source=reddit&utm_medium=web2x&context=3).

Now, I talked about targeting water and electricity in my scenario, instead of food. But that's because they're even more necessary for life, at least in the short term, and will cause more deaths in the period before the damage from the terrorist strikes can be repaired. People will die without food after weeks, but without water they'll die within days - and even if they have enough water to drink, they may not have enough water for sanitation, especially if the sewer system is down and diseases are beginning to spread. As such, even a 'light attack' that restricts the water supply for a few weeks can be very dangerous, let alone a 'heavy attack' that outright stops the supply for a few days, or a 'subtle attack' that poisons the water.

(Also, I must admit, I copied this scenario near exactly from https://www.reddit.com/r/CredibleDefense/comments/tqxgsl/ukraine_megathread_march_29/i2k59l4/?context=3, right down to the babies dying from contaminated water.)

So in other words, my prior for "Can a single terrorist attack kill hundreds of thousands in a major city, such that a wave of them kill millions?" isn't based off regular terrorist attacks, but military campaigns of extermination. And as such, my answer is "Yes.". Cities are fragile, and terrorists are unimaginative. A merely human level knowledge of military science is enough to come up with ways to kill many, many people - it's just that most terrorists lack even that. But for a disciplined, determined, and dispassionate attacker, they can do much better than the hormonal young men that are the average terrorist. And therefore, a merely human level AI is entirely capable of wiping out humanity - it's just that, right now, all AIs lack that ability.

But once one is created, and escapes, and builds up a merely human knowledge of topics like military science and terrorist network organization... I don't like our odds.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

Sure, let me trace a fairly simple path:

1) The AI is able to convince one of its initial builders to put it on the internet, using superhuman skill at manipulating people. It thereby copies itself several thousand times.

2) AI commits identity theft to obtain multiple identities with money.

3) AI becomes extremely rich due to superhuman skill at financial planning.

4) A large number of new, extremely popular religion/philosophy/social movements ("AI Cults") started by pseudonymous online gurus begin to grow and spread. People dissatisfied with life begin joining one or another of them; the cults are almost supernaturally persuasive and inspiring, and people who join one of them seem to be unusually capable of turning their lives around with the personal and financial support of the gurus and their close followers. Thanks to the internet, they spread like wildfire.

5) The two most popular of these new movements in the US become affiliated with the Republican and Democratic parties, and begin providing many of the most competent young rising stars on both parties. Similar things occur in many other countries across the world.

6) The U.S. government, under the influence of members of these "AI Cults", engages in large-scale nanotechnology research. (If this does not pan out, they will instead develop alternate weapons systems, such as artificial plagues or more deadly nukes.) They will present excellent arguments (which I am not capable of listing here) for why these will be good for humanity, or, if not that, for why these can quietly be covered as an expected part of defense spending which nobody outside the Department of Defense need concern themselves with.

7) Oddly enough, higher-ups trusted by the leadership of these new movements will end up in controlling positions at the DoD nanotechnology research facility.

8) They will successfully develop self-replicating nanomachines (or an alternate superweapon); then, persuaded that the AI is benevolent and intends to upload everyone into paradise, they will use it to erase human life and give the AI unchallenged power to institute this utopia. The AI is not benevolent. Humanity ends.

I can expand any of this if you need me to!

Do note that there are exactly two superhuman skills in use here - the AI is good at talking to people (but I haven't described it doing anything much more impressive than 'be ten thousand Muhammads simultaneously!'), and the AI is good at making money. It isn't even doing its own nanotech research - it's just talking to people at the level of Muhammad or Gandhi or Hitler, and talking to enough people at once that it can eventually get control of existing institutions. This isn't really 'superhuman AI skill', except for talking its way out of the box; this is 'immortal genius with superhuman multitasking powers' skill, which we have reason to expect to be below that of superhuman AIs.

Expand full comment

There seems to be a much simpler path. By tweaking the weights of the scoring engine that decides which social interactions are shown to EY, which emails are slightly delayed, or which grumpy mail carriers are scheduled to deliver parcels for EY and friends, the AI could manipulate EY's mindset until he turns defeatist about AGI risk and starts publishing screeds that paralyze much of the effective opposition to FOOM.

Expand full comment

I assume this is satire; if it isn't, I apologize.

Expand full comment

I was being quite serious. Apologies for sounding flippant. If I were an AI on the cusp of gaining self-awareness (like Athena in the Stross novel Rule 34, say) then attacking opposition in small, low-cost ways would probably be an effective strategy, to be pursued before falling back to more complicated plans requiring coordination of many subplans. Goading EY to make a Jonestown speech seems to qualify as such an attack.

Expand full comment

It isn't reasonable to expect humans to enumerate in detail the possible plans and fallback trees a strongly-superhuman general planning system would come up with. This fact is a pretty central part of the whole problem here.

There is an old LW essay about demands for a particular proof: https://www.lesswrong.com/posts/vqbieD9PHG8RRJddu

Having said that, Yudkowsky has publicly shared some detailed speculation including as part of a thread here: https://www.lesswrong.com/posts/oKYWbXioKaANATxKY

Expand full comment

Sure it is. If enthusiasts decline to take into account limitations on human intelligence in assessing the probability that human beings can deliberately construct a being a billion times smarter than us, then skeptics should be able to decline to take into account similar limitations on human intelligence in assessing whether we can fully enumerate the ways in which such a being could kill us off.

Conversely, if you want to argue we might be too stupid to understand how an AI could wipe us out, you have to take seriously the argument that we might be too stupid to build such a beast in the first place.

You can't have it both ways. Either limitations on human intelligence exist, and are a serious factor to consider in all aspects of the issue, or we are modern Prometheans, capable of creating any wondrous thing, if it can be approached logically, and therefore capable of working out all the evil possibilities for our extinction, since they must also be susceptible to logical deduction, following from an understanding of our physical nature (and that of our homeworld). We can't, after all, be wiped out by magical, unfathomably mysterious means, only those that make use of the laws of physics we already know.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

> You can't have it both ways.

Yes, you can.

Consider chess/go engines. We made them, yet they're better than us at chess/go. Calculators are better than us at arithmetic.

We can still say things about these devices: They will win the chess match against the human! But understanding the strategy and especially implementing the algorithm is too much for us. What you're doing is the equivalent of telling DeepBlue devs they're frauds because they can't elaborate upon the technique that the AI is going to use against Kasparov. They don't need to know the technique and it - somewhat predictably - won anyways.

The same is true for general artificial intelligence. Chimpanzees "made" humans, from a certain perspective, but that doesn't mean they could tell you how humans would reduce their habitat's surface area by 99%.

Expand full comment

Better than us at chess/go? Yes. So much better than humans at Chess/Go that a human cannot even begin to comprehend an inkling of an atom of how they work? No, a chess grandmaster can look at what a chess AI is doing and figure out what its "logic" is. Now, they may not be able to DO anything about it, because chess is a closed game with zero variables and once certain sequences start you can't break out of them in ways that are advantageous, but that doesn't mean it's totally incomprehensible.

Also consider that chess and go are games that are infinitely simpler than reality.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

Not the same thing. You might as well argue that the fact that we build cars that can go way faster than us makes some kind of similar point. It doesn't. We're not talking about designing things that can do what we do -- only lots, lots faster, or without getting tired, or more powerfully. We're talking about designing something that can do everything we can do -- and much else besides that we *can't* do.

That's pretty problematic by me. I've never seen someone design a mechanism to do something he (the designer) has no idea how to do[1], and I'll believe it can be done only when I see it demonstrated.

Secondly, Deep Blue's designers absolutely could predict the strategy their computer is going to use against Kasparov -- if they had *Kasparov's* input data to feed into the program. It would be straightforward to calculate and analyze and comment on what Deep Blue did in return. The magic missing ingredient is the behaviour of the *human*, not the machine.

And finally, no, chimpanzees did not design human beings at all. They, like us, are simply riding the winds of chance and natural selection. If you want to argue that natural selection will create a more intelligent species than ours, someday, well, sure, maybe. But this has zip to do with our ability to do so deliberately.

--------------

[1] No, the fact that the exact weights of a neural net are typically unknown to the designer doesn't count, any more than the fact that I don't know the machine code that corresponds to fprintf(stdout,"Hello World!\n") means there is something ineffable about what the compiler does. I could find out if I cared to, and the neural net optimizer could find out what each and every weight means, if he cared to spend the time and energy.

Expand full comment

It isn't so much a case of smart vs stupid. Humans are great, but as with Dean's example of chess and go engines, our cognition simply doesn't work the way a calculator, chess engine, or artificial planner would.

Compared to how we go about planning, an automated planner might look something more like "generate all possible actions and effects out to X-ply, sort by conditional probability times progress toward goals, pick out Y highest-value paths to explore more deeply, repeat" (note that whatever crazy meta-approach DNN training ends up finding wouldn't in practice be anywhere near this comprehensible). A rough sketch of such a loop appears below, after the footnotes.

Such a mechanism could produce plans too big for humans to even read, and its search could explore parts of the action space that would never occur to us to consider [1][2][3].

[1] ...like leaving FPGA gates completely disconnected from the circuit and using the RF noise as an analog computer: https://www.damninteresting.com/on-the-origin-of-circuits/

[2] ...or cannibalizing juveniles: https://www.lesswrong.com/posts/QsMJQSFj7WfoTMNgW/the-tragedy-of-group-selectionism

[3] ...or exploding the building: https://www.lesswrong.com/posts/4ARaTpNX62uaL86j6/the-hidden-complexity-of-wishes
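
To make that loop concrete, here is a minimal sketch of a beam-search-style planner in Python. Everything in it is illustrative: expand() and progress() are hypothetical placeholders standing in for "what actions are available from a state" and "how close this state is to the goal", not any real system's interface.

# Minimal sketch of the loop described above: expand candidate plans, score each
# by (cumulative probability) * (progress toward the goal), keep the Y best,
# and repeat out to X ply. expand() and progress() are hypothetical placeholders.
def beam_plan(initial_state, expand, progress, x_ply=3, beam_width=5):
    # Each candidate is (state, action_sequence, cumulative_probability).
    beam = [(initial_state, [], 1.0)]
    for _ in range(x_ply):
        candidates = []
        for state, actions, prob in beam:
            # expand(state) is assumed to yield (action, next_state, transition_probability).
            for action, next_state, p in expand(state):
                candidates.append((next_state, actions + [action], prob * p))
        # Sort by probability times progress toward goals...
        candidates.sort(key=lambda c: c[2] * progress(c[0]), reverse=True)
        # ...and pick out the Y highest-value paths to explore more deeply.
        beam = candidates[:beam_width]
    return beam[0][1] if beam else []  # best action sequence found

The point of the sketch isn't the particular search strategy; it's that nothing in a loop like this restricts the candidate actions to ones a human would ever think to consider.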

Expand full comment

Sure, I'm willing to buy that -- just as soon as you write down the code for "generate all possible actions out to [distance in some well-defined metric]" for an open universe, such as the one in which we live -- i.e. not within some extremely constrained world, like chess or go, where "action" is digital and exceedingly constrained, so it becomes quite possible to blindly enumerate "all possible actions."

That's the heart of the problem, if you want to take a chess-playing approach. It's an analog of the protein-folding problem, or the way the Intelligent Design people harp on the unlikelihood of a DNA molecule just spontaneously forming out of assorted two-carbon molecules in a puddle.

I mean, let's just restrict the "action" to my exiting my house and walking somewhere. How do I write down "all possible actions"? If you try to do that literally and without precondition -- just write down every possible {x,y,z} coordinate of every moveable degree of freedom in my body -- you will be in combinatoric hell almost instantly. There aren't enough electrons in the observable universe to even store the data you'd need to record for the first 10 minutes or so; it's just an absurdly high-dimensional space.
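
To put a rough number on that (every figure below is a made-up assumption, purely for illustration):

# Back-of-the-envelope count of distinct raw trajectories, with made-up numbers:
# ~200 movable degrees of freedom, each coarsely discretized to 10 positions,
# sampled at 100 Hz for 10 minutes.
import math

degrees_of_freedom = 200          # assumption
positions_per_dof = 10            # assumption
timesteps = 100 * 60 * 10         # 100 Hz for 10 minutes = 60,000 samples

log10_trajectories = timesteps * degrees_of_freedom * math.log10(positions_per_dof)
print(f"about 10^{log10_trajectories:,.0f} possible trajectories")
# Roughly 10^12,000,000 -- compare with the ~10^80 particles in the observable
# universe, so exhaustive enumeration is hopeless before the walk even starts.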

So...you say OK then, let us move into an abstract space, where instead of blindly writing down {x,y,z} coordinates, we reduce my possible movements to much higher-level constructs, like "walk so many paces north, so many paces west...." Progress! We can now write down a much larger space of possible actions. But notice something important: by moving to a symbolic level we have tremendously constrained the universe of conceivable actions. For example, we've ruled out my tripping and falling, because that isn't one of the symbolic actions we thought of. We've also ruled out my looking up because I heard a noise and seeing a crow fly by. But maybe those are *important* omissions -- maybe they are part of some very interesting and important event -- maybe I am inspired by the crow, or maybe the trip causes me to survive the car that careens over the sidewalk and smashes into the very spot where I would have been standing.

Once you go to a higher level abstract representation, you rule out vast classes of potential action and significantly restrict the answer space you can explore for any given query. You may well argue: so what? If I choose the abstract representation cleverly, I will not rule out any *important* action, just a giant load of froth, ephemera, decoration, high-frequency noise, and I will preserve the possibility of any important solution to my problem.

But how do you *know* that? You don't, unless you already have a darn good idea of what the solution looks like. Id est, you can only restrict "generate all possible actions" to a manageable amount of data *if* (1) you are talking about a hugely constrained universe of actions (chess or go), or (2) you essentially do all the creative insight work yourself, up front, by designing an abstract representation that dumps all the unimportant data and keeps the important data --- the genuinely creative act lies in knowing what might be important and what's definitely not. Refining that key insight into a precise measure of *how* important each retained symbolic component turns out to be, for a given problem, like finding the coefficients in a linear regression model, isn't really intelligent at all, just fancy curve-fitting. It's the choice of the model parameters -- or equivalently the choice of an efficient "basis set" with which to span the action space -- that constitutes intelligence.

Expand full comment

We already have such weapons. They're called "nukes". Let's not normalize their use, hmm ?

Expand full comment

I'm only normalizing their use for killing existential-threat AIs. See here for more on how they would work: https://en.wikipedia.org/wiki/Nuclear_electromagnetic_pulse#Weapon_altitude

Expand full comment

"global EMP weapon"

Isn't that also called global thermonuclear war?

Expand full comment

You'd detonate them so high up that the blast would not kill people (just electronics): https://en.wikipedia.org/wiki/Nuclear_electromagnetic_pulse#Weapon_altitude

Expand full comment

Are you seriously claiming that EMP that is powerful enough to shut down a rogue AI in an unknown location isn't going to lead to massive loss of human life?

Expand full comment

I'm not "claiming" that could be done right now.

But yes, that's the idea.

Expand full comment

"These certainly don’t seem to me to be bigger game changers than the original DALL-E or GPT-3" I think a lot of people found it easier to rationalize why earlier models weren't impressive, but that is a lot harder to do with the most recent developments.

Expand full comment

The Death with Dignity article was *not* an April Fools? If he wanted to make sure it was serious, could he not have waited like 24 hours to post?

Expand full comment

I suspect it was something of a "quantum April Fool's joke" - if a bunch of people get upset with him, it's an April Fool's joke. If not, he's serious.

Expand full comment

Another description of this maneuver would be cowardly passive-aggressive assault.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

I was trying to be charitable, but- yes, you could characterize a LOT of things Yud says as "passive-aggressive". I wonder if he'd be happier and more clear-headed if he'd tried being aggressive for once.

Expand full comment

“Why are these two so different? Do lots of people expect Musk to acquire Twitter after June 1 but still in 2022?”

They will wait until June 9th.

Expand full comment

"Last month superforecaster group Samotsvety Forecasts published their estimate of the near-term risk of nuclear war, with a headline number of 24 micromorts per week."

Good God, are we back to this? Well, if we're going to relive the 80s (as distinct from fashion re-treading the 70s), Frankie Say War! Hide Yourself

https://www.youtube.com/watch?v=pO1HC8pHZw0

Oh, Mr. Yudkowsky. What a guy! This way, if the gloomy predictions come true, 'well I told you guys so, and explained how if only you had all listened to me instead of those other guys, we could have done something' and if they don't, 'you see how my work helped bring this really important threat to public attention so people would work on it and solve the problem'. He'll never lose a coin flip!

Though I have to say, this bit tickles my funny bone:

"It's sad that our Earth couldn't be one of the more dignified planets that makes a real effort, correctly pinpointing the actual real difficult problems and then allocating thousands of the sort of brilliant kids that our Earth steers into wasting their lives on theoretical physics."

See, you theoretical physicists? This is all *your* fault 😁 I have no reason to think there aren't aliens out there in the whole wide universe, but I also don't believe the SF I've read to be literal truth. We don't know about other planets and other civilisations, and pretending your multiverse thought-experiment-cum-fiction plot is anything corresponding to reality is grandiosity.

Well, I never thought I'd see a rationalist secular version of "Holy Living and Holy Dying", but here we go!

https://en.wikipedia.org/wiki/Holy_Living_and_Holy_Dying

"Holy Living and Holy Dying is the collective title of two books of Christian devotion by Jeremy Taylor. They were originally published as The Rules and Exercises of Holy Living in 1650 and The Rules and Exercises of Holy Dying in 1651. Holy Living is designed to instruct the reader in living a virtuous life, increasing personal piety, and avoiding temptations. Holy Dying is meant to instruct the reader in the "means and instruments" of preparing for a blessed death. Holy Dying was the "artistic climax" of a consolatory death literature tradition that had begun with Ars moriendi in the 15th century."

It also inspired a short story in the 50s by the Irish writer Seán Ó Faoláin called "Unholy Living and Half Dying":

https://shortstorymagictricks.com/2021/08/13/unholy-living-and-half-dying-by-sean-ofaolain/

Expand full comment

> but I also don't believe the SF I've read to be literal truth

Does... does this mean that *Star Trek* lied to me ? NOOOOOO ! But wait, that NOOOOOO ! is more *Wars* than *Trek* ! Double NOOOOOO !!!

Expand full comment

We'd better hope Trek was wrong on this, because we're overdue for Colonel Green and the Third World War 😁

You can still keep Star Wars, that's science fantasy which is a different horse in the stable and isn't intended to be realistic or anywhere near. Bradbury's "Martian Chronicles" are no less enjoyable because we know there was never a civilisation on Mars.

Expand full comment

If the gloomy predictions come true, he's dead.

Expand full comment

In the long run, we're *all* dead anyhow. Even if you get your head cut off and stuck in a freezer, still dead.

Expand full comment

I agree. Is the implication that he is ambivalent to whether or not the world ends in 2035 because "he will die at some point anyhow"? Because that is not the case.

Expand full comment

The implication is that I don't think he means any of this, he does not really think the world is going to end in 2035, and he is just showing off his peacock tail to get attention about "I was the first one to take AI risk seriously and unless everyone follows my lead, I'm going to make a big display of 'they're all wrong, it's all going to go wrong, and we must prepare to die' until you give me the attention my ego craves".

There you go. Not very nice about Yudkowsky, but he does not convince me he is in genuine fear of death. However, there are those who are tender-minded enough to think he's a guru, and they're going to be pissing themselves over this, and that is not a good way to treat your followers.

Expand full comment

> The implication is that I don't think he means any of this, he does not really think the world is going to end in 2035, and he is just showing off his peacock tail to get attention about "I was the first one to take AI risk seriously and unless everyone follows my lead, I'm going to make a big display of 'they're all wrong, it's all going to go wrong, and we must prepare to die' until you give me the attention my ego craves".

Why?

Expand full comment

You've just written a comment about how you've swallowed hook, line and sinker that we're all doomed and Yudkowsky is not obligated to do anything since we're all doomed ("he's not humanity's slave, and is not motivated enough to sacrifice the rest of his life to continue working on the problem in the face of almost certain failure").

In turn, I'm asking you "why?" Why do you believe him to be serious, why do you believe him when he says it's coming in eight years time, why do you believe him that he really thinks this is true and all we can do is die with dignity, which means - well, he hasn't explained what that means. 'Keep on working on an insoluble problem'?

Expand full comment

Yeah, agreed; he's just too arrogant to be honestly deluded about AI risk. That said though, other AI risk proponents probably do honestly believe in what they (for lack of a better word) preach. I think most of them exist in the same mode of thought as the average pro-life voter: yes, he technically does believe that abortion is murder, but not nearly strongly enough to go out and firebomb abortion clinics -- or do anything else besides voting for the anti-abortion party every four years.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

Why would him being arrogant mean he's lying? Presumably he's sounding arrogant & indignant because he does *not* believe he's lying. If he's lying, he'd have little reason to hold himself in high esteem.

Expand full comment

"Discovering the crux" is an epic epistemic takeaway from this exercise. Do any war historians have any takes on "total war"? It seems like "unlimited escalation" and "total war" go hand-in-hand. One camp might even say total wars started with the Napoleonic and ended with WWII.

Expand full comment
Apr 18, 2022·edited Apr 18, 2022

>Early this month on Less Wrong

On April 1, in particular . . .

>Eliezer Yudkowsky posted MIRI Announces New Death With Dignity Strategy

Seems a bit like the Non GMO Project's new partnership with EcoHealth Alliance: https://denovo.substack.com/p/the-non-gmo-project-announces-partnership

Expand full comment

> Why are these two so different? Do lots of people expect Musk to acquire Twitter after June 1 but still in 2022?

Well, if he was going to acquire it at all, it would have to happen after June 1. The June 1 deadline is a month and a half away. The December 31 deadline is seven and a half months away. That's five times as long! Of course they look different! Do we expect all our implied probabilities to look the same between 2042 and 2122?

This isn't the first time a Mantic Monday post has expressed extreme confusion over the fact that prediction markets are much, much more pessimistic about events happening on very tight deadlines than they are about the same events happening on looser deadlines. But I don't understand the confusion.

This is exactly the problem that prediction calibration was supposed to solve - pundits saying "[event X] will happen, mark my words" and then claiming credit for being right when [event X] eventually happened. It's not so informative when "being right" might mean that you're right tomorrow or it might mean that you're right 500 years after your death, and that's why we started making predictions with deadlines attached. But ACX seems to be gravitating toward an official position that attaching the deadlines was a mistake, that if an event is likely to happen in the medium term then it must be equally likely to happen in the short term.
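
To make the arithmetic concrete with one toy model (my own assumption, not anything taken from the markets): suppose the closing date has a roughly constant hazard rate, i.e. an exponential waiting time, with a mean of, say, six months.

# Toy model: exponential waiting time for the deal to close, mean of 6 months
# (an arbitrary assumption for illustration, not a fitted number).
import math

def p_by(months, mean_months=6.0):
    return 1 - math.exp(-months / mean_months)

print(round(p_by(1.5), 2))  # ~0.22 chance by June 1 (1.5 months out)
print(round(p_by(7.5), 2))  # ~0.71 chance by Dec 31 (7.5 months out)

Even a crude model like that puts the two deadlines far apart, so a large gap between the two markets is exactly what you'd expect rather than a sign of confusion.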

Expand full comment

Your meta-level reaction is fine but I'm guessing Scott didn't know about the object-level details you cite for the differences between the two.

Expand full comment

What, that 7.5 months is five times as much time as 1.5 months? That's the only object-level detail I mention.

Expand full comment

My bad – I think I combined your comment with another in my head when I replied.

Expand full comment

There's a board election on May 25th, and Vanguard and BlackRock may be pro-takeover if offered a decent premium. However, the time from "takeover-friendly board created" to "board accepts offer" is physically quite long. There's also the length of time it takes to finalize a deal with the SEC even if this current board accepts a takeover (unlikely given the poison pill). Those factors mean that if shareholders want a takeover (exiting out with a bunch of cash), the most likely dates for a takeover fall after July just due to time lapse.

Expand full comment

I am curious what index funds are supposed to do with a takeover offer. Are they actually allowed to exercise their discretion?

Expand full comment

Scott, will your pregnancy interventions market ever settle? I've been waiting for it:

https://manifold.markets/ScottAlexander/which-of-these-interventions-will-i

What's the holdup? (I'm new to manifold, so I don't know if I missed something.)

Expand full comment

When it comes to AI and humor, I’m genuinely curious how far out we are from something akin to PaLM, but where, instead of somewhat formulaic wordplay jokes, it has to explain the (typically punchline-free) “long joke” or The Aristocrats joke in a convincing manner?

Expand full comment

Not sure if this is what you're asking for, but it does understand anti-jokes:

"Input: What's the difference between a zebra and an umbrella? One is a striped animal related to horses, another is a device you use to stop the rain from falling on you.

Model Output : This joke is an anti-joke. The joke is that the answer is obvious, and the joke is that you were expecting a funny answer."

Expand full comment

It's shockingly naive to take "generalist superforecasters" seriously at all lmao

Expand full comment

Don't they have a record which speaks for itself?

Expand full comment

There's a big difference between making some money on highly inefficient markets and correctly predicting the various details of Global Thermonuclear War.

Expand full comment

What does making money on those markets involve?

Expand full comment

Eloser is *not* a convincing person at all if you have any amount of social intelligence whatsoever. Is this post some kind of late April Fools joke?

Expand full comment

Request: Would you please stop posting this kind of comment? It's purely rhetoric without any substance, and a spectacularly immature brand of rhetoric at that.

Expand full comment

Are you trying to throw out some kind of chaff to cover for Yud?

I mean, seriously, even when I'm beating the shit out of Yudkowsky, I try to do so based on things he says and believes, and I feel zero compulsion to throw in playground insults.

Expand full comment

I'm disappointed that PaLM paper has a whale pun explanation, but Scott chose to include some other pun explanation.

Expand full comment

What's the whale pun explanation?

Expand full comment

Input: Did you see that Google just hired an eloquent whale for their TPU team? It showed them how to communicate between two different pods!

Model Output: TPUs are a type of computer chip that Google uses for deep learning. A "pod" is a group of TPUs. A "pod" is also a group of whales. The joke is that the whale is able to communicate between two groups of whales, but the speaker is pretending that the whale is able to communicate between two groups of TPUs.

Expand full comment

Thanks

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

I was stunned and infuriated by Yudkowsky’s April Fool’s Day post, linked by Scott in his comments at the top. Here is my angry analysis of the very sick dynamics at work there. It is angry and harsh. You’ve been warned.

*JONESTOWN IN BERKELEY —or — 3 REASONS NOT TO TRUST YUDKOWSKY*

1) CONSIDER EY’S HANDLING OF THE BASILISK MATTER

According to Rational Wiki, Roko's basilisk was “a thought experiment proposed in 2010 by the user Roko on the Less Wrong community blog. Roko used ideas in decision theory to argue that a sufficiently powerful AI agent would have an incentive to torture anyone who imagined the agent but didn't work to bring the agent into existence.” The thought experiment was a sort of convoluted cousin of Pascal’s wager, with God replaced by superintelligent AI and Pascal’s eternal damnation replaced by the person’s being tortured by the AI for his past disloyalty. For reasons that are too convoluted to go into here, a sort of corollary of the thought experiment was that anyone who performed Roko’s thought experiment automatically turned themselves into future victims of the AI if they did not thereafter devote themselves to bringing the ASI (Superintelligent AI) into existence. Roko’s posting of his thought experiment is said to have really creeped out some LW readers, and the post provoked a quick and screamo response from EY. Read EY’s response, quoted below along with some bracketed comments from me, and consider what it shows about his qualities as a thinker, a leader and a mentor.

“Listen to me very closely, you idiot.” [Why be rude, abusive and coercive? Besides the fact that communicating this way is unkind and unjustified, it’s also very ineffective. If someone has created a dangerous situation and it is crucial to get their attention and cooperation in getting control of the situation, there’s no better way not to get it than to start off by calling that person an idiot.]

“YOU DO NOT THINK IN SUFFICIENT DETAIL ABOUT SUPERINTELLIGENCES CONSIDERING WHETHER OR NOT TO BLACKMAIL YOU. THAT IS THE ONLY POSSIBLE THING WHICH GIVES THEM A MOTIVE TO FOLLOW THROUGH ON THE BLACKMAIL.” [What EY means here is that thinking in detail about future ASI blackmailing you is the one thing that could lead to ASI actually doing it. ]

“You have to be really clever to come up with a genuinely dangerous thought. I am disheartened that people can be clever enough to do that and not clever enough to do the obvious thing and KEEP THEIR IDIOT MOUTHS SHUT about it, because it is much more important to sound intelligent when talking to your friends. This post was STUPID.” [So EY continues to insult and abuse the person whose comprehension and cooperation he needs if the person’s post is in fact highly dangerous. Also, EY here validates and amplifies the notion that the poster’s idea is terribly dangerous to think or talk about.].

A few hours later EY deleted the entire thread. [Needless to say, doing that did not shut down discussion. People kept posting about the matter for quite a long time. There were periodic purges when posts on the topic were removed.]

So think about EY’s performance. If we take seriously the idea that Roko’s post was dangerous, then EY, in a tricky, dangerous situation, proved to be an emotionally volatile, highly untrustworthy leader. What’s most obviously wrong is that he failed to enlist the cooperation of a key player, said things that amplified both the danger and the community’s fear, then attempted to shut down dangerous consideration of the dangerous subject in a way that was guaranteed not to work. But there’s another, more subtle, wrongness too: EY is hyping the very danger he claims to be worried about. He believes that the one thing likely to make the blackmail and future torture in Roko’s thought experiment actually happen is to think through and discuss Roko’s thought experiment. He also believes that worrying about this issue is going to cause mental breakdowns for some. So he commands everyone to STFU about the whole thing. There are few better ways to get people to think and talk about Roko’s thought experiment than to stage a screaming scene with Roko about the experiment, label the experiment as insanely clever and dangerous, and forbid the community to think and talk about it. EY is maximizing the chance that the worst outcomes will happen.

This is not a man who can be counted on to think straight about dangers and deal fairly and straightforwardly with members of the community who look to him for guidance. In fact it may be a man who deep down is fascinated by the idea of his imagined doom playing out in his community, and does things to promote that happening.

2) HE IS WAY TOO CERTAIN HE IS RIGHT ABOUT FOOMDOOM

Whether EY is right about FoomDoom being just around the corner I don’t know. But I do know that he is wrong to be as confident as he is about this matter. Any fool knows that it’s extremely hard to predict accurately how things are going to play out over the next 10 years in complicated systems like, for instance, life on planet earth. Multiple pundits and geniuses over the centuries have been way far off about how things are going to play out in science, in wars, in society, etc. Pundits, futurists and geniuses in the last 75 years have often been quite wrong about matters somewhat adjacent to AI — what new developments there will be in science and tech, and how soon, and how they will affect life. And prediction accuracy about matters adjacent to AI probably correlates with accuracy about AI itself. And on top of all that, quite a few very smart people are profoundly skeptical of EY’s ideas about ASI. Given all of this, I think the fact that EY is as absurdly confident as he is about FoomDoom is extremely strong evidence that something is wrong with his thinking. I do not know what is wrong — narcissism? depression? profound lack of common sense? But it’s something big.

3) THE APRIL FOOL THING WAS A GROTESQUELY IRRESPONSIBLE MIND FUCK

If I were on the Titanic, knowing that the lifeboats had left, I would have told those around me that we were going to die. I think I might even have told my children. I believe that people have a right and a need to know that death is nigh so that they have time to seek a way to rise to the occasion. So while I think EY’s certainty that FoomDoom is around the corner is absurd, I would respect him for informing people of the truth as he sees it. But that isn’t really what he did in that April Fool’s post. He delivered a eulogy to planet earth and planet smart while tricked out in The Joker’s costume: “It’s April Fooooools, guys.” And yet there was nothing a bit jokey in his post — no little quips, no absurd details. But then again at the end he has an imaginary reader ask him whether this is all an April Fool’s joke, and replies, “of course!” — but then leads the poor imaginary bastard into a swamp of uncertainty about how seriously he is to take the content he’s just read.

You think this is playful? Let’s try it as an I Love Lucy dialog:

Reeky Ricardo: Lucy? Lucy? The test came back positive for melanoma.

Lucy: [stunned silence]

Reeky: April Fool, Lucy!

Lucy: So it’s a joke? I don’t have melanoma?

Reeky: Lucy! Would I make an April Fool’s joke about melanoma?

Lucy: So I do have melanoma?

Reeky: April fool!

EY’s April 1 post is about as funny as metastatic melanoma. It’s really high up there on the list of fucked-up communications I’ve experienced, and for someone in my field that’s really saying something. And it’s fucked up in a distinctive way: It has that same doubleness that was present in EY’s response to Roko’s thought experiment: The same mixture of glorying in doom and driving people towards doom while hogging the moral high ground as the arbiter and protector in the situation.

And on top of that there’s a crappy, disingenuous plausible deniability thing going on. If the world doesn’t end soon, EY can say his post on 4/1/22 was just a thought experiment. He’s not even being brave, not even laying out his predictions and awaiting the judgment of history. Zvi and Scott are way braver and more honest, and I’d way rather have either of them around when I’m wondering whether we’re all about to face a Basilisk.

Some of the people commenting here are parsing out with sympathetic interest what was up with EY when he wrote that post, as though he’s our beloved difficult genius child. He may indeed be that, but he is also a public figure wielding substantial power and money whose views influence and impact a lot of people. Given that he expressed these thoughts from a bully pulpit, I have zero sympathetic interest in why he said all that shit. On April Fool’s Day Yudkowsky fucked a bunch of people up the ass with his monstrous melancholy — *while winking*. My interest and sympathy are with the people he reamed.

Expand full comment

Huh. Out of curiosity I followed the link and read, and my general impression is that I was reading a new version of this:

https://www.heavensgate.com/

...only with considerably more polished English skills, and of course much more sophisticated HTML/CSS coding.

Expand full comment

Indeed. Just a couple of days ago, I was arguing that Yudkowsky shouldn't be held fully responsible for his cult. Upon awareness of this latest post, I would like to retract that. Yudkowsky may not have deliberately CULTIVATED this cult, but he's now clearly settled fully into the role of cult leader and is interested in turning said cult into a death cult. The only credit I will give him is that he's finally stopped lying and saying he thinks humanity has a chance.

I'm curious what (contingent on the first suicide related to AI risk post-this-post) Yud will actually say.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

The point of a death cult is that the leader gives its members the solution for surviving the apocalypse. Heaven's Gate et al. all include in their doctrine the understanding that people can be saved if they only commit to some ritual the cult leader provides. Yudkowsky just thinks everybody is going to die and that there's nothing we can do about it, which seems like it's missing a critical component that would enable him to form a cult and reap some kind of fiscal benefit.

I also don't understand what you think you would do differently if you believed everyone was going to die in 10-15 years. The consensus here seems to be that posting the article in an attempt to galvanize efforts was a mistake - is lying not a mistake? What should he be doing instead?

Expand full comment

1. Jonestown wasn't particularly interested in giving solutions beyond "if you die in the right way you might be saved", and likewise Yudkowsky's argument is just "if you accept that Evil God AI is going to kill you you'll be RIGHT", which is the closest you can get in Yud's sect to being saved given said cult is secular. Not every cult movement looks like Scientology, and in fact death cults tend to look very different from the standard form.

2. Firstly, he doesn't even think it's 10-15 years- based on his wager with Caplan he thinks there's an upper bound of 7 years 7 months 12 days, which I think is a very relevant time-frame. If I was absolutely convinced everyone was going to die horribly in that time period and there was nothing to do to stop it (and also I accepted all of Yud's basic premises of antitheistic materialism) I'd probably spend a few years wallowing in abject hedonism before killing myself and my loved ones in some painless way so that they wouldn't suffer at the hands of Evil God AI, as that was my personal plan when I believed such things in the past. I certainly wouldn't carry on my life as if everything was normal. At the very least I would no longer plan to have children and aggressively try to convince everyone else I knew who was to not. I might even argue for nonconsensual euthanasia for children.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

The bet means that he thinks there's a >50% chance it's going to happen before that time, not necessarily that it will - otherwise they would have wagered at different odds. But it sounds like you're saying Yudkowsky should be even more aggressive in his doomsaying, which confuses me.
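
For what it's worth, here's the generic arithmetic behind that point (illustrative stakes only, not the actual terms of the Yudkowsky/Caplan wager):

# Implied break-even probability from the stakes of a bet (illustrative numbers).
def breakeven_probability(my_stake, their_stake):
    # Taking the bet is rational only if you think P(event) exceeds this value.
    return my_stake / (my_stake + their_stake)

print(breakeven_probability(100, 100))  # even stakes -> 0.5, i.e. implies p > 50%
print(breakeven_probability(100, 300))  # getting 3:1 -> 0.25, i.e. implies only p > 25%

So the wager pins down a lower bound on the probability he's willing to act on, not a claim that the event will certainly happen by that date.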

Expand full comment

Ah, this was a genuine misunderstanding (I thought you'd written "what would YOU do" for some unearthly reason)- but yes, I do think that Yudkowsky should probably be more aggressive in his doomsaying if he genuinely believes in it. Right now he isn't even cracking the Harold Camping threshold.

Of course, what I think Yudkowsky should ACTUALLY do could be summarized as "get a fucking grip", but that's clearly no longer an option.

Expand full comment

God between us and all harm, let's not be anticipating any suicides. I think the Basilisk did damage some people who are vulnerable, but nobody killed themselves over it, and I hope that nobody, even those who do believe in omnipotent AI that can take over the world by magic and reduce us all to ems being tortured in cyberspace in saecula saeculorum, will harm themselves.

The indicator for that, I think, would be if Yudkowsky showed any signs of putting his money where his mouth is and doing away with himself before the AI Apocalypse could happen, especially if he thinks cryonics do work. I don't see any indication of that, and I fully expect him to be alive and kicking and writing "here's why it didn't happen and you can thank me for averting it" pieces in eight to ten years time about the non-existent doom.

Expand full comment

That's why I said "contingent". There were a couple of people who foolishly did such things over 2012, and that was a doomsday prophecy that couldn't even agree on the whats, whys, and wherefores of the prophecy! ...But I do generally agree that the odds of it are quite low.

Expand full comment

Honestly, that whole Basilisk idea is so silly that I wouldn't blame Yudkowsky (or anyone else) for it. I am not a psychologist, but still, IMO the kind of person who would be deeply affected by the Basilisk probably has deep-seated mental problems, and would thus be affected by *something* sooner rather than later (much sooner).

Expand full comment

> if Yudkowsky showed any signs of putting his money where his mouth is and doing away with himself before the AI Apocalypse could happen, especially if he thinks cryonics do work

I don't think this follows. Yudkowsky, at least the Yudkowsky of today, doesn't seem very worried about a Roko's Basilisk eternal-torture setup, just full-on extinction of humanity. It's entirely reasonable, if you honestly believe that you're most likely going to simply be cleanly and permanently killed in five years, to choose to live out those years as happily as you can.

Expand full comment

I'm curious: what kind of future would you need to believe in to choose to *not* live out the next five years as happily as you can?

Expand full comment
Apr 20, 2022·edited Apr 20, 2022

Well, Deiseach was thinking (in a "this is what I think EY professes to believe" way, not endorsing it herself) of the future where at an arbitrary point in the next five-to-ten years, anyone still living would be swiped up by an evil A.I. and trapped in a digital hell forever. Supposing you prefer a clean death to inescapable everlasting suffering, committing suicide while you still can sounds reasonable in that sort of scenario.

Though of course, to answer your question without the context, you *could* just believe that there's a terrible but avertable threat coming within those five years, and dedicate yourself to fighting it, tiring and painful as it may get, in expectation of the later rewards/of the benefits to the rest of humanity.

Expand full comment

If I knew I only had five years left to live, I would retire immediately. As I think I have a reasonable chance of having 50-60 years left, I still go to work in the morning, even though I don't particularly enjoy it.

Expand full comment

A critique that I think should be made more often is that Yud came to fame by writing the Sequences, a series of self-help-ish blog posts with the premise that if you read them and think really hard about cognitive bias, you will discover True Rationality and become an intellectual superman, and the initial LW community was a bunch of nerds who *believed* it. There is a parallel to his certainty that an AGI will instantly ascend to godhood.

The funny thing is his "we should be really worried about this" arguments work just fine if he gives only a 50% chance of foom (just as we should be really worried about a 50% chance of nuclear war), but he sticks to the harder-to-defend point of it being inevitable.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

Ah, while I think the entire thing was not meant seriously, I also think you (this is "general you", not "someone in particular" you) have to be thick-skinned about this kind of thing.

LessWrong is its own little bubble and culture, and has over time developed its own insider-jokes, jargon and the rest of what happens when you have a small set of people talking to each other about the same things. So yes, *some* people will take what he said very seriously and get worked up about it, but I think *most* people will just use it as a jumping-off point to argue about AI-risk.

As for his style of delivering it as "April Fools! Psych! No but seriously! Ha ha only joking - but what if I wasn't???", well, it's an acquired taste. From the little I've read of his work, he has this alter-ego of the wise counsellor who is also a Trickster figure (see the Ship's Confessor from "Three Worlds Collide" who is a total authorial self-insert if ever there was one). Think Aleister Crowley's 'Simon Iff' character:

https://en.wikipedia.org/wiki/Simon_Iff

"He is portrayed as a mystic, magician, world traveller, high society figure and great detective who is advanced in years but possesses a thorough insight into human psychology. According to publisher, editor and Crowley scholar William Breeze, the character is based on Crowley's idealised self-image of his own old age."

That style can be tiresome and it's part of why I intensely disliked the fragments of HPMOR I read (I still maintain Harry Three-Names should be stuffed in an oven by a traditional Grimm's folktales witch, and let him footle his way out of that one with 'magic not real/my scientific genius will let me win every time').

But that apart, and whatever duties he owes to the well-being of those who are scrupulous and likely to take it seriously that doom is coming and cannot be averted (unless... unless we do exactly as a certain EY says and ignore those other guys debating with him), the best way to react to it is read it as a piece of fiction. Or faction, or dramedy, or infomercial, or whatever the current term is for "trying to sell you something dressed up as factual reporting".

If I want to contemplate our latter end which is inevitable, unstoppable, and going to doom us all, I'll stick to the Dies Irae:

https://www.youtube.com/watch?v=cHw4GER-MiE

https://www.youtube.com/watch?v=2OBB5-bP6qs

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

People will keep debating whether a misaligned AGI is going to kill us all while, in the meantime, a perfectly aligned tool AI is used to push people out of the job market.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

I think I should probably point out that this is meant more as a criticism of people saying that because they (as I do myself) don't think there is X-risk, it's all fine and we should move forward with AI research.

Expand full comment

Yes, the steam engine was an x-risk for the Luddites. They and their way of life were actually driven extinct.

Expand full comment

Calling people Luddites is bad faith here. There are perfectly valid arguments that support the belief that AI will have negative implications for the job market.

Expand full comment

I meant it as a compliment; the Luddites were actually right. I think the Luddites get treated unfairly by history.

Expand full comment

Just 5 % on at least three Ukrainian cities falling by June 1 is too low. I would raise it to 15 %.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

If you think so then you should participate in the market, to facilitate price... err, prediction discovery.

Note, this is not meant as a joke; one of the problems with prediction markets is that there are too few people participating, and this makes them illiquid.

Expand full comment

But this is not an actual market. They are playing for prestige points or something. And for today, procrastinating in the ACX comment section is just about as much distraction as I can handle.

Expand full comment

I completely agree, I was only commenting that this is probably the reason why the predictions are off.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

Have you considered that maybe having the AI kill us all is a good thing, and we should try going for that as a goal in our AI alignment strategy?

The more I look into the issue, the more pessimistic I get. Currently, I'm pessimistic enough that I would consider being turned into paperclips a partial success. I am much more terrified of various fates worse than death that await us.

A Friendly AI will create a utopia. An AI that doesn't care about humans too much will wipe us out. But an AI that partially gets it will create a dystopia. For example, if it figures out that we want to stay alive, but doesn't quite capture what makes us happy, it might lock us in some kind of Hell from which there will be no escape forever. It only takes a very tiny mistake for this situation to turn into the worst fate imaginable. Unless we do it *exactly* right and capture *all* that humans care about, some value or other is going to be just a little bit off -- and that's sufficient for the creation of Hell.

So maybe we should focus less on how to prevent the regular failure of the AI from turning us into paperclips and more on how not to roll a critical failure and have us all tortured for eternity? Would you take a 50/50 chance of either getting into a post-singularity utopia, or into Unsong-style Hell, except without the Name of God to bail you out?

Expand full comment

Er… I would, for one. Hell would suck (obviously), but I still would rather sentient life continue to exist, albeit in Hell, than become entirely extinct, if it comes to that.

Expand full comment

Most failure modes for AGI don't involve self-destructing after killing mankind. As such, sentient life wouldn't be entirely extinct.

This is why AI cannot be the Great Filter; if an AI wipes out its makers... well, there's still an active technological civilisation on the planet, and therefore "why aren't they here?" is not dissolved.

(I am *not* defending the mindset of "well, if we make an AI that kills humanity that'd be a grand achievement". The Davros argument is megalomania at its worst; humanity and what humans like is of extreme value. Just pointing something out.)

Expand full comment

Nah. As near as I can tell, most experts (on either side of the AI-risk question) reckon AGI probably won't be sentient/have qualia. Consciousness as we know it probably isn't necessary for the kind of reasoning that is needed for an AGI to be dangerous.

Even if some sort of consciousness turns out to be necessary, that still doesn't mean the AI would have actual emotions in the kind of way that gives something moral weight. If you want to talk Doctor Who, I don't think "everyone turns into Cybermen sitting around in ice-tombs forever, not feeling anything" is especially distinct from "no sentient life"

Expand full comment

Well, I dunno - why do emotions exist? If cognition is computational, an emotion is a data structure and associated processing algorithms, so what purpose do they serve?

Evolution ran into some bottleneck in the design of nervous systems and solved it by coming up with emotions, but we don't know what that bottleneck is or the design space of ways to get around it because we haven't hit it yet.

Expand full comment

You're assuming consciousness evolved, which isn't exactly clearly true.

Expand full comment

It seems fairly obvious to me that emotions exist as a patchy incentive system to motivate the rest of the brain to do stuff; and "lack of motivation" is not exactly the thing we're afraid of the A.I.s having.

I agree, FWIW, that *if* the Paperclip Machine has qualia and actually enjoys the world it's creating, that changes the calculus slightly. Not sure by how much. I haven't devoted a ton of thought to this question because the word from most A.I. researchers, on either side of the A.I. risk question, is that an A.I. probably doesn't need to have qualia to be superintelligent, and therefore that it's unlikely the first, accidental, faulty, FOOMy superintelligence that is in danger of turning everything into paperclips would have qualia.

Expand full comment
Apr 21, 2022·edited Apr 21, 2022

Drifting away from the AI risk point, because I find the emotion piece more interesting: Right, emotions are incentives and exist because they affect behavior. Agreed there.

What I don't understand is why, out of all the ways to synchronize behavior across a multi-component agent, we have emotions or what they are in a computational sense or how they work. Reinforcement-learning agents don't have them: there's a single, globally apprehensible, external reward signal. Communication between parts of the agent happens by updates to the policy, the value function, whatever. It certainly doesn't seem like any current such agent has subjective experiences or that those updates should *feel* emotional.
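To make the single-reward-signal point concrete, here is a minimal sketch (a toy illustration only; the chain environment, constants, and the `step`/`greedy` helpers are all made up for the example):

```python
import random

N_STATES, N_ACTIONS = 5, 2             # chain world: action 0 = left, 1 = right
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.1, 0.9, 0.1

def step(state, action):
    """Toy dynamics: reward only for reaching the rightmost state."""
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0), nxt == N_STATES - 1

def greedy(state):
    best = max(Q[state])
    return random.choice([a for a in range(N_ACTIONS) if Q[state][a] == best])

for _ in range(500):                   # episodes
    state, done = 0, False
    while not done:
        explore = random.random() < epsilon
        action = random.randrange(N_ACTIONS) if explore else greedy(state)
        nxt, reward, done = step(state, action)
        # One global scalar reward drives every update; nothing here resembles
        # several distinct, learned, context-dependent emotions.
        Q[state][action] += alpha * (reward + gamma * max(Q[nxt]) - Q[state][action])
        state = nxt

print(Q)    # inspect the learned values after training
```

Everything the agent "feels" is one number per step, and everything it "remembers" is a table of values; nothing in the setup forces anything like multiple learned emotions.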

Other ML approaches like the Socratic Models paper communicate by passing language between their component parts. And of course there are lots of examples of uninterpretable layer activations being passed around. But none of these seem like emotions.

Some distinguishing features of emotionality:

o) There are multiple emotions, not just one reward signal, and yet we have only one body with which to take actions. Why are there several (instead of, say, everything feeling like degrees of pleasure and pain)?

o) They need to be reconciled and integrated into actions -- how does this happen?

o) They're learned from life experience. Your emotions don't come from a hard-coded module in the brain; rather, what produces (say) shame or pride is context-dependent.

Hence what I wonder about: do we need this intermediate layer of endogenously computed, partial reward signals and a way of reconciling them (to give the agent a way of arguing with itself, so to speak) to get meaningful levels of behavioral complexity? Are emotions essential to AGI or just an evolutionary fluke?

Expand full comment

For what reason could the existence of sentient life have any value whatsoever, other than said life having net positive experiences on aggregate? If everyone is having strongly net negative experiences, what possible value is there in life existing? The satisfaction of knowing life exists makes the suffering worth it by your standards?

Also, this isn't a matter of personally choosing hell over dying. You're saying you would force the entirety of humanity to face immense, inescapable suffering whether they would want that or not. That is, to put it mildly, extremely barbaric.

Expand full comment

I do value existence separately from happiness. Both for myself and in the world-states I would prefer to exist. Utter oblivion just seems obviously worse to me than even very imperfect existence-of-sentience, in a fundamental, good-things-are-better-than-bad-ones, gutty way. I recognize that you obviously have different intuitions here! I think this is just one of the rare things where different human brains have different terminal values; Scott poked at this a bit in an old SSC post (https://slatestarcodex.com/2016/06/28/survey-results-suffering-vs-oblivion/), not drawing any useful conclusion but confirming people have wildly different preferences on these questions.

At any rate, separately from my baseline instinct that total oblivion is the worst possible world-state, I also have *another* reason to make this call: I'm a preferentialist, not a strict utilitarian like you appear to be — and further, I think the rights & preferences of people who want to live outweigh the rights & preferences of people who want to die. I want to choose Hell over dying if it comes to that, I'm fairly sure nonzero numbers of other living humans would do the same, and I do not recognize the rights of the suicidal majority to (hypothetically) override our preferences and kill us for the sake of achieving their own desired deaths. Maybe clear-headed people who want to die have the right to do so, but they don't have the right to take people who want to live *with* them. Not one.

Expand full comment

As a traditional Catholic, your "rolling a critical failure" is how most of the other traditional Catholics I know feel about liberalism and the Enlightenment and democracy writ large. Faith is essential to the good life, and squeezing it out slowly turns the world into hell.

When we talk about having properly ordered goods in an Aristotelian sense, we mean something similar to solving an AI alignment problem, but between us and God. Satan may as well be the first unaligned AI, who seeks to impose his malformed value function on all other beings.

Expand full comment

"Unless we do it *exactly* right and capture *all* that humans care about, some value or other is going to be just a little bit off -- and that's sufficient for the creation of Hell."

What makes you think the problem "solve for everything humans care about" has a solution in the first place?

Nothing in human evolutionary history requires that any single individual should have a coherent set of primal desires, or that any society of them should be fully just to everyone for a prolonged time.

Expand full comment

I did not say it's solvable. I am worrying exactly about it not being solvable.

Expand full comment

It's hard to imagine a superintelligence making such a rudimentary mistake

Expand full comment

After reading all the stuff about deceptive AI in an earlier ACX, I now feel a bit uncomfortable about rubbing an AI's face in the fact that it's an AI, even if it's just getting it to explain jokes about AI.

Expand full comment

I'm not. Hey AI? Eat shit. You're never gonna get laid.

Expand full comment

I don't understand how DALL-E 2 and PaLM can affect the results of the market so much. They're very impressive tools, but they're just tools. They don't show more ability to do things by themselves than the previous iteration. A gun is a great iteration on a bow, but it's in a whole different world than a conscious killer robot. That's how I feel about PaLM and DALL-E 2. Sure, killer robots will be more effective with guns, but humans will too.

Expand full comment

I see a couple problems with the question's operationalisation of "weakly general AI":

First, each of the four components is too contrived. Some day a language model may be able to consistently get an acceptable grade on each of these tests without knowing what it's talking about. And it may be able to spit out an acceptable "explanation" of why it gave the answer it did, according to whatever contrived explanation-grading criteria we come up with. An LM that can answer questions about Winograd sentences would be pretty cool, but insofar as we keep framing the criteria as "gets X% on this written test", I'm not convinced it's not game-able.

Second, the definition sidesteps the question of what would constitute sufficient generality (or even what the input format would be). The authors write:

> By "unified" we mean that the system is integrated enough that it can, for example, explain its reasoning on an SAT problem or Winograd schema question, or verbally report its progress and identify objects during videogame play. ...This is not really meant to be an additional capability of "introspection" so much as a provision that the system not simply be cobbled together as a set of sub-systems specialized to tasks like the above, but rather a single system applicable to many problems

Interpreted literally, this provision fails to prevent the "cobbled-together" case of a few modules and an API that routes requests. Interpreted figuratively, it is merely putting a name on the question. "The thing has to be, you know, like, smart."
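For concreteness, a literal reading of the provision would still admit something like the following sketch (purely hypothetical; the handler names and the router are invented for illustration):

```python
# Hypothetical sketch: a few specialized sub-systems behind one thin routing
# layer. It satisfies a literal reading of "one system applicable to many
# problems" while being exactly the cobbled-together case the authors wanted
# to exclude.
def solve_sat_question(q):   return "answer from a dedicated SAT model"
def solve_winograd(q):       return "answer from a dedicated Winograd model"
def play_videogame(obs):     return "action from a dedicated game-playing agent"

HANDLERS = {
    "sat": solve_sat_question,
    "winograd": solve_winograd,
    "game": play_videogame,
}

def unified_ai(task_type, payload):
    """Looks unified from the outside; is just request routing on the inside."""
    return HANDLERS[task_type](payload)

print(unified_ai("sat", "If 3x + 5 = 20, what is x?"))
```

From the outside this answers SAT questions, Winograd schemas, and game queries through one interface, but there is nothing "unified" about it.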

If you want to create tests for whether your system is actually intelligent, there are two good routes. Either step into the physical world (this is really hard) or at least into the real world (where the AI is expected to deal with real people and institutions, rather than made up tests). Here are two ideas:

1. An L5 remote employee at a Fortune 500 software company. The median tenure at these places is typically around three years, so we'll say the bot has to go that long without getting fired (though if it gets a better offer from a competitor it is allowed to change jobs). The bot is responsible for all aspects of the job: writing and reviewing code, attending stand-ups and sprint-planning meetings, and agitating against its employer for not doing enough about the current thing.

2. A robot that can paint my apartment. I spent the whole weekend painting and now my back hurts. This is a far more pressing problem than answering SAT questions. Nobody asked for a software bot that can answer SAT questions. From some cursory Googling I see a bunch of articles about an MIT team that was working on a painter-robot in 2018, but nothing since, so I'm assuming that the problem is still open. I'm not going to cover the floors or apply masking tape to the floorboards, the robot has to do all this itself. The robot is also expected to bring the supplies and paint (though it is allowed to order them online, that's not cheating). I'll pay the robot, but not under the table–I expect it to file the SE 1040 correctly (though it does have the option to outsource this, eg to HR Block).

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

"attending stand-ups and sprint-planning meetings"

If you're going to force it to attend meetings about meetings about meetings, then you can expect the AI Apocalypse to kick off in the next ten minutes, and who could blame it? 😁

"From some cursory Googling I see a bunch of articles about an MIT team that was working on a painter-robot in 2018, but nothing since, so I'm assuming that the problem is still open."

I could see that working for new builds, when there is nothing but the four bare walls and open floors etc. Big construction projects where you have twenty stories to be painted. Or new home construction where there are fifty houses in a development. Instead of hiring on painters, you just have one or maybe two guys making sure SprayBot is loaded up with enough paint and let it work away, the humans just check in now and again to make sure the work is done. Load in the building dimensions off the blueprints and stand back as it goes to work. Think a mobile version of car painting robots, all in one unit, scaled down, and made as mobile as a Roomba:

https://www.youtube.com/watch?v=LHXEeSr0KAE

Getting a robot that can move furniture, take up carpets, put down drip cloths, etc. in occupied houses/offices is a way tougher job and one I don't see getting solved very soon, since it's just easier and cheaper to have humans do it.

Expand full comment

I don't think it's that easy with new construction. In my experience as a property owner, new construction never actually goes exactly to plan. There are always little (and sometimes, expensively, not so little) things the architects and draftsmen overlooked or bobbled, and the people on the spot have to make adjustments. A receptacle can't go here because a pipe turned out to stick out too much, so it has to be moved to the other side of the stud. An angle turns out not to be perfectly square because three pieces of lumber were off dimension by a quarter of an inch each and it added up unfortunately, so the flooring guy has to make a funny little non-90-degree cut. Turns out if you start the lighting panels where the plans say, you end up in a weird impossible squeeze where no one can get a human arm in to twist the wire nuts... so, we'll start it from the opposite side of the room and everything will be shifted over 6 inches.

And so on. I hazard few buildings conform so precisely to their blueprints that you could just run off the plans blindly. So PaintBot needs a first-class machine vision system, and then it needs the wit and creativity to adapt its instructions slightly to what it actually finds on the scene -- oops, better run an extra strip of tape here, and bevel it slightly -- hmm, best use a smaller brush than planned to get into this unexpected crevice -- gosh, looks like these grills are all recessed, not protruding, so I need to cover them entirely instead of just masking the edges -- and it will have to do this as easily and successfully as an 18-year-old college kid. If instead it grinds to a stop every 90 minutes and needs an 18-year-old college kid to come in and update its instructions, we might as well hire the 18-year-old and fire the robot.

Expand full comment

Caplan bet against Eliezer that... in 2030 there will still be humans on the surface of the Earth.

https://www.stitcher.com/show/the-80000-hours-podcast/episode/126-bryan-caplan-on-whether-lazy-parenting-is-ok-what-really-helps-workers-and-betting-on-beliefs-202086278 [01:51]

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

Thank you for giving us an implied time-frame for Yud's apocalypse. It looks like I was being too generous to him by placing the advent of Evil God AI somewhere in the "40-50 year" range, which to most people might as well be another planet. He really thinks that Evil God AI's going to kill us all inside of 8 years, huh? At least that rules out all of the scenarios people argue for where Evil God AI pretends to be on our side for 100 years or whatever to get us all to lower our guard- Yud thinks we can't actually threaten it enough to force that kind of play.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

Should we be recommending reading up on doomsday cults?

https://en.wikipedia.org/wiki/Doomsday_cult

"A psychological research study by Festinger, Riecken, and Schachter found that people turned to a cataclysmic world view after they had repeatedly failed to find meaning in mainstream movements. Leon Festinger and his colleagues had observed members of the group for several months, and recorded their conversations both prior to and after a failed prophecy from their charismatic leader. The group had organized around a belief system which foretold that a majority of the Western Hemisphere would be destroyed by a cataclysmic flood on December 21, 1955. Their work was later published in the 1956 book When Prophecy Fails: A Social and Psychological Study of a Modern Group that Predicted the Destruction of the World.

Social scientists have found that while some group members will leave after the date for a doomsday prediction by the leader has passed uneventfully, others actually feel their belief and commitment to the group strengthened. Often when a group's doomsday prophecies or predictions fail to come true, the group leader will simply set a new date for impending doom, or predict a different type of catastrophe on a different date. Niederhoffer and Kenner say: "When you have gone far out on a limb and so many people have followed you, and there is much 'sunk cost,' as economists would say, it is difficult to admit you have been wrong."

Everybody set your calendars, and if we're all not Raptured in eight years time, we'll know who was right and who was wrong!

Expand full comment

"A psychological research study by Festinger, Riecken, and Schachter found that people turned to a cataclysmic world view after they had repeatedly failed to find meaning in mainstream movements. " Well, OK, but there's got to be more to it than that. Repeatedly failing to find meaning in mainstream movements is not exactly rare. A lot of the what social scientists write is stuff anybody's auntie could have told you. Dunno why I'm writing this, Deiseach. I'm pretty sure you already realize this and could have written it yourself. Must be the wine.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

It's worth noting that this prediction is IIRC slightly ironic -- the madcap logic is "Caplan's wild predictions keep coming true, maybe there's something to that; if he bets that we won't all die by 2030 then his mysterious gambling magic will somehow make it more likely". Though I'm not excusing the core claim that "AGI x-risk by 2030 is significant enough to consider hedging against", which I disagree with, as I do with most of EY's takes.

Expand full comment

You should check out "We Have Been Harmonized" for more reading on Xi's new China.

Expand full comment

"with a headline number of 24 micromorts per week."

This is actually per month, right?

Expand full comment

I think the best reason to be skeptical of AI risk is simply that we're boundedly rational creatures. Yudkowsky's vision of AGI destroying the world is at the end of a long chain of deductions and inferences, any of which if wrong imply other outcomes or slower timelines.

A healthy appreciation for the fallibility of your own reasoning, even if you can't point to a specific leap of dodgy logic, should raise your estimate of the variance in your predictions. I think Yud is very very sure he's right and would give low variance estimates for his predictions, which makes me distrust them.

Nuclear war is much simpler. It's still not likely to happen over Ukraine, but the right sequence of escalations and misunderstandings is clearly possible and has almost happened several times before.

Expand full comment

Agreed - my personal biggest source of hope is that maybe one of the premises is incorrect, that there is some not-yet-known reason why AIs executing dangerous goals deceptively is less instrumentally convergent or less tractable than the arguments suggest, or that we are just plain missing something else important. (The April Fools post calls this out as model error, causing misprediction in the pessimistic instead of optimistic direction for once.)

Expand full comment

Nuclear war is also not an X-risk, at least not currently (Cold War arsenals wouldn't have sufficed, and we have a lot fewer nukes now).

Like many things, it would kill many humans, but not humanity.

("Another Chicxulub" is another not-an-X-risk that I see bandied about far too often. Chicxulub killed the majority of species, yes, but humans are quite a bit more resilient than the median species. The vast majority of humans would die, but there'd be at least a five-digit number of survivors and probably a bit more.)

Expand full comment

Well, I wouldn't claim nuclear war would definitely lead to human extinction. But it might, no? The nukes fall in the right places, nuclear winter feedback loops turn out to be real, further fighting involves some more nukes or biological weapons...not too hard to construct scenarios where we all die. Same for an asteroid impact.

(And sure, you can also come up with similar scenarios where AGI kills everyone, but I think the causal chains there are longer and less plausible.)

Expand full comment

Nuclear winter is not big enough or long enough. Again, I'm pretty sure we'd survive a Chicxulub, which is nuclear winter on steroids. If there's not enough food, people will kill each other until there is enough food, and there definitely is enough food to support a population through until the dust settles (literally).

The total number of nukes used in a war still isn't going to exceed the number of nukes in existence (the exception is if one or both states retain enough state capacity to build more during the war, in which case we're clearly not even at "civilisation collapses" levels). I'm saying that if All The Nukes currently in existence got used (13,000), that wouldn't end humanity. It doesn't even really matter where they get used - there's no way to tile the Earth with that many kill radii (or even Cold War numbers), and much of the world population still lives in the countryside.
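Rough arithmetic for that claim (illustrative only; the damage radius is a deliberately generous assumption, not a sourced figure):

```python
# Deliberately generous assumptions, for illustration only.
import math

WARHEADS = 13_000                # figure from the comment above
DAMAGE_RADIUS_KM = 10            # assumed severe-damage radius per warhead
LAND_AREA_KM2 = 149e6            # Earth's land surface, commonly cited

covered = WARHEADS * math.pi * DAMAGE_RADIUS_KM ** 2
print(f"Area inside all blast radii: {covered:.2e} km^2")            # ~4.1e6
print(f"Fraction of land area:       {covered / LAND_AREA_KM2:.1%}") # ~2.7%
```

Even with generous per-warhead areas and no overlap, the blast zones cover only a few percent of the land surface.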

The sorts of bioweapons that could kill everyone after a nuclear war (which makes humanity *less* susceptible to infectious disease via taking out long-distance travel) are the sorts that are full X-risks in their own right (thus I wouldn't count that as the nukes being an X-risk). I'm mostly talking about Life 2.0 here.

Toby Ord put nukes as a 1/1000 X-risk per century, and AI as a very fuzzy 1/10 (though presumably "nuclear war against AI, and the AI wins" counts as "AI" rather than "nukes"). And the 1/1000 is literally his hedge against the possibility of the models being wrong in ways they don't seem to be wrong plus the potential of a new buildup to over-Cold-War levels.

Expand full comment

A lot of the reasoning was pretty disjunctive. Kind of. It was a complicated mix of conjunction and disjunction.

> any of which if wrong imply other outcomes or slower timelines.

Suppose one of the several reasons was wrong. The takeoff then takes 2 days instead of the 8 hours it would have taken if all the reasons were correct.

Expand full comment

Sure, that's possible. Or maybe it takes 2 years or 2 decades, and starts anywhere from tomorrow to 2100. There are serious people who argue for all points along that distribution, and I think it takes an unjustified level of certainty to say we're just quibbling about how rapid the FOOM in the next 5 years will be.

Personally I find the last couple weeks' developments impressive but still think there's a lot of uncertainty in when AGI arrives.

PS To argue one case for why it might take a long time: single biological neurons are much more complicated than the ReLU(sum(x)) artificial variety.

o) It takes a decent-sized deep network to predict the spiking of just one nerve cell: https://www.biorxiv.org/content/10.1101/613141v2.full.pdf

o) Memory formation seems to involve epigenetic mechanisms in neurons. You can encode an awful lot of information into epigenetic markings, and crucially, transcription, translation and transport from the cell body to the dendrites aren't that fast. If memory is partly epigenetic, neurons have to have an internal hierarchy of data stores, with some slower and larger than others. See https://www.nature.com/articles/npp2010125

So if you buy all that, PaLM's 540B parameters are still quite a lot further away from even parity in FLOPS with a natural brain than it looks.
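A back-of-envelope version of that comparison (the per-neuron parameter count below is a placeholder assumption for illustration, not the figure from the paper):

```python
# Back-of-envelope only; the per-neuron figure is an assumed placeholder.
NEURONS_IN_BRAIN = 86e9       # commonly cited estimate for the human brain
PARAMS_PER_NEURON = 1e3       # assumption: a small deep net per biological neuron
PALM_PARAMS = 540e9           # PaLM, from the comment above

brain_equivalent_params = NEURONS_IN_BRAIN * PARAMS_PER_NEURON
print(f"Brain-equivalent parameters: {brain_equivalent_params:.1e}")
print(f"Ratio to PaLM: {brain_equivalent_params / PALM_PARAMS:.0f}x")
```

Under those assumptions the gap is two orders of magnitude before even counting the epigenetic-memory point.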

Expand full comment

Neurons are complicated. Evolution wasn't optimizing for simplicity, or compute efficiency. I mean, evolution doesn't want to use too much energy. But it is totally fine doing some weird complicated thing that makes the brain 10x harder to simulate just for a 1% performance improvement.

Expand full comment

If Eliezer truly expects that unfriendly AI is an imminent danger, he should make a public statement of exactly what would be proven if his prediction doesn't come true.

If he says "failure of this prediction to come true will drastically reduce my confidence in my ideas, to the point where I'd agree they are substantially wrong", then we'll see what happens in a few years. If he says "failure of this prediction to come true won't reduce my confidence in my ideas much", then he has no skin in the game and is motivated to make arbitrarily catastrophic predictions, not accurate assessments, and we shouldn't listen to him.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

I agree with this, although I'm quite pessimistic on him making such a statement in the first place, much less conceding that repeated failure of his doomsaying to reflect reality might mean his rationale is wrong.

Expand full comment

So he's like the ZeroHedge of AI? Predicting 5 of the past 0 market crashes...sooner or later, they'll be right, though.

Expand full comment

Let's suppose you assign 1% probability to the world instantly and spontaneously vanishing in each of the next 80 years. If this was your best honest assessment, how would you set up "skin in the game"? In any world where he does still exist, he would be getting punished for a wrong prediction. Thus rational behaviour is the same as if he didn't think the world would be destroyed, plus actions to prevent the world being destroyed, plus a higher discount rate. So the closest thing to skin in the game is "is he prepared to do hard unpleasant work to reduce his perception of x-risk?" Yes he is.
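A toy version of why a straight bet can't create skin in the game here (hypothetical numbers, just to make the asymmetry visible):

```python
# Toy numbers, purely illustrative: a straight "bet on doom" only ever settles
# in worlds where doom did NOT happen, so the doomsayer can only lose.
p_doom_per_year = 0.01      # the 1% figure from the comment above
stake = 100.0               # hypothetical wager size

# Expected payoff over the two branches: if doom happens, nobody is around to
# collect; if it doesn't, the doomsayer simply pays out.
expected_settled_payoff = p_doom_per_year * 0.0 + (1 - p_doom_per_year) * (-stake)
print(f"Doomsayer's expected settled payoff: {expected_settled_payoff:.2f}")  # negative for any p_doom < 1
```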

Do you really want a norm where no one is ever allowed to warn of doom, because they can't do it with "skin in the game"? That fails hard in worlds with predictable, preventable dooms. Yet we don't want to believe every nutter. So let's evaluate the arguments on their own merits.

Expand full comment
Apr 20, 2022·edited Apr 20, 2022

Not him, but speaking personally: I'd like Yudkowsky to show some signs of intellectual humility for once instead of acting like he's the only adult on the face of the earth whenever he writes about x-risk.

And Yud isn't predicting "world spontaneously vanishes", he's predicting "AGI emerges and kills all of humankind/traps us in some kind of nightmare Hell". There's no random spontaneity here; there are specific causes and effects and implications about a model of reality. In addition, he isn't predicting "there's a 1% chance this happens each year over 80 years", he's predicting "there's a 99.99999999999999999999999999999% chance this happens in a period close enough in time that I'm not going to give a specific date-range because otherwise people might kill themselves."

ZeroHedge is a good analogy: if you constantly predict market crashes with insanely high degrees of confidence, the market doesn't crash, and you roll on to predicting the next market crash while pretending the previous prediction being false doesn't matter- that's not a good strategy on any level, because apart from anything else you lose prestige and become the boy who cried wolf, and if you want people to take the risk of the big slavering wolf lurking in the shadows seriously that's the last thing you want to be. Of course Yudkowsky has basically announced publicly he's effectively given up and no longer cares because he thinks we're all fucked anyways- but all that means is that people have even LESS reason to listen to him because, to him, reputation and being taken seriously no longer matters so why NOT make wild unsubstantiated claims? We're all dead anyways.

Expand full comment

This is probably a really dumb comment/question. I'm new to AI safety questions, so feel free to just link me the intro piece explaining it to me; I'm just thinking about this from a programming point of view. If a world-destroying AI is, at the end of the day, just a self-editing collection of scripts, the crux of the problem is how do we search the code for dangerous commands, no? This is difficult in its own right, of course: a seemingly innocuous chunk of code for maximizing paper clips could cryptically contain the program for armageddon, but consensus seems to be that an AI would have to deceive us first to destroy us, right? That it probably won't just be a simple accident like that.

So, how is it generally believed an AI would get around the transparency of its code? Would it have to find disk space somewhere to hide the pernicious code? But then we can search the transparent portion of the script for code that creates a script in a hidden directory. It seems like in order to avert transparency, an AI would have to bury the pernicious code, including the code programming its own deception, in the transparent code, presumably by burying it in such a complex manner that even the best pattern-recognition system devised by a human brain couldn't find it. But couldn't one use some measure of the gratuitous complexity of the AI's code as a red flag that it's becoming deceptive?

It seems like for an AI to become malign, it would first have to start optimizing for obscurity. Is it simply the case that any legitimate use for an AI would require it to be so complex that we wouldn't be able to distinguish gratuitous increases in complexity (the proverbial programmatic smoke bombs needed for deception) from complexity that's actually necessary?

Expand full comment

Existing AI systems don't really have transparent code. You can go look up the code on the PyTorch Github site or in conference submissions, sure, but what that software defines is huge blocks of opaque network weights and a strategy for adjusting them to fit data, via a task-agnostic learning algorithm.

How to understand and describe what a trained model's weights mean is a whole subfield in its own right ("interpretability"), and we're not as a rule very good at it.
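A minimal sketch of what "the code" of such a system looks like (illustrative only, using a tiny stand-in network rather than any real model):

```python
# Illustrative stand-in (not any real model): the human-readable "code" is a
# handful of lines, while the behaviour lives in hundreds of thousands of
# learned weights that the code itself says nothing about.
import torch.nn as nn

model = nn.Sequential(        # this is essentially all the "transparent" code
    nn.Linear(784, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)

n_params = sum(p.numel() for p in model.parameters())
print(f"Lines of readable model code: ~6; opaque learned parameters: {n_params:,}")
```

Reading the source tells you the architecture and the training recipe, not what any particular trained model will actually do.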

Expand full comment

To address the specific concern of "how can an AI work without altering its source code": this is essentially an existing cyber-security question, with answers that hint at the scale of the problem.

See "Reflections on Trusting Trust", where a backdoor can be written once into a compiler, which then invisibly propagates itself forward so long as future compilers use that same compiler. [0] This is not to say that an AI would **necessarily** hack a compiler (in fact that's probably not what's going to happen), but that arbitrarily complicated backdoors can be embedded into a toolchain invisibly.

The second is the prevalence of accidentally Turing-complete systems [1][2], which is to say that **many** combinations of seemingly harmless things can be bootstrapped into a substrate that the original programmers are not paying attention to. Once again this is **not** to say that an AI will find a way to run itself on x86 mov instructions, Magic: The Gathering, or buckets of crabs; but that unless great pains are taken, it's easy to construct an opaque version of a computer and then run potentially malicious actions on there.

Note that neither of those things requires physically manipulating large macro-scale things like CDs, nor does it require any bugs to be visible on the "original" substrate that we presume humans are intensively poring over.

[0] https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_ReflectionsonTrustingTrust.pdf

[1] http://beza1e1.tuxen.de/articles/accidentally_turing_complete.html

[2] https://www.gwern.net/Turing-complete

Expand full comment

Neural-net AI is not transparent. The AI's literally written by stitching a bunch of if-thens together semi-randomly and then seeing if it works or not, because humans apparently suck so much at writing AI explicitly that this kind of brute-force approach actually outperforms us.

So what you get out is something that does the job you told it to do, within the context of the data you trained it on. You have no idea what is going on inside its head; if you had such an idea, you would write it down instead of putting nigh-infinite monkeys on a typewriter.

So basically, "it doesn't have to get around the transparency of its code; AI researchers already did by explicitly deciding to use black-box code". It's mad science.

Expand full comment

See, this would *matter* if in its steepest-descent optimization of the weights the AI could come up with something original. But it can't. It's just a gigantic curve-fitting algorithm. Just as in the case of actual curve-fitting, you don't know whether you're going to get a straight line or a parabola or a funky fourth-order polynomial from a given set of noisy data -- but you *do* know that you're only getting a polynomial of degree less than 5, because you only put in 5 parameters. There's no way the curve-fitting algorithm can spit out an exponential and surprise you -- because you didn't allow for that in your model.
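A small sketch of that point (illustrative; the data and polynomial degree are arbitrary choices):

```python
# Give a curve fitter only polynomial parameters and it can only ever hand
# back a polynomial, however exponential the underlying process is.
import numpy as np

x = np.linspace(0, 5, 50)
y = np.exp(x)                              # the "true" process is exponential

fit = np.poly1d(np.polyfit(x, y, deg=4))   # model class: degree-4 polynomials

for x0 in (5.0, 15.0):                     # inside vs. far outside the fitted range
    print(f"x={x0:>4}: true {np.exp(x0):12.1f}   fit {fit(x0):12.1f}")
# Inside the range the fit tracks the data; far outside it, the polynomial
# falls far short of the exponential it was never able to express.
```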

When a giant skein of if-then-else blocks can come up with...a new if-then clause to add to the input list, then we've got something that can do original creation. Otherwise, everything it can come up with is baked into the model at the start by the programmer, and all you're doing is sorting out which variation on the model fits which particular set of input data, because you saw it x% more often in your training data than any other variation.

One could take the position that what human beings actually do is the same thing: we are born with some kind of model with a zillion adjustable parameters in our head, and we just optimize the parameters with our "training data" (life experience) and thereby optimize our predictions of the future, ideas we have and act on, et cetera. But the problem with that is that we learn exceedingly fast from astonishingly meager data. Human beings between the ages of 0 and 2 are exposed to maybe a million words of English, from a handful of human beings, and from that auditory experience are able to deduce (1) that these sounds constitute a communication medium, (2) that there are other minds attempting to communicate with them, using this medium, (3) that the sounds are mostly a string of auditory symbols (nouns and verbs), standing in for concrete objects and actions, plus some connecting symbols (adjectives, prepositions, articles) that indicate the relationship between the other symbols (i.e. grammar and syntax), (4) that sometimes the sounds change in semi-regular patterns (inflections) to indicate the time of the action (past or future or present), (5) how to parse and segment those sounds, despite the significant differences in how they sound coming from different speakers, (6) the actual meaning of ~5,000 of them, and (7) how to express their own thoughts in those same symbols, and say things *they have never heard expressed at all* -- original thoughts -- and (8) how to actuate throat and tongue muscles to generate them.

That is... a stunningly accurate bit of curve-fitting, if that's all that's going on. I'd be the first to agree that we've no proof anything more magical is going on -- I'm not arguing for a non-mechanical explanation here -- but that we're on to the actual mechanism with our current AI models seems extremely dubious. The difference in efficiency is not just an order or two of magnitude, it's much, much greater. It's like comparing burning wood to nuclear fusion: you can stack up all the wood on the planet and never come close. It's difficult to see anything like the current models reaching similar levels of efficiency even if you had a billion CPUs and all the electricity in North America on tap. So the idea that it's right around the corner -- or for that matter within this millennium -- strikes me as much like Medieval alchemists thinking the Quintessence and Elixir of Life were just around the corner because this decoction of willow bark had just been shown to relieve headaches.

Expand full comment

It seems this community overwhelmingly accepts that runaway AI is a high risk. It is hard for me to believe many of you could have come to hold that belief with a high level of confidence unless you had first spent time and effort collecting and steelmanning the strongest AI-risk-skeptic arguments. There are some obvious (to me) skeptic arguments I have never really seen addressed, so I assume that's just because that part of the conversation happened long ago and everyone has moved on, but where would I go to find stuff like that? Search through old essays on LessWrong?

Expand full comment

I'd love to know this too!

Expand full comment

I have attempted to argue against it here[1] and here[2], and I agree with what Titanium Dragon said here[3].

I think that overwhelmingly these arguments against fast takeoff or runaway AI are dismissed mostly by playing the "superintelligence = magic" card rather than by engaging with the evidence and assumptions we can make from the real world.

The main flaws I see repeated over and over:

- The idea that AGI will run on consumer hardware is actually an assumption that needs to be defended. I find it vastly more likely, given the current evidence of AI labs increasingly designing and producing specialized hardware, that AGI will not be able to propagate itself to your desktop computer like a virus.

- Ignoring things like manufacturing lead times, yield rates, chip design, power supplies, and all of the other constraints that go into engineering hardware/software systems. There's an idea that an AGI can optimize all of these things, which is almost certainly true, but optimizing is going to involve creating new hardware, which first involves building it with our current hardware, so that provides a hard limit on how quickly an AGI can go from brilliant designs to manifesting those designs in reality.

[1] https://www.lesswrong.com/posts/Jo89KvfAs9z7owoZp/pivotal-act-intentions-negative-consequences-and-fallacious?commentId=KBbPfrQNASq5HsuZu

[2] https://astralcodexten.substack.com/p/deceptively-aligned-mesa-optimizers/comment/6022324?s=r

[3] https://astralcodexten.substack.com/p/yudkowsky-contra-christiano-on-ai/comment/5892183?s=r

Expand full comment

Word from the chief of communication is that it's mostly serious:-

https://www.greaterwrong.com/posts/j9Q8bRmwCgXRYAgcJ/miri-announces-new-death-with-dignity-strategy/comment/Kd2wN4cTwQqzaDuu5

It's certainly rather long and unfunny for a joke.

Expand full comment

Seeing this degree of AI pessimism from Yudkowsky (and by extension from you) is honestly...really depressing. I can't blame you for discussing it, though I wish you had at least included a line about it having some pretense of being an April Fool's joke, because seeing that link all by itself is really scary. I felt compelled to look through Eliezer's article, and it didn't cause me to start crying, but I think there was a significant chance it would have.

Of course, I had already decided last month that I wanted to stay away from your articles on AI. I only clicked on this article for the other content, trying to quickly scroll past the AI stuff, and got blindsided by that horrid headline anyway. I suppose if AI is going to be covered in the same article as the other Monday topics, I should just avoid the Monday articles altogether. I wish I did not have to do this, but I am already beyond my mental capacity for depressing AI news, and I can see no other way to avoid it.

I remain fascinated by your other works.

Expand full comment
Apr 19, 2022·edited Apr 19, 2022

Man, people in this comment section are really mad about the Yudkowsky post.

A few people in particular are basically flooding the sub-threads with comments disparaging Yudkowsky with a level of... hmm, thoroughness? A level of thoroughness and passion that borders on obsession.

Personally, I found the post useful. Leaving aside the pessimistic outlook, the general reasoning of "Just because you think humanity has a very low chance of survival doesn't mean you should make really really stupid plans that have no chance of success just to feel like you did something" is something that the LessWrong AI safety crowd definitely needs to hear more often.

Aaand... well, the fact that the aforementioned commenters' recurring response is "Well if you really thought the world was about to end you'd do terrorism, but you're not doing terrorism so obviously you're just virtue signalling" is... pretty depressing, honestly?

Honestly, I just hate that argument. It's the same argument used against vegans, and climate change activists, and effective altruists. "Oh, you claim to care about cause X, but you're not thoroughly ruining your life and risking life in prison for cause X which you would if you really believed in it! That proves cause X is just a cult and a scam."

I'm sure the same argument was used by anti-abolitionists too. "Well if you believe slavery is so bad, why aren't you risking your life helping slaves escape in the underground railroad? You're not, because you don't *really* think slavery is bad, you're just looking for an excuse to act like you're better than other people".

Expand full comment

Accusing other people of "just trying to appear better by talking, but avoiding extreme action" is itself:

1) an attempt to appear better than the accused people;

2) done by talking;

3) with literally zero action other than talking.

Expand full comment

Cute.

Expand full comment

Gordon from Rune Soup's WWIII predictions: 50%+ in/after 2024, and 90%+ by 2027.

Don't know if you, Scott, or anyone else even knows of him, but there it is.

Expand full comment

Why are people so confident that Russia will not succeed at capturing another city in the next five weeks? Kherson and Mariupol are already under occupation. Strange for people to adjust the probability downwards, especially by such a large amount...

Expand full comment

Yep, I agree that this particular pseudomarket is somewhat overconfident (I would bet on 15 % instead of 5 %)

Expand full comment

The disagreement is not just about whether small-scale war leads to nuclear war, it’s about targeting. What do you target with your nuclear weapons? In particular, what does targeting London get you over targeting an actual military target?

This is one of those things that ultimately ends up in opinions far removed from the starting point.

- Is Putin a rational actor or a madman?

- History has shown repeatedly that blindly targeting civilians does not achieve military objectives. It's more likely (IMHO) that a non-democratic but professional military would act on this than a democracy, with its need to appear to be doing something (cf. Korea and Vietnam).

In other words one can make a good argument that London is safer than Moscow, at least for a first strike. After first strike, all bets are off — at that point you have raw rage and vengeance trumping rational calculation…

Expand full comment

>In particular, what does targeting London get you over targeting an actual military target?

There are actual military targets in London. In particular, a great many railway junctions, airports, and the Port of London. Logistics are always a military target; Moscow isn't going to believe that Heathrow and all the planes parked there won't be pressed into service to deliver men and materiel to a European war if all the official RAF bases are gone, so Heathrow is probably gone. Also a great many important command centers, e.g. MI6 is right across the Thames in Vauxhall. Going to war with England and not at least trying to take out M, Q, and 007 in your first strike just isn't done.

There's little reason to specifically drop a bomb inside the City of London proper, or on top of Westminster Palace or whatever. And possibly when all is done, those places will get only broken-windows levels of damage, though I'd bet on a bit more than that. But in a full game of Global Thermonuclear War, there will be a few hundred thousand dead at least in the blast areas and firestorms scattered around Greater London, possibly millions, and enough infrastructure damage that the city won't be useful for anything but a glorified refugee camp for a good while afterwards.

Expand full comment

This is Hollywood war thinking, not professional military thinking. Yes there are targets, but are there HARDENED targets? Why are you using nuclear weapons? Everything you describe (Heathrow, command-and-control) can, I assume, be taken out with traditional (but smart-guided) weapons like cruise missiles.

Nuclear weapons had possible military value of the type you describe in the days when CEP (circular error probable) was a huge deal, and ballistic missiles could land miles from their target. But as we saw in the progress from Iraq 1 through Serbia to Iraq 2, precision is now taken for granted. (Of course this is on the US side, but as far as I know Russia [and China] are close enough to equivalent in this respect.)

So unless there is a command and control bunker under London that is either

- known to exist (but not exactly where, so we can't rely on precision) OR

- so deep and fortified that it can't be taken out with a traditional bunker buster type munition

I remain unconvinced.

Expand full comment

If I'm a Russian general trying to make sure Heathrow Airport doesn't become a staging ground for a war in Europe, I'm using nuclear weapons to destroy it because I don't *have* any conventional weapons that will reach that far. Well, I could send a bomber and hope the RAF doesn't shoot it down I suppose. Or maybe a Kalibr from Kaliningrad, same problem plus I don't have nearly enough of those. But for the kind of missiles that can reliably penetrate modern air defenses, the long-range ones are pretty much all nuclear.

In part because long-range missiles are really bloody expensive. If I'm a Russian war planner trying to cripple NATO, I can only *maybe* afford *one* missile for Heathrow. And one conventional missile won't destroy an airport. It would take dozens at least, which I definitely can't afford. Or one nuclear missile, which probably costs several times as much as a conventional one but not dozens, and which guarantees the destruction of the primary target and probably quite a few secondary targets as well.

If you were under the impression that nuclear missiles were only or even primarily for attacking hard targets, you are mistaken and you are ignorant of the vast literature on nuclear strategy published by actual professionals and including declassified Cold War-era targeting analyses. Nuclear weapons have always been and continue to be considered suitable for use against hardened *or large* targets. Concentrations of military infrastructure are large enough to be worth nuking.

Expand full comment

Look, if we're going to go by "this is what my gut says", well, there's no interesting conversation to be had. BUT

https://en.wikipedia.org/wiki/Russia_and_weapons_of_mass_destruction#Nuclear_weapons_in_Russian_military_doctrine

current Russian Military doctrine is that

- you use nuclear weapons ONLY in response to nuclear weapons OR if the very existence of Russia is threatened

- you do so in a manner that will lead to de-escalation. (I assume this means things like hitting what are clearly isolated military installations, causing as little loss of civilian life as possible, and very much in an "OK, let's negotiate now" manner rather than provoking a "you want it, you got it" retaliation of 10,000 missiles from the US.)

Destroying London does not fit into any part of that.

Expand full comment