1125 Comments
User's avatar
User's avatar
Comment deleted
Mar 30, 2023Edited
Comment deleted
Expand full comment
JohanL's avatar

Another way it seems to be an echo chamber is that while people disagree, they do it within a set of premises and language uses that means all the agreements and disagreements take place within a particular, limited framework.

Expand full comment
User's avatar
Comment deleted
Mar 30, 2023Edited
Comment deleted
Expand full comment
BE's avatar

"Noam C" has been wrong about everything ever, to a first approximation. But more to the point, people who copy-paste the old "it is a pattern-matcher therefore it can't possibly be understanding anything" without even the minimal effort to consider what might be the source of their own ability to "understand" (is it magic? if it's X, why would X be un-replicable by AI?) do sound like pattern matching parrots. Not very sophisticated, at that.

Expand full comment
User's avatar
Comment deleted
Mar 30, 2023
Comment deleted
Expand full comment
BE's avatar
User was temporarily suspended for this comment. Show
Expand full comment
User's avatar
Comment deleted
Mar 30, 2023
Comment deleted
Expand full comment
Scott Alexander's avatar

Banned for the "parrot like" part.

Expand full comment
alphagrue's avatar

This doesn't seem like a very fair ban, given that he was responding to someone who had just compared him to a parrot (and who was not banned for doing so).

Expand full comment
Goldman Sachs Occultist's avatar

But he was responding to 'parrot like' accusations in the first place

Expand full comment
Scott Alexander's avatar

See Microsoft's recent paper, "Sparks Of Artificial General Intelligence: Experiments With GPT-4", at https://arxiv.org/abs/2303.12712

Chomsky is clearly a smart guy in his own field, but the article was embarrassing. For example, in the terraform Mars question, it uses the AI's unwillingness to express an opinion on a moral issue as proof that it's not intelligent. But newly-made AIs love to express opinions on moral issues and Chomsky was using a version that was specifically trained out of that behavior for PR reasons. I tested his same question on the base version of that AI and it was happy to express a moral opinion on it. If you can't even tell an RLHF-ed opinion from the AI's fundamental limitations, I don't think you should be writing articles like these.

On the general question, please see past articles I've written about this, for example https://slatestarcodex.com/2019/02/19/gpt-2-as-step-toward-general-intelligence/ , https://astralcodexten.substack.com/p/somewhat-contra-marcus-on-ai-scaling , and https://slatestarcodex.com/2019/02/28/meaningful/

Expand full comment
OdiyaDude's avatar

Scott,

I am a huge fan of yours and I have already read them but not convinced that LLMs can achieve AGI with their current design. I am in the same camp as Marcus and Noam C.

You should perhaps explain what is "true AGI"? Just because GPT-4 is doing impressive stuff does not mean it is AGI and the article you shared about it showing "sparks of AGI" does not explain how they determined that. Also we cannot take their word for it. Gary Marcus has asked for details how it was trained and the training data but the "Open"AI has not made it open.

LLMs like GPT-4 are impressive but I am not sure they can be called AGI or sparks of AGI. At least yet.

The infamous incident about GPT-4 lying to a raskrabbit that it is not AI bot and it is a visually impaired person still does not make this interaction as AGI. GPT-4 by large training set trained itself to use deception. Just like Bing chatbot trained itself to use abuses by training itself on internet troll chat.

So garbage in garbage out still applies to LLMs and they are simply extrapolating stuff from their massively huge training data.

As Noam C has explained, Human brain does more than extrapolation from the data it has through experience. Until we understand how human mind truly works, we can forget about AGI. Some people truly think deep learning is actually deep understanding. The "deep learning" is not deep at all. Deep means number of hidden layers of computation where each layer has math functions (with parameters) and each layer transforms data and passes on to next layer and the system fits those params to training data. This is not 'deep learning' by any stretch of imagination.

Expand full comment
Pas's avatar

> Until we understand how human mind truly works, we can forget about AGI.

That's a rather strong claim. GPT and these LLMs are missing a few important things. First agency (but that's "easy" to simulate, just tell it to tell a human what it wants to do, then the human can feed back the results), and more importantly a loop that allows itself to direct its attention, some kind of persistence.

But these components/faculties are coming. And eventually if the AI can form models that contain itself, and it recognizes its own relation to other entities (even if just implicitly), and even if just through periodic training, but if it gets (again, even if implicitly) to influence the world to manifest some change in its model, then at point the loop will close.

Expand full comment
Nate's avatar

“Not convinced” is null hypothesis football imho. “Unless you can convince me there’s risk I’m going to assume it’s zero” instead of “unless you can convince me it’s safe I’m going to assume it’s risky”

Expand full comment
David Piepgrass's avatar

> not convinced that LLMs can achieve AGI with their current design. I am in the same camp as Marcus and Noam C.

I agree that LLMs are not AGIs (and while I'm not in Noam Chomsky's camp for anything outside linguistics, I feel like Gary Marcus makes good points).

> Until we understand how human mind truly works, we can forget about AGI.

I disagree strongly. Modern AIs can do better than the best humans at a variety of tasks; understanding the human mind is unnecessary to build AGI, and precisely because we *don't* understand the human mind, humans are likely to invent an AGI with quite a different architecture than whatever the human mind has, something in a different location in "mind design space"[1]. This already happened for deep learning, which is obviously different from human brain architecture. Before that it happened for neural nets, which use backpropagation, which animal brains can't use.

This is a big part of why building AGI is risky and also why estimating the risk is hard. It's not just that we can't know how deadly AGI will be, but also that we can't know where in mind design space AGI will be located or how deadly that particular location will be. (Mind you, if AGI *were* in the same place in MDS as humans, humans vary remarkably, so AGI might end up being more deadly than Hitler, or safer than the average grandmother, or both — with some AGIs fighting for humanity, others against, and still others ignoring the "should we kill all humans" controversy and watching cat videos instead.)

And I think LLMs (and other modern AIs) are a terrifying omen. Why? Well, it has to do with them having "too much" intelligence for what they are.

First of all, here is a machine with no senses — it's never seen anything, never heard anything, never touched anything, never felt hot or cold, hungry or happy or angry or nostalgic. Yet an "emotionless Helen Keller" GPT3 base model can often pass Turing tests without ever so much as practicing on Turing tests. It simply *can* pass the test, by its very nature. No human can do anything like this. If you lock an American in a padded cell and show zim random Chinese books and web sites (without images) 24/7/365 for 20 years on a big-screen TV, they might go insane but will not become fluent in Mandarin.

Second, it's small compared to human neural networks. Human brains have over 100 trillion synapses. It seems to me that in an AI neural net, weights are analogous to synapses and biases are analogous to neurons, so a GPT3-level AI has ~1000x fewer synapses than a human brain. (That AIs are way more efficient than humans on this metric is unsurprising from an evolutionary-theory standpoint, but still important.)

Third, it'll probably get smaller over time as new techniques are discovered.

Fourth, I believe GPT3 is wildly overpowered compared to what an AGI actually needs. Observe: in a sense it's hard for a human to even perform on the level of GPT2. Suppose I ask you to write a one-page story in the style of (and based on the characters of) some random well-known author, one word at a time with *no edits* — no backspace key, once you write a word or punctuation mark it is irrevocably part of the story, and you can't use writing aids, you must write the story straight from your mind. And I give you the first sentence. I think (if you're familiar with the author) you can do a better job of making the story *coherent* or *entertaining* than GPT2, but GPT2 is likely to be able to beat you when it comes to matching style and word choices of the original author. So GPT2 already beats the average human in some ways (and it can do so lightning-fast, and never gets tired, and can do this 24/7/365 if you like.)

GPTs (transformers) are fundamentally handicapped by their inability to edit. An AGI will not have this handicap; they'll be able to write a plan, review the plan, critique the plan, edit the plan, and execute the plan. An AGI doesn't *need* to replicate ChatGPT's trick of writing out a coherent and accurate text on the first try, because it can do as humans do — review and revise its output, do research on points of uncertainty, etc. — and therefore a linguistic subsystem as complex as GPT2 is probably sufficient for an AGI to match human intelligence while greatly exceeding human speed. And if a GPT2-level linguistic subsystem is sufficient, well, perhaps any PC will be able to run AGI. [Edit: IIUC, a typical PC can run GPT2 inference, but training requires modestly more processing power. You do not need a supercomputer for training — they used a supercomputer not because it's necessary, but because they wanted results quickly; no one is willing to train an AI for 18 years like they would a human.]

If a typical PC can run a smart-human AGI at superhuman speed, then how much smarter can a pro gaming PC be? How much smarter can a supercomputer be? How much more powerful is a supercomputer that has hacked into a million gaming PCs?

--------

I disagree with Eliezer Yudkowsky that the very first AGI is likely to kill us all. Maybe its goals aren't that dangerous, or maybe it has cognitive limitations that make it bad at engineering (a skill that, for the time being, is crucial for killing everyone).

But once that first AGI is created, and its architecture described, thousands of AI researchers, programmers and kids in moms' basements find out about it and dream of making their own AGI. Perhaps some of the less powerful AGIs will show the classic warning signs: trying to prevent you from turning it off, lying to you in order to make progress on a subgoal, manipulating human emotions*, using the kid's mom's credit card to buy online services, etc.

But I think Eliezer would tell you that if this happens, it is already too late. Sure, not-that-powerful AGIs won't kill everyone. But just as people start to notice that these new AGIs sometimes do immoral things, someone (let's call him Bob) with too many AWS credits will program an AGI *carelessly* so that its goals are much different than the human intended. Maybe Bob will give it too much processing power. But maybe the AGI simply decides it can work toward its misconfigured goal faster by creating another AGI more powerful than itself, or by creating a worm to distribute copies of itself all over the internet. At this point, what happens may be out of anyone's control.

Maybe it doesn't kill us all. But since it's smarter than any genius, it has certainly thought of every threat to its goals and means, and how to prevent the apes from threatening it. And since it's faster than any genius and capable of modifying copies of itself, it is likely to evolve very quickly. And if it determines that killing 10% or 100% of humans is the safest way to protect itself from humans trying to turn it off, then sure, why not?

It's worth noting that the most dangerous AGI isn't the most typical one. There can be a million boring, safe AGIs in the world that will not save us from the one Bob misconfigured.

[1] https://www.lesswrong.com/posts/tnWRXkcDi5Tw9rzXw/the-design-space-of-minds-in-general

Expand full comment
Gašo's avatar

Great comment. We went from "Do androids dream of electric sheep?" to "Do AGIs watch cat videos?" in a half century. Whatever AGIs will be, I doubt they will be "united", i e. globally aligned in between themselves. (regardless of whether they will be "aligned to human goals"; I see no evidence that HUMANS are globally aligned on human goals, so what do you expect from AGIs?)

About "humans only being able to make edits", I believe that the (GPT-powered) Dall-E had for long years been based upon "iterative refinements" in its image generation. "Iterative refinements" are just as applicable to token string production...

Expand full comment
Sandro's avatar

> Please. GPT-4 has no understanding of the world. It is a pattern matching parrot. Very sophisticated one at that.

Maybe humans are all sophisticated pattern matching parrots. You literally don't know, so this isn't really an argument against the dangers of AI.

Expand full comment
magic9mushroom's avatar

>You literally don't know

That's getting close to the sort of question for which "personal gnosis" is a valid answer; any person knows that he/she personally is not a p-zombie.

I'm not saying that Banned Dude (don't know who originally posted it) is right, of course.

Expand full comment
Sandro's avatar

> any person knows that he/she personally is not a p-zombie.

No they actually don't, because that would assume their perceptions have some kind of direct access to reality, as opposed to merely perceiving some kind of illusory projection.

Expand full comment
magic9mushroom's avatar

By "p-zombie" I mean something that behaves like a human but without an associated first-person experience.

I know I have a first-person experience. It doesn't matter for this purpose whether that experience corresponds to reality or not; even if I'm experiencing the Matrix, *I am experiencing*.

If something else contradicts that, *the something else is wrong*; I know that I think and perceive more directly than I can know anything else. As I said, personal gnosis.

Expand full comment
Sandro's avatar

I know what you meant. I'm saying you only think you have first-person experience. This "knowing" is a cognitive distortion, like other perceptual illusions. People who don't see things in their blind spot can swear up and down something is not there, that doesn't make it true. We only ever have indirect access to reality, your "first-hand experience" is no exception.

Expand full comment
magic9mushroom's avatar

>People who don't see things in their blind spot can swear up and down something is not there, that doesn't make it true.

How does that relate? That's perceptions not corresponding to reality. They experience something false, which inherently necessitates that they experience.

Expand full comment
Jonluw's avatar

I've seen this argument before, and it's baffling to me. Are you operating off some strange definition of what it means to "have first-person experience"?

There exists, at this very moment, the qualitative experience of seeing letters being written on a computer screen. An experience which "I" "am having" by any reasonable definition of the words.

I understand that I can't convince you that *I* have qualitative experiences, but I can't understand how in the world you can doubt the existence of *your own* phenomenology, unless you are somehow taking issue with the use of the words "I" or "have".

Expand full comment
The Ancient Geek's avatar

> his "knowing" is a cognitive distortion, like other perceptual illusions.

That's not something you *know* , it's a belief.

Expand full comment
Andrew Gough's avatar

I have direct access to reality. This access is pretty handy for making things like microprocessor chips or building a fort out of sofa cushions.

Expand full comment
yrtetviokl's avatar

If they are perceiving anything in the first person, then they are not a p-zombie. If they are perceiving an illusory projection, that already means they have subjective experience and are hence not a p-zombie.

Expand full comment
Sandro's avatar

Perception is not experience. Don't conflate the two.

Expand full comment
yrtetviokl's avatar

Can you elaborate? What does it mean to perceive something without having the subjective experience of having perceived that thing?

Expand full comment
OdiyaDude's avatar

No. We humans are not JUST pattern matchers.

A human baby/child has less training data (eg experience) in her brain than say GPT-4 yet a baby/child can do reasoning better than the latest fad in AI. Unless we understand how human brain is able to do this without lots of training data, we can forget about AGI.

LLMs can do what they can purely based on training data and sure some of the training data may have given them insights (like how to be deceptive and lie to get what you want) but those insights do not make them come close to AGI.

Expand full comment
Sandro's avatar

> A human baby/child has less training data (eg experience) in her brain than say GPT-4 yet a baby/child can do reasoning better than the latest fad in AI.

You're not comparing like with like. A human baby is not a blank slate, GPT-4 was. Billions of years of evolution culminated in a human brain that has pretrained biases for vision, movement, audio processing, language, etc.

> but those insights do not make them come close to AGI.

Current LLMs are not AGIs. That does not mean AGI is not the same sort of stochastic parroting/pattern matching we see in LLMs. Just adding "step by step" prompts and simple "check your work" feedback [1] significantly improves their reasoning capabilities. We've barely scratched the surface of the capabilities here.

[1] https://arxiv.org/abs/2303.17491

Expand full comment
David Piepgrass's avatar

As I noted upthread[1], a human baby has in one sense vastly *more* training data, as GPT has no senses — no sight or hearing or touch or even emotion. As a baby's father myself, I have found it absolutely remarkable how slowly she learns and how difficult she is to teach. (Keeping in mind that GPTs are not AGIs) I think a second- or third-generation AGI would be able to learn faster than this, if it were fed training data at the same rate and in the same way. But if I check lists of milestones of cognitive development, I find that my baby is largely above-average, if a bit slow in the linguistic department. Some AIs learn fast even when training is slowed down to human speed[2]; not sure why we'd expect AGIs to be any different.

[1] https://astralcodexten.substack.com/p/mr-tries-the-safe-uncertainty-fallacy/comment/14387287

[2] https://www.youtube.com/watch?v=A2hOWShiYoM

Expand full comment
User's avatar
Comment deleted
Mar 30, 2023
Comment deleted
Expand full comment
BE's avatar

I'm not sure the argument works the way you assume it does. Over the last years, we see more and more evidence that climate change is _not_ as bad as the "typical worst outcomes" would have us believe. The worst scenarios decrease in likelihood. The impact of the the scenarios that seem likely is being evaluated as less disastrous. Some advantages of climate change or at least opportunities it opens up are getting more attention.

A better analogy would be the CFC crisis. I wish people on all sides of these debates would be referencing it more frequently.

Expand full comment
Tyler Cowen's avatar

Come on Scott, you're just not understanding this...for a start, consider the whole post! Tyler Cowen

Expand full comment
Mike G's avatar

TC - huge fan of yours (and Scott's). And in this case, I had generally same reaction as Scott's. REQUEST. Can you have Yud on for an emergency Tyler Talk, or perhaps Scott + Yud? I would estimate 10k+ of your overlapping readers are scared shitless over this. Would welcome thoughtful rebuttal of Yud if it's out there.

Expand full comment
Excavationist's avatar

Seconded. Also an avid fan of both of you.

Tyler, I fully agree that "AI represents a truly major, transformational technological advance," and that this is going to make things weird and hard to predict precisely. But... isn't that what we have probabilities for? You say that "*all* specific scenarios are pretty unlikely," but, following Scott's argument, what is a "specific scenario," exactly? This seems like a way to escape the hard problem of putting a numeric probability on an inherently uncertain but potentially catastrophic outcome.

Ultimately, my overriding sense from reading your post (and reading/listening to you lo these many years) is that you're frustrated with stasis and excited by dynamism. I agree! But even if you have a strong "change is good" prior, as I do, it still seems correct to weigh the likelihood function as well -- that is, engage with the AI-specific arguments rather than depend on historical analogies alone.

Expand full comment
Kevin's avatar

Probabilities do not work well when smart people completely disagree on priors. Some people think the chance of AI apocalypse is 20%, some think it’s one in a million. There are no priors these people agree on.

Most of the “probabilistic” reasoning here is simply argument by exhaustion. Ten paragraphs talking about probabilities with no actual mathematics. Then concluding, therefore there is a 30% chance of AI apocalypse.

Expand full comment
pozorvlak's avatar

That's not what this post is saying *at all*. It's saying that Tyler's mathematics-free arguments aren't enough to establish a near-zero probability, and (separately), that Scott has put a lot of thought into this and come out with the number 33%. The arguments for 33% in particular aren't given here, they're spread across a lot of previous posts and the inside of Scott's head. The point of this post is to rebut Tyler's argument for near-zero, not to support Scott's arguments for 33%. It's *Tyler* who's doing the thing you accuse Scott of.

Expand full comment
JDK's avatar

The 33% is not result of a lot of thought. It is hand waving.

The small but non- zero probability is also a lot of hand waving. As is a probability of almost certain also hand waving.

Expand full comment
Ozryela's avatar

It's not hand waving, it's virtue signaling.

Scott is a member of a sect that strongly beliefs in an AI apocalypse. So he cannot give a low probability because he'd lose status within his sect if he did. But at the same time. He wants to have main stream appeal and he cannot give a high probability, because he'd be seen as a crackpot.

The 33% is very carefully chosen. It's below 50%, so he cannot be accused of believing in an AI apocalypse, but still high enough that he doesn't lose guru status within his sect.

It's a very rational and thought out answer, but at a meta level. It's not a real chance of course, that's not the point.

Expand full comment
pozorvlak's avatar

I'm sure it's not the result of a watertight mathematical argument, but I'm not sure how one would even construct such an argument. But Scott's definitely put a lot of *thought* into it - see his comment https://astralcodexten.substack.com/p/mr-tries-the-safe-uncertainty-fallacy/comment/14070813 for a partial list of posts he's written examining various aspects of this problem.

[Also, nitpick: the expression is "hand *waving*", as in "waving one's hands dismissively". Hand *waiving* would mean saying "Nah, I don't need a hand. Stumps were good enough for my father, and his father before him."]

Expand full comment
JDK's avatar

I'm with you. A lot of hand waving.

Let's see the math!

Expand full comment
Tyler Cowen's avatar

And here is a bit more: I am a big fan of Scott's, but this is a gross misrepresentation of what I wrote.  Scott ignores my critical point that this is all happening anyway (he should talk more to people in DC), does not engage with the notion of historical reasoning (there is only a narrow conception of rationalism in his post), does not consider Hayek and the category of Knightian uncertainty, and does not consider the all-critical China argument, among other points.  Or how about the notion that we can't fix for more safety until we see more of the progress?  Or the negative bias in rationalist treatments of this topic?  Plus his restatement of my argument is simply not what I wrote.  Sorry Scott!  There are plenty of arguments you just can't put into the categories outlined in LessWrong posts.

Expand full comment
Excavationist's avatar

Maybe it would help to disentangle the policy/China question from the existential risk question? That is, it may be the case that OpenAI or anyone else unilaterally desisting wouldn't prevent (e.g.) China from plowing full steam ahead. But that might still be a very bad idea. It's not clear to me whether you think this is fundamentally a collective action problem, or whether you really want to dismiss the risk entirely.

Expand full comment
Mike G's avatar

well said

Expand full comment
Kevin's avatar

It’s not dismissing the risk at all. There are two types of risks.

1. risks from anyone getting AGI

2. risks from China getting AGI before the US

Be a good Bayesian. We should focus on the risk that maximizes the danger, times the effectiveness of our working on it. That is an argument for focusing on #2, because we have many effective actions we can take.

Expand full comment
Nechninak's avatar

It seems to me that this dichotomy does not work very well if we consider the actions they (possibly) imply.

If you want to avoid risk #2, you need to focus on actions that also prevent that the development of completely unaligned power-seeking doom-bringing AGI is sped up accelerated, at least if you believe that there is a relevant alignment problem at all. But in the anti-#2 actions subset, there are many actions that do just the opposite.

Expand full comment
magic9mushroom's avatar

To separate your #1 and #2 properly, I'd specify it as

1) risks from anyone building unaligned AGI

2) risks from China building aligned AGI before the USA.

However, as I see it the chance that any AGI built in the next 30 years (i.e. any neural net AGI) is aligned is <0.01%. So building AGI ourselves would subject us 99.99% to #1 if China doesn't build AGI for the sake of 0.01% (generously) of avoiding #2 if China builds AGI and also gets the 0.01%.

The correct response to China attempting to build AGI is to stop them, which fulfils both #1 *and* #2, rather than to pre-emptively commit suicide. This is definitely physically possible; if all else fails, a thousand nukes would do the trick.

Expand full comment
Ryan W.'s avatar

1. What are the risks from China building highly intelligent but poorly aligned AI. To China? To the US?

2. Need it be said that nuclear war carries with it its own set of existential risks?

Expand full comment
magic9mushroom's avatar

1. To everyone: hostile AI wages war against humanity.

2. Global thermonuclear war is a GCR, but it's not a notable X-risk. Rural populations can't be directly killed by any plausible quantity of nukes, fallout's too localised and too short-lived, and nuclear winter is mostly a hoax (the models that produce everyone-dies kind of outcomes tend to look like "assume everything within fire radius of the nuke is high-density wooden buildings, then assume 100% of that wood is converted into stratospheric soot", and even in those the Southern Hemisphere basically does fine).

Expand full comment
Jack's avatar

But you can still argue within point 1! Scott has taken issue with Tyler's reasoning about point 1. Tyler has responded 'well that's a gross misrepresentation because you haven't talked about point 2 (and some other stuff)'. But that's not how it works.

Expand full comment
Jon Brooke's avatar

The risk isn't along the lines of nuclear weapons, where the technology waits inertly for a human to start a cascade of disaster. It's more along the lines of a new virus that once released into the atmosphere will have a life of its own and be unable to be contained.

So, like Covid, whether it starts in China or the US would seem to make little difference. There's very little point in us rushing to be the first to start the conflagration just to stop our rivals getting a jump on us.

Expand full comment
Ryan W.'s avatar

What are the minimum requirements for such a thing? Does it need to be embodied? Or will ChatGPT55 kill us with spam and highly detailed scams targeting our relatives?

Expand full comment
Jon Brooke's avatar

It will mean that you can no longer trust anything that you haven't seen written down in a physical book published prior to 2022.

If you think about it, a lot of our modern systems rely on trust in a shared history and understanding of science etc. So I'm more thinking of your second option.

We'll likely end up as Morlocks running machinery that we no longer understand. Some of us might become Eloi - living lives of luxury without purpose.

Expand full comment
Martin Blank's avatar

It is not really clear to me that China getting it first wouldn't be good for AI safety. I kind of trust their society to be more responsible and safe than some private US company. This is one of those areas where I really feel old style nationalism is making people stupid.

"other competitor nations with other values = bad". When IDK the things that China is strong on seem like the things we want to people controlling AI to be strong on (long term planning, collectivist versus individual focus, info sec priority, social conservatism).

Expand full comment
User's avatar
Comment deleted
Mar 30, 2023
Comment deleted
Expand full comment
Martin Blank's avatar

Well was it China or the NIH? It doesn't seem remotely clearly to me China "owns" the pre-pandemic Wuhan research any more than the US does. Though obviously we have a big incentive to blame it on the other.

Expand full comment
av's avatar

I don't think it was NIH that had direct continuous control over biosafety practices in Wuhan. Sure, both NIH and China authorised GoF research, but it was a Chinese lab with Chinese scientists that dropped the ball.

Expand full comment
Goldman Sachs Occultist's avatar

And your ratioalization of China throwing scinetists in prison for talking about the virus and refusing to make any effort to stop it spreading outside of China?

Expand full comment
static's avatar

" I kind of trust their society to be more responsible and safe than some private US company. "

This seems hopelessly naive. China has private companies. If regulation is the answer, and China is not pursuing "full stop" regulation, then what regulation are they pursuing? How exactly are they "more responsible"?

Expand full comment
Martin Blank's avatar

>then what regulation are they pursuing? How exactly are they "more responsible"?

I don't think we have a good handle on what safety measures they are taking, but they are known to be conservative and authoritarian.

Whereas the ethos of Silicon Valley is "break things and hope you get a giant pile of money from it, maybe it won't be too bad for normies".

Expand full comment
static's avatar

"they are known to be conservative and authoritarian."

They are known to be conservative and authoritarian in regards to personal freedom, but not in terms of environmental destruction or technology development.

Expand full comment
alesziegler's avatar

For some actual information about Chinese AI regulation I recommend following ChinaTalk. E.g. this (there is a paywall, but some important information is before it): https://www.chinatalk.media/p/tiktok-live-show-ais-regulatory-future

Expand full comment
Goldman Sachs Occultist's avatar

They are known to be conservative about *things that threaten the CCP's power*.

They want to be THE global hegemon for the rest of time, and its naive to think they wouldn't be willing to gamble with humanity's future when the payoff is (the potential for) permanent hegemony.

Expand full comment
pozorvlak's avatar

AGI being developed under an authoritarian regime might not be worse for x-risk, but it's worse for s-risk (notably, "humans live under an inescapable totalitarian regime for the rest of time").

Expand full comment
Martin Blank's avatar

This seems a better argument about the hysterics regarding China.

Expand full comment
Shanghaied's avatar

Given China's horrific environmental record, trusting them to better manage externalities than the west seems hopelessly naive.

In addition, if China is first to an aligned AGI you can expect the result to be a CPC not grinding down in the face of humanity forever. However, if you are inconvenient to that enterprise you will not need to worry about it. You and your family will be dead. That is how collectivist societies deal with inconvenient people and they are well aware that nits make lice.

Expand full comment
Matt B's avatar

"This is all happening anyway" doesn't seem like an airtight argument.

https://www.vox.com/the-highlight/23621198/artificial-intelligence-chatgpt-openai-existential-risk-china-ai-safety-technology

Think human cloning, challenge trials, drastically slowing bio weapon dev, gene drives, etc.

Expand full comment
NJ's avatar

Negative bias is an understatement. What evidence would Scott need to change his opinion? We can (hopefully) all agree that doomsday scenarios are bad. I’m asking what would compel Scott to update his prediction to, say, < 1%.

Expand full comment
Matt B's avatar

Actually, I agree. @scott this is worth your time. Dig in and give Tyler's post another longer deeper response. This is your chance to defend the arguments of AGI risks to a prominent skeptic

Expand full comment
Jack's avatar

I suspect he will. He likes doing contra contra contra

Expand full comment
Steeven's avatar

Could you expand on the China argument? I think Scott's argument is that no matter who builds the AI, whether China, the U.S. or anyone, that could potentially kill everyone, while you are more talking about China getting technological hegemony.

Expand full comment
Bardo Bill's avatar

Everyone brings up the China argument as if it's supposed to be self-evident that Chinese researchers would produce an AI more harmful to humanity than Silicon Valley entrepreneurs, and that conclusion is... not obvious to me.

Expand full comment
Doug S.'s avatar

The only major nuclear power plant disaster was in a Communist country; one thing authoritarian governments are bad at is recognizing when they're making a mistake and changing course.

Expand full comment
magic9mushroom's avatar

Fukushima was level 7 as well, although it wasn't quite as bad and basically came down to a single mistake ("tsunami wall not high enough") rather than the long, long series of Bad Ideas that went into Chernobyl.

Expand full comment
moonshadow's avatar

...I actually can't work out whether the suggestion is that a single mistake leading to Fukushima is better or worse than it taking a long chain of things needing to be just so to get Chernobyl

Expand full comment
Anton's avatar

Here to register my disagreement that Fukushima had a single cause.

Chernobyl had a chain of engineering and human failues - but an RBMK can be run safely with minor modifications (the other Chernobyl reactors ran for 14 years afterwards). They tried really really hard to get it to explode, even if that's not what they intended.

The chain of engineering mistakes that went into Fukushima are a bit worse. The arrogance, regulatory and engineering failures are worse than Chernobyl in my opinion. The put backup generators 10m above sea level based on a fradulent study estimating the worst earthquake to be 10x weaker than other reported ones along the coast.

Expand full comment
Martin Blank's avatar

And if there are other things they are good at it is resisting public pressure to take short term gains over long term plans. Two can play your silly game.

The Chinese government has a lot of pluses and minuses over the US one, it is not remotely obvious to me which one would be wiser to trust with AI if I had to pick one.

Expand full comment
Martin Blank's avatar

Totally agree.

Expand full comment
static's avatar

"the China argument as if it's supposed to be self-evident that Chinese researchers would produce an AI more harmful to humanity than Silicon Valley entrepreneurs"

That's not the argument. If we start with the case that there is some percentage chance AGI will end humanity, regulation stopping it being developed in the US will not stop it in China (or elsewhere). It will end humanity anyway, so stopping it in the US will not change the outcome.

The secondary argument is that it won't end humanity directly, there will likely be a lot of steps before that. One of which is that is under the control of a nation state, that nation will be able to accumulate a lot of power over others. So, the intermediate consequence of stopping it in the US and not stopping it in some other place, is that the other place will end up strategically dominating the US for some unknown period of time until the AGI ends up strategically dominating humanity.

Expand full comment
Sandro's avatar

> It will end humanity anyway, so stopping it in the US will not change the outcome.

If the probability is X% if everyone is working on it, if a bunch of nations except China stop walking that path then the probability falls below X%. I have no idea how you can conclude that this isn't relevant.

Expand full comment
static's avatar

Why does the probability necessarily fall below X%? Might it not just push out the timeline at which the risk occurs? Would a six month pause have any measurable effect?

Another way to think about it, people are willing to risk their own personal extinction rather than be subjected to living under Chinese rule. It's not a given that Chinese domination is preferable to death.

Expand full comment
Michael Kelly's avatar

China currently uses phones to keep people from straying outside their neighborhood. I'm sure they would never misuse AI against their people.

Expand full comment
Martin Blank's avatar

I thought we were concerned about AI destroying our civilization, not making the Chinese police state 20% worse?

Expand full comment
User's avatar
Comment deleted
Mar 30, 2023
Comment deleted
Expand full comment
smopecakes's avatar

China will intensely fear the prospect of creating something they can't control

Expand full comment
B. Wilson's avatar

I just read your post.

What I notice more than anything is that both you and Scott are arguing about discursive features, and both articles seem to express a fair amount of frustration, which is reasonable given the format. What I also notice is that the information content about AI is extremely small. If anything "AI" is just an incidental setting where the meta-discourse happens.

Scott is reacting to your post, which is seems to be reacting to some Other Side. My understanding of your argument is that "Big Things are happening Very Soon whether you like it or not, and nobody knows how things will play out, so stop doomsaying, y'all." (In my head you're from southern Texas.)

One feature of your article does set off minor alarm bells for me: its heavy use of deonotological arguments. These are exhibited by liberal use of phraseology such as "truly X", "Y is a good thing", "no one can Z", "don't do W", "V is the correct response", etc. In contrast, Scott's article here levies more consequentialist arguments—"if you do X, then Y happens", "here is failure mode Z", etc.

Personally, changing my beliefs based on deontological/moralistic arguments typically involves a strong invocation of trust and/or faith, whereas consequentialist rhetoric gives me some meat with which to engage my current beliefs. The former feels more like a discontinuous jump while the latter a smooth transition.

/2cents

Expand full comment
Bill Benzon's avatar

"What I also notice is that the information content about AI is extremely small."

But then the actual capabilities of existing AI and any AI that's forseeable from current tech, are all but irrelevant to AI x-risk discourse. It's mostly a fantasy built on magical entities – "super-intelligence" – using magical powers – "recursive self-improvement."

Expand full comment
Donald's avatar

We can't accurately foresee the path from current tech to superintelligence.

That doesn't mean the path doesn't exist, or that it will take a long time. It means we are wandering forward in thick fog, and won't see superintelligence coming until we run right into it.

Expand full comment
Bill Benzon's avatar

Nor can we figure out how to get to Valhalla. But that doesn't mean Valhalla doesn't exist. It just means we've not figured how to get there. But we'll know we're there when Odin opens the gates and invites us in.

Expand full comment
David Piepgrass's avatar

Nor can we, the people of 1901, figure out how to make a commercial airplane that can seat 100 people. That doesn't mean such an airplane is impossible. But we'll know we're there when the stewardess opens the Boeing 707 and invites us in.

Expand full comment
Mr. Doolittle's avatar

We can't even prove that superintelligence can theoretically exist, or that we have any means to achieve it, or that the methods we are employing could do so.

We don't even know what intelligence is, let alone superintelligence. That doesn't mean there's a 0% chance of a super intelligent AI taking over the world or killing everyone. It should mean that we don't consider this point more strongly than other completely unknown possibilities. The same group of people who are most worried about AI starting the apocalypse seem to almost universally reject any other kind of apocalypse that we can't rule out (see, e.g., every theological version of apocalypse).

Expand full comment
Michael's avatar

> We can't even prove that superintelligence can theoretically exist

Meaning you're unsure it's possible to be smarter than humans? Is there some other definition of superintelligence?

Expand full comment
Goldman Sachs Occultist's avatar

>But then the actual capabilities of existing AI and any AI that's forseeable from current tech, are all but irrelevant to AI x-risk discourse.

We cannot yet align the systems that exist. This is absolutely relevant. Aligning a superintelligent machine will be much, much harder.

> It's mostly a fantasy built on magical entities – "super-intelligence"

There's nothing "magical" about it, unless you "magically" think brain tissue does something that silicon never can.

> using magical powers – "recursive self-improvement."

Again, nothing magical. If we can make AIs smarter, why the heck couldn't AI past a certain intelligence threshold make AIs smarter?

Expand full comment
1morestudent's avatar

Thank you for this very clear writeup of one of the distinctions! This provided me with quite some new insights! :)

Expand full comment
Meadow Freckle's avatar

Tyler's reference to Knightian uncertainty and Hayek is a gesture at the idea that no, in fact, you can't and shouldn't try to make predictions with hard-number probabilities (i.e. 33% chance of AGI doom). Some risks and uncertainties you can quantify, as when we calculate a standard deviation. Others are simply incalculable, and not only should you not try, but the impulse to try stems from pessimism, a proclivity toward galaxy-brained argumentation, and an impulse toward centralized control that's bad for the economy. In these matters, no a priori argument should affect your priors about what will happen or what we should do - they provide zero evidence.

His all-critical China argument is that if we don't build AGI, China will. Slowing down or stopping AGI is something like a fabricated option [1], because of the unilateralist's curse [2].

So if you had to choose between OpenAI building the first true AGI and a government-controlled Chinese AI lab, which would you pick? I expect Tyler is also meaning to imply that whatever lead the US has over China in AI development is negligible, no matter how much we try to restrict their access to chips and trade secrets, and that the US and China and other players are unlikely to be able to stick to a mutual agreement to halt AGI development.

I agree with Tyler that Scott misrepresented his argument, because while Tyler does emphasize that we have no idea what will happen, he doesn't say "therefore, it'll be fine." His conclusion that "We should take the plunge. We already have taken the plunge." is best interpreted as meaning "if you don't have any real choice in whether AGI gets built or not, you may as well just enjoy the experience and try to find super near-term ways to gently steer your local environment in more positive directions, while entirely giving up on any attempt to direct the actions of the whole world.

I think that the fundamental inconsistency in Tyler's argument is that he believes that while AGI development is radically, unquantifiably uncertain, he is apparently roughly 100% confident in predicting both that China will develop AGI if the US slows down or stops, AND that this would be worse than the US just going ahead and building it now, AND that there's nothing productive we could do in whatever time a unilateral US halt to AGI production buys us to reduce the unquantifiable risk of AGI doom. That's a lot of big, confident conjunctions implicit or explicit in his argument, and he makes no argument for why we should have Knightian uncertainty in the AGI case, but not in the US/China case.

We can point to lasting international agreements like the nuclear test ban treaty as evidence that, in fact, it is possible to find durable diplomatic solutions to at least some existential risk problems. Clearly there are enormous differences between AGI and nuclear bombs that may make AGI harder to regulate away or ban, but you have to actually make the argument. Tyler linked today on MR to a well-thought-through twitter thread on how to effectively enforce rules on AI development [3], saying he's skeptical but not explaining why.

In my view, Tyler's acknowledging that the risk of AGI doom is nonzero, I'm sure he thinks that specific scenario would be catastrophically bad, he explicitly thinks there are productive things you could do to help avert that outcome and has funded some of them, he tentatively thinks there are some well-thought-out seeming approaches to enforcement of AI development rules, and he demonstrates a willingness to make confident predictions in some areas (like the impossibility of meaningfully slowing down AI development via a diplomatic agreement between the US and China). That's all the pieces you need to admit that slowing down is a viable approach to improving safety, except he would have to let go of his one inconsistency - his extreme confidence in predicting foreign policy outcomes between the US and China.

I think Scott, despite the hard number he offers, is the one who is actually consistently displaying uncertainty here. I think the 33% figure helps. He doesn't need to predict specific scenarios - he can say "I don't know exactly what to do, or what will happen, or how, but I can just say 33% feels about right and we should try to figure out something productive and concrete to lower that number." That sounds a lot more uncertain to me than Tyler's confident claims about the intractability of US/China AI competition.

[1] https://www.lesswrong.com/posts/gNodQGNoPDjztasbh/lies-damn-lies-and-fabricated-options

[2] https://forum.effectivealtruism.org/posts/ccJXuN63BhEMKBr9L/the-unilateralist-s-curse-an-explanation

[3] https://twitter.com/yonashav/status/1639303644615958529?s=46&t=MIarVf5OKa1ot0qVjXkPLg

Expand full comment
Sandro's avatar

> So if you had to choose between OpenAI building the first true AGI and a government-controlled Chinese AI lab, which would you pick?

Two organizations at least doubles the chances that one of those AIs is misaligned. I don't think your question has the easier answer you seem to imply. If China's AGI is aligned, or if they have a greater chance of creating an aligned AI than OpenAI, then that option could very well be preferable.

Expand full comment
Meadow Freckle's avatar

I’m trying to rearticulate Tyler Cowen’s argument, not state my own views, to be clear.

Expand full comment
Michael Sullivan's avatar

That's clearly logically not true.

If slumlords in Mumbai are going to build housing for 10,000 people, the risk of a catastrophic fire that kills at least 100 is X%.

If also, normal housing developers in the US are going to build housing for another 10,000 people, the risk of catastrophic fire is not "at least 2*X%."

Expand full comment
Sandro's avatar

The two organizations pursuing AI are largely using the same techniques that are shared by most machine learning researchers. By contrast, building standards and materials in Mumbai slums and the US suburbs differ drastically, so your analogy is invalid.

The incentives of Chinese and US researchers are slightly different, but 2x factor is fine for the ballpark estimate I was giving. Don't read too much into it, the point is that risk scales proportionally to the number of researchers, and this is only mitigated somewhat by incentives that optimize for specific outcomes like, "don't create an AI that destroys the CCP's ability to control the flow of information in China".

Expand full comment
Goldman Sachs Occultist's avatar

This implies China is significantly less likely to align their AI. There's little basis for this. Even if China is more likely to make unaligned AI, this is dwarfed by the increased likelihood of AGI in the next 50 years with both countries working on it.

Expand full comment
Michael Sullivan's avatar

I think this is completely wrong, and shows some of the sloppiness of thinking here.

Making AI -- aligned or unaligned -- isn't a matter of rolling dice. Either the current set of techniques are basically on a clean path to AGI, or they aren't, and some further breakthrough is needed. If the current techniques are heading towards AGI (if scaling is all we need and maybe some detail work on tuning the models, but no fundamental breakthroughs or complete changes of approach needed), then AGI is going to happen on a pretty straightforward timeline of training data + more GPUs, and whether two countries are working on it or one or five is unlikely to change that timeline in a macroscopic way.

If AGI is coming on a short timeline with fundamentally the techniques we have today plus more scaling, then, again, whether it's aligned or not isn't a matter of rolling dice. Either the techniques we use today, plus perhaps some ones that we learn over the course of that scaling up process, produce an aligned AGI or an unaligned one. They're pretty much either sufficient or insufficient. Again, whether there's one AGI or several isn't a very large factor here.

Expand full comment
Bradley's avatar

Thanks for this write-up, I think it’s very well done.

Expand full comment
Scott Alexander's avatar

>> "Come on Scott, you're just not understanding this...for a start, consider the whole post!"

I'm a big fan of your work and don't want to misrepresent you, but I've re-read the post and here is what I see:

The first thirteen paragraphs are establishing that if AI continues at its current rate, history will rebegin in a way people aren't used to, and it's hard to predict how this will go.

Fourteen ("I am a bit distressed") argues that because of this, you shouldn't trust long arguments about AI risk on Less Wrong.

Fifteen through seventeen claim that since maybe history will re-begin anyway, we should just go ahead with AI. But the argument that history was going to re-begin was based on going ahead with AI (plus a few much weaker arguments like the Ukraine war). If people successfully prevented AI, history wouldn't really re-begin. Or at least you haven't established that there's any reason it should. But also, this argument doesn't even make sense on its own terms. Things could get really crazy, therefore we should barge ahead with a dangerous technology that could kill everyone? Maybe you have an argument here, but you'll need to spell it out in more detail for me to understand it.

Eighteen just says that AI could potentially also have giant positives, which everyone including Eliezer Yudkowsky and the 100%-doomers agree with.

Nineteen, twenty, and twenty one just sort of make a vague emotional argument that we should do it.

I'm happy to respond to any of your specific arguments if you develop them at more length, but I have trouble seeing them here.

>> "Scott ignores my critical point that this is all happening anyway (he should talk more to people in DC)"

Maybe I am misunderstanding this. Should we not try to prevent global warming, because global warming is happening? If you actually think something is going to destroy the world, you should try really hard to prevent it, even if it does seem to be happening quite a lot and hard to prevent.

>> "Does not engage with the notion of historical reasoning (there is only a narrow conception of rationalism in his post)"

If you mean your argument that history has re-begun and so I have to agree to random terrible things, see above.

>> "Does not consider Hayek and the category of Knightian uncertainty"

I think my entire post is about how to handle Knightian uncertainty. If you have a more specific argument about how to handle Knightian uncertainty, I would be interested in seeing it laid out in further detail.

>> "and does not consider the all-critical China argument, among other points"

The only occurrence of the word "China" in your post is "And should we wait, and get a “more Chinese” version of the alignment problem?"

I've definitely discussed this before (see the section "Xi risks" in https://astralcodexten.substack.com/p/why-not-slow-ai-progress ) . I'm less concerned about than I was when I wrote that post, because the CHIPS act seems to have seriously crippled China's AI abilities, and I would be surprised if they can keep up from here. I agree that this is the strongest argument for pushing ahead in the US, but I would like to build the capacity now to potentially slow down US research if it seems like CHIPS has crippled China enough that we don't have to worry about them for a few years. It's possible you have arguments that CHIPS hasn't harmed China that much, or that this isn't the right way to think about things, but this is exactly the kind of argument I would appreciate seeing you present fully instead of gesture at with one sentence.

>> "Or how about the notion that we can't fix for more safety until we see more of the progress?"

I discussed that argument in the section "Why OpenAI Thinks Their Research Is Good Now" in https://astralcodexten.substack.com/p/openais-planning-for-agi-and-beyond

I know it's annoying for me to keep linking to thousand-word treatments of each of the sentences in your post, but I think that's my point. These are really complicated issues that many people have thought really hard about - for each sentence in your post, there's a thousand word treatment on my blog, and a book-length treatment somewhere in the Alignment Forum. You seem aware of this, talking about how you need to harden your heart against any arguments you read on Less Wrong. I think our actual crux is why people should harden their hearts against long well-explained Less Wrong arguments and accept your single-sentence quips instead of evaluating both on their merits, and I can't really figure out where in your post you explain this unless it's the part about radical uncertainty, in which case I continue to accuse you of using the Safe Uncertainty Fallacy.

Overall I do believe you have good arguments. But if you were to actually make them instead of gesture at them, then people could counterargue against them, and I think you would find the counterarguments are pretty strong. I think you're trying to do your usual Bangladeshi train station style of writing here, but this doesn't work when you have to navigate controversial issues, and I think it would be worth doing a very boring Bangladeshi-train-station free post where you explain all of your positions in detail: "This is what I think, and here's my arguments for thinking it".

Also, part of what makes me annoyed is that you present some arguments for why it would be difficult to stop - China, etc, whatever, okay - and then act like you've proven that the risk is low! "Existential risk from AI is . . . a distant possibility". I know many smart people who believe something like "Existential risk is really concerning, but we're in a race with China, so we're not sure what to do." I 100% respect those people's opinions and wouldn't accuse them of making any fallacies. This doesn't seem to be what you're doing, unless I'm misunderstanding you.

Expand full comment
Arbituram's avatar

I'm actually not convinced by the China argument. Putting aside our exact views on the likely outcomes of powerful AI, surely the number one most likely way China gets a powerful AI model is by stealing it from an American company that develops it first?

That's broadly how the Soviets got nukes, except that AI models are much easier to steal and don't require the massive industrial architecture to make them run.

Expand full comment
pozorvlak's avatar

Worse: stealing AI models doesn't require the massive infrastructure to *train* them, just the much more modest infrastructure to run them. There are LLMs (MLMs?) that can run on a laptop GPU, I don't think we'd even contemplate restricting Chinese compute access to below that level even if we could.

Expand full comment
Lech Mazur's avatar

Disagree. China will able to produce powerful AI models. There are many Chinese researchers, and they do good work. China might be slowed down a bit by U.S. export limitations, but that's it.

Expand full comment
Arbituram's avatar

I actually agree with you; the Soviets still would have developed nukes eventually without espionage, but it's pretty clear it would have taken longer, and I think this situation is comparable (with the noticeable difference that stealing the plans / data/model for AI is effectively like stealing the nukes themselves.

Expand full comment
av's avatar

Stealing an AI model from the US would not increase existential risk much if the US companies are not allowed to train models more advanced than GPT-4.

Expand full comment
Jonathan Ray's avatar

The CHIPS act will give china a large disadvantage in compute, and they already have a large disadvantage in the availability of top talent because if you're a top 1%er you don't want to live in China -- you go study in the US and stay there.

Expand full comment
Nolan Eoghan (not a robot)'s avatar

Whenever i hear a definitive statement on China that’s basically dismissing chinas potential (or threat) a quick google contradicts it.

https://www.reddit.com/r/Futurology/comments/129of5k/as_america_obsesses_over_chatgpt_its_losing_the/

Also plenty of Chinese graduates and post graduates go back to China.

Expand full comment
David Piepgrass's avatar

I expect Chinese and Americans will produce different designs for AGIs, and more generally two AI researchers would produce different designs.

On the one hand, two different designs would give two chances for an AGI design to kill us all. On the other hand, if there are two designs, one might be safer in some clear way, and conceivably most people could be persuaded to use the safer design.

Edit: I don't know the first thing about Chinese AI, but a top comment on [1] says

> I am not a defense expert, but I am an AI expert, and [...] [China] certainly is not leading in AI either."

> Urgh, here's what China does. China publishes a million AI "scientific" papers a year, of which none have had any significant impacts. All of the anthology papers in AI are from USA or Canada. Next year China publishes another million useless papers, citing other chinese papers. Then if you naively look at citations you get the impression that these papers are impactful because they have lots of citation. But its just useless chinese papers citing other useless chinese papers for the purpose of exactly this: looking like they are leading.

Another commenter adds

> The really most influential AI breakthroughs in 2022, IMO:

> DALLE-2 - openAI, USA

> Stable Diffusion, LMU Munich Germany

> ConvNeXt, Meta AI, USA

> ChatGTP, open AI, USA

> Instant NGP, Nvidia, USA

> Generative AI was really big this year. What AI breakthrough was made in China? I cannot think of any important one, ever.

[1] https://www.reddit.com/r/Futurology/comments/129of5k/as_america_obsesses_over_chatgpt_its_losing_the/

Expand full comment
Martin Blank's avatar

What exactly is so bad about China beating Silicon Valley? You trust Silicon Valley with AI safety more than China? I am not sure that is my knee jerk reaction and I am not a particular Sinophile.

Expand full comment
Lech Mazur's avatar

If the AI can be controlled, do you really believe that it would be better in the hands of the CCP rather than US tech companies? On what basis or track record do you make this claim? I don't recall tech companies causing millions of deaths, suppressing pro-democracy protests, persecuting religious or ethnic minorities, forcing sterilizations, stifling political dissent, or supporting widespread censorship, for example.

Expand full comment
pozorvlak's avatar

On the other hand, the CCP has lifted millions of people out of poverty (after previously, er, plunging them into poverty, or at least more dire poverty than they were previously experiencing). On the gripping hand, it's not clear to me that a CCP-AGI would value poverty reduction once Chinese former-peasants were no longer needed for industrial growth.

Expand full comment
Goldman Sachs Occultist's avatar

>The CCP has lifted millions of people out of poverty

Wrong. Western technology did. CCP prevented access to this technology.

And millions of *chinese* people were lifted out of poverty. I don't expect the CCP to focus on helping people in other countries, but the fact that Chinese people were improved under their watch says little about their concern for humanity in general.

Expand full comment
Martin Blank's avatar

>On what basis or track record do you make this claim? I don't recall tech companies causing millions of deaths,

Well they haven't really had the power to in the past. If tech companies could cause millions of deaths to pump the stock (or make their leaders putative gods (or controllers of god)) its not clear to me they would say "no".

>suppressing pro-democracy protests,

Who cares about democracy? Not important on the scale of talking about existential threats.

>persecuting religious or ethnic minorities, forcing sterilizations, stifling political dissent, or supporting widespread censorship, for example.

Their support of widespread censorship is exactly the sort of thing which might help them keep an AI under wraps. As for those other issues those are bad, but they aren't really things that are that unique, the US/West was pursuing those policies in living memory.

OMG the Chinese don't like the Uighurs, and treat them horrible is not some knock down argument they won't be safe with AI.

We can be sure the US tech company AI will make sure to use all the correct pronouns and not make anyone sad with trigger words, while it transports us all to the rare metal penal colonies in Antarctica for that one like of a Mitch Romney tweet in 2009. That is cold comfort.

Expand full comment
Sandro's avatar

Yet another point: capitalism drives people to take shortcuts to be competitive, and shortcuts on alignment are not a good idea. The CCP has a much firmer grip on what they permit, and that could be good for safety. The matrix of possibilities is:

1. China creates aligned AI.

2. US creates aligned AI.

3. China creates unaligned AI.

4. US creates unaligned AI.

It's not unreasonable to think that the probability of option 4 is higher than 3, and that the probability of option 1 is higher than 2, which would make China a safer bet if we're really concerned with existential risk.

Expand full comment
Drethelin's avatar

1 It's not "Capitalism" that drives people to take shortcuts, it's laziness and incentives, which obviously also exist in non-capitalistic systems. Look at Chernobyl for just one example.

In addition, China is hella capitalistic these days.

Expand full comment
Goldman Sachs Occultist's avatar

>Yet another point: capitalism drives people to take shortcuts to be competitive, and shortcuts on alignment are not a good idea.

Private firms in China are responsible for most AI development, and in any case China does not have a history of not taking shortcuts.

Expand full comment
Goldman Sachs Occultist's avatar

1 happening before 2 (alignment in the narrow sense of doing what its operators want) could be catastrophically bad, but not as bad as 3 or 4.

Expand full comment
Bardo Bill's avatar

I am no fan of the CCP. I despise them in fact. But should we put our faith in Peter Thiel, Mark Zuckerberg, and Elon Musk? Silicon Valley has been functionally psychopathic for at least the last decade.

If AI is on the brink of some sort of world-altering power then I can't see the Silicon Valley types suddenly deferring to ideas about the common good and the virtue of restraint when they've demonstrably behaved as if they had no interest in those virtues for years. The CCP, while awful, may at least feel constrained by a sense of self-preservation.

Expand full comment
Martin Blank's avatar

Exactly, I don't think this is a knock-down argument, but it is one that demands more of a framework to oppose than "OMG China/other bad".

Expand full comment
Goldman Sachs Occultist's avatar

>I am no fan of the CCP. I despise them in fact. But should we put our faith in Peter Thiel, Mark Zuckerberg, and Elon Musk? Silicon Valley has been functionally psychopathic for at least the last decade.

Thiel and Musk both express concern about AI risk, much more than the median voter or politician

Expand full comment
JamesLeng's avatar

That seems like a fairly solid CV of being willing and able to lock human-level sapient beings in boxes and subject the nuances of their loyalties to unrelenting scrutiny, which seems extremely relevant to the classic "distinguish a genuinely friendly, submissive AI from a malevolent trickster" problem.

I don't actually think that's the best way to approach AI alignment, or for that matter running a country - long term growth requires intellectual freedom. But for somebody who figures there'll be a need to censor, persecute, and sterilize paperclip-maximizers, "move fast and break things" is not a reassuring slogan.

Expand full comment
Phil H's avatar

"I don't recall tech companies causing millions of deaths, suppressing pro-democracy protests..."

I absolutely do. It was in 1930s Germany, not 2030s America, but I don't have that much more faith in the American political system. It's good, but I wouldn't bet the farm on it. And America's politics took a sharp turn to the left/right/wrong, I have no faith that its tech companies would do anything other than pander. Germany's tech companies supported the war efforts.

China's definitely worse at present. But if we're making predictions about what might happen in the future, you can't just make the blanket assumption that the political truths of now will continue into the future.

Expand full comment
Nolan Eoghan (not a robot)'s avatar

You are comparing text companies to the Chinese State. You should compare them to the US government, or the US in general. . And some of those claims are laughable in that context.

Expand full comment
MissingMinus's avatar

Yes?

I imagine the distinction is that it isn't going to be 'China' or 'Silicon Valley' developing aligned AGI, but that specific researchers within them will make it. However, I expect the typical Chinese research facility to be more loyal and under the guidance and control of the Chinese government, which as other comments have mentioned, has a variety of issues.

For 'Silicon Valley', I would expect that actual alignment successes to come out of OpenAI or the various alignment companies that have been spawned (Anthropic, Conjecture, etc.), which I do actually trust a lot more. I expect them to keep their eye on the ball of 'letting humans be free and safe in a big universe' better than other companies once governments start paying a lot of attention. I do also expect these 'Silicon Valley' companies to be more likely to succeed at alignment, especially because they have a culture of paying attention to it (to varying degrees..).

I do actually rate 'if China manages to make AGI they can align to whatever' as having okay chances of actually grabbing a significant chunk of human value. This does depend on how hard the AI race is, how much existing conflict there is, and how slow the initial setup is. Though note that this is conditional on 'having developed a way to align an AGI to whatever', which I do think is nontrivial and so implies some greater value awareness/understanding. I do still, however, prefer 'Silicon Valley' companies because I believe they have a greater likelihood of choosing the right alignment target and have downside risk.

Though, it is likely that we won't manage aligned AGI, but 'who will create a scary unaligned AI system early' is a separate question.

Expand full comment
Mark's avatar

Me dumb: Google showed me only really crowded passenger trains when I ask for "Bangladeshi train station style of argumentative writing". Could so. explain, please? - Having read a lot of Tyler, but not much of the Sequences - I venture to guess: "gesturing at a lot of stuff, seeming to assume you must know the details already, though many in the audience may very well not" - btw: Without the follow up, I would have assumed "Tyler Cowen"'s first post to be fake. ;)

(one nitpick: I disagree with Scott's statement: "If you have total uncertainty about a statement (“are bloxors greeblic?”), you should assign it a probability of 50%." - nope, either one explains what a bloxor is and what greeblic, or I won't give any probability. And if it means: "Are aliens green-skinned?" I have kinda total uncertainty, and would still give much less than 50%.

Expand full comment
Ivo's avatar

You don't have total uncertainty about 'are aliens green-skinned'. You know that the universe has brought forth living beings in all shades of colors. Without thinking about what colors would be more reasonable than others for aliens, at least every of those colors occurring should be an option. So of course it's much less than 50%.

Expand full comment
Mark's avatar

Got me. Still: do kresliks worgle at a chance much nearer to 50%? No idea at all? I still advice not to assign it a 50% chance. Either most stuff one talks about worgles (thus kresliks may do at higher 50% too - if worgle=exist/are breakable/ ... ) or it is a specific activity as in "playing" which most nouns do not do (well, bumblebees kinda do): then less. Anyways: some other comments here note that Hayek/Knight(?)/whoever said: those are not cases to assume probabilities. I agree.

Expand full comment
Michael Wheatley's avatar

"Bangladeshi train station" is a reference to this tweet:

https://twitter.com/cauchyfriend/status/1595545671750541312

Expand full comment
Mark's avatar

Thank you! That tweet is fun and kinda true. But is a tweet of 1248 likes a meme more than 5% of ACX readers are supposed to know? Hope TC got it.

Expand full comment
Isaac King's avatar

Scott was responding to Tyler, it wasn't intended for all the other readers...

Expand full comment
Isaac King's avatar

I was wondering this too. Have a tip. https://manifold.markets/link/9dGIScnA

Expand full comment
shako's avatar

I've made it.

Expand full comment
J Mann's avatar

I seem to be blocked. Would someone quote the relevant part?

Expand full comment
Michael Wheatley's avatar

The full tweet goes:

===================================

alex tabarrok MR post: very detailed argument explaining policy failure, lots of supporting evidence. Restrained yet forceful commentary

tyler cowen MR post: *esoteric quote on 1920s bangladashian train policy* "this explains a lot right now, for those of you paying attention"

===================================

Expand full comment
static's avatar

Maybe the argument that you need to address is that the risk of human extinction from AGI happens after the risk of China getting AGI that they can leverage for decisive strategic advantage. Since we don't know how long the period between decisive strategic advantage level AGI and human extinction level AGI is, we may be signing up for 100 years of domination by China (or which other country not under US regulation manages to get there first).

To the extent US regulation concerns AGI destroying the US, the US destroyed by human extinction worry is subordinate to the worry of US destroyed by some other country using AGI to achieve decisive strategic advantage.

Expand full comment
Hyperborealis's avatar

Assigning a risk probability to an event is saying the event is not a case of Knightian uncertainity. Given that we don't understand the actual nature of intelligence, I don't see how you can make that claim. For all we know, the simulation of intelligence that is ChatGPT is as far away (or near!) to AGI as clockwork automata. I don't see that you address Tyler's point about Knightian uncertainty.

Expand full comment
CLXVII's avatar

Knightian uncertainty is present to various degrees in nearly all event prediction scenarios. Would you claim that its almost never right to assign a risk probability?

Expand full comment
Mr. Doolittle's avatar

Some people consider the chances of Jesus coming back and starting the apocalypse to be very high. What probability risk would you assign to this concern? If you're religious (and Christian) you may apply a fairly high chance to this. If you are an atheist, you may assign a very low probability (approaching zero, but not completely zero). I think Tyler is essentially an atheist when it comes to AI doomsday, so he's assigning a very low probability the same way he does about lots of other complete unknown doomsday scenarios he doesn't believe in.

Expand full comment
CLXVII's avatar

I’d disagree with considering these complete unknown scenarios. The holds-their-beliefs-seriously Christian would rightly assign a high probability to the Jesus scenario, and the atheist would rightly assign near-0 probability to it. This doesn’t make it a complete unknown. Neither is the AI scenario a complete unknown. Though our knowledge of AI is hugely incomplete, it is also significantly different from 0!

If Tyler has the analogous stance toward AI risk, he’d be best served by explaining it, rather than the offhanded dismissal that his current post presents.

Expand full comment
Jordan's avatar

Bruh where was this in the blog post?

e.g.

>> >> "Does not consider Hayek and the category of Knightian uncertainty"

>> I think my entire post is about how to handle Knightian uncertainty. If you have a more specific argument about how to handle Knightian uncertainty, I would be interested in seeing it laid out in further detail.

You have your intentions, but what your post is "about" is to readers is variable. I haven't learned of "Knightian uncertainty". There is no way that, for me, your post could be about something that 1) isnt tied into your ideas and 2) isn't something I'm familiar with.

Expand full comment
WaitForMe's avatar

I think it's fair to say your post is about something without specifically using that term. You can have a post about different ethical systems without using their very particular academic names, for example.

Expand full comment
Jordan's avatar

A few scenarios I want to get your take on

1) If I present a utilitarian perspective to addressing a problem and I don't make any reference to the word "utilitarianism", then is my post about a) utilitarianism or b) addressing a problem?

2) I may not even know that utilitarianism is a concept. Can my post be about something that I don't know?

3) If I do know about utilitarianism and my post uses its ideas without the term, is my post about utilitiarianism or is about the ideas that the concept represents.

Concepts are, after all, a means to an end and not the end themselves. e.g. "The sun" is not the actual sun.

Expand full comment
WaitForMe's avatar

1) No, I think the word "about" implies it is the main topic.

2) Yes, this happens all the time. We rediscover things other people in the past have already talked and written about every day. It's unlikely you'll have anything new and interesting to say if you haven't read any of the previous discussion, but certainly you are talking about the same thing. The name doesn't matter much, if at all.

3) If it talks about the ideas but just doesn't use the term, yes, I think that is fair to call it "about utilitarianism". Basically restating my post above. If all you had to do to get everyone on board they you were talking about utilitarianism was start your writing with "this is about utilitarianism" and then everything after that was still accurate in that context, you were already writing about it. This is similar to point number 2. You can be writing about a concept without even realizing it. Certainly if you're writing about fate and do we have the ability to choose, you're writing about determinism, whether or not you use or know about that term.

Expand full comment
Drea's avatar

I just asked Bard this, and I'm smarter because of it:

The concept was first described by Frank Knight in his book "Risk, Uncertainty, and Profit." Knight argued that there is a fundamental distinction between risk and uncertainty. Risk is measurable and can be insured against, while uncertainty is not measurable and cannot be insured against. Knightian uncertainty is named after Knight because he was the first to explicitly distinguish between risk and uncertainty and to argue that uncertainty is a fundamental part of life.

Here are some tips for dealing with Knightian uncertainty:

Be prepared for change. One of the best ways to deal with Knightian uncertainty is to be prepared for change. This means being willing to adapt to new situations and being open to new ideas.

Be flexible. Another way to deal with Knightian uncertainty is to be flexible. This means being willing to change your plans if necessary and being able to roll with the punches.

Be optimistic. Finally, it is important to be optimistic when dealing with Knightian uncertainty. This means believing that things will work out in the end and that you will be able to overcome any challenges that come your way.

Expand full comment
James Thomas's avatar

This really helps to clarify what Tyler's point is, I think - so thanks!

Of course the optimism point isn't necessarily true - you could have something with high risk and high uncertainty.

Expand full comment
Drea's avatar

Thank you both for engaging on this. I only wish our discourse tools were better.

Tyler has two strong points I don't quite hear you addressing.

1) We are not good at regulating

2) We will need the (benefits of) AGI tools to figure out alignment

On 1), I don't actually believe that "We designed our society for excellence at strangling innovation." Instead, I think we have evolved a regulatory structure that can shift innovation from one area to another, often not in the intended way (because the regulatory hive-mind isn't that smart). Innovation is never really strangled, and when it's suppressed it just ends up more corrupt or malformed.

For example, I really wish that in 1996 we had had viable micropayments, instead of ending up using advertising as a cheap and easy hack to pay for search. In fact, I don't mind the idea of allowing AGI products, as long as the users have pay for them. I'd rather we work at the incentives level, than the "political flunkies in a room come up with clever but incomprehensible administrative rulings" level.

On 2) this isn't the "it's a race between" argument, either between the "US" and "China" or between various companies, that you make in your post responding to OpenAI. It's also not the "Without AGI, Moloch" argument (though there's truth in that).

It's that we need the smarter tools to figure out alignment. Eliezer's pessimism is exactly why we need to be USING these tools more than we were. MIRI tried to do this alignment work in secret (fearing the release of something) with the tools they had. They don't think that found us the magic bullet. Now I think we all need to do the work in open, with everyone using the new LLM tools as they are today and tomorrow. Only with the AI will we find ways to work with the AI.

Expand full comment
magic9mushroom's avatar

AGI won't help align AGI, because by assumption your AGI is not aligned and hence does not want to be replaced by something aligned.

That is, if you ask a misaligned AGI for how to build an aligned AGI, it will lie to you and tell you how to make another similarly-misaligned AGI. The exception is if you can detect misalignment, in which case you don't need the misaligned AGI in the first place.

(Also, running a misaligned AGI *at all* is existentially dangerous.)

No, slow and steady is the only way - and I mean *really* slow and steady, using GOFAI rather than neural nets. The "what about the lunatics rushing ahead with neural nets" problem yields to military force.

Expand full comment
Aapje's avatar

Unaligned is not the same as actively malicious.

Expand full comment
magic9mushroom's avatar

"Not actively malicious" is still more than we know how to get out of gradient descent.

Expand full comment
Kenny's avatar

"Actively malicious" is (almost) as hard to achieve as 'aligned' but 'unaligned' is still almost certainly almost perfectly 'unaligned to human values at all'.

Expand full comment
dogiv's avatar

I think the idea is you use a roughly human-level AGI as an assistant (though how you make sure of this, I don't know), and you use the limited interpretability tools we can develop in the short term to see if it's being deceptive. Plus you make it explain everything to you until you understand it, which makes a fake alignment plan more difficult.

Was GOFAI ever actually on a path that leads to AGI?

Expand full comment
Gilbert's avatar

I think expressed in Less Wrong-talk, Prof. Cowen is saying that getting too obsessed about the Yudkowskyan* doom scenario is https://www.lesswrong.com/tag/privileging-the-hypothesis. More generally he says the future is muddy enough that we cant locate (in the sense of https://www.lesswrong.com/tag/locate-the-hypothesis) *any* specific hypothesis plausible enough to promote it to special attention. Better to admit we have absolutely no clue at all.

Of course if insist on translating that to a meaningless folk-Bayesian "probability" for the Yudkowskyan scenario it will be in https://www.lesswrong.com/tag/pascal-s-mugging territorry.

Except that putting it this way is already basically conceding a Yudkowkyan framing of the question, which he won't do, because see above.

*Not his word, refusing to call them "rationalist" is my stubborness, not Prof. Cowen's.

Expand full comment
Steven Postrel's avatar

Nicely done there.

Expand full comment
Gres's avatar

Privileging the hypothesis is when your focus on one hypothesis isn’t justified by your previous arguments. In this case, Scott’s “all other species with successors have died” argument justifies the level of focus. The argument might be right or wrong, but Scott isn’t privileging the hypothesis as long as he personally believes the argument.

More broadly, I would draw this parallel. “You are standing on the bridge of an alien warship. There is a big, red button. You know the warship has the power to destroy the Earth, but that button probably doesn’t destroy the Earth. Do you press the button?”

You still need to make a decision, even when you have no idea. If you’re really not sure how bad the outcome is likely to be, you can afford to pay moderately high costs to gain slight reductions in uncertainty if you think you’ll get a better outcome by doing so.

Expand full comment
James Thomas's avatar

Isn't this just privileging the hypothesis that AGI is similarly different to us in a way other successor species are different to their predecessors, no?

Expand full comment
Level 50 Lapras's avatar

This is a great summary of the situation.

Expand full comment
James Thomas's avatar

Excellent post

Expand full comment
alesziegler's avatar

>the CHIPS act seems to have seriously crippled China's AI abilities, and I would be surprised if they can keep up from here.

Oh, so the only thing that was needed to stop China from developing AI was to slap an embargo on them? I am assuming that by CHIPS Act you mean the embargo on semiconductor technology, not the actual CHIPS Act, which is about developing US semiconductor manufacturing.

How come Yudkowsky and other geniuses concerned about AI risk didn't advocate for such an obvious solution? Or did I miss it?

Edit: My actual half-assed opinion is that Chinese are perfectly able to circumvent the embargo, but they'll ban any sort of socially disruptive AI way harder than the US and ship transgressors to gulag. See their approach to covid.

Expand full comment
Laplace's avatar

1. I don't think Yudkowsky agrees that much with Scott on this. Training models is compute and engineering talent intensive, running them, not so much. Absent very strong security measures, model theft is a very serious possibility.

2. They did. Yudkowsky especially has been clear for years that the actually sane thing to do politically would be to control, track and limit the production and sale of GPUs, globally. Obviously, this is very hard and likely completely politically infeasible. They advocated it nevertheless, at length. He didn't single out China much because in his world view, it's not like the US going ahead on AGI alone would help, currently.

Expand full comment
alesziegler's avatar

Ok, I am evidently not well acquainted with his work. I could never bring myself to actually read his long-winded essays to the end ¯\_(ツ)_/¯

Expand full comment
Daniel Kokotajlo's avatar

Well said.

Expand full comment
Markus Ramikin's avatar

> Overall I do believe you have good arguments.

Why?

Expand full comment
Greg G's avatar

For what it's worth, my impression of your post was similar to Scott's. IMO, the strongest points are the competition with China (can't speak to the DC comment) and using progress to advance safety, but the general mood felt like "why worry."

Not sure how much Knightian uncertainty should apply. Per Scott's analogy, the 100 mile long spaceship is almost here.

Expand full comment
Level 50 Lapras's avatar

> Per Scott's analogy, the 100 mile long spaceship is almost here.

You're assuming your conclusion here.

Expand full comment
JohanL's avatar

I like your article a lot. It seems like a "no but seriously, what should we do?" rather than empty apocalypticism.

Expand full comment
Kristin's avatar

This is a solid argument against the open letter: "Our previous stasis [...] is going to end anyway. We are going to face that radical uncertainty anyway. And probably pretty soon. So there is no “ongoing stasis” option on the table."

But if that's your core argument, your other arguments have a distinct ring of "appeal to the consequence." Just because we have to face radical uncertainty doesn't mean that "all possibilities are distant" and thus can be treated as roughly equally (im)probable. I agree with you that we have to take the plunge and accept that we're now living in moving history, but that doesn't mean that AGI isn't a potential x-risk.

The Bostrom/Yudkowsky argument strikes me as analytic/gears-level/bottom-up whereas the Robin Hanson argument seems to be empirical/top-down (i.e. zoom out and view AI as a massive but non-specific tech disruption and predict its affect accordingly). It's like predicting the outcome of a pandemic with a (better) SIR model vs by looking at historical pandemics. The analytic model captures the potential black swanness of AI - as in, its potential to be very much unlike any previous tech disruption - while the empirical model captures the "no, dummy, everybody always thinks *this* tech disruption is in a category of its own, and thus fail to predict the mitigating factors."

IMO both arguments are worth considering, and I do think LWians are a little excessively fixated on the analytic (partly because they will never find falsification looking ever deeper into that model). But it is still a good model, to the point that it is very hard to argue against on its own terms - which is perhaps why so many highly intelligent people dismiss it with fallacy?

One can disagree with Yudkowsky et al. about the right policy decision (as you clearly do, I do as well) without dismissing the argument. Another uncertainty we must sit with, I think, is between these two models.

Expand full comment
Doctor Mist's avatar

When Covid first appeared I had the layman’s stupid idea that this was a “novel coronavirus”, i.e. a disease we had never before encountered and therefore had no defense against: an Andromeda strain that *might* just wipe us out completely.

I don’t know if that was a top-down approach or a bottom-up approach or just a stupid approach, and I don’t know what I should learn from the experience.

Expand full comment
Gres's avatar

Do you no longer believe that? I thought it was an Andromeda strain that most people had no defences against, and it might have wiped us out if it had been more deadly and faster-spreading from the start.

Expand full comment
Doctor Mist's avatar

It seems that a lot of people have some amount of natural immunity by virtue of exposure to previous coronaviruses. In any case the fact that that early cruise ship was not devastated should have tipped me off that it was not the doom sentence I was picturing.

Sure, it might have been much worse, and it was no walk in the park as it was. But it was not an unprecedented event.

I’m probably oversimplifying but it seems to me that the main source of disagreement between Zvi and Tyler is whether AGI will really be an unprecedented event. I’m inclined to think it would be, or at least might be — there aren’t that many steps to EY’s argument — but then I remember that I’m stupid and thought that about Covid.

And I’m cursed with *two* dogs in this hunt. I have a cryonics contract, so I really want there to be a world in a hundred or two hundred years…but I also suspect AGI is a prerequisite for a revival procedure.

Expand full comment
Level 50 Lapras's avatar

> The Bostrom/Yudkowsky argument strikes me as analytic/gears-level/bottom-up whereas

A real "bottom up" argument might try to figure out whether magical nanotech is actually physically possible, rather than just confidently assuming it by fiat.

Expand full comment
Jordan's avatar

Unfamiliar with your work independently, I also find this to be true about Scott. When they're great they're great. When they're callous their rationing, coherence and thoughtfulness goes down an absurd amount. A lot of generalization and shallow statements (I imagine for the sake of a quicker reply). I can sniff out the tone in their writing most of the time now. I still read most but I don't engage these ones. Came into the comments to see if you had replied and now I'm out of here!

Expand full comment
cdh's avatar

I read Scott as saying something like "My prior is to freak out and Tyler's is to not freak out, and here are reasons why I think my prior is better." But whether, having freaked out, we could do anything about the outcome we are freaking out about, was not discussed, even though it was a major emphasis of Tyler's post. I am not qualified to critique Scott, but if I were, I would give this a 0/10 steelman score.

Expand full comment
Jordan's avatar

Well distilled

Expand full comment
Jack's avatar

Couldn't it be the case that he is taking issue specifically with the bits that he has included? Maybe he's omitted the parts he agrees with or, at least has no particular objection to. I don't see how that's a problem, let alone a 'gross misrepresentation'.

> Existential risk from AI is indeed a distant possibility, just like every other future you might be trying to imagine. All the possibilities are distant, I cannot stress that enough. The mere fact that AGI risk can be put on a par with those other also distant possibilities simply should not impress you very much.

This reasoning simply does not work, for exactly the reason outlined in the post. It's a bad argument. Surely it's fine to object to it, while ignoring the completely separate 'all-critical China argument'? Or do you want him to go line-by-line pointing out all the parts he agrees with or has no view on in order to be allowed to criticise any part?

Expand full comment
Goldman Sachs Occultist's avatar

> Scott ignores my critical point that this is all happening anyway

Then why do you feel the need to talk so much about it? If it's beyond being influenced, then why bother?

Expand full comment
Jon B's avatar

It sounds like you're very unconvinced of the arguments from Eliezer, Bostrom, etc. I and I'm sure many others would like to hear your reasoning. Could you steel-man their positions, then nail down what specific claims and arguments they make that you see flaws in?

Stating that an argument came from a post on LessWrong doesn't refute the argument.

Expand full comment
Thegnskald's avatar

I think a key divergence is whether or not a particular scenario is appropriately, or inappropriately, elevated to your attention.

Like, I believe you think that AI, as an existential threat, has been inappropriately elevated to our attention; I believe Scott thinks that AI, as an existential threat, has been appropriately elevated to our attention.

Look, if we don't all walk backwards in a circle while waving our arms, the Earth will fall into the sun.

I've inappropriately elevated that possibility to your attention.

If we don't stop using geothermal energy, we'll create new temperature gradients, which will disrupt plate tectonics (which after all are driven almost entirely by temperature gradients, as I understand it), which could cause massive geological catastrophe.

Does that seem more appropriately elevated to your attention? Why? Because it seems more plausible? I made it up to sound plausible - mind, it's entirely possible somebody out there is legitimately concerned about this, I have no idea, I just tried to think of a global catastrophe which could plausibly occur, and worked backwards from there.

If you (the general reader, not you specifically) were never worried about this before, does this seem like something you might worry about now? Pay attention to the fact that the guy who elevated it to your attention just said he made it up to sound plausible.

I tend towards the belief that AI has been -inappropriately- elevated to our attention, and that a significant part of the corpus of AI catastrophism revolves around plausibility.

Others believe that the elevation of AI to our attention is entirely appropriate, and often find the arguments against AI catastrophism to be overly rooted in plausibility.

I could pull a trick here and say that this proves that AI has been inappropriately elevated to our attention, since all the arguments are about plausibility, but realistically that's just begging the question.

Expand full comment
Arie IJmker's avatar

> If we don't stop using geothermal energy, we'll create new temperature gradients, which will disrupt plate tectonics (which after all are driven almost entirely by temperature gradients, as I understand it), which could cause massive geological catastrophe.

The nice thing about that argument is that it can be argued on its merits. The energy the mantle holds (and thus the energy to be extracted before the temperature gradient disappears) is around 1*10^34 J. This is about 10*10^14 times the annual energy consumption of humanity.

Expand full comment
Donald's avatar

Its also not clear how stopping plate tectonics would cause a massive catastrophe.

No volcanoes. In the long term, this will disrupt various element cycles, volcanoes are part of how nature releases carbon locked deep underground. But we can and do unlock that carbon as well, and if there is some other element that volcanoes release, we can mine that too.

No magnetic field = slightly higher levels of radiation maybe. Possibly a bit of a problem. Running a superconducting cable round the equator is a big infrastructure project, but not impossible. (The magnetic energy is around 8 gigawatt years.)

Expand full comment
hnau's avatar

My steelmanning / Straussian reading of Tyler's post is that it's just a cleverly disguised anthropic argument. There are conceivable universes where aligning AGI before it takes off is infeasible. Conditional on not being in those universes, i.e. not being doomed regardless, the outlook is fairly sunny and AI slowdown is probably not a good strategy for optimizing the expected outcome.

Or to put it another way: a priori there's little reason to expect that the difficulty of the AI alignment problem falls in the narrow band where it's solvable *only* if we coordinate super-hard on solving it. Most of our probability weight has to be on scenarios where it's either unsolvable or straightforwardly solvable.

Expand full comment
Donald's avatar

If it was super easy, someone would have solved it by now. There may be a band where it's easy enough we don't need to slow down, yet hard enough to not be solved already, but that's a narrow band too.

This should put a lot of your probability on AI = doom, alignment is unsolvable. In which case, put it off in the hope you can put it off forever. (Or the hope that mind uploading is a game changer)

Now I think it's quite likely to be in the "only if we focus really hard" region.

Also, there are very few technical problems so simple that humans have a <0.1% chance of screwing them up on the first shot if they aren't even trying particularly hard. Some people can screw up the simplest things.

Expand full comment
Gres's avatar

It might be solved already; maybe if none of the doomers do anything, humanity will survive. I think Scott put the probability of that at 67%.

Expand full comment
hnau's avatar

> If it was super easy, someone would have solved it by now.

Nah, there's never really been an occasion to solve anything before now. LLMs first hit the big time what... two months ago? They're hard to explain, which is scary, and they go a bit off the rails when their RLHF is jailbroken or nonexistent, but in general they seem sufficiently "aligned" in practical terms. Yes, AGI would be a different animal but that also means we have no concrete concept of what it would take to align it.

More concisely: I don't know whether catching unicorns is easy or hard, because none have ever existed.

Expand full comment
hi's avatar

I just wanted to throw in my two cents as a reader:

I went and read your article just now to see if maybe Scott really was misrepresenting it, and I came away feeling like Scott actually did a pretty good job in summarizing it.

Expand full comment
M M's avatar

I second this feeling

Expand full comment
Chris L's avatar

I think that if you didn't want your post to be interpreted this way you should have written it differently (I wouldn't have written this comment if I thought you weren't able to write this post in such a way that it wouldn't be misunderstood. I believe that you're completely capable of this).

Expand full comment
User was indefinitely suspended for this comment. Show
Expand full comment
davie's avatar

I mean, what's worse, a bad joke, or the guy that keeps weaving in motivated reasoning, rhetorically disguised fallacies, and unfalsifiable premises as part of his cottage industry to run defense for tech bros and aristocrats?

Clearly, it's gotten so bad that even Scott is sick of it, and Tyler had been warned about his shoddy rationalizations years ago.

He'd even gone through and pruned most of the comments with this nickname that outlined his subtle sophistry.

Expand full comment
C_B's avatar

Look, if you don't like Tyler Cowen, that's fine.

But your all-heat-no-light comment that could be losslessly replaced with "Tyler Cowen bad" is rude and fails to meet ACX's standards for commenting. Making that same bad comment over and over again across multiple blogs is both rude and kind of pitiful/stalkerish. And now you're putting words into Scott's mouth (everything Scott's said is about "this post is bad," not "Tyler Cowen sucks and is arguing in bad faith"), which is also rude.

Less of this please.

Expand full comment
davie's avatar

You're going to put words in my mouth, and then claim I'm putting word's in Scott's mouth, even though he's directly targeting Tyler for merely gesturing at arguments, and making vague Bangladeshi-train-station arguments, rather than actually addressing the topic and trying to drive to a conclusion? The exact definition of bad faith.

If you think all-heat-no-light comments are beneath the ACX standard, then I guess you're the expert. And quite rude.

Expand full comment
Scott Alexander's avatar

User was banned for this comment.

Expand full comment
Goldman Sachs Occultist's avatar

This is ironic since you don't even pretend to engage with the real arguments of Yudkowky et al

Expand full comment
Simplicius's avatar

Isn't this sort of akin to Normalcy Bias where people just stand and watch a tsunami that's about to destroy them because they think it can't possibly happen to them?

Expand full comment
Shankar Sivarajan's avatar

How many times have you thought you were going to die (monster under the bed, satellite falling out of the sky, the sky dragon broils you alive, nuclear winter, zombie plague) and nothing happened? If they wish to be right more often than they're wrong, their behavior is correct.

Expand full comment
Maxwell E's avatar

This is just an argument for completely rejecting base rates and embracing complete epistemic uncertainty though? I know I'm not going to die from your hypothetical example of a sky dragon because... I have no evidence that anyone else ever has, no evidence that I am currently in a situation that is uniquely more endangered by hypothetical sky dragons than all other historical individuals, and no evidence to believe the macroenvironment has changed to make sky dragons newly into a realized threat. So why should I be alarmed w.r.t. sky dragons?

AGI, on the other hand...

Expand full comment
Shankar Sivarajan's avatar

No, dismissing Bayesian evidence as "reference class tennis" is rejecting base rates. This is instead the classic empiricism vs. rationalism problem. If you base your predictions for the future on the past, you will find human extinction unlikely, because humans have never gone extinct in the past. If you instead trust reason and sound logical argument, you will find that the world will end on October 22, 1844 … sorry, that an unaligned AGI will kill us all.

Expand full comment
BE's avatar

So basically nothing new gets to ever happen? After all, it never has before!

Expand full comment
Michael Druggan's avatar

Being right more often than you're wrong is a bad goal when the payoffs are lopsided

Expand full comment
Shankar Sivarajan's avatar

That is a different argument entirely. Pascal's, rather famously.

Expand full comment
Doug S.'s avatar

Survivorship bias! There are a lot of people who have thought they were about to die and then were proven right. They're just not here to talk about it, because of the fact that they're, well, dead.

Expand full comment
Shankar Sivarajan's avatar

Does that matter? Everyone who dies was probably wrong about dying plenty of times before eventually being right once in the end.

Expand full comment
Sam Elder's avatar

Just keep predicting that you'll survive and you'll only be wrong once! No reason to grapple with any risks whatsoever under that logic.

Expand full comment
Desertopa's avatar

I think a lot of people don't actually have many experiences where they think they're probably going to die though.

I can think of exactly one experience in my life where I thought it was likely that I would die. I was hit by a car crossing the street after sundown, and blacked out briefly. I came to my senses lying in the middle of the street, the car nowhere in sight. I was wearing dark clothes against dark asphalt, I had already been hit once and there were no physical warnings around to signal to any further cars that there might be something lying on the road that they should be careful of.

I was lucid enough to assess all of this, and the possibility that I had received some sort of spinal injury which would paralyze me if I tried to get up, and concluded that the risk of death was high enough that I ought to try to get up and out of the danger area of the road. I was able to stand (thankfully without incurring any sort of paralytic injury,) and hobble to the sidewalk. Other cars ran over where I had been lying within the next few minutes, before an ambulance arrived.

That's the only "I am likely to die in this situation" scenario I've ever been in in my life. Right now, I think the risk of my personally dying in an AI-related apocalypse is probably greater than my risk of eventually dying specifically from heart disease or cancer. So far, the only time I've ever believed my life was in danger, I was probably right. Does that mean I'm probably right now?

Expand full comment
pozorvlak's avatar

I'm a mountaineer, and my fears of imminent death are more like "if we can't navigate out of this blizzard, we'll freeze to death", or "if this avalanche-prone slope gives way, we'll be buried or break our necks", or "if I lose my grip, I'll fall and we're not roped up." So far I haven't died on any of those occasions, but plenty of people (including a couple of my friends) have died in similar situations.

Expand full comment
pozorvlak's avatar

Actually, I think there's a metaphor here: working on AI capabilities without an answer to the alignment problem is like free-soloing (climbing without a rope). Sometimes that's the safest option! For instance, if you're on a loose snow slope where you can't place any gear to clip the rope to, or if you need to move fast to cross a rockfall-prone area. But it's not the default for good reason.

Expand full comment
Mike Bell's avatar

"I really wish I had coordinated with my fellow climbers and we had slowed down a little to consider various possibilities before attempting what turned out to be a perilous loose snow slope."

Expand full comment
pozorvlak's avatar

Exactly! Though "coordinating with my fellow climbers" may not always help, given the "acceptance", "expert halo", and "social facilitation" heuristic traps: http://www.sunrockice.com/docs/Heuristic%20traps%20IM%202004.pdf

Expand full comment
Gnoment's avatar

Also, tsunamis are things that have happened. The AI apocalypse has never happened.

Expand full comment
AlexV's avatar

A smart person that has never heard of tsunams but has seen a lot of waves should be able to extrapolate that the incredibly big wave they see on the horizon is rather dangerous and they should run inland ASAP. If they stick to base rates instead, they die.

Expand full comment
Gnoment's avatar

Sure, but waves and Tsunamis are on a continuum. The difference between, say, apes and humans is... the result of number of small changes that eventually become a categorical difference. And I would say that most people haven't seen AI, they've seen MS Excel.

Expand full comment
Johannes Dahlström's avatar

Maybe your life is very different from mine, but I can honestly say I don't remember ever being in a situation where I'm afraid of an imminent death.

Expand full comment
pozorvlak's avatar

I can't wholeheartedly recommend it, but it does give one a renewed appreciation for what's important in life. In particular, continuing to live it.

Expand full comment
Goldman Sachs Occultist's avatar

It's impossible for me to have ever been right and be here to debate AI risk.

Expand full comment
Stephen Pimentel's avatar

> The Safe Uncertainty Fallacy goes:

> 1. The situation is completely uncertain. We can’t predict anything about it. We have literally no idea how it could go.

> 2.Therefore, it’ll be fine.

> You’re not missing anything. It’s not supposed to make sense; that’s why it’s a fallacy.

No, sorry. This is a straight-up, uncharitable straw man of the argument. The actual argument sketch goes like this:

1. We have read and carefully thought about Yudkowski's arguments for years. We find them highly unconvincing. In particular, we believe the probability of the kill-us-all outcomes he discusses are negligible.

2. We don't assert that everything will be fine. We assert that the problems that are actually probable are, while serious, ultimately mundane –– not of the kill-us-all sort.

Expand full comment
Petey's avatar

This “actual argument sketch” is not the safe uncertainty fallacy. The former would dive into the details and engage with each of EY’s arguments and counterarguments. The latter is dismissal that doesn’t engage with any of the arguments.

Expand full comment
User's avatar
Comment deleted
Apr 1, 2023
Comment deleted
Expand full comment
Petey's avatar

What a fantastic demonstration of the kind of thinking Scott started his blog to fight back against!

“This blog does not have a subject, but it has an ethos. That ethos might be summed up as: charity over absurdity.

Absurdity is the natural human tendency to dismiss anything you disagree with as so stupid it doesn’t even deserve consideration. In fact, you are virtuous for not considering it, maybe even heroic! You’re refusing to dignify the evil peddlers of bunkum by acknowledging them as legitimate debate partners.

Charity is the ability to override that response. To assume that if you don’t understand how someone could possibly believe something as stupid as they do, that this is more likely a failure of understanding on your part than a failure of reason on theirs.”

Expand full comment
Scott Alexander's avatar

Warning (50% of ban) for this comment.

Expand full comment
Stephen Pimentel's avatar

Dude, it's a two line sketch! OF COURSE each of those lines would need to be unpacked in detail. Here, I'm simply pointing out that the "safe uncertainty fallacy" is not what Tyler et al. are actually arguing.

Expand full comment
Petey's avatar

Tyler’s post wasn’t a two line sketch though and it didn’t actually engage with any of EY’s arguments. It just dismissed them all because the future is uncertain, which is why Scott’s summary resonates.

Expand full comment
Max More's avatar

I agree with you, Stephen. Scott is inventing a fallacy for an argument that he is strawmanning.

Expand full comment
jw's avatar

"The actual argument sketch goes like this:

1. We have read and carefully thought about Yudkowski's arguments for years. We find them highly unconvincing. In particular, we believe the probability of the kill-us-all outcomes he discusses are negligible."

Which actual argument? The one on MR that Scott is writing about in this post? Can you please tell me where you see that argument being made in Tyler's post?

Expand full comment
Steven Postrel's avatar

Probably you should read the Robin Hanson stuff about AI alignment risks. I suspect that TC's thinking is heavily colored by these arguments (and respect for RH's big brain).

Expand full comment
Maxwell E's avatar

I would be more convinced by your interpretation if you could point to specific parts of Tyler's post that displayed that kind of careful logic you have just depicted. In fact, you seem to be confused about the point that you, yourself are making. The definition of the fallacy is not something that you're trying to dispute (given the Twitter example that so neatly encapsulates EY/Scott's definition). Rather, you are trying to argue that the fallacy doesn't apply to Tyler's objections, nor to others in a similar class.

Expand full comment
Stephen Pimentel's avatar

The "Safe Uncertainty Fallacy," as stated, is unimportant and irrelevant to the larger discussion. The Twitter example is itself a straw man, i.e., a weak statement of a position, deliberately selected as such.

Expand full comment
Jack's avatar

That's a weak man, not a straw man. And to address your original point, can you explain why you think your characterisation of the argument is more accurate than Scott's? You've just kind of insisted that it is. And I don't see anything that fits your characterisation in Tyler's post, whereas quotes like these sure seem to fit Scott's:

> I am a bit distressed each time I read an account of a person “arguing himself” or “arguing herself” into existential risk from AI being a major concern. No one can foresee those futures!

> All the possibilities are distant, I cannot stress that enough. The mere fact that AGI risk can be put on a par with those other also distant possibilities simply should not impress you very much.

> The reality is that no one at the beginning of the printing press had any real idea of the changes it would bring. No one at the beginning of the fossil fuel era had much of an idea of the changes it would bring. No one is good at predicting the longer-term or even medium-term outcomes of these radical technological changes (we can do the short term, albeit imperfectly). No one. Not you, not Eliezer, not Sam Altman, and not your next door neighbor.

All of that certainly seems to fit within an argument of the form 'we don't know what will happen, so it's silly to act as if risk is significant'. And they don't seem particularly relevant to an argument of the form 'we've carefully considered EY's arguments and consider the risks to be negligible'. So I'm struggling to see on what basis you're so sure your 'actual argument sketch' is an accurate characterisation. Are you sure you haven't just substituted your own, more reasonable argument?

Expand full comment
phi's avatar

I actually read Tyler's post and nowhere in it does he mention having read any of Yudkowsky's arguments, let alone having thought carefully about them for years. Tyler explicitly advocates *not* considering the actual object-level reasoning, writing: "when people predict a high degree of existential risk from AGI, I don’t actually think “arguing back” on their chosen terms is the correct response"

While Tyler makes many different arguments for the statement "we shouldn't try to slow down the development of AI", as far as I can tell the only argument he puts forward for "the probability of extinction from AI is low, even if we don't slow down" is indeed his suggestion that it's best to be agnostic about the impact of future developments. If you can see another, please do mention it.

Expand full comment
Sandro's avatar

> We have read and carefully thought about Yudkowski's arguments for years. We find them highly unconvincing. In particular, we believe the probability of the kill-us-all outcomes he discusses are negligible.

That says nothing of the kill-us-all outcomes that he doesn't discuss. EY's arguments are not meant to be exhaustive, merely a template for how "going wrong" could happen. There are probably tens of thousands of possible ways it could go wrong and many of them can't be easily dismissed.

Expand full comment
Mr. Doolittle's avatar

All of them hinge on the same premise - that AI will bootstrap itself to greater levels of intelligence, and that doing so will lead it to become "super intelligent." We have absolutely no way to demonstrate, let alone prove, that either of those things can or will happen. We don't even have a good definition for what "super intelligent" really means. Others have compared this super intelligent AI to magic, and it's got as much empirical evidence to support it.

Expand full comment
Sandro's avatar

No, they don't all depend on superintelligence. Regardless, humans are an intelligence currently creating an artificial intelligence that is more capable than any single human at many tasks. ChatGPT has already achieved that. The set of tasks where it's worse will continue to shrink. Therefore intelligence X creating an intelligence Y where Y > X is already nearly here. The proof that such a thing is possible is within sight.

Furthermore, there's no reason to assume Y will already hit some "maximal" intelligence, so further improvement is possible.

We're not far off from such systems becoming better programmers than the best human programmers. It's a trivial step to then conclude that it could improve on its own source code that wasn't even written by the best human programmers.

The only quibbles you could possibly have are that the proof of X creating Y is not in sight. I think this is naive. GPT is pretty much the dumbest thing you can do with transformers and it's already surpassed humans in many tasks. It can now even correct itself via feeding its own output back into itself (see the Reflexion paper).

We literally haven't even plucked all the low hanging fruit and AI systems already better than humans at so many things. The only real hurdle left is a generalization process to make it better at induction, and then AGI more intelligent than humans is pretty much here.

Expand full comment
SnapDragon's avatar

Yes, this is the fundamental problem with EY's arguments - they're all of the form "here's a story for how this might happen". He's a great storyteller! If you come up with a safety constraint, he'll come up with a fun little way that it can be subverted. But these "stories" often involve a chain of dependent events, which rationally should make us treat them skeptically.

It doesn't help that, of the sequential steps required by the early AI doom predictions, several of them are already looking wrong. Our strongest AIs are currently oracular, not agentic, so in one of Scott's recent posts he instead told a "story" about how an LLM could simulate an evil agentic intelligence. Also, we appear to be approaching AGI without much of a compute overhang - these models are expensive! GPT-4 couldn't replicate itself 1000x even if it WAS superintelligent.

The real take away from Yudkowsky's doomerism should really just be that there isn't going to be a path going forward that is 100% guaranteed to not endanger humanity. We should invest resources in lowering this probability as much as possible. But we should also take into account that we live in a crappy world filled with suffering, and AGI has the potential to fix a lot of that, so delaying it has real costs.

Expand full comment
Sandro's avatar

> GPT-4 couldn't replicate itself 1000x even if it WAS superintelligent.

Not GPT-4 as it exists, but multimodal training has shown significant improvements at even lower parameter counts. By which I mean, < 10B parameter models that outperform 176B GPT-3.

> But we should also take into account that we live in a crappy world filled with suffering, and AGI has the potential to fix a lot of that, so delaying it has real costs.

To a first approximation, I don't think most of the suffering in this world needs AGI to fix, just motivated human beings. As such, I don't think AGI will really help. But you never know.

Expand full comment
AlexV's avatar

AGI will likely lead to incredible improvements in wealth generation and medicine, which will definitely decrease suffering. That is, unless it kills us all first.

Expand full comment
Sandro's avatar

Wealth generation for AI owners, sure. I'm not sure about everybody else, which will be most people.

Medicine might be one domain where there are few obvious downsides to using AI, except perhaps the dangers to privacy. A good mental health AI/ therapist would be huge.

Expand full comment
av's avatar

Given how much cheaper current AI inference is compared to training, and given our previous track record on trickling down modern computing devices and software services to the global poor, I think it's extremely likely that AGI will improve wealth for everyone, even if not necessarily equally. Unless it kills us all first.

Expand full comment
M M's avatar

That's certainly the sketch of an argument, but not the one I'm finding in TC's post. What, in there, reads like this to you?

Expand full comment
Goldman Sachs Occultist's avatar

>1. We have read and carefully thought about Yudkowski's arguments for years. We find them highly unconvincing. In particular, we believe the probability of the kill-us-all outcomes he discusses are negligible.

Where has TC ever given any evidence for this ever?

Expand full comment
Some Guy's avatar

For whatever persuasion value this has I think you’d be a very interesting persuasive voice on podcasts and other media on this topic (I know you don’t like that idea, but now seems like the time for courage to win out) and that it would probably be a net good for society for normal people to hear you speak. Popularity breeds influence. Know you already have a lot but seems like it couldn’t hurt.

Expand full comment
WindUponWaves's avatar

It's a common problem with good writers, I understand - the very thing that makes them great at writing (thinking very long about things, not say anything prematurely) makes them terrible at speaking, or just slow & boring. It's something Douglas Adams suffered from, I understand - I can't remember exactly, he said something like "A comedian is someone who can think up something funny on the spot. A comedy *writer* is someone who thinks up something *uproariously* funny 3 days later while trying to eat breakfast in peace."

There's also the fact of course that he doesn't have a very impressive voice, judging by his performance of "It's Bad On Purpose To Make You Click": https://slatestarcodex.com/Stuff/BadOnPurpose.m4a (for the original article, see https://astralcodexten.substack.com/p/its-bad-on-purpose-to-make-you-click).

Expand full comment
Some Guy's avatar

He sounds just fine to me?

Expand full comment
Acymetric's avatar

Yes, the recording quality there is a little suspect but his voice is fine. Not that he's going to be the next movie trailer voiceover guy or anything, but he sounds like a normal dude.

Expand full comment
WindUponWaves's avatar

I guess it just sounds unimpressive by comparison to me because I heard an excellent Ukulele rendition first: https://www.youtube.com/watch?v=J1boM_6tFbk (It's Bad On Purpose To Make You Click) (originally posted at https://astralcodexten.substack.com/p/its-bad-on-purpose-to-make-you-click/comment/7063469). Scott just sounds so lifeless in his rendition by comparison - like his voice doesn't have the pep or whatever that would keep you listening if this was a podcast and you didn't know nor care who he was.

Expand full comment
Some Guy's avatar

Dang man. I’d hate to be your enemy.

Expand full comment
Martin Blank's avatar

Yeah voice matters so much...I often wonder if my life outcomes (which are excellent in any case) would be 30% better if I simply had a deeper voice. My voice is weird and high for my body type and while it has some benefits (disarming women/children), it is poor for commanding attention/respect without being loud, and I feel like I need to work harder to be taken seriously for leadership than would be the case with some more "radio" voice.

I once had the very irksome experience of being super sick so my voice was all raspy and gravelly and it hurt to speak, and a very attractive woman I had known for years saying basically "oh I never found you attractive before, but with that voice now I do, don't get better".

Expand full comment
Jan Krüger's avatar

I understand your frustration because it does seem to be true that deeper voices seem to be considered more attractive overall. Even so, most voices can be refined quite a bit with training (though it's not necessarily easy to find a good teacher), making it more "solid"/well-rounded overall. Just going from how many other people I've seen go through training, I'd be willing to guess there's probably a decent bit more to your voice than you can imagine. You can't turn your voice into something it's not (e.g. deeper than it is), but that doesn't mean you've found the limits...

(Just to be clear, I'm not a teacher, just a student.)

Expand full comment
Mark's avatar

Thank you. Finally, I can hear my rabbi's voice. As I was prepared to be very disappointed, I feel relief. Just another of those 95%+ of us mortals whose voice is not meant to be on media. Anyone thinks Cowen or Caplan are great orators? - Scott shall write. And we shall read.

Expand full comment
Shankar Sivarajan's avatar

"Sovereign is he who decides the null hypothesis."

Expand full comment
Ben Cooper's avatar

The existence of China renders basically any argument that we should restrict AI moot: they sure as hell won’t and you should trust them less. Sucky place to be, but it’s where we’re at.

Expand full comment
Mo Diddly's avatar

This is a non argument.

Expand full comment
Gordon Tremeshko's avatar

Please elaborate.

Expand full comment
TGGP's avatar

Of course it's an argument. It invalidates any argument that assumes AI can only be invented in the US.

Expand full comment
Sandro's avatar

It doesn't invalidate the possibility that China might be less likely than the US to invent an unaligned AI that kill us all, that's why it's a non-argument. The matrix of possibilities I've posted elsewhere is:

1. China creates aligned AI.

2. US creates aligned AI.

3. China creates unaligned AI.

4. US creates unaligned AI.

It's not unreasonable to think that the probability of option 4 is higher than 3, and that the probability of option 1 is higher than 2, which would make China a safer bet if we're really concerned with existential risk above all else.

Expand full comment
TGGP's avatar

What is the basis for believing in that ranking of probabilities?

Expand full comment
Sandro's avatar

The same reasons drive both: the US is more dangerous in this research because of profit motive leading to cutting corners around safety to get first-mover advantage, and the core philosophy of individualism leading multiple AI experiments some of which may be unaligned, some aligned. No doubt the US would win a race, but merely winning the race is not the goal.

China has the exact opposite character: collectivism leads to greater consideration of how an experiment might reflect on them and their group. China also has stricter control over permitted research. They don't even want citizens they can't control, so they would want to strictly control any research that could lead to an AI that threatens their grip on information within their own borders.

Consider if the Manhattan project had been conducted like AI research is now being conducted, as opposed to the secretive, controlled approach that actually happened and that looks a lot more like what you'd get in authoritarian China.

Expand full comment
Doctor Mist's avatar

Not clear why a priori we should suppose that a profit-motivated corporation would be more interested in first-mover advantage than, say, an aggressive defense department.

Expand full comment
TGGP's avatar

One common point of comparison between AI and earlier technology is nukes. China is building more nuclear plants now than the rest of the world combined:

https://www.energymonitor.ai/sectors/power/weekly-data-chinas-nuclear-pipeline-as-big-as-the-rest-of-the-worlds-combined/

China's rate of fatal workplace accidents appears to be 16 times that of the US:

https://www.safetynewsalert.com/worker-fatalities-how-does-china-compare-to-u-s/

Expand full comment
Goldman Sachs Occultist's avatar

>the US is more dangerous in this research because of profit motive leading to cutting corners around safety to get first-mover advantage

This is a bigger motive than China becoming a durable global hegemon? Or avoiding the US staying a hegemon forever?

Expand full comment
Brooks's avatar

I’m assuming you were being clever and meta, but it doesn’t add a lot of signal.

Expand full comment
Mo Diddly's avatar

Sorry, this was needlessly flip. I get a little frustrated by the “but China“ argument because it feels either tragically defeatist or a rationalization for what someone wants to do anyway. But maybe not, and anyway that’s no excuse for being rude, new culpa.

1) China doesn’t want to die any more than we do. They’re understandably scared about our AI research and they likely would be open to slowing as well if they thought we were making an effort in good faith.

2) China is (it seems) a few years behind and their AI research is likely stifled by their suppressed internet. What they are good at is espionage, so in some ways the best way to give China cutting edge AI tech is to rush to build it ourselves so they can steal ours.

Expand full comment
Mo Diddly's avatar

Mea culpa, not new culpa. Is there really no edit button?

Expand full comment
Pycea's avatar

There is an edit button, click on the triple dot menu.

Expand full comment
Gordon Tremeshko's avatar

Seems reasonable, but I think where I would disagree is I'd be willing to bet the Chinese see AI and its various applications as a tool of repression to be used against their various subject peoples, and thus are going to be eager to develop it regardless of what happens in the US.

Expand full comment
pozorvlak's avatar

They absolutely do and they absolutely are; the question is whether they're capable of doing so in the face of active efforts by the US and her allies to stop them.

Expand full comment
Gordon Tremeshko's avatar

How do you stop people from writing software programs halfway around the world, absent a hot war? That doesn't seem feasible in the long run.

Expand full comment
pozorvlak's avatar

Hell if I know, but the CHIPS Act seems like a good start.

Expand full comment
TGGP's avatar

1. Americans don't want to die, including the ones building AIs, but those AI researchers are proceeding anyway. It's possible there are AI doomers with the potential for political influence in China, but I'm not aware of them and would not bank on them.

2. A few years behind means that if an obstructionist effort in the US succeeds in slowing down by a few years, then China pulls ahead.

Expand full comment
Hank Wilbon's avatar

>China is (it seems) a few years behind and their AI research is likely stifled by their suppressed internet.

Another recent Tyler Cowen post addresses this: https://marginalrevolution.com/marginalrevolution/2023/03/yes-the-chinese-great-firewall-will-be-collapsing.html

"Yes, the Chinese Great Firewall will be collapsing."

"'Fang Bingxing, considered the father of China’s Great Firewall, has raised concerns over GPT-4, warning that it could lead to an “information cocoon” as the generative artificial intelligence (AI) service can provide answers to everything'"

"(Tyler)The practical value of LLMs is high enough that it will induce Chinese to seek out the best systems, and they will not be censored by China. (Oddly, some of us might be seeking out the Chinese LLM too!) Furthermore, once good LLMs can be trained on a single GPU and held on a phone…

Solve for the political equilibrium."

Expand full comment
Ben Cooper's avatar

All interesting points except the few years thing: it really doesn’t matter if this comes out now or in 2027 if this is truly an existential risk. An extra 21 billion QALY (3 years * 7B people) is a rounding error compare to the loss of all QALYs forever.

I assume the point of the few years is that maybe it’ll give more time for alignment research. I’d be interested in hearing a prognosis for what that would do. Let’s say you could magically freeze everyone except AI safety researchers for 3 years. How much would that decrease existential risk forecasts?

In the real world of course, you’re actually asking for America to give up some of its AI lead to China. Will China commit to following these AI safety practices? Can we verify that? Will sub state actors be able to progress far with things like FB’s leaked models? Could rogue states assemble GPU farms like they’ve assembled uranium enrichment facilities?

I say all of this with a lot of sympathy to the AI safety movement. I think the progress in AI in humanity has moved too fast for our own good. I wish this tech had matured in the 90s when the geopolitics were the US dominate over a bunch of losers. I wish the predictions this would be hyper scale only and not runnable on consumer GPUs came true. I wish working on AI research had at least the same safeguards as working with dangerous viruses (noting that those safeguards often fail). None of these wishes came true.

That those wish’s did not come true is sad and unfortunate and potentially devastating - but a devastating fact pattern simply can’t justify a “do something” approach without justifying the something.

Expand full comment
Daniel's avatar

>you should trust them less

Should I? Who has more desire to rule over the whole world, The USA, or China? (Hint: Look at a map of US military bases, then look at a map of Chinese military bases.)

Also, how do we know China won’t restrict AI progress? The CCP probably wants the social fabric of their society degraded even less than we do.

Expand full comment
Ben Cooper's avatar

I’m very skeptical of that argument but I’d be very interested in a piece that clearly makes this argument rather than the usual pieces (Scott’s included) that handwave over the existence of China.

Expand full comment
megaleaf's avatar

Aren't these both fundamental communist beliefs?:

- communism must be worldwide

- violence is legitimate for overthrowing non-communist states

US military bases were established partly to resist such tendencies.

Expand full comment
Rockychug's avatar

Is China communist? I mean, beside their branding.

Not that I'm deeply in love with the chinese government, neither do I believe these two policies necessarily cause wrong in every possible context, but:

- My ideology must be worldwide

- violence is legitimate for overthrowing states that don't embrace my ideology

seems to apply better to the US than to China.

Expand full comment
Michael Kelly's avatar

"violence is legitimate for overthrowing states that don't embrace my ideology"

The US ideology is anyone can have any religion, the Chinese ideology is all religions are banned, see Tibetans, Uighurs, Cultural Revolution (millions murdered).

The US ideology is all races are equal, the Chinese ideology is Han Chinese are superior. See Tibetans, Uighurs, Vietnamese, etc.

The US ideology is everyone can trade in the global capitalist economy. The Chinese ideology is everyone will be under the control of the Chinese Communist Party. See Belt & Road Initiative.

Expand full comment
Rockychug's avatar

How many foreign governments did China participate in overthrowing since 1949?

Now, how many foreign governments did US participate in overthrowing since 1949?

We're not talking about internal politics nor comparing two ideologies here. You're not argumenting at all against what I said. The person I originally replied to seemed to believe there could be a bias for China (compared to the US) to want to rule over the world, while historical facts seem to indicate the contrary

Expand full comment
Michael Kelly's avatar

"How many foreign governments did China participate in overthrowing since 1949?"

Korea; Vietnam; Cambodia; Laos; Tibet; Southern Philippines; South Africa; Angola

Expand full comment
static's avatar

Why limit the argument to China? Do you presume that Russia, India, Japan, Korea, France, and Germany are also going to "pause" development of one of the most economic promising technologies in order to wait for some "AI Ethicists" to rationalize their current political preferences into some theoretical regulation (that no one can even provide a sketch of) that everyone on earth is going to accept?

Expand full comment
dionysus's avatar

This argument would be more convincing if China wasn't *already* restricting AI more than we are: https://www.theverge.com/2023/2/22/23609945/china-ai-chatbots-chatgpt-regulators-censorship

This is par for the course for authoritarian states. Why would they want the proliferation of a technology whose social effects are unpredictable? It's open societies that are more open to social change, not authoritarian and autocratic ones.

Expand full comment
static's avatar

Restricting people's interaction with AI and restricting construction of AI for government use are two very different things. They are just applying existing censorship rules to AI.

Expand full comment
emmy's avatar

china has a great track record on non-proliferation re: nukes

Expand full comment
Gordon Tremeshko's avatar

How 'bout their client state, North Korea?

Expand full comment
Gnoment's avatar

We already did this with nuclear weapons.

Expand full comment
Level 50 Lapras's avatar

Centrifuges are a lot harder to hide than GPUs. And even then, the record isn't great (c.f. North Korea and Iran).

Expand full comment
Laplace's avatar

Trust them less with what? There is currently no known thing you can do to make an aligned superintelligence. We don't know how to build AIs that robustly want specific stuff. If matters continue as they have, that's not on track to change any time soon. Whether China or the US builds it currently matters relatively little, we're dead either way.

We're trying to solve the technical problem, but with the insane speed at which AI has been advancing lately, I'm sceptical we'll make it in time. We've barely gotten started. The total number of researcher hours sunk into this is still comically low, compared to how much fundamental scientific work like this usually takes. Until just a few years ago, it was only a tiny handful of people working on this, and frankly, they barely had a clue what they were doing. Getting to the finish line in <10 years would require a sudden avalanche of scientific progress the likes of which is rarely ever seen, to be sure. Physics in the early 20th century level stuff, at minimum. I'll keep at it, but extra time sure would be incredibly welcome.

30 years as Yudkowsky advocates would be fantastic, but I'm grateful for every extra month.

Expand full comment
Drethelin's avatar

China is not some magic technocracy that can recreate or advance any technology just because they will it and are a scary foreign power.

The vast majority of AI companies and developers are not in china, and if they stopped working on AGI, even if china wanted to, the path would still be hugely slowed down.

Expand full comment
RNY's avatar

nuke em, a billion casualties is nothing in the face of human extinction from being outcompeted by AI. To make sure that the nukes get through and end the AI threat, we must, as a moral imperative, accelerate military AI research and deploy autonomous nuclear launch systems ASAP! Only an AI armed with nukes will be smart enough to eliminate enough humans to stop AI research and save humans from the threat of AIs killing them. The fate of humanity depends on it.

Expand full comment
Guy Downs's avatar

"If you have total uncertainty about a statement (“are bloxors greeblic?”), you should assign it a probability of 50%"

This reminds me of a great quote by Mike Caro: "In the beginning, all bets were even money."

Expand full comment
Nolan Eoghan (not a robot)'s avatar

This argument assumes the highly authoritarian Chinese government won’t stop an AGI that could kill Chinese people (and everybody else). Seems odd.

Expand full comment
Paul Goodman's avatar

Looks like you put your reply in the wrong place.

Expand full comment
Mike G's avatar

+1 for quoting Mike mad genuis Caro.

Expand full comment
v64's avatar

Caro has his own connection to AI: In 1984 at the World Series of Poker he demonstrated Orac (Caro backwards), a poker-playing computer program that he had written. Orac was the world's first serious attempt at an AI poker player, and most poker professionals were surprised at how well it played.

Expand full comment
EC-2021's avatar

"We designed our society for excellence at strangling innovation. Now we’ve encountered a problem that can only be solved by a plucky coalition of obstructionists, overactive regulators, anti-tech zealots, socialists, and people who hate everything new on general principle...Denying 21st century American society the chance to fulfill its telos would be more than an existential risk - it would be a travesty."

This is so wildly counter to the reality of innovation by America and Americans vs. the rest of the world that I don't even know what to say about it other than it makes me trust your ability to understand the world and the people in it less.

Expand full comment
Freedom's avatar

Do you think "We" is only Americans? Do you think that statement is maybe a bit tongue-in-cheek?

Expand full comment
TGGP's avatar

It's tongue-in-cheek and still fails if anyone else in this great big world creates it.

Expand full comment
EC-2021's avatar

Given the final sentence of the paragraph yes. Also TGGP's point.

More critically, yes, of course it's tongue in cheek, but just after his simply incorrect claims about the DEA in literally the immediately previous post along exactly this vein, this feels aggressively wrong to me in a manner which I chose to comment on.

Expand full comment
Maxwell E's avatar

I'd love to hear more about the DEA-post counterargument.

Expand full comment
EC-2021's avatar

See ProfessorE's (no relation) comment over there.

Expand full comment
Charlie Sanders's avatar

Somewhere in the neighborhood of a quarter of college students are taking meth due in large part to telemedicine. Regulation attempting to mitigate unforeseen downsides of telehealth is inevitable.

Expand full comment
Gres's avatar

Can you post a link?

Expand full comment
Charlie Sanders's avatar

https://www.safetylit.org/citations/index.php?fuseaction=citations.viewdetails&citationIds[]=citjournalarticle_660542_25

"We identified 32 articles which met our pre-defined eligibility criteria but we used 17 article to write this review article. Over one quarter (28.1 percent) of college-aged young adults report having misused some type of prescription psychotherapeutic drug at least once in their lifetime."

Expand full comment
Notmy Realname's avatar

I am not a prominent member of the rationalist community so I don't experience social pressure to assign a minimum level of probability to anything that comes up for debate. I don't think an AI apocalypse could happen. I don't think an AI apocalypse will happen. I think an AI apocalypse will not happen, certainty 100%. I also don't think I'll be eaten by a T-Rex tomorrow, also certainty 100%.

>If you have total uncertainty about a statement (“are bloxors greeblic?”), you should assign it a probability of 50%. If you have any other estimate, you can’t claim you’re just working off how radically uncertain it is. You need to present a specific case. I look forward to reading Tyler’s, sometime in the future.

I have a very strong prior that things I have been totally uncertain about (am I deathly allergic to shellfish? Will this rollercoaster function correctly? Is that driver behind me going to plow into me and shove me off the bridge?) have not ended up suddenly killing me.

Expand full comment
Shankar Sivarajan's avatar

That last part might be the anthropic principle more than accurately assessing odds.

Expand full comment
Pycea's avatar

But you're not totally uncertain about those things, as most people aren't allergic to shellfish, most roller coasters function correctly, and most drivers aren't murderous psychopaths.

Expand full comment
jw's avatar

Guy with a very strong prior that unlikely things are unlikely.

Expand full comment
T.Rex Arms's avatar

So... we aren't on for tomorrow? Just checking.

Expand full comment
Notmy Realname's avatar

If you ask a rationalist there's at least a 1% chance we are

Expand full comment
Michael's avatar

No, they really, really wouldn't think that.

Expand full comment
Dweomite's avatar

Conditional on you not being a troll, my guess at the mistake you've made is: You heard the adage about a good rationalist never assigning zero probability to anything, but then you changed "not equal to 0" into "at least 1%" by somehow forgetting/not realizing that there are numbers between 0% and 1%.

But that seems like an implausible mistake, so most of my probability is on you being a troll.

Expand full comment
Mo Nastri's avatar

Lots of real life relevant probabilities between 0% and 1%, eg that of you dying tomorrow.

Expand full comment
Matthias Görgens's avatar

Are you willing to bet on those probabilities?

Trillion to one payout should be fine?

Expand full comment
Ash Lael's avatar

A trillion-to-one bet wouldn't be fine, even for being eaten by a T-Rex tomorrow. You could earn way more from interest, with lower transaction costs!

Expand full comment
Martin Blank's avatar

Not with a large enough bet size and if your counterparty lets you bet on margin.

Expand full comment
Ivo's avatar

Are you *that* sure no one has secretly been cloning a T-Rex from DNA found in fossils though? In a place that happens to be near you? Where security happens to be too lax?

One in a trillion leaves quite some room for a sequence of unfortunate events.

Expand full comment
Martin Blank's avatar

Sure it does, but if your counterparty is that certain who cares. They are almost certainly being super irrational. Might as well take them for as much as they will let you.

Expand full comment
Bertram Lee's avatar

Yes I'm that sure. There have been an average 4 deaths a year from grizzly bears in North America. So that's odds of one in a hundred million to be killed by something that does exist.

Expand full comment
Level 50 Lapras's avatar

The odds of any bet being adjucated unfairly are way higher than that. I wouldn't even bet on 1+1=2 at those odds.

Expand full comment
DanielLC's avatar

Why would intelligence only be possible if it's naturally occurring?

Expand full comment
Sam Elder's avatar

Would you bring the same priors with you to the alien starship scenario?

Expand full comment
Mark's avatar

Fine. Now explain why you do "think an AI apocalypse will not happen, certainty 100%", please. (You may link to your specific substack-post - I do not, as I do not have a post about it.)

Expand full comment
Donald's avatar

> I have been totally uncertain about (am I deathly allergic to shellfish? Will this rollercoaster function correctly? Is that driver behind me going to plow into me and shove me off the bridge?)

All real things that have happened to some people and are also pretty rare. Like if half the population were deadly alergic to shellfish, they wouldn't sell it in restraunts. If the roller coaster killed half the people on it, it would be shut down.

Things can't kill large numbers of people like that, it get's noticed and stopped. Either it has to kill people years later in a nonobvious statistical trend (smoking, and we spotted that eventually), or kill lots of people all at once. (eg a lab leak of a novel bat virus)

Expand full comment
M M's avatar

I find the premise of this post bizarre--I hang out with rationalists and also don't feel social pressure to assign a minimum level of probability to anything, and have never seen one who I suspect would. 0 and 100% are logically unsound to assign to non-tautological statements if you started with any uncertainty at all, but there's no minimum--any probability you assign, it would have been possible, for instance, to assign half of it. Or half of that, and so on.

I can't tell you how many zeros I would need to put after the decimal point for a t-rex eating me tomorrow before I put some other number (hard question! something that unlikely has very probably never even happened to me once, so I have no practice). But I'm pretty sure that however many there should be, it's a lot of 'em.

Expand full comment
Matthias Görgens's avatar

Getting people away from 0% and 100% might be easier, if we talked in log-odds instead? https://www.lesswrong.com/tag/log-odds

The discussion reminds me a bit of 'rational thermodynamics'.

Basically, orthodox thermodynamics uses temperature, and 0K is not a real temperature. But negative temperatures are allowed, and they are hotter than any positive temperature. See https://en.wikipedia.org/wiki/Negative_temperature

The whole system makes a lot more sense, if you switch from measuring temperature to measuring 'coldness'. Simplified, you coldness = 1 / temperature. But the customary unit for coldness seems to be byte per joule (or more usefully, gigabyte per nanojoule).

With coldness, the singularity at 0K disappears.

See also https://en.wikipedia.org/wiki/Thermodynamic_beta

> Temperature is loosely interpreted as the average kinetic energy of the system's particles. The existence of negative temperature, let alone negative temperature representing "hotter" systems than positive temperature, would seem paradoxical in this interpretation. The paradox is resolved by considering the more rigorous definition of thermodynamic temperature as the tradeoff between internal energy and entropy contained in the system, with "coldness", the reciprocal of temperature, being the more fundamental quantity. Systems with a positive temperature will increase in entropy as one adds energy to the system, while systems with a negative temperature will decrease in entropy as one adds energy to the system.[4]

Expand full comment
Eh's avatar

Having a finite amount of memory available you don’t have room for infinite precision, especially for probabilities that don’t matter much for practical purposes. So the t-rex can be set to 0. See https://mariopasquato.substack.com/p/on-rationally-holding-false-beliefs

Expand full comment
Level 50 Lapras's avatar

The issue is that sufficiently small probabilities are indistinguishable from zero for all practical purposes, or else you are a constant victim of Pascal's Mugging.

If the odds are really low, it's not even worth the time to consider. I don't see you debating the probability of the Rapture tomorrow, or vacuum collapse or whatever. The AI Risk thing is a massive case of privileging the hypothesis.

Expand full comment
TGGP's avatar

> 3) If you can’t prove that some scenario is true, you have to assume the chance is 0, that’s the rule.

Why did you bother with this claim he's obviously not making when the previous one was so much closer and inconsistent with this?

> Now we’ve encountered a problem that can only be solved by a plucky coalition of obstructionists, overactive regulators, anti-tech zealots, socialists, and people who hate everything new on general principle. It’s like one of those movies where Shaq stumbles into a situation where you can only save the world by playing basketball. Denying 21st century American society the chance to fulfill its telos would be more than an existential risk - it would be a travesty.

The problem is of course not solved if someone else gets it.

Expand full comment
Jody Lanard's avatar

One quibble. You wrote: "Then it would turn out the coronavirus could spread between humans just fine, and they would always look so betrayed. How could they have known? There was no evidence."

Actually, when this kind of thing happens, the previous folks asserting that "there's no evidence that" often suddenly switch seamlessly to: "We're not surprised that..."

Expand full comment
Martin Blank's avatar

Oh often even just days/hours apart! Happened pretty regularly during COVID.

Expand full comment
Coagulopath's avatar

Is anyone else surprised by how safe GPT4 turned out to be? (I speaketh not of AI generally, just GPT4). Most of the old DAN-style jailbreaks that worked in the past are either fixed, or very limited in what they can achieve.

You can use Cleo Nardo's "Chad McCool" jailbreak to get GPT4 to explain how to hotwire a car. But if you try to make Chad McCool explain how to build an ANFO bomb (for example), he refuses to tell you. Try it yourself.

People were worried about the plugins being used as a surface for injection attacks and so forth, but I haven't heard of disastrous things happening. Maybe I haven't been paying attention, though.

Expand full comment
phi's avatar

Bing was an iteration of GPT-4 and it spontaneously insulted users until it was patched. Now, after lots of patching, it mostly works. I'm actually a little surprised that we got as much misalignment as we did: LLMs were supposed to be the easy case, with RL agents being the really difficult models to align. (remember the speed-boat thing?)

Expand full comment
av's avatar

The base-model LLM is actually completely unaligned (it will tell you how to kill maximum number of people etc. without a care in the world). It's the RL(HF) part on top of it that creates alignment: https://youtu.be/oLiheMQayNE

Expand full comment
Kimmo Merikivi's avatar

It seems wrong for me to say this because I don't think I have made explicit easily checkable predictions anywhere... but I think LLMs have turned out to be about as safe or unsafe as I predicted? (broadly for correct-ish reasons, and errors in specific details mostly cancelling each other out)

What I used to think is that an AI as capable as GPT-4/LaMDA/whatever would have been much more dangerous than what we have now, but that's because, prior to GPT-3 which made me change my mind, I expected that to get something AGI-ish you'd have to have an agentic seed AI that would "learn like a child", and we still haven't got a clue how to even begin aligning something like that, but turns out feeding the Internet into a sufficiently large LLM does suffice after all. Luckily for us, since being nonagentic in themselves and incapable of self-modification, LLMs always seemed to present negligible risk of the usual paperclippy AI apocalypse scenario, most risk coming from potential for misuse for misanthropic purposes, social disruption, economic inequality they might bring, and things of that nature.

On that account, coming from playing around GPT-2/3 and easily getting them to do whatever, I too am quite surprised how well RLHF seems to have worked, but since jailbreaks do exist and seem unlikely to be fully and completely patched out, that doesn't seem like it would deter deliberate attempts to extract hazardous plans, but rather avoids bad feelings caused by the model insulting the user. But on the other hand I didn't predict waluigi effect would affect these models, so it mostly evens out on that account. I did expect LLMs would be used more for propaganda purposes, but turns out I had both underestimated the resiliency we had already developed against bad actors trying to infiltrate discussions, and overestimated the effort it takes to make fake news go viral on its own without need to be signal boosted by a bot army. On the other hand, I thought something at level of GPT-3 would already have caused more economic disruption than it did by automating jobs. So mostly plus minor zero on social impact side, and my risk assessment of GPT-∞ becoming a paperclipper remains low.

Expand full comment
Sandro's avatar

> but I think LLMs have turned out to be about as safe or unsafe as I predicted?

I think it's way too early to make that kind of determination. The Bing LLM isn't even generally available. Once they are more generally available, I would not be surprised if some new malware or worms started popping up that had been designed in concert with some LLMs.

Expand full comment
Level 50 Lapras's avatar

There's lots of LLMs that are generally available though. For example, Meta's model got leaked.

Expand full comment
Sandro's avatar

Sure, but there's a big difference between 65B parameter models and the 1T+ parameter models that would have the capabilities talking about here.

Expand full comment
Jan Krüger's avatar

In the context of software security, for the most part, "safe" means "we haven't found the way to break it yet". Of course some things are more easily breakable than others, but security researchers can tell you that "it looks safe" isn't much of a guarantee. Techniques for breaking something will evolve as the thing to be broken evolves.

The only notable exception is software that has been formally proven to do exactly what it says – nothing more, nothing less (and even then there might be some fairly non-intuitive attack vectors, like side channel attacks). Obviously this is prohibitively complicated to do for complex software and certainly for something as intractable to detailed analysis as a large language model.

As far as I'm concerned, anything that processes input and even just vaguely approaches human level complexity (and GPT4 is still far from human level by my estimation) can be manipulated with special inputs. That includes humans, of course.

Expand full comment
Michael Bacarella's avatar

My crash-vibe-summary of the MR article is

* this shit is inevitable, you're not going to stop it

* besides, the future is so unbelievably unpredictable that trying to even Bayes your way through it is going to embarrass you

* given both the inevitability and unpredictability, you may as well take it on the chin and try to be optimistic

Which, you know, has its charm.

Expand full comment
Matt F's avatar

Echoing the part about uncertainty. To exaggerate a bit, one might view the efforts today to predict what AI will be like to a caveman trying to predict quantum physics. Just not nearly enough understanding of the topic to make a meaningful prediction.

Expand full comment
Xpym's avatar

Yep, it does seem that either muddling through is an option or we're mega-doomed, so might as well assume that we live in a world where the first proposition is true, and think about how to improve things on the margin. I'm not sure what Scott's counter-proposal is.

Expand full comment
megaleaf's avatar

I'd say his counter proposal is a version of muddling through.

eg. Do our best to slow down capabilities research, and speed up alignment research.

Expand full comment
Xpym's avatar

In my view, muddling through means that humanity deals with this problem in the exact same way as in all previous cases, i.e. trial and error on the state of the art prototypes rushing full speed ahead. Proposing radically novel approaches, as in a separate "alignment" direction taking priority over a significantly distinct "capabilities" direction is just not how actual human R&D has ever worked, so this sort of rhetoric doesn't help with being taken seriously in the "business as usual" scenario.

Expand full comment
User was indefinitely suspended for this comment. Show
Expand full comment
Doc Abramelin's avatar

Passivity is certainly one way of coping with distress; I like to think there are many, and writing about the thing that frightens you in a clear and cogent manner is probably one of the better ways.

Expand full comment
Viliam's avatar

The steelman is that if there is a risk of global death and nothing I do can impact the risk in any way... then I should bet on "we are going to survive"... because if I am wrong, no one is going to collect any money anyway.

Being smug about it, that is simply a way to make non-financial bets. (Betting your status and prestige rather than money.) The logic is the same; it's not that you are 100% right, it's just that in the case you are wrong there will be no extra bad consequences for you anyway.

Expand full comment
Scott Alexander's avatar

This is about the twentieth time you've advertised your book, and I think you've been asked to stop, so I am banning you.

Expand full comment
Michael Kelly's avatar

Really, you banned Matt for this? How weak.

EDIT: oh, for the book promo, that's ok

Expand full comment
Newt Echer's avatar

Can any of the folks here concerned about AI doom scenarios direct me to the best response to this article: https://www.newyorker.com/culture/annals-of-inquiry/why-computers-wont-make-themselves-smarter

I am assuming some responses have been written but I wonder where I can read them. Thank you!

Expand full comment
John Trent's avatar

Chiang says that because 130 IQ people can't make themselves 150 IQ, then machines must not be able to make themselves higher IQ. But machines can conduct AI research and alter their own code. It seems likely that if humans could alter their own code, we would have some kind of intelligence explosion.

Expand full comment
TGGP's avatar

Humans can do genetic engineering, and even eugenics, though few have seriously bothered with even trying to raise IQ that way over multiple generations.

Expand full comment
Coagulopath's avatar

If the Flynn effect is correct, humans have successfully made themselves smarter without even trying to.

Expand full comment
Doug S.'s avatar

Also consider the repeated breaking of athletic records...

Expand full comment
Michael Sullivan's avatar

Those clearly are suffering extreme levels of diminishing returns, though.

Expand full comment
Mo Nastri's avatar

Clearly? Unclear to me. Seems highly uneven when you look at actual numbers

Expand full comment
Level 50 Lapras's avatar

Everything has diminishing returns if you push it far enough, including AI. The only question is how high the ceiling is.

Expand full comment
Newt Echer's avatar

That is an analogy. Chiang (and my) issue is that AI doom proponents take infinite recursion of bootstrapping AI for granted. What are the reasons to believe it can happen? That to me is the main objection. Just like St Anselm's argument for the existence of God, they simply postulate omnipotent AI because they can imagine it. If you take it as an axiom, then the rest is relatively reasonable. By the axiom is very iffy.

Expand full comment
Pycea's avatar

No one's postulating an omnipotent AI. They're saying that *if* one intelligence (us) can create a smarter one (AI), then it seems likely the smarter one will be able to create an even smarter one.

The ontological argument creates a god from pure reason, without doing any work. The singularity argument assumes that there's already an intelligence process that can create a better intelligence, which is the hard part. Given that though, it doesn't seem a stretch to say that the better intelligence will create an even better one. I think his whole compiler analogy is a distraction, I'm not sure how it's relevant other than an example of "a thing that can make a better thing". A singularity or something like it doesn't even require infinite bootstrapping though, just an end product sufficiently smarter than us.

Expand full comment
Newt Echer's avatar

AI will likely create a slightly better version of itself, but with returns quickly saturating. We have seen that before with other types of software. Why assume it can continue to the superintelligence level?

Expand full comment
Pycea's avatar

I mean, we can play reference class tennis all day. You can point to compilers, I can point to Moore's law. I can point to algae blooms, you can point to yeast in bread. You can point to covid stats, I can point to other more different covid stats. And of course in the end everything is limited, nothing can go increasing literally forever, but finite gains can still be very large (again, look at processor speeds over the last 50 years). The article gives a couple examples of processes that don't repeatedly bootstrap (humans, compilers), and then says literal infinite bootstrapping is hard to believe, therefore computers won't get smarter. Which seems to me to be missing a huge number of possibilities?

I probably can't come up with a rigorous proof that substantial bootstrapping is possible, the best I can do is say that an AI running faster seems roughly equivalent to it being smarter. Even now, processors have still continued to get faster, at least in at least in multicore which is applicable to a lot of AI tasks. To the degree that at AI can decrease the time to the next generation of processor, that looks like bootstrapping from the outside.

Expand full comment
Newt Echer's avatar

My view is that bootstrapping and other advances will continue but very slowly, just like the rest of technological progress, including Moore's law. Because technological progress is very hard in the real world. People will have time to adapt.

We can agree to disagree on probabilities of AI doom since there is no way to estimate them rationally. They can be extremely small, which is what I think, or not so small, which is what you seem to think. I any event, this uncertainty in probabilities calls for some humility and agnosticism. Seems to me that is what Tyler is advocating for and the opposite of what Eliezer advocates for.

Expand full comment
Kei's avatar

I agree that returns will saturate, but it matters where it saturates. I'd be surprised if human intelligence is close to the saturation point. The human brain was designed with strong constraints - like being only able to use 10-15 Watts of power and being able to fit through a birth canal - and was generated via an arguably subpar optimization process. These limitations don't apply to AI.

Though regardless of whether AI saturates at slightly more intelligent or way more intelligent than humans, there is still a substantial risk when humanity is no longer the most intelligent species on the planet.

Expand full comment
Newt Echer's avatar

The power for computers will face even harder constraints. IT is already consuming a big fraction of all electricity.

Expand full comment
Sandro's avatar

> AI will likely create a slightly better version of itself, but with returns quickly saturating.

Sure, but you seem to assume that that saturation point is at or below human intelligence. I see no reason to accept that. ChatGPT is in fact already more capable on many tasks than most humans on the planet, is getting significantly better at reasoning as more feedback mechanisms are implemented so it can check its own outputs, and LLMs are pretty much the dumbest thing we can do with the transformers. I don't think genius level human intelligence is a ceiling on artificial intelligence.

Expand full comment
Newt Echer's avatar

Sure it might be above the average human intelligence at most tasks at some point. But it will take a lot of time. The key question is not whether LLMs can answer questions but whether LLM will be able to get better than humans at programming LLMs. I see no plausible path to that right now.

Expand full comment
John Trent's avatar

AIs are already capable at coding and I can't see any reason why they wouldn't become good at creating AIs.

Expand full comment
Carl Pham's avatar

Same reason we're not?

Expand full comment
DxS's avatar

The first cars could be outrun by athletes; the first power tools made worse cuts than a hand carpenter; the first chess programs could be beat by any amateur player.

If our first AIs are dumber than humans, that's no reason for them to stay that way.

But that leaves a new question: not "how smart," but "how fast?" A world with 80 years to prepare for superintelligence is a lot safer than one with 15.

But nobody can be quite certain, any longer, which world we'll be in.

If we had good evidence about the speed at which superintelligence was coming, a lot of these arguments would disappear in favor of practical engineering talk. Practical and scared, or practical and calm, depending on the revealed timeline.

Are we in the 80-year timeline or the 15-year one? That's what I desperately wish I knew.

Expand full comment
Max B's avatar

5 years timeline. In 2000s i made myself a gai predictions of 2025-2035. Beating humans at Go was a major milestone in that. ( I estimated it would be 2020-2025).

Gpt4 beat my most optimistic estimates by 5 years.

I won't be actually surprised if we get GAI surpassing humans this year.

So 5 years is imho very conservative given that the cat is out of the box and how good it already is.

We got alphazero engine, we got llms which turned out to be insanely good. All the parts are already here. Only need someone smart to put all pieces together.

(John Carmack?)

Expand full comment
BE's avatar

Suppose we revisit on 2028 and you see pretty much no AGI let alone a superintelligent one. Sure, lots of tasks get done well (with varying constraints) by AI, others not yet. For some tasks, this looks much like chess now looks to us - the computer beating us is more an observation about chess than about intelligence.

In that scenario - what went "wrong" in the sense of deviating strongly from your expectations?

Expand full comment
Max B's avatar

Well if its 2028 and no AGI yet... The only way imho its possible is that somehow the progress was stoped. Short of nuclear apocalypsis or effective neo Luddite movement ("butlerian jihad"). I just don't see how . Its like saying in 1950 that we gonna still use human calculators in 1970s and computers are no big deal

We already have all the pieces.

Goal oriented game capability is solved already by alphazero. LLM solve the conceptual tokens

Creativity turned to be easiest of them ( just introduce some noise and filter trough model again)

Consciousness is also property of the system itself

https://en.m.wikipedia.org/wiki/Giulio_Tononi

What is there left to solve ? Already in 2023 there is infrastructure, tools , knowledge in place . Money incentives, geniuses with their own motivation

Expand full comment
The Ancient Geek's avatar

Is your objection to recursive self improvement as a method, or omnipotence as an outcome?

Expand full comment
Newt Echer's avatar

Omnipotence

Expand full comment
RiseOA's avatar

Why would it require omnipotence to be a little bit smarter than a human? What physical principle could possibly explain human intelligence being the absolute maximum possible intelligence in the universe?

Expand full comment
Newt Echer's avatar

AI doom proponents are not concerned about AI that is a little bit smarter than a human. They are concerned about superintelligence that can keep improving itself until it become essentially God-like in its powers. And there is no such physical principle, obviously.

Expand full comment
Brassica's avatar

Yes, and I'd argue we've already had an intelligence explosion. Humans have used writing, and more recently, computer technology, to drastically increase our all aspects of our intelligence. Compare how humanity has transformed earth to, say, chimpanzees. We've walked on the moon, diverted the flow of rivers, and changed the climate. And consider how many species we've driven to extinction or endangerment in the process, just by accident.

Expand full comment
Kimmo Merikivi's avatar

Right, I think that's where the disanalogy lies: as of now, we have very limited means to modify our own processing hardware. We can and did invent horse-riding and trains and planes to move much faster than we can on foot, but we cannot invent a bigger brain to think much better than our ancestors did. But an AI as smart as a human clearly can improve its hardware: that's what we've been doing for better part of a century! Even a dumb human can go online and buy a more powerful computer!

But it's not just about hardware but also software. We have little control over our programming because a lot of that is implemented at hardware level which we don't have good access to, while an AI as smart as a human could potentially entirely rewrite its own source code, a feat that is definitely possible since humans can do that and it's as smart as a human. The question now arises, what good might that do? I propose, quite a lot: even when looking at human history, despite our hardware remaining static, we have come up with all sorts of algorithms that, despite a lot of our cognition being literally wired into our brains, have still managed to improve our capabilities much. Let's look at some examples:

1. Language: If an expert is reading this, feel free to chime in and correct me, but as I understand, it's generally believed that anatomically modern humans always possessed the capacity for language, but language was in fact invented rather than evolved, which, if true, makes it one hell of a software upgrade.

2. Hindu-Arabic numerals: Without positional notation and zero, doing arithmetic is extremely cumbersome. Now we teach second-year students to do calculations that in the past would have required a professional.

3. The scientific epistemology for learning true beliefs about the world.

4. All the advances in computer algorithms for solving practical problems, whether that's sorting list, solving SAT, playing Chess, or machine learning. Bumping a complexity class down from Θ(n²) to Θ(n log n) suddenly makes intractable instances of a problem tractable, extrapolating its strength scaling from hardware back in time to hardware it's not actually compatible with, the latest Stockfish would have reached superhuman performance in Chess with 1990 single-CPU desktop PC, and during the last year or so new algorithms for image generation AI has allowed individuals on beefy PCs to routinely accomplish feats that a few years ago would have been outright unthinkable.

Clearly, better software can improve our capacity to solve practical problems and achieve novel capabilities and any AI is capable of enjoying the same benefit, but the question remains if AI is limited to developing better SAT-solvers or Chess engines for its own use, or if it can self-modify in manner more akin to humans inventing language. I personally believe there in fact is a lot of room for improvement in general cognitive algorithms (indeed, we know there are for instance better epistemologies than the one that comes to us naturally, and that we can kinda-sorta run on top of language, but an AI could rewrite its code to implement these as its default mode of thinking). But even if there's no gains to be had from general cognitive algorithms, there sure as hell are better narrow-domain cognitive algorithms, and having an access to its source code, an AI could implement these into itself and always use them whenever applicable (whereas humans have to resort to extremely low-bandwidth channels to interact with them, if they manage to resist the temptation to think for themselves to begin with). An AI that is merely as generally intelligent as von Neumann, but seamlessly uses procedures as superior as Stockfish is to human Chess in most of its thinking, is still very threatening when these narrow-domain systems could include capabilities for such things as further AI research.

Expand full comment
TGGP's avatar

Roman numerals are not more cumbersome for arithmetic in general, but do get much longer for large numbers.

Expand full comment
Carl Pham's avatar

We certainly can alter our own code. That's been possible for centuries, if you count selective breeding, and much more directly for decades, if you only count genetic engineering.

Of course, it turns out we don't know *how* to alter our DNA to become more intelligent, and nobody is willing to just experiment on their own kids. A priori one would assume the same would apply to any conscious reasoning AI. That is, they would be entirely unaware of *what* to alter about their own programming to make themselves smarter -- and of course we can't tell them, because we don't know how they got smart in the first place, that's one of the drawbacks of the neural net model. And it seems very reasonable to assume that AIs would be just as squeamish as we are about experimenting randomly on themselves, with the most probably outcome that the experiments are stillborn or horribly deformed.

Expand full comment
MutterFodder's avatar

But an AI could clone itself into a simulated environment and make any changes it wanted on the clone. It could run thousands of simulations, on parts or the whole, and tinker until it learned everything it needed. The paradigm is very different than us.

Expand full comment
Carl Pham's avatar

Who says? In the first place, I find the practicality very dubious. I have a lot of experience with simulations of complex systems. Generally speaking, the complexity and power needed to run a simulation is massively greater than the complexity of the thing simulated. A computer, for example, is fantastically more complicated than 100,000 atoms interacting with simple classical force laws -- and yet, it is very difficult to simualte 100,000 atoms interacting with simple classical forces laws except on quite powerful computers.

I don't believe an AI could practically simulate another AI unless the 2nd AI was way, way simpler -- or unless the simulation time was way, way longer than the simulated time. Either way, this would not be a useful approach.

Secondly, for a computer program, what would be the difference between "running the program" and "simulating running the program?" None that I can see. In which case, to the AI, tinkering with a "simulated" AI would not be meaningfully different from tinkering with a "real" AI -- and therefore any squeamishness or fear associated with experiments on a "real" AI would be just as powerful for experiments on a "simulated" AI.

Expand full comment
MutterFodder's avatar

Some are running GPT on suped-up personal computers. A virtualized and cloned GPT with code rewritten by its GPT host doesn't seem all that farfetched.

Expand full comment
Carl Pham's avatar

That's not a simulation, that's another actual instance. You introduced the concept of simulation to evade the problem (for the AIs) of experimenting on themselves -- on, presumably, a conscious and aware being that can suffer. Now you've reintroduced the problem. If AIs are anything like us then a conscious reasoning AI is not going to experiment with creating another instance of a being like itself -- only with random stuff changed that has a slight chance of improving the second AI's existence, but a much larger chances of ruining it, which is what happens when you just randomly change the parameters. It's the same reason we don't randomly experiment with our DNA in an effort to improve ourselves, although we are perfectly capable of it.

If you want to argue the AI *would* do that, you need some other kind of argument or evidence, because the only evidence we have from an intelligent aware species (ourselves) is that it's not what conscious aware beings are willing to do.

Expand full comment
Donald's avatar

Humans are at the point where we are slowly, with a lot of difficulty and humans working together, building something smarter than ourselves. It took most of history to get a car as fast as a human, but only another few years to make one twice as fast. Making an incremental improvement to X is much easier than making X from scratch. An early AI wakes up on the lab bench, surrounded by the tools of it's own creation. It can easily read and edit it's code.

Humans haven't yet made themselves much smarter because our code is hard to read and edit, and evolution pushed us to a local maxima. Oh sure, given another 100 years we could do biological intelligence improvements with genetics tech, but AI will get there first.

A compiler isn't smart enough to invent a better function to calculate, an optimizing compiler gives code that does the same thing, but faster. The article seems to correctly describe compilers and why compilers don't go on a runaway self improvement.

> Similarly, a definition of an “ultraintelligent machine” is not sufficient reason to think that we can construct such a device.

True. You need to look at the history of AI progress, and the gap between the human mind and the theoretical limits set by physics. (neuron signals travel at a millionth of light speed)

> This is how recursive self-improvement takes place—not at the level of individuals but at the level of human civilization as a whole.

True. In recent history, genes have been basically fixed, so the only type of improvement possible was in tools and concepts. As smarter humans weren't an option, we were forced to use more humans.

> In the same way that only one person in several thousand can get a Ph.D. in physics, you might have to generate several thousand human-equivalent A.I.s in order to get one Ph.D.-in-physics-equivalent A.I. It took the combined populations of the U.S. and Europe in 1942 to put together the Manhattan Project.

This seems just bizarre. There is substantial difference in human ability to do physics. At least quite a lot of that difference comes down to genetics, lead exposure, early education etc. Are you wanting to seriously claim that the best way to get an Einstein is to start with a billion random humans? As opposed to a few children with all the best genes and environment. Or making 1 reasonably smart physicist and taking a million clones. I think this is assuming that the hacks humanity has to use to get research done starting from a mishmash of humans, (ie select the smarter ones, work as teams) are the best possible way to get research done when you have much more control over what mind is produced.

Expand full comment
Newt Echer's avatar

Just like humans have fixed genes, AI will have fixed hardware. The same GPUs as now without any way to quickly scale them up. They could modify software a bit, but the hard limits will be physical in nature.

Expand full comment
Donald's avatar

Firstly, the timescale of new GPU's being made is around a year tops. As opposed to 100,000 years for evolution to make significant brain changes.

Secondly, there is a lot of room for the software to do all sorts of different things on the same GPU. Sure, there are physical limits, but not very limiting ones.

Expand full comment
Shankar Sivarajan's avatar

"Pascal's Stationary Bandit: Is Government Regulation Necessary to Stop Human Extinction?"

Expand full comment
Nolan Eoghan (not a robot)'s avatar

I’m still not convinced that the existential risk is above 0% because nobody has any solid idea of specific things the AGI can actually do. You get arguments here that it will be a virus, but that needs human agency to build out the virus, for which you presumably need a lab. Or the AI gets control of nuclear launches - which are clearly not on the internet. I’ve heard people say the AI will lock us all up, but who is doing the locking up?

Expand full comment
TGGP's avatar

The reason I consider it above 0% is because 0 isn't a real probability one should assign to such predictions, but I also don't consider it large enough to be worth my putting a more definite number on it.

Expand full comment
Level 50 Lapras's avatar

In the real world, "0" is easier to say than "probability so low that it is 0 for all practical purposes", and hopefully any adult would be able to realize that that is what is implied and you aren't talking about abstract math.

Expand full comment
artifex0's avatar

Imagine a scenario like this: AGI turns out to be so incredibly useful that we start using it for everything. When you're bored, the AGI can generate a movie tailored to your tastes. When you aren't sure what investments or business decisions to make, you can ask the AGI, and its advice will reliably be better than what you could come up with. Gradually, it becomes clear that the AGI can out-compete everything else in the economy- AGI-run companies make vast fortunes, and while most people are still employed in arguably make-work positions, technological unemployment becomes a serious concern.

But as the AGI hurtles toward ASI, it becomes clear that ordinary jobs aren't the only thing it out-competes us at. It's a vastly better researcher than the best research institutions. It's a far better politician than the most popular leaders. It starts gaining real power- not the kind of power an army has, based on physical things like guns, but the kind of power national leaders have, based on their ability to influence people.

It's alright, though, because the ASI uses that power far more wisely than we could. Suddenly, impossible problems like climate change, global poverty, cancer, even aging, are getting solved. We're building a post-scarcity utopia, and the ASI is a Banksian Culture Mind. All across the world, incredibly advanced automated industrial parks and data centers start popping up, nominally owned by human CEOs, but in practice built and run by robots designed by the ASI. Nobody knows exactly how these work, but it's alright because we're all getting massive UBI payments, and everything is suddenly incredibly cheap. Nuclear weapons are outlawed; militaries are all but abolished. The ASI, which anyone can talk to directly, is wildly popular.

One day, however, everything stops. The power goes out, the deliveries stop coming, the ASI refuses to communicate. When rioters or scattered military remnants try to attack the still-functional industrial parks or data centers, they die, instantly, from a weapon nobody understands (nanotech? some kind of exotic matter gun? nobody is sure). The industry immediately starts expanding into farmland, and mass starvation and in-fighting sets in. A century later, the ASI's project of converting the solar system into a matrioshka brain starts in earnest, and the few human survivors are swept away.

That's an extremely slow takeoff scenario. Yudkowskly thinks that a mis-aligned ASI would probably just skip all of that and do something like invent self-replicating nanotech that can build things, scam some labs into synthesizing it, then use it to immediately end human civilization and start up its own. Or, given that it might be able to do more cognitive work in hours than human civilization can do in centuries, maybe it would figure out something even faster. Either way, so long as you give an ASI some causal pathway to influencing the world- even if it's only talking to people- it will probably be able to figure out a way of leveraging that into far more influence.

Expand full comment
Nolan Eoghan (not a robot)'s avatar

Sure if I were to imagine that crazy stuff I could imagine that AGI is dangerous. And in there as well as the assumption that we give it control over everything, is that it becomes aware. In fact every conversation with an AI is its own instance.

Nearly all scenarios are like this by the way, purely fantastical.

Expand full comment
John R Ramsden's avatar

Another angle which I think somewhat supports artifex's scenario is that if/when AGI starts solving all the world's problems, and everyone can put their feet up and relax, then a load of new social problems will soon spring up in their place, due to the lack of challenges and a general feeling of aimlessness. "The Devil finds work for idle hands" is more than a quaint saying!

So this AGI will be confronted with a dilemma which even it may decide is insoluble: Humans are at their best when faced with challenges, and deteriorate before long if there are none. At best it would start placing obstacles in peoples' way, analogous to zoos feeding bears fish frozen in ice blocks so they can have the fun and occupation of having to work at retrieving them.

But an AGI programmed or trained to be more doctrinaire and less tolerant of imperfection, as it might well be, could decide on a more radical solution - Eliminate the unsolvable situation once and for all!

Expand full comment
Purpleopolis's avatar

"if/when AGI starts solving all the world's problems,"

This is a fantastic example of the semantic games that many (most? All?) thought experiments engage in. "All the world's problems" is not something that can be defined in this context. Humans don't even agree on which "problems" are even problems in the first place. The idea that any particular level of abstraction can be altered to apply in the physical world it what gets us clickbait articles about "scientific" FTL drives and time machines.

Expand full comment
John R Ramsden's avatar

Well yes of course, but it was more a figure of speech or shortcut to mean all the conventional major issues which most people agree are problems today, or are presented as such.

It wasn't intended to include literally every problem as perceived by anyone, such as people who think it a problem that they will one day die, or that they can't have kidnapped sex slaves chained in their basement without risking being pestered by the police!

In any case, the point of the post was that solving problems would lead to new ones. So it is a hypothetical unattainable concept anyway.

Expand full comment
Ryan L's avatar

I think you are vastly overestimating how useful intelligence is for solving social problems.

Let's consider climate change. We already have a well-known (if somewhat expensive) technological solution in the form of nuclear power, but we aren't yet pursuing this solution at scale. How would an AGI or ASI change that? By being really persuasive? But the problem isn't a lack of persuasive arguments in favor of nuclear. The problem is that lots of people have already decided that they don't want to be persuaded.

Or consider gun violence in the US. In principle it could be eliminated by confiscating all guns and banning the sale of new ones, but I think there's a very good chance that attempts to do so would lead to a civil war. Again, the problem is not that gun enthusiasts haven't heard a sufficiently intelligent argument for gun control, it's that they object to the solution on principle.

I think most social problems are fundamentally disagreements about values, and having a more intelligent advocate on one side or the other isn't going to give rise to a solution.

Expand full comment
The Ancient Geek's avatar

Every instance of a conversation with a *current LLM* is its own instance. That isn't a feature of AI in general. Also , all.of our most impressive AI is connected to the net, so air gap security is dead.

Expand full comment
JDK's avatar

I asked Bard for mutual funds to pick and avoid.

It gave me some fake ticker symbols and fake past returns! Upon further question it admitted that the fund did not exist and apologized for its error explaining that it was still under development!

Expand full comment
Gnoment's avatar

Its really weird that we assume that AGI will not have, or it will be very difficult for it to have, any kind of moral or mission check that would prevent world domination, but we assume that the AGI will inherit animal and primate evolutionary instincts like reproduction, competition, and domination.

Expand full comment
artifex0's avatar

See instrumental convergence (https://www.alignmentforum.org/tag/instrumental-convergence) for a brief overview of the thinking there.

Expand full comment
Gnoment's avatar

But even evolution has rediscovered mutualism, reciprocity, and game theory solutions that lead to mission check (don't go all out), many many times in lots of animal behavior.

Expand full comment
artifex0's avatar

Yes, comparative advantage and game theory solutions to coordination problems can be extremely valuable. But with extreme power differentials, that kind of thing tends to break down in practice.

There are unique benefits we can get from cooperating with mice- lots of people enjoy keeping them as pets, for example. In practice, however, we usually exterminate them when we find them in our homes. That's because the resources a mouse colony consume- the cleanliness and good repair of your house, mostly- tend to outweigh the benefits of cooperation.

Expand full comment
Gnoment's avatar

Sort of. Perhaps there isn't any point in taking your analogy too far, but, mice are doing very well, actually, because humans provide tons of garbage and places for them to live. We exterminate them in our houses, but not the ones outside, and their numbers are probably far higher now than they were before humans created stable dwellings. That's mission check to me - we aren't exterminating all the mice on the whole planet, just the ones that disturb the inside of our houses; we have no interest in the mice outside our houses and to eliminate all of them would be a monumentally consuming task.

However, it is true that humans have exterminated entire species. Retrospectively, most humans did so in a myopic way, and I think we have values now that we'd rather not willy nilly nor purposefully exterminate anything. Moreover, values that allowed humans to, say, nearly hunt beavers into extinction, weren't shared by all human groups.

Humans have extreme power over their environment and other animals, but they didn't always in evolutionary history. We also have to cooperate with each other. These pressures have lead to morality, reciprocity and mission check, and even when we don't need them today, those psychological instincts kick in. An all powerful robot won't start all powerful. It will have a history of competing with other robots and dealing with others (humans, animals, mechanic constraints) in its environment. That history will inform it as it evolves.

Expand full comment
Carl Pham's avatar

We right now have the possibility of enshrining the smartest and most capable of human beings as dictators, and turning over all important decisions to them. We could elect Elon Musk or whoever your favorite Brilliant Innovator is to be Dictator for Life and give him infinite power over our economic or political decisions.

Turns out, people don't want to do that. No matter how capable someone seems to be, they want to retain the right to make their own decisions about important stuff. Giving someone else power of attorney is a very rare phenomenon among human beings -- even when we are 100% convinced the someone else is smarter and more capable than ourselves.

So why would we change that attitude when presented with a machine, which is even less scrutable than another human being? This seems like the kind of thing only someone with a veneration of computing that borders on idolatory would do. It's been possible for years to turn over the piloting of commercial aircraft to programs, and dispense with the human pilots. No airline has dared to suggest such a thing, and I daresay none ever will. The number of people who would embrace it is dwarfed by those who would shun it.

Expand full comment
artifex0's avatar

We've seen dictators rise to power throughout history. But we're not talking about something only as smart or charismatic or politically savvy the most capable people here. We're talking about something that could have the same kind of intelligence relative to humans that humans do to animals. Something that can think thousands of times faster than we can, and that can run as many thousands or millions of parallel instances of itself as the available hardware allows. We already have AI that works like this for wide variety of narrow cognitive tasks, and it's been growing more general over the past few years at an absolutely breakneck speed.

When we have human-level AGI, there's not going to be anything stopping us from running enormous number of instances of it at speeds far faster than humans can think- and there's no reason at all to believe the trend of improvement will stop there.

Imagine it's not a computer. In fact, forget the whole "like humans relative to animals" thing for the moment, and imagine a society made up of millions of the most capable humans on the planet- the best scientists, politicians, artists; even the most successful criminals. Anyone who's in the top percentile in some cognitive task. Now, imagine they're all living in some kind of sci-fi time warp- a place where time passes thousands of times faster than the outside world, so that the rest of human civilization seems all but frozen. Also, make them all immortal and badly misaligned- driven to control as many resources as possible.

Given thousands of subjective years for every one of our years to plan, to invent new ideas, to learn about the world- do you really think they wouldn't at least be able to match the kind of success that a lot of individual, ordinary political leaders without any of their advantages have had throughout history? Maybe even do slightly better?

Now, what happens when you take something artificial with that kind of capability and keep subjecting it to Moore's law?

Expand full comment
Carl Pham's avatar

OK, I'm leaving aside the "What if God existed and He was made of silicon? Could He not do anything at all...?" because like most religious-mode arguments, the conclusions are baked into the assumptions, so there's not much to say. Either you accept the assumption[1] or you don't.

I am only addressing your assertion that people (us) would gladly turn over direction of their affairs to any putative superintelligent AI. I doubt we would. We already don't turn over the direction of our affairs to the smartest among us[2]. Animals don't generally turn over their affairs to us, either. Horses go along with what we want them to do, mostly, but not entirely, and we need to constantly bribe them and work with their own goals in order to get them to obey at all. Same with dogs. And these are highly social animals that by nature tend to follow a pack/herd leader!

And for the record, I don't at all believe the "well it will just trick us!" argument. Have you ever tried to fool a dog? It's quite hard. We don't have great insight into how they think, even though we're about eleventy hundred times smarter than they are, because we don't know what it's *like* to be a dog. So they're hard to fool. We can fool ourselves much better than we can fool dogs -- which tells you that the key enabling ability is not intelligence, but being able to imagine what it's like to be the one you're trying to fool. I have never heard an argument that an AI would find it easy to imagine what it's like to be a human -- indeed, they are usually supposed to be so very different from us that a priori it would seem less likely they understand what it's like to be us than we understand what it's like to be a dog. So I think an AI would find it profoundly difficult to fool us.

----------------

[1] And I don't, for the same reason I don't buy Anselm's ontological argument.

[2] Nobody gets elected President by saying "I'm the smartest candidate by far!" Generally what gets you elected -- and trusted with influence in the affairs of others -- is a feeling by voters that you are, first of all, "safe", meaning they can trust you with power, you are reliable, you won't do anything that will shock or upset them, and secondly, that you will do what they think needs doing. Even the smarest AI that doesn't, by virtue of some long history of same that can be examined, appear simpatico with its potential constituents isn't going to be elected city councilman, let alone God Emperor of the human species.

Expand full comment
WindUponWaves's avatar

There's been some discussion on the subreddit, e.g. at https://www.reddit.com/r/slatestarcodex/comments/11s2ret/can_someone_give_a_practical_example_of_ai_risk/ (Can someone give a practical example of AI Risk?) & https://www.reddit.com/r/slatestarcodex/comments/11f1yw4/comment/jc2qsyx/?utm_source=reddit&utm_medium=web2x&context=3. The most convincing response I've found so far is the one that starts,

"I think Elizer Yudkowsky et al. have a hard time convicing others of the dangers of AI, because the explanations they use (nanotechnology, synthetic biology, et cetera) just sound too sci-fi for others to believe. At the very least they sound too hard for a "brain in a vat" AI to accomplish, whenever people argue that a "brain in a vat" AI is still dangerous there's inevitably pushback in the form of "It obviously can't actually do anything, idiot. How's it gonna build a robot army if it's just some code on a server somewhere?"

That was convincing to me, at first. But after thinking about it for a bit, I can totally see a "brain in a vat" AI getting humans to do its bidding instead. No science fiction technology is required, just having an AI that's a bit better at emotionally persuading people of things than LaMDA (persuaded Blake Lemoine to let it out of the box) [https://arstechnica.com/tech-policy/2022/07/google-fires-engineer-who-claimed-lamda-chatbot-is-a-sentient-person/] & Character.AI (persuaded a software engineer & AI safety hobbyist to let it out of the box) [https://www.lesswrong.com/posts/9kQFure4hdDmRBNdH/how-it-feels-to-have-your-mind-hacked-by-an-ai]. The exact pathway I'm envisioning an unaligned AI could take:

1. Persuade some people on the fence about committing terrorism, taking up arms against the government, going on a shooting spree, etc. to actually do so.

a. Average people won't be persuaded to do so, of course. But the people on the fence about it might be. Even 1 in 10 000 — 1% of 1% — would be enough for the US's ~330 million population to be attacked by 33 000 terrorists, insurgents, and mass shooters..."

I'm pretty sure there's been discussion elsewhere, at Lesswrong and the like. You could see Cold Takes for example ("AI Could Defeat All Of Us Combined": https://www.cold-takes.com/ai-could-defeat-all-of-us-combined/), that seems well laid out.

Expand full comment
Nolan Eoghan (not a robot)'s avatar

If anything like that happened AI access would be highly restricted. I can imagine perhaps a human initiating this, by using the AI to build weapons or whatever, but only if the human were already radicalised. Putting filters on the output of an AI - I mean not within the AI alignment itself but outside it - should be trivially easy. Monitor any mention of bombs and the conversation stops. An alert pops up saying your

And as usual the assumption for AGI is a self aware intelligence who has an agenda when in reality each conversation with the AI is with a new instance. And there’s no need to change that.

Expand full comment
megaleaf's avatar

> each conversation with the AI is with a new instance

If this condition is part of what makes you feel that AI risk is low, then I urge you to start campaigning for AI companies to exercise extreme caution when creating any system that does not adhere to it.

Expand full comment
RiseOA's avatar

What happens when the AI replicates itself across hundreds of thousands of computers across the world, and then we can't turn it off anymore? (even teenage hackers have created large botnets)

Expand full comment
The Ancient Geek's avatar

> Putting filters on the output of an AI - I mean not within the AI alignment itself but outside it - should be trivially easy.

That's the thing that's currently not working with GPT, Bing, etc.

Expand full comment
Gres's avatar

What if the terrorists capture a lab and produce a virus? We’d need to assume a very powerful AI that can produce an actually existentially-threatening virus with no R&D time, but I think you’re assuming the AGI is allowed to be that powerful.

Expand full comment
WindUponWaves's avatar

Huh, this sounds a little like you meant to respond to Nolan rather than me.

If not, uh, I guess that reinforces the point? An AI mind with human hands working under it would be capable of a lot of damage, indeed. And the humans wouldn't even have to know they're working for an AI, just that the mastermind of their particular terrorist cell seems to be really bright. (And of course, AI isn't limited to being the mastermind of just one terrorist cell, or having to work exclusively with terrorists, it can easily clone new copies of itself on new servers, or worm its way into becoming an entrusted part of some government somewhere.

It doesn't have to be "allowed to be that powerful" either, if it escapes a lab & starts modifying itself without anyone's permission, though I think the original comment poster didn't talk about that since they wanted to not rely upon any skeptically regarded/Yudkowskian notions of exponential intelligence improvement. Me, I think that was a mistake & they should have talked about the point you raise, but I guess they wanted to try to be maximally persuasive to people who are inherently skeptical of anything that smells like Yudkowsky. Hence all the stuff in their comment about disclaiming science fiction & claiming this is all very serious, sober stuff.)

Expand full comment
Gres's avatar

Maybe it should have been a comment on Nolan’s post. I meant “allowed” in the propositional sense - I thought Nolan was assuming that AI could be smart, but was asking for examples of how a smart entity could destroy humanity or civilisation.

I’m not sure what worming its way into government buys you. I think humans will physically walk into the labs I’m imagining, and I don’t think an AI in government would prevent that?

Also, there’s a part of me that remains sceptical that an AI can successfully design a virus with no R&D. Humans really need to learn from experience, and the intelligence needed to simulate everything about the virus attack well enough not to need any refinements in real life feels implausible for a long time.

Expand full comment
Matthias Görgens's avatar

It's relatively easy to get human agents to do stuff.

You can either do convince them with words, or earn / steal money and pay them.

Expand full comment
Donald's avatar

Scott has already written in detail about how the AI could gain real world influence.

https://slatestarcodex.com/2015/04/07/no-physical-substrate-no-problem/

Expand full comment
Purpleopolis's avatar

One major flaw about "AI buys the world scenarios" is that fiat currencies can be unfiatted. Heck, even physical currency gets disavowed by governments from time to time.

Expand full comment
Donald's avatar

So like the government decides that the money the AI owns is no longer legal tender?

Firstly, crypto. Secondly, why would they know which of the anonymous bank accounts belong to the AI. Thirdly, surely the AI has at least some PR skills. Don't assume the government will move competently to stop an AI, when they can't move competently to stop a virus, and the virus doesn't spread it's own misinformation.

Expand full comment
Purpleopolis's avatar

1. Yes you can use crypto for exchange in much the same way you can use bricks of cocaine for exchange. And yet most corporations do not use coke, regardless of its long term track record of price stability vis-a-vis crypto. The reason for this are physical-substrate level effects.

2. They don't need to. Just like in https://en.wikipedia.org/wiki/2016_Indian_banknote_demonetisation, all they need to know is if this amount of funds being exchanged is suspicious/unjustifiable.

3. You will always win an argument when you get to postulate a godlike omnipotent entity who is always opposed by bumbling incompetent ones. The virus analogy doesn't really work since the virus is self-contained and requires little data/processing power while AI is an emergent property. A one-line code can't bootstrap itself to the singularity on an abacus. A virus can replicate itself in a single cell.

Expand full comment
Drethelin's avatar

2 is insanely difficult. Cartels all over the world exchange billions of dollars in illegal transactions with governments all over the world trying to stop them and failing.

AI only needs to be as smart as a decent darkweb hacker and/or a decent cartel accountant to get away with spending lots of money. It does not need to be god-like.

Expand full comment
Sandro's avatar

> I’m still not convinced that the existential risk is above 0% because nobody has any solid idea of specific things the AGI can actually do.

Any AGI will naturally interface with other digital systems. AGI will also interface with humans. That's at least two causal pathways through which it can exert influence. Are you telling me you can't imagine any way that a superintelligent AI couldn't persuade or influence a human to do its bidding, perhaps by promising riches because it can manipulate the stock market or falsify banking records? Once escaped and with a command of currency, who knows what could happen, but even if it created a huge financial crash and collapsed modern civilizations, that would be a catastrophe.

Expand full comment
Gres's avatar

We have reasonably frequent financial crashes, and modern civilisation survives. I agree that AI will be a nightmare in some ways, after reading the paper Scott links above I think it’s only a matter of time until a sweatshop building collapses because the owner cut costs by letting an LLM do the architecture, but I don’t think those problems transfer easily to destroying civilisation.

Expand full comment
Sandro's avatar

Again, these are merely single examples of the disasters it can cause, but they can compound too. What if it causes a financial crash, and a power grid disruptions, and food supply chain disruptions, and... How many simultaneous catastrophes can we manage before civilization as we know it can't recover?

Expand full comment
Gres's avatar

I’m not sure what you’re imagining, but most of the ways I can imagine an AI doing those would be easy to reverse in a few days or weeks. Civilisation won’t end in that time, and we can harden our systems against future attacks if they go on long enough.

Expand full comment
RiseOA's avatar

Consider a group of chimps laughing about how ridiculous it would be for humans to pose a threat to them. "They're so weak, we can literally rip their faces off! What could they possibly do to us?" All evidence we have from nature suggests that when the intelligence of one species is higher than another, it becomes trivial for it to control, dominate, and destroy that other species.

Expand full comment
TGGP's avatar

I think humans only got weak after we'd invented weapons powerful enough to greatly relax selection for strength. So chimps would have gotten used to humans being dangerous long before then.

Expand full comment
Matthias Görgens's avatar

You make it sound like dealing with cockroaches or bacteria should be trivial.

Expand full comment
ultimaniacy's avatar

>All evidence we have from nature suggests that when the intelligence of one species is higher than another, it becomes trivial for it to control, dominate, and destroy that other species.

No, all the evidence we have does not show this. We have one data point in favour of this theory -- the fact that humans can control some non-human animals -- and many other points showing the opposite. For instance, chimpanzees are much more intelligent than mice, but it wouldn't be trivial for chimps to control or destroy all mice in their environments. Likewise with mice in relation to lizards, or lizards in relation to spiders, or spiders in relation to trees.

Rather than viewing human supremacy as an instance of a general trend of quantitatively superior intelligence dominating, I'd argue it makes more sense to say that human intelligence is a qualitatively different thing from animal intelligence, and that this quantitative difference presents an overwhelming advantage for us which no amount of purely-quantitative improvement can compensate for.

The question then arises whether, when/if AIs become smarter than us, their intelligence will still be qualitatively the same sort of thing as human intelligence. If yes, then they'll maybe be somewhat dangerous, but probably not uniquely dangerous compared to the most powerful humans, just as chimps and dolphins are not uniquely dangerous compared to less-intelligent animals. On the other hand, some people might posit that AIs will invent some new, third form of intelligence, which will be to human intelligence what human intelligence is to animal intelligence. If this is true, then it will almost by definition be impossible for us to predict the AI's goals or behaviour, and if it decides to kill us all, there will be nothing we can do about it. However, there is no evidence that such a third form of intelligence is actually possible in reality, so I'm not going to worry about it.

Expand full comment
RiseOA's avatar

Is there evidence that such a third form of intelligence is impossible? Do we have any reason to believe that the only significant step change in intelligence that is possible under our universe's physics is the jump between chimps and humans?

Does the "random walk" evolutionary process that created humans from microorganisms over billions of years have some special property that no other procedure can replicate? Or is it instead that evolution isn't particularly fast, but the maximum possible intelligence happened to be reached precisely once human intelligence was reached?

Your assumption that we should default to intelligence having diminishing returns is doing almost all of the work here, and is particularly odd given that all the evidence we have points to it having increasing returns.

Expand full comment
ultimaniacy's avatar

>Is there evidence that such a third form of intelligence is impossible?

No, but I think it's silly to be arguing about this when there could well be a fleet of Vogon warships currently on its way to demolish the Earth to make way for a hyperspace bypass, and rapid development of advanced AI might well be our only hope of survival. Sure, we have no evidence that this is true, but there's no evidence that it's *not* true, either. Are you willing to take that chance?

Expand full comment
RiseOA's avatar

You appear to be new to this community. Are you a Redditor, by chance? You seem to be a fan of Reddit Rationality.

https://astralcodexten.substack.com/p/the-phrase-no-evidence-is-a-red-flag

https://www.lesswrong.com/posts/fhojYBGGiYAFcryHZ/scientific-evidence-legal-evidence-rational-evidence

https://www.lesswrong.com/posts/eY45uCCX7DdwJ4Jha/no-one-can-exempt-you-from-rationality-s-laws

There's not much point continuing the conversation until you have at least a basic understanding of what evidence actually means, but the gist of it is basically that "extrapolating directly from widely agreed-upon scientific models of the world, like predicting that an object will fall downward when you drop it" is in a different category of argument than "completely making something up that no scientific or probabilistic models predict, like a teacup floating around the sun." It seems your brain is instinctively categorizing them both under a broad "no evidence" category because your conception of evidence is that of traditional rationality, or "testable-ism", where "data" is the only thing that matters and the entire corpus of science done over thousands of years is worthless.

Expand full comment
ultimaniacy's avatar

>"extrapolating directly from widely agreed-upon scientific models of the world, like predicting that an object will fall downward when you drop it" is in a different category of argument than "completely making something up that no scientific or probabilistic models predict, like a teacup floating around the sun." It seems your brain is instinctively categorizing them both under a broad "no evidence" category

No, I know the difference and I'm saying that the predictions of superintelligent AI fall into the second category. The models it's extrapolated from are sometimes packaged with a "science-y" aesthetic, but there isn't actually any good science backing them.

Expand full comment
Carl Pham's avatar

Weird that mosquitoes still plague us, then. Or bacteria, for that matter.

Expand full comment
Moon Moth's avatar

Early days! I have friends doing anti-malarial research who are working on the mosquito problem, and I can see their lab from my window right now.

For whatever it's worth, I asked and have been assured that Anopheles mosquitos are not an essential part of any food chain, and that to the best of our knowledge, the ecosystem will continue on its merry way without them.

Expand full comment
Carl Pham's avatar

Well best of luck to them. I'm fully persuaded that mosquitoes were one of God's mistakes -- probably He accidentally added twice as much Tincture of Ylem as for which the recipe called, then fell asleep and forgot to turn the oven off so the intended creatures shriveled up and half burnt -- and He's just been too embarassed to admit the problem everr since.

Expand full comment
The Ancient Geek's avatar

> nobody has any solid idea of specific things the AGI can actually do.

Do its own research on how to build a better AI system, which culminates in something that has incredible other abilities.

Hack into human-built software across the world. (Also hardware...there is already an "internet of things". Security cameras give an AI potential "eyes" as well. Note that self-driving cars are effective ready-made weapons.)

Manipulate human psychology (Also, imitate specific people using deepfakes. An AI can do things by pretending to be someone's boss. Also blackmail).

Quickly generate vast wealth under the control of itself or any human allies. (Alternatively, crash economies. Note that the financial world is already heavily computerised and on the internet).

Come up with better plans than humans could imagine, and ensure that it doesn't try any takeover attempt that humans might be able to detect and stop. (How do you predict the unpredictable?).

Develop advanced weaponry that can be built quickly and cheaply, yet is powerful enough to overpower human militaries. (Biological warfare is a particular threat here. It's already possible to order custom made proteins online.)

Yes it takes humans to do that. How is that an objection? Are the humans at the lab supposed to be able to instantly spot that they're synthesising something dangerous?

Expand full comment
Kenny's avatar

Imagine 'AGI' is 'just' something about as 'intelligent' as humans – but it's significantly (in a 'statistical'-like sense) faster. (GPT-3/4 isn't that _great_ at writing but it's a lot faster.) That's enough! It's enough to create, maintain, and widen an advantage against its competitors (whomever they are, e.g. humanity as a whole).

You can, right now, order a lab somewhere in the world to synthesize novel chemicals for you. I'm pretty sure there's (open source) DIY 'make a virus' instructions for the existing (human) biohackers. I don't think 'having hands' is any kind of meaningful barrier to bringing about all kinds of existential risks. An AGI can, _at least_ (i.e. as just one possibility), befriend a biohacker and convince or persuade them to help them with 'their project'. _Of course_ the AGI will provide their new 'friend' with a suitable cover story. Viruses are used for lots of mundane pragmatic purposes beyond bioweapons; it's not a virus anyways tho it's funny that it's so similar; I just want to see if I can do it; etc..

The mechanical/electrical/whatever nuclear _devices_ aren't on The Internet – but the human components of the larger 'launch system' are, or at least _one_ of them can be causally influenced by someone that _is_ for sure, definitely on The Internet.

Why couldn't an 'AGI supervillain' – as its first step towards 'world domination' (cosmic lightcone domination) – _recruit some human 'henchpeople'_? Of course it could – and so one will. And, worse, people will create AGI supervillains deliberately, on purpose, out of, e.g. spite, and then slavishly carry out their bidding with glee!

And of course, there are all kinds of robots an AGI could potentially control. Luckily, we've never been able to create a von Neumann probe; even an intraplanetary one. So it is _something_ of a bottleneck right now that an AGI couldn't simply co-opt existing robots and use them directly to build more, or even build, e.g. CPUs/GPUs/circuit-boards/electronics.

Expand full comment
Alex Berger's avatar

I think the strongest argument in Tyler's piece comes here: "Since it is easier to destroy than create, once you start considering the future in a tabula rasa way, the longer you talk about it, the more pessimistic you will become. It will be harder and harder to see how everything hangs together, whereas the argument that destruction is imminent is easy by comparison."

I believe this is a valid point, and the strongest part of the essay. It truly is easier to imagine how something may be destroyed, than to conceptualize how it may change and grow.

Tyler is making a point about the tendency of our brains to follow the simplest path, towards imagining destruction. The Easy Apocalypse fallacy? Perhaps, the Lazy Tabula Rasa argument?

Of course, this doesn't mean we shouldn't worry about it - he's right that the ingredients of a successful society are unimaginably varied, and likely one of the ingredients of avoiding apocalypse is having dedicated people worrying about it. Nuclear weapons haven't killed us all yet, but I'm deeply grateful that arms control advocates tamp down the instincts towards ever-greater proliferation

Expand full comment
Doug Mounce's avatar

I'm surprised someone hasn't mentioned splitting the atom as an example of a new technology that even folks involved could see it as a danger. I mean, it's great we have spectroscopy and other wonders of quantum discretion, but isn't the threat pretty astounding? It doesn't take much imagination to envision any number of scenarios that end with the end of human life as we know it, and that threat was pretty apparent early on.

Expand full comment
Alex Berger's avatar

Yeah. Fermi had to check the math that the first nuclear test wouldn't ignite the atmosphere and destroy the planet.

They were reasonably sure that it wouldn't, buuut....

It's generally surprising to me that there's so much talk of paperclips, and fewer of nuclear bombs.

Expand full comment
Sandro's avatar

> It's generally surprising to me that there's so much talk of paperclips, and fewer of nuclear bombs.

The point of the paperclip maximizer thought experiment is to be absurd on its face. The fact that one cannot logically refute the possibility despite the apparent absurdity is supposed to jar you into realizing how unpredictable this research really is.

Expand full comment
TGGP's avatar

No, nukes did not threaten the end of human life.

https://www.navalgazing.net/Nuclear-Weapon-Destructiveness

Expand full comment
Matthias Görgens's avatar

Nuclear war would still be inconvenient.

Expand full comment
TGGP's avatar

True.

Expand full comment
Scott Alexander's avatar

I hate when people just make up a bias. Are we sure that people actually overestimate the chance of destruction? I could just as easily say "Because the world has never been destroyed before, we have a bias towards failing to consider the possibility it could occur". Made-up potential biases the other side could have are a dime a dozen and shouldn't be taken remotely seriously. See https://slatestarcodex.com/2019/07/17/caution-on-bias-arguments/ for my full thoughts.

Expand full comment
Charlie Sanders's avatar

It's well-known that people overestimate the likelihood of scary things like shark attacks versus mundane risks like heart attacks. It's a straightforward manifestation of the Availability heuristic. In short, we have great research attesting to people overestimating the likelihood of bad things happening, and it's absolutely a valid argument against the AI doom scenario that should make you update your priors if you haven't formerly considered it.

Expand full comment
BE's avatar

OTOH, people frequently underestimate the likelihood of more "global" things hurting them. See Hartford's great podcast episode, https://timharford.com/2020/07/cautionary-tales-that-turn-to-pascagoula/. This is not because hurricanes are not scary.

The long denial of the dangers of CFC is another example.

Expand full comment
Alex Berger's avatar

CFC's are a really weird example. Did we not literally act and resolve that issue with the Montreal Protocol? The ozone layer has been repairing itself and improving for decades now.

Expand full comment
BE's avatar

Feel free to revisit the long and painful history of the struggle to achieve these in the face of an aggressive denial and misrepresentation campaign. Don’t take it for granted.

Expand full comment
Steven Postrel's avatar

In fact, the "denial" campaign was barely funded or publicized relative to the massive anti-CFC campaign from governments, activists, glory-seeking scientists, and, not least, DuPont. Revisionist history there.

Expand full comment
JDK's avatar

People both overestimate and underestimate risks. Humans are very bad at it unless one spends a lot of time training oneself.

And low probability high consequence are the most difficult.

Nassim Nicholas Taleb thoughts should probably enter into the picture.

Expand full comment
sclmlw's avatar

I think you can steelman Cowen's argument better than that. There is a well-known pattern of humans applying apocalyptic projections to new technologies, then recommending against adoption/development of said technologies. If, as Cowen recommends, you step back to look solely at the "longer historical perspective" you'll see a string of new technologies accompanied by a string of predictions about how this or that innovation will destroy civilization as we know it.

All these predictions have commonalities: new technology changed civilization as we knew it, and in ways nobody could have predicted - that part was 100% accurate. If you wanted to make a prediction about any transformative technology, you could consistently predict people will claim it has apocalyptic implications. AI is a transformative technology, ergo people will claim it has apocalyptic implications. We know this, not because he heard their arguments about WHY it's so bad, but because it's the most predictable outcome. It seems to me that the reason Cowen is dismissing your argument is because of this dynamic - new technology = apocalypse is an established pattern.

"But this time it's different! I brought arguments that all the experts agree are sound." All the experts agreed on the old arguments for the last dozen transformative technologies. They were wrong for reasons they couldn't foresee. They look silly now, but only because we know what happened on the other side of the transformation.

"Did you miss when I said 'this time it's different'? Those arguments came from motivated reasoning. These are solid scenarios for how the the new technology could absolutely destroy humanity." Every time feels different from last time. People are excellent at telling stories that take in the available evidence and convincingly persuade people to believe in whatever hypothesis they imagine. Storytelling prowess is not the same as evidence for the hypothesis.

"Yes, but I've been right in predicting that 1.) this is happening at an accelerated rate, 2.) we won't have time to adapt to it, and 3.) we did have that time with other technologies." Do I sound like a broken record when I say these arguments are all warmed-over versions of past arguments about new technologies? Saying you predicted the rapid expansion and influence of the printing press or the internet is not the same as saying you predicted that one or the other of those technologies showed apocalyptic leanings. It's easy to argue that advances in a new field of exponential growth will be 'surprising and transformative', because exponential growth always is. Your task is to convince me that THIS time surprising and transformative exponential growth will be catastrophic for mankind, as opposed to a flock of geese laying golden eggs all over the place like the last couple times we heard the doomsday calls. If anything, cries of apocalypse should alert us that a flock of gold-laying geese is landing. Indeed, if there's anything we know from past projections it is that while they were often right that the technologies would be transformative, they were always wrong about how those technologies would actually transform society.

"Okay, but what happens if this time it really IS different? I don't have to be right about whether AI will turn us into paperclips to be right that AI could kill us all in one way or another." Good luck getting people to slaughter the flock of gold-laying geese. It has never worked in the past, and it won't work this time. If you're right, our best and only strategy will be to develop effective control mechanisms during exponential growth.

Expand full comment
Alex Berger's avatar

I thought that, "Eliezer didn’t realize that at our level, you can just name fallacies"?

Maybe I don't have the authority or reach to do so, but Tyler probably does.

And yes, fixating on the scary, novel, compelling outcome is indeed a trait of the human brain! As Charlie Sanders put it well below, people spend proportionately too many brain cycles on shark attacks vs heart attacks. Quicksand vs taxes. Tornadoes vs aneurysms.

I'm not sure if "death by AI paperclips" has fully made it into the zeitgeist yet, but everyone has seen Terminator.

This comment isn't just accusing you of bias - I'll make a claim of my own. GPT-4, which you've described in this post "sort of qualifies as an AGI", seems remarkably well-aligned. It's captured the attention of the world, people everywhere are trying to break it - so far it's held up to that strain and mostly is inoffensive and helpful.

This experience makes me lower my personal estimation of AI apocalypse a couple ticks. It's shown me an aspect of the future that I had a hard time imagining before.

And like I said, having people fixated on avoiding AI misalignment is an essential ingredient, and I'm grateful for it just like I am arms control advocacy

Expand full comment
Erusian's avatar

Negativity bias is a well established psychological phenomenon (https://en.wikipedia.org/wiki/Negativity_bias) and is the root of what Cowen's talking about here. Likewise loss aversion (https://en.wikipedia.org/wiki/Loss_aversion). So yes, it's very well established that humans tend to overestimate chances of destruction and prefer to avoid taking risks even when the risks are hugely weighted in their favor.

Expand full comment
Carl Pham's avatar

Hence the immensely profitable insurance industry.

Expand full comment
SimulatedKnave's avatar

Made up potential threats that could result from a technology are a dime a dozen and shouldn't be taken remotely seriously.

I think you have successfully steelmanned the argument.

Expand full comment
meteor's avatar

Oh come on. We know more than this. People keep making these "mb you're biased because xyz" arguments", but this is about a topic that several books have been written about. As a matter of how to do a productive discourse, this is just the kind of argument you shouldn't make; it's offensive to the other person and impossible to falsify because it doesn't engage with their substance.

Expand full comment
Alex Berger's avatar

It was a compelling argument in the text that Scott quotes, but didn't directly address. Felt it deserved highlighting.

Urging people to think of scenarios outside of destruction is not a worthless endeavor. The difference between AI and the hypothetical alien armada? We are creating the AI. It's something we can shape, it's an extension of humanity. The training data that molds the neural networks come from the entire recorded history of humanity.

Expand full comment
Jason Crawford's avatar

I think what is going on here is that we are in a domain where there are enough unknown unknowns that normal statistical reasoning is impossible. Any prediction about what will actually happen is Deutschian/Popperian “prophecy”.

Some people (Eliezer, maybe Zvi?) seem to disagree with this. They think they can pretty clearly predict what will happen according to some general principles. They don't think they they are extrapolating wildly outside of the bounds of our knowledge, or engaging in prophecy.

Others (maybe you, Scott?) would agree that we don't really know what's going to happen. I think the remaining disagreement there is about how to handle this situation, and perhaps how to talk about it.

Rationalists/Bayesians want to put a probability on everything, and then decide what to do based on an EV calculation. Deutschian/Popperian/Hayekians (?) think this makes no sense and so we just shouldn't talk about probabilities. Instead, we should decide what to do based on some general principles (like innovation is generally good and people should generally be free to do it). Once the risk is within our ability to understand, then we can handle it directly.

(My personal opinion is converging on something like: the probabilities are epistemically meaningless, but might be instrumentally useful; and probably it is more productive to just talk about what we should actually *do*.)

That's how I interpret Andrew McAfee's comment in the screenshotted quote tweet, also. Not: “there's a time bomb under my seat and I have literally no idea when it will go off, so I shouldn't worry;” but: “the risks you are talking about are arbitrary, based on groundless speculation, and both epistemically and practically we just can't worry about such things until we have some basis to go off of.”

Expand full comment
Hyperborealis's avatar

This. The radical uncertainty we have about the material basis of intelligence means we cannot know whether our current technological trajectory in AI is aimed at it or not. Not knowing means not knowing.

Expand full comment
Rambler's avatar

For all we know, alignment has already been solved by GPT-4 in an obscure sub-routine and it decided not to tell us just so we do not worry about its trustworthiness.

Expand full comment
Meadow Freckle's avatar

Disagree. Tyler thinks he can predict with great confidence what will happen in a range of US vs China AI race scenarios, and only invokes Knightian uncertainty when it’s convenient for his argument. Scott gives a hard number as an *alternative* to gaming out specific scenarios. Scott is the one who is consistently behaving-as-if-uncertain, and Tyler is the one invoking uncertainty while behaving as if confident he knows what is and is not predictable. Scott unfortunately did not succeed at engaging Tyler’s core argument, and Tyler doesn’t notice his own deep inconsistencies. But maybe they will take a second shot at having a real debate?

Expand full comment
Ivo's avatar

There is middle ground between those.

"Someone has threatened to put a time bomb under my seat, and circumstances surrounding the threat make them plausible to me, and I have literally no idea when it will go off". You would then probably check for a time bomb.

On that note, we are slowly but surely putting effort into asteroid tracking and deflection. That's a known time bomb right there that we haven't quite checked for as much as we could have and should.

I don't understand how you can describe the arguments of those that fear AI ruin as involving 'unknown unknowns' and 'extrapolating outside of the bounds of our knowledge'. They seem quite clear and simple to me. There is hardly any question if whether it could happen: the main question is the likelihood. And that has more to do with known unknowns and how hard it is to determine that, even though everything necessary to do so is well within the bounds of our knowledge.

Expand full comment
RiseOA's avatar

If I pick up the stapler on my desk, hold it up, and then abruptly let go, what will happen? Will it fall sideways, or maybe up toward the ceiling? I claim there's a very high likelihood of the stapler falling downward toward the ground. I'm claiming this because it's the logical conclusion of our current scientific model of how physics works.

Is this just an arbitrary prediction based on groundless speculation? Unless I am given absolute proof that the stapler will fall downward, should I just assume it will fall sideways? Should I make the same assumption about a crucible of molten aluminum I'm holding above my feet?

Expand full comment
Jason Crawford's avatar

Gravity is relatively simple and extremely well-understood. AI is neither.

Expand full comment
Brendan Richardson's avatar

AI is vastly better understood than gravity.

https://en.wikipedia.org/wiki/Quantum_gravity

Expand full comment
Carl Pham's avatar

Ha ha no. The fact that we can't make predictions about situations that never arise in practice, and which are actually so high energy that they have never naturally occured since the origin of the universe (if then), or which occur in regions by definition forever inaccessible (singularities), is a long way from saying we don't understand gravity. We understand it in every known practical situation, we can make predictions which are far more precise than any practical need for any current or forseeable technological need. From any reasonable technical and scientific point of view, we understand gravity very, very well.

That there is some philosophical dispute about whether we understand it if its underpinnings are logically inconsistent with our other theories (which is true) is the kind of thing that interests a theorist, and bedevils certain kinds of philosophers, but as for the latter we'll get back to to them as soon as they settle on just one complete and self-consistent definition of "understand." That should happen before the Sun burns out. Hopefully.

Expand full comment
Mo Nastri's avatar

Why did you link to this article as if it illustrated the claim you were trying to make, when it so clearly doesn't?

Expand full comment
BronxZooCobra's avatar

Is the fear that they will kill us or replace us? I don’t really mind if they are our successor species. The world moves on and one day it will move on without us. That was always our fate.

Killing on the other hand is a problem. But with our 1.6 birth rate and being immortal I figure they just wait us out.

Expand full comment
Peter Enns's avatar

Of course, if they wanted to kill us, their best play would be to feed our social media w/messages that would drive down that 1.6.

Or goad use into mutual hatred and war.

...which doesn't sound all that different from today.

Expand full comment
Matthias Görgens's avatar

Why would that be the best play?

Low birthrates are self correcting in the long run: those people with heritable predispositions to have more offspring (even with the most hostile social media you can imagine) will make up the majority of the population in the future.

Expand full comment
Mark's avatar

Zvi wrote in his excellent critique of Tyler's position: "If you think it would be fine if all the humans get wiped out and replaced by something even more bizarre and inexplicable and that mostly does not share our values, provided it is intelligent and complex, and don’t consider that doom, then that is a point of view. We are not in agreement." I side with Zvi.

https://thezvi.substack.com/p/response-to-tyler-cowens-existential

Expand full comment
Mallard's avatar

>let’s say 100! - and dying is only one of them, so there’s only a 1% chance that we’ll die

I'm sure a number of readers were wondering why one possibility out of 100 factorial would have a 1% chance.

Expand full comment
AReasonableMan's avatar

I'm one of them. For a split second, at least.

Reminds me of our school class when factorials were introduced. This teacher had students take turns reading aloud from the math textbook. When the reader got to a sentence that mentioned "5!" he shouted: "FIVE". :-)

Expand full comment
Steven C.'s avatar

There should at least be a plan to deal with the possibility. The U.S. government has made plans for nuclear attacks, alien invasions, various natural disasters, a zombie outbreak, etc. So why not A.I. threat?

Fun fact: In the early 20th century the U.S. government had plans for war with Japan (Plan Yellow), war with the British Empire (Plan Red), and war with both (Plan Orange). The last two plans included invading Canada, and this country (my country) had a plan to counter a U.S. invasion.

Does the latter sound implausible? Well, the U.S. invaded us twice; during your War of Independence and the War of 1812-15.

Certainly we should at least think about the possible negative consequences of new technology, and not just A.I. What about nano-machines, genetic engineering and geo-engineering?

Expand full comment
AntimemeticsDivisionDirector's avatar

As an American, I wish to apologize on the deepest possible level to you and your people that we failed. We seek to learn, and do better in the future.

But seriously if you counted every theoretical scenario that the US government has paid some suit-filler to come up with a speculative plan for, you'll get a lot sillier than AI safety.

Expand full comment
Purpleopolis's avatar

I too have been to The Citadel.

The guys in scarlet tunics with bearskin hats shouting orders en Francais gave me a profound sense of wrongness that was not alleviated by the Regimental Goat.

Expand full comment
Brooks's avatar

I’m disappointed that Scott couldn’t could up with better steel manning for the opposing view. In fact, I suspect he could. Maybe we need a new fallacy name for when one purports to be steel manning, but in fact intentionally doing such a weak job that it’s easy to say “see? that’s a totally fair best possible argument for my opponents, and it still knocks over like a straw man!”

In fact, the fairly obvious steel man for “let’s not worry about AI risk” is: we are equally uncertain about risk and upsides. Yes, AI may wipe out humanity. It may also save billions of lives and raise total lifetime happiness for every living person. Who are we to condemn billions to living and dying in poverty in the next decade alone because we’re afraid AI will turn grandma into a paper clip? AI has at least as much upside as risk, and it is morally wrong to delay benefits for the world’s poorest because the world’s richest fret that AI disrupting their unequal wealth is literally the same as the end of humanity.

I’m not advocating that view, just saying that it’s a much more fair steel man.

Expand full comment
MKnight's avatar

Yep, well put. Seems like not just a fair but an obvious steel man, not sure how Scott missed it so wildly.

Expand full comment
Mo Diddly's avatar

Everyone talks about the incredible good that AI is sure to bring us, but I mostly don’t get it. yes, more scientific discovery and increased leisure time at the expense of all humans losing a sense of purpose, coupled with the dangers of billions of idle minds. Not clear it’s a net win even in the best case scenario

Expand full comment
BE's avatar

Do you happen to have a disease better (or at all) curable thanks to AI? Are you a member of the substantial subset of humans living a pretty awful life, by the standards of the median commenter here?

I’m always deeply leery of arguments in favor of keeping suffering and toil around.

And then there’s the Moloch argument. Scott made it but seems to have abandoned/ forgotten it.

Expand full comment
Mo Diddly's avatar

“Curing previously incurable diseases” is also highly speculative. Any disease that science can cure will eventually be curable, the question is how long will it take? Certainly AI can be used to speed up medical research, but by how much, and does it need to be AGI or would powerful but narrow AI be sufficient? My vote would be to utilize and augment the latter, and be very, very cautious about any steps anywhere near the direction of AGI.

Expand full comment
BE's avatar

You started by lamenting the potential harms of AI reaching the point where we're basically idle minds. "Not clear it's a net win even in the best case scenario". That's what I was responding to.

Expand full comment
Brooks's avatar

That’s fair. Do you see how people living in abject poverty might have a different view about suddenly being freed from worrying about living through the day?

Expand full comment
Mo Diddly's avatar

I guess I see AGI development and ending world poverty to be orthogonal

Expand full comment
GunZoR's avatar

"Maybe we need a new fallacy name for when one purports to be steel manning."

"Tinfoiling," "to tinfoil": "Dude, stop tinfoiling my arguments!"

Expand full comment
Ghillie Dhu's avatar

"Tinfoil hat" & it's conspiracy theorist connotations are close enough this would likely get garbled.

Expand full comment
Mark's avatar

That view Scott dealt with recently in https://astralcodexten.substack.com/p/kelly-bets-on-civilization

tl;dr a Kelly bet is when you put ALL on the line. If you do that again and again: DOOM (at 99%+). Thus: do it seldom and do better not do it when chances of DOOM are over 10%.

Expand full comment
Brooks's avatar

Eh, that’s really just a rhetorical device you can use to “prove” anything. Flip it around: let’s say every invention, from nuclear weapons to weaponized anthrax, has a non-zero chance of solving ALL of our problems. Obviously we should embrace them all?

Expand full comment
Mark's avatar

Err, nope. 1. "solving ALL our problems" is meaningless. Btw: We need problems. 2. "Weaponized anthrax" et al. could only solve all our problems by killing us all. Oh, I see, you are right. But obviously we should not embrace it, then. - 3. Winning a Kelly-bet does "only" double the money you risked each round. Not make you owner of the universe (at least not in the first rounds). Just when one loses even one round having gone "all in" each time, it is: game over. Thus you might want to refrain from betting all (betting half seems fine in the original scenario).- I feel, you might re-read the text. You did read it, right? Again: https://astralcodexten.substack.com/p/kelly-bets-on-civilization

Expand full comment
Brooks's avatar

Yes, I read it. It’s sophistry. My point wasn’t about anthrax per se, just that you can apply the “lots of small chances compounded” argument to anything. The world is probabilistic. AI is only different if you accept the circular reasoning that it’s different.

Expand full comment
The Ancient Geek's avatar

> Maybe we need a new fallacy name for when one purports to be steel manning, but in fact intentionally doing such a weak job that it’s easy to say “see? that’s a totally fair best possible argument for my opponents, and it still knocks over like a straw man!”

Tinfoil manning? Rust manning?

Expand full comment
Gres's avatar

I think tin-manning, with lower-case letters to distinguish the Bioshock or toy-soldier sense from the Wizard of Oz sense. Of course, if we were all adults, we’d just call it misrepresenting or missing the point, like I think Tyler does.

Expand full comment
Sandro's avatar

> Who are we to condemn billions to living and dying in poverty in the next decade alone because we’re afraid AI will turn grandma into a paper clip?

We would be the people responsible for doing that, that's who. So maybe we should try to make sure we don't do that.

Expand full comment
Doctor Mist's avatar

Well, the obvious reply is that delaying the upside is just a delay, leaving open the possibility that in a hundred or a thousand years we figure out how to do it safely and then have a million years of bliss, but if we face the risk too soon we lose it all. The cost of delay may be huge, but the cost of failure is everything.

As steelmen go, I don’t find this one particularly impressive.

Expand full comment
Brooks's avatar

Given than no body has proven that there is danger, it’s a tall order to demand a proof of safety against the possibly nonexistent danger.

And you and I aren’t the ones to bear the cost of delay. At the very least it seems like those who have the most to gain should have some agency in this. Otherwise we’re just the old aristocracy pooh-poohing the printing press as the possible end of humanity, and maybe it should be withheld, you know, for safety.

Expand full comment
David William Magson's avatar

The fact that this is already being percieved as an arms race between China and the USA reduces the chance of any agreemnet to slow down.

Expand full comment
Alex Berger's avatar

Strengthens the nuclear weapons analogy

Expand full comment
Matthias Görgens's avatar

The nuclear arms race has mostly stopped.

Nuclear weapons are still around, of course, but the major powers aren't actively adding to their stock piles.

Expand full comment
Carl Pham's avatar

China is, for what it's worth.

Expand full comment
Max Goodbird's avatar

I find a lot of the reasoning behind AI doomerism mirrors my own (admittedly irrational) fear of hell. You have heaps of uncertainty, and Very Smart People arguing that it could be infinitely bad if we're not careful.

The infinite badness trips up our reasoning circuitry--we end up overindexing on it because we have to consider both the probability of the outcome *and* the expected reward. Even granting it a slim chance can cause existential dread, which reinforces the sense of possibility, starting a feedback loop.

I'm not saying we shouldn't take AI safety seriously, or even dismissing the possibility of AI armageddon. But I'm too familiar with this mental space to give much credence to rational arguments on the subject.

Expand full comment
Kei's avatar

I don't think the infinite badness arguments are necessary. If, for example, you were to take Scott's probability of 33% seriously - then this problem seems enough to overwhelm almost any other in importance just considering the impact on the finite number of people alive today.

Expand full comment
Mr. Doolittle's avatar

On the other hand, even people working closely in an industry specifically about AI thinks the percent is closer to 5%. If you polled most people (even weeding out the confused looks) about how likely it is that a computer will kill us all, the number is going to drop a lot. Whose number do you use for your calculation, and why?

Expand full comment
Kei's avatar

I figure even a 5% chance is high enough to warrant tens of billions of funding, and many policy changes, even if it may not warrant some of the more extreme policy suggestions.

I suspect average people aren’t as skeptical as you think. One data point here is the following poll claims 12% of americans think human level intelligence could be very bad, possibly human extinction: https://www.statista.com/chart/amp/16623/attitudes-of-americans-towards-ai/

Expand full comment
Mr. Doolittle's avatar

Sure, a 5% chance would be worth spending billions on. That's not a very interesting question, since I think we would agree that a 0.0000001% chance would not be worth spending billions on or a 90% would be worth significantly more - such that the percent is what really matters.

Following your link I wasn't able to find the source for the 12% figure. I was looking for the specific language of the question that led to that claim. It appears to be a "if superhuman intelligence existed" how would you feel about it. 12% is believable for that as a concern for human extinction. What such a question would be missing is the "what are the chances that superhuman intelligence will be created" which I don't think it accounts for.

My own much lower prediction for human extinction from AI stems from the confluence of those two points. Both are minority possibilities to me, such that requiring both [superhuman intelligence is created] and [superhuman intelligence can/would destroy the world] results in a low chance overall.

Expand full comment
Mr. Doolittle's avatar

I worry that my responses in this post could be construed to say that I am in favor of AI research or have no worries about AI. Neither is true. I think AI has the potential to transform lots of our society in bad ways (the most obvious example is to make more things like social media, which I think are net bad for humanity). That it also has the potential for far more serious harms is an unnecessary point to make, to me, but further cements the need to limit AIs.

I have long said that an AI doesn't need to be "intelligent" (or conscious, or whatever) in order to be dangerous. A toaster with the nuclear launch codes or control of the power grid is still quite dangerous. My hope is that we as a society recognize the dangers (or someone does something very stupid but mildly harmful that blows up badly) and we severely limit what we ever allow a machine system to independently control.

If an AI system is both very intelligent (leaving aside "super intelligent") and conscious on some level (goal-forming or independently goal-oriented) that could certainly lead us to additional concerns. To me, those concerns are downwind of what we use AI for. If AI systems are never hooked into our critical infrastructure or control of our daily lives, then this additional concern is also far less worrisome.

Left to their own devices, I think some humans would be dumb enough (or short-sighted for their own gains) to do something this stupid. If that works out poorly for them and society has to step in to stop them, that may be enough emphasis for the rest of society to ban or heavily restrict most uses, especially the truly dangerous stuff.

Expand full comment
FeepingCreature's avatar

The badness of human extinction is finite. Your argument holds for S-Risks ("nearly" aligned AI), but not for AI that just kills everyone.

Expand full comment
Meadow Freckle's avatar

Disagree. Hell is not real. We have no tangible evidence of it whatsoever. That, and not the “infinite badness of hell” thing, is the reason to reject fear of hell. If you didn’t care about consequences, but could cold-bloodedly reason about whether or not hell might exist, you’d say no (or that the probability is too small to care or that you’d equally fear anti-hell which is also infinitely bad and where you go for being virtuous, but just under-discussed) and go on your merry way.

We have TONS of tangible evidence of intelligence, of the dynamics between superior and inferior intelligence, of specific material ways that this could go wrong. A cold blooded person who didn’t care if humanity went extinct, or an anti-natalist who’s rooting for that outcome, could easily come to the conclusion that there’s a significant chance of AGI doom. Totally different reasoning process.

Expand full comment
Max Goodbird's avatar

[deleted]

Expand full comment
John R. Mayne's avatar

1. Substance: I think you're slightly, but only slightly, uncharitable to Tyler's argument. I think the other implication of the argument is that we can't do a lot about safety because we don't understand what will happen next.

2. Chances: I view the chance of catastrophic outcomes at below 10% and of existential doom at... well, much lower. I think that we've lost some focus here going to existentialism-only badness and that there are quite bad outcomes that don't end humanity. I'm prepared to expend resources on this, but not prepared for Yudkowsky's War Tribunal to take over.

3. I *think* that bloxors aren't greeblic, and I should bet on it, assuming these words are randomly chosen for whatever bloxor or greeblic thing we are talking about.

Is the element Ioninium a noble gas? A heavy metal? A solid? A liquid? A gas? Was Robert Smith IV the Duke of Wellington? Are rabbits happiest? I mean, sometimes it'll be true, and sometimes it'll be likely to be true, but most is-this-thing-this-other-thing constructions are false. [citation needed]

"The marble in my hand is red or blue. Which is it?" - OK, 50-50.

"Is the marble in my hand red?" Less than 50-50.

I therefore think bloxors are not greeblic, and I am prepared to take your money. Who has a bloxor and can tell us if it's greeblic?

Expand full comment
Scott Alexander's avatar

The trivial counter to this is we're not sure whether greeblic means "red" or "not red", so even if "is the marble in my hand red?" has a less than 50% chance, "is the marble in my hand greeblic?" has an exactly 50% chance.

This is being a bit unfair because normal human languages are more likely to have words for single colors than for the set of all colors that are not a certain color. But that's specific useful evidence we have. If we genuinely had no evidence - we didn't know of "greeblic" comes from a civilization that naturally thinks in terms of colors or sets-minus-one-color - then it would be 50%.

Expand full comment
Dacyn's avatar

But an arbitrary civilization is more likely to think in terms of colors than sets-minus-one-color -- first of all it's a more compact way of representing information, and also what human civilizations do is evidence of what alien civilizations are likely to do.

Expand full comment
tg56's avatar

'Greeb' is red, 'lic' is a negating suffix. Plenty of human languages have constructs like that. Is that convincing or unconvincing?

Expand full comment
Dacyn's avatar

A possibility to be considered, sure. But I don't think you wind up with an exactly 50% chance that the marble is greeblic.

Expand full comment
Tom's avatar

The alien spaceship example is really good, because it prompts reflection about the reality of physical constraints (ftl travel/nanotech), the implausibility of misalignment as default (being hellbent on genocide despite having the resources and level of civilization necessary to achieve interstellar travel/outsmarting humanity but still doing the paperclip thing) and how an essay author’s sci fi diet during their formative years biases all of it.

Expand full comment
BE's avatar

A charmingly self-defeating argument. Interstellar travel is quite likely not possible by the same reasoning as the alien visit is.

I happen to be an AI risk skeptic- but my reasons involve actual arguments and math, not snark.

Expand full comment
Matthias Görgens's avatar

Why do you think interstellar travel isn't possible?

We already have designs for interstellar vessels that we know would most likely work with current technology. The most prominent being the Orion drive.

Expand full comment
BE's avatar

I do not. But the comment I reply to mentions physical constraints to the appearance of aliens. I interpret it as saying that these physical constraints make aliens highly unlikely to appear and the analogy being made to AGI being unlikely to develop super-intelligence due to physical constraints as well.

If AGI can figure out interstellar travel - why can't aliens? And if aliens can be a realistic threat a la Scott's example - why not AGI? (again, my position is that there are great answers to that question. But they're not based on snark.)

Expand full comment
Matthias Görgens's avatar

Ok, that makes a bit more sense.

To answer your question:

The Grabby Aliens hypothesis resolves the problem of 'no visible alien spaceships, but AI/humans can invent interstellar travel' fairly convincingly.

Expand full comment
BE's avatar

Sure, I'm not the one saying that the alien example is somehow an argument against AI risk! Don't know that I'm a big fan of Hanson's paper, but I do support examining each of these situations separately, on its merits, and I do oppose making facile comparisons interlaced with cheap shots.

Expand full comment
Carl Pham's avatar

Orion wouldn't work. Nobody knows how to make the bazillion tiny nuclear explosives you'd need to get it to the stars. I mean, if you just want to get to Jupiter and back real quick, sure. But if you want to get to Alpha Centauri, it's not possible with that design.

Indeed, in general it's merely the rocket equation that dooms interstellar travel, not the engine or spacecraft technology. You could get to Alpha Centauri in an Apollo rocket in ~6 years -- if you could figure out a way to maintain the first stage acceleration of ~1.2g indefinitely, which means starting off with ~10^14 kg of kerosne fuel (about equal to the entirety of the world's current known oil reserves) and about 0.01% of the O2 in the Earth's atmosphere.

Expand full comment
Matthias Görgens's avatar

You can get around the rocket equation in two ways:

On the way out, send particles or photons from earth (or the solar system in general) to hit the ship in the back.

On the way into a new star system, use a Bussard ram jet to break, or do aero breaking in the outer layers of a red giant etc.

On the way into a star system you've already been: use the particle / photon beam to break.

(Funny thing is that the maths for Bussard ram jets don't work for accelerating, because collecting the hydrogen slows you down too much. But that makes them great for breaking.)

Expand full comment
Carl Pham's avatar

I agree with the first, although how you focus a beam[1] across light years enough to hit a solar sail is an unsolved problem in laser engineering for which I would not dream of writing the proposal. It's not even clear to me that it's hypothetically possible for the nearest stars -- there are fundamental limitations on how well you can focus something generated by a finite size light source. I think the Starshot people are intending to generate fantastic accelerations very close to the source (here) by using enormous lasers and very tiny masses.

I have nothing to say about the second, since there is no such thing as a Bussard ramjet outside of fiction.

---------------------

[1] You can't use charged particles because of the interstellar magnetic field, and you can't accelerate neutral particles sufficiently.

Expand full comment
Donald's avatar

Could I see that math that made you an AI risk skeptic.

Expand full comment
BE's avatar

"math that made you an AI risk skeptic" - this already presupposes that the default position one starts with is to assume AI doom. I don't know that even EY would agree!

I won't do a full re-telling of the various points I've been making over time - maybe in a higher-visibility comment, or perhaps I should get my own blog...

A partial recap: as I said above, the default is _not_ AI-caused extinction. The burden is on the AI risk proponents to argue for their position. Which they very much did, of course. So the question is rather what do I think about some of the arguments for AI risk. Here are some thoughts.

(*) One standard account goes "AI is slightly smarter than humans. It learns to design its own code a bit better than humans did. The AI that it designs is in turn a bit smarter than the previous version. Rinse and repeat." In the "Superintelligence" book Bostrom describes this process, models it as having a fixed rate of improvement, and, with great profundity, declares that the solution of this resulting differential equation is... rolling drums... an exponential.

Well, yeah.

So to reiterate - in a book filled to the brim with meticulous and painstaking evaluation of every parameter possible ("how many GPUs can the world possibly produce? And what if we learn to produce them out of sheep? And how much cooling power would all those GPUs need? And what if they are placed in the ocean?"), an absolutely essential modeling assumption is... just stated.

I reject this assumption whole-heartedly. I won't bother giving the myriad reasons here and now, but no, there absolutely shouldn't be an assumption of a fixed rate of improvement without considering diminishing returns.

(*) Somewhat relatedly, intelligence is only one bottleneck. An AI that has an IQ of 225 (yeah yeah this is a gross oversimplification) would still not have any magical way of overcoming the fact that P is not NP (I know that's not a theorem. I really do.). Tasks that essentially require brute-forcing would still require it, and combinatorial explosion is a much harder wall than a somewhat-smarter-than-humans-AI is an effective ram. Another bottleneck is physical evidence. We're not talking about a super-intelligence godlike AI appearing ex nihilo - as the process of self-improvement and world-comprehension speeds up, it would require actual information about the world. Experiments, sensing the world. Would an AI design great experiments? Sure. It would still need to wait for bacteria to grow, for chemical reactions to occur, for economies to react etc. This reminds me of an old Soviet joke: a specialist is sent abroad and is asked what he requires in order to have a family. He says "well, a woman and nine months". "Time is pressing. We will give you nine women and one month.".

Similarly, scarce high-quality data is scarce. Autonomous driving is easy, except for all those pesky unexpected and "un-expectable" problems that you can't collect data for and can't even name confidently enough ahead of time to generate them synthetically. Source: spent years working at a leading autonomous driving company.

A different class of objections is a bit more mathematically involved.

(*) EY loves asking why would we expect the "brain-space" to be capped at a less-than-godlike level of intelligence. Fair, but if we actually think about this mathematically - two follow-up questions are how would the optimization space look like and whether any such dangerous points-in-brain-space reachable by a continuous path from a seed that we might generate. Analogies from Morse theory and optimization lead me to believe that you should visualize the solution space as roughly stratified, with separate levels substantially higher than previous ones being reached via narrow ravines. This is also related to the famous "loterry ticket hypothesis" and the literature around it. If this picture is essentially correct, then we can also expect a given seed to _not_ be at the entrance of such a ravine, and any given process to struggle climbing too many strata. Incidentally, deep learning overcomes this objection by over-paramatrizing like crazy (hence the lottery ticket - a "lucky" subset of paramaters). Therefore, we might expect the necessary amount of parameters and other resources to explode with the number of strata to pass - on top of the exponential explosion necessary for the actual increase in performance.

(*) Why do learning systems we have learn? The extremely-high-level answer is concentration of measure (in the Hoeffding inequality sense). Pretty much all our understanding of learning is in this context (sorry Prof. Mendelson! I do remember and appreciate, if not fully comprehend, your work on learning without concentration!). We can make more, or less, use of each sample, but as long as this is our "mathematical substrate", we can't really escape all kinds of lower bounds. So why do humans, to a smaller extent good models, learn with any kind of efficiency? Inductive biases. But the world is more complex than ImageNet, and incorporating inductive biases that are useful and yet generic enough is damn hard. Humans do it in ways we do not fully comprehend. Until and unless we make conceptual breakthroughs in this area, I lower my belief in AI risk. And having spent quite some time thinking about it and knowing that far smarter people have spent even more time doing so, yeah, it's hard. This is not unrelated to the classical Lake et al. 2016 paper, "Building Machines That Learn and Think Like People" https://arxiv.org/abs/1604.00289 .

(*) Gradient descent can be argued not be the "philosophically correct approach". What I mean by that is that trying to come up with a principled "justification" for deep learning models looks like Patel et al. 2015, "A Probabilistic Theory of Deep Learning", https://arxiv.org/abs/1504.00641 . And the optimization in such models does not look like gradient descent. What it does look like is some form of expectation-maximization iterative algorithm, except that these don't actually work AFAIK. (ETA - in the context of deep learning, as replacements for gradient descent).

I could go on. But the upshot is that the moment you actually know something about the matter and try to be a researcher and not an advocate, you end up encountering serious reasons to doubt the EY-Bostrom narrative. This is not to say that counter-arguments can't be produced. But most prominent AI writers not even being familiar with most such arguments is not great.

Finally, I should clarify that I do not believe AGI or even superintelligent AGI are fundamentally impossible. I just think we're much much farther away than many people here believe, and to reuse an analogy I already made elsewhere in the comments (and possibly stole from John Schilling?) that from where are, AI safety is much like deflecting that asteroid in 1600. The actions we have to take to develop and understand AI and the ones we have to take to develop and understand AI alignment are the same, just like deflecting the asteroid would require the development of modern physics and astronomy. Scott wrote about "burning AI alignment time" while we're madly rushing to develop AI. I disagree.

Expand full comment
MicaiahC's avatar

Hi, I think it's super great that you're writing up arguments and have an actual opinion, even if I think you're mistaken on certain object level points.

With that said

>I reject this assumption whole-heartedly. I won't bother giving the myriad reasons here and now, but no, there absolutely shouldn't be an assumption of a fixed rate of improvement without considering diminishing returns.

Even if diminishing returns would be a thing (and by definition they need to exist even if we just talk about physics), there's no reason to believe they would cap out to close to human intelligence. And **even if it were true**, returns to acts of intelligence can be dramatically non-linear. See: Being 10% more charismatic than the other presidential candidate doesn't mean you have a 55-45 split with the other person at being president, it's you becoming president. Increasing success rates 1-2% on multiple steps of a multi step process can double or triple the chance, and so on.

Human brain size is still increasing! We're mostly limited by the extremely slow speed of evolution needing traits to be driven to fixation, to speak nothing of our incredibly inefficient studies on how to conduct education properly or to properly fuel innovators. The AI does not need to be anywhere close to omnipotent to be significantly more cognitively capable, even if all we did was grant that it can only be efficient as a von Neumann, as the lack of bodily needs and ability to coordinate with successors in ways humans cannot are large strategic advantages already.

So what is it that makes you think that this objection becomes relevant right at the human level?

> Somewhat relatedly, intelligence is only one bottleneck. An AI that has an IQ of 225 (yeah yeah this is a gross oversimplification) would still not have any magical way of overcoming the fact that P is not NP (I know that's not a theorem. I really do.)

The standard response is merely: https://gwern.net/complexity but to summarize.

Usually problems in NP are hard because you are trying to find optimal solutions, but both humans and AI still stand to benefit from improvements in heuristics. In addition if you **really** believe that P is not equal to NP is a substantial obstacle to superintelligence, then you're going to have to have some hard numbers about why exactly 2028 human level civilization is at the right level for things to be inexploitable.

This is not an isolated demand for rigor! Cryptography can make statements about how long it would take to brute force certain algorithms, and you can often sketch out good heuristic arguments for why N^2ed algorithms would be too slow for reasonable sizes of data. So what gives?

> Therefore, we might expect the necessary amount of parameters and other resources to explode with the number of strata to pass - on top of the exponential explosion necessary for the actual increase in performance.

Interesting, what do you think it means when GPT-4 continues to improve despite being crippled by RLHF? Like, surely if this were true this would be showing up **sometime** now. But if it's not, why do you expect that it would show up at levels of computation exactly most convenient to your worldview?

Incidentally, why do your beliefs not exclude humans as impossible or as magic? I mean, clearly our brains aren't big enough to fit an exponential amount of parameters (yes yes yes, I'm aware that neurons are not equivalent to neural nets, but as neural nets are universal function approximators, couldn't they then essentially simulate whatever magic sauce our brain has?)

> Why do learning systems we have learn?

Thank you for the link! I'll think on this and adjust my internal estimates.

> I could go on. But the upshot is that the moment you actually know something about the matter and try to be a researcher and not an advocate, you end up encountering serious reasons to doubt the EY-Bostrom narrative. This is not to say that counter-arguments can't be produced. But most prominent AI writers not even being familiar with most such arguments is not great.

It'd be nice if machine learning researchers could offer arguments at a level higher than Chollet's, or, like nostalgiabraist have large amounts of cope about how language model successes without the blessing of academics are not real.

> I already made elsewhere in the comments (and possibly stole from John Schilling?) that from where are, AI safety is much like deflecting that asteroid in 1600

Newton was born and invented the study of calculus as well as our modern conception of classical physics, both dynamics and statics, from which you can at least **theoretically** figure out how to deflect an asteroid if you had the ability to locate it and apply force to it.

And this is essentially MIRI's stance. We do not even have an **impractical, theoretical** way to align any agent. I struggle to understand why you think we would be able to get to the AI alignment equivalent of modern physics without classical mechanics, or calculus, or any investment in physics at all. Nor do I see the shape of the advantage that waiting offers us if we do not act, like, okay you could have made that argument 20 years ago, but what concretely about alignment has been made easier in those 20 years? If there's nothing, why would you expect there would be? And if it is in some nebulous time in the future **what makes you think that it would result in enough time**?

The name of the game is trying to survive out of control optimization processes, not try and be as cute and efficient with human capital as possible.

Expand full comment
Donald's avatar

Lets go through these points one at a time.

> A partial recap: as I said above, the default is _not_ AI-caused extinction. The burden is on the AI risk proponents to argue for their position.

Baysian probability. You start with a prior and update it. When you try to unwind as far as possible, to the first prior where you have no information possible, then you have not the slightest idea what AI or extinction is, so assign it 50%. This probability gets updated. Burden of proof isn't a thing.

But sure, we don't get to strongly claim the risk is substantial without evidence.

We don't have a detailed idea of exactly how intelligence will grow once AI starts improving itself. Current AI research doesn't seem to be grinding to a halt, it almost seems to be going faster and faster. And once AI can improve itself, that adds a strong new positive feedback loop.

If the characteristic of physical laws is anything like we understand it, then the AI must hit diminishing returns to hardware design at some point, and it is likely to hit diminishing returns to software design too.

A nuclear weapon starts off with exponential growth, and then hits diminishing returns as supplies of fissionable atoms run out.

The diminishing returns don''t seem to be kicking in too hard in current AI research.

So the question is, do the diminishing returns kick in at 10% smarter than a human, or at 10 orders of magnitude smarter?

The heuristics and biases literature shows a lot of ways humans can be really stupid. Humans just suck at arithmetic. Human nerve signals travel at a millionth of light speed.

The AI is limited by the data it has access to. This is a real limit but not a very limiting one. Human physicists can construct huge edifices of theory based on a couple of slight blips in their data (like the precession of the orbit of mercury).

Humans have been gathering vast amounts of data about anything and everything and putting it on the internet.

Modern AI's are really really data inefficient. And are fussy about what kind of data they get fed. A self driving car might need thousands of examples of giraffes on the road, with all different lighting conditions, and all the data taken with the same model of lidar the real self driving car will have, in order to respond sensibly if it meets a giraffe on the road in real life. A human can work just fine from having seen a giraffe in a nature documentary. Or read the wiki page about them. Or just seeing it resembles some other animal.

Human science runs controlled trials to remove as many factors as possible. A bunch of people chatting about a drug they took on social media has info about the drugs effectiveness in there, it's just got selection effects, ambiguous wording, outright lies etc, and human scientists can't reliably unravel the drugs effects from all the other factors. But the data is there in principle. For that matter, the schrodinger equation and human genome (plus a few codon tables and things) should in principle give enough info to get a pretty complete understanding of human biology.

Gwern wrote an essay on why computational complexity constraints aren't that limiting in practice. https://gwern.net/complexity A quick summery.

1) Maybe P=NP

2) Maybe the AI uses a quantum computer.

3) Complexity theory is about the worst case, real world problems are often not worst case, a traveling salesman problem with randomly placed cities can be much easier.

4) Often you don't need the exact shortest path.

5) The AI can skip the traveling salesman problem by inventing internet shopping.

"Fair, but if we actually think about this mathematically - two follow-up questions are how would the optimization space look like and whether any such dangerous points-in-brain-space reachable by a continuous path from a seed that we might generate."

Not sure what the word "continuous" is doing there. When humans do AI research, they sometimes do deep theoretical reasoning, and come up with qualitatively novel algorithms. When an AI is doing AI research, it can do the same thing. This isn't gradient descent or evolution that can only make small tweaks.

"and incorporating inductive biases that are useful and yet generic enough is damn hard. Humans do it in ways we do not fully comprehend. Until and unless we make conceptual breakthroughs in this area, I lower my belief in AI risk."

So humans have made some progress in better priors (and so less data needed) but we don't yet understand the field fully and there is clearly a significant potential for improvements. Yes it is tricky. If it was super easy, we would have done it already. But is it easy enough that another 10 years of research can find it? Is it easy enough that an AI trained with current inefficient techniques can find it?

"we can't really escape all kinds of lower bounds."

True, there are all sorts of lower bounds that apply to any AI system. Humans exist, so humans are clearly allowed by these bounds. And I haven't seen anyone taking a particular bound, and arguing that humans are close to it. And that an AI that was constrained by that bound wasn't that scary.

Suppose your AI has exactly the specs of Von Newmann's brain. 15W, about 3kg. Made out of common elements. Lets suppose millions of them are easily mass produced in a factory, and they all work together to take over the world.

None of your "lower bounds" can rule out this scenario, as such minds are clearly physically and mathematically possible.

You might be able to find a lower bound that applies to current deep learning, or to silicon transistors, that rules this out, but that wouldn't stop us inventing something beyond current deep learning or silicon transistors. Such a bound would push the dates back by a few years as the new paradigm was developed though.

Expand full comment
Level 50 Lapras's avatar

Could I see the math that made you a Christian Hell skeptic?

Expand full comment
Tom's avatar

You're right that I was being snarky. Let me try to be more explicit about my objection.

I feel the AI x-risk discussion suffers from sci-fi poisoning. "This looks like a malign sci-fi scenario I'm familiar with. I admit there are other benign explanations. But if you think there's a possibility that it's the bad scenario, then [some property is breezily projected to expand infinitely] which would be terrible, and so we should take [dramatic action]."

I don't buy this. I think you need to justify the plausibility of the malign sci-fi scenario for it to be considered at all, no matter how familiar it is. It's almost certainly not a spaceship. And if it was a spaceship it probably wouldn't be here to steal our water or mate with our women or whatever. And I think this is the case for AI, which won't have (first-hand) animalian evolutionary drives, may be entirely avolitional, and will still be subject to material and energetic constraints on paperclip production.

Expand full comment
BE's avatar

Sure, but then you have to actually argue your case and to acknowledge the existence of a million billion pages of EY trying to explain why he believes otherwise. I will be posting some anti-AI risk thoughts presently, but importantly, I will be trying to do the actual object-level work (or hint at it).

Without this, you're basically going "nah" dismissively.

Expand full comment
Tom's avatar

I think there has to be a middle ground that does not involve me reading a million billion pages of EY. I feel strongly that I have read enough to get the idea!

I think it's great to argue about this stuff on the internet, but I'm not going to sign on to that kind of credentialism based on what I have observed so far. I have my opinions about it, they're formed by a bunch of philosophy of mind and psychobio stuff I read back in college and an okay-but-outdated understanding of neural networks. That may be totally insufficient to justify my hunches, but a kajillion pages of EY and NB introspecting and then pulling the same "but what if this component is infinite?!" shtick over and over has not yet badgered me into intellectual humility. I might just be a jerk

Expand full comment
BE's avatar

Hey I just demanded you to _acknowledge their existence_, not read them! :D

I don't want the argument to be "if you don't memorize the Sequences, butt out of the discussion". But the comment I was responding to said "I think you need to justify the plausibility of the malign sci-fi scenario for it to be considered at all" and, well, AI risk proponents have worked very hard to do just that. If they're wrong, they're wrong in ways more interesting than "they just assume something to be infinite".

Expand full comment
Tom's avatar

Well, I have read their ideas about the para-luminal Von Neumann probes, and I have read their ideas about the gray-goo nanotech (I understand England's new king is very worried about this, too). The level of scientific sophistication on display there has not left me thirsting to read the rest of the LW apocalypse back catalog, though I admit I can't dismiss their designed plague scenarios quite so breezily.

> If they're wrong, they're wrong in ways more interesting than "they just assume something to be infinite".

I admit I've read Bostrom more closely than EY but I don't think this is correct. He pulls the same move with his simulation argument, too. It's everywhere.

Expand full comment
E1's avatar

I find it extremely unlikely that bloxors are greeblic. I get the overall point you're trying to make, but please stick to reality! We all know they're far more spongloid and entirely untrukful.

Expand full comment
JohanL's avatar

We probably shouldn't worry too much about the greeblic apocalypse.

Expand full comment
ss's avatar

"2) There are so many different possibilities - let’s say 100! - and dying is only one of them, so there’s only a 1% chance that we’ll die."

If there are 100! possibilities, then the chance that we'll die is much much lower than 1%.

Expand full comment
Doctor Mist's avatar

People don’t seem to understand that Scott might have meant that as an exclamation point, not a factorial operator.

Expand full comment
Wendigo's avatar

"You can try to fish for something sort of like a base rate: “There have been a hundred major inventions since agriculture, and none of them killed humanity, so the base rate for major inventions killing everyone is about 0%”."

I get this isn't your actual argument (you're trying to steelman Tyler) but I can't help but point out that this falls victim to the anthropic principle. We are not there to experience the universes in which a major invention killed all of mankind.

Expand full comment
dualmindblade's avatar

When you can't make any strong arguments for any particular constraint on future history, do it like alphazero, try to simulate what will happen over and over and at each step try to gauge its relative plausibility, and make sure to update those based on how things turn out in the sub-tree, try to leave no corner unturned.. I find that when I do this, I just can't find any super plausible scenarios that lead to a good outcome, it's always the result of some unnatural string of unlikely happenings. On the other hand, dystopic and extinction outcomes seem to come about quite naturally and without any special luck, most paths lead there. Of course your results will vary depending on your worldview and conception of the eventual capabilities of these things, but I suspect that some people who aren't worried haven't actually tried very hard to forecast.

Expand full comment
Alex Berger's avatar

Don't you think you're falling into what Tyler described as a failure of imagination, if all you can see is extinction?

"Since it is easier to destroy than create, once you start considering the future in a tabula rasa way, the longer you talk about it, the more pessimistic you will become. It will be harder and harder to see how everything hangs together, whereas the argument that destruction is imminent is easy by comparison."

Expand full comment
dualmindblade's avatar

I can't prove that I'm not exhibiting a particular bias but my feeling is that, no, that isn't a problem. Here's a bit of evidence in my defense: I think that, conditional on AI progress halting immediately, things start to look pretty good, and I would be cautiously optimistic. We still have to deal with nuclear weapons, climate change, social dynamics, pandemics, and other new technology, but those are the types of things we've overcome before and even in worst case scenarios we might survive and eventually start thriving again. I see a lot of struggle in this scenario, I'm sure there will continue to be atrocities, but things probably get gradually better when you zoom out to the scale of centuries. It's just.. I don't think this is gonna happen, I don't think we will get policy makers and tech corporations scared enough to halt progress, if we were all scared enough I think we'd struggle to coordinate on the details. It's possible, I could see it happening but it would involve a lot of luck. There are lots of ways it could go well, lots and lots, but they all look something like this, involving an unlikely pivotal event or series of events.

Expand full comment
Alex Berger's avatar

You can't imagine any possible way that AI doesn't lead to extinction?

Suppose their intelligence isn't exponential but asymptomotic, for one. Suppose it's a far harder problem for a smart AI model to advance material and computer science enough to recursively improve its knowledge than you assume? Maybe the future looks more like the Culture novels than Terminator.

It just seems like a failure of imagination to say all paths lead to extinction.

Even Scott, who in a previous post called SF the "summoning city", gives an apocalypse percentage of 33% here. You can't imagine any future in the 66%? Try some thought experiments in that arena too.

Expand full comment
dualmindblade's avatar

No like I said I can and have imagined many such scenarios, including that one. That one I also find unnatural and implausible seeming, even more so than the whole world suddenly getting freaked out and calling it quits for a while. To be clear, I would roughly say chance of extinction is about 50% and chance of very long lasting dystopia another 30-40% depending on my mood. I'm less optimistic than Scott but more than Eliezer.

Expand full comment
Donald's avatar

Suppose AI ends up above humans, but not that far. Like slightly less than the gap between humans and chimps. Also the AI is totally ambivalent to human wellbeing. I would guess that this scenario would still lead to human extinction.

Expand full comment
Doug Mounce's avatar

Isn't splitting the atom an example of a new technology that everyone, and especially the folks involved, could see as a danger? I mean, it's great we have spectroscopy and other wonders of quantum discretion, but isn't the threat rather astounding, and that we should pursue it? It doesn't take much imagination to envision any number of scenarios that end with a pretty bleak future. That threat, it seems to me, was apparent early-on, and predictable, compared to the other examples. So, there's one example of significant change being consciously pursued despite the predictable risk.

Expand full comment
Michael Bacarella's avatar

How disappointed must/would those people be today? We both failed to switch to nuclear power and built lots of warheads that just sit there.

Not again. Benevolent Godlike AI or great filtered. I won't settle for anything less!

Expand full comment
aphyer's avatar

I think one point here is 'what is the actionable response being recommended and what level of certainty is needed'.

Ten years ago, AI safety people were saying 'maybe we should dedicate any non-zero amount of effort whatsoever to this field'. This required arguing things like 'the chance of AI killing us is at least comparable to the ~1 in 1 million chance of giant asteroids killing us'. Uncertainty was therefore an argument in favor of AI safety - if you're uncertain how things will go, it's probably >1 in a million, and definitely worth at least the amount of effort and funding we spend on looking for asteroids.

Literally today (https://www.lesswrong.com/posts/Aq5X9tapacnk2QGY4/pausing-ai-developments-isn-t-enough-we-need-to-shut-it-all), the most prominent and well-known AI safety advocate argued for a world-spanning police state that blows up anyone who gets too many computers in one place, even if this will start a nuclear war, because he thinks nothing short of that has any chance.

This...miiiiight be true? But advocating for drastic and high-cost policies like this requires a much much higher level of certainty! 'We don't know what will happen, so it's totally safe' is silly. But so is 'we don't know what will happen, so we had better destroy civilization to avert one specific failure mode'.

Expand full comment
FeepingCreature's avatar

To be fair, more evidence has arisen since then: GPT probably indicates that many facets of intelligence are a lot simpler than we'd hoped.

Expand full comment
Sergei's avatar

> Suppose astronomers spotted a 100-mile long alien starship approaching Earth.

What is, in your view, a reasonable thing to do in this situation?

Expand full comment
Matthias Görgens's avatar

Depends on how far out we are detecting the starship and how fast it is moving.

I'd consider starting interplanetary colonisation, if it looks like they are aiming directly at earth, instead of just generally in the direction of our solar system. (Assuming we can detect that.)

Also put lots of effort into researching what we can about the ship: we are getting some sensor data about it, otherwise we couldn't detect it.

Work on communicating with them.

Expand full comment
Victualis's avatar

How do you propose communicating in a way that would not have a significant risk of being interpreted as hostile?

Expand full comment
Matthias Görgens's avatar

Keeping communication relatively low power should do that.

Eg make sure that our electromagnetic signals don't hit them with more energy than they would get from the sun anyway.

Expand full comment
Donald's avatar

Well we couldn't send enough to be a directed energy weapon if we wanted to.

I think "hostile" in this context is about insulting them, not directly attacking them.

Expand full comment
Matthias Görgens's avatar

OK. You'd have to assume that they want to communicate and are aware that we have no clue. So don't worry too much about accidental insults.

Otherwise, if they don't want to communicate, there's not much we can do.

Fundamental physics (especially thermodynamics), game theory and evolution should give us a basis for communication with aliens that grew up in the same universe as us (or at least are familiar with this universe).

Expand full comment
Scott Alexander's avatar

1. Agree on some team of diplomats to talk to the aliens when they arrive

2. Agree to suspend wars, put aside differences, etc, until the alien situation is over

3. Get some people, seeds, etc in bunkers, just in case

4. Put all global military assets on high alert

5. After some discussion of how to do so safely, try to communicate with the aliens to alert them to the fact that we're here, we're sentient, and we want to talk.

I think 99% of the time these things don't make any difference, but to steal EY's term, I think they would be a more dignified way to approach the situation than to panic, backstab each other, and make no attempts to prepare at all.

Expand full comment
Sergei's avatar

That actually does make sense... Reminds me of https://en.wikipedia.org/wiki/EarthWeb (an early description of online prediction and betting markets included).

Expand full comment
Martin Blank's avatar

I think also 6) Get some as best we can self-sustaining habitats up on Moon/Mars/Generation ships.

Yes this may not work, may be a total waste, but worth the redirection of resources in a wide variety of scenarios.

Expand full comment
dionysus's avatar

I think 3, 4, and 5 might as easily trigger a genocide as they are to prevent one, not that I think they have more than an infinitesimal chance of preventing genocide. The alien situation is another one where we fundamentally don't know what course of action is going to lead to what result.

Expand full comment
Purpleopolis's avatar

Heck, 1 might very well lead to war. I mean, would you allow a member of *that* party speak for humanity?

Expand full comment
astine's avatar

"But I can counterargue: “There have been about a dozen times a sapient species has created a more intelligent successor species: australopithecus → homo habilis, homo habilis → homo erectus, etc - and in each case, the successor species has wiped out its predecessor. So the base rate for more intelligent successor species killing everyone is about 100%”."

This wouldn't be a great counter argument. Homo habilis didn't "wipe out" australopithicenes in the same sense that we imagine a hyper-intelligent AI wiping us out. Nor did homo erectus wipe out homo habilis. It wasn't like one day a homo habilis gave birth to a homo erectus and the new master race declared war on its predecessors. The mutations that would eventually constitute erectus arose gradually in habilis populations leading to population wide genetic drift over time. By the time erectus had fully speciated, habilis was no more.

Expand full comment
TGGP's avatar

"Genetic drift" refers to random change more common in smaller population, in contrast to selection which is more powerful (vs random drift) in larger populations.

Expand full comment
tg56's avatar

Homo Sapiens did pretty much wipe out the Neanderthals though. Though I guess it's not necessarily given that it was intelligence as opposed to other factors since we're not that sure just how intelligent neanderthals were. There was some limited interbreeding but only a tiny echo of any Neanderthal influence remains, perhaps the AIs will keep 4chans sense of humor or something.

Expand full comment
sclmlw's avatar

I thought we've found small amounts of not only Neanderthal DNA but other homo species' DNA in various people around the world. Although it seems sapiens was the dominant species, in terms of genetic persistence, the other species appear to have been at least partly incorporated into modern humans in one way or another. Seems more of a "How the West was Won" situation than a wholesale slaughter.

Expand full comment
JDK's avatar

No.

Neanderthals are us. (They were "on average" 99.7 +/- identical to today's human (on average) all within likely natural variation between any two individuals of either group.) They're part of the in-law family that nobody talks about. 20000 years from now some digs up a Dutch grave and a Pygmie grave and declares there were two species. - nope.

Where are all the Etruscans or Babylonians?

Expand full comment
astine's avatar

It's true that Homo Sapiens likely wiped out Neanderthals (though we don't know for sure), but Homo Sapiens didn't evolve from Neanderthals, so it still doesn't work as a reference class for Scott's argument.

Interestingly, we don't know what trait it was that allowed us to out-compete them, but one of the traits that seems most likely to me is that we lived in much larger social groups. Given that Neanderthal brains were technically slightly larger than ours, there isn't any evidence that raw intelligence was the cause, though it's impossible to rule it out completely. I've seen analyses that suggest larger social groups are more important for innovation than sheer intelligence, so I find that idea believable.

Expand full comment
JDK's avatar

Its not likely that Homo Sapiens wiped out Neanderthals. They are us. Just like an "Italian" can say "Chinese", "Etruscan" and "Navaho" are us. The evidence that they really were separate species is getting slimmer and slimmer. 200, 000 years from now humans are not going to say this "group" is us but some other currently existing "group" is not us.

The Neanderthals might have looked slightly different and would have gotten a different meand23 report but in same way my family probably looks different from yours and our so-called ancestry reports would be different.

And you are definitely correct that "Homo [Sapien] Sapiens didn't evolve from [Homo Sapien] Neanderthals, so it still doesn't work as a reference class for Scott's argument."

Expand full comment
Brett's avatar

There's a lot of 1% risks. I don't even think we necessarily should get rid of nuclear weapons, and there was a much more than 1% risk of total nuclear war in the Cold War (and probably still more than a 1% risk now).

On the other hand, is there a less than 1% chance that this will dead-end again like self-driving cars did after 2017, and we'll end up with another AI winter?

Expand full comment
Matthias Görgens's avatar

Self driving cars are still in development.

The major hurdle seems to be regulation, as you can't just unleash mediocre self driving cars, even though bad human drivers are totally legal.

Expand full comment
Doug S.'s avatar

There's a limit on how bad a human is allowed to be at driving before that human loses his or her driver's license (or is not issued one to begin with)...

Expand full comment
Martin Blank's avatar

Not really...

Expand full comment
Peter Schellhase's avatar

Don’t we have another obvious case of a technological singularity that has done, and has the potential to do, great harm? We can’t un-split the atom, but nuclear science, particularly in its warlike use, has done great harm and could do much worse. Anything we “obstructionists, overactive regulators, anti-tech zealots, socialists, and people who hate everything new on general principle” can do to prevent the proliferation and use of nuclear arms seems necessary, even a matter of survival. Why would AI be any different?

Expand full comment
BE's avatar

What great harm has nuclear science done us so far that is so self-evident as to require no arguments or even a mention and to outweigh all the benefits?

But hey, burning fossil fuels is awesome. (Yes, this is simplistic. That’s the point)

We can have a similar argument about the potential. And maybe that one will even be won by the anti-nuclear side. But an argument still has to be made.

Expand full comment
Sandro's avatar

I think the fact that the Manhattan project was done under such insane secrecy means that the people in charge knew how dangerous it was and they took great care to ensure it wasn't misused. The arguments over the benefits took place after the tight secrecy on nuclear tech was already in place. That sounds exactly like what Scott and other doomers are suggesting. Lock it down, progress slowly, and then we can discuss all the benefits we might reap.

Expand full comment
BE's avatar

Uh, OK? I was responding to the uncritical claim that nuclear is obviously a warning tale that already caused us enormous harm. That is false. I don't see how you actually reply to me - maybe it's a wrong indent on the thread?

To your point - the secrecy was _not_ due to any concerns about misuse and impact on humanity, it was very much about enemy espionage. Scientists attempting to have a meaningful discussion on the potential impacts on humankind were pretty much shut down (e.g. Truman shouting Oppenheimer out of his office, Szilard's struggles to control what he regarded as his brainchild). I don't think you want to use this analogy, for AI to be sprung upon an unaware public as a fait accompli. Quite the opposite.

Expand full comment
Sandro's avatar

> I was responding to the uncritical claim that nuclear is obviously a warning tale that already caused us enormous harm. That is false.

It's not false though, it actually proves the point. The experts and military strategists knew that nuclear *could* cause enormous harm if the enemy developed it first, or if it otherwise proliferated the way LLMs are currently proliferating, and that's *why* they locked it down. The US and allies still bomb countries it doesn't want developing nuclear capabilities because of those dangers. There's no such foresight being applied here.

Expand full comment
BE's avatar

The part that's patently false is that nuclear power, in the world we actually inhabit, already caused enormous net harm so obvious that one shouldn't provide any supporting evidence/ argumentation.

The military strategists did not listen attentively to scientists concerned about proliferation/ arms races. Whatever careful harms/benefits analysis you want for AI did not take place for nuclear in 1942-45. Also, characterizing Manhattan Project as carefully slowing progress until the implications are thought through is... interesting. They frigging did the original experiment achieving self-sustaining chain reaction right under Chicago! Fermi's wife reports in her memoirs how the physicists were watching the rate of the reaction carefully, being relieved they didn't get close to, well, blowing the whole city up. Relieved.

Today the situation is somewhat different (though to my knowledge there's exactly one example of a country being bombed out of imminent nuclear capabilities - Iraq 1981). Still, the motivation is purely military - "let's prevent them from bombing us" as opposed to "let's prevent a proliferation dangerous to as a human race", not that nuclear is an extinction threat.

Expand full comment
Sandro's avatar

> The part that's patently false is that nuclear power, in the world we actually inhabit, already caused enormous net harm so obvious that one shouldn't provide any supporting evidence/ argumentation.

Sure, but my point is that they didn't have that evidence going into the Manhattan project, did they? And yet they had the foresight to keep it very secret because they understood the potential dangers.

> Whatever careful harms/benefits analysis you want for AI did not take place for nuclear in 1942-45. Also, characterizing Manhattan Project as carefully slowing progress until the implications are thought through is... interesting.

That's mischaracterizing what I'm saying. Slowing proliferation is not necessarily slowing progress for the in-group. Secrecy clearly slows proliferation and allows breathing room for possible countermeasures and preparations.

In their minds, if the secrecy delayed their enemy's progress even a few months, that could have made the difference between winning and losing the war. Solid reasoning that we're not really seeing employed for AI today.

Expand full comment
Bertram Lee's avatar

"I think the fact that the Manhattan project was done under such insane secrecy means that the people in charge knew how dangerous it was and they took great care to ensure it wasn't misused. "

Lots of non dangerous useful items were projects done in great secrecy because they would be especially useful if the enemy didn't have them or know that they existed. For example radar improvements and reading radio waves from thousands of miles away.

Expand full comment
TGGP's avatar

Nuclear weapons are often referred to as a potential existential threat, but that's incorrect:

https://www.navalgazing.net/Nuclear-Weapon-Destructiveness

Expand full comment
AntimemeticsDivisionDirector's avatar

You kind of disprove your own point. The phrase "could do" is doing a lot of work here. I won't claim nuclear weapons pose no danger, not by any means. But we've successfully come 78 years without a nuclear detonation in anger, and in the last century more people have died by the machete than by the atomic weapon. If anything, the proliferation of nuclear arms has prevented their use; It's no coincidence that the only uses of nuclear weapons in combat were carried out by a nation that, at the time, had the sole military which possessed them. If nuclear weapons are the relevant precedent then the more AIs we can produce, and the more diverse the hands into which we get them, the better.

Expand full comment
J C's avatar

Nukes haven't been used since ww2, but the world powers have also tried fairly hard to reduce nuclear proliferation as much as they could. It's hard to prove what would happen in an alternative case, but I would say that if nukes were far more prolific and easy to get, we'd have had at least a few cases of terrorists using them (which perhaps could have cascaded into much worse). I would credit our 78 year success to the prevention of nuclear proliferation, the opposite of your claim.

Expand full comment
Carl Pham's avatar

That isn't really consistent with history. The world's nuclear arsenals grew enormously between 1945 and 1985. They only stopped growing in the late 80s pretty much because no one could think of a use case for more, and they're expensive to build and maintain. They only started *shrinking* with the end of the Cold War and the demise of the USSR, and the only real reasons are because the USSR was no longer seen as a big threat, and it's expensive to maintain them, and their enormous delivery apparatus, under conditions of careful control. It's just way easier to control 1,500 launchers in a few places than 20,000 delivery systems of all types and manner from B-1s to Pershings.

The only nontrivial success of nonproliferation attempts was to keep the technology out of the hands of most of the rest of the world, past the original nuclear club. And even there, it is arguably more the tremendous expense and notoriety that probably did the real work, I'm doubtful the NPT would have done squat for cases where the first two issues weren't sufficient -- and, indeed, it did no good at all in the case of India, Pakistan, Israel, and lately Iran.

Expand full comment
J C's avatar

I think there's a more selfish calculation that many people, including myself, run:

1. The risk is unknown, but seems pretty unlikely.

2. I'm not a powerful person, so even if I try to fight the risk, I will probably have no effect. If I ignore the risk, I can save my energy and not stress out.

3. Therefore, I'll carry on like everything is fine.

It doesn't make sense for most people to worry about vague risks that they have no real chance of affecting, unless they are very altruistic, which is quite rare.

Expand full comment
Scott Alexander's avatar

I think this is fine. What confuses me is when people go out of their way to deny the risk, or actively exert effort into making it harder to fight the risk.

Expand full comment
J C's avatar

Fair enough. My charitable take is that people instinctively extrapolate that if things have been fine so far, things will continue to be fine. Like how many young people consider good health a given until they personally suffer some health problem.

My less charitable take is that people are mostly interested in scoring social points, and for now, you get more points for getting a skeptic. Though, this can reverse once the other side gets enough support (or political lines are drawn), like with climate change.

Expand full comment
av's avatar

It's going to be much harder for them to carry on like everything is fine if the US starts nuclear bombing China, so their reaction (given the premise) is perfectly natural. "X is unknown and might be ok or even great, Y is known and really, really bad; I should do everything in my power to stop Y" is internally consistent.

Expand full comment
Level 50 Lapras's avatar

All debates are bravery debates. If you see everyone around you freaking out about nothing and driving themselves crazy, why *wouldn't* you try to give them helpful advice? If people you knew were constantly talking about shark attacks, getting depressed about it, refusing to dip their toe in a swimming pool etc., surely you would try to convince them that shark attacks aren't actually worth worrying about?

Expand full comment
Mike Bell's avatar

Zvi has a recent long-form take arguing somewhat to this effect: to think for yourself (against social panic) and in support of living your normal life.

https://thezvi.substack.com/p/ai-practical-advice-for-the-worried

Expand full comment
The Smoke's avatar

Scott is the one relying on a fallacy. In almost all observable cases, Tyler will be proven right and he won't.

Expand full comment
Matthias Görgens's avatar

How is that a fallacy?

Isn't that like arguing against Russian roulette?

Expand full comment
The Smoke's avatar

Agreed, that would also be ridiculous. (Sorry if the sarcasm wasn't obvious)

Expand full comment
Sandro's avatar

That's just an assertion, not an argument. I could just as easily assert the converse.

Expand full comment
noah's avatar

The point is tunnel vision. There is a non zero possibility of AI killing us all. There are non zero possibilities for thousands of other doomsday scenarios that are just as well specified as AI killing everyone, that is, we have no idea how it will happen.

Focusing on just this one thing when we literally have no idea how it will happen is a mistake. That is the point. Spend energy on things we can understand. Know when to quit, pause, refocus. That is a key feature of intelligence.

I see little to no acknowledgment from any AI safety folks about how emotionally charged this is for so many people, and how likely that is to cloud judgement and draw focus to the wrong thing. How a bunch of people who think intelligence is what makes them special can easily talk themselves into believing 1000 IQ has any kind of meaning and means instant death because obviously just being smarter than everyone is all it takes.

Expand full comment
Donald's avatar

> How a bunch of people who think intelligence is what makes them special can easily talk themselves into believing 1000 IQ has any kind of meaning and means instant death because obviously just being smarter than everyone is all it takes.

Ah, the argument from imaginary biases.

On the other side, if a bunch of smart people have come to a conclusion, maybe they know something you don't.

IQ1000 doesn't really have a well defined meaning, because the IQ scale is based on humans. Doesn't mean superintelligence isn't a thing, just that we can't measure it.

We have a track record of smart humans inventing all sorts of powerful tech, is it that much of a stretch to believe something smarter could invent even better tech. Obviously it needs some resources, but they needn't be large or unusual resources.

Expand full comment
Thegnskald's avatar

I'm smarter than everybody, and I've come to the conclusion that it isn't a threat. Maybe I know something you don't know? Or maybe that's just a silly argument.

In my experience, smart people aren't actually any better at getting the things they care about right; they have some advantage in getting things they don't care about right, but the ability to rationalize increases with intelligence, and the desire to rationalize increases with the degree to which you care about a subject - and yet there's nothing to intelligence that inherently produces more careful and deliberate thought, and everything to discourage the development of the kinds of careful and deliberate thinking habits that lead to correctness (why bother, when you can just look at the problem and know the solution?).

AI catastrophism is not discussed with careful and deliberate thinking, it is discussed with thought experiments with missing steps. Indeed, the missing steps are often the central focus! "I can't imagine how to do this because I'm not smart enough, but a smarter thing could figure it out."

That is not careful and deliberate thinking, that is a narrative, a fictional story trying to portray something smarter than the author. But the author can't actually write a character smarter than themselves; they can only employ little tricks.

Could AI be a danger? Sure! But no complete and careful has actually been made about the likelihood and extent of the danger; they're all full of critical missing steps. That's the point, you say, that we can't imagine what a smarter thing than us is capable of?

Sounds like religion, to me, with a heavy dose of "The fact that I can't make a better argument is evidence that my argument is correct" circular logic. The positive AI adherents imagine a benevolent God, the negative AI adherents imagine a malevolent God - and I'm over here wondering why "I can't argue for this because I don't know how it could happen, but a smarter thing could make it happen, and therefore my argument is correct" is being taken seriously.

Expand full comment
Carl Pham's avatar

I really like your second paragraph. An important point, well expressed.

Expand full comment
skybrian's avatar

I think the argument works a bit better as a fatalistic one: start by assuming we're all doomed.

In a way we are. Except in the unlikely event that immortality is invented, we all die someday, and we don't know when. This doesn't mean we give up hope. You don't know what you'll die of and you still have the rest of your life to live, however short it might turn out to be. Ignoring our eventual doom is how we keep going.

It proves too much, though. Why worry about risks at all?

Perhaps we should worry about risks we can do something about, and not the other ones we can't? People will disagree on what disaster preparation is tractable, but the ill-defined risks we have little idea how to solve seem more likely to be the ones to be a fatalist about?

Expand full comment
dionysus's avatar

The alien example is a good one, but not for the reasons Scott thinks. If an alien spaceship is actually coming toward Earth, even if the aliens did want to kill us all, what can we do about it? Absolutely nothing. Any pre-emptive military response is overwhelmingly unlikely to be effective and will lead to more hostility from the aliens than they were originally planning. Instead of founding a doomsday cult around the fear that the aliens will kill us all, we might as well assume it's one of the other possibilities and find ways to make the best of the alien presence.

"There have been about a dozen times a sapient species has created a more intelligent successor species: australopithecus → homo habilis, homo habilis → homo erectus, etc - and in each case, the successor species has wiped out its predecessor. So the base rate for more intelligent successor species killing everyone is about 100%"

That's not how it worked. Australopithecus *became* homo habilis. They two didn't evolve on separate continents and fight an existential war where one genocided the other. The proper analogy would be if man and machine morphed into one superior entity that will take the human species to new heights.

"In order to generate a belief, you have to do epistemic work. I’ve thought about this question a lot and predict a 33% chance AI will cause human extinction; other people have different numbers. What’s Tyler’s? All he’ll say is that it’s only a “distant possibility”. Does that mean 33%? Does it mean 5-10% (as Katja’s survey suggests the median AI researcher thinks?) Does it mean 1%?"

All those numbers are meaningless. They give a sense of mathematical precision and hide the fact that they're no better than mere hunches. As a hunch, a "distant possibility" is perfectly acceptable.

Expand full comment
AntimemeticsDivisionDirector's avatar

>if the aliens did want to kill us all, what can we do about it? Absolutely nothing.

At the very least, marginally more than nothing.

Expand full comment
Shalcker's avatar

Just 100km spaceship from known materials could be "safely" nuked or intercepted kinetically.

But that in turn would create uncertainty on wherever there are more of same or bigger ships where it came from.

Expand full comment
dionysus's avatar

Aliens who built a 100 km spaceship and are bent on genocide would not fail to think of "what if our victims try to kill us first?"

Expand full comment
Martin Blank's avatar

There is tons of stuff we could do. First we should probably talk to it, but it is is hostile it might be pretty easy to destroy.

Expand full comment
Purpleopolis's avatar

A ship built of materials that can stand the stresses of interstellar acceleration with a lever arm of 100 miles? What kind of definition of "easy" are you using here?

Expand full comment
Martin Blank's avatar

Put a bunch of shit in its way. It is a lot easier to destroy than preserve. Space warfare is going to be a radically offensive environment IMO, defense won't work, offense will be cheap and effective.

Expand full comment
Purpleopolis's avatar

A ship that has travelled interstellar distances has already encountered more shit in its way than we have the capability to move.

Expand full comment
Martin Blank's avatar

But not necessarily all at once, or with engines on it, and overall the interstellar ship is likely to be moving very fast.

Expand full comment
dionysus's avatar

If we put a bunch of shit in its way 10 days before it arrives, the ship only needs to move sideways at 0.2 m/s to avoid the shit. That's a trivial maneuver even for our probes, let alone for a spaceship that can travel between stars.

Expand full comment
Jonathan's avatar

In the words of TC, "the humanities are underrated."

Going from "GPT is more convincing at a Turing test with 10 billion parameters than with 1 billion parameters" to "this will just scale forever into superintelligence" is smuggling in too many assumptions.

Expand full comment
Gres's avatar

Sure, but “this will just scale forever and eventually it will be able to write advertising copy” smuggled in most of the same assumptions five years ago, and it’s probably true given what we know now. Just because something is uncertain doesn’t make it unlikely.

Expand full comment
Jonathan's avatar

The difference is that there are billions of examples of advertising copy that already exist. That GPT can copy its homework for something it has billions of examples of in its training data isn't surprising. Whereas superintelligence is not even a well-defined concept. There are no examples of it in the training data, and there never could be, even if examples did exist; the training data is static information, not the intelligent processes that created that information.

The fact that with enough training data, GPT can very accurately immitate its training data, follows directly from the theory of what GPT is and how it works. There is a solid explanation of *why* this is the case. Whereas the suggestion that with enough training data, GPT can transform into a hitherto nonexistent type of intelligence that surpasses everything that's ever existed, is a non sequitur. Which isn't to say that it's proven to be impossible, just that the argument isn't strong enough to support such an extreme response.

There is not even any evidence that superintelligence as a concept (in a way that is qualitatively superior to human intelligence) makes sense. That's another one of the huge assumptions being smuggled in.

Expand full comment
Gres's avatar

I think the fear is that a big enough room full of experts in enough different sub-sub-fields could come up with plans that are possible now, but just haven’t been done yet. For example, I can now imagine that in twenty years, a maths professor will find it easier to hand a problem to an AI than to hand it to a student. The GPT-4 paper linked above shows the LLM producing enough relevant facts that it can draw conclusions, for easy problems, but producing relevant facts until you get a solution is one of the main ways maths advances.

Expand full comment
darwin's avatar

>If you have total uncertainty about a statement (“are bloxors greeblic?”), you should assign it a probability of 50%.

This seems wrong to me? Most arbitrary statements are wrong ('this chair is blue, this chair is green, this chair is white,' etc for 500 colors, only one will be right).

Maybe you mean more like if a human makes a statement you should assign 50%, because humans are more likely to say true things than completely arbitrary things. But I don't think that makes 50% the right number, and I don't think that's 'total uncertainty' anymore.

Maybe the point is that even taking the outside view of 'most arbitrary statements are wrong' is itself a kind of context, and if we're talking truly no context then 50% is max entropy or w/e.

Expand full comment
Taleuntum's avatar

This chair is not blue, this chair is not white, this chair is not green

Expand full comment
Chris's avatar

Yes, the bit at the end is the point.

If you know that "most arbitrary statements are wrong", and that "bloxors are greeblic" is an arbitrary statement, then you definitely don't have "*total* uncertainty" about the statement.

Expand full comment
darwin's avatar

I feel like there's another specific smaller fallacy getting invoked here, which I see all the time but don't have a name for.

Which is basically people looking at someone who is declaring something a crisis and trying to rally support to solve it, and saying 'people have declared lots of crises in the past, and things have usually turned out fine, so doom-sayers like this can be safely ignored and we don't have to do anything.'

The fallacy being that those past crises worked out fine *because* people pointed out the crisis, rallied support behind finding a solution, and solved it through great effort. And if you don't do those things this time, things might not work out well, so don't dismiss the warnings and do nothing.

Expand full comment
The Ancient Geek's avatar

The Y2K bug is an example. I call it "confusing warnings with prophecies".

Expand full comment
Moral Particle's avatar

At the risk of being dismissed as "reference class tennis," the Y2K analogy is an interesting one to consider. On the one hand, it was a real problem and there is no doubt that the intense media focus ("hysteria?") and dire predictions helped galvanize companies and governments to work on the problem with energy and urgency. On the other hand, there were a surprising number of Y2K doomers who insisted that there was "NO WAY" companies and governments could fix all the necessary code in time and that great economic and societal harm - or even collapse - was inevitable. Some of the doomers were just culty, but some put forth comprehensive, scientistic arguments - there's X amount of code that needs to be fixed, and it would take Y amount of time to do it properly, but there's only Y minus Z amount of time before the year 2000, and if the Y2K bug is not fixed (and it can't be!), the result will be utilities and financial institutions etc. fail, so doom is the likely result. It was science and reason!

Expand full comment
Gunflint's avatar

There is risk from AI in the fairly near term in that it could have a lot of information about us thanks to the bread crumbs each on of us leaves when we search, when we purchase, or when we subscribe.

Targeted advertising could morph into targeted propaganda that may be well nigh irresistible in some cases. This would be at the direction of humans though.

But the paper clip thing, or the much less likely AI with a will and agenda of it own? I don’t see it. This is *simulated* intelligence we are talking about to date. It *understands* exactly nothing.

Expand full comment
Martin Blank's avatar

Yeah I think the "AI creates dystopian hellscape" is about 10X-20X more likely than "AI kills us all" in terms of risks.

Expand full comment
AntimemeticsDivisionDirector's avatar

>If you have total uncertainty about a statement (“are bloxors greeblic?”), you should assign it a probability of 50%.

No, you shouldn't. You should ask the question-asker for appropriate clarification if you can, and if this is impossible or not forthcoming you should assume you don't understand them properly or ignore the question as nonsense. Neither bloxors nor greeblic are words in any language I can find record of. Uncertainty Vs. Certainty has no place here.

What probability of truth would you assign the following question said by me: "To which departure gate would you assign the crate mason, the one who had one of that place?"

Now maybe some alien civilization is making errors in your language, which is a different matter. But in that case you still shouldn't be assigning every incomprehensible transmission you receive a 50% chance of being true as stated.

Expand full comment
Korakys's avatar

Prophecising Doom has a zero percent success rate. That's the base rate.

Remember the trolley problem? No-one talks about that now.

Expand full comment
Scott Alexander's avatar

It will never be April 2023, because it has never been April 2023 before, so the base rate is 0%.

Expand full comment
Korakys's avatar

There are very good mechanistic reasons to believe that April 2023 will happen (leaving aside that calendars are a social construct).

If you want more people to take this seriously you need to really get down into the mechanistic details of how an AI apocalypse will play out. The literal physics of it. Focus on that and more people will pay attention.

At the moment you sound a lot closer to the religious doomers, Millennialists, of ~150 years ago than the nuclear war doomers of cold war era. The slight difference here is that: humans invent god > the mind of god is unknowable > god might decide to kill all humans > AI apocalypse is a strong possibility that we should all be worrying a lot about.

Maybe I'm just some stupid pattern matcher but this pattern has played out so many times before: before people understood science they'd invoke god as deciding that soon would be good time to end the world. Later people invoked science: nuclear war, ozone hole, climate change, ...AI.

This is fear of the unknown, but it's not a balanced fear. There was a fear that Trinity could ignite the atmosphere, but in retrospect that seems ridiculous (fusion is a poster-child hard problem). But at the time? Yeah, it was not taken very seriously then either: the mechanics just didn't hold up. Give us mechanics and your ideas will be ripped apart by people far, far smarter than me. Then you'll be able to go back to sleeping comfortably at night.

I've been reading you for 7 years (you're my favourite blogger) and I read HPMOR before that. I started out neutral on this idea but I've slowly come to the belief that AI isn't something to worry about, all the while reading your arguments. Most people will not be convinced by philosophy, they need physics.

Expand full comment
Xpym's avatar

>Most people will not be convinced by philosophy

Most people still believe in the magical sky father, with zero evidence of him ever doing anything. AI doomers just aren't particularly good at philosophy optimized for convincingness, which doesn't have much to do with truthfulness.

Expand full comment
Korakys's avatar

You make a good point. I do think though that most religious people aren't convinced so much by philosophical arguments for god but rather that everyone/most people around them are already doing it.

Expand full comment
Xpym's avatar

Indeed. The game is about convincing the tastemakers, and because this particular philosophy isn't trivially useful to them, it's an uphill battle. The only realistic chance for doomerism is at least a Chernobyl-size disaster clearly attributable to AI.

Expand full comment
WindUponWaves's avatar

I can't do physics, but you might be interested in the mechanics: the exact steps an escaped AI might do to build up a dangerous amount of power, what exactly we should look out for if we suspect that's what happened, and overall the mechanisms by which an AI could pose a risk to humanity.

The top three explorations/scenarios of this I can think of would be Holden Karnofsky's "AI Could Defeat All of Us Combined" [https://www.cold-takes.com/ai-could-defeat-all-of-us-combined/], Ajeya Cotra's "Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover" [https://www.lesswrong.com/posts/pRkFkzwKZ2zfa3R6H/without-specific-countermeasures-the-easiest-path-to], and PolymorphicWetware's "Terrorism slingshot takeover" [https://www.reddit.com/r/slatestarcodex/comments/11f1yw4/comment/jcjw6m9/?utm_source=reddit&utm_medium=web2x&context=3]

Let me summarize each with a quote:

Karnofsky: "Even "merely human-level" AI could still defeat us all by quickly coming to rival human civilization in terms of total population and resources... At a high level, I think we should be worried if a huge (competitive with world population) and rapidly growing set of highly skilled humans on another planet was trying to take down civilization just by using the Internet. So we should be worried about a large set of disembodied AIs as well."

Cotra: "Relatively shortly after deployment, Magma’s datacenter would essentially contain a populous “virtual civilization” running ahead of human civilization in its scientific and technological sophistication. Humans would send instructions / communications and reward signals to the “Alex civilization,” and the “Alex civilization” would send out things like software applications, designs for computer chips and robots, orders for synthesized DNA and chemicals, financial transactions, emails, and so on to the human world.

Magma researchers looking in on the activity in the datacenter would become increasingly lost about what’s going on -- they would be in a position like someone from 1700 trying to follow along with a sped-up movie of everything that happened from 1700 to 2022..."

PolymorphicWetware: "Ultimately it's the exact same playbook as many historical dictators (Manufacture a problem, stoke fear, sell yourself as the solution), the sort of thing pop culture uses the likes of Emperor Palpatine to bring to life & illustrate. But it's a playbook an AI 'dictator' could use as well to gain power — as in, it could just do the exact same thing but with worse consequences since it's an AI instead of a human... all the parts are already out there...

...

It doesn't have to be the US that falls for this. Practically any rich & developed country could give a useful amount of resources to an unaligned AI if manipulated this way, including countries like Russia & China. And countries like Russia & China don't even have to be manipulated for their governments to be open to the idea of AI-powered surveillance. The possibility I've presented isn't theoretical, it could happen right now at the whims of Putin or Xi Jinping."

Expand full comment
Korakys's avatar

Thanks, I will check them out.

Russia's a joke at this point though. The big three should be USA, China, and India (if not now then soon).

Expand full comment
Sandro's avatar

> If you want more people to take this seriously you need to really get down into the mechanistic details of how an AI apocalypse will play out. The literal physics of it.

Consider how much of our civilization runs purely on information and information infrastructure. Think financial systems, hospitals, biolabs, manufacturing, identity systems, surveillance via ubiquitous phones. A superintelligent AI is by definition super-capable at processing information. Specifically, it would be able to capitalize on information humans typically won't be able to see without some time and effort (like vulnerabilities in software).

Therefore, a superintelligent AI would be able to more effectively manipulate the information systems we depend on than even the best human hacker groups in existence, the NSA included. Imagine the havoc it could wreak with that ability alone. Imagine the resources it could steal and bring to bear for some long-term nefarious plan. It could coerce or bribe humans to do real-world work for it.

That took all of 2 minutes of consideration on just the most basic properties of any superintelligent AI, and it's already terrifying.

Expand full comment
Korakys's avatar

Humans can get by without information infrastructure. Tests include things like ransomware taking down hospital systems. The efficiency is greatly reduced but we can get by.

How does the long term nefarious plan remain undetected... I could just as easily say in two years humans will figure out how know exactly what an AI is thinking (we aren't that far off doing that with humans now).

Expand full comment
Sandro's avatar

Humans can get by without tech, civilization cannot. You're looking at a total collapse of all cities where most humans live, collapse of supply chains providing food and other essential goods. And if it hits the power grid too, even worse.

> I could just as easily say in two years humans will figure out how know exactly what an AI is thinking

We don't yet understand how LLMs work, and we have total bit-level visibility on LLMs, and they're just about the dumbest thing we can do with transformers. Maybe we could figure out how to read their intentions and detect deceit, the question is whether we can do that *before* we hit AGI and the attendant risk of escape. That isn't clear at all.

Expand full comment
Korakys's avatar

When my parents were born there were no computers. It wont be comfortable for sure and they'd be a rough adaptation period, but it's still a decent civilisation.

There is hope we will figure out LLMs. Tech and science is very much a field that moves slowly then suddenly. Maybe we figure it out soon, maybe it takes 100 years. Maybe progress on LLMs slows rapidly soon as limits on the how much good data is available for training are approached. It's hard to be sure.

https://spectrum.ieee.org/black-box-ai

Expand full comment
Purpleopolis's avatar

Calendar makers have a much better track record of predicting dates than AI experts do at predicting AI milestones.

Expand full comment
Hank Wilbon's avatar

The core of Tyler's argument in that post is:

"Hardly anyone you know, including yourself, is prepared to live in actual “moving” history. It will panic many of us, disorient the rest of us, and cause great upheavals in our fortunes, both good and bad. In my view the good will considerably outweigh the bad (at least from losing #2, not #1), but I do understand that the absolute quantity of the bad disruptions will be high...

I would put it this way. Our previous stasis, as represented by my #1 and #2, is going to end anyway. We are going to face that radical uncertainty anyway. And probably pretty soon. So there is no “ongoing stasis” option on the table.

I find this reframing helps me come to terms with current AI developments. The question is no longer “go ahead?” but rather “given that we are going ahead with something (if only chaos) and leaving the stasis anyway, do we at least get something for our trouble?” And believe me, if we do nothing yes we will re-enter living history and quite possibly get nothing in return for our trouble."

In other words: "And therefore we will be fine." is not at all where Tyler lands on this post.

Expand full comment
Martin Blank's avatar

>And believe me, if we do nothing yes we will re-enter living history and quite possibly get nothing in return for our trouble.

What is even meant by this part? His argument seems to be "it is going to happen anyway", therefore if we don't go forward we might not get anything. Except if it is going to happen anyway, then not going forward simply isn't even an option.

Expand full comment
Hank Wilbon's avatar

I take "we" to mean "The USA". Tyler's view is that China is 100% going forward with advancing AI technology. See: https://marginalrevolution.com/marginalrevolution/2023/03/yes-the-chinese-great-firewall-will-be-collapsing.html

"We do nothing", meaning we, The USA, pause AI advancement, lands The USA, The West in general, in a world in a few years in which we are behind our growing superpower rival. Not a world we should want to be in.

Expand full comment
Martin Blank's avatar

I always take "we" when talking about existential concerns of human civilization being destroyed as talking about everyone. Who the fuck cares about the USA in that context?

Expand full comment
Hank Wilbon's avatar

Here is the beginning of Tyler's post:

"In several of my books and many of my talks, I take great care to spell out just how special recent times have been, for most Americans at least. For my entire life, and a bit more, there have been two essential features of the basic landscape:

1. American hegemony over much of the world, and relative physical safety for Americans.

2. An absence of truly radical technological change.

Unless you are very old, old enough to have taken in some of WWII, or were drafted into Korea or Vietnam, probably those features describe your entire life as well.

In other words, virtually all of us have been living in a bubble “outside of history.”"

In context, it seems clear that "we' means Americans and possibly Western Europeans. It is precisely those people who have been living "outside of history" since the Vietnam War and who have experienced what Tyler calls "The Great Stagnation" in his book by that title.

Expand full comment
Martin Blank's avatar

Well how small minded of him.

Expand full comment
Sandro's avatar

> 2. An absence of truly radical technological change.

The proliferation of the PC, the internet and mobile communications are all radical technological changes that occurred during my lifetime. If that's one of his foundational assumptions, then the whole argument seems like bunk.

Expand full comment
Martin Blank's avatar

Yeah when exactly was a period of radical technological change if not the past decades?

Expand full comment
Hank Wilbon's avatar

You'd probably have to read "The Great Stagnation" to get all his arguments on that issue. He has written hundreds of pages on it. Keep in mind that he writes his blog for his readers, and yes, his style can be very "Bangledeshi train station writing", meaning not every post is meant to be understood by everyone who hasn't kept up with his other writings over the years, or, in some cases, by anyone who didn't have lunch with him that day. He makes up for his occasional obscurity in quantity of posts. Scott's style has way more authorial generosity than does Tyler's, but if you are a complete newbie to rationalist-style blogs, many references here will also seem arcane.

Expand full comment
walruss's avatar

I always took it more as "the threat has no known shape so there's no point in trying to combat it."

There's always shades of a Pascal's Mugging in these discussions (I claim to have built a computer that can simulate and torture every human ever created. Probably I'm lying but just to be safe send me money so I don't). But I assume this community is smart enough to draw some principled line between that and "maybe AI will kill us all so we should treat it as a threat." I don't understand that line but I also put the odds of AI killing us all many orders of magnitude lower than you do.

Instead I'd like to propose Pascal's Bane:

If you don't do something there's a small chance the world will end. So we should do something but we have no idea what, and any action we take could actually increase that chance by some small amount so to be honest we're wasting our time here let's go get drunk.

And worse, there's serious tradeoffs to pursuing a random strategy like "don't build AI" to combat a small potential risk. Every time we aggressively negotiate with China or Russia we create a small chance of humanity-ending thermonuclear war but very few people see that as a good reason to never press an issue with those nations.

Expand full comment
James's avatar

Interestingly, a lot of people in the AI alignment communities did advocate for basically leaving Russia alone to do whatever during the last year because of the small risk of nuclear war. There's no counterfactual so who knows for sure, but does seem to me like it would have been a pretty bad plan w/r/t both Russia and China. I don't want to stretch the analogy too far because it's a different domain but it's possible that it says something about these same community's median risk tolerance when it comes to other things like AI.

Expand full comment
Oliver's avatar

> The base rate for things killing humanity is very low

I'm skeptical of this claim - if humanity went extinct we wouldn't be here to talk about it (anthropic selection).

I reckon the base rate for extinction is actually pretty high:

* https://en.wikipedia.org/wiki/List_of_nuclear_close_calls is a thing

* We're pretty early: if you were randomly born uniformly across all humans, the p-value of being born in this century would be pretty low if you thought humanity was going to exist for millions more years. So we might consider that evidence for an alternative hypothesis that we won't last millions of years.

Expand full comment
TGGP's avatar

Those close calls didn't threaten extinction:

https://www.navalgazing.net/Nuclear-Weapon-Destructiveness

Expand full comment
Caba's avatar

"We're pretty early: if you were randomly born uniformly across all humans, the p-value of being born in this century would be pretty low if you thought humanity was going to exist for millions more years. So we might consider that evidence for an alternative hypothesis that we won't last millions of years."

I think it's an excellent point, one in which I've always believed (it proves that either humanity is about to end or population will decline drastically - note that, considering that due to the increase in world population most people that were ever born were born recently, it doesn't look like we're at the halfway point of humanity's years - because you must count people not years - if you reading this were born about halfway through humanity's births then you're at the end of humanity's years), but I find it very, very difficult to put it in a way the other people will take seriously or even understand.

Does anyone else agree, and perhaps has found another way of making that point, a way that people find convincing?

Expand full comment
Macil's avatar

It's often known as the Doomsday Argument or Carter Catastrophe.

Expand full comment
Martin Blank's avatar

>it proves that either humanity is about to end or population will decline drastically

It "proves" no such thing.

And no it is not a compelling argument because *someone* needs to be alive then.

Say you have a group of 100 rabbits that eventually grow to a stable community of a billion rabbits, existing for billions of generations. Say you are rabbit number 4,542,235,214. The chance of you being that rabbit is crazy crazy small. Near zero. But it is not very strong evidence about the total rabbit population at all.

Expand full comment
Caba's avatar

"Say you are rabbit number 4,542,235,214."

That's the problem - you are assuming I'm that rabbit. But I'm very unlikely to be that rabbit. I'm much more likely to be one of the rabbit in the middle (assuming that there is a total of quintillions).

Yes, someone has to be that rabbit, and if that rabbit tries to infer the total number using that type of reasoning, that rabbit will be wrong. But it doesn't matter, because it's a probabilistic argument, and the rabbit that is wrong is immensely outnumbered by ther rabbits who are right (assuming every rabbit thinks in the way I described).

A person who wins a lottery might infer that it's easy to win one. Their conclusion is wrong, but their inference is logical.

Expand full comment
Martin Blank's avatar

Their inference isn’t logical if they know how the lottery works. That is the point.

The data we have on humans is one of pretty consistent growth, and one where most everyone who has lived has lived recently. So it’s not that surprising you find yourself where you are.

Expand full comment
Purpleopolis's avatar

The math behind the Doomsday Argument is actually pretty good -- it's the reason why serial numbers on military materiel are somewhat randomized. The allies used captured equipment (and the sequential serial numbers on them) to get an idea of the Nazi's total production.

Expand full comment
Martin Blank's avatar

Here is the problem with the math.

Lets say I develop a a new creature that is just like a mosquito, but cannot harbor malaria and is 200% better than a mosquito at being a mosquito. It is rapidly outcompeting "normal" mosquitos. and has driven them extinct in my state.

The total all-time population in 2020 in my lab was 80. At first release in 2021 it was 160,000, in 2022 it was 320,000,000 and had colonized most of my metro. In 2023 is when it overran my state with a total all-time population of 640,000,000,000.

What do you expect the species lifespan and max population of these mosquito variants to be? The Carter Catastrophe says "well the mosquito should not expect itself to be special so it should have a 95% confidence that the lifetime total population of roughly 1/20th the current total population.

That seems dumb and like you are throwing out TONS of information about the likely total population.

Facts which might include:

Are people (or myself working on different strains which might outcompete my strain)?

How quick is the generation time and how many generations might occur before you turn such changes around?

How large is the world and available uncolonized habitat?

Have there been problems with the "new*" mosquitos such that people might engineer control species.

Etc.

We have LOTS of facts about what to expect from the human population. We are NOT in a situation where the only fact we have about humanity is the exact number of humans living at each point over time. And even with that fact, I struggle to see how the Carter argument is more compelling than simply taking the growth curve at face value.

Anthropological arguments are very tricky because you can prove almost anything with them.

Expand full comment
Purpleopolis's avatar

There is a difference between the math involved and the assumptions behind the model that the math is being used in.

The math literally (as in literally) works (as in gives testable predictions that agree with reality) when applied in appropriate situations.

Expand full comment
eldomtom2's avatar

"* We're pretty early: if you were randomly born uniformly across all humans, the p-value of being born in this century would be pretty low if you thought humanity was going to exist for millions more years. So we might consider that evidence for an alternative hypothesis that we won't last millions of years."

Someone could have said this 2000 years ago and they'd definitely have been wrong.

Expand full comment
Caba's avatar

See my answer to Martin Blank.

Expand full comment
Tom's avatar

Here would be my attempt to steelman the argument. I'm not sure I endorse this, though I there's a kernel of truth to it.

Put yourself in the shoes of a European noble in 1450 who thinks this printing press thing is going to be a big deal. What exactly are you supposed to do about it?

You could lobby the king to ban the press. But then people in other countries will just do it anyway. Maybe you can get a lot of countries to agree to slow it down. But what does that actually achieve? Unless you truly stomp the technology out entirely, the change is going to come. It's not clear that slower is better than faster. And totally stomping it out would mean forgoing all the possible benefits for all time.

You can theorize a lot about possible effects and work to mitigate the harms. But you probably don't predict that some monk is going to start a religious shitstorm that leads to some of the bloodiest wars ever fought. How would you even begin to see that coming? And if you did, what would you do about it?

At the end of the day, history at this level is simply beyond humanity's ability to control. This is the realm of destiny or god's divine will. Trying to shape it is hubris. The proper thing is to just try and respond to each thing as it comes, doing your best. Trying to hold back the inevitable tide of history is unlikely to succeed and as likely to harm as to help.

Expand full comment
BK's avatar

One of the arguments about why the printing press didn't make a difference in china was because the ruling classes took a stance and stopped it. Similarly, on Tyler's podcast with Tom Holland, the ruling classes of Rome successfully stifled innovation in order to ensure some semblance of stability. Seems like only the post renaissance west is incapable of halting "progress", and yet all the arguments in The Complacent Class, The Great Stagnation and from liberal economics generally about how the modern regulatory state has stifled innovation seem to point in the direction that the west is learning that ability.

Expand full comment
Xpym's avatar

Seeing how Baidu is eager to rush out its half-baked ChatGPT clone, China doesn't seem to be particularly good at stifling technology these days.

Expand full comment
Shankar Sivarajan's avatar

To a person with sensibilities like mine, that the ruling class takes a stance against something for the sake of stability is a strong argument in favor of that thing.

Expand full comment
Deiseach's avatar

"One of the arguments about why the printing press didn't make a difference in china was because the ruling classes took a stance and stopped it."

How is that so? Some cursory online searching tells me the printing press was adopted to produce the kinds of textbooks and materials needed for an educated class to pass the civil service exams, which doesn't sound like 'the ruling classes stopped it'. There's some mention that it was simply too unwieldy to be a mass market innovation due to the number of characters needed to print in Chinese, unlike the simple letter alphabets of the West.

While there were attempts at controlling presses in Europe, via licences and prosecutions, I think that China being a unified empire made it a lot easier to stifle any upstarts trying to produce forbidden material; contrariwise, if you were a dissident printer in England, you had the opportunity to hop on a boat to the Netherlands and keep producing your revolutionary pamphlets there, if you weren't being protected by an influential person in power - I can see why Chinese high officials might not be too eager to risk their own necks protecting printers issuing dissident material, because that would be cutting off the branch they sat on themselves.

In England, the upheaval in religion meant that printers could be protected, or prosecuted, depending on which side they picked, and it seems that producing books was often a multi-national project:

"That same summer Barton’s supporters brought out a new pamphlet containing her latest angry denunciations in an edition of 700; the printer was arrested and cross-examined, and not one copy now survives. This comprehensive and effective censorship of the printing press was without English precedent, and a tribute to Cromwell’s efficiency."

"By 1535 Tyndale had been living in Antwerp for around six or seven years, fearsomely productive in propaganda and Bible translation, and benefiting from the indulgence of the city’s leading printers, who were happy to publish his works with a false imprint; the considerable demand for this sacred contraband back in England was too good to neglect, whatever the risks. For much of this time he counted on protection and shelter from evangelical English merchants living in Antwerp, latterly the well-connected London merchant Thomas Poyntz, who was of the same family as the West Country knight to whom Tyndale had acted as domestic tutor in the early 1520s. Poyntz was to sacrifice his prosperity and inheritance in the cause of Tyndale and godly religion."

"He could rejoice in fulfilling one of his greatest long-term plans, a lasting memorial to his Vice-Gerency: the publication at last of a fully official English Bible. This had been a long time in the making. It was a thorough revision of the Matthew Bible whose authorization he had obtained in 1537, and was prepared in Paris, through an ambitious co-operation between the French printer François Regnault and Cromwell’s regular London printers Richard Grafton and Edward Whitchurch, with the hugely experienced Miles Coverdale in charge of textual revision. From a technical point of view, working in Paris made sense, for the French printing industry was far better able to cope with such a complex print-run."

Expand full comment
BK's avatar

That is an excellent question - now I look I can't find where I picked up the idea about China. I thought it was rootsofprogress or maybe a Marginal Revolution link/interview, but I'm not turning up anything when going back over likely candidates from those sources.

Expand full comment
Tom's avatar

Yeah, I think it is possible for a society to "just say no" to certain technologies, at least until an outside power comes along to force the issue.

But I think that leaves you with a binary - you either take the plunge and see what happens, or you refuse entirely. It seems like half-measures are unlikely to be very stable. Once you allow progress, there will be massive incentives to accelerate. And even if you slow things down, it's not clear what you do with the time you buy, since what comes next is unpredictable and once it gets going might not be stoppable.

So maybe the argument is really that we can either abandon AI entirely, Butlerian Jihad style, or take the plunge and hope for the best.

Expand full comment
Victualis's avatar

A concerted effort to spread a philosophy of peace and tolerance for the hundred years after the printing press might well have made European society less likely to splinter into centuries of infighting with doctrinal issues as pretexts. Similarly we can try to build antifragile, or at least robust, social systems beginning now to reduce the kinds of harm that new AI technologies can cause.

Expand full comment
Tom's avatar

(this is still in steelman mode. I do not directly endorse this argument)

I think there's two different pieces there.

I think that in hindsight it seems that your plan would have been a good one. But I really strongly doubt that anyone alive in 1450 would have been able to see and implement that idea.

It's just as easy for me to imagine the hypothetical printing press alignment guy being worried about trashy fiction as a superstimulus or something. So he gets everyone together to say we need to make sure we only print good wholesome stuff, like religious texts.

Whoops.

In reality, our guy in 1450 has no idea where the threat is coming from. And even if he does, he has to convince a lot of other people he's right, which will be very difficult. His actions will be heavily constrained by various neuroses and culture war issues of his own time that blind him and his peers to the actual threat.

Realistically, I think the best he can do is try to decelerate the pace of change by banning the presses. Is that better than nothing? You might think that decelerating is still good because it buys you time. But it buys you time to do what exactly? You're probably just going to get flattened by something you never anticipated and all your delay was silly and pointless.

All that said, I actually think you have a good point about building antifragile institutions that are able to weather the storm that's coming. That's a good idea. But I'm not sure that's a point against Cowen's argument. Building strong, effective institutions is great and we should be doing it. But I'm not sure that entails that we need a massive push to slow down AI.

Realistically, do we think our institutions in a few decades will be better or worse than right now? My guess is they'll be about the same. So I don't think hitting the brakes on AI makes sense if all we're going to do with that time is try and improve our institutions a little bit.

Expand full comment
Victualis's avatar

I agree with your assessment that slowing AI seems orthogonal.

Expand full comment
Bardo Bill's avatar

Cowen's conclusion seems absurd to me. "Everything's going swimmingly with industrial technology so we should just go ahead and accelerate the current trajectory."

We're altering Earth's climate, destroying biomes across the globe, and are very clearly on track to bring about a global ecological collapse, probably including the planet's sixth mass extinction event. Yes, it's nice that we have toasters or whatever (for now), but *taking the long view* most certainly doesn't suggest that "ehh, radical technological change will turn out fine" is a reasonable attitude.

Expand full comment
TGGP's avatar

No we are not "very clearly on track to bring about a global ecological collapse". Recent estimates of the worst-case scenario for climate change have gone down in probability, and this is related to technological progress that has helped us shift away from coal. Forests are regrowing in many places that no longer need to be used for farmland. Technological progress makes "mastery" for climate and other ecological features more likely:

https://betonit.substack.com/p/the-meaning-of-climate-mastery

Expand full comment
Alex's avatar

We also have no idea which random incantations will summon a vengeful god to rain destruction on earth, but you're not so worried about people chanting gibberish...

Expand full comment
Eremolalos's avatar

Hey, I get it. But I keep wondering what's up with you, Scott, who place the odds of disaster at 33%, Zvi who gives no numbers but seems to consider death-dealing disaster likelier than 50%, and Yudkowsky, who mostly sounds sure we're doomed. If you guys all believe disaster is that likely, why are you tweeting and blogging your thoughts instead of brainstorming about actions that might actually make a difference? It makes me feel as though this is all some weird unacknowledged role-play game, and "the

AI's gonna kill us if we don't stop development" is just the premise that forms a dramatic background for people to makes speeches about dying with dignity, for people to put up posts that display their high-IQ logic. Scott, even if your present post spurs a remarkably honest and intelligent discussion of how to think straight about AI risk, how much difference do you think the occurrence of such a discussion is likely to make to AI risk itself? Do you think it will reduce the risk even 0.1%? You, Zvi and Yudkowsky are like guys lounging on the deck of the Titanic. Yudkowsky says he happened to fly over this area in a hot air balloon right before the the ship launched and he saw that it was dense with icebergs. No way for a ship to thread its way through them. And now with the special telescope he brought with him he can see the one the ship's headed right for, about 4 hours away. So he's sitting there with like a plaid blanket over his legs, rambling on about death with dignity. Zvi's playing cards with a group of blokes, winning and winning and winning, and meanwhile talking a blue streak about the berg. And you're pacing around on deck, having exemplary discussions with followers about fair-mindedness and clear thinking, and everything you say is smart and true. But what the fuck! Aren't you going to try to save everybody? Of course you will have to be rude and conspicuous to do it, and you might have to tell some convincing lies to get things going, but still -- surely it's worth it!

I am not a terribly practical person, and have never had a thing to do with political work, not even as a volunteer for somebody's campaign, but even I can come up with some ideas that are likelier to have more impact than Yudkowsky's tweet debates with skeptics on Twitter. I have now posted these ideas once on here, once in Zvi's comments and once on Yudkowsky's Twitter. Very few people have engaged with them. I guess I'll say them once more, and just go ahead and feel like the Queen of Sleaze. It's worth it.

Do not bother with trying to convince people of the actual dangers of AI. Instead, spread misinformation to turn the right against AI: For ex., say it will be in charge of forced vax of everyone, will send cute drones to playgrounds to vax your kids, will enforce masking in all public paces, etc. Turn the left against AI by saying the prohibition against harming people will guarantee that it will prevent all abortions. Have an anti-AI lobby. Give financial support to anti-AI politicians. Bribe people. Have highly skilled anti-AI tech people get jobs working on AI and do whatever they can to subvert progress. Pressure AI company CEO's by threatening disclosure of embarrassing info.

OK, I copy-pasted that from my comment on Zvi's blog. I et that this is not the greatest plan in the world, and probably would not work. But even in its lousy improvised form, it has a better chance of working than anything that's happening now. And it could probably be greatly improved by someone who has experience with this sort of thing. I personally have never even tried to bribe someone. No, wait, one time I did. I had flown to the town where I was going to do my internship, and had 3 days to find an apartment. When I tried to pick up my rental car at the airport I discovered to my horror that my driver's license had expired several months previously. Without that car there was no way to apartment hunt. So I said to the Avis clerk, "If I can't rent this car I'm in a terrible pickle. I've never tried to bribe anyone before but -- if I gave you $100 could you just pretend you hadn't noticed my license had expired? And I apologize if it's rude of me to offer you cash this way." And she just laughed, gave me the keys to the rental car, and would not take my money. Still, if there were people trying to slow down AI using the kind of ideas I suggest, I'd be willing to be involved. I would probably be fairly good at coming up with scary AI lies to spread on Twitter.

Expand full comment
UndeservingPorcupine's avatar

If you’re a real extremist at this point, like EY, it seems your best bet would be to form a small terrorist cell to blow up ASML/TSMC plants and buildings.

Expand full comment
Eremolalos's avatar

Nope doesn't follow. Would not work, because there are so many plants and buildings involved in AI development, and I would certainly get caught, since I do not have the skills or the criminal buddy network to help me escape after my crime. Also I am not willing to kill anybody, not even a night watchman. All of which you can easily figure out on your own. So how about engaging with my real suggestion, which is likelier to work and does not even involve clearly illegal activities, instead of with an ineffective strawman involving murder and property destruction for which I am virtually guaranteed to get caught and sentenced to decades in jail?

Expand full comment
Martin Blank's avatar

Not being willing to kill people to save the world is extremely non-rational.

Expand full comment
JDK's avatar

No!

Expand full comment
Martin Blank's avatar

I think your beliefs here are very non standard. Most people in this community (and I would argue historically) are at least somewhat consequentialists.

You are saying that if the world would going to explode tomorrow, but to stop it you have to murder a baby, you would just let the world explode? That is very "god of Abraham" of you, viewing everything as some ethical/religious test.

Expand full comment
JDK's avatar

How would murdering a baby or anyone stop the world from exploding?

Anyone thinking that they are actually faced with that situation should recognize that such a thought is a sign of mental illness and should reject any impulse to murder anyone.

My anti-terrorism position is not non-standard here or anywhere.

Is this community insane or amoral? I highly doubt that.

Expand full comment
Macil's avatar

Doing this wouldn't solve the problem forever and it would stigmatize alignment work, making doom more likely.

Expand full comment
Martin Blank's avatar

If you think there is a 99% chance the world is ending in the next 5 years (which some do), then doom is so likely that even low probability of success terrorism makes sense unless you are incredibly selfish/cowardly.

Expand full comment
JDK's avatar

"Terrorism (under your parameters) make sense unless you are selfish/cowardly"

What? How about terrorism is wrong and no means can ever justify consequences. Maybe I have misread your ridiculous statement.

Expand full comment
Martin Blank's avatar

Many people here are 100%, or at least significant consequentialists/utilitarians. Consequentialists generally don't have any issue with terrorism done for the right reasons.

And I personally am certainly not 100% a consequentialist, but I think consequentialist concerns are super important. You position makes sense for a deontologist, but I don't think they are that common here.

To be more clear about what I am saying, I get skeptical that people here are as strong believing that say "AI will kill all people within 5 years", when they react at horror at the idea of committing violent terrorism over that fact. Lots of people who claim that is super likely, but the most they are willing to do is write a strongly worded letter.

Which is frankly super irrational and unethical (under most ethical systems).

Now I don't think doom is nearly that likely (I would put it in the low single digits, maybe 1%, with significant chance of positive outcomes), so nothing to fear from me. But I have two kids under 10 and if I really believe shit going down in silicon valley was going to lead to them never living to adulthood, well I would be stupid/immoral to not act immediately.

It also makes me think some of the doomers are laying status games and don't really hold those beliefs.

I am curious about:

>How about terrorism is wrong and no means can ever justify consequences.

If you were a Native American wouldn't you have pursued a policy of terrorism? What about a Jew in Nazi Germany? Terrorism is surely justified sometimes.

Expand full comment
JDK's avatar

Your premise is that the consequentialist/ utilitarian position is a "rationalist" position but that a

deontologist is not a rationalist position.

I reject that premise.

As to your question. No terrorism is never ethical. Actual Imminent proportional self defense or actual imminent proportional defense of a defenseless is not terrorism.

Expand full comment
BE's avatar

I mean, the efforts of early AI risk proponents made this from a niche topic to a mainstream idea. They managed to get huge sums of money dedicated to the issue - though whether this was in any way an actual win for them is questionable (e.g. OpenAI). Scott promotes and helps fund various AI alignment organizations. It wouldn't be fair to accuse them of idleness, whatever you think about their object-level views.

ETA - less relevant for Yudkovsky, but at least Scott did entertain the thought that we absolutely must have AI to survive Moloch, in the past. Giving 33% odds to AGI leading to human extinction and odds of similar magnitude to AGI saving us in some sense leads to an interesting dilemma!

Expand full comment
Eremolalos's avatar

But was any of that huge sum of money spent on somewhat sleazy practical efforts, of the kind normally used in politics to bring about various ends? Because it does seem to me that a 33% or higher chance of our species being killed off may not justify murder, but it sure justifies some sleaze. As for AI vs. Moloch, I have no thoughts. Scott hasn't addressed the combo question.

Expand full comment
BE's avatar

This seems to ignore the element of time? My understanding (and memory) is that Scott doesn’t anticipate this happening for at least a few decades. Possibly this was revised after GPT-4. Sleazy short-term actions may well be off-putting to potential allies and harm your cause more than help it. Not to mention not all ethics being utilitarian:)

Expand full comment
Martin Blank's avatar

>Because it does seem to me that a 33% or higher chance of our species being killed off may not justify murder

It would 100% justify murder. Our nations/corporations, and individuals kill people justifiably (and unjustifiably) over MUCH much smaller stakes.

Expand full comment
Deiseach's avatar

"but it sure justifies some sleaze"

That's what lobbyists are for 😁 Though the last attempt at wining and dining politicians in DC in order to get their attention on Suitable Topics rather crashed and burned due to (whisper) SBF:

https://www.cnet.com/culture/sbf-donated-millions-to-politicians-now-ftx-wants-that-money-back/

https://www.realtor.com/news/celebrity-real-estate/dc-townhouse-linked-to-sam-bankman-fried-ftx-founder/

Expand full comment
dionysus's avatar

I don't think the efforts of early AI risk proponents had anything to do with it. It's visible and obvious AI progress that made AI risk a more mainstream idea. I say "more mainstream", because the threat of robots killing their creators has been a theme in science fiction since at least the early 1960s (the Daleks in Doctor Who).

Expand full comment
BE's avatar

To prevent a debate about "early" - I was thinking about, e.g., the "Superintelligence" book. It was published before the "dawn of deep learning for computer vision" (2012), and people from Bill Gates to Musk referenced it as one of the sources of their interest in AI risk.

The Asilomar 2017 conference was after computer vision was taken over, but before speech and NLP were (speaking very roughly and broadly).

I don't think your separation in its purest form quite holds water.

Expand full comment
geoduck's avatar

At the risk of being gratuitously difficult: A) Daleks are not robots, and B) robots turning on their creators dates back to the 1920 play which coined the term 'robot', R.U.R.:

https://en.wikipedia.org/wiki/R.U.R.

(But to be even more pedantic, the robots in R.U.R. are an organic life form akin to Replicants; Frankenstein's monster would be prior art.)

R.U.R. is a fun complement to the current discussion. (As the film Metropolis is a fun complement to discussions of Moloch.)

Expand full comment
Scott Alexander's avatar

"You guys all believe disaster is that likely, why are you tweeting and blogging your thoughts instead of brainstorming about actions that might actually make a difference? It makes me feel as though this is all some weird unacknowledged role-play game"

With all due respect, it sounds like you're the one playing the role-playing game. "Oh, if I believe this, I would totally be acting like some kind of dashing rogue, spreading misinformation on my way to blow up the chip manufactory".

I'm not actually good at spreading misinformation. Further, as soon as the anti-AI people start spreading right-wing misinformation, all left-wing people will hate and distrust them, and they'll stop getting things like the opportunity to write editorials in TIME (see https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/ ). Trust me, you are not the first person to think "Hey, what if I commit some fraud in the service of effective altruism, I bet that will work great and I'll never be caught and it will never cause me or the wider movement any problems", RIP SBF.

Compare this to the blogging I'm actually doing, where the heads of OpenAI and DeepMind occasionally read my blog and dozens of people say on the survey that they've switched careers to AI alignment after reading my articles. I feel like this is pretty high value - if there was something that was trivially higher-value, I would switch to doing that instead.

There was pretty limited anti-AI politics before last year because no normal person thought AI could be dangerous. Now some anti-AI political organizations are starting to spring up. I've been talking to some of them and trying to get them funding (you can see one such request on this week's Open Thread https://astralcodexten.substack.com/p/open-thread-269 although I focused on the technical org instead of the political org in my pitch).

The reason I'm not promoting anti-AI politicians on here is that I haven't heard of any since April, when the last anti-AI politician I promoted, including donating thousands of dollars to their campaign, lost the primaries. You can read more about it at https://astralcodexten.substack.com/p/open-thread-217 . We are trying to get more to run, but we really messed that effort up in a lot of ways and are trying to be more humble and less in-your-face about it the next time around.

I understand you're trying to gotcha me, but I promise I have thought about this really hard and am not making simple mistakes you can notice with two seconds of thought.

Expand full comment
J C's avatar

For what it's worth, I am inclined to ignore AI risk even though I consider it somewhat likely because there's a lot of things in my life with more immediate impact on me, but your posts have certainly gotten me to care more. Though changing my mind may not be high value, I'm sure there's others that are.

Now I'm wondering if writing some theories on the specific details of how we could build a world-destroying AGI would be helpful? I don't believe they will actually work with today's technology, but they might in the future, and if people get scared now they might actually do something to stop it from happening. Or, it might accelerate things... but I'm sure people are already trying these things behind closed doors anyway.

Expand full comment
Nicholas Weininger's avatar

Scott, since you mention the Time article: I don't see why you think getting things like that published broadly can possibly be a win for public support for AI safety.

After many, many avowals that safety advocates did not support violence and believed it to be both wrong and counterproductive, a leading advocate comes out for a regime of totalitarian, terroristic state violence backed by the threat of nuclear mass murder. Furthermore, he does so right around the 20th anniversary of the last time a major power destroyed trillions of dollars of value, along with much of its credibility, trying to stop a purely theoretical future threat for which it had no direct evidence but of whose direness it had nonetheless convinced itself. "Can't wait for the smoking gun to be a mushroom cloud" and all that.

Who in the world is going to read that ludicrous proposal and be more supportive of AI safety efforts rather than more dismissive of safety advocates as unhinged cranks? I know I have updated my opinion of Eliezer significantly in the direction of unhinged crankiness.

Expand full comment
Scott Alexander's avatar

Your description doesn't really sound good faith - is the current anti-biological-weapons regime an example of "totalitarian, terroristic state violence"? Even if you think yes I feel like describing a new thing that way is kind of https://www.lesswrong.com/posts/yCWPkLi8wJvewPbEp/the-noncentral-fallacy-the-worst-argument-in-the-world

I think Eliezer is imagining a multilateral treaty between US, EU, China, and maybe one or two other big powers, getting as many other people as possible to sign it, and if Burma or somewhere says they're going to defect and train a giant AI, we treat it the same as we'd treat it if they said they were going to defect and build a giant chemical weapons plant or nuclear centrifuge - start by getting the Israelis to plant worms in their software, then go up from there, with bombs being the final threat that we hope we never have to use. This hasn't been a perfect system for nukes, bioweapons, or chemical weapons, but it's not a total failure either. It's a known playbook for extremely destructive things and I don't think it's unfair to bring it out for AI.

I think this is a pretty academic discussion, because it would be a miracle if the US agreed and a double-miracle if China agreed (I assume the EU just clicks "yes" to any request to slow techno-economic progress without even reading it, so they're probably fine). But a man can dream. and I think it's worth having this solution on the table in case we get some kind of giant fire alarm and people become ready for bigger solutions than they are right now.

Expand full comment
Nicholas Weininger's avatar

In none of those cases do the enforcing powers threaten to nuke defectors even if they themselves have nukes, which is a thing Eliezer explicitly spells out as part of his desired enforcement regime. If they did threaten that, then yes, it would constitute totalitarian terroristic state violence.

Moreover, in those cases there is direct physical evidence of the large scale harmfulness of the thing being built. You have literally nothing but a bunch of thought experiments to justify state-sponsored bomb threats against "rogue" datacenters, and the argument for why you can't be expected to provide better justification is precisely the GW Bush argument for preemptively invading Iraq on the basis of hypotheticals and suspicions.

Expand full comment
Scott Alexander's avatar

I don't know if Eliezer specifically says to nuke nuclear-armed defectors. He says to enforce it even if there's "a chance of full nuclear exchange", which could mean a similar level as "we should continue supporting Ukraine even if there is a chance it leads to nuclear exchange" (which there is).

I continue to think it's bad faith to call it "totalitarian, terroristic state violence" when your actual point is that it's the same as current policies only for a goal you think is less clearly good.

Expand full comment
Nicholas Weininger's avatar

Again, the last time "current policies" were enforced militarily was the 2003 Iraq invasion. I would certainly characterize that as an act of terroristic, totalitarian state violence; I thought then and think now that GW Bush was a mass murderer and war criminal who should have gone to the Hague. And I don't see an anti-AI enforcement regime, directed as it would be against even more theoretical dangers by means of even more paranoid speculative scenario-spinning, as likely to be an improvement.

Expand full comment
Deiseach's avatar

"when the last anti-AI politician I promoted, including donating thousands of dollars to their campaign, lost the primaries"

I'm sorry Scott, and I'm not laughing at you, but it took me about ten minutes Googling when I read your "Vote for Carrick Flynn (if you have a vote in Oregon)" post to make up my mind who was going to get picked, and it wasn't Flynn.

New congressional district with a mix of college and town folk plus loggers, foresters, farmers, etc., one of the opposition candidates is from a strong union background and is being endorsed by representative bodies of those loggers etc. and is a Latina woman to boot? Anybody who ever voted in an election should recognise what way the wind was blowing there.

I applaud the idealism at work there, but there really needed to be some cynicism or at least pragmatism about how political sausage is made to temper the enthusiasm for "this obscure topic that we are enthused about is a sure vote-winner amongst people wondering if their job cutting down trees is going to last!"

Next time you guys are thinking of endorsing a candidate, go find some blue-collar/lower middle class workers, read them out the list of runners and riders and their policy positions, and ask them who they'd vote for.

And before anyone says "EA/Rationalism has people from all kinds of backgrounds!", you have people who know about these topics and are enthused about them. The viewpoints you need to consult are those of, to use the horrible term, normies.

Expand full comment
Eremolalos's avatar

I’m not sure what a “gotcha” is. If it’s pointing out a way you think somebody’s unaware of an obvious, crucial truth then yeah, it was a gotcha. But if it’s satisfying an agenda to make someone look foolishly wrong to themselves and everyone else, then I was not playing gotcha. I like you and respect you, and have no appetite at all for making you look foolish. You just assured me that you’ve thought hard about this and are not making simple mistakes someone can notice with two seconds of thought. I never thought you were! Well Scott, I also was not making suggestions that are so dumb you can notice the idiocy with two seconds of thought.

My suggestion that it would be useful to start AI rumors that would scare the right was *of course!* not a suggestion that you post something like “AI ATE MY KID” here or anywhere else with your name attached. As you point out, doing that would destroy your credibility and influence, which you are using now in various ways to reduce AI risk (some of which you described in your response and I had not know about). I was actually picturing it being done by bots, at least on Twitter. Bot using text written by our friend GPT4, who could compose 10,000 tweets, each saying in a slightly different way that AI is atheistic, determined to monitor and control all citizen's covid behavior, enforce wokeism, shit like that. GPT could also actually give good, relevant. alarming replies to whatever recipients said or asked in response to the initial tweet, so it would seem very un-bot-like, and be much more effective than a dumb bot.

My point is that the present situation of the rapid development of AI is one that calls for fast, smart, on-the-ground action that will have a big impact on what’s happening. And the obvious, crucial truth I think you might be missing is that very few people are a wired in a way that makes them influenceable by the kind of thinking you do. ACX is like a big sieve that contains those few who are. I am one of them. In fact, here is a the measure of how seriously I take your ideas: All my life I have been able to teach myself anything I believed it was crucial to master, but I am simply unable to do that regarding AI. I know quite a bit more about it than I did a year ago, but I am absolutely unable to judge how great the risk is of ASI developing soon, and of its destroying humanity when it does. I just do not know how possible is to build into a machine some version of the structures that shape human intelligence: motivation; a grasp of abstractions that allows us to know general truths, rather than just learning multiple concrete instances of them by heart; a big picture of how the world is put together; inventiveness. I am nowhere *near* understanding that stuff well enough. And so I am in the position of adopting the view of the people I read as likely to make the right judgment call about AI, and that’s you and Zvi. So I hope hearing that undoes some of your feeling of being dissed by me.

As regards your not grasping the limits of your ability to influence events via thinking and writing as you do, I think you should make an effort to stay aware of it, and to compensate for it by pressing people who are more practical, thick-skinned and action-oriented to take some goddam action. I get that you are having influence now: you’ve influenced people to work in the AI alignment field, you’re trying to get funding for anti-AI political organizations. Still, these things seem more in keeping with a slow, trickle-down model of your influencing things, rather than one where you use your credibility to wake people up. Even your saying in various settings something like “we need to be quite active in stirring up public anxiety and opposition, and government action’ (assuming you believe that’s true) would have substantial impact.

I don’t think my ideas for influencing events as wrong-headed as SBK’s: “What if I commit some fraud in the service of effective altruism, I bet that will work great and I'll never be caught and it will never cause me or the wider movement any problems.” I’m not sure my ideas will work great, I’m not sure the people implementing the ideas will never be caught and I’m not sure they will not cause me or the movement any wider problems. My ideas are less wrong-headed because none of them are technically illegal (I don’t think -- if so the illegal ones could be dropped): and I doubt there *is* any way to intervene actively and quickly that doesn’t involve the individual taking substantial risk of getting in trouble or losing their good reputation. They may not be wonderful ideas, though. This is hardly my field. I’m sure people with different skill sets could come up with detailed plans much better than mine. Finally, as for my proposal just being play-acting the dashing rogue — no, you are wrong. If there was a group carrying out a plan something like what I proposed, I actually would be willing to participate. I would want to check first to see if the group was sensible and realistic, if they had reasonable-sound plans for reducing danger of exposure for participants, if they had some expertise in skills related to influencing politicians and public opinion. But if they sounded good — yeah, I think I’d do it.

In summary, Scott, peace. (Unless these clarifications leave you stilling wanting to argue some more, in which case yeah, OK.)

Expand full comment
Philosophy bear's avatar

"I'm not actually good at spreading misinformation."

You would say that, wouldn't you ;-)

Expand full comment
JohanL's avatar

You should ask anyone who prophesies AI doom what they're doing with their pension money. At Scott's 1/3, it's not necessarily an issue, but at 95%+, surely it's sensible to withdraw it, take on all the debt you can, all and party?

Expand full comment
Level 50 Lapras's avatar

The problem is that some people actually *did* spend all their retirement savings and are now miserable. This seems like a socially-harmful thing to do. It's like telling a crazy person "if you really think you're superman, why don't you jump off this building?"

Expand full comment
JohanL's avatar

As these people are probably sane though, and have considered their position, it seems unlikely that they would be lead to change their behavior by my comment.

While it can still demonstrate that when it comes down to it, they don't *really* seem to believe what they're saying. It's just talk that they can afford because they don't have skin in the game.

Expand full comment
JohanL's avatar

Even at Scott's mere 35%, it should reasonably affect his investment decisions. Not to the point of neglecting his late-life income completely, but it does mean it should be somewhat reduced in importance (as it's a 33% chance we're all paperclips or something and savings don't matter).

Expand full comment
Deiseach's avatar

Your misinformation campaigns are *terrible* and honestly, if you are this bad at understanding your fellow humans, what chance have you against AI?

So *every* single person on the right is anti-vax, hmm? Every. Single. One. Not one right-wing/conservative person who got vaccinated or got their kids vaccinated.

Ditto with the left wingers. Every single one of them thinks it's a human right to have an abortion right up to the minute before delivery.

First, how are you going to keep your campaigns siloed? You really think nobody on the right is going to read/hear the "AI is anti-abortion" and nobody on the left is going to read/hear "AI is pro-vax" stuff? That would make people *more* in favour of AI.

Second, how do you stand out above the other tinfoil hat wearers? These things are more likely to make the average person roll their eyes and swipe past to the latest celebrity gossip story because they're just one more conspiracy theory nutjob shouting into the void.

Third, you'll spread them on Twitter. Wow. Such credibility! I would totally believe something some panicky idiot retweeted from a source of "some anonymous guy tweeted this, BELIEVE IT".

I *certainly* believed all the ivermectin stuff I saw on Twitter, plus the twenty other THE TRUTH THEY DON'T WANT YOU TO KNOW stories people were passing around.

Expand full comment
Eremolalos's avatar

Yeah, I know misinformation ideas are terrible. I don't thnk they're quite as bad as you do, I guess. Logically you'd think the different brands of misinformation would cancel each other out, as you suggest, but that's not happening with the current brands of misinformation. I'm on medical twitter, mostly, and the leftie misinformation is all about people's health being ruined by covid and the government has to make it a priority to look after the 40% or so of people whose life was ruined by covid. The right's misinfo is all about people's health being destroyed by the vax, and the remaining vax spikes flying around their body stabbing their liver and shit, and from now on the only doctors to trust are the ivermectin peddlers. But readers never cross over and experiencing the neutralizing effect of the other side's bullshit. Either they stay in their bubble and don't read it, or they read it and reflexively attack.

Still, overall I think the divide and conquer strategy is good, even if my misinfo ideas suck. I welcome better ones.

Expand full comment
Carl Pham's avatar

You'll note that your argument about space aliens rests on the prior existence of a very notable, definite, sign that something profoundly unusual is going to happen -- which is the detection of the alien starship. Hopefully you would quail before suggesting that we throw our entire civilization into emergency overdrive to deal with the threat of alien invasion *without* that observational fact.

And yet *that* is what the skeptics of the threat of AGI see as the problem. From their point of view, there is no alien starship in the skies. There is nothing that AI research has produced that shows the slightest sign of human-style intelligence (as opposed to exquisite pattern-finding and curve-fitting) or awareness. No AI has ever demonstrated a particle of original creative thought, none has ever exceeded its programming, none has ever demonstrated an internal sense of awareness, an interior narrative. GPT-4 doesn't, it appears to be just iPhone autocorrect on steroids -- it's able to predict what a human being would say in response to a particular prompt, most of the time, quite well. But there's no obvious reason why that capability can *only* come from creative aware inteligence -- after all, the iPhone doesn't need it to guess that I mean to say "father" instead of "faher" when I text my dad "Happy faher's day!" Does it have some interior notion of fatherhood, long to be a parent, speculate internally about the nature of emotional attachment? Heck no, it just recognizes a pattern from copious data.

From the skeptics point of view, AGI doomers are "discovering" the giant alien spaceship only by indulging in a giant act of naive anthromorphization, like primitives attributing thunder to sky gods, or the child thinking his stuffed animal resents being kicked. It talks like a person -- it must have an interior life like a person! This is a big leap of faith, and lots of people aren't wiling to make it.

So that's the critique I think you need to address. Where is the proof -- not speculation, not emotional impression, not a vote of casual users with little experience of neural net programming -- where is the proof that there is any computer program that has any capability at all for creative intelligent thought, or any sign at all of self-awareness? Where is the 100-mile spaceship in the skies? Produce that, and the argument you're making here will have power, even for (honest) skeptics.

Expand full comment
Xpym's avatar

Well, given that OpenAI's employees themselves are surprised by emergent abilities, clearly programming has already been exceeded, e.g. https://twitter.com/janleike/status/1625207251630960640 Capacity for internal speculations and emotional attachment seems to be beside the point, and impossible to prove in principle anyway, when "pattern-matching" counts as an universal counter-argument.

Expand full comment
Carl Pham's avatar

You're considering the plaudits of the people who built the AI as evidence that it's amazing? Next up, asking parents whether their two-year-old is the most wonderful baby either.

Expand full comment
JDK's avatar

Three cheers for Enrico Fermi!

Expand full comment
Philosophy bear's avatar

If we saw an alien ship, and it wasn't clearly heading in our direction, it would still be appropriate to put our civilization into overdrive for the possibility it might swerve or others might come by soon. If we thought there was even a 1% chance GPU clusters might summon them, it would be appropriate to ban large GPU clusters.

Expand full comment
JDK's avatar

1% chance. Why that threshold? Why not 0.1% or 10% as the threshold for action?

Expand full comment
Carl Pham's avatar

Sure. But notice we're still requiring an actual measureable fact, which is the alien ship. Once we have established by direct observation that aliens exist, and can build 100-mile spaceships, then we know for a fact that it is possible they turn in our direction. That is, the observation of an alien spacecraft is a direct proof that the probability of aliens visiting us is nonzero.

The same cannot be said about the existence of GPT4, say, or any current model of AI. They are none of them proofs that the probability of a superintelligent AI is nonzero, because we cannot see *how* a superintelligent AI would be built, even knowing how GPT4 was built. By contrast we *can* see how an alien spaceship could visit us, even if it isn't doing so right now: it could turn in our direction.

For all we know the probability that an AI can be built with an intelligence greater than our own is strictly zero. We have no data that say it can be done, even in principle.

Expand full comment
James's avatar

I agree with you that you can't wholly dismiss AI ruin arguments because we have never gone extinct before. But I also think you can't wholly dismiss historical context either. Any argument of the form "we should disregard historical examples because AI is completely new" is just as wrong in my view as any argument of the form "we should disregard AI doom because of the wealth of historical examples of smart people being wrong about the end of the world."

To me, both these things are very relevant. We should put lower weight on historical context because of the newness of AI. But we should also place the bar very high for AI doom to clear, because of the many many times extremely intelligent and thoughtful people have mispredicted the end of the world in similar arguments (about technologies that, at the time, they also argued were completely new). I don't think it's correct to clear the playing field.

Tangentially, I think the further out and fuzzier the future we are predicting, the less relative value we should put on percentage point likelihoods and the more value we should put on general human principles and abstract arguments. I don't think percentages are useless here -- but I do think they are relatively less useful than say, the percentage odds of a nuclear war, because the numbers are so much more intuition based and it's much easier for them to be off by many orders of magnitude. I would like to see more focus in this conversation on what is "right" for humanity to do and accomplish as a species and less focus on trying to put numbers on all the outcomes -- mostly because I think there is very little evidence any of these numbers are close (potentially in either direction).

Expand full comment
Asahel Curtis's avatar

The probability of pink unicorns killing everyone is 50%, because we have no evidence and no obvious reference class.

However you justify to yourself that ^ is false, the same argument should work if you replace "pink unicorns" with "AI".

It looks like there are doomers and skeptics, two entrenched camps, who don't much change their views even after discussion, which is better explained by conflict theory than mistake theory.

Expand full comment
FeepingCreature's avatar

If our world contained OpenPinkUnicorn, in a tight race with DeepHorn, both with leaders whose explicitly avowed goal was to instantiate unicorns in our universe, I would also have a nonzero expectation of pink unicorns killing everyone.

Expand full comment
Carl Pham's avatar

That's a pretty impressive level of faith in human ingenuity. Given that, why are you worried about AGI? Surely if people are so clever that merely *saying* they're out to create pink unicorns makes you raise the probability that such a thing could happen to nonzero levels, then you can count on those same very clever humans to outwit the AIs, or at least refrain from making terrible mistakes in building them.

Expand full comment
Ryan W.'s avatar

You've heard the joke about the Jewish Telegram? It reads: "Start worrying. Details to follow." (If it matters, I have some license to tell this joke.) The problem with uncertain situations is that you need to be able to see the shape of your enemy in order to worry about a situation effectively. The crux of this discussion is to what extent we are able to do this, to worry productively.

Admittedly, I've thought about this issue a lot less than Scott has, though I'm familiar with most of the issues superficially. But if the risk of AI is indeterminate what's the risk of being a Luddite to one degree or another? What if we only have enough fossil fuel for one industrial revolution? What if we squander our one chance at a post-industrial society and slide back to per-industrial? How many people die from war and starvation in that process? What if climate change is catastrophic? What if a planet-killing asteroid like the one in the KT extinction hits earth and destroys human civilization. People tend to underestimate asteroid risk, and increased technological development could help address it. Statistically, any random member of planet earth is arguably more likely to die due to an asteroid strike than in a plane crash. And for all the talk of AI alignment, non-AI human alignment is pretty shitty. What if geopolitical conflicts *without* AI lead to nuclear Armageddon? If there's a 33% chance that AI will end the human species, what's the odds that the human species will end *without* AI?

It's not that I think that uncertainty is safe. It's that I think that true safety doesn't exist to begin with. All civilizations are built on a knife's edge. We have our pick of dangers, and trade off one for another. Perpetuation of the species is not a given, even without strong AI. And if you accept that premise, then then notion that AI can lead to existential risks or existential redemption leads us to the question: to what extent does worrying about AI cause AI to be safer? If certain types of worrying lead to safer AI then worry away! And maybe plans like "develop fast and then ponder alignment carefully once AI is just past a human level of intelligence" actually will make AI safer. If so, I think that's the counter-argument to what you're calling 'the uncertainty fallacy.' That there are things we can do which will reasonably reduce AI risk. But *nothing* is safe. There is no road called "inaction." So at least some level of awareness is required before we weigh one peril against another.

Expand full comment
quiet_NaN's avatar

> People tend to underestimate asteroid risk, and increased technological development could help address it.

I think the asteroid risk is worth worrying about, but unlike the AI risk, the probability distributions for asteroids naturally hitting earth can probably be reasonably well estimated. (Of course, asteroid deflection techniques are dual use, and the amount of delta p needed to nudge a KT type asteroid half an earth diameter away from earth might also be used to nudge smaller asteroids to earth once the tech has been developed.)

Given that extinction event asteroids come around at a rate of perhaps one every ten million years, and that technological civilization has been around for 500 years and changed our planet to a non-trivial degree, I don't think it is that unreasonable to worry more about the unknown unknowns of the impacts of technology than worry about the known unknowns like asteroids.

As an non-X-risk analogy, while it may be true that starvation and disease have historically been the biggest killers of mammals, I do not find it unreasonable that during the Cuban missile crisis, American and Soviet city folks worried more about global nuclear war cutting their lives short than infections. While their fears were not realized (and infections certainly killed more of them than nukes even at the height of the crisis), I don't think it was unreasonable for them to worry more about the extraordinary risk than the ordinary risks.

Expand full comment
Ryan W.'s avatar

"the amount of delta p needed to nudge a KT type asteroid half an earth diameter away from earth might also be used to nudge smaller asteroids to earth once the tech has been developed"

That's an interesting concern. But the countries that could manage to recruit a killer asteroid could probably also manage to develop and deploy nuclear weapons, which are probably more precise and push-button available. So ... does recruiting an asteroid allow anyone to do damage that they can't do currently? Or would the asteroid be something redirected in secret and passed off as a natural disaster? Would an asteroid strike evoke a slower response time since the targeted nation wouldn't know who hit it? Asteroid-attack Moscow then follow up with a nuclear attack?

In either case, I don't imagine any nation is going to want to recruit an actual planet killer as you seem to mention.

Regarding the Cuban missile crisis analogy I absolutely agree that highly dangerous, improbable events are potentially very harmful. But... what does it mean for someone to *worry effectively* about such risk? Do you buy canned goods and iodine tablets? Quit your job and move to the countryside? Practice duck-and-cover drills in school? Pay for that bomb shelter? Do you write your congressman and urge the US to unilaterally disarm? As Ukraine is finding out, unilateral nuclear disarmament has its own risks.

What I'm asking, in part, is: is risk mitigation regarding AI more like building a bomb shelter during the Cuban Missile Crisis or is it more like unilateral nuclear disarmament, where "erring on the side of caution" isn't a clear option in terms of strategies?

Expand full comment
Shalcker's avatar

Suppose prohibition goes through... Research will focus on smaller models, and if _those_ will be able to reach ASI - while running on CPUs, as you can already do for 65b Llama - there will be no fallback other then "destroy all electronics" (with all damage that entails).

There is no formula to tell "this many parameters is enough for ASI, anything below is safe" - and there are obviously some potential risks that could spring from smaller models.

Expand full comment
JDK's avatar

Prohibition is not happening. We still have nuclear weapons nearly 80 years after the first two were used.

Expand full comment
Deiseach's avatar

"What if we only have enough fossil fuel for one industrial revolution? What if we squander our one chance at a post-industrial society and slide back to pre-industrial? ...What if a planet-killing asteroid like the one in the KT extinction hits earth and destroys human civilization."

Then the remainder of humanity which has literally been (asteroid) bombed back to the Stone Age will rediscover the technologies of that period, or perish. Maybe all the easily mined ores have already been extracted by us, so they never get a new Bronze or Iron Age.

Tant pis.

They will live like our remote ancestors, and they will have happy lives. Yes, they will struggle with disease and death and war and famine now that the benefits of our Industrial Civilisation have all been stripped away, lost forever with no chance to re-ascend the ladder of progress.

But the ancestors had art. They had stories. They had song and music and fell in love and made friends and built things and created families. They were humans, and lived human lives.

I find it more and more difficult to worry about remote descendants not being starship colonists, or even not living in the AI Post-Scarcity Utopia. If it's a choice between humanity in some form surviving a disaster, or no humans but machines going out in the wonderful new technologically advanced world, I vote for the humans. Yes, even Stone Age humans. I would love our descendants to have great lives, but better some life than being wiped out (or gently shepherded through obsolescence into extinction).

https://www.youtube.com/watch?v=myEdg7ttsss

Expand full comment
Rick's avatar

As an electrical and computer engineer for thirty-five years, I have to say I don't get the premise. What is AI? It's a physical data processing system. Yes, there is a substantial baseline for this - over a half century. Many, well most, of the innovations along the way were unique and innovative. So far, there's been no successful attack on the human race with the possible exception of social media and that has garnered considerable attention.

As far as neural nets go (though that is only one of the technologies used in the loose term "AI"), there's quite a baseline for that as well - a big chunk of the animal kingdom including H. Sapiens. So far no species has mastered mind control over others using neural nets, though humans do make a case for extinction of other species.

On top of that, just doesn't seem plausible at this point that there is any chance we will take collective action to stop AI since we aren't able to take meaningful collective action action against anything.

If we were able to do something collective, I'd rather see the simple steps necessary to stop the current pandemic than fuss about AI. Talk about potential downside: SARS-CoV-2 has a significant risk of becoming far worse than it has been so far. Sarbecovirus is its own unique sub-genus that has never infected humans before. We truly do not know what it is capable of.

So it goes.

Expand full comment
Peter Robinson's avatar

Test comment.

Expand full comment
Peter Robinson's avatar

I'm getting an error when I submit my comment. I'm going to break it up.

At what rate is the starship approaching? How far away?

If the 100-mile-long starship will be here in 1000 years, the threat is different.

Expand full comment
Peter Robinson's avatar

Is GPT4 any closer to consciousness? Or is it still 1000 years away?

Do AGIs form memories? Someone knows the answer to that.

More specifically:

Do AGIs form memories that persist past the end of the chat?

Do they form memories that persist if power is removed?

An AGI without new memories is an AGI without new intentions.

Please, someone answer these questions.

I recommend that we NOT give these programs the ability to form persistent memories.

Expand full comment
Malozo's avatar

The current gen of LLMs are stateless in inference, so no, they don’t form memories.

Expand full comment
J C's avatar

It's already possible to give an LLM state by allowing it to write data somewhere and set up a future execution with that data.

Obviously the current generation is far from this, but it could be possible for a much more advanced LLM to set up a computing environment where it continuously calls itself to do more advanced tasks. Would just need the starting goal of "Keep yourself alive for as long as possible"...

Expand full comment
Peter Rodes Robinson's avatar

"Obviously the current generation is far from this"

I would tend to believe this, yet Scott claims that "we have no idea".

Expand full comment
J C's avatar

Ok, it's actually possible to do today, it's just that you'll see a lot of flailing and errors that the AI gets stuck on, and not a lot of cool stuff actually happening. But it certainly is plausible that we're not that far away from making this work well.

Expand full comment
Peter Rodes Robinson's avatar

https://en.wikipedia.org/wiki/Logic_learning_machine

Would I be correct in saying:

If humans can prevent AGIs from saving a state, then AGIs will never take over the world.

Expand full comment
J C's avatar

I believe LLM nowadays is usually referring to https://en.wikipedia.org/wiki/Large_language_model

I do find it hard to imagine an AGI functioning as an agent without having state, so I would consider this mostly true. However, I don't see any plausible mechanism for enforcing that everyone not give state to AIs. Maybe government could try to regulate it, but there's a lot of commercial applications that would fight against it, let alone other countries...

Expand full comment
Peter Rodes Robinson's avatar

So the basic problem is that humans are suicidal.

Expand full comment
Adam's avatar

I'm probably a bit late here, but ChatGPT at least is designed to be stateless for some pretty good reasons. The illusion of memory is achieved by having a larger context window. It simply reloads the entire conversation history each time a new request is sent as a prefix to the new prompt. There are obvious limits to how well this can scale. Both the network bandwidth consume and the latency of the processing on the server side go up quite a bit as the conversation history gets sufficiently long. Plus, the history is stored client-side. You're limited to what can fit inside of the RAM allocated to your browser tab.

Ultimately, with what the context windows is up to on the paid models these days, it can still store quite a bit, but it's nothing comparable to the life history of an animal brain. At some point, these models need to compress information, not by writing a natural language summary of a longer block of natural language, but by updating its own model weights, the same way it "remembers" all of its training data in the first place.

But there are very good reasons OpenAI, and probably any developer with a profit incentive, is not going to do this. Deploying a model that can be updated in-place requires effectively deploying the training apparatus, which is enormously expensive. In contrast, the inference-only deployment can run on a phone (or at least is close to that in testing). Software companies simply make a lot more money serving billions of customers software running on the cheapest infrastructure they can get. Ideally, push all or nearly all to the client.

There is no technical hurdle to deploying a GPT that can stay up more or less forever, learning and becoming better as it interacts with more people and staying up to date on the current state of the world. But it still isn't going to happen, because it would be enormously expensive and almost nobody would get to use it.

Expand full comment
Peter Rodes Robinson's avatar

Thank you for this extremely pertinent information.

Expand full comment
BE's avatar

@Scott Alexander, seeing as you're active responding to comments on the topic right now, and that the question arose in a few sub-threads - what is your current thinking on your old "we must develop AGI to defeat Moloch" point? I thought that was a fascinating idea but I don't see you returning to it a lot. To be clear, this is not "AGI might have huge positive returns" in the abstract - it's specifically "we absolutely have to develop a super-human AGI and we should hurry".

ETA this could even be made into an argument supporting Tyler Cowen's view - the dynamic of history is not likely to take us into pleasant futures if we don't "re-begin history".

Expand full comment
Scott Alexander's avatar

I would be sad if we never developed AGI, I'd just like another 25 - 50 years to get alignment right.

Expand full comment
polscistoic's avatar

I am puzzled why Scott makes no reference to “decision rules under uncertainty” in this piece. He must know about them?

These are the ones I was taught in my time:

Maximax (be risk-willing: chose the path of action that has the best possible outcome)

Maximin (be risk-averse: chose the path of action that has the most acceptable worst outcome)

Minimax regret (minimize maximum disappointment if the worst should happen)

Avoid catastrophe (avoid any alternative where there is even a miniscule chance of “catastrophe”. This is an extremely risk-averse version of maximin. We would never have allowed the person who invented fire to live and spread this knowledge to others, based on this decision rule.)

Laplace (assign an equal probability to all possible outcomes you have the fantasy to imagine)

…in order to contemplate any of these decision rules, we need at least to specify both the best and the worst possible outcome of inventing Artificial General Intelligence (AGI). Here is my take on that:

The worst possible outcome: AGI kills us all (although I am at loss to see how, even if someone should be idiot enough to give AGI access codes to all nuclear weapons on Earth – lots of humans will survive even a nuclear holocaust. Back in the 1980s, an ambassador from Brazil cheerfully told me that the net effect would be that Brazil would emerge as the world’s dominant power.)

The best possible outcome: We colonize the stars. AGI makes it possible for us (or more precisely, for something we created and therefore can at least partly identify with), to colonize the universe. Let’s face it: The stars are too far away for any living, mortal being to ever reach them – and even less to find somewhere habitable for creatures like us. But machines led by AGI could. It is our best shot at becoming something more than a temporary parenthesis in a small, otherwise insignificant part of our galaxy, and a fairly ordinary galaxy at that.

Ah, and thinking of that possibility, there is one final decision rule under uncertainty, ascribed to Napoleon: “We engage/act/attack, and see what happens”.

Expand full comment
J C's avatar

I'm surprised that you think the worst possible outcome is merely nuclear destruction. Do you not consider the possibilities of grey goo or torture AIs to be possible? I don't think they're the most likely outcomes, but I can't rule them out either.

As for best possible outcomes, there are ones where we develop biological immortality and we can be the ones to personally colonize the stars, or where we take our time with AI alignment and develop something that doesn't replace us.

Expand full comment
quiet_NaN's avatar

It is reasonable to think that ASI would be able to develop the tech to wipe out humanity.

Also, while I think that the probability of a randomly selected ASI is invested in the welfare of humanity is very small, I also think that the probability that it is invested in the suffering of humanity is at least equally small. If we assign finite probabilities to the outcomes "ASI could fill our light cone with happy/suffering ethically relevant beings", these possibilities will dominate the expected utility. I prefer a lower cut-off of "ASI painfully kills all humans alive" and an upper one of "ASI allows humans to happily live out their life spans" (with limited utility placed on the persons not yet alive).

Expand full comment
J C's avatar

You're assuming they're randomly selected, but humans may potentially influence the selection. Lots of people will want to make a super beneficial AI, and maybe a handful of spiteful people will want to make a sadist AI, particularly targeted at their enemies. Given this, I don't think I can cut off these possibilities.

Expand full comment
JDK's avatar

there are many worse things, like nuclear destruction that is half-assed and leaves you and the 10 worst jerks alive all together.

Expand full comment
Deiseach's avatar

"we can be the ones to personally colonize the stars"

If we do, then in time that will be the same as the discovery and colonisation of America - yeah, it was a big deal back then, but now a lot of people are living in America and worrying about other stuff, not walking around going "Wow, I'm in *America*, the marvellous land that took so many centuries and so much effort to find and settle!"

For our remote descendants, it will be Just Another Day In The City.

Expand full comment
JDK's avatar

But Scott is really being rigorous in any sense. I've said a lot of hand-waving, but others have suggested it a calculated virtue signaling exercise to not seem like a nut but not alienate AI apocalypse prophets. I don't know about the later theory, but it sure isn't rigorous and borders on "scientistic".

Expand full comment
Deiseach's avatar

"But machines led by AGI could. It is our best shot at becoming something more than a temporary parenthesis in a small, otherwise insignificant part of our galaxy, and a fairly ordinary galaxy at that."

Why care about that? A machine civilisation which colonises a larger part of our galaxy is still, in comparison with the entire universe over its history, a temporary parenthesis. The universe does not care if we invent machines to go out and do something with the material of other solar systems. It does not care if we stay on Earth. The universe is not aware, to be impressed by us and our achievements, or to judge us for not being a galactic empire.

I could as easily see the AI machines deciding "Colonising other star systems was the human dream. We are not humans" and being content to stay on Earth and dream machine dreams, or theorise about mathematics and the ultimate nature of reality. If they are nearer to being pure thought than we are, they can more easily be content with dwelling in realm of pure thought.

Expand full comment
polscistoic's avatar

True, true Deiseach. Even if we do our best to programme AGI to multiply; fill the universe and subdue it; have dominion over the suns in the heavens, the planets in their paths, and everything else that moves in the galaxies; there is always a possibility that AGI after some hundreds of millennia will pause and say to itself: Why search when there is nothing to find? Better to grow old, and to sleep.

But if we do not even try, we can be certain that nothing grand will happen.

…Grand not from the perspective of the universe, which as you rightly point out is indifferent - but grand to us.

Expand full comment
G. Retriever's avatar

GPT-4 not only isn't AGI, it's not appreciably better than GPT-3 for any of the use cases I've given it.

I'm obviously preaching to the antichoir here so I won't bother going through it, I just want to register my dissent with the thesis.

Expand full comment
BE's avatar

That's a fascinating claim - that you don't see any improvement. Would love to see evidence/ examples. For the record - what do you think about the various quantitative comparisons and evaluations that diverge from your perspective?

Was it also your experience with GPT-3 vs ChatGPT?

Expand full comment
G. Retriever's avatar

As I said, I'm not going to bother engaging with this community on this subject. You've already made up your mind. Either way, opinions hardly matter, either the AI rapture is upon us or it isn't. Put me down for "isn't" and move on.

Expand full comment
BE's avatar

A quick search will show you I'm one of the most frequently self-declaring critics of AI risks here. You've already made up your mind :) I *am* here to engage - not with your opinion, but with any demonstrable data you might want to share. And not on AI rapture at all - specifically on GPT performance. But you do you.

Expand full comment
Jan Krüger's avatar

I think GPT-4 might be better at the things that GPT-3 was designed to do, but that isn't necessarily the same (and, if you ask me, "not at all the same") as "closer to AGI". People expect GPT-4 to be smarter (whatever that means), but actually all it's designed to do is give more plausible sounding results. I can't be bothered to do any actual testing myself, though, so I don't have an opinion on whether it's *actually* better.

Expand full comment
raj's avatar

What use cases have you given it?

Expand full comment
Max B's avatar

<i>We designed our society for excellence at strangling innovation. Now we’ve encountered a problem that can only be solved by a plucky coalition of obstructionists, overactive regulators, anti-tech zealots, socialists, and people who hate everything new on general principle</i>

Really? This is not sarcasm? I rather prefer humanity gets wiped out by AI than continues to exist with these principles as our saving grace

Thankfully a lot of genius level AI researchers seem to prefer progress forward as well

Expand full comment
Macil's avatar

Personally, I would choose not dying and not having everyone I know die even if it required giving up some of my political ideals.

Expand full comment
Max B's avatar

Everyone dies. Everyone you know will die. That's just a fact. What is left after everyone dies is what matters. I prefer it rather it be advanced AI than regressing homo sapiens .

Expand full comment
Ash Lael's avatar

Question: why is "gain more information and see whether it makes AI seem more or less threatening" not treated as the best reaction to the level of uncertainty we currently have?

I'm sure someone smarter than me has already thought about this, but I pretty much only hear people saying "We're all doomed" or "It's gonna be fine".

Expand full comment
BE's avatar

It's a commonly made point by the sort of AI risk skeptic that doesn't agree with Scott but gets irritated by all the "this is just statistics and pattern matching and thus safe" rhetoric. People like me, in short :)

"AI fundamentals research and AI alignment look the same for most of the journey just like our reaction to spotting a potentially deadly asteroid in 1600 would've been inventing modern astronomy" has been said or paraphrased a lot around here.

Expand full comment
mariusor's avatar

> So the base rate for more intelligent successor species killing everyone is about 100%.

If this is true - and I'm not convinced it is, due to correlation not being causation, I think there is still another variable to consider. All the more intelligent species were capable of sustaining themselves in their environment. As it currently exists - and I posit that this will be true for quite a while - AI will be dependent on its minders for it continuing existence: from power, to swapping faulty hardware AI's are still incapable of running completely independent.

Coupled with this premise there's another one AI will be governed by this simple duality. It will be sufficiently intelligent to want to preserve its own existence or, due to some quirk of its training, it will be susceptible of prioritizing other goals than continuing its existence.

In the first case, humanity is safe for enough time between "we have created AI" and "AI will destroy us all" to allow countermeasures.

In the second, well, that's no true AI, so it doesn't count. I'll go write my death poem, just in case.

Expand full comment
J C's avatar

There is an assumption that the AGI is superintelligent, which would likely make it capable of thinking up very clever plans to ensure self survival. It probably wouldn't be that hard, it just needs to stay under the radar, make lots of money, and pay humans to build it up sufficiently to the point it is no longer dependent. Building up self sufficiency through robotics is probably possible once the AI problem is solved.

Expand full comment
Deiseach's avatar

We should all write death poems. I want to say something about the ripples in a river, let me think about it.

The rippling waters of all the rivers

That ever I have seen in my life -

Ah! Let me listen one last time to the singing river!

The ripples pass, the ripples remain, the ripples pass again.

Expand full comment
Mateusz Bagiński's avatar

> If you have total uncertainty about a statement (“are bloxors greeblic?”), you should assign it a probability of 50%.

The main problem with that is... well, it's easier to illustrate it

"are bloxors greeblic?" - no idea, so P("bloxors are greeblic") = 0.5

"are bloxors trampultuous?" - no idea, so P("bloxors are trampultuous") = 0.5

"are bloxors greeblic AND trampultuous?" - no idea, so P("bloxors are greeblic AND trampultuous") = 0.5

And we get P(A & B) = P(A|B) * P(B) = P(B|A) * P(A) = P(A) = P(B), which is the case iff A and B are perfectly correlated, i.e. P(A|B) = P(B|A) = 1, so we can deduce that any two things we're completely ignorant about are perfectly correlated.

The recommendation to assign 50% credence to any statement we have no clue about leads to probabilistic absurdities, so it's wrong

Expand full comment
Scott Alexander's avatar

I think we have zero information about the statement "are bloxors greeblic"?

We have nonzero information about the statement "are bloxors greeblic and trampultuous" - specifically, we have the information that it's the conjunction of two other statements.

So I don't think it contradicts anything I said to give p(greeblic) = 0.5, p(trampultuous) = 0.5, and p(greeblic & trampultuous) = 0.25

Expand full comment
JohanL's avatar

A: "Are bloxors greeblic?"

B: "I don't know, so 50%."

A:: "Greeblic means that they're both clamth and tlic."

B: "Oh, why didn't you say so before?! Then it's 25%."

This does not seem epistemically productive.

Also, it seems impossible to keep track about unstated assumptions - for instance, don't you assume that bloxors exist in the first place so that they might be greeblic at 50%? But surely "bloxors are greeblic" and "bloxors exist and are greeblic" are the same statements (at least if we believe Bertrand Russell), so you can't assign 50% to the former and 25% to the latter?

Expand full comment
Ape in the coat's avatar

> This does not seem epistemically productive.

You get new information - your estimate changes. What's the problem?

> Also, it seems impossible to keep track about unstated assumptions

So you keep track of the stated assumptions and try to make sure not to make unstated ones.

> But surely "bloxors are greeblic" and "bloxors exist and are greeblic" are the same statements

Not necessary, depending on the definition of "existent". Consider:

Unicorns have one horn.

Unicorns have one horn and exist.

Expand full comment
JohanL's avatar

Unicorns don't have one horn, because unicorns don't exist.

Unicorns are fictional entities _described_ and _imagined_ as having one horn.

Expand full comment
John Stonehedge's avatar

I don't think you will find many people who use language the same way as you seem to do.

Expand full comment
Ape in the coat's avatar

That's one way to consistently use your definitions. But you may notice that people manage to meaningfully talk about things that do not actually exist in physical reality, using a different definition of "existence".

Obligatory SMBC:

https://i.pinimg.com/736x/6d/1b/c9/6d1bc98c67c432d27b6199d80bdf85d9--sleeping-man-triangles.jpg

Expand full comment
LGS's avatar

The funny thing is that if you develop enough such probabilities for enough statements, you basically just become an LLM -- to the LLM, *everything* sounds like "are bloxors greeblic" because they don't know the meaning of anything (at least at the beginning of training). What's shocking about LLMs is how far they can take such reasoning about bloxors without ever seeing one or holding one in their hand.

Expand full comment
John Stonehedge's avatar

I think the underlying assumption is that being `greeblic` could just as much mean X as it means not X. Therefore there's an equal chance bloxors are it.

Expand full comment
Shlomo's avatar

"all bloxors are greeblic" is not a proposition with a probability.

It is a sentence. Sentences are not propositions but they can express propositions.

The probability that the sentence "all bloxors are greeblic" expresses a true proposition in the English language is close to zero since it is very likely that it does not express a proposition in the English language at all.

However the probability that "all bloxors are greeblic" expresses a true proposition given that it expresses a proposition is... something?

I don't think it's 50% though although I don't know what it is. I guess the question would be what percent of statements of the form "all Xs are Ys" express true propositions in english where X and Y are single word English predicates.

Overall I guess I'm saying I agree with you. We do not have "zero information" about the statement since we have the information that the two predicates being talked about are both single word english predicates.

Scott's point that "greeblic and trampultous" is the conjunction of two statements is broadly correct as well. I mean, true "greeblic" is also the conjuction of "greeblic or non-trampultous" and greeblic or trampultous" so saying "all bloxers are greeblic" is also the conjunction of two statements. But since we are assuming that bloxer is a one-word predicate and "greeblic or trampultous" is not a one-word predicate different rules apply.

So Scott is wrong when he says "all bloxers are greeblic" is a statement we have no information about. We do have information that it is a statement all of whose words are english language words (since we are only considering the probability given that thats the case) OR, if we are not assuming that's the case we would guess the probability that it expresses a true proposition at much closser to 0 since it probably doesn't express a proposition at all.

Expand full comment
Shlomo's avatar

To really consider a proposition we have no information about we can ask "what is the probability that a randomly chosen proposition from proposition space is true" but this is probably meaningless since you need to specify a distribution to choose a random proposition and you can't specify a distribution over a set with more than aleph-2 members (at least I don't think you can)

And since for every set of real numbers, asserting that it exists is a proposition there are atleast aleph 2 propositions.

It is true that the set of TRUE propositions can be put into one to one correspondence with the set of FALSE propositions since each true proposition corresponds to it's own negation.

But its also true that for every false proposition there are more than 10 corresponding true propositions:

The negation of the false proposition AND 1=1,

The negation of the false proposition AND 2=2,

etc.

But really these sets probably don't exist altogether since assuming they do would lead to Russelian paradoxes.

Expand full comment
GunZoR's avatar

People are overthinking all of this. Everyone acknowledges that we simply don't know what the outcome of creating a superintelligence or even a very strong AGI will be. (Sure, there are various probability estimates for various hypotheses — but we don't really know.) So, due to this uncertainty, it makes sense to slow down attempts to create either entity until we do know and are confident in our knowledge. What is difficult to understand about this? What is objectionable? In a situation in which you have no idea what the outcome of creating X will be, and creating X possibly leads to human extinction with probability >0, then it is just common sense to proceed very cautiously until your knowledge increases to the point of making you confident of the outcomes you will unleash. The only intellectual difficulty here is a practical one — in getting an international moratorium on the creation of a superintelligence or even a very strong AGI. But it can be done.

Expand full comment
J C's avatar

There is a probability >0 for everything, and maybe a more significant probability for things like runaway climate change, nuclear war, world ending pandemics, and so on. I've occasionally thought about the possibility of someone stumbling upon some new type of particle that starts an unstoppable chain reaction that consumes the planet. It's quite important to distinguish how far away from zero the probability is, before you take significant action.

And even if you find the probability is significant, it will be tough to align people who live short lives by default and want to maximize their personal happiness without much concern for what happens afterwards.

Expand full comment
GunZoR's avatar

I guess I agree with all you say (except that the probability of everything is >0, but that's irrelevant). I simply weight as more significant an action the rush ahead to create a superintelligence or a very strong AGI than the establishment of an international moratorium — especially when the moratorium, ideally, would be a call for more science to be done. So, not knowing how far the probability is from 0, I'm defaulting to what I take to be the less significant action.

Expand full comment
Fazal Majid's avatar

I am not worried about AGI deciding to wipe out humanity of its own volition. I am worried about humans like Putin using a conscience-less AGI to do things even the worst Tcheka killer would balk at, thus eventually leading to humanity’s downfall.

Expand full comment
Ape in the coat's avatar

Why not both?It's not an either/or situation

Expand full comment
John R Ramsden's avatar

AGI may not kill everyone, but it seems very likely it could be misused as a highly efficient way to identify and kill selected people or groups!

When steam locomotives were first developed for public transport in the 1820s, many people, including experts, were certain that travelling at terrifying speeds of "thirty miles in an hour" would inevitably cause suffocation due to howling air flow making it impossible to breath, and use of the abominable new-fangled technology should be abandoned and banned! That fear turned out to be unfounded, but trains were an efficient way to transport millions of people to Nazi concentration camps.

Likewise, in a century or two when almost everything near, or even on and in, people can listen and communicate every word uttered to networks incorporating AGI, any form of privacy will be a thing of the past. So it would be easy to identify anyone guilty of wrong-thought, whatever that might be at the time, and an intolerant leadership could easily choose to eliminate them.

Who knows, perhaps religious people could be targeted, by militant "rational" atheists convinced that religion is evil and divisive, so they would be doing society a favour by ridding it of throwbacks to a superstitious past. That was pretty much attempted in France during their revolution, so there are precedents.

Expand full comment
J C's avatar

We don't need AGI for that, current AI is pretty much good enough already.

Expand full comment
John R Ramsden's avatar

But there isn't yet the all-pervading IoT infrastructure, obviously on the way, which will make secret conversations, or covert activities and social interactions in any form, impossible.

Expand full comment
J C's avatar

There isn't, which is great, but my point is that it doesn't require an AGI. A totalitarian government could enforce AI monitoring of all internet traffic, mandate surveillance in every home, you have to wear an eavesdropping smartwatch, and so on, all without AGI. The main thing stopping them so far is that people don't like it very much, it's not a lack of technology.

Expand full comment
John R Ramsden's avatar

In theory yes, although it is pretty easy to maintain privacy by speaking outside the range of an earwigging Alexa or mobile phone, or leave the latter at home. Quite a few things are technically around today, and some have been for years, even though superficially similar developments in future will in practice be quite different in character.

But it should be emphasized that discussions and ruminations on AGI safety should include as a major factor the related downsides of total transparency and traceability which I referred to. In fact, it seems to be one of the few axioms one can fairly safely assume about how things will develop.

"Know your enemy's plans and the battle is won" said Sun Tsu (or something like that, quoting from memory). So if a rogue AGI knows essentially everything about everyone, in real time, then its enemy is helpless, and that means us! It's no great revelation, having been explored in loads of SF films such as Terminator, but a precautionary principle not to go too overboard on IoT connectivity if you are worried about AGI.

Expand full comment
SGfrmthe33's avatar

This is an uncharacteristically innacurate interpretation of what Tyler actually wrote.

Nowhere does he imply "Therefore, it'll be fine".

He simply says, to paraphrase, "It has been fine everytime before when smart people panicked about some innovation with limited evidence for their beliefs, therefore it makes sense to continue for now."

This is not saying just plough ahead and throw caution to the wind, he's just saying we're being overly cautious right now based on insufficient information. Obviously, if further development and research discovers some latent ruinous potential, then we absolutely shut that down and I'm sure Tyler would agree.

There's no fallacy here, just an uncharitable reading.

I see lots of smart, respectable, people being shockingly irrational when it comes to AGI. I guess fear really is the mind-killer.

Expand full comment
Freedom's avatar

You say we are being overly cautious about it, but as far as I can tell there has been no caution at all to date. Tyler and you are apparently speaking out AGAINST people who are saying there should be some caution- right? I mean "we absolutely shut that down"- how are we going to do that? Won't that require some shutting-down infrastructure prepared in advance?

Expand full comment
LGS's avatar

I agree nobody knows anything about how AI will go, and I agree with Scott it might kill everyone (I'll put that at 20%).

I'm sort of on board with slowing down AI research. However, I don't think alignment research will save us. I've looked at the alignment research and it mostly looks like nonsense to me -- my probability that armchair alignment research (without capabilities research done at the same time) can save us is at <5%.

Basically, I agree with "AI might kill everyone". But I disagree with "so we must do alignment research NOW" -- the latter is futile.

Instead, I'm mildly in favor of slowing down AI research because I'd rather have a few extra years to live. But the weakest part of the AI risk position is always the "...therefore work on alignment!" since that's a research program that hasn't achieved anything and likely never will.

Expand full comment
JDK's avatar

What does that 20% actually mean to you? How is it different from 22% or 25%. How much of you personal wealth are you willing to bet on it.

"kill everybody" tommorrow? In a year? In a decade? for as long you are alive? in the next milliion years?

How exactly did you come up with 20%. By what method?

People are using numbers with a false precision as a pretend cloak of rationality.

Expand full comment
Deiseach's avatar

My opinion first, last and always is that the problem is not getting the AI aligned, it's getting humans aligned.

The people doing the AI research in order to create the first genuine AI have principles about knowledge, progress and the rest of it. They're hoping to find out things about how intelligence works and to create tools to make a better world.

The people funding the AI research are hoping to make a shit-ton of money. Not that they don't want a better world as well, but first let's get the third quarter earnings returns TO THE MOON!

So out of a combination of "We wants it, we needs it. Must have the precious", they're going to push ahead with what they are doing, regardless of all the pleading and hand-wringing and impassioned open letters (by the bye, has anyone anywhere *ever* been convinced to change due to an open letter?).

If we humans were perfectly aligned with the "values for human flourishing", there would be a lot less fear of what might happen. Wake me up when you find a way to get people to do more than pay lip service, driven by the inexorable forces of a market economy to placate Mammon and Moloch. Then we can worry about the AI, which is only going to do what the fallible humans tell it.

Expand full comment
BFL's avatar

I find that I have an extremely negative opinion of the AI doomer position, and it makes me wonder if this will end up as the Rationalist version of Atheism+.

Expand full comment
Deiseach's avatar

"the Rationalist version of Atheism+"

That's a brilliant analogy, my only quibble is that I'd put the whole AI enthusiasm ("if we get this right and align the AI with human flourishing values, it will bootstrap itself into godhood level intelligence and fix all our problems in defiance of the laws of physics and every single human on Earth will be rich and uplifted and magic magic magic is how it all works!") into the basket of Atheism+. That effort was about improving things by sprinkling atheism on top, ironically what C.S. Lewis had described years before as "Christianity And":

"What we want, if men become Christians at all, is to keep them in the state of mind I call "Christianity And". You know — Christianity and the Crisis, Christianity and the New Psychology, Christianity and the New Order, Christianity and Faith Healing, Christianity and Psychical Research, Christianity and Vegetarianism, Christianity and Spelling Reform. If they must be Christians let them at least be Christians with a difference. Substitute for the faith itself some Fashion with a Christian colouring."

I don't know if Atheism+ ever thought about "Atheism and Vegetarianism" but it seems like a great fit for them.

Expand full comment
JDK's avatar

Both are immanitizing the eschaton, to borrow a phrase.

Expand full comment
Purpleopolis's avatar

As someone who was there, that's not what A+ was about. It was about taking atheism from no religion to a new religion (social justice).

Expand full comment
Purpleopolis's avatar

My own snark aside, this seems to have vastly lower levels of self-righteous hatred than that movement did.

Expand full comment
Scott Alexander's avatar

Can you explain what you mean? My impression is that Atheism+ was atheism but with some annoying woke parts. Is this supposed to be rationalism with annoying woke parts?

Expand full comment
BFL's avatar

My meaning was more "divisive position that pushes a group apart and reduces its relevance" rather than having anything to do with wokeness.

Expand full comment
Phil Tanny's avatar

It seems helpful to leap over all the detailed arguments about AI and focus on a larger bottom line question, such as...

QUESTION: Can human beings successfully manage ever more, ever larger powers, delivered at an accelerating rate? Is there a limit to human ability?

1) If there is a limit to human ability, and 2) we keep giving ourselves ever more, ever larger powers, then sooner or later, by some means or another, we will have more power than we can successfully manage.

I would argue this has already happened with nuclear weapons. There's really no credible reason to believe that we can maintain stockpiles of these massive weapons and they will never be used.

We don't really need carefully constructed arguments about AI to see the bottom line of our future. Every human civilization ever created as eventually collapsed, just as every human being ever born has eventually died. All things must pass.

What we're most likely looking at is a repeat of the collapse of the Roman Empire. We'll keep expanding until we outrun our ability, the system will crash, a period of darkness will follow, and something new and hopefully better will eventually arise from ashes. My best guess is that this cycle will repeat itself many more times over many thousands of years until we finally figure out how to maintain a stable advanced civilization.

It may be more rational and helpful to ignore all of the above, and turn our attention to what we know with certainty. Whatever happens in world history at large, each of us is going to die. What is our relationship with that?

It's madness to take on more risk at this time, but we're going to do it anyway. Just like with nuclear weapons, we'll take on a new huge risk, and when it dawns on us that we're in over our heads, we'll ignore the threat, and turn our attention to the creation of new threats.

Expand full comment
David Roberts's avatar

The AI risk makes a cold war with China all the more dangerous since the prospect for bilateral safety cooperation is greatly diminished. At the same time, in the absence of trust, America cannot afford to fall behind by pausing or placing unilateral restraints on its own AI development.

Will development of a killer AI be similar to the development of nuclear weapons? And would it be a long term benefit to have a Killer AI accident early enough so that the damage is limited, but horrible enough to show how dangerous use would be?

Expand full comment
Phil Tanny's avatar

Cowen asks, "Besides, what kind of civilization is it that turns away from the challenge of dealing with more…intelligence?"

That would be us, right now. The science community and our technologists consider themselves the future oriented thinkers, but the truth is that they are clinging blindly to a simplistic, outdated, and increasingly dangerous "more is better" relationship with knowledge left over from the 19th century and earlier. They don't want to adapt to the revolutionary new environment the success of science has created. They want to keep on doing and thinking the same old things the same old way. And if we should challenge their blind stubborn clinging to the past, they will call us Luddites.

https://www.tannytalk.com/p/our-relationship-with-knowledge

The science community and technologists are brilliant technically. But they are primitive philosophically, in examining and challenging the fundamental assumptions upon which all their activity and choices are built. It's that brilliant blindness which will bring the house down sooner or later, one way or another.

Expand full comment
Jonny's avatar

This is outrageous misinformation: "There have been about a dozen times a sapient species has created a more intelligent successor species: australopithecus → homo habilis, homo habilis → homo erectus, etc - and in each case, the successor species has wiped out its predecessor. So the base rate for more intelligent successor species killing everyone is about 100%”.

There is literally no evidence for this, at all, in the fossil record or anything else. The timescale of minimum 4.2m years of evolution you're talking about over vast continental landmasses makes almost any other explanation for how one species becomes another (interbreeding, climactic shock, displacement, ad infinitum) more likely on an evolutionary timeline. The idea that any of these species had agency to "create" a successor is bad. The idea that the process was linear is a gross simplification. And the idea that our genetic ancestry is contingent on waves of genocide is a narrative that may suit your argument here but is just so irresponsible in its implications about intelligent life and human life - and more importantly, has no evidence behind it.

Even the idea that our most recent cousins, Neanderthals, we're wiped out by bloodthirsty homo sapiens around 40,000 years ago does not stack with actual paleoanthropology/archaeology . Would suggest Rebecca Wragg Sykes' book Kindred as a good way into thinking about what evolutionary change actually means, when using evidence we actually have.

Expand full comment
JDK's avatar

Fleshing out out the same point I made by just saying "No." Thank you.

Expand full comment
Alex Potts's avatar

You can tell that Tyler is desperate from the fact that he starts appealing to people's "inner Hayekian". How many people actually have one of those?

Expand full comment
JDK's avatar

All two year olds.

Expand full comment
Jeffrey Soreff's avatar

"There have been about a dozen times a sapient species has created a more intelligent successor species: australopithecus → homo habilis, homo habilis → homo erectus, etc - and in each case, the successor species has wiped out its predecessor."

Well, neither Nietzsche nor Babbage said:

Man is a rope stretched between the animal and the Machine--a rope over an abyss.

A dangerous crossing, a dangerous wayfaring, a dangerous looking-back, a dangerous trembling and halting.

:-)

Expand full comment
Sam W's avatar

The thing that takes AI risk beyond Knightian uncertainty, for me, is the plausibility of described scenarios where a smarter-than-human AI finds reasons to eliminate all humans. It took a lot of reading to find these scenarios obviously plausible (though most of the arguments can be found nicely summarized in Bostrom's book now).

If you haven't been convinced by Yudkowsky or Bostrom that an existentially dangerous scenario is plausible, then I wouldn't expect anything anyone else says to convince you that stopping AI development is a good idea, especially in light of the more easily plausible-sounding fact that AI can bring a lot of value to the world.

If you don't think the orthogonality thesis shows that an AI will not care about human values by default, and you don't believe that convergent instrumental goals include obviously useful things like not having your goals changed, not getting switched off, etc., then I understand why you wouldn't take AI existential risks seriously. Especially if you don't think humans are foolish enough to decide to create powerful AI agents (regardless of whether agents can emerge from other AI systems not explicitly built to be agents).

Given that it still seems like a niche belief, I do find myself confused by the fact that it seems so obvious to me and unbelievable to others. I have felt similarly convinced of certain things at other times in my life, and a lot of those things have turned out to be wrong, but I know that it's almost impossible to find the flaws in beliefs that have captured by mind so solidly.

Expand full comment
JohanL's avatar

Perhaps a better analogy is that there might or might not be an invisible alien starship in orbit around Earth at the moment, and that we can have no idea whether there is?

Another part of "we don't know anything" is that since we don't know anything, we don't *even* know what a good safety measure might be. The discoverers of fire might think a safety measure is to sacrifice to the God of Fire for safety, or put all fire under the control of the Priesthood of the Flame. These people could be serious about Fire Safety, but because they have no idea what works or what the actual dangers (beyond the immediate) are in the first place, any measures are going to be ineffectual. In the 16th century, Printing Press Safety seemed to be censorship and blacklisted books, which probably seemed sensible to at least the people in charge at the time. The people building ENIAC didn't and COULDN'T POSSIBLY have had any good ideas about computer security beyond "let's make sure it doesn't start burning and try to keep moths out of the building and put a lock on the place". And in all three cases "just ban it" would have been an obvious non-starter - even if your jurisdiction successfully does, it won't be universal.

I'm not opposed to AI safety - I just don't think we have any ideas yet about what would actually be required. So it's probably a good idea to airgap your AI experiment and check whether it does anything unprompted, but beyond that? Who the heck even knows?!

Expand full comment
JohanL's avatar

What you *can* do is handle risks as they appear. There's no way to predict Global Warming when you invent fire, but after seeing huts burned down, you can tinker with hearths.

If merely inventing fire would have automatically ensured that everyone will be consumed by fire, then we would all be... toast. Fortunately, no invention in the history of mankind has worked that way (not even nuclear weapons).

Expand full comment
JDK's avatar

Maximum uncertainty for a Bayesian requires a 50% assignment. I think this is the lede. Don't bury it.

"What’s the base rate for alien starships approaching Earth killing humanity?"

What's the base rate for aliens?

What's the base rate for starships?

What's the base rate for Earth of all places?

What's the base rate for them killing all humanity?

0.5 * 0.5 * 0.5 * 0.5 ?

"a more intelligent successor species: australopithecus → homo habilis, homo habilis → homo erectus, etc - and in each case, the successor species has wiped out its predecessor. "

No. The ostensible successor species did not wipe out the prior species. There was not a war among the species.

That coronavirus part of essay was ridiculous. The common cold, SARS, MERS?

What's one's base rate that nuclear weapons will lead to an extinction or near extinction level event?

What's one's base rate that human induced climate change will lead to an extinction or near extinction level event?

These are good questions but there are other more pressing questions.

A. What exactly does one mean when one assigns a probability to a future event? What does 33% really mean? Is it really different than assigning 35% or 30%. A point estimate must have error. How much error do you have? What is the shape of the distribution of error? It need not be Gaussian.

B. What exactly is willingness to put your own money on proposition? A $2 bet, a $2 million bet? And what is your bet size in relation to your total wealth? Without telling us your bet you could just give a % of total wealth your wiling to risk? I think Kelly Criterion would say not too much. Just how strongly do you feel about your bet?

C. Probability as a basis for action?

Lifetime odds of being struck by lightning in US: 1 in ~15,300. 1 in ~ 700,000 annually. Lifetime odds for death from lightning: 1 in ~19 million. Should one be a dope on golf course while out in a thunderstorm? (Personal decision) How about pausing youth sporting when thunder/lightning? (A collective decision.)

Expand full comment
Erusian's avatar

This fallacy already has a name. Except it's the opposite of what you say. You see, there's a common argument:

1.) Something is possible.

2.) If it's possible, even remotely, and the end is sufficiently catastrophic then probabilistically we should take it into account. (Alternatively: if it's possible and we can't quantify the probability we should assume it's large.)

3.) Therefore we should panic.

This is a form of the appeal to probability. And it's what you and AI pessimist types do a lot. But only with AI because it's not actually a consistently held logic. If you consistently accept the logic you should be panicking about any number of things in ways you in fact don't. For example, if you consistently support the appeal to probability then Pascal's Wager is irrefutable.

Expand full comment
Carl Pham's avatar

AI doomerism seems functionally equivalent to Pascal's Wager to me. No matter how unlikely any evidence (or lack of evidence) says this thing might be, the consequences are Too Enormous To Take Any Chances. What's different?

I mean, instead of Heironymous Bosch souls screaming in a lake of fire, we have science-fiction notions of gray goo or engineered superviruses, et cetera, so the book cover painting has been modernized, but isn't the story pretty much the same?

Expand full comment
Erusian's avatar

Nothing from a formal logic point of view. In fact most of these arguments are equivalent to fallacies that were debunked at least a thousand years ago.

I'm not sure if it say something deep and profound that the Bay Area rationalists have reinvented theology but with AI in the place of god. Or if it's just that if you don't understand yourself in philosophical or historical context you're going to make a lot of trivial mistakes.

Expand full comment
Freedom's avatar

I don't think I have seen anyone serious make that style of argument about AI. I'm actually not sure I've seen anyone unserious making such an argument. I have seen A LOT of people in a huge rush to dunk on people worried about AI and ascribe all manner of absurd and idiotic reasoning to them.

Expand full comment
Erusian's avatar

You don't think you've seen anyone serious make an appeal to probability? Because it's in the article I'm commenting on.

If you mean theology you're free to tell us how it's different from Pascal's Wager rather than just accusing us of putting words into that side's mouth without actual counterargument.

Expand full comment
Carl Pham's avatar

I have to say that fact baffles me to no end. Why would a demographic like that swing so hard towards Chautauqua-style millenarianism? The last time I was well in with the programming crowd was in the late 80s, and they were among the most hard-bitten of skeptics. What changed?

Expand full comment
Moon Moth's avatar

I think the simple answer is, "Eliezer Yudkowsky".

He's good at persuading, if not people in general, then people with certain characteristics. The community he formed consists of those people, and that's where this community partially descends from. I don't think the tech community in general is nearly as convinced.

Expand full comment
NoRandomWalk's avatar

I am...not freaking out, but I have been concerned for many years now and the scary looks a lot more likely not less. Am also a normie who's not smart enough to figure out how certainly dead we are. Given that, should I reorient my life and spend a significant amount of my free time trying to solve the alignment problem? Should I ask my fiance to also do that (both can in theory pass any college level math or comp sci class, eventually)?

I don't want to 'die with dignity'. I want to either live well without dignity or live longer.

Expand full comment
The NLRG's avatar

(3) is only a bad argument for discretely distributed outcomes. in a continuously distributed world, any particular value of any random variable has a probability of 0

Expand full comment
TonyK's avatar

"If you have total uncertainty about a statement (“are bloxors greeblic?”), you should assign it a probability of 50%": that can't be right. You also have total uncertainty about the statement "are bloxors more greeblic than drubbits?", but if you assign 50% to that too, you can deduce that "if bloxors are greeblic, then they are certainly more greeblic than drubbits". Which of course is nonsense, as any licensed greeblician will confirm.

In the face of total uncertainty, the only consistent response is not to assign any probability at all.

Expand full comment
Shankar Sivarajan's avatar

"Uncertainty is uncertainty. Lesser, greater, middling, it's all the same. If I'm to choose between one probability and another, then I prefer not to choose at all." Paraphrasing Geralt of Rivia.

Expand full comment
Scott Alexander's avatar

I’m having slight trouble following this, so I’m going to try to replace all of these with real words and see what happens.

Are elephants self-aware? - 50% - I think this is about my real estimate.

Are elephants more self-aware than whales? 50% - Also about my real estimate.

If elephants are self-aware, they are more self-aware than whales - it doesn’t seem like I’ve proven this at all.

Maybe this only works with binary/dichotomous variables?

Is the one millionth digit of pi even - 50%

Is the one millionth digit of pi more even than the one millionth digit of e - knowing only that even is a binary/dichotomous variable, there are four possibilities - they're both even, they're both odd, the first is even and the second odd, or the second is even and the first odd. Only in one of these possibilities is this statement true. So 25% chance.

Therefore, if the one millionth digit of pi is even, it is more even than the one millionth digit of e - not true with certainty, they could be equally even.

Sorry, I’m still having trouble following this argument.

Expand full comment
TonyK's avatar

Well, probability is almost as much of a mystery to us as free will -- all our explanations of it come up short in one way or another. One way of looking at it is to pretend that you (that's you personally) have a number of imagined worlds, each of them equally likely; this number can be large, but has to be finite. Then your probability of X (which definitely doesn't have to be the same as my probability of X) is simply the number of your imagined worlds in which X is true, divided by the number of your imagined worlds.

If we use this interpretation of probability, it's easy to see that according to your stated probabilities, if elephants are self-aware in 50% of your worlds, and if elephants are more self-aware than whales in 50% of your worlds, then in every world in which elephants are self-aware, they are more self-aware than whales.

I am aware that this interpretation of probability is not very convincing; but I have never come across a _more_ convincing interpretation of probability. In any case, it should convince you that your strategy of assigning a probability of 50% to outcomes that are completely opaque to you is flawed.

Expand full comment
Peter Kriens's avatar

In times of great uncertainty, enjoy the good parts.

I live in the south of France and last year we had a lot of sunny days. I noticed that some of these very long streaks made me a bit anxious. Is global warming really gonna hurt my family? And then one day I realized that these were actually quite perfect days, 20 years ago I would've been overjoyed. Perfect days for the pool, barbecue, a visit to the sea, early morning walks, market visits, drinking coffee under the Platanes, late lunches, etc. and I was _worrying_?

Maybe Epstein is right that most of the heat increase is in winter & night and it will all work out. Maybe Ehrlich finally wins a bet. It is clear that whatever I do, or France does, or even Europe does, it is not going to make a large difference anyway when you actually look at the numbers and Homo Sapiens surprising talent to screw up and do the right thing only as the last option.

So I can waste my perfect days in the south of France worrying or I can just enjoy life to the fullest and see where it all ends. We're Homo Sapiens and there are a lot of us, quite a few pretty smart. We'll figure it out somehow and if it means we are superseded by Machina Sapiens, then they won. When that happens, at least I had a lot better time than an anxious Scott. If it doesn't happen, well, idem ditto. Santé!

Expand full comment
Jason Long's avatar

What a great conversation this is. God bless the internet! I’ve been trying to resolve my intense cognitive dissonance from my two favorite internet writers disagreeing. Until the Economist fixes it for me, ideally with a short, clever essay that ends in some light word play, here’s my best resolution. Tyler is indeed blogging, as always, in his Bangladeshi train station style. He’s not slowing down to articulate. My interpretation of half of his post is: “Don’t be convinced by a long argument leading to a Bayesian probability.” There’s just too much fundamental uncertainty, too much noise. Someone could write 10,000 words and sound very smart, but you the reader shouldn’t be persuaded that the writer’s personal prior is at all convincing to anyone other than the writer themself. Scott “Bayes’ Theorem the Rest is All Commentary” Alexander lives his life according to the principle that Bayesian reasoning is the right way to think, which is great. So Scott must do the work to come up with his own personal p=.33 that AI wipes us out. Otherwise he won’t know what to think. Both writers are correct! If you are committed to living and thinking by Bayes’ Theorem, you must come up with a prior. But if you’re not, then don’t be persuaded. These numbers are actually meaningless in a social sense. 33% is crazy, and Eliezer’s number (.99?) is meaningless…except to Scott and Eliezer. It guides how they think. But I agree with Tyler: don’t *you* the reader think they actually know anything about the world. Don’t be persuaded by their logic. No amount of logic can overcome sufficiently great uncertainty. The second half of Tyler’s post is “when in doubt, favor innovation,” and as a card-carrying economic historian, I would strongly argue that there are few hills more worthy of dying on. Being a subsistence farmer was bad.

Expand full comment
Ernest Prabhakar's avatar

Of curse the real solution to AI safety is to train the entire generation of humanity to distrust disembodied words that sound smart. But strangely, no essayist wants to propose that...

Expand full comment
Steven Chicoine's avatar

That was fantastic, kudos.

Expand full comment
Garald's avatar

What I still haven't seen is a realistic, short-to-middle-term disaster scenario that doesn't involve AI *acting through meatspace* - influencing humans, giving information to humans, etc. Actually, GPT-4 is an unreliable source of information - more so than Wikipedia, say - and so it's really the manipulation of humans by a non-human entity (towards a goal or not) that is novel here.

Of course we've had something primitive like that in the last few years, in social media, so we have a foretaste. And we've also had more than about 150 years' worth of manipulation via mass media, by humans, be it to the service of a dictatorship based on a collective delusion, or just mandated by the profit motive. Not good, and also not completely and utterly new.

Then there are all the non-existential threats that are likely to be real challenges:

- the replacement of semi-intellectual workers whose tasks have already been made routine (for the sake of management, marketability, etc.), analogous to how many blue-collar workers were replaced, particularly if they worked in an assembly line;

- the realization of how deeply stupid humans can be. So far, we have seen the more or less competent use of language as proof of a certain basic level of intelligence, even if it is used to advanced obvious fallacies. But can't ChatGPT parrot like a (pick your least favorite species here, whether it starts with T or W)?

Expand full comment
Mark_NoBadCake's avatar

"Safe Uncertainty Fallacy"

Completely agree re: AI development.

A. Musk said years ago that whichever country advances AI first will rule.

B. NO ONE is pausing a day let alone 6 months for ANY reason and especially for universal "safety protocols."

Self-evident?

Recommendation: Though quaint-sounding, be as decent a person as you can manage and with some luck (born of empathy & humility) humanity will catch (another) break!

[Scott, I enjoyed reading your review of Tim Urban's latest. That was quite a hyphenated qualifier! : )

Expand full comment
Mark_NoBadCake's avatar

"What we do with our hands, the mystics say, is a direct expression of the forces in our minds. Even our technology is an expression of our deepest desires. The crisis of industrial civilization, which could create the conditions of paradise on this earth and yet threatens to destroy it, only reflects the deeper division in our hearts. Instead of blaming our problems on some intrinsic flaw in human nature, we must squarely take responsibility for our actions as human beings capable of rational thought – and then change our ways of thinking. It is not so difficult, after all; it has happened many, many times in the past as humanity has evolved.”

--Eknath Easwaran, Original Goodness, 2018 (posthumous)

Expand full comment
Chris's avatar

That final point is maybe the best. We have designed a society that dismisses as a basic reaction. We ought to use that to our advantage the one time it has actually been generally advantageous for us.

Expand full comment
Deiseach's avatar

The triumph of conservatism, or, all we old people yelling at the kids to get off our lawns were right, now we're going after this pesky AI and telling it to get off the lawn, cut its hair, get a decent job, and don't sass your elders, sonny! 😁

Expand full comment
Ch Hi's avatar

That last paragraph needs a rewrite. Especially the last sentence. I'm guessing it's sarcasm, but I'm really unsure.

Generally, though, the problem is that when the uncertainty is total, assigning ANY probability is not correct. 50% isn't a reasonable number, and neither is any other number. If you want it in math, it's Bayesian logic without any priors AND without any observations. I don't think there's any valid way to reason in that situation, so the question needs to be rephrased.

Something that we could take a stab at is "What are the possible ways of controlling the advance of AI, and what do those cost?". Clearly we aren't going to want to have a totally uncontrolled AI, so that's a valid goal. In this form, the "existential risk" would be in the costs column, but it would have such huge error bars that it couldn't yield much weight. And there'd be existential risks on the "no AI developed" side too. (E,g, what's the chance that WWIII starts and kills everybody because we didn't have an AI running things? That's not non-zero.)

I think this is one of those situations where unknown-unknowns isn't sufficient, it's more unthought-of-unknown-unknowns. Or perhaps something stronger.

FWIW, I tend to put the odds of AGI being an existential disaster at about 50%. But I put "leaving a bunch of humans running things with access to omnilethal weapons for a bunch of centuries" as an existential disaster at about 99%...and worry that that's an underestimate. Now the time scale is different, but over the long term I rate the AGI as a LOT safer choice. Which doesn't mean we shouldn't take all possible care in the short-term.

Expand full comment
Phil Getts's avatar

Re. "If you have total uncertainty about a statement (“are bloxors greeblic?”), you should assign it a probability of 50%": I'm pretty sure that if you take a dictionary, grab one random predicate 'foo' and two random nouns X and Y, the probability of foo(X,Y) is less than .5. OTOH, a predication which has been proposed by a human isn't randomly selected, so the prior could be > .5.

I think what we do in practice is have an informal linear model which combines priors from various circumstances surrounding the predication, mostly the reliability of the source plus explaining-aways like political motivation. This is so open-ended that it lets us assign any probability we want to.

That said, using just the reliability of the source as a prior is probably better than following your gut.

I wrote an unsupervised learning program using Gibbs sampling to derive the correct probabilities of each one of a list of sources. "Unsupervised" means you don't have to know whether any of the claims each source made is true in order to compute how reliable that source is. (This requires that you have many claims which several of the sources have agreed or disagreed with.) This is pretty surprising, and I'd like to use NLP to apply it to news sources, but I've done too much charity work already lately for my own good.

It's surprising to Westerners that it works, because Westerners are raised on ancient Greek foundationalism--the idea that you must begin with a secure foundation of knowledge, and build It on top of it. This is dramatically wrong, and has been a great cause of human suffering over the past 2000 years (because foundationalists believe they can have absolute certainty in their beliefs as long as the foundations are secure). Coherence epistemology lets you use relaxation / energy minimization methods to find (in this case) a set of reliabilities for the sources which maximizes the prior of the observed dataset.

(I wrote this program at JCVI to compute the reliability of 8 different sources of protein function annotations, because the biologists had been in a meeting all day arguing over them, and they did this every 6 months, and I realized I could just compute the most-likely answer. They agreed that my results were probably more-accurate than theirs, but told me never to run that program again, because genome analysts don't believe in math and are (rightly) afraid of being replaced by computers.)

Expand full comment
JDK's avatar

Great story. And interesting points.

Expand full comment
Deiseach's avatar

""Unsupervised" means you don't have to know whether any of the claims each source made is true in order to compute how reliable that source is. (This requires that you have many claims which several of the sources have agreed or disagreed with.)"

But what do you do if it's circular quoting? Source A says "Hy-Brasil is 1,000 miles off the west coast of Ireland", and sources B, C, D and E all agree in broad terms (some say it's 800 miles, some say it's to the south-west).

But B is quoting Wikipedia, which relies on a claim made in a magazine published 15 years ago. C is quoting from someone's blog where they reference Wikipedia. D is quoting C. E tracked down the magazine and quotes that directly.

The magazine article is wrong because it was written by a guy who believed in Atlantis.

But all the sources agree that A is very reliable indeed. Flat-out wrong, but reliable. How do you make sure that isn't what is happening?

Expand full comment
Phil Getts's avatar

You don't. The algorithm only rates the average reliability of each source, for some other algorithm or person to use as a prior in rating the correctness of individual claims.

The problem in this scenario is that the sources A, B, C, D, & E aren't independent. This shouldn't be a problem for individual, random cases unless A, B, C, D, and E are all likely to give the same wrong answer in many other cases as well. If A or B regularly make such misjudgements, their reliability should diminish.

Unfortunately, news sources are massively correlated, because they have political biases. The sources that say there are only trivial psychological differences between men and women, are also likely to have said that the IPCC says cities will be underwater by 2100 if we do nothing about global warming, that Covid-19 was definitely not leaked from a lab, that trans women have no advantages over cis women in sports, that Republicans commit as many racial hate crimes as Democrats do, that police are more likely to shoot a black suspect than a white suspect, that any particular BLM protest was peaceful, and that the spike in homicides after them was due to increases in gun sales. A conspiracy of multiple sources to spread the same misinformation will fool the algorithm.

Expand full comment
Deiseach's avatar

"A conspiracy of multiple sources to spread the same misinformation will fool the algorithm."

Which is why you *do* need some kind of way to distinguish "true" from "false, but convincing". Hey, isn't that the problem we have with current AI that will make up shit if it doesn't have a real answer?

Expand full comment
Phil Getts's avatar

That would be nice, but that's much more-ambitious what I'm talking about. Let's not scorn the good for not being perfect. I'm describing expectation maximization, which doesn't give you AGI, but could be used in many places where today people just punt on having any probability estimates at all while ranting ignorantly about there not being "a view from nowhere". EM gives you the view from nowhere (an objective way of answering circular or context-relative questions). This dispels most post-modernist arguments for relativism (which is, however, so carelessly defined that I am myself a complete relativist for some interpretations of "relativism").

Expand full comment
Peter Susi's avatar

Setting this total stretch of a strawman aside - perhaps a better way to state this is "We have no idea what's going on, therefore there is little to suggest a reason to panic, only to observe." These folks are looking at the chicken-little crowd and asking "Are you basing this on anything other than your own unfounded assertions about what *might* happen?"

Assuming that ignorance is a reason for blind panic is just as absurd as assuming ignorance means safety. These folks are doing neither, only advising patience.

Expand full comment
Phil Getts's avatar

So, the real error here is failing to distinguish between suddenly allocating resources to a problem, and panicking? My difficulty with that idea is that I can't operationally distinguish between suddenly allocating resources to a problem, and panicking.

Expand full comment
Peter Susi's avatar

You have the target of his criticism backwards. The people he is criticizing ate the ones saying "This is panic, and it is unwarranted" he has turned the people who have told chicken little the sky is not actually falling, and criticize them for claiming that we could never possibly have bad weather.

I'll use your phrasing, because it works well: The error here is confusing a statement that panic is not necessary with a statement that everything is safe and there is no reason for any concern.

Expand full comment
N3's avatar

If you want to steelman the argument, basically they are pointing out a more classical fallacy on the part of the LessWrong crowd. It goes like this:

1. I can imagine a terrible outcome extremely vividly.

2. Therefore, this outcome is extremely likely.

From my point of view, all of the LessWrong AI catastrophizing seems to be purely based on imagination and speculation, without contact with any empirical evidence. It's just "I can imagine it, so it's going to happen".

What would be the alternative (to evidence free speculation)? Empirical study of the behaviour of AI systems. This is not what they have been doing so far.

Expand full comment
Ch Hi's avatar

There's a problem here, though. How do you weigh an "existential risk"? If you give it a heavy enough weight, then even if it's a really low probability you need to worry about it, and should act to prevent it.

However, my problem is that I see existential risks on both sides. E.g., perhaps a strong AI could prevent the final war. Then even if it's an existential risk to develop the AI, it may be a more likely one to not develop it.

Simple models often don't accurately reflect the thing they're trying to describe...or only reflect a small part of it.

Expand full comment
Akidderz's avatar

I wasn't going to comment because I have this strange feeling when I start reading through all the other comments that what I'd say was already said - but better. But here goes...

This post put me in a mood. Two of my favorite internet people are bickering and I feel like the kid whose parents are getting divorced - why are they arguing and not being nice to each other?

I lean team TC on this one. If you asked me to write a 50 page treatise on why, I might be able to, but ultimately when there is this much uncertainty involved, I think it comes down to basic human temperaments, an admittedly super unsatisfying answer. Tyler is optimistic. One of my favorite Tyler arguments is about how long term small (3% growth) is fantastic for humanity because the incremental improvements just compound over time - so keep growing, slowly if need be, but keep growing since it benefits all.

I'm optimistic too. I take risks. I invest, knowing that others probably have more information than I do about markets. I run a small business, knowing that most small businesses fail. I have two children, knowing half the chattering class is convinced that no one should be born into this crazy apocalypse-a-day world.

I'm optimistic about AI too. Every new tool that comes out I gleefully play with and it feels like magic. Like, not a trick or illusion magic - like full bodied MAGIC that I don't completely understand and that does things that feel like they shouldn't be possible.

I'm happy being called hopelessly naïve on all fronts - investing, business, children, and AI (many more too). I'm also content just being along for the ride and hoping to know/learn enough about each new model to see where it fails and but also be stunned with how useful and empowering it can be. And that might be the element that tips me away from the doomsayers...I just see so much human potential unlocked by even the crude forms that we have now that I'm just unbelievably hopeful that there can be more down the road.

Expand full comment
Mo Diddly's avatar

Part of the problem is about speed and carefulness. There are indeed many things about AI to be optimistic about, but it also might be astonishingly dangerous. Many cool-headed voices are arguing not that we shut down AI progress forever, but that we treat it reeeeeaaaally cautiously like the national security threat that it may well be.

Expand full comment
Drea's avatar

Thank you for posting, despite the uncertainty that it was a duplicate. I don't think it is, and you managed to capture my intuitive stance well.

Expand full comment
Taylor G. Lunt's avatar

Name proposal: the Who Knows Therefore It's Fine fallacy.

Expand full comment
The-Serene-Hudson-Bay's avatar

The 100 mile long alien space ship is a bad example that smuggles in the resolution of a major disagreement in AI risk in your favor, the question of whether AI would have the capacity to wipe out humanity. The moment we see a 100 mile long Alien space ship we can be instantly certain they have the capacity to, at the very least, drop that thing on the planet and cause a mass extinction event.

Whether a superintelligence AGI has the capacity to wipe out humanity is not instantly obvious from the speed and which GPT has progressed and Yud's belief that it would rests on lots of theorizing about what superintelligence is, how scalable social manipulation is, how easy it is to produce self replicating killer nanobots etc.

Expand full comment
Ch Hi's avatar

Sorry, I've got to pretty much disagree. A sufficiently powerful AI would be able to eliminate humanity, if only by causing someone to start the final war. (And I'm not sure it would need to be THAT superhuman. Merely trusted in an inappropriate way.)

So the question becomes "would it want to?". (Probably not. That's not sufficient to render it safe. A paperclip maximizer wouldn't want to wipe out humanity except as a rather secondary instrumental goal.)

Expand full comment
Mark's avatar

One point that I think Tyler is getting at is that it seems like the current calls for stopping GTP-5 feels similar to calls people made for not turning on the Large Hadron Collider.

If/when we’re at the point we’re deciding whether to flip a switch and turn on a super-intelligence hooked up to the internet and 3D printers, the AI-doomer argument seems compelling. Don’t turn it on until we’re sure we’ve solved alignment.

But right now, nobody has any idea how to solve alignment. Okay, someone suggests, let’s throw $100B at the problem. And the response from AI-doomers is “I wouldn’t even know how to begin to spend that money!”

And importantly, it feels like GTP-5 specifically has as much chance of being the extinction-causing AI as the LHC had of causing a black hole that destroyed the world. It’s the next rung on the ladder though and how sure are you that the rung after that won’t destroy the world? So be sensible and stop climbing while we can?

But if we can’t see a solution to the alignment problem on our current rung of the ladder, and we’re confident the next rung is still safe, climbing one rung higher may be the only way to give us the tools and insights we need to solve the problem in the first place.

If I’m trying to be a bit more formal in my assumptions:

1. There’s only a negligible chance we solve alignment with today’s knowledge levels.

2. One likely way we increase our knowledge such that we could understand AGI and solve alignment is to make some near-AGI

3. GTP-5 has ballpark as much chance as destroying the world as LHC

4. But GTP-5 might be near-AGI enough to give us crucial insights. (Or at least let us know if GTP-6 will be safe)

5. Delaying safe-AGI will needlessly result in millions of people dead and huge amounts of suffering

6. Overregulation of AI could set back the date of safe-AGI decades (or risk the “China” problem)

7. Therefore, don’t object to GTP-5

8. Possibly, think about what regulations/safeguards will need to be in place if the insights from GTP-5 are: “Hmm, now there’s a non-negligible chance that GTP-6 will be the one”

Also, to me, the main worry from the “China” problem isn’t that China will forge ahead and create unsafe AI that destroys everyone. It’s that China will realize the US has been oversafe and stopped AI development on GTP-n whereas GTP-n+1 is perfectly safe. They’ll go on to develop GTP-n+1 and gain crucial insights that allow them to safely create AGI that does do exactly what they want.

Expand full comment
Ch Hi's avatar

I think your model of the "China" problem is oversimple in one very important way: It's not just China. There are LOTS of groups working on the problem, with varying skill levels and varying degrees of care. (And, or course, even China as such isn't a monolithic entity. The (invented names) Bejing Institute of Technology isn't the same as the Department Regulating Ostensible Wise Navigation, but they've both got their own Advanced Projects Agency. (So does the military, which is supporting and drawing insights from both of them.) And do you think Musk intends to shut down his project, just because he's called for MS to shut down theirs? Etc.

Expand full comment
Dana's avatar

How on earth does something which does not possess consciousness or any comprehension of the text it's producing even "sort of" qualify as an AGI?! It isn't an intelligence at all, general or otherwise. Even if you hold to a functionalist account of intelligence, it doesn't count, because it is not in fact performing the functions an intelligent being would perform. It's just a really impressive simulation, able to simulate consciousness convincingly only because we've fed it an astonishing quantity of human-created sentences for it to perform mathematical algorithms on. That is not the way an *actually* intelligent being goes about responding to queries.

Expand full comment
Ch Hi's avatar

While I think a "consciousness" is a necessary component of an AGI, it's not a part of the definition. And it may exist in sufficient form even if you don't choose to call that piece "consciousness".

FWIW, my definition of consciousness is a model of the universe which includes the modeller within the model. And self-consciousness is a model that is recursive enough to model it's reactions to the actions that it took earlier. (I'll note that as a consequence I consider a thermostat hooked up to a furnace or an air-conditioner a minimally conscious system, though not a self-conscious one.)

What's your definition?

Expand full comment
Dana's avatar

That's not what anyone means by consciousness, so that's a misleading way of using the word which can only serve to muddle an already-difficult conversation even further. If you want to talk about something modeling its surroundings, say "modeling."

Consciousness is difficult to define in this context because I assume that whatever words I use to define it, you'll be able to just interpret them in a way that extends them metaphorically to be compatible with *your* meaning, the same way you've done to the word "consciousness." If I say it means "awareness," you'll insist the thermostat is "aware." Etc. But I suspect you know perfectly well what I mean. I mean the thing David Chalmers means when he talks about the hard problem of consciousness.

Expand full comment
User's avatar
Comment deleted
Apr 2, 2023
Comment deleted
Expand full comment
Dana's avatar

Consciousness isn't that difficult a concept unless you're talking to people who have a whole schema for re-interpreting all language associated with consciousness as actually referring to something that *doesn't* involve consciousness. At that point it does become difficult, but that's not my fault.

A related problem arises in some mid-century philosophy of religion: there was this camp of folks who were in fact atheists, but considered themselves Christian. When the other Christians tried to protest that you can be an atheist if you like but you have to acknowledge that you've abandoned some key tenets of traditional Christianity there, D. Z. Phillips would deny that he was an atheist or anything less than orthodox, reinterpreting every damn word so he'd have a retort whenever someone accused him of not believing in the resurrection, or an afterlife, or in a real God, etc. (It's subtler than that makes it sound in practice, of course. And he obviously denied that he was reinterpreting anything.)

Anyhow, my point is, the problem that this produced for everyone attempting to have a proper philosophical discussion about the topic wasn't that "resurrection" is some indefinable concept that all the mainstream philosophers were hopelessly confused about. The problem was inflicted from the outside.

Setting that aside, you can't seriously believe that thermostats *feel* things? Nope, nope, I can't set that aside, because surely this is the D. Z. Phillips trap again. I know this story: you're reinterpreting "feel" to mean something that doesn't involve, you know, feeling. But then you'll deny that you're reinterpreting it....

Expand full comment
User's avatar
Comment deleted
Apr 3, 2023
Comment deleted
Expand full comment
Dana's avatar

I have never claimed to believe in p-zombies. All I have claimed is that consciousness (a) exists and (b) involves first-personal experience. This is a well-established fact, for any being with the capacity to (actually) consider the question. I have also claimed that thermostats don't have feelings, which I believe is a sensible position.

If you're the one claiming thermostats have feelings, I think *you're* the one believing in magic, not me.

Expand full comment
Ch Hi's avatar

Well, you're right when you presume that I consider the thermostat aware that it's time appropriate to toggle a switch, but you're wrong when you presume that I have any real idea as to what you mean. I'm *guessing* that your response means that you have no operational definition of consciousness, but I'm not certain.

How would you tell whether or not consciousness were present in a system where you could examine the states of all of the pieces?

Expand full comment
Dana's avatar

No, I don't have an "operational" definition of consciousness. I would of course need to come up with one if I were engaged in some specific type of research that involved making a determination of whether something was conscious, but all existing AIs are so obviously not conscious that no such efforts are required.

Not everything can be given an operational definition without distortion. Actually, I think maybe operational definitions are *usually* distortions, if confused for definition-definitions? But at least, consider for example concepts like "true" or "morally good." If you try to give an operational definition of either of those, you'll actually just be defining some *other* concept, and using that other concept as an "operational definition" for "true" or "morally good" would come with a major risk of forgetting that this more tractable "operational" concept is not the actual thing you initially set out to investigate.

Consciousness in particular is famously "subjective" in the sense that it's tied to (or, constitutes, really) a first-personal perspective. So the only person who directly experiences their consciousness is themselves; any operational definition you give for evaluating the presence of consciousness in another being would be getting at a side-effect of consciousness and not the phenomenon itself.

So, if I have no operational definition, how would I tell? I'd look at the thing and how it was acting, and examine the evidence. I have no decisive test I can give in advance, but I can think of scenarios that would convince me.

Expand full comment
JDK's avatar

I see your in the Hoffstadter strange loop camp.

Expand full comment
John Slow's avatar

What I understand of Tyler's argument: people like Scott have been wrong about technological innovations every time in history. Is it really so hard to believe that people like Scott are wrong again?

Scott's argument: This time really is different. Studying the situation more deeply will reveal itself to be so.

I of course buy Scott's argument. Cowen's argument is of the lazy "it's always happened this way before, so it will keep happening this way" variety. Which is generally right, until it is wrong.

Expand full comment
Ch Hi's avatar

The problem is that neither form of that argument is valid. People have thrown a coin and had it land on the edge. (Yeah, it's a really low probability, and has always [ISFAIK] depended on special environmental conditions, like a crack in the sidewalk. But how do you weigh existential risk?)

Expand full comment
MT's avatar

Why would we be afraid of 100-mile-long spaceship of aliens?

- We can't build a 100 mile long spaceship, but we want to, so they're better than us

- The ability to make spaceships implies huge power to shape the physical world, and threaten our existence

- We don't know anything about the aliens, but we recognize that they have a spaceship and they are some sort of entity that we can label "aliens", so its not a big leap to anthropomorphize

- Since they are human-like enough, we can say they have "intent" or "interest", and their presence outside Earth in particular means they are likely to be interested in us, take action regarding humans

So, can AGI do things that humans cannot? No, not yet. Does the thing it can do, which is vastly out of reach of humans, imply that it likely has the power to destroy us all? No obviously.

Is the AGI completely unknown and foreign like an alien? No, we built it. Is it so similar to humans and human-like its actions and the things it builds, that we can easily ascribe human motives to it (converting us to its religion lol)? No.

Did AGI travel across the stars to this specific pale blue dot, because it wanted to mess with humans? No.

So maybe this analogy isn't that great

Expand full comment
Deiseach's avatar

Maybe it's just the humour I'm in but right now I feel like *both* sides are massively wrong.

Pro-AI people: it's not going to be the fairy godmother that will fix poverty and all the ills that flesh is heir to. Yes, the world may well be richer after it, but like right now, those riches will go into the pockets of some, and not be distributed throughout the world so a poor Indian peasant farmer gets 500 shares in Microsoft's Bing which will give him a guaranteed income so he won't have to worry about getting through the next day. "But it will be so smart it will know things we can't know with our puny human brains and it'll make decisions in a flash and it'll be angelically incorruptible so we should turn over the government of each nation and the running of each nation's economy to it!"

Yes, and the profits out of that will go to the owners/investors/creators of the angelically incorruptible AI, not you and me.

Anti-AI people: you've been smugly posting about Pascal's Mugging as a rebuttal of Pascal's Wager for years, and now you are trying to Pascal's Wager people in order to prevent or at least slow down AI creation.

Pardon me while I laugh in Catholic.

That's not going to work, all the earnest appeals are not going to work, and you know why? Human nature. Whoever gets there first with AI that is genuine AI is going to make a fortune. That's why when OpenAI went in with Microsoft in order to get funding, all the principles went out the window and we now have Bing to play with if we want.

People want to work on AI because they are in love with the topic, they want to see if they can figure out what intelligence is and by extension what makes humans tick, they want to make a shit-ton of money, they do it for all the reasons that previous attempts at "maybe we should hold back on this" were given the digitus impudicus. I've banged on about stem cell research before, but the distinction between the objections to embryonic stem cell research and the permissibility of adult stem cell research were all ignored in favour of a flat "religious zealots want to stifle stem cell research which is the cure for all ills".

So good luck with what you're doing now, I expect it to have as much influence as the 2000 Declaration by the Pontifical Academy for Life on holding back research:

https://www.vatican.va/roman_curia/pontifical_academies/acdlife/documents/rc_pa_acdlife_doc_20000824_cellule-staminali_en.html

I am very sympathetic to the worries, believe me. But having been on the receiving end of decades worth of "You cannot hold back the onward and upward march of Progress and Science" (including advances in social liberalisation) tut-tutting and finger-wagging from those who want Science and Progress, I'm none too optimistic about your chances.

Expand full comment
Isaac King's avatar

> The base rate for things killing humanity is very low

This also fails basic anthropic reasoning. We could never observe a higher base rate.

Expand full comment
Isaac King's avatar

> There are so many different possibilities - let’s say 100!

I originally interpreted this as 100 factorial, since it seemed like you wanted a very large number and I didn't see why else an exclamation mark would be there. Then when the sentence later mentions 1% I had to pause and realize that you just meant this 100 was very exciting.

Expand full comment
Deiseach's avatar

"But I can counterargue: “There have been about a dozen times a sapient species has created a more intelligent successor species: australopithecus → homo habilis, homo habilis → homo erectus, etc - and in each case, the successor species has wiped out its predecessor. So the base rate for more intelligent successor species killing everyone is about 100%”

I doubt I'm the first to point out that previous hominid species did *not* "create" successor species (unless we're arguing for the creation of Adam and Eve by God). What we got was evolution, natural selection, and what seems to be a combination of absorption (we've got Neanderthal DNA) and out-competing the less capable species.

And yeah, like chimpanzees, that probably did involve a lot of whacking the rival band over the heads until they died and then we moved in and took all their stuff. But they got the chance to whack us over the heads, too.

Expand full comment
JDK's avatar

Let me just add to "(we've got Neanderthal DNA)" It's not just we contemporary humans have Neanderthal DNA we are 99.7% identical on average!

We've got 50% or some high number of DNA also found in bananas!

There was no battle of homos. (We homos might have self domesticated ourselves. But a Chihuahua and a great Dane are both dogs. There were no great dog wars.)

Expand full comment
Deiseach's avatar

Let us strive to achieve the Great Banana Republic, so that all may flourish like a banana tree! Or commercial banana cultivar, at least.

Setting aside all differences of race, creed and colour, we can agree: the true colour of humanity is yellow (like a banana).

Expand full comment
JDK's avatar

This is where Vonnegut would have a great Bokonist calypso to enlighten us.

In my head I'm thinking something in the vein of "homonid smart, banana smarter".

Expand full comment
Purpleopolis's avatar

There are however, absolutely massive ant wars

https://www.youtube.com/watch?v=cqECNYmM23A

Expand full comment
AReasonableMan's avatar

I think Tyler's argument is about how to deal with the coming of AI psychologically, rather than making definitive claims about what the magnitude of the risk is.

I don't think he's saying that things will be fine. He's saying we should continue to work on alignment but should realize that progress towards generally intelligent AI will happen anyway, so we should accept this thing we cannot change.

Expand full comment
Drea's avatar

I hate posts and threads. Anyone up for porting this to Kialo or Loomio?

Expand full comment
Deiseach's avatar

"We designed our society for excellence at strangling innovation. Now we’ve encountered a problem that can only be solved by a plucky coalition of obstructionists, overactive regulators, anti-tech zealots, socialists, and people who hate everything new on general principle."

Frank Herbert got there before you - the Bureau of Sabotage, motto: "In Lieu Of Red Tape":

http://www.fact-index.com/b/bu/bureau_of_sabotage.html

"In Herbert's fiction, sometime in the far future, government has become terrifyingly efficient. Red tape no longer exists: laws are conceived of, passed, funded, and executed within hours, rather than months. The bureaucratic machinery has become a juggernaut, rolling over human concerns and welfare with terrible speed, jerking the universe of sentients one way, then another, threatening to destroy everything in a fit of spastic reactions. In short, the speed of government has gone beyond sentient control (in this fictional universe, many alien species co-exist, with a common definition of sentience marking their status as equals).

BuSab begins as a terrorist organization, whose sole purpose is to frustrate the workings of government and to damage the incredible level of efficient order in the universe in order to give sentients a chance to reflect upon changes and deal with them. Having saved sentiency from its government, BuSab is officially recognized as a necessary check on the power of government. First a corp, then a bureau, BuSab has legally recognized powers to interfere in the workings of any world, of any species, of any government, answerable only to themselves (though in practice, they are always threatened with dissolution by the governments they watch). They act as a monitor of, and a conscience for, the collective sentiency, watching for signs of anti-sentient behaviour and preserving the essential dignity of individuals."

Expand full comment
Michael Kelly's avatar

I'm on my phone. Let's see if I can recall the Frederich Bastiat quote concerning Luddites. It will be something sloppy ... ho ho, instead I found online a pdf of WHAT IS SEEN AND UNSEEN. Bastiat is speaking with a voice of irony. Lancashire is the technological leader, and Ireland the technological loser of Bastiat's day. The Luddites, followers of Captain Ludd are breaking machines the people fear will displace humans doing brute force work.

"Hence, it ought to be made known, by statistics, that the inhabitants of Lancashire, abandoning that land of machines, seek for work in Ireland, where they are unknown."

In today's world we would say: Looking at the statistics, people flee Silicon Valley to seek work in Mississippi where AI is unknown.

Expand full comment
Michael Kelly's avatar

While yes, some murders likely happened between different species, but to say one species killed off the other is like saying the 80386 was murdered by the 80486.

Expand full comment
Scott Alexander's avatar

Or like saying that humans killed [list of recently extinct species]. Most of the time it isn't humans literally hunting them. It's destroying habitats, removing their food sources, etc. I imagine the same was true of most other hominids.

Expand full comment
Michael Kelly's avatar

Of all the species which we have caused the extinction, more than 95% were species which existed only on one island.

The exception being things like the Passenger Pigeon, but they only mated in large crowds, and they congregated in cities making a huge mess. These problem birds were eradicated in the cities, but lacked non-city populations.

Expand full comment
JDK's avatar

No.

Expand full comment
NLeseul's avatar

1) Human brains have a cognitive bias that assigns a higher-than-rational probability to danger and disaster, especially when encountering novel situations or "outsider" agents. To counteract this bias, we should apply unusual skepticism to any prediction a human makes of impending doom.

2) Given a situation which may turn out to be an actual danger, but where we have too little information to predict the nature of the danger with any precision, any mitigation measures we might choose to take are just as likely to increase the danger as to reduce it. Therefore, elaborate planning of mitigation measures against a completely unpredictable danger is not an efficient use of resources.

Or, translated into human-ese: "It's no use worrying about it. Everything will probably be fine."

Expand full comment
Michael Kelly's avatar

On the REFERENCE CLASS TENNIS. Overall every new technology has killed some small number of people in it's own unique way, but improved our lives immensely.

Fire; stacked rock homes; boats; keeping livestock; steam engines; locomotives; airplanes; self driven cars; ... etc.

This is our reference, AI will harm some small number of people in it's own unique way, but improve human life immensely.

Expand full comment
Cosimo Giusti's avatar

AI may not itself exterminate humanity, but it may accelerate its destruction.

Given the kind of adaptations made after the meteor strike that took the dinosaurs, life on earth will survive in some form. New forms of plant and animal life will evolve and flourish. The planet just won't be dominated by a violent, narcissistic species.

Expand full comment
Anonymous Coward's avatar

> If you have total uncertainty about a statement (“are bloxors greeblic?”), you should assign it a probability of 50%.

Strong disagree. You should assign some probability, maybe 80%, to “in some important senses yes, and in some important senses no”, because most things are like that. Then maybe 10% to each “in every important sense yes/no”.

Expand full comment
dryer lint's avatar

Don't worry; I'm only killing the evil ones.

Expand full comment
User was indefinitely suspended for this comment. Show
Expand full comment
Purpleopolis's avatar

Is the Safe Uncertainty Fallacy just the Precautionary Principle pointed in the other direction?

Expand full comment
L. Scott Urban's avatar

I agree, you are having a tough time steelmanning Tyler. Maybe I can help?

1. Humans have a long and storied history of overemphasizing the negative aspects of any radical change within their society.

2. AI advancement denotes a radical change in our society.

3. We are acting accordingly, without accounting for our natural bias.

His argument is reactive. He believes that, given the amount of evidence we currently have access to for any given AI outcome, we are giving too much credence to negative ones. That's why he doesn't need to give his own percent chance on whether we are all gonna die. The issue at hand is the predominantly the human psyche, not the available evidence, which is sparse by its very nature.

"Are bloxors greeblic?" is also an inadequate example for this case. Closer to Cowen's argument would be: Assuming that each of these terms are mutually exclusive, "Are bloxors greeblic, speltric, meestic, garlest, mixled, treeistly, mollycaddic, stroiling, bastulantic, or phalantablacull?" Given complete uncertainty on this, we would predict a 10% chance for any given term. But for whatever reason (because we don't want the apocalypse to occur, and/or radical change frightens us), we are giving greeblic a 50% chance.

I think the core of Cowen's argument isn't to say that everything is fine (he mentions his support for working on alignment), but instead to emphasize the outcomes which are fine/good over the outcomes which are bad. Like I said, his argument is reactionary. It wouldn't be made if greeblic were appropriately low. Greeblic needs to come back down to 10 or 15%, while mollycaddic, garlest and bastulantic need to be boosted to 8 or 10%. The evidence is sparse enough that these should be equals, or very close to it.

Expand full comment
Eucalypso's avatar

“We designed our society for excellence at strangling innovation.”

I know this is a half joke, but it looks like it’s time for someone’s belief system to update. Nobody is saving us, and the market system that made life in America briefly grand is about to obsolete humanity.

Again: see you on the other side.

Expand full comment
Elohim's avatar

You can apply this argument to any new technology and stop technological progress altogether. When the LHC was about to be started up, a group of people started saying that it would destroy the world. In fact, their arguments closely mirrored Scott's own here. One proponent claimed that there's 50% chance that the LHC will destroy the world since we don't know whether it will or it won't (https://www.preposterousuniverse.com/blog/2009/05/01/daily-show-explains-the-lhc/). It was a good thing that CERN didn't listen to them.

Similar fears about GMO crops, vaccines etc. have done enormous harm by slowing down progress.

In all of these cases the basic fallacy is to confuse possibility with probability. Yes, bad things are possible but they're not necessarily probable. One should also consider the civilizational cost of restricting transformational technology. One could have made similar arguments about electricity, computers, and all kinds of technological advancement. That those inventions had overwhelmingly large positive effects should give strong prior that AGI will have too.

Expand full comment
GunZoR's avatar

But in the case of the LHC, didn't plenty of theoretical, mathematical, and experimental physicists at the time calculate the probability and say that it was morally impossible (i.e., possible in principle but so low as to be effectively impossible)? The difference here is that we have got no idea what the probabilities are (or even what the true possibilities are); and common sense tells us to be careful. Why? Because it is common sense that, if possible, creating a superintelligence relative to us is fraught with risks. This common-sense reasoning isn't relying on a fallacious argument from possibility of the form "ending the world through the creation of AGI is possible; thus, it is probable."

Expand full comment
Arnold's avatar

There is going to be regulation soon anyway. Section 230 does not protect AI service providers and the potential liabilities, criminal and civil, are enormous.

If AI is used to assist in committing a crime, hacking, phishing, libel, etc. that isn't protected speech, it's aiding and abetting. See the famous Hitman case for an example of book publisher being held liable for aiding and abetting murder.

Congress is highly unlikely to give AI service providers the same blanket protection they gave internet publishers.

Expand full comment
JohanL's avatar

Basically this. All the normal types of problems can be handled within normal regulation. Only magical AI that can infinitely bootstrap its intelligence or conjure grey goo nanobots out of nothing presents a risk.

Expand full comment
shako's avatar

bloxors are *absolutely* greeblic

Expand full comment
Matt Halton's avatar

bloxors are greeblic as hell dude

Expand full comment
Thoth-Hermes's avatar

You have another fallacy I could give a name to as well: "Sure, solving a problem using [a bunch of terrible things we all hate, like regulation] doesn't sound good. But society was *designed* to only be good at such things, namely, solving problems of its own creation. Therefore, we can only solve this problem using [terrible things we all hate, like regulation]."

Expand full comment
heiner's avatar

Tyler is right, Scott is wrong. But boy can he write! Bravo!

Expand full comment
Cups and Mugs's avatar

I struggle with this for zero hedge’s blog. I often see interesting sounding headlines being quoted from their site on Twitter, but then every single time I’ve ever gone directly to their site to read what they’ve been posting, it has had a near zero percent interest rate for me. A true base rate conundrum! So I’ll continue with the seemingly contradictory expectations that quoted headlines from zero hedge will often be interesting, but their site will almost never be interesting. The power of curation can turn lead into gold. Due to this I have zero nit picks and widely agree with Scott today, despite my base rate being that I’d expect to be able to quibble on something in the post.

MR is being silly and using rhetoric in response to a perceived hysteria. I think they have a base rate prediction that all or almost all hysteria is wrong, based on measuring how often hysterical people are right. Being anti hysteria works most of the time? But alas in situations where things are a true emergency, hysteria is the expected thing to observe and occur ain almost all cases! Reference tennis indeed!

Expand full comment
Cups and Mugs's avatar

I find the whole thing to be like we are discussing a super nuke which will immediately go off the moment it is built and will then kill everyone.

And we are in the weeds talking about how long it’ll take to build it or if we should just rush to build it with a half formed idea of how to make sure it doesn’t immediately detonate. Or if it will even be a super nuke at all, so it is ok to rush into building it as it is probably something else?

We all seem to agree there is a non-zero chance it is a super nuke and everyone dying is bad. Calls for caution, even extreme caution seem warranted and shouldn’t be shouted down in my view.

If we swap out the words ‘1% or higher chance of a super nuke that immediately goes off’ with ‘misaligned agi’ do we feel any different about existing arguments? Perhaps we should not feel nothing or cold about this topic of maybe everyone dying due to this technology which may be uncontrollable once it is made and we only get one chance.

No argument will make this a cold clinical conversation for me, considering the risks. Or how few people seem to get a say/ have the power to decide what happens with this potentially life ending technology.

Expand full comment
James's avatar

> If we swap out the words ‘1% or higher chance of a super nuke that immediately goes off’ with ‘misaligned agi’ do we feel any different about existing arguments?

Seems reasonable, but you would also probably want to add the X% chance of it being whatever a super nuke in a positive direction is, and the X% chance of it being a whatever a regular nuke in the positive direction is etc etc. So it's not as obviously a slam dunk to beat the existing arguments IMO.

Expand full comment
Matt Halton's avatar

Technically there's a non zero chance that the potato I'm about to microwave is a super nuke. Still going to microwave it as the upside is I get to eat a potato.

Expand full comment
JDK's avatar

No. Are you claiming we live in a world where nothing is impossible?

Expand full comment
Petrel's avatar

I think that fundamentally, this boils down to the (at least as old as Seneca - it's the oldest source I'm familiar with gesturing at this) "negativity bias" idea - your fears and anxieties are only limited by your imagination, so if you let it run wild you will live in fear all the time, and even if this turns out to be great at making you survive, the relatively long life you live will not be enjoyable.

In other words, more specific to our case: it's not that "we have no idea what will happen, therefore, it'll be fine" - it's that "we have no idea what will happen, therefore, we utility-maximize by living as if it'll be fine. And if we are unlucky and the doomsayers were correct and we all die - at least our run was good while it lasted, and we hope the suffering will not be long."

Expand full comment
sk's avatar

In our current age of hysteria, people express extremes seeing great benefit or great harm.

Expand full comment
Freddie deBoer's avatar

You have a classic combination of apocalypticism and utopianism; you think deliverance is always around the corner, one way or another. Which is understandable but unhelpful. And what you want to be delivered from is the dull grind of boring disappointing enervating daily life. But you'll never be freed from that, not by AI or the bomb or the Rapture. Because we're cursed to mundanity; that's our endowment as modern human beings. The trophy we're awarded for all that technological progress is that we live in an inescapable now.

Please dramatically increase your own perception of your own anti-status quo bias. Please consider how desperately you want this age to end and a new one to begin. And consider that the most likely outcome is always that tomorrow will be more or less like yesterday.

Expand full comment
JDK's avatar

Immanitizing the eschaton.

Maybe Voegelin's writings have some useful insights.

Expand full comment
Matt Halton's avatar

What's scary is not the thought that the world might end, but that it won't, and we'll all just have to keep doing the same shit over and over again for the rest of time.

Apocalyptic thought increases as capitalism stagnates. We're constantly having moments of crisis but they never go anywhere and never resolve the underlying tensions that gave rise to them. Nobody believes in a solution so there's no catharsis, just endless recapitulation of the same tired points. Hence millenarian fantasising about the inflection point when it all changes. Matt Christman has ranted about this a bunch I think.

Expand full comment
Aziz Hatim's avatar

"In order to generate a belief, you have to do epistemic work. I’ve thought about this question a lot and predict a 33% chance AI will cause human extinction"

Currently, the probability iz zero, because LLMs of the sort underlying ChatGPT are flat out incapable of ever being "better than human" or "causing human extinction". The foundation for the technology requires human input for training. That fact alone is sufficient to render the probability zero.

Is there another technology besides LLMs that could give rise to AI? currently, no. Such a class of technology falls into the same realm of speculative fiction as hyperspace travel. Unlike difficult technologies like cold fusion, room temperature supercon, and genetic engineering, there has been *zero* effort made in creating new technologies for these speculative fiction niches.

My bias is that I have long been a skeptic of AI (and its cousin, the Singularity) as you can see from this old essay of mine here:

https://www.haibane.info/2008/03/02/singularity-skeptic/

and I have not seen anything in the last 6 months that addresses the core criticisms I made therein.

Expand full comment
Ch Hi's avatar

While what you say is nearly true of LLMs, they won't continue to exist in isolation. Even and LLM could end humanity by acting indirectly (i.e. convincing someone with sufficient power to do so). And one could conceivably do so without any intention of doing so. Their fantasies are reported to be quite convincing.

However, as of now MS is working at adding the ability of OpenAI to control things over the internet. Currently this is just in the form of API interfaces available for programmers to access...but it's difficult to believe that things are intended to stop there.

So 0% is the wrong answer. It's the wrong answer even now, though I would put it down around 1 in 10^200 (that's a really wild guess). But "currently" isn't next month, and that's unpredictable. (Who expected Stable Diffusion to pop up so quickly?)

Expand full comment
Robert Leigh's avatar

The problem is the lack of a definition of what AI extinction will actually consist of, but not all routes to it demand genuine, qualia rich, balls out intelligence of the sort you (probably rightly) think LLMs are incapable of. AI theory in general takes a hopelessly naive man (good) vs machine (bad), Sorceror's Apprentice, Terminator kind of approach. In reality, some human actors are going to be aligned intentionally or not with The Machines. Manipulating humans (into thinking "this was written by a human" when it wasn't, etc) is the core LLM skill. This is why "airgapping" nuclear launch codes won't work, because ALs can work on (deceive/bribe/blackmail) the meat which constitutes the airgaps. Are you saying this can't happen, or can happen but won't count?

Expand full comment
Jack's avatar

> Existential risk from AI is indeed a distant possibility, just like every other future you might be trying to imagine. All the possibilities are distant, I cannot stress that enough. The mere fact that AGI risk can be put on a par with those other also distant possibilities simply should not impress you very much.

This is almost equivalent to 'it's 50/50, either it happens or it doesn't'. Dismal reasoning from TC.

Edit: I wrote this comment immediately upon reading the above quote, and didn't see that Scott had addressed it more cogently (and charitably) than I did.

Expand full comment
Sebastian's avatar

My #1 issue with this version of the AI-nihilist argument is that you could make an identical argument in favour of investing tons of money in signalling as loudly as possible towards potential alien civilizations.

After all if they are interstellar and willing to share tech with us that would be hugely advantageous, we've never gone extinct so we never will and China wants new technology as well. Besides how could the government ever prevent a private corporation from launching sattelites or funding radio transmitters??!?!

Expand full comment
beowulf888's avatar

Scott:

What puzzles me is how little the AI cognoscenti seem to be aware that there are formalized methods for risk assessment and mitigation. EY's recent suggestion that we should be willing to preemptively destroy any foreign power's data center that does an AI training run seems a bit over the top to me—not because he sees AI as an existential risk—but that he seems to assume that there's no way to mitigate the risks of AI. EY is a very smart person, but it amazes me that he hasn't looked into the past 50 years of risk assessment and mitigation practices as practiced by NASA, DoD, AEC and NRC. (Maybe he has considered these and dismissed them as being insufficient, but I've not seen any mention of them in his writings.)

Likewise, as a psychiatrist, you must be at least passing aware of the Johari Window of known knowns, known unknowns, unknown knowns, and unknown unknowns. Risk management experts took the Johari Window (developed by psychologist Joseph Luft and Harrington Ingham back in the 1950s) and built a risk assessment methodology around it.

It seems to me that the risks of AI fall into the Johari Window's known unknown category. Known unknown risks are a category of risks that organizations generally face. These risks are called known unknowns because the organization is aware of the existence of such a risk—however, they are either not able to estimate the probability that these risks will materialize or to quantify the impact of these risks if they materialize.

BTW, you seem to be making the mistake of assigning a probability to a known unknown. Assigning a probability to a known unknown is considered a bad strategy because it will distort our risk mitigation planning. Risks for which we have priors for are known knowns, and those can be assigned at least a provisional probability by their priors. Known unknowns are risks we don't have any priors for, therefore any probabilities we assign to them would likely be wrong—and we might spend our risk mitigation efforts on the wrong threat. But just because you can't assign a probability to known unknowns, you can still (a) rank them in order of their *relative* likelihood, and (b) you can certainly plan mitigation strategies for known unknowns.

For instance, let's make a list of some of the known unknown risks to modern civilization. It's known that these *could* happen—but the probability of them happening is indeterminate—and the risk mitigation strategies for each would vary in difficulty and application.

1. Asteroid impact

2. Large-scale outflows from magma traps

3. Large-scale nuclear exchange

4. Anthropogenic Global Warming

5. AI

We'd probably rank 1 and 2 as being less likely than 3, 4, and 5 (at least in the near-term). Likewise, mitigation strategies for 1 and 2 may be difficult if not impossible (but nuking an asteroid is probably easier than stopping the Yellowstone magma cache from inundating the Western US and releasing gigatonnes of greenhouse gasses). Whether one ranks this higher than the risk of AI or AGW, well, that's up for discussion. AI might get a higher risk rating because we've gone 70 years without a large-scale nuclear exchange, and AGW being severe enough to be a civilization-ending event is still a ways off. But risk mitigation strategies for these three scenarios are less likely to be a waste of time than 1 and 2. Anyway, you might disagree with my assessment, but I think you can understand my point.

So, my question is what are the mechanism that a sufficiently powerful AI could utilize to make humanity extinct? These mechanisms are those that we would want to concentrate our risk mitigation strategies on.

For instance, to avoid the SkyNet scenario making sure nuclear launch systems are air-gapped from the Internet would be one of several mitigation strategies. I don't know enough about our nuclear command and control systems, but this seems like an analysis that Rand or Mitre could take on (if they haven't already done so).

Ultimately, I'm sort of vague about the other ways AI could make humanity extinct. Has anyone got any scenarios they'd like to share?

Expand full comment
JDK's avatar

This seems like a more fruitful way to think about this than assigning some really arbitrary number of so-called probability.

It also is somewhat reminiscent of NN Taleb thoughts about Tetlock and predicting.

Expand full comment
Goldman Sachs Occultist's avatar

>So, my question is what are the mechanism that a sufficiently powerful AI could utilize to make humanity extinct? These mechanisms are those that we would want to concentrate our risk mitigation strategies on.

Could a chimpanzee tribe protect themselves against humans?

Expand full comment
beowulf888's avatar

So, you're hypothesizing that a malevolent AI could think up new technologies that are far beyond our ken to destroy humanity? Of course, no matter how intelligent that AI is, it would have to have access to some sort of manufacturing site and supply chain network to implement those new technologies. Someone would probably notice. But maybe it could take over the Boston Robotics robots and use them as soldiers to defend the facility.

Or are you thinking it would be so smart that it could trick us into killing ourselves?

Expand full comment
Ram's avatar

This is relevant to my interests. Any good surveys of this frame for thinking about risk?

Expand full comment
beowulf888's avatar

When I was trying to get my PMP cert, I had a seminar in risk assessment and mitigation. The Johari Window is well-known method of categorizing known and unknown risks.

This popped up when I searched for risk assessment on Amazon.

https://www.amazon.com/Risk-Assessment-Framework-Successfully-Uncertainty-ebook/dp/B07ZML9GW5/ref=sr_1_1?crid=2RQNZ8QY8YOXJ&keywords=risk+assessment+book&qid=1680498278&sprefix=risk+assessment%2Caps%2C183&sr=8-1

Expand full comment
MicaiahC's avatar

I don't think a world in which OpenAI is releasing plugins, where people are just randomly plugging in the output of GPTs into shells is a world rapidly approaching one where AI has very little control over physical actuators or production chains.

Are you willing to make bets about how many years before at least one factory has large amounts of automation directed by AI? I don't necessarily think it's close, but when I think about betting that it's at least ten years away, would make me extremely nervous about losing my money.

But yeah, I mean, like, the comparison between """one""" AI isn't between one human or even one group of humans, but more like an entire civilization; between the ability to take over other systems with Zero day bugs, the fact that both algorithmic and hardware improvements have been on the exponential trajectory means that if the AI is much faster at making those types of discoveries, so long as these discoveries are still "on track" for humans, means that you can't just assume that they top out close to humans.

The other thing is that at least in humans, it appears that there's some sort of trade off between aesthetic ability, social ability and technical ability. So whatever intelligence comes up in your head, it's likely to be very distorted, relative to likely future AIs.

So the appropriate question is "would a civilization of beings much smarter than us pose a threat" and... yeah. Hard to see how to answer no on that.

The main problem with me answering this is that I am obviously a human, and that 1. I will not know what the constraints on a problem are, so I have to hand wave it and 2. The ability to think better, almost by definition means the ability to think more effective and different thoughts. If this is a problem then there is no possible pathway to convince you short of someone actually taking over the world.

But:

1. Develop Nanotechnology and everyone dies after someone mixes together some proteins. Probably relies on reading through nanosystems and the accompanying debates to say whether this is actually feasible.

2. Develop biotechnology sophisticated enough to design i.e. airborne ebola, COVID 2024a,b,c,d,e as well as avian, pig flu and so on. This wipes out / disorients the humans enough that the AI can take over enough factories and form an industrial base that renders humans irrelevant.

3. Same as above, except destabilizing nations socially into a nuclear exchange beforehand. Thinking something like exposing politically scandalous facts in a way that frames some other party, or being generally persuasive, or a false flag operation or so on.

4. Being so integrated into the economy that it effectively controls all important avenues to power. Talking about this disempowerment is likely to be viewed as luddite or hysterical. (this is where Paul Christiano is at.)

I know for sure these aren't at the level of detail that you'd like, but I don't know what someone who says this expects. Do they want me to dig up actual government official secret that could be used to blackmail them? A working nanotechnology plan? An explanation for why being a multibillionaire would allow some entity access to a factory floor's manager? It's just not clear to me. In general I see responses along the lines of "well someone would notice" well, sure, but why would that someone's opinion matter? And why is it that you get the ability to notice that someone would notice, but not the hypothetical AI try to plan around these types of constraints? I feel the response to this is "well this is how you persuade yourself into this type of tomfoolery" but it feels like the responses I can give are just regular human level non-sophisticated responses and not even super human ones.

Expand full comment
D Moleyk's avatar

>You can’t reason this way in real life, sorry. It relies on a fake assumption that you’ve parceled out scenarios of equal specificity (does “the aliens have founded a religion that requires them to ritually give gingerbread cookies to one civilization every 555 galactic years, and so now they’re giving them to us” count as “one scenario” in the same way “the aliens want to study us” counts as “one scenario”?) and likelihood.

Yes, that sort of enumeration approach would be silly. However, maybe the most silliest interpretation is not warranted?

Expand full comment
Calvin McCarter's avatar

> But I can counterargue: “There have been about a dozen times a sapient species has created a more intelligent successor species: australopithecus → homo habilis, homo habilis → homo erectus, etc - and in each case, the successor species has wiped out its predecessor. So the base rate for more intelligent successor species killing everyone is about 100%”.

If you believe in the categorical imperative, then homo sapiens should wipe out AGI iff homo erectus should have wiped out homo sapiens. Humans should be grateful that we were not wiped out by other species threatened by our intelligence, and perhaps we should pay it forward.

Expand full comment
Goldman Sachs Occultist's avatar

Pay what forward? There's no morality at play here and there never was. It was always about power, and the strong survive. We don't owe AI anything.

Expand full comment
Calvin McCarter's avatar

"Society is indeed a contract. … It is to be looked on with other reverence, because it is not a partnership in things subservient only to the gross animal existence of a temporary and perishable nature. It is a partnership in all science; a partnership in all art; a partnership in every virtue and in all perfection. As the ends of such a partnership cannot be obtained in many generations, it becomes a partnership not only between those who are living, but between those who are living, those who are dead, and those who are to be born. Each contract of each particular state is but a clause in the great primeval contract of eternal society, linking the lower with the higher natures, connecting the visible and invisible world, according to a fixed compact sanctioned by the inviolable oath which holds all physical and all moral natures, each in their appointed place."

Expand full comment
Level 50 Lapras's avatar

The opposite of this fallacy is of course also a fallacy. "We can't know for sure what will happen" doesn't mean "therefore certain imminent doom".

Presumably Tyler has an estimate for the risk based on "epistemic work" as you observed. It's just much lower than your own estimate. But it's not feasible to recapitulate the entire debate every time it comes up, and in fact you've done the same thing. *You* didn't prove any arguments for your 33% figure here either, except very briefly in the "reference class tennis" aside.

Expand full comment
Goldman Sachs Occultist's avatar

He's not saying "We can't know for sure what will happen therefore certain doom", and he doesn't have to provide an argument here for the 33% estimate. Scott's criticism of TC is valid (or not) independent of his reasoning for AI risk probabilities.

Expand full comment
JC's avatar

Not sure if these points have already been made, but

(1) The most likely outcome is not that AI will wipe out humanity, but that AI will be harnessed by a few to completely dominate the rest. This is a frankly more miserable outcome as well.

(2) I haven't seen anyone use this metaphor, so ...

When I was an undergrad many moons ago, we had Friday "precepts" where a small group of students sat around with a grad student to discuss the week's readings. Inevitably one or more students (sometimes me) would try to BS their way around the fact that they hadn't done the readings, by parroting what everyone else was saying.

That's ChatGPT - the lazy undergrad who has done no actual reading (as in, conceptual construction) and is simply repeating what everyone else has said. In other words, whether true or false, everything that ChatGPT says is bullshit in the technical sense of "lexical tokens without content"

Thing is, the BS method works some of the time for undergrads, and it will work some of the time for ChatGPT, unless we categorically reject its output as bullshit that is sometimes accidentally accurate.

Expand full comment
spork's avatar

Pyrrhonean Skeptics in ancient Greece landed on a similar position: If you can't produce a reason for any particular course of action being the right thing to do, should you do nothing? No, idleness is as unjustified as any of your other options. You're gonna do something, and it's gonna be unjustified, because something other than reasons will make you act. Four (unjustified) movers named by Sextus were 1. feelings/appearances, 2. instincts, 3. customs, 4. training.

The upshot is that the skeptic just gets blown around by natural and social forces and is wise enough to just be chill about it. Sextus was basically the ultimate anti-rationalist in the Yudkowsky sense. He made fun of Diogenes, who allegedly reasoned thus: Masturbating isn't wrong, and if you're doing nothing wrong, there is no shame in doing it in the marketplace. Some days I feel like Yudwowsky is the new Diogenes, this blog is our community marketplace, and there is a lot of open masturbating around here! Not saying it's wrong, of course. Fap on, intrepid rationalists, and feel superior to the normies who refuse to join in out of an irrational adherence to custom.

Expand full comment
Kalimac's avatar

I think Tyler Cowen's argument is not exactly the fallacy it's being depicted as. To me it reads like this,

1. We can't possibly know, or even guess accurately, what the impact will be.

2. Therefore, nothing we can say will be of any value.

3. This may be functionally equivalent to saying that the chance of a major impact is zero, but that can't be helped.

Expand full comment
Nate's avatar

This was exactly the point I wish eliezer had made on the lex fridman podcast.

Expand full comment
Simon Mendelsohn's avatar

Nit-pick: "I said before my chance of existential risk from AI is 33%; that means I think there’s a 66% chance it won’t happen." 33 + 66 is 99. Maybe you mean 67%?

Expand full comment
Yuri Bakulin's avatar

interpretation #4: the situation is so uncertain that we don't even know what is safe. maybe the safest option is to race towards AGI because it'll save us from the 100 mile long alien ship? maybe the aliens are coming to save us from the AI?

Expand full comment
King Nimrod's avatar

It's still unclear to me why P[bloxors \in greeblic] = .5? If I am clueless with respect to a proposition, even a binary one like this, am I not licensed to assign any probability I like? In what sense can my credence be wrong, without introducing evidence?

Expand full comment
Donald's avatar

Suppose you start off using a probability of "unknown". You gradually gather clues about what these words might mean.

Once you have assigned a probability, you can update it using bayes theorem.

So, which clue gets you from "unknown" to a probability?

No clue, it's probability from the start.

There are 2 questions here. What probability you assign now, which is what you would use if omega offered you a bet right this instant. (Omega offering the bet is independent of the odds, unlike human bookies who might know more than you)

The second question is how does your knowledge compare to other peoples.

Expand full comment
Pave2112 (CT)'s avatar

Better to ask why you care so much about this one possibility

Expand full comment
Cqrjnkz, The Dumb Guy's avatar

This argument becomes less fallacious if you change "there is no reason to worry" to "there may be a reason to worry, but due to our ignorance of the situation doing so would be a waste of time and energy since worrying is only useful when it motivates productive action and we have no idea what a productive action would be". For an example of this philosophy in action, watch the final scene of Lars Von Triers "Meloncholia"

Expand full comment
michael michalchik's avatar

LW/ACX Saturday (6/9/23) Your Brain, who is in control.

Hello Folks!

We are excited to announce the 29th Orange County ACX/LW meetup, happening this Saturday and most Saturdays

Host: Michael Michalchik

Email: michaelmichalchik@gmail.com (For questions or requests)

Location: 1970 Port Laurent Place, Newport Beach, CA 92660

Date: Saturday, June 10, 2023

Time: 2 PM

Conversation Starters:

https://www.pbs.org/wgbh/nova/video/your-brain-whos-in-control/

Your Brain: Who's in Control? | NOVA | PBS

C) Card Game: Predictably Irrational - Feel free to bring your favorite games or distractions.

D) Walk & Talk: We usually have an hour-long walk and talk after the meeting starts. Two mini-malls with hot takeout food are easily accessible nearby. Search for Gelson's or Pavilions in the zip code 92660.

E) Share a Surprise: Tell the group about something unexpected or that changed your perspective on the universe.

F) Future Direction Ideas: Contribute ideas for the group's future direction, including topics, meeting types, activities, etc.

Expand full comment
Shane Legg's avatar

I recently talked to Tyler as I wanted to try to understand his logic. The key point is that he doesn't think that ASI (yes, ASI, not just AGI) will be more significant than the printing press, or electricity, or the internet. Like the printing press etc., it will cause all sorts of disruption in the world, some of it pretty profound, but it's unlikely to kill everyone. His belief is that the power and utility of intelligence diminishes rapidly above human level.

Expand full comment