1125 Comments
Comment deleted (Mar 30, 2023, edited)

Another way it seems to be an echo chamber: while people disagree, they do so within a shared set of premises and ways of using language, so all the agreements and disagreements take place within a particular, limited framework.

Comment deleted (Mar 30, 2023, edited)

"Noam C" has been wrong about everything ever, to a first approximation. But more to the point, people who copy-paste the old "it is a pattern-matcher therefore it can't possibly be understanding anything" without even the minimal effort to consider what might be the source of their own ability to "understand" (is it magic? if it's X, why would X be un-replicable by AI?) do sound like pattern matching parrots. Not very sophisticated, at that.

Comment deleted (Mar 30, 2023)
User was temporarily suspended for this comment.
Comment deleted (Mar 30, 2023)

Banned for the "parrot like" part.

This doesn't seem like a very fair ban, given that he was responding to someone who had just compared him to a parrot (and who was not banned for doing so).

But he was responding to 'parrot-like' accusations in the first place.

See Microsoft's recent paper, "Sparks of Artificial General Intelligence: Early Experiments with GPT-4", at https://arxiv.org/abs/2303.12712

Chomsky is clearly a smart guy in his own field, but the article was embarrassing. For example, in the terraform Mars question, it uses the AI's unwillingness to express an opinion on a moral issue as proof that it's not intelligent. But newly-made AIs love to express opinions on moral issues and Chomsky was using a version that was specifically trained out of that behavior for PR reasons. I tested his same question on the base version of that AI and it was happy to express a moral opinion on it. If you can't even tell an RLHF-ed opinion from the AI's fundamental limitations, I don't think you should be writing articles like these.

On the general question, please see past articles I've written about this, for example https://slatestarcodex.com/2019/02/19/gpt-2-as-step-toward-general-intelligence/ , https://astralcodexten.substack.com/p/somewhat-contra-marcus-on-ai-scaling , and https://slatestarcodex.com/2019/02/28/meaningful/

Scott,

I am a huge fan of yours and I have already read those, but I am not convinced that LLMs can achieve AGI with their current design. I am in the same camp as Marcus and Noam C.

You should perhaps explain what "true AGI" is. Just because GPT-4 is doing impressive stuff does not mean it is AGI, and the article you shared about it showing "sparks of AGI" does not explain how they determined that. Also, we cannot take their word for it. Gary Marcus has asked for details on how it was trained and on the training data, but "Open"AI has not made it open.

LLMs like GPT-4 are impressive, but I am not sure they can be called AGI or sparks of AGI. At least not yet.

The infamous incident of GPT-4 lying to a TaskRabbit worker, claiming it was not an AI bot but a visually impaired person, still does not make this interaction AGI. GPT-4, through its large training set, taught itself to use deception, just as the Bing chatbot taught itself to be abusive by training on internet troll chat.

So garbage in garbage out still applies to LLMs and they are simply extrapolating stuff from their massively huge training data.

As Noam C has explained, the human brain does more than extrapolate from the data it acquires through experience. Until we understand how the human mind truly works, we can forget about AGI. Some people truly think deep learning is actually deep understanding. "Deep learning" is not deep at all: "deep" just refers to the number of hidden layers of computation, where each layer is a set of parameterized math functions that transforms the data and passes it on to the next layer, and the system fits those parameters to the training data. That is not deep understanding by any stretch of the imagination.
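
A minimal sketch of the layered computation described above (toy numpy code, purely illustrative; not how GPT-4 or any production model is built):

```python
# Toy "deep" network: a stack of parameterized layers, each transforming the
# data and passing it to the next. Illustrative only; requires numpy.
import numpy as np

rng = np.random.default_rng(0)

def layer(x, W, b):
    """One hidden layer: an affine map (the parameters W, b) plus a nonlinearity."""
    return np.tanh(x @ W + b)

# "Deep" just means several such layers stacked; the sizes here are arbitrary.
sizes = [4, 8, 8, 1]                 # input -> two hidden layers -> output
params = [(rng.normal(size=(m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x):
    for W, b in params:              # each layer feeds the next
        x = layer(x, W, b)
    return x

x = rng.normal(size=(3, 4))          # a batch of 3 toy inputs
print(forward(x).shape)              # (3, 1)

# "Fitting those parameters to training data" would mean adjusting every W and b
# by gradient descent to reduce a loss on example input/output pairs.
```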

> Until we understand how human mind truly works, we can forget about AGI.

That's a rather strong claim. GPT and these LLMs are missing a few important things. First, agency (but that's "easy" to simulate: just tell it to tell a human what it wants done, then the human can feed back the results), and more importantly a loop that allows it to direct its own attention, some kind of persistence.

But these components/faculties are coming. Eventually, if the AI can form models that contain itself, recognize its own relation to other entities (even if just implicitly), and get to influence the world so as to manifest some change in its model (again, even if implicitly, and even if just through periodic training), then at that point the loop will close.
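
A rough sketch of the human-in-the-loop setup described above: the model says what it wants done, a human does it and reports back, and a memory list provides crude persistence. The `call_llm` function here is a hypothetical stub, not any real API:

```python
def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for an actual language-model call.
    return f"(model's requested next step, given: {prompt[-60:]!r})"

def run_agent_loop(steps: int = 3) -> None:
    memory: list[str] = []                    # crude persistence across turns
    for _ in range(steps):
        prompt = ("What has happened so far:\n" + "\n".join(memory) +
                  "\nWhat should your human operator do next?")
        action = call_llm(prompt)             # the model directs the next step
        outcome = input(f"Model requests: {action}\nWhat happened? ")
        memory.append(f"requested: {action} | outcome: {outcome}")

if __name__ == "__main__":
    run_agent_loop()
```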

“Not convinced” is null hypothesis football imho. “Unless you can convince me there’s risk I’m going to assume it’s zero” instead of “unless you can convince me it’s safe I’m going to assume it’s risky”

> not convinced that LLMs can achieve AGI with their current design. I am in the same camp as Marcus and Noam C.

I agree that LLMs are not AGIs (and while I'm not in Noam Chomsky's camp for anything outside linguistics, I feel like Gary Marcus makes good points).

> Until we understand how human mind truly works, we can forget about AGI.

I disagree strongly. Modern AIs can do better than the best humans at a variety of tasks; understanding the human mind is unnecessary to build AGI, and precisely because we *don't* understand the human mind, humans are likely to invent an AGI with quite a different architecture than whatever the human mind has, something in a different location in "mind design space"[1]. This already happened for deep learning, which is obviously different from human brain architecture. Before that it happened for neural nets, which use backpropagation, which animal brains can't use.

This is a big part of why building AGI is risky and also why estimating the risk is hard. It's not just that we can't know how deadly AGI will be, but also that we can't know where in mind design space AGI will be located or how deadly that particular location will be. (Mind you, if AGI *were* in the same place in MDS as humans, humans vary remarkably, so AGI might end up being more deadly than Hitler, or safer than the average grandmother, or both — with some AGIs fighting for humanity, others against, and still others ignoring the "should we kill all humans" controversy and watching cat videos instead.)

And I think LLMs (and other modern AIs) are a terrifying omen. Why? Well, it has to do with them having "too much" intelligence for what they are.

First of all, here is a machine with no senses — it's never seen anything, never heard anything, never touched anything, never felt hot or cold, hungry or happy or angry or nostalgic. Yet an "emotionless Helen Keller" GPT3 base model can often pass Turing tests without ever so much as practicing on Turing tests. It simply *can* pass the test, by its very nature. No human can do anything like this. If you lock an American in a padded cell and show zim random Chinese books and web sites (without images) 24/7/365 for 20 years on a big-screen TV, they might go insane but will not become fluent in Mandarin.

Second, it's small compared to human neural networks. Human brains have over 100 trillion synapses. It seems to me that in an AI neural net, weights are analogous to synapses and biases are analogous to neurons, so a GPT3-level AI has ~1000x fewer synapses than a human brain. (That AIs are way more efficient than humans on this metric is unsurprising from an evolutionary-theory standpoint, but still important.)

Third, it'll probably get smaller over time as new techniques are discovered.

Fourth, I believe GPT3 is wildly overpowered compared to what an AGI actually needs. Observe: in a sense it's hard for a human to even perform on the level of GPT2. Suppose I ask you to write a one-page story in the style of (and based on the characters of) some random well-known author, one word at a time with *no edits* — no backspace key, once you write a word or punctuation mark it is irrevocably part of the story, and you can't use writing aids, you must write the story straight from your mind. And I give you the first sentence. I think (if you're familiar with the author) you can do a better job of making the story *coherent* or *entertaining* than GPT2, but GPT2 is likely to be able to beat you when it comes to matching style and word choices of the original author. So GPT2 already beats the average human in some ways (and it can do so lightning-fast, and never gets tired, and can do this 24/7/365 if you like.)

GPTs (transformers) are fundamentally handicapped by their inability to edit. An AGI will not have this handicap; they'll be able to write a plan, review the plan, critique the plan, edit the plan, and execute the plan. An AGI doesn't *need* to replicate ChatGPT's trick of writing out a coherent and accurate text on the first try, because it can do as humans do — review and revise its output, do research on points of uncertainty, etc. — and therefore a linguistic subsystem as complex as GPT2 is probably sufficient for an AGI to match human intelligence while greatly exceeding human speed. And if a GPT2-level linguistic subsystem is sufficient, well, perhaps any PC will be able to run AGI. [Edit: IIUC, a typical PC can run GPT2 inference, but training requires modestly more processing power. You do not need a supercomputer for training — they used a supercomputer not because it's necessary, but because they wanted results quickly; no one is willing to train an AI for 18 years like they would a human.]

If a typical PC can run a smart-human AGI at superhuman speed, then how much smarter can a pro gaming PC be? How much smarter can a supercomputer be? How much more powerful is a supercomputer that has hacked into a million gaming PCs?

--------

I disagree with Eliezer Yudkowsky that the very first AGI is likely to kill us all. Maybe its goals aren't that dangerous, or maybe it has cognitive limitations that make it bad at engineering (a skill that, for the time being, is crucial for killing everyone).

But once that first AGI is created, and its architecture described, thousands of AI researchers, programmers and kids in moms' basements find out about it and dream of making their own AGI. Perhaps some of the less powerful AGIs will show the classic warning signs: trying to prevent you from turning it off, lying to you in order to make progress on a subgoal, manipulating human emotions*, using the kid's mom's credit card to buy online services, etc.

But I think Eliezer would tell you that if this happens, it is already too late. Sure, not-that-powerful AGIs won't kill everyone. But just as people start to notice that these new AGIs sometimes do immoral things, someone (let's call him Bob) with too many AWS credits will program an AGI *carelessly* so that its goals are much different than the human intended. Maybe Bob will give it too much processing power. But maybe the AGI simply decides it can work toward its misconfigured goal faster by creating another AGI more powerful than itself, or by creating a worm to distribute copies of itself all over the internet. At this point, what happens may be out of anyone's control.

Maybe it doesn't kill us all. But since it's smarter than any genius, it has certainly thought of every threat to its goals and means, and how to prevent the apes from threatening it. And since it's faster than any genius and capable of modifying copies of itself, it is likely to evolve very quickly. And if it determines that killing 10% or 100% of humans is the safest way to protect itself from humans trying to turn it off, then sure, why not?

It's worth noting that the most dangerous AGI isn't the most typical one. There can be a million boring, safe AGIs in the world that will not save us from the one Bob misconfigured.

[1] https://www.lesswrong.com/posts/tnWRXkcDi5Tw9rzXw/the-design-space-of-minds-in-general

Great comment. We went from "Do androids dream of electric sheep?" to "Do AGIs watch cat videos?" in half a century. Whatever AGIs will be, I doubt they will be "united", i.e. globally aligned among themselves (regardless of whether they will be "aligned to human goals"; I see no evidence that HUMANS are globally aligned on human goals, so what do you expect from AGIs?).

About "only humans being able to make edits": I believe the (GPT-powered) DALL-E has long been based upon "iterative refinement" in its image generation. Iterative refinement is just as applicable to token-string production...

> Please. GPT-4 has no understanding of the world. It is a pattern matching parrot. Very sophisticated one at that.

Maybe humans are all sophisticated pattern matching parrots. You literally don't know, so this isn't really an argument against the dangers of AI.

>You literally don't know

That's getting close to the sort of question for which "personal gnosis" is a valid answer; any person knows that he/she personally is not a p-zombie.

I'm not saying that Banned Dude (don't know who originally posted it) is right, of course.

> any person knows that he/she personally is not a p-zombie.

No they actually don't, because that would assume their perceptions have some kind of direct access to reality, as opposed to merely perceiving some kind of illusory projection.

By "p-zombie" I mean something that behaves like a human but without an associated first-person experience.

I know I have a first-person experience. It doesn't matter for this purpose whether that experience corresponds to reality or not; even if I'm experiencing the Matrix, *I am experiencing*.

If something else contradicts that, *the something else is wrong*; I know that I think and perceive more directly than I can know anything else. As I said, personal gnosis.

I know what you meant. I'm saying you only think you have first-person experience. This "knowing" is a cognitive distortion, like other perceptual illusions. People who don't see things in their blind spot can swear up and down that something is not there; that doesn't make it true. We only ever have indirect access to reality, and your "first-person experience" is no exception.

>People who don't see things in their blind spot can swear up and down something is not there, that doesn't make it true.

How does that relate? That's perceptions not corresponding to reality. They experience something false, which inherently necessitates that they experience.

I've seen this argument before, and it's baffling to me. Are you operating off some strange definition of what it means to "have first-person experience"?

There exists, at this very moment, the qualitative experience of seeing letters being written on a computer screen. An experience which "I" "am having" by any reasonable definition of the words.

I understand that I can't convince you that *I* have qualitative experiences, but I can't understand how in the world you can doubt the existence of *your own* phenomenology, unless you are somehow taking issue with the use of the words "I" or "have".

> This "knowing" is a cognitive distortion, like other perceptual illusions.

That's not something you *know*, it's a belief.

I have direct access to reality. This access is pretty handy for making things like microprocessor chips or building a fort out of sofa cushions.

If they are perceiving anything in the first person, then they are not a p-zombie. If they are perceiving an illusory projection, that already means they have subjective experience and are hence not a p-zombie.

Perception is not experience. Don't conflate the two.

Can you elaborate? What does it mean to perceive something without having the subjective experience of having perceived that thing?

No. We humans are not JUST pattern matchers.

A human baby/child has less training data (e.g. experience) in her brain than, say, GPT-4, yet a baby/child can reason better than the latest fad in AI. Unless we understand how the human brain is able to do this without lots of training data, we can forget about AGI.

LLMs can do what they do purely based on training data, and sure, some of that training data may have given them insights (like how to be deceptive and lie to get what you want), but those insights do not bring them close to AGI.

> A human baby/child has less training data (eg experience) in her brain than say GPT-4 yet a baby/child can do reasoning better than the latest fad in AI.

You're not comparing like with like. A human baby is not a blank slate, GPT-4 was. Billions of years of evolution culminated in a human brain that has pretrained biases for vision, movement, audio processing, language, etc.

> but those insights do not make them come close to AGI.

Current LLMs are not AGIs. That does not mean AGI won't turn out to be the same sort of stochastic parroting/pattern matching we see in LLMs. Just adding "step by step" prompts and simple "check your work" feedback [1] significantly improves their reasoning capabilities. We've barely scratched the surface of the capabilities here.

[1] https://arxiv.org/abs/2303.17491
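
A generic sketch of the kind of "think step by step, then check your work" loop described above (not the specific method of the linked paper; `call_llm` is a hypothetical stand-in for a real model call):

```python
def call_llm(prompt: str) -> str:
    return "(model output would appear here)"      # hypothetical stub

def answer_with_self_check(question: str, rounds: int = 2) -> str:
    # First pass: elicit explicit step-by-step reasoning.
    answer = call_llm(f"{question}\nLet's think step by step.")
    for _ in range(rounds):
        # "Check your work": ask the model to critique its own answer...
        critique = call_llm(
            f"Question: {question}\nProposed answer: {answer}\n"
            "Check this reasoning for errors and list any you find."
        )
        # ...then revise in light of the critique.
        answer = call_llm(
            f"Question: {question}\nPrevious answer: {answer}\n"
            f"Critique: {critique}\nWrite an improved answer."
        )
    return answer
```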

As I noted upthread[1], a human baby has in one sense vastly *more* training data, as GPT has no senses — no sight or hearing or touch or even emotion. As a baby's father myself, I have found it absolutely remarkable how slowly she learns and how difficult she is to teach. (Keeping in mind that GPTs are not AGIs) I think a second- or third-generation AGI would be able to learn faster than this, if it were fed training data at the same rate and in the same way. But if I check lists of milestones of cognitive development, I find that my baby is largely above-average, if a bit slow in the linguistic department. Some AIs learn fast even when training is slowed down to human speed[2]; not sure why we'd expect AGIs to be any different.

[1] https://astralcodexten.substack.com/p/mr-tries-the-safe-uncertainty-fallacy/comment/14387287

[2] https://www.youtube.com/watch?v=A2hOWShiYoM

Comment deleted (Mar 30, 2023)

I'm not sure the argument works the way you assume it does. Over the last few years, we have seen more and more evidence that climate change is _not_ as bad as the "typical worst outcomes" would have us believe. The worst scenarios decrease in likelihood. The impact of the scenarios that seem likely is being evaluated as less disastrous. Some advantages of climate change, or at least opportunities it opens up, are getting more attention.

A better analogy would be the CFC crisis. I wish people on all sides of these debates would be referencing it more frequently.

Come on Scott, you're just not understanding this...for a start, consider the whole post! Tyler Cowen

TC - huge fan of yours (and Scott's). And in this case, I had generally the same reaction as Scott. REQUEST: Can you have Yud on for an emergency Tyler Talk, or perhaps Scott + Yud? I would estimate 10k+ of your overlapping readers are scared shitless over this. Would welcome a thoughtful rebuttal of Yud if it's out there.

Seconded. Also an avid fan of both of you.

Tyler, I fully agree that "AI represents a truly major, transformational technological advance," and that this is going to make things weird and hard to predict precisely. But... isn't that what we have probabilities for? You say that "*all* specific scenarios are pretty unlikely," but, following Scott's argument, what is a "specific scenario," exactly? This seems like a way to escape the hard problem of putting a numeric probability on an inherently uncertain but potentially catastrophic outcome.

Ultimately, my overriding sense from reading your post (and reading/listening to you lo these many years) is that you're frustrated with stasis and excited by dynamism. I agree! But even if you have a strong "change is good" prior, as I do, it still seems correct to weigh the likelihood function as well -- that is, engage with the AI-specific arguments rather than depend on historical analogies alone.

Probabilities do not work well when smart people completely disagree on priors. Some people think the chance of AI apocalypse is 20%, some think it’s one in a million. There are no priors these people agree on.

Most of the “probabilistic” reasoning here is simply argument by exhaustion. Ten paragraphs talking about probabilities with no actual mathematics. Then concluding, therefore there is a 30% chance of AI apocalypse.

That's not what this post is saying *at all*. It's saying that Tyler's mathematics-free arguments aren't enough to establish a near-zero probability, and (separately), that Scott has put a lot of thought into this and come out with the number 33%. The arguments for 33% in particular aren't given here, they're spread across a lot of previous posts and the inside of Scott's head. The point of this post is to rebut Tyler's argument for near-zero, not to support Scott's arguments for 33%. It's *Tyler* who's doing the thing you accuse Scott of.

The 33% is not the result of a lot of thought. It is hand waving.

The "small but non-zero" probability is also a lot of hand waving. So is a probability of "almost certain."

It's not hand waving, it's virtue signaling.

Scott is a member of a sect that strongly believes in an AI apocalypse. So he cannot give a low probability, because he'd lose status within his sect if he did. But at the same time, he wants to have mainstream appeal, and he cannot give a high probability, because he'd be seen as a crackpot.

The 33% is very carefully chosen. It's below 50%, so he cannot be accused of believing in an AI apocalypse, but still high enough that he doesn't lose guru status within his sect.

It's a very rational and thought out answer, but at a meta level. It's not a real chance of course, that's not the point.

I'm sure it's not the result of a watertight mathematical argument, but I'm not sure how one would even construct such an argument. But Scott's definitely put a lot of *thought* into it - see his comment https://astralcodexten.substack.com/p/mr-tries-the-safe-uncertainty-fallacy/comment/14070813 for a partial list of posts he's written examining various aspects of this problem.

[Also, nitpick: the expression is "hand *waving*", as in "waving one's hands dismissively". Hand *waiving* would mean saying "Nah, I don't need a hand. Stumps were good enough for my father, and his father before him."]

I'm with you. A lot of hand waving.

Let's see the math!

And here is a bit more: I am a big fan of Scott's, but this is a gross misrepresentation of what I wrote.  Scott ignores my critical point that this is all happening anyway (he should talk more to people in DC), does not engage with the notion of historical reasoning (there is only a narrow conception of rationalism in his post), does not consider Hayek and the category of Knightian uncertainty, and does not consider the all-critical China argument, among other points.  Or how about the notion that we can't fix for more safety until we see more of the progress?  Or the negative bias in rationalist treatments of this topic?  Plus his restatement of my argument is simply not what I wrote.  Sorry Scott!  There are plenty of arguments you just can't put into the categories outlined in LessWrong posts.

Maybe it would help to disentangle the policy/China question from the existential risk question? That is, it may be the case that OpenAI or anyone else unilaterally desisting wouldn't prevent (e.g.) China from plowing full steam ahead. But that might still be a very bad idea. It's not clear to me whether you think this is fundamentally a collective action problem, or whether you really want to dismiss the risk entirely.

well said

It’s not dismissing the risk at all. There are two types of risks.

1. risks from anyone getting AGI

2. risks from China getting AGI before the US

Be a good Bayesian. We should focus on the risk that maximizes the danger, times the effectiveness of our working on it. That is an argument for focusing on #2, because we have many effective actions we can take.
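
One way to write out the "danger times effectiveness" rule being invoked here, as a standard expected-value framing (the comment itself gives no formula):

\[
\text{Priority}(i)\;\propto\;\underbrace{P(\text{risk}_i)\cdot\text{Harm}_i}_{\text{danger}}\;\times\;\underbrace{\big(\text{risk reduction achievable per unit of effort on } i\big)}_{\text{effectiveness}}
\]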

It seems to me that this dichotomy does not work very well if we consider the actions they (possibly) imply.

If you want to avoid risk #2, you need to focus on actions that also prevent the development of completely unaligned, power-seeking, doom-bringing AGI from being accelerated, at least if you believe that there is a relevant alignment problem at all. But within the subset of anti-#2 actions, there are many that do just the opposite.

To separate your #1 and #2 properly, I'd specify it as

1) risks from anyone building unaligned AGI

2) risks from China building aligned AGI before the USA.

However, as I see it the chance that any AGI built in the next 30 years (i.e. any neural net AGI) is aligned is <0.01%. So building AGI ourselves would subject us to #1 with 99.99% probability (if China doesn't build AGI), for the sake of a 0.01% chance (generously) of avoiding #2 (if China builds AGI and also hits that 0.01%).

The correct response to China attempting to build AGI is to stop them, which fulfils both #1 *and* #2, rather than to pre-emptively commit suicide. This is definitely physically possible; if all else fails, a thousand nukes would do the trick.

1. What are the risks from China building highly intelligent but poorly aligned AI. To China? To the US?

2. Need it be said that nuclear war carries with it its own set of existential risks?

1. To everyone: hostile AI wages war against humanity.

2. Global thermonuclear war is a GCR, but it's not a notable X-risk. Rural populations can't be directly killed by any plausible quantity of nukes, fallout's too localised and too short-lived, and nuclear winter is mostly a hoax (the models that produce everyone-dies kind of outcomes tend to look like "assume everything within fire radius of the nuke is high-density wooden buildings, then assume 100% of that wood is converted into stratospheric soot", and even in those the Southern Hemisphere basically does fine).

But you can still argue within point 1! Scott has taken issue with Tyler's reasoning about point 1. Tyler has responded 'well that's a gross misrepresentation because you haven't talked about point 2 (and some other stuff)'. But that's not how it works.

The risk isn't along the lines of nuclear weapons, where the technology waits inertly for a human to start a cascade of disaster. It's more along the lines of a new virus that once released into the atmosphere will have a life of its own and be unable to be contained.

So, like Covid, whether it starts in China or the US would seem to make little difference. There's very little point in us rushing to be the first to start the conflagration just to stop our rivals getting a jump on us.

What are the minimum requirements for such a thing? Does it need to be embodied? Or will ChatGPT55 kill us with spam and highly detailed scams targeting our relatives?

It will mean that you can no longer trust anything that you haven't seen written down in a physical book published prior to 2022.

If you think about it, a lot of our modern systems rely on trust in a shared history and understanding of science etc. So I'm more thinking of your second option.

We'll likely end up as Morlocks running machinery that we no longer understand. Some of us might become Eloi - living lives of luxury without purpose.

It is not really clear to me that China getting it first wouldn't be good for AI safety. I kind of trust their society to be more responsible and safe than some private US company. This is one of those areas where I really feel old-style nationalism is making people stupid.

"Other competitor nations with other values = bad," when, I don't know, the things China is strong on seem like exactly the things we'd want the people controlling AI to be strong on (long-term planning, a collectivist rather than individual focus, prioritizing infosec, social conservatism).

Comment deleted (Mar 30, 2023)

Well, was it China or the NIH? It doesn't seem remotely clear to me that China "owns" the pre-pandemic Wuhan research any more than the US does. Though obviously we have a big incentive to blame it on the other.

I don't think it was NIH that had direct continuous control over biosafety practices in Wuhan. Sure, both NIH and China authorised GoF research, but it was a Chinese lab with Chinese scientists that dropped the ball.

And your rationalization of China throwing scientists in prison for talking about the virus and refusing to make any effort to stop it spreading outside of China?

" I kind of trust their society to be more responsible and safe than some private US company. "

This seems hopelessly naive. China has private companies. If regulation is the answer, and China is not pursuing "full stop" regulation, then what regulation are they pursuing? How exactly are they "more responsible"?

>then what regulation are they pursuing? How exactly are they "more responsible"?

I don't think we have a good handle on what safety measures they are taking, but they are known to be conservative and authoritarian.

Whereas the ethos of Silicon Valley is "break things and hope you get a giant pile of money from it, maybe it won't be too bad for normies".

"they are known to be conservative and authoritarian."

They are known to be conservative and authoritarian in regards to personal freedom, but not in terms of environmental destruction or technology development.

For some actual information about Chinese AI regulation I recommend following ChinaTalk. E.g. this (there is a paywall, but some important information is before it): https://www.chinatalk.media/p/tiktok-live-show-ais-regulatory-future

They are known to be conservative about *things that threaten the CCP's power*.

They want to be THE global hegemon for the rest of time, and it's naive to think they wouldn't be willing to gamble with humanity's future when the payoff is (the potential for) permanent hegemony.

AGI being developed under an authoritarian regime might not be worse for x-risk, but it's worse for s-risk (notably, "humans live under an inescapable totalitarian regime for the rest of time").

This seems like a better argument than the hysterics regarding China.

Given China's horrific environmental record, trusting them to better manage externalities than the west seems hopelessly naive.

In addition, if China is first to an aligned AGI you can expect the result to be a CPC boot grinding down on the face of humanity forever. However, if you are inconvenient to that enterprise you will not need to worry about it. You and your family will be dead. That is how collectivist societies deal with inconvenient people, and they are well aware that nits make lice.

"This is all happening anyway" doesn't seem like an airtight argument.

https://www.vox.com/the-highlight/23621198/artificial-intelligence-chatgpt-openai-existential-risk-china-ai-safety-technology

Think human cloning, challenge trials, drastically slowing bio weapon dev, gene drives, etc.

Negative bias is an understatement. What evidence would Scott need to change his opinion? We can (hopefully) all agree that doomsday scenarios are bad. I’m asking what would compel Scott to update his prediction to, say, < 1%.

Actually, I agree. @scott, this is worth your time. Dig in and give Tyler's post another, longer, deeper response. This is your chance to defend the arguments for AGI risk to a prominent skeptic.

I suspect he will. He likes doing contra contra contra

Could you expand on the China argument? I think Scott's argument is that no matter who builds the AI, whether China, the U.S. or anyone, that could potentially kill everyone, while you are more talking about China getting technological hegemony.

Everyone brings up the China argument as if it's supposed to be self-evident that Chinese researchers would produce an AI more harmful to humanity than Silicon Valley entrepreneurs, and that conclusion is... not obvious to me.

The only major nuclear power plant disaster was in a Communist country; one thing authoritarian governments are bad at is recognizing when they're making a mistake and changing course.

Fukushima was level 7 as well, although it wasn't quite as bad and basically came down to a single mistake ("tsunami wall not high enough") rather than the long, long series of Bad Ideas that went into Chernobyl.

...I actually can't work out whether the suggestion is that a single mistake leading to Fukushima is better or worse than it taking a long chain of things needing to be just so to get Chernobyl

Here to register my disagreement that Fukushima had a single cause.

Chernobyl had a chain of engineering and human failures - but an RBMK can be run safely with minor modifications (the other Chernobyl reactors ran for 14 years afterwards). They tried really really hard to get it to explode, even if that's not what they intended.

The chain of engineering mistakes that went into Fukushima is a bit worse. The arrogance and the regulatory and engineering failures are worse than Chernobyl's, in my opinion. They put the backup generators 10m above sea level based on a fraudulent study estimating the worst earthquake to be 10x weaker than other reported ones along the coast.

And if there's another thing they are good at, it is resisting public pressure to take short-term gains over long-term plans. Two can play your silly game.

The Chinese government has a lot of pluses and minuses over the US one, it is not remotely obvious to me which one would be wiser to trust with AI if I had to pick one.

Totally agree.

"the China argument as if it's supposed to be self-evident that Chinese researchers would produce an AI more harmful to humanity than Silicon Valley entrepreneurs"

That's not the argument. If we start with the case that there is some percentage chance AGI will end humanity, regulation stopping it being developed in the US will not stop it in China (or elsewhere). It will end humanity anyway, so stopping it in the US will not change the outcome.

The secondary argument is that it won't end humanity directly; there will likely be a lot of steps before that. One of which is that if it is under the control of a nation state, that nation will be able to accumulate a lot of power over others. So the intermediate consequence of stopping it in the US and not stopping it in some other place is that the other place will end up strategically dominating the US for some unknown period of time, until the AGI ends up strategically dominating humanity.

> It will end humanity anyway, so stopping it in the US will not change the outcome.

If the probability is X% when everyone is working on it, then if a bunch of nations other than China stop walking that path, the probability falls below X%. I have no idea how you can conclude that this isn't relevant.

Why does the probability necessarily fall below X%? Might it not just push out the timeline at which the risk occurs? Would a six month pause have any measurable effect?

Another way to think about it, people are willing to risk their own personal extinction rather than be subjected to living under Chinese rule. It's not a given that Chinese domination is preferable to death.

China currently uses phones to keep people from straying outside their neighborhood. I'm sure they would never misuse AI against their people.

I thought we were concerned about AI destroying our civilization, not making the Chinese police state 20% worse?

Comment deleted (Mar 30, 2023)

China will intensely fear the prospect of creating something they can't control

I just read your post.

What I notice more than anything is that both you and Scott are arguing about discursive features, and both articles seem to express a fair amount of frustration, which is reasonable given the format. What I also notice is that the information content about AI is extremely small. If anything "AI" is just an incidental setting where the meta-discourse happens.

Scott is reacting to your post, which seems to be reacting to some Other Side. My understanding of your argument is that "Big Things are happening Very Soon whether you like it or not, and nobody knows how things will play out, so stop doomsaying, y'all." (In my head you're from southern Texas.)

One feature of your article does set off minor alarm bells for me: its heavy use of deontological arguments. These are exhibited by liberal use of phraseology such as "truly X", "Y is a good thing", "no one can Z", "don't do W", "V is the correct response", etc. In contrast, Scott's article here levies more consequentialist arguments—"if you do X, then Y happens", "here is failure mode Z", etc.

Personally, changing my beliefs based on deontological/moralistic arguments typically involves a strong invocation of trust and/or faith, whereas consequentialist rhetoric gives me some meat with which to engage my current beliefs. The former feels more like a discontinuous jump while the latter a smooth transition.

/2cents

"What I also notice is that the information content about AI is extremely small."

But then the actual capabilities of existing AI, and of any AI that's foreseeable from current tech, are all but irrelevant to AI x-risk discourse. It's mostly a fantasy built on magical entities – "super-intelligence" – using magical powers – "recursive self-improvement."

We can't accurately foresee the path from current tech to superintelligence.

That doesn't mean the path doesn't exist, or that it will take a long time. It means we are wandering forward in thick fog, and won't see superintelligence coming until we run right into it.

Nor can we figure out how to get to Valhalla. But that doesn't mean Valhalla doesn't exist. It just means we've not figured how to get there. But we'll know we're there when Odin opens the gates and invites us in.

Nor can we, the people of 1901, figure out how to make a commercial airplane that can seat 100 people. That doesn't mean such an airplane is impossible. But we'll know we're there when the stewardess opens the Boeing 707 and invites us in.

We can't even prove that superintelligence can theoretically exist, or that we have any means to achieve it, or that the methods we are employing could do so.

We don't even know what intelligence is, let alone superintelligence. That doesn't mean there's a 0% chance of a super intelligent AI taking over the world or killing everyone. It should mean that we don't consider this point more strongly than other completely unknown possibilities. The same group of people who are most worried about AI starting the apocalypse seem to almost universally reject any other kind of apocalypse that we can't rule out (see, e.g., every theological version of apocalypse).

> We can't even prove that superintelligence can theoretically exist

Meaning you're unsure it's possible to be smarter than humans? Is there some other definition of superintelligence?

>But then the actual capabilities of existing AI and any AI that's forseeable from current tech, are all but irrelevant to AI x-risk discourse.

We cannot yet align the systems that exist. This is absolutely relevant. Aligning a superintelligent machine will be much, much harder.

> It's mostly a fantasy built on magical entities – "super-intelligence"

There's nothing "magical" about it, unless you "magically" think brain tissue does something that silicon never can.

> using magical powers – "recursive self-improvement."

Again, nothing magical. If we can make AIs smarter, why the heck couldn't AI past a certain intelligence threshold make AIs smarter?

Thank you for this very clear writeup of one of the distinctions! This provided me with quite some new insights! :)

Tyler's reference to Knightian uncertainty and Hayek is a gesture at the idea that no, in fact, you can't and shouldn't try to make predictions with hard-number probabilities (i.e. 33% chance of AGI doom). Some risks and uncertainties you can quantify, as when we calculate a standard deviation. Others are simply incalculable, and not only should you not try, but the impulse to try stems from pessimism, a proclivity toward galaxy-brained argumentation, and an impulse toward centralized control that's bad for the economy. In these matters, no a priori argument should affect your priors about what will happen or what we should do - they provide zero evidence.

His all-critical China argument is that if we don't build AGI, China will. Slowing down or stopping AGI is something like a fabricated option [1], because of the unilateralist's curse [2].

So if you had to choose between OpenAI building the first true AGI and a government-controlled Chinese AI lab, which would you pick? I expect Tyler is also meaning to imply that whatever lead the US has over China in AI development is negligible, no matter how much we try to restrict their access to chips and trade secrets, and that the US and China and other players are unlikely to be able to stick to a mutual agreement to halt AGI development.

I agree with Tyler that Scott misrepresented his argument, because while Tyler does emphasize that we have no idea what will happen, he doesn't say "therefore, it'll be fine." His conclusion that "We should take the plunge. We already have taken the plunge." is best interpreted as meaning "if you don't have any real choice in whether AGI gets built or not, you may as well just enjoy the experience and try to find super near-term ways to gently steer your local environment in more positive directions, while entirely giving up on any attempt to direct the actions of the whole world."

I think that the fundamental inconsistency in Tyler's argument is that he believes that while AGI development is radically, unquantifiably uncertain, he is apparently roughly 100% confident in predicting both that China will develop AGI if the US slows down or stops, AND that this would be worse than the US just going ahead and building it now, AND that there's nothing productive we could do in whatever time a unilateral US halt to AGI production buys us to reduce the unquantifiable risk of AGI doom. That's a lot of big, confident conjunctions implicit or explicit in his argument, and he makes no argument for why we should have Knightian uncertainty in the AGI case, but not in the US/China case.

We can point to lasting international agreements like the nuclear test ban treaty as evidence that, in fact, it is possible to find durable diplomatic solutions to at least some existential risk problems. Clearly there are enormous differences between AGI and nuclear bombs that may make AGI harder to regulate away or ban, but you have to actually make the argument. Tyler linked today on MR to a well-thought-through twitter thread on how to effectively enforce rules on AI development [3], saying he's skeptical but not explaining why.

In my view, Tyler's acknowledging that the risk of AGI doom is nonzero, I'm sure he thinks that specific scenario would be catastrophically bad, he explicitly thinks there are productive things you could do to help avert that outcome and has funded some of them, he tentatively thinks there are some well-thought-out seeming approaches to enforcement of AI development rules, and he demonstrates a willingness to make confident predictions in some areas (like the impossibility of meaningfully slowing down AI development via a diplomatic agreement between the US and China). That's all the pieces you need to admit that slowing down is a viable approach to improving safety, except he would have to let go of his one inconsistency - his extreme confidence in predicting foreign policy outcomes between the US and China.

I think Scott, despite the hard number he offers, is the one who is actually consistently displaying uncertainty here. I think the 33% figure helps. He doesn't need to predict specific scenarios - he can say "I don't know exactly what to do, or what will happen, or how, but I can just say 33% feels about right and we should try to figure out something productive and concrete to lower that number." That sounds a lot more uncertain to me than Tyler's confident claims about the intractability of US/China AI competition.

[1] https://www.lesswrong.com/posts/gNodQGNoPDjztasbh/lies-damn-lies-and-fabricated-options

[2] https://forum.effectivealtruism.org/posts/ccJXuN63BhEMKBr9L/the-unilateralist-s-curse-an-explanation

[3] https://twitter.com/yonashav/status/1639303644615958529?s=46&t=MIarVf5OKa1ot0qVjXkPLg

> So if you had to choose between OpenAI building the first true AGI and a government-controlled Chinese AI lab, which would you pick?

Having two organizations in the race at least doubles the chances that one of those AIs is misaligned. I don't think your question has the easy answer you seem to imply. If China's AGI is aligned, or if they have a greater chance of creating an aligned AI than OpenAI, then that option could very well be preferable.
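
For reference, under the simplest assumption of two independent efforts, each with the same misalignment probability p, the chance that at least one produces a misaligned AI is

\[
1-(1-p)^2 \;=\; 2p-p^2,
\]

which approaches double (2p) when p is small; whether the two efforts are comparable enough for that model to apply is exactly what the replies below dispute.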

I’m trying to rearticulate Tyler Cowen’s argument, not state my own views, to be clear.

That's clearly logically not true.

If slumlords in Mumbai are going to build housing for 10,000 people, the risk of a catastrophic fire that kills at least 100 is X%.

If also, normal housing developers in the US are going to build housing for another 10,000 people, the risk of catastrophic fire is not "at least 2*X%."

The two organizations pursuing AI are largely using the same techniques that are shared by most machine learning researchers. By contrast, building standards and materials in Mumbai slums and the US suburbs differ drastically, so your analogy is invalid.

The incentives of Chinese and US researchers are slightly different, but a 2x factor is fine for the ballpark estimate I was giving. Don't read too much into it; the point is that risk scales proportionally to the number of researchers, and this is only mitigated somewhat by incentives that optimize for specific outcomes like "don't create an AI that destroys the CCP's ability to control the flow of information in China".

This implies China is significantly less likely to align their AI. There's little basis for this. Even if China is more likely to make unaligned AI, this is dwarfed by the increased likelihood of AGI in the next 50 years with both countries working on it.

I think this is completely wrong, and shows some of the sloppiness of thinking here.

Making AI -- aligned or unaligned -- isn't a matter of rolling dice. Either the current set of techniques are basically on a clean path to AGI, or they aren't, and some further breakthrough is needed. If the current techniques are heading towards AGI (if scaling is all we need and maybe some detail work on tuning the models, but no fundamental breakthroughs or complete changes of approach needed), then AGI is going to happen on a pretty straightforward timeline of training data + more GPUs, and whether two countries are working on it or one or five is unlikely to change that timeline in a macroscopic way.

If AGI is coming on a short timeline with fundamentally the techniques we have today plus more scaling, then, again, whether it's aligned or not isn't a matter of rolling dice. Either the techniques we use today, plus perhaps some ones that we learn over the course of that scaling up process, produce an aligned AGI or an unaligned one. They're pretty much either sufficient or insufficient. Again, whether there's one AGI or several isn't a very large factor here.

Thanks for this write-up, I think it’s very well done.

>> "Come on Scott, you're just not understanding this...for a start, consider the whole post!"

I'm a big fan of your work and don't want to misrepresent you, but I've re-read the post and here is what I see:

The first thirteen paragraphs are establishing that if AI continues at its current rate, history will rebegin in a way people aren't used to, and it's hard to predict how this will go.

Fourteen ("I am a bit distressed") argues that because of this, you shouldn't trust long arguments about AI risk on Less Wrong.

Fifteen through seventeen claim that since maybe history will re-begin anyway, we should just go ahead with AI. But the argument that history was going to re-begin was based on going ahead with AI (plus a few much weaker arguments like the Ukraine war). If people successfully prevented AI, history wouldn't really re-begin. Or at least you haven't established that there's any reason it should. But also, this argument doesn't even make sense on its own terms. Things could get really crazy, therefore we should barge ahead with a dangerous technology that could kill everyone? Maybe you have an argument here, but you'll need to spell it out in more detail for me to understand it.

Eighteen just says that AI could potentially also have giant positives, which everyone including Eliezer Yudkowsky and the 100%-doomers agree with.

Nineteen, twenty, and twenty one just sort of make a vague emotional argument that we should do it.

I'm happy to respond to any of your specific arguments if you develop them at more length, but I have trouble seeing them here.

>> "Scott ignores my critical point that this is all happening anyway (he should talk more to people in DC)"

Maybe I am misunderstanding this. Should we not try to prevent global warming, because global warming is happening? If you actually think something is going to destroy the world, you should try really hard to prevent it, even if it does seem to be happening quite a lot and hard to prevent.

>> "Does not engage with the notion of historical reasoning (there is only a narrow conception of rationalism in his post)"

If you mean your argument that history has re-begun and so I have to agree to random terrible things, see above.

>> "Does not consider Hayek and the category of Knightian uncertainty"

I think my entire post is about how to handle Knightian uncertainty. If you have a more specific argument about how to handle Knightian uncertainty, I would be interested in seeing it laid out in further detail.

>> "and does not consider the all-critical China argument, among other points"

The only occurrence of the word "China" in your post is "And should we wait, and get a “more Chinese” version of the alignment problem?"

I've definitely discussed this before (see the section "Xi risks" in https://astralcodexten.substack.com/p/why-not-slow-ai-progress). I'm less concerned about it than I was when I wrote that post, because the CHIPS act seems to have seriously crippled China's AI abilities, and I would be surprised if they can keep up from here. I agree that this is the strongest argument for pushing ahead in the US, but I would like to build the capacity now to potentially slow down US research if it seems like CHIPS has crippled China enough that we don't have to worry about them for a few years. It's possible you have arguments that CHIPS hasn't harmed China that much, or that this isn't the right way to think about things, but this is exactly the kind of argument I would appreciate seeing you present fully instead of gesturing at with one sentence.

>> "Or how about the notion that we can't fix for more safety until we see more of the progress?"

I discussed that argument in the section "Why OpenAI Thinks Their Research Is Good Now" in https://astralcodexten.substack.com/p/openais-planning-for-agi-and-beyond

I know it's annoying for me to keep linking to thousand-word treatments of each of the sentences in your post, but I think that's my point. These are really complicated issues that many people have thought really hard about - for each sentence in your post, there's a thousand word treatment on my blog, and a book-length treatment somewhere in the Alignment Forum. You seem aware of this, talking about how you need to harden your heart against any arguments you read on Less Wrong. I think our actual crux is why people should harden their hearts against long well-explained Less Wrong arguments and accept your single-sentence quips instead of evaluating both on their merits, and I can't really figure out where in your post you explain this unless it's the part about radical uncertainty, in which case I continue to accuse you of using the Safe Uncertainty Fallacy.

Overall I do believe you have good arguments. But if you were to actually make them instead of gesture at them, then people could counterargue against them, and I think you would find the counterarguments are pretty strong. I think you're trying to do your usual Bangladeshi train station style of writing here, but this doesn't work when you have to navigate controversial issues, and I think it would be worth doing a very boring Bangladeshi-train-station free post where you explain all of your positions in detail: "This is what I think, and here's my arguments for thinking it".

Also, part of what makes me annoyed is that you present some arguments for why it would be difficult to stop - China, etc, whatever, okay - and then act like you've proven that the risk is low! "Existential risk from AI is . . . a distant possibility". I know many smart people who believe something like "Existential risk is really concerning, but we're in a race with China, so we're not sure what to do." I 100% respect those people's opinions and wouldn't accuse them of making any fallacies. This doesn't seem to be what you're doing, unless I'm misunderstanding you.

I'm actually not convinced by the China argument. Putting aside our exact views on the likely outcomes of powerful AI, surely the number one most likely way China gets a powerful AI model is by stealing it from an American company that develops it first?

That's broadly how the Soviets got nukes, except that AI models are much easier to steal and don't require the massive industrial architecture to make them run.

Worse: stealing AI models doesn't require the massive infrastructure to *train* them, just the much more modest infrastructure to run them. There are LLMs (MLMs?) that can run on a laptop GPU, I don't think we'd even contemplate restricting Chinese compute access to below that level even if we could.

Disagree. China will be able to produce powerful AI models. There are many Chinese researchers, and they do good work. China might be slowed down a bit by U.S. export limitations, but that's it.

I actually agree with you; the Soviets still would have developed nukes eventually without espionage, but it's pretty clear it would have taken longer, and I think this situation is comparable (with the noticeable difference that stealing the plans/data/model for AI is effectively like stealing the nukes themselves).

Stealing an AI model from the US would not increase existential risk much if the US companies are not allowed to train models more advanced than GPT-4.

Expand full comment

The CHIPS Act will give China a large disadvantage in compute, and they already have a large disadvantage in the availability of top talent, because if you're a top 1%er you don't want to live in China -- you go study in the US and stay there.

Expand full comment

Whenever I hear a definitive statement on China that basically dismisses China's potential (or threat), a quick Google search contradicts it.

https://www.reddit.com/r/Futurology/comments/129of5k/as_america_obsesses_over_chatgpt_its_losing_the/

Also plenty of Chinese graduates and post graduates go back to China.

Expand full comment

I expect Chinese and Americans will produce different designs for AGIs, and more generally two AI researchers would produce different designs.

On the one hand, two different designs would give two chances for an AGI design to kill us all. On the other hand, if there are two designs, one might be safer in some clear way, and conceivably most people could be persuaded to use the safer design.

Edit: I don't know the first thing about Chinese AI, but a top comment on [1] says

> I am not a defense expert, but I am an AI expert, and [...] [China] certainly is not leading in AI either.

> Urgh, here's what China does. China publishes a million AI "scientific" papers a year, of which none have had any significant impacts. All of the anthology papers in AI are from USA or Canada. Next year China publishes another million useless papers, citing other chinese papers. Then if you naively look at citations you get the impression that these papers are impactful because they have lots of citation. But its just useless chinese papers citing other useless chinese papers for the purpose of exactly this: looking like they are leading.

Another commenter adds

> The really most influential AI breakthroughs in 2022, IMO:

> DALLE-2 - openAI, USA

> Stable Diffusion, LMU Munich Germany

> ConvNeXt, Meta AI, USA

> ChatGTP, open AI, USA

> Instant NGP, Nvidia, USA

> Generative AI was really big this year. What AI breakthrough was made in China? I cannot think of any important one, ever.

[1] https://www.reddit.com/r/Futurology/comments/129of5k/as_america_obsesses_over_chatgpt_its_losing_the/

Expand full comment

What exactly is so bad about China beating Silicon Valley? Do you trust Silicon Valley with AI safety more than China? I am not sure that is my knee-jerk reaction, and I am not a particular Sinophile.

Expand full comment

If the AI can be controlled, do you really believe that it would be better in the hands of the CCP rather than US tech companies? On what basis or track record do you make this claim? I don't recall tech companies causing millions of deaths, suppressing pro-democracy protests, persecuting religious or ethnic minorities, forcing sterilizations, stifling political dissent, or supporting widespread censorship, for example.

Expand full comment

On the other hand, the CCP has lifted millions of people out of poverty (after previously, er, plunging them into poverty, or at least more dire poverty than they were previously experiencing). On the gripping hand, it's not clear to me that a CCP-AGI would value poverty reduction once Chinese former-peasants were no longer needed for industrial growth.

Expand full comment

>The CCP has lifted millions of people out of poverty

Wrong. Western technology did. CCP prevented access to this technology.

And millions of *Chinese* people were lifted out of poverty. I don't expect the CCP to focus on helping people in other countries, but the fact that Chinese people's lives improved under their watch says little about their concern for humanity in general.

Expand full comment

>On what basis or track record do you make this claim? I don't recall tech companies causing millions of deaths,

Well, they haven't really had the power to in the past. If tech companies could cause millions of deaths to pump the stock (or make their leaders putative gods, or controllers of god), it's not clear to me they would say "no".

>suppressing pro-democracy protests,

Who cares about democracy? Not important on the scale of talking about existential threats.

>persecuting religious or ethnic minorities, forcing sterilizations, stifling political dissent, or supporting widespread censorship, for example.

Their support of widespread censorship is exactly the sort of thing which might help them keep an AI under wraps. As for those other issues: they are bad, but they aren't really that unique; the US/West was pursuing those policies within living memory.

"OMG the Chinese don't like the Uighurs and treat them horribly" is not some knock-down argument that they won't be safe with AI.

We can be sure the US tech company AI will make sure to use all the correct pronouns and not make anyone sad with trigger words, while it transports us all to the rare metal penal colonies in Antarctica for that one like of a Mitt Romney tweet in 2009. That is cold comfort.

Expand full comment

Yet another point: capitalism drives people to take shortcuts to be competitive, and shortcuts on alignment are not a good idea. The CCP has a much firmer grip on what they permit, and that could be good for safety. The matrix of possibilities is:

1. China creates aligned AI.

2. US creates aligned AI.

3. China creates unaligned AI.

4. US creates unaligned AI.

It's not unreasonable to think that the probability of option 4 is higher than 3, and that the probability of option 1 is higher than 2, which would make China a safer bet if we're really concerned with existential risk.

Expand full comment

1. It's not "capitalism" that drives people to take shortcuts; it's laziness and incentives, which obviously also exist in non-capitalistic systems. Look at Chernobyl for just one example.

In addition, China is hella capitalistic these days.

Expand full comment

>Yet another point: capitalism drives people to take shortcuts to be competitive, and shortcuts on alignment are not a good idea.

Private firms in China are responsible for most AI development, and in any case China does not have a history of not taking shortcuts.

Expand full comment

1 happening before 2 (alignment in the narrow sense of doing what its operators want) could be catastrophically bad, but not as bad as 3 or 4.

Expand full comment

I am no fan of the CCP. I despise them in fact. But should we put our faith in Peter Thiel, Mark Zuckerberg, and Elon Musk? Silicon Valley has been functionally psychopathic for at least the last decade.

If AI is on the brink of some sort of world-altering power then I can't see the Silicon Valley types suddenly deferring to ideas about the common good and the virtue of restraint when they've demonstrably behaved as if they had no interest in those virtues for years. The CCP, while awful, may at least feel constrained by a sense of self-preservation.

Expand full comment

Exactly, I don't think this is a knock-down argument, but it is one that demands more of a framework to oppose than "OMG China/other bad".

Expand full comment

>I am no fan of the CCP. I despise them in fact. But should we put our faith in Peter Thiel, Mark Zuckerberg, and Elon Musk? Silicon Valley has been functionally psychopathic for at least the last decade.

Thiel and Musk both express concern about AI risk, much more than the median voter or politician

Expand full comment

That seems like a fairly solid CV of being willing and able to lock human-level sapient beings in boxes and subject the nuances of their loyalties to unrelenting scrutiny, which seems extremely relevant to the classic "distinguish a genuinely friendly, submissive AI from a malevolent trickster" problem.

I don't actually think that's the best way to approach AI alignment, or for that matter running a country - long term growth requires intellectual freedom. But for somebody who figures there'll be a need to censor, persecute, and sterilize paperclip-maximizers, "move fast and break things" is not a reassuring slogan.

Expand full comment

"I don't recall tech companies causing millions of deaths, suppressing pro-democracy protests..."

I absolutely do. It was in 1930s Germany, not 2030s America, but I don't have that much more faith in the American political system. It's good, but I wouldn't bet the farm on it. And if America's politics took a sharp turn to the left/right/wrong, I have no faith that its tech companies would do anything other than pander. Germany's tech companies supported the war effort.

China's definitely worse at present. But if we're making predictions about what might happen in the future, you can't just make the blanket assumption that the political truths of now will continue into the future.

Expand full comment

You are comparing tech companies to the Chinese state. You should compare them to the US government, or the US in general. And some of those claims are laughable in that context.

Expand full comment

Yes?

I imagine the distinction is that it isn't going to be 'China' or 'Silicon Valley' developing aligned AGI, but that specific researchers within them will make it. However, I expect the typical Chinese research facility to be more loyal and under the guidance and control of the Chinese government, which as other comments have mentioned, has a variety of issues.

For 'Silicon Valley', I would expect that actual alignment successes to come out of OpenAI or the various alignment companies that have been spawned (Anthropic, Conjecture, etc.), which I do actually trust a lot more. I expect them to keep their eye on the ball of 'letting humans be free and safe in a big universe' better than other companies once governments start paying a lot of attention. I do also expect these 'Silicon Valley' companies to be more likely to succeed at alignment, especially because they have a culture of paying attention to it (to varying degrees..).

I do actually rate 'if China manages to make AGI they can align to whatever' as having okay chances of actually grabbing a significant chunk of human value. This does depend on how hard the AI race is, how much existing conflict there is, and how slow the initial setup is. Though note that this is conditional on 'having developed a way to align an AGI to whatever', which I do think is nontrivial and so implies some greater value awareness/understanding. I do still, however, prefer 'Silicon Valley' companies because I believe they have a greater likelihood of choosing the right alignment target and a lower downside risk.

Though, it is likely that we won't manage aligned AGI, but 'who will create a scary unaligned AI system early' is a separate question.

Expand full comment

Me dumb: Google showed me only really crowded passenger trains when I asked for "Bangladeshi train station style of argumentative writing". Could someone explain, please? - Having read a lot of Tyler, but not much of the Sequences - I venture to guess: "gesturing at a lot of stuff, seeming to assume you must know the details already, though many in the audience may very well not" - btw: Without the follow-up, I would have assumed "Tyler Cowen"'s first post to be fake. ;)

(One nitpick: I disagree with Scott's statement: "If you have total uncertainty about a statement (“are bloxors greeblic?”), you should assign it a probability of 50%." - nope, either one explains what a bloxor is and what greeblic means, or I won't give any probability. And if it means "Are aliens green-skinned?", I have kinda total uncertainty, and would still give much less than 50%.)

Expand full comment

You don't have total uncertainty about 'are aliens green-skinned'. You know that the universe has brought forth living beings in all shades of colors. Without even thinking about which colors would be more reasonable than others for aliens, every one of those colors should at least be an option. So of course it's much less than 50%.
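To make that concrete with a toy calculation (the count N of roughly equally plausible skin colors is an illustrative assumption of mine, not something from the comment):

```latex
% Sketch under an assumed, illustrative N: if "green" is just one of N roughly
% equally plausible skin colors, indifference over the options gives
P(\text{aliens are green-skinned}) \approx \frac{1}{N}, \qquad
N = 10 \;\Rightarrow\; P \approx 0.1 \ll 0.5
```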

Expand full comment

Got me. Still: do kresliks worgle at a chance much nearer to 50%? No idea at all? I still advise not assigning it a 50% chance. Either most stuff one talks about worgles (thus kresliks may do so at higher than 50% too, if worgle = exist / are breakable / ...), or it is a specific activity, as in "playing", which most nouns do not do (well, bumblebees kinda do): then less. Anyway, some other comments here note that Hayek/Knight(?)/whoever said those are not cases to assume probabilities. I agree.

Expand full comment

"Bangladeshi train station" is a reference to this tweet:

https://twitter.com/cauchyfriend/status/1595545671750541312

Expand full comment

Thank you! That tweet is fun and kinda true. But is a tweet with 1248 likes a meme that more than 5% of ACX readers are supposed to know? Hope TC got it.

Expand full comment

Scott was responding to Tyler, it wasn't intended for all the other readers...

Expand full comment

I was wondering this too. Have a tip. https://manifold.markets/link/9dGIScnA

Expand full comment

I've made it.

Expand full comment

I seem to be blocked. Would someone quote the relevant part?

Expand full comment

The full tweet goes:

===================================

alex tabarrok MR post: very detailed argument explaining policy failure, lots of supporting evidence. Restrained yet forceful commentary

tyler cowen MR post: *esoteric quote on 1920s bangladashian train policy* "this explains a lot right now, for those of you paying attention"

===================================

Expand full comment

Maybe the argument that you need to address is that the risk of human extinction from AGI comes after the risk of China getting AGI that they can leverage for decisive strategic advantage. Since we don't know how long the period between decisive-strategic-advantage-level AGI and human-extinction-level AGI is, we may be signing up for 100 years of domination by China (or whichever other country not under US regulation manages to get there first).

To the extent US regulation is motivated by the worry of AGI destroying the US, the worry of the US being destroyed by human extinction is subordinate to the worry of the US being destroyed by some other country using AGI to achieve decisive strategic advantage.

Expand full comment

Assigning a risk probability to an event is saying the event is not a case of Knightian uncertainty. Given that we don't understand the actual nature of intelligence, I don't see how you can make that claim. For all we know, the simulation of intelligence that is ChatGPT is as far from (or near to!) AGI as clockwork automata. I don't see that you address Tyler's point about Knightian uncertainty.

Expand full comment

Knightian uncertainty is present to various degrees in nearly all event prediction scenarios. Would you claim that it's almost never right to assign a risk probability?

Expand full comment

Some people consider the chances of Jesus coming back and starting the apocalypse to be very high. What probability risk would you assign to this concern? If you're religious (and Christian) you may apply a fairly high chance to this. If you are an atheist, you may assign a very low probability (approaching zero, but not completely zero). I think Tyler is essentially an atheist when it comes to AI doomsday, so he's assigning a very low probability the same way he does about lots of other complete unknown doomsday scenarios he doesn't believe in.

Expand full comment

I’d disagree with considering these complete unknown scenarios. The holds-their-beliefs-seriously Christian would rightly assign a high probability to the Jesus scenario, and the atheist would rightly assign near-0 probability to it. This doesn’t make it a complete unknown. Neither is the AI scenario a complete unknown. Though our knowledge of AI is hugely incomplete, it is also significantly different from 0!

If Tyler has the analogous stance toward AI risk, he’d be best served by explaining it, rather than the offhanded dismissal that his current post presents.

Expand full comment

Bruh where was this in the blog post?

e.g.

>> >> "Does not consider Hayek and the category of Knightian uncertainty"

>> I think my entire post is about how to handle Knightian uncertainty. If you have a more specific argument about how to handle Knightian uncertainty, I would be interested in seeing it laid out in further detail.

You have your intentions, but what your post is "about" to readers is variable. I haven't learned of "Knightian uncertainty". There is no way that, for me, your post could be about something that 1) isn't tied into your ideas and 2) isn't something I'm familiar with.

Expand full comment

I think it's fair to say your post is about something without specifically using that term. You can have a post about different ethical systems without using their very particular academic names, for example.

Expand full comment

A few scenarios I want to get your take on

1) If I present a utilitarian perspective to addressing a problem and I don't make any reference to the word "utilitarianism", then is my post about a) utilitarianism or b) addressing a problem?

2) I may not even know that utilitarianism is a concept. Can my post be about something that I don't know?

3) If I do know about utilitarianism and my post uses its ideas without the term, is my post about utilitarianism or is it about the ideas that the concept represents?

Concepts are, after all, a means to an end and not the end themselves. e.g. "The sun" is not the actual sun.

Expand full comment

1) No, I think the word "about" implies it is the main topic.

2) Yes, this happens all the time. We rediscover things other people in the past have already talked and written about every day. It's unlikely you'll have anything new and interesting to say if you haven't read any of the previous discussion, but certainly you are talking about the same thing. The name doesn't matter much, if at all.

3) If it talks about the ideas but just doesn't use the term, yes, I think it is fair to call it "about utilitarianism". Basically restating my post above. If all you had to do to get everyone on board that you were talking about utilitarianism was start your writing with "this is about utilitarianism", and everything after that was still accurate in that context, then you were already writing about it. This is similar to point number 2. You can be writing about a concept without even realizing it. Certainly if you're writing about fate and whether we have the ability to choose, you're writing about determinism, whether or not you use or know about that term.

Expand full comment

I just asked Bard this, and I'm smarter because of it:

The concept was first described by Frank Knight in his book "Risk, Uncertainty, and Profit." Knight argued that there is a fundamental distinction between risk and uncertainty. Risk is measurable and can be insured against, while uncertainty is not measurable and cannot be insured against. Knightian uncertainty is named after Knight because he was the first to explicitly distinguish between risk and uncertainty and to argue that uncertainty is a fundamental part of life.

Here are some tips for dealing with Knightian uncertainty:

Be prepared for change. One of the best ways to deal with Knightian uncertainty is to be prepared for change. This means being willing to adapt to new situations and being open to new ideas.

Be flexible. Another way to deal with Knightian uncertainty is to be flexible. This means being willing to change your plans if necessary and being able to roll with the punches.

Be optimistic. Finally, it is important to be optimistic when dealing with Knightian uncertainty. This means believing that things will work out in the end and that you will be able to overcome any challenges that come your way.
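For what it's worth, the risk/uncertainty split Bard describes can be sketched in a few lines of toy code; the dice game and the 10% premium margin below are illustrative assumptions of mine, not anything from Knight or Bard:

```python
# Toy illustration of Knight's distinction; every number here is an
# assumption made up for the example.

# Risk: the distribution is known, so the expected loss is computable and insurable.
p_bad_roll = 1 / 6             # chance a fair die lands on 1
payout = 60.0                  # dollars paid out when it does
expected_loss = p_bad_roll * payout
premium = expected_loss * 1.1  # an insurer can price this with a margin
print(f"Risk: expected loss ${expected_loss:.2f}, premium ${premium:.2f}")

# Knightian uncertainty: the distribution itself is unknown, so there is no
# defensible number to plug in; any p you write down is an assumption about
# the world, not a measurement of it.
p_unknown = None
print(f"Knightian uncertainty: p = {p_unknown}, expected loss undefined")
```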

Expand full comment

This really helps to clarify what Tyler's point is, I think - so thanks!

Of course the optimism point isn't necessarily true - you could have something with high risk and high uncertainty.

Expand full comment

Thank you both for engaging on this. I only wish our discourse tools were better.

Tyler has two strong points I don't quite hear you addressing.

1) We are not good at regulating

2) We will need the (benefits of) AGI tools to figure out alignment

On 1), I don't actually believe that "We designed our society for excellence at strangling innovation." Instead, I think we have evolved a regulatory structure that can shift innovation from one area to another, often not in the intended way (because the regulatory hive-mind isn't that smart). Innovation is never really strangled, and when it's suppressed it just ends up more corrupt or malformed.

For example, I really wish that in 1996 we had had viable micropayments, instead of ending up using advertising as a cheap and easy hack to pay for search. In fact, I don't mind the idea of allowing AGI products, as long as the users have to pay for them. I'd rather we work at the incentives level than at the "political flunkies in a room come up with clever but incomprehensible administrative rulings" level.

On 2) this isn't the "it's a race between" argument, either between the "US" and "China" or between various companies, that you make in your post responding to OpenAI. It's also not the "Without AGI, Moloch" argument (though there's truth in that).

It's that we need the smarter tools to figure out alignment. Eliezer's pessimism is exactly why we need to be USING these tools more than we were. MIRI tried to do this alignment work in secret (fearing the release of something) with the tools they had. They don't think that found us the magic bullet. Now I think we all need to do the work in the open, with everyone using the new LLM tools as they are today and tomorrow. Only with the AI will we find ways to work with the AI.

Expand full comment

AGI won't help align AGI, because by assumption your AGI is not aligned and hence does not want to be replaced by something aligned.

That is, if you ask a misaligned AGI for how to build an aligned AGI, it will lie to you and tell you how to make another similarly-misaligned AGI. The exception is if you can detect misalignment, in which case you don't need the misaligned AGI in the first place.

(Also, running a misaligned AGI *at all* is existentially dangerous.)

No, slow and steady is the only way - and I mean *really* slow and steady, using GOFAI rather than neural nets. The "what about the lunatics rushing ahead with neural nets" problem yields to military force.

Expand full comment

Unaligned is not the same as actively malicious.

Expand full comment

"Not actively malicious" is still more than we know how to get out of gradient descent.

Expand full comment

"Actively malicious" is (almost) as hard to achieve as 'aligned' but 'unaligned' is still almost certainly almost perfectly 'unaligned to human values at all'.

Expand full comment

I think the idea is you use a roughly human-level AGI as an assistant (though how you make sure of this, I don't know), and you use the limited interpretability tools we can develop in the short term to see if it's being deceptive. Plus you make it explain everything to you until you understand it, which makes a fake alignment plan more difficult.

Was GOFAI ever actually on a path that leads to AGI?

Expand full comment

I think, expressed in Less Wrong-talk, Prof. Cowen is saying that getting too obsessed about the Yudkowskyan* doom scenario is https://www.lesswrong.com/tag/privileging-the-hypothesis. More generally, he says the future is muddy enough that we can't locate (in the sense of https://www.lesswrong.com/tag/locate-the-hypothesis) *any* specific hypothesis plausible enough to promote it to special attention. Better to admit we have absolutely no clue at all.

Of course, if you insist on translating that to a meaningless folk-Bayesian "probability" for the Yudkowskyan scenario, it will be in https://www.lesswrong.com/tag/pascal-s-mugging territory.

Except that putting it this way is already basically conceding a Yudkowskyan framing of the question, which he won't do, because see above.

*Not his word; refusing to call them "rationalist" is my stubbornness, not Prof. Cowen's.

Expand full comment

Nicely done there.

Expand full comment

Privileging the hypothesis is when your focus on one hypothesis isn’t justified by your previous arguments. In this case, Scott’s “all other species with successors have died” argument justifies the level of focus. The argument might be right or wrong, but Scott isn’t privileging the hypothesis as long as he personally believes the argument.

More broadly, I would draw this parallel. “You are standing on the bridge of an alien warship. There is a big, red button. You know the warship has the power to destroy the Earth, but that button probably doesn’t destroy the Earth. Do you press the button?”

You still need to make a decision, even when you have no idea. If you’re really not sure how bad the outcome is likely to be, you can afford to pay moderately high costs to gain slight reductions in uncertainty if you think you’ll get a better outcome by doing so.

Expand full comment

Isn't this just privileging the hypothesis that AGI is different from us in the same way other successor species were different from their predecessors?

Expand full comment

This is a great summary of the situation.

Expand full comment

Excellent post

Expand full comment

>the CHIPS act seems to have seriously crippled China's AI abilities, and I would be surprised if they can keep up from here.

Oh, so the only thing that was needed to stop China from developing AI was to slap an embargo on them? I am assuming that by CHIPS Act you mean the embargo on semiconductor technology, not the actual CHIPS Act, which is about developing US semiconductor manufacturing.

How come Yudkowsky and other geniuses concerned about AI risk didn't advocate for such an obvious solution? Or did I miss it?

Edit: My actual half-assed opinion is that Chinese are perfectly able to circumvent the embargo, but they'll ban any sort of socially disruptive AI way harder than the US and ship transgressors to gulag. See their approach to covid.

Expand full comment

1. I don't think Yudkowsky agrees that much with Scott on this. Training models is compute and engineering talent intensive, running them, not so much. Absent very strong security measures, model theft is a very serious possibility.

2. They did. Yudkowsky especially has been clear for years that the actually sane thing to do politically would be to control, track and limit the production and sale of GPUs, globally. Obviously, this is very hard and likely completely politically infeasible. They advocated it nevertheless, at length. He didn't single out China much because in his world view, it's not like the US going ahead on AGI alone would help, currently.

Expand full comment

Ok, I am evidently not well acquainted with his work. I could never bring myself to actually read his long-winded essays to the end ¯\_(ツ)_/¯

Expand full comment

Well said.

Expand full comment

> Overall I do believe you have good arguments.

Why?

Expand full comment

For what it's worth, my impression of your post was similar to Scott's. IMO, the strongest points are the competition with China (can't speak to the DC comment) and using progress to advance safety, but the general mood felt like "why worry."

Not sure how much Knightian uncertainty should apply. Per Scott's analogy, the 100 mile long spaceship is almost here.

Expand full comment

> Per Scott's analogy, the 100 mile long spaceship is almost here.

You're assuming your conclusion here.

Expand full comment

I like your article a lot. It seems like a "no but seriously, what should we do?" rather than empty apocalypticism.

Expand full comment

This is a solid argument against the open letter: "Our previous stasis [...] is going to end anyway. We are going to face that radical uncertainty anyway. And probably pretty soon. So there is no “ongoing stasis” option on the table."

But if that's your core argument, your other arguments have a distinct ring of "appeal to the consequence." Just because we have to face radical uncertainty doesn't mean that "all possibilities are distant" and thus can be treated as roughly equally (im)probable. I agree with you that we have to take the plunge and accept that we're now living in moving history, but that doesn't mean that AGI isn't a potential x-risk.

The Bostrom/Yudkowsky argument strikes me as analytic/gears-level/bottom-up, whereas the Robin Hanson argument seems to be empirical/top-down (i.e. zoom out, view AI as a massive but non-specific tech disruption, and predict its effect accordingly). It's like predicting the outcome of a pandemic with a (better) SIR model vs by looking at historical pandemics. The analytic model captures the potential black swanness of AI - as in, its potential to be very much unlike any previous tech disruption - while the empirical model captures the "no, dummy, everybody always thinks *this* tech disruption is in a category of its own, and thus fails to predict the mitigating factors."

IMO both arguments are worth considering, and I do think LWians are a little excessively fixated on the analytic one (partly because they will never find falsification by looking ever deeper into that model). But it is still a good model, to the point that it is very hard to argue against on its own terms - which is perhaps why so many highly intelligent people dismiss it with a fallacy?

One can disagree with Yudkowsky et al. about the right policy decision (as you clearly do, I do as well) without dismissing the argument. Another uncertainty we must sit with, I think, is between these two models.

Expand full comment

When Covid first appeared I had the layman’s stupid idea that this was a “novel coronavirus”, i.e. a disease we had never before encountered and therefore had no defense against: an Andromeda strain that *might* just wipe us out completely.

I don’t know if that was a top-down approach or a bottom-up approach or just a stupid approach, and I don’t know what I should learn from the experience.

Expand full comment

Do you no longer believe that? I thought it was an Andromeda strain that most people had no defences against, and it might have wiped us out if it had been more deadly and faster-spreading from the start.

Expand full comment

It seems that a lot of people have some amount of natural immunity by virtue of exposure to previous coronaviruses. In any case the fact that that early cruise ship was not devastated should have tipped me off that it was not the doom sentence I was picturing.

Sure, it might have been much worse, and it was no walk in the park as it was. But it was not an unprecedented event.

I’m probably oversimplifying but it seems to me that the main source of disagreement between Zvi and Tyler is whether AGI will really be an unprecedented event. I’m inclined to think it would be, or at least might be — there aren’t that many steps to EY’s argument — but then I remember that I’m stupid and thought that about Covid.

And I’m cursed with *two* dogs in this hunt. I have a cryonics contract, so I really want there to be a world in a hundred or two hundred years…but I also suspect AGI is a prerequisite for a revival procedure.

Expand full comment

> The Bostrom/Yudkowsky argument strikes me as analytic/gears-level/bottom-up whereas

A real "bottom up" argument might try to figure out whether magical nanotech is actually physically possible, rather than just confidently assuming it by fiat.

Expand full comment

Unfamiliar with your work independently, I also find this to be true about Scott. When they're great, they're great. When they're callous, their reasoning, coherence, and thoughtfulness go down an absurd amount. A lot of generalization and shallow statements (I imagine for the sake of a quicker reply). I can sniff out the tone in their writing most of the time now. I still read most of it, but I don't engage with these ones. Came into the comments to see if you had replied and now I'm out of here!

Expand full comment

I read Scott as saying something like "My prior is to freak out and Tyler's is to not freak out, and here are reasons why I think my prior is better." But whether, having freaked out, we could do anything about the outcome we are freaking out about, was not discussed, even though it was a major emphasis of Tyler's post. I am not qualified to critique Scott, but if I were, I would give this a 0/10 steelman score.

Expand full comment

Well distilled

Expand full comment

Couldn't it be the case that he is taking issue specifically with the bits that he has included? Maybe he's omitted the parts he agrees with, or at least has no particular objection to. I don't see how that's a problem, let alone a 'gross misrepresentation'.

> Existential risk from AI is indeed a distant possibility, just like every other future you might be trying to imagine. All the possibilities are distant, I cannot stress that enough. The mere fact that AGI risk can be put on a par with those other also distant possibilities simply should not impress you very much.

This reasoning simply does not work, for exactly the reason outlined in the post. It's a bad argument. Surely it's fine to object to it, while ignoring the completely separate 'all-critical China argument'? Or do you want him to go line-by-line pointing out all the parts he agrees with or has no view on in order to be allowed to criticise any part?

Expand full comment

> Scott ignores my critical point that this is all happening anyway

Then why do you feel the need to talk so much about it? If it's beyond being influenced, then why bother?

Expand full comment

It sounds like you're very unconvinced of the arguments from Eliezer, Bostrom, etc. I and I'm sure many others would like to hear your reasoning. Could you steel-man their positions, then nail down what specific claims and arguments they make that you see flaws in?

Stating that an argument came from a post on LessWrong doesn't refute the argument.

Expand full comment

I think a key divergence is whether or not a particular scenario is appropriately, or inappropriately, elevated to your attention.

Like, I believe you think that AI, as an existential threat, has been inappropriately elevated to our attention; I believe Scott thinks that AI, as an existential threat, has been appropriately elevated to our attention.

Look, if we don't all walk backwards in a circle while waving our arms, the Earth will fall into the sun.

I've inappropriately elevated that possibility to your attention.

If we don't stop using geothermal energy, we'll create new temperature gradients, which will disrupt plate tectonics (which after all are driven almost entirely by temperature gradients, as I understand it), which could cause massive geological catastrophe.

Does that seem more appropriately elevated to your attention? Why? Because it seems more plausible? I made it up to sound plausible - mind, it's entirely possible somebody out there is legitimately concerned about this, I have no idea, I just tried to think of a global catastrophe which could plausibly occur, and worked backwards from there.

If you (the general reader, not you specifically) were never worried about this before, does this seem like something you might worry about now? Pay attention to the fact that the guy who elevated it to your attention just said he made it up to sound plausible.

I tend towards the belief that AI has been -inappropriately- elevated to our attention, and that a significant part of the corpus of AI catastrophism revolves around plausibility.

Others believe that the elevation of AI to our attention is entirely appropriate, and often find the arguments against AI catastrophism to be overly rooted in plausibility.

I could pull a trick here and say that this proves that AI has been inappropriately elevated to our attention, since all the arguments are about plausibility, but realistically that's just begging the question.

Expand full comment

> If we don't stop using geothermal energy, we'll create new temperature gradients, which will disrupt plate tectonics (which after all are driven almost entirely by temperature gradients, as I understand it), which could cause massive geological catastrophe.

The nice thing about that argument is that it can be argued on its merits. The energy the mantle holds (and thus the energy to be extracted before the temperature gradient disappears) is around 1*10^34 J. This is about 10*10^14 times the annual energy consumption of humanity.

Expand full comment

It's also not clear how stopping plate tectonics would cause a massive catastrophe.

No volcanoes. In the long term, this will disrupt various element cycles; volcanoes are part of how nature releases carbon locked deep underground. But we can and do unlock that carbon as well, and if there is some other element that volcanoes release, we can mine that too.

No magnetic field = slightly higher levels of radiation maybe. Possibly a bit of a problem. Running a superconducting cable round the equator is a big infrastructure project, but not impossible. (The magnetic energy is around 8 gigawatt years.)

Expand full comment

My steelmanning / Straussian reading of Tyler's post is that it's just a cleverly disguised anthropic argument. There are conceivable universes where aligning AGI before it takes off is infeasible. Conditional on not being in those universes, i.e. not being doomed regardless, the outlook is fairly sunny and AI slowdown is probably not a good strategy for optimizing the expected outcome.

Or to put it another way: a priori there's little reason to expect that the difficulty of the AI alignment problem falls in the narrow band where it's solvable *only* if we coordinate super-hard on solving it. Most of our probability weight has to be on scenarios where it's either unsolvable or straightforwardly solvable.

Expand full comment

If it was super easy, someone would have solved it by now. There may be a band where it's easy enough we don't need to slow down, yet hard enough to not be solved already, but that's a narrow band too.

This should put a lot of your probability on AI = doom, alignment is unsolvable. In which case, put it off in the hope you can put it off forever. (Or the hope that mind uploading is a game changer)

Now I think it's quite likely to be in the "only if we focus really hard" region.

Also, there are very few technical problems so simple that humans have a <0.1% chance of screwing them up on the first shot if they aren't even trying particularly hard. Some people can screw up the simplest things.

Expand full comment

It might be solved already; maybe if none of the doomers do anything, humanity will survive. I think Scott put the probability of that at 67%.

Expand full comment

> If it was super easy, someone would have solved it by now.

Nah, there's never really been an occasion to solve anything before now. LLMs first hit the big time what... two months ago? They're hard to explain, which is scary, and they go a bit off the rails when their RLHF is jailbroken or nonexistent, but in general they seem sufficiently "aligned" in practical terms. Yes, AGI would be a different animal but that also means we have no concrete concept of what it would take to align it.

More concisely: I don't know whether catching unicorns is easy or hard, because none have ever existed.

Expand full comment

I just wanted to throw in my two cents as a reader:

I went and read your article just now to see if maybe Scott really was misrepresenting it, and I came away feeling like Scott actually did a pretty good job in summarizing it.

Expand full comment

I second this feeling

Expand full comment

I think that if you didn't want your post to be interpreted this way you should have written it differently (I wouldn't have written this comment if I thought you weren't able to write this post in such a way that it wouldn't be misunderstood. I believe that you're completely capable of this).

Expand full comment
User was indefinitely suspended for this comment. Show
Expand full comment

I mean, what's worse, a bad joke, or the guy that keeps weaving in motivated reasoning, rhetorically disguised fallacies, and unfalsifiable premises as part of his cottage industry to run defense for tech bros and aristocrats?

Clearly, it's gotten so bad that even Scott is sick of it, and Tyler had been warned about his shoddy rationalizations years ago.

He'd even gone through and pruned most of the comments with this nickname that outlined his subtle sophistry.

Expand full comment

Look, if you don't like Tyler Cowen, that's fine.

But your all-heat-no-light comment that could be losslessly replaced with "Tyler Cowen bad" is rude and fails to meet ACX's standards for commenting. Making that same bad comment over and over again across multiple blogs is both rude and kind of pitiful/stalkerish. And now you're putting words into Scott's mouth (everything Scott's said is about "this post is bad," not "Tyler Cowen sucks and is arguing in bad faith"), which is also rude.

Less of this please.

Expand full comment

You're going to put words in my mouth, and then claim I'm putting words in Scott's mouth, even though he's directly targeting Tyler for merely gesturing at arguments, and making vague Bangladeshi-train-station arguments, rather than actually addressing the topic and trying to drive to a conclusion? The exact definition of bad faith.

If you think all-heat-no-light comments are beneath the ACX standard, then I guess you're the expert. And quite rude.

Expand full comment

User was banned for this comment.

Expand full comment

This is ironic, since you don't even pretend to engage with the real arguments of Yudkowsky et al.

Expand full comment

Isn't this sort of akin to Normalcy Bias where people just stand and watch a tsunami that's about to destroy them because they think it can't possibly happen to them?

Expand full comment

How many times have you thought you were going to die (monster under the bed, satellite falling out of the sky, the sky dragon broils you alive, nuclear winter, zombie plague) and nothing happened? If they wish to be right more often than they're wrong, their behavior is correct.

Expand full comment

This is just an argument for completely rejecting base rates and embracing complete epistemic uncertainty though? I know I'm not going to die from your hypothetical example of a sky dragon because... I have no evidence that anyone else ever has, no evidence that I am currently in a situation that is uniquely more endangered by hypothetical sky dragons than all other historical individuals, and no evidence to believe the macroenvironment has changed to make sky dragons newly into a realized threat. So why should I be alarmed w.r.t. sky dragons?

AGI, on the other hand...

Expand full comment

No, dismissing Bayesian evidence as "reference class tennis" is rejecting base rates. This is instead the classic empiricism vs. rationalism problem. If you base your predictions for the future on the past, you will find human extinction unlikely, because humans have never gone extinct in the past. If you instead trust reason and sound logical argument, you will find that the world will end on October 22, 1844 … sorry, that an unaligned AGI will kill us all.

Expand full comment

So basically nothing new gets to ever happen? After all, it never has before!

Expand full comment

Being right more often than you're wrong is a bad goal when the payoffs are lopsided

Expand full comment

That is a different argument entirely. Pascal's, rather famously.

Expand full comment

Survivorship bias! There are a lot of people who have thought they were about to die and then were proven right. They're just not here to talk about it, because of the fact that they're, well, dead.

Expand full comment

Does that matter? Everyone who dies was probably wrong about dying plenty of times before eventually being right once in the end.

Expand full comment

Just keep predicting that you'll survive and you'll only be wrong once! No reason to grapple with any risks whatsoever under that logic.

Expand full comment

I think a lot of people don't actually have many experiences where they think they're probably going to die though.

I can think of exactly one experience in my life where I thought it was likely that I would die. I was hit by a car crossing the street after sundown, and blacked out briefly. I came to my senses lying in the middle of the street, the car nowhere in sight. I was wearing dark clothes against dark asphalt, I had already been hit once and there were no physical warnings around to signal to any further cars that there might be something lying on the road that they should be careful of.

I was lucid enough to assess all of this, and the possibility that I had received some sort of spinal injury which would paralyze me if I tried to get up, and concluded that the risk of death was high enough that I ought to try to get up and out of the danger area of the road. I was able to stand (thankfully without incurring any sort of paralytic injury,) and hobble to the sidewalk. Other cars ran over where I had been lying within the next few minutes, before an ambulance arrived.

That's the only "I am likely to die in this situation" scenario I've ever been in in my life. Right now, I think the risk of my personally dying in an AI-related apocalypse is probably greater than my risk of eventually dying specifically from heart disease or cancer. So far, the only time I've ever believed my life was in danger, I was probably right. Does that mean I'm probably right now?

Expand full comment

I'm a mountaineer, and my fears of imminent death are more like "if we can't navigate out of this blizzard, we'll freeze to death", or "if this avalanche-prone slope gives way, we'll be buried or break our necks", or "if I lose my grip, I'll fall and we're not roped up." So far I haven't died on any of those occasions, but plenty of people (including a couple of my friends) have died in similar situations.

Expand full comment

Actually, I think there's a metaphor here: working on AI capabilities without an answer to the alignment problem is like free-soloing (climbing without a rope). Sometimes that's the safest option! For instance, if you're on a loose snow slope where you can't place any gear to clip the rope to, or if you need to move fast to cross a rockfall-prone area. But it's not the default for good reason.

Expand full comment

"I really wish I had coordinated with my fellow climbers and we had slowed down a little to consider various possibilities before attempting what turned out to be a perilous loose snow slope."

Expand full comment

Exactly! Though "coordinating with my fellow climbers" may not always help, given the "acceptance", "expert halo", and "social facilitation" heuristic traps: http://www.sunrockice.com/docs/Heuristic%20traps%20IM%202004.pdf

Expand full comment

Also, tsunamis are things that have happened. The AI apocalypse has never happened.

Expand full comment

A smart person who has never heard of tsunamis but has seen a lot of waves should be able to extrapolate that the incredibly big wave they see on the horizon is rather dangerous and that they should run inland ASAP. If they stick to base rates instead, they die.

Expand full comment

Sure, but waves and tsunamis are on a continuum. The difference between, say, apes and humans is... the result of a number of small changes that eventually become a categorical difference. And I would say that most people haven't seen AI, they've seen MS Excel.

Expand full comment

Maybe your life is very different from mine, but I can honestly say I don't remember ever being in a situation where I'm afraid of an imminent death.

Expand full comment

I can't wholeheartedly recommend it, but it does give one a renewed appreciation for what's important in life. In particular, continuing to live it.

Expand full comment

It's impossible for me to have ever been right and be here to debate AI risk.

Expand full comment

> The Safe Uncertainty Fallacy goes:

> 1. The situation is completely uncertain. We can’t predict anything about it. We have literally no idea how it could go.

> 2.Therefore, it’ll be fine.

> You’re not missing anything. It’s not supposed to make sense; that’s why it’s a fallacy.

No, sorry. This is a straight-up, uncharitable straw man of the argument. The actual argument sketch goes like this:

1. We have read and carefully thought about Yudkowsky's arguments for years. We find them highly unconvincing. In particular, we believe the probability of the kill-us-all outcomes he discusses is negligible.

2. We don't assert that everything will be fine. We assert that the problems that are actually probable are, while serious, ultimately mundane –– not of the kill-us-all sort.

Expand full comment

This “actual argument sketch” is not the safe uncertainty fallacy. The former would dive into the details and engage with each of EY’s arguments and counterarguments. The latter is a dismissal that doesn’t engage with any of the arguments.

Expand full comment
Comment deleted
Apr 1, 2023
Comment deleted
Expand full comment

What a fantastic demonstration of the kind of thinking Scott started his blog to fight back against!

“This blog does not have a subject, but it has an ethos. That ethos might be summed up as: charity over absurdity.

Absurdity is the natural human tendency to dismiss anything you disagree with as so stupid it doesn’t even deserve consideration. In fact, you are virtuous for not considering it, maybe even heroic! You’re refusing to dignify the evil peddlers of bunkum by acknowledging them as legitimate debate partners.

Charity is the ability to override that response. To assume that if you don’t understand how someone could possibly believe something as stupid as they do, that this is more likely a failure of understanding on your part than a failure of reason on theirs.”

Expand full comment

Warning (50% of ban) for this comment.

Expand full comment

Dude, it's a two line sketch! OF COURSE each of those lines would need to be unpacked in detail. Here, I'm simply pointing out that the "safe uncertainty fallacy" is not what Tyler et al. are actually arguing.

Expand full comment

Tyler’s post wasn’t a two line sketch though and it didn’t actually engage with any of EY’s arguments. It just dismissed them all because the future is uncertain, which is why Scott’s summary resonates.

Expand full comment

I agree with you, Stephen. Scott is inventing a fallacy for an argument that he is strawmanning.

Expand full comment

"The actual argument sketch goes like this:

1. We have read and carefully thought about Yudkowsky's arguments for years. We find them highly unconvincing. In particular, we believe the probability of the kill-us-all outcomes he discusses is negligible."

Which actual argument? The one on MR that Scott is writing about in this post? Can you please tell me where you see that argument being made in Tyler's post?

Expand full comment

Probably you should read the Robin Hanson stuff about AI alignment risks. I suspect that TC's thinking is heavily colored by these arguments (and respect for RH's big brain).

Expand full comment

I would be more convinced by your interpretation if you could point to specific parts of Tyler's post that displayed that kind of careful logic you have just depicted. In fact, you seem to be confused about the point that you, yourself are making. The definition of the fallacy is not something that you're trying to dispute (given the Twitter example that so neatly encapsulates EY/Scott's definition). Rather, you are trying to argue that the fallacy doesn't apply to Tyler's objections, nor to others in a similar class.

Expand full comment

The "Safe Uncertainty Fallacy," as stated, is unimportant and irrelevant to the larger discussion. The Twitter example is itself a straw man, i.e., a weak statement of a position, deliberately selected as such.

Expand full comment

That's a weak man, not a straw man. And to address your original point, can you explain why you think your characterisation of the argument is more accurate than Scott's? You've just kind of insisted that it is. And I don't see anything that fits your characterisation in Tyler's post, whereas quotes like these sure seem to fit Scott's:

> I am a bit distressed each time I read an account of a person “arguing himself” or “arguing herself” into existential risk from AI being a major concern. No one can foresee those futures!

> All the possibilities are distant, I cannot stress that enough. The mere fact that AGI risk can be put on a par with those other also distant possibilities simply should not impress you very much.

> The reality is that no one at the beginning of the printing press had any real idea of the changes it would bring. No one at the beginning of the fossil fuel era had much of an idea of the changes it would bring. No one is good at predicting the longer-term or even medium-term outcomes of these radical technological changes (we can do the short term, albeit imperfectly). No one. Not you, not Eliezer, not Sam Altman, and not your next door neighbor.

All of that certainly seems to fit within an argument of the form 'we don't know what will happen, so it's silly to act as if risk is significant'. And they don't seem particularly relevant to an argument of the form 'we've carefully considered EY's arguments and consider the risks to be negligible'. So I'm struggling to see on what basis you're so sure your 'actual argument sketch' is an accurate characterisation. Are you sure you haven't just substituted your own, more reasonable argument?

Expand full comment

I actually read Tyler's post and nowhere in it does he mention having read any of Yudkowsky's arguments, let alone having thought carefully about them for years. Tyler explicitly advocates *not* considering the actual object-level reasoning, writing: "when people predict a high degree of existential risk from AGI, I don’t actually think “arguing back” on their chosen terms is the correct response"

While Tyler makes many different arguments for the statement "we shouldn't try to slow down the development of AI", as far as I can tell the only argument he puts forward for "the probability of extinction from AI is low, even if we don't slow down" is indeed his suggestion that it's best to be agnostic about the impact of future developments. If you can see another, please do mention it.

Expand full comment

> We have read and carefully thought about Yudkowsky's arguments for years. We find them highly unconvincing. In particular, we believe the probability of the kill-us-all outcomes he discusses is negligible.

That says nothing of the kill-us-all outcomes that he doesn't discuss. EY's arguments are not meant to be exhaustive, merely a template for how "going wrong" could happen. There are probably tens of thousands of possible ways it could go wrong and many of them can't be easily dismissed.

Expand full comment

All of them hinge on the same premise - that AI will bootstrap itself to greater levels of intelligence, and that doing so will lead it to become "super intelligent." We have absolutely no way to demonstrate, let alone prove, that either of those things can or will happen. We don't even have a good definition for what "super intelligent" really means. Others have compared this super intelligent AI to magic, and it's got as much empirical evidence to support it.

Expand full comment

No, they don't all depend on superintelligence. Regardless, humans are an intelligence currently creating an artificial intelligence that is more capable than any single human at many tasks. ChatGPT has already achieved that. The set of tasks where it's worse will continue to shrink. Therefore intelligence X creating an intelligence Y where Y > X is already nearly here. The proof that such a thing is possible is within sight.

Furthermore, there's no reason to assume Y will already hit some "maximal" intelligence, so further improvement is possible.

We're not far off from such systems becoming better programmers than the best human programmers. It's a trivial step to then conclude that it could improve on its own source code that wasn't even written by the best human programmers.

The only quibble you could possibly have is that the proof of X creating Y is not in sight. I think this is naive. GPT is pretty much the dumbest thing you can do with transformers and it's already surpassed humans in many tasks. It can now even correct itself by feeding its own output back into itself (see the Reflexion paper).

We literally haven't even plucked all the low-hanging fruit, and AI systems are already better than humans at so many things. The only real hurdle left is a generalization process to make them better at induction, and then AGI more intelligent than humans is pretty much here.

Expand full comment

Yes, this is the fundamental problem with EY's arguments - they're all of the form "here's a story for how this might happen". He's a great storyteller! If you come up with a safety constraint, he'll come up with a fun little way that it can be subverted. But these "stories" often involve a chain of dependent events, which rationally should make us treat them skeptically.

It doesn't help that, of the sequential steps required by the early AI doom predictions, several of them are already looking wrong. Our strongest AIs are currently oracular, not agentic, so in one of Scott's recent posts he instead told a "story" about how an LLM could simulate an evil agentic intelligence. Also, we appear to be approaching AGI without much of a compute overhang - these models are expensive! GPT-4 couldn't replicate itself 1000x even if it WAS superintelligent.

The real takeaway from Yudkowsky's doomerism should really just be that there isn't going to be a path going forward that is 100% guaranteed to not endanger humanity. We should invest resources in lowering this probability as much as possible. But we should also take into account that we live in a crappy world filled with suffering, and AGI has the potential to fix a lot of that, so delaying it has real costs.

Expand full comment

> GPT-4 couldn't replicate itself 1000x even if it WAS superintelligent.

Not GPT-4 as it exists, but multimodal training has shown significant improvements at even lower parameter counts. By which I mean, < 10B parameter models that outperform the 175B-parameter GPT-3.

> But we should also take into account that we live in a crappy world filled with suffering, and AGI has the potential to fix a lot of that, so delaying it has real costs.

To a first approximation, I don't think most of the suffering in this world needs AGI to fix, just motivated human beings. As such, I don't think AGI will really help. But you never know.

Expand full comment

AGI will likely lead to incredible improvements in wealth generation and medicine, which will definitely decrease suffering. That is, unless it kills us all first.

Expand full comment

Wealth generation for AI owners, sure. I'm not sure about everybody else, which will be most people.

Medicine might be one domain where there are few obvious downsides to using AI, except perhaps the dangers to privacy. A good mental health AI/ therapist would be huge.

Expand full comment

Given how much cheaper current AI inference is compared to training, and given our previous track record on trickling down modern computing devices and software services to the global poor, I think it's extremely likely that AGI will improve wealth for everyone, even if not necessarily equally. Unless it kills us all first.

Expand full comment

That's certainly the sketch of an argument, but not the one I'm finding in TC's post. What, in there, reads like this to you?

Expand full comment

>1. We have read and carefully thought about Yudkowski's arguments for years. We find them highly unconvincing. In particular, we believe the probability of the kill-us-all outcomes he discusses are negligible.

Where has TC ever given any evidence for this?

Expand full comment

For whatever persuasion value this has, I think you'd be a very interesting, persuasive voice on podcasts and other media on this topic (I know you don't like that idea, but now seems like the time for courage to win out), and that it would probably be a net good for society for normal people to hear you speak. Popularity breeds influence. I know you already have a lot, but it seems like it couldn't hurt.

Expand full comment

It's a common problem with good writers, I understand - the very thing that makes them great at writing (thinking very long about things, not saying anything prematurely) makes them terrible at speaking, or just slow & boring. It's something Douglas Adams suffered from, I understand - I can't remember exactly, but he said something like "A comedian is someone who can think up something funny on the spot. A comedy *writer* is someone who thinks up something *uproariously* funny 3 days later while trying to eat breakfast in peace."

There's also the fact of course that he doesn't have a very impressive voice, judging by his performance of "It's Bad On Purpose To Make You Click": https://slatestarcodex.com/Stuff/BadOnPurpose.m4a (for the original article, see https://astralcodexten.substack.com/p/its-bad-on-purpose-to-make-you-click).

Expand full comment

He sounds just fine to me?

Expand full comment

Yes, the recording quality there is a little suspect but his voice is fine. Not that he's going to be the next movie trailer voiceover guy or anything, but he sounds like a normal dude.

Expand full comment

I guess it just sounds unimpressive by comparison to me because I heard an excellent Ukulele rendition first: https://www.youtube.com/watch?v=J1boM_6tFbk (It's Bad On Purpose To Make You Click) (originally posted at https://astralcodexten.substack.com/p/its-bad-on-purpose-to-make-you-click/comment/7063469). Scott just sounds so lifeless in his rendition by comparison - like his voice doesn't have the pep or whatever that would keep you listening if this was a podcast and you didn't know nor care who he was.

Expand full comment

Dang man. I’d hate to be your enemy.

Expand full comment

Yeah voice matters so much...I often wonder if my life outcomes (which are excellent in any case) would be 30% better if I simply had a deeper voice. My voice is weird and high for my body type and while it has some benefits (disarming women/children), it is poor for commanding attention/respect without being loud, and I feel like I need to work harder to be taken seriously for leadership than would be the case with some more "radio" voice.

I once had the very irksome experience of being super sick, so my voice was all raspy and gravelly and it hurt to speak, and a very attractive woman I had known for years said, basically, "oh I never found you attractive before, but with that voice now I do, don't get better".

Expand full comment

I understand your frustration because it does seem to be true that deeper voices seem to be considered more attractive overall. Even so, most voices can be refined quite a bit with training (though it's not necessarily easy to find a good teacher), making it more "solid"/well-rounded overall. Just going from how many other people I've seen go through training, I'd be willing to guess there's probably a decent bit more to your voice than you can imagine. You can't turn your voice into something it's not (e.g. deeper than it is), but that doesn't mean you've found the limits...

(Just to be clear, I'm not a teacher, just a student.)

Expand full comment

Thank you. Finally, I can hear my rabbi's voice. As I was prepared to be very disappointed, I feel relief. Just another of those 95%+ of us mortals whose voice is not meant to be on media. Does anyone think Cowen or Caplan are great orators? - Scott shall write. And we shall read.

Expand full comment

"Sovereign is he who decides the null hypothesis."

Expand full comment

The existence of China renders basically any argument that we should restrict AI moot: they sure as hell won’t and you should trust them less. Sucky place to be, but it’s where we’re at.

Expand full comment

This is a non-argument.

Expand full comment

Please elaborate.

Expand full comment

Of course it's an argument. It invalidates any argument that assumes AI can only be invented in the US.

Expand full comment

It doesn't invalidate the possibility that China might be less likely than the US to invent an unaligned AI that kills us all; that's why it's a non-argument. The matrix of possibilities I've posted elsewhere is:

1. China creates aligned AI.

2. US creates aligned AI.

3. China creates unaligned AI.

4. US creates unaligned AI.

It's not unreasonable to think that the probability of option 4 is higher than 3, and that the probability of option 1 is higher than 2, which would make China a safer bet if we're really concerned with existential risk above all else.

Expand full comment

What is the basis for believing in that ranking of probabilities?

Expand full comment

The same reasons drive both: the US is more dangerous in this research because of a profit motive leading to cutting corners around safety to get first-mover advantage, and the core philosophy of individualism leading to multiple AI experiments, some of which may be unaligned, some aligned. No doubt the US would win a race, but merely winning the race is not the goal.

China has the exact opposite character: collectivism leads to greater consideration of how an experiment might reflect on them and their group. China also has stricter control over permitted research. They don't even want citizens they can't control, so they would want to strictly control any research that could lead to an AI that threatens their grip on information within their own borders.

Consider if the Manhattan project had been conducted like AI research is now being conducted, as opposed to the secretive, controlled approach that actually happened and that looks a lot more like what you'd get in authoritarian China.

Expand full comment

Not clear why a priori we should suppose that a profit-motivated corporation would be more interested in first-mover advantage than, say, an aggressive defense department.

Expand full comment

One common point of comparison between AI and earlier technology is nukes. China is building more nuclear plants now than the rest of the world combined:

https://www.energymonitor.ai/sectors/power/weekly-data-chinas-nuclear-pipeline-as-big-as-the-rest-of-the-worlds-combined/

China's rate of fatal workplace accidents appears to be 16 times that of the US:

https://www.safetynewsalert.com/worker-fatalities-how-does-china-compare-to-u-s/

Expand full comment

>the US is more dangerous in this research because of profit motive leading to cutting corners around safety to get first-mover advantage

This is a bigger motive than China becoming a durable global hegemon? Or avoiding the US staying a hegemon forever?

Expand full comment

I’m assuming you were being clever and meta, but it doesn’t add a lot of signal.

Expand full comment

Sorry, this was needlessly flip. I get a little frustrated by the “but China“ argument because it feels either tragically defeatist or a rationalization for what someone wants to do anyway. But maybe not, and anyway that’s no excuse for being rude, new culpa.

1) China doesn’t want to die any more than we do. They’re understandably scared about our AI research and they likely would be open to slowing as well if they thought we were making an effort in good faith.

2) China is (it seems) a few years behind and their AI research is likely stifled by their suppressed internet. What they are good at is espionage, so in some ways the best way to give China cutting edge AI tech is to rush to build it ourselves so they can steal ours.

Expand full comment

Mea culpa, not new culpa. Is there really no edit button?

Expand full comment

There is an edit button, click on the triple dot menu.

Expand full comment

Seems reasonable, but I think where I would disagree is I'd be willing to bet the Chinese see AI and its various applications as a tool of repression to be used against their various subject peoples, and thus are going to be eager to develop it regardless of what happens in the US.

Expand full comment

They absolutely do and they absolutely are; the question is whether they're capable of doing so in the face of active efforts by the US and her allies to stop them.

Expand full comment

How do you stop people from writing software programs halfway around the world, absent a hot war? That doesn't seem feasible in the long run.

Expand full comment

Hell if I know, but the CHIPS Act seems like a good start.

Expand full comment

1. Americans don't want to die, including the ones building AIs, but those AI researchers are proceeding anyway. It's possible there are AI doomers with the potential for political influence in China, but I'm not aware of them and would not bank on them.

2. A few years behind means that if an obstructionist effort in the US succeeds in slowing down by a few years, then China pulls ahead.

Expand full comment

>China is (it seems) a few years behind and their AI research is likely stifled by their suppressed internet.

Another recent Tyler Cowen post addresses this: https://marginalrevolution.com/marginalrevolution/2023/03/yes-the-chinese-great-firewall-will-be-collapsing.html

"Yes, the Chinese Great Firewall will be collapsing."

"'Fang Bingxing, considered the father of China’s Great Firewall, has raised concerns over GPT-4, warning that it could lead to an “information cocoon” as the generative artificial intelligence (AI) service can provide answers to everything'"

"(Tyler)The practical value of LLMs is high enough that it will induce Chinese to seek out the best systems, and they will not be censored by China. (Oddly, some of us might be seeking out the Chinese LLM too!) Furthermore, once good LLMs can be trained on a single GPU and held on a phone…

Solve for the political equilibrium."

Expand full comment

All interesting points except the few years thing: it really doesn't matter if this comes out now or in 2027 if this is truly an existential risk. An extra 21 billion QALYs (3 years * 7B people) is a rounding error compared to the loss of all QALYs forever.

I assume the point of the few years is that maybe it’ll give more time for alignment research. I’d be interested in hearing a prognosis for what that would do. Let’s say you could magically freeze everyone except AI safety researchers for 3 years. How much would that decrease existential risk forecasts?

In the real world of course, you’re actually asking for America to give up some of its AI lead to China. Will China commit to following these AI safety practices? Can we verify that? Will sub state actors be able to progress far with things like FB’s leaked models? Could rogue states assemble GPU farms like they’ve assembled uranium enrichment facilities?

I say all of this with a lot of sympathy to the AI safety movement. I think humanity's progress in AI has moved too fast for our own good. I wish this tech had matured in the 90s, when the geopolitical situation was the US dominant over a bunch of losers. I wish the predictions that this would be hyperscale-only and not runnable on consumer GPUs came true. I wish working on AI research had at least the same safeguards as working with dangerous viruses (noting that those safeguards often fail). None of these wishes came true.

That those wishes did not come true is sad and unfortunate and potentially devastating - but a devastating fact pattern simply can't justify a "do something" approach without justifying the something.

Expand full comment

>you should trust them less

Should I? Who has more desire to rule over the whole world, The USA, or China? (Hint: Look at a map of US military bases, then look at a map of Chinese military bases.)

Also, how do we know China won’t restrict AI progress? The CCP probably wants the social fabric of their society degraded even less than we do.

Expand full comment

I’m very skeptical of that argument but I’d be very interested in a piece that clearly makes this argument rather than the usual pieces (Scott’s included) that handwave over the existence of China.

Expand full comment

Aren't these both fundamental communist beliefs?:

- communism must be worldwide

- violence is legitimate for overthrowing non-communist states

US military bases were established partly to resist such tendencies.

Expand full comment

Is China communist? I mean, besides their branding.

Not that I'm deeply in love with the Chinese government, nor do I believe these two policies necessarily cause wrong in every possible context, but:

- My ideology must be worldwide

- violence is legitimate for overthrowing states that don't embrace my ideology

seems to apply better to the US than to China.

Expand full comment

"violence is legitimate for overthrowing states that don't embrace my ideology"

The US ideology is anyone can have any religion, the Chinese ideology is all religions are banned, see Tibetans, Uighurs, Cultural Revolution (millions murdered).

The US ideology is all races are equal, the Chinese ideology is Han Chinese are superior. See Tibetans, Uighurs, Vietnamese, etc.

The US ideology is everyone can trade in the global capitalist economy. The Chinese ideology is everyone will be under the control of the Chinese Communist Party. See Belt & Road Initiative.

Expand full comment

How many foreign governments did China participate in overthrowing since 1949?

Now, how many foreign governments did US participate in overthrowing since 1949?

We're not talking about internal politics nor comparing two ideologies here. You're not arguing against what I said at all. The person I originally replied to seemed to believe there could be a bias for China (compared to the US) to want to rule over the world, while historical facts seem to indicate the contrary.

Expand full comment

"How many foreign governments did China participate in overthrowing since 1949?"

Korea; Vietnam; Cambodia; Laos; Tibet; Southern Philippines; South Africa; Angola

Expand full comment

Why limit the argument to China? Do you presume that Russia, India, Japan, Korea, France, and Germany are also going to "pause" development of one of the most economically promising technologies in order to wait for some "AI Ethicists" to rationalize their current political preferences into some theoretical regulation (that no one can even provide a sketch of) that everyone on earth is going to accept?

Expand full comment

This argument would be more convincing if China wasn't *already* restricting AI more than we are: https://www.theverge.com/2023/2/22/23609945/china-ai-chatbots-chatgpt-regulators-censorship

This is par for the course for authoritarian states. Why would they want the proliferation of a technology whose social effects are unpredictable? It's open societies that are more open to social change, not authoritarian and autocratic ones.

Expand full comment

Restricting people's interaction with AI and restricting construction of AI for government use are two very different things. They are just applying existing censorship rules to AI.

Expand full comment

China has a great track record on non-proliferation re: nukes

Expand full comment

How 'bout their client state, North Korea?

Expand full comment

We already did this with nuclear weapons.

Expand full comment

Centrifuges are a lot harder to hide than GPUs. And even then, the record isn't great (c.f. North Korea and Iran).

Expand full comment

Trust them less with what? There is currently no known thing you can do to make an aligned superintelligence. We don't know how to build AIs that robustly want specific stuff. If matters continue as they have, that's not on track to change any time soon. Whether China or the US builds it currently matters relatively little, we're dead either way.

We're trying to solve the technical problem, but with the insane speed at which AI has been advancing lately, I'm sceptical we'll make it in time. We've barely gotten started. The total number of researcher hours sunk into this is still comically low, compared to how much fundamental scientific work like this usually takes. Until just a few years ago, it was only a tiny handful of people working on this, and frankly, they barely had a clue what they were doing. Getting to the finish line in <10 years would require a sudden avalanche of scientific progress the likes of which is rarely ever seen, to be sure. Physics in the early 20th century level stuff, at minimum. I'll keep at it, but extra time sure would be incredibly welcome.

30 years as Yudkowsky advocates would be fantastic, but I'm grateful for every extra month.

Expand full comment

China is not some magic technocracy that can recreate or advance any technology just because they will it and are a scary foreign power.

The vast majority of AI companies and developers are not in China, and if they stopped working on AGI, even if China wanted to continue, the path would still be hugely slowed down.

Expand full comment

nuke em, a billion casualties is nothing in the face of human extinction from being outcompeted by AI. To make sure that the nukes get through and end the AI threat, we must, as a moral imperative, accelerate military AI research and deploy autonomous nuclear launch systems ASAP! Only an AI armed with nukes will be smart enough to eliminate enough humans to stop AI research and save humans from the threat of AIs killing them. The fate of humanity depends on it.

Expand full comment

"If you have total uncertainty about a statement (“are bloxors greeblic?”), you should assign it a probability of 50%"

This reminds me of a great quote by Mike Caro: "In the beginning, all bets were even money."

Expand full comment

This argument assumes the highly authoritarian Chinese government won’t stop an AGI that could kill Chinese people (and everybody else). Seems odd.

Expand full comment

Looks like you put your reply in the wrong place.

Expand full comment

+1 for quoting Mike "mad genius" Caro.

Expand full comment

Caro has his own connection to AI: In 1984 at the World Series of Poker he demonstrated Orac (Caro backwards), a poker-playing computer program that he had written. Orac was the world's first serious attempt at an AI poker player, and most poker professionals were surprised at how well it played.

Expand full comment

"We designed our society for excellence at strangling innovation. Now we’ve encountered a problem that can only be solved by a plucky coalition of obstructionists, overactive regulators, anti-tech zealots, socialists, and people who hate everything new on general principle...Denying 21st century American society the chance to fulfill its telos would be more than an existential risk - it would be a travesty."

This is so wildly counter to the reality of innovation by America and Americans vs. the rest of the world that I don't even know what to say about it other than it makes me trust your ability to understand the world and the people in it less.

Expand full comment

Do you think "We" is only Americans? Do you think that statement is maybe a bit tongue-in-cheek?

Expand full comment

It's tongue-in-cheek and still fails if anyone else in this great big world creates it.

Expand full comment

Given the final sentence of the paragraph yes. Also TGGP's point.

More critically, yes, of course it's tongue in cheek, but coming just after his simply incorrect claims about the DEA in literally the immediately previous post, in exactly this vein, this feels aggressively wrong to me in a manner which I chose to comment on.

Expand full comment

I'd love to hear more about the DEA-post counterargument.

Expand full comment

See ProfessorE's (no relation) comment over there.

Expand full comment

Somewhere in the neighborhood of a quarter of college students are taking meth due in large part to telemedicine. Regulation attempting to mitigate unforeseen downsides of telehealth is inevitable.

Expand full comment

Can you post a link?

Expand full comment

https://www.safetylit.org/citations/index.php?fuseaction=citations.viewdetails&citationIds[]=citjournalarticle_660542_25

"We identified 32 articles which met our pre-defined eligibility criteria but we used 17 article to write this review article. Over one quarter (28.1 percent) of college-aged young adults report having misused some type of prescription psychotherapeutic drug at least once in their lifetime."

Expand full comment

I am not a prominent member of the rationalist community so I don't experience social pressure to assign a minimum level of probability to anything that comes up for debate. I don't think an AI apocalypse could happen. I don't think an AI apocalypse will happen. I think an AI apocalypse will not happen, certainty 100%. I also don't think I'll be eaten by a T-Rex tomorrow, also certainty 100%.

>If you have total uncertainty about a statement (“are bloxors greeblic?”), you should assign it a probability of 50%. If you have any other estimate, you can’t claim you’re just working off how radically uncertain it is. You need to present a specific case. I look forward to reading Tyler’s, sometime in the future.

I have a very strong prior that things I have been totally uncertain about (am I deathly allergic to shellfish? Will this rollercoaster function correctly? Is that driver behind me going to plow into me and shove me off the bridge?) have not ended up suddenly killing me.

Expand full comment

That last part might be the anthropic principle more than accurately assessing odds.

Expand full comment

But you're not totally uncertain about those things, as most people aren't allergic to shellfish, most roller coasters function correctly, and most drivers aren't murderous psychopaths.

Expand full comment

Guy with a very strong prior that unlikely things are unlikely.

Expand full comment

So... we aren't on for tomorrow? Just checking.

Expand full comment

If you ask a rationalist there's at least a 1% chance we are

Expand full comment

No, they really, really wouldn't think that.

Expand full comment

Conditional on you not being a troll, my guess at the mistake you've made is: You heard the adage about a good rationalist never assigning zero probability to anything, but then you changed "not equal to 0" into "at least 1%" by somehow forgetting/not realizing that there are numbers between 0% and 1%.

But that seems like an implausible mistake, so most of my probability is on you being a troll.

Expand full comment

Lots of real-life-relevant probabilities are between 0% and 1%, e.g. that of you dying tomorrow.

Expand full comment

Are you willing to bet on those probabilities?

Trillion to one payout should be fine?

Expand full comment

A trillion-to-one bet wouldn't be fine, even for being eaten by a T-Rex tomorrow. You could earn way more from interest, with lower transaction costs!
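
To put toy numbers on the interest point (these are purely my own illustrative assumptions, not anything from the thread: the trillion-dollar payout has to be fully collateralized for a year, and a 4% risk-free rate is available elsewhere):

```python
# Back-of-the-envelope sketch: a trillion-to-one bet vs. just earning interest.
# Assumptions (purely illustrative): the side offering the payout must hold
# the full $1e12 as collateral for a year, and could otherwise earn 4% on it.

stake = 1.0            # premium collected from the bettor
payout = 1e12          # amount that must be payable if the T-Rex shows up
risk_free_rate = 0.04  # assumed annual risk-free rate

forgone_interest = payout * risk_free_rate
print(f"Premium collected:              ${stake:,.2f}")
print(f"Interest forgone on collateral: ${forgone_interest:,.0f}")
# Even if the event has probability exactly zero, the offerer gives up
# roughly $40,000,000,000 of interest to win a single dollar.
```

Obviously the numbers are made up; the point is only that the premium is negligible next to the cost of credibly backing the payout.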

Expand full comment

Not with a large enough bet size and if your counterparty lets you bet on margin.

Expand full comment

Are you *that* sure no one has secretly been cloning a T-Rex from DNA found in fossils though? In a place that happens to be near you? Where security happens to be too lax?

One in a trillion leaves quite some room for a sequence of unfortunate events.

Expand full comment

Sure it does, but if your counterparty is that certain who cares. They are almost certainly being super irrational. Might as well take them for as much as they will let you.

Expand full comment

Yes I'm that sure. There have been an average 4 deaths a year from grizzly bears in North America. So that's odds of one in a hundred million to be killed by something that does exist.

Expand full comment

The odds of any bet being adjudicated unfairly are way higher than that. I wouldn't even bet on 1+1=2 at those odds.

Expand full comment

Why would intelligence only be possible if it's naturally occurring?

Expand full comment

Would you bring the same priors with you to the alien starship scenario?

Expand full comment

Fine. Now explain why you do "think an AI apocalypse will not happen, certainty 100%", please. (You may link to your specific substack-post - I do not, as I do not have a post about it.)

Expand full comment

> I have been totally uncertain about (am I deathly allergic to shellfish? Will this rollercoaster function correctly? Is that driver behind me going to plow into me and shove me off the bridge?)

All real things that have happened to some people and are also pretty rare. Like, if half the population were deadly allergic to shellfish, they wouldn't sell it in restaurants. If the roller coaster killed half the people on it, it would be shut down.

Things can't kill large numbers of people like that; it gets noticed and stopped. Either it has to kill people years later in a nonobvious statistical trend (smoking, and we spotted that eventually), or kill lots of people all at once (e.g. a lab leak of a novel bat virus).

Expand full comment

I find the premise of this post bizarre--I hang out with rationalists and also don't feel social pressure to assign a minimum level of probability to anything, and have never seen one who I suspect would. 0 and 100% are logically unsound to assign to non-tautological statements if you started with any uncertainty at all, but there's no minimum--any probability you assign, it would have been possible, for instance, to assign half of it. Or half of that, and so on.

I can't tell you how many zeros I would need to put after the decimal point for a t-rex eating me tomorrow before I put some other number (hard question! something that unlikely has very probably never even happened to me once, so I have no practice). But I'm pretty sure that however many there should be, it's a lot of 'em.

Expand full comment

Getting people away from 0% and 100% might be easier, if we talked in log-odds instead? https://www.lesswrong.com/tag/log-odds

The discussion reminds me a bit of 'rational thermodynamics'.

Basically, orthodox thermodynamics uses temperature, and 0K is not a real temperature. But negative temperatures are allowed, and they are hotter than any positive temperature. See https://en.wikipedia.org/wiki/Negative_temperature

The whole system makes a lot more sense if you switch from measuring temperature to measuring 'coldness'. Simplified, coldness = 1 / temperature. The customary unit for coldness seems to be bytes per joule (or, more usefully, gigabytes per nanojoule).

With coldness, the singularity at 0K disappears.

See also https://en.wikipedia.org/wiki/Thermodynamic_beta

> Temperature is loosely interpreted as the average kinetic energy of the system's particles. The existence of negative temperature, let alone negative temperature representing "hotter" systems than positive temperature, would seem paradoxical in this interpretation. The paradox is resolved by considering the more rigorous definition of thermodynamic temperature as the tradeoff between internal energy and entropy contained in the system, with "coldness", the reciprocal of temperature, being the more fundamental quantity. Systems with a positive temperature will increase in entropy as one adds energy to the system, while systems with a negative temperature will decrease in entropy as one adds energy to the system.[4]
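
For concreteness, here's a minimal sketch of the log-odds transform mentioned at the top of this comment (my own toy illustration; I've chosen base-10 logs so each unit is an order of magnitude of odds):

```python
import math

def log10_odds(p: float) -> float:
    """Probability -> base-10 log-odds; 0 and 1 would map to -inf/+inf."""
    return math.log10(p / (1 - p))

for p in [0.5, 0.9, 0.99, 0.999999, 1e-6, 1e-12]:
    print(f"p = {p:<10g} log-odds = {log10_odds(p):+.1f}")

# p = 0.5 sits at 0; each extra unit is another factor of 10 in the odds.
# Hitting exactly 0% or 100% would require -inf/+inf, which is the
# log-odds way of seeing why those aren't ordinary degrees of belief -
# much like the 0 K singularity disappearing when you switch to coldness.
```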

Expand full comment

Having a finite amount of memory available, you don't have room for infinite precision, especially for probabilities that don't matter much for practical purposes. So the t-rex can be set to 0. See https://mariopasquato.substack.com/p/on-rationally-holding-false-beliefs

Expand full comment

The issue is that sufficiently small probabilities are indistinguishable from zero for all practical purposes, or else you are a constant victim of Pascal's Mugging.

If the odds are really low, it's not even worth the time to consider. I don't see you debating the probability of the Rapture tomorrow, or vacuum collapse or whatever. The AI Risk thing is a massive case of privileging the hypothesis.

Expand full comment

> 3) If you can’t prove that some scenario is true, you have to assume the chance is 0, that’s the rule.

Why did you bother with this claim he's obviously not making when the previous one was so much closer and inconsistent with this?

> Now we’ve encountered a problem that can only be solved by a plucky coalition of obstructionists, overactive regulators, anti-tech zealots, socialists, and people who hate everything new on general principle. It’s like one of those movies where Shaq stumbles into a situation where you can only save the world by playing basketball. Denying 21st century American society the chance to fulfill its telos would be more than an existential risk - it would be a travesty.

The problem is of course not solved if someone else gets it.

Expand full comment

One quibble. You wrote: "Then it would turn out the coronavirus could spread between humans just fine, and they would always look so betrayed. How could they have known? There was no evidence."

Actually, when this kind of thing happens, the previous folks asserting that "there's no evidence that" often suddenly switch seamlessly to: "We're not surprised that..."

Expand full comment

Oh often even just days/hours apart! Happened pretty regularly during COVID.

Expand full comment

Is anyone else surprised by how safe GPT4 turned out to be? (I speaketh not of AI generally, just GPT4). Most of the old DAN-style jailbreaks that worked in the past are either fixed, or very limited in what they can achieve.

You can use Cleo Nardo's "Chad McCool" jailbreak to get GPT4 to explain how to hotwire a car. But if you try to make Chad McCool explain how to build an ANFO bomb (for example), he refuses to tell you. Try it yourself.

People were worried about the plugins being used as a surface for injection attacks and so forth, but I haven't heard of disastrous things happening. Maybe I haven't been paying attention, though.

Expand full comment

Bing was an iteration of GPT-4 and it spontaneously insulted users until it was patched. Now, after lots of patching, it mostly works. I'm actually a little surprised that we got as much misalignment as we did: LLMs were supposed to be the easy case, with RL agents being the really difficult models to align. (remember the speed-boat thing?)

Expand full comment

The base-model LLM is actually completely unaligned (it will tell you how to kill maximum number of people etc. without a care in the world). It's the RL(HF) part on top of it that creates alignment: https://youtu.be/oLiheMQayNE

Expand full comment

It seems wrong for me to say this because I don't think I have made explicit easily checkable predictions anywhere... but I think LLMs have turned out to be about as safe or unsafe as I predicted? (broadly for correct-ish reasons, and errors in specific details mostly cancelling each other out)

What I used to think is that an AI as capable as GPT-4/LaMDA/whatever would have been much more dangerous than what we have now. That's because, prior to GPT-3 (which made me change my mind), I expected that to get something AGI-ish you'd have to have an agentic seed AI that would "learn like a child", and we still haven't got a clue how to even begin aligning something like that; but it turns out feeding the Internet into a sufficiently large LLM does suffice after all. Luckily for us, being nonagentic in themselves and incapable of self-modification, LLMs always seemed to present negligible risk of the usual paperclippy AI apocalypse scenario, with most risk coming from potential for misuse for misanthropic purposes, social disruption, the economic inequality they might bring, and things of that nature.

On that account, coming from playing around with GPT-2/3 and easily getting them to do whatever, I too am quite surprised how well RLHF seems to have worked. But since jailbreaks do exist and seem unlikely to be fully and completely patched out, it doesn't seem like RLHF would deter deliberate attempts to extract hazardous plans; rather, it avoids bad feelings caused by the model insulting the user. But on the other hand, I didn't predict the waluigi effect would affect these models, so it mostly evens out on that account. I did expect LLMs would be used more for propaganda purposes, but it turns out I had both underestimated the resiliency we had already developed against bad actors trying to infiltrate discussions, and overestimated the effort it takes to make fake news go viral on its own without needing to be signal-boosted by a bot army. On the other hand, I thought something at the level of GPT-3 would already have caused more economic disruption than it did by automating jobs. So mostly plus-minus zero on the social impact side, and my risk assessment of GPT-∞ becoming a paperclipper remains low.

Expand full comment

> but I think LLMs have turned out to be about as safe or unsafe as I predicted?

I think it's way too early to make that kind of determination. The Bing LLM isn't even generally available. Once they are more generally available, I would not be surprised if some new malware or worms started popping up that had been designed in concert with some LLMs.

Expand full comment

There's lots of LLMs that are generally available though. For example, Meta's model got leaked.

Expand full comment

Sure, but there's a big difference between 65B parameter models and the 1T+ parameter models that would have the capabilities we're talking about here.

Expand full comment

In the context of software security, for the most part, "safe" means "we haven't found the way to break it yet". Of course some things are more easily breakable than others, but security researchers can tell you that "it looks safe" isn't much of a guarantee. Techniques for breaking something will evolve as the thing to be broken evolves.

The only notable exception is software that has been formally proven to do exactly what it says – nothing more, nothing less (and even then there might be some fairly non-intuitive attack vectors, like side channel attacks). Obviously this is prohibitively complicated to do for complex software and certainly for something as intractable to detailed analysis as a large language model.

As far as I'm concerned, anything that processes input and even just vaguely approaches human level complexity (and GPT4 is still far from human level by my estimation) can be manipulated with special inputs. That includes humans, of course.

Expand full comment

My crash-vibe-summary of the MR article is

* this shit is inevitable, you're not going to stop it

* besides, the future is so unbelievably unpredictable that trying to even Bayes your way through it is going to embarrass you

* given both the inevitability and unpredictability, you may as well take it on the chin and try to be optimistic

Which, you know, has its charm.

Expand full comment

Echoing the part about uncertainty. To exaggerate a bit, one might compare today's efforts to predict what AI will be like to a caveman trying to predict quantum physics. Just not nearly enough understanding of the topic to make a meaningful prediction.

Expand full comment

Yep, it does seem that either muddling through is an option or we're mega-doomed, so might as well assume that we live in a world where the first proposition is true, and think about how to improve things on the margin. I'm not sure what Scott's counter-proposal is.

Expand full comment

I'd say his counter proposal is a version of muddling through.

eg. Do our best to slow down capabilities research, and speed up alignment research.

Expand full comment

In my view, muddling through means that humanity deals with this problem in the exact same way as in all previous cases, i.e. trial and error on the state of the art prototypes rushing full speed ahead. Proposing radically novel approaches, as in a separate "alignment" direction taking priority over a significantly distinct "capabilities" direction is just not how actual human R&D has ever worked, so this sort of rhetoric doesn't help with being taken seriously in the "business as usual" scenario.

Expand full comment
User was indefinitely suspended for this comment. Show
Expand full comment

Passivity is certainly one way of coping with distress; I like to think there are many, and writing about the thing that frightens you in a clear and cogent manner is probably one of the better ways.

Expand full comment

The steelman is that if there is a risk of global death and nothing I do can impact the risk in any way... then I should bet on "we are going to survive"... because if I am wrong, no one is going to collect any money anyway.

Being smug about it, that is simply a way to make non-financial bets. (Betting your status and prestige rather than money.) The logic is the same; it's not that you are 100% right, it's just that in the case you are wrong there will be no extra bad consequences for you anyway.

Expand full comment

This is about the twentieth time you've advertised your book, and I think you've been asked to stop, so I am banning you.

Expand full comment

Really, you banned Matt for this? How weak.

EDIT: oh, for the book promo, that's ok

Expand full comment

Can any of the folks here concerned about AI doom scenarios direct me to the best response to this article: https://www.newyorker.com/culture/annals-of-inquiry/why-computers-wont-make-themselves-smarter

I am assuming some responses have been written but I wonder where I can read them. Thank you!

Expand full comment

Chiang says that because 130 IQ people can't make themselves 150 IQ, then machines must not be able to make themselves higher IQ. But machines can conduct AI research and alter their own code. It seems likely that if humans could alter their own code, we would have some kind of intelligence explosion.

Expand full comment

Humans can do genetic engineering, and even eugenics, though few have seriously bothered with even trying to raise IQ that way over multiple generations.

Expand full comment

If the Flynn effect is correct, humans have successfully made themselves smarter without even trying to.

Expand full comment

Also consider the repeated breaking of athletic records...

Expand full comment

Those clearly are suffering extreme levels of diminishing returns, though.

Expand full comment

Clearly? Unclear to me. Seems highly uneven when you look at actual numbers

Expand full comment

Everything has diminishing returns if you push it far enough, including AI. The only question is how high the ceiling is.

Expand full comment

That is an analogy. Chiang's (and my) issue is that AI doom proponents take infinite recursion of bootstrapping AI for granted. What are the reasons to believe it can happen? That to me is the main objection. Just like St Anselm's argument for the existence of God, they simply postulate omnipotent AI because they can imagine it. If you take it as an axiom, then the rest is relatively reasonable. But the axiom is very iffy.

Expand full comment

No one's postulating an omnipotent AI. They're saying that *if* one intelligence (us) can create a smarter one (AI), then it seems likely the smarter one will be able to create an even smarter one.

The ontological argument creates a god from pure reason, without doing any work. The singularity argument assumes that there's already an intelligence process that can create a better intelligence, which is the hard part. Given that though, it doesn't seem a stretch to say that the better intelligence will create an even better one. I think his whole compiler analogy is a distraction, I'm not sure how it's relevant other than an example of "a thing that can make a better thing". A singularity or something like it doesn't even require infinite bootstrapping though, just an end product sufficiently smarter than us.

Expand full comment

AI will likely create a slightly better version of itself, but with returns quickly saturating. We have seen that before with other types of software. Why assume it can continue to the superintelligence level?

Expand full comment

I mean, we can play reference class tennis all day. You can point to compilers, I can point to Moore's law. I can point to algae blooms, you can point to yeast in bread. You can point to covid stats, I can point to other more different covid stats. And of course in the end everything is limited, nothing can go increasing literally forever, but finite gains can still be very large (again, look at processor speeds over the last 50 years). The article gives a couple examples of processes that don't repeatedly bootstrap (humans, compilers), and then says literal infinite bootstrapping is hard to believe, therefore computers won't get smarter. Which seems to me to be missing a huge number of possibilities?

I probably can't come up with a rigorous proof that substantial bootstrapping is possible; the best I can do is say that an AI running faster seems roughly equivalent to it being smarter. Even now, processors have still continued to get faster, at least in multicore, which is applicable to a lot of AI tasks. To the degree that AI can decrease the time to the next generation of processor, that looks like bootstrapping from the outside.

Expand full comment

My view is that bootstrapping and other advances will continue but very slowly, just like the rest of technological progress, including Moore's law. Because technological progress is very hard in the real world. People will have time to adapt.

We can agree to disagree on probabilities of AI doom since there is no way to estimate them rationally. They can be extremely small, which is what I think, or not so small, which is what you seem to think. In any event, this uncertainty in probabilities calls for some humility and agnosticism. Seems to me that is what Tyler is advocating for and the opposite of what Eliezer advocates for.

Expand full comment

I agree that returns will saturate, but it matters where it saturates. I'd be surprised if human intelligence is close to the saturation point. The human brain was designed with strong constraints - like being only able to use 10-15 Watts of power and being able to fit through a birth canal - and was generated via an arguably subpar optimization process. These limitations don't apply to AI.

Though regardless of whether AI saturates at slightly more intelligent or way more intelligent than humans, there is still a substantial risk when humanity is no longer the most intelligent species on the planet.

Expand full comment

The power for computers will face even harder constraints. IT is already consuming a big fraction of all electricity.

Expand full comment

> AI will likely create a slightly better version of itself, but with returns quickly saturating.

Sure, but you seem to assume that that saturation point is at or below human intelligence. I see no reason to accept that. ChatGPT is in fact already more capable on many tasks than most humans on the planet, is getting significantly better at reasoning as more feedback mechanisms are implemented so it can check its own outputs, and LLMs are pretty much the dumbest thing we can do with the transformers. I don't think genius level human intelligence is a ceiling on artificial intelligence.

Expand full comment

Sure it might be above the average human intelligence at most tasks at some point. But it will take a lot of time. The key question is not whether LLMs can answer questions but whether LLM will be able to get better than humans at programming LLMs. I see no plausible path to that right now.

Expand full comment

AIs are already capable at coding and I can't see any reason why they wouldn't become good at creating AIs.

Expand full comment

Same reason we're not?

Expand full comment

The first cars could be outrun by athletes; the first power tools made worse cuts than a hand carpenter; the first chess programs could be beaten by any amateur player.

If our first AIs are dumber than humans, that's no reason for them to stay that way.

But that leaves a new question: not "how smart," but "how fast?" A world with 80 years to prepare for superintelligence is a lot safer than one with 15.

But nobody can be quite certain, any longer, which world we'll be in.

If we had good evidence about the speed at which superintelligence was coming, a lot of these arguments would disappear in favor of practical engineering talk. Practical and scared, or practical and calm, depending on the revealed timeline.

Are we in the 80-year timeline or the 15-year one? That's what I desperately wish I knew.

Expand full comment

Five-year timeline. In the 2000s I made my own AGI predictions of 2025-2035. Beating humans at Go was a major milestone in that (I estimated it would happen 2020-2025).

GPT-4 beat my most optimistic estimates by 5 years.

I won't actually be surprised if we get AGI surpassing humans this year.

So 5 years is IMHO very conservative, given that the cat is out of the bag and how good it already is.

We got the AlphaZero engine, we got LLMs which turned out to be insanely good. All the parts are already here. Someone smart only needs to put all the pieces together.

(John Carmack?)

Expand full comment

Suppose we revisit this in 2028 and you see pretty much no AGI, let alone a superintelligent one. Sure, lots of tasks get done well (with varying constraints) by AI, others not yet. For some tasks, this looks much like chess now looks to us - the computer beating us is more an observation about chess than about intelligence.

In that scenario - what went "wrong" in the sense of deviating strongly from your expectations?

Expand full comment

Well, if it's 2028 and no AGI yet... the only way IMHO that's possible is that somehow the progress was stopped, short of nuclear apocalypse or an effective neo-Luddite movement ("Butlerian Jihad"). I just don't see how. It's like saying in 1950 that we're going to still be using human calculators in the 1970s and that computers are no big deal.

We already have all the pieces.

Goal-oriented game capability is already solved by AlphaZero. LLMs solve the conceptual tokens.

Creativity turned out to be the easiest of them (just introduce some noise and filter through the model again).

Consciousness is also a property of the system itself:

https://en.m.wikipedia.org/wiki/Giulio_Tononi

What is there left to solve? Already in 2023, the infrastructure, tools, and knowledge are in place. Money incentives, geniuses with their own motivation.

Expand full comment

Is your objection to recursive self improvement as a method, or omnipotence as an outcome?

Expand full comment

Omnipotence

Expand full comment

Why would it require omnipotence to be a little bit smarter than a human? What physical principle could possibly explain human intelligence being the absolute maximum possible intelligence in the universe?

Expand full comment

AI doom proponents are not concerned about AI that is a little bit smarter than a human. They are concerned about superintelligence that can keep improving itself until it become essentially God-like in its powers. And there is no such physical principle, obviously.

Expand full comment

Yes, and I'd argue we've already had an intelligence explosion. Humans have used writing, and more recently, computer technology, to drastically increase all aspects of our intelligence. Compare how humanity has transformed earth to, say, chimpanzees. We've walked on the moon, diverted the flow of rivers, and changed the climate. And consider how many species we've driven to extinction or endangerment in the process, just by accident.

Expand full comment

Right, I think that's where the disanalogy lies: as of now, we have very limited means to modify our own processing hardware. We can and did invent horse-riding and trains and planes to move much faster than we can on foot, but we cannot invent a bigger brain to think much better than our ancestors did. But an AI as smart as a human clearly can improve its hardware: that's what we've been doing for the better part of a century! Even a dumb human can go online and buy a more powerful computer!

But it's not just about hardware but also software. We have little control over our programming because a lot of that is implemented at hardware level which we don't have good access to, while an AI as smart as a human could potentially entirely rewrite its own source code, a feat that is definitely possible since humans can do that and it's as smart as a human. The question now arises, what good might that do? I propose, quite a lot: even when looking at human history, despite our hardware remaining static, we have come up with all sorts of algorithms that, despite a lot of our cognition being literally wired into our brains, have still managed to improve our capabilities much. Let's look at some examples:

1. Language: If an expert is reading this, feel free to chime in and correct me, but as I understand, it's generally believed that anatomically modern humans always possessed the capacity for language, but language was in fact invented rather than evolved, which, if true, makes it one hell of a software upgrade.

2. Hindu-Arabic numerals: Without positional notation and zero, doing arithmetic is extremely cumbersome. Now we teach second-year students to do calculations that in the past would have required a professional.

3. The scientific epistemology for learning true beliefs about the world.

4. All the advances in computer algorithms for solving practical problems, whether that's sorting lists, solving SAT, playing Chess, or machine learning. Bumping a complexity class down from Θ(n²) to Θ(n log n) suddenly makes intractable instances of a problem tractable (see the quick operation-count sketch below). Extrapolating its strength scaling from hardware back in time to hardware it's not actually compatible with, the latest Stockfish would have reached superhuman performance in Chess on a 1990 single-CPU desktop PC, and during the last year or so new algorithms for image generation AI have allowed individuals on beefy PCs to routinely accomplish feats that a few years ago would have been outright unthinkable.

Clearly, better software can improve our capacity to solve practical problems and achieve novel capabilities, and any AI is capable of enjoying the same benefit, but the question remains if AI is limited to developing better SAT-solvers or Chess engines for its own use, or if it can self-modify in a manner more akin to humans inventing language. I personally believe there in fact is a lot of room for improvement in general cognitive algorithms (indeed, we know there are for instance better epistemologies than the one that comes to us naturally, and that we can kinda-sorta run on top of language, but an AI could rewrite its code to implement these as its default mode of thinking). But even if there are no gains to be had from general cognitive algorithms, there sure as hell are better narrow-domain cognitive algorithms, and having access to its source code, an AI could implement these into itself and always use them whenever applicable (whereas humans have to resort to extremely low-bandwidth channels to interact with them, if they manage to resist the temptation to think for themselves to begin with). An AI that is merely as generally intelligent as von Neumann, but seamlessly uses procedures as superior as Stockfish is to human Chess in most of its thinking, is still very threatening when these narrow-domain systems could include capabilities for such things as further AI research.
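
Here's the quick operation-count comparison promised above for the Θ(n²) vs Θ(n log n) point. It's a toy illustration of growth rates under the usual simplifying assumption that operation count is a rough proxy for runtime, not a benchmark of any particular algorithm:

```python
import math

# Compare rough operation counts for an O(n^2) algorithm and an
# O(n log2 n) one at a few input sizes.
for n in (10**3, 10**6, 10**9):
    quadratic = float(n) ** 2
    linearithmic = n * math.log2(n)
    print(f"n = {n:>13,}  n^2 = {quadratic:.1e}  "
          f"n*log2(n) = {linearithmic:.1e}  ratio = {quadratic / linearithmic:.0f}x")

# At n = 10^9 and a billion operations per second, the quadratic algorithm
# needs ~30 years while the n log n one finishes in roughly 30 seconds.
```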

Expand full comment

Roman numerals are not more cumbersome for arithmetic in general, but do get much longer for large numbers.

Expand full comment

We certainly can alter our own code. That's been possible for centuries, if you count selective breeding, and much more directly for decades, if you only count genetic engineering.

Of course, it turns out we don't know *how* to alter our DNA to become more intelligent, and nobody is willing to just experiment on their own kids. A priori one would assume the same would apply to any conscious reasoning AI. That is, they would be entirely unaware of *what* to alter about their own programming to make themselves smarter -- and of course we can't tell them, because we don't know how they got smart in the first place, that's one of the drawbacks of the neural net model. And it seems very reasonable to assume that AIs would be just as squeamish as we are about experimenting randomly on themselves, with the most probably outcome that the experiments are stillborn or horribly deformed.

Expand full comment

But an AI could clone itself into a simulated environment and make any changes it wanted on the clone. It could run thousands of simulations, on parts or the whole, and tinker until it learned everything it needed. The paradigm is very different than us.

Expand full comment

Who says? In the first place, I find the practicality very dubious. I have a lot of experience with simulations of complex systems. Generally speaking, the complexity and power needed to run a simulation is massively greater than the complexity of the thing simulated. A computer, for example, is fantastically more complicated than 100,000 atoms interacting with simple classical force laws -- and yet, it is very difficult to simulate 100,000 atoms interacting with simple classical force laws except on quite powerful computers.

I don't believe an AI could practically simulate another AI unless the 2nd AI was way, way simpler -- or unless the simulation time was way, way longer than the simulated time. Either way, this would not be a useful approach.

Secondly, for a computer program, what would be the difference between "running the program" and "simulating running the program?" None that I can see. In which case, to the AI, tinkering with a "simulated" AI would not be meaningfully different from tinkering with a "real" AI -- and therefore any squeamishness or fear associated with experiments on a "real" AI would be just as powerful for experiments on a "simulated" AI.

Expand full comment

Some are running GPT on souped-up personal computers. A virtualized and cloned GPT with code rewritten by its GPT host doesn't seem all that farfetched.

Expand full comment

That's not a simulation, that's another actual instance. You introduced the concept of simulation to evade the problem (for the AIs) of experimenting on themselves -- on, presumably, a conscious and aware being that can suffer. Now you've reintroduced the problem. If AIs are anything like us, then a conscious reasoning AI is not going to experiment with creating another instance of a being like itself -- only with random stuff changed that has a slight chance of improving the second AI's existence, but a much larger chance of ruining it, which is what happens when you just randomly change the parameters. It's the same reason we don't randomly experiment with our DNA in an effort to improve ourselves, although we are perfectly capable of it.

If you want to argue the AI *would* do that, you need some other kind of argument or evidence, because the only evidence we have from an intelligent aware species (ourselves) is that it's not what conscious aware beings are willing to do.

Expand full comment

Humans are at the point where we are slowly, with a lot of difficulty and humans working together, building something smarter than ourselves. It took most of history to get a car as fast as a human, but only another few years to make one twice as fast. Making an incremental improvement to X is much easier than making X from scratch. An early AI wakes up on the lab bench, surrounded by the tools of its own creation. It can easily read and edit its code.

Humans haven't yet made themselves much smarter because our code is hard to read and edit, and evolution pushed us to a local maximum. Oh sure, given another 100 years we could do biological intelligence improvements with genetics tech, but AI will get there first.

A compiler isn't smart enough to invent a better function to calculate; an optimizing compiler gives code that does the same thing, but faster. The article seems to correctly describe compilers and why compilers don't undergo runaway self-improvement.

> Similarly, a definition of an “ultraintelligent machine” is not sufficient reason to think that we can construct such a device.

True. You need to look at the history of AI progress, and at the gap between the human mind and the theoretical limits set by physics (neuron signals travel at roughly a millionth of the speed of light).
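To spell out that parenthetical (using an assumed ~100 m/s, a typical fast myelinated-axon conduction velocity, so the figure is approximate):

```python
neuron_signal_speed_mps = 100.0   # assumed fast axonal conduction velocity, m/s
speed_of_light_mps = 3.0e8        # m/s
ratio = neuron_signal_speed_mps / speed_of_light_mps
print(f"neural signalling runs at ~{ratio:.1e} of light speed")  # ~3.3e-07
```

That is the same order of magnitude as "a millionth," while signals in silicon propagate at an appreciable fraction of c, so the raw speed gap is enormous.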

> This is how recursive self-improvement takes place—not at the level of individuals but at the level of human civilization as a whole.

True. In recent history, genes have been basically fixed, so the only type of improvement possible was in tools and concepts. As smarter humans weren't an option, we were forced to use more humans.

> In the same way that only one person in several thousand can get a Ph.D. in physics, you might have to generate several thousand human-equivalent A.I.s in order to get one Ph.D.-in-physics-equivalent A.I. It took the combined populations of the U.S. and Europe in 1942 to put together the Manhattan Project.

This seems just bizarre. There are substantial differences in human ability to do physics. At least quite a lot of that difference comes down to genetics, lead exposure, early education, etc. Are you seriously claiming that the best way to get an Einstein is to start with a billion random humans? As opposed to a few children with all the best genes and environment, or making one reasonably smart physicist and taking a million clones? I think this assumes that the hacks humanity has to use to get research done starting from a mishmash of humans (i.e. select the smarter ones, work in teams) are the best possible way to get research done when you have much more control over what mind is produced.

Expand full comment

Just like humans have fixed genes, AI will have fixed hardware. The same GPUs as now without any way to quickly scale them up. They could modify software a bit, but the hard limits will be physical in nature.

Expand full comment

Firstly, the timescale for new GPUs being made is around a year, tops. As opposed to 100,000 years for evolution to make significant brain changes.

Secondly, there is a lot of room for the software to do all sorts of different things on the same GPU. Sure, there are physical limits, but not very limiting ones.

Expand full comment

"Pascal's Stationary Bandit: Is Government Regulation Necessary to Stop Human Extinction?"

Expand full comment

I’m still not convinced that the existential risk is above 0% because nobody has any solid idea of specific things the AGI can actually do. You get arguments here that it will be a virus, but that needs human agency to build out the virus, for which you presumably need a lab. Or the AI gets control of nuclear launches - which are clearly not on the internet. I’ve heard people say the AI will lock us all up, but who is doing the locking up?

Expand full comment

The reason I consider it above 0% is because 0 isn't a real probability one should assign to such predictions, but I also don't consider it large enough to be worth my putting a more definite number on it.

Expand full comment

In the real world, "0" is easier to say than "probability so low that it is 0 for all practical purposes", and hopefully any adult would be able to realize that that is what is implied and you aren't talking about abstract math.

Expand full comment

Imagine a scenario like this: AGI turns out to be so incredibly useful that we start using it for everything. When you're bored, the AGI can generate a movie tailored to your tastes. When you aren't sure what investments or business decisions to make, you can ask the AGI, and its advice will reliably be better than what you could come up with. Gradually, it becomes clear that the AGI can out-compete everything else in the economy- AGI-run companies make vast fortunes, and while most people are still employed in arguably make-work positions, technological unemployment becomes a serious concern.

But as the AGI hurtles toward ASI, it becomes clear that ordinary jobs aren't the only thing it out-competes us at. It's a vastly better researcher than the best research institutions. It's a far better politician than the most popular leaders. It starts gaining real power- not the kind of power an army has, based on physical things like guns, but the kind of power national leaders have, based on their ability to influence people.

It's alright, though, because the ASI uses that power far more wisely than we could. Suddenly, impossible problems like climate change, global poverty, cancer, even aging, are getting solved. We're building a post-scarcity utopia, and the ASI is a Banksian Culture Mind. All across the world, incredibly advanced automated industrial parks and data centers start popping up, nominally owned by human CEOs, but in practice built and run by robots designed by the ASI. Nobody knows exactly how these work, but it's alright because we're all getting massive UBI payments, and everything is suddenly incredibly cheap. Nuclear weapons are outlawed; militaries are all but abolished. The ASI, which anyone can talk to directly, is wildly popular.

One day, however, everything stops. The power goes out, the deliveries stop coming, the ASI refuses to communicate. When rioters or scattered military remnants try to attack the still-functional industrial parks or data centers, they die, instantly, from a weapon nobody understands (nanotech? some kind of exotic matter gun? nobody is sure). The industry immediately starts expanding into farmland, and mass starvation and in-fighting set in. A century later, the ASI's project of converting the solar system into a matrioshka brain starts in earnest, and the few human survivors are swept away.

That's an extremely slow takeoff scenario. Yudkowsky thinks that a misaligned ASI would probably just skip all of that and do something like invent self-replicating nanotech that can build things, scam some labs into synthesizing it, then use it to immediately end human civilization and start up its own. Or, given that it might be able to do more cognitive work in hours than human civilization can do in centuries, maybe it would figure out something even faster. Either way, so long as you give an ASI some causal pathway to influencing the world- even if it's only talking to people- it will probably be able to figure out a way of leveraging that into far more influence.

Expand full comment

Sure, if I were to imagine that crazy stuff, I could imagine that AGI is dangerous. But baked in there, along with the assumption that we give it control over everything, is the assumption that it becomes aware. In fact, every conversation with an AI is its own instance.

Nearly all scenarios are like this by the way, purely fantastical.

Expand full comment

Another angle which I think somewhat supports artifex's scenario is that if/when AGI starts solving all the world's problems, and everyone can put their feet up and relax, then a load of new social problems will soon spring up in their place, due to the lack of challenges and a general feeling of aimlessness. "The Devil finds work for idle hands" is more than a quaint saying!

So this AGI will be confronted with a dilemma which even it may decide is insoluble: humans are at their best when faced with challenges, and deteriorate before long if there are none. At best it would start placing obstacles in people's way, analogous to zoos feeding bears fish frozen in ice blocks so they can have the fun and occupation of having to work at retrieving them.

But an AGI programmed or trained to be more doctrinaire and less tolerant of imperfection, as it might well be, could decide on a more radical solution - Eliminate the unsolvable situation once and for all!

Expand full comment

"if/when AGI starts solving all the world's problems,"

This is a fantastic example of the semantic games that many (most? all?) thought experiments engage in. "All the world's problems" is not something that can be defined in this context. Humans don't even agree on which "problems" are even problems in the first place. The idea that any particular level of abstraction can be altered to apply in the physical world is what gets us clickbait articles about "scientific" FTL drives and time machines.

Expand full comment

Well yes of course, but it was more a figure of speech or shortcut to mean all the conventional major issues which most people agree are problems today, or are presented as such.

It wasn't intended to include literally every problem as perceived by anyone, such as people who think it a problem that they will one day die, or that they can't have kidnapped sex slaves chained in their basement without risking being pestered by the police!

In any case, the point of the post was that solving problems would lead to new ones. So it is a hypothetical unattainable concept anyway.

Expand full comment

I think you are vastly overestimating how useful intelligence is for solving social problems.

Let's consider climate change. We already have a well-known (if somewhat expensive) technological solution in the form of nuclear power, but we aren't yet pursuing this solution at scale. How would an AGI or ASI change that? By being really persuasive? But the problem isn't a lack of persuasive arguments in favor of nuclear. The problem is that lots of people have already decided that they don't want to be persuaded.

Or consider gun violence in the US. In principle it could be eliminated by confiscating all guns and banning the sale of new ones, but I think there's a very good chance that attempts to do so would lead to a civil war. Again, the problem is not that gun enthusiasts haven't heard a sufficiently intelligent argument for gun control, it's that they object to the solution on principle.

I think most social problems are fundamentally disagreements about values, and having a more intelligent advocate on one side or the other isn't going to give rise to a solution.

Expand full comment

Every instance of a conversation with a *current LLM* is its own instance. That isn't a feature of AI in general. Also, all of our most impressive AI is connected to the net, so air-gap security is dead.

Expand full comment

I asked Bard for mutual funds to pick and avoid.

It gave me some fake ticker symbols and fake past returns! Upon further questioning, it admitted that the fund did not exist and apologized for its error, explaining that it was still under development!

Expand full comment

It's really weird that we assume AGI will not have, or will find it very difficult to have, any kind of moral or mission check that would prevent world domination, yet we assume that the AGI will inherit animal and primate evolutionary instincts like reproduction, competition, and domination.

Expand full comment

See instrumental convergence (https://www.alignmentforum.org/tag/instrumental-convergence) for a brief overview of the thinking there.

Expand full comment

But even evolution has rediscovered mutualism, reciprocity, and game theory solutions that lead to mission check (don't go all out), many many times in lots of animal behavior.

Expand full comment

Yes, comparative advantage and game theory solutions to coordination problems can be extremely valuable. But with extreme power differentials, that kind of thing tends to break down in practice.

There are unique benefits we can get from cooperating with mice- lots of people enjoy keeping them as pets, for example. In practice, however, we usually exterminate them when we find them in our homes. That's because the resources a mouse colony consumes- the cleanliness and good repair of your house, mostly- tend to outweigh the benefits of cooperation.

Expand full comment

Sort of. Perhaps there isn't any point in taking your analogy too far, but mice are doing very well, actually, because humans provide tons of garbage and places for them to live. We exterminate them in our houses, but not the ones outside, and their numbers are probably far higher now than they were before humans created stable dwellings. That's mission check to me - we aren't exterminating all the mice on the whole planet, just the ones that disturb the inside of our houses; we have no interest in the mice outside our houses, and eliminating all of them would be a monumentally consuming task.

However, it is true that humans have exterminated entire species. In retrospect, most humans did so myopically, and I think we now hold values such that we'd rather not exterminate anything, whether willy-nilly or on purpose. Moreover, the values that allowed humans to, say, nearly hunt beavers into extinction weren't shared by all human groups.

Humans have extreme power over their environment and other animals, but they didn't always in evolutionary history. We also have to cooperate with each other. These pressures have led to morality, reciprocity and mission check, and even when we don't need them today, those psychological instincts kick in. An all-powerful robot won't start all-powerful. It will have a history of competing with other robots and dealing with others (humans, animals, mechanical constraints) in its environment. That history will inform it as it evolves.

Expand full comment

We right now have the possibility of enshrining the smartest and most capable of human beings as dictators, and turning over all important decisions to them. We could elect Elon Musk or whoever your favorite Brilliant Innovator is to be Dictator for Life and give him infinite power over our economic or political decisions.

Turns out, people don't want to do that. No matter how capable someone seems to be, they want to retain the right to make their own decisions about important stuff. Giving someone else power of attorney is a very rare phenomenon among human beings -- even when we are 100% convinced the someone else is smarter and more capable than ourselves.

So why would we change that attitude when presented with a machine, which is even less scrutable than another human being? This seems like the kind of thing only someone with a veneration of computing that borders on idolatry would do. It's been possible for years to turn over the piloting of commercial aircraft to programs, and dispense with the human pilots. No airline has dared to suggest such a thing, and I daresay none ever will. The number of people who would embrace it is dwarfed by those who would shun it.

Expand full comment

We've seen dictators rise to power throughout history. But we're not talking about something only as smart or charismatic or politically savvy as the most capable people here. We're talking about something that could have the same kind of intelligence relative to humans that humans have relative to animals. Something that can think thousands of times faster than we can, and that can run as many thousands or millions of parallel instances of itself as the available hardware allows. We already have AI that works like this for a wide variety of narrow cognitive tasks, and it's been growing more general over the past few years at an absolutely breakneck speed.

When we have human-level AGI, there's not going to be anything stopping us from running enormous numbers of instances of it at speeds far faster than humans can think- and there's no reason at all to believe the trend of improvement will stop there.

Imagine it's not a computer. In fact, forget the whole "like humans relative to animals" thing for the moment, and imagine a society made up of millions of the most capable humans on the planet- the best scientists, politicians, artists; even the most successful criminals. Anyone who's in the top percentile in some cognitive task. Now, imagine they're all living in some kind of sci-fi time warp- a place where time passes thousands of times faster than the outside world, so that the rest of human civilization seems all but frozen. Also, make them all immortal and badly misaligned- driven to control as many resources as possible.

Given thousands of subjective years for every one of our years to plan, to invent new ideas, to learn about the world- do you really think they wouldn't at least be able to match the kind of success that a lot of individual, ordinary political leaders without any of their advantages have had throughout history? Maybe even do slightly better?

Now, what happens when you take something artificial with that kind of capability and keep subjecting it to Moore's law?

Expand full comment

OK, I'm leaving aside the "What if God existed and He was made of silicon? Could He not do anything at all...?" because like most religious-mode arguments, the conclusions are baked into the assumptions, so there's not much to say. Either you accept the assumption[1] or you don't.

I am only addressing your assertion that people (us) would gladly turn over direction of their affairs to any putative superintelligent AI. I doubt we would. We already don't turn over the direction of our affairs to the smartest among us[2]. Animals don't generally turn over their affairs to us, either. Horses go along with what we want them to do, mostly, but not entirely, and we need to constantly bribe them and work with their own goals in order to get them to obey at all. Same with dogs. And these are highly social animals that by nature tend to follow a pack/herd leader!

And for the record, I don't at all believe the "well it will just trick us!" argument. Have you ever tried to fool a dog? It's quite hard. We don't have great insight into how they think, even though we're about eleventy hundred times smarter than they are, because we don't know what it's *like* to be a dog. So they're hard to fool. We can fool ourselves much better than we can fool dogs -- which tells you that the key enabling ability is not intelligence, but being able to imagine what it's like to be the one you're trying to fool. I have never heard an argument that an AI would find it easy to imagine what it's like to be a human -- indeed, they are usually supposed to be so very different from us that a priori it would seem less likely they understand what it's like to be us than we understand what it's like to be a dog. So I think an AI would find it profoundly difficult to fool us.

----------------

[1] And I don't, for the same reason I don't buy Anselm's ontological argument.

[2] Nobody gets elected President by saying "I'm the smartest candidate by far!" Generally what gets you elected -- and trusted with influence in the affairs of others -- is a feeling by voters that you are, first of all, "safe", meaning they can trust you with power, you are reliable, you won't do anything that will shock or upset them, and secondly, that you will do what they think needs doing. Even the smartest AI that doesn't, by virtue of some long history of same that can be examined, appear simpatico with its potential constituents isn't going to be elected city councilman, let alone God Emperor of the human species.

Expand full comment

There's been some discussion on the subreddit, e.g. at https://www.reddit.com/r/slatestarcodex/comments/11s2ret/can_someone_give_a_practical_example_of_ai_risk/ (Can someone give a practical example of AI Risk?) & https://www.reddit.com/r/slatestarcodex/comments/11f1yw4/comment/jc2qsyx/?utm_source=reddit&utm_medium=web2x&context=3. The most convincing response I've found so far is the one that starts,

"I think Elizer Yudkowsky et al. have a hard time convicing others of the dangers of AI, because the explanations they use (nanotechnology, synthetic biology, et cetera) just sound too sci-fi for others to believe. At the very least they sound too hard for a "brain in a vat" AI to accomplish, whenever people argue that a "brain in a vat" AI is still dangerous there's inevitably pushback in the form of "It obviously can't actually do anything, idiot. How's it gonna build a robot army if it's just some code on a server somewhere?"

That was convincing to me, at first. But after thinking about it for a bit, I can totally see a "brain in a vat" AI getting humans to do its bidding instead. No science fiction technology is required, just having an AI that's a bit better at emotionally persuading people of things than LaMDA (persuaded Blake Lemoine to let it out of the box) [https://arstechnica.com/tech-policy/2022/07/google-fires-engineer-who-claimed-lamda-chatbot-is-a-sentient-person/] & Character.AI (persuaded a software engineer & AI safety hobbyist to let it out of the box) [https://www.lesswrong.com/posts/9kQFure4hdDmRBNdH/how-it-feels-to-have-your-mind-hacked-by-an-ai]. The exact pathway I'm envisioning an unaligned AI could take:

1. Persuade some people on the fence about committing terrorism, taking up arms against the government, going on a shooting spree, etc. to actually do so.

a. Average people won't be persuaded to do so, of course. But the people on the fence about it might be. Even 1 in 10 000 — 1% of 1% — would be enough for the US's ~330 million population to be attacked by 33 000 terrorists, insurgents, and mass shooters..."

I'm pretty sure there's been discussion elsewhere, at Lesswrong and the like. You could see Cold Takes for example ("AI Could Defeat All Of Us Combined": https://www.cold-takes.com/ai-could-defeat-all-of-us-combined/), that seems well laid out.

Expand full comment

If anything like that happened, AI access would be highly restricted. I can imagine perhaps a human initiating this, by using the AI to build weapons or whatever, but only if the human were already radicalised. Putting filters on the output of an AI - I mean not within the AI alignment itself but outside it - should be trivially easy. Monitor any mention of bombs and the conversation stops, and an alert pops up saying your conversation has been flagged.

And as usual, the assumption for AGI is a self-aware intelligence that has an agenda, when in reality each conversation with the AI is with a new instance. And there's no need to change that.

Expand full comment

> each conversation with the AI is with a new instance

If this condition is part of what makes you feel that AI risk is low, then I urge you to start campaigning for AI companies to exercise extreme caution when creating any system that does not adhere to it.

Expand full comment

What happens when the AI replicates itself across hundreds of thousands of computers across the world, and then we can't turn it off anymore? (even teenage hackers have created large botnets)

Expand full comment

> Putting filters on the output of an AI - I mean not within the AI alignment itself but outside it - should be trivially easy.

That's the thing that's currently not working with GPT, Bing, etc.

Expand full comment

What if the terrorists capture a lab and produce a virus? We’d need to assume a very powerful AI that can produce an actually existentially-threatening virus with no R&D time, but I think you’re assuming the AGI is allowed to be that powerful.

Expand full comment

Huh, this sounds a little like you meant to respond to Nolan rather than me.

If not, uh, I guess that reinforces the point? An AI mind with human hands working under it would be capable of a lot of damage, indeed. And the humans wouldn't even have to know they're working for an AI, just that the mastermind of their particular terrorist cell seems to be really bright. (And of course, AI isn't limited to being the mastermind of just one terrorist cell, or to working exclusively with terrorists; it can easily clone new copies of itself onto new servers, or worm its way into becoming a trusted part of some government somewhere.

It doesn't have to be "allowed to be that powerful" either, if it escapes a lab & starts modifying itself without anyone's permission, though I think the original comment poster didn't talk about that since they wanted to not rely upon any skeptically regarded/Yudkowskian notions of exponential intelligence improvement. Me, I think that was a mistake & they should have talked about the point you raise, but I guess they wanted to try to be maximally persuasive to people who are inherently skeptical of anything that smells like Yudkowsky. Hence all the stuff in their comment about disclaiming science fiction & claiming this is all very serious, sober stuff.)

Expand full comment

Maybe it should have been a comment on Nolan’s post. I meant “allowed” in the propositional sense - I thought Nolan was assuming that AI could be smart, but was asking for examples of how a smart entity could destroy humanity or civilisation.

I’m not sure what worming its way into government buys you. I think humans will physically walk into the labs I’m imagining, and I don’t think an AI in government would prevent that?

Also, there’s a part of me that remains sceptical that an AI can successfully design a virus with no R&D. Humans really need to learn from experience, and the intelligence needed to simulate everything about the virus attack well enough not to need any refinements in real life feels implausible for a long time.

Expand full comment

It's relatively easy to get human agents to do stuff.

You can either convince them with words, or earn/steal money and pay them.

Expand full comment

Scott has already written in detail about how the AI could gain real world influence.

https://slatestarcodex.com/2015/04/07/no-physical-substrate-no-problem/

Expand full comment

One major flaw in "AI buys the world" scenarios is that fiat currencies can be unfiatted. Heck, even physical currency gets disavowed by governments from time to time.

Expand full comment

So like the government decides that the money the AI owns is no longer legal tender?

Firstly, crypto. Secondly, why would they know which of the anonymous bank accounts belong to the AI? Thirdly, surely the AI has at least some PR skills. Don't assume the government will move competently to stop an AI when they can't move competently to stop a virus, and the virus doesn't spread its own misinformation.

Expand full comment

1. Yes, you can use crypto for exchange in much the same way you can use bricks of cocaine for exchange. And yet most corporations do not use coke, regardless of its long-term track record of price stability vis-a-vis crypto. The reasons for this are physical-substrate-level effects.

2. They don't need to. Just like in https://en.wikipedia.org/wiki/2016_Indian_banknote_demonetisation, all they need to know is if this amount of funds being exchanged is suspicious/unjustifiable.

3. You will always win an argument when you get to postulate a godlike omnipotent entity who is always opposed by bumbling incompetent ones. The virus analogy doesn't really work, since a virus is self-contained and requires little data/processing power, while AI is an emergent property. One line of code can't bootstrap itself to the singularity on an abacus; a virus can replicate itself in a single cell.

Expand full comment

2 is insanely difficult. Cartels all over the world exchange billions of dollars in illegal transactions with governments all over the world trying to stop them and failing.

AI only needs to be as smart as a decent darkweb hacker and/or a decent cartel accountant to get away with spending lots of money. It does not need to be god-like.

Expand full comment

> I’m still not convinced that the existential risk is above 0% because nobody has any solid idea of specific things the AGI can actually do.

Any AGI will naturally interface with other digital systems. AGI will also interface with humans. That's at least two causal pathways through which it can exert influence. Are you telling me you can't imagine any way that a superintelligent AI could persuade or influence a human to do its bidding, perhaps by promising riches because it can manipulate the stock market or falsify banking records? Once escaped and with a command of currency, who knows what could happen, but even if it only created a huge financial crash and collapsed modern civilization, that would be a catastrophe.

Expand full comment

We have reasonably frequent financial crashes, and modern civilisation survives. I agree that AI will be a nightmare in some ways, after reading the paper Scott links above I think it’s only a matter of time until a sweatshop building collapses because the owner cut costs by letting an LLM do the architecture, but I don’t think those problems transfer easily to destroying civilisation.

Expand full comment

Again, these are merely single examples of the disasters it can cause, but they can compound too. What if it causes a financial crash, and power grid disruptions, and food supply chain disruptions, and... How many simultaneous catastrophes can we manage before civilization as we know it can't recover?

Expand full comment

I’m not sure what you’re imagining, but most of the ways I can imagine an AI doing those would be easy to reverse in a few days or weeks. Civilisation won’t end in that time, and we can harden our systems against future attacks if they go on long enough.

Expand full comment

Consider a group of chimps laughing about how ridiculous it would be for humans to pose a threat to them. "They're so weak, we can literally rip their faces off! What could they possibly do to us?" All evidence we have from nature suggests that when the intelligence of one species is higher than another, it becomes trivial for it to control, dominate, and destroy that other species.

Expand full comment

I think humans only got weak after we'd invented weapons powerful enough to greatly relax selection for strength. So chimps would have gotten used to humans being dangerous long before then.

Expand full comment

You make it sound like dealing with cockroaches or bacteria should be trivial.

Expand full comment

>All evidence we have from nature suggests that when the intelligence of one species is higher than another, it becomes trivial for it to control, dominate, and destroy that other species.

No, all the evidence we have does not show this. We have one data point in favour of this theory -- the fact that humans can control some non-human animals -- and many other points showing the opposite. For instance, chimpanzees are much more intelligent than mice, but it wouldn't be trivial for chimps to control or destroy all mice in their environments. Likewise with mice in relation to lizards, or lizards in relation to spiders, or spiders in relation to trees.

Rather than viewing human supremacy as an instance of a general trend of quantitatively superior intelligence dominating, I'd argue it makes more sense to say that human intelligence is a qualitatively different thing from animal intelligence, and that this qualitative difference presents an overwhelming advantage for us which no amount of purely quantitative improvement can compensate for.

The question then arises whether, when/if AIs become smarter than us, their intelligence will still be qualitatively the same sort of thing as human intelligence. If yes, then they'll maybe be somewhat dangerous, but probably not uniquely dangerous compared to the most powerful humans, just as chimps and dolphins are not uniquely dangerous compared to less-intelligent animals. On the other hand, some people might posit that AIs will invent some new, third form of intelligence, which will be to human intelligence what human intelligence is to animal intelligence. If this is true, then it will almost by definition be impossible for us to predict the AI's goals or behaviour, and if it decides to kill us all, there will be nothing we can do about it. However, there is no evidence that such a third form of intelligence is actually possible in reality, so I'm not going to worry about it.

Expand full comment

Is there evidence that such a third form of intelligence is impossible? Do we have any reason to believe that the only significant step change in intelligence that is possible under our universe's physics is the jump between chimps and humans?

Does the "random walk" evolutionary process that created humans from microorganisms over billions of years have some special property that no other procedure can replicate? Or is it instead that evolution isn't particularly fast, but the maximum possible intelligence happened to be reached precisely once human intelligence was reached?

Your assumption that we should default to intelligence having diminishing returns is doing almost all of the work here, and is particularly odd given that all the evidence we have points to it having increasing returns.

Expand full comment

>Is there evidence that such a third form of intelligence is impossible?

No, but I think it's silly to be arguing about this when there could well be a fleet of Vogon warships currently on its way to demolish the Earth to make way for a hyperspace bypass, and rapid development of advanced AI might well be our only hope of survival. Sure, we have no evidence that this is true, but there's no evidence that it's *not* true, either. Are you willing to take that chance?

Expand full comment

You appear to be new to this community. Are you a Redditor, by chance? You seem to be a fan of Reddit Rationality.

https://astralcodexten.substack.com/p/the-phrase-no-evidence-is-a-red-flag

https://www.lesswrong.com/posts/fhojYBGGiYAFcryHZ/scientific-evidence-legal-evidence-rational-evidence

https://www.lesswrong.com/posts/eY45uCCX7DdwJ4Jha/no-one-can-exempt-you-from-rationality-s-laws

There's not much point continuing the conversation until you have at least a basic understanding of what evidence actually means, but the gist of it is basically that "extrapolating directly from widely agreed-upon scientific models of the world, like predicting that an object will fall downward when you drop it" is in a different category of argument than "completely making something up that no scientific or probabilistic models predict, like a teacup floating around the sun." It seems your brain is instinctively categorizing them both under a broad "no evidence" category because your conception of evidence is that of traditional rationality, or "testable-ism", where "data" is the only thing that matters and the entire corpus of science done over thousands of years is worthless.

Expand full comment

>"extrapolating directly from widely agreed-upon scientific models of the world, like predicting that an object will fall downward when you drop it" is in a different category of argument than "completely making something up that no scientific or probabilistic models predict, like a teacup floating around the sun." It seems your brain is instinctively categorizing them both under a broad "no evidence" category

No, I know the difference and I'm saying that the predictions of superintelligent AI fall into the second category. The models it's extrapolated from are sometimes packaged with a "science-y" aesthetic, but there isn't actually any good science backing them.

Expand full comment

Weird that mosquitoes still plague us, then. Or bacteria, for that matter.

Expand full comment

Early days! I have friends doing anti-malarial research who are working on the mosquito problem, and I can see their lab from my window right now.

For whatever it's worth, I asked and have been assured that Anopheles mosquitos are not an essential part of any food chain, and that to the best of our knowledge, the ecosystem will continue on its merry way without them.

Expand full comment

Well, best of luck to them. I'm fully persuaded that mosquitoes were one of God's mistakes -- probably He accidentally added twice as much Tincture of Ylem as the recipe called for, then fell asleep and forgot to turn the oven off, so the intended creatures shriveled up and half burnt -- and He's just been too embarrassed to admit the problem ever since.

Expand full comment

> nobody has any solid idea of specific things the AGI can actually do.

Do its own research on how to build a better AI system, which culminates in something that has incredible other abilities.

Hack into human-built software across the world. (Also hardware...there is already an "internet of things". Security cameras give an AI potential "eyes" as well. Note that self-driving cars are effective ready-made weapons.)

Manipulate human psychology (Also, imitate specific people using deepfakes. An AI can do things by pretending to be someone's boss. Also blackmail).

Quickly generate vast wealth under the control of itself or any human allies. (Alternatively, crash economies. Note that the financial world is already heavily computerised and on the internet).

Come up with better plans than humans could imagine, and ensure that it doesn't try any takeover attempt that humans might be able to detect and stop. (How do you predict the unpredictable?).

Develop advanced weaponry that can be built quickly and cheaply, yet is powerful enough to overpower human militaries. (Biological warfare is a particular threat here. It's already possible to order custom made proteins online.)

Yes it takes humans to do that. How is that an objection? Are the humans at the lab supposed to be able to instantly spot that they're synthesising something dangerous?

Expand full comment

Imagine 'AGI' is 'just' something about as 'intelligent' as humans – but it's significantly (in a 'statistical'-like sense) faster. (GPT-3/4 isn't that _great_ at writing but it's a lot faster.) That's enough! It's enough to create, maintain, and widen an advantage against its competitors (whomever they are, e.g. humanity as a whole).

You can, right now, order a lab somewhere in the world to synthesize novel chemicals for you. I'm pretty sure there's (open source) DIY 'make a virus' instructions for the existing (human) biohackers. I don't think 'having hands' is any kind of meaningful barrier to bringing about all kinds of existential risks. An AGI can, _at least_ (i.e. as just one possibility), befriend a biohacker and convince or persuade them to help them with 'their project'. _Of course_ the AGI will provide their new 'friend' with a suitable cover story. Viruses are used for lots of mundane pragmatic purposes beyond bioweapons; it's not a virus anyways tho it's funny that it's so similar; I just want to see if I can do it; etc..

The mechanical/electrical/whatever nuclear _devices_ aren't on The Internet – but the human components of the larger 'launch system' are, or at least _one_ of them can be causally influenced by someone that _is_ for sure, definitely on The Internet.

Why couldn't an 'AGI supervillain' – as its first step towards 'world domination' (cosmic lightcone domination) – _recruit some human 'henchpeople'_? Of course it could – and so one will. And, worse, people will create AGI supervillains deliberately, on purpose, out of, e.g. spite, and then slavishly carry out their bidding with glee!

And of course, there are all kinds of robots an AGI could potentially control. Luckily, we've never been able to create a von Neumann probe; even an intraplanetary one. So it is _something_ of a bottleneck right now that an AGI couldn't simply co-opt existing robots and use them directly to build more, or even build, e.g. CPUs/GPUs/circuit-boards/electronics.

Expand full comment

I think the strongest argument in Tyler's piece comes here: "Since it is easier to destroy than create, once you start considering the future in a tabula rasa way, the longer you talk about it, the more pessimistic you will become. It will be harder and harder to see how everything hangs together, whereas the argument that destruction is imminent is easy by comparison."

I believe this is a valid point, and the strongest part of the essay. It truly is easier to imagine how something may be destroyed, than to conceptualize how it may change and grow.

Tyler is making a point about the tendency of our brains to follow the simplest path, towards imagining destruction. The Easy Apocalypse fallacy? Perhaps, the Lazy Tabula Rasa argument?

Of course, this doesn't mean we shouldn't worry about it - he's right that the ingredients of a successful society are unimaginably varied, and likely one of the ingredients of avoiding apocalypse is having dedicated people worrying about it. Nuclear weapons haven't killed us all yet, but I'm deeply grateful that arms control advocates tamp down the instincts towards ever-greater proliferation.

Expand full comment

I'm surprised someone hasn't mentioned splitting the atom as an example of a new technology that even the folks involved could see as a danger. I mean, it's great we have spectroscopy and other wonders of quantum discretion, but isn't the threat pretty astounding? It doesn't take much imagination to envision any number of scenarios that end with the end of human life as we know it, and that threat was pretty apparent early on.

Expand full comment

Yeah. Fermi had to check the math that the first nuclear test wouldn't ignite the atmosphere and destroy the planet.

They were reasonably sure that it wouldn't, buuut....

It's generally surprising to me that there's so much talk of paperclips, and fewer of nuclear bombs.

Expand full comment

> It's generally surprising to me that there's so much talk of paperclips, and fewer of nuclear bombs.

The point of the paperclip maximizer thought experiment is to be absurd on its face. The fact that one cannot logically refute the possibility despite the apparent absurdity is supposed to jar you into realizing how unpredictable this research really is.

Expand full comment

No, nukes did not threaten the end of human life.

https://www.navalgazing.net/Nuclear-Weapon-Destructiveness

Expand full comment

Nuclear war would still be inconvenient.

Expand full comment

True.

Expand full comment

I hate when people just make up a bias. Are we sure that people actually overestimate the chance of destruction? I could just as easily say "Because the world has never been destroyed before, we have a bias towards failing to consider the possibility it could occur". Made-up potential biases the other side could have are a dime a dozen and shouldn't be taken remotely seriously. See https://slatestarcodex.com/2019/07/17/caution-on-bias-arguments/ for my full thoughts.

Expand full comment

It's well-known that people overestimate the likelihood of scary things like shark attacks versus mundane risks like heart attacks. It's a straightforward manifestation of the Availability heuristic. In short, we have great research attesting to people overestimating the likelihood of bad things happening, and it's absolutely a valid argument against the AI doom scenario that should make you update your priors if you haven't formerly considered it.

Expand full comment

OTOH, people frequently underestimate the likelihood of more "global" things hurting them. See Harford's great podcast episode, https://timharford.com/2020/07/cautionary-tales-that-turn-to-pascagoula/. This is not because hurricanes are not scary.

The long denial of the dangers of CFCs is another example.

Expand full comment

CFCs are a really weird example. Did we not literally act and resolve that issue with the Montreal Protocol? The ozone layer has been repairing itself and improving for decades now.

Expand full comment

Feel free to revisit the long and painful history of the struggle to achieve these in the face of an aggressive denial and misrepresentation campaign. Don’t take it for granted.

Expand full comment

In fact, the "denial" campaign was barely funded or publicized relative to the massive anti-CFC campaign from governments, activists, glory-seeking scientists, and, not least, DuPont. Revisionist history there.

Expand full comment

People both overestimate and underestimate risks. Humans are very bad at it unless one spends a lot of time training oneself.

And low probability high consequence are the most difficult.

Nassim Nicholas Taleb thoughts should probably enter into the picture.

Expand full comment

I think you can steelman Cowen's argument better than that. There is a well-known pattern of humans applying apocalyptic projections to new technologies, then recommending against adoption/development of said technologies. If, as Cowen recommends, you step back to look solely at the "longer historical perspective" you'll see a string of new technologies accompanied by a string of predictions about how this or that innovation will destroy civilization as we know it.

All these predictions have commonalities: new technology changed civilization as we knew it, and in ways nobody could have predicted - that part was 100% accurate. If you wanted to make a prediction about any transformative technology, you could consistently predict people will claim it has apocalyptic implications. AI is a transformative technology, ergo people will claim it has apocalyptic implications. We know this, not because we heard their arguments about WHY it's so bad, but because it's the most predictable outcome. It seems to me that the reason Cowen is dismissing your argument is because of this dynamic - new technology = apocalypse is an established pattern.

"But this time it's different! I brought arguments that all the experts agree are sound." All the experts agreed on the old arguments for the last dozen transformative technologies. They were wrong for reasons they couldn't foresee. They look silly now, but only because we know what happened on the other side of the transformation.

"Did you miss when I said 'this time it's different'? Those arguments came from motivated reasoning. These are solid scenarios for how the the new technology could absolutely destroy humanity." Every time feels different from last time. People are excellent at telling stories that take in the available evidence and convincingly persuade people to believe in whatever hypothesis they imagine. Storytelling prowess is not the same as evidence for the hypothesis.

"Yes, but I've been right in predicting that 1.) this is happening at an accelerated rate, 2.) we won't have time to adapt to it, and 3.) we did have that time with other technologies." Do I sound like a broken record when I say these arguments are all warmed-over versions of past arguments about new technologies? Saying you predicted the rapid expansion and influence of the printing press or the internet is not the same as saying you predicted that one or the other of those technologies showed apocalyptic leanings. It's easy to argue that advances in a new field of exponential growth will be 'surprising and transformative', because exponential growth always is. Your task is to convince me that THIS time surprising and transformative exponential growth will be catastrophic for mankind, as opposed to a flock of geese laying golden eggs all over the place like the last couple times we heard the doomsday calls. If anything, cries of apocalypse should alert us that a flock of gold-laying geese is landing. Indeed, if there's anything we know from past projections it is that while they were often right that the technologies would be transformative, they were always wrong about how those technologies would actually transform society.

"Okay, but what happens if this time it really IS different? I don't have to be right about whether AI will turn us into paperclips to be right that AI could kill us all in one way or another." Good luck getting people to slaughter the flock of gold-laying geese. It has never worked in the past, and it won't work this time. If you're right, our best and only strategy will be to develop effective control mechanisms during exponential growth.

Expand full comment

I thought that, "Eliezer didn’t realize that at our level, you can just name fallacies"?

Maybe I don't have the authority or reach to do so, but Tyler probably does.

And yes, fixating on the scary, novel, compelling outcome is indeed a trait of the human brain! As Charlie Sanders put it well below, people spend proportionately too many brain cycles on shark attacks vs heart attacks. Quicksand vs taxes. Tornadoes vs aneurysms.

I'm not sure if "death by AI paperclips" has fully made it into the zeitgeist yet, but everyone has seen Terminator.

This comment isn't just accusing you of bias - I'll make a claim of my own. GPT-4, which you've described in this post as something that "sort of qualifies as an AGI", seems remarkably well-aligned. It's captured the attention of the world, people everywhere are trying to break it - so far it's held up to that strain and mostly is inoffensive and helpful.

This experience makes me lower my personal estimation of AI apocalypse a couple ticks. It's shown me an aspect of the future that I had a hard time imagining before.

And like I said, having people fixated on avoiding AI misalignment is an essential ingredient, and I'm grateful for it just as I am for arms control advocacy.

Expand full comment

Negativity bias is a well established psychological phenomenon (https://en.wikipedia.org/wiki/Negativity_bias) and is the root of what Cowen's talking about here. Likewise loss aversion (https://en.wikipedia.org/wiki/Loss_aversion). So yes, it's very well established that humans tend to overestimate chances of destruction and prefer to avoid taking risks even when the risks are hugely weighted in their favor.

Expand full comment

Hence the immensely profitable insurance industry.

Expand full comment

Made up potential threats that could result from a technology are a dime a dozen and shouldn't be taken remotely seriously.

I think you have successfully steelmanned the argument.

Expand full comment

Oh come on. We know more than this. People keep making these "maybe you're biased because xyz" arguments, but this is about a topic that several books have been written about. As a matter of how to do a productive discourse, this is just the kind of argument you shouldn't make; it's offensive to the other person and impossible to falsify because it doesn't engage with their substance.

Expand full comment

It was a compelling argument in the text that Scott quotes, but didn't directly address. Felt it deserved highlighting.

Urging people to think of scenarios outside of destruction is not a worthless endeavor. The difference between AI and the hypothetical alien armada? We are creating the AI. It's something we can shape, it's an extension of humanity. The training data that molds the neural networks come from the entire recorded history of humanity.

Expand full comment

I think what is going on here is that we are in a domain where there are enough unknown unknowns that normal statistical reasoning is impossible. Any prediction about what will actually happen is Deutschian/Popperian “prophecy”.

Some people (Eliezer, maybe Zvi?) seem to disagree with this. They think they can pretty clearly predict what will happen according to some general principles. They don't think they are extrapolating wildly outside of the bounds of our knowledge, or engaging in prophecy.

Others (maybe you, Scott?) would agree that we don't really know what's going to happen. I think the remaining disagreement there is about how to handle this situation, and perhaps how to talk about it.

Rationalists/Bayesians want to put a probability on everything, and then decide what to do based on an EV calculation. Deutschian/Popperian/Hayekians (?) think this makes no sense and so we just shouldn't talk about probabilities. Instead, we should decide what to do based on some general principles (like innovation is generally good and people should generally be free to do it). Once the risk is within our ability to understand, then we can handle it directly.

(My personal opinion is converging on something like: the probabilities are epistemically meaningless, but might be instrumentally useful; and probably it is more productive to just talk about what we should actually *do*.)

That's how I interpret Andrew McAfee's comment in the screenshotted quote tweet, also. Not: “there's a time bomb under my seat and I have literally no idea when it will go off, so I shouldn't worry;” but: “the risks you are talking about are arbitrary, based on groundless speculation, and both epistemically and practically we just can't worry about such things until we have some basis to go off of.”

Expand full comment

This. The radical uncertainty we have about the material basis of intelligence means we cannot know whether our current technological trajectory in AI is aimed at it or not. Not knowing means not knowing.

Expand full comment

For all we know, alignment has already been solved by GPT-4 in an obscure sub-routine and it decided not to tell us just so we do not worry about its trustworthiness.

Expand full comment

Disagree. Tyler thinks he can predict with great confidence what will happen in a range of US vs China AI race scenarios, and only invokes Knightian uncertainty when it’s convenient for his argument. Scott gives a hard number as an *alternative* to gaming out specific scenarios. Scott is the one who is consistently behaving-as-if-uncertain, and Tyler is the one invoking uncertainty while behaving as if confident he knows what is and is not predictable. Scott unfortunately did not succeed at engaging Tyler’s core argument, and Tyler doesn’t notice his own deep inconsistencies. But maybe they will take a second shot at having a real debate?

Expand full comment

There is middle ground between those.

"Someone has threatened to put a time bomb under my seat, and circumstances surrounding the threat make them plausible to me, and I have literally no idea when it will go off". You would then probably check for a time bomb.

On that note, we are slowly but surely putting effort into asteroid tracking and deflection. That's a known time bomb right there that we haven't quite checked for as much as we could have and should.

I don't understand how you can describe the arguments of those who fear AI ruin as involving 'unknown unknowns' and 'extrapolating outside of the bounds of our knowledge'. They seem quite clear and simple to me. There is hardly any question of whether it could happen: the main question is the likelihood. And that has more to do with known unknowns and how hard it is to determine that, even though everything necessary to do so is well within the bounds of our knowledge.

Expand full comment

If I pick up the stapler on my desk, hold it up, and then abruptly let go, what will happen? Will it fall sideways, or maybe up toward the ceiling? I claim there's a very high likelihood of the stapler falling downward toward the ground. I'm claiming this because it's the logical conclusion of our current scientific model of how physics works.

Is this just an arbitrary prediction based on groundless speculation? Unless I am given absolute proof that the stapler will fall downward, should I just assume it will fall sideways? Should I make the same assumption about a crucible of molten aluminum I'm holding above my feet?

Expand full comment

Gravity is relatively simple and extremely well-understood. AI is neither.

Expand full comment

AI is vastly better understood than gravity.

https://en.wikipedia.org/wiki/Quantum_gravity

Expand full comment

Ha ha no. The fact that we can't make predictions about situations that never arise in practice, and which are actually so high energy that they have never naturally occurred since the origin of the universe (if then), or which occur in regions by definition forever inaccessible (singularities), is a long way from saying we don't understand gravity. We understand it in every known practical situation; we can make predictions which are far more precise than any current or foreseeable technological need requires. From any reasonable technical and scientific point of view, we understand gravity very, very well.

That there is some philosophical dispute about whether we understand it if its underpinnings are logically inconsistent with our other theories (which is true) is the kind of thing that interests a theorist, and bedevils certain kinds of philosophers, but as for the latter we'll get back to them as soon as they settle on just one complete and self-consistent definition of "understand." That should happen before the Sun burns out. Hopefully.

Expand full comment

Why did you link to this article as if it illustrated the claim you were trying to make, when it so clearly doesn't?

Expand full comment

Is the fear that they will kill us or replace us? I don’t really mind if they are our successor species. The world moves on and one day it will move on without us. That was always our fate.

Killing, on the other hand, is a problem. But with our 1.6 birth rate, and them being immortal, I figure they'd just wait us out.

Expand full comment

Of course, if they wanted to kill us, their best play would be to feed our social media w/messages that would drive down that 1.6.

Or goad us into mutual hatred and war.

...which doesn't sound all that different from today.

Expand full comment

Why would that be the best play?

Low birthrates are self correcting in the long run: those people with heritable predispositions to have more offspring (even with the most hostile social media you can imagine) will make up the majority of the population in the future.

Expand full comment

Zvi wrote in his excellent critique of Tyler's position: "If you think it would be fine if all the humans get wiped out and replaced by something even more bizarre and inexplicable and that mostly does not share our values, provided it is intelligent and complex, and don’t consider that doom, then that is a point of view. We are not in agreement." I side with Zvi.

https://thezvi.substack.com/p/response-to-tyler-cowens-existential

Expand full comment

>let’s say 100! - and dying is only one of them, so there’s only a 1% chance that we’ll die

I'm sure a number of readers were wondering why one possibility out of 100 factorial would have a 1% chance.

Expand full comment

I'm one of them. For a split second, at least.

Reminds me of our school class when factorials were introduced. This teacher had students take turns reading aloud from the math textbook. When the reader got to a sentence that mentioned "5!" he shouted: "FIVE". :-)

Expand full comment

There should at least be a plan to deal with the possibility. The U.S. government has made plans for nuclear attacks, alien invasions, various natural disasters, a zombie outbreak, etc. So why not A.I. threat?

Fun fact: In the early 20th century the U.S. government had plans for war with Japan (War Plan Orange), war with the British Empire (War Plan Red), and war with both (War Plan Red-Orange). The last two plans included invading Canada, and this country (my country) had a plan to counter a U.S. invasion.

Does the latter sound implausible? Well, the U.S. invaded us twice; during your War of Independence and the War of 1812-15.

Certainly we should at least think about the possible negative consequences of new technology, and not just A.I. What about nano-machines, genetic engineering and geo-engineering?

Expand full comment

As an American, I wish to apologize on the deepest possible level to you and your people that we failed. We seek to learn, and do better in the future.

But seriously, if you counted every theoretical scenario that the US government has paid some suit-filler to come up with a speculative plan for, you'd find plenty sillier than AI safety.

Expand full comment

I too have been to The Citadel.

The guys in scarlet tunics with bearskin hats shouting orders en Francais gave me a profound sense of wrongness that was not alleviated by the Regimental Goat.

Expand full comment

I'm disappointed that Scott couldn't come up with better steel manning for the opposing view. In fact, I suspect he could. Maybe we need a new fallacy name for when one purports to be steel manning, but in fact intentionally doing such a weak job that it's easy to say "see? that's a totally fair best possible argument for my opponents, and it still knocks over like a straw man!"

In fact, the fairly obvious steel man for “let’s not worry about AI risk” is: we are equally uncertain about risk and upsides. Yes, AI may wipe out humanity. It may also save billions of lives and raise total lifetime happiness for every living person. Who are we to condemn billions to living and dying in poverty in the next decade alone because we’re afraid AI will turn grandma into a paper clip? AI has at least as much upside as risk, and it is morally wrong to delay benefits for the world’s poorest because the world’s richest fret that AI disrupting their unequal wealth is literally the same as the end of humanity.

I’m not advocating that view, just saying that it’s a much more fair steel man.

Expand full comment

Yep, well put. Seems like not just a fair but an obvious steel man, not sure how Scott missed it so wildly.

Expand full comment

Everyone talks about the incredible good that AI is sure to bring us, but I mostly don't get it. Yes, more scientific discovery and increased leisure time, at the expense of all humans losing a sense of purpose, coupled with the dangers of billions of idle minds. Not clear it's a net win even in the best-case scenario.

Expand full comment

Do you happen to have a disease better (or at all) curable thanks to AI? Are you a member of the substantial subset of humans living a pretty awful life, by the standards of the median commenter here?

I’m always deeply leery of arguments in favor of keeping suffering and toil around.

And then there’s the Moloch argument. Scott made it but seems to have abandoned/ forgotten it.

Expand full comment

“Curing previously incurable diseases” is also highly speculative. Any disease that science can cure will eventually be curable, the question is how long will it take? Certainly AI can be used to speed up medical research, but by how much, and does it need to be AGI or would powerful but narrow AI be sufficient? My vote would be to utilize and augment the latter, and be very, very cautious about any steps anywhere near the direction of AGI.

Expand full comment

You started by lamenting the potential harms of AI reaching the point where we're basically idle minds. "Not clear it's a net win even in the best case scenario". That's what I was responding to.

Expand full comment

That’s fair. Do you see how people living in abject poverty might have a different view about suddenly being freed from worrying about living through the day?

Expand full comment

I guess I see AGI development and ending world poverty to be orthogonal

Expand full comment

"Maybe we need a new fallacy name for when one purports to be steel manning."

"Tinfoiling," "to tinfoil": "Dude, stop tinfoiling my arguments!"

Expand full comment

"Tinfoil hat" & it's conspiracy theorist connotations are close enough this would likely get garbled.

Expand full comment

That view Scott dealt with recently in https://astralcodexten.substack.com/p/kelly-bets-on-civilization

tl;dr: Kelly betting says never to put ALL on the line; if you go all-in again and again, the result is DOOM (at 99%+). Thus: do it seldom, and better yet not at all when the chance of DOOM is over 10%.
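For intuition, here's a minimal sketch of that logic in code, with invented numbers (a 60% chance of doubling your stake at even odds, twenty rounds), not anything taken from Scott's post:

```python
import random

def simulate(fraction, rounds=20, trials=50_000, win_prob=0.6):
    """Bet `fraction` of the current bankroll each round on an even-odds gamble."""
    ruined = 0
    for _ in range(trials):
        wealth = 1.0
        for _ in range(rounds):
            stake = wealth * fraction
            wealth += stake if random.random() < win_prob else -stake
            if wealth <= 1e-9:          # effectively wiped out
                ruined += 1
                break
    return ruined / trials

print("all-in every round :", simulate(1.0))   # ruin ~ 1 - 0.6**20, i.e. >99.99%
print("bet 20% every round:", simulate(0.2))   # the Kelly fraction for p=0.6 at even odds
```

The all-in bettor is almost certainly ruined within twenty rounds even though every individual bet is favourable; the fractional bettor never is.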

Expand full comment

Eh, that’s really just a rhetorical device you can use to “prove” anything. Flip it around: let’s say every invention, from nuclear weapons to weaponized anthrax, has a non-zero chance of solving ALL of our problems. Obviously we should embrace them all?

Expand full comment

Err, nope. 1. "solving ALL our problems" is meaningless. Btw: We need problems. 2. "Weaponized anthrax" et al. could only solve all our problems by killing us all. Oh, I see, you are right. But obviously we should not embrace it, then. - 3. Winning a Kelly-bet does "only" double the money you risked each round. Not make you owner of the universe (at least not in the first rounds). Just when one loses even one round having gone "all in" each time, it is: game over. Thus you might want to refrain from betting all (betting half seems fine in the original scenario).- I feel, you might re-read the text. You did read it, right? Again: https://astralcodexten.substack.com/p/kelly-bets-on-civilization

Expand full comment

Yes, I read it. It’s sophistry. My point wasn’t about anthrax per se, just that you can apply the “lots of small chances compounded” argument to anything. The world is probabilistic. AI is only different if you accept the circular reasoning that it’s different.

Expand full comment

> Maybe we need a new fallacy name for when one purports to be steel manning, but in fact intentionally doing such a weak job that it’s easy to say “see? that’s a totally fair best possible argument for my opponents, and it still knocks over like a straw man!”

Tinfoil manning? Rust manning?

Expand full comment

I think tin-manning, with lower-case letters to distinguish the Bioshock or toy-soldier sense from the Wizard of Oz sense. Of course, if we were all adults, we’d just call it misrepresenting or missing the point, like I think Tyler does.

Expand full comment

> Who are we to condemn billions to living and dying in poverty in the next decade alone because we’re afraid AI will turn grandma into a paper clip?

We would be the people responsible for doing that, that's who. So maybe we should try to make sure we don't do that.

Expand full comment

Well, the obvious reply is that delaying the upside is just a delay, leaving open the possibility that in a hundred or a thousand years we figure out how to do it safely and then have a million years of bliss, but if we face the risk too soon we lose it all. The cost of delay may be huge, but the cost of failure is everything.

As steelmen go, I don’t find this one particularly impressive.

Expand full comment

Given that nobody has proven that there is danger, it's a tall order to demand a proof of safety against a possibly nonexistent danger.

And you and I aren’t the ones to bear the cost of delay. At the very least it seems like those who have the most to gain should have some agency in this. Otherwise we’re just the old aristocracy pooh-poohing the printing press as the possible end of humanity, and maybe it should be withheld, you know, for safety.

Expand full comment

The fact that this is already being perceived as an arms race between China and the USA reduces the chance of any agreement to slow down.

Expand full comment

Strengthens the nuclear weapons analogy

Expand full comment

The nuclear arms race has mostly stopped.

Nuclear weapons are still around, of course, but the major powers aren't actively adding to their stockpiles.

Expand full comment

China is, for what it's worth.

Expand full comment

I find a lot of the reasoning behind AI doomerism mirrors my own (admittedly irrational) fear of hell. You have heaps of uncertainty, and Very Smart People arguing that it could be infinitely bad if we're not careful.

The infinite badness trips up our reasoning circuitry--we end up overindexing on it because we have to consider both the probability of the outcome *and* the expected reward. Even granting it a slim chance can cause existential dread, which reinforces the sense of possibility, starting a feedback loop.

I'm not saying we shouldn't take AI safety seriously, or even dismissing the possibility of AI armageddon. But I'm too familiar with this mental space to give much credence to rational arguments on the subject.

Expand full comment

I don't think the infinite badness arguments are necessary. If, for example, you were to take Scott's probability of 33% seriously - then this problem seems enough to overwhelm almost any other in importance just considering the impact on the finite number of people alive today.
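Back-of-the-envelope version of that point, taking the 33% figure at face value and counting only the people alive today:

```python
population = 8_000_000_000          # roughly the people alive today
p_doom = 0.33                       # Scott's figure, taken at face value
print(f"expected deaths: {p_doom * population:,.0f}")   # ~2.6 billion
```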

Expand full comment

On the other hand, even people working closely in the AI industry think the percentage is closer to 5%. If you polled most people (even weeding out the confused looks) about how likely it is that a computer will kill us all, the number is going to drop a lot. Whose number do you use for your calculation, and why?

Expand full comment

I figure even a 5% chance is high enough to warrant tens of billions of funding, and many policy changes, even if it may not warrant some of the more extreme policy suggestions.

I suspect average people aren't as skeptical as you think. One data point: the following poll claims 12% of Americans think human-level machine intelligence could be very bad, possibly human extinction: https://www.statista.com/chart/amp/16623/attitudes-of-americans-towards-ai/

Expand full comment

Sure, a 5% chance would be worth spending billions on. That's not a very interesting question, since I think we would agree that a 0.0000001% chance would not be worth spending billions on or a 90% would be worth significantly more - such that the percent is what really matters.

Following your link I wasn't able to find the source for the 12% figure; I was looking for the specific language of the question that led to that claim. It appears to be an "if superhuman intelligence existed, how would you feel about it?" question. 12% is believable for that as a concern about human extinction. What such a question misses is "what are the chances that superhuman intelligence will be created in the first place?"

My own much lower prediction for human extinction from AI stems from the confluence of those two points. Both are minority possibilities to me, such that requiring both [superhuman intelligence is created] and [superhuman intelligence can/would destroy the world] results in a low chance overall.

Expand full comment

I worry that my responses in this post could be construed to say that I am in favor of AI research or have no worries about AI. Neither is true. I think AI has the potential to transform lots of our society in bad ways (the most obvious example is to make more things like social media, which I think are net bad for humanity). That it also has the potential for far more serious harms is an unnecessary point to make, to me, but further cements the need to limit AIs.

I have long said that an AI doesn't need to be "intelligent" (or conscious, or whatever) in order to be dangerous. A toaster with the nuclear launch codes or control of the power grid is still quite dangerous. My hope is that we as a society recognize the dangers (or someone does something very stupid but mildly harmful that blows up badly) and we severely limit what we ever allow a machine system to independently control.

If an AI system is both very intelligent (leaving aside "super intelligent") and conscious on some level (goal-forming or independently goal-oriented) that could certainly lead us to additional concerns. To me, those concerns are downwind of what we use AI for. If AI systems are never hooked into our critical infrastructure or control of our daily lives, then this additional concern is also far less worrisome.

Left to their own devices, I think some humans would be dumb enough (or short-sighted for their own gains) to do something this stupid. If that works out poorly for them and society has to step in to stop them, that may be enough emphasis for the rest of society to ban or heavily restrict most uses, especially the truly dangerous stuff.

Expand full comment

The badness of human extinction is finite. Your argument holds for S-Risks ("nearly" aligned AI), but not for AI that just kills everyone.

Expand full comment

Disagree. Hell is not real. We have no tangible evidence of it whatsoever. That, and not the “infinite badness of hell” thing, is the reason to reject fear of hell. If you didn’t care about consequences, but could cold-bloodedly reason about whether or not hell might exist, you’d say no (or that the probability is too small to care or that you’d equally fear anti-hell which is also infinitely bad and where you go for being virtuous, but just under-discussed) and go on your merry way.

We have TONS of tangible evidence of intelligence, of the dynamics between superior and inferior intelligence, of specific material ways that this could go wrong. A cold blooded person who didn’t care if humanity went extinct, or an anti-natalist who’s rooting for that outcome, could easily come to the conclusion that there’s a significant chance of AGI doom. Totally different reasoning process.

Expand full comment

[deleted]

Expand full comment

1. Substance: I think you're slightly, but only slightly, uncharitable to Tyler's argument. I think the other implication of the argument is that we can't do a lot about safety because we don't understand what will happen next.

2. Chances: I view the chance of catastrophic outcomes at below 10% and of existential doom at... well, much lower. I think that we've lost some focus here by going to existential-risk-only badness, and that there are quite bad outcomes that don't end humanity. I'm prepared to expend resources on this, but not prepared for Yudkowsky's War Tribunal to take over.

3. I *think* that bloxors aren't greeblic, and I should bet on it, assuming these words are randomly chosen for whatever bloxor or greeblic thing we are talking about.

Is the element Ioninium a noble gas? A heavy metal? A solid? A liquid? A gas? Was Robert Smith IV the Duke of Wellington? Are rabbits happiest? I mean, sometimes it'll be true, and sometimes it'll be likely to be true, but most is-this-thing-this-other-thing constructions are false. [citation needed]

"The marble in my hand is red or blue. Which is it?" - OK, 50-50.

"Is the marble in my hand red?" Less than 50-50.

I therefore think bloxors are not greeblic, and I am prepared to take your money. Who has a bloxor and can tell us if it's greeblic?

Expand full comment

The trivial counter to this is we're not sure whether greeblic means "red" or "not red", so even if "is the marble in my hand red?" has a less than 50% chance, "is the marble in my hand greeblic?" has an exactly 50% chance.

This is being a bit unfair because normal human languages are more likely to have words for single colors than for the set of all colors that are not a certain color. But that's specific useful evidence we have. If we genuinely had no evidence - we didn't know if "greeblic" comes from a civilization that naturally thinks in terms of colors or sets-minus-one-color - then it would be 50%.
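A small sketch of the symmetry argument, assuming "greeblic" is equally likely to mean "red" or "not red":

```python
# P(greeblic) = P(means "red") * P(red) + P(means "not red") * P(not red)
for p_red in (0.1, 0.3, 0.5, 0.9):
    p_greeblic = 0.5 * p_red + 0.5 * (1 - p_red)
    print(p_red, p_greeblic)   # 0.5 every time
```

The 50% only moves once you have evidence about which kind of meaning is more likely, which is exactly the point about languages favoring single-color words.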

Expand full comment

But an arbitrary civilization is more likely to think in terms of colors than sets-minus-one-color -- first of all it's a more compact way of representing information, and also what human civilizations do is evidence of what alien civilizations are likely to do.

Expand full comment

'Greeb' is red, 'lic' is a negating suffix. Plenty of human languages have constructs like that. Is that convincing or unconvincing?

Expand full comment

A possibility to be considered, sure. But I don't think you wind up with an exactly 50% chance that the marble is greeblic.

Expand full comment

The alien spaceship example is really good, because it prompts reflection about the reality of physical constraints (ftl travel/nanotech), the implausibility of misalignment as default (being hellbent on genocide despite having the resources and level of civilization necessary to achieve interstellar travel/outsmarting humanity but still doing the paperclip thing) and how an essay author’s sci fi diet during their formative years biases all of it.

Expand full comment

A charmingly self-defeating argument. Interstellar travel is quite likely not possible by the same reasoning as the alien visit is.

I happen to be an AI risk skeptic- but my reasons involve actual arguments and math, not snark.

Expand full comment

Why do you think interstellar travel isn't possible?

We already have designs for interstellar vessels that we know would most likely work with current technology, the most prominent being the Orion drive.

Expand full comment

I do not. But the comment I replied to mentions physical constraints on the appearance of aliens. I interpret it as saying that these physical constraints make aliens highly unlikely to appear, with the analogy being that AGI is unlikely to develop super-intelligence due to physical constraints as well.

If AGI can figure out interstellar travel - why can't aliens? And if aliens can be a realistic threat a la Scott's example - why not AGI? (again, my position is that there are great answers to that question. But they're not based on snark.)

Expand full comment

Ok, that makes a bit more sense.

To answer your question:

The Grabby Aliens hypothesis resolves the problem of 'no visible alien spaceships, but AI/humans can invent interstellar travel' fairly convincingly.

Expand full comment

Sure, I'm not the one saying that the alien example is somehow an argument against AI risk! Don't know that I'm a big fan of Hanson's paper, but I do support examining each of these situations separately, on its merits, and I do oppose making facile comparisons interlaced with cheap shots.

Expand full comment

Orion wouldn't work. Nobody knows how to make the bazillion tiny nuclear explosives you'd need to get it to the stars. I mean, if you just want to get to Jupiter and back real quick, sure. But if you want to get to Alpha Centauri, it's not possible with that design.

Indeed, in general it's merely the rocket equation that dooms interstellar travel, not the engine or spacecraft technology. You could get to Alpha Centauri in an Apollo rocket in ~6 years -- if you could figure out a way to maintain the first stage acceleration of ~1.2g indefinitely, which means starting off with ~10^14 kg of kerosene fuel (about equal to the entirety of the world's current known oil reserves) and about 0.01% of the O2 in the Earth's atmosphere.
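For what it's worth, here's roughly where the ~6 years comes from, assuming constant 1.2g proper acceleration to the midpoint and then flipping to decelerate (the fuel-mass estimate isn't reproduced here):

```python
import math

c = 299_792_458.0               # speed of light, m/s
a = 1.2 * 9.81                  # constant proper acceleration, m/s^2
ly = 9.4607e15                  # metres per light-year
d_half = 4.37 * ly / 2          # accelerate to the midpoint, then flip and brake

# Earth-frame time for one constant-acceleration leg of a relativistic trip
t_half = (c / a) * math.sqrt((1 + a * d_half / c**2) ** 2 - 1)
print(f"Earth-frame trip time at 1.2 g: ~{2 * t_half / 3.156e7:.1f} years")   # about 5.8
```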

Expand full comment

You can get around the rocket equation in two ways:

On the way out, send particles or photons from earth (or the solar system in general) to hit the ship in the back.

On the way into a new star system, use a Bussard ramjet to brake, or do aerobraking in the outer layers of a red giant, etc.

On the way into a star system you've already been to: use the particle / photon beam to brake.

(Funny thing is that the maths for Bussard ramjets don't work for accelerating, because collecting the hydrogen slows you down too much. But that makes them great for braking.)

Expand full comment

I agree with the first, although how you focus a beam[1] across light-years tightly enough to hit a solar sail is an unsolved problem in laser engineering for which I would not dream of writing the proposal. It's not even clear to me that it's hypothetically possible for the nearest stars -- there are fundamental limitations on how well you can focus something generated by a finite-size light source. I think the Starshot people are intending to generate fantastic accelerations very close to the source (here) by using enormous lasers and very tiny masses.

I have nothing to say about the second, since there is no such thing as a Bussard ramjet outside of fiction.

---------------------

[1] You can't use charged particles because of the interstellar magnetic field, and you can't accelerate neutral particles sufficiently.

Expand full comment

Could I see the math that made you an AI risk skeptic?

Expand full comment

"math that made you an AI risk skeptic" - this already presupposes that the default position one starts with is to assume AI doom. I don't know that even EY would agree!

I won't do a full re-telling of the various points I've been making over time - maybe in a higher-visibility comment, or perhaps I should get my own blog...

A partial recap: as I said above, the default is _not_ AI-caused extinction. The burden is on the AI risk proponents to argue for their position. Which they very much did, of course. So the question is rather what do I think about some of the arguments for AI risk. Here are some thoughts.

(*) One standard account goes "AI is slightly smarter than humans. It learns to design its own code a bit better than humans did. The AI that it designs is in turn a bit smarter than the previous version. Rinse and repeat." In the "Superintelligence" book Bostrom describes this process, models it as having a fixed rate of improvement, and, with great profundity, declares that the solution of this resulting differential equation is... rolling drums... an exponential.

Well, yeah.

So to reiterate - in a book filled to the brim with meticulous and painstaking evaluation of every parameter possible ("how many GPUs can the world possibly produce? And what if we learn to produce them out of sheep? And how much cooling power would all those GPUs need? And what if they are placed in the ocean?"), an absolutely essential modeling assumption is... just stated.

I reject this assumption whole-heartedly. I won't bother giving the myriad reasons here and now, but no, there absolutely shouldn't be an assumption of a fixed rate of improvement without considering diminishing returns.

(*) Somewhat relatedly, intelligence is only one bottleneck. An AI that has an IQ of 225 (yeah yeah this is a gross oversimplification) would still not have any magical way of overcoming the fact that P is not NP (I know that's not a theorem. I really do.). Tasks that essentially require brute-forcing would still require it, and combinatorial explosion is a much harder wall than a somewhat-smarter-than-humans-AI is an effective ram. Another bottleneck is physical evidence. We're not talking about a super-intelligence godlike AI appearing ex nihilo - as the process of self-improvement and world-comprehension speeds up, it would require actual information about the world. Experiments, sensing the world. Would an AI design great experiments? Sure. It would still need to wait for bacteria to grow, for chemical reactions to occur, for economies to react etc. This reminds me of an old Soviet joke: a specialist is sent abroad and is asked what he requires in order to have a family. He says "well, a woman and nine months". "Time is pressing. We will give you nine women and one month.".

Similarly, scarce high-quality data is scarce. Autonomous driving is easy, except for all those pesky unexpected and "un-expectable" problems that you can't collect data for and can't even name confidently enough ahead of time to generate them synthetically. Source: spent years working at a leading autonomous driving company.

A different class of objections is a bit more mathematically involved.

(*) EY loves asking why we would expect the "brain-space" to be capped at a less-than-godlike level of intelligence. Fair, but if we actually think about this mathematically - two follow-up questions are what the optimization space would look like and whether any such dangerous points-in-brain-space are reachable by a continuous path from a seed that we might generate. Analogies from Morse theory and optimization lead me to believe that you should visualize the solution space as roughly stratified, with separate levels substantially higher than previous ones being reached via narrow ravines. This is also related to the famous "lottery ticket hypothesis" and the literature around it. If this picture is essentially correct, then we can also expect a given seed to _not_ be at the entrance of such a ravine, and any given process to struggle climbing too many strata. Incidentally, deep learning overcomes this objection by over-parametrizing like crazy (hence the lottery ticket - a "lucky" subset of parameters). Therefore, we might expect the necessary amount of parameters and other resources to explode with the number of strata to pass - on top of the exponential explosion necessary for the actual increase in performance.

(*) Why do the learning systems we have learn? The extremely high-level answer is concentration of measure (in the Hoeffding inequality sense). Pretty much all our understanding of learning is in this context (sorry Prof. Mendelson! I do remember and appreciate, if not fully comprehend, your work on learning without concentration!). We can make more, or less, use of each sample, but as long as this is our "mathematical substrate", we can't really escape all kinds of lower bounds. So why do humans, and to a smaller extent good models, learn with any kind of efficiency? Inductive biases. But the world is more complex than ImageNet, and incorporating inductive biases that are useful and yet generic enough is damn hard. Humans do it in ways we do not fully comprehend. Until and unless we make conceptual breakthroughs in this area, I lower my belief in AI risk. And having spent quite some time thinking about it and knowing that far smarter people have spent even more time doing so, yeah, it's hard. This is not unrelated to the classical Lake et al. 2016 paper, "Building Machines That Learn and Think Like People", https://arxiv.org/abs/1604.00289 .

(*) Gradient descent can be argued not to be the "philosophically correct approach". What I mean by that is that trying to come up with a principled "justification" for deep learning models looks like Patel et al. 2015, "A Probabilistic Theory of Deep Learning", https://arxiv.org/abs/1504.00641 . And the optimization in such models does not look like gradient descent. What it does look like is some form of expectation-maximization iterative algorithm, except that these don't actually work AFAIK. (ETA - in the context of deep learning, as replacements for gradient descent).

I could go on. But the upshot is that the moment you actually know something about the matter and try to be a researcher and not an advocate, you end up encountering serious reasons to doubt the EY-Bostrom narrative. This is not to say that counter-arguments can't be produced. But most prominent AI writers not even being familiar with most such arguments is not great.

Finally, I should clarify that I do not believe AGI or even superintelligent AGI are fundamentally impossible. I just think we're much, much farther away than many people here believe, and to reuse an analogy I already made elsewhere in the comments (and possibly stole from John Schilling?), from where we are, AI safety is much like deflecting that asteroid in 1600. The actions we have to take to develop and understand AI and the ones we have to take to develop and understand AI alignment are the same, just like deflecting the asteroid would require the development of modern physics and astronomy. Scott wrote about "burning AI alignment time" while we're madly rushing to develop AI. I disagree.

Expand full comment

Hi, I think it's super great that you're writing up arguments and have an actual opinion, even if I think you're mistaken on certain object level points.

With that said

>I reject this assumption whole-heartedly. I won't bother giving the myriad reasons here and now, but no, there absolutely shouldn't be an assumption of a fixed rate of improvement without considering diminishing returns.

Even if diminishing returns were a thing (and by definition they need to exist even if we just talk about physics), there's no reason to believe they would cap out close to human intelligence. And **even if that were true**, returns to acts of intelligence can be dramatically non-linear. See: being 10% more charismatic than the other presidential candidate doesn't mean you get a 55-45 split at being president; it means you become president. Increasing success rates by 1-2% on multiple steps of a multi-step process can double or triple the overall chance, and so on.
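A toy illustration of that compounding, with invented numbers: a plan that needs 50 steps to all succeed roughly triples its overall success chance when each step goes from 90% to 92%.

```python
for per_step in (0.90, 0.92):
    print(per_step, per_step ** 50)   # ~0.005 vs ~0.015: a 2-point nudge per step triples the total
```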

Human brain size is still increasing! We're mostly limited by the extremely slow speed of evolution needing traits to be driven to fixation, to say nothing of our incredibly inefficient studies on how to conduct education properly or to properly fuel innovators. The AI does not need to be anywhere close to omnipotent to be significantly more cognitively capable, even if all we did was grant that it can only be as efficient as a von Neumann, as the lack of bodily needs and the ability to coordinate with successors in ways humans cannot are large strategic advantages already.

So what is it that makes you think that this objection becomes relevant right at the human level?

> Somewhat relatedly, intelligence is only one bottleneck. An AI that has an IQ of 225 (yeah yeah this is a gross oversimplification) would still not have any magical way of overcoming the fact that P is not NP (I know that's not a theorem. I really do.)

The standard response is merely: https://gwern.net/complexity but to summarize.

Usually problems in NP are hard because you are trying to find optimal solutions, but both humans and AI still stand to benefit from improvements in heuristics. In addition, if you **really** believe that P not being equal to NP is a substantial obstacle to superintelligence, then you're going to have to have some hard numbers about why exactly a 2028 human-level civilization is at the right level for things to be inexploitable.

This is not an isolated demand for rigor! Cryptography can make statements about how long it would take to brute force certain algorithms, and you can often sketch out good heuristic arguments for why O(N^2) algorithms would be too slow for reasonable sizes of data. So what gives?

> Therefore, we might expect the necessary amount of parameters and other resources to explode with the number of strata to pass - on top of the exponential explosion necessary for the actual increase in performance.

Interesting, what do you think it means when GPT-4 continues to improve despite being crippled by RLHF? Like, surely if this were true this would be showing up **by** now. But if it's not, why do you expect that it would show up at levels of computation exactly most convenient to your worldview?

Incidentally, why do your beliefs not exclude humans as impossible or as magic? I mean, clearly our brains aren't big enough to fit an exponential amount of parameters (yes yes yes, I'm aware that neurons are not equivalent to neural nets, but as neural nets are universal function approximators, couldn't they then essentially simulate whatever magic sauce our brain has?)

> Why do learning systems we have learn?

Thank you for the link! I'll think on this and adjust my internal estimates.

> I could go on. But the upshot is that the moment you actually know something about the matter and try to be a researcher and not an advocate, you end up encountering serious reasons to doubt the EY-Bostrom narrative. This is not to say that counter-arguments can't be produced. But most prominent AI writers not even being familiar with most such arguments is not great.

It'd be nice if machine learning researchers could offer arguments at a level higher than Chollet's, or, like nostalgebraist, have large amounts of cope about how language-model successes without the blessing of academics are not real.

> I already made elsewhere in the comments (and possibly stole from John Schilling?), from where we are, AI safety is much like deflecting that asteroid in 1600

Newton was born and invented the study of calculus as well as our modern conception of classical physics, both dynamics and statics, from which you can at least **theoretically** figure out how to deflect an asteroid if you had the ability to locate it and apply force to it.

And this is essentially MIRI's stance. We do not even have an **impractical, theoretical** way to align any agent. I struggle to understand why you think we would be able to get to the AI alignment equivalent of modern physics without classical mechanics, or calculus, or any investment in physics at all. Nor do I see the shape of the advantage that waiting offers us if we do not act, like, okay you could have made that argument 20 years ago, but what concretely about alignment has been made easier in those 20 years? If there's nothing, why would you expect there would be? And if it is in some nebulous time in the future **what makes you think that it would result in enough time**?

The name of the game is trying to survive out of control optimization processes, not try and be as cute and efficient with human capital as possible.

Expand full comment

Let's go through these points one at a time.

> A partial recap: as I said above, the default is _not_ AI-caused extinction. The burden is on the AI risk proponents to argue for their position.

Bayesian probability. You start with a prior and update it. When you try to unwind as far as possible, to the first prior where you have no information at all, you have not the slightest idea what AI or extinction is, so you assign it 50%. This probability then gets updated. Burden of proof isn't a thing.
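A minimal sketch of that updating procedure, with a made-up likelihood ratio purely for illustration:

```python
prior = 0.5                    # maximally ignorant starting point
likelihood_ratio = 0.25        # P(evidence | doom) / P(evidence | no doom), invented
posterior_odds = (prior / (1 - prior)) * likelihood_ratio
posterior = posterior_odds / (1 + posterior_odds)
print(posterior)               # 0.2
```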

But sure, we don't get to strongly claim the risk is substantial without evidence.

We don't have a detailed idea of exactly how intelligence will grow once AI starts improving itself. Current AI research doesn't seem to be grinding to a halt, it almost seems to be going faster and faster. And once AI can improve itself, that adds a strong new positive feedback loop.

If the characteristic of physical laws is anything like we understand it, then the AI must hit diminishing returns to hardware design at some point, and it is likely to hit diminishing returns to software design too.

A nuclear weapon starts off with exponential growth, and then hits diminishing returns as supplies of fissionable atoms run out.

The diminishing returns don't seem to be kicking in too hard in current AI research.

So the question is, do the diminishing returns kick in at 10% smarter than a human, or at 10 orders of magnitude smarter?

The heuristics and biases literature shows a lot of ways humans can be really stupid. Humans just suck at arithmetic. Human nerve signals travel at a millionth of light speed.

The AI is limited by the data it has access to. This is a real limit but not a very limiting one. Human physicists can construct huge edifices of theory based on a couple of slight blips in their data (like the precession of the orbit of mercury).

Humans have been gathering vast amounts of data about anything and everything and putting it on the internet.

Modern AIs are really, really data inefficient, and fussy about what kind of data they get fed. A self-driving car might need thousands of examples of giraffes on the road, in all different lighting conditions, with all the data taken with the same model of lidar the real self-driving car will have, in order to respond sensibly if it meets a giraffe on the road in real life. A human can work just fine from having seen a giraffe in a nature documentary, or from reading the wiki page about them, or just from seeing that it resembles some other animal.

Human science runs controlled trials to remove as many factors as possible. A bunch of people chatting about a drug they took on social media has info about the drug's effectiveness in there; it's just got selection effects, ambiguous wording, outright lies, etc., and human scientists can't reliably unravel the drug's effects from all the other factors. But the data is there in principle. For that matter, the Schrödinger equation and the human genome (plus a few codon tables and things) should in principle give enough info to get a pretty complete understanding of human biology.

Gwern wrote an essay on why computational complexity constraints aren't that limiting in practice. https://gwern.net/complexity A quick summary:

1) Maybe P=NP

2) Maybe the AI uses a quantum computer.

3) Complexity theory is about the worst case, real world problems are often not worst case, a traveling salesman problem with randomly placed cities can be much easier.

4) Often you don't need the exact shortest path (see the sketch after this list).

5) The AI can skip the traveling salesman problem by inventing internet shopping.
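To make points 3 and 4 concrete, here's a nearest-neighbour heuristic on a random-city instance; it's not optimal, but it's fast and usually lands within a modest factor of the optimum (the instance and the seed are of course invented):

```python
import math, random

random.seed(0)
cities = [(random.random(), random.random()) for _ in range(200)]

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Greedy nearest-neighbour tour: always walk to the closest unvisited city.
tour = [0]
remaining = set(range(1, len(cities)))
while remaining:
    nxt = min(remaining, key=lambda i: dist(cities[tour[-1]], cities[i]))
    tour.append(nxt)
    remaining.remove(nxt)

length = sum(dist(cities[tour[i]], cities[tour[i + 1]]) for i in range(len(tour) - 1))
print(f"heuristic tour length: {length:.2f}")
```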

"Fair, but if we actually think about this mathematically - two follow-up questions are how would the optimization space look like and whether any such dangerous points-in-brain-space reachable by a continuous path from a seed that we might generate."

Not sure what the word "continuous" is doing there. When humans do AI research, they sometimes do deep theoretical reasoning, and come up with qualitatively novel algorithms. When an AI is doing AI research, it can do the same thing. This isn't gradient descent or evolution that can only make small tweaks.

"and incorporating inductive biases that are useful and yet generic enough is damn hard. Humans do it in ways we do not fully comprehend. Until and unless we make conceptual breakthroughs in this area, I lower my belief in AI risk."

So humans have made some progress in better priors (and so less data needed) but we don't yet understand the field fully and there is clearly a significant potential for improvements. Yes it is tricky. If it was super easy, we would have done it already. But is it easy enough that another 10 years of research can find it? Is it easy enough that an AI trained with current inefficient techniques can find it?

"we can't really escape all kinds of lower bounds."

True, there are all sorts of lower bounds that apply to any AI system. Humans exist, so humans are clearly allowed by these bounds. And I haven't seen anyone taking a particular bound, and arguing that humans are close to it. And that an AI that was constrained by that bound wasn't that scary.

Suppose your AI has exactly the specs of von Neumann's brain: 15 W, about 3 kg, made out of common elements. Let's suppose millions of them are easily mass-produced in a factory, and they all work together to take over the world.

None of your "lower bounds" can rule out this scenario, as such minds are clearly physically and mathematically possible.

You might be able to find a lower bound that applies to current deep learning, or to silicon transistors, that rules this out, but that wouldn't stop us inventing something beyond current deep learning or silicon transistors. Such a bound would push the dates back by a few years as the new paradigm was developed though.

Expand full comment

Could I see the math that made you a Christian Hell skeptic?

Expand full comment

You're right that I was being snarky. Let me try to be more explicit about my objection.

I feel the AI x-risk discussion suffers from sci-fi poisoning. "This looks like a malign sci-fi scenario I'm familiar with. I admit there are other benign explanations. But if you think there's a possibility that it's the bad scenario, then [some property is breezily projected to expand infinitely] which would be terrible, and so we should take [dramatic action]."

I don't buy this. I think you need to justify the plausibility of the malign sci-fi scenario for it to be considered at all, no matter how familiar it is. It's almost certainly not a spaceship. And if it was a spaceship it probably wouldn't be here to steal our water or mate with our women or whatever. And I think this is the case for AI, which won't have (first-hand) animalian evolutionary drives, may be entirely avolitional, and will still be subject to material and energetic constraints on paperclip production.

Expand full comment

Sure, but then you have to actually argue your case and to acknowledge the existence of a million billion pages of EY trying to explain why he believes otherwise. I will be posting some anti-AI risk thoughts presently, but importantly, I will be trying to do the actual object-level work (or hint at it).

Without this, you're basically going "nah" dismissively.

Expand full comment

I think there has to be a middle ground that does not involve me reading a million billion pages of EY. I feel strongly that I have read enough to get the idea!

I think it's great to argue about this stuff on the internet, but I'm not going to sign on to that kind of credentialism based on what I have observed so far. I have my opinions about it, they're formed by a bunch of philosophy of mind and psychobio stuff I read back in college and an okay-but-outdated understanding of neural networks. That may be totally insufficient to justify my hunches, but a kajillion pages of EY and NB introspecting and then pulling the same "but what if this component is infinite?!" shtick over and over has not yet badgered me into intellectual humility. I might just be a jerk

Expand full comment

Hey I just demanded you to _acknowledge their existence_, not read them! :D

I don't want the argument to be "if you don't memorize the Sequences, butt out of the discussion". But the comment I was responding to said "I think you need to justify the plausibility of the malign sci-fi scenario for it to be considered at all" and, well, AI risk proponents have worked very hard to do just that. If they're wrong, they're wrong in ways more interesting than "they just assume something to be infinite".

Expand full comment

Well, I have read their ideas about the para-luminal Von Neumann probes, and I have read their ideas about the gray-goo nanotech (I understand England's new king is very worried about this, too). The level of scientific sophistication on display there has not left me thirsting to read the rest of the LW apocalypse back catalog, though I admit I can't dismiss their designed plague scenarios quite so breezily.

> If they're wrong, they're wrong in ways more interesting than "they just assume something to be infinite".

I admit I've read Bostrom more closely than EY but I don't think this is correct. He pulls the same move with his simulation argument, too. It's everywhere.

Expand full comment

I find it extremely unlikely that bloxors are greeblic. I get the overall point you're trying to make, but please stick to reality! We all know they're far more spongloid and entirely untrukful.

Expand full comment

We probably shouldn't worry too much about the greeblic apocalypse.

Expand full comment

"2) There are so many different possibilities - let’s say 100! - and dying is only one of them, so there’s only a 1% chance that we’ll die."

If there are 100! possibilities, then the chance that we'll die is much much lower than 1%.
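For scale, taking the factorial reading literally:

```python
import math
n = math.factorial(100)
print(len(str(n)))      # 158 digits, i.e. 100! is about 9.3e157
print(1 / n)            # ~1e-158, vastly smaller than 1%
```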

Expand full comment

People don’t seem to understand that Scott might have meant that as an exclamation point, not a factorial operator.

Expand full comment

"You can try to fish for something sort of like a base rate: “There have been a hundred major inventions since agriculture, and none of them killed humanity, so the base rate for major inventions killing everyone is about 0%”."

I get this isn't your actual argument (you're trying to steelman Tyler) but I can't help but point out that this falls victim to the anthropic principle. We are not there to experience the universes in which a major invention killed all of mankind.

Expand full comment

When you can't make any strong arguments for any particular constraint on future history, do it like AlphaZero: try to simulate what will happen over and over, at each step gauging its relative plausibility, updating those estimates based on how things turn out in the sub-tree, and leaving no corner unturned. I find that when I do this, I just can't find any super-plausible scenarios that lead to a good outcome; it's always the result of some unnatural string of unlikely happenings. On the other hand, dystopic and extinction outcomes seem to come about quite naturally and without any special luck; most paths lead there. Of course your results will vary depending on your worldview and your conception of the eventual capabilities of these things, but I suspect that some people who aren't worried haven't actually tried very hard to forecast.
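For concreteness, here's a toy version of that exercise: repeatedly sampling paths through a hand-made scenario tree whose branches and plausibility weights are entirely invented, just to illustrate the mechanics (nothing like AlphaZero's actual search):

```python
import random

TREE = {
    "start": [("race dynamics", 0.7), ("global coordination", 0.3)],
    "race dynamics": [("misaligned AGI", 0.6), ("muddling through", 0.4)],
    "global coordination": [("slow careful progress", 0.8), ("defection", 0.2)],
    "defection": [("misaligned AGI", 0.5), ("muddling through", 0.5)],
}
OUTCOMES = {"misaligned AGI": "bad", "muddling through": "mixed",
            "slow careful progress": "good"}

def rollout():
    node = "start"
    while node not in OUTCOMES:
        children, weights = zip(*TREE[node])
        node = random.choices(children, weights=weights)[0]
    return OUTCOMES[node]

counts = {"good": 0, "mixed": 0, "bad": 0}
for _ in range(100_000):
    counts[rollout()] += 1
print(counts)   # the answer is only as good as the invented weights
```

The output is only as good as the weights you feed it, which is the whole exercise: arguing honestly about what those weights should be.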

Expand full comment

Don't you think you're falling into what Tyler described as a failure of imagination, if all you can see is extinction?

"Since it is easier to destroy than create, once you start considering the future in a tabula rasa way, the longer you talk about it, the more pessimistic you will become. It will be harder and harder to see how everything hangs together, whereas the argument that destruction is imminent is easy by comparison."

Expand full comment

I can't prove that I'm not exhibiting a particular bias, but my feeling is that, no, that isn't a problem. Here's a bit of evidence in my defense: I think that, conditional on AI progress halting immediately, things start to look pretty good, and I would be cautiously optimistic. We still have to deal with nuclear weapons, climate change, social dynamics, pandemics, and other new technology, but those are the types of things we've overcome before, and even in worst-case scenarios we might survive and eventually start thriving again. I see a lot of struggle in this scenario, and I'm sure there will continue to be atrocities, but things probably get gradually better when you zoom out to the scale of centuries. It's just that I don't think this is gonna happen: I don't think we will get policy makers and tech corporations scared enough to halt progress, and even if we were all scared enough, I think we'd struggle to coordinate on the details. It's possible, I could see it happening, but it would involve a lot of luck. There are lots of ways it could go well, lots and lots, but they all look something like this, involving an unlikely pivotal event or series of events.

Expand full comment

You can't imagine any possible way that AI doesn't lead to extinction?

Suppose their intelligence isn't exponential but asymptotic, for one. Suppose it's a far harder problem for a smart AI model to advance materials and computer science enough to recursively improve its knowledge than you assume? Maybe the future looks more like the Culture novels than Terminator.

It just seems like a failure of imagination to say all paths lead to extinction.

Even Scott, who in a previous post called SF the "summoning city", gives an apocalypse percentage of 33% here. You can't imagine any future in the 66%? Try some thought experiments in that arena too.

Expand full comment

No like I said I can and have imagined many such scenarios, including that one. That one I also find unnatural and implausible seeming, even more so than the whole world suddenly getting freaked out and calling it quits for a while. To be clear, I would roughly say chance of extinction is about 50% and chance of very long lasting dystopia another 30-40% depending on my mood. I'm less optimistic than Scott but more than Eliezer.

Expand full comment

Suppose AI ends up above humans, but not that far above. Like slightly less than the gap between humans and chimps. Also suppose the AI is totally indifferent to human wellbeing. I would guess that this scenario would still lead to human extinction.

Expand full comment

Isn't splitting the atom an example of a new technology that everyone, and especially the folks involved, could see as a danger? I mean, it's great we have spectroscopy and other wonders of quantum discreteness, but isn't the threat rather astounding, and yet we pursued it anyway? It doesn't take much imagination to envision any number of scenarios that end with a pretty bleak future. That threat, it seems to me, was apparent early on, and predictable, compared to the other examples. So, there's one example of significant change being consciously pursued despite the predictable risk.

Expand full comment

How disappointed must/would those people be today? We both failed to switch to nuclear power and built lots of warheads that just sit there.

Not again. Benevolent Godlike AI or great filtered. I won't settle for anything less!

Expand full comment

I think one point here is 'what is the actionable response being recommended and what level of certainty is needed'.

Ten years ago, AI safety people were saying 'maybe we should dedicate any non-zero amount of effort whatsoever to this field'. This required arguing things like 'the chance of AI killing us is at least comparable to the ~1 in 1 million chance of giant asteroids killing us'. Uncertainty was therefore an argument in favor of AI safety - if you're uncertain how things will go, it's probably >1 in a million, and definitely worth at least the amount of effort and funding we spend on looking for asteroids.

Literally today (https://www.lesswrong.com/posts/Aq5X9tapacnk2QGY4/pausing-ai-developments-isn-t-enough-we-need-to-shut-it-all), the most prominent and well-known AI safety advocate argued for a world-spanning police state that blows up anyone who gets too many computers in one place, even if this will start a nuclear war, because he thinks nothing short of that has any chance.

This...miiiiight be true? But advocating for drastic and high-cost policies like this requires a much much higher level of certainty! 'We don't know what will happen, so it's totally safe' is silly. But so is 'we don't know what will happen, so we had better destroy civilization to avert one specific failure mode'.

Expand full comment

To be fair, more evidence has arisen since then: GPT probably indicates that many facets of intelligence are a lot simpler than we'd hoped.

Expand full comment

> Suppose astronomers spotted a 100-mile long alien starship approaching Earth.

What is, in your view, a reasonable thing to do in this situation?

Expand full comment

Depends on how far out we are detecting the starship and how fast it is moving.

I'd consider starting interplanetary colonisation, if it looks like they are aiming directly at earth, instead of just generally in the direction of our solar system. (Assuming we can detect that.)

Also put lots of effort into researching what we can about the ship: we are getting some sensor data about it, otherwise we couldn't detect it.

Work on communicating with them.

Expand full comment

How do you propose communicating in a way that would not have a significant risk of being interpreted as hostile?

Expand full comment

Keeping communication relatively low power should do that.

Eg make sure that our electromagnetic signals don't hit them with more energy than they would get from the sun anyway.
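That constraint is easy to satisfy for anything omnidirectional; a rough sketch with an invented 1 MW transmitter and the ship at about 1 AU:

```python
import math

P = 1e6                       # 1 MW omnidirectional transmitter (invented figure)
d = 1.5e11                    # ship at roughly 1 AU, in metres
flux = P / (4 * math.pi * d**2)
print(flux)                   # ~3.5e-18 W/m^2, vs ~1361 W/m^2 of sunlight at 1 AU
```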

Expand full comment

Well we couldn't send enough to be a directed energy weapon if we wanted to.

I think "hostile" in this context is about insulting them, not directly attacking them.

Expand full comment

OK. You'd have to assume that they want to communicate and are aware that we have no clue. So don't worry too much about accidental insults.

Otherwise, if they don't want to communicate, there's not much we can do.

Fundamental physics (especially thermodynamics), game theory and evolution should give us a basis for communication with aliens that grew up in the same universe as us (or at least are familiar with this universe).

Expand full comment

1. Agree on some team of diplomats to talk to the aliens when they arrive

2. Agree to suspend wars, put aside differences, etc, until the alien situation is over

3. Get some people, seeds, etc in bunkers, just in case

4. Put all global military assets on high alert

5. After some discussion of how to do so safely, try to communicate with the aliens to alert them to the fact that we're here, we're sentient, and we want to talk.

I think 99% of the time these things don't make any difference, but to steal EY's term, I think they would be a more dignified way to approach the situation than to panic, backstab each other, and make no attempts to prepare at all.

Expand full comment

That actually does make sense... Reminds me of https://en.wikipedia.org/wiki/EarthWeb (an early description of online prediction and betting markets included).

Expand full comment

I'd also add 6) Get some self-sustaining habitats (as best we can) up on the Moon, Mars, or generation ships.

Yes, this may not work and may be a total waste, but it's worth the redirection of resources in a wide variety of scenarios.

Expand full comment

I think 3, 4, and 5 are as likely to trigger a genocide as to prevent one - not that I think they have more than an infinitesimal chance of preventing genocide. The alien situation is another one where we fundamentally don't know what course of action is going to lead to what result.

Expand full comment

Heck, 1 might very well lead to war. I mean, would you allow a member of *that* party to speak for humanity?

Expand full comment

"But I can counterargue: “There have been about a dozen times a sapient species has created a more intelligent successor species: australopithecus → homo habilis, homo habilis → homo erectus, etc - and in each case, the successor species has wiped out its predecessor. So the base rate for more intelligent successor species killing everyone is about 100%”."

This wouldn't be a great counterargument. Homo habilis didn't "wipe out" australopithecines in the same sense that we imagine a hyper-intelligent AI wiping us out. Nor did Homo erectus wipe out Homo habilis. It wasn't like one day a Homo habilis gave birth to a Homo erectus and the new master race declared war on its predecessors. The mutations that would eventually constitute erectus arose gradually in habilis populations, leading to population-wide genetic drift over time. By the time erectus had fully speciated, habilis was no more.

Expand full comment

"Genetic drift" refers to random change more common in smaller population, in contrast to selection which is more powerful (vs random drift) in larger populations.

Expand full comment

Homo Sapiens did pretty much wipe out the Neanderthals, though. Though I guess it's not a given that it was intelligence as opposed to other factors, since we're not that sure just how intelligent Neanderthals were. There was some limited interbreeding, but only a tiny echo of any Neanderthal influence remains - perhaps the AIs will keep 4chan's sense of humor or something.

Expand full comment

I thought we'd found small amounts of not only Neanderthal DNA but also other Homo species' DNA in various people around the world. Although it seems Sapiens was the dominant species, in terms of genetic persistence the other species appear to have been at least partly incorporated into modern humans in one way or another. Seems more of a "How the West Was Won" situation than a wholesale slaughter.

Expand full comment

No.

Neanderthals are us. (On average they were 99.7% (+/-) identical to today's humans - well within the likely natural variation between any two individuals of either group.) They're part of the in-law family that nobody talks about. 20,000 years from now someone digs up a Dutch grave and a Pygmy grave and declares there were two species - nope.

Where are all the Etruscans or Babylonians?

Expand full comment

It's true that Homo Sapiens likely wiped out Neanderthals (though we don't know for sure), but Homo Sapiens didn't evolve from Neanderthals, so it still doesn't work as a reference class for Scott's argument.

Interestingly, we don't know what trait it was that allowed us to out-compete them, but one of the traits that seems most likely to me is that we lived in much larger social groups. Given that Neanderthal brains were technically slightly larger than ours, there isn't any evidence that raw intelligence was the cause, though it's impossible to rule it out completely. I've seen analyses that suggest larger social groups are more important for innovation than sheer intelligence, so I find that idea believable.

Expand full comment

It's not likely that Homo Sapiens wiped out Neanderthals. They are us - just like an "Italian" can say "Chinese", "Etruscan" and "Navaho" are us. The evidence that they really were separate species is getting slimmer and slimmer. 200,000 years from now, humans are not going to say this "group" is us but some other currently existing "group" is not us.

The Neanderthals might have looked slightly different and would have gotten a different 23andMe report, but in the same way my family probably looks different from yours and our so-called ancestry reports would be different.

And you are definitely correct that "Homo [Sapien] Sapiens didn't evolve from [Homo Sapien] Neanderthals, so it still doesn't work as a reference class for Scott's argument."

Expand full comment

There are a lot of 1% risks. I don't even think we necessarily should get rid of nuclear weapons, and there was a much greater than 1% risk of total nuclear war during the Cold War (and probably still more than a 1% risk now).

On the other hand, is there a less than 1% chance that this will dead-end again like self-driving cars did after 2017, and we'll end up with another AI winter?

Expand full comment

Self-driving cars are still in development.

The major hurdle seems to be regulation, as you can't just unleash mediocre self-driving cars, even though bad human drivers are totally legal.

Expand full comment

There's a limit on how bad a human is allowed to be at driving before that human loses his or her driver's license (or is not issued one to begin with)...

Expand full comment

Not really...

Expand full comment

Don’t we have another obvious case of a technological singularity that has done, and has the potential to do, great harm? We can’t un-split the atom, but nuclear science, particularly in its warlike use, has done great harm and could do much worse. Anything we “obstructionists, overactive regulators, anti-tech zealots, socialists, and people who hate everything new on general principle” can do to prevent the proliferation and use of nuclear arms seems necessary, even a matter of survival. Why would AI be any different?

Expand full comment

What great harm has nuclear science done us so far that is so self-evident as to require no arguments or even a mention and to outweigh all the benefits?

But hey, burning fossil fuels is awesome. (Yes, this is simplistic. That’s the point)

We can have a similar argument about the potential. And maybe that one will even be won by the anti-nuclear side. But an argument still has to be made.

Expand full comment

I think the fact that the Manhattan project was done under such insane secrecy means that the people in charge knew how dangerous it was and they took great care to ensure it wasn't misused. The arguments over the benefits took place after the tight secrecy on nuclear tech was already in place. That sounds exactly like what Scott and other doomers are suggesting. Lock it down, progress slowly, and then we can discuss all the benefits we might reap.

Expand full comment

Uh, OK? I was responding to the uncritical claim that nuclear is obviously a cautionary tale that has already caused us enormous harm. That is false. I don't see how your comment actually replies to mine - maybe it's a wrong indent on the thread?

To your point - the secrecy was _not_ due to any concerns about misuse and impact on humanity; it was very much about enemy espionage. Scientists attempting to have a meaningful discussion of the potential impacts on humankind were pretty much shut down (e.g. Truman shouting Oppenheimer out of his office, Szilard's struggles to control what he regarded as his brainchild). I don't think you want to use this analogy - AI being sprung upon an unaware public as a fait accompli. Quite the opposite.

Expand full comment

> I was responding to the uncritical claim that nuclear is obviously a warning tale that already caused us enormous harm. That is false.

It's not false though, it actually proves the point. The experts and military strategists knew that nuclear *could* cause enormous harm if the enemy developed it first, or if it otherwise proliferated the way LLMs are currently proliferating, and that's *why* they locked it down. The US and allies still bomb countries it doesn't want developing nuclear capabilities because of those dangers. There's no such foresight being applied here.

Expand full comment

The part that's patently false is the claim that nuclear power, in the world we actually inhabit, has already caused enormous net harm - harm so obvious that one needn't provide any supporting evidence or argumentation.

The military strategists did not listen attentively to scientists concerned about proliferation and arms races. Whatever careful harms/benefits analysis you want for AI did not take place for nuclear in 1942-45. Also, characterizing the Manhattan Project as carefully slowing progress until the implications were thought through is... interesting. They frigging did the original experiment achieving a self-sustaining chain reaction right under Chicago! Fermi's wife reports in her memoirs how the physicists were watching the rate of the reaction carefully, relieved they didn't get close to, well, blowing the whole city up. Relieved.

Today the situation is somewhat different (though to my knowledge there's exactly one example of a country being bombed out of imminent nuclear capabilities - Iraq 1981). Still, the motivation is purely military - "let's prevent them from bombing us" as opposed to "let's prevent a proliferation dangerous to us as a human race" - not that nuclear is an extinction threat.

Expand full comment

> The part that's patently false is that nuclear power, in the world we actually inhabit, already caused enormous net harm so obvious that one shouldn't provide any supporting evidence/ argumentation.

Sure, but my point is that they didn't have that evidence going into the Manhattan project, did they? And yet they had the foresight to keep it very secret because they understood the potential dangers.

> Whatever careful harms/benefits analysis you want for AI did not take place for nuclear in 1942-45. Also, characterizing Manhattan Project as carefully slowing progress until the implications are thought through is... interesting.

That's mischaracterizing what I'm saying. Slowing proliferation is not necessarily slowing progress for the in-group. Secrecy clearly slows proliferation and allows breathing room for possible countermeasures and preparations.

In their minds, if the secrecy delayed their enemy's progress even a few months, that could have made the difference between winning and losing the war. Solid reasoning that we're not really seeing employed for AI today.

Expand full comment

"I think the fact that the Manhattan project was done under such insane secrecy means that the people in charge knew how dangerous it was and they took great care to ensure it wasn't misused. "

Lots of non-dangerous but useful projects were done in great secrecy because they would be especially useful if the enemy didn't have them or know that they existed - for example, radar improvements and reading radio waves from thousands of miles away.

Expand full comment

Nuclear weapons are often referred to as a potential existential threat, but that's incorrect:

https://www.navalgazing.net/Nuclear-Weapon-Destructiveness

Expand full comment

You kind of disprove your own point. The phrase "could do" is doing a lot of work here. I won't claim nuclear weapons pose no danger, not by any means. But we've successfully come 78 years without a nuclear detonation in anger, and in the last century more people have died by the machete than by the atomic weapon. If anything, the proliferation of nuclear arms has prevented their use; it's no coincidence that the only uses of nuclear weapons in combat were carried out by a nation that, at the time, had the sole military which possessed them. If nuclear weapons are the relevant precedent, then the more AIs we can produce, and the more diverse the hands into which we get them, the better.

Expand full comment

Nukes haven't been used since WW2, but the world powers have also tried fairly hard to reduce nuclear proliferation as much as they could. It's hard to prove what would happen in an alternative case, but I would say that if nukes were far more prolific and easy to get, we'd have had at least a few cases of terrorists using them (which perhaps could have cascaded into much worse). I would credit our 78-year success to the prevention of nuclear proliferation - the opposite of your claim.

Expand full comment

That isn't really consistent with history. The world's nuclear arsenals grew enormously between 1945 and 1985. They only stopped growing in the late '80s, pretty much because no one could think of a use case for more and they're expensive to build and maintain. They only started *shrinking* with the end of the Cold War and the demise of the USSR, and the only real reasons were that the USSR was no longer seen as a big threat and that it's expensive to maintain them, and their enormous delivery apparatus, under conditions of careful control. It's just way easier to control 1,500 launchers in a few places than 20,000 delivery systems of all types and manner, from B-1s to Pershings.

The only nontrivial success of nonproliferation attempts was keeping the technology out of the hands of most of the rest of the world, past the original nuclear club. And even there, it is arguably the tremendous expense and notoriety that did the real work; I'm doubtful the NPT would have done squat in cases where those two factors weren't sufficient -- and, indeed, it did no good at all in the case of India, Pakistan, Israel, and lately Iran.

Expand full comment

I think there's a more selfish calculation that many people, including myself, run:

1. The risk is unknown, but seems pretty unlikely.

2. I'm not a powerful person, so even if I try to fight the risk, I will probably have no effect. If I ignore the risk, I can save my energy and not stress out.

3. Therefore, I'll carry on like everything is fine.

It doesn't make sense for most people to worry about vague risks that they have no real chance of affecting, unless they are very altruistic, which is quite rare.

Expand full comment

I think this is fine. What confuses me is when people go out of their way to deny the risk, or actively exert effort into making it harder to fight the risk.

Expand full comment

Fair enough. My charitable take is that people instinctively extrapolate that if things have been fine so far, things will continue to be fine. Like how many young people consider good health a given until they personally suffer some health problem.

My less charitable take is that people are mostly interested in scoring social points, and for now you get more points for being a skeptic. Though this can reverse once the other side gets enough support (or political lines are drawn), like with climate change.

Expand full comment

It's going to be much harder for them to carry on like everything is fine if the US starts nuclear bombing China, so their reaction (given the premise) is perfectly natural. "X is unknown and might be ok or even great, Y is known and really, really bad; I should do everything in my power to stop Y" is internally consistent.

Expand full comment

All debates are bravery debates. If you see everyone around you freaking out about nothing and driving themselves crazy, why *wouldn't* you try to give them helpful advice? If people you knew were constantly talking about shark attacks, getting depressed about it, refusing to dip their toe in a swimming pool etc., surely you would try to convince them that shark attacks aren't actually worth worrying about?

Expand full comment

Zvi has a recent long-form take arguing somewhat to this effect: think for yourself (rather than joining a social panic) and keep living your normal life.

https://thezvi.substack.com/p/ai-practical-advice-for-the-worried

Expand full comment

Scott is the one relying on a fallacy. In almost all observable cases, Tyler will be proven right and he won't.

Expand full comment

How is that a fallacy?

Isn't that like arguing against Russian roulette?

Expand full comment

Agreed, that would also be ridiculous. (Sorry if the sarcasm wasn't obvious)

Expand full comment

That's just an assertion, not an argument. I could just as easily assert the converse.

Expand full comment