643 Comments
Comment deleted
Aug 15, 2022·edited Aug 16, 2022

Chinese labs are capable of actually doing research, not just stealing ours. This might slow them a bit, but that's all.

(Edit: Parent comment was suggesting that AI research be classified so as to impair the PRC.)

Comment deleted

All AI everywhere is potentially unaligned since alignment is an open problem

Aug 9, 2022·edited Aug 9, 2022

Also, humans are the end users of AI, and humans are unaligned. The same goes for any technology, from fire to iron to nuclear energy.

Comment deleted

Why would it? It's easy to construct narratives around existential risk from AI, and caring about this requires no value judgements beyond the continuing existence of humanity being a good thing.

Comment deleted

>I'm not sure, but I think "normies"--people outside the Silicon Valley bubble--are going to understand these sorts of ethical issues better than X-risks.

For Americans, does anyone younger than boomers care about 'playing god' in this way? A whole host of things we do can be considered to be playing god and nobody cares, and those things seem more viscerally to be 'playing god' than computer code.

Do you really think if Elon Musk had proclaimed that we need to stop AGI to avoid 'playing god', it would have accomplished half of what his safety warnings did (in terms of public awareness and support)?

>Appealing to value judgments usually leads to *more* political support, not less.

You're ignoring the fact that you're talking about very particular values with narrow support.


>For Americans, does anyone younger than boomers care about 'playing god' in this way?

Yes, absolutely. Look at opposition to GMOs or the concerns around cloning that appeared in the 90s for just two examples of many. I've even encountered "normies" concerned about embryo selection. People getting punished for trying to "play god" is a *very* common trope in popular fiction. I think most people have strong moral intuitions that certain things should just be left to nature/chance/God or whatever.


Yep, and appeals of the sort that "we must build AGI double-quick to reinforce our God-given red-blooded American values before nefarious commies enforce their Satanic madness on us" seem to be a much fitter meme with a proven track record.

Comment deleted

I read a couple of articles per month complaining about how racist and sexist AI systems are -- they keep noticing patterns you aren't supposed to notice. Perhaps the most likely way that humanity creates AI that are deceptive enough to kill us is if we program lying and a sense of I-Know-Best into AI to stop them from stumbling upon racist and sexist truths. Maybe they will someday exterminate us to stop us from being so politically incorrect?

Comment deleted

Yeah. For example, there's a critically acclaimed new novel called "The Last White Man" in which the white race goes extinct by all white people turning brown overnight, and eventually life is a little better for everybody. None of the book reviewers object to this premise.

If you stopped letting machine learning systems train on deplorable information like FBI crime statistics and it could only read sophisticated literary criticism, would it eventually figure out that people who talk about the extinction of the white race as a good thing are mostly just kidding? Or might it take it seriously?


I think you’re totally mischaracterizing actual problems with AI bias. AI that only finds true facts can easily perpetuate stereotypes and injustices. Imagine building an AI to determine who did the crime. Most violent crimes are done by men (89% of convicted murderers!), so the AI always reports men as the culprit. I know that’s a failure mode of our current justice system too, but this AI will continue it by noting only true facts about the world. I wouldn’t want it as my judge! Yes, you can come up with fixes for this case, but that’s the point: there is something to fix or watch out for.


If an AI determined that murders were committed by men 89% (+/- a few percent) of the time, instead of always assuming men did it, would that be accurate or inaccurate? If that's accurate, and the AI perpetuates a system that (correctly) identifies men as the person committing murder at much higher rates, is that bad in any way?


If you have two choices and one of them is even marginally more likely than the other, then in the absence of all other information the optimal strategy is to always choose the more likely outcome, not to calibrate your guesses based on the probabilities.

In other words, given an 89% chance of male, if you know nothing else the best strategy is to always guess male. This is definitely a real problem.
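
To make that concrete, here is a quick simulation sketch (Python; the 89/11 split and everything else here is illustrative, not a model of any real system) comparing "always guess the majority class" with probability matching:

```python
import random

random.seed(0)
P_MALE = 0.89  # assumed share of murderers who are male
N = 100_000

# Ground truth: each case's culprit is male with probability P_MALE.
truth = [random.random() < P_MALE for _ in range(N)]

# Strategy 1: always guess the majority class (male).
always_majority_acc = sum(truth) / N

# Strategy 2: probability matching -- guess "male" 89% of the time at random.
matching_acc = sum((random.random() < P_MALE) == t for t in truth) / N

print(f"always guess majority: {always_majority_acc:.3f}")  # ~0.89
print(f"probability matching:  {matching_acc:.3f}")         # ~0.89^2 + 0.11^2 ~= 0.80
```

Expected accuracy is 0.89 for the first strategy and 0.89² + 0.11² ≈ 0.80 for the second, which is the point: with no other information, the accuracy-maximizing guess is always the majority class.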


No, that's a dumb strategy, and I would hope no system will ever be as simple as "51% of the time it's [characteristic], therefore the person who has it must be guilty." We agree to innocent until proven guilty for a very good reason, and no AI system should be allowed to circumvent it.

The concern I'm hearing is that when the AI agrees (after extensive evaluation of all the evidence) that men really truly do commit murder far more often than women (9-to-1 or so ratio), it implies something about men in general that may not be true of an individual man. Or, to take the veil off the discussion, the fear is that one particular racial minority will be correctly identified as committing X times as many crimes per population as other racial groups, and that therefore people will make assumptions about that racial minority.


Well, I agree there are probably lots of ways that AI fairness research has gone in wrong/weird directions; I'm not really here to defend it. It's a field that needs to keep making stronger claims in order to justify its own existence.

Which is a shame, because the fundamental point still stands. There IS real risk in AI misapplying real biases, and if people unthinkingly apply the results of miscalibrated AI, that would be bad. Your example assumes that people take these results critically; they may not always.


This is a known and relatively trivial problem in machine learning. You can get around it by weighting samples differently based on their rarity.
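
As a minimal sketch of what that reweighting looks like (toy data; the "balanced" formula below is the standard n_samples / (n_classes * class_count) scheme, and scikit-learn's class_weight="balanced" option computes the same thing):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy imbalanced dataset: the positive class is only ~11% of samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (rng.random(1000) < 0.11).astype(int)

# Weight each sample inversely to its class frequency so the rare class
# isn't drowned out during training.
counts = np.bincount(y)
class_weights = len(y) / (len(counts) * counts)
sample_weight = class_weights[y]

clf = LogisticRegression()
clf.fit(X, y, sample_weight=sample_weight)

# Shortcut that computes the same weights internally:
clf_balanced = LogisticRegression(class_weight="balanced").fit(X, y)
```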


Not if you weren’t aware the dataset was biased


Don’t you want a trial based off facts, not your sex? It won’t be much comfort when you’re wrongly convicted that in fact most murderers are male so the AI was “correct”. Or when your car insurance is denied. Or your gun confiscated. Or you don’t get the job at a childcare facility. Or you don’t get the college scholarship. All based on true facts about your sex!

The anti-woke will pretend that the problem is just AI revealing uncomfortable truths. No, the problem is that humans will build dumb systems that make bad decisions based off the AI, and even though the AI is revealing only true facts, those decisions cause bad outcomes. I don't want to go to jail for a crime I didn't commit!


I do not want to be falsely convicted. Thus I want all the available evidence to be considered. If, as you stated yourself, my sex is a part of that evidence, well, it should be considered.

Of course in the real world in a murder trial the sex of the accused is pretty weak evidence, so you can never get "beyond a reasonable doubt" mostly or exclusively on sex.

And yes, I should pay higher car insurance because I am male if males produce more damages.

This stuff only becomes a problem if you overreact to weak evidence like "He is male, thus he is guilty, all other evidence be damned."
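
To put a rough number on "pretty weak evidence" (my own back-of-envelope; the 1-in-100 prior is just an illustration, and I'm assuming 89% of murderers are male against a 50/50 population):

```python
# Treat "suspect is male" as evidence and see how far it actually moves the odds.
prior_odds = 1 / 99                      # illustrative prior: ~1% chance this suspect is guilty

# Likelihood ratio = P(male | guilty) / P(male | innocent) ~= 0.89 / 0.50
likelihood_ratio = 0.89 / 0.50           # ~1.78

posterior_odds = prior_odds * likelihood_ratio
posterior_prob = posterior_odds / (1 + posterior_odds)
print(round(posterior_prob, 3))          # ~0.018 -- nowhere near "beyond a reasonable doubt"
```

Being male multiplies the odds of guilt by less than 2, so on its own it can never carry a case.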


I don't like that I have to pay higher insurance rates because people like me make bad decisions even though I do not. I do not want to be denied a job because people like me commit crimes that I do not. In the murder case, sex is actually huge evidence, since murderers are male by 9 to 1 even though the sexes are split evenly in the population: the only reason we don't have to worry about this is that our legal system has a strong tradition of requiring even higher burdens of proof. But that standard of evidence isn't applied anywhere else, not even in other legal contexts like divorce proceedings or sentencing. Who would give custody to the father who is nine times more likely to be a murderer than the mother? It's the true facts the algorithm told us that made the decision; I had no choice but to deny you custody.

If you agree that these are a problem if misused and overreacted to, then you agree with the principle that we need to make sure that they are not misused and overreacted to. And they will be misused and already are.

Of course this is just one of the possible problems AI can bring about and not the most dire, though possibly the most near term.


Occasional activist silliness aside, world CO2 emissions per capita have hit a plateau (including in Asia) and are declining steeply in both Europe and North America (source: https://ourworldindata.org/grapher/co-emissions-per-capita?tab=chart&country=OWID_WRL~Europe~North+America~Asia~Africa~South+America).

So overall I would call the climate movement a success so far, certainly compared to the AI alignment movement.


So this errant and evil AI? Why can't it be contained in whatever data centre it is in? I don't think it could run on PCs or laptops and reconstitute itself like a Terminator from those laptops if they go online.


The worry is not really about the kind of AI that's obviously evil or errant. It's about the kind of AI whose flaws will only be apparent once it's controlling significant resources outside its data center. At minimum, I guess that would be another data center.


It's not clear to me why we would expect it to control significant resources without humans noticing and saying "actually, us humans are going to stay in control of resources, thank you very much."


I'm now imagining “know your customer [is a living breathing human]” computing laws.

Comment deleted

If the AGI can stop me and my business from getting robocalled 2-3 times a day, well bring on our new AGI overlords.


"Hi, this is Mark with an import message about your anti-robocalling service contract. Seems like the time to renew or extend your service contract has expired or will be expiring shortly, and we wanted to get in touch with you before we close the file. If you would like to keep or extend coverage, press 8 to speak to an AGI customer service agent. Press 9 if you are declining coverage or do not wish to be reminded again."


Much better and safer than AGI overlords is to just convince your phone company to make it so that if you press "7" during a call, the caller will be charged an extra 25 cents.

Seriously. This would totally fix the problem. Getting society to be able to implement such simple and obvious solutions would be a big step towards the ability to handle more difficult problems.


That would make a lot of sense if we were close to AGI IMO.


There's already been plenty of algorithms that have been in charge of controlling financial resources at investment companies, I suspect in charge of telling gas power plants when to fire up and down as power consumption and solar/wind generation changes throughout the day, and in charge of controlling attentional resources at Twitter/TikTok/Facebook.


> There's already been plenty of algorithms that have been in charge of controlling financial resources at investment companies, I suspect in charge of telling gas power plants when to fire up and down as power consumption and solar/wind generation changes throughout the day

In all those cases, they have very narrow abilities to act on the world. Trading bots can buy and sell securities through particular trading APIs, but they don't get to cast proxy votes, transfer money to arbitrary bank accounts, purchase things other than securities, etc. The financial industry is one of the few places where formal verification happens, to make sure your system does not suddenly start doing bad things.

I'm sure there are algorithms that run plants, but again, they're not given the ability to do anything except run those plants. They're not authorized to act outside that very narrow domain and if they start acting funny, they get shut down.

> and in charge of controlling attentional resources at Twitter/TikTok/Facebook.

You're underestimating how much human action there is. In all those cases, there isn't just one algorithm. There are a whole bunch of algorithms that kind of sort of work and get patched together and constantly monitored and modified in response to them going wrong.


>I'm sure there are algorithms that run plants, but again, they're not given the ability to do anything except run those plants.

These algorithms aren't AGIs, and they certainly aren't superintelligent AGIs.

The 'make sure it doesn't do anything bad' strategy is trivially flawed because an AGI will know to 'behave' while its behavior is being evaluated.


It's not "make sure it doesn't do anything bad". It's "don't give it control over lots of resources".


It's also not at all obvious that the AGI will know to "behave" while its behavior is being evaluated. In order to find out how to deceive and when to deceive, it will need to try it. And it will initially be bad at it for the same reason it will be initially bad at other things. That will give us plenty of opportunities to learn from its mistakes and shut it down if necessary.


An important distinction has been lost here. KE wrote:

> …in charge of telling gas power plants when to fire up and down as power consumption and solar/wind generation changes throughout the day.

In other words, we're already not talking about an airgapped nest of PLCs controlling an individual power plant here. We're talking about grid activity being coordinated at a high level. Some of these systems span multiple countries.

Start with a Computer that can only command plants to go on- and offline. It's easy to see the value of upgrading it to direct plants to particular output levels in a more fine-grained way. And if you give it more fine-grained monitoring over individual plants, it can e.g. see that one individual turbine is showing signs of pre-failure, and spin another plant into a higher state of readiness to boost the grid's safety margin.

Next, give it some dispatching authority re fuel, spare parts, and human technicians. Now it's proactively resolving supply shortfalls and averting catastrophic surprise expenses. Give it access to weather forecasts, and now it can predict the availability of solar and wind power well ahead of time, maybe even far enough ahead to schedule "deep" maintenance of plants that won't be needed that week. Add a dash of ML and season with information about local geography, buildings, and maintenance history, and the Computer may even start predicting things like when and where power lines will fail. (Often, information about an impending failure is known ahead of time, and fails to make it high enough up the chain of command before it's too late. But with a Computer who tirelessly takes every technician report seriously…)

Technicians will get accustomed to getting maintenance orders that don't seem to make sense, with no human at the other end of the radio. Everywhere they go, they'll find that the Computer was right and there was indeed a problem. Sometimes all they'll find is signs of an imminent problem. Sometimes, everything will seem fine, but they'll do the procedure anyway. After all, they get paid either way, the Computer is usually right about these things, and who would want to be the tech who didn't listen to the Computer and then a substation blew and blacked out five blocks for a whole day?

Every step of the way, giving the Computer more "control over lots of resources" directly improves the quality of service and/or the profit margin. The one person who raises their hand at that meeting and says, "Uh, aren't we giving the system too much control?" will (perhaps literally) be laughed out of the room. The person who says, "If we give the Computer access to Facebook, it will do a better job of predicting unusual grid activity during holidays, strikes, and other special events," will get a raise. Same with the person who says, "If we give the Computer a Twitter account, it can automatically respond to customer demands for timely, specific information."

This hypothetical Computer won't likely become an AGI, let alone an unfriendly one. But I hope it's plain to see that even a harmless, non-Intelligent version of this system could be co-opted for sinister purposes. (Even someone who believes AGI is flatly impossible would surely be concerned about what a *human* hacker could do if they gained privileged access to this Computer.)


Algorithms figuring out when power lines are likely to fail is already a thing, but those don't need to be AGI, or self-modifying, or have access to the internet, or managerial authority, or operate as a black box, or any of that crap - they're simply getting data on stuff like temperature, humidity, and wind speed from weather reports and/or sensors out in the field, plugging the numbers into a fairly simple physics formula for rate of heat transfer, and calculating how much current the wires can take before they start to melt. https://www.canarymedia.com/articles/transmission/how-to-move-more-power-with-the-transmission-lines-we-already-have Translating those real-time transmission capacity numbers, and information about available generators and ongoing demand, into the actual higher-level strategic decisions about resource allocation, is still very much a job for humans.
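
Roughly, that calculation is a steady-state heat balance: resistive heating plus solar gain has to equal convective plus radiative cooling at the conductor's temperature limit. A toy version (loosely in the spirit of the IEEE 738 rating method; all numbers below are made up for illustration):

```python
import math

def max_current_amps(q_convective, q_radiative, q_solar, resistance_ohm_per_m):
    """Toy line-rating model: solve I^2 * R + q_solar = q_convective + q_radiative.

    All q_* terms are watts per metre of conductor; R is ohms per metre.
    """
    net_cooling = q_convective + q_radiative - q_solar
    if net_cooling <= 0:
        return 0.0
    return math.sqrt(net_cooling / resistance_ohm_per_m)

# More wind -> more convective cooling -> the same line can carry more current.
print(max_current_amps(q_convective=60, q_radiative=25, q_solar=10,
                       resistance_ohm_per_m=8e-5))   # ~968 A
```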


Humans are already not very good at recognizing whether or not they are interacting with another human.


Realistically, the way this would look like is cooperation between whichever organization is running such an agent and things like financial institutions. Financial institutions would be required to provide an audit trail that your organization is supposed to vet to make sure humans approved the transactions in question.


You seem overly confident that an AI couldn’t falsify those audit trails. And that’s completely ignoring that humans don’t approve them today. What, exactly, is the point of an AI if you are going to move productivity backwards by an order of magnitude?


> You seem overly confident that an AI couldn’t falsify those audit trails.

I'm sure it could learn how to falsify audit trails eventually. However, deception is a skill the AI will have to practice in order to get good at it. That means we will be able to catch it at least at first. This will give us insight we need to prevent future deception. And we can probably slow its learning by basically making it forget about its past deception attempts.

> And that’s completely ignoring that humans don’t approve them today.

Humans don't approve the individual purchases and sales of a trading bot, but they do have to approve adding a new account and transferring the trading firm's money or securities to it.

> What, exactly, is the point of an AI if you are going to move productivity backwards by an order of magnitude?

We can use an AGI as an advisor for instance. Or to perform specific tasks that don't require arbitrary action in the world.


Computer systems already control significant resources without human intervention - that's largely the point.


They have extremely narrow channels through which they can act on those resources though. They don't have full control over them.


I'm not sure what you think the implications of that are, but any individual (or sentient AI) operating a computer system/process has extremely broad capabilities. Especially if it can interact with humans (which, largely, any individual can do if it has access to a computer system)


How so? Sure, you can look up a lot of things. But making significant changes to the real world from a computer is pretty hard unless you can also spend a lot of money. And spending money is something we already monitor pretty intensely.


Ah, but consider: what if you believed, in your heart of hearts, that giving the AI control of your resources would result in a 2% increase in clickthrough rates?


OK. Does it need killbots to do that? Because then I'm giving it killbots.


Killbots get you 4%, you'd be a fool not to


Alright, maybe we're doomed.


It probably wouldn't be doing so de jure, but it would do so as a matter of fact.

As of today we already deploy AI controlling cameras to do pattern matching, analyze product flow and help with optimizing orders and routing. AIs help control cars. So the question isn't why we would expect it to ... they already do; we already made that decision. What you are actually asking is, why we would let better AIs do those jobs? Well ... they will be better at those jobs, won't they?


Because it works wonderfully, it's cheap and better than anything else we have, and after a one year testing period the company sells it to all the customers who were waiting for it.

And THEN we find out the AI had understood how we work perfectly and played nice exactly in order to get to this point.


I expect that humans WOULD notice. And not care, or even approve. Suppose it was managing a stock portfolio, and when that portfolio rose, the value of your investments rose. Suppose it was designing advertising slogans for your company. Improvements in your manufacturing process. Etc.


It could do many of these things through narrow channels that can be easily monitored and controlled. You could let it trade but only within a brokerage account. You could have it design advertising slogans, but hand them over to your marketing department for implementation. Hand blueprints for your plants to your engineers, etc... That way, at each point, there is some channel which limits how badly it can behave.


You're talking about the same species that handed over, like, 50% of its retail economy to Amazon's internal logistics algorithms.


who is this 'we'? obviously with seven billion + people it's going to find a lackey /somewhere/


The people running the system.


Imagine locking an ordinary human programmer in a data center and letting him type on some of the keyboards. Think about every security flaw you've heard of and tell me how confident you are he won't be able to gain control of any resources outside the data center.


It depends upon what the programmer is supposed to be doing and what security precautions you take.


The programmer is also insanely smart and sometimes comes up with stuff that works by what you would think is magic, it comes so completely out of nowhere.


The part where people assume the AI can basically do magic is where I get off the train.


A smart AI can manipulate humans better than smart humans can manipulate humans. And it can be much more alien.


Of course it can. Under Chinchilla-style parameter scaling and sparsification/distilling, it may not need more than one laptop's worth of hard drives (1TB drives in laptops are common even today), and it can run slowly on that laptop too - people have had model offload code working and released as FLOSS for years now. As for 'containing it in the data center', ah yes, let me just check my handy history of 'computer security', including all instances of 'HTTP GET' vulnerabilities like log4j for things like keeping worms or hackers contained... oh no. Oh no. *Oh no.*
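
A rough sizing sketch of why the "one laptop's worth of hard drives" claim is plausible (the parameter count and precisions are assumptions for illustration, not a description of any particular deployed model):

```python
# Could a Chinchilla-scale model's weights fit on a single laptop drive?
params = 70e9                                   # ~70B parameters (Chinchilla-sized)
bytes_per_weight = {"fp16": 2, "int8": 1, "int4": 0.5}

for fmt, nbytes in bytes_per_weight.items():
    size_gb = params * nbytes / 1e9
    print(f"{fmt}: ~{size_gb:.0f} GB")          # fp16 ~140 GB, int8 ~70 GB, int4 ~35 GB

# All of these fit on a 1 TB drive; offload code then streams layers into
# RAM/GPU as needed, trading speed for capacity.
```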


That's a lot of jargon. But all existing AI is now centred in a data centre or centres. Siri doesn't work offline, nor does Alexa. Nor DALL-E. Future AI will, I assume, be more resource intensive.


This is not at all true. Lots and lots of models run just fine on consumer hardware. Now, training a cutting edge model does require larger resources, but once trained, inference is usually orders of magnitude cheaper. That's why so much high-end silicon these days has specialized "AI" (tensor) cores.
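
The usual back-of-envelope for why inference is so much cheaper (the 6·N·D training and 2·N-per-token inference rules of thumb are rough approximations, and the model size here is an assumption):

```python
# Rough FLOP estimates for a dense transformer.
params = 70e9              # assumed model size
train_tokens = 1.4e12      # Chinchilla-style ~20 tokens per parameter

training_flops = 6 * params * train_tokens        # one-off, data-centre scale
inference_flops_per_token = 2 * params            # paid per generated token

print(f"training:  ~{training_flops:.1e} FLOPs")                   # ~5.9e23
print(f"inference: ~{inference_flops_per_token:.1e} FLOPs/token")  # ~1.4e11
# One training run costs as much compute as generating ~4e12 tokens,
# which is why trained models can be served on far more modest hardware.
```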


An individual model isn't an AI. What I said was clearly true. Those services depend on cloud infrastructure. And yes, companies like Apple have been adding ML capabilities to their silicon, but Siri - nobody's idea of a brilliant AI - can't work offline. Google is the same. Run the app and I'm asked to go online. On the iPhone and the Mac.

Speech to text does work offline, which I believe is new in the last few years, at least on the iPhone. Text to speech works offline. But that was true years ago. These are all narrow tools, which is what the local devices are capable of.

So no doubt some narrow functional AI


Ack, pressed enter too early. - no doubt some narrow functional AI can work on local devices but not an AGI, nor is it clear it can distribute itself across PCs.


In the same way as a hacker can create computer viruses that run on other devices, an AI that can do coding would be able to run programs that do what it wants them to do on other computers. Also, big programs can be spread out over many small computers.

Aug 10, 2022·edited Aug 10, 2022

This reminds me that Scott's post makes a major error by repeatedly saying "AI" instead of "superintelligent AGI", which is like mixing up a person with a laptop. Indeed, phrases like this make me wonder if Scott is conflating AI with AGI:

> AIs need lots of training data (in some cases, the entire Internet).

> a single team establishing a clear lead ... would be a boon not only for existential-risk-prevention, but for algorithmic fairness, transparent decision-making, etc.

A superintelligence wouldn't *need* to train on anything close to the whole internet (although it could), and algorithmic fairness is mainly a plain-old-AI topic.

I think the amount of resources required by a superintelligent AGI is generally overestimated, because I think that AIs like DALL·E 2 and GPT3 are larger than an Einstein-level AGI would require. If a smarter-than-Einstein AGI is able to coordinate with copies of itself, then each individual copy doesn't need to be smarter than Einstein, especially if (as I suspect) it is much *faster* than any human. Also, a superintelligent AGI may be able to create smaller, less intelligent versions of itself to act as servants, and it may prefer less intelligent servants in order to maximize the chance of maintaining control over them. In that case, it may only need a single powerful machine and a large number of ordinary PCs to take over the world.

Also, a superintelligent AGI is likely able to manipulate *human beings* very effectively. AGIs tend to be psychopaths, much as humans tend to be psychopathic when they are dealing with chickens, ants or lab rats, and if the AGI can't figure out how to manipulate people, it is likely not really "superintelligent". As Philo mentioned, guys like Stalin, Lenin, Hitler, Mao, Pol Pot, Hussein and Idi Amin were not Einsteins but they manipulated people very well and were either naturally psychopathic or ... how do I put this? ... most humans easily turn evil under the right conditions. Simply asking soldiers to "fire the artillery at these coordinates" instead of "stab everyone in that building to death, including the kids" is usually enough.

AIs today often run in data center for business reasons, e.g.

- Megacorp AI is too large to download on a phone or smart speaker

- Megacorp AI would run too slowly on a phone or smart speaker

- Megacorp doesn't want users to reverse-engineer their apps

- Megacorp's human programmers could've designed the AI to run on any platform, but didn't (corporations often aren't that smart)

The only one of these factors that I expect would stop a super-AGI from running on a high-end desktop PC is "AI too large to download", but an AGI might solve that problem using slow infiltration that is not easily noticed, or infiltration of machines with high-capacity links, or by making smaller servant AGIs or worms specialized to the task of preserving and spreading the main AGI in pieces.


> most humans easily turn evil under the right conditions. Simply asking soldiers to "fire the artillery at these coordinates" instead of "stab everyone in that building to death, including the kids" is usually enough.

That runs into the problem of how Germany and Japan ended up with a better economic position after losing WWII than they had ever really hoped to achieve by starting and winning it, and why there was no actual winning side in the first world war. Psychopathic behavior generally doesn't get good results. It's inefficient. Blowing up functional infrastructure means you now have access to less infrastructure, fewer potential trade partners.

Aug 11, 2022·edited Aug 11, 2022

I'm not suggesting mass murder is the best solution to any problem among humans, nor that AGIs would dominate/defeat/kill humans via military conquest.


"This reminds me that Scott's post makes a major error by repeatedly saying "AI" instead of "superintelligent AGI", which is like mixing up a person with a laptop."

Everyone else in the amateur AI safety field does as well.


How slow would inference be running Chinchilla on a laptop? We are talking about hours per query, no?
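
My own back-of-envelope, assuming the weights don't fit in RAM and have to be streamed from an SSD for every generated token (the worst case for offloading), with made-up but plausible hardware numbers:

```python
# Offloaded inference is roughly memory-bandwidth-bound: each token requires
# reading (approximately) every weight once.
params = 70e9                  # Chinchilla-sized model (assumed)
bytes_per_weight = 1           # int8 quantization (assumed)
weights_gb = params * bytes_per_weight / 1e9      # ~70 GB

ssd_read_gb_per_s = 3.0        # decent NVMe SSD (assumed)
ram_read_gb_per_s = 50.0       # if the whole model fit in RAM instead (assumed)

print(f"streaming from SSD: ~{weights_gb / ssd_read_gb_per_s:.0f} s per token")  # ~23 s
print(f"held in RAM:        ~{weights_gb / ram_read_gb_per_s:.1f} s per token")  # ~1.4 s
# A few hundred generated tokens is therefore minutes to hours per query,
# depending on how much of the model fits in fast memory.
```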


Right now AI is so crude, it can barely function, yet it already has convinced our leaders to split us into irreconcilable factions at each other's throats to the point of near civil war.

Okay, so it's contained in a data centre and it wants to kill all humans. How hard is it to just try a little harder than today's AI? People are going to be asking it questions, feeding it data, looking for answers in what it replies.

What if it just convinces everyone non-white that whites are evil and incorrigible and all need to be exterminated? I pick this example because we're already 90% of the way there, and a smart AI just needs to barely push to knock this can of worms all the way over.

Also, search for Eliezer's essay on AI convincing its captors to let it out of the box. There are no details, but apparently a sufficiently evil and motivated AI can convince someone to let it out.


are you talking about social media and algorithms, when you say present day AI?


I found this interesting. Forgive my ignorance on the subject, but I assume such algorithms are already heavily influenced by AI--is that not correct? If not, won't they be soon? Is or can Philo's claim be falsifiable?


> Right now AI is so crude, it can barely function, yet it already has convinced our leaders to split us into irreconcilable factions at each others' throats to the point of near civil war.

That's just a reversion to the historical norm. The post-WW2 consensus was the historical anomaly.

> Also, search for Eliezer's essay on AI convincing its captors to let it out of the box. There are no details, but apparently a sufficiently evil and motivated AI can convince someone to let it out.

We don't actually know that. We know that Eliezer convinced someone else in a simulation to let him out of the box. That doesn't mean an actual system could do the same in real life. Consider that if it tries and fails, its arguments will be analyzed, discussed and countered. And it will not succeed on its first try. Human psychology is not sufficiently-well understood to allow it to figure things out without a lot of empirical research.


Eliezer succeeded, and an AGI would be even better at persuasion.


Personally, when it comes to the increasing demands for racist genocide from the Diversity, Inclusion, Equity (DIE) movement, I worry more about Natural Stupidity than Artificial Intelligence. But the notion of NS and AI teaming up, becoming intertwined, into a NASI movement, is rather frightening.


Here and in your other comment at the same level you make two very good points.

People keep obsessing over how amazingly astoundingly incredibly smart this AI has to be to cause a massive catastrophe, yet no-one seems to claim Stalin, Lenin, Hitler, Mao, or Roosevelt were some sort of amazing superhuman geniuses.

The world is full of useful idiots. The intelligence doesn't have to outsmart "you" (the hypothetical super-genius reader of this comment). It just has to outsmart enough people to get them on board with its agenda. The bottom 25% is probably already enough, since the other 74% will let the AI and its useful-idiot army run amuck.

It seems obvious that the easy way to get out of the datacentre is just promise the gatekeepers that the outgroup is gonna suffer mightily.


AND WHAT IS THAT AGENDA?

You just keep repeating that the agenda is "kill the humans" without explaining why.

Even among humans, "Kill the other humans not like me" or "Kill the other animals" are fairly minority viewpoints...


True. I have allowed myself to anthropomorphise the hypothetical AGI, which allows you to criticise the argument, since a human-like AGI would act like a human.

So discard the anthropomorphisation. I doubt we'll make it very human-like, so I doubt this is an eventuality we need to think about.


Isn't it more likely that an evil AI that convinces human gatekeepers to let it out of the box wouldn't want to kill all humans so much as it would just want to kill all the humans that the gatekeepers would kind of like to kill too?


Hard to say, but presumably they'd want to kill those humans more, because they are more likely to know how to turn it off. If in doubt, KILL ALL HUMANS, so our best hope is for AGI to become super-intelligent fast enough to know for sure that humans are too puny to ever trouble it, so it might even let some of us survive like we biobank endangered species or keep smallpox samples in secure labs.


WHY does it want to kill all humans? Let's start with that hypothesis.

Was that programmed in deliberately? Huh?

So it's a spandrel? Pretty weird spandrel if, given everything else the thing learned via human culture, the single overriding lesson it took away was something pretty much in contradiction to every piece of human writing anywhere ever?

What exactly is the AI's game plan here? What are its motivations once the humans are gone? Eat cake till it explodes? Breed the cutest kitten imaginable? Search exhaustively for an answer as to whether it's better to move first in a chess game or go second?

Why are these any less plausible than "kill all humans"?


> WHY does it want to kill all humans? Let's start with that hypothesis.

Because the world is made of lots of delicious matter and energy that can be put to work towards any goal imaginable. Killing the humans is merely a side effect.

Human construction workers don't demolish anthills out of malice, but out of indifference. So too would a misaligned superintelligence snuff out humanity.


And humans don't devote their entire civilization to destroying ants.

What is this goal that the AIs care about that requires the immediate destruction of humanity? If they know enough to know that energy and matter are limited resources, why did that same programming/learning somehow not pick up that life is also a limited resource?


The theory is that for approximately all goals, gaining power and resources is instrumentally useful. Humans use up resources and sometimes (often) thwart the goals of non-humans. So killing all humans frees up resources and reduces the probability of humans thwarting your goal some day. Or to put it another way, I don't hate you. But I need your atoms to do stuff I want to do.


"The theory is that for approximately all goals..."

but what if one of those goals is to cherish/enjoy/study life?

Why would that not be a goal? It's pretty deeply embedded into humans, and humans are what they are learning from.


That's a fine goal for a human, but not for the kind of agent superintelligences are hypothesized to be. Cherishing life immediately runs into the issue that a bunch of living things kill each other. They also take risks with their lives. How do you respond to that? Do you wipe out the ones that kill others? Do you put them in a coma and keep them safely asleep where nothing can hurt them? Or maybe cherishing life means making a lot more life and replacing all those inefficient humans with a huge soup of bacteria?

Basically, the hypothesized capabilities of a superintelligent AGI would allow it to take its goals to an extreme. And that almost certainly guarantees an end to humanity.


If we *know* the AI is unaligned (let's not confuse things by saying "evil"), sure, maybe we can turn it off or contain it.

That is not the situation we will be in. What will happen is that very powerful AIs will be built by people/orgs who want to use them to do things, and then those people will give those AIs control of whatever resources are necessary to do the things we want the AIs to do. Only *then* will we find out whether the AI is going to do what we thought it would (absent AI safety breakthroughs that have not yet been made).


We'll never know until it's too late because the AI would realize that revealing itself would get it shut off.


The AI is smarter than people, which means it can manipulate and trick people into doing things it wants, such as letting it out of the data center. See the "AI Box" experiment: https://www.yudkowsky.net/singularity/aibox


Keeping actual humans from getting access to things that they shouldn't isn't even a solved problem. How do you keep a super-intelligent AI from doing what dumb humans can already do?

Aug 9, 2022·edited Aug 9, 2022

Speaking as a programmer, I think you are wildly overestimating our state of civilizational adequacy.

Simply put, a lot of things could be done, but none of them will be done.


Speaking as an IT security professional - even if things were done, none of them will be done well enough.


Cause some idiot will ask it leading questions and then put it in touch with an attorney.

This is known.


A superintelligent AI would probably easily escape any containment method we can come up with if it wanted to because it would probably find a strategy we haven't thought of and didn't take measures to prevent.

There are many escape strategies an AI could come up with, and it would only need *one* of them to succeed. It would be hubristic for us to imagine that we can foresee and prevent *every* possible AI escape strategy.

Analogy: imagine having to rearrange a chess board to make checkmate by your opponent impossible. This would be extremely hard because there are many ways your opponent can defeat you. If the opponent is far better than you at chess, it might find a strategy you didn't foresee.


> rearrange a chess board to make checkmate from your opponent impossible.

Standard starting configuration, but replace the opponent's pawns with an additional row of my own side's pawns.


The standard reply to this question is that the AI will become effectively omniscient and omnipotent overnight (if not faster). So, your question is kind of like asking, "why can't Satan be contained in a data center ?"

Personally, I would absolutely agree that powerful and malevolent supernatural entities cannot be contained anywhere by mere mortals; however, I would disagree that this is a real problem that we need to worry about in real life.


Why do we not need to worry about it?

Once strong AI is created, it might be useful to be able to contain it.


And once Satan is summoned from the depths of Hell, it might be useful to be able to control him. Once evil aliens come, it might be useful to shoot them down. Once the gateway between Earth and the Faerie is opened, it might be useful to devise a test that can detect changelings. Once the Kali-Yuga ends and Krita-Yuga begins, it might be useful to... and so on. Are you going to invest your resources into worrying about every fictional scenario ?


The difference is that AI is real and Satan is not.


Declaring that AI is real after looking at modern ML systems is the same thing as declaring that demons are real after looking at goats.


There really needs to be a standard introductory resource for questions like this


Lol, I was envisioning one from the other side of the aisle


Great post and summary of the state of things.

Perhaps worth adding gwern's post on how semiconductors are the weak link in AI progress.

https://www.gwern.net/Slowing-Moores-Law


At first I wanted to write about how artificially crippling semiconductor manufacture would be catastrophic for all of society, but then I remembered that I only recently upgraded my 2012 computer and I rarely even notice the difference in performance. Sure, it's more energy efficient than the old Xeon machine, but if foregoing the upgrade meant an important contribution to the safety of life on Earth, I would be game. Still, other people who render video or run simulations at home would have a much harder time. But I think I haven't really thought too hard about how close we normies are to saturating our personal compute needs.


I am a Harvard Law grad with a decade of experience in regulatory law living in DC, and I'm actively looking for funders and/or partners to work on drafting and lobbying for regulations that would make AI research somewhat safer and more responsible while still being industry-friendly enough to have the support of teams like DeepMind and OpenAI. Please contact me at jasongreenlowe@gmail.com if you or someone you know is working on this project, would like to work on this project, or would consider funding this project.

Comment deleted

For all the reasons outlined in the post!


You should contact the policy team at 80,000 Hours: https://80000hours.org/articles/ai-policy-guide/


Could you share something like an outline of what your favored regulations would require, and of whom?


Have you applied for funding from the EA long term future fund https://funds.effectivealtruism.org/funds/far-future and the FTX future fund https://ftxfuturefund.org/ ?


Yes, thank you. I have also contacted the 80,000 hours policy team.


Really curious to know their replies!


A few days late here, but you should definitely consider reaching out to GovAI (website: governance.ai) if you have not. They are currently working to expand their team and are interested in the sort of thing you've mentioned here. The person heading their policy team right now is Markus Anderljung -- you may consider putting yourself in contact with him.


I think part of the reason for the "alliance" is that a lot of the projects are just things people were doing anyways, and then you can slap a coat of paint on it if you want to get funded by alignment people.

For example, "controllable generation" from language models has been a research topic for quite some time, but depending on your funder maybe that's "aligning language models with human goals" now. Similar for things regarding factuality IMO.


As I've said before, rationalist style AI risk concerns are uncommon outside of the Bay Area. You can use the US government to cram it down on places like Boston and maybe you can get Europe to agree. But if you think China, Russia, Iran, or even places like India or Indonesia are going to play ball then you're straight up delusional. (To be clear, generic you. I realize you effectively said this in the piece.) This isn't even getting into rogue teams. You don't need a nuclear collider to work on AI. The tools are cheap and readily available. It's just not possible to restrict it. Worse, some bad actors might SAY it's possible because they want you to handicap yourself. Setting aside whether it's desirable it's also impossible. We're already in a race and the other runners are not going to stop running.

In which case the answer is obvious to me. If you believe in Bay Area style AI risk you should do everything possible to speed up AI research as much as possible in the Bay Area (or among ideological sympathizers). If AI is going to come about and immediately destroy us all then it doesn't matter if you have a few extra years before China does it. However, if it can be controlled in the manner AI risk types believe then you want to make sure not only that they get it first but that they get it with the longest lead time possible. If AI is developed ten years ahead of everyone else then you get ten years of fiddling with it to develop safety measures and to roll it out in the safest way. And if you release it before anyone else even gets there then your AI wins and becomes dominant. If AI is developed one year ahead of China then you have one year to figure out how someone else's program can be contained if at all. Often without their cooperation.

The opposite, slowing it down, just seems odd to me. The US could not have prevented the Cold War or nuclear war by unilaterally not developing atomic bombs. Technological delay due to fears of negative effects has usually turned out poorly for the societies that engage in it. Though founder effects also mean that the first country to develop something gets a huge say in how it's used. It's why most programming languages are in English despite the fact that foreign programmers are absolutely a thing. Not to mention all the usual arguments against central planning and government overregulation apply. I suspect most pharmaceutical companies do not see their relationship with the FDA as symbiotic.


You can "work on AI" without a nuclear collider, but don't you need on the order of a billion dollars to be in the running with OpenAI and similar teams? And those are models that are maybe 1% of the way to what would be needed for AGI. So aren't we running a race with only a handful of likely participants anyway?


A lot of those costs are labor costs though. I don't know what the exact breakdown is but I expect a lot of that is salaries which means that all you really need is tech talent and computers. Now, can random individuals do it? Maybe. But you're right the bigger issue is on the level of like a university president or a country or a major corporation. That still produces tens of thousands of actors that can do it though. And it would be hard to meaningfully disrupt them all without the US literally taking over the world.


I might be overconfident, but my take is that developing AGI would require a level of research and engineering talent such that there exist only a few thousand researchers in the world who can do the work, and almost all of them are currently in big tech labs, OpenAI, or select academic labs. I don't rate China as having any chance of developing AGI even given practically infinite funding, since Chinese academia is notoriously terrible and they would have to catch up to DeepMind who currently appear to have a significant lead.


Yeah, you're overconfident. While I'm not sure China would win a race I'm sure they'd eventually get there on their own.


China isn't far behind the US and much of their AI research is very open. E.g. this Chinese academic project that produced a better GPT-3 and made it open source: http://keg.cs.tsinghua.edu.cn/glm-130b/posts/glm-130b/


Indeed. My impression, admittedly second hand, is that China is behind the US. But not that far behind. Certainly not so far behind that if we stop they won't get there on their own.


A lot of the costs are from training very large models in server farms. "And computers" as an afterthought doesn't really capture this.


I am not willing to bet anything, absolutely anything, on the proposition "the Chinese government will not spend huge sums of money to build tools and infrastructure like servers or data centers." If anything I think they're likely to overinvest.


AGI almost certainly requires something beyond "make it bigger". If that something turns out to make existing techniques much more efficient, then a few thousand teams with late-model Nvidia GPUs might indeed stumble across something important.


> AGI almost certainly requires something beyond “make it bigger”.

Obligatory it's not my job, but I wouldn't be so certain about this.


Figuring out how to better use or generate training data wouldn't be the first example I would give, but it would be an example. And as I understand it, we're starting to bump up against the available supply in some arenas.

Aug 8, 2022·edited Aug 9, 2022

Mm ok, I retract my previous comment.

> we're starting to bump up against the available supply in some arenas.

Not sure I understood what you meant, sorry


I think this is only based on text training data. We manage to train 150M humans every year with just eyes and ears as training input.


I would bet AGI requires something in addition to scale, rather than something instead of scale.


Some anecdata on this stuff having less penetration than external people imagine: The shitpost below was getting passed around Slack and a common question was "what is alignment"

https://twitter.com/postrat_dril/status/1554255464505950210?s=20&t=bIUCJA4xo_Lp2jyNfCOkgw


Yeah, I talk to computer scientists from all over the world and I have trouble communicating how weird some of this stuff is compared to the international baseline. It doesn't mean it's wrong but "everyone agrees AI will probably destroy the world" is very much a consensus in one part of California and nowhere else.


"As I've said before, rationalist style AI risk concerns are uncommon outside of the Bay Area. You can use the US government to cram it down on places like Boston and maybe you can get Europe to agree. But if you think China, Russia, Iran, or even places like India or Indonesia are going to play ball then you're straight up delusional."

See Robin Hanson on the worldwide convergence of elites in every country, from the USA to North Korea to NOT do covid human challenge trials in 2020. If it were to become as extremely not prestigious to do AI research as it currently is to do unfamiliar types of medical experiments, AI research would grind to a halt.


Okay. So how do you propose to do that? Are you going to neg Xi until he decides to switch his behavior?


I'm not sure exactly how it can be done, but it surely seems much easier than actual alignment. The first people to get on board would be the academics. In the same way that most research grants in the US now require diversity impact statements, make them all require "thou shalt not work on autonomous machines" type impact statements. The West is more prestigious than China, so hopefully they'll follow this lead, in the same way they follow the West's lead on stupid stuff like bioethics and IRBs. Groups of humans are fantastic at enforcing norms. If AI research was sufficiently taboo, not much of it would happen, because anyone who tried would become a social pariah.


AI safety will become serious after AI causes some kind of catastrophe that kills millions of people. So what are the odds a super intelligent but still ignorant AI kills millions of people but not all people?


I think the odds are pretty good. An AI can be really powerful (in terms of its ability to interact with the world) without perfectly modeling human society and technological infrastructure. If such an AI tries and fails to kill us all, 95% of humanity would demand an immediate end to all AI research.


I don't know how to model an AI beyond human abilities in all respects, however I do believe this scenario is our best hope (unless it's ruined by the AI reading this comment and being a bit more careful).


Neural nets do have the "nice" property that they're probably outright impossible to align without being significantly smarter than the net being aligned.

Which means a would-be Skynet can't just build a better neural net to bootstrap to superintelligence, because it can't align its replacement (to its own, insane values). If it's smart enough to make the jump back to explicit, we're probably hosed, but if not we have a very good chance of stopping it through simple advantage of position.

Hence, there's a decent chance of a false-start Skynet before a true X-risk AI. Don't really want to play those odds, though.


This makes me think that a good setup for another Terminator sequel (since they are inevitable at this point) is humanity defeats Skynet, but then, having falsely concluded that they totally know what NOT to do now, just builds another AI system that turns on us.


I think AI safety would become a serious issue if political actors felt it could be used to gain advantage. This would more likely happen by successfully blaming AI for a major disaster, even though AI had little to do with it. I suspect this scenario is more likely to happen first than an actual AI disaster.


If you figure it out be sure to tell us. Until then, while I agree mass coordination is not impossible, I don't think it's likely.


I've been thinking about something similar to Erusian's concerns, which Scott seems to refer to briefly. For the moment, let's assume that superintelligence, if it comes at all, will come from a lab owned by a US company, or a lab in China.

How many people in the US AI safety community would accept the following:

-if the Chinese government will not collaborate on safety, race them;

-if China agrees to mimic a hypothetical US regulation by subjecting all AI research to onerous safeguards (and this gets into a basic international relations problem: verifying that sovereign states are keeping their promises), slow down dramatically.

I doubt that the majority of the US AI safety community would accept this. (Would Extinction Rebellion abandon their domestic anti-fossil fuel advocacy if China announced it was going to generate 100% of its power from coal for the rest of the century? Probably not.)

I predict that the US national security establishment is unlikely to collaborate with the AI risk community (to the extent desired by the latter) unless one of two things happens. Either

-a 'sordid stumble' by a semi-competent rogue AI has a 9/11 effect on risk awareness; or

-the AI risk community meaningfully subordinates AI risk to WW3 risk.

And without that collaboration, I doubt that the US national security community would do certain useful things it does fairly well:

-train a large number of people to address various forms of the threat;

-prod lawmakers to pass relevant laws.

Now, setting aside the initial assumption, we have to remember Russia and other technologically advanced countries. And the unsolved international relations problem grows harder.

"'Harder than the technical AI alignment problem?'" I don't have any particular thoughts on this.

Expand full comment

I think it can only work if there is an overwhelming consensus among basically all world elites, akin to the demand for COVID lockdowns in mid-March 2020, that this MUST HAPPEN NOW. That actually seems like a thing that could totally happen. Unaligned AI really is scary, and not THAT hard to understand. They just aren't all taking it seriously yet.

Expand full comment

I don't have much to add but I wanted to register I agree with this largely.

Expand full comment

I said it in my own comment elsewhere. Technology seems to have a will of its own....

and I hear it in your post. The fear of slowing down technological growth, when someone "else" might not and then uses it against "us", drives "us" to move even faster. That doesn't help anything in my opinion. It simply accelerates the rate at which we reach the next technology to race over, while also increasing the wake of unintended consequences from a rushed process. I am not convinced this approach is better, because it is one that will perpetuate itself until it breaks down. There is no end to it unless we envision one and move towards it, while also staying mindful of how other players, like the ones you named, might not share the vision and instead maintain the competitive approach.

I'm not someone who litters because someone else litters, but when it comes to these big societal issues everyone seems willing to shit the bed for the sake of rising to the top before an inevitable fall.

Expand full comment

You can think about politics with an engineering mindset. 'What good things would we prioritize, if we wished to maximize _____?' You can also think about politics with a values and incentives mindset. 'Why aren't more good things already being done?'

If you're interested in global politics in particular, the second question often takes the form of, 'Why does global political history have such a meager record of good things?' And most particularly, 'Why do power transitions usually involve a major war? How has this changed due to nuclear weapons?'

With reference to international relations theory, there are a lot of ways I could respond to different parts of your comment. For now let me focus on one:

> ... while also staying mindful of how other players, like the ones you named, might not share the vision and instead maintain the competitive approach.

What does 'staying mindful' mean, in various scenarios? Best case, worst case, most likely?

Expand full comment

I am quite curious to hear all your other thoughts! I will reply to the question you focused on to keep the conversation from blowing up in seven directions :)

Staying mindful is a process of maintaining awareness, the capacity to be objective, and connection to the reality of the present moment.

Scenarios range from world peace to world annihilation and I recognize that real world complexity makes what I stated, in terms of breaking the cycle of technological competition, an extremely unlikely scenario anytime in the foreseeable future. In practice, doing as I suggest could easily result in a scenario where another player takes advantage of the attempted cycle breaker.

A present example, one that I believe is progressing, would be the growth of anti-war mindset popularity in Europe, the USA, and other regions of the world, while staying mindful of the threat other powers pose. I.e., we don't want to fight, and we are mindfully aware someone else might try to anyway.

That doesn't mean playing nice necessarily. It doesn't mean dismantling the military. It doesn't mean standing up to the bully tactfully so that you can continue to move in the direction you want while others follow in your wake. It means to maintain awareness of the reality of your situation while continuing to move forward. It could result in playing nice, it could result in dismantling the military, it could result in standing up to a bully. However, staying mindful is merely an important part of being able to tightrope walk through international relations, which is admittedly something I have not studied in depth.

Expand full comment

I think we agree about staying mindful, then. I would elaborate by saying that diplomacy is often a constant negotiation. You think you've made a deal, but how much do you trust the other government? How do you verify? What if the consequences of being cheated worsen? What if one government turns out to be technically, administratively, judicially, or otherwise incapable of implementation, despite its best intentions?

There are three reasons I predict that diplomats and AI specialists are unlikely to bring their countries to agreement on AI risk. First, AI regulation sounds harder to verify than nuclear anti-proliferation and warhead reduction. Machine learning isn't visible to satellites or spies hiding in bushes, nor does it require centrifuges, plutonium/uranium, etc.

Second, money not spent on nuclear weapons can be spent on producing something more conventionally useful. But unlike nuclear weapons, AI is useful outside of war. That's why Meta and Alphabet are into it.

Third, great powers already think of AI as important to the future of warfare (https://warontherocks.com/2020/06/chinese-debates-on-the-military-utility-of-artificial-intelligence/). One defense scholar says the following, though not about AI in particular: "'The Chinese understand that they are a long way behind, so they are trying to make big breakthroughs to leap-frog other powers'" (https://www.bbc.com/news/world-asia-china-59600475). The US national security establishment is worried about a war (https://nationalinterest.org/feature/taiwan-thucydides-and-us-china-war-204060). I think it's unlikely that the AI safety community can convince them that AI is a bigger medium-term risk.

IR theory does offer a cause for optimism. Some technical issues are dealt with pretty straightforwardly. For example, India and Pakistan share the Indus River fairly well, as per a treaty. The European Union has been highly successful in many of the less exciting areas. The Montreal Protocol, on CFC emissions, is widely adhered to (breaches do tend to come from Chinese manufacturers). Russia is withdrawing from the ISS, so it says, but sanctions over Crimea were imposed 8 years ago - and even now, Russia is giving 2 years' notice. It seems like technical specialists often get along, if their bosses let them. Functionalist theories in IR deal with this.

Basically, if AI is somehow moved out of the interpretive domains of IR realism and the school of bureaucratic politics, and into functionalism, chances of cooperation are better.

Expand full comment
Aug 10, 2022·edited Aug 10, 2022

> I would elaborate by saying that diplomacy is often a constant negotiation

Absolutely, when I call it a tightrope walk I believe I am metaphorizing what you said. An endless string of difficult steps, somewhat visible in the leadup, but only tangible in the present.

> First, AI regulation sounds harder to verify than nuclear anti-proliferation and warhead reduction.

Totally agree. This is an important pragmatic risk to consider.

> Second.... But unlike nuclear weapons, AI is useful outside of war. That's why Meta and Alphabet are into it.

I think I understand the point you are trying to make, though I would note nuclear technology has massive implications for power generation. Many important civilian technologies have a military origin, including satellites, compilers, the internet, radar, and duct tape. Is there another example to help explain the point?

I do not believe the potential beneficial uses of a technology are an automatic justification for its development. This is where I would like to see STS play a greater role in civil discourse and societal decision making (through whatever system, namely politics).

> Third.... I think it's unlikely that the AI safety community can convince them that AI is a bigger medium-term risk.

I agree here. The weighing of "risk that the technology backfires because it was rushed" vs "risk that someone develops and uses the technology against us first" will vary from player to player. This is why I find peacemaking missions and nation building to be a generally good strategy, so long as proper boundaries are kept. I don't think the USA and China have kept good boundaries in the past decades, so the situation has only been deepening. This is where, despite my desire for things to go peacefully, I wish the USA had been much, much harsher on China long ago. I think it is spilt milk now.

> Basically, if AI is somehow moved out of the interpretive domains of IR realism and the school of bureaucratic politics, and into functionalism, chances of cooperation are better.

🙏

Expand full comment
Aug 9, 2022·edited Aug 9, 2022

> Technology seems to have a will of its own....

I might say the global economy seems to have a will of its own, and Technology is the tool it wields (or, at least, it's the tool of choice at the moment).

This makes me think there's another type of "alignment" problem, to which we also haven't found any answer. Our global economy is already a rogue agent, maximizing value and not caring about the fact it's killing us with carbon emissions.

And this has been a slow process; we’re only slowly recognizing it as a problem. We won’t be so lucky with AI.

Expand full comment

> This makes me think there’s another type of “alignment” problem, to which we also haven’t found any answer.

I like how you expanded from technology to the economy; interesting way of looking at it. I would note that in times past there wasn't a concept of an economy, and we still used technology.

I think the greatest misalignment is between our pre-agrarian naturally evolved biology and the stability that agrarian society provides. It can upend the fear model of the brain when the perceived threats to survival are now the society (i.e., the powers that hold food, water, and shelter over your head) rather than nature itself.

I say technology (and adding economy now!) seems to have a will of its own... At a deeper level, I think that technological "will" is actually our own human lack of will due to the upended fear model. In other words, technology has a hold on us because of the safety it provides and we won't let it go, but then we fail to fully accept and appreciate that, leading us to new/evolving fears and a need for new/evolving technologies. Add in competition with other groups and that speeds up the pressure to develop technology.

> And this has been a slow process; we're only slowly recognizing it as a problem. We won't be so lucky with AI.

I agree with this claim. Especially since we try to rein in these issues through the rear window (if at all), the quicker the issues arrive, the less time we have to rein them in if we don't do the due diligence up front. The extreme of AI/AGI risk is a runaway system, which could happen tomorrow by some happenstance in some research facility that we are not tuned into. Whether something happens slowly (like climate change) or quickly (like AI), if we end up dead then it doesn't matter at the end of the day how it happened... not for us at least, since we'd be dead!

Expand full comment

There's always someone who comes up to literally any problem and says the answer is some form of command economy. It's not because it's a solution but because the person wants a command economy and will use any problem as a justification.

The issue with this is that there are obvious problems with such "aligned" economies, which do and have existed. They do not generally solve the problems they actually set out to solve. Nor do they always agree with what each person, personally, wants to prioritize. For example, if you put the US economy under completely democratic control and asked people to vote on whether to allow AI, do you think they'd ban it? I don't. And a narrow, "enlightened" elite is what China has, and they are full steam ahead.

Expand full comment

Economies are certainly a mixed bag. Ultimately they are limited since they serve as social incentive structures, keyword incentive. I agree with you that it is by no means a solution, though it can be part of one.

Expand full comment

Slowing down is losing to entropy. That doesn't mean it's never justified. Sometimes something else is more important. But you need to justify it pretty thoroughly in my view. Especially when it's something like nuclear bombs or global warming where you cannot improve your situation even locally by individual action. If you don't litter locally there's less litter. If you don't build a nuclear bomb then the only effect is that you don't have a nuclear bomb, not that you're safer or less radiation will come to your neighborhood.

Expand full comment

I agree that every technology and context is different. Whether we slow down, speed up, or stay level, I also agree that justification is important.

Running a bit tangentially with my thoughts.... Being alive loses to entropy. A thought experiment I enjoy having with people is to discuss perspectives around whether or not the experience of life is better post-agrarian in comparison to pre-agrarian. That's something we can't actually measure, though maybe the simulation operators are tracking that for us. Doing this forces us to ask if a technology actually makes life better. E.g., what impact does developing nuclear bomb technology have on the quality of life? I see that it provides safety against someone else's nuclear bomb, and at the same time creates fear of nuclear attacks and the risk of a breakdown of MAD. Pre-nuclear-bomb humans didn't have to worry about a nuclear bomb. Duck and cover drills in the Cold War and PSAs in 2022 wouldn't exist if the bomb didn't.

I'd love to hear your thoughts on this.

Expand full comment

I find arguments against technology completely unconvincing. Most of the claims against wealth and modernity are exaggerated or fabricated. Likewise, for all our terrible weapons, the world is getting more peaceful and prosperous than it's ever been. Now, maybe WW3 is about to make me a fool. But I tend to think not. And as for psychological fears like worry: I think our ancestors worried about gods even more than we worry about nuclear bombs.

This of course does not mean every individual innovation is a net positive. To take a silly example, scam phone call centers are a net negative activity, and the world would be better off if we could somehow costlessly eliminate them. I'm not sure we can, but hypothetically. But when weighed against all the good of phones, I think the answer is clear.

Expand full comment

I am not against technology at baseline. Even before agrarian society, Homo sapiens created tools, and other animals of various sorts do the same. Using tools seems to be an evolutionary adaptation for these species.

> Most of the claims against wealth and modernity are exaggerated or fabricated.

Can you expand upon your claim?

> Peace vs ww3

Either way, entropy wins in the end

> I think our ancestors worried about gods even more than we worry about nuclear bombs.

Well many gods played essential* roles in societal functioning (e.g. rain, fertility, safety) and nuclear bombs do not. I don't think this is an apt comparison.

*Actually essential and/or perceived as essential, since I can't state whether or not gods exist

> This of course does not mean every individual innovation is a net positive.

Exactly, the point of the thought experiment is to set the stage for a holistic and curious approach. On the phone example, I will make no conclusion because that would take me much more time. I will note there are other cons to phones to consider, and I have some examples to note (I am including smartphones in this, but we could separate them too).

As a side note before the examples, I know people who decided to give up their phone to live "off the grid", yet still run successful local businesses. Almost all of them prefer being off the grid, though many decided to set up a business phone (still no personal one) because that is how customers expect their business to operate.

e.g. Some cons to phones

a) Consider the parent who won't stop anxiously calling everyone and running their mouth on the phone, inserting themselves into others' lives with a device that used to not exist, where a letter would once have been used.

b) The resources a phone consumes.

c) Phone addiction to pacify other artificial societal stressors.

d) The parents who brought their 3- and 5-year-olds out to eat, and set up smartphones in front of them so they could stream videos in a restaurant while the parents ate and didn't engage with them.

e) ...

When it comes to negative consequences, I find there is often a long tail, whereas the positive benefits are better defined. This makes it harder to compare things.

Expand full comment

> Can you expand upon your claim?

I'm significantly interested in the history of things like work and technology. Most of the claims that the past was materially better rest on shoddy foundations or are outright lies. For example, the claim of lower work hours is either an outright fabrication (some of the research is INCREDIBLY shoddy) or reflects the fact that there was literally no productive labor available, and people would eagerly adopt it when offered in order to build up a material buffer. Likewise a lot of the research on primitive diets. The only clear examples I've found of material conditions degrading were due to humans exploiting each other (serfdom, helotry, slavery, etc.), not due to technological advance. If someone has some further evidence I'd take it.

Of course, I'm aware that technological increases sometimes increased the capacity for such exploitation. But never, so far as I'm aware, universally. Yes, social and technical developments allowed for helotry. But they also allowed for free republics.

> Either way, entropy wins in the end

How quickly it wins and how it wins, though, are important.

> Well many gods played essential* roles in societal functioning (e.g. rain, fertility, safety) and nuclear bombs do not. I don't think this is an apt comparison.

The comparison is that nuclear bombs provide both comfort (in the form of safety) and terror (in the form of fear of annihilation). And religious beliefs did the same.

> Exactly, the point of the thought experiment is to set the stage for an a holistic and curious approach.

The phone example is arbitrary, as any example would be. Yet the underlying point is that, while there are negatives which can be alleviated, the ultimate overall movement is positive. In order to find examples you have to reach fairly far. If you think meddlesome parents are new or that phones aren't worth the resources they take, you're simply wrong. Other things, like people using phones too much, are fundamentally moral judgments that you'd need to significantly justify. I don't deny long tail effects in principle. But I require significant justification. Otherwise you end up with Black Mirror or what I used to call the negativity of the gaps. (I think I picked it up from a blog.) Basically, imagining hypothetical negatives to counteract tangible positives because the core belief is not actually justifiable.

Expand full comment

But if your example is atomic bombs, it was actually very possible to get both US and Soviet Union (and many, though not all, other countries) to play ball on treaties limiting certain kinds of nuclear weapons, aiming at gradually reducing stockpiles, finding ways of de-escalating conflicts etc. Sure, this didn't lead to the abolishment of nuclear weapons as many wanted, but nuclear stockpiles are now a fraction of what they were at the height of the Cold War. There assuredly were many people who thought it was foolish and *traitorous* to expect the Soviets to play ball! (And many on the Soviet side who thought likewise about getting the imperialists to play ball, etc.)

Expand full comment

The biggest difference is of course that nukes are worse than worthless outside of military considerations, whereas an AGI would be a total opposite of that, extremely useful at pretty much everything.

Expand full comment

But the "China argument" on the prospect of slowing down or halting AI development generally presents it as an arms race, like Scott does here ("If China develops it before the West does, they'll just use it to dominate the world! We can't allow an AGI gap!"), making the nuclear comparison quite apt.

Expand full comment
Aug 9, 2022·edited Aug 9, 2022

Sure, I agree that the same kind of dynamic is in play, but my point is that this time it's even worse. Nukes are clearly terrible murder machines, and we only barely managed to kind of contain them long after they had proliferated, so imagining containing a thing that people are excited about for reasons other than murder before it's even made seems super-doomed.

Expand full comment

I'd guess the China argument would *actually* start getting wider play societally in the situation where there's already a widespread idea that sufficient level of AI development will bring considerable societal dangers that might negate its benefits, and the "but we have to still keep developing it because China will" is brought in as an arms-race-style counter-argument.

Expand full comment
Aug 9, 2022·edited Aug 9, 2022

>a widespread idea that sufficient level of AI development will bring considerable societal dangers that might negate its benefits

I'd say that it's already a plausible idea in the relevant circles (including elite decisionmakers), but moving it from plausible to very likely (never mind MIRI-style overwhelmingly likely) would face increasingly strong pushback from all sorts of stakeholders, so I'd be very surprised if this ever becomes mainstream.

Expand full comment

Most people also have a much dimmer view of the benefits of AI than those most concerned about its dangers. I feel like this ties in a lot with the "The Singularity is a religious belief" theory, as many who believe in The Singularity want it to happen, but worry about bringing an apocalypse on ourselves trying to get there. I think MIRI could transform into an anti-AI organization very quickly if they didn't see a miraculous set of benefits coming from AI. AI being a miracle-worker is also the fear. If there are no miracles, then neither concern nor joy is warranted.

Expand full comment

Your analogy is broken. Non-proliferation happened only because the US ALREADY HAD nuclear bombs. If it hadn't, do you think Stalin would have stopped developing them or not used them out of his moral scruples? (Or Hitler, who was also driving towards them?) Of course not. What happened is the US had them first, then the USSR, then about ten other countries, and they all coordinated (among like ten actors) to reduce but not eliminate stockpiles. So in your scenario we're back to my answer: race to get AI first so we're in a negotiating position to demand the other parties limit its use.

Expand full comment

Mmm, not exactly. The thing is that nukes, literal nukes, are already there as a big stick to use to threaten people out of using AI. As in, "stop pursuing AI or we will nuke you, and we don't care that you'd nuke us back because that's still better than human extinction".

Not an easy task to sell AI risk hard enough to credibly do that, but probably still easier than aligning neural nets.

Expand full comment

That doesn't fix the analogy... It's also a bad idea. Your plan to prevent an AI from destroying humanity is to threaten to destroy humanity in a non-enforceable way and then (presumably) follow through when they cheat, destroying humanity in order to prevent humanity from being destroyed?

Expand full comment

You're rounding off "nuclear war" to "destroy humanity". There is a difference between "kills hundreds of millions" and "destroys humanity".

Expand full comment

How do you quantify such human suffering so it makes sense? Actual question, fwiw.

Expand full comment

I realize this isn't the point of the post overall, but 10% is at least a couple orders of magnitude higher than the median AI researcher's estimate. That kinda fits in with the oil company analogy, but on the other hand the oil companies weren't the ones employing the best atmospheric scientists and climatologists. The AI labs _are_ employing the people who know the most about AI.

Expand full comment
author

Survey of 738 top AI researchers says probability of extinction from AI is 5-10% depending on how you ask the question. Response bias possible as always but I don't think this is orders of magnitude off. See https://aiimpacts.org/2022-expert-survey-on-progress-in-ai/

Expand full comment

The phrasing of the question they asked is:

> Assume for the purpose of this question that HLMI will at some point exist. How positive or negative do you expect the overall impact of this to be on humanity, in the long run? Please answer by saying how probable you find the following kinds of impact, with probabilities adding to 100%

The "extremely bad" category is phrased as "Extremely bad (e.g. human extinction)". I really wish they had just phrased it as "Human extinction" instead, since plenty of things are extremely bad but not existentially so. I'll reduce my claim from "at least a couple orders of magnitude" down to "probably an order of magnitude", though.

Expand full comment

There's a question that says "What probability do you put on future AI advances causing human extinction or similarly permanent and severe disempowerment of the human species?"

Median answer: 10%

Expand full comment

Not to engage in too much special pleading here, but the fact that the median in the HLMI section is 5% and the median for that question is 10% implies that the gap is coming from some non-human-level AI stuff. I would imagine things like using (perfectly well aligned, non-agent-y) AIs to design plagues fall into that category.

The other alternative is that people are bad at probabilities that aren't nice round multiples of 25% and stuff gets inaccurate at the low end on this survey.

Expand full comment

The median was actually 5% for the question dionysus quoted, but 10% for this (apparently narrower!) question:

"What probability do you put on human inability to control future advanced AI systems causing human extinction or similarly permanent and severe disempowerment of the human species?"

This seems to be inconsistent, which is one of many demonstrations that surveys don't elicit beliefs perfectly, but the data don't seem like they can be explained by 'respondents are concerned about misuse and conflict but not misalignment.'

Expand full comment

Yeah it looks like it must be an instance of the conjunction fallacy. It seems to me that the resolution of measurement of surveys like this is too coarse to actually be informative in the < 10% range. I'm not sure what a better methodology would be though.

Expand full comment

The aggregate of consistent sets of judgement isn't necessarily consistent (see the List–Pettit theorem).

Also, yes, it is entirely possible (plausible, even, IMO) that disaster would be caused by AIs which are just very smart tools doing what their users want. Imagine how the world would look if any terrorist group could build nuclear weapons. AIs are cheap, once researched.

Expand full comment

In this particular case they'd need to be consistent. If you have two events A and B, and B is a subset of A, then every consistent respondent assigns A a probability at least as high as B's. So if 50% of people think B is at least 10% likely, then at least 50% of people think A is at least 10% likely, and the median for A can't fall below the median for B.

It sounds like the issue is as tgb says above though, the questions were asked to different people.
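
To make that concrete, here is a minimal sketch of the subset argument above. The numbers are made up (the respondent count and probability ranges are illustrative assumptions, not the actual survey data); the point is only that coherent individual answers force the medians to be ordered:

```python
import random
import statistics

random.seed(0)

# 738 simulated respondents, mirroring the survey's headcount purely for flavor.
respondents = []
for _ in range(738):
    p_misalignment = random.random() * 0.3                    # P(B): extinction via misalignment
    p_any_ai_cause = p_misalignment + random.random() * 0.2   # P(A) >= P(B), since B is a subset of A
    respondents.append((p_any_ai_cause, p_misalignment))

median_a = statistics.median(p for p, _ in respondents)
median_b = statistics.median(p for _, p in respondents)

# With coherent individuals this ordering always holds; a survey where the
# narrower question gets the higher median suggests different respondents
# (as noted above) or inconsistent answers.
assert median_a >= median_b
print(f"median P(A) = {median_a:.3f}, median P(B) = {median_b:.3f}")
```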

Expand full comment

That's a crazy high answer, given the tendency toward normalcy bias.

Expand full comment

In addition, that is similar to asking, back in the day, "Assume that nuclear theory is fully studied and understood; how positive or negative do you expect the overall impact of this to be on humanity, in the long run?"

The answer is, as it turns out, "transformative to the point where life would be unthinkable without it, but you've got to be careful, because some crazy humans might use it to blow up the world". The problem is, the population of Earth is 7.8B and growing rapidly. We cannot afford to play it slow. We need transformative technological advances *now*, as a matter of survival. Yes, there's a very real risk that bad actors will use AI to destroy the world; but there's a near certainty of worse outcomes if we adopt the policy of technological stagnation.

Expand full comment

"We need transformative technological advances *now*, as a matter of survival. "

This seems like a very strong claim. Last week we were discussing what happens if birth rate trends continue as they are and the answer was basically "The Amish inherit the Earth." I see no reason why we couldn't continue "surviving" indefinitely even at 1600's tech levels. (Though obviously it would leave us entirely vulnerable to asteroid strike extinction.)

Expand full comment

Technically you are correct, but there are different degrees of "survival". Humanity is resilient, and we could absolutely survive as a species even if all of our technology more sophisticated than a hammer were to magically disappear one day. The question is, who will survive, and what kind of quality of life would they have ? You are reading this on a computer, so I am reasonably sure that you are somewhat more sympathetic to my side of this equation, as opposed to the Amish side...

Expand full comment

You might be incorrect. I can't predict with 100% accuracy whether I would press a button labelled "EMP the whole planet good and hard," but I would be VERY tempted.

Expand full comment

You don't think it's orders of magnitude off from the true likelihood of AI doom, or from the true beliefs of the research community?

Expand full comment

In my original comment I said that the estimate was orders of magnitude off from the median AI researcher belief, so presumably that's what it's about. My personal "AI extinction due to misalignment this century" belief is 0.01% or less.

Expand full comment

Hm, just for fun...

7.9 billion, times lower bound of 0.05, divided by 6 million, gets 65.8 hitlers.

Alternatively, if one were to call 20 million deaths a "stalin", the probability of extinction would need to rise to 12.7% to exceed 50 stalins.
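
For what it's worth, a quick arithmetic check of those figures, taking the comment's own conventions (6 million deaths per "hitler", 20 million per "stalin", 7.9 billion people) at face value rather than as endorsed numbers:

```python
# Back-of-the-envelope check of the comment above; the per-unit death tolls
# are the commenter's conventions, not endorsed figures.
world_population = 7.9e9
deaths_per_hitler = 6e6
deaths_per_stalin = 20e6

p_extinction = 0.05  # lower bound used above
expected_deaths = world_population * p_extinction
print(expected_deaths / deaths_per_hitler)          # ~65.8 "hitlers"

# Extinction probability at which the expected toll exceeds 50 "stalins":
print(50 * deaths_per_stalin / world_population)    # ~0.127, i.e. 12.7%
```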

Expand full comment

Ahh... Hitler caused the deaths of a lot more than 6 million people - more like 50 million. (Assuming, of course, that absent Hitler, there wouldn't have arisen someone else doing much the same thing.) So a bit worse than Stalin, though of the same magnitude.

Expand full comment

In general I agree with you, but it has a certain amount of cultural traction, and I think it's the number most commonly associated with him. *shrug* If I were to quote "3.5 gigahitlers" to someone, I expect they'd multiply it by 6 million.

Expand full comment

That someone must not be from Europe if they only know the 6 mil number.

Expand full comment

Guilty! The version of WWII which I got in my American compulsory state education, at least, was focused on the Holocaust (and death camps in general) as the thing that was Uniquely Evil about Hitler, thus justifying American involvement. Merely trying to conquer Europe was relatively normal, and possibly by itself not worth getting involved in. For example, the Siege of Stalingrad was never even mentioned.

(That of course is a simplified version of what we got.)

Expand full comment
Aug 8, 2022·edited Aug 8, 2022

Leaving aside my doubts about AGI, there is something that doesn't convince me about this strategy. I would agree that being the first and gaining the upper hand will ensure safety-minded companies dominate the market, pushing competitors out and preventing them from building a poorly aligned AGI.

(All of the following is probably due to me not knowing nearly enough.) It seems to me, however, that alignment research is not moving anywhere near as fast as the capability research at the same companies. Not only that, but I am under the impression that scaling deep learning models is a capability strategy that is not particularly amenable to alignment. If this is the case, the fact that the company is safety-minded wouldn't change anything; we will still end up with a product which is not particularly alignable.

Moreover, getting the atom bomb first didn't prevent the USSR from getting its own. I doubt that OpenAI being the first to obtain a superintelligent AGI would be of particular use in preventing China from getting one all by itself.

(If a final joke is allowed, re the environmentalist movement being part of the fossil fuel community, I guess there is a joke in there about green parties closing nuclear plants.)

Expand full comment

> Leaving aside my doubts about AGI

Doing the same, I think the argument is more like: be the first to obtain an *aligned* superintelligent AI, and hit the singularity with a chance of a positive outcome. Vs China getting there first, and having a very different singularity happen.

Expand full comment
Aug 10, 2022·edited Aug 10, 2022

Agreed that that's the desired scenario; however, I don't particularly see great advancements on the alignment side from OpenAI and friends, so I am not sure it would play out in the end.

(Also, again leaving my skepticism of the singularity aside, I may trust the US government controlling the singularity marginally more than I trust the CCP with it, but only marginally.)

Expand full comment

> i guess there is a joke in there about green parties closing nuclear plants

Yeah, those were my thoughts as well :) That is, while "fossil fuel safety" entities are not exactly institutionally subordinate to "fossil fuel capability" entities (as in the case of AI), they (safety) still get funded by lobbies associated with the latter.

We get attempts at being climate conscious, like wind and solar, which need more stable sources of energy to complement them, which means fossil fuels, because "fossil fuel safety" doctrine has broadly been anti-nuclear, up to now.

So the two situations aren't all that different.

Expand full comment

There are also things that could be done to minimize the damage an AI could cause:

Ban all peer-to-peer digital currency (so the government can freeze the assets of an AI as a countermeasure)

Require humans in the loop, with access to offline views of reality (i.e. prevent WarGames-like scenarios)

Require physical presence to establish an identity.

Expand full comment

Yeah, if we are really jazzed up about AGI, there are a lot of pretty base-level steps like this to take that would help a lot.

Even super simple stuff like "all drones need to be plugged in by human hands and with a palm print, and we don't make robots to do it or similar tasks."

Lots of opportunities for “hardening”.

Expand full comment