So this errant and evil AI? Why can’t it be contained in whatever data centre it is in? I don’t think it can run on PCs or laptops, or reconstitute itself like a Terminator from those laptops if they go online.

Great post and summary of the state of things.

Perhaps worth adding gwern's post on how semiconductors are the weak link in AI progress.

I am a Harvard Law grad with a decade of experience in regulatory law living in DC, and I'm actively looking for funders and/or partners to work on drafting and lobbying for regulations that would make AI research somewhat safer and more responsible while still being industry-friendly enough to have the support of teams like DeepMind and OpenAI. Please contact me at jasongreenlowe@gmail.com if you or someone you know is working on this project, would like to work on this project, or would consider funding this project.

I think part of the reason for the "alliance" is that a lot of the projects are just things people were doing anyways, and then you can slap a coat of paint on it if you want to get funded by alignment people.

For example, "controllable generation" from language models has been a research topic for quite some time, but depending on your funder maybe that's "aligning language models with human goals" now. Similar for things regarding factuality IMO.

As I've said before, rationalist style AI risk concerns are uncommon outside of the Bay Area. You can use the US government to cram it down on places like Boston and maybe you can get Europe to agree. But if you think China, Russia, Iran, or even places like India or Indonesia are going to play ball then you're straight up delusional. (To be clear, generic you. I realize you effectively said this in the piece.) This isn't even getting into rogue teams. You don't need a nuclear collider to work on AI. The tools are cheap and readily available. It's just not possible to restrict it. Worse, some bad actors might SAY it's possible because they want you to handicap yourself. Setting aside whether it's desirable it's also impossible. We're already in a race and the other runners are not going to stop running.

In which case the answer is obvious to me. If you believe in Bay Area style AI risk you should do everything possible to speed up AI research as much as possible in the Bay Area (or among ideological sympathizers). If AI is going to come about and immediately destroy us all then it doesn't matter if you have a few extra years before China does it. However, if it can be controlled in the manner AI risk types believe then you want to make sure not only that they get it first but that they get it with the longest lead time possible. If AI is developed ten years ahead of everyone else then you get ten years of fiddling with it to develop safety measures and to roll it out in the safest way. And if you release it before anyone else even gets there then your AI wins and becomes dominant. If AI is developed one year ahead of China then you have one year to figure out how someone else's program can be contained if at all. Often without their cooperation.

The opposite, slowing it down, just seems odd to me. The US could not have prevented the Cold War or nuclear war by unilaterally not developing atomic bombs. Technological delays due to fears of negative effects have usually turned out poorly for the societies that engage in them. Though founder effects also mean that the first country to develop something gets a huge say in how it's used. It's why most programming languages are in English despite the fact that foreign programmers are absolutely a thing. Not to mention all the usual arguments against central planning and government overregulation apply. I suspect most pharmaceutical companies do not see their relationship with the FDA as symbiotic.

I realize this isn't the point of the post overall, but 10% is at least a couple orders of magnitude higher than the median. That kinda fits in with the oil company analogy, but on the other hand the oil companies weren't the ones employing the best atmospheric scientists and climatologists. The AI labs _are_ employing the people who know the most about AI.

Aug 8, 2022·edited Aug 8, 2022

Leaving aside my doubts about AGI, there is something that doesn't convince me about this strategy. I would agree that being the first and gaining the upper hand will ensure safety-minded companies dominate the market, pushing competitors out and preventing them from building a poorly aligned AGI.

(All of the following is probably due to me not knowing nearly enough.) It seems to me, however, that alignment research is not even close to matching the pace of capability research at the same companies. Not only that, but I am under the impression that scaling deep learning models is a capability strategy that is not particularly amenable to alignment. If this is the case, the fact that the company is safety-minded wouldn't change anything: we will still end up with a product which is not particularly alignable.

Moreover, getting the atom bomb first didn't prevent the USSR from getting its own. I doubt that OpenAI being the first to obtain a superintelligent AGI would be of particular use in preventing China from getting one all by itself.

(If a final joke is allowed, re the environmentalist movement being part of the fossil fuel community, I guess there is a joke in there about green parties closing nuclear plants.)

There are also things that could be done to minimize the damage an AI could cause:

Ban all peer-to-peer digital currency (so the government can freeze the assets of an AI as a countermeasure)

Require humans in the loop, with access to offline views of reality (i.e., prevent WarGames-like scenarios)

Require physical presence to establish an identity.

If you think of AI as a research field rather than a mature industry, then maybe a more exact analogy than Exxon employees and environmentalists going to all the same parties, would be the scientists working on engines, and the scientists worried about the long-term effects of engines, going to the same parties in the mid-19th-century (or even being the same scientists). Which actually seems 100% plausible! Of course they'd go to the same parties; they're part of the same conversation about how to shape an unknown future! Come to think of it, one wishes there had been much *more* fraternizing of that kind. If there had been, maybe it would've been realized earlier that (e.g.) putting an internal combustion engine into every car was an ecological disaster in the making, and that the world's infrastructure should instead be built around electric cars and mass transit, while meanwhile research proceeded into how to build cleaner power plants.

(Full disclosure: am working on AI safety at OpenAI this year :-) )

As someone that could loosely be described as working on capability gains research, the MAD-style arguments hit me in the gut the hardest. We *are* in a race. That’s the reality. I’d be more comfortable prioritizing disarmament when we know we’re not going to be glassed by whichever other actor has the least scruples.

> and most of that came from rationalists and effective altruists convinced by their safety-conscious pitch

Wait, people actually believed that?

I don't mean to sound too pessimistic but what about a kind of anti-natalist/voluntary extinctionist perspective, that since the world is in a pretty terrible state and AI has the potential to either improve or just end it, it's kind of a win-win scenario.

You wouldn't need to think ending the world would be a good thing, only that it's in a state, below a certain threshold of goodness, which balanced against the chance of rogue AI would justify the risk.

I don't think you'd need to be that much of a doomer to think that, provided the alignment problem has a decent chance of being solved or if rogue AI didn't turn out to be that destructive.

The OpenAI charter (https://openai.com/charter/) says "if a value-aligned, safety-conscious project comes close to building AGI before we do, we commit to stop competing with and start assisting this project." This isn't precise or binding, and would be pretty easy for them to ignore (particularly given that capabilities are always ahead of what's been publicly announced), but the one OpenAI researcher I've spoken to about it actually thinks it might be followed if DeepMind seems close to AGI.

Is a nefarious AGI considered more dangerous/likely than a nefarious human in possession of neutral AGI?

The near to midterm mental model for AI that I use is imagining a world where a bunch of somewhat shitty magic lamps have been dumped all over the surface of the Earth.

Right now you have to go to a special school to learn to rub them the right way.

The genie speaks a different language than yours.

The genie can’t “do” anything other than give you true answers (that may or may not fit with your question) and explain how to do things to you (that may or may not fit with what you wanted to accomplish).

Over time the lamp gets easier to rub, the genie gets better at speaking your language and understanding what you want. And everyone has a magic lamp.

How long before everyone is making wishes?

How often does making a wish work out, even in fiction?

Is part of the reasoning also "we need a company like OpenAI to build an aligned AGI faster than a company like Facebook can build an unaligned AGI, so that the aligned AGI can forcibly prevent anyone from building an unaligned AGI - say, by burning all GPUs"?

See #6 here: https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities

What about the downsides of NOT developing AI?

Cancer detection, vaccine development, self driving cars (maybe, eventually safer than human drivers), etc.

As for the risks to labor and inequality - there are other mitigation methods, e.g., universal basic income or government-owned production and distribution to citizens.

The idea of a blanket ban seems short sighted to me. And difficult.

Why does AI safety require AI researchers? Nuclear safety doesn't require nuclear researchers until you have a nuclear reactor. If you want to stop dangerous technological proliferation, the only effective way I'm aware of is the Israeli approach of targeted airstrikes.

Read this and then go back and re-read “Meditations on Moloch”, and then try to sleep peacefully.

I always thought OpenAI was pursuing a PR-oriented strategy, producing AI optimized for impressing the public. An AI that plays grandmaster-level Go or is great at predicting user behavior and recommending ads will not seem instinctively impressive to most people because they don't have a good grasp of what problems are hard, and can't play Go / make ads well enough to be able to directly test the AI's skill level. An AI that excels at a very generic skill like making text or images and can be interacted with is much more effective at impressing upon the interested layperson how quickly the AI field is moving. Also, unlike a chess AI, a generic text or image generator AI can be weaponized, so it becoming widely available will prompt many people to start thinking about AI risk and safety, even if only at the low-risk end of the scale.

That seems to me like a smart strategy, and reasonably effective so far.

> What if they started thinking of us the way that Texas oilmen think of environmentalists - as a hostile faction trying to destroy their way of life?

As a Texas oilman, I have some comments. We as a group don't want to destroy the environment. Many of us are hicks of various kinds who love nature -- hunting, fishing, that sort of thing. Lots of people employed by oil companies are something like internal regulatory compliance enforcers. These employees aren't viewed as the enemy. Nobody feels any emotions about this at all, regulation is just part of doing business in the modern world.

There is a "neat trick" here, which I think is worth spelling out. The trick is that we oilmen always prefer for our internal regulatory experts (EHS, reserves, etc.) to tell us about any problems first, so that we have a chance to fix the problem quickly. We do not like paying massive fines for noncompliance. We like going to prison for trying to hide noncompliance even less! The regulatory apparatus of US oil & gas is actually so terrifying to us that we pay people a lot of money to help us avoid falling on its bad side.

The other half of the neat trick is that "we" (the industry) negotiated a regulatory environment that actually works for everybody. Oil companies LIKE doing business in the US, and it's not because the US regulatory regime is the wild west. On the contrary, the US provides a transparent and fairly enforced regulatory structure. Investors love this, so they invest in American companies. Companies love this because business is predictable. Employees love this because they aren't constantly faced with moral hazard and bribery. Environmentalists probably don't love the status quo, especially when there are big photogenic disasters like Deepwater Horizon to point to, but you know them, they're never happy anyway, and they have objectively gotten a lot of important concessions over the years. (Need I mention that we also get mad at the companies that are responsible for huge disasters? Their sloppiness makes us look bad, in addition to simply being a huge tragedy. We largely wouldn't really mind stricter regulation of that sort of project.)

I feel like the problem upstream of the regulation-of-AI issue is that "tech" as a unit is totally unfamiliar with and antagonistic toward the whole notion of being regulated *at all*. They have so far been pretty successful at dodging and weaving their way out of ever really having to submit to regulatory scrutiny. I don't think you solve AI-regulation without solving the general problem of tech-regulation.

As a Texas oilman I propose:

- Clear and explicit regulations that are negotiated between the financial interests of industry and concerned public parties, such as AI risk professionals. The industry stakeholders need to feel confident that you're not going to come after their money, and that you're not going to change or reinterpret the rules suddenly. This sort of trust takes time to build.

- Horrifically steep penalties for violation, applied fairly and evenly. Make it more expensive to break the rules than follow them.

I admit I don't have a magical Theory of Change that allows this to actually happen. But I feel like, if the government broke Standard Oil, surely they can break Facebook?

Aug 8, 2022·edited Aug 8, 2022

An important disanalogy with oil and pharma companies is that the AI arms race is not really about what gets deployed.

When Google created and announced PaLM, they were participating in the race. If you ask "how is Google doing in the race right now?", PaLM is one of the cases you'll include in your assessment. It doesn't matter, here, that no one can see or use PaLM except maybe a few Google employees, or that Google has said they have no plans for deploying it. The race is a research race, and what counts is what is published.

One can imagine trying to restrict deployment anyway, and this might be very good for slowing down near-term stuff like surveillance. But if companies keep pushing the envelope in what they publish, while getting steady revenue for more research from old-but-trusty deployed products, that doesn't seem much better for longer-term AGI worries.

(Whereas in eg pharma research, the incentives are about bringing new drugs to market, not about publishing impressive papers that make people believe you have developed great new drugs even if no one can use them.)

Could publication itself be restricted? Maybe -- I'm not sure what that would mean. In the academic world, you can "restrict publication" by refusing government funding to certain types of research, but that's not relevant in industry. I guess you *could* just try to . . . like, censor companies? So they're banned from saying they've developed any new capabilities, in the hope that would remove the incentive to develop them? This is obviously a weird and unsavory-sounding idea, and I'm not advocating for it, but it seems like the kind of thing that ought to be discussed here.

This seems kind of analogous to infectious disease researchers doing gain-of-function-style research: figuring out what the space of possibilities is, how fast they could be reached, what the limits are, etc., but in a controlled environment with (presumably) careful safety precautions (and a side shot of fame / glory / riches). Possibly the same sorts of risks apply in terms of something escaping the lab.

I'm in the "Butlerian Jihad Now" camp, personally. If AI has even a relatively small chance of ending the world, should it not be stopped at all costs? As Scott says in the article, onerous regulations stopped the expansion of the nuclear industry cold. If AI is similarly resource intensive, it should be practical to stop it now.

Minor quibble but I suspect the data privacy stuff wouldn't really matter much - it matters a lot now, but I'd expect the sort of AI that actually nears AGI to be much less reliant on large datasets (so at most it'd slow it down a few months). GPU limits might matter more - finding a way to limit chip production might be more useful here.

(The obvious dark conclusion here is "get China to invade Taiwan and cause a global shortage", but the US is pushing hard to increase domestic manufacturing ability so in a few years that might not even help)

Aug 8, 2022·edited Aug 10, 2022

It's no wonder AI safety researchers end up as AI builders: it's the same research problem with more money attached.

It's like map design: is a good map one that gets you to your destination quickly, or one that doesn't hit nasty obstacles along the way? They are different. But they use very similar information.

AI is useful when it spots things we'll love. AI is safe when it avoids things we'll hate. So building safe AI is nearly the same problem as building useful AI.

Consider cookbooks. Do you need different skills to write a great-tasting leftover recipe than to write tips for leftover safety? Sure. But in both cases, you first need to already know a fair amount about food.

AI is a cognitive product, like cookbooks or maps, not a physical product like trucks or drills. In knowledge products, "do the good" and "avoid the bad" are closely related. Safety progress and commercial progress won't become separate until AI is more sophisticated.

Aug 8, 2022·edited Aug 8, 2022

All of this is funny from the perspective of someone who thinks alignment of any single one agent of limitless power is likely an unsolvable problem, but fortunately for us, convergent instrumental subgoals almost certainly include human values.

Here's what it looks like from that outside perspective:

A bunch of people trying really hard to solve a likely unsolvable problem that may not need a solution at all, collectively behaving in ways that violate their own theories about what is dangerous (like forming multiple companies), doing their damnedest to make sure the god we are about to summon is good, while thinking that belief in 'god' and 'good' as real things is for silly people.

An obvious regulatory hinge point -- and I do not think this would be ethical but it is at least technologically apt and less obviously evil than outright AGI-enslavement plans -- would be to treat anything that exhibited any stochastic parrot behaviour on insufficiently licensed datasets as copyright-infringing. That would at a single stroke eliminate the overparameterization approaches that are scaling out so well right now but which cannot avoid reproducing their training data on a well-chosen adversarial input.

We must slow progress in AI. The pentagram has been drawn and Satan's hand has begun to emerge from the floor. Yet the cultists remain enamored with the texture of his scales. We can't just sit back and ask nicely forever... the cultists may have to be tackled if they don't listen.

Asimov's robots were first taught the Three Laws of Robotics - ethics. US big business and the military are the biggest AI profit centers, and each is actively against teaching AI ethics.

The happy fossil fuel community sounds absurd until you realize it wouldn't have been 70 years ago.

If responsible people had been worried about climate change at the time they wouldn't have wanted to stop fossil fuels. The benefits to human flourishing are too great. Instead it would have been about slowly phasing them out and creating alternatives. Or maybe about doing carbon capture on point sources.

If AI starts to become really dangerous, or a visible problem, the happy AI community would probably start to sound as absurd.

Ah, now I understand why the MSM keeps telling me AI Is Racist. They're really describing the tensions between the Safety and Capabilities factions of the Broader AI Community.

The cooperate-with-China tactic genuinely seems more feasible than "solving alignment", since we have at least done the former before and have some vague idea how such a plan would go. If AI is really as much an x-risk as people say, and the genie can't be put back in the black box (what would it be like to universally censor the concept of AI...?), then it ought to be as much of an all-of-humanity effort as possible. We can worry about dividing the spoils after defusing the bomb.

> This would disproportionately harm the companies most capable of feeling shame.

It will also expand the army of shamers with the most empathetic companies, that you can use to shame less cooperative companies. It all sounds like arguments for not starting to work on a problem because it will not be completely solved in one step.

It isn't even necessary to start being emotionally antagonistic - the problem is factual disagreement on the usefulness of capability research, not a difference of values. So everyone should just shame safety teams for not spending all their time politely presenting graphs of differential impact of capability/alignment research.

People worried about AGI X-risk refuse to correct enough toward what actually "cutting the enemy" looks like. Too much forswearing the "minigame" only to keep playing the game. The thoughtless brushing off of any illegal action as obviously ineffective is a symptom of this: what is your prior belief, really, that in the space of all possible actions to prevent AGI-induced extinction, the best action magically happens to break no laws (or even better, falls within the even smaller subspace of actions you consider "pro-social" or "nice")? And of course, the insidious implied belief we can just narrowly cut out AI and leave behind the technophile digital social systems that created it, with our nice standard of living! How lucky!

Here's my attempt to give an example from the "meta-class" of what I think effective (relatively, that is, still extremely unlikely) action to stop AGI X-risk might actually look like: get a disciplined core group of believers to agitate for an extremist anti-technological religion (or borrow an existing religion like Islam or QAnon and weave anti-AI into it), using all the methods of QAnon, radical Islam, etc: do your best here. Then, do your best to precipitate crises leading to some sort of technophobic reaction in the masses (aka PATRIOT Act-style overreaction). Make it socially unpalatable to say, "I work in AI". This might work only in one country -- hopefully nuclear armed. Some encouragement: QAnon (aka word vomit by a bunch of random underpaid GRU hacks) has nucleated a radical movement close to seizing power in USA. The Bolsheviks started a totalitarian state by taking over a couple of post offices. Your x-rationality isn't worth much if it can't figure out a way to do at least as well.

Once your movement has succeeded, install an extremist government that issues an ultimatum: AI work stops, worldwide, (with inspections), or we nuke you. If they refuse, actually nuke them: nuclear war is not X-risk (though it would certainly suck!) -- nuclear war still allows the hundreds of billions yet to exist to actually exist. Extinction does not.

Would this work? Extremely unlikely. But just hoping that the prisoner's dilemma of "first to AGI" goes away like a miracle if enough people have goodness in their heart, which is what existing safety work looks like from the sidelines, seems indefensibly useless: waiting to die.

My advice to people who believe in imminent AGI X-risk: take your pro-social blinders off. Stop acting in bad faith. Stopping the basilisk might require unleashing very discomfiting forces and a lot of pain. But a good utilitarian can weigh those appropriately against extinction.

Aug 9, 2022·edited Aug 9, 2022

I've read half of Superintelligence, I've read most of Scott's posts and some of Eliezer's. I've listened to most 80K episodes on AI safety/policy/governance, I've read the "Concrete Problems in AI Safety" paper, I've spoken to several people irl. Nothing sticks, this consistently feels like magical thinking to me, like you're saying "assume we invented a god, wouldn't that be bad" having snuck your conclusion into your premise.

I don't know why I (or literally any of my colleagues in ML) just don't see what everyone else here is seeing. The short version of why I'm skeptical is something like:

1) planning and RL are really really hard problems, and it's not clear that you can get exponential gains in solving highly <general> scenarios (i.e., specialization will always be better, vague no free lunch vibes), and

2) there is so much irreducible uncertainty that long-term planning of the sort typified by "a secret/hidden motivation arises inside an AI trying to fool you" seems hard to imagine.

What empirical evidence I think would change my mind: an AlphaGo-like capability on any real-world (non-game) task, one that can solve long-horizon planning problems with large state/action spaces, a heavy dose of stochasticity, and a highly ambiguous reward signal.

Maybe this is equivalent to the AGI problem itself. Maybe I lack imagination! But these pieces genuinely leave me so confused about proponents' models of the world under which AGI (as we conceive of it in the next 30 years) is an *existential threat*.

This is not to say I'm indifferent in the face of AI progress. I didn't think GPT-3 possible by scaling up existing methods. The AI-inventing-bioweapons scenario is potentially concerning (I'm uncertain how much design is the bottleneck for bio risks), but also *not* an AGI scenario. These seem like the ~regular~ cowboy shit we get up to -- technically impressive, potentially dangerous, needs care, but not an Avengers level threat.

So -- is there a textbook, a paper, a book, anything, that has a track record of being persuasive to someone with my background (academic ML, skeptical but open-minded) that anyone has to recommend? Ideally it would sketch out a clear argument, with technical details, on exactly *how* AI poses an *existential threat* (not just that it could be dangerous).

Time to start a new field of AI Alignment Alignment

It strikes me that if the base assumption requires us to come up with a plan to predictably align a power that's assumed to be akin to God, we're very much in the realm where disappointment is going to be our sole companion and confusion reigns.

I'd love to see more empirically bounded views on specific things the companies can try in order to be aligned. So far the only one I've seen is to de-bias the algorithms, which is fantastic and worth doing, but there is scant little else. In fact it's easier now because the companies get to say "yes, AI safety is important" and do what they were going to do anyway. So we can argue about whether Thor can really beat the Hulk, but meanwhile progress will only happen if there are some falsifiable specifics in this conversation.

Aug 9, 2022·edited Aug 9, 2022

I ask this question about all technological progress. More often than not, we could slow it down if we had the right societal incentives.

In my education we studied Science, Technology and Society (STS). Research has provided a variety of frameworks for conceptualizing and managing the rollout of technology and the ways science, technology, and society each affect one another. In the real world outside of academia, it feels like nobody cares about this research. Many don't even know the field exists. I think one problem is that STS does not have a measurable ROI, whether it be short term or long term. Another problem is that slowing technological growth runs contrary to economic growth in the short term, which disincentivizes people from taking a more pragmatic approach to the introduction of new technologies. I could go on, but I'll leave that to those interested or anyone who replies.

The pandemic as a whole would be a fascinating window of time to look at it, but the example that sticks out to share is how Electric Scooter companies literally dumped scooters on sidewalks and left the resulting problems on local governments. I simply laugh at how ridiculously it played out in my area, yet the same pattern of technological introduction holds true for social media, plastic, and petroleum. They were introduced in an unrestrained fashion and now those 3 things have the potential to fuel societal collapses through their various impacts.

AI is another technology which is already having unintended consequences, from minor things like automatically getting flagged as a bot and temporarily losing account access to major things like creating more racial disparities in healthcare or causing road accidents. I think people speak about AI risks more than others because it invokes a more primal fear of being replaced, but when you look at how AI is actively being used... the pattern is back. Even if part of the scientific community decided on a safe boundary for AI, I wouldn't trust everyone to respect it nor the political institutions to enforce it. It's as if technology has a will of its own.

Studying STS left me angry and depressed toward society, so I've shifted my focus to acceptance... societal systems upend themselves if that is their destiny. I wonder what the future holds for the introduction of new technologies. Personally, I do hope it is much slower and evidence driven.

If AI can get this advanced, the name of the game is leveraging good AI against bad AI.

Stifling AI and sending it to rogue or irresponsible actors does the exact opposite.

AI can be programmed to be human-loving (and self-destruct upon mutation).

In addition, vast server farms can be controlled by humans for the purpose of protective AI, that would develop solutions to overwhelm any emerging AI threat.

I remain convinced that these apocalyptic risks (the unaligned super-intelligence, the over-enthusiastic paperclip manufacturer, etc.) are just not defined enough to worry about. My concerns, and I imagine the concerns of the broader public, are with problems we can actually define and identify, and which if solved, might help avoid the apocalypse scenarios as well:

- The people setting machine learning loss functions sometimes have bad incentives.

- AI sometimes optimizes in ways that contradict our values (e.g. race is correlated with recidivism, so it explicitly considers whether a parolee is African American before deciding whether to grant parole).

- Modern AI has sometimes been known to obscure its decision-making process when using metrics we don't like, but that optimize its given goals. That's not necessarily a sign of intelligence (to the extent the concept has any meaning), it's just reacting to incentives in accordance with its programming. It's therefore worthwhile to research how machine learning decisions are reached.

- Modern AI sometimes makes bizarre and inexplicable choices, which have implications for self-driving cars and other uses.

Possibly the nascent urge to deceive will one day allow AI to improve itself beyond mortal comprehension in secret. Possibly incentive structures made without sufficient thought will one day cause my Tesla to drain the zinc from my body to make more Teslas. That sounds flippant, and I am being a bit funny, but I also don't discount the possibility. But if we address those small issues now, we *also* solve the big issues later.

When focusing on these immediate issues, it becomes less like wondering if environmentalists would be more successful if they hung out with oil barons and more like wondering if environmentalists should support green energy or allow the coal plants to keep running because green energy also has problems.


Hmmmm..... You might want to keep an eye on John Carmack. In an interview with Lex Fridman he said:

"I am not a madman for saying that it is likely that the code for artificial general intelligence is going to be tens of thousands of lines of code not millions of lines of code. This is code that conceivably one individual could write, unlike writing a new web browser or operating system and, based on the progress that AI has machine learning had made in the recent decade, it's likely that the important things that we don't know are relatively simple. There's probably a handful of things and my bet is I think there's less than six key insights that need to be made. Each one of them can probably be written on the back of an envelope."

I've posted the clip along with my transcription at my blog: https://new-savanna.blogspot.com/2022/08/fools-rush-in-were-about-six-insights.html


There is another explanation I've seen in the writings of Eliezer Yudkowsky that I'm surprised wasn't mentioned (or maybe I'm misremembering or misinterpreting him). The argument goes something like:

The only way to stop an unaligned AGI is with an aligned AGI. Perhaps the only way to prevent an unaligned AGI is with an aligned AGI. Once one group figures out how to make superintelligence, other groups will be close behind, including groups that will almost certainly make unaligned AGI. It is impossible to actually prevent AGI from being developed, so our only hope for victory is to make an aligned AGI first. It is important for safety people to be working on capabilities, not because they're likely to figure out alignment, but because the only possible path to victory is if the good people get AGI first.

What are your thoughts on this argument?


See, this is exactly why I won't donate to EA, or any capital-R Rationalist organization. They are like transportation specialists back in the day, when locomotives were first invented. You could invest in more railroads and better engine technology and eventually revolutionize your entire transportation network... or you can mandate that every locomotive has to be preceded by a man riding on a horse and waving a little warning flag.

Modern machine learning algorithms are showing a lot of promise. We are on the verge of finally getting viable self-driving cars; comprehensible translation from any language to any language; protein folding; and a slew of other solutions to long-standing problems. There are data scientists working in biology, chemistry, physics, engineering, linguistics, etc. who will talk your ear off about marginal improvements to prediction accuracy. But if you asked them, "hey, how likely is your machine translation AI to go full Skynet?", they'd just think you were joking. I've tried that, believe me.

Are we really at the point now where we are willing to give up huge leaps in human technology and our shared quality of life, in exchange for solving imaginary problems dealing with science-fictional scenarios that may or may not be physically possible in the first place? At the very least, do we have some kind of a plan or a metric that will tell us when we can retire the guy on the horse with his little flag?


My biggest hope is that human-level intelligence turns out to require hundreds of trillions of parameters and about the amount of data in a human childhood, and that it would be impossible to scale up to this level (and beyond it to superintelligence) quickly enough to matter.

My second-biggest hope is that something like Neuralink allows us to merge with the AI and use it as a literal brain extension. (As outlined in this Wait But Why post: https://waitbutwhy.com/2017/04/neuralink.html).

A problem with Scott's analogy: the AI-safety crowd does not have its own version of green energy - i.e. a proven viable alternative to fossil fuels whose only obstacle is price and the slow pace of deployment. The AI version of green energy might be safe AGI or merging with the AI, but neither of those exists so far, whereas green energy does exist and is getting exponentially cheaper (energy storage technology is following close behind, and deregulating nuclear would greatly help). The environmentalists can afford to be hostile to fossil fuels, since they champion competing energy sources, but the AI safety movement cannot afford to fight AI-capabilities companies when they have no alternative technology to rally around. They would be more like the degrowth environmentalists, who have a poorer track record than more mainstream pro-wind/pro-solar/pro-batteries environmentalists.


I feel the idea that we should take AI slow depends on the assumption that w/o AI our situation is relatively secure. If, OTOH, we are constantly at risk of nuclear and biological apocalypse then the relative harms and benefits of potentially disruptive tech seem very different.

That is doubly so given that there is about 0 chance we'll convince the Russians and Chinese to delay AI research w/o clear and convincing evidence of a threat.

Aug 9, 2022·edited Aug 9, 2022

>So the real question is: which would we prefer? OpenAI gets superintelligence in 2040? Or Facebook gets superintelligence in 2044? Or China gets superintelligence in 2048?

Unstated fourth option: "Or nuclear war in 2047".

I don't mean just threaten it, although if that works it'd be nice. I mean actually do it. The PLA deterrent is irrelevant if the result of *not* nuking them is likely the literal end of humanity.

NB: I'm not saying Nuclear War Now. I'm saying "work with them to stop, if that doesn't work say 'stop or we nuke you', if *that* doesn't work then nuke them".


Every time I read something like this, I hope my work helps make unaligned AI happen a little bit faster.

Aug 9, 2022·edited Aug 9, 2022

Pick a future year. Say that by around 2040, computers get cheap enough that anyone can make AGI, whether or not they care about safety.

Then 2040 becomes our deadline. If the professionals don't develop safe AGI before 2040, amateurs will develop unsafe AGI after.

It might not be 2040. The exact date doesn't matter. But every year brings cheaper hardware and more training data.

Right now, AI is a subject for serious researchers who care about their reputation with their community, who at least have to pretend to care about their community's safety standards. That won't last forever.

Even if China agreed to freeze development, once every smartphone is smart enough to do AGI, the genies will come entirely out of the bottle. Before that critical year, be it 2040 or not, we'd better have figured out how to make our genies kind.


>"So that was what I would be advocating to you know the Terence Tao’s of this world, the best mathematicians. Actually I've even talked to him about this—I know you're working on the Riemann hypothesis or something which is the best thing in mathematics but actually this is more pressing."

This is a really, profoundly silly way to try to get people on your side. People get Terry Tao to work on problems all the time, but this is not how they do it. Show him something of mathematical interest in AI alignment and you might get his attention; cold-call him sounding like a doomsday grifter and telling him his work is, well, just not quite as important as the stuff you want him to work on, and I can pretty much guarantee you're not going to get results.


What if the superintelligent AI decides to take revenge on us because we went slow on its creation? Suppose without any restrictions superintelligence would arise by 2050 but due to the concerns of the AI safety people it got delayed to 2100. Maybe the superintelligence would get angry and punish us for the delay.

This argument may seem absurd but is it any more absurd than those put forth by the existential AI-risk folks? (Here I'm not arguing against more prosaic concerns such as bias but rather the idea that superintelligence is an existential risk for humanity.)

Aug 9, 2022·edited Aug 9, 2022

Maybe this development will be a next big step towards AI...



So AI alignment is extremely hard, even harder than cooperation between countries. If the progress isn't slowed we are all extremely likely to die. Regulatory capture is a thing that can tremendously slow progress. Am I missing something, or is the actual reason for not broadly supporting the AI policy approach the "as a sort of libertarian, I hate blah blah blah same story"?

I mean, I myself would aesthetically prefer that the world be saved by nerdy tech geniuses and visionaries who, due to their superior rationality, took the right kind of weird ideas seriously, instead of establishment bureaucrats who use more regulation as a universal tool for every problem and just got lucky this time. But whatever was achieved with higher prices for GPUs due to cryptocurrencies is definitely not enough to properly prolong the timeline. Blockchain turned out not to be the best solution even in this sphere.

I would love having aligned AI in my lifetime solve all the issues and create heaven on earth; the dream of witnessing a good singularity myself is intoxicating. But I guess that's not really an option for us. Our society isn't that great at solving complicated technological problems on the first try with a harsh time limit. We have a much better score at ceasing progress through various means even when there is no good reason to do so. Maybe it's time to actually use our real strength.

Aug 9, 2022·edited Aug 9, 2022

Rationalists are very prone to cognitive traps of the “Pascal’s Mugging” type. A strong sign of this is thinking “maybe we should pattern ourselves on the crazy anti-nuclear folks who stopped the nuclear industry in the name of the environment,” thereby killing the one thing that could have prevented climate change.

Maybe it’s time to step back and realize that getting super hyped up about low-probability events is a cognitive anti-pattern.

Aug 9, 2022·edited Aug 10, 2022

I really, really hate this toxic dynamic where working on intermediate steps is deemed "not having a plan". Eliezer apparently does this to people as well, in person. If someone said they were working on getting people in the field to at least be aware of a non strawmanned version of the AI X-risk thesis and they'd gotten three engineers to quit their jobs at FAIR and do something else, I suppose to you and Eliezer that wouldn't count as having a "plan" if they can't unroll the stack on the spot about how it leads to a complete alignment problem solution.


Though I am still agnostic about AI-risk, I see us under the control of yet-dumb-algorithms already - and unable to handle even those: 1. An Icelandic volcano erupted. Air traffic control followed its rule/algorithm/simulation and grounded all civil planes, because "maybe dangerous" - it refused even flights to actually measure it. https://en.wikipedia.org/wiki/Air_travel_disruption_after_the_2010_Eyjafjallaj%C3%B6kull_eruption#Attempts_to_reopen_airspace

Quote: "On 17 April 2010, the president of German airline Air Berlin ... stated that the risks for flights due to this volcanic haze were nonexistent, because the assessment was based only on a computer simulation produced by the VAAC. He went on to claim that the Luftfahrt-Bundesamt closed German airspace without checking the accuracy of these simulations. Spokesmen for Lufthansa and KLM stated that during their test flights, required by the European Union, there were no problems with the aircrafts."

2. An economist (GMU, I remember) called his bank to complain about the algorithm too often stopping his card use (international trips et al.). Bank said, "the AI decides and we can not interfere with the algorithm". - Makes you wonder how we would handle ("align") a super-intelligent AI, doesn't it?

3. Governments are bureaucracies running on algorithms, that should be easy to adapt if found lacking. R.O.T.F.L. Baby-formula (Zvi). Enticements for Russian soldiers to desert (Caplan). Keeping German nuclear power running this winter (me et al.). Bringing vaccines out asap. Bringing updated vaccines out asap. ---

"SuperAI, thou shall come, thine shall be the power and the glory. We are unworthy." Hope we don't end up as paperclips. Our rulers-to-date turned us into social-security-numbers. So ...


I think the fact that many "AI safety" advocates are now working on developing AIs says something about "AI safety"...


> When I bring this challenge up with AI policy people, they ask “Harder than the technical AI alignment problem?”

I would argue that the CCP is an (admittedly slow, but powerful) AI that has a track record of unFriendly behaviour, soooooo...


slowing down AI development is just as much an existential threat as badly aligned AI

think for a moment of all the possible future problems that we can face and that an AI would help solve

but only one of those existential threats is a good excuse for rent seeking

Aug 9, 2022·edited Aug 9, 2022

I think the main difference between AI dev vs alignment and fossil fuel vs global warming is that it took some two hundred years between the industrial revolution (marking the start of large-scale burning of fossil fuel) and the establishment of the IPCC (marking the consensus that CO2 caused climate change).

For AGI, the fundamental idea of misalignment, the creation turning on its creator, predates the concept of computers by centuries at least. Hence Asimov's three laws. Later, it was recognized that aligning AGI would actually be kinda hard. But generally, any AI development is taking place in a world where memes about evil AIs are abundant.

Still, judgement of danger is subjective, and the people in the position to act generally have some personal stake in the outcome beyond being an inhabitant of earth. Take the LHC: the payoff for a person involved is "make a career in particle physics" while the payoff for a member of the public is "reading a news article about some boffins discovering the god particle". Thus, they might judge the risk case of "LHC creates a black hole which swallows earth" very differently. (Personally, I am pro LHC, but also work in an adjacent field.) Some biologists are doing gain of function research on viruses because they are convinced their safety measures are adequate. Perhaps they are, I don't know.

I would also like to point out that the potential payoffs for creating a well-aligned AGI are tremendous, as the universe is not quite optimal yet. Suppose you have a dry steppe environment with several early ancestors discussing the dangers and benefits of developing tech to create and control fires. Some might claim that fire can only be created by the god of lightning, so no human will ever create fire. Others might think that it is easier to create fire than to control it, and that they might thus inadvertently destroy the world. They might either want to ban fire tech research permanently or focus on fire containment research first. Still others might argue that fire control could be a gateway tech eventually advancing their civilisation beyond their wildest dreams.

At the moment, we are in this position. Is AGI even within the reach of our tech level? Will it lead to a singularity or will the costs for intelligence increase exponentially and effectively max out at IQ 180, giving us just a boost to discover tech we would eventually have discovered anyhow? How far does that tech tree go, anyhow? Is alignment possible at our tech level, or will AGI invariably kill all humans?


I am worried that when you're talking about when the AGI will come, you're using 2040 as your made-up number. Do you really believe it's possible to do it in less than 20 years?


Some mixed thoughts:

There are some cool ideas in AI policy, like structured access or compute measurement/management. Some other ideas lack a clear theory of change that describes how they reduce AI/AGI risk. Proponents of regulation (like the EU AI Act) claim it will incentivise more safety and checks, but I'm skeptical - most labs can just do the research in countries not covered by Europe's (weakening) "Brussels Effect". Standardisation is helpful but tricky to get right and in practice requires balancing the interests of many stakeholders who are not necessarily interested in or convinced by AGI risk - and people can also deviate away from standards (eg Volkswagen emission scandal).

Another issue is that frequently, policy proposals don't go much further than the blog post or academic paper stage. That's not always useful for policymakers. Few consider things like policy design, what the costs are, how to finance it and so on. And on the other side, policymakers are often too risk-averse to try new things that might make their minister or department look bad.

Still, I think we need more people in AI policy and governance. It's a nascent and growing field, and it will remain important going forward even if it doesn't tackle alignment directly. Managing competition with China, understanding how AI impacts society in other concerning ways, minimising risks of malicious use, financing safety research etc.

The fossil fuel analogy is helpful: ultimately what seems to be helping right now involves some innovation (nuclear, fusion), some regulation (carbon taxes) and some policies (fiscal incentives, research funding). I (weakly) believe coordinating and executing this well is more easily done if you're not in an excessively adversarial or polarized environment, and if you have a strong AI policy/governance ecosystem (ie with skilled, coordinated and proactive people).


I disagree with your existential risk estimate. We've already been within 30 seconds of a full-scale nuclear war, so an estimate of 0.0001% per century, while it may apply to giant asteroids, doesn't map onto the wider existential risk. It's my presumption that when Altman said "wait for the asteroid", he was talking about the wider existential risk estimate, as that's the only way it makes sense.


So what are the odds that the correct strategy for AI alignment is for safety-conscious teams to learn how to build AI as fast as possible and figure out alignment on the fly in order to achieve super-intelligence before malicious actors do?


You write "While there are salient examples of government regulatory failure, some regulations - like the EU’s ban on GMO or the US restrictions on nuclear power - have effectively stopped their respective industries." Is "stopping their respective industries" the mark of regulatory success? I'd say that while it's important to ensure the safety of both nuclear power and GMOs, the EU's actions on GMOs have been generally destructive, as has the US regulatory approach to nuclear power (during the period from 1975-2015), with very little concern for either science or technology.


If it's any comfort- and I'm really not sure this should in any way be regarded as a comforting scenario- there's a pretty good possibility that China and Russia are going to economically implode over the next 10-20 years or so due to supply-chain risks and demographic collapse. This might remove them from the field of play as major geopolitical competitors to the United States, which is one less thing to worry about from an AI Arms Race perspective.

On the flip side, it might also make them more desperate, which doesn't exactly contribute to a culture of safety.


Real AI safety and capability research are two sides of the same coin, since no one wants to make an AI that doesn't function as desired and you need to understand how AIs work to figure out what the dangers might be.

"AI safety" as distinguished from regular AI research is just a distinction that people make in order to excuse their lack of actual research output and pretend that they are the only ones working on the problem so they can get more funding.


Why not slow nuclear progress? D-Nader's Raiders get rich, so what if America gets brownouts, we can do a long con with green energy hoaxes. Why not slow electronics research progress? D-Jimmy Carter's judge can break Bell Labs, no sweat. Why not break the dot-com boom? D-Clintons get a half billion from Microsoft's competitors, Microsoft gets knocked down a peg and starts a billion-dollar bribery office in DC, if the economy never gets back to nineties prosperity that's too bad. Why not break US Steel? Worst case D-JFK immiserates Baltimore and Gary and America has to buy steel abroad.

Why not slow AI progress?


We live in a world full of natural and man-made threats.

Artificial Intelligence is no more dangerous than many other technologies, and control will actually be easier. Take for example CRISPR



"So maybe (the argument goes) we should take a cue from the environmental activists, and be hostile towards AI companies."

Except, how well did this work out for environmental activists? There has been a lot of noise. But we still seem on track for all but the worst outcomes. Today's profit is still favored over tomorrow's plight. The profitability of electric seems to have driven it more than environmental protest.

Perhaps we should explore what would make no-AI more profitable.

(I have not fully read the article or _all_ the comments, yet; perhaps this was addressed. The first section was a "wait-a-minute..")


Can we cross post this to the EA Forum? I think it's very relevant to discussion there.

Aug 9, 2022·edited Aug 9, 2022

"Suppose that the alignment community, without thinking it over too closely, started a climate-change-style campaign to shame AI capabilities companies. This would disproportionately harm the companies most capable of feeling shame."

The target of such a campaign should not be to adjust companies' policies. That will never work anyway. Rather, a campaign should be directed against companies, but aiming to convince voters to support government policies. If that is done, it will slow down/pause all companies equally. An international treaty would also be needed.

"If AI safety declares war on AI capabilities, maybe we lose all these things."

The war team doesn't have to be the same team. AI safety wouldn't even make sense as a flag to rally around, you might want to use something like @AgainstASI by Émile Torres. Would probably attract a different crowd, too. AI safety could be the good cop (from a capabilities company perspective), AI regulation the bad cop. Carrot/stick, works much better than just a carrot that doesn't even taste very good.

Your decisions here are really dependent on how likely you think it is that AI safety is going to be successful, and how likely it is that AI regulation is. Your post still assumes AI safety has a fair chance of being successful, which is why it's even relevant if companies are not nice anymore to AI safety researchers. However, the more AI safety probabilities asymptote to zero (as they are doing), the better the AI regulation story gets. I would be curious what this story would look like if you would update your AI safety probability to Yudkowsky's figure.


Might something akin to this be a more likely fate for humans than extermination?



You're missing the most important reason:

If you're wrong about AI safety risk, you would end up standing in the way of human progress for 30 years for no reason and that has a massive cost associated with it. Seriously, this is a lot of dead people compared to the counterfactual of allowing the benefits of improved productivity and living standards.

At the moment, the evidence for existential risk of AI hovers somewhere between speculative guesswork and religious belief. The upside, on the other hand, is totally discarded in the AI safety arguments, even though upside and risk are likely to be highly correlated. Nobody is going to put an AI in charge of systemically important decisions unless there is a huge win to be had.


> Greenpeace and the Sierra Club would be “fossil fuel safety”

50 years ago, long before wind & solar were viable, these groups rejected nuclear more strongly than coal, making them valuable contributors to the "fossil fuel community". Even today the Germans planned to stop their nuclear plants 8 years before their coal plants, and when Ukraine was invaded, they quickly decided to ramp up coal if it would help reduce their dependence on Russian fossil fuels... but keeping their last nuclear reactors online seemed to be a more contentious issue.


Given that China is entering a long period of crisis, from the real estate economic issues to the demographic bomb they are going to face, without forgetting geopolitical problems recent actions are throwing their way and the COVID lockdowns worsening everything, is the specter of Chinese AGI a realistic possibility or a good excuse?


I just had a funny dream about AI researchers being part of a mason-like semi-secret society. In the meetings the members would wear bangle bracelets representing their work. Silver for theoretical work, gold for empirical work. Left arm for capabilities work, right arm for safety work. A silly idea, but it was satisfying in my dream.


Whenever someone starts a new company they put out a public relations statement about how they are excited to make everyone's lives better by serving the community and something something equality. They do this even when they are in the business of making predator drones or extracting orphan souls, and most people do not assume the statements reflect anything more than a recognition that words like equality sound nice in a press release.

But I'm sure the AI companies who talk about safety are deeply committed to it.


The problem with cooperating on AI slow-down is that it requires a pretty balanced world. Otherwise, whoever feels he's on the losing side of the current battle for power is almost bound to go all-in on unrestricted AI research in hope it will help and not harm him in the short run. If only because it seems less dangerous and easier to win than a nuclear war to almost anyone except a few geeks (and not even all geeks who do AI research, so no consensus here). It need not even be the USA or China - a smaller power can attempt this, because while running a modern AI model requires very expensive hardware, it does not (yet) require classified, impossible-to-obtain and hard-to-replicate hardware.


Of course safety people are working with capabilities people. The question is not whether OpenAI or Facebook gets to deploy AGI first. Facebook is large enough to break down global trade with a pretty stupid AI with well known issues that become relevant exactly in the same order as described in an OpenAI paper (published and also available on arXiv and the OpenAI website) five years prior. The question is whether OpenAI can get licensing of their 30% less harmful AI made an industry standard, to the level where the EU might apply 25% of global turnover fines for negligence to Facebook (are there any downsides for the EU if there is at least an excuse?).

Aug 10, 2022·edited Aug 10, 2022


@JackClarkSF has really come a long way, personally and intellectually, since his days playing for the Giants, when he was an annual contender for the title of "Most Sexist Asshole in Professional Sports in the 70's" (A crowded field that included Bobby Riggs and Jimmy Connors, just in tennis, for example. It's no surprise he ended his career with the Red Sox.)


The dude could mash though. After he was traded to the St. Louis Cardinals (I'm from the LOU and I'm Proud!) in 1984-5, he was the author of one of the greatest EFF-YOU backbreaking home runs in playoff history. In honor of the late, great Vin Scully, here is that 1985 one-pitch at-bat against Poor Tom Niedenfuer. "What a way to end it!" Indeed. Watch for:

-- Pedro Guerrero, Dodgers' LF, throwing his glove down like the Bad News Bears

-- Jack Clark's g-l-a-c-i-a-l home-run victory lap, clocking in at a full 30 seconds (!) He was still a dick.

-- Tom Niedenfuer's unbearably painful, trapped, frightened expression. A man praying he can escape the stadium alive.



So, some concern on alignment - it seems to me that, in addition to being an unsolved problem, it would also cause problems if it *were* solved - namely, that the already-existing artificial "intelligences" called "corporations" and "governments" would start using any techniques so discovered on us and on each other, to the extent the techniques generalize.

Source: they already do, it's called "diplomacy", or "public relations", or "indoctrination", or "war", it's just not foolproof.

This does not rule out *all* attempts at alignment - there are a lot of differences between us and computers - but it does suggest caution. And possibly looking at what recourse humans have when other, larger groups of humans wish them ill.


While stopping governments/companies from expanding research further into dangerous waters seems difficult, getting chatbots/bots posing as humans off the internet seems like it should be pretty doable. Not sure why so few people take this idea seriously.


It doesn't boil the problem down to anything smaller than "Figure out a way to cooperate with China on this", but I think Robin Hanson's [fine-insured bounty](https://www.overcomingbias.com/2018/01/privately-enforced-punished-crime.html) system holds some theoretical promise if we decide the best course of action is to try to slow AI research through policy mechanisms.

Most of the top people working in places like OpenAI and DeepMind make a lot of money doing so. Instituting large cash fines for working on enhancing AI capabilities research, paid out of their own pockets, would probably work as an effective deterrent to get them to stop. If the FIB is paid *towards* informants, you also solve the issue of discovery quite cleanly - the only way someone could seriously continue work on expanding AI capabilities is by retreating to a place where they cannot be discovered by others and compelled to pay the bounty.

International law becomes the biggest barrier to such a system working, because committed actors would quickly learn to flee to friendlier shores. China would probably start World War 3 if the United States started kidnapping their own AI researchers for a quick buck like this. Then again, China wants more AI researchers as well, a lot more. Kidnappings from the research pool in the US would be a fast, cheap way to get access to the best AI talent pool in the world under the guise of humanitarianism.


I think the real problem is that AI is just going to be a way of avoiding innovation and change. It's not transparent, and all it does is embed current decision making in a chunk of software. It's great for my rice cooker which uses AI to cook me a good batch of rice, but the main risk of AI is that it will become an excuse to keep doing things poorly or worse once they are embedded in our software infrastructure.

If AI produced transparent software, it could be used for innovation. One could simply modify its rule set or tuning or whatever so it tries using unconventional fluorine compounds rather than avoiding them and get a host of new compounds. Unfortunately, it can only deal with what its training set and training criteria let it deal with, and it is often just one large dataset carefully presented in a particular order to produce an internally consistent result. If we learn something new, we'd have to retrain it from scratch, and, odds are, we wouldn't retrain it at all. We'd just live with its defects.

AI was supposed to open us to new thoughts and new ideas, but as it is constituted, it does the opposite. This favoring of the status quo and its limitations is bad enough, but AI is worse than that. When it fails, it fails unpredictably. Minor changes to the input have been demonstrated to completely destroy its algorithmic usefulness. Adversarial AI is just about finding the edge cases that the tuning misses. It's like flipping a few bits in a JPEG file and getting a totally bizarre image.

I agree that we need to slow down and figure out where we can use AI and where we can't. There's a reason we want human pilots in aircraft. All the automatic systems on board are wonderful. Even back in the 90s, it was possible for a plane to fly itself from one airfield to another, but such systems couldn't handle changed plans well, nor could they deal with problems if anything went wrong.

We face a similar problem deploying AI. We need to slow down and figure out where it could work and where it shouldn't.


Two points:

1) There is one admittedly dark but conceivable non-globally catastrophic event that could drastically impair the development timelines for AGI: A destructive invasion of Taiwan. If TSMC were destroyed and this were to be combined with strong policy inhibiting AI research one could really change timelines. Especially if this is combined with action against ASML. It does likely help China's relative odds of victory.

2) Given this, is Nancy Pelosi a closet rationalist? She does live in San Francisco.


While I'm not not worried about misaligned AI, I have to be honest that this thought process really undersells the potential benefits of speedy AI development. Looking at it from a personal perspective, which of these possible future scenarios should I be more frightened of?

1) Big tech develops AI unchecked. Sometime between 2030 and 2060, they inadvertently deploy a misaligned general AI, which kills me prematurely during a fast takeoff scenario.

2) AI development proceeds extremely cautiously, and no general AI is deployed by 2060. My health gradually declines due to age related illness. Medicine has made only incremental progress in this time, and I die of natural causes sometime between 2060 and 2080, the last few years of which are spent infirm and senile.

I am definitely more worried about the second scenario. The first scenario is a worse experience, but certainly not an order of magnitude worse. But the second scenario requires no assumptions beyond technology and society not dramatically changing in the next few decades--not no changes, we may have VR visually indistinguishable from reality, cheap supersonic flights, and narrow AI that can drive better than humans if there are enough sensors everywhere--but a sort of stagnation where our way of life doesn't fundamentally change. It's not some unlikely asteroid strike. It's the baseline outcome.

Let's say I look at it more altruistically. What's best not for me personally, but overall? I don't think the answer changes. I consider it a moral crisis that, absent dramatic action, nearly all of the 8 billion people alive today will be dead within the next century. This situation is justifiable only by religious dogma and status quo bias. I value all lives, including the lives of the elderly. If I were king, we'd be putting way more emphasis on research in general, but certainly general AI is a plausible solution to this crisis. And with 60 million deaths a year (most of which are age related nearly everywhere in the world), any delay will have devastating consequences. So while I can definitely get behind more research into AI safety, going a step further to a Luddite “hit the brakes” perspective has a high bar to clear. And frankly it has echoes of other scenarios where society has been too cautious about technological advancement. Regulation has indeed been quite effective at limiting nuclear power. Fossil fuel pollution causes far more deaths than nuclear power ever has, but like AI takeover, nuclear disasters are flashy and make for good Netflix movies. We could have saved lives by cutting more corners in COVID-19 vaccine development/testing/deployment--the mRNA vaccines were developed quite rapidly--but instead we followed the process, and even then many people didn't trust a new technology.

Probably the trickiest philosophical question is how to value potential human extinction. AI killing everyone alive would be quite bad for those people, but not an order of magnitude worse than the baseline scenario of everyone dying anyway in the next century. But what about the people who won't exist at all thousands of years into the future? I definitely don't value a potential human as much as an actual human. There are countless humans who could theoretically exist but never will due to any number of mundane choices. Few people consider it their moral obligation to increase the human population as much as they possibly can. And most AI takeover scenarios don't involve the end of all sentience, but rather a change in the substrate in which thought is occurring. Why should I place more value on the humans who will never exist due to possible AI takeover than to the AIs who will never exist if we prevent AI takeover? Maybe the AIs would even have more complex and meaningful experiences than humans are capable of, or maybe they're joyless energy maximizers. The best I can do is say the interests of potential beings are equal in either scenario and just ask what is, on average, in the best interest of humans over the next few decades. More AI safety research? Easy to justify. Slowing down AI development? I'm not convinced.


Is AI value alignment significantly different from human value alignment?

Like, there are humans in the world today who are very smart, and who control vast resources, and whose values are not aligned at all with ours. But nobody sees that as an existential risk. Somehow society still seems to be ticking along. Why would it be different with an AI?


I'd rather have an unaligned AGI than a stagnant world full of busybodies policing each other while everyone suffers from preventable causes. Safety, in fossil fuels much like in AI, is a rallying cry for people who can't contribute so they can pretend they're relevant - especially if they don't develop anything but work on "social solutions" and "raising awareness".

I'd rather roll the dice at technoutopia or human extinction than default to 1984.


It seems worth pointing out that whether the present AGI safety strategy is preferable to a more climate activism-like one is to a considerable extent conditional on how much time we have until AGI is invented (I doubt I'm making a novel point here, apologies if this has already been pointed out better in the comments and I'm being redundant): If AGI is 100 years away at current research speed, maintaining good relations with and influence over existing AGI research labs becomes less important as they (or at any rate, the current administration and research teams) won't be the ones eventually deploying AGI, and there will be more time for legal and cultural changes to take effect and slow the speed of research (here the climate change analogue becomes more accurate, as it is more a matter of continuous emission of new ideas into the AI research milieu that gradually reach dangerous concentrations, rather than a binary scenario where individual labs either do or don't go supercritical at any moment with risks largely in proportion to the competence and resources at that location rather than the general pace of research). (This also hinges on advances in AGI safety research not being so dependent on advances in AGI capacity research that slowing down the latter does not slow down the former to such an extent that there is now less total progress made on safety at the later date when AGI still eventually arrives). Whereas, if AGI is only a decade or two away, the best bet really is to try to have as strong an influence as possible over its creation, for which the present course is clearly the better strategy, assuming we can somehow solve the alignment problem until then, but absent this any strategy is really a moot point.


I'm sorry but what ***EXACTLY*** are we supposed to be worried about here? The argument always seems to be of the Underpants Gnome variety:

Step 1: AI

Step 2: ???

Step 3: panic


- bad people will use the AI to do bad things? You mean the way humans have behaved since Ugg hit Glob over the head with a rock? Not interested.

- AI will do exactly what I would do if I were master of the world? So your fondest wish is to "enslave" all humans forever or something? I mean, why? This is Matrix-level insanity.

- the paperclip maximizer will convert it all to paperclips. Again, why? Specifically

+ how is this different from the cellular maximizer striving to convert the world to biological cells one reproduction at a time?

+ that turns out OK, with interesting results, because of Darwin and competition. Why, in the paperclip world, is there a single maximizer not multiple maximizers? Why do they not evolve towards more interesting goals? Even within the context of supposed AGI and supposed love of paperclips, isn't creating starships so they can go make paperclips on other planets the better extrapolation of the goals? And why can't interesting things happen during the process of building these starships, like an AI starts asking "I mean, paperclips are great and all, but don't you ever wonder about alternative ways to bind foolscap together?"

Honestly, when I look at this, it seems to me basically the nerd version of "OMG we'll all be dead of underpopulation in 20 years". Both completely untethered from reality, and unwilling to state the actual concern, in the one case that "us means white people", in the other case that "us means intelligence".

I don't consider it much of a tragedy that mammals took over from dinosaurs, why should I consider it a tragedy that AIs take over from humans? You're insisting that I identify with my biological self more than with my mental self, and why should I do that? My biological self sucks in so many ways!

Why insist that the framing has to be "the bad tribe come to take away our stuff"?

Why can't it be "angels descend from heaven to help us out of the mighty big hole we've dug for ourselves"?


I would think that if you're close friends with AI labs, and open to regulatory capture, then people will think AI x-risk is a scam/conspiracy to reduce competition and kill open source.
