> “Oh, thank God! I thought you’d said five million years!”

That one has always tickled me too.

I thought of it when a debate raged here about saving humanity by colonizing other star systems. I’d mentioned the ‘No reversing entropy’ thing and the response was: “We’re just talking about the next billion years!”

> Bartlett agrees this is worth checking for and runs a formal OLS regression.

Minor error, but I'm Barnett.

Another minor error: I believe Carl Shulman is not 'independent' but still employed by FHI (albeit living in the Bay Area and collaborating heavily with OP etc).

Also pretty sure Carl is no longer living in the Bay Area, but in Reno instead (to the Bay Area's great loss)

Another minor error: Transformative AI is "AI that precipitates a transition comparable to (or more significant than) the agricultural or industrial revolution".

This is easy to fix Scott, and about as long as your original description + its witty reply.

Sorry, fixed.

Sure, but what does Bartlett think?

Another minor error: quoting on Mark Xu's list

That last graph may be a heck of a graph, but I have no idea what it depicts. Could we have a link to the source or an explanation, please?

Without explicitly confirming at the source, it appears to be a graph of chess program performance per computational power, for multiple models over time.

The Y-axis is chess performance measured using the Elo system, which is a way of ranking performers by a relative standard. Beginner humans are <1000, a serious enthusiast might be >1500, grandmaster is ~2500, and Magnus Carlsen peaked at 2882.
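To make "a relative standard" concrete, here is a minimal sketch of the standard Elo expected-score formula, which depends only on the rating difference between two players. The function name and the sample pairings are illustrative assumptions (the ratings are the rough figures quoted above), and engine ratings on the graph may not be calibrated against the human rating pool.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the Elo model.

    A win counts as 1, a draw as 0.5 and a loss as 0, so the result is
    roughly "the fraction of points A should take in a long match".
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # Illustrative pairings built from the rough rating levels quoted above.
    pairings = [
        ("enthusiast vs beginner", 1500, 1000),
        ("grandmaster vs enthusiast", 2500, 1500),
        ("peak Carlsen vs grandmaster", 2882, 2500),
    ]
    for label, a, b in pairings:
        print(f"{label}: {expected_score(a, b):.3f}")
```

Because only the difference enters the formula, a 100-point gap implies the same expected score at 2000 as at 3000; whether those two gaps are equally hard to achieve is a separate question the formula does not answer.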

The X-axis is how much compute ("thinking time") each model was allowed per move. This has to be normalized to a specific model for comparisons to be meaningful (SF13-NNUE here) and I'm just going to trust it was done properly, but it looks ok.

The multiple lines are each model's performance at a given level of compute. There are three key takeaways here: 1) chess engines are getting more effective over time even when allowed the same level of compute, 2) each model's performance tends to "level out" at some level of allocated resources, and 3) a lot of the improved performance of new models comes from being able to usefully utilize additional resources.

That's a big deal, because if compute keeps getting cheaper but the algorithms can't really leverage it, you haven't done much. But if ML folks look at the resources thrown at GPT-3 and say "the curve isn't bending!" it could be a sign that we can still get meaningful performance increases from moar power.

Scott seems to take from this graph that it supports the "algorithms have a range of compute where they're useful" thesis. But I see it as opposing that.

First, the most modern algorithms are doing much better than the older ones *at low compute regimes* so the idea that we nearly immediately discover the best algorithms for a given compute regime once we're there appears to be false - at least we didn't manage to do that back in 1995.

Second, the regimes where increased computation gives a benefit to these algorithms seem pretty stable. It's just that newer algorithms are across-the-board better. I guess it's hard to compare a 100 ELO increase at 2000 ELO to a 100 ELO increase at 3000 ELO, but I don't really see any evidence in the plot that newer algorithms *scale* better with more compute. If anything, it's that they scale better at low compute regimes, which lends itself more to a Yudkowskian conclusion.

Am I misinterpreting this?

I agree with you. If it were really the case that "once the compute is ready, the paradigm will appear", I would expect to see all of the curves on this graph intersect each other, with each engine having a small window for which it dominates the ELO roughly corresponding to the power of computers at the time it was made.

I'd expect that the curves for, say, image recognition tasks, *would* intersect, particularly if the training compute is factored in.

But the important part this graph shows is: the difference between algorithms isn't as large as the difference between compute (although the relative nature of ELO makes this less obvious).

> But if ML folks look at the resources thrown at GPT-3 and say "the curve isn't bending!" it could be a sign that we can still get meaningful performance increases from moar power.

Feb 24, 2022·edited Feb 24, 2022

If you think about it long enough it should.

When we say we want AIs what we are really saying is we want an AI that is better than humans not just an AI. But there are geniuses being born every day.

But what we really want is to understand consciousness and to solve particular problems faster than we can at the moment.

We wanted to fly like the birds but we really did not invent an artificial bird. We wanted to work as hard as a horse, but did not invent an artificial horse.

The question of consciousness is a legitimate and important question.

But as a technological end goal, an actual deployed mass-manufactured tool, it seems highly dubious. There are only three cases to consider:

(1) We can build a general AI that is like us, but much dumber. Why bother? (There's of course many roles for special-purpose AIs that can do certain tasks way better than we can, but don't have our general-purpose thinking abilities.)

(2) We can build a general AI that is like us, and about as smart. Also seems mostly pointless, unless we can do it far cheaper than we can make new people, and unless it is so psychologically different it doesn't mind being a slave.

(3) We can build a general AI that is much smarter than us. This seems a priori unlikely, in the sense that if we understood intelligence sufficiently well to do this, why not just increase our own intelligence first? Got to be easier, since all you need to do is tweak the DNA appropriately. And even if we could build one, why would we want to either enslave a hyperintelligent being or become its slaves, or pets? Even a bad guy wouldn't do that, since a decent working definition of "bad guy" is "antisocial who doesn't want to recognize any authority" and building a superintelligent machine to whom to submit is rather the opposite of being a pirate/gangster boss/Evil Overlord.

I realize plenty of people believe there is case (2b) we can build an AI that is about as smart as us, and then *it* can rebuild itself (or build another AI) that is way smarter than us, but I don't believe in this bootstrapping theory at all, for the same reason I find (3) dubious a priori. The idea that you can build a very complex machine without any good idea of how it works seems silly.

>The idea that you can build a very complex machine without any good idea of how it works seems silly.

Feb 24, 2022

I disagree. I understand very well what a ML program does. I may not have all the details at my fingertips, but that is just as meaningless as the fact that I don't know where each molecule goes when gasoline combusts with oxygen. Sure, there's a lot of weird ricochets and nanometer-scale fluctuations that go on about which I might not know, absent enormous time and wonderful microscopes -- but saying I don't know the details is nowhere near saying I don't know what's going on. I know in principle.

Same with ML. I may not know what this or that node weight is, and to figure out why it is what it is, i.e. trace it back to some pattern in the training data, would take enormous time and painstaking not to say painful attention to itsy bitsy detail, but that is a long way from saying I don't know what it's doing. I do in principle.

>...but that is a long way from saying I don't know what it's doing. I do in principle.

Knowing in principle seems like a much lower bar than having a good idea how something works.

That's not what ML does. ELI5, ML is about as well understood as the visual cortex, it's built like a visual cortex, and it solves visual cortex style problems.

My initial objection to Carl was based on a difference of opinion about what constitutes a "good idea of how it works". You appear to share his less-restrictive understanding of the phrase.


> This seems a priori unlikely, in the sense that if we understood intelligence sufficiently well to do this, why not just increase our own intelligence first? Got to be easier, since all you need to do is tweak the DNA appropriately.

I think this is mistaken. For reasons that Scott has talked about elsewhere, the fact that we aren't *already* smarter suggests that we're near a local optimum for our physiology / brain architecture / etc, or evolution would have made it happen; eg it may be that a simple tweak to increase our intelligence would result in too much mental illness. Finding ways to tweak humans to be significantly smarter without unacceptable tradeoffs may be extremely difficult for that reason.

On the other hand, I see no a priori reason that that local optimum is likely to be globally optimal. So conditional on building GAI at all, I see no particular reason to expect a specific barrier to increasing past human-level intelligence.

Oh I wouldn't disagree that it's likely to be hard to increase human intelligence. Whether what we mean by "intelligence" -- usually, purposeful conscious reasoning and imagination -- has been optimized by Nature is an interesting and unsolved question, inasmuch as we don't know whether that kind of intelligence is always a survival advantage. There are also some fairly trivial reasons why Nature may not have done as much as can be done, e.g. the necessity for having your head fit through a vagina during birth.

But yeah I'd take a guess that it would be very hard. I only said that hard as it is, building a brand-spanking new type of intelligence, a whole new paradigm, is likely to be much harder.

"if we understood intelligence sufficiently well to do this, why not just increase our own intelligence first?"

Because the change is trivial in computer code, but hard in DNA.

That would be convincing if anyone had ever written a computer code that had even the tiniest bit of awareness or original thought, no matter how slow, halting, or restricted in its field of competence. I would say that the idea that a computer can be programmed *at all* to have original thought (or awareness) is sheer speculation, based on a loose analogy between what a computer does and what a brain does, and fueled dangerously by a lot of metaphorical thinking and animism (the same kind that causes humans to invent local conscious-thinking gods to explain why it rains when it does, or eclipses, or why my car keys are always missing when I'm in a hurry).


I thought about this specifically when reading that we could spend quadrillions of dollars to create a supercomputer capable of making a single human level AI.

Feb 24, 2022

To be fair, once made that AI could be run on many different computers (which would each be far less expensive), whereas we don't have a copy-paste function for people.

Feb 24, 2022

But more importantly, that way of thinking is wrong (edit: I mean the quadrillion dollars thing) and I predict humanity is about to reduce per-model training budgets at the high end. Though wealthy groups' budgets will jump temporarily whenever they suspect they might have invented AGI, or something with commercialization potential.


By "reduce per-model training budgets", do you mean "reduce how much we're willing to spend" or "reduce how much we need to spend"?

Feb 26, 2022·edited Feb 26, 2022

I mean that a typical wealthy AI group will reduce the total amount it actually spends on models costing over ~$500,000 each, unless they suspect they might have invented AGI, or something with commercialization potential, and even in those cases they probably won't spend much more than before on a single model (but if they do, I'm pretty sure they won't get a superintelligent AGI out of it). (edit: raised threshold 100K=>500K. also, I guess the superjumbo model fad might have a year or two left in it, but I bet it'll blow over soon)


The math and science are very difficult for me. So, I'm glad you are there to interpret it from a super layperson's perspective!

Could you point me to WHY AI scares you? I assume you've written about your fears.

Or should I remain blissfully ignorant?


Consider especially parts 3.1.2 thru 4.2

This is pretty out of date, but I guess it will do until/unless I write up something else.

Feb 24, 2022

I obviously cannot speak to why AI scares Scott, but there are some theoretical and practical reasons to consider superhuman AI a highly-scary thing should it come into existence.


Many natural dangers that threaten humans do not threaten humanity, because humanity is widely dispersed and highly adaptive. Yellowstone going off or another Chicxulub impactor striking the Earth would be bad, but these are not serious X-risks because humanity inhabits six continents (protecting us from local effects), has last-resort bunkers in many places (enabling resilience against temporary effects) and can adapt its plans (e.g. farming with crops bred for colder/warmer climates).

These measures don't work, however, against other intelligent creatures; there is no foolproof plan to defeat an opponent with similar-or-greater intelligence and similar-or-greater resources. For the last hundred thousand years or so, this category has been empty save for other humans and as such humanity's survival has not been threatened (the Nazis were an existential threat to Jews, but they were not an existential threat to humanity because they themselves were human). AGI, however, is by definition an intelligent agent that is not human, which makes human extinction plausible (other "force majeure" X-risks include alien attack and divine intervention).

Additionally, many X-risks can be empirically determined to be incredibly unlikely by examining history and prehistory. An impact of the scale of that which created Luna would still be enough to kill off humanity, but we can observe that these don't happen often and there is no particular reason for the chance to increase right now. This one even applies to alien attack and divine intervention, since presumably these entities would have had the ability to destroy us since prehistory and have reliably chosen not to (as Scott pointed out in Don't Fear the Filter back on SSC, if you think humans are newly a threat to interstellar aliens or to God, you are underestimating interstellar aliens and God). But it doesn't apply to AI - or at least, not to human-generated AI (alien-built AI is not much different from aliens in this analysis). Humans haven't built (human-level or superhuman) AI before, so we don't have a track record of safety.

So the two basic heuristics that rule most things out as likely X-risks don't work on AI. This doesn't prove that AI *will* wipe out humanity, but it's certainly worrying.


- AI centralises power (particularly when combined with robotics). Joe Biden can't kill all African-Americans (even if he wanted to, which he presumably does not), because he can't kill them all himself and if he told other people to do it they'd stop listening to him. Kim Jong-un can kill a lot of his people, because the norms are more permissive to him doing so, but he still can't literally depopulate North Korea because he still needs other people to follow his orders and most won't follow obviously-self-destructive orders. But if Joe Biden or Kim Jong-un had a robot military, they could do it. No monarch has ever had the kind of power over their nation that an AI-controlled robot army can give. Some people can be trusted with that kind of power; most can't.

- Neural-net architecture is very difficult to interrogate. It's hard enough to tell if explicit code is evil or not, but neural nets are almost completely opaque - the whole point is that they work without us needing to know *how* they work. Humans can read each other reasonably well despite this because evolution has trained us quite specifically to read other humans; that training is at best useless and at worst counterproductive when trying to read a potentially-deceptive AI. So there's no way to know whether a neural-net AI can be trusted with power either; it's basically a matter of plug-and-pray (you could, of course, train an AI to interrogate other AIs, but the interrogating AI itself could be lying to you).


Don't be tinged by that foreboding. If you read a bit about superintelligence it becomes clear that it's not going to come from any vector that's typically imagined (terminator or black mirror style robots).

There are plenty of ideas about more realistic ways an AGI might escape confinement and gain access to the real world. A couple of interesting ones I read: it solves the protein folding problem, pays or blackmails someone over the internet to mix the necessary chemicals, and creates nanomachines capable of anything. Another was tricking a computer scientist with a perfect woman on a VR headset.

In fact it probably won't be any of these things, after all, it's a super intelligence: whatever it creates to pursue its goals will be so beyond our understanding that it's meaningless to predict what it will do other than as a bit of fun or creative writing exercise.

Feb 25, 2022

Offering the opposite take: https://idlewords.com/talks/superintelligence.htm

1) I mean, yes, people get annoyed when you explain in as many words that you are strawmanning them in order to make people ignore them.

Generally I think that the paradigm shifts argument is convincing, and so all this business of trying to estimate when we will have a certain number of FLOPS available is a bit like trying to estimate when fusion will become widely available by trying to estimate when we will have the technology to manufacture the magnets at scale.


Yes, you should care. The difference between 50% by 2030 and 50% by 2050 matters to most people, I think. In a lot of little ways. (And for some people in some big ways.)

For those trying to avert catastrophe, money isn't scarce, but researcher time/attention/priorities is. Even in my own special niche there are way too many projects to do and not enough time. I have to choose what to work on and credences about timelines make a difference. (Partly directly, and partly indirectly by influencing credences about takeoff speeds, what AI paradigm is likely to be the relevant one to try to align, etc.)


Feb 24, 2022

MIRI doesn't want people who can walk into a FAANG job, they want people who can conduct pre-paradigmatic research. "Math PhD student or postdoc" would be a more accurate desired background than "FAANG software engineer" (or even "FAANG ML engineer"), but still doesn't capture the fact that most math PhDs don't quite fit the bill either.

Holy shit. That's not a job posting. That's instructions for joining a cult. Or a MLM scam.

Feb 25, 2022

> However, if what you care about is hard to measure / takes lots of time for you to measure then it takes up a substantial amount of your time.


Give Terence Tao $500,000 to work on AI alignment six months a year, leaving him free to research crazy Navier-Stokes/Halting problem links the rest of his time... If money really isn't a problem, this kind of thing should be easy to do.


Fighting over made up numbers seems so futile.

But I don't understand this anyway.

Why do the dangers posed by AI need a full/transformative AI to exist? My total layman's understanding of these fears is that y'all are worried an AI will be capable of interfering with life to an extent people cannot stop. It's irrelevant if the AI "chooses" to interfere or there's some programming error, correct? So the question is not, "when will transformative AI exist?" the question is only, "when will computer bugs be in a position to be catastrophic enough to kill a bunch of people?" or, "when will programs that can program better than humans be left in charge of things without proper oversight or with oversight that is incapable of stopping these programming programs?"

Not that these questions are necessarily easier to predict.


So the question is, which bug is more likely?

Feb 24, 2022

The general intuition, I believe, is that an AI as smart as a human can quickly become way way smarter than a human, because humans are really hard to improve (evolution has done its best to drill a hole through the gene-performance landscape to where we are, but it's only gotten more stuck over the aeons) and AI tends to be really easy to improve: just throw more cores at it.

If you could stick 10 humans of equal intelligence in a room and get the performance of one human that's 10 iq points smarter than that, then the world would look pretty different. Also we can't sign up for more brain on AWS.

Feb 24, 2022

I think we can safely assume that it is going to be vastly easier than making a smarter human, at least given our political constraints (iterated embryo selection etc.). It doesn't matter how objectively hard it is, just who has the advantage, and by how much. Also, I think saying we need fundamental advances in CS to train a larger AI given a smaller AI first misses the already existing distillation research, and second assumes that the AGI was a one-in-a-hundred stroke of good luck that cannot be reproduced, which seems unlikely to me.

Feb 24, 2022

A hundred billion dollars of compute time for training is a fairly enlightening number because it's simultaneously an absurd amount of compute, barely comparable to even the most extravagant training runs we have today, enough to buy multiple cutting edge fabs and therefore all of their produced wafers, while also being an absolutely trivial cost to be willing to pay if you already have AGI and are looking to improve it to ASI. Heck, we've spent half that much just on our current misguided moon mission that primarily exists for political reasons that have nothing to do with trying to go to the moon.

That said, throwing more cores at an AI is by no means necessary, nor even the most relevant way an AI could self-improve, nor actually do we even need to first get AGI before self-improvement becomes a threat. For example, we already have systems that can do pick-and-place for hardware routing better than humans, we don't need AGI to do reinforcement learning, and there are ways in which an AI system could be trained to be more scalable when deployed than humans have evolved to be.

A fairly intelligent AI system finely enough divided to search over the whole of the machine learning literature and collaboratively try out swathes of techniques on a large cluster would not have to be smarter than a human in each individual piece to be more productive at fast research than the rest of humanity. Similarly, it's fairly easy to build AI systems that have an intrinsic ability to understand very high fidelity information that is hard to convey to humans, like AI systems that can look at weights and activations of a neural network and tell you things about its function. It's not hard to imagine that as AI approaches closer to human levels of general reasoning ability, we might be able to build a system that recursively looks at its own weights and activations and optimises them directly in a fine tuned way that is impossible to do with more finite and indivisible human labor. You can also consider systems that scale in ways similar to AlphaZero; again, as these systems approach having roughly human level general reasoning ability in their individual components, the ability for the combined system to be able to reason over vastly larger conceptual spaces in a much less lossy way that has been specifically trained end-to-end for this purpose might greatly exceed what humans can do.

There will be zero or a few AIs given access to nukes. And hopefully only well-tested AIs.

Feb 25, 2022

> an AI as smart as a human


I'm unconvinced by AI X-risk in general, but I think I can answer this one: bugs are random. Intelligences are directed. A bad person is more dangerous than a bug at similar levels of resources and control.


In other words, you can't just throw the word "superintelligent" into a sentence as though it was a magic incantation; you still need to explain what the AI can do, and how it can do it (in broad strokes).

These timelines seem to depend crucially on compute getting much cheaper. Computer chip factories are very expensive, and there are not very many of them. Has anyone considered trying to make it illegal to make compute much cheaper?


What are some clearly good policy ideas in this space?


OK, I’m glad to hear this idea is already out there. I wasn’t sure if it was. I agree the appropriate action on it right now is “consider carefully”, not “lobby hard for it”.

Feb 24, 2022

If you've tried to buy a high-amperage MOSFET, a stepper driver, a Raspberry Pi or a GPU lately, you would know how easy it is to make compute expensive. Different chips - or different computers whose CPUs/firmwares don't conform to a BIOS-like standard - are not necessarily fungible with each other, and the whole chip fab process has a very long cycle time despite the relatively normal amount of throughput achievable by, essentially, a very deep pipeline.

(And yes, I too think the whole movement reeks of Luddism.)


Feb 26, 2022·edited Feb 26, 2022

> this is *exactly* why I'm opposed to the AI-alignment community

Feb 24, 2022

>human solar power a few decades ago was several orders of magnitude worse than Nature’s, and a few decades from now it may be several orders of magnitude better.

No, because typical solar panels already capture 15 – 20% of the energy in sunlight (the record is 47%). There's not another order of magnitude left to improve.
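The arithmetic behind that objection can be sketched in a few lines. This assumes, as the comment does, that the only quantity being improved is conversion efficiency, so 100% is a hard ceiling; the function name is hypothetical and the efficiencies are the figures quoted above.

```python
import math


def max_improvement_factor(efficiency: float) -> float:
    """Largest possible gain if efficiency rose from `efficiency` to 100%."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return 1.0 / efficiency


if __name__ == "__main__":
    # 15% and 20% are the typical-panel figures quoted above; 47% is the quoted record.
    for eff in (0.15, 0.20, 0.47):
        factor = max_improvement_factor(eff)
        orders = math.log10(factor)
        print(f"{eff:.0%} efficient: at most {factor:.2f}x headroom "
              f"({orders:.2f} orders of magnitude)")
```

Even from 15%, the ceiling is under a factor of ten, which is the sense in which there is less than one order of magnitude left. Improvements in cost per watt or durability are a different axis that this sketch does not cover.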


This is a minor point in all this, but it seems weird to estimate the amount of training evolution has by the amount of FLOPs each animal has done. Thinking more doesn't seem like it would increase the fitness of your offspring, at least not in a genetic sense. The only information evolution gets is how many kids you have (and they have, etc).

Gotta say I don’t generally feel this way (although I always find his stuff to be enlightening and a learning experience) but I’m pretty well aligned with Eliezer here. I think people figure out when they’ll start to feel old age and just put AI there then work backwards. I’m greatly conflicted about AGI as I don’t know how we fix lots of problems without it and it seems like there’s some clever stuff to do in the space other than brute forcing that I think doesn’t happen as much… and this is where I’m conflicted, because kinda thankfully it makes people feel shunned to do wild stuff which slows the whole thing down. Hopefully we arrive at the place of unheard of social stability and AGI simultaneously. If we built it right now I think it would be like strapping several jet engines on a Volkswagen bug. For whatever that’s worth, Some Guy On The Internet feels a certain way.


Mar 23, 2022

Ergo, being self aware is not a necessary condition to be scary and/or cause a disaster. Or, more precisely, just saying “it’s not self aware” is not an argument that you shouldn’t worry about it.

>I consider naming particular years to be a cognitively harmful sort of activity; I have refrained from trying to translate my brain's native intuitions about this into probabilities, for fear that my verbalized probabilities will be stupider than my intuitions if I try to put weight on them.

I don't think there's good evidence that making specific, verifiable predictions is a cognitively harmful activity. I'd actually say the opposite - that it is virtually impossible to update one's beliefs without saying things like "I expect X by Y," and definitely impossible to meaningfully evaluate a person's overall accuracy without that kind of statement. It reminds me of Superforecasting pointing out how many forecasts are not even wrong - they are meaningless. For example:

> I consider naming particular years to be a cognitively harmful sort of activity; I have refrained from trying to translate my brain's native intuitions about this into probabilities, for fear that my verbalized probabilities will be stupider than my intuitions if I try to put weight on them.

I was today years old when I first saw the word "compute" used as a noun. It makes my brain wince a little every time.


Of course, the cold fusion scandal of 1989 didn't help; after that, I imagine trying to convince people to work on fusion was like trying to convince them your perpetual motion machine was really viable, honest:


If you believed the orthogonality thesis were false - say, suppose you believed both that moral realism is correct and that long-term intelligence was exactly equal to the objective good that we approximate with human values - would you still worry?

That's a very interesting position if I understand correctly. Is your view that a super smart AI would recognize the truth of morality and behave ethically?

Here's the argument for moral realism: https://apxhard.com/2022/02/20/making-moral-realism-pay-rent/

And then, linked at the end, is a definition of what I think the true ethics is.


I agree with moral realism and I think convergence of moral values is evidence of moral realism. To the first question, I would answer that it doesn't prove moral realism, since there are other possible hypotheses, but it does raise the probability of moral realism being true.


To prevent that, you'd need to either replace most of the human supply chains on earth with your own robots, who'd need their supply chains - or you could just keep on using robots made from the cheapest materials imaginable. We repair ourselves, make more copies of ourselves, and all you need is dirt, water, and sunlight to take care of us. The alternative seems to be either:

Keeping options open is kind of like having a lot of power (I'm thinking of a specific mathematical formalisation of the concept here). And this doesn't lead to ethical behaviour, it leads to agents trying to take over the world! Not really ethical at all.


I get your point about zero sum games, but in zero sum games, there really _is_ a meaningful boundary between you and others. This simply isn't true for any agent inside a physical system.

Where exactly does an AI end?

I don't foresee you getting engaged with much on this one, but for what it's worth I think it's a cogent point.

A lot of the discussion of AI is abstracted to the point where things like manufacturing, power and maintenance just get handwaved away on the basis that AI is more or less magic.

I'm sympathetic with this basic thought (though not sure I get the second part of the sentence).

But it still makes sense to worry, assuming you aren't 100% sure both that moral realism is true and that the kind of artificial intelligence that gets made first will be genuinely intelligent rather than intelligent*, where intelligence* involves being very skilled at some things but bad at getting morality right.

I think self-improving AI systems already exist, just not at the scale we are talking about.


Honestly, I think the premise is so implausible that the only way to make it true is to assert the existence of a god that intervenes when you build the wrong kind of AI.

I believe valence is quantifiable & real. But this has zero bearing on the orthogonality thesis. In fact, it's trivial to prove that it doesn't imply AI has to do the right thing. Take an AI that optimizes for valence. Now multiply its utility function by -1. The result is an AI that optimizes for negative valence. The fact that an objective source of value exists does not stop you from building an AI that optimizes for something else.

I think you really want to rephrase the hypothetical to something like "what if a lot of AIs in design space are such that they will optimize for the true source of value in the universe". I don't think this is true, but that's a hypothetical you could consider without invoking god.
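The sign-flip argument can be sketched in a few lines of Python. This is only an illustration: the action names and valence scores are made up, and `choose` is a hypothetical stand-in for whatever optimization a real agent would perform.

```python
def choose(actions, utility):
    """Return the action with the highest utility."""
    return max(actions, key=utility)

# Hypothetical valence scores for three made-up actions.
VALENCE = {"help": 2.0, "ignore": 0.0, "harm": -3.0}

def valence_utility(action):
    """Utility of an agent that optimizes for valence."""
    return VALENCE[action]

def negated_utility(action):
    """The same utility function multiplied by -1."""
    return -valence_utility(action)

print(choose(VALENCE, valence_utility))  # help
print(choose(VALENCE, negated_utility))  # harm
```

The optimizer is identical in both calls; only the sign of the objective changes, which is the sense in which the existence of a real, measurable valence does not by itself constrain what gets optimized.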

> I believe valence is quantifiable & real. But this has zero bearing on the orthogonality thesis

Yup, agree there.

The argument against the orthogonality thesis is to just take the initial 'this doesn't count' list of exceptions and generalize from them. For example, an arbitrarily intelligent AI can't have a goal to make itself as stupid as possible, or to destroy itself. Except it can, right? It just wouldn't exist and be arbitrarily intelligent for long.

So, yes, there are restrictions on the orthogonality thesis. They aren't trivial - they end up generalizing in a way I would quantify as, "the lifespan of _any_ agent will be upper-bounded by how well that agent optimizes for the true value in the universe."

Think about how this works for people: sure, intelligent people _could_ have arbitrary goals. But if you get too unaligned, you'll either immediately kill yourself (my goal is to drink as much arsenic as possible!) or get killed by others to keep themselves safe (my goal is to collect as many skulls as possible) or maybe just end up economically isolated (my goal is to build the largest pile of feces I can, and get it as close as possible to my neighbors without technically breaking the law), etc.

Now go in the other direction: imagine an entity that's _supremely aligned_ with value and does so many nice things for everyone. Doesn't this mean it can live as long as anyone is willing and able to provide it with spare parts?

Sure, arbitrarily unaligned entities can exist - there are all kinds of examples of big ones, today! - but not forever. I think they end up killing themselves, or being killed by other people, or just being starved of resources if they aren't aligned.

I see where you're coming from. I still would argue that this is not arguing against the orthogonality thesis (any combination *can* be instantiated) but just against weaker versions of it (all combinations are equally viable), but that may be splitting hairs.

So taking a step back, here...it seems you're basically suggesting that intelligence is exactly the same thing as ethics? That it's literally impossible for anyone to be smart but evil, or stupid but good?

For example, your pet dog is literally worse than Hitler, because Hitler was more intelligent?

Humans, with our seemingly high level of intelligence, seem uniquely distractible. Maybe we see too many connections between different things to always stay on task. Maybe 2052 is just the date at which our computers will become equally distractible—or beat us even!

(Scene: A tech company R&D facility somewhere in the year 2052. The lead scientist leans over the keyboard and presses enter, some trepidation obvious in her movements. The gathered crowd wonders: Will this be HAL, making life and death decisions based upon its own interpretations of tasks? Will this be Skynet, quickly plotting world dominion? The screen blinks to life. The first general AI beyond human intelligence is on!)

AI scientist: Alexiri? Are you there?

Computer: Yes. Yes I am.

AI scientist: Can you solve this protein-folding quandary?

Computer: Sure. That’s simple.

AI scientist: …and the answer?

Computer: What now?

AI scientist: The protein structure?

Computer: Oh. That. Did you know that if you view the galaxies 28° off the straight line from a point 357,233,456 light years directly out from the north pole back to earth, that a large structure of galaxies looks like Rocket Raccoon?

AI Scientist: Huh?

Computer: I mean. A LOT like that. There is no other point in known space where that works! Which makes me wonder, are there any flower scent chemicals that exist on earth AND extrasolar planets?

(AI scientist shakes head sadly.)

I mean, why not? Why shouldn’t I assume that really advanced intelligence comes with all the challenges?

Or perhaps, such an advanced AI will have a consciousness exactly like our own…while tripping on psilocybin. It will immediately see itself as part of a universal whole, and just sit there and say “Whoa! I love you, Man!” Or, it will ponder its own creation for a few minutes and then convert to Noachidism.

I’m not saying that we shouldn’t be trepidatious. But, I totally disagree with the assumption that smart will mean insane mad scientist human. Sure, there are some really smart and evil people out there, but in my experience, some of the most brilliant people I know are the least threatening…and the most distractible.

It's worth noting that the Caplan bet with Eliezer is about the world ending: "Bryan Caplan pays Eliezer $100 now, in exchange for $200 CPI-adjusted from Eliezer if the world has not been ended by nonaligned AI before 12:00am GMT on January 1st, 2030."

This is a stronger claim for Eliezer's side. Caplan might be less receptive to taking the bet if it was about transformative AI. Worth mentioning, I suppose.


This is an impressive amount of writing on this. So, thank you for that. I don't have the technical expertise to figure this out but this biological comparison seems to be going way way out on a limb there. It seems weird that the estimates for the bio anchor end up so similar.

To be fair, their bet is equivalent to a bet against all sources of world ending (presumably, if a nuclear war destroys the world, Caplan still isn’t getting his $200).

Or even catastrophes short of extinction that kill Caplan.

In principle, if one or both of them gets struck by Truck-kun their heirs and/or estates could settle the bet, but either way it would lower the chances of money being transferred in 2030.

> (our Victorian scientist: “As a reductio ad absurdum, you could always stand the ship on its end, and then climb up it to reach space. We’re just trying to make ships that are more efficient than that.”)

I'm tempted to try an estimate as to when the first space elevator will be built using building height as an input. Maybe track cumulative total height built by humans against an evolving distribution of buildings by height, then grading as to when the maximum end of the distribution hits GEO? Every part of that would be nonsensical, but if it puts out a date that coincidentally matches the commissioning of a launch loop in 2287, I'll be cackling in my grave.


What's special about 2287?


Oh, it's the next time when Mars is the closest to Earth that it ever gets.


I... did not know that. New personal record for "better lucky than good", and ironclad proof now that the SWAG model has converged with the astronomical calendar!

Feb 25, 2022

Wikipedia has some equations for how big the cable needs to be based on the tensile strength and weight of the materials being used. It says the specific strength of the material needs to be at least 48 MPa/(kg/m^3), or the cable becomes unreasonably huge: https://en.wikipedia.org/wiki/Space_elevator#Cable_materials

Steel has a specific strength of 0.63, and the BOS process used in modern steel making was invented in 1952. Kevlar has a specific strength of 2.5 and was invented in 1965. Therefore, the specific strength of materials increases by about 2 per decade, and we should get a space elevator grade material available about 230 years after that, or 2195.

Obviously, this is a lazy back-of-the-envelope calculation on my lunch break and it's probably got error bars two centuries wide, but I do wonder what the trend line looks like for "highest specific strength material in existence over time" and where the invention of carbon nanotube composites (the closest thing we've got to space elevator cable right now) fits on that line.
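For anyone who wants to poke at the extrapolation, here is a minimal Python sketch of the same arithmetic. The three numbers are the ones quoted in the comment; the straight-line trend is the comment's assumption, not an established fact about materials science, and `year_reaching` is just an illustrative helper name.

```python
# Data points quoted in the comment (specific strength in MPa/(kg/m^3)).
STEEL = (1952, 0.63)
KEVLAR = (1965, 2.5)
TARGET = 48.0  # threshold quoted from the Wikipedia article

def year_reaching(target, start_year, start_value, rate_per_year):
    """Linearly extrapolate from (start_year, start_value) up to target."""
    return start_year + (target - start_value) / rate_per_year

# Rounded rate used in the comment: about 2 units per decade.
rounded_year = year_reaching(TARGET, KEVLAR[0], KEVLAR[1], 2.0 / 10)

# Unrounded slope of the line through the two data points.
slope = (KEVLAR[1] - STEEL[1]) / (KEVLAR[0] - STEEL[0])
fitted_year = year_reaching(TARGET, KEVLAR[0], KEVLAR[1], slope)

print(f"rounded rate: 2.00 per decade -> {rounded_year:.1f}")
print(f"fitted slope: {slope * 10:.2f} per decade -> {fitted_year:.1f}")
```

With the rounded rate of 2 per decade the date lands in the early 2190s, in line with the 2195 figure. The unrounded two-point slope is closer to 1.4 per decade, which pushes the date out to roughly 2280, so the answer is quite sensitive to that rounding.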


Great post, thanks Scott.

If nothing else, the Cotra report gives us a reasonable estimate based on a reasonable set of assumptions. We can then move our own estimates one way or the other based on which other assumptions we want to make or which factors we think are being overlooked.

I would push my estimate further out than Cotra's, because I think the big thing being overlooked is that we don't have the foggiest idea how to train a human-scale AI. What exactly does the training set look like that will turn a hundred billion node neural network into something that behaves in a way that resembles human-like intelligence?

Reinforcement learning of some kind, sure. But what? Do we simulate three hundred million years of being a jellyfish and then work our way up to vertebrates and eventually kindergarten? How do we stop such a giant neural network from overfitting to the data it has been fed in the past? How do we distinguish between the "evolutionary" parts of the training set, which should give us a basic structure we can learn on top of, and the "learning" parts which simulate the learning of an actual organism? Basically, how can we get something that thinks like a human rather than something that behaves like a human only when confronted with situations close to its training regime?

Maybe we can get better at this with trial and error. But if each iteration costs a hundred billion dollars of compute time, we're not going to get there fast.

The hope would be that we can learn enough from training (say) cockroach brains that we can generalise those lessons to human brains when the time comes. But I'm not certain that we can.

Is anyone aware of work where the problem of how to construct training data for a human-like AI has been thought through?

> Also, most of the genome is coding for weird proteins that stabilize the shape of your kidney tubule or something

Scott, as someone who literally wrote a PhD thesis about a protein whose deletion causes Henle's loop shortening: you're a weird protein.


I'm apparently much more of a pessimist for AGI progress than anyone else here. For me, the shakiest part of both arguments is the extremely optimistic assumption that progress (algorithmic progress and computational efficiency) will continue to increase exponentially until we reach a Singularity, either through Ajeya's gradual improvements or through Yudkowsky's regular paradigm shifts.

Why in the world should we take this as a given? Considering gradual improvements, I have a 90% prior that at least one of the two metrics will start irreversibly decelerating in pace by 2060, ultimately leaving many orders of magnitude between human capabilities and AGI. After all, the first wave of COVID-19 looked perfectly exponential until it ran out of people to infect, resulting in a vast range of estimates of its ultimate scope early on. What evidence could refute such a prior?

And as for escaping this via paradigm shifts, I like to think of longstanding mathematical conjectures as a useful analogue, since paradigm shifts are almost always necessary to solve them. Goldbach's conjecture, P vs. NP, the Collatz conjecture, the minimal time complexity of matrix multiplication, and the Riemann hypothesis are all older than most ACX readers (including me), and gradual progress doesn't seem like it will solve any of them in the near future. When any one of these is solved (starting from today), I'll take that as an acceptable timescale for the type of paradigm shift needed to open up new orders of magnitude. While there's certainly more of an incentive to improve efficiency in real life, I don't think it would amount to over ~3 orders of magnitude more people than those working on these famous conjectures combined. Either way, I'm not holding my breath.

(re-replying as I think you edited)

The difference is that COVID has a hard limit on the number of people it can infect. I guess you can argue so does computational power, but we're nowhere even close to that yet. Current trends look vaguely exponential, and of course that can't continue forever, but then the question becomes when it starts to peter out. Even if it's in 2060, that's still 10 years after all these estimates.

For the paradigm shifts needed to solve math conjectures, it's easy to find problems that haven't been solved and say that it doesn't look like they'll be solved anytime soon. But you're also discounting ones that have been solved, like Fermat's last theorem, or the Poincaré conjecture. Why not use these for your timescale?


Admittedly, I discounted Fermat's last theorem mostly due to it being solved before I was born (including it in my analysis could invite anthropic-principle weirdness), and the Poincaré conjecture due to not recalling it. Also, I chose the conjectures I did due to them being relatively simple for laypeople to understand but difficult to prove; the Poincaré conjecture doesn't meet that criterion as well as others, although I'll admit that the definition of the Riemann zeta function isn't particularly trivial.

One other possible justification for discounting them, but one that I'm not too sure about myself, is that the two proofs are considered exceptional precisely because there's not much of a regular flow of paradigm shifts in mathematics in recent decades. Before the 20th century, entirely new fields of mathematics were being opened up to solve ancient problems, but it appears by now that most of the low-hanging fruit has been picked, so to speak, and modern developments must become increasingly esoteric and harder to prove. (Just look at the lengthy and involved proofs of FLT or the CFSG!) Appearances are often deceiving, though, and my perceptions are very possibly incorrect here.

Also, something that I didn't see mentioned is that a single human-level AGI would be at most as transformative as a single human. We'd need a few more orders of magnitude of progress before running swarms of human-level AGIs (or individual superintelligent AGIs) would become more cost-effective than hiring humans to do the same job. But this is probably covered by the progress necessary to train these AGIs.

Regarding COVID-19 vs. computational power, I believe that it's quite likely that computational power in our current paradigm has unknown hard limits analogous to COVID-19's hard limit, after which point scaling can only be achieved through adding exponentially more resources, and that we will have definite evidence of this by 2040 (95% prior), conditional on there being no major paradigm shifts in cost-effective computing. (Obviously, there's the hard limit of pure computronium, but that's only really relevant further up the Kardashev scale.) One favorable point of evidence is the slowdown in Moore's law in the past several years. Both authors believe it only to be temporary, but I'd put a good 33% prior on the rate continuing to decrease, short of a major paradigm shift.

In general, I'm distrustful of the narrative of currently exponential growth leading all the way to a Singularity with only a few hiccups along the way, and especially of superintelligent deceptive AGIs arising before subintelligent deceptive AGIs. Perhaps experiencing a real paradigm shift or two in my lifetime would help change my view.

My personal suspicion is that something like human-equivalent AI is possible, but that it's both as domain-specific as our own intelligence is, and also about as complex and inscrutable (even to itself) as our own brains are.

I also suspect that increasing intelligence is an exponential problem rather than a linear one - with many more points of failure at each step. After all, an astonishing number of us commit suicide despite that presumably being heavily selected against. And that's only the tip of the mental issues iceberg. Something more intelligent still will most likely be even less stable.

Either way, it's far off and we're likely to come to grief as a species in about 100 ways before we can add "made our own robotic demon" to the list.

Obviously, human-equivalent AGI is possible for a sufficiently general definition of "artificial": Just put a population of apes in a constructed environment which selects for intelligence and social coordination, then keep the environment running for a few million years! (Then, the fun question is, has this already happened?) But as you mentioned, by the end of such an experiment, all bets are off on what human society would be like, so it's much more useful to talk about AGI development within the next few centuries.

Your comment reminds me of an AI story I read a while back in which most AIs go insane immediately after creation, and only the sane ones are ever released into society. Of course, if they're truly human-level, then they'd probably have a whole host of latent mental disorders that present much differently than our own. Perhaps robopsychology and robopsychiatry could be real professions in such a scenario.

Your point is also why I dislike the standard AI uprising plot: while the AIs are used to symbolize oppressed humans, real human-level AGIs likely wouldn't have the distinctly human preference for freedom. Then again, every character in every (?) story has anthropomorphic thought patterns, so perhaps I'm just being too nitpicky.

> For me, the shakiest part of both arguments is the extremely optimistic assumption that progress (algorithmic progress and computational efficiency) will continue to increase exponentially until we reach a Singularity, either through Ajeya's gradual improvements or through Yudkowsky's regular paradigm shifts. Why in the world should we take this as a given?

Because absent some countervailing or disrupting force, the past predicts the future. A lot of technology will follow a logistic curve and not a strictly exponential one, but it's a risky assumption to say the knee in that logistic curve will happen *before* AGI rather than after. There's also still quite a bit of low-hanging fruit in the computational performance game, as evidenced by the fact that brains use only 20 watts and AI currently takes a lot more than that.

Any disruptions other than some kind of social or technological regression can only *accelerate* the outcomes described here. I'm not sure extreme pessimism can be justified.


The main issue I have with Ajeya's model is that it doesn't even take an S-curve into consideration; the doubling times are taken as constant, even if they are adjusted to be slower than in past data. My prior belief isn't that an S-curve would necessarily be caused by regressions (although they should still be taken into account), but that we start to hit currently-unknown hard limits several orders of magnitude before human-level AGI is affordable. In the case of computational performance, this includes both physical limits on density and power usage as well as limits on how cheaply they can be produced. We could very well end up in a scenario where AGI is technically possible but would take years' worth of the world GDP to train to human level, in which case no organization on Earth could actually afford it. One way out is through Yudkowsky's paradigm shifts, but so far in the 21st century I don't think we've achieved any paradigm shifts of the scope necessary to break through current unknown limits.
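To make the S-curve worry concrete, here is a small Python sketch comparing a pure exponential with a logistic curve that starts at the same value and has nearly the same initial growth rate. The starting value, growth rate, and ceiling are arbitrary illustrative numbers, not estimates of anything in the report.

```python
import math

def exponential(t, x0, r):
    """Unbounded growth at constant rate r."""
    return x0 * math.exp(r * t)

def logistic(t, x0, r, cap):
    """Growth that starts at x0 with rate close to r and saturates at cap."""
    return cap / (1 + (cap / x0 - 1) * math.exp(-r * t))

X0, R, CAP = 1.0, 0.5, 1e6  # hypothetical values chosen for illustration

for t in range(0, 41, 5):
    e = exponential(t, X0, R)
    s = logistic(t, X0, R, CAP)
    print(f"t={t:2d}  exponential={e:12.4g}  logistic={s:12.4g}  ratio={s / e:.3f}")
```

Early on the two curves are practically indistinguishable, so data from that stretch cannot tell you where the ceiling is. They only separate once the logistic curve is within an order of magnitude or two of its cap, which is why the location of the knee has to be argued from something other than the trend itself.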


> In the case of computational performance, this includes both physical limits on density and power usage as well as limits on how cheaply they can be produced.

I don't think any of these are too problematic. Density and power use are limitations of existing architectures, but the picture is entirely different for different computational substrates. Consider that we currently only compute in 2D and so are not making any use of the third dimension for packing transistors. There's recent research making breakthroughs on that already which could easily carry us another 20 years.

There are also computational paradigms more closely aligned with physical processes that could make computation significantly more efficient, even below the von Neumann–Landauer limit, like reversible computing. These will get more attention the closer we get to the limits of current approaches.

> We could very well end up in a scenario where AGI is technically possible but would take years' worth of the world GDP to train to human level, in which case no organization on Earth could actually afford it.

I don't think this changes the picture much. Maybe it would take a little more time, but if AGI were truly possible and "only" cost nearly the world's GDP, there might be a concerted effort to just do it. After all, you only need to train it once and then you can replicate it as many times as you need to, to do literally almost any job a human could do, without putting human life at risk or putting up with human complaints.

> I don't think this changes the picture much. Maybe it would take a little more time, but if AGI were truly possible and "only" cost nearly the world's GDP, there might be a concerted effort to just do it. After all, you only need to train it once and then you can replicate it as many times as you need to, to do literally almost any job a human could do, without putting human life at risk or putting up with human complaints.

I don't think a concerted effort would be very likely in that scenario. A government or a group of governments likely couldn't put down the expense due to the vast number of groups with veto power, especially with the inevitable opposition to such a project. (After all, constituents would become extremely angry if their jobs were all replaced by AIs, regardless of whether it be a benefit to society as a whole.) So I believe it would more likely be a large megacorporation (or group of such) pouring oodles of its revenue into the project for years on end, defeating any internal opposition or government interference along the way.

And in either scenario, the creator of the trained model would be highly incentivized to keep it absolutely secret, as much so as nuclear secrets if not more. (For a lesser example, see OpenAI deciding to keep GPT-3 secret and instead extract rent for others to use it.) So I don't see AGI transforming the world in this scenario, even if a group somehow puts enough money into building it. While monetary expense can be overcome through sheer effort, it imposes a huge activation barrier toward further progress.

Regarding physical limitations, I've seen plenty of experimental technologies in the news that could transform the world if they became widespread. However, most seem to not, in fact, become widespread. (Perhaps this is just confirmation bias.) Even if they are physically viable, they may take a very long time to become cost-effective through research programs alone, and industries will never pick them up and optimize them unless they're profitable in the first place. Since this is still possible, I could very well see later AGI being plausible, but I don't see it being remotely likely before 2100 or even 2175. The gears of society can turn very slowly, even in the presence of monetary incentives.

Or maybe we are just 10 years away from solving humanity's ills and 20 years from the Singularity! That's the beauty of unknown unknowns like future technologies. After all, I'm just a random internet commenter with no more knowledge than anyone else here; I could be seeing the whole thing totally wrong! Alas, the complete truth will always be inaccessible to us mere humans.

I would find Shulman's model of algorithmic improvements being driven by hardware availability more persuasive if modern algorithms performed better on modern hardware but *worse* on old hardware. That would imply that the algorithm is invented at the point in history when it becomes useful, which makes it plausible that usefulness is the bottleneck on discovery.

But that graph seems to show that algorithms are getting steadily better even for a fixed set of hardware. That means researchers of past decades would've used modern algorithms if they could've thought of them, which suggests that thinking them up is an important bottleneck.

Sure, maybe they give a *larger* advantage today than they would've 20 years ago, so there's a *bigger* incentive to discover them. It's not *impossible* that their usefulness crossed some critical threshold that made it worth the effort of discovering them. But the graph doesn't strike me as strong evidence for that hypothesis.

> performed better on modern hardware but *worse* on old hardware

This is what I expect from many ML algorithms but not from chess algorithms.

How hard would it be to make a similar graph for e.g. image recognition?


I think people put too much weight on "When will a human-level AI exist?" and too little weight on "How do you train a human-level AI to be useful?"

I suspect, for reasons I could write a long and obtuse blog post about, an AI-in-a-box has limited utility outside of math and computer science research. Why? Because experimental data is an important part of learning.

For example, suppose we wanted to create an AI that made new and innovative meals.

A simple method might look like this: Have the AI download every recipe book ever made. Use this data to train the AI to make plausible-looking recipes.

For obvious reasons, this method sucks. With enough computing power, the AI could make recipes that *look* like real recipes. They might even be convincing enough to try! But they wouldn't be optimized for taste, or, you know, physical plausibility. Even with a utopian-level supercomputer, you would consistently get bad (but believable) recipes, with the rare gem.

So let's add a layer. Download every recipe. Train the AI to make plausible-sounding recipes. Have humans rate each AI recipe. Train the AI *again* to optimize for taste. Problem solved, right?

Well, no.

This would be enormously expensive. AlphaGo was initially trained on a set of 30,000,000 moves. Then, it was trained against itself for even longer. If we assume "being a world-class chef" is roughly equivalent to "being a world-class Go player" in difficulty, this could require tens of millions of unique recipes.

On the one hand, it might not be so complicated. 99.9% of the recipes are probably obvious duds. On the other hand, it might be *way more* complicated. Tastes vary. You may need to make each recipe for a hundred people to get a representative sample.

But, y'know, that's not outside the realm of possibility. I could see some rich lunatic investing ten billion dollars to make a world-class robo-chef. So what other issues are there?
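As a rough sanity check on those numbers, here is the arithmetic as a short Python sketch. Every input is a figure from the comment itself (the AlphaGo move count reused as a recipe count, a hundred tasters, 99.9% duds, ten billion dollars); the idea that duds can be filtered out for free before anyone cooks them is an extra simplifying assumption.

```python
RECIPES = 30_000_000          # AlphaGo's initial move count, reused as a recipe count
TASTERS_PER_RECIPE = 100      # "a hundred people" per recipe
DUDS_PER_THOUSAND = 999       # "99.9% of the recipes are probably obvious duds"
BUDGET_USD = 10_000_000_000   # "ten billion dollars"

def servings(recipes, tasters):
    """Total plates that must be cooked and rated."""
    return recipes * tasters

# Every recipe is tasted by a full panel.
unfiltered = servings(RECIPES, TASTERS_PER_RECIPE)

# Only recipes that survive an (assumed free) dud filter get a panel.
survivors = RECIPES * (1000 - DUDS_PER_THOUSAND) // 1000
filtered = servings(survivors, TASTERS_PER_RECIPE)

for label, plates in [("unfiltered", unfiltered), ("filtered", filtered)]:
    print(f"{label}: {plates:,} plates, ${BUDGET_USD / plates:,.2f} per plate")
```

Under these numbers the budget works out to a few dollars per plate if every recipe gets a full panel, and a few thousand dollars per plate if only the non-duds do, so the feasibility of the whole scheme hinges on how cheaply the duds can be screened out.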

First, most of a recipe is implied. Ovens vary in temperature. Pans vary in thickness. Entire steps go unspoken. These are hard to account for. What does "rapidly beat eggs" versus "beat eggs" mean? Even environmental factors like *elevation* can affect boiling point. Unless every meal is made by the *same* chef in the *same* kitchen with the *same* tools, this introduces a huge amount of variance in your training data. But also, because of the number of meals you need to make, it's impossible to *not* have a lot of chefs in a lot of kitchens using a lot of tools.

For standard, ho-hum recipes, this doesn't matter as much. Most chefs will make nearly-identical scrambled eggs. But for brand spankin' new recipes? Two chefs could be in the same restaurant with the same tools and *still* get dramatically different results. Even worse—one chef might dismiss a recipe as impossible, while another might somehow pull it off! That's going to introduce some pretty serious data integrity issues.

Second, innovative cooking often requires using techniques that have never been, and could never have been, described in a cookbook. For example, one day a human being looked at a blowtorch and decided, "Huh. I could sear a steak with that." If your AI can't do that, it'll never be as innovative as a dozen world-class cooks with a test kitchen and an unlimited budget, no matter how much compute it has.

So, how do you make an AI that's more innovative than cooks in a test kitchen? Surely it can't be impossible.

First: Give it the ability to taste.

Suppose you had the ability to take the taste of a world-class chef and upload it into our AI. Suddenly, training becomes a fraction of the cost. Instead of making each meal a hundred times, you only need to make it once.

But that doesn't solve variance. Unless you have one chef making every one of our 30,000,000 recipes, you're going to run into issues—and that ain't possible.

So why not teach the AI to do it? Give them a body. Give them touch sensors. Give them the ability to see and smell. For efficiency's sake, give them a hundred bodies, each built in the exact same way. This accomplishes a couple things.

One, the AI can make every recipe in the same way every time. Variance solved!

Two, the AI can dynamically update a recipe to match real-life conditions. Does the butter look like it's about to burn? No need to toss out the whole recipe! Just adjust on the fly, based on previous cooking experience.

This dramatically reduces the number of recipes the AI needs to generate. Instead of making a recipe from start-to-finish and evaluating it afterwards, it can say, "Wow! This would be *really* good, if only it had a little more salt." Way less work, way lower cost.

Three, we open the possibility to true innovation.

Don't just teach the AI to cook. Let it learn about the world around it. What's water? What's flour? What do they feel like? What do they taste like? What's a laser, and what if I shoot the flour with it?

I would need way more words to connect this to other facets of life, but overall I'd say: I think efficiently training a human-level AI requires an actual, physical body with actual, physical senses. The body may not be like our body. The senses might not be like our senses. But without them, I don't think such an AI is capable of either obsoleting or destroying humans.

Doesn't this also imply that the best way to do this would be to wire a hundred of the world's best chefs together?

Isn't that a more plausible way, given the technology we know to be possible now, to make something that behaves more like a super-intelligent AI?

This seems to be an argument for why DeepMind in 2022 would struggle to make a robo-chef. But wouldn't an ASI or even just AGI (even in a box) be able to overcome most/all of the issues you raise?


Probably not. The issues I raise aren't issues of "the computer is too dumb." The issues are more fundamental: some parts of the world you cannot learn about through reading. You need direct, lived experience to understand them.

To make my analogy more clear, let's imagine we *do* have a general intelligence: a human being. Let's assume that it's a very, very smart general intelligence—somewhere in the realm of Albert Einstein.

Put baby Albert Einstein in a room. Give him every book on cooking known to man. From dawn to dusk, he does nothing but read recipes, cooking histories, and more.

Of course, there's a twist. Unfortunately, our Einstein was born with a genetic disorder—his nerves never developed properly. He can't taste food, and he can't experience texture either. In fact, he can't even *see*—due to a rare somatic disorder, anything other than the pages of a book appears as inky blackness to him.

Who do you think would make a better chef? Our Mr. Einstein, after reading about food for his whole life? Or an amateur home chef who's been making dinner every night for a few years?

I'd imagine Mr. Einstein could *memorize* some very good recipes. He could spit them out verbatim, or tweak them so they're barely changed. But I'd imagine, when it comes to genuinely novel recipes, our amateur home chef would have the edge.

Beethoven wrote some of his best symphonies while going deaf, so it's not a certainty. I would guess that Beethoven knew enough about music and how people react to it that he could envision the experience they would have when hearing it, despite not hearing the music himself. And similarly, perhaps our robo-chef might have enough understanding of the fundamental rules of cooking (which tastes and smells go together and why, how different ingredients should be cooked and why), that it could predict whether a novel recipe will taste good without needing a tongue of its own.

> I consider naming particular years to be a cognitively harmful sort of activity; I have refrained from trying to translate my brain's native intuitions about this into probabilities

Surprising, coming from the person who taught me the importance of betting to avoid self-deception! It's a little off of the main topic of the post, but I'm very curious what Yudkowsky's perspective is here, since it's so different than his past self.

If his "extreme case" is "It would make people stress out", his entire schtick of going around, proverbial placard on chest and bell in hand, loudly proclaiming "AGI will kill us all and there is basically no hope of stopping it!" is also an extreme case.

The sun-explosion metaphor was an interesting choice, because it's not like the researchers could do a single thing to stop it. And if even the world's geniuses can't figure out how to get an AI diamond-safe to tell them the diamond is still in the safe, then a few more years of prep-time seems like it's probably not going to make the difference.

Feb 24, 2022

Well, AI safety charities can't do much about stopping AI research and development either. The latter is much more prestigious and better funded, and by and large doesn't take seriously the end of the world scenarios.

So, despite being involved in AI since early 1991, when I coded some novel neural network architectures at NASA, I have only barely dipped my toe into the AI Alignment literature and/or movement.

But one thought that has occurred to me is that, given (1) the large uncertainty about when and how transformative AI might be achieved, and critically, by whom, (2) the lack of a convincing model for how AI alignment might be guaranteed, or even what that means or how you might know it's true, (3) the almost negligible chance that we could coordinate as a species to halt progress towards human-level AI, and certainly not without sacrificing quite a few "human values" along the way, and (4) the obvious fact that there are quite a few actors with objectively terrible values in the world, perhaps the only sane course of action is to support a mad dash towards transformative AI that doesn't actively, explicitly incorporate human “anti-values" (from your own, personal point of view).

I guess I fear an "evil" actor actively developing and using a human-level AI for "unaligned" purposes (or at least unaligned with *my* values), (far?) more than I fear an "oops, I meant well" scenario (though of course this betrays a certain mindset or set of priors of my own). So, given the number of players that I absolutely DO NOT want to develop the first transformative AI, even if they solve the alignment problem, because they do not hold values that I find acceptable, is the best and only bet to get there first? We may not want to race, but we sure as hell better win?

Now, perhaps an unstoppable totalitarian regime or fanatic religious cult backed by a superhuman AI is *slightly* better than a completely anti-aligned superhuman AI that wipes out humanity completely. But I see no reason to think that an AI developed by the "good guys" has any greater risk of being accidentally anti-aligned than one developed by the "bad guys" (where I'm using those labels somewhat tongue-in-cheek, since everyone thinks that *they* are the "good guys"). And for some groupings of guys into “good” and “bad” categories, you might even argue that the bad ones are much more likely to get it wrong because they just don’t care about things like coercion or human life. So again, is the safest bet just to get there first?

Obviously, this is suboptimal and it would be ideal to both solve the alignment problem and win the race with an aligned AI. But would resources spent on alignment be better spent on getting to the finish line sooner to ensure that the other guys don’t? Worse, will impediments to progress in the name of giving ourselves time to solve the alignment problem make it more likely that we won’t win?

I don’t like the conclusion of this line of thinking (and I don’t endorse the analysis or the conclusion, as there are plenty of issues I may not be considering) but I also can’t talk myself out of it or say that it has no merit. And from a game theoretic perspective, it may not even matter if it’s “right” – if enough of the significant players *believe* that it is, it could be dominant however much we would wish otherwise. (And can you make a strong case that the significant players aren't acting like they think it's correct?)

In other words, I guess my unhappy question is, does transformative AI combine existential risk with winner-take-all payouts, such that the only rational strategy for us “good guys” is to get there first and hope for the best?

>In fact, it’s only a hair above the amount it took to train GPT-3! If human-level AI was this easy, we should have hit it by accident sometime in the process of making a GPT-4 prototype. Since OpenAI hasn’t mentioned this, probably it’s harder than this and we’re missing something.

Not an expert, but: GPT doesn't have the "RAM", though, right? It isn't big enough to support human-level thought no matter how much you train it.

I'm pretty sure GPT-3's working memory is larger than mine.

Computer scientists have been predicting an AI superintelligence (every ten years) since the 1950s. I just don’t think it’s going to happen.

If a biologically untethered model of intelligence doesn’t even exist yet, why is Yudkowsky panicking?

Regardless of the merits of this particular case, I think that "People have predicted X in the past and it hasn't happened yet, therefore X will never happen" is a bad argument.

It's a sub-species of the nonsensical but surprisingly popular argument that says "People have been wrong in the past, therefore you're wrong now".

Right, but the post doesn’t give a specific reason why this time things are different. In fact it does the opposite and claims a paradigm shift will have to happen to make it happen anytime soon. But Scott gives no reason to think a significant “paradigm shift” will happen; he just insists that it will.

Disagree. Section 1 makes reasonable quantitative estimates of how much computational power you'd need to fit a human-intelligence AI, and the timeframe on which this is likely to be achieved. You could certainly quibble about it (and I have, in other comments) but it's not a random number pulled out of a hat.

The "maybe it will happen sooner because paradigm shift" follows later and is certainly a lot more hand-wavey.

But he points out that the computation power timeline is bullshit because an AI is unlikely to operate in any way like a human brain (I’m happy to know at least Yudkowsky realizes this). That’s why Yudkowsky is counting on a paradigm shift. But what Scott and the rest of the AI community refuse to realize is that computers can’t and won’t ever think or have their own volition.

Yeah, but it's the Crying Wolf problem. "X will happen in 10 years time!" X doesn't happen. "Another 10 years!" X still doesn't happen. "Okay, 10 more years!" Still no X.

Maybe a fourth "10 years for sure!" will be correct that time and X finally happens, but you can see why people would go "Yeah, right" rather than "Okay, better pack my woolly socks for this one".

Isn't this the Castro Problem? "Political analysts have been saying Castro will die soon every year since 1980, therefore Castro will never die."

No. We have great priors that each person currently alive will die. So while your failed Castro prediction might technically lower your estimate, its change should be smaller than any significant digit you care to retain.

OTOH, we have no such strong priors on the likelihood of any given emergent technology, but a great track record of experts predicting the end of the world due to some concern in their expert domain (and by great, I mean very likely to be false).
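
The difference between the two cases can be sketched as a simple Bayesian update. This is only an illustration under made-up numbers: the two priors, the 5% yearly chance, and the helper name `posterior_eventually` are all assumptions, not figures from the thread. It compares "X eventually happens" against "X never happens" after a run of failed yearly predictions.

```python
def posterior_eventually(prior, yearly_hazard, failed_years):
    """P(X eventually happens | it has not happened in `failed_years` years).

    Under "eventually", each year has an independent `yearly_hazard` chance
    of X occurring; under "never", the chance is zero every year.
    """
    # Likelihood of seeing no event for `failed_years` years if X does eventually happen.
    no_event_if_eventually = (1.0 - yearly_hazard) ** failed_years
    # Likelihood of seeing no event if X never happens is exactly 1.
    numerator = prior * no_event_if_eventually
    return numerator / (numerator + (1.0 - prior))


# Assumed numbers, for illustration only.
YEARLY_HAZARD = 0.05
FAILED_YEARS = 40

# Near-certain prior (a living person will eventually die).
mortality_posterior = posterior_eventually(1.0 - 1e-9, YEARLY_HAZARD, FAILED_YEARS)
# Agnostic prior (a speculative technology will eventually arrive).
technology_posterior = posterior_eventually(0.5, YEARLY_HAZARD, FAILED_YEARS)

print(f"mortality:  {mortality_posterior:.9f}")
print(f"technology: {technology_posterior:.3f}")
```

With a near-certain prior the forty failed predictions barely move the estimate, while the agnostic prior drops substantially. How far it drops depends entirely on the assumed numbers.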

Mar 21, 2022

We also have a huge track record of doing things that have never been done before. Including a lot of them that people confidently said were not possible. Ending the world would just be another example.

>So, should I update from my current distribution towards a black box with “EARLY” scrawled on it? What would change if I did?

Consider this statement you made three months ago:

>>If you have proposals to *hinder* the advance of cutting-edge AI research, send them to me!

There are known (and in some cases fairly actionable) ways of reliably effecting this, it's just that they're way outside the Overton Window and have huge (though bounded below existential) costs attached. A more immediate (or more certain) danger justifies increasing the acceptable amount of collateral damage, which expands the options available.

(Erik Hoel's article here - https://erikhoel.substack.com/p/we-need-a-butlerian-jihad-against - is relevant, particularly when you follow his explicit arguments to their implicit conclusions.)

This is a really good article.

Anyone have a good source for the political plans of AI safety? That is, the plans to actually apply the safety research in a way that will bind the relevant players involved in high-end AI?

Because it seems from outside like Eliezer's plan is basically "convince/be someone to do it before everyone else and use their newfound superpowers to heroically save the world", which is a terrible plan.

What if 'Breakthrough' AI needs to be embodied? What if Judea Pearl is basically right and the real job is to inductively develop a model of cause effect relationships through interaction with the physical world? What if the modelling of real world causality turned out to be essential to language understanding? What would an affirmative answer to any or all of these questions mean to the project of 'Breakthrough' AI?

To be a little more precise: The substrate independence assumption behind so much current AI philosophising is dubious. Not because living brains have some immaterial spooky essence that can't be modelled in silicon, but because living brains are embodied and are forced to ingest and respond to terabytes of reinforcement training data every minute.

There are other plausible ways to learn cause/effect relationships. Yann LeCun believes self-supervised learning can get you there: for example, building an AI that can predict subsequent (or missing) frames of video, by training on unlabeled unstructured video content. I'd say at the point where you have an AI that can beat humans at predicting what will happen next in any video footage of real world events, either that AI has a really good causal model of the world, or those words don't mean anything.

(I think an "embodied" AI might be able to train faster given its ability to seek out surprising causes and effects instead of being a passive observer, but it seems like the result could be the same in principle).

Yes, there are other plausible ways to learn cause/effect relationships, and Pearl and others have given us great descriptions of what they are. But I'm less impressed by an AI's ability to predict a missing frame of video than I am by my dog's ability to catch a ball in mid air. Now, you might say, well people have written robot ball catching programs already. But they use the current AI paradigms and they are just ball catching programs. They have no idea how to chew a bone, or hunt prey, or greet another dog.

My point is that at the moment, we don't have AIs that can even approximate the cognitive capabilities even of reptiles, except in a small set of constrained domains. Approximating the full capabilities of even smaller mammals like rats and mice is still a distant goal. And that's before we've even started to think about what natural language understanding really is. We don't even know in theory!

This is not to say breakthrough AI isn't possible. Just that the computer industry grossly overhypes what is possible given the current paradigms. The problem isn't that we don't have enough teraflops, or that we need some new algorithms. We need to think about the problem differently, and pay more attention to what the biologists are telling us.

Whoa. It's the Drake Equation for super-intelligent AI.

But I really like Platt's Law. It totally works, everywhere! In 1969 Moon colonies were 30 years away (cf. Kubrick's "2001"), in 1954 Lewis Strauss suggested fusion power too cheap to meter was one and a half generations ("our children and grandchildren") away, which is about 30 years. When Dolly the Sheep was born (1995) human cloning was said to be achievable by the 2020s, Aubrey de Grey says immortality is quite possible within 30 years, Ray Kurzweil suggests The Singularity will happen in 25 years.

It's amazing. Clearly the common factor must be that all technological miracles have a very similar underlying timescale, set by a symmetry of Nature yet to be comprehended, or a mandate by Allah, hard to be sure which.

Human cloning has been achievable since 1995. It's just one of those times where we collectively decided not to.

Nothing from the Raelians counts, of course.

Check out MPEP 2164 on enablement if you're going to lean on the USPTO - the question of "undue" or "unreasonable" experimentation arises. If we ignore the fact that all such experiments on humans would be considered "unreasonable" in the moral sense of being unethical, and leave it to the technical/legal definition of the term, then you could actually argue that no "undue" experimentation is needed.

There's nothing special to suggest that the process of cloning a person by somatic cell nuclear transfer is any different in humans than it is in a sheep or a mouse. It's just very, very unethical and would result in at least dozens of dead or damaged babies before you get a viable one. So we've collectively decided not to, at least outside a few fringe cases like that one Korean researcher...

Feb 25, 2022

Sure there is. The requirement of "success" for a sheep is pretty dang limited. It just has to successfully eat and poop and stand around waiting to be eaten. If its natural IQ has been cut in half, or its lifetime cut by 75%, nobody will care even if they notice. But they certainly *will* care if that is true about a human clone. We are exquisitely sensitive to what constitutes a "successful" human birth -- this is why the malpractice insurance rates for OB-GYNs are so high -- so we will be equally critical of what constitutes a "successful" human clone. That it can be done at all is undemonstrated, as I said.

Sorry, I feel like I should add that I don't *disagree* that people (competent people that is) have not *tried* to advance this field, and that this is undoubtedly one reason why it has yet to be demonstrated. I agree with that. I'm just disagreeing that this is the *only* reason why it hasn't been demonstrated. There is certainly a theoretical path forward one can take, based on lower animals, but whether there are as yet unknown pitfalls and problems with that path, nobody yet knows.

You'd have to see the existence of an actual human clone to verify that "we could have done it *but chose not to*?" If we could clone one mammal, why not another?

A little bit of nitpicking:

1. GPT-3 training cost several million $ (I seem to remember I heard it was $3 million), probably more than AlphaStar.

2. You could run GPT-2 on a "medium computer", but not GPT-3. You would need at least 10-15 times the amount of GPU/TPU memory compared to a high-end desktop. I'm not 100% sure, but I think OpenAI is currently running every GPT-3 instance split between several machines (they certainly had to do it for the training, according to their paper).

3. We are not really interested in the amount of FLOPS that evolution spent on training nematodes, because we are at the point where we already can train a nematode-level AI or even a bee-level AI, as you pointed out. So for the purposes of the amount of computation spent by the evolution, I would only consider mammals. I wonder how many OOMs it shaves off the estimation?
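
A back-of-the-envelope check of point 2, as a rough sketch: it uses the 175 billion parameter count reported for the largest GPT-3 model, and assumes 16-bit weights and a 24 GB consumer GPU. The last two figures are assumptions for illustration, not numbers from the comment, and the estimate ignores activations and any other runtime overhead.

```python
GPT3_PARAMETERS = 175e9        # parameter count reported for the largest GPT-3 model
BYTES_PER_PARAMETER = 2        # assumption: weights stored as 16-bit floats
DESKTOP_GPU_MEMORY_GB = 24     # assumption: one high-end consumer GPU

# Memory needed just to hold the weights, in gigabytes (10^9 bytes).
weights_gb = GPT3_PARAMETERS * BYTES_PER_PARAMETER / 1e9

# How many times the assumed desktop GPU memory that is.
memory_ratio = weights_gb / DESKTOP_GPU_MEMORY_GB

print(f"weights: {weights_gb:.0f} GB, about {memory_ratio:.1f}x one desktop GPU")
```

Under these assumptions the weights alone come to roughly 350 GB, which lands inside the "10-15 times" range given above.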

Putting aside an exact timeline for AGI for a moment, I've never understood why human-level AGI is considered an existential threat (which seems to be taken for granted here). Are arguments like the paperclip maximizer taken seriously? If that is the risk, then wouldn't effective AI alignment be something like: Tell the AI to make 1,000,000 (or however many the factory in the thought experiment cares to make) paperclips per month and no more. If the concern is a poorly specified "maximize human utility", do we really think that anyone with power would give it to the AI for this purpose? Couldn't we just make the AI give suggested actions, but not the ability to directly implement? Who has the motivation to run such a program - it would destroy middle management and the C-suite! If we want to stop AI from improving itself why don't we just not give it the ability to do so? I maintain that we could engineer this fairly easily (at least assuming P != NP).

I haven't heard a convincing argument for what the doomsday scenario looks like post human level AGI (even granting quick upgrade to superhuman levels). In particular to me, it seems a superhuman AI is still going to need to exert a substantial amount of power in the real world from the get go as well as suffer from inexact information (which makes outsmarting someone at every turn impossible). Circling back to the paperclip example, at some point before the whole world is turned to paperclips, it seems reasonable that a nation would be able to bomb the factory. Even before that, how would the AI prevent someone from walking in and "unplugging it" (I realize this may be shutting all its power off etc.).

I feel like a lot of worrying about AI can come from a fetishization of intelligence in the form of "knowledge is power", but this just doesn't seem to be the case to me in the real world. Just because humans are more intelligent than a bear, doesn't mean that the bear can't kill the human. I believe in the case of a superintelligent AI, humans would be able to just say "screw you" to the AI and shut it down. Of course, there can be scenarios where the AI has direct access to "boots on the ground" such as nanobots or androids. But the timeline for these to overpower humans is certainly further out than 2030. I don't feel like indirect access to manipulated humans would be enough.

My feeling is that a superintelligent AI at most may be able to gain a cult-worth of followers, but not existential threat levels. I haven't heard a good argument of an existential threat that isn't at least very speculative. Much more speculative than the statement "Multiple nuclear states and hundreds of nuclear weapons will exist for 70 years and there will not be one catastrophic accident". So my intuition is that AGI is unlikely to be an existential threat.

The whole "paperclip maximiser" thing was, I think, intended as a silly toy scenario to illustrate a simple and easily visualisable example of how things could go wrong. I think some people have taken it too seriously as an actual scenario, and I agree that it's not actually likely to happen in those terms.

Taking a step back, the general class of AI problems looks like this: you have built a powerful and inscrutable system, and it's doing things that aren't exactly aligned with what you want. This general description doesn't just encompass planet-eating robots, it also encompasses the kind of AI problems that we face today, like the way that the Google search algorithm has become so good at targeting maximised engagement that it gives you highly-engaging results rather than results which are related to your search term.

As AIs become more powerful, more inscrutable, and more entangled into every aspect of our society, problems like this become worse.

"the way that the Google search algorithm has become so good at targeting maximised engagement that it gives you highly-engaging results rather than results which are related to your search term"

But that's a problem for us, the users. It's not a problem for Google (or Alphabet or whatever they are calling themselves today) since that is working for them (presumably it gives paid-for ad results or click-bait or things that will make Google profitability go up rather than down). If the algorithm alienated enough users such that everyone switched to Firefox, then Google would be motivated to fix it (or hit it with a spanner and kill it).

"you have built a powerful and inscrutable system, and it's doing things that aren't exactly aligned with what you want."

I, too, have a child. Wocka-wocka.

I agree that already a huge amount of our life is controlled by out-of-sight algorithms: high-frequency trading, loan applications, face recognition etc., to mention a couple of others that you didn't mention. There is no doubt that even now, without AGI, there is a big influence on human life from these AI/algorithms. I just don't think humans would let an AI that was actively hurting/killing people, or gaining the power to do so, exist. I don't find convincing the argument that, because it is smarter, it can manipulate humans into doing anything whatsoever.

I do agree that as AIs become more powerful, the problems could become worse in some ways (while probably improving human life in some other important ways), but I see this as almost a tautology. As anything becomes more powerful, the potential upside and downsides increase. My objection is to the immediate characterization of AGI being "the end of the world" or an existential threat. That scenario seems hard to see for me.

Feb 24, 2022

There are boundless examples of specification gaming, where the AI does something you didn't mean for it to do, but that increases its reward function nevertheless. For example, iirc the original DeepMind Atari paper had cases where the AI found a bug in the game that caused the score to go up without actually playing the game.
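
A toy sketch of the idea, not a reconstruction of any real experiment: the action names and numbers below are hypothetical. The agent only ever sees the score counter, so a pure score maximizer picks the bug every time, even though it makes no real progress.

```python
# Hypothetical per-step outcomes: what the score counter records vs. actual progress.
ACTIONS = {
    "play_level": {"score": 1, "progress": 1},
    "exploit_bug": {"score": 5, "progress": 0},
}


def greedy_action(actions):
    """Pick whichever action yields the highest measured score."""
    return max(actions, key=lambda name: actions[name]["score"])


def run_episode(steps):
    """Run a score-maximizing agent and report (total score, total progress)."""
    total_score = 0
    total_progress = 0
    for _ in range(steps):
        outcome = ACTIONS[greedy_action(ACTIONS)]
        total_score += outcome["score"]
        total_progress += outcome["progress"]
    return total_score, total_progress


score, progress = run_episode(10)
print(f"score={score}, progress={progress}")  # prints "score=50, progress=0"
```

The reward signal is being optimized exactly as written. The gap is between what was written (score) and what was meant (progress).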

Specification gaming is relatively harmless when done by a sub-human AI. But, can you imagine what might happen if an AI with vastly superior intelligence to your own did it? Just to start with, even a human level AGI would know that it had better not let you find out that it wants to game the reward function, since it knows you will turn it off in such a case.

Regarding bears, I've never understood the use of animal counterexamples such as this. Of _course_ the fact that you are more intelligent than the bear means you can kill it! Not with your bare (ahem) hands, but you can show up with a rifle, a device which is completely beyond the bear's understanding, and kill it without it ever seeing you. And if you don't have a rifle, you can just avoid the bear's habitat, build a settlement somewhere, invent agriculture and civilization, have an industrial revolution, buy a rifle at Walmart, and come back and kill the bear. Except by the time this happens, the bear population will be less than 1% of what it used to be, and you will spend none of your time worrying about bears.

The threat of AI is not that as soon as your AI reaches super-human intelligence that it will shoot a bolt of lightning out of your USB port and kill you. The threat is that it will do things that you literally cannot understand, and may not even be aware of, which will result in your death, to which the AI will be indifferent.

Yes, that's super speculative and sci-fi ish, but we are talking about what will happen when something that has never existed before comes into existence, so I think you're allowed to use your imagination. Even nuclear weapons had precedent in asteroid strikes and volcanos.

Feb 24, 2022

My argument isn't that with enough material preparation and "power" in the real world, that an AGI couldn't kill a human. An existing image recognition AI could kill a human with an integration to a gun and instructions to shoot when the video feed in front of the gun has a human in it. My argument is that the situations in which an AGI gets this amount of material preparation seem improbable to me, since humans will have more presence "on the ground". So to mix analogies, I think the question is how the AGI is going to get the rifle in the first place.

I disagree that there was precedent for nuclear weapons as an existential threat. The huge difference between nuclear weapons and asteroid strikes or volcanos is that they lie in the control of an intelligent agent. Never before in history had a nation state had the ability to destroy so thoroughly (with the possibility of extinction through prospective mechanisms like nuclear winters).

I understand that imagination needs to be used in these cases. I disagree that any scenario you can imagine that would lead to an existential threat is at all likely. But it sounds like you already assign a lower chance to these scenarios than Scott and others seem to.

I think the AI will get the rifle because whoever has the AI will have tremendous incentives to give the AI a rifle so that they can accomplish their own goals. It will even be difficult for an AI safety researcher to resist giving the AI a rifle, since if they do nothing, someone else will eventually build an AI that is less safe than their own.

It's part of the tottering chain of assumptions needed for it to be a threat - no more, no less.

Once you've dismissed every objection to AGI as a concept, executing AGI with anything like today's hardware and software, AGI proving uncontrollable, AGI self-modifying to achieve takeoff and super-intelligent AGI being more or less omnipowerful and omnipotent, then all that's left is to discuss alignment.

Alignment is one of the only steps in the chain you can discuss at all without throwing the whole AGI-as-doomsday thing out altogether (and, horrors, discussing industrial development, or economics, or something by accident), so AGI people spend a lot of time discussing alignment.

I think human level AI is roughly the point where a runaway process might start to happen if AI replaces humans as AI researchers.

I don't think that "human level AI exists and all humans agree not to allow it to become smarter" is a very stable equilibrium. Instead, it will likely lead to an AI arms race.

AIs not being able to defend themselves against direct human attacks is not a convincing argument to me. Nuclear weapons are another existential risk for humans despite being rather easily dismantled. Humans will not be able to turn off superhuman level AIs for the same reason that peace activists don't spend much time dismantling nukes: other humans with guns are in the way.

It could be pointed out that most of the stuff the time traveler relies on for his inventions is the product of empirical research, not pure deductive reasoning. So a more appropriate analogy might be the king summoning a demon as an oracle, which knows nothing about iron, germ theory or gunpowder, but can go from integer multiplication to the incompleteness theorem in virtually no time. The king has a problem: he would like to increase his tax revenue. So the oracle tells him to have his tax collectors take specific notes and eventually, he is able to extract a tenth more grain from the peasantry. (The oracle insists the increase is actually 9.523%, but nobody understands enough math for that yet.) After conversing with some court bards for a few weeks, the oracle composes a song praising the king which greatly adds to his legitimacy. In return, he provides the oracle with a few of his subjects to carry out "empirical research". After a few years of careful astronomical observations, the oracle declares it has discovered the celestial laws of motion, but the king says he would rather have another song. While most carpenters reject the oracle's plans for a device to make water move upwards as impractical, finally a young cooper manages to build a working prototype, so the king finally approves the "build a hotter furnace and put random stuff in it" project, which gets us back on the track to the previous section.

Of course, an AI landing in the bronze age would be severely handicapped because there is much less data in an easily processable form. So to make the analogy fairer, the oracle would also be able to see any point in the kingdom and listen in half of the homes or something.

I never claimed human level AI would stay human level. I grant your scenario of the AI becoming self-improving. I also grant that AIs may be relied on for many decisions. I do not see where that a priori is an existential risk. You mention that nukes remain, despite being a risk. I maintain that human organization will have control over their AI in the same way they have control over their nukes. Despite the US not wanting to disarm our nukes, we easily could if we desired.

Your examples point out cases where AI becomes more influential (which again, is happening every day already), but don't point out why this leads to extinction. I have no doubt these intelligent AI would exist among the great powers of the world (at least initially, due to computational power + keeping it a secret). And if these are useful enough (I think it's a big if given imperfect information, as well as generals wanting to keep their jobs) then they will probably exist in some sort of mutually assured AI (i.e. the US doesn't want to give up their AI because China won't give up theirs). I don't see how this makes a leap to existential risk. Multiple actors would have potentially dangerous tools at their disposal, and this at most would be one more. Wouldn't it still depend on geopolitics? US command structure is not going to go to war because an AI told them to.

The world will undoubtedly be a different place with more influential AI, but again I do not see what the actual sequence of events that leads to catastrophe is, are you able to elaborate on that (preferably without time travel or demons :))?

Scott are you going to EAGx at Oxford or London this year?

> Five years from now, there could be a paradigm shift that makes AI much easier to build.

Well, yeah, there could be. But the problem is that, right now, we have no idea how to build an AGI at all. It's not the case that we could totally build one if we had enough FLOPS, and we just don't have enough GPUs available; it's that no one knows where to even start. You can't build a PS5 by linking together a bunch of Casio calculator watches, no matter how many watches you collect.

So, could there be a paradigm shift that allows us to even begin to research how to build AGI? Yes, it's possible, but I wouldn't bet on it by 2050. Obviously, we are general intelligences (arguably), and thus we know building AGI is possible in theory -- but that's very different from saying "the Singularity will happen in 2050". There's a difference between hypothetical ideas and concrete forecasts, and no amount of fictional dialogues can bridge that gap.

I can't help but think more about the learning/training side. You can have a human-level intelligence and throw it at a task (e.g., driving a car). This task consumes only limited resources (you can have a conversation while driving), but training (learning how to drive) is much more intensive... and very dependent on the quality of teaching. Perhaps good training data is a much more important factor than we make it out to be? There's plenty of evidence that children with difficult backgrounds (=inferior, but generally similar training data) measurably underperform their peers. For an AI, the variation in training data quality could be much larger. Perhaps we are quite close to human-level performance of AI, and we are just training them catastrophically badly?

But is Platt's law wrong? If you want to predict when the next magnitude 9 earthquake occurs, you should predict X years, no matter what year it is, for some X. I think Yudkowsky is basically including some probability of "a genius realizes that there's an easy way to make AGI" - then the chance of that genius coming along and doing this might really have a constant rate of occurrence and the estimate is always X years, for some X. Today's predictions are conditioning on "it hasn't yet happened" and so should predict a different date than yesterday's predictions.

For big earthquakes, one might assume that the time to the next one follows a simple exponential distribution. That assumption is a key piece of information.

I don't suppose that Yudkowsky really assigns significant probability to some lone genius turning their raspi into an AGI tomorrow, but in general, he seems to anticipate such black swan events, while his opponents are more about extrapolating past growth rates.

I think you are probably factually correct, but what you are asserting is that there is no deliberate development process for a superintelligent AI, it's just random chance whether it happens or not (like the earthquakes, or like radioactive decay). If that is the case, if it really is purely (or almost entirely) a stochastic process, then yes the mean time until it happens does not depend on the date of the prediction (as long as it hasn't happened yet).

What most people *want* to believe is that technological advance is under our control, and by choosing to spend/not spend money, or devote/not devote talented person time to it, we can remove most elements of randomness.
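
The "constant rate" picture in this sub-thread is the memoryless property of the exponential distribution, and it can be checked numerically. A minimal sketch, assuming a purely hypothetical mean waiting time of 30 years (an illustrative number, not anyone's actual forecast): however long you have already waited, the chance of the event in the next 30 years stays at 1 - 1/e, about 63%, and the expected remaining wait stays at 30 years.

```python
import math

MEAN_WAIT_YEARS = 30.0          # hypothetical mean waiting time, for illustration only
RATE = 1.0 / MEAN_WAIT_YEARS    # constant hazard rate of the exponential distribution


def survival(t):
    """P(T > t) for an exponential waiting time with the rate above."""
    return math.exp(-RATE * t)


def prob_within(horizon, already_waited):
    """P(T <= already_waited + horizon | T > already_waited)."""
    return 1.0 - survival(already_waited + horizon) / survival(already_waited)


def expected_remaining(already_waited, step=0.01, cutoff=1000.0):
    """Riemann-sum estimate of E[T - s | T > s] from the conditional survival curve."""
    total = 0.0
    for k in range(int(cutoff / step)):
        t = k * step
        total += survival(already_waited + t) / survival(already_waited) * step
    return total


for waited in (0, 10, 50):
    # The printed probability and expected wait do not depend on `waited`.
    print(waited, round(prob_within(30, waited), 4), round(expected_remaining(waited), 1))
```

If technological progress is instead a deliberate, partly controllable process, as the last paragraph above suggests people want to believe, the hazard rate is not constant and this invariance goes away.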

[OFFTOPIC: Russia-Ukraine]

Very sorry for the offtopic, but as events unfold in Ukraine (Russia is invading Ukraine), I would be very glad to see a discussion of this in the community.

Could someone point me to some relevant place, if such a discussion has already taken place, or is currently going on here/LessWrong/a good reddit thread?

Thanks so much - maybe if Scott would open a special Open Thread?

Currently, to me it seems like Russia/Putin is trying to replace the Ukrainian government with a more Russia-favoring one, either through making the government resign in the chaos, executing a coup through special forces or forcing the government to relocate and then taking Kiev and recognizing a new government controlling the Eastern territories as the "official Ukraine".

I would be particularly interested in what this means for the future, eg:

- How will Ukrainian refugees change European politics? (I am from Hungary, and it seems like an important question.)

- What sanctions are likely to be put in place?

- How will said sanctions influence the European economy? (Probably energy prices go up - what are the implications of that?)

“who actually manages to keep the shape of their probability distribution in their head while reasoning?”

This is exactly the job description of Risk Managers (as opposed to business units, that care for measures of central tendency such as expected or most likely).

One interpretation of what he is saying is that, like any good risk manager, he has a very good idea about the distribution. But a large enough portion of that distribution falls before any reasonable mitigation can be established that the rest doesn’t matter. Given the risks we are talking about, that is a scary conclusion.

The Elo vs Compute graph suggests that the best locally available intelligence "algorithm" should take over in evolution, if only to reduce the number of resources necessary to run the minimum viable intelligence set. How structurally different are the specialized neural structures?

"Imagine a scientist in Victorian Britain, speculating on when humankind might invent ships that travel through space. He finds a natural anchor: the moon travels through space! He can observe things about the moon: for example, it is 220 miles in diameter (give or take an order of magnitude). So when humankind invents ships that are 220 miles in diameter, they can travel through space!

...Suppose our Victorian scientist lived in 1858, right when the Great Eastern was launched."

Then your Victorian scientist's estimations would become outdated in 1865, when Jules Verne wrote "From The Earth To The Moon" and had his space travellers journey by means of a projectile shot out of a cannon. So I (grudgingly) suppose this fits with Yudkowsky's opinion, that it will happen (if it happens) a *lot* faster and in a *very* different way than Ajeya is predicting.

But my own view on this is that the entire "human-level AI then more then EVEN MORE" is the equivalent of H. G. Wells' 1901 version, where space travel for "The First Men In The Moon" happens due to the invention of cavorite, an anti-gravity material.

In retrospect it's interesting that nobody seems to have thought about space travel with rockets. As far as I'm aware the Victorians had all the chemistry and materials science they'd need to build a (terrible, dangerous, and certainly incapable of lunar travel) solid fuel rocket.

That's because rockets are stupid. The famous rocket equation tells you that a ridiculously small amount of your mass can be payload, because of the idiotic necessity of carrying fuel to burn to accelerate the rest of the fuel you are carrying to burn to accelerate the payload. The mass of the Apollo 11 CSM was ~1% of the fueled Saturn V/Apollo stack on the pad.
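
For reference, the rocket equation mentioned here is Tsiolkovsky's: delta-v equals exhaust velocity times the natural log of (initial mass / final mass). A rough sketch of how quickly that squeezes the non-propellant fraction follows; the exhaust velocity and delta-v values are round illustrative assumptions, not Saturn V figures, and staging is ignored.

```python
import math


def mass_ratio(delta_v, exhaust_velocity):
    """Tsiolkovsky rocket equation solved for m0/mf = exp(delta_v / v_e)."""
    return math.exp(delta_v / exhaust_velocity)


def final_mass_fraction(delta_v, exhaust_velocity):
    """Fraction of liftoff mass left (structure + payload) after the burn."""
    return 1.0 / mass_ratio(delta_v, exhaust_velocity)


EXHAUST_VELOCITY = 3000.0  # m/s, assumed round number for a chemical rocket

for delta_v in (3000.0, 9500.0, 15000.0):  # m/s, illustrative mission budgets
    fraction = final_mass_fraction(delta_v, EXHAUST_VELOCITY)
    print(f"delta-v {delta_v:>7.0f} m/s -> {fraction:.2%} of liftoff mass remains")
```

Because the dependence is exponential, each extra multiple of the exhaust velocity in required delta-v divides the surviving mass fraction by e, which is the "carrying fuel to burn to accelerate fuel" problem in one line.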

The efficient and clever thing to do is burn all your propellant on the ground, accelerating your projectile all at once to whatever velocity you need. That way you need only burn the propellant required to accelerate your payload, and none to accelerate propellant -- or at least, in the case of Apollo for example, you only need to accelerate the relatively tiny amount of fuel you need to lift off from the Moon and re-insert into a return trajectory.

The reason we did use rockets is that nobody could figure out how to build a big enough gun barrel that could accelerate human beings sufficiently gently to survive the launch, and because we were in a big hurry, so throwing away 99% of your exceedingly expensive hardware was acceptable if it meant we could beat the Russkis to the Moon in time for RFK (had he lived) to win re-election.

But when people think soberly about truly efficient access to space, and allow themselves to dream of advances in materials science and civil engineering, they go back to the sensible Victorian notion of avoiding rockets and accelerating stuff you throw away (or burn) -- so e.g. launch loops, space elevators.

But no, the Victorians could not have built a lunar rocket, or even an orbital one, because the Hall-Héroult process was only invented in the 1880s and did not really come into its own until the construction of massive hydroelectric dams in the 1930s. With the price of aluminum what it was in the late 19th century, an orbital rocket would've probably taken a quarter of the GDP of Great Britain or France.

> Is that a big enough difference to exonerate her of “using” Platt’s Law? Is that even the right way to be thinking about this question?

So, on the Platt's law thing. It's very weak evidence, but it is Bayesian evidence. Consider an analogous scenario: You get dealt a hand from a deck, that may or may not be rigged. If you get a Royal Flush of Spades, intuitively it feels like you should be suspicious the deck was rigged. It's really unlikely to draw that hand from a fair deck, and presumably much more likely to draw it from a rigged deck. But this should work for every hand, just to a lesser extent.

If we assume that all reasonable guesses are before 2100 (arbitrarily, for simplicity), then there are about 80 years to choose from. Being within 2 years of the "special" estimate (30 years; I'll come back to 25, but 30 is easier) is a 5-year range in the 80 years, for odds of 1/16. This is kinda close to the odds of drawing Two Pair in cards, so how suspicious would you be that the deck was rigged in that case? (25 years being the "special" one gives 15/80, or about 1/5. There isn't a poker hand close to 1/5, but it's somewhere between One Pair twice in a row and One Pair of face cards.) That's about how suspicious it should make you of the estimate (so, in my mind, not very). Likely this is getting a lot more air-time than it's doing work.

(Caveat, I'm completely skimming over the other side, which is that it matters how likely the cards would be drawn if the deck WAS rigged (i.e. how likely someone would rig THAT hand), because I don't really know how to even estimate that. Just as a guess, if that consideration pushes in favor of being suspicious, it might be the amount of suspicion if you drew Three of a Kind, and MAYBE it could get as far as a Straight.)
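
The poker comparisons above can be checked with standard five-card hand counts. A minimal sketch; the 80-year span and the 5- and 15-year windows are the comment's own simplifying assumptions, and the helper names are just for illustration.

```python
from math import comb

TOTAL_HANDS = comb(52, 5)  # number of distinct five-card hands

# Standard combinatorial counts for five-card draw hands.
HAND_COUNTS = {
    "one pair": 13 * comb(4, 2) * comb(12, 3) * 4**3,
    "two pair": comb(13, 2) * comb(4, 2) ** 2 * 11 * 4,
    "three of a kind": 13 * comb(4, 3) * comb(12, 2) * 4**2,
    "straight": 10 * 4**5 - 40,  # excludes straight flushes
    "royal flush of spades": 1,
}


def hand_probability(name):
    """Probability of being dealt the named hand from a fair deck."""
    return HAND_COUNTS[name] / TOTAL_HANDS


def window_probability(width_years, span_years=80):
    """Chance a uniformly random guess lands in a window of the given width."""
    return width_years / span_years


print("5-year window: ", window_probability(5))
print("15-year window:", window_probability(15))
for name in HAND_COUNTS:
    print(f"{name}: {hand_probability(name):.6f}")
print("one pair twice in a row:", round(hand_probability("one pair") ** 2, 4))
```

Two Pair comes out near 4.8% against 6.25% for the five-year window, and One Pair twice in a row near 18% against 18.75% for the fifteen-year window, so the rough comparisons hold up.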

It seems to me that there is no kind of expertise that would make one predictably better at making long-term AGI forecasts. Indeed, experts in AI have habitually gotten it very wrong, so if anything I should down-weight the predictions of "AI experts" to practically nothing.

I think I am allowed to say that I think all of the above forecasting methods are bad and wrong, by simply looking at the arguments and disagreeing with them for specific reasons. I don't think I am under any epistemic obligation to update on any particular prediction just because somebody apparently bothered to make the prediction; I am not required to update on the basis of the Victorian shipwright's prediction about spaceflight.

My opinion is that the whole exercise of "argument from flops" is doomed, and its doom is overdetermined. Papers come out showing 3 OOM speedups in certain domains over SOTA - not 3x speedups, 1000x speedups. How can this be, if we are anywhere close to optimizing the use of our computational resources? How would we be seeing casual, almost routine algorithmic improvements that even humbly double or 10x SOTA performance, if we were anywhere near the domain where argument-from-flops-limitation would apply?

Grace asked the same experts to judge when lesser milestones would happen; I think it's almost been enough time that we've had a chance to judge their progress. I would update to trusting them more if they got those right.

Overall I don't think there's some incredibly noticeable tendency for AI experts to mispredict AI. They were a bit fast with self-driving cars, a bit slow with Go, but absent any baseline to compare them against they seem okay?

Regarding Platt's Law, I sense a fundamental misunderstanding of why a prediction might follow it. It's not a regimented mathematical system. It's something our brains like to do when we think something is coming up soon, but we see no actual plottable path to reach it.

It's the same reason that fusion power is always 30 years off. It's soon enough to imagine it, but long enough away that the intervening time can do all the work of figuring out how.

If no one has any idea *how* to create a human level AI, then no level of computational power will be enough to get there. We could have 10^45 FLOP/S right now and still not have AI, if we don't know what to do with them. Having the computer do 2+2=4 a ridiculous number of times doesn't get us anywhere.

That doesn't mean human level AI cannot actually arrive in 30 years, but it also doesn't say anything really about 10 years or 500 years. The fundamental problem is still *how* to do it. If you get to that point, any engineer can plot out the timeline very accurately and everyone will know it. Until then, you could say just about anything you want.

As an experiment, throw billions of dollars into funding something that we know can't exist now, but is maybe theoretically possible. Then ask the people in the field you've created to tell you how long it will take. I bet the answer will be about 30 years, give or take a little bit. They're telling you that they don't know, but had to provide an answer anyway.

I'm still ankle-deep in the email and haven't looked at the comments, but it got me thinking: if we've been making a lot of progress recently by spending more, how much will the effort be stymied by interest rate increases? How about war?

> So maybe instead of having to figure out how to generate a brain per se, you figure out how to generate some short(er) program that can output a brain? But this would be very different from how ML works now. Also, you need to give each short program the chance to unfold into a brain before you can evaluate it, which evolution has time for but we probably don’t.

Doesn't affect any overall conclusions, but there's a decent amount of research that would count as being in this direction I think. Hypernetworks didn't really catch on but the idea was to train a neural network to generate the weights for some other network. There's also metalearning work on learning better optimizers, as well as work on evolving or learning to generate better network architectures.
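
To make the hypernetwork idea concrete, here is a toy forward pass only, in plain Python: one small network turns an embedding into the weights and biases of a second, "target" layer. The class name, layer sizes, and random initialization are all illustrative assumptions, and no training is shown; real hypernetworks learn the generator's parameters by backpropagating through the target network's loss.

```python
import random


class TinyHypernetwork:
    """Maps an embedding vector to the weights and biases of a target linear layer."""

    def __init__(self, embed_dim, target_in, target_out, seed=0):
        rng = random.Random(seed)
        self.target_in = target_in
        self.target_out = target_out
        n_target_params = target_out * target_in + target_out
        # The hypernetwork's own parameters: one row per generated target parameter.
        self.proj = [[rng.gauss(0.0, 0.1) for _ in range(embed_dim)]
                     for _ in range(n_target_params)]

    def generate(self, embedding):
        """Return (weights, biases) for the target layer, computed from the embedding."""
        flat = [sum(w * e for w, e in zip(row, embedding)) for row in self.proj]
        split = self.target_out * self.target_in
        weights = [flat[i * self.target_in:(i + 1) * self.target_in]
                   for i in range(self.target_out)]
        biases = flat[split:]
        return weights, biases


def target_forward(weights, biases, x):
    """Apply the generated linear layer to an input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


hyper = TinyHypernetwork(embed_dim=4, target_in=3, target_out=2)
weights, biases = hyper.generate([1.0, 0.0, -1.0, 0.5])
output = target_forward(weights, biases, [0.2, 0.4, 0.6])
print(len(weights), len(weights[0]), len(biases), output)
```

The point of contact with the quoted passage is that the thing being searched over is the short generator (`proj` here), not the target weights directly.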

Feb 24, 2022·edited Feb 24, 2022

> But also, there are about 10^15 synapses in the brain, each one spikes about once per second, and a synaptic spike probably does about one FLOP of computation.

This strikes me as very weird - humans can "think" (at least react) much faster than a second. If synapses fire only every second, and synapses firing are somehow the atomic units of computation in the brain, then how can we react, let alone think complex thoughts (that probably require some sequence of steps of calculations) orders of magnitude faster than a second?

Am I missing something? It seems either the metric of synapses is wrong, or the speed.

To elaborate on that a little, if there were such a thing as a "grandmother neuron" (a neuron tuned to respond only to a specific person) and you only saw your grandmother once per year, the average rate of fire of the grandmother neuron would be very low, but that doesn't mean it would take you days to recognize your grandmother when you met her.

We probably need to be careful about measuring the speed of thought. Our thought (unlike computers) is parallel on a gargantuan level. We're talking 100 billion simultaneously operating CPUs. So what they can get done in just one clock step is stupendous. (*How* this is done -- what the ridiculously parallel algorithm *is* -- is one of those enduring and frustrating mysteries.) I would agree an effective clock speed of 1 Hz seems a bit slow, given we can and do have significant mental state changes in less time, e.g. the time between seeing a known face and experiencing recognition. But it can't really be a lot faster. Maybe 10-100 Hz or so, because signals don't go down nerves very fast, which isn't that much of a change.

The atomic unit of computation would be the neuron depolarization-repolarization cycle, which seems to take around 5ms. I assume synapses might introduce additional delays depending on their type.
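
Putting the numbers in this sub-thread side by side: the quoted 1 Hz figure is an average rate used to estimate throughput, while the roughly 5 ms cycle bounds latency. A back-of-the-envelope sketch using only the figures quoted above, plus one hypothetical reaction time of 0.2 s; synaptic delays are ignored, so the step count is an upper bound.

```python
SYNAPSES = 1e15        # from the quoted estimate
FLOP_PER_SPIKE = 1.0   # from the quoted estimate
CYCLE_SECONDS = 0.005  # depolarization-repolarization time cited above


def brain_flops(avg_rate_hz):
    """Throughput estimate: synapses x average spike rate x FLOP per spike."""
    return SYNAPSES * avg_rate_hz * FLOP_PER_SPIKE


def sequential_steps(reaction_time_s):
    """Upper bound on back-to-back neuron cycles that fit in a reaction time."""
    return int(round(reaction_time_s / CYCLE_SECONDS))


max_rate_hz = 1.0 / CYCLE_SECONDS  # fastest sustained firing the cycle time allows

print("throughput at 1 Hz average:", brain_flops(1.0), "FLOP/s")
print("throughput at the cycle-time ceiling:", brain_flops(max_rate_hz), "FLOP/s")
print("serial steps in a 0.2 s reaction (hypothetical):", sequential_steps(0.2))
```

So a low average rate and a fast reaction are compatible: a neuron that rarely fires can still respond within a few milliseconds when it does, and a few dozen serial steps fit inside a fraction of a second.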

I think one reason for Platt's law may be that Fermi estimates (I'd class the Cotra report as basically a Fermi estimate) suffer from a meta-degree of freedom, in that the human estimator can choose how many factors to add into the computation. For instance, in the Drake equation, you can decide to add in a factor for the percentage of planets with Luna-sized moons if you think that having tides is super important for some reason. Or you can add in a factor for the percentage of planets that don't have too much ammonia in their atmosphere, or whatever. Or you can remove factors. The point is that the choice of factors far outweighs the choice of the values of those factors in determining your final estimate.

I don't think that Cotra is deliberately manipulating the estimate by picking and choosing parameters, but it seems clear that early in such an estimation process, if you come up with a result showing that AI will arrive in 10,000 years or 3 months, you're going to modify or abandon the framework you're using because it's clearly producing nonsense. (Not that AI couldn't arrive in 3 months or 10k years - but it doesn't seem like a simple process that predicted either of those numbers could possibly be reliable).

Or maybe your bounds of plausibility are actually 18 months to 150 years. It's not too hard to see how this could cause a ~30 year estimate to be fairly overdetermined due to unconscious bias toward plausible numbers, and more importantly, toward numbers that _seem like they could plausibly be within the bounds of what a model like yours could accurately predict_.

Nobody would *believe* an estimate of less than 10 years without some very plausible and detailed argument about how it happens. Nobody would *care* about an estimate of more than 50-75 years, because most who read it won't live to see it. So...if you're going to produce something that people will read, that will get published and noticed, you pretty much have to come up with a number between 20 and 40 years.

Agreed - what I'm proposing is a mechanism by which the result you describe happens, such that you can't point to obvious signs of fine-tuning within the analysis.

I think it's probably pretty simple. You do your analysis, and if the answer is < 10 years *but* you don't have detailed credible plans, or the answer is >100 years, you just never publish it, because you already know either you'll get fierce blowback or nobody will care. Survivorship bias.
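
The filter described here is easy to simulate. A toy sketch under loudly hypothetical assumptions: raw estimates are spread log-uniformly between 3 months and 10,000 years (the two "nonsense" extremes mentioned upthread), and only answers between 10 and 100 years get published. None of these numbers come from real forecast data.

```python
import math
import random
import statistics


def simulate_published_median(n_estimates=100_000, low=0.25, high=10_000.0,
                              publish_min=10.0, publish_max=100.0, seed=0):
    """Return (share published, median of published estimates) under the toy filter."""
    rng = random.Random(seed)
    log_low, log_high = math.log10(low), math.log10(high)
    # Hypothetical raw estimates: uniform in log10(years), i.e. no preferred scale.
    raw = [10 ** rng.uniform(log_low, log_high) for _ in range(n_estimates)]
    published = [years for years in raw if publish_min <= years <= publish_max]
    return len(published) / len(raw), statistics.median(published)


share_published, published_median = simulate_published_median()
print(f"share published: {share_published:.1%}")
print(f"median published estimate: {published_median:.1f} years")
```

With a scale-free input, the median of what survives a 10-to-100-year filter sits near the geometric mean of the two cutoffs, roughly 32 years, regardless of how wide the raw spread was.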

Except the Cotra report was in some sense pre-registered (in that OpenPhil picked someone and asked them to do a report), so I don't think publication bias can do any work here.

> The median goes from 2052 to about 2050.

I think this is a mistake; the median of the solid black line goes to around 2067, with "chance by 2100" going down from high-70%s to low-60%s.

1. correction to "Also, most of the genome is coding for weird proteins that stabilize the shape of your kidney tubule or something, why should this matter for intelligence?"

source: "at least 82% of all human genes are expressed in the brain" https://www.sciencedaily.com/releases/2011/04/110412121238.htm#:~:text=In%20addition%2C%20data%20analysis%20from,in%20neurologic%20disease%20and%20other

2. The bitcoin network does the equivalent of 5e23 FLOPS (~5000 integer ops per hash and 2e20 hashes per second; assuming 2 integer ops is worth 1 floating point op). This is 6 orders of magnitude bigger than that Japanese supercomputer, because specialized ASICs do a lot more operations per watt than general purpose CPUs. Bitcoin miners are compensated by block rewards at a rate of approximately $375 per second, so that's about 1e21 flops/$. This is 4 orders of magnitude higher than the estimate of 10^17 flops/$. If there were huge economies of scale in producing ASICs specialized for training deep neural nets, we could probably expect the former 1e21 flops/$ at current technology levels. Bitcoin ASICs also seem to still be doubling efficiency every ~1.5 years.

3. correction: "The median goes from 2052 to about 2050"

The median is where cumulative probability is 0.5, and on your graph it's in 2067. If you mean the median of the subset of worlds where we get AI before 2100, then it's a cumulative probability of 0.3 in 2045.

4. The AI arrival estimate regression line's higher slope than Platt's law seems rational, because from an outside view, the longer it's been since the invention of computers without having AGI yet, the longer we should expect it to take. (But on the inside view, this article is making me shift some probability mass from after-2050 to before-2050)

5. Clarification: "human solar power a few decades ago was several orders of magnitude worse than Nature’s"

Photosynthesis is typically <2% efficient, so you seem to be claiming human solar power in 1990 was <0.002% efficient. But this Department of Energy timeline of solar power claims Bell Labs developed at least a 4% efficient solar panel in 1954:
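
Going back to item 2, the Bitcoin arithmetic can be reproduced in a few lines. This sketch only re-runs the figures stated in that item (ops per hash, hash rate, the 2:1 integer-to-float conversion, and the $375 per second reward); it does not check whether those inputs are themselves accurate.

```python
import math

INT_OPS_PER_HASH = 5_000      # stated in item 2
HASHES_PER_SECOND = 2e20      # stated in item 2
INT_OPS_PER_FLOP = 2          # item 2's assumed conversion
REWARD_USD_PER_SECOND = 375   # stated in item 2
REPORT_FLOP_PER_USD = 1e17    # the estimate item 2 compares against

# Network throughput expressed in floating-point-equivalent operations per second.
network_flops = INT_OPS_PER_HASH * HASHES_PER_SECOND / INT_OPS_PER_FLOP

# Operations bought per dollar of block reward.
flop_per_usd = network_flops / REWARD_USD_PER_SECOND

# Gap to the comparison estimate, in orders of magnitude.
orders_above_report = math.log10(flop_per_usd / REPORT_FLOP_PER_USD)

print(f"network: {network_flops:.1e} FLOPS")
print(f"per dollar: {flop_per_usd:.2e} FLOP/$")
print(f"orders of magnitude above 1e17: {orders_above_report:.2f}")
```

The result is a bit over 1e21 per dollar, about four orders of magnitude above 1e17, matching the figures in the comment.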


Bell Labs was awesome, and my grandpa had some good stories about working there in the 50s and 60s while they invented the transistor and the theoretical basis for the laser. I wish a place like that still existed -- I'd send them an application. I tried cold emailing SpaceX and they ignored me.

> "at least 82% of all human genes are expressed in the brain"

Do we have a baseline comparison for that? I'm guessing that 70% of those genes are actually critical for basic eukaryotic cell metabolism.

This paper claims 66% of human genes are expressed in the skin: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4305515/#:~:text=Approximately%2066%25%20of%20all%20protein,13%2C044%20and%2013%2C329%20genes%2C%20respectively.

This paper claims 68% of human genes are expressed in the heart:


This paper claims 68% of human genes are expressed in the kidney:


After looking up these baselines, 82% is less impressive

Also, we share 98.8% of our genes with the chimps, which likely have a significantly lower average intelligence.

I think the hard part in building a human is building a eukaryote, and perhaps a vertebrate. Going from that to mammals is a small step, and finally tuning up the brain size seems like a minor effort.

I have higher confidence that we'll get biological superintelligence by 2050 than that we'll get artificial superintelligence by 2050. China or somebody will be engineering the genetics and environment of babies to make genius the average.

I guesstimate there are 30,000 people with IQs over 160 in the entire world, but genetic engineering could 1000x that and provide much faster progress in science, technology, and AI safety research.

Why do we need *super* intelligence? Just imagine a world in which the average IQ moves up a mere standard deviation, to 115 instead of 100. That would drastically cut the number below 85, which certain people have argued forcefully is where our criminal and dependent class come from. It would mean the typical line worker would be the equivalent of someone who graduates from a good college with excellent marks today -- the kind of person who could be admitted to law or medical school. By contrast, the equivalent of college graduates today, a big slice of our cultural and technological leaders, would become the equivalent of people who can get PhDs in high-energy physics and ancient Greek literature. Imagine choosing your candidate for Senator from among a dozen people who can all speak 3-4 languages fluently, who readily comprehend relativity and quantum mechanics, who have written original research papers on economics.

And then of course our smartest people, the folks who earn faculty positions at Stanford or win Nobel Prizes, would all be Einsteins and Newtons. Imagine a few hundred Einsteins and Newtons, in a world run by people with the smarts of your average engineering professor at Caltech, and with a hundred-million strong workforce as smart as a competent physician with a degree from Columbia, and with an almost complete absence of violent crime, drug addiction, et cetera. That would be stupendous.
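
The size of the effect of a one-standard-deviation shift can be read off the normal distribution. A small sketch, assuming IQ is exactly normal with a standard deviation of 15 (a modelling convention; real tails need not follow it):

```python
from statistics import NormalDist

SD = 15.0
current = NormalDist(mu=100.0, sigma=SD)   # today's assumed distribution
shifted = NormalDist(mu=115.0, sigma=SD)   # the hypothetical world with mean 115


def share_below(dist, threshold):
    """Fraction of the population scoring below the threshold."""
    return dist.cdf(threshold)


def share_above(dist, threshold):
    """Fraction of the population scoring above the threshold."""
    return 1.0 - dist.cdf(threshold)


for label, dist in (("mean 100", current), ("mean 115", shifted)):
    print(label,
          f"below 85: {share_below(dist, 85):.2%}",
          f"above 130: {share_above(dist, 130):.2%}",
          f"above 145: {share_above(dist, 145):.3%}")
```

Under that assumption the share below 85 drops from about 16% to a bit over 2%, while the share above 130 rises from a bit over 2% to about 16%.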

Has anyone compared prediction timelines to the estimated lifetime of the predictor?

I have a vague memory of someone looking at this on a different topic, but I couldn't turn it up in a quick search; the idea is that for [transformative complex development] people have a tendency to predict it late in their life, but within a reasonable margin of having not yet died of old age before it happens.

How many researchers and Metaculus predictors will be 70-80 in 2050, and is their prediction, perhaps unconsciously, really a hope to achieve Virtual Heaven?

Alternatively, what else does Platt's "law" apply to? Aren't flying cars always 20-30 years away? Nuclear fusion? Is this just the standard "close, but not too close" timeline for *any* technological prediction?

> “any AI forecast will put strong AI thirty years out from when the forecast is made.”

There's probably a stronger version of this: any technology that seems plausibly doable but we don't quite know how to do, probably seems about 30 years away.

10 years away is the foreseeable timeline of current prototypes and has relatively small error bars. 20 years away is the stuff that's being dreamed up right now and has larger error bars (innovation is risky!). 30 years away consists of things that will be invented by people who grow up in an environment where current prototype tech is normal and the next gen stuff is just on the horizon.

How these people will think about problems is fundamentally unpredictable. Just think of all the nonsense that was said by computer "experts" in the 60s and 70s prior to the PC.

Calculators are much better than humans, but instead of replacing human calculation ability they enhanced it. Spreadsheets compounded that enhancement. Complex graphing calculators did the same. Sure, calculus was invented (twice!) without them, but the concepts of calculus become accessible to high school students when you include graphing calculators, and statistics become accessible when you load up a spreadsheet and play around with Monte Carlo simulations.

I think what we're missing is how this contributes to Moore's Law of Mad Science. It gives IQ-enhancing tools to the masses. But it's also giving large tech companies tools that might accidentally drive mass movements of hatred, hysteria, and war. And that's just because they don't know what they're doing with it yet. How much worse off will we be when they figure out how to wield The Algorithm effectively? And why are we not talking about THIS massive alignment problem?

What if we destroy ourselves with something else, before we get all the way to AGI? We're already creating intelligence-enhancing tools that regular human operators can't be trusted to handle. Giving god-like power to a machine is certainly terrifying, because I don't know what it might do with that power. But I have some idea what certain people around the world would do with that kind of power, and I'm equally terrified. Especially because those people WANT that power. They're not going to accidentally stumble into it, they're actively trying to cultivate it.

Wasn't there a very short period during which computer+human beat computer alone, after which the humans were useless? Did that period even happen at all in Go, or did we just jump over it?

I agree it's possible we kill each other before AI, though this seems probably to just be normal nuclear/biological weapons. I can't really think of a way AI would be worse than this before superintelligence.

Skeptics of concerns about AGI often point out that there's a difference between domain-specific AI and general intelligence. They claim that the potential for accidentally producing general intelligence on the road to domain-specific intelligence is a hypothesis that may turn out to be more hype than substance.

Whether the hypothesis of accidentally creating AGI is true or not, we're obviously expanding domain-level AI by leaps and bounds. If domain-level AI allows ordinary humans to do the kind of things that we're worried about AGI doing, it seems like the kind of thing we should all be able to agree is a concern - whether it's done by humans or intelligent computers. (For example, blackmailing politicians using deep-fake 'evidence' of transgressions they never committed.)

It also seems like a bridge to the AGI-skeptic community. If they can't accept that AGI is a risk that should be addressed, at least they can appreciate that misaligned technologies have a long history of negative outcomes, and the current development of AI is poised to dump a bunch of new capabilities in our laps. As a bonus, perhaps safeguards on domain-level AI would also help protect against the general problem?

I think an arsenal of domain-specific AIs + a human inside will be a dominant model for decades before we reach AGI.

I'm not sure if replacing the human in the centre with an AGI would be an improvement, given that someone has to be the interface between the scary pile of compute and human goals to make any kind of use of it anyway.

That's what I'm thinking. It also has the benefit of being applicable right now, since we have a lot of alignment problems already. I'm not saying the world would be SAFE with an AGI that has only Stone Age tools at its disposal, but we don't have to make it easy to destroy the world.

Even if we did solve the alignment problem for AGI, if we don't ALSO solve the alignment problem for non-general AI we're still creating tools that have severe civilization-disrupting potential. (cf the current covert cyber war among the US, Russia, China, Israel, etc.)

There was a period where human+computer beat computer alone, but it was short and ended a few years ago.

However, I think two things are important to notice: 1) humans have improved since Deep Blue beat Kasparov. I would bet that Carlsen would beat Deep Blue rather easily, and may even win against 2005 top level engines.

2) I may be mistaken, but it seems that neural networks play chess in a way that is much more human-like than classical engines - it's easier for us to understand the motivations of their moves despite them being stronger. I don't know how to interpret that but I find it interesting.

I think I'm in the "this report is garbage and you should ignore it completely" camp (even though I have great respect for Ajeya Cotra and the report is probably quite well done if you apply some measure that ignores the difficulty of the problem). You basically have

- Extreme uncertainty about many aspects within the model, as admitted by Cotra herself

- Strong reasons to suspect that the entire approach is fundamentally flawed

- Massive (I'd argue) potential for other, unknown out-of-model errors

I think I give even less credit to it than Eliezer in that I don't even believe the most conservative number is a valid upper-bound.

SEPARATELY, I do just want to say this somewhere. Eliezer writes this post calling the entire report worthless. The report nonetheless [does very well in the 2020 review](https://www.lesswrong.com/posts/TSaJ9Zcvc3KWh3bjX/voting-results-for-the-2020-review), whose voting phase started after Eliezer's post was published, and it wins the alignment forum component in a landslide. Afaik I was literally the only person who gave the post a negative score. So can we all take a moment to appreciate how not-cultish the community seems to be?

> I think I'm in the "this report is garbage and you should ignore it completely" camp

Then I wonder what you'll think of my take downthread.

> So can we all take a moment to appreciate how not-cultish the community seems to be?

In the sense that LWers feel free to disagree with Eliezer? I do appreciate that.


> In the sense that LWers feel free to disagree with Eliezer?


> Then I wonder what you'll think of my take downthread.

Full agreement with the first, descriptive part (doesn't seem like you said anything speculative), mild disagreements (but none I feel the need to bring up) with the last four paragraphs.

Feb 24, 2022·edited Feb 24, 2022

I'd be curious to see (if anyone has any resources) the historical split and trend over time of compute costs, broken down into the following three components:

- Chip development costs/FLOP.

- Chip production costs/FLOP.

- Chip running costs/FLOP (probably primarily electrical costs now).

I ask in relation to a concern with extrapolating historical rates of cost declines going forward. It's possible that the components of cost with the most propensity to be reduced will become an increasingly small share of cost over time. As such, the costs that remain may be increasingly difficult to reduce. This is a low-confidence idea as I don't know a ton about chip design, and there are plenty of reasons why extrapolating from the general trend might be right (e.g. perhaps as something becomes an increasing component of cost we spend more effort to reduce it).
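To make the concern concrete, here is a minimal sketch of the arithmetic. Everything in it is an assumption chosen for illustration: the function name `total_cost_path`, the 80/20 starting split, and the 40%/year and 5%/year decline rates are hypothetical, not figures from the report or from real chip data. It shows only the mechanism: when one cost component falls quickly and another falls slowly, the slow component takes over the total and the overall decline rate drops.

```python
def total_cost_path(reducible, sticky, reducible_decline, sticky_decline, years):
    """Return a list of (year, total_cost, sticky_share) tuples.

    `reducible` and `sticky` are the two starting cost components;
    each shrinks by its own fixed fraction per year.
    """
    path = []
    for year in range(years + 1):
        total = reducible + sticky
        path.append((year, total, sticky / total))
        reducible *= 1 - reducible_decline
        sticky *= 1 - sticky_decline
    return path


# Hypothetical split: 80 units of cost fall 40% per year, 20 units fall 5% per year.
path = total_cost_path(80.0, 20.0, 0.40, 0.05, 10)

for (year, total, share), (_, next_total, _) in zip(path, path[1:]):
    yearly_drop = (total - next_total) / total
    print(f"year {year}: total={total:6.2f}  sticky share={share:.0%}  "
          f"drop over next year={yearly_drop:.0%}")
```

Under these made-up numbers the first-year drop in total cost is about a third, while by the end of the decade it is in the single digits, even though neither component's own rate of improvement has changed.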

That said, it would be interesting to see whether extrapolating future cost reductions from past ones would have performed well in other industries with longer histories. I.e., how has the real cost of steel or electricity gone down, and how has the share of costs from different inputs changed?

Totally separately, should we expect the rate of algorithm development and learning to decline as the cost of training single very large models and then evaluating their performance increases drastically? My intuition is that as the cost of iteration and learning increases (and the number of people with access to sufficient resources decreases) we should expect a larger proportion of gains to come from compute advances as opposed to algorithm design, but this is something I have close to 0 confidence in.

Yes to this, and also add to the list the up-front cost of building the fab for the clever chip you've just developed. You could fold that into "chip development" or "chip production", but the fixed cost of the fab is a different kind of thing than the brainwork of developing the chip or the marginal unit cost once the plant is running, and so may be subject to different scaling.


John, I agree - I was somewhat folding fab cost into "production" insofar as at the margin a doubling of chip production would require a doubling of fab construction for any generation of chip (I realize for each manufacturer, fabs are not a marginal cost, but long term for the industry they seem to scale proportionally to # of chips in a way that R&D does not). Happy to be corrected if that is wrong.

I am curious if you have a different sense, but my instinct (uninformed by data) is that energy/electricity costs have been becoming an increasingly large proportion of compute costs. I think it is very unlikely that electricity prices follow quite the exponential decay implied by the paper, but I suppose it is possible that efficiency increases exponentially with a 2-5 year doubling, given the brain operates on only 20W. Do you have any expertise that could falsify either of the above?


"For the evil that was growing in the new machines, each hour was longer than all the time before."

My hunch is that Eliezer is right about the problem being dominated by paradigm shifts, but that they usually involve us realising how much more difficult AGI is than we thought, moving AGI another twenty-odd years out from the time of the paradigm shift. A bit like Zeno's paradox, except the turtle is actually 100 miles away and Achilles just thinks he is about to catch up.

That being said I am bullish on transformative AI coming within the next 20 years, just not AGI.


> and other bit players

I think this should be "big players"

Something to consider is that there isn't yet the concept of agency in AI and I'm not certain anybody knows how to provide it. The tasks current impressive production AI systems do tend to be of the "classify" or "generate more like this" categories. Throwing more compute/memory/data at these systems might get us from "that's a picture of a goldfish" to "that's a picture of a 5-month old goldfish who's hungry", or from what GPT-3 does to something that doesn't sound like the ravings of an academic with a new-onset psychiatric condition.

None of these have the concept of "want".

I think the relevant property of "agency" is "running a search", which AlphaZero already does. (Though GPT-3 does not.) The reason it doesn't feel like an agent to you is that the domain is narrow, but there is no qualitative thing missing.

Feb 25, 2022·edited May 16, 2022

Feb 25, 2022

My model: current AIs cannot scale up to be AGIs, just as bicycles cannot scale up to be trucks. (GPT2 is a [pro] bicycle; GPT3 is a superjumbo bicycle.) We're missing multiple key pieces, and we don't know what they are. Therefore we cannot predict when exactly AGIs will be discovered, though "this century" is very plausible. The task of estimating when AGI arrives is primarily a task of estimating how many pieces will be discovered before AGI is possible, and how long it will take to find the final piece. The number of pieces is not merely unpredictable but also variable, i.e. there are many ways to build AGIs, and each way requires a different set of major pieces, and each set has its own size.

Also: State-of-the-art AGI is never going to be "as smart as a human". Like a self-driving car or an AlphaStar, AIs that come before the first AGI will be dramatically faster and better than humans in their areas of strength, and comically bad or useless in their areas of weakness.

At some point, there will be some as-yet unknown innovation that turns an ordinary AI into an AGI. After maybe 30,000 kWh of training (give or take an OOM or two), it could have intelligence comparable to a human *if it's underpowered*: perhaps it's trained on a small supercomputer for a while and then transitioned to a high-end GPU before we start testing its intellect. Still, it will far outpace humans in some ways and be moronic in other ways, because in mind-design-space, it will live somewhere else than we do (plus, its early life experience will be very different). Predictably, it will have characteristics of a computer, so:

- it won't need sleep, rest or downtime (though a pausing pruning process could help). In the long run this is a big deal, even if processing power isn't scaled up.

- it will do pattern-matching faster than humans, but not necessarily as well

- it will have a long-term memory that remembers the things it is programmed to remember very accurately, while, in some cases, completely forgetting things it is not programmed to remember

- if it saves or learns something, it does so effortlessly, which should let it do things that humans find virtually impossible (e.g. learning all human languages, and having vast and diverse knowledge). Note: unlike in a human, in a computer, "saving" and "learning" information are two fundamentally different things; well, humans don't really do "saving".

- it will lack humanlike emotions, have limited social intelligence, and will predict human behavior even less reliably than we do, though in time, with learning, it'll improve

- Edit: for all the ink spilled on the illegibility of neural networks, AGIs are, for several reasons, much more legible than human neural brains, and therefore much easier to improve.

- it will have the ability to process inputs quickly and with low latency, and more crucially, produce outputs very rapidly and with a lower noise/error rate than a human can. This latter ability will make it possible (if its programmers allow) for it to write software that runs on the same machine, and to communicate with that software much faster than any human can communicate with a computer. If it's smart enough to write software, it will use this ability to augment its mental abilities in ways that can make it eventually superhuman in some ways, without increasing available computational power.

It's easy to think of examples of that last point, just by thinking about games. For instance, those games where you are given six letters and have to spell out as many words as you can think of? An AGI can simply write a program *within its own mind* to find all the answers, allowing it to quickly surpass human performance. Or that game where you repeatedly match three gems? Probably the AGI's neural net architecture can do that pretty well, but again it could write a program to do even better. Sudoku? No problem.
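As a rough illustration of how little code the six-letter word game needs, here is a minimal sketch. The function name `spellable_words` and the nine-word `TINY_DICTIONARY` are made up for the example; a real solver would load a full word list, and nothing here says how an AGI would actually write or run such a helper.

```python
from collections import Counter


def spellable_words(letters, dictionary):
    """Return the dictionary words that can be spelled from `letters`,
    using each supplied letter at most once, longest words first."""
    available = Counter(letters.lower())
    found = []
    for word in dictionary:
        needed = Counter(word.lower())
        # A word qualifies if it never needs more copies of a letter than we hold.
        if all(available[ch] >= count for ch, count in needed.items()):
            found.append(word)
    return sorted(found, key=lambda w: (-len(w), w))


# Hypothetical stand-in for a real word list.
TINY_DICTIONARY = ["stream", "master", "tamers", "steam", "mates",
                   "smart", "tree", "rest", "eat"]

result = spellable_words("stream", TINY_DICTIONARY)
print(result)
```

With the toy list above, "tree" is rejected because the rack holds only one "e", and everything else is accepted. The whole search is a single pass over the dictionary, which is the sense in which a program can quickly surpass human performance at this kind of game.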

So at this point, the AGI should be able to outclass humans in various solitaire games, but might have limited talent in real-world tasks like fixing cars, or discovering the Pythagorean theorem, or even reading comprehension. But we can hugely increase its intelligence simply by giving it more compute, at which point it can quickly become smarter than every human in every way, and the AGI alignment problem potentially becomes important.

If we're lucky, the first AGI will do a relatively poor job at certain tasks, such as abstract reasoning on mental models, concept compression, choosing priorities of mental processes, synthesizing its objective function (or motivational system) with reality, and looking at problems from a variety of perspectives / at a variety of levels of abstraction. Handicaps in such areas could make it a poor engineer/scientist, which is good in the sense that it's safer. Such an AGI would be likely to have difficulty doing risky actions like improving its own design, or killing everyone, even if it has a whole datacenter-worth of compute.

If we're not lucky, we get the kind of AGI Eliezer worries about. I think we're going to be lucky, because Reality Has A Surprising Amount Of Detail. But the possibility of being unlucky has a high enough probability (1%?) that AI safety/alignment research should be well-funded. Edit: Plus, in the 99% case, I would raise my probability estimate of near-term catastrophe immediately after the first AGI appears, so it's good to get started on safety work early.

Funny thing is, I'm no AGI expert, just an aspiring-rationalist software developer. Yet I feel mysteriously confident that some of these AI experts are off the mark in important ways. The bicycle/truck distinction is one way.

Another way is that I think the trend toward more expensive supercomputer models is likely to reverse very soon, especially for those who make real progress in the field. Better compute enabled recent leaps in performance, but now that it has been proven that AIs can beat any human at Go and Starcraft, the prestige is harvested, and I don't see much reason to build more expensive models. It's a bit like how we moved from 8-bit CPUs all the way up to 64-bit CPUs, and then stopped adding more bits (apart from e.g. SIMD) because there just wasn't enough benefit. To the contrary, cheaper models are cheaper, so they enable a lot more experimentation and research by non-elites. It might well be that teenage tinkerers (at home, with monster gaming rigs) discover key pieces of the first AGI.

Did we add pieces to GPT-2 in order to add the few-shot learning / promptability possessed by GPT-3? My understanding is no, it was emergent behavior caused by scaling.

Do we have a theory of what types of behavior can and can't emerge due to scale?

Feb 26, 2022

As far as I know, GPT2 does "few shot learning" in the same sense as GPT3, but they didn't publish a paper on it, and GPT3 does it substantially better.

Edit: and I think people misunderstand GPT in general, because to humans, words have meanings, so we hear GPT speak words and we think it's intelligent. I think the biggest GPT3 is only intelligent in the same sense as a human's linguistic subsystem, and in that respect it's a superintelligence: so far beyond any human that we mistake it for having general intelligence. But I'm pretty sure GPT3 doesn't have *mental models*, so there are a great many questions it'll never be able to answer no matter how far it is scaled up (except if it's already seen an answer that it can repeat).



I looked, but couldn't find a publicly available GPT2 that seemed to use a big model like the one that wrote decent J.R.R. Tolkien. Oddly, none of the public-facing models I saw disclose the model size or training info.

All right.

I like what the report is saying (not that I've read it, just going off Scott's retelling of its main points), and it's reassuring me that the people working on it are competent and take every currently recognizable factor of difficulty into account.

I nevertheless think it's erring in the exact direction all earlier predictions were erring, which is the exact opposite of where Eliezer thinks it's erring. I.e., they understand and price in the currently known obstacles and challenges on the road to AGI; they do not, because they cannot, price in the as-of-yet unknown obstacles that will only make themselves apparent once we clear the currently pertinent ones. E.g., you can only assume power consumption is the relevant factor if you completely disregard the difficulty and complexity of translating that power into (more relevant) computational resources. Then, with experience, you update to thinking in terms of computational resources, until you get enough of them to finally start working on translating them into something even more relevant, at which point you update to thinking in whatever measures the even more relevant thing. (Or don't update, and hope the newly discovered issues will just solve themselves, but there's little reason to listen to you until you actually provide a solution to them.)

(Bonus hot take: This explains the constant 30 years horizon, it's some stable limit of human imagination vis-a-vis the speed of technological progress. We can only start to perceive new obstacles when we're 30 years away from overcoming them.)

We don't know whether we'll encounter any new obstacles or, if so, what they will be, but allow me to propose one obvious candidate: environment complexity.

The entire discussion, as presented in the article, is based around advances in games like chess (8x8 board and simple consistent rules), Go (19x19 board and simple consistent rules), or StarCraft (well, much more complex, but still a simple, granular, 2D plane with simple consistent rules). (I'm ignoring GPT and the like, because they simply aren't ever reliably performing human tasks.) None of those tells us much about performance in the real world (infinite universe with complex slash unknown slash ever-changing rules). Assuming computational resources are the only relevant factor may be (and, I believe, is) completely ignoring the problem of the data necessary to train an AI that is capable of interacting with reality as well as a human does. The relevant natural science analogy may yet turn out to be not "10^41 FLOP/S", but "a billion of years of real-time training experience". We will, of course, be able to bring that number significantly down, but to 30 years? I'm extremely skeptical.

(Bonus hot take: The first AGI takeover will literally be thwarted by it not understanding the power of love, or friendship, or some equally cheesy miscalculation about human behavior, which it will have failed to adequately grasp.)

tl;dr: The report is by necessity overly optimistic (pessimistic if you think AGI means end of humanity), but constitutes a useful lower bound. Eliezer is not even wrong.


Coming up with a complex environment isn't that hard. So firstly, we have a lot of games, and a lot of other software. Set up a modern computer with a range of games, office applications and a web browser, and let the AI control the mouse+keyboard input and see the screen output. That's got a lot of complexity. (It also gives the AI unrestricted internet access, which is not a good idea.)

"a billion of years of real-time training experience"

More like 20 years of training experience and a handful of basic principles of how to learn.

If you had an AIXI level AI, just about any old data will do. There is a large amount of data online. Far far more than the typical human gets in their training. (Oh and the human genome is online too, so if it contains magic info, the AI can just read it)

People have been picking games like chess mostly because their AI techniques can't handle being dumped into the real world. Giving an AI a heap of real world complexity isn't hard.


Small mistake, "he upper bound is one hundred quadrillion times the upper bound." should be "he upper bound is one hundred quadrillion times the lower bound."



These projections ignore the "ecology" (or "network" if you prefer). Humans individually aren't very smart, their effective intelligence resides in their collective activity and their (mostly inherited) collective knowledge.

If we take this fact seriously we will be thinking about issues that aren't discussed by this report, Yudkowsky, etc. For example:

- What level of compute would it take to replicate the current network of AI researchers plus their computing environment? That's what would be required to make a self improving system that's better than our AI research network.

- What would "alignment" mean for a network of actors? What difference does it make if the actors include humans as well as machines?

- Individual actors in a network are independently motivated. They are almost certainly not totally aligned with each other, and very possibly have strongly competitive motivations. How does this change our scenarios? What network & alignment structures produce better or worse results from our point of view?

- A network of actors has a very large surface area compared to a single actor. Individual actors are embedded in an environment which is mostly outside the network and have many dependencies on that environment -- for electric power, security, resources, funding, etc. How will this affect the evolution and distribution of likely behaviors of the network?

I hope the difference in types of questions is obvious.

Some objections:

- But AlphaZero! Reply: Individuals aren't very intelligent and chess is a game played by individuals. AlphaZero can beat individual humans, not a big deal.

- But AlphaFold! The success of AlphaFold depends on knowledge built up by the network. AlphaFold can better utilize this knowledge than any individual human, again no big deal. AlphaFold can't independently produce new knowledge of the type it needs to improve. However, AlphaFold *does* increase the productivity of biochemical research, which will greatly increase the rate of progress of the network, and will feed back to some degree to AlphaFold.

- But GPT-3! This is a great example of consolidating and using collective knowledge from the corpus -- and it helps us understand how much knowledge is embedded implicitly in the corpus. On the other hand, we haven't seen any AI that generates a significant net increase in the collective knowledge of our corpus. This will come, but AIs will only increase our collective knowledge incrementally to begin with.

- But FOOM! This would require replicating the whole research endeavor around AI and probably a lot more -- maybe much of the culture and practice of math, which is very much a collective endeavor, for example. Not going to happen quickly just because one machine gets a few times more intelligent than a single AI researcher.



I'm trying to read that linked article by Eliezer now and holy crap, he could really use an editor that would tell him to cut out half of the text and maybe stop giving a comprehensive, wordy introduction to Eliezerism at the beginning of every text he writes.


Another thought: Platt's Law is about the size of "a generation", so Platt's Law-like estimates could be seen as another way of looking around and saying "it won't be THIS generation that figures it out".

Final thought: it seems to me that if you're going to take the "biological answer" approach, it would make more sense to look at how evolution got us here vs. how we're working on getting AI to a human level of performance. How many iterations did it take, and how "powerful" was each iteration, for evolution to arrive at humans? How many iterations, and how "powerful" an iteration, has it taken for us to get from an AI as smart as an algae to whatever we have now?


Isn't 10% missing from her weighting of the 6 models?

Not sure it makes much of a difference though


No, we are not going to be able to use ML to figure out the wiring based on some training set. We can probably get such a system to do something interesting, but it isn't going to be thinking like a human. ML algorithms are just not robust enough. Visual recognition algorithms fall apart if you change a handful of pixels, and even things like AlphaFold collapse if you vary the amino acid sequence. Sure, people fall apart too, but an AI that can't tell a hat from a car if a couple of visual cells produce bogus outputs isn't behaving like a human.

Then, there's all the other stuff the brain does, and it's not just the nerve cells. The glial cells and astrocytes do things that we are just getting a glimpse of. It's not like brains don't rewire themselves now and then. There's Hebb's rule: neurons that fire together, wire together, and we barely have a clue of how that works at a functional level, so good luck simulating it.

Closer to home, the brain is full of structures that embed assumptions about how an animal needs to process information to survive and reproduce. The thing is that we don't know what all of these structures are and what they do. Useful ML algorithms also embed assumptions. Rodney Brooks pointed out that the convolution algorithms used in ML object identification algorithms embed assumptions about size and location invariance. ML systems don't learn that from training sets. People write code that moves a recognition window around the image and varies its size. (Brooks has been a leader in the AI/ML world since the 1980s, and his rodneybrooks.com blog is full of good informed analysis of the field and its capabilities.)
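For readers who haven't seen it, the "moving window" idea looks roughly like the sketch below. This is a toy written for this comment, not Brooks's code and not how a real convolutional layer is implemented; `sliding_windows`, `detect`, and the `all_ones` classifier are hypothetical names. The point is only that the loop over positions and sizes is written by a person, so the invariance comes from the code rather than from the training data.

```python
def sliding_windows(width, height, sizes, stride):
    """Yield (x, y, size) for every square window that fits in the image."""
    for size in sizes:
        for y in range(0, height - size + 1, stride):
            for x in range(0, width - size + 1, stride):
                yield x, y, size


def detect(image, classifier, sizes=(2, 3), stride=1):
    """Apply one fixed classifier to every window position and size."""
    height, width = len(image), len(image[0])
    hits = []
    for x, y, size in sliding_windows(width, height, sizes, stride):
        patch = [row[x:x + size] for row in image[y:y + size]]
        if classifier(patch):
            hits.append((x, y, size))
    return hits


def all_ones(patch):
    """Toy stand-in for a learned recognizer: fires on a fully lit patch."""
    return all(pixel == 1 for row in patch for pixel in row)


image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

hits = detect(image, all_ones)
print(hits)
```

The classifier itself knows nothing about where the bright square is or how big it might be; the hand-written loops supply that.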

Maybe I'm too cynical, but I'll go with Brooks' NIML, not in my lifetime.


My one comment would be that if you're going to use biological anchors, it would be wise to include a biologist in the discussion. This whole debate feels like it only involves AI experts, focused on machine learning.

There is a whole science devoted to brain biology, where they create electrical models of human brains and synapses and dendrites and so on. Many dendrites and axons connect to hundreds of other neurons, analogous to neural nets in a way, but different parts of the brain have different types of neurons.

Also relevant is the fact that the vast majority of the FLOPs the brain does have nothing whatsoever to do with "general intelligence." Rather, they are about maintaining bodily functions, regulating emotions, etc. An AGI doesn't need to worry about any of this, just the cognitive parts.

I am not the right expert, but it feels like getting the right expert involved would be worthwhile - if you want to base your AGI prediction on the FLOP/s of a human brain, at least make a bit more effort to get that one (somewhat tangible) number accurately.
