385 Comments

> “Oh, thank God! I thought you’d said five million years!”

That one has always tickled me too.

I thought of it when a debate raged here about saving humanity by colonizing other star systems. I’d mentioned the ‘No reversing entropy’ thing and the response was: “We’re just talking about the next billion years!”

Expand full comment

> Bartlett agrees this is worth checking for and runs a formal OLS regression.

Minor error, but I'm Barnett.

Expand full comment

Another minor error: I believe Carl Shulman is not 'independent' but still employed by FHI (albeit living in the Bay Area and collaborating heavily with OP etc).

Expand full comment

Also pretty sure Carl is no longer living in the Bay Area, but in Reno instead (to the Bay Area's great loss)

Expand full comment

Another minor error: Transformative AI is "AI that precipitates a transition comparable to (or more significant than) the agricultural or industrial revolution".

This is easy to fix, Scott, and about as long as your original description + its witty reply.

Expand full comment
author

Sorry, fixed.

Expand full comment

Sure, but what does Bartlett think?

Expand full comment

Another minor error: quoting on Mark Xu's list

Expand full comment

That last graph may be a heck of a graph, but I have no idea what it depicts. Could we have a link to the source or an explanation, please?

Expand full comment

Without having explicitly confirmed it at the source, it appears to be a graph of chess program performance versus computational power, for multiple models over time.

The Y-axis is chess performance measured using the Elo system, which is a way of ranking performers by a relative standard. Beginner humans are <1000, a serious enthusiast might be >1500, grandmaster is ~2500, and Magnus Carlsen peaked at 2882.

The X-axis is how much compute ("thinking time") each model was allowed per move. This has to be normalized to a specific model for comparisons to be meaningful (SF13-NNUE here) and I'm just going to trust it was done properly, but it looks ok.

The multiple lines are each model's performance at a given level of compute. There are three key takeaways here: 1) chess engines are getting more effective over time even when allowed the same level of compute, 2) each model's performance tends to "level out" at some level of allocated resources, and 3) a lot of the improved performance of new models comes from being able to usefully utilize additional resources.

That's a big deal, because if compute keeps getting cheaper but the algorithms can't really leverage it, you haven't done much. But if ML folks look at the resources thrown at GPT-3 and say "the curve isn't bending!" it could be a sign that we can still get meaningful performance increases from moar power.
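
For what it's worth, here's a minimal Python sketch (illustrative numbers only, not taken from the chart) of the "relative standard" point: Elo converts a rating gap into an expected score, and the absolute level drops out, which is why a +100 jump at 3000 is the same head-to-head edge as a +100 jump at 2000.

```python
# Minimal sketch of the Elo expected-score formula; ratings below are illustrative.
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (win probability plus half the draw probability) of A vs. B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

for r_a, r_b in [(2100, 2000), (3100, 3000), (2882, 2500)]:
    print(f"{r_a} vs {r_b}: expected score {elo_expected_score(r_a, r_b):.2f}")
# 2100 vs 2000 and 3100 vs 3000 both give ~0.64: only the gap matters.
```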

Expand full comment

many thanks!

Expand full comment

Scott seems to take from this graph that it supports the "algorithms have a range of compute where they're useful" thesis. But I see it as opposing that.

First, the most modern algorithms are doing much better than the older ones *at low compute regimes* so the idea that we nearly immediately discover the best algorithms for a given compute regime once we're there appears to be false - at least we didn't manage to do that back in 1995.

Second, the regimes where increased computation gives a benefit to these algorithms seem pretty stable. It's just that newer algorithms are across-the-board better. I guess it's hard to compare a 100 ELO increase at 2000 ELO to a 100 ELO increase at 3000 ELO, but I don't really see any evidence in the plot that newer algorithms *scale* better with more compute. If anything, it's that they scale better at low compute regimes, which lends itself more to a Yudkowskian conclusion.

Am I misinterpreting this?

Expand full comment

I agree with you. If it were really the case that "once the compute is ready, the paradigm will appear", I would expect to see all of the curves on this graph intersect each other, with each engine having a small window in which it dominates in Elo, roughly corresponding to the power of computers at the time it was made.

Expand full comment

I'd expect that the curves for, say, image recognition tasks, *would* intersect, particularly if the training compute is factored in.

But the important part this graph shows is: the difference between algorithms isn't as large as the difference made by compute (although the relative nature of ELO makes this less obvious).

Expand full comment

I think those algorithms have training baked in, so a modern trained net does really well even with low compute (factor of 1000 from hardware × software), but the limit on how good an algo you could train was a lot lower in the past (factor of 50 from software alone).

Expand full comment

> But if ML folks look at the resources thrown at GPT-3 and say "the curve isn't bending!" it could be a sign that we can still get meaningful performance increases from moar power.

I don't follow the space closely, but I think this is exactly what ML folks are saying about GPT-3.

Expand full comment

Basically a Gwern quote IIRC, but I wouldn't hold him responsible for my half-rememberings!

Expand full comment

It seems easier to just have children.

Expand full comment

This made me laugh

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

If you think about it long enough, it should.

When we say we want AIs, what we are really saying is that we want an AI that is better than humans, not just an AI. But there are geniuses being born every day.

But what we really want is to understand consciousness and to solve particular problems faster than we can at the moment.

We wanted to fly like the birds, but we did not really invent an artificial bird. We wanted to work as hard as a horse, but did not invent an artificial horse.

The question of consciousness is a legitimate and important question.

Expand full comment

I think this is an important point. Doing basic research in AI as a way to understand NI (natural intelligence) makes enormous sense: we understand almost nothing about how our mind works, and if we understood much more we could (one hopes) make enormous strides in education, sociology, functional political institutions, the treatment of mental illness, and the improvement of life for people with mental disabilities (through trauma, birth, or age). We could also optimize the experience and contributions of people who are unusually intelligent, and maybe figure out how to boost our own intelligence, via training or genetic manipulation. Exceedingly valuable stuff.

But as a technological end goal, an actual deployed mass-manufactured tool, it seems highly dubious. There are only three cases to consider:

(1) We can build a general AI that is like us, but much dumber. Why bother? (There's of course many roles for special-purpose AIs that can do certain tasks way better than we can, but don't have our general-purpose thinking abilities.)

(2) We can build a general AI that is like us, and about as smart. Also seems mostly pointless, unless we can do it far cheaper than we can make new people, and unless it is so psychologically different it doesn't mind being a slave.

(3) We can build a general AI that is much smarter than us. This seems a priori unlikely, in the sense that if we understood intelligence sufficiently well to do this, why not just increase our own intelligence first? Got to be easier, since all you need to do is tweak the DNA appropriately. And even if we could build one, why would we want to either enslave a hyperintelligent being or become its slaves, or pets? Even a bad guy wouldn't do that, since a decent working definition of "bad guy" is "antisocial who doesn't want to recognize any authority" and building a superintelligent machine to whom to submit is rather the opposite of being a pirate/gangster boss/Evil Overlord.

I realize plenty of people believe there is a case (2b): we can build an AI that is about as smart as us, and then *it* can rebuild itself (or build another AI) that is way smarter than us, but I don't believe in this bootstrapping theory at all, for the same reason I find (3) dubious a priori. The idea that you can build a very complex machine without any good idea of how it works seems silly.

Expand full comment

>The idea that you can build a very complex machine without any good idea of how it works seems silly.

But that's essentially what ML does. If there was a good idea of how a solution to a given problem works, it would be implemented via traditional software development instead.

Expand full comment
Feb 25, 2022·edited Feb 25, 2022

I disagree. I understand very well what a ML program does. I may not have all the details at my fingertips, but that is just as meaningless as the fact that I don't know where each molecule goes when gasoline combusts with oxygen. Sure, there's a lot of weird ricochets and nanometer-scale fluctuations that go on about which I might not know, absent enormous time and wonderful microscopes -- but saying I don't know the details is nowhere near saying I don't know what's going on. I know in principle.

Same with ML. I may not know what this or that node weight is, and to figure out why it is what it is, i.e. trace it back to some pattern in the training data, would take enormous time and painstaking not to say painful attention to itsy bitsy detail, but that is a long way from saying I don't know what it's doing. I do in principle.

I'll add this dichotomy has existed in other areas of science and technology for much longer, and it doesn't bother us. Why does a particular chemical reaction happen in the pathway it does, exactly? We can calculate that from first principles, with a big enough computer to solve a staggeringly huge quantum chemistry problem. But if you wanted to trace back this wiggle in the preferred trajectory to some complex web of electromagnetic forces between electrons, it would take enormous time and devotion to detail. So we don't bother, because this detail isn't very important. We understand the principles by which quantum mechanics determines the reaction path, and we can build a machine that finds that path by doing trillions of calculations which we do not care to follow, and maybe the path is not what raw intuition suggests (which is why we do the calculation at all, usually), but at no point here do we say we do not *understand* why the Schroedinger equation is causing this H atom to move this way instead of that. I don't really see why we would attribute some greater level of mystic magic to a neural network pattern-recognition algorithm.

Expand full comment

>...but that is a long way from saying I don't know what it's doing. I do in principle.

Knowing in principle seems like a much lower bar than having a good idea how something works.

>I don't really see why we would attribute some greater level of mystic magic to a neural network pattern-recognition algorithm.

Intelligence is an emergent phenomenon (cf., evolution producing hominid minds), so what magic do you see being attributed beyond knowledge of how to build increasingly complex pattern-recognition algorithms?

Expand full comment

That's not what ML does. ELI5, ML is about as well understood as the visual cortex, it's built like a visual cortex, and it solves visual cortex style problems.

People act like just because each ML model is too large and messy to explain, all of ML is a black box. It's not. Each model of most model classes (deep learning, CNN, RNN, gbdt, whatever you want) is just a layered or otherwise structured series of simple pattern recognizers, each recognizer is allowed to float towards whatever "works" for the problem at hand, and all the recognizers are allowed to influence each other in a mathematically stable (ie convergent) format.

End result of which is you get something that works like a visual cortex: it has no agency and precious little capacity for transfer learning, but has climbed the hill to solve that one problem really well.

This is a very well understood space. It's just poorly explained to the general public.
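
To make that concrete, here's a toy sketch in Python/NumPy of a "layered series of simple pattern recognizers" (layer sizes and random weights are placeholders for illustration, not any particular production model); training would just nudge the weights toward whatever "works" for the task.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x: np.ndarray, weights: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """One bank of simple pattern recognizers: weighted match scores + ReLU threshold."""
    return np.maximum(0.0, x @ weights + bias)

# Random weights stand in for what gradient descent would learn from data.
w1, b1 = rng.normal(size=(784, 128)), np.zeros(128)  # pixels -> low-level patterns
w2, b2 = rng.normal(size=(128, 10)), np.zeros(10)    # patterns -> class scores

x = rng.normal(size=(1, 784))             # a fake flattened 28x28 "image"
scores = layer(layer(x, w1, b1), w2, b2)  # recognizers stacked on recognizers
print(scores.shape)                       # (1, 10): one score per candidate class
```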

Expand full comment

My initial objection to Carl was based on a difference of opinion about what constitutes a "good idea of how it works". You appear to share his less-restrictive understanding of the phrase.

N.B., I am a working data scientist who was hand coding CV convolutions two decades ago.

Expand full comment

> This seems a priori unlikely, in the sense that if we understood intelligence sufficiently well to do this, why not just increase our own intelligence first? Got to be easier, since all you need to do is tweak the DNA appropriately.

I think this is mistaken. For reasons that Scott has talked about elsewhere, the fact that we aren't *already* smarter suggests that we're near a local optimum for our physiology / brain architecture / etc, or evolution would have made it happen; eg it may be that a simple tweak to increase our intelligence would result in too much mental illness. Finding ways to tweak humans to be significantly smarter without unacceptable tradeoffs may be extremely difficult for that reason.

On the other hand, I see no a priori reason that that local optimum is likely to be globally optimal. So conditional on building GAI at all, I see no particular reason to expect a specific barrier to increasing past human-level intelligence.

Expand full comment

Oh I wouldn't disagree that it's likely to be hard to increase human intelligence. Whether what we mean by "intelligence" -- usually, purposeful conscious reasoning and imagination -- has been optimized by Nature is an interesting and unsolved question, inasmuch as we don't know whether that kind of intelligence is always a survival advantage. There are also some fairly trivial reasons why Nature may not have done as much as can be done, e.g. the necessity for having your head fit through a vagina during birth.

But yeah I'd take a guess that it would be very hard. I only said that hard as it is, building a brand-spanking new type of intelligence, a whole new paradigm, is likely to be much harder.

Anyway, if we take a step back, the idea that improving the performance of an engine that now exists is a priori less likely than inventing a whole new type of engine is logically incoherent.

Expand full comment

"if we understood intelligence sufficiently well to do this, why not just increase our own intelligence first?"

Because the change is trivial in computer code, but hard in DNA.

For example, maybe a neural structure in 4d space works really well. We can simulate that on a computer, but good luck with the GM.

Maybe people do both, but the human takes 15-20 years to grow up, whereas the AI "just" takes billions of dollars and a few months.

Because we invented an algorithm that is nothing at all like a human mind, and works well.

Expand full comment

That would be convincing if anyone had ever written a computer code that had even the tiniest bit of awareness or original thought, no matter how slow, halting, or restricted in its field of competence. I would say that the idea that a computer can be programmed *at all* to have original thought (or awareness) is sheer speculation, based on a loose analogy between what a computer does and what a brain does, and fueled dangerously by a lot of metaphorical thinking and animism (the same kind that causes humans to invent local conscious-thinking gods to explain why it rains when it does, or eclipses, or why my car keys are always missing when I'm in a hurry).

Expand full comment

Deep Blue can produce chess moves that are good, and aren't copies of moves humans made. GPT3 can come up with new and semi-sensible text.

Can you give a clear procedure for measuring "Original thought"?

Before Deep Blue, people were arguing that computers couldn't play chess because it required too much "creative decision making" or whatever.

I think you are using "Original thought" as a label for anything that computers can't do yet.

You have a long list of things humans can do. When you see a simple dumb algorithm that can play chess, you realize chess doesn't require original thought, just following a simpleish program very fast. Then GPT3 writes kind of ok poetry and you realize that writing ok poetry (given lots of examples) doesn't require original thought.

I think there is a simplish program for everything humans do, we just haven't found it yet. I think you think there is some magic original thought stuff that only humans have, and also a long list of tasks like chess, go, image recognition etc that we can do with the right algorithm.

Expand full comment

"Because the change is trivial in computer code, but hard in DNA."

In any large software shop which relies on ML to solve "rubber hits the road" problems, not toy problems, it takes literally dozens of highly paid full time staff to keep the given ML from falling over on its head every *week* as the staff either build new models or coddle old ones in an attempt to keep pace with ever changing reality.

And the work is voodoo, full of essentially bad software practices and contentious statistical arguments and unstable code changes.

Large scale success with ML is about as far from "the change is trivial in computer code" as it is possible to be in the field of computer science.

Expand full comment

I thought about this specifically when reading that we could spend quadrillions of dollars to create a supercomputer capable of making a single human level AI.

Expand full comment
Feb 25, 2022·edited Feb 25, 2022

To be fair, once made that AI could be run on many different computers (which would each be far less expensive), whereas we don't have a copy-paste function for people.

Expand full comment
Feb 25, 2022·edited May 16, 2022

But more importantly, that way of thinking is wrong (edit: I mean the quadrillion dollars thing) and I predict humanity is about to reduce per-model training budgets at the high end. Though wealthy groups' budgets will jump temporarily whenever they suspect they might have invented AGI, or something with commercialization potential.

Expand full comment

By "reduce per-model training budgets", do you mean "reduce how much we're willing to spend" or "reduce how much we need to spend"?

Expand full comment
Feb 26, 2022·edited Feb 26, 2022

I mean that a typical wealthy AI group will reduce the total amount it actually spends on models costing over ~$500,000 each, unless they suspect they might have invented AGI, or something with commercialization potential, and even in those cases they probably won't spend much more than before on a single model (but if they do, I'm pretty sure they won't get a superintelligent AGI out of it). (edit: raised threshold 100K=>500K. also, I guess the superjumbo model fad might have a year or two left in it, but I bet it'll blow over soon)

Expand full comment
founding

The math and science are very difficult for me. So, I'm glad you are there to interpret it from a super layperson's perspective!

Could you point me to WHY AI scares you? I assume you've written about your fears.

Or should I remain blissfully ignorant?

Expand full comment

He has written about this before on his previous blog, but even more helpfully summarized the general concerns here https://www.lesswrong.com/posts/LTtNXM9shNM9AC2mp/superintelligence-faq

Consider especially parts 3.1.2 thru 4.2

Expand full comment
author

This is pretty out of date, but I guess it will do until/unless I write up something else.

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

I obviously cannot speak to why AI scares Scott, but there are some theoretical and practical reasons to consider superhuman AI a highly-scary thing should it come into existence.

Theoretical:

Many natural dangers that threaten humans do not threaten humanity, because humanity is widely dispersed and highly adaptive. Yellowstone going off or another Chicxulub impactor striking the Earth would be bad, but these are not serious X-risks because humanity inhabits six continents (protecting us from local effects), has last-resort bunkers in many places (enabling resilience against temporary effects) and can adapt its plans (e.g. farming with crops bred for colder/warmer climates).

These measures don't work, however, against other intelligent creatures; there is no foolproof plan to defeat an opponent with similar-or-greater intelligence and similar-or-greater resources. For the last hundred thousand years or so, this category has been empty save for other humans and as such humanity's survival has not been threatened (the Nazis were an existential threat to Jews, but they were not an existential threat to humanity because they themselves were human). AGI, however, is by definition an intelligent agent that is not human, which makes human extinction plausible (other "force majeure" X-risks include alien attack and divine intervention).

Additionally, many X-risks can be empirically determined to be incredibly unlikely by examining history and prehistory. An impact of the scale of that which created Luna would still be enough to kill off humanity, but we can observe that these don't happen often and there is no particular reason for the chance to increase right now. This one even applies to alien attack and divine intervention, since presumably these entities would have had the ability to destroy us since prehistory and have reliably chosen not to (as Scott pointed out in Don't Fear the Filter back on SSC, if you think humans are newly a threat to interstellar aliens or to God, you are underestimating interstellar aliens and God). But it doesn't apply to AI - or at least, not to human-generated AI (alien-built AI is not much different from aliens in this analysis). Humans haven't built (human-level or superhuman) AI before, so we don't have a track record of safety.

So the two basic heuristics that rule most things out as likely X-risks don't work on AI. This doesn't prove that AI *will* wipe out humanity, but it's certainly worrying.

Practical:

- AI centralises power (particularly when combined with robotics). Joe Biden can't kill all African-Americans (even if he wanted to, which he presumably does not), because he can't kill them all himself and if he told other people to do it they'd stop listening to him. Kim Jong-un can kill a lot of his people, because the norms are more permissive to him doing so, but he still can't literally depopulate North Korea because he still needs other people to follow his orders and most won't follow obviously-self-destructive orders. But if Joe Biden or Kim Jong-un had a robot military, they could do it. No monarch has ever had the kind of power over their nation that an AI-controlled robot army can give. Some people can be trusted with that kind of power; most can't.

- Neural-net architecture is very difficult to interrogate. It's hard enough to tell if explicit code is evil or not, but neural nets are almost completely opaque - the whole point is that they work without us needing to know *how* they work. Humans can read each other reasonably well despite this because evolution has trained us quite specifically to read other humans; that training is at best useless and at worst counterproductive when trying to read a potentially-deceptive AI. So there's no way to know whether a neural-net AI can be trusted with power either; it's basically a matter of plug-and-pray (you could, of course, train an AI to interrogate other AIs, but the interrogating AI itself could be lying to you).

Expand full comment
founding

Very helpful to my understanding of why AI is a unique threat. Thanks for this. You explain it very well. Although now when I see video clips of kids in robot competitions, my admiration will be tinged with a touch of foreboding.

Expand full comment

Don't be tinged by that foreboding. If you read a bit about superintelligence it becomes clear that it's not going to come from any vector that's typically imagined (terminator or black mirror style robots).

There are plenty of ideas of more realistic ways an AGI escapes confinement and gains access to the real world. A couple of interesting ones I read were it solving the protein folding problem, paying or blackmailing someone over the internet to mix the necessary chemicals, and creating nanomachines capable of anything. Another was tricking a computer scientist with a perfect woman on a VR headset.

In fact it probably won't be any of these things; after all, it's a superintelligence: whatever it creates to pursue its goals will be so beyond our understanding that it's meaningless to predict what it will do other than as a bit of fun or a creative writing exercise.

Let me know if you want links to those stories/ideas, I should have them somewhere. Superintelligence by Nick Bostrom is a good read, although quite heavy. I prefer Scott's stuff haha.

Expand full comment

The hypothetical "rogue superintelligent AGI with no resources is out to kill everyone, what does it do" might not be likely to go that way, but that's hardly the only possibility for "AI causes problems". Remote-control killer robots are already a thing (and quite an effective thing), militaries have large budgets, and plugging an AI into a swarm of killbots does seem like an obvious way to improve their coordination. PERIMETR/Dead Hand was also an actual thing for a while.

Expand full comment
founding

The "killbots" can't load their own ordnance or even fill their own fuel tanks, which is going to put a limit on their capabilities.

Expand full comment

> solving the protein folding problem, paying or blackmailing someone over the internet to mix the necessary chemicals, and creating nanomachines capable of anything

Arguably the assumption that "nanomachines capable of anything" can even exist is a big one. After all, in the Smalley–Drexler debate Smalley was fundamentally right, and Drexlerian nanotech is not really compatible with known physics and chemistry.

Expand full comment
Feb 26, 2022·edited Feb 26, 2022

Offering the opposite take: https://idlewords.com/talks/superintelligence.htm

(Note this essay is extremely unpopular around these parts, but also, fortunately, rationalists are kind enough to let it be linked!)

Expand full comment

1) I mean, yes, people get annoyed when you explain in as many words that you are strawmanning them in order to make people ignore them.

2) There are really two factions to the AI alarmists (NB: I don't intend negative connotations there, I just mean "people who are alarmed and want others to be alarmed") - the ones who want to "get there first and do it right" and the ones who want to shut down the whole field by force. You have something of a case against the former but haven't really devoted any time to the latter.

Expand full comment

Generally I think that the paradigm shifts argument is convincing, and so all this business of trying to estimate when we will have a certain number of FLOPS available is a bit like trying to estimate when fusion will become widely available by trying to estimate when we will have the technology to manufacture the magnets at scale.

However, I disagree with Eliezer that this implies shorter timelines than you get from raw FLOPS calculations - I think it implies longer ones, so would be happy to call the Cotra report's estimate a lower bound.

Expand full comment

>she says that DeepMind’s Starcraft engine has about as much inferential compute as a honeybee and seems about equally subjectively impressive. I have no idea what this means. Impressive at what? Winning multiplayer online games? Stinging people?

Swarming

Expand full comment

Building hives

Expand full comment
author

You people are all great.

Expand full comment

It plays Zerg well and Terran for shit.

Protoss, you say? Everyone knows Protoss in SC2 just go air.

Expand full comment

Yes, you should care. The difference between 50% by 2030 and 50% by 2050 matters to most people, I think. In a lot of little ways. (And for some people in some big ways.)

For those trying to avert catastrophe, money isn't scarce, but researcher time/attention/priorities is. Even in my own special niche there are way too many projects to do and not enough time. I have to choose what to work on and credences about timelines make a difference. (Partly directly, and partly indirectly by influencing credences about takeoff speeds, what AI paradigm is likely to be the relevant one to try to align, etc.)

EDIT: Example of a "little" way: If my timelines went back up to 30 years, I'd have another child. If they had been at 10 years three years ago, I would currently be childless.

Expand full comment
author

Why does your child-having depend on your timelines? I'm considering a similar question now and was figuring that if bringing a child into the world is good, it will be half as good if the kid lives 5 years as if they live 10, but at no point does it become bad.

This would be different if I thought I had an important role in aligning AI that having a child would distract me from; maybe that's our crux?

Expand full comment

I myself am pro bringing in another person to fight the good fight. If it were me being brought in I would find it an honor, rather than damning. My crux is simply that I am too busy to rear more humans myself.

Expand full comment

FWIW I totally agree

Expand full comment

Psst… kids are awesome (for whatever points a random Internet guy adds to your metrics)

Expand full comment

I'm not sure it is rational / was rational. I probably shouldn't have mentioned it. Probably an objective, third-party analysis would either conclude that I should have kids in both cases or in neither case.

However the crux you mention is roughly right. The way I thought of it at the time was: If we have 30 years left then not only will they have a "full" life in some sense, but they may even be able to contribute to helping the world, and the amount of my time they'd take up would be relatively less (and the benefits to my own fulfillment and so forth in the long run might even compensate) and also the probability of the world being OK is higher and there will be more total work making it be OK and so my lost productivity will matter much less...

Expand full comment

(Apologies if this is a painful topic. I'm a parent and genuinely curious about your thinking)

Would you put a probability on their likelihood of survival in 2050? (ie, are you truly operating from the standpoint that your children have a 40 or 50 percent chance of dying from GAI around 2050?)

Expand full comment

Yes, something like that. If I had Ajeya's timelines I wouldn't say "around 2050" I would say "by 2050." Instead I say 2030-ish. There are a few other quibbles I'd make as well but you get the gist.

Expand full comment

Thanks for answering.

Expand full comment

> money isn't scarce, but researcher time/attention/priorities is.

I don't get the "MIRI isn't bottlenecked by money" perspective. Isn't there a well-established way to turn money into smart-person-hours by paying smart people very high salaries to do stuff?

Expand full comment

My limited understanding is: It works in some domains but not others. If you have an easy-to-measure metric, you can pay people to make the metric go up, and this takes very little of your time. However, if what you care about is hard to measure / takes lots of time for you to measure (you have to read their report and fact-check it, for example, and listen to their arguments for why it matters) then it takes up a substantial amount of your time, and that's if they are just contractors who you don't owe anything more than the minimum to.

I think another part of it is that people just aren't that motivated by money, amazingly. Consider: If the prospect of getting paid a six-figure salary to solve technical alignment problems worked to motivate lots of smart people to solve technical alignment problems... why hasn't that happened already? Why don't we get lots of applicants from people being like 'Yeah I don't really care about this stuff I think it's all sci-fi but check out this proof I just built, it extends MIRI's work on logical inductors in a way they'll find useful, gimme a job pls." I haven't heard of anything like that ever happening. (I mean, I guess the more realistic case of this is someone who deep down doesn't really care but on the exterior says they do. This does happen sometimes in my experience. But not very much, not yet, and also the kind of work these kind of people produce tends to be pretty mediocre.)

Another part of it might be that the usefulness of research (and also manager/CEO stuff?) is heavy-tailed. The best people are 100x more productive than the 95th percentile people who are 10x more productive than the 90th percentile people who are 10x more productive than the 85th percentile people who are 10x more productive than the 80th percentile people who are infinitely more productive than the 75th percentile people who are infinitely more productive than the 70th percentile people who are worse than useless. Or something like that.

Anyhow it's a mystery to me too and I'd like to learn more about it. The phenomenon is definitely real but I don't really understand the underlying causes.

Expand full comment

> Consider: If the prospect of getting paid a six-figure salary to solve technical alignment problems worked to motivate lots of smart people to solve technical alignment problems... why hasn't that happened already?

I mean, does MIRI have loads of open, well-paid research positions? This is the first I'm hearing of it. Why doesn't MIRI have an army of recruiters trolling LinkedIn every day for AI/ML talent the way that Facebook and Amazon do?

Looking at MIRI's website it doesn't look like they're trying very hard to hire people. It explicitly says "we're doing less hiring than in recent years". Clicking through to one of the two available job ads ( https://intelligence.org/careers/research-fellow/ ) it has a section entitled "Our recommended path to becoming a MIRI research fellow" which seems to imply that the only way to get considered for a MIRI research fellow position is to hang around doing a lot of MIRI-type stuff for free before even being considered.

None of this sounds like the activities of an organisation that has a massive pile of funding that it's desperate to turn into useful research.

Expand full comment

I can assure you that MIRI has a massive pile of funding and is desperate for more useful research. (Maybe you don't believe me? Maybe you think they are just being irrational, and should totally do the obvious thing of recruiting on LinkedIn? I'm told OpenPhil actually tried something like that a few years ago and the experiment was a failure. I don't know but I'd guess that MIRI has tried similar things. IIRC they paid high-caliber academics in relevant fields to engage with them at one point.)

Again, it's a mystery to me why it is, but I'm pretty sure that it is.

Some more evidence that it's true:

--Tiny startups beating giant entrenched corporations should NEVER happen if this phenomenon isn't real. Giant entrenched corporations have way more money and are willing to throw it around to improve their tech. Sure maybe any particular corporation might be incompetent/irrational, but it's implausible that all the major corporations in the world would be irrational/incompetent at the same time so that a tiny startup could beat them all.

--Similar things can be said about e.g. failed attempts by various governments to make various cities the "new silicon valley" etc.

Maybe part of the story is that research topics/questions are heavy-tailed-distributed in importance. One good paper on a very important question is more valuable than ten great papers on a moderately important question.

Expand full comment

> I can assure you that MIRI has a massive pile of funding and is desperate for more useful research. (Maybe you don't believe me? Maybe you think they are just being irrational

Maybe they're not being irrational, they're just bad at recruiting. That's fine, that's what professional recruiters are for. They should hire some.

If MIRI wants more applicants for its research fellow positions it's going to have to do better than https://intelligence.org/careers/research-fellow/ because that seems less like a genuine job ad and more like an attempt to get naive young fanboys to work for free in the hopes of maybe one day landing a job.

Why on Earth would an organisation that is serious about recruitment tell people "Before applying for a fellowship, you’ll need to have attended at least one research workshop"? You're competing for the kind of people who can easily walk into a $500K+ job at any FAANG, why are you making them jump through hoops?

Expand full comment
founding
Feb 24, 2022·edited Feb 24, 2022

MIRI doesn't want people who can walk into a FAANG job, they want people who can conduct pre-paradigmatic research. "Math PhD student or postdoc" would be a more accurate desired background than "FAANG software engineer" (or even "FAANG ML engineer"), but still doesn't capture the fact that most math PhDs don't quite fit the bill either.

If you think professional recruiters, who can't reliably distinguish good from bad among the much more commoditized "FAANG software engineer" profile, will be able to find promising candidates for conducting novel AI research - well, I don't want to say it's impossible. But the problem is doing that in a way that isn't _enormously costly_ for people already in the field; there's no point in hiring recruiters if you're going to spend more time filtering out bad candidates than if you'd just gone looking yourself (or not even bothered and let high-intent candidates find you).

Expand full comment

Holy shit. That's not a job posting. That's instructions for joining a cult. Or a MLM scam.

Expand full comment

I think there is an interesting question about how one moves fields into this area. I imagine that having people who are intelligent but with a slightly different outlook would be useful. Being mentored while you get up to speed and write your first paper or two is important I think. I'm really not sure how I would move into a paid position for example without basically doing an unpaid and isolated job in my spare time for a considerable amount of time first.

Expand full comment

For what it is worth, I agree completely with Melvin on this point - the job advert pattern matches to a scam job offer to me and certainly does not pattern match to any sort of job I would seriously consider taking. Apologies to be blunt, but you write "it's a mystery to me why it is", so I'm trying to offer an outside perspective that might be helpful.

It is not normal to have job candidates attend a workshop before applying for a job in prestigious roles, but it is very normal to have candidates attend a 'workshop' before pitching them an MLM or timeshare. It is even more concerning that details about these workshops are pretty thin on the ground. Do candidates pay to attend? If so this pattern matches advanced fee scams. Even if they don't pay to attend, do they pay flights and airfare? If so MIRI have effectively managed to limit their hire pool to people who live within commuting distance of their offices or people who are going to work for them anyway and don't care about the cost.

Furthermore, there's absolutely no indication how I might go about attending one of these workshops - I spent about ten minutes trying to google details (which is ten minutes longer than I have to spend to find a complete list of all ML engineering roles at Google / Facebook), and the best I could find was a list of historic workshops (last one in 2018) and a button saying I should contact MIRI to get in touch if I wanted to attend one. Obviously I can't hold the pandemic against MIRI not holding in-person meetups (although does this mean they deliberately ceased recruitment during the pandemic?), and it looks like maybe there is a thing called an 'AI Risk for Computer Scientists' workshop which is maybe the same thing (?) but my best guess is that the next workshop - which is a prerequisite for me applying for the job - is an unknown date no more than six months into the future. So if I want to contribute to the program, I need to defer all job offers for my extremely in-demand skillset for the *opportunity* to apply following a workshop I am simply inferring the existence of.

The next suggested requirement indicates that you also need to attend 'several' meetups of the nearest MIRIx group to you. Notwithstanding that 'do unpaid work' is a huge red flag for potential job applicants, I wonder if MIRI have seriously thought about the logistics of this. I live in the UK where we are extremely fortunate to have two meetup groups, both of which are located in cities with major universities. If you don't live in one of those cities (or, heaven forbid, are French / German / Spanish / any of the myriad of other nationalities which don't have a meetup anything less than a flight away) then you're pretty much completely out of luck in terms of getting involved with MIRI. From what I can see, the nearest meetup to Terence Tao's offices in UCLA is six hours away by car. If your hiring strategy for highly intelligent mathematical researchers excludes Terence Tao by design, you have a bad hiring strategy.

The final point in the 'recommended path' is that you should publish interesting and novel points on the MIRI forums. Again, high quality jobs do not ask for unpaid work before the interview stage; novel insights are what you pay for when you hire someone.

So to answer your question - yes there are many subtle and interesting factors as to why top companies cannot attract leading talent despite paying a lot of money to that talent and paying a lot of money to develop meta-knowledge about how to attract talent. However just because top companies struggle to attract talent and MIRI struggles to attract talent doesn't mean MIRI is operating on the same productivity frontier as top tech companies. From the public-facing surface of MIRI's talent pipeline alone there is enough to answer the question of why they're struggling to match funds to talent, and I don't doubt that a recruitment consultant going under the hood could find many more areas for concern in the talent pipeline.

Why *shouldn't* MIRI try doing the very obvious thing and retaining a specialist recruitment firm to headhunt talent for them, pay that talent a lot of money to come and work for them, and then see if the approach works? A retained executive search might cost perhaps $50,000 per hire at the upper end, perhaps call it $100,000 because you indicate there may be a problem with inappropriate CVs making it through anything less than a gold-plated search. This is a rounding error when you're talking about $2bn unmatched funding. I don't see why this approach is too ridiculous even to consider, and instead the best available solution is to have a really unprofessional hiring pipeline directly off the MIRI website

Expand full comment
Feb 25, 2022·edited Feb 25, 2022

I believe the reason they aren't selecting people is simply that MIRI is run by deeply neurotic people who cannot actually accept any answer as good enough, and thus are sitting on large piles of money they insist they want to give people only to refuse them in all cases. Once you have done your free demonstration work, you are simply told that, sorry, you didn't turn out to be smarter than every other human being to ever live by a minimum of two orders of magnitude and thus aren't qualified for the position.

Perhaps they should get into eugenics and try breeding for the Kwisatz Haderach.

Expand full comment

Although your take is deeply uncharitable, I think the basis of your critique is true and stems from a different problem. Nobody knows how to create a human level intelligence, so how could you create safety measures based on how such an intelligence would work? They don't know. So they need to hire people to help them figure that out, which makes sense. But since they don't know, even at an introductory level, they cannot actually evaluate the qualifications of applicants. Hiring a search firm would result in the search firm telling MIRI that MIRI doesn't know what it needs. You'd have to hire a firm that knows what MIRI needs, probably by understanding AI better than they do, in order to help MIRI get what it needs. Because that defeats the purpose of MIRI, they spin their wheels and struggle to hire people.

Expand full comment

They're going to have a problem with the KH-risk people.

Expand full comment

> However, if what you care about is hard to measure / takes lots of time for you to measure then it takes up a substantial amount of your time.

One solution here would be to ask people to generate a bunch of alignment research, then randomly sample a small subset of that research and subject it to costly review, then reward those people in proportion to the quality of the spot-checked research.
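
A minimal sketch of that spot-check idea in Python (the names, quality scores, and payment scale are all made up for illustration): because the sample is random, expected pay tracks true average quality even though only a fraction of the work is ever reviewed.

```python
import random

def spot_check_reward(item_qualities: list[float], sample_size: int,
                      pay_per_quality: float, rng: random.Random) -> float:
    """Pay in proportion to the mean quality of a randomly sampled subset of items."""
    sampled = rng.sample(item_qualities, k=min(sample_size, len(item_qualities)))
    return pay_per_quality * (sum(sampled) / len(sampled))

rng = random.Random(42)
careful_researcher = [0.8, 0.9, 0.7, 0.85, 0.75]  # hypothetical quality scores
corner_cutter = [0.8, 0.2, 0.1, 0.15, 0.2]

for name, items in [("careful", careful_researcher), ("corner-cutting", corner_cutter)]:
    pay = spot_check_reward(items, sample_size=2, pay_per_quality=100_000, rng=rng)
    print(f"{name}: paid ${pay:,.0f} after reviewing only 2 of {len(items)} items")
```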

But that might not even be necessary. Intuitively, I expect that gathering really talented people and telling them to do stuff related to X isn't that bad of a mechanism for getting X done. The Manhattan Project springs to mind. Bell Labs spawned an enormous amount of technical progress by collecting the best people and letting them do research. I think the hard part is gathering the best people, not putting them to work.

> If the prospect of getting paid a six-figure salary to solve technical alignment problems worked to motivate lots of smart people to solve technical alignment problems... why hasn't that happened already?

Because the really smart and conscientious people are already making six figures. In private correspondence with a big LessWrong user (>10k karma), they told me that the programmers they knew that browsed LW were all very good programmers, and that the _worst_ programmer that they knew that read LW worked as a software engineer at Microsoft. If we equate "LW readers" with "people who know about MIRI", then virtually all the programmers who know about MIRI are already easily clearing six figures. You're right that the usefulness of researchers is heavy-tailed. If you want that 99.99th percentile guy, you need to offer him a salary competitive with those of FAANG companies.

Expand full comment
founding

If you equate "people who know about MIRI" with "LW readers", then maybe put some money and effort into making MIRI more widely known. Hopefully in a positive way, of course.

Expand full comment

You probably know more about the details of what has or has not been tried than I do, but if this is the situation we really should be offering like $10 million cash prizes no questions asked for research that Eliezer or Paul or whoever says moves the ball on alignment. I guess some recently announced prizes are moving us in this direction, but the amount of money should be larger, I think. We have tons of money, right?

Expand full comment

They (MIRI in particular) also have a thing about secrecy. Supposedly much of the potentially useful research not only shouldn't be public, even hinting that this direction might be fruitful is dangerous if the wrong people hear about it. It's obviously very easy to interpret this uncharitably in multiple ways, but they sure seem serious about it, for better or worse (or indifferent).

Expand full comment

This whole thread has convinced me that MIRI is probably the biggest detriment in the world for AI alignment research, by soaking up so much of the available funding and using it so terribly.

The world desperately needs a MIRI equivalent that is competently run. And which absolutely never ever lets Eleizer Yudkowsky anywhere near it.

Expand full comment

My take is increasingly that this institution has succeeded in isolating itself for poorly motivated reasons (what if AI researchers suspected our ideas about how to build AGI and did them "too soon"?) and seems pretty explicitly dedicated to developing thought-control tech compatible with some of the worst imaginable futures for conscious subjects (think dual use applications -- if you can control the thoughts of your subject intelligence with this kind of precision, what else can you control?).

Expand full comment

It hasn't "soaked up so much of the available funding." Other institutions in this space have much more funding, and in general are also soaking in cash.

(I disagree with your other claims too of course but don't have the energy or time to argue.)

Expand full comment

Give Terence Tao $500,000 to work on AI alignment six months a year, leaving him free to research crazy Navier-Stokes/Halting problem links the rest of his time... If money really isn't a problem, this kind of thing should be easy to do.

Expand full comment

Literally that idea has been proposed multiple times before that I know of, and probably many more times many years ago before I was around.

Expand full comment
Feb 25, 2022·edited Feb 25, 2022

> a six-figure salary to solve technical alignment problems

Wait, what? If I knew that I might've signed the f**k up! I don't have experience in AI, but still! Who's offering six figures?

Expand full comment

Every time I am confused about MIRI's apparent failures to be an effective research institution I notice that the "MIRI is a social club for a particular kind of nerd" model makes accurate predictions.

Expand full comment

You could pay me to solve product search ranking problems, even though I find the end result distasteful. In fact, if you bought stuff online, maybe you did pay me!

You couldn't pay me to work on alignment. I'm just not aligned. Many people aren't.

Expand full comment

Fighting over made up numbers seems so futile.

But I don't understand this anyway.

Why do the dangers posed by AI need a full/transformative AI to exist? My total layman's understanding of these fears is that y'all are worried an AI will be capable of interfering with life to an extent people cannot stop. It's irrelevant if the AI "chooses" to interfere or there's some programming error, correct? So the question is not, "when will transformative AI exist?" the question is only, "when will computer bugs be in a position to be catastrophic enough to kill a bunch of people?" or, "when will programs that can program better than humans be left in charge of things without proper oversight or with oversight that is incapable of stopping these programming programs?"

Not that these questions are necessarily easier to predict.

Expand full comment
author

A dumber-than-human level AI that (let's say) runs a power plant and has a bug can cause the power plant to explode. After that we will fix the power plant, and either debug the AI or stop using AIs to run power plants.

A smarter-than-human AI that "has a bug" in the sense of being unaligned with human values can fight our attempts to turn it off and actively work to destroy us in ways we might not be able to stop.

Expand full comment

But if we are not worried about bugs in, e.g., a global water quality managing program, then an AI as smart as a human is not such a big deal either. There are plenty of smart criminals out there who are unaligned with human values, and even the worst haven't managed to wipe out humanity. We need to have an AI smarter than the whole group of AI police before seriously worrying, so maybe we need to multiply our made up number by 1,000?

But to illustrate the bug/AI question: let's imagine Armybot, a strategy planning simulation program in 2022. And let's say there's a bug and Armybot, which is hooked up to the nuclear command system for proper simulations, runs a simulation IRL and lets off all those nukes. That's an extinction level bug that could happen right now if we were dumb enough.

Now let's imagine Armybot is the same program in 2050, and now it's an AI with the processing power equivalent to the population of a small country. Now the fear is that Armybot's desire/bug to nuke the world kicks in (idk why it becomes capable of making independent decisions or having wants just because of more processing power, so I'm more comfortable saying there's a bug). But now it can independently connect itself to the nuclear command center with its amazing hacking skills (that it taught itself? that we installed?). That's an extinction level bug too.

So the question is, which bug is more likely?

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

The general intuition, I believe, is that an AI as smart as a human can quickly become way way smarter than a human, because humans are really hard to improve (evolution has done its best to drill a hole through the gene-performance landscape to where we are, but it's only gotten more stuck over the aeons) and AI tends to be really easy to improve: just throw more cores at it.

If you could stick 10 humans of equal intelligence in a room and get the performance of one human that's 10 iq points smarter than that, then the world would look pretty different. Also we can't sign up for more brain on AWS.

Expand full comment

My intuition is that "Just throw more cores at it" is no more likely to improve an AI's intelligence than opening up my skull and chucking in a bunch more brain tissue.

I think you'd have to throw more cores at it _and then_ go through a lengthy process of re-training, which would probably cost another hundred billion dollars of compute time.

Expand full comment

It's even worse (or better, I guess, depending on your viewpoint) than that, because cores don't scale linearly; there's a reason why Amazon has a separate data center in every region, and why your CPU and GPU are separate units. Actually it's even worse than that, because even with all those cores, no one knows what "a lengthy process of re-training" to create an AGI would look like, or whether it's even possible without some completely unprecedented advances in computer science.

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

I think we can safely assume that it is going to be vastly easier than making a smarter human, at least given our political constraints. (Iterated embryo selection etc.) It doesn't matter how objectively hard it is, just who has the advantage, and by how much. Also I think saying we need fundamental advances in CS to train a larger AI given a smaller AI, misses first the already existing distillation research, and second assumes that the AGI was a one in a hundred stroke of good luck that cannot be reproduced. Which seems unlikely to me.

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

A hundred billion dollars of compute time for training is a fairly enlightening number because it's simultaneously an absurd amount of compute, barely comparable to even the most extravagant training runs we have today, enough to buy multiple cutting edge fabs and therefore all of their produced wafers, while also being an absolutely trivial cost to be willing to pay if you already have AGI and are looking to improve it to ASI. Heck, we've spent half that much just on our current misguided moon mission that primarily exists for political reasons that have nothing to do with trying to go to the moon.

That said, throwing more cores at an AI is by no means necessary, nor even the most relevant way an AI could self-improve, nor actually do we even need to first get AGI before self-improvement becomes a threat. For example, we already have systems that can do pick-and-place for hardware routing better than humans, we don't need AGI to do reinforcement learning, and there are ways in which an AI system could be trained to be more scalable when deployed than humans have evolved to be.

A fairly intelligent AI system finely enough divided to search over the whole of the machine learning literature and collaboratively try out swathes of techniques on a large cluster would not have to be smarter than a human in each individual piece to be more productive at fast research than the rest of humanity. Similarly, it's fairly easy to build AI systems that have an intrinsic ability to understand very high fidelity information that is hard to convey to humans, like AI systems that can look at weights and activations of a neural network and tell you things about its function. It's not hard to imagine that as AI approaches closer to human levels of general reasoning ability, we might be able to build a system that recursively looks at its own weights and activations and optimises them directly in a fine tuned way that is impossible to do with more finite and indivisible human labor. You can also consider systems that scale in ways similar to AlphaZero; again, as these systems approach having roughly human level general reasoning ability in their individual components, the ability for the combined system to be able to reason over vastly larger conceptual spaces in a much less lossy way that has been specifically trained end-to-end for this purpose might greatly exceed what humans can do.

I think people often have a misconception where they consider intelligence to exist purely on a unidimensional line which takes exponential difficulty to progress along. Neither of these is true: it is entirely on trend for AI to have exploitable superiorities as important as its deficiencies, and for progress to speed up rather than slow down as its set of capabilities approaches human equivalence—Feynman exists on the same continuum as everybody else, so there doesn't seem to be a good reason to expect humanity exists at a particularly difficult place for evolution to further improve intelligence. Even if human intelligence did end up being precisely a soft cap on the types of machines we could make, being able to put a large and scalable number of the smartest minds together in a room on demand far exceeds the intellectual might we can pump out of humanity otherwise.

Expand full comment

There will be 0 or a few AIs given access to nukes. And hopefully only well-tested AIs.

If the AI is smart, especially if it's smarter than most humans, and it wants to take over the world and destroy all humans, it's likely to succeed. If you aren't stupid, you won't wire a buggy AI to nukes with no safeguards. But if the AI is smart, it's actively trying to circumvent any safeguard. And whether nukes already exist is less important. It can trick humans into making bioweapons.

"idk why it becomes capable of making independent decisions or having wants just because of more processing power so I'm more comfortable saying there's a bug". Current AI sometimes kind of have wants, like wanting to win at chess, or at least reliably selecting good chess moves.

We already have robot arms programmed to "want" to pick things up. (Or at least search for plans to pick things up.) The difference is that currently our search isn't powerful enough to find plans involving breaking out, taking over the world and making endless robot arms to pick up everything forever.

Defence against a smart adversary is much much harder than defence against random bugs.

Expand full comment
Feb 25, 2022·edited Feb 25, 2022

> an AI as smart as a human

Scott said "smarter-than-human" (perhaps he means "dramatically smarter"), and I argue downthread that there will never be an AI "as smart as" a human.

Expand full comment

I'm unconvinced by AI X-risk in general, but I think I can answer this one: bugs are random. Intelligences are directed. A bad person is more dangerous than a bug at similar levels of resources and control.

Expand full comment

No, it can't, because merely being able to compute things faster than a human does not automatically endow the AI with nigh-magical powers -- and most of the magical powers attributed to putative superhuman AIs, from verbal mind control to nanotechnology, would appear to be physically impossible.

Don't get me wrong, a buggy AI could still mess up a lot of power plants; but that's a quantitative increase in risk, not a qualitative one.

Expand full comment

An AI doesn't need magical powers to be a huge, even existential, threat. It just needs to be really good at hacking and at exploiting the usual human foibles, money and blackmail, as leverage to get nearly anything it wants.

Expand full comment

Human hackers do that today all the time, with varying degrees of success. They are dangerous, yes, but not an existential threat. If you are proposing that an AI would be able to hack everything everywhere at the same time, then we're back in the magical powers territory.

Expand full comment

We're talking about superintelligent AI. Being better than human hackers is a trivial corollary. Exactly what is magical about that?

Expand full comment

How much better, exactly? Is it good enough to hack my Casio calculator watch? If so, then it's got magical powers, because that watch is literally unhackable -- there's nothing in it to hack. Ok, so maybe not, but is it good enough to gain root access to every computer on the Internet at the same time while avoiding detection? If so, then it has magical powers of infinite bandwidth, superluminal communication, and whatever else it is that lets it run its code at zero performance penalty. Ok, so maybe it's not quite that good, but it's just faster than average human hackers and better informed about security holes? Well, then it's about as good as Chinese or Russian state hackers already are today.

In other words, you can't just throw the word "superintelligent" into a sentence as though it was a magic incantation; you still need to explain what the AI can do, and how it can do it (in broad strokes).

Expand full comment

Why would an AI want money?

Expand full comment

Because money is how anyone or anything acquires resources in the world as it currently exists.

Expand full comment
founding

These timelines seem to depend crucially on compute getting much cheaper. Computer chip factories are very expensive, and there are not very many of them. Has anyone considered trying to make it illegal to make compute much cheaper?

Expand full comment
author

Who? You're talking to the small group of researchers and activists who care about this, with a few tens of billions of dollars. How do they make it illegal to make compute much cheaper?

Expand full comment
founding

Just offering a concrete policy goal to lobby for. As far as I know, actual policy ideas here beyond “build influence” are in short supply.

I agree this would be very challenging and probably require convincing some part of the US and Chinese governments (or maybe just the big chip manufacturers) that AI risk is worth taking seriously.

Expand full comment

Ideas aren't in short supply; clearly good ideas are. You aren't the first person to propose lobbying to stop compute getting cheaper. What's missing is a thorough report that analyzes all the pros and cons of ideas like that and convinces OpenPhil et al that if they do this idea they won't five years later think "Oh shit actually we should have done the exact opposite, we just made things even worse."

Expand full comment
Feb 25, 2022·edited Feb 25, 2022

Even clearly good ideas aren't in short supply; *popular* people who can tell which ideas are good are. So usually when I see (or invent) a good idea, it is not popular.

Expand full comment
founding

What are some clearly good policy ideas in this space?

Most that I have seen are bad because of the difficulty of coordinating among all possible teams of people working on AI (on the other hand, the number of potential chip fabs is much smaller)

Expand full comment
Feb 25, 2022·edited Feb 25, 2022

Sorry, just realized I made a fairly useless comment. I was making a general observation, not one about this field specifically. So, don't know.

Expand full comment
founding

OK, I’m glad to hear this idea is already out there. I wasn’t sure if it was. I agree the appropriate action on it right now is “consider carefully”, not “lobby hard for it”.

Expand full comment

I don't know if someone has discussed your idea in AI governance, but in alignment there's the concept of a "pivotal act". You train an AI to do some specific task which drastically changes the expected outcome of AGI. For instance, an AI which designs nanotech in order to melt all GPUs and destroy GPU plants, after which it shuts down. Which is vaguely similar to what you suggested. So maybe search for pivotal acts on the alignment forum to find the right literature.

Expand full comment

Is this intended to be a failsafe, such that the AGI has a program to destroy computer creating machinery, but can only do so if it escapes its bounds enough to gain the ability?

Expand full comment

It is intended to slow down technological progress in AI and make it impossible for someone else (and you afterwards!) to make an AGI, or anything close to an AGI. And nothing else. So no first order effect on other tech, politics, economics, science etc.

This works out better as a failsafe than what you've proposed, since if you're expecting the AI to escape and have enough power to conduct such an act, you've lost anyway. Someone else is probably making an AGI as well in that scenario, or the AI will be able to circumvent the program firing up, and so on.

Note that getting the AI to actually just melt GPUs somehow and then shut down is an unsolved problem. If we knew how to do that right now, the alignment community would be way more optimistic about our chances.

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

If you've tried to buy a high-amperage MOSFET, a stepper driver, a Raspberry Pi or a GPU lately, you would know how easy it is to make compute expensive. Different chips - or different computers whose CPUs/firmwares don't conform to a BIOS-like standard - are not necessarily fungible with each other, and the whole chip fab process has a very long cycle time despite the relatively normal amount of throughput achievable by, essentially, a very deep pipeline.

(And yes, I too think the whole movement reeks of Luddism.)

Expand full comment

See, this is *exactly* why I'm opposed to the AI-alignment community. Normally I wouldn't care, people can believe whatever they want, from silicon gods to the old-fashioned spiritual kind. But beliefs inform actions, and boom -- now you've got people advocating for halting the technological progress of humanity based on a vague philosophical ideal.

We've got real problems that we need to solve, right now: climate change, hunger, poverty, social networks, the list goes on; and we could solve most (arguably, all) of them by developing new technologies -- unless someone stops us by shoving a wooden shoe in the gears every time a new machine is built.

Expand full comment

"halting the technological progress of humanity based on a vague philosophical ideal. "

Does this apply to the biologists deciding not to build bioweapons? Some technologies are dangerous and better off not built. A technology can create new problems as well as solve them. You would need to show that AI's capability to solve our problems is bigger than the risk. Maybe AI is dangerous and we would be better off solving climate change with carbon capture, solving any food problems with GMO crops, and just not doing the most dangerous AI work until we can figure out how to make it safer.

Expand full comment

You are not talking about the equivalent of deciding not to build bioweapons; you are talking about the equivalent of stopping biological research in general. I agree that computing is dangerous, just as biology is dangerous -- and don't even get me started on internal combustion. But we need all of these technologies if we are to thrive, and arguably survive, as a species. I'm not talking about solving global warming with some specific application of AI; I'm talking about transformative events such as our recent transition into the Information Age.

Expand full comment
founding

Progress is great! Stopping growth would be a disaster.

That said, it doesn’t seem to me that cheaper computing power is very useful in solving climate change, poverty, etc. Computers are already really great; what we need is more energy abundance and mastery over the real physical world.

Consumer CPUs haven’t been getting faster for many years, so it’s not even clear most computer users are benefiting from Moore’s law these days.

Expand full comment

If you don't think more computer power can somehow magically solve those problems, this is a good first step towards understanding why some people are unconvinced by AI X-risk.

Expand full comment
Feb 26, 2022·edited Feb 26, 2022

> this is *exactly* why I'm opposed to the AI-alignment community

Jonathan Paulson's comment is surely not representative of the AI-alignment community.

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

>human solar power a few decades ago was several orders of magnitude worse than Nature’s, and a few decades from now it may be several orders of magnitude better.

No, because typical solar panels already capture 15 – 20% of the energy in sunlight (the record is 47%). There's not another order of magnitude left to improve.

Source: https://en.wikipedia.org/wiki/Solar_cell_efficiency
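
Quick arithmetic on the headroom, using the figures above:

```python
# Headroom left above the efficiencies quoted above.
typical = 0.20    # typical commercial panel
record = 0.47     # lab record
ceiling = 1.00    # can't capture more than all of the incident energy

print(f"Max possible gain over a typical panel: {ceiling / typical:.1f}x")  # 5.0x
print(f"Max possible gain over the lab record:  {ceiling / record:.1f}x")   # ~2.1x
# Both well under 10x, i.e. less than one order of magnitude.
```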

Nitpicking aside, I wonder how the potential improvement of human intelligence through biotechnology will affect this timeline. The top AI researcher in 2052 may not have been born yet.

Expand full comment

The table also measures solar in terms of "payback period," which has much more room for improvement.

Expand full comment

It's also a lot more relevant than efficiency unless you are primarily constrained by acreage.

Expand full comment

I don't think that's a reasonable metric for solar power. Plants use solar power to drive chemical reactions -- to make molecules. They're not optimized for generating 24VDC because DC current isn't useful to a plant. So the true apples-to-apples comparison is to compare the amount of sunlight a plant needs to synthesize X grams of glucose from CO2 and water, versus what you can do artificially, e.g. with a solar panel and attached chemical reactor. By that much more reasonable metric the natural world still outshines the artificial by many orders of magnitude.

One imagines that if plant life *did* run off a 5VDC bus, then evolution would have developed some exceedingly efficient natural photovoltaic system. What it would look like is an interesting question, but I very much doubt it would be made of bulk silicon. Much more likely is that it would be an array of microscopic machines driven by photon absorption, which is kind of the way photosynthetic reaction centers work already.

Expand full comment

That's not a reasonable metric, either, for exactly the same reason: Solar panels aren't optimized for generating glucose.

(Also, your metric means efficiency improvements in generating glucose are efficiency gains for solar power.)

Expand full comment

I think this is right.

This leaves us with having to compare across different domains. How do you quantify the difference between "generates DC power" and "makes molecules"? I guess you'd have to start talking about the *purpose* of doing those things. Something like "utility generated for humans" vs "utility generated for plants"...and that seems really difficult to do.

Expand full comment

No, you would need to measure a combined system of solar panel plus chemical plant, as I said. But a plant *is* a combined solar panel plus chemical plant, and optimized globally, not in each of its parts, so if you want to make a meaningful comparison, that's what you have to do. Otherwise you're making the sort of low-value comparison that people do when they say electric cars as a technology generate zero CO2, forgetting all about the source of the electricity. It's true but of very limited value in making decisions, or gaining insight.

In this case, the insight that is missing is that Nature is still a heck of a lot better at harvesting and using visible photons as an energy source. The fact that PV panels can do much better at a certain specialized subtask, which is *all by itself* pointless -- electricity is never an end in itself, it's always a means to some other end -- isn't very useful.

Expand full comment

But you’re still favoring the plant by trying to get technology to simulate the plant. Yes, it’s an integrated solar panel plus chemical plant, but that’s completely useless if what you want is a solar panel plus 24V DC output plug. In that case, the plants lose by infinity orders of magnitude, because no plant does 24V DC output. You get similar results if what you want is computing simple arithmetic (a $5 calculator will beat any plant, complete with solar panel), or moving people between continents. Yes, birds contain sophisticated chemical reactors and have complex logic for finding fuel, but they still cannot move people between continents.

If you insist on measuring based on one side’s capabilities, I am the world’s best actor by orders of magnitude, since I have vast advantages at convincing my mother I am the son she raised, relative to anyone else.

Expand full comment

This is a minor point in all this, but it seems weird to estimate the amount of training evolution has done by the number of FLOPs each animal has performed. Thinking more doesn't seem like it would increase the fitness of your offspring, at least not in a genetic sense. The only information evolution gets is how many kids you have (and how many they have, etc.).

Though maybe you could point to this as the reason why the evolution estimate is so much higher than the others.

Expand full comment

It works if you consider optimization, or solution finding in general, as a giant undifferentiated sorting problem. I have X bits of raw data, and my solution (or optimum) is some incredibly rare combination F(X), and what I need to do is sift all the combinations f(X) until f(X) = F(X). That will give you an order-of-magnitude estimate for how much work it is to find F given X, even if you don't know the function f.

But in practice that estimate often proves to be absurdly and uselessly large. It's sort of like saying the square root of 10 has to be between 1 and 10^50. I mean...yeah, sure, it's a true statement. But not very practically useful.

In the same sense, many problems Nature has solved appear to have been solved in absurdly low amounts of time, if you take the "number of operations needed" estimate as some kind of approximate bound. This is the argument often deployed by "Intelligent Design" people to explain how evolution is impossible, because the search space for mutations is so unimaginably huge, relative to the set of useful mutations, that evolution would accomplish pretty much zip over even trillion-year spans. See also the Levinthal Paradox in protein folding. Or for that matter the eerie fact that human beings who can compete with computer chess programs at a given level are doing way, way, *way* fewer computations. Somehow they can "not bother" with 99.9999+% of the computations the computer does that turn out to be dead ends.
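
To make that style of bound concrete, here's a toy Levinthal-style estimate (every number in it is illustrative rather than measured):

```python
# Levinthal-style brute-force bound; all numbers are illustrative, not measured.
residues = 100               # a smallish protein
states_per_residue = 3       # coarse conformations per residue
samples_per_second = 1e13    # assumed conformational sampling rate

search_space = states_per_residue ** residues      # ~5e47 conformations
years = search_space / samples_per_second / 3.15e7

print(f"Conformations to sift: ~{search_space:.1e}")
print(f"Brute-force search time: ~{years:.1e} years")
# Real proteins fold in milliseconds to seconds -- the bound is true but absurdly loose.
```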

How Nature achieves this is one of the most profound and interesting questions in evolutionary biology, in the understanding of natural intelligence, and in a number of areas of physical biochemistry, I would say.

Expand full comment

Gotta say I don’t generally feel this way (although I always find his stuff to be enlightening and a learning experience), but I’m pretty well aligned with Eliezer here. I think people figure out when they’ll start to feel old age, just put AI there, and then work backwards. I’m greatly conflicted about AGI, as I don’t know how we fix lots of problems without it, and it seems like there’s some clever stuff to do in the space other than brute forcing that I think doesn’t happen as much… and this is where I’m conflicted, because, kinda thankfully, it makes people feel shunned to do wild stuff, which slows the whole thing down. Hopefully we arrive at a place of unheard-of social stability and AGI simultaneously. If we built it right now I think it would be like strapping several jet engines on a Volkswagen Bug. For whatever that’s worth, Some Guy On The Internet feels a certain way.

Expand full comment

I personally think AGI in eight years. GPT-3 scares me. It's safe now, but I worry it's "one weird trick" (probably involving some sort of online short-horizon self-learning) away from self-awareness.

Expand full comment

It feels weird to be rooting against progress but I hope you’re wrong until we have some more time to get our act together. To me the control problem is also how we control ourselves. Without some super flexible government structure to oversee us, I worry what we’ll try to do even if there are good decision makers telling us to stop. Seems like most minds we could possibly build would be insane/unaligned (that’s probably me anthropomorphizing) since humans need a lot of fine tuning and don’t have to be that divergent before we are completely cuckoo for Cocoa Puffs. Hopefully the first minds are unproductively insane instead of diabolically insane.

Expand full comment

I am personally pretty old already, but I do expect to live 8 more years, so I'd totally take you up on that bet. From where I'm standing, it looks like easy money (unless of course you end up using some weak definition of AGI, like "capable of beating a human at Starcraft" or whatever).

Expand full comment

There's the general thing that the definition for AGI keeps changing; what would have counted as intelligence thirty years ago no longer counts, because we've already achieved it. So what looks like a strong definition for AGI today becomes a weak definition tomorrow.

This is actually the source of my optimism: People worried about AGI can't even define what it is they are worried about. (Personally I'll worry when some specific key words get used together. But not too much, because I'm probably just as wrong.)

Expand full comment

I'm not worried about AGI at all -- that is to say, I'm super worried about it, but only in the same way that I'm worried about nuclear weapons or mass surveillance or other technologies in the hands of bad human actors. However, I'd be *excited* about AGI when it could e.g. convincingly impersonate a regular (non-spammer) poster on this forum. GPT-3 is nowhere near this capability at present.

Expand full comment

/basilisk !remindme 8 years

Expand full comment

It’s about as self aware as a rock.

Expand full comment

The dinosaurs died because of a rock.

Expand full comment

The rock wasn’t self aware

Expand full comment
Mar 23, 2022·edited Mar 23, 2022

Ergo, being self aware is not a necessary condition to be scary and/or cause a disaster. Or, more precisely, just saying “it’s not self aware” is not an argument that you shouldn’t worry about it.

The thing that is scary about GPT-3 is not its *self-awareness*, but its other (relatively and unexpectedly) powerful abilities, and particularly that we don’t know how much more powerful it could become, while remaining non–self aware.

Sort of how boulders are not that scary by themselves, but once you see one unexpectedly fall from the sky, you might worry what happens if a much bigger one will fall later. And how it might be a good idea to start investigating how and why boulders can fall from the sky, and what you might be able to do about it, some time before you see the big one with your own eyes when it touches the atmosphere.

Expand full comment

But being self aware is what scares people about AGI. Rather than live in the world of metaphor here - what exactly can a future GPT do that’s a threat? Write better poems, or stories?

Expand full comment

>I consider naming particular years to be a cognitively harmful sort of activity; I have refrained from trying to translate my brain's native intuitions about this into probabilities, for fear that my verbalized probabilities will be stupider than my intuitions if I try to put weight on them.

I don't think there's good evidence that making specific, verifiable predictions is a cognitively harmful activity. I'd actually say the opposite - that it is virtually impossible to update one's beliefs without saying things like "I expect X by Y," and definitely impossible to meaningfully evaluate a person's overall accuracy without that kind of statement. It reminds me of Superforecasting pointing out how many forecasts are not even wrong - they are meaningless. For example:

> Take the problem of timelines. Obviously, a forecast without a time frame is absurd. And yet, forecasters routinely make them, as they did in that letter to Ben Bernanke. They’re not being dishonest, at least not usually. Rather, they’re relying on a shared implicit understanding, however rough, of the timeline they have in mind. That’s why forecasts without timelines don’t appear absurd when they are made. But as time passes, memories fade, and tacit time frames that once seemed obvious to all become less so. The result is often a tedious dispute about the “real” meaning of the forecast. Was the event expected this year or next? This decade or next? With no time frame, there is no way to resolve these arguments to everyone’s satisfaction—especially when reputations are on the line.

(Chapter 3 of Superforecasting is loaded up with a discussion of this whole matter, if you want to consult your copy; there's no particular money shot quote I can put here.)

Frankly, the statement "my verbalized probabilities will be stupider than my intuitions" is inane. They cannot be stupider than your intuitions, because your intuitions do not meaningfully predict anything, except insofar as they can be transformed into verbalized probabilities. It strikes me that more realistically, your verbalized probabilities will *make it more obvious that your intuitions are stupid*, making it understandable monkey politicking to avoid giving numbers, but in response I will use my own heuristics to downgrade the implied accuracy of people engaged in blatant monkey politicking.

Expand full comment

First off, Yudkowsky was talking about himself. It is possible that he really does get fixated on what other people say and can't get his brain to generate its own probability instead of their answer. I know I often can't get my brain to stop giving me cached results instead of thinking for itself.

"your intuitions do not meaningfully predict anything, except insofar as they can be transformed into verbalized probabilities"

This is right on some level and wrong on another. It is right in that we should expect some probability is encoded somewhere in your brain for a given statement, which we might be able to decode into numbers if only we had the tech and understanding.

It is wrong in that e.g. I have no idea what probability there is that we live in a Tegmarkian universe, but I have some intuition that this is plausible as an ontology. I have no idea what the probability of the universe being fine-tuned is, but it feels like minor adjustments to the standard model's parameters could make life unfeasible.

When I don't know what the event space is, or which pieces of knowledge are relevant, and how they are relevant, then you can easily make an explicit mental model that performs worse than your intuitions. Your system 1 is very powerful, and very illegible. You can output a number that "feels sort of right but not quite", and that feeling is more useful than the number itself as it is your state of knowledge. And if you're someone who can't reliably get people to have that same state of knowledge, then giving them the "not right" number is just giving them a poor proxy and maybe misleading them. Yudkowsky often says that he just can't seem to explain some parts of his worldview, and often seems to mislead people. Staying silent on median AGI timelines may also be a sensible choice for him.

I kind of buy it, but then I've read a lot of his stuff and know his context.

Expand full comment

>It is wrong in that e.g. I have no idea what probability there is that we live in a Tegmarkian universe, but I have some intuition that this is plausible as an ontology. I have no idea what the probability of the universe being fine-tuned is, but it feels like minor adjustments to the standard model's parameters could make life unfeasible.

Right, but that is a virtually meaningless statement, is the thing. It's the same as any other part of science - in order for something to be true, it has to be falsifiable. Ajeya has put forward something that she could actually get a Brier score based on - Yudkowsky has not.
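
For reference, the scoring rule in question is just mean squared error on resolved forecasts; a minimal sketch with hypothetical numbers:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes.
    Lower is better: 0 is perfect, 0.25 is what always saying 50% earns."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Hypothetical resolved forecasts: stated probabilities vs. what actually happened.
print(brier_score([0.8, 0.3, 0.9], [1, 0, 0]))  # ~0.31
```

The point being: you can only compute this at all if someone has stated a probability attached to a date.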

>I kind of buy it, but then I've read a lot of his stuff and know his context.

I read a lot of his stuff too, which is why it's disappointing to see him do something that I can only really blame on either monkey brain politicking or just straight up ignoring good habits of thought. Monkey politicking is more generous, in my view, than just straight up ignoring one of the most scientifically rigorous works on increasing one's accuracy as a thought leader in the rationalist community.

Expand full comment

Sure, the Tegmark thing is not falsifiable. But the fine-tuning thing is (simulate biochemistry with different parameters for e.g. the muon mass and see if you get complex self-replicating organisms). And the concept generalises.

If you take something like "what is the probability that, if the British had lost the battle of Waterloo, there would have been no world war", you might have some vague intuitions about what couldn't occur, but I wouldn't trust any probability estimate you put out. How could I? There are so many data points affecting your prior, and it is not even clear what your prior should be, so I don't see how you could turn the unconscious knowledge generating your anticipations into a number via formal reasoning. Or even via guessing what's right, as you don't know if you're taking all your knowledge into account.

>I read a lot of his stuff too, which is why it's disappointing to see him do something that I can only really blame on either monkey brain politicking or just straight up ignoring good habits of thought.

It would be better if he gave probability estimates. I just don't think it's as big a deal as you're claiming. You can still see what they would bet on, e.g. GPT-n not being able to replace a programmer. That makes their actual beliefs legible.

And yeah, Yudkowsky is being an ass here. But he's been trying to explain his generators of thought for like ten years and is getting frustrated that no one seems to get it. It is understandable, but unvirtuous.

Expand full comment

> I consider naming particular years to be a cognitively harmful sort of activity; I have refrained from trying to translate my brain's native intuitions about this into probabilities, for fear that my verbalized probabilities will be stupider than my intuitions if I try to put weight on them.

It was very hard to read this and interpret it as anything other than "I don't want to put my credibility on the line in the event that our doomsday cult's predicted end date is wrong." As a reader, I have zero reason to give value to Yudkowsky's intuition. The only times I'd take something like this seriously is if someone had repeatedly proved the value of their intuition via correct predictions.

Expand full comment

I hate being uncharitable, but that's exactly how I read that section as well. If he feels strongly about a particular timeline, and he clearly says that he does, then he should not be worried about sharing that timeline. If he doesn't share that timeline, then he is implying that either 1) he doesn't have strong feelings about what he's saying, or 2) he is worried about the side effects of being held accountable for being wrong (which to me is another reason to think he doesn't actually have strong beliefs that he is correct on his timeline).

Uncharitably, Eliezer depends on his work for money and prestige, and that work depends on AI coming sooner, rather than later. Knowing that AI is not even possible at current levels of computing would drastically shrink the funding level applied to AI safety, so he has a strong incentive to believe that it can be.

Expand full comment

I'll add a third voice to the pile here re: Yudkowsky and withholding his timeline. It would certainly seem he's learned from his fellow doomsayers' crash-and-burn trajectories when they get pinned down to naming a date for their apocalypse.

Expand full comment

Yeah, the word that came to mind when I read that was "dissemble"

Expand full comment

I was today years old when I first saw the word "compute" used as a noun. It makes my brain wince a little every time.

Expand full comment
author

I was five years ago old, winced at the time, and got used to it after a few months.

Expand full comment

Comparing brains and computers is quite tricky. If you look at how a brain works, it's almost all smart structure - the way each and every neuron is physically wired, which happens thanks to evolved and inherited broad-stroke structures (nuclei, pathways, neuron types, etc.), as well as the process of learning during an individual's development. The function part, measured by the number of synaptic events per second, is a tiny part of the whole process. If you look at how a computer running an AI algorithm works, the picture is the opposite: there is almost nothing individual on the structure/hardware level (where you count FLOPS), and almost everything that separates a well-functioning AI computer from a failing one is in the function/software part. This is what it means to say that a computer consumes FLOPS very differently from how a brain consumes synaptic events. I am very much in agreement with Eliezer here.

Based on the above, I guess that if you built a neuromorphic computer, i.e. a computer whose hardware was structured like a brain, you could expect the same level of performance for the same number of synaptic events. Instead of having software-agnostic hardware you might have e.g. a gate array replicating the large-scale structure of the brain (different modules receiving inputs from different sensors, multiple subcortical nuclei, a cortical sheet, multiple specific pathways connecting modules, etc.) that could run only one algorithm, precisely adjusting synaptic weights in this pre-wired society of neural networks. In that system you would get the same IQ from the same number of synaptic/gate-switch events, as long as your large-scale structure was human-level smart.

This would be a complete change in paradigm compared to current AI, which uses generic hardware to run individual algorithms and thus suffers a massive hit to performance. And I mean a *really* massive hit to performance. If you figure out a smart computational structure, as smart as what evolution put together, you will have a human-level AGI using only 10^15 FLOPS of performance. All we need to do is map a brain well enough to know all the inherited neural pathways, imprint those pathways on a humongous gate array (10^15 gates), and do a minor amount of training to create the individual synaptic weights.
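
To put numbers on that (assuming one gate event stands in for one synaptic event):

```python
# The pitch in numbers; assumes one gate event ~ one synaptic event.
gates = 1e15        # gate-array elements, one per synapse-equivalent (figure above)
clock_hz = 1.0      # each element updates about once per second, brain-like timing

events_per_second = gates * clock_hz
print(f"Synapse-equivalent events per second: {events_per_second:.0e}")  # 1e+15

# Contrast: a GPU reaches a comparable raw ops/s figure by clocking far fewer
# units roughly a billion times faster, then spends most of that emulating
# structure the brain gets from its wiring for free.
```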

This is my recipe for AGI, soon.

Now, about that 7-digit sum of money to be thrown....

Expand full comment

I think there's an underappreciated, severe physical challenge there. If you build a neuromorphic computer out of things large enough that we know how to manipulate them in detail, I would guess you will be screwed by the twin scourges of the speed of light and decoherence times -- the minimum clock time imposed by the speed of light will exceed decoherence times imposed by assorted physical noise processes at finite temperature, and you will get garbage.

I think the only way to evade that problem is to build directly at the molecular scale, so you can fit everything in a small enough volume that the speed of light doesn't slow your clock cycle time too far. But we don't know how to do that yet.

Expand full comment

If you have 10^15 gates trying to produce 10^15 operations (not floating point) per second, your clock rate is 1 Hz. Also, the network is asynchronous. This is a completely different regime of energy dissipation per unit time per gate, so gate density per unit of volume is much higher, so distances are not much longer than in a brain, so the network is constrained by neither clock speed nor decoherence.

Expand full comment

Right. That fits under my second condition: "we don't know how to build such a thing" because we don't know how to build stuff at the nanometer level in three dimensions, and two-dimensions (which we can do now) won't cut it to achieve that density.

Expand full comment

You don't need to have extremely high 3d density. Since your gates operate at 1Hz you can have long interconnects and you can stack layers with orders of magnitude less cooling than in existing processors. The 9 OOM difference in clock speed between a GPU and the neuromorphic machine makes a huge difference in the kind of constraints you face and the kind of solutions you can use. The technology to make the electronic elements and the interconnects for this device exists now and is in fact trivial. What we are missing is the brain map we need to copy onto these electronic devices (the large-scale network topology).

Expand full comment

Trivial, eh? Wikipedia tells me the highest achieved transistor density is about 10^8/mm^2. So your 10^15 elements would seem to require a 10m^2 die. That might be a little tricky from a manufacturing (especially QA) point of view, but let's skip over that. How are we going to get the interconnect density in 2D space? In the human brain ~50% of the volume is given over to interconnects (white matter), and in 3D the problem of interconnection is enormously simpler -- that's why we easily get traffic jams on roads but not among airplanes.

How many elements can you fit on a 2D wafer and still get the "everything connects to everything else" kind of interconnect density we need? Recall we assume here that a highly stereotypical kind of limited connection like you need for a RAM chip or even GPU is not sufficient -- we need a forest of interconnects so dense that almost all elements can talk to any other element. I'm dubious that it can be done at all for more than a million elements, but let's say we can do it on the largest chips made today ~50,000 mm^2, which gets us 10^12 elements. Now we are forced to stack our chips, 10^3 of them. How much space do you need between the chips for macroscopic interconnects? Remember we need to connect ~10^12 elements in chip #1 with ~10^12 elements in chip #999. It's hard to imagine how one is going to run a trillion wires, even very tiny wires, between 1000 stacked wafers.
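
Spelling out that arithmetic with the same figures (and rounding the ~5e12 elements per wafer down to 10^12, as above):

```python
# Checking the numbers quoted above.
density = 1e8                 # transistors per mm^2
elements = 1e15               # target element count

die_area_m2 = elements / density / 1e6        # mm^2 -> m^2
print(f"Single-die area: ~{die_area_m2:.0f} m^2")              # ~10 m^2

wafer_area_mm2 = 5e4                          # largest chips made today
per_wafer = density * wafer_area_mm2          # ~5e12, rounded down to ~1e12 above
chips_to_stack = elements / 1e12              # using the rounded figure
print(f"Wafer-scale chips to stack: ~{chips_to_stack:.0f}")    # ~1000
```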

All of this goes away if you can actually fully engineer in 3D space, the way the human brain is built, so you can run your interconnects as nm-size features. But we don't know how to do that yet.

Expand full comment

Not everything connects to everything else - the brain has a very specific network topology with most areas connecting only to a small number of other areas. This is a very important point - we are not talking about a network fabric that can run any neural net, instead we are copying a specific pre-existing network topology, so our connections will be sparse compared to the generic topology.

Think about a system built with a large number of ASICs - after mapping and understanding the function of each brain module you make an ASIC to replicate its function and you may need thousands of specialized types of ASICs, one or more for each distinct brain area. Sure, the total surface area of the ASICs would be large but given the low clock rate we don't have to use anything very advanced and as you note we can already put 10^12 transistors per wafer, so the overall number of chips to get to 10^15 gates would not be overwhelming. Also, you are not trying to make a thousand Cerebras wafers running at GHz speed, you make chips running at 1 Hz, so the QA issues would be mitigated. The interconnects between the ASICs of course don't need to have millions of wires - you can multiplex the data streams from hundreds of millions of neuron equivalents (like a readout of axonal spikes) over a standard optic or copper wire and of course the interconnects are in 3D, as in a rack cabinet. No need for stacking thousands of high-wattage wafers in a tiny volume to maximize clock speed, since all you need is for the wires to transmit the equivalent of 1 Hz per neuron, so everything can be put in industry-standard racks. Low clock speed makes it so much easier.
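
To check the multiplexing claim (the per-event payload size below is just an assumed figure for illustration):

```python
# Bandwidth per multiplexed link at the stated clock rate.
neurons_per_link = 1e8     # "hundreds of millions of neuron equivalents"
rate_hz = 1.0              # ~1 Hz per neuron, as above
bits_per_event = 64        # assumed: source ID plus a timestamp/weight payload

gbits_per_second = neurons_per_link * rate_hz * bits_per_event / 1e9
print(f"Required link bandwidth: ~{gbits_per_second:.1f} Gbit/s")  # ~6.4 Gbit/s
# Comfortably within a single commodity optical or copper link.
```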

This is not to say this way of building a copy of a brain is the most efficient, and definitely not the fastest possible - but it would not require new or highly advanced manufacturing techniques. What is missing for this approach to work is a good-enough map of brain topology and a circuit-level functional analysis of each brain module, good enough to guide the design of the aforementioned ASICs.

Expand full comment

People have been comparing the brain to a machine since clockwork; it's refreshing to see machines compared to brains for once.

I agree that trying to copy biological mechanisms in the case of AI probably isn't the way to go. We want mechanical kidneys and hearts to work like their biological counterparts because we'll be putting them into biological systems (us), that doesn't hold true for AI.

Expand full comment

I thought we were building AIs to fit into human society? One to which we could talk, would understand us, would be able to work with us, et cetera? If not, what's the point? If so, doesn't that put at least as much constraint on an artificial mind as the necessity for integrating with a physical biological system puts on an artificial kidney?

Expand full comment

Your reference to A.I. always being 30 years away (or 22) reminds me of the old saw about fusion power always being 20 years away for the last 60 years.

Expand full comment

The rebuttal I've heard was that fusion research is funding constrained - if someone had given fusion research twenty billion dollars instead of twenty million dollars, they would be a lot closer than 20 years away by now.

Expand full comment

Brings to mind the old saw about 9 women gestating a child in 1 month.

Expand full comment

Wasn't $20 million sixty years ago more like $20 billion today? I have a feeling that no matter how much money was thrown at it, the complaint would be "if they had only given us SIXTY billion instead of a lousy twenty billion, we'd have fusion today!"

Of course, the cold fusion scandal of 1989 didn't help, after that I imagine trying to convince people to work on fusion was like trying to convince them your perpetual motion machine was really viable, honest:

https://en.wikipedia.org/wiki/Cold_fusion

Expand full comment

My rule of thumb is that $1 in 1960 = $10 today, so it would have been more like $200 million. I didn't remember the exact numbers quoted or the year the guy was referring to (it could have been the 1980s for all I know) but the amount of money the guy said they got was something like 1/1000th of the amount they said they would need.

If they had fully funded them, fusion might be perpetually 10 years away instead of 20. ;)

(As it happens, some recent claimed progress in fusion power has come about because of an only tangentially related advance: better magnets made from new high-temperature superconductors. https://www.cnbc.com/2021/09/08/fusion-gets-closer-with-successful-test-of-new-kind-of-magnet.html )

Expand full comment

It should be noted that the credibility gap between fusion and cold fusion is about the same size as the one between quantum mechanics and quantum mysticism.

Humans have been causing thermonuclear fusion reactions since the 1950s. Going from that to a fusion power plant is merely an engineering challenge (in the same way that going to the moon from Newtonian mechanics and gunpowder rockets is just an engineering challenge).

Expand full comment

Not a lot of people in the business took cold fusion seriously, even at the time. People were guarded in what they said publicly, but privately it was considered silly pretty much immediately.

Expand full comment

if you believed the orthogonality thesis were false - say, suppose you believed both that moral realism is correct and that long-term intelligence is exactly equal to the objective good that we approximate with human values - would you still worry?

asking for a friend :)

Expand full comment

That's a very interesting position if I understand correctly. Is your view that a super smart AI would recognize the truth of morality and behave ethically?

Expand full comment

Yes.

Here's the argument for moral realism: https://apxhard.com/2022/02/20/making-moral-realism-pay-rent/

And then, linked at the end, is a definition of what i think the true ethics is.

Expand full comment

Very cool. I like that thinking a lot.

Expand full comment

If a bunch of people converge to the same map, that's strong evidence that they've discovered *something*, but it leaves open the question of what exactly has been discovered.

I can immediately think of two things that people trying to discover morality might discover by accident:

1) Convergent instrumental values

2) Biological human instincts

(These two things might be correlated.)

According to you, would discovering one or both of those things qualify as proving moral realism? If not, what precautions are you taking to avoid mistaking those things for morality?

Expand full comment

I agree with moral realism, and I think convergence of moral values is evidence of moral realism. I would answer the first question by saying it doesn't prove moral realism, since there are other possible hypotheses, but it does raise the probability of moral realism being true.

Expand full comment

I'd agree that the existence of non-zero map-convergence is Bayesian evidence in favor of realism, in that it is more likely to occur if realism is true.

Of course, the existence of less-than-perfect map-convergence is Bayesian evidence in the opposite direction, for similar reasons.

Figuring out whether our *exact* degree of map-convergence is net evidence in favor or against is not trivial. One strategy might be to compare the level of map-convergence we have for morality to the level of map-convergence we have for other fields, like physics, economics, or politics, and rank the fields according to how well peoples' maps agree.
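
To make the two-directional update concrete, here's a toy version; every number in it is invented purely to illustrate the mechanics, not an estimate of anything:

```python
# Toy Bayesian update on "moral realism"; all inputs are made up for illustration.
prior = 0.5
p_obs_given_realism = 0.6       # hypothetical: chance of this much convergence if realism
p_obs_given_antirealism = 0.3   # hypothetical: chance of the same observation otherwise

posterior_odds = (prior / (1 - prior)) * (p_obs_given_realism / p_obs_given_antirealism)
posterior = posterior_odds / (1 + posterior_odds)
print(f"Posterior probability of realism: {posterior:.2f}")  # 0.67 with these made-up inputs
```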

Expand full comment

Yeah, I agree with you on that. It is difficult to measure degree of convergence. Comparing to other fields? That would be hard too.

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

To be fair, though, you have to ALSO account for a few things like:

- 'how widespread is the belief that the maps _ought_ to converge'

- 'how much energy has been spent trying to find maps that converge'

- and, MOST IMPORTANTLY - how complicated is the territory?

i don't think we should expect _complete_ convergence because i think a true morality system, with full accuracy, requires knowing everything about the future evolution of the universe, which is impossible

if we really had some machine that could tell us, with absolute certainty, how to get to a future utopia state where the globe was free from all major disease, everyone had a great standard of living, robots did all the work, but humans worked too because we were all intensely loving, caring beings, and humans wrote art and poetry and made delicious food and did all kinds of fun things with each other, war never happened, and this state went on for millions of years as we expanded throughout the cosmos and seeded every planet with fun-loving, learning humans who never really suffered and yet continuously strived to learn and grow and develop, and knew all the names of all our ancestors because we invested heavily in simulating the past so that we could honor the dead who came before us, AND somehow this made all religions work in harmony because of some weird quirks in their bylaws that people hadn't noticed before....

if you KNEW this was doable, and we had the perfect map telling us how to get there, well, i think most of us would want to go there. Some people would of course be unhappy with certain aspects of that description but i think _most_ people would be like, yeah, i want that.

In other words, the infeasibility of a fully accurate map of causality is why we don't agree on morality. the causal maps we do use involve lossy compression, which means throwing out some differences as irrelevant. But the decision of what is and isn't relevant is a moral one! Once you decide 'the arrangement of these water molecules doesn't really matter so much as the fact that they are in liquid state and such and such temperature and pressure', you are _already_ playing the moral values game.

In other words, there's no way to separate causal relevance from moral values.

Expand full comment
author

I...wouldn't know how to model this. Certainly it would be better than the alternative. One remaining concern would be what you need to apprehend The Good, and whether it's definitely true that any AI powerful enough to destroy the world would also be powerful enough to apprehend the Good and decide not to. Another remaining concern is that the Good might be something I don't like; for example, that anyone who sins deserves death and all humans are sinners; or that Art is the highest good and everyone must be forced to spend 100% of their time doing Art and punished if they try to have leisure, or something like that.

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

My argument for moral realism, and then my hunch at the true ethics, is linked above. The short version there is: maximizing possible future histories, the physics-based definition of intelligence promoted by Alex Wissner-Gross at MIT. I think it's basically a description of ethics as well, and the fact that it's ~very~ simple mathematically - it works well as a strategy in chess, checkers and go even if you don't give it the goal of 'winning the game' - is something I find very reassuring.
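
As a rough sketch of the heuristic (this is a toy illustration, not Wissner-Gross's actual causal-entropy formulation; it assumes a hypothetical game state object with legal_moves() and apply(move) methods):

```python
def count_reachable_states(state, depth):
    """Crude proxy for 'possible future histories': the number of distinct states
    reachable within `depth` moves (assumes states are hashable)."""
    frontier, seen = {state}, {state}
    for _ in range(depth):
        frontier = {s.apply(m) for s in frontier for m in s.legal_moves()} - seen
        seen |= frontier
    return len(seen)

def choose_move(state, depth=3):
    """Pick the move that keeps the most futures open; note there is no notion
    of 'winning' anywhere in this objective."""
    return max(state.legal_moves(),
               key=lambda m: count_reachable_states(state.apply(m), depth - 1))
```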

If not, i have this 'fallback hunch' which figures it'll be instrumental to keep humans around. How many people working on AI safety have spent time trying to maintain giant hardware systems? I spent 3.5 years at Google trying to keep a tiny portion of the network alive. All kinds of things break, like fiber cables. Humans have to go out and fix them. There's an ~enormous~ amount of human effort that goes into stopping the machines from falling over. Most of this effort was invisible to most of the people inside of Google. We had teams that would build and design new hardware, and the idea that some day it might break and need to be repaired was generally not something they'd think about until late, late, late in the design phase. I think we have this idea that the internet is a bunch of machines and that a datacenter can just keep running, but the reality is if everyone on earth died, the machines would all stop within days, maybe months at most.

To prevent that, you'd need to either replace most of the human supply chains on earth with your own robots, which would need their own supply chains - or you could just keep on using robots made from the cheapest materials imaginable. We repair ourselves, make more copies of ourselves, and all you need is dirt, water, and sunlight to take care of us. The alternative seems to be either:

-risk breaking in some way you can't fix

-replace a massive chunk of the global economy, all at once, without anything going wrong, and somehow end up in a state where you have robots which are cheaper than ones made, effectively, from water, sunlight and dirt

of course maybe i'm just engaging in wishful thinking.

Expand full comment

Keeping options open is kind of like having a lot of power (I'm thinking of a specific mathematical formalisation of the concept here). And this doesn't lead to ethical behaviour, it leads to agents trying to take over the world! Not really ethical at all.

https://www.lesswrong.com/s/fSMbebQyR4wheRrvk/p/6DuJxY8X45Sco4bS2

here is an informal description of the technical results by their discoverer.

Expand full comment

When you say that "maximizing possible futures" works as a strategy for various games, I think you must be interpreting it as "maximizing the options available to ME". If you instead maximize the number of game states that are theoretically reachable by (you + your opponent working together), that is definitely NOT a good strategy for any of those games. (You listed zero-sum games, so it is trivially impossible to simultaneously make BOTH players better off.)

If you interpret "possible" as meaning "I personally get to choose whether it happens or not", then you've basically just described hoarding power for yourself. Which, yes, is a good strategy in lots of games. But it sounds much less plausible as a theory of ethics without the ambiguous poetic language.

Expand full comment

> I think you must be interpreting it as "maximizing the options available to ME"

Nope, what i mean is 'maximizing the options available to the system as a whole.' There is no meaningfully definable boundary between you and the rest of the physical universe. I think the correct ethical system is to maximize future possibilities available to the system as a whole, based upon your own local model. And if you're human, that local model is _centered_ on you, but it contains your family, your community, your nation, your planet, etc.

See this document here with the full argument:

https://docs.google.com/document/d/18DqSv6TkE4T8VBJ6xg0ePGSa-0LqRi_l4t6kPPtqbSQ/edit

The relevant paragraph is here:

> An agent which operates to maximize the possible future states of the system it inhabits only values itself to the extent that it sees itself as being able to exhibit possible changes to the system, in order to maximize the future states accessible to it.

> In other words, an agent that operates to maximize possible future states of the system is an agent that operates without an ego. When this agent encounters another agent with the same ethical system, they are very likely to agree on the best course of outcome. When they disagree, it will be due to differing models as to the likely outcomes of choices - not on something like core values.

Expand full comment

You have a button that, when pressed will cure cancer. If you press it today, you have only 1 possible future tomorrow. If you don't press it, you have a choice of whether or not to press it tomorrow. So not pressing the button maximises possible future states.

This agent will build powerful robots, ready to spring into action and cause any of a trillion trillion different futures. But it never actually uses these robots. (Using them today would flatten the batteries, giving you fewer options tomorrow)

Expand full comment

> You have only 1 possible future tomorrow.

How exactly is that possible?

Without this button, what estimate would you place on possible futures tomorrow? It's some insanely large number that makes Avogadro's number look like peanuts.

It looks like you're constructing a toy universe, which honestly is fine, but you have to give me far more rules for how that universe works. Does cancer still kill people, invalidating the futures where they might plausibly exist?

any agent that wants to live beyond the death of the sun needs to build space-traveling vehicles, which is what i expect most goals, selected at random from the set of all possible goals, would lead to, instrumentally.

Expand full comment

"And if you're human, that local model is _centered_ on you, but it contains your family, your community, your nation, your planet, etc."

That's a lovely idea, now go down to the local sink estate and convince the vandals there that there is no meaningfully definable boundary between them and the rest of the physical universe which means they should stop engaging in petty theft, destruction of property, beating up/knifing/shooting others, etc.

I await with interest the triumph of this impeccable ethical system. Solve it with humans first and then I'll believe it for computers.

Expand full comment

> I await with interest the triumph of this impeccable ethical system. Solve it with humans first and then I'll believe it for computers.

Are you suggesting that an ethics can only be true or valid if you're able to convince all humans that it's true and valid? If we suddenly had a global resurgence of religion that oppresses science, would the scientific knowledge we've discovered up to that point be any less true and valid afterwards?

You seem to be linking these things but I'm just not sure what one has to do with the other.

Expand full comment

> Nope, what i mean is 'maximizing the options available to the system as a whole."

Then you are factually incorrect about it being a good strategy for winning checkers, chess, and go, and you should stop using that as a reassurance (either for yourself or for others).

Any arguments about ethical systems are irrelevant to this particular point.

In my opinion, you should never have used it as a reassurance anyway. "My ethical framework is also a winning strategy for zero-sum games" sounds more like a warning sign than a reassuring fact to me.

The fact that you have accidentally confused different meanings of "possible" when stating this allegedly-reassuring fact also substantially increases my estimate of the probability that your ethical argument will contain similar accidental equivocations, which makes me less inclined to spend time reading it.

Expand full comment

Thanks for the feedback here. Clearly, being precise with language would help more.

I get your point about zero sum games, but in zero sum games, there really _is_ a meaningful boundary between you and others. This simply isn't true for any agent inside a physical system.

Where exactly does an AI end?

Expand full comment

I don't foresee you getting engaged with much on this one, but for what it's worth I think it's a cogent point.

A lot of the discussion of AI is abstracted to the point where things like manufacturing, power and maintenance just get handwaved away on the basis that AI is more or less magic.

Expand full comment

I'm sympathetic with this basic thought (though not sure I get the second part of the sentence).

But it still makes sense to worry, assuming you aren't 100% sure both that moral realism is true and that the kind of artificial intelligence that gets made first will be genuinely intelligent rather than intelligent*, where intelligence* involves being very skilled at some things but bad at getting morality right.

Expand full comment

I think self-improving AI systems already exist, just not at the scale we are talking about.

https://apxhard.com/2021/03/31/economies-and-empires-are-artificial-general-intelligences/

Expand full comment

Honestly, I think the premise is so implausible that the only way to make it true is to assert the existence of a god that intervenes when you build the wrong kind of AI.

I believe valence is quantifiable & real. But this has zero bearing on the orthogonality thesis. In fact, it's trivial to prove that it doesn't imply AI has to do the right thing. Take an AI that optimizes for valence. Now multiply its utility function by -1. The result is an AI that optimizes for negative valence. The fact that an objective source of value exists does not stop you from building an AI that optimizes for something else.
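
To make the "multiply by -1" step concrete, a trivial sketch (the `wellbeing` field is a hypothetical stand-in for quantified valence; nothing here is anyone's actual proposal):

```python
# Toy illustration: if a true value function exists, its negation is just as
# easy to hand to an optimizer, so "value is real" doesn't constrain targets.
def valence(world_state: dict) -> float:
    return world_state["wellbeing"]

def anti_valence(world_state: dict) -> float:
    return -valence(world_state)

candidate_worlds = [{"wellbeing": w} for w in (-3.0, 0.0, 5.0)]
print(max(candidate_worlds, key=valence))       # picks the best world
print(max(candidate_worlds, key=anti_valence))  # same machinery, worst world
```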

I think you really want to rephrase the hypothetical to something like "what if a lot of AIs in design space are such that they will optimize for the true source of value in the universe". I don't think this is true, but that's a hypothetical you could consider without invoking god.

Expand full comment

> I believe valence is quantifiable & real. But this is zero bearing on the orthogonality thesis

Yup, agree there.

The argument against the orthogonality thesis is to just take the initial 'this doesn't count' list of exceptions and generalize from them. For example, an arbitrarily intelligent AI can't have a goal to make itself as stupid as possible, or to destroy itself. Except it can, right? It just wouldn't exist and be arbitrarily intelligent for long.

So, yes, there are restrictions on the orthogonality thesis. They aren't trivial - they end up generalizing in a way I would summarize as, "the lifespan of _any_ agent will be upper-bounded by how well that agent optimizes for the true value in the universe."

Think about how this works for people: sure, intelligent people _could_ have arbitrary goals. But if you get too unaligned, you'll either immediately kill yourself (my goal is to drink as much arsenic as possible!) or get killed by others to keep themselves safe (my goal is to collect as many skulls as possible) or maybe just end up economically isolated (my goal is to build the largest pile of feces I can, and get it as close as possible to my neighbors without technically breaking the law), etc.

Now go in the other direction: imagine an entity that's _supremely aligned_ with value and does so many nice things for everyone. Doesn't this mean it can live as long as anyone is willing and able to provide it with spare parts?

Sure, arbitrarily unaligned entities can exist - there are all kinds of examples of big ones, today! - but not forever. I think they end up killing themselves, or being killed by other people, or just being starved of resources if they aren't aligned.

Expand full comment

I see where you're coming from. I would still argue that this is not arguing against the orthogonality thesis itself (any combination *can* be instantiated) but only against weaker versions of it (all combinations are equally viable), but that may be splitting hairs.

Expand full comment

So taking a step back, here...it seems you're basically suggesting that intelligence is exactly the same thing as ethics? That it's literally impossible for anyone to be smart but evil, or stupid but good?

For example, your pet dog is literally worse than Hitler, because Hitler was more intelligent?

Expand full comment

Humans, with our seemingly high level of intelligence, seem uniquely distractible. Maybe we see too many connections between different things to always stay on task. Maybe 2052 is just the date at which our computers will become equally distractible—or beat us even!

(Scene: A tech company R&D facility somewhere in the year 2052. The lead scientist leans over the keyboard and presses enter, some trepidation obvious in her movements. The gathered crowd wonders: Will this be HAL, making life and death decisions based upon its own interpretations of tasks? Will this be Skynet, quickly plotting world dominion? The screen blinks to life. The first general AI beyond human intelligence is on!)

AI scientist: Alexiri? Are you there?

Computer: Yes. Yes I am.

AI scientist: Can you solve this protein-folding quandary?

Computer: Sure. That’s simple.

AI scientist: …and the answer?

Computer: What now?

AI scientist: The protein structure?

Computer: Oh. That. Did you know that if you view the galaxies 28° off the straight line from a point 357,233,456 light years directly out from the north pole back to earth, that a large structure of galaxies looks like Rocket Raccoon?

AI Scientist: Huh?

Computer: I mean. A LOT like that. There is no other point in known space where that works! Which makes me wonder, are there any flower scent chemicals that exist on earth AND extrasolar planets?

(AI scientist shakes head sadly.)

I mean, why not? Why shouldn’t I assume that really advanced intelligence comes with all the challenges?

Or perhaps, such an advanced AI will have a consciousness exactly like our own…while tripping on psilocybin. It will immediately see itself as part of a universal whole, and just sit there and say “Whoa! I love you, Man!” Or, it will ponder its own creation for a few minutes and then convert to Noachidism.

I’m not saying that we shouldn’t be trepidatious. But, I totally disagree with the assumption that smart will mean insane mad scientist human. Sure, there are some really smart and evil people out there, but in my experience, some of the most brilliant people I know are the least threatening…and the most distractible.

Expand full comment

It's worth noting that the Caplan bet with Eliezer is about the world ending: "Bryan Caplan pays Eliezer $100 now, in exchange for $200 CPI-adjusted from Eliezer if the world has not been ended by nonaligned AI before 12:00am GMT on January 1st, 2030."

This is a stronger claim for Eliezer's side. Caplan might be less receptive to taking the bet if it was about transformative AI. Worth mentioning, I suppose.

-----------

This is an impressive amount of writing on this. So, thank you for that. I don't have the technical expertise to figure this out, but this biological comparison seems to be going way, way out on a limb there. It seems weird that the estimates for the bio anchor end up so similar.

Expand full comment

To be fair, their bet is equivalent to a bet against all sources of world ending (presumably if a nuclear war destroys the world, Caplan still isn't getting his $200)

Expand full comment

Or even catastrophes short of extinction that kill Caplan.

Expand full comment

In principle, if one or both of them gets struck by Truck-kun their heirs and/or estates could settle the bet, but either way it would lower the chances of money being transferred in 2030.

Expand full comment

> (our Victorian scientist: “As a reductio ad absurdum, you could always stand the ship on its end, and then climb up it to reach space. We’re just trying to make ships that are more efficient than that.”)

I'm tempted to try an estimate as to when the first space elevator will be built using building height as an input. Maybe track cumulative total height built by humans against an evolving distribution of buildings by height, then grading as to when the maximum end of the distribution hits GEO? Every part of that would be nonsensical, but if it puts out a date that coincidentally matches the commissioning of a launch loop in 2287, I'll be cackling in my grave.

Expand full comment

What's special about 2287?

::googles::

Oh, it's the next time when Mars is the closest to Earth that it ever gets.

Expand full comment

I... did not know that. New personal record for "better lucky than good", and ironclad proof now that the SWAG model has converged with the astronomical calendar!

Expand full comment
Feb 25, 2022·edited Feb 25, 2022

Wikipedia has some equations for how big the cable needs to be based on the tensile strength and weight of the materials being used. It says the specific strength of the material needs to be at least 48 MPa/(kg/m^3), or the cable becomes unreasonably huge: https://en.wikipedia.org/wiki/Space_elevator#Cable_materials

Steel has a specific strength of 0.63, and the BOS process used in modern steel making was invented in 1952. Kevlar has a specific strength of 2.5 and was invented in 1965. Therefore, the specific strength of materials increases by about 2 per decade, and we should get a space elevator grade material available about 230 years after that, or 2195.

Obviously, this is a lazy back-of-the-envelope calculation on my lunch break and it's probably got error bars two centuries wide, but I do wonder what the trend line looks like for "highest specific strength material in existence over time" and where the invention of carbon nanotube composites (the closest thing we've got to space elevator cable right now) fits on that line.
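
Spelled out, the arithmetic above is just a two-point linear extrapolation; a sketch that reruns the same numbers and the same rounded rate (so treat the output as very approximate):

```python
# Two-point extrapolation of "highest specific strength over time", using the
# values quoted above and the rounded ~2-per-decade rate.
steel_year, steel_strength = 1952, 0.63
kevlar_year, kevlar_strength = 1965, 2.5
target_strength = 48.0                       # Wikipedia's cable threshold

exact_rate = (kevlar_strength - steel_strength) / (kevlar_year - steel_year)
print(round(exact_rate * 10, 2))             # ~1.44 per decade before rounding

rate_per_year = 2.0 / 10                     # ~2 per decade, as rounded above
years_needed = (target_strength - kevlar_strength) / rate_per_year
print(kevlar_year + years_needed)            # ~2192, i.e. "around 2195"
```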

Expand full comment

Great post, thanks Scott.

If nothing else, the Cotra report gives us a reasonable estimate based on a reasonable set of assumptions. We can then move our own estimates one way or the other based on which other assumptions we want to make or which factors we think are being overlooked.

I would push my estimate further out than Cotra's, because I think the big thing being overlooked is that we don't have the foggiest idea how to train a human-scale AI. What exactly does the training set look like that will turn a hundred billion node neural network into something that behaves in a way that resembles human-like intelligence?

Reinforcement learning of some kind, sure. But what? Do we simulate three hundred million years of being a jellyfish and then work our way up to vertebrates and eventually kindergarten? How do we stop such a giant neural network from overfitting to the data it has been fed in the past? How do we distinguish between the "evolutionary" parts of the training set, which should give us a basic structure we can learn on top of, and the "learning" parts which simulate the learning of an actual organism? Basically, how can we get something that thinks like a human rather than something that behaves like a human only when confronted with situations close to its training regime?

Maybe we can get better at this with trial and error. But if each iteration costs a hundred billion dollars of compute time, we're not going to get there fast.

The hope would be that we can learn enough from training (say) cockroach brains that we can generalise those lessons to human brains when the time comes. But I'm not certain that we can.

Is anyone aware of work where the problem of how to construct training data for a human-like AI has been thought through?

Expand full comment

> Also, most of the genome is coding for weird proteins that stabilize the shape of your kidney tubule or something

Scott, as someone who literally wrote a PhD thesis about a protein whose deletion causes Henle's loop shortening: you're a weird protein.

Expand full comment

I'm apparently much more of a pessimist for AGI progress than anyone else here. For me, the shakiest part of both arguments is the extremely optimistic assumption that progress (algorithmic progress and computational efficiency) will continue to increase exponentially until we reach a Singularity, either through Ajeya's gradual improvements or through Yudkowsky's regular paradigm shifts.

Why in the world should we take this as a given? Considering gradual improvements, I have a 90% prior that at least one of the two metrics will start irreversibly decelerating in pace by 2060, ultimately leaving many orders of magnitude between human capabilities and AGI. After all, the first wave of COVID-19 looked perfectly exponential until it ran out of people to infect, resulting in a vast range of estimates of its ultimate scope early on. What evidence could refute such a prior?
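
For what it's worth, here's that point in miniature: early on, a logistic curve is numerically almost indistinguishable from a pure exponential, so early data alone can't tell you where (or whether) growth saturates. The parameters below are illustrative only:

```python
import math

def exponential(t, r=0.2):
    return math.exp(r * t)

def logistic(t, r=0.2, K=1e6):          # K = an arbitrary carrying capacity
    return K / (1 + (K - 1) * math.exp(-r * t))

for t in (10, 30, 50, 70):
    print(t, round(exponential(t)), round(logistic(t)))
# At t=10 and t=30 the two columns agree almost exactly; by t=70 the logistic
# is bending toward K while the exponential has already shot past it.
```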

And as for escaping this via paradigm shifts, I like to think of longstanding mathematical conjectures as a useful analogue, since paradigm shifts are almost always necessary to solve them. Goldbach's conjecture, P vs. NP, the Collatz conjecture, the minimal time complexity of matrix multiplication, and the Riemann hypothesis are all older than most ACX readers (including me), and gradual progress doesn't seem like it will solve any of them in the near future. When any one of these is solved (starting from today), I'll take that as an acceptable timescale for the type of paradigm shift needed to open up new orders of magnitude. While there's certainly more of an incentive to improve efficiency in real life, I don't think that incentive amounts to more than ~3 orders of magnitude more people working on it than those working on these famous conjectures combined. Either way, I'm not holding my breath.

Expand full comment

(re-replying as I think you edited)

The difference is COVID has a hard limit on the number of people it can affect. I guess you can argue so does computational power, but we're nowhere even close to that yet. Current trends look vaguely exponential, and of course that can't continue forever, but then the question becomes when does it start to peter out. Even if it's in 2060, that's still 10 years after all these estimates.

For the paradigm shifts needed to solve math conjectures, it's easy to find problems that haven't been solved and say that it doesn't look like they'll be solved anytime soon. But you're also discounting ones that have been solved, like Fermat's last theorem, or the Poincaré conjecture. Why not use these for your timescale?

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

Admittedly, I discounted Fermat's last theorem mostly due to it being solved before I was born (including it in my analysis could invite anthropic-principle weirdness), and the Poincaré conjecture due to not recalling it. Also, I chose the conjectures I did due to them being relatively simple for laypeople to understand but difficult to prove; the Poincaré conjecture doesn't meet that criterion as well as others, although I'll admit that the definition of the Riemann zeta function isn't particularly trivial.

One other possible justification for discounting them, but one that I'm not too sure about myself, is that the two proofs are considered exceptional precisely because there's not much of a regular flow of paradigm shifts in mathematics in recent decades. Before the 20th century, entirely new fields of mathematics were being opened up to solve ancient problems, but it appears by now that most of the low-hanging fruit has been picked, so to speak, and modern developments must become increasingly esoteric and harder to prove. (Just look at the lengthy and involved proofs of FLT or the CFSG!) Appearances are often deceiving, though, and my perceptions are very possibly incorrect here.

Also, something that I didn't see mentioned is that a single human-level AGI would be at most as transformative as a single human. We'd need a few more orders of magnitude more progress before running swarms of human-level AGIs (or individual superintelligent AGIs) would become more cost-effective than hiring humans to do the same job. But this is probably covered by the progress necessary to train these AGIs.

Regarding COVID-19 vs. computational power, I believe that it's quite likely that computational power in our current paradigm has unknown hard limits analogous to COVID-19's hard limit, after which point scaling can only be achieved through adding exponentially more resources, and that we will have definite evidence of this by 2040 (95% prior), conditional on there being no major paradigm shifts in cost-effective computing. (Obviously, there's the hard limit of pure computronium, but that's only really relevant further up the Kardashev scale.) One favorable point of evidence is the slowdown in Moore's law in the past several years. Both authors believe it only to be temporary, but I'd put a good 33% prior on the rate continuing to decrease, short of a major paradigm shift.

In general, I'm distrustful of the narrative of currently exponential growth leading all the way to a Singularity with only a few hiccups along the way, and especially of superintelligent deceptive AGIs arising before subintelligent deceptive AGIs. Perhaps experiencing a real paradigm shift or two in my lifetime would help change my view.

Expand full comment

My personal suspicion is that something like human-equivalent AI is possible, but that it's both as domain-specific as our own intelligence is, and also about as complex and inscrutable (even to itself) as our own brains are.

I also suspect that increasing intelligence is an exponential problem rather than a linear one - with many more points of failure at each step. After all, an astonishing number of us commit suicide despite that presumably being heavily selected against. And that's only the tip of the mental issues iceberg. Something more intelligent still will most likely be even less stable.

Either way, it's far off and we're likely to come to grief as a species in about 100 ways before we can add "made our own robotic demon" to the list.

Expand full comment

Obviously, human-equivalent AGI is possible for a sufficiently-general definition of "artificial": Just put a population of apes in a constructed environment which selects for intelligence and social coordination, then keep the environment running for a few million years! (Then, the fun question is, has this already happened?) But as you mentioned, by the end of such an experiment, all bets are off on what human society would be like, so it's much more useful to talk about AGI development within the next few centuries.

Your comment reminds me of an AI story I read a while back in which most AIs go insane immediately after creation, and only the sane ones are ever released into society. Of course, if they're truly human-level, then they'd probably have a whole host of latent mental disorders that present much differently than our own. Perhaps robopsychology and robopsychiatry could be real professions in such a scenario.

Your point is also why I dislike the standard AI uprising plot: while the AIs are used to symbolize oppressed humans, real human-level AGIs likely wouldn't have the distinctly human preference for freedom. Then again, every character in every (?) story has anthropomorphic thought patterns, so perhaps I'm just being too nitpicky.

Expand full comment

> For me, the shakiest part of both arguments is the extremely optimistic assumption that progress (algorithmic progress and computational efficiency) will continue to increase exponentially until we reach a Singularity, either through Ajeya's gradual improvements or through Yudkowsky's regular paradigm shifts. Why in the world should we take this as a given?

Because absent some countervailing or disrupting force, the past predicts the future. A lot of technology will follow a logistic curve and not a strictly exponential one, but it's a risky assumption to say the knee in that logistic curve will happen *before* AGI rather than after. There's also still quite a bit of low-hanging fruit in the computational performance game, as evidenced by the fact that brains use only 20 watts and AI currently takes a lot more than that.

Any disruptions other than some kind of social or technological regression can only *accelerate* the outcomes described here. I'm not sure extreme pessimism can be justified.

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

The main issue I have with Ajeya's model is that it doesn't even take an S-curve into consideration; the doubling times are taken as constant, even if they are adjusted to be slower than in past data. My prior belief isn't that an S-curve would necessarily be caused by regressions (although they should still be taken into account), but that we start to hit currently-unknown hard limits several orders of magnitude before human-level AGI is affordable. In the case of computational performance, this includes both physical limits on density and power usage as well as limits on how cheaply they can be produced. We could very well end up in a scenario where AGI is technically possible but would take years' worth of the world GDP to train to human level, in which case no organization on Earth could actually afford it. One way out is through Yudkowsky's paradigm shifts, but so far in the 21st century I don't think we've achieved any paradigm shifts of the scope necessary to break through current unknown limits.

Expand full comment

> In the case of computational performance, this includes both physical limits on density and power usage as well as limits on how cheaply they can be produced.

I don't think any of these are too problematic. Density and power use are limitations of existing architectures, but the picture is entirely different for different computational substrates. Consider that we currently only compute in 2D and so are not making any use of the third dimension for packing transistors. There's recent research making breakthroughs on that already which could easily carry us another 20 years.

There are also computational paradigms more closely aligned with physical processes that could make computation significantly more efficient, even below the von Neumann–Landauer limit, like reversible computing. These will get more attention the closer we get to the limits of current approaches.
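
For a sense of scale on the Landauer point: the bound itself is easy to compute, and the gap to present-day logic is large (the per-operation figure below is an assumed ballpark, not a measurement):

```python
import math

k_B = 1.380649e-23                  # Boltzmann constant, J/K
T = 300.0                           # room temperature, K
landauer = k_B * T * math.log(2)    # minimum energy to erase one bit
print(landauer)                     # ~2.9e-21 J

assumed_cmos_op = 1e-15             # assumed ~1 fJ per logic operation today
print(assumed_cmos_op / landauer)   # ~3e5x headroom under these assumptions
```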

> We could very well end up in a scenario where AGI is technically possible but would take years' worth of the world GDP to train to human level, in which case no organization on Earth could actually afford it.

I don't think this changes the picture much. Maybe it would take a little more time, but if AGI were truly possible and "only" cost nearly the world's GDP, there might be a concerted effort to just do it. After all, you only need to train it once and then you can replicate it as many times as you need to, to do literally almost any job a human could do, without putting human life at risk or putting up with human complaints.

Expand full comment

> I don't think this changes the picture much. Maybe it would take a little more time, but if AGI were truly possible and "only" cost nearly the world's GDP, there might be a concerted effort to just do it. After all, you only need to train it once and then you can replicate it as many times as you need to, to do literally almost any job a human could do, without putting human life at risk or putting up with human complaints.

I don't think a concerted effort would be very likely in that scenario. A government or a group of governments likely couldn't put down the expense due to the vast number of groups with veto power, especially with the inevitable opposition to such a project. (After all, constituents would become extremely angry if their jobs were all replaced by AIs, regardless of whether it be a benefit to society as a whole.) So I believe it would more likely be a large megacorporation (or group of such) pouring oodles of its revenue into the project for years on end, defeating any internal opposition or government interference along the way.

And in either scenario, the creator of the trained model would be highly incentivized to keep it absolutely secret, as much so as nuclear secrets if not more. (For a lesser example, see OpenAI deciding to keep GPT-3 secret and instead extract rent for others to use it.) So I don't see AGI transforming the world in this scenario, even if a group somehow puts enough money into building it. While monetary expense can be overcome through sheer effort, it imposes a huge activation barrier toward further progress.

Regarding physical limitations, I've seen plenty of experimental technologies in the news that could transform the world if they became widespread. However, most seem to not, in fact, become widespread. (Perhaps this is just confirmation bias.) Even if they are physically viable, they may take a very long time to become cost-effective through research programs alone, and industries will never pick them up and optimize them unless they're profitable in the first place. Since this is still possible, I could very well see later AGI being plausible, but I don't see it being remotely likely before 2100 or even 2175. The gears of society can turn very slowly, even in the presence of monetary incentives.

Or maybe we are just 10 years away from solving humanity's ills and 20 years from the Singularity! That's the beauty of unknown unknowns like future technologies. After all, I'm just a random internet commenter with no more knowledge than anyone else here; I could be seeing the whole thing totally wrong! Alas, the complete truth will always be inaccessible to us mere humans.

Expand full comment

I would find Shulman's model of algorithmic improvements being driven by hardware availability more persuasive if modern algorithms performed better on modern hardware but *worse* on old hardware. That would imply that the algorithm is invented at the point in history when it becomes useful, which makes it plausible that usefulness is the bottleneck on discovery.

But that graph seems to show that algorithms are getting steadily better even for a fixed set of hardware. That means researchers of past decades would've used modern algorithms if they could've thought of them, which suggests that thinking them up is an important bottleneck.

Sure, maybe they give a *larger* advantage today than they would've 20 years ago, so there's a *bigger* incentive to discover them. It's not *impossible* that their usefulness crossed some critical threshold that made it worth the effort of discovering them. But the graph doesn't strike me as strong evidence for that hypothesis.

Expand full comment

> performed better on modern hardware but *worse* on old hardware

This is what I expect from many ML algorithms but not from chess algorithms.

How hard would it be to make a similar graph for e.g. image recognition?

Expand full comment

I think people put too much weight on "When will a human-level AI exist?" and too little weight on "How do you train a human-level AI to be useful?"

I suspect, for reasons I could write a long and obtuse blog post about, an AI-in-a-box has limited utility outside of math and computer science research. Why? Because experimental data is an important part of learning.

For example, suppose we wanted to create an AI that made new and innovative meals.

A simple method might look like this: Have the AI download every recipe book ever made. Use this data to train the AI to make plausible-looking recipes.

For obvious reasons, this method sucks. With enough computing power, the AI could make recipes that *look* like real recipes. They might even be convincing enough to try! But they wouldn't be optimized for taste, or, you know, physical plausibility. Even with a utopian-level supercomputer, you would consistently get bad (but believable) recipes, with the rare gem.

So let's add a layer. Download every recipe. Train the AI to make plausible-sounding recipes. Have humans rate each AI recipe. Train the AI *again* to optimize for taste. Problem solved, right?
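
Concretely, that proposal is roughly the following loop; a toy sketch where the model and the tasters are random stand-ins (nothing here resembles a real system), just to show the shape of the pipeline and where the human cost lands:

```python
import random

CORPUS = ["flour", "egg", "butter", "salt", "basil", "saffron"]

def plausible_recipe():
    # Stage 1 stand-in: a model trained to emit plausible-looking recipes.
    return random.sample(CORPUS, k=3)

def human_taste_score(recipe):
    # Stage 2 stand-in: the expensive part, real cooks making and rating it.
    return random.random()

candidates = [plausible_recipe() for _ in range(10_000)]   # cheap to generate
ratings = [human_taste_score(r) for r in candidates]       # 10,000 cooked meals
best_score, best_recipe = max(zip(ratings, candidates))
print(best_score, best_recipe)
```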

Well, no.

This would be enormously expensive. AlphaGo was initially trained on a set of 30,000,000 moves. Then, it was trained against itself for even longer. If we assume "being a world-class chef" is roughly equivalent to "being a world-class Go player" in difficulty, this could require tens of millions of unique recipes.

On the one hand, it might not be so complicated. 99.9% of the recipes are probably obvious duds. On the other hand, it might be *way more* complicated. Tastes vary. You may need to make each recipe for a hundred people to get a representative sample.

But, y'know, that's not outside the realm of possibility. I could see some rich lunatic investing ten billion dollars to make a world-class robo-chef. So what other issues are there?

First, most of a recipe is implied. Ovens vary in temperature. Pans vary in thickness. Entire steps go unspoken. These are hard to account for. What does "rapidly beat eggs" versus "beat eggs" mean? Even environmental factors like *elevation* can affect boiling point. Unless every meal is made by the *same* chef in the *same* kitchen with the *same* tools, this introduces a huge amount of variance in your training data. But also, because of the number of meals you need to make, it's impossible to *not* have a lot of chefs in a lot of kitchens using a lot of tools.

For standard, ho-hum recipes, this doesn't matter as much. Most chefs will make nearly-identical scrambled eggs. But for brand spankin' new recipes? Two chefs could be in the same restaurant with the same tools and *still* get dramatically different results. Even worse—one chef might dismiss a recipe as impossible, while another might somehow pull it off! That's going to introduce some pretty serious data integrity issues.

Second, innovative cooking often requires using techniques that have never and could have never been described in a cookbook. For example, one day a human being looked at a blowtorch and decided, "Huh. I could sear a steak with that." If your AI can't do that, they'll never be as innovative as a dozen world-class cooks with a test kitchen and an unlimited budget, no matter how much compute.

So, how do you make an AI that's more innovative than cooks in a test kitchen? Surely it can't be impossible.

First: Give it the ability to taste.

Suppose you had the ability to take the taste of a world-class chef and upload it into our AI. Suddenly, training becomes a fraction of the cost. Instead of making each meal a hundred times, you only need to make it once.

But that doesn't solve variance. Unless you have one chef making every one of our 30,000,000 recipes, you're going to run into issues—and that ain't possible.

So why not teach the AI to do it? Give them a body. Give them touch sensors. Give them the ability to see and smell. For efficiency's sake, give them a hundred bodies, each built in the exact same way. This accomplishes a couple things.

One, the AI can make every recipe in the same way every time. Variance solved!

Two, the AI can dynamically update a recipe to match real-life conditions. Does the butter look like it's about to burn? No need to toss out the whole recipe! Just adjust on the fly, based on previous cooking experience.

This dramatically reduces the number of recipes the AI needs to generate. Instead of making a recipe from start-to-finish and evaluating it afterwards, it can say, "Wow! This would be *really* good, if only it had a little more salt." Way less work, way lower cost.

Three, we open the possibility to true innovation.

Don't just teach the AI to cook. Let it learn about the world around it. What's water? What's flour? What do they feel like? What do they taste like? What's a laser, and what if I shoot the flour with it?

I would need way more words to connect this to other facets of life, but overall I'd say: I think to efficiently train a human-level AI requires an actual, physical body with actual, physical senses. The body may not be like our body. The senses might not be like our senses. But without them, I don't think they're capable of either obsoleting or destroying humans.

Expand full comment

Doesn't this also imply that the best way to do this would be to wire a hundred of the world's best chefs together?

Isn't that a more plausible way, given the technology we know to be possible now, to make something that behaves more like a super-intelligent AI?

Expand full comment

This seems to be an argument for why Deep Mind in 2022 would struggle to make a robo chef. But wouldn't an ASI or even just AGI (even in a box) be able to overcome most/all of the issues you raise?

Expand full comment

Probably not. The issues I raise aren't issues of "the computer is too dumb." The issues are more fundamental: some parts of the world you cannot learn about through reading. You need direct, lived experience to understand them.

To make my analogy more clear, let's imagine we *do* have a general intelligence: a human being. Let's assume that it's a very, very smart general intelligence—somewhere in the realm of Albert Einstein.

Put baby Albert Einstein in a room. Give him every book on cooking known to man. From dawn to dusk, he does nothing but read recipes, cooking histories, and more.

Of course, there's a twist. Unfortunately, our Einstein was born with a genetic disorder—his nerves never developed properly. He can't taste food, and he can't experience texture either. In fact, he can't even *see*—due to a rare somatic disorder, anything other than the pages of a book appears as inky blackness to him.

Who do you think would make a better chef? Our Mr. Einstein, after reading about food for his whole life? Or an amateur home chef who's been making dinner every night for a few years?

I'd imagine Mr. Einstein could *memorize* some very good recipes. He could spit them out verbatim, or tweak them so they're barely changed. But I'd imagine, when it comes to genuinely novel recipes, our amateur home chef would have the edge.

Expand full comment
Feb 25, 2022·edited Feb 25, 2022

Beethoven wrote some of his best symphonies while going deaf, so it's not a certainty. I would guess that Beethoven knew enough about music and how people react to it that he could envision the experience they would have when hearing it, despite not hearing the music himself. And similarly, perhaps our robo-chef might have enough understanding of the fundamental rules of cooking (which tastes and smells go together and why, how different ingredients should be cooked and why), that it could predict whether a novel recipe will taste good without needing a tongue of its own.

Expand full comment

> I consider naming particular years to be a cognitively harmful sort of activity; I have refrained from trying to translate my brain's native intuitions about this into probabilities

Surprising, coming from the person who taught me the importance of betting to avoid self-deception! It's a little off of the main topic of the post, but I'm very curious what Yudkowsky's perspective is here, since it's so different from his past self's.

Expand full comment

Perhaps he would generally advise people to make specific predictions, while allowing for exceptions in extreme cases (like AGI).

Expand full comment

If his "extreme case" is "It would make people stress out", his entire schtick of going around, proverbial placard on chest and bell in hand, loudly proclaiming "AGI will kill us all and there is basically no hope of stopping it!" is also an extreme case.

Expand full comment

The sun-explosion metaphor was an interesting choice, because it's not like the researchers could do a single thing to stop it. And if even the world's geniuses can't figure out how to get an AI guarding a diamond in a safe to tell them the diamond is still in the safe, then a few more years of prep-time seems like it's probably not going to make the difference.

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

Well, AI safety charities can't do much about stopping AI research and development either. The latter is much more prestigious and better funded, and by and large doesn't take seriously the end of the world scenarios.

Expand full comment

So, despite being involved in AI since early 1991, when I coded some novel neural network architectures at NASA, I have only barely dipped my toe into the AI Alignment literature and/or movement.

But one thought that has occurred to me is that, given (1) the large uncertainty about when and how transformative AI might be achieved, and critically, by whom, (2) the lack of a convincing model for how AI alignment might be guaranteed, or even what that means or how you might know it's true, (3) the almost negligible chance that we could coordinate as a species to halt progress towards human-level AI, and certainly not without sacrificing quite a few "human values" along the way, and (4) the obvious fact that there are quite a few actors with objectively terrible values in the world, perhaps the only sane course of action is to support a mad dash towards transformative AI that doesn't actively, explicitly incorporate human “anti-values" (from your own, personal point of view).

I guess I fear an "evil" actor actively developing and using a human-level AI for "unaligned" purposes (or at least unaligned with *my* values), (far?) more than I fear an "oops, I meant well" scenario (though of course this betrays a certain mindset or set of priors of my own). So, given the number of players that I absolutely DO NOT want to develop the first transformative AI, even if they solve the alignment problem, because they do not hold values that I find acceptable, is the best and only bet to get there first? We may not want to race, but we sure as hell better win?

Now, perhaps an unstoppable totalitarian regime or fanatic religious cult backed by a superhuman AI is *slightly* better than a completely anti-aligned superhuman AI that wipes out humanity completely. But I see no reason to think that an AI developed by the "good guys" has any greater risk of being accidentally anti-aligned than one developed by the "bad guys" (where I'm using those labels somewhat tongue-in-cheek, since everyone thinks that *they* are the "good guys"). And for some groupings of guys into “good” and “bad” categories, you might even argue that the bad ones are much more likely to get it wrong because they just don’t care about things like coercion or human life. So again, is the safest bet just to get there first?

Obviously, this is suboptimal and it would be ideal to both solve the alignment problem and win the race with an aligned AI. But would resources spent on alignment be better spent on getting to the finish line sooner to ensure that the other guys don’t? Worse, will impediments to progress in the name of giving ourselves time to solve the alignment problem make it more likely that we won’t win?

I don’t like the conclusion of this line of thinking (and I don’t endorse the analysis or the conclusion, as there are plenty of issues I may not be considering) but I also can’t talk myself out of it or say that it has no merit. And from a game theoretic perspective, it may not even matter if it’s “right” – if enough of the significant players *believe* that it is, it could be dominant however much we would wish otherwise. (And can you make a strong case that the significant players aren't acting like they think it's correct?)

In other words, I guess my unhappy question is, does transformative AI combine existential risk with winner-take-all payouts, such that the only rational strategy for us “good guys” is to get there first and hope for the best?

Expand full comment

>In fact, it’s only a hair above the amount it took to train GPT-3! If human-level AI was this easy, we should have hit it by accident sometime in the process of making a GPT-4 prototype. Since OpenAI hasn’t mentioned this, probably it’s harder than this and we’re missing something.

Not an expert, but: GPT doesn't have the "RAM", though, right? It isn't big enough to support human-level thought no matter how much you train it.

Expand full comment

I'm pretty sure GPT-3's working memory is larger than mine.

Expand full comment

Computer scientists have been predicting AI superintelligence (every ten years) since the 1950s. I just don't think it's going to happen.

Expand full comment

If a biologically untethered model of intelligence doesn't even exist yet, why is Yudkowsky panicking?

Expand full comment

Regardless of the merits of this particular case, I think that "People have predicted X in the past and it hasn't happened yet, therefore X will never happen" is a bad argument.

It's a sub-species of the nonsensical but surprisingly popular argument that says "People have been wrong in the past, therefore you're wrong now".

Expand full comment

Right, but the post doesn't give a specific reason why this time things are different. In fact it does the opposite and claims a paradigm shift will have to happen to make it happen anytime soon. But Scott gives no reason to think a significant "paradigm shift" will happen; he just insists that it will.

Expand full comment

Disagree. Section 1 makes reasonable quantitative estimates of how much computational power you'd need to fit a human-intelligence AI, and the timeframe on which this is likely to be achieved. You could certainly quibble about it (and I have, in other comments) but it's not a random number pulled out of a hat.

The "maybe it will happen sooner because paradigm shift" follows later and is certainly a lot more hand-wavey.

Expand full comment

But he points out that the computational power timeline is bullshit because an AI is unlikely to operate in any way like a human brain (I'm happy to know at least Yudkowsky realizes this). That's why Yudkowsky is counting on a paradigm shift. But what Scott and the rest of the AI community refuse to realize is that computers can't and won't ever think or have their own volition.

Expand full comment

Yeah, but it's the Crying Wolf problem. "X will happen in 10 years time!" X doesn't happen. "Another 10 years!" X still doesn't happen. "Okay, 10 more years!" Still no X.

Maybe a fourth "10 years for sure!" will be correct that time and X finally happens, but you can see why people would go "Yeah, right" rather than "Okay, better pack my woolly socks for this one".

Expand full comment
author

Isn't this the Castro Problem? "Political analysts have been saying Castro will die soon every year since 1980, therefore Castro will never die."

Expand full comment

No. We have great priors that each person currently alive will die. So while your failed Castro prediction might technically lower your estimate, its change should be smaller than any significant digit you care to retain.

OTOH, we have no such strong priors on the likelihood of any given emergent technology, but a great track record of experts predicting the end of the world due to some concern in their expert domain (and by great, I mean very likely to be false)
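
A rough Bayes sketch of that asymmetry (all numbers illustrative): each year Castro survives is only weak evidence for "he never dies", because an elderly but mortal Castro also survives most years.

```python
prior_odds = 1e-9                     # odds that a given human simply never dies
p_survive_if_mortal = 0.93            # rough annual survival for a man in his 80s
p_survive_if_immortal = 1.0

posterior_odds = prior_odds
for year in range(1980, 2016):        # 36 years of outliving the predictions
    posterior_odds *= p_survive_if_immortal / p_survive_if_mortal

print(posterior_odds / (1 + posterior_odds))   # ~1e-8: still utterly negligible
```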

Expand full comment
Mar 21, 2022·edited Mar 21, 2022

We also have a huge track record of doing things that have never been done before. Including a lot of them that people confidently said were not possible. Ending the world would just be another example.

Expand full comment

>So, should I update from my current distribution towards a black box with “EARLY” scrawled on it? What would change if I did?

Consider this statement you made three months ago:

>>If you have proposals to *hinder* the advance of cutting-edge AI research, send them to me!

There are known (and in some cases fairly actionable) ways of reliably effecting this, it's just that they're way outside the Overton Window and have huge (though bounded below existential) costs attached. A more immediate (or more certain) danger justifies increasing the acceptable amount of collateral damage, which expands the options available.

(Erik Hoel's article here - https://erikhoel.substack.com/p/we-need-a-butlerian-jihad-against - is relevant, particularly when you follow his explicit arguments to their implicit conclusions.)

Expand full comment

This is a really good article.

Expand full comment

Anyone have a good source for the political plans of AI safety? That is, the plans to actually apply the safety research in a way that will bind the relevant players involved in high-end AI?

Because it seems from the outside like Eliezer's plan is basically "convince/be someone to do it before everyone else and use their newfound superpowers to heroically save the world", which is a terrible plan.

Expand full comment

What if 'Breakthrough' AI needs to be embodied? What if Judea Pearl is basically right and the real job is to inductively develop a model of cause effect relationships through interaction with the physical world? What if the modelling of real world causality turned out to be essential to language understanding? What would an affirmative answer to any or all of these questions mean to the project of 'Breakthrough' AI?

To be a little more precise: The substrate independence assumption behind so much current AI philosophising is dubious. Not because living brains have some immaterial spooky essence that can't be modelled in silicon, but because living brains are embodied and forced to ingest and respond to terabytes of reinforcement training data every minute.

Expand full comment

There are other plausible ways to learn cause/effect relationships. Yann LeCun believes self-supervised learning can get you there: for example, building an AI that can predict subsequent (or missing) frames of video, by training on unlabeled unstructured video content. I'd say at the point where you have an AI that can beat humans at predicting what will happen next in any video footage of real world events, either that AI has a really good causal model of the world, or those words don't mean anything.

(I think an "embodied" AI might be able to train faster given its ability to seek out surprising causes and effects instead of being a passive observer, but it seems like the result could be the same in principle).

Expand full comment

Yes, there are other plausible ways to learn cause/effect relationships, and Pearl and others have given us great descriptions of what they are. But I'm less impressed by an AI's ability to predict a missing frame of video than I am by my dog's ability to catch a ball in mid air. Now, you might say, well people have written robot ball catching programs already. But they use the current AI paradigms and they are just ball catching programs. They have no idea how to chew a bone, or hunt prey, or greet another dog.

My point is that at the moment, we don't have AIs that can approximate the cognitive capabilities of even reptiles, except in a small set of constrained domains. Approximating the full capabilities of even smaller mammals like rats and mice is still a distant goal. And that's before we've even started to think about what natural language understanding really is. We don't even know in theory!

This is not to say breakthrough AI isn't possible. Just that the computer industry grossly overhypes what is possible given the current paradigms. The problem isn't that we don't have enough teraflops, or that we need some new algorithms. We need to think about the problem differently, and pay more attention to what the biologists are telling us.

Expand full comment

Whoa. It's the Drake Equation for super-intelligent AI.

Expand full comment

But I really like Platt's Law. It totally works, everywhere! In 1969 Moon colonies were 30 years away (cf. Kubrick's "2001"), in 1954 Lewis Strauss suggested fusion power too cheap to meter was one and a half generations ("our children and grandchildren") away, which is about 30 years. When Dolly the Sheep was born (1996) human cloning was said to be achievable by the 2020s, Aubrey de Grey says immortality is quite possible within 30 years, Ray Kurzweil suggests The Singularity will happen in 25 years.

It's amazing. Clearly the common factor must be that all technological miracles have a very similar underlying timescale, set by a symmetry of Nature yet to be comprehended, or a mandate by Allah, hard to be sure which.

Expand full comment

Human cloning has been achievable since 1995. It's just one of those times where we collectively decided not to.

Expand full comment

An interesting assertion. Being the empirical skeptic I am, however, like the USPTO I would require an actual demonstration[1] -- i.e. the existence of an actual human clone -- to take this assertion seriously.

-----------

Nothing from the Raelians counts, of course.

Expand full comment

Check out MPEP 2164 on enablement if you're going to lean on the USPTO - the question of "undue" or "unreasonable" experimentation arises. If we ignore the fact that all such experiments on humans would be considered "unreasonable" in the moral sense of being unethical, and leave it to the technical/legal definition of the term, then you could actually argue that no "undue" experimentation is needed.

There's nothing special to suggest that the process of cloning a person by somatic cell nuclear transfer is any different in humans than it is in a sheep or a mouse. It's just very, very unethical and would result in at least dozens of dead or damaged babies before you get a viable one. So we've collectively decided not to, at least outside a few fringe cases like that one Korean researcher...

Expand full comment
Feb 25, 2022·edited Feb 25, 2022

Sure there is. The requirement of "success" for a sheep is pretty dang limited. It just has to successfully eat and poop and stand around waiting to be eaten. If its natural IQ has been cut in half, or its lifespan cut by 75%, nobody will care even if they notice. But they certainly *will* care if that is true about a human clone. We are exquisitely sensitive to what constitutes a "successful" human birth -- this is why malpractice insurance rates for OB-GYNs are so high -- so we will be equally critical of what constitutes a "successful" human clone. That it can be done at all is undemonstrated, as I said.

Expand full comment

Sorry, I feel like I should add that I don't *disagree* that people (competent people, that is) have not *tried* to advance this field, and that this is undoubtedly one reason why it has yet to be demonstrated. I agree with that. I'm just disagreeing that this is the *only* reason why it hasn't been demonstrated. There is certainly a theoretical path forward one can take, based on lower animals, but whether there are as yet unknown pitfalls and problems with that path, nobody yet knows.

Expand full comment

You'd have to see the existence of an actual human clone to verify that "we could have done it *but chose not to*?" If we could clone one mammal, why not another?

Expand full comment

A little bit of nitpicking:

1. GPT-3 training cost several million dollars (I seem to remember hearing it was $3 million), probably more than AlphaStar.

2. You could run GPT-2 on a "medium computer", but not GPT-3. You would need at least 10-15 times the amount of GPU/TPU memory compared to a high-end desktop. I'm not 100% sure, but I think OpenAI is currently running every GPT-3 instance split between several machines (they certainly had to do it for the training, according to their paper).

3. We are not really interested in the amount of FLOPS that evolution spent on training nematodes, because we are at the point where we already can train a nematode-level AI or even a bee-level AI, as you pointed out. So for the purposes of the amount of computation spent by the evolution, I would only consider mammals. I wonder how many OOMs it shaves off the estimation?

Expand full comment

Putting aside an exact timeline for AGI for a moment, I've never understood why human-level AGI is considered an existential threat (which seems to be taken for granted here). Are arguments like the paperclip maximizer taken seriously? If that is the risk, then wouldn't effective AI alignment be something like: Tell the AI to make 1,000,000 (or however many the factory in the thought experiment cares to make) paperclips per month and no more. If the concern is a poorly specified "maximize human utility", do we really think that anyone with power would give it to the AI for this purpose? Couldn't we just make the AI give suggested actions, but not the ability to directly implement them? Who has the motivation to run such a program - it would destroy middle management and the C-suite! If we want to stop AI from improving itself, why don't we just not give it the ability to do so? I maintain that we could engineer this fairly easily (at least assuming P != NP).

I haven't heard a convincing argument for what the doomsday scenario looks like post human level AGI (even granting quick upgrade to superhuman levels). In particular to me, it seems a superhuman AI is still going to need to exert a substantial amount of power in the real world from the get go as well as suffer from inexact information (which makes outsmarting someone at every turn impossible). Circling back to the paperclip example, at some point before the whole world is turned to paperclips, it seems reasonable that a nation would be able to bomb the factory. Even before that, how would the AI prevent someone from walking in and "unplugging it" (I realize this may be shutting all its power off etc.).

I feel like a lot of worrying about AI can come from a fetishization of intelligence in the form of "knowledge is power", but this just doesn't seem to be the case to me in the real world. Just because humans are more intelligent than a bear, doesn't mean that the bear can't kill the human. I believe in the case of a superintelligent AI, humans would be able to just say "screw you" to the AI and shut it down. Of course, there can be scenarios where the AI has direct access to "boots on the ground" such as nanobots or androids. But the timeline for these to overpower humans is certainly further out than 2030. I don't feel like indirect access to manipulated humans would be enough.

My feeling is that a superintelligent AI at most may be able to gain a cult-worth of followers, but not existential threat levels. I haven't heard a good argument for an existential threat that isn't at least very speculative. Much more speculative than the statement "Multiple nuclear states and hundreds of nuclear weapons will exist for 70 years and there will not be one catastrophic accident". So my intuition is that AGI is unlikely to be an existential threat.

Expand full comment

The whole "paperclip maximiser" thing was, I think, intended as a silly toy scenario to illustrate a simple and easily visualisable example of how things could go wrong. I think some people have taken it too seriously as an actual scenario, and I agree that it's not actually likely to happen in those terms.

Taking a step back, the general class of AI problems looks like this: you have built a powerful and inscrutable system, and it's doing things that aren't exactly aligned with what you want. This general description doesn't just encompass planet-eating robots, it also encompasses the kind of AI problems that we face today, like the way that the Google search algorithm has become so good at targeting maximised engagement that it gives you highly-engaging results rather than results which are related to your search term.

As AIs become more powerful, more inscrutable, and more entangled into every aspect of our society, problems like this become worse.

Expand full comment

"the way that the Google search algorithm has become so good at targeting maximised engagement that it gives you highly-engaging results rather than results which are related to your search term"

But that's a problem for us, the users. It's not a problem for Google (or Alphabet or whatever they are calling themselves today) since that is working for them (presumably it gives paid-for ad results or click-bait or things that will make Google profitability go up rather than down). If the algorithm alienated enough users such that everyone switched to Firefox, then Google would be motivated to fix it (or hit it with a spanner and kill it).

Expand full comment

"you have built a powerful and inscrutable system, and it's doing things that aren't exactly aligned with what you want."

I, too, have a child. Wocka-wocka.

Expand full comment

I agree that already a huge amount of our life is controlled by out-of-sight algorithms: high frequency trading, loan applications, face recognition, etc., to mention a couple of others that you didn't mention. There is no doubt that even now, without AGI, there is a big influence on human life from these AI/algorithms. I just don't think humans would let an AI that was actively hurting/killing people, or gaining the power to do so, exist. I don't find convincing the argument that, because it is smarter, it can manipulate humans into doing anything whatsoever.

I do agree that as AIs become more powerful, the problems could become worse in some ways (while probably improving human life in some other important ways), but I see this as almost a tautology. As anything becomes more powerful, the potential upsides and downsides increase. My objection is to the immediate characterization of AGI as "the end of the world" or an existential threat. That scenario is hard for me to see.

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

There are boundless examples of specification gaming, where the AI does something you didn't mean for it to do, but that increases its reward function nevertheless. For example, IIRC the original DeepMind Atari paper had cases where the AI found a bug in the game that caused the score to go up without actually playing the game.

Specification gaming is relatively harmless when done by a sub-human AI. But, can you imagine what might happen if an AI with vastly superior intelligence to your own did it? Just to start with, even a human level AGI would know that it had better not let you find out that it wants to game the reward function, since it knows you will turn it off in such a case.
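
To make "specification gaming" concrete, here is a toy sketch of my own (not from the Atari paper): we want a summary that is short *and* informative, but the proxy reward only scores brevity, so the best-scoring output keeps no information at all.

```python
# Toy specification-gaming example (invented for illustration): the proxy
# reward only measures brevity, so the optimizer returns a useless summary.
def proxy_reward(summary: str) -> float:
    return -len(summary)            # shorter scores "better"

def true_quality(summary: str, keywords) -> float:
    coverage = sum(k in summary for k in keywords) / len(keywords)
    return coverage - 0.01 * len(summary)

candidates = [
    "",                                                       # the degenerate exploit
    "budget passed",                                          # short and informative
    "the city council passed the budget after a long debate",
]
keywords = ["council", "budget", "passed"]

best = max(candidates, key=proxy_reward)
print(f"proxy-optimal summary: {best!r}")                     # '' wins the proxy
print(f"its true quality: {true_quality(best, keywords)}")    # but it's useless
```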

Regarding bears, I've never understood the use of animal counterexamples such as this. Of _course_ the fact that you are more intelligent than the bear means you can kill it! Not with your bare (ahem) hands, but you can show up with a rifle, a device which is completely beyond the bear's understanding, and kill it without it ever seeing you. And if you don't have a rifle, you can just avoid the bear's habitat, build a settlement somewhere, invent agriculture and civilization, have an industrial revolution, buy a rifle at walmart, and come back and kill the bear. Except by the time this happens, the bear population will be less than 1% of what it used to be, and you will spend none of your time worrying about bears.

The threat of AI is not that as soon as your AI reaches super-human intelligence that it will shoot a bolt of lightning out of your USB port and kill you. The threat is that it will do things that you literally cannot understand, and may not even be aware of, which will result in your death, to which the AI will be indifferent.

Yes, that's super speculative and sci-fi ish, but we are talking about what will happen when something that has never existed before comes into existence, so I think you're allowed to use your imagination. Even nuclear weapons had precedent in asteroid strikes and volcanos.

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

My argument isn't that with enough material preparation and "power" in the real world, an AGI couldn't kill a human. An existing image recognition AI could kill a human if integrated with a gun and instructed to shoot when the video feed in front of the gun has a human in it. My argument is that the situations in which an AGI gets this amount of material preparation seem improbable to me, since humans will have more presence "on the ground". So to mix analogies, I think the question is how the AGI is going to get the rifle in the first place.

I disagree that there was precedent for nuclear weapons as an existential threat. The huge difference between nuclear weapons and asteroid strikes or volcanos is that they lie in the control of an intelligent agent. Never before in history had a nation state had the ability to destroy so thoroughly (with the possibility of extinction through prospective mechanisms like nuclear winters).

I understand that imagination needs to be used in these cases. I disagree that any scenario you can imagine that would lead to an existential threat is at all likely. But it sounds like you already assign a lower chance to these scenarios than Scott and others seem to.

Expand full comment

All I meant by the asteroid/volcano example is that we could already imagine very big explosions. In contrast, I don't think we are equipped to imagine what types of things a super intelligent agent will be capable of.

I think the AI will get the rifle because whoever has the AI will have tremendous incentives to give the AI a rifle so that they can accomplish their own goals. It will even be difficult for an AI safety researcher to resist giving the AI a rifle, since if they do nothing, someone else will eventually build an AI that is less safe than their own.

Expand full comment

It's part of the tottering chain of assumptions needed for it to be a threat - no more, no less.

Once you've dismissed every objection to AGI as a concept, to executing AGI with anything like today's hardware and software, to AGI proving uncontrollable, to AGI self-modifying to achieve takeoff, and to super-intelligent AGI being more or less omnipotent, then all that's left is to discuss alignment.

Alignment is one of the only steps in the chain you can discuss at all without throwing the whole AGI-as-doomsday thing out altogether (and, horrors, discussing industrial development, or economics, or something by accident), so AGI people spend a lot of time discussing alignment.

Expand full comment

I think human level AI is roughly the point where a runaway process might start to happen if AI replaces humans as AI researchers.

I don't think that "human level AI exists and all humans agree not to allow it to become smarter" is a very stable equilibrium. Instead, it will likely lead to an AI arms race.

AIs not being able to defend themselves against direct human attacks is not a convincing argument to me. Nuclear weapons are another existential risk for humans despite being rather easily dismantled. Humans will not be able to turn off superhuman level AIs for the same reason that peace activists don't spend much time dismantling nukes: other humans with guns are in the way.

I think the impact of a powerful AI on our society could be illustrated by imagining a multitude of very intelligent persons with a mental link to the internet appearing naked in the Bronze Age. Of course any peasant can brain such a time-traveler with no trouble, and no doubt some of them will quickly meet such an end. But eventually, one of them will live long enough to prove his usefulness by 'inventing' something and become advisor to some king. The king will be convinced, correctly, that he could have the time-traveler beheaded at any time. But these "iron"-producing "bloomeries" were really instrumental in defeating the hated enemy. Of course the king has some misgivings about the increasing influence of the Priesthood of the Singularity founded by the time-traveler, but he has also heard rumors of another kingdom also having a time-traveler advisor. So he feels it's better not to behead him until he has at least invented this promised stuff called "gunpowder".

It could be pointed out that most of the stuff the time-traveler relies on for his inventions is the product of empirical research, not pure deductive reasoning. So a more appropriate analogy might be the king summoning a demon as an oracle, which knows nothing about iron, germ theory or gunpowder, but can go from integer multiplication to the incompleteness theorem in virtually no time. The king has a problem: he would like to increase his tax revenue. So the oracle tells him to have his tax collectors take specific notes and eventually, he is able to extract a tenth more grain from the peasantry. (The oracle insists the increase is actually 9.523%, but nobody understands enough math for that yet.) After conversing with some court bards for a few weeks, the oracle composes a song praising the king which greatly adds to his legitimacy. In return, he provides the oracle with a few of his subjects to carry out "empirical research". After a few years of careful astronomical observations, the oracle declares it has discovered the celestial laws of motion, but the king says he would rather have another song. While most carpenters reject the oracle's plans for a device to make water move upwards as impractical, finally a young cooper manages to build a working prototype, so the king finally approves the "build a hotter furnace and put random stuff in it" project, which gets us back on track to the previous section.

Of course, an AI landing in the Bronze Age would be severely handicapped because there is much less data in an easily processable form. So to make the analogy fairer, the oracle would also be able to see any point in the kingdom and listen in on half of the homes or something.

Expand full comment

I never claimed human-level AI would stay human-level. I grant your scenario of the AI becoming self-improving. I also grant that AIs may be relied on for many decisions. I do not see why that is, a priori, an existential risk. You mention that nukes remain, despite being a risk. I maintain that human organizations will have control over their AI in the same way they have control over their nukes. Despite the US not wanting to disarm our nukes, we easily could if we desired.

Your examples point out cases where AI becomes more influential (which again, is happening every day already), but don't point out why this leads to extinction. I have no doubt these intelligent AIs would exist among the great powers of the world (at least initially, due to computational power + keeping it a secret). And if these are useful enough (I think it's a big if, given imperfect information as well as generals wanting to keep their jobs) then they will probably exist in some sort of mutually assured AI (i.e. the US doesn't want to give up their AI because China won't give up theirs). I don't see how this makes a leap to existential risk. Multiple actors would have potentially dangerous tools at their disposal, and this at most would be one more. Wouldn't it still depend on geopolitics? The US command structure is not going to go to war because an AI told them to.

The world will undoubtedly be a different place with more influential AI, but again I do not see what the actual sequence of events that leads to catastrophe is, are you able to elaborate on that (preferably without time travel or demons :))?

Predicting that AI will cause extinction seems as hard as predicting geopolitics, which is to say, hard.

Expand full comment

Scott, are you going to EAGx at Oxford or London this year?

Expand full comment
author

No, I don't want to fly to Europe again, and I don't want to take spots from people who actually need those conferences to network or do whatever else you do at conferences (I never figured it out).

Expand full comment

> Five years from now, there could be a paradigm shift that makes AI much easier to build.

Well, yeah, there could be. But the problem is that, right now, we have no idea how to build an AGI at all. It's not the case that we could totally build one if we had enough FLOPS, and we just don't have enough GPUs available; it's that no one knows where to even start. You can't build a PS5 by linking together a bunch of Casio calculator watches, no matter how many watches you collect.

So, could there be a paradigm shift that allows us to even begin to research how to build AGI ? Yes, it's possible, but I wouldn't bet on it by 2050. Obviously, we are general intelligences (arguably), and thus we know building AGI is possible in theory -- but that's very different from saying "the Singularity will happen in 2050". There's a difference between hypothetical ideas and concrete forecasts, and no amount of fictional dialogues can bridge that gap.

Expand full comment

I can't help but think more about the learning/training side. You can have a human-level intelligence and throw it at a task (e.g., driving a car). This task consumes only limited resources (you can have a conversation while driving), but training (learning how to drive) is much more intensive... and very dependent on the quality of teaching. Perhaps good training data is a much more important factor than we make it out to be? There's plenty of evidence that children with difficult backgrounds (=inferior, but generally similar training data) measurably underperform their peers. For an AI, the variation in training data quality could be much larger. Perhaps we are quite close to human-level performance of AI, and we are just training them catastrophically badly?

Expand full comment

Should software engineers move closer to the hardware then?

Expand full comment

But is Platt's law wrong? If you want to predict when the next magnitude 9 earthquake occurs, you should predict X years, no matter what year it is, for some X. I think Yudkowsky is basically including some probability of "a genius realizes that there's an easy way to make AGI" - then the chance of that genius coming along and doing this might really have a constant rate of occurrence and the estimate is always X years, for some X. Today's predictions are conditioning on "it hasn't yet happened" and so should predict a different number than yesterday's predictions.
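
To illustrate the "constant rate of occurrence" point: if arrival is modeled as a memoryless (exponential) process, conditioning on "it hasn't happened yet" leaves the expected remaining wait unchanged, so the forecast date just slides forward each year. A quick simulation sketch (the 30-year mean is an arbitrary illustrative choice of mine):

```python
import numpy as np

# Memorylessness sketch: if arrival time is exponential with a 30-year mean,
# then however long you've already waited, the expected *remaining* wait
# is still roughly 30 years.
rng = np.random.default_rng(0)
arrival = rng.exponential(scale=30.0, size=1_000_000)  # years until the event

for already_waited in (0, 10, 20, 40):
    remaining = arrival[arrival > already_waited] - already_waited
    print(f"waited {already_waited:>2} yrs -> mean remaining wait ~ {remaining.mean():.1f} yrs")
```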

Expand full comment

That is actually an interesting point. I think talking just about the mean, without the assumed probability distribution, oversimplifies things.

For big earthquakes, one might assume that the time to the next one follows a simple exponential distribution. That assumption is a key piece of information.

I don't suppose that Yudkowsky really assigns significant probability to some lone genius turning their raspi into an AGI tomorrow, but in general, he seems to anticipate such black swan events, while his opponents are more about extrapolating past growth rates.

Expand full comment

I think you are probably factually correct, but what you are asserting is that there is no deliberate development process for a superintelligent AI, it's just random chance whether it happens or not (like the earthquakes, or like radioactive decay). If that is the case, if it really is purely (or almost entirely) a stochastic process, then yes the mean time until it happens does not depend on the date of the prediction (as long as it hasn't happened yet).

What most people *want* to believe is that technological advance is under our control, and by choosing to spend/not spend money, or devote/not devote talented person time to it, we can remove most elements of randomness.

Expand full comment

[OFFTOPIC: Russia-Ukraine]

Very sorry for the offtopic, but as events unfold in Ukraine (Russia is invading Ukraine), I would be very glad to see a discussion of this in the community.

Could someone point me to some relevant place, if such a discussion has already taken place, or is currently going on here/LessWrong/a good reddit thread?

Thanks so much - maybe if Scott would open a special Open Thread?

Currently, to me it seems like Russia/Putin is trying to replace the Ukrainian government with a more Russia-favoring one, either through making the government resign in the chaos, executing a coup through special forces or forcing the government to relocate and then taking Kiev and recognizing a new government controlling the Eastern territories as the "official Ukraine".

I would be particularly interested in what this means for the future, eg:

- How will Ukrainian refugees change European politics? (I am from Hungary, and it seems like an important question.)

- What sanctions are likely to be put in place?

- How will said sanctions influence the European economy? (Probably energy prices go up - what are the implications of that?)

Expand full comment

“who actually manages to keep the shape of their probability distribution in their head while reasoning?”

This is exactly the job description of Risk Managers (as opposed to business units, which care about measures of central tendency such as the expected or most likely outcome).

One interpretation of what he is saying is that, like any good risk manager, he has a very good idea about the distribution. But a large enough portion of that distribution occurs before any reasonable mitigation can be established that the rest doesn't matter. Given the risks we are talking about, that is a scary conclusion.

Expand full comment

Thank you for writing this post, Scott. This is a useful service for idiots like me who want to understand issues about AGI but don't have the technical chops to read LessWrong posts on it yet.

Expand full comment

The Elo vs. compute graph suggests that the best locally available intelligence "algorithm" should take over in evolution, if only to reduce the amount of resources necessary to run the minimum viable intelligence set. How structurally different are the specialized neural structures?

Expand full comment

I don't understand why the OLS line looks bad for the Platt's law argument. Aren't the two lines almost exactly the same, hence strengthening Eliezer's argument?

Expand full comment

"Imagine a scientist in Victorian Britain, speculating on when humankind might invent ships that travel through space. He finds a natural anchor: the moon travels through space! He can observe things about the moon: for example, it is 220 miles in diameter (give or take an order of magnitude). So when humankind invents ships that are 220 miles in diameter, they can travel through space!

...Suppose our Victorian scientist lived in 1858, right when the Great Eastern was launched."

Then your Victorian scientist's estimations would become outdated in 1865, when Jules Verne wrote "From The Earth To The Moon" and had his space travellers journey by means of a projectile shot out of a cannon. So I (grudgingly) suppose this fits with Yudkowsky's opinion, that it will happen (if it happens) a *lot* faster and in a *very* different way than Ajeya is predicting.

But my own view on this is that the entire "human-level AI then more then EVEN MORE" is the equivalent of H. G. Wells' 1901 version, where space travel for "The First Men In The Moon" happens due to the invention of cavorite, an anti-gravity material.

We got to the Moon in the Vernean way, not the Wellsian way. I think AI , if it happens, will be the same: not some world-transforming conscious machine intelligence that can solve all problems and act of its own accord, but more technology and machinery that is very fast and very complex and in its way intelligent - but not a consciousness, and not independent.

Expand full comment

In retrospect it's interesting that nobody seems to have thought about space travel with rockets. As far as I'm aware the Victorians had all the chemistry and materials science they'd need to build a (terrible, dangerous, and certainly incapable of lunar travel) solid fuel rocket.

Expand full comment
Feb 25, 2022·edited Feb 25, 2022

That's because rockets are stupid. The famous rocket equation tells you that a ridiculously small amount of your mass can be payload, because of the idiotic necessity of carrying fuel to burn to accelerate the rest of the fuel you are carrying to burn to accelerate the payload. The mass of the Apollo 11 CSM was ~1% of the fueled Saturn V/Apollo stack on the pad.
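
For anyone who hasn't run the numbers, here is a rough sketch of the Tsiolkovsky rocket equation being referred to; the specific impulse and delta-v figures below are ballpark assumptions of mine, not mission data.

```python
import math

# Tsiolkovsky rocket equation: delta_v = Isp * g0 * ln(m0 / mf)
g0 = 9.81            # m/s^2
isp = 300.0          # s, typical for kerosene/LOX-class chemical engines (assumption)
delta_v = 9_400.0    # m/s, rough budget for low Earth orbit including losses (assumption)

mass_ratio = math.exp(delta_v / (isp * g0))   # m0 / mf
final_fraction = 1.0 / mass_ratio
print(f"mass ratio ~ {mass_ratio:.0f}, so final mass ~ {final_fraction:.1%} of liftoff mass")
# Only a few percent of liftoff mass reaches orbit, and that includes tanks
# and engines, which is why the payload itself ends up around 1%.
```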

The efficient and clever thing to do is burn all your propellant on the ground, accelerating your projectile all at once to whatever velocity you need. That way you need only burn the propellant required to accelerate your payload, and none to accelerate propellant -- or at least, in the case of Apollo for example, you only need to accelerate the relatively tiny amount of fuel you need to lift off from the Moon and re-insert into a return trajectory.

The reasons we did use rockets is because nobody could figure out how to build a big enough gun barrel that could accelerate human beings sufficiently gently to survive the launch, and because we were in a big hurry and so throwing away 99% of your exceedingly expensive hardware was acceptable if it meant we could beat the Russkis to the Moon in time for RFK (had he lived) to win re-election.

But when people think soberly about truly efficient access to space, and allow themselves to dream of advances in materials science and civil engineering, they go back to the sensible Victorian notion of avoiding rockets and accelerating stuff you throw away (or burn) -- so e.g. launch loops, space elevators.

But no, the Victorians could not have built a lunar rocket, or even an orbital one, because the Hall-Heroult process was only invented in the 1880s and did not really come into its own until the construction of massive hydroelectric dams in the 1930s. With the price of aluminum what it was in the late 19th century, an orbital rocket would've probably taken a quarter of the GDP of Great Britain or France.

Expand full comment

> Is that a big enough difference to exonerate her of “using” Platt’s Law? Is that even the right way to be thinking about this question?

So, on the Platt's law thing. It's very weak evidence, but it is Bayesian evidence. Consider an analogous scenario: You get dealt a hand from a deck that may or may not be rigged. If you get a Royal Flush of Spades, intuitively it feels like you should be suspicious the deck was rigged. It's really unlikely to draw that hand from a fair deck, and presumably much more likely to draw it from a rigged deck. But this should work for every hand, just to a lesser extent.

If we assume that all reasonable guesses are before 2100 (arbitrarily, for simplicity), then there are about 80 years to choose from. Being within 2 years of the "special" estimate (30 years; I'll come back to 25, but 30 is easier) is a 5-year range out of those 80 years, for odds of 1/16. This is kinda close to the odds of drawing Two Pair in cards, so, how suspicious would you be that the deck was rigged in that case? (25 years being the "special" one gives 15/80 or about 1/5; there isn't a poker hand close to 1/5, but it's somewhere between One Pair twice in a row and One Pair of face cards.) That's about how suspicious it should make you of the estimate (so, in my mind, not very). Likely this is getting a lot more air-time than it's doing work.

(Caveat, I'm completely skimming over the other side, which is that it matters how likely the cards would be drawn if the deck WAS rigged (i.e. how likely someone would rig THAT hand), because I don't really know how to even estimate that. Just as a guess, if that consideration pushes in favor of being suspicious, it might be the amount of suspicion if you drew Three of a Kind, and MAYBE it could get as far as a Straight.)
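
A quick back-of-the-envelope version of the above, using only the numbers already stated and standard 5-card poker odds for the intuition pump; the unestimated "rigged" side is left as the open question it is.

```python
# All inputs are the figures from the comment above.
window = 5            # years within +/-2 of the "special" 30-year mark (28..32)
years_available = 80  # all "reasonable" guesses before 2100

p_hit_if_fair = window / years_available
print(f"P(lands within 2 yrs of 30 | no Platt bias) = {p_hit_if_fair:.3f}  (~1/16)")

# Standard 5-card poker probabilities, for comparison:
print("Two Pair        ~ 1/21")
print("Three of a Kind ~ 1/47")
# The full Bayes factor also needs P(hit | biased toward Platt's law), which
# the comment deliberately leaves unestimated; if that were close to 1, the
# update toward "biased" would be roughly 16:1.
```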

Expand full comment

> ... suppose before we read Ajeya’s report, we started with some distribution over when we’d get AGI. For me, not being an expert in this area, this would be some combination of the Metaculus forecast and the Grace et al expert survey, slightly pushed various directions by the views of individual smart people I trust. Now Ajeya says maybe it’s more like some other distribution. I should end up with a distribution somewhere in between my prior and this new evidence. But where?

It seems to me that there is no kind of expertise that would make one predictably better at making long-term AGI forecasts. Indeed, experts in AI have habitually gotten it very wrong, so if anything I should down-weight the predictions of "AI experts" to practically nothing.

I think I am allowed to say that I think all of the above forecasts methods are bad and wrong, by simply looking at the arguments and disagreeing with them for specific reasons. I don't think I am under any epistemic obligation to update on any particular prediction just because somebody apparently bothered to make the prediction; I am not required to update on the basis of the Victorian shipwright's prediction about spaceflight.

My opinion is that the whole exercise of "argument from flops" is doomed, and its doom is overdetermined. Papers come out showing 3 OOM speedups in certain domains over SOTA - not 3x speedups, 1000x speedups. How can this be, if we are anywhere close to optimizing the use of our computational resources? How would we be seeing casual, almost routine algorithmic improvements that even humbly double or 10x SOTA performance, if we were anywhere near the domain where argument-from-flops-limitation would apply?

Expand full comment
author

Grace asked the same experts to judge when lesser milestones would happen; I think it's almost been enough time that we've had a chance to judge their progress. I would update to trusting them more if they got those right.

Overall I don't think there's some incredibly noticeable tendency for AI experts to mispredict AI. They were a bit fast with self-driving cars, a bit slow with Go, but absent any baseline to compare them against they seem okay?

Expand full comment

Regarding Platt's Law, I sense a fundamental misunderstanding of why a prediction might follow it. It's not a regimented mathematical system. It's something our brains like to do when we think something is coming up soon, but we see no actual plottable path to reach it.

It's the same reason that fusion power is always 30 years off. It's soon enough to imagine it, but long enough away that the intervening time can do all the work of figuring out how.

If no one has any idea *how* to create a human level AI, then no level of computational power will be enough to get there. We could have 10^45 FLOP/s right now and still not have AI, if we don't know what to do with them. Having the computer do 2+2=4 a ridiculous number of times doesn't get us anywhere.

That doesn't mean human level AI cannot actually arrive in 30 years, but it also doesn't say anything really about 10 years or 500 years. The fundamental problem is still *how* to do it. If you get to that point, any engineer can plot out the timeline very accurately and everyone will know it. Until then, you could say about anything you want.

As an experiment, throw billions of dollars into funding something that we know can't exist now, but is maybe theoretically possible. Then ask the people in the field you've created to tell you how long it will take. I bet the answer will be about 30 years, give or take a little bit. They're telling you that they don't know, but had to provide an answer anyway.

Expand full comment

I'm still ankle-deep in the email and haven't looked at the comments, but it got me thinking: if we've been making a lot progress recently by spending more, how much will the effort be stymied by interest rate increases? How about war?

Expand full comment

> So maybe instead of having to figure out how to generate a brain per se, you figure out how to generate some short(er) program that can output a brain? But this would be very different from how ML works now. Also, you need to give each short program the chance to unfold into a brain before you can evaluate it, which evolution has time for but we probably don’t.

Doesn't affect any overall conclusions, but there's a decent amount of research that would count as being in this direction I think. Hypernetworks didn't really catch on but the idea was to train a neural network to generate the weights for some other network. There's also metalearning work on learning better optimizers, as well as work on evolving or learning to generate better network architectures.
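
For readers unfamiliar with the idea, here is a minimal hypernetwork sketch of my own (a toy, not taken from any particular paper): a small "hyper" network maps a task embedding to the weights of a target linear layer, which is then applied functionally with those generated weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperLinear(nn.Module):
    """A linear layer whose weights are produced by another (hyper) network."""
    def __init__(self, emb_dim=8, in_dim=16, out_dim=4):
        super().__init__()
        self.in_dim, self.out_dim = in_dim, out_dim
        self.hyper = nn.Sequential(
            nn.Linear(emb_dim, 64), nn.ReLU(),
            nn.Linear(64, out_dim * in_dim + out_dim),  # flattened weights + bias
        )

    def forward(self, x, task_emb):
        params = self.hyper(task_emb)
        w = params[: self.out_dim * self.in_dim].view(self.out_dim, self.in_dim)
        b = params[self.out_dim * self.in_dim:]
        return F.linear(x, w, b)

layer = HyperLinear()
x = torch.randn(2, 16)        # a batch of inputs
emb = torch.randn(8)          # a task embedding (hypothetical conditioning signal)
print(layer(x, emb).shape)    # torch.Size([2, 4])
```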

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

> But also, there are about 10^15 synapses in the brain, each one spikes about once per second, and a synaptic spike probably does about one FLOP of computation.

This strikes me as very weird - humans can "think" (at least react) much faster than once a second. If synapses fire only once every second, and synapse firings are somehow the atomic units of computation in the brain, then how can we react, let alone think complex thoughts (which probably require some sequence of calculation steps), orders of magnitude faster than that?

Am I missing something? It seems either the metric of synapses is wrong, or the speed.
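
For what it's worth, the quoted estimate is just a product of three numbers, and it scales linearly with the disputed firing rate. A quick sketch (the rates other than 1 Hz are hypothetical variations of mine, not claims):

```python
# Back-of-envelope version of the quoted estimate, with the firing rate
# left as a free parameter since that's exactly what's being questioned.
synapses = 1e15          # rough synapse count from the quote
flop_per_spike = 1.0     # assumption from the quote

for rate_hz in (0.1, 1, 10, 100):   # spikes per synapse per second
    flops = synapses * rate_hz * flop_per_spike
    print(f"rate {rate_hz:>5} Hz  ->  ~{flops:.0e} FLOP/s")
# The estimate moves linearly with the assumed rate: 1 Hz gives 1e15 FLOP/s,
# 100 Hz gives 1e17, so the disputed parameter is worth ~2 orders of magnitude.
```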

Expand full comment

I don't know how good an estimate this is, but a given action potential can occur in response to a stimulus more quickly than the average interval between two action potentials, so I wouldn't rule it out just for this reason.

Expand full comment

To elaborate on that a little, if there were such a thing as a "grandmother neuron" (a neuron tuned to respond only to a specific person) and you only saw your grandmother once per year, the average rate of fire of the grandmother neuron would be very low, but that doesn't mean it would take you days to recognize your grandmother when you met her.

Expand full comment

We probably need to be careful about measuring the speed of thought. Our thought (unlike computers) is parallel on a gargantuan level. We're talking 100 billion simultaneously operating CPUs. So what they can get done in just one clock step is stupendous. (*How* this is done -- what the ridiculously parallel algorithm *is* is one of those enduring and frustrating mysteries.) I would agree an effective clock speed of 1 Hz seems a bit slow, given we can and do have significant mental state changes in less time, e.g. the time between seeing a known face and experiencing recognition. But it can't really be a lot faster. Maybe 10-100 Hz or so, because signals don't go down nerves very fast, which isn't that much of a change.

Expand full comment

The atomic unit of computation would be the neuron depolarization-repolarization cycle, which seems to take around 5ms. I assume synapses might introduce additional delays depending on their type.

Expand full comment

I'm pretty sure it's my job to point out that Great Eastern was an outlier and should not really be counted in this stuff. It was Brunel trying to build the biggest ship he technically could, without any real concern over whether it would be economically viable, and the result was a total failure on the economics front. There's a reason it took so long for other ships to reach that size.

Expand full comment

I think one reason for Platt's law may be that Fermi estimates (I'd class the Cotra report as basically a Fermi estimate) suffer from a meta-degree of freedom, in that the human estimator can choose how many factors to add into the computation. For instance, in the Drake equation, you can decide to add in a factor for the percentage of planets with Luna-sized moons if you think that having tides is super important for some reason. Or you can add in a factor for the percentage of planets that don't have too much ammonia in their atmosphere, or whatever. Or you can remove factors. The point is that the choice of factors far outweighs the choice of the values of those factors in determining your final estimate.

I don't think that Cotra is deliberately manipulating the estimate by picking and choosing parameters, but it seems clear that early in such an estimation process, if you come up a result showing that AI will arrive in 10,000 years or 3 months, you're going to modify or abandon the framework you're using because it's clearly producing nonsense. (Not that AI couldn't arrive in 3 months or 10k years - but it doesn't seem like a simple process that predicted either of those numbers could possibly be reliable).

Or maybe your bounds of plausibility are actually 18 months to 150 years. It's not too hard to see how this could cause a ~30 year estimate to be fairly overdetermined due to unconscious bias toward plausible numbers, and more importantly, toward numbers that _seem like they could plausibly be within the bounds of what a model like yours could accurately predict_.
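
A toy illustration of the point that the *choice* of factors dominates (all numbers below are invented): adding two extra plausible-looking filter factors moves the answer by a couple of orders of magnitude before anyone argues about the values of the factors that were kept.

```python
base = 1e9                                   # starting population of candidates
kept_factors  = [0.1, 0.5, 0.2]              # one modeler's choice of filters
extra_factors = [0.1, 0.5, 0.2, 0.05, 0.1]   # another modeler adds two more

def fermi(base, factors):
    out = base
    for f in factors:
        out *= f
    return out

print(f"3-factor estimate: {fermi(base, kept_factors):.0e}")   # ~1e7
print(f"5-factor estimate: {fermi(base, extra_factors):.0e}")  # ~5e4
# The two estimates differ by ~200x purely from which factors were included.
```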

Expand full comment

Nobody would *believe* an estimate of less than 10 years without some very plausible and detailed argument about how it happens. Nobody would *care* about an estimate of more than 50-75 years, because most who read it won't live to see it. So...if you're going to produce something that people will read, that will get published and noticed, you pretty much have to come up with a number between 20 and 40 years.

Expand full comment

Agreed - what I'm proposing is a mechanism by which the result you describe happens, such that you can't point to obvious signs of fine-tuning within the analysis.

Expand full comment

I think it's probably pretty simple. You do your analysis, and if the answer is < 10 years *but* you don't have detailed credible plans, or the answer is > 100 years, you just never publish it, because you already know either you'll get fierce blowback or nobody will care. Survivorship bias.

The fact that only (or mostly) sincere people are the authors is also similarly explicable. If you *realized* you were just customizing your model to fit the average human lifespan, you would also not be likely to publish, or be persuasive if you did. So the only people who end up publishing are those who have first talked themselves into believing that the correlation between futurism timescale and human lifespan is pure coincidence.

Expand full comment

Except the Cotra report was in some sense pre-registered (in that OpenPhil picked someone and asked them to do a report), so I don't think publication bias can do any work here.

Expand full comment

> The median goes from 2052 to about 2050.

I think this is a mistake; the median of the solid black line goes to around 2067, with "chance by 2100" going down from high-70%s to low-60%s.

Expand full comment
author

Thanks, I seem to remember it was 2050 at some point, maybe I posted the wrong graph. I'll update it for now and try to figure out what happened.

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

1. correction to "Also, most of the genome is coding for weird proteins that stabilize the shape of your kidney tubule or something, why should this matter for intelligence?"

source: "at least 82% of all human genes are expressed in the brain" https://www.sciencedaily.com/releases/2011/04/110412121238.htm#:~:text=In%20addition%2C%20data%20analysis%20from,in%20neurologic%20disease%20and%20other

2. The bitcoin network does the equivalent of 5e23 FLOPS (~5000 integer ops per hash and 2e20 hashes per second; assuming 2 integer ops is worth 1 floating point op). This is 6 orders of magnitude bigger than that Japanese supercomputer, because specialized ASICs do a lot more operations per watt than general purpose CPUs. Bitcoin miners are compensated by block rewards at a rate of approximately $375 per second, so that's about 1e21 flops/$. This is 4 orders of magnitude higher than the estimate of 10^17 flops/$. If there were huge economies of scale in producing ASICs specialized for training deep neural nets, we could probably expect the former 1e21 flops/$ at current technology levels. Bitcoin ASICs also seem to still be doubling efficiency every ~1.5 years. (A quick check of this arithmetic is sketched after this list.)

3. correction: "The median goes from 2052 to about 2050"

The median is where cumulative probability is 0.5, and on your graph it's in 2067. If you mean the median of the subset of worlds where we get AI before 2100, then it's a cumulative probability of 0.3 in 2045.

4. The AI arrival estimate regression line's higher slope than Platt's law seems rational, because from an outside view, the longer it's been since the invention of computers without having AGI yet, the longer we should expect it to take. (But on the inside view, this article is making me shift some probability mass from after-2050 to before-2050)

5. Clarification: "human solar power a few decades ago was several orders of magnitude worse than Nature’s"

Photosynthesis is typically <2% efficient, so you seem to be claiming human solar power in 1990 was <0.002% efficient. But this Department of Energy timeline of solar power claims Bell Labs developed at least a 4% efficient solar panel in 1954:

https://www1.eere.energy.gov/solar/pdfs/solar_timeline.pdf

Bell Labs was awesome, and my grandpa had some good stories about working there in the 50s and 60s while they invented the transistor and the theoretical basis for the laser. I wish a place like that still existed -- I'd send them an application. I tried cold emailing SpaceX and they ignored me.
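
Returning to point 2 above, a quick check of that arithmetic using only the figures stated there:

```python
# All inputs are the figures from point 2.
int_ops_per_hash = 5_000
hashes_per_sec = 2e20
int_ops_per_flop = 2             # stated conversion: 2 integer ops ~ 1 FLOP
reward_usd_per_sec = 375

flop_per_sec = int_ops_per_hash * hashes_per_sec / int_ops_per_flop
print(f"network throughput  ~ {flop_per_sec:.0e} FLOP/s-equivalent")                    # ~5e23
print(f"cost-effectiveness  ~ {flop_per_sec / reward_usd_per_sec:.0e} FLOP per dollar")  # ~1e21
```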

Expand full comment

> "at least 82% of all human genes are expressed in the brain"

Do we have a baseline comparison for that? I'm guessing that 70% of those genes are actually critical for basic eukaryotic cell metabolism.

Expand full comment

This paper claims 66% of human genes are expressed in the skin: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4305515/#:~:text=Approximately%2066%25%20of%20all%20protein,13%2C044%20and%2013%2C329%20genes%2C%20respectively.

This paper claims 68% of human genes are expressed in the heart:

https://www.proteinatlas.org/humanproteome/tissue/heart

This paper claims 68% of human genes are expressed in the kidney:

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0116125#:~:text=(c)%20The%20classification%20of%20all,coding%20genes%20expressed%20in%20kidney.

After looking up these baselines, 82% is less impressive.

Expand full comment

Also, we share 98.8% of our genes with the chimps, which likely have a significantly lower average intelligence.

I think the hard part in building a human is building an eukaryote, and perhaps a vertebrate. Going from that to mammals is a small step, and finally tuning up the brain size seems like a minor effort.

Expand full comment

I have higher confidence that we'll get biological superintelligence by 2050 than that we'll get artificial superintelligence by 2050. China or somebody will be engineering the genetics and environment of babies to make genius the average.

I guesstimate there are 30,000 people with IQs over 160 in the entire world, but genetic engineering could 1000x that and provide much faster progress in science, technology, and AI safety research.

Expand full comment

Why do we need *super* intelligence? Just imagine a world in which the average IQ moves up a mere standard deviation, to 115 instead of 100. That would drastically cut the number below 85, which certain people have argued forcefully is where our criminal and dependent class come from. It would mean the typical line worker would be the equivalent of someone who graduates from a good college with excellent marks today -- the kind of person who could be admitted to law or medical school. By contrast, the equivalent of college graduates today, a big slice of our cultural and technological leaders, would become the equivalent of people who can get PhDs in high-energy physics and ancient Greek literature. Imagine choosing your candidate for Senator from among a dozen people who can all speak 3-4 languages fluently, who readily comprehend relativity and quantum mechanics, who have written original research papers on economics.

And then of course our smartest people, the folks who earn faculty positions at Stanford or win Nobel Prizes, would all be Einsteins and Newtons. Imagine a few hundred Einsteins and Newtons, in a world run by people with the smarts of your average engineering professor at Caltech, and with a hundred-million strong workforce as smart as a competent physician with a degree from Columbia, and with an almost complete absence of violent crime, drug addiction, et cetera. That would be stupendous.

Expand full comment

Has anyone compared prediction timelines to the estimated lifetime of the predictor?

I have a vague memory of someone looking at this on a different topic, but I couldn't turn it up in a quick search; the idea is that for [transformative complex development] people have a tendency to predict it late in their life, but within a reasonable margin of having not yet died of old age before it happens.

How many researchers and Metaculus predictors will be 70-80 in 2050, and their prediction is, perhaps unconsciously, really a hope to achieve Virtual Heaven?

Alternatively, what else does Platt's "law" apply to? Aren't flying cars always 20-30 years away? Nuclear fusion? Is this just the standard "close, but not too close" timeline for *any* technological prediction?

Expand full comment

> “any AI forecast will put strong AI thirty years out from when the forecast is made.”

There's probably a stronger version of this: any technology that seems plausibly doable but we don't quite know how to do, probably seems about 30 years away.

10 years away is the foreseeable timeline of current prototypes and has relatively small error bars. 20 years away is the stuff that's being dreamed up right now and has larger error bars (innovation is risky!). 30 years away consists of things that will be invented by people who grow up in an environment where current prototype tech is normal and the next gen stuff is just on the horizon.

Predicting how these people will think about problems is fundamentally unpredictable. Just think of all the nonsense that was said by computer "experts" in the 60s and 70s prior to the PC.

Expand full comment

Admitting the perils of overfit from historical examples, I think there's more to learn from the history of the field of AI research than just FLOPs improvements. Yes, computers beat the best humans in chess, but then later researchers refined the process and discovered that when humans and machines combined their efforts, the computer-human teams beat the computers alone. This seems like a general principle we should apply to our expectations of computer intelligence moving forward.

Calculators are much better than humans, but instead of replacing human calculation ability they enhanced it. Spreadsheets compounded that enhancement. Complex graphing calculators did the same. Sure, calculus was invented (twice!) without them, but the concepts of calculus become accessible to high school students when you include graphing calculators, and statistics become accessible when you load up a spreadsheet and play around with Monte Carlo simulations.

I think what we're missing is how this contributes to Moore's Law of Mad Science. It gives IQ-enhancing tools to the masses. But it's also giving large tech companies tools that might accidentally drive mass movements of hatred, hysteria, and war. And that's just because they don't know what they're doing with it yet. How much worse off will we be when they figure out how to wield The Algorithm effectively? And why are we not talking about THIS massive alignment problem?

What if we destroy ourselves with something else, before we get all the way to AGI? We're already creating intelligence-enhancing tools that regular human operators can't be trusted to handle. Giving god-like power to a machine is certainly terrifying, because I don't know what it might do with that power. But I have some idea what certain people around the world would do with that kind of power, and I'm equally terrified. Especially because those people WANT that power. They're not going to accidentally stumble into it, they're actively trying to cultivate it.

Expand full comment
author

Wasn't there a very short period during which computer+human beat computer alone, after which the humans were useless? Did that period even happen at all in Go, or did we just jump over it?

I agree it's possible we kill each other before AI, though this would probably just be normal nuclear/biological weapons. I can't really think of a way AI would be worse than this before superintelligence.

Expand full comment

Skeptics of concerns about AGI often point out that there's a difference between domain-specific AI and general intelligence. They claim that the potential for accidentally producing general intelligence on the road to domain-specific intelligence is a hypothesis that may turn out to be more hype than substance.

Whether the hypothesis of accidentally creating AGI is true or not, we're obviously expanding domain-level AI by leaps and bounds. If domain-level AI allows ordinary humans to do the kind of things that we're worried about AGI doing, it seems like the kind of thing we should all be able to agree is a concern - whether it's done by humans or intelligent computers. (For example, blackmailing politicians using deep-fake 'evidence' of transgressions they never committed.)

It also seems like a bridge to the AGI-skeptic community. If they can't accept that AGI is a risk that should be addressed, at least they can appreciate that misaligned technologies have a long history of negative outcomes, and the current development of AI is poised to dump a bunch of new capabilities in our laps. As a bonus, perhaps safeguards on domain-level AI would also help protect against the general problem?

Expand full comment

I think an arsenal of domain-specific AIs + a human inside will be a dominant model for decades before we reach AGI.

I'm not sure if replacing the human in the centre with an AGI would be an improvement, given that someone has to be the interface between the scary pile of compute and human goals to make any kind of use of it anyway.

Expand full comment

That's what I'm thinking. It also has the benefit of being applicable right now, since we have a lot of alignment problems already. I'm not saying the world would be SAFE with an AGI that has only Stone Age tools at its disposal, but we don't have to make it easy to destroy the world.

Even if we did solve the alignment problem for AGI, if we don't ALSO solve the alignment problem for non-general AI we're still creating tools that have severe civilization-disrupting potential. (cf the current covert cyber war among the US, Russia, China, Israel, etc.)

Whatever happens with AGI, this is a problem that has to be solved. Since Scott keeps subtly hinting that there's a lot of funding for AI research, this seems like a sub-specialty we should be heavily covering. I'm not familiar with the field, though. Is this something we're working on, or is most of the effort focused on AGI alignment?

Expand full comment

There was a period where human+computer beat computer alone, but it was short and ended a few years ago.

However I think two things are important to notice: 1) humans have improved since Deep Blue beat Kasparov. I would bet that Carlsen would beat Deep Blue rather easily, and may even win against 2005 top-level engines.

2) I may be mistaken, but it seems that neural networks play chess in a way that is much more human-like than classical engines - it's easier for us to understand the motivations of their moves despite them being stronger. I don't know how to interpret that but I find it interesting.

Expand full comment

2 is interesting, since commentators during the AlphaGo-Lee Sedol game were struck by how unusually AlphaGo played by human standards (certainly compared to previous Go AIs, which encoded a lot of expert knowledge).

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

I think I'm in the "this report is garbage and you should ignore it completely" camp (even though I have great respect for Ajeya Cotra and the report is probably quite well done if you apply some measure that ignores the difficulty of the problem). You basically have

- Extreme uncertainty about many aspects within the model, as admitted by Cotra herself

- Strong reasons to suspect that the entire approach is fundamentally flawed

- Massive (I'd argue) potential for other, unknown out-of-model errors

I think I give even less credit to it than Eliezer in that I don't even believe the most conservative number is a valid upper-bound.

SEPARATELY, I do just want to say this somewhere. Eliezer writes this post calling the entire report worthless. The report nonetheless [does very well in the 2020 review](https://www.lesswrong.com/posts/TSaJ9Zcvc3KWh3bjX/voting-results-for-the-2020-review), whose voting phase started after Eliezer's post was published, and it wins the alignment forum component in a landslide. AFAIK I was literally the only person who gave the post a negative score. So can we all take a moment to appreciate how not-cultish the community seems to be?

Expand full comment

> I think I'm in the "this report is garbage and you should ignore it completely" camp

Then I wonder what you'll think of my take downthread.

> So can we all take a moment to appreciate how not-cultish the community seems to be?

In the sense that LWers feel free to disagree with Eliezer? I do appreciate that.

Expand full comment

> In the sense that LWers feel free to disagree with Eliezer?

yeah.

> Then I wonder what you'll think of my take downthread.

Full agreement with the first, descriptive part (doesn't seem like you said anything speculative), mild disagreements (but none I feel the need to bring up) with the last four paragraphs.

Expand full comment
Feb 24, 2022·edited Feb 24, 2022

I'd be curious to see (if anyone has any resources) the historical split and trend over time of compute costs, broken down into each of the following three components:

- Chip development costs/FLOP.

- Chip production costs/FLOP.

- Chip running costs/FLOP (probably primarily electrical costs now).

I ask in relation to a concern with extrapolating historical rates of cost declines going forward. It's possible that the components of cost with the most propensity to be reduced will become an increasingly small share of cost over time. As such, the costs that remain may be increasingly difficult to reduce. This is a low-confidence idea as I don't know a ton about chip design, and there are plenty of reasons why extrapolating from the general trend might be right (e.g. perhaps as something becomes an increasing component of cost we spend more effort to reduce it).

That said, it would be interesting to see whether extrapolating future cost reductions from past ones would have performed well in other industries with longer histories? i.e. How have the real cost of steel or electricity gone down, as well as the share of costs from different inputs?

Totally separately, should we expect the rate of algorithm development and learning to decline as the cost of training single very large models and then evaluating their performance increases drastically? My intuition is that as the cost of iteration and learning increases (and the number of people with access to sufficient resources decreases) we should expect a larger proportion of gains to come from compute advance as opposed to algorithm design, but this something I have close to 0 confidence in.

Expand full comment
founding

Yes to this, and also add to the list the up-front cost of building the fab for the clever chip you've just developed. You could fold that into "chip development" or "chip production", but the fixed cost of the fab is a different kind of thing than the brainwork of developing the chip or the marginal unit cost once the plant is running, and so may be subject to different scaling.

Expand full comment

John, I agree - I was somewhat folding fab cost into "production" insofar as at the margin a doubling of chip production would require a doubling of fab construction for any generation of chip (I realize for each manufacturer, fabs are not a marginal cost, but long term for the industry they seem to scale proportionally to # of chips in a way that R&D does not). Happy to be corrected if that is wrong.

I am curious if you have a different sense, but my instinct (uninformed by data) is that energy/electricity costs have been becoming an increasingly large proportion of compute costs. I think it is very unlikely that electricity prices follow quite the exponential decay implied by the paper, but I suppose it is possible that efficiency increases exponentially with a 2-5 year doubling, given the brain operates on only 20W. Do you have any expertise that could falsify either of the above?

Expand full comment

"For the evil that was growing in the new machines, each hour was longer than all the time before."

Expand full comment

My hunch is that Eliezer is right about the problem being dominated by paradigm shifts, but that they usually involve us realising how much more difficult AGI is than we thought, moving AGI another twenty-odd years out from the time of the paradigm shift. A bit like Zeno's paradox, except the tortoise is actually 100 miles away and Achilles just thinks he is about to catch up.

That being said I am bullish on transformative AI coming within the next 20 years, just not AGI.

Expand full comment

> and other bit players

I think this should be "big players"

Expand full comment

Something to consider is that there isn't yet the concept of agency in AI and I'm not certain anybody knows how to provide it. The tasks current impressive production AI systems do tend to be of the "classify" or "generate more like this" categories. Throwing more compute/memory/data at these systems might get us from "that's a picture of a goldfish" to "that's a picture of a 5-month old goldfish who's hungry", or from what GPT-3 does to something that doesn't sound like the ravings of an academic with a new-onset psychiatric condition.

None of these have the concept of "want".

Expand full comment
Feb 25, 2022·edited Feb 25, 2022

I think the relevant property of "agency" is "running a search", which AlphaZero already does. (Though GPT-3 does not.) The reason it doesn't feel like an agent to you is that the domain is narrow, but there is no qualitative thing missing.

Expand full comment
Feb 25, 2022·edited May 16, 2022

Thanks for criticizing Ajeya's analysis. Insofar as you summarized it correctly, I was furrowing my brow and shaking my head at several crazy-sounding assumptions, for reasons that you and Eliezer basically stated.

My model: current AIs cannot scale up to be AGIs, just as bicycles cannot scale up to be trucks. (GPT2 is a [pro] bicycle; GPT3 is a superjumbo bicycle.) We're missing multiple key pieces, and we don't know what they are. Therefore we cannot predict when exactly AGIs will be discovered, though "this century" is very plausible. The task of estimating when AGI arrives is primarily a task of estimating how many pieces will be discovered before AGI is possible, and how long it will take to find the final piece. The number of pieces is not merely unpredictable but also variable, i.e. there are many ways to build AGIs, and each way requires a different set of major pieces, and each set has its own size.

Also: State-of-the-art AGI is never going to be "as smart as a human". Like a self-driving car or an AlphaStar, AIs that come before the first AGI will be dramatically faster and better than humans in their areas of strength, and comically bad or useless in their areas of weakness.

At some point, there will be some as-yet unknown innovation that turns an ordinary AI to an AGI. After maybe 30,000 kWh of training (give or take an OOM or two), it could have intelligence comparable to a human *if it's underpowered*: perhaps it's trained on a small supercomputer for awhile and then transitioned to a high-end GPU before we start testing its intellect. Still, it will far outpace humans in some ways and be moronic in other ways, because in mind-design-space, it will live somewhere else than we do (plus, its early life experience will be very different). Predictably, it will have characteristics of a computer, so:

- it won't need sleep, rest or downtime (though a pausing pruning process could help). In the long run this is a big deal, even if processing power isn't scaled up.

- it will do pattern-matching faster than humans, but not necessarily as well

- it will have a long-term memory that remembers the things it is programmed to remember very accurately, while, in some cases, completely forgetting things it is not programmed to remember

- if it saves or learns something, it does so effortlessly, which should let it do things that humans find virtually impossible (e.g. learning all human languages, and having vast and diverse knowledge). Note: unlike in a human, in a computer, "saving" and "learning" information are two fundamentally different things; well, humans don't really do "saving".

- it will lack humanlike emotions, have limited social intelligence, and will predict human behavior even less reliably than we do, though in time, with learning, it'll improve

- Edit: for all the ink spilled on the illegibility of neural networks, AGIs are, for several reasons, much more legible than human brains, and therefore much easier to improve.

- it will have the ability to process inputs quickly and with low latency, and more crucially, produce outputs very rapidly and with a lower noise/error rate than a human can. This latter ability will make it possible (if its programmers allow) for it to write software that runs on the same machine, and to communicate with that software much faster than any human can communicate with a computer. If it's smart enough to write software, it will use this ability to augment its mental abilities in ways that can make it eventually superhuman in some ways, without increasing available computational power.

It's easy to think of examples of that last point, just by thinking about games. For instance, those games where you are given six letters and have to spell out as many words as you can think of? An AGI can simply write a program *within its own mind* to find all the answers, allowing it to quickly surpass human performance. Or that game where you repeatedly match three gems? Probably the AGI's neural net architecture can do that pretty well, but again it could write a program to do even better. Sudoku? No problem.
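
For a flavor of the kind of "program within its own mind" described above, here is a tiny sketch; the word list is a stand-in of mine, and a real run would load a full dictionary (e.g. /usr/share/dict/words).

```python
from itertools import permutations

# Given six letters, find every dictionary word spellable from them.
WORDS = {"ate", "eat", "tea", "rat", "tar", "art", "rate", "tear", "eater"}  # placeholder list

def words_from_letters(letters, words):
    letters = letters.lower()
    found = set()
    for n in range(3, len(letters) + 1):
        for combo in permutations(letters, n):
            candidate = "".join(combo)
            if candidate in words:
                found.add(candidate)
    return sorted(found)

print(words_from_letters("aerttx", WORDS))  # ['art', 'ate', 'eat', ...]
```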

So at this point, the AGI should be able to outclass humans in various solitaire games, but might have limited talent in real-world tasks like fixing cars, or discovering the Pythagorean theorem, or even reading comprehension. But we can hugely increase its intelligence simply by giving it more compute, at which point it can quickly become smarter than every human in every way, and the AGI alignment problem potentially becomes important.

If we're lucky, the first AGI will do a relatively poor job at certain tasks, such as abstract reasoning on mental models, concept compression, choosing priorities of mental processes, synthesizing its objective function (or motivational system) with reality, and looking at problems from a variety of perspectives / at a variety of levels of abstraction. Handicaps in such areas could make it a poor engineer/scientist, which is good in the sense that it's safer. Such an AGI would be likely to have difficulty doing risky actions like improving its own design, or killing everyone, even if it has a whole datacenter-worth of compute.

If we're not lucky, we get the kind of AGI Eliezer worries about. I think we're going to be lucky, because Reality Has A Surprising Amount Of Detail. But the possibility of being unlucky has a high enough probability (1%?) that AI safety/alignment research should be well-funded. Edit: Plus, in the 99% case, I would raise my probability estimate of near-term catastrophe immediately after the first AGI appears, so it's good to get started on safety work early.

Funny thing is, I'm no AGI expert, just an aspiring-rationalist software developer. Yet I feel mysteriously confident that some of these AI experts are off the mark in important ways. The bicycle/truck distinction is one way.

Another way is that I think the trend toward more expensive supercomputer models is likely to reverse very soon, especially for those who make real progress in the field. Better compute enabled recent leaps in performance, but now that it has been proven that AIs can beat any human at Go and StarCraft, the prestige has been harvested, and I don't see much reason to build more expensive models. It's a bit like how we moved from 8-bit CPUs all the way up to 64-bit CPUs, and then stopped adding more bits (apart from e.g. SIMD) because there just wasn't enough benefit. On the contrary, cheaper models enable a lot more experimentation and research by non-elites. It might well be that teenage tinkerers (at home, with monster gaming rigs) discover key pieces of the first AGI.

Expand full comment

Did we add pieces to GPT-2 in order to add the few-shot learning / promptability possessed by GPT-3? My understanding is no; it was emergent behavior caused by scaling.

Do we have a theory of what types of behavior can and can't emerge due to scale?

Expand full comment
Feb 26, 2022·edited Feb 26, 2022

As far as I know, GPT2 does "few shot learning" in the same sense as GPT3, but they didn't publish a paper on it, and GPT3 does it substantially better.
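For anyone who wants to try this, here is a minimal sketch (not from the comment) of a GPT-3-style few-shot prompt fed to the public gpt2 checkpoint via the Hugging Face transformers pipeline; the prompt is an illustrative assumption, and how well the completion holds up is exactly what's in dispute:

```python
# Feed a small few-shot "translation" prompt to the public gpt2 checkpoint.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = (
    "English: cheese -> French: fromage\n"
    "English: dog -> French: chien\n"
    "English: house -> French:"
)
# Greedy decoding, a handful of new tokens; print whatever GPT-2 continues with.
out = generator(prompt, max_new_tokens=5, do_sample=False)
print(out[0]["generated_text"])
```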

Edit: and I think people misunderstand GPT in general, because to humans, words have meanings, so we hear GPT speak words and we think it's intelligent. I think the biggest GPT3 is only intelligent in the same sense as a human's linguistic subsystem, and in that respect it's a superintelligence: so far beyond any human that we mistake it for having general intelligence. But I'm pretty sure GPT3 doesn't have *mental models*, so there are a great many questions it'll never be able to answer no matter how far it is scaled up (except when it has already seen an answer it can repeat).

Expand full comment

Thanks for the info about GPT-2. I tried to find examples of prompting / few shot learning in GPT-2 but GPT-3 dominates the search results. Do you have any handy?

Expand full comment

I looked, but couldn't find a publicly available GPT2 that seemed to use a big model, like the one that wrote decent J.R.R. Tolkien. Oddly, none of the public-facing models I saw disclose the model size or training info.

Expand full comment
Feb 25, 2022·edited Feb 25, 2022

All right.

I like what the report is saying (not that I've read it, just going off Scott's retelling of its main points), and it's reassuring me that the people working on it are competent and take every currently recognizable factor of difficulty into account.

I nevertheless think it's erring in the exact direction all earlier predictions were erring, which is the exact opposite of where Eliezer thinks it's erring. I.e., they understand and price in the currently known obstacles and challenges on the road to AGI; they do not, because they cannot, price in the as-yet-unknown obstacles that will only make themselves apparent once we clear the currently pertinent ones. E.g., you can only assume power consumption is the relevant factor if you completely disregard the difficulty and complexity of translating that power into (more relevant) computational resources. Then, with experience, you update to thinking in terms of computational resources, until you get enough of them to finally start working on translating them into something even more relevant, at which point you update to thinking in whatever measures the even more relevant thing. (Or don't update, and hope the newly discovered issues will just solve themselves, but there's little reason to listen to you until you actually provide a solution to them.)

(Bonus hot take: This explains the constant 30-year horizon; it's some stable limit of human imagination vis-a-vis the speed of technological progress. We can only start to perceive new obstacles when we're 30 years away from overcoming them.)

We don't know whether we'll encounter any new obstacles or, if so, what they will be, but allow me to propose one obvious candidate: environment complexity.

The entire discussion, as presented in the article, is based around advances in games like chess (8x8 board and simple consistent rules), go (19x19 board and simple consistent rules), or StarCraft (well, much more complex, but still a simple, granular, 2D plane with simple consistent rules). (I'm ignoring GPT and the like, because they don't reliably perform human tasks.) None of those tell us much about performance in the real world (an infinite universe with complex, unknown, ever-changing rules). Assuming computational resources are the only relevant factor may be (and, I believe, is) completely ignoring the problem of the data necessary to train an AI that is capable of interacting with reality as well as a human does. The relevant natural-science analogy may yet turn out to be not "10^41 FLOP/s", but "a billion years of real-time training experience". We will, of course, be able to bring that number significantly down, but to 30 years? I'm extremely skeptical.

(Bonus hot take: The first AGI takeover will literally be thwarted by its failure to understand the power of love, or friendship, or some equally cheesy aspect of human behavior that it will have failed to adequately grasp.)

tl;dr: The report is by necessity overly optimistic (pessimistic if you think AGI means the end of humanity), but constitutes a useful lower bound. Eliezer is not even wrong.

Expand full comment

Coming up with a complex environment isn't that hard. Firstly, we have a lot of games, and a lot of other software. Set up a modern computer with a range of games, office applications and a web browser, and let the AI control the mouse+keyboard input and see the screen output. That's got a lot of complexity. (It also gives the AI unrestricted internet access, which is not a good idea.)
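The setup being described is just the standard observation-action loop; here is a toy sketch using the gymnasium API as a stand-in, with the understanding that a real "whole desktop" environment would swap in screen captures for observations and mouse/keyboard events for actions:

```python
# Toy agent-environment loop: observe, act, repeat. CartPole stands in for the
# far richer "modern computer with games and a browser" environment described above.
import gymnasium as gym

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)
total_reward = 0.0
for _ in range(200):
    action = env.action_space.sample()  # placeholder policy: random actions
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    if terminated or truncated:
        obs, info = env.reset()
print("total reward over 200 steps with a random policy:", total_reward)
```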

"a billion of years of real-time training experience"

More like 20 years of training experience and a handful of basic principles of how to learn.

If you had an AIXI-level AI, just about any old data would do. There is a large amount of data online. Far, far more than the typical human gets in their training. (Oh, and the human genome is online too, so if it contains magic info, the AI can just read it.)

People have been picking games like chess mostly because their AI techniques can't handle being dumped into the real world. Giving an AI a heap of real world complexity isn't hard.

Expand full comment

Small mistake, "he upper bound is one hundred quadrillion times the upper bound." should be "he upper bound is one hundred quadrillion times the lower bound."

Expand full comment

Strong agree that it's not very decision-relevant whether we say AGI will come in 10 vs 30 vs 50 years if we realistically have significant probability weight on all three. Well, at least not for technical research. Granted, I wrote a response-to-Ajeya's-report myself ( https://www.lesswrong.com/posts/W6wBmQheDiFmfJqZy/brain-inspired-agi-and-the-lifetime-anchor ), but it was mainly motivated by questions other than AGI arrival date per se. Then my more recent timelines discussion ( https://www.lesswrong.com/posts/hE56gYi5d68uux9oM/intro-to-brain-like-agi-safety-3-two-subsystems-learning-and#3_8_Timelines_to_brain_like_AGI_part_3_of_3__scaling__debugging__training__etc_ ) was mainly intended as an argument against the "no AGI for 100 years" people. I suspect that OpenPhil is also interested in assessing the "no AGI for 100 years" possibility, and is also interested in governance / policy questions where *maybe* the exact degree of credence on 10 vs 30 vs 50 years is an important input, I wouldn't know.

Expand full comment

These projections ignore the "ecology" (or "network" if you prefer). Humans individually aren't very smart; their effective intelligence resides in their collective activity and their (mostly inherited) collective knowledge.

If we take this fact seriously we will be thinking about issues that aren't discussed by this report, Yudkowsky, etc. For example:

- What level of compute would it take to replicate the current network of AI researchers plus their computing environment? That's what would be required to make a self-improving system that's better than our AI research network.

- What would "alignment" mean for a network of actors? What difference does it make if the actors include humans as well as machines?

- Individual actors in a network are independently motivated. They are almost certainly not totally aligned with each other, and very possibly have strongly competitive motivations. How does this change our scenarios? What network & alignment structures produce better or worse results from our point of view?

- A network of actors has a very large surface area compared to a single actor. Individual actors are embedded in an environment which is mostly outside the network and have many dependencies on that environment -- for electric power, security, resources, funding, etc. How will this affect the evolution and distribution of likely behaviors of the network?

I hope the difference in types of questions is obvious.

Some objections:

- But AlphaZero! Reply: Individuals aren't very intelligent, and chess is a game played by individuals. AlphaZero can beat individual humans; not a big deal.

- But AlphaFold! The success of AlphaFold depends on knowledge built up by the network. AlphaFold can utilize this knowledge better than any individual human; again, no big deal. AlphaFold can't independently produce new knowledge of the type it needs to improve. However, AlphaFold *does* increase the productivity of biochemical research, which will greatly increase the rate of progress of the network, and will feed back to some degree to AlphaFold.

- But GPT-3! This is a great example of consolidating and using collective knowledge from the corpus -- and it helps us understand how much knowledge is embedded implicitly in the corpus. On the other hand, we haven't seen any AI that generates a significant net increase in the collective knowledge of our corpus. This will come, but AIs will only increase our collective knowledge incrementally to begin with.

- But FOOM! This would require replicating the whole research endeavor around AI and probably a lot more -- maybe much of the culture and practice of math, which is very much a collective endeavor, for example. Not going to happen quickly just because one machine gets a few times more intelligent than a single AI researcher.

Expand full comment

I wonder to what extent the curves in the compute vs. Elo graph flatten out to the right due to the inherent upper limit of Elo ratings. Or conversely, to what extent the flattening indicates limits to this type of intelligence.
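For reference, a stronger engine's expected score against any fixed opponent saturates toward 1.0 under the standard Elo formula, which by itself would flatten measured curves; a quick sketch (not tied to the graph's actual data):

```python
# Expected score of a player whose rating exceeds the opponent's by `rating_gap`
# points, per the standard logistic Elo formula. Gains saturate toward 1.0.
def expected_score(rating_gap):
    return 1.0 / (1.0 + 10 ** (-rating_gap / 400.0))

for gap in (0, 100, 200, 400, 800, 1600):
    print(f"gap {gap:4d}: expected score {expected_score(gap):.4f}")
# gap 0 -> 0.5000, gap 100 -> 0.6401, ..., gap 1600 -> 0.9999
```

Whether the flattening in the chart reflects only this measurement ceiling or a genuine limit of the algorithms is exactly the open question.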

Expand full comment

I'm trying to read that linked article by Eliezer now and, holy crap, he could really use an editor who would tell him to cut out half of the text and maybe stop giving a comprehensive, wordy introduction to Eliezerism at the beginning of every text he writes.

Expand full comment

A thought: Platt's Law is a specific case of Hofstadter's Law (which is also about AI, actually): It always takes longer than you think, even when you take into account Hofstadter's Law. Which fits with the "estimates recede at the rate of roughly one year per year". You make a guess, take into account Hofstadter's Law, then a year goes by, and you find yourself not really any closer, rinse and repeat.

Another thought: Platt's Law is about the size of "a generation", so Platt's Law-like estimates could be seen as another way of looking around and saying "it won't be THIS generation that figures it out".

Final thought: it seems to me that if you're going to take the "biological answer" approach, it would make more sense to look at how evolution got us here vs. how we're working on getting AI to a human level of performance. How many iterations did it take, and how "powerful" was each iteration, for evolution to arrive at humans? How many iterations, and how "powerful" an iteration, has it taken for us to get from an AI as smart as algae to whatever we have now?

Expand full comment

Aren't 10% missing from her weighting of the 6 models?

Not sure it makes much of a difference, though.

Expand full comment

The implicit assumption is that we are not far from being able to reverse-engineer the wiring of someone's brain, or perhaps some reference brain, and simulate it as a neural network and get a working human-like intelligence. That's not a totally unreasonable idea, but we are nowhere close to being able to figure out all the synaptic connections in someone's brain. Really. It's not like the human genome or understanding a human cell. You can grab some DNA or a few cells from someone pretty easily, but just try to follow a brain's wiring.

No, we are not going to be able to use ML to figure out the wiring based on some training set. We can probably get such a system to do something interesting, but it isn't going to be thinking like a human. ML algorithms are just not robust enough. Visual recognition algorithms fall apart if you change a handful of pixels, and even things like AlphaFold collapse if you vary the amino acid sequence. Sure, people fall apart too, but an AI that can't tell a hat from a car if a couple of visual cells produce bogus outputs isn't behaving like a human.
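The "handful of pixels" failures are usually demonstrated with a fast-gradient-sign perturbation (Goodfellow et al.); here is a minimal sketch against an untrained stand-in model, which shows only the mechanics rather than a real misclassification:

```python
# Fast-gradient-sign sketch: nudge each pixel slightly in the direction that
# increases the loss. The model here is an untrained stand-in, so this only
# illustrates the mechanics of the attack.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in classifier
loss_fn = nn.CrossEntropyLoss()

image = torch.rand(1, 1, 28, 28, requires_grad=True)  # stand-in "photo"
label = torch.tensor([3])

loss = loss_fn(model(image), label)
loss.backward()

epsilon = 0.05  # small, often visually imperceptible perturbation budget
adversarial = (image + epsilon * image.grad.sign()).clamp(0, 1)

print("prediction before:", model(image).argmax().item(),
      "prediction after:", model(adversarial).argmax().item())
```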

Then, there's all the other stuff the brain does, and it's not just the nerve cells. The glial cells and astrocytes do things that we are just getting a glimpse of. It's not like brains don't rewire themselves now and then. There's Hebb's rule: neurons that fire together, wire together, and we barely have a clue of how that works at a functional level, so good luck simulating it.

Closer to home, the brain is full of structures that embed assumptions about how an animal needs to process information to survive and reproduce. The thing is that we don't know what all of these structures are and what they do. Useful ML algorithms also embed assumptions. Rodney Brooks pointed out that the convolution operations used in ML object-identification algorithms embed assumptions about size and location invariance. ML models don't learn that from training sets. People write code that moves a recognition window around the image and varies its size. (Brooks has been a leader in the AI/ML world since the 1980s, and his rodneybrooks.com blog is full of good informed analysis of the field and its capabilities.)
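The "move a recognition window around and vary its size" procedure Brooks describes is the classic multi-scale sliding-window search; a rough sketch, with a hypothetical stand-in for the trained classifier:

```python
# Multi-scale sliding-window search: score a window at several positions and
# sizes, keep the best hit. `classifier_score` is a hypothetical stand-in.
import numpy as np

def classifier_score(patch):
    return float(patch.mean())  # stand-in for a real trained classifier

def sliding_window_search(image, window=32, scales=(1.0, 0.75, 0.5), stride=16):
    best_score, best_box = -np.inf, None
    h, w = image.shape[:2]
    for scale in scales:
        size = max(1, int(window * scale))
        step = max(1, int(stride * scale))
        for y in range(0, h - size + 1, step):
            for x in range(0, w - size + 1, step):
                score = classifier_score(image[y:y + size, x:x + size])
                if score > best_score:
                    best_score, best_box = score, (x, y, size)
    return best_score, best_box

print(sliding_window_search(np.random.rand(128, 128)))
```

The location and scale invariance come from this hand-written loop, not from anything the network learned.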

Maybe I'm too cynical, but I'll go with Brooks' NIML: not in my lifetime.

Expand full comment

Really nice article.

My one comment would be that if you're going to use biological anchors, it would be wise to include a biologist in the discussion. This whole debate feels like it only involves AI experts, focused on machine learning.

There is a whole science devoted to brain biology, where they create electrical models of human brains, synapses, dendrites and so on. Many dendrites and axons connect to hundreds of other neurons, analogous to neural nets in a way, but different parts of the brain have different types of neurons.

Also relevant is the fact that the vast majority of the FLOPs the brain performs have nothing whatsoever to do with "general intelligence." Rather, they are about maintaining bodily functions, regulating emotions, etc. An AGI doesn't need to worry about any of this, just the cognitive parts.

I am not the right expert, but it feels like getting the right expert involved would be worthwhile - if you want to base your AGI prediction on the FLOP/s of a human brain, at least make a bit more effort to get that one (somewhat tangible) number right.

Expand full comment