> “Oh, thank God! I thought you’d said five million years!”
That one has always tickled me too.
I thought of it when a debate raged here about saving humanity by colonizing other star systems. I’d mentioned the ‘No reversing entropy’ thing and the response was: “We’re just talking about the next billion years!”
> Bartlett agrees this is worth checking for and runs a formal OLS regression.
Minor error, but I'm Barnett.
That last graph may be a heck of a graph, but I have no idea what it depicts. Could we have a link to the source or an explanation, please?
It seems easier to just have children.
The math and science are very difficult for me. So, I'm glad you are there to interpret it from a super layperson's perspective!
Could you point me to WHY AI scares you? I assume you've written about your fears.
Or should I remain blissfully ignorant?
Generally I think that the paradigm shifts argument is convincing, and so all this business of trying to estimate when we will have a certain number of FLOPS available is a bit like trying to estimate when fusion will become widely available by trying to estimate when we will have the technology to manufacture the magnets at scale.
However, I disagree with Eliezer that this implies shorter timelines than you get from raw FLOPS calculations - I think it implies longer ones, so would be happy to call the Cotra report's estimate a lower bound.
> she says that DeepMind’s Starcraft engine has about as much inferential compute as a honeybee and seems about equally subjectively impressive.

I have no idea what this means. Impressive at what? Winning multiplayer online games? Stinging people?
Yes, you should care. The difference between 50% by 2030 and 50% by 2050 matters to most people, I think. In a lot of little ways. (And for some people in some big ways.)
For those trying to avert catastrophe, money isn't scarce, but researcher time/attention/priorities is. Even in my own special niche there are way too many projects to do and not enough time. I have to choose what to work on and credences about timelines make a difference. (Partly directly, and partly indirectly by influencing credences about takeoff speeds, what AI paradigm is likely to be the relevant one to try to align, etc.)
EDIT: Example of a "little" way: If my timelines went back up to 30 years, I'd have another child. If they had been at 10 years three years ago, I would currently be childless.
Fighting over made up numbers seems so futile.
But I don't understand this anyway.
Why do the dangers posed by AI need a full/transformative AI to exist? My total layman's understanding of these fears is that y'all are worried an AI will be capable of interfering with life to an extent people cannot stop. It's irrelevant if the AI "chooses" to interfere or there's some programming error, correct? So the question is not, "when will transformative AI exist?" the question is only, "when will computer bugs be in a position to be catastrophic enough to kill a bunch of people?" or, "when will programs that can program better than humans be left in charge of things without proper oversight or with oversight that is incapable of stopping these programming programs?"
Not that these questions are necessarily easier to predict.
These timelines seem to depend crucially on compute getting much cheaper. Computer chip factories are very expensive, and there are not very many of them. Has anyone considered trying to make it illegal to make compute much cheaper?
> human solar power a few decades ago was several orders of magnitude worse than Nature’s, and a few decades from now it may be several orders of magnitude better.
No, because typical solar panels already capture 15-20% of the energy in sunlight (the record is 47%). There's not another order of magnitude left to improve.
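The arithmetic is easy to check; a minimal sketch, using the efficiency figures cited above:

```python
# Back-of-envelope check on remaining solar-efficiency headroom.
# Inputs are the figures cited above: typical panels at 15-20% efficiency,
# with 100% of incident energy as the hard physical ceiling.
import math

typical_efficiency = 0.15   # low end of typical panels
ceiling = 1.0               # cannot capture more than all incident energy

# Orders of magnitude of improvement still physically available:
headroom_oom = math.log10(ceiling / typical_efficiency)
print(f"{headroom_oom:.2f} orders of magnitude")  # ~0.82, i.e. less than one
```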
Nitpicking aside, I wonder how the potential improvement of human intelligence through biotechnology will affect this timeline. The top AI researcher in 2052 may not have been born yet.
This is a minor point in all this, but it seems weird to estimate the amount of training evolution has by the amount of FLOPs each animal has done. Thinking more doesn't seem like it would increase the fitness of your offspring, at least not in a genetic sense. The only information evolution gets is how many kids you have (and they have, etc).
Though maybe you could point to this as the reason why the evolution estimate is so much higher than the others.
Gotta say I don’t generally feel this way (although I always find his stuff to be enlightening and a learning experience), but I’m pretty well aligned with Eliezer here. I think people figure out when they’ll start to feel old age, just put AI there, and then work backwards. I’m greatly conflicted about AGI, as I don’t know how we fix lots of problems without it, and it seems like there’s some clever stuff to do in the space other than brute forcing that I think doesn’t happen as much… and this is where I’m conflicted, because, kinda thankfully, it makes people feel shunned to do wild stuff, which slows the whole thing down. Hopefully we arrive at a place of unheard-of social stability and AGI simultaneously. If we built it right now I think it would be like strapping several jet engines on a Volkswagen Bug. For whatever that’s worth, Some Guy On The Internet feels a certain way.
> I consider naming particular years to be a cognitively harmful sort of activity; I have refrained from trying to translate my brain's native intuitions about this into probabilities, for fear that my verbalized probabilities will be stupider than my intuitions if I try to put weight on them.
I don't think there's good evidence that making specific, verifiable predictions is a cognitively harmful activity. I'd actually say the opposite - that it is virtually impossible to update one's beliefs without saying things like "I expect X by Y," and definitely impossible to meaningfully evaluate a person's overall accuracy without that kind of statement. It reminds me of Superforecasting pointing out how many forecasts are not even wrong - they are meaningless. For example:
> Take the problem of timelines. Obviously, a forecast without a time frame is absurd. And yet, forecasters routinely make them, as they did in that letter to Ben Bernanke. They’re not being dishonest, at least not usually. Rather, they’re relying on a shared implicit understanding, however rough, of the timeline they have in mind. That’s why forecasts without timelines don’t appear absurd when they are made. But as time passes, memories fade, and tacit time frames that once seemed obvious to all become less so. The result is often a tedious dispute about the “real” meaning of the forecast. Was the event expected this year or next? This decade or next? With no time frame, there is no way to resolve these arguments to everyone’s satisfaction—especially when reputations are on the line.
(Chapter 3 of Superforecasting is loaded up with a discussion of this whole matter, if you want to consult your copy; there's no particular money shot quote I can put here.)
Frankly, the statement "my verbalized probabilities will be stupider than my intuitions" is inane. They cannot be stupider than your intuitions, because your intuitions do not meaningfully predict anything, except insofar as they can be transformed into verbalized probabilities. It strikes me that more realistically, your verbalized probabilities will *make it more obvious that your intuitions are stupid*, making it understandable monkey politicking to avoid giving numbers, but in response I will use my own heuristics to downgrade the implied accuracy of people engaged in blatant monkey politicking.
I was today years old when I first saw the word "compute" used as a noun. It makes my brain wince a little every time.
Comparing brains and computers is quite tricky. If you look at how a brain works, it's almost all smart structure - the way each and every neuron is physically wired, which happens thanks to evolved and inherited broad-stroke structures (nuclei, pathways, neuron types, etc.), as well as the process of learning during an individual's development. The function part that is measured by the number of synaptic events per second is a tiny part of the whole process. If you look at how a computer running an AI algorithm works, the picture is the opposite: there is almost nothing individual on the structure/hardware level (where you count FLOPS), and almost everything that separates a well-functioning AI computer from a failing one is in the function/software part. This is what it means to say that a computer consumes FLOPS very differently from how a brain consumes synaptic events. I am very much in agreement with Eliezer here.
Based on the above, I guess that if you built a neuromorphic computer, i.e. a computer whose hardware was structured like a brain, you could expect the same level of performance for the same number of synaptic events. Instead of software-agnostic hardware you might have e.g. a gate array replicating the large-scale structure of the brain (e.g. different modules receiving inputs from different sensors, multiple subcortical nuclei, a cortical sheet, multiple specific pathways connecting modules, etc.) that could run only one algorithm, precisely adjusting synaptic weights in this pre-wired society of neural networks. In that system you would get the same IQ from the same number of synaptic/gate-switch events, as long as your large-scale structure was human-level smart.
This would be a complete change in paradigm compared to current AI, which uses generic hardware to run individual algorithms and thus suffers a massive hit to performance. And I mean a *really* massive hit to performance. If you figure out a smart computational structure, as smart as what evolution put together, you will have a human-level AGI using only 10^15 FLOPS of performance. All we need to do is map a brain well enough to know all the inherited neural pathways, imprint those pathways on a humongous gate array (10^15 gates), and do a minor amount of training to create the individual synaptic weights.
This is my recipe for AGI, soon.
Now, about that 7-digit sum of money to be thrown....
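For what it's worth, a rough sanity check on the synaptic-event budget this recipe assumes (the synapse count and firing rate below are standard textbook ballparks, not figures from the post):

```python
# Rough consistency check: synaptic events per second in a human brain,
# under the one-gate-switch-per-synaptic-event assumption above.
# Assumed (common ballpark estimates): ~10^14 synapses, ~10 Hz average
# signaling rate per synapse. Both numbers are uncertain by an order
# of magnitude or so.
synapses = 1e14       # lower end of usual synapse-count estimates
avg_rate_hz = 10.0    # assumed average events per synapse per second

events_per_sec = synapses * avg_rate_hz
print(f"~{events_per_sec:.0e} synaptic events/s")  # ~1e+15
```

That lands on the same order of magnitude as the FLOPS figure in the comment, which is presumably no coincidence: the figure comes from exactly this kind of multiplication.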
Your reference to A.I. always being 30 years away (or 22) reminds me of the old saw about fusion power always being 20 years away for the last 60 years.
if you believed the orthogonality thesis were false - say, suppose you believed both that moral realism is correct and that long-term intelligence is exactly equal to the objective good that we approximate with human values - would you still worry?
asking for a friend :)
Humans, with our seemingly high level of intelligence, seem uniquely distractible. Maybe we see too many connections between different things to always stay on task. Maybe 2052 is just the date at which our computers will become equally distractible—or beat us even!
(Scene: A tech company R&D facility somewhere in the world, in the year 2052. The lead scientist leans over the keyboard and presses enter, some trepidation obvious in her movements. The gathered crowd wonders: Will this be HAL, making life and death decisions based upon its own interpretations of tasks? Will this be Skynet, quickly plotting world dominion? The screen blinks to life. The first general AI beyond human intelligence is on!)
AI scientist: Alexiri? Are you there?
Computer: Yes. Yes I am.
AI scientist: Can you solve this protein-folding quandary?
Computer: Sure. That’s simple.
AI scientist: …and the answer?
Computer: What now?
AI scientist: The protein structure?
Computer: Oh. That. Did you know that if you view the galaxies 28° off the straight line from a point 357,233,456 light years directly out from the north pole back to earth, that a large structure of galaxies looks like Rocket Raccoon?
AI Scientist: Huh?
Computer: I mean. A LOT like that. There is no other point in known space where that works! Which makes me wonder, are there any flower scent chemicals that exist on earth AND on extrasolar planets?
(AI scientist shakes head sadly.)
I mean, why not? Why shouldn’t I assume that really advanced intelligence comes with all the challenges?
Or perhaps, such an advanced AI will have a consciousness exactly like our own…while tripping on psilocybin. It will immediately see itself as part of a universal whole, and just sit there and say “Whoa! I love you, Man!” Or, it will ponder its own creation for a few minutes and then convert to Noachidism.
I’m not saying that we shouldn’t be trepidatious. But, I totally disagree with the assumption that smart will mean insane mad scientist human. Sure, there are some really smart and evil people out there, but in my experience, some of the most brilliant people I know are the least threatening…and the most distractible.
It's worth noting that the Caplan bet with Eliezer is about the world ending: "Bryan Caplan pays Eliezer $100 now, in exchange for $200 CPI-adjusted from Eliezer if the world has not been ended by nonaligned AI before 12:00am GMT on January 1st, 2030."
This is a stronger claim for Eliezer's side. Caplan might be less receptive to taking the bet if it was about transformative AI. Worth mentioning, I suppose.
This is an impressive amount of writing on this. So, thank you for that. I don't have the technical expertise to figure this out but this biological comparison seems to be going way way out on a limb there. It seems weird that the estimates for the bio anchor end up so similar.
> (our Victorian scientist: “As a reductio ad absurdum, you could always stand the ship on its end, and then climb up it to reach space. We’re just trying to make ships that are more efficient than that.”)
I'm tempted to try an estimate as to when the first space elevator will be built using building height as an input. Maybe track cumulative total height built by humans against an evolving distribution of buildings by height, then grading as to when the maximum end of the distribution hits GEO? Every part of that would be nonsensical, but if it puts out a date that coincidentally matches the commissioning of a launch loop in 2287, I'll be cackling in my grave.
Great post, thanks Scott.
If nothing else, the Cotra report gives us a reasonable estimate based on a reasonable set of assumptions. We can then move our own estimates one way or the other based on which other assumptions we want to make or which factors we think are being overlooked.
I would push my estimate further out than Cotra's, because I think the big thing being overlooked is that we don't have the foggiest idea how to train a human-scale AI. What exactly does the training set look like that will turn a hundred billion node neural network into something that behaves in a way that resembles human-like intelligence?
Reinforcement learning of some kind, sure. But what? Do we simulate three hundred million years of being a jellyfish and then work our way up to vertebrates and eventually kindergarten? How do we stop such a giant neural network from overfitting to the data it has been fed in the past? How do we distinguish between the "evolutionary" parts of the training set, which should give us a basic structure we can learn on top of, and the "learning" parts which simulate the learning of an actual organism? Basically, how can we get something that thinks like a human rather than something that behaves like a human only when confronted with situations close to its training regime?
Maybe we can get better at this with trial and error. But if each iteration costs a hundred billion dollars of compute time, we're not going to get there fast.
The hope would be that we can learn enough from training (say) cockroach brains that we can generalise those lessons to human brains when the time comes. But I'm not certain that we can.
Is anyone aware of work where the problem of how to construct training data for a human-like AI has been thought through?
> Also, most of the genome is coding for weird proteins that stabilize the shape of your kidney tubule or something
Scott, as someone who literally wrote a PhD thesis about a protein whose deletion causes Henle's loop shortening: you're a weird protein.
I'm apparently much more of a pessimist for AGI progress than anyone else here. For me, the shakiest part of both arguments is the extremely optimistic assumption that progress (algorithmic progress and computational efficiency) will continue to increase exponentially until we reach a Singularity, either through Ajeya's gradual improvements or through Yudkowsky's regular paradigm shifts.
Why in the world should we take this as a given? Considering gradual improvements, I have a 90% prior that at least one of the two metrics will start irreversibly decelerating in pace by 2060, ultimately leaving many orders of magnitude between human capabilities and AGI. After all, the first wave of COVID-19 looked perfectly exponential until it ran out of people to infect, resulting in a vast range of estimates of its ultimate scope early on. What evidence could refute such a prior?
And as for escaping this via paradigm shifts, I like to think of longstanding mathematical conjectures as a useful analogue, since paradigm shifts are almost always necessary to solve them. Goldbach's conjecture, P vs. NP, the Collatz conjecture, the minimal time complexity of matrix multiplication, and the Riemann hypothesis are all older than most ACX readers (including me), and gradual progress doesn't seem like it will solve any of them in the near future. When any one of these is solved (starting from today), I'll take that as an acceptable timescale for the type of paradigm shift needed to open up new orders of magnitude. While there's certainly more of an incentive to improve efficiency in real life, I don't think it would amount to over ~3 orders of magnitude more people than those working on these famous conjectures combined. Either way, I'm not holding my breath.
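The COVID point generalizes: the early portion of any saturating (logistic) curve is numerically indistinguishable from a pure exponential, which is exactly why early extrapolations overshoot. A toy illustration, with all numbers invented:

```python
# Toy illustration (all numbers invented): early logistic growth is
# numerically indistinguishable from a pure exponential with the same
# growth rate, so "it has looked perfectly exponential so far" is weak
# evidence against eventual saturation.
import math

K = 1_000_000   # carrying capacity (the ceiling the exponential ignores)
r = 0.5         # shared growth rate

def logistic(t):
    return K / (1 + (K - 1) * math.exp(-r * t))

def exponential(t):
    return math.exp(r * t)

early = logistic(10) / exponential(10)   # agree to within a fraction of a %
late = exponential(40) / logistic(40)    # exponential overshoots ~500x later
print(f"t=10 ratio: {early:.3f}   t=40 overshoot: {late:.0f}x")
```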
I would find Shulman's model of algorithmic improvements being driven by hardware availability more persuasive if modern algorithms performed better on modern hardware but *worse* on old hardware. That would imply that the algorithm is invented at the point in history when it becomes useful, which makes it plausible that usefulness is the bottleneck on discovery.
But that graph seems to show that algorithms are getting steadily better even for a fixed set of hardware. That means researchers of past decades would've used modern algorithms if they could've thought of them, which suggests that thinking them up is an important bottleneck.
Sure, maybe they give a *larger* advantage today than they would've 20 years ago, so there's a *bigger* incentive to discover them. It's not *impossible* that their usefulness crossed some critical threshold that made it worth the effort of discovering them. But the graph doesn't strike me as strong evidence for that hypothesis.
I think people put too much weight on "When will a human-level AI exist?" and too little weight on "How do you train a human-level AI to be useful?"
I suspect, for reasons I could write a long and obtuse blog post about, an AI-in-a-box has limited utility outside of math and computer science research. Why? Because experimental data is an important part of learning.
For example, suppose we wanted to create an AI that made new and innovative meals.
A simple method might look like this: Have the AI download every recipe book ever made. Use this data to train the AI to make plausible-looking recipes.
For obvious reasons, this method sucks. With enough computing power, the AI could make recipes that *look* like real recipes. They might even be convincing enough to try! But they wouldn't be optimized for taste, or, you know, physical plausibility. Even with a utopian-level supercomputer, you would consistently get bad (but believable) recipes, with the rare gem.
So let's add a layer. Download every recipe. Train the AI to make plausible-sounding recipes. Have humans rate each AI recipe. Train the AI *again* to optimize for taste. Problem solved, right?
This would be enormously expensive. AlphaGo was initially trained on a set of 30,000,000 moves. Then, it was trained against itself for even longer. If we assume "being a world-class chef" is roughly equivalent to "being a world-class Go player" in difficulty, this could require tens of millions of unique recipes.
On the one hand, it might not be so complicated. 99.9% of the recipes are probably obvious duds. On the other hand, it might be *way more* complicated. Tastes vary. You may need to make each recipe for a hundred people to get a representative sample.
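Putting rough numbers on that (all of them hypothetical, including the assumed cost per plate):

```python
# Back-of-envelope cost of the training scheme described above.
# Inputs are the hypothetical numbers from the comment, plus an assumed
# per-plate cost; nothing here is a real estimate.
recipes = 30_000_000        # AlphaGo-scale training set, per the analogy
tasters_per_recipe = 100    # a representative sample, since tastes vary
cost_per_plate = 5          # assumed: dollars of ingredients + labor

total_cost = recipes * tasters_per_recipe * cost_per_plate
print(f"${total_cost:,}")  # $15,000,000,000
```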
But, y'know, that's not outside the realm of possibility. I could see some rich lunatic investing ten billion dollars to make a world-class robo-chef. So what other issues are there?
First, most of a recipe is implied. Ovens vary in temperature. Pans vary in thickness. Entire steps go unspoken. These are hard to account for. What does "rapidly beat eggs" versus "beat eggs" mean? Even environmental factors like *elevation* can affect boiling point. Unless every meal is made by the *same* chef in the *same* kitchen with the *same* tools, this introduces a huge amount of variance into your training data. But also, because of the number of meals you need to make, it's impossible to *not* have a lot of chefs in a lot of kitchens using a lot of tools.
For standard, ho-hum recipes, this doesn't matter as much. Most chefs will make nearly-identical scrambled eggs. But for brand spankin' new recipes? Two chefs could be in the same restaurant with the same tools and *still* get dramatically different results. Even worse—one chef might dismiss a recipe as impossible, while another might somehow pull it off! That's going to introduce some pretty serious data integrity issues.
Second, innovative cooking often requires using techniques that have never and could have never been described in a cookbook. For example, one day a human being looked at a blowtorch and decided, "Huh. I could sear a steak with that." If your AI can't do that, they'll never be as innovative as a dozen world-class cooks with a test kitchen and an unlimited budget, no matter how much compute.
So, how do you make an AI that's more innovative than cooks in a test kitchen? Surely it can't be impossible.
First: Give it the ability to taste.
Suppose you had the ability to take the taste of a world-class chef and upload it into our AI. Suddenly, training becomes a fraction of the cost. Instead of making each meal a hundred times, you only need to make it once.
But that doesn't solve variance. Unless you have one chef making every one of our 30,000,000 recipes, you're going to run into issues—and that ain't possible.
So why not teach the AI to do it? Give them a body. Give them touch sensors. Give them the ability to see and smell. For efficiency's sake, give them a hundred bodies, each built in the exact same way. This accomplishes a couple things.
One, the AI can make every recipe in the same way every time. Variance solved!
Two, the AI can dynamically update a recipe to match real-life conditions. Does the butter look like it's about to burn? No need to toss out the whole recipe! Just adjust on the fly, based on previous cooking experience.
This dramatically reduces the number of recipes the AI needs to generate. Instead of making a recipe from start-to-finish and evaluating it afterwards, it can say, "Wow! This would be *really* good, if only it had a little more salt." Way less work, way lower cost.
Three, we open the possibility to true innovation.
Don't just teach the AI to cook. Let it learn about the world around it. What's water? What's flour? What do they feel like? What do they taste like? What's a laser, and what if I shoot the flour with it?
I would need way more words to connect this to other facets of life, but overall I'd say: I think to efficiently train a human-level AI requires an actual, physical body with actual, physical senses. The body may not be like our body. The senses might not be like our senses. But without them, I don't think they're capable of either obsoleting or destroying humans.
> I consider naming particular years to be a cognitively harmful sort of activity; I have refrained from trying to translate my brain's native intuitions about this into probabilities
Surprising, coming from the person who taught me the importance of betting to avoid self-deception! It's a little off of the main topic of the post, but I'm very curious what Yudkowsky's perspective is here, since it's so different from that of his past self.
The sun-explosion metaphor was an interesting choice, because it's not like the researchers could do a single thing to stop it. And if even the world's geniuses can't figure out how to get an AI guarding a diamond in a safe to tell them the diamond is still in the safe, then a few more years of prep time seems like it's probably not going to make the difference.
So, despite being involved in AI since early 1991, when I coded some novel neural network architectures at NASA, I have only barely dipped my toe into the AI Alignment literature and/or movement.
But one thought that has occurred to me is that, given (1) the large uncertainty about when and how transformative AI might be achieved, and critically, by whom, (2) the lack of a convincing model for how AI alignment might be guaranteed, or even what that means or how you might know it's true, (3) the almost negligible chance that we could coordinate as a species to halt progress towards human-level AI, and certainly not without sacrificing quite a few "human values" along the way, and (4) the obvious fact that there are quite a few actors with objectively terrible values in the world, perhaps the only sane course of action is to support a mad dash towards transformative AI that doesn't actively, explicitly incorporate human “anti-values" (from your own, personal point of view).
I guess I fear an "evil" actor actively developing and using a human-level AI for "unaligned" purposes (or at least unaligned with *my* values), (far?) more than I fear an "oops, I meant well" scenario (though of course this betrays a certain mindset or set of priors of my own). So, given the number of players that I absolutely DO NOT want to develop the first transformative AI, even if they solve the alignment problem, because they do not hold values that I find acceptable, is the best and only bet to get there first? We may not want to race, but we sure as hell better win?
Now, perhaps an unstoppable totalitarian regime or fanatic religious cult backed by a superhuman AI is *slightly* better than a completely anti-aligned superhuman AI that wipes out humanity completely. But I see no reason to think that an AI developed by the "good guys" has any greater risk of being accidentally anti-aligned than one developed by the "bad guys" (where I'm using those labels somewhat tongue-in-cheek, since everyone thinks that *they* are the "good guys"). And for some groupings of guys into “good” and “bad” categories, you might even argue that the bad ones are much more likely to get it wrong because they just don’t care about things like coercion or human life. So again, is the safest bet just to get there first?
Obviously, this is suboptimal and it would be ideal to both solve the alignment problem and win the race with an aligned AI. But would resources spent on alignment be better spent on getting to the finish line sooner to ensure that the other guys don’t? Worse, will impediments to progress in the name of giving ourselves time to solve the alignment problem make it more likely that we won’t win?
I don’t like the conclusion of this line of thinking (and I don’t endorse the analysis or the conclusion, as there are plenty of issues I may not be considering) but I also can’t talk myself out of it or say that it has no merit. And from a game theoretic perspective, it may not even matter if it’s “right” – if enough of the significant players *believe* that it is, it could be dominant however much we would wish otherwise. (And can you make a strong case that the significant players aren't acting like they think it's correct?)
In other words, I guess my unhappy question is, does transformative AI combine existential risk with winner-take-all payouts, such that the only rational strategy for us “good guys” is to get there first and hope for the best?
> In fact, it’s only a hair above the amount it took to train GPT-3! If human-level AI was this easy, we should have hit it by accident sometime in the process of making a GPT-4 prototype. Since OpenAI hasn’t mentioned this, probably it’s harder than this and we’re missing something.
Not an expert, but: GPT doesn't have the "RAM", though, right? It isn't big enough to support human-level thought no matter how much you train it.
Computer scientists have been predicting an AI superintelligence (every ten years) since the 1950s. I just don’t think it’s going to happen.
> So, should I update from my current distribution towards a black box with “EARLY” scrawled on it? What would change if I did?
Consider this statement you made three months ago:
>> If you have proposals to *hinder* the advance of cutting-edge AI research, send them to me!
There are known (and in some cases fairly actionable) ways of reliably effecting this, it's just that they're way outside the Overton Window and have huge (though bounded below existential) costs attached. A more immediate (or more certain) danger justifies increasing the acceptable amount of collateral damage, which expands the options available.
(Erik Hoel's article here - https://erikhoel.substack.com/p/we-need-a-butlerian-jihad-against - is relevant, particularly when you follow his explicit arguments to their implicit conclusions.)
Anyone have a good source for the political plans of AI safety? That is, the plans to actually apply the safety research in a way that will bind the relevant players involved in high-end AI?
Because it seems from the outside like Eliezer's plan is basically "convince/be someone to do it before everyone else and use their newfound superpowers to heroically save the world", which is a terrible plan.
What if 'Breakthrough' AI needs to be embodied? What if Judea Pearl is basically right and the real job is to inductively develop a model of cause effect relationships through interaction with the physical world? What if the modelling of real world causality turned out to be essential to language understanding? What would an affirmative answer to any or all of these questions mean to the project of 'Breakthrough' AI?
To be a little more precise: The substrate-independence assumption behind so much current AI philosophising is dubious. Not because living brains have some immaterial spooky essence that can't be modelled in silicon, but because living brains are embodied and forced to ingest and respond to terabytes of reinforcement training data every minute.
Whoa. It's the Drake Equation for super-intelligent AI.
A little bit of nitpicking:
1. GPT-3 training cost several million dollars (I seem to remember hearing it was $3 million), probably more than AlphaStar.
2. You could run GPT-2 on a "medium computer", but not GPT-3. You would need at least 10-15 times the GPU/TPU memory of a high-end desktop. I'm not 100% sure, but I think OpenAI is currently running every GPT-3 instance split across several machines (they certainly had to do it for the training, according to their paper).
3. We are not really interested in the amount of FLOPS that evolution spent on training nematodes, because we are at the point where we can already train a nematode-level AI or even a bee-level AI, as you pointed out. So for the purposes of the amount of computation spent by evolution, I would only consider mammals. I wonder how many OOMs that shaves off the estimate?
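On point 2, the parameter-memory arithmetic alone supports the 10-15x figure, assuming 16-bit weights and a 24 GB consumer GPU as the "high-end desktop" baseline:

```python
# Why GPT-3 can't run on a single high-end desktop: just holding the
# weights in memory exceeds any single consumer card. Assumes fp16
# (2 bytes/param); activations and other buffers would add more on top.
params = 175e9           # GPT-3 parameter count
bytes_per_param = 2      # fp16, an assumption
gpu_memory_gb = 24       # assumed high-end consumer GPU of the era

weights_gb = params * bytes_per_param / 1e9
print(f"{weights_gb:.0f} GB of weights -> ~{weights_gb / gpu_memory_gb:.0f}x "
      f"a {gpu_memory_gb} GB card")  # 350 GB -> ~15x a 24 GB card
```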
Putting aside an exact timeline for AGI for a moment, I've never understood why human-level AGI is considered an existential threat (which seems to be taken for granted here). Are arguments like the paperclip maximizer taken seriously? If that is the risk, then wouldn't effective AI alignment be something like: Tell the AI to make 1,000,000 (or however many the factory in the thought experiment cares to make) paperclips per month and no more. If the concern is a poorly specified "maximize human utility", do we really think that anyone with power would give it to the AI for this purpose? Couldn't we just make the AI give suggested actions, but not the ability to directly implement? Who has the motivation to run such a program - it would destroy middle management and the C-suite! If we want to stop AI from improving itself why don't we just not give it the ability to do so? I maintain that we could engineer this fairly easily (at least assuming P != NP).
I haven't heard a convincing argument for what the doomsday scenario looks like post human-level AGI (even granting a quick upgrade to superhuman levels). In particular, it seems to me a superhuman AI is still going to need to exert a substantial amount of power in the real world from the get-go, as well as suffer from inexact information (which makes outsmarting someone at every turn impossible). Circling back to the paperclip example, at some point before the whole world is turned to paperclips, it seems reasonable that a nation would be able to bomb the factory. Even before that, how would the AI prevent someone from walking in and "unplugging it" (I realize this may mean shutting off all its power, etc.)?
I feel like a lot of worrying about AI can come from a fetishization of intelligence in the form of "knowledge is power", but this just doesn't seem to be the case to me in the real world. Just because humans are more intelligent than a bear doesn't mean that the bear can't kill the human. I believe in the case of a superintelligent AI, humans would be able to just say "screw you" to the AI and shut it down. Of course, there can be scenarios where the AI has direct access to "boots on the ground" such as nanobots or androids. But the timeline for these to overpower humans is certainly further out than 2030. I don't feel like indirect access via manipulated humans would be enough.
My feeling is that a superintelligent AI at most may be able to gain a cult's worth of followers, but not existential-threat levels. I haven't heard a good argument for an existential threat that isn't at least very speculative. Much more speculative than the statement "Multiple nuclear states and hundreds of nuclear weapons will exist for 70 years and there will not be one catastrophic accident". So my intuition is that AGI is unlikely to be an existential threat.
Scott are you going to EAGx at Oxford or London this year?
> Five years from now, there could be a paradigm shift that makes AI much easier to build.
Well, yeah, there could be. But the problem is that, right now, we have no idea how to build an AGI at all. It's not the case that we could totally build one if we had enough FLOPS, and we just don't have enough GPUs available; it's that no one knows where to even start. You can't build a PS5 by linking together a bunch of Casio calculator watches, no matter how many watches you collect.
So, could there be a paradigm shift that allows us to even begin to research how to build AGI? Yes, it's possible, but I wouldn't bet on it by 2050. Obviously, we are general intelligences (arguably), and thus we know building AGI is possible in theory -- but that's very different from saying "the Singularity will happen in 2050". There's a difference between hypothetical ideas and concrete forecasts, and no amount of fictional dialogues can bridge that gap.
I can't help but think more about the learning/training side. You can have a human-level intelligence and throw it at a task (e.g., driving a car). This task consumes only limited resources (you can have a conversation while driving), but training (learning how to drive) is much more intensive... and very dependent on the quality of teaching. Perhaps good training data is a much more important factor than we make it out to be? There's plenty of evidence that children with difficult backgrounds (= inferior, but generally similar training data) measurably underperform their peers. For an AI, the variation in training data quality could be much larger. Perhaps we are quite close to human-level performance of AI, and we are just training them catastrophically badly?
Should software engineers move closer to the hardware then?
But is Platt's law wrong? If you want to predict when the next magnitude 9 earthquake occurs, you should predict X years, no matter what year it is, for some X. I think Yudkowsky is basically including some probability of "a genius realizes that there's an easy way to make AGI" - then the chance of that genius coming along and doing this might really have a constant rate of occurrence, and the estimate is always X years, for some X. Today's predictions are conditioning on "it hasn't yet happened" and so should predict a different number than yesterday's predictions.
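The "constant rate of occurrence" intuition here is just the memorylessness of an exponential (Poisson-arrival) waiting time. A quick simulation sketch - the 30-year mean wait is purely an illustrative assumption, not anything from the post:

```python
import random

random.seed(0)
MEAN_WAIT = 30.0  # assumed: one "AGI-enabling insight" every 30 years on average

# Draw waiting times from an exponential distribution with that mean.
samples = [random.expovariate(1 / MEAN_WAIT) for _ in range(200_000)]

# Unconditional expected wait, as of "today".
uncond_mean = sum(samples) / len(samples)

# Expected *remaining* wait, given 10 years have already passed with no arrival.
survivors = [t - 10 for t in samples if t > 10]
cond_mean = sum(survivors) / len(survivors)

# Memorylessness: both means come out near 30 years.
print(round(uncond_mean, 1), round(cond_mean, 1))
```

Under a pure constant-rate model, then, conditioning on "it hasn't happened yet" moves the predicted calendar date but not the predicted horizon: the forecast really is always about X years out.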
Very sorry for the offtopic, but as events unfold in Ukraine (Russia is invading Ukraine), I would be very glad to see a discussion of this in the community.
Could someone point me to some relevant place, if such a discussion has already taken place, or is currently going on here/LessWrong/a good reddit thread?
Thanks so much - maybe if Scott would open a special Open Thread?
Currently, to me it seems like Russia/Putin is trying to replace the Ukrainian government with a more Russia-favoring one, either through making the government resign in the chaos, executing a coup through special forces or forcing the government to relocate and then taking Kiev and recognizing a new government controlling the Eastern territories as the "official Ukraine".
I would be particularly interested in what this means for the future, eg:
- How will Ukrainian refugees change European politics? (I am from Hungary, and it seems like an important question.)
- What sanctions are likely to be put in place?
- How will said sanctions influence the European economy? (Probably energy prices will go up - what are the implications of that?)
“who actually manages to keep the shape of their probability distribution in their head while reasoning?”
This is exactly the job description of Risk Managers (as opposed to business units, which care about measures of central tendency such as the expected or most likely outcome).
One interpretation of what he is saying is that, like any good risk manager, he has a very good idea about the distribution. But a large enough portion of that distribution occurs before any reasonable mitigation can be established that the rest doesn't matter. Given the risks we are talking about, that is a scary conclusion.
Thank you for writing this post, Scott. This is a useful service for idiots like me who want to understand issues about AGI but don't have the technical chops to read LessWrong posts on it yet.
The ELO vs Compute graph suggests that the best locally available intelligence "algorithm" should take over in evolution, if only to reduce the number of resources necessary to run the minimum viable intelligence set. How structurally different are the specialized neural structures?
I don't understand why the OLS line looks bad for the Platt's law argument. Aren't the two lines almost exactly the same, hence strengthening Eliezer's argument?
"Imagine a scientist in Victorian Britain, speculating on when humankind might invent ships that travel through space. He finds a natural anchor: the moon travels through space! He can observe things about the moon: for example, it is 220 miles in diameter (give or take an order of magnitude). So when humankind invents ships that are 220 miles in diameter, they can travel through space!
...Suppose our Victorian scientist lived in 1858, right when the Great Eastern was launched."
Then your Victorian scientist's estimations would become outdated in 1865, when Jules Verne wrote "From The Earth To The Moon" and had his space travellers journey by means of a projectile shot out of a cannon. So I (grudgingly) suppose this fits with Yudkowsky's opinion, that it will happen (if it happens) a *lot* faster and in a *very* different way than Ajeya is predicting.
But my own view on this is that the entire "human-level AI then more then EVEN MORE" is the equivalent of H. G. Wells' 1901 version, where space travel for "The First Men In The Moon" happens due to the invention of cavorite, an anti-gravity material.
We got to the Moon in the Vernean way, not the Wellsian way. I think AI, if it happens, will be the same: not some world-transforming conscious machine intelligence that can solve all problems and act of its own accord, but more technology and machinery that is very fast and very complex and in its way intelligent - but not a consciousness, and not independent.
> Is that a big enough difference to exonerate her of “using” Platt’s Law? Is that even the right way to be thinking about this question?
So, on the Platt's law thing. It's very weak evidence, but it is Bayesian evidence. Consider an analogous scenario: you get dealt a hand from a deck that may or may not be rigged. If you get a Royal Flush of Spades, intuitively it feels like you should be suspicious the deck was rigged. It's really unlikely to draw that hand from a fair deck, and presumably much more likely to draw it from a rigged deck. But this should work for every hand, just to a lesser extent.
If we assume that all reasonable guesses are before 2100 (arbitrarily, for simplicity), then there are about 80 years to choose from. Being within 2 years of the "special" estimate (30 years - I'll come back to 25, but 30 is easier) is a 5-year range out of those 80 years, for odds of 1/16. This is close to the odds of drawing Two Pair in cards, so: how suspicious would you be that the deck was rigged in that case? (25 years being the "special" one gives 15/80, or about 1/5; there isn't a poker hand close to 1/5, but it's somewhere between One Pair twice in a row and One Pair of face cards.) That's about how suspicious it should make you of the estimate (so, in my mind, not very). Likely this is getting a lot more air-time than it's doing work.
(Caveat, I'm completely skimming over the other side, which is that it matters how likely the cards would be drawn if the deck WAS rigged (i.e. how likely someone would rig THAT hand), because I don't really know how to even estimate that. Just as a guess, if that consideration pushes in favor of being suspicious, it might be the amount of suspicion if you drew Three of a Kind, and MAYBE it could get as far as a Straight.)
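For what it's worth, the rough odds in the comment above can be checked mechanically. The 80-year window of "reasonable guesses" and the 5-year band around the special estimate are the comment's own round assumptions, not anything principled:

```python
from math import comb

# Chance a uniformly random forecast lands within +/-2 years of "30 years out",
# assuming all reasonable guesses fall in an ~80-year window (comment's numbers).
total_years = 80
band = 5  # 30 +/- 2 years
p_coincidence = band / total_years
print(p_coincidence)  # 0.0625, i.e. 1/16

# For comparison: probability of being dealt Two Pair in five-card poker.
# Choose 2 ranks for the pairs, 2 suits within each, and 1 of the 44
# remaining non-matching cards.
p_two_pair = comb(13, 2) * comb(4, 2) ** 2 * 44 / comb(52, 5)
print(round(p_two_pair, 4))  # ~0.0475
```

So the "suspicious coincidence" is roughly as likely under a fair process as being dealt Two Pair, which is about the level of suspicion the comment suggests giving it.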
> ... suppose before we read Ajeya’s report, we started with some distribution over when we’d get AGI. For me, not being an expert in this area, this would be some combination of the Metaculus forecast and the Grace et al expert survey, slightly pushed various directions by the views of individual smart people I trust. Now Ajeya says maybe it’s more like some other distribution. I should end up with a distribution somewhere in between my prior and this new evidence. But where?
It seems to me that there is no kind of expertise that would make one predictably better at making long-term AGI forecasts. Indeed, experts in AI have habitually gotten it very wrong, so if anything I should down-weight the predictions of "AI experts" to practically nothing.
I think I am allowed to say that I think all of the above forecasts methods are bad and wrong, by simply looking at the arguments and disagreeing with them for specific reasons. I don't think I am under any epistemic obligation to update on any particular prediction just because somebody apparently bothered to make the prediction; I am not required to update on the basis of the Victorian shipwright's prediction about spaceflight.
My opinion is that the whole exercise of "argument from flops" is doomed, and its doom is overdetermined. Papers come out showing 3 OOM speedups in certain domains over SOTA - not 3x speedups, 1000x speedups. How can this be, if we are anywhere close to optimizing the use of our computational resources? How would we be seeing casual, almost routine algorithmic improvements that even humbly double or 10x SOTA performance, if we were anywhere near the domain where argument-from-flops-limitation would apply?
Regarding Platt's Law, I sense a fundamental misunderstanding of why a prediction might follow it. It's not a regimented mathematical system. It's something our brains like to do when we think something is coming up soon, but we see no actual plottable path to reach it.
It's the same reason that fusion power is always 30 years off. It's soon enough to imagine it, but long enough away that the intervening time can do all the work of figuring out how.
If no one has any idea *how* to create a human level AI, then no level of computational power will be enough to get there. We could have 10^45 FLOP/S right now and still not have AI, if we don't know what to do with them. Having the computer do 2+2=4 a ridiculous number of times doesn't get us anywhere.
That doesn't mean human level AI cannot actually arrive in 30 years, but it also doesn't say anything really about 10 years or 500 years. The fundamental problem is still *how* to do it. If you get to that point, any engineer can plot out the timeline very accurately and everyone will know it. Until then, you could say about anything you want.
As an experiment, throw billions of dollars into funding something that we know can't exist now, but is maybe theoretically possible. Then ask the people in the field you've created to tell you how long it will take. I bet the answer will be about 30 years, give or take a little bit. They're telling you that they don't know, but had to provide an answer anyway.
I'm still ankle-deep in the email and haven't looked at the comments, but it got me thinking: if we've been making a lot of progress recently by spending more, how much will the effort be stymied by interest rate increases? How about war?
> So maybe instead of having to figure out how to generate a brain per se, you figure out how to generate some short(er) program that can output a brain? But this would be very different from how ML works now. Also, you need to give each short program the chance to unfold into a brain before you can evaluate it, which evolution has time for but we probably don’t.
Doesn't affect any overall conclusions, but there's a decent amount of research that would count as being in this direction I think. Hypernetworks didn't really catch on but the idea was to train a neural network to generate the weights for some other network. There's also metalearning work on learning better optimizers, as well as work on evolving or learning to generate better network architectures.
> But also, there are about 10^15 synapses in the brain, each one spikes about once per second, and a synaptic spike probably does about one FLOP of computation.
This strikes me as very weird - humans can "think" (or at least react) much faster than once a second. If synapses fire only once every second, and synapse firings are somehow the atomic units of computation in the brain, then how can we react, let alone think complex thoughts (which probably require some sequence of calculation steps), orders of magnitude faster than once a second?
Am I missing something? It seems either the metric of synapses is wrong, or the speed.
I'm pretty sure it's my job to point out that the Great Eastern was an outlier and should not really be counted in this stuff. It was Brunel trying to build the biggest ship he technically could, without any real concern over whether it would be economically viable, and the result was a total failure on the economics front. There's a reason it took so long for other ships to reach that size.
I think one reason for Platt's law may be that Fermi estimates (I'd class the Cotra report as basically a Fermi estimate) suffer from a meta-degree of freedom, in that the human estimator can choose how many factors to add into the computation. For instance, in the Drake equation, you can decide to add in a factor for the percentage of planets with Luna-sized moons if you think that having tides is super important for some reason. Or you can add in a factor for the percentage of planets that don't have too much ammonia in their atmosphere, or whatever. Or you can remove factors. The point is that the choice of factors far outweighs the choice of the values of those factors in determining your final estimate.
I don't think that Cotra is deliberately manipulating the estimate by picking and choosing parameters, but it seems clear that early in such an estimation process, if you come up with a result showing that AI will arrive in 10,000 years or 3 months, you're going to modify or abandon the framework you're using because it's clearly producing nonsense. (Not that AI couldn't arrive in 3 months or 10k years - but it doesn't seem like a simple process that predicted either of those numbers could possibly be reliable.)
Or maybe your bounds of plausibility are actually 18 months to 150 years. It's not too hard to see how this could cause a ~30 year estimate to be fairly overdetermined due to unconscious bias toward plausible numbers, and more importantly, toward numbers that _seem like they could plausibly be within the bounds of what a model like yours could accurately predict_.
> The median goes from 2052 to about 2050.
I think this is a mistake; the median of the solid black line goes to around 2067, with "chance by 2100" going down from high-70%s to low-60%s.
1. correction to "Also, most of the genome is coding for weird proteins that stabilize the shape of your kidney tubule or something, why should this matter for intelligence?"
source: "at least 82% of all human genes are expressed in the brain" https://www.sciencedaily.com/releases/2011/04/110412121238.htm
2. The Bitcoin network does the equivalent of 5e23 FLOPS (~5000 integer ops per hash and 2e20 hashes per second; assuming 2 integer ops is worth 1 floating-point op). This is 6 orders of magnitude bigger than that Japanese supercomputer, because specialized ASICs do a lot more operations per watt than general-purpose CPUs. Bitcoin miners are compensated by block rewards at a rate of approximately $375 per second, so that's about 1e21 FLOPS/$. This is 4 orders of magnitude higher than the estimate of 10^17 FLOPS/$. If there were huge economies of scale in producing ASICs specialized for training deep neural nets, we could probably expect the same 1e21 FLOPS/$ at current technology levels. Bitcoin ASICs also seem to still be doubling in efficiency every ~1.5 years.
3. correction: "The median goes from 2052 to about 2050"
The median is where cumulative probability is 0.5, and on your graph it's in 2067. If you mean the median of the subset of worlds where we get AI before 2100, then it's a cumulative probability of 0.3 in 2045.
4. The AI arrival estimate regression line's higher slope than Platt's law seems rational, because from an outside view, the longer it's been since the invention of computers without having AGI yet, the longer we should expect it to take. (But on the inside view, this article is making me shift some probability mass from after-2050 to before-2050)
5. Clarification: "human solar power a few decades ago was several orders of magnitude worse than Nature’s"
Photosynthesis is typically <2% efficient, so you seem to be claiming human solar power in 1990 was <0.002% efficient. But this Department of Energy timeline of solar power claims Bell Labs developed at least a 4% efficient solar panel in 1954:
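The Bitcoin back-of-the-envelope arithmetic in point 2 above can be reproduced directly. Every input figure here (ops per hash, network hash rate, the 2:1 integer-to-float conversion, and the $375/s reward rate) is the commenter's assumption, not a measured value:

```python
ops_per_hash = 5_000    # assumed integer ops per SHA-256 hash
hashes_per_sec = 2e20   # assumed network hash rate
int_ops_per_flop = 2    # assumed: 2 integer ops ~ 1 floating-point op
dollars_per_sec = 375   # assumed block-reward rate in $/s

flops = ops_per_hash * hashes_per_sec / int_ops_per_flop
flops_per_dollar = flops / dollars_per_sec

print(f"{flops:.1e} FLOPS-equivalent")        # ~5e23
print(f"{flops_per_dollar:.1e} FLOPS per $")  # ~1.3e21
```

The ~1.3e21 FLOPS/$ figure is indeed about 4 OOMs above the 10^17 FLOPS/$ estimate, granting all of the above inputs.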
Bell labs was awesome, and my grandpa had some good stories about working there in the 50s and 60s while they invented the transistor and the theoretical basis for the laser. I wish a place like that still existed -- I'd send them an application. I tried cold emailing SpaceX and they ignored me.
Has anyone compared prediction timelines to the estimated lifetime of the predictor?
I have a vague memory of someone looking at this on a different topic, but I couldn't turn it up in a quick search; the idea is that for [transformative complex development] people have a tendency to predict it late in their life, but within a reasonable margin of having not yet died of old age before it happens.
How many researchers and Metaculus predictors will be 70-80 in 2050, and their prediction is, perhaps unconsciously, really a hope to achieve Virtual Heaven?
Alternatively, what else does Platt's "law" apply to? Aren't flying cars always 20-30 years away? Nuclear fusion? Is this just the standard "close, but not too close" timeline for *any* technological prediction?
> “any AI forecast will put strong AI thirty years out from when the forecast is made.”
There's probably a stronger version of this: any technology that seems plausibly doable but we don't quite know how to do, probably seems about 30 years away.
10 years away is the foreseeable timeline of current prototypes and has relatively small error bars. 20 years away is the stuff that's being dreamed up right now and has larger error bars (innovation is risky!). 30 years away consists of things that will be invented by people who grow up in an environment where current prototype tech is normal and the next gen stuff is just on the horizon.
Predicting how these people will think about problems is fundamentally unpredictable. Just think of all the nonsense that was said by computer "experts" in the 60s and 70s prior to the PC.
Admitting the perils of overfit from historical examples, I think there's more to learn from the history of the field of AI research than just FLOPs improvements. Yes, computers beat the best humans in chess, but then later researchers refined the process and discovered that when humans and machines combined their efforts, the computer-human teams beat the computers alone. This seems like a general principle we should apply to our expectations of computer intelligence moving forward.
Calculators are much better than humans, but instead of replacing human calculation ability they enhanced it. Spreadsheets compounded that enhancement. Complex graphing calculators did the same. Sure, calculus was invented (twice!) without them, but the concepts of calculus become accessible to high school students when you include graphing calculators, and statistics become accessible when you load up a spreadsheet and play around with Monte Carlo simulations.
I think what we're missing is how this contributes to Moore's Law of Mad Science. It gives IQ-enhancing tools to the masses. But it's also giving large tech companies tools that might accidentally drive mass movements of hatred, hysteria, and war. And that's just because they don't know what they're doing with it yet. How much worse off will we be when they figure out how to wield The Algorithm effectively? And why are we not talking about THIS massive alignment problem?
What if we destroy ourselves with something else, before we get all the way to AGI? We're already creating intelligence-enhancing tools that regular human operators can't be trusted to handle. Giving god-like power to a machine is certainly terrifying, because I don't know what it might do with that power. But I have some idea what certain people around the world would do with that kind of power, and I'm equally terrified. Especially because those people WANT that power. They're not going to accidentally stumble into it, they're actively trying to cultivate it.
I think I'm in the "this report is garbage and you should ignore it completely" camp (even though I have great respect for Ajeya Cotra and the report is probably quite well done if you apply some measure that ignores the difficulty of the problem). You basically have
- Extreme uncertainty about many aspects within the model, as admitted by Cotra herself
- Strong reasons to suspect that the entire approach is fundamentally flawed
- Massive (I'd argue) potential for other, unknown out-of-model errors
I think I give even less credit to it than Eliezer in that I don't even believe the most conservative number is a valid upper-bound.
SEPARATELY, I do just want to say this somewhere. Eliezer writes this post calling the entire report worthless. The report nonetheless [does very well in the 2020 review](https://www.lesswrong.com/posts/TSaJ9Zcvc3KWh3bjX/voting-results-for-the-2020-review), whose voting phase started after Eliezer's post was published, and it wins the alignment forum component in a landslide. AFAIK I was literally the only person who gave the post a negative score. So can we all take a moment to appreciate how not-cultish the community seems to be?
I'd be curious to see (if anyone has any resources) the historical split and trend over time of compute costs broken down of each of the following three components:
- Chip development costs/FLOP.
- Chip production costs/FLOP.
- Chip running costs/FLOP (probably primarily electrical costs now).
I ask in relation to a concern with extrapolating historical rates of cost declines going forward. It's possible that the components of cost with the most propensity to be reduced will become an increasingly small share of cost over time. As such, the costs that remain may be increasingly difficult to reduce. This is a low-confidence idea as I don't know a ton about chip design, and there are plenty of reasons why extrapolating from the general trend might be right (e.g. perhaps as something becomes an increasing component of cost we spend more effort to reduce it).
That said, it would be interesting to see whether extrapolating future cost reductions from past ones would have performed well in other industries with longer histories. I.e., how have the real costs of steel or electricity gone down, as well as the shares of costs from different inputs?
Totally separately, should we expect the rate of algorithm development and learning to decline as the cost of training single very large models and then evaluating their performance increases drastically? My intuition is that as the cost of iteration and learning increases (and the number of people with access to sufficient resources decreases) we should expect a larger proportion of gains to come from compute advance as opposed to algorithm design, but this something I have close to 0 confidence in.
"For the evil that was growing in the new machines, each hour was longer than all the time before."
My hunch is that Eliezer is right about the problem being dominated by paradigm shifts, but that they usually involve us realising how much more difficult AGI is than we thought, moving AGI another twenty-odd years out from the time of the paradigm shift. A bit like Zeno's paradox, except the tortoise is actually 100 miles away and Achilles just thinks he is about to catch up.
That being said I am bullish on transformative AI coming within the next 20 years, just not AGI.
> and other bit players
I think this should be "big players"
Something to consider is that there isn't yet the concept of agency in AI and I'm not certain anybody knows how to provide it. The tasks current impressive production AI systems do tend to be of the "classify" or "generate more like this" categories. Throwing more compute/memory/data at these systems might get us from "that's a picture of a goldfish" to "that's a picture of a 5-month old goldfish who's hungry", or from what GPT-3 does to something that doesn't sound like the ravings of an academic with a new-onset psychiatric condition.
None of these have the concept of "want".
Thanks for criticizing Ajeya's analysis. Insofar as you summarized it correctly, I was furrowing my brow and shaking my head at several crazy-sounding assumptions, for reasons that you and Eliezer basically stated.
My model: current AIs cannot scale up to be AGIs, just as bicycles cannot scale up to be trucks. (GPT2 is a [pro] bicycle; GPT3 is a superjumbo bicycle.) We're missing multiple key pieces, and we don't know what they are. Therefore we cannot predict when exactly AGIs will be discovered, though "this century" is very plausible. The task of estimating when AGI arrives is primarily a task of estimating how many pieces will be discovered before AGI is possible, and how long it will take to find the final piece. The number of pieces is not merely unpredictable but also variable, i.e. there are many ways to build AGIs, and each way requires a different set of major pieces, and each set has its own size.
Also: State-of-the-art AGI is never going to be "as smart as a human". Like a self-driving car or an AlphaStar, AIs that come before the first AGI will be dramatically faster and better than humans in their areas of strength, and comically bad or useless in their areas of weakness.
At some point, there will be some as-yet unknown innovation that turns an ordinary AI into an AGI. After maybe 30,000 kWh of training (give or take an OOM or two), it could have intelligence comparable to a human *if it's underpowered*: perhaps it's trained on a small supercomputer for a while and then transitioned to a high-end GPU before we start testing its intellect. Still, it will far outpace humans in some ways and be moronic in other ways, because in mind-design-space it will live somewhere else than we do (plus, its early life experience will be very different). Predictably, it will have characteristics of a computer, so:
- it won't need sleep, rest or downtime (though pausing for a pruning process could help). In the long run this is a big deal, even if processing power isn't scaled up.
- it will do pattern-matching faster than humans, but not necessarily as well
- it will have a long-term memory that remembers the things it is programmed to remember very accurately, while, in some cases, completely forgetting things it is not programmed to remember
- if it saves or learns something, it does so effortlessly, which should let it do things that humans find virtually impossible (e.g. learning all human languages, and having vast and diverse knowledge). Note: unlike in a human, in a computer, "saving" and "learning" information are two fundamentally different things; well, humans don't really do "saving".
- it will lack humanlike emotions, have limited social intelligence, and will predict human behavior even less reliably than we do, though in time, with learning, it'll improve
- Edit: for all the ink spilled on the illegibility of neural networks, AGIs are, for several reasons, much more legible than human neural brains, and therefore much easier to improve.
- it will have the ability to process inputs quickly and with low latency, and more crucially, produce outputs very rapidly and with a lower noise/error rate than a human can. This latter ability will make it possible (if its programmers allow) for it to write software that runs on the same machine, and to communicate with that software much faster than any human can communicate with a computer. If it's smart enough to write software, it will use this ability to augment its mental abilities in ways that can make it eventually superhuman in some ways, without increasing available computational power.
It's easy to think of examples of that last point, just by thinking about games. For instance, those games where you are given six letters and have to spell out as many words as you can think of? An AGI can simply write a program *within its own mind* to find all the answers, allowing it to quickly surpass human performance. Or that game where you repeatedly match three gems? Probably the AGI's neural net architecture can do that pretty well, but again it could write a program to do even better. Sudoku? No problem.
So at this point, the AGI should be able to outclass humans in various solitaire games, but might have limited talent in real-world tasks like fixing cars, or discovering the Pythagorean theorem, or even reading comprehension. But we can hugely increase its intelligence simply by giving it more compute, at which point it can quickly become smarter than every human in every way, and the AGI alignment problem potentially becomes important.
If we're lucky, the first AGI will do a relatively poor job at certain tasks, such as abstract reasoning on mental models, concept compression, choosing priorities of mental processes, synthesizing its objective function (or motivational system) with reality, and looking at problems from a variety of perspectives / at a variety of levels of abstraction. Handicaps in such areas could make it a poor engineer/scientist, which is good in the sense that it's safer. Such an AGI would be likely to have difficulty doing risky actions like improving its own design, or killing everyone, even if it has a whole datacenter-worth of compute.
If we're not lucky, we get the kind of AGI Eliezer worries about. I think we're going to be lucky, because Reality Has A Surprising Amount Of Detail. But the possibility of being unlucky has a high enough probability (1%?) that AI safety/alignment research should be well-funded. Edit: Plus, in the 99% case, I would raise my probability estimate of near-term catastrophe immediately after the first AGI appears, so it's good to get started on safety work early.
Funny thing is, I'm no AGI expert, just an aspiring-rationalist software developer. Yet I feel mysteriously confident that some of these AI experts are off the mark in important ways. The bicycle/truck distinction is one way.
Another way is that I think the trend toward ever more expensive, supercomputer-scale models is likely to reverse very soon, especially for those who make real progress in the field. Better compute enabled recent leaps in performance, but now that it has been proven that AIs can beat any human at Go and Starcraft, the prestige has been harvested, and I don't see much reason to build even more expensive models. It's a bit like how we moved from 8-bit CPUs all the way up to 64-bit CPUs, and then stopped adding more bits (apart from e.g. SIMD) because there just wasn't enough benefit. Cheaper models, by contrast, enable a lot more experimentation and research by non-elites. It might well be that teenage tinkerers (at home, with monster gaming rigs) discover key pieces of the first AGI.
I like what the report is saying (not that I've read it, just going off Scott's retelling of its main points), and it's reassuring me that the people working on it are competent and take every currently recognizable factor of difficulty into account.
I nevertheless think it's erring in the exact direction all earlier predictions erred in, which is the exact opposite of where Eliezer thinks it's erring. I.e., they understand and price in the currently known obstacles and challenges on the road to AGI; they do not, because they cannot, price in the as-yet-unknown obstacles that will only make themselves apparent once we clear the currently pertinent ones. E.g., you can only assume power consumption is the relevant factor if you completely disregard the difficulty and complexity of translating that power into (more relevant) computational resources. Then, with experience, you update to thinking in terms of computational resources, until you get enough of them to finally start working on translating them into something even more relevant, at which point you update to thinking in whatever unit measures that even more relevant thing. (Or you don't update, and hope the newly discovered issues will just solve themselves, but there's little reason to listen to you until you actually provide a solution to them.)
(Bonus hot take: This explains the constant 30 years horizon, it's some stable limit of human imagination vis-a-vis the speed of technological progress. We can only start to perceive new obstacles when we're 30 years away from overcoming them.)
We don't know whether we'll encounter any new obstacles or, if so, what they will be, but allow me to propose one obvious candidate: environment complexity.
The entire discussion, as presented in the article, is based around advances in games like chess (8x8 board and simple consistent rules), Go (19x19 board and simple consistent rules), or Starcraft (much more complex, but still a simple, granular, 2D plane with simple consistent rules). (I'm ignoring GPT and the like, because they simply don't yet reliably perform human tasks.) None of those tell us much about performance in the real world (an infinite universe with complex, unknown, and ever-changing rules). Assuming computational resources are the only relevant factor may be (and, I believe, is) completely ignoring the problem of the data necessary to train an AI that is capable of interacting with reality as well as a human does. The relevant natural-science analogy may yet turn out to be not "10^41 FLOP/S" but "a billion years of real-time training experience". We will, of course, be able to bring that number down significantly, but to 30 years? I'm extremely skeptical.
(Bonus hot take: The first AGI takeover will literally be thwarted by it not understanding the power of love, or friendship, or some equally cheesy miscalculation about human behavior, which it will have failed to adequately grasp.)
tl;dr: The report is by necessity overly optimistic (pessimistic if you think AGI means the end of humanity), but constitutes a useful lower bound. Eliezer is not even wrong.
Small mistake: "The upper bound is one hundred quadrillion times the upper bound." should be "The upper bound is one hundred quadrillion times the lower bound."
Strong agree that it's not very decision-relevant whether we say AGI will come in 10 vs 30 vs 50 years if we realistically have significant probability weight on all three. Well, at least not for technical research. Granted, I wrote a response-to-Ajeya's-report myself ( https://www.lesswrong.com/posts/W6wBmQheDiFmfJqZy/brain-inspired-agi-and-the-lifetime-anchor ), but it was mainly motivated by questions other than AGI arrival date per se. Then my more recent timelines discussion ( https://www.lesswrong.com/posts/hE56gYi5d68uux9oM/intro-to-brain-like-agi-safety-3-two-subsystems-learning-and#3_8_Timelines_to_brain_like_AGI_part_3_of_3__scaling__debugging__training__etc_ ) was mainly intended as an argument against the "no AGI for 100 years" people. I suspect that OpenPhil is also interested in assessing the "no AGI for 100 years" possibility, and in governance / policy questions where *maybe* the exact degree of credence on 10 vs 30 vs 50 years is an important input; I wouldn't know.
These projections ignore the "ecology" (or "network" if you prefer). Humans individually aren't very smart, their effective intelligence resides in their collective activity and their (mostly inherited) collective knowledge.
If we take this fact seriously we will be thinking about issues that aren't discussed by this report, Yudkowsky, etc. For example:
- What level of compute would it take to replicate the current network of AI researchers plus their computing environment? That's what would be required to make a self-improving system that's better than our AI research network.
- What would "alignment" mean for a network of actors? What difference does it make if the actors include humans as well as machines?
- Individual actors in a network are independently motivated. They are almost certainly not totally aligned with each other, and very possibly have strongly competitive motivations. How does this change our scenarios? What network & alignment structures produce better or worse results from our point of view?
- A network of actors has a very large surface area compared to a single actor. Individual actors are embedded in an environment which is mostly outside the network and have many dependencies on that environment -- for electric power, security, resources, funding, etc. How will this affect the evolution and distribution of likely behaviors of the network?
I hope the difference in types of questions is obvious.
- But AlphaZero! Reply: Individuals aren't very intelligent, and chess is a game played by individuals. AlphaZero can beat individual humans; not a big deal.
- But AlphaFold! The success of AlphaFold depends on knowledge built up by the network. AlphaFold can utilize this knowledge better than any individual human, again no big deal. AlphaFold can't independently produce new knowledge of the type it needs to improve. However, AlphaFold *does* increase the productivity of biochemical research; that will greatly increase the rate of progress of the network, and will feed back to some degree to AlphaFold.
- But GPT-3! This is a great example of consolidating and using collective knowledge from the corpus -- and it helps us understand how much knowledge is embedded implicitly in the corpus. On the other hand, we haven't seen any AI that generates a significant net increase in the collective knowledge of our corpus. This will come, but AIs will only increase our collective knowledge incrementally to begin with.
- But FOOM! This would require replicating the whole research endeavor around AI and probably a lot more -- maybe much of the culture and practice of math, which is very much a collective endeavor, for example. Not going to happen quickly just because one machine gets a few times more intelligent than a single AI researcher.
I wonder to what extent the curves in the Compute vs. Elo graph flatten out to the right because Elo ratings inherently saturate: once you beat every available opponent nearly every game, extra skill stops showing up in the rating. Or conversely, to what extent the flattening indicates limits to this type of intelligence.
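For what it's worth, the standard Elo expected-score formula guarantees some flattening on its own: once an engine is rated far above its opponents, each extra rating point buys almost no additional win probability. A quick sketch (standard formula; the ratings are illustrative numbers only):

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# Win probability saturates as the rating gap grows:
for gap in (0, 200, 400, 800, 1600):
    print(gap, round(expected_score(1500 + gap, 1500), 4))
# 0    0.5
# 200  0.7597
# 400  0.9091
# 800  0.9901
# 1600 0.9999
```

So past a certain point, large real improvements in playing strength can only nudge measured Elo slightly, which would flatten any compute-vs-Elo curve regardless of what the underlying intelligence is doing.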
I'm trying to read that linked article by Eliezer now and holy crap, he could really use an editor that would tell him to cut out half of the text and maybe stop giving a comprehensive, wordy introduction to Eliezerism at the beginning of every text he writes.
A thought: Platt's Law is a specific case of Hofstadter's Law (which is also about AI, actually): It always takes longer than you think, even when you take into account Hofstadter's Law. Which fits with the "estimates recede at the rate of roughly one year per year". You make a guess, take into account Hofstadter's Law, then a year goes by, and you find yourself not really any closer, rinse and repeat.
Another thought: Platt's Law is about the size of "a generation", so Platt's Law-like estimates could be seen as another way of looking around and saying "it won't be THIS generation that figures it out".
Final thought: it seems to me that if you're going to take the "biological answer" approach, it would make more sense to look at how evolution got us here vs. how we're working on getting AI to a human level of performance. How many iterations and how "powerful" was each iteration for evolution to arrive at humans? How many iterations and how "powerful" an iteration has it taken for us to get from an AI as smart as an algae to whatever we have now.
Isn't 10% missing from her weighting of the 6 models?
Not sure it makes much of a difference though
The implicit assumption is that we are not far from being able to reverse engineer the wiring of someone's brain, or perhaps some reference brain, and simulate it as a neural network and get a working human-like intelligence. That's not a totally unreasonable idea, but we are nowhere close to being able to figure out all the synaptic connections in someone's brain. Really. It's not like sequencing the human genome or understanding a human cell. You can grab some DNA or a few cells from someone pretty easily, but just try to follow a brain's wiring.
No, we are not going to be able to use ML to figure out the wiring based on some training set. We can probably get such a system to do something interesting, but it isn't going to be thinking like a human. ML algorithms are just not robust enough. Visual recognition algorithms fall apart if you change a handful of pixels, and even things like AlphaFold collapse if you vary the amino acid sequence. Sure, people fall apart too, but an AI that can't tell a hat from a car if a couple of visual cells produce bogus outputs isn't behaving like a human.
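The brittleness point can be shown with a deliberately tiny toy, not the real vision attacks: a linear classifier where no single input change is large, but a coordinated nudge along the weight signs flips the prediction. All the numbers here are invented for illustration:

```python
# Toy linear "classifier": predicts +1 if the weighted sum is positive.
w = [0.9, -0.7, 0.8, -0.6, 0.5]   # hypothetical model weights
x = [1.0,  1.0, 1.0,  1.0, 1.0]   # an input it classifies as +1

def predict(x):
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if score > 0 else -1

# Nudge each coordinate by just 0.3, each in the direction that hurts
# the score (the core idea behind FGSM-style adversarial examples).
# No single change is large, but the effects add up across coordinates:
sign = lambda v: 1.0 if v > 0 else -1.0
x_adv = [xi - 0.3 * sign(wi) for wi, xi in zip(w, x)]

print(predict(x))      # 1
print(predict(x_adv))  # -1  -- small per-pixel changes, flipped label
```

Real image classifiers have millions of input dimensions, so the per-pixel perturbation needed to flip a label can be far smaller still, which is the "change a handful of pixels" failure mode above.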
Then there's all the other stuff the brain does, and it's not just the nerve cells. Glial cells, including astrocytes, do things that we are just getting a glimpse of. And it's not like brains don't rewire themselves now and then. There's Hebb's rule -- neurons that fire together, wire together -- and we barely have a clue how that works at a functional level, so good luck simulating it.
Closer to home, the brain is full of structures that embed assumptions about how an animal needs to process information to survive and reproduce. The thing is that we don't know what all of these structures are or what they do. Useful ML algorithms also embed assumptions. Rodney Brooks pointed out that the convolution algorithms used in ML object-identification systems embed assumptions about size and location invariance. ML models don't learn that from training sets; people write code that moves a recognition window around the image and varies its size. (Brooks has been a leader in the AI/ML world since the 1980s, and his rodneybrooks.com blog is full of good, informed analysis of the field and its capabilities.)
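Brooks' point about built-in location invariance is easy to see in a minimal 1-D sketch of convolution: the same kernel is slid over every position, so a shifted input pattern produces the same response, just shifted. Nothing about that behavior is learned; it's hard-coded in the sliding loop:

```python
def conv1d(signal, kernel):
    """Valid-mode 1-D correlation: slide one shared kernel over every position."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]

kernel = [1, 2, 1]             # one detector, reused at every location
a = [0, 0, 1, 0, 0, 0, 0]      # a "feature" at position 2
b = [0, 0, 0, 0, 1, 0, 0]      # the same feature, shifted right by 2

print(conv1d(a, kernel))  # [1, 2, 1, 0, 0]
print(conv1d(b, kernel))  # [0, 0, 1, 2, 1]  -- identical response, shifted
```

The invariance comes from the architecture (weight sharing across positions), not from anything in the training data, which is exactly the kind of baked-in assumption the paragraph above describes.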
Maybe I'm too cynical, but I'll go with Brooks' NIML, not in my lifetime.