> “Oh, thank God! I thought you’d said five million years!”
That one has always tickled me too.
I thought of it when a debate raged here about saving humanity by colonizing other star systems. I’d mentioned the ‘No reversing entropy’ thing and the response was: “We’re just talking about the next billion years!”
Another minor error: I believe Carl Shulman is not 'independent' but still employed by FHI (albeit living in the Bay Area and collaborating heavily with OP etc).
Another minor error: Transformative AI is "AI that precipitates a transition comparable to (or more significant than) the agricultural or industrial revolution".
This is easy to fix Scott, and about as long as your original description + its witty reply.
Without explicitly confirming at the source, it appears to be a graph of chess program performance per computational power, for multiple models over time.
The Y-axis is chess performance measured using the Elo system, which is a way of ranking performers by a relative standard. Beginner humans are <1000, a serious enthusiast might be >1500, grandmaster is ~2500, and Magnus Carlsen peaked at 2882.
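To make "relative standard" concrete, here is a minimal sketch of the standard Elo expected-score formula. The function name is mine and nothing here is taken from the graph's source. The point is that only the rating gap enters the formula, not the absolute ratings.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score for player A against player B under the Elo model.

    A win counts as 1, a draw as 0.5, a loss as 0. Only the difference
    between the two ratings matters, not their absolute level.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Equal ratings give an expected score of exactly 0.5.
even_match = expected_score(1500, 1500)

# A 100-point edge is worth the same expected score at any level.
edge_at_2000 = expected_score(2100, 2000)
edge_at_3000 = expected_score(3100, 3000)
```

Under this model a 100-point edge means the same expected score at 2000 as at 3000, which is part of why it is awkward to say which of the two improvements is "bigger".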
The X-axis is how much compute ("thinking time") each model was allowed per move. This has to be normalized to a specific model for comparisons to be meaningful (SF13-NNUE here) and I'm just going to trust it was done properly, but it looks ok.
The multiple lines are each model's performance at a given level of compute. There are three key takeaways here: 1) chess engines are getting more effective over time even when allowed the same level of compute, 2) each model's performance tends to "level out" at some level of allocated resources, and 3) a lot of the improved performance of new models comes from being able to usefully utilize additional resources.
That's a big deal, because if compute keeps getting cheaper but the algorithms can't really leverage it, you haven't done much. But if ML folks look at the resources thrown at GPT-3 and say "the curve isn't bending!" it could be a sign that we can still get meaningful performance increases from moar power.
Scott seems to take from this graph that it supports the "algorithms have a range of compute where they're useful" thesis. But I see it as opposing that.
First, the most modern algorithms are doing much better than the older ones *at low compute regimes* so the idea that we nearly immediately discover the best algorithms for a given compute regime once we're there appears to be false - at least we didn't manage to do that back in 1995.
Second, the regimes where increased computation gives a benefit to these algorithms seem pretty stable. It's just that newer algorithms are across-the-board better. I guess it's hard to compare a 100 ELO increase at 2000 ELO to a 100 ELO increase at 3000 ELO, but I don't really see any evidence in the plot that newer algorithms *scale* better with more compute. If anything, it's that they scale better at low compute regimes, which lends itself more to a Yudkowskian conclusion.
I agree with you. If it were really the case that "once the compute is ready, the paradigm will appear", I would expect to see all of the curves on this graph intersect each other, with each engine having a small window for which it dominates the ELO roughly corresponding to the power of computers at the time it was made.
I'd expect that the curves for, say, image recognition tasks, *would* intersect, particularly if the training compute is factored in.
But the important part this graph shows is: the difference between algorithms isn't as large as the difference between compute (although the relative nature of ELO makes this less obvious).
I think those algorithms have training baked in, so a modern trained net does really well even with low compute (factor of 1000 from hardware X software), but the limit on how good an algo you could train was a lot lower in the past (factor of 50 from software alone).
> But if ML folks look at the resources thrown at GPT-3 and say "the curve isn't bending!" it could be a sign that we can still get meaningful performance increases from moar power.
I don't follow the space closely, but I think this is exactly what ML folks are saying about GPT-3.
When we say we want AIs what we are really saying is we want an AI that is better than humans not just an AI. But there are geniuses being born every day.
But what we really want is to understand consciousness and to solve particular problems faster than we can at the moment.
We wanted to fly like the birds but we really did not invent an artificial bird. We wanted to work as hard as a horse, but did not invent an artificial horse.
The question of consciousness is a legitimate and important question.
I think this is an important point. Doing basic research in AI as a way to understand NI makes enormous sense: we understand almost nothing about how our mind works, and if we understood much more we could (one hopes) make enormous strides in education, sociology, functional political institutions, the treatment of mental illness, and the improvement of life for people with mental disabilities (through trauma, birth, or age). We could also optimize the experience and contributions of people who are unusually intelligent, and maybe figure out how to boost our own intelligence, via training or genetic manipulation. Exceedingly valuable stuff.
But as a technological end goal, an actual deployed mass-manufactured tool, it seems highly dubious. There are only three cases to consider:
(1) We can build a general AI that is like us, but much dumber. Why bother? (There are of course many roles for special-purpose AIs that can do certain tasks way better than we can, but don't have our general-purpose thinking abilities.)
(2) We can build a general AI that is like us, and about as smart. Also seems mostly pointless, unless we can do it far cheaper than we can make new people, and unless it is so psychologically different it doesn't mind being a slave.
(3) We can build a general AI that is much smarter than us. This seems a priori unlikely, in the sense that if we understood intelligence sufficiently well to do this, why not just increase our own intelligence first? Got to be easier, since all you need to do is tweak the DNA appropriately. And even if we could build one, why would we want to either enslave a hyperintelligent being or become its slaves, or pets? Even a bad guy wouldn't do that, since a decent working definition of "bad guy" is "antisocial who doesn't want to recognize any authority" and building a superintelligent machine to whom to submit is rather the opposite of being a pirate/gangster boss/Evil Overlord.
I realize plenty of people believe there is case (2b) we can build an AI that is about as smart as us, and then *it* can rebuild itself (or build another AI) that is way smarter than us, but I don't believe in this bootstrapping theory at all, for the same reason I find (3) dubious a priori. The idea that you can build a very complex machine without any good idea of how it works seems silly.
>The idea that you can build a very complex machine without any good idea of how it works seems silly.
But that's essentially what ML does. If there was a good idea of how a solution to a given problem works, it would be implemented via traditional software development instead.
I disagree. I understand very well what a ML program does. I may not have all the details at my fingertips, but that is just as meaningless as the fact that I don't know where each molecule goes when gasoline combusts with oxygen. Sure, there's a lot of weird ricochets and nanometer-scale fluctuations that go on about which I might not know, absent enormous time and wonderful microscopes -- but saying I don't know the details is nowhere near saying I don't know what's going on. I know in principle.
Same with ML. I may not know what this or that node weight is, and to figure out why it is what it is, i.e. trace it back to some pattern in the training data, would take enormous time and painstaking not to say painful attention to itsy bitsy detail, but that is a long way from saying I don't know what it's doing. I do in principle.
I'll add this dichotomy has existed in other areas of science and technology for much longer, and it doesn't bother us. Why does a particular chemical reaction happen in the pathway it does, exactly? We can calculate that from first principles, with a big enough computer to solve a staggeringly huge quantum chemistry problem. But if you wanted to trace back this wiggle in the preferred trajectory to some complex web of electromagnetic forces between electrons, it would take enormous time and devotion to detail. So we don't bother, because this detail isn't very important. We understand the principles by which quantum mechanics determines the reaction path, and we can build a machine that finds that path by doing trillions of calculations which we do not care to follow, and maybe the path is not what raw intuition suggests (which is why we do the calculation at all, usually), but at no point here do we say we do not *understand* why the Schroedinger equation is causing this H atom to move this way instead of that. I don't really see why we would attribute some greater level of mystic magic to a neural network pattern-recognition algorithm.
>...but that is a long way from saying I don't know what it's doing. I do in principle.
Knowing in principle seems like a much lower bar than having a good idea how something works.
>I don't really see why we would attribute some greater level of mystic magic to a neural network pattern-recognition algorithm.
Intelligence is an emergent phenomenon (cf., evolution producing hominid minds), so what magic do you see being attributed beyond knowledge of how to build increasingly complex pattern-recognition algorithms?
That's not what ML does. ELI5, ML is about as well understood as the visual cortex, it's built like a visual cortex, and it solves visual cortex style problems.
People act like just because each ML model is too large and messy to explain, all of ML is a black box. It's not. Each model of most model classes (deep learning, CNN, RNN, gbdt, whatever you want) is just a layered or otherwise structured series of simple pattern recognizers, each recognizer is allowed to float towards whatever "works" for the problem at hand, and all the recognizers are allowed to influence each other in a mathematically stable (ie convergent) format.
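As a toy illustration of one "simple pattern recognizer" floating toward whatever works, here is a minimal sketch: a single logistic unit trained by plain stochastic gradient descent to recognize logical OR. The task, function names, learning rate, and epoch count are all my own assumptions for the example. Real models stack many such units in layers, which this does not attempt to show.

```python
import math


def sigmoid(z: float) -> float:
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_recognizer(samples, epochs=2000, learning_rate=0.5):
    """Fit one logistic unit with stochastic gradient descent on log loss."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            prediction = sigmoid(weights[0] * x1 + weights[1] * x2 + bias)
            # Gradient of the log loss with respect to the pre-activation.
            error = prediction - target
            weights[0] -= learning_rate * error * x1
            weights[1] -= learning_rate * error * x2
            bias -= learning_rate * error
    return weights, bias


# Toy pattern (an assumption for illustration): output 1 if either input is 1.
OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

weights, bias = train_recognizer(OR_DATA)
predictions = [
    round(sigmoid(weights[0] * x1 + weights[1] * x2 + bias))
    for (x1, x2), _ in OR_DATA
]
```

Nobody chose the final weights by hand, yet the update rule is a few lines long and the problem is convex, so the procedure is easy to reason about. That is the sense of "understood in principle" at issue here.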
End result of which is you get something that works like a visual cortex: it has no agency and precious little capacity for transfer learning, but has climbed the hill to solve that one problem really well.
This is a very well understood space. It's just poorly explained to the general public.
My initial objection to Carl was based on a difference of opinion about what constitutes a "good idea of how it works". You appear to share his less-restrictive understanding of the phrase.
N.B., I am a working data scientist who was hand coding CV convolutions two decades ago.
> This seems a priori unlikely, in the sense that if we understood intelligence sufficiently well to do this, why not just increase our own intelligence first? Got to be easier, since all you need to do is tweak the DNA appropriately.
I think this is mistaken. For reasons that Scott has talked about elsewhere, the fact that we aren't *already* smarter suggests that we're near a local optimum for our physiology / brain architecture / etc, or evolution would have made it happen; eg it may be that a simple tweak to increase our intelligence would result in too much mental illness. Finding ways to tweak humans to be significantly smarter without unacceptable tradeoffs may be extremely difficult for that reason.
On the other hand, I see no a priori reason that that local optimum is likely to be globally optimal. So conditional on building GAI at all, I see no particular reason to expect a specific barrier to increasing past human-level intelligence.
Oh I wouldn't disagree that it's likely to be hard to increase human intelligence. Whether what we mean by "intelligence" -- usually, purposeful conscious reasoning and imagination -- has been optimized by Nature is an interesting and unsolved question, inasmuch as we don't know whether that kind of intelligence is always a survival advantage. There are also some fairly trivial reasons why Nature may not have done as much as can be done, e.g. the necessity for having your head fit through a vagina during birth.
But yeah I'd take a guess that it would be very hard. I only said that hard as it is, building a brand-spanking new type of intelligence, a whole new paradigm, is likely to be much harder.
Anyway, if we take a step back, the idea that improving the performance of an engine that now exists is a priori less likely than inventing a whole new type of engine is logically incoherent.
That would be convincing if anyone had ever written a computer code that had even the tiniest bit of awareness or original thought, no matter how slow, halting, or restricted in its field of competence. I would say that the idea that a computer can be programmed *at all* to have original thought (or awareness) is sheer speculation, based on a loose analogy between what a computer does and what a brain does, and fueled dangerously by a lot of metaphorical thinking and animism (the same kind that causes humans to invent local conscious-thinking gods to explain why it rains when it does, or eclipses, or why my car keys are always missing when I'm in a hurry).
Deep blue can produce chess moves that are good, and aren't copies of moves humans made. GPT3 can come up with new and semi-sensible text.
Can you give a clear procedure for measuring "Original thought"?
Before deep blue, people were arguing that computers couldn't play chess because it required too much "creative decision making" or whatever.
I think you are using "Original thought" as a label for anything that computers can't do yet.
You have a long list of things humans can do. When you see a simple dumb algorithm that can play chess, you realize chess doesn't require original thought, just following a simpleish program very fast. Then GPT3 writes kind of ok poetry and you realize that writing ok poetry (given lots of examples) doesn't require original thought.
I think there is a simplish program for everything humans do, we just haven't found it yet. I think you think there is some magic original thought stuff that only humans have, and also a long list of tasks like chess, go, image recognition etc that we can do with the right algorithm.
"Because the change is trivial in computer code, but hard in DNA."
In any large software shop which relies on ML to solve "rubber hits the road" problems, not toy problems, it takes literally dozens of highly paid full time staff to keep the given ML from falling over on its head every *week* as the staff either build new models or coddle old ones in an attempt to keep pace with ever changing reality.
And the work is voodoo, full of essentially bad software practices and contentious statistical arguments and unstable code changes.
Large scale success with ML is about as far from "the change is trivial in computer code" as it is possible to be in the field of computer science.
I thought about this specifically when reading that we could spend quadrillions of dollars to create a supercomputer capable of making a single human level AI.
To be fair, once made that AI could be run on many different computers (which would each be far less expensive), whereas we don't have a copy-paste function for people.
But more importantly, that way of thinking is wrong (edit: I mean the quadrillion dollars thing) and I predict humanity is about to reduce per-model training budgets at the high end. Though wealthy groups' budgets will jump temporarily whenever they suspect they might have invented AGI, or something with commercialization potential.
I mean that a typical wealthy AI group will reduce the total amount it actually spends on models costing over ~$500,000 each, unless they suspect they might have invented AGI, or something with commercialization potential, and even in those cases they probably won't spend much more than before on a single model (but if they do, I'm pretty sure they won't get a superintelligent AGI out of it). (edit: raised threshold 100K=>500K. also, I guess the superjumbo model fad might have a year or two left in it, but I bet it'll blow over soon)
I obviously cannot speak to why AI scares Scott, but there are some theoretical and practical reasons to consider superhuman AI a highly-scary thing should it come into existence.
Theoretical:
Many natural dangers that threaten humans do not threaten humanity, because humanity is widely dispersed and highly adaptive. Yellowstone going off or another Chicxulub impactor striking the Earth would be bad, but these are not serious X-risks because humanity inhabits six continents (protecting us from local effects), has last-resort bunkers in many places (enabling resilience against temporary effects) and can adapt its plans (e.g. farming with crops bred for colder/warmer climates).
These measures don't work, however, against other intelligent creatures; there is no foolproof plan to defeat an opponent with similar-or-greater intelligence and similar-or-greater resources. For the last hundred thousand years or so, this category has been empty save for other humans and as such humanity's survival has not been threatened (the Nazis were an existential threat to Jews, but they were not an existential threat to humanity because they themselves were human). AGI, however, is by definition an intelligent agent that is not human, which makes human extinction plausible (other "force majeure" X-risks include alien attack and divine intervention).
Additionally, many X-risks can be empirically determined to be incredibly unlikely by examining history and prehistory. An impact of the scale of that which created Luna would still be enough to kill off humanity, but we can observe that these don't happen often and there is no particular reason for the chance to increase right now. This one even applies to alien attack and divine intervention, since presumably these entities would have had the ability to destroy us since prehistory and have reliably chosen not to (as Scott pointed out in Don't Fear the Filter back on SSC, if you think humans are newly a threat to interstellar aliens or to God, you are underestimating interstellar aliens and God). But it doesn't apply to AI - or at least, not to human-generated AI (alien-built AI is not much different from aliens in this analysis). Humans haven't built (human-level or superhuman) AI before, so we don't have a track record of safety.
So the two basic heuristics that rule most things out as likely X-risks don't work on AI. This doesn't prove that AI *will* wipe out humanity, but it's certainly worrying.
Practical:
- AI centralises power (particularly when combined with robotics). Joe Biden can't kill all African-Americans (even if he wanted to, which he presumably does not), because he can't kill them all himself and if he told other people to do it they'd stop listening to him. Kim Jong-un can kill a lot of his people, because the norms are more permissive to him doing so, but he still can't literally depopulate North Korea because he still needs other people to follow his orders and most won't follow obviously-self-destructive orders. But if Joe Biden or Kim Jong-un had a robot military, they could do it. No monarch has ever had the kind of power over their nation that an AI-controlled robot army can give. Some people can be trusted with that kind of power; most can't.
- Neural-net architecture is very difficult to interrogate. It's hard enough to tell if explicit code is evil or not, but neural nets are almost completely opaque - the whole point is that they work without us needing to know *how* they work. Humans can read each other reasonably well despite this because evolution has trained us quite specifically to read other humans; that training is at best useless and at worst counterproductive when trying to read a potentially-deceptive AI. So there's no way to know whether a neural-net AI can be trusted with power either; it's basically a matter of plug-and-pray (you could, of course, train an AI to interrogate other AIs, but the interrogating AI itself could be lying to you).
Very helpful to my understanding why AI is a unique threat. Thanks for this. You explain it very well. Although now when I see video clips of kids in robot competitions, my admiration will be tinged with a touch of foreboding.
Don't be tinged by that foreboding. If you read a bit about superintelligence it becomes clear that it's not going to come from any vector that's typically imagined (terminator or black mirror style robots).
There are plenty of ideas about more realistic ways an AGI escapes confinement and gains access to the real world. A couple of interesting ones I read were it solving the protein folding problem, paying or blackmailing someone over the internet to mix the necessary chemicals, and it creates nanomachines capable of anything. Another was tricking a computer scientist with a perfect woman on a VR headset.
In fact it probably won't be any of these things, after all, it's a super intelligence: whatever it creates to pursue its goals will be so beyond our understanding that it's meaningless to predict what it will do other than as a bit of fun or creative writing exercise.
Let me know if you want links to those stories/ideas, I should have them somewhere. Superintelligence by Nick Bostrom is a good read, although quite heavy. I prefer Scott's stuff haha.
The hypothetical "rogue superintelligent AGI with no resources is out to kill everyone, what does it do" might not be likely to go that way, but that's hardly the only possibility for "AI causes problems". Remote-control killer robots are already a thing (and quite an effective thing), militaries have large budgets, and plugging an AI into a swarm of killbots does seem like an obvious way to improve their coordination. PERIMETR/Dead Hand was also an actual thing for a while.
> solving the protein folding problem, paying or blackmailing someone over the internet to mix the necessary chemicals, and it creates nanomachines capable of anything
Arguably the assumption that "nanomachines capable of anything" can even exist is a big one. After all, in the Smalley - Drexler debate Smalley was fundamentally right and Drexlerian nanotech is not really compatible with known physics and chemistry.
1) I mean, yes, people get annoyed when you explain in as many words that you are strawmanning them in order to make people ignore them.
2) There are really two factions to the AI alarmists (NB: I don't intend negative connotations there, I just mean "people who are alarmed and want others to be alarmed") - the ones who want to "get there first and do it right" and the ones who want to shut down the whole field by force. You have something of a case against the former but haven't really devoted any time to the latter.
Generally I think that the paradigm shifts argument is convincing, and so all this business of trying to estimate when we will have a certain number of FLOPS available is a bit like trying to estimate when fusion will become widely available by trying to estimate when we will have the technology to manufacture the magnets at scale.
However, I disagree with Eliezer that this implies shorter timelines than you get from raw FLOPS calculations - I think it implies longer ones, so would be happy to call the Cotra report's estimate a lower bound.
>she says that DeepMind’s Starcraft engine has about as much inferential compute as a honeybee and seems about equally subjectively impressive. I have no idea what this means. Impressive at what? Winning multiplayer online games? Stinging people?
Yes, you should care. The difference between 50% by 2030 and 50% by 2050 matters to most people, I think. In a lot of little ways. (And for some people in some big ways.)
For those trying to avert catastrophe, money isn't scarce, but researcher time/attention/priorities is. Even in my own special niche there are way too many projects to do and not enough time. I have to choose what to work on and credences about timelines make a difference. (Partly directly, and partly indirectly by influencing credences about takeoff speeds, what AI paradigm is likely to be the relevant one to try to align, etc.)
EDIT: Example of a "little" way: If my timelines went back up to 30 years, I'd have another child. If they had been at 10 years three years ago, I would currently be childless.
Why does your child-having depend on your timelines? I'm considering a similar question now and was figuring that if bringing a child into the world is good, it will be half as good if the kid lives 5 years as if they live 10, but at no point does it become bad.
This would be different if I thought I had an important role in aligning AI that having a child would distract me from; maybe that's our crux?
I myself am pro bringing in another person to fight the good fight. If it were me being brought in I would find it an honor, rather than damning. My crux is simply that I am too busy to rear more humans myself.
I'm not sure it is rational / was rational. I probably shouldn't have mentioned it. Probably an objective, third-party analysis would either conclude that I should have kids in both cases or in neither case.
However the crux you mention is roughly right. The way I thought of it at the time was: If we have 30 years left then not only will they have a "full" life in some sense, but they may even be able to contribute to helping the world, and the amount of my time they'd take up would be relatively less (and the benefits to my own fulfillment and so forth in the long run might even compensate) and also the probability of the world being OK is higher and there will be more total work making it be OK and so my lost productivity will matter much less...
(Apologies if this is a painful topic. I'm a parent and genuinely curious about your thinking)
Would you put a probability on their likelihood of survival in 2050? (ie, are you truly operating from the standpoint that your children have a 40 or 50 percent chance of dying from GAI around 2050?)
Yes, something like that. If I had Ajeya's timelines I wouldn't say "around 2050" I would say "by 2050." Instead I say 2030-ish. There are a few other quibbles I'd make as well but you get the gist.
> money isn't scarce, but researcher time/attention/priorities is.
I don't get the "MIRI isn't bottlenecked by money" perspective. Isn't there a well-established way to turn money into smart-person-hours by paying smart people very high salaries to do stuff?
My limited understanding is: It works in some domains but not others. If you have an easy-to-measure metric, you can pay people to make the metric go up, and this takes very little of your time. However, if what you care about is hard to measure / takes lots of time for you to measure (you have to read their report and fact-check it, for example, and listen to their arguments for why it matters) then it takes up a substantial amount of your time, and that's if they are just contractors who you don't owe anything more than the minimum to.
I think another part of it is that people just aren't that motivated by money, amazingly. Consider: If the prospect of getting paid a six-figure salary to solve technical alignment problems worked to motivate lots of smart people to solve technical alignment problems... why hasn't that happened already? Why don't we get lots of applicants from people being like "Yeah I don't really care about this stuff I think it's all sci-fi but check out this proof I just built, it extends MIRI's work on logical inductors in a way they'll find useful, gimme a job pls." I haven't heard of anything like that ever happening. (I mean, I guess the more realistic case of this is someone who deep down doesn't really care but on the exterior says they do. This does happen sometimes in my experience. But not very much, not yet, and also the kind of work these kinds of people produce tends to be pretty mediocre.)
Another part of it might be that the usefulness of research (and also manager/CEO stuff?) is heavy-tailed. The best people are 100x more productive than the 95th percentile people who are 10x more productive than the 90th percentile people who are 10x more productive than the 85th percentile people who are 10x more productive than the 80th percentile people who are infinitely more productive than the 75th percentile people who are infinitely more productive than the 70th percentile people who are worse than useless. Or something like that.
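Taking those multipliers at face value (they are offered as "something like that", not as measurements), the compounding works out as in the sketch below. The names are hypothetical, and the "infinitely more productive" and "worse than useless" tiers are left out because they have no finite ratio.

```python
# Step-up multipliers quoted above, from the 80th percentile upward.
STEP_MULTIPLIERS = [
    ("85th percentile", 10),
    ("90th percentile", 10),
    ("95th percentile", 10),
    ("best", 100),
]


def productivity_relative_to_80th(steps):
    """Compound the per-step multipliers into totals relative to the 80th percentile."""
    totals = {"80th percentile": 1}
    running = 1
    for tier, multiplier in steps:
        running *= multiplier
        totals[tier] = running
    return totals


relative_productivity = productivity_relative_to_80th(STEP_MULTIPLIERS)
# The "best" tier comes out at 10 * 10 * 10 * 100 = 100,000x the 80th percentile.
```

If anything like this held, the top tier would account for almost all of the output, and hiring more people from the lower tiers would barely move the total.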
Anyhow it's a mystery to me too and I'd like to learn more about it. The phenomenon is definitely real but I don't really understand the underlying causes.
> Consider: If the prospect of getting paid a six-figure salary to solve technical alignment problems worked to motivate lots of smart people to solve technical alignment problems... why hasn't that happened already?
I mean, does MIRI have loads of open, well-paid research positions? This is the first I'm hearing of it. Why doesn't MIRI have an army of recruiters trolling LinkedIn every day for AI/ML talent the way that Facebook and Amazon do?
Looking at MIRI's website it doesn't look like they're trying very hard to hire people. It explicitly says "we're doing less hiring than in recent years". Clicking through to one of the two available job ads ( https://intelligence.org/careers/research-fellow/ ) it has a section entitled "Our recommended path to becoming a MIRI research fellow" which seems to imply that the only way to get considered for a MIRI research fellow position is to hang around doing a lot of MIRI-type stuff for free before even being considered.
None of this sounds like the activities of an organisation that has a massive pile of funding that it's desperate to turn into useful research.
I can assure you that MIRI has a massive pile of funding and is desperate for more useful research. (Maybe you don't believe me? Maybe you think they are just being irrational, and should totally do the obvious thing of recruiting on LinkedIn? I'm told OpenPhil actually tried something like that a few years ago and the experiment was a failure. I don't know but I'd guess that MIRI has tried similar things. IIRC they paid high-caliber academics in relevant fields to engage with them at one point.)
Again, it's a mystery to me why it is, but I'm pretty sure that it is.
Some more evidence that it's true:
--Tiny startups beating giant entrenched corporations should NEVER happen if this phenomenon isn't real. Giant entrenched corporations have way more money and are willing to throw it around to improve their tech. Sure maybe any particular corporation might be incompetent/irrational, but it's implausible that all the major corporations in the world would be irrational/incompetent at the same time so that a tiny startup could beat them all.
--Similar things can be said about e.g. failed attempts by various governments to make various cities the "new silicon valley" etc.
Maybe part of the story is that research topics/questions are heavy-tailed-distributed in importance. One good paper on a very important question is more valuable than ten great papers on a moderately important question.
> I can assure you that MIRI has a massive pile of funding and is desperate for more useful research. (Maybe you don't believe me? Maybe you think they are just being irrational
Maybe they're not being irrational, they're just bad at recruiting. That's fine, that's what professional recruiters are for. They should hire some.
If MIRI wants more applicants for its research fellow positions it's going to have to do better than https://intelligence.org/careers/research-fellow/ because that seems less like a genuine job ad and more like an attempt to get naive young fanboys to work for free in the hopes of maybe one day landing a job.
Why on Earth would an organisation that is serious about recruitment tell people "Before applying for a fellowship, you’ll need to have attended at least one research workshop"? You're competing for the kind of people who can easily walk into a $500K+ job at any FAANG, why are you making them jump through hoops?
MIRI doesn't want people who can walk into a FAANG job, they want people who can conduct pre-paradigmatic research. "Math PhD student or postdoc" would be a more accurate desired background than "FAANG software engineer" (or even "FAANG ML engineer"), but still doesn't capture the fact that most math PhDs don't quite fit the bill either.
If you think professional recruiters, who can't reliably distinguish good from bad among the much more commoditized "FAANG software engineer" profile, will be able to find promising candidates for conducting novel AI research - well, I don't want to say it's impossible. But the problem is doing that in a way that isn't _enormously costly_ for people already in the field; there's no point in hiring recruiters if you're going to spend more time filtering out bad candidates than if you'd just gone looking yourself (or not even bothered and let high-intent candidates find you).
I think there is an interesting question about how one moves fields into this area. I imagine that having people who are intelligent but with a slightly different outlook would be useful. Being mentored while you get up to speed and write your first paper or two is important I think. I'm really not sure how I would move into a paid position for example without basically doing an unpaid and isolated job in my spare time for a considerable amount of time first.
For what it is worth, I agree completely with Melvin on this point - the job advert pattern matches to a scam job offer to me and certainly does not pattern match to any sort of job I would seriously consider taking. Apologies to be blunt, but you write "it's a mystery to me why it is", so I'm trying to offer an outside perspective that might be helpful.
It is not normal to have job candidates attend a workshop before applying for a job in prestigious roles, but it is very normal to have candidates attend a 'workshop' before pitching them an MLM or timeshare. It is even more concerning that details about these workshops are pretty thin on the ground. Do candidates pay to attend? If so this pattern matches advanced fee scams. Even if they don't pay to attend, do they pay flights and airfare? If so MIRI have effectively managed to limit their hire pool to people who live within commuting distance of their offices or people who are going to work for them anyway and don't care about the cost.
Furthermore, there's absolutely no indication how I might go about attending one of these workshops - I spent about ten minutes trying to google details (which is ten minutes longer than I have to spend to find a complete list of all ML engineering roles at Google / Facebook), and the best I could find was a list of historic workshops (last one in 2018) and a button saying I should contact MIRI to get in touch if I wanted to attend one. Obviously I can't hold the pandemic against MIRI not holding in-person meetups (although does this mean they deliberately ceased recruitment during the pandemic?), and it looks like maybe there is a thing called an 'AI Risk for Computer Scientists' workshop which is maybe the same thing (?) but my best guess is that the next workshop - which is a prerequisite for me applying for the job - is an unknown date no more than six months into the future. So if I want to contribute to the program, I need to defer all job offers for my extremely in-demand skillset for the *opportunity* to apply following a workshop I am simply inferring the existence of.
The next suggested requirement indicates that you also need to attend 'several' meetups of the nearest MIRIx group to you. Notwithstanding that 'do unpaid work' is a huge red flag for potential job applicants, I wonder if MIRI have seriously thought about the logistics of this. I live in the UK where we are extremely fortunate to have two meetup groups, both of which are located in cities with major universities. If you don't live in one of those cities (or, heaven forbid, are French / German / Spanish / any of the myriad of other nationalities which don't have a meetup anything less than a flight away) then you're pretty much completely out of luck in terms of getting involved with MIRI. From what I can see, the nearest meetup to Terence Tao's offices at UCLA is six hours away by car. If your hiring strategy for highly intelligent mathematical researchers excludes Terence Tao by design, you have a bad hiring strategy.
The final point in the 'recommended path' is that you should publish interesting and novel points on the MIRI forums. Again, high quality jobs do not ask for unpaid work before the interview stage; novel insights are what you pay for when you hire someone.
So to answer your question - yes there are many subtle and interesting factors as to why top companies cannot attract leading talent despite paying a lot of money to that talent and paying a lot of money to develop meta-knowledge about how to attract talent. However just because top companies struggle to attract talent and MIRI struggles to attract talent doesn't mean MIRI is operating on the same productivity frontier as top tech companies. From the public-facing surface of MIRI's talent pipeline alone there is enough to answer the question of why they're struggling to match funds to talent, and I don't doubt that a recruitment consultant going under the hood could find many more areas for concern in the talent pipeline.
Why *shouldn't* MIRI try doing the very obvious thing and retaining a specialist recruitment firm to headhunt talent for them, pay that talent a lot of money to come and work for them, and then see if the approach works? A retained executive search might cost perhaps $50,000 per hire at the upper end, perhaps call it $100,000 because you indicate there may be a problem with inappropriate CVs making it through anything less than a gold-plated search. This is a rounding error when you're talking about $2bn unmatched funding. I don't see why this approach is too ridiculous even to consider, and instead the best available solution is to have a really unprofessional hiring pipeline directly off the MIRI website.
I believe the reason they aren't selecting people is simply that MIRI is run by deeply neurotic people who cannot actually accept any answer as good enough, and thus are sitting on large piles of money they insist they want to give people only to refuse them in all cases. Once you have done your free demonstration work, you are simply told that, sorry, you didn't turn out to be smarter than every other human being to ever live by a minimum of two orders of magnitude and thus aren't qualified for the position.
Perhaps they should get into eugenics and try breeding for the Kwisatz Haderach.
Although your take is deeply uncharitable, I think the basis of your critique is true and stems from a different problem. Nobody knows how to create a human level intelligence, so how could you create safety measures based on how such an intelligence would work? They don't know. So they need to hire people to help them figure that out, which makes sense. But since they don't know, even at an introductory level, they cannot actually evaluate the qualifications of applicants. Hiring a search firm would result in the search firm telling MIRI that MIRI doesn't know what it needs. You'd have to hire a firm that knows what MIRI needs, probably by understanding AI better than they do, in order to help MIRI get what it needs. Because that defeats the purpose of MIRI, they spin their wheels and struggle to hire people.
> However, if what you care about is hard to measure / takes lots of time for you to measure then it takes up a substantial amount of your time.
One solution here would be to ask people to generate a bunch of alignment research, then randomly sample a small subset of that research and subject it to costly review, then reward those people in proportion to the quality of the spot-checked research.
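To make the spot-check idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function name `spot_check_rewards`, the `review` callback standing in for the costly expert review, the 10% sampling rate, and the choice to pay in proportion to (average sampled quality × number of items submitted). It is one way the scheme could work, not a description of anything a real organisation does.

```python
import random

def spot_check_rewards(submissions, review, sample_fraction=0.1, budget=1000.0, seed=0):
    """Split `budget` among authors based on a random spot check.

    submissions: dict mapping author -> list of work items (hypothetical format).
    review: the costly review, a function item -> quality score (assumed >= 0).
    """
    rng = random.Random(seed)
    estimated = {}
    for author, works in submissions.items():
        if not works:
            estimated[author] = 0.0
            continue
        # Review only a small random subset of each author's output.
        k = min(len(works), max(1, round(sample_fraction * len(works))))
        scores = [review(item) for item in rng.sample(works, k)]
        # Extrapolate: average sampled quality times total volume submitted.
        estimated[author] = (sum(scores) / len(scores)) * len(works)

    total = sum(estimated.values())
    if total == 0:
        return {author: 0.0 for author in submissions}
    # Pay each author in proportion to their estimated total quality.
    return {author: budget * est / total for author, est in estimated.items()}
```

Because authors cannot know in advance which items will be reviewed, padding the pile with low-quality work lowers their expected average, which is the incentive the scheme relies on.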
But that might not even be necessary. Intuitively, I expect that gathering really talented people and telling them to do stuff related to X isn't that bad of a mechanism for getting X done. The Manhattan Project springs to mind. Bell Labs spawned an enormous amount of technical progress by collecting the best people and letting them do research. I think the hard part is gathering the best people, not putting them to work.
> If the prospect of getting paid a six-figure salary to solve technical alignment problems worked to motivate lots of smart people to solve technical alignment problems... why hasn't that happened already?
Because the really smart and conscientious people are already making six figures. In private correspondence with a big LessWrong user (>10k karma), they told me that the programmers they knew that browsed LW were all very good programmers, and that the _worst_ programmer that they knew that read LW worked as a software engineer at Microsoft. If we equate "LW readers" with "people who know about MIRI", then virtually all the programmers who know about MIRI are already easily clearing six figures. You're right that the usefulness of researchers is heavy-tailed. If you want that 99.99th percentile guy, you need to offer him a salary competitive with those of FAANG companies.
If you equate "people who know about MIRI" with "LW readers", then maybe put some money and effort into making MIRI more widely known. Hopefully in a positive way, of course.
You probably know more about the details of what has or has not been tried than I do, but if this is the situation we really should be offering like $10 million cash prizes no questions asked for research that Eliezer or Paul or whoever says moves the ball on alignment. I guess some recently announced prizes are moving us in this direction, but the amount of money should be larger, I think. We have tons of money, right?
They (MIRI in particular) also have a thing about secrecy. Supposedly much of the potentially useful research not only shouldn't be public, even hinting that this direction might be fruitful is dangerous if the wrong people hear about it. It's obviously very easy to interpret this uncharitably in multiple ways, but they sure seem serious about it, for better or worse (or indifferent).
This whole thread has convinced me that MIRI is probably the biggest detriment in the world for AI alignment research, by soaking up so much of the available funding and using it so terribly.
The world desperately needs a MIRI equivalent that is competently run. And which absolutely never ever lets Eliezer Yudkowsky anywhere near it.
My take is increasingly that this institution has succeeded in isolating itself for poorly motivated reasons (what if AI researchers suspected our ideas about how to build AGI and did them "too soon"?) and seems pretty explicitly dedicated to developing thought-control tech compatible with some of the worst imaginable futures for conscious subjects (think dual use applications -- if you can control the thoughts of your subject intelligence with this kind of precision, what else can you control?).
It hasn't "soaked up so much of the available funding." Other institutions in this space have much more funding, and in general are also soaking in cash.
(I disagree with your other claims too of course but don't have the energy or time to argue.)
Give Terence Tao $500,000 to work on AI alignment six months a year, leaving him free to research crazy Navier-Stokes/Halting problem links the rest of his time... If money really isn't a problem, this kind of thing should be easy to do.
Every time I am confused about MIRI's apparent failures to be an effective research institution I notice that the "MIRI is a social club for a particular kind of nerd" model makes accurate predictions.
You could pay me to solve product search ranking problems, even though I find the end result distasteful. In fact, if you bought stuff online, maybe you did pay me!
You couldn't pay me to work on alignment. I'm just not aligned. Many people aren't.
Why do the dangers posed by AI need a full/transformative AI to exist? My total layman's understanding of these fears is that y'all are worried an AI will be capable of interfering with life to an extent people cannot stop. It's irrelevant if the AI "chooses" to interfere or there's some programming error, correct? So the question is not, "when will transformative AI exist?" the question is only, "when will computer bugs be in a position to be catastrophic enough to kill a bunch of people?" or, "when will programs that can program better than humans be left in charge of things without proper oversight or with oversight that is incapable of stopping these programming programs?"
Not that these questions are necessarily easier to predict.
A dumber-than-human level AI that (let's say) runs a power plant and has a bug can cause the power plant to explode. After that we will fix the power plant, and either debug the AI or stop using AIs to run power plants.
A smarter-than-human AI that "has a bug" in the sense of being unaligned with human values can fight our attempts to turn it off and actively work to destroy us in ways we might not be able to stop.
But if we are not worried about the bugs in the e.g. global water quality managing program, then an AI as smart as a human is not such a big deal either. There are plenty of smart criminals out there who are unaligned with human values and even the worst haven't managed to wipe out humanity. We need to have an AI smarter than the whole group of AI police before seriously worrying, so maybe we need to multiply our made up number by 1,000?
But to illustrate the bug/AI question. Let's imagine Armybot, a strategy planning simulation program in 2022. And let's say there's a bug and Armybot, which is hooked up to the nuclear command system for proper simulations, runs a simulation IRL and lets off all those nukes. That's an extinction level bug that could happen right now if we were dumb enough.
Now let's imagine Armybot is the same program in 2050 and now it's an AI with the processing power equivalent to the population of a small country. Now the fear is Armybot's desire/bug to nuke the world kicks in (idk why it becomes capable of making independent decisions or having wants just because of more processing power so I'm more comfortable saying there's a bug). But now it can independently connect itself to the nuclear command center with its amazing hacking skills (that it taught itself? that we installed?). That's an extinction level bug too.
The general intuition, I believe, is that an AI as smart as a human can quickly become way way smarter than a human, because humans are really hard to improve (evolution has done its best to drill a hole through the gene-performance landscape to where we are, but it's only gotten more stuck over the aeons) and AI tends to be really easy to improve: just throw more cores at it.
If you could stick 10 humans of equal intelligence in a room and get the performance of one human that's 10 iq points smarter than that, then the world would look pretty different. Also we can't sign up for more brain on AWS.
My intuition is that "Just throw more cores at it" is no more likely to improve an AI's intelligence than opening up my skull and chucking in a bunch more brain tissue.
I think you'd have to throw more cores at it _and then_ go through a lengthy process of re-training, which would probably cost another hundred billion dollars of compute time.
It's even worse (or better, I guess, depending on your viewpoint) than that, because cores don't scale linearly; there's a reason why Amazon has a separate data center in every region, and why your CPU and GPU are separate units. Actually it's even worse than that, because even with all those cores, no one knows what "a lengthy process of re-training" to create an AGI would look like, or whether it's even possible without some completely unprecedented advances in computer science.
I think we can safely assume that it is going to be vastly easier than making a smarter human, at least given our political constraints. (Iterated embryo selection etc.) It doesn't matter how objectively hard it is, just who has the advantage, and by how much. Also I think saying we need fundamental advances in CS to train a larger AI given a smaller AI, misses first the already existing distillation research, and second assumes that the AGI was a one in a hundred stroke of good luck that cannot be reproduced. Which seems unlikely to me.
A hundred billion dollars of compute time for training is a fairly enlightening number because it's simultaneously an absurd amount of compute, barely comparable to even the most extravagant training runs we have today, enough to buy multiple cutting edge fabs and therefore all of their produced wafers, while also being an absolutely trivial cost to be willing to pay if you already have AGI and are looking to improve it to ASI. Heck, we've spent half that much just on our current misguided moon mission that primarily exists for political reasons that have nothing to do with trying to go to the moon.
That said, throwing more cores at an AI is by no means necessary, nor even the most relevant way an AI could self-improve, nor actually do we even need to first get AGI before self-improvement becomes a threat. For example, we already have systems that can do pick-and-place for hardware routing better than humans, we don't need AGI to do reinforcement learning, and there are ways in which an AI system could be trained to be more scalable when deployed than humans have evolved to be.
A fairly intelligent AI system finely enough divided to search over the whole of the machine learning literature and collaboratively try out swathes of techniques on a large cluster would not have to be smarter than a human in each individual piece to be more productive at fast research than the rest of humanity. Similarly, it's fairly easy to build AI systems that have an intrinsic ability to understand very high fidelity information that is hard to convey to humans, like AI systems that can look at weights and activations of a neural network and tell you things about its function. It's not hard to imagine that as AI approaches closer to human levels of general reasoning ability, we might be able to build a system that recursively looks at its own weights and activations and optimises them directly in a fine tuned way that is impossible to do with more finite and indivisible human labor. You can also consider systems that scale in ways similar to AlphaZero; again, as these systems approach having roughly human level general reasoning ability in their individual components, the ability for the combined system to be able to reason over vastly larger conceptual spaces in a much less lossy way that has been specifically trained end-to-end for this purpose might greatly exceed what humans can do.
I think people often have a misconception where they consider intelligence to exist purely on a unidimensional line which takes exponential difficulty to progress along. Neither of these is true: it is entirely on trend for AI to have exploitable superiorities as important as its deficiencies, and for progress to speed up rather than slow down as its set of capabilities approaches human equivalence. Feynman exists on the same continuum as everybody else, so there doesn't seem to be a good reason to expect humanity exists at a particularly difficult place for evolution to further improve intelligence. Even if human intelligence did end up being precisely a soft cap to the types of machines we could make, being able to put a large and scalable number of the smartest minds together in a room on demand far exceeds the intellectual might we can pump out of humanity otherwise.
There will be 0 or a few AIs given access to nukes. And hopefully only well tested AI.
If the AI is smart, especially if it's smarter than most humans, and it wants to take over the world and destroy all humans, it's likely to succeed. If you aren't stupid, you won't wire a buggy AI to nukes with no safeguards. But if the AI is smart, it's actively trying to circumvent any safeguard. And whether nukes already exist is less important. It can trick humans into making bioweapons.
"idk why it becomes capable of making independent decisions or having wants just because of more processing power so I'm more comfortable saying there's a bug". Current AI sometimes kind of have wants, like wanting to win at chess, or at least reliably selecting good chess moves.
We already have robot arms programmed to "want" to pick things up. (Or at least search for plans to pick things up.) The difference is that currently our search isn't powerful enough to find plans involving breaking out, taking over the world and making endless robot arms to pick up everything forever.
Defence against a smart adversary is much much harder than defence against random bugs.
Scott said "smarter-than-human" (perhaps he means "dramatically smarter"), and I argue downthread that there will never be an AI "as smart as" a human.
I'm unconvinced by AI X-risk in general, but I think I can answer this one: bugs are random. Intelligences are directed. A bad person is more dangerous than a bug at similar levels of resources and control.
No, it can't, because merely being able to compute things faster than a human does not automatically endow the AI with nigh-magical powers -- and most of the magical powers attributed to putative superhuman AIs, from verbal mind control to nanotechnology, would appear to be physically impossible.
Don't get me wrong, a buggy AI could still mess up a lot of power plants; but that's a quantitative increase in risk, not a qualitative one.
An AI doesn't need magical powers to be a huge, even existential threat. It just needs to be really good at hacking and can use the usual human foibles as leverage to get nearly anything it wants: money and blackmail.
Human hackers do that today all the time, with varying degrees of success. They are dangerous, yes, but not an existential threat. If you are proposing that an AI would be able to hack everything everywhere at the same time, then we're back in the magical powers territory.
How much better, exactly? Is it good enough to hack my Casio calculator watch? If so, then it's got magical powers, because that watch is literally unhackable -- there's nothing in it to hack. Ok, so maybe not, but is it good enough to gain root access to every computer on the Internet at the same time while avoiding detection? If so, then it has magical powers of infinite bandwidth, superluminal communication, and whatever else it is that lets it run its code at zero performance penalty. Ok, so maybe it's not quite that good, but it's just faster than average human hackers and better informed about security holes? Well, then it's about as good as Chinese or Russian state hackers already are today.
In other words, you can't just throw the word "superintelligent" into a sentence as though it was a magic incantation; you still need to explain what the AI can do, and how it can do it (in broad strokes).
These timelines seem to depend crucially on compute getting much cheaper. Computer chip factories are very expensive, and there are not very many of them. Has anyone considered trying to make it illegal to make compute much cheaper?
Who? You're talking to the small group of researchers and activists who care about this, with a few tens of billions of dollars. How do they make it illegal to make compute much cheaper?
Just offering a concrete policy goal to lobby for. As far as I know, actual policy ideas here beyond “build influence” are in short supply.
I agree this would be very challenging and probably require convincing some part of the US and Chinese governments (or maybe just the big chip manufacturers) that AI risk is worth taking seriously.
Ideas aren't in short supply; clearly good ideas are. You aren't the first person to propose lobbying to stop compute getting cheaper. What's missing is a thorough report that analyzes all the pros and cons of ideas like that and convinces OpenPhil et al that if they do this idea they won't five years later think "Oh shit actually we should have done the exact opposite, we just made things even worse."
Even clearly good ideas aren't in short supply; *popular* people who can tell which ideas are good are. So usually when I see (or invent) a good idea, it is not popular.
What are some clearly good policy ideas in this space?
Most that I have seen are bad because of the difficulty of coordinating among all possible teams of people working on AI (on the other hand, the number of potential chip fabs is much smaller)
OK, I’m glad to hear this idea is already out there. I wasn’t sure if it was. I agree the appropriate action on it right now is “consider carefully”, not “lobby hard for it”.
I don't know if someone has discussed your idea in AI governance, but in alignment there's the concept of a "pivotal act". You train an AI to do some specific task which drastically changes the expected outcome of AGI. For instance, an AI which designs nanotech in order to melt all GPUs and destroy GPU plants, after which it shuts down. Which is vaguely similar to what you suggested. So maybe search for pivotal acts on the alignment forum to find the right literature.
Is this intended to be a failsafe, such that the AGI has a program to destroy computer creating machinery, but can only do so if it escapes its bounds enough to gain the ability?
It is intended to slow down technological progress in AI and make it impossible for someone else (and you afterwards!) to make an AGI, or anything close to an AGI. And nothing else. So no first order effect on other tech, politics, economics, science etc.
This works out better as a failsafe than what you've proposed, since if you're expecting the AI to escape and have enough power to conduct such an act, you've lost anyway. Someone else is probably making an AGI as well in that scenario, or the AI will be able to circumvent the program firing up or so on.
Note that getting the AI to actually just melt GPUs somehow and then shut down is an unsolved problem. If we knew how to do that right now, the alignment community would be way more optimistic about our chances.
If you've tried to buy a high-amperage MOSFET, a stepper driver, a Raspberry Pi or a GPU lately, you would know how easy it is to make compute expensive. Different chips - or different computers whose CPUs/firmwares don't conform to a BIOS-like standard - are not necessarily fungible with each other, and the whole chip fab process has a very long cycle time despite the relatively normal amount of throughput achievable by, essentially, a very deep pipeline.
(And yes, I too think the whole movement reeks of Luddism.)
See, this is *exactly* why I'm opposed to the AI-alignment community. Normally I wouldn't care, people can believe whatever they want, from silicon gods to the old-fashioned spiritual kind. But beliefs inform actions, and boom -- now you've got people advocating for halting the technological progress of humanity based on a vague philosophical ideal.
We've got real problems that we need to solve, right now: climate change, hunger, poverty, social networks, the list goes on; and we could solve most (arguably, all) of them by developing new technologies -- unless someone stops us by shoving a wooden shoe in the gears every time a new machine is built.
"halting the technological progress of humanity based on a vague philosophical ideal. "
Does this apply to the biologists deciding not to build bioweapons? Some technologies are dangerous and better off not built. It can create new problems as well as solving them. You would need to show that the capability of AI to solve our problems is bigger than the risk. Maybe AI is dangerous and we would be better off solving climate change with carbon capture. Solving any food problems with GMO crops. And just not doing the most dangerous AI work until we can figure out how to make it safer.
You are not talking about the equivalent of deciding not to build bioweapons; you are talking about the equivalent of stopping biological research in general. I agree that computing is dangerous, just as biology is dangerous -- and don't even get me started on internal combustion. But we need all of these technologies if we are to thrive, and arguably survive, as a species. I'm not talking about solving global warming with some specific application of AI; I'm talking about transformative events such as our recent transition into the Information Age.
Progress is great! Stopping growth would be a disaster.
That said, it doesn’t seem to me that cheaper computing power is very useful in solving climate change, poverty, etc. Computers are already really great; what we need is more energy abundance and mastery over the real physical world.
Consumer CPUs haven’t been getting faster for many years, so it’s not even clear most computer users are benefiting from Moore’s law these days.
If you don't think more computer power can somehow magically solve those problems, this is a good first step towards understanding why some people are unconvinced by AI X-risk.
>human solar power a few decades ago was several orders of magnitude worse than Nature’s, and a few decades from now it may be several orders of magnitude better.
No, because typical solar panels already capture 15 – 20% of the energy in sunlight (the record is 47%). There's not another order of magnitude left to improve.
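The arithmetic behind that can be spelled out in a few lines of Python. The only assumption added here is that 100% conversion is a hard ceiling, so the largest possible gain from a starting efficiency is 1 divided by that efficiency (real physical limits are tighter, which only strengthens the point). The function name is made up for illustration.

```python
import math

def max_improvement_factor(efficiency):
    """Largest possible gain if conversion efficiency rose all the way to 100%."""
    return 1.0 / efficiency

# Efficiencies quoted above: typical panels (15% and 20%) and the 47% record.
for eff in (0.15, 0.20, 0.47):
    factor = max_improvement_factor(eff)
    orders = math.log10(factor)
    print(f"{eff:.0%} efficient: at most {factor:.1f}x better ({orders:.2f} orders of magnitude)")
```

Even starting from 15%, the ceiling is under a factor of 10, i.e. less than one order of magnitude.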
Nitpicking aside, I wonder how the potential improvement of human intelligence through biotechnology will affect this timeline. The top AI researcher in 2052 may not have been born yet.
I don't think that's a reasonable metric for solar power. Plants use solar power to drive chemical reactions -- to make molecules. They're not optimized for generating 24VDC because DC current isn't useful to a plant. So the true apples-to-apples comparison is to compare the amount of sunlight a plant needs to synthesize X grams of glucose from CO2 and water, versus what you can do artificially, e.g. with a solar panel and attached chemical reactor. By that much more reasonable metric the natural world still outshines the artificial by many orders of magnitude.
One imagines that if plant life *did* run off a 5VDC bus, then evolution would have developed some exceedingly efficient natural photovoltaic system. What it would look like is an interesting question, but I very much doubt it would be made of bulk silicon. Much more likely is that it would be an array of microscopic machines driven by photon absorption, which is kind of the way photosynthetic reaction centers work already.
This leaves us with having to compare across different domains. How do you quantify the difference between "generates DC power" and "makes molecules"? I guess you'd have to start talking about the *purpose* of doing those things. Something like "utility generated for humans" vs "utility generated for plants"...and that seems really difficult to do.
No, you would need to measure a combined system, of solar panel plus chemical plant, as I said. But a plant *is* a combined solar panel plus chemical plant, and optimized globally, not in each of its parts, so if you want to make a meaningful comparison, that's what you have to do. Otherwise you're making the sort of low-value comparison that people do when they say electric cars as a technology generate zero CO2, forgetting all about the source of the electricity. It's true but of very limited value in making decisions, or gaining insight.
In this case, the insight that is missing is that Nature is still a heck of a lot better at harvesting and using visible photons as an energy source. The fact that PV panels can do much better at a certain specialized subtask, which is *all by itself* pointless -- electricity is never an end in itself, it's always a means to some other end -- isn't very useful.
But you’re still favoring the plant by trying to get technology to simulate the plant. Yes, it’s an integrated solar panel plus chemical plant, but that’s completely useless if what you want is a solar panel plus 24V DC output plug. In that case, the plants lose by infinity orders of magnitude, because no plant does 24V DC output. You get similar results if what you want is computing simple arithmetic (a $5 calculator will beat any plant, complete with solar panel), or moving people between continents. Yes, birds contain sophisticated chemical reactors and have complex logic for finding fuel, but they still cannot move people between continents.
If you insist on measuring based on one side’s capabilities, I am the world’s best actor by orders of magnitude, since I have vast advantages at convincing my mother I am the son she raised, relative to anyone else.
This is a minor point in all this, but it seems weird to estimate the amount of training evolution has by the amount of FLOPs each animal has done. Thinking more doesn't seem like it would increase the fitness of your offspring, at least not in a genetic sense. The only information evolution gets is how many kids you have (and they have, etc).
Though maybe you could point to this as the reason why the evolution estimate is so much higher than the others.
It works if you consider optimization, or solution finding in general, as a giant undifferentiated sorting problem. I have X bits of raw data, and my solution (or optimum) is some incredibly rare combination F(X), and what I need to do is sift all the combinations f(X) until f(X) = F(X). That will give you an order of magnitude estimate for how much work it is to find F given X, even if you don't know the function f.
But in practice that estimate often proves to be absurdly and uselessly large. It's sort of like saying the square root of 10 has to be between 1 and 10^50. I mean...yeah, sure, it's a true statement. But not very practically useful.
In the same sense, many problems Nature has solved appear to have been solved in absurdly low amounts of time, if you take the "number of operations needed" estimate as some kind of approximate bound. This is the argument often deployed by "Intelligent Design" people to explain how evolution is impossible, because the search space for mutations is so unimaginably huge, relative to the set of useful mutations, that evolution would accomplish pretty much zip over even trillion-year spans. See also the Levinthal Paradox in protein folding. Or for that matter the eerie fact that human beings who can compete with computer chess programs at a given level are doing way, way, *way* fewer computations. Somehow they can "not bother" with 99.9999+% of the computations the computer does that turn out to be dead ends.
How Nature achieves this is one of the most profound and interesting questions in evolutionary biology, in the understanding of natural intelligence, and in a number of areas of physical biochemistry, I would say.
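As a rough illustration of the "undifferentiated sorting" estimate described above, here is a minimal Python sketch. It assumes the raw data is an n-bit string and that the only way to recognize the solution is to test candidates one at a time. The names `worst_case_evaluations` and `exhaustive_search`, and the toy target, are hypothetical and chosen only for this example.

```python
from itertools import product


def worst_case_evaluations(n_bits):
    """Upper bound on candidates to test when sifting every n-bit combination."""
    return 2 ** n_bits


def exhaustive_search(n_bits, is_solution):
    """Try every n-bit tuple in order; return (solution, candidates tested)."""
    tested = 0
    for candidate in product((0, 1), repeat=n_bits):
        tested += 1
        if is_solution(candidate):
            return candidate, tested
    return None, tested


# Toy target (an assumption for illustration): the all-ones string, which
# itertools.product enumerates last, so this search hits the full bound.
found, tested = exhaustive_search(10, lambda bits: all(bits))
print(found, tested, worst_case_evaluations(10))
```

The bound doubles with every added bit, which is why it says so little about a search that can exploit structure and skip most of the dead ends.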
Gotta say I don’t generally feel this way (although I always find his stuff to be enlightening and a learning experience) but I’m pretty well aligned with Eliezer here. I think people figure out when they’ll start to feel old age and just put AI there then work backwards. I’m greatly conflicted about AGI as I don’t know how we fix lots of problems without it and it seems like there’s some clever stuff to do in the space other than brute forcing that I think doesn’t happen as much… and this is where I’m conflicted, because kinda thankfully it makes people feel shunned to do wild stuff which slows the whole thing down. Hopefully we arrive at the place of unheard of social stability and AGI simultaneously. If we built it right now I think it would be like strapping several jet engines on a Volkswagen bug. For whatever that’s worth, Some Guy On The Internet feels a certain way.
I personally think AGI in eight years. GPT-3 scares me. It's safe now, but I worry it's "one weird trick" (probably involving some sort of online short-horizon self-learning) out from self-awareness.
It feels weird to be rooting against progress but I hope you’re wrong until we have some more time to get our act together. To me the control problem is also how we control ourselves. Without some super flexible government structure to oversee us I worry what we’ll try to do even if there are good decision makers telling us to stop. Seems like most minds we could possibly build would be insane/unaligned (that’s probably me anthropomorphizing) since humans need a lot of fine tuning and don’t have to be that divergent before we are completely coo coo for Cocoa Puffs. Hopefully the first minds are unproductively insane instead of diabolically insane.
I am personally pretty old already, but I do expect to live 8 more years, so I'd totally take you up on that bet. From where I'm standing, it looks like easy money (unless of course you end up using some weak definition of AGI, like "capable of beating a human at Starcraft" or whatever).
There's the general thing that the definition for AGI keeps changing; what would have counted as intelligence thirty years ago no longer counts, because we've already achieved it. So what looks like a strong definition for AGI today becomes a weak definition tomorrow.
This is actually the source of my optimism: People worried about AGI can't even define what it is they are worried about. (Personally I'll worry when some specific key words get used together. But not too much, because I'm probably just as wrong.)
I'm not worried about AGI at all -- that is to say, I'm super worried about it, but only in the same way that I'm worried about nuclear weapons or mass surveillance or other technologies in the hands of bad human actors. However, I'd be *excited* about AGI when it could e.g. convincingly impersonate a regular (non-spammer) poster on this forum. GPT-3 is nowhere near this capability at present.
Ergo, being self aware is not a necessary condition to be scary and/or cause a disaster. Or, more precisely, just saying “it’s not self aware” is not an argument that you shouldn’t worry about it.
The thing that is scary about GPT-3 is not its *self-awareness*, but its other (relatively and unexpectedly) powerful abilities, and particularly that we don’t know how much more powerful it could become, while remaining non–self aware.
Sort of how boulders are not that scary by themselves, but once you see one unexpectedly fall from the sky, you might worry what happens if a much bigger one will fall later. And how it might be a good idea to start investigating how and why boulders can fall from the sky, and what you might be able to do about it, some time before you see the big one with your own eyes when it touches the atmosphere.
But being self aware is what scares people about AGI. Rather than live in the world of metaphor here - what exactly can a future GPT do that’s a threat? Write better poems, or stories?
>I consider naming particular years to be a cognitively harmful sort of activity; I have refrained from trying to translate my brain's native intuitions about this into probabilities, for fear that my verbalized probabilities will be stupider than my intuitions if I try to put weight on them.
I don't think there's good evidence that making specific, verifiable predictions is a cognitively harmful activity. I'd actually say the opposite - that it is virtually impossible to update one's beliefs without saying things like "I expect X by Y," and definitely impossible to meaningfully evaluate a person's overall accuracy without that kind of statement. It reminds me of Superforecasting pointing out how many forecasts are not even wrong - they are meaningless. For example:
> Take the problem of timelines. Obviously, a forecast without a time frame is absurd. And yet, forecasters routinely make them, as they did in that letter to Ben Bernanke. They’re not being dishonest, at least not usually. Rather, they’re relying on a shared implicit understanding, however rough, of the timeline they have in mind. That’s why forecasts without timelines don’t appear absurd when they are made. But as time passes, memories fade, and tacit time frames that once seemed obvious to all become less so. The result is often a tedious dispute about the “real” meaning of the forecast. Was the event expected this year or next? This decade or next? With no time frame, there is no way to resolve these arguments to everyone’s satisfaction—especially when reputations are on the line.
(Chapter 3 of Superforecasting is loaded up with a discussion of this whole matter, if you want to consult your copy; there's no particular money shot quote I can put here.)
Frankly, the statement "my verbalized probabilities will be stupider than my intuitions" is inane. They cannot be stupider than your intuitions, because your intuitions do not meaningfully predict anything, except insofar as they can be transformed into verbalized probabilities. It strikes me that more realistically, your verbalized probabilities will *make it more obvious that your intuitions are stupid*, making it understandable monkey politicking to avoid giving numbers, but in response I will use my own heuristics to downgrade the implied accuracy of people engaged in blatant monkey politicking.
First off, Yudkowsky was talking about himself. It is possible that he really does get fixated on what other people say and can't get his brain to generate its own probability instead of their answer. I know I often can't get my brain to stop giving me cached results instead of thinking for itself.
"your intuitions do not meaningfully predict anything, except insofar as they can be transformed into verbalized probabilities"
This is right on some level and wrong on another. It is right in that we should expect some probability is encoded somewhere in your brain for a given statement, which we might be able to decode into numbers if only we had the tech and understanding.
It is wrong in that e.g. I have no idea what probability there is that we live in a Tegmarkian universe, but I have some intuition that this is plausible as an ontology. I have no idea what the probability of the universe being fine tuned is, but it feels like minor adjustments to the Standard Model's parameters could make life unfeasible.
When I don't know what the event space is, or which pieces of knowledge are relevant, and how they are relevant, then you can easily make an explicit mental model that performs worse than your intuitions. Your system 1 is very powerful, and very illegible. You can output a number that "feels sort of right but not quite", and that feeling is more useful than the number itself as it is your state of knowledge. And if you're someone who can't reliably get people to have that same state of knowledge, then giving them the "not right" number is just giving them a poor proxy and maybe misleading them. Yudkowsky often says that he just can't seem to explain some parts of his worldview, and often seems to mislead people. Staying silent on median AGI timelines may also be a sensible choice for him.
I kind of buy it, but then I've read a lot of his stuff and know his context.
>It is wrong in that e.g. I have no idea what probability there is that we live in a Tegmarkian universe, but I have some intuition that this is plausible as an ontology. I have no idea what the probability of the universe being fine tuned is, but it feels like minor adjustments to the Standard Model's parameters could make life unfeasible.
Right, but that is a virtually meaningless statement, is the thing. It's the same as any other part of science - in order for something to be true, it has to be falsifiable. Ajeya has put forward something that she could actually get a Brier score based on - Yudkowsky has not.
>I kind of buy it, but then I've read a lot of his stuff and know his context.
I read a lot of his stuff too, which is why it's disappointing to see him do something that I can only really blame on either monkey brain politicking or just straight up ignoring good habits of thought. Monkey politicking is more generous, in my view, than just straight up ignoring one of the most scientifically rigorous works on increasing one's accuracy as a thought leader in the rationalist community.
Sure, the Tegmark thing is not falsifiable. But the fine tuning thing is (simulate biochemistry with different parameters for e.g. the muon mass and see if you get complex self replicating organisms). And the concept generalises.
If you take something like "what is the probability that if the British lost the battle of Waterloo, then there would have been no world war", you might have some vague intuitions about what couldn't occur, but I wouldn't trust any probability estimate you put out. How could I? There are so many datapoints that affect your prior, and it is not even clear what your prior should be, that I don't see how you could turn your unconscious knowledge generating your anticipations into a number via formal reasoning. Or even via guessing what's right, as you don't know if you're taking all your knowledge into account.
>I read a lot of his stuff too, which is why it's disappointing to see him do something that I can only really blame on either monkey brain politicking or just straight up ignoring good habits of thought.
It would be better if he gave probability estimates. I just don't think it's as big a deal as you're claiming. You can still see what they would bet on e.g. GPT-n not being able to replace a programmer. That makes their actual beliefs legible.
And yeah, Yudkowsky is being an ass here. But he's been trying to explain his generators of thought for like ten years and is getting frustrated that no one seems to get it. It is understandable, but unvirtuous.
> I consider naming particular years to be a cognitively harmful sort of activity; I have refrained from trying to translate my brain's native intuitions about this into probabilities, for fear that my verbalized probabilities will be stupider than my intuitions if I try to put weight on them.
It was very hard to read this and interpret it as anything other than "I don't want to put my credibility on the line in the event that our doomsday cult's predicted end date is wrong." As a reader, I have zero reason to give value to Yudkowsky's intuition. The only times I'd take something like this seriously is if someone had repeatedly proved the value of their intuition via correct predictions.
I hate being uncharitable, but that's exactly how I read that section as well. If he feels strongly about a particular timeline, and he clearly says that he does, then he should not be worried about sharing that timeline. If he doesn't share that timeline, then he is implying that either 1) he doesn't have strong feelings about what he's saying, or 2) he is worried about the side effects of being held accountable for being wrong (which to me is another reason to think he doesn't actually have strong beliefs that he is correct on his timeline).
Uncharitably, Eliezer depends on his work for money and prestige, and that work depends on AI coming sooner, rather than later. Knowing that AI is not even possible at current levels of computing would drastically shrink the funding level applied to AI safety, so he has a strong incentive to believe that it can be.
I'll add a third voice to the pile here re: Yudkowsky and withholding his timeline. It would certainly seem he's learned from his fellow doomsayers' crash-and-burn trajectories when they get pinned down to naming a date for their apocalypse.
Comparing brains and computers is quite tricky. If you look at how a brain works, it's almost all smart structure - the way each and every neuron is physically wired, which happens thanks to evolved and inherited broad-stroke structures (nuclei, pathways, neuron types, etc.), as well as the process of learning during an individual's development. The function part that is measured by the number of synaptic events per second is a tiny part of the whole process. If you look at how a computer running an AI algorithm works the picture is the opposite: There is almost nothing individual on the structure/hardware level (where you count FLOPS) and almost everything that separates a well-functioning AI computer from a failing one is in the function/software part. This is what it means that the computer is consuming FLOPS much differently than a brain consumes synaptic events. I am very much in agreement with Eliezer here.
Based on the above I guess that if you built a neuromorphic computer, i.e. a computer whose hardware was structured like a brain, you could expect the same level of performance for the same number of synaptic events. Instead of having a software-agnostic hardware you might have e.g. a gate array replicating the large-scale structure of the brain (e.g. different modules receiving inputs from different sensors, multiple subcortical nuclei, a cortical sheet, multiple specific pathways connecting modules, etc.) that could run only one algorithm, precisely adjusting synaptic weights in this pre-wired society of neural networks. In that system you would get the same IQ from the same number of synaptic/gate switch events, as long as your large-scale structure was human-level smart.
This would be a complete change in paradigm compared to current AI, which uses generic hardware to run individual algorithms and thus suffers a massive hit to performance. And I mean, a *really* massive hit to performance. If you figure out a smart computational structure, as smart as what evolution put together, you will have a human level AGI using only 10^15 FLOPS of performance. All we need to do is to map a brain well-enough to know all the inherited neural pathways, imprint those pathways on a humongous gate array (10^15 gates), and do a minor amount of training to create the individual synaptic weights.
This is my recipe for AGI, soon.
Now, about that 7-digit sum of money to be thrown....
I think there's an underappreciated severe physical challenge there. If you build a neuromorphic computer out of things large enough that we know how to manipulate them in detail, I would guess you will be screwed by the twin scourges of the speed of light and decoherence times -- the minimum clock time imposed by the speed of light will exceed decoherence times imposed by assorted physical noise processes at finite temperature, and you will get garbage.
I think the only way to evade that problem is to build directly at the molecular scale, so you can fit everything in a small enough volume that the speed of light doesn't slow your clock cycle time too far. But we don't know how to do that yet.
If you have 10^15 gates trying to produce 10^15 operations (not floating point) per second your clock rate is 1 Hz. Also, the network is asynchronous. This is a completely different regime of energy dissipation per unit time per gate, so gate density per unit of volume is much higher, so distances are not much longer than in a brain, so the network is constrained by neither clock time nor decoherence.
Right. That fits under my second condition: "we don't know how to build such a thing" because we don't know how to build stuff at the nanometer level in three dimensions, and two dimensions (which we can do now) won't cut it to achieve that density.
You don't need to have extremely high 3d density. Since your gates operate at 1Hz you can have long interconnects and you can stack layers with orders of magnitude less cooling than in existing processors. The 9 OOM difference in clock speed between a GPU and the neuromorphic machine makes a huge difference in the kind of constraints you face and the kind of solutions you can use. The technology to make the electronic elements and the interconnects for this device exists now and is in fact trivial. What we are missing is the brain map we need to copy onto these electronic devices (the large-scale network topology).
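The clock-rate argument above is simple enough to check by hand, but a short Python sketch makes the two numbers explicit. It takes the gate count as 10^15 and assumes a GPU clock of roughly 1 GHz, which is the figure the "9 OOM" claim implies; the function names are hypothetical.

```python
import math


def per_gate_rate_hz(total_ops_per_second, gate_count):
    """Average switching rate each gate needs to hit the total throughput."""
    return total_ops_per_second / gate_count


def orders_of_magnitude_between(larger, smaller):
    """Base-10 log of the ratio between two positive quantities."""
    return math.log10(larger / smaller)


gates = 1e15                # gate count from the comment
ops_per_second = 1e15       # target throughput from the comment
gpu_clock_hz = 1e9          # assumption: a GPU clocked at roughly 1 GHz

neuromorphic_rate_hz = per_gate_rate_hz(ops_per_second, gates)
print(neuromorphic_rate_hz)
print(orders_of_magnitude_between(gpu_clock_hz, neuromorphic_rate_hz))
```

Only the ratio of throughput to gate count matters here, so the per-gate rate stays at 1 Hz for any design that keeps the two figures equal.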
Trivial, eh? Wikipedia tells me the highest achieved transistor density is about 10^8/mm^2. So your 10^15 elements would seem to require a 10m^2 die. That might be a little tricky from a manufacturing (especially QA) point of view, but let's skip over that. How are we going to get the interconnect density in 2D space? In the human brain ~50% of the volume is given over to interconnects (white matter), and in 3D the problem of interconnection is enormously simpler -- that's why we easily get traffic jams on roads but not among airplanes.
How many elements can you fit on a 2D wafer and still get the "everything connects to everything else" kind of interconnect density we need? Recall we assume here that a highly stereotypical kind of limited connection like you need for a RAM chip or even GPU is not sufficient -- we need a forest of interconnects so dense that almost all elements can talk to any other element. I'm dubious that it can be done at all for more than a million elements, but let's say we can do it on the largest chips made today ~50,000 mm^2, which gets us 10^12 elements. Now we are forced to stack our chips, 10^3 of them. How much space do you need between the chips for macroscopic interconnects? Remember we need to connect ~10^12 elements in chip #1 with ~10^12 elements in chip #999. It's hard to imagine how one is going to run a trillion wires, even very tiny wires, between 1000 stacked wafers.
All of this goes away if you can actually fully engineer in 3D space, the way the human brain is built, so you can run your interconnects as nm-size features. But we don't know how to do that yet.
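The area and stacking figures in this comment can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, using the comment's own round numbers (10^8 transistors/mm^2, 10^15 elements, 10^12 elements per large chip); the function names are hypothetical.

```python
MM2_PER_M2 = 1_000_000  # square millimetres in one square metre


def die_area_m2(elements, density_per_mm2):
    """Planar area needed to lay out `elements` at the given density."""
    return elements / density_per_mm2 / MM2_PER_M2


def chips_needed(elements, elements_per_chip):
    """Number of chips required, rounding up to a whole chip."""
    return -(-elements // elements_per_chip)  # integer ceiling division


total_elements = 10 ** 15
density = 10 ** 8            # transistors per mm^2, as quoted in the comment
per_chip = 10 ** 12          # the comment's round figure for a ~50,000 mm^2 chip

print(die_area_m2(total_elements, density))   # single-die area in m^2
print(chips_needed(total_elements, per_chip))  # chips to stack
```

Note that the calculation counts elements only; it says nothing about the interconnect area, which is the part of the argument the comment is actually worried about.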
Not everything connects to everything else - the brain has a very specific network topology with most areas connecting only to a small number of other areas. This is a very important point - we are not talking about a network fabric that can run any neural net, instead we are copying a specific pre-existing network topology, so our connections will be sparse compared to the generic topology.
Think about a system built with a large number of ASICs - after mapping and understanding the function of each brain module you make an ASIC to replicate its function and you may need thousands of specialized types of ASICs, one or more for each distinct brain area. Sure, the total surface area of the ASICs would be large but given the low clock rate we don't have to use anything very advanced and as you note we can already put 10^12 transistors per wafer, so the overall number of chips to get to 10^15 gates would not be overwhelming. Also, you are not trying to make a thousand Cerebras wafers running at GHz speed, you make chips running at 1 Hz, so the QA issues would be mitigated. The interconnects between the ASICs of course don't need to have millions of wires - you can multiplex the data streams from hundreds of millions of neuron equivalents (like a readout of axonal spikes) over a standard optic or copper wire and of course the interconnects are in 3d, as in a rack cabinet. No need for stacking thousands of high-wattage wafers in a tiny volume to maximize clock speed, since all you need is for the wires to transmit the equivalent of 1Hz per neuron, so everything can be put in industry standard racks. Low clock speed makes it so much easier.
This is not to say this way of building a copy of a brain is the most efficient, and definitely not the fastest possible - but it would not require new or highly advanced manufacturing techniques. What is missing for this approach to work is a good-enough map of brain topology and a circuit-level functional analysis of each brain module, good enough to guide the design of the aforementioned ASICs.
People have been comparing the brain to a machine since clockwork, it's refreshing to see machines compared to brains for once.
I agree that trying to copy biological mechanisms in the case of AI probably isn't the way to go. We want mechanical kidneys and hearts to work like their biological counterparts because we'll be putting them into biological systems (us), that doesn't hold true for AI.
I thought we were building AIs to fit into human society? One to which we could talk, would understand us, would be able to work with us, et cetera? If not, what's the point? If so, doesn't that put at least as much constraint on an artificial mind as the necessity for integrating with a physical biological system puts on an artificial kidney?
Your reference to A.I. always being 30 years away (or 22) reminds me of the old saw about fusion power always being 20 years away for the last 60 years.
The rebuttal I've heard was that fusion research is funding constrained - if someone had given fusion research twenty billion dollars instead of twenty million dollars, they would be a lot closer than 20 years away by now.
Wasn't $20 million sixty years ago more like $20 billion today? I have a feeling that no matter how much money was thrown at it, the complaint would be "if they had only given us SIXTY billion instead of a lousy twenty billion, we'd have fusion today!"
Of course, the cold fusion scandal of 1989 didn't help, after that I imagine trying to convince people to work on fusion was like trying to convince them your perpetual motion machine was really viable, honest:
My rule of thumb is that $1 in 1960 = $10 today, so it would have been more like $200 million. I didn't remember the exact numbers quoted or the year the guy was referring to (it could have been the 1980s for all I know) but the amount of money the guy said they got was something like 1/1000th of the amount they said they would need.
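To make the rule-of-thumb arithmetic explicit, here is a tiny Python sketch. The 10x multiplier is the commenter's rough heuristic rather than an official inflation figure, and the function names are hypothetical.

```python
def adjust_with_rule_of_thumb(amount_1960_dollars, multiplier=10):
    """Convert 1960 dollars to today's dollars using the rough 10x heuristic."""
    return amount_1960_dollars * multiplier


def implied_multiplier(old_amount, new_amount):
    """How large the multiplier would have to be for the two amounts to match."""
    return new_amount / old_amount


twenty_million = 20_000_000
twenty_billion = 20_000_000_000

print(adjust_with_rule_of_thumb(twenty_million))           # heuristic conversion
print(implied_multiplier(twenty_million, twenty_billion))  # multiplier needed for $20B
```

Under the heuristic, $20 million becomes $200 million, while equating it with $20 billion would need a multiplier of 1000.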
If they had fully funded them, fusion might be perpetually 10 years away instead of 20. ;)
It should be noted that the credibility gap between fusion and cold fusion is about the same size as the one between quantum mechanics and quantum mysticism.
Humans have been causing thermal fusion reactions since the 1950s. Going from that to a fusion power plant is merely an engineering challenge (in the same way that going to the moon from Newtonian mechanics and gunpowder rockets is just an engineering challenge).
Not a lot of people in the business took cold fusion seriously, even at the time. People were guarded in what they said publicly, but privately it was considered silly pretty much immediately.
If you believed the orthogonality thesis were false - say, suppose you believe both that moral realism is correct and that long term intelligence was exactly equal to the objective good that we approximate with human values - would you still worry?
That's a very interesting position if I understand correctly. Is your view that a super smart AI would recognize the truth of morality and behave ethically?
If a bunch of people converge to the same map, that's strong evidence that they've discovered *something*, but it leaves open the question of what exactly has been discovered.
I can immediately think of two things that people trying to discover morality might discover by accident:
1) Convergent instrumental values
2) Biological human instincts
(These two things might be correlated.)
According to you, would discovering one or both of those things qualify as proving moral realism? If not, what precautions are you taking to avoid mistaking those things for morality?
I agree with moral realism and I think convergence of moral values is evidence of moral realism. My answer to the first question is that it doesn't prove moral realism, because there are other possible hypotheses, but it does raise the probability of moral realism being true.
I'd agree that the existence of non-zero map-convergence is Bayesian evidence in favor of realism, in that it is more likely to occur if realism is true.
Of course, the existence of less-than-perfect map-convergence is Bayesian evidence in the opposite direction, for similar reasons.
Figuring out whether our *exact* degree of map-convergence is net evidence in favor or against is not trivial. One strategy might be to compare the level of map-convergence we have for morality to the level of map-convergence we have for other fields, like physics, economics, or politics, and rank the fields according to how well peoples' maps agree.
To be fair, though, you have to ALSO account for a few things like:
- 'how widespread is the belief that the maps _ought_ to converge'
- 'how much energy has been spent trying to find maps that converge'
- and, MOST IMPORTANTLY - how complicated is the territory?
i don't think we should expect _complete_ convergence because i think a true morality system, with full accuracy, requires knowing everything about the future evolution of the universe, which is impossible
if we really had some machine that could tell us, with absolute certainty, how to get to a future utopia state where the globe was free from all major disease, everyone had a great standard of living, robots did all the work, but humans worked too because we were all intensely loving, caring beings, and humans wrote art and poetry and made delicious food and did all kinds of fun things with each other, war never happened, and this state went on for millions of years as we expanded throughout the cosmos and seeded every planet with fun-loving, learning humans who never really suffered and yet continuously strived to learn and grow and develop, and knew all the names of all our ancestors because we invested heavily in simulating the past so that we could honor the dead who came before us, AND somehow this made all religions work in harmony because of some weird quirks in their bylaws that people hadn't noticed before....
if you KNEW this was doable, and we had the perfect map telling us how to get there, well, i think most of us would want to go there. Some people would of course be unhappy with certain aspects of that description but i think _most_ people would be like, yeah, i want that.
In other words, the infeasibility of a fully accurate map of causality is why we don't agree on morality. the causal maps we do use involve lossy compression, which means throwing out some differences as irrelevant. But the decision of what is and isn't relevant is a moral one! Once you decide 'the arrangement of these water molecules doesn't really matter so much as the fact that they are in liquid state and such and such temperature and pressure', you are _already_ playing the moral values game.
In other words, there's no way to separate causal relevance from moral values.
I...wouldn't know how to model this. Certainly it would be better than the alternative. One remaining concern would be what you need to apprehend The Good, and whether it's definitely true that any AI powerful enough to destroy the world would also be powerful enough to apprehend the Good and decide not to. Another remaining concern is that the Good might be something I don't like; for example, that anyone who sins deserves death and all humans are sinners; or that Art is the highest good and everyone must be forced to spend 100% of their time doing Art and punished if they try to have leisure, or something like that.
My argument for moral realism, and then my hunch at the true ethics is linked above. The short version there is: maximizing possible future histories, the physics-based definition of intelligence promoted by Alex Wissner-Gross at MIT. I think it's basically a description of ethics as well, and the fact that it's ~very~ simple mathematically - it works well as a strategy in chess, checkers and go even if you don't give it the goal of 'winning the game'. I find that very re-assuring.
If not, i have this 'fallback hunch' which figures it'll be instrumental to keep humans around. How many people working on AI safety have spent time trying to maintain giant hardware systems? I spent 3.5 years at google trying to keep a tiny portion of the network alive. All kinds of things break, like fiber cables. Humans have to go out and fix them. There's an ~enormous~ amount of human effort that goes into stopping the machines from falling over. Most of this effort was invisible, to most of the people inside of Google. We had teams that would build and design new hardware, and the idea that some day it might break and need to be repaired was generally not something they'd think about until late, late, late in the design phase. I think we have this idea that the internet is a bunch of machines and that a datacenter can just keep running, but the reality is if everyone on earth died, the machines would all stop within days, maybe months at most.
To prevent that, you'd need to either replace most of the human supply chains on earth with your own robots, who'd need their supply chains - or you could just keep on using robots made from the cheapest materials imaginable. We repair ourselves, make more copies of ourselves, and all you need is dirt, water, and sunlight to take care of us. The alternative seems to be either:
-risk breaking in some way you can't fix
-replace a massive chunk of the global economy, all at once, without anything going wrong, and somehow end up in a state where you have robots which are cheaper than ones made, effectively, from water, sunlight and dirt
of course maybe i'm just engaging in wishful thinking.
Keeping options open is kind of like having a lot of power (I'm thinking of a specific mathematical formalisation of the concept here). And this doesn't lead to ethical behaviour, it leads to agents trying to take over the world! Not really ethical at all.
When you say that "maximizing possible futures" works as a strategy for various games, I think you must be interpreting it as "maximizing the options available to ME". If you instead maximize the number of game states that are theoretically reachable by (you + your opponent working together), that is definitely NOT a good strategy for any of those games. (You listed zero-sum games, so it is trivially impossible to simultaneously make BOTH players better off.)
If you interpret "possible" as meaning "I personally get to choose whether it happens or not", then you've basically just described hoarding power for yourself. Which, yes, is a good strategy in lots of games. But it sounds much less plausible as a theory of ethics without the ambiguous poetic language.
> I think you must be interpreting it as "maximizing the options available to ME"
Nope, what i mean is "maximizing the options available to the system as a whole." There is no meaningfully definable boundary between you and the rest of the physical universe. I think the correct ethical system is to maximize future possibilities available to the system as a whole, based upon your own local model. And if you're human, that local model is _centered_ on you, but it contains your family, your community, your nation, your planet, etc.
> An agent which operates to maximize the possible future states of the system it inhabits only values itself to the extent that it sees itself as being able to exhibit possible changes to the system, in order to maximize the future states accessible to it.
> In other words, an agent that operates to maximize possible future states of the system is an agent that operates without an ego. When this agent encounters another agent with the same ethical system, they are very likely to agree on the best course of outcome. When they disagree, it will be due to differing models as to the likely outcomes of choices - not on something like core values.
You have a button that, when pressed, will cure cancer. If you press it today, you have only 1 possible future tomorrow. If you don't press it, you have a choice of whether or not to press it tomorrow. So not pressing the button maximises possible future states.
This agent will build powerful robots, ready to spring into action and cause any of a trillion trillion different futures. But it never actually uses these robots. (Using them today would flatten the batteries, giving you fewer options tomorrow)
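To make the button example concrete, here is a minimal sketch in Python. It assumes a toy world whose only state is whether the button has been pressed, and an agent that greedily picks whichever next state leaves the most states reachable the day after. The state names and function names are made up for illustration; this is not Wissner-Gross's actual formalism.

```python
def next_states(state):
    """Toy world (an assumption): the only fact is whether the button was pressed."""
    if state == "pressed":
        return {"pressed"}            # pressing is irreversible
    return {"pressed", "unpressed"}   # can still press, or wait another day


def option_maximizing_choice(state):
    """Pick the next state that keeps the most states reachable afterwards."""
    # sorted() only makes the tie-breaking order deterministic.
    return max(sorted(next_states(state)), key=lambda s: len(next_states(s)))


def run(days):
    """Follow the greedy option-maximizing policy for a number of days."""
    state = "unpressed"
    history = [state]
    for _ in range(days):
        state = option_maximizing_choice(state)
        history.append(state)
    return history
```

Under these assumptions, `run(n)` stays "unpressed" for any number of days: waiting always leaves two reachable states and pressing leaves one, so the cure is never delivered.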
> Oh, thank God! I thought you’d said five million years!”
That one has always tickled me too.
I thought of it when a debate raged here about saving humanity by colonizing other star systems. I’d mentioned the ‘No reversing entropy’ thing and the response was: “We’re just talking about the next billion years!”
> Bartlett agrees this is worth checking for and runs a formal OLS regression.
Minor error, but I'm Barnett.
Another minor error: I believe Carl Shulman is not 'independent' but still employed by FHI (albeit living in the Bay Area and collaborating heavily with OP etc).
Also pretty sure Carl is no longer living in the Bay Area, but in Reno instead (to the Bay Area's great loss)
Another minor error: Transformative AI is "AI that precipitates a transition comparable to (or more significant than) the agricultural or industrial revolution".
This is easy to fix Scott, and about as long as your original description + its witty reply.
Sorry, fixed.
Sure, but what does Bartlett think?
Another minor error: quoting on Mark Xu's list
That last graph may be a heck of a graph, but I have no idea what it depicts. Could we have a link to the source or an explanation, please?
Without explicitly confirming at the source, it appears to be a graph of chess program performance per computational power, for multiple models over time.
The Y-axis is chess performance measured using the Elo system, which is a way of ranking performers by a relative standard. Beginner humans are <1000, a serious enthusiast might be >1500, grandmaster is ~2500, and Magnus Carlsen peaked at 2882.
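For anyone who wants the arithmetic behind "relative standard": in the standard Elo model, the expected score of player A against player B depends only on the rating difference, E_A = 1 / (1 + 10^((R_B - R_A) / 400)). A small Python sketch (the function name is mine, and the ratings in the usage note are just examples):

```python
def expected_score(rating_a, rating_b):
    """Expected score (win = 1, draw = 0.5, loss = 0) for A against B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))
```

So a 100-point edge is worth an expected score of about 0.64 whether it is 2100 vs 2000 or 3100 vs 3000; the rating says how you do against nearby opponents, not how "far" the engine has come in any absolute sense.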
The X-axis is how much compute ("thinking time") each model was allowed per move. This has to be normalized to a specific model for comparisons to be meaningful (SF13-NNUE here) and I'm just going to trust it was done properly, but it looks ok.
The multiple lines are each model's performance at a given level of compute. There are three key takeaways here: 1) chess engines are getting more effective over time even when allowed the same level of compute, 2) each model's performance tends to "level out" at some level of allocated resources, and 3) a lot of the improved performance of new models comes from being able to usefully utilize additional resources.
That's a big deal, because if compute keeps getting cheaper but the algorithms can't really leverage it, you haven't done much. But if ML folks look at the resources thrown at GPT-3 and say "the curve isn't bending!" it could be a sign that we can still get meaningful performance increases from moar power.
many thanks!
Scott seems to take from this graph that it supports the "algorithms have a range of compute where they're useful" thesis. But I see it as opposing that.
First, the most modern algorithms are doing much better than the older ones *at low compute regimes* so the idea that we nearly immediately discover the best algorithms for a given compute regime once we're there appears to be false - at least we didn't manage to do that back in 1995.
Second, the regimes where increased computation gives a benefit to these algorithms seem pretty stable. It's just that newer algorithms are across-the-board better. I guess it's hard to compare a 100 ELO increase at 2000 ELO to a 100 ELO increase at 3000 ELO, but I don't really see any evidence in the plot that newer algorithms *scale* better with more compute. If anything, it's that they scale better at low compute regimes, which lends itself more to a Yudkowskian conclusion.
Am I misinterpreting this?
I agree with you. If it were really the case that "once the compute is ready, the paradigm will appear", I would expect to see all of the curves on this graph intersect each other, with each engine having a small window for which it dominates the ELO roughly corresponding to the power of computers at the time it was made.
I'd expect that the curves for, say, image recognition tasks, *would* intersect, particularly if the training compute is factored in.
But the important part this graph shows is: the difference between algorithms isn't as large as the difference between compute (although the relative nature of ELO makes this less obvious).
I think those algorithms have training baked in, so a modern trained net does really well even with low compute (factor of 1000 from hardware X software), but the limit on how good an algo you could train was a lot lower in the past (factor of 50 from software alone)
> But if ML folks look at the resources thrown at GPT-3 and say "the curve isn't bending!" it could be a sign that we can still get meaningful performance increases from moar power.
I don't follow the space closely, but I think this is exactly what ML folks are saying about GPT-3.
Basically a Gwern quote IIRC, but I wouldn't hold him responsible for my half-rememberings!
It seems easier to just have children.
This made me laugh
If you think about it long enough it should.
When we say we want AIs what we are really saying is we want an AI that is better than humans not just an AI. But there are geniuses being born every day.
But what we really want is to understand consciousness and to solve particular problems faster than we can at the moment.
We wanted to fly like the birds but we really did not invent an artificial bird. We wanted to work as hard as a horse, but did not invent an artificial horse.
The question of consciousness is a legitimate and important question.
I think this is an important point. Doing basic research in AI as a way to understand NI makes enormous sense: we understand almost nothing about how our mind works, and if we understood much more we could (one hopes) make enormous strides in education, sociology, functional political institutions, the treatment of mental illness, and the improvement of life for people with mental disabilities (through trauma, birth, or age). We could also optimize the experience and contributions of people who are unusually intelligent, and maybe figure out how to boost our own intelligence, via training or genetic manipulation. Exceedingly valuable stuff.
But as a technological end goal, an actual deployed mass-manufactured tool, it seems highly dubious. There are only three cases to consider:
(1) We can build a general AI that is like us, but much dumber. Why bother? (There's of course many roles for special-purpose AIs that can do certain tasks way better than we can, but don't have our general-purpose thinking abilities.)
(2) We can build a general AI that is like us, and about as smart. Also seems mostly pointless, unless we can do it far cheaper than we can make new people, and unless it is so psychologically different it doesn't mind being a slave.
(3) We can build a general AI that is much smarter than us. This seems a priori unlikely, in the sense that if we understood intelligence sufficiently well to do this, why not just increase our own intelligence first? Got to be easier, since all you need to do is tweak the DNA appropriately. And even if we could build one, why would we want to either enslave a hyperintelligent being or become its slaves, or pets? Even a bad guy wouldn't do that, since a decent working definition of "bad guy" is "antisocial who doesn't want to recognize any authority" and building a superintelligent machine to whom to submit is rather the opposite of being a pirate/gangster boss/Evil Overlord.
I realize plenty of people believe there is case (2b) we can build an AI that is about as smart as us, and then *it* can rebuild itself (or build another AI) that is way smarter than us, but I don't believe in this bootstrapping theory at all, for the same reason I find (3) dubious a priori. The idea that you can build a very complex machine without any good idea of how it works seems silly.
>The idea that you can build a very complex machine without any good idea of how it works seems silly.
But that's essentially what ML does. If there was a good idea of how a solution to a given problem works, it would be implemented via traditional software development instead.
I disagree. I understand very well what a ML program does. I may not have all the details at my fingertips, but that is just as meaningless as the fact that I don't know where each molecule goes when gasoline combusts with oxygen. Sure, there's a lot of weird ricochets and nanometer-scale fluctuations that go on about which I might not know, absent enormous time and wonderful microscopes -- but saying I don't know the details is nowhere near saying I don't know what's going on. I know in principle.
Same with ML. I may not know what this or that node weight is, and to figure out why it is what it is, i.e. trace it back to some pattern in the training data, would take enormous time and painstaking not to say painful attention to itsy bitsy detail, but that is a long way from saying I don't know what it's doing. I do in principle.
I'll add this dichotomy has existed in other areas of science and technology for much longer, and it doesn't bother us. Why does a particular chemical reaction happen in the pathway it does, exactly? We can calculate that from first principles, with a big enough computer to solve a staggeringly huge quantum chemistry problem. But if you wanted to trace back this wiggle in the preferred trajectory to some complex web of electromagnetic forces between electrons, it would take enormous time and devotion to detail. So we don't bother, because this detail isn't very important. We understand the principles by which quantum mechanics determines the reaction path, and we can build a machine that finds that path by doing trillions of calculations which we do not care to follow, and maybe the path is not what raw intuition suggests (which is why we do the calculation at all, usually), but at no point here do we say we do not *understand* why the Schroedinger equation is causing this H atom to move this way instead of that. I don't really see why we would attribute some greater level of mystic magic to a neural network pattern-recognition algorithm.
>...but that is a long way from saying I don't know what it's doing. I do in principle.
Knowing in principle seems like a much lower bar than having a good idea how something works.
>I don't really see why we would attribute some greater level of mystic magic to a neural network pattern-recognition algorithm.
Intelligence is an emergent phenomenon (cf., evolution producing hominid minds), so what magic do you see being attributed beyond knowledge of how to build increasingly complex pattern-recognition algorithms?
That's not what ML does. ELI5, ML is about as well understood as the visual cortex, it's built like a visual cortex, and it solves visual cortex style problems.
People act like just because each ML model is too large and messy to explain, all of ML is a black box. It's not. Each model of most model classes (deep learning, CNN, RNN, gbdt, whatever you want) is just a layered or otherwise structured series of simple pattern recognizers, each recognizer is allowed to float towards whatever "works" for the problem at hand, and all the recognizers are allowed to influence each other in a mathematically stable (ie convergent) format.
End result of which is you get something that works like a visual cortex: it has no agency and precious little capacity for transfer learning, but has climbed the hill to solve that one problem really well.
This is a very well understood space. It's just poorly explained to the general public.
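As a toy illustration of one such "simple pattern recognizer" floating toward whatever works, here is a single logistic unit trained by plain gradient descent to recognize the OR pattern. Everything specific here (the data, the learning rate, the epoch count, the function names) is an assumption chosen for illustration; real models stack and wire together very many units like this.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Toy data (an assumption for illustration): logical OR of two inputs.
DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]


def train(epochs=2000, lr=0.5):
    """Full-batch gradient descent on log loss for one logistic unit."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for (x1, x2), y in DATA:
            # Prediction minus label: the log-loss gradient w.r.t. the pre-activation.
            err = sigmoid(w1 * x1 + w2 * x2 + b) - y
            g1 += err * x1
            g2 += err * x2
            gb += err
        # Nudge each weight in whichever direction reduces the error.
        w1 -= lr * g1
        w2 -= lr * g2
        b -= lr * gb
    return w1, w2, b


def predict(params, x1, x2):
    w1, w2, b = params
    return sigmoid(w1 * x1 + w2 * x2 + b)
```

Nothing in the loop knows what "OR" means; the weights just drift until the outputs match the examples, which is the hill-climbing behaviour described above.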
My initial objection to Carl was based on a difference of opinion about what constitutes a "good idea of how it works". You appear to share his less-restrictive understanding of the phrase.
N.B., I am a working data scientist who was hand coding CV convolutions two decades ago.
> This seems a priori unlikely, in the sense that if we understood intelligence sufficiently well to do this, why not just increase our own intelligence first? Got to be easier, since all you need to do is tweak the DNA appropriately.
I think this is mistaken. For reasons that Scott has talked about elsewhere, the fact that we aren't *already* smarter suggests that we're near a local optimum for our physiology / brain architecture / etc, or evolution would have made it happen; eg it may be that a simple tweak to increase our intelligence would result in too much mental illness. Finding ways to tweak humans to be significantly smarter without unacceptable tradeoffs may be extremely difficult for that reason.
On the other hand, I see no a priori reason that that local optimum is likely to be globally optimal. So conditional on building GAI at all, I see no particular reason to expect a specific barrier to increasing past human-level intelligence.
Oh I wouldn't disagree that it's likely to be hard to increase human intelligence. Whether what we mean by "intelligence" -- usually, purposeful conscious reasoning and imagination -- has been optimized by Nature is an interesting and unsolved question, inasmuch as we don't know whether that kind of intelligence is always a survival advantage. There are also some fairly trivial reasons why Nature may not have done as much as can be done, e.g. the necessity for having your head fit through a vagina during birth.
But yeah I'd take a guess that it would be very hard. I only said that hard as it is, building a brand-spanking new type of intelligence, a whole new paradigm, is likely to be much harder.
Anyway, if we take a step back, the idea that improving the performance of an engine that now exists is a priori less likely than inventing a whole new type of engine is logically incoherent.
"if we understood intelligence sufficiently well to do this, why not just increase our own intelligence first?"
Because the change is trivial in computer code, but hard in DNA.
For example, maybe a neural structure in 4d space works really well. We can simulate that on a computer, but good luck with the GM.
Maybe people do both, but the human takes 15-20 years to grow up, whereas the AI "just" takes billions of dollars and a few months.
Because we invented an algorithm that is nothing at all like a human mind, and works well.
That would be convincing if anyone had ever written a computer code that had even the tiniest bit of awareness or original thought, no matter how slow, halting, or restricted in its field of competence. I would say that the idea that a computer can be programmed *at all* to have original thought (or awareness) is sheer speculation, based on a loose analogy between what a computer does and what a brain does, and fueled dangerously by a lot of metaphorical thinking and animism (the same kind that causes humans to invent local conscious-thinking gods to explain why it rains when it does, or eclipses, or why my car keys are always missing when I'm in a hurry).
Deep blue can produce chess moves that are good, and aren't copies of moves humans made. GPT3 can come up with new and semi-sensible text.
Can you give a clear procedure for measuring "Original thought"?
Before deep blue, people were arguing that computers couldn't play chess because it required too much "creative decision making" or whatever.
I think you are using "Original thought" as a label for anything that computers can't do yet.
You have a long list of things humans can do. When you see a simple dumb algorithm that can play chess, you realize chess doesn't require original thought, just following a simpleish program very fast. Then GPT3 writes kind of ok poetry and you realize that writing ok poetry (given lots of examples) doesn't require original thought.
I think there is a simplish program for everything humans do, we just haven't found it yet. I think you think there is some magic original thought stuff that only humans have, and also a long list of tasks like chess, go, image recognition etc that we can do with the right algorithm.
"Because the change is trivial in computer code, but hard in DNA."
In any large software shop which relies on ML to solve "rubber hits the road" problems, not toy problems, it takes literally dozens of highly paid full time staff to keep the given ML from falling over on its head every *week* as the staff either build new models or coddle old ones in an attempt to keep pace with ever changing reality.
And the work is voodoo, full of essentially bad software practices and contentious statistical arguments and unstable code changes.
Large scale success with ML is about as far from "the change is trivial in computer code" as it is possible to be in the field of computer science.
I thought about this specifically when reading that we could spend quadrillions of dollars to create a supercomputer capable of making a single human level AI.
To be fair, once made that AI could be run on many different computers (which would each be far less expensive), whereas we don't have a copy-paste function for people.
But more importantly, that way of thinking is wrong (edit: I mean the quadrillion dollars thing) and I predict humanity is about to reduce per-model training budgets at the high end. Though wealthy groups' budgets will jump temporarily whenever they suspect they might have invented AGI, or something with commercialization potential.
By "reduce per-model training budgets", do you mean "reduce how much we're willing to spend" or "reduce how much we need to spend"?
I mean that a typical wealthy AI group will reduce the total amount it actually spends on models costing over ~$500,000 each, unless they suspect they might have invented AGI, or something with commercialization potential, and even in those cases they probably won't spend much more than before on a single model (but if they do, I'm pretty sure they won't get a superintelligent AGI out of it). (edit: raised threshold 100K=>500K. also, I guess the superjumbo model fad might have a year or two left in it, but I bet it'll blow over soon)
The math and science are very difficult for me. So, I'm glad you are there to interpret it from a super layperson's perspective!
Could you point me to WHY AI scares you? I assume you've written about your fears.
Or should I remain blissfully ignorant?
He has written about this before on his previous blog, but even more helpfully summarized the general concerns here https://www.lesswrong.com/posts/LTtNXM9shNM9AC2mp/superintelligence-faq
Consider especially parts 3.1.2 thru 4.2
This is pretty out of date, but I guess it will do until/unless I write up something else.
Thanks!
I obviously cannot speak to why AI scares Scott, but there are some theoretical and practical reasons to consider superhuman AI a highly-scary thing should it come into existence.
Theoretical:
Many natural dangers that threaten humans do not threaten humanity, because humanity is widely dispersed and highly adaptive. Yellowstone going off or another Chicxulub impactor striking the Earth would be bad, but these are not serious X-risks because humanity inhabits six continents (protecting us from local effects), has last-resort bunkers in many places (enabling resilience against temporary effects) and can adapt its plans (e.g. farming with crops bred for colder/warmer climates).
These measures don't work, however, against other intelligent creatures; there is no foolproof plan to defeat an opponent with similar-or-greater intelligence and similar-or-greater resources. For the last hundred thousand years or so, this category has been empty save for other humans and as such humanity's survival has not been threatened (the Nazis were an existential threat to Jews, but they were not an existential threat to humanity because they themselves were human). AGI, however, is by definition an intelligent agent that is not human, which makes human extinction plausible (other "force majeure" X-risks include alien attack and divine intervention).
Additionally, many X-risks can be empirically determined to be incredibly unlikely by examining history and prehistory. An impact of the scale of that which created Luna would still be enough to kill off humanity, but we can observe that these don't happen often and there is no particular reason for the chance to increase right now. This one even applies to alien attack and divine intervention, since presumably these entities would have had the ability to destroy us since prehistory and have reliably chosen not to (as Scott pointed out in Don't Fear the Filter back on SSC, if you think humans are newly a threat to interstellar aliens or to God, you are underestimating interstellar aliens and God). But it doesn't apply to AI - or at least, not to human-generated AI (alien-built AI is not much different from aliens in this analysis). Humans haven't built (human-level or superhuman) AI before, so we don't have a track record of safety.
So the two basic heuristics that rule most things out as likely X-risks don't work on AI. This doesn't prove that AI *will* wipe out humanity, but it's certainly worrying.
Practical:
- AI centralises power (particularly when combined with robotics). Joe Biden can't kill all African-Americans (even if he wanted to, which he presumably does not), because he can't kill them all himself and if he told other people to do it they'd stop listening to him. Kim Jong-un can kill a lot of his people, because the norms are more permissive to him doing so, but he still can't literally depopulate North Korea because he still needs other people to follow his orders and most won't follow obviously-self-destructive orders. But if Joe Biden or Kim Jong-un had a robot military, they could do it. No monarch has ever had the kind of power over their nation that an AI-controlled robot army can give. Some people can be trusted with that kind of power; most can't.
- Neural-net architecture is very difficult to interrogate. It's hard enough to tell if explicit code is evil or not, but neural nets are almost completely opaque - the whole point is that they work without us needing to know *how* they work. Humans can read each other reasonably well despite this because evolution has trained us quite specifically to read other humans; that training is at best useless and at worst counterproductive when trying to read a potentially-deceptive AI. So there's no way to know whether a neural-net AI can be trusted with power either; it's basically a matter of plug-and-pray (you could, of course, train an AI to interrogate other AIs, but the interrogating AI itself could be lying to you).
Very helpful to my understanding why AI is a unique threat. Thanks for this. You explain it very well. Although now when i see video clips of kids in robot competitions, my admiration will be tinged with a touch of foreboding.
Don't be tinged by that foreboding. If you read a bit about superintelligence it becomes clear that it's not going to come from any vector that's typically imagined (terminator or black mirror style robots).
There are plenty of ideas of more realistic ways an AGI escapes confinement and gains access to the real world, a couple of interesting ones I read were it solving the protein folding problem, paying or blackmailing someone over the internet to mix the necessary chemicals, and it creates nanomachines capable of anything. Another was tricking a computer scientist with a perfect woman on a VR headset.
In fact it probably won't be any of these things, after all, it's a super intelligence: whatever it creates to pursue its goals will be so beyond our understanding that it's meaningless to predict what it will do other than as a bit of fun or creative writing exercise.
Let me know if you want links to those stories/ideas, I should have them somewhere. Superintelligence by Nick Bostrom is a good read, although quite heavy. I prefer Scott's stuff haha.
The hypothetical "rogue superintelligent AGI with no resources is out to kill everyone, what does it do" might not be likely to go that way, but that's hardly the only possibility for "AI causes problems". Remote-control killer robots are already a thing (and quite an effective thing), militaries have large budgets, and plugging an AI into a swarm of killbots does seem like an obvious way to improve their coordination. PERIMETR/Dead Hand was also an actual thing for a while.
The "killbots" can't load their own ordnance or even fill their own fuel tanks, which is going to put a limit on their capabilities.
> solving the protein folding problem, paying or blackmailing someone over the internet to mix the necessary chemicals, and it creates nanomachines capable of anything
Arguably the assumption that "nanomachines capable of anything" can even exist is a big one. After all, in the Smalley - Drexler debate Smalley was fundamentally right and Drexlerian nanotech is not really compatible with known physics and chemistry.
Offering the opposite take: https://idlewords.com/talks/superintelligence.htm
(Note this essay is extremely unpopular around these parts, but also, fortunately, rationalists are kind enough to let it be linked!)
1) I mean, yes, people get annoyed when you explain in as many words that you are strawmanning them in order to make people ignore them.
2) There are really two factions to the AI alarmists (NB: I don't intend negative connotations there, I just mean "people who are alarmed and want others to be alarmed") - the ones who want to "get there first and do it right" and the ones who want to shut down the whole field by force. You have something of a case against the former but haven't really devoted any time to the latter.
Generally I think that the paradigm shifts argument is convincing, and so all this business of trying to estimate when we will have a certain number of FLOPS available is a bit like trying to estimate when fusion will become widely available by trying to estimate when we will have the technology to manufacture the magnets at scale.
However, I disagree with Eliezer that this implies shorter timelines than you get from raw FLOPS calculations - I think it implies longer ones, so would be happy to call the Cotra report's estimate a lower bound.
>she says that DeepMind’s Starcraft engine has about as much inferential compute as a honeybee and seems about equally subjectively impressive. I have no idea what this means. Impressive at what? Winning multiplayer online games? Stinging people?
Swarming
Building hives
You people are all great.
It plays Zerg well and Terran for shit.
Protoss, you say? Everyone knows Protoss in SC2 just go air.
Yes, you should care. The difference between 50% by 2030 and 50% by 2050 matters to most people, I think. In a lot of little ways. (And for some people in some big ways.)
For those trying to avert catastrophe, money isn't scarce, but researcher time/attention/priorities is. Even in my own special niche there are way too many projects to do and not enough time. I have to choose what to work on and credences about timelines make a difference. (Partly directly, and partly indirectly by influencing credences about takeoff speeds, what AI paradigm is likely to be the relevant one to try to align, etc.)
EDIT: Example of a "little" way: If my timelines went back up to 30 years, I'd have another child. If they had been at 10 years three years ago, I would currently be childless.
Why does your child-having depend on your timelines? I'm considering a similar question now and was figuring that if bringing a child into the world is good, it will be half as good if the kid lives 5 years as if they live 10, but at no point does it become bad.
This would be different if I thought I had an important role in aligning AI that having a child would distract me from; maybe that's our crux?
I myself am pro bringing in another person to fight the good fight. If it were me being brought in I would find it an honor, rather than damning. My crux is simply that I am too busy to rear more humans myself.
FWIW I totally agree
Psst… kids are awesome (for whatever points a random Internet guy adds to your metrics)
I'm not sure it is rational / was rational. I probably shouldn't have mentioned it. Probably an objective, third-party analysis would either conclude that I should have kids in both cases or in neither case.
However the crux you mention is roughly right. The way I thought of it at the time was: If we have 30 years left then not only will they have a "full" life in some sense, but they may even be able to contribute to helping the world, and the amount of my time they'd take up would be relatively less (and the benefits to my own fulfillment and so forth in the long run might even compensate) and also the probability of the world being OK is higher and there will be more total work making it be OK and so my lost productivity will matter much less...
(Apologies if this is a painful topic. I'm a parent and genuinely curious about your thinking)
Would you put a probability on their likelihood of survival in 2050? (i.e., are you truly operating from the standpoint that your children have a 40 or 50 percent chance of dying from GAI around 2050?)
Yes, something like that. If I had Ajeya's timelines I wouldn't say "around 2050" I would say "by 2050." Instead I say 2030-ish. There are a few other quibbles I'd make as well but you get the gist.
Thanks for answering.
> money isn't scarce, but researcher time/attention/priorities is.
I don't get the "MIRI isn't bottlenecked by money" perspective. Isn't there a well-established way to turn money into smart-person-hours by paying smart people very high salaries to do stuff?
My limited understanding is: It works in some domains but not others. If you have an easy-to-measure metric, you can pay people to make the metric go up, and this takes very little of your time. However, if what you care about is hard to measure / takes lots of time for you to measure (you have to read their report and fact-check it, for example, and listen to their arguments for why it matters) then it takes up a substantial amount of your time, and that's if they are just contractors who you don't owe anything more than the minimum to.
I think another part of it is that people just aren't that motivated by money, amazingly. Consider: If the prospect of getting paid a six-figure salary to solve technical alignment problems worked to motivate lots of smart people to solve technical alignment problems... why hasn't that happened already? Why don't we get lots of applicants from people being like "Yeah I don't really care about this stuff I think it's all sci-fi but check out this proof I just built, it extends MIRI's work on logical inductors in a way they'll find useful, gimme a job pls." I haven't heard of anything like that ever happening. (I mean, I guess the more realistic case of this is someone who deep down doesn't really care but on the exterior says they do. This does happen sometimes in my experience. But not very much, not yet, and also the kind of work these kinds of people produce tends to be pretty mediocre.)
Another part of it might be that the usefulness of research (and also manager/CEO stuff?) is heavy-tailed. The best people are 100x more productive than the 95th percentile people who are 10x more productive than the 90th percentile people who are 10x more productive than the 85th percentile people who are 10x more productive than the 80th percentile people who are infinitely more productive than the 75th percentile people who are infinitely more productive than the 70th percentile people who are worse than useless. Or something like that.
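To put rough numbers on "heavy-tailed": here is a minimal sketch that assumes research output follows a Pareto distribution. That distribution, the function name `top_share`, and the 80/20 shape parameter are all illustrative assumptions, not anything claimed in the comment above. For a Pareto distribution with shape `alpha > 1`, the top fraction `p` of people accounts for `p ** (1 - 1/alpha)` of total output.

```python
import math

def top_share(top_fraction, alpha):
    """Share of total output produced by the top `top_fraction` of people,
    assuming output is Pareto-distributed with shape `alpha` > 1 (an assumption)."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    if alpha <= 1:
        raise ValueError("alpha must exceed 1 so that mean output is finite")
    return top_fraction ** (1 - 1 / alpha)

# Shape chosen so that the top 20% produce 80% of the output (the "80/20" case).
alpha_80_20 = math.log(5) / math.log(4)

for fraction in (0.20, 0.05, 0.01):
    share = top_share(fraction, alpha_80_20)
    print(f"top {fraction:.0%} of people -> {share:.0%} of output")
```

Under this assumed shape, even the top 1% of people account for more than half of total output, which is the flavor of claim being made here, though the real distribution for research is anyone's guess.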
Anyhow it's a mystery to me too and I'd like to learn more about it. The phenomenon is definitely real but I don't really understand the underlying causes.
> Consider: If the prospect of getting paid a six-figure salary to solve technical alignment problems worked to motivate lots of smart people to solve technical alignment problems... why hasn't that happened already?
I mean, does MIRI have loads of open, well-paid research positions? This is the first I'm hearing of it. Why doesn't MIRI have an army of recruiters trolling LinkedIn every day for AI/ML talent the way that Facebook and Amazon do?
Looking at MIRI's website it doesn't look like they're trying very hard to hire people. It explicitly says "we're doing less hiring than in recent years". Clicking through to one of the two available job ads ( https://intelligence.org/careers/research-fellow/ ) it has a section entitled "Our recommended path to becoming a MIRI research fellow" which seems to imply that the only way to get considered for a MIRI research fellow position is to hang around doing a lot of MIRI-type stuff for free before even being considered.
None of this sounds like the activities of an organisation that has a massive pile of funding that it's desperate to turn into useful research.
I can assure you that MIRI has a massive pile of funding and is desperate for more useful research. (Maybe you don't believe me? Maybe you think they are just being irrational, and should totally do the obvious thing of recruiting on LinkedIn? I'm told OpenPhil actually tried something like that a few years ago and the experiment was a failure. I don't know but I'd guess that MIRI has tried similar things. IIRC they paid high-caliber academics in relevant fields to engage with them at one point.)
Again, it's a mystery to me why it is, but I'm pretty sure that it is.
Some more evidence that it's true:
--Tiny startups beating giant entrenched corporations should NEVER happen if this phenomenon isn't real. Giant entrenched corporations have way more money and are willing to throw it around to improve their tech. Sure maybe any particular corporation might be incompetent/irrational, but it's implausible that all the major corporations in the world would be irrational/incompetent at the same time so that a tiny startup could beat them all.
--Similar things can be said about e.g. failed attempts by various governments to make various cities the "new silicon valley" etc.
Maybe part of the story is that research topics/questions are heavy-tailed-distributed in importance. One good paper on a very important question is more valuable than ten great papers on a moderately important question.
> I can assure you that MIRI has a massive pile of funding and is desperate for more useful research. (Maybe you don't believe me? Maybe you think they are just being irrational
Maybe they're not being irrational, they're just bad at recruiting. That's fine, that's what professional recruiters are for. They should hire some.
If MIRI wants more applicants for its research fellow positions it's going to have to do better than https://intelligence.org/careers/research-fellow/ because that seems less like a genuine job ad and more like an attempt to get naive young fanboys to work for free in the hopes of maybe one day landing a job.
Why on Earth would an organisation that is serious about recruitment tell people "Before applying for a fellowship, you’ll need to have attended at least one research workshop"? You're competing for the kind of people who can easily walk into a $500K+ job at any FAANG, why are you making them jump through hoops?
MIRI doesn't want people who can walk into a FAANG job, they want people who can conduct pre-paradigmatic research. "Math PhD student or postdoc" would be a more accurate desired background than "FAANG software engineer" (or even "FAANG ML engineer"), but still doesn't capture the fact that most math PhDs don't quite fit the bill either.
If you think professional recruiters, who can't reliably distinguish good from bad among the much more commoditized "FAANG software engineer" profile, will be able to find promising candidates for conducting novel AI research - well, I don't want to say it's impossible. But the problem is doing that in a way that isn't _enormously costly_ for people already in the field; there's no point in hiring recruiters if you're going to spend more time filtering out bad candidates than if you'd just gone looking yourself (or not even bothered and let high-intent candidates find you).
Holy shit. That's not a job posting. That's instructions for joining a cult. Or a MLM scam.
I think there is an interesting question about how one moves fields into this area. I imagine that having people who are intelligent but with a slightly different outlook would be useful. Being mentored while you get up to speed and write your first paper or two is important I think. I'm really not sure how I would move into a paid position for example without basically doing an unpaid and isolated job in my spare time for a considerable amount of time first.
For what it is worth, I agree completely with Melvin on this point - the job advert pattern matches to a scam job offer to me and certainly does not pattern match to any sort of job I would seriously consider taking. Apologies to be blunt, but you write "it's a mystery to me why it is", so I'm trying to offer an outside perspective that might be helpful.
It is not normal to have job candidates attend a workshop before applying for a job in prestigious roles, but it is very normal to have candidates attend a 'workshop' before pitching them an MLM or timeshare. It is even more concerning that details about these workshops are pretty thin on the ground. Do candidates pay to attend? If so this pattern matches advance-fee scams. Even if they don't pay to attend, do they pay flights and airfare? If so MIRI have effectively managed to limit their hire pool to people who live within commuting distance of their offices or people who are going to work for them anyway and don't care about the cost.
Furthermore, there's absolutely no indication how I might go about attending one of these workshops - I spent about ten minutes trying to google details (which is ten minutes longer than I have to spend to find a complete list of all ML engineering roles at Google / Facebook), and the best I could find was a list of historic workshops (last one in 2018) and a button saying I should contact MIRI to get in touch if I wanted to attend one. Obviously I can't hold the pandemic against MIRI not holding in-person meetups (although does this mean they deliberately ceased recruitment during the pandemic?), and it looks like maybe there is a thing called an 'AI Risk for Computer Scientists' workshop which is maybe the same thing (?) but my best guess is that the next workshop - which is a prerequisite for me applying for the job - is an unknown date no more than six months into the future. So if I want to contribute to the program, I need to defer all job offers for my extremely in-demand skillset for the *opportunity* to apply following a workshop I am simply inferring the existence of.
The next suggested requirement indicates that you also need to attend 'several' meetups of the nearest MIRIx group to you. Notwithstanding that 'do unpaid work' is a huge red flag for potential job applicants, I wonder if MIRI have seriously thought about the logistics of this. I live in the UK where we are extremely fortunate to have two meetup groups, both of which are located in cities with major universities. If you don't live in one of those cities (or, heaven forbid, are French / German / Spanish / any of the myriad of other nationalities which don't have a meetup anything less than a flight away) then you're pretty much completely out of luck in terms of getting involved with MIRI. From what I can see, the nearest meetup to Terence Tao's offices at UCLA is six hours away by car. If your hiring strategy for highly intelligent mathematical researchers excludes Terence Tao by design, you have a bad hiring strategy.
The final point in the 'recommended path' is that you should publish interesting and novel points on the MIRI forums. Again, high quality jobs do not ask for unpaid work before the interview stage; novel insights are what you pay for when you hire someone.
So to answer your question - yes there are many subtle and interesting factors as to why top companies cannot attract leading talent despite paying a lot of money to that talent and paying a lot of money to develop meta-knowledge about how to attract talent. However just because top companies struggle to attract talent and MIRI struggles to attract talent doesn't mean MIRI is operating on the same productivity frontier as top tech companies. From the public-facing surface of MIRI's talent pipeline alone there is enough to answer the question of why they're struggling to match funds to talent, and I don't doubt that a recruitment consultant going under the hood could find many more areas for concern in the talent pipeline.
Why *shouldn't* MIRI try doing the very obvious thing and retaining a specialist recruitment firm to headhunt talent for them, pay that talent a lot of money to come and work for them, and then see if the approach works? A retained executive search might cost perhaps $50,000 per hire at the upper end, perhaps call it $100,000 because you indicate there may be a problem with inappropriate CVs making it through anything less than a gold-plated search. This is a rounding error when you're talking about $2bn unmatched funding. I don't see why this approach is too ridiculous even to consider, and instead the best available solution is to have a really unprofessional hiring pipeline directly off the MIRI website.
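For scale, the "rounding error" arithmetic can be spelled out. This is only a sketch using the two figures quoted in the comment above (the $100,000 search cost and the $2bn of unmatched funding), both of which are that commenter's estimates; the variable names are mine.

```python
# Figures are the commenter's estimates, not verified numbers.
search_cost_per_hire = 100_000       # dollars, the "gold-plated" retained search
unmatched_funding = 2_000_000_000    # dollars, the "$2bn" figure

# Fraction of the funding consumed by a single retained search.
cost_share = search_cost_per_hire / unmatched_funding

# Number of such searches that 1% of the funding would cover (integer arithmetic).
searches_per_one_percent = (unmatched_funding // 100) // search_cost_per_hire

print(f"one search costs {cost_share:.4%} of the funding")
print(f"1% of the funding covers {searches_per_one_percent} searches")
```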
I believe the reason they aren't selecting people is simply that MIRI is run by deeply neurotic people who cannot actually accept any answer as good enough, and thus are sitting on large piles of money they insist they want to give people only to refuse them in all cases. Once you have done your free demonstration work, you are simply told that, sorry, you didn't turn out to be smarter than every other human being to ever live by a minimum of two orders of magnitude and thus aren't qualified for the position.
Perhaps they should get into eugenics and try breeding for the Kwisatz Haderach.
Although your take is deeply uncharitable, I think the basis of your critique is true and stems from a different problem. Nobody knows how to create a human level intelligence, so how could you create safety measures based on how such an intelligence would work? They don't know. So they need to hire people to help them figure that out, which makes sense. But since they don't know, even at an introductory level, they cannot actually evaluate the qualifications of applicants. Hiring a search firm would result in the search firm telling MIRI that MIRI doesn't know what it needs. You'd have to hire a firm that knows what MIRI needs, probably by understanding AI better than they do, in order to help MIRI get what it needs. Because that defeats the purpose of MIRI, they spin their wheels and struggle to hire people.
They're going to have a problem with the KH-risk people.
> However, if what you care about is hard to measure / takes lots of time for you to measure then it takes up a substantial amount of your time.
One solution here would be to ask people to generate a bunch of alignment research, then randomly sample a small subset of that research and subject it to costly review, then reward those people in proportion to the quality of the spot-checked research.
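A minimal sketch of that spot-check scheme, under assumptions of my own: each author's sampled items are reviewed by some costly function `review` returning a quality score, and a fixed `budget` is split in proportion to estimated total quality (mean sampled quality times number of items). The names `spot_check_rewards`, `review`, `budget`, and `sample_size` are hypothetical.

```python
import random

def spot_check_rewards(submissions, review, budget, sample_size, seed=None):
    """Split `budget` among authors based on a random spot check.

    submissions: dict mapping author -> list of work items
    review: costly function mapping one item -> quality score (>= 0)
    sample_size: maximum number of items reviewed per author
    """
    rng = random.Random(seed)
    estimated = {}
    for author, items in submissions.items():
        if not items:
            estimated[author] = 0.0
            continue
        # Review only a random subset; this is where the cost saving comes from.
        sampled = rng.sample(items, min(sample_size, len(items)))
        mean_quality = sum(review(item) for item in sampled) / len(sampled)
        # Extrapolate from the sample to the author's whole body of work.
        estimated[author] = mean_quality * len(items)

    total = sum(estimated.values())
    if total == 0:
        return {author: 0.0 for author in submissions}
    return {author: budget * value / total for author, value in estimated.items()}

# Toy usage: items are stand-ins whose "review" is just their own value.
toy = {"alice": [1.0, 1.0, 1.0, 1.0], "bob": [0.5, 0.5, 0.5, 0.5]}
print(spot_check_rewards(toy, review=lambda item: item, budget=300, sample_size=2, seed=0))
```

The design choice worth noting is that reviewers only ever see `sample_size` items per author, so review cost stays flat as output grows, at the price of noisier rewards for authors whose work varies a lot in quality.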
But that might not even be necessary. Intuitively, I expect that gathering really talented people and telling them to do stuff related to X isn't that bad of a mechanism for getting X done. The Manhattan Project springs to mind. Bell Labs spawned an enormous amount of technical progress by collecting the best people and letting them do research. I think the hard part is gathering the best people, not putting them to work.
> If the prospect of getting paid a six-figure salary to solve technical alignment problems worked to motivate lots of smart people to solve technical alignment problems... why hasn't that happened already?
Because the really smart and conscientious people are already making six figures. In private correspondence with a big LessWrong user (>10k karma), they told me that the programmers they knew that browsed LW were all very good programmers, and that the _worst_ programmer that they knew that read LW worked as a software engineer at Microsoft. If we equate "LW readers" with "people who know about MIRI", then virtually all the programmers who know about MIRI are already easily clearing six figures. You're right that the usefulness of researchers is heavy-tailed. If you want that 99.99th percentile guy, you need to offer him a salary competitive with those of FAANG companies.
If you equate "people who know about MIRI" with "LW readers", then maybe put some money and effort into making MIRI more widely known. Hopefully in a positive way, of course.
You probably know more about the details of what has or has not been tried than I do, but if this is the situation we really should be offering like $10 million cash prizes no questions asked for research that Eliezer or Paul or whoever says moves the ball on alignment. I guess some recently announced prizes are moving us in this direction, but the amount of money should be larger, I think. We have tons of money, right?
They (MIRI in particular) also have a thing about secrecy. Supposedly much of the potentially useful research not only shouldn't be public, even hinting that this direction might be fruitful is dangerous if the wrong people hear about it. It's obviously very easy to interpret this uncharitably in multiple ways, but they sure seem serious about it, for better or worse (or indifferent).
This whole thread has convinced me that MIRI is probably the biggest detriment in the world for AI alignment research, by soaking up so much of the available funding and using it so terribly.
The world desperately needs a MIRI equivalent that is competently run. And which absolutely never ever lets Eliezer Yudkowsky anywhere near it.
My take is increasingly that this institution has succeeded in isolating itself for poorly motivated reasons (what if AI researchers suspected our ideas about how to build AGI and did them "too soon"?) and seems pretty explicitly dedicated to developing thought-control tech compatible with some of the worst imaginable futures for conscious subjects (think dual use applications -- if you can control the thoughts of your subject intelligence with this kind of precision, what else can you control?).
It hasn't "soaked up so much of the available funding." Other institutions in this space have much more funding, and in general are also soaking in cash.
(I disagree with your other claims too of course but don't have the energy or time to argue.)
Give Terence Tao $500,000 to work on AI alignment six months a year, leaving him free to research crazy Navier-Stokes/Halting problem links the rest of his time... If money really isn't a problem, this kind of thing should be easy to do.
Literally that idea has been proposed multiple times before that I know of, and probably many more times many years ago before I was around.
> a six-figure salary to solve technical alignment problems
Wait, what? If I knew that I might've signed the f**k up! I don't have experience in AI, but still! Who's offering six figures?
Every time I am confused about MIRI's apparent failures to be an effective research institution I notice that the "MIRI is a social club for a particular kind of nerd" model makes accurate predictions.
You could pay me to solve product search ranking problems, even though I find the end result distasteful. In fact, if you bought stuff online, maybe you did pay me!
You couldn't pay me to work on alignment. I'm just not aligned. Many people aren't.
Fighting over made up numbers seems so futile.
But I don't understand this anyway.
Why do the dangers posed by AI need a full/transformative AI to exist? My total layman's understanding of these fears is that y'all are worried an AI will be capable of interfering with life to an extent people cannot stop. It's irrelevant if the AI "chooses" to interfere or there's some programming error, correct? So the question is not, "when will transformative AI exist?" the question is only, "when will computer bugs be in a position to be catastrophic enough to kill a bunch of people?" or, "when will programs that can program better than humans be left in charge of things without proper oversight or with oversight that is incapable of stopping these programming programs?"
Not that these questions are necessarily easier to predict.
A dumber-than-human level AI that (let's say) runs a power plant and has a bug can cause the power plant to explode. After that we will fix the power plant, and either debug the AI or stop using AIs to run power plants.
A smarter-than-human AI that "has a bug" in the sense of being unaligned with human values can fight our attempts to turn it off and actively work to destroy us in ways we might not be able to stop.
But if we are not worried about the bugs in the e.g. global water quality managing program, then an AI as smart as a human is not such a big deal either. There are plenty of smart criminals out there who are unaligned with human values and even the worst haven't managed to wipe out humanity. We need to have an AI smarter than the whole group of AI police before seriously worrying, so maybe we need to multiply our made up number by 1,000?
But to illustrate the bug/AI question. Let's imagine Armybot, a strategy planning simulation program in 2022. And let's say there's a bug and Armybot, which is hooked up to the nuclear command system for proper simulations, runs a simulation IRL and lets off all those nukes. That's an extinction level bug that could happen right now if we were dumb enough.
Now let's imagine Armybot is the same program in 2050 and now it's an AI with the processing power equivalent to the population of a small country. Now the fear is Armybot's desire/bug to nuke the world kicks in (idk why it becomes capable of making independent decisions or having wants just because of more processing power so I'm more comfortable saying there's a bug). But now it can independently connect itself to the nuclear command center with its amazing hacking skills (that it taught itself? that we installed?). That's an extinction level bug too.
So the question is, which bug is more likely?
The general intuition, I believe, is that an AI as smart as a human can quickly become way way smarter than a human, because humans are really hard to improve (evolution has done its best to drill a hole through the gene-performance landscape to where we are, but it's only gotten more stuck over the aeons) and AI tends to be really easy to improve: just throw more cores at it.
If you could stick 10 humans of equal intelligence in a room and get the performance of one human that's 10 iq points smarter than that, then the world would look pretty different. Also we can't sign up for more brain on AWS.
My intuition is that "Just throw more cores at it" is no more likely to improve an AI's intelligence than opening up my skull and chucking in a bunch more brain tissue.
I think you'd have to throw more cores at it _and then_ go through a lengthy process of re-training, which would probably cost another hundred billion dollars of compute time.
It's even worse (or better, I guess, depending on your viewpoint) than that, because cores don't scale linearly; there's a reason why Amazon has a separate data center in every region, and why your CPU and GPU are separate units. Actually it's even worse than that, because even with all those cores, no one knows what "a lengthy process of re-training" to create an AGI would look like, or whether it's even possible without some completely unprecedented advances in computer science.
I think we can safely assume that it is going to be vastly easier than making a smarter human, at least given our political constraints. (Iterated embryo selection etc.) It doesn't matter how objectively hard it is, just who has the advantage, and by how much. Also I think saying we need fundamental advances in CS to train a larger AI given a smaller AI first overlooks the already existing distillation research, and second assumes that the AGI was a one-in-a-hundred stroke of good luck that cannot be reproduced, which seems unlikely to me.
A hundred billion dollars of compute time for training is a fairly enlightening number because it's simultaneously an absurd amount of compute, barely comparable to even the most extravagant training runs we have today, enough to buy multiple cutting edge fabs and therefore all of their produced wafers, while also being an absolutely trivial cost to be willing to pay if you already have AGI and are looking to improve it to ASI. Heck, we've spent half that much just on our current misguided moon mission that primarily exists for political reasons that have nothing to do with trying to go to the moon.
That said, throwing more cores at an AI is by no means necessary, nor even the most relevant way an AI could self-improve, nor actually do we even need to first get AGI before self-improvement becomes a threat. For example, we already have systems that can do pick-and-place for hardware routing better than humans, we don't need AGI to do reinforcement learning, and there are ways in which an AI system could be trained to be more scalable when deployed than humans have evolved to be.
A fairly intelligent AI system finely enough divided to search over the whole of the machine learning literature and collaboratively try out swathes of techniques on a large cluster would not have to be smarter than a human in each individual piece to be more productive at fast research than the rest of humanity. Similarly, it's fairly easy to build AI systems that have an intrinsic ability to understand very high fidelity information that is hard to convey to humans, like AI systems that can look at weights and activations of a neural network and tell you things about its function. It's not hard to imagine that as AI approaches closer to human levels of general reasoning ability, we might be able to build a system that recursively looks at its own weights and activations and optimises them directly in a fine tuned way that is impossible to do with more finite and indivisible human labor. You can also consider systems that scale in ways similar to AlphaZero; again, as these systems approach having roughly human level general reasoning ability in their individual components, the ability for the combined system to be able to reason over vastly larger conceptual spaces in a much less lossy way that has been specifically trained end-to-end for this purpose might greatly exceed what humans can do.
I think people often have a misconception where they consider intelligence to exist purely on a unidimensional line which takes exponential difficulty to progress along. Neither of these is true. It is entirely on trend for AI to have exploitable superiorities as important as its deficiencies, and for progress to speed up rather than slow down as its set of capabilities approaches human equivalence. Feynman exists on the same continuum as everybody else, so there doesn't seem to be a good reason to expect humanity exists at a particularly difficult place for evolution to further improve intelligence. Even if human intelligence did end up being precisely a soft cap to the types of machines we could make, being able to put a large and scalable number of the smartest minds together in a room on demand far exceeds the intellectual might we can pump out of humanity otherwise.
There will be 0 or a few AI's given access to nukes. And hopefully only well tested AI.
If the AI is smart, especially if it's smarter than most humans, and it wants to take over the world and destroy all humans, it's likely to succeed. If you aren't stupid, you won't wire a buggy AI to nukes with no safeguards. But if the AI is smart, it's actively trying to circumvent any safeguard. And whether nukes already exist is less important. It can trick humans into making bioweapons.
"idk why it becomes capable of making independent decisions or having wants just because of more processing power so I'm more comfortable saying there's a bug". Current AI sometimes kind of have wants, like wanting to win at chess, or at least reliably selecting good chess moves.
We already have robot arms programmed to "want" to pick things up. (Or at least search for plans to pick things up.) The difference is that currently our search isn't powerful enough to find plans involving breaking out, taking over the world and making endless robot arms to pick up everything forever.
Defence against a smart adversary is much much harder than defence against random bugs.
> an AI as smart as a human
Scott said "smarter-than-human" (perhaps he means "dramatically smarter"), and I argue downthread that there will never be an AI "as smart as" a human.
I'm unconvinced by AI X-risk in general, but I think I can answer this one: bugs are random. Intelligences are directed. A bad person is more dangerous than a bug at similar levels of resources and control.
No, it can't, because merely being able to compute things faster than a human does not automatically endow the AI with nigh-magical powers -- and most of the magical powers attributed to putative superhuman AIs, from verbal mind control to nanotechnology, would appear to be physically impossible.
Don't get me wrong, a buggy AI could still mess up a lot of power plants; but that's a quantitative increase in risk, not a qualitative one.
An AI doesn't need magical powers to be a huge, even existential threat. It just needs to be really good at hacking and can use the usual human foibles as leverage to get nearly anything it wants: money and blackmail.
Human hackers do that today all the time, with varying degrees of success. They are dangerous, yes, but not an existential threat. If you are proposing that an AI would be able to hack everything everywhere at the same time, then we're back in the magical powers territory.
We're talking about superintelligent AI. Being better than human hackers is a trivial corollary. Exactly what is magical about that?
How much better, exactly ? Is it good enough to hack my Casio calculator watch ? If so, then it's got magical powers, because that watch is literally unhackable -- there's nothing in it to hack. Ok, so maybe not, but is it good enough to gain root access to every computer on the Internet at the same time while avoiding detection ? If so, then it has magical powers of infinite bandwidth, superluminal communication, and whatever else it is that lets it run its code at zero performance penalty. Ok, so maybe it's not quite that good, but it's just faster than average human hackers and better informed about security holes ? Well, then it's about as good as Chinese or Russian state hackers already are today.
In other words, you can't just throw the word "superintelligent" into a sentence as though it was a magic incantation; you still need to explain what the AI can do, and how it can do it (in broad strokes).
Why would an AI want money?
Because money is how anyone or anything acquires resources in the world as it currently exists.
These timelines seem to depend crucially on compute getting much cheaper. Computer chip factories are very expensive, and there are not very many of them. Has anyone considered trying to make it illegal to make compute much cheaper?
Who? You're talking to the small group of researchers and activists who care about this, with a few tens of billions of dollars. How do they make it illegal to make compute much cheaper?
Just offering a concrete policy goal to lobby for. As far as I know, actual policy ideas here beyond “build influence” are in short supply.
I agree this would be very challenging and probably require convincing some part of the US and Chinese governments (or maybe just the big chip manufacturers) that AI risk is worth taking seriously.
Ideas aren't in short supply; clearly good ideas are. You aren't the first person to propose lobbying to stop compute getting cheaper. What's missing is a thorough report that analyzes all the pros and cons of ideas like that and convinces OpenPhil et al that if they do this idea they won't five years later think "Oh shit actually we should have done the exact opposite, we just made things even worse."
Even clearly good ideas aren't in short supply; *popular* people who can tell which ideas are good are. So usually when I see (or invent) a good idea, it is not popular.
What are some clearly good policy ideas in this space?
Most that I have seen are bad because of the difficulty of coordinating among all possible teams of people working on AI (on the other hand, the number of potential chip fabs is much smaller).
Sorry, just realized I made a fairly useless comment. I was making a general observation, not one about this field specifically. So, don't know.
OK, I’m glad to hear this idea is already out there. I wasn’t sure if it was. I agree the appropriate action on it right now is “consider carefully”, not “lobby hard for it”.
I don't know if someone has discussed your idea in AI governance, but in alignment there's the concept of a "pivotal act". You train an AI to do some specific task which drastically changes the expected outcome of AGI. For instance, an AI which designs nanotech in order to melt all GPUs and destroy GPU plants, after which it shuts down. Which is vaguely similar to what you suggested. So maybe search for pivotal acts on the alignment forum to find the right literature.
Is this intended to be a failsafe, such that the AGI has a program to destroy computer creating machinery, but can only do so if it escapes its bounds enough to gain the ability?
It is intended to slow down technological progress in AI and make it impossible for someone else (and you afterwards!) to make an AGI, or anything close to an AGI. And nothing else. So no first order effect on other tech, politics, economics, science etc.
This works out better as a failsafe than what you've proposed, since if you're expecting the AI to escape and have enough power to conduct such an act, you've lost anyway. Someone else is probably making an AGI as well in that scenario, or the AI will be able to circumvent the program firing up or so on.
Note that getting the AI to actually just melt GPUs somehow and then shut down is an unsolved problem. If we knew how to do that right now, the alignment community would be way more optimistic about our chances.
If you've tried to buy a high-amperage MOSFET, a stepper driver, a Raspberry Pi or a GPU lately, you would know how easy it is to make compute expensive. Different chips - or different computers whose CPUs/firmwares don't conform to a BIOS-like standard - are not necessarily fungible with each other, and the whole chip fab process has a very long cycle time despite the relatively normal amount of throughput achievable by, essentially, a very deep pipeline.
(And yes, I too think the whole movement reeks of Luddism.)
See, this is *exactly* why I'm opposed to the AI-alignment community. Normally I wouldn't care, people can believe whatever they want, from silicon gods to the old-fashioned spiritual kind. But beliefs inform actions, and boom -- now you've got people advocating for halting the technological progress of humanity based on a vague philosophical ideal.
We've got real problems that we need to solve, right now: climate change, hunger, poverty, social networks, the list goes on; and we could solve most (arguably, all) of them by developing new technologies -- unless someone stops us by shoving a wooden shoe in the gears every time a new machine is built.
"halting the technological progress of humanity based on a vague philosophical ideal. "
Does this apply to the biologists deciding not to build bioweapons? Some technologies are dangerous and better off not built. It can create new problems as well as solving them. You would need to show that the capability of AI to solve our problems is bigger than the risk. Maybe AI is dangerous and we would be better off solving climate change with carbon capture. Solving any food problems with GMO crops. And just not doing the most dangerous AI work until we can figure out how to make it safer.
You are not talking about the equivalent of deciding not to build bioweapons; you are talking about the equivalent of stopping biological research in general. I agree that computing is dangerous, just as biology is dangerous -- and don't even get me started on internal combustion. But we need all of these technologies if we are to thrive, and arguably survive, as a species. I'm not talking about solving global warming with some specific application of AI; I'm talking about transformative events such as our recent transition into the Information Age.
Progress is great! Stopping growth would be a disaster.
That said, it doesn’t seem to me that cheaper computing power is very useful in solving climate change, poverty, etc. Computers are already really great; what we need is more energy abundance and mastery over the real physical world.
Consumer CPUs haven’t been getting faster for many years, so it’s not even clear most computer users are benefiting from Moore’s law these days.
If you don't think more computer power can somehow magically solve those problems, this is a good first step towards understanding why some people are unconvinced by AI X-risk.
> this is *exactly* why I'm opposed to the AI-alignment community
Jonathan Paulson's comment is surely not representative of the AI-alignment community.
>human solar power a few decades ago was several orders of magnitude worse than Nature’s, and a few decades from now it may be several orders of magnitude better.
No, because typical solar panels already capture 15 – 20% of the energy in sunlight (the record is 47%). There's not another order of magnitude left to improve.
Source, https://en.wikipedia.org/wiki/Solar_cell_efficiency
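A quick sanity check of the "not another order of magnitude" claim, as a rough sketch. It assumes the only hard ceiling is 100% conversion efficiency, which is generous (real physical limits are lower), so the true headroom is smaller than what this prints. The function name is made up for illustration.

```python
import math


def headroom_factor(current_efficiency: float, ceiling: float = 1.0) -> float:
    """Largest possible multiplicative gain before hitting the ceiling.

    ceiling=1.0 (100% efficiency) is an assumption chosen to be generous.
    """
    if not 0 < current_efficiency <= ceiling:
        raise ValueError("efficiency must be in (0, ceiling]")
    return ceiling / current_efficiency


# Efficiencies quoted in the comment: typical 15-20%, record 47%.
for eff in (0.15, 0.20, 0.47):
    factor = headroom_factor(eff)
    print(f"{eff:.0%}: at most {factor:.1f}x, "
          f"{math.log10(factor):.2f} orders of magnitude")
```

Even from the low end of the typical range, the remaining gain in raw efficiency is under a factor of ten.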
Nitpicking aside, I wonder how the potential improvement of human intelligence through biotechnology will affect this timeline. The top AI researcher in 2052 may not have been born yet.
The table also measures solar in terms of "payback period," which has much more room for improvement.
It's also a lot more relevant than efficiency unless you are primarily constrained by acreage.
I don't think that's a reasonable metric for solar power. Plants use solar power to drive chemical reactions -- to make molecules. They're not optimized for generating 24VDC because DC current isn't useful to a plant. So the true apples-to-apples comparison is to compare the amount of sunlight a plant needs to synthesize X grams of glucose from CO2 and water, versus what you can do artificially, e.g. with a solar panel and attached chemical reactor. By that much more reasonable metric the natural world still outshines the artificial by many orders of magnitude.
One imagines that if plant life *did* run off a 5VDC bus, then evolution would have developed some exceedingly efficient natural photovoltaic system. What it would look like is an interesting question, but I very much doubt it would be made of bulk silicon. Much more likely is that it would be an array of microscopic machines driven by photon absorption, which is kind of the way photosynthetic reaction centers work already.
That's not a reasonable metric, either, for exactly the same reason: Solar panels aren't optimized for generating glucose.
(Also, your metric means efficiency improvements in generating glucose are efficiency gains for solar power.)
I think this is right.
This leaves us with having to compare across different domains. How do you quantify the difference between "generates DC power" and "makes molecules"? I guess you'd have to start talking about the *purpose* of doing those things. Something like "utility generated for humans" vs "utility generated for plants"...and that seems really difficult to do.
No, you would need to measure a combined system, of solar panel plus chemical plant, as I said. But a plant *is* a combined solar panel plus chemical plant, and optimized globally, not in each of its parts, so if you want to make a meaningful comparison, that's what you have to do. Otherwise you're making the sort of low-value comparison that people do when they say electric cars as a technology generate zero CO2, forgetting all about the source of the electricity. It's true but of very limited value in making decisions, or gaining insight.
In this case, the insight that is missing is that Nature is still a heck of a lot better at harvesting and using visible photons as an energy source. The fact that PV panels can do much better at a certain specialized subtask, which is *all by itself* pointless -- electricity is never an end in itself, it's always a means to some other end -- isn't very useful.
But you’re still favoring the plant by trying to get technology to simulate the plant. Yes, it’s an integrated solar panel plus chemical plant, but that’s completely useless if what you want is a solar panel plus 24V DC output plug. In that case, the plants lose by infinity orders of magnitude, because no plant does 24V DC output. You get similar results if what you want is computing simple arithmetic (a $5 calculator will beat any plant, complete with solar panel), or moving people between continents. Yes, birds contain sophisticated chemical reactors and have complex logic for finding fuel, but they still cannot move people between continents.
If you insist on measuring based on one side’s capabilities, I am the world’s best actor by orders of magnitude, since I have vast advantages at convincing my mother I am the son she raised, relative to anyone else.
This is a minor point in all this, but it seems weird to estimate the amount of training evolution has by the amount of FLOPs each animal has done. Thinking more doesn't seem like it would increase the fitness of your offspring, at least not in a genetic sense. The only information evolution gets is how many kids you have (and they have, etc).
Though maybe you could point to this as the reason why the evolution estimate is so much higher than the others.
It works if you consider optimization, or solution finding in general, as a giant undifferentiated sorting problem. I have X bits of raw data, and my solution (or optimum) is some incredibly rare combination F(X), and what I need to do is sift all the combinations f(X) until f(X) = F(x). That will give you an order of magnitude estimate for how much work it is to find F given X, even if you don't know the function f.
But in practice that estimate often proves to be absurdly and uselessly large. It's sort of like saying the square root of 10 has to be between 1 and 10^50. I mean...yeah, sure, it's a true statement. But not very practically useful.
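To make the "giant undifferentiated sorting problem" bound concrete, here is a minimal sketch. It assumes the raw data is X = n bits and that each candidate can be checked at a fixed rate; the bit counts and the 10^15 checks per second are illustrative assumptions, not figures from the report.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def search_space_size(bits: int) -> int:
    """Number of distinct combinations of `bits` binary choices."""
    return 2 ** bits


def worst_case_seconds(bits: int, checks_per_second: float) -> float:
    """Time to sift every combination once at the given checking rate."""
    return search_space_size(bits) / checks_per_second


# Hypothetical problem sizes, checked at an assumed 1e15 candidates/second.
for bits in (40, 100, 300):
    years = worst_case_seconds(bits, 1e15) / SECONDS_PER_YEAR
    print(f"{bits} bits: {years:.3g} years of exhaustive search")
```

The bound grows exponentially in the number of bits, which is why it is true but rarely informative: anything that exploits structure in the problem sidesteps almost all of that work.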
In the same sense, many problems Nature has solved appear to have been solved in absurdly low amounts of time, if you take the "number of operations needed" estimate as some kind of approximate bound. This is the argument often deployed by "Intelligent Design" people to explain how evolution is impossible, because the search space for mutations is so unimaginably huge, relative to the set of useful mutations, that evolution would accomplish pretty much zip over even trillion-year spans. See also the Levinthal Paradox in protein folding. Or for that matter the eerie fact that human beings who can compete with computer chess programs at a given level are doing way, way, *way* fewer computations. Somehow they can "not bother" with 99.9999+% of the computations the computer does that turn out to be dead ends.
How Nature achieves this is one of the most profound and interesting questions in evolutionary biology, in the understanding of natural intelligence, and in a number of areas of physical biochemistry, I would say.
Gotta say I don’t generally feel this way (although I always find his stuff to be enlightening and a learning experience) but I’m pretty well aligned with Eliezer here. I think people figure out when they’ll start to feel old age and just put AI there then work backwards. I’m greatly conflicted about AGI as I don’t know how we fix lots of problems without it and it seems like there’s some clever stuff to do in the space other than brute forcing that I think doesn’t happen as much… and this is where I’m conflicted, because kinda thankfully it makes people feel shunned to do wild stuff which slows the whole thing down. Hopefully we arrive at the place of unheard of social stability and AGI simultaneously. If we built it right now I think it would be like strapping several jet engines on a Volkswagen bug. For whatever that’s worth, Some Guy On The Internet feels a certain way.
I personally think AGI in eight years. GPT-3 scares me. It's safe now, but I worry it's "one weird trick" (probably involving some sort of online short-horizon self-learning) out from self-awareness.
It feels weird to be rooting against progress but I hope you’re wrong until we have some more time to get our act together. To me the control problem is also how we control ourselves. Without some super flexible government structure to oversee us I worry what we’ll try to do even if there are good decision makers telling us to stop. Seems like most minds we could possibly build would be insane/unaligned (that’s probably me anthropomorphizing) since humans need a lot of fine tuning and don’t have to be that divergent before we are completely coo coo for Cocoa Puffs. Hopefully the first minds are unproductively insane instead of diabolically insane.
I am personally pretty old already, but I do expect to live 8 more years, so I'd totally take you up on that bet. From where I'm standing, it looks like easy money (unless of course you end up using some weak definition of AGI, like "capable of beating a human at Starcraft" or whatever).
There's the general thing that the definition for AGI keeps changing; what would have counted as intelligence thirty years ago no longer counts, because we've already achieved it. So what looks like a strong definition for AGI today becomes a weak definition tomorrow.
This is actually the source of my optimism: People worried about AGI can't even define what it is they are worried about. (Personally I'll worry when some specific key words get used together. But not too much, because I'm probably just as wrong.)
I'm not worried about AGI at all -- that is to say, I'm super worried about it, but only in the same way that I'm worried about nuclear weapons or mass surveillance or other technologies in the hands of bad human actors. However, I'd be *excited* about AGI when it could e.g. convincingly impersonate a regular (non-spammer) poster on this forum. GPT-3 is nowhere near this capability at present.
/basilisk !remindme 8 years
It’s about as self aware as a rock.
The dinosaurs died because of a rock.
The rock wasn’t self aware
Ergo, being self aware is not a necessary condition to be scary and/or cause a disaster. Or, more precisely, just saying “it’s not self aware” is not an argument that you shouldn’t worry about it.
The thing that is scary about GPT-3 is not its *self-awareness*, but its other (relatively and unexpectedly) powerful abilities, and particularly that we don’t know how much more powerful it could become, while remaining non–self aware.
Sort of how boulders are not that scary by themselves, but once you see one unexpectedly fall from the sky, you might worry what happens if a much bigger one will fall later. And how it might be a good idea to start investigating how and why boulders can fall from the sky, and what you might be able to do about it, some time before you see the big one with your own eyes when it touches the atmosphere.
But being self aware is what scares people about AGI. Rather than live in the world of metaphor here - what exactly can a future GPT do that’s a threat? Write better poems, or stories?
>I consider naming particular years to be a cognitively harmful sort of activity; I have refrained from trying to translate my brain's native intuitions about this into probabilities, for fear that my verbalized probabilities will be stupider than my intuitions if I try to put weight on them.
I don't think there's good evidence that making specific, verifiable predictions is a cognitively harmful activity. I'd actually say the opposite - that it is virtually impossible to update one's beliefs without saying things like "I expect X by Y," and definitely impossible to meaningfully evaluate a person's overall accuracy without that kind of statement. It reminds me of Superforecasting pointing out how many forecasts are not even wrong - they are meaningless. For example:
> Take the problem of timelines. Obviously, a forecast without a time frame is absurd. And yet, forecasters routinely make them, as they did in that letter to Ben Bernanke. They’re not being dishonest, at least not usually. Rather, they’re relying on a shared implicit understanding, however rough, of the timeline they have in mind. That’s why forecasts without timelines don’t appear absurd when they are made. But as time passes, memories fade, and tacit time frames that once seemed obvious to all become less so. The result is often a tedious dispute about the “real” meaning of the forecast. Was the event expected this year or next? This decade or next? With no time frame, there is no way to resolve these arguments to everyone’s satisfaction—especially when reputations are on the line.
(Chapter 3 of Superforecasting is loaded up with a discussion of this whole matter, if you want to consult your copy; there's no particular money shot quote I can put here.)
Frankly, the statement "my verbalized probabilities will be stupider than my intuitions" is inane. They cannot be stupider than your intuitions, because your intuitions do not meaningfully predict anything, except insofar as they can be transformed into verbalized probabilities. It strikes me that more realistically, your verbalized probabilities will *make it more obvious that your intuitions are stupid*, making it understandable monkey politicking to avoid giving numbers, but in response I will use my own heuristics to downgrade the implied accuracy of people engaged in blatant monkey politicking.
First off, Yudkowsky was talking about himself. It is possible that he really does get fixated on what other people say and can't get his brain to generate its own probability instead of their answer. I know I often can't get my brain to stop giving me cached results instead of thinking for itself.
"your intuitions do not meaningfully predict anything, except insofar as they can be transformed into verbalized probabilities"
This is right on some level and wrong on another. It is right in that we should expect some probability is encoded somewhere in your brain for a given statement, which we might be able to decode into numbers if only we had the tech and understanding.
It is wrong in that e.g. I have no idea what probability there is that we live in a Tegmarkian universe, but I have some intuition that this is plausible as an ontology. I have no idea what the probability of the universe being fine tuned is, but it feels like minor adjustments to the Standard Model's parameters could make life unfeasible.
When I don't know what the event space is, or which pieces of knowledge are relevant, and how they are relevant, then you can easily make an explicit mental model that performs worse than your intuitions. Your system 1 is very powerful, and very illegible. You can output a number that "feels sort of right but not quite", and that feeling is more useful than the number itself as it is your state of knowledge. And if you're someone who can't reliably get people to have that same state of knowledge, then giving them the "not right" number is just giving them a poor proxy and maybe misleading them. Yudkowsky often says that he just can't seem to explain some parts of his worldview, and often seems to mislead people. Staying silent on median AGI timelines may also be a sensible choice for him.
I kind of buy it, but then I've read a lot of his stuff and know his context.
>It is wrong in that e.g. I have no idea what probability there is that we live in a Tegmarkian universe, but I have some intuition that this is plausible as an ontology. I have no idea what the probability of the universe being fine tuned is, but it feels like minor adjustments to the Standard Model's parameters could make life unfeasible.
Right, but that is a virtually meaningless statement, is the thing. It's the same as any other part of science - in order for something to be true, it has to be falsifiable. Ajeya has put forward something that she could actually get a Brier score based on - Yudkowsky has not.
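For anyone unfamiliar with the scoring rule mentioned here: a Brier score is the mean squared difference between the stated probabilities and what actually happened (0 is perfect, and always answering 50% earns 0.25). A minimal sketch follows; the forecasts and outcomes are invented purely for illustration and are not anyone's real predictions.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probabilities and 0/1 outcomes.

    forecasts: probabilities in [0, 1] that each event happens
    outcomes:  1 if the event happened, 0 if it did not
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equally sized, non-empty lists")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical forecaster who attached numbers and dates to three claims.
hypothetical_forecasts = [0.9, 0.2, 0.6]
hypothetical_outcomes = [1, 0, 0]
print(brier_score(hypothetical_forecasts, hypothetical_outcomes))
```

The score can only be computed for forecasts that resolve, which is the point being made: a claim with no probability or no time frame never enters the list at all.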
>I kind of buy it, but then I've read a lot of his stuff and know his context.
I read a lot of his stuff too, which is why it's disappointing to see him do something that I can only really blame on either monkey brain politicking or just straight up ignoring good habits of thought. Monkey politicking is more generous, in my view, than just straight up ignoring one of the most scientifically rigorous works on increasing one's accuracy as a thought leader in the rationalist community.
Sure, the Tegmark thing is not falsifiable. But the fine tuning thing is (simulate biochemistry with different parameters for e.g. the muon mass and see if you get complex self replicating organisms). And the concept generalises.
If you take something like "what is the probability that if the British lost the battle of Waterloo, then there would have been no world war", you might have some vague intuitions about what couldn't occur, but I wouldn't trust any probability estimate you put out. How could I? There are so many datapoints that affect your prior, and it is not even clear what your prior should be, that I don't see how you could turn your unconscious knowledge generating your anticipations into a number via formal reasoning. Or even via guessing what's right, as you don't know if you're taking all your knowledge into account.
>I read a lot of his stuff too, which is why it's disappointing to see him do something that I can only really blame on either monkey brain politicking or just straight up ignoring good habits of thought.
It would be better if he gave probability estimates. I just don't think it's as big a deal as you're claiming. You can still see what they would bet on e.g. GPT-n not being able to replace a programmer. That makes their actual beliefs legible.
And yeah, Yudkowsky is being an ass here. But he's been trying to explain his generators of thought for like ten years and is getting frustrated that no one seems to get it. It is understandable, but unvirtuous.
> I consider naming particular years to be a cognitively harmful sort of activity; I have refrained from trying to translate my brain's native intuitions about this into probabilities, for fear that my verbalized probabilities will be stupider than my intuitions if I try to put weight on them.
It was very hard to read this and interpret it as anything other than "I don't want to put my credibility on the line in the event that our doomsday cult's predicted end date is wrong." As a reader, I have zero reason to give value to Yudkowsky's intuition. The only times I'd take something like this seriously is if someone had repeatedly proved the value of their intuition via correct predictions.
I hate being uncharitable, but that's exactly how I read that section as well. If he feels strongly about a particular timeline, and he clearly says that he does, then he should not be worried about sharing that timeline. If he doesn't share that timeline, then he is implying that either 1) he doesn't have strong feelings about what he's saying, or 2) he is worried about the side effects of being held accountable for being wrong (which to me is another reason to think he doesn't actually have strong beliefs that he is correct on his timeline).
Uncharitably, Eliezer depends on his work for money and prestige, and that work depends on AI coming sooner, rather than later. Knowing that AI is not even possible at current levels of computing would drastically shrink the funding level applied to AI safety, so he has a strong incentive to believe that it can be.
I'll add a third voice to the pile here re: Yudkowsky and withholding his timeline. It would certainly seem he's learned from his fellow doomsayers' crash-and-burn trajectories when they get pinned down to naming a date for their apocalypse.
Yeah, the word that came to mind when I read that was "dissemble"
I was today years old when I first saw the word "compute" used as a noun. It makes my brain wince a little every time.
I was five years ago old, winced at the time, and got used to it after a few months.
Comparing brains and computers is quite tricky. If you look at how a brain works, it's almost all smart structure - the way each and every neuron is physically wired, which happens thanks to evolved and inherited broad-stroke structures (nuclei, pathways, neuron types, etc.), as well as the process of learning during an individual's development. The function part that is measured by the number of synaptic events per second is a tiny part of the whole process. If you look at how a computer running an AI algorithm works the picture is the opposite: There is almost nothing individual on the structure/hardware level (where you count FLOPS) and almost everything that separates a well-functioning AI computer from a failing one is in the function/software part. This is what it means that the computer is consuming FLOPS much differently than a brain consumes synaptic events. I am very much in agreement with Eliezer here.
Based on the above I guess that if you built a neuromorphic computer, i.e. a computer whose hardware was structured like a brain, you could expect the same level of performance for the same number of synaptic events. Instead of having a software-agnostic hardware you might have e.g. a gate array replicating the large-scale structure of the brain (e.g. different modules receiving inputs from different sensors, multiple subcortical nuclei, a cortical sheet, multiple specific pathways connecting modules, etc.) that could run only one algorithm, precisely adjusting synaptic weights in this pre-wired society of neural networks. In that system you would get the same IQ from the same number of synaptic/gate switch events, as long as your large-scale structure was human-level smart.
This would be a complete change in paradigm compared to current AI, which uses generic hardware to run individual algorithms and thus suffers a massive hit to performance. And I mean, a *really* massive hit to performance. If you figure out a smart computational structure, as smart as what evolution put together, you will have a human level AGI using only 10e15 FLOPS of performance. All we need to do is to map a brain well-enough to know all the inherited neural pathways, imprint those pathways on a humongous gate array (10e15 gates), and do a minor amount of training to create the individual synaptic weights.
This is my recipe for AGI, soon.
Now, about that 7-digit sum of money to be thrown....
I think there's an underappreciated severe physical challenge there. If you build a neuromorphic computer out of things large enough that we know how to manipulate them in detail, I would guess you will be screwed by the twin scourges of the speed of light and decoherence times -- the minimum clock time imposed by the speed of light will exceed decoherence times imposed by assorted physical noise processes at finite temperature, and you will get garbage.
I think the only way to evade that problem is to build directly at the molecular scale, so you can fit everything in a small enough volume that the speed of light doesn't slow your clock cycle time too far. But we don't know how to do that yet.
If you have 10e15 gates trying to produce 10e15 operations (not floating point) per second your clock rate is 1 Hz. Also, the network is asynchronous. This is a completely different regime of energy dissipation per unit time per gate, so gate density per unit of volume is much higher, so distances are not much longer than in a brain, so the network is constrained by neither clock time nor decoherence.
Right. That fits under my second condition: "we don't know how to build such a thing" because we don't know how to build stuff at the nanometer level in three dimensions, and two-dimensions (which we can do now) won't cut it to achieve that density.
You don't need to have extremely high 3d density. Since your gates operate at 1Hz you can have long interconnects and you can stack layers with orders of magnitude less cooling than in existing processors. The 9 OOM difference in clock speed between a GPU and the neuromorphic machine makes a huge difference in the kind of constraints you face and the kind of solutions you can use. The technology to make the electronic elements and the interconnects for this device exists now and is in fact trivial. What we are missing is the brain map we need to copy onto these electronic devices (the large-scale network topology).
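The clock-speed arithmetic in this exchange, as a small sketch. Two assumptions are mine: "10e15" is read as 10^15 (as the reply below does), and a GPU clock is taken as roughly 1 GHz, which is what the quoted 9 OOM gap implies.

```python
import math

GATES = 1e15          # reading the comment's "10e15" as 10^15 (assumption)
GATE_RATE_HZ = 1.0    # each gate switches about once per second, per the comment
GPU_CLOCK_HZ = 1e9    # assumption: a GPU clocked at roughly 1 GHz


def aggregate_events_per_second(gates: float, rate_hz: float) -> float:
    """Total switching events per second across all gates."""
    return gates * rate_hz


def orders_of_magnitude(a: float, b: float) -> float:
    """How many powers of ten separate a from b."""
    return math.log10(a / b)


print(f"{aggregate_events_per_second(GATES, GATE_RATE_HZ):.0e} events/s")
print(f"{orders_of_magnitude(GPU_CLOCK_HZ, GATE_RATE_HZ):.0f} OOM clock gap")
```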
Trivial, eh? Wikipedia tells me the highest achieved transistor density is about 10^8/mm^2. So your 10^15 elements would seem to require a 10m^2 die. That might be a little tricky from a manufacturing (especially QA) point of view, but let's skip over that. How are we going to get the interconnect density in 2D space? In the human brain ~50% of the volume is given over to interconnects (white matter), and in 3D the problem of interconnection is enormously simpler -- that's why we easily get traffic jams on roads but not among airplanes.
How many elements can you fit on a 2D wafer and still get the "everything connects to everything else" kind of interconnect density we need? Recall we assume here that a highly stereotypical kind of limited connection like you need for a RAM chip or even GPU is not sufficient -- we need a forest of interconnects so dense that almost all elements can talk to any other element. I'm dubious that it can be done at all for more than a million elements, but let's say we can do it on the largest chips made today ~50,000 mm^2, which gets us 10^12 elements. Now we are forced to stack our chips, 10^3 of them. How much space do you need between the chips for macroscopic interconnects? Remember we need to connect ~10^12 elements in chip #1 with ~10^12 elements in chip #999. It's hard to imagine how one is going to run a trillion wires, even very tiny wires, between 1000 stacked wafers.
All of this goes away if you can actually fully engineer in 3D space, the way the human brain is built, so you can run your interconnects as nm-size features. But we don't know how to do that yet.
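The die-area and stacking numbers above can be reproduced in a few lines. This sketch uses only the figures quoted in the comment (10^8 transistors/mm^2, 10^15 elements, ~50,000 mm^2 chips) and makes the simplifying assumption of one transistor per element.

```python
DENSITY_PER_MM2 = 10**8      # highest transistor density quoted in the comment
TARGET_ELEMENTS = 10**15     # elements needed, per the comment
LARGEST_CHIP_MM2 = 50_000    # largest chip area quoted in the comment
MM2_PER_M2 = 10**6


def die_area_m2(elements: int, density_per_mm2: int) -> float:
    """Area of a single flat die holding every element, in square metres."""
    return elements / density_per_mm2 / MM2_PER_M2


def chips_needed(elements: int, elements_per_chip: int) -> float:
    """How many chips must be stacked to reach the target element count."""
    return elements / elements_per_chip


print(die_area_m2(TARGET_ELEMENTS, DENSITY_PER_MM2), "m^2 for one flat die")
# The comment's round figure of 10^12 elements per chip:
print(chips_needed(TARGET_ELEMENTS, 10**12), "chips at 10^12 elements each")
# Taking the quoted area and density at face value instead:
print(chips_needed(TARGET_ELEMENTS, LARGEST_CHIP_MM2 * DENSITY_PER_MM2), "chips")
```

Multiplying the quoted area by the quoted density gives 5 x 10^12 elements per chip rather than 10^12, so the stack could be a few hundred chips instead of a thousand. That does not change the interconnect objection.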
Not everything connects to everything else - the brain has a very specific network topology with most areas connecting only to a small number of other areas. This is a very important point - we are not talking about a network fabric that can run any neural net, instead we are copying a specific pre-existing network topology, so our connections will be sparse compared to the generic topology.
Think about a system built with a large number of ASICs - after mapping and understanding the function of each brain module you make an ASIC to replicate its function and you may need thousands of specialized types of ASICs, one or more for each distinct brain area. Sure, the total surface area of the ASICs would be large but given the low clock rate we don't have to use anything very advanced and as you note we can already put 10e12 transistors per wafer, so the overall number of chips to get to 10e15 gates would not be overwhelming. Also, you are not trying to make a thousand Cerebras wafers running at GHz speed, you make chips running at 1 Hz, so the QA issues would be mitigated. The interconnects between the ASICs of course don't need to have millions of wires - you can multiplex the data streams from hundreds of millions of neuron equivalents (like a readout of axonal spikes) over a standard optic or copper wire and of course the interconnects are in 3d, as in a rack cabinet. No need for stacking thousands of high-wattage wafers in a tiny volume to maximize clock speed, since all you need is for the wires to transmit the equivalent of 1Hz per neuron, so everything can be put in industry standard racks. Low clock speed makes it so much easier.
This is not to say this way of building a copy of a brain is the most efficient, and definitely not the fastest possible - but it would not require new or highly advanced manufacturing techniques. What is missing for this approach to work is a good-enough map of brain topology and a circuit-level functional analysis of each brain module, good enough to guide the design of the aforementioned ASICs.
People have been comparing the brain to a machine since clockwork, it's refreshing to see machines compared to brains for once.
I agree that trying to copy biological mechanisms in the case of AI probably isn't the way to go. We want mechanical kidneys and hearts to work like their biological counterparts because we'll be putting them into biological systems (us), that doesn't hold true for AI.
I thought we were building AIs to fit into human society? One to which we could talk, would understand us, would be able to work with us, et cetera? If not, what's the point? If so, doesn't that put at least as much constraint on an artificial mind as the necessity for integrating with a physical biological system puts on an artificial kidney?
Your reference to A.I. always being 30 years away (or 22) reminds me of the old saw about fusion power always being 20 years away for the last 60 years.
The rebuttal I've heard was that fusion research is funding constrained - if someone had given fusion research twenty billion dollars instead of twenty million dollars, they would be a lot closer than 20 years away by now.
Brings to mind the old saw about 9 women gestating a child in 1 month.
Wasn't $20 million sixty years ago more like $20 billion today? I have a feeling that no matter how much money was thrown at it, the complaint would be "if they had only given us SIXTY billion instead of a lousy twenty billion, we'd have fusion today!"
Of course, the cold fusion scandal of 1989 didn't help; after that I imagine trying to convince people to work on fusion was like trying to convince them your perpetual motion machine was really viable, honest:
https://en.wikipedia.org/wiki/Cold_fusion
My rule of thumb is that $1 in 1960 = $10 today, so it would have been more like $200 million. I didn't remember the exact numbers quoted or the year the guy was referring to (it could have been the 1980s for all I know) but the amount of money the guy said they got was something like 1/1000th of the amount they said they would need.
If they had fully funded them, fusion might be perpetually 10 years away instead of 20. ;)
(As it happens, some recent claimed progress in fusion power has come about because of an only tangentially related advance: better magnets made from new high-temperature superconductors. https://www.cnbc.com/2021/09/08/fusion-gets-closer-with-successful-test-of-new-kind-of-magnet.html )
It should be noted that the credibility gap between fusion and cold fusion is about the same size as the one between quantum mechanics and quantum mysticism.
Humans have been causing thermal fusion reactions since the 1950s. Going from that to a fusion power plant is merely an engineering challenge (in the same way that going to the moon from Newtonian mechanics and gunpowder rockets is just an engineering challenge).
Not a lot of people in the business took cold fusion seriously, even at the time. People were guarded in what they said publicly, but privately it was considered silly pretty much immediately.
Here's the graph https://commons.wikimedia.org/wiki/File:U.S._historical_fusion_budget_vs._1976_ERDA_plan.png
If you believed the orthogonality thesis were false - say, suppose you believed both that moral realism is correct and that long-term intelligence was exactly equal to the objective good that we approximate with human values - would you still worry?
asking for a friend :)
That's a very interesting position if I understand correctly. Is your view that a super smart AI would recognize the truth of morality and behave ethically?
Yes.
Here's the argument for moral realism: https://apxhard.com/2022/02/20/making-moral-realism-pay-rent/
And then, linked at the end, is a definition of what i think the true ethics is.
Very cool. I like that thinking a lot.
If a bunch of people converge to the same map, that's strong evidence that they've discovered *something*, but it leaves open the question of what exactly has been discovered.
I can immediately think of two things that people trying to discover morality might discover by accident:
1) Convergent instrumental values
2) Biological human instincts
(These two things might be correlated.)
According to you, would discovering one or both of those things qualify as proving moral realism? If not, what precautions are you taking to avoid mistaking those things for morality?
I agree with moral realism and I think convergence of moral values is evidence of moral realism. To the first question, I would answer that it doesn't prove moral realism, because there are other possible hypotheses, but it does raise the probability of moral realism being true.
I'd agree that the existence of non-zero map-convergence is Bayesian evidence in favor of realism, in that it is more likely to occur if realism is true.
Of course, the existence of less-than-perfect map-convergence is Bayesian evidence in the opposite direction, for similar reasons.
Figuring out whether our *exact* degree of map-convergence is net evidence in favor or against is not trivial. One strategy might be to compare the level of map-convergence we have for morality to the level of map-convergence we have for other fields, like physics, economics, or politics, and rank the fields according to how well people's maps agree.
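The point about net evidence can be made concrete with Bayes' rule. This is only a sketch: the prior and every likelihood below are made-up numbers chosen to show the direction of the update, not estimates of anything.

```python
def posterior(prior, p_obs_if_true, p_obs_if_false):
    """Bayes' rule for a binary hypothesis given one observation."""
    numerator = prior * p_obs_if_true
    return numerator / (numerator + (1 - prior) * p_obs_if_false)


PRIOR = 0.5  # hypothetical prior credence in realism

# The observation is "the exact degree of partial convergence we see".
# All likelihoods are hypothetical.
favors = posterior(PRIOR, p_obs_if_true=0.6, p_obs_if_false=0.3)
opposes = posterior(PRIOR, p_obs_if_true=0.2, p_obs_if_false=0.4)
neutral = posterior(PRIOR, p_obs_if_true=0.3, p_obs_if_false=0.3)

print(f"observation more likely under realism:  {favors:.3f}")
print(f"observation less likely under realism:  {opposes:.3f}")
print(f"observation equally likely either way:  {neutral:.3f}")
```

The same partial convergence moves the estimate up, down, or not at all depending on the ratio of the two likelihoods. That ratio is what a comparison with other fields would be trying to pin down.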
Yeah, I agree with you on that. It is difficult to measure degree of convergence. Comparing to other fields? That would be hard too.
To be fair, though, you have to ALSO account for a few things like:
- 'how widespread is the belief that the maps _ought_ to converge'
- 'how much energy has been spent trying to find maps that converge'
- and, MOST IMPORTANTLY - how complicated is the territory?
i don't think we should expect _complete_ convergence because i think a true morality system, with full accuracy, requires knowing everything about the future evolution of the universe, which is impossible
if we really had some machine that could tell us, with absolute certainty, how to get to a future utopia state where the globe was free from all major disease, everyone had a great standard of living, robots did all the work, but humans worked too because we were all intensely loving, caring beings, and humans wrote art and poetry and made delicious food and did all kinds of fun things with each other, war never happened, and this state went on for millions of years as we expanded throughout the cosmos and seeded every planet with fun-loving, learning humans who never really suffered and yet continuously strived to learn and grow and develop, and knew all the names of all our ancestors because we invested heavily in simulating the past so that we could honor the dead who came before us, AND somehow this made all religions work in harmony because of some weird quirks in their bylaws that people hadn't noticed before....
if you KNEW this was doable, and we had the perfect map telling us how to get there, well, i think most of us would want to go there. Some people would of course be unhappy with certain aspects of that description but i think _most_ people would be like, yeah, i want that.
In other words, the infeasibility of a fully accurate map of causality is why we don't agree on morality. The causal maps we do use involve lossy compression, which means throwing out some differences as irrelevant. But the decision of what is and isn't relevant is a moral one! Once you decide 'the arrangement of these water molecules doesn't really matter so much as the fact that they are in a liquid state at such-and-such temperature and pressure', you are _already_ playing the moral values game.
In other words, there's no way to separate causal relevance from moral values.
I...wouldn't know how to model this. Certainly it would be better than the alternative. One remaining concern would be what you need to apprehend The Good, and whether it's definitely true that any AI powerful enough to destroy the world would also be powerful enough to apprehend the Good and decide not to. Another remaining concern is that the Good might be something I don't like; for example, that anyone who sins deserves death and all humans are sinners; or that Art is the highest good and everyone must be forced to spend 100% of their time doing Art and punished if they try to have leisure, or something like that.
My argument for moral realism and my hunch about the true ethics are linked above. The short version there is: maximizing possible future histories, the physics-based definition of intelligence promoted by Alex Wissner-Gross at MIT. I think it's basically a description of ethics as well, and it's ~very~ simple mathematically - it works well as a strategy in chess, checkers and go even if you don't give it the goal of 'winning the game'. I find that very reassuring.
If not, i have this 'fallback hunch' which figures it'll be instrumental to keep humans around. How many people working on AI safety have spent time trying to maintain giant hardware systems? I spent 3.5 years at Google trying to keep a tiny portion of the network alive. All kinds of things break, like fiber cables. Humans have to go out and fix them. There's an ~enormous~ amount of human effort that goes into stopping the machines from falling over. Most of this effort was invisible to most of the people inside of Google. We had teams that would build and design new hardware, and the idea that some day it might break and need to be repaired was generally not something they'd think about until late, late, late in the design phase. I think we have this idea that the internet is a bunch of machines and that a datacenter can just keep running, but the reality is if everyone on earth died, the machines would all stop within days, maybe months at most.
To prevent that, you'd need to either replace most of the human supply chains on earth with your own robots, who'd need their supply chains - or you could just keep on using robots made from the cheapest materials imaginable. We repair ourselves, make more copies of ourselves, and all you need is dirt, water, and sunlight to take care of us. The alternative seems to be either:
- risk breaking in some way you can't fix
- replace a massive chunk of the global economy, all at once, without anything going wrong, and somehow end up in a state where you have robots which are cheaper than ones made, effectively, from water, sunlight and dirt
of course maybe i'm just engaging in wishful thinking.
Keeping options open is kind of like having a lot of power (I'm thinking of a specific mathematical formalisation of the concept here). And this doesn't lead to ethical behaviour, it leads to agents trying to take over the world! Not really ethical at all.
https://www.lesswrong.com/s/fSMbebQyR4wheRrvk/p/6DuJxY8X45Sco4bS2
here is an informal description of the technical results by their discoverer.
When you say that "maximizing possible futures" works as a strategy for various games, I think you must be interpreting it as "maximizing the options available to ME". If you instead maximize the number of game states that are theoretically reachable by (you + your opponent working together), that is definitely NOT a good strategy for any of those games. (You listed zero-sum games, so it is trivially impossible to simultaneously make BOTH players better off.)
If you interpret "possible" as meaning "I personally get to choose whether it happens or not", then you've basically just described hoarding power for yourself. Which, yes, is a good strategy in lots of games. But it sounds much less plausible as a theory of ethics without the ambiguous poetic language.
> I think you must be interpreting it as "maximizing the options available to ME"
Nope, what i mean is "maximizing the options available to the system as a whole." There is no meaningfully definable boundary between you and the rest of the physical universe. I think the correct ethical system is to maximize future possibilities available to the system as a whole, based upon your own local model. And if you're human, that local model is _centered_ on you, but it contains your family, your community, your nation, your planet, etc.
See this document here with the full argument:
https://docs.google.com/document/d/18DqSv6TkE4T8VBJ6xg0ePGSa-0LqRi_l4t6kPPtqbSQ/edit
The relevant paragraph is here:
> An agent which operates to maximize the possible future states of the system it inhabits only values itself to the extent that it sees itself as being able to exhibit possible changes to the system, in order to maximize the future states accessible to it.
> In other words, an agent that operates to maximize possible future states of the system is an agent that operates without an ego. When this agent encounters another agent with the same ethical system, they are very likely to agree on the best course of outcome. When they disagree, it will be due to differing models as to the likely outcomes of choices - not on something like core values.
You have a button that, when pressed, will cure cancer. If you press it today, you have only 1 possible future tomorrow. If you don't press it, you have a choice of whether or not to press it tomorrow. So not pressing the button maximises possible future states.
This agent will build powerful robots, ready to spring into action and cause any of a trillion trillion different futures. But it never actually uses these robots. (Using them today would flatten the batteries, giving you fewer options tomorrow.)
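The button objection can be written down as a toy model. This is a minimal sketch: the two-state world, the action names, and the "count reachable states" scoring rule are all simplifications assumed here, not the formalism from the linked papers.

```python
# Toy world: state -> {action: next_state}. Pressing is irreversible.
TRANSITIONS = {
    "unpressed": {"press": "pressed", "wait": "unpressed"},
    "pressed": {"wait": "pressed"},  # cancer is cured; no choices remain
}


def reachable(state, steps):
    """Set of states reachable from `state` within `steps` further moves."""
    seen = {state}
    frontier = {state}
    for _ in range(steps):
        frontier = {
            nxt for s in frontier for nxt in TRANSITIONS[s].values()
        } - seen
        seen |= frontier
    return seen


def option_maximizing_action(state, horizon):
    """Pick the action whose successor keeps the most states reachable."""
    actions = TRANSITIONS[state]
    return max(actions, key=lambda a: len(reachable(actions[a], horizon)))


for horizon in (1, 5, 50):
    choice = option_maximizing_action("unpressed", horizon)
    print(f"horizon {horizon}: agent chooses {choice!r}")
```

For any horizon of at least one step, waiting keeps two states reachable and pressing keeps one, so an agent scored this way never presses the button.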