"Whatever you’re expecting you 'need self-awareness' in order to do, I bet non-self-aware computers can do it too. "

From very early in Vernor Vinge's "A Fire Upon the Deep":

" ... The omniscient view. Not self-aware really. Self-awareness is much overrated. Most automation works far better as part of a whole, and even if human powerful, it does not need to self-know. ..."

Vinge is writing science fiction, but still ...

Scott, this is kind of a big ask, but I think the only real way to understand the limitations and powers of current-gen AI (and, by extension, whether there is a reasonable path forward from it to AGI) is to do a basic project with it or program one yourself. Hands-on work, imo, reveals much more information than any paper can convey. Physically seeing the algorithms and learning systems and poking at them directly explains and makes concrete what papers struggle to convey in the abstract.

You talk about current AI systems as having "a blob of learning-ability", which is true, but not complete. We (and by we I mean AI developers) have a much deeper understanding of the specifics and nuances of what learning ability is, in the same way rocket engineers have a much deeper and more nuanced understanding of how a rocket system has "a blob of propulsive material". In my experience (which isn't cutting edge, to be fair, but was actual research in higher-level classes in college), the blob of learning ability we can currently simulate has a large number of fundamental limitations on the things it can learn and understand, in the same way a compressed chunk of gunpowder has fundamental limitations on its ability as a propulsive material. A compressed tube of gunpowder will never get you to space; you need a much more complex fuel-mixing system and oxygen and stuff for that (I am not a rocket scientist). In the same way, our current learning algorithms have strong limitations. They are rather brute force, relying on more raw memory and compression to fit information and understanding. They can only learn in a highly specific way, and are brittle in training. Often, generalization abilities are not really a result of the learning system getting more capable, but of being able to jam in more info in a highly specific way (pre-tuned by humans for the task type; seriously, a lot of AI work is just people tweaking the AI system until it works and not reporting the failures). That can accomplish amazing things, but will not get you to AGI. Proving this, of course, is impossible. I could be wrong! But that's why I suggest the hands-on learning process. I believe that will allow you to feel, directly, how the learning system is more a tube of gunpowder than a means to space.

I think the element that's missing from the gunpowder and rock analogies is recursive self improvement. A big pile of gunpowder isn't going to invent fission bombs. A big/sophisticated enough neural network could potentially invent an even better neural network, and so on. (This isn't sufficient to get AGI, but it's one component.)

More pragmatically, the gunpowder analogy is overlooking fuel-air explosives. As the saying goes, quantity has a quality all its own.

A tiny suggestion for fixing social media. Or two:

a) Have human moderators zing "bad" content, and penalize the algorithm for getting zinged until it learns how not to get zinged.

b) Tax the ad revenue progressively by intensity of use. There is some difficulty in defining "intensity": hours per day or per week? Counting only "related" page views?

It seems like there's a potential crux here that Scott vaguely alluded to in a couple of these responses but didn't tackle quite as directly as I'd have liked: Can the current ML paradigm scale all the way up to AGI? (Or, more generally, to what Open Phil calls "transformative AI"?)

The response to Chris Thomas suggests that Scott thinks it can, since he sketches out a scenario where pretty much that happens. Meanwhile, pseudo-Dionysus seems to assume it can't, since he uses the relationship between gunpowder and nuclear weapons as a metaphor, and the techniques used to scale gunpowder weapons didn't in fact scale up to nukes; inventing nukes required multiple paradigm shifts and solving a lot of problems that the Byzantines were too confused to even begin to make progress on.

So is this the case for ML, or not? Seems hard to know with high confidence, since prediction is difficult, especially about the future. You can find plenty of really smart experts arguing both sides of this. It seems to be at least fashionable in safety-adjacent AI circles right now to claim that the current paradigm will indeed scale to transformative AI, and I do put some weight on that, and I think people (like me) who don't know what they're talking about should hesitate to dismiss that entirely.

On the other hand, just going by my own reasoning abilities, my guess is that the current paradigm will not scale to transformative AI, and it will require resolving some questions that we're still too confused about to make progress. My favorite argument for this position is https://srconstantin.wordpress.com/2017/02/21/strong-ai-isnt-here-yet/

I don't think people who believe this should rest easy, though! It seems to me that it's hard to predict in advance how many fundamental breakthrough insights might be needed, and they could happen at any time. The Lindy effect is not particularly on our side here since the field of AI is only about 70 years old; it would not be surprising to see a lot more fundamental breakthrough insights this century, and if they turn out to be the ones that enable transformative AI, and alignment turns out to be hard (a whole separate controversy that I don't want to get into here), and we didn't do the technical and strategic prep work to be ready to handle it, then we'll be in trouble.

(Disclaimer: I'm a rank amateur, and after nine years of reading blog posts about this subject I still don't have any defensible opinions at all.)

And why not fix spam calls by charging for incoming calls, just as some kinds of outgoing calls used to be charged? Say it's 10 cents per call, credited to the answerer's account. Not a big obstacle to normal personal calls, but it would make cold-call spam unaffordable.

"AIs have to play zillions of games of chess to get good, but humans get good after only a few thousand games"

You can make an argument that for each chess position humans consider many possible moves that haven't actually happened in the games they play or analyze and that we can program a computer to also learn this way, without actually playing out zillions of games. It's just that it doesn't seem like the most straightforward way to go in order to get a high-rated program.

> I don’t know, it would seem weird if this quickly-advancing technology being researched by incredibly smart people with billions of dollars in research funding from lots of megacorporations just reached some point and then stopped.

This wouldn't be weird, it's the normal course of all technologies. The usual course of technology development looks like a logistic curve: a long period in which we don't have it yet, a period of rapid growth (exponential-looking) as we learn about it and discoveries feed on each other, and then diminishing returns as we fully explore the problem space and reach the limits of what's possible in the domain. (The usual example here is aerospace. After sixty years in which we went from the Wright Flyer to jumbo jets, who would predict that another sixty years later the state of the art in aerospace would be basically identical to 1960s jets, but 30% more fuel-efficient?)

It seems like the 2010s have been in the high-growth period for AI/ML, just as the 40s were for aerospace and the 80s were for silicon. But it's still far too early to say where the asymptote of that particular logistic is. Perhaps it's somewhere above human-equivalent, or perhaps it's just a GPT-7 that can write newspaper articles but not much more. The latter outcome would not be especially surprising.
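To make the "too early to say where the asymptote is" point concrete, here is a minimal Python sketch. The parameter names (`ceiling`, `rate`, `midpoint`) and all the numbers are illustrative assumptions, not measurements of any real technology. It compares a logistic curve with the plain exponential it resembles before the midpoint:

```python
import math

def logistic(t, ceiling=1.0, rate=1.0, midpoint=0.0):
    """Logistic curve: slow start, rapid growth, then saturation at `ceiling`."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))

def exponential(t, ceiling=1.0, rate=1.0, midpoint=0.0):
    """Pure exponential that the logistic tracks closely well before `midpoint`."""
    return ceiling * math.exp(rate * (t - midpoint))

def relative_gap(t, **params):
    """Fraction by which the exponential overshoots the logistic at time t."""
    return exponential(t, **params) / logistic(t, **params) - 1.0

# Hypothetical sample points: early, at the midpoint, and late.
for t in (-5.0, -2.0, 0.0, 3.0):
    print(f"t={t:+.1f}  logistic={logistic(t):.4f}  gap={relative_gap(t):.4f}")
```

Early on the two curves differ by well under one percent, so data from the growth phase alone says very little about where the ceiling sits. By the midpoint the exponential is already double the logistic.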

It seems like if you want practice at securing things against intelligent opponents, computer security is the place to be. And we aren't very good at that, are we? Ransomware attacks get worse every year.

For "hard" computer security (computers getting owned), I think the only hope is to make computers that are so simple that there are provably no bugs. I'm not sure anyone is doing serious work on that. Precursor [1] seems promising but probably could use more funding.

But beyond that there are financial and social attacks, and we are clearly helpless against them. Cryptocurrency shows that if even an unintelligent algorithm promises people riches then many people will enthusiastically take its side. (Though it's sort of good for "hard" computer security, since it funds ransomware.)

[1] https://www.bunniestudios.com/blog/?p=5921

I missed this one the first time around, but a more prosaic response than Scott's:

> Soon gunpowder weapons will be powerful enough to blow up an entire city! If everyone keeps using them, all the cities in the world will get destroyed, and it'll be the end of civilization. We need to form a Gunpowder Safety Committee to mitigate the risk of superexplosions."


I guarantee you, if it looks like you have the makings of a new entry for that list, you'll get a visit from one of the many, many different Safety Committees. This is an example of a problem that was solved through a large, interlocking set of intentional mechanisms developed as a response to tragedy, not one that faded away on its own.

(I'll agree that nuclear belongs in a category of its own, but quibbling over whether those all count as "gunpowder" undercuts the hypothetical - that's a demand for chemical consistency beyond what our Greek engineer would have witnessed in the first place.)

Come up with an AI that can teach itself how to surf. That would be impressive.

" ... The omniscient view. Not self-aware really. Self-awareness is much overrated. Most automation works far better as part of a whole, and even if human powerful, it does not need to self-know. ..."

Are you familiar with Julian Jaynes?

"The trophy won't fit in my suitcase. I need a larger one."

- A super simple idea that we don't even think of as being ambiguous or difficult. But that's because we live in an actual world with suitcases and trophies. I'm not sure that if you skinner-boxed an actual human being in a room with a dictionary, infinite time to figure it out, and rewards for progress, they'd be able to figure it out.

It is absolutely true that the "learning blob" is wildly adaptable. But at the end of the day, it's a slave to its inputs.

That doesn't mean that AI isn't dangerous. As long as we have little insight into how it makes choices, it will remain possible that it's making choices using rationales we'd find repugnant. And as long as we keep giving algorithms more influence over the institutions that run much of our lives, they will have the capacity for great harm.

But we've been algorithming society since the industrial age. Computers do it harder, better, faster, stronger, but Comcast customer service doesn't need HAL to make day-to-day life a Kafkaesque nightmare for everyone including Comcast - they just need Goodhart. Goodhart will win without computers - it'll win faster with them, granted.

I think the reason skeptics insist on the AGI achieving consciousness is that this is the only way we know of for inferential reasoning, and brand-new ideas. Current forms of AI have zero ability to reason inferentially, and zero capability for coming up with new ideas. The only reason they can "learn" to do "new things" is because the capability of doing the new things was wired into them by their programmers. They need to find the most efficient path of doing the new things, to be sure, but this is basically a gigantic multi-dimensional curve-fitting problem, an exercise in deductive logic. From the strict comparison-to-human point of view, it's no more "learning" than my calculator "learns" when it finds the best slope for a collection of data via a least-squares algorithm, although I can understand the use of the shorthand.

We should just remember it *is* a shorthand, though, and that there is a profound difference* between an AI learning to play Go very well and a human child realizing that the noises coming out of an adult's mouth are abstract symbols for things and actions and meaning, and proceeding to decipher the code.
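For readers who have not seen it spelled out, the calculator comparison above amounts to roughly the following. This is a minimal sketch of closed-form ordinary least squares for a straight line; the function name and the sample data are made up for illustration:

```python
def least_squares_line(xs, ys):
    """Closed-form ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and of co-deviations between x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
slope, intercept = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)
```

The "learning" here is two sums and a division. Whether training a large network differs from this in kind or only in scale is exactly what the surrounding argument is about, and the sketch does not settle it.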

If you had an AI that could *invent* a new game, with new rules, based on its mere knowledge that games exist, and then proceed to become good at the new game -- a task human intelligence can accomplish with ease -- then you'd have a case for AI learning that was in the same universality class as human intelligence. But I don't know of any examples.

If you had an example of an AI that posed a question that was not thought of by its programmers, was indeed entirely outside the scope of their imagination of what the AI could or should do -- again, something humans do all the time -- then you'd have a case for an AI being capable of original creative thought. But again I know of no such examples.

*Without* creative original thought and inferential reasoning, it is by definition impossible for an AI to exceed the design principles it was given by humans; it is merely a very sophisticated and powerful machine. (The fact that we may not be clear on the details of how the machine is operating strikes me as a trivial distinction: we design drugs all the time where we don't know the exact mechanism of action, and we built combustion engines long before we fully understood the chemistry of combustion. We *routinely* resort to phenomenology in our design processes elsewhere, and I see no reason to be surprised by it with respect to computer programming, now that it is more mature.)

And if an AI cannot exceed its design principles, then it is not a *new* category of existential threat. It's just another way in which we can build machines -- like RBMK nuclear reactors, say, or thalidomide, or self-driving cars -- that are capable of killing us, if we are insufficiently careful about their control mechanisms. We can already build "Skynet," in the sense that we can build a massive computer program that controls all our nuclear weapons, and we can do it stupidly, so that some bug or other causes it to nuke us accidentally (from our point of view). But that's a long way from a brand-new type of threat, a life-form that can and does form the original intention of doing us harm, and comes up with novel and creative ways to do it.


* And I don't really see why you assume that difference is one of mere degree, as opposed to being a quantum leap, a not-conscious/conscious band gap across which one must jump all at once, and cannot evolve across gradually. If it were a matter of degree, one would expect to see a smooth variation in consciousness among people, just as we see a smooth variation in height or weight: some people would be "more conscious" and some would be "less conscious." Through the use of drugs we should be able to simulate states of 25% conscious, or 75%, or maybe 105%. (And I don't mean "being awake" here, so that sleep counts as "not conscious"; I mean *self-aware*, the usual base definition of being a conscious creature.) How would we even define a state of being 75% as self-aware as someone else? So far as our common internal experience seems to go, being a conscious self-aware being is all-or-nothing: you either are or you aren't, it's a quantum switch. That doesn't really support the idea that it could gradually evolve.

I think a significant fraction of people trying to think about AGI are tripped up by the following:

Humans have this neat cognitive hack for understanding other humans (and sorta-kinda-humans, such as pets). If you tried to understand your friend Tom in the way an engineer understands a car, as a bunch of simple parts that interact in systematic ways to generate larger-scale behaviors, it would be quite difficult. But since you happen to BE a human, you can use yourself as a starting point, and imagine Tom as a modified version of you.

You don't really *understand* yourself, either. But you have a working model that you can run simulations against, which lets you predict how you'll behave in various hypothetical situations with a reasonable degree of accuracy. And then you can tweak those simulations to predict Tom, too (with less accuracy, but still enough to be useful).

I think a lot of people look at the computers of today, and they understand those computers in the normal way that engineers understand cars and planes and elevators and air conditioners. Then they imagine AGI, and they apply the "modified version of myself" hack to understand that.

And those models don't feel like models, they just feel like how the world is. ( https://www.lesswrong.com/posts/yA4gF5KrboK2m2Xu7/how-an-algorithm-feels-from-inside )

This tends to produce a couple of common illusions.

First, you may compare your two mental models (of today's computers vs future AGI) and notice an obvious, vast, yet difficult-to-characterize difference between them. (That difference is approximately "every property that you did NOT consciously enumerate, and which therefore took on the default value of whatever thing you based that model on".)

That feeling of a vast-yet-ineffable difference exists in the map, not the territory.

There might ALSO be a vast difference in the territory! But you can't CONCLUDE that just from the fact that you MODELED one of them as a machine and the other as an agent. To determine that an actual gulf exists, you should be looking for specific, concrete differences that you can explain in technical language, not feelings of ineffable vastness.

If you used words like "self-aware", "conscious", "sentient", "volition", etc., I would consider that a warning flag that your thinking here may be murky. (Why would an AGI need any of those?)

Second, if you think of AGI like a modified version of yourself (the way you normally think about your coworkers and your pets), it's super easy to do a Typical Mind Fallacy and assume the AGI would be much more similar to you than the evidence warrants. People do this all the time when modeling other people; modeling hypothetical future AGI is much more difficult; it would be astonishing if people were NOT doing this all over the place.

I think this is the source of most objections along the lines of "why would a superintelligent agent spend its life doing something dumb like making paperclips?" People imagine human-like motives and biases without questioning whether that's a safe assumption.

(Of course YOU, dear reader, are far too intelligent to make such mistakes. I'm talking about those other people.)

Personally I wish we'd table the long-term, strong AI topic since, as I commented on the original Acemoglu post, these conversations are just going in circles. Do you yourself honestly feel like your understanding of this issue is progressing in any way? Or that your plan for action has changed in any way? You're at 50-50 on the possibility of true AI by 2100. So after all this, you still have no idea. And neither do I. And that's hardly changed in years. We aren't accomplishing anything here. Four of the last eight posts have been AI-related. Sorry to keep being such a pooh-pooher, but I really appreciate your writing and I feel like it just goes to waste on this. I'd love for the focus here to be on more worthwhile topics.

> OpenAI’s programs can now write essays, compose music, and generate pictures, not because they had three parallel amazing teams working on writing/music/art AIs, but because they took a blob of learning ability and figured out how to direct it at writing/music/art, and they were able to get giant digital corpuses of text / music / pictures to train it.

One thing outsiders might not understand is how huge a role "figured out how to direct it" plays in AI software. In Silicon Valley everyone and their intern will tell you they're doing AI, but there are very few problems you can just naively point a neural network at and get decent results (board games / video games are the exception here, not the rule). For everything else, you need to do a ton of data cleanup -- it's >90% of the work involved -- and that means hard-coding a bunch of knowledge and assumptions about the problem space into the system. The heuristics from that effort tend to also do most of the heavy lifting as far as "understanding" the problem is concerned. I've worked at one startup (and heard stories of several more) where the actual machine-learning part was largely a fig leaf to attract talent and funding.

So here's another scale we might judge AI progress on: How complicated a problem space is it actually dealing with? At one extreme would be something trivial like a thermostat, and at the other-- the requirement for AGI "takeoff"-- would be, say, the daily experience of working as a programmer. Currently AI outperforms top humans only at tasks with very clean representations, like Go or Starcraft. Further up the scale are tasks like character recognition, which is a noisier but still pretty well-defined problem space (there's a single correct answer out of a small constant number of possibilities) and which computers handle well but not perfectly. Somewhere beyond that you get to text and image generation, much more open ended but still with outputs that are somewhat constrained and quantifiable. In those cases the state of the art is significantly, obviously worse than even mediocre human work.

My wild guess is that games are 5-10% of the way to AGI-complexity-level problems, recognition tasks are 20-30% of the way there, and generation tasks are 40-60% of the way there, which would suggest we're looking at a timescale of centuries rather than decades before AGI is within reach.

Humans don't just make art, humans invented art. And humans invented games. Programs which can beat all humans at chess and go are amazing, but I might wait to be impressed until a program invents a game that a lot of people want to play.

So far as I know, programs can make art that people can't tell from human-created art, but haven't created anything which has become popular, even for a little while.

Other than that, I've had a sharp lesson about my habit of not actually reading long words. I was wondering for paragraphs and paragraphs how the discussion of the Alzheimer's drug had turned into a discussion of AI risk.

I always thought it was going to be a collection of AIs which can direct each other. One can see the power that comes in the example of the Learner + Strategy narrow AI/algorithms.

If 1 + 1 = 2...then might 1 + 1 + 1 = 3 or 1 + 2 = 3? Or perhaps at some point it is 2 controller AIs x 3 problem AIs = 6 AI power instead of only 5 AI power.

Just like in the body we have layers of systems that interact. Certainly a sufficiently well organised collection of AIs serving various function, coordination, strategy, and learner roles could become extremely powerful. It may well take a sophisticated group of humans to put it together, but I don't see why that could not happen.

The blood and plasma fill the veins and these get pumped by the heart and the entire cardiovascular system receives instructions through hormones such as adrenaline to go faster or various parasympathetic instructions to slow down.

You can do the same with the kidneys and neural nets, etc. and at some point you have a human being. I'm inclined to agree with Scott that we're on a path now and just need to keep going...at some point we hit sufficiently complex and multifunctional AIs that they are better than many humans or better than any human at more things than any one person could ever do.

Some combination of AIs to recognise problems, sort them into categories, assign tasks to those systems, and implement solutions could work to create a very complex being.

Basically we keep thinking that AI is the brain only...but that's not a great analogy. There need to be many body-like parts, and I'm not talking about robotics, but about many functional AIs.

Just imagine we had a third AI to the very simple Learner + Strategy AI. This AI is a 'Is it a game?' AI or a Categoriser AI.

So now we have Categoriser + Learner + Strategy. This Categoriser is like a water filter that stops stupid junk from going into our Learner.

Here's a book....Categoriser says this is not a game! rejection.

Here's a board game....Categoriser says this is a game! Accept - Learner learns...Strategy developed.

Game 2 - Categoriser ...is a game...accept - learner learns - Strategy 2 is developed.

Game 1 is presented again - Categoriser sees this - Strategy 1 is deployed...Learner accepts new data.

Something like this can work. That way our Learner doesn't keel over in confusion or get clogged up with lots of useless information from a book which is not a boardgame.
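The walkthrough above can be sketched in a few lines of Python. Everything here is hypothetical: the class name `Router`, the hand-written set standing in for the Categoriser, and the string standing in for a learned strategy are all assumptions for illustration. A real system would put a trained model behind each of them.

```python
from dataclasses import dataclass, field

@dataclass
class Router:
    """Toy Categoriser + Learner + Strategy store (all names hypothetical)."""
    known_games: set = field(default_factory=set)   # what the Categoriser accepts
    strategies: dict = field(default_factory=dict)  # game name -> stored strategy
    log: list = field(default_factory=list)         # (item, action) history

    def categorise(self, item):
        # Stand-in for a real classifier: a lookup in a hand-written set.
        return item in self.known_games

    def learn(self, game):
        # Stand-in for training: just produce a label for the new strategy.
        return f"strategy-for-{game}"

    def present(self, item):
        """Filter the item, then either reuse a stored strategy or learn one."""
        if not self.categorise(item):
            self.log.append((item, "rejected"))
            return None
        if item in self.strategies:
            self.log.append((item, "reused"))
        else:
            self.strategies[item] = self.learn(item)
            self.log.append((item, "learned"))
        return self.strategies[item]

router = Router(known_games={"chess", "go"})
for item in ["a novel", "chess", "go", "chess"]:
    router.present(item)
print(router.log)
```

One simplification: when a known game comes back, this sketch only redeploys the stored strategy, whereas the description above also has the Learner take in the new data.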

This could be a single node within a series of nodes. We connect up lots of these systems with various independent hierarchies of AI and bam...over time it adds a lot of functionality.

We don't need a super general AI which can become super generalised to figure out how to play Go or Chess or Starcraft...we already have these. Why not simply have one AI that turns on another AI?

If we can brute force solve enough problems and over time get slightly and moderately better at creating general AI, then we can get towards some arbitrary point in space where we can say we have an AGI.

Over time and with enough testing of which AI to wire to another AI we'll likely discover synergistic effects. As in my opener above where 2 x 3 instead of 2 + 3 AIs occurs.

Now three children stacked on top of each other in a trenchcoat isn't an adult per se. But if those kids get really good at doing a lot of things which adults do, then they can 'pass' or at least achieve certain things for us.

Throw on some robotics AI....and you can get some interesting smarter robots vs the dumb robots of today. We don't need to teach 'Spot' from Boston Dynamics how to do everything with a single AI, but can use a combination set to be the brains to drive the thing. Hopefully not straight into picking up a gun and going to war, but that'll probably happen.

But the more useful function is being able to navigate a sidewalk or neighbourhood to deliver parcels from driverless delivery trucks, for a truly hands-free warehouse-to-front-door experience. If Tesla can get a car to drive, hopefully we can get a robot doggie to deliver parcels without bumping into or hurting anyone, even calling an ambulance for anyone it finds in trouble someday.

Who knows what a philosopher will call such a being, the multi-chain AI, not the robo-doggie. Is it an AGI, is it a risk to humanity, is it narrow or broad? Who cares when considering the greedy pigs who'll try to make money from it without a care or thought in their minds about abstract or future focused concepts outside of increasing their net worth...side question: are they sentient? The main question will be, is it useful? if it is, then someone will make and sell it or try to.

So yea...I see no problem with 2 x 3 = 6 AI being how a series of properly connected AI could operate. So as we move forward in a stepwise direction, we'll get increasingly complex AI.

Maybe the Categoriser is hooked up to the Learner + Strategy line for boardgames...but will redirect textual information to GPT4 or GPT5 or whatever successor GPT gets developed to improve its database of things to learn from. That could be a (1 x 2) + 1 scenario or even a (1 + 2) + (1 + 1) chain. Some future notation, or an existing notation I'm unaware of, will address how to denote AI chains to estimate their overall complexity.

So, it’s 2035, and there is yet another general learner, and it’s even more capable than the previous one, and there is news about yet another drone war, and there is no tangible progress from the AGI risk prevention community, but there are even more research labs and agents working towards AGI, and Metaculus predictions are scary, and you have the same feeling about the future that you had in February 2020 - about the impotence and adversity of regulators, and events following their own logic.

But this time you cannot stockpile and self-isolate, and this time the disaster is much worse than in 2020. So then you ask yourself what you could have done differently in order to prevent this. Maybe waiting for some smart guys from MIRI to come up with a panacea was not the best plan of action? Maybe another slightly funnier Instagram filter was just not worth it? Maybe designing a better covid drug was not the problem you should have worked on?

And when the time comes, will you just sit and watch how events unfold? And shouldn’t you have started acting earlier, and not in 2035, when the chances of a positive outcome are much smaller, and the actions needed much more radical?

If you look at how human intelligence translates to real-world impact, almost all of it depends on hard-wired (instinctual) motivations.

For example, you could have a Capybara that was actually super-intelligent. But if its instincts just motivated it to eat, sleep, and sit in the water, that's what it would do, only more skillfully (and I think there are diminishing returns to intelligence there).

Humans are of course social animals and have a complex set of social instincts and emotional states that motivates most of their behavior.

Of course AI research may (and I believe eventually will) address issues of motivation. But will they accidentally address it without understanding it at all? If you look at human instincts, they are suspiciously hard-coded (the core instincts (hunger, social, etc) are largely the same although they manifest in different ways). This suggests that motivation is not an intrinsic property of general pattern-recognition.

If this is the case then motivated AIs will not arise accidentally out of pattern-recognition research, and will need to be implemented explicitly, giving control to the creator. So this supports an AI-as-nuclear-weapons type scenario, where it is a dangerous tool in human hands but not independently a threat (unless explicitly designed by humans to be so).

One other important angle: Are there diminishing returns to intelligence? If you look at for example Terence Tao, there are clearly people with much greater pattern-recognition ability than average. But this does not seem to straightforwardly translate to political or economic power (just look at our politicians). If there are diminishing returns to intelligence, and assuming no superweapons are available based on undiscovered physics, perhaps machine super-intelligence isn't a threat on those grounds?

My guess is that while there are probably diminishing returns to human intelligence, this is less true for AI. Mainly because AIs could presumably be tuned for speed rather than depth of pattern recognition, and in many (physical/economic/military) fields 1000000 man-years of work can more easily translate to real-world impact than one extremely intelligent person working for 1 man-year. This is of course assuming (I think reasonably) that AIs can eventually be optimized to such a degree that they are available substantially cheaper than human intelligence is.


One kind of key thing that I think a lot of people don't understand is that existing neural network technology has kind of a fundamental algorithmic limitation that (as far as we can tell) does not apply to biological neural networks. Namely, existing techniques can really only learn fixed computational graphs. Any behavior produced by the neural network needs to be differentiable for backpropagation based learning to work, and changing the actual structure of the neural network being learned is inherently non-differentiable. Thus, the neural network weights being used are always the same, no matter what the situation is.

If you think of a neural network as being "learned code", this is a really serious problem. It amounts to an inability to alter control flow. A backpropagation learning algorithm is like a programmer using a language that has no loops, gotos, or function calls. If it needs something to happen 20 times in a row, it needs to write it out 20 times. It's very hard to write code that competently handles lots of cases, because every unique case has to be explicitly enumerated.

This explains some of the weird quirks of the technology, like how transformer language models can easily learn arithmetic on three-digit numbers but struggle with five-digit ones. You can handle unlimited-length arithmetic with a simple recursive algorithm, but there's no way to represent a recursive algorithm in the weights of a neural network, so every length of addition has to be learned as a separate problem, and five-digit arithmetic needs orders of magnitude more examples.
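To make the "simple recursive algorithm" concrete, here is a minimal Python sketch of schoolbook addition on digit strings. The name `add_digits` is my own and purely illustrative. The only point is that one short rule with a carry covers every length of number, which is the kind of reuse the argument above says a fixed computational graph can't express.

```python
def add_digits(a: str, b: str, carry: int = 0) -> str:
    """Add two non-negative decimal strings by recursing on the last digit."""
    # Base case: no digits left, so only a final carry (if any) remains.
    if not a and not b:
        return str(carry) if carry else ""
    # Take the last digit of each number, treating a missing digit as 0.
    da = int(a[-1]) if a else 0
    db = int(b[-1]) if b else 0
    total = da + db + carry
    # Recurse on the higher-order digits, passing the carry along.
    return add_digits(a[:-1], b[:-1], total // 10) + str(total % 10)


print(add_digits("123", "989"))    # three-digit inputs
print(add_digits("99999", "1"))    # five-digit input, same rule, no new cases
```

In practice Python's recursion limit caps this at roughly a thousand digits, so "unlimited" here is a property of the algorithm rather than of this particular sketch.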

I also believe that this kind of thing is probably responsible for the sample efficiency issues we see with neural networks, where they need far more data than humans do. The inability to do recursion means that the most efficient and general styles of learning are unavailable. Everything has to be done with brute force, and the underlying rules simply can't be represented by the computational process modelling them.

I think this problem is going to be solved in the next 10-20 years, and when it is, it's going to be a pretty major sea change in terms of neural network capability.


Typo thread:

"c is the offender’s age" -> "a is the offender’s age"

"blog" -> "blob"

"deal with problems as the come up" -> "deal with problems as they come up"


When did you get into Max Stirner tho?


"One of the main problems AI researchers are concerned about is (essentially) debugging something that’s fighting back and trying to hide its bugs / stay buggy. Even existing AIs already do this occasionally - Victoria Krakovna’s list of AI specification gaming examples describes an AI that learned to recognize when it was in a testing environment, stayed on its best behavior in the sandbox, and then went back to being buggy once you started trying to use it for real."

This description has almost nothing to do with the actual contents of the link. The link contains examples of "unintended solutions to the specified objective that don't satisfy the designer's intent" (quote from the Google form for submitting more examples), in some cases due to the objective being insufficiently well-defined, in a lot of (most?) cases due to bugs in the sandbox. I see no mention of trying to use any of the AIs for real, never mind one of them learning to tell the difference between sandbox and reality.


>Non-self-aware computers can beat humans at Chess, Go, and Starcraft.

It was never proven that they could beat the best StarCraft players, the top 0.1%. With no news for so long it seems like they gave up on that goal too, so it could be many years before another AI comes along to make an attempt.


> A late Byzantine shouldn’t have worried that cutesy fireworks were going to immediately lead to nukes. But instead of worrying that the fireworks would keep him up at night or explode in his face, he should have worried about giant cannons and the urgent need to remodel Constantinople’s defenses accordingly.

Well... maybe. I'm not sure if cannons specifically would've been immediately conceivable based on just fireworks; yes, it seems obvious to us today, but it might not have been in the past. In any case, imagine that you're standing on the street corner, shouting, "Guys ! There's a way to scale up fireworks to bring down city walls, we need thicker walls !"; and there's a guy next to you yelling, "Guys ! Fireworks will lead to a three-headed dog summoning a genie that will EAT the WORLD ! REPENT !!!"

Sure, some people might listen to you. Some people might listen to the other guy. But most of them are going to ignore both of you, on the assumption that you're both nuts. In this metaphor, the Singularity risk community is not you. It's the other guy.

Yes, if we start with a 50/50 prior of "malevolent gods exist", then you can find lots of evidence that kinda sorta points in the direction of immediate repentance. But the reasonable value of the prior for superintelligence is nowhere near that high. We barely have intelligence on Earth today, and it took about 4 billion years to evolve, and it went into all kinds of dead ends, and it evolves really slowly... and no one really knows how it works. Also, despite our vaunted human intelligence, we are routinely outperformed by all kinds of creatures, from bears to coronaviruses.

All of that adds up to a prior that is basically epsilon, and the fact that we have any kind of machine learning systems today is pretty damn close to a miracle. Even with that, there is no amount of little tiny tweaks that you can add to AlphaGo to suddenly make it as intelligent as a human; that's like saying that you can change an ant into a lion with a few SNPs. Technically true, but even more technically, currently completely unachievable.


>Then you make it play Go against itself a zillion times and learn from its mistakes, until it has learned a really good strategy for playing Go (let’s call this Strategy).

>Learner and Strategy are both algorithms. Learner is a learning algorithm and Strategy is a Go-playing algorithm.

They're structurally the same; giving the two algorithms different names is confusing. Strategy is a Learner: it has everything it needs to keep learning. Learner is a Strategy too, though a bad one (it learns by trying out bad strategies and getting punished for them or not, after all).

After each game, you can turn your LearnerStrategist into another one by giving it feedback, but at no point do you have two separate Learner and Strategy algorithms (unless you want to include "backprop" as part of the "Learner" entity, but that seems uninteresting, since whether you remove backprop or not is unrelated to having played a billion games, or being good at the game)
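A toy Python sketch of that point, under heavy assumptions: the class name `LearnerStrategist`, the two-move "game", and the score-nudging update are all hypothetical stand-ins (nothing like AlphaGo's network or backprop). It only illustrates that the thing that plays and the thing that gets updated are the same kind of object before and after feedback.

```python
import random


class LearnerStrategist:
    """One object that both plays (strategy) and can be updated (learner)."""

    def __init__(self, preferences):
        # preferences maps move -> positive score; higher means played more often.
        self.preferences = dict(preferences)

    def choose_move(self, rng):
        # "Strategy" role: sample a move with probability proportional to its score.
        moves = list(self.preferences)
        weights = [self.preferences[m] for m in moves]
        return rng.choices(moves, weights=weights, k=1)[0]

    def with_feedback(self, move, reward, step=0.5):
        # "Learner" role: return another agent of the same type with a nudged score.
        # This simple rule is a stand-in for a real update such as backprop.
        updated = dict(self.preferences)
        updated[move] = max(0.01, updated[move] + step * reward)
        return LearnerStrategist(updated)


rng = random.Random(0)
agent = LearnerStrategist({"good_move": 1.0, "bad_move": 1.0})
for _ in range(200):
    move = agent.choose_move(rng)
    reward = 1.0 if move == "good_move" else -1.0  # toy feedback signal
    agent = agent.with_feedback(move, reward)

print(agent.preferences)
```

The agent after 200 rounds is still a `LearnerStrategist` and could keep learning; the only thing that changed is the numbers inside it.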


I agree the issue of the hard problem of consciousness isn't relevant to AI risk.

But it's deeply morally important. And the problem is that it seems implausible there is a way you can define what it means to be a computation (bc a recording of what someone said gives same output but isn't same computation) without at least some ability to talk meaningfully about what that system would have counterfactually done with other input.

But evaluating that very much requires specifying extra properties about the world. At the very least you have to say there is a fact of the matter that we follow, say, this formulation of Newtonian mechanics not this other mathematically equivalent one bc those equivalent formulations will disagree on what happens if, counterfactually, some other thing happened (eg a miracle that changes momentum at time t will break versions of the laws that depend on it always being conserved).


>Consider a Stone Age man putting some rocks on top of each other to make a little tower, and then being given a vision of the Burj Dubai

Consider a mindless process putting cytosine, guanine, adenine, and thymine on top of each other...


Your answers (esp to Lizard Man) really helped me crystalize what I find disquieting about the Yudkowsky school for dealing with AI risk.

I absolutely believe there are hard problems about how to do things like debug an AI given it can learn to recognize the training environment. Indeed, I even think these problems give rise to some degree of x-risk. But note that studying and dealing with these problems doesn't involve, and indeed is often in tension with, the idea of trying to ensure that the AI's goals are compatible with yours. That's only one way things can go bad and probably not even the most likely one.

As AIs get better the bugs will get ever more complex but the AI as deity kind of metaphor basically ignores all the danger that can't be seen as the AI successfully optimizing for some simple goal. The evil deity AI narrative is putting too much weight on one way things can go bad and thereby ignoring the more likely ways it does something bad by failing to behave in some ideal way.

Or, to put the point another way, if I was going to give someone a nuclear launch key I'd be as worried about a psychotic break as a sober, sane plan to optimize some value. I think we should be similarly or more worried about psychotic, aka buggy (and thus not optimizing any simple global goal), AI.


To give in and use metaphors as well I'd say that while a caveman might be able to understand that it would be really really dangerous if someone could harness the power that makes the sun go as a weapon their pre-theoretical understanding of that danger is a very bad guide to how to limit the danger.

They might come up with neat plans to station guards in front of the place Apollo parks his chariot during the evening and clever ways to deter the guards from defecting (eg threatening to torture their loved ones before the end). But they won't come up with radiation detectors to monitor for fissile materials or export controls on high speed centrifuges to limit uranium enrichment (fusion bombs require a fission initiator).

So yah, I believe there are real dangers here but let's drop the ridiculous narratives about the AI wanting to maximize paperclips because, almost surely, the best way to understand both the risks and to reduce them will be to invent new tools and means of understanding these issues.

So by all means do research the problem of AIs taking unwanted actions but don't focus on *alignment* because it's not likely to turn out that our analogies really provide a good guide to the risk or to solving it.


>I think the closest thing to a consensus is Metaculus, which says:

>There’s a 25% chance of some kind of horrendous global catastrophe this century.

>If it happens, there’s a 23% chance it has something to do with AI.

>The distribution of when this AI-related catastrophe will occur looks like this:

1. The time distribution question doesn't seem to be part of the same series as the first two, and isn't linked; could you provide a link to the question?

2. Some of the questions in that Ragnarok Question Series have imperfect incentives. To give an example:

- Assume for the sake of argument that there is a 20% chance of an AI destroying humanity (100% kill) by 2100, a 4% chance of it killing 95% > X > 10% of humanity, and a 1% chance of it killing 100% > X > 95% of humanity (if it gets that many, it's probably going to be able to hunt down the rest in the chaos unless it happens right before the deadline). Assume other >10% catastrophes are negligible, and assume I am as likely to die as anyone else.

- I don't care about imaginary internet points if I am dead. Ergo, I discount the likelihood of "AI kills 10% < X < 95%" by 52.5%, the likelihood of "AI kills 95% < X < 100%" by 97.5%, and the likelihood of "AI kills 100%" by 100%.

- The distribution I am incentivised to give is therefore "no catastrophe 97.50%, 10% < X < 95% 2.47%, 95% < X < 100% 0.03%, 100% 0%".
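The arithmetic behind that distribution, as a small Python sketch. The probabilities are the made-up ones from the example above, not real forecasts, and the "expected fraction killed" per bucket is assumed to be the midpoint of its range (which is where the 52.5% and 97.5% discounts come from). The function name is my own.

```python
# (probability, expected fraction of humanity killed) for each outcome.
# All numbers are the hypothetical ones from the example, not forecasts.
outcomes = {
    "no catastrophe": (0.75, 0.0),
    "10% < X < 95%": (0.04, 0.525),   # midpoint of 10% and 95%
    "95% < X < 100%": (0.01, 0.975),  # midpoint of 95% and 100%
    "100%": (0.20, 1.0),
}


def incentivised_distribution(outcomes):
    """Reweight each outcome by the chance the forecaster survives to be scored."""
    weighted = {
        name: p * (1.0 - killed) for name, (p, killed) in outcomes.items()
    }
    total = sum(weighted.values())
    # Renormalise so the surviving-forecaster probabilities sum to 1.
    return {name: w / total for name, w in weighted.items()}


for name, p in incentivised_distribution(outcomes).items():
    print(f"{name}: {p:.2%}")
```

Rounded to two decimals this reproduces the 97.50% / 2.47% / 0.03% / 0% split, even though the "true" chance of no catastrophe in the example is only 75%.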


As an elderly coworker once told me: the odds of the answer of any yes or no question is 50-50. It's either yes, or no!

...I don't think he was a Bayesian.


While reading the DYoshida section, I can't help but compare that to Pascal's Mugging. What differentiates Pascal from this very-similar-looking alternative? In both cases, we have an impossible-to-quantify probability of a massive issue. If that issue is true, it requires significant resources to be directed towards it. If it's false, then no resources should be directed to it. Because the negative consequences of not responding are so great, we consider whether we should put at least a fair amount of resources into it.

Pascal's Wager/Mugging doesn't typically elicit a lot of sympathy around here, but I'm struggling to see the difference. Is it the assigned probability? If we really don't know the probability of either, that's less than convincing to anyone who has reason to doubt.


Maybe all the very intelligent people who can't get all the dumb people to take their concerns seriously should maybe re-evaluate how powerful intelligence is?

Also, I don't think anybody arguing for an AI-related catastrophe can actually specify what it is they're worried about. Which, granted, is kind of a case of 'If I could specify it, the problem would be solved' - but also, as a society, we've stopped lending credence to underspecified proclamations of future problems, because the discussion does not look like "Here are the specific things we are worried about." The discussion looks like "Those madmen are playing God! Who even knows what we should be worried about, that's the reason we don't play God!"

Also, agent-like behaviors look to me like an extremely hard problem in AI, and the worry that somebody will accidentally solve it seems ... well, odd. The whole discussion here seems odd, amounting to a fear somebody will accidentally solve a problem in a field which has, over the fifty-ish years we have spent on it, consistently and repeatedly shown itself to be way harder than we thought ten years ago.


One being's risk is another being's opportunity. Why are super-capable AIs even a problem?

Sure, it might be a problem for humans, but why should we be obsessed with humans? Is that not species-racism?

...Not least since we are highly likely to be on the way out anyway. If not before, then at least when the sun blows up. We are not particularly well suited for space travel, so discount that remote possibility. Machines tackle space much better. Passing the torch to AIs may even allow us to go out with a bang rather than a whimper.

What is so scary about passing the torch to another being, if that being is better than us in surviving? And which on top of that might have more success than us in reaching the stars?


AlphaZero-trained-to-win-at-international-politics would be *able* to take over the world, but it wouldn't *want* to take over the world. The only thing AlphaZero "wants" to do is learn to win things, and no amount of getting better at learning how to win things is going to make it generally intelligent because that's not the same problem.

Likewise, you could train GPT-N to generate scissor statements, but it wouldn't go around spontaneously generating such statements, because it's not a system for wanting anything, it's a system for generating outputs in the style of a given input, and no amount of getting better at "generating outputs in the style of a given input" is going to make it generally intelligent, because that's not the same problem.

And you could plug the two of them together, and add in Siri and Watson for good measure, and then you'd have a system where you could ask it "please generate a set of five statements that if published will cause the government of Canada to fall", and it would do it, and the statements would work (unless it screwed up and thought Toronto was a US city). But none of this is an argument for "AGI alignment threat", it's all an argument for "people using ML to do terrible things threat".

> My impression is that human/rat/insect brains are a blob of learning-ability

Human brains *have* a blob of learning ability. But they have a bunch of other things too, and we don't seem to be making nearly as much progress on artificializing the non-learning-blob parts.


tl;dr None of these SC2 complaints are "AlphaStar got to train on millions of games" "200 years" or anything like that. AlphaStar was competing with substantial game advantages that DM/media downplay/ignore.

I am not a Starcraft expert, however I may be one of the worst players to reach Platinum in SC2. For me to push from Silver to Plat (near release, different league system with Plat being ~top 20% if I recall correctly) required an incredible level of effort. Wake up, practice games, go to work, come home, review vods, warm-up games, ranked games, review vods, sleep. Notes/map-specific build orders taped to side of monitor along with specific map/match-up based cheese tactics. This was, naturally, the optimal way to spend a summer internship where I worked 35 hours per week, had a small commute and only worked out twice a week for a few hours. With all of this effort I managed to claw my way to Plat (and have played 3 games of SC2 in the years since).

However, I think this gives me some perspective on learning models for SC2 and how the AlphaStar system is nonsense. Caveat: The specifics of any one model/test are useful only for a minor adjustment, these are mostly problems that could probably be solved with more money/time/development effort/incentives to solve and essentially amount to saying "this 2019 test shows that in 2029 AI could probably be pretty dang good in a heads-up-match against humans". It is still important in the broader scheme because, I suspect, many/most of the examples people give face similar problems, and give the false idea that AI is more advanced than it is.

Points 1-3 discuss the 1st version of AS, points 4-6 discuss the 2nd version. Both versions have a similar problem: They claim to be competing on the same game, but actually have substantial systematic advantages the developers/media downplay and/or ignore.

1: DeepMind's (DM's) comments about APM are tangential to what APM means for Starcraft and show a desire to advertise the coolness of the product (naturally), but downplay the massive advantage AlphaStar (AS) has over humans.

DM says (quotes from) https://deepmind.com/blog/article/alphastar-mastering-real-time-strategy-game-starcraft-ii

"In its games against TLO and MaNa, AlphaStar had an average APM of around 280, significantly lower than the professional players, although its actions may be more precise. ... interacted with the StarCraft game engine directly via its raw interface, meaning that it could observe the attributes of its own and its opponent’s visible units on the map directly, without having to move the camera - effectively playing with a zoomed out view of the game. ... we developed a second version of AlphaStar. Like human players, this version of AlphaStar chooses when and where to move the camera, its perception is restricted to on-screen information, and action locations are restricted to its viewable region. ... The version of AlphaStar using the camera interface was almost as strong as the raw interface, exceeding 7000 MMR on our internal leaderboard. In an exhibition match, MaNa defeated a prototype version of AlphaStar using the camera interface, that was trained for just 7 days."

So, afaiu, TLO and MaNa lost 10 games against a computer that had to use 0 APM to receive all possible information, then MaNa won 1 game against a computer that (the way they implemented the screen restriction) still had to use very little APM (from what I can find it might have used as high as 10 APM on this task, but maybe 5) to receive almost all possible information. Receiving information and doing basic macro is a background load of ~75-100 APM out of the ~250-300 average APM of a tournament pro. So AS is doing macro at ~25 APM and info gathering at 10 APM, running a baseline APM load of 35, well under half of what a human would take to do those tasks. It's not doing this because it's "more efficient" than humans; it's doing it because the rules are set to allow it to gather information in a way not available to humans (instantly downloading all info on the entire screen), and because of map-specific optimizations that will be discussed later.

2: MaNa is not a "top pro".

Look, I'm a trash-monkey at SC2. He's good, he's clearly put in a lot of work and is treating this as a professional career. But he's basically a journeyman who competes in a lot of EU-only events, does ok/good, then goes to international events and finishes with solidly meh results. Holding him up as a "top pro" or some representative of what humans are capable of at SC2 is weak; also, he still won the 1 out of 11 games in which AS was on a more equal footing in terms of free vision on map/free data.

3: SC2 pros are solving a different problem than AS.

AS was solving how to win on 1 map, "Our agents were trained to play StarCraft II (v4.6.2) in Protoss v Protoss games, on the CatalystLE ladder map." A pro needs to play all ladder/tournament maps, and can only spend so much optimization time on any specific map. This means that pros move units by using many clicks, where if you optimize for 1 specific map you can learn how the pathfinding will function for those units on that position on that map and reduce the APM needed to do basic things like move units. This, combined with the above APM discussion, further shows how the APM comparisons are nonsense. AS is not an AI sitting at the computer taking in information, moving a mouse/clicking a keyboard. AS already gets a substantial "free" APM boost (even in screen limited mode); allowing these map-specific improvements frees up another ~30 APM, meaning the total "free" APM AS gets is probably around 100 over a human/AI sitting at mouse and keyboard using optical based vision. Saying "look they both have tournament APM" is a lie. AS, functionally, has an equivalent sustained APM of at least ~380, well above tournament levels.

Turning now to the improved version of AS discussed here:


4: "The interface and restrictions were approved by a professional player." By 2019 TLO was, bluntly, a washed up journeyman player with his best (not particularly great) days long behind him. He was definitely a pro, as in he was paid money for SC2, but the article also describes him as a top pro, which was never true and certainly not true in 2019.

His approval/disapproval of the interface and restrictions are irrelevant to the fairness.

5: Here's what they say (I don't have access to the paid and published article about their camera interface for this version): "AlphaStar played using a camera interface, with similar information to what human players would have, and with restrictions on its action rate to make it comparable with human players." They also say "Agents were capped at a max of 22 agent actions per 5 seconds, where one agent action corresponds to a selection, an ability and a target unit or point, which counts as up to 3 actions towards the in-game APM counter. Moving the camera also counts as an agent action, despite not being counted towards APM." [note: camera movement doesn't count as an APM with the mouse, but afaik it does with the keyboard which is the 'proper' pro way to move the camera, also afaik it counts if you save locations and Fkey switch]

From media coverage and this article I understand "camera interface" to mean "instant computer readout of all the possible information displayed on the screen, with 0 action screen switching". Based on their prior article and the above, I have already discussed the nonsense phrase "restrictions on its action rate to make it comparable with human players". As a brief recap: AS gets a free baseload (even in camera mode) of ~100 APM, meaning the APM cap should be ~100 less than a human's APM for comparable results.

This version played all ladder maps, after much additional optimization, and reached the 99.8 percentile, under the condition that you (functionally) spot it 100 APM.

6: The APM issue gets worse. Grandmaster APM is around 190. Even if you say the improved camera mode is only spotting it 50 APM, it's still a substantial advantage. But, for this version they provide a brief explanation of their peak APM model, 22 agent actions per 5 seconds which could count as up to 3 human apm. As I read it this means in a 5 second block AS is taking up to 110 actions, but might only display 66 actions (or less). This means their APM cap is (ignoring camera movement, which is negligible when we see the final # and actual keyboard camera movement may count, I can't remember) 66 actions in 5 seconds, 13.2 APS for 792 APM. Quite the cap there DM. 792 APM Cap, and it's essentially spotted some APM by the improved interface, and it can sustain this indefinitely, vs players running ~200 APM sustained. Brilliant. You've definitely competed on an apples to apples level.
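The 792 figure, spelled out as a minimal Python sketch. The 22-per-5-seconds cap and the "up to 3" multiplier are the quoted DM numbers; the ~200 sustained human APM is the rough estimate used above, and the function name is my own. Camera moves are ignored here, as in the estimate above.

```python
# Quoted cap: 22 agent actions per 5 seconds, each worth up to 3 in-game actions.
AGENT_ACTIONS_PER_WINDOW = 22
WINDOW_SECONDS = 5
MAX_GAME_ACTIONS_PER_AGENT_ACTION = 3
HUMAN_SUSTAINED_APM = 200  # rough figure for a top human, an assumption


def apm_cap(agent_actions: int, window_seconds: int, per_action: int) -> float:
    """Worst-case in-game actions per minute implied by a windowed cap."""
    actions_per_window = agent_actions * per_action
    # Scale the window up to a full minute.
    return actions_per_window * 60 / window_seconds


cap = apm_cap(
    AGENT_ACTIONS_PER_WINDOW, WINDOW_SECONDS, MAX_GAME_ACTIONS_PER_AGENT_ACTION
)
print(f"Implied cap: {cap:.0f} APM")
print(f"Ratio to a ~200 APM human: {cap / HUMAN_SUSTAINED_APM:.1f}x")
```

This is an upper bound on what the cap permits, not a measurement of what AS actually averaged in its games.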

Really, after reading the two DM articles and taking some time to look at their claims, I conclude they are way too far into hype mode to be taken seriously. They minimize, ignore, or (in the case of their characterization of the pros) wildly misstate the situation. This means I give them very little credit for any of the things not disclosed or discussed, and essentially assume any factors they don't disclose are even worse. Finally, the switch from talking about/showing actual APM to giving their ridiculous "22 agent actions per 5 seconds", each of which means up to 3 human actions, without translating it into APM, is the final bad-faith coffin nail.

DeepMind/AlphaStar isn't a grandmaster AI, and it's certainly not a pro-level AI, on anything approaching an even footing. Snarky ending tl;dr: Computers that cheat in games are better than most humans in those games.


Woo-hoo, it's arguing time!

"What I absolutely don’t expect is that there is some kind of extra thing you have to add to your code in order to make it self-aware - import neshamah.py - which will flip it from “not self-aware” to “self-aware” in some kind of important way."

I agree that it doesn't matter if the hypothetical AI is self-aware (because I am very strongly in the camp of "it'll never happen") and that the danger is letting a poorly-understood complex system have real decision-making power that will affect humans because a bunch of us decided we needed to put 1 cent increase of value on the share price of our company.

*But* the problem is, all the talk about the risks and dangers is couched in terms of self-awareness! The AI will "want" to do things because it will have "goals and aims" so we must teach it "values", otherwise it will try to deceive us as to its real intentions while secretly manipulating its environment to gain more power and resources. Just like Sylvester Sneekly in "The Perils of Penelope Pitstop".

So the race is on to see if we will end up with Ahura Mazda who will provide us with post-scarcity full luxury gay space communism, or Angra Mainyu who will turn us all into paperclips. In which case, unless we are all supposed to throw our digital microwaves onto the rubbish dump because eek, AI! there *must* be something which will differentiate the shiny new AI from a toaster.

I don't believe that, because I think even the shiniest, newest AI that dispenses bounty from the cornucopia of infinite cosmic resources is in essence a toaster: something we created to make tasks easier for ourselves. But since nobody seems to be prognosticating that our toasters will rise up and enslave us, then there is a "secret sauce" within AI research on danger that requires something extra added to flip it from 'makes toast' to 'runs the global economy', and that cannot be merely "more speed, more scope", because that argument is "oh sure, your kitchen toaster can handle four slices, tops, but if we had a super-duper-toaster that could make a million slices of perfectly-toasted toast in 0.2 seconds, then it could enslave us all to be nothing more than toast-butterers for all eternity!"

"You’ll be completely right, and have proven something very interesting about the deep philosophical category of swimming, and the submarine can just nuke you anyway."

Same error at work here. The *submarine* isn't nuking me, the submarine is doing nothing but floating in the water. The *humans operating the submarine* are the ones nuking me, and indeed, it's not them so much as *the big-wigs back on dry land in the Oval Office* telling them to nuke me. Without a crew to operate it and a political establishment to give orders to that crew and a geo-political set of tensions in play, a nuclear submarine packed to the gills with missiles isn't going to do a straw except be a nuisance to trawlers https://www.theguardian.com/science/2005/aug/11/thisweekssciencequestions

"Where does this place it on the “it’s just an algorithm” vs. “real intelligence” dichotomy?"

It places it on the "humans are constantly getting better at making things that seem to run independently" step of the staircase.

"If I had to answer this question, I would point to the sorts of work AI Impacts does, where they try to estimate how capable computers were in 1980, 1990, etc, draw a line to represent the speed at which computers are becoming more capable, figure out where humans are at the same metric, and check the time when that line crosses however capable you’ve decided humans are."

Yes, people are quite fond of those kinds of graphs: https://i.stack.imgur.com/XE9el.jpg

"One of the main problems AI researchers are concerned about is (essentially) debugging something that’s fighting back and trying to hide its bugs / stay buggy. Even existing AIs already do this occasionally - Victoria Krakovna’s list of AI specification gaming examples describes an AI that learned to recognize when it was in a testing environment, stayed on its best behavior in the sandbox, and then went back to being buggy once you started trying to use it for real. This isn’t a hypothetical example - it’s something that really happened. But it happened in a lab that was poking and prodding at toy AIs. In that context it’s cute and you get a neat paper out of it. It’s less fun when you’re talking twenty years from now and the AI involved is as smart as you are and handling some sort of weaponry system (or even if it’s making parole decisions!)"

Now, this *is* interesting, intriguing and informative. It's a great example of "huh, things happened we weren't expecting/didn't want to happen". Apparently animals do it too https://prokopetz.tumblr.com/search/Two%20of%20my%20favourite%20things%20about%20animal%20behaviour%20studies and if software is doing it, then I think that's a phenomenon that should definitely be studied as it might indeed have something to teach us about organic intelligence.

And if you've got a machine intelligence as smart as a mouse, I congratulate you! But I want to give a promissory kick in the pants to whatever idiots decided that AI should be handling weaponry systems without human oversight. And *that*, once again, is the real problem: we don't need Paperclip AI when what is more likely to happen is a bunch of drones run by a system where we've cut out the human operators (on grounds of reducing costs, but it'll be sold as 'more efficient' and 'more humane - less human error so innocent sheepherders get drone-bombed into obliteration').

What will happen there is that some glitch, twitch or bad interpretation will mean that some poor bastard in the valleys of Afghanistan who has been forced at gunpoint to harvest opium poppies for the Taliban is identified by the spy satellites as 'likely insurgent', and the brainless automatic machine sends out a drone swarm to obliterate him, the poppy fields, and the nearest five villages.

That needs no intentionality or purpose or awareness on the part of the machine system, it just needs human cupidity, sloth and negligence because we would rather have our machine-slaves think for us than use our own brains and incur responsibility.

AI will be more like the marching brooms and buckets in "The Sorcerer's Apprentice" than a Fairy Godmother or Demon King from a pantomime: mindless servitude that is over-literal and over-responsive to our careless requests.

"whereas we ourselves are clearly most like Byzantines worried about Ottoman cannons"

The Byzantines had a clear reason to be worried about Ottoman cannons because the Ottomans were marching the cannons right up to their front doorsteps. Right now, where are the cannons? Unless you can identify and locate those, worrying about far-range AI is rather like "by the year 2021 the entire globe will be under the sway of the Sublime Porte!"


"Non-self-aware computers can beat humans at Chess, Go, and Starcraft. They can write decent essays and paint good art."

I dispute this as well. AI-generated content is already being used by media, and the examples I've seen are awful:


Take this sample of filler that small online operations use from Ireland to India - is this any kind of comprehensible article? Does it tell you anything? Look at the formatting and layout - this is lowest quality "take a press release, maximise it for clicks by extracting a headline to get people's attention, then produce X column inches of crap":


It's allegedly written by a human, "Jake Pearson", but if a real human was involved he should be thrown out a window.

If AI at present does produce passable art or essays, it's down to human pattern-matching. Out of five hundred attempts, we pick the ten or three or one that fits our interpretation of "this can be taken as art/this works as an essay". We refine the software to better cut'n'paste from the training data it is given, so that it can more smoothly select keywords and match chunks of text together. It's the equivalent of the Chinese assembly-line "paintings" produced for sale in furniture stores as "something to stick up on the wall if you're old-fashioned enough to still think pictures are part of a room". It even has a name, the "wall decor market", and you can find a product to suit all 'tastes', even if you've moved on from Granny's notion of 'hay cart painted beside a stream under trees' to 'gold-painted sticks in a geometric shape'.


You can even cater to the snobbier types who would laugh at the fans of Thomas Kinkade but have no idea that artists are commissioned to churn out 'product' for their environments:


There's a real niche there for AI-produced art: cut out the human third-tier American and European artists and the commissions from 'developing world' factories, replace them with AI-produced boxes, and who could ever tell the difference? Because it is product, not art, done to a formula just like tins of beans.


"Like I wish I had a help desk for English questions where the answers were good and not people posturing to look good to other people on the English Stack Exchange, for example. I would pay them per call or per minute or whatever. Totally unexplored market AFAIK because technology hasn’t been developed yet."

I don't know if this exists for English questions, but it does exist for physics questions:



Note that while these are all attempts to argue that Acemoglu is right, none of them even try to show that Acemoglu's actual article presented an actual argument for what he asserts. They're all just arguments people think he could have presented, but he didn't present any of them; Scott's central charge that he wrote an article in a major paper saying that long-term AI risk was bunk while presenting not the slightest shadow of an argument for that assertion stands.


"There is no path from a pile of rocks to a modern skyscraper"

Is no one going to mention the Tower of Babel?


Just because AlphaGo can play multiple board games doesn't mean it can design a new Terminator, or enable an existing humanoid robot to walk down a circular staircase.

I agree that "self awareness" is a stupid characterization of "true AI", but we don't have true AI by any stretch of imagination.

What we have are collections of inflexible algorithms and machine learning routines - some slightly more and most less flexible, which can execute what a fairly stupid human can do, but at a greater scale and speed.

The problem is that none of these algorithms can do anything outside of a very narrow window of highly constrained activities. AlphaGo can't beat Jeopardy champions, and Watson, the Jeopardy champion, can't even parse Twitter and web site pages with anything remotely resembling human comprehension. Nor is an "AI" being able to beat video games particularly impressive, since video games are artificially constrained environments to start with.

Machine vision continues to cough up ridiculously idiotic characterizations; it is increasingly clear that these limitations are fundamental to neural network training: i.e. even if better data could be found, we would not know it had worked until years had gone by without another failure surfacing. So true machine vision may be an open-ended problem like fusion: 10 years away for the last 60 years.

Note that all of the above are virtual/information/data related. If "AI" can't even perform moderately well in the realm of pure ideas, its even worse failures in the realm of actual reality show just how far it has to go and how little it can actually do.

Returning to Terminator: Skynet would never happen because, were it one of the AIs of today and not a doomporn fantasy construct, it would never make that intuitive leap to equating accomplishment of its unknown mission - some type of defense app - with extermination of the human race.


"Victoria Krakovna’s list of AI specification gaming examples describes an AI that learned to recognize when it was in a testing environment, stayed on its best behavior in the sandbox, and then went back to being buggy once you started trying to use it for real."

That scares me in a way that merely understanding the concept of the treacherous turn didn't -- that it's been demonstrated in the lab. I'm pulling up the original study now.

I'm not sure what, if anything, I'm going to do about that, but, well, it's there.


Semi-meta comment. I see the same names taking the "AI fears overblown" side of the argument in the comments that I've seen making the same points for several years, never updating their positions in anything other than a "Well, nevertheless, I'm still right because - " fashion. It's classic bottom-line-first rationalization, and it's exhausting.

I know that it is bad manners to psychoanalyze people who disagree with me, but I feel that I am forced at this point to suggest that a large number of people are simply refusing to seriously consider the case for AI risk at all.

Motivated reasoning is common. If someone that I trust suggests to me that I am arguing from a point of motivated reasoning, I will take their assertion seriously and challenge myself to consider counter-arguments more thoroughly. I consider it a favor. So I am trying to do everyone a favor by suggesting that maybe they should suspend the voice in their head that talks over Scott's arguments with their own, and instead actually take them seriously.


Honestly, most of the arguments *against* AGI being a huge thing to worry about are pretty bad, and Scott's counterarguments here are all pretty obvious. I'm not really sure what I was thinking back when I thought "AGI is no threat at all, Yudkowsky et al. are dangerous whackjob fools who should respect the *actual* experts & authorities" a couple of years ago... My reasoning, in retrospect, was very unsound.

As best as I can remember, emotionally speaking, it felt like you guys were trying to take the Future away from me: the shining idea that, even though the world is full of problems and the universe is empty and I will die, Humanity will live on and improve itself and spread and last until the end. AGI being a threat to that, with no clear way out, the claim that the bright path to the Future was crumbling before us and we need to fix it right now except nobody has figured out how yet, was and is terrifying, and I grasped at any and all counterarguments to give me a reason to relax and sleep easy at night. Oops.


I think that AI fears are real, though overblown by most of those focused on AI risk. (My own views are that there is a greater than 10% chance of AGI within the next 30 years. And AGI has a very large chance of causing at least some results that we would view as negative and significant.)

The bigger issue, though, is that proposals for mitigating AI risk seem to come down to:

(i) solve the principal agent problem; and/or

(ii) create a coherent theory of ethics/morality

in a time frame of 10-20 years plus:

(iii) get everyone to agree to adopt the results of (ii).

Principal agent issues have existed at least since ancient Greece and we haven't successfully solved them for humans. We've been unsuccessfully working on a theory of ethics/morality for at least as long.

To suggest that we are going to solve these in 20 years seems to me like fantasyland.

And, to the extent we have theories of ethics and morality, I, Donald Trump, Xi Jinping, AOC and Rod Dreher don't agree on them. (These are examples; the point is that different people have very different values. Whose should the AI implement?)

This is before taking account of the existence of competitive pressure, which likely renders any solution of this nature moot in the real world.

So, it isn't that AI risk isn't real; it is that most of those focused on it seem to be engaged in forms of magical thinking, which have no chance of mitigating the problem.


With that last section, I feel like you're focusing too much on the apparent silliness of the metaphors themselves without really getting the point they're all trying to make.

Any given technology is subject to the law of diminishing returns. The simplest possible example is that, after a certain point, a rock pile becomes large enough that adding each new rock becomes immensely more difficult than the last (just play a game of Jenga if you don't believe me). Eventually you'll reach a point where it's effectively impossible to stack another rock onto the pile, and you'll reach that point long before the pile is the size of the Burj Khalifa.

As a result, the technology to build a rock pile and the technology to build the Burj Khalifa are not merely separated by a quantitative difference in degree, but by a major qualitative difference in kind. Incrementally improving your technique for stacking rocks will never get you a skyscraper, you need to find an entirely different mode of construction. The same applies if you compare cannons to nuclear weapons, or wheelbarrows to high-speed maglev trains, or abacuses to modern supercomputers.

Ironically, your fireworks example actually proves Dionysus's point, rather than refuting it: While the Byzantines were right to be worried about giant cannons, it's because smaller cannons already existed at that point in history, not because fireworks existed and thus made giant cannons a looming inevitability. The path from small cannons to giant cannons is linear, but the path from fireworks to cannons isn't; in a world where smaller cannons didn't exist, it would indeed be very silly to worry that giant cannons were a major threat because fireworks made bright lights and loud noises! (After all, cannons were invented about 1400 years after fireworks!)

The point of those metaphors is not that a general artificial super-intelligence could never exist under any circumstances. Rather, it's a critique of the idea that you can get a super-AGI simply by increasing the size, speed, and efficiency of modern algorithmic AI. If a super-AGI ever does exist, it will be something *qualitatively different* than modern AI, in the same way that the Burj Khalifa is qualitatively different than a rock pile, a nuclear weapon is qualitatively different than a cannon, and a cannon is qualitatively different than fireworks; it won't simply be a "scaled-up" version of modern AI. Building a super-AGI will require a series of entirely new scientific discoveries and technological developments, some of which will be probably in fields entirely unrelated to computer programming and AI research, and all of which will be major news in their own right, which refutes the idea that a super-AGI is something that could suddenly develop out of the blue and take us by surprise.


Great post except for the Y2K section. It made _perfect_ sense to use two digits for the year in 1970 because storage was super expensive back then. There are credible estimates that, even with all the work done in the 90s to fix the Y2K bugs, it was still cheaper to have used two-digit years.


> I hate having this discussion, because a lot of people who aren’t aware of the difference between the easy and hard problems of consciousness get really worked up about how they can definitely solve the easy problem of consciousness, and if you think about it there’s really no difference between the easy and hard problem, is there? These arguments usually end with me accusing these people of being p-zombies, which tends to make them angry

I am a p-zombie, and so are you! Qualia are almost certainly an illusion.

> Whatever you’re expecting you “need self-awareness” in order to do, I bet non-self-aware computers can do it too.

I disagree. Self-awareness almost certainly *must* provide an adaptive advantage or it's wasting precious resources for no purpose, and that would be maladaptive.

Maybe a non-self-aware computer could perform some subset of those functions, but your sentence reads as a claim of a more general ability.


The past year and a half have been a very illustrative example, for me, of what "let's wait until there's clearly a problem" looks like in practice: it's the covid control system that ignores exponential growth until we're many cycles in and the problem is visible enough for most people to see, then clamps down *just* enough to keep at steady state, because if things improve much, the system relaxes the constraints.

This is a terrible solution: extremely expensive in dollars and lives compared to proactive planning, and it never actually solves the problem by any definition of what solving the problem means. This is the case even for a type of problem humans have faced many times before, armed with better knowledge and weapons than ever before, against an enemy that has no intelligence and consistently does pretty much what basic biology and math tell us it will do. This does not bode well for AGI.


> AI is like a caveman fighting a three-headed dog in Constantinople.

Scott, a more truthful depiction is that a caveman is looking at a group of cavemen shamans performing a strange ritual. Some of the shamans think they're going to summon a Messiah that will solve all problems. Others think that there's a chance to summon a world-devouring demon, by mistake. Yet others think neither are going to happen, that the sprites and elementals that the ritual has been bringing into the world are all the ritual can do.

Stated bluntly, we laypersons are currently in the dreaded thought experiment of talking to an AGI of uncertain alignment and capabilities, the AGI being the collective activity of the AI researchers.

I understand this view is something of a cognitohazard for anyone for whom AI X-risk is not merely a fun thought experiment. I think it's acceptable to discuss publicly, because if I, an outsider, managed to apprehend the gravity of the situation, there's no way others, both outsiders and insiders, haven't done the same.


My take on this is that Yudkowsky's "Fire Alarm" argument is very strong, and I don't feel that an AI skeptic has made a complete argument unless they address it, ideally by providing a specific statement of when we should start to worry. And I haven't seen a single skeptic address it in the comments here. So I would urge AI skeptics to take the opportunity.


A better name for the McAfee fallacy would be the Cypher Fallacy.

McAfee makes me think of the antivirus software, or the guy who died.

Cypher, from The Matrix, states that "ignorance is bliss" and he intends to embrace that fact.

The "ignorance is bliss" fallacy might be clearer, but it is too long.


> The AI started out as a learning algorithm, but ended up as a Go-playing algorithm

More precisely, the output of the learning algorithm is a Go-playing algorithm.


"I think the path from here to AGI is pretty straight."

I used to believe deep learning wouldn't do anything we consider human-level intelligence but would be enough for self-driving. I don't anymore. It turns out that it's corner cases all the way down: Tesla Autopilot confusing the moon for a yellow light, Uber's AI shutting down, and Waymo cars committing a comedy of errors (that cannot be solved by more data, no matter what they say).

Elon Musk has done a 180, going from calling autonomous driving 'basically solved' to 'turns out it is very hard'.

This researcher's blog gets to the heart of the problems https://blog.piekniewski.info/

DeepMind's achievements in games are an incredible distraction. I love DeepMind, and they are attacking the problem of AI from every possible angle, including neuroscience, causality, and neurosymbolic AI. But their press releases are highly misleading. Some researchers insist on using the words "in mice" when describing a neuroscience or biology study to avoid giving a false picture. I think "in games" should be emphasized whenever such is the case in AI.

Games are an incredibly narrow domain, and provide immediate, clear and predictable feedback to every action (and hence RL does well at them). That's far from what happens in real-world situations, including driving.

The path from here to AGI involves paradigms completely different from the ones that are hot right now. And it is not straight at all.


> So I guess my real answer would be “it’s the #$@&ing prior”.

This came across as unnecessarily (and uncharacteristically!) aggressive. I'm sure I got the wrong end of the stick of what you were getting at (or maybe I'm just being a snowflake), but it kind of felt like you were swearing at a commenter and I don't think there's any need for that.


Re-the hard and easy problems of consciousness:

I understand the beef a lot of people have with the term self-awareness. It’s vague, but I feel it’s a useful term for a phenomenon that somewhat escapes us. I can’t imagine there is a person in the world who has never reflected upon themselves, or been of two minds, or any other of the million metaphors we use to describe this recursive or reflexive process that takes place in our minds and bodies. Self-awareness doesn’t seem to me like a bad description of it. It might well all be an illusion (it probably is: see Buddhist texts…)

Paradoxically, it is the easy problems of consciousness that I find hard. I have no training in any scientific discipline. However, my rational mind recognizes that these sorts of problems should ultimately have solutions, perhaps in the same way that there is a fixed number of water molecules in the oceans of the world, give or take. It is a definite quantity at any given time. Actually calculating that precisely seems like a monumental task, but it is bounded.

The hard problem of consciousness I would put in the same basket as “squaring the circle”. (Is anyone still working on that problem?)

They certainly provoke the same sensations in me, a feeling of wonder mixed with some discomfort.


If I really had to take a stab at describing self awareness in a vaguely technical way, I would describe it as the “noise” generated by having multiple algorithmic learning programs running simultaneously and sharing little bits of their output with the others.

It’s like that Johnny Cash song, “one piece at a time”


I want to point out that we didn't need to wait for AI to have bugs that detect and hide when they're being investigated. Hackers have had a word for them since forever (well, the 1990s): heisenbug.


It is not directly a reply to the arguments here, but I think it is somewhat relevant. Maybe it counts for the “we don't know yet what we don't know yet” argument:

Remember the public who were fooled by the Mechanical Turk, the famous fake chess-playing automaton in the 18th century. If we had asked them: “What is the hard thing in designing a real chess-playing automaton?”, they would have talked about chess.

Well, in the early 1990s, I played and lost against Chessmaster 2000 on the Atari ST. I suck at chess, but I read it would have beaten most people, even decent players. A game on a single double-density floppy disk, running on an 8 MHz 68000 with 512 KB of RAM. And it could talk. YOUR MOVE.

The public in the 18th century would not have realized that much harder than playing chess is grabbing the physical pieces, and that the hardest of all is to analyze a video feed to see what the opponent played.

Of course, there is an XKCD for that: https://xkcd.com/1425/


AIs don't have a physical world they need to navigate and model and interact with yet. Hence, no need for a fancy navel-gazing nav system and no need for consciousness.


If everything goes wrong, the army can always call in airstrikes on the relevant facilities. Mankind has a lot of guns on its side too.


I think both sides of this debate are missing the point, and that makes me sad. We’re busy arguing over arcane talking points while the substance is lurking in the shadows.

Specifically, the crux of the debate today (on both sides) seems to anthropomorphize AIs in a way that is taken for granted, yet actually not at all clear to be coming true. For example, everyone seems to assume that of course AIs have (or will someday come to possess) an egoistic will to survive on their own. Organic entities seem to have this survival instinct, through some emergent property encoded in nucleic acids and proteins.

I think it’s reasonable to assume by default that some analogous emergent property will emerge in learning algorithms - that they will one day become “digital life forms” rather than just “digital tools.” But I wish there were more discussion about when and how and where that might happen. Because from where I sit, that is most likely to be the threshold condition for some future entity (new type of life form) that might actually represent something that today’s commentators would recognize as AGI.

Critically, it seems to me that this boundary condition is distinct from task mastery - even generalized task mastery. I get that DeepMind’s AIs may someday (soon?) be able not just to beat humans at Go, but also to solve the protein code, write novels, etc. But while “being good at lots of intelligence-related stuff” strikes me as a good way to define “AGI”, it also fails to justify our existential fears about AGI, because humans are successful as a species not just because of our intelligence. It’s also because of our unique blend of social behaviors (which Nicholas Christakis calls the “social suite”). And, importantly, I’d say it’s because of the interplay between our unique (or actually, maybe not so unique?) brand of human intelligence and its ability to serve our deep desires.

In this way, the more dangerous possibilities of AGI appear to me to be the unpredictable ways in which it might be called on to serve the human limbic system, much as our “human” intelligence already does. Thus, the most likely “villains” in an AGI scenario, I believe, are other humans. We people, with our egos and desires and social vanity, might enlist a general-purpose digital-based “intelligence” facility to further our own ends - with consequences as unforeseen and potentially devastating as those of nuclear weapons, etc.

It is certainly possible that algorithms will eventually morph from being an ego-less general intelligence to being a full-blown digital life form with its own agenda, needs, wants, desires, and social habits. But until that happens, we should focus our fears on what nasty things AGI might do in unscrupulous (and/or naive) human hands. Assuming that AGI is indeed a very powerful force, one might argue that ANY human hands would effectively be naive. That strikes me as a winning argument.

What does NOT strike me as a winning argument is the idea that AGI will inevitably come to challenge the human race for some perceived dominion over earthly affairs. The only thing that makes this future scenario appear like the default future is the human-centric egos of the commentators pushing those ideas.

That’s not to say that it’s beyond the realm of possibility that AGIs will one day become sentient and furthermore develop their own egos and “limbic systems.” But this would constitute an emergent property that seems far from inexorable - especially when considered on a “by the end of this century” time frame. And, either way, if we’re so afraid of some AGI wiping out all of humankind, why is no one talking about this? Instead we are stuck debating arcane points that I believe are largely immaterial to any real set of “existential threats” likely to emerge from the evolution (human-guided or otherwise) of today’s AI technologies.


This whole discussion highlights why I hate reasoning by analogy. The things have to actually be analogous in the relevant way for it to work; otherwise it's just a word game. So if you want productive convos about AI risk, I propose banning analogies unless they have met some standard for carefully laying out WHY the things are analogous. So "AI is like the human brain because xyz" might be ok, but "worrying about AGI is like Byzantines worrying about nukes" is not, unless you've first established e.g. that gunpowder weapons were improving at a Moore's law-like rate at the time and that this would have made city-flattening weapons possible within a human lifetime if improvement continued at that rate.

Let's try some non-analogy reasoning that's accessible to non-experts, such as myself. I don't really trust the experts' judgments on probability because they're too close to the issue, and I don't trust my own because what do I know. But forecasts from smart people who aren't necessarily experts sound promising. This is the first time I've seen the Metaculus numbers, and they made me significantly more worried about the issue. I've always suspected the polls of experts were overestimating the risk because of experts' inherent biases, so I've never tried to calculate an expected value number for it before. And according to Metaculus they actually were, but by a factor of 2-3, not 100. Multiplying Metaculus's estimate of a 22% chance of a catastrophe that kills at least 10% of world population, by a 23% chance that any such catastrophe would be caused by AI, by the 60% chance that any AI catastrophe would reduce world population by at least 95%, gives a greater than 3% chance, in the next 80 years, of a catastrophe worse than any in history, which is phenomenally high for something of that magnitude. EV on that is ~230 million-300 million lives, not even counting lost future value if civilization or the species is wiped out.
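For anyone who wants to check the arithmetic in the comment above, here is a minimal sketch in Python. The three probabilities are the ones quoted in the comment; the world population figures (roughly 7.8 billion today, 10 billion as a round projection) are assumptions added here for illustration and are not from the comment, and the function names are made up. It is not necessarily how the commenter arrived at the 230-300 million range.

```python
# Probabilities as quoted in the comment (Metaculus figures as reported there).
P_CATASTROPHE = 0.22      # catastrophe killing at least 10% of world population
P_CAUSED_BY_AI = 0.23     # given such a catastrophe, it is caused by AI
P_NEAR_EXTINCTION = 0.60  # given an AI catastrophe, population falls by >= 95%


def joint_probability(*probs):
    """Multiply a chain of conditional probabilities together."""
    result = 1.0
    for p in probs:
        result *= p
    return result


def expected_lives_lost(probability, population, fraction_killed=0.95):
    """Expected deaths: chance of the event times the number of people it kills."""
    return probability * population * fraction_killed


p_total = joint_probability(P_CATASTROPHE, P_CAUSED_BY_AI, P_NEAR_EXTINCTION)
print(f"Chance of a >=95% AI catastrophe: {p_total:.2%}")

# ASSUMPTION: population figures are illustrative, not taken from the comment.
for population in (7.8e9, 10e9):
    lives = expected_lives_lost(p_total, population)
    print(f"Population {population:.1e}: about {lives / 1e6:.0f} million expected lives")
```

The product comes to a little over 3%, and with these assumed populations the expected-value figure lands in the low-to-high hundreds of millions, the same ballpark as the range quoted above. Treating 95% as the exact fraction killed is a lower bound, since the estimate is for "at least 95%".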
