Nothing will actually be slowed down; the conversation is meant to raise fear, as a campaign to increase the rate at which regulations might come into effect over the next 5-10 years. It is hard to say whether that awareness-raising effort is succeeding. In our weird corner of the internet, AI safety is obvious and long discussed...but we have yet to really come to any consensus on even a type of approach to take. So I think this is a conversation meant to spark regulatory action, and all of it will be ineffectual.
Just like with dumping chemicals and birth defects and rare cancers...it is only much later, after people are harmed, that we do something about it, and only some of the time. Regulators will never get ahead of the harms and innovators will never slow down. We are still sick and dying from the Green Revolution as we sleep on neurotoxin-filled pillows mandated by law for fire codes or some nonsense. So I wouldn't count on anything stopping, or on regulations leading to anything meaningful, until we see very large-scale harms.
In reality I think the true response to AI will be a UBI to prevent a mass revolt in around 10 years.
I have not been following AI news all that closely, but it kind of seems like the hype is dying down a little? Like, three months ago everyone was acting like AI was going to take over the world soon, and now the attitude seems to be it's a kind of cool autocomplete.
For whatever it's worth, when I've used ChatGPT and Bing's AI chat it falls into the category of "kind of cool, but definitely not going to take anyone's job anytime soon."
My company has been applying for a lot of government grants over the last couple of months. It has already effectively taken somebody's job, because we didn't need to hire the extra writers and researchers for these long concept papers that we previously would have. It has both increased our reach and decreased our cost. Now, this is a zero-sum situation (competing for grants), but we are also applying it to line-of-business application programming, which is not zero-sum.
I think if GPT enables a 50% increase in programming productivity, demand for programming applications will increase by more than the reduction in cost. Programming jobs may increase, possibly along with employment in the newly productive areas that benefit from newly affordable programming services.
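To put rough numbers on "demand will increase by more than the reduction in cost" (the elasticity figure below is an assumption for illustration, not data):

```latex
% Productivity up 50% => price per unit of programming output falls to 2/3.
\text{price ratio} = \tfrac{1}{1.5} \approx 0.67, \qquad
\text{quantity ratio} = 0.67^{-\varepsilon}
% With an assumed price elasticity of demand \varepsilon = 1.5:
% quantity \approx 0.67^{-1.5} \approx 1.84, and programmer-hours
% \approx 1.84 / 1.5 \approx 1.22, i.e. ~22% more hours despite the gain.
% With \varepsilon < 1 the same arithmetic cuts hours instead.
```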
Two Seattle studies of the city's last two minimum wage increases found that, with the increase in the minimum, the total wages paid to minimum-wage earners decreased due to reduced hours and lost jobs. Therefore, if you decrease the minimum wage, it's possible to increase the aggregate wages paid to minimum-wage earners.
If the same holds for a decrease in the wages paid to programmers per unit of productivity, the increased demand for the now cheaper service may increase aggregate wages and employment for programmers, as well as promote economic growth among the recipients of the additional programming services.
I provided an example where a few people's jobs were taken (while ALSO increasing productivity).
That's not an argument about the economy on average, i.e. that while some people's roles were allocated differently, the pie as a whole stayed the same or grew. I'm making no gross statements like that.
I mean person A said "I'm not seeing changes" and person B is responding "Here are some concrete changes".
Yeah, I read a comment by someone who said they went from five writers to one plus GPT for a niche magazine. I didn't expect that big a difference to be possible, although maybe it isn't working out, or they had to hire someone back. Still, it appears to show dramatic potential for individual job losses.
It's hard to say this increases productivity. It might, but really you're just talking about reducing some of the overhead involved in applying for something, which could have been done at any time by the government funding agencies you're applying to deciding to select grant recipients randomly instead of based on the quality of their applications. Unless your company is the only company that has figured out ChatGPT can fill out funding applications, you've effectively put these funding agencies into that position anyway. They no longer act as a meaningful discriminator. So they'll either come up with something else, or if we charitably assume there was some kind of good reason there was an application process in the first place, there's a tradeoff here in that whatever gain was considered worthwhile to have a competitive application process is now lost.
Plus, at least right now, assuming there is no conceivable way OpenAI can be profitable this early in the R&D phase of its business cycle, replacing human staff with ChatGPT just means OpenAI's investors are paying for your applications instead of you having to pay for them. Whether or not this represents a net economy-wide productivity gain anyway depends on your gains versus OpenAI's losses, neither of which is published. For what it's worth, I think it very likely still is, but we can't know.
I think you missed the bit where I said the applications were zero-sum but the line of business software wasn't. And the bit where I said I made no statement about overall US economy effects.
But I will ask this: you say "...assuming there is no conceivable way OpenAI can be profitable this early in the R&D phase of its business cycle..." are you including capex in this equation or just operating costs?
"In two Seattle studies of minimum wage regarding their last two increases they found that with the increase in the minimum the total amount of wages paid to minimum earners decreased due to reduced hours and lost jobs. Therefore if you decrease the minimum wage it's possible to increase the aggregate wages paid to minimum earners"
This does not follow. If you increase the minimum, employers stop hiring and do without. If you then decrease the minimum, they will still have learned to do without.
More robustly: if you freeze the minimum and commit to delivering future raises in the effective minimum wage through earned income tax credits for the foreseeable future, the effect will be an increase in minimum-wage hiring. Where economists generally think increasing minimum wages causes lost jobs in the short term, the theory is that it has an especially large effect on future hiring.
That may be the better analogy for increased productivity among programmers and other sectors, where future hires are now expected to be more productive and cheaper per unit of production.
That would make sense - "not hiring" avoids a lot of the stigma an employer would get from firing people.
At least until you get e.g. youth unemployment rates of 50% or so, and even then it's a generalized "no one hires youths" instead of "Company <X> fired all its youth employees".
>I think if GPT enables a 50% increase in programming productivity the increase in demand for programming applications will increase by more than the reduction in cost.
As someone in the industry who actively uses both ChatGPT and Copilot (the two major LLM coding-productivity tools), I'm not sure this follows. GPT is good at two things: writing mediocre code, and helping competent developers write more code per unit time (at the expense of having to proofread the mediocre code, which is still usually a very favorable trade-off).
The problem with the software dev market is that it's almost impossible to find a job as an entry-level programmer, because there's *already* an overwhelming glut of mediocre programmers on the market. But (somewhat paradoxically) it's also pretty difficult for anyone looking to hire to find competent programmers!
So taking that into account, in a market with a glut of mediocre coders and a shortage of competent ones, if you introduce a machine that can only produce mediocre code or reduce the demand on competent coders, the market may actually shrink!
Interesting! So in isolation the programmer job market should shrink for mediocre programmers until GPT can assist them into competency, at which point it may be able to simply replace jobs and outpace the increase in demand for the service.
It's naturally hard to imagine this being a benefit for overall employment, although I always think of energy, where greater production per unit of labour and capital employed reliably enables jobs in the wider economy.
>I have not been following AI news all that closely, but it kind of seems like the hype is dying down a little?
It's not just you. Look up "ChatGPT" or "GPT" or "GPT4" on Google Trends. The new toy shine is gone.
GPT is great and obviously lots of people still use it, but the hype bubble has deflated a bit.
I think people are waking up to the limitations of chatbots: they suck for a lot of stuff (even when the underlying AI is smart). The quality of your output is dependent on how well you can phrase a prompt (or worse, grok some abstruse "Midjourney-ese" that's impossible for a newcomer to intuitively understand). They're not always the best tool.
I was scared we'd move to a paradigm where EVERY website and app requires you to use a chatbot, regardless of whether it makes sense. I saw a music generation app that let you create simple music via a chatbot. Like, if you wanted the drums in your song quieter, you'd type "make the drums quieter". How is that not ten times harder than adjusting volume via a fader? I'm glad we dodged a bullet there.
Was this a generation app that was using one of the recent big music-generating models on the backend, where the audio composition was being done inside the inscrutable matrices? Or was it a generation app that was closer to traditional generation where there were multiple channels and such that were legible to the software?
>Like, three months ago everyone was acting like AI was going to take over the world soon, and now the attitude seems to be it's a kind of cool autocomplete.
Sort of how the average Englishman felt between the declaration of war and the beginning of the Battle of Britain.
I don't think I'm getting too far out over my skis to say that we're full speed ahead on AI at Wolfram and trying to figure out how what we have built fits into this rapidly changing situation. GPT-4 has proven itself capable of writing some surprisingly sophisticated code.
AI xrisk was always about the future, even if it's a close one. If there's a change in tone, I would argue that mostly reflects the internet's attention span. There are still people on both sides of the issue.
If this is a representative summary of Asterisk's articles, it seems... very one-note. Are there any online rationalist/effective altruist hangout spots that aren't consumed with AI-related discussion?
"We should, as a civilization, operate under the assumption that transformative AI will arrive 10-40 years from now, with a wide range for error in either direction."
I feel like it's hard not to read this closing statement as a slightly more nuanced and somewhat less efficient way to say "we don't know." I don't mean that flippantly; I think there's a lot of value to thinking through what we do know and where our real confidence level should be. However, what really is the practical difference between 10-40 years with significant error bars and we just don't really know?
I think there's a certain quality to saying "we don't know" that people reflexively interpret as saying "no".
For example: "We don't know if within our lifetime aliens will enslave us to defend themselves against another alien species."
Compared to 1960s us "We don't know if within our lifetime modern civilization will end in nuclear flames."
Both are/were true statements. 10-40 years with significant error bars communicates something closer to the latter than the former. Whereas if you replace it with "We don't know when transformative AI will arrive" it's closer to aliens than I think the author intended. (insert History channel ancient aliens meme)
There's a difference between "we don't know" spoken with a careless shrug and "we don't know" spoken with a concern-furrowed brow.
I don't think many would have predicted a 10-40 year timeline in 1980. We're still uncertain, but there's a new kid in town (the transformer) that may have changed the game.
Popular audiences in the 80s apparently had no trouble accepting sentient AI by 2019 (Blade Runner) or 1997 (Terminator) as a plausible story. Now the tone here seems to be that the general public is too dismissive of the possibility of imminent AGI, and I think that's accurate: the comments on ACX make me want to shake my head at the hysterical fear, while people in real life telling me AI risk is something they've "just never thought about" make me want to scream in hysterical fear myself.
The people I really still don't get are the people who are like "within 5 years AI will destroy us all", but work making interfaces for General Mills or something.
Why the fuck aren't you taking immediate drastic and illegal action to stop it then? If you really actually believe that, it would seem that would be the urgent mandate.
I can't grok the selfishness of someone who truly believes "the world will end in 5 years without intervention, but I am just going to keep attending my 9-5 and playing videogames".
This is not an uncommon sentiment but it doesn't really make sense. There isn't, so far as I'm aware, any known drastic illegal action you could currently take that would improve the situation in expectation. Drastic and illegal actions also have massive downsides including impairing humanity's ability to co-ordinate on any current or future interventions that likely have a better chance of success than some panic driven murder spree.
I just think this is pretty silly. People/corporations are skittish. A concerted campaign of terrorism could absolutely work at least in the short term.
And if you are really thinking there is a 95% chance of humanity being destroyed, the extreme short term is pretty much all you should care about.
Come on really? Think about it. Would a campaign of terrorism actually convince you to take the threat of AI seriously? Or would you just chalk it up as yet more evidence in favor of doomers being a bunch of crazy whackos. Except now they would also be extremely dangerous terrorists to be put down with extreme prejudice.
FWIW, my wild guess is that AGI is more likely than not to be an extinction event for humanity (probably a slower one than EY expects), but I'm on the sidelines. Amongst other things, I'm a 64 year old retiree, and everyone that I personally care about is 60 or older. I find AI interesting, but I don't lose sleep over it.
I don't think it's going to end in 5 yrs., but I am seriously concerned about ASI wrecking the world in various ways. I have in fact suggested some actions to take which, if not illegal, are certainly sleazy. Nobody shows any interest or suggests other actions they think would be more effective.
It's a hard problem. Monitoring large training runs and any other large AI projects might buy some time, but ultimately we either have to solve the control problem for superhumanly capable agents or just hope that whatever we build happens to be a big fan of human flourishing.
Yes, I think it's unstoppable. There are too many people deeply invested in it, too many possible benefits, and too much money to be made.
Another suggestion I often make is that the people working on alignment cast a bigger net in the search for options. I'm sure there are many versions of "alignment" on the table already, but I still get the feeling, as a person who does not work in tech but who reads about it, that there isn't a sufficiently broad assortment of solutions being considered. All seem like some variant of "implant a rule that the AI can't break and can't remove from its innards." It all sounds like a school principal coming up with a very limited selection of ideas for how to reduce fights when classes change: Install cameras and give the bad actors a medium-sized punishment. Have teacher monitors instead of cameras, and give the bad actors big punishments. Install cameras and give the good kids small rewards. Have teacher monitors and give the good kids medium-sized rewards and make the bad kids write apologies . . .
The world is full of models of weaker and dumber things that are protected from stronger and smarter things: The young of most mammals are protected from harm by their parents by wiring that's way deeper than a rule, and also by the parent's knowledge that their kids are the bits of them they hope will live on after the parent's death. Hippos tolerate little birds on their back because the birds remove the hippo equivalent of fleas. Butterflies disguise themselves to look like a kind of butterfly that's toxic to birds; chain saws have deadman switches.
It's not that I think I can supply the clever model for keeping us safe from ASI that the AI blokes haven't thought of -- it's that I think crowd-sourcing the problem likely could. Just as some people are superforecasters, some people are extraordinarily good at thinking around corners. I'm a psychologist, and I can tell you that you can actually test for the ability to come up with novel solutions. I understand why people recoil from my sleaze proposal, and I am very unenthusiastic about it myself. But I can't get anyone to take the crowdsource-it idea seriously either, and it has virtually no downside. Have some contests, for god's sake! Or test people first, and use those who test high on inventiveness as your crowd.
And I don't think it's necessary for people to understand the tech to understand the problem well enough to brainstorm approaches.
No, this really doesn't make sense. It's a perfectly reasonable position to believe both that "within 5 years AI will destroy us all" and "there's nothing I personally can do to meaningfully reduce the risk of this". Sure, if *enough* people took action we could prevent it, and this would clearly be a net win for people as a whole, but obviously I only care about myself, not humanity as a whole, and ruining the next 5 years of my life to reduce the risk of extermination by 0.0001% is a terrible trade-off.
Sorry, to be clear, I don't personally actually think AI is nearly certain to kill us all in 5 years (I'm fairly pessimistic by most people's standards, I think, but definitely nowhere near that pessimistic). I'm just trying to explain why holding that view doesn't mandate taking extreme actions.
I've written on ACX before about timelines that happen to match life milestones of the people making the prediction. 20-30 years often takes the predictor to retirement or a little beyond. It's like admitting that they themselves will not complete the research, but that they feel it's pretty close otherwise. Then in 10 years the same 20-30 years might get trotted out as if no progress was made at all. Fusion has done this for a long time, and even though we get hype cycles and more specific information about what's being done to make fusion a reality, that prediction horizon never really seems to move. I'm old enough to have seen 25 year predictions on fusion 25 years ago.
Five year prediction cycles are a lot more meaningful, but they also (should) put the predictor's reputation on the line. Saying something specific will happen in a small number of years where the expectation that the predictor and the person hearing it will both be around to see the results is a much stronger claim. I'm seeing a lot more people in the long range camp giving themselves an out than the short term camp. If we don't have transformative AGI in the next five, maybe ten, years, then EY should take a reputational hit. If everyone else backs off of such a short timeframe for fear of the same, then I take it that AGI is much further off than most people are willing to admit.
>Five year prediction cycles are a lot more meaningful, but they also (should) put the predictor's reputation on the line. Saying something specific will happen in a small number of years where the expectation that the predictor and the person hearing it will both be around to see the results is a much stronger claim.
Good point. Even with short prediction cycles, very few predictors actually get a reputation hit from a wrong prediction. The nuclear weapons prediction article in the issue, https://asteriskmag.com/issues/03/how-long-until-armageddon , was fascinating (and, in terms of assessing the value of predictions by even the best-informed people, depressing). As the author, Michael Gordin, points out:
"The Americans were concerned with only one adversary and the development of a single technology which they already knew all the details of. American predictions of Soviet proliferation offer a highly constrained case study for those who would undertake technological forecasting today."
> I don’t think they emphasized enough the claim that the natural trajectory of growth is a hyperbola reaching infinity in the 2020s, we’ve only deviated from that natural curve since ~1960 or so, and that we’re just debating whether AI will restore the natural curve rather than whether it will do some bizarre unprecedented thing that we should have a high prior against.
Yeah, that's not how fitting curves to data works.
I read it back when you posted it. Just skimmed it again to refresh. I agree with the possibility (in the abstract) that creating a new kind of worker (AI) that reproduces using different resources than humans can change the trajectory of history. That's the central claim of that post, correct?
But at the same time, I am very very skeptical of any kind of "fitting curves to data to predict the future". The first step needs to be to have a complete and coherent model that captures all the major dynamics properly.
(As an example, consider covid. I don't believe that covid grows exponentially because someone fitted a curve. I believe it because we know how viruses behave. Under mild assumptions (even population mixing), our models make perfect physical sense. The curve fitting part is only to find out empirical things like the R_0 of a specific pathogen.)
Anyways, I just don't see it with the hyperbolic growth model presented there. It abstracts away too much, modeling "progress" as if it was a smooth mathematical quantity. Even if this is a good model for some parts of history (low tech level), it doesn't follow that it holds for all of history. I am especially skeptical *because these models don't have a coherent mechanistic explanation for what they're modelling*.
(As another aside, I think economics and game theory have allowed us to understand human systems a lot better. They're almost like laws of nature, e.g. we can confidently say that just giving money to people doesn't actually make us richer, and will just cause price inflation (maybe that's a bad example, I hope you get what I mean). But these "laws" have only been tested on small timescales, closed systems, and so on. We are extremely far away from a "universal theory of economics". At best, we are at the point equivalent to the early stages of classical Newtonian physics.)
The model actually has a very clear meaning. Economic growth is a function of population and innovation, innovation is a function of population (and, in some versions, the amount of innovation so far), and population growth is a function of economic growth.
There is definitely something unphysical that happens near that singularity - I think it only goes there if we assume that people can contribute to innovation within epsilon time of being born, and innovation can contribute to wealth within epsilon time, and wealth can contribute to population growth within epsilon time.
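A toy reduction of those feedbacks (my simplification for illustration, not the exact published model): if innovation grows in proportion to population times existing innovation, and population tracks wealth, which tracks innovation, everything collapses to one variable with a quadratic growth law:

```latex
\dot{A} \propto P \cdot A, \quad P \propto A
\;\Rightarrow\; \dot{A} = c\,A^{2}
\;\Rightarrow\; A(t) = \frac{1}{c\,(t^{*} - t)}
```

which diverges at a finite time t*, the "singularity." Break any one of the feedback links (say, population no longer rising with wealth after the demographic transition) and the blow-up goes away, which is the post-1960 deviation being discussed.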
I liked the article, but I thought the trend line was a function of the interval chosen rather than it being part of a "natural" path.
There are 2 outlier data points at the beginning of history, low quality of course.
Then a period until AD 1000 with no clear trend. (Though that's also low quality.)
Then the actual trend, identified over a period that includes European industrialization (starting in 1500, not just the "industrial revolution"), with some noise in the periods just before and just after.
Then a flatline since 1960.
Data that shows the industrial revolution was a very important change in human development gets reframed as "the singularity should have happened in 2020."
The data for the far past is basically unreliable. I wouldn't put much faith in anything prior to 1500.
And the data for the recent past up through 1960 is basically capturing how incredibly amazing and useful using fossil fuels to drive your economy is. By 1960 the tapering of pop growth and the full saturation of fossil fuels into the economy leveled off growth, and it will probably stay leveled off until we make some big strides in energy technology.
Everything in the economy comes back to energy.
So there was a long period of "slowish growth", and then a period of super rapid growth from utilization of fossil fuels.
But like bitcoin or anything else, you can always slap on some exciting looking "singularity curves" onto anything which is growing quickly. Doesn't make it true.
Even though the data prior to 1500 is unreliable, it's clear that the rate of change has accelerated over the last 12,000 years, and over the last 100,000 years too. Overwhelming historical evidence indicates that technological progress was very slow in ancient times.
We also know that the economy couldn't have been growing at an average rate of, say, 1% per year since 10,000 BC, because that would imply that the economy grew 10^43-fold during that time period. We can use a similar argument against an average of 0.1% yearly growth over the last 100,000 years.
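A quick check of that arithmetic (treating the periods as roughly 10,000 and 100,000 years):

```latex
1.01^{10\,000} = e^{10\,000 \ln 1.01} \approx e^{99.5} \approx 10^{43}, \qquad
1.001^{100\,000} = e^{100\,000 \ln 1.001} \approx e^{99.9} \approx 10^{43}
```

both of which are vastly larger than any plausible ratio of today's economy to the economy of 10,000 BC.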
Given the acceleration in the measured data from 1500-1950, we have good evidence that the long-term trend has been super-exponential over the course of human history.
I think what isn’t clear is if there were other little mini ramp ups which were aborted.
Yeah, the long-term trend has been "super exponential", but that isn't really much evidence that the underlying relationship actually is. Almost anything where there has been a ton of growth at some point will look like it *could* be exponential. But that doesn't mean it is.
Take a child's skill at many things, for example. My 9 year old was terrible at long division for his entire life, until he mastered it in weeks. In a few more weeks will he be the world's best long divider? No.
On a related note, there are former skills which have atrophied or pretty much vanished. For example, no one alive today could manually make a flint axe head half as well as an expert in neolithic times, and nobody could coordinate a fleet of square rigged ships at sea with just flag signals nor probably, come to that, with radios!
> the natural trajectory of growth is a hyperbola reaching infinity in the 2030s, we’ve only deviated from that natural curve since ~1960 or so
This is a very oddly stated claim. Most apparent exponential curves are actually S-curves that haven't yet hit the "S." My null hypothesis is that this is another example of such, not that there's some "natural" trajectory from which we've deviated. Why would one believe the latter?
The claim isn't that it's apparently exponential, the claim is that it's apparently hyperbolic.
I think that matters because yes, once it got too high, it would reach some limit and level out. But before it did that, it would be going shockingly fast/high by the standards of everything that happened before.
If you're driving/accelerating along the following trajectory:
1:00 PM: You're going 1 mph
2:00 PM: You're going 2 mph
3:00 PM: You're going 3.1 mph
4:00 PM: You're going at the speed of light
5:00 PM: You're still going at the speed of light
...then technically this follows an S-curve, but that doesn't mean something very exciting didn't happen around 3:30.
So, in terms of this metaphor, sometime around 1960, we got to close to the speed of light?
Sorry, I don't think that's a helpful analytic framework. I think something far more mundane happened: in the Johnson administration, and increasingly thereafter, elites in Western society began to broadly and strongly prioritize social goals over technological ones. As a result, the space program was mostly killed, deployment of nuclear fission power was stopped dead in its tracks, research into nuclear fusion and better nuclear fission was defunded, etc. But no mysterious analog of hitting the speed of light.
No, we were on a hyperbolic trajectory until the 1960s that would have looked boring until about the 2020s and then given us a singularity that shot us to the speed of light. In the 1960s we deviated from that trajectory in a way that was hard to notice at the time but suggests it's no longer hyperbolic and won't shoot up in the 2020s, though that could change. Again, see the link at https://slatestarcodex.com/2019/04/22/1960-the-year-the-singularity-was-cancelled/
You should read the models. They say that population growth rate depends on wealth, that wealth growth rate depends on population and on innovation, that innovation growth rate depends on population and how much innovation has already happened. There are some natural places to put a factor that would turn, for instance, population growth into an s where it starts to hit up against physical limits - though the point of the wealth term is in part to model getting through those limits, so you’d want to think carefully about that.
Assuming the Angry Birds challenge is serious, what is it about that game that makes it so much more difficult for AIs to master? Surely if AIs can beat humans at Go and chess, not to mention master dozens of classic Atari games in a matter of hours, Angry Birds shouldn't be that much more difficult??
Which article is that one in? Or if it isn't an article do you have a link?
Off the cuff I'd assume it's just an anthropic issue. Chess and Go are popular enough to devote a lot of time and compute. Atari could either be because it is very simple or that was what fates willed someone to try and solve first (and note it isn't literally all Atari games, just dozens). It doesn't have to be that Angry Birds is particularly AI-hard, but that the big guns haven't been brought to bear on it yet.
To answer your question about which article: It was Scott’s article. Perhaps the key quote about that particular issue: “The most trollish outcome — and the outcome toward which we are currently heading — is that those vast, semidivine artifact-minds still won’t be able to beat us at Angry Birds.”
Not an expert, but while we associate board games like Chess and Go with 'high intelligence', conceptually they're quite simple - the rules are not too complicated and they're very discrete: you have something like 100 moves in every position and you can exactly list them out and analyze the resulting position.
Angry Birds is a lot more complicated and 'fuzzy' - it requires something a lot more like a working knowledge of physics - the input is simple (a bird, an angle, a power), but the result isn't trivially predictable in the way it is with chess or Go, so the techniques we've used for ages just don't work.
I don’t think it’s impossible, but it is far enough from what existing systems do that you’d have to design and train a system specifically for playing Angry Birds. And it’s not surprising that nobody has tried yet.
LLMs have a surprising ability to figure out freeform problems like this, but they’re not a good fit for two reasons: they don’t handle images, so they can’t just “read” the screen and figure out what’s going on, and they have short context windows so they can’t do longer interactions that are necessary to play a whole game. Multi-modal models will help with the first issue, but not the second.
Eventually when modern AI is applied to robotics, it’ll have to handle long-running interactions with live video input, and I expect that models like that will also be able to operate any smartphone app fluently.
Yes, I was wondering the same thing in the last open thread. The game is turn by turn, there are very few parameters to play with (choice of bird, angle and intensity of throw, maybe time of throw?), the response is immediate and the levels short, and the main aspects of the physics of the game shouldn't be hard to grasp. Chess and Go are not necessarily that helpful comparison, but surely it should be easier than Starcraft 2 or hard platform games? Has anyone looked into the achievements of this contest: http://aibirds.org/angry-birds-ai-competition.html ?
It's definitely lack of interest in beating it. It was DeepMind creating the AIs that beat games, and they were tackling Go, Starcraft, etc. The same tech _definitely_ would beat Angry Birds.
But now they are solving protein folding, and won't drop that for a stupid game
I'd agree, it is clearly a lack of incentive. Does anyone even play Angry Birds anymore? It is out of the zeitgeist and isn't in vogue. That said...Scott may continue to be right for the wrong reasons...in that Angry Birds will remain an unexplored frontier.
Now what would be more impressive is if we applied AI to beating humans at the more complex board games such as Settlers of Catan.
Go and chess are not very hard games at all. They are played at a very high level of expertise and very very well studied. But they are both SUPER simple in the context of "games".
Go and chess are not visual games. You can play them with numbers alone, and there are a lot of engines that will output the board state for the AI to pore over. Angry Birds (as well as unmodified StarCraft and other unmodified games) requires the AI first to solve the vision problem - i.e. learn what is presented on the game's screen - and THEN learn how to play it. Same as the long-solved Atari games, yes, but modern games are much more colorful and dynamic, and screen scrolling adds another complication. Then again, even if you treat a Go or Chess board as visual input, it's a very simple visual input with easily distinguishable features. A game with cartoonish graphics, animations, and effects that appear from time to time to obscure parts of the image? That's hard.
If you can hook inside the game and pull out the relevant data, it simplifies the problem a lot - but maybe too much. After all, a human with a calculator can probably plot a perfect trajectory if they know the exact coordinates of the shooter, target, and obstacles, the speed and acceleration of the bird, the gravity, and the other important properties.
I've been doing some ML research on a game I'm working on at my workplace, and I went this route - I give the AI a pre-digested representation of the game world, not a screen capture, because my puny GeForce 1060 cannot even begin to handle networks that could solve both the vision and mechanics tasks. But even that is not simple, because our game involves quite a lot of RNG, and different weapons and abilities for both the player and enemies.
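A minimal sketch of the kind of calculation meant by "plot a perfect trajectory" once the exact game-state numbers are extracted (idealized projectile motion, ignoring drag, obstacles, and Angry Birds' actual physics constants, all of which are assumptions here):

```python
import math

def launch_angles(dx, dy, v, g=9.81):
    """Launch angles (radians) that hit a target at (dx, dy) from the origin
    at speed v, under ideal projectile motion with no drag or obstacles."""
    disc = v**4 - g * (g * dx**2 + 2 * dy * v**2)
    if disc < 0:
        return []  # target out of range at this launch speed
    root = math.sqrt(disc)
    return [math.atan2(v**2 - root, g * dx),   # flat, direct shot
            math.atan2(v**2 + root, g * dx)]   # lofted, high-arc shot

# Example: target 20 m away and 5 m up, launch speed 25 m/s
for theta in launch_angles(20.0, 5.0, 25.0):
    print(f"{math.degrees(theta):.1f} degrees")
```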
I asked ChatGPT how to avoid the problem of researchers always being depicted as chemical engineers wearing labcoats, then punched a summary into Dall-E. I think this is probably better representative: https://labs.openai.com/s/KJoq7h4Yfn5ukOnGmCQhQ3v8
The original prompt was "How can I prompt an AI image generator to provide images of researchers that are not biased by the stereotype that all researchers are chemical engineers and wear lab coats all the time?"
I think ChatGPT read the word "biased" in my prompt and immediately leapt into woke mode.
Not disputing statistics, but you may consider that the demo is "researchers", not computer scientists. Anecdotally, while most programmers I know are male, I think the majority of researchers in my circles are not cismen.
The other options had those, but they had their own issues. One was 100% female (or at least, long-haired full-lipped) representation, one looked too much like a business meeting, and the last was this: https://labs.openai.com/s/ptsMITDysdqYBbMWksYG0413
Indeed, things can get pretty interesting regardless of where the inflection point lies. (I don't know if S-curves can look locally hyperbolic instead of exponential.)
I mean, it’s easy to draw an s curve that is locally hyperbolic if you are generous in what counts as an s curve. But if it is specifically logistic growth, that specifically looks exponential.
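For what it's worth, the distinction in symbols (standard textbook forms, nothing fitted to the article's data):

```latex
% Logistic: for x \ll K the quadratic term is negligible, so the early
% phase looks exponential, not hyperbolic.
\dot{x} = r\,x\left(1 - \frac{x}{K}\right) \approx r\,x
\;\Rightarrow\; x(t) \approx x_0 e^{r t}
% Hyperbolic: superlinear feedback gives finite-time blow-up.
\dot{x} = c\,x^{2}
\;\Rightarrow\; x(t) = \frac{1}{c\,(t^{*} - t)}
```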
China is behind on some things in AI and ahead on others. These advantages and disadvantages are probably fairly structural, to the point where I'd feel semi-comfortable making predictions about who will lead where (or at least what will be close and where one side will be far ahead). For example, LLMs are handicapped by conditions in China and probably will be indefinitely. On the other hand, facial recognition and weapons guidance have structural advantages there because the huge security state serves as an eager market and data provider. And China tends to play everything close to the chest, but especially military technology.
I'm also highly suspicious of anyone writing about Chinese AI who doesn't have a tech background or who doesn't have access to things like AliCloud. Chinese reporters and lawyers and propagandists are usually about as technically literate as their American counterparts. Which is to say not very. You really need to scrub in to understand it.
Not to mention a lot of the bleeding edge is in dense papers in Chinese. Plus the Chinese produce a gigantic quantity of AI papers which means simply sorting through everything being done is extremely difficult. You need to BOTH understand Chinese and computer science language to even have a hope of understanding them. But Ding seems focused on the consumer and legal side of it.
Also fwiw: I tend to think AI will lead to lots of growth but not some post-scarcity utopia type scenario. It will speed growth, be deflationary as per usual, etc. And I think that we're actually going to see a move away from gigantic all encompassing models towards more minutely trained ones. For example, an AI model based on the entire internet might be less useful than an AI model trained on transcripts of all of your customer service calls and your employee handbook. Even if you wanted to make a "human-like" AI ingesting the entire internet is probably not the way to do it. Though the question of whether we can keep getting smaller/better chips (and on to quantum) is certainly open and important to computing overall.
And the idea of using chips unilaterally as a stranglehold is a bad idea. We only somewhat succeeded with Russia, and only then because the other important players went along with it. Any number of nations, even individually, could have handicapped the effort. They just chose not to. And keep in mind China is in the "chose not to" camp. This is without getting into rogue actors.
All in all I enjoyed the magazine although I did not agree with much of it.
A claim that's popular in certain circles is that Chinese language academic literature is a thoroughly corrupt cargo cult paper mill, the main purpose of which is to manufacture credentials. What real research there is can be judged by its appearance in Western publications.
I'm surprised that the interview didn't mention hardware embargoes at all. At first everybody seemed to treat them like a Big Deal, whereas by now it seems more like an overhyped non-issue.
This is a claim I’m definitely interested in. I’d be very cautious about truly believing it, the same way I’m cautious about believing the claims people make about certain academic fields being this sort of thing (whether it’s string theory or gender studies).
This is both true and not true. The way Chinese research works is effectively a centrally planned version of publish or perish. "X University, you are responsible for producing X AI papers to meet the goal of Y AI insights annually." This means 90% of papers on any subject are not bad but aren't worth a lot. They contain real research but it's clearly done to meet a paper quota. I've seen papers I am like 80% sure were specifically written to be done quickly.
On the other hand, China has scale and there are strong incentives for getting it right. If 90% of papers are bad the remaining 10% still represents significant output. And China is quite willing to throw money, prestige, posts, awards, etc at people who produce highly cited research or research that gets picked up by the military or industry. They're also fine with them taking innovation into the private sector and getting rich. The paper mill cargo culter probably has an okay post at a regional university. But if you want to get rich or the top post at a big university you need to make some actually impressive contributions.
The problem with this inefficient "at all costs" system with the gaps filled in by profit seeking companies (who in turn mostly stick to the cities) is that it creates profound regional inequality. The "bad at diffusion" the article talks about. But it's not apparent to me this would matter in AI. It's not like there's artisanal AI models being produced in Kansas anyway. And just those cities represent a larger population than the US at this point.
The chip embargo is a real thing and has caused issues in China. But it hasn't stopped them, just slowed them. In part because it's not nearly total and in part because of China domestic industry.
>But if you want to get rich or the top post at a big university you need to make some actually impressive contributions.
Well, the claim isn't that these don't happen at all, just that they also get published in the West, and this is how they get judged in China as well, because domestic peer review standards aren't high enough. Or are you saying that there is actual cutting edge research there that the West isn't up to speed on?
Yes, there are. And while the west is still more prestigious than internal in China that gets less true every year and has taken an especially sharp turn lately. Getting foreign citations might be prestigious but the real way you move up is impressing your (also Chinese speaking) superiors. Not to mention that unlike (say) India the vast majority of Chinese computer science professors speak almost no English. Even many of the very good ones. Meaning there's a real barrier to publication abroad.
If you look at citation networks you see clusters around language groups. But you also see that Chinese, Japanese, and Korean have strong research output but are much less connected internationally than the other clusters (which are all European languages).
But then how do those top Chinese computer scientists stay abreast of the cutting edge, which is undoubtedly still mainly Western-centered? Surely being language barrier-cut from it would imply that they are constantly at risk of bicycle reinventing, at the very least.
They do have translators, of course. The government/CCP subsidizes all kinds of translations into Chinese (and some out of it). Including specific focus on research papers in areas they're interested in. In many cases Chinese researchers can get relevant western papers for free in about a day after they're published.
But those languages do in fact develop unique features that you'd expect to see in a somewhat isolated environment. (In Japan it's called Galapagos Syndrome.) And there is some degree of reinventing the wheel. The infamous example here being a Chinese mathematician who reinvented Newton's way of doing calculus integration and named it after herself.
People would normally cite the AI survey as "Grace et al" because it was a collaboration with Katja Grace as the first author. (I was one of the authors).
In the Asterisk article you say, "As for Katja, her one-person AI forecasting project grew into an eight-person team, with its monthly dinners becoming a nexus of the Bay Area AI scene." But the second author on the 2016 paper (John Salvatier) was a researcher at AI impacts, and the paper had five authors in total.
AI Impacts has had many boom-bust cycles and I believe it was approximately one person for some of the period the paper was being written (I lived with Katja at the time and gave her a small amount of writing advice, though I could be forgetting some things).
The article about people underestimating the arrival of the Soviet atomic bomb is almost completely leaving out the fact that the Manhattan project was porous as swiss cheese to Soviet spies. Stalin had full blueprints for the bomb before Truman even knew what an atomic bomb was. The Germans failed after going down a research wrong alley (whether Heisenberg did it on purpose or not is perhaps an open question, but the fact remains that they didn't focus on the right path). It is a LOT easier to build an atomic bomb when you have the full manual, plus a list of dead ends to avoid, courtesy of Klaus Fuchs, Theodore Hall, David Greenglass, etc. Unless this article is suggesting that China has spies in OpenAI, I think the analogy needs more work. (And if he is claiming that, that's a major claim and he should be more specific about why he's claiming it.)
Also, can someone please explain to me the ending of the chatbot story? I liked it up until the end and then I was just confused. Ads on fAIth???? What's that mean?
The lack of accounting for Russian spies was clearly outlined in the article. It was something the forecasters at the time missed, in both their public statements and the private ones that have since become public. But the author of the article addressed this point, found it bewildering as well, and wonders what would have happened if they'd taken it into account. Perhaps there is some correct estimate in a hidden intelligence archive that did take the spying into account. The main problem was stovepiping on the American side: they didn't know what was going on themselves, and hence couldn't notice, or imagine, that anyone else did.
I suppose they bought into their own nonsense and thought their efforts to keep the Manhattan Project secret had succeeded...the evidence to the contrary only arrived in 1949 when the Russians succeeded, or even in later years when the success of their spies became clear. I'd argue the rapid rise in popularity of James Bond and spy films/stories in the West was largely a reaction to that critical failure.
Why didn't they compare it to GPT-4 on MMLU (a famous benchmark that people actually care about)? A mystery of the Orient. It seems GPT-4 performs better on the hard version of C-Eval; it's only on the easier one that ChatGLM scores higher. So take from that what you will.
I like how the guy starts out by assuring people the result's not fake. Kind of says it all.
Specifically because one of the big reasons Ukraine gave up its Soviet nukes was security guarantees. Now everyone has seen how that goes (some may add "if you're relying on Russia to keep its word", but I think some reassessment of risks is due even for countries nowhere near Russia). People also bring up Gaddafi for similar reasons, although since the Libyan nuclear program wasn't anywhere near success when it was abandoned, it probably wouldn't have mattered several years later.
A lot of hints and suggestions from the prompter were needed, as well as reliance on a fake (simplified) website, and a human was still needed to engage with that website.
About the two surveys, I wonder whether the explosion of the field plays a role. "Expert" was defined as someone who has published at NeurIPS or ICML, probably in that year.
But for the bigger one, NeurIPS, in 2015 there were ~400 papers, in 2016 it was ~600, in 2021 it was ~2400 and in 2022 there were ~2700 papers.
I am not sure that this implies less expertise for each expert. It could be, because in 2016 there was a much larger fraction of experts with years or decades of experience in AI research, while the typical expert nowadays has 2-3 years of experience. But perhaps long-term experience on AI is useless, so who knows?
But it definitely implies that experts can no longer have an overview of what's going on in the conference. In 2015/16 experts could still know the essential developments in all of NeurIPS. Nowadays they are way more specialized and only know the stuff in their specific area of expertise within NeurIPS. So perhaps it's fair that 2022 experts don't know the state-of-the-art in all of AI, while 2016 experts still knew that?
Here is a chart for how NeurIPS has grown over the years:
I'm a bit angry about the Starcraft thing. Scott says that AI has never beaten top human players. But the reason it did not is that it was not allowed to.
The AI that played Starcraft was seriously hobbled: it wasn't allowed to view the whole terrain at once by moving its camera very fast. It wasn't allowed to micromanage each and every unit at once by selecting them individually. In general, its APM (actions per minute) rate was limited to a rate comparable to a human's.
That's like inviting a computer to a long-division competition, but only giving it a 100 Hz CPU.
The whole point of having AI instead of meat brains is that AI works on very fast CPUs with very large RAM. If you're gonna forgo that, you won't reap the benefits of having a silicon brain.
Well, AI still has advantages, like perfect mouse precision for instance. I'd say that if we are comparing _intelligence_ instead of mechanical proficiency, then an attempt to introduce those limitations makes sense. Of course, the underlying problem is that high-level Starcraft is not particularly intelligence-heavy in the first place.
That article about AI taking tech jobs is really frustrating. He predicts by 2025, 55% of programmers will be using LLMs "for the majority of their professional programming work". He bases this on "the 1.2 million people who signed up for Github Copilot during its technical preview" (divided by a bit to restrict to U.S. programmers), but utterly elides the distinction between "signed up to try the new thing" and "have adopted the technology as a key part of programming from now on".
What percentage of those 1.2 million even got around to trying it at all after signing up? (I'd guess less than 50%.) What percentage of the remainder got frustrated and didn't put in the effort to learn how to work it? (I'd guess more than 50%.) What percentage of the remainder were able to find places to use it in a significant proportion of their work? (I'd guess less than 50%.) There's a huge, huge gulf between "is a tool someone's tried once" and "is an invaluable part of _the majority of_ someone's work".
A couple months ago, my workplace rolled out an internal LLM-based autocomplete similar to Copilot, but the suggestions are only occasionally useful and it's not even clear to me that it's a net positive.
But more importantly, **even if LLMs could magically write all the code I wanted to perfectly**, that *still* wouldn't be "the majority of their professional programming work" because writing code is only a small fraction of what programmers do.
Amen to that last paragraph. Writing the code is generally the easiest part of my job. Deciding what to write, how to align it with the existing architecture and hidden constraints[1] of the system, figuring out why it worked (didn't throw an error) but didn't actually work (do what it was supposed to do), how to validate what it did and what the side effects of doing it this way are, etc. - that's the hard part.
[1] Working in a codebase that grew "organically" and, worse, was subject to several top-down "cool new things" pushes means that there are a lot of undocumented, spooky action-at-a-distance constraints. Where jiggling that lever means the bridge halfway around the continent falls down. But only if the moon is full and the price of bulk oolong tea on the Beijing futures market is above $X/kg.
Thankfully I work for a small company that doesn't have enough departments for that. But we have plenty of meetings as it is, to be sure. And product folks who can't make up their mind (or maybe can't remember what they had previously decided). So I feel you.
Thank you for taking the time to read my article, and I take your point. In the process of writing the article, I had to make some difficult decisions about what to include due to space constraints. Unfortunately, this meant that some of the nuances around the adoption of coding assistants, including the points you've raised, were not explored in as much detail as they otherwise could have been. If you'd like, I can do a deep dive into this section on my blog. Otherwise, I'll be happy to answer questions offline and share the original paragraphs surrounding this section that capture a bit more detail.
Gosh, interesting. Thanks for the reply. So you say there is numerical detail and the headline result is still really that you think 2025 is the best guess for when 55% of programmers will be using LLMs "for the majority of their professional programming work"?? I find that... very unlikely. Big chunks of the software industry are swift to adopt new tools, especially in the web space, but there are other big chunks which aren't using any tools newer than 5 or 10 or 15 years old. Some of our customers are still refusing to update from a 10-year-old version of our software! And "the majority of their professional programming work" feels like a huge step from even "acknowledge as a valuable tool and use at least once a month".
Thank you for your response, and I appreciate that you're actively thinking about the underlying assumptions behind my process and not just accepting it at face value. The 55% estimate for 2025 carries a healthy degree of uncertainty since I didn't have much data to go on. It's a good point that a significant number of developers work in environments where the adoption of new tools like LLMs may be slower. Are you using them yourself?
As mentioned in the article, about 1.2 million people had at least signed up for Github Copilot during its technical preview and I estimated about 300K (25%) were in the United States based on Stack Overflow surveys. Of those, I expected only 150K would use it seriously for business purposes (based on conversations with developers I know) which when divided by the estimated developer population of the United States at that time (~4.2M) gave the ~3.5% figure for 2022 from the article.
Mapping this to an idealized adoption S-curve would suggest that we're likely around 10% adoption today (which also lined up with an informal survey I conducted), but if you think the initial estimates are suspect, then that could significantly change the trajectory.
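For concreteness, here is a rough sketch of that back-of-the-envelope chain plus an idealized logistic extrapolation (the 90% ceiling and the growth rate are illustrative assumptions, not figures from the article):

```python
import math

# Back-of-the-envelope from the estimates above.
signups_us = 1_200_000 * 0.25        # ~300K US sign-ups
serious_users = signups_us * 0.5     # ~150K using it seriously for work
us_developers = 4_200_000
adoption_2022 = serious_users / us_developers
print(f"mid-2022 adoption: {adoption_2022:.1%}")   # ~3.6%

# Idealized logistic adoption curve anchored at that mid-2022 point.
ceiling, r, t0 = 0.90, 1.3, 2022.5   # ceiling and r are assumed, not measured
c = (ceiling - adoption_2022) / adoption_2022
def adoption(year):
    return ceiling / (1 + c * math.exp(-r * (year - t0)))

for year in (2023, 2024, 2025, 2026):
    print(year, f"{adoption(year):.0%}")
```

Small changes in r or in the starting estimate move the 55% crossing by a year or more in either direction, which is roughly the sensitivity being debated here.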
I appreciate your feedback and it's given me something to think about. I'm curious to hear your thoughts on the current level of LLM adoption among developers and how you see this evolving as time progresses. Thank you again for your comments.
In your article, you argue that experts are well-calibrated because they had as many underestimates as overestimates. But this doesn't account for sampling bias - overestimates are more likely to be in the sample (since they've already happened). That is, things predicted for 2026 that happened 4 years early are in green, but 2026 predictions that only happen in 2030 aren't in red in your table.
Adjusting for this, experts probably do have a bias towards optimism (though small sample size so hard to be sure).
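A toy simulation of that censoring effect (the distributions are made up; the only point is the asymmetry that falls out even with perfectly calibrated predictors):

```python
import random

random.seed(0)
NOW = 2023
scored_over = scored_under = unresolved = 0

for _ in range(100_000):
    predicted_year = random.uniform(2015, 2035)         # target date given in the prediction
    actual_year = predicted_year + random.gauss(0, 4)   # truth, with a symmetric (unbiased) error
    if actual_year > NOW:
        unresolved += 1        # hasn't happened yet, so it never shows up in the table
    elif actual_year < predicted_year:
        scored_over += 1       # arrived before the predicted date: counted as an overestimate
    else:
        scored_under += 1      # arrived, but after the predicted date: counted as an underestimate

# Even with zero bias, scored_over > scored_under, because late arrivals
# past NOW cannot be scored yet.
print(scored_over, scored_under, unresolved)
```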
Also, re the argument that "fusion is always 30 years away" - there are consistent models for that (e.g. if you think it'll come 10 years after a visible breakthrough but aren't sure when that breakthrough will happen, modeling the breakthrough as a Poisson process will give you 30-year timelines until they jump down to 10). AGI isn't quite as single-breakthrough-dependent, but I can see the case for a similar model being consistent (where it mostly remains X years out, then jumps down whenever something like GPT comes out).
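The memorylessness driving that: with a constant breakthrough hazard λ (1/20 per year is an assumed rate chosen to make the numbers come out to 30), the expected remaining wait never shrinks as breakthrough-free years pass:

```latex
E[\,T - t \mid T > t\,] = \frac{1}{\lambda} = 20 \text{ years}
\;\Rightarrow\; \text{forecast} = 20 + 10 = 30 \text{ years at every date}
```

dropping to 10 the moment the breakthrough is actually observed.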
Fusion is limited by cost-effectiveness, not breakthroughs. If we were actually close to running out of fossil fuels like it had been fashionable to predict once (but for some reason still insane about ordinary nuclear plants), then all the technical problems would've been solved within 5 years.
One crux I note that I have with "rationalists"/AI doomers, that I see here on display:
Most of the items that Scott resolves positively on the AI predictions, I would resolve negatively, with the further conclusion that AI predictors are way too "optimistic". A major example would be: "See a video, then construct a 3d model". We're nowhere close to this. And sure, you can come up with some sort of technicality by which we can do this, but I counter that that sort of reasoning could be used to resolve the prediction positive on a date before it was even made.
I find this further makes me skeptical generally of prediction markets, metaculus, etc. The best "predictors" seem to be good not at reading important details regarding the future, but at understanding the resolution criteria, and in particular, understanding the biases of the people on whose judgement the prediction is resolved.
As much as it's a good concept that disputes between models can be resolved via predictions, the whole system breaks down when different models produce different presents, and each model accurately predicts the present environment that its own users find themselves in.
"See a video, then construct a 3d model" is crude but existing tech right now. You can get an app on your phone to do it for individual objects, and google maps photogrammetry is also essentially doing this.
Note that the prediction requires you to "see a video, then construct a 3d model" *with AI*. Turning an image into a 3d model is something we were already able to do in the 1980s. The model probably won't be accurate or efficient, and it certainly won't have good topology, but...
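For what it's worth, the non-AI version of "frames in, 3D points out" is the classical structure-from-motion pipeline. A minimal two-view sketch with OpenCV might look like this; the image files and camera matrix K are placeholders, and a real system would add many views, bundle adjustment, densification, and meshing.

```python
# Minimal classical (non-learned) two-view reconstruction sketch with OpenCV:
# match features between two frames of a video, estimate the relative camera
# pose, and triangulate a sparse 3D point cloud. File names and the camera
# matrix K are placeholders for illustration only.

import cv2
import numpy as np

img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_010.png", cv2.IMREAD_GRAYSCALE)
K = np.array([[1000.0, 0, 640], [0, 1000.0, 360], [0, 0, 1]])  # assumed intrinsics

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Ratio-test feature matching.
matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.7 * n.distance]
pts1 = np.array([kp1[m.queryIdx].pt for m in good], dtype=np.float64)
pts2 = np.array([kp2[m.trainIdx].pt for m in good], dtype=np.float64)

# Relative pose from the essential matrix, then triangulation.
E, _ = cv2.findEssentialMat(pts1, pts2, K, cv2.RANSAC)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
points_h = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
points_3d = (points_h[:3] / points_h[3]).T    # sparse point cloud, Nx3

print(f"triangulated {len(points_3d)} 3D points")
```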
There's a motte-and-bailey problem with these predictions. Each of these capabilities sounds and reads like impressive and powerful technology, but the technology that then resolves them positive tends to be useless and uninspiring. I want to see actual professionals doing important work with these tools before I would resolve this class of prediction positively.
I continue to be annoyed at people identifying the phrase "Moore's Law" exclusively with Moore's 1965 paper on the optimal number of transistors per IC. When the phrase was coined in 1975, many people, including Moore, had pointed to other things also improving exponentially, and it was coined in the context of the conference at which Dennard presented his scaling laws tying all the different exponential curves together. From 1975 to 2005, when Dennard scaling broke down, it was always used as an all-encompassing term covering transistor size decreases, power use decreases, and transistor delay decreases. It's only when the last of these went away that people started arguing over what it really meant, and some people substituted the easier question of "What was the first exponential Moore talked about?" for "What does Moore's Law really mean?" /rant
The bit about AI alignment in "Crash Testing GPT-4" was miles better than most of the AI alignment stuff I've read. Practical, realistic, fascinating. That's real adult science!
Hmm... I was interested in the "So the kind of tests that you’re doing here seem like they could involve a lot of training or prompting models to display power-seeking behaviors, which makes sense because that’s what you’re trying to find out if they’re good at. But it does involve making the models more dangerous. Which is fine if it’s GPT-4, but you can imagine this increasing the threat of some actual dangerous outcomes with scarier models in the future." exchange... Shades of gain-of-function work...
In my limited testing and subjective evaluation, when translating from Russian to English, ChatGPT 3.5 beats Google Translate by a wide margin and translates about as well as a fluent but non-expert human. I have a degree in translation between this pair of languages and it beats me at translating songs. I think the chart pegging it at 2023 is in fact spot on.
The rank stupidity of questioning whether animals think, have theory of mind, etc., gives good reason to mistrust experts of all stripes, but particularly those who think about intelligence, consciousness, etc. (Even in the early 90s, Time magazine ran a cover story asking something like "Can animals think?") Claims about qualia, and the denial of them to other things, play a similar role in misguided philosophical arguments about thinking - denying the obvious, that it arises out of the physical structure of the brain, chemical interactions, etc., an emergent property without any magic or special sauce involved. Which suggests machines "feel" something, even if something we might not recognize, and are "thinking" even if we just want to frame it as "mechanistic" or "without the magic spark," etc. We're playing the same game with AI now. "OK, fine, crows use tools, but they still don't really "think" because . . ." "OK, fine, GPT seems to "understand" X pretty well and respond appropriately, and can fool most fools most of the time, but it still can't . . ."
Sarah Constantin is an ace at topic selection and she goes deep enough into it to just satisfy me too, although I would have liked two areas in particular to be explored more: memristors and neuromorphic computing. If there is to be a breakthrough to get us past the Moore's Law limit that is approaching around 2030 then I think these two technologies will contribute.
The economist and research scientist, in their debate, seem to have forgotten the basics of GDP and *economic* growth (distinguished from other "growth" such as the invention of collectibles, or changes in price of existing goods, that is not economic).
Statements about GDP and economic growth are statements about household income and expenditure.
Yes, there are government spending, business capital formation, and net exports, but to a first approximation, GDP is what households get, and what they spend. *Economic* growth is an increase in household income and expenditure--an increase in the material welfare of households.
Again to a first approximation, households get their income from jobs. The debaters, while talking at length about automating jobs, have not bothered to look at *where the jobs are.*
A quick search of the Bureau of Labor Statistics' web site will tell you that the high-employment occupations are in health and personal care aiding, retail sales, customer service, fast food, first-line supervision of retail, office and logistics staff, general and operational management, and physically moving things around - warehouse staff, shelf re-stockers, and order fulfilment. And nursing.
(Just beyond the top 10 we have heavy truck driving, the paradigmatic automatable job, that will be gone in 2016... no, wait, 2018... er, 2021 for sure... well maybe 2025 ... umm, ah, now we've tried to do it, it's a little tricky: perhaps 2040?)
If you want to alter the economy fast, you have to augment the work of people in those high-employment occupations, so the value produced per employee becomes greater. That way, you can pay them more. You're not going to make a big impact if you only automate a few dozen niche occupations that employ a few thousand people each. That doesn't move the needle on aggregate household income and expenditure.
Augmenting work in high-employment occupations could result in either of two outcomes (or, of course, a blend); a rough numerical sketch of both cases follows below:-
a) if price elasticity of demand is high, and the cost of providing the service falls, then demand explodes. Health and personal care aiding is probably like this--especially if we look at a full spectrum "home help" aide job: some combination of bathing and grooming (nail trimming and the like), medical care (wound dressing, and similar), meal preparation, cleaning (kitchen, laundry, house general) and supervision of repairs and maintenance, operating transport (for example to shops and medical appointments), and on and on.
Currently about 3 million people are employed in the US doing health-and-personal-care work. Probably this means three to five million recipients. There is a potential market of 300M--if the price is right. Who wouldn't make use of a housekeeper/babysitter? (That's a rhetorical question: "every working woman needs a wife", as the saying used to go, back in the '80s and '90s, before political correctness. Of course the existing workers in this occupation are nearly all working with elderly people who are no longer able to do some things for themselves. So "every adult needs a HAPC aide", perhaps.)
So, maybe a sixty-fold increase for an occupation that is about a sixtieth of the workforce: a doubling of household welfare, potentially. All we need is for every occupation to be like that: job done!
But...
b) if price elasticity of demand is low, then the service becomes a smaller part of the total economy. This was the case for food: home prepared and consumed food used to be about 40% of household expenditure; now, it's single-digit percentages. People ate more food and better food as its price dropped; they just didn't eat *enough* more and better to maintain food in its rightful place at the top of the budget. Heating (space-heating, water-heating) and clothing have also shrunk in percentage terms.
It seems likely that retail sales, order fulfilment, and restocking fall into this category. Yes, people will buy more stuff if the price falls, but the increase is probably less than one-to-one with the price reduction. Some of the saved money will be spent on extra tutoring for the child(ren), haircuts, spring breaks, or pilates classes instead.
The same applies to the "customer service representative" and "fast food restaurant worker" occupations.
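Here is the crude numerical version of the two cases, using constant-elasticity demand; the price drop and elasticity values are made up purely to show the direction of the effect, and the aide arithmetic is the same back-of-the-envelope as above.

```python
# Crude illustration of the two cases above, using constant-elasticity demand
# Q = A * P**(-elasticity). The price drop and elasticity values are made-up
# numbers, chosen only to show the direction of the effect on total spending
# (and hence on the sector's share of the household budget).

def spending_multiplier(price_multiplier: float, elasticity: float) -> float:
    """How total spending on the service changes when its price changes."""
    quantity_multiplier = price_multiplier ** (-elasticity)
    return price_multiplier * quantity_multiplier

price_drop = 0.5  # suppose automation halves the price of the service

for label, eps in [("case (a): elastic demand, e.g. home help", 2.0),
                   ("case (b): inelastic demand, e.g. food/retail", 0.3)]:
    m = spending_multiplier(price_drop, eps)
    print(f"{label}: total spending x{m:.2f}")
# Elastic demand: spending (and employment in the sector) grows.
# Inelastic demand: spending shrinks and the sector's budget share falls.

# The aide example above, in the same spirit: ~3M workers out of a ~160M
# workforce is roughly a fiftieth to a sixtieth; a ~60x expansion of that
# output would be on the order of a doubling of household consumption.
print(3_000_000 / 160_000_000 * 60)  # ~1.1 "workforces" worth of extra output
```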
So automating these occupations (about 12 million employees?)--which, by the way, is easier than automating health-and-personal-care-aiding: it's already happening--will, er, "release" those employees.
Observe that it will also "release" a fraction of the "first line supervisor" and "general and operational manager" occupations, because they supervise and manage workers in these occupations so the effect would be super-linear. There will be *a lot* of people wanting to tutor kids, style hair, facilitate spring breaks, and teach pilates classes.
On the gripping hand, automation may well cause some occupations to grow hugely, that are currently very niche. Personal trainers. Personal coaches/mentors. Wardrobe consultants. Dieticians. Personal "brand"/social media image managers. Ikebana teachers.
And new occupations, or sub-occupations will appear: tutoring people how to get the best out of conversational AI interfaces, for instance.
I have no clue what will take off or to what extent, but I think the process of getting to mass market will take years. (And in the meantime, the effect of automation will be to shrink household income and expenditure.)
That brings me to the third omission from the debate: constraints on investment. If you look over the BLS's list of high-employment occupations you will notice that most of them do not have large amounts of capital associated with them. That is likely to be because businesses employing people in those occupations can't borrow to invest very easily. They are low margin businesses.
There is some discussion of constraints in the Asterisk piece, in the form of "the time it takes for a crop to grow", or "the time it takes to acquire information", but no discussion of the basic (to an economist) constraint: access to funds. Now that interest rates are no longer near zero, this looms as a major limiting factor.
So I think that Hofstadter's Law applies: it always takes longer than you expect, even when you allow for Hofstadter's Law.
Tamay and Matt's conversation seems a bit strange and off base, missing the mark on reality. They sound like the experts in the second survey in Scott's article, who gave reasonable-sounding answers with future dates for events that had already occurred.
One point jumped out at me. Matt, in his second-to-last reply, talked about how we 'have not seen' impacts from AI boosting the economy beyond the 2% trend. I'd argue the 2% we have been seeing is already coming from AI as older value-adds have faltered. They assume a static world and miss the economic skullduggery that has been used to hide the long-term decline of Western economic growth, which has been underway for decades. What propelled us in the 1900s is no longer working.
I.
They are just dead wrong. Companies like Amazon and Nvidia have seen dramatic benefits from using AI over the past 10 years as it has scaled up and become useful - not just LLMs, but also ML more broadly for Google and the social media companies. We may end up looking back at the role of compute power in the economy generally, more so than at the special type of compute we call AI.
When Brexit and Trump won razor-thin technical victories with help from Cambridge Analytica, that was a win for AI as well, in zero-sum political games which are not progress-focused - which was a great point Matt made. AI and compute-only models have already been a huge factor in the economy.
They are also wrong in fixating on AI doing things humans already do, while neglecting the things AI can do which humans can't - such as writing personalised political ads for a million swing voters. The AGI-vs-human-tasks framing is too narrow when considering the impact of AI on the economy, since AI can already do things humans cannot which create economic value. Their fixation on job/skill/task replacement rather than grand-scale macroeconomic outcomes is just too narrow.
II.
I think this has been masked by the fact of economic decline. Something like 95% of the growth in the S&P can be attributed to just 5 or 6 companies, tech companies using AI. Most of the economy is worse than stagnating and has been for decades.
This is a hard pill to swallow due to false accounting of inflation and other manipulated economic metrics and conditions created in a decade of low interest rates decreed by central bankers, but the publicly traded economy has been hyper concentrated in FAANGT and such for a long time now with AI of various types being their leading edge for the past decade.
Amazon and Walmart and others have already greatly benefited from AI in their warehouse and supply chain systems. Tesla has greatly benefited from various forms of machine intelligence automation and much of their value is tied in speculation about how amazing their AI self driving will be and their Dojo compute system. Not to mention AI and compute based allocation and coordination of data for their charging network, which wasn't as simple as/equivalent to building a bunch of gas stations along major highways.
I think the AI revolution and the forces which drive economic growth have already begun, and these two are busy arguing over needlessly specific and arcane definitions of full AGI and when this or that factor will push growth above the 2% trend. We are already at 20-30% growth or greater for AI-using companies, and they are picking up the slack from the falling value of things like big commercial banks, which are down 90% over the past 10 years.
This has already been happening and it is a wrong assumption to think the old factors of 2% growth from the 1900s are still in effect. They are no longer happening in western economies and have been dying out for a long time, hidden by financial games and zombie companies.
This entire article is junk because it focuses on GDP.
GDP, particularly for FIRE (Finance, Insurance and Real Estate) focused Western economies, is utterly meaningless. The massive fees extracted from Western populations - junk credit card fees, for example - "grow GDP".
So does the enormous US overspend on health care.
Skyrocketing college costs.
The list goes on and on and on.
Graph something concrete like electricity usage, energy usage, steel consumption and you get dramatically different outcomes.
Do you? Can you cite me something on interesting energy consumption vs. GDP divergence? And how would you resolve the problem where eg new medical cures make the world better but don't consume steel?
There are multiple examples where GDP was neither indicative of greater production nor of better outcomes.
As for "medical cures": while the cure may be pure information - the actual manufacture, transportation and deployment of said cure requires energy and resources. In particular - chemicals. India makes a lot of pharmaceuticals, but 80% of the inputs to these pharmaceuticals come from China and, as such, India's pharma output is 80% (or more) dependent on Chinese imports.
Note that explosives are a direct outcome of energy - in particular, natural gas --> Haber-Bosch --> Ostwald --> nitric acid. TNT = trinitrotoluene = the oil component toluene nitrated three times. The US importing TNT from Japan - which in turn has zero natural gas or oil - is pretty damn sad.
That post doesn't seem to be saying that. It discusses quibbles between GDP and GDP-PPP, then says that in war, what matters isn't raw GDP but GDP-devoted-to-weapons-manufacturing, which seems like a trivial insight, much as a country's position in the space race is probably less predicted by GDP than by GDP-devoted-to-space-programs.
Electricity probably isn't a perfect predictor of war-making ability either; Russia has 10x the electricity production capacity as Ukraine, but they seem evenly matched.
The post explicitly shows that the USSR had a far lower GDP ratio than National Socialist Germany yet produced 50% more weapons overall. This can hardly be called a quibble since tangible production always trumps nonsense measures of economic capability, and serves as a direct refutation that GDP represents an objective measure of actual industrial or economic capability. Note furthermore that nobody believes the USSR was a more advanced industrial economy in 1939 vs. said National Socialist Germany...
The post then goes on to look at China's manufacturing - both in terms of connectivity as well as pure "dollars" - as compared to the US and Europe; the result is China having as much manufacturing capability as the US and Europe combined.
And finally, Ukraine. Clearly you have not been keeping up: Russia is firing 10x the artillery shells that Ukraine - even with direct Western inflows supplementing its own - has been, something which even Western mainstream news outlets (WaPo, NYT, etc.) acknowledge.
Somehow this translates to an "even match".
So while I absolutely agree that any one measure like electricity consumption is not itself a better proxy than GDP - there are plenty of proxies which are. The manufacturing capacity one is probably the closest.
Note that while I don't agree with those who hold the Western GDP ascendancy to be nearly entirely fictitious due to the enormous aforementioned FIRE disproportion, Dr. Michael Hudson is entirely correct in noting that credit card fees, outsize college tuition, and overpriced medical care do not in any way convey productivity in a national sense, whereas they very much do inflate GDP.
I can cite any number of other examples of Western manufacturing hollowing out - Apple's attempt to onshore assembly of iPhones into the US failed because it could not source the tiny screws used, LOL; Ronzoni had to discontinue a popular pasta - both referenced here: https://www.thebulwark.com/the-economic-secret-hidden-in-a-tiny-discontinued-pasta/
And this doesn't address my example.
Calvin said nobody's job is going to be taken.
It's hard to say this increases productivity. It might, but really you're just talking about reducing some of the overhead involved in applying for something, which could have been done at any time by the government funding agencies you're applying to deciding to select grant recipients randomly instead of based on the quality of their applications. Unless your company is the only company that has figured out ChatGPT can fill out funding applications, you've effectively put these funding agencies into that position anyway. They no longer act as a meaningful discriminator. So they'll either come up with something else, or if we charitably assume there was some kind of good reason there was an application process in the first place, there's a tradeoff here in that whatever gain was considered worthwhile to have a competitive application process is now lost.
Plus, at least right now, assuming there is no conceivable way OpenAI can be profitable this early in the R&D phase of its business cycle, replacing human staff with ChatGPT just means OpenAI's investors are paying for your applications instead of you having to pay for them. Whether or not this represents a net economy-wide productivity gain anyway depends on your gains versus OpenAI's losses, neither of which is published. For what it's worth, I think it very likely still is, but we can't know.
I think you missed the bit where I said the applications were zero-sum but the line of business software wasn't. And the bit where I said I made no statement about overall US economy effects.
But I will ask this: you say "...assuming there is no conceivable way OpenAI can be profitable this early in the R&D phase of its business cycle..." are you including capex in this equation or just operating costs?
"In two Seattle studies of minimum wage regarding their last two increases they found that with the increase in the minimum the total amount of wages paid to minimum earners decreased due to reduced hours and lost jobs. Therefore if you decrease the minimum wage it's possible to increase the aggregate wages paid to minimum earners"
This does not follow. If you increase the minimum, employers stop hiring and do without. If you then decrease the minimum, they will still have learned to do without.
More robustly, if you freeze the minimum and commit to delivering future raises in the effective minimum wage through earned income tax credits for the foreseeable future, the effect should be an increase in minimum-wage hiring. While economists generally think increasing the minimum wage causes lost jobs in the short term, the theory is that it has an especially large effect on future hiring.
This could be analogous to increased productivity for programmers and other sectors, where future hiring is now expected to be more productive and cheaper per unit of production.
That would make sense - "not hiring" avoids a lot of the stigma an employer would get from firing people.
At least until you get e.g. youth unemployment rates of 50% or so, and even then it's a generalized "no one hires youths" instead of "Company <X> fired all its youth employees".
>I think if GPT enables a 50% increase in programming productivity the increase in demand for programming applications will increase by more than the reduction in cost.
As someone in the industry who actively uses both ChatGPT and Copilot (the two major LLM coding-productivity tools), I'm not sure this follows. GPT is good at two things: writing mediocre code, and helping competent developers write more code per unit time (at the expense of having to proofread the mediocre code, which is still usually a very favorable trade-off).
The problem with the software dev market is that it's almost impossible to find a job as an entry-level programmer, because there's *already* an overwhelming glut of mediocre programmers on the market. But (somewhat paradoxically) it's also pretty difficult for anyone looking to hire to find competent programmers!
So taking that into account, in a market with a glut of mediocre coders and a shortage of competent ones, if you introduce a machine that can only produce mediocre code or reduce the demand on competent coders, the market may actually shrink!
Interesting! So in isolation the programmer jobs market should shrink for mediocre programmers until GPT can assist them into competency. At which point it may be able to simply replace jobs and outpace the increase in demand for the service
It's naturally hard to imagine this being a benefit for overall employment, although I always think of energy, where greater production per unit of labour and capital employed reliably enables jobs in the wider economy.
That's hype cycles for you. I'd say ChatGPT 4 caused a small hype epicycle on the still-ascending curve of a massive overall AI hype cycle.
>I have not been following AI news all that closely, but it kind of seems like the hype is dying down a little?
It's not just you. Look up "ChatGPT" or "GPT" or "GPT4" on Google Trends. The new toy shine is gone.
GPT is great and obviously lots of people still use it, but the hype bubble has deflated a bit.
I think people are waking up to the limitations of chatbots: they suck for a lot of stuff (even when the underlying AI is smart). The quality of your output is dependent on how well you can phrase a prompt (or worse, grok some abstruse "Midjourney-ese" that's impossible for a newcomer to intuitively understand). They're not always the best tool.
I was scared we'd move to a paradigm where EVERY website and app requires you to use a chatbot, regardless of whether it makes sense. I saw a music generation app that let you create simple music via a chatbot. Like, if you wanted the drums in your song quieter, you'd type "make the drums quieter". How is that not ten times harder than adjusting volume via a fader? I'm glad we dodged a bullet there.
Was this a generation app that was using one of the recent big music-generating models on the backend, where the audio composition was being done inside the inscrutable matrices? Or was it a generation app that was closer to traditional generation where there were multiple channels and such that were legible to the software?
>Like, three months ago everyone was acting like AI was going to take over the world soon, and now the attitude seems to be it's a kind of cool autocomplete.<
Sort of how the average Englishman felt between the declaration of war and the beginning of the Battle of Britain.
I don't think I'm getting too far out over my skis to say that we're full speed ahead on AI at Wolfram and trying to figure out how what we have built fits into this rapidly changing situation. GPT-4 has proven itself capable of writing some surprisingly sophisticated code.
AI xrisk was always about the future, even if it's a close one. If there's a change in tone, I would argue that mostly reflects the internet's attention span. There are still people on both sides of the issue.
If this is a representative summary of Asterisk's articles, it seems... very one-note. Are there any online rationalist/effective altruist hangout spots that aren't consumed with AI-related discussion?
Sorry, I should have specified - this is the AI issue of Asterisk. Last issue was the food issue. It changes each time.
Oh that’s right, now I remember your post about the food issue. Please disregard my comment above.
"We should, as a civilization, operate under the assumption that transformative AI will arrive 10-40 years from now, with a wide range for error in either direction."
I feel like it's hard not to read this closing statement as a slightly more nuanced and somewhat less efficient way to say "we don't know." I don't mean that flippantly; I think there's a lot of value to thinking through what we do know and where our real confidence level should be. However, what really is the practical difference between 10-40 years with significant error bars and we just don't really know?
I think there's a certain quality to saying "we don't know" that people reflexively interpret as saying "no".
For example: "We don't know if within our lifetime aliens will enslave us to defend themselves against another alien species."
Compared to 1960s us "We don't know if within our lifetime modern civilization will end in nuclear flames."
Both are/were true statements. 10-40 years with significant error bars communicates something closer to the latter than the former. Whereas if you replace it with "We don't know when transformative AI will arrive" it's closer to aliens than I think the author intended. (insert History channel ancient aliens meme)
> We don't know if within our lifetime aliens will enslave us to defend themselves against another alien species."
That was already settled in the 1950s with Invasion of the Body Snatchers.
Except for the self-defense part. That would’ve been a very post modern twist.
They only enslaved us to save themselves.
It's (probably) the plot of both Colony (recent tv show) and similar to Earth: Final Conflict (90s tv show)
There's a difference between "we don't know" spoken with a careless shrug and "we don't know" spoken with a concern-furrowed brow.
I don't think many would have predicted a 10-40 timeline in 1980. We're still uncertain, but there's a new kid in town (the transformer) that may have changed the game.
Popular audiences in the 80s apparently had no trouble accepting sentient AI by 2019 (Blade Runner) or 1997 (Terminator) as a plausible story. Now the tone here seems to be that the general public is too dismissive of the possibility of imminent AGI, and I think that's accurate: the comments on ACX make me want to shake my head at the hysterical fear, while people in real life telling me AI risk is something they've "just never thought about" make me want to scream in hysterical fear myself.
The people I really still don't get are the people who are like "within 5 years AI will destroy us all", but work making interfaces for General Mills or something.
Why the fuck aren't you taking immediate drastic and illegal action to stop it then? If you really actually believe that, it would seem that would be the urgent mandate.
I can't grok the selfishness of someone who truly believes "the world will end in 5 years without intervention, but I am just going to keep attending my 9-5 and playing videogames".
This is not an uncommon sentiment but it doesn't really make sense. There isn't, so far as I'm aware, any known drastic illegal action you could currently take that would improve the situation in expectation. Drastic and illegal actions also have massive downsides including impairing humanity's ability to co-ordinate on any current or future interventions that likely have a better chance of success than some panic driven murder spree.
I just think this is pretty silly. People/corporations are skittish. A concerted campaign of terrorism could absolutely work at least in the short term.
And if you are really thinking there is a 95% chance of humanity being destroyed, the extreme short term is pretty much all you should care about.
Come on really? Think about it. Would a campaign of terrorism actually convince you to take the threat of AI seriously? Or would you just chalk it up as yet more evidence in favor of doomers being a bunch of crazy whackos. Except now they would also be extremely dangerous terrorists to be put down with extreme prejudice.
FWIW, my wild guess is that AGI is more likely than not to be an extinction event for humanity (probably a slower one than EY expects), but I'm on the sidelines. Amongst other things, I'm a 64 year old retiree, and everyone that I personally care about is 60 or older. I find AI interesting, but I don't lose sleep over it.
I don't think it's going to end in 5 yrs., but I am seriously concerned about ASI wrecking the world in various ways. I have in fact suggested some actions to take which, if not illegal, are certainly sleazy. Nobody shows any interest or suggests other actions they think would be more effective.
It's a hard problem. Monitoring large training runs and any other large AI projects might buy some time, but ultimately we either have to solve the control problem for superhumanly capable agents or just hope that whatever we build happens to be a big fan of human flourishing.
Yes, I think it's unstoppable. There are too many people deeply invested in it, too many possible benefits, and too much money to be made.
Another suggestion I often make is that the people working on alignment cast a bigger net in the search for options. I'm sure there are many versions of "alignment" on the table already, but I still get the feeling, as a person who does not work in tech but who reads about it, that there isn't a sufficiently broad assortment of solutions being considered. All seem like some variant of "implant a rule that the AI can't break and can't remove from its innards." It all sounds like a school principal coming up with a very limited selection of ideas for how to reduce fights when classes change: Install cameras and give the bad actors a medium-sized punishment. Have teacher monitors instead of cameras, and give the bad actors big punishments. Install cameras and give the good kids small rewards. Have teacher monitors and give the good kids medium-size rewards and make the bad kids write apologies . . .
The world is full of models of weaker and dumber things that are protected from stronger and smarter things: The young of most mammals are protected from harm by their parents by wiring that's way deeper than a rule, and also by the parent's knowledge that their kids are the bits of them they hope will live on after the parent's death. Hippos tolerate little birds on their back because the birds remove the hippo equivalent of fleas. Butterflies disguise themselves to look like a kind of butterfly that's toxic to birds; chain saws have deadman switches.
It's not that I think I can supply the clever model for keeping us safe from ASI that the AI blokes haven't thought of -- it's that I think it's likely that crowd-sourcing the problem could. Just as some people are superforecasters, some people are extraordinarily good at thinking around corners. I'm a psychologist, and I can tell you that you can actually test for the ability to come up with novel solutions. I understand why people recoil from my sleaze proposal, and I am very unenthusiastic about it myself. But I can't get anyone to take the crowdsource it idea seriously either, and it has virtually no downside. Have some contests, for god's sake! Or test people first, and use those who test high on inventiveness as your crowd.
And I don't think it's necessary for people to understand the tech to understand the problem well enough to brainstorm approaches.
What do you think about this idea?
No, this really doesn't make sense. It's a perfectly reasonable position to believe both that "within 5 years AI will destroy us all" and "there's nothing I personally can do to meaningfully reduce the risk of this". Sure, if *enough* people took action we could prevent it, and this would clearly be a net win for people as a whole, but obviously I only care about myself, not humanity as a whole, and ruining the next 5 years of my life to reduce to risk of extermination by 0.0001% is a terrible trade off.
You sound like a defeatist. Don’t have kids?
Sorry, to be clear, I don't personally actually think AI is nearly certain to kill us all in 5 years (I'm fairly pessimistic by most people's standards, I think, but definitely nowhere near that pessimistic). I'm just trying to explain why holding that view doesn't mandate taking extreme actions.
I've written on ACX before about timelines that happen to match life milestones of the people making the prediction. 20-30 years often takes the predictor to retirement or a little beyond. It's like admitting that they themselves will not complete the research, but that they feel it's pretty close otherwise. Then in 10 years the same 20-30 years might get trotted out as if no progress was made at all. Fusion has done this for a long time, and even though we get hype cycles and more specific information about what's being done to make fusion a reality, that prediction horizon never really seems to move. I'm old enough to have seen 25 year predictions on fusion 25 years ago.
Five year prediction cycles are a lot more meaningful, but they also (should) put the predictor's reputation on the line. Saying something specific will happen in a small number of years where the expectation that the predictor and the person hearing it will both be around to see the results is a much stronger claim. I'm seeing a lot more people in the long range camp giving themselves an out than the short term camp. If we don't have transformative AGI in the next five, maybe ten, years, then EY should take a reputational hit. If everyone else backs off of such a short timeframe for fear of the same, then I take it that AGI is much further off than most people are willing to admit.
>Five year prediction cycles are a lot more meaningful, but they also (should) put the predictor's reputation on the line. Saying something specific will happen in a small number of years where the expectation that the predictor and the person hearing it will both be around to see the results is a much stronger claim.
*Cough* apostles *cough*
Good point. Even with short prediction cycles, very few predictors actually get a reputation hit from a wrong prediction. The nuclear weapons prediction article in the issue, https://asteriskmag.com/issues/03/how-long-until-armageddon , was fascinating (and, in terms of assessing the value of predictions by even the best-informed people, depressing). As the author, Michael Gordin, points out:
"The Americans were concerned with only one adversary and the development of a single technology which they already knew all the details of. American predictions of Soviet proliferation offer a highly constrained case study for those who would undertake technological forecasting today."
> I don’t think they emphasized enough the claim that the natural trajectory of growth is a hyperbola reaching infinity in the 2020s, we’ve only deviated from that natural curve since ~1960 or so, and that we’re just debating whether AI will restore the natural curve rather than whether it will do some bizarre unprecedented thing that we should have a high prior against.
Yeah, that's not how fitting curves to data works.
Did you read https://slatestarcodex.com/2019/04/22/1960-the-year-the-singularity-was-cancelled/ ? What are your thoughts?
I read it back when you posted it. Just skimmed again to refresh. I agree with the possibility (in the abstract) that creating a new kind of worker (AI) that reproduces using different resources than humans do can change the trajectory of history. That's the central claim of that post, correct?
But at the same time, I am very very skeptical of any kind of "fitting curves to data to predict the future". The first step needs to be to have a complete and coherent model that captures all the major dynamics properly.
(As an example, consider covid. I don't believe that covid grows exponentially because someone fitted a curve. I believe it because we know how viruses behave. Under mild assumptions (even population mixing), our models make perfect physical sense. The curve fitting part is only to find out empirical things like the R_0 of a specific pathogen.)
Anyways, I just don't see it with the hyperbolic growth model presented there. It abstracts away too much, modeling "progress" as if it was a smooth mathematical quantity. Even if this is a good model for some parts of history (low tech level), it doesn't follow that it holds for all of history. I am especially skeptical *because these models don't have a coherent mechanistic explanation for what they're modelling*.
(As another aside, I think economics and game theory have allowed us to understand human systems a lot better. They're almost like laws of nature, e.g. we can confidently say that just giving money to people doesn't actually make us richer, and will just cause price inflation (maybe that's a bad example, I hope you get what I mean). But these "laws" have only been tested on small timescales, closed systems, and so on. We are extremely far away from a "universal theory of economics". At best, we are at the point equivalent to the early stages of classical Newtonian physics.)
The model actually has a very clear meaning. Economic growth is a function of population and innovation, innovation is a function of population (and, in some versions, the amount of innovation so far), and population growth is a function of economic growth.
There is definitely something unphysical that happens near that singularity - I think it only goes there if we assume that people can contribute to innovation within epsilon time of being born, and innovation can contribute to wealth within epsilon time, and wealth can contribute to population growth within epsilon time.
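A toy version of that feedback loop (one arbitrary parameterization, not the fits from the post) shows where the finite-time blow-up comes from: because output feeds back into both population and innovation, the growth rate itself keeps growing.

```python
# Toy version of the population/innovation/wealth feedback model described
# above (arbitrary parameters, purely illustrative). Output depends on
# population and technology, technology grows with population, and population
# grows with output -- the feedback produces faster-than-exponential
# (hyperbolic-style) growth that blows up in finite time unless something
# caps one of the loops.

def simulate(years: float, dt: float = 0.01):
    pop, tech = 1.0, 1.0
    t = 0.0
    while t < years:
        output = pop * tech
        tech += dt * 0.02 * pop * tech       # more people -> more ideas
        pop += dt * 0.02 * output            # more output -> more people
        t += dt
        if output > 1e12:
            return t, output                  # effectively a singularity
    return t, pop * tech

for horizon in (25, 50, 100, 200):
    t, out = simulate(horizon)
    print(f"stopped at t={t:6.1f}, output={out:.3g}")
# Output stays modest for a long while, then explodes shortly after t ~ 50
# regardless of how far out the horizon is pushed.
```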
I liked the article, but I thought the trend line was a function of the interval chosen rather than being part of a "natural" path.
There are 2 outlier data points at the beginning of history, low quality of course.
Then a period until AD 1000 with no clear trend. (Though that's also low quality.)
Then the actual trend identified over a period that includes European industrialization (starting in 1500, not just the "industrial revolution"), with some noise in the periods just before and just after.
Then a flatline since 1960.
Data that shows the industrial revolution was a very important change in human development gets reframed as the singularity should have happened in 2020.
Alternate hypothesis.
The data for the far past is basically unreliable. I wouldn't put much faith in anything prior to 1500.
And the data for the recent past up through 1960 is basically capturing how incredibly amazing and useful using fossil fuels to drive your economy is. By 1960 the tapering of pop growth and the full saturation of fossil fuels into the economy leveled off growth, and it will probably stay leveled off until we make some big strides in energy technology.
Everything in the economy comes back to energy.
So there was a long period of "slowish growth", and then a period of super rapid growth from utilization of fossil fuels.
But like bitcoin or anything else, you can always slap on some exciting looking "singularity curves" onto anything which is growing quickly. Doesn't make it true.
Even though the data prior to 1500 is unreliable, it's clear that the rate of change has accelerated over the last 12,000 years, and over the last 100,000 years too. Overwhelming historical evidence indicates that technological progress was very slow in ancient times.
We also know that the economy couldn't have been growing at an average rate of, say, 1% per year for the last 10,000 years, because that would imply the economy grew roughly 10^43-fold over that period. We can use a similar argument against an average of 0.1% yearly growth over the last 100,000 years.
Given the acceleration in the measured data from 1500-1950, we have good evidence that the long-term trend has been super-exponential over the course of human history.
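The compounding arithmetic behind that bound is easy to check; this is just a back-of-the-envelope confirmation of the exponents above, not a new claim.

```python
# Quick check of the compounding arithmetic above: sustained "modest" growth
# rates are impossible over very long horizons because they compound to
# absurd totals.

import math

def orders_of_magnitude(rate: float, years: int) -> float:
    """Orders of magnitude of total growth at a constant annual rate."""
    return years * math.log10(1 + rate)

print(f"1% per year for 10,000 years:    10^{orders_of_magnitude(0.01, 10_000):.0f}-fold")
print(f"0.1% per year for 100,000 years: 10^{orders_of_magnitude(0.001, 100_000):.0f}-fold")
# Both come out around 10^43 -- vastly more growth than actually occurred.
```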
I think what isn’t clear is if there were other little mini ramp ups which were aborted.
Yeah the long term trend has been “super exponential”, but that isn’t really much evidence the actual relationship is exponential. Almost anything where there has been a ton of growth at some point will look like it *could* be exponential. But that doesn’t mean it is.
Take a child's skill at many things, for example. My 9 year old was terrible at long division for his entire life, until he mastered it in weeks. In a few more weeks will he be the world's best long divider? No.
On a related note, there are former skills which have atrophied or pretty much vanished. For example, no one alive today could manually make a flint axe head half as well as an expert in neolithic times, and nobody could coordinate a fleet of square rigged ships at sea with just flag signals nor probably, come to that, with radios!
> the natural trajectory of growth is a hyperbola reaching infinity in the 2030s, we’ve only deviated from that natural curve since ~1960 or so
This is a very oddly stated claim. Most apparent exponential curves are actually S-curves that haven't yet hit the "S." My null hypothesis is that this is another example of such, not that there's some "natural" trajectory from which we've deviated. Why would one believe the latter?
The claim isn't that it's apparently exponential, the claim is that it's apparently hyperbolic.
I think that matters because yes, once it got too high, it would reach some limit and level out. But before it did that, it would be going shockingly fast/high by the standards of everything that happened before.
If you're driving/accelerating along the following trajectory:
1:00 PM: You're going 1 mph
2:00 PM: You're going 2 mph
3:00 PM: You're going 3.1 mph
4:00 PM: You're going at the speed of light
5:00 PM: You're still going at the speed of light
...then technically this follows an S-curve, but that doesn't mean something very exciting didn't happen around 3:30.
So, in terms of this metaphor, sometime around 1960, we got to close to the speed of light?
Sorry, I don't think that's a helpful analytic framework. I think something far more mundane happened: in the Johnson administration, and increasingly thereafter, elites in Western society began to broadly and strongly prioritize social goals over technological ones. As a result, the space program was mostly killed, deployment of nuclear fission power was stopped dead in its tracks, research into nuclear fusion and better nuclear fission was defunded, etc. But no mysterious analog of hitting the speed of light.
No, we obviously haven't got close to the speed of light yet. Sometime soon we might, e.g. thanks to AI. (According to this metaphor)
No, we were on a hyperbolic trajectory until the 1960s that would have looked boring until about the 2020s and then given us a singularity that shot us to the speed of light. In the 1960s we deviated from that trajectory in a way that was hard to notice at the time but suggests it's no longer hyperbolic and won't shoot up in the 2020s, though that could change. Again, see the link at https://slatestarcodex.com/2019/04/22/1960-the-year-the-singularity-was-cancelled/
OK, given that clarification, I'm not sure we're so far apart. I just have more concrete ideas about what caused the deviation.
You should read the models. They say that population growth rate depends on wealth, that wealth growth rate depends on population and on innovation, that innovation growth rate depends on population and how much innovation has already happened. There are some natural places to put a factor that would turn, for instance, population growth into an s where it starts to hit up against physical limits - though the point of the wealth term is in part to model getting through those limits, so you’d want to think carefully about that.
Assuming the Angry Birds challenge is serious, what is it about that game that makes it so much more difficult for AIs to master? Surely if AIs can beat humans at Go and chess, not to mention master dozens of classic Atari games in a matter of hours, Angry Birds shouldn't be that much more difficult??
Which article is that one in? Or if it isn't an article do you have a link?
Off the cuff I'd assume it's just an anthropic issue. Chess and Go are popular enough to devote a lot of time and compute. Atari could either be because it is very simple or that was what fates willed someone to try and solve first (and note it isn't literally all Atari games, just dozens). It doesn't have to be that Angry Birds is particularly AI-hard, but that the big guns haven't been brought to bear on it yet.
Should have read your comment first. Yeah I think you're spot on.
To answer your question about which article: It was Scott’s article. Perhaps the key quote about that particular issue: “The most trollish outcome — and the outcome toward which we are currently heading — is that those vast, semidivine artifact-minds still won’t be able to beat us at Angry Birds.”
Not an expert, but while we associate board games like Chess and Go with 'high intelligence', conceptually they're quite simple - the rules are not too complicated and they're very discrete: you have something like 100 moves in every position and you can exactly list them out and analyze the resulting position.
Angry Birds is a lot more complicated and 'fuzzy' - it requires something a lot more like a working knowledge of physics - the input is simple (a bird, an angle, a power), but the result isn't trivially predictable in the way it is with chess or Go, so the techniques we've used for ages just don't work.
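One way to see the gap: chess's legal actions are few, discrete, and enumerable, while an Angry Birds-style shot is a point in a continuous space whose outcome you only learn by running the physics. A rough sketch follows (python-chess is a real library; `simulate_shot` is a made-up stand-in for a physics engine).

```python
# Discrete, enumerable actions (chess) vs. a continuous action space whose
# outcomes require physics simulation (an Angry Birds-style shot).
# python-chess is a real library; simulate_shot() is a made-up stand-in.

import random
import chess

# Chess: every legal action can simply be listed and searched.
board = chess.Board()
print(f"legal moves in the starting position: {board.legal_moves.count()}")  # 20

# Angry Birds-style shot: the "action" is a point in a continuous space
# (angle, power); there is nothing to enumerate, so you sample or optimize,
# and every candidate has to be evaluated by simulating the physics.
def simulate_shot(angle_deg: float, power: float) -> float:
    """Placeholder for a physics rollout returning a score."""
    return -abs(angle_deg - 42.0) - abs(power - 0.8)   # pretend 42deg/0.8 is ideal

best = max(
    ((random.uniform(0, 90), random.uniform(0, 1)) for _ in range(10_000)),
    key=lambda shot: simulate_shot(*shot),
)
print(f"best sampled shot: angle={best[0]:.1f} deg, power={best[1]:.2f}")
```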
I don’t think it’s impossible, but it is far enough from what existing systems do that you’d have to design and train a system specifically for playing Angry Birds. And it’s not surprising that nobody has tried yet.
LLMs have a surprising ability to figure out freeform problems like this, but they’re not a good fit for two reasons: they don’t handle images, so they can’t just “read” the screen and figure out what’s going on, and they have short context windows so they can’t do longer interactions that are necessary to play a whole game. Multi-modal models will help with the first issue, but not the second.
Eventually when modern AI is applied to robotics, it’ll have to handle long-running interactions with live video input, and I expect that models like that will also be able to operate any smartphone app fluently.
Yes, I was wondering the same thing in the last open thread. The game is turn by turn, there are very few parameters to play with (choice of bird, angle and intensity of throw, maybe time of throw?), the response is immediate and the levels short, and the main aspects of the physics of the game shouldn't be hard to grasp. Chess and Go are not necessarily that helpful comparison, but surely it should be easier than Starcraft 2 or hard platform games? Has anyone looked into the achievements of this contest: http://aibirds.org/angry-birds-ai-competition.html ?
It's definitely lack of interest in beating it. It was Deepmind creating AIs that beat games, and they were tackling Go, Starcraft, etc. The same tech _definitely_ would beat Angry Birds.
But now they are solving protein folding, and won't drop that for a stupid game.
I'd agree, it is clearly a lack of incentive. Does anyone even play Angry Birds anymore? It is out of the Zeitgeist and isn't in Vogue. That said...Scott may continue to be right for the wrong reasons...that Angry Birds will remain an unexplored frontier.
Now what would be more impressive is if we applied AI to beating humans at the more complex board games such as Settlers of Catan.
Go and chess are not very hard games at all. They are played at a very high level of expertise and very very well studied. But they are both SUPER simple in the context of "games".
Go and chess are not visual games. You can play them with numbers alone, and there are a lot of engines that will output the board state for the AI to pore over. Angry Birds (as well as unmodified StarCraft and other unmodified games) requires the AI first to solve the vision problem - i.e., learn what is presented on the game's screen - and THEN learn how to play it. Same as the long-solved Atari games, yes, but modern games are much more colorful and dynamic, and screen scrolling adds another complication. Then again, even if you treat a Go or Chess board as visual input, it's a very simple visual input with easily distinguishable features. A game with cartoonish graphics, animations, and effects that appear from time to time to obscure parts of the image? That's hard.
If you can hook inside the game and pull out relevant data, it simplifies the problem a lot - but maybe too much: after all, a human with a calculator can probably plot a perfect trajectory if he knows the exact coordinates of the shooter, target, and obstacles, the speed and acceleration of the bird, gravity, and other important properties.
I've been doing some ML research on a game I'm working on at my workplace, and I went this route - I give the AI a pre-digested representation of the game world, not a screen capture, because my puny GeForce 1060 cannot even begin to handle networks that could solve both the vision and the mechanics tasks. But even that is not simple, because our game involves quite a lot of RNG, and different weapons and abilities for both the player and enemies.
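On the "human with a calculator" point above: with exact state in hand, aiming reduces to the standard projectile formula. A minimal sketch with made-up numbers, ignoring drag and whatever liberties the game's engine takes with physics:

```python
# If you can pull exact state out of the game, aiming reduces to textbook
# projectile motion (ignoring drag and any game-specific physics tweaks).
# The target position and launch speed below are made up for illustration.

import math

def launch_angles(x: float, y: float, v: float, g: float = 9.81):
    """Launch angles (radians) that hit target (x, y) from the origin at speed v."""
    disc = v**4 - g * (g * x**2 + 2 * y * v**2)
    if disc < 0:
        return []                       # target out of range at this speed
    root = math.sqrt(disc)
    return [math.atan2(v**2 - root, g * x),   # flat shot
            math.atan2(v**2 + root, g * x)]   # lobbed shot

for angle in launch_angles(x=40.0, y=5.0, v=25.0):
    print(f"hit the target with a launch angle of {math.degrees(angle):.1f} degrees")
```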
I asked ChatGPT how to avoid the problem of researchers always being depicted as chemical engineers wearing labcoats, then punched a summary into Dall-E. I think this is probably better representative: https://labs.openai.com/s/KJoq7h4Yfn5ukOnGmCQhQ3v8
You can probably remove “challenge biases/stereotypes” from the prompt - though that might also bring back some of the lab coats.
The original prompt was "How can I prompt an AI image generator to provide images of researchers that are not biased by the stereotype that all researchers are chemical engineers and wear lab coats all the time?"
I think ChatGPT read the word "biased" in my prompt and immediately leapt into woke mode.
Not disputing statistics, but you may consider that the demo is "researchers", not computer scientists. Anecdotally, while most programmers I know are male, I think the majority of researchers in my circles are not cismen.
"Include laptops, whiteboards, books." doesn't seem to have been honored (unless one fuzzy line is a book or laptop...)
The other options had those, but they had their own issues. One was 100% female (or at least, long-haired full-lipped) representation, one looked too much like a business meeting, and the last was this: https://labs.openai.com/s/ptsMITDysdqYBbMWksYG0413
Ah! I see! So the example I followed was from a number of attempts. Very reasonable. Many Thanks!
All technological progress follows an S-curve, and an S-curve always looks exponential until you hit the inflection point...
See my response to Stephen Pimental above.
Indeed, things can get pretty interesting regardless of where the inflection point lies. (I don't know if S-curves can look locally hyperbolic instead of exponential.)
I mean, it’s easy to draw an s curve that is locally hyperbolic if you are generous in what counts as an s curve. But if it is specifically logistic growth, that specifically looks exponential.
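One way to check that numerically: for logistic growth the instantaneous growth rate is bounded by r, so early on it looks exponential rather than hyperbolic, whereas genuinely hyperbolic growth has a growth rate that itself rises without bound. A small sketch with arbitrary parameters:

```python
# Logistic growth looks exponential early on (its instantaneous growth rate
# is bounded by r), while hyperbolic growth has a growth rate that itself
# rises without bound as it approaches the critical time. Parameters are
# arbitrary, chosen only to show the qualitative difference.

r, K = 0.5, 1000.0            # logistic: dx/dt = r*x*(1 - x/K)
t_c = 10.0                    # hyperbolic: x(t) = 1/(t_c - t)

def logistic_rate(x: float) -> float:
    return r * (1 - x / K)            # d(ln x)/dt, always <= r

def hyperbolic_rate(t: float) -> float:
    return 1.0 / (t_c - t)            # d(ln x)/dt, diverges as t -> t_c

print("logistic growth rate, x from small to near K:")
for x in (1, 10, 100, 900):
    print(f"  x={x:4d}: {logistic_rate(x):.3f} per unit time (capped at r={r})")

print("hyperbolic growth rate as t approaches the critical time:")
for t in (0.0, 5.0, 9.0, 9.9):
    print(f"  t={t:4}: {hyperbolic_rate(t):.1f} per unit time (unbounded)")
```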
We could ask the AI what effect she thinks her existence will have on us mortals.
China's behind on some things in AI and ahead on others. These advantages and disadvantages are actually probably fairly structural, to the point where I'd feel semi-comfortable making predictions about who will lead where (or at least what will be close and where one side will be far ahead). For example, LLMs are handicapped by conditions in China and probably will be indefinitely. On the other hand, facial recognition and weapons guidance have structural advantages, since the huge security state serves as an eager market and data provider. And China tends to play closer to the chest with everything, but especially military technology.
I'm also highly suspicious of anyone writing about Chinese AI who doesn't have a tech background or who doesn't have access to things like AliCloud. Chinese reporters and lawyers and propagandists are usually about as technically literate as their American counterparts. Which is to say not very. You really need to scrub in to understand it.
Not to mention a lot of the bleeding edge is in dense papers in Chinese. Plus the Chinese produce a gigantic quantity of AI papers which means simply sorting through everything being done is extremely difficult. You need to BOTH understand Chinese and computer science language to even have a hope of understanding them. But Ding seems focused on the consumer and legal side of it.
Also fwiw: I tend to think AI will lead to lots of growth but not some post-scarcity utopia type scenario. It will speed growth, be deflationary as per usual, etc. And I think that we're actually going to see a move away from gigantic all encompassing models towards more minutely trained ones. For example, an AI model based on the entire internet might be less useful than an AI model trained on transcripts of all of your customer service calls and your employee handbook. Even if you wanted to make a "human-like" AI ingesting the entire internet is probably not the way to do it. Though the question of whether we can keep getting smaller/better chips (and on to quantum) is certainly open and important to computing overall.
And using chips unilaterally as a stranglehold is a bad idea. We only somewhat succeeded with Russia, and only then because the other important players went along with it. Any number of nations, even individually, could have handicapped the effort. They just chose not to. And keep in mind China is in the "chose not to" camp. This is without getting into rogue actors.
All in all I enjoyed the magazine although I did not agree with much of it.
A claim that's popular in certain circles is that Chinese language academic literature is a thoroughly corrupt cargo cult paper mill, the main purpose of which is to manufacture credentials. What real research there is can be judged by its appearance in Western publications.
I'm surprised that the interview didn't mention hardware embargoes at all. At first everybody seemed to treat them like a Big Deal, whereas by now it seems more like an overhyped non-issue.
This is a claim I’m definitely interested in. I’d be very cautious about truly believing it, the same way I’m cautious about believing the claims people make about certain academic fields being this sort of thing (whether it’s string theory or gender studies).
This is both true and not true. The way Chinese research works is effectively a centrally planned version of publish or perish. "X University, you are responsible for producing X AI papers to meet the goal of Y AI insights annually." This means 90% of papers on any subject are not bad but aren't worth a lot. They contain real research but it's clearly done to meet a paper quota. I've seen papers I am like 80% sure were specifically written to be done quickly.
On the other hand, China has scale and there are strong incentives for getting it right. If 90% of papers are bad the remaining 10% still represents significant output. And China is quite willing to throw money, prestige, posts, awards, etc at people who produce highly cited research or research that gets picked up by the military or industry. They're also fine with them taking innovation into the private sector and getting rich. The paper mill cargo culter probably has an okay post at a regional university. But if you want to get rich or the top post at a big university you need to make some actually impressive contributions.
The problem with this inefficient "at all costs" system with the gaps filled in by profit seeking companies (who in turn mostly stick to the cities) is that it creates profound regional inequality. The "bad at diffusion" the article talks about. But it's not apparent to me this would matter in AI. It's not like there's artisanal AI models being produced in Kansas anyway. And just those cities represent a larger population than the US at this point.
The chip embargo is a real thing and has caused issues in China. But it hasn't stopped them, just slowed them. In part because it's not nearly total and in part because of China domestic industry.
>But if you want to get rich or the top post at a big university you need to make some actually impressive contributions.
Well, the claim isn't that these don't happen at all, just that they also get published in the West, and this is how they get judged in China as well, because domestic peer review standards aren't high enough. Or are you saying that there is actual cutting edge research there that the West isn't up to speed on?
Yes, there are. And while the west is still more prestigious than internal in China that gets less true every year and has taken an especially sharp turn lately. Getting foreign citations might be prestigious but the real way you move up is impressing your (also Chinese speaking) superiors. Not to mention that unlike (say) India the vast majority of Chinese computer science professors speak almost no English. Even many of the very good ones. Meaning there's a real barrier to publication abroad.
If you look at citation networks you see clusters around language groups. But you also see that Chinese, Japanese, and Korean have strong research output but are much less connected internationally than the other clusters (which are all European languages).
But then how do those top Chinese computer scientists stay abreast of the cutting edge, which is undoubtedly still mainly Western-centered? Surely being cut off from it by the language barrier would imply that they are constantly at risk of reinventing the wheel, at the very least.
They do have translators, of course. The government/CCP subsidizes all kinds of translations into Chinese (and some out of it). Including specific focus on research papers in areas they're interested in. In many cases Chinese researchers can get relevant western papers for free in about a day after they're published.
But those languages do in fact develop unique features that you'd expect to see in a somewhat isolated environment. (In Japan it's called Galapagos Syndrome.) And there is some degree of reinventing the wheel. The infamous example here being a Chinese mathematician who reinvented Newton's way of doing calculus integration and named it after herself.
Re "The Transistor Cliff": We are out of compute, now we must think.
People would normally cite the AI survey as "Grace et al" because it was a collaboration with Katja Grace as the first author. (I was one of the authors).
In the Asterisk article you say, "As for Katja, her one-person AI forecasting project grew into an eight-person team, with its monthly dinners becoming a nexus of the Bay Area AI scene." But the second author on the 2016 paper (John Salvatier) was a researcher at AI impacts, and the paper had five authors in total.
Sorry, fixed (the reference here).
AI Impacts has had many boom-bust cycles and I believe it was approximately one person for some of the period the paper was being written (I lived with Katja at the time and gave her a small amount of writing advice, though I could be forgetting some things).
The article about people underestimating the arrival of the Soviet atomic bomb almost completely leaves out the fact that the Manhattan Project was as porous as Swiss cheese to Soviet spies. Stalin had full blueprints for the bomb before Truman even knew what an atomic bomb was. The Germans failed after going down the wrong research alley (whether Heisenberg did it on purpose or not is perhaps an open question, but the fact remains that they didn't focus on the right path). It is a LOT easier to build an atomic bomb when you have the full manual, plus a list of dead ends to avoid, courtesy of Klaus Fuchs, Theodore Hall, David Greenglass, etc. Unless this article is suggesting that China has spies in OpenAI, I think the analogy needs more work. (And if he is claiming that, that's a major claim and he should be more specific about why he's claiming it.)
Also, can someone please explain to me the ending of the chatbot story? I liked it up until the end and then I was just confused. Ads on fAIth???? What's that mean?
I'm fairly sure China has spies in openai (or has at least hacked their information systems).
Hahaha. About the ending to the story: I recommend people read the story https://asteriskmag.com/issues/03/emotional-intelligence-amplification first. But what's going on is... hidden in ROT13 text:
gur tvey sevraq jub Novtnvy jnf punggvat gb, Snvgu, jnf npghnyyl na NV, cebonoyl eha ol gur fnzr crbcyr nf Plenab, naq Elna cnvq gur pbzcnal gb chg va n tbbq jbeq sbe uvz jura Novtnvy jnf punggvat jvgu sNVgu. Elna cebonoyl qvqa'g npghnyyl unir nal vqrn gung Novtnvy jnf punggvat jvgu Snvgu naq ur pregnvayl qvqa'g xabj gung Novtnvy gubhtug Snvgu jnf n erny crefba abg na NV.
Bu bbcf. V gubhtug nyy nybat gung snvgu jnf na NV naq V zvffrq gung Novtnvy gubhtug fur jnf n erny crefba. Abj vg znxrf zber frafr. Gunax lbh!
The lack of accounting for Russian spies was clearly outlined in the article. It was something the forecasters at the time missed in their public statements or private ones which have since become public. But the author of the article addressed this point and found it bewildering as well and he wonders what would have happened if they'd taken this into account. Perhaps there is some correct estimate in some hidden intelligence archive which did take the spying into account. The main problem was from the stovepiping by the Americans where they didn't know what was going on themselves, and hence couldn't seem to notice or imagine anyone else would.
I suppose they bought into their own nonsense and thought their efforts to keep the Manhattan Project secret had succeeded...the evidence to the contrary only became clear in 1949 when the Russians succeeded, or even later, when the extent of the spying came to light. I'd argue the rapid rise in popularity of James Bond and spy films/stories in the West was a huge reaction to that critical failure.
>Today’s Marginal Revolution links included a claim that a new Chinese model beats GPT-4; I’m very skeptical and waiting to hear more.
I, too, am a bit skeptical.
The first question I'd ask is "beats GPT-4 at what?" Competitive figure skating? Tying cherry stems in knots with its tongue?
https://www.reddit.com/r/LocalLLaMA/comments/14iszrf/a_new_opensource_language_model_claims_to_have/
Apparently it's C-Eval, "a comprehensive Chinese evaluation suite for foundation models."
https://cevalbenchmark.com/static/leaderboard.html
Why didn't they compare it to GPT-4 on MMLU (a famous benchmark that people actually care about)? A mystery of the Orient. It seems GPT-4 performs better on the hard version of C-Eval; it's only on the easier one that ChatGLM scores higher. So take from that what you will.
I like how the guy starts out by assuring people the result's not fake. Kind of says it all.
Another bit of AI news. Someone may have revealed GPT-4's architecture.
"GPT-4: 8 x 220B experts trained with different data/task distributions and 16-iter inference."
https://twitter.com/soumithchintala/status/1671267150101721090
I'd put it down as a rumor (like the 100 trillion parameter thing) but a couple of people on Twitter are corroborating it.
Thoughts, anyone?
The single mention of Ukraine in the non-proliferation article is
> Belarus, Kazakhstan, and Ukraine traded away the Soviet weapons stranded on their soil in return for economic and security assistance.
Was it written before --2022-- 2014?
While Ukraine is relevant to many things, why do you think it's relevant to non-proliferation?
Specifically because one of the big reasons Ukraine gave up Soviet nukes was security guarantees. Now everyone has seen how that goes (some may add "if you're relying on Russia to keep its word", but I think some reassessment of risks is due even if Russia is nowhere near. People also bring up Gaddafi for similar reasons, although the Libyan nuclear program wasn't anywhere near success when it was abandoned, so it probably wouldn't have mattered several years later.)
No, GPT-4 did not just make a gig worker solve a captcha
https://aiguide.substack.com/p/did-gpt-4-hire-and-then-lie-to-a
A lot of hints and suggestions from the prompter were needed, as well as a fake (simplified) website, and a human to actually engage with that website.
This is super important. Thanks!
About the two surveys, I wonder whether the explosion of the field plays a role. "Expert" was defined as someone who has published at NeurIPS or ICML, probably in that year.
But for the bigger one, NeurIPS, in 2015 there were ~400 papers, in 2016 it was ~600, in 2021 it was ~2400 and in 2022 there were ~2700 papers.
I am not sure that this implies less expertise for each expert. It could be, because in 2016 there was a much larger fraction of experts with years or decades of experience in AI research, while the typical expert nowadays has 2-3 years of experience. But perhaps long-term experience on AI is useless, so who knows?
But it definitely implies that experts can no longer have an overview of what's going on in the conference. In 2015/16 experts could still know the essential developments in all of NeurIPS. Nowadays they are way more specialized and only know the stuff in their specific area of expertise within NeurIPS. So perhaps it's fair that 2022 experts don't know the state-of-the-art in all of AI, while 2016 experts still knew that?
Here is a chart for how NeurIPS has grown over the years:
https://towardsdatascience.com/neurips-conference-historical-data-analysis-e45f7641d232
I'm a bit angry about the StarCraft thing. Scott says that AI has never beaten top human players. But the reason it did not is that it was not allowed to.
The AI that played StarCraft was seriously hobbled: it wasn't allowed to view the whole terrain at once by moving its camera very fast, and it wasn't allowed to micromanage each and every unit by selecting them individually. In general, its APM (actions per minute) rate was limited to something comparable to a human's.
That's like inviting a computer to a long-division competition, but only giving it a 100 Hz CPU.
The whole point of having AI instead of meat brains is that AI works on very fast CPUs with very large RAM. If you're gonna forgo that, you won't reap the benefits of having a silicon brain.
Well, AI still has advantages, like perfect mouse precision for instance. I'd say that if we are comparing _intelligence_ instead of mechanical proficiency, then an attempt to introduce those limitations makes sense. Of course, the underlying problem is that high-level Starcraft is not particularly intelligence-heavy in the first place.
That article about AI taking tech jobs is really frustrating. He predicts by 2025, 55% of programmers will be using LLMs "for the majority of their professional programming work". He bases this on "the 1.2 million people who signed up for Github Copilot during its technical preview" (divided by a bit to restrict to U.S. programmers), but utterly elides the distinction between "signed up to try the new thing" and "have adopted the technology as a key part of programming from now on".
What percentage of those 1.2 million even got around to trying it at all after signing up? (I'd guess less than 50%.) What percentage of the remainder got frustrated and didn't put in the effort to learn how to work it? (I'd guess more than 50%.) What percentage of the remainder were able to find places to use it in a significant proportion of their work? (I'd guess less than 50%.) There's a huge, huge gulf between "is a tool someone's tried once" and "is an invaluable part of _the majority of_ someone's work".
A couple months ago, my workplace rolled out an internal LLM-based autocomplete similar to Copilot, but the suggestions are only occasionally useful and it's not even clear to me that it's a net positive.
But more importantly, **even if LLMs could magically write all the code I wanted to perfectly**, that *still* wouldn't be "the majority of their professional programming work" because writing code is only a small fraction of what programmers do.
Amen to that last paragraph. Writing the code is generally the easiest part of my job. Deciding what to write, how to align it with the existing architecture and hidden constraints[1] of the system, figuring out why it worked (didn't throw an error) but didn't actually work (do what it was supposed to do), how to validate what it did and what the side effects of doing it this way are, etc.
[1] Working in a codebase that grew "organically" and, worse, was subject to several top-down "cool new things" pushes means that there are a lot of undocumented, spooky action-at-a-distance constraints. Where jiggling that lever means the bridge half-way around the continent falls down. But only if the moon is full and the price of bulk oolong tea on the Beijing futures market is above $X/kg.
You aren't even covering the meetings where you go back and forth with various departments 82 times about what you are even trying to do!
Thankfully I work for a small company that doesn't have enough departments for that. But we have plenty of meetings as it is, to be sure. And product folks who can't make up their mind (or maybe can't remember what they had previously decided). So I feel you.
I once worked on a project that dragged on for 4 years and progress at meeting 2 was about 17% and progress at meeting 40 about 18%…
Thank you for taking the time to read my article, and I take your point. In the process of writing the article, I had to make some difficult decisions about what to include due to space constraints. Unfortunately, this meant that some of the nuances around the adoption of coding assistants, including the points you've raised, were not explored in as much detail as they otherwise could have been. If you'd like, I can do a deep dive into this section on my blog. Otherwise, I'll be happy to answer questions offline and share the original paragraphs surrounding this section that capture a bit more detail.
Gosh, interesting. Thanks for the reply. So you say there is numerical detail and the headline result is still really that you think 2025 is the best guess for when 55% of programmers will be using LLMs "for the majority of their professional programming work"?? I find that... very unlikely. Big chunks of the software industry are swift to adopt new tools, especially in the web space, but there are other big chunks which aren't using any tools newer than 5 or 10 or 15 years old. Some of our customers are still refusing to update from a 10-year-old version of our software! And "the majority of their professional programming work" feels like a huge step from even "acknowledge as a valuable tool and use at least once a month".
Thank you for your response, and I appreciate that you're actively thinking about the underlying assumptions behind my process and not just accepting it at face value. The 55% estimate for 2025 carries a healthy degree of uncertainty since I didn't have much data to go on. It's a good point that a significant number of developers work in environments where the adoption of new tools like LLMs may be slower. Are you using them yourself?
As mentioned in the article, about 1.2 million people had at least signed up for Github Copilot during its technical preview and I estimated about 300K (25%) were in the United States based on Stack Overflow surveys. Of those, I expected only 150K would use it seriously for business purposes (based on conversations with developers I know) which when divided by the estimated developer population of the United States at that time (~4.2M) gave the ~3.5% figure for 2022 from the article.
Mapping this to an idealized adoption S-curve would suggest that we're likely around 10% adoption today (which also lined up with an informal survey I conducted), but if you think the initial estimates are suspect, then that could significantly change the trajectory.
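For what it's worth, here is a back-of-the-envelope reproduction of those numbers - my own sketch, not the author's actual model, and the logistic fit at the end rests purely on the ~3.5% (2022) and ~10% (2023) anchor points mentioned above, with an assumed saturation at 100%:

    import math

    signups = 1_200_000        # Copilot technical-preview signups cited above
    us_share = 0.25            # fraction in the US, per the Stack Overflow surveys
    serious_share = 0.5        # fraction expected to use it seriously for work
    us_devs = 4_200_000        # estimated US developer population

    serious_us_users = signups * us_share * serious_share
    adoption_2022 = serious_us_users / us_devs
    print(f"2022 adoption estimate: {adoption_2022:.1%}")      # ~3.6%, matching the article's ~3.5%

    # Fit an idealized logistic adoption curve through ~3.5% (2022) and ~10% (2023),
    # then ask when it crosses 55%.
    def logistic(t, k, t_mid):
        return 1 / (1 + math.exp(-k * (t - t_mid)))

    p1, t1 = 0.035, 2022
    p2, t2 = 0.10, 2023
    k = (math.log(p2 / (1 - p2)) - math.log(p1 / (1 - p1))) / (t2 - t1)
    t_mid = t1 - math.log(p1 / (1 - p1)) / k
    t_55 = t_mid + math.log(0.55 / 0.45) / k
    print(f"growth rate k = {k:.2f}/yr, 55% crossing at ~{t_55:.1f}")
    print(f"sanity check: logistic({t_55:.1f}) = {logistic(t_55, k, t_mid):.0%}")

Under those (very debatable) assumptions the 55% crossing does land around 2025, which at least shows the internal arithmetic is consistent even if the anchor points themselves are the weak link.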
I appreciate your feedback and it's given me something to think about. I'm curious to hear your thoughts on the current level of LLM adoption among developers and how you see this evolving as time progresses. Thank you again for your comments.
It's been a while, but I recently saw something you might find interesting: in a recent StackOverflow poll, 44% of software engineers said that they use AI tools as part of their development processes now and 26% said they plan to soon: https://stackoverflow.blog/2023/06/14/hype-or-not-developers-have-something-to-say-about-ai/
In your article, you argue that experts are well-calibrated because they had as many underestimates as overestimates. But this doesn't account for sampling bias - overestimates are more likely to be in the sample (since they've already happened). That is, things predicted for 2026 that happened 4 years early are in green, but 2026 predictions that only happen in 2030 aren't in red in your table.
Adjusting for this, experts probably do have a bias towards optimism (though small sample size so hard to be sure).
Since this bias becomes bigger the more far-out predictions you have, it also explains why the more recent survey looks much more skewed.
Also, re the argument that "fusion is always 30 years away" - there are consistent models for that (e.g. if you think it'll come 10 years after a visible breakthrough but aren't sure when that breakthrough will happen, modeling the breakthrough as a Poisson process will give you ~30-year timelines until they suddenly jump down to 10). AGI isn't quite as single-breakthrough-dependent, but I can see the case for a similar model being consistent (where it mostly remains X years out, then jumps down whenever something like GPT comes out).
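A toy illustration of that memorylessness point, with made-up numbers (a 20-year expected wait for the breakthrough plus a 10-year build-out is what gets you "always 30 years away"):

    import random

    random.seed(1)
    MEAN_WAIT, DEPLOY_LAG = 20.0, 10.0   # assumed: ~20-year expected wait for the breakthrough, 10-year build-out

    def forecast_at(year, breakthrough_year):
        """Expected years-to-arrival as estimated in `year`, given when the breakthrough actually lands."""
        if year < breakthrough_year:
            # Memorylessness: however long we've already waited, the expected
            # remaining wait for the breakthrough is still MEAN_WAIT.
            return MEAN_WAIT + DEPLOY_LAG
        return max(0.0, breakthrough_year + DEPLOY_LAG - year)

    breakthrough = random.expovariate(1 / MEAN_WAIT)   # first event of a Poisson process = exponential wait
    print(f"(in this sampled history the breakthrough lands at year {breakthrough:.0f})")
    for year in range(0, 61, 10):
        print(f"year {year:2d}: forecast = {forecast_at(year, breakthrough):.0f} years away")
    # The forecast reads "30 years away" every year until the breakthrough happens,
    # then simply counts down the remaining build-out.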
Fusion is limited by cost-effectiveness, not breakthroughs. If we were actually close to running out of fossil fuels like it had been fashionable to predict once (but for some reason still insane about ordinary nuclear plants), then all the technical problems would've been solved within 5 years.
One crux I note that I have with "rationalists"/AI doomers, that I see here on display:
Most of the items that Scott resolves positively on the AI predictions, I would resolve negatively, with the further conclusion that AI predictors are way too "optimistic". A major example would be: "See a video, then construct a 3d model". We're nowhere close to this. And sure, you can come up with some sort of technicality by which we can do this, but I counter that that sort of reasoning could be used to resolve the prediction positively on a date before it was even made.
I find this further makes me skeptical generally of prediction markets, metaculus, etc. The best "predictors" seem to be good not at reading important details regarding the future, but at understanding the resolution criteria, and in particular, understanding the biases of the people on whose judgement the prediction is resolved.
As much as it's a good concept that disputes between models can be resolved via predictions, when different models produce different presents - and both models accurately predict the respective present environments that the people using them find themselves in - the entire system of making predictions doesn't work.
"See a video, then construct a 3d model" is crude but existing tech right now. You can get an app on your phone to do it for individual objects, and google maps photogrammetry is also essentially doing this.
Note that you have to "See a video, then construct a 3d model" with AI. Turning an image into a 3d model is something we were able to do in the 1980s. The model probably won't be accurate or efficient, and it certainly won't have good topology, but...
There's a motte bailey problem with these predictions. Each of these capabilities sounds and reads as impressive and powerful technology, but the technology that then resolves them positive tends to be useless and uninspiring. I want to see actual professionals doing important work with these tools before I would resolve this class of prediction positively.
I continue to be annoyed at people identifying the phrase "Moore's Law" exclusively with Moore's 1965 paper on the optimal number of transistors per IC. When the phrase was coined in 1975, many people, including Moore, had pointed to other things also improving exponentially, and it was coined in the context of the conference at which Dennard presented his scaling laws tying all the different exponential curves together. From 1975 to 2005, when Dennard scaling broke down, it was always used as an all-encompassing term covering transistor size decreases, power use decreases, and transistor delay decreases. It's only when the last of these went away that people started arguing over what it really meant, and some people substituted the easier question of "What was the first exponential Moore talked about?" for "What does Moore's Law really mean?" /rant
The bit about AI alignment in "Crash Testing GPT-4" was miles better than most of the AI alignment stuff I've read. Practical, realistic, fascinating. That's real adult science!
Agreed!
Hmm... I was interested in the "So the kind of tests that you’re doing here seem like they could involve a lot of training or prompting models to display power-seeking behaviors, which makes sense because that’s what you’re trying to find out if they’re good at. But it does involve making the models more dangerous. Which is fine if it’s GPT-4, but you can imagine this increasing the threat of some actual dangerous outcomes with scarier models in the future." exchange... Shades of gain-of-function work...
In my limited testing and subjective evaluation, when translating from Russian to English, ChatGPT 3.5 beats Google Translate by a wide margin and translates about as well as a fluent but non-expert human. I have a degree in translation between this pair of languages and it beats me at translating songs. I think the chart pegging it at 2023 is in fact spot on.
The rank stupidity of questioning whether animals think, have theory of mind, etc., gives good reason to mistrust experts of all stripes, but particularly those who think about intelligence, consciousness, etc. (Even in the early 90s, Time magazine had a cover story asking something like "Can animals think?") Combine this with claims about qualia, etc. - denying these to other things plays a similar role in misguided philosophical arguments about thinking, denying the obvious fact that it arises out of the physical structure of the brain, chemical interactions, etc.: an emergent property without any magic or special sauce involved. Which suggests machines "feel" something, even if something we might not recognize, and are "thinking" even if we just want to frame it as "mechanistic" or "without the magic spark." We're playing the same game with AI now. "OK, fine, crows use tools, but they still don't really 'think' because . . ." "OK, fine, GPT seems to 'understand' X pretty well and respond appropriately, and can fool most fools most of the time, but still can't . . ."
Sarah Constantin is an ace at topic selection and she goes deep enough into it to just satisfy me too, although I would have liked two areas in particular to be explored more: memristors and neuromorphic computing. If there is to be a breakthrough to get us past the Moore's Law limit that is approaching around 2030 then I think these two technologies will contribute.
The economist and research scientist, in their debate, seem to have forgotten the basics of GDP and *economic* growth (distinguished from other "growth" such as the invention of collectibles, or changes in price of existing goods, that is not economic).
Statements about GDP and economic growth are statements about household income and expenditure.
Yes, there are government spending, business capital formation, and net exports, but to a first approximation, GDP is what households get, and what they spend. *Economic* growth is an increase in household income and expenditure--an increase in the material welfare of households.
Again to a first approximation, households get their income from jobs. The debaters, while talking at length about automating jobs, have not bothered to look at *where the jobs are.*
A quick search of the Bureau of Labor Statistics's web site will tell you that the high-employment occupations are in health and personal care aiding, retail sales, customer service, fast food, first-line supervision of retail, office and logistics staff, general and operational management, and physically moving things around - warehouse staff and shelf re-stockers, and order fulfilment. And nursing.
(Just beyond the top 10 we have heavy truck driving, the paradigmatic automatable job, that will be gone in 2016... no, wait, 2018... er, 2021 for sure... well maybe 2025 ... umm, ah, now we've tried to do it, it's a little tricky: perhaps 2040?)
If you want to alter the economy fast, you have to augment the work of people in those high-employment occupations, so the value produced per employee becomes greater. That way, you can pay them more. You're not going to make a big impact if you only automate a few dozen niche occupations that employ a few thousand people each. That doesn't move the needle on aggregate household income and expenditure.
Augmenting work in high-employment occupations could result in either of two outcomes (or, of course, a blend; a toy numerical sketch follows the second case below):-
a) if price elasticity of demand is high, and the cost of providing the service falls, then demand explodes. Health and personal care aiding is probably like this--especially if we look at a full spectrum "home help" aide job: some combination of bathing and grooming (nail trimming and the like), medical care (wound dressing, and similar), meal preparation, cleaning (kitchen, laundry, house general) and supervision of repairs and maintenance, operating transport (for example to shops and medical appointments), and on and on.
Currently about 3 million people are employed in the US doing health-and-personal-care work. Probably this means three to five million recipients. There is a potential market of 300M--if the price is right. Who wouldn't make use of a housekeeper/babysitter? (That's a rhetorical question: "every working woman needs a wife", as the saying used to go, back in the '80s and '90s, before political correctness. Of course the existing workers in this occupation are nearly all working with elderly people who are no longer able to do some things for themselves. So "every adult needs a HAPC aide", perhaps.)
So, maybe a sixty-fold increase for an occupation that is about a sixtieth of the workforce: a doubling of household welfare, potentially. All we need is for every occupation to be like that: job done!
But...
b) if price elasticity of demand is low, then the service becomes a smaller part of the total economy. This was the case for food: home prepared and consumed food used to be about 40% of household expenditure; now, it's single-digit percentages. People ate more food and better food as its price dropped; they just didn't eat *enough* more and better to maintain food in its rightful place at the top of the budget. Heating (space-heating, water-heating) and clothing have also shrunk in percentage terms.
It seems likely that retail sales, order fulfilment, and restocking fall into this category. Yes, people will buy more stuff if the price falls, but the increase is probably less than one-to-one with the price reduction. Some of the saved money will be spent on extra tutoring for the child(ren), haircuts, spring breaks, or pilates classes instead.
The same applies to the "customer service representative" and "fast food restaurant worker" occupations.
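Here is the toy numerical sketch promised above, with entirely made-up elasticities just to make the two cases concrete:

    def total_spend(base_price, base_qty, price_cut, elasticity):
        """Constant-elasticity demand: quantity scales as (new_price / old_price) ** (-elasticity)."""
        new_price = base_price * (1 - price_cut)
        new_qty = base_qty * (new_price / base_price) ** (-elasticity)
        return new_price * new_qty

    # Start from 100 units at a price of 1.0 (total spend = 100) and halve the price.
    for label, e in [("elastic demand (home-care-like), e=2.0", 2.0),
                     ("inelastic demand (food-like), e=0.4", 0.4)]:
        spend = total_spend(base_price=1.0, base_qty=100, price_cut=0.5, elasticity=e)
        print(f"{label}: total spend goes from 100 to {spend:.0f}")

In the elastic case the halved price doubles total spending on the service (outcome a); in the inelastic case total spending falls to about two-thirds even though people consume somewhat more (outcome b).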
So automating these occupations (about 12 million employees?)--which, by the way, is easier than automating health-and-personal-care-aiding: it's already happening--will, er, "release" those employees.
Observe that it will also "release" a fraction of the "first line supervisor" and "general and operational manager" occupations, because they supervise and manage workers in these occupations so the effect would be super-linear. There will be *a lot* of people wanting to tutor kids, style hair, facilitate spring breaks, and teach pilates classes.
On the gripping hand, automation may well cause some occupations to grow hugely, that are currently very niche. Personal trainers. Personal coaches/mentors. Wardrobe consultants. Dieticians. Personal "brand"/social media image managers. Ikebana teachers.
And new occupations, or sub-occupations will appear: tutoring people how to get the best out of conversational AI interfaces, for instance.
I have no clue what will take off or to what extent, but I think the process of getting to mass market will take years. (And in the mean time, the effect of automation will be to shrink household I&E.)
That brings me to the third omission from the debate: constraints on investment. If you look over the BLS's list of high-employment occupations you will notice that most of them do not have large amounts of capital associated with them. That is likely to be because businesses employing people in those occupations can't borrow to invest very easily. They are low margin businesses.
There is some discussion of constraints in the Asterisk piece, in the form of "the time it takes for a crop to grow", or "the time it takes to acquire information", but no discussion of the basic (to an economist) constraint: access to funds. Now that interest rates are no longer near zero, this looms as a major limiting factor.
So I think that Hofstadter's Law applies: it always takes longer than you expect, even when you allow for Hofstadter's Law.
Tamay and Matt's conversation seems a bit strange and off base, missing the mark on reality. They sound like those experts in the second survey in Scott's article, giving reasonable-sounding answers about how far out certain effects will be, for events that had already occurred.
One point jumped out at me a lot. Matt's point in his second-to-last reply talked about how we 'have not seen' impacts from AI boosting the economy beyond the 2%. I'd argue the 2% we have been seeing is already coming from AI as older value-adds have faltered. They assume a static world and miss the economic skullduggery that has hidden the long-term decline of western economic growth over recent decades. What propelled us in the 1900s is no longer working.
I.
They are just dead wrong. Companies like Amazon and Nvidia have seen dramatic benefits from using AI over the past 10 years as it has scaled up and become useful. Not just LLMs, but also ML and such for Google and the social media companies as well. Truly, we may come to look back at the role of compute power in the economy overall more than at the special types of compute we call AI.
When Brexit and Trump benefited from razor-thin technical victories using Cambridge Analytica...that was a win for AI as well, in the zero-sum political games which are not progress-focused - which was a great point Matt made. AI and compute-only models have already been a huge factor in the economy.
They are also wrong in fixating on AI doing things humans already do...while neglecting the things AI can do which humans can't! Such as writing personalised political ads for a million swing voters. The AGI-vs-human-tasks framing is too narrow when considering the impact of AI on the economy, since AI can already do things humans cannot which create economic value. Their fixation on job/skill/task replacement rather than grand-scale macroeconomic outcomes is just too narrow.
II.
I think this has been masked by the fact of economic decline. Something like 95% of the growth in the S&P can be attributed to just 5 or 6 companies, tech companies using AI. Most of the economy is worse than stagnating and has been for decades.
This is a hard pill to swallow due to false accounting of inflation and other manipulated economic metrics and conditions created in a decade of low interest rates decreed by central bankers, but the publicly traded economy has been hyper concentrated in FAANGT and such for a long time now with AI of various types being their leading edge for the past decade.
Amazon and Walmart and others have already greatly benefited from AI in their warehouse and supply chain systems. Tesla has greatly benefited from various forms of machine intelligence automation and much of their value is tied in speculation about how amazing their AI self driving will be and their Dojo compute system. Not to mention AI and compute based allocation and coordination of data for their charging network, which wasn't as simple as/equivalent to building a bunch of gas stations along major highways.
I think the AI revolution and the forces which will drive economic growth from it have already begun, while these two are busy talking about needlessly specific and arcane definitions of full AGI and when this or that factor will lead to a greater-than-2% economic growth trend. We are already at 20-30% economic growth or greater for AI-using companies, and they are making up for the falling value of things like big commercial banks, which are down 90% over the past 10 years.
This has already been happening and it is a wrong assumption to think the old factors of 2% growth from the 1900s are still in effect. They are no longer happening in western economies and have been dying out for a long time, hidden by financial games and zombie companies.
Re: Economic doubling times
This entire article is junk because it focuses on GDP.
GDP, particularly for FIRE (Finance, Insurance and Real Estate) focused Western economies, is utterly meaningless. The massive fees extracted from Western populations by junk credit card fees, for example, "grow GDP".
So does the enormous US overspend on health care.
Skyrocketing college costs.
The list goes on and on and on.
Graph something concrete like electricity usage, energy usage, steel consumption and you get dramatically different outcomes.
Do you? Can you cite me something on interesting energy consumption vs. GDP divergence? And how would you resolve the problem where eg new medical cures make the world better but don't consume steel?
https://policytensor.substack.com/p/is-the-us-stronger-than-china
Multiple examples where GDP was neither indicative of greater production nor better outcomes.
As for "medical cures": while the cure may be pure information - the actual manufacture, transportation and deployment of said cure requires energy and resources. In particular - chemicals. India makes a lot of pharmaceuticals, but 80% of the inputs to these pharmaceuticals come from China and, as such, India's pharma output is 80% (or more) dependent on Chinese imports.
The offshoring of manufacturing hasn't been just big ticket items like cars, but all of the basic and intermediate inputs from which almost everything is made. Another example: https://www.aljazeera.com/news/2023/6/2/us-seeking-tnt-in-japan-for-ukraine-artillery-shells-report
Note that explosives are a direct product of energy - in particular, natural gas --> Haber-Bosch --> Ostwald --> nitric acid. TNT = trinitrotoluene = the oil component toluene nitrated three times. The US importing TNT from Japan - which in turn has zero natural gas or oil - is pretty damn sad.
That post doesn't seem to be saying that. It discusses quibbles between GDP and GDP-PPP, then says that in war, what matters isn't raw GDP but GDP-devoted-to-weapons-manufacturing, which seems like a trivial insight, much as a country's position in the space race is probably less predicted by GDP than by GDP-devoted-to-space-programs.
Electricity probably isn't a perfect predictor of war-making ability either; Russia has 10x the electricity production capacity as Ukraine, but they seem evenly matched.
The post explicitly shows that the USSR had a far lower GDP ratio than National Socialist Germany yet produced 50% more weapons overall. This can hardly be called a quibble since tangible production always trumps nonsense measures of economic capability, and serves as a direct refutation that GDP represents an objective measure of actual industrial or economic capability. Note furthermore that nobody believes the USSR was a more advanced industrial economy in 1939 vs. said National Socialist Germany...
The post then goes on to look at China's manufacturing - both in terms of connectivity as well as pure "dollars" - as compared to the US and Europe; the result is China having as much manufacturing capability as the US and Europe combined.
And finally, Ukraine. Clearly you have not been keeping up: Russia is firing 10x the artillery shells that Ukraine - even with direct Western inflows supplementing - has been, something which even Western mainstream news outlets (WaPo, NYT, etc.) acknowledge.
Somehow this translates to an "even match".
So while I absolutely agree that any one measure like electricity consumption is not itself a better proxy than GDP - there are plenty of proxies which are. The manufacturing capacity one is probably the closest.
Note that while I don't agree with those who claim the Western GDP ascendancy is nearly entirely fictitious due to the enormous aforementioned FIRE disproportion, Dr. Michael Hudson is entirely correct when noting that credit card fees, outsize college tuition and overpriced medical care do not in any way convey productivity in a national sense, whereas they very much do inflate GDP.
I can cite any number of other examples of Western manufacturing hollowing out - Apple's attempt to onshore assembly of iPhones into the US failed because it could not source the tiny screws used, LOL; Ronzoni had to discontinue a popular pasta - both referenced here: https://www.thebulwark.com/the-economic-secret-hidden-in-a-tiny-discontinued-pasta/