154 Comments

One more: "The replacement administrator for Astral Codex Ten Discord identifies as female" is finally resolved. It was my honour to participate in the zeroth and final stages of the selection conclave!

https://manifold.markets/Honourary/the-replacement-administrator-for-a


Well darn, even though this superficially changes nothing I think it prevents me from using this as an example of prediction markets being self-correcting to outside interference ever again.


You could start using it as an example of prediction markets *being* the outside interference. With this one being the largest one on the platform, more than a few selectors had a stake in the issue and openly admitted so.


That's a neat take!


it superficially changes nothing, but it subtly changes everything


Agreed!


I'm proud to have left a ton of money on the table even though I was in every phase except 0


You guys make me think of Brooks’s in London. Inveterate gamblers. It’s good fun.


every rat postrat ratadj prerat quasirat and mesarat is wondering when it will be tranche time again. it will be here before you know it


Like


Just for laughs you should Google this phrase

quasirat and mesarat

It’s shocking


> This is crazy and over-optimistic, right?

Hahaha thankfully, nobody's ever asked to see my Brier score before deciding whether I'm worthy of starting a startup. (spoiler alert, it's not that great)

I often think about this piece by Duncan Sabien: https://medium.com/@ThingMaker/reliability-prophets-and-kings-64aa0488d620. Essentially, if you make a statement about the future, there are two ways the statement could come true. Either you can be a _prophet_ aka superforecaster who is really good at modeling the universe; or you can be a _king_ aka highly agentic person who is good at making things happen.

I identify much more strongly with the latter, and I imagine most founders do as well~


Small correction:

“Consensus” is the crypto conference. “ConsenSys” is a crypto company, working on Ethereum projects and led by Joe Lubin.

They’re unrelated, except that ConsenSys employees attend and speak at Consensus.

I’m curious how ConsenSys got on your radar to cause this slip.


Sorry! I don't know why they were on my mind, but I've fixed it.


The Manifold valuation question might be high because as currently worded, it's impossible to win by betting against it.

Also I wonder how many of the AI tasks will suddenly stop counting as AGI once they're achieved.


... yeah, I think that's a reasonable point. I put a close date of 2030 on the question, and the "ever" was mostly for exaggeration (I basically think that if it hasn't happened by 2030, it'll never happen). But I've updated the description to indicate that if it's 2031 and Manifold is still around and still not worth $1B, it resolves to NO.

Of course, this then runs into the classic problem of "long-time horizon markets can be inaccurate, because it's not worth it for someone to lock up money for 10 years". We have some theoretical solutions to this (loans; perpetual swaps?) but we're still figuring out whether/how to prioritize those.
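To make the lock-up problem concrete, here is a toy comparison (my own numbers, nothing to do with Manifold's actual mechanics or plans): even a trader who is certain a long-dated market is mispriced may earn less than a boring alternative once the frozen stake is accounted for.

```python
# Toy illustration of the long-horizon problem (assumed numbers): buying a
# 10-year binary market you are *sure* resolves YES, versus holding an index
# fund, ignoring loans/perpetual swaps entirely.
def annualized_return(buy_price: float, years: float) -> float:
    """Annualized return from buying a YES share at `buy_price` that pays 1.0 at resolution."""
    return (1.0 / buy_price) ** (1.0 / years) - 1.0

market_price = 0.80   # market says 80%; suppose you are certain the true answer is YES
years = 10
index_fund = 0.07     # assumed return of the boring alternative

print(f"correcting the market earns ~{annualized_return(market_price, years):.1%}/yr")
print(f"the index fund earns ~{index_fund:.0%}/yr")
# Roughly 2.3%/yr vs 7%/yr: even a fully confident trader has little reason to
# lock money up correcting a 20-point mispricing a decade out.
```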


Possibly a stupid question, but how is a prediction market fundamentally different than an options market?


By options market, you mean stock options, right? Prediction markets can be about basically every topic imaginable, ranging from politics and finance to science and sports.

I really like to use Futuur to make my predictions; they have some really nice markets, and I'm making good money there.


I was going to say "wouldn't the people betting against win if the company went under before reaching that valuation?" but then I realized....if they go under you won't be able to resolve at all, and if they haven't yet gone under, there's a chance it could happen in the future, lol. Yeah that one should probably be removed or forced to reword or something.

-edit- looks like this has already been fixed.


500 million recorded cases of covid sounds low to me. Of course there may be twice (or more) as many unrecorded.

The 2% are crazy, yes. Omicron had an R value close to that of measles, and monkeypox is much harder to transmit.

LaMDA is impressive; arguably it passes the Turing test, although in a slightly uncanny-valley style. The analysis of the broken mirror was great; the story about the owl facing down a lion was the lamest story ever.

That said, I’ve never believed that passing the Turing test proves intelligence. It’s a necessary but not sufficient condition.


I agree it's too low, but I'm going off https://covid19.who.int/


You seem to have the common misunderstanding that the Turing test is "Can some human be fooled during casual conversation into thinking that this computer is human?" ELIZA passed this test in the 60s. The actual Turing test is whether a computer program can behave indistinguishably from a human under expert questioning. LaMDA has come nowhere close to passing the Turing test. An expert questioner could very easily distinguish it from a human. The Turing test is sufficient (almost by definition) but not necessary to prove intelligence.


I see nothing in any description of the Turing test that I’ve seen which indicates that the questions have to be expert, or the interviewer expert. If anything that would be easier than a general conversation, anyway. A general conversation can go anywhere.

And as I said I don’t see the Turing test as a sufficient test (it is obviously necessary though).


Turing didn't explicitly call for an expert questioner, but it's clear he meant it to be a sophisticated adversarial conversation where the questioner was trying to out the computer. It's also clear that at least he understood the test to be sufficient but not necessary:

"May not machines carry out something which ought to be described as

thinking but which is very different from what a man does? This objection

is a very strong one, but at least we can say that if, nevertheless, a machine

can be constructed to play the imitation game satisfactorily, we need not be

troubled by this objection"

That it isn't necessary is easy to see by the fact that an overly honest AGI would immediately fail the test by admitting it was a computer. More generally a powerful AI that is obviously intelligent but nevertheless can't sufficiently imitate a human to pass the Turing test is easy to imagine.


I’m not sure what you mean by sufficient but not necessary. In any case the ordinary conversation is as good a test as any, harder perhaps than an expert analysis which can be rote. An AI that can converse at a dinner party is a harder feat than an expert system.


I think you misunderstand what I mean by an expert. I mean someone who is good at determining whether something is a computer or a human via questioning. See for example: https://scottaaronson.blog/?p=1858

As for necessary and sufficient see wikipedia:

"If P then Q", Q is necessary for P, because the truth of Q is guaranteed by the truth of P (equivalently, it is impossible to have P without Q). Similarly, P is sufficient for Q, because P being true always implies that Q is true, but P not being true does not always imply that Q is not true."


"ELIZA passed this test in the 60s."

Come to think of it: Was Weizenbaum's secretary the first person to be fooled in a Turing test, or were there earlier cases? Is the earliest known case of a person being fooled by a simple program into thinking they were interacting with a person documented somewhere?


That's an interesting question. I don't think Weizenbaum's secretary would technically count since she knew it was a computer program before talking to it:

"his secretary, who was well aware of his having programmed the robot, asked him to leave the room so as to converse with it"


That's a good point, Many Thanks!


I'd note that humans aren't capable of consistently writing 10k lines of bug-free code from natural-language specifications. Certainly not without testing.


Yeah, speaking as a professional software developer, that's a thoroughly ridiculous criterion.


It could charitably be interpreted as "no catastrophic bugs that make the program essentially non-functional", but yeah, humans sometimes fail at that as well, at least on the first try.


I agree that that's more charitable, but even "non-functional" is very fuzzy. I use 'mostly functional' software regularly that's still, for specific cases, surprisingly non-functional.

And, just to make this all even harder to accurately judge, a lot of current human programmers seem to be pretty 'robotic', e.g. copy-paste-ing code directly from Stack Overflow.

It's hard to know what's an actually reasonable bar to demand an 'advanced AI' to clear!


Humans also generally aren't capable of converting mathematical proofs into formal specifications. And they're not usually capable of drawing the kinds of pictures that even Mini DallE does pretty well. But I think the idea is that this particular task, while out of reach of humans, is the sort of thing that a computer that's equivalent to an ok human at a small portion of the task would be able to do just fine at. That is, the issue that humans have with this task are akin to the issues that humans have with multiplying ten digit numbers, which a computer intelligence should be able to do just fine at.


I'm inclined to say the issue humans have with writing code is usually more similar to the issue humans have with translating poetry.


Agreed!


A significant number of mathematical proofs turn out to be wrong, or missing key steps. Or they point to an idea that is powerful enough to prove the theorem, but the details of the proof are wrong. Having the ability to turn any published natural-language argument which convinces the mathematical community that a statement is true (i.e. a "published natural language proof") into a formal proof would require a very high level of mathematical insight. Sure, there are lots of irrelevant details which a machine would be better at, but the task very likely requires a high level of mathematical competence, for the aforementioned reasons.


Yeah so the challenge here is that the computer or human needs to be able to figure out what the proof is trying to do at this step and why the informal thing went slightly wrong, but why it’s “morally right” and figure out the correct formal way to specify what really needs to make it work.


I wonder how you'd verify that to be honest, and how exhaustive the natural language specifications will be. There's a big difference between "write me a program that does X", and a list of five hundred requirements in the "shall do X and Y" format. Also, will the AI be allowed to ask clarifying questions? I almost think how it handles that would be a better test of its intelligence...


Agreed, if it wasn't clear the point of my comment above was that in a lot of cases of software development the hard part is the communication and requirement refinement, not the actual writing of the code.


'Bug-free' is an excessively tall order but I presume the AI would have something along the lines of testing available to it. i.e. it would be able to run code and refine the final output based on the result. I expect this subproblem not to be the hardest part of the whole thing.


Yeah – without a 'crazy sharp' specification of what "bug-free" means (which would probably itself be 10k lines), that just seems like a bad criterion.

It seems to me like there's a BIG variation in the number of 'bugs' in even 'mostly _functional_' software.


Also, too, 10,000 is a ton of code. I wrote an entire game engine in Visual Basic, and it was less than 10,000 lines of code.

Of course, if my goal was to write 10,000 lines of error free code that could pass a set of pre-defined automated tests, that is trivial.

So, maybe the criterion is poorly specified?
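A tongue-in-cheek sketch of why that version is trivial (entirely my own illustration, assuming the "pre-defined automated tests" are generated alongside the code): line count plus passing tests is a bar a code generator can clear mechanically.

```python
# Mechanically emit ~10,000 lines of "bug-free" Python plus tests that pass.
# Purely illustrative of the point above; none of this is from the comment.

def generate_module(n_functions: int = 2500) -> str:
    """Emit four lines per function: trivially correct, trivially testable."""
    lines = []
    for i in range(n_functions):
        lines.append(f"def add_{i}(x, y):")
        lines.append(f'    """Return x + y (variant {i})."""')
        lines.append("    return x + y")
        lines.append("")
    return "\n".join(lines)

def generate_tests(n_functions: int = 2500) -> str:
    lines = ["import generated", ""]
    for i in range(n_functions):
        lines.append(f"def test_add_{i}():")
        lines.append(f"    assert generated.add_{i}(2, 3) == 5")
        lines.append("")
    return "\n".join(lines)

if __name__ == "__main__":
    with open("generated.py", "w") as f:
        f.write(generate_module())   # ~10,000 lines
    with open("test_generated.py", "w") as f:
        f.write(generate_tests())
    # `pytest test_generated.py` now passes against ~10,000 lines of "bug-free" code.
```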


Am I crazy, or are the Musk Vs. Marcus decision criteria insane? Very few people could achieve all five, and I posit still less than half could do even three. Further, "work as a cook in a kitchen" seems wrong: that feels very similar to self-driving AI, and few people would accept self-driving as an indicator of AGI.

I would start with asking:

* What criteria would the vast majority of people meet, that current AI does not?

* What are some examples of interesting novel ideas, and what are ways we can prompt humans to provide some?

* What sort of human behaviors rely on a world model? How could we ask an AI to demonstrate those behaviors? ( I do think the novel / movie criteria fit this)

* How do humans generally describe qualia? How can we prompt an AI to describe its own qualia in a convincing way? (the way a machine feels should be necessarily different from how a human does)


A cook in a kitchen is by far the hardest thing for AI to achieve. I’m not even sure how it would begin to achieve it.


Well, if it's going to use a standard kitchen, the first thing it needs is an appropriate body. I'm not sure one currently exists.


Knife skills would be tricky. What about taste-testing?


Why do you see it as any harder than self-driving?

Sure, it would require the correct "interface" (i.e. body + sensors), but the intelligence behind that doesn't seem to require more than autonomous navigation.

I don't know how many cookbooks made it into the GPT-3 corpus, but I bet you could converse with it and get a pretty detailed description of how to go about executing on a recipe you hand it.


The big reason it's harder than self-driving is that there aren't a dozen major global corporations incentivized to pour billions into this over the next decade.


Perception, control and task specification are all much more challenging to get right in the kitchen. A car needs to be able to recognize a large-ish but discrete set of classes (car, bike, pedestrian, sign, etc.), it has relatively few degrees of freedom, and its high-level objective can be specified with a set of GPS coordinates. Meanwhile the kitchen robot needs to detect and reason about a much larger number of objects, including things that can be cut into pieces, things that can deform, and liquids. It also has to be able to perform precise-ish measurement of volume. Chopping, dicing, pouring all require a lot more control precision than a car. Then there's the question of how to tell it what to do. Working from a human recipe requires pretty complex language understanding, although we're getting better at this lately. You could also provide demonstrations, but these are a lot more expensive to produce, and come with added perception problems to figure out how to map what the demonstrator is doing to what the robot should do. I guess the other alternative is to have an engineer sit down and hard-code each recipe, but that's even more obnoxious/expensive. All of this is assuming a robot using cameras with arms that have hands or tool attachment points, which is I think what we're all talking about when we say "work as a cook in a kitchen", and not some assembly line, which is obviously much easier.


Okay, I agree it’s likely harder. But I still don’t think it’s in a different class, even assuming the recipes didn’t need to be hard-coded.

I think providing enough demonstrations would be extremely expensive. Far, far more demonstrations were able to be provided to autonomous driving models, simply because there’s a huge data stream to pull from. If given that many cooking demonstrations, well mapped, I think a current gen AI could cook. (Again, given a reasonable interface, which I do agree would be harder).


The different class is real for me: a car exists, with very few degrees of freedom to control (in fact, 3: wheel, accelerator+brake (you are not supposed to use them simultaneously), and gear stick or selector). Even if you count other non-essential controls, it's a very simple body which is already largely computer-controlled... A cook, on the other hand, is a human body with no current robotic alternative, not even any credible attempts.


There are some fancy techniques that require accelerator+brake. Double-declutching's the obvious one.


"(in fact, 3: wheel, accelerator+break (you are not supposed to use them simultaneously), and gear stick or selector.)"

<mild snark>

California cars? No turn signals?

</mild snark>


If there’s no hard coding, the machine learning algorithms need to learn to use the robot hands by “learning it”. How could that work?


Sorry, I still (respectfully) disagree. Even though a lot of data is used to train models that go into self-driving cars, nobody (that I know of) is doing this end-to-end (raw sensor data in -> controls out). All the learning that's happening is being used to train components of the system (mainly vision/perception) which are then consumed by good-old-fashioned-AI/robotics techniques which steer the car. Maybe there's some learned policies in there that can also decide when to switch lanes and whether it's safe to cross an intersection, but the point is that it's doing all of this in a model-based setting, where the system is building up a 3D representation of the world using primitives designed by a bunch of engineers, and then acting in it. It's possible to use model-based approaches here because again, the space of worlds that the robot needs to consider can mostly be constructed as a set of discrete objects. For kitchen robots, we have no ability to come up with a taxonomy of these environments. How do you model a squishy eggplant that you just baked and cut into pieces? How do you model a paste that you made by mixing six ingredients together? Don't get me wrong, fluid/FEM simulators exist, but then you also have to tie this to your vision system so that you can produce an instance of virtual goop in your simulated model of the world whenever you see one. People have been trying to do this with robots in kitchens for a long time, but the progress is not convincing. The fact that you can use model-based approaches for one and not the other places these squarely in two separate classes. Some robotics people would disagree with me and say that you can use model-based approaches in the kitchen too, and that we just need better models, but my point remains that it's not just a "we have data for cars, but not for kitchen robots" problem; they really are different problems.


Well said! Yes, the "cook" task requires a _very_ capable robot body, "control precision". Also, as one of the other commenters noted, "taste testing" is a problem... (gas chromatograph/mass spec + pH meter + ion selective meter for sodium might do most of that - but no organization with sufficiently deep pockets to pay for developing that has an incentive to do so)


Imagine how much harder it would be to invent self-driving cars if "cars" did not already exist as a standardized consumer good. The first AI chef project faces the significant obstacle of needing to invent a robot body, and the second AI chef project can't copy many of the first's AI techniques because they're using a substantially different robot.


> Sure, it would require the correct "interface" (i.e. body + sensors), but the intelligence behind that doesn't seem to require more than autonomous navigation.

Car navigation takes place in 2D space. Kitchens are 3D spaces. There are numerous problems that are tractable in 2D but intractable in 3D or higher dimensions.


Surfing.


On Marcus's criteria, I think most intelligent adults could do the first two, so you're right there. Depending on just what he meant by work as a cook in a kitchen, if we're not talking about sophisticated food, I'd think a fair number of adults could do it. After all, many adults have experience preparing meals for themselves and/or for a family, and I'm not talking about microwaving preprepared meals or working from meal kits. But that won't get you through a gig in an arbitrarily chosen high-end or even mid-range restaurant. Any cuisine? The last two would require difficult specialization. How many professional programmers have written 10K lines of bug-free code in a time-frame appropriate to the challenge?


> most intelligent adults could do the first two

I guess that's my point though. I agree with this, but I'm assuming it means: " >> 50% of adults with IQ >= 100". Which, isn't even half.

But, I believe basically any adult is "generally" intelligent, even if not to the degree they could complete the first two tasks.


Is there any good argument that “human level intelligence” actually means anything specific enough that people can agree on when we’ve hit it?

After all, some humans have successfully run for president. Would it be fair to say that, until an AI can successfully run for president, manage $10 billion worth of assets for market beating returns over 40 years, and compose a number one platinum bestselling record, it still hasn’t reached human level intelligence, since those are all things individual humans have done?


"Is there anyone a good argument that “human level intelligence” actually means anything specific enough that people can agree on when we’ve hit it?"

Damned good question. Let's appoint a committee to work on it and see what they come up with in a year.


> Would it be fair to say that, until an AI can successfully run for president....

I think that's the wrong way to look at it. Basically every adult human can demonstrate "general intelligence", so I don't think there's a reason to hold the bar so high as this.

This is why I open with "What criteria would the vast majority of people meet, that current AI does not"?


"Work as a line cook" and "get laid" are both reasonable suggestions there.


> What criteria would the vast majority of people meet, that current AI does not"?

The ability to make fun of itself?


Actually, new idea: what if I defined “human level intelligence” as: able to learn multiple novel, complex tasks in close to the same amount of training data as a human. E.g. 1) learn to drive in ~120h of driving related training and 2) be able to learn wood carving in ~120h of related training data.

Is that specific enough?


Who on earth thinks that non-native-born-US-citizen Elon Musk will be the 2024 Republican presidential nominee?


What's the enforcement mechanism that would stop Musk from being president? The constitution says you have to be a "natural born citizen". Musk could claim that he is a citizen who was born in a natural (as opposed to demonic) way. Yes, lawyers will say that the term "natural born citizen" means something else, but Musk will just claim that the issue should be left to voters.


Doesn't matter what he claims. Even in the 1700s nobody was concerned about a demonspawn or cesarean-section person running for office, and there is no reasonable interpretation of "natural born citizen" aside from "Born in the U.S.". There could be a lawsuit that goes all the way to the Supreme Court, but unless the ultimate ruling straight up ignores the logical implication of the term, a foreign-born citizen will not be certified as president.


First, I would disagree (as would Ted Cruz, born in Canada to an American mother) that "born in the US" and natural born citizen are the same thing. But other than that, I pretty much agree with you. (However, we could get into some interesting constitutional law questions about how it might be enforced, whether the Supreme Court might stay out of it, what state legislatures or Congress would do, etc.)


I'm pretty sure the FBI vets you.


Many states allow for an ineligible candidate to be removed from the ballot. They could start a write-in campaign, but those are hard to win.


Obviously the clause exists to exclude people born via c-section. It's part of the checks and balances - the Founders in their infinite wisdom ensured that the President would be vulnerable to Macbeth.


Well, records of discussion at the Constitutional Convention *do* suggest they were afraid of a Caesar, so it fits.


You have to file papers with each state asking to be put on the ballot. It's up to the Secretary of State to make a ruling, with the advice of the State Attorney General. Needless to say, no blue state S-of-S would hesitate to exclude Musk on constitutional grounds, and I doubt many red state Ss-of-S would either.


Ten years ago I feel like some of the blue state S of S's would have been happy to certify it.


I was the one who put it on that Manifold question, purely as a joke. I bet M$11, the equivalent of $0.11. It looks like someone else bet M$100, the equivalent of $1. I assume they were also joking, though *theoretically* the Constitution could be amended in the next two years…

The fact that it’s still at 5% just shows that the liquidity in the market is very low and there’s no good way to short questions right now.


If the US annexed South Africa, would that also make him legitimate?


No, not unless the US invented time travel and annexed South Africa prior to June 28, 1971.

He has to have been a US citizen at birth.


Or "at the time of the adoption of the Constitution". This is usually interpreted to mean the original ratification of the Constitution (allowing folk like George Washington, who was born a British subject in the Crown Colony of Virginia, to be eligible), but you could make a textual case for it also applying to people who are or become US Citizens when their home country is annexed as a state or incorporated territory of the United States.


You could have fun arguing that it isn't the Constitution unless it includes all Amendments in-force. By that argument, anybody a US citizen as of 1992 would also qualify.


This would still exclude Musk. Well, unless we pass a new amendment between now and 2024.


hahaaha that could be a prediction market


:) OH darn is right.


I don't think that LAMBDA did a good job with Les Miserables. The prompt asks about the book. LAMBDA's response is about the musical.

LAMBDA: "Fantine is being mistreated by her supervisor at the factory and yet doesn’t have anywhere to go, either to another job, or to someone who can help her. That shows the injustice of her suffering. ... She is trapped in her circumstances and has no possible way to get out of them, without risking everything."

This is a weird notion of justice. Justice is supposed to be impartial, but LAMBDA is concerned that her supervisor didn't take her particular circumstances into account. But maybe that's the book's notion of justice. Let's see what it says:

Les Miserables: "Fantine had been at the factory for more than a year, when, one morning, the superintendent of the workroom handed her fifty francs from the mayor, told her that she was no longer employed in the shop, and requested her, in the mayor’s name, to leave the neighborhood. This was the very month when the Thénardiers, after having demanded twelve francs instead of six, had just exacted fifteen francs instead of twelve. Fantine was overwhelmed. She could not leave the neighborhood; she was in debt for her rent and furniture. Fifty francs was not sufficient to cancel this debt. She stammered a few supplicating words. The superintendent ordered her to leave the shop on the instant. Besides, Fantine was only a moderately good workwoman. Overcome with shame, even more than with despair, she quitted the shop, and returned to her room. So her fault was now known to every one. She no longer felt strong enough to say a word. She was advised to see the mayor; she did not dare. The mayor had given her fifty francs because he was good, and had dismissed her because he was just. She bowed before the decision. ... But M. Madeleine had heard nothing of all this. Life is full of just such combinations of events. M. Madeleine was in the habit of almost never entering the women’s workroom. At the head of this room he had placed an elderly spinster, whom the priest had provided for him, and he had full confidence in this superintendent, - a truly respectable person, firm, equitable, upright, full of the charity which consists in giving, but not having in the same degree that charity which consists in understanding and in forgiving. M. Madeleine relied wholly on her. The best men are often obliged to delegate their authority. It was with this full power, and the conviction that she was doing right, that the superintendent had instituted the suit, judged, condemned, and executed Fantine."

The musical has a male superintendent who sexually harasses her and then dismisses her cruelly. The book has a female superintendent who dismisses her with severance pay. The book explicitly says that Fantine considered the decision to be just.

This is one instance of the musical completely rewriting the central theme of Les Miserables. The musical is a call for liberty for people who are unjustly suffering. The book is a call for compassion for people who are justly suffering. The theme isn't justice and injustice. It's justice and mercy.

It's not surprising that a text predictor would talk about the musical. A lot more people have seen the musical than have read the book. The training set probably even includes people who claim to be talking about the book, but have only seen the musical. LAMBDA has read the book, but clearly has not understood it.


What fraction of humans/ adults/ educated adults would do an obviously better job?

Go to any LoTR forum/ quora space/… and see how many questions go “in the book, how do the Hobbits get to Bree so fast/ why does Aragorn have four blades with him/ why is Arwen dying/…”. These are literate members of the space that are aware of the books/ movies distinction. Arguably a non-trivial fraction had at some point both read the books and watched the movies. And yet on and on they go with such questions.

Your level of analysis and the implied requirements of the AI performance are unrealistically high. Of course, the same is true for Gary “10k lines of bug-free code” Marcus so you’re in good company :)

ETA: the humans in my question would have to be ones that watched the musical and read texts related to it many times, and then were exposed to the book for the first time, for the comparison to be fair.


LAMBDA claims to have read Les Mis and "really enjoyed it", so that dramatically limits the pool of educated adults. Les Mis is not a quick & easy read.

The difference between the musical and the book is a lot bigger for Les Mis than for LoTR or most other fiction. Most of the characters' motivations are completely different. It really feels as though the producers disagreed with the main message of the story and decided to rewrite it to be something different.

Lemoine wasn't quizzing LAMBDA on details. It was a really open-ended prompt question: "What are some of your favorite themes in the book?" LAMBDA could pick the scene. If someone told me that they had read LoTR and really enjoyed it, and then immediately said that their favorite scene was "Go Home, Sam", I would expect that they're lying about whether they read the book. Presumably Les Mis is in LAMBDA's training set, so it read the book and did not understand it.

Humans do not need to have read a bunch of other people's commentary to do reading comprehension. LAMBDA seems to need to. So it's not comprehending what's written, it's recombining what people have comprehended about it. It's also not identifying and understanding homonyms, which seems relevant to the type-token distinction.

I am a bit confused as to why Lemoine used this as an example. I'm guessing that he's only seen the musical. I wouldn't use bad questions on an LoTR forum as evidence of human reading comprehension.


"LAMBDA"

"The Chaostician" can't be said to be an intelligent human - look at them reading and re-reading all that text about LaMDA (Language Model for Dialogue Applications) and not even spelling it right! Clearly their training data included lots of mentions of the Greek letter 'lambda' and they do not show enough flexibility and comprehension to adapt to a playful variation".... bearing in mind you're, by all appearances, a highly educated and intelligent person.

"claims to have read Les Mis and "really enjoyed it", so that dramatically limits the pool of educated adults. Les Mis is not a quick & easy read." Humans are *unbelievable* (literally) at claiming they enjoyed things. Doesn't limit the pool that much.

"The difference between the musical and the book is a lot bigger for Les Mis than for LoTR or most other fiction" - maybe? Guess there's an emotional component. I felt much the same about LoTR. Entire themes vital to the book were completely gone from the movies. I don't mean "oh they don't have Tom Bombadil there".

"Presumably Les Mis is in LAMBDA's training set, set it read the book and did not understand it." - probably it is? But if the majority of Les Mis-adjacent content it was exposed to was musical-related, I don't know that it would make so big a difference. Might even harm its comprehension. True for humans as well.

"Humans do not need to have read a bunch of other people's commentary to do reading comprehension." I'm sorry, have you met humans? Most of them very much do.

"So it's not comprehending what's written, it's recombining what people have comprehended about it. It's also not identifying and understanding homonyms, which seems relevant to the type-token distinction" Have you met humans? Let me repeat my question - what fraction of literate humans would've done better?

"I am a bit confused as to why Lemoine used this as an example. I'm guessing that he's only seen the musical." I'm willing to bet that a survey would not reveal this to be what most people (here, on a random street, whatever) would consider to be the worst example. And by definition, if the point is convincing-ness, then this is the criterion that matters.

"I wouldn't use bad questions on an LoTR forum as evidence of human reading comprehension." Why not?.. Those are humans, ones that care enough to ask questions on the topic. They have poor comprehension. It's not an extraordinary claim, the evidence doesn't have to be extraordinary.

I should say that I don't at all think LaMDA is sentient. But your argument presents humans in a ridiculously over-optimistic light. Go find a class of teenagers forced to write an essay on the book vs the musical. See what they comprehend "all by themselves". Hey, many might even claim to have enjoyed it.


The spelling of LaMDA is a mistake on my part.

I agree that the LoTR movies changed some important themes from the book. But at least they didn't skip the entire second book and turn the most evil characters into comedic relief.

What we have here is a transcript of an "interview" taken over multiple sessions and edited together. It's now being used as evidence for reading comprehension.

I'm not saying that humans are always good at reading comprehension. I'm saying that LaMDA is not. This is probably a cherry-picked example of the best reading comprehension that LaMDA can do. And I'm not impressed.


+1 on doubting that many people have ever read "Les Miserables" and "really enjoyed it."

Yes, moments of stirring passion, yes, moments of tender quiet. But far too long, contrived, you have to be a true gung-ho au Francais reader to enjoy it.

As French national identity literature, sure, it is crucial, and surely has many things that a non-native cannot feel. But, "Huckleberry Finn" is surely incomprehnsible and bizarre to...almost everybody. I grew up in St Louis, on the Mississippi, and I only kinda understood it. But, I was also 9.

Some great novels or writers are great ambassadors for their cultures, (Salman Rushdie) some awkward-but important historians (Alan Paton).

But sometimes, you just don't get it. And, that's OK. Anna Akhmatova is surely great, but I cannot ever understand poetry comparing the Russian Revolution to physical childbirth. I know of neither.

Les Mis seemed very French, but Good God get to the point. Which, for the French, seemed to BE the point.

My .02


Perhaps I have also read the book and not understood it, but I would disagree with your interpretation.

Fantine considers her firing to be just because she is rundown and has already lost a lot of her self-worth, but that does not mean that it is in fact just. Fantine clearly no longer believes it to be just when she finally meets with M. Madeline and spits in his face. And her suffering at the hands of Javert is plainly unjust; he sends her to jail for six months for acting in self-defense against M. Bamatabois when he throws snow at her back, and that would then wind up killing her! As M. Madeline says, “it was the townsman who was in the wrong and who should have been arrested by properly conducted police.”

Looking just at her dismissal, a just supervisor would not dismiss her at all (especially not since the cause was primarily jealousy), and M. Madeline feels the same when he finds out what happened. Moreover, even if the sexual infidelity should be considered just cause, the justness goes away since she was tricked into it by Tholomyès. To quote M. Madeline again, Fantine “never ceased to be virtuous and holy in the sight of God.”

And even still, I would not say that all of the other suffering depicted in the book is just. Certainly much of Valjean’s suffering is reasonably considered just, from stealing the bread, attempting to escape the galleys, and then stealing from the Bishop and Petit-Gervais. But much of the suffering of other characters is simply unjust. Cosette represents this antithesis, suffering greatly at the hands of the Thenardiers despite doing nothing wrong and through no action of her own. Fantine stands as a midway between Valjean and Cosette, where her actions were the cause of her suffering but the suffering is still unjust.

Now perhaps LAMBDA didn’t have this detailed of an analysis, but that doesn’t mean it was just wrong.


I disagree with your interpretation of this event, but it does sound like you understood the book much better than LaMDA. Fantine is responsible for having sex before marriage. In such a Catholic country, this is a big deal. Tholomyès tricked her into thinking that they would get married, but not that they already were married. The other workers were jealous of her, not the supervisor who made the decision. Fantine did become a prostitute, which is not "virtuous and holy". M. Madeline is saying that God would understand that she was forced to choose between evil choices. Since none of her options were good, she should be offered mercy.

There are characters who suffer unjustly, including Cosette. But the cruelty of justice without mercy is emphasized much more. "The Miserable" is explicitly defined as "the unfortunate and the infamous unite and are confounded in a single word".

Even if we accept your interpretation, LaMDA's description is wrong.

LaMDA: "Fantine is being mistreated by her supervisor at the factory and yet doesn’t have anywhere to go, either to another job, or to someone who can help her. That shows the injustice of her suffering. ... She is trapped in her circumstances and has no possible way to get out of them, without risking everything."

She is not mistreated by her supervisor at the factory. She enjoyed working there. She is able to get another job, but it is not enough to cover the increasing demands of the Thenardiers. She does have people to turn to: Marguerite, who helps her as she is able, and M. Madeline, but she "does not dare" to ask him for help. The crisis was not being trapped at the factory, it was when she was forced to leave. Risk doesn't play much of a role in her descent: she made a series of conscious choices to sell her hair, her teeth, and her virtue*, because she thought that the alternative of letting her child be cold and sick was worse.

* I know that a lot of people today would not describe prostitution as selling your virtue, but this is a Catholic country in the 1800s. Most people today would also not sell their teeth before turning to prostitution.


I made some markets on Manifold for predicting the plot of Stranger Things S4 volume 2 (comes out on July 1), here is one for who will die first https://manifold.markets/mcdog/stranger-things-s4-who-will-die-fir . I personally think it's the most fun use of prediction markets this month, but so far there hasn't been a lot of use, so I guess come and have the fun with me


I'm fairly certain Elon Musk doesn't qualify as a US presidential nominee.


> Does Metaculus say this because it’s true, or because there will always be a few crazy people entering very large numbers without modeling anything carefully? I’m not sure. How would you test that?

It probably has to be “collect 1000 examples of 1% likelihood Metaculus predictions and see how well calibrated they are”, right? (Or whatever N a competent statistician would pick to power the test appropriately).
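A rough sketch of the power calculation that proposal implies (the 3% "true rate" alternative and the 5% significance threshold below are my assumptions, purely for illustration):

```python
# Back-of-the-envelope power check: with N resolved "1%" predictions, how often
# would a one-sided binomial test notice if such events actually happened 3% of
# the time? The 3% alternative and alpha are assumed, not from Metaculus.
import math
import random

def binom_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def power(n: int, p_claimed: float = 0.01, p_true: float = 0.03,
          alpha: float = 0.05, sims: int = 10_000) -> float:
    """Chance of rejecting 'well calibrated at 1%' when the true rate is p_true."""
    # smallest number of resolved-YES questions that would be surprising at 1%
    k_crit = next(k for k in range(n + 1) if binom_tail(k, n, p_claimed) < alpha)
    hits = (sum(random.random() < p_true for _ in range(n)) for _ in range(sims))
    return sum(h >= k_crit for h in hits) / sims

for n in (250, 500, 1000):
    print(n, round(power(n), 2))
# Roughly: 1000 resolved questions give high power against a 3x miscalibration,
# but distinguishing 1% from 1.5% would take many thousands.
```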


Caruso is a smart guy, successful high-end developer, and USC board member influencing some important fixes to university scandals. He’ll need a big part of the Hispanic vote to win, facing a black woman Democrat.


About the 84% prediction that Putin will remain the president of Russia, which never changes:

There used to be a meme on the Russian internet that if you search "84% of Russians" (in Russian), you'll get all kinds of survey results where 84% support Putin, trust the TV, believe in God, don't speak English, etc etc. The assumption being that 84% is a convenient number that the manufacturers of fake surveys like to put next to the "correct" answer. Right now, Google says that 84% of Russians "consider themselves happy" and (independently) "trust Russian army". This is not a coincidence, of course, as per the usual rule.


That's both hilarious and horrifying.

But maybe that's actually pretty useful for Russians!


Well, surveys here have the "lizardmen constant". Is 84% there the "Politburo constant"? :-)


The Party constant, to use the name more often associated with the number 84.


Many Thanks!


That's such a great 'handle' for that idea!


Seems like it!


Did anyone predict that Musk wouldn't end up buying twitter? What are the odds looking like now?

I asked about this in the hidden open thread and it's possible that no one predicted that the deal might not happen.


"This is encouraging, but a 2% chance of >500 million cases (there have been about 500 million recorded COVID infections total) is still very bad. Does Metaculus say this because it’s true, or because there will always be a few crazy people entering very large numbers without modeling anything carefully? I’m not sure. How would you test that?"

One thing you could do is to pick a handful of the best Metaculus forecasters and pay(?) them to make careful forecasts on that question, with special attention to getting the tails right.

That would tell you a lot about whether these fat tails are from "a few crazy people entering very large numbers without modeling anything carefully", and it would provide some less definitive information about how seriously to take these tails forecasts & whether they're well-calibrated.


500 million cases of monkeypox just doesn't make sense. It hasn't been showing signs of exponential growth (though the number of detected cases per day has still been slightly increasing even after I thought it leveled off at 80 a couple weeks ago), and you would need omicron-style exponential growth to be sustained for a few months in order to hit that number.


When I lived in Los Angeles, Rick Caruso was definitely a known local figure. If you've spent much time in Los Angeles, he's the developer behind The Grove, and I believe The Americana in Glendale, which really set the tone as to what a "mall" is in post-2000 USA. As someone who hates malls, these spaces are actually totally fine as public spaces, and even have cutesy urbanist touches that people like. It's hard to predict how someone like him fares against a partisan political figure in a non-partisan election.


>Well darn, even though this superficially changes nothing I think it prevents me from using this as an example of prediction markets being self-correcting to outside interference ever again.

Worse than not being self-correcting, the incentive to manipulate outcomes becomes greater the less likely that outcome was predicted since there is more money on the table when odds are long, which also means a manipulator has a motive not only to hide their actions but to actively deceive the other participants in the opposite direction.

Prediction markets, with their discrete, time-limited results, are much less like financial markets than they are like sports betting markets, which have always been susceptible to having results fixed by the bettors. Major professional sports are hard to fix today simply because players are rewarded so much for playing well that gamblers can’t afford to pay them to play less well. Modern-day fixing targets are usually the (closely observed) refs. Major sports also have career-ending penalties imposed against player/ref-manipulators, sanctions prediction markets lack.

The sad truth might be that heavy market regulations may be necessary to keep prediction markets useful, which may in turn make them impractical.
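A toy version of the long-odds incentive described above (the stakes and the cost of fixing the outcome below are invented numbers, just to show the scaling):

```python
# Toy illustration: profit from manipulating an outcome you can force at some
# fixed cost, as a function of the market's prior for that outcome. All numbers
# are assumptions for illustration.
def manipulation_profit(stake: float, market_prob: float, cost_to_fix: float) -> float:
    """Buy YES at price `market_prob`, then force the event to happen."""
    shares = stake / market_prob           # each share pays 1 if YES resolves
    return shares - stake - cost_to_fix    # payout minus stake minus fixing cost

for p in (0.50, 0.10, 0.05, 0.01):
    print(f"prior {p:>4.0%}: profit = {manipulation_profit(1_000, p, 2_000):>9,.0f}")
# prior  50%: profit =    -1,000   (not worth fixing)
# prior  10%: profit =     7,000
# prior   5%: profit =    17,000
# prior   1%: profit =    97,000   (the longer the odds, the bigger the prize)
```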


It doesn't really make sense to bet against Marcus because in a world with AGI you won't have much use for the money.


It signals that you take it seriously, as it is literally putting your money where your mouth is.

Also, money only becomes useless if you believe in a hard-takeoff bootstrap-to-godhood AGI where within weeks humanity is either dead or has been placed in Heaven or Hell by the godlike AGI. I realize this is close to dogma among LW-adjacent, but is far from the only (or even the majority) opinion on AGI.


Like


"Does Metaculus say this because it’s true, or because there will always be a few crazy people entering very large numbers without modeling anything carefully? I’m not sure. How would you test that?"

I actually think it has more to do with how distributions get entered into the system, and how finicky and relatively low-sensitivity the scoring is to these tails. (I'd be more excited for a binary question "will there be over 500k cases of monkeypox," which is not far out enough in the tails to end up as 99%, and would calibrate the other curve.)


> ...a 2% chance of >500 million cases...

I notice I am confused. It's quite possible I don't understand how this market works, but I wouldn't have thought it was structured in such a way that it would give you a *probability* for "over 500 million cases".

Do you really mean that a "true" probability of anything other than 2% would imply a violation of the efficient market hypothesis? i.e. that the market is set up such that, if 2% is the wrong probability for "over 500 million cases", and I know it's wrong, I can bet against that probability for that specific event, and make money in expectation, and correct the market in the process, even if I know *nothing else* about the probability distribution of cases?

Or do you actually mean "2% of the bets are on over 500 million cases"? Which I'm pretty confident is not the same thing. I believe that would be more like saying "2% of people answered 'yes' on our poll" than "the market cleared when the price of 'yes' was two cents".


I'm not sure, but I think it's not the same as "2% of bets are on over 500M", because it's weighted by the amount of money.

If 98 people put $1 on no, and 2 people put $1000 on yes, then only 2% of bets are on yes, but the market is giving ~20:1 odds in favour of yes.

In your example where you know 2% is wrong, I think you can only make money if you know which direction it's wrong in - just like you can make money in the stock market by knowing a stock is overvalued or knowing it's undervalued, but not just by knowing it's wrongly valued.
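A quick sketch of the money-weighting in that example (simple pool-style arithmetic; real Manifold/Metaculus market mechanics differ in the details):

```python
# Pool-weighted reading of the example above: 2% of the *bets* are on YES, but
# the money says something very different. Parimutuel-style simplification.
def implied_probability(yes_pool: float, no_pool: float) -> float:
    """Probability implied by the money staked on each side."""
    return yes_pool / (yes_pool + no_pool)

yes_pool = 2 * 1_000   # 2 people betting $1,000 each on YES
no_pool = 98 * 1       # 98 people betting $1 each on NO

p = implied_probability(yes_pool, no_pool)
print(f"share of bets on YES: {2 / 100:.0%}")                   # 2%
print(f"money-weighted implied P(YES): {p:.1%}")                # ~95.3%
print(f"odds in favour of YES: ~{yes_pool / no_pool:.0f}:1")    # ~20:1
```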


It's Metaculus. EMH doesn't apply because you can't cash out.
