296 Comments
Comment deleted (Jan 25, 2023, edited)

Looks like Sam grabbed the probabilities as of February 14, 2022.

Thinking about this now, I'm worried that the dataset for entrants doesn't record the date each entry was submitted. That could make a big difference if most human submissions were weeks before the Feb 14 hard cutoff.

https://www.lesswrong.com/posts/rT8AkEcBnfX8ZdSLs/2022-acx-predictions-market-prices

Comment deleted (Jan 24, 2023)

You've mostly answered your own question, I think - individually assigning a probability to an event is a statement of personal belief in the likelihood of the outcome. If that makes sense to you and your concern is on the resolution, then there are several reasons that might be convincing to you, which I'm pulling from Tetlock's research:

- For good forecasters, Brier scores improved with the specificity of the prediction. Rounding their forecasts to the nearest 10%, or even to the nearest 1%, caused a loss in accuracy.

- "I'm confident that the outcome is more likely than not" can communicate a wide range of certainty to different people, and the same can be said of most words people use to convey confidence. Using numbers increases precision.

Most people using percent chances for outcomes aren't making a metaphysical statement on the nature of reality, just using precise language to convey confidence levels. A personal statement of probability on nontrivial outcomes is generally a combination of aleatory and epistemic uncertainty.

I think this post answers this well - "Probability is in the Mind": https://www.lesswrong.com/s/p3TndjYbdYaiWwm9x/p/f6ZLxEWaankRZ2Crv

I think this entry gives a good overview of the philosophy behind it: https://plato.stanford.edu/entries/epistemology-bayesian/.

I think this article is actually more useful on this specific topic, since it draws the distinction between the interpretations rather than just talking about one:

https://plato.stanford.edu/entries/probability-interpret/

Agreed, read Kenny's article instead :)

Also, a very simple way to think about it is that if someone is an ideal forecaster, among the events that they put 80% probability on, 8 out of 10 of them happen (and similarly for all other probabilities).
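For anyone who wants to check this against their own records, here is a minimal Python sketch (my own illustration, not something from this thread) that buckets forecasts by the stated probability and compares each bucket's empirical hit rate to the number that was claimed:

```python
# Minimal calibration check: group forecasts by stated probability and
# compare each group's empirical hit rate to the probability claimed.
from collections import defaultdict

def calibration_table(forecasts, outcomes):
    buckets = defaultdict(list)
    for p, hit in zip(forecasts, outcomes):
        buckets[p].append(hit)
    # An ideal forecaster has hit_rate ~= p in every bucket.
    return {p: sum(hits) / len(hits) for p, hits in sorted(buckets.items())}

# Toy data: ten events all given 80%, of which 8 actually happened.
print(calibration_table([0.8] * 10, [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]))
# -> {0.8: 0.8}, i.e. perfectly calibrated at the 80% level
```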

That is only calibration (in weather-forecasting terms, "reliability"), which is different from accuracy/resolution.

Could you explain? What CounterBlunder said makes sense to me.

They're super related, though, right? In my head, Brier score (or log loss score or whatever) is basically a combination of calibration + making stronger predictions (like closer to 0 or 1) while staying calibrated

I was thinking of decompositions; see https://en.wikipedia.org/wiki/Brier_score . Also, I shouldn't have used the word "accuracy", as "resolution" is related to the strength and variance of the predictions relative to the overall "uncertainty" of the observations. Correct, the components are related because they must sum to the Brier score and are bounded. But resolution is more a variance-like term than mere "strength" of predictions; in my eyes it is noisier than something that simply increases with stronger predictions.

Let's imagine a small prediction contest of 5 questions that resolve [1, 1, 0, 0, 0]. This set of questions has uncertainty 2/5 x (1 - 2/5) = 0.24. Enter Alice and Bob.

Alice predicts f1 = [0.6, 0.6, 0.2, 0.2, 0.2]; her Brier score is 0.088. Bob predicts f2 = [0.9, 0.8, 0.1, 0.1, 0.1]; his Brier score is 0.016. Alice and Bob have the same resolution, 0.24, but Bob's reliability (mean squared calibration), 0.016, is better than Alice's, 0.088. Because resolution is perfect in this arbitrarily small example, each one's reliability equals their Brier score. Alice's underconfident predictions nonetheless result in exactly the same resolution as Bob's.

Charlie has secret inside information on questions 1, 4 and 5 and is strategically ignorant on the rest. His predictions are [0.99, 0.5, 0.5, 0.01, 0.01]; Brier score 0.10006 (edit: not practically the same as Alice's, as I first wrote). His resolution is worse (0.14), but his calibration is near perfect (0.00006), so he can make two 50% predictions and still have a good BS.

(edit: I made a decimal mistake above. However, consider Cecilia, [0.99, 0.99, 0.5, 0.01, 0.01]: BS = 0.05008, which is better than Alice's.)

Mallory knows the answers but just wants to watch the world burn. He submits the prediction [2/5, 2/5, 2/5, 2/5, 2/5]. Its Brier score is equal to the uncertainty, its calibration is a perfect zero, and its resolution is also zero, the worst possible. (This is still not the worst possible prediction: Mallory's second entry, [0, 0, 1, 1, 1], has a Brier score and reliability of 1, but resolution 0.24.)
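For anyone who wants to verify these numbers, here is a short Python sketch (my own, not part of the original comment) of the Murphy decomposition Brier = reliability - resolution + uncertainty, grouping events by the exact forecast value issued:

```python
# Murphy decomposition of the Brier score: BS = REL - RES + UNC.
from collections import defaultdict

def brier_decomposition(forecasts, outcomes):
    n = len(forecasts)
    base_rate = sum(outcomes) / n
    uncertainty = base_rate * (1 - base_rate)
    bins = defaultdict(list)              # events grouped by forecast value
    for f, o in zip(forecasts, outcomes):
        bins[f].append(o)
    reliability = sum(len(os) * (f - sum(os) / len(os)) ** 2
                      for f, os in bins.items()) / n
    resolution = sum(len(os) * (sum(os) / len(os) - base_rate) ** 2
                     for os in bins.values()) / n
    brier = sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / n
    return brier, reliability, resolution, uncertainty

outcomes = [1, 1, 0, 0, 0]
for name, f in [("Alice",   [0.6, 0.6, 0.2, 0.2, 0.2]),
                ("Bob",     [0.9, 0.8, 0.1, 0.1, 0.1]),
                ("Charlie", [0.99, 0.5, 0.5, 0.01, 0.01]),
                ("Mallory", [0.4, 0.4, 0.4, 0.4, 0.4])]:
    bs, rel, res, unc = brier_decomposition(f, outcomes)
    print(f"{name}: BS={bs:.5f} REL={rel:.5f} RES={res:.2f} UNC={unc:.2f}")
```

Running it reproduces the figures above: Alice 0.088/0.088/0.24, Bob 0.016/0.016/0.24, Charlie 0.10006/0.00006/0.14, and Mallory 0.24/0/0.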

welcome to bayesianism

here is the standard parable

you are given a coin, and told that it is biased, but not what direction it is biased in. it could be biased towards heads, or towards tails, and you have absolutely no evidence in either direction.

what probability do you assign to seeing heads when you flip the coin?

the bayesian, who believes that probability represents degrees of anticipation and certainty in the mind of a predictor, says "50%, duh. i have no data distinguishing these outcomes from each other, so they have equal probability."

the frequentist, who believes that probability represents the actual frequency at which events occur, says "anything BUT 50%! that's the one probability we know for sure it ISN'T!"

then you get to steal the frequentist's lunch money by offering to bet money on the outcome

>the frequentist, who believes that probability represents the actual frequency at which events occur, says "anything BUT 50%! that's the one probability we know for sure it ISN'T!"

I'd be interested in reading a steelman of frequentism from a self-identified frequentist. (I don't think this is one.)

Agreed. If they were required to use P(Heads) as part of a larger equation, what would they consider the best value to substitute in for it, given imperfect knowledge? Surely their options are "50%" (i.e. the same as the Bayesian answer) or "Stop the maths, we can go no further here" (an odd position for a probability theorist).

Or to ask the question another way - replace the coin with a biased die that favours one (unknown) number. The frequentist is told that on a roll of 1, the bomb to his left will explode. On a roll of 2 to 6, the bomb to his right will explode. The frequentist would ideally like to run in the opposite direction to avoid the bomb. Does he run to the left, the right, or does he stand in the middle saying smugly "there's no way of knowing"?

An obvious argument for frequentism is that pure bayesianism is totally useless, and almost everyone is a frequentist at base.

Bayesianism is the correct way to update pre-existing opinions in light of new evidence, but it doesn't let you form evidence-based opinions, and while Frequentism doesn't either, it's the intuitive starting point most people use.

For any set of evidence and any non-degenerate (i.e. no divide-by-zero errors) probability distribution on world-states and futures that you like, I can construct a prior that gives me that distribution in light of that evidence. And there's no scientific or rational way of choosing between priors, only to say how we should have updated them.

So, ultimately, on some level almost everyone defaults to frequentism, and says that if so far 70 of 100 coins have come up heads the chance that the 101st does is probably about 0.7.

The frequentist says the probability of the coin coming up heads is an unknown quantity somewhere between 0 and 1 but not 0.5. Frequentist inference refuses to say anything more, so I'm not sure how the bayesian is supposed to get any money by betting.

The situation gets more interesting after a single coin flip. Suppose the coin comes up heads. What is the probability that the coin comes up heads *again*?

The frequentist still says the true probability is unknown, but his maximum likelihood parameter estimate is now 1. But if the null hypothesis is "less than 0.5", his p-value is 0.5. That is, if the true probability is less than 0.5, this sort of result can be expected 50% of the time. If the null hypothesis is "greater than 0.5", the p-value is 1. Neither null hypothesis can be rejected. And he will helpfully draw a two-tailed 95% confidence interval from 0.025 to 1 around (/ up to) the point estimate 1.

The bayesian's answer depends on his prior. If it is uniform, he answers that his posterior mean is 2/3.

Still not sure how they bet.
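If it helps to see it concretely, here is a small Python sketch (mine, not the commenter's, and assuming scipy is available) of both analyses after observing one flip that came up heads:

```python
# One flip observed, one head (n=1, x=1): frequentist vs. Bayesian summary.
from scipy import stats

n, x = 1, 1

# Frequentist: maximum likelihood estimate and exact Clopper-Pearson CI.
mle = x / n                                   # 1.0
lower = stats.beta.ppf(0.025, x, n - x + 1)   # 0.025
upper = 1.0 if x == n else stats.beta.ppf(0.975, x + 1, n - x)
print(f"MLE = {mle}, 95% CI = ({lower:.3f}, {upper})")

# Bayesian: uniform Beta(1,1) prior -> Beta(1+x, 1+n-x) posterior.
posterior = stats.beta(1 + x, 1 + n - x)
print(f"posterior mean = {posterior.mean():.4f}")  # 2/3
```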

>The frequentist says the probability of the coin coming up heads is an unknown quantity somewhere between 0 and 1 but not 0.5.

I actually don't see why a frequentist would say this, and I honestly suspect John Wittle made it up. Eager to be corrected if I'm wrong on this.

Because in the joke we have absolutely sure information that it isn't 0.5. Further, frequentist theory generally refuses to make probability statements about parameters, as they are not considered random variables but fixed unknown non-random quantities. Thus inferences about them are based on the likelihood function, whereas a Bayesian will generally give a posterior distribution and interpret it as probability.

I admit I am also confused about how a frequentist should think of coin flips. The parameter p in a binomial distribution is a non-random, innate overall propensity of a coin to land heads or tails in a single experiment. Yet the event X = "coin lands heads" is a random variable.

The joke falls flat, as I see no reason for a frequentist to pick any particular estimate before seeing data; I think all the common estimates will be undefined.

Also, now that I think about it more, a careful Bayesian should choose a prior that is zero at 0.5, too. In practice it doesn't really matter, because excluding a single point (measure zero) from the range [0,1] does not really change anything.

This is pedantic, but if you are given a coin and told that it is biased, you know you have been misinformed and the probability of heads *is* 50%. There are no biased coins. https://www.tandfonline.com/doi/abs/10.1198/000313002605

I like to think of a dart as basically being a biased coin

Man are you ever going to be surprised when someone uses a double headed coin....

Huh. Yes, of course. That follows from Galileo, doesn't it?

Can you explain what bet you could offer the frequentist to steal their money? I'm having trouble coming up with one.

oh, i mean, the reductio ad absurdum would be that they would want to accept odds at something other than 1:1 since they know that must not be the "true" odds

but what actually happens is that they say "oh what? if you were gonna bet on it, of course the break even point would be 1:1"

then you say "hmm, it's almost like the way you're actually using them, the odds represent the degree of uncertainty about the outcomes in your mind, not the actual real-world frequency of given outcomes."

then they say "woah now hold on, that's not quite true, there is some true probability of the biased coin landing heads, when I talk about the probability of seeing heads I'm actually talking about the probability of that true probability (or 'frequency') being different values" or something like that

then it devolves into a long and hopefully productive if dry conversation on the nature of anticipation and epistemology and prediction

but obviously that's not as clever and pithy as just pretending they're going to stick to frequentist purism when money's on the line

(maybe? i am not good at steelmanning frequentism tbh, i feel like I don't understand it, it seems to make predictions about repeated events into a different datatype from predictions about one-off events, in a way that makes them incomparable, when to me it's pretty obvious they're the same kind of thing)

I’m not a frequentist, in fact I’ve just learned the word, but if someone were to tell me a coin was biased and I trusted that person I would bet the house on 100 throws not being 50/50. As for one throw, can’t say anything.

I think this is more about epistemic than aleatoric uncertainty.

I'm a die-hard Laplacian determinist - I cannot believe in a god who plays dice, even if that means accepting action at a distance and Copenhagen; when I say "I think there's a 40% chance of X happening" I mean something like "in 40% of the notional universes that fit the evidence I have, X is definitely going to happen, and in the other 60% it definitely isn't".

“ That in 40% of the universes that split off from the present moment there is a ceasefire but in the other 60% there isn't?”

Yes! What else could it mean.

Remember when every expert and pollster “didn’t predict” Trump winning though many had him at a 30% chance. People seem to discount <40% as 0%.

When, for instance, the prediction/pollster site FiveThirtyEight gave Trump a 29% chance of winning, that meant that he won in 29% of their modeled futures.

There could be a hidden variable that has already fixed the outcome but that we don't know about. If e.g. Vladimir Putin rolled a d10 yesterday and resolved to end the war as soon as his diplomats can come up with a cease-fire agreement if it came up 1-4, then either all universes branching off from the present moment will result in a quick ceasefire or none of them will. The die roll happened yesterday, but it's too soon for Russia's diplomats to have put together the cease-fire proposal. But if our spies reported that Putin was planning to roll the die and then got chased out of the Kremlin before they could see the result, we'd be correct to assess the cease-fire as a 40% probability.

We can still save something like your formulation by replacing "the present moment" with "the ensemble of possible universes that would be indistinguishable from the present moment if examined as closely as we examined this one while making the prediction".

Sure, in the case of hidden variables the percentage estimated would be wrong.

However I’m just answering as to what I think an x% chance of something happening means. That’s how I think of it.

I think of it as if there existed an objective answer to the question "given information X, what's the probability of such an event happening?". And when we assign a probability to this, we are estimating this probability for X = "everything we experienced, know, researched, etc". The way you got that info also has to be included in the info.

I don't claim this is philosophically correct if you think about it too much, but so far I feel fine enough with this idea.

So different people have different Xs so they give different probabilities to the same event, which can all be objectively correct because they are estimations of different things. A smart aggregate might use info from all of our Xs.

We can suck at estimating the thing, in which case we are not well calibrated. If, when we claim 80% probability, the events happen only 40% of the time, then we know we are not estimating well all the time.

We can also choose to use Xs other than "all information at our reach". For example, we can always say 50% if we choose X as empty, no info, and it will be a perfect estimate of a thing no one cares about.

I think it's more or less equivalent to the size of the bet one is willing to put on the outcome, e.g. if I say something has a probability of 80% it probably means I'm willing to bet more on the outcome than if I think it has a probability of 20%. How much more can probably be precisely stated by a complex algorithm that is fiercely debated at enormous length by philosophers of epistemology, but it hardly matters, as we're just talking about whether ineffable human confidence should be expressed on a linear or log scale, and what the units should be[1].

I agree something inherently subjective has dick to do with probability the way we usually define it, which is an objective statement about what fraction of an ensemble has a given value of a property that can take on more than one value. But it has value, kind of the way the price of a stock has value, inasmuch as it allows people who want to take opposite sides of a bet to find each other and agree on the terms of the bet.

Maybe people use probability instead of a literal option price in order to normalize for complicating demographic factors, like how wealthy the estimator is. My saying "I think this has a 99% chance of coming true" might indicate something like "I would bet what for me is an uncomfortable but not suicidal amount of money $X on the outcome," and that statement translates well whether I am Jeff Bezos and $X = $25 billion or I'm a college student and $X = $1,000.

-------------

[1] It's kind of like asking someone how he feels and having him say "I'm 25% happier than I was yesterday, but 50% happier than I was Monday." The quantification is a little weird, but it does give us a sense of relative amplitude, particularly if our respondent is not tip-top with words and can't use a rich vocabulary and active imagination for metaphors to more clearly convey the shades of distinction in his moods.

I thought that it means "If this person predicts a 40% probability for 100 different events, approximately 40 of them will happen and 60 will not." It's a measure of accuracy over many similar events. Similar to how a statement that "a coin has a 50% chance of landing heads" means that if you toss a coin 100 times, you'll get about 50 heads.

I guess you could say that this is still something behavioral ("this person knows enough about world events to be accurate as often as they say they are") but to me it seems the same as the coin. I don't see how "this coin has a 50% chance of landing heads" is a mathematical proposition but "based on the flips I've seen so far, I think the coin has a 50% chance of landing heads" is a behavioral one.

I guess you could take the first statement as implying some sort of absolute certainty, like "the author of this stats textbook has declared that the coin is fair, so it definitely has 50% probability." But any real-world statement of probability is implicitly saying "this probability is based on what we know so far."

"It seems like it has to say something behavioural about the person rather than something mathematical about the proposition but I can't put my finger on it"

I think that is correct.

From your posts it looks like you believe people can have internal mental states with different levels of confidence of things happening. Your question is then, how do people convert this confidence into a discrete percentage?

I think for most people, myself included, there is no exact algorithm for converting internal mental states of confidence into a specific percentage. When I end up saying there is a 40% chance something occurs, normally I think it's probably somewhere between 30%-50%, almost definitely between 20%-60%, but I think it is less likely to happen than not, so let's go with 40%. This process is a skill more akin to chicken sexing https://www.phon.ucl.ac.uk/home/PUB/WPL/02papers/horsey.pdf than it is to working out a math problem.

This then can feed into an iterative process of adjustments. If my initial prediction of something in Category A is 80%, I may go back and look at how all my other high confidence predictions for things like Category A did (e.g. a lot of the times I had a high confidence a stock was going to go up it did not) and adjust my predicted percentage based on historical data.

Is the 2023 contest still open?

People with self-reported IQ above 150 doing better is pretty surprising to me. I would have expected them to do worse.

Comment deleted (Jan 24, 2023)

I'm not surprised that people with exceptionally high IQs are good at intellectual tasks. I'm surprised that people who self-report exceptionally high IQs actually have them.

I kind of thought that was what you meant.

Really? It seems to me that most people might exaggerate their IQ as being slightly higher than it actually is, but not exaggerate it past 150. Do a lot of people who self-report high IQs, but don't have them, see lots of evidence that leads them to think they do?

My guess would have been that because an actual IQ of 150 is so rare, some combination of trolling (my IQ is 150, I'm 6'9", and I'm a billionaire) answers and delusional (rather than just slightly exaggerating) answers would have compensated for the real and slightly exaggerated answers. Surveys find small percentages of people self-reporting all kinds of crazy things.

But it looks like I guessed wrong.

Just about anywhere other than among Scott's readers, self-reported IQs over 150 are likely to be bogus, but this site has a lot of smart readers and a lot of people who are quite careful about things like their IQs.

There were a lot of 150+ IQ reports in the ACX survey. And it’s very rare. I once did an IQ test at one of the FAANGs; it was organised by the workers (engineers and software folk). Participation was a few hundred.

The average was 130 but it was tightly bunched. Just a few low 150s, as in one or two. I think IQ tests aren’t very useful at higher levels anyway, and useless at estimating creativity or actual genius etc.

That said, these results do seem to indicate that there’s some legitimacy to some self-reporting.

Comment deleted (Jan 24, 2023, edited)

you're right. probably an IQ of 130 minimum needed to make any major contribution to an intellectual field. for pure math or theoretical physics, more like 140-150.

Well we are, to repeat the claim at the top of this thread, talking about reported IQ not IQ. Which is like taking reports of male height or sexual partners seriously.

At least that’s what I would have thought.

Nobody is arguing that high IQ doesn’t correlate with intelligence.

Depending on the IQ test, one of the things that a really high IQ is correlated with is an excellent memory. So one can remember small details that might turn out to be important in making a prediction. So that the correlation exists isn't surprising. What's surprising is that it's so weak. This may indicate that the future is chaotic, and that there aren't that many attractors. Or it may be the nature of the questions asked that picked out a subset of features that are chaotic and without strong attractors. (The Ukraine war is a good example because the participants should work to be unpredictable in many ways.)

AFAIK you need either a super long test (impractical) or a combination of a "normal" test with shorter, but extremely difficult "genius" tests to get a reasonable comparison for extremes. The tests given in schools to identify "gifted" students are (in my experience in the US) often the only proper IQ tests anyone takes in their lives.

The SATs might be an apt comparison. A small percentage (what, like 1-5%? I don't know, I'm in my fuckin 30s now) get a perfect score. A small enough percentage that it's still impressive, but it literally doesn't let you differentiate between the *best* performing students. It's *only* useful for sorting the rest. And colleges want it that way.

Not all IQs are the same. 150 on a childhood ratio test is not like 150 on a normed test. I don't think they go that high.

There's also a bunch of online IQ tests which flatter you with an exaggerated score. I think I got 150+ in one of those back in the day.

I would not attempt to take one these days. I don't think I've got dumber, but I have definitely got out of practice at test-taking.

If you compare the SAT scores and the IQ scores you can see that there must be many instances of people exaggerating their IQ (or, quite possibly, reporting the results of some hinky online test even though the instructions say not to do that).

*In general*, anyone with a 140+ IQ should be able to crush the modern, post-1995 SAT. You can miss multiple questions and still score 1600, and the questions are not exceptionally difficult.

Of course some may be reporting pre-1995 scores, and there are any number of possible but unlikely reasons that someone could severely underperform on the SAT, but when I see an entry like '145 IQ, 1300 SAT' (it's in there, along with a fair number of other similar cases), I just assume the IQ score is inflated.

The implication that people are more likely to inflate their self-reported IQ than their SAT score is psychologically interesting, I guess.

But anyway I wouldn't use the IQ data for anything serious.

It has been a long time since I have taken the SAT, but IIRC it is largely knowledge based, and someone with a low IQ who spent a lot of time training can ace it while someone with a high IQ who never paid attention in school could fail it.

This is different from something like the LSAT (IIRC), which does a lot more testing of reasoning and requires minimal accumulated knowledge.

They're reasonably well-correlated because people who have developed skills to be good at one measure usually have had the educational exposure to be good at the other. This is further amplified by the fact that in a survey like this, people who have scored high on some form of IQ test are highly likely to have been given superior educational resources as a main use of IQ testing is to qualify people for accelerated learning programs.

A person good at what formal IQ tests are measuring doesn't have to be good at a trigonometry test, but these aren't independent variables and you'll find that people in the former category probably ending up having a solid math education.

I took all of these tests long ago and would probably seem like an outlier (or a liar): 140 and 1220. But my SAT was over 250 points higher than the next highest score in my class (my classmates who were in the gifted class, IQ > 120, all scored in the low 900s).

Based on my personal sample of n=1, if you miss a single question on the SAT you won't get a perfect score. But the point holds in general I think.

Depends. I know I missed a question on the SAT verbal (indelible/ineffable, in 2011 IIRC) and got a perfect score. But maybe you can't miss even one on math.

This is ACT. The readership is... not representative of the wider population in more ways than one.

Given the context of a discussion including college entrance exams, I think abbreviating the blog as ACX has a clear advantage over the pure initialism.

Yeah, I took the ACTs rather than the SAT myself.

There's a big problem with reporting IQ caused by the (in)ability to properly measure it. Is testing for IQ something that's done in a semi-standardised manner in the USA? I have no reasonable ability to get a "proper" IQ test, since that simply isn't done where I live. The various options that are available online (or were - it's probably been over 10 years since I last looked), including the quite long serious-looking ones, generally "top out" and report something like 140-150 if you can answer the questions quickly, since they aren't calibrated for people doing that. So while I'd presume that the super-high number given by such online tests isn't a correct measurement, given how the normal distribution and IQ scoring should work, that's the only IQ number I can get, and thus the only IQ number I can report.

If a person gets all predictions for the year right, the mundane work of estimating one's own IQ is easy. They used the same good predicting methods to obtain their IQ that they later used to ace the prediction contest.

I think Larry is assuming people who claim 150+ IQ might be exaggerating. Overcompensating. That would have been my suspicion too, but this result has changed that. A bit.

It's quite possible or even likely that the self-reported 150+ IQ cohort *do* have very high IQs, and that it's just a matter of some substantial portion of them reporting some high water mark from an online test.

Not really that demanding. Predicting involves making inferences based on pieces of information. It's not on the same level of rigor or exactness as trying to understand some complicated math or physics concept.

It seems like a fine example of something that is neither rigorous nor exact, and yet can still be extraordinarily demanding if being done at a high level.

Which of the following statements (if any) are you most skeptical of?

1. People who are +3 SDs on ability-to-take-IQ-tests exist.

2. Ability-to-take-IQ-tests correlates with a lot of meaningful stuff, including forecasting ability.

3. People who are +3 SDs on ability-to-take-IQ-tests will often take an IQ test (e.g. due to being identified as gifted when a child) and learn of their high IQ.

4. Having learned of their high IQ, people will be willing to share it for a competition such as this.

Or alternatively, are you just claiming that the population of SSC readers who will falsely claim a high IQ outweighs the population with actually high IQs? It certainly seems true that as the IQ score in question gets higher and higher, the ratio of liars to actual geniuses gets higher and higher too.

I think a key issue here is that of self-selection. Someone who claims a high IQ unprompted is likely a bloviator. But a survey question which asks you for your IQ doesn't select for bloviators in the same way.

I'm not surprised that people with 1 in 1000+ IQs exist, or that those people are really smart, or that those people will answer honestly. I'm surprised that, given how rare they are (1 in 1000+), the people who really are in that group, or are at least genuinely really smart, are not overwhelmed by trolls and internet geniuses on a self-report survey.

I think this blog is a lot more technical and odd than you are thinking. My wife went to a very good college, got good grades, probably has good test scores; I would guess IQ 125 or something….

Would never spend leisure time reading something this dry.

A huge portion of even doctors, professors, and lawyers are not that exceptional intellectually. That isn’t even getting into cashiers and delivery men.

Being 1/1000 isn’t that high a bar when the readership is only a couple tens of thousands of people out of billions.

TBH I'm guessing this is mostly noise, given that it only works at that arbitrary threshold

Yeah, I honestly find that more believable than the alternative.

Read the footnote. It's correlated all the way through in the raw scores.

"only works" could just mean "is only statistically significant"?

Possibly those (if any) who wildly exaggerated their IQs in the survey were less likely to participate in the prediction contest? I imagine the sort who make false self-aggrandizing claims (even if only to themselves) are not the sort who would be keen to seek evidence that could undermine their self-image.

Having just run the numbers, the IQ>150 answers on Round 1 look kinda garbage (even after throwing out the person claiming to have an IQ of 212), in that they're frequently moving the global average in the opposite direction from both the superforecasters and the current Manifold predictions.

E.g., what is the probability Ukraine will hold Sevastopol? Superforecasters say 23%, Manifold says 15%, IQ>150 say 40%. I'm biased, because I myself said 15%, but 40% looks well out of line. Similarly, will any new country join NATO? Superforecasters and Manifold both say 71%, IQ>150 says 58%.

Woohoo, I beat putting 50% on every single answer, Vox Future Perfect, and roughly one out of four other participants!

(Sigh. I knew from GJ Open that I'm not great at predictions, but I thought that long experience over there would have actually helped. Apparently not.)

See the words of the winner Ryan

> last place was no worse than second

We are all second. That's amazing!

Just out of curiosity, how do you score answers that are not "yes" or "no" but rather Bayesian-style percentages? One take is that if someone guessed that Gorg will do Blah with probability 10%, and Gorg does do Blah, that person should get one tenth of a point. Another take is that it shouldn't be linear, and someone who guessed 0% should be executed immediately.

They said they used a log loss rule which means that someone who guessed 0% would be executed immediately (i.e. have a score of +infinity regardless of their other answers).

>One take is that, if someone who guessed that Gorg will do Blah with probability 10%, and Gorg does do Blah, that person should get one tenth of a point.

This take doesn't work. The problem is that, if there's a 49% chance Gorg will do Blah, and I know that, my expected gain from predicting 49% is 0.49^2 + 0.51^2 = 0.5002, but my expected gain from predicting 0% is 0.49*0 + 0.51*1 = 0.51 > 0.5002. Hence, people will only predict 0 or 1.

Log-loss is one of the ways to get "perfect incentives" i.e. if the correct answer is 49% and I know that, I should predict 49%.
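A toy calculation (my own sketch, not the commenter's) that makes the incentive gap explicit for a known 49% event:

```python
# Expected payoff of reporting x when the true probability is p = 0.49,
# under the linear "tenth of a point" rule vs. a logarithmic rule.
import math

p = 0.49

def expected_linear(x):
    # earn x if the event happens, (1 - x) if it doesn't
    return p * x + (1 - p) * (1 - x)

def expected_log(x):
    # earn log of the probability assigned to whatever actually happened
    return p * math.log(x) + (1 - p) * math.log(1 - x)

for x in (1e-9, 0.25, 0.49, 0.75, 1 - 1e-9):
    print(f"x={x:.2f}  linear={expected_linear(x):.4f}  log={expected_log(x):.4f}")
# Linear pays most at x=0 (0.51 beats 0.5002 at x=0.49): it rewards extremes.
# Log pays most exactly at x=0.49: honest reporting is optimal.
```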

The term to look up here is "proper scoring rule".

The two most commonly used proper scoring rules are quadratic loss (i.e., if you say something has a 40% chance of happening, then you lose .36 points if it does and .16 points if it doesn't) and logarithmic score (i.e., your score is the log of the probability you give to the thing that happened - logs of numbers less than 1 are negative, so this is again interpreted as a penalty for being far from the truth).

A "proper scoring rule" is any scoring rule where your expected score is maximized if you report your true probabilities. If your true probability is p and your reported probability is x, then a linear scoring rule gives you an expected score of p(1-x)+(1-p)x = p-px+x-px = p-x(2p-1). Thus, if p>1/2, you maximize your expected score by reporting a probability of 1, and if p<1/2 you maximize your expected score by reporting a probability of 0. Under the quadratic rule, your expected loss is p(1-x)^2+(1-p)x^2 = p-2px+px^2+(1-p)x^2 = x^2-2px+p = (x-p)^2-(p^2-p). It's clear that this expected loss is minimized when x=p. (And since p^2-p is your expected loss when reporting your true probability, this has the useful feature that your expected penalty for reporting a probability other than your own is equal to the square of the difference between your true probability and reported probability.)

It's a little less straightforward to show that the logarithmic scoring rule is proper, but it is. There are a whole family of other proper scoring rules, and they correspond to different ways of thinking about the "directed urgency" of changing your probability estimate by a little bit, when you currently estimate something less than optimally. (The quadratic score gives directed urgency proportional to your probability; the logarithmic score gives directed urgency proportional to the reciprocal of your probability; others do it differently.)

This "directed urgency" turns out to be the same as the idea used in gradient descent training of neural networks. (In each training round they change their weights in a way proportional to how much it would have improved their score according to some particular scoring rule.)

I was going to ask on the next open thread why people were not all that worried about a nuclear war - but since we are on a thread about forecasting does anybody want to chime in. Or is it in the contest (which I didn’t get involved in).

I’m talking about the Ukraine war here, not in general. In particular, if the Russians start to lose badly, will a Ukrainian army that pushes the Russians back to the border stop at the border? Stopping there seems to be what the western world demands. Clearly, if the Ukrainian army is routing the Russians, they may not have any particular desire to stop.

That said, of course, it’s unlikely that the west would continue to supply arms to Ukraine if they cross the border, but you never know. Everybody’s backs are up.

Despite my pessimism I would put the odds at 10-20% but that’s still far too high for comfort.

Using nuclear weapons serves nobody's interests. Ukraine attacking into Russia would surely galvanise a lot more Russians to actually fight in this war and it would also cost Ukraine the moral advantage. They might do it (15%?), but only to trade it for Crimea.

I'm less worried about nuclear war now than I was six months ago; Russia has accepted some major setbacks without going nuclear, and evacuating Kherson without at least an explicit threat of nuclear escalation is a good sign. So maybe now 10-12%, down from 20% at the peak.

But that's the total risk of anything that would count as a "nuclear war", including a single strike at a Ukrainian logistics depot followed by loud threats. That's the sort of thing that *might* plausibly benefit Putin and Russia, if things break well for them. A large-scale nuclear war between Russia and NATO does not plausibly benefit Russia. Or China or India for that matter, and they do matter. It *probably* doesn't benefit Vladimir Putin, and if it does it's because his position in Moscow is not secure, which correlates with a high probability that the Russian military would stop taking his orders if he tried to go Full Nuclear.

So the risk of Global Thermonuclear War is probably now down to ~1% or so. Well, 1% above the baseline for a nothing-interesting-happening-right-now year.

Well, Putinism is generally self-destructive by all appearances, but some of this is likely still delusional optimism, and it doesn't seem to be short-term suicidal just yet. I'm sure that China and India have abundantly communicated to Putin that they will completely cut him off if he presses the button, and it's unlikely that the small triumphant war is worth it for him, and even that is by no means guaranteed.

He also can't truly lose: even if Ukraine recaptures all its territory, he can just say that it was all NATO/mercenaries, lick his wounds and relaunch the invasion in a few years. A major Ukrainian invasion beyond its borders is very unlikely, and won't be supported by the West. Ukraine getting accepted into NATO is a different story, but I very much doubt that there would be an appetite in the West for that either.

Historically a country that has been invaded, and pushes back on the invaders, doesn’t stop at the borders. However I agree that there are special circumstances here. The west will probably cut all funding. Probably. There are some hot heads out there.

As for China - they will come out of any nuclear war, which would be limited these days compared to the Cold War, in a relatively good position. India too.

The Poland-Soviet Union war right after WWI saw a lot of back and forth across large distances. Same with the Russian Civil War. Same with the Greece-Turkey war of the same era in more rugged terrain.

WWI ended before allied troops entered Germany. You could argue that this was because Germany surrendered, since the Allies were clearly willing to push farther if they needed to, but I don't see why it wouldn't work just as well with a negotiated peace.

It's also worth noting that when Ukraine took back the area around Kharkiv, they did stop at the border rather than, say, counter-invading towards Belgorod.

In Russia the president doesn't have the unilateral power to launch nukes; he needs the Defence minister to order it as well. I am sure Shoigu is loyal to Putin, but I am not sure he is suicidally loyal.

As far as I can tell, the official hierarchy/checks and balances mean nothing in practice in Russia, if an underling refuses to follow an order he is simply and quietly replaced. Of course, there are a few people whose opinions Putin actually needs to consider, mostly in his FSB/KGB close circle, but I wouldn't bet on them to successfully coordinate and perform a coup in the last possible moment.

Even if the order is given, would Russian missileers obey? Or would they pretend not to have received the order, raise questions about its authenticity, "discover" that something critical is inoperable, &c.?

They probably would. They aren't out there watching CNN tell them all about insane tyrant Putin's latest antics, the propaganda they're swimming in every day is instead about Mother Russia being beset by enemies on all sides, and that any day they might be called on to perform their grave but necessary duty. Those who aren't up to it generally don't end up with such jobs in the first place.

There is the question of whether Putin even has a few years left after which to try again, or if his health is too deteriorated.

There is a separate issue of the Russian nuclear arsenal potentially being in just as terrible condition as everything else. Claiming that Russia doesn't actually have working nukes at all seems overly optimistic, but it seems quite likely that there are many fewer of them than claimed, with huge uncertainty about which are potent and which are not.

This situation prevents Russia not only from using nukes but even from performing nuclear tests, because a failure would expose the problem, making all the implicit nuclear blackmail much less credible.

People seem to forget that Russia is losing/not winning in Ukraine because it’s fighting the armaments of more than a dozen NATO countries, and even then the Ukrainians keep asking for more, to the extent that Europe is being depleted in its own defense.

https://www.wsj.com/articles/europe-is-rushing-arms-to-ukraine-but-running-out-of-ammo-11671707775

https://www.nytimes.com/2022/11/26/world/europe/nato-weapons-shortage-ukraine.html

Now think about the attitude if, in a few months, the Ukrainians deplete even more of Europe’s capacity to defend itself and start to lose territory. It will be panic again.

In any case the fact that the Russian tanks are not modern or capable tells you nothing about the missiles.

A few years ago Turkey bought Russian defense missiles, against American wishes and despite being part of NATO, and last year the US asked that the Turks give these missiles to Ukraine. These are the S-400s. Clearly they are considered useful.

https://www.reuters.com/world/middle-east/turkeys-erdogan-says-intends-buy-another-russian-s-400-defence-system-cbs-news-2021-09-26/

https://www.reuters.com/world/us-suggested-turkey-transfer-russian-made-missile-system-ukraine-sources-2022-03-19/

These are defensive missiles but it means that Russia is certainly pretty modern on offensive missiles as well. Including their new hypersonic missiles.

I think it's pretty unlikely that the whole civilized world is going to deplete its resources faster than Russia, especially considering the state of the Russian economy and industry. But thanks for your public prediction. On the contrary, I predict that in a few months we will see new results of Ukraine's offensive.

Missiles are one thing, properly functioning nuclear warheads another.

I didn’t actually predict anything in that post. I did link to facts about Europe depleting its arms.

But that was an aside; my main point was that they have relatively good missile systems, which you seem to have conceded in moving the goalposts to a discussion about nuclear warheads.

About that I know nothing so I can’t comment.

I counted this as a prediction:

>Now think about the attitude of in a few months the Ukrainians deplete even more of the capacity of Europe to defend itself and starts to lose territory. It will be panic again.

But now I see that you treat it just as one of the possible scenarios, not necessarily one you put your money on.

> my main point was that they have relatively good missile systems which you seem to have ceded in moving the goalposts to a discussion about nuclear warheads

I don't think it can be categorised as goalpost moving if I didn't mention missiles in the first place and was talking specifically about nukes.

Yes, you are correct, apologies. When you said nukes my mind went to both missiles and warheads, mostly missiles. As I said though I can’t really comment on warheads. No clue.

Did you respond to this comment from the substack email and/or the email link to the comment?

I've noticed that substack makes it quite difficult to see the entire context.

Naw I just read it wrong 🤷‍♂️

It is plausible that the "whole civilized world" is going to deplete the particular resource that is "artillery ammunition" faster than Russia and its handful of allies. It's not *likely*, but it is possible and it's why I'm not at 90+% for Ukraine to win this war.

Everybody seriously underestimates the amount of artillery ammunition it takes to win a modern war, even if they've got the data from all the previous wars sitting right in front of them. "Civilized" nations deal with this by not "squandering" their taxpayers' money buying more than they think they're going to need, disposing of the old stuff as soon as it becomes dangerous or expensive to keep in storage, and not pre-emptively building megafactories that can crank out thousands of shells a day that would then sit mostly-idle until they rust out. Russians (and Ukrainians, who learned warfighting from the Russians) deal with this by saying "so what; more artillery is better, build it all and don't throw anything away".

Ukraine reportedly used more artillery ammunition in the first week of this war, than the British army had in total. I *think* that when we throw in the whole rest of the "civilized world", we'll be able to scrounge up enough to get Ukraine through to the end, but it's not a sure thing.

SAMs & ICBMs are two very different creatures.

Their hypersonic missiles are either not new (Iskander) or bullshit (Zircon).

I see people say this a lot, but under the START treaty the United States is allowed to perform up to 18 on site inspections per year on Russian nuclear sites. There's also a lot of indirect means of verification allowed. The same parts of the treaty devoted to making sure Russia doesn't have MORE weapons than it claimed likely do a good job of ensuring that they don't have LESS weapons than they claimed. And while the only way to know for sure if they work is to fire them, the US inspection teams can look around and see how much money and effort is being put into these sites and put some measure of confidence in "how well maintained are these systems". Acting like the state of the Russian nuclear arsenal is a black box to the United States is simply not true.

The problem with nuclear weapons, especially on the scale of the Russian arsenal, is that you don't need a lot of it to ruin millions of people's days.

If they couldn't launch 95% of their missiles (which seems wildly unlikely-- they have a lot of problems with their hardware, but it's not as if that fraction of their tanks just won't start, and their civilian rocketry still works reasonably reliably pace the current ISS Soyuz issue), that's still enough to expose 80 locations we're fond of to instant sunshine and start an unpredictable cascade of events (including the likely retaliation that serves as our deterrent from doing that, the response to that, etc.) from there.

Being degraded to the point that they're literally not a nuclear threat seems basically impossible.

Well, if a substantial part of your nuclear arsenal is malfunctioning, or even if you just can't be sufficiently confident that it's not, and you don't know which nukes are which, it works as an extra reason not to use your nuclear weapons.

Yes, in theory you can still incinerate a lot. But you definitely won't be able to do it without retaliation, and you can't properly prioritise your targets.

If all their nukes are working, they still can't strike the US without retaliation. By the same token, they have enough nuclear weapons to deter us from putting troops on the ground in Ukraine and from directly attacking actual Russian territory either way.

Round 1 of the prediction contest says 7% chance of a nuclear weapon being used in war this year, although this question poses some problems (in the event of nuclear armageddon, the contest is unlikely to be scored).

That particular scenario strikes me as unlikely. Ukraine has nothing to gain and much to lose by invading Russia. Also, Ukraine has already pushed Russia back to the Kharkiv/Belgorod border without crossing it.
