The funniest thing about those arguments you're rebutting is that the average of a large number of past 0-or-1 events is only an estimate of the probability of drawing a 1. In other words, the probabilities they're saying are the only ones that exist, are unknowable!
That's right. Even in a simple case of drawing balls from a jar where you don't know how many balls of each color there are, the last ball has a 100% 'real' probability of being a specific color, you just don't know which one.
I'd say that all probabilities are about "how to measure a lack of information". If you have a fair coin it's easy to pretend that probability is objective because everyone present has exactly the same information. But as long as there are information differences, different people will give different probabilities. And it's not the case that some of them are wrong. They just lack different information. But for some reason people expect that if you're giving a number instead of a feeling then you are claiming objectivity. They're just being silly. There are no objective probabilities outside of quantum mechanics and maybe not even there. It's all lack of information, it's just that there are some scenarios where it's easy for everyone to have exactly the same lack.
Okay, after years of this I think I have a better handle on what's going on. It's reasonable to pull probabilities because you can obviously perform an operation where something obviously isn't 1% and obviously isn't 99%, so then you're just 'arguing price.' On the other hand, it's reasonable for people to call this out as secretly rolling in reference class stuff and *not* having done the requisite moves to do more complex reasoning about probabilities, namely, defining the set of counterfactuals you are reasoning about and their assumptions and performing the costly cognitive operations of reasoning through those counterfactuals (what else would be different, what would we expect to see?). When people call BS on those not showing their work, they are being justly suspicious of summary statistics.
The thing is, if they’re wanting to call out those things, they should do that, rather than attacking the concept of probability wholesale.
Like, there are good reasons to not invest in Ponzi schemes, but “but money is just a made-up concept” is one of the least relevant possible objections.
A legend about accuracy versus precision that you may have heard but that I think is applicable:
As the story* goes: When cartographers went to measure the height of Mount Everest, they had a hard time because the top is eternally snow-capped, making it nigh-impossible to get an exact reading. They decided to take several different readings at different times of year and average them out.
The mean of all their measurements turned out to be exactly 29,000 feet.
There was concern that such a number wouldn’t be taken seriously. People would wonder what digit they rounded to. A number like that heavily implies imprecision. The measurers might explain and justify in person, but somewhere down the line that figure ends up in a textbook (not to mention Guinness), stripped of context.
It’s a silly problem to have, but it isn’t a made-up one. Their concerns were arguably justified. We infuse a lot of meaning into a number, and sometimes telling the truth obfuscates reality.
I’ve heard more than one version of exactly how they arrived at 29,028 feet, instead—the official measurement for decades. One account says they took the median instead of the mean. Another says they just tacked on some arbitrary extra.
More recently, in 2020, Chinese and Nepali authorities officially established the height to be 29,031 feet, 8.5 inches. Do you trust them any more than the cartographers? I don’t.
All of which is to say, it makes sense that our skepticism is aroused when we encounter what looks like an imbalance of accuracy and precision. Maybe the percentage-giver owes us a little bit more in some cases.
* apocryphal maybe, but illustrative, and probably truth-adjacent at least
>More recently, in 2020, Chinese and Nepali authorities officially established the height to be 29,031 feet, 8.5 inches. Do you trust them any more than the cartographers? I don’t.
For your amusement: I just typed
mount everest rising
into Google, and got back
>Mt. Everest will continue to get taller along with the other summits in the Himalayas. Approximately 2.5 inches per year because of plate tectonics. Everest currently stands at 29,035 feet.Jul 10, 2023
> you’re just forcing them to say unclear things like “well, it’s a little likely, but not super likely, but not . . . no! back up! More likely than that!”, and confusing everyone for no possible gain.
There's something more to that than meets the eye. When you see a number like "95.436," you expect the number of digits printed to represent the precision of the measurement or calculation - that the 6 at the end means something. In conflict with that is the fact that even one significant digit is too many. 20%? 30%? Would anyone stake much on the difference?
That's why (in an alternate world where weird new systems were acceptable for Twitter dialogue), writing probabilities in decimal binary makes more sense. 0b0.01 expresses 25% with no false implication that it's not 26%. Now, nobody will learn binary just for this, but if you read it from left to right, it says "not likely (0), but it might happen (1)." 0b0.0101 would be, "Not likely, but it might happen, but it's less likely than that might be taken to imply, but I can see no reason why it could not come to pass." That would be acceptable with a transition to writing normal decimal percentages after three binary digits, when the least significant figure fell below 1/10.
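For concreteness, here's a minimal sketch (my own, not part of the proposal above) of converting between probabilities and that binary-fraction notation; the function names are made up:

```python
def prob_to_binary_fraction(p, digits=4):
    """Render a probability in [0, 1] as a binary fraction string, e.g. 0.25 -> '0b0.01'."""
    bits = []
    for _ in range(digits):
        p *= 2
        bit = int(p)          # 1 if the remaining probability covers at least half the remaining interval
        bits.append(str(bit))
        p -= bit
    return "0b0." + "".join(bits)

def binary_fraction_to_prob(s):
    """Inverse conversion, e.g. '0b0.0101' -> 0.3125."""
    frac = s.split(".")[1]
    return sum(int(b) / 2 ** (i + 1) for i, b in enumerate(frac))

print(prob_to_binary_fraction(0.25, digits=2))   # 0b0.01
print(binary_fraction_to_prob("0b0.0101"))       # 0.3125
```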
I like this idea. And you can add more zeroes on the end to convey additional precision! (I was initially going to write here about an alternate way of doing this that is even closer to the original suggestion, but then I realized it didn't have that property, so oh well. Of course one can always manually specify precision, but...)
but isn't this against the theory? if we were mathematically perfect beings we'd have every probability to very high precision regardless of the "amount of evidence". before reading this post I was like hell yea probabilities rock, but now I'm confused by the precision thing.
I guess we might not know until we figure out the laws of resource-constrained optimal reasoning 😔
Unrelated nitpick, but we already know quite a lot about the laws of resource-constrained optimal reasoning, for example AIXI-tl, or logical induction. It's not the end-all solution for human thinking, but I don't think anyone is working on resource-constrained reasoning in the hope of making humans think like that, because humans are hardcodedly dumb about some things.
Is there a tl;dr for what resource-constrained reasoning says about how many digits to transmit to describe a measurement with some roughly known uncertainty? My knee-jerk reaction is to think of the measurement as having maybe a Gaussian with sort-of known width centered on a mean value, and that reporting more and more digits for the mean value is moving a Gaussian with a rounded mean closer and closer to the distribution known to the transmitter.
Is there a nice model for the cost of the error of the rounding, for rounding errors small compared to the uncertainty? I can imagine a bunch of plausible metrics, but don't know if there are generally accepted ones. I assume that the cost of the rounding error is going to go down exponentially with the number of digits of the mean transmitted, but, for reasonable metrics, is it linear in the rounding error? Quadratic? Something else?
If you think your belief is going to change up or down by 5% on further reflection, that's your precision estimate. Rounding error can be propagated through the normal techniques for error propagation (see any source on scientific calculation).
There is no rule or formula for the precision of resource-constrained reasoning, because you aren't guaranteed to order the terms in the process of deliberation from greatest to smallest. Instead, I use repeated experiments as my example of a belief you're expecting to change within known bounds in the future, to show why most probabilities have limited precision.
>Instead, I use repeated experiments as my example of a belief you're expecting to change within known bounds in the future, to show why most probabilities have limited precision.
Sure, that makes sense.
>Rounding error can be propagated through the normal techniques for error propagation (see any source on scientific calculation).
True. Basically propagating through the derivatives of whatever downstream calculation consumes the probability distribution estimate... For the case of a bet, I _think_ this comes down to an expected cost per bet (against an opponent who has precisely calibrated probabilities) that is the value of the bet times the difference between the rounded mean and the actual mean. Is that it?
If you are trying to figure out whether a coin is fair, the average number of heads per flip among a large number of experimental trials serves as your best estimate of its bias towards heads. Although you have that estimate to an infinite number of digits of precision, your estimate is guaranteed to change as soon as you flip another coin. That means the "infinite precision of belief," although you technically have it, is kind of pointless.
To put it another way, if you expect the exact probabilistic statement of your beliefs to change as expected but unpredictable-in-the-specifics new information comes in, such as further measurements in the presence of noise, there's no point in printing the estimate past the number of digits that you expect to stay the same.
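A quick simulation of that point (the coin's bias and the flip counts are arbitrary, chosen just for the demo): the printed estimate has as many digits as you like, but every additional flip is guaranteed to move it, and the size of that movement tells you how many of those digits were worth printing.

```python
import random

random.seed(0)
true_p = 0.6                      # assumed hidden bias, just for the demo
heads = flips = 0
for n in [100, 1_000, 10_000]:
    while flips < n:
        heads += random.random() < true_p
        flips += 1
    estimate = heads / flips      # printed to as many digits as you like...
    shift = 1 / (flips + 1)       # ...but the very next flip moves it by up to this much
    print(f"after {flips:6d} flips: estimate = {estimate:.8f}, next flip moves it by up to {shift:.8f}")
```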
Here's a way to interpret that "infinite precision of belief": if you bet according to different odds than precisely that estimator, you'll lose money on average. In that sense, the precision is useful however far you compute it; losing any of that precision will lose you some ability to make decisions.
Your conclusion about forgoing the precision that is guaranteed to be inexact is wrong. Consider this edge case: a coin will be flipped that is perfectly biased; it either always comes up heads or always tails; you have no information about which one it is biased towards. The max-entropy guess in that situation is 1:1 heads or tails, with no precision at all (your next guess is guaranteed to be either 1:0 or 0:1). Nonetheless, this guess still allows you to make bets on the current flip, whereas you'd just refuse any bet if you followed your own advice.
> losing any of that precision will lose you some ability to make decisions.
The amount of money you'd lose through opportunity cost in a betting game like that decreases exponentially with the number of digits of precision you're using. To quote one author whose opinions on the subject agree with mine,
"That means the 'infinite precision of belief,' although you technically have it, is kind of pointless."
;)
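To put rough numbers on that exponential claim, here's a toy model (my own, and deliberately the simplest one possible): you quote your probability rounded to a given number of binary digits, and someone who knows the exact value trades a $1 contract against you whenever your quote is off. The expected cost is proportional to the rounding error, whose worst case halves with every extra binary digit.

```python
def rounding_loss(p, bits):
    """Expected loss per $1 contract when quoting p rounded to `bits` binary digits
    against a counterparty who knows p exactly (toy adversarial model)."""
    scale = 2 ** bits
    q = round(p * scale) / scale
    return abs(p - q)

p_true = 0.372                     # assumed "true" probability for the demo
for bits in range(1, 9):
    bound = 1 / 2 ** (bits + 1)    # worst-case rounding error with this many digits
    print(f"{bits} digits: loss = ${rounding_loss(p_true, bits):.5f}  (worst case ${bound:.5f})")
```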
Compare this situation with the issue of reporting a length of a rod that you found to be 2.015mm, 2.051mm, and 2.068mm after three consecutive measurements. I personally would not write an average to four digits of precision.
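For the record, the standard mean-and-standard-error arithmetic on those three numbers (nothing beyond what any stats source would do) backs that up:

```python
from math import sqrt
from statistics import mean, stdev

rod = [2.015, 2.051, 2.068]               # the three measurements above, in mm
m = mean(rod)
sem = stdev(rod) / sqrt(len(rod))         # standard error of the mean
print(f"{m:.4f} mm +- {sem:.4f} mm")       # 2.0447 +- 0.0156
# With ~0.016 mm of uncertainty, the fourth digit of the average carries no information.
```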
I'm wondering how to interpret a range or distribution for a single future event probability. My P(heads) for a future flip of a fair coin, and for a coin with unknown fairness, would both be 0.5. In both cases I have complete uncertainty of the outcome. Any evidence favoring one outcome or the other would shift my probability towards 0 or 1. Even knowing all parameters of the random process that determines some binary outcome, shouldn't I just pick the expected value to maximize accuracy? In other words, what kind of uncertainty isn't already expressed by the probability?
It's epistemic vs aleatory uncertainty. The way the coin spins in mid air is aleatory i.e. "true random", while the way it's weighted is a fact that you theoretically could know, but you don't. The distribution should represent your epistemic uncertainty (state of knowledge) about the true likelihood of the coin coming up heads. You can improve on the epistemic part by learning more.
Sometimes it gets tough to define a clear line between the two - maybe Laplace's demon could tell you in advance which way the fair coin will go. But in many practical situations you can separate them into "things I, or somebody, might be able to learn" and "things that are so chaotic and unpredictable that they are best modeled as aleatory."
Epistemic vs aleatory is just fancy words for Bayesian vs frequentist, no? Frequentists only measure aleatory uncertainty, Bayesian probability allows for both aleatory and epistemic
Hmmm... frequentists certainly acknowledge epistemic uncertainty. I guess they're sometimes shy about quantifying it. But when you say p < 0.05, that's a statement about your epistemic uncertainty (if not quite as direct as giving a beta distribution).
It's the probability that you will encounter relevant new information.
You could read it as "My probability is 0.5, and if I did a lot of research I predict with 80% likelihood that my probability would still lie in the range 0.499--0.501," whereas for the coin you suspect to be weighted that range might be 0.1--0.9 instead.
Small error bars mean you predict that you've saturated your evidence, large error bars mean you predict that you could very reasonably change your estimate if you put more effort into it. With a coin that I have personally tested extensively, my probability is 0.5 and I would be *shocked* if I ever changed my mind, whereas if a magician says "this is my trick coin" my probability might be 0.5 but I'm pretty sure it won't be five minutes from now
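One conventional way to put numbers on that contrast is to treat each state of knowledge as a Beta distribution over the coin's bias; the pseudo-counts below are invented, but both beliefs report "P(heads) = 0.5" while claiming very different amounts of evidence:

```python
import random

random.seed(1)

def central_interval(alpha, beta, mass=0.8, n=100_000):
    """Approximate central credible interval of a Beta(alpha, beta) belief by Monte Carlo."""
    draws = sorted(random.betavariate(alpha, beta) for _ in range(n))
    lo, hi = (1 - mass) / 2, 1 - (1 - mass) / 2
    return draws[int(lo * n)], draws[int(hi * n)]

# A coin I've flipped ~10,000 times, about half heads (counts assumed for the demo):
print(central_interval(5000, 5000))   # roughly (0.494, 0.506)

# A magician's "trick coin" I know nothing about except that it's probably extreme:
print(central_interval(0.5, 0.5))     # roughly (0.02, 0.98)
```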
Single future event probabilities, in the context of events that are unrelated to anything you can learn more about during the time before the event, are the cases where the meaning of "uncertain probability" is less clear. That is why rationalists, who prioritize thinking about AI apocalypses and the existence of God, will tell you that "uncertain probability" doesn't mean anything.
However in science, the expectation that your belief will change in the future is the rule, not the exception. You don't know which way it will change, but if you're aware of the precision of your experiments so far you'll be able to estimate by how much it's likely to change. That's what an "uncertain probability" is.
This is the natural-language way to interpret probabilities, and so is correct. If you say you found half of people are Democrats, it means something different than saying you found 50.129% of people to be Democrats.
Yet it's subject to abuse, especially against those with less knowledge of statistics, math, or how to lie. If my study finds that 16.67% of people go to a party store on a Sunday, it's not obvious to everyone that my study likely had only six people in it.
There are at least three kinds of lies: lies, damned lies, and statistics.
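To make the 16.67% example concrete (a one-liner, with the number taken from the comment above): small samples leave fingerprints in "precise" percentages.

```python
from fractions import Fraction

reported = 0.1667                                   # "16.67% of people"
print(Fraction(reported).limit_denominator(100))    # 1/6 -> consistent with a six-person study
```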
A central issue when discussing significant digits is the sigmoidal behaviour: e.g. the difference between 1% and 2% is comparable to the difference between 98% and 99%, but NOT the same as the difference between 51% and 52%. So arguments about significant digits in [0, 1] probabilities are not well-founded. If you do a log transformation you can discuss significant digits in a sensible way.
What would I search for to get more information on that sigmoidal behavior as it applies to probabilities? I've noticed the issue myself, but don't know what to look for to find discussion of it. The Wikipedia page for 'Significant figures' doesn't (on a very quick read) touch on the topic.
This has been on my mind recently, especially when staring at tables of LLM benchmarks. The difference between 90% and 91% is significantly larger than the difference between 60% and 61%.
I've been mentally transforming probabilities into log(p/(1-p)), and just now noticed from the other comments that this actually has a name, "log odds". Swank.
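For reference, here's the transformation being discussed (log odds, or logit), with the step sizes that motivate it; the probability pairs are just the ones mentioned in this thread:

```python
from math import log

def logit(p):
    """Log odds of a probability p in (0, 1)."""
    return log(p / (1 - p))

for a, b in [(0.01, 0.02), (0.98, 0.99), (0.51, 0.52), (0.60, 0.61), (0.90, 0.91)]:
    print(f"{a:.2f} -> {b:.2f}: {logit(b) - logit(a):+.3f} in log odds")
# 0.01 -> 0.02 and 0.98 -> 0.99 are equally big steps (~0.70),
# while 0.51 -> 0.52 is a far smaller one (~0.04).
```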
Why are you adding this prefix “0b0” to the notation? If you want a prefix that indicates it’s not decimal, why not use something more transparent, like “bin” or even “binary” or “b2” (for “base 2”)?
That notation is pretty standard in programming languages. I do object to this being called "decimal binary" though. I'm not sure what exactly to call it, but not that. Maybe "binary fractions".
— Why not use 25% then? That's surely how everyone would actually mentally translate it and (think of/use) it: "ah, he means a 25% chance."
— Hold on, also I realize I'm not sure how ².01 implies precision any less than 25%: in either case, one could say "roughly" or else be interpreted as meaning "precisely this quantity."
— Per Scott's original post, 23% often *is*, in fact, just precise enough (i.e., is used in a way meaningfully distinct from 22% and 24%, such that either of those would be a less-useful datum).
— [ — Relatedly: Contra an aside in your post, one sigfig is /certainly/ NOT too many: 20% vs 30% — 1/5 vs roughly 1/3 — is a distinction we can all intuitively grasp and all make use of IRL, surely...!]
— And hey, hold on a second time: no one uses "94.284" or whatever, anyway! This is solving a non-existent problem!
-------------------------
— Not that important, and perhaps I just misread you, but the English interpretation you give of ².0101 implies (to my mind) an event /less likely/ than ².01 — (from "not likely but maybe" to "not likely but maybe but more not likely than it seems even but technically possible") — but ².0101 ought actually be read as *more* sure, no? (25% vs 31.25%)
— ...Actually, I'm sorry my friend, it IS a neat idea but the more I think about those "English translations" you gave the more I hate them. I wouldn't know WTF someone was really getting at with either one of those, if not for the converted "oh he means 25%" floatin' around in my head...
> Per Scott's original post, 23% often *is*, in fact, just precise enough
I strongly object to your use of the term "often." I would accept "occasionally" or "in certain circumstances"
(Funnily enough, the difference between "occasionally" and "in certain circumstances" is what they imply about meta-probability. The first indicates true randomness, the second indicates certainty but only once you obtain more information)
I intuitively agree that any real-world probability estimate will have a certain finite level of precision, but I'm having trouble imagining what that actually means formally. Normally to work out what level of precision is appropriate, you estimate the probability distribution of the true value and how much that varies, but with a probability, if you have a probability distribution on probabilities, you just integrate it back to a single probability.
One case where having a probability distribution on probabilities is appropriate is as an answer to "What probability would X assign to outcome Y, if they knew the answer to question Z?" (where the person giving this probability distribution does not already know the answer to Z, and X and the person giving the meta-probabilities may be the same person). If we set Z to something along the lines of "What are the facts about the matter at hand that a typical person (or specifically the person I'm talking to) already knows?" or "What are all the facts currently knowable about this?", then the amount of variation in the meta-probability distribution gives an indication of how much useful information the probability (which is the expectation of the meta-probability distribution) conveys. I'm not sure to what extent this lines up with the intuitive idea of the precision of a probability though.
I was thinking something vaguely along these lines while reading the post. It seems like the intuitive thing that people are trying to extract from the number of digits in a probability is "If I took the time to fully understand your reasoning, how likely is it that I'd change my mind?"
In your notation, I think that would be something like "What is the probability that there is a relevant question Z to which you know the answer and I do not?"
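A small sketch of why the spread matters even though both distributions integrate back to the same point probability (the Beta pseudo-counts are made up): the two beliefs respond very differently to the same new evidence, which is exactly the "how likely am I to change my mind" question.

```python
def posterior_mean(prior_heads, prior_tails, heads, tails):
    """Mean of the Beta posterior after observing `heads` and `tails` new flips."""
    return (prior_heads + heads) / (prior_heads + prior_tails + heads + tails)

# Both priors say "P(heads) = 0.5" before any new data:
print(posterior_mean(1, 1, 8, 2))     # vague prior:  0.75   after seeing 8 heads, 2 tails
print(posterior_mean(50, 50, 8, 2))   # firm prior:   ~0.527 after the very same evidence
```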
It is really easy to understand what the finite number of digits means if you think about how the probability changes with additional measurements. If you expect the parameter to change by 1% up or down after you learn a new fact, that's the precision of your probability. For example, continually rolling a loaded die to figure out what its average value is involves an estimate that converges to the right answer at a predictable rate. At any point in the experiment, you can calculate how closely you've converged to the rate of rolling a 6, within 95% confidence intervals.
It's only difficult to see this when you're thinking about questions that have no streams of new information to help you answer them - like the existence of God, or the number of aliens in the galaxy.
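A quick simulation of that predictable convergence (the die's bias is invented; the interval is the usual normal approximation):

```python
import random
from math import sqrt

random.seed(2)
p_six = 0.25                      # assumed probability of the loaded die showing a 6
sixes = rolls = 0
for n in [100, 1_000, 10_000, 100_000]:
    while rolls < n:
        sixes += random.random() < p_six
        rolls += 1
    est = sixes / rolls
    half = 1.96 * sqrt(est * (1 - est) / rolls)     # 95% confidence half-width
    print(f"after {rolls:7d} rolls: P(six) ~ {est:.4f} +- {half:.4f}")
```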
I like Scott's wording of "lightly held" probabilities. I think this matches what you are describing about the sensitivity of a probability estimate to the answer of an as-yet unanswered question Z.
Okay, hear me out: only write probabilities as odds ratios (or fractions if you prefer), and the number of digits is the number of Hartleys/Bans of information; you have to choose the best approximation available with the number of digits you're willing to venture.
Less goofy answer: Name the payout ratios at which you'd be willing to take each particular side of a small bet on the event. The further apart they are, the less information you're claiming to have.
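A sketch of that less-goofy version (the quoted odds are invented): the two payout ratios you'd accept translate directly into a claimed probability interval.

```python
back_odds = 5        # I'd risk 1 to win 5 that the event happens...
lay_odds  = 2        # ...and risk 1 to win 2 that it doesn't.

low  = 1 / (1 + back_odds)           # backing at 5:1 only makes sense if P >= 1/6
high = lay_odds / (1 + lay_odds)     # laying at 2:1 only makes sense if P <= 2/3
print(f"claimed probability range: {low:.3f} to {high:.3f}")   # 0.167 to 0.667
```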
>When you see a number like "95.436," you expect the number of digits printed to represent the precision of the measurement or calculation - that the 6 at the end means something. In conflict with that is the fact that even one significant digit is too many.
Ok, though isn't this question orthogonal to whether the number represents probabilities?
This sounds more like a general question of whether to represent uncertainty in some vanilla measurement (say of the weight of an object) with the number of digits of precision or guard digits plus an explicit statement of the uncertainty. E.g. if someone has measured the weight of an object 1000 times on a scale on a vibrating table, and got a nice gaussian distribution, and reported the mean of the distribution as 3.791 +- 0.023 pounds (1 sigma (after using 1/sqrt(N))), it might be marginally more useful than reporting 3.79 +- 0.02 if the cost of the error from using the rounded distribution exceeds the cost of reporting the extra digits.
Yes, this is exactly the same. In your example you are measuring a mass, in my examples you're measuring the parameter of a Bernoulli distribution. For practical reasons, there's always going to be a limited number of digits that it's worth telling someone when communicating your belief about the most likely value of a hidden parameter.
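For completeness, the calculation behind a report like "3.791 +- 0.023 pounds" (the readings below are simulated, with the per-reading noise chosen so the numbers come out near the comment's):

```python
import random
from math import sqrt

random.seed(3)
true_weight, noise = 3.791, 0.72        # assumed true weight (lb) and per-reading sigma
readings = [random.gauss(true_weight, noise) for _ in range(1000)]

n = len(readings)
m = sum(readings) / n
s = sqrt(sum((x - m) ** 2 for x in readings) / (n - 1))   # sample standard deviation
print(f"{m:.3f} +- {s / sqrt(n):.3f} pounds")              # the 1/sqrt(N) step; ~3.79 +- 0.023
```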
This is one of those societal problems where the root is miscommunication. And frankly it's less a problem than just a fact of life. I remember Trevor Noah grilling Nate Silver over how Trump could win the presidency when he had predicted that Trump had only a 1/3 chance of winning. It was hilarious in some sense. Now this situation is the reverse of what Scott is describing, where the person using the probability is using it accurately, but the dilemma is the same: lack of clear communication.
Yes, but most people think of probability like that. They think that a probability below 50% equates to an event being virtually impossible. It's like how many scientists make stupid comments on economics without understanding its terms.
Most people ... . Most people - outside Korea and Singapore - cannot do basic algebra (TIMSS). Most people are just no good with stochastics in new contexts. Most journalists suck at statistics. Many non-economists do not get most concepts of macro/micro - at least not without working on it. Does not make the communication of economists or mathematicians or Nate Silver less clear. 37+73 is 110. Clear. Even if my cat does not understand. - Granted, Nate on TV could have said: "Likely Hillary, but Trump has a real chance." - more adapted to IQs below 100 (not his general audience!). Clearer? Nope.
"more adapted to IQs below 100 (not his general audience!)"
Huh, have you seen the comments section on his substack? It's an absolute cesspool. I don't think I've read another substack with such a high proportion of morons and/or trolls in the comments (though I haven't read many).
I did and agree. He is not writing those comments, is he? - Writing: "who will win: Trump or Biden" will attract morons/trolls. Honey attracts flies just as horseshit does. - MRU comments are mostly too bad to read either.
I agree that the comments are a cesspool, but as far as I know it's idiocy born of misdirected intellect rather than an actual lack of mental horsepower.
If I see a mistake, oftentimes it's something like "Scott says to take X, Y and Z into account. I am going to ignore that he has addressed Y and Z and claim that his comments in X are incorrect!" I would expect someone dumb to not even comprehend that X was mentioned, much less be able to give a coherent (but extremely terrible) argument for this.
I think it's also partially confounded by the existence of... substack grabbers? Don't know what a good term for this type of person is. But when I see a low quality comment, without the background that an ACX reader """should""" have, I'll scroll up and see it's a non-regular who writes their own substack. Which I would guess means that they're sampling from the general substack or internet population.
I have encountered people who use the term "50/50" as an expression of likelihood with no intent of actual numerical content but merely as a rote phrase meaning "unpredictable." On one occasion I asked for a likelihood estimate and was told "50/50," but when I had them count up past occurrences it turned out the ratio was more like 95/5.
I still intuitively feel this way. I know that 40% chance things will happen almost half the time, but I can't help but intuitively feel wronged when my Battle Brothers 40% chance attack doesn't hit.
Charles Babbage said, "On two occasions I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."
Hey sometimes this works now with AI - an LLM figures out what you meant to ask and answers that instead of your dumb typo. Those guys were just 200 years ahead of their time.
Babbage apparently thought the questions [from two MPs, presumably in 1833] were asked in earnest. If so, I think the likeliest explanation is just Clarke's 3rd Law: "Any sufficiently advanced technology is indistinguishable from magic."
I think this is mainly conflict theory hiding behind miscommunication. Because there's no agreed upon standard of using numbers to represent degrees of belief (even the very concept of "degrees of belief" is pretty fraught!), people feel that they have license to dismiss contrary opinions outright and carry on as they were before.
Yes and I suspect this problem gets worse when you start getting to things with 10% probability happening, or things with 90% probability not happening, and so on.
I think the answer to that is to try to raise the sanity waterline.
Also, I am not sure that the issue here is just probability illiteracy.
If Nate had predicted that there is a 1/3 chance of a fair die landing on 1 or 2, nobody would have batted an eye if a two came up. Point out that the chances of dying in a round of Russian roulette are just 1/5 or so, and almost nobody will be convinced to play, because most people can clearly distinguish a 20% risk from a 0% risk.
Part of the problem is that politics is the mind killer. There was a clear tendency of the left to parse "p(Trump)=0.3" (along with the lower odds given by other polls) as "Trump won't win", which was a very calming thought on the left. I guess if you told a patient that there was a 30% chance that they had cancer, they would also be upset if you revised this to 100% after a biopsy. (I guess physicians know better than to give odds beforehand for that very reason.)
Sooner or later, everyone wants to interpret probability statements using a frequentist approach. So, sure you can say that the probability of reaching Mars is 5% to indicate that you think it's very difficult to do, and you're skeptical that this will happen. But sooner or later that 5% will become the basis for a frequentist calculation.
If you read through this article you'll see that probability statements drift between statements of degree of belief and actual frequentist interpretations. It's just inevitable.
It's also very obscure how to assign numerical probabilities to degrees of belief. For instance, suppose we all agree that there is a low probability that we will travel to Mars by 2050. What's the probability value for that? Is it 5%, 0.1%, or 0.000000001%? How do we adjudicate between those values? And how do I know that your 5% degree of belief represents the same thing as my 5% degree of belief?
Assuming you have no religious or similar objections to gambling, the standard mutually intelligible definition of a 5% degree of belief is a willingness to bet a small fraction of what's in your wallet (for example, one cent if you have $20), at better than 20:1 odds.
It has to be a small fraction because the total value you place on your money is only linear for small changes around a given number of dollars, otherwise it tends to be logarithmic for many.
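The arithmetic behind that definition, spelled out (stake and odds taken from the comment; linear utility is a fair approximation at one cent against a $20 bankroll):

```python
def expected_value_cents(p, stake=1, odds=20):
    """EV in cents of staking `stake` at `odds`:1 on an event you assign probability p."""
    return p * odds * stake - (1 - p) * stake

for p in [0.03, 1 / 21, 0.05, 0.08]:
    print(f"p = {p:.3f}: EV = {expected_value_cents(p):+.3f} cents")
# Break-even sits at p = 1/21 ~ 0.048, so taking the bet at better than 20:1
# is roughly what "a 5% degree of belief" commits you to.
```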
> the standard mutually intelligible definition of a 5% degree of belief is a willingness to bet a small fraction of what's in your wallet (for example, one cent if you have $20), at better than 20:1 odds.
> It has to be a small fraction because the total value you place on your money is only linear for small changes around a given number of dollars, otherwise it tends to be logarithmic for many.
This isn't right; by definition, people are willing to place fraudulent bets that only amount to "a small fraction of what's in their wallet". They do so all the time to make their positions appear more certain than they really are; your operationalization is confounded by the fact that support has a monetary value independently of whether you win or lose. Placing a bet buys you a chance of winning, and it also buys you valuable support.
It still won't work; consider the belief "I will win the lottery."
Fundamentally, it is correct to say that a 5% degree of belief indicates willingness to bet at 20:1 odds in some window over which the bet size is not too large to be survivable and not too small to be worthwhile, but it is not correct to say that willingness to bet indicates a degree of belief (which is what you're saying when you define degree of belief as willingness to bet), and that is particularly the case when you specify that the amount of the bet is trivial.
Sure. Also, I do think that it's reasonable to invest trivial amounts of money in fair lottery tickets given certain utility functions. For example, if a loss is negligible but you value extremely highly the possibility of imminent comfortable retirement. I don't do this myself because I believe that in my part of the world lotteries are rigged and the chance to win really big is actually zero.
> Also, I do think that it's reasonable to invest trivial amounts of money in fair lottery tickets given certain utility functions.
I agree. But this is not compatible with the definition of "degree of belief" offered above; that definition requires that lottery ticket purchasers do not believe the advertised odds are correct.
"I will win the lottery" has a large fantasy component, where people spend time thinking about all the things they could buy with that money.
An anonymous bet with a small fraction of your wealth does not have that fantasy component. "Look at all the things I could buy with this 20 cents I won!" just doesn't do the same thing.
Because it's anonymous you're not pre-committing anything. I suppose some people might brag about making that bet?
Is it actually a problem for prediction markets, though? People betting at inaccurate odds for emotional (or other) reasons just provides a subsidy for the people who focus on winning, resulting in that market more accurately estimating the true likelihood. Certainly markets can be inaccurate briefly, or markets with very few participants can be inaccurate for longer, but it's pretty easy to look past short-term fluctuations.
Or maybe you're thinking of it being a problem in some different way?
Of course, this assumes that people will be willing to bet on an outcome given some odds ratio.
It might well be that some people would not rationally bet on something.
For example, given no further information, I might not be willing to bet on "Alice and Bob will be a couple a month from now" if I have no idea who Alice and Bob are and anticipate that other people in the market have more information. Without knowing more (are Alice and Bob just two randomly selected humans? Do they know each other? Are they dating?) the Knightian uncertainty in that question is just too high.
Thank you! You are making **exactly** my point -- that although people start out by talking about subjective "degrees of belief", sooner or later they will fall into some sort of frequentist interpretation. There can be no purer expression of this than making an argument about betting, because ultimately you are going to have to appeal to some sort of expected value over a long run of trials, which is by definition a frequentist interpretation.
Not necessarily. I think the probability that P=NP is about 5%, but I don’t think that if we “ran” the universe that many times it would be true in 5% of them - it would either be true in all or none.
Instead it means that, if I had to do something right now whose outcome depended on P being equal NP, I would do it if the amount by which it made things better if they were = is more than 20 times better than the amount by which it made things worse if they were unequal. I need some number or other to coordinate my behavior here and guarantee that I don’t create a book of bets that I am collectively guaranteed to lose (like if I paid 50 cents for a bet that gives me 90 cents if horse A wins and also paid 50 cents for a bet that gives me 90 cents if horse A loses). But the number doesn’t have to express anything about frequency - I can use it even for a thing that I think is logically certain one way or the other, if I don’t know which direction that is.
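The arithmetic of that "book of bets" example, for anyone who wants it spelled out:

```python
price, payout = 0.50, 0.90        # the two horse-A bets from the comment, in dollars

paid     = 2 * price              # you bought both "A wins" and "A loses"
received = payout                 # exactly one of the two pays out
print(f"paid ${paid:.2f}, received ${received:.2f}, guaranteed loss ${paid - received:.2f}")

# Equivalently: the implied probabilities sum to more than 1, which is the tell-tale sign.
print(f"implied P(win) + P(lose) = {2 * price / payout:.2f}")   # 1.11 > 1
```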
I think the P=NP example (and the Mars example, if you believe in a deterministic universe) can still be approached this way if we define 'rerunning the timeline' as picking one of the set of possible universes that would produce the evidence we have today.
I'm not sure I follow. If the evidence we have today is insufficient to show that P != NP, how is it logically impossible for universes to exist where we have the same evidence and P does or does not equal NP?
Most people suspect that the mathematical tools we have *are* sufficient to show one of the directions of the P and NP conjecture, it's just that no human has yet figured out how to do it.
Even if it turns out not to be provable, it still either is or isn't, right? Universes where the starting conditions and the laws of physics are different are easy to imagine. Universes where *math* is different just bend my mind in circles. If you changed it to switch whether P=NP, how many other theorems would be affected? Would numbers even work anymore?
Interesting that you should say that. I was thinking about the timelines of possible universes going *forward from the present* — and when it comes to the Mars example, what is the set of possible universes that will *prevent* us from getting to Mars in 2050 vs the set that will *allow* us to get to Mars in 2050? I think we can agree (or maybe not?) that there is an infinity of universes for the former, but a smaller infinity of universes for the latter. After all, the set of universes where the sequence of events happens in the proper order to get us to Mars would be smaller than the set of universes where the sequence of events didn't happen (or couldn't happen). If these were sets of numbers we could create a bijection (a one-to-one correspondence) between their elements. But no such comparison is possible between these two sets, and the only thing we can say with surety is that they don't have the same cardinality. Trying to calculate the proportionality of the two sets would be impossible, so determining the probability of universes where we get to Mars in 2050 and universes that don't get us to Mars in 2050 would be a nonsensical question. I'm not going to die on this hill, though. Feel free to shoot my arguments down. ;-)
I've never been any good at reasoning about infinities in this way (Am I so bad at math? No, it's the mathematicians who are wrong!), but I've spotted a better out so excuse me while I take it:
I do disagree that these are infinite sets; I think they're just unfathomably large. If there are 2^(10^80) possible states of the universe one Planck second after any given state, the number of possible histories at time t quickly becomes unrepresentable within our universe. It's a pseudoinfinite number that is not measurably different from infinity in a practical sense, but saves us all of the baggage of an infinite number in the theoretical sense.
If you accept that premise (and I don't blame you if you don't), I believe we're allowed to do the standard statistics stuff like taking a random sample to get an estimate without Cantor rolling in his grave.
I like your counter-argument. But I'll counter your counter with this — if the number of potential histories of our universe going forward is impossible to represent within our universe, then it's also impossible to represent the chances of getting to Mars in 2050 vs not getting to Mars in 2050. Ouch! My brain hurts!
> Not necessarily. I think the probability that P=NP is about 5%, but I don’t think that if we “ran” the universe that many times it would be true in 5% of them - it would either be true in all or none.
I think this is a semantic/conceptual disagreement. I think there are two points where we can tease it apart:
* You're thinking of the world as deterministic, and I as predicated on randomness to a significant degree. If the future depends on randomness, then it makes no sense to claim it would be true in all or none. Whereas if the future is determined by initial conditions and laws of nature, then yes, it will be. In which case:
* You can adapt my conceptualisation such that it survives a deterministic world. A determinist would believe that the future is determined by known and unknown determinants, but nonetheless fixed, i.e. the laws of nature and initial conditions both known and unknown. To give x a 5% probability, then, is to say that if, for each "run" of the universe, you filled in the unknown determinants randomly, or according to the probability distribution you believe in, you would get event x occurring in about 50 of 1000 runs.
Correct me if I'm mistaken in my assumptions about your belief, but I don't know how else to make sense of your comment.
I think that the P vs NP claim is most likely a logical truth one way or the other, and that no matter how you modify the known or unknown determinants, it will come out the same way in all universes.
If you have some worries about that, just consider the claim that the Goldbach conjecture is true (or false), or the claim that the 1415926th digit of pi is a 5.
Isn’t your 5% estimate essentially meaningless here though? Since it is a proposition about a fundamental law and you know it is actually either 100% or 0%. And more importantly no other prediction you make will bear sufficient similarity to this one that grouping them yields any helpful knowledge about future predictions.
Your first point shows that P=NP doesn’t have a *chance* of 5% - either all physically possible paths give P=NP or none of them do.
Your second point shows that a certain kind of frequentist calibrationism isn’t going to make sense of this either.
But Bayesian probability isn’t about either of those (regardless of what Scott says about calibration). Bayesian probability is just a way of governing your actions in light of uncertainty. I won’t make risky choices that come out very badly if P=NP is false, and I won’t make risky choices that come out 20 times as badly if it’s true compared to how well the choice turns out if it’s false. That’s what it means that my credence (Bayesian probability) is 5%. There is nothing that makes one credence “correct” and another “incorrect” - but there are people who have credence-forming policies that generally lead them well and others that generally lead them badly. And the only policy that avoids guaranteed losses is for your credences to satisfy the probability axioms and to update by Bayesian conditionalization on evidence.
The thing I don’t get is that if it is not a frequentist probability, how can it make sense to apply Bayes' theorem to an update for P=NP? Say a highly respected mathematician claims he has a proof of it, then promptly dies. This is supposed to be Bayesian evidence in favour of P=NP. But does it make sense to apply a mathematical update to the chance of P=NP? Surely it is not an event that conforms to a modelable distribution, since as you say it is either wrong or right in all universes.
What Bayes' Theorem says is just that P(A|B)P(B)=P(A&B). (Well, people often write it in another form, as P(A|B)=P(B|A)P(A)/P(B), but that's just a trivial consequence of the previous equation holding for all A and B.)
To a Bayesian, P(A) (or P(B), or P(A&B)) just represents the price you'd be willing to pay for a bet that pays $1 if A (or B, or A&B) is true and nothing otherwise (and they assume you'd also be willing to pay scaled up or down amounts for things with scaled up or down goodness of outcome).
Let's say that P(B|A) is the amount you plan to be willing to pay for a bet that pays $1 if B is true and nothing otherwise, in a hypothetical future where you've learned A and nothing else.
The first argument states that this price should be the same amount you'd be willing to pay right now for a bet that pays $1 if A&B are both true, nothing if A is true and B is false, and pays back your initial payment if A is false (i.e., the bet is "called off" if A is false). After all, if the price you'd be willing to pay for this called off bet is higher, then someone could ask you to buy this called off bet right now, as well as a tiny bet that A is false, and then if A turns out to be true they sell you back a bet on B at the lower price you'd be willing to accept after learning A is true. This plan of betting is one you'd be willing to commit to, but it would guarantee that you lose money. There's a converse plan of betting you'd be willing to commit to that would guarantee that you lose money if the price you'd be willing to pay for the called off bet is lower than the price you'd be willing to pay after learning that A is true. The only way to avoid the guaranteed loss is if the price you're willing to pay for the called off bet is precisely equal to the price you'd be willing to pay for a bet on B after learning A.
But a bet on B that is called off if A is false is precisely equal to a sum of three bets that you're willing to make right now - and we can check that if your prices don't precisely satisfy the equation P(B|A)P(A)=P(A&B), then there's a set of bets you're willing to make that collectively guarantee that you'll lose money.
There's nothing objectively right about the posterior P(B|A), just that if you are currently committed to betting on A at P(A), and you're currently committed to betting on A&B at P(A&B), then you better be committed to updating your price for bets on B to P(B|A) if you learn A (and nothing else) or else your commitments are self-undermining.
All that there is for the Bayesian is consistency of commitment. (I think Scott and some others want to say that some commitments are objectively better than others, such as the commitments of Samotsvety, but I say that we can only understand this by saying that Samotsvety is very skilled at coming up with commitments that work out, and not that they are getting the objectively right commitments.)
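A bare-bones numeric version of the consistency condition in that argument (the contract prices are invented): the conditional price is pinned down by the two unconditional commitments, and nothing more.

```python
# Prices you're committed to paying for $1 contracts (made-up numbers):
price_A       = 0.40    # "$1 if A"
price_A_and_B = 0.10    # "$1 if A and B"

# The only price for "$1 if B, once A is learned" that avoids a sure-loss book:
price_B_given_A = price_A_and_B / price_A
print(f"P(B|A) = {price_B_given_A:.2f}")    # 0.25

# Check of the identity the argument relies on, P(B|A) * P(A) = P(A & B):
assert abs(price_B_given_A * price_A - price_A_and_B) < 1e-12
```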
I do not think that 5% to reach Mars could ever be interpreted as a frequentist probability.
If you are one of the mice which run the simulation that is Earth, and decide to copy it 1000 times and count in how many of the instances the humans reach Mars by 2050, then you can determine a frequentist probability.
If you are living on Earth, you would have to be very confused about frequentism to think that such a prediction could ever be a frequentist probability.
>If you read through this article you'll see that probability statements drift between statements of degree of belief and actual frequentist interpretations. _It's_ _just_ _inevitable_.
I wonder if there's a bell curve relationship of "how much you care about a thing" versus "how accurately you can make predictions about that thing". E.g. do a football teams' biggest fans predict their outcomes more accurately or less accurately than non-supporters? I would guess that the superfans would be less accurate.
If that's the case, "Person X has spent loads of time thinking about this question" may be a reason to weigh their opinion less than that of a generally well-calibrated person who has considered the question more briefly.
I think you're conflating bias vs "spent loads of time thinking about this question" as the cause of bad predictions. The latter group includes all experts and scientists studying the question, and probably most of the best predictions. It also includes the most biased people who apply no rational reasoning to the question and have the worst predictions. You're better off considering bias and expertise separately than just grouping them as people who spent a lot of time thinking about something.
Me too. But you still have to consider the two factors separately. You can't just reason that since physicists have spent a lot of time thinking about physics, they're probably biased and shouldn't be trusted about physics.
Mentioning Yoshua Bengio attracts typos:
"Yoshua Bengio said the probability of AI causing a global catastrophe everybody is 20%"
"Yosuha Bengio thinks there’s 20% chance of AI catastrophe"
So it seems, indeed.
> you’re just forcing them to to say unclear things like “well, it’s a little likely, but not super likely, but not . . . no! back up! More likely than that!”, and confusing everyone for no possible gain.
There's something more to that than meets the eye. When you see a number like "95.436," you're expecting that the number of digits printed to represent the precision of the measurement or calculation - that the 6 at the end, means something. In conflict with that is the fact that one significant digit is too many. 20%? 30%? Would anyone stake much on the difference?
That's why (in an alternate world where weird new systems were acceptable for twitter dialouge), writing probabilities in decimal binary makes more sense. 0b0.01 express 25% with no false implication that it's not 26%. Now, nobody will learn binary just for this, but if you read it from right to left, it says "not likely (0), but it might happen (1)." 0b0.0101 would be, "Not likely, but it might happen, but it's less likely than that might be taken to imply, but I can see no reason why it could not come to pass." That would be acceptable with a transition to writing normal decimal percentages after three binary digits, when the least significant figure fell below 1/10.
I like this idea. And you can add more zeroes on the end to convey additional precision! (I was going to initially write here about an alternate way of doing this that is even closer to the original suggestion, but then I realized it didn't have that property, so oh well. Of course one can always manually specificy precision, but...)
but isn't this against the theory? if we were mathematically perfect beings we'd have every probability to very high precision regardless of the "amount of evidence". before reading this post I was like hell yea probabilities rock, but now I'm confused by the precision thing.
I guess we might not know until we figure out the laws of resource-constrained optimal reasoning 😔
Unrelated nitpick, but we already know quite a lot about the laws of resource-constrained optimal reasoning, for example AIXI-tl, or logical induction. It's not the end-all solution for human thinking, but I don't think anyone is working on resource-constrained reasoning on the hope of making humans think like that, because humans are hardcodedly dumb about some things.
Is there a tl;dr; for what resource-constrained reasoning says about how many digits to transmit to describe a measurement with some roughly known uncertainty? My knee-jerk reaction is to think of the measurement as having maybe a gaussian with sort-of known width centered on a mean value, and that reporting more and more digits for the mean value is moving a gaussian with a rounded mean closer and closer to distribution known to the transmitter.
Is there a nice model for the cost of the error of the rounding, for rounding errors small compared to the uncertainty? I can imagine a bunch of plausible metrics, but don't know if there are generally accepted ones. I assume that the cost of the rounding error is going to go down exponentially with the number of digits of the mean transmitted, but, for reasonable metrics, is it linear in the rounding error? Quadratic? Something else?
If you think your belief is going to change up or down by 5% on further reflection, that's your precision estimate. Rounding error can be propagated through the normal techniques for error propagation (see any source on scientific calculation).
There is no rule or formula for the precision of resource-constrained reasoning, because you aren't guaranteed to order the terms in the process of deliberation from greatest to smallest. Instead, I use repeated experiments as my example of a belief you're expecting to change within known bounds in the future, to show why most probabilities have limited precision.
Many Thanks!
>Instead, I use repeated experiments as my example of a belief you're expecting to change within known bounds in the future, to show why most probabilities have limited precision.
Sure, that makes sense.
>Rounding error can be propagated through the normal techniques for error propagation (see any source on scientific calculation).
True. Basically propagating through the derivatives of whatever downstream calculation consumes the probability distribution estimate... For the case of a bet, I _think_ this comes down to an expected cost per bet (against an opponent who has precisely calibrated probabilities) that is the value of the bet times the difference between the rounded mean and the actual mean. Is that it?
If you are trying to figure out whether a coin is fair, the average number of heads per flip among a large number of experimental trials serves as your best estimate of its bias towards heads. Although you have that estimate to an infinite number of digits of precision, your estimate is guaranteed to change as soon as you flip another coin. That means the "infinite precision of belief," although you technically have it, is kind of pointless.
To put it another way, if you expect the exact probabilistic statement of your beliefs to change as expected but unpredictable-in-the-specifics new information comes in, such as further measurements in the presence of noise, there's no point in printing the estimate past the number of digits that you expect to stay the same.
Here's a way to interpret that "infinite precision of belief": if you bet according to different odds than precisely that estimator, you'll lose money on average. In that sense, the precision is useful however far you compute it, losing any of that precision will lose you some ability to make decisions.
Your conclusion about forgoing the precision that is guaranteed to be inexact is wrong. Consider this edge case: a coin will be flipped that is perfectly biased; it either always comes up heads or always tails, and you have no information about which. The max-entropy guess in that situation is 1:1 heads or tails, with no precision at all (your next guess is guaranteed to be either 1:0 or 0:1). Nonetheless, this guess still allows you to make bets on the current flip, whereas you'd just refuse any bet if you followed your own advice.
> losing any of that precision will lose you some ability to make decisions.
The amount of money you'd lose through opportunity cost in a betting game like that decreases exponentially with the number of digits of precision you're using. To quote one author whose opinions on the subject agree with mine,
"That means the 'infinite precision of belief,' although you technically have it, is kind of pointless."
;)
Compare this situation with the issue of reporting a length of a rod that you found to be 2.015mm, 2.051mm, and 2.068mm after three consecutive measurements. I personally would not write an average to four digits of precision.
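To make that concrete, here is a quick back-of-the-envelope check using those three readings (the arithmetic is mine, the data are from the comment above):

```python
# The three rod measurements from the comment above, in mm. The scatter
# between readings already makes the third decimal place unreliable, so
# quoting the mean to four digits overstates the precision of the result.
import math
import statistics

readings = [2.015, 2.051, 2.068]
mean = statistics.mean(readings)
sem = statistics.stdev(readings) / math.sqrt(len(readings))  # std. error of the mean
print(f"{mean:.4f} +/- {sem:.4f} mm")        # ~2.0447 +/- 0.0156
print(f"report as {mean:.2f} +/- {sem:.2f} mm")  # 2.04 +/- 0.02
```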
Wouldn't it be best to assign a range to represent uncertainty? Or give error bars?
So, for a generic risk you could say something like 6(-2/+5)% chance of X outcome occurring.
Yes, and the next step is to give a probability distribution.
I'm wondering how to interpret a range or distribution for a single future event probability. My P(heads) for a future flip of a fair coin, and for a coin with unknown fairness, would both be 0.5. In both cases I have complete uncertainty of the outcome. Any evidence favoring one outcome or the other would shift my probability towards 0 or 1. Even knowing all parameters of the random process that determines some binary outcome, shouldn't I just pick the expected value to maximize accuracy? In other words, what kind of uncertainty isn't already expressed by the probability?
It's epistemic vs aleatory uncertainty. The way the coin spins in mid air is aleatory i.e. "true random", while the way it's weighted is a fact that you theoretically could know, but you don't. The distribution should represent your epistemic uncertainty (state of knowledge) about the true likelihood of the coin coming up heads. You can improve on the epistemic part by learning more.
Sometimes it gets tough to define a clear line between the two - maybe Laplace's demon could tell you in advance which way the fair coin will go. But in many practical situations you can separate them into "things I, or somebody, might be able to learn" and "things that are so chaotic and unpredictable that they are best modeled as aleatory."
Epistemic vs aleatory is just a fancy way of saying Bayesian vs frequentist, no? Frequentists only measure aleatory uncertainty; Bayesian probability allows for both aleatory and epistemic.
Hmmm... frequentists certainly acknowledge epistemic uncertainty. I guess they're sometimes shy about quantifying it. But when you say p < 0.05, that's a statement about your epistemic uncertainty (if not quite as direct as giving a beta distribution).
It's the probability that you will encounter relevant new information.
You could read it as "My probability is 0.5, and if I did a lot of research I predict with 80% likelihood that my probability would still lie in the range 0.499--0.501," whereas for the coin you suspect to be weighted that range might be 0.9--0.1 instead.
Small error bars mean you predict that you've saturated your evidence; large error bars mean you predict that you could very reasonably change your estimate if you put more effort into it. With a coin that I have personally tested extensively, my probability is 0.5 and I would be *shocked* if I ever changed my mind, whereas if a magician says "this is my trick coin" my probability might be 0.5 but I'm pretty sure it won't be five minutes from now.
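One conventional way to make those error bars concrete (a sketch under my own assumptions, not something from the thread) is to put a Beta distribution over the coin's heads-rate: both priors below have mean 0.5, but very different widths, matching the "tested extensively" and "magician's trick coin" cases.

```python
# Two Beta distributions over a coin's heads-rate, both centered on 0.5.
# The central 80% interval plays the role of the "range my probability
# would still lie in after a lot of research" from the comment above.
import random

def central_interval(alpha, beta, mass=0.80, samples=100_000):
    draws = sorted(random.betavariate(alpha, beta) for _ in range(samples))
    lo = draws[int(samples * (1 - mass) / 2)]
    hi = draws[int(samples * (1 + mass) / 2)]
    return round(lo, 3), round(hi, 3)

print(central_interval(5000, 5000))  # extensively tested coin: ~(0.494, 0.506)
print(central_interval(2, 2))        # magician's trick coin:  ~(0.20, 0.80)
```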
Single future event probabilities, in the context of events that are unrelated to anything you can learn more about during the time before the event, are the cases where the meaning of "uncertain probability" is less clear. That is why rationalists, who prioritize thinking about AI apocalypses and the existence of God, will tell you that "uncertain probability" doesn't mean anything.
However in science, the expectation that your belief will change in the future is the rule, not the exception. You don't know which way it will change, but if you're aware of the precision of your experiments so far you'll be able to estimate by how much it's likely to change. That's what an "uncertain probability" is.
This is the natural-language way to interpret probabilities, and it's correct. If you say you found that half of people are Democrats, it means something different than saying you found 50.129% of people to be Democrats.
Yet it's subject to abuse, especially against those with less knowledge of statistics, math, or how people lie. If my study finds that 16.67% of people go to a party store on a Sunday, it's not obvious to everyone that my study likely had only six people in it.
There are at least three kinds of lies: lies, damned lies, and statistics.
Why not “1/4”?
Because it's hard to pronounce 1/4 with your tongue in your cheek. ;-)
A central issue when discussing significant digits is the sigmoidal behaviour, eg the difference between 1% and 2% is comparable to the difference between 98% and 99%, but NOT the same as the difference between 51% and 52%. So arguments about significant digits in [0, 1] probabilities are not well-founded. If you do a log transformation you can discuss significant digits in a sensible way.
What would I search for to get more information on that sigmoidal behavior as it applies to probabilities? I've noticed the issue myself, but don't know what to look for to find discussion of it. The Wikipedia page for 'Significant figures' doesn't (on a very quick read) touch on the topic.
Try looking up the deciban, the unit for that measure: https://en.m.wikipedia.org/wiki/Hartley_(unit)
Ah, yeah, that does seem like a good starting point, thanks. For anyone else who's interested, this short article is good:
http://rationalnumbers.james-kay.com/?p=306
Many Thanks!
This has been on my mind recently, especially when staring at tables of LLM benchmarks. The difference between 90% and 91% is significantly larger than the difference between 60% and 61%.
I've been mentally transforming probabilities into log(p/(1-p)), and just now noticed from the other comments that this actually has a name, "log odds". Swank.
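A minimal illustration of that transform (numbers chosen by me), showing why equal-looking gaps on the probability scale are very unequal in log-odds:

```python
# Log-odds ("logit") transform: log(p / (1 - p)). Divide by log(10) for
# hartleys/bans, as in the deciban link above.
import math

def logit(p):
    return math.log(p / (1 - p))

for a, b in [(0.60, 0.61), (0.90, 0.91), (0.98, 0.99)]:
    print(f"{a:.2f} -> {b:.2f}: delta log-odds = {logit(b) - logit(a):.3f}")
# 0.60 -> 0.61 moves you ~0.04 in log-odds, 0.90 -> 0.91 about 0.12,
# and 0.98 -> 0.99 about 0.70 -- roughly the same move as 0.50 -> 0.67.
```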
Why are you adding this prefix “0b0” to the notation? If you want a prefix that indicates it’s not decimal, why not use something more transparent, like “bin” or even “binary” or “b2” (for “base 2”)?
That notation is pretty standard in programming languages. I do object to this being called "decimal binary" though. I'm not sure what exactly to call it, but not that. Maybe "binary fractions".
I think "binary floating-point" is probably the least confusing term.
It's not actually floating point though. That'd be binary scientific notation, like 0b1.011e-1100.
I kind of like it in principle, but...
— Why not use 25% then? That's surely how everyone would actually mentally translate it and (think of/use) it: "ah, he means a 25% chance."
— Hold on, also I realize I'm not sure how ².01 implies precision any less than 25%: in either case, one could say "roughly" or else be interpreted as meaning "precisely this quantity."
— Per Scott's original post, 23% often *is*, in fact, just precise enough (i.e., is used in a way meaningfully distinct from 22% and 24%, such that either of those would be a less-useful datum).
— [ — Relatedly: Contra an aside in your post, one sigfig is /certainly/ NOT too many: 20% vs 30% — 1/5 vs roughly 1/3 — is a distinction we can all intuitively grasp and all make use of IRL, surely...!]
— And hey, hold on a second time: no one uses "94.284" or whatever, anyway! This is solving a non-existent problem!
-------------------------
— Not that important, and perhaps I just misread you, but the English interpretation you give of ².0101 implies (to my mind) an event /less likely/ than ².01 — (from "not likely but maybe" to "not likely but maybe but more not likely than it seems even but technically possible") — but ².0101 ought actually be read as *more* sure, no? (25% vs 31.25%)
— ...Actually, I'm sorry my friend, it IS a neat idea but the more I think about those "English translations" you gave the more I hate them. I wouldn't know WTF someone was really getting at with either one of those, if not for the converted "oh he means 25%" floatin' around in my head...
> Per Scott's original post, 23% often *is*, in fact, just precise enough
I strongly object to your use of the term "often." I would accept "occasionally" or "in certain circumstances"
(Funnily enough, the difference between "occasionally" and "in certain circumstances" is what they imply about meta-probability. The first indicates true randomness, the second indicates certainty but only once you obtain more information)
I intuitively agree that any real-world probability estimate will have a certain finite level of precision, but I'm having trouble imagining what that actually means formally. Normally to work out what level of precision is appropriate, you estimate the probability distribution of the true value and how much that varies, but with a probability, if you have a probability distribution on probabilities, you just integrate it back to a single probability.
One case where having a probability distribution on probabilities is appropriate is as an answer to "What probability would X assign to outcome Y, if they knew the answer to question Z?" (where the person giving this probability distribution does not already know the answer to Z, and X and the person giving the meta-probabilities may be the same person). If we set Z to something along the lines of "What are the facts about the matter at hand that a typical person (or specifically the person I'm talking to) already knows?" or "What are all the facts currently knowable about this?", then the amount of variation in the meta-probability distribution gives an indication of how much useful information the probability (which is the expectation of the meta-probability distribution) conveys. I'm not sure to what extent this lines up with the intuitive idea of the precision of a probability though.
I was thinking something vaguely along these lines while reading the post. It seems like the intuitive thing that people are trying to extract from the number of digits in a probability is "If I took the time to fully understand your reasoning, how likely is it that I'd change my mind?"
In your notation, I think that would be something like "What is the probability that there is a relevant question Z to which you know the answer and I do not?"
It is really easy to understand what the finite number of digits means if you think about how the probability changes with additional measurements. If you expect the parameter to change by 1% up or down after you learn a new fact, that's the precision of your probability. For example, continually rolling a loaded die to figure out its average value gives an estimate that converges to the right answer at a predictable rate. At any point in the experiment, you can calculate how closely you've converged on the rate of rolling a 6, with a 95% confidence interval.
It's only difficult to see this when you're thinking about questions that have no streams of new information to help you answer them - like the existence of God, or the number of aliens in the galaxy.
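A small simulation of that loaded-die example (the die's weights below are made up): the running estimate of P(rolling a 6) comes with a confidence half-width that shrinks at a predictable ~1/sqrt(n) rate, and that half-width is what tells you how many digits of the estimate are worth reporting.

```python
# Roll a hypothetical loaded die and track the estimate of P(6) along with
# a 95% confidence half-width; digits smaller than that half-width are
# expected to change with more rolls, so they aren't worth reporting yet.
import math
import random

weights = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]  # assumed bias toward 6
faces = [1, 2, 3, 4, 5, 6]

sixes = 0
for n in range(1, 100_001):
    sixes += random.choices(faces, weights=weights)[0] == 6
    if n in (100, 1_000, 10_000, 100_000):
        p_hat = sixes / n
        half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
        print(f"n={n:>6}  estimate={p_hat:.4f}  +/- {half_width:.4f}")
```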
I like Scott's wording of "lightly held" probabilities. I think this matches what you are describing about the sensitivity of a probability estimate to the answer of an as-yet unanswered question Z.
Okay, hear me out: only write probabilities as odds ratios (or fractions if you prefer), and the number of digits is the number of Hartleys/Bans of information; you have to choose the best approximation available with the number of digits you're willing to venture.
Less goofy answer: Name the payout ratios at which you'd be willing to take each particular side of a small bet on the event. The further apart they are, the less information you're claiming to have.
The idea of separate bid and ask prices is a very good way to communicate this concept to finance people, thanks for that.
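A tiny sketch of that bid/ask framing (prices are hypothetical), for a contract that pays $1 if the event happens: buying below your probability and selling above it both have non-negative expected value, so the pair of quotes brackets the probability you're actually claiming, and the width of the spread is how little information you're committing to.

```python
# Bid/ask quotes on a $1-payoff contract, read as a probability interval.
def ev_buy(p, price):
    return p - price      # pay `price` now, receive $1 with probability p

def ev_sell(p, price):
    return price - p      # receive `price` now, pay $1 with probability p

p = 0.55                  # your actual degree of belief
bid, ask = 0.40, 0.70     # quotes you're willing to post
print(ev_buy(p, bid), ev_sell(p, ask))  # both positive: the quotes are safe
# A wide spread like (0.40, 0.70) claims far less information than (0.54, 0.56).
```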
>When you see a number like "95.436," you're expecting that the number of digits printed to represent the precision of the measurement or calculation - that the 6 at the end, means something. In conflict with that is the fact that one significant digit is too many.
Ok, though isn't this question orthogonal to whether the number represents probabilities?
This sounds more like a general question of whether to represent uncertainty in some vanilla measurement (say of the weight of an object) with the number of digits of precision or guard digits plus an explicit statement of the uncertainty. E.g. if someone has measured the weight of an object 1000 times on a scale on a vibrating table, and got a nice gaussian distribution, and reported the mean of the distribution as 3.791 +- 0.023 pounds (1 sigma (after using 1/sqrt(N))), it might be marginally more useful than reporting 3.79 +- 0.02 if the cost of the error from using the rounded distribution exceeds the cost of reporting the extra digits.
Yes, this is exactly the same. In your example you are measuring a mass, in my examples you're measuring the parameter of a Bernoulli distribution. For practical reasons, there's always going to be a limited number of digits that it's worth telling someone when communicating your belief about the most likely value of a hidden parameter.
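For completeness, the Bernoulli version of that calculation (the counts are hypothetical): the standard error of the estimated parameter plays the same role as the +-0.023 on the mass, and it caps the number of digits worth communicating.

```python
# Standard error of an estimated Bernoulli parameter from hypothetical data.
import math

heads, flips = 5_217, 10_000
p_hat = heads / flips
se = math.sqrt(p_hat * (1 - p_hat) / flips)
print(f"p = {p_hat:.4f} +/- {se:.4f}")  # 0.5217 +/- 0.0050
# With ~10,000 flips the third decimal place is already uncertain, so
# reporting 0.5217000... would claim precision the data doesn't have.
```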
Many Thanks!
This is one of those societal problems whose root is miscommunication. And frankly it's less a problem than a fact of life. I remember Trevor Noah grilling Nate Silver about how Trump could have won the presidency when Silver had predicted that Trump had only a 1/3 chance of winning. It was hilarious in some sense. Now this situation is the reverse of what Scott is describing, in that the person using the probability is using it accurately, but the dilemma is the same: lack of clear communication.
Lack of clear/logical thinking on Trevor Noah's side.
Yes, but most people think of probability like that. They think that a probability below 50% equates to an event being virtually impossible. It's like how many scientists make stupid comments on economics without understanding its terms.
Most people ... . Most people - outside Korea and Singapore - cannot do basic algebra (TIMSS). Most people are just no good with stochastics in new contexts. Most journalists suck at statistics. Many non-economists do not get most concepts of macro/micro - at least not without working on it. That does not make the communication of economists or mathematicians or Nate Silver less clear. 37+73 is 110. Clear. Even if my cat does not understand. - Granted, Nate on TV could have said: "Likely Hillary, but Trump has a real chance." - more adapted to IQs below 100 (not his general audience!). Clearer? Nope.
"more adapted to IQs below 100 (not his general audience!)"
Huh, have you seen the comments section on his substack? It's an absolute cesspool. I don't think I've read another substack with such a high proportion of morons and/or trolls in the comments (though I haven't read many).
I did and agree. He is not writing those comments, is he? - Writing: "who will win: Trump or Biden" will attract morons/trolls. Honey attracts flies just as horseshit does. - MRU comments are mostly too bad to read either.
I didn't blame him for the comments (though I assume he's publically decided not to moderate them?), I was responding to "his general audience".
I agree that the comments are a cesspool, but as far as I know it's idiocy born of misdirected intellect rather than an actual lack of mental horsepower.
If I see a mistake, oftentimes it's something like "Scott says to take X, Y and Z into account. I am going to ignore that he has addressed Y and Z and claim that his comments in X are incorrect!" I would expect someone dumb to not even comprehend that X was mentioned, much less be able to give a coherent (but extremely terrible) argument for this.
I think it's also partially confounded by the existence of... substack grabbers? I don't know what a good term for this type of person is. But when I see a low-quality comment, without the background that an ACX reader """should""" have, I'll scroll up and see it's a non-regular writing a substack. Which I would guess means that they're sampling from the general substack or internet population.
I have encountered people who use the term "50/50" as an expression of likelihood with no intent of actual numerical content but merely as a rote phrase meaning "unpredictable." On one occasion I asked for a likelihood estimate and was told "50/50," but when I had them count up past occurrences it turned out the ratio was more like 95/5.
I still intuitively feel this way. I know that 40% chance things will happen almost half the time, but I can't help but intuitively feel wronged when my Battle Brothers 40% chance attack doesn't hit.
Charles Babbage said, "On two occasions I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."
Hey sometimes this works now with AI - an LLM figures out what you meant to ask and answers that instead of your dumb typo. Those guys were just 200 years ahead of their time.
Babbage apparently thought the questions [from two MPs, presumably in 1833] were asked in earnest. If so, I think the likeliest explanation is just Clarke's 3rd Law: "Any sufficiently advanced technology is indistinguishable from magic."
I think this is mainly conflict theory hiding behind miscommunication. Because there's no agreed upon standard of using numbers to represent degrees of belief (even the very concept of "degrees of belief" is pretty fraught!), people feel that they have license to dismiss contrary opinions outright and carry on as they were before.
There is an argument to be made that we should stop using probabilities in public communications if people don't understand how they work.
Sometimes you might want to transmit useful information to the non-retarded part of the public even at the cost of riling up the others.
What's the probability of that argument working?
Yes and I suspect this problem gets worse when you start getting to things with 10% probability happening, or things with 90% probability not happening, and so on.
I hesitate to imagine what fraction of the public can deal with independent random variables...
I think the answer to that is to try to raise the sanity waterline.
Also, I am not sure that the issue here is just probability illiteracy.
If Nate had predicted that there is a 1/3 chance of a fair die landing on 1 or 2, nobody would have batted an eye if it came up 2. Point out that the chance of dying in a round of Russian roulette is just 1/6 or so, and almost nobody will be convinced to play, because most people can clearly distinguish a roughly 17% risk from a 0% risk.
Part of the problem is that politics is the mind killer. There was a clear tendency of the left to parse "p(Trump)=0.3" (along with the lower odds given by other polls) as "Trump won't win", which was a very calming thought on the left. I guess if you told a patient that there was a 30% chance that they had cancer, they would also be upset if you revised this to 100% after a biopsy. (I guess physicians know better than to give odds beforehand for that very reason.)
Sooner or later, everyone wants to interpret probability statements using a frequentist approach. So, sure you can say that the probability of reaching Mars is 5% to indicate that you think it's very difficult to do, and you're skeptical that this will happen. But sooner or later that 5% will become the basis for a frequentist calculation.
If you read through this article you'll see that probability statements drift between statements of degree of belief and actual frequentist interpretations. It's just inevitable.
It's also very obscure how to assign numerical probabilities to degrees of belief. For instance, suppose we all agree that there is a low probability that we will travel to Mars by 2050. What's the probability value for that? Is it 5%, 0.1%, or 0.000000001%? How do we adjudicate between those values? And how do I know that your 5% degree of belief represents the same thing as my 5% degree of belief?
Assuming you have no religious or similar objections to gambling, the standard mutually intelligible definition of a 5% degree of belief is a willingness to bet a small fraction of what's in your wallet (for example, one cent if you have $20), at better than 20:1 odds.
It has to be a small fraction because the total value you place on your money is only linear for small changes around a given number of dollars, otherwise it tends to be logarithmic for many.
Exactly
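As a minimal numeric sketch of that operationalization (stakes and odds are hypothetical): for a 5% degree of belief the exactly fair odds are 19:1 against, so taking anything at 20:1 or better has positive expected value for you, while worse odds do not.

```python
# Cash value of "willing to bet at better than ~20:1" for a 5% belief.
def fair_odds_against(p):
    return (1 - p) / p            # odds against the event, given belief p

def ev(p, stake, odds_against):
    # Expected value of staking `stake` on the event at the offered odds.
    return p * stake * odds_against - (1 - p) * stake

p = 0.05
print(fair_odds_against(p))       # 19.0 -> 20:1 or better clears the bar
print(ev(p, 0.01, 25))            # positive: take the bet
print(ev(p, 0.01, 15))            # negative: decline
```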
> the standard mutually intelligible definition of a 5% degree of belief is a willingness to bet a small fraction of what's in your wallet (for example, one cent if you have $20), at better than 20:1 odds.
> It has to be a small fraction because the total value you place on your money is only linear for small changes around a given number of dollars, otherwise it tends to be logarithmic for many.
This isn't right: precisely because the stake is, by definition, only "a small fraction of what's in their wallet," people are willing to place insincere bets. They do so all the time to make their positions appear more certain than they really are; your operationalization is confounded by the fact that visible support has monetary value independently of whether you win or lose. Placing a bet buys you a chance of winning, and it also buys you valuable support.
You can amend this to an anonymous bet.
It still won't work; consider the belief "I will win the lottery."
Fundamentally, it is correct to say that a 5% degree of belief indicates willingness to bet at 20:1 odds in some window over which the bet size is not too large to be survivable and not too small to be worthwhile, but it is not correct to say that willingness to bet indicates a degree of belief (which is what you're saying when you define degree of belief as willingness to bet), and that is particularly the case when you specify that the amount of the bet is trivial.
Sure. Also, I do think that it's reasonable to invest trivial amounts of money in fair lottery tickets given certain utility functions. For example, if a loss is negligible but you value extremely highly the possibility of imminent comfortable retirement. I don't do this myself because I believe that in my part of the world lotteries are rigged and the chance to win really big is actually zero.
> Also, I do think that it's reasonable to invest trivial amounts of money in fair lottery tickets given certain utility functions.
I agree. But this is not compatible with the definition of "degree of belief" offered above; that definition requires that lottery ticket purchasers do not believe the advertised odds are correct.
"I will win the lottery" has a large fantasy component, where people spend time thinking about all the things they could buy with that money.
An anonymous bet with a small fraction of your wealth does not have that fantasy component. "Look at all the things I could buy with this 20 cents I won!" just doesn't do the same thing.
Because it's anonymous you're not pre-committing anything. I suppose some people might brag about making that bet?
This is spot on, and a good illustration of why I believe prediction markets are going to have problems as they scale up in size and importance.
Is it actually a problem for prediction markets, though? People betting at inaccurate odds for emotional (or other) reasons just provides a subsidy for the people who focus on winning, resulting in that market more accurately estimating the true likelihood. Certainly markets can be inaccurate briefly, or markets with very few participants can be inaccurate for longer, but it's pretty easy to look past short-term fluctuations.
Or maybe you're thinking of it being a problem in some different way?
> standard mutually intelligible definition
FWIW this is just the de Finetti perspective, and you could have others, more based around subjective expectations.
Like, I think the reduction works a bunch of the time, but I don't think you can reduce subjective belief to willingness to bet
Unless the enemy has studied his Agrippa, which I have.
Of course, this assumes that people will be willing to bet on an outcome given some odds ratio.
It might well be that some people would not rationally bet on something.
For example, given no further information, I might not be willing to bet on "Alice and Bob will be a couple a month from now" if I have no idea who Alice and Bob are and anticipate that other people in the market have more information. Without knowing more (are Alice and Bob just two randomly selected humans? Do they know each other? Are they dating?) the Knightian uncertainty in that question is just too high.
Thank you! You are making **exactly** my point -- that although people start out by talking about subjective "degrees of belief", sooner or later they will fall into some sort of frequentist interpretation. There can be no purer expression of this than making an argument about betting, because ultimately you are going to have to appeal to some sort of expected value over a long run of trials, which is by definition a frequentist interpretation.
Doesn't it just mean that if we "ran" our timeline 1000 times, I predict we reach Mars in 50 of those timelines?
In other words, it is still frequentist in a sense, but over hypothetical modalities.
Not necessarily. I think the probability that P=NP is about 5%, but I don’t think that if we “ran” the universe that many times it would be true in 5% of them - it would either be true in all or none.
Instead it means that, if I had to do something right now whose outcome depended on whether P equals NP, I would do it if the amount by which it makes things better (if they are equal) is more than 20 times the amount by which it makes things worse (if they are unequal). I need some number or other to coordinate my behavior here and guarantee that I don't create a book of bets that I am collectively guaranteed to lose (like if I paid 50 cents for a bet that gives me 90 cents if horse A wins and also paid 50 cents for a bet that gives me 90 cents if horse A loses). But the number doesn't have to express anything about frequency - I can use it even for a thing that I think is logically certain one way or the other, if I don't know which direction that is.
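A toy version of that decision rule (the payoffs are invented): the 5% is being used purely as a weight on outcomes, with no appeal to frequencies.

```python
# Use a credence p purely as a decision weight: act iff the expected
# upside outweighs the expected downside.
def worth_doing(p, gain_if_true, loss_if_false):
    return p * gain_if_true > (1 - p) * loss_if_false

print(worth_doing(0.05, 100, 4))  # True:  100 > 19 * 4
print(worth_doing(0.05, 100, 6))  # False: 100 < 19 * 6
```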
I think the P=NP example (and the Mars example, if you believe in a deterministic universe) can still be approached this way if we define 'rerunning the timeline' as picking one of the set of possible universes that would produce the evidence we have today.
It can be if you accept logical impossibilities as “one of the set of possible universes that would produce the evidence we have today”.
I'm not sure I follow. If the evidence we have today is insufficient to show that P != NP, how is it logically impossible for universes to exist where we have the same evidence and P does or does not equal NP?
Most people suspect that the mathematical tools we have *are* sufficient to show one of the directions of the P and NP conjecture, it's just that no human has yet figured out how to do it.
Even if it turns out not to be provable, it still either is or isn't, right? Universes where the starting conditions and the laws of physics are different are easy to imagine. Universes where *math* is different just bend my mind in circles. If you changed it to switch whether P=NP, how many other theorems would be affected? Would numbers even work anymore?
Interesting that you should say that. I was thinking about the timelines of possible universes going *forward from the present* — and when it comes to the Mars example, what is the set of possible universes that will *prevent* us from getting to Mars in 2050 vs the set that will *allow* us to get to Mars in 2050? I think we can agree (or maybe not?) that there is an infinity of universes for the former, but a smaller infinity of universes for the latter. After all, the set of universes where the sequence of events happens in the proper order to get us to Mars would be smaller than the set of universes where the sequence of events didn't happen (or couldn't happen). If these were sets of numbers we could create a bijection (a one-to-one correspondence) between their elements. But no such comparison is possible between these two sets, and the only thing we can say with surety is that they don't have the same cardinality. Trying to calculate the proportionality of the two sets would be impossible, so determining the probability of universes where we get to Mars in 2050 vs universes where we don't would be a nonsensical question. I'm not going to die on this hill, though. Feel free to shoot my arguments down. ;-)
I think this is a fascinating objection, thanks.
I've never been any good at reasoning about infinities in this way (Am I so bad at math? No, it's the mathematicians who are wrong!), but I've spotted a better out so excuse me while I take it:
I do disagree that these are infinite sets; I think they're just unfathomably large. If there are 2^(10^80) possible states of the universe one Planck second after any given state, the number of possible histories at time t quickly becomes unrepresentable within our universe. It's a pseudoinfinite number that is not measurably different from infinity in a practical sense, but saves us all of the baggage of an infinite number in the theoretical sense.
If you accept that premise (and I don't blame you if you don't), I believe we're allowed to do the standard statistics stuff like taking a random sample to get an estimate without Cantor rolling in his grave.
I like your counter-argument. But I'll counter your counter with this — if the number of potential histories of our universe going forward is impossible to represent within our universe, then it's also impossible to represent the chances of getting to Mars in 2050 vs not getting to Mars in 2050. Ouch! My brain hurts!
> Not necessarily. I think the probability that P=NP is about 5%, but I don’t think that if we “ran” the universe that many times it would be true in 5% of them - it would either be true in all or none.
I think this is a semantic/conceptual disagreement. I think there are two points where we can tease it apart:
* You're thinking of the world as deterministic, and I'm thinking of it as predicated on randomness to a significant degree. If the future depends on randomness, then it makes no sense to claim it would be true in all or none. Whereas if the future is determined by initial conditions and the laws of nature, then yes, it will be. In which case:
* You can adapt my conceptualisation so that it survives a deterministic world. A determinist would believe that the future is determined by known and unknown determinants - the laws of nature and initial conditions, both known and unknown - but nonetheless fixed. To give x a 5% probability, then, is to say that if, for each "run" of the universe, you filled in the unknown determinants randomly, according to the probability distribution you believe in, you would get event x occurring in 50 of the 1000 runs.
Correct me if I'm mistaken in my assumptions about your belief, but I don't know how else to make sense of your comment.
I think that the P vs NP claim is most likely a logical truth one way or the other, and that no matter how you modify the known or unknown determinants, it will come out the same way in all universes.
If you have some worries about that, just consider the claim that the Goldbach conjecture is true (or false), or the claim that the 1415926th digit of pi is a 5.
I was with you until the last bit Kenny (per Wolfram Alpha, the 1415926th digit of pi is known to be a 7:-))
Isn’t your 5% estimate essentially meaningless here though? Since it is a proposition about a fundamental law and you know it is actually either 100% or 0%. And more importantly no other prediction you make will bear sufficient similarity to this one that grouping them yields any helpful knowledge about future predictions.
Your first point shows that P=NP doesn’t have a *chance* of 5% - either all physically possible paths give P=NP or none of them do.
Your second point shows that a certain kind of frequentist calibrationism isn’t going to make sense of this either.
But Bayesian probability isn’t about either of those (regardless of what Scott says about calibration). Bayesian probability is just a way of governing your actions in light of uncertainty. I won’t make risky choices that come out very badly if P=NP is false, and I won’t make risky choices that come out 20 times as badly if it’s true compared to how well the choice turns out if it’s false. That’s what it means that my credence (Bayesian probability) is 5%. There is nothing that makes one credence “correct” and another “incorrect” - but there are people who have credence-forming policies that generally lead them well and others that generally lead them badly. And the only policy that avoids guaranteed losses is for your credences to satisfy the probability axioms and to update by Bayesian conditionalization on evidence.
The thing I don't get is that if it is not a frequentist probability, how can it make sense to apply Bayes' theorem to update P=NP? Say a highly respected mathematician claims he has a proof of it, then promptly dies. This is supposed to be Bayesian evidence in favour of P=NP. But does it make sense to apply a mathematical update to the chance of P=NP? Surely it is not an event that conforms to a modelable distribution, since, as you say, it is either wrong or right in all universes.
What Bayes' Theorem says is just that P(A|B)P(B)=P(A&B). (Well, people often write it in another form, as P(A|B)=P(B|A)P(A)/P(B), but that's just a trivial consequence of the previous equation holding for all A and B.)
To a Bayesian, P(A) (or P(B), or P(A&B)) just represents the price you'd be willing to pay for a bet that pays $1 if A (or B, or A&B) is true and nothing otherwise (and they assume you'd also be willing to pay scaled up or down amounts for things with scaled up or down goodness of outcome).
Let's say that P(B|A) is the amount you expect to be willing to pay for a bet that pays $1 if B is true and nothing otherwise, in a hypothetical future where you've learned A and nothing else.
The first argument states that this price should be the same amount you'd be willing to pay right now for a bet that pays $1 if A&B are both true, nothing if A is true and B is false, and pays back your initial payment if A is false (i.e., the bet is "called off" if A is false). After all, if the price you'd be willing to pay for this called off bet is higher, then someone could ask you to buy this called off bet right now, as well as a tiny bet that A is false, and then if A turns out to be true they sell you back a bet on B at the lower price you'd be willing to accept after learning A is true. This plan of betting is one you'd be willing to commit to, but it would guarantee that you lose money. There's a converse plan of betting you'd be willing to commit to that would guarantee that you lose money if the price you'd be willing to pay for the called off bet is lower than the price you'd be willing to pay after learning that A is true. The only way to avoid the guaranteed loss is if the price you're willing to pay for the called off bet is precisely equal to the price you'd be willing to pay for a bet on B after learning A.
But a bet on B that is called off if A is false is precisely equal to a sum of three bets that you're willing to make right now - and we can check that if your prices don't precisely satisfy the equation P(B|A)P(A)=P(A&B), then there's a set of bets you're willing to make that collectively guarantee that you'll lose money.
https://plato.stanford.edu/entries/dutch-book/#DiacDutcBookArgu
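For concreteness, here is a numerical sketch of that construction with made-up prices: an agent quotes P(B) = 0.5 and P(A&B) = 0.2 but prices the called-off ("conditional") bet on A given B at 0.6 instead of the coherent 0.2 / 0.5 = 0.4, and a bookie trading only at the agent's own quoted prices locks in a sure loss for the agent.

```python
# Dutch book against incoherent conditional prices (all prices hypothetical).
P_B, P_AB, q = 0.5, 0.2, 0.6      # agent's quotes; coherent q would be 0.4

def agent_net(a_true, b_true):
    net = 0.0
    # 1) Agent buys a conditional bet on A given B at price q, stake 1:
    #    pays $1 if A&B, $0 if B and not A, and refunds q if B is false.
    payoff = 1.0 if (a_true and b_true) else (q if not b_true else 0.0)
    net += payoff - q
    # 2) Agent sells a $1 bet on A&B at its own price P_AB.
    net += P_AB - (1.0 if (a_true and b_true) else 0.0)
    # 3) Agent buys a $1 bet on B with stake 0.6 at its own price P_B.
    net += (0.6 if b_true else 0.0) - 0.6 * P_B
    return net

for a_true in (True, False):
    for b_true in (True, False):
        print(a_true, b_true, round(agent_net(a_true, b_true), 3))
# Every line prints -0.1: the agent loses however A and B turn out.
```

The 0.6 stake on the B bet is just one choice that happens to work here; the general point is that some such combination exists whenever q * P(B) differs from P(A&B).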
There's nothing objectively right about the posterior P(B|A), just that if you are currently committed to betting on B at P(B), and you're currently committed to betting on A&B and P(A&B), then you better be committed to updating your price for bets on B to P(B|A) if you learn A (and nothing else) or else your commitments are self-undermining.
All that there is for the Bayesian is consistency of commitment. (I think Scott and some others want to say that some commitments are objectively better than others, such as the commitments of Samotsvety, but I say that we can only understand this by saying that Samotsvety is very skilled at coming up with commitments that work out, and not that they are getting the objectively right commitments.)
And this is my point -- that statements about "degrees of belief" will inevitably be translated into statements about long-run frequencies.
I do not think that 5% to reach Mars could ever be interpreted as a frequentist probability.
If you are one of the mice which run the simulation that is Earth, and decide to copy it 1000 times and count in how many of the instances the humans reach Mars by 2050, then you can determine a frequentist probability.
If you are living on Earth, you would have to be very confused about frequentism to think that such a prediction could ever be a frequentist probability.
<I just couldn't resist>
>If you read through this article you'll see that probability statements drift between statements of degree of belief and actual frequentist interpretations. _It's_ _just_ _inevitable_.
groan [emphasis added]
</I just couldn't resist>
I wonder if there's a bell curve relationship of "how much you care about a thing" versus "how accurately you can make predictions about that thing". E.g. do a football teams' biggest fans predict their outcomes more accurately or less accurately than non-supporters? I would guess that the superfans would be less accurate.
If that's the case, "Person X has spent loads of time thinking about this question" may be a reason to weigh their opinion less than that of a generally well-calibrated person who has considered the question more briefly.
I think you're conflating bias with "spent loads of time thinking about this question" as the cause of bad predictions. The latter group includes all the experts and scientists studying the question, and probably most of the best predictions. It also includes the most biased people, who apply no rational reasoning to the question and have the worst predictions. You're better off considering bias and expertise separately than just grouping them as people who spent a lot of time thinking about something.
I do think spending lots of time thinking about a subject can contribute to developing a bias about it, though
Me too. But you still have to consider the two factors separately. You can't just reason that since physicists have spent a lot of time thinking about physics, they're probably biased and shouldn't be trusted about physics.