
0% implied chance of UER between 4 and 5% feels disqualifying to me!

Also worth noting that some of the markets (e.g. inflation, sports) listed have deeper and better existing markets than prediction markets (TIPS and sports betting, namely).


In response to "we should cherish people who are often extremely wrong but occasionally right when nobody else is; these people would lose a lot of money": I think it would be easy to make the case that they would be quite profitable (or at least break even), because the payoff for their successes would be significantly larger than their losses. Given that everyone disagrees with them, they should in theory be able to find extremely favorable odds to bet at, so even if they are right only 5% of the time, a payout of 20x or more when they are would cover the losses. Indeed, quite a few investing strategies, in both prediction markets and traditional markets, do just this: take small losses for months on end until some improbable event occurs, then make it all back and more. In many cases this is basically what hedging is as well.
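
The break-even arithmetic here can be checked directly. This is a minimal sketch in Python, assuming unit stakes and reading "20x payout" as the total returned on a winning 1-unit bet, stake included; the function name and the 30x comparison are illustrative only.

```python
def expected_profit_per_bet(win_probability: float, gross_payout_multiple: float) -> float:
    """Expected profit on a 1-unit stake.

    gross_payout_multiple is the total a winning 1-unit bet returns
    (stake included), so a "20x payout" is 20.0.
    """
    return win_probability * gross_payout_multiple - 1.0


# Right 5% of the time at 20x: the break-even case.
print(expected_profit_per_bet(0.05, 20.0))
# Right 5% of the time at longer odds (hypothetical 30x): profitable.
print(expected_profit_per_bet(0.05, 30.0))
```

Under these assumptions the contrarian only loses money if the odds on offer are shorter than the inverse of their hit rate.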


Not sure if you've seen this before, but for good measure: TipRanks analyzes stock analysts' recommendations. https://www.tipranks.com/


The Brier score, Brier skill score, and concept of climatologic forecasts having accuracy but no skill are all relevant here.

In essence: suppose you know that it rains 1/5 days in a given place. Then, in the absence of any other information, you should predict 20% chance of rain for a given date. This forecast will be completely accurate but not skillful. Accuracy here is defined as: of all your 20% forecasts, how many days had rain? If it's 20%, you were perfectly accurate. But since the climate in this case is known, your skill as a forecaster depends on how accurately you deviate from the climatologic prediction. (There's also power, which is how close your forecasts are to 0 or 1.)
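
To make the rain example concrete, here is a minimal Python sketch of the Brier score and of the Brier skill score measured against a climatological reference. The ten days of outcomes and the "informed" forecasts are made-up numbers for illustration, and the function names are my own.

```python
from typing import Sequence


def brier_score(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equally sized, non-empty inputs")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


def brier_skill_score(forecasts: Sequence[float], outcomes: Sequence[int],
                      base_rate: float) -> float:
    """1 - BS / BS_reference, where the reference always forecasts the base rate."""
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    return 1.0 - brier_score(forecasts, outcomes) / reference


# Hypothetical ten days with rain on two of them (base rate 20%).
rained = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
climatology = [0.2] * 10
informed = [0.7, 0.1, 0.1, 0.1, 0.1, 0.6, 0.1, 0.1, 0.1, 0.1]

print(brier_skill_score(climatology, rained, 0.2))  # the reference itself: no skill
print(brier_skill_score(informed, rained, 0.2))     # beats climatology: positive skill
```

The skill score is undefined when the reference score is zero (for instance a base rate of 0 with no events), so a real implementation would need to guard against that.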

EU Met Cal used to have a spectacularly good set of articles explaining all of this (at eg https://www.eumetcal.org/resources/ukmeteocal/temp/msgcal/www/english/msg/ver_prob_forec/uos2/), but it seems to be off the web with no Archive. If I can find a working link, I will post.


My half finished prediction market awards extra points for being right when others were wrong and also being first to be right - that was my attempt to adjust for everything anyway.

In light of the above article - what do you (meaning fellow commenters) think of the relevance of those two criteria?


Letting the prediction market see the pundit is actually a fatal flaw. Pundits whose reliability the market accurately assesses will, by sharing their position and justification, make an efficient market better at predicting than they are themselves.

It’s one thing to be able to beat the market; it’s another thing entirely to be able to consistently beat the market while also sharing your reasoning and bets.

Maybe you could compare the prediction market before and after a post by a pundit whose prediction differs from the market's, to see which direction it updates. You could then evaluate whether it more often updated in the direction the question finally resolved, in conjunction with whether the market moved towards or away from the pundit’s estimate.

If the market updates away from the pundit consistently and improves accuracy that way, then they are providing useful analyses but are bad at prediction. If the market updates in a way that reduces accuracy, then the analyses are of negative value, even if the prediction is more accurate than the market.

But there is no way to hold all other things equal.
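
The before/after comparison described in this comment could be tallied roughly as follows. This is a minimal Python sketch: the `PunditPost` record, its field names, and the two sample posts are all hypothetical, and it does nothing about the "all other things equal" problem.

```python
from dataclasses import dataclass


@dataclass
class PunditPost:
    market_before: float    # market probability just before the post
    market_after: float     # market probability shortly after the post
    pundit_estimate: float  # probability the pundit stated
    outcome: int            # 1 if the event happened, 0 otherwise


def moved_toward(before: float, after: float, target: float) -> bool:
    """True if `after` is strictly closer to `target` than `before` was."""
    return abs(after - target) < abs(before - target)


def summarize(posts):
    """Count posts by (direction relative to pundit, effect on accuracy)."""
    tally = {}
    for p in posts:
        direction = ("toward pundit"
                     if moved_toward(p.market_before, p.market_after, p.pundit_estimate)
                     else "away from pundit")
        accuracy = ("more accurate"
                    if moved_toward(p.market_before, p.market_after, p.outcome)
                    else "not more accurate")
        tally[(direction, accuracy)] = tally.get((direction, accuracy), 0) + 1
    return tally


# Two hypothetical posts: the market follows the pundit once and fades them once.
posts = [
    PunditPost(market_before=0.50, market_after=0.60, pundit_estimate=0.80, outcome=1),
    PunditPost(market_before=0.50, market_after=0.40, pundit_estimate=0.80, outcome=0),
]
print(summarize(posts))
```

A pundit whose posts mostly land in the "away from pundit, more accurate" cell would match the "useful analyses but bad at prediction" case.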


Zeynep isn't wrong often. I'd love to see her vs. an aggregate punditry. Have you talked to the Good Judgment folks about running something like this for pundits that were willing to jump in? Have a parallel version for Superforecasters. Keep track of results.


Most of what I read comes from the investing community. On that note, that's where my income comes from too.

Currently, my largest investment is in AT&T. The investment thesis here runs along the line of-

HBO Max will do well.

Even without HBO Max AT&T is fine.

What I want to point out here, is that the bet is two sided.

That is, I'm not saying [HBO good: 70%]. In fact, I have no specific number. It's more like [HBO good > Debt dangerous: >50%]. This same logic can be easily applied to politicians, media, and bureaucrats as well. So, Bush said Saddam had WMD, and that we should go to war to solve this.

The two sides of this prediction are:

Iraq war will prevent the nuclear annihilation of New York.

Iraq war will be quick and easy.

If the first half had been proven true, I would have given Bush lots of points for that, sure. But if only the second half had been proven true, I still would have given Bush lots of points. If Saddam did have WMD that he did plan to use on New York, but our military was crushed in a failed invasion (perhaps because of help from China?), I would have penalized Bush by a ton.

Of course, both halves proved false, and I despise Bush, for that and other reasons, but, anyway, the point is, most policies have an implicit [potential cost]:[potential return]. In investing, it's not actually important to be right about anything. What matters is basically, to not be wrong. At least, the investors I tend to look to for advice, make lots of investments with the potential for profits, and little in the way of potential losses. I think the same can be asked of people like Trump, Fauci, and Tucker Carlson.



Week before last's top comments:

- misha rounds up different forecasting platforms' takes on the Olympics (https://www.metaculus.com/questions/5555/rescheduled-2020-olympics/?invite=tLxPdB#comment-56906)

- There were no terrorist attacks in OECD founder states between 3-Nov-20 and 1-Mar-21. (But Jgalt points out there were attacks on either side) (https://www.metaculus.com/questions/5441/terrorist-attack-in-oecd-founder-state/?invite=tLxPdB#comment-57003)

- F for SN10 (Jgalt) (https://www.metaculus.com/questions/6498/will-sn10-land/?invite=tLxPdB#comment-57038)

- ege_erdil and I look at the distribution for bitcoin's peak in 2021 (https://www.metaculus.com/questions/6666/maximum-price-of-bitcoin-in-2021/?invite=tLxPdB#comment-56828, https://www.metaculus.com/questions/6666/maximum-price-of-bitcoin-in-2021/?invite=tLxPdB#comment-56829)

- ege_erdil calculates the base rate of politicians resigning after accusations of misconduct (https://www.metaculus.com/questions/6693/will-ny-governor-andrew-cuomo-resign-soon/?invite=tLxPdB#comment-57116)


Last week's top comments:

- ThirdEyeOpen's application to be a moderator (https://www.metaculus.com/questions/6805/2021-spring-equinox-moderator-election/?invite=9I6hgw#comment-57728)

- Various meta-points on resolution (https://www.metaculus.com/questions/353/will-someone-born-before-2001-live-to-be-150/?invite=9I6hgw#comment-57513, https://www.metaculus.com/questions/6145/brent-crude-oil-to-exceed-70-in-2021/?invite=9I6hgw#comment-57508, https://www.metaculus.com/questions/5555/rescheduled-2020-olympics/?invite=9I6hgw#comment-57601)

- chudetz shares his docuseries on the BEAR question (https://www.metaculus.com/questions/6087/when-will-ai-understand-i-want-my-hat-back/?invite=9I6hgw#comment-57466)

- Cory points out US semi manufacturing might be on the rise (https://www.metaculus.com/questions/6249/november-2021-production-of-semiconductors/?invite=9I6hgw#comment-57456)

- EvanHarper identifies a source on how other NY officials feel about Cuomo's position (https://www.metaculus.com/questions/6693/will-ny-governor-andrew-cuomo-resign-soon/?invite=9I6hgw#comment-57783)


I caution against getting too far into the predictions game, because it draws attention away from substance -- as do the horse-race elements of election coverage. It's usually not important to predict stuff like Biden's end-of-year public approval rating. What counts is: (1) whether a pundit was on the mark in warning about an under-protected risk (Protect against pandemic, dummy); (2) whether a commentator was farsighted in revealing options that most others were ignorant about (e.g., Farrington Daniels' early book on solar power); and (3) whether an expert revealed hidden facets or complexities in ways that deepened readers'/viewers' thinking (perhaps "Think Like an Economist"). There no doubt are more such major categories, but not many.


I don't know if I view the primary job of a pundit as prediction. I see them as primarily there to provide context and perspective on issues, which prediction should be a part of, but how harshly should we judge bad predictions?

In the Iraqi WMDs example, pundits were making their assertions based on the available evidence that strongly indicated such weapons did exist - the US government, the US intel community, foreign governments/intel communities, Saddam Hussein himself... All widely agreed that Iraq did have WMDs. How often do we have events with such a large amount of misleading info and so little publicly available truth?


I worry about over-reliance on prediction as a metric for assessing the quality of journalism. Even apart from the methodological challenges discussed in the post, I think that success in particularized factual forecasting—while an important skill—is often unrelated to the value of a piece of political writing or analysis. Few of the ideas I find most valuable are susceptible to short-term, isolated verification or falsification. Synthesizing disparate facts into an interesting theory, etc., can rarely be tested by a “wagerable” near-term market result, and I am skeptical that facility in the one is particularly relevant to the usefulness of the other.


Related to Decius's objection below (and alluded to in the post): over the long term, properly functioning prediction markets should always beat any individual pundit. The easy way to see this is: if the prediction market isn't beating the pundit, then somebody can make a ton of money just by doing whatever the pundit says. Now the prediction market does as well as the pundit.

More generally, pundits (and every other part of the "discourse") seem like an important component in the machine that makes prediction markets work. This is good, but it makes "judge pundits by how they stack up against the prediction market" sort of circular. It's not clear that any direct comparison will be stable in the long run. (Is this another instance of Goodhart's law?)

But, I think we can quantify the value of pundits even when they don't beat the market. As a very coarse measure: does a pundit writing about topic X cause the market to shift? If every time Yglesias writes a post, the relevant prediction markets suddenly change by 5 points, that means that Yglesias's posts have some serious information content. That's valuable even if Matt couldn't correctly predict whether or not the sun would rise tomorrow.


"There would already be good prediction markets in which ones will or won't pan out. There would be a few teams, people, and companies who are known for being great at trading in them, and who have expertise in knowing which people are real experts who should be consulted."

Assuming this occurs, these people's careers would be made or broken by the quality of their predictions. Doesn't this result in a scenario where there's a perverse incentive to influence any outcomes they predict?

For example if a pundit predicts an antidepressant will not be approved, then raises their concerns about problems identified by studies on the drug with friends involved in the FDA approval process (or the public to generate pressure) to make it more likely to be denied?

It seems this would be the easiest way to "beat the market" but would generally have negative real-world consequences.


Wait, so there's a 20-year fascination with prediction markets because it took a while to sink in on everyone that a bunch of people, for their own various and incompatible reasons, lied the country into a war? That, of all things, the lesson out of this - "humans hurt us once, so let's build economic robots and trust those instead"?


Isn't there value in punditry that makes predictions with rational explanations? Even if they can't be validated? Or is it the rationalizing itself that would be more useful with validation?


I don't know if you're saying Yglesias' posts are paywalled just to warn readers or because you genuinely can't read them yourself. If it's the latter ... come on, Substack! Obviously Substack should give all its authors free subscriptions to each other, so they'll link to each other and do free cross promotion.

Just buy subscriptions for them if you didn't think to set it up ahead of time. Maybe restrict it to authors with X+ followers if you're worried about people making their own substacks just to read for free and not posting anything. Or maybe don't? And let the lure of free subscriptions entice people to overcome the setup hurdle, after which they'll feel compelled to post and promote.


I think we should distinguish between pundits who make good predictions and pundits who make accurate statements about the world as it is (in part based on what they are pontificating about). Other than punishing people who predict 12 out of the last 10 recessions and the like (speculating on the impact of bills), most of the issue with punditry is people making factually incorrect or bullshit claims (using bullshit to mean that there isn't even a desire to care about truth). In the case of the Iraq war there are lots of predictions that would have been good to make (if we invade, what is the chance Iraq is a democracy within X years), but I don't think those would be as valuable as holding people accountable for spreading obvious lies like links between Al Qaeda and Saddam.


I think my 'ideal world' fantasy is a little different from what Scott describes. I'd rather see a meter on the side of the screen next to his blog - one that looks like the audio output graphics on old stereos. Give it a column for politics, economics, psychiatry, etc. Then give me the overall grade (weighted more heavily for more recent/high-impact predictions) with a link to the specific predictions.

I like Scott's idea of placing the predictions on each article/post itself, but my concern is that nobody would ever go back and judge whether the predictions made actually panned out. I'd also be concerned that the response to this kind of accountability would be for pundits to stop making testable predictions, even as they continued to make sweeping overconfident statements. My sense is that given how difficult it is to truly make predictions based on current trends, most pundits would try to game the system by making low-impact, high probability predictions as Scott discussed to pad their numbers.

What's that heuristic about how when you turn a measurement into a quota it corrupts the measurement? Still, it would be nice to have a kind of FICO score for pundits. Then I wouldn't have to do the hard work of underwriting their predictions.


As has been pointed out around here a lot, for the minority of cases for which there's a prediction market at all, they can be pretty crappy. Some coauthors and I are working on the problem of "this prediction market says X will happen with probability P; can we quantify the extent to which that's garbage?" Alternatively: if we have no time or energy to figure out if "the market" is being idiotic or who we might trust to be less idiotic, can we combine all the bid-ask spreads in the various markets and determine the one canonical number that we can agree is the Official Market Probability? Things like http://electionbettingodds.com/ try to do this but they're definitely doing it incorrectly, just averaging all the bid/asks across markets indiscriminately.

This is not my main research area these days and I don't expect to build a better prediction market aggregator/assessor any time soon but I'd love to discuss this problem with people who may be inspired to build such tools.


FWIW, the stuff about "Apple silicon" is likely referring to how Apple is transitioning away from Intel processors to ones they make themselves: https://www.cnbc.com/2020/11/10/why-apple-is-breaking-a-15-year-partnership-with-intel-on-its-macs-.html


This has always existed for some classes of journalism. Jim Cramer's stock picks are all there for posterity to see forever. Sports journalism is the most obvious example. Every preseason, midseason, postseason, and big playoff game involves all of the various network and publication talking heads making public selections of just about everything you can bet on. In sports journalism, it's even pretty common for groups of pundits to keep a running tally each year of who was more right and who was more wrong, usually for no actual stakes, just bragging rights, but it's still a scored record. And they all stick to what they claim is their area of expertise.

So the model for something like this definitely exists.

Of course, as many others have pointed out, I don't think making maximally accurate predictions is really the purpose or value of most journalism. We'd do well with more detailed investigative work and pure research and less horse race crap.


> The current interest in forecasting grew out of Iraq-War-era exasperation with the pundit class. <

Were they mis-predicting, or were they simply lying, aka going along to get along?

The powers-that-be in the US had declared that WMD existed, which appears to be sufficient reason for many people to make the same declaration - often deceiving themselves rather than consciously lying.


I feel like there’s a way to use machine learning to parse out all the predictions (you’ll need to distinguish between real predictions and sarcasm) someone has made over a period and then process a bunch of news stories and spit out an accuracy percentage.

I can imagine a lot of ways for that idea to fail but I am curious if anyone is trying it out


The true Chad move is never to make predictions about specific events but always to make confident statements about the arc of history.


Seems to me that predictions are only useful if you are going to be placing a bet (investing). If you don't have a vested interest in the outcome, predictions are pointless, and the questions you have listed are especially pointless to me. "Are things going to get better for the majority of people in the USA?" might be a good question, but I am pretty sure the answer is no. A better question is: does anyone have any ideas on how things might be improved? Of course good ideas are a dime a dozen; you have to be able to sell them, and I don't see anybody selling any good ideas in the current climate.


There is a lot of discussion of reasons that it is unfair to compare experts to prediction markets. I agree with this sentiment and I think that the solution is to hold experts to a standard of being well calibrated rather than a standard of trying to beat the market. There are several reasons I think this:

1. Trying to beat a well functioning market is essentially impossible.

2. You get penalized for admitting you don't know something, when in reality we should be rewarding experts for this behavior.

3. If someone makes a very confident prediction, knowing that their very confident predictions almost always come true (i.e. they are well calibrated) is important. Knowing that they are generally able to be as confident as the market is less important. Beating the market requires being well calibrated AND being very confident. Being more confident makes your predictions more useful in general, but it is irrelevant to evaluating any particular prediction.

There is a caveat with this approach: you should only trust a prediction if its predictor is well calibrated on similar predictions, where similar means similar in confidence, topic, and difficulty. Assuming this condition is met, whether or not a predictor beats the market consistently should not be relevant.
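
One way to check the "well calibrated" standard is to bucket a forecaster's predictions by stated confidence and compare each bucket with how often those predictions came true. This is a minimal Python sketch; the function name, the ten-bucket default, and the sample record are my own assumptions, and it ignores the topic and difficulty dimensions mentioned above.

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes, n_buckets=10):
    """Group forecasts into confidence buckets and report observed hit rates.

    Returns {bucket_lower_edge_percent: (mean_forecast, observed_frequency, count)}.
    """
    buckets = defaultdict(list)
    for f, o in zip(forecasts, outcomes):
        # Small epsilon guards against float rounding; 1.0 goes in the top bucket.
        index = min(int(f * n_buckets + 1e-9), n_buckets - 1)
        buckets[index].append((f, o))
    table = {}
    for index in sorted(buckets):
        pairs = buckets[index]
        mean_forecast = sum(f for f, _ in pairs) / len(pairs)
        observed = sum(o for _, o in pairs) / len(pairs)
        table[100 * index // n_buckets] = (mean_forecast, observed, len(pairs))
    return table


# Hypothetical record: ten 90% calls, nine of which came true.
table = calibration_table([0.9] * 10, [1] * 9 + [0])
for lower_edge, (mean_forecast, observed, count) in table.items():
    print(lower_edge, mean_forecast, observed, count)
```

A well-calibrated forecaster has observed frequencies close to the mean forecast in every bucket that holds enough predictions to judge.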


22. United States rejoins JCPOA and Iran resumes compliance (80%)

Resume compliance? They didn't comply from the start. German intelligence said after the deal was signed that Iran was secretly seeking nuclear and missile material from German companies. And in 2018 Israel released a cache of Iranian files that showed they had a clandestine nuclear program.


"In my ideal world, it's silly for random psychiatrists to be speculating on psychiatry papers. There would already be good prediction markets in which ones will or won't pan out. There would be a few teams, people, and companies who are known for being great at trading in them, and who have expertise in knowing which people are real experts who should be consulted."

I don't know what the psychiatry world is like, but from my understanding of academia, adding a metric like this would cause a lot of moral hazard/insider trading/hostile short selling and other related problems. Maybe it's better in a field where you can expect empirical validation on an irrefutable scale in a few years, if a drug gets approved for widespread use.


I would say there's a fairly good rational argument that an accurate human pundit in the sense of this essay cannot exist. It goes like this:

1. Assume arguendo a pundit exists who could make predictions about the future that were noticeably more accurate than those anyone else can make.

2. Either he does it in a way that depends on some unique quality he has (e.g. he's Jesus Christ, or speaks with Him) or he does it in a way that anyone, or at least some moderate number of people, can duplicate with appropriate intelligence, training, meditation, medication, et cetera.

3. If the latter, then given the high value of accurate prediction and anything approaching an efficient market, he cannot now be unique -- many people will have already learned to do this, long ago, and so he will not stand out. (After all, we can all make accurate predictions of which way an apple falls if we let it go, so that kind of prediction does not stand out.) But a pundit that is no more accurate than any intelligent informed random person contradicts our definitions and need not be considered further. We also have no need to search out such a person, because he won't have any secrets worth the learning.

4. If the former, however, his methods will be indistinguishable from luck or divine intervention to the outsider, because by definition he's doing it in a way nobody else can learn. Which is to say, there is no way of evaluating the *method* to see if it makes sense, is plausible and believable -- all we can do is evaluate the outcomes.

5. He will either have a perfect record or not on the predictions of interest to us, which are "high value" predictions where the payoff is large if we follow the prediction instead of our prior inclination. These are by definition Black Swan or near-Black Swan events. (Predictions that the Sun will rise tomorrow are generally useless, as are most predictions that merely consist of extrapolation of current trends.) If he does *not* have a perfect record, since we cannot evaluate his methods (see 4), we will not be able to distinguish what he does from lucky guesses. On any *given* prediction we'll have no way of knowing whether this is where he's right or one of those times where he's wrong. If his predictions were trivial, that wouldn't matter much, but nobody will make decisions on Black Swan events (heavy payoff if you win, heavy loss if you lose) if the predictions aren't close to foolproof. So he would be more of a Cassandra than an Oracle, someone on whom people were afraid to bet, but who could say "I told you so" a lot, presuming anyone listened (nobody liked Cassandra after all).

6. If he has a perfect record on Black Swan events, we can rely on his predictions. But this is a very strange human being indeed, one who can make perfect predictions of future Black Swan events by means completely incomprehensible to the rest of us. I would need to have the existence of such a super-being demonstrated before I was willing to entertain the notion that such a being could exist and be a member of our species.


I've lived long enough to be extremely skeptical of my ability to predict the future. There are just too many factors.

Instead, I try to first notice and then to mention trends that are already happening. For example, since June 2020, I've been hollering about how the media-declared Racial Reckoning is driving up the murder rate.

As Orwell said, "To see what is in front of one’s nose needs a constant struggle."


Just want to say that Bob Cringley is a journalist who has been scoring his predictions for over 20 years at this point.


It would be informative to go through Philip Tetlock's annual lists of questions for his famous forecasting contest and list the really big events that happened during the year in question that were so unanticipated that no questions refer to them. For example, as I noted in my 2016 review of Tetlock's airport book, nobody was asked to predict whether a European leader in 2015 would suddenly invite a million Muslims in the way Angela Merkel did more or less on a whim in the late summer of 2015.

Jean Raspail more or less predicted it in his 1973 dystopian novel "The Camp of the Saints," but even he would have been wrong for each of the first 41 years.

Similarly, did Tetlock ask anybody to predict the Great Awokening or the Transgender Boom or Trumpism or the Racial Reckoning?

Conversely, one of these years, the biggest event of the year in the US will be the Great California Earthquake of 20XX. How much credit should a pundit get for accurately predicting that to the year?


Re: As far as I know, the first official journalists to do something like this...

There was an attempt by Tetlock et al. to do something like this to pundits, rather than with their explicit consent. You can see the original proposal here: https://www.openphilanthropy.org/files/Grants/Tetlock/Revolutionizing_the_interviewing_of_alpha-pundits_nov_10_2015.pdf, and a related OpenPhil grant here: https://www.openphilanthropy.org/giving/grants/university-pennsylvania-philip-tetlock-forecasting#Goals_and_expectations_for_this_grant. I also got confirmation that nothing came of it, but people come up with similar proposals every now and again.


Have you considered partnering with a prediction market firm so that all your predictions get automatically added as markets? You'd give them a lot of free advertising and customers, we'd get a functioning prediction market on your stuff.


Also, why can't we like comments in this thread?


In response to having meaningful Brier scores...

A Brier score needs a benchmark to judge skill, as others have pointed out. How to achieve this?

ACX could post forecasts onto an app like Maby.app (it is free) and then readers could forecast as well. This provides (a) a baseline against which to judge forecast accuracy, (b) everyone gets feedback in the form of a calibration curve, (c) forecasts of readers are hidden from each other until the question is closed - so it is harder for anyone to piggy-back on the work of others.

Keep the questions open for a couple days then close them - so everyone effectively forecasts at the same time.


“The process of globalization, powerful as it is, could be substantially slowed or even stopped. Short of a major global conflict, which we regard as improbable, another large-scale development that we believe could stop globalization would be a pandemic…”

That is probably the most chillingly prescient passage from Mapping the Global Future, a report written 16 years ago by experts working for the U.S. National Intelligence Council, describing coming developments in geopolitics, culture, technology, and the economy out to 2020. With the year in question having arrived, I thought it was worthwhile to review the accuracy of its predictions, and overall, I was impressed. Mapping the Global Future correctly identified most of the megatrends that shaped the world from 2004-20 (though it was somewhat less accurate in forecasting the degrees to which those factors would change things):



It's not just that the Iraq War pundits were wrong - they were catastrophically wrong, wrong in ways that caused men (Julius Streicher, and, to a lesser extent, Alfred Rosenberg) to hang at Nuremberg, and they suffered not the slightest, not personally or professionally.

Meanwhile, the naysayers were cast into Outer Darkness and have remained there ever since, even though the War On Iraq went worse than the most pessimistic predictions would have it.


Couldn't this "prediction difficulty calibration problem" be solved with an Item Response Theory approach? (This is the statistical method used in computer adaptive testing.) Use the mass of data from Metaculus to calibrate item difficulty levels via IRT, then compute the skill level of pundits based on the items they answer.

Or maybe adapt something like the Elo system for chess ratings? (Which is closely related mathematically to IRT). That way you can progressively be estimating the skill of people across time.

The challenge w these approaches is that ppl can only answer items one time and then it's obsolete, which means you need tons of ppl answering each item to establish good item difficulties... but don't you have that w platforms like Metaculus?

You could even incorporate the amount ppl bet to set item difficulty/skill levels, the way that Klinkenberg et al. 2011 does for reaction time in computer adaptive practice (See "High Stakes High Gain Scoring Rule") https://doi.org/10.1016/j.compedu.2011.02.003
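
For the Elo variant, a minimal Python sketch might treat each resolved question as an "opponent" with its own difficulty rating. Scoring a forecast as simply right or wrong, the K-factor of 32, and the 1500 starting ratings are simplifying assumptions of mine, not anything Metaculus or the cited paper does.

```python
def expected_score(forecaster_rating: float, question_rating: float) -> float:
    """Standard Elo logistic expectation that the forecaster 'beats' the question."""
    return 1.0 / (1.0 + 10 ** ((question_rating - forecaster_rating) / 400.0))


def elo_update(forecaster_rating: float, question_rating: float,
               got_it_right: bool, k: float = 32.0):
    """Return updated (forecaster, question) ratings after one resolved question."""
    expected = expected_score(forecaster_rating, question_rating)
    delta = k * ((1.0 if got_it_right else 0.0) - expected)
    # The forecaster gains what the question loses: hard questions that get
    # answered correctly drift down in difficulty, and vice versa.
    return forecaster_rating + delta, question_rating - delta


# A hypothetical pundit and question, both starting at 1500.
pundit, question = elo_update(1500.0, 1500.0, got_it_right=True)
print(pundit, question)
```

Because each question is answered only once per person, its difficulty rating only becomes meaningful after many forecasters have been scored against it, which is the data-volume concern raised above.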


I'm confused by what you want out of prediction markets and I think you might be too. As far as I can tell, the argument goes "we want prediction markets, so that we can tell which pundits are good, so that when the good pundits make predictions, then we can trust them." But if we have prediction markets, we don't need the pundits! You don't need to know who the experts actually are if you know you have a market that's already priced in all of their wisdom.

This still leaves open the case of how to trust in the absence of prediction markets. I believe that the answer is the same for what we do for predictions in science: Kolmogorov complexity. This has two uses against this problem. First, it provides a way to formally(ish) define cheating at making predictions. And second, it provides a rigorous method of discounting the credit of cheaters post facto.

Consider a Pundit who predicts 1000 copies of "The sun will rise in the East." They might publish a book of predictions that looks like:

The sun will rise in the East on 2022-01-01.

The sun will rise in the East on 2022-01-02.

The sun will rise in the East on 2022-01-03.


The sun will rise in the East on 2024-09-26.

Then, when all those predictions come true, they claim 1000 points worth of credit on September 26th 2024. The intuition that I wish to formalize is that you could replace this pundit's book with the following program.

from datetime import date, timedelta
for i in range(1000):
    print(f"The sun will rise in the East on {date(2022, 1, 1) + timedelta(days=i)}.")

This program is 3 lines long, so the pundit should only get credit for 3(ish) predictions. The program contains some overhead, and it would contain quite a bit more if I weren't allowed to just import how calendars work. In the limit of a large number of predictions this overhead isn't important, so we should give additional credit to pundits who offer a larger number of predictions. Importantly, this discounting of a prediction book can happen post facto. The minimum entropy it takes to produce the book doesn't change when we know the answers.

The problem with this scheme is that you can never know the true Kolmogorov complexity of a thing; you can only know the upper bound set by the cleverest adversary's best compression so far. This isn't particularly good, but it does enable me to look at a pundit's win/loss record and verify to myself that I find it difficult to compress their selected predictions. It would work even better if this opened up a market for "adversarial pundits" who would publish their best efforts at compressing others' prediction sets.
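As a crude illustration of the "try to compress their predictions" check, one could run a prediction book through an off-the-shelf compressor. This is only a sketch under loud assumptions: zlib is a stand-in that knows nothing about calendars or the world, so its output size is merely an upper bound on the true Kolmogorov complexity, and the helper names below are made up for this example.

```python
import zlib
from datetime import date, timedelta


def compressed_size(text):
    """Bytes needed by zlib at maximum compression (an upper bound only)."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def compression_ratio(predictions):
    """Compressed size divided by raw size for a list of prediction strings."""
    if not predictions:
        raise ValueError("need at least one prediction")
    book = "\n".join(predictions)
    return compressed_size(book) / len(book.encode("utf-8"))


# The sunrise pundit's book: 1000 near-identical lines.
sunrise_book = [
    f"The sun will rise in the East on {date(2022, 1, 1) + timedelta(days=i)}."
    for i in range(1000)
]
sunrise_ratio = compression_ratio(sunrise_book)
```

A ratio far below 1 suggests the book is mostly repetition and deserves heavy discounting. A ratio near 1 doesn't prove the predictions are independent, only that this particular compressor couldn't find the pattern.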

I am British, and I use "brackets" and "parentheses" interchangeably, favouring brackets: () are brackets or round brackets, [] are square brackets, {} are curly brackets or curly braces. I was confused by "Yglesias' numbers are bold and in parentheses. Metaculus' numbers are in brackets (not all questions are on Metaculus)." - had Yglesias' numbers not also been bold, I'd not have been able to follow the listing below at all.

I really enjoy reading Matt's thoughts, they're often thought provoking and carefully argued, even when I disagree.

Today he wrote contra meritocracy, though, and I was left a bit unsatisfied, mainly for reasons Scott has covered well.

I'd love to weigh in on that article, but cannot comment without paying Matt money. It feels like perverse incentives relevant to Scott's post though. Matt is bright. But Cunningham's Law would mean it will be in his best interest to become a slightly worse pundit, in order to get more subscribers to pay for the opportunity to correct him.

Faced with that dilemma, I subscribed to ACX and referenced his post here. Maybe I'll just discuss it on Reddit or Discord.

In the post, Matt proposes (perhaps accidentally) a really fascinating way to measure how much people value meritocracy over partisanship, and I think it's well worth discussing... but maybe not there, nor here. In the next even numbered open thread maybe.

I still have yet to see a coherent argument for prediction markets actually leading to useful policy recommendations. Prediction markets put fairly long odds on a COVID vaccine being approved before the end of 2020 (source: https://fortune.com/2020/07/15/coronavirus-vaccine-this-year-prediction-markets-coronavirus/), a prediction which a) turned out to be quite wrong and b) could've led to some fairly substantial policy failures if policymakers had acted on those beliefs.

COVID should've been a golden opportunity for prediction markets to make useful long-term predictions that could've led to useful policy, but I have yet to see any kind of impressive results.

Possible approach to comparing punditry success:

Pool all questions from all pundits. If a pundit didn't answer one, assume that's because they felt too ignorant to; so pretend they did answer it, just with an ignorance prior (/maximum entropy prior), e.g. 50% if it was a yes-no question. Then idk take the average.

This rewards making well-calibrated predictions, and punishes not making them when you could have!

(Drawback: sometimes it's not obvious what the ignorance priors are, e.g. especially for continuous random variables [the prior density functions for x and for x^2 can't *both* be flat].)
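Here's a minimal sketch of that pooling idea for yes-no questions. The assumptions are mine: I'm using the Brier score (mean squared error, lower is better) as the thing to average, the function name and dict layout are made up, and 50% is the ignorance prior for anything a pundit skipped.

```python
def pooled_brier(forecasts, outcomes, prior=0.5):
    """Average Brier score over the whole pooled question set.

    forecasts: dict mapping question id -> probability of "yes" (may omit some)
    outcomes:  dict mapping question id -> True/False for every pooled question
    Questions missing from forecasts are scored as if answered with `prior`.
    """
    if not outcomes:
        raise ValueError("need at least one resolved question")
    total = 0.0
    for question, happened in outcomes.items():
        p = forecasts.get(question, prior)  # fill in the ignorance prior
        total += (p - (1.0 if happened else 0.0)) ** 2
    return total / len(outcomes)


# Example pool of two questions; this pundit only answered the first.
outcomes = {"q1": True, "q2": False}
pundit = {"q1": 0.9}
score = pooled_brier(pundit, outcomes)  # (0.01 + 0.25) / 2
```

A skipped question always costs 0.25 here, so a pundit comes out ahead by answering whenever they can beat a coin flip, which is the incentive described above.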

I think you’d enjoy Forecast, a community for crowdsourced predictions. Download the app here: https://apps.apple.com/us/app/id1509378877

Use referral code: BRANNO5902 to get an additional 1000 forecasting points when you sign up.

I’m sure I’m late to this here but I find it fun to use

I was under the impression that log scores couldn't be positive.

"Some of the disagreements might come from Yglesias making his predictions in late December and Metaculus opening theirs in February, which is kind of unfair to Matt." I work for Metaculus and wanted to let this discussion section know that we didn't want to be unfair to Matt Yglesias, and so gave him an opportunity to amend any predictions from his original Substack post before we put them into our question series.
