I.
Polymarket (and prediction markets in general) had an amazing Election Night. They called states impressively early and accurately, kept the site stable through what must have been incredible strain, and have successfully gotten prediction markets in front of the world (including the Trump campaign). From here it’s a flywheel; victory building on victory. Enough people heard of them this election that they’ll never lack for customers. And maybe Trump’s CFTC will be kinder than Biden’s and relax some of the constraints they’re operating under. They’ve realized the long-time rationalist dream of a widely-used prediction market with high volume, deserve more praise than I can give them here, and I couldn’t be happier with their progress.
But I also think their Trump shares were mispriced by about ten cents, and that Trump’s victory in the election doesn’t do much to vindicate their numbers.
II.
Suppose you have a coin. You think there's a 90% chance it's fair and a 10% chance it’s biased 60/40 heads. Then you flip the coin and it comes up heads. What should your new probability be? You would solve this with Bayes’ Theorem; the answer is 88% chance it’s fair, 12% chance it’s biased.
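Here’s that update spelled out (a minimal sketch in Python; the 90/10 prior and the 50% vs. 60% heads probabilities come straight from the setup above, nothing else is assumed):

```python
# Bayes' Theorem for a single observed heads
prior_fair, prior_biased = 0.90, 0.10
p_heads_fair, p_heads_biased = 0.50, 0.60

p_heads = prior_fair * p_heads_fair + prior_biased * p_heads_biased  # P(heads) = 0.51
posterior_fair = prior_fair * p_heads_fair / p_heads                 # 0.45 / 0.51
posterior_biased = 1 - posterior_fair

print(f"fair: {posterior_fair:.0%}, biased: {posterior_biased:.0%}")  # fair: 88%, biased: 12%
```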
Why didn’t it shift your beliefs more? Didn’t the experiment “vindicate” the bias hypothesis’ claim that it would land on heads more, by in fact landing on heads? Yes, but the fair-coin hypothesis already held that it was pretty likely to land heads, and the biased coin hypothesis didn’t add much to this (60% chance vs. 50%). And since you were previously pretty confident in the fair-coin hypothesis, this unremarkable minor finding only shifts your confidence a tiny amount (2%).
Is this just some sort of pathology of extreme confidence? No. Even if you’d started out ambivalent between the two hypotheses, with equal chance that the coin was fair or biased, a single heads should only shift you to 55-45. You just shouldn’t update much on single dramatic events!
Even if you start out ambivalent between the two hypotheses, and you flip it five times, and you get five heads in a row, you still shouldn’t be very confident! At this point, the probability that it’s a completely fair coin is still ~29%! Why so high? Because it’s implausible that a fair coin would get this many heads, but it’s about equally implausible that a coin biased “only” 60-40 would. Either way, you got a weird amount of luck. The amount of luck necessary to get this result with a fair coin is only slightly greater than the amount necessary to get it with a biased coin, so overall you still shouldn’t be too sure. It’s just really hard for this paradigm - flipping a coin that could be fair or could be 60-40 - to give you useful evidence.
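The same arithmetic, generalized to a run of heads, covers the 50/50-prior and five-heads cases above (a sketch; the only inputs are the prior, the 60/40 bias, and the number of consecutive heads):

```python
def p_fair_after_heads(prior_fair, n_heads, bias=0.60):
    """Posterior probability that the coin is fair after seeing n heads in a row."""
    likelihood_fair = 0.5 ** n_heads      # P(n heads | fair coin)
    likelihood_biased = bias ** n_heads   # P(n heads | coin biased toward heads)
    numerator = prior_fair * likelihood_fair
    return numerator / (numerator + (1 - prior_fair) * likelihood_biased)

print(p_fair_after_heads(0.5, 1))   # ≈ 0.45, i.e. a 55-45 split toward "biased"
print(p_fair_after_heads(0.5, 5))   # ≈ 0.29, still far from ruling out a fair coin
```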
III.
This coin problem is analogous to the implicit argument between Polymarket and a group of other forecasting sites, especially Metaculus.
Just before the election, Polymarket and other real-money prediction markets said Trump had a 60% chance of winning. Metaculus and other non-money forecasting sites said he had a 50% chance of winning.
Then Trump won. Should this increase your trust in Polymarket rather than Metaculus? Only by the tiniest of amounts. If you previously thought (like I did) that there was a 90% chance that Metaculus was more accurate, you should update down to 88%.
But this point holds regardless of your previous opinion of Polymarket vs. Metaculus - whether you thought they were both about equal, or Polymarket was better. Whatever your opinion, the election should barely change it.
This is my main point. In the rest of this post, I’ll explain why I originally thought 90% odds Metaculus was right and Polymarket was wrong (implying the new probability should be 88%), then answer some potential objections.
IV.
Just before the election, various forecasters, markets, and wisdom-of-crowds sites separated into two groups.
One group - the non-money forecasters - said the election was 50%. Nate Silver was in this group. So was Metaculus, a forecasting engine which has outperformed prediction markets in the past, and Manifold, a mostly-play-money prediction market.
Another group - the real-money markets - said the election was 60%. Polymarket was the leader here; a group of smaller prediction markets - including Kalshi, Betfair, and PredictIt - were probably just moving downstream of Polymarket, as traders tried to arbitrage the bigger site’s odds.
Before the election, I said that we should trust the non-money forecasters over the real-money markets, for three reasons:
First, non-money forecasters have beaten real-money markets in past elections. Here’s a graph of 2022 results, courtesy of First Sigma:
Every single non-money forecaster beats every single real-money market.
Maxim Lott looks at a longer-term matchup between all real-money markets and 538. He finds they are about equally good over the long term, but that, once the most recent results are included, 538 wins by a hair.
In my own contest, Metaculus (a non-money forecaster) outperformed Manifold (a play-money market with some tenuous connection to real money). And in Manifold’s own poll, users said they thought Metaculus was more accurate than Polymarket or themselves.
Second, real money markets have a long history of giving weird results.
As we speak, PredictIt says there’s a 7% chance that Kamala Harris will be the next President. Commenters are debating whether maybe Biden will resign in her favor so she can get to be “first woman president” for a few months. But long after Biden won the last election, PredictIt said there was a 9% chance Trump would be the next President; some commenters suggested that maybe he would #StopTheSteal and win a fair recount (aside from the inherent implausibility of this, some of the specific scenarios bettors placed money on required him to win California, where his campaign hadn’t even asked for a recount).
I think the more likely explanation is that real-money markets have structural problems that make it hard for them to converge on a true probability. After taxes, transaction costs, and the risk of misresolution, it’s often not worth it (especially compared to other investments) to invest money correcting small or even medium mispricings. Additionally, there is a lot of dumb money, most smart money is banned from using prediction markets because of some regulation or another, and the exact amount of dumb money available can swing wildly from one moment to the next.
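To make that concrete with a purely hypothetical example (none of these numbers are Polymarket’s actual fees or anyone’s actual tax situation): suppose you spot a share trading at 54¢ that you think is really worth 58¢, and consider putting $10,000 into correcting it.

```python
# All numbers below are illustrative assumptions, not real fee schedules
stake = 10_000             # dollars committed
price = 0.54               # market price per share (pays $1 if it resolves your way)
true_prob = 0.58           # your estimate of the real probability
misresolution_risk = 0.01  # assumed chance the market resolves wrongly against you
friction = 0.02            # assumed round-trip transaction/withdrawal costs, as a fraction of stake
tax_rate = 0.35            # assumed short-term tax rate on gains

shares = stake / price
win_prob = true_prob * (1 - misresolution_risk)
gross_edge = win_prob * shares - stake                     # expected pre-cost profit
net_edge = gross_edge * (1 - tax_rate) - friction * stake  # rough after-tax, after-fee expectation

print(f"gross edge ${gross_edge:,.0f}, net edge ${net_edge:,.0f} on ${stake:,} locked up until resolution")
```

A couple hundred dollars of expected profit on $10,000 tied up until resolution, with a bit over a 40% chance of losing the whole stake, compares poorly with ordinary investments - which is why small and even medium mispricings can sit there uncorrected.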
Non-money forecasters have an opposite problem of having no incentive to get things right in the first place. This disqualifies most pundits, but the best forecasting sites have found ways around this. On Metaculus, users risk reputation rather than money; this is easier, since there isn’t some opportunity cost to Metaculus reputation that creates weird dynamics of when vs. when not to invest. On Manifold, people risk play money, which is sort of linked to real money in various obscure ways but you can’t trivially sink your life savings into Manifold and expect to get it back; this is about halfway between monetary and reputational systems. As for Nate Silver, I think he loves gambling enough that he naturally uses a gambling mindset even when he’s not risking money (although he is risking his own reputation, and sometimes does risk money on his beliefs). I didn’t originally think these kinds of “soft” incentives would work as well as real money, but the evidence above has changed my mind.
Third, we know perfectly well why the non-money forecasters and real-money markets differed during this election. Until early October, Metaculus (the top forecaster) was consistently +4% bluer than Polymarket (the top market). This was the expected result: in the past, Metaculus had always been a few percentage points bluer than Polymarket, but the two otherwise moved in sync.
Then, starting mid-October, a semi-anonymous French banker who went by “Theo” started plowing millions of dollars into Trump on Polymarket, inflating his chances (different reporters estimated Theo’s total bet at between $30 million and $75 million). At the height of his activities, Polymarket was +13% redder than Metaculus, an unprecedented difference. All the other real-money markets rose close to Polymarket’s level because of arbitrage, and all the non-money forecasters stayed close to Metaculus.
If not for Theo, there’s no reason to think Polymarket would ever have shifted from its usual regime. So when we’re asking whether to trust Polymarket’s conclusion (Trump 60%) or Metaculus’ conclusion (Trump 50%), we’re asking whether to trust the normal operations of the prediction market vs. the personal opinion of one whale. I trust the normal operations of the market.
All of these factors gave me a 90% prior that Metaculus was better calibrated than Polymarket on this election; now that Trump’s won, I update to 88%.
V.
I made this argument on some comment threads and people raised objections. In case that was you, here are my responses.
—Shouldn’t we give Theo’s opinion a lot of weight because he was willing to bet so much on it? Or because he was smart enough to get rich in the first place?
Yes! I agree that both those things make him more trustworthy, and I take his opinion more seriously because of them.
But I also take this person’s opinion more seriously, for the same reasons.
The person in the tweet was smart enough to make $5 million, confident enough to bet it - and turned out to be wrong. Intelligence and confidence only take you so far, especially when there are equally intelligent and confident people on the other side.
My claim is that, as much as I respect Theo’s good qualities, they don’t make me want to weigh his opinion a significant fraction as highly as the opinion of everyone else in the world combined. But that’s what I’d be doing if I updated on Polymarket’s probabilities.
If Theo hadn’t bet on Polymarket, it would still have been at ~54%. I consider that the average opinion of the non-Theo world. In order to update six points towards Theo’s opinion, I would have to believe that the amount of money someone puts into Polymarket is exactly proportional to their trustworthiness. This is a fair approximation for the Polymarket algorithm to use; it’s the assumption that drives prediction markets. But in this case, it doesn’t make sense.
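One hypothetical way to size that update: if you read the market price as a credence-weighted average of its bettors, and assume (purely for illustration - Theo’s actual personal probability isn’t public) that he privately put Trump at 90%, then moving the price from 54% to 60% amounts to giving him about a sixth of the entire market’s weight.

```python
# Illustrative only: Theo's true credence is unknown; 0.90 is an assumed number
market_without_theo = 0.54    # roughly where Polymarket sat before his bets
market_with_theo = 0.60       # where it ended up
assumed_theo_credence = 0.90  # hypothetical

# Solve 0.60 = w * 0.90 + (1 - w) * 0.54 for w, the implied weight on one person
implied_weight = (market_with_theo - market_without_theo) / (assumed_theo_credence - market_without_theo)
print(f"{implied_weight:.0%}")   # ≈ 17% of the whole market's epistemic weight
```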
Suppose that Theo had done even better at his banking job, was 10x as rich, and could move Polymarket up to 90%. Now should we say that the “true” probability of Trump’s win was 90%? Imagine he spent half his money on a yacht just before the election and only had enough money left to move the markets up to 56%. Now should we say that the “true” probability of Trump’s win was 56%? Why should the true probability of Trump’s win depend on whether a French guy bought a yacht or not?
—Isn’t the whole point of a prediction market that people who bet more money should get their opinions counted more than people who have less money?
Yes, but this system is meant to work in a world where amounts of money are at least somewhat even.
There are many systems which work pretty well because everybody is about equally powerful. For example, Bitcoin proof-of-work is very secure because nobody controls more than half the Bitcoin mining computers in the world - and even if they did, they would have an incentive to use them responsibly. But if someone did control more than half the computers, and used them irresponsibly despite the incentive not to do so, Bitcoin would become insecure.
In the same way, prediction markets work because we expect nobody to have vastly more money than anyone else, giving everyone a fair chance to compare their opinions. In the rare cases when that assumption gets violated, they don’t work.
—I thought the whole point of prediction markets was that if anyone put in a crazy amount of money, thousands of other people would show up to correct the market manipulation!
Yes, and like proof-of-work, that works in most reasonable cases. This was an unreasonable case. We know that because the market broke its synchrony with the unaffected non-money forecasters as soon as Theo started betting, and never regained it. So obviously the market didn’t correct Theo’s bet.
Why not? In order to use Polymarket as an American, you have to get a VPN, a Coinbase account, and a Metamask wallet; then turn on the VPN, buy crypto on Coinbase, transfer it to the Metamask wallet, connect the wallet to Polymarket, and buy the shares you want. The ability to do all this rules out 99% of the US population.
But fine, suppose you did that. The median American has a net worth of $200K, but let’s say anyone who can do all that stuff is likely to be a rich techie with $1M. How much do you want to spend on this? If I understand the Kelly criterion right, it says to bet $166,000. But for everyone except Sam Bankman-Fried, this level of risk-tolerance, even on a +EV bet, feels insane. I’m not going to talk about my exact betting behavior because Polymarket is illegal in my country, but when I, uh, imagine doing this in Minecraft, then, in Minecraft, I bet $2,000.
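For reference, here’s the Kelly arithmetic behind that $166,000 (a sketch, assuming a $1M bankroll, the non-Trump side priced at 40¢ because Trump is at 60¢, and a personal estimate of 50% that the non-Trump side pays out):

```python
# Kelly criterion: f* = (b*p - q) / b, with b = net odds, p = win probability, q = 1 - p
bankroll = 1_000_000
price = 0.40               # cost of a share that pays $1 if the 50%-probability side wins
p = 0.50                   # your estimated probability of that side winning
b = (1 - price) / price    # net odds: win 60 cents per 40 cents staked, so b = 1.5

kelly_fraction = (b * p - (1 - p)) / b
print(f"bet {kelly_fraction:.1%} of bankroll = ${kelly_fraction * bankroll:,.0f}")
# bet 16.7% of bankroll = $166,667, roughly the figure above
```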
(pouring hundreds of thousands of dollars into opportunities like this would be a no-brainer if they came up every day and you could diversify across fifty of them, but this was a one-time mispricing and there aren’t a lot of similar cases)
If each such person bets $2,000, you’d need 15,000 - 35,000 of them to take the other side of Theo’s bet. I claim that there just weren’t that many individual people who knew about Polymarket, knew enough about the election to understand that it was mispriced, were able to handle the crypto, and weren’t too risk-averse to put up $2K. Of the few people like this who existed and hadn’t already bet on Polymarket before Theo arrived, probably many of them were using their gambling budget for better deals (like arbitraging Polymarket against other markets instead of outright betting against it).
—But isn’t there a lot of smart money and hedge funds who could do this?
I have never heard of a hedge fund betting on a prediction market and my guess is that it would either not be legal or require too much compliance paperwork to be worth it. I hope this changes!
—But didn’t Theo give a great explanation of his strategy to the Wall Street Journal, and commission private polls, which proves he was working off of really smart reasoning?
Yes, but there were dozens of people who could give equally plausible arguments for their positions before the election. These divided about half-and-half into intelligent-sounding pro-Kamala arguments and intelligent-sounding pro-Trump arguments, and Theo was a completely replacement-level example of someone making the intelligent-sounding pro-Trump arguments. We should think of him as an example of an intelligent person with a good argument who got lucky, unlike the many other intelligent people with good arguments who didn’t. I don’t find the private polls very interesting either - the existence of private pollsters implies this happens often, and we shouldn’t expect these private polls to be massively better than the public ones.
(none of this is meant to knock Theo. He seems like a brilliant trader who did everything right and won a much-deserved reward. But markets work because of the interaction of many traders like this. When only one person does it, he may deserve his reward, but we can’t assume the market is efficient.)
—So does this mean we can’t trust prediction markets?
I think prediction markets are among our best sources of truth, but that (as with every source of truth) we need to think critically about them and notice the rare times when they fail. If you can’t think critically, you’re going to have a hard time, but in that case I would still trust prediction markets over any other source (except Metaculus, which is so similar to a prediction market that it belongs in the same category anyway).
I also think prediction markets will probably become more trustworthy going forward. The more people know about them, and the easier they become to use, the more likely it is that enough minnows will show up to digest the next big whale. I think of this as a good thing, part of the process of prediction markets moving forward.
But on the specific literal level, Polymarket was mispriced last Monday.