I don't promise never to make mistakes. But if I get something significantly wrong, I'll try to put it here as an acknowledgement and an aid for anyone trying to assess my credibility later.
This doesn't include minor spelling/grammar mistakes, mistakes in links posts, or failed predictions. It's times I was fundamentally wrong about a major part of a post and someone was able to convince me of it. In reverse chronological order, starting with my old blog:
57: (12/4/23) In In Continued Defense Of Effective Altruism, I said of EA’s failures (primarily SBF) that “I’m not sure they cancel out the effect of saving one life, let alone 200,000”. A friend convinced me that this was an unfair exaggeration of the point I wanted to make. There are purported exchange rates between money and lives, destroying $5 - $10 billion in value is pretty bad by all of them, and there are knock-on effects on social trust from fraud that suggest its negative effects should be valued even higher. I regret this sentence and no longer stand by it.
56: (5/27/23) In Raise Your Threshold For Accusing People Of Faking Bisexuality, I cited a study finding that most men’s genital arousal tracked their stated sexual orientation (ie straight men were aroused by women, gay men were aroused by men, bi men were aroused by either), but women’s genital arousal seemed to follow a bisexual pattern regardless of what orientation they thought they were - and concluded that although men’s orientation seemed hard-coded, women’s orientation must be more psychological. But Ozy cites a followup study showing that women (though not men) also show genital arousal in response to chimps having sex, suggesting women’s genital arousal doesn’t track actual attraction and is just some sort of mechanical process triggered by sexual stimuli. I should not have interpreted the results of genital arousal studies as necessarily implying attraction.
55: (5/19/23) In Change My Mind: Density Increases Local But Decreases Global Prices, I asked people to convince me out of an idiosyncratic theory about the housing market. Many good comments pointed out some reasons why, although it might be true in some situations, it probably wasn’t in most of the most relevant cases, and a poll found that only 20% of readers (and only 5% of readers with advanced econ degrees) ended up agreeing with it. Although it wasn’t a mistake to ask people to change my mind, I’ve mostly updated against believing the theory.
54: (2/1/23) In The FDA Has Punted Decisions About Luvox Prescription To The Deepest Recesses Of The Human Soul, I said there was a 60-40 chance fluvoxamine treated COVID, and that based on this it was positive expected value and doctors should use it. More recent studies have found fluvoxamine does not treat COVID. This wasn’t exactly a mistake, since it’s not wrong to give 60-40 odds on a false thing, but it seems worth acknowledging here.
53: (2/1/23) In Response To Alexandros Contra Me On Ivermectin, I went over Alexandros Marinos’ very comprehensive criticism of my ivermectin post, attempting to show it made many mistakes. Some of his claimed mistakes I think he is wrong about and I continue to defend my previous position; others are already listed here. But a few are new. I won’t go through all of them here, but one especially serious one: I said an ivermectin study (Biber et al) had failed to report its primary endpoint, listed irrelevant endpoints to make up for that, deviated from its preregistration, and would have had null results if the preregistration had been followed. In fact, it reported its primary endpoint just fine, the endpoint was significant, and the deviation from preregistration was reasonable and not a major factor in the study. I described the study incorrectly (and very unfairly negatively) and incorrectly dismissed its results as useless.
52: (10/10/22) In A Columbian Exchange, I related as true a story about Columbus cutting off the hands of Indians who didn’t give him enough gold as a tribute. Commenters convinced me that the story was false-ish: Columbus was known to punish Indians who didn’t give him enough gold, hand-cutting-off was a punishment commonly used elsewhere in Spanish America for this situation, but there is no hard evidence Columbus used it. Some other bad things Columbus did seem on firmer ground; see here for more. The original post also exaggerated the degree to which some holidays “neutralized” or replaced other holidays, especially Christmas - again, see here for more.
51: (10/8/22) In I Won My Three Year AI Progress Bet In Three Months, I said that I’d won a bet on AI progress based on (my interpretation of) whether some images matched some prompts. Edwin Chen surveyed a lot more people and found that on average they did not think enough of the images matched the prompts for me to have won the bet. I retract my claim to have won and will continue to see how AI progress advances over the next three years.
50: (8/29/22) In Blindness As A Shift In The Schizophrenia-To-Autism Spectrum, I speculate on why no congenitally blind people develop schizophrenia. Steven Byrnes investigates this often-repeated fact and finds that it may not be true; if so, I was building a tower of conjectures on a totally fake foundation.
49: (6/17/22) In Oh, The Places You’ll Go When Trying To Figure Out The Right Dose Of Escitalopram, I treated a study that assumed a certain dose equivalent for antidepressants as proving that dose equivalent. As far as I know nobody has proven dose equivalents directly, and people look at indirect comparisons to create a conventional wisdom.
48: (6/12/22) Also related to ivermectin: Alexandros Marinos discusses many concerns about my portrayal of Dr. Flavio Cadegiani. I acknowledge two of them as relevant and problematic: first, I noted that he was accused of “crimes against humanity” in a way that suggested the Brazilian government might think he was killing his patients (although I said I didn’t agree with this interpretation). A new source that wasn’t available to me at the time explains that the “crimes against humanity” accusation is because he didn’t stop a trial when it showed the experimental drug was much more effective than the placebo drug (which is a nonsensical accusation, since the accusers don’t believe this is true anyway). Second, although I said he was “involved in a scandal” where the Brazilian government made a defective app, his only “involvement” was that the app used his data; he was not responsible for any of the defects. I cannot remember why I made this error but I assume I saw someone else say something about this and didn’t dig deep enough to be fair to him. I regret both errors. Marinos has other concerns he thinks are relevant which you can find at the article.
47: (6/5/22) In this post, I characterize California gubernatorial candidate Michael Shellenberger as opposing suboxone treatment and supporting “sweeping” institutionalization of the mentally ill, based on some articles about him. Shellenberger rejects both characterizations and says he supports the former and opposes the latter. I apologize for the false characterization. Update (6/22/23): After reading Shellenberger’s book San Fransicko (review and discussion here) I believe my original characterization was correct, and Shellenberger’s claim to reject this was false. This was not a mistake, and it was a mistake to categorize it as a mistake.
46: (5/31/22) In this post, I discussed many statistical issues around ivermectin. Around the middle, I did two analyses of studies using a t-test. A reader suggested that it would have been more appropriate to use a meta-analysis test (eg DerSimonian-Laird). This would have strengthened the two analyses from marginally significant to clearly significant. More recently, another reader has commented that a DerSimonian-Laird test is also inappropriate because the studies aren’t homogeneous, and now I’m not sure which test is appropriate or what result it would give - but it definitely wasn’t the one I originally tried. This doesn’t dramatically alter the overall conclusion of the post, which was that the apparent effect (whether marginal or clear) was better explained by other things.
45: (5/18/22) In this post, I reported a survey of nootropics I had conducted. Among the most interesting findings: people found a supplement called Zembrin to be very effective. I reported this and suggested it might be helpful for people. A later survey failed to replicate people finding Zembrin especially effective. I’m still not sure what went wrong here, but my suggestion that people try it was probably wrong.
44: (2/22/22) There were many arguments on this month’s Links, of which the only one that I feel really convinced me of a mistake involved this Snopes page. I made fun of it for “debunking” the claim that the Biden administration distributed crack pipes to advance racial equity, when they did in fact distribute the crack pipes. But commenters convinced me that other people cared a lot about the “for racial equity” part. They were not distributing the crack pipes specifically to advance racial equity, and I guess Snopes can consider it worth debunking this.
43: (2/19/22) In my prediction market post The Passage of Polymarket, I said that the government had ignored Philip Tetlock’s superforecasting work. Aftagley corrected me here: the government has had a complicated relationship with Tetlock, funding and supporting him at least at some points (unclear to what degree they are still doing this).
42: (12/6/21) In my Model City Monday post, I unreflectingly believed another source that said charter city project Praxis was backed with $10 million from Peter Thiel. They reached out to say that they were backed by $4 million from Pronomos Capital, a fund affiliated with many people, of whom Thiel should not have been particularly relevant here.
41: (12/6/21) In my 2014 review of The Two Income Trap, I suggested Elizabeth Warren was smart and good. Subsequent events have conclusively revealed her to be dumb and bad. ACX regrets the error.
40: (11/22/21) In my post on FDA approval of Paxlovid, I said that since the studies clearly showed Paxlovid was safe and effective, the FDA had no excuse for delaying its approval. Commenters pointed out that FDA approval also involves making sure the manufacturing process is safe. I admit the FDA has an excuse.
39: (8/28/21) In my post on the California governor election, I made fun of the California Democratic Party for refusing to field any candidates in the Newsom recall election, suggesting this was an obvious own-goal. Commenters pointed out that they’d tried the opposite strategy in 2003 and it had been a disaster, since the presence of acceptable replacement candidates had made Democratic voters feel more comfortable voting for recall. I was wrong to dismiss this strategy as obviously stupid, and have edited the post to reflect that.
38: (8/8/21) In my post on psychedelics and anosmia, I said that a phenomenon couldn’t be explained by neurogenesis, since that mostly did not happen in adults. Commenters found that it at least happens in the olfactory bulb, which was the area under discussion. I still don’t think it’s relevant to that phenomenon, but this particular reason was definitely wrong.
37: (8/6/21) In my post on aducanumab, I argued that the FDA had failed by delaying the approval of a fish-oil based nutritional solution. I got some key points wrong, and two days later I looked into it further and published a separate post with the full story. Some of the specific points I was clearly wrong about include:
I originally incorrectly said the fish oil helped the development of the nervous system; actually, it prevented damage to the liver.
I originally said thousands of babies had died from the approval delay. I can’t find evidence to support that claim, and all I can say with confidence is that it was probably at least a few dozen.
I originally said that the FDA delayed the approval process ~20 years. This is only true if dated from the first use of the oil in the United States, but nobody had seriously suggested the FDA approve it at that time. Dating from the first concerted effort by activists to make the FDA aware that there was a strong case for approving the drug, it delayed ~5 years. Some of this delay was not their fault, insofar as a company had not filed the relevant application.
My original presentation didn’t sufficiently discuss exculpatory factors for the FDA, including that they funded studies of the fish oil, and were helpful enough that the main doctor involved was left with a good impression of them.
Despite this, I stuck to my original assessment that the FDA’s delay in approving this drug was outrageous and an argument for reforming the medical approval system. Some people, including journalist/blogger Kevin Drum, thought that I was updating insufficiently by correcting minor details while continuing to assert this, and that the real story showed the FDA was fine. I wrote a post disagreeing, saying that although the FDA followed the procedures correctly, and the procedures made sense according to the values of the people who made them, the way the whole system works is fundamentally flawed. Many people, including some commenters I trust, continue to think I made more of a mistake here than I’m admitting, and I continue to disagree.
36: (4/25/21) In my review of Global Economic History: A Very Short Introduction, I judged the book harshly for saying that Italy had a higher GDP than Britain, when the opposite was true. A commenter pointed out that Italy had had a higher GDP than Britain in 2011, when the book was published. They were right and I was wrong.
35: (4/25/21) In my post on grading Trump predictions, I judged Section 5, Prediction 1, on hate crimes to be true. A commenter pointed out that I had mistakenly used the numbers for “single bias incident” hate crimes only; taking all crimes together, it was false. I judged Section 5, Prediction 6, on race relations, to be true. A commenter pointed out that there was no data for one of the relevant years and I had to interpolate it as being halfway between the previous and subsequent year, and made a case for why this was shady. Based on that, I rejudged it to indeterminate.
34: (2/28/21) In my February 23 coronavirus predictions post, I asserted that the US had the highest unofficial coronavirus death toll so far. Commenters made a good case that India’s was actually much higher, see here for details. I now think India probably had the highest unofficial death toll and it was a mistake to assert otherwise.
33. (3/29/20) In my March 27 Coronalinks thread, I said confidently that quitting smoking would lower your risk of mortality from coronavirus infections. Several readers then alerted me to research saying that it might raise the risk. Although the research is tentative and a little sketchy, it was a mistake to be so confident in the original thread.
32. (3/29/20) In my March 2 Coronalinks thread, I speculated that the CDC was intentionally downplaying the effectiveness of face masks in order to prevent people from hoarding them. Although I still think the CDC underestimated the effectiveness of face masks, my claim that it was intentional was wrong. They have had the same policy consistently for years. See section 7 here for more.
31. (2/9/20) In Adderall Risks: Much More Than You Wanted To Know (written 2017) I stated that although Adderall might increase risk of Parkinson’s Disease, Ritalin didn’t. Since then I’ve been informed of several studies showing Ritalin probably does this too (see eg here). I’m not sure what went wrong and I can’t retrace my steps to find what made me think Ritalin was safe before; I think it was personal communication with some people who I thought would have known. In any case, don’t switch to Ritalin because you’re worried about Parkinson’s; it won’t help.
30. (11/19/19) In Promising The Moon (written 2014), I argued that technological progress had not slowed from 1970 to the present (compared to its pre-1970 speed). Since then, research by a bunch of people (summarized very well here by Tyler Cowen) finds strong evidence that it has. Despite its confidence and snark, my 2014 post was wrong.
29. (9/8/19) In Age Gaps And Birth Order Effects, I found that birth order effects stop mattering suddenly after a seven year gap between siblings. A Less Wrong user tried to replicate and reanalyze the results, and found a less sudden decrease around seven years. They concluded that birth order effects likely decreased more gradually after “4 – 8” years. I agree their reanalysis is better and have edited my post to link to it.
28. (8/21/19) In It’s Bayes All The Way Up, I suggested that psychedelics might work by strengthening the brain’s top-down priors. A more recent paper by Carhart-Harris and Friston provides compelling evidence that psychedelics probably work by weakening the brain’s top-down priors, meaning I got this one maximally wrong. This probably suggests a few other related posts, like Mysticism and Pattern-Matching, have serious flaws or are looking at things in unproductive ways.
27. (4/24/19) A key piece of evidence in Why Were Early Psychedelicists So Weird? was a study finding psychedelics caused a permanent increase in openness. A more recent study failed to replicate this finding. I’m not sure how much to discount the post based on this.
26. (3/28/19) I reported on a survey I did that showed people’s intuitive moral valuation of animals matched their number of cortical neurons spookily well. A commenter replicated the survey with a larger sample and got different results. I partially retracted the post on the survey based on these concerns, but then we found a way to sort of reconcile the two studies, so I updated the partial retraction. It’s complicated.
25. (11/20/18) My post The Economic Perspective On Moral Standards started with a discussion of the phrase “there is no ethical consumption under late capitalism”. Some people brought up that this phrase may usually be used in a way opposite to the way I was describing it. See the comments for discussion, but given the potential error I excised it from the post.
23. (8/28/18) In Elegy For John McCain, I tried to write a poem that balanced my various conflicting feelings about him, including as a high-minded and noble person, a martial hero who refused to tolerate evil, and somebody who when you got down to the consequentialist brass tacks of it spent a big part of his life promoting war and death in a disastrous way. I apparently failed terribly at this and everyone thought it was just attacking McCain in a way that was cruel to do in a period of mourning after his death. I don’t want to claim this is just them being wrong, because I think the poem does have an element of that and I misjudged its appropriateness at least as much as I misjudged other people’s reaction, and because enough smart people interpreted the poem that way that I probably have to accept that’s the way the poem is. I also should hold myself to higher standards since I’ve come out as against this kind of thing before.
22. (6/16/18) In HPPD And The Specter Of Permanent Side Effects, I focused on cell death as the likely explanation for occasional permanent side effects from psychedelics. Some commenters brought up an alternate model based on aberrant learning. Although the specifics don’t make a lot of sense, this would present a better explanation for occasional LSD flashbacks and for seemingly-similar conditions like mal de debarquement. It was a mistake of emphasis to focus on cell death rather than on this possibility.
21. (4/12/2018) The post Why DC’s Low Graduation Rates found that DC has very low graduation rates relative to their test scores (relative to other cities and states) and concluded that the most likely explanation was DC had stricter standards. This was framed as other states and school districts likely had borderline-fraudulent practices of passing students who hadn’t learned very much. Although there’s independent evidence for this and it definitely happens to some degree, based on some of the evidence in Highlights From The Comments On DC Graduation Rates it looks like another major (more important?) factor is DC’s strict policies on student absences, and possible screwy incentives that make student absences difficult to avoid. If this is true, focusing so much on standards was a mistake of emphasis.
20. (4/10/2018) Despite some efforts otherwise, my post on adult neurogenesis came off sounding like it was settled science that it didn’t happen. A few days later, this study came out, providing some more evidence for the “it does happen” side. See also this comment. Although the subject is still in doubt, and although there’s still a really interesting contrast between the certainty with which people discuss the intricacies of neurogenesis and the uncertainty about whether it happens at all, it would have been more accurate to frame the post in these terms instead of as “it definitely doesn’t happen”.
19. (12/17/2017) In Right Is The New Left, I predicted that liberalism was becoming so mainstream and busybodyish that rebellious young people would start leaning right just to avoid it. Although some young people joined the alt-right, I wish I had also predicted the degree to which rebellious young people would also move to a democratic socialist Bernie-Sanders-esque position that attacks mainstream liberals for being sellouts and shills and morons.
18. (12/17/2017) In my posts about tax policy, I misunderstood the rates at which growth to the economy compounds; as a result, economy-expanding policies look better compared to direct-redistribution policies than I’d originally calculated. I still think the existing tax bill probably doesn’t expand the economy that much. There were a few other errors mentioned in the same post.
17. (11/19/2017) In Depression Is Not A Proxy For Social Dysfunction, I cited results that happier states had higher suicide rates. A more recent study using a finer level of analysis is not able to replicate those results. This strengthens the case that the European results are just an artifact of more vs. less industrialized, strengthens a possible case that western states and high-altitude states are correlated and western states have higher happiness while high-altitude states have more suicide, and weakens a case that high altitude directly causes higher suicide rates. Overall it weakens a case that happiness and suicide are correlated in a meaningful way.
16. (10/22/17) In Book Review: Age of Em, I said we could calculate where hardware would be in fifty years by assuming Moore’s Law held. But Moore’s Law doesn’t seem to be holding anymore, or at least is on the verge of not holding, and that calculation would have been very wrong.
15. (9/24/17) In Mental Disorders As Networks, I tentatively endorsed a theory that there may not be a single underlying cause to all the symptoms of psychiatric disorders, but that instead they might all cause each other by spreading activation through a symptom network. While I still think there might be some element of that, learning more about predictive coding has made me think that psychiatric disorders might come from imbalances in various really-high-level features of the kind of processing the brain does, like “too much bottom-up compared to top-down processing” or “overly high confidence in neural predictions”. See Towards A Predictive Theory Of Depression for an example. This would explain the previously inexplicable tendency of psychiatric disorders to be caused by practically anything but still have the same symptoms, and it would mean that network theory was at least mostly a dead end.
14. (9/24/17) In Hungarian Education III, I mention that it’s implausible that the same family could give birth to three geniuses – regardless of the IQ of the parents – due to regression to the mean. Several commenters pointed out this is true only if the parents had high IQ for non-genetic reasons. If the parents had high IQ for genetic reasons, and performed neither better nor worse than their genes, then regression to the mean would not be expected and the family could give birth to three geniuses.
13. (7/10/17) In Change Minds Or Drive Turnout, I repeated the claim that turnout was down 2% between 2012 and 2016, and tried to determine what that meant for political strategy. In fact, this was based on early numbers before all ballots were in, and turnout probably rose during that time. Some of my analysis in Part III of the post is invalid or obsolete, though it probably doesn’t affect the overall conclusion.
12. (4/9/17) G.K. Chesterton On AI Risk was an April Fools’ joke and not meant seriously. But it did criticize Maciej Ceglowski’s piece where he accused singularitarians of not caring enough about the poor. The fake Chesterton of the piece said that if Ceglowski himself gave to charity at the same level as the people he was criticizing, he would “eat his hat”. Ceglowski pointed out that he does indeed give a lot of money to charity, including a $15,000 donation last year. Under the circumstances, I felt honor-bound to eat my hat and post a video of it on Twitter.
11. (4/9/17) In Some Groups Of People Who May Not 100% Deserve Our Eternal Scorn, I said that I thought Vox was mostly behaving responsibly, insofar as everyone knows its biases, and looking beyond them it has good articles on a variety of subjects. Several commenters argued that they had an irresponsible tendency to publish informal “meta-analyses” of the arguments for and against a specific position, while exaggerating the arguments on their side and leaving out the best arguments on the other. They said the “review of evidence” format has an implicit claim to objectivity that can’t be sidestepped just by saying “everyone knows their biases”, and gave some examples of this being used really egregiously. After some thought, I agree. I continue to personally respect many of the people at Vox and enjoy their articles, but saying that they “may not 100% deserve our eternal scorn” was probably going too far.
10. (12/31/16) In Contra NYT On Economists On Education, I said that an NYT article was so deceptive that it constituted journalistic malpractice. Some people commented that they interpreted the article in a different, non-deceptive way. Although I still think most people would interpret the article incorrectly, I think it’s sufficiently possible to interpret it as intended that it was probably just an honest disagreement in interpretations and not a deliberate attempt to mislead. I deleted the “journalistic malpractice” phrase after about an hour, and I apologize to the Times and to the author for imputing motive. I still think they dropped the ball, though.
9. (12/30/16) In Vegetarianism For Meat-Eaters, part 2 suggested that donating to animal welfare charities could save 3 – 11 animal lives per dollar. Based on critiques like those in this essay, I now think those numbers are heavily exaggerated, maybe by several orders of magnitude. I don’t know what the right numbers are or whether the point is still somewhat valid.
8. (12/1/16) I mentioned a few times (eg here) a theory that increased carbon dioxide in poorly-ventilated areas might hurt cognition, especially in the context of global warming which we expect to cause increased carbon dioxide everywhere. Commenters pointed out that submarines had carbon dioxide levels double or triple those of anywhere else studied, but submariners seem pretty with-it and don’t show noticeable cognitive declines. My guess (75% confidence) is now that there’s no cognitive penalty for carbon dioxide within the levels humans encounter in normal situations, but this could change if I see more good studies on it.
7. (1/09/21) I forgot to include a mistake 7 on this Mistakes List, and it would be too much trouble to renumber all the others to correct for this. ACX regrets the error.
6. (3/26/16) In my March 2016 links post, I linked to a Wikipedia page about a radar detector detector detector detector. Rational Conspiracy has looked into it further and believes that was a hoax and there was no such thing. The hierarchy of radar detection most likely ends at radar detector detector detectors.
5. (1/1/15) In my January 2016 links post, I noted that according to an article in Mother Jones, OxyContin abuse kills three times more people than homicide. Although the article was about OxyContin abuse, the specific statistic cited was about all fatal drug overdoses. This wouldn’t have been such a big deal except that it was linked by Marginal Revolution. Sorry, Marginal Revolution.
4. (12/1/15) In College And Critical Thinking, I claimed that a graph showed a u-shaped relationship between time spent in college and critical thinking, which suggested that the relationship between the two was too confusing and unpredictable to be very strong. In fact, commenter PSJ pointed out that this was only true of a small sample of two-year college students, and that most college students showed the expected linear relationship. Discovering the mistake strengthened my conclusion (that college probably does improve critical thinking skills at least in the short term).
3. (4/22/15) In Growth Mindset 3: A Pox On Growth Your Houses, I claimed that a graph showed that most conditions of an educational experiment deteriorated over time, and that since this was very strange the study probably couldn’t be trusted. In fact, the graph was standardized in a way I didn’t notice, and showed only that those conditions did worse than the other conditions, without deteriorating outright. This made the study much more believable than I had thought. The author of the original study corrected me and I explained the correction in detail here. Discovering the mistake lessened my confidence in my conclusion (growth mindset isn’t very impressive and often fails outright) without entirely reversing it.
2. (6/29/14) In Invisible Women, I pointed out a paradox – how come women are joining the workforce, but GDP has not gone up proportionally to the increased number of workers? Pseudoerasmus explained that men were working less often and working fewer hours in a way that counterbalanced increased female employment. Discovering the mistake did not affect any “conclusion” since I was just asking the question, but the question turned out to be much easier and less weird than I had expected.
1. (10/20/13) In The Anti-Reactionary FAQ, I claimed that there was likely not much difference in crime between the distant past (especially Victorian England) and today, because although the reported burglary rate was up, the reported murder rate stayed the same, and murder is the most accurately recorded crime. Michael Anissimov points out that medical care has improved since that time, so that many things that would have been murders in the past are now only attempted murders, lowering the apparent murder rate by as much as five times. Discovering the mistake caused me to reverse my conclusion that crime has not been increasing since the Victorian age.
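To make the arithmetic behind that last correction concrete, here is a minimal sketch. It assumes a deliberately simple model in which murders equal serious assaults multiplied by the share of those assaults that end in death. The function name and the sample inputs are hypothetical illustrations, not figures from the post or from Anissimov; the factor of five is only the upper bound mentioned above.

```python
def implied_violence_ratio(reported_murder_ratio: float,
                           lethality_reduction: float) -> float:
    """Ratio of today's serious-assault rate to the historical rate.

    reported_murder_ratio: today's reported murder rate divided by the
        historical one (1.0 means "stayed the same").
    lethality_reduction: factor by which better medical care is assumed
        to have cut the share of serious assaults that end in death.
        This is a hypothetical input; the text only says "as much as
        five times".
    """
    if reported_murder_ratio < 0 or lethality_reduction <= 0:
        raise ValueError("ratios must be positive")
    # murders = assaults * lethality, so
    # assaults_now / assaults_then
    #   = (murders_now / murders_then) * (lethality_then / lethality_now)
    return reported_murder_ratio * lethality_reduction


# A flat reported murder rate combined with the upper-bound factor of five.
print(implied_violence_ratio(1.0, 5.0))
```

Under this simplified model, a murder rate that "stayed the same" is compatible with serious violence having risen by up to the same factor that lethality fell, which is why the correction reverses the original conclusion rather than merely softening it.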