760 Comments

Does anyone else feel like this whole thing is an argument that science should be done purely anonymously?

It seems impossible to separate ego and tribalism from these analyses. If everything HAD to be published anonymously, maybe that would remove all incentives for anything other than just getting the best evidence in a place where other people can evaluate it?


Does anyone else feel that this situation is more evidence that this is all a waste of time for them to pay attention to because they only have X hours in the day, and if people laser focused on it seem to vehemently disagree with animosity, the different conclusions are more likely to be something like “the same hard-coded priors that cause political and religious differences”?

I’m glad someone else is looking at this for sure, but it seems like it would be foolish for me to claim to know anything beyond “people are fighting about it, this one group says X, this other group says Y.”

That seems like it would have worked better through most of the history of science, rather than trying to, e.g., pick between gradual and calamitous theories of geological formation, the hot-button issue of a few centuries back.


This is why I like this blog. I wish the media would do more stuff like this. Very interesting, and now I feel like I have an informed opinion on something that two weeks ago I would’ve punted on. Kudos to Peter.

Apr 9·edited Apr 9

OK, so if we don't over-focus on tiny details (which is what Peter's argument is), then after all that discussion what we still get is:

1. A novel coronavirus emerges in the same city as a lab doing experiments on novel coronaviruses and not anywhere near where such viruses naturally occur, or in any of the other cities of the world where animals and humans come into regular contact.

2. PRC immediately blocked all investigations.

3. Self-proclaimed "prestigious scientists" thought a lab leak was "just so freaking likely", lied about it deliberately and then organized a PRC-style conspiracy to shut down all such speculation.

Sum that up and it looks very likely that it came from the lab and everyone with relevant expertise immediately realized that was the most likely probability.

For the specific questions about things like Brazil or whatever that seem baffling, all those apparent problems are only problems if you continue to accept that modern epidemiology has any accuracy or value. It doesn't, at all. These are the people who claimed with 100% confidence that COVID wouldn't be seasonal, that it spread in a homogeneous population and that it would grow without end until everyone had been infected unless there were hardcore lockdowns. No other respiratory virus works that way, and sure enough COVID cases turned out to be seasonal, to depend heavily on things like age and superspreaders, and to peak long before everyone was infected. Lockdowns meanwhile made no difference.

So if there was an early release that went around the world before it was officially recognized, we shouldn't be baffled by non-problems like "why did it not double every 3 days in Brazil" because that claim was wrong to begin with. There is no prestige, there is no skill. Computer scientists are far better at building models and agricultural scientists (or indeed workers) understand animal disease dynamics far better than academic epidemiologists. The people with prestige here are not the ones ignoring counter-arguments.


This is really great Scott. Thank you.


To me it feels like Bayesianism was the hardest hit in this debate. I have long suspected that in the vast majority of cases when someone talks about "priors" and "updating" they're basically doing this https://www.smbc-comics.com/comic/bayesian, and in the debate we can see four people (Saar, Peter, judges) doing Bayesian analysis ("The Math And The Aftermath" in the original section) and getting wildly different results from plugging wildly different numbers into the same formula. How can you properly do Bayesian analysis if you can't even agree on the priors and updates, which then become all sorts of ways to smuggle your biases into the result? So it doesn't surprise me that "[the judges] both thought of probabilistic analyses as an afterthought"; it looks like an obviously correct decision and (at least to me) somewhat demonstrates that no one generally does proper Bayesian stats with math and numbers for fuzzy, politically charged real-world problems. (I guess it's still useful for estimating the distribution of black and white balls in an urn or the probability that a given email message should go into the spam folder.)
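As a rough illustration of the point about plugging different numbers into the same formula, here is a minimal sketch of odds-form Bayes. Every number in it is hypothetical and chosen only for illustration; none of them are the figures Saar, Peter, or the judges actually used, and the function names are made up for this example.

```python
from math import prod


def posterior_odds(prior_odds, likelihood_ratios):
    """Odds-form Bayes: posterior odds = prior odds * product of likelihood ratios."""
    return prior_odds * prod(likelihood_ratios)


def odds_to_probability(odds):
    """Convert odds in favour of a hypothesis into a probability."""
    return odds / (1 + odds)


# Hypothetical: two analysts start from the same even prior (1:1) and look at
# the same three pieces of evidence, but assign different likelihood ratios.
analyst_a = posterior_odds(1.0, [10, 5, 2])     # reads all three as favouring H
analyst_b = posterior_odds(1.0, [0.1, 0.5, 2])  # reads two of them as favouring not-H

print(analyst_a, odds_to_probability(analyst_a))
print(analyst_b, odds_to_probability(analyst_b))
```

With these made-up inputs the two posteriors differ by a factor of about a thousand, even though the formula and the prior are identical. The disagreement lives entirely in the likelihood ratios.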


I am 99% sure it’s zoonotic because of the arguments presented here and in other blogs I have read. However if it’s proven to be 100% lab created I would go with an American attack.

Why? Well as Sherlock said if you eliminate the impossible then whatever remains, however improbable, must be the truth. And the leak is highly unlikely given where the virus clearly originated - the wet market.

Lab created is not lab leak, yet everybody assumes this to be true. But these are independent variables.

Also were it an attack then choosing Wuhan has two layers of plausible deniability. The first is the wet market (where in this scenario it’s deliberately implanted). The second is that if the virus is proven to be lab created then people would still blame the Chinese.

Obviously this wouldn’t work with Beijing.

Another fact that might put some credence into this conspiracy theory is the very strong efforts by western media and scientists to deny the lab origin. Were it proven to be lab grown this would be retrospectively suspicious.

Anyway this isn’t my belief - I’m team zoonotic. Take one thing away from this - lab grown is not lab leak.


> December: COVID doesn't exist, it's all lies

> Early January: Fine, it exists, but it’s just some wet market thing that can't spread from person to person

Not quite. It’s called Covid 19 because it was recognised in 2019, albeit on New Year’s Eve.

(Although an Irish health minister did think that there were 18 other versions. We’ve made him prime minister. )

Edit:

Actually I was posting this based on an old report by the WHO. They’ve updated since to say that it wasn’t confirmed by China until Jan 9th.


Scott's good faith treatment of some of these "very prestigious authors" is not well-deserved.

Scott is great at Normal Scientific Situations, where there's a thorny question, and maybe it's answerable through exhaustive literature meta-analysis, and where the papers comprising the literature have been written by dispassionate actors.

This is not like that. Practically all of the scientists mentioned above who have published about this have conflicts of interest, and their livelihoods would be adversely affected if LL was institutionally accepted or conclusively proven. Because they're scientists, and all they know how to do is write papers, that's what they do, and because they're human, they're engaging in motivated reasoning while writing those papers.

Some of them are just lying. The model is big tobacco or an industry lobby.


“If they secretly knew they’d just started the worst pandemic in modern history, wouldn’t they at least be wearing masks?”

They’re all under 60 and have a healthy BMI, so… no? Assuming a massive coverup they’d presumably know that the virus is only fatal to elderly people, those with a BMI over 30, and those with major comorbidities.

I don’t believe the massive cover up theory but IMO it’s perfectly rational for extremely sophisticated actors to not wear a mask in that situation. Not to mention they’d have to wear N95 masks to avoid getting infected, a simple surgical mask would do ~nothing to protect them from others.


I guess it took some mental effort to keep your odds at 90-10, even though I agree with them. Essentially, none of the arguments listed in the original debate post were thoroughly debunked or disproven, and many of the weird coincidences stayed weird.

But man, don’t you want to update when one side seems so overconfident in their theory and arguments? Not sure what the associated fallacy is for this one.


"Lv (is this even a real name? It sounds like Roman numeral? But I guess that’s what you expect in a country ruled by someone named Xi)"

This one I can answer. In the romanisation of Chinese, the letter u actually represents two different sounds. When done properly, the sound "oo" like in "boot" is the plain letter u, and the sound "u" as in the French "tu" is represented by a u with umlaut, like this ü (should show up properly on most computers these days?). But in practice, no-one ever bothers to write or type the dotty accents. However, when you type Chinese, all the input methods have you type the letter v for the accented ü. It's an American keyboard workaround. This means that some people have now started to use the actual letter v to stand in for that ü.

This happens particularly in the surname Lu, because there are two different surnames, one with the u sound (陆 Lu), and one with the accented ü sound (吕 Lü). It's useful for people with these different names to keep them unmixed; but it's a hassle to type the u with umlaut. So a few people named 吕 Lü have just started using that typing shortcut and just romanising their name as Lv.


Re raccoon dogs, I think you misunderstood my point about the whole inventory. They were selling 38 raccoon dogs per month across four Wuhan markets. Over 11 days they tested 15 raccoon dogs at the suppliers of various markets in Wuhan. My point was that this is 15 raccoon dogs tested over 11 days, against sales of 38 per month, and so likely comprises all or nearly all of the inventory of raccoon dogs that would be sold in Wuhan in that period.
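One way to read the arithmetic behind that claim, as a minimal sketch: it takes the 38-per-month and 15-in-11-days figures from the comment and assumes a 30-day month and a steady sales rate, neither of which is stated in the original.

```python
# Figures from the comment: 38 raccoon dogs sold per month across four markets,
# 15 animals tested over an 11-day window.
SOLD_PER_MONTH = 38
TESTED = 15
WINDOW_DAYS = 11
DAYS_PER_MONTH = 30  # assumption: a 30-day month and a steady sales rate

# Animals one would expect to be sold during the 11-day window.
expected_sold_in_window = SOLD_PER_MONTH * WINDOW_DAYS / DAYS_PER_MONTH

# Tested animals as a share of the expected sales in that window.
tested_share_of_window = TESTED / expected_sold_in_window

print(round(expected_sold_in_window, 1))
print(round(tested_share_of_window, 2))
```

Under those assumptions roughly 14 animals would be sold in 11 days, so 15 tested animals is on the order of the whole inventory for the period. If sales were lumpy rather than steady, the share could be noticeably different.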


I can't take arguments that dismiss Nepalese journals no one's heard of seriously.


Re: Connor Reed, I don't know if it's relevant, but he died from an overdose of various drugs. It's a sad story, but the discrepancies in the accounts are there: family says he never took drugs but the university dorm mates say he used to regularly order and have them delivered.

https://www.walesonline.co.uk/news/wales-news/connor-reed-covid-death-wuhan-21116962

So I think the different accounts can be explained by (1) he does contract coronavirus (2) he's the (allegedly) first known British case (3) this gets interest from the news media (4) he does a plain interview with local paper (5) national and other media get interested (6) 'hey I can sell my story!' but it's no good giving the same version of it over and over again, each news outlet wants something new so (7) he builds up the story for them e.g. from the original "I just thought it was the flu, that's why I refused antibiotics" to "I was suffocating with pneumonia, further details inside".

https://www.walesonline.co.uk/news/wales-news/connor-reed-covid-coronavirus-wuhan-19239316

"His parents, originally from north Wales, emigrated to Australia's Gold Coast when Connor was 12. The young man was forever dreaming up "get-rich-quick schemes" said his father and he dreamed of one day making millions from some unspecified Anglo-Chinese business venture."

I can imagine someone who dreamed of being rich and famous seizing on the chance to get fees from the media from selling his story, and the juicier the story, the better the fees. That's why he wasn't keeping the story straight between versions.


"Peter, very reasonably responding to the numbers Saar gave during the debate and not the numbers he had elsewhere, trolled him by giving a set of numbers that came out to 10^25-to-one in favor of lab leak."

I think you mean "in favor of zoonosis" or "against the lab leak"


Not sure if we'll do a proper response, so in the meantime I'll open a few threads here on the major issues.

The most important thing to note is that this post discusses a lot of evidence (which mostly is irrelevant to our analysis) but Scott does not address the main problem we pointed out - that his whole conclusion stands solely on the market being some random place in Wuhan that is no more likely to form the early cluster.

He assigns this a 500x(!) factor, compared to our 2x. We specifically pointed to major mistakes in his calculation (see our blog https://blog.rootclaim.com/covid-origins-debate-response-to-scott-alexander/ and especially the last section).

In summary:

1. His calculation assumes zoonosis will almost always start in markets, while in SARS1 it was 1 in 9, and in the USAID PREDICT project markets were given a low priority. This is a 5-10x mistake. (He tries to refute this using cherry-picked examples. I'll address that in a separate thread).

2. He then gives no weight at all (Zero!) to the conditions in HSM, implying an HSM vendor who interacts daily with many people in an unhygienic closed environment that was proven to form early clusters elsewhere, is no different from a random Wuhan resident. This is a 10-100x mistake, depending on how much more conducive you think HSM is. And even if you don't think it is, just the fact it has 1000 people in the same space means it is more likely to be noticed, since there needs to be some critical mass of hospitalizations - how many people in Wuhan work in such a place?

3. Even if you somehow manage to ignore this, there is still the alternative explanation of the Wuhan CDC moving just next door to the market during the outbreak, which could also easily account for 10-100x.

Since Scott's odds are currently 17x zoonosis, fixing these mistakes easily turns him into an avid lab-leak supporter.
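For readers who want to check the arithmetic in this claim, here is a minimal sketch. It takes the 17x figure and the correction ranges from points 1 and 2 at face value, and it assumes the corrections are independent and simply multiply, which is this sketch's assumption rather than something the comment spells out. It shows only what the numbers would imply if accepted, not whether the corrections are right.

```python
SCOTT_ODDS_ZOONOSIS = 17.0  # from the comment: "17x zoonosis"

# Claimed correction ranges from points 1 and 2, as (low, high) factors.
# Multiplying them assumes they are independent (an assumption of this sketch).
claimed_corrections = [(5, 10), (10, 100)]

low_total = high_total = 1.0
for low_factor, high_factor in claimed_corrections:
    low_total *= low_factor
    high_total *= high_factor

# Odds for zoonosis after dividing out the claimed overstatement.
adjusted_odds_smallest_correction = SCOTT_ODDS_ZOONOSIS / low_total   # 17 / 50
adjusted_odds_largest_correction = SCOTT_ODDS_ZOONOSIS / high_total   # 17 / 1000

print(adjusted_odds_smallest_correction, adjusted_odds_largest_correction)
```

Odds below 1 favour lab leak, so under these assumptions the adjusted figure lands somewhere between roughly 3:1 and roughly 59:1 for lab leak. Point 3 is offered in the comment as an alternative rather than an additional factor, so it is left out here.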


Thank you very much for all that work!


> SARS spread back and forth in some kind of weird net between civets, raccoon dogs, and a bunch of made-up-sounding animals like "ferret-badgers" and "greater hog-badgers".

I'm not sure what makes "ferret badger" a weirder name than "raccoon dog".

Animals tend to either (1) have a local name; or (2) be named after an animal that is local to the language that needs a name for the animal. The Tasmanian tiger resembles nothing so much as a dog (which it isn't), but it's called a "tiger" (which it also isn't) anyway, because they had to call it something, and people only know about the animals that they already know about. "Tiger" was good enough to distinguish it from other marsupials.


I remain completely perplexed by the sources of evidence used here, and the fact that anyone thinks they can draw solid inferences from data that has obviously been manipulated by the Chinese government. Maybe someone can explain the premise to me, and maybe there's something I'm missing, but analyzing the early cases as acknowledged by the Chinese government just seems fundamentally pointless.

Everyone agrees there has been a cover-up. No one seems to agree exactly what was being covered up and why. Based on the way authoritarian regimes typically act in response to such situations, I think a fair guess is: many different things, for many different reasons, and often for no reason at all. And surely it was also a multi-level cover-up; a standard feature of authoritarian regimes is that lower-level officials conceal things from higher-level officials. So, you may have had multiple coverups operating at cross purposes to one another. Thus, the kind of pattern Scott suggests is so unlikely seems actually quite likely to me; not because it was conscious misdirection by some omni-competent villain but because it was the result of different officials at different levels trying to conceal different things.

So, I think before you can make *anything* at all out of the data on the purported early cases, you need some kind of theory of the data generating process and the biases it introduces. Looking at the data without that backdrop is like a case where I see "here's a survey of 700 people about politics," tell you nothing about how I sampled them, and then you analyze them under the assumption the sampling was random, when in fact this could be 700 people I sampled at a Democratic party meeting or 700 people I know personally or anything else.

So, can someone explain this to me? What is everyone doing? It seems like there's zero analysis on this and everyone just kinda takes the early case data at something like face value with no model of where it came from. Is there a model?

Perhaps the justification for doing this is that it's too hard to come up with a meaningful model. Probably no one in the world has a granular understanding of both the multi-level Chinese politics involved and the virological side of things. You probably can't develop that understanding of the Chinese politics purely based on open sources, so to really have all that information, you'd need either a regime insider or, maybe, an intelligence agency that knows a lot of non-public things about Chinese politics.

Which brings me back around to a point I made in the last post: the only real candidate for an organization that can bring both pieces together is the US intelligence community. It's hard to be sure how good their political understanding is, especially given the high-profile failure in recent years of our HUMINT in China, but given the amount of investment we put into understanding Chinese politics, it's probably better than anyone else's. And they have the scientific experts as well. So, it strikes me as *extremely* notable that the intel community is divided on lab leak vs. zoonotic origins, but no agency is willing to offer more than low confidence for the zoonotic origin.

Unless you're incredibly sure that you have a better understanding of Chinese politics than the US government does (not impossible, but they definitely know many things you don't), I really just don't see how you can possibly reject that conclusion.


I like how you basically went through the same process most of us in this topic went through, but at like 10x the speed.

The "asymmetry of passion" on the lableak side will always guarantee that a plethora of arguments, rationalizations and attacks will be thrown at the wall to see what sticks, and like the famous hydra, anytime scientists knock down one argument, two more spawn in its stead. As I have written here https://www.protagonist-science.com/p/a-tale-of-two-pandemic-origins and elsewhere, the controversy is mostly driven by a set of asymmetries that blows up the lableak side; despite the emperor being naked.

After debunking a few dozen of those flawed arguments, all but the most stubborn scientists and communicators just have to give up; it's exhausting, draining and often comes with unpleasant harassment on top.

That does not mean that most people confused or unsure about this topic are conspiracy theorists (just wrong on the topic), or do not deserve good information; of course they do. But asymmetries and the actions of certain activists just make the fight against the windmills of conspiratorial ideation much harder for the minority of scientists who can speak intelligently about the topic, as I am sure you will experience now that you are coming out with yet another long post that will expose you to some of that same "passion".

In any case, maybe your 10x speed approach is more sensible: just ripping off the band-aid and getting back to normal life quicker, while the virologists who conducted the essential work have been stuck in a nightmare for years for just doing their jobs: getting death threats and white powder letters, having to hire security guards, scraping the names of their labs off institute maps, etc.


> Lv (is this even a real name? It sounds like Roman numeral? But I guess that’s what you expect in a country ruled by someone named Xi)

Yes, it's a real name. Lv is how the Chinese commonly spell the syllable that is more formally supposed to be spelled lü. (IPA: /ly/.) The reason they do it that way is that pinyin input methods require typing "v" instead of "ü", for the fairly obvious reason that there is no ü key.

But since people are more familiar with the typing they do every day than they are with the specifics of a system of writing Chinese in foreign characters for the benefit of foreigners, they think of "lv" as being the correct spelling.


Ironically it still seems plausible to me that someone got infected at the lab and infected someone working at the market. Just, I don't have any reason TO believe that.

I can't follow any of the biology, but none of it sounded like "probably came from human alterations" was more likely than "probably evolved randomly in an animal".


"So which is more likely - that somehow 20 people had COVID long before the virus was officially detected, and on a totally different continent, yet somehow a scientist looking through wastewater found the water from exactly those people and managed to detect the virus? Or that there was a sampling error, which happens all the time in these kinds of things?"

You are disregarding the possibility that the virus detected in Brazil, and other locales, was not Covid-19 but another virus similar enough to Covid-19. As far as I know, "detected" doesn't mean the whole genomic sequence matches SARS-CoV-2; only a few distinctive segments. It should be theoretically possible to detect as Covid-19 a different, yet similar enough, virus.

Hope someone will correct me if I'm wrong.

Apr 9·edited Apr 9

Your discussion of ascertainment bias is mistaken on numerous points.

1. The fact that cases in his dataset had symptom onset date in December doesn't show that there is no ascertainment bias. From around 30th Dec, hospitals could not send people to even get tested if they were not market linked. This includes many people who were still in hospital after 30th Dec who had symptom onset prior to 30th Dec.

I think this also accounts for your claim that the first five cases were market linked and cannot be subject to ascertainment bias. There is a difference between symptom onset date and when cases were found. It's also not even obvious that the first case was market linked, which is a sign of what a mess the case search was. The National Health Commission of China [reported](https://pdfhost.io/v/YCxN2y61O_The_outbreak_of_pneumonia_infected_by_the_novel_coron) ([original](https://archive.ph/gkpzs) version) the case of a 48 year old woman with no market link who had symptoms starting 10th Dec.

2. His robustness test is ridiculous, as noted in the peer-reviewed Dave Bahry article that he posted in the comments. He excludes some fraction of the cases and asks whether his statistical tests still hold. This excludes false positives, but we're interested in false negatives: people who didn't get counted because they weren't market linked. This is like sampling people in New York, then excluding some people from your dataset, and concluding that the global population is centred on Central Park.

3. You point to the Jinyintan study saying that people were diagnosed there on the basis of clinical characteristics, not a market link. The problem with this is that *suspected* cases at other hospitals could only be transferred to Jinyintan for testing if they had a market link from 30th Dec. There are Chinese news articles reporting this with quotes from doctors about how they couldn't transfer to Jinyintan https://docs.google.com/document/d/1_Cl-uVa7U8WlbssUVKNEI23xYqkEdGPMpnhwFzGZDFU/edit

For an in depth review of the reporting criteria at hospitals see this. It is amazing to think there was no ascertainment bias https://www.researchgate.net/publication/373301830_SAGO_Presentation_Limitations_of_the_official_2019_Wuhan_cases_based_on_Primary_Sources

The paper linked by Dave Bahry in the comments on your first post has statements by various Chinese bodies saying "we focused on the area around the market" for most of January, for a case search that ended in mid-February. Are they lying? What is going on here?

Also, the study conducted on Jan 2nd is the Jinyintan study, which is affected by the biases mentioned above, so you are mentioning this study twice, both in the Worobey quote and your following note.

4. None of this accounts for the Mr Chen case. He was only counted as a case because he transferred to Wuhan Central Hospital on the north of the river because his relative happened to work there. He lived 30km from the Huanan market and went nowhere near it in the two weeks prior to symptom onset on 16th Dec. This is strong evidence that loads of cases on this side of town were missed. Also, Mr Chen was one of the whistle-blower cases for the whole pandemic.

In Worobey 2021, he outlines how many of the first cases appearing at several hospitals were market linked. He doesn't grapple with how the Mr Chen case doesn't fit with this. Instead he says "he travelled north of the market shortly before symptoms". This is deliberately misleading. What he is referring to is Mr Chen travelling to Mulan mountain 90 km north of the city of Wuhan at the end of November. There is no indication that he visited the Huanan market on this tourist trip and it is more than 16 days before he developed symptoms, so it can't be when he contracted covid. This might also make you question how reliable a guide Worobey is. You will see in the Weissman piece that he also obviously doesn't understand basic statistics.

The George Gao link you are looking for is here https://www.bbc.co.uk/sounds/play/m001ng7c

He says they focused too much on the market in the case search and focused too much on the wild animal section in the environmental sampling.


700 million to one is about 8.8 in log10, and seems very implausible on the face of it.

Such high numbers can appear within a very specific model. The odds of coming up heads 29 out of 29 times when flipping a fair coin are roughly 1 to 537 million. But in the real world, you will observe this with a much higher frequency because sometimes your model assumptions are wrong, and the coin is not fair, has heads on both sides, is flipped in a deterministic way and so on.
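A minimal sketch of that within-model versus real-world gap follows. The log10 and 2^29 figures are just the arithmetic from the text; the 1-in-10,000 chance of a two-headed coin is a made-up number for illustration, and `prob_all_heads` is a hypothetical helper, not anything from Rootclaim's method.

```python
from math import log10

# Arithmetic from the text.
log10_of_700_million = log10(700e6)
outcomes_for_29_flips = 2 ** 29  # about 537 million equally likely sequences


def prob_all_heads(n_flips, prob_model_wrong):
    """P(all heads) if the coin is two-headed with prob_model_wrong, else fair."""
    fair_part = (1 - prob_model_wrong) * 0.5 ** n_flips
    trick_part = prob_model_wrong * 1.0  # a two-headed coin always lands heads
    return fair_part + trick_part


within_model = prob_all_heads(29, 0.0)
with_doubt = prob_all_heads(29, 1e-4)  # hypothetical 1-in-10,000 trick coin

print(log10_of_700_million, outcomes_for_29_flips)
print(within_model, with_doubt, with_doubt / within_model)
```

Even a small allowance for the model being wrong dominates the result: the total probability ends up close to the allowance itself, tens of thousands of times larger than the within-model figure.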

Given that Rootclaim are the experts on this kind of analysis, I think it would be part of their job to make clear which probabilities are within-model and which are real-life betting odds which they would take if this was a question which was likely to be settled with future evidence. If both Scott and Peter walked away believing that Rootclaim favored these extreme odds, that is a failure of communication on their part, and I can understand Peter's response of 'You want extreme odds? THESE are extreme odds!'

Politics is the mindkiller, and I think a lot of the actors are firmly committed to their camp, which is generally not helpful for debates or truth. Arguments as soldiers, where even a bad argument requires refutation from the opposite side, thereby binding their resources. This is a reason to not consider any arguments not brought up by either Peter or Saar (assuming that both are competent to give the best arguments for their side, which seems likely), otherwise you will be stuck with debunking bogus studies for the foreseeable future.

What percentage of papers are published by non-partisan scientists? Even the wrong origin theory likely has some facts which increase its odds, so a non-partisan researcher should end up publishing some mixture of arguments for and against the lab leak hypothesis. By contrast, a partisan scientist would start by writing at the bottom "Therefore the lab leak hypothesis is (more|less) likely than previously thought." and then try to fill the blank space above. That does not invalidate their arguments per se, but I would prefer not to read arguments from motivated reasoning. (Of course, there is the possibility that partisan researchers metagame by publishing weak arguments against their position.)

I do not think that this is overall a great topic for discussion. Policy-wise, it is not important as there is a general agreement that both zoonosis and lab leaks are an important avenue for future pandemics and we should try to avoid either. (The same is true if one debates if p(doom) is 0.8 or 0.01, btw.)

It is also unlikely to be firmly decided by future evidence. This is what I generally like about science: you start out with two positions like special relativity versus Newtonian mechanics, and then you do a bunch of experiments and eventually (at least) one side is thoroughly refuted. By contrast, COVID origins feel more like debating theism vs atheism in that it seems extremely unlikely that the question will be settled by evidence eventually.

Speaking of falsifiability, I can't help but notice that Rootclaim tends to focus on topics where it is unlikely that a firm consensus will be reached, like COVID origins or Syrian chemical weapon attacks. If their method was working as well as they claim, there should be some questions which will have firm answers in the future. Moreover, you can make a lot of money with such questions in both prediction and stock markets.

By contrast, most of what I see on Rootclaim seems unlikely to be validated by future evidence. There is some possibility that Putin having cancer or Trump having a toupee will become public knowledge at some point, but all of the whodunit stories will likely only settle on a guilty beyond reasonable doubt vs not guilty verdict, which is only a rough approximation of the objective truth in the best of cases.


I really appreciate you putting so much time and effort into your original post and this follow-up. COVID origins are a really, really important question. Something which caused millions of deaths is well worth a few hundred thousand dollars to incentivize the best analysis possible.

I really hope you end up doing your own bet/debate! Especially because, as a prediction market enthusiast, I think we are badly in need of high quality, neutral, near-term sources of resolution criteria for Covid origins questions!

I have already optimistically made some markets:

https://manifold.markets/Joshua/scott-alexander-is-planning-a-covid-9aaacc1866ce

https://manifold.markets/Joshua/scott-alexander-is-planning-a-covid


For what it's worth, I was strongly lab-leak prior to the original post - Scott can confirm this via the ACX survey I filled out a day or two before the blog post. I had no idea Scott was going to do a post about this, so had no reason to lie on the survey.

After the original post, I updated to about 15% lab leak.

After this post, I'm officially at 0% lab leak.

Maybe this can be a point of reference the next time "don't argue with conspiracy theorists, you won't change anyone's mind and you'll only legitimize and/or more-deeply-entrench the conspiracy theorists" comes up.

If Scott wishes to confirm my survey answer, my survey had an extended comment about whether ACX had gotten better/worse/the-same, which tied in with my answer about the Roman Empire. A Ctrl-F for "parables" may or may not return my answer, if not I'd be happy to privately provide the email address I used.


The US Natsec folk actually care a lot about the narrative on lab leak; it's a great accusation to make to disrupt CCP rule in China, but better to hold in reserve for the right time or as part of existing deterrence (and throwing accusations like that around willy-nilly, based on research done by random bloggers and play-money engines, damages the reputation of democracy among authoritarian-leaning governments around the world).


> He says:

>> As soon as I get there, a doctor diagnoses pneumonia. So that’s why my lungs are making that noise. I am sent for a battery of tests lasting six hours.

> And then says that he went home either that day, or the next.

>> Day 13: I arrived back at my apartment late yesterday evening. The doctor prescribed antibiotics for the pneumonia but I’m reluctant to take them

Having recently been diagnosed with pneumonia at a Chinese hospital, this is much closer to what happened to me than the later Fox version is. The "battery of tests" consisted of a chest X-ray, a blood sample, and a urine sample, and it took a while, but that is largely because it took me a long time to be able to pee. At one point, a nurse knocked on the bathroom door to check if I was all right. After that, it was just waiting in an examination room until a doctor came in saying "this X-ray clearly shows that you have pneumonia". All together the process, including travel to and from the hospital, took somewhere between 4 and 7 hours.

The antibiotics seemed to cause weird auditory hallucinations, but I was not hesitant about taking them. It would have been difficult for them to be worse than the untreated pneumonia. My major symptoms were a fever of 39 C, unwillingness to eat or drink (and note - unwillingness to drink does not combine well with a high fever!), pretty questionable lucidity much of the time, and on two occasions a loss of balance while standing up. On the second of those occasions I hurt my ribs pretty badly by falling into my kitchen furniture.

I had no shortness of breath and my respiratory tract didn't make any noises that it doesn't make "normally". This has left me with a slightly funny feeling when I see people describing pneumonia as mainly about lung problems. Of course it is a lung problem, but I seem to have gotten everything but the lung-related symptoms.

From speaking to friends, I get the sense that my case was worse than average. (There was an epidemic of pneumonia going around at the time.)


As another way for Saar to demonstrate the success of his technique, why doesn't he enter forecasting tournaments (or make a Metaculus account)? I know you mentioned this in your original post, Scott, but I wish you'd emphasized it more here in section 3.3. If he quickly rose to the top of the leaderboard on Metaculus, I'd find that pretty compelling. And the fact that he hasn't done that (AFAIK?) makes me lose a lot of interest in his claims.


Your coverage of the Pekar et al replication problems is weirdly incomplete.

It might be worth mentioning that their first correction required lowering their significance threshold to preserve the conclusion.

It is definitely worth mentioning that the same guy who forced the first correction has a second correction which actually invalidates the conclusion. The first correction was about bugs in the code, the second was about a fundamental error. Authors have not responded to this yet, but given that they acknowledged the earlier errors the challenge is very credible.

On the Worobey et al criticisms, you quote as rebuttal a preprint written by a couple of highly conflicted scientists. One of the spatial statisticians (Chiu) has replied that the paper was even worse than they had originally thought and basically dared the virologists to publish their rebuttal.

On the timing, I believe most papers estimate an earlier origin than late November.


> For what it’s worth, my timeline of Chinese denials and coverups looks like this:

> 𝗠𝗮𝗿𝗰𝗵: COVID was a US bioweapon, or possibly came from Italy.

You accidentally touched on something that really bothered me earlier in the piece: Peter's summary of covid detection dates.

I clicked through to read his post, and he discusses problems with detection in Spain and Brazil, while completely glossing over Italy. Or, to be more specific, he mentions Italy quite a bit -- he says that they have an early detection date, in December 2019, and that although this is a surprise when compared to the date at which outbreak becomes obvious in some sense, it appears to be a real finding, because it's supported by other early detections between December and whenever it is (February?) that outbreak was formally declared.

But of course the big news about covid detection in Italy was that it was detected in November. Peter must be aware of this, and the argument he presents for why the December detection looks real automatically extends backward to November - if detection between January and March means the December finding is real, then there's detection between December and March, which should mean the November finding is real.

It's very, very weird that this goes completely unmentioned.


You still haven't read Weissman.

You spent 15 hours watching videos, probably as much again talking to people - mostly the same two people or people influenced by them!

You could easily read Weissman. You don't have to check his math any more than you checked the math on Pekar et al. You don't have to believe his numbers: his review of the evidence includes things that you don't seem to have heard about. You link to him only at the end of a long post, and you don't even mention that his numbers make Saar look less like an outlier. This is really not the Scott Alexander that I am used to.


I can't figure out the hanzi for researcher Lv, so this is just a guess, but I've occasionally seen v used to romanize what is more commonly represented as ü. Lü is a moderately common Chinese surname.


This has been awesome


"Saar mentions that there are several other possible sources like restaurants or farms. I think Peter demonstrated during the debate that pandemics are unlikely to start in rural areas, so farms aren’t that important. Restaurants mostly source their products from wet markets. During SARS1, some pandemics started in restaurants because they kept the civets in cages next to the diners (like how some Western restaurants keep lobsters). After SARS1, restaurants stopped doing that and became a less likely spillover location."

I think it's important to focus on this: "Restaurants mostly source their products from wet markets."

This implies that, if the cluster of cases first started around a restaurant, we'd treat that as evidence of zoonosis - that is, this actually *undermines* the idea that it would be a weird coincidence for the first case to appear in a wet market, because restaurants belong to the same reference class.

To elaborate on this: It's a big coincidence that the first case showed up near the research center, because the research center is nearly unique, in that it was researching exactly the kind of disease COVID was. If we restrict our consideration to only two cases - lab leak, and any other cause - then this is a -huge- update in favor of a lab leak, because we'd expect a lab leak to show up near this specific research center, but we wouldn't expect an "any other cause" to show up near this specific research center. So this is very strong evidence of a lab leak.

This is huge evidence; we must counterbalance it with equally huge evidence. So - would we expect a zoonotic origin to show up in the wet market, specifically? Or, alternatively framed: How many different locations would we expect a zoonotic origin to appear in, weighted by their actual likelihood of showing up there? Let's call this set of locations the "reference class".

You've just argued that restaurants are a part of the reference class - which *reduces the value of the evidence for zoonosis*. Zoos which take (for exhibits) or treat wild animals are valid reference cases. So is anybody exposed to the logistical chain involved in transporting mammals (there's no reason to favor raccoon-dogs except that they show up in this specific wet market - any feline, mustelid, or rodent, to name three entire families of mammals that aren't even the complete set of entire plausible families of mammals, would also work).

The reference class for locations that we'd expect a zoonotic origin to first appear in is -huge-. This specific wet market is not that special, it only looks special if we constrain our reference class in an extremely post-hoc manner, such as limiting ourselves to places with exposure to raccoon-dogs - when the only reason we're considering raccoon-dogs is because they were the most likely candidate vector in the specific location. If the most common animal in the wet market where it first showed up was Siberian weasels, we'd similarly be tempted to artificially limit our reference class to places with Siberian weasel exposure.

So the location the virus appeared in *doesn't actually provide much evidence for a zoonotic origin*, because the reference class is enormous, and so it isn't actually very surprising that the first identified site happened to be a part of that reference class. The more places you can identify that are plausibly *part* of that reference class, the less evidentiary value you should place on the place it actually first showed up; you've noticed that restaurants are part of that. This should cause you to update away from zoonosis!

So, from the information about where it first showed up, we have very strong evidence of a laboratory leak (of some kind - a zoonotic origin does not exclude the pandemic beginning with a laboratory leak; the disease could have arisen from mutation of a prior strain in laboratory animals, for example), and very weak evidence (it's still evidence, it's just very weak!) of a zoonotic origin.

Personally, I think the overall evidence in fact points towards "Zoonosis in laboratory animals or infected animals brought into the laboratory, with or without human intervention accelerating it, followed by a laboratory leak" as the most likely origin (human intervention accelerating being, for example, if the laboratory was deliberately evolving a virus in laboratory animals by manipulating group exposures). This also includes "Natural zoonotic origin in a sample animal brought into the laboratory and then leaked". The cleavage site fairly strongly suggests mutation, and the location information very strongly suggests laboratory leak.
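The reference-class point in this comment can be put into numbers. The sketch below is only an illustration: the location names and weights are hypothetical, chosen to show the mechanism, and are not estimates of anything. It treats the probability that a zoonotic spillover is first seen at one location as that location's share of the total weight in the reference class, so adding plausible locations dilutes the share of any single one.

```python
def location_likelihood(weights, observed):
    """P(first cluster seen at `observed` | zoonosis), taken as the
    observed location's share of the total reference-class weight."""
    total = sum(weights.values())
    return weights[observed] / total


# Hypothetical weights, for illustration only.
narrow_class = {"wet market": 1.0}
broad_class = {
    "wet market": 1.0,
    "restaurants": 1.0,
    "zoos / wildlife clinics": 0.5,
    "animal transport chain": 1.5,
}

p_narrow = location_likelihood(narrow_class, "wet market")  # 1.0
p_broad = location_likelihood(broad_class, "wet market")    # 0.25

if __name__ == "__main__":
    print(f"narrow reference class: {p_narrow:.2f}")
    print(f"broad reference class:  {p_broad:.2f}")
```

With these made-up weights, widening the class cuts the likelihood of "first seen at the wet market, given zoonosis" from 1.0 to 0.25. How much this matters for the overall comparison depends on the real weights, which the sketch does not attempt to supply.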


How does early cases clustering around the wet market prove it originated there? It only proves the fastest way to transmit is via dark cool wet surfaces. An infected lab worker might just as well have gone to the market to buy some fish or something, and the spread might have started there. Every city has a wet market but only one has a virology center studying coronaviruses. I was 50/50 after reading the original post, but over time I started sliding back towards the lab leak side.

Apr 9·edited Apr 9

Concerning the whole question of the Chinese cover-up: My understanding of the Soviet-school cover-up strategies is the following:

1) since lower levels of authorities are afraid of reporting bad news, it is difficult for higher up authorities to know what to cover up when, since they don't have all the facts reported correctly;

2) there isn't necessarily one single narrative pushed by a high party official and for which evidence is aligned in a concerted effort. Rather, the goal is to mess around with evidence enough so that nobody can make up any consistent story.

Both of these seem in line with the Chinese cover-up being shoddy, at times contradictory and irrational. The ultimate goal was not to convince the world of the "frozen lobster from Maine" origin story, but simply of the "it is very complicated, we might never really know" origin story. And this they did very well, as all these discussions illustrate.

Apr 9·edited Apr 9

"Then they banned Chinese scientists from researching the origins of COVID."

And that arouses no suspicion whatsoever?


Scott says: "I asked a synthetic biologist about [using CGGCGG]. He said:

» “Nope. I would literally never do this if I was designing a small insert (maybe I wouldn't notice if it happened by chance with ~1 in 25 odds in a naive codon optimization algorithm as part of a larger sequence). High GC% is bad. Tandem repeat is worse. Several other perfectly fine arginine codons. And I wouldn't engineer a viral genome using human codon usage. An engineer would not do it.”

---

The opinion of a single expert in a private conversation is not a good argument. Examples of good arguments:

1. Pfizer and Moderna vaccines recoded almost all arginines into CGG.

2. Shibo Jiang inserted a furin cleavage site and used CGG for the leading arginine.

3. If indeed the FCS was part of investigating the PAA -> PRRA hypothesis (see our full post https://blog.rootclaim.com/covid-origins-debate-response-to-scott-alexander/), then PAA is already CG-rich (CCT GCA GCG) so virologists modeling how PAA could naturally evolve into an FCS could have decided to keep it CG rich.

4. Having a unique sequence can be helpful for easy tracking of mutations in the FCS during lab experiments.
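The "~1 in 25" figure in the quoted reply can be checked with simple arithmetic. The quote does not say how the number was obtained, so the following is an assumption: if a naive codon-optimization routine picked CGG for each arginine independently with probability of about 0.2 (a round figure chosen here because it reproduces the quoted odds), two CGGs in a row would come up with probability 0.2 × 0.2 = 0.04, i.e. 1 in 25. For comparison, choosing uniformly among the six arginine codons would give 1 in 36.

```python
from fractions import Fraction

# The six codons that encode arginine in the standard genetic code.
ARGININE_CODONS = ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"]


def tandem_probability(p_codon, repeats=2):
    """Probability of getting the same codon `repeats` times in a row,
    assuming each pick is independent with probability `p_codon`."""
    return p_codon ** repeats


# Uniform choice among the six arginine codons.
uniform = tandem_probability(Fraction(1, len(ARGININE_CODONS)))

# Assumed per-codon probability of 1/5 for CGG (illustrative round number).
weighted = tandem_probability(Fraction(1, 5))

if __name__ == "__main__":
    print("uniform choice:  ", uniform)   # 1/36
    print("assumed p = 1/5: ", weighted)  # 1/25
```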


The post has several attempts to discredit Reed, all of them unsuccessful.

>This is a weird inconsistency! In the Wales interview, the cat got it before him (at least that’s how I interpret “I don’t think I caught it from her”). In the Mail interview, he got it nine days before the cat.

Not sure what the claim is here. That Reed made up the whole story about the cat? That would be pretty weird. Is that really the preferred explanation, rather than that he simply didn't notice how long the cat had been sick? Remember it's not his cat, just a "kitten hanging around my apartment".

--

>In The Wales interview, it’s “the feline coronavirus”. In the Mail interview, he doesn’t know what the cat got and speculates that it might have been COVID. But also, if it was “the feline coronavirus”, how would he know? Wouldn’t you need a vet to diagnose that? But in the Mail interview, he said he didn’t leave the house for a week around the time his cat was sick. So how did he go to the vet?

Scott seems to not understand that the Mail quote is not an 'interview' but a recreation of his experience day-by-day. Unclear how exactly, but since some recollections are very specific, he might have kept a diary or perhaps recreated it from messages to friends etc. It therefore makes perfect sense that in real time he didn't know what he had and thought maybe the cat got the same disease. Later he understood better and could make a more educated guess.

---

>"I was stunned when the doctors told me I was suffering from the virus. I thought I was going to die but I managed to beat it,” he told the outlet, adding he was hospitalized at Zhongnan University Hospital for two weeks following his diagnosis.

>In his earlier story, he was at the hospital for less than a day. Now it’s two weeks.

Reed never mentioned any hospitalization, and he isn't even directly quoted here. So this supposed major revelation relies solely on the reporter not misunderstanding Reed. In this case, it seems clear what happened: we know he went to the hospital two weeks (day 12) after symptoms - they simply misunderstood this to be a two week hospitalization. 

---

>But also, the doctors “told [him he] was suffering from the virus”, but this is impossible - the virus hadn’t been discovered yet.

This is from the same source. Again, much more likely that they lost the context and he was describing the later diagnosis he received. He's even quoted a few lines later "It was only when I called back a couple of weeks ago that they told me I’d had the coronavirus". 

---

>I can’t deny that it’s weird to do your regular shopping at a market an hour away, but it really sounds like he’s referring to the wet market where all the cases started here.

Again, Scott puts a lot of weight on news reporters accurately quoting and understanding everything Reed says. It is indeed very very rare for news reporters to make mistakes, but it's still more likely than a single man spending a 3-hour commute on his regular shopping.

This would all be much more convincing if the supposed lies and inconsistencies were found in the many video interviews he gave, but somehow those all consistently retell the same story of Reed accurately describing Covid symptoms in November.

Overall, these claims seem quite desperate. Reed proves to be very reliable, and all attempts to find inconsistencies in his story failed. It doesn't mean he had Covid - it's possible there was some misunderstanding (we don't even use him in our analysis). His story is, however, very useful as a litmus test for motivated reasoning. 

Apr 9·edited Apr 9

The two posts did shift me towards zoonosis (funnily, the first post shifted me into the other direction because the most important arguments that were new to me were pro lab leak).

But there is one point where I think that Scott has an inaccurate model, and that is the early stage of a pandemic. I think Scott underestimates the large variance that this early stage has. Doubling times are smooth and consistent (what I would call "sharply concentrated"), but only AFTER the initial phase, not DURING it. So, once you have 10-20 infected people, your data looks very smooth. But with fewer people, not so much. I happen to study very abstract versions of such processes (branching processes, Galton-Watson trees, and some highly abstracted forms of epidemics). And this initial phase looks very different from later phases in all these abstract settings. It's abstract, but it's a part that I would expect to transfer to the real world. It can easily happen that you have one or two or even three initial generations where the process just fizzles before properly starting. And this is even more true for infections where the number of transmissions per source person varies a lot (many infect no other people even in the same household, but some infect 5 or 10 or more). As it does for Covid.

This makes two main points of Scott's argument weaker:

I think we should be less sure about the early timeline. I am not talking about an additional month, but it is perfectly possible that the onset is 1-3 transmissions earlier than the model calculations, which is at least 1-2 weeks.

We should also be less sure that the first infection is at the same place as the center of subsequent infections. If the first generations of viral spread go like

1 infection -> 3 new infections -> 2 new infections -> 1 new infection -> BOOOOM,

then we will see the center around the person in generation 4. But this can be at a different place than the person in generation 1. I think Scott underestimates the likelihood of this scenario. BOOM here does not necessarily mean a single superspreading event. It can also be just a few more generations where the number of infections doubles or triples.

I am not saying that this scenario definitely happened. I think it's somewhat less likely than the opposite scenario, where there is no evolutionary bottleneck after patient zero. But Scott quite heavily relies on this scenario being unlikely. It also seems (one of?) the main disagreements with Rootclaim.
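The early-phase variance described above can be seen in a toy Galton-Watson simulation. This is a minimal sketch, not a model of the actual outbreak: the mean number of secondary cases `r0`, the dispersion `k`, and the take-off threshold are all illustrative assumptions, and the offspring distribution (a negative binomial, drawn as a gamma-Poisson mixture) is one common way to get "many infect nobody, a few infect many".

```python
import math
import random


def poisson(rng, lam):
    """Draw from Poisson(lam) with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def offspring(rng, r0, k):
    """Secondary cases from one person: negative binomial with mean r0
    and dispersion k, drawn as a gamma-Poisson mixture."""
    rate = rng.gammavariate(k, r0 / k)  # shape k, scale r0/k, so mean r0
    return poisson(rng, rate)


def simulate(rng, r0=2.5, k=0.1, takeoff=50, max_generations=60):
    """Run one chain starting from a single case.

    Returns (took_off, generation): the generation at which the chain
    either reached `takeoff` simultaneous cases or stopped. A chain that
    neither dies out nor takes off within max_generations counts as
    not having taken off.
    """
    current = 1
    for generation in range(1, max_generations + 1):
        current = sum(offspring(rng, r0, k) for _ in range(current))
        if current == 0:
            return False, generation
        if current >= takeoff:
            return True, generation
    return False, max_generations


if __name__ == "__main__":
    rng = random.Random(1)
    runs = [simulate(rng) for _ in range(2000)]
    takeoffs = sorted(gen for ok, gen in runs if ok)
    print(f"chains that took off: {len(takeoffs)} of {len(runs)}")
    print(f"generations until take-off: min {takeoffs[0]}, "
          f"median {takeoffs[len(takeoffs) // 2]}, max {takeoffs[-1]}")
```

Under these assumed parameters most chains die out, and the ones that survive take a widely varying number of generations to reach the threshold. That spread is the point of the comment: a few slow or fizzling early generations can shift both the inferred start date and the apparent geographic center.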


Scott writes "I think scientists had called wet markets as an especially dangerous potential transmission location in advance.", and follows up with quotes warning against markets.

These seem like the result of a targeted search for such quotes, and therefore have no probabilistic weight as they provide no comparison to other spillover sources. We didn't claim no one warned against markets, just that they were relatively low priority.

In our post (https://blog.rootclaim.com/covid-origins-debate-response-to-scott-alexander/) we provide an unbiased search we conducted that clearly showed markets were just one of many high-risk interfaces (search "USAID"), and a relatively low priority one. Nevertheless, this was a quick search, and we are open to seeing it improved, but only using proper methodologies.

Cherry-picking is one of the most basic human reasoning biases, and probably the most common way people get convinced of wrong things.

Apr 9·edited Apr 9

Thank you for this! It's a great public service and example of truth-seeking, and your thoroughness and equanimity are remarkable.

I don't have anything to add except the single data point of how this has affected my own belief state. Prior to your initial post I was something like 60/40 in favor of zoonosis (both stories seemed to have some strong points in favor, I hadn't looked into it very deeply and expected to never know the truth, but zoonosis seemed like the more boring & mundane route and thus a marginally safer bet). After your debate summary post, I was with you on 90/10. After this post (and other discourse responding to the previous post), I'm even more strongly convinced that the lab leak evidence is rooted in typical pseudo-scientific patterns and would put myself at 95-99% in favor of zoonosis.


> (The one argument I know about, haven’t responded to, and it really is because I’m lazy and scared and bad is Michael Weissman’s Bayesian analysis here. It’s 25,000 words and uses a bunch of logits and calculus. Sorry, pass.)

As far as I can tell, the fancy math is basically trying to distract from similar tactics and arguments as Saar and other posters, such as claiming Worobey shows evidence of strong ascertainment bias, which you already responded to. One of the judges posted a brief response to it: https://ermsta.com/posts/20240301 although it mostly doesn't go into the details of the arguments; it's mostly about how to assign Bayes Factors to them.


>During SARS, the international health community criticized China for having wet markets where zoonotic spillovers could happen. China promised to clean them up, then mostly didn’t (for example, the raccoon-dog vendor at Wuhan was fined a few times, but kept operating). China’s first priority was to prevent people from accusing them of failing to clean up wet markets.

> >police from the Wuhan Public Security Bureau investigating the case interrogated Li, issued a formal written warning and censuring him for "publishing untrue statements about seven confirmed SARS cases at the Huanan Seafood Market."

>My impression is that China (realistically Wuhan City Government, I don’t think Xi would have been involved at this early stage) made a vague attempt to cover up the wet market early on - but that it wasn’t their Department Of Covering-Up’s finest work.

"People who believe lab leak over zoonosis are falling prey to Chinese propaganda" was not the take I expected to end up with from the debate, but it's so delicious that I'm incorporating it into all future discussions I have on the matter.


> If they secretly knew they’d just started the worst pandemic in modern history, wouldn’t they at least be wearing masks?

Even if this was a lab leak, and they knew it was a lab leak, that doesn't mean they would have thought it was any worse at the time than they would have if it were not a lab leak, or if they didn’t know it was one.

I don't think China was trying to specifically cover up a lab leak, I think they didn't really know what it was, just like us. I am just pointing out that I think this argument doesn't work well (even if it has mostly no impact on the debate).


The lab leak discussion seems to have become dominated by the gain-of-function scenario.

The possibility of “zoonotic collection”, i.e. that the virus jumped straight from bats into some scientific researcher who was poking around in caves, seems more probable for several reasons, most critically that it doesn’t require the invocation of a conspiracy theory. It also suggests possible lines of active inquiry that could increase/decrease support, which seem to be under-discussed.

It is very unlikely that any lab would be inserting furin sites without the knowledge of at least several people, and presumably also producing associated documentation (saved files, email chains). Likely, there would be people who are aware of the research who don’t feel any particular personal culpability (e.g. the insertion was carried out by a parallel team, was someone else’s project, etc). We then need a conspiracy of silence: all these people keep their mouths shut, all files are deleted or inaccessible, no-one whispers anything to a friend late at night after a few drinks, etc.

In contrast, zoonotic collection does not require any kind of conspiracy. It could be that there are genuinely zero people consciously aware of what happened. Field researchers are often young, and early on COVID was likely less virulent anyway. Someone could have returned to Wuhan with barely a mild cold, or totally asymptomatic. If their sampling activity also happened to not pick up any bits of COVID to sequence, there is no particular reason that they would assume they were the source of a global pandemic.

Zoonotic collection requires a researcher to return from the field sometime in 2019. Almost surely, the WIV will have some kind of documentation of such trips (e.g. via sample collection dates, expense reports, emails, etc). Even outside the WIV, the movement of researchers would have been documented by travel records (planes, trains, hotels, payment activity). And at a minimum, information at least suggestive regarding the frequency and locations of field research may be held by US partners. Even looking at the historical frequency of sample collection by the WIV, much of which is (presumably?) public data, could be useful.

If the WIV did not do any in-field collection through 2019, then zoonotic collection is surely very improbable. Conversely, the chances rise if it turns out they had someone in a cave in Yunnan every second weekend.


Also, to emphasize something I kind of only implicitly gestured at: The laboratory leak hypothesis does not actually exclude zoonosis! It could be zoonosis inside the lab, among the lab animals, OR it could be zoonosis brought into Wuhan through a specimen.

Zoonosis is not actually the hypothesis that COVID originated with "zoonosis", but rather the more narrow "Zoonosis without any kind of laboratory involvement in the initial exposure". Indeed, the "Zoonosis" hypothesis is barely about zoonosis at all, excepting insofar as it is evidence against a very specific hypothesis in the laboratory-leak-hypothesis-space - rather, it is more fundamentally the claim that the laboratory had nothing to do with the virus at all.

So the evidence "for zoonosis" should be discounted by some factor representing the odds that, if the disease had a natural origin, it might arrive at Wuhan via the laboratory studying that disease, insofar as that is relevant to the evidence in question.

If you did not consider this, you should update away from "zoonosis" to some extent.


> (The one argument I know about, haven’t responded to, and it really is because I’m lazy and scared and bad is Michael Weissman’s Bayesian analysis here. It’s 25,000 words and uses a bunch of logits and calculus. Sorry, pass.)

A quick skim suggests this is just a different way of framing the update strengths, and it should still be easy to point to "this likelihood ratio disagrees with these other likelihood ratios" in a way that quickly points to the disagreement. You might be able to get Weissman to format his numbers in basically the same table as you have for the judges and commentators. (And then the disagreement is over something like "no, the prior shouldn't be 70:1" or "no, we can't get a likelihood ratio of 4:1 from the lack of intermediate hosts because of suppressed evidence", or so on.)
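The table comparison suggested here is mechanical once each analyst's numbers are written as a prior and a list of likelihood ratios. The sketch below is a hedged illustration: the function names are made up for this example, the 70 and 4 echo the figures quoted in the comment purely as placeholders (with no claim about which hypothesis they favor), and every other number is invented. Working in log-odds (logits) turns the product of ratios into a sum, which is also what makes it easy to spot the single factor where two tables differ most.

```python
import math


def posterior_odds(prior_odds, likelihood_ratios):
    """Multiply prior odds by each likelihood ratio, done in log-odds."""
    log_odds = math.log(prior_odds)
    for ratio in likelihood_ratios.values():
        log_odds += math.log(ratio)
    return math.exp(log_odds)


def odds_to_probability(odds):
    """Convert odds of A:B into the probability of A."""
    return odds / (1.0 + odds)


def largest_disagreement(table_a, table_b):
    """Return the evidence item whose log likelihood ratios differ most
    between two analysts (only items present in both tables)."""
    shared = table_a.keys() & table_b.keys()
    return max(shared, key=lambda item: abs(math.log(table_a[item])
                                            - math.log(table_b[item])))


# Placeholder tables: all values are illustrative, not anyone's real numbers.
analyst_a = {"intermediate hosts": 4.0, "other evidence": 0.5}
analyst_b = {"intermediate hosts": 1.0, "other evidence": 0.4}

if __name__ == "__main__":
    odds = posterior_odds(70.0, analyst_a)
    print(f"posterior odds: {odds:.1f}:1")
    print(f"as probability: {odds_to_probability(odds):.4f}")
    print("biggest disagreement:", largest_disagreement(analyst_a, analyst_b))
```

Laid out this way, a dispute like "we can't get 4:1 from the lack of intermediate hosts" shows up as one row in the table rather than being buried in 25,000 words.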


Scott writes:

>No BANAL-52 relative close enough to create COVID from has ever been discovered.

>By mentioning BANAL-52, I was trying to be maximally charitable to the lab leak side. In order to create COVID, they would need a virus very close to COVID. But in years and years of searching, nobody has ever discovered a virus like this. Therefore it must be rare. As a way of bounding how rare, let’s see how rare the closest virus ever discovered is. That’s BANAL-52. It is very rare. Therefore, the COVID ancestor must be rarer than that.

Response:

This is obvious hindsight bias. We know 5 viruses that are all a few % from each other, with SARS2 being one of them and the others being 3 BANALs and RaTG13. SARS2 is singled out here only because that's the one that in hindsight started the pandemic, but it could have been any of the endless viruses sampled from this space.

Another way to understand this bias is that whatever restriction you choose to apply to the virus on the lab-leak side, you need to apply to zoonosis as well (otherwise you're calculating conditional probabilities of different evidence for each side). Meaning, you need to look only at hypothetical zoonotic pandemics that come from a relative of SARS2, rather than any of the endless viruses that could start a pandemic if they somehow attained this 12nt FCS. And unless you can show that this specific sequence is more likely for one of the hypotheses, this redundant restriction cancels out and has no effect.

To be fair, Scott realizes he may have messed this up and writes:

>I don’t know how strong this argument is, because maybe there are millions of rare viruses capable of becoming pandemics, such that getting any one of them is very easy, even though each one individually is rare. The version of this I find convincing is that it should be a probabilistic cost to say that WIV did gain-of-function on a seemingly undiscovered and so-far-very-hard-to-discover rare virus instead of on any of the usual SARS-like viruses that people do their gain-of-function research on.

This still has the hindsight bias mistake above of thinking we need a "so-far-very-hard-to-discover" virus.

As to "usual SARS-like viruses" - This is a whole different argument of the form "an engineer won't do that". You never confidently know what the engineer is trying to achieve and what makes sense for them.

But in this specific case we don't need that, as we actually know from DEFUSE they are interested in a wide range of viruses beyond close relatives of SARS1, and even beyond SARS-like viruses. (also see this: https://twitter.com/ydeigin/status/1778239459818881535)

Bottom line: This has negligible weight. It is quite likely WIV would have a virus that is good enough to start a pandemic (after adding an FCS and potentially passaging).


I think the photo of the WIV team out to dinner is interesting. It doesn't look like any of them are high-risk (i.e. >60, or even close to it). As such, it would make sense for an early spread of detectable disease to come from somewhere near WIV (like a wet market) even if it didn't come from WIV itself. Thus, you might see spread from nearby and erroneously think, "no, it didn't come from WIV, it came from this place next to WIV." Now, a rejoinder might be that "if someone from the lab were sick, wouldn't they spread it to multiple other people?" Now you're assigning roles to specific people at the lab, to catch and spread the virus according to a model that doesn't necessarily match the real world.

As to whether they would all have worried looks on their faces for having secret knowledge that they "just started the worst pandemic in modern history"... This take is full of projections of future knowledge onto individuals in the past. Did this group of researchers, in mid-Jan 2020, know that:

1.) A recently-identified virus of concern had leaked from their lab,

2.) This virus would become a worldwide pandemic,

3.) It would be the 'worst in modern history', (Not sure I agree with this? I guess it depends on how you define 'modern history', how you define 'pandemic', and how you define 'worst'.)

4.) They needed to wear masks to avoid catching it.

If anything, the invocation of their non-mask wearing shows that this argument proves too much. Shouldn't ANY team of virologists in Jan 2020 have known to mask up? If they didn't, it's probably not a good idea to invoke this photo for this kind of 'evidence'.

I know the whole photo thing was mostly tongue-in-cheek, but it's indicative of how Scott, as a new zoonosis advocate, has potentially adopted a filter for evidence.


> “The first known case predates the market outbreak by a month” - this is not the consensus position. I cannot say for sure what Dr. Chou means by this, but I suspect he’s referring to one of the many claims to this effect that Peter effectively debunked during the debate (Connor Reed, Mr. Chen, the 92 cases, Brazil, etc).

The first known Chinese case was traced back to November 2019 (https://archive.is/U6Swq). French blood samples and lung scans also demonstrated cases in November 2019. By December (when the wet market outbreak began) there are indicators that it was in Norway, the US, and Italy (based on antibodies found in blood samples and RNA found in wastewater/cadavers).

I don't really understand the argument that it couldn't possibly be in Europe that early because it would be more widespread: we weren't testing for it, and COVID follows a seasonal pattern (like all human coronaviruses). September and October in Italy and France aren't peak seasonal conditions, so COVID would likely have spread much more slowly, especially as it hadn't blanketed the country yet. IIRC the Italian samples from October were even sequenced.

> “Genetic analyses put the realistic start date around Sept/Oct” - see the section on Brazil above for the many reasons this is impossible. Pekar, the most-cited genetic analysis, puts the origin in November. Dr. Chou doesn’t cite his sources, so I don’t know what he’s referring to, but it certainly hasn’t entered the knowledge of the reality-based community.

I'm not sure that I'd personally throw around the phrase "reality-based community" when describing ignorance of peer-reviewed studies, some of which have been around for quite a while now... but you do you!

This new study estimates an origin between August and early October 2019: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0301195

This study estimates an origin no later than October 2019: https://academic.oup.com/ve/article/10/1/veae020/7619252?login=false

This 2022 study put it at around mid-Sept 2019 to early October 2019: https://academic.oup.com/bioinformatics/article/38/10/2719/6553661?login=false (though the authors later wrote that they suspect it might have originated earlier than that)

A 2020 study estimated that it jumped to humans in Oct/Nov 2019: https://twitter.com/BallouxFrancois/status/1257787789816537088

The market cases were all lineage B (apart from a single swab which appears to be contamination as it was from a PPE glove). Lineage B was relatively late in the evolution of early COVID (can see here: https://academic.oup.com/view-large/figure/445990907/veae020f2.tif). That alone suggests an earlier origin than is claimed by the wet market theorists.

> “No raccoon-dogs anywhere on the planet have tested positive, beyond those being forcibly infected to do experiments”. False, this paper discusses an outbreak of COVID among raccoon-dogs on a farm in Poland.

Nope, that is a common error people make. The cases in Poland were at a "mink and raccoon dog farm", and the cases were in minks (2 out of 20 tested positive). The mink samples were subsequently uploaded to GISAID - none from raccoon dogs. Following the trail of study citations reveals this.

> “They aren’t capable of catching or spreading COVID”. False, here’s a paper on the subject which says that “Raccoon dogs are susceptible to and efficiently transmit SARS-CoV2”.

...it's a study talking about them being forcibly infected, as I said in the previous quote

> “The clustering around the wet market in Wuhan . . . was just a product of oversmoothing”. Here is a map of December 2020 COVID cases. I recommend ignoring the contour lines and just looking at the dots. How could dots be oversmoothed?:

There is a Twitter thread explaining it step by step here: https://twitter.com/danwalker9999/status/1747673884336312613

Multiple other scientists have written about it too:

https://academic.oup.com/jrsssa/advance-article-abstract/doi/10.1093/jrsssa/qnae021/7632556?login=false

https://academic.oup.com/jrsssa/advance-article-abstract/doi/10.1093/jrsssa/qnad139/7557954?login=false

https://zenodo.org/records/7016143

The broader problem is sampling bias, something both the Chinese CDC and WHO agree makes the early cases unsuitable for claiming the wet market to be the point of origin. The director of the Chinese CDC said that they put too much of a focus around the market and may have missed it coming from the other side of Wuhan.


Scott writes:

>I think others are using it to prove WIV had “secret viruses” in their catalogue, but the rice virus wasn’t secret, it was HKU4, which is common and which WIV has already published papers about.

Response:

This is incorrect. At least one of the sequences detected in the study is still unpublished by original Wuhan researchers to this day. It is 98.38% identical to the closest known full HKU4 genome (BtTp-GX2012). Moreover, that sequence was inside a BAC reverse genetics system which is also unpublished (i.e. the Wuhan experiments that used it remain unpublished).

So, we have here an example of an unpublished virus that is about as close to a known virus as BANALs are to SARS2, directly contradicting the claim that it's unlikely WIV could have the SARS2 backbone unpublished.

Apr 9·edited Apr 10

More on ascertainment bias. I promise you aren't the first to notice Worobey and Pekar tried to pre-empt criticisms ;)

>"Before going further, I recommend reading page 8 of the supplementary text of Worobey’s paper, titled “Robustness Of Statistical Test Results To Ascertainment Bias”"

Their robustness test was fallacious, as I show in my published critique (Bahry, 2023): "Although Worobey et al. purport to test for “robustness” of their results to sampling bias, their tests fail (30). For instance, they in effect address false positives near HSM, by dropping cases nearest to HSM from the data. But the issue was false negatives: cases missed due to *not* being near HSM. This is as fallacious as surveying New Yorkers; dropping the 68% most central ones from the data; and concluding from the remaining 32% of New Yorkers that most of humanity lives near to and centered on Central Park."

>"the market connection was discovered December 30 and added to diagnostic criteria January 3"

Covid was discovered on Dec 29, *because of* the Huanan market cluster. The market was part of the search from the very start, including the initial search for earlier-December cases (The nCoV Outbreak Joint Epidemiology Investigation Team & Li, 2020): "On December 29, 2019, a hospital in Wuhan admitted four individuals with pneumonia and recognized that all four had worked in the Huanan Seafood Wholesale Market, which sells live poultry, aquatic products, and several kinds of wild animals to the public. The hospital reported this occurrence to the local center for disease control (CDC), which lead Wuhan CDC staff to initiate a field investigation with a retrospective search for pneumonia patients potentially linked to the market. The investigators found additional patients linked to the market, and on December 30, health authorities from Hubei Province reported this cluster to China CDC. ..."

>"I looked for the direct source of the Gao quote and couldn’t find it"

He told a BBC podcast host (https://www.bbc.co.uk/sounds/play/m001ng7c, context + Gao's answer at 24:00–25:12).

There is no question: the early search is riddled with clear, known, overwhelming ascertainment bias. The Chinese officials who did the search have been clear on this (Bahry, 2023, Table 1). Zoo crew (a coauthor network of Western scientists including Worobey, Pekar, and the "Proximal origin" authors among others) have tried hard to downplay it, but imo that's because they *want* to: they're acting like lawyers, not like scientists.

References

Bahry, D. (2023). Rational discourse on virology and pandemics. mBio 14: e00313-23. https://doi.org/10.1128/mbio.00313-23

The nCoV Outbreak Joint Epidemiology Investigation Team & Li, Q. (2020). An outbreak of NCIP (2019-nCoV) infection in China - Wuhan, Hubei Province, 2019-2020. China CDC Wkly 2: 79-80. http://www.ncbi.nlm.nih.gov/pmc/articles/pmc8393104/


The argument in 3.1 (Apology to Peter re: extreme odds) is a red herring. Who cares whether or not Saar was *also* quoting insane numbers? No-one should end up at 10^-20 odds in favor of any theory when high-quality evidence is as hard to come by as it is in this case. Whether or not Saar was making huge modeling errors, it's very clear that Peter *was* making such errors.


"Even if you spend hours and hours talking to the scientists involved and trying to figure out the flaws, it doesn’t matter, because there will be a new set of papers like that a few weeks later."

There's a term for the general phenomenon you're describing in this section: https://en.wikipedia.org/wiki/Brandolini's_law

Apr 9·edited Apr 9

I don't know if this is a point that comes up a lot in the chemical weapons debate, but something that comes to mind when people challenge the mainstream narrative:

If this was the rebels doing the attack, they got incredible bang for their buck. The Western coalition launched coordinated airstrikes that blew up a research center and two military bases as a reprisal. If the rebels were the ones responsible... why not keep going? Goad NATO into blowing up more government bases? The fact that chemical weapon attacks dropped after the airstrikes suggests that government forces were in fact responsible, and that the airstrikes did their job of intimidating them.


These two posts changed my mind, which I find impressive.

Two notes: can we please agree to stop irresponsible wet markets AND gain-of-function/virus-hunting labs?

And: I find it annoying Peter called the lab leak claim a conspiracy theory. It is an obvious potential cause to investigate, even if it turns out to be wrong.


>How come there wasn’t a second obvious cluster radiating out from a coffee shop, lots of coffee-shop-linked cases, etc?

Because they would not be detected. Detection requires multiple severe pneumonia cases clearly linked to the same location. A coffee shop can't generate that.


At a meta level, I find the entire discussion fascinating as science, as epistemology, and as human behavior. At a practical level, most of the issues are beyond my competence, but three items crossed my threshold: 1, the notion that there are fewer than 2,000 sarbecoviruses is absurd. That refers to the number officially described and reported. The actual number is orders of magnitude larger, probably many orders of magnitude larger. 2, I find Peter's point about COVID using a PRRAR linkage for the furin cleavage site compelling. In what alternate universe could a Chinese virology lab not expert in this kind of work design and use a novel linker, especially when the evidence then available suggested that it wouldn't work? 3, the rapid doubling time repeatedly mentioned by Scott might be misleading; do we actually know how fast the doubling time was for the original strain?


I'm team zoonosis on balance, partly because of the big picture: pandemics predate institutes of virology by a long way. My prior is that nature is perfectly capable of creating coronavirus pandemics and we should expect it to do so regularly.

But I do notice that at some point Scott gets cross with Saar! This is a savage burn:

> I’m sympathetic to this way of thinking - my beliefs also intuitively feel so obvious that nobody could possibly disagree. But I eventually learned real life didn’t work this way; I think Rootclaim would benefit from a similar lesson.

I'm not 100% certain Saar's arguments are getting generous treatment by the end!


Methods matter. Let's look at Worobey. One interpretation is that if the data are a good sample of all the cases, with little bias either in the initial gathering or the later filtering, and if you make an assumption about how rare it is for there to be an initial spread event at a market, you get some enormous likelihood factor. It's so big it decides your final odds.

If.

Let's say you think those are reasonable possibilities but then so are the alternatives. You might get a likelihood factor, but there's no way it can be big, because there's a pretty good chance the assumptions of the argument fall apart.

Despite his silly boasting, Saar gets this one big thing right. No factor gets to be extreme if there are reasonable paths around the premises of its model. That applies equally to both sides. (It pretty much wiped out my big CGGCGG factor that had strongly favored lab.) It's basic to hierarchical Bayes, not a special Rootclaim trick. It's a little shocking that people writing extensively on this, spending many dozens of hours, didn't spend an hour learning the basic logic.
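To make the "no factor gets to be extreme" point concrete, here is a minimal sketch of how uncertainty about a model's premises caps the likelihood factor it can deliver. The function name, the parameter names, and every number are illustrative assumptions, not values from any of the papers discussed. It also assumes the premises are equally likely to hold under either hypothesis, and that the evidence is equally probable under both hypotheses when the premises fail.

```python
def effective_bayes_factor(model_factor, p_valid, p_e_valid=1.0, p_e_invalid=1.0):
    """Likelihood ratio for H1 vs H2 after marginalizing over model validity.

    model_factor : ratio P(E|H1)/P(E|H2) if the model's premises hold
    p_valid      : probability that the premises hold (assumed equal under H1 and H2)
    p_e_valid    : P(E|H1) when the premises hold (hypothetical scale)
    p_e_invalid  : P(E|H1) = P(E|H2) when the premises fail (hypothetical scale)
    """
    # P(E|H1): mix the "premises hold" and "premises fail" branches.
    numerator = p_valid * p_e_valid + (1.0 - p_valid) * p_e_invalid
    # P(E|H2): E is model_factor times less likely only in the valid branch.
    denominator = p_valid * p_e_valid / model_factor + (1.0 - p_valid) * p_e_invalid
    return numerator / denominator


# A nominal factor of a million, with 90% confidence in the premises.
print(effective_bayes_factor(1e6, 0.9))
# The same nominal factor if the premises were certain.
print(effective_bayes_factor(1e6, 1.0))
```

With the default scales, the effective factor can never exceed 1 / (1 - p_valid), however large the nominal factor is. At 90% confidence in the premises that ceiling is about 10.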


To fulfill your request for a link on George Gao affirming possibility of ascertainment bias around HSM, here it is:

https://www.bbc.co.uk/sounds/play/m001ng7c at 24:40


Wrt the 'Lv' name comment: it could be a rendering of Lü (v often stands in for 'ü' in pinyin spellings).

It seemed like a strange comment that appeared to bash Chinese names (Xi? a very common Chinese name as well) and their atypicality when rendered in English, which is out of keeping with the intellectual nature of this publication.

Otherwise a very thought-provoking post.


These sorts of questions make me think "rationalists" focus too much on objective reality. To me, the [acting hypothesis] or, the model of the world that will produce the highest utility yield when acted upon, seems a lot easier here than the objective reality.

So, what I'm getting at is, imagine a China that regularly leaks viruses from its labs. It also has a genetic warfare branch and is developing bioweapons and intends to use them/regularly does, often for weird sideways goals that are hard to infer from the results. 20% of worldwide pandemics are caused by a mixture of Chinese indifference/recklessness/malice.

In this world, when a natural pandemic arises, China does not open up its data and provide proof that the pandemic is natural. It continues to maximally hide data, because not doing so would highlight all of the unnatural viruses. The entire purpose of clamping down on research is to ensure that natural and unnatural pandemics look exactly the same. So, this China doesn't care that the information it's hiding exonerates it, at least in this particular instance. Rather, it knowingly hides exonerating evidence in the expectation that people will then conclude that hidden evidence is likely exonerating.

In some scenarios, small amounts of hidden evidence are semi-intentionally leaked, that is, the government is somewhat more careless with information that exonerates it than information that doesn't. A random sample of leaked data now implies that the hidden information is overwhelmingly exonerating.

Now, in all of this, assume covid happens, and it's natural, and in fact, part of an 80% category of natural events. Are the people who say it's natural really reasoning correctly? Are the people who say it's 80% likely to have been natural even the correct ones here? To me, the most rational group is the one that concludes that China [et al] is up to no good, and it doesn't particularly matter if the goat is behind this particular cover up.


My main initial reaction after reading it. The debate moved me slightly away from lableak as I found the anti-lableak arguments fairly convincing.

That said:

1) The repeated "The COVID pandemic doubles every 3.5 days. So if the first infection was much earlier - let’s say November 11 - we would expect 256x as much COVID as we actually saw. " arguments fall a little flat when the number of infected people is small, because the spread rate varies greatly with the behavior of individuals. If 100 people are infected, the average spread rate works as a fine proxy. But if it is 5, the particular behaviors of the particular people matter a lot and affect the spread rate a lot. Some people are loner shut-ins. Some people go out of town. Etc.

2) There doesn't seem to be nearly enough connection with/acceptance of the fact that a lot of the information out there might simply be lies. The medical/scientific establishment and the Chinese and US governments are not entities which are above faking information/data for political/funding reasons. The anti-lableak analysis seems to take most of the research at face value, which seems naive.


I'm terribly sorry, but does anyone know if any of these papers being referenced by either side share raw data or code?

I ask because, as a rule of thumb, I tend to trust papers that share their underlying R, Python, or occasionally Stata code. I may not be an expert in their particular field but if their data is clean and their code is readable and published, I can feel pretty darn confident that it's accurate and will replicate.

I am not finding any code, and few trustworthy papers, in any of these links.

https://elifesciences.org/articles/16777 has multiple datasets but they're all like this (https://elifesciences.org/download/aHR0cHM6Ly9jZG4uZWxpZmVzY2llbmNlcy5vcmcvYXJ0aWNsZXMvMTY3NzcvZWxpZmUtMTY3NzctZmlnMS1kYXRhMS12Mi5jc3Y-/elife-16777-fig1-data1-v2.csv), incredibly minimal, at least I think they are.

This article quotes 47k individual samples but...am I missing a link here? (https://www.nature.com/articles/s41598-021-91470)

This one I think is giving me codes to go look it up in some GenBank database (https://academic.oup.com/ve/article/8/1/veac046/6601809)

I got down to these two (https://academic.oup.com/jrsssa/advance-article-abstract/doi/10.1093/jrsssa/qnad139/7557954?login=false) and (https://arxiv.org/abs/2403.05859) and...I'm terribly sorry but for two papers arguing about statistics and centrality I would certainly expect some data and some code. At this point I'm getting frustrated; I'm terribly sorry to the people involved, I'm sure a lot of work was involved, but I don't want to read 27 pages of words, I want to read 7 lines of R code, maybe 100 rows of data, and a paragraph explaining why you chose this algorithm as a measure of centrality. I know you have the data and code, I can see page 4 of the Débarre and Worobey paper and I know how those graphs are generated and there can't possibly be privacy concerns, you just showed us a map.

Am I missing something? Does someone have a link? The Rootclaim site doesn't seem to have a clean list of academic sources and I'm not sure where to find this from Peter. I'm finding the complete lack of raw data and code on both sides rather concerning.


> But hw failed to confront

Typo


I don't usually like to yell racism, but I think the lab leak hypothesis doesn't really present a particularly compelling argument linking the lab to the wet market.

"Maybe someone wanted some civet" is an argument that only hangs together if you think all Chinese eat lots of weird game on the regular. Lots of people don't eat that stuff at all, and when they do, it tends to be special occasion or if they're tourists. These meats are way more expensive than normal poultry or meats!

Consider demographics. Who, in the WIV, would be likely to spend lots of time actually in the lab? A lab tech or a post-grad - lowest in the hierarchy, doing the grunt work, being exposed to viruses.

Neither of these positions is something you can just luck into - you need tertiary education and training. This implies class - probably at least middle class. Buying a live animal to cook at home is just not really an urban middle class thing!

Also, this is a class of job (skilled, specialised) that often recruits talent from all over the country - a lot of them probably don't live with parents, most probably live in sharehouses with people of the same class (uni students, teachers, clerks). It's not massively likely that these people live with a guy that sells civets at the wet market, or a guy that makes it a habit to bring weird animals home.

They have very different habits - probably a lot of takeaway, or convenience store fare, and dining out (all are cheap options over there). Lots of pre-packaged supermarket stuff, I would imagine - dumplings you can boil, air-fryer or steamed dishes. Much of it delivered probably, because of time demands of the job - and also delivery services in large Chinese cities like Wuhan are really very cheap (high density makes it a lot more efficient than over in the west).

My argument is that if a WIV lab tech or researcher was patient zero, we wouldn't have seen initial cases centred on the wet market - they would have been centred on a flat where 5 low paid interns/PhD students/grad students share a bathroom, and probably the train platform or busport. In the event someone did actually want game, they'd go to a restaurant, not the market directly, and it's really really weird if they somehow spread it to no one else (coworkers, housemates, other passengers on shared transit) on the way to the restaurant!

The equivalent in a western context is claiming that patient zero is actually a white corporate banker, despite all the cases actually being found at a methadone clinic in a mostly black neighbourhood. It just doesn't make sense to me. Yes you can probably construct a chain but it doesn't seem likely that everyone in that chain is going to be asymptomatic. I feel like you can only make this claim if you have zero understanding of how different Chinese demographics actually live.


I see Saar asserting multiple times that p = 1/10,000 can't arise except in physics experiments. For example, in his latest blog post he says, "More generally, such extreme numbers are not possible outside very controlled environments where all confounders can be reliably eliminated [...]" (https://blog.rootclaim.com/covid-origins-debate-response-to-scott-alexander/).

However, such extreme probabilities appear quite regularly in a lot of real life situations. The probability that a given commercial airline flight will crash is around one in a million. The probability of being killed by lightning in the US is 20/330M per year (https://www.weather.gov/safety/lightning-victims). Saar has a weird bias to think that only probabilities like 1-100% are reasonable.


I did rather enjoy these two posts, even if the math and science are largely over my head...Matt Yglesias recently did a post complaining how there's been no Official Reckoning on covid, the whole thing's just a polarized epistemic shambles. So small-h heroic efforts like this on Thorny Questions are appreciated. I got a lot out of Zvi's covid posts, but the nature of such iterative roundups makes it harder to reference any one specific claim on a specific topic (and of course he doesn't write much on the topic since declaring Mission Accomplished, so I've not heard more current claims until now). A 2024 summary on one of the central questions is thus extra useful. Updated to maybe 60-40 or so now.

Sadly, it's also been reminding me of that other larger-h Heroic Effort you did on a different controversial covid topic, and the attendant months (years? was it years?) of professional critic Alexandros Marinos and company looming over sundry unrelated posts. Bayes really does seem to take the biggest L in the LL v. Z debate; there's more to rationalism than just Bayes, but even if Saar/Weissman/whomever are correct that zoonosis is really a one-legged stool resting on a single load-bearing prior...it just feels like Obvious Nonsense? Don't let good process excuse bad results...Shut Up and Multiply, But Not Blindly. I really do want to believe that hard questions can be better-resolved through some careful application of math, guesstimating, intuition, and reference class definition! This was not exactly a ringing steelman endorsement of such process, though. Maybe it's just too hard when so much of the Simulacra Level 0 information is impossible to trust *or* verify.


Scott writes:

>30,000 people donated blood in autumn 2019, and the hospitals still had most of it. So they tested the blood samples for COVID antibodies and didn’t find any... There are 12 million people in Wuhan, so if even a few hundred people had COVID during that time, one of them should have turned up. None of them did.

Response:

This is missing two important factors:

1. We need to give 1-2 weeks for antibodies to develop.

2. People are not allowed (and don't want) to donate blood until feeling well.

That means this whole sample is delayed by around 3 weeks.

So let's see what zero positive blood samples tell us:

1. We have 44,000 samples 1-Sep to 31-Dec.

2. Since infections more than double every week, almost all the positives will fall on the last week. That's 44000/13=3400 samples, or 1 in 3000 Wuhan residents.

3. So to have one positive sample, we need ~3000 infected in early December.

4. That's 11.5 doublings. At 3.5 days it's 5.7 weeks, bringing patient zero to late October or later. (Doubling is probably slower at that time, so it's even earlier, but never mind).

That perfectly matches the evidence under lab leak (Reed, Chen, the lineages in the market and more), so the blood samples have no weight as evidence on origins.
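The arithmetic in the four steps above can be reproduced with a short script. This is only a sketch of the comment's own back-of-the-envelope numbers: the 13-week divisor, the rounded figure of 3000 infected, and the 3.5-day doubling time are taken from the comment as written, the 12 million population comes from the quoted passage, and the variable names are made up for illustration.

```python
import math

SAMPLES = 44_000            # blood samples, 1-Sep to 31-Dec (from the comment)
WEEKS_IN_WINDOW = 13        # divisor used in the comment as written
WUHAN_POPULATION = 12_000_000  # from the quoted passage
DOUBLING_DAYS = 3.5         # doubling time used throughout the debate
INFECTED_NEEDED = 3000      # the comment's rounded "1 in 3000" figure

# Step 2: samples that fall in the final week of the window.
samples_last_week = SAMPLES / WEEKS_IN_WINDOW

# Residents represented by each last-week sample (the comment rounds this to ~3000).
residents_per_sample = WUHAN_POPULATION / samples_last_week

# Step 4: doublings needed to grow from one case to INFECTED_NEEDED cases.
doublings = math.log2(INFECTED_NEEDED)
weeks_of_growth = doublings * DOUBLING_DAYS / 7

print(f"samples in last week:  {samples_last_week:.0f}")
print(f"residents per sample:  {residents_per_sample:.0f}")
print(f"doublings needed:      {doublings:.1f}")
print(f"weeks of growth:       {weeks_of_growth:.1f}")
```

Changing `DOUBLING_DAYS` or `INFECTED_NEEDED` shows how sensitive the implied start date is to those two inputs.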


Great posts, and I learned a lot, e.g. that the furin cleavage site in the gain-of-function grant application may not qualify as one of the weird random coincidences after all: scientists may just have been good at assessing what makes for pandemic potential. However, I don't understand why much more attention is not generally paid to the fact that the Wuhan CDC moved *right next to the wet market right before the outbreak*. Perhaps this is Rootclaim's fault, stubbornly pushing an inferior lab-leak theory? There was a comment by Jacob on the original post but it did not make the highlights. To quote from it, "[CDC] employees were fanning out all over China going into bat caves".

Let me put it this way: having learned of the Wuhan CDC move, I would now not be satisfied completely even if I came to the conclusion for myself, after doing my own research, that the wet market's supposed role as the epicentre of the outbreak could be explained away by ascertainment bias. For then there would still be the weird double coincidence (cf. "one of God's biggest and funniest jokes" in the original post) that 1) pandemic begins in the city where the WIV is based, "biggest coronavirus lab in the Eastern Hemisphere" according to the original post, and 2) exact timing of the CDC move, near the exact location where people think the outbreak started.

But it seems like if certain connections could be established both between the WIV and the Wuhan CDC and between the Wuhan CDC and the wet market, then the weird double coincidences might go away?


Scott writes:

>There was a Lineage A sample in the market, lab leak proponents just try to ignore/dismiss/conspiracize it away

Response:

This was written in response to the fact that all market cases are lineage B. But the lineage A sample is a single environmental sample, not a case. It is additionally from 1-Jan, when SARS2 was already all over Wuhan, so it's not surprising one got to the market. 

And indeed, it had 2-3 mutations meaning it is from a late infection, so it tells us nothing about the early outbreak.


Regarding Saar and others repeatedly calling the wet market a superspreader event and Peter rebutting them, I think there's an important fact here that somehow fell through the cracks: if you google "covid dispersion parameter" you'll see dozens upon dozens of papers that assert that Covid had it unusually low, for example: https://www.pnas.org/doi/full/10.1073/pnas.2016623118

> Recent estimates suggest that the dispersion parameter k for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is on the order of 0.1, which corresponds to about 10% of cases being the source of 80% of infections.

As far as I understand it, this means that all arguments about the initial spread (not) doubling every 3.5 days are deeply suspect. It might be entirely possible for Covid to linger in a handful of people for a month or two without there appearing 256 times more cases than observed afterwards. It could evade that 10% chance of hitting a proper spreader for a long time, with the curve becoming observably exponential only with several tens of people infected.

I wouldn't say that this property of Covid automatically puts that kind of arguments to rest, but it definitely should be addressed and analyzed.
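One way to see what a dispersion parameter of 0.1 implies for the earliest cases is a branching-process sketch with a negative binomial offspring distribution. The value k = 0.1 comes from the quoted paper, but the reproduction number R0 = 2.5 is an assumption chosen purely for illustration, and the function names are hypothetical. The sketch computes the chance that a single case infects nobody, and the chance that a chain started by one case dies out on its own.

```python
def offspring_zero_prob(r0: float, k: float) -> float:
    """P(a case infects nobody) for a negative binomial with mean r0, dispersion k."""
    return (1.0 + r0 / k) ** (-k)


def extinction_prob(r0: float, k: float, iterations: int = 500) -> float:
    """Probability that a chain started by one case eventually dies out.

    Iterates q <- G(q) from q = 0, where G(s) = (1 + (r0/k) * (1 - s)) ** (-k)
    is the offspring probability generating function. The iteration converges
    to the smallest fixed point, which is the extinction probability.
    """
    q = 0.0
    for _ in range(iterations):
        q = (1.0 + (r0 / k) * (1.0 - q)) ** (-k)
    return q


R0 = 2.5  # assumed for illustration, not taken from the comment
K = 0.1   # dispersion parameter from the quoted paper

p_zero = offspring_zero_prob(R0, K)
p_extinct = extinction_prob(R0, K)
print(f"P(one case infects nobody):     {p_zero:.2f}")
print(f"P(chain from one case dies out): {p_extinct:.2f}")
```

Under these assumed values roughly 0.72 of cases infect no one, and a chain started by a single case dies out about 0.86 of the time. A smooth 3.5-day doubling is therefore a poor description of the first handful of infections, although this says nothing by itself about how long a surviving chain could stay small.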


Scott writes about our Syria chemical attack analysis:

>So when Saar says that his method has a great track record, what he means is that when he looks into it further, he becomes even more convinced of his previous position. He doesn’t mean that any kind of external consensus has shifted towards his results over time.

Response:

Yes, at some point one needs to draw a line where further debate becomes ridiculous. For me it is when the rocket trajectories from 7 impact sites all intersect at a small field within opposition-controlled territory, where a video was taken on the night of the attack showing Islamist opposition fighters wearing gas masks launching the same rocket type that was found at said impact sites.

I understand others draw the line somewhere else and that's fine.

https://twitter.com/Rootclaim/status/1405891184443199488


Scott writes:

>During my email discussions with Saar, he kept insisting his position was obviously right. He would send me emails like (not exact quotes) ‘Now that I’ve demolished all the evidence for zoonotic transmission, you have to agree with me, right?’ or ‘You must secretly agree I’m right, it’s just be hard for you to admit.’

Response:

That is not how I converse with people and I'm not sure what I've done to cause Scott to stoop to this level and misrepresent a friendly professional exchange in this way.

During our exchange, I corrected numerous factual mistakes and demonstrated why the probabilistic inference methods that Scott proposed do not work in practice. I understood he conceded those points, as they were removed in later drafts. Nevertheless, his confidence in the conclusion was unchanged, which seemed to me like confirmation bias and possibly a preference for a good story over rational thinking. I very politely pointed this out. That's all.

I believe the fact that Scott still hasn't corrected the obvious mistake in estimating p(HSM|LL) (again, his current position is that an HSM worker is no more likely to be an early case than a random Wuhan resident!), points to a serious problem in rational thinking, regardless of what the origin of Covid truly is.

I hope people he trust better than me can help him understand this. I failed.

https://arguablywrong.home.blog/2024/04/09/how-likely-is-it-for-covid-to-establish-itself/

Critique of your and Peter's epidemic modeling. I do find it a bit weird that you profess to put a lot of faith in the published science but ignore it when it's inconsistent with zoonosis. E.g., Bloom and Kumar have an October start date as plausible in peer-reviewed studies, but you dismiss them on the basis of a back-of-the-envelope model.

The brilliant nod who will probably get the Pekar paper retracted also critiques some of your core assumptions here: https://twitter.com/nizzaneela/status/1777989261817508165

OK, here's the non-mathy passage summarizing Worobey/Pekar. Extensive details and many references are in the arguments leading up to it or the Appendix. ZW means zoonotic by wildlife, ZWM means just at market.

Summary on the market sub-hypothesis, ZWM

Evaluation of the general ZW hypothesis has usually focussed on its ZWM sub-hypothesis. As discussed in Appendix 1, by far the largest Bayes factors in those analyses that conclude that ZW is more probable than LL come from specific arguments about the HSM. It’s worth roughly summarizing the likelihood factors specific to ZWM as opposed to more generic ZW. The first is that the Wuhan markets had a much smaller share of the wildlife trade than one would expect from the population, probably at least a factor of ten less. The second is that there was a fairly early superspreader event at HSM. Given that market superspreader events are likely for market spillovers but moderately unlikely (we’ve seen several at other cities) for infections with other sources, this would give roughly a factor of ten favoring ZWM. The absence of any market-linked cases of the more ancestral lineage is likely if the spillover occurred elsewhere and unlikely if the spillover occurred at a market, so that would give a factor disfavoring ZWM. The clustering of non-linked cases near the market looks like a result of ascertainment bias and thus gives negligible evidence either way. The lack of correlation between potential animal host DNA and SC2 RNA in market samples, in contrast to actual animal coronaviruses, also is to be expected if the animals were irrelevant but somewhat surprising if one of them was the prior host. Other than the deeply flawed Pekar et al. 2022 paper, multiple reconstructions of the phylogeny indicate that the spillover occurred more than a month before cases started being identified at HSM. Thus overall ZWM is unfavored compared to other ZW versions, but with enough uncertainty that it may be best to ignore these factors.

The figure you present as if it were from Pekar 2021 has been digitally altered from the figure in the paper to remove the confidence intervals in the legend and shift the curve so that it ends at Dec 11th rather than Dec 4th.

See the originals in Figure 3 of the published paper and the bioRxiv version:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8139421/?report=reader#!po=12.5000

https://www.biorxiv.org/content/10.1101/2020.11.20.392126v1.full

This is obviously a very serious problem. Please share the provenance of the figure in your post, since it does not come from the source you have claimed.

>"What is the Weissman paper that observeralt is talking about? It argues: if the pandemic started at the market, each seemingly non-market-linked case must ultimately derive from a market-linked case. Therefore, we should expect non-market-linked cases to require more steps than market-linked cases. Therefore, they should be further away. But if we look at the map above, we see that not-market-linked cases are closer to the market than market-linked cases. So something must be wrong, and that something might be ascertainment bias."

It's not just vaguely "and that something might be ascertainment bias." It's that unlinked cases being closer to HSM than linked cases is exactly what one would expect if there was also proximity ascertainment bias (e.g., through searching nearby hospitals). Intuition: linked cases were found even if they lived far away (found via the case definition), but unlinked cases were almost only found if they lived nearby.
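The intuition can be checked with a toy simulation. This is only a sketch: the share of linked cases, the exponential distance distributions, and the distance-decaying detection probability are all made-up assumptions for illustration, not estimates from the Wuhan data. Unlinked cases are deliberately placed farther from the market on average, yet the ones that get found end up closer than the linked ones.

```python
import math
import random


def mean_detected_distances(n_cases=20000, seed=0):
    """Return (mean distance of found linked cases, mean distance of found unlinked cases).

    All parameters below are hypothetical, chosen only to illustrate the effect.
    """
    rng = random.Random(seed)
    linked, unlinked_found = [], []
    for _ in range(n_cases):
        if rng.random() < 0.3:
            # Linked case: found via the case definition, wherever they live.
            linked.append(rng.expovariate(1 / 8.0))  # mean 8 km from the market
        else:
            # Unlinked case: truly farther away on average (mean 10 km) ...
            d = rng.expovariate(1 / 10.0)
            # ... but only found if the search happened to reach them,
            # which gets less likely with distance (assumed 3 km scale).
            if rng.random() < math.exp(-d / 3.0):
                unlinked_found.append(d)
    mean = lambda xs: sum(xs) / len(xs)
    return mean(linked), mean(unlinked_found)


if __name__ == "__main__":
    linked_mean, unlinked_mean = mean_detected_distances()
    print(f"found linked cases:   mean {linked_mean:.1f} km")
    print(f"found unlinked cases: mean {unlinked_mean:.1f} km")
```

Under these assumptions the detected unlinked cases cluster near the market purely because of where people looked, which is the pattern described above.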

Doubling time arguments very, very early in the pandemic aren't really valid because the sample size changes the math. Some people, maybe most people with early variants, infect zero other people. Others infect dozens. In bulk, this averages out and we can speak of an aggregate doubling time. When it's just a few people the tails are very long. A handful of people with COVID in November could turn into thousands in December or still just a handful, depending on the roll of the dice.
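A small branching-process sketch shows how wide the spread of outcomes can be when starting from a handful of cases. The numbers are assumptions picked for illustration only (5 initial cases, a mean of 2.5 secondary infections per case, strong overdispersion with k = 0.1, 8 generations); none of them come from the posts or the papers discussed here.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson variate with Knuth's multiplication method (fine for modest rates)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def offspring(rng, r_mean, k_disp):
    """Secondary infections from one case: a gamma-Poisson mixture,
    i.e. negative binomial with mean r_mean and dispersion k_disp."""
    rate = rng.gammavariate(k_disp, r_mean / k_disp)
    return poisson(rng, rate)


def simulate_outbreak(rng, initial=5, generations=8, r_mean=2.5, k_disp=0.1, cap=1000):
    """Return the size of the last simulated generation.

    Stops early if the outbreak dies out or a generation reaches `cap` cases.
    """
    current = initial
    for _ in range(generations):
        if current == 0 or current >= cap:
            break
        current = sum(offspring(rng, r_mean, k_disp) for _ in range(current))
    return current


def run(n_runs=300, seed=1):
    rng = random.Random(seed)
    return [simulate_outbreak(rng) for _ in range(n_runs)]


if __name__ == "__main__":
    outcomes = run()
    died_out = sum(1 for x in outcomes if x == 0)
    took_off = sum(1 for x in outcomes if x >= 1000)
    print(f"{died_out} of {len(outcomes)} runs died out, {took_off} reached 1000+ cases per generation")
```

With identical parameters, some runs fizzle and others take off, so an aggregate doubling time says little about any single early cluster.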

FWIW, Lv is sometimes how the pinyin transliteration for Lü is written, for ease of typing.

When I think of all the discussion it is possible to have about the pandemic, in particular the correct public policy for pandemics, I'm perplexed that so much energy is going into this thing that has only tangential bearing on the future. Yes, let's not have any lab leaks in the future. There, I said it. Anyone disagree? Anyone in favor of lab leaks? Probably the most pertinent issue connected to lab leaks is whether and how we should be doing this kind of research. It's curious that there's almost no discussion of the future.

You still need to get the virus into the wet market. Raccoon dog? Researcher? Frozen fish? Ed Holmes's picture proves raccoon dogs were in the market, but it also proves that virologists were in the market! Why spread from the market and not from the WIV? For hub-and-spoke reasons. Viruses don't spread very well in virus labs. The next places the virus showed up outside Wuhan were transport hubs: Singapore, New York, Milan. Where's the transport hub in Wuhan? The central railway station right next to the wet market.

It seems like going through the process of writing both the original post and this comments compilation has been a very stressful process, but I'm very glad you did and appreciate how thoughtfully you approached everything. I have found this all interesting and useful to read, and it has led to thoughtful conversations. Thanks!

To be honest, the paragraph about Rootclaim's investigation into the chemical attacks had the strongest impact on me in this post. Before that I thought they were some serious group like the Samotsvety (although Saar's behaviour after he was defeated in the debate had already made me very wary). After reading it, I can only repeat the Tobias Schneider question: why should we take Rootclaim's opinion seriously at all?

Scott, you have used a doctored version of figure 3E from Pekar et al. This misleading version was also used in the Rootclaim debate according to the replies here.

https://twitter.com/nizzaneela/status/1777989261817508165?t=B5bQ5CPYHe6e0A4mCxPEdQ&s=19

Since it looks like nobody else has pointed it out:

> This has been true ever since the very beginning of epidemiology, when John Snow successfully traced a cholera outbreak back to its origin at a contaminated water pump by taking the center of the map of cholera cases.

This isn't what happened! Snow didn't make the map until well after he concluded that the contaminated pump was responsible; the map was made after the fact as a way to convince other people, not part of how he came to the conclusion. I learned about this from this video by Patrick Kelly: https://www.youtube.com/watch?v=bALs7kNpNSM

Technical rebuttal to the analysis of early claims presented above. It has also come to light that the Figure 3E used above from Pekar et al. is doctored. It's not the figure they use.

https://arguablywrong.home.blog/2024/04/09/how-likely-is-it-for-covid-to-establish-itself/

> I would literally never do this if I was designing a small insert (maybe I wouldn't notice if it happened by chance with ~1 in 25 odds in a naive codon optimization algorithm as part of a larger sequence). High GC% is bad. Tandem repeat is worse. Several other perfectly fine arginine codons. And I wouldn't engineer a viral genome using human codon usage. An engineer would not do it.

While I do not care much for speculations about the origin story of covid, I find this fascinating.

What is so bad about tandem repeats?

Why is it bad to have a high GC percentage? Are these bases rarer in nuclei, limiting the number of copies a virus can create? Or is it that G and C are identical in DNA and RNA while T gets replaced with U?

I thought that the genetic code was universal; human ribosomes parse viral RNA just fine.

Per Wikipedia, CG(U|C|A|G) and AG(A|G) encode Arg. Which of these are good for viruses? Why are others better for humans?

If this is bad for a virus, did later covid variants evolve to replace this with a better encoding?
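For reference, the two properties named in the quote (GC content and tandem repetition) can be made concrete with a short sketch. It uses the standard genetic code, in which arginine is encoded by CGU, CGC, CGA, CGG, AGA and AGG; the helper names are made up for illustration. It only measures the sequence and says nothing about why those properties would be undesirable.

```python
# The six arginine codons in the standard genetic code (RNA alphabet).
ARG_CODONS = ["CGU", "CGC", "CGA", "CGG", "AGA", "AGG"]


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def is_tandem_repeat(seq, unit_len=3):
    """True if seq consists of the same unit_len-long unit repeated back to back."""
    units = [seq[i:i + unit_len] for i in range(0, len(seq), unit_len)]
    return len(units) > 1 and len(set(units)) == 1


if __name__ == "__main__":
    for codon in ARG_CODONS:
        print(codon, f"GC = {gc_fraction(codon):.2f}")
    insert = "CGGCGG"  # the arginine pair discussed in the quote
    print(insert, f"GC = {gc_fraction(insert):.2f}", "tandem repeat:", is_tandem_repeat(insert))
```

CGG and CGC are the only all-GC arginine codons, and CGGCGG is the same codon twice in a row, which is what the quote is objecting to.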

Regarding Syria, it should be noted that the Western governments attributing the chemical attacks to Assad are largely the same ones that were arming the rebels in the first place, and Wikipedia is loath to write anything critical of NATO powers with regard to current events, especially when the US press circles the wagons.

I'm not saying we know that the rebels staged the attack, but I am saying there's a lot we don't know, and lying on all sides of that mess.

It's strange that Scott has trouble comparing his factors and mine since I posted a translation as a comment on his previous "book review".

Here's a slightly simplified and improved repeat of that comment.

Net priors, using Scott's grouping plus Scott's "reasons WIV wouldn't do it":
Scott 2.7; MBW 12 before integration of uncertainty, 5.4 after integrating

Combination of all HSM/lineage factors (Worobey/Pekar):
Scott 0.002; MBW 1

FCS-ish:
Scott 25; MBW 25

Cover-up success:
Scott 0.5; MBW: did it?

Other factor, restriction enzyme pattern:
Scott NA; MBW 70

So really it all comes down to Worobey/Pekar, plus the surprising new FOIA find that the DEFUSERS were planning just the restriction enzyme pattern that Bruttel et al. had noticed.

Does Scott (or anyone) really believe that there's about one chance in a thousand that the Worobey data are messed up? That there's 99.9% probability that George Gao, the official WHO summary on market linkage, Bloom, Kumar, Samsone, Lv... are all wrong? If Scott allowed any reasonable chance that they were right, then his odds would favor LL.
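The arithmetic behind that last claim can be sketched as follows. The factor values are the ones listed in this comment; everything else is an assumption made for illustration: that each factor is an odds multiplier in favor of LL, that the factors simply multiply, that MBW's unspecified cover-up factor and Scott's missing restriction-enzyme factor are left out, and that doubt about the Worobey/Pekar analysis can be modeled as a linear blend between Scott's 0.002 and a neutral factor of 1.

```python
from math import prod

# Odds factors as listed in the comment (assumed here to be LL : zoonosis multipliers).
scott = {"net priors": 2.7, "HSM/lineage": 0.002, "FCS-ish": 25, "cover-up success": 0.5}
mbw = {"net priors": 5.4, "HSM/lineage": 1, "FCS-ish": 25, "restriction enzyme pattern": 70}


def odds(factors):
    """Multiply all factors into overall odds for LL over zoonosis."""
    return prod(factors.values())


def blended_factor(q_flawed, factor_if_sound=0.002, factor_if_flawed=1.0):
    """Hypothetical blend: with probability q_flawed the market analysis is
    uninformative (factor 1), otherwise it carries Scott's full factor."""
    return (1 - q_flawed) * factor_if_sound + q_flawed * factor_if_flawed


if __name__ == "__main__":
    print("Scott, as listed:", odds(scott))
    print("MBW, as listed:  ", odds(mbw))
    for q in (0.01, 0.03, 0.10):
        doubted = {**scott, "HSM/lineage": blended_factor(q)}
        print(f"Scott with {q:.0%} chance the market analysis is flawed: {odds(doubted):.2f}")
```

Under this simple blend, Scott's listed factors multiply to well below 1, but a doubt of only a few percent about the market analysis is enough to push the product above 1.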

Scott writes:

>The first two known Lineage A cases were very close to the market

Response:

Untrue.

One case is just claimed to be near the market with no location or even distance given.

The other is yet another mistake in Worobey et al. that we discovered during the debate. They claim the case "resided closer to the Huanan market than expected" based on Wuhan population distribution, forgetting that infections are not linearly related to density. In early stages of spread, the exponential growth could easily cause dense neighborhoods to have 10 times more cases per capita. So this "finding" is just an artifact of HSM being in one of Wuhan's densest areas. Given that it was already borderline significant at p=0.034, it’s safe to say there is nothing there after correcting the mistake.

The argument from Anon Virology that you mention above, that lineage B came first because it was more prolific and had more genetic diversity than lineage A, is ridiculous. On that basis, Omicron came first. Note that Lv et al. (2024) published new genomes which suggest a single spillover with lineage A coming first.

Lineage A was seen in various places outside Wuhan before lineage B, and was only later replaced. This is difficult to reconcile with A coming after B, which outcompeted it.

https://twitter.com/virologyanon/status/1778322175214436608?t=7e83InBAk2zAZVwaZXjDKQ&s=19

I am surprised to see that the likelihood of natural COVID in Wuhan is considered to be proportional to its population. If we have a lab in a city with 10 million people, and another lab in a city with 100 000 people, we wouldn't consider the first to be 100x more likely to see a lab leak than the second.

What should matter is the contact surface between the potential source and people. If there are 10 000 people per day visiting the wet market or other locations with wildlife in Wuhan, that gives the city roughly the same probability as 10 000 people living in a rural area. Well, realistically it would be more for various reasons... but maybe 10x more, not 1000x more.

I imagine this was covered somewhere, but I haven't seen it. If a kind soul can enlighten me on what mistake I'm making, I would be grateful.

It sounds like this has been pretty stressful to cover, but thanks for doing it - it's a fascinating topic for all kinds of reasons, not just to do with Covid, and I appreciate you highlighting it and talking me through it all.

I don't really understand the responses Scott's made to Saar's "but is the market cluster really such a coincidence in lab leak scenario?" — seems like they sort of get in the weeds about /why/ these markets tend to be involved in early COVID spread, but that doesn't address the main thrust of the point (whether or not it's the frozen seafood, it's still not unlikely in a lab leak scenario if it kept happening when COVID spread to other countries).

But I haven't had time to go back and re-read more carefully, so maybe I'M missing something.

Nobody knows what covid "doubling time" is. Usually when people talk about "doubling times", especially in the context of early covid cases, they mean doubling times of measured tests. This is something completely different from the actual doubling time. I am very surprised that people who are otherwise doing good statistics miss this distinction. If you are at the same time ramping up testing, and ramping up awareness/fear of the disease, those are already two (!) exponential processes that influence the measured number of tests. Good luck extracting the actual doubling time.
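A back-of-the-envelope sketch of the confounding: if true infections grow exponentially and the fraction of infections that gets detected also grows exponentially (through more testing, more awareness, or both), the growth rates add, so the doubling time of reported cases is shorter than the true one. The function name and the example numbers are hypothetical, and a detected fraction cannot keep doubling forever, so this only applies to a short early window.

```python
def observed_doubling_time(true_dt, *detection_dts):
    """Doubling time of reported cases (same units as the inputs).

    true_dt: doubling time of actual infections.
    detection_dts: doubling times of each process inflating detection
    (e.g. testing capacity, awareness). Exponential growth rates add,
    so the reciprocals of the doubling times add.
    """
    total_rate = 1 / true_dt + sum(1 / dt for dt in detection_dts)
    return 1 / total_rate


if __name__ == "__main__":
    # Hypothetical numbers: infections double every 6 days, while testing
    # and awareness each double the detected fraction every 12 days.
    print(observed_doubling_time(6, 12, 12))
```

In this made-up example the reported cases would double about every 3 days even though infections double every 6, and nothing in the reported series alone separates the two.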

Also covid is not actually exponential. The exponential model of covid spread makes some very implausible assumptions about how people are connected. There is unfortunately very little research on this because accurately modeling these kinds of networks is hard. Also with covid there were clear seasonal effects in the spread in Europe - it basically didn't spread at all in summer, just like the flu.

That said, I don't really buy "Brazil had 4 months of covid and nobody noticed". But it isn't as much of a slam dunk as you make it out to be.

Does this debate make Rootclaim reconsider the usefulness of this betting/debate idea?

It seems like, from their point of view, the truth did not prevail - neither in the debate itself, nor in Scott’s commentary, and many people have updated towards zoonosis.

So was the whole debate counterproductive wrt spreading the truth as they see it? What does that imply for the format generally?

>Here is a map of December 2020 COVID cases. I recommend ignoring the contour lines and just looking at the dots. How could dots be oversmoothed?

Counterargument: this is potentially an argument against the data being good. The infected people at the wet market presumably don't all live there; they commute (or at least their customers commute). Why wouldn't you have any secondary clusters anywhere? It's kind of weird that you have so many cases centered around it, suggesting that maybe people got tested more near the wet market than elsewhere.

Well thanks for doing this Scott, each part has been an illuminating read and it seems like it was all a giant hassle. Good luck with the lab leak zealots.

When your debate opponent posts a picture of a sequenced pair of viruses and says, "See!!!" and you can, in fact, see an obvious explanation, then it is clear that they either have no idea what they are talking about or are just being disingenuous.

Scott writes:

>The fact that the COVID comparison has few mutations, and the HKU1 insert has many mutations, just shows that whatever older virus we chose to compare HKU1 to is more distant from HKU1 than BANAL-52 (or whatever) is from COVID.

Response:

It doesn’t matter what the reasons are for there being so many mutations. Whatever the reason is, it also applies to the insert, giving it far more opportunities to happen.

Additionally, having a long insert is just one of 4 rare coincidences in the SARS2 FCS. The rest don't appear in HKU1:

1. The HKU1 insert doesn’t introduce a new FCS. It just happens to be next to it.

2. The HKU1 insert just seems like 7 repeats of TCT (Serine) plus a few SNVs. That's something that is common in natural evolution (duplicating a sequence during RNA replication).

3. It has no rare sequence like CGGCGG.

In contrast, the SARS2 insert appears in a highly conserved area with few mutations, introduces a completely new FCS, and uses a specific unique sequence that is completely foreign to this virus.

Given that this is the best example that zoonotic supporters managed to find, and it addresses only 1 of 4 rare coincidences, it actually strengthens the conclusion that this is very unlikely to arise from natural evolution.

I know you ended this post with a lament that the haters will continue on, but I was a 75-25 lab leak guy before your posts here and I'm closer to 90-10 zoonosis now. Big things that changed my mind:

1. China wanted to cover up a market origin - I knew that China was hiding something with all the evidence they destroyed, but didn't consider that they might have done so with the intention of covering up a market spillover, which made any conclusion harder to come to.

2. Other zoonotic origins have taken several years to find the animal source, if they find it at all - I think there was an example of a known zoonotic origin floating around that made me think the source would be easier to find if it existed, and the lack of one made me lean lab leak more than I do now.

3. The unscrupulousness of Saar - he definitely seems like a smart guy but when you don't take any time to step back and consider you were wrong after losing the debate and go to immediately claiming you are still Obviously right and will be vindicated, it betrays a severe lack of epistemic humility that is necessary to actually be right about stuff.

More diversity comes from more spread.

Very early on, most of the first sequences outside of Wuhan were A, not B.

New discovery regarding the claim that a lab-leak requires WIV to have 'secret viruses', which is supposedly unlikely:

Daszak also confirms that EcoHealth and Wuhan Institute of Virology "have 15,000 samples in freezers in Wuhan, and could do the full genomes of the 700+ CoVs we've identified," contradicting his previous false claim that EcoHealth/WIV had published all of the CoVs they identified.

https://twitter.com/R_H_Ebright/status/1778930829563191562

Apr 16·edited Apr 16

The better question is whether anyone should trust Chinese science published for an external audience. They obviously have incentives to know the truth at a leadership level. That is not the same as what is disseminated.
