761 Comments

Does anyone else feel like this whole thing is an argument that science should be done purely anonymously?

It seems impossible to separate ego and tribalism from these analyses. If everything HAD to be published anonymously, maybe that would remove all incentives for anything other than just getting the best evidence in a place where other people can evaluate it?


Maybe? But then you have this problem:

"The “in SHAMBLES” is always a couple of papers that have “now debunked” the best papers of the other side. These come out on a regular schedule. They’re usually by people in unrelated fields - the ones I saw on COVID origins were by computer scientists, physicists, and agricultural scientists."

I don't know how you square anonymity with verifying credentials. And I don't think pseudonymity (a la our very own esteemed host) would alleviate the ego concerns.

When it's easy to flood the market of ideas with crap, "does this person have the background such that they can be expected to speak with accuracy on this?" is a good first filter.

author

One option might be to have anonymous scientists, but nonanonymous journals (eg you don't know who this guy is, but they got into Nature so maybe you should take it seriously). I don't like this because it gives more power to journals, though.


Journals are one solution to the problem of “how do I decide where to deploy my limited attention.” They used to be a solution to “how do I disseminate this idea,” which is now a solved problem.

You’re right that _any_ solution to the “what do I pay attention to” problem conveys immense power to whoever implements it. I think if we want liberalism to survive the Information Age, the only workable answer is people encoding their values in algorithms and using those + their social networks to determine where to deploy attention.

You can imagine replacing journals with approvers/aggregators. So instead of “it’s published in this journal”, which really means “journal X takes this seriously”, everything is published on the internet, and journals simply give their stamp of approval. So the same paper can be approved by three journals, rather than published only in one.

Anyone can then decide whether or not to take seriously journals that ignore papers they consider important. Not approving a paper now signifies a rejection of its importance, which carries its own risks. Does this reduce the power of individual journals by making them compete?


It certainly would reduce their rent-seeking power, which something tells me is the most desired one.


If I know anything about academic publishing houses (and I published 20+ papers in journals) this will quickly devolve into the publishing editors contacting the authors with a "pay us X and we will give you the stamp".

And if you think that those journals that do that will fail because they will be outcompeted by those who actually do due diligence, I have a one word reply for you: "sigh".


How about basing it on public-key cryptography? Key-signing follows successful thesis defense, then everyone involved is sworn to secrecy, and reputable journals (or at least some consistent subset thereof) refuse to publish anything from an unsigned key, or a key whose true name their peer reviewers are able to establish beyond reasonable doubt based on open-source evidence, while reviewers are likewise disqualified if the paper's author is able to identify them.

That way it'd be possible to build up a reputation for expertise in particular fields without journals being able to gatekeep - since messages, or even cash payments, provably passing to and from the same private key aren't restricted to any particular channel - but leveraging that reputation as social-media clout personalized enough to be properly ego-gratifying would soon force retirement from serious research.
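To make the key-signing idea concrete, here is a minimal sketch of the two checks a journal would run: that the submitting key was endorsed by a thesis committee, and that the paper was signed by that key. Everything in it is an assumption for illustration: the function names (`make_keypair`, `journal_accepts`), the fixed Mersenne primes, and the use of toy textbook RSA, which is not secure and only stands in for a real signature scheme.

```python
import hashlib

E = 65537  # conventional RSA public exponent


def make_keypair(p, q):
    """Return (public_key, private_key) built from two primes.

    Toy textbook RSA for illustration only; not secure.
    """
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(E, -1, phi)  # modular inverse (Python 3.8+)
    return (n, E), (n, d)


def digest(message, n):
    """Hash the message with SHA-256 and reduce it modulo n."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message, private_key):
    n, d = private_key
    return pow(digest(message, n), d, n)


def verify(message, signature, public_key):
    n, e = public_key
    return pow(signature, e, n) == digest(message, n)


# Hypothetical keys; the primes are well-known Mersenne primes.
committee_pub, committee_priv = make_keypair(2**89 - 1, 2**127 - 1)
author_pub, author_priv = make_keypair(2**61 - 1, 2**107 - 1)

# After the thesis defense, the committee signs the author's public key.
endorsement = sign(repr(author_pub).encode(), committee_priv)

# Later, the author signs a paper without revealing a name.
paper = b"Pseudonymous submission: results on topic X"
paper_sig = sign(paper, author_priv)


def journal_accepts(paper, paper_sig, author_pub, endorsement, committee_pub):
    """Accept only papers signed by a key that a committee has endorsed."""
    key_is_endorsed = verify(repr(author_pub).encode(), endorsement, committee_pub)
    paper_is_authentic = verify(paper, paper_sig, author_pub)
    return key_is_endorsed and paper_is_authentic
```

The journal never learns who holds the author key; it only learns that some committee vouched for it. The secrecy and disqualification rules described above are social conventions that no code can enforce.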


Having the background to speak with some degree of accuracy is also the background that would lead someone to have a vested interest in *not* being accurate about certain sensitive topics. Like some of those communications of virologists and funders from early in the pandemic that lab leak proponents wave around- zoonosis vs lab leak didn't really matter; it was a threat to GoF research either way.

Everything is tradeoffs.


I think this is very wrong.

Firstly, anonymity doesn't stop ego and tribalism. The internet is the most anonymous place in the world, and it's also the most egoistic and tribalist. Look at us here in The Best BTL Community That Exists - we still get into arguments fairly frequently.

Secondly, real name accountability is what makes people calm down. Anonymous arguers melt away. They don't have to come back the next day and deal with the fallout - e.g. that instead of getting on with their jobs, everyone in the building has to have *another meeting* about the utility of ivermectin because Cregg in Accounting just can't leave it alone. At some point, Cregg has to prove his case or get fired. (Sorry, that's a dumb example, but I hope the point holds.)

Apr 11·edited Apr 11

100 percent agree.

I'm old enough to have sipped the 1980s/1990s kool aid about widespread public Internet usage becoming an improving change for civic discourse. Obviously today it's difficult and frankly a bit embarrassing to recall the logic of that hope. But anyway it's seemed clear for a good while now that normalizing online anonymity has on balance been disastrous.

I totally recognize and agree with the arguments for anonymity, to be clear! But sadly, at scale it is a force multiplier for some of the worst human instincts. Making it a default expectation online has, all things considered at least in the developed world, turned out to have been a huge unforced error. And taking the scientific world to that same place would be ultimately tragic.

Apr 9·edited Apr 9

I don't think the problem here is ego or tribalism. The debate is long because people think they are right and there are a lot of possible branches in it.

To me it feels like we need some better way to structure debates.


YES. I used to think Kialo was a great idea, but it seemed nobody was interested. Have you seen it before?


Yes I know it, and I also think the idea is great.

It is unfortunate that people aren't more interested. I think it is because people aren't used to it, and either it isn't done exactly right or some important functionality is lacking.

I was also interested in this project: https://github.com/canonical-debate-lab/paper


I imagine that would also remove most incentive to do science. We don't rely on saints to build our best companies, we let money do the work. Scientists get prestige (and sometimes money), and on net it's better than having to rely entirely on however many saints we can produce.


I think public goods are very different from businesses. These giant companies all use open source software, which is a public good and produced on a model that I think generally works better than publicly funded science.

If goods and services are cheap enough, independent wealth becomes doable and acts as a kind of “free market tenure.” That plus patronage plus corporate funded science I think will get better results than the current setup where most academic papers are not read by anyone.


Georgist LVT + UBI would make de facto independent wealth the default for everyone, greatly reducing the need for corporate funding or other patronage, and hopefully also concomitant bias.


I think there are some pros and cons.

On the plus side, there is less pressure to conform to the narrative of your tribe.

On the negative side, with a name attached to the researcher you can at least look up their other publications and get a gut feeling of how biased they are.

If someone claims a surprising result, I might take it seriously if they have some academic reputation which they stake on it, but completely ignore it if it was just some crank churning out one revolutionary paper a day.

A case could be made to keep the author anonymous from the reviewers who make the recommendation to publish it in a journal or not. While this would get rid of credentialism ("Well, he is a famous professor, we should be honored to publish his paper"), there are again trust issues. A published academic is less likely to publish a paper using made up data, or plagiarized from another journal. An uncredentialed outsider has no reputation to stake, so they might submit a paper hallucinated by an LLM for shits and giggles.


Journal submission via public-key cryptography, with keys signed by initial thesis committee and then various co-authors, could in principle allow a professional reputation to be built up based on someone's previous relevant work and absolutely nothing else.
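As a rough sketch of how "signed by the committee and then various co-authors" could become a reputation check, the snippet below treats endorsements as a directed graph and asks whether a key is reachable from a trusted committee key within a few hops. The key names, the `endorsements` mapping, and the `max_hops` cutoff are all hypothetical, and a real system would verify each signature cryptographically rather than trust a plain dictionary.

```python
from collections import deque


def is_vouched_for(key, endorsements, trusted_roots, max_hops=3):
    """Breadth-first search along 'signer -> signed key' edges.

    Returns True if `key` can be reached from any trusted root
    in at most `max_hops` endorsement steps.
    """
    seen = set(trusted_roots)
    queue = deque((root, 0) for root in trusted_roots)
    while queue:
        current, hops = queue.popleft()
        if current == key:
            return True
        if hops == max_hops:
            continue  # do not follow endorsements beyond the cutoff
        for signed in endorsements.get(current, ()):
            if signed not in seen:
                seen.add(signed)
                queue.append((signed, hops + 1))
    return False


# Hypothetical endorsement records: signer -> keys it has signed.
endorsements = {
    "committee_key": ["author_a"],  # thesis committee signs a new PhD's key
    "author_a": ["author_b"],       # co-author signatures
    "author_b": ["author_c"],
    "unknown_key": ["author_d"],    # not connected to any committee
}

print(is_vouched_for("author_b", endorsements, {"committee_key"}))
print(is_vouched_for("author_d", endorsements, {"committee_key"}))
```

The hop limit is a design choice: a short limit keeps reputation close to people a committee actually examined, at the cost of excluding distant but legitimate collaborators.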


"The secret of success is sincerity. Once you can fake that you've got it made."

Going a little past that, I suspect there is a lifecycle to institutional information sources: (1) an institution will gain reputation as they make public(izable) statements that turn out to be true and relevant (and therefore useful); (2) the reputation for speaking useful truths eventually attracts people who want to harness that reputation for advocacy or propaganda (I may be repeating myself); (3) those people eventually get in (barring institutional resistance), degrading the truth-alignment and therefore the usefulness of the institution.

People still want to know true things, so in a free market *someone* eventually goes back to (1), albeit probably not at the same institution.


Does anyone else feel that this situation is more evidence that this is all a waste of time for them to pay attention to because they only have X hours in the day, and if people laser focused on it seem to vehemently disagree with animosity, the different conclusions are more likely to be something like “the same hard-coded priors that cause political and religious differences”?

I’m glad someone else is looking at this for sure, but it seems like it would be foolish for me to claim to know anything beyond “people are fighting about it, this one group says X, this other group says Y.”

That seems like it would have worked better through most of the history of science, rather than trying to, eg pick between gradual and calamitous theories of geological formation, the hot button issue of a few centuries back.


This is what I've been thinking. Almost none of what I believe about the rest of the world hinges on whether or not the virus is from a lab leak or a zoonosis. China shouldn't have wet markets, there shouldn't be gain-of-function research, scientists shouldn't suppress debate or mislead journalists about what they think just because they don't like the social implications.

Yes, it's important to have correct knowledge, but it seems like this is a debate that will never be resolved without more information of the right kind.

I don't understand why animosity is necessary in debates like this.


Well... the cost/benefit ratio of gain-of-function research depends on how reliable BSL facilities are at preventing lab leaks, a statistic which in turn will be affected to some non-zero extent by whether a lab leak in fact took place at the WIV. I think the frequency of (well-documented) lab-leaks from other facilities is high enough to plausibly justify a ban regardless, and it's not like the Wuhan biolab was an impregnable fortress in terms of security procedures. I do feel a little annoyed about not being exposed to useful rebuttals to the furin-cleavage-site argument sooner, though I suppose I could have gone digging on my own initiative.

I don't think anyone disagrees about shutting down wet markets.


Yeah, that's sort of why I said almost none of what I believe depends on lab leak. I'll admit my reasoning with regard to the lab leak theory was a little biased, but in the opposite way, which I don't feel great about. Namely, I have been more partial to zoonosis from the start than lab leak, which seems discordant with what should have been my priors. I had an impression of the historical track record of lab leaks, so I should have suspected that from the start, but I didn't put two and two together until more information came to light. In the end, it's not my area, so I didn't feel like investing a lot of time on it.

But I feel like in a lot of debates, there's a lack of practicality that makes them higher stakes than they otherwise would be. For instance, a lot of solutions to climate change would be good regardless of what people believe about whether or not it's happening or human caused. If the Earth isn't warming now, it will at some point in the future, so we should develop geoengineering techniques to address that. Energy abundance combined with a resilient and diversified power grid is a good thing to do as well. But people seem to want to disagree with animosity no matter what the topic is.


I'm not totally unsympathetic to green policy objectives on paper, though I reckon either nuclear or space-based solar are the most feasible long-term solutions here, but I think Scott already went over the reasons why climate change is overhyped as an existential threat to humanity. Obviously, the flip side is pretending it doesn't exist at all.

There's a theory that people broadcast preposterous convictions as either a conscious or unconscious mechanism for signalling in-group loyalty in the absence of other mechanisms for friend/enemy recognition (such as war-bonds, local relationships and/or genetic proximity.) I hate this idea, but I can't say it's incompatible with observation of contemporary western politics.


Yeah, I agree about nuclear and space-based solar. I'm actually pleasantly surprised someone brought up space-based solar, it's been thought about for ages, but rarely talked about in energy debates or in currently common space-colonization schemes. My larger point about energy is that we should be pursuing as many options as possible for getting it, storing it, and using it efficiently in a variety of ways.

I agree that anthropogenic climate change is overhyped, much to the detriment of energy abundance discourse. One perspective I find interesting about acknowledging that it's happening is that we can basically declare success in our first, albeit unintentional, geoengineering experiment. People should be proud we accomplished something like that because it gives us options when natural climate change occurs.

Whatever the reason for Western politics, I agree it's dysfunctional.


There's a good reason people don't talk much about space-based solar. It just doesn't pencil out.


The problem with geoengineering is that the world is a complex system and every intervention has unpredictable side effects. There's no magic bullet, but talking as if there is a magic bullet might make people feel like they have an excuse to make the problem worse.


I think we can either intervene unintentionally the way we are currently, without any regard to consequences good or bad, or we can try to be responsible. I don't think there's a magic bullet, we will have to engage in many different coordinated projects to be successful. I'm also not saying it's easy. But I don't think non-intervention is possible with nearly 8 billion people on the planet.


I guess you mean anyone here. Wet markets are pretty common and nobody is shutting them down, so there must be a lot of people who really like them.


Fair enough.


My understanding is, it's not that the government doesn't genuinely want to shut them down, or that most of the populace wants them to remain in place. Rather, the government's efforts to shut them down are stymied by corruption. The national government hands down directives to shut them down, but at a lower level, the enforcers see this as an avenue to extract bribes, the people running businesses at wet markets bribe the officials as part of the cost of remaining in business, and the majority of the population who don't patronize them ignore them.

They're an embarrassment to the government, not simply because they're defying international sentiment on the risks, but because they showcase the government's weakness, that they don't actually have the power to get rid of them without what they consider an unacceptable cost in social capital.


It's somewhat amazing that the CCP were able to shut down their own people's rate of reproduction more easily than they were able to shut down consumption of bat-meat and boiling of live dogs.


This might be terminology confusion? A wet market is a market that sells fresh food, such as a farmer's market, in contrast to a dry market which sells stuff like fabrics and electronics.


I think that's not the definition people are usually using in these debates, although I don't know the technically correct definition.


Yeah, I think most people are talking about markets with live animals, but the terminology could be confusing.

Probably because we don't use the term wet market in western countries (we call them farmer's markets), and there was a lot of talk in the news about wet markets during COVID, people started misusing the term.


Hearing Scott Adams' podcast I've become aware of persuasion and 'sticky' words, and 'wet market' is sticky for all the right reasons. Wet. Germ transmission. A tinge of yuck: blood, mucus, who knows what else, that brings an emotional content. And the words rhyme. It's got all the ingredients. Accuracy in description counts--but not as much as emotional stickiness.


It seems to me that there are really two debates here. One is the object level debate, which, like you, I don't care too much about.

The other is a debate about how we should analyse data and different hypotheses, and also how we should debate them, which is what RootClaim is about, and I care more about that.

The object level debate seems more like some kind of test of the epistemic positions to me.

Apr 9·edited Apr 9

I don't think any animosity is needed, but I think the animosity comes from there being too much stuff to read/analyse/respond to, and people being sort of overwhelmed by it.

And also from some sort of "I think this argument/analysis will finally work to convince", and this failing to be the case, which could be frustrating, even when all parties are reasonable.


That makes the animosity seem even worse to me. If one of the goals is to test a new epistemic method for determining which explanations work best, then the people involved should be even less inclined to bring their egos into it. I think epistemology is difficult to improve on at this point. I know that one of the goals of this blog and others is to do just that, but at this point I think the scientific method is really, really good to the degree its principles are instantiated and incentivized. It's unlikely that any one person will make dramatic improvements, so people that try should be humble about it. I get that the people involved are humans with kludgy brains, but this could have been fun, especially if the object level debate has little practical value given the policies that follow from both a lab leak theory and a zoonotic theory.


Yes, it wasn't a justification of the animosity, just one possible explanation (and I am really unsure it is a correct explanation to be clear).

Take into account I don't feel like there is a lot of animosity, it still seems mostly ok to me.

Apr 12·edited Apr 12

I've been stuck on the animosity question for days, and Scott labeling it as a fight and saying he doesn't like fights, and it's swirling in my head next to the map of the initial infections.

From inside of Scott's presented narrative (not accepting it as gospel, just trying to understand it), it sure sounds to me like we’re back at mistake theory arguing with conflict theory. Often the lab leak proponents in this narrative have as a direct object the individuals on the other side of the debate (e.g. "Scientists back away from" "the guy making the claim is slippery"), not just their arguments. Saar parses Scott's take on Rootclaim with "sadly, Scott seemingly hadn’t enough time to" "dig into our analysis and fully understand it." Poor Scott!

On an intuitive level, this is not inherently a judgment for or against either side. If the consensus narrative is wrong, then we should ask if something is also wrong with the gatekeepers of consensus. If there is a conspiracy (and, dammit, there was at least one, so we can't just go "that sounds paranoid"), the conspirators have done wrong, possibly with a direct known cost in human lives, not simply made an error and failed to properly update for the next pandemic. Whereas those siding with the consensus narrative have the relative luxury of focusing on the problem, because consensus gatekeepers don't normally need defending within the consensus. (I remember once seeing a Punnett Square on changes to the status quo, with Support and Oppose on one axis and Active and Passive on the other, and then the Passive Oppose box was Xed out because Passive Opposition to the Status Quo works out exactly the same as Passive Support in practice.)

Then... iunno, something something conflict something something Moloch something something everyone ends up with chocolate in their peanut butter and you've got animosity and a fight. I wish after days of thinking about this I had a better conclusion, but I don't.


...Centuries? I thought plate tectonics wasn't really settled on until the 1970s?


But catastrophism vs gradualism was largely settled in favor of gradualism centuries ago. (Of course, we now have a tempered gradualism, where the vast majority of geology is gradual, apart from a few special flood basalts and meteor impacts.)


Ah, I see.


That's mostly my conclusion too. I was leaning lab leak due to skimming Michael Weissman's blog and thinking this guy looks like he weighed the evidence in detail. I'm grateful for Peter and Scott's work, which sounds even more convincing. But at this point it feels like a waste of time to spend more time on this.


I feel like I got the opposite impression here. A bunch of people who were gung ho about lab leak a few months ago have now been convinced of zoonosis, and it was worth paying attention to that. Beyond that though, it’s like trying to follow a scientific debate in a field that isn’t your own - very little will come of it and it’s better to let the people who spent the time and effort understanding it report what they think.


This was my impression, too. “These people changed their opinions” does seem like useful information to me. Especially Scott, given I trust him to try his best to be fair and I think he really does a better job than almost anyone. I don’t see a point in going much deeper than “people I trust think there are multiple distinct lines of evidence pointing at the market”. Maybe I’ll get around to watching this debate, but an hour? I don’t see the utility matching the cost.


It has been a "fascinating" back and forth about one of the least interesting questions of the pandemic, for sure.


This is why I like this blog. I wish the media would do more stuff like this. Very interesting, and now I feel like I have an informed opinion on something that two weeks ago I would’ve punted on. Kudos to Peter.


OK, so if we don't over-focus on tiny details (which is what Peter's argument is), then after all that discussion what we still get is:

1. A novel coronavirus emerges in the same city as a lab doing experiments on novel coronaviruses and not anywhere near where such viruses naturally occur, or in any of the other cities of the world where animals and humans come into regular contact.

2. PRC immediately blocked all investigations.

3. Self-proclaimed "prestigious scientists" thought a lab leak was "just so freaking likely", lied about it deliberately and then organized a PRC-style conspiracy to shut down all such speculation.

Sum that up and it looks very likely that it came from the lab and everyone with relevant expertise immediately realized that was the most likely explanation.

For the specific questions about things like Brazil or whatever that seem baffling, all those apparent problems are only problems if you continue to accept that modern epidemiology has any accuracy or value. It doesn't, at all. These are the people who claimed with 100% confidence that COVID wouldn't be seasonal, that it spread in a homogeneous population and it would grow without end until everyone had been infected unless there were hardcore lockdowns. No other respiratory virus works that way, and sure enough COVID cases turned out to be seasonal, to depend heavily on things like age and superspreaders, and to peak long before everyone was infected. Lockdowns meanwhile made no difference.

So if there was an early release that went around the world before it was officially recognized, we shouldn't be baffled by non-problems like "why did it not double every 3 days in Brazil" because that claim was wrong to begin with. There is no prestige, there is no skill. Computer scientists are far better at building models and agricultural scientists (or indeed workers) understand animal disease dynamics far better than academic epidemiologists. The people with prestige here are not the ones ignoring counter-arguments.

author

I agree that 1, 2, and 3 are somewhat suspicious, but:

2. PRC had a weird relationship with investigations that I don't think it's fair to describe as "immediately blocked" - see my timeline above for details.

3. I'm not doubting this, but can you find me where a prestigious scientist said this? I find them saying things in other words, but a Google search for "just so freaking likely" doesn't bring up any results. Again, it wouldn't surprise me if this were true, I just want to find the primary source to see if I have an opinion on it.

I don't think epidemiology "has no accuracy or value" - for example, if you believe the lab leak theory, they exactly identified what features the next pandemic would need to have! Again, I see lots of epidemiologists saying in 2010 or 2015 or whenever that the next big pandemic would come from China, from a bat virus, and be related to SARS. All of those were true (even in case of lab leak).

I also think you're exaggerating the mistakes they made during the pandemic. I'm not sure I see anyone saying with 100% confidence that COVID wouldn't be seasonal (link me to someone if you have them), and it also doesn't seem super obvious that COVID *is* seasonal yet. Looking at https://ourworldindata.org/explorers/coronavirus-data-explorer?country=~USA&Metric=Excess+mortality+%28estimates%29&Interval=7-day+rolling+average&Relative+to+Population=true&Color+by+test+positivity=false , I see peaks in April, December, September, January, and possibly another August and/or January. My guess (discussed at https://www.astralcodexten.com/p/diseasonality ) is that COVID will be non-seasonal for a while until everyone gets coordinated antibodies, then stabilize to a seasonal pattern. After forming this theory, I found a paper by Dr. Fauci who had already come up with it, which made me feel more fondly towards him. I feel the same about some of your other comments here.

I'm not sure that anything you say really explains why COVID didn't double every 3 days in Brazil. I think this might be confusing levels of understanding, like "doctors said that one Alzheimer's drug would work and then it didn't, why should we believe them when they say injecting bleach into your veins won't kill you"?
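For a sense of what "double every 3 days" implies if it holds unchecked, here is a back-of-the-envelope sketch. The single starting case and the constant doubling time are simplifying assumptions for illustration, not a model of what happened in Brazil or anywhere else, and `cases_after` is a hypothetical helper name.

```python
def cases_after(days, doubling_time_days=3.0, initial_cases=1):
    """Cases after `days` of unchecked exponential growth.

    Assumes a constant doubling time and no saturation,
    which real outbreaks do not satisfy for long.
    """
    return initial_cases * 2 ** (days / doubling_time_days)


for days in (30, 60, 90):
    print(f"day {days}: about {cases_after(days):,.0f} cases")
```

Under these assumptions, one case becomes about a thousand in a month and about a million in two months, which is why the presence or absence of early visible outbreaks carries so much weight in the argument.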


I mis-remembered the wording slightly:

https://unherd.com/2023/07/the-secret-messages-behind-the-lab-leak-cover-up/

> “The main thing still in my mind is that the lab escape version of this is so friggin’ likely to have happened because they were already doing this type of work and the molecular data is fully consistent with that scenario,” [Kristian Andersen] said.

Andersen is a molecular biologist who signed his name to a paper that appeared in Nature Medicine in March 2020 and was reported around the world, like here:

“The Discussion Is Basically Over”: Why Scientists Believe the Wuhan-Lab Coronavirus Origin Theory Is Highly Unlikely (https://www.vanityfair.com/news/2020/05/why-scientists-believe-the-wuhan-lab-coronavirus-origin-theory-is-highly-unlikely)

The quote in question comes from a secret Slack channel called #project-wuhan_engineering in which a bunch of scientists organized the publication of anti-lab-leak papers, whilst simultaneously admitting that the claims were probably true. The contents of the channel were eventually obtained and they're interesting, you can see them talk themselves into believing the zoonosis theory over time even when they didn't start out that way. The justification for their conspiracy is also given here, they feared a "shitstorm" that would help Trump and maybe start WW3.

> I don't think epidemiology "has no accuracy or value" - for example, if you believe the lab leak theory, they exactly identified what features the next pandemic would need to have!

Virologists did that. Epidemiologists don't use any virological or genetic knowledge as part of their work. It's one of the things that surprised me when I first started reading epidemiology papers. Where's the science? It was all simulations and basic statistical extrapolation without any firm grounding in biology. Where I had expected to see complex formulas that derived expected transmission characteristics from well verified micro-biology there were instead numbers they'd obtained from reading Chinese news reports, or just pulled out of their backsides, or simple derivatives of public data.

author

Thanks for the Andersen quote. As I said on the original post, I agree that Andersen and other scientists acted maliciously in under-rating their real belief in lab leak. But I think if you look at all of Andersen's communications, you see someone who's not sure (to be more pessimistic about this, who's 50-50) and waiting for more evidence to come in, and who eventually found evidence that convinced him. I still agree he is bad.

First of all, many of the people I'm citing above are virologists - for example, I think Pekar's Lineage A vs. Lineage B molecular clock work is clearly virology by this definition. I also think Worobey (technically a professor of evolutionary biology) is more a virologist on this definition than an epidemiologist (and his previous work on genetic diversity in HIV is really good and AFAICT now proven correct). But also, I guess I find the epidemiology less dubious than you? Figure out where all the early cases are, figure out what's in the middle, check a bunch of records to make sure you got it right. Like, if you read Worobey 2022 ( https://www.science.org/doi/10.1126/science.abp8715 ) including the supplementary text, I think it's both pretty straightforward and also pretty good work.


Thing is, I'm not sure him being convinced or on the fence is meaningful. His salary depends on it being natural. His political ideology wants it to be natural. He really, really badly wants to believe it's natural. So when he says it's really friggin likely that it's not natural, that's a very strong piece of evidence. Later he manages to collect enough talking points that he convinces himself, but this isn't a rational process of detached evidence collection and analysis. Anyone can convince themselves of anything given enough time to work on it.

My views on epidemiology are mostly shaped by their approach in modelling papers, admittedly. I think the classical cholera style investigations are much more straightforward and potentially robust, except for the investigative approach making some assumptions that aren't necessarily true in this case:

1. That you are studying a natural process.

2. That your data is clean.

3. That cases transmit in a very localized way.

The third point may deserve some elaboration here. In the past, scientists have been quite comfortable with the idea of very long range transmission. In the 80s there were some attempts to investigate whether influenza could travel between continents on upper atmospheric air streams. The WHO investigation concluded SARS-1 travelled via air currents within and between buildings. Ferguson's department circa 2000 published a lot of papers on viral animal diseases like foot-and-mouth, in which they tried to model winds travelling over the landscape due to their belief that the virus was being blown between farms.

So if you're dealing with a virus like Ebola or one that transmits via water in a standalone well, plotting cases on a map will be helpful and accurate. If you're dealing with viruses that can travel long distances invisibly via very non-obvious routes, or hang in the air for long periods with only a small proportion of people who pass through it getting infected, it'd be easy to end up making plausible but incorrect inferences based on geographical proximity. I saw no attempt to tackle this possibility in any of the epidemiology papers I read. I had to learn about their willingness to entertain long range transmission from reading papers from the archives.


> So when he says it's really friggin likely that it's not natural, that's a very strong piece of evidence.

Not if Andersen himself didn't have any good evidence for that opinion, which he didn't that early in the pandemic. He's speculating because there's a virology lab in Wuhan that studies coronaviruses. You're oddly simultaneously placing a lot of faith in Andersen's ability to determine the source of the virus in January 2020, while also placing very little faith in his ability to form an unbiased scientific opinion. He didn't and couldn't know where it came from; we know a lot more now than he knew then.

> very long range transmission... investigate whether influenza could travel between continents on upper atmospheric air streams

This feels like a Hail Mary to support lab-leak. Odds of transmission of COVID drop drastically with distance. 2 meters outdoors is generally considered safe. Even if COVID could survive and travel unlimited long distances on air currents, the average amount of virus within a kilometer of an isolated infected person would be about 125 million times less than at 2 meters.
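The comment doesn't say how the 125 million figure was derived. A minimal sketch, assuming the virus disperses evenly through a volume so that concentration falls with the cube of distance (that scaling law is my assumption, chosen because it reproduces the stated number; `dilution_factor` is a hypothetical helper name):

```python
def dilution_factor(near_m: float, far_m: float, exponent: int = 3) -> float:
    """How many times more dilute the virus is at far_m than at near_m.

    Assumes concentration scales as 1 / distance**exponent. The default
    exponent of 3 (even dispersal through a volume) is an assumption,
    not something stated in the comment.
    """
    return (far_m / near_m) ** exponent


if __name__ == "__main__":
    # 2 meters versus 1 kilometer, as in the comment.
    print(f"{dilution_factor(2, 1000):,.0f}")  # 125,000,000
```

Under an inverse-square assumption (`exponent=2`) the same distances give a factor of 250,000, so the exact figure depends heavily on which dispersal model is assumed.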


I think there's a difference between shoe leather epidemiology and modeling based on biased data that has been handed to you by an authoritarian government. The first confirmed case is two to eight weeks after the index case, depending on which peer-reviewed phylogeny you believe.

Also, the Worobey paper has 90 p-values (!) and the statistical methods they used are completely confused. See the Stoyan and Chiu paper.


> I think Pekar's Lineage A vs. Lineage B molecular clock work is clearly virology by this definition

Have you seen this recent paper showing that Lineage B descended directly from Lineage A? https://academic.oup.com/ve/advance-article/doi/10.1093/ve/veae020/7619252?login=false


> The justification for their conspiracy is also given here, they feared a "shitstorm" that would help Trump and maybe start WW3.

Well, that's something they put in their chat.

You have to be careful when people describe their own motives. If your only concern is that a lab leak might reflect poorly on you, because you work in a lab, it will still sound better to say "we can't let this get out, because what if it helped Trump?"


Kevin Drum summarized the slack discussion you quote from here:

https://jabberwocking.com/i-read-the-entire-slack-archive-about-the-origin-of-sars-cov-2-there-is-no-evidence-of-improper-behavior/

Kevin provides a link to the full archive so that readers don’t have to take his word for it. In contrast, you don’t link directly to the archive and you don’t link to a blogger who in turn links to it.

I checked your quote of Andersen, and it is an accurate quote, appearing on page 3. However, two messages later (still on page 3), Eddie Holmes says that his initial reaction to the lab leak theory was “can’t be true.”

You claimed that the scientists (plural) thought the lab leak hypothesis was probably true, but the quote you used to support that claim is cherry picking because it doesn’t represent a sentiment shared by the other scientists. Furthermore, it’s cherry picked even as evidence of Andersen’s view. The next day Andersen discusses the probabilities again (page 5, 11:47 AM):

> Natural selection and accidental release are both plausible scenarios explaining the data - and a priori should be equally weighed as possible explanations. The presence of furin a posteriori moves me slightly more towards accidental release, but it’s well above my paygrade to call the shots on a final conclusion.

Here Andersen isn’t saying that the lab leak hypothesis is “probably true,” the position you ascribe to all four scientists. How do we square the two quotes from Andersen? One possibility is that “so friggin’ likely” means something along the lines of “likely enough that we cannot dismiss the possibility.” The other is that Andersen’s views changed overnight. Even if we accept the latter explanation, I still think your use of the Andersen quote is fundamentally misleading. “There was a 24 hour period when one of the scientists believed that the lab leak theory was probably true” is not what you claimed.

More generally, you accused the scientists of bad faith without evidence. Indeed, you now seem to partially walk that back, writing that they “talk themselves into believing the zoonosis theory over time even when they didn't start out that way.” The phrase “talk themselves into” implies that changes in their views weren’t driven by the evidence but you give no evidence of that. So their crime is that they argued about the evidence, updated their views as new evidence came in, and wrote a paper that reflected their beliefs at the time it was written.


I think it's obvious that they all believed it could easily be true, hence why they were in a chatroom called #project-wuhan_engineering and were busy telling each other how important it was to squash any speculation about lab leaks.

Yes, they are also in denial and trying to find any get-out that would let them absolve any responsibility for their profession. Their views aren't really changing overnight, what's happening is pure panic and a desperate search for any kind of story that makes things better for them. Again, we're not watching neutral people and I'm not quoting them as experts. I even put "prestigious experts" in quotes originally, because they aren't prestigious and their "expertise", such as it is, is hopelessly compromised by bias.

> you accused the scientists of bad faith without evidence.

Lol. Are you serious. We know they acted in bad faith because we can read their chat logs where they collude on fake papers that claim things the authors don't believe are true, where they work together to confuse and distract journalists, and where they try to cover up what they are doing. There are mountains of evidence of bad faith. It's beyond question that this ENTIRE PROFESSION acted in bad faith.


Yes, I’m serious. There's a syllogism that I've seen on the political right, which probably happens on the left as well:

1) Premise: The scientists behaved badly.

2) Premise: We have chat logs showing how the scientists behaved.

3) Conclusion: The logs contain evidence that the scientists behaved badly.

4) Secondary conclusion: The first premise is supported by evidence.

That is circular reasoning. You may believe that “we know they acted in bad faith because we can read their chat logs,” but to actually know that, you would have to actually read the logs.


I actually have read them which is why I was able to quote them, and in a just world they would be in prison.


At the risk of beating a dead horse:

Robert Garry (Feb 2, 14:16): If nCov was not engineered then RatG13 or a very closely related Bat virus somehow ended up in a situation in nature like the poultry farms for H5.... That's very scary and perhaps engineered would be better - at least that can be regulated so it doesn't happen again.

Kristian Andersen (Feb 2, 19:25): Bob said it well... I'd prefer this thing being a lab escape so we have less reason to believe other coronas might do this again in the future.


Sounds like Garry /also/ thinks it's unlikely to have happened in nature.


> The justification for their conspiracy is also given here, they feared a "shitstorm" that would help Trump and maybe start WW3.

The closest to this that I can find is a comment by Andrew Rambaut (page 5, 11:53):

“Given the shit show that would happen if anyone serious accused the Chinese of even accidental release, my feeling is we should say that given there is no evidence of a specifically engineered virus, we cannot possibly distinguish between natural evolution and escape so we are content with ascribing it to natural processes.”


Separate subpost for thread readability:

Re: seasonality. Read any papers by Ferguson's lab from 2020. They were the most influential epidemiologists at that time, and key drivers behind lockdowns. The concept of multiple waves isn't considered at all. Their modelling all assumes one giant wave that overwhelms the entire health system of any country, that passes when 100% of the population has been infected. Lockdowns are presented as a way to "crush the curve" and spread the load out over time. They don't successfully predict that COVID will go away by itself over the summer, which is what happened.

The data is obviously very noisy but this is easier to see if we select test positivity rate (so controlling for varying numbers of tests), and look at the UK which is smaller so has more consistent climate patterns and yet still has high quality public data:

https://ourworldindata.org/explorers/coronavirus-data-explorer?facet=none&country=~GBR&Metric=Share+of+positive+tests&Interval=7-day+rolling+average&Relative+to+Population=true&Color+by+test+positivity=false

We see clear evidence that positivity rates are low in the summer, and go up in winter, then low in summer again. Sometimes there's a smaller wave around August. This is how other respiratory viruses behave, so it's what we'd expect to see for COVID as well. This expectation was verified early on by the outbreak on the Diamond Princess cruise ship, which ended with most of the ship having not fallen sick.

So although you're looking at US data, I'm not sure how you can see the clearly multi-wave shaped structure with depressions in summer and not see some seasonality there. At the very least, we can agree that it's not one massive wave.

So why did epidemiologists not predict this? Having read many dozens of their papers, I think the answer is because their field lacks any firm foundation in well validated theory. It's not micro-biology. They don't know why respiratory viruses are seasonal, or why epidemics come in waves that end before everyone is infected. Nor do they know how to find out, because they are desk-bound spreadsheet jockeys and not really scientists who do experiments or lab work. Many of them don't even have any training in biology. Ferguson used to do physics. Others clearly have a stats background. So they just .... guess (make assumptions). And if they don't know how to guess something, they just ignore it. The results are reams of nearly identical papers that make predictions without validating them against reality.

> I'm not sure that anything you say really explains why COVID didn't double every 3 days in Brazil.

Epidemics don't grow exponentially, they have logistic growth patterns.

author

I don't find your data presentation that convincing for perfect seasonality - eg the spike in April (maybe it's cheating to take when it was first introduced? but I thought that was part of our discussion), then the dual (but separate) spikes in October and January, then a crash in the spring, a bit more in summer, etc. I think that matches a "pandemic which has some factors pushing it to be seasonal, but hasn't fully entrained itself to a seasonal rhythm" pattern.

Re: lockdowns - I can't really argue against your critique of that paper, because I'm stuck on a more fundamental question - why didn't scientists expect that, even if all of their assumptions were correct and the lockdown worked, they would have to keep it going forever or else the epidemic would restart as soon as they stopped?

Since this makes no sense, I'm forced to assume they were doing something moderately sensible like trying to spread things out to preserve hospital capacity, or delay until a vaccine was invented or something. But nobody specifically said this and I admit I'm trying to be charitable. I can't really analyze any of the other assumptions behind lockdowns (like that cases would come in one big bolus) until I understand their basic point.

(that having been said, I think it's true that when a novel virus reaches an unexposed population with no lockdowns or behavior changes, cases will start in one big bolus, and I don't think anything that's happened has disproved this. I think most of COVID happening in waves has as much to do with behavior change - voluntary or forced - as seasonality, although as my Diseasonality post says, I don't think those are necessarily uncorrelated).

I'm not sure what you mean by exponential vs. logistic - growth is highly path-dependent at the beginning, and logistic at the end, but in the middle there's a period where you expect it to approximate exponential, at least moreso than "only twenty people in Brazil get the disease and then it goes away for no reason".


That's OK, I accept the seasonality isn't perfect. The huge spike at the start in April in the UK data is an artifact of tests being highly limited at that time, so they were only being used to confirm already known cases. As tests became more available random testing became prevalent and positivity rate starts to mean something. So we're seeing a vastly magnified tail end of the winter 2020 season there.

> nobody specifically said [they were trying to spread things out] and I admit I'm trying to be charitable

They did say that specifically, over and over. Very odd that you don't recall this, perhaps the discussion in the USA looked different. Here's an entire Wikipedia article all about the 2020 strategy:

https://en.wikipedia.org/wiki/Flattening_the_curve

"The origins of the expression date back to 2007, though during the COVID pandemic the expression became a repeated "sound bite" used by numerous medical and non-medical individuals in the media"

Here's an example:

https://twitter.com/itvnews/status/1238160224181268480?lang=en

Boris Johnson: "This will help us delay and flatten the peak, squash that sombrero."

Here's an article from April 1st in the NEJM saying that the goal should not be to flatten the curve but to crush the curve:

https://www.nejm.org/doi/full/10.1056/NEJMe2007263

So the whole spreading-it-out strategy was very widely discussed.

> I'm not sure what you mean by exponential vs. logistic

Epidemiologists claimed COVID would grow exponentially until everyone had been infected. In reality epidemics always have a roughly bell-curve shape (often skewed rightwards). This isn't what the word exponential means. An exponential curve would go upwards continuously. In other words, assuming for simplicity a population with a power of two size, the last day of an epidemic would see half the population all be infected simultaneously and then the next day the epidemic would vanish, having nobody left to infect. Obviously that isn't something that's ever happened. Instead you see growth start slowly, then speed up, then slow down, until it vanishes into a long tail of background noise (until the next wave).
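The contrast drawn above can be sketched numerically. This is a toy illustration, not a fitted epidemic model: the population size, the one-day doubling, and the discrete logistic update rule are all assumptions made for the example, and both function names are hypothetical.

```python
def doubling_daily_new(population: int) -> list[int]:
    """Daily new infections if cumulative cases double every day
    until the whole population has been infected."""
    cumulative, daily = 1, [1]
    while cumulative < population:
        # Each day infects as many people as are already infected,
        # capped by whoever is still uninfected.
        new = min(cumulative, population - cumulative)
        daily.append(new)
        cumulative += new
    return daily


def logistic_daily_new(population: int, days: int) -> list[float]:
    """Daily new infections under a discrete logistic rule: growth
    slows in proportion to the share of people already infected."""
    cumulative, daily = 1.0, []
    for _ in range(days):
        new = cumulative * (1 - cumulative / population)
        daily.append(new)
        cumulative += new
    return daily


if __name__ == "__main__":
    population = 2 ** 10  # power-of-two population, as in the comment
    pure = doubling_daily_new(population)
    logi = logistic_daily_new(population, days=30)
    print("pure doubling, final day:", pure[-1], "of", population)
    print("logistic, peak day:", logi.index(max(logi)), "of", len(logi))
```

In the pure doubling case the final day accounts for half the population, which is the absurd endpoint the comment describes. In the logistic case daily new infections rise, peak partway through, and fade toward zero.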


I don't see how your current claim about curve-flattening relates to your original claim that lockdowns don't work.

https://en.wikipedia.org/wiki/COVID-19_lockdown_in_the_United_Kingdom#/media/File:UK-lockdown+lifting.png


That was in response to Scott saying he didn't understand the rationale for lockdowns to begin with and that it was pointless to discuss lockdowns further until that was clear.

That they don't work (along with masks, contact tracing etc) is obvious and widely discussed, there's no need to repeat that whole discussion here when it's been done to death elsewhere and the conclusions are certain.

Apr 9·edited Apr 9

"Flatten the curve" and "hammer and dance" were *extremely* common in the US discussion too; I suspect Scott has just blocked out that time period after everything that followed it (and the parenting thing can be a bit of a brain-drain too).

Edit: I was curious, so here's Scott on flatten the curve:

https://slatestarcodex.com/2020/03/19/coronalinks-3-19-20/

https://www.astralcodexten.com/p/lockdown-effectiveness-much-more


I remember hearing "flatten the curve" over and over, across many forms of media, during the first months of the pandemic. Today is the first I have ever heard of "hammer and dance," either the Medium piece itself or the phrase. Apparently the latter was viral? Must have missed me (wouldn't be the first time something viral missed me). Anyone else remember "hammer and dance" being a common phrase in the US?


> But nobody specifically said this and I admit I'm trying to be charitable.

I'm confused about what you're saying here. Justifications for lockdowns shifted a bit over time, and I don't think there was ever unanimity among the establishment, but there were obviously lots of people saying we should "flatten the curve", even before the first lockdowns began.


Could the seasonality including April and August be related to allergens? I am going through a non-COVID, initially viral, multi-month bout of illness that has been greatly exacerbated by pollen. Could increased coughing and such provide an additional day of contagiousness? Or perhaps the assumption that the cause of symptoms is non-biological influences behavior in ways more likely to aid transmission (e.g. go play racquetball indoors rather than....)?


Logistic growth, 1/(1 + e^(-x)), is very close to exponential growth when not near the inflection point. Until you get about halfway to the inflection point (about 5.5 days before it, if the doubling time is about 3.5 days), cases double very nearly every 3.5 days.

Whether you use an exponential function or a logistic function, you still end up with hundreds of thousands of COVID cases in Brazil.
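A minimal sketch of the claim above, assuming the standard logistic curve 1/(1 + e^(-rt)) with time t measured in days from the inflection point and r chosen so that early growth doubles every 3.5 days. The function names are hypothetical; the local doubling time follows from the logistic's per-capita growth rate r(1 - L).

```python
import math


def logistic_fraction(t_days: float, doubling_time: float) -> float:
    """Fraction of the final epidemic size reached at t_days,
    with t = 0 at the inflection point (negative t is earlier)."""
    r = math.log(2) / doubling_time
    return 1.0 / (1.0 + math.exp(-r * t_days))


def local_doubling_time(t_days: float, doubling_time: float) -> float:
    """Instantaneous doubling time of the logistic curve at t_days.

    The per-capita growth rate of a logistic curve is r * (1 - L),
    so the local doubling time is the early doubling time / (1 - L).
    """
    return doubling_time / (1.0 - logistic_fraction(t_days, doubling_time))


if __name__ == "__main__":
    for t in (-30.0, -20.0, -10.0, -5.5, 0.0):
        frac = logistic_fraction(t, 3.5)
        local = local_doubling_time(t, 3.5)
        print(f"t={t:6.1f} d  fraction={frac:.4f}  doubling={local:.2f} d")
```

About 5.5 days before the inflection point the curve sits at roughly a quarter of its final size, which is half of the 50% level reached at the inflection point. Earlier than that the local doubling time stays close to 3.5 days, and it lengthens quickly afterwards.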


One possible element of the PRC government response that I don't recall seeing discussed in the comments is that there was speculation that the PLA was conducting bio-warfare research at WIV. (Please note, this is not saying that Covid-19 was specifically a bio-weapon or even necessarily related to any military research). If that's the case, and I seem to recall several intelligence agencies had at least some confidence it was, then it creates a strong incentive for the PRC to shut down anyone looking in depth at the WIV, and certainly would bar complete transparency to outside researchers. You can't just fling open the doors and records in a spirit of open source cooperation when there's classified military stuff in the facility.

I have no idea, personally, if it's true, but it's been mentioned, and if true, it would create some very confusing incentive patterns where the WIV records would get completely stonewalled and downplayed, while simultaneously the Wuhan authorities (who never want to look bad to the upper layers of PRC government) would be downplaying the wet market stuff. So everybody is obfuscating on slightly different terms, but there's a very strong blackout on WIV questions.

I only mention it because this is pretty close to how I model pretty much every data source out of Wuhan - everyone lying but with slightly different focus, and an absolute bar on WIV transparency due to classified military stuff.


Prestige is the mechanism used to train a large language model: reward it for saying what sounds right, rather than acting in correspondence with an accurate predictive model of reality. Should it be a surprise that LLM-like mechanisms produce LLM-like results, ie “sounding right”?


The PRC covered up the wet market outbreak, or at least the local branch at Wuhan did. However, they released information about an outbreak of pneumonia on Dec 31 which the WHO read (previously this was reported by the WHO as China informing them) and by 9 Jan China declared it a novel coronavirus. That’s actually pretty rapid - SARS1 took a few months to classify.

As I recall Fox was calling it fake until March or so.


You watch a lot of Fox, do you? Here they are in Jan 2020 reporting on it as very much not fake:

https://www.foxnews.com/health/how-did-the-coronavirus-outbreak-start

By PRC coverup I meant the refusal to let anyone external investigate the lab, anyway.


Regarding epidemiologists, I don't think it's very controversial to point out that simply drawing straight lines (on a logarithmic plot), or making extrapolations like "in ddmm there were x confirmed daily cases in Italy, two weeks later there were y daily confirmed cases. In some country there are now x confirmed daily cases, so in two weeks we should expect y daily confirmed cases" would have given you a better prediction score than the epidemiologists working at state institutions. If not all state institutions then at least some of them, and even some of them performing worse than this in their predictions looks really, really bad for the epidemiologists as a group. Let's not get bogged down in who said what and when and to what exact degree that's bullshit, because that's kinda peripheral to the argument I'm making in their defense.
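For concreteness, "drawing a straight line on a logarithmic plot" can be sketched as an ordinary least-squares fit to the logarithm of daily case counts, followed by extrapolation. The case counts below are made-up illustrative numbers, not real data, and the function names are hypothetical.

```python
import math


def fit_log_linear(days: list[float], cases: list[float]) -> tuple[float, float]:
    """Least-squares fit of ln(cases) = intercept + slope * day."""
    n = len(days)
    logs = [math.log(c) for c in cases]
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs)) / sum(
        (x - mean_x) ** 2 for x in days
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def extrapolate(days: list[float], cases: list[float], target_day: float) -> float:
    """Project the fitted straight line (in log space) out to target_day."""
    slope, intercept = fit_log_linear(days, cases)
    return math.exp(intercept + slope * target_day)


if __name__ == "__main__":
    # Hypothetical daily confirmed cases, for illustration only.
    days = [0, 1, 2, 3]
    cases = [100, 200, 400, 800]
    slope, _ = fit_log_linear(days, cases)
    print("implied doubling time (days):", math.log(2) / slope)
    print("projected cases on day 5:", round(extrapolate(days, cases, 5)))
```

A projection like this carries no information about when growth will slow, so it only serves as the naive baseline the comment describes.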

After all, there are various factors at play that ought to lead to a more charitable view:

1. Political incentives: Just like epidemiologists have incentives on question of lab leak or not, they have incentives from their superiors (government/ultimately taxpayers)! For example, let's for the sake of the argument say that the prudent in the long-term most cost-effective response to any novel disease being detected is to do like Madagascar in flash game Pandemic II, and shut down everything. Well, some of those are false alarms, indeed most of them are. If epidemiologists always recommended the most prudent actual-fact-based solutions, they'd quickly be out of their jobs and replaced by politically more savvy ones: there's always the angle of political acceptability, so at best their public recommendations are as fact-based as is politically acceptable, even if they in fact knew better.

2. Institutional constraints: Is it insane that Americans cannot use medicines that have been found safe and effective in the EU, or vice versa? Well yes it is. But that's how institutions roll. It's not enough to rationally be able to know that a treatment is safe and effective, institutions are required to get to the conclusion their own way. Sometimes red tape is there for a reason, sometimes not, sometimes you really just ought to cut through the damn tape in any case, but if the institution is set up such that they aren't e.g. allowed to run models created in other countries, or rely on "rational evidence" (like drawing straight lines in a log-linear plot) as opposed to "scientific evidence" (present claims made in peer-reviewed scientific literature... when such papers don't as of yet exist), then what can you do, even if you did know better?

3. Lies-to-children: If you ask the general public about what they've heard evolutionary biologists tell them about theory of evolution, they're not going to talk about deriving and proving theorems and testing them in the field, or antagonistic pleiotropy, or introgression, or ongoing debates about the role of chance in evolution, or whatever. Chances are the general population heard a version for 5-year-olds, which is necessarily not strictly correct, and understood it like it was Pokémon which is more incorrect. If epidemiologists are brought out to speak to the whole nation, would you expect anything more than damnable lies to children?

4. Domain expertise doesn't imply expertise in epistemology: Here I'm joining Yudkowsky in defense of a hill I'm prepared to die on, and claim that I am 99% confident that, conditional on our current understanding of quantum mechanics being broadly correct, the Many-Worlds Interpretation (or, "the theory the ontology of which is Hilbert space, and that evolves according to the Schrödinger equation", as some physicists are proponents of that theory while claiming not to be in the MWI camp) is correct. A large fraction of physicists disagree: it's normal for there to be scientific controversy, but in a situation such as this, how am I hubristic enough to claim I know better? Well, I claim that physicists are really good at the things that they do: making calculations, building experiments, analyzing the data from those experiments, checking if the data supports or does not support a given theory, etc, etc, generally following the scientific method as taught in schools (you know, the method that famously advances one funeral at a time, but at the same time is also so preposterously successful). Physicists have a habit of reaching correct conclusions, but they do it in large part because of the institution of science, not because your average physicist has any deep insight into epistemology. The status of MWI is a special case where as of now experiments haven't ruled out all rival theories and there are historical contingencies etc, etc, but it's a case where the scientific method falls short, and a large fraction of physicists are disastrously wrong on an issue that ought to have become more or less part of normal science by now (in the Kuhnian sense).

Likewise, you could postulate that epidemiologists are good at what they do in their day job, but when presented with something truly novel (like a novel coronavirus), then their usual tools aren't cut out for the job, just like tools of physicists aren't cut out to evaluate the merits of MWI.

-

At the end of the day I didn't actually present a single case of epidemiologists employed in any government institutions getting anything particularly right to credibly demonstrate their expertise and domain knowledge, even in terms of stuff that they do "at their day job". I don't even know what those things are. But I've noticed a pattern of certain types of laypeople making generic arguments against the competency of individual persons at institutions (where points 1&2 apply) or science (3&4), in instances where I DO believe I can evaluate the claims on their own merits, and it's consistently something along these lines. Perhaps epidemiologists are among candidate gene researchers or parapsychologists whose entire fields I believe to be bunk upon examining them closer, but I would reserve harsher judgement until I've looked at the issue more closely. Or some kind of middle ground: perhaps the field as a matter of fact is completely hopeless at providing good truthful advice during pandemics, but the basic research they do allows people with skills at epistemic rationality, like Scott, to come to correct-ish and actionable-ish conclusions (just like physicists have already produced everything it takes to elevate MWI to very strongly preferred status, but it takes philosophers of physics to tell them why that is).


Responding purely to your point 1:

> Let's for the sake of the argument say that the prudent in the long-term most cost-effective response to any novel disease being detected is to do like Madagascar in flash game Pandemic II, and shut down everything. Well, some of those are false alarms, indeed most of them are. If epidemiologists always recommended the most prudent actual-fact-based solutions, they'd quickly be out of their jobs and replaced by politically more savvy ones

If you asked a general, "What's the best way to invade Iraq to prevent them from using WMDs?" should the general's answer factor in the possibility that there aren't any WMDs in Iraq at all? That's not the question you asked of him, is it? I suppose you could say not accounting for it shows a lack of political savviness, but I would say it's just the reality of advising politicians on anything. Since the final answer will involve politics one way or another, the best thing an advisor can do is stay in their lane and answer questions from within the scope of their expertise. It's silly to fault epidemiologists for not being experts on propaganda-driven media systems.


I agree that experts being consulted, whether they are generals or epidemiologists, ought to answer questions from within their scope of expertise as you say.

Let's try to put it another way. Of course politicians don't necessarily appoint complete brown-nosed yes-men and their institutional appointments might be broadly competent, they might give solid advice, and it's actually the politicians who at the end of the day make the bad decisions (or, the electorate who want tax cuts/blood/bread and circuses now and no buts). But surely there's some sort of selection process that at least weeds out the folks who'd start stirring up trouble "because their job is to safeguard public health and the current policy isn't helping it" in favor of those who meekly decide "I've given my advice to the best of my ability, it's now out of my hands". How strong the effect is would depend on political culture, etc, but I don't think it's surprising we end up with Faucis rather than firebrands, even if it turned out that the firebrand position was the technically more correct one.

I think Scott had a similar defense of Fauci where he, as usual, presents the argument vastly better than I ever could.


>2. PRC immediately blocked all investigations.

I've never found this argument particularly convincing. PRC is an authoritarian state; they might have blocked all investigation for any number of reasons, including even the slightest suspicion that it might have *potentially* been a lab leak, fear (justified or not) that enemy powers would foment some sort of a false accusation against them, or simply having a SOP of clamping down on all information until the best possible narrative could be crafted. *By itself* such a blockade is not proof of anything; it could happen if it was a case of zoonosis and it could have happened if it was a case of a lab leak. Same applies largely to 3. They are barely circumstantial evidence, if even that.

Expand full comment

Especially since they also tried to cover up SARS1, which was very definitely zoonotic.

Expand full comment

This is really great Scott. Thank you.

Expand full comment

To me it feels like Bayesianism took the hardest hit in this debate. I have long suspected that in the vast majority of cases when someone talks about "priors" and "updating" they're basically doing this https://www.smbc-comics.com/comic/bayesian, and in the debate we can see four people (Saar, Peter, judges) doing Bayesian analysis ("The Math And The Aftermath" in the original section) and getting wildly different results from plugging wildly different numbers into the same formula. How can you properly do Bayesian analysis if you can't even agree on the priors and updates, which then become all sorts of ways to smuggle your biases into the result? So it doesn't surprise me that "[the judges] both thought of probabilistic analyses as an afterthought"; it looks like an obviously correct decision and (at least to me) somewhat demonstrates that no one generally does proper Bayesian stats with math and numbers for fuzzy, politically charged real-world problems. (I guess it's still useful for estimating the distribution of black and white balls in the urn or the probability that a given email message should go into the spam folder.)

Expand full comment

Here's how we responded to this objection in our post (https://blog.rootclaim.com/covid-origins-debate-response-to-scott-alexander/)

Using this as evidence of the weakness in probabilistic inference is a bit funny. We have 6 estimates that span a very wide range, so obviously this concept doesn’t work. It’s not important that 5 are by people who have never done a full probabilistic inference analysis in their life, and one is by a team doing it for a decade.

Expand full comment

I'll quote the summary of the process.

>a) Seek and honestly evaluate best explanations under the assumption the hypothesis is true, b) Estimate the likelihood that there is some better explanation that is yet to be found – the more complex the issue is, the higher the likelihood, and c) Estimate the likelihood of mistakes in the estimates themselves.

B and C here are both just "change the numbers around until they make you more comfortable." What makes you think there's a better explanation? Obviously it would be not being comfortable with the one you ended up with. So just lower the numbers to reflect that. If that doesn't make you comfortable, then obviously the estimates aren't trustworthy, and we need to factor that into our equation too.

It's just vibes with numbers attached.

Expand full comment

Indeed, in many cases, you have to guesstimate. But you're far better off if you're doing it using a methodology that is specifically built to avoid the specific pitfalls humans have when working with probabilities.

The idea is to be less wrong, not perfect.

Expand full comment

Are you familiar with the research on human performance? Merely doing something for 10 years doesn't mean you're good at it. Lots of financial engineers and hedge fundies have decades of experience but fail to consistently beat an index fund. Many people play a sport or instrument casually for years without getting any better.

In particular, one of the things you need in order to improve is practice *with quick feedback*. Feedback here means a clear *external* signal of how well you did. I know you've often claimed that a lot of your analyses were supported by later evidence, but this is easily susceptible to confirmation bias. Are there any examples where a wide range of 3rd parties agreed that the later evidence supported your initial conclusion? And how quickly does this feedback happen?

Expand full comment

To clarify - what I wrote above is not "Rootclaim works because we've been doing it for years", but "It's silly to say something doesn't work because people who've never done it reached a different conclusion".

Expand full comment

I mean, ok, that's technically true, but it's also hardly a defense of your method. Without some evidence that all your experience is really valuable, I don't see any reason to favor "you got different results because Rootclaim's method works, it just takes a lot of practice" as an explanation for the different outcomes.

Expand full comment

I agree. We're definitely struggling with making Rootclaim convincing and accessible. The only way to be convinced takes many hours of studying the methodology. We're open to suggestions.

Expand full comment

So far it seems like you mostly focus on things that have already happened, but where some aspect is in question. Could you apply the method to future events, à la Tetlock's superforecasting, so that it's very clear to everyone whether you were right or wrong?

Expand full comment
author

As I've said before, I think of Bayesianism as like physics. You know it's true in some sense. But basketball players won't benefit (and will be harmed) by trying to use physics calculations directly to calculate each particular shot. There are some things that nobody has intuitions about yet where maybe using physics is your best bet.

Expand full comment

Yeah, this is how I feel about evolutionary psychology. It *has to* be true. (The second-most plausible explanation for the human brain is “Intelligent Design” and there is no third-most plausible explanation.) And yet, any time anyone ever tries to use the principles of evopsych to reason about something, they somehow end up with a conclusion that validates all of their prior biases.

Humans are just really really bad at correctly applying theories with lots of degrees of freedom. (I have a 73.4% prior that this is because it didn’t help our ancestors in the savannahs acquire mates!)

Expand full comment

> there is no third-most plausible explanation

Oooh, a low-hanging bronze medal! I'll posit that the human brain is the result of a collective hallucination and be taking that third-most plausible medal, thank you very much.

Expand full comment

A collective hallucination of what? Surely a mind, right?

Maybe you do believe in intelligent design of some sort ;)

Expand full comment

Trying to snatch away my bronze participation trophy? Well this certainly cannot be allowed to stand, looks like I must elaborate on my explanation! ;)

The human skull is actually perfectly hollow. At some very early point in human history, a Greek dude who was in a morgue was suffering from ergot poisoning (i.e. tripping balls) and hallucinated a giant ball inside the skull of one of the corpses. He was a pretty respected man in his local community - not the sort of outwardly famous guy you'd expect to get namechecked by Pliny or anyone, but he wielded a fair amount of local soft power, so people sort of agreed to Emperor's New Clothes regarding this freak hallucination from an otherwise very upstanding and respected individual. Now, over time, this became a sort of primed suggestion - respected individuals saw a brain, so you should probably claim to see one as well, to signal that you're a respected individual as well. And eventually we didn't even need the signaling motivation - it was simply due to cultural priming that Michelangelo saw a particular pattern inside a cadaver which he then replicated on the throat of God on the Sistine Chapel.

Now, in the data driven scientific age, we've come up with various scientific models and theories that happen to replicate quite nicely if - and only if - we agree upon the legal fiction of a human brain. The so-called "human brain" is, to analogize, the absence of light that proves the black holes of contemporary research.

For most, that isn't a problem, but there are those whose need for a more base and material "belief" hampers their ability to focus on the downstream research. Fortunately, there are plenty of organizations who can create replicas of the hallucinated "brain body" in order to aid these temporal-focused individuals in forming the legal fiction from whence much new and valid research stems.

P-values for all those associated priors are very high and P-values for alternative explanations are extremely low, so obviously the math can hold up well enough for third-most-likely.

Expand full comment

> Humans are just really really bad at correctly applying theories with lots of degrees of freedom. (I have a 73.4% prior that this is because it didn’t help our ancestors in the savannahs acquire mates!)

+1

Expand full comment

See also: utilitarianism

Expand full comment
Apr 10·edited Apr 10

Utilitarianism is wrong (as are its main competitors), but openly saying this highly correlates with being a selfish sociopath (case in point), therefore the discourse is distorted in predictable ways.

Expand full comment

Why do you think this correlates with being a selfish sociopath? I would bet on the complete opposite.

Expand full comment

Well, because it's wrong in such a way that you shouldn't actually care about total strangers as much as about yourself and those most important to you.

Expand full comment

Caring about total strangers seems like the exact opposite of selfishness.

You can think it is wrong; I think you are wrong about this, but it is a difficult debate. But its being wrong wouldn't make the people who think it is right selfish sociopaths. That makes no sense to me.

I think the correlation you are speaking about just doesn't exist; you just don't like utilitarianism (in fact, I think the correlation is in the complete opposite direction).

Expand full comment

As a non-cognitivist, I think "Utilitarianism is wrong" is a type error.

Ethical frameworks like Utilitarianism are not descriptive, but prescriptive. You can either claim that a framework is not even self-consistent, or you can claim that following that framework would lead to some outcomes which you consider evil from some other framework or gut feeling.

Expand full comment
Apr 11·edited Apr 11

Exactly.

Another important thing to consider, beyond self-consistency, is whether the ethical framework correctly describes our own ethics, or our own ethics at reflective equilibrium.

Expand full comment

The math is perfect and absolutely true but the priors are often just BS. It’s also basically impossible to account for all the possible paths through the graph and sum over them, so you end up just multiplying them as if they were independent to derive probabilities that are like 10^100 against, which are laughably false (if it was that unlikely how did it happen? Are we really in the only universe out of trillions where a coronavirus pandemic happened? If so, how did we almost end up with two of them?). Basically reality is full of impossible coincidences but it’s only by summing over all the possibilities that you get reasonable probabilities.

Think of the configuration of atoms in the room you’re sitting in: it’s so unlikely to end up in that exact configuration that it might as well be zero. But there are an astronomically large number of configurations that are basically equivalent for all practical purposes so you have to sum over all of them to get reasonable probabilities like 1/2.
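
A minimal sketch of the summing-over-configurations point, using fair coin flips as a toy stand-in for atoms in a room. The choice of 100 flips and the "40 to 60 heads" band are assumptions picked purely for illustration:

```python
from math import comb

N = 100  # number of fair coin flips (hypothetical toy system)

# Probability of one exact sequence of N flips, i.e. a single "configuration".
p_exact = 0.5 ** N

# Probability of the coarse outcome "between 40 and 60 heads", found by
# summing over every exact sequence that is practically equivalent.
p_coarse = sum(comb(N, k) for k in range(40, 61)) * p_exact

print(f"one exact sequence: {p_exact:.3e}")
print(f"40 to 60 heads:     {p_coarse:.3f}")
```

Any single sequence is astronomically unlikely, yet the coarse outcome is close to certain once all the equivalent sequences are added together.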

Expand full comment

Great analogy on atoms in room.

Expand full comment
Apr 9·edited Apr 9

> As I've said before, I think of Bayesianism as like physics. You know it's true in some sense.

But wouldn't that be the sense in which outcomes either happen or don't, so that all probabilities are either zero or one? Bayesian reasoning isn't a fact about the operation of the universe, it's a way for you to draw conclusions from things you already know. That doesn't help you if you're trying to start from something you don't already know.

Bayesian math can tell you, correctly, that "if this is what you believed before, then _this_ is what you should believe afterwards". But it can't tell you what you should have believed before; I don't follow the analogy to physics being "true in some sense".

(Tangentially, note that the reason it's not helpful for basketball players to do physics calculations is that they aren't capable of performing the actions those calculations say they should take. We have basketball players and we have physics dweebs, and the physics dweebs are much more accurate than the basketball players are, unless you make them take the shots themselves. An ICBM is the same thing as a basketball for these purposes.)

Expand full comment

>Bayesian reasoning isn't a fact about the operation of the universe, it's a way for you to draw conclusions from things you already know.

Sounds like a description equally applicable to physics. We don't know a single fact about how the universe "really" operates, we just have a ton of observations and extrapolate from there best we can.

Expand full comment

Let me try rewording.

Our physics models describe e.g. the motion of objects through space. The models are fuzzy around the edges - we have some trouble clearly demarcating what is and isn't an object, what the nature of space is, and whether the model is accurate at all at certain scales. See MOND.

But we have agreement anyway that space exists and objects exist, and in most cases on what is and isn't an object, and that there are rules governing how objects can move through space, even if we aren't sure of the details of those rules.

Bayesian reasoning doesn't follow this model. There's nothing in the universe to be bound by its laws. In this case, we _do_ have perfect knowledge of the laws; there's nothing fuzzy there. But they do not apply to things in the universe. They are computational artifacts. When you describe a basketball following a parabolic arc under the influence of gravity, the basketball really exists, and so does the gravitational force field. Nothing analogous is true if we replace "physics" with "Bayesian reasoning".

Expand full comment

Would you say the same about math in general? 2+2=4 doesn't at first glance seem to bound "space" or moving "objects" in any way, and yet there would be no physics without it. Bayesian reasoning is part of information theory, which is part of math, which is at the heart of all science. What precisely is the relationship between science and math is a pretty thorny metaphysical question ("the unreasonable effectiveness of math" etc. etc.), but that they are conceptual relatives is pretty straightforward.

Expand full comment
Apr 10·edited Apr 10

It's not difficult to cast 2+2=4 in terms of the behavior of objects. It's not necessary to do so, but you can, and generally as soon as you're trying to apply the knowledge, you do.

If one of my herds of sheep, after paying the shepherds' fees, has produced two dozen lambs, and the other one has likewise produced two dozen lambs, I can apply this identity to know that all together I've gained four dozen lambs. The lambs exist; if they didn't, I would have no need to count them. This is generally how things go when you're applying math to a problem. There is an underlying reality, you show that it meets certain mathematical requirements, and then you get the mathematical results that follow from those requirements for free.

Bayesian reasoning is not like this. There is no underlying reality to take account of. All of the inputs to your problem are computational artifacts generated by you in the course of doing something else. As you set up your Bayesian problem, an external observer _cannot_ determine what numbers you should supply, because those numbers do not correspond to anything with objective existence.

This is why I'm questioning Scott's claim that Bayesian reasoning should be understood in terms of the idea that "you know it's true in some sense". I don't believe that's correct. This is an analogy that is being misapplied. It's not that there are phenomena out there that obey rules (1) of which we have imperfect knowledge, but (2) that are approximated by Bayes' Theorem. We don't have imperfect knowledge of Bayes' Theorem and it is not an approximation to some phenomenon that is subject to further unknown details. It just doesn't apply to objective phenomena. Whether and how Bayes' Theorem applies to a problem that you wish to address is determined solely by how you subjectively think about that problem. Someone else with different subjective feelings will apply it differently, getting different results, and they will be no more or less correct than you are. This is not how things would work if there were an underlying reality involved.

> ("the unreasonable effectiveness of math" etc. etc.)

Tangent: I find the idea of "the unreasonable effectiveness of mathematics" to be incredibly fatuous. The only way to understand it is as a complaint by people who have not even the slightest inkling what math is. It should not come as a surprise that describing how something works is useful when you're investigating how it works. A description of how something works is math.

Expand full comment
Apr 10·edited Apr 10

Bayesianism doesn't describe any part of our universe, but it still describes something: it describes some rules that the credences of a perfectly rational agent would follow (but perfectly rational agents don't exist).

Expand full comment

Well a perfectly rational agent who also had perfect knowledge of all possibly relevant hypotheses and their logical relations but was not actually omniscient beyond that.

Expand full comment

It is true that most uncertainty exists in the map, not in the territory. Bayesian rules are rules about combining uncertainties, so they operate on the map, i.e. the beliefs in your brain.

But you are wrong to state that basketballs really exist. To the best of our knowledge, what exists is some kind of wave function describing the (mostly absent) entanglement of all the particles which make up the observable universe.

A basketball is not part of the fundamental description of reality, it is an abstraction you have in your mind. It is certainly useful in some edge cases (e.g. when it is traveling through the atmosphere at less than Mach 2 and you don't care about describing the faint rubber smell it emits), but it is all on the map, not the territory. Neither is your model of a point mass in vacuum under a constant gravitational field (which yields the parabola) "perfect", it is just a good enough description for some cases (e.g. atmospheric friction is low, you don't care about Brownian motion, it is not rotating at relativistic speeds and the maximum height of the throw is small compared to the diameter of the Earth.)

Expand full comment

I think basketballs exist in the sense that the concept associated with the word refers to something in reality; it doesn't have to refer to something fundamental, nor be a perfect description of it.

Expand full comment

I think the situation is more akin to engineering, where explicit calculations work well, than to playing a basketball match.

In particular, we have all the time we want between steps, and don't have to do everything in our heads.

Expand full comment
Apr 11·edited Apr 11

Explicit calculations do not in fact work in most of engineering. You would never be able to get anything done because you would be too busy doing math. Most of engineering is rules of thumb lumped on top of other rules of thumb, followed by lots of (eventual) verification (with help from computers). E.g., consider trying to make a multivariable equation for an entire bridge rather than deciding 90% of the parameters ahead of time (based on rules of thumb).

Expand full comment

Then you shouldn’t call it “Bayesianism”, because anyone looking that stuff up will just end up looking through Wikipedia pages about math and equations, and assume that’s exactly what you mean. Just call it “probabilistic reasoning” or something.

Expand full comment

Pedantic addendum: As long as that anyone sticks to the math and equations there is no problem; it only gets bad if they get to the more philosophical articles like "Bayesianism" itself.

That is because "Bayesianism" already has a bifurcated meaning, it can be a perfectly cromulent approach to statistical modeling of restricted domains (the stuff e.g. Andrew Gelman is trying to sell to the wider scientific community) or a comprehensive approach to epistemology (the crack Jaynes was smoking). The math is the same, the practical application isn't. Of course in the context of the Yudkowskyan tradition the second meaning is the relevant one, so you are basically right after all.

Expand full comment

You've done a great service sponsoring this debate, so thanks. Before I read it, I rated the evidence for Z vs LL as about 50/50. To me the takeaways are (1) zoonotic origins are much more likely and (2) Bayesian reasoning is of limited value in these kinds of problems.

As one of the judges commented, Bayesian reasoning works well when the relevant evidence can be confined to a manageable number of quantifiable data points. Saar's discussion of DNA evidence in criminal cases illustrates this sort of case. But both of the main Covid origin stories propose extremely complex causal chains, and in both stories there are many, many events whose likelihood is almost impossible to quantify. A better way to do this sort of analysis is to look at the entire causal chain proposed by each theory and look holistically at the likelihood of that causal chain. What I notice is that the zoonotic causal chain proposed by Peter resembles other zoonotic crossovers we have already seen in the real world. It does not require any special assumptions, and it fits with the (admittedly limited) hard evidence. The LL hypothesis, by contrast, requires us to accept a causal chain which has not left enough evidence of its existence: e.g., no evidence that a Covid precursor virus was known to WIV, and no evidence that the type of gain of function research required to produce Covid from an unknown precursor was conducted at WIV. In its place, the LL proponents assume a cover-up that deleted all the evidence of a LL, i.e. they assume that which is still to be proven.

Expand full comment
Apr 10·edited Apr 10

I think this is going too far. I think it can be useful to think about Bayes factors as a way of noticing when some piece of evidence is particularly strong in comparison to others, in particular when there is some objectivish way of estimating what it is. I think the discussion of the bayes factor of "it started in Wuhan" vs. "it started in a wet market" was somewhat useful for example.

I also think of the Amanda Knox case discussed years ago on Less Wrong, hinging on the fact that the weight of "the base rate is super low given Guede, a stranger, is definitely involved" is >>> than that of "her behavior post-murder was arguably a little sus".

It's adding up all of the pieces to get an overall estimate that doesn't seem to work as a methodology in complex cases. I'd suggest 3 reasons:

1. It's really easy to completely miss evidence when doing the calculation especially for the side you don't prefer.

2. It is hard to fully account for the non-independence of each piece of evidence. Often there could be some explanation you didn't consider that explains multiple things.

3. Most of the factors will have highly subjective estimates. It is very hard to estimate something like "chance the virus would be a modification of a known viral sequence if it were from GOF research".
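
Point 2 can be sketched with a toy model. Everything here is hypothetical: two observations E1 and E2 are assumed to appear exactly when a single hidden common cause C is present, and the probabilities of C under each hypothesis are made-up numbers chosen only to show the effect:

```python
# P(C | hypothesis) for a hidden common cause C. Hypothetical values.
p_cause = {"H1": 0.5, "H2": 0.05}

def naive_lr():
    """Likelihood ratio if E1 and E2 are wrongly treated as independent."""
    per_observation = p_cause["H1"] / p_cause["H2"]
    return per_observation * per_observation

def joint_lr():
    """Likelihood ratio using the true joint probability.

    E1 and E2 both occur exactly when C occurs, so
    P(E1, E2 | H) = P(C | H) and the second observation adds nothing.
    """
    return p_cause["H1"] / p_cause["H2"]

print(f"naive (multiplied) ratio: {naive_lr():.1f}")
print(f"joint ratio:              {joint_lr():.1f}")
```

In this toy setup the multiplied ratio is the square of the correct one, which is the double counting that happens when one unconsidered explanation accounts for several pieces of evidence at once.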

Expand full comment

1, 2, and 3 are true even if you don't use Bayesianism.

But by not using Bayesianism, you also get:

- You can easily put all the pieces together incorrectly and make reasoning mistakes that lead you to the completely wrong conclusion.

- The way you draw your conclusion is hardly communicable; mostly we just hope other people will draw the same conclusion.

- You can very easily end up with an incoherent set of positions. (Typically, you think X is very improbable when you think about Y, but not when you think about Z. Choosing one particular credence forces you to stay more coherent.)

Expand full comment

Those are all true of Bayesian reasoning as well; it's just that you get false confidence along with it. Yes, even with Bayesian reasoning you still get incoherent sets of positions, because it is impossible to ever do math on every possible hypothesis and piece of data, since the world is inherently bigger than the minds of anyone in that world.

Expand full comment
Apr 11·edited Apr 11

You can get false confidence but it has nothing to do with Bayesianism per se.

You don't have to consider every possible hypothesis if you don't want your positions to be incoherent; you only have to consider every position you have. You only need to consider every possible hypothesis if you want to do the complete Solomonoff induction form of it, which is impossible and only works as a useful way to define what would be ideal.

I think we can see Bayesianism as propositional logic but for credences, it helps you to reason, fix your incoherence, and communicate your reasoning more clearly.

It is true that you will still make reasoning mistakes, hold incoherent positions, and have a hard time communicating some of your positions, but it is still better than the alternative.

What I wanted to say is that it doesn't really work worse on the 3 points Nate was speaking about, but it works better on the 3 points I described (even if it isn't perfect).

Expand full comment

Are there any things nobody has intuitions about where using Bayesianism is your best bet?

Expand full comment

While some people try to argue that bayesianism should come with specific priors and tell you how to find The Right Opinion, I think there’s a large Bayesian orthodoxy in economics and philosophy that will tell you that there is no such thing as The Right Opinion - bayesianism is just the optimal way for computationally unbounded agents to manage their own private uncertainty, but doesn’t tell you what private uncertainty to have.

It’s like double entry bookkeeping - it tells you how to track the money in all the accounts you have, but doesn’t tell you what amount of money to start with in each account.
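
A minimal sketch of the bookkeeping framing: the update rule is fixed, but the starting balance is not. The function name `posterior`, the likelihood ratio of 4, and the three priors are all hypothetical choices for illustration:

```python
def posterior(prior, likelihood_ratio):
    """Update P(H) by a likelihood ratio P(E|H) / P(E|not H), in odds form."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

lr = 4.0  # hypothetical strength of one shared piece of evidence

# Three agents see the same evidence but start from different priors.
for prior in (0.01, 0.5, 0.9):
    print(f"prior {prior:.2f} -> posterior {posterior(prior, lr):.3f}")
```

All three agents apply the rule correctly and still disagree afterwards, because the rule says nothing about which prior to start from.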

Expand full comment

I really like this framing.

Expand full comment
Apr 10·edited Apr 10

Yes, Bayesianism is like propositional logic for credences.

Expand full comment

One problem is that it breaks down in situations that aren't modeled well by the "infinite omniscient computer floating outside the plane of reality" assumption. And that includes many important situations across a wide swathe of daily life. The situations that resemble Spherical Cow Game Theory Land are more the exception than the rule.

Expand full comment

I am 99% sure it’s zoonotic because of the arguments presented here and in other blogs I have read. However if it’s proven to be 100% lab created I would go with an American attack.

Why? Well as Sherlock said if you eliminate the impossible then whatever remains, however improbable, must be the truth. And the leak is highly unlikely given where the virus clearly originated - the wet market.

Lab created is not lab leak, yet everybody assumes this to be true. But these are independent variables.

Also, were it an attack, choosing Wuhan has two layers of plausible deniability. The first is the wet market (where in this scenario it’s deliberately implanted). The second is that if the virus is proven to be lab created, people would still blame the Chinese.

Obviously this wouldn’t work with Beijing.

Another fact that might put some credence into this conspiracy theory is the very strong efforts by western media and scientists to deny the lab origin. Were it proven to be lab grown this would be retrospectively suspicious.

Anyway this isn’t my belief - I’m team zoonotic. Take one thing away from this - lab grown is not lab leak.

Expand full comment

> Lab created is not lab leak, yet everybody assumes this to be true. But these are independent variables.

How can it be true that they are independent variables?

Expand full comment

I explained that. The other option is a deliberate release. Obviously the Chinese wouldn’t do that.

Expand full comment
Apr 10·edited Apr 10

Why wouldn't they? They likely would've thought that their superior socialist system would deal with the virus better than the decadent West (which even ended up seeming plausible for a couple of years), and releasing it anywhere else would just invite more suspicion with the same eventual result.

Expand full comment

You can't deal with it without damaging your economy, so that's kind of crazy.

Expand full comment

Well, that might be worth it if it damages adversaries even more, especially if you intend to conduct some not-necessarily-peaceful reunifications in the near future.

Expand full comment

I seem to remember that Mao was on the record as having desired a nuclear war, because china's enormous population advantage would mean that china would get a huge head start in rebuilding civilization after a mass casualty event

xi isn't mao, but also, covid-19 isn't nuclear armageddon

still in general i agree, deliberate release is very very unlikely

Expand full comment

I have thought about the American bioweapon theory, from time to time.

You could take the "lab leak rate" out of the calculation -- typically that's on the order of 0.2%, the judges here guessed higher at around 2%. So you get a factor of 50 right there. Instead of that 2%, I guess you'd replace it with some percentage for odds the US would choose to try such an attack (i.e. maybe the Trump admin thinks it could cripple the Chinese economy without blowing back on the US).

You could maybe also take the factor of ~10,000 out of the Bayesian calculation for "lab leak goes straight to the market and nowhere else", if you just assume that was intentional.

But is that really the one most likely place that the US would launch it? Maybe you need to mark it down for all the other possible locations. Why not release the virus right next to the WIV? And if they did pick Huanan market, how would the US know to release the virus right at shop 6/29, with its suspicious history?

Also, why not make something more obvious like a virus with a WIV16 backbone, or at least something closer to SARS1? Or one with an optimal cleavage site? How would the US attackers even have a suitable backbone virus, in the first place?

All these scenarios are substantially less likely than zoonosis, to me. But, like Chinese lab leak, they're not strictly impossible. I don't think that bayesian analysis, as practiced by Rootclaim, gives objective or precise numbers for the odds, but it could still be interesting for someone to set up one of these bayesian analyses to compare the US vs Chinese lab scenarios and try to rank the relative likelihood.
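
The arithmetic in this comment can be sketched roughly as follows. The 2% leak rate and the ~10,000 market factor are the figures quoted above; `p_attack` is a hypothetical placeholder that nobody in the thread has estimated, and the further markdowns the comment raises (other release sites, choice of backbone virus) are deliberately left out:

```python
leak_rate = 0.02            # judges' rough guess for an accidental leak
market_factor = 1 / 10_000  # penalty for "leak goes straight to the market"

def relative_odds(p_attack):
    """Odds of a deliberate release relative to an accidental lab leak.

    Only the two factors named in the comment change: the leak rate is
    swapped for p_attack, and the market penalty is dropped entirely.
    """
    return (p_attack / leak_rate) * (1 / market_factor)

# Dropping the 2% leak rate on its own is worth a factor of 50.
print(f"factor from removing leak rate: {1 / leak_rate:.0f}")

# HYPOTHETICAL values for the chance that such an attack would be attempted.
for p_attack in (0.02, 0.001, 0.00001):
    print(f"p_attack={p_attack}: relative odds {relative_odds(p_attack):.2f}")
```

The ranking between the two lab scenarios swings entirely on `p_attack` and on the omitted markdowns, which is why the output should be read as a way of organizing the disagreement rather than as an objective number.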

Expand full comment

> Another fact that might put some credence into this conspiracy theory is the very strong efforts by western media and scientists to deny the lab origin. Were it proven to be lab grown this would be retrospectively suspicious.

Retrospectively? I think you're being a little too generous.

Compare https://www.currentaffairs.org/2022/08/why-the-chair-of-the-lancets-covid-19-commission-thinks-the-us-government-is-preventing-a-real-investigation-into-the-pandemic :

> Well, more than that: I appointed him—this was Peter Daszak—I appointed him to chair the task force of the pandemic commission that I was running for the Lancet. And he headed a task force on the origins. I thought, naively at the beginning, “Well, here’s a guy who is so connected, he would know.” And then I realized he was not telling me the truth. And it took me some months, but the more I saw it, the more I resented it.

> And so I told him, “Look, you have to leave.” And then the other scientists in that task force attacked me for being anti-scientific. And I asked them: “What are your connections with all of this?” They didn’t tell me. Then when the Freedom of Information Act released some of these documents that NIH had been hiding from the public, I saw that people that were attacking me were also part of this thing. So I disbanded that whole task force. So my own experience was to witness close up how they’re not talking. And they’re trying to keep our eyes on something else. And away from even asking the questions that we’re talking about. We don’t have the answers. But we have good reasons to ask.

> So you’re saying that Daszak and others did not disclose to you pretty serious conflicts of interest? Since, on the hypothesis that it had something to do with this kind of research, that would have implicated Daszak himself in the origins of the crisis?

> Well, he could have explained to me right from the beginning that there was a big research program and that they were manipulating the viruses, and here’s how. He could have given me the research proposals. And when I asked him for one of the research proposals, he said, “No, my lawyer says I can’t give it to you.” I said, “What? You’re heading a commission. We’re a transparent commission. You’re telling me your lawyer says you can’t give me your project proposal.” I said, “Well, then you can’t be on this commission. This is not even a close call.”

> But there were so many other things. He was just filled with misdirection. I don’t know whether he understands or not, maybe he doesn’t understand. But the things he said just were absolutely not right.

Expand full comment
founding

"Lab created is not lab leak, yet everybody assumes this to be true. But these are independent variables."

This strikes me as absurd. Certainly it is possible for a virus to A: be created in a lab and then B: leak from that lab. To me, it seems obvious that this is far more likely than a virus being A: created in a lab and then C: deliberately released as a covert biowarfare attack. Careless biologists trying to do good work are far more common than bioweapons researchers, and most bioweapons researchers are working for governments that want their products sealed away in the deepest vaults against the direst contingency.

OK, maybe you believe that the covert biowarfare explanation is the most plausible. Go figure. But even then, you'd have to acknowledge the *possibility* of lab creation -> lab leak, and would not be saying things like "lab created is not lab leak".

Expand full comment

> December: COVID doesn't exist, it's all lies

> Early January: Fine, it exists, but it’s just some wet market thing that can't spread from person to person

Not quite. It’s called Covid 19 because it was recognised in 2019, albeit on New Year’s Eve.

(Although an Irish health minister did think that there were 18 other versions. We’ve made him prime minister. )

Edit:

Actually I was posting this based on an old report by the WHO. They’ve updated since to say that it wasn’t confirmed by China until Jan 9th.

Expand full comment

Our new Taoiseach, who is going to lead the party into an election, and totally coincidentally we got news today about the tender for a major development in the local city being received and evaluated 😁

Nothing like an election for loosening the purse strings! Also Simon is going to be very law and order, he reminds me of Tony Blair back in the early 00s. I wonder if Simon is modelling himself on Tony?

Expand full comment

Scott's good faith treatment of some of these "very prestigious authors" is not well-deserved.

Scott is great at Normal Scientific Situations, where there's a thorny question, and maybe it's answerable through exhaustive literature meta-analysis, and where the papers comprising the literature have been written by dispassionate actors.

This is not like that. Practically all of the scientists mentioned above who have published about this have conflicts of interest, and their livelihoods would be adversely affected if LL was institutionally accepted or conclusively proven. Because they're scientists, and all they know how to do is write papers, that's what they do, and because they're human, they're engaging in motivated reasoning while writing those papers.

Some of them are just lying. The model is big tobacco or an industry lobby.

Expand full comment
author

Can you give an example? What's Worobey and Pekar's COI? Do they do gain of function? (It wouldn't surprise me if so, I just hadn't heard of it.) Didn't Worobey originally call for more investigation into lab leak before deciding it was false? Was that just a red herring to get people to trust him later?

Expand full comment

I admit my view about this is a little extreme, but I think anyone who's worked in virology and received public grant money (especially from the NIH) is conflicted out of this debate.

Politicians fund things and slash funding for things in hamfisted ways. It's not just virologists who do GoF research whose funding would be endangered by a calamitous lab accident, it's virology writ large and possibly even broader areas of biomedical science. Virologists know this, have acted/published accordingly, and that's the state of things.

Worobey possibly did act in good faith.

https://www.science.org/doi/10.1126/science.abj0016

I'm not sure. But he later wrote a shoddy, overconfident paper, and doubled down in defending it.

Expand full comment
author

I agree it sucks that there are so many fields where everyone relies on government money that it becomes a giant correlated failure mode and we have to have discussions like this.

Expand full comment

Do you have an estimate of how many virologists in the US have NOT received public grant money, from NIH or otherwise?

Expand full comment

No. Though maybe the ones who have not received such funding don't want the deaths of millions of people on their field's collective conscience. Maybe the ones who did get such funding also don't want that.

Expand full comment

If everyone who received grant money from the government can't be trusted, then your critique isn't limited to unique bad-actor situations, it applies to almost all "Normal Scientific Situations" because governments fund at least in part most of the research in the world.

Expand full comment

I specifically said virologists who received government funding were conflicted out of *this* issue.

Expand full comment

Why specifically *this* issue, and not every other scientific issue that potentially weighs in on whether a policy might have been good or bad?

Expand full comment

Well, most research is pretty trivial and low stakes. Some but not a lot has real-world application or implication. Institutional science is a machine for turning tax dollars into published papers and that's the end of the line.

By way of contrast this particular line of research led to the deadliest accident in human history, or so I believe.

Expand full comment

Lab safety is tedious and expensive. Nobody enjoys having a bunch of bureaucrats imposing imperfect rules on them either. This really does seem to be a problem for people, e.g. the wastewater sampling guy who has no interest in gain of function complained about additional paperwork.

Unfortunately, while the origin of covid is still up in the air it's clear from the followup that the virology community cannot be trusted to self-regulate.

Expand full comment

Is it not relevant that high-profile virologists are on record lying on *this* issue?

Expand full comment

It is more than that.

Like all humans, scientists are prone to form tribes. Sometimes around a paradigm, sometimes around a political cause.

Let us say that you are a climatologist and most of your colleagues are (rightfully, IMO) concerned about human-caused climate change. They also feel under attack by climate change deniers who may or may not be sponsored by the fossil fuel industry.

You study some glaciers and find out that they actually shrink slightly slower than current models suggest. I would estimate that framed as such, this result would be much less publishable than if you found the reverse. Every reviewer knows that such an article would give fodder to the deniers, whom they consider their enemy. You yourself know this. So instead you change the framing. Perhaps one glacier was shrinking faster than expected, so you lead with that. Or you focus your attention on some other detail and only mention in passing that previous models require a correction (in an unspecified direction).

I have no idea if climate science works this way. I also have no idea how much biologists form an anti-lab leak bloc. But in humans in general, these tendencies do exist. I know that most nuclear physicists would be annoyed if the amount of red tape around radiation protection was amplified by a factor of ten. I assume most biologists would be similarly annoyed if the BSL requirements for all classes of research were incremented by one or two, so they have some motivation to reason against the lab leak hypothesis.

Expand full comment

The crazy thing is that everything you said is true *even if millions of people had not died and entire economies were not brought to their knees*.

Expand full comment

Climate science is actually much worse than that :((

Expand full comment

The people who really seem to have put a thumb on the scale here are the journal editors. The most surprising example is the Proximal Origins paper where the journal actually strengthened the conclusion beyond what the authors felt they could support!

Having a low bar for publishing the Worobey and Pekar papers looks like a less egregious example.

Expand full comment

The motives are somewhat less transparent for journal editors, but only somewhat.

Expand full comment

Hopefully without going off on a tangent - I saw a survey on a controversial issue in psychology research that included questions on political beliefs. The journal editors were much further left than the academics.

Expand full comment
Apr 9·edited Apr 10

Worobey did sign the Bloom et al. letter, yes. Personally I do suspect it was a red herring, but it's hard to prove.

The Worobey et al / Pekar et al coauthor network also includes the original "Proximal origin" authors (Holmes, Andersen, Garry, Rambaut) as senior authors. Imo it's easier to show bad faith for those authors, by looking at their FOIA'd internal communications about that paper (https://usrtk.org/covid-19-origins/visual-timeline-proximal-origin/). I think these authors' bad faith is demonstrable, regardless of what COI or politics motivated it. Personally for them I'd bet it's more about indirect incentives (prestige; staying friendly with NIH leaders who don't want to disrupt China collabs) than direct ones ("here's a bag of hush money"). But that's beside the point: we *know* that they really really really wanted to disprove lab, regardless of why; and we know they exaggerated their confidence against lab publicly.

Personally, I also consider their papers themselves to be evidence of bad faith. Their arguments are so bad, so consistently, always in favour of zoo, that it can't be sincere error.

Expand full comment

(To me the most important part of the FOIA'd communications is that when they published that "we do not believe that any laboratory-based scenario is plausible," they privately knew that at least one was: serial passage in an animal, which their "glycans" argument didn't rule out.

Their other arguments imo they do seem to have genuinely believed.

Another important moment was changing "do not believe to be necessary" to "do not believe to be plausible," in between the Nature rejection and the Nature Medicine submission. That was the point of no return in their Tragic Hero arc.)

Expand full comment

Could be wrong, but I don't think either of them do any lab work with viruses, at all.

Pekar was a student working on his PhD (in bioinformatics, I think?). He's hardly some well connected scientist.

Worobey actually strikes me as a guy who's got a soft spot for lab leak theories being possibly true.

Worobey and Bill Hamilton travelled to the Congo to test whether or not HIV was natural (Hamilton died from the trip, complications of Malaria, IIRC).

In 2021, Worobey signed the Jesse Bloom letter asking for a better investigation of covid origins.

Here's a 2021 tweet where Worobey is talking about the 18060T mutation, indicating that at some point he took the proCov2 arguments seriously:

https://twitter.com/MichaelWorobey/status/1439665957656432640

And if you read Pekar and Worobey's 2021 paper, it has passages like this:

"The first described cluster of COVID-19 was associated with the Huanan Seafood Wholesale Market in late December 2019, and the earliest sequenced SARS-CoV-2 genomes came from this cluster (8, 9). However, this market cluster is unlikely to have denoted the beginning of the pandemic, as COVID-19 cases from early December lacked connections to the market (7). The earliest such case in the scientific literature is from an individual retrospectively diagnosed on 1 December 2019 (6). Notably, however, newspaper reports document retrospective COVID-19 diagnoses recorded by the Chinese government going back to 17 November 2019 in Hubei province (10). These reports detail daily retrospective COVID-19 diagnoses through the end of November, suggesting that SARS-CoV-2 was actively circulating for at least a month before it was discovered."

(from: https://www.science.org/doi/10.1126/science.abf8003)

Worobey also endorses the 1977 flu as being a lab accident:

https://twitter.com/MichaelWorobey/status/1494004351232143360

I guess that's a more widely held opinion, though it's still not proven, no one has any idea which lab is to blame. I'm just mentioning that because I've also seen more elaborate theories where all these scientists are paid apologists for every disease in history.

So, Worobey and Pekar come in thinking lab leak is possible, hold that opinion until mid 2021, then go on to do the most thorough investigation of the data that anyone has done, along with the best genetic simulations, and discover that it was just a market outbreak, after all.

What's the conspiracy theory to do?

The same thing every conspiracy theory does when it is challenged -- it expands. Now they say Worobey and Pekar are in on the plot. Maybe Fauci didn't have enough money to pay them off in 2020, but he finally raised enough cash in 2022?

Conspiracy theorists say that they think that it was a long game, where the earlier comments open to lab leak were a "red herring":

https://www.astralcodexten.com/p/highlights-from-the-comments-on-the-5d7/comment/53570474

Perhaps they were planning to discredit the lab leak theory all along. But they were so meticulous about their plot that they waited until early 2022 to publish, after a majority of the public had already gotten frustrated by the lack of good explanations from scientists and been convinced by the lab leak theory.

A brilliant ploy by those clever scientists! After intentionally losing the PR battle, they decided to publish 2 dense papers that most people would be too lazy to read, alongside a few articles in NPR or the NY Times, that wouldn't reach the majority of people.

Expand full comment

You don't need to posit a complicated secret plot to observe the behavior we're seeing from virologists. You actually just need a professional class of people with a shared set of incentives. Except in a few circumstances we know about through documentary email evidence, this isn't a cabal. It's an industrial lobby.

The specifics about Worobey deserve attention. I think what happened is that he was open minded about this issue, wrote a high profile, overconfident, but incredibly shoddy paper when he looked at a bunch of limited data on the subject, and realized he would have allies defending the shoddy paper in truly bad actors like Kristian Andersen and Robert Garry. Shared incentives.

Expand full comment

That's why you ignore the motivations of any particular person involved and look at the evidence. The evidence all points - overwhelmingly - at zoonosis.

Expand full comment

And what evidence would this be? Did they identify the intermediate host? Find a closely related virus circulating in animal populations? Or are you talking about the pictures of Raccoon Dogs locked up in cages that Eddie Holmes took on his iPhone in 2014? Because if it is Eddie Holmes's iPhone picture then I agree the evidence is overwhelming!

Expand full comment

This is an excellent example of willful ignorance. Nice work.

Expand full comment

Exactly. This just sounds like an attempt to pre-emptively rule out all possible contradictory evidence, much like a moon-landing hoaxer would say "of course you can't trust *them*, they're in on it." (And never mind the COIs of people pushing lab-leak of course!)

Expand full comment

You don't have to look terribly hard at any particular person on any particular subject to invent some sort of motivation for what they think or say.

Expand full comment

“If they secretly knew they’d just started the worst pandemic in modern history, wouldn’t they at least be wearing masks?”

They’re all under 60 and have a healthy BMI, so… no? Assuming a massive coverup they’d presumably know that the virus is only fatal to elderly people, those with a BMI over 30, and those with major comorbidities.

I don’t believe the massive cover up theory but IMO it’s perfectly rational for extremely sophisticated actors to not wear a mask in that situation. Not to mention they’d have to wear N95 masks to avoid getting infected, a simple surgical mask would do ~nothing to protect them from others.

Expand full comment
author

Even I wore a mask the first month or two of the COVID pandemic! Also, they're Chinese, and Asian people start wearing masks at the drop of a hat.

I guess maybe they're virologists and know more about mask-wearing than the average person.

Expand full comment

Yes, average people never developed a good model of why or how masks could work. Some highly paranoid countries mandated P95 masks in 2021 but everyone else was happy to keep up the surgical mask charade for 3 years. I think a virologist would never wear anything less than an N95 mask, ideally with goggles.

Expand full comment

Only p95s worked really. I wear them when visiting hospitals but that’s the only time really.

Expand full comment

They all work to varying degrees. Wearing a cloth mask works if some guy coughs upwind of you outside and it happens to catch a strand of spittle. No mask works if you are sharing a diving bell with 4 sick people for several hours.

Expand full comment

There is a perfectly good model: paper masks are for other people, not for yourself. They don't limit your chances of getting it, they limit your chances of breathing it onto other people during casual contact.

This is how masks are used normally in Asia: you wear them when *you have a cold* to protect co-workers, not to protect yourself.

Expand full comment

People also wear surgical masks in Asia to protect themselves against smog, which makes no sense.

And sure, if you wear a surgical mask while infected it will marginally help others.

Expand full comment

Does protection against smog not make sense? It helps a little, if only because the area around your airways becomes more humid and the dust particles become less airborne.

Anecdotally, I was in Australia in February 2020. The east coast was out of masks everywhere because of the smog from bushfires (setting us up for a hell of a lot more pain and shortages when we needed them for COVID, because we'd used them all up trying not to breathe ash)

Expand full comment

It’s pretty easy to test. Light up a cigarette, smell the smoke. Then put on a surgical mask for a few minutes and repeat. I promise you’ll barely notice a difference.

Then do the same with a P95.

Expand full comment

Unreal that years later this basic idea is still misunderstood.

Expand full comment

One of many corners of things where reality is looking pretty thin these days.

Expand full comment

It's not misunderstood, it's rejected by people who understand that masks have no effect (go look at case graphs! It's not complicated stuff!)

Expand full comment

> Also, they're Chinese, and Asian people start wearing masks at the drop of a hat.

The masks are worn when you're already sick, not when you're trying to avoid becoming sick.

(At least, that's the traditional system. Currently, there are quite a few people wearing prophylactic masks on the subway.)

Expand full comment

Why wouldn’t a surgical mask reduce their risk by about 10% or something? And why wouldn’t someone care about reducing that risk of getting something that would be very unpleasant just because it wouldn’t kill them?

Expand full comment

It would be more like 1%, not 10%, given that you breathe in mostly unfiltered air when you have a surgical mask on. Not worth the annoyance of having to wear a mask. Surgical masks do help if _everyone_ is _correctly_ wearing one, but otherwise they're just a security theater.

It's even more silly for people to use a surgical mask to protect themselves against smog/forest fires, as smoke particles are smaller than Covid aerosols. But tens of millions of people in Asia do this every year.

Expand full comment

There is no reason to think they would have any idea on what mortality looked like at that time.

Expand full comment

I guess it took some mental effort to keep your odds at 90-10, even though I agree with them. Essentially, none of the arguments listed in the original debate post were thoroughly debunked or disproven, and many of the weird coincidences stayed weird.

But man, don’t you want to update when one side seems so overconfident in their theory and arguments? Not sure what the associated fallacy is for this one.

Expand full comment
author

I'm also confident, I just think I'm right.

Expand full comment

FWIW I’ve always been a 50-50 guy prior to this but it’s always been just as clear to me that Worobey is basically the same on the other side just with more scientific-sounding methods: he starts with the conclusion he wants to make and works back to the statistical method that will prove it, then runs to the NYTimes with “ABSOLUTE PROOF THAT CORONAVIRUS STARTED IN THE WET MARKET, a trillion to one odds!!!” which is a huge red flag to me (if you know anything about statistics you have to be suspicious about the kinds of probabilities they claim to be deriving, nobody should be that sure about anything). However I strongly agree that lab leak proponents aren’t doing themselves any favors with their kind of gish gallop of arguments. I think the reason I am less invested than I used to be is that some of the arguments from lab leak proponents have seemingly made their way into the discourse: the government seems increasingly suspicious of these kinds of “let’s go bring all the scariest bat viruses to our largest population centers and see how dangerous we can make them!!”-type research proposals, which was the whole goal of the original lab leak advocates anyways.

Expand full comment

That description of Worobey would be a huge red flag if it were an accurate description of reality.

That would also be a huge red flag of a research plan if it accurately described research plans. All research has costs and benefits. For sure the calculations can always be improved, but no one's well served by rounding costs to infinity and benefits to zero.

For a concrete example: I looked at the relevant sequences and I bet that the protease inhibitor in Paxlovid also would inhibit the protease in SARS2-like virus ZC45, sampled, grown in a lab, and published prior to the pandemic (not in Wuhan, FYI). But it was not cultured well enough to provide a good system for testing inhibitors. It's plausible to me that the inhibitor could've been identified and tested in a chimeric virus combining a SARS-like virus that grows better with the ZC45 protease. How many lives would have been saved by accelerating Paxlovid availability, with the in vitro work largely sorted before the pandemic started, ready for preclinical trials immediately, and perhaps on the market for Winter 2020/21? But you don't need to look any further than the controversy over a Boston University study (which posed less risk to everyone) to see that this is the kind of work people want to ban.

Expand full comment

It would be a huge red flag for Worobey (if it were an accurate description....). It would not however, be a sensible reason to update to believing Lab Leak was more likely.

Expand full comment

"Lv (is this even a real name? It sounds like Roman numeral? But I guess that’s what you expect in a country ruled by someone named Xi)"

This one I can answer. In the romanisation of Chinese, the letter u actually represents two different sounds. When done properly, the sound "oo" like in "boot" is the plain letter u, and the sound "u" as in the French "tu" is represented by a u with umlaut, like this ü (should show up properly on most computers these days?). But in practice, no-one ever bothers to write or type the dotty accents. However, when you type Chinese, all the input methods have you type the letter v for the accented ü. It's an American keyboard workaround. This means that some people have now started to use the actual letter v to stand in for that ü.

This happens particularly in the surname Lu, because there are two different surnames, one with the u sound (陆 Lu), and one with the accented ü sound (吕 Lü). It's useful for people with these different names to keep them unmixed; but it's a hassle to type the u with umlaut. So a few people named 吕 Lü have just started using that typing shortcut and just romanising their name as Lv.
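The ü-to-v shortcut described above can be sketched in a few lines of Python. This is only an illustration of the convention, not how any particular IME works internally; the function name `pinyin_to_ascii` is made up for the example. It also drops tone marks, on the assumption that the goal is a plain-ASCII rendering of a name.

```python
import unicodedata


def pinyin_to_ascii(text: str) -> str:
    """Render pinyin in plain ASCII: ü becomes v, tone marks are dropped.

    Hypothetical helper for illustration only.
    """
    # NFD splits accented letters into a base letter plus combining marks,
    # so "ü" becomes "u" followed by U+0308 (combining diaeresis).
    decomposed = unicodedata.normalize("NFD", text)
    # Apply the typing convention: u + diaeresis -> v.
    decomposed = decomposed.replace("u\u0308", "v").replace("U\u0308", "V")
    # Remove any remaining combining marks (the tone diacritics).
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


print(pinyin_to_ascii("Lü"))  # the surname 吕
print(pinyin_to_ascii("Lù"))  # the surname 陆, with its tone mark
```

With this convention the two surnames stay distinct in ASCII (Lv vs Lu), which is the whole point of the workaround.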

Expand full comment
author

Thank you!

Expand full comment

Thanks! But it's more like a Windows workaround. You can type umlauts on a Mac with US keyboard layout quite easily. Option-U to create a letterless umlaut, then a 'u'. Maybe it has to be enabled somewhere, I forget. On Windows it's more complicated and I can never remember. It used to be some magic number you had to type on the numpad.

Somehow this situation reminds me of geneticists renaming genes to work around Excel.

Expand full comment

In Chinese Traditional Computer Science (CTCS), the Mac is a cursed form, never to be spoken of...

More seriously, the goal here is not to actually type an umlaut ü, because Chinese computer users don't want to see the romanisation onscreen. The romanisation is only a way to type Chinese characters (only one of many ways, but probably the most popular now). So the people who design Chinese IMEs (the little software that takes your keyboard input and turns it into proper writing) just wanted to have a single key correspond to each element in the romanisation that all kids learn at school. (They learn it because it helps with literacy before you've learned enough characters to read properly; and it helps to teach standard Mandarin to all the people who speak non-Mandarin Chinese or some highly accented form at home.)

So, as in every Mac vs Windows debate ever... yes, the Mac is clearly superior. And yet that's somehow not the point.

Expand full comment

Yes that's true when Chinese users type Chinese. What I meant was that if it was easier to use umlauts then maybe the non-Chinese form of the name would look more like German than the key sequence used for an IME, which looks weird and unpronounceable when written in the Latin script.

Expand full comment

Oh, definitely. And some Chinese speakers do it with the accents. But you know what happens the second you use a non-standard character: every time you transfer to a new computer, or send a file, you have to expend extra effort making sure that the character hasn't been chewed up by the computer. This becomes even more of a problem in this country: Chinese bureaucracy is fabulously unforgiving of name errors. And often, a Chinese scientist working at a Chinese university doesn't give a monkey's how hard/easy/aesthetically unpleasing his name may be to westerners.

Expand full comment

> if it was easier to use umlauts then maybe the non-Chinese form of the name would look more like German than the key sequence used for an IME, which looks weird and unpronounceable when written in the Latin script.

You should see double pinyin. If I'm reading this chart correctly, "Xi Jinping" would be input as "xi jnpy". "Dajia hao" would be "dajw hk". (The principle behind double pinyin is that there are very few Chinese syllables, so you can make them all two letters long. It makes sense conceptually, but I don't think many people use it.)

Also, it isn't exactly unheard of in Latin script to have words like MANV. 🤪
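For anyone curious how the "xi jnpy" and "dajw hk" examples come about, here is a rough Python sketch. The key table below is NOT a full double pinyin layout: it contains only the final-to-key pairs that can be read off the two examples in this comment, and the names `FINAL_KEYS` and `double_pinyin` are invented for illustration. It also assumes a single-letter initial, so zh/ch/sh and zero-onset syllables are out of scope.

```python
# Partial table inferred from the examples above (xi, jin, ping, da, jia, hao).
# A real double pinyin scheme assigns a key to every final; this one does not.
FINAL_KEYS = {
    "i": "i",
    "a": "a",
    "in": "n",
    "ing": "y",
    "ia": "w",
    "ao": "k",
}


def double_pinyin(syllable: str) -> str:
    """Encode one syllable as two keys: initial letter + key for the final."""
    initial, final = syllable[0], syllable[1:]
    if final not in FINAL_KEYS:
        raise KeyError(f"final {final!r} is not in this partial table")
    return initial + FINAL_KEYS[final]


print(" ".join(double_pinyin(s) for s in ["xi", "jin", "ping"]))
print(" ".join(double_pinyin(s) for s in ["da", "jia", "hao"]))
```

Every syllable comes out exactly two keystrokes long, which is the principle described above.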

Expand full comment

> When done properly, the sound "oo" like in "boot" is the plain letter u, and the sound "u" as in the French "tu" is represented by a u with umlaut, like this ü (should show up properly on most computers these days?).

This is not correct; the sound /y/ is properly represented by "u" in the syllables yu, yue, xu, xue, ju, jue, qu, and que, where it's unambiguous. The ü only shows up in nü, nüe, lü, and lüe because N and L can be followed by either vowel. (Technically, the zero onset can also be followed by either vowel, but those cases are disambiguated with phantom consonants in the spelling, so we have wu and yu instead of u and ü.) (This assumes we're using Hanyu Pinyin, which might not be the case for an English speaker studying Tang poetry, but definitely is the case for the authors' names on a modern Chinese research paper.)
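The spelling rule in the previous paragraph is mechanical enough to write down as a small Python sketch. It covers only the syllables listed above (finals ü and üe), and the function name `spell_front_rounded` is hypothetical, chosen for this example.

```python
def spell_front_rounded(initial: str, final: str) -> str:
    """Spell a Hanyu Pinyin syllable whose vowel is /y/.

    `initial` is "", "j", "q", "x", "n" or "l"; `final` is "ü" or "üe".
    Only the cases named in the comment above are handled.
    """
    if final not in ("ü", "üe"):
        raise ValueError(f"unsupported final: {final!r}")
    plain = final.replace("ü", "u")
    if initial == "":
        # Zero onset: disambiguated with a written y (yu, yue).
        return "y" + plain
    if initial in ("j", "q", "x"):
        # These initials never precede /u/, so the dots are dropped.
        return initial + plain
    if initial in ("n", "l"):
        # n and l can precede either vowel, so the dots are kept.
        return initial + final
    raise ValueError(f"unsupported initial: {initial!r}")


for ini in ["", "j", "q", "x", "n", "l"]:
    print(spell_front_rounded(ini, "ü"), spell_front_rounded(ini, "üe"))
```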

Expand full comment

Haha, sure. It was just a simplification to convey the point.

Expand full comment
Apr 9·edited Apr 9

The thing is:

> But in practice, no-one ever bothers to write or type the dotty accents.

I've never seen anyone write "lu" when they meant "lv". I don't think people view the umlaut as an unnecessary frill that you might as well not bother writing down. I think they view the umlaut as not being part of pinyin at all. This contrasts with tone markings, where they know about them, don't bother writing them down, but are happy to indicate them if they think you're a foreigner who needs help. If they think you need help with which vowel a particular syllable uses, they will helpfully volunteer that it's V, assuming that you view the system the same way they do.

Expand full comment

Nüe and lüe would not be ambiguous without the dots either, because their back-vowel analogs are spelled nuo and luo, but they are still spelled with dots because.

Expand full comment

Finally a piece of real knowledge!

Expand full comment

Disappointed no one in this whole Lv discussion is sharing their priors or even confidence intervals that they're right!!! >:(

Expand full comment

Damn, that's an angry emoji! I suppose on this blog, you're justified...

In this thread, I'm speaking well within my range of personal and professional expertise, and I think I'd go as high as 99% confidence for everything I've asserted as true - with the caveat that I made some deliberate simplifications, and was not aiming for precision elsewhere.

Expand full comment

> When done properly, the sound "oo" like in "boot" is the plain letter u, and the sound "u" as in the French "tu" is represented by a u with umlaut

Aren't those the same sound? Or is there some subtle difference I'm too American to pick up on?

Expand full comment

I speak French as an L2 and they're different sounds in French but in English we don't make a distinction between them because the tu ("v") sound doesn't really exist.

The "u" is more like the typical "boot" sound, and the "v" involves a narrowing of the throat (only way I can really describe it).

So yeah, subtle difference. I struggle to hear the difference myself.

Expand full comment

TBF the English oo is gradually getting further front (especially in British English) — among young speakers today it's roughly halfway between French ou and French u

Expand full comment

Haha, yeah, as Madge says, there is a difference. The "v" sound is a high rounded vowel. You make it by rounding your lips as if you're going to say "oo" (that's what the "rounded" means); then, keeping your lips still in that position, try to say "ee" (ee is said with your mouth nearly closed, so your tongue is high in your mouth space - that's what the "high" means).

Every language has a unique set of sounds that it regards as different or the same, and yeah, they always trip up speakers of other languages. The classic example has got to be Mandarin and other tonal languages, where the inflection on the word makes it mean different things. I've lived here for 20 years, and I still often don't hear them.

For speakers of other languages learning English, the same challenges exist. For example, Chinese doesn't distinguish between long and short vowels, so "sit" and "seat" sound identical to Chinese people just beginning to learn English.


Thank you for the tip on pronouncing that vowel. This may change my life when I speak French, especially because my career is in road safety and the word for "road" has said "v" sound in it.

Apr 17·edited Apr 17

Note that /u/ is also a high rounded vowel. /u/ and /y/ differ in frontness, not height.

> Every language has a unique set of sounds that it regards as different or the same

A longstanding interest of mine is in how "the same" gets determined. Mandarin speakers perceive [θ] as being /s/. (They don't even hear it as "like /s/, but weird" - it's just /s/ to them.) Cantonese speakers hear it as /f/.

This is true despite the fact that /s/ and /f/ are both present, and distinct, in each of Mandarin and Cantonese.

And the effect is consistent - different Mandarin speakers will reliably classify [θ] as /s/ without needing to coordinate or discuss with each other, even if neither has ever heard the sound before.

So there's something about the phonology of the language that suffices to classify the sound, even though it doesn't actually exist.

(This is not true of every language-sound pair. The Mandarin sound "q" might come across as /ʧ/ to an English speaker, or it might come across as /ts/.)

I'd like to know how we go from a classification of sounds that do exist in a language to a classification of sounds that don't. We know the problem is solvable, but as far as I'm aware we know very little about how it's done.


Yeah, that's an interesting question that I've never really thought about, nor read about.

My first guess would be that we all classify sounds that aren't phonemes in our own language through a process that goes something like: this sound occurs in speech, so the mouth making it must be aiming to speak properly (defined here as, using the set of phonemes that my brain knows); but this sound isn't a proper phoneme, so it must be a mouth trying to pronounce a sound, but experiencing some kind of mechanical failure along the way; so I mentally reverse-engineer the sound and work out, what was the mouth trying to do that would mistakenly end up as [whatever that was].

As a first approximation, you could probably look at just whatever's closest. So perhaps the different perceptions of "th" between Mandarin and Cantonese speakers reflect differing levels of sibilance in the relevant consonants in those languages. But the full subconscious interpretive process is probably more based on reconstruction of mouth gestures rather than pure phonic features.

But all that is pure speculation... hey, I wonder if research into parrots has anything relevant? Maybe people researching parrot mimicry have looked at how their different mouth physiology interacts with our phoneme system, and some of that might apply across?

Apr 17·edited Apr 17

> But the full subconscious interpretive process is probably more based on reconstruction of mouth gestures rather than pure phonic features.

I don't think this is right. People hearing strange sounds, even when they recognize them as strange, are generally not able to reproduce those sounds. Babies also need a period of experimentation where they make a bunch of random sounds and figure out which ones match the ones in the language(s) around them. This suggests to me that what people hear are phonic features, not reconstructions of oral gestures. Note Madge's reply to you thanking you for the description of the necessary gestures involved in pronouncing /y/.

The variation between (English) "ts" and "ch" when hearing a Mandarin "q" is quite easy to explain in those terms: the "q" has a place of articulation (the location of the airflow constriction) matching English "ch", but it has tongue positioning matching "ts". (It's also aspirated, which can't be true of English "ts"; I'm not sure whether or how much English "ch" might be aspirated.) People can easily recognize these similarities, but they can't describe them, and they also can't imitate them.

Moving into the realm of pure speculation, phonemes in every language exist as a cluster of allophones. One thing that might be going on when classifying nonexistent sounds is applying a rule that makes certain (existing) allophones equivalent to each other, and seeing by that rule that the weird sound is equivalent to some real sound.


"...generally not able to reproduce those sounds...what people hear are phonic features, not reconstructions of oral gestures."

This is a good point.

But... I think not quite right, because it mixes up two different levels of processing: the conscious and unconscious. Using mirroring, actually people can be really quite good at reproducing even random sounds. I think those failures of reproduction that you note are what happens when we let our conscious mind get in the way. People can't *consciously* reproduce the sounds of other languages, but language teachers succeed when we induce our learners to relax and mirror/copy for a while (then go through a delicate process of inducing them to turn successful mirroring into conscious understanding).

The reinterpretation of foreign sounds into the phonemes of your native language happens entirely at the unconscious level, so I think it could easily have access to those simulated mirroring processes.

Your explanation of the ts/ch interpretations is completely following my theory! You explain those two different interpretations in terms of pronunciation features: place of articulation, aspiration. That's what I said.

And any sound-based theory would run into similar problems, anyway: we aren't very good at (consciously) hearing phonic features like pitch or length or harmonics. Whatever theory you use, you have to posit that at the subconscious level, your brain is processing a lot more detail in the sound than you can consciously access. The only question would be whether that detail is encoded in terms of phonic features or reconstructed gestural features.

The idea that there are allophone rules makes sense to me. It would allow for more detail and variation in the possibilities for different phonemes. But yeah, just speculation at this point. I did a quick search for parrot studies but couldn't find anything immediately relevant.


Oh, here's something relevant: https://www.researchgate.net/profile/Louann-Gerken/publication/251481108_Learning_Phonemes_How_Far_Can_the_Input_Take_Us/links/5603322708ae08d4f17150ea/Learning-Phonemes-How-Far-Can-the-Input-Take-Us.pdf

This paper finds:

"...with input as limited as this artificial language, and only 9 minutes of exposure, subjects do not develop featural representations."

There you go, that's evidence against what I was suggesting. So I dunno! Perhaps you're right that the processing of speech sounds is mostly phonic early on, and only turns into gestural/featural understanding a bit later. Interesting.


Re raccoon dogs, I think you misunderstood my point about the whole inventory. They were selling 38 raccoon dogs per month across four Wuhan markets. Over 11 days they tested 15 raccoon dogs at the suppliers of various markets in Wuhan. My point was that this is 15/38 raccoon dogs tested over 11 days and so likely comprises all or nearly all of the inventory of the raccoon dogs that would be sold in Wuhan in that period.
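
A rough sketch of that arithmetic, for anyone who wants to check it. The sales and testing figures are the ones quoted above; the 30-day month and the constant sales rate are assumptions added for illustration, and the variable names are hypothetical.

```python
# Figures quoted in the comment above.
SOLD_PER_MONTH = 38       # raccoon dogs sold per month across four Wuhan markets
TESTED = 15               # raccoon dogs tested at market suppliers
TEST_WINDOW_DAYS = 11     # length of the testing period

# Assumptions (not from the comment): a 30-day month and a constant sales rate.
DAYS_PER_MONTH = 30


def expected_sales(sold_per_month, window_days, days_per_month=DAYS_PER_MONTH):
    """Expected number of animals sold during the window at a constant rate."""
    return sold_per_month * window_days / days_per_month


expected = expected_sales(SOLD_PER_MONTH, TEST_WINDOW_DAYS)
coverage = TESTED / expected  # values near or above 1 mean roughly the whole inventory

print(f"expected sales in {TEST_WINDOW_DAYS} days: {expected:.1f}")
print(f"tested / expected sales: {coverage:.2f}")
```

Under those assumptions the expected sales over the 11-day window come to about 14 animals, so 15 tested is on the order of the whole inventory for that period.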


Also, even if you think that the raccoon dogs were farmed in western Hubei, the zoonosis side still has to explain why the farm transported the raccoon dogs from Yunnan given that they all lived wild in Hubei. Then you have to explain why there were no secondary outbreaks anywhere else the farm was supplying. And that 20 raccoon dogs is an extremely small fraction of the total wildlife trade in China and so Wuhan should have correspondingly low odds for an outbreak. One would need very strong zoonosis evidence to outweigh this, but what we have is no infected animals, a case search which the man in charge said was market biased, and environmental sampling which basically rules out raccoon dogs

author

Sorry, I've edited in a correction to the piece with a link to this comment, and will try to think about this more.


Thanks


You say that civets are plausible. Worobey says these were not for sale at the Huanan market from October to December 2019, based on the observations of the Xiao team, one of whom is a co-author of the Worobey paper. There is a very small amount of civet genetic material in the environmental sampling, but DNA can survive on surfaces for up to a year, so this is consistent with the Xiao findings that they were sold there from 2017-2019 but not from October to December 2019. https://link.springer.com/article/10.1007/s00414-022-02800-6

On bamboo rats, bamboo rats are one of the most well tested animals throughout the pandemic. The WHO reports "samples raised by some market suppliers were sampled and tested between Feb and March 2020" including 62 samples from 10 species each, one of which is bamboo rats, which implies that perhaps 20 individual bamboo rats were tested. The bamboo rats sold at the Wuhan markets were farmed so this implies they tested on farms. 42 bamboo rats were sold per month at the four Wuhan markets, so maybe about 20 at Huanan. I.e. they tested 20 bamboo rats at Huanan market suppliers, which is pretty much the entire inventory of bamboo rats at the Huanan market. Do you think they are still a plausible host?


In general there actually was quite a lot of testing of potential candidate hosts in Hubei and in Yunnan. They probably didn't test the farmed raccoon dogs because they weren't farmed. They didn't test animals in the market probably because they would mostly have been sold or cleared out each night because they only had 3-4 in cages

Apr 9·edited Apr 9

This isn't like your alien case. It's more like the alien goes to a wet market and gets covid and posits it was from raccoon dogs. The Chinese immediately trace back the suppliers of that exact market but don't find any infected raccoon dogs. It's not that they randomly tested raccoon dogs in the area, they tested at the specific suppliers

Also to make your alien case fully analogous you should say that no humans have ever contracted covid outside of a laboratory setting. For the next three years everyone says it was humans that gave the alien covid.


The other considerations in favor of raccoon dogs are very weak. The Freuling et al. study on raccoon dogs transmitting covid has an R0 of 1. Yes, they are everywhere in the environmental sampling, but so are humans and fish. They're also not correlated with covid in the market, unlike e.g. bamboo rats with bamboo rat covs.


I can't take arguments that dismiss Nepalese journals no one's heard of seriously.


Re: Connor Reed, I don't know if it's relevant, but he died from an overdose of various drugs. It's a sad story, but the discrepancies in the accounts are there: family says he never took drugs but the university dorm mates say he used to regularly order and have them delivered.

https://www.walesonline.co.uk/news/wales-news/connor-reed-covid-death-wuhan-21116962

So I think the different accounts can be explained by (1) he does contract coronavirus (2) he's the (allegedly) first known British case (3) this gets interest from the news media (4) he does a plain interview with local paper (5) national and other media get interested (6) 'hey I can sell my story!' but it's no good giving the same version of it over and over again, each news outlet wants something new so (7) he builds up the story for them e.g. from the original "I just thought it was the flu, that's why I refused antibiotics" to "I was suffocating with pneumonia, further details inside".

https://www.walesonline.co.uk/news/wales-news/connor-reed-covid-coronavirus-wuhan-19239316

"His parents, originally from north Wales, emigrated to Australia's Gold Coast when Connor was 12. The young man was forever dreaming up "get-rich-quick schemes" said his father and he dreamed of one day making millions from some unspecified Anglo-Chinese business venture."

I can imagine someone who dreamed of being rich and famous seizing on the chance to get fees from the media from selling his story, and the juicier the story, the better the fees. That's why he wasn't keeping the story straight between versions.


He may also just have suffered the usual confusion people have when remembering an exact sequence of events from the past. For instance: "I get sick, hospital tells me I have weird pneumonia, hospital later tells me I had COVID" would be conflated with "hospital told me I had COVID" by a lot of people I think, unless they were unusually precise programmer/logician types.

He might have been lying too, that's perhaps more likely, but I rate his testimony as more reliable than Peter did because throughout 2020 onwards I kept meeting people I know well and consider credible who made the exact same claim about having had a really aggressive and weird respiratory illness in late 2019 or very early 2020. None of them had any reason to lie about it. I said this on the previous thread, but I found the way Peter casually dismissed Connor+the other 91 cases as all being liars to be kind of ridiculous and not good for his case. But presumably he hasn't had the same experience as me.


To me the fact that so many people keep telling me about the weird sickness they had that winter cuts the other way. It’s clear that most of these people didn’t have covid, because if *that* many people did, then it would be a real coincidence that big floods of cases in hospitals always showed up a few weeks after detected cases, and never in all these cities with equal numbers of undetected cases. The fact that so many people retrospectively thought that some winter illness they had that year was weird is just evidence that people retrospectively reinterpret their experiences as weird in light of knowing that a pandemic started soon after they were sick. Connor Reed is just one more of those people.


I agree, but also see my parallel response to the long ranger.

Apr 11·edited Apr 11

Exactly. Those "I got covid in 2019" claims that were flying around everywhere a few years ago are so annoying.


I regularly tell people that I had COVID early in December of 2019 or whatever. I've never attached any epistemic value to this statement, and it is something I say to signal in-group-ness with people who want to talk conspiratorially about COVID so they'll be easier to steer to other topics. Yes this is one of the dark arts. Yes I'm telling the truth right now. Yes I realize this has no credibility in any direction.


I think that the probability of some random patient making unsubstantiated claims about COVID being a reliable witness in the first place would be low, and anyone who feels they have to prop up their case by such accounts probably has an extremely weak case. (At least if Saar originally brought him up. If he was brought up by someone else, then I would not update on it either way as far as the debate is concerned, and just wish we had a better filter for arguments which are obviously just a waste of time.)


"Peter, very reasonably responding to the numbers Saar gave during the debate and not the numbers he had elsewhere, trolled him by giving a set of numbers that came out to 10^25-to-one in favor of lab leak."

I think you mean "in favor of zoonosis" or "against the lab leak"

author

Sorry, fixed.


Not sure if we'll do a proper response, so in the meantime I'll open a few threads here on the major issues.

The most important thing to note is that this post discusses a lot of evidence (which mostly is irrelevant to our analysis) but Scott does not address the main problem we pointed out - that his whole conclusion stands solely on the market being some random place in Wuhan that is no more likely to form the early cluster.

He assigns this a 500x(!) factor, compared to our 2x. We specifically pointed to major mistakes in his calculation (see our blog https://blog.rootclaim.com/covid-origins-debate-response-to-scott-alexander/ and especially the last section).

In summary:

1. His calculation assumes zoonosis will almost always start in markets, while in SARS1 it was 1 in 9, and in the USAID PREDICT project markets were given a low priority. This is a 5-10x mistake. (He tries to refute this using cherry-picked examples. I'll address that in a separate thread).

2. He then gives no weight at all (Zero!) to the conditions in HSM, implying an HSM vendor who interacts daily with many people in an unhygienic closed environment that was proven to form early clusters elsewhere, is no different from a random Wuhan resident. This is a 10-100x mistake, depending on how much more conducive you think HSM is. And even if you don't think it is, just the fact it has 1000 people in the same space means it is more likely to be noticed, since there needs to be some critical mass of hospitalizations - how many people in Wuhan work in such a place?

3. Even if you somehow manage to ignore this, there is still the alternative explanation of the Wuhan CDC moving just next door to the market during the outbreak, which could also easily account for 10-100x.

Since Scott's odds are currently 17x zoonosis, fixing these mistakes easily turns him into an avid lab-leak supporter.
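
A minimal sketch of how that flip works out numerically. It assumes the market factor enters the final odds as a simple multiplier, so that swapping the 500x for our 2x just rescales the result; that is a simplification, and the variable names are illustrative only.

```python
# Numbers quoted in this thread.
scott_odds_zoonosis = 17.0       # Scott's current odds, zoonosis : lab leak
scott_market_factor = 500.0      # weight Scott assigns to the market evidence
rootclaim_market_factor = 2.0    # weight Rootclaim assigns to the same evidence


def swap_factor(odds, old_factor, new_factor):
    """Replace one multiplicative likelihood factor in a set of final odds.

    Assumes the odds are a plain product of independent factors, so dividing
    out the old factor and multiplying in the new one is legitimate.
    """
    return odds / old_factor * new_factor


adjusted_zoonosis = swap_factor(
    scott_odds_zoonosis, scott_market_factor, rootclaim_market_factor
)
adjusted_lab_leak = 1.0 / adjusted_zoonosis

print(f"zoonosis : lab leak = {adjusted_zoonosis:.3f} : 1")
print(f"lab leak : zoonosis = {adjusted_lab_leak:.1f} : 1")
```

With the 2x factor in place of 500x and everything else in Scott's table left untouched, the odds come out at roughly 15-to-1 in favor of lab leak.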


I'm just an amateur reading this stuff, so perhaps your response to this is going to be mainly a probability-fu that goes over my head, but I see two pretty major objections to your case, at least.

First, even if we have no meaningful knowledge about where the first cases were centered, the lab leak theory depends on the WIV being able to create COVID-19. The case against this, that you need a close relative, that WIV didn't have a close relative the last time they acquired new coronaviruses, and that they would have neither the means nor reason to acquire more coronaviruses, sounds relatively airtight. Sure, you could imagine some circumstances where a researcher acquires a Laotian virus and keeps it totally secret or whatever, but then you're at enough degrees of freedom from the provable facts that lots of theories are equally plausible.

2. I sort of hate making meta-rationality points, but my read is that the lab leak side has done a lot of gish-galloping between weak evidence. (This may not apply to you in particular if you've been more careful.) My impression is that lots of pro lab leak arguments rely on spurious or easily disproven evidence which lab leak advocates haven't owned up to, whereas the errors noted on the zoonosis side seem minor, good faith, and readily acknowledged by the zoonosis side. For me personally, I'm at the point of assuming many lab leak arguments that I hear for the first time now will be false, given the previous pattern of impressive-sounding arguments that end up being false. I also feel that on a normative level, bringing in weak counterevidence to clog up a debate is pollution of public discourse in a way that should be penalized. I'd like to know if you agree with this general stance in principle and why it shouldn't apply to the lab leak side.


1. You're addressing the issue of the WIV having a close relative. I'll respond to that in a different thread. The important thing to note is that once you say "we have no meaningful knowledge about where the first cases were centered", meaning you remove HSM from the analysis, everyone becomes pro lab-leak. So claiming this is strong evidence against lab-leak is a minority opinion.

2. We address this in our post (search 'pseudoscience'). All sides of all analyses we ever did use awful reasoning. Humans are just very bad at this.


1. But the whole idea that everything switches to be way more pro lab leak without the early case tracking doesn't make much sense. If you have a person who is agnostic about the early clustering, but finds the evidence to be strong that WIV couldn't/wouldn't have manufactured the virus, then lab leak seems no more likely than many other far fetched theories. When you look at Scott's grid for the factors people multiplied together, many people didn't even list probabilities on that section. This fits with what Scott said about these probabilities mostly being an afterthought once people had already drawn their conclusions, and would likely be revised even if someone changed their stance on the evidence.

2. I guess Scott already made the relevant points better than I will, but the claim that both sides are equally bad about pseudoscience seems like false equivalence. You're using flaws you claimed to have found in published papers as evidence of the zoonosis side getting things wrong, but you're still defending extremely weak evidence like the Connor Reed stuff. I don't really find it plausible that there is an equal level of good faith attentiveness to the quality of evidence if that's going to be the comparison.


1. I'm not saying this evidence is irrelevant. I said I'll address it somewhere else. Just wanted you to notice that on this claim you are arguing against everyone, not just team lab-leak.

2. Note I'm not defending Reed. We don't even use him in our analysis. I'm doing the opposite - pointing out the misrepresentations zoonosis supporters are forced to make about him in order to avoid this evidence.

But end of the day, I don't have a model that shows who has worse reasoning. I can only share my experience that in all our analyses, we find everyone to be awful. Feel free to ignore it.


The strongest evidence in favor of a lab leak is that it would be an incredible coincidence for the virus to show up in Wuhan - actually, not just incredible, but absurdly incredible - without the laboratory playing some role in this. (Annoying counterintuitive mathematics involved here, but it checks out.)

Meanwhile, the strongest evidence in favor of zoonosis *doesn't actually contradict the possibility of a laboratory leak*. Yet the "zoonosis" hypothesis is treated as oppositional to "laboratory leak", when most paths to a laboratory leak actually involve zoonosis.

It could come from a researcher collecting specimens who got infected, or a researcher who got infected from an already-collected specimen, or it could arise semi-naturally in a laboratory as a result of the disease moving through laboratory animals, or it could arise semi-artificially in a laboratory as a result of researchers deliberately spreading the disease among populations of animals to see how quickly the disease spreads and in what conditions (which, I don't know, seems like exactly the kind of experimentation a laboratory like this would be doing very regularly), or it could arise entirely artificially in a laboratory as a result of direct modification of a virus.

Of these, the evidence against zoonosis is really only evidence against the last path to a laboratory leak.

Yet everybody pretends these are mutually incompatible hypotheses - which means the zoonosis hypothesis is not, in fact, the hypothesis that the disease had a zoonotic origin, but rather the hypothesis that the disease had a zoonotic origin *and also it had nothing to do with the laboratory*.

There is a motte and bailey here!

"The furin cleavage site is more likely to be the result of a natural mutation for XYZ reasons" isn't evidence that the laboratory wasn't involved! It isn't evidence that the origin of the virus must have been raccoon-dogs! It isn't evidence that the disease must have originated from raccoon-dogs brought in to sell at the wet market! It is only, exclusively, evidence that there was a zoonotic mutation involved.

Which is not evidence against the laboratory leak hypothesis.

The only real evidence against the laboratory leak hypothesis is the relative frequency of zoonosis versus laboratory leaks - which is more than overwhelmed by the remarkable coincidence involved in the disease appearing where it did. All the other evidence has some, let us say, severe chain-of-custody issues including the laboratory in question deleting and hiding data, on top of being more "absence of evidence" than "evidence" per se. (Granted that absence of evidence is evidence, but it's significantly weaker, particularly when paired with the chain of custody issues)

The weight of the evidence is on "laboratory leak (with probably zoonotic origin, whether inside the lab or outside the lab)", not "zoonotic and also has nothing to do with the laboratory".


On the order of 1 in 1000 people in Wuhan visited Huanan market every day, mostly for a small fraction of the day. Assuming that WIV-linked and WIV-unlinked people have similar rates of visiting Huanan market and equivalent locations suggesting exposure via the wildlife industry, 500x for the market coincidence is conservative.
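
One way to sketch where a factor of that size comes from. The 1-in-1000 daily visit rate is the figure above; treating it as the chance that a market-independent first cluster lands at Huanan, and the value plugged in for P(market cluster | zoonosis), are illustrative assumptions rather than anything stated in the thread.

```python
# Figure quoted above: about 1 in 1000 Wuhan residents visits Huanan market per day.
DAILY_VISIT_FRACTION = 1 / 1000


def market_likelihood_ratio(p_market_given_zoonosis,
                            p_market_given_lab_leak=DAILY_VISIT_FRACTION):
    """Likelihood ratio for 'first cluster at the market', zoonosis vs lab leak.

    Assumes a lab-leak index case is no more likely than a random resident
    to be at the market, so P(market | lab leak) equals the visit fraction.
    """
    return p_market_given_zoonosis / p_market_given_lab_leak


# Hypothetical input: even if only half of wildlife-trade spillovers in Wuhan
# would first show up at Huanan, the ratio is already about 500.
ratio_at_half = market_likelihood_ratio(0.5)
print(f"likelihood ratio at P(market | zoonosis) = 0.5: {ratio_at_half:.0f}")
```

Since visitors are mostly there for a small fraction of the day, the effective P(market | lab leak) would be lower still, which pushes the ratio up rather than down.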

Curves of onset dates for market-linked and market-unlinked cases show that spread was no faster in the market than in Wuhan ex-market. Early spread is associated with hospitals, transit, social events, and high density work places. See Fig 1 here for the onset curve comparison: https://www.nejm.org/doi/full/10.1056/NEJMoa2001316

Your anecdotes to try to build up markets as singular sites of spread e.g. regarding Singapore and Thailand outbreaks do not hold up to fact checking. For example, you write of a Singapore outbreak, "This too happened months after zero Covid." Nope. There were 27 active clusters the day this outbreak was announced: https://www.moh.gov.sg/news-highlights/details/update-on-local-covid-19-situation-(16-july)

Your Thailand anecdote is a great example of the fallacy of comparing low-prevalence, high-mitigation era outbreaks to the pre-pandemic situation in Wuhan. Here's an article from Thailand discussing the causes of outbreaks in various stages of the pandemic -- https://www.hindawi.com/journals/apm/2021/5807056/ -- the food industry is dense and essential; of course outbreaks are disproportionately found here and I bet most here can remember examples of food industry outbreaks in 2020, but probably can't name one they've heard of recently. Here's a paper in Thai on a major industrial food processing outbreak at the time -- https://he01.tci-thaijo.org/index.php/DCJ/article/view/257499 -- you hear about the outbreaks at public markets because those are the outbreaks that are publicized in order to identify cases.

"there is still the alternative explanation of the Wuhan CDC moving just next door to the market during the outbreak, which could also easily account for 10-100x."

I thought the probabilistic framework was the reason you avoided this argument? Your current argument leans heavily on a rejected grant proposal that describes the type of work we knew WIV was doing without reading it as key evidence. Specifically: "For a bat coronavirus pandemic to start in Wuhan due to a lab leak, the following chain of events needs to occur: (1) WIV started a DEFUSE-like project." It doesn't say "Wuhan CDC started a DEFUSE-like project" and it hinges on the existence of countless covered up samples in places Wuhan CDC did not sample.


Thanks. This is a good comment that directly addresses the analysis.

Curves of onset dates - Seeing as this argument is recurring, we may analyze it in more detail. In any case, note it is only relevant to part of the HSM arguments, and also note how, again, the only zoonosis claims that are not easily refuted always resort to using unreliable data or complex error-prone models.

Singapore - There aren't 27 clusters. 23 of those have 0 new cases. The rest are not really clusters, but isolated cases (note that "KTV" may refer to hundreds of different locations). The fish market later developed to 1200 cases, again demonstrating the key point - dozens of small clusters, and once it gets to a seafood market, it explodes. It is becoming very hard to argue there isn't something about these markets that is highly conducive to SARS2 growth.

The same happened in China - There are many small clusters, but 2 out of 5 big ones were at seafood markets/facilities.

Thailand - I don't understand. Are you claiming there are plenty of clusters with >1000 cases that went unnoticed?

CDC - Note the hypothesis is infection through WIV-CDC cooperation. Not CDC doing GoF.


Dude. You said there was COVID zero for months and it was dead wrong. When you get contradictory data that upends what you said was important evidence you revise your beliefs and don’t come up with new techniques to wave your hands to make it go away.

As Scott noted I think, both the Thai and Singapore clusters are epidemiologically linked to crowded housing for immigrant workers. Their dormitories were closed as well. This is also true for earlier Singapore clusters that aren’t linked to fish so aren’t something you discuss. This is widely discussed in the media in Singapore and Thailand if you don’t limit your search to a single mention of shrimp and call it a day.

Like I said, almost everyone knows examples of food industry spread during low prevalence times since density in the economy was reduced as much as possible. It’s apples to oranges.


Ok. Will make sure to use "COVID near-zero" from now on. The point is these markets form large early clusters that other locations don't.

Not sure what is the importance of the dormitories here. It's just another reason why those facilities keep popping up. It doesn't affect p(HSM|LL), which is what we're interested in.

As far as we could find, this is not about "food industry" but about crowded closed locations with many cool wet surfaces. In the east it is seafood markets, in the west it is meat processing plants. Vegetable markets, for example, come up much less often, and in Xinfadi we clearly see seafood vendors having far more infections than vegetable markets.

Do you have data showing otherwise?


Do you honestly not get the point that spread in countries suppressing an R0=3 virus to R~1 is going to be disproportionately concentrated in the food industry? Greater than R0=3 for your seafood market outbreaks since they're lineages after substantial adaptive mutations e.g. your Singapore example is in an imported Delta sublineage. One that's also found in Indonesia and in Singapore for people with Indonesia travel history iirc, with a similar story for the Thai outbreak.

Food is cheap. Keeping the food industry running for local consumption is difficult and even more difficult in economies that depend on food export. There are plenty of other anecdotes from all over the world throughout 2020 (and into 2021 for places that kept SARS2 at a low level e.g. Singapore). Check out the link to the Thai paper I posted; the abstract and some other content are in English. Then check out pre-pandemic reports about the Thai shrimp industry - https://www.ap.org/news-highlights/seafood-from-slaves/ - and try to figure out how difficult it is to keep the industry running with the same degree of mitigation you see elsewhere. The cases (including a case earlier than the shrimp vendor you cite) were largely in a food processing factory. Further, migrant workers without work/residency permission (a large fraction in the Thai industry) are unlikely to have very good healthcare access.

Thailand didn't get it under control by figuring out how to produce food with social distancing, by the way. The solution was a "bubble and seal" policy that you can read about in the report here: https://mwgthailand.org/en/press/1619664349

There were on the order of 100 million infections globally in the first wave of the COVID-19 pandemic with on the order of 1000 introductions into communities. What does Bayes have to say about the anecdotal market outbreak out of that period occurring spitting distance from illegal trade of SARS2-susceptible wildlife?

Expand full comment

Are you saying there are just as many large early clusters in dry food factories, vegetable markets, supermarkets etc? I did not see any evidence of that, but I'm open to examining it. Feel free to share.

As to your last question - That is exactly what we analyzed and the finding is that it's negligible.

Expand full comment

>What does Bayes have to say about the anecdotal market outbreak out of that period occurring spitting distance from illegal trade of SARS2-susceptible wildlife?

Not much, considering how prevalent it is in China, and for all the other reasons you note that markets in general are likely places to acquire outbreaks.

At least, not as much as "... occurring within spitting distance of 1 out of 3 total labs doing GoF research on coronaviruses at that time".
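For anyone who wants to see what "what does Bayes have to say" amounts to mechanically, here is a minimal sketch of the odds form of Bayes' rule. The function name and every number in it are hypothetical and chosen only for illustration; none of them come from the debate or from either side's analysis. The whole disagreement in this thread is about the likelihood ratio p(HSM|LL) / p(HSM|Zoonosis): a ratio near 1 makes the market evidence negligible, a small ratio makes it strong evidence against lab leak.

```python
def posterior_probability(prior, likelihood_ratio):
    """Return P(H | E) from a prior P(H) and a likelihood ratio
    P(E | H) / P(E | not H), using the odds form of Bayes' rule."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Hypothetical inputs only: a 50% prior for hypothesis H and three
# assumed values for how much the evidence favors or disfavors H.
for likelihood_ratio in (1.0, 0.1, 0.01):
    updated = posterior_probability(0.5, likelihood_ratio)
    print(f"likelihood ratio {likelihood_ratio}: posterior {updated:.3f}")
```

The arithmetic is the easy part; the sketch only shows that the conclusion is driven almost entirely by which likelihood ratio you accept.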

Expand full comment

I agree, especially with (2). There's also a whole lot of sampling bias (searching under the lamplight) that makes the clustering appear to be stronger than it was.

Expand full comment

People who say that we're living in a simulation give trillion-to-one odds that we could be in a real universe. (That's not a dig on Bayesianism, I'm just setting up the actual point.) They say to look for suspicious names. A Saar debating SARS? Impossible. Clearly COVID, you, and I are all fake.

Expand full comment

You should try to get the late 2019 addresses of the WIV staff. If one of them lives next to the wet market I think your case against zoonosis could start to look strong again.

Heck, maybe a young scientist lives with a parent who is a fishmonger. Or racoon vendor!

Expand full comment

Our analysis already shows HSM is negligible as evidence, so there is no value to weakening it further. I understand our analyses are not easily understandable to a wide audience, but that's what we need to solve, not revert to weaker arguments that may be more convincing to some.

Expand full comment

If you're right, and I tend to think you aren't, that map is killing you. It is extremely convincing to me - simple and evocative. You are going to need something powerful to overcome it.

Expand full comment

The case map? It has a dozen problems with it, but it's not important to our analysis. We're fine with HSM accounting for many early cases. That's in our calculation. We provide 3 different reasons why this is not a low likelihood event under lab leak. Discussed in excruciating detail here https://blog.rootclaim.com/covid-origins-debate-response-to-scott-alexander/

Expand full comment

Thank you very much for all that work!

Expand full comment

> SARS spread back and forth in some kind of weird net between civets, raccoon dogs, and a bunch of made-up-sounding animals like "ferret-badgers" and "greater hog-badgers".

I'm not sure what makes "ferret badger" a weirder name than "raccoon dog".

Animals tend to either (1) have a local name; or (2) be named after an animal that is local to the language that needs a name for the animal. The Tasmanian tiger resembles nothing so much as a dog (which it isn't), but it's called a "tiger" (which it also isn't) anyway, because they had to call it something, and people only know about the animals that they already know about. "Tiger" was good enough to distinguish it from other marsupials.

Expand full comment
author

They're both weird names! But if you told me "Here's my pet gorilla-penguin", I would say you were making it up.

Expand full comment

Wait until you hear about alligator pears.

Expand full comment

My ears pricked up. Do you have a source? I only want the good stuff. Nutty. Creamy. You know what I'm talking about.

Expand full comment

We have to get our penguin monkey tacos somewhere!

Expand full comment

I can't believe no one has mentioned this yet (probably because it isn't particularly relevant), but raccoon dogs aren't raccoons, they're dogs. They actually have pretty much zero relation to raccoons, which is funny because they have an almost identical appearance and ecological niche.

Expand full comment

That's the way all English noun compounds work. (It's called "right-headedness".) Raccoon dogs must be dogs, in some sense, in the same way that alligator pears must be pears and not alligators, or that I don't need to look anything up to know that the greater hog-badger is not a type of pig.

They aren't actually dogs, but the terminology unambiguously indicates that they are conceptualized as dogs, and they are canids.

Expand full comment

I remain completely perplexed by the sources of evidence used here, and the fact that anyone thinks they can draw solid inferences from data that has obviously been manipulated by the Chinese government. Maybe someone can explain the premise to me, and maybe there's something I'm missing, but analyzing the early cases as acknowledged by the Chinese government just seems fundamentally pointless.

Everyone agrees there has been a cover-up. No one seems to agree exactly what was being covered up and why. Based on the way authoritarian regimes typically act in response to such situations, I think a fair guess is: many different things, for many different reasons, and often for no reason at all. And surely it was also a multi-level cover-up; a standard feature of authoritarian regimes is that lower-level officials conceal things from higher-level officials. So, you may have had multiple coverups operating at cross purposes to one another. Thus, the kind of pattern Scott suggests is so unlikely seems actually quite likely to me; not because it was conscious misdirection by some omni-competent villain but because it was the result of different officials at different levels trying to conceal different things.

So, I think before you can make *anything* at all out of the data on the purported early cases, you need some kind of theory of the data generating process and the biases it introduces. Looking at the data without that backdrop is like a case where I see "here's a survey of 700 people about politics," tell you nothing about how I sampled them, and then you analyze them under the assumption the sampling was random, when in fact this could be 700 people I sampled at a Democratic party meeting or 700 people I know personally or anything else.
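To make the survey analogy concrete, here is a small simulation sketch. All of the numbers are assumptions made up for illustration (a population where 50% hold some view, and a "party meeting" subgroup where 90% do); the function name is hypothetical too. It shows how the same 700-person sample size gives very different estimates depending on a sampling process the analyst was never told about.

```python
import random


def estimate_support(sample_size, true_rate, rng):
    """Draw `sample_size` respondents who each hold the view with
    probability `true_rate`, and return the observed share."""
    hits = sum(rng.random() < true_rate for _ in range(sample_size))
    return hits / sample_size


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Assumed rates: 50% in the general population, 90% at the meeting.
random_sample = estimate_support(700, 0.50, rng)
party_meeting_sample = estimate_support(700, 0.90, rng)

print(f"random sample estimate:        {random_sample:.2f}")
print(f"party-meeting sample estimate: {party_meeting_sample:.2f}")
```

Both samples look equally respectable as "a survey of 700 people"; only knowledge of how they were collected tells you which one says anything about the population.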

So, can someone explain this to me? What is everyone doing? It seems like there's zero analysis on this and everyone just kinda takes the early case data at something like face value with no model of where it came from. Is there a model?

Perhaps the justification for doing this is that it's too hard to come up with a meaningful model. Probably no one in the world has a granular understanding of both the multi-level Chinese politics involved and the virological side of things. You probably can't develop that understanding of the Chinese politics purely based on open sources, so to really have all that information, you'd need either a regime insider or, maybe, an intelligence agency that knows a lot of non-public things about Chinese politics.

Which brings me back around to a point I made in the last post: the only real candidate for an organization that can bring both pieces together is the US intelligence community. It's hard to be sure how good their political understanding is, especially given the high profile failure in recent years of our HUMINT in China, but given the amount of investment we put into understanding Chinese politics, it's probably better than anyone else's. And they have the scientific experts as well. So, it strikes me as *extremely* notable that the intel community is divided on lab leak vs. zoonotic origins, but no agency is willing to offer more than low confidence for the zoonotic origin.

Unless you're incredibly sure that you have a better understanding of Chinese politics than the US government does (not impossible, but they definitely know many things you don't), I really just don't see how you can possibly reject that conclusion.

Expand full comment

This will be a shorter comment than this point deserves, but I think you can have multiple levels of assumptions here, which generally favor zoonosis.

1. There wasn't a cover up preceding the actual outbreak. Thus, what we know about the wet market from before Jan 2019, what we know about the WIV's virus stock and capabilities before roughly this time, etc., are all basically right, because the state cover up apparatus had no reason to deploy at this time.

2. The Chinese government isn't making up complex data. The coverup involves destroying information, like killing the animals at the wet market, but they aren't falsifying information about infection case studies or viral DNA sequences, because it would require significant expertise to create fakes without clear flaws and because the investment versus return for this sort of falsification is bad.

3. The Chinese government can't do perfect containment and suppression of prior outbreaks, so we are not missing out on information about previous outbreaks that have been silenced without any word getting out.

I think these are all reasonable assumptions to make, and cover most of the strong evidence used in the debate.

Expand full comment

#2 here seems pretty reasonable to me. #1 and #3 seem a lot more iffy.

I think the fault with #1 is the idea that, if there was a cover-up, it had a specific motive (to disguise the origins of COVID), thus before COVID started there was no motive and no cover-up. One of the main points I'm trying to make above is that authoritarian regimes often don't work that way; there's reflexive secrecy by bureaucrats as a means of self-protection. And then there are innocent bureaucratic explanations that don't even assume secrecy.

For example, in US science it's routine -- because of the difficulty in securing a steady stream of grant funding and the delays involved -- to bootstrap your grants. You apply for a grant having already done part (maybe even most) of the work involved from the tail end of your last grant, complete the work, and then use the remaining funds to get your lab going on the next project (and the cycle repeats). I don't know anything about how Chinese funding works, but it's not hard to imagine someone at the WIV who was keeping his current project quiet for the same reason. Or, if there was someone doing unauthorized gain-of-function work, they'd keep that on the down low as well and so on.

People say that the WIV had published a catalogue of the viruses it had gathered, and had no reason to lie about that (e.g., by omitting COVID itself or a close relative) before the outbreak. Again, that's just not true. What evidence do we have that the catalogue was comprehensive? Something as simple as laziness could mean that some percentage (perhaps the majority) of viruses didn't make it into the catalogue. Or maybe they had been unusually productive, so they reported half of what they had done so that they could slack off later and then report that old work as new work. And so on. You don't need any kind of potent motive. All of that routinely happens in any organization and more so in authoritarian systems. There's no need to assume a conspiracy any bigger than a handful of people in a lab doing things that people in labs do all the time for simple bureaucratic reasons.

So, I just don't buy the idea that our understanding of what the WIV was up to as of 2019 is "basically right." Maybe it is. But I think it's an unjustified assumption. And to really know that, you'd need a really detailed understanding of the dynamics within the WIV and even the personalities of the individual scientists there that I just don't think anyone has outside of China.

And then, once the outbreak starts, everybody panics and starts reflexive concealment. Over two million Chinese government officials have been prosecuted under Xi Jinping's anti-corruption campaign since 2012; many of them have been killed. So, circa 2019, bureaucrats in China were living in an environment of grave fear. The natural response to that when you see a big scandal coming is just to shred stuff, destroy evidence, and start distancing yourself. Again, no conspiracy. No planned cover-up. Just scared people trying to make sure they can't be tied to a pandemic whose origins they most likely don't know. And if everyone is doing that, then someone who knows they're responsible can just blend into the background noise of obfuscation and fear.

Eventually, the central government gets involved and starts crafting a narrative, but it seems likely that they don't really know the truth themselves. If it was a lab leak, there might not be more than a handful of people who know that. All you really need is one PI who was doing the work and screwed up. I'm a researcher (though in an entirely different field) and aside from me, no one has a full picture of what I do day-to-day. If something went terribly wrong and investigators were trying to reconstruct it, the only possible ways they could do that are by getting me to talk or through forensic reconstruction from my computers and hard drives. If those were destroyed, they could only get a partial view from talking to my RAs and colleagues or looking at institutional records. I've never worked in a virology lab, but the number of people who would know if some PI was doing this work -- even in the absence of any kind of effort to keep it quiet -- is probably in the single digits.

And, when you turn to the epidemiological data, someone can correct me if I'm wrong, but the reports on the early cases around the market that everyone is relying on were -- I believe -- not released until early 2022 by which time all kinds of manipulation was going on.

As to #3, if you mean containing and suppressing some discrete prior outbreak involving a meaningful number of people months earlier than the acknowledged start of the pandemic, then I agree.

But, I don't think it's at all implausible that COVID started a few weeks earlier than the acknowledged timeline and the Chinese government has kept this under wraps and that's all you really need to toss out the whole market argument.

It seems like the US intel community has some kind of indication that happened, though one is left reading ambiguous tea leaves. The declassified ICA reports as fact that COVID started spreading "no later than November." According to multiple different media outlets, the first US intelligence reports on COVID were issued in late November.

If that's true, and something was happening on a scale observable to US intelligence agencies by late November, then the market-linked cases starting on December 11 are useless as evidence of anything.

Expand full comment

I agree with Sam above, but more to the point, I don't disagree with JBG - and I don't think Scott disagrees with JBG! Looking online, I agree that the US intelligence community has only offered low confidence on any conclusion; but it looks to me like they lean more towards natural origins than lab. And I read that as being exactly what Scott says. It's certainly what I think.

You may be right that all this analysis of data is a waste of time. But in terms of conclusions... there's nothing to see here. I personally think that natural origins seems most likely. But government conspiracies do happen sometimes, and if in 20 years' time it turns out that it was a lab leak, I'll... accept it.

Expand full comment

It's a bit hard to characterize the intel position, but as of the last available reporting you have five agencies (the National Intelligence Council plus four other unnamed agencies) concluding it was zoonotic. All of those are at low confidence. The FBI and the Department of Energy assess it was a lab leak (intriguingly, they think this "for different reasons" than one another). The FBI assesses this at medium confidence vs. low confidence for DOE. And then you have the CIA and one other unnamed agency unable to reach a conclusion.

If you give a little extra weight to the FBI because of its higher confidence level, then that leaves the average position at unable to reach a conclusion. That's where I am as well; I'm extremely uncertain. If you absolutely made me bet, I guess I'd put my money against lab leak but it's like 53-47 or something for me.

I'd be interested to get a clear characterization of Scott's position, but I read him as leaning further than that.

Expand full comment

I like how you basically went through the same process most of us in this topic went through, but at like 10x the speed.

The "asymmetry of passion" on the lableak side will always guarantee that a plethora of arguments, rationalizations and attacks will be thrown at the wall to see what sticks, and like the famous hydra, anytime scientists knock down one argument, two more spawn in its stead. As I have written here https://www.protagonist-science.com/p/a-tale-of-two-pandemic-origins and elsewhere, the controversy is mostly driven by a set of asymmetries that blows up the lableak side; despite the emperor being naked.

After debunking a few dozen of those flawed arguments, all but the most stubborn scientists and communicators just have to give up; it's exhausting, draining, and often comes with unpleasant harassment on top.

That does not mean that most people confused or unsure about this topic are conspiracy theorists (just wrong on the topic), or do not deserve good information; of course they do. But asymmetries and the actions of certain activists just make the fight against the windmills of conspiratorial ideation much harder for the minority of scientists who can speak intelligently about the topic; as I am sure you will experience now coming out with yet another long post that will expose you to some of that same "passion".

In any case, maybe your 10x speed approach is more sensible, just ripping off the band-aid and getting back to normal life quicker; while the virologists who conducted the essential work have been stuck in a nightmare for years for just doing their jobs; getting death threats, white powder letters, having to hire security guards, scrape the name off their labs from institute's map etc...

Expand full comment

This post is pretty representative of the entire debate.

I'm not an expert in epidemiology, virology, local CCP politics, or really any topic that would be useful in evaluating the evidence. Not that I know nothing about these subjects, but I have to make predictions based on my trust of the actors involved. I don't have time to watch a 12 hour debate, or even read this blog post fully, with all its links, and threads and subthreads - even if I did, there's no guarantee I'd come to the same conclusion as anyone else.

So for me, it's a question of "Who do I trust?"

Not to pick on you too much, but you have a blog called "Protagonist Science", wear your PhD in your username like a badge of honor, and indirectly smear lableak proponents as conspiracy theorists disrupting the noble work of the virologists.

Sorry dude, I can't trust you! You're *clearly* not an unbiased actor. Your whole online persona seems to be "defender of institutional science". And this is a debate where institutional science itself is in the crosshairs!

This doesn't mean lab leak is true, or that you're lying or being intentionally manipulative. Institutional science really does seem to be right most of the time. But on this question, I can't trust it. Nor do I trust the CCP, or Peter, or Saar or Scott - not fully anyway.

So I'm at a loss. Of the people I know and trust IRL, they seem to be split roughly 50-50. So will I until further notice.

Expand full comment

> Sorry dude, I can't trust you! You're *clearly* not an unbiased actor. Your whole online persona seems to be "defender of institutional science". And this is a debate where institutional science itself is in the crosshairs!

Everyone is biased. The question is if there is a mechanism to correct for bias, such as requiring and citing specific, detailed evidence, in public, and open to counter-arguments - that's the point of institutional science. Overall the fact that domain experts all converge on zoonosis should be itself VERY strong evidence for zoonosis. The fact that there's some vague "institutional science" whose side they're on might make them turn an 80% into a 90%, but the idea that it can turn "probably lab leak" into "probably zoonosis" is crazy. If "well, if gain-of-function research done at an inexperienced lab in Wuhan caused COVID-19 and the public learned that, that would make people blame all institutional science and virology" is sufficient to flip experts' probabilities on their head, then that level of "bias" can be applied to effectively everything. You think the moon landing really happened? Imagine how bad it would look for space researchers - even Soviet ones - if it came out it was faked! I just don't know who to trust!

Like, imagine if in early 2021, the CDC published a report in conjunction with a WIV whistleblower that said "Yes, COVID-19 was the result of gain of function research we were doing. We were inexperienced and we fucked up. We believe one of our techs became infected but asymptomatic and infected that raccoon dog vendor, but nobody else. We don't know who or when, exactly, but we do have a sample which corresponds almost exactly to Lineage A, which we believe mutated into Lineage B shortly thereafter." Would the result be that everyone decided that institutional science sucks? No, obviously not.

(Even excepting this, the GJP are not institutional scientists but are converging on zoonosis: https://goodjudgment.com/wp-content/uploads/2024/03/Superforecasters-Covid-Origin-20240311.pdf )

Expand full comment

> Overall the fact that domain experts all converge on zoonosis should be itself VERY strong evidence for zoonosis

IF YOU TRUST THE DOMAIN EXPERTS

This is my entire point. Do you trust these domain experts? I'm not sure I do, at least more than I do the people who hold the lab leak hypothesis (and I certainly don't trust the "Good Judgement Project" - why should I? I've never heard of them til right now). I'm sure you're right in that among domain experts there's been a lot of debate, arguments, counter arguments, etc. But for a debate like this, where there's no falsifiable "right answer", it's not clear if they arrived at their conclusions because the zoonosis hypothesis is true or if it's a better meme.

This isn't a "conspiracy" thing, but a "It is difficult to get a man to understand something, when his salary depends on his not understanding it" thing.

Expand full comment

> and I certainly don't trust the "Good Judgement Project" - why should I? I've never heard of them til right now

They are Tetlock's pet project, who are much better than other groups at predicting the future due to the reliance on individual high-quality forecasters. They outdo basically every non-GJP group in accuracy of predictions, including prediction markets, AFAIK.

> This isn't a "conspiracy" thing, but a "It is difficult to get a man to understand something, when his salary depends on his not understanding it" thing.

My point is that their salary DOESN'T depend on them not understanding it. If American scientists converged on lab leak it would, I dunno, discourage virology collaborations with Chinese firms? It doesn't actually matter. The idea that the lab leak would somehow hurt the average American scientist is insane.

Expand full comment

Re: Good Judgement Project: Lab Leak vs. Zoonosis is not a prediction market thing - it's not even a prediction! "Who will win the 2024 election?" or "Will the Russo-Ukrainian War be over by 2025?" are questions whose answers can be verified in time. This is a debate about something that occurred 4 years ago, that we'll probably never get a 100% answer on.

As for the salary question: If lab leak turned out true, you think the fields involved are gonna get away with it?

If lab leak is true, it means the WIV, NIH, and probably virology as a field itself will be implicated in a scientific/industrial accident that makes Chernobyl look small: millions dead, billions of lives disrupted. Virologists and NIH grant administrators would be lucky to escape with their heads, let alone their funding.

Expand full comment

> If lab leak is true, it means the WIV, NIH, and probably virology as a field itself will be implicated in a scientific/industrial accident that makes Chernobyl look small: millions dead, billions of lives disrupted. Virologists and NIH grant administrators would be lucky to escape with their heads, let alone their funding.

No, none of that would happen. I don't know why you think it would.

Keep in mind that right now the MAJORITY OF THE POPULATION believes the lab leak theory and functionally NOTHING has happened.

Expand full comment

Thank you for this honest comment.

It is not irrational to go with one's trust network; it probably is smarter and more successful than going with whatever random or motivated information source one comes across online.

But reliance on trust networks by themselves is also not ideal; as I wrote here: https://www.protagonist-science.com/p/asymmetric-power-in-the-information-282

"The overt favoritism of simplified, emotionally engaging narratives and personalized niche content delivered by contrarian shysters, marketeering influencers, or other engagement gurus destroys any navigable level of signal-to-noise ratio on any topic for the wider public.

This is one of the core roots of our epistemic crisis.

The current informational architecture destroys the signal-to-noise ratio of any topic that garners widespread attention.

Think about every aspect of the pandemic, the one topic that by its nature demanded us all to pay attention to. What ‘attention-grappling’ aspect of Covid does not have a polarizing wedge driven through society? Does it exist? Do masks work? Do lockdowns help? Should we open schools? Can you trust your health institutions? What about Ivermectin instead of vaccines? Are vaccines even safe and effective? Did humans create SARS-CoV-2? (on that last one, I did some work to provide epistemic clarity, and the answer is no)

If we cannot browse reliable information, if we lack domain expertise, time, or energy to deeply engage with the topic, we have to rely on our trust networks to reach actionable certainty on any topic to navigate modern life. However, relying solely on trust networks, not expert institutions, scientific consensus, or factual reporting to inform one’s opinions is dangerous in a world of fragmented realities. Fake experts, political commentators, and other attention stealers have a far wider reach than actual trustworthy sources, and usually stronger personal appeal or skill to manipulate us into trusting them. (I’d even say: never trust an influencer, always look for what the boring ‘institutions’ have to say)

Even worse, in an ideologically polarized environment, the system-imposed need to outsource opinion formation to trust networks will often result in our dependence on unreliable proxies already in our ideological network, or demand of citizens to choose the most stomach-able lowest common denominator tribe ideology they can live with to get their information from.

This leads to absurd social phenomena, for example, the association of ineffective Ivermectin prescriptions with political affiliation, or watching a specific cable news network reducing COVID-19 vaccine compliance.

Now let’s quickly think back to chapter 1, several systemic problems for democracy arise when the complex system we are part of is experiencing noise. Again, the scientific literature is nuanced here, a little bit of noise is normal and can be overall good for the robustness of a system (Tsimring L., Rep. Prog. Phys., 2014 , Junge K. et al., Systems Research and Behavioral Science, 2020), whereas large amounts of noise can throw systems out of equilibrium (Tyloo M. et al., Phys. Rev. E, 2019). But what could happen if noise completely drowns out any useful signal?

High noise means that communication breaks down between individual elements of a network that need to act together to fulfill a function.

This can manifest in multiple ways, one obvious example is the inability of feedback loops to exert regulation, another is a faulty allocation of resources within the network, yet another would be the decoupling or separation of elements from the larger system. "

Expand full comment

I understand this dilemma.

Ask 100 people with doctorates in theology if God exists (they are the domain experts, after all) and you will probably get results very biased towards the affirmative.

Ask 100 physics PhDs if quantum mechanics is real (they are the domain experts, again) and you will get very similar results.

Personally I would take the word of one of these groups but not the other. But there is no clear line separating dispassionate objective describers of reality (ha!) from a tribal cult, it is a gradient thing.

Expand full comment

> while the virologists who conducted the essential work have been stuck in a nightmare for years for just doing their jobs; getting death threats, white powder letters, having to hire security guards, scrape the name off their labs from institute's map etc...

Wow, this is quite bad, I hope people like this will not come here.

But, it seems to me the debate between Scott and Saar was mostly polite, even if they still disagree. Maybe some comments were bad, it seemed mostly ok to me, but I certainly didn't see most of them.

Expand full comment

Saar might be captured by a false belief, but that does not imply he is not well-meaning or polite. We all hold false beliefs on various topics without being lunatics about it.

Maybe a way to put it is that while almost no lab leak believer is a real lunatic, almost every real lunatic is a lab leak truther. The narrative just attracts them; an evil cabal of scientists, a devastating Frankensteinian creation, everything that went wrong during COVID-19 solely being attributable to a handful of public health officials and virologists; and unless these evil researchers are stopped by force, another pandemic is being cooked up right now...

Expand full comment

I'm not quite a lab leak believer. I do think there are more open questions than the zoonosis side is sometimes willing to admit. But the obviously-ridiculing phrasing about the evil cabal of scientists is annoying, frankly. One thing that is quite clear is that there was indeed an evil cabal of scientists. Mocking this seems wrong in both tone and substance. Would you characterize Daszak's actions as benevolent, regardless of zoonosis/ lab leak beliefs?

Expand full comment

What actions are you referring to? The Lancet letter? Yes, totally benign. The WHO mission? Yes, he did not even want to be on this thing but Kerkhove and Ben Embarek pushed him onto it. What other actions are you referring to? The DARPA proposal? Totally irrelevant for SARS-CoV-2. The funding of his Chinese collaborators? Again, where is there malice?

Expand full comment

"Benign" in the sense of a "white lie"? I don't think that it had a benign effect on the reputation of the scientific establishment.

Expand full comment

A little white lie to deter further regulations and scrutiny towards dual-use GOF research that resulted in 20 million people's deaths is a noble cause and something I am ok with!

Expand full comment

Guys, you're falling for a troll: "totally benign" for the Lancet letter, desperately shoving "PhD" even in the username, etc. It's clearly designed to be a stereotype of a certain type of person just to rile y'all up.

Expand full comment

> Lv (is this even a real name? It sounds like Roman numeral? But I guess that’s what you expect in a country ruled by someone named Xi)

Yes, it's a real name. Lv is how the Chinese commonly spell the syllable that is more formally supposed to be spelled lü. (IPA: /ly/.) The reason they do it that way is that pinyin input methods require typing "v" instead of "ü", for the fairly obvious reason that there is no ü key.

But since people are more familiar with the typing they do every day than they are with the specifics of a system of writing Chinese in foreign characters for the benefit of foreigners, they think of "lv" as being the correct spelling.

Expand full comment

Ironically it still seems plausible to me that someone got infected at the lab and infected someone working at the market. Just, I don't have any reason TO believe that.

I can't follow any of the biology, but none of it sounded like "probably came from human alterations" was more likely than "probably evolved randomly in an animal".

Expand full comment

"Probably came from human alterations" is not necessarily mutually exclusive with "Probably evolved randomly in an animal" - you could have humans deliberately infecting (with a more natural variant of the virus) and manipulating exposure among populations of laboratory animals in order to foster "random evolution in an animal" (or just to observe how it spreads between them).

Expand full comment

"So which is more likely - that somehow 20 people had COVID long before the virus was officially detected, and on a totally different continent, yet somehow a scientist looking through wastewater found the water from exactly those people and managed to detect the virus? Or that there was a sampling error, which happens all the time in these kinds of things?"

You are disregarding the possibility that the virus detected in Brazil, and other locales, was not Covid-19 but another virus similar enough to Covid-19. As far as I know, detected doesn't mean the whole genomic sequence matches SARS-CoV2; only a few distinctive segments. It should be theoretically possible to detect as Covid-19 a different, yet similar enough, virus.

Hope someone will correct me if I'm wrong.

Expand full comment
Apr 9·edited Apr 9

Your discussion of ascertainment bias is mistaken on numerous points.

1. The fact that cases in his dataset had symptom onset date in December doesn't show that there is no ascertainment bias. From around 30th Dec, hospitals could not send people to even get tested if they were not market linked. This includes many people who were still in hospital after 30th Dec who had symptom onset prior to 30th Dec.

I think this also accounts for your claim that the first five cases were market linked and cannot be subject to ascertainment bias. There is a difference between symptom onset date and when cases were found. It's also not even obvious that the first case was market linked, which is a sign of what a mess the case search was. The National Health Commission of China [reported](https://pdfhost.io/v/YCxN2y61O_The_outbreak_of_pneumonia_infected_by_the_novel_coron) ([original](https://archive.ph/gkpzs) version) the case of a 48 year old woman with no market link who had symptoms starting 10th Dec.

2. His robustness test is ridiculous, as noted in the peer reviewed Dave Bahry article that he posted in the comments. He excludes some fraction of the cases and asks whether his statistical tests still hold. This excludes false positives but we're interested in false negatives - people who didn't get counted because they weren't market linked. This is like sampling people in New York and then excluding some people from your dataset and concluding that the global population is centred on Central Park.

3. You point to the Jinyintan study saying that people were diagnosed there on the basis of clinical characteristics not a market link. The problem with this is that *suspected* cases at other hospitals could only be transferred to Jinyintan for testing if they had a market link from 30th Dec. There are Chinese news articles reporting this with quotes from doctors about how they couldn't transfer to Jinyintan https://docs.google.com/document/d/1_Cl-uVa7U8WlbssUVKNEI23xYqkEdGPMpnhwFzGZDFU/edit

For an in depth review of the reporting criteria at hospitals see this. It is amazing to think there was no ascertainment bias https://www.researchgate.net/publication/373301830_SAGO_Presentation_Limitations_of_the_official_2019_Wuhan_cases_based_on_Primary_Sources

The paper linked by Dave Bahry in the comments on your first post has statements by various Chinese bodies saying "we focused on the area around the market" for most of January for a case search that ended in mid-February. Are they lying? What is going on here?

Also, the study conducted on Jan 2nd is the Jinyintan study which is affected by the biases mentioned above, so you are mentioning this study twice, both in the Worobey quote and your following note.

4. None of this accounts for the Mr Chen case. He was only counted as a case because he transferred to Wuhan Central Hospital north of the river because his relative happened to work there. He lived 30km from the Huanan market and went nowhere near it in the two weeks prior to symptom onset on 16th Dec. This is strong evidence that loads of cases on this side of town were missed. Also, Mr Chen was one of the whistle-blower cases for the whole pandemic.

In Worobey 2021, he outlines how many of the first cases appearing at several hospitals were market linked. He doesn't grapple with how the Mr Chen case doesn't fit with this. Instead he says "he travelled north of the market shortly before symptoms". This is deliberately misleading. What he is referring to is Mr Chen travelling to Mulan mountain 90 km north of the city of Wuhan at the end of November. There is no indication that he visited the Huanan market on this tourist trip and it is more than 16 days before he developed symptoms so can't be when he contracted covid. This might also make you question how reliable a guide Worobey is. You will see in the Weissman piece that he also obviously doesn't understand basic statistics.

The George Gao link you are looking for is here https://www.bbc.co.uk/sounds/play/m001ng7c

He says they focused too much on the market in the case search and focused too much on the wild animal section in the environmental sampling

Expand full comment

I'll admit, this is probably the first time I've ever been disappointed in Scott's coverage/analysis of a topic.

I'm not even really pro–lab-leak — as long as everyone remembers how scathing the LL hypothesis coverage was and how joyfully stupid people jumped on the bandwagon because it flattered their tribal biases, I'm pretty happy; I honestly don't know which origin is actually the case and have not the time to dig real deep into it — but there are some really odd omissions and errors in this piece.

Feels almost like he's /trying/ to come down on the zoonosis side. But since I've agreed with and liked nearly everything the man's written since LessWrong days, I'll be charitable and say: it's still better than almost anyone else could manage, Scott, my friend! 👊

Expand full comment

700 million to one is about 8.8 in log10, and seems very implausible on the face of it.

Such high numbers can appear within a very specific model. The odds of coming up heads 29 out of 29 times when flipping a fair coin are roughly 1 in 537 million. But in the real world, you will observe this with a much higher frequency because sometimes your model assumptions are wrong, and the coin is not fair, has heads on both sides, is flipped in a deterministic way and so on.
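To make the arithmetic above concrete, here is a minimal Python sketch. The function names and the one-in-a-million chance of a two-headed coin are illustrative assumptions, not figures from the debate; the point is only that a tiny probability of a broken model swamps the within-model number.

```python
import math


def heads_run_odds(n_flips: int) -> int:
    """Return N such that n_flips heads in a row from a fair coin is a 1-in-N event."""
    return 2 ** n_flips


def prob_all_heads(n_flips: int, p_two_headed: float) -> float:
    """Probability of n_flips heads in a row when the coin might be two-headed.

    p_two_headed is a hypothetical prior that the "fair coin" model is wrong
    in this one specific way; a two-headed coin always shows heads.
    """
    p_fair = 1.0 - p_two_headed
    return p_two_headed * 1.0 + p_fair * 0.5 ** n_flips


within_model_odds = heads_run_odds(29)        # 2**29 = 536870912
claimed_log10 = math.log10(700e6)             # log10 of 700 million to one
real_world_prob = prob_all_heads(29, 1e-6)    # dominated by the two-headed case

print(within_model_odds, round(claimed_log10, 2), real_world_prob)
```

With even a one-in-a-million chance that the coin is two-headed, the run of 29 heads becomes several hundred times more likely than the fair-coin model says, which is the sense in which within-model odds and betting odds come apart.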

Given that Rootclaim are the experts on this kind of analysis, I think it would be part of their job to make clear which probabilities are within-model and which are real-life betting odds which they would take if this was a question which was likely to be settled with future evidence. If both Scott and Peter walked away believing that Rootclaim favored these extreme odds, that is a failure of communication on their part, and I can understand Peter's response of 'You want extreme odds? THESE are extreme odds!'

Politics is the mindkiller, and I think a lot of the actors are firmly committed to their camp, which is generally not helpful for debates or truth. Arguments as soldiers, where even a bad argument requires refutation from the opposite side, thereby binding their resources. This is a reason to not consider any arguments not brought up by either Peter or Saar (assuming that both are competent to give the best arguments for their side, which seems likely), otherwise you will be stuck with debunking bogus studies for the foreseeable future.

What percentage of papers are published by non-partisan scientists? Even the wrong origin theory has likely some facts which increase its odds, so a non-partisan researcher should end up publishing some mixture of arguments for and against the lab leak hypothesis. By contrast, a partisan scientist would start with writing on the bottom "Therefore the lab leak hypothesis is (more|less) likely than previously thought." and then try to fill the blank space above. That does not invalidate their arguments per se, but I would prefer not to read arguments from motivated reasoning. (Of course, there is the possibility that partisan researchers metagame by publishing weak arguments against their position.)

I do not think that this is overall a great topic for discussion. Policy-wise, it is not important as there is a general agreement that both zoonosis and lab leaks are an important avenue for future pandemics and we should try to avoid either. (The same is true if one debates if p(doom) is 0.8 or 0.01, btw.)

It is also unlikely to be firmly decided by future evidence. This is what I generally like about science: you start out with two positions like special relativity versus Newtonian mechanics, and then you do a bunch of experiments and eventually (at least) one side is thoroughly refuted. By contrast, COVID origins feel more like debating theism vs atheism in that it seems extremely unlikely that the question will be settled by evidence eventually.

Speaking of falsifiability, I can't help but notice that Rootclaim tends to focus on topics where it is unlikely that a firm consensus will be reached, like COVID origins or Syrian chemical weapon attacks. If their method was working as well as they claim, there should be some questions which will have firm answers in the future. Moreover, you can make a lot of money with such questions in both prediction and stock markets.

By contrast, most of what I see on Rootclaim seems unlikely to be validated by future evidence. There is some possibility that Putin having cancer or Trump having a toupee will become public knowledge at some point, but all of the whodunit stories will likely only settle on a guilty beyond reasonable doubt vs not guilty verdict, which is only a rough approximation of the objective truth in the best of cases.

Expand full comment

I really appreciate you putting so much time and effort into your original post and this follow-up. COVID origins are a really, really important question. Something which caused millions of deaths is well worth a few hundred thousand dollars to incentivize the best analysis possible.

I really hope you end up doing your own bet/debate! Especially because, as a prediction market enthusiast, I think we are badly in need of high quality, neutral, near-term sources of resolution criteria for Covid origins questions!

I have already optimistically made some markets:

https://manifold.markets/Joshua/scott-alexander-is-planning-a-covid-9aaacc1866ce

https://manifold.markets/Joshua/scott-alexander-is-planning-a-covid

Expand full comment

> Something which caused millions of deaths is well worth a few hundred thousand dollars to incentivize the best analysis possible.

Is it, though?

Whether it was the wet market or the WIV, we *know* it could have been either one. Diseases have come from wet markets before, and my understanding of the WIV protocols suggests that they are several levels laxer than they ought to be for this type of research.

But do we hear anything about the US government pressuring China to improve the safety of either one? We do not. If we did, do we have any reason to suppose that it would make any difference to China? We do not.

(And that doesn’t even address the question of whether it would make any difference to the US government even if the rationalist community converged on a unanimous opinion.)

In practical terms it’s like arguing about how many angels can dance on the head of a pin. It may be a good mental exercise that will make the exercisers smarter, but that’s all.

Expand full comment

For what it's worth, I was strongly lab-leak prior to the original post - Scott can confirm this via the ACX survey I filled out a day or two before the blog post. I had no idea Scott was going to do a post about this, so had no reason to lie on the survey.

After the original post, I updated to about 15% lab leak.

After this post, I'm officially at 0% lab leak.

Maybe this can be a point of reference the next time "don't argue with conspiracy theorists, you won't change anyone's mind and you'll only legitimize and/or more-deeply-entrench the conspiracy theorists" comes up.

If Scott wishes to confirm my survey answer, my survey had an extended comment about whether ACX had gotten better/worse/the-same, which tied in with my answer about the Roman Empire. A Ctrl-F for "parables" may or may not return my answer, if not I'd be happy to privately provide the email address I used.

Expand full comment

You sound like someone who must apply to our challenge.

Expand full comment

What's the challenge?

Expand full comment

I like the point you're making re: to-argue-or-not, but the numbers you give are sort of disquieting. /To me/ they don't exactly scream "well-calibrated judge of evidence": those are unreasonably large updates, especially given being /strongly/ pro-lab-leak at the start, and especially given the 0% at the end (that's hard to justify even for much, much more dubious propositions — and note this back and forth hasn't shifted /Scott's/ lab-leak-credence down from 10%: nothing more conclusive here than in the first debate).

Expand full comment

The US Natsec folk actually care a lot about the narrative on lab leak; it's a great accusation to make to disrupt CCP rule in China, but better to hold in reserve for the right time or as part of existing deterrence (and throwing accusations like that around willy-nilly, based on research done by random bloggers and play-money engines, damages the reputation of democracy among authoritarian-leaning governments around the world).

Expand full comment

> He says:

>> As soon as I get there, a doctor diagnoses pneumonia. So that’s why my lungs are making that noise. I am sent for a battery of tests lasting six hours.

> And then says that he went home either that day, or the next.

>> Day 13: I arrived back at my apartment late yesterday evening. The doctor prescribed antibiotics for the pneumonia but I’m reluctant to take them

Having recently been diagnosed with pneumonia at a Chinese hospital, this is much closer to what happened to me than the later Fox version is. The "battery of tests" consisted of a chest X-ray, a blood sample, and a urine sample, and it took a while, but that is largely because it took me a long time to be able to pee. At one point, a nurse knocked on the bathroom door to check if I was all right. After that, it was just waiting in an examination room until a doctor came in saying "this X-ray clearly shows that you have pneumonia". All together the process, including travel to and from the hospital, took somewhere between 4 and 7 hours.

The antibiotics seemed to cause weird auditory hallucinations, but I was not hesitant about taking them. It would have been difficult for them to be worse than the untreated pneumonia. My major symptoms were a fever of 39 C, unwillingness to eat or drink (and note - unwillingness to drink does not combine well with a high fever!), pretty questionable lucidity much of the time, and on two occasions a loss of balance while standing up. On the second of those occasions I hurt my ribs pretty badly by falling into my kitchen furniture.

I had no shortness of breath and my respiratory tract didn't make any noises that it doesn't make "normally". This has left me with a slightly funny feeling when I see people describing pneumonia as mainly about lung problems. Of course it is a lung problem, but I seem to have gotten everything but the lung-related symptoms.

From speaking to friends, I get the sense that my case was worse than average. (There was an epidemic of pneumonia going around at the time.)

Expand full comment

As another way for Saar to demonstrate the success of his technique, why doesn't he enter forecasting tournaments (or make a Metaculus account)? I know you mentioned this in your original post Scott, but I wish you'd emphasized it more here in section 3.3. If he quickly rose to the top of the leaderboard on Metaculus, I'd find that pretty compelling. And the fact that he hasn't done that (AFAIK?) makes me lose a lot of interest in his claims.

Expand full comment

We're working now on adjusting the model to handle predictions. We would still need to improve our capacity a lot in order to win forecasting competitions. The process is very slow today.

Expand full comment

Your coverage of the Pekar et al replication problems is weirdly incomplete.

It might be worth mentioning that their first correction required lowering their significance threshold to preserve the conclusion.

It is definitely worth mentioning that the same guy who forced the first correction has a second correction which actually invalidates the conclusion. The first correction was about bugs in the code, the second was about a fundamental error. Authors have not responded to this yet, but given that they acknowledged the earlier errors the challenge is very credible.

On the Worobey et al criticisms, you quote as rebuttal a preprint written by a couple of highly conflicted scientists. One of the spatial statisticians (Chiu) has replied that the paper was even worse than they had originally thought and basically dared the virologists to publish their rebuttal.

On the timing, I believe most papers estimate an earlier origin than late November.

Expand full comment

> For what it’s worth, my timeline of Chinese denials and coverups looks like this:

> 𝗠𝗮𝗿𝗰𝗵: COVID was a US bioweapon, or possibly came from Italy.

You accidentally touched on something that really bothered me earlier in the piece: Peter's summary of covid detection dates.

I clicked through to read his post, and he discusses problems with detection in Spain and Brazil, while completely glossing over Italy. Or, to be more specific, he mentions Italy quite a bit -- he says that they have an early detection date, in December 2019, and that although this is a surprise when compared to the date at which outbreak becomes obvious in some sense, it appears to be a real finding, because it's supported by other early detections between December and whenever it is (February?) that outbreak was formally declared.

But of course the big news about covid detection in Italy was that it was detected in November. Peter must be aware of this, and the argument he presents for why the December detection looks real automatically extends backward to November - if detection between January and March means the December finding is real, then there's detection between December and March, which should mean the November finding is real.

It's very, very weird that this goes completely unmentioned.

Expand full comment

A November outbreak in Italy would have made Italy the epicentre of the entire outbreak.

Expand full comment
Apr 9·edited Apr 9

So would a December outbreak in Italy, which we officially believe in. Was Italy the epicenter of the outbreak?

Should we rethink our ideas about what necessarily follows from what?

If we're cataloguing early detections of covid, should we even bother to mention ones that made a big news splash at the time, or would it make more sense to pretend they don't exist?

Expand full comment

So why would a Sarbecovirus found in Yunnan and South East cause an outbreak in Italy? Do you think if say for example China did discover antibodies in waste water earlier than their officially stated cases that they would share that information?

Expand full comment

> Do you think if say for example China did discover antibodies in waste water earlier than their officially stated cases that they would share that information?

Depends. If they found evidence of it from 2018, 2017, and 2011, probably, yeah. That would indicate that whatever happened in 2019 was apparently unrelated to the virus, which seems like the kind of thing they'd be interested in and see no reason to hide.

Same goes for March 2019. Less so for October.

> So why would a Sarbecovirus found in Yunnan and South East cause an outbreak in Italy?

I have no particular model of this, but it seems unnecessary, since your question is just as interesting when we assume that the outbreak in Italy "only" started going in December. That is also "too soon". As I just remarked, a November outbreak in Italy would have made Italy the center of the whole thing, and that's also true of a December outbreak. So if we're ruling out a November outbreak on those grounds, why do we accept the December outbreak?

Why did it start an outbreak in Iran? Everyone accepts that, but the jump to Iran is no more likely than the jump to Italy.

Expand full comment

The early Italy “detection” has mutations characterizing the B.1 lineage when sequenced. The lineage that happens to DEFINITELY not have existed in November 2019 and that is the expectation for a contaminated sample collected in Italy in Spring 2020.

Expand full comment

That can't be a reason it isn't mentioned in Peter's review - he rejects the Brazilian early detection for that very reason.

Expand full comment

I can't follow your point at all here but fwiw Peter mentions B.1 in Italy and Brazil here -- https://www.astralcodexten.com/p/practically-a-book-review-rootclaim/comment/52716610 -- the Italy paper claiming that the Brazil result confirms that they are not looking at an artifact is outrageous. The first time this was noted publicly was in the context of it obviously being an artifact in the Brazil data -- the B.1 mutation is the *last* mutation, chronologically, to occur on top of lineage B to make B.1; apparently it also shares a mutation with a B.1.1.33 sublineage that was even sampled in Santa Catarina in March 2020.

It's unfortunate in academia that there's very little incentive to publish boring results or we would hear about all of the false positive PCR results from 2019 samples that didn't stand up to scrutiny.

Expand full comment

You still haven't read Weissman.

You spent 15 hours watching videos, probably as much again talking to people - mostly the same two people or people influenced by them!

You could easily read Weissman. You don't have to check his math any more than you checked the math on Pekar et al. You don't have to believe his numbers: his review of the evidence includes things that you don't seem to have heard about. You link to him only at the end of a long post, and you don't even mention that his numbers make Saar look less like an outlier. This is really not the Scott Alexander that I am used to.

Expand full comment

Scott has probably spent upwards of 100 hours of extensive time on this, between reading all the comments, verifying the arguments and skimming citation links. This isn't like 100 hours of a video game, or 100 hours of writing a blog post on what he believes and wants to communicate, but intensive analysis, looking for the weak and strong points of each side and trying to understand their importance with respect to his internal understanding.

At some point, you have to understand that Scott Alexander is a human being who can understandably get sick of discussing a topic, and not a blog post generating machine, despite all evidence for cyborg Scott. And asking for what could potentially be another 20 hours of investment is impolite. Please don't pester him like this, it just makes it more likely he gets more averse to this kind of deep dive and still doesn't do what you're asking.

Expand full comment
Apr 9·edited Apr 9

Quantity is hardly a substitute for quality of research. I too have spent over 100 hours looking into this, and I seem to have looked at far more sources than Scott. I only keep mentioning Weissman because he is easily the most careful and thorough source I've seen, and he links to most of the others. I've learned a great deal from Scott's past writings and frankly I'm shocked at the approach here.

Your claim that it would take 20 hours to read Weissman's analysis is simply wrong. Certainly it would be a matter of minutes to mention Weissman's current odds at this writing, or to mention that Weissman reviews ~10 other informed Bayesian analyses which mostly came up with odds closer to Rootclaim than to Peter and the judges.

While I appreciate that many people here know Scott personally, he is a public intellectual now and if he's going to weigh in on this question it is fine to criticize him for epistemic sloppiness.

Expand full comment

Yes, I thought it quite clear that I believe his time was high quality, and you have not given me any reason to think your time was spent likewise, aka, if you have object level points you should make them instead of talking about how sophisticated your understanding is.

The 20 hours claim is from the fact that you, or someone like you, would complain if Scott didn't trace the citation links, or give good counter arguments, or didn't address a particular point people who like Weissman's analysis did *not* emphasize beforehand.

I'll add that if you *really* thought there were low commitment ways to do this, you could have posted those numbers yourself, and Scott may just copy and paste them in. Make things easy for someone if you're going to impose on them!

I'm a pretty low quality commentator, if I wanted to make a point I feel good about defending, usually I'm spending at least half an hour per paragraph to research, read and incorporate some point, even if it's as inane as someone suggesting that no one in effective altruism is donating money. The blog post you are referencing

1. Appears to use math that Scott dislikes, as well as math he doesn't feel super confident on

2. Is very long and contains many sub points.

So I can easily imagine 20 hours being spent on it, and at the end there will still be someone protesting that their particular set of objections weren't answered.

The answer to infinite pits of intellectual-labor-demanding-and-also-flesh-eating tapirs, is, in many religions and cultures, "just walk away".

Expand full comment

Nobody needs to pay close attention to the math. It's basically just multiplying likelihood ratios, just like everybody else. The intimidating math is just a systematic way to discount ratios to recognize uncertainty, and it tends to shift the odds a bit toward zoo from what it would have been without the discounts. Almost all the disagreement with Scott comes down to two factors.

1) For multiple reasons, explained in the text, the atrocious Pekar et al. paper is not used and the location cluster of the deeply flawed Worobey paper is not used. The reasons are

a) P(Wuhan|market zoo)/P(Wuhan|lab) is ~10^-3, not 10^-2, based on official Chinese wildlife trade stats.

b) The lineage at the market was not a candidate for the ancestral spillover.

c) There was strong evidence of ascertainment bias in the locations, although one can contrive other stories that would account for the distributions.

d) The internal DNA-RNA correlations for SC2 failed to show the signal that most of the actual animal coronaviruses showed. (Bloom)

e) Market outbreaks were far more common than the Worobey odds assumed.

2) I recently updated to include the restriction enzyme site pattern. As first proposed it suffered from a serious multiple comparison issue and shaky connection to a lab scenario. After Emily Kopp found the detailed DEFUSE plans, those problems were removed, justifying serious calculations by multiple methods of the probability of such a pattern under zoonosis.
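The "just multiplying likelihood ratios" step can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not Weissman's actual calculation: the `discount` parameter is a hypothetical, deliberately crude stand-in for his uncertainty discounting (it simply shrinks every log ratio toward zero), and the example ratios are made up.

```python
import math


def combine_odds(prior_odds: float, likelihood_ratios: list[float], discount: float = 1.0) -> float:
    """Multiply prior odds by a series of likelihood ratios.

    Work in log10 space so each ratio contributes additively. `discount`
    (hypothetical, between 0 and 1) scales every log ratio toward zero to
    represent doubt about the ratios themselves; 1.0 means no discounting.
    """
    log_odds = math.log10(prior_odds)
    for ratio in likelihood_ratios:
        log_odds += discount * math.log10(ratio)
    return 10 ** log_odds


# Made-up ratios: each is P(evidence | hypothesis A) / P(evidence | hypothesis B).
example_ratios = [10.0, 0.5, 4.0]

undiscounted = combine_odds(1.0, example_ratios)            # 10 * 0.5 * 4
discounted = combine_odds(1.0, example_ratios, discount=0.5)

print(undiscounted, discounted)
```

The disagreements listed above are then about which ratios go into the list and how large they are, not about the multiplication itself.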

Expand full comment

Oh, hi Mike. Thanks for being super thoughtful and summarizing your post. I'm going to be rude and not reply to your points, because I want to post now.

I haven't read your post, although I plan to do so in the future, so if the answer is "RTFA", I'll gladly accept it, BUT:

1. Let's say there was lots of ascertainment bias. We're missing the elderly collapsing in the streets, lab COVID infected raccoon dogs are biting children who are then kidnapped by genetically engineered Chinese supersoldiers etc.

How is it possible that the pandemic becomes the size that it was if we're missing 5~ other clusters, which implies 2 less doubling cycles are required?

The answers I can guess are

a. Peter is mistaken that transmission rate was mostly uniform. What he actually observed wasn't "transmission rates given one case of Covid that spreads", but specifically "transmission rates given some kind of super spreader event". Aka yeah, sure COVID causes pandemics, but in reality we roll something like 10 dice in each country with cases coming, even if 9 of them end up in single case chains that end shortly, you low roll one case and that's what ends up seeding the initial case (because some idiot decided to go visit grandma at their retirement home, or cough up their lungs at a supermarket)

b. Initial lab leak Covid has different transmission characteristics than pandemic Covid, where for some reason the lab leaked version doesn't have high transmissibility, then there's some mutation event that happens centered around the wet market that kicks off the pandemic.

c. Lab leaks were happening all the time, but we got unlucky and happened to get a highly transmissible one going to a wet market (maybe combined with b above)

d. I'm talking out of my ass with respect to transmission coefficients, there can be tons of variance in how pandemics turn out, and the "transmission coefficient" is just a number that retrodicts but not predicts disease spread. (Aka, if I get a transmission coefficient of 3, that is mostly about how the pandemic proceeded and that if you isolated transmission vectors, you can find dramatically different causal and distinct pathways.) This is a pretty half assed position because I myself cannot think of "ways transmission coefficient can be wrong".

I don't understand, in the case where there *is* ascertainment bias AND we are sure that Covid doubles in about 3.5 days, we end up with the pandemic happening 3 months later, rather than 1-2 months later. Is it one of the points above, or something else in the article?
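For reference, the doubling arithmetic behind this question can be written out as a small Python sketch. It assumes pure exponential growth at a fixed doubling time, which is exactly the assumption under dispute, and it reads "missing ~5 other clusters" as the true case count being about five times the counted one; both are assumptions for illustration only.

```python
import math


def extra_doublings(undercount_factor: float) -> float:
    """Number of doublings needed to grow by undercount_factor."""
    return math.log2(undercount_factor)


def timeline_shift_days(undercount_factor: float, doubling_time_days: float = 3.5) -> float:
    """Days of growth that an undercount of this size corresponds to,
    assuming constant exponential growth (an idealisation)."""
    return extra_doublings(undercount_factor) * doubling_time_days


doublings = extra_doublings(5.0)        # a bit over 2 doublings
shift = timeline_shift_days(5.0)        # roughly 8 days at a 3.5-day doubling time

print(round(doublings, 2), round(shift, 1))
```

Under these assumptions a fivefold undercount moves the timeline by about a week, so anything larger has to come from growth that was not uniform.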

I'm also just incredibly curious how Peter can make statements like "impossible to be found in a lab because we can't keep the virus alive" and seem to not be challenged on what I think is an incredibly overstated claim.

These are the cruxes in my mind, re: ascertainment bias and lab leak feasibility. I don't think that not answering means anything more than "don't want to reply to annoying internet comments".

Quick partial thoughts. Some early reports did in fact show much slower growth outside the market-linked cases. R0 is not a pure biological constant but also very strongly dependent on details of the human environment, which is why NPIs work. Populations are structured. Drop an agent in at random and it may or may not hit a high-R0 subpopulation. If it doesn't go extinct, it will find the high-R0 groups. Highest eigenvalue wins; that works both for population structure and for possible sequence evolution, although the latter wasn't huge. Scott and Peter don't distinguish between mere details in prefactors and mere details in exponents. Details in exponents matter.

The most plausible picture fitting phylogeny, excess mortality, Italy wastewater,... would be a spillover in Oct. 2019 rattling around fairly slowly mostly not in elderly people and finally getting impossible to officially ignore when it got a good foothold in the HSM.

I include references in my long blog.

"How is it possible that the pandemic becomes the size that it was if we're missing 5~ other clusters, which implies 2 less doubling cycles are required?

The answers I can guess are

a. Peter is mistaken that transmission rate was mostly uniform. What he actually observed wasn't "transmission rates given one case of Covid that spreads", but specifically "transmission rates given some kind of super spreader event". Aka yeah, sure COVID causes pandemics, but in reality we roll something like 10 dice in each country with cases coming, even if 9 of them end up in single case chains that end shortly, you low roll one case and that's what ends up seeding the initial case (because some idiot decided to go visit grandma at their retirement home, or cough up their lungs at a supermarket)"

Perhaps you aren't interested in anyone else's take, but isn't the simple version basically what you postulate in (a)? We know for a fact that Covid transmission is extremely non-uniform - it has extremely high overdispersion. The vast majority of onward spread is driven by a small % of infected, and most chains of transmission rapidly die.

Suppose at some arbitrary t0 there are 5 people infected, one of whom subsequently kicks off the HSM cluster. Something like 70% of infected individuals infect 0 others. So there's a 24% chance all 4 others die immediately. Conditional on not infecting 0, the most common number of infected is 1. So in the 76% of other occurrences where it doesn't die immediately, generation 2 will often have <= 5 cases too, and thus have another good chance of dying at that point. (Obviously the math quickly gets messy and it's best to simulate it out). But even if these other clusters survive, there's no reason they should all be roughly the same size a month later. Rather, because it's all explosive, a small subset of transmission paths will tend to dominate the total. My guess (I would be happy to simulate this out with whatever numbers we can find if you care) is that the biggest cluster will have the vast majority of cases after say a month.
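
A rough sketch of the kind of simulation offered above, in Python. Every number in it is an assumption chosen for illustration, not something established in this thread: offspring counts follow a negative binomial with mean `R0 = 2.5` and dispersion `K = 0.1` (picked because it makes roughly 70% of infected people infect nobody), and "a month" is treated as six generations. The function names are hypothetical.

```python
import math
import random

R0 = 2.5  # assumed mean number of secondary infections per case
K = 0.1   # assumed dispersion; a small K means heavy overdispersion


def p_zero_offspring(r0=R0, k=K):
    """Probability that one infected person infects nobody (negative binomial)."""
    return (1 + r0 / k) ** (-k)


def poisson(rng, lam):
    """Poisson draw via Knuth's method, done in chunks so exp(-lam) never underflows."""
    total = 0
    while lam > 0:
        step = min(lam, 30.0)
        limit, p = math.exp(-step), rng.random()
        while p > limit:
            total += 1
            p *= rng.random()
        lam -= step
    return total


def next_generation(rng, n, r0=R0, k=K):
    """Total offspring of n cases: a gamma-Poisson (negative binomial) draw."""
    if n == 0:
        return 0
    return poisson(rng, rng.gammavariate(n * k, r0 / k))


def chain_sizes(rng, seeds=5, generations=6):
    """Cumulative cases descended from each of `seeds` initial infections."""
    sizes = []
    for _ in range(seeds):
        current = total = 1
        for _ in range(generations):
            current = next_generation(rng, current)
            total += current
        sizes.append(total)
    return sizes


def mean_largest_share(trials=500, seed=0):
    """Average fraction of all cases that belong to the single biggest chain."""
    rng = random.Random(seed)
    shares = []
    for _ in range(trials):
        sizes = chain_sizes(rng)
        shares.append(max(sizes) / sum(sizes))
    return sum(shares) / len(shares)


if __name__ == "__main__":
    print(f"P(a case infects nobody)     = {p_zero_offspring():.2f}")
    print(f"P(4 other seeds die at once) = {0.7 ** 4:.2f}")
    print(f"mean share of largest chain  = {mean_largest_share():.2f}")
```

Changing `R0`, `K`, or the number of generations is the quickest way to see how sensitive the "one chain dominates" guess is to these assumptions.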

This is comparable to Marxists that say you haven't "read the theory" IMO.

I can't figure out the hanzi for researcher Lv, so this is just a guess, but I've occasionally seen v used to romanize what is more commonly represented as ü. Lü is a moderately common Chinese surname.

That's exactly what a Chinese friend told me.

This has been awesome

"Saar mentions that there are several other possible sources like restaurants or farms. I think Peter demonstrated during the debate that pandemics are unlikely to start in rural areas, so farms aren’t that important. Restaurants mostly source their products from wet markets. During SARS1, some pandemics started in restaurants because they kept the civets in cages next to the diners (like how some Western restaurants keep lobsters). After SARS1, restaurants stopped doing that and became a less likely spillover location."

I think it's important to focus on this: "Restaurants mostly source their products from wet markets."

This implies that, if the cluster of cases first started around a restaurant, we'd treat that as evidence of zoonosis - that is, this actually *undermines* the idea that it would be a weird coincidence for the first case to appear in a wet market, because restaurants belong to the same reference class.

To elaborate on this: It's a big coincidence that the first case showed up near the research center, because the research center is nearly unique, in that it was researching exactly the kind of disease COVID was. If we restrict our consideration to only two cases - lab leak, and any other cause - then this is a -huge- update in favor of a lab leak, because we'd expect a lab leak to show up near this specific research center, but we wouldn't expect an "any other cause" to show up near this specific research center. So this is very strong evidence of a lab leak.

This is huge evidence; we must counterbalance it with equally huge evidence. So - would we expect a zoonotic origin to show up in the wet market, specifically? Or, alternatively framed: How many different locations would we expect a zoonotic origin to appear in, weighted by their actual likelihood of showing up there? Let's call this set of locations the "reference class".

You've just argued that restaurants are a part of the reference class - which *reduces the value of the evidence for zoonosis*. Zoos which take (for exhibits) or treat wild animals are valid reference cases. Anybody exposed to the logistical chain involved in transporting mammals (there's no reason to favor raccoon-dogs except that they show up in this specific wet market - any feline, mustelid, or rodent, to name three entire families of mammals that aren't even the complete set of entire plausible families of mammals, would also work).

The reference class for locations that we'd expect a zoonotic origin to first appear in is -huge-. This specific wet market is not that special, it only looks special if we constrain our reference class in an extremely post-hoc manner, such as limiting ourselves to places with exposure to raccoon-dogs - when the only reason we're considering raccoon-dogs is because they were the most likely candidate vector in the specific location. If the most common animal in the wet market where it first showed up was Siberian weasels, we'd similarly be tempted to artificially limit our reference class to places with Siberian weasel exposure.

So the location the virus appeared in *doesn't actually provide much evidence for a zoonotic origin*, because the reference class is enormous, and so it isn't actually very surprising that the first identified site happened to be a part of that reference class. The more places you can identify that are plausibly *part* of that reference class, the less evidentiary value you should place on the place it actually first showed up; you've noticed that restaurants are part of that. This should cause you to update away from zoonosis!

So, from the information about where it first showed up, we have very strong evidence of a laboratory leak (of some kind - a zoonotic origin does not exclude the pandemic beginning with a laboratory leak; the disease could have arisen from mutation of a prior strain in laboratory animals, for example), and very weak evidence (it's still evidence, it's just very weak!) of a zoonotic origin.

Personally, I think the overall evidence in fact points towards "Zoonosis in laboratory animals or infected animals brought into the laboratory, with or without human intervention accelerating it, followed by a laboratory leak" as the most likely origin (human intervention accelerating being, for example, if the laboratory was deliberately evolving a virus in laboratory animals by manipulating group exposures). This also includes "Natural zoonotic origin in a sample animal brought into the laboratory and then leaked". The cleavage site fairly strongly suggests mutation, and the location information very strongly suggests laboratory leak.

This is a very good point. For the lab leak, we're at "what is the chance of a bat coronavirus pandemic occurring in one of the two cities (Wuhan, Chapel Hill) in the world that host the leading research labs on bat coronaviruses, conditional on a lab leak" (Not 1, but substantial). And Wuhan (beyond being in China) is pretty unlikely under zoonotic/wet market.

For zoonotic, the reference class for "what is the chance of the first cluster showing up here" is way broader: there are multiple wet markets within Wuhan, there's a swathe of other locations that are just as animal-trade linked (eg restaurants), and the seafood market (by being a crowded seafood market) is hardly super unlikely as an early super spreader regardless of source.

> You've just argued that restaurants are a part of the reference class - which *reduces the value of the evidence for zoonosis*.

Does it? If the restaurants are getting food from the wet market then at worst I would expect that the wet market and a restaurant would be potential spreaders, but not the restaurant on its own.

Yes.

Consider two cases, and ask yourself which is more surprising:

You win the lottery

Somebody with your first name wins the lottery

The more limited your reference class, the more surprising - that is, the more evidence is provided by - a particular example fitting into it.

If the reference class for "Could be a result of zoonosis" is "Wet markets that sell raccoon-dogs", and the first cases all appeared in a wet market that sold raccoon-dogs, this is stronger evidence for zoonosis - it is more surprising - than if the reference class for "Could be a result of zoonosis" is "Anywhere in the world". A hypothesis that predicted the first thing has better evidence for it than a hypothesis that predicted the second thing.

The larger your reference class, the less surprising it is when a case fits into it, and the less evidence it provides for the validity of any conclusion utilizing that reference class as evidence.

Edit:

This means it is CRITICALLY IMPORTANT to pay attention to post-hoc adjustments to, or definitions for, the reference class! If I say "In one week, somebody with the first name John will win the lottery", and John Smith wins the lottery the next week, that's really good evidence for - well, something! But if a week after the lottery I say "Last week, somebody named John Smith will have won the lottery" - well, that isn't surprising at all, even though it *seems* very specific! This is because in the second case, we've defined the reference class in a post-hoc manner.
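
To put the reference-class point in numbers, here is a minimal Python sketch with made-up figures. It assumes each hypothesis spreads its probability evenly over the sites in its reference class, which is a simplification, and the site counts are purely illustrative rather than estimates for Wuhan.

```python
def site_likelihood(n_sites):
    """P(first cluster at this particular site | hypothesis), assuming the
    hypothesis treats all n_sites in its reference class as equally likely."""
    if n_sites < 1:
        raise ValueError("reference class must contain at least one site")
    return 1.0 / n_sites


def likelihood_ratio(n_sites_a, n_sites_b):
    """How strongly the observed site favors hypothesis A over hypothesis B."""
    return site_likelihood(n_sites_a) / site_likelihood(n_sites_b)


# Hypothetical counts, for illustration only.
narrow_class = 4     # e.g. "wet markets selling one particular species"
broad_class = 400    # e.g. markets + restaurants + zoos + transport chain
baseline = 10_000    # sites where a cluster could show up for unrelated reasons

print(likelihood_ratio(narrow_class, baseline))  # narrow class: strong evidence
print(likelihood_ratio(broad_class, baseline))   # broad class: far weaker evidence
```

The same observation counts for a hundred times less once the reference class is a hundred times larger, which is why it matters whether the class was fixed before or after looking at the data.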

Apr 11·edited Apr 11

> Natural zoonotic origin in a sample animal brought into the laboratory and then leaked

This is a far more important point than anyone ever credits. A lot of smart people (like Scott) invested a lot of energy in telling us that the "lab leak" theory was completely insane, because they equated it with a wacky man-made virus theory. When the dust settles and it becomes painfully clear that that's never what the lab leak theory meant -- that the question is predicated almost exclusively on location -- somehow or another those smart people all pivot to "oh yeah that's what we meant too, and it's definitely still not a lab leak, for all these slightly tweaked reasons". The motivated reasoning at play to avoid losing face and/or avoid tribe alienation is painfully obvious.

I have absolutely no idea whether, to what extent, and how COVID-19 originated in part due to human design, and I expect I never will. But I started at 90% at some kind of lab-related first human infection and now I'm at, let's say, 99%. At this point there is no possibly-extant evidence which can overcome the proximity to the virology lab, so the only remaining way to disprove a lab leak would be to prove the virus first spread to humans somewhere other than Wuhan. To be explicit, China's official stance is more credible than Scott's, and that's saying something.

> A lot of smart people (like Scott) invested a lot of energy in telling us that the "lab leak" theory was completely insane

Can you quote Scott doing this?

How does the clustering of early cases around the wet market prove it originated there? It only proves the fastest way to transmit is via dark, cool, wet surfaces. An infected lab worker might just as well have gone to the market to buy some fish or something, and the spreading might have started there. Every city has a wet market, but only one has a virology center studying coronaviruses. I was 50/50 after reading the original post, but over time I started sliding back towards the lab leak side.

My understanding is that Covid is not transmitted through fomites, though?

Certainly clustering at the wet market doesn't "prove" it originated there, but it's evidence in favor. As Scott says in the post, if it originated in the lab, why wouldn't we see multiple early clusters as the researchers went to multiple spots? Why just the one at the market?

Indeed, not fomites. Basically never happens. But change that to "enclosed locations with lots of people breathing in the same air repeatedly" and it goes through fine (actually better tbh).

Why just one at the wet market?

Many possibilities (eg there were, and we missed them... surely early testing missed a huge share of cases), but the simplest one is "most people infect 0-1 people and spread is driven by super spreading events/people."

But, one point Peter made repeatedly is that while the first cluster was at the wet market, it does not seem to have been a "superspreader event" in the sense of being more virulent; people got infected there at the same rates as other places. So we have the same problem.

This is still a very confusing point, for me. Let's forget the greater debate for a moment. However Covid got into the market - why _wasn't_ there a super-spreader event? Whether Saar was right or wrong in general, the arguments he presented for the market being conducive to accelerated spreading seem perfectly reasonable. This seems important - if one explanation is that we simply don't understand well Covid's behavior in its early stages, this is relevant to some key claims that were brought up. If another explanation is an increased level of immunity due to previous exposure - that's obviously relevant too.

I'm pretty sure there was one - I think Peter's arguments against this are just badly wrong. (He seems to think the law of large numbers holds at small n, and was seemingly unaware that covid spread is massively overdispersed). But random dumb luck can happen too. Peter just seems to think R0 in a crowded indoor seafood market should be the same as anywhere else, generation after generation.

"Peter just seems to think R0 in a crowded indoor seafood market should be the same as anywhere else, generation after generation." ... but... it was. That's exactly the confusing part. Is the explanation "random dumb luck"?

Did the disease spread more slowly at the Wuhan wet market than it did at wet markets elsewhere? That would be an anomaly in need of an explanation. If the Wuhan wet market spread is similar to other wet market spread, that just suggests that wet markets, while more likely to expose people to novel diseases than other settings are, aren’t a greater source of person to person transmission than other crowded indoor spaces, which doesn’t seem anomalous to me. (Do everyday diseases spread faster in wet markets than other places?)

That’s not the correct reference class, though.

Doubling every 3.5 days is the rate we expect for the entire population, not for a sub-population heavily interacting within crowded indoor spaces.

Most super-spreader events were indeed crowded indoor spaces: weddings, churches, this kind of thing. So we would expect wet markets to be at least as conducive to spreading as those spaces. If they’re not, if they’re just like an average environment people spend time in, then it already is an anomaly.

And that’s just to start with. Cold _is_ conjectured to help covid spread, as does speaking loudly (as one would in a busy wet market, I assume).

But that’s still not all. Later super-spreader events would have been at least mildly tempered by immunity. Not all of them, and only for populations that were never before exposed. But here we’re talking about literally optimal conditions for spreading.

So yes, I do think there is an anomaly to explain.

If I understand the claim (of Peter's) that you are referencing correctly, I think it's just wrong. But it's also fairly irrelevant as a response to this point. Most chains die immediately (~70% of infected infect 0 others). More again fizzle out quickly. If the first chain infects someone who visits HSM, and any others (if they exist) have the modal "die out quickly" outcome happen, voila

Note: Nobody is claiming that COVID originated at the wet market. All of the significant claims have it originate somewhere else, and then begin the initial spread in the wet market.

This may seem like a silly distinction, but it's actually pretty critical - because, wherever it originated, we must actually ask the same question: Why didn't we see multiple early clusters? Shouldn't the people transporting the animals have been infected - why didn't the restaurant where they stopped for lunch have a cluster?

Once you stop treating the disease as literally beginning in the wet market, you have the same issue with -any- origin.

We start making assumptions like the animal handlers being immune because of a prior infection that was too rural to spread, but we can just as easily assume that it was exactly one researcher who got infected, was asymptomatic, and then played video games through most of their infectious period until they decided they needed to visit the wet market. Seems plausible, right? At least as plausible as "prior infections happened but didn't spread beyond a rural community" or whatever assumptions you want to make to explain why there was only one cluster in the zoonotic case. Any such assumptions are just justifying whatever conclusion you've already reached, or trying to interpret the evidence you do have - these assumptions aren't evidence *1.

Meanwhile, as multiple other people have observed, we have multiple international cases of the first COVID spread event in a country or area being at wet markets, also without clusters. So, if your intuitions say that human-to-human spread events should have multiple clusters, and it would be very weird for there to be only one spread event and for it to be at a wet market - your intuitions are wrong.

This isn't to say that this proves anything, only that it's actually quite weak evidence.

*1 I notice a LOT of double-counting, recycling, or similarly dubious accounting, of evidence going on in the people pushing zoonosis. "Raccoon-dogs were only sold in this one market where the disease occurred, which would make it a huge coincidence that the disease occurred here for other reasons, therefore the disease must be zoonotic", for instance. Pay attention to the fact that you could swap out "raccoon-dog" with Siberian weasels; had Siberian weasels been sold at the wet market instead of raccoon-dogs, we'd be having the same argument but with a different animal. The evidence we have available says that, if it was zoonotic, it probably began in raccoon-dogs, based on their presence - but you don't get to then use that as evidence that it was, therefore, zoonotic, because raccoon-dogs were present!

Human to human transmission is common, especially in urban areas. Animal to human transmission is rare. If the first human case was at the wet market, that's where you would expect it to spread.

You don't become infectious as soon as you are exposed to the disease; your viral load simply isn't high enough. In order for this idea to work, *supposing animal to human transmission is rare in this specific situation of animals enclosed near humans and the specific disease being the well-adapted-to-infect-humans COVID-19*, we have to suppose a human got exposed at the wet market at some point prior to the first major spread event at the wet market, then came back to the wet market without going anywhere else (otherwise, by the arguments involved, we should expect to cause multiple clusters elsewhere!)

So - simply, no. If the first human case was at the wet market, the wet market is *not* where we would expect it to spread, in the case that such transmission is rare.

Couldn’t (say) one vendor get exposed by spending a lot of time around an infected raccoon dog, build up enough virus to be contagious, and infect multiple people over the course of one or more workdays? That would show a cluster happening at the wet market. As I understand it the clusters don’t mean “all these people caught it at the same moment/were in this place at the same time,” but “of the people who were infected, a lot of them were in this place recently.”

Sure! But that vendor doesn't spend 24 hours a day at the wet market, so we're once again left with the question of why there weren't other clusters.

Mind, I don't endorse this question! The entire point here is that this question applies equally *no matter the source* - so the question is not an argument against a laboratory leak hypothesis. My point is that it is, effectively, an argument against any plausible source for COVID-19 at all (and thus must be dismissed)

Apr 9·edited Apr 9

Concerning the whole question of the Chinese cover-up: My understanding of the Soviet-school cover-up strategies is the following:

1) since lower levels of authorities are afraid of reporting bad news, it is difficult for higher up authorities to know what to cover up when, since they don't have all the facts reported correctly;

2) there isn't necessarily one single narrative pushed by a high party official and for which evidence is aligned in a concerted effort. Rather, the goal is to mess around with evidence enough so that nobody can make up any consistent story.

Both of these seem in line with the Chinese cover up being shoddy, at times contradictory and irrational. The ultimate goal was not to convince the world of the "frozen lobster from Maine" origin story, but simply of the "it is very complicated, we might never really know" origin story. And this they did very well, as all these discussions illustrate.

Yes, the Wuhan local communist party tried to keep the outbreak in the wet market from Beijing. Hence the arrest of Li Wenliang in early January.

Apr 9·edited Apr 9

"Then they banned Chinese scientists from researching the origins of COVID."

And that arouses no suspicion whatsoever?

For what? Everyone already knows that, one way or another, the CCP was indirectly responsible for COVID happening, and obviously it would not be in the CCP's interest for any concrete evidence that could incriminate them to come up. It's likely that even the CCP doesn't know what the true cause of COVID was, and they don't *want* to know, because if they found the truth, there's a good chance it would get leaked.

None at all - why would it?

Scott says: "I asked a synthetic biologist about [using CGGCGG]. He said:

» “Nope. I would literally never do this if I was designing a small insert (maybe I wouldn't notice if it happened by chance with ~1 in 25 odds in a naive codon optimization algorithm as part of a larger sequence). High GC% is bad. Tandem repeat is worse. Several other perfectly fine arginine codons. And I wouldn't engineer a viral genome using human codon usage. An engineer would not do it.”

---

The opinion of a single expert in a private conversation is not a good argument. Examples of good arguments:

1. Pfizer and Moderna vaccines recoded almost all arginines into CGG.

2. Shibo Jiang inserted a furin cleavage site and used CGG for the leading arginine.

3. If indeed the FCS was part of investigating the PAA -> PRRA hypothesis (see our full post https://blog.rootclaim.com/covid-origins-debate-response-to-scott-alexander/), then PAA is already CG-rich (CCT GCA GCG) so virologists modeling how PAA could naturally evolve into an FCS could have decided to keep it CG rich.

4. Having a unique sequence can be helpful for easy tracking of mutations in the FCS during lab experiments.
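
As a quick check on the "CG-rich" descriptions in this thread, a few lines of Python can compute the GC content of the two sequences quoted here. The helper name `gc_fraction` is made up for this sketch; the sequences are the ones given above, and nothing in it bears on whether either side's interpretation is right.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C (spaces between codons are ignored)."""
    bases = seq.replace(" ", "").upper()
    if not bases:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in bases) / len(bases)


for label, seq in [("PAA as quoted", "CCT GCA GCG"), ("arginine pair", "CGG CGG")]:
    print(f"{label} ({seq}): {gc_fraction(seq):.0%} GC")
```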

He said synthetic biologist, not friend. No doubt they are friendly but it’s not a random guy.

On the other hand you are a random guy and who knows if any of that is gibberish or not.

> “I asked my friend who confirmed my opinion” is not an argument we should see in a rationalist blog.

I think there is no need to rephrase the argument; it looks like a strawman. He asked a synthetic biologist, and I don't think whether (s)he was his friend or not matters too much here.

I don't think one synthetic biologist thinking nobody would do that gives a lot of weight to the idea that nobody would do that (because I don't think people can really be confident about what other people do in their fields), but it still adds some weight.

you're right. i fixed it.

Neither the Pfizer nor the Moderna vaccine used CGGCGG for the two arginines in question. Moderna used the most common codon (per their analysis; arginine is ambiguous here) for every codon and changed only the bad ones. CGG-CGG is bad as a tandem repeat.

Human codon usage also makes sense for an mRNA vaccine. Not for a synthetic virus.

You really should not source your ideas to weird conspiracy theory articles from circa May 2020. Here is the article cited re: Shibo Jiang and an FCS: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3823846/ … it’s a system for antibody expression, not a virus, and one of three inserted arginines is coded by CGG.

What happened to the chimera theory in the Oct 2020 Rootclaim post? This all seems like replacing one poorly supported intelligent design theory with another.

The post has several attempts to discredit Reed, all of them unsuccessful.

>This is a weird inconsistency! In the Wales interview, the cat got it before him (at least that’s how I interpret “I don’t think I caught it from her”). In the Mail interview, he got it nine days before the cat.

Not sure what the claim is here. That Reed made up the whole story about the cat? That would be pretty weird. Is that really the preferred explanation and not that he just assumed he didn't notice how long the cat was sick? Remember it's not his cat, just a "kitten hanging around my apartment"

--

>In The Wales interview, it’s “the feline coronavirus”. In the Mail interview, he doesn’t know what the cat got and speculates that it might have been COVID. But also, if it was “the feline coronavirus”, how would he know? Wouldn’t you need a vet to diagnose that? But in the Mail interview, he said he didn’t leave the house for a week around the time his cat was sick. So how did he go to the vet?

Scott seems to not understand that the Mail quote is not an 'interview' but a recreation of his experience day-by-day. Unclear how exactly, but since some recollections are very specific, he might have kept a diary or perhaps recreated it from messages to friends etc. It therefore makes perfect sense that in real time he didn't know what he had and thought maybe the cat got the same disease. Later he understood better and could make a more educated guess.

---

>"I was stunned when the doctors told me I was suffering from the virus. I thought I was going to die but I managed to beat it,” he told the outlet, adding he was hospitalized at Zhongnan University Hospital for two weeks following his diagnosis.

In his earlier story, he was at the hospital for less than a day. Now it’s two weeks.

Reed never mentioned any hospitalization, and he isn't even directly quoted here. So this supposed major revelation relies solely on the reporter not misunderstanding Reed. In this case, it seems clear what happened: we know he went to the hospital two weeks (day 12) after symptoms - they simply misunderstood this to be a two week hospitalization. 

---

>But also, the doctors “told [him he] was suffering from the virus”, but this is impossible - the virus hadn’t been discovered yet.

This is from the same source. Again, much more likely that they lost the context and he was describing the later diagnosis he received. He's even quoted a few lines later "It was only when I called back a couple of weeks ago that they told me I’d had the coronavirus". 

---

>I can’t deny that it’s weird to do your regular shopping at a market an hour away, but it really sounds like he’s referring to the wet market where all the cases started here.

Again, Scott puts a lot of weight on news reporters accurately quoting and understanding everything Reed says. It is indeed very very rare for news reporters to make mistakes, but it's still more likely than a single male making a 3-hour commute to do his regular shopping.

This would all be much more convincing if the supposed lies and inconsistencies were found in the many video interviews he gave, but somehow those all consistently retell the same story of Reed accurately describing Covid symptoms in November.

Overall, these claims seem quite desperate. Reed proves to be very reliable, and all attempts to find inconsistencies in his story failed. It doesn't mean he had Covid - it's possible there was some misunderstanding (we don't even use him in our analysis). His story is, however, very useful as a litmus test for motivated reasoning. 

What is your credence that Connor Reed in fact had coronavirus in November 2019? I believe it is unlikely even if lab leak is true. Would you consider doing a challenge just on this issue?

Above 50% but not very high. Probably the most important factor is whether spillover happened in HSM, so that debate won't be much simpler than the origins debate.

Apr 10·edited Apr 10

I think it is unlikely even conditional on lab leak under the scenario you propose. So the challenge could exclude this point and be only about this conditional, making it much simpler.

Would be interested to see a calculation in which you reach >50%. The prior for “a Brit was hospitalized with later confirmed Covid in the first week of December 2019” is very low, and to my mind we have the kind of evidence for it that would be fairly likely to exist if that didn’t happen, more so than the evidence that would be fairly likely to exist if it did.

Why is it low under lab leak? There could be >100 cases at that time. Having one of them be a foreigner is not surprising. Of course the local cases do not give interviews to Western outlets.

Apr 12·edited Apr 12

Possibly 100 cases with onset by December 7 when he is hospitalized (edit: I mean to say "went to the hospital"), on a low ascertainment / many mild cases assumption (which is reasonable on the lab leak hypothesis but hurts you elsewhere since Connor had a serious case and went to hospital and was tested). No way there's 100 with onset by November 25. 10 would be a reasonable high-end and roughly aligns with about 100 on Dec 7. How many of these could possibly have been lab-confirmed? Maybe like 1 (remember we're assuming low ascertainment) before accounting for the fact that no documented lab-confirmed case with onset anywhere near this date exists, probably like 0.2 after accounting for this. Plus the fact it's a young person is like 30 to 1 (6 out of 174 cases were in their 20's in the WHO report).

There are about 40,000 British people living in China. Wuhan's a big city (more foreigners) but it's not Shanghai or Beijing (fewer foreigners), so it's reasonable to say about 400 Brits, maybe half in their 20s, vs about 2 million total people in China in their 20s. That gets me 1.5 million to 1 (2 million / 200 * 5 * 30, where the 5 is 1 / 0.2).
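The arithmetic in this estimate can be checked with a short sketch. Every input below is the commenter's own rough guess rather than measured data, and the variable names are made up for illustration.

```python
# All inputs are the commenter's rough estimates, not measured data.
population_in_20s = 2_000_000      # the comment's figure for people in their 20s
brits_in_20s_in_wuhan = 200        # half of an assumed ~400 Brits in Wuhan
expected_confirmed_cases = 0.2     # guessed lab-confirmed cases with onset by Nov 25
age_factor = 30                    # roughly 174 / 6 cases in their 20s

# Odds against a Brit in their 20s being such a case, as multiplied in the comment.
odds_against = (
    population_in_20s / brits_in_20s_in_wuhan
    * (1 / expected_confirmed_cases)
    * age_factor
)
print(f"about {odds_against:,.0f} to 1")
```

The product is only as good as its inputs; changing any one guess by a factor of ten moves the result by the same factor.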

Now the other side (false story) is much less amenable to estimates and it's true I have restricted to a very specific scenario (Brit in their 20's, early onset, lab confirmation in the story). All these factors are fairly likely features of a false story however:

Brit - Arguably a likely feature because the "mainstream" news print media culture in the UK seems more permissive to this kind of story, at least it seems to me as an American when I read a story that I am skeptical of that is basically "X tells incredible story of Y" it is from the UK a disproportionate amount of the time.

Young - Many foreigners in China are young I assume. Also arguably a person that is less mature and less established in a career is more likely to make a story like this up. At the least there's no factor that makes it much less likely to be a young person as there would be if it's real.

Early onset - Narratively I think the story roughly makes sense and is print-in-a-local-newspaper-or-tabloid level of plausible with Nov. 10 to December 10 onset. About equally likely to be any date in this range if it's made up. Much more likely to be close to the end of the range if it's true due to exponential growth.

Lab confirmation: Kind of has to be included in the story for it to be taken seriously and get this far, otherwise it could just be the flu.

On top of these considerations:

1. The story should have been pretty clearly "big if true" by March, certainly by Connor's death in October, yet there is no investigative story on it. No story with a Covid-origins angle whatsoever as far as I've seen either. I believe the likely explanation is that attempts to investigate failed to confirm it. The NY Times isn't going to write a story debunking this, they are just going to call him up and try to get confirmation and drop it if they don't.

2. I do find the story a little suspicious. Won't rehash this but I will mention the earliest interviews are The Sun and NorthWalesLive in February and, even ignoring small inconsistencies and just going by the vibe, it really feels like the story has grown in the telling when we read his timeline a month later. Of course, even true stories often grow in the telling, but the problem is that the February interview wasn't the first time he told the story (it started with Facebook posts). The Coronavirus confirmation from the hospital could easily be another detail that was added.


He wasn't hospitalized, he went to a hospital for a few hours and got medicine. That's likely because he's a foreigner and doesn't have a regular doctor/clinic. He describes a typical Covid experience for early strains. Nothing special.

author

"Reed proves to be very reliable, and all attempts to find inconsistencies in his story failed."

I feel like this is a bizarre way to describe the interaction. There are many inconsistencies, you just say each one of them was a mistake by the reporter.

Is your claim that the hospital took a sample from him (blood? sputum?), saved it for two months after a simple ER visit, then retested it (with no request from him to do so) two months later?

This isn't how I remember hospitals working, but I am just a mere psychiatrist - do any pulmonologists or respiratory care doctors want to weigh in about whether you would save patient samples and keep doing tests on them for months after the patient was gone?

If Chinese hospitals did this, wouldn't the obvious next step be to go to all the frozen patient sputum samples from November 2019 and see if anyone else had COVID? They did this with blood donations because blood donations are actually saved. I think the reason they didn't do this with patient samples (and did do much worse things, like try to resample the patients) is because hospitals generally don't save samples like this.


It's not that I "just say" it's a mistake. It's:

1. When a person gives many interviews, there will likely be a few misrepresentations.

2. When you filter all the inconsistencies, you will also find those.

3. If everything you find is ambiguous, and in video interviews there are no inconsistencies, it's fair to conclude these are all misrepresentations and the interviewee is reliable.

You then make a leap to a very different question: "did he have a positive test?".

It's definitely possible Reed believes he got a positive test but he didn't. He could've misunderstood what the hospital said, or the hospital could have mixed up something. That's probably better answered by Gilles. I'll ask him to chime in.


Is there a video interview in which he narrates his trip to the hospital? All the most suspicious parts of his story center on that (I don’t think the cat thing is a big deal).


I don't think so.


Also in The Sun article https://www.thesun.co.uk/news/10876645/coronavirus-uk-brit-virus-china-wuhan/ , apparently the first published by timestamp and probably the first interview given, he is quoted saying “It was only when I called back a couple of weeks ago that they told me I’d had the coronavirus.”

In the timeline account https://www.dailymail.co.uk/news/article-8075633/First-British-victim-25-describes-coronavirus.html he says "A notification from the hospital informs me that I was infected with the Wuhan coronavirus." Considering that, if the account is taken to be true, he probably wrote it while looking back through stuff in his phone, it is pretty hard to reconcile. Of course you can argue he got a text then called back but that's extremely strained, you'd never write that sentence if that's what happened.

Another difficulty with the Daily Mail timeline is he has the officials tell everyone to stay indoors on Jan 1. Li Wenliang was reprimanded for stirring up panic on Jan 3. Covid lockdown was actually implemented Jan 23.

I think if I were defending Reed in court I'd try to argue that he was just freestyling in the Daily Mail, writing it as fast as possible based purely on recollection (maybe they offered to pay him for the story). That solves most of the issues.


Another weird detail around the hospital trip, he doesn't explicitly say he saw a British doctor but he certainly gives the impression: "I decide to go to Zhongnan University Hospital because there are plenty of foreign doctors there, studying. It isn’t rational but, in my feverish state, I want to see a British doctor. My Mandarin is pretty good, so I have no language problem when I call the taxi. It’s a 20-minute ride. As soon as I get there, a doctor diagnoses pneumonia." Would you ever write that if you didn't see a British doctor? It seems unlikely that he just waltzed into Zhongnan University Hospital and asked for and saw a British doctor. I wonder if it can be determined whether or not there was in fact a British hospitalist or relevant specialist working at the Zhongnan University Hospital in November 2019. Seems like it might at least have shown up in a human interest story at some point if there was?


No dice on a challenge on this issue? I suggest a 3 round written debate, judged by Zvi Mowshowitz if he is willing to do it (his high fee shouldn’t be a major issue as this won’t take more than 8 hours to judge imo). I think it likely I may be able to fundraise to make the stakes more than 100K if this debate is promoted by Scott, if you are open to that. To be clear the question I want to debate is “did Connor Reed have Covid-19, more likely than not, conditional on the theory that COVID 19 leaked from WIV subsequent to a GOF experiment, and an initial superspreader event occurred at the Huanan Seafood Market due substantially to its characteristics that made it exceptionally conducive to such an event.” Your use of this case is being held up by opponents as a point of ridicule so I am sure you are eager to defend it.


This may be interesting. We'll discuss it internally. Let's move to email: info@rootclaim.com

Apr 9·edited Apr 9

The two posts did shift me towards zoonosis (funnily, the first post shifted me in the other direction because the most important arguments that were new to me were pro lab leak).

But there is one point where I think that Scott has an inaccurate model, and that is the early stage of a pandemic. I think Scott underestimates the large variance that this early stage has. For doubling times, they are smooth and consistent (what I would call "sharply concentrated"). But AFTER the initial phase, not DURING the initial phase. So, once you have 10-20 infected people, then your data looks very smooth. But with fewer people, not so much. I happen to study very abstract versions of such processes (branching processes, Galton-Watson trees, and some highly abstracted forms of epidemics). And this initial phase looks very different from later phases in all these abstract settings. It's abstract, but it's a part that I would expect to transfer to the real world. It can easily happen that you have one or two or even three initial generations where the process just fizzles before properly starting. And this is even more true for infections where the number of transmissions from a source person varies a lot (many infect no other people even in the same household, but some infect 5 or 10 or more). As it does for Covid.

This makes two main points of Scott's argument weaker:

I think we should be less sure about the early timeline. I am not talking about an additional month, but it is perfectly possible that the onset is 1-3 transmissions earlier than the model calculations, which is at least 1-2 weeks.

We should also be less sure that the first infection is at the same place as the center of subsequent infections. If the first generations of viral spread go like

1 infection -> 3 new infections -> 2 new infections -> 1 new infection -> BOOOOM,

then we will see the center around the person in generation 4. But this can be at a different place than the person in generation 1. I think Scott underestimates the likelihood of this scenario. BOOM here does not necessarily mean a single superspreading event. It can also be just a few more generations where the number of infections doubles or triples.

I am not saying that this scenario definitely happened. I think it's somewhat less likely than the opposite scenario, where there is no evolutionary bottleneck after patient zero. But Scott quite heavily relies on this scenario being unlikely. It also seems (one of?) the main disagreements with Rootclaim.
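The point about early-stage variance can be illustrated with a minimal Galton-Watson sketch. It assumes a negative binomial offspring distribution, a common modelling choice for overdispersed transmission; the values of `r0` and `k` below are illustrative assumptions, not numbers taken from this comment or from the debate.

```python
def extinction_probability(r0: float, k: float, iterations: int = 10_000) -> float:
    """Chance that a chain started by one case dies out on its own.

    Offspring counts are negative binomial with mean r0 and dispersion k,
    whose generating function is G(s) = (1 + (r0 / k) * (1 - s)) ** (-k).
    Iterating q <- G(q) from 0 converges to the smallest fixed point,
    which is the extinction probability of the branching process.
    """
    q = 0.0
    for _ in range(iterations):
        q = (1.0 + (r0 / k) * (1.0 - q)) ** (-k)
    return q


# Illustrative parameters only (assumed, not from the source).
overdispersed = extinction_probability(r0=2.5, k=0.1)   # a few cases do most of the spreading
homogeneous = extinction_probability(r0=2.5, k=1000.0)  # close to Poisson offspring

print(f"fizzle probability, overdispersed: {overdispersed:.2f}")
print(f"fizzle probability, near-Poisson:  {homogeneous:.2f}")
```

Under these assumed parameters, the same average transmission rate gives a much higher chance of fizzling when transmission is concentrated in a few individuals, which is the qualitative effect described above.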


Yes, Pr(first case at location|first super spread at location) is way below 1. Even more so (I think) when said location is highly super spread prone.


There are many theoretical scenarios by which the market cluster wouldn't be the first cluster. But they don't resolve what is, under lab leak, a highly unlikely coincidence--that the first known cases happened at the one place in the city that a zoonotic pandemic is most likely to start. In fact, a scenario like this is almost required by LL (perhaps with slightly fewer steps), but pointing out that such events are impossible to rule out statistically doesn't change the fact that zoonosis is by far the simplest explanation for the observations.

We could just as easily say, by this logic, that the first case might not even be from Wuhan, in which case is there any reason to raise LL to your attention in the first place?


This greatly depends on how you imagine LL happening. If it's from an animal "disposed of" by selling it on the side or otherwise making it to HSM, then the coincidence is not that unlikely.


You can imagine anything you want, doesn't mean it's likely.

This is just post hoc rationalization. If the cases clustered on the lab, do you think we'd be having this same discussion? Of course not. In that world, if someone did try to say "the coincidence of starting near the lab is not that unlikely if the lab were secretly buying animals from the wet market", they would be dismissed out of hand as obviously biased. But for some reason we're forced to take this gibberish seriously.

Apr 9·edited Apr 9

FWIW I think that wild -> lab -> leak to HSM is one of the more likely scenarios.

Look, there are two key facts that most need explanation. The a-priori staggeringly unlikely coincidence of a lab studying relevant viruses that had previously experienced lab leaks being very close to the first outbreak and that had provably transported animals from Yunnan to Wuhan, and the centering at HSM. Many LLers dismiss or deny the second. Many "zoonosis purists" dismiss the first. An explanation that handles both well deserves attention.

In your proposed "alternative reality", we should definitely be interested in the sourcing of animals in the WIV labs. And in this reality, we should be interested in the disposal methods practiced in the labs.

ETA: another key factor at this point is the evidence (which I believe) that the probability of deliberate engineering is low. So a theory that explains all three of these facts is direly needed, and I don't see why a reasonable candidate should be considered "gibberish".


The pandemic starting in Wuhan is not "staggeringly unlikely." This was covered in the debates. I have seen no argument justifying a prior of <1%, and several percent is more likely, particularly once you realize how many labs have been supposedly implicated. This is somewhat weird, but introducing substantial additional complexity into your model, without any additional evidence, is just epicycles.

> So a theory that explains all three of these facts is direly needed, and I don't see why a reasonable candidate should be considered "gibberish".

Zoonosis at the wet market explains all 3 facts without having to invoke total speculation about some lab worker sneaking samples out. On top of that, you have the additional coincidence that the animal at the lab (rather than one at the market, which is a much less sanitary place and with many more animals) was the one with the virus. It should be ignored until there is some evidence for it.

TBH this reminds me of an argument I once saw for a flat earth, that the earth itself was flat, but the space around it was curved (or something) so that it reproduced all of the exact same predictions as a round earth, but you could make a semantic argument that the earth is "actually" flat. Would it even make sense to call this a "lab leak" if the spillover actually happened at the market from a virus that had evolved totally in an animal, and the work the lab was doing and the safety level it was conducted at is wholly irrelevant, other than having mammal samples with coronaviruses?


"The pandemic starting in Wuhan is not "staggeringly unlikely." This was covered in the debates."

I said "a-priori". But also this was one of the weakest parts of the debates. None of the other labs was anywhere near as likely as WIV to be a source of a pandemic outbreak, all factors taken into account. Put simply, there probably was no other location on earth more likely to involve a combination of a lab and a pandemic. Definitely a <1% prior.

The explanation for WIV proximity under zoonosis (BTW there were a few proposed, some obviously false) is certainly not significantly more likely than the integral sum of all the paths from lab to market. You call the latter "more epicycles", I have the same reaction when you all but shrug off the lab proximity coincidence and invent statistical stories to explain it away. Ultimately, something strange *had* to happen as Scott said in the original post. You can have your pick. The price for lab (that leaked in the past) -> market doesn't seem to be so incomparably higher than no lab involvement -> lab proximity to make "gibberish" an appropriate evaluation.

Animals having the virus in the WIV lab... yeah... let's completely ignore this. Why would an institute deliberately collecting animals with relevant viruses possibly have animals with relevant viruses.

"Would it even make sense to call this a "lab leak" if the spillover actually happened at the market from a virus that had evolved totally in an animal, and the work the lab was doing and the safety level it was conducted at is wholly irrelevant, other than having mammal samples with coronaviruses?"

This is a mind-boggling level of un-engagement with what I was saying - though perhaps I fail to grasp your point?.. How would the work the lab was doing (including collecting animals particularly likely to have coronaviruses, exposing people to them and bringing them to densely populated centers) be wholly irrelevant?

The #1 question for me is whether the pandemic would have happened without the lab activity. This is what primarily distinguishes the different possibilities, and this is what actually matters for the future. How on earth would the means by which the virus could have traveled be wholly irrelevant?


Yes, it's nice to read a sane comment on that. The reason the debate is so heated is that both alternatives have a pretty big coincidence to explain away. Either that the outbreak is just a few miles from one of the two research facilities worldwide with gain-of-function research on coronaviruses. Or that all early cases are clustered around the wet market, which is the natural place where a zoonosis would start in the city.

Everyone needs to explain away either one or the other. Both sides can make arguments why the base rates are not quite so unlikely, but both are freak coincidences, and arguments against them invite attacks from the other side.


Scott writes "I think scientists had called wet markets as an especially dangerous potential transmission location in advance.", and follows up with quotes warning against markets.

These seem like the result of a targeted search for such quotes, and therefore have no probabilistic weight as they provide no comparison to other spillover sources. We didn't claim no one warned against markets, just that they were relatively low priority.

In our post (https://blog.rootclaim.com/covid-origins-debate-response-to-scott-alexander/) we provide an unbiased search we conducted that clearly showed markets were just one of many high-risk interfaces (search "USAID"), and a relatively low priority one. Nevertheless, this was a quick search, and we are open to seeing it improved, but only using proper methodologies.

Cherry-picking is one of the most basic human reasoning biases, and probably the most common way people get convinced of wrong things.


You keep linking to that full document. It’s a tedious post. In this particular case you are claiming that Scott is engaged in a “targeted search for such quotes regarding wet markets being especially dangerous”. I see no rebuttal in your document and posting it every time you disagree with something is just a gish gallop.


Good point. I added instructions to search for the keyword USAID.

Apr 9·edited Apr 9

Thank you for this! It's a great public service and example of truth-seeking, and your thoroughness and equanimity are remarkable.

I don't have anything to add except the single data point of how this has affected my own belief state. Prior to your initial post I was something like 60/40 in favor of zoonosis (both stories seemed to have some strong points in favor, I hadn't looked into it very deeply and expected to never know the truth, but zoonosis seemed like the more boring & mundane route and thus a marginally safer bet). After your debate summary post, I was with you on 90/10. After this post (and other discourse responding to the previous post), I'm even more strongly convinced that the lab leak evidence is rooted in typical pseudo-scientific patterns and would put myself at 95-99% in favor of zoonosis.


> (The one argument I know about, haven’t responded to, and it really is because I’m lazy and scared and bad is Michael Weissman’s Bayesian analysis here. It’s 25,000 words and uses a bunch of logits and calculus. Sorry, pass.)

As far as I can tell, the fancy math is basically trying to distract from similar tactics and arguments as Saar and other posters, such as claiming Worobey shows evidence of strong ascertainment bias, which you already responded to. One of the judges posted a brief response to it: https://ermsta.com/posts/20240301 although it mostly doesn't go into the details of the arguments; it's mostly about how to assign Bayes Factors to them.


I'm unsure if Weissman is correct but I found Eric's arguments about why there is no ascertainment bias handwavy and unconvincing. A lot of the "well, you see more cases without known links near the market for XYZ" (possibly) answer one specific claim of ascertainment bias but often end up implying another one.


The WHO and Chinese CDC do believe there is a strong bias and state that the cases cannot be used to reliably pin the origin at the wet market.


I'm pretty fine with saying that the WHO and Chinese CDC are hopelessly compromised, but of course this has implications for trusting much of their data.


The Chinese authorities are the ones who came up with the wet market theory. Hence why they focused their testing there.


I cannot tell you how completely insane it is to claim there was no ascertainment bias in the case search. I respect Scott a lot but his rebuttal is confused. He argues that there is no ascertainment bias in part because all of the cases in the Worobey dataset are cases with symptom onset in December and a market link was only required for suspected or confirmed cases by 30th Dec. This is an egregious error that suggests a basic lack of acquaintance with the arguments.

Apr 9·edited Apr 9

At this point I've gotten into a lot of arguments around covid origins, whether it be here or elsewhere. And this comment evinces a very common pattern; namely, asserting that someone is wrong and doesn't understand any of the arguments, but either without explaining why or simply repeating the arguments already addressed. Scott actually wrote about this exact phenomenon:

> “Sure, Scott confronted 489 arguments. But he failed to confront the strongest argument against his case - this one obscure article in a Nepalese journal that nobody except me has ever heard of. That means he’s a bad-faith actor strawmanning everyone he disagrees with!” I know that someone will find some detail I’m wrong about and spam it all over Twitter with “Scott didn’t realize that an 91Q mutation is different from a ZY6 mutation, how can you ever trust anything he says?” And I know that next month, someone will come up with another SMOKING GUN! - and if I don’t respond to it immediately they’ll say I’m scared and know I’ve lost and am refusing to admit I’m wrong out of sheer stubbornness, and twist some quote of mine to show I’ve admitted I’ve changed my mind.

This pattern also seems to be true of online Marxists, 2020 stolen-election theorists, and a variety of conspiracies and crackpots. All the previous arguments are bad, but the 1000th one is actually rock-solid and if you don't engage that one with the same seriousness as the previous ones you're just acting in bad faith. Or they just repeat the same argument 17 times, regardless of what anyone says. Anyone who disagrees just doesn't understand the arguments. Care to explain why? Psh, the fact that you're even asking means you clearly haven't done enough research.

Scott gave multiple reasons to believe that ascertainment bias was not a big issue. So did Peter, and Worobey. But I guess they all just missed the obvious because... there's one argument that you can identify and *claim* is in error, without explaining why?


I explained why in a different thread above and didn't want to repeat myself.


To explain, suppose someone gets infected on the 18th, gets symptoms on the 25th and is hospitalised on the 30th. By 30th Dec, a market link is required to class this person as a suspected case and to transfer them to another hospital to be tested, so without one they are never classed as a case. Scott says "the cases all had symptom onset in December so there can't be ascertainment bias". Do you agree that something has gone wrong here?

He's confused the date of symptom onset with the date when search and testing requirements were put in place. It's a very bad error suggesting that he doesn't understand the arguments around market cases on which he has based his conclusions.
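The distinction between onset date and presentation date can be written out as a small sketch. It only encodes the dates stated in this thread (the hypothetical patient and a criterion taking effect on the 30th); the function and constant names are made up for illustration, and it says nothing about how many real cases were affected.

```python
from datetime import date

# Date the market-link criterion reached hospitals, as stated in this thread.
CRITERION_DATE = date(2019, 12, 30)


def screened_by_market_link(presentation: date) -> bool:
    """True if the patient presents on or after the criterion date."""
    return presentation >= CRITERION_DATE


# The hypothetical patient from the comment above.
onset = date(2019, 12, 25)
presentation = date(2019, 12, 30)

print(onset < CRITERION_DATE)                 # symptom onset predates the criterion
print(screened_by_market_link(presentation))  # the criterion still applies at presentation
```

What matters for the screening rule in this hypothetical is when the patient shows up, not when symptoms began.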


Scott addresses this:

> This doesn’t mean bias is impossible - some of these points are people who caught COVID on December 31, but only got diagnosed January 4 after the new diagnostic criteria were added. But most cases are pre-criteria.

There could potentially be ascertainment bias in those data, but you've taken about as large a delay from infection to hospitalization as is common and you still have a date that comes well after many of the early cases. You haven't come close to demonstrating that such bias is actually large enough to explain the results, which is the relevant fact, not "there was theoretically some tiny form of ascertainment bias, therefore I can throw the data out and instead use some other source that Definitely Isn't Biased instead." (And call other people insane).


You are confusing symptom onset, when people caught it, and when the guidance came in.

The guidance specifying a market link to class a patient as suspected actually was sent round hospitals on the 30th. If I develop symptoms on the 22nd and present at hospital on the 30th then I don't get tested if I'm not market linked. My *symptoms* started on the 22nd. Scott says Worobey only includes cases with December symptom onset, so can't possibly have missed cases with symptom onset prior to the 30th. Can you see what has gone wrong here?


He literally says that bias isn't impossible, just that most cases come before this.

author

Yeah, sorry, I added that paragraph specifically because one of the people reviewing the piece said "If you don't make it crystal clear that you know this, some lab leaker will go crazy in the comments".

Apr 10·edited Apr 10

Alex wrote above:

"And this comment evinces a very common pattern; namely, asserting that someone is wrong and doesn't understand any of the arguments, but either without explaining why or simply repeating the arguments already addressed."

There is a substantial irony that this is rather close to what Alex was himself doing here. (And unlike claimed, Simon was actually making - fairly obviously and clearly imho - a precise response to Scott's point which was itself a response to...). Like sure, maybe Alex was just confused and misunderstood what Simon's point was. Such things happen. But one should be *really* careful about making sure one is right before making an accusation like this.


All of my comments in this thread have arguments in them. Simon's comment (that I responded to) did not. Certainly no argument strong enough to say Scott is "completely insane."

> But one should be *really* careful about making sure one is right before making an accusation like this.

It's not an accusation. It's simply a description of the comment's content.

Apr 10·edited Apr 10

Your first two sentences are false. Reading Simon's comment it was immediately obvious to me what his argument was. You didn't understand it? Fine.

The only "argument" your response makes (if you can call it that, it's a stretch) is to [wrongly] accuse him of behavior Scott had criticised in the OP: "asserting that someone is wrong... without explaining why or simply repeating the arguments already addressed." I'm not sure "hey, you know what your argument reminds me of? People making meta-cognitive errors" is much of an argument.


> Reading Simon's comment it was immediately obvious to me what his argument was.

If you've read the rest of the thread (where he supposedly makes more of an argument) or have seen the argument before, it might be clear what he's referring to. His comment as-is, however, does not explain anything with any clarity, nor (as I pointed out already) does it address all of the other arguments against ascertainment bias.

> The only "argument" your response makes

The whole point is that it didn't address arguments that had already been made... seriously, watch the actual debates.


>During SARS, the international health community criticized China for having wet markets where zoonotic spillovers could happen. China promised to clean them up, then mostly didn’t (for example, the raccoon-dog vendor at Wuhan was fined a few times, but kept operating). China’s first priority was to prevent people from accusing them of failing to clean up wet markets.

> >police from the Wuhan Public Security Bureau investigating the case interrogated Li, issued a formal written warning and censuring him for "publishing untrue statements about seven confirmed SARS cases at the Huanan Seafood Market."

>My impression is that China (realistically Wuhan City Government, I don’t think Xi would have been involved at this early stage) made a vague attempt to cover up the wet market early on - but that it wasn’t their Department Of Covering-Up’s finest work.

"People who believe lab leak over zoonosis are falling prey to Chinese propaganda" was not the take I expected to end up with from the debate, but it's so delicious that I'm incorporating it into all future discussions I have on the matter.


It makes sense, honestly. The lab leak hypothesis is, ironically enough, less incriminating for the CCP, since they can just point out that everyone else was also doing gain of function research. On the other hand, China is the only developed country crazy enough to still allow anything like wet markets, which is especially embarrassing since they're basically a dictatorship. How pathetic does your dictatorship need to be to lack the power to enforce basic safety standards?


The value of adverse inference as evidence applies not just on the part of China, but on the part of Western (particularly US) science of infectious diseases as currently structured. As Broad Institute's Nick Patterson writes:

"It is essentially certain that attempts were made by US officials and some virologists to deny the possibility of a lab-leak, and especially the likelihood that US science is implicated in this disaster."

https://npatterson.substack.com/p/yet-more-on-covid-origins

If there's a culprit implicated on the LL side of the debate, it's not necessarily scoped around geographical boundaries of a nation-state.


Oh yeah, obviously there's incentives there too. But China doesn't have the same incentive to deny it, since that gives them an opportunity to blame it on the US. ...Unless there's evidence it happened because they had piss poor safety standards. Then the optics would be significantly worse than the wet market scenario.


> If they secretly knew they’d just started the worst pandemic in modern history, wouldn’t they at least be wearing masks?

If this is a lab leak, and they know it is a lab leak, it doesn't mean they would think it is worse than what they would think at the time if it was not a lab leak or if they didn’t know it is a lab leak.

I don't think China was trying to specifically cover up a lab leak; I think they didn't really know what it was, just like us. I am just pointing out that I think this argument doesn't work well (even if it has mostly no impact on the debate).

Expand full comment

The lab leak discussion seems to have become dominated by the gain-of-function scenario.

The possibility of “zoonotic collection”, i.e. that the virus jumped straight from bats into some scientific researcher who was poking around in caves, seems more probable for several reasons, most critically that it doesn’t require the invocation of a conspiracy theory. It also suggests possible lines of active inquiry that could increase/decrease support, which seem to be under-discussed.

It is very unlikely that any lab would be inserting furin sites without the knowledge of at least several people, and presumably also producing associated documentation (saved files, email chains). Likely, there would be people who are aware of the research who don’t feel any particular personal culpability (e.g. the insertion was carried out by a parallel team, was someone else’s project, etc). We then need a conspiracy of silence: all these people keep their mouths shut, all files are deleted or inaccessible, no-one whispers anything to a friend late at night after a few drinks, etc.

In contrast, zoonotic collection does not require any kind of conspiracy. It could be that there are genuinely zero people consciously aware of what happened. Field researchers are often young, and early on COVID was likely less virulent anyway. Someone could have returned to Wuhan with barely a mild cold, or totally asymptomatic. If their sampling activity also happened to not pick up any bits of COVID to sequence, there is no particular reason that they would assume they were the source of a global pandemic.

Zoonotic collection requires a researcher to return from the field sometime in 2019. Almost surely, the WIV will have some kind of documentation of such trips (e.g. via sample collection dates, expense reports, emails, etc). Even outside the WIV, the movement of researchers would have been documented by travel records (planes, trains, hotels, payment activity). And at a minimum, information at least suggestive regarding the frequency and locations of field research may be held by US partners. Even looking at the historical frequency of sample collection by the WIV, much of which is (presumably?) public data, could be useful.

If the WIV did not do any in-field collection through 2019, then zoonotic collection is surely very improbable. Conversely, the chances rise if it turns out they had someone in a cave in Yunnan every second weekend.

Expand full comment

I'd also note that zoonotic collection is a very credible possibility.

It has all the advantages of lab leak theories in general, explaining not only how COVID appeared in the same town as the WIV, but how it managed to travel multiple thousand kilometres from the bat coronavirus hotspots in south-west China, bypassing all other possible spreading sites en route: we just require that a researcher got on a plane / train.

In contrast, a current striking weakness of transmission mechanisms that don’t involve the WIV is that no-one has identified the real-world supply chain of live animals that could have moved the virus in the same way.

In addition, we know that there are historical examples of people going into caves and getting sick, likely from exposure to bats / similar viruses to COVID-19. Most strikingly, the Mojiang mine incident, where six miners became ill, three died, and from which location the WIV subsequently collected RaTG13. A sample-collecting researcher would of course be much more likely than any miner to get close to bats and their excreta.

Expand full comment

Is the invented “multiple thousands of kilometers” metric there because you know that bats travel multiple hundreds of kilometers so “hundreds” doesn’t cut it?

What do you reckon is the ratio of person-days for virus hunters vs everyone else in the same caves as bats, and which type of person wears PPE more frequently?

Expand full comment

The reference is to the sites where the nearest known relatives of COVID-19 have been collected, including by WIV researchers. It is possible of course that COVID originated in some colony of bats much closer to the WIV, just much less likely, per Scott’s commentary above.

Given what else we know about the WIV, I would be surprised if their use of PPE during sampling activity was particularly rigorous.

The risk rate per “miner day” vs “researcher day” is an interesting question which I doubt there is much good data on, but I’d reiterate that deliberately getting close to bats to collect viruses seems intuitively to be relatively high risk.

Expand full comment
Apr 11·edited Apr 11

The unaltered virus theory is inconvenient for lab-leakers in that it requires throwing out all the claims about furin cleavage sites and DEFUSE and the like. Plus they usually have an axe to grind against EcoHealth and the imagined blob of scientists and don't want to give that up.

Expand full comment

I would have thought that throwing out claims that are hard to prove and are bogging down the argument would be a selling point.

The WIV and EcoHealth’s behavior, among that of many others, would still be reprehensible in the zoonotic collection scenario, no? In either case they carried out or supported dangerous research in a demonstrably negligent fashion, and pushed to shut down the origins debate.

Expand full comment

Also, to emphasize something I kind of only implicitly gestured at: The laboratory leak hypothesis does not actually exclude zoonosis! It could be zoonosis inside the lab, among the lab animals, OR it could be zoonosis brought into Wuhan through a specimen.

Zoonosis is not actually the hypothesis that COVID originated with "zoonosis", but rather the more narrow "Zoonosis without any kind of laboratory involvement in the initial exposure". Indeed, the "Zoonosis" hypothesis is barely about zoonosis at all, excepting insofar as it is evidence against a very specific hypothesis in the laboratory-leak-hypothesis-space - rather, it is more fundamentally the claim that the laboratory had nothing to do with the virus at all.

So the evidence "for zoonosis" should be discounted by some factor representing the odds that, if the disease had a natural origin, it might arrive at Wuhan via the laboratory studying that disease, insofar as that is relevant to the evidence in question.

If you did not consider this, you should update away from "zoonosis" to some extent.

Expand full comment

You can twist it even further: an enterprising lab employee selling some bats labeled "biohazard, to be safely disposed of" on the wet market.

Expand full comment

Sure. Or just improperly disposed of specimens in general, plus an enterprising hungry person who figures it'll be fine if they cook it well enough, something they've been doing for a couple of years without issue, which is usually fine except this disease is particularly good at infecting humans via respiratory action, and also they stepped on/compressed some corpses causing them to expel the partially liquified contents of their lungs into the air.

Personally I prefer the abstracted "laboratory leak via any cause" - I just plain don't know enough about the laboratory procedures in question to know which avenues are even plausible, much less likely.

Expand full comment

The catchall term for this that most on the LL side have aligned around (certainly Alina Chan has) is "research-related accident."

Expand full comment
Apr 11·edited Apr 11

The unaltered virus theory is inconvenient for lab-leakers because it requires throwing out all the claims about furin cleavage sites and DEFUSE and the like. Plus they usually have an axe to grind against EcoHealth and the imagined blob of scientists and don't want to give that up.

Expand full comment

Mind that you are careful about categorizing groups of people based on the people who you have the most exposure to - in general your ingroup is not going to point you to the best examples of the outgroup, but the examples that are easiest for them to make fun of.

Expand full comment

This isn't "ingroup making fun of outgroup", I'm talking about what I've seen *personally* by participating in many COVID origin debates online. The sample is "people who hang out in the same spaces as me", it's not cherrypicked at all.

Expand full comment

Mind that, from my perspective, anybody who doesn't have that experience isn't going to comment, so the same issue applies.

I haven't observed this. Possibly it's because I don't find the debate that interesting, and so don't seek out places to argue about it - but this in and of itself is likely a filter for people who are, in fact, strongly interested in the debate, which probably substantially filters the sort of people who participate in it towards people with unusually strong and specific opinions.

Expand full comment

> (The one argument I know about, haven’t responded to, and it really is because I’m lazy and scared and bad is Michael Weissman’s Bayesian analysis here. It’s 25,000 words and uses a bunch of logits and calculus. Sorry, pass.)

A quick skim suggests this is just a different way of framing the update strengths, and it should still be easy to point to "this likelihood ratio disagrees with these other likelihood ratios" in a way that quickly points to the disagreement. You might be able to get Weissman to format his numbers in basically the same table as you have for the judges and commentators. (And then the disagreement is over something like "no, the prior shouldn't be 70:1" or "no, we can't get a likelihood ratio of 4:1 from the lack of intermediate hosts because of suppressed evidence", or so on.)
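
To make the comparison concrete, here is a minimal Python sketch of the odds-form Bayes update this kind of table relies on: posterior odds are the prior odds multiplied by each likelihood ratio. The function names and the factor label are hypothetical, and the numbers only reuse the 70:1 and 4:1 figures quoted above as illustrations (treating 70:1 as odds against is an assumption), not as anyone's actual estimates.

```python
from math import prod

def posterior_odds(prior_odds: float, likelihood_ratios: dict[str, float]) -> float:
    """Odds-form Bayes: posterior odds = prior odds * product of likelihood ratios."""
    return prior_odds * prod(likelihood_ratios.values())

def odds_to_probability(odds: float) -> float:
    """Convert odds in favor of a hypothesis into a probability."""
    return odds / (1.0 + odds)

# Illustrative only: a prior of 70:1 against a hypothesis (odds 1/70 in favor)
# and a single hypothetical factor with a 4:1 likelihood ratio in favor.
prior = 1 / 70
factors = {"hypothetical_factor": 4.0}
odds = posterior_odds(prior, factors)
print(f"posterior odds in favor: {odds:.4f}")
print(f"posterior probability:   {odds_to_probability(odds):.4f}")
```

Laying two analyses out as dictionaries of named factors like this makes it easy to see which single ratio accounts for most of a disagreement.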

Expand full comment
Apr 9·edited Apr 10

It largely hinges on two things:

1. Whether you trust the Worobey and Pekar papers. The reason to trust them is obvious. The reasons not to trust them have been mentioned in many comments here already so forgive me for skipping it.

In addition to the high weights that Peter's camp assigns to these papers, this factor often gets counted multiple times as it is passed through different secondary sources. Trust in the market origin papers is the cornerstone of Peter's arguments and the papers were also the largest factor cited by many of the forecasters.

2. What you think of the DEFUSE proposal. Once you start multiplying out the independent coincidences in timing, location, and several details of the genome you end up with really high odds for some form of manipulation. If you don't look at Weissman's analysis of this part it's hard to appreciate just how unlikely it would be for this to happen by chance. Peter's side kind of glosses over it. There's plenty of evidence of messy non-engineered evolution as well, but that is consistent with either theory.
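
Both points above come down to how likelihood ratios are multiplied. The sketch below is a hedged illustration with made-up numbers (none are taken from the papers or from Weissman's analysis): it shows how independent factors compound, and how citing the same underlying evidence twice inflates the result.

```python
from math import prod

def combined_ratio(ratios):
    """Combine likelihood ratios by multiplication (valid only if they are independent)."""
    return prod(ratios)

# Hypothetical numbers, chosen only for illustration.
independent = [3.0, 5.0, 4.0]           # three genuinely separate pieces of evidence
print(combined_ratio(independent))      # 60.0

# Double counting: the same 5:1 evidence arrives via two secondary sources
# and is mistakenly treated as two independent factors.
double_counted = [3.0, 5.0, 5.0, 4.0]
print(combined_ratio(double_counted))   # 300.0
```

The multiplication is only justified when the factors are independent given each hypothesis, which is exactly what the two camps dispute.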

Expand full comment

I did reformat my factors in a way that can be directly compared with Scott's. I sent him that but have lost track and will do it again. But, as mentioned elsewhere in these comments, the differences are almost all in 1) I don't use his Worobey/Pekar factor for reasons I explain at unpleasant length. I argue that their likelihood ratio should be slightly reversed when the errors are corrected and less fragmentary data sets are used. Since markets only gave a bit of the zoo priors, I don't bother to reduce the overall zoo odds for that. 2) After months of declining to use the restriction enzyme pattern due to the multiple comparisons issues that many had noted, I included them because of the stunning match between the DEFUSE team plans released by Emily Kopp and the scenario hypothesized by Bruttel et al. It's this last factor, based on a new release of unexpectedly relevant data, that brought my odds up from 100/1-ish to 500/1-ish, both after substantial regression back toward 1/1 due to acknowledging big fat-tailed uncertainty in the priors.
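
For readers unfamiliar with why acknowledged uncertainty pulls odds back toward 1/1, here is a rough Python sketch. It is not the procedure used in the analysis discussed above: it assumes a normal spread in log-odds for simplicity (a fat-tailed spread would shrink the result further), and the function name, grid size, and numbers are all hypothetical.

```python
import math

def effective_odds(log_odds_mean: float, log_odds_sd: float,
                   n: int = 2001, span: float = 8.0) -> float:
    """Average the probability over a normal spread in log-odds, then convert back to odds.

    Uses a symmetric grid covering +/- `span` standard deviations.
    """
    if log_odds_sd == 0:
        return math.exp(log_odds_mean)
    total_weight = 0.0
    total_prob = 0.0
    for i in range(n):
        z = -span + 2 * span * i / (n - 1)   # grid point in standard-normal units
        w = math.exp(-0.5 * z * z)           # unnormalized normal density
        x = log_odds_mean + log_odds_sd * z  # log-odds at this grid point
        p = 1.0 / (1.0 + math.exp(-x))       # logistic: log-odds -> probability
        total_weight += w
        total_prob += w * p
    mean_p = total_prob / total_weight
    return mean_p / (1.0 - mean_p)

# Hypothetical: point-estimate odds of 1000:1, with a 3-nat standard deviation in log-odds.
raw = 1000.0
shrunk = effective_odds(math.log(raw), 3.0)
print(f"raw odds {raw:.0f}:1 -> effective odds about {shrunk:.1f}:1")
```

Because the small chance that the true log-odds are far lower dominates the averaged probability, the effective odds land well below the point estimate while still favoring the same side.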

It's true that explicitly including some of the math machinery reduces the readability except for aficionados. It's still a better thing to do than to hide it all in impenetrable code filled with multiple big errors all tending in the same direction, sufficient to reverse the direction of the conclusion— the method of Pekar et al.

Expand full comment

Scott writes:

>No BANAL-52 relative close enough to create COVID from has ever been discovered.

>By mentioning BANAL-52, I was trying to be maximally charitable to the lab leak side. In order to create COVID, they would need a virus very close to COVID. But in years and years of searching, nobody has ever discovered a virus like this. Therefore it must be rare. As a way of bounding how rare, let’s see how rare the closest virus ever discovered is. That’s BANAL-52. It is very rare. Therefore, the COVID ancestor must be rarer than that.

Response:

This is obvious hindsight bias. We know 5 viruses that are all a few % from each other, with SARS2 being one of them, and the others being 3 BANALs and RaTG13. SARS2 is singled out here only because that's the one that in hindsight started the pandemic, but it could have been any of endless viruses sampled from this space.

Another way to understand this bias, is that whatever restriction you choose to apply to the virus on the lab-leak side, you need to apply to zoonosis (otherwise you're calculating conditional probabilities of different evidence for each side). Meaning, you need to look only at hypothetical zoonotic pandemics that come from a relative of SARS2, rather than any of the endless viruses that could start a pandemic if they somehow attained this 12nt FCS. And unless you can show that this specific sequence is more likely for one of the hypotheses, this redundant restriction cancels out and has no effect.

To be fair, Scott realizes he may have messed this up and writes:

>I don’t know how strong this argument is, because maybe there are millions of rare viruses capable of becoming pandemics, such that getting any one of them is very easy, even though each one individually is rare. The version of this I find convincing is that it should be a probabilistic cost to say that WIV did gain-of-function on a seemingly undiscovered and so-far-very-hard-to-discover rare virus instead of on any of the usual SARS-like viruses that people do their gain-of-function research on.

This still has the hindsight bias mistake above of thinking we need a "so-far-very-hard-to-discover" virus.

As to "usual SARS-like viruses" - This is a whole different argument of the form "an engineer won't do that". You never confidently know what the engineer is trying to achieve and what makes sense for them.

But in this specific case we don't need that, as we actually know from DEFUSE they are interested in a wide range of viruses beyond close relatives of SARS1, and even beyond SARS-like viruses. (also see this: https://twitter.com/ydeigin/status/1778239459818881535)

Bottom line: This has negligible weight. It is quite likely WIV would have a virus that is good enough to start a pandemic (after adding an FCS and potentially passaging).

Expand full comment

> But in years and years of searching, nobody has ever discovered a virus like this. Therefore it must be rare.

Is rarity an argument for being synthetic? If so, are the viruses reported in sick, smuggled pangolins in Guangdong and Guangxi provinces also lab leaks?

Closest relatives by full genome identity:

SARS2: 96.8% identity to BANAL-20-52

Guangxi pangolin coronavirus: 85.3% to BANAL-20-52

Guangdong pangolin coronavirus: 90.8% to BANAL-20-236

Which one is mysteriously different from the library of sampled genomes?

Lastly, of course we know that researchers at WIV are broadly interested in viral zoonotic reservoirs and evolution therein. No need to read between the lines of DEFUSE for this -- just check out the scientific literature.

Expand full comment

I was unclear about what above was Scott's and what was mine. I edited it and it should make more sense now.

Expand full comment

I thought WIV were also collecting samples of novel virus from nearby and throughout APAC, including in the region where BANAL-52 was first identified? If so, it would explain how they got a SARS2 precursor to work on in the first place. This is the argument that, for me, most strongly explained how SARS2 could have traveled to Wuhan, as opposed to animals traveling thousands of kilometers.

Expand full comment
Apr 9·edited Apr 9

There are SARS2-like viruses sampled in bats in Japan, Zhejiang, and just outside Wuhan. There’s no lack of evidence of undersampled paths via animals to locations all over the place.

It’s also not thousands of KM from most related viruses sampled in bats to Wuhan.

Edit: oh btw WIV folks were out sampling bats at the exact same time the call came in to help analyze the first spooky sequence and shortly after that came samples to test. A weird priority when there’s supposedly a lab leak that already happened to cover up!

Expand full comment

What does "SARS2-like" mean in this context? Are you saying something similar to BANAL-52 was discovered before the pandemic in Japan?

I'm not interested in viruses with genetic drift that also dissipate across distance. The problem with SARS2 is the absence of drift over distance, just a direct hop across that distance. As you get farther from where they found BANAL-52, you don't get genetically more similar viruses to SARS2, but the opposite.

So to get zoonosis, an infected animal had to travel thousands of km, possibly jumping across species, then end up in Wuhan. Odd, too, that a second/third crossover event didn't happen along the way or after the fact. Not impossible. Just odd.

About the edit: I don't think you read my other comments about coverups, so I'll paraphrase. I don't expect high degrees of coordination, planning, malice, or even knowledge of what's going on. Governments just like to keep everything secret. I agree with Scott that the Chinese government wanted any narrative other than "we're to blame" and were willing to suppress anything that pointed to a Chinese origin: WIV and HWM alike.

Expand full comment

I think the photo of the WIV team out to dinner is interesting. It doesn't look like any of them are high-risk (i.e. >60, or even close to it). As such, it would make sense for an early spread of detectable disease to come from somewhere near WIV (like a wet market) even if it didn't come from WIV itself. Thus, you might see spread from nearby and erroneously think, "no, it didn't come from WIV, it came from this place next to WIV." Now, a rejoinder might be that "if someone from the lab were sick, wouldn't they spread it to multiple other people?" Now you're assigning roles to specific people at the lab, to catch and spread the virus according to a model that doesn't necessarily match the real world.

As to whether they would all have worried looks on their faces for having secret knowledge that they "just started the worst pandemic in modern history"... This take is full of projections of future knowledge onto individuals in the past. Did this group of researchers, in mid-Jan 2020, know that:

1.) A recently-identified virus of concern had leaked from their lab,

2.) This virus would become a worldwide pandemic,

3.) It would be the 'worst in modern history', (Not sure I agree with this? I guess it depends on how you define 'modern history', how you define 'pandemic', and how you define 'worst'.)

4.) They needed to wear masks to avoid catching it.

If anything, the invocation of their non-mask wearing shows that this argument proves too much. Shouldn't ANY team of virologists in Jan 2020 have known to mask up? If they didn't, it's probably not a good idea to invoke this photo for this kind of 'evidence'.

I know the whole photo thing was mostly tongue-in-cheek, but it's indicative of how Scott, as a new zoonosis advocate, has potentially adopted a filter for evidence.

Expand full comment

> 1.) A recently-identified virus of concern had leaked from their lab,

Well, it didn’t, so they didn’t. They would definitely have known of a virus of concern, since the WHO had declared that by mid-Jan.

> 2.) This virus would become a worldwide pandemic,

No, because it hadn’t. Being virologists, they would know that it could. But they weren’t time travellers.

> 3.) It would be the 'worst in modern history', (Not sure I agree with this? I guess it depends on how you define 'modern history', how you define 'pandemic', and how you define 'worst'.)

A pandemic has a pretty standard definition. Worst is measurable. Modern history is a bit less clear.

And no, they wouldn’t know that yet, unless it was something they manufactured, in which case they would know the potency. But it wasn’t. The other alternative is that, again, they were time travellers.

> 4.) They needed to wear masks to avoid catching it.

> If anything, the invocation of their non-mask wearing shows that this argument proves too much. Shouldn't ANY team of virologists in Jan 2020 have known to mask up? If they didn't, it's probably not a good idea to invoke this photo for this kind of 'evidence'.

At that stage the virus was confined even within Wuhan.

Are you all missing that one of these guys had to be sick, if not most of them, if they were patient zero?

Expand full comment

"Are you all missing that one of these guys had to be sick, if not most of them, if they were patient zero?"

Let's say one of these researchers was patient 0. For simplicity, let's say this person visited HWM in mid-December and got someone there sick. How would they still be sick in mid-January when this photo was taken?

Expand full comment

Bingo. Without the benefit of hindsight this looks like it’s going the way of SARS and MERS outbreaks and the first hint of R0=3 across town isn’t something people at WIV are keenly aware of. Folks in much of the rest of the world put ourselves at higher risk in March 2020 with a couple months extra knowledge.

Expand full comment

This is completely ridiculous. Zhengli sequenced it at the end of Dec, and knew it had a furin cleavage site, which was the key to transmissibility. The doctors all knew there was human-to-human transmission.

Expand full comment

You could just as easily be describing the 2015 MERS outbreak. I bet scientists in Seoul mostly didn’t cancel dinner plans in June/July 2015.

Expand full comment
Apr 9·edited Apr 9

But you're ("you" being grammatically-convenient shorthand for "lab leak + cover-up proponents") trying to have it both ways. On the one hand, this was serious enough to merit a huge-by-international-standards cover-up, and at least an "above-average" cover-up even by the standards of China's "step one: cover everything up" culture. But on the other hand, also, nobody could have known it was serious, which is why they're smiling happily in a restaurant.

This picture may even be the single most damaging piece of evidence to "lab leak + cover-up", insofar as it requires Chinese authorities to not act like Chinese authorities. Simply put, the researchers responsible for the accidental release of a deadly virus which could easily tank China's economy are seen here being allowed to happily celebrate a team dinner instead of being conveniently "vanished" for their carelessness.

(number of times the Chrome tab crashed while writing this reply: 5, plus 1 while composing this edit)

Expand full comment

Honestly, I'm not a lab leak + cover-up proponent. I think it's obvious that there were early attempts to cover up some evidence (though, as Scott points out, the Chinese government had good incentives to cover up both HWM as a zoonotic origin and WIV as a lab-leak origin), and that US social media companies suppressed discussion of lab leak.

I also don't think the Chinese threshold for cover-up is "serious-by-international-standards" so much as "inconvenient facts will be poorly suppressed" in a way that's likely slightly more authoritarian than anywhere else in the world, including the US. (Corollary: Are classified documents really sensitive, or are they classified out of habit? Most insiders seem to agree secrets are kept out of habit, even to the detriment of efficient communications.)

Meanwhile, at what point do you 'disappear' someone, versus just randomly withhold information? I suspect there's one threshold for killing every animal in a wet market just in case (because the censors don't really know the truth, and they don't want to) and another threshold for actually 'disappearing' scientists with global reputations.

Unrelated: Is Chrome particularly bad on Scott's longer comments sections? Sometimes I find it struggling mightily, but I'm not sure if that's a browser issue or a Substack issue. Anyone have a good solution to this, other than just not commenting on popular posts and open threads?

Expand full comment

First and foremost, while this might be excuse-making, I'd like to apologize for some imprecise choice of language here and there. I let (and am probably letting) my frustration with the Chrome-crashing situation spill over into the replies, and while it may be to some degree unavoidable, it's still far from ideal. I appreciate your thoughtful reply in the face of my slightly-heated, choppy, poorly-worded original-reply.

> Honestly, I'm not a lab leak + cover-up proponent.

I really could have done a better job communicating that I was basically using the rhetorical "you" - as in "boy, the sports team who neither one of us play on or coach had a dreadful game" "yeah, well, you can't make an omelette without breaking some eggs". Didn't mean to imply that you were.

> I also don't think the Chinese threshold for cover-up is "serious-by-international-standards" so much as "inconvenient facts will be poorly suppressed" in a way that's likely slightly more authoritarian than anywhere else in the world, including the US.

I *think* we might be saying the same thing using completely different language. What I'm saying is that the odds that China will attempt to cover up a given embarrassing incident are a lot higher than that of the average Western European + North America nation, faced with the exact same hypothetical incident which is equally embarrassing. An example that immediately comes to mind are the intelligence failures surrounding 9/11 and, further on, Iraq WMD, which may not have been exactly shouted loudly during a primetime State of the Union, but were relatively-openly published.

> Meanwhile, at what point do you 'disappear' someone, versus just randomly withhold information? I suspect there's one threshold for killing every animal in a wet market just in case (because the censors don't really know the truth, and they don't want to) and another threshold for actually 'disappearing' scientists with global reputations.

Well, there's Hoffa "disappear" and then there's Jack Ma "disappear" and then, a degree further down, is "you still get to go home to your family, for now, but you most assuredly are made to understand that being seen having a good time is not in your current best interests".

I think even in a Western nation, it would be extremely undiplomatic ("not a good look", as the kids say), and as far as Asian culture goes, Japanese executives have harakiried over far less.

> Unrelated: Is Chrome particularly bad on Scott's longer comments sections? Sometimes I find it struggling mightily, but I'm not sure if that's a browser issue or a Substack issue. Anyone have a good solution to this, other than just not commenting on popular posts and open threads?

I've found that Chrome crashes about 10 times as frequently as Firefox does on big Substack threads (even a couple dozen comments are usually enough to risk a crash).

There was some Chrome update circa 2022-23 that *really* tanked stability, and ironically I suspect it was an update designed to better-handle memory management - this was right around the time a Chrome update added "automatic" unloading of old tabs from RAM. Anyway, short version is that Chrome used to get real slow in certain RAM-intensive situations, and now it just shrugs and crashes instead. Net dis-improvement.

That said, crappy two-years-ago Chrome update or not, I can still browse 400+ comment threads on the old site without much trouble, and Firefox still crashes as well, especially on big threads like the original Lab Leak Debate post. It's pretty inexcusable for a site that's just slightly-formatted text and graphics to crash *at all* on a modern browser on a desktop computer with 16GB of RAM, so I still think the blame-card goes to Substack on this one.

(Number of times the Chrome tab crashed while writing this reply: 0!!! ...likely because I only had the single comment open rather than the actual Scott post + thread)

Expand full comment
Apr 10·edited Apr 10

Yeah, I like to reply to comments from the email link. It works much better, but then I sometimes miss other comments within the discussion.

I have low confidence in my ability to assess either the Chinese propensity to hide information or the US/EU/etc. propensity. My sense is that all governments hide as much as they think they can get away with. My other sense is that they actually get away with much more than most people realize. So perhaps the only place we differ is that it sounds as though my prior for secrecy is fairly high for both the US and China.

Thanks for calling my reply 'thoughtful'. I didn't think you were being malicious, so much as just having a strongly expressed opinion.

Expand full comment

"Most insiders seem to agree secrets are kept out of habit, even to the detriment of efficient communications."

....an added motive I've heard from an insider is that it increases the probability that your paper/brief will be read by colleagues or higher-ups, if you are able to get it labelled "secret", or better "top secret".

Not clear if this increases or reduces efficient communications...

Expand full comment

LOL, I've heard that too, but had forgotten until you mentioned it. Sounds like an impediment has been adapted as a signaling mechanism.

I'm going to still call this 'inefficient'. If you're reducing communication with your peers in order to enhance communication with your boss, you're probably not working in the kind of environment where ideas get refined until they rise on their own merits. You're probably working in an environment where the exact opposite is happening!

Expand full comment

The photo is confusing on either hypothesis. Shi Zhengli and Linfa Wang, two veterans of SARS, go out for dinner in mid-Jan in Wuhan once the hospitals are overflowing, and around a week before people get welded into their flats. They sequenced the virus on 27th Dec and would have known it had a furin cleavage site, which is crucial for transmission. According to her own testimony, Shi freaked out about a lab leak on the 28th, checked her secret database, and put her mind at rest, though she hasn't bothered to put the world's mind at rest other than by constantly lying.

Everyone else who sequenced the virus in late Dec first thought, 'high SARS homology, this is very bad'. Shi Zhengli thought, 'let's go out for a meal', after the CT scanners had started crashing and patients were dying in the corridors.

Linfa Wang is over 60 btw

Expand full comment

I agree with all of this. My bias for any kind of 'cover-up' is that the parties responsible will be mostly incompetent, or at least incapable of knowing what information to censor ex ante. Thus, outside a mass censorship regime, it's difficult to point at a successful targeted cover-up.

With COVID, I could see cover-ups pointing both toward the wet market and the lab leak, and these are likely not mutually exclusive. I think it's important to remember that the people covering up evidence aren't omniscient. So the Chinese authorities, under motivated reasoning, might cover up evidence under orders to find some origin for SARS2 external to China (which is how you arrive at a 'let's not talk about lab leak or wet market' kind of censorship). Then later, when people within the USG realized some culpability for authorizing GoF research in China, all discussion of a lab leak was suppressed on US social media. Again, this doesn't require the culprits actually KNOWING the true origin, or suppression of 'smoking gun' evidence, so much as them not wanting to engage in free and open debate.

I think the point that free and open debate was suppressed by different parties at different times - motivated by varying incentives - is pretty sound. Whether it supports lab leak is less sound, at least early on and in the context of China, when you point out that they had incentives to look askance at wet market zoonosis as well.

Apr 9·edited Apr 9

Hospitals weren’t overflowing mid January. Things changed dramatically between mid and late January much like between mid and late March in much of the rest of the world.

If you care about this, it’s worth looking up some hospital names in Chinese and searching limited to January 2020. Hospitals were sending staff to the one hospital trying to centralize C19 care before very quickly being overwhelmed everywhere by testing demand. Of course it didn’t take long to recognize spread within hospitals and papers on nosocomial spread tell the incomplete story of the role this played in January.

Eg https://www.washingtonpost.com/opinions/2023/08/22/wuhan-doctors-pandemic-china-coverup/

At Xinhua hospital: By Jan. 11, he said, “medical staff in the unit were infected one after another.” The government had still not acknowledged human-to-human transmission, or health-care workers getting sick, but the virus was everywhere. “The hospital was full of people, and the situation was a bit chaotic,” “**Our hospital’s outpatient clinic is crowded with a large number of suspected patients who can’t be admitted,**” he said. “Some patients kneel down and beg the doctor to take them in.”

The doctors at Jinyintan and other hospitals knew there was human-to-human transmission by around 1st Jan and had people transported in disinfected ambulances. Apparently this hadn't registered with Wang and Zhengli by mid-Jan?

“A bit chaotic” and over capacity for diagnostics was the mid January situation at the hospitals nearest to Huanan market. For most hospitals this was later. Spread in healthcare there and elsewhere starting early January is something you can read about in scientific papers and was also a feature of SARS and MERS outbreaks that didn’t cause pandemics. Again, most of the people where I live did similarly risky things through early March — plenty of places in the world had higher first wave prevalence than Wuhan despite months of warning that a pandemic was coming.

For "a bit chaotic" read "people kneel down and beg doctors to take them in". This was four days before Linfa Wang's meal.

The hospitals were using disinfected negative pressure ambulances and N95s around 1st Jan! There was a case in Thailand on 13th Jan.

Wuhan Central Hospital had run out of beds on 1 Jan

They literally started building new hospitals on the 23rd of Jan, 8 days after the meal. It's completely inconceivable the WIV weren't aware of human to human transmission by mid Jan. Zhengli and Wang are supposed to be some of the world's foremost coronavirus experts.

Your quote is a present tense remark from an anonymous radiologist at Xinhua hospital; not talking about 15/January. The same source also says, “on January 21st, I estimated that the number of infections in the city might be about 10,000” and describes a doubling time of 3-4 days. So even for someone at what turned out to be close to the center of the outbreak the POV of 15/Jan was “this is going to get out of hand very quickly” and not “there’s a huge risk on an individual basis to someone at a random restaurant in Wuhan.”

I don’t see how coronavirus expertise helps one have this level of insight into reality in the hospital or how building hospitals is relevant to two people who have nothing to do with building hospitals.

> “The first known case predates the market outbreak by a month” - this is not the consensus position. I cannot say for sure what Dr. Chou means by this, but I suspect he’s referring to one of the many claims to this effect that Peter effectively debunked during the debate (Connor Reed, Mr. Chen, the 92 cases, Brazil, etc).

The first known Chinese case was traced back to November 2019 (https://archive.is/U6Swq). French blood samples and lung scans demonstrated cases in November 2019 also. By December (when the wet market outbreak began) there are indicators that it was in Norway, the US, and Italy (based on antibodies found in blood samples and RNA found in wastewater/cadavers)

I don't really understand the argument that it couldn't possibly be in Europe that early because it would be more widespread - we weren't testing for it, and COVID follows a seasonal pattern (like all human coronaviruses). September and October in Italy and France aren't peak seasonal conditions, so COVID would have likely spread much more slowly especially as it hadn't blanketed the country yet. IIRC the Italian samples from October were even sequenced

> “Genetic analyses put the realistic start date around Sept/Oct” - see the section on Brazil above for the many reasons this is impossible. Pekar, the most-cited genetic analysis, puts the origin in November. Dr. Chou doesn’t cite his sources, so I don’t know what he’s referring to, but it certainly hasn’t entered the knowledge of the reality-based community.

I'm not sure that I'd personally throw around the phrase "reality-based community" when describing ignorance of peer-reviewed studies, some of which have been around for quite a while now... but you do you!

This new study estimates an origin between August and early October 2019: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0301195

This study estimates an origin no later than October 2019: https://academic.oup.com/ve/article/10/1/veae020/7619252?login=false

This 2022 study put it at around mid-Sept 2019 to early October 2019: https://academic.oup.com/bioinformatics/article/38/10/2719/6553661?login=false (though the authors later wrote that they suspect it might have originated earlier than that)

A 2020 study estimated that it jumped to humans in Oct/Nov 2019: https://twitter.com/BallouxFrancois/status/1257787789816537088

The market cases were all lineage B (apart from a single swab which appears to be contamination as it was from a PPE glove). Lineage B was relatively late in the evolution of early COVID (can see here: https://academic.oup.com/view-large/figure/445990907/veae020f2.tif). That alone suggests an earlier origin than is claimed by the wet market theorists.

> “No raccoon-dogs anywhere on the planet have tested positive, beyond those being forcibly infected to do experiments”. False, this paper discusses an outbreak of COVID among raccoon-dogs on a farm in Poland.

Nope, that is a common error people make. The cases in Poland were at a "mink and raccoon dog farm", and the cases were in minks (2 out of 20 tested positive). The mink samples were subsequently uploaded to GISAID - none from raccoon dogs. Following the trail of study citations reveals this.

> “They aren’t capable of catching or spreading COVID”. False, here’s a paper on the subject which says that “Raccoon dogs are susceptible to and efficiently transmit SARS-CoV2”.

...it's a study talking about them being forcibly infected, as I said in the previous quote

> “The clustering around the wet market in Wuhan . . . was just a product of oversmoothing”. Here is a map of December 2020 COVID cases. I recommend ignoring the contour lines and just looking at the dots. How could dots be oversmoothed?:

There is a Twitter thread explaining it step by step here: https://twitter.com/danwalker9999/status/1747673884336312613

Multiple other scientists have written about it too:

https://academic.oup.com/jrsssa/advance-article-abstract/doi/10.1093/jrsssa/qnae021/7632556?login=false

https://academic.oup.com/jrsssa/advance-article-abstract/doi/10.1093/jrsssa/qnad139/7557954?login=false

https://zenodo.org/records/7016143

The broader problem is sampling bias, something both the Chinese CDC and WHO agree makes the early cases unsuitable for claiming the wet market to be the point of origin. The director of the Chinese CDC said that they put too much of a focus around the market and may have missed it coming from the other side of Wuhan.

So the origin was … across Europe and the world months before it was recognised in China.

Apr 9·edited Apr 9

I don't know the origin - I just know the wet market theory makes zero sense when looking at all of the available evidence.

If I had to guess I'd say there was a leak around Sept 2019, which is when the Wuhan lab took their pathogen database offline, switched control of the lab over to the military and stripped out the ventilation system.

By October 2019 the US consulate in Wuhan was aware of a severe respiratory illness circulating. When athletes from around the world showed up for the "Military World Games" that month they described Wuhan as a ghost town, and many athletes ended up getting sick.

That event alone could have spread it to many countries around the world when the 9000 athletes returned home. So many Canadian athletes were infected that they had a quarantine section in their return flight. Some Iranian athletes reportedly died but no full verification on that.

> By October 2019 the US consulate in Wuhan was aware of a severe respiratory illness circulating. When athletes from around the world showed up for the "Military World Games" that month they described Wuhan as a ghost town, and many athletes ended up getting sick.

Do you have citations for any of these claims?

More importantly, do you have any explanation for how, if this was the case, it took 4+ months for there to be any significant level of COVID in countries like the US or Italy?

Apr 10·edited Apr 10

https://statemag.state.gov/2020/04/0420feat05/

> By mid-October 2019, the dedicated team at the U.S. Consulate General in Wuhan knew that the city had been struck by what was thought to be an unusually vicious flu season. The disease worsened in November. When city officials began to close public schools in mid-December to control the spread of the disease, the team passed the word to Embassy Beijing and continued monitoring

Lots of stories talk about athletes returning from the games, but here is an example:

https://www.dailymail.co.uk/news/article-8327047/More-competitors-reveal-ill-World-Military-Games.html

> More importantly, do you have any explanation for how, if this was the case, it took 4+ months for there to be any significant level of COVID in countries like the US or Italy?

It's a complicated question - we only realised it was there when we started testing. For example it appeared to spread like wildfire in the US in March or April, but the reality is that it had been slowly spreading for months and that we were just picking it up with testing. If you look at maps of cases in the US over time you see that effectively as soon as they started testing in a county then they found it. The delay was a product of the slow roll out of testing.

Seasonality is also another factor - COVID (like all human coronaviruses) follows a seasonal pattern which will vary depending on where it is. April is a peak time in Italy for example, not just in 2020 but 2021 and onwards. Outside of the peak times COVID seems to disappear, before ramping up again when the conditions hit. Obviously it doesn't actually disappear, it's just simmering under the surface. Flu does the same thing (and no I don't understand the mechanics of it, apart from maybe people being indoors more at those times of year)

> https://statemag.state.gov/2020/04/0420feat05/

This is a strange article. It goes from talking about something in October, which it chalks up to flu, and then into events in February, after Covid was identified. While your statement "US consulate in Wuhan was aware of a severe respiratory illness circulating" might technically be true, trying to use it as evidence of a lab leak in September is wildly misleading.

> It's a complicated question - we only realised it was there when we started testing. For example it appeared to spread like wildfire in the US in March or April, but the reality is that it had been slowly spreading for months and that we were just picking it up with testing. If you look at maps of cases in the US over time you see that effectively as soon as they started testing in a county then they found it. The delay was a product of the slow roll out of testing.

There was certainly a delay in testing, but that doesn't mean that we can be off by arbitrary amounts. If Covid were circulating in the US from late October, then by the time Covid was known to be in the US in late January, there would have already been 16 million cases and on the order of tens or hundreds of thousands of deaths. Fast-forward another 1.5 months to early March, when we started doing something to mitigate the spread, and it would have already infected everyone in the country and burned itself out.
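
The arithmetic behind that figure can be sketched roughly as follows. This is a minimal illustration, assuming the 3.5-day doubling time quoted in the post, a single seed case, and unchecked exponential growth with no saturation or mitigation; `cases_after` and `days_to_reach` are hypothetical helper names, and the 330 million population figure is a round-number assumption.

```python
import math

def cases_after(days, doubling_days=3.5, seed_cases=1):
    """Cumulative cases under unchecked doubling: seed * 2**(days / doubling_days)."""
    return seed_cases * 2 ** (days / doubling_days)

def days_to_reach(target_cases, doubling_days=3.5, seed_cases=1):
    """Days of unchecked doubling needed to grow from seed_cases to target_cases."""
    return doubling_days * math.log2(target_cases / seed_cases)

# Late October to late January is roughly 84 days, i.e. 24 doublings (assumed dates).
print(f"cases after 84 days: {cases_after(84):,.0f}")

# Days needed for the naive curve to reach a population of ~330 million (assumed figure).
print(f"days to reach 330 million: {days_to_reach(330e6):.0f}")
```

Real epidemics slow down well before the whole population is infected, so the second number is only an upper-level sanity check on the "burned itself out" claim, not a forecast.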

> . Outside of the peak times COVID seems to disappear, before ramping up again when the conditions hit.

Covid has now become seasonal, because it's been around enough that people start developing some immunity. In 2021 and 2022, we saw peaks in late summer as well as in winter. But in any event, its season seems to be winter, similar to flu... so why wouldn't it have been spreading in fall, which is the start of flu season?

(Scott wrote about seasonality here https://www.astralcodexten.com/p/diseasonality)

Overall I think this is not worth taking seriously.

Given that you seem to be arguing in laughably bad faith I have no problem with you not engaging on this subject

Apr 9·edited Apr 9

The military games stuff is a total joke. You won’t find verification of any of it; it was all reported in tabloids that don’t follow up when stories don’t pan out. For example, the Spanish athletes who asked to be tested were tested, and they tested positive at the same rate as people in Spain at large after the first wave. Ditto for the vaunted State memo… mitigation targeted at schools at the leading edge of flu season is normal in China. You can read about it in their influenza surveillance reports published until things hit the fan in January.

If you're kind of familiar with this subject, maybe take a look at Figure 3A and wonder if it sounds right that SARS-CoV-2 shares a common ancestor with BANAL-20-52 (across the whole genome!) in August 2019.

"The whole genome tree representing the dataset without human variants (see Fig 3A) offers a very similar timeline, estimating the divergence time of SARS-CoV-2 at around 2019.58, which corresponds to August 2019."

Scott writes:

>I think others are using it to prove WIV had “secret viruses” in their catalogue, but the rice virus wasn’t secret, it was HKU4, which is common and which WIV has already published papers about.

Response:

This is incorrect. At least one of the sequences detected in the study is still unpublished by original Wuhan researchers to this day. It is 98.38% identical to the closest known full HKU4 genome (BtTp-GX2012). Moreover, that sequence was inside a BAC reverse genetics system which is also unpublished (i.e. the Wuhan experiments that used it remain unpublished).

So, we have here an example of an unpublished virus that is about as close to a known virus as BANALs are to SARS2, directly contradicting the claim that it's unlikely WIV could have the SARS2 backbone unpublished.

Nipah work there also tends to get forgotten:

https://arxiv.org/abs/2109.09112

Apr 9·edited Apr 10

More on ascertainment bias. I promise you aren't the first to notice Worobey and Pekar tried to pre-empt criticisms ;)

>"Before going further, I recommend reading page 8 of the supplementary text of Worobey’s paper, titled “Robustness Of Statistical Test Results To Ascertainment Bias”"

Their robustness test was fallacious, as I show in my published critique (Bahry, 2023): "Although Worobey et al. purport to test for “robustness” of their results to sampling bias, their tests fail (30). For instance, they in effect address false positives near HSM, by dropping cases nearest to HSM from the data. But the issue was false negatives: cases missed due to *not* being near HSM. This is as fallacious as surveying New Yorkers; dropping the 68% most central ones from the data; and concluding from the remaining 32% of New Yorkers that most of humanity lives near to and centered on Central Park."
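
The New Yorkers analogy can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Worobey et al.'s actual test or data: true cases are placed uniformly over a disc centred on the market, and the chance that a case is found decays with distance from the market. The radius, decay scale, and function names are all hypothetical.

```python
import math
import random
import statistics

def simulate_search(n_cases=20000, city_radius_km=30.0, search_scale_km=3.0, seed=0):
    """Toy model: cases are uniform over the city, but detection favours the market area."""
    rng = random.Random(seed)
    true_dist, found_dist = [], []
    for _ in range(n_cases):
        d = city_radius_km * math.sqrt(rng.random())  # uniform over the disc's area
        true_dist.append(d)
        if rng.random() < math.exp(-d / search_scale_km):  # assumed search bias
            found_dist.append(d)
    return true_dist, found_dist

def drop_nearest(distances, fraction=0.68):
    """Mimic the robustness step: discard the given fraction of cases closest to the market."""
    ordered = sorted(distances)
    return ordered[int(len(ordered) * fraction):]

true_dist, found_dist = simulate_search()
remaining = drop_nearest(found_dist)

print(f"median distance, all true cases:   {statistics.median(true_dist):.1f} km")
print(f"median distance, detected cases:   {statistics.median(found_dist):.1f} km")
print(f"median after dropping nearest 68%: {statistics.median(remaining):.1f} km")
```

In this toy setup the detected cases still sit much closer to the market than the true cases do, even after the nearest ones are dropped, because the cases that were never found cannot be restored by discarding ones that were.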

>"the market connection was discovered December 30 and added to diagnostic criteria January 3"

Covid was discovered on Dec 29, *because of* the Huanan market cluster. The market was part of the search from the very start, including the initial search for earlier-December cases (The nCoV Outbreak Joint Epidemiology Investigation Team & Li, 2020): "On December 29, 2019, a hospital in Wuhan admitted four individuals with pneumonia and recognized that all four had worked in the Huanan Seafood Wholesale Market, which sells live poultry, aquatic products, and several kinds of wild animals to the public. The hospital reported this occurrence to the local center for disease control (CDC), which lead Wuhan CDC staff to initiate a field investigation with a retrospective search for pneumonia patients potentially linked to the market. The investigators found additional patients linked to the market, and on December 30, health authorities from Hubei Province reported this cluster to China CDC. ..."

>"I looked for the direct source of the Gao quote and couldn’t find it"

He told a BBC podcast host (https://www.bbc.co.uk/sounds/play/m001ng7c, context + Gao's answer at 24:00–25:12).

There is no question: the early search is riddled with clear, known, overwhelming ascertainment bias. The Chinese officials who did the search have been clear on this (Bahry, 2023, Table 1). Zoo crew (a coauthor network of Western scientists including Worobey, Pekar, and the "Proximal origin" authors among others) have tried hard to downplay it, but imo that's because they *want* to: they're acting like lawyers, not like scientists.

References

Bahry, D. (2023). Rational discourse on virology and pandemics. mBio 14: e00313-23. https://doi.org/10.1128/mbio.00313-23

The nCoV Outbreak Joint Epidemiology Investigation Team & Li, Q. (2020). An outbreak of NCIP (2019-nCoV) infection in China - Wuhan, Hubei Province, 2019-2020. China CDC Wkly 2: 79-80. http://www.ncbi.nlm.nih.gov/pmc/articles/pmc8393104/

See also my comments on the original post:

https://www.astralcodexten.com/p/practically-a-book-review-rootclaim/comment/52698542 (on ascertainment bias)

https://www.astralcodexten.com/p/practically-a-book-review-rootclaim/comment/52974201 (on bringing Peter's apparent photographic memory down to Earth)

I'm confused as to why Scott didn't engage with these comments first time around. Bahry has a peer reviewed paper on ascertainment bias and took the time to comment on the original post and provided references.

I'm assuming I posted them late, after he'd read the comments he based this "highlights" post on

There were already a lot by the time I got there lol

>"I looked for the direct source of the Gao quote and couldn’t find it"

He told a BBC podcast host (https://www.bbc.co.uk/sounds/play/m001ng7c, context + Gao's answer at 24:00–25:12).

Apr 9·edited Apr 9

Excerpted from my longer comment on the clear, known, overwhelming ascertainment bias, which existed from the very start and which Worobey et al's robustness test fallaciously tried to dismiss (https://www.astralcodexten.com/p/highlights-from-the-comments-on-the-5d7/comment/53562588).

the argument in 3.1 (Apology to Peter re: extreme odds) is a red herring. Who cares whether or not Saar was *also* quoting insane numbers? No-one should end up at 10^-20 odds in favor of any theory when high-quality evidence is as hard to come by as it is in this case. Whether or not Saar was making huge modeling errors, it's very clear that Peter *was* making such errors.

Does the fact that this was obviously and explicitly intentional, and part of the message rather than a result of poor understanding, not matter? Like, you can object to this as a rhetorical device. But that's very different from "*was* making such errors".

It's both. I object to it as a rhetorical device. I love cynical humor as a way to drive home an argument, but a formal scientific debate with hours of arguments on sub-sub-points isn't the place for it. It makes keeping track of everything more cumbersome. Apparently now the onus is on me to know which arguments were meant seriously and which weren't, which de facto gives you (generic you) the freedom to retract-without-retracting any and all arguments at any time.

Those "obviously joking" numbers somehow made it into Scott's final summary of both parties' positions, so Scott seems to be taking them seriously at least.

I suspect this type of confusion sowing is part of Peter's strategy. I already mentioned this in the comments of the previous post, but from the few spot checks I did, my conclusion is that Peter is great at debating tactics and terrible at truth seeking.

Scott very clearly does not take them seriously in the sense of actually representing a Bayesian analysis that Peter did. We know this because Scott says so. Whatever other confusion there is, this is not a confusing part.

I also don't quite appreciate this tactic, but I think you're not engaging with Peter's point here (which I'm not necessarily endorsing - just explaining how I understood it) - that precisely because there's so little firm information, doing overly formal mathematical analysis is absurd. My reading is that he's saying, in effect, that 10^-20, 10^-11.9, 10^-3 are all nonsense and we would be better served by acknowledging it.

As for the onus being on you - for Peter (again, in my understanding) the numbers are not as important for his argument. You can decide to disbelieve his numbers entirely - I think he might even like it!

"Even if you spend hours and hours talking to the scientists involved and trying to figure out the flaws, it doesn’t matter, because there will be a new set of papers like that a few weeks later."

There's a term for the general phenomenon you're describing in this section: https://en.wikipedia.org/wiki/Brandolini's_law

Apr 9·edited Apr 9

I don't know if this is a point that comes up a lot in the chemical weapons debate, but something that comes to mind when people challenge the mainstream narrative:

If this was the rebels doing the attack, they got incredible bang for their buck. The Western coalition launched coordinated airstrikes that blew up a research center and two military bases as a reprisal. If the rebels were the ones responsible... why not keep going? Goad NATO into blowing up more government bases? The fact that chemical weapon attacks dropped after the airstrikes suggests that government forces were in fact responsible, and the airstrikes did their job of intimidating them.

These two posts changed my mind, which I find impressive.

Two notes: can we please agree to stop irresponsible wet markets AND gain of function/ virus-hunting labs?

And: I find it annoying Peter called the lab leak claim a conspiracy theory. It is an obvious potential cause to investigate, even if it turns out to be wrong.

Sure, it's reasonable to investigate in 2021. The problem is when it is 2024, and yet most people still believe it against all evidence, and proponents preemptively dismiss all contradictory evidence by claiming that you can't trust any scientists anywhere due to cultural bias, etc.

There are plenty of scientists who favour a lab leak, as well as intelligence agencies. The world is complex: there is room for disagreement without anyone needing to call any side conspiracy theorists.

I was referring to the person in this very thread who said that all virologists should be ignored due to potential conflicts of interest (presumably, they meant only the ones saying pro-zoonosis things?)

>How come there wasn’t a second obvious cluster radiating out from a coffee shop, lots of coffee-shop-linked cases, etc?

Because they would not be detected. We need the appearance of multiple severe pneumonia cases clearly linked to the same location. A coffee shop can't generate that

And also they had a surveillance system set up specifically to monitor outbreaks at wet markets.

It's silly really. The hospitalisation rate at its peak of viciousness was something like 4 in 10,000. So by the time the hospitals start seeing patients coming in, it has already been widely circulating.

"So far, 55 cases have been connected to a Starbucks store in Paju, north of Seoul. Earlier this month, health authorities identified more than 15 cases tied to a Hollys Coffee store in Gangnam Ward, southern Seoul."

Scott was referring to the first detection of Covid. Of course, once it's on the radar, small clusters will be detected often.

At a meta level, I find the entire discussion fascinating as science, as epistemology, and as human behavior. At a practical level, most of the issues are beyond my competence, but three items crossed my threshold: 1, the notion that there are fewer than 2,000 sarbecoviruses is absurd. That refers to the number officially described and reported. The actual number is orders of magnitude larger, probably many orders of magnitude larger. 2, I find Peter's point about COVID using a PRRAR linkage for the furin cleavage site compelling. In what alternate universe could a Chinese virology lab not expert in this kind of work design and use a novel linker, especially when the evidence then available suggested that it wouldn't work? 3, the rapid doubling time repeatedly mentioned by Scott might be misleading; do we actually know what the doubling time was for the original strain?

I'm team zoonosis on balance, partly because of the big picture: pandemics predate institutes of virology by a long way. My prior is that nature is perfectly capable of creating coronavirus pandemics and we should expect it to do so regularly.

But I do notice that at some point Scott gets cross with Saar! This is a savage burn:

> I’m sympathetic to this way of thinking - my beliefs also intuitively feel so obvious that nobody could possibly disagree. But I eventually learned real life didn’t work this way; I think Rootclaim would benefit from a similar lesson.

I'm not 100% certain Saar's arguments are getting generous treatment by the end!

Methods matter. Let's look at Worobey. One interpretation is that if the data are a good sample of all the cases, with little bias either in the initial gathering or the later filtering and if you make an assumption about how rare it is for there to be an initial spread event at a market, you get some enormous likelihood factor. It's so big it decides your final odds.

If.

Let's say you think those are reasonable possibilities but then so are the alternatives. You might get a likelihood factor, but there's no way it can be big, because there's a pretty good chance the assumptions of the argument fall apart.

Despite his silly boasting, Saar gets this one big thing right. No factor gets to be extreme if there are reasonable paths around the premises of its model. That applies equally to both sides. (It pretty much wiped out my big CGGCGG factor that had strongly favored lab.) It's basic to hierarchical Bayes, not a special rootclaim trick. It's a little shocking that people writing extensively on this, spending many dozen hours, didn't spend an hour learning the basic logic.
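
A minimal numerical sketch of that point, with made-up numbers: if a model's premises hold only with probability `p_model`, the evidence likelihood under each hypothesis is a mixture of the "premises hold" and "premises fail" cases, and the resulting likelihood ratio is capped at roughly 1 / (1 - p_model) when the fallback case is uninformative. The function name and all probabilities below are illustrative assumptions, not figures from the debate.

```python
import math

def mixed_likelihood_ratio(p_model, like_h1_model, like_h2_model,
                           like_h1_alt, like_h2_alt):
    """Likelihood ratio P(E|H1) / P(E|H2) after averaging over whether the model's premises hold."""
    numerator = p_model * like_h1_model + (1 - p_model) * like_h1_alt
    denominator = p_model * like_h2_model + (1 - p_model) * like_h2_alt
    return numerator / denominator

# If the premises hold, the evidence favours H1 by 10,000 to 1 (assumed numbers).
naive = mixed_likelihood_ratio(1.0, 0.5, 0.00005, 0.5, 0.5)

# Allow a 10% chance the premises fail, in which case the evidence is uninformative.
hedged = mixed_likelihood_ratio(0.9, 0.5, 0.00005, 0.5, 0.5)

print(f"factor if the premises certainly hold: {naive:,.0f}")
print(f"factor with a 10% chance they fail:    {hedged:.2f}")
```

The same cap applies to factors favouring either side, which is why a modest doubt about a model's premises is enough to deflate an otherwise enormous factor.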

To fulfill your request for a link on George Gao affirming possibility of ascertainment bias around HSM, here it is:

https://www.bbc.co.uk/sounds/play/m001ng7c at 24:40

Yeah but Mike Worobey wrote a paper. He figured out that Gao didn't focus too much on the market even though Gao thinks he did. I know Gao was actually there and in charge but Worobey is such a great scientist that he realised Gao was wrong. Worobey is so competent that he sometimes bothers to check the code in his papers.

wrt 'Lv' name comment: could be mistranslation of Lu (v often represents classic 'u' in pinyin spellings).

seemed like a strange comment that appeared to bash Chinese names (Xi? very common Chinese name as well) and their atypicality when translated into the english language---unusual to the intellectual nature of this publication.

otherwise very thought-provoking post.

I think it was a reference to both anglicizations corresponding to Roman numerals. Lv is 55 and Xi is 11

These sorts of questions make me think "rationalists" focus too much on objective reality. To me, the [acting hypothesis] or, the model of the world that will produce the highest utility yield when acted upon, seems a lot easier here than the objective reality.

So, what I'm getting at is, imagine a China that regularly leaks viruses from its labs. It also has a genetic warfare branch and is developing bioweapons and intends to use them/regularly does, often for weird sideways goals that are hard to infer from the results. 20% of worldwide pandemics are caused by a mixture of Chinese indifference/recklessness/malice.

In this world, when a natural pandemic arises, China does not open up its data and provide proof that the pandemic is natural. It continues to maximally hide data, because not doing so would highlight all of the unnatural viruses. The entire purpose of clamping down on research is to ensure that natural and unnatural pandemics look exactly the same. So, this China doesn't care that the information it's hiding exonerates it, at least in this particular instance. Rather, it knowingly hides exonerating evidence in the expectation that people will then conclude that hidden evidence is likely exonerating.

In some scenarios, small amounts of hidden evidence are semi-intentionally leaked, that is, the government is somewhat more careless with information that exonerates it than information that doesn't. A random sample of leaked data now implies that the hidden information is overwhelmingly exonerating.

Now, in all of this, assume covid happens, and it's natural, and in fact, part of an 80% category of natural events. Are the people who say it's natural really reasoning correctly? Are the people who say it's 80% likely to have been natural even the correct ones here? To me, the most rational group is the one that concludes that China [et al] is up to no good, and it doesn't particularly matter if the goat is behind this particular cover up.

Expand full comment

Have you missed a step in this argument? Surely you don't believe that imagining something makes it true?

Expand full comment

Absence of evidence is evidence of evidence!

Expand full comment

My main initial reaction after reading it. The debate moved me slightly away from lableak as I found the anti-lableak arguments fairly convincing.

That said:

1) The repeated "The COVID pandemic doubles every 3.5 days. So if the first infection was much earlier - let’s say November 11 - we would expect 256x as much COVID as we actually saw. " arguments fall a little flat when the number of infected people is small because the spread rate varies greatly based on behavior of individuals. So if 100 people are infected average spread rate works as a fine proxy. But if it is 5, the particular behaviors of the particular people matter a lot and impact the spread rate a lot. Some people are loner shut ins. Some people go out of town. Etc.

2) There doesn't seem to be nearly enough connection with/acceptance of the fact that a lot of the information out there might simply be lies. The medical/scientific establishment and the Chinese and US governments are not entities which are above faking information/data for political/funding reasons. The anti-lableak analysis seems to take most of the research at face value, which seems naive.

Expand full comment

Re: (1), wouldn't the fact that a given person is in the initial 5 cases likely imply that they are different from the average population, but in a way that probably makes them MORE likely to be around people a lot, and therefore get (and transmit) COVID?

Expand full comment

Maybe yes, maybe no. Depends on how they got it. If they got it from getting pricked by a syringe, or bitten by a raccoon dog they might otherwise be a shut-in who only spends time at work and gaming.

The average spread rates make sense with average populations, not sure they do for the first few cases. At the very least a lot more uncertainty there.
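For reference, the arithmetic behind the 256x figure being argued over can be sketched as below. It assumes perfectly steady doubling every 3.5 days, which is exactly the simplification questioned above for very small case counts; the function names are only illustrative.

```python
import math

DOUBLING_TIME_DAYS = 3.5  # doubling time quoted in the post


def growth_factor(extra_days, doubling_time=DOUBLING_TIME_DAYS):
    """Multiplier on case counts if steady doubling runs extra_days longer."""
    return 2 ** (extra_days / doubling_time)


def extra_days_for(factor, doubling_time=DOUBLING_TIME_DAYS):
    """Extra days of steady doubling needed to produce a given multiplier."""
    return math.log2(factor) * doubling_time


if __name__ == "__main__":
    # 256x is 8 doublings, i.e. 28 extra days at 3.5 days per doubling.
    print(extra_days_for(256))
    print(growth_factor(28))
```

So "256x as much COVID" is equivalent to moving the start four weeks earlier, provided the doubling time really holds from the first case onward.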

Expand full comment
author

I want a little credit for qualifying that with things like:

"I’m using the version of the doubling time argument because it’s simple enough for me to understand, and I don’t have to worry about anyone trying to hide something in their complex model. It’s not exactly true, but it’s true enough to rule out COVID starting much before November 2019."

and

"If one person had COVID, it’s not too unlikely [that spread rates would deviate from this simple model] - not all COVID cases transmit it forward. If (let’s say) twenty people had COVID, it’s very unlikely - at that point, the law of large numbers takes over; in a freak coincidence, every single patient would have to fail to infect anyone else."

Expand full comment

We would not expect very early cases to have anything like the same kinetics as later ones regardless of LL or zoonosis because the virus would still be optimizing itself for human to human transmission.

Expand full comment

The amount of mutations a virus gets to try is roughly proportional to the number of hosts. It is not clear to me if the selection pressure within a single host and when infecting new hosts pull in the same direction (e.g. for being more airborne).

If you have a small number of hosts, the chances of the virus having enough time to mutate for better transmission seems low.

Expand full comment

It's very unlikely for a new zoonotic disease to immediately be very transmissible between humans. One expects a series of fairly unlikely transmission events before it gets going. Most new zoonotic diseases fail at this stage and don't infect more than a few people. The initial selection pressure is very high.

This is very well established in serial passaging experiments in numerous species and humans.

I worked with a group that passaged COVID to be infectious in mice, for example (so that one could use mice as model organisms to develop drugs like paxlovid). Many independent labs made mouse versions of COVID for this purpose. The equivalent of serial passaging happens naturally of course.

Expand full comment

This is basically what surprised Nikolai Petrovsky and David Winkler when they looked at how well it binds to human ACE2. A key difference with SARS1 was it had far less genetic diversity early on. Alina Chan and colleagues had an early preprint on how different it was to SARS1 in this respect.

Expand full comment

That was less about you and more the debater.

Expand full comment

I don't know how much the debate looked into the analysis Gilles Demaneuf presented to SAGO about 247-260 cases that were apparently recorded by the Chinese CDC but only 174 were given to WHO. Another issue is early samples were destroyed for safety reasons. David Relman describes the early case data as "hopelessly impoverished" which is a major reason he hasn't accepted the Huanan Seafood Market theory.

Expand full comment

I'm terribly sorry, but does anyone know if any of these papers being referenced by either side share raw data or code?

I ask because, as a rule of thumb, I tend to trust papers that share their underlying R, Python, or occasionally Stata code. I may not be an expert in their particular field but if their data is clean and their code is readable and published, I can feel pretty darn confident that it's accurate and will replicate.

I am not finding any code and few trustworthy papers in any of these links.

https://elifesciences.org/articles/16777 has multiple datasets but they're all like this (https://elifesciences.org/download/aHR0cHM6Ly9jZG4uZWxpZmVzY2llbmNlcy5vcmcvYXJ0aWNsZXMvMTY3NzcvZWxpZmUtMTY3NzctZmlnMS1kYXRhMS12Mi5jc3Y-/elife-16777-fig1-data1-v2.csv), incredibly minimal, at least I think they are.

This article quotes 47k individual samples but...am I missing a link here? (https://www.nature.com/articles/s41598-021-91470)

This one I think is giving me codes to go look it up in some GenBank database (https://academic.oup.com/ve/article/8/1/veac046/6601809)

I got down to these two (https://academic.oup.com/jrsssa/advance-article-abstract/doi/10.1093/jrsssa/qnad139/7557954?login=false) and (https://arxiv.org/abs/2403.05859) and...I'm terribly sorry but for two papers arguing about statistics and centrality I would certainly expect some data and some code. At this point I'm getting frustrated; I'm terribly sorry to the people involved, I'm sure a lot of work was involved, but I don't want to read 27 pages of words, I want to read 7 lines of R code, maybe 100 rows of data, and a paragraph explaining why you chose this algorithm as a measure of centrality. I know you have the data and code, I can see page 4 of Debarre and Worobey paper and I know how those graphs are generated and there can't possibly be privacy concerns, you just showed us a map.

Am I missing something? Does someone have a link? The Rootclaim site doesn't seem to have a clean list of academic sources and I'm not sure where to find this from Peter. I'm finding the complete lack of raw data and code on both sides rather concerning.

Expand full comment

You could look at https://github.com/sars-cov-2-origins

Not sure that will satisfy your desire for 7 lines of R code and a one paragraph explanation though.

Expand full comment

Thanks. That's pretty much exactly what I look for. Not that it's 7 lines of code but I see something like this (https://github.com/sars-cov-2-origins/huanan-market/blob/main/scripts/market_spatial_analyses.R) and I'm like...yeah, that looks right, I could probably dive in over a weekend, understand and reproduce what they did on Saturday & Sunday, maybe test some variations on Monday. Which means someone in the field who does this professionally could too.

Did I miss this? My Ctrl-F on Scott's article isn't finding the paper listed in the readme and I'm still concerned that I'm just not reading the papers right, that several of these have clear code and data that I'm just not finding.

Expand full comment

I think the problem is that that data is just dirty and there's not enough of it to extract much meaning without some form of hacking.

The two JRSSSA papers go into this, but Scott won't acknowledge the ascertainment bias even when it shows up on a simple test published in a reputable statistics journal so what can you do?

The thing that bugs me the most is that the "case location" is the home address.

I hope you pre-register your hypotheses before you start exploring the noise.

Expand full comment

Sorry if this is unclear but this is more of a "hack" than a willingness to deep dive.

Should I trust X? Well, if he's posted his code and data and I can verify that in 20 minutes, then that's pretty trustworthy. I'm not actually gonna do the deep dive unless it gets really important to me for some reason...I have a life :)

Expand full comment

Yes indeed. I wasn't sure which other papers you were asking about, and wanted to give a sense for how elaborate the Pekar simulations are.

Expand full comment

If you start looking into the code for the Pekar paper you should also look at the pubpeer comments https://pubpeer.com/publications/3FB983CC74C0A93394568A373167CE and the improved Pekar code is in https://github.com/nizzaneela

The author @nizzaneela has recommended the book Mathematics of Epidemics on Networks, which has accompanying python code.

The spread model used by Pekar is apparently this one: https://www.nature.com/articles/s41586-020-2554-8

with contacts modeled as Barabasi-Albert graphs.

Following @nizzaneela over the past year has been incredibly educational, but I haven't seen anybody collect the important bits.

Expand full comment

I loathe reading other people's code, but a lot of this stuff isn't that hard to reproduce. I think it took me less than a day to recreate Worobey 2022's kernel density estimate maps so that I could understand those and tweak the parameters.
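To give a sense of how little machinery a kernel density estimate needs, here is a minimal sketch in plain Python. It is not the Worobey 2022 method or its parameters: the coordinates are made up, the fixed bandwidth is an arbitrary assumption, and `gaussian_kde_2d` is just an illustrative name.

```python
import math


def gaussian_kde_2d(points, x, y, bandwidth):
    """Density at (x, y) from a 2D Gaussian kernel with a fixed bandwidth."""
    norm = 1.0 / (len(points) * 2 * math.pi * bandwidth ** 2)
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d2 / (2 * bandwidth ** 2))
    return norm * total


def densest_grid_point(points, bandwidth, lo=-2.0, hi=6.0, step=0.5):
    """Evaluate the KDE on a square grid and return the highest-density point."""
    n = int(round((hi - lo) / step)) + 1
    grid = [(lo + i * step, lo + j * step) for i in range(n) for j in range(n)]
    return max(grid, key=lambda p: gaussian_kde_2d(points, p[0], p[1], bandwidth))


# Made-up case locations in km: four near the origin and one outlier.
CASES = [(0.0, 0.0), (0.3, -0.2), (-0.4, 0.1), (0.1, 0.5), (5.0, 4.0)]

if __name__ == "__main__":
    print(densest_grid_point(CASES, bandwidth=1.0))
```

Changing the bandwidth is the obvious parameter to tweak: a small value makes every case its own peak, a large one smears everything into a single blob.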

I also tried writing some of my own epidemic simulations from scratch to test if Pekar's 2022 paper was correct. I got similar results:

https://twitter.com/tgof137/status/1772417277670871113

I didn't keep track of how long I've spent on that, I think less than 2 weeks in total? Someone asked me to publish a paper on that, so maybe I'll go back, document and test every assumption made, clean up the code and share it.

If this is something you're interested in playing around with, it's all fairly accessible and I can point you towards some data sources.

Expand full comment

> But hw failed to confront

Typo

Expand full comment

I don't usually like to yell racism, but I think the lab leak hypothesis doesn't really present a particularly compelling argument linking the lab to the wet market.

"Maybe someone wanted some civet" is an argument that only hangs together if you think all Chinese eat lots of weird game on the regular. Lots of people don't eat that stuff at all, and when they do, it tends to be special occasion or if they're tourists. These meats are way more expensive than normal poultry or meats!

Consider demographics. Who, in the WIV, would be likely to spend lots of time actually in the lab? A lab tech or a post-grad - lowest in the hierarchy, doing the grunt work, being exposed to viruses.

Neither of these positions is something you can just luck into - you need tertiary education and training. This implies class - probably at least middle class. Buying a live animal to cook at home is just not really an urban middle class thing!

Also, this is a class of job (skilled, specialised) that often recruits talent from all over the country - a lot of them probably don't live with parents, most probably live in sharehouses with people of the same class (uni students, teachers, clerks). It's not massively likely that these people live with a guy that sells civets at the wet market, or a guy that makes it a habit to bring weird animals home.

They have very different habits - probably a lot of takeaway, or convenience store fare, and dining out (all are cheap options over there). Lots of pre-packaged supermarket stuff, I would imagine - dumplings you can boil, air-fryer or steamed dishes. Much of it delivered probably, because of time demands of the job - and also delivery services in large Chinese cities like Wuhan are really very cheap (high density makes it a lot more efficient than over in the west).

My argument is that if a WIV lab tech or researcher was patient zero, we wouldn't have seen initial cases centred on the wet market - they would have been centred on a flat where 5 low paid interns/PhD students/grad students share a bathroom, and probably the train platform or busport. In the event someone did actually want game, they'd go to a restaurant, not the market directly, and it's really really weird if they somehow spread it to no one else (coworkers, housemates, other passengers on shared transit) on the way to the restaurant!

The equivalent in a western context is claiming that patient zero is actually a white corporate banker, despite all the cases actually found at a methadone clinic in a mostly black neighbourhood. It just doesn't make sense to me. Yes you can probably construct a chain but it doesn't seem likely that everyone in that chain is going to be asymptomatic. I feel like you can only make this claim if you have zero understanding of how different Chinese demographics actually live.

Expand full comment

These are interesting points, but there are many scenarios not requiring "patient zero" to be a lab member. Improper disposal (then collected by a person to eat/ sell), deliberate smuggling to sell, escape of an animal from the lab - the list goes on.

Expand full comment

I see Saar asserting multiple times that p = 1/10,000 can't arise except in physics experiments. For example, in his latest blog post he says, "More generally, such extreme numbers are not possible outside very controlled environments where all confounders can be reliably eliminated [...]" (https://blog.rootclaim.com/covid-origins-debate-response-to-scott-alexander/).

However, such extreme probabilities appear quite regularly in a lot of real life situations. The probability that a given commercial airline flight will crash is around one in a million. The probability of being killed by lightning in the US is 20/330M per year (https://www.weather.gov/safety/lightning-victims). Saar has a weird bias to think that only probabilities like 1-100% are reasonable.

Expand full comment

Read carefully. "in highly controlled environments like physics experiments and computers, or when highly accurate statistics are available using a good reference class."

Expand full comment

100% is not reasonable.

Expand full comment

I did rather enjoy these two posts, even if the math and science is largely over my head...Matt Yglesias recently did a post complaining how there's been no Official Reckoning on covid, the whole thing's just a polarized epistemic shambles. So small-h heroic efforts like this on Thorny Questions are appreciated. I got a lot out of Zvi's covid posts, but the nature of such iterative roundups makes it harder to reference any one specific claim on a specific topic (and of course he doesn't write much on the topic since declaring Mission Accomplished, so I've not heard more current claims until now). A 2024 summary on one of the central questions is thus extra useful. Updated to maybe 60-40 or so now.

Sadly, it's also been reminding me of that other larger-h Heroic Effort you did on a different controversial covid topic, and the attendant months (years? was it years?) of professional critic Alexandros Marinos and company looming over sundry unrelated posts. Bayes really does seem to take the biggest L in the LL v. Z debate; there's more to rationalism than just Bayes, but even if Saar/Weissman/whoever are correct that zoonosis is really a one-legged stool resting on a single load-bearing prior...it just feels like Obvious Nonsense? Don't let good process excuse bad results...Shut Up and Multiply, But Not Blindly. I really do want to believe that hard questions can be better-resolved through some careful application of math, guesstimating, intuition, and reference class definition! This was not exactly a ringing steelman endorsement of such process, though. Maybe it's just too hard when so much of the Simulacra Level 0 information is impossible to trust *or* verify.

Expand full comment

Scott writes:

>30,000 people donated blood in autumn 2019, and the hospitals still had most of it. So they tested the blood samples for COVID antibodies and didn’t find any... There are 12 million people in Wuhan, so if even a few hundred people had COVID during that time, one of them should have turned up. None of them did.

Response:

This is missing two important factors:

1. We need to give 1-2 weeks for antibodies to develop.

2. People are not allowed (and don't want) to donate blood until feeling well.

That means this whole sample is delayed by around 3 weeks.

So let's see what zero positive blood samples tell us:

1. We have 44,000 samples 1-Sep to 31-Dec.

2. Since infections more than double every week, almost all the positives will fall on the last week. That's 44000/13=3400 samples, or 1 in 3000 Wuhan residents.

3. So to have one positive sample, we need ~3000 infected in early December.

4. That's 11.5 doublings. At 3.5 days it's 5.7 weeks, bringing patient zero to late October or later. (Doubling is probably slower at that time, so it's even earlier, but never mind).
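The four steps above can be reproduced as a short sketch. All inputs are the figures from this comment and the quoted passage (44,000 samples, a divisor of 13 weeks, 12 million residents, 3.5-day doubling); the divisor is kept as a parameter so that other window lengths can be tried, and the function name is only illustrative.

```python
import math


def blood_sample_estimate(samples_total=44_000, weeks_in_window=13,
                          population=12_000_000, doubling_time_days=3.5):
    """Back-of-the-envelope numbers for one expected positive blood sample."""
    samples_last_week = samples_total / weeks_in_window
    # Residents per sample: roughly how many infections are needed
    # before a single positive sample is expected.
    residents_per_sample = population / samples_last_week
    doublings = math.log2(residents_per_sample)
    weeks_back = doublings * doubling_time_days / 7
    return {
        "samples_last_week": samples_last_week,
        "infected_needed": residents_per_sample,
        "doublings": doublings,
        "weeks_back": weeks_back,
    }


if __name__ == "__main__":
    for name, value in blood_sample_estimate().items():
        print(f"{name}: {value:.1f}")
```

Without the intermediate rounding to 3,400 and 3,000, the result comes out slightly above the rounded 11.5 doublings and 5.7 weeks, which does not change the shape of the argument.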

That perfectly matches the evidence under lab leak (Reed, Chen, the lineages in the market and more), so the blood samples have no weight as evidence on origins.

Expand full comment

Great posts, and I learned a lot, e.g. that the furin cleavage site in the gain-of-function grant application may not qualify as one of the weird random coincidences after all: scientists may just have been good at assessing what makes for pandemic potential. However, I don't understand why much more attention is not generally paid to the fact that the Wuhan CDC moved *right next to the wet market right before the outbreak*. Perhaps this is Rootclaim's fault, stubbornly pushing an inferior lab-leak theory? There was a comment by Jacob on the original post but it did not make the highlights. To quote from it, "[CDC] employees were fanning out all over China going into bat caves".

Let me put it this way: having learned of the Wuhan CDC move, I would now not be satisfied completely even if I came to the conclusion for myself, after doing my own research, that the wet market's supposed role as the epicentre of the outbreak could be explained away by ascertainment bias. For then there would still be the weird double coincidence (cf. "one of God's biggest and funniest jokes" in the original post) that 1) pandemic begins in the city where the WIV is based, "biggest coronavirus lab in the Eastern Hemisphere" according to the original post, and 2) exact timing of the CDC move, near the exact location where people think the outbreak started.

But it seems like if certain connections could be established both between the WIV and the Wuhan CDC and between the Wuhan CDC and the wet market, then the weird double coincidences might go away?

Expand full comment
Apr 11·edited Apr 11

> However, I don't understand why much more attention is not generally paid to the fact that the Wuhan CDC moved *right next to the wet market right before the outbreak*.

The main lab-leak claim is that it leaked from WIV, not the CDC, so the location of the CDC is irrelevant.

The problem with switching to the CDC leak theory is that you have to throw out nearly all the pro-leak evidence in the process. No more furin cleavage site, no more DEFUSE, etc.

As for the WIV -> CDC -> Wet market theory you suggest, now you have the worst of both worlds. That's just a regular WIV leak with extra steps and all the arguments against it still apply, plus a complexity penalty.

Expand full comment

I did not suggest a WIV -> CDC -> WM theory. I suggested that all the double coincidences could go away if there were both WIV-CDC and CDC-WM connections. For example, if we found that the CDC was established in Wuhan because the world-leading WIV was already there (I have no idea if that's the case), that would satisfy WIV-CDC; all that would be needed for a superior explanation of the origin of Covid might then be some connection between CDC and WM. Meanwhile, as for "switching" lab-leak theories:

1) I was poorly informed before reading the original post two weeks ago, from which I took that the notorious furin cleavage site in DEFUSE (the grant application) might be a red herring (see the beginning of my first comment), and also the furin cleavage site in general. If that's "nearly all the pro-leak evidence" as you say, it could mean there isn't much left. Yet the mother of all pro-leak clues is surely the (city-level) location of the WIV, and there is also the neglected (possibly even bigger?) coincidence of the CDC move.

2) Yes, "switching lab-leak theories" sounds terrible, and Peter said something like "lab leak theories are cheap" in the first video (the only one I watched --- I'm still not well-informed). But actually, perhaps it shouldn't incur much of a penalty. Whenever there is a real conspiracy to be unmasked, there are bound to be some eye-catching wrong clues because of "The Pyramid And The Garden". And some of the conspiracy theorists will jump on them. It is perfectly possible that the furin cleavage site in DEFUSE is such a wrong clue --- an apparent massive coincidence that isn't really one on closer look, if a furin cleavage site is just the standard way of how a virus becomes a respiratory virus (see the original post).

Expand full comment

The furin cleavage site not being seen in other sarbecoviruses and WIV being part of a proposal that was looking at these is a pretty big coincidence. But the location and sampling history are also material circumstantial evidence. The nearest relatives are found in Yunnan and Laos. WIV was sampling SARS-related bat coronaviruses in both locations up to 2019. That is one direct link to Wuhan which is ~1500km away.

Peter Daszak's latest email release suggests WIV had hundreds of unpublished genomes. So the arguments they lacked a relevant progenitor are unfounded.

Expand full comment

Scott writes:

>There was a Lineage A sample in the market, lab leak proponents just try to ignore/dismiss/conspiracize it away

Response:

This was written in response to the fact that all market cases are lineage B. But the lineage A sample is a single environmental sample, not a case. It is additionally from 1-Jan, when SARS2 was already all over Wuhan, so it's not surprising one got to the market. 

And indeed, it had 2-3 mutations meaning it is from a late infection, so it tells us nothing about the early outbreak.

Expand full comment

Per Liu et al A20 has 2 mutations from A (C6145T and G26262T): https://ngdc.cncb.ac.cn/ncov/genome/accession?q=EPI_ISL_10497477#

Per Crits-Christoph et al, 6145C is in 23% of mapped reads, so there's some virus unmutated at this position at this location as well: https://www.biorxiv.org/content/10.1101/2023.09.13.557637v1.full 6145C

With a mutation every two weeks or so, that's totally consistent with the Pekar et al estimate of the date of first lineage A infection of 25/Nov (95%CI 29/Oct-14/Dec).
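As a rough check on that consistency claim, here is a sketch that treats mutations as a Poisson process at one per 14 days. That rate is just "every two weeks or so" made exact. The 1 January 2020 sample date comes from the parent comment and 25 November 2019 is the central Pekar et al. estimate cited above; the sketch ignores the uncertainty in both.

```python
import math
from datetime import date

DAYS_PER_MUTATION = 14  # "a mutation every two weeks or so", taken literally


def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson variable with this mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


first_lineage_a = date(2019, 11, 25)   # central estimate cited above
sample_date = date(2020, 1, 1)         # sample date from the parent comment

elapsed_days = (sample_date - first_lineage_a).days
expected_mutations = elapsed_days / DAYS_PER_MUTATION

if __name__ == "__main__":
    print(elapsed_days, round(expected_mutations, 2))
    print(round(poisson_pmf(2, expected_mutations), 3))
```

With these inputs the expected count is between 2 and 3, so observing 2 mutations is an unremarkable outcome under this simple model.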

Are there any other environmental samples of lineage A or even lineage B to compete with this? Would an environmental sample from any lineage also be ignored if it were at WIV because "SARS2 was already all over Wuhan"?

Lineage A was ~30-40% of sequenced Wuhan samples at the relevant time. Lineage A wasn't just sampled anywhere in Huanan market, but in the thick of other positive samples in one corner of it. There are several solid examples of modeling the Wuhan outbreak to estimate the number of infectious people in Wuhan at the time, and good enough estimates of daily visits to Huanan market. One can also get reasonable estimates of environmental sample sensitivity from the Huanan market data itself, using locations that were sampled multiple times. So if this is important, it's possible to work out a likelihood rather than just assuming this was highly likely to happen by chance.

Expand full comment

WIV is not a market visited by 10,000 people a day.

Agree that a more accurate model could be helpful here.

Expand full comment

Regarding Saar and others repeatedly calling the wet market a superspreader event and Peter rebutting them, I think there's an important fact here that somehow fell through the cracks: if you google "covid dispersion parameter" you'll see dozens upon dozens of papers that assert that Covid had it unusually low, for example: https://www.pnas.org/doi/full/10.1073/pnas.2016623118

> Recent estimates suggest that the dispersion parameter k for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is on the order of 0.1, which corresponds to about 10% of cases being the source of 80% of infections.

As far as I understand it, this means that all arguments about the initial spread (not) doubling every 3.5 days are deeply suspect. It might be entirely possible for Covid to linger in a handful of people for a month or two without there appearing 256 times more cases than observed afterwards. It could evade that 10% chance of hitting a proper spreader for a long time, with the curve becoming observably exponential only with several tens of people infected.

I wouldn't say that this property of Covid automatically puts that kind of arguments to rest, but it definitely should be addressed and analyzed.
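To make the effect of a low dispersion parameter concrete, here is a minimal sketch using the standard negative-binomial offspring model. The value k = 0.1 is from the quote above; R0 = 2.5 is my own placeholder assumption, and the function names are only illustrative.

```python
def prob_no_onward_infection(r0, k):
    """P(a case infects nobody) with negative-binomial offspring, mean r0."""
    return (1 + r0 / k) ** (-k)


def extinction_probability(r0, k, iterations=1000):
    """Chance that a chain started by one case dies out on its own.

    Iterates q = G(q) from q = 0, where G(s) = (1 + r0 * (1 - s) / k) ** (-k)
    is the offspring generating function; this converges to the smallest
    fixed point, which is the extinction probability.
    """
    q = 0.0
    for _ in range(iterations):
        q = (1 + r0 * (1 - q) / k) ** (-k)
    return q


R0_ASSUMED = 2.5   # assumption, not taken from the thread
K_QUOTED = 0.1     # dispersion parameter from the quoted PNAS passage

if __name__ == "__main__":
    q = extinction_probability(R0_ASSUMED, K_QUOTED)
    print(round(prob_no_onward_infection(R0_ASSUMED, K_QUOTED), 3))
    print(round(q, 3))
    print(round(q ** 20, 3))  # 20 independent chains all dying out
```

Under these assumptions most single introductions fizzle, while twenty independent chains all fizzling is much less likely; how long a surviving chain can stay small before taking off would need an actual simulation rather than this closed-form shortcut.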

Expand full comment

Especially since one would expect the virus to still be actively adapting to human transmission (regardless of LL or zoonosis)

Expand full comment
Apr 10·edited Apr 10

> In most places with an outbreak of known origin, epidemics show some geographic clustering. This has been true ever since the very beginning of epidemiology, when John Snow successfully traced a cholera outbreak back to its origin at a contaminated water pump by taking the center of the map of cholera cases.

> This isn’t a 100% law of nature; an infected lab worker might get lucky and not pass it to any of his lab co-workers. Still, we might expect him to infect his family, the stores he went to, or the restaurants he went to.

> If he lived near his workplace, these might also be near the lab. If he didn’t - let’s say he lived on the other side of town and had a long commute - he would start a cluster near his house, or his favorite store, or his favorite restaurant. Then the people there would infect their families/co-workers/stores/restaurants. The cluster would start somewhere! Sure, some people would infect nobody close to their work or home, and instead just infect one person a hundred miles away who they breathed on during a trip - but this is the exception, not the rule.

In case of Covid this is the rule, not the exception, in the early stages before the law of large numbers kicks in. As long as it's not dying off, you expect people to infect a single someone until boom, someone infects 10-20 people at once. An infected person has to get 10% *unlucky* to pass it to all their coworkers.

Expand full comment

This is an anecdote, not formal data, and I'll give the short rough version (I gave more detail on the original thread): when Omicron came to Sydney, Australia (where we know it had to come in via the airport, since all the state borders were still closed), at date 0 the first case was found in an inbound traveller. A couple of days later, the first community case. A couple of days later, ~20 people got infected on a cruise. A couple of days later (so about a week from the "index" case) one of them went to a nightclub in Newcastle, 150 km north, and infected ~300 people.

Expand full comment

Scott writes about our Syria chemical attack analysis:

>So when Saar says that his method has a great track record, what he means is that when he looks into it further, he becomes even more convinced of his previous position. He doesn’t mean that any kind of external consensus has shifted towards his results over time.

Response:

Yes, at some point one needs to draw a line where further debate becomes ridiculous. For me it is when the rocket impacts from 7 sites all intersect at a small field within opposition-controlled territory where a video was taken on the night of the attack showing Islamist opposition fighters wearing gas masks launching the same rocket type that was found in said impact sites.

I understand others draw the line somewhere else and that's fine.

https://twitter.com/Rootclaim/status/1405891184443199488

Expand full comment

Scott writes:

>During my email discussions with Saar, he kept insisting his position was obviously right. He would send me emails like (not exact quotes) ‘Now that I’ve demolished all the evidence for zoonotic transmission, you have to agree with me, right?’ or ‘You must secretly agree I’m right, it’s just be hard for you to admit.’

Response:

That is not how I converse with people and I'm not sure what I've done to cause Scott to stoop to this level and misrepresent a friendly professional exchange in this way.

During our exchange, I've corrected numerous factual mistakes and demonstrated why probabilistic inference methods that Scott proposed do not work in practice. I understood he conceded those points as they were removed in later drafts. Nevertheless, his confidence in the conclusion was unchanged, which seemed to me as confirmation bias and possibly preferring a good story over rational thinking. I very politely pointed this out. That's all.

I believe the fact that Scott still hasn't corrected the obvious mistake in estimating p(HSM|LL) (again, his current position is that an HSM worker is no more likely to be an early case than a random Wuhan resident!), points to a serious problem in rational thinking, regardless of what the origin of Covid truly is.

I hope people he trusts more than me can help him understand this. I failed.

Expand full comment

"his current position is that an HSM worker is no more likely to be an early case than a random Wuhan resident" -- you're correcting a difference in opinion; not fact.

This isn't really a matter of opinion. There is plenty of serological data; there are plenty of locations elsewhere in Wuhan with similar risk profiles; food industry exposure isn't reported as a risk in general.

Here's a rough list in descending order of things correlated with infection probability in Wuhan's wave:

1. Contact with COVID-19 patient

2. Being a healthcare worker (nothing seen for other occupational exposure afaik, but this isn't a complete literature review)

3. Residing in districts surrounding Huanan market

4. Visiting a hospital in ~January 2020 (for reasons other than C19 symptoms)

Dependence on age is inconsistent; not much for Wuhan at large -- https://www.sciencedirect.com/science/article/pii/S2666606521000031 -- more-so when limited to Wuchang where spread was later -- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7539137/ -- that's consistent with experiences elsewhere in the world where controlling spread via healthcare lagged control elsewhere.

Expand full comment

You are answering a different question. The question is about being an early case. Your data is about being infected in later stages.

Claiming that an HSM vendor, who interacts daily with many people in an unhygienic closed environment that was proven to form early clusters elsewhere, is no different from a random Wuhan resident, is just wrong. Claiming it with a Bayes factor of 500x is especially wrong. Claiming 90% Covid is zoonotic when that is your only evidence is irrational.

Expand full comment

I’ve seen videos and pictures of normal pre-pandemic days at Huanan market. No idea where your characterization of it as being wildly dense and closed relative to other places is coming from.

Size and visitors per day is comparable to a large Wal-Mart. More workers in the same footprint, though, so more likely to sustain local chains of infection.

Expand full comment

https://arguablywrong.home.blog/2024/04/09/how-likely-is-it-for-covid-to-establish-itself/

Critique of your and Peter's epidemic modeling. I do find it a bit weird that you profess to put a lot of faith in the published science but ignore it when it's inconsistent with zoonosis. E.g. Bloom and Kumar have an October start date as plausible in peer-reviewed studies, but you dismiss them on the basis of a back-of-the-envelope model.

Expand full comment

Is more than the back of an envelope needed to dismiss A+C29095T and A+C18060T origins claims by pointing out that both analyses you refer to just ignored sampling dates? One has a back-of-the-envelope estimate of the time of the most recent common ancestor, extrapolating from the result of oversampling a common lineage in the USA in a frequency-based analysis. The other has a discussion of things in the news with no start date estimate if I guess what you're referring to correctly.

Expand full comment

My point is that relying on your own back of the envelope epidemic spread model seems like a bad idea. Phylogenetics and epidemic modeling is a tricky business, so outlining what the literature says seems good. Instead Scott relies on Peter's erroneous interpretation of the Pekar 2021 model, Peter's back of the envelope model, and Pekar 2022. There is one person in the world who understands the Pekar 2022 model (Nod) and he thinks it is garbage. The authors themselves clearly don't understand it. So let's not rely on the Pekar paper too much.

Expand full comment
author
Apr 15·edited Apr 15Author

This seems to match what I said - if one person in Brazil had COVID, it could fizzle out, but once we're talking about 20 people, it becomes much less likely (you say 90% chance of explosion). I don't know why you think it's devastating for me or something.

My calculation of starting dates is based on Miller's take on Pekar, who did the full simulation - I include the doubling time version as a more easily-comprehensible guide to approximately what Pekar was doing. As you can see in my post, Pekar thinks late October is possible but unlikely; September date (corresponding to when many lab leakers say WIV workers fell ill) is basically impossible.

My impression is that Li gives very long COVID doubling times compared to every other source, and it's hard to tell if that's because the virus doubled slower in the beginning vs. because they didn't have enough cases to do a good estimate. There were other studies as early as late January 2020 finding much shorter doubling times, even in Wuhan.

Expand full comment

The brilliant Nod, who will probably get the Pekar paper retracted, also critiques some of your core assumptions here: https://twitter.com/nizzaneela/status/1777989261817508165

Expand full comment

OK, here's the non-mathy passage summarizing Worobey/Pekar. Extensive details and many references are in the arguments leading up to it or the Appendix. ZW means zoonotic by wildlife, ZWM means just at market.

Summary on the market sub-hypothesis, ZWM

Evaluation of the general ZW hypothesis has usually focussed on its ZWM sub-hypothesis. As discussed in Appendix 1, by far the largest Bayes factors in those analyses that conclude that ZW is more probable than LL come from specific arguments about the HSM. It’s worth roughly summarizing the likelihood factors specific to ZWM as opposed to more generic ZW. The first is that the Wuhan markets had a much smaller share of the wildlife trade than one would expect from the population, probably at least a factor of ten less. The second is that there was a fairly early superspreader event at HSM. Given that market superspreader events are likely for market spillovers but moderately unlikely (we’ve seen several at other cities) for infections with other sources, this would give roughly a factor of ten favoring ZWM. The absence of any market-linked cases of the more ancestral lineage is likely if the spillover occurred elsewhere and unlikely if the spillover occurred at a market, so that would give a factor disfavoring ZWM. The clustering of non-linked cases near the market looks like a result of ascertainment bias and thus gives negligible evidence either way. The lack of correlation between potential animal host DNA and SC2 RNA in market samples, in contrast to actual animal coronaviruses, also is to be expected if the animals were irrelevant but somewhat surprising if one of them was the prior host. Other than the deeply flawed Pekar et al. 2022 paper, multiple reconstructions of the phylogeny indicate that the spillover occurred more than a month before cases started being identified at HSM. Thus overall ZWM is unfavored compared to other ZW versions, but with enough uncertainty that it may be best to ignore these factors.

Expand full comment

The onset curve of HSM-linked cases doesn't look like a superspreader event. Who was the superspreader? Where were they in the market? On what days did this happen?

From Li et al, "Early Transmission Dynamics" https://www.nejm.org/doi/10.1056/NEJMoa2001316 --

"One of the features of SARS and MERS outbreaks is heterogeneity in transmissibility, and in particular the occurrence of super-spreading events, particularly in hospitals. Super-spreading events have not yet been identified for NCIP, but they could become a feature as the epidemic progresses."

On the heels of that came reports of documented superspreading, particularly in hospitals as expected from previous experience -- I think this might be the earliest paper documenting an example -- https://jamanetwork.com/journals/jama/fullarticle/2761044

FWIW, not all coronaviruses are equal. The "canine" coronavirus(es) you are thinking of could have been shed by a dog, a raccoon dog, or perhaps a fox -- closely related to enteric viruses with sequenced samples coming from diarrheic animals in each. The single sample driving positive correlation is from a machine for removing animal fur and you can see why the quantity of virus present might be different for respiratory vs enteric virus. For the others, you might want to look closer at the correlation plots. It's worth taking a look at viral load over time (is deposited host mitochondrial DNA going to vary over so many orders of magnitude?) and realizing that a high correlation coefficient is a red flag of an artifact since actual correlation won't be so high.

Expand full comment

Zach- Your "superspreader" point just seems semantic. The HSM spread seemed similar to those from Xinfadi etc. except that for the later ones people were looking out and ready to shut things down sooner.

As for the flukiness of the high RNA-DNA correlations, the point is that each virus gets a roughly similar chance at making such a fluke, and SC2 didn't. Your enteric/respiratory distinction may matter or it may not. It's for reasons like that that I discount factors.

That reminds me of another issue that you (or somebody) might be able to help with. Of the 27 non-bat species hosting sarbecoviruses, how many are respiratory? This enters into the factor coming from the presence of an FCS.

You may notice that the missing SC2 DNA-RNA correlation does not appear directly in my net odds, just as one of several reasons to discount Worobey. The big points are that P(Wuhan|market zoo) is especially low, based on official stats, and that the potentially ancestral sequences are all in unlinked cases, with location distribution subject to ascertainment bias. None of this gives a big enough factor against zoo for me to bother using, since markets just weren't all that big in the priors anyway. The point is that there are so many serious "ifs" needed for Worobey that by the time they're done it can't even compensate for the low P(Wuhan|market zoo)/P(Wuhan|general zoo).

Expand full comment

This article has a reconstruction of the Xinfadi transmission chain and seems fairly interesting for that kind of comparison -- https://www.sciencedirect.com/science/article/pii/S1201971222000376 -- not entirely comparable (earlier detection, different genome, etc) but perhaps long enough undetected spread to get an idea of the comparison.

Re: "Of the 27 non-bat species hosting sarbecoviruses, how many are respiratory?"

Not my thing, but I know about this one: "Using CT scans we show that SARSr-CoV-2 positive Malayan pangolins are characterized by bilateral ground-glass opacities in lungs in a similar manner to COVID-19 patients." https://journals.plos.org/plospathogens/article?id=10.1371/journal.ppat.1011384

Expand full comment

Thanks, but what I'm looking for is data on the other non-bat sarbecoviruses.

Expand full comment

That's what this is. The "r" in SARSr-CoV-2 is for "related."

Expand full comment

My ignorance is showing. I know what the "r" stands for, but once you've also specified "-CoV-2" it doesn't sound like some virus that was previously circulating since pangolin SARSr-CoV wasn't overall very close to SC2. But if this notation just means something previously around and closer to SC2 than to SC1, then yes it's a useful data point.

Expand full comment

The figure you present as if it were from Pekar 2021 has been digitally altered from the figure in the paper to remove the confidence intervals in the legend and shift the curve so that it ends at Dec 11th rather than Dec 4th.

See the originals in Figure 3 of the published paper and the bioRxiv version:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8139421/?report=reader#!po=12.5000

https://www.biorxiv.org/content/10.1101/2020.11.20.392126v1.full

This is obviously a very serious problem. Please share the provenance of the figure in your post, since it does not come from the source you have claimed.

Expand full comment

Peter says on Twitter that he is responsible for altering the figure:

https://twitter.com/tgof137/status/1778080286934233370

Update your estimate of Peter’s reliability accordingly.

Expand full comment
Apr 11·edited Apr 11

And then he says that he was still more accurate than Saar, as proven by the decisions of the judges and Scott. Yikes.

I'm not sure he understands why image manipulation is a problem.

Expand full comment

>"What is the Weissman paper that observeralt is talking about? It argues: if the pandemic started at the market, each seemingly non-market-linked case must ultimately derive from a market-linked case. Therefore, we should expect non-market-linked cases to require more steps than market-linked cases. Therefore, they should be further away. But if we look at the map above, we see that not-market-linked cases are closer to the market than market-linked cases. So something must be wrong, and that something might be ascertainment bias."

It's not just vaguely "and that something might be ascertainment bias." It's that unlinked cases being closer to HSM than linked cases, is exactly what to expect if there was also proximity ascertainment bias (e.g., through searching nearby hospitals). Intuition: linked cases were found even if they lived far away (found via the case definition), but unlinked cases were almost only found if they lived nearby.

Expand full comment

Yes. Although it doesn't need to be "almost only" just "mostly". There were fairly many distant unlinked cases, just not enough to show up in their heat maps. In fact, one of the arguments used has followed the famous logic "We're not prejudiced against distant cases. Why some of our best friends are distant cases."

Expand full comment

Doubling time arguments very, very early in the pandemic aren't really valid because the sample size changes the math. Some people, maybe most people with early variants, infect zero other people. Others infect dozens. In bulk, this averages out and we can speak of an aggregate doubling time. When it's just a few people the tails are very long. A handful of people with COVID in November could turn into thousands in December or still just a handful, depending on the roll of the dice.

Expand full comment

Man, this doubling time argument is used several times here, underpinning a lot of rebuttal, and it is wrong enough to merit a mistake/correction.

The funny thing is that it is being used qualitatively, but actually one can directly compute, from published estimates of variability in transmission, what the probability is of there being N1 infections at time t1 given N0 infections at time t0. That distribution p(N1, t1 | N0, t0) is very narrow when N0 is large but wide when N0 is small, as it would be when N0 is the first few people to ever be infected. This number could just go directly into the likelihood that everyone is trying to compute, rather than being used as a knockdown argument with infinite weight.
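To make the "narrow when N0 is large, wide when N0 is small" point concrete, here is a rough Monte Carlo sketch. It assumes each infected person infects a negative-binomial number of others; the values R = 2.5 and k = 0.1 are illustrative assumptions, not figures from this thread, and the function names are made up for the example.

```python
import math
import random
import statistics

def poisson(lam, rng):
    """Poisson draw by Knuth's method; fine for the modest means used here."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def offspring(R, k, rng):
    """Negative binomial draw (mean R, dispersion k) as a gamma-Poisson mixture."""
    return poisson(rng.gammavariate(k, R / k), rng)

def cases_after(n0, generations, R, k, rng):
    """Number of infections in the final generation, starting from n0 cases."""
    current = n0
    for _ in range(generations):
        current = sum(offspring(R, k, rng) for _ in range(current))
    return current

def relative_spread(n0, generations=2, R=2.5, k=0.1, trials=500, seed=0):
    """Standard deviation divided by mean of the final generation size."""
    rng = random.Random(seed)
    sizes = [cases_after(n0, generations, R, k, rng) for _ in range(trials)]
    return statistics.pstdev(sizes) / statistics.mean(sizes)
```

Comparing `relative_spread(1)` with `relative_spread(100)` shows how much wider the outcome distribution is, relative to its mean, when the chain starts from a single case. A real likelihood calculation would need fitted R and k values and a proper generation-time distribution.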

Expand full comment

It feels like this point (in various forms, with various degrees of formalisation - yours is more formalised than most) was made about 15 times in the original comment section, pointing out how bizarre treating doubling time as some iron law small sample characteristic is, and the best we got in this iteration was "well, now I'm aware you shouldn't do this, but I will still do it, with the caveat that it's wrong [waves magic hands so that it won't matter]"

Expand full comment

> A handful of people with COVID in November could turn into thousands in December or still just a handful, depending on the roll of the dice.

If this is true, what is a time when a Covid outbreak in a previously zero-Covid region followed this pattern? i.e. somebody came to Italy, they infected 1 other person, who infected one other person, etc, for several weeks, at which point it finally broke out of the "staying just a handful" territory?

Expand full comment

If it happened, you wouldn't know, since there wouldn't be enough cases to track. But it is established that transmission is highly variable; here is a paper summarizing the issue: https://bmcpublichealth.biomedcentral.com/articles/10.1186/s12889-023-15915-1#:~:text=The%20dispersion%20parameter%20k%20is,Lloyd%2DSmith%20et%20al.

What is the probability, even with high dispersion parameter k (the parameter that determines the variability in the number of new infections caused by an infected person), that a transmissible disease could go on for a given amount of time (e.g. 1 month) without extinguishing completely (going to 0 infections) and without getting big enough to notice (e.g. 100 or 1000 infections)? I can't answer that off the cuff, one would have to simulate or derive a closed-form solution from the equations.

My point is that as far as I can tell, no one in the debate did this, and instead just assumed the probability is zero, without thinking carefully about it.
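For what it's worth, the question above can be answered exactly rather than by simulation, at least under simple assumptions. The sketch below assumes non-overlapping generations and a negative-binomial number of secondary infections per person; R = 2.5, k = 0.1 and the "noticed at 100 cases" threshold are placeholder assumptions, and the function names are hypothetical.

```python
import math

def nb_pmf(j, n_infected, R, k):
    """P(j new infections in total from n_infected people, each NB with mean R, dispersion k)."""
    r = n_infected * k  # a sum of independent NB draws is NB with size n * k
    p = k / (k + R)
    log_pmf = (math.lgamma(j + r) - math.lgamma(r) - math.lgamma(j + 1)
               + r * math.log(p) + j * math.log(1 - p))
    return math.exp(log_pmf)

def smoulder_probability(generations, n0=1, R=2.5, k=0.1, noticed_at=100):
    """Probability that every generation so far had between 1 and noticed_at - 1 cases."""
    dist = [0.0] * noticed_at  # dist[i] = P(i current cases, never extinct or noticed)
    dist[n0] = 1.0
    for _ in range(generations):
        nxt = [0.0] * noticed_at
        for i in range(1, noticed_at):
            if dist[i] == 0.0:
                continue
            for j in range(1, noticed_at):
                nxt[j] += dist[i] * nb_pmf(j, i, R, k)
        dist = nxt
    return sum(dist)

if __name__ == "__main__":
    for g in range(1, 6):
        print(g, smoulder_probability(g))
```

Chains that die out (0 cases) or reach the threshold are dropped at each step, so the remaining mass is the probability of smouldering undetected for that many generations. With roughly 6 days per generation, 5 generations is about a month.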

Expand full comment

> If it happened, you wouldn't know, since there wouldn't be enough cases to track.

Yes, you would. You would at the very least have a "patient zero" who had no apparent contact with anybody from outside the country. If they did have contact (e.g. they went to a funeral also attended by somebody from outside the country), the person from outside the country would have no covid antibodies (since the "patient zero" instead got it from some weird A->B->C->D->E chain, not the funeral).

> that a transmissible disease could go on for a given amount of time (e.g. 1 month) without extinguishing completely (going to 0 infections) and without getting big enough to notice (e.g. 100 or 1000 infections)? I can't answer that off the cuff, one would have to simulate or derive a closed-form solution from the equations.

Low. Covid-19 has a low k, which means most people do not spread it, e.g. here: https://theconversation.com/is-the-k-number-the-new-r-number-what-you-need-to-know-140286 . If 70% of people simply don't spread it, then to delay its exponential growth by 1 generation (~6 days) carries a 70% risk that the person who carries it to the next generation doesn't infect anybody. If it's 2 generations (12 days), 91%. If it's 3 generations (18 days), 97.3%. If it's 4 generations (24 days), 99.19%. 5 generations (30 days, or ~1 mo), 99.757%. Of course, there's also the risk that it starts to climb out of its "single host hole" in these generations, since if there are two hosts there is ~double the risk that it spikes out with a "superspreader" event, even higher for 3, 4, 5, etc.
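The percentages above are just 1 - 0.3^n, i.e. the chance that a single-file chain breaks somewhere in n generations if each carrier independently has a 70% chance of infecting nobody. A minimal check, taking that 70% figure as given (the function name is made up for the example):

```python
def chain_break_risk(generations, p_no_spread=0.7):
    """Chance that a one-case-per-generation chain dies out within `generations` steps."""
    return 1 - (1 - p_no_spread) ** generations

for n in range(1, 6):
    print(f"{n} generation(s), ~{6 * n} days: {100 * chain_break_risk(n):.3f}%")
```

This treats the chain as exactly one carrier per generation, so it ignores the "climbing out" effect mentioned at the end of the paragraph.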

I have work to do today, but if you can find a good p(n) table for Covid-19, where n is the number of people a given person infects before their own infection ends, I can plug it into a program pretty easily. It is very difficult for the number infected to stay low because with very low numbers of infected, there's a great risk of extinction of the virus in the host population altogether.

Expand full comment

Just saw that Peter Miller did a model using ~real R0 and k numbers: https://twitter.com/tgof137/status/1772417301561708753 . Looks like the earliest plausible first case is sometime in October, but overwhelmingly more likely to be in November.

Expand full comment

FWIW, Lv is sometimes how the pinyin transliteration for Lü is written, for ease of typing.

Expand full comment

When I think of all the discussion it is possible to have about the pandemic, in particular the correct public policy for pandemics, I'm perplexed that so much energy is going into this thing that has only tangential bearing on the future. Yes let's not have any lab leaks in the future. There I said it. Anyone disagree? Anyone in favor of lab leaks? Probably the most pertinent issue connected to lab leaks is whether and how we should be doing this kind of research. It's curious that there's almost no discussion of the future.

Expand full comment

This proposal caught my attention a few months ago -- https://forum.effectivealtruism.org/posts/K5jXKG33AEHccpTmH/cause-exploration-prizes-data-driven-wildlife-protection -- "So my final question is: through these data-driven initiatives, is Open Philanthropy up for the challenge of battling the global illegal wildlife trade?"

While I'm new to this, it seems like a no-brainer synthesis of concerns for animal welfare and human welfare, including and beyond disease reduction.

Expand full comment

Sure, but to get to the stage of "let's ban virology (again)" requires first passing through the stage of "there's a reason to ban virology (again)". Do you think people who believe in zoonosis are going to be enthusiastic proponents of jailing a few bent scientists?

Expand full comment

"Why is everyone so concerned about whether these particular named individuals are negligently responsible for millions of deaths?" is certainly a take.

Union Carbide paid about $1b in 2024 dollars and got 8 supervisors put in jail, and they only hurt half a million people and killed a couple thousand.

Obviously it's a big fucking deal if it was negligence rather than natural.

Expand full comment

You still need to get the virus into the wet market. Raccoon dog? Researcher? Frozen fish? Ed Holmes's picture proves raccoon dogs were in the market, but it also proves that virologists were in the market! Why spread from the market and not from the WIV? For hub-and-spoke reasons. Viruses don't spread very well in virus labs. The next places the virus showed up outside Wuhan were transport hubs: Singapore, New York, Milan. Where's the transport hub in Wuhan? The central railway station right next to the wet market.

Expand full comment

A virus would not spread outside of a lab; it would spread any place people congregate, like a market or mall. Look at an aerial view of the lab (https://images.app.goo.gl/ZhJesuy6pmTbWthG8): there is no way large groups of people would be around for spread to happen.

Expand full comment

The discussion seems to omit Steven Quay's analysis highlighting the Wuhan Institute of Virology was serviced by Line 2 of the Metro which ran through Hankou station. One of the locations that Stoyan and Chiu Chiu showed was near the early epicenter of reported cases along with the Wuhan Center for Disease Control.

There is also no evidence of any link between bats in Yunnan and Laos and raccoon dogs. The one clear link is WIV's sampling history, which, contrary to claims in the Rootclaim debate, extended up to 2019 and included these locations along with Vietnam and Malaysia. Daszak's emails released today highlight that WIV had a significant amount of unpublished results. Shan-Lu Liu noted early on this was the case and could have leaked given safety issues.

Expand full comment

It seems like going through the process of writing both the original post and this comments compilation has been a very stressful process, but I'm very glad you did and appreciate how thoughtfully you approached everything. I have found this all interesting and useful to read, and it has led to thoughtful conversations. Thanks!

Expand full comment

To be honest, the paragraph about Rootclaim's investigation into the chemical attacks had the strongest impact on me in this post. Before that I thought they were some serious group like the Samotsvety (although Saar's behaviour after he was defeated in the debate had already made me very wary). After reading it, I can only repeat the Tobias Schneider question: why should we take Rootclaim's opinion seriously at all?

Expand full comment

Scott, you have used a doctored version of figure 3E from Pekar et al. This misleading version was also used in the Rootclaim debate according to the replies here.

https://twitter.com/nizzaneela/status/1777989261817508165?t=B5bQ5CPYHe6e0A4mCxPEdQ&s=19

Expand full comment

Since it looks like nobody else has pointed it out:

> This has been true ever since the very beginning of epidemiology, when John Snow successfully traced a cholera outbreak back to its origin at a contaminated water pump by taking the center of the map of cholera cases.

This isn't what happened! Snow didn't make the map until well after he concluded that the contaminated pump was responsible; the map was made after the fact as a way to convince other people, not part of how he came to the conclusion. I learned about this from this video by Patrick Kelly: https://www.youtube.com/watch?v=bALs7kNpNSM

Expand full comment

I did mention this in the comments on the original post, I guess Scott missed it.

Expand full comment

Technical rebuttal to the analysis of early claims presented above. It has also come to light that the figure 3E used above, attributed to Pekar et al, is doctored. It's not the figure they use.

https://arguablywrong.home.blog/2024/04/09/how-likely-is-it-for-covid-to-establish-itself/

Expand full comment

> I would literally never do this if I was designing a small insert (maybe I wouldn't notice if it happened by chance with ~1 in 25 odds in a naive codon optimization algorithm as part of a larger sequence). High GC% is bad. Tandem repeat is worse. Several other perfectly fine arginine codons. And I wouldn't engineer a viral genome using human codon usage. An engineer would not do it.

While I do not care much for speculations about the origin story of covid, I find this fascinating.

What is so bad about tandem repeats?

Why is it bad to have a high GC percentage? Are these bases rarer in nuclei and thus limit the amount of copies a virus can create? Or is it that G,C are identical in DNA and RNA while T gets replaced with U?

I thought that the codon code was universal, human ribosomes parse viral RNA just fine.

Per Wikipedia, CG(U|C|A|G) and AG(A|G) encode Arg. Which of these are good for viruses? Why are others better for humans?

If this is bad for a virus, did later covid variants evolve to replace this with a better encoding?

Expand full comment

Both points relate to strength and specificity of nucleic acid binding and potential for errors in replication. Take a random coding sequence and replace all of the wobble codons that you can with C and G randomly and paste it into RNAfold and see what happens -- http://rna.tbi.univie.ac.at/cgi-bin/RNAWebSuite/RNAfold.cgi -- look up the structure of C-G and A-U basepairs to see why.
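If anyone wants to try that experiment, here is a small sketch of the "swap wobble positions to C or G where it's synonymous" step, using the standard genetic code. It works on DNA letters (T rather than U) and only touches the third base of each codon, so it won't do two-base swaps such as AGA to CGG; the function names are made up for the example. The output is what you would then paste into a folding tool.

```python
import random

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + m]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for m, c in enumerate(BASES)}

def translate(seq):
    """Translate complete codons; '*' marks a stop."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

def gc_wobble(seq, rng):
    """Set each codon's third base to C or G at random when that keeps the amino acid."""
    out = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        options = [codon[:2] + b for b in "CG"
                   if CODON_TABLE[codon[:2] + b] == CODON_TABLE[codon]]
        out.append(rng.choice(options) if options else codon)
    return "".join(out)

def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)
```

For example, `gc_wobble("ATGAAATTTGCATTAAGATGA", random.Random(0))` keeps the same protein while raising the GC fraction.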

On tandem repeats, imagine trying to read a page that's the same sentence over and over again while transcribing the results on another sheet, and now you can see how replication slippage can happen. Not the greatest analogy but hopefully it gets the point across.

I think that viral relative synonymous codon usage has less to do with what is better than with the product of differing mutation rates in different hosts & tissues, but I don't want to speculate beyond that.

G-to-U is a fairly frequent type of mutation and you can keep an eye here to see if there ever happens to be a mutation from CGGCGG to CGUCGG or CGGCGU that takes hold by chance -- https://cov-spectrum.org/explore/World/AllSamples/AllTimes/variants?variantQuery=G23608T+%7C+G23611T& (with the caveat that I bet this sequence is prone to sequencing errors since there are mutations nearby frequently).

Expand full comment

Regarding Syria, it should be noted that the western governments attributing the chemical attacks to Assad are largely the same ones which were arming the rebels in the first place, and Wikipedia is loath to write anything critical of NATO powers with regard to current events, especially when the US press circles the wagons.

I'm not saying we know that the rebels staged the attack, but I am saying there's a lot we don't know and lying on all sides of that mess.

Expand full comment
founding

Regarding Syria, it should be noted that the chemical attacks were attributed to Assad not only by "western governments ... which were arming the rebels in the first place", but also by independent NGOs using open-source intelligence. Suggesting that we should just write the whole thing off as propaganda from "western governments", nothing to see here, no way we'll ever know, move along now, is at best ignorance and folly.

Expand full comment

It's strange that Scott has trouble comparing his factors and mine since I posted a translation as a comment on his previous "book review".

Here's a slightly simplified and improved repeat of that comment.

Net priors using Scott's grouping + Scott's "reasons WIV wouldn't do it".

Scott 2.7 MBW 12 before integration of uncertainty, 5.4 after integrating

combo of all HSM/lineage factors (Worobey/Pekar)

Scott 0.002 MBW 1

FCS-ish

Scott 25 MBW 25

cover up success

Scott 0.5 MBW: did it?

other factor: restriction enzyme pattern

Scott NA MBW 70

So really it all comes down to Worobey/Pekar, plus the surprising new FOIA find that the DEFUSERS were planning just the restriction enzyme pattern that Bruttel et al. had noticed.

Does Scott (or anyone) really believe that there's about one chance in a thousand that the Worobey data are messed up? That there's 99.9% probability that George Gao, the official WHO summary on market linkage, Bloom, Kumar, Samsone, Lv... are all wrong? If Scott allowed any reasonable chance that they were right, then his odds would favor LL.

Expand full comment

The "DEFUSERS" quite obviously weren't planning that since one of them published a SARS2 reverse genetics system that does not use those restriction sites because they are not actually convenient at all -- https://www.cell.com/cell/fulltext/S0092-8674(20)30675-9

Just plug the SARS2 genome in and look at the restriction map yourself. Try to find precedent for the shortest fragment. The closest you will get from papers referenced by Bruttel et al, and it's not that close, is a fragment that's only as short as it is because a longer fragment made for an unstable plasmid and it was cut in two. Then just look at where the enzymes cut and try to rationalize that with any experimental plan. And, of course, the restriction sites (and the missing sites) are exactly what's expected based on sequences around the restriction sites and genomes of related viruses. The expectation of engineered sites is exactly the opposite.

The bit you think is relevant in the "new FOIA" has never been described accurately in the media to my knowledge. It is a sample budget sent to a lab in the USA to help prepare their budget. The same budget comes from North Carolina. It happens to include two items for a restriction enzyme to be replaced with whatever the other lab planned to buy. But... no evidence is needed that folks at WIV do reverse genetics with BsaI and BsmBI! No FOIA needed; it was published in 2017 https://journals.plos.org/plospathogens/article?id=10.1371/journal.ppat.1006698 -- it's just simply not evidence of anything and even if it were true it would not be evidence that adds information; one of the many blips of made up evidence that reveals the lack of supporting evidence for lab leak.

The paper in PLOS is the latest one. Just look at the dates in Figure 3A -- that's where the August 2019 date that keeps getting thrown around comes from -- "estimating the divergence time of SARS-CoV-2 at around 2019.58, which corresponds to August 2019" -- what is SARS-CoV-2 diverging from in August 2019 in that picture?

Expand full comment

Zach- It specified using 6 segments in a way that would allow repeated swaps, i.e. leaving the sites in. Be honest.

Expand full comment

"It" is the giant file of DEFUSE drafts and related communication.

The relevant text: "The genome of consensus candidates will be synthesized commercially (e.g. BioBasic), as six contiguous cDNA pieces linked by unique restriction endonuclease sites for full length genome assembly. Full length genomes will be transcribed into genome-length RNA and electroporation used to recover recombinant viruses (22,62)."

Both refs 22 and 62 use BglI which cleaves GCCNNNN↓NGGC necessarily retaining the restriction site in the assembled product. This is not necessary for BsaI or BsmBI -- no reason to add any sites ever.

Seriously just plug the sequence into https://nc3.neb.com/NEBcutter/ or similar, look at the BsaI/BsmBI digest, and wonder why you've never heard how short the shortest fragment is before.
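For anyone who would rather script it than use the web tool, a rough sketch: it looks for the BsaI (GGTCTC) and BsmBI (CGTCTC) recognition sequences on both strands and reports the spacing between them. This is only an approximation, since both enzymes cut a few bases outside the recognition sequence, so fragment lengths will be off by a handful of nucleotides compared with a real digest map. The function names are made up, and you need to supply the genome sequence yourself.

```python
BSAI = "GGTCTC"   # BsaI recognition sequence
BSMBI = "CGTCTC"  # BsmBI recognition sequence

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def site_positions(genome, sites=(BSAI, BSMBI)):
    """0-based start positions of each recognition sequence on either strand."""
    genome = genome.upper()
    hits = set()
    for site in sites:
        for motif in (site, revcomp(site)):
            start = genome.find(motif)
            while start != -1:
                hits.add(start)
                start = genome.find(motif, start + 1)
    return sorted(hits)

def fragment_lengths(genome, sites=(BSAI, BSMBI)):
    """Approximate fragment lengths for a linear genome, splitting at site starts."""
    cuts = [0] + site_positions(genome, sites) + [len(genome)]
    return [b - a for a, b in zip(cuts, cuts[1:])]
```

`min(fragment_lengths(genome))` and `max(fragment_lengths(genome))` then give approximate shortest and longest fragments to compare against the published reverse genetics systems.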

Expand full comment

The number of segments and that each is less than 8knt are obvious engineering constraints and show a clear pattern in synthetic viruses. Having one short segment or not doesn't fall in that category. The BsaI/BsmBI combination was used previously by WIV and by Baric.

You do understand that by pushing hard to be conservative I used only 0.2 probability that LL would show this pattern, because some other choices were possible if less likely. Also by pushing hard to be conservative I used a much higher probability under ZW than the early simulations gave, 0.002 rather than 0.0004. Then I systematically discounted for uncertainty, ending up with a ratio of 70 rather than 100. I can imagine an ultra-conservative calculation getting down to 20, but not further.

Expand full comment
Apr 11·edited Apr 11

Ok it’s clear you aren’t interested in checking the length of the shortest fragment. It’s more anomalously short for engineered viruses than the longest fragment is anomalously short for natural viruses. It’s not mentioned in Bruttel et al because it’s contradictory evidence to their hypothesis and they cherry pick analysis of longest fragment length. One of the cut sites is in the middle of S2 iirc for no reason.

The odds are near 1 that a natural virus with SARS2 genome outside of these sites will have these sites or very close to it.

At this point Bruttel et al will start to say “consensus genome” but they do not know or pretend not to know what “consensus genome” means in this context. A consensus genome of genomes that differ this much is a genome that won’t cause a pandemic.

“Consensus genome” in DEFUSE means resolving differences of a handful of nucleotides in genomes sampled at about the same time and place.

Expand full comment

"The odds are near 1 that a natural virus with SARS2 genome outside of these sites will have these sites or very close to it." That's just flat wrong, by orders of magnitude. I go through 4 distinct ways of estimating this chance and none give any number near one. One unused variant urged by a zoonatus gave zero/64k.

An LL advocate has confidently explained that the one very short segment between the last BsaI site and the first BsmBI site is a needed engineering feature due to issues with double digestions, important for re-use. You say that no one would use it for some unspecified reason. At that level of ultra-detail Texas sharpshooters can draw targets around the data and reverse Texas sharpshooters can draw targets away from the data forever. I try to stick to the basics where you know the data and the model well enough to not have to argue whether the likelihood ratio is 100/1 or 1/100 depending on which motivated argument you're listening to.


For the record, I messaged you and asked you to summarize your factors, before Scott's first blog post, so that your work could be part of the discussion.

You refused to make a summary and said that I needed to instead read every blog post that you've written and find any relevant numbers.

As such, you ended up a footnote to Scott's posts. That was entirely your choice.


Bullshit. I said just scan down the latest version for the logits that were all in bold face as stand-alone lines.


Scanning through the text from start to the first Appendix is fast with a track pad. The slow way, with a down arrow, took 80 seconds.


Peter, the way you phrase this suggests that Scott delegated this part of the research to you. Is that right?


No, not really. Scott wrote a blog post and sent it to Saar and me before posting, for feedback and comments.

In the draft post, Scott had already made a table comparing the odds everyone gave, as well as his own numbers. I had already made my own table comparing the odds everyone gave in the debate:

https://docs.google.com/spreadsheets/d/1LG2-Ir5Bl2sU0_USYV-IlfafrrQMWbNr/edit?usp=drive_link&ouid=107312989181202525546&rtpof=true&sd=true

The way Scott and I set up the priors was different, though -- Scott wanted to start with a global prior on lab leaks, the same way that Rootclaim did in the week 1 debate. I objected to this for various reasons, one of the main ones being that neither of the judges used a global prior on lab leaks, so you would basically just have to make up a number for them.

After some arguing, Scott did change his table so the priors were set up somewhere between what I did and what he did (without making any other changes).

The part about Michael was kinda separate. Since Scott chose to include one debate viewer's analysis, I thought it could be interesting to include some other pro-lab leak Bayesian analyses like Michael's. I would have either added his to my spreadsheet or sent it to Scott so that he could maybe include it. But I didn't get a helpful response from Michael, so I scrapped that idea.

Sounds like Scott wasn't that interested in reading all of Michael's blog posts and figuring out his latest theories either.


That makes more sense, thanks.


Scott writes:

>The first two known Lineage A cases were very close to the market

Response:

Untrue.

One case is just claimed to be near the market with no location or even distance given.

The other is yet another mistake in Worobey et al that we discovered during the debate. They claim the case "resided closer to the Huanan market than expected" based on Wuhan population distribution, forgetting that infections are not linearly related to density. In early stages of spread, the exponential growth could easily cause dense neighborhoods to have 10 times more cases per capita. So this "finding" is just an artifact of HSM being in Wuhan's densest areas. Given that it was already borderline significant at p=0.034, it’s safe to say there is nothing there after correcting the mistake.


You could try exponentiating the population distribution, drawing from that, and seeing if it makes a huge difference. I had the same concern, so that's what I did (with a different but highly correlated statistical test: the sum of distances from candidate outbreak locations to cases). Exponents of 0.5, 1, 2, and 4 (which seems to more than span the credible range) all gave similar results. This is what I expected, since from eyeballing maps HSM looks to be at about the median lived density of Wuhan, but it was worth checking.


Hard to respond without seeing the full calculation.


1. Draw from population^x [0.5, 1, 2, 4] a thousand times

2. Calculate sum of distances to case residences from each candidate outbreak location; do the same for HSM and for the geometric median of cases (minimum possible sum of distances).

3. HSM ends up having a sum of distances ~17 km higher than the minimum. The median is around 1000 km for x=1. Likelihood of being this low or lower seems similar to Worobey 2022 statistics, unsurprisingly. The median drops with higher exponents, but up to x=4 (already absurdly high imo given locations of early outbreaks globally) HSM is still significantly closer to cases. Since HSM is in a not-high-but-not-low density region of Wuhan, none of this is unexpected.

Given different possible choices for population distributions it's better to reproduce this than to trust any calculation I did if it matters... here is a population distribution data set different from the one I used to try https://sedac.ciesin.columbia.edu/data/collection/gpw-v4
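For anyone who wants to reinvent the wheel, the three steps above can be sketched roughly as follows. This is not the code used for the numbers quoted here: the grid, population weights, market location, and case locations below are all synthetic stand-ins, the function names are made up, and the geometric-median comparison from step 2 is left out for brevity. A real run would load a gridded population data set and the actual case residences.

```python
import math
import random

def sum_of_distances(point, cases):
    """Sum of straight-line distances from one location to every case."""
    return sum(math.dist(point, case) for case in cases)

def fraction_as_close(market, cases, cells, weights, exponent, n=1000, seed=0):
    """Fraction of candidate origins, drawn with probability proportional to
    population**exponent, whose sum of distances is <= the market's."""
    rng = random.Random(seed)
    powered = [w ** exponent for w in weights]
    candidates = rng.choices(cells, weights=powered, k=n)
    target = sum_of_distances(market, cases)
    hits = sum(1 for c in candidates if sum_of_distances(c, cases) <= target)
    return hits / n

# Synthetic toy data (assumption): a 20x20 grid with random population
# weights and 100 cases scattered around a hypothetical market location.
data_rng = random.Random(42)
cells = [(float(x), float(y)) for x in range(20) for y in range(20)]
weights = [1.0 + 9.0 * data_rng.random() for _ in cells]
market = (10.0, 10.0)
cases = [(market[0] + data_rng.gauss(0, 2), market[1] + data_rng.gauss(0, 2))
         for _ in range(100)]

results = {}
for exponent in (0.5, 1, 2, 4):
    results[exponent] = fraction_as_close(market, cases, cells, weights, exponent)
    print(f"exponent {exponent}: fraction as close as market = {results[exponent]}")
```

With real data, the interesting question is whether the fraction changes much across exponents; in this toy setup the cases are centered on the market by construction, so the output says nothing about Wuhan.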


1. Do you have source code?

2. Is this referring to all cases? My response was only for the single A case.


1. Yes but this is a case where it's better for someone to reinvent the wheel since it doesn't take long and there will be small differences with methods choices.

2. I don't put much stock into the 1 or 2 lineage A case locations; 1 environmental sample a couple dozen meters from susceptible wildlife is more interesting than 1 or 2 cases a couple kilometers away. The results would be similar for the lineage A case(s) and similar for the market-linked vs market-unlinked analysis.


The argument from Anon Virology, whom you mention above, that lineage B came first because it was more prolific and had more genetic diversity than lineage A is ridiculous. On that basis Omicron came first. Note that Lv et al (2024) published new genomes which suggest a single spillover with lineage A coming first.

Lineage A was seen in various places outside Wuhan before lineage B was, and was only later replaced. This is difficult to reconcile with A coming after B, which outcompeted it.

https://twitter.com/virologyanon/status/1778322175214436608?t=7e83InBAk2zAZVwaZXjDKQ&s=19

Apr 11·edited Apr 11

You realize that there are typically collection dates associated with sequences, right?

In early sequences (dates matter), lineage B has more diversity than lineage A, which has more diversity than posited ancestral genomes one mutation away from A. The analyses supporting this always ignore collection dates. You can throw out all of the data from Wuhan or all of the data from China and this is still true.

You can also count the number of places with exported sequences -- count provinces in China with imported genomes from Wuhan or countries with imported genomes from China -- more for lineage B than lineage A than lineages derived from A.

If you feel like fact checking this rather than ignoring it and moving on to something else, it's a good idea to censor lineage B.1 from searches and/or cut off dates before B.1 starts spreading internationally from Europe outbreaks. Censoring B.1* (e.g. with a cov-spectrum search) will also eliminate most goofs in reported collection dates since.


A has more diversity than its "ancestral genomes", and from this we conclude that B having more diversity than A is evidence A came... later?

I must be misreading something.


*posited* ancestral genomes...


Ah, posited — /but not by you./

Sorry. It's been a rough week.


I am surprised to see that the likelihood of natural COVID in Wuhan is considered to be proportional to its population. If we have a lab in a city with 10 million people, and another lab in a city with 100,000 people, we wouldn't consider the first to be 100x more likely to see a lab leak than the second.

What should matter is the contact surface between the potential source and people. If there are 10,000 people per day visiting the wet market or other locations with wildlife in Wuhan, that gives the city roughly the same probability as 10,000 people living in a rural area. Well, realistically it would be more for various reasons... but maybe 10x more, not 1000x more.

I imagine this was covered somewhere, but I haven't seen it. If a kind soul can enlighten me on what mistake I'm making, I would be grateful.


My blog discusses Wuhan's fraction of the overall mammalian wildlife trade. It's less than 0.01%, much less than expected from the population. The trade is overwhelmingly farther south. For individual species the fraction varies, as I discuss.


If I read correctly, in your blog you make a distinction between zoonosis from wildlife trade vs zoonosis from actual wildlife (wild wildlife?), or rather you treat them in aggregate, I think. For the aggregate, you take the population of Wuhan as a first estimate and refine it by noting that the relevant wildlife was not living near the city.

My point is that a random person in an urban area will have less chance to encounter (directly or indirectly) the relevant wildlife than the same person in a more rural area (a very quick googling gives me the impression most epidemics do start in rural areas). Wildlife trade might or might not compensate for this (you say it doesn't), but even then, I'm not sure why we would use the population in the first place. It just seems like too bad a proxy for what we're trying to measure.

Can you explain your thinking? Maybe you don't think it's relevant given the other factors you do mention? Or am I just wrong to think it's a bad proxy?


Approximately using the population is quite conventional, shared by all the Bayesian analyses regardless of where they end up. Wuhan should go down compared to say Guangzhou but up compared to Beijing etc. so population gives a decent first approximation for some generic spillover. It's true that based on seropositivity for related viruses rural areas should go up and cities should go down, but it's easier for an epidemic to get going in a city because the higher density raises R0. So population makes a decent first estimate for generic spillover, though probably generous because of the radically higher density of sources farther south.

For market-specific stories, there are fortunately official stats on overall trade in China and specifically in Wuhan. Here you can see that Wuhan gets much less than its population-based share. The markets are almost all in Guangdong, Guangxi,...

Since in various prior warnings (e.g. from EcoHealth Alliance) markets weren't a particularly prominent path, Wuhan's small share doesn't appreciably lower the prior odds of a spillover there. It does of course substantially lower the odds of a market spillover there, which becomes relevant because the most widely publicized papers alleging strong evidence for zoonosis were entirely focussed on a market-specific account. It's one among several serious reasons (including shocking internal math/coding/logic errors) to discount those papers. Since Scott gets a BF of 500 from them, it's a BFD for his analysis. Knock that factor down much and he ends up with odds favoring LL.


"it's easier for an epidemic to get going in a city because the higher density raises R0"

I think that's the part that confuses me regarding my original point. Shouldn't we ignore this? Isn't it screened off by the fact that we know it happened in Wuhan?

I mean, what determines whether an epidemic gets going is where it starts (Wuhan) and how good it is at spreading (furin cleavage, etc). Since we're already taking both of these things into account, shouldn't we ignore the fact that the epidemic in fact got going? It feels like double counting.

It's as if we were taking into account the fact that the virus started in a big Chinese city to determine the odds that it started in Wuhan. Of course that's gonna boost Wuhan's odds, but in this case it's obvious that we're doing something wrong.

Of course assuming your sources regarding the market trade are solid, none of that is going to matter much. But honestly I'm more interested in knowing whether I'm making a mistake than in the final answer - I agree with Scott that we should take both lessons to heart in either case.


Here's what we're trying to calculate for this particular likelihood factor.

P(starts in Wuhan|this type of pandemic from zoonotic spillover)/P(starts in Wuhan|this type of pandemic from DEFUSE-style lab).

The denominator isn't much less than 1, so it's boring. We're really discussing the numerator.

It amounts to what an informed person would guess ahead of time if asked what the probability of a Wuhan location would be if a spillover were to occur. It just depends on how Wuhan compares with other places. If you knew nothing at all about the type of disease or how pandemics get started you'd just say that P(starts in Wuhan|this type of pandemic from spillover) would be about Wuhan's share of the population. You do know some more, though. If this is zoonotic, the source is definitely from around Yunnan or farther south. That would reduce Wuhan's share of the probability. Wuhan is a big city, which may make it more likely than rural areas to show a noticeable early stage of a pandemic. That doesn't matter much since China is mostly urban. Rural areas show many more serological signs of related background infections from wildlife, so that could reduce the probability that the start would be in Wuhan. These factors just give some uncertain fuzz to the estimates.

Once you specify that you're asking a similar question but about market spillovers specifically, rather than generic spillovers, the reasoning is similar but you start with Wuhan's share of the mammalian market trade rather than of the population.

The form of the reasoning is always the same. Conditional on some hypothesis (e.g. |market spillover), what would be the probability of some observable outcome (e.g. first found in Wuhan): P(Wuhan|market spillover). As I think you're getting at, you do have to be careful not to inadvertently sneak some observation into your hypothesis when it wouldn't have been there, or would have been less specific, if you had made the estimate before knowing the observation.


When you use 'this type of pandemic from zoonotic spillover' as the hypothesis, this is implicitly two separate events: there was an occurrence of zoonosis (a virus was transmitted directly from an animal to a human), and it turned into a pandemic. But the fact that it turned into a pandemic is an observable, not part of the tested hypothesis.

I tried to see in your post how you dealt with that, but I'm not sure you do? But then the odds that you get at the end are not for P(zoonosis)/P(LL), they are instead for P(zoonosis & pandemic)/P(LL & pandemic). Since the pandemic has indeed been observed (we're not trying to predict a future event), that should be factored out. So we end up with (P(zoonosis) * P(pandemic|zoonosis)) / (P(LL) * P(pandemic|LL)). We get the factor P(pandemic|zoonosis)/P(pandemic|LL), which should favor LL for exactly the same reason that considering the pandemic part of the hypothesis when calculating the odds for Wuhan favored zoonosis: zoonosis has more chance to occur in rural areas and not go very far.
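To make the decomposition concrete, here is a small sketch with purely hypothetical numbers (none of them come from anyone's analysis). It only checks that the joint odds equal the prior odds times the factor P(pandemic|zoonosis)/P(pandemic|LL), which is the term I'm asking about.

```python
import math

def joint_odds(p_z, p_ll, p_pandemic_given_z, p_pandemic_given_ll):
    """Odds of (zoonosis & pandemic) against (LL & pandemic)."""
    return (p_z * p_pandemic_given_z) / (p_ll * p_pandemic_given_ll)

# Hypothetical inputs, chosen only to illustrate the algebra.
p_z, p_ll = 0.9, 0.1               # probabilities of each kind of initial event
p_pan_z, p_pan_ll = 0.01, 0.05     # chance each kind of event becomes a pandemic

prior_odds = p_z / p_ll
pandemic_factor = p_pan_z / p_pan_ll
combined = joint_odds(p_z, p_ll, p_pan_z, p_pan_ll)

print(f"prior odds (zoonosis : LL)     = {prior_odds:.2f}")
print(f"pandemic factor                = {pandemic_factor:.2f}")
print(f"odds given a pandemic occurred = {combined:.2f}")
```

If spillovers in rural areas mostly fizzle out, the pandemic factor is below 1 and pulls the odds toward LL, whatever the actual values are.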

Sorry if I'm misunderstanding your method, your blog post is quite long and I've only given it a once over for now.


It sounds like this has been pretty stressful to cover, but thanks for doing it - it's a fascinating topic for all kinds of reasons, not just to do with Covid, and I appreciate you highlighting it and talking me through it all.


I don't really understand the responses Scott's made to Saar's "but is the market cluster really such a coincidence in lab leak scenario?" — seems like they sort of get in the weeds about /why/ these markets tend to be involved in early COVID spread, but that doesn't address the main thrust of the point (whether or not it's the frozen seafood, it's still not unlikely in a lab leak scenario if it kept happening when COVID spread to other countries).

But I haven't had time to go back and re-read more carefully, so maybe I'M missing something.


Nobody knows what covid "doubling time" is. Usually when people talk about "doubling times", especially in the context of early covid cases, they mean doubling times of measured tests. This is something completely different from the actual doubling time. I am very surprised that people who are otherwise doing good statistics miss this distinction. If you are at the same time ramping up testing, and ramping up awareness/fear of the disease, those are already two (!) exponential processes that influence the measured number of tests. Good luck extracting the actual doubling time.
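A toy illustration of the point, under the simplifying assumption that both infections and the fraction of infections detected grow as clean exponentials (the function names and the numbers are made up). When two exponential processes multiply, their growth rates add, so the doubling time seen in reported cases is shorter than the true one.

```python
import math

def combined_doubling_time(*doubling_times):
    """Doubling time of a product of exponentials: growth rates add."""
    return 1.0 / sum(1.0 / t for t in doubling_times)

def measured_cases(day, true_dt, testing_dt,
                   initial_infections=100.0, initial_detection=0.01):
    """Reported cases = infections * fraction detected.
    Only meaningful while the detected fraction stays below 1."""
    infections = initial_infections * 2 ** (day / true_dt)
    detection = initial_detection * 2 ** (day / testing_dt)
    return infections * detection

TRUE_DT = 6.0      # hypothetical true doubling time, in days
TESTING_DT = 6.0   # hypothetical doubling time of detection effort, in days

apparent_dt = combined_doubling_time(TRUE_DT, TESTING_DT)
growth_over_apparent_dt = (measured_cases(apparent_dt, TRUE_DT, TESTING_DT)
                           / measured_cases(0, TRUE_DT, TESTING_DT))

print(f"true doubling time: {TRUE_DT} days")
print(f"apparent doubling time in reported cases: {apparent_dt:.2f} days")
print(f"growth in reported cases over that interval: {growth_over_apparent_dt:.2f}x")
```

Adding a third exponential, such as rising awareness, shortens the apparent doubling time further, and without independent data on testing there is no way to separate the terms.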

Also covid is not actually exponential. The exponential model of covid spread makes some very implausible assumptions about how people are connected. There is unfortunately very little research on this because accurately modeling these kinds of networks is hard. Also with covid there were clear seasonal effects in the spread in Europe - it basically didn't spread at all in summer, just like the flu.

That said, I don't really buy "Brazil had 4 months of covid and nobody noticed". But it isn't as much of a slam dunk as you make it out to be.


Does this debate make Rootclaim reconsider the usefulness of this betting/debate idea?

It seems like, from their point of view, the truth did not prevail - neither in the debate itself, nor in Scott’s commentary, and many people have updated towards zoonosis.

So was the whole debate counterproductive wrt spreading the truth as they see it? What does that imply for the format generally?


>Here is a map of December 2020 COVID cases. I recommend ignoring the contour lines and just looking at the dots. How could dots be oversmoothed?

Counterargument: this is potentially an argument against the data being good. The infected people at the wet market presumably don't all live there; they commute (or at least their customers commute). Why wouldn't you have any secondary clusters anywhere? It's kind of weird that you have so many cases centered around it, suggesting that maybe people got tested more near the wet market than elsewhere?


You are correct. I recommend reading Weissman's well-sourced post ( https://michaelweissman.substack.com/p/an-inconvenient-probability-v57). Apparently the WHO and Chinese CDC only tested folks that were linked to the wet market (HSM, Huanan Seafood Market). So, unsurprisingly, they tested people around HSM and found that their hotspot was next to HSM. Normally it would be silly that this study passed muster. Now it's just sad.

I got this from Weissman's post. It goes over some of the sampling error:

https://journals.asm.org/doi/10.1128/mbio.00313-23


Well thanks for doing this Scott, each part has been an illuminating read and it seems like it was all a giant hassle. Good luck with the lab leak zealots.


The zoonotic spillover advocates show just as much zeal, believe me. I'll post a separate response to the points raised, but I agree that Scott should also read Professor Weissman's analysis and Jesse Bloom's more recent analysis of the market samples (I had incorrectly cited this as 2023 rather than 2024), which wasn't covered in this debate. The fact that Daszak has just released emails today showing they had ~700 unpublished genomes from sampling in Yunnan, Laos, and Vietnam that was going on up to 2019 again highlights how new information keeps coming up that undercuts the claim that WIV didn't have close enough progenitors. We simply don't know and probably never will, short of a whistleblower one way or the other.

https://michaelweissman.substack.com/p/an-inconvenient-probability-v57


You are doing the exact "this new piece of information CHANGES EVERYTHING" move criticized above. It's time to pack it in.


When your debate opponent posts a picture of a sequenced pair of viruses and says, "See!!!" and you can, in fact, see an obvious explanation, then it is clear that they either have no idea what they are talking about or are just being disingenuous.


How could anyone possibly think that an alignment of viral genomes that share a common ancestor some hundreds of years ago is a mic drop moment in this way? It shows how much of an echo chamber the whole lab leak sphere is that no one will apparently say how bad it is.


Scott writes:

>The fact that the COVID comparison has few mutations, and the HKU1 insert has many mutations, just shows that whatever older virus we chose to compare HKU1 to is more distant from HKU1 than BANAL-52 (or whatever) is from COVID.

Response:

It doesn’t matter what the reasons are for there being so many mutations. Whatever the reason is, it also applies to the insert, providing it far more opportunities to happen.

Additionally, having a long insert is just one of 4 rare coincidences in the SARS2 FCS. The rest don't appear in HKU1:

1. The HKU1 insert doesn’t introduce a new FCS. It just happens to be next to it.

2. The HKU1 insert just seems like 7 repeats of TCT (Serine) plus a few SNVs. That's something that is common in natural evolution (duplicating a sequence during RNA replication).

3. It has no rare sequence like CGGCGG.

In contrast, the SARS2 insert appears in a highly conserved area with few mutations, introduces a completely new FCS, and uses a specific unique sequence that is completely foreign to this virus.

Given that this is the best example that zoonotic supporters managed to find, and it addresses only 1 of 4 rare coincidences, it actually strengthens the conclusion that this is very unlikely to arise from natural evolution.

Apr 12·edited Apr 12

"In contrast, the SARS2 insert appears in a highly conserved area with few mutations" is one of the least accurate statements you can possibly make.

You're aware of counterexamples during the SARS2 pandemic of "clean" and "foreign" inserts including those with the same length and including CGGCGG (and now CGGCGGCGG!) -- no need to appeal to alignments of genomes that diverged hundreds of years ago.

Much less significantly because neither is a good argument: arguing that CGGCGG is impossibly rare and then arguing that an insert containing CGGCGG is foreign is double counting.

Lastly, based on what's written at rootclaim, you seem to be under the impression that something like protease cleavage is strictly determined by the linear amino acid sequence. You write, "The FCS could have developed through standard single nucleotide variations (SNV), which are 10,000 times more common" but this is problematic and here are a couple reasons why:

1. Protease cleavage isn't determined strictly by linear amino acid sequence. The context in the protein structure as well as post-translational modification can't be ignored.

2. Ignoring that, let's suppose BANAL-20-52 mutates from QTQTNSR|SV to QTQRNSR|SV to produce a mediocre predicted cleavage site (per ProP). Why didn't it happen? One reason is that lots of things could have happened and didn't happen. Another reason is that the "T" here is encoded by ACU; the most likely path to get to "R" that jumps out is synonymous mutation to ACG (or ACA) and then mutation to AGG (or AGA). Step 1 wouldn't be selected. Step 2 requires one of the rarest SNVs.
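The codon claim in point 2 is easy to check against the standard genetic code. A minimal sketch (the helper names are made up; the codon sets are the standard RNA codons for threonine and arginine):

```python
# Standard genetic code, RNA alphabet.
ARG_CODONS = {"CGU", "CGC", "CGA", "CGG", "AGA", "AGG"}
THR_CODONS = {"ACU", "ACC", "ACA", "ACG"}

def hamming(a: str, b: str) -> int:
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))

def min_changes_to_arg(codon: str) -> int:
    """Fewest single-nucleotide changes needed to reach any Arg codon."""
    return min(hamming(codon, arg) for arg in ARG_CODONS)

def one_step_arg_neighbors(codon: str) -> list:
    """Arg codons reachable from this codon by exactly one substitution."""
    return sorted(arg for arg in ARG_CODONS if hamming(codon, arg) == 1)

changes = {codon: min_changes_to_arg(codon) for codon in sorted(THR_CODONS)}
for codon, n in changes.items():
    print(codon, "->", n, "change(s); one-step Arg codons:",
          one_step_arg_neighbors(codon))
```

So ACU needs two substitutions to encode R, while ACG and ACA each need one (C to G at the second position), which is the two-step path described above. The sketch says nothing about mutation rates or selection; it only confirms the codon geometry.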

Seriously you might want to consider talking to literally anyone who has any clue what they're talking about before going further down this silly god-of-the-gaps rabbit hole. When was the last time you made this argument to someone who bought it?


This response has several mistakes. If you wish to continue discussing it, please edit your response to remove personal attacks and show openness to being wrong.


I edited and removed a new line that I inserted as a copy-paste error. I don't think I'm attacking anything other than bad arguments. You wrote the same thing on Manifold when I bluntly told you how stupid the HKU1-OC43 alignment was and then it reappeared on your Rootclaim blog.


Oh didn't realize it's you. What's the purpose of assuming the other side is 'silly' 'stupid' 'doesn't know what they're talking about' instead of just asking them to explain?

Will gladly continue discussing this under the assumption both sides are open to being mistaken, and working towards truth. Otherwise, I'm fine continuing this with someone else.


The point is to be honest about how bad the arguments are and hope that works eventually. There's no explaining the idea that SARS2 is different from HKU1 because SARS2 is more similar to a virus with shared ancestry a few years ago than HKU1 is to a virus with shared ancestry hundreds of years ago.

You're right "silly" is the wrong word -- plenty of people who should know better fell for CGG and restriction enzyme stuff and it's not silly that our intuition is bad so often about this sort of thing.


It's a rare event but I agree with one of your points. The CGGCGG and 12nt insert observations are not independent. As you say, once it's granted that it's an insert, the CGGCGG probability goes way up compared to what it would be in some generic part of the genome. Ignoring that is double counting, as you say. Still, even in inserts there's a CGGCGG probability, and it doesn't become large just because it's demonstrably not zero. You just have to grind through the actual numbers.


The inserts I've found with RR (and now RRR) are disproportionately CGG, but the sample size isn't so high and I wasn't systematic about looking for them (typically I just look at a couple sites with frequent inserts to reduce the false positive rate; lots of apparent inserts are sequencing artifacts).


Gadboit did it systematically and got 2.27% using multiple natural sequences, way higher than the 0.07% or less that you get for the generic genome sites.


I know you ended this post with a lament that the haters will continue on, but I was a 75-25 lab leak guy before your posts here and I'm closer to 90-10 zoonosis now. Big things that changed my mind:

1. China wanted to cover up a market origin - I knew that China was hiding something with all the evidence they destroyed, but I didn't consider that they might have done so with the intention of covering up a market spillover, which made any conclusion harder to come to.

2. Other zoonotic origins have taken several years to trace to an animal source, if it is found at all - I think there was an example of a known zoonotic origin floating around that made me think the source would be easier to find if it existed, and its absence made me lean lab leak more than I do now.

3. The unscrupulousness of Saar - he definitely seems like a smart guy but when you don't take any time to step back and consider you were wrong after losing the debate and go to immediately claiming you are still Obviously right and will be vindicated, it betrays a severe lack of epistemic humility that is necessary to actually be right about stuff.

Apr 14·edited Apr 14

Regarding #3, how is this even relevant? There are *many* unscrupulous people arguing prominently for the lab leak hypothesis. There are also many smart, conscientious people arguing prominently for the lab leak hypothesis. Just like there are many unscrupulous people and also many smart conscientious people arguing for the zoonotic hypothesis.

There's one guy who was rich and arrogant enough to offer a $100k bet on this. Those qualities are only weakly correlated with being smart, conscientious, scrupulous, or correct, particularly in a field distinct from the one they've made their money in. And almost completely uncorrelated with being the person who presents the best argument for a hypothesis, such that if you can dismiss their argument then you can largely dismiss the hypothesis. So, stipulating for the sake of argument that Saar is unscrupulous and not as smart as he seems, how is it relevant that the rich arrogant guy happens to fall in the "unscrupulous, lab-leak" quadrant rather than one of the other three? That's mostly a coincidence, but it seems to be one-third of your case for going from 3:1 lab leak to 10:1 zoonosis.

You're not the only one to make this mistake. Saar paid $100K to claim the title, "person everyone should listen to about the lab leak hypothesis", and a large fraction of this allegedly rational community seems to have taken the bait and said "yep, he was the one we should listen to, and he didn't convince us, so we're done".

You'd have done better to pick the most physically attractive proponents of each hypothesis, and gone with the one who seemed most scrupulous to you. The same 50/50 chance of getting the right answer, but a better view during the debate.


I think it's a mistake to conflate "this is one of the three things on your list" and "this is one third of your reason for changing your mind".


I agree with John. It's extremely weird to put a lot of weight on one self-selected representative of a position. It's especially weird when people say "he did poorly, was unprepared etc" and treat that as an argument against LL.

Yes, you didn't say it was one third of the weight, but you did present it as one of the 3 "big things" behind a 27-fold change in odds. (I could just as easily argue Peter was unscrupulous; he literally doctored a graph from a paper.)
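For reference, the 27-fold figure follows from converting the two stated positions (75-25 lab leak, then 90-10 zoonosis) into odds. A quick sketch, with a made-up helper name:

```python
def odds(probability: float) -> float:
    """Convert a probability into odds in favor."""
    return probability / (1.0 - probability)

lab_leak_odds_before = odds(0.75)   # 75-25 lab leak, about 3:1
zoonosis_odds_after = odds(0.90)    # 90-10 zoonosis, about 9:1

# Going from 3:1 for lab leak to 9:1 against it multiplies the odds by 3 * 9.
fold_change = lab_leak_odds_before * zoonosis_odds_after

print(f"lab leak odds before: {lab_leak_odds_before:.1f}:1")
print(f"zoonosis odds after:  {zoonosis_odds_after:.1f}:1")
print(f"fold change in odds:  {fold_change:.0f}")
```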


A lot of critics have concluded that Saar must have just presented the wrong lab leak theory, and there's secretly a better lab leak theory which would have convinced the debate judges and also Scott Alexander.

It seems like a guy with 100 million dollars and a small research team, working alongside the co-founder of DRASTIC, would put up a case that's at least somewhat close to the best current version of the lab leak theory.

After Scott's first post, dozens of people tried to fill in the gaps and show why he was wrong, and Scott just gave page after page of easy rebuttals. But that might just be because none of those people actually knew the real lab leak theory, that secret version which is actually correct and persuasive.

I tend to think the problem is just that the lab leak theory is not correct, and falls apart under examination.

But, who knows, maybe there's a better lab leak theory out there somewhere, just waiting for some smart, conscientious, and scrupulous person to present it.


More diversity comes from more spread.

Very early on, most of the first sequences outside of Wuhan were A, not B.


New discovery regarding the claim that a lab-leak requires WIV to have 'secret viruses', which is supposedly unlikely:

Daszak also confirms that EcoHealth and Wuhan Institute of Virology "have 15,000 samples in freezers in Wuhan, and could do the full genomes of the 700+ CoVs we've identified," contradicting his previous false claim that EcoHealth/WIV had published all of the CoVs they identified.

https://twitter.com/R_H_Ebright/status/1778930829563191562


I find the molecular evidence detailed in this video quite overwhelming, and it was not properly addressed in this debate.

https://www.youtube.com/watch?v=EuuY94tsbls&t=621s


Yes. It is especially interesting since later evidence confirmed 'predictions' made by their model.

https://usrtk.org/covid-19-origins/scientists-proposed-making-viruses-with-unique-features-of-sars-cov-2-in-wuhan/

We did not use it during the debate because a) the new evidence was not yet available, and b) we do not like complex models: they have too many single points of failure, so steelmanning makes it hard to give them low conditional probabilities. We similarly rejected complex zoonosis models.

Apr 16·edited Apr 16

The better question is whether anyone should trust Chinese science published for an external audience. They obviously have incentives to know the truth at a leadership level. That is not the same as what is disseminated.


> Wikipedia and all Western governments agree with Tobias and not Saar.

Perhaps replace "agree with" with "believe" in the case of Wikipedia, and "support" in the case of Western governments. Governments and their intelligence agencies have agendas of their own that don't always align with truth, and Wikipedia doesn't perform its own investigations to form an independent opinion, so it is basically just citing other people with "credibility" (like government reports). I don't know who's right, but I did follow the Syria stuff for a while and there was some suspicious weirdness about the investigation, e.g. the original team being replaced and their report thrown out, whistleblowers, inconsistencies and more.

Thanks for the debate and this post. I always leaned slightly towards zoonotic, but the WIV coincidence, the poor safety record and the evidence of gain of function research being done there made me quite suspicious. Your review is definitely more convincing for zoonotic origins.
