188 Comments

Don’t forget about the role of mosaic variants in schizophrenia

Perhaps you already have this in mind, but I think a lot of the non-genetic component of why someone does or does not develop a disorder like schizophrenia is stochastic (chance) rather than whether they were exposed to certain environmental risk factors.

Kevin Mitchell describes this well:

‘Whether that risk manifests as actual disease, and which symptoms emerge, is probabilistic, reflecting additional non-genetic factors. Given the lack of evidence for systematic environmental risk factors, this diversity of outcomes may reflect intrinsic stochastic variation in the trajectories of brain development. Chance thus plays a substantial role in determining which outcome from a wide possible range is actually realised by the processes of development in an individual.’

http://www.wiringthebrain.com/2022/03/what-have-we-learned-from-psychiatric.html

It's very similar to the old: a <cancer|whatever> quick-test has a 1% false positive rate. Your test is positive. What is the chance you have <cancer|whatever>? 99%?

When only a very small percentage of the population has the measured condition, statistics are very unintuitive.
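
A minimal sketch of that calculation, for anyone who wants to check the intuition. The 0.1% prevalence and the perfect sensitivity are made-up assumptions for illustration (the question above only specifies the 1% false positive rate), and `posterior_given_positive` is a hypothetical helper name.

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """Bayes' rule: P(condition | positive test)."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Assumed for illustration: 1 in 1,000 people have the condition,
# and the test never misses a real case.
p = posterior_given_positive(
    prevalence=0.001, sensitivity=1.0, false_positive_rate=0.01
)
print(f"{p:.1%}")  # prints 9.1%
```

Under these assumed numbers, a positive result means roughly a 9% chance of having the condition, not 99%.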

The first argument is also often used to explain why in elite challenges (e.g. interviews at top companies) luck matters more than skill: since everyone who has a chance at all is already in the top percentile of skill anyway, the variance introduced by luck matters relatively more and is the main factor.

This becomes more true the more selective the process is. Since schizophrenia, like Harvard, is very selective, we expect the difference between people who get it and people who were strong candidates but didn't to be mostly luck (or environmental factors).

Re the Nazi eugenic argument: I don’t think many schizophrenics would have had children once the disease developed, especially back then when there weren’t any drugs ameliorating their condition. So the Nazis killing them did not change the national genetic composition much.

Can we summarize the confusion as being something like base rate neglect? The prevalence of schizophrenia is so low that even knowing your twin has it doesn’t make your odds that high, because they’re so so so low to begin with. Going from 1% to 20% chance of having schizophrenia is a massive step up.

I am someone who takes great interest in scientific findings outside his own area of expertise.

I find it rather disheartening to discover that most of it is rather bunk, and that you need to be an expert in the field, writing simulations, to prove how bunk it is. I find it even more disheartening to discover that almost no-one who is an expert in the field bothers to notice or debunk the bunk.

So I find it very refreshing (heartening?) that you are doing so. This blog is a goldmine for anyone who wants to see how things are going in Psychology, EA, forecasting, and other topics where you point your mind.

I guess I might as well make this a question for the commentariat... what techniques do you find for helping discover great insights in fields where you have an interest, but no background, and are thus susceptible to falling for the bunk? Another way of putting it, if you were in charge of writing summary articles about all the cool stuff in various fields, how would you make sure you weren't being duped?

I think this is a really rare complete miss for you, for the reason that “is mostly genetic” is just a very naive way to think about any disorder. Both Torrey and Aftab’s articles are specifically trying to reach past this kind of naive conception to point at the very thing you discovered in your analysis: that while genetics are a primary risk factor, they simply are not sufficient to cause schizophrenia alone, and waving a hand at ‘polygenic background’ doesn’t do anything to explain why, even in the presence of enormous genetic risk, there are critical environmental factors which it would be wise for us to think of as causative (even if they wouldn’t be necessarily causative in the presence of a lower genetic risk profile).

Honestly I don’t understand the reluctance to embrace the model that “maybe if hundreds of genes interact in unclear ways to create an individually highly variable risk, and it turns out identical genetic risk profiles will or will not develop schizophrenia based on environment, it’s not so smart to insist on thinking of causation as genetic.” It’s not like this explanation elides genetics, it just doesn’t throw in the towel on causation once we discover that causation isn’t purely, or even isn’t ‘mostly’ environmental.

There is a level where clinging to your mental model as “mostly this, therefore because of this and should be talked about as this” is unhelpful. Here’s a probably overloaded metaphor. If I build a building, and it collapses due to wind load, I don't just stop at saying the cause was wind. I talk about how the designers knew that wind was a primary risk for this sort of building and the design was insufficiently prepared for wind. What we are finding out is that certain buildings by their nature are more exposed to the wind, and are going to require additional reinforcement. Now we need to know what the specific scenarios are whereby winds arise and interact during the storm to collapse all buildings that collapse via wind; knowing that there is a lot of risk for particular styles of construction helps us, but it isn’t enough to say “wind did it” and put your fingers in your ears when it comes to the other factors that made wind a critical actualized risk rather than a potential risk. It is proper in this scenario to say that the cause is poor engineering, which should account for the wind and doesn’t. Wind is everywhere and we aren’t going to be eliminating it anytime soon, right? Likewise with the literally hundreds of genes with unclear interactions that create risk for psychiatric disorders.

Regarding Torrey in particular: well, I try to say very little about the sort of person who's a first-degree relative of someone with Scary-Name Neurodevelopmental Disorder that blatantly represents the far end of population variance, and is absolutely dead insistent that Scary-Name Neurodevelopmental Disorder traits represent the face of evil itself, given we know that in all these cases first-degree relatives of people with Scary-Name Neurodevelopmental Disorder generally have a ton of Scary-Name Neurodevelopmental Disorder traits. But I can see why he (and many others for various SNNDs) is so enraptured by anti-genetic explanations for this particular SNND, yes. The fact Torrey is a disciple of Gajdusek is something I've been thinking about lately -- Gajdusek is one of the people with a claim to discovering prions, and is, uh, a 'character' (a self-identified "pedophiliac pediatrician", amongst other things). I'm not entirely sure what to make of the Gajdusek connection, given how little anyone focuses on it. Gajdusek is, by the words of anyone who interacted with him, not someone you forget easily.

In the specific case of schizophrenia, the "all first-degree relatives of people with SNNDs basically have SNNDs themselves" problem stands out somewhat. Research on SZ is really bad, consistently, because it comes from a psychosis-focused perspective and not a schizotypy-focused perspective -- there's increasing recognition of concepts like the "psychosis spectrum", but the keyword still tends to be psychosis. "At-risk mental states"/"prodromes" are some of the least-bad people have gotten, but are still terrible, because they're written under the assumption people diagnosable with those terms will *usually* develop psychosis, which is clearly untrue. Also, holy shit, have you seen a 'prodrome' test? They're awful. They read like they were written by someone trying to strawman the worst ends of psychiatry. "Do you have strong feelings or beliefs that are very important to you, about such things as religion, philosophy, or politics?" "Do you daydream a lot or find yourself preoccupied with stories, fantasies, or ideas?" "Do you usually prefer to be alone?" "Do you find that you have trouble getting motivated to do things?" Well, bad news, kid -- you might be a latent schizophrenic! Here, enroll in this trial where we give you antipsychotics.

Anyway: I think schizotypy, proper, is pretty intrinsic, though it's also modulated by other factors (e.g. developmental stage, paranoia-inducing experiences). When you're looking at a particular extreme subset of the schizotypal population, this is going to be muddled. I'm not sure if people who end up with an SZ diagnosis are particularly representative of the schizotypal population, not just in terms of "being on some extremes" but in much deeper and more fundamental senses. I have the strong and consistent impression that SZ vs bipolar diagnosis primarily loads on things other than presentation of the psychosis alone, for instance, and that's already selecting for the minority of people who have such obvious and serious psychosis that they get diagnosed with something and stick around the system for long enough other people notice.

Something I've wanted to mention here for a while:

One point about twin studies in general is that often the twins are not separated at birth, but several years afterwards. So overall this supports the lead-crime hypothesis, because otherwise, the environmental effects are too large, and lead proves too much (and I first heard this from the "Lucifer Curves" author). But it also means that MANY of our twin studies could be confounding "early childhood environment" with "genetics".

Of course, GWAS isn't susceptible to this bias, so when the twin studies agree with GWAS, we should probably accept the overall point.

Jan 24·edited Jan 24

If I understand heritability correctly, 80% heritable means that a 1-SD increase in genetic susceptibility gives you a sqrt(0.8) = 0.89-SD increase in risk, and a 1-SD increase in environmental risk factors gives you a sqrt(0.2) = 0.45-SD increase in risk. 99th percentile is about 2.33 standard deviations. So, for example, if you have +2-SD genetic susceptibility, you would need to be at about the +1.22-SD = 89th percentile in terms of environmental risk factors to be at the 99th percentile for schizophrenia risk.

You would need to be at about +(2.33/0.89)SD = 99.6th percentile of genetic susceptibility to have a 50/50 chance of being at the 99th-percentile for schizophrenia risk. Assuming that the environmental risk is all non-shared, you would still only have 50% concordance for identical twins at the 99.6th percentile of genetic risk.
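
The arithmetic above can be rechecked with Python's standard library. This is only a sketch: it assumes, as the comment does, that total risk is a standard normal sum of a genetic part weighted sqrt(0.8) and an environmental part weighted sqrt(0.2). The variable names are mine. With unrounded weights the results come out slightly below the rounded figures quoted above.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

heritability = 0.8
gene_weight = sqrt(heritability)       # about 0.89
env_weight = sqrt(1 - heritability)    # about 0.45
threshold = std_normal.inv_cdf(0.99)   # 99th percentile of total risk, about 2.33 SD

# Case 1: someone at +2 SD genetic susceptibility.
genetic_z = 2.0
env_z_needed = (threshold - gene_weight * genetic_z) / env_weight
env_percentile = std_normal.cdf(env_z_needed)

# Case 2: genetic susceptibility that puts you exactly at the threshold
# with average environment, i.e. a 50/50 chance of crossing it.
genetic_z_for_even_odds = threshold / gene_weight
genetic_percentile = std_normal.cdf(genetic_z_for_even_odds)

print(f"environment needed: +{env_z_needed:.2f} SD ({env_percentile:.1%})")
print(f"genes for 50/50 odds: +{genetic_z_for_even_odds:.2f} SD ({genetic_percentile:.2%})")
```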

This is really cool, I love these simulation-based sanity checks for unintuitive things in probability theory.

Btw the company Orchid Health is apparently working on polygenic embryo screening for schizophrenia, but people are upset about it and the group whose data they're using, the Psychiatric Genomics Consortium, are trying to rescind permission for their data usage:

"PGC objects to such uses because its goal is to improve the lives of people with mental illness, not stop them from being born"

https://www.science.org/content/article/genetics-group-slams-company-using-its-data-screen-embryos-genomes

This is pretty terrible if you ask me, and I hope it doesn't turn out to be a legal hurdle for Orchid.

If there were single gene polymorphisms with large negative effect, they would get selected out of the population ... eventually. Which suggests that there can't be high-frequency mutations with large negative effect, unless there is some compensating advantage (like, e.g. giving you resistance to malaria).

Which leaves us with multiple mutations, each of which individually has a small effect, adding up to a large total effect. And mutation-selection balance, where random mutations are introducing harmful variants at about the same rate that natural selection is removing them.

I think it's interesting that your example kinda proves the point about the importance of environmental influence on schizophrenia, and with that maybe shows the quite logical mistake people make when they focus on the fact that schizophrenia is 80% genetic. Reasoned naively, it seems that environmental factors would only make a difference in 20% of the cases, but you can see that for the ones with a high enough genetic risk, environmental factors actually contribute 90% to the development of schizophrenia! It's just that if you don't have the genetic risk it's nigh impossible to develop schizophrenia, even in a highly conducive environment, yet if you're one of the 10% of people that has a higher genetic risk the environmental factor becomes highly important.

I would also recommend a recent (2023) Nature Human Behaviour review article on the topic of twin concordance vs. heritability: https://www.nature.com/articles/s41562-023-01609-6

"While genetic effects are evident from the higher concordance in MZ twins than in DZ pairs, concordance in MZ twins does not approach 100%, even for highly heritable traits such as schizophrenia"

> I think you should expect very slightly fewer schizophrenics in the new generation, but the effect size wasn’t noticeable in this small granular simulation - nor, apparently, in Germany.

Since the population of Germany was so much larger than 2000, shouldn't statistical significance be much easier to pick up?

One might also expect that the Nazis generate a significantly worse environment, which would counteract the effect (if any) of the changes in the gene pool.

Also: it is in principle possible for something to be both genetic and caused by a parasite (the genetic variation is improved resistance to parasitic infection).

(Looks at cat infected with T. gondii suspiciously ...)

>A better model would have to take into account that people’s children aren’t clones of themselves, and that children’s environment is correlated with their parents’. But both of these would drive schizophrenia rates up, not down,

I believe the first would drive the rate up (because the variance in genetic schizophrenia factors would be higher), but the second would drive them down (think of the edge case where children's environment is perfectly correlated with their parents, and children are clones of their parents: in that case, schizophrenia/non-schizophrenia is perfectly passed on to children, so if you remove the parent schizophrenics, there won't be any schizophrenic children).

I know nothing about schizophrenia, but I was slightly surprised that the model you use is something like compute "genetic score" + "environment score", and if you are above threshold you develop it.

What I expected to see is something like "genetic score" + "environment score" giving you your probability of developing schizophrenia (which can be fairly low even for the worst scores, since it is not that widespread); then you roll the dice and either get it or not depending on your luck. In this model, even if the twin I share an environment with develops schizophrenia, it means that my baseline chance is likely significantly higher than average, but there is still a decent chance I won't develop it because it is fairly rare.

Your model kind of assumes that some people would be genetically "immune" to developing schizophrenia; do you think that's closer to how it works?

This conversation brings to mind guitarist Peter Green's schizophrenia. We saw him perform in what looked like a bathrobe, with a giant crucifix swinging from his neck. With Danny Kirwan, Jeremy Spencer, etc. in a seedy carpet store on Market Street Bill Graham called Fillmore West.

Later in his music career, Green released ethereal solo recordings that were much calmer than his Fleetwood Mac work. But he got concerned about starving children shown on TV, and felt somewhat guilty about living a pop star's self-absorbed life. ("Couldn't we just make them a sandwich?")

Things got more complicated when he busted up his brother's collection of crystal ware, in an emotional outburst. The family got help for him, and he was diagnosed with schizophrenia. How? The initial story was he was basically kidnapped by a hipster cult of Germans and dosed with some particularly weird acid. That theory was eventually pretty much debunked. So, was Peter schizophrenic when he was madly grinning and playing his guitar? Was channeling a sort of madness into guitar solos therapy? I don't know about his family history, but Green, whose name by birth was Greenbaum, I believe, shortened to sound less semitic, did talk about insults he had suffered growing up Jewish.

But it likely wasn't notorious German LSD or antisemitism that caused his illness. It does cause me to wonder whether his music would have been so rich if he hadn't been a little bit 'crazy'.

To throw fuel on the bonfire, doesn't this also steer *away* from all that embryonic genetic screening business? After all, if you can have 80% schizophrenia genes but not develop it because of environmental factors, how many of the other "eek! genetic risk!" alarms are likely to occur if you let the embryo develop into a pregnancy, be born, and grow up?

This sounds more like a science communication (or science understanding) issue.

I asked a convenience sample at breakfast what "80% genetic" meant to them. Generally the reply was "80% of the cases have it entirely due to genetic reasons with environment playing no part, and 20% with environmental reasons and genetics playing no part". I asked what about ones where environment played some part and genetics played some part. They updated their answer to be 80% genes with no environment, an unknown X% environment with no genes, and 20%-X% mix.

I then laid out the argument and spreadsheets in the article, and asked how they would choose to describe the reality of the spreadsheets, in their own words. The consensus reply was "It's 100% genetic and 100% environmental, but X% of the population is immune regardless of their environment, and Y% of the population is immune regardless of their genetics, and those populations are non-exclusive, and you could construct a Venn diagram to easily and accurately communicate it to the general public."

Then one person pointed out something interesting: 20% environmental is somewhat circumstantial. Since there's no absolute way to define the environment, it's a number that's derived from the types of environment available for the study population, and theoretically a more powerfully negative (or positive) environment could exist or develop in the future that would change the ratio from 80/20. And therefore the 80/20 mix isn't something set in stone but instead set in the social milieu of the study and there should be an expectation that it changes over time or study population.

Does the uniform distribution make sense for the genetic component (or the environmental component for that matter)? Seems like a power-law or normal distribution would make more sense, especially given the low appearance rate in the population. Using those distributions changes the simple model considerably. Another consideration is the relative spread of environment vs. genetics, which has a big impact on how the model shapes out: whether the outliers are on the environmental or the genetic side drives the analysis.

Man, it’s always the damn base-rates! Always.

Even some of the weirdness around high heritability but low variance explained by the SNPs we’ve found may be explained by rare variants (low base-rates) that aren’t found in ordinary GWAS setups. And previous missteps in psychiatric genetics can be explained by not considering the base-rate of how many (underpowered) studies were looking for candidate genes.

Jan 24·edited Jan 24

Nice. I ran the same simulation and got similar results. Then I ran the simulation out to 5 generations of Nazi genetic experiments. For the budding Nazis out there, after 5 generations, I did start to see an effect! G2=19, G3=15, G4=12, G5=9.

So after only five generations of murderous crimes against humanity (~100 years), Eugenics reduced schizophrenia by half! Of course, there's the pesky side effect that many of these genes probably have some positive effect when they don't manifest as disease...

But maybe a return to the atrocities committed in service to the Eugenics movement isn't necessary - even in a world where a disease is 80% genetically driven. I took Scott's same model again (I generated my own threshold based on my top 20 cutoff), and tried two different things. First, I reduced environmental contributions by 20%, and second I created an entirely new population with a baseline 20% reduction in environmental contributions. Both returned 2 schizophrenic people in the overall population of 2000 - a 90% reduction!

My model suggested a 50% schizophrenia decrease with just a 10% reduction in environmental factors, and a 100% decrease with a 30% reduction in environmental factors. In other words, you don't even have to be perfect (or particularly good!) at mitigating environmental factors to have a huge impact. So, even if genetic factors are responsible for the majority of the effect, if environmental factors are easier to control they might deserve significant focus for even heavily genetic-driven diseases.

CAVEAT: Of course this is based on Scott's oversimplified model.
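
For anyone who wants to try the environmental experiment, here is a rough sketch. It is not the spreadsheet itself: it assumes uniform random genetic and environment scores in [0, 1), a total risk of 0.8 × genes + 0.2 × environment, and a cutoff at the top 1% of the baseline population. `count_cases` and its parameters are hypothetical names.

```python
import random

GENE_WEIGHT, ENV_WEIGHT = 0.8, 0.2  # assumed 80/20 weighting


def count_cases(n_people, env_scale, prevalence=0.01, seed=0):
    """Return (baseline cases, cases after scaling every environment score)."""
    rng = random.Random(seed)
    people = [(rng.random(), rng.random()) for _ in range(n_people)]

    # Baseline: the cutoff is whatever total risk the top 1% reach.
    baseline = sorted(
        (GENE_WEIGHT * g + ENV_WEIGHT * e for g, e in people), reverse=True
    )
    n_cases = max(1, round(prevalence * n_people))
    threshold = baseline[n_cases - 1]

    # Intervention: same people, same cutoff, environment scores scaled down.
    after = sum(
        1
        for g, e in people
        if GENE_WEIGHT * g + ENV_WEIGHT * env_scale * e >= threshold
    )
    return n_cases, after


for cut in (0.1, 0.2, 0.3):
    before, after = count_cases(2000, env_scale=1 - cut)
    print(f"{cut:.0%} less environmental risk: {before} -> {after} cases")
```

With only 2000 people the counts are noisy, so changing `seed` or raising `n_people` is worth doing before reading much into any single run.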

I think the implication that Torrey is lying is uncharitable. Maybe he just doesn't know these decidedly unintuitive mathematical results, and Psychiatry Research will issue a correction (or even, if this is too central to the paper, a retraction).

The math is wrong. For the causal paths, you need the square root of the variance, so you need sqrt(.80) = 0.89, and sqrt(.20) = 0.45. With heritability 80% and environment 20%, you get relative strength of 2 to 1, not 4 to 1. Variances are deceptive.

I don't think this will change your simulation results that much though.

Good post, but quick note: 80% heritability means genes are *twice* as important as environments, not four times as important. When predicting, you want to use unstandardized terms and sqrt(0.8) = 0.8944, sqrt(0.2) = 0.4472.

I just thought of a thought experiment for Aftab's point: can someone who knows genetics better than me comment?

Consider drowning. Right now I would predict that drowning is largely environmental (obviously genetic predispositions towards risk-taking that could include risky behaviors around water will affect it).

However, what if there were a mutation that gave people gills? If this mutation became widespread, it would significantly increase the genetic component to drowning risk across the population. To what degree is it correct to say that, in this hypothetical world, an individual drowning is largely caused by their lack of the "gill" gene(s)? How does that differ to the actual world we live in when someone drowns?

Another way to reframe the situation intuitively is that the odds a randomly selected person has schizophrenia are 1%. If having a twin with schizophrenia increases your odds to between 15-50%, it seems like having a schizophrenia gene increases your odds 15-50x of developing schizophrenia vs. the base rate. That's a very strong genetic connection!

This is a great explanation of the subject, very clear and concise.

> E. Fuller Torrey recently published a journal article trying to cast doubt on the commonly-accepted claim that schizophrenia is mostly genetic.

> People really hate the finding that most diseases are substantially (often primarily) genetic. There’s a whole toolbox that people in denial about this use to sow doubt..

I feel like these two lines are very bravery-debate-ish. "The evil anti-science people are using clever tactics to spread misinformation. But we're the brave defenders of science who can see through their ways!"

Also schizophrenia is a disease which is constantly getting filtered out by selective pressure and replenished with mutational load. Killing off people who have it will reduce the amount of schizophrenia genes in the next generation a lot less than the numbers of people who have it because those people would likely not have had many children to begin with, and there's always more mutations in the next generation upping the overall amount.

They are from PSY 5137 – Introduction to Behavioral Genetics – Fall 2020 - by Matt McGue, PhD

Regents Professor, University of Minnesota, Department of Psychology

https://sites.google.com/umn.edu/behavioralgenetics-mcgue/home-topics

Hope this helps!

Truly the worst post I've ever seen from you, Scott. Uncreative self-soothing that misunderstands the existing literature on the genetics of schizophrenia—and, as others have pointed out, bad math!

Except you have just proven the point of the paper: even amongst those with highest genetic risk, a tiny minority develops schizophrenia. Hence the "80% genetic" angle is not just factually-true-but-kinda-unhelpful, it is actively misleading and used by people to discourage and/or disregard research into environmental (i.e. social, political, toxicological) causative factors.

This is emblematic of a wider problem in the field: heritability has no intuitive correlate when it comes to the individual patient on a biological level. The missing heritability problem taught us it does not even have any correlate on the level of DNA, at least at current levels of data availability, and I would risk predicting that this will not meaningfully change with the whole genome sequencing effort people are getting hyped up about today (i.e. the 1 million genomes the EU is trying to scrape together).

Likely, we will have another round of articles lamenting the fact that the expenses have not been worth it. This is an obvious grift by the genetics community to try and stay relevant, when the other main promise they made couldn't be fulfilled: genetics based drug discovery has been a disaster and so the industry funding is drying up, I guess.

I do not understand why people have such enthusiasm for genetic causes of chronic disease - it means we are factually powerless to prevent it. Is this just residual belief in eugenics? If you are actively seeking out a Gattaca-style future of polygenic scores and genetically engineered humans, please be aware of the enormity of the dice you are rolling: more likely than not, you will be amongst the discarded. It takes a lot of delusion to believe you will come out on top, kinda like a modern, more depressing gods-chosen-people belief.

I guess this one is bunk then? https://www.independentsciencenews.org/health/the-great-dna-data-deficit/

It's hard to know what you can believe about science reporting these days without becoming a specialist yourself...

Obesity is also 80% genetic, yet we know from epidemiology that obesity increased substantially for environmental reasons.

This suggests that there are strong gene-environment interactions. But the "X% genetic" model tacitly assumes ZERO gene-environment interactions. Maybe it's not a great model?

(A similar story to obesity happened with the Flynn effect: IQ is like 80% genetic, but IQ scores increased substantially over the second half of the 20th century for what are surely environmental reasons.)

Jan 25·edited Jan 25

Good discussion. It illustrates how people get very confused when they don't work through the numbers but simply make judgements based on the qualitative sound of words like "most." They forget that there are multiple variables at play, and "most" may apply only to some of them.

Worth noting that exactly analogous considerations apply to IQ and some personality traits.

I'm surprised this statistical fallacy required a post to debunk. You can swap out schizophrenia for any other rare condition; imagine someone saying "IQ is heritable? Then why is it that when one twin is >145IQ, the other twin is >145IQ "only" 33% of the time?"

Smart analysis. However, when people say heritability accounts for 80% of the variation in a binary trait, they mean actual cases, not 80% of the variance of a latent variable which causes that trait above a threshold, right? How do they figure this? Through some kind of log odds ratio on the variable corresponding to identical genome?

You learn something new every day. Thank you, Scott!

Sapolsky's lectures are great for developing more intuition here: https://www.youtube.com/watch?v=OareDiaR0hg

It's important to remember that heritability studies are only as powerful as the range of environments available to researchers--which are always very narrow, relative to the range of conceivable environments. The fact that WWII-era German culture was wildly different from post-war Germany is more than enough (for me) to explain the lack of reduced rates of schizophrenia, even with high inter-culture heritability scores. Which people become schizophrenic might change drastically in a deeply religious society vs a far-future technological society vs an isolationist commune--even if _within_ each of those societies schizophrenia is 80% heritable.

My best summary of my intuition here is that there's no such thing as genetic variance that's pure and independent of environmental variance (and vice versa). Heritability measures are _always_ conditioned on a given environmental range.

"Is it wrong to give twins random environmental scores? Don’t twin pairs grow up in very similar environments? Yes"

Is it wrong? The book Innate by Kevin Mitchell changed the way I think about individual development. It looks like most of your ‘environment’ is how your brain happens to wire itself up during fetal development. That’s why identical twins can be so different, not to mention siblings in general.

Jan 25·edited Jan 25

You can model these questions using a liability threshold model.* For example, see the methods section of this article: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8510582/. This model is very similar to the one you've used, but it allows you to define covariation between family members. For example, you can take into account that a child and parent share ~50% of their genome. You can also take into account shared environments.

Using this model to address the first question:

- The twin of a diagnosed schizophrenic has an ~36.5% risk of developing the disease.

- The twins can share up to ~43% of their environmental factors before that risk surpasses 50%.

To answer the second question:

- A child of two parents without schizophrenia has ~0.85% chance of developing the disease.

- This number drops to ~0.76% if you assume that the child shares 50% of their environmental factors with each of their parents.

If you want to play around with these models yourself, I uploaded my code here: https://github.com/tgiardina/Some-Unintuitive-Properties-Of-Polygenic-Disorders

* This is a variation on the model we use at Orchid to estimate the disease risk reduction using embryo screening (https://portal.orchidhealth.com/risk-calculator).
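
For readers who do not want to clone a repository, the first figure can be approximated with a short standard-library sketch. This is not the code from the linked repository. It assumes a plain liability threshold model with 80% heritability, 1% prevalence, identical twins sharing all of their genetic liability, and no shared environment. `mz_twin_risk` is a hypothetical name, and the integration is a simple midpoint rule.

```python
from math import sqrt
from statistics import NormalDist


def mz_twin_risk(heritability=0.8, prevalence=0.01, steps=4000):
    """P(twin affected | co-twin affected); requires 0 < heritability < 1."""
    std = NormalDist()
    threshold = std.inv_cdf(1 - prevalence)  # liability cutoff in SD units
    gene_sd = sqrt(heritability)             # SD of the shared genetic part
    env_sd = sqrt(1 - heritability)          # SD of each twin's own environment

    lo, hi = -8.0, 8.0
    dz = (hi - lo) / steps
    both_affected = 0.0
    for i in range(steps):
        z = lo + (i + 0.5) * dz  # shared genetic liability, in SD units
        # Chance that one twin crosses the threshold given this genetic value.
        p_one = 1 - std.cdf((threshold - gene_sd * z) / env_sd)
        # The twins' environments are independent, so both cross with p_one**2.
        both_affected += std.pdf(z) * p_one ** 2 * dz
    return both_affected / prevalence


print(f"risk for the twin of an affected person: {mz_twin_risk():.1%}")
```

Lowering `heritability` or raising `prevalence` shows how sensitive the concordance figure is to both inputs.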

Jan 25·edited Jan 25

> Making everyone mate is beyond the scope of this discount-rate simulation

If you want to do it in the future:

1. Create a new column filled with random numbers.

2. Sort by that column.

3. Take the top half of the population and put them into a new column, probably on a separate sheet.

4. Take the bottom half and put them in their own column next to the top half on the new sheet.

You now have n/2 random marriages. You can take the average value of each couple, regress it to the mean, and have a set of n/2 predictions of child genetic risk.

Actual children would vary around that predicted mean. You can add a normally distributed term† to make that happen. (For example, if you want to generate multiple children from one couple!) For random mating, I believe the variance of this term should be equal to population variance in the trait (?). But there's a good chance I'm confusing the questions of "how much regression to the mean do we do based on our knowledge of the child's ancestry?" -- ultimately a question about the mean of the children's distribution -- and "how much residual variation is left among the children of two known parents?", a question about its variance.

† I get the sense from your post that your generated numbers are centiles rather than deviations. This will interfere with inheritance models.
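Outside a spreadsheet, the four steps above amount to shuffling the population and pairing off the two halves. Here is a minimal Python sketch, assuming the column holds genetic values expressed as deviations (not centiles). The name `random_mating` is made up for illustration. The residual term uses half the population genetic variance, which is the usual textbook assumption for an additive trait under random mating; treat it as an assumption, not a settled answer to the question raised above.

```python
import random
from statistics import fmean, pvariance


def random_mating(genetic_values, children_per_couple=1, seed=0):
    """Pair people at random and draw children's genetic values."""
    rng = random.Random(seed)
    shuffled = genetic_values[:]
    rng.shuffle(shuffled)  # steps 1-2: sorting by a random column is a shuffle
    half = len(shuffled) // 2
    first, second = shuffled[:half], shuffled[half:2 * half]  # steps 3-4

    pop_var = pvariance(genetic_values, fmean(genetic_values))
    # Assumption: variation among siblings is half the population genetic variance.
    sibling_sd = (pop_var / 2) ** 0.5

    children = []
    for a, b in zip(first, second):
        midparent = (a + b) / 2  # predicted child value
        for _ in range(children_per_couple):
            children.append(midparent + rng.gauss(0, sibling_sd))
    return children
```

Because the inputs here are genetic values, the prediction is just the couple's average. If the column held observed phenotypes instead, the couple's average would first be shrunk toward the population mean by the heritability. With the half-variance residual, the children's overall variance comes out about equal to the parents', which is a useful sanity check.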


This whole post seems like a strong argument that Awais Aftab is correct that "mostly genetic", or even "80% genetic", is a misleading phrase. Even you got slightly confused about exactly what it meant when running your simulation!

I frequently see people on the right say "since we know from twin studies that X is highly heritable, that means environmental interventions are necessarily useless". Which is the exact same mistake you're rebutting here; as you just demonstrated, it's entirely possible for something to be "80% genetic" in this sense and also 100% dependent on environmental factors. (Though not necessarily ones we can change; they could be totally random.)

In colloquial terms, schizophrenia in your toy model is almost 100% dependent on genes *and* 100% dependent on environment; it's nearly impossible to get it without both. Framing it as one or the other, and mostly one, *is* misleading. (Something you've written on before: https://slatestarcodex.com/2013/06/25/nature-is-not-a-slate-its-a-series-of-levers/)

That's not to say we shouldn't do it; I'm not sure there's a better way to describe the situation in precise statistical terms. Just, maybe we need big flashing disclaimers every time reminding people of the above? Or maybe we need a different jargon word from "heritability", one that doesn't give people this impression?


I do think a real criticism of the high-end risk estimates for psychiatric disorders is that heritability is not just the proportion of the disease caused by genes. It carries an essential qualifier: how much of the variation in disease genes account for within a given range of environments.

Twin studies typically are not twins who were relocated to dire poverty or traumatic circumstances: most ended up in fairly middle to upper class circumstances as twin adoptees. When environment and nutrients are right and more standardized, then it makes sense that genes will play a relatively larger role.


The concordance rate (in Figure 2) from the 1970 paper seems higher than 33% and closer maybe to 38% or just somewhat under 40% and I don’t think it accounts for shared environmental effects.

Is concordance the probability that both twins have a phenotype given that either one of them does or is it the probability that both twins have a phenotype given that a specific twin does? I think it is the latter (or at least the 1970 paper you linked uses it that way), but the former makes a lot more sense.

Also, if we assume that schizophrenia genetic and environmental risk are normal and additive, we can derive the following formula for the concordance of twin pairs (using the paper’s definition).

$$p^{-1}\int_{-\infty}^{\Phi^{-1}(p)} \Phi\left(\frac{\Phi^{-1}(p) - h^2x}{\sqrt{1 - (h^2)^2}}\right)\varphi(x)dx$$

Where $p$ is the population prevalence and $\varphi$, $\Phi$, and $\Phi^{-1}$ denote the standard normal PDF, CDF, and inverse CDF respectively. It is not very pretty.

Using a heritability of 0.8 and a population prevalence of 1%, we get a concordance rate of about 38% which is what the paper says (and I checked this in an R simulation and it is true or I have made at least two mistakes).

Also, for the probability that both have it given that either have it, or what I think makes more sense for concordance, we can derive the formula

$$\frac{\int_{-\infty}^{\Phi^{-1}(p)} \Phi\left(\frac{\Phi^{-1}(p) - h^2x}{\sqrt{1 - (h^2)^2}}\right)\varphi(x)dx}{2p- \int_{-\infty}^{\Phi^{-1}(p)} \Phi\left(\frac{\Phi^{-1}(p) - h^2x}{\sqrt{1 - (h^2)^2}}\right)\varphi(x)dx}$$

Which gives a probability of about 23% for the heritability and prevalence specified in the post.

You can look at either formula at e.g. the website quicklatex or in whatever latex processor you have.
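If you would rather evaluate the two formulas than render them, here is a minimal Python sketch that integrates them numerically with the midpoint rule, using only the standard library. The function names are made up for illustration. Cutting the integral off at -10 standard deviations and the choice of step count are assumptions of the sketch.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal: pdf, cdf, inv_cdf


def joint_prob(h2=0.8, p=0.01, steps=4000, lower=-10.0):
    """P(both twins affected): the integral shared by both formulas."""
    t = STD.inv_cdf(p)  # lower-tail threshold, as in the formulas
    scale = (1 - h2 ** 2) ** 0.5
    width = (t - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width  # midpoint of the i-th slice
        total += STD.cdf((t - h2 * x) / scale) * STD.pdf(x) * width
    return total


def concordance_given_specific(h2=0.8, p=0.01):
    """P(both affected | a specific twin is affected): first formula."""
    return joint_prob(h2, p) / p


def concordance_given_either(h2=0.8, p=0.01):
    """P(both affected | at least one is affected): second formula."""
    both = joint_prob(h2, p)
    return both / (2 * p - both)


if __name__ == "__main__":
    print(f"given a specific twin: {concordance_given_specific():.3f}")
    print(f"given either twin:     {concordance_given_either():.3f}")
```

The two definitions are tied together: if the first concordance is c, the second is c / (2 - c), so the second number is always the smaller one.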

Also, the second argument is bad because schizophrenia's low incidence makes it immediately clear, e.g. from the breeder's equation, that culling schizophrenics before reproduction would not change the frequency of schizophrenia very quickly, regardless of the heritability. I do not understand how the author did not see this.
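To put a rough number on the breeder's equation point, here is a minimal Python sketch. It assumes a normally distributed liability with variance 1, that everyone above the threshold is removed before reproducing, that mating is random, and that the liability variance is unchanged in the next generation. The function name is made up for illustration.

```python
from statistics import NormalDist


def prevalence_after_culling(h2=0.8, prevalence=0.01):
    """One generation of removing all affected people, via R = h^2 * S."""
    std = NormalDist()
    t = std.inv_cdf(1 - prevalence)  # liability threshold
    # Selection differential S: mean liability of the people left below t.
    selection_diff = -std.pdf(t) / (1 - prevalence)
    # Breeder's equation: response R = h^2 * S.
    response = h2 * selection_diff
    # Children's liabilities are shifted by R; variance assumed unchanged.
    return 1 - std.cdf(t - response)


if __name__ == "__main__":
    print(f"prevalence in the next generation: {prevalence_after_culling():.4%}")
```

With 80% heritability and 1% prevalence, the selection differential is only a few hundredths of a standard deviation, because 99% of the population is untouched. Prevalence in the next generation stays a little under 0.95% under these assumptions.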


Children born of older fathers are more likely to be schizophrenic (presumably because the fathers' sperm has accumulated more mutations as time has passed and something like 50% of genes are expressed in the brain). Do you know how much more likely? If the answer is a lot, i.e., if a large proportion of schizophrenics were sired by older fathers, then there is an additional reason why the Nazis' policy had little effect on the next generation, or at least one can tell a plausible story about that. The proportion of such offspring was little changed before and after WW2.


The fact that people have more than one parent does make the case significantly stronger against the Germany thing. I ran a simulation in which a population of 5000 with 1000 genes each was tested for some 100% genetic trait, the top 1% for the trait were eliminated, and the survivors then produced children. 28 of the children (so about 0.6%) were still above the original threshold. Repeating the experiment a few times gave similar results (32 and 30 schizophrenic children the next two times). (I didn't bother working out exactly what the threshold should be on average for the top percentile, and just used the same threshold that caught 1% of the first sample every time, but that doesn't seem to have been a huge issue.) If I eliminate the top 1%, run 2 generations and test the grandchildren, the rate returns even closer to the original rate, around 0.9%. If this effect is combined with the fact that real schizophrenia is only 80% heritable, not 100%, it seems quite likely that the effect would be unmeasurably small.
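For anyone who wants to try something similar, here is a minimal Python sketch of one way to set this up. It is not the code behind the numbers above. It assumes each person carries one copy of each of 1000 binary genes, that the trait is the count of risk alleles, and that each child takes every gene from one of two randomly chosen surviving parents. The name `cull_and_breed` is made up for illustration.

```python
import random


def cull_and_breed(n_people=5000, n_genes=1000, cull_fraction=0.01, seed=0):
    """Remove roughly the top 1% by risk score, breed survivors, recount."""
    rng = random.Random(seed)
    all_ones = (1 << n_genes) - 1

    def score(genome):
        return bin(genome).count("1")  # number of risk alleles

    # Each genome is an integer whose bits are the person's genes.
    parents = [rng.getrandbits(n_genes) for _ in range(n_people)]
    ranked = sorted((score(g) for g in parents), reverse=True)
    n_cull = max(1, int(n_people * cull_fraction))
    threshold = ranked[n_cull - 1]  # score of the last person in the top slice

    # Ties at the threshold are culled too, so slightly more than 1% may go.
    survivors = [g for g in parents if score(g) < threshold]

    children = []
    for _ in range(n_people):
        a, b = rng.sample(survivors, 2)
        mask = rng.getrandbits(n_genes)  # bit 1: gene from a, bit 0: gene from b
        children.append((a & mask) | (b & (all_ones ^ mask)))

    rate_before = sum(score(g) >= threshold for g in parents) / n_people
    rate_after = sum(score(g) >= threshold for g in children) / n_people
    return rate_before, rate_after


if __name__ == "__main__":
    before, after = cull_and_breed()
    print(f"above threshold before: {before:.2%}, among children: {after:.2%}")
```

Single runs are noisy, since only a few dozen children land above the threshold, so it is worth averaging over several seeds before reading much into the exact rate.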

Jan 26·edited Jan 26

Maybe "genetic" should mean two concurrently related things: first, that it is heritable. Second, that it is probabilistic. The first meaning that its presence in the gene pool is a certitude. The second meaning that its expression is conditional. Conditional on what? On a number of intricate factors and processes which are still genetically mediated. We know nothing about these intricate processes and factors hence our confusion. And it makes sense that a phenomenon produced by intricate processes and factors we don't understand would confound our understanding of its phenotypical expression.


"P(A|B) = [P(A)*P(B|A)]/P(B), all the rest is commentary."

That's all you need to explain the raised but still low prevalence in the twins of those affected by a rare condition.


I could be wrong here, but schizophrenia starts in the late teens/early 20s for men (20s for women usually), and there is a prodrome phase of depression and isolation, etc. before that, too.

So ofc the incidence of schizophrenia isn't going to change much by elimination of schizophrenics in a population - they aren't the ones doing much reproducing of those genes in the first place.

Jan 30·edited Jan 30

The gene naming objection does not survive even cursory thinking. You can tell that Huntington's disease is inherited from the biological parents without having any concept of what genes or DNA are. All you need is correlations. And these days, nobody disputes that height is largely genetic, despite us having very little understanding of the genes involved or their role.

Jan 30·edited Jan 30

Scott, what happens to this data if you expand "schizophrenia" into "thought disorders and bipolar processes?"

It seems to me that inter-rater reliability is part of the problem. In my forensic work, I have seen many of the same patients over and over, and they have also been seen by other forensic psychologists (it's a rural state, so the 5 of us who actually do this see the same repeat offenders a lot). Oftentimes our Dx line up perfectly, but frequently one of us will Dx the patient with one thing, and another, six months later, with something else. Both Dx are "close enough" to capture the disordered thought process that results in their being diverted away from the justice system and into the state hospital system. In fact, we work together so well that we often consult with each other to reconcile the Dx variance before it goes to court.

What is the probability that, say, an adult Dx with Major Depressive Disorder, recurrent, severe with psychotic features has a child who is later Dx with Schizophrenia?


Yep, took the Bloody Code and its brethren half a millennium
