While I agree with your general suspicion of village wisdom in the final comment, it really is true that "nature" was the dominant view in western countries in the 19th century (and early 20th), despite the lack of knowledge of genetics: basically everyone thought ancestry determined everything and upbringing had no effect.
"But interracial children are just as healthy as within-race children. In fact, even interspecies hybrids like mules are pretty healthy (their inability to breed comes from an unrelated chromosome issue)."
You're forgetting Mendel. We have paired chromosomes so the first generation cross has all the proteins of both parents (hybrid vigour). It's the next generation where you'd expect problems on that model and then only if you interbred the first generation offspring.
I agree that you don't necessarily see the outbreeding depression in F1. But I think generally hybrid vigour is more about getting rid of inbreeding depression, not about improvements for already internally diverse strains.
Also somewhat relevant here, there was definitely a lot of outbreeding depression when our ancestors had kids with some neanderthals. Or at least strong purifying selection, I'd assume those are related, although not sure about F1.
"But I think generally hybrid vigour is more about getting rid of inbreeding depression, not about improvements for already internally diverse strains."
I think the history of plant breeding suggests otherwise, though you often have to double the chromosome numbers to restore fertility, and that could be more important. But if that were so, then why bother to make the hybrid in the first place?
Plants are more robust to inbreeding than animals, because they self-fertilize. But even in plants, hybrid vigor is the exception, not the rule. We all know about the mule because it is the exception, but it is valuable to know the few exceptions.
The mule isn't exactly an exception. It has characteristics that are useful to humans. It's also sterile; it has no characteristics that are useful to itself.
I should also note that when I mentioned mules to a class of Chinese high schoolers, they were shocked at the concept. So I'm not sure it's true that "we all know about the mule".
The way I heard it was that "hybrid vigor" was a selection effect. Hybrids have a much wider range of "suitabilities", and the low ends tend to be selected away, so what we tend to notice are the ones that are "more fit".
I'm pretty sure there's a lot of evidence against the idea that on average mixed-race children are just as healthy. Psychologically, and with substance abuse, they seem less healthy. Also, aren't they less fertile on average? Perhaps none of this is from genetics, but I don't think the data backs "just as healthy" in adult samples.
But how much of that is caused by society's attitude and their feeling of not fitting in, especially if one or both parents are from a non-WEIRD background?
Parents who interracially marry will tend to be weird in the non-acronym sense, since there's still social stratification. My understanding is that nowadays something like half of Jewish marriages are to gentiles, but generations ago such people would have been much more unrepresentative.
This is vastly silly. Or are Latin Americans (say) WEIRD? Is the Latin American working class WEIRDer than its bourgeoisie? Are Filipinos WEIRD?
> I'm pretty sure there's a lot of evidence against the idea that on average mixed-race children are just as healthy. Psychologically, and with substance abuse, they seem less healthy. Also aren't they less fertile on average?
Evidence (other than "feels", which I don't get; this is the first time I'm hearing any of this)?
WEIRD is taken in the sense of "The Weirdest People in the World: How the West Became Psychologically Peculiar and Particularly Prosperous" by Joseph Henrich.
"Perhaps you are WEIRD: raised in a society that is Western, Educated, Industrialized, Rich, and Democratic. If so, you’re rather psychologically peculiar." - but if you read the book it's not just raised in such a society: it takes a few generations.
That could be part of it. It's never been proven it's all of it, but even if it were, it is still a risk, even if it's environmentally mediated in nature. It's not like parents know how to 100% get rid of that risk for their kids. Of course I'm not saying ban interracial marriage, but there are risks. Domestic violence and divorce are other examples of risks that can be higher in interracial couples. They are higher in lesbian couples too.
There's reasonable evidence that Cro-Magnon/Neanderthal crosses had problems. The evidence is vague enough that we don't know what the problems were, but we do know that some survived and bred successfully. (One of the fossils analyzed is said to be a second-generation Cro-Magnon/Neanderthal cross.) Cro-Magnons lived in larger groups, so it's probable that the reason they are our dominant ancestor is a higher rate of reproduction. But, of course, this also let technological changes spread faster among them. I'd guess that the same was true of Denisovans, but there the evidence is so scant that it's just a wild guess.
I feel like the presumption of schizophrenia being genetic is quite questionable. I would like to see a self-contained argument in favor of that, rather than dissociated evidence-fragments.
I'm with Prof. Gerdes on this one. With the rather cringey caveat that neural networks can do a decent job of accounting for nonlinear effects without getting bogged down in the impossibility of getting any sort of statistical significance which other approaches have. That has other problems, like different training runs potentially getting very different answers to the same question, but it should be looked into.
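To make that concrete, here is a minimal, entirely hypothetical sketch of the kind of comparison being suggested: simulate genotypes with one made-up pairwise interaction, then compare a plain linear model against a small scikit-learn neural network on held-out data. It illustrates the idea, not anyone's actual pipeline.

```python
# Hypothetical sketch: can a small neural net pick up a nonlinear (GxG) effect
# that a purely additive model misses? Simulated data only.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n, p = 5000, 50
G = rng.binomial(2, 0.3, size=(n, p)).astype(float)   # 0/1/2 genotypes
# Small additive effects plus one invented pairwise interaction and noise
y = G @ rng.normal(0, 0.1, p) + 0.8 * G[:, 0] * G[:, 1] + rng.normal(0, 1, n)

Gtr, Gte, ytr, yte = train_test_split(G, y, random_state=0)
lin = LinearRegression().fit(Gtr, ytr)
mlp = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000,
                   random_state=0).fit(Gtr, ytr)
print("linear R2:", r2_score(yte, lin.predict(Gte)))
print("MLP R2:   ", r2_score(yte, mlp.predict(Gte)))
```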
The GxE case that occurred most to me during the original post was educational attainment based on IQ genes and exposure to opportunities for drug use. You mentioned previously (I believe) that high IQ is associated with more risk taking/drug experimentation. If we take IQ as genetic, then it may be a plus for educational attainment in some environments and a minus for educational attainment (or at least a highly attenuated plus) in others. This seems likely to confound twin studies. I would expect there to be many other similar situations.
[edit: also, the idea that there is little evidence of GxG interactions seems unsurprising even if there are lots of GxG interactions. Given the number of genes, the number of gene combinations is going to be astronomical (moves in a game of chess >> number of particles in the universe). This might cause detecting GxG interactions to require unbelievably powerful (large sample size) statistical tests.]
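As a rough, back-of-the-envelope illustration of that combinatorial point (the variant count below is a placeholder, not a real panel size):

```python
# Rough arithmetic: pairwise interaction tests explode, so per-test
# significance thresholds become brutal. Placeholder numbers.
from math import comb

n_variants = 1_000_000                 # stand-in for a GWAS variant panel
pairwise_tests = comb(n_variants, 2)   # ~5e11 two-way combinations
alpha = 0.05
bonferroni_threshold = alpha / pairwise_tests

print(f"pairwise tests: {pairwise_tests:.2e}")
print(f"per-test p-value threshold: {bonferroni_threshold:.1e}")
# Three-way and higher-order interactions grow even faster, which is the
# sense in which the number of combinations becomes astronomical.
```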
A really good example where we might be able to get some data would be people with a genetic susceptibility to alcoholism growing up in Mormon vs non-Mormon homes.
I think the praise of long-read sequencing is partially true but overdone. First, short-read and microarrays are being conflated in Andy's post and they're pretty different. Most GWAS prior to the last few years was done with microarrays, which can only identify pre-determined variants (but not necessarily just SNPs). So you never see anything that you don't know ahead of time to look for and so it's basically just SNPs (which are common literally by definition). Whole genome sequencing ala Illumina short-read is not limited to just predetermined variants and so can detect, for example, de novo mutations that only one person in the whole world has. Common SNPs are just the tip of the iceberg. More accurate would be to say that short-read is extremely good at single nucleotide variants (of any rarity) but it can also detect differences in copy number, large deletions, and many other variants.
But long-read really does have advantages (and is even close to price-competitive these days). There are regions of the genome that are basically impossible to sequence accurately in short-read (see the recent "telomere-to-telomere" genomes) and these are recently being shown to have a surprising amount of variation. Long read is also better at determining structural variation, which again, there appears to be more of than was thought a decade ago. However, short read will detect many structural variations as well but won't be as good at telling you exactly what the variation is. And this information is often just not used in short-read analyses even if it could theoretically be used.
Statistical geneticists are absolutely aware of the limitations, though right now the general focus of the field has been the recent increase in whole genome (short-read) data, which means the shortcomings of microarray data are being overcome. Namely, rare variants are now much easier to assess. Maybe in another ~10 years we'll have large datasets of long-read data and will be all excited about that, but it's not available yet.
I'm also not sure what the example about a splice variant means. Splicing happens post-transcriptionally in RNA when introns are removed and so does not appear in genome sequencing. Variants that affect splicing could look like practically anything in the genome and so there's no clear advantage of long-read for determining those and even if you had long-read it wouldn't be trivial to determine that the variant you detect is actually a splice variant. Moreover, splice variants are ubiquitous. Probably Andy meant a structural variant instead? Or maybe they meant long-read RNA-seq, which is very cool tech but not very relevant to the discussion?
Do you have any source on structural variation being common (and therefore probably not that harmful)? I would have assumed if you have any large differences in copy number, large deletions etc. we don't need a genetic test to notice those.
I only said that they're more common than previously thought and I definitely don't know whether individual structural variants are common in the sense that you can infer they're benign. I'm thinking of things like this: https://www.nature.com/articles/s41586-024-07278-3 There's substantial large-scale variation in centromeres between individuals (the first author's lab has more recent work that they've presented but doesn't appear to be published yet that shows this in more than just the two samples used in the publication I'm citing). Moreover, that's structural variation in centromeres which has generally been thought of as "junk" DNA - structural variation elsewhere could be much more impactful.
The other thing I'm thinking of is pan-genome type approaches that have revealed that different genetic ancestry groups vary (compared to each other or the 'reference' genome) by large structural changes and not just small SNVs or indels. https://www.nature.com/articles/s41586-023-05896-x But these are again not the kind of individual-individual differences that we care about for heritability.
Thank you for this reply! I appreciate the notes and perspective.
Some thoughts:
- “[short-read] can also detect differences in copy number, large deletions, and many other variants.” Yes in principle, but with way more nuance than I think is generally appreciated. The variant calling pipelines are trying to make sense of very challenging data. I know of one used by a clinical provider that (among lots of other steps) does mapping with 2 different tools, a bunch of cleaning up of the alignments, variant calling with 3 different tools, filtering and then a final call set based on consensus of the various variant callers (a toy sketch of that consensus step appears after this list). A lot of this process is to deal with the fact that the incoming data is just hard to make sense of without a large amount of information to use as a scaffold (e.g. a reference), and then anything too divergent from the reference starts looking suspect. Some is definitely a need to make sure that the thing seen in the data isn't an artefact - PCR and library preparation can make weird things happen to pieces of DNA. But I think there is definitely signal present that doesn't pass filters as it's hard to disentangle from noise with short-read data, which just isn't true to the same extent for long-read data.
- “And this information is often just not used in short-read analyses even if it could theoretically be used.” This is I think where I struggle. For me, having that information feels like it might be helpful to be able to specify not just whether a region is variable, but what type of variant is present. But this is coming from my very specific non-model organisms background where genotype-phenotype is waaaay easier. But it’s a good point that at the level of multiple samples it probably isn’t impactful.
- “large deletions” - this is a fun bias in short-read variant calling where it’s way easier to call a large deletion using short-reads than a large insertion.
- “Statistical geneticists are absolutely aware of the limitations” - this is re-assuring, thank you!
- “large datasets of long-read data” - would appreciate your thoughts - how much more useful would you think a dataset of the type like TOPMED would be if it was long-read vs short-read? Or if UK BioBank was re-done with long-reads? And cost-effectiveness-wise - does it make any sense to do short-read based population studies now that sequencing costs per sample are almost even between short and long-reads (at least for PacBio)?
- “example about a splice variant means”, sorry, yes this was not helpful in that comment. I was mainly trying to highlight how hard short-read data is to work with, and yes the comparison was short-read RNA-Seq vs long-read RNA-Seq for reconstructing transcripts/splice patterns. Transcriptome analysis with long-read data continues to show large numbers (10x) of variants that aren’t able to be captured from short-reads. Thank you for your kind clarification for this in your reply!
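Returning to the multi-caller consensus step mentioned in the first bullet above: a toy sketch of the idea, with invented variant records and a made-up `min_callers` threshold rather than any real pipeline's logic (real pipelines operate on VCFs with dedicated tools).

```python
# Toy sketch of consensus filtering across variant callers.
# Each caller's output is reduced to a set of (chrom, pos, ref, alt) keys;
# a call is kept only if at least `min_callers` callers agree.
from collections import Counter

def consensus_calls(callsets, min_callers=2):
    counts = Counter()
    for calls in callsets:            # callsets: list of sets of variant keys
        counts.update(calls)
    return {v for v, c in counts.items() if c >= min_callers}

# Hypothetical calls from three callers (made-up coordinates):
caller_a = {("chr1", 12345, "A", "T"), ("chr2", 500, "G", "GAA")}
caller_b = {("chr1", 12345, "A", "T")}
caller_c = {("chr1", 12345, "A", "T"), ("chr3", 900, "C", "A")}

print(consensus_calls([caller_a, caller_b, caller_c], min_callers=2))
# Only calls most callers agree on survive; divergent or hard-to-align signal
# (the kind discussed above) tends to get filtered out along the way.
```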
So, apart from the gross effects of simple Mendelian genes (most of which appear to be loss-of-protein-function point mutations) genetics is a chaotic system. Still deterministic, but no easier to model than turbulence.
"Except perhaps in the most extreme cases, wealth has to be actively maintained and fortified by each successive generation, or even very wealthy families fall off the map within a few generations."
Occasionally, when he will be meeting people that he truly knows nothing about, he will ask me to look into "who they are" (e.g. where their money is from).
These would be people who are very comfortable but not on any 400 list.
My research will sometimes be simple: turning up the name of a familiar product, say.
I recall once there was a gentleman who, upon googling, I found was a descendant of ... I want to say, "Famous Industrialist's right-hand man" or such.
Not Famous Industrialist himself, just a close associate.
These folks had an enormous fund with their own full-time investment managers, with a beautiful website - hundreds of people benefitting to this day.
I guess you can attribute that all to the founder of the fortune, but it seems to me that the heirs must have been smart - if only smart enough to realize it would be folly to try to cheat one another of it, or take it out of the hands of its managers - about their money, even if it was now distributed to so many. They are not, however, particularly on "the map" as far as I know.
I wonder how the descendants of largely 'self-made' fortunes fare compared to those of large lotto winners. Maybe there aren't enough of the latter (or enough time since we've had 9-figure lotto winners).
I think the Vanderbilts still have some presence in the Forbes 400. (Anderson Cooper is a Vanderbilt but not in that list.) One of the very few families who managed to keep it going for 3+ generations.
I'm impressed with the ones who figured out how to take Biltmore, an unfinished folly that would otherwise be a decaying white elephant, and turn it into what seems to be a thriving business.
Gloria Vanderbilt and Anderson Cooper did new work rather than maintaining existing businesses, and they didn't even provide capital for their new work. Maybe they inherited connections.
It strikes me that the best summation is that clever people try and spend a lot of time and effort nitpicking and attacking all the studies showing high heritability for intelligence, but they fail; almost all the studies hold up, and the result is clear evidence for high heritability.
The idea that evolution is guaranteed to weed out nonlinear genetic combinations seems absurd. It is just too easy to conceive of ones that must exist. Height and metabolic rate would multiplicatively contribute to caloric requirements. Some neurotransmitter/neuron interactions must be multiplicative. I design ICs. We find that our distributions of key parameters (which are based on hundreds or thousands of physical parameters, mostly unmeasured) are seemingly normal distributions (additive). Nonetheless, it is clear that numerous interactions are, in fact, multiplicative. They are just outweighed by many other additive factors. So we know that nonlinearities don't have to tilt the scale, and we know that a few do exist that can have real measurable impact (caloric requirements).
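A quick simulation of that observation, with arbitrary made-up parameters: sum many small additive factors, add a couple of genuinely multiplicative interaction terms, and check that the aggregate still looks approximately normal.

```python
# Many additive factors plus a few multiplicative interactions: the aggregate
# distribution still looks roughly normal, because the interaction terms are
# swamped by the additive ones. All parameters are arbitrary.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_samples, n_factors = 100_000, 200
X = rng.normal(0, 1, size=(n_samples, n_factors))

additive = X.sum(axis=1)
interactions = 2.0 * X[:, 0] * X[:, 1] + 1.5 * X[:, 2] * X[:, 3]
total = additive + interactions

print("skew:", round(stats.skew(total), 3),
      "excess kurtosis:", round(stats.kurtosis(total), 3))
# Both stay near zero, i.e. the distribution is hard to distinguish from a
# purely additive (normal) one even though real multiplicative terms exist.
```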
> (I do think villagers had one other advantage lots of moderns didn’t, which was that they closely observed animal breeding. But that’s not common sense - that’s access to important scientific data!)
Despite this advantage, it still took them millennia to understand heredity. For the longest time, people had similarly hilarious misconceptions about animal breeding as about human conception. See https://gwern.net/review/bakewell#the-invention-of-heritability
As someone without any background in genetics beyond high school biology, who read the original piece, skimmed this one, and has gone through several similar Substack pieces discussing heritability, I think I've determined I have no idea whether I believe pro- or anti-hereditarians are closer to the truth (though I do agree that elite liberals have hewed far too closely to a naive anti-hereditarian line in most ways, which isn't exactly a revelation).
It seems like there are just methodological disagreements upon methodological disagreements, and there's no way of piecing things out without getting something close to a Master's in the field.
It's frustrating to me (in a very self-centered and myopic way) reading debates where neither side is making a baseline error that allows me to explain away most of their arguments, lol. This is why I tend to stick closer to pure theory in my pleasure reading. I guess I'll just take the epistemically cautious option and half-heartedly adopt a position somewhere in the middle, while keeping an open mind.
This whole discourse has had the benefit of reminding me, though, that when I have kids, so long as my approach to raising them is well thought-out and executed, there should be a very good chance that they turn out similar to me and my partner, which I guess feels nice lol.
The jargon is far beyond what I can understand, which is disappointing since it is being mediated through a very smart layman good at explaining things, or at least his conception of them. If, reading one of his many paragraphs on this subject, I feel I "get" the gist, I abandon the paragraph on the assumption that that is as good as it gets and I don't want to get muddled.
However, the baseline error, or rather confusion, I see here is whether people are *all* talking about heritability being a matter of one or two generations (which doesn't really make sense to me as a definition), and whether the anti-hereditarians are denying that all the information that *can be* related to the value of a trait is in the genome, however it gets expressed.
"I do think villagers had one other advantage lots of moderns didn’t, which was that they closely observed animal breeding. But that’s not common sense - that’s access to important scientific data!"
A certain Dr Costin "BAP" Alamariu wrote his doctoral dissertation on this very matter...
LLMs (to use the new hotness) seem like a much better deterministic (temp=0) program comparison to the genome than, say, the Linux kernel. Like the genome, they are evolved systems that exhibit complex behaviors. Weights are like base pairs in the analogy, and just like the genome, the systems at a high level are quite robust to some shuffling and tweaking of weights (lots of regularization techniques (e.g. dropout etc.) map to learning to learn, in the way organisms evolve to evolve, and are a bit like recombination etc. in that respect).
I'd be really curious what you'd get if you took a bunch of differentially fine-tuned descendants of a large base model and did genome-wide-association-style studies on the weights, looking at their different behavior. I bet you wouldn't find much, even though it's 100% weight driven, and at the same time I bet recombination between the models would work quite well and somewhat intuitively in its outcome in terms of behavioral results. We largely know why too: it's that we're only looking at weight interactions in the compressed latent space, and if we were able to see into the much higher dimensional space then it would look a lot more like linear combinations. Someone should run the experiment if it hasn't been done already.
I think the genome must also be working in some higher dimensional space; it's just not big enough, not enough genes, to be anything else. The intuition pump for me here isn't even humans, it's organisms with complex instinctive behaviors (e.g. things like hunting behaviors of snakes or migrating behaviors in butterflies): just think mechanistically about how a genome can lead to such behaviors, and in an evolvable way where a small mutation or recombination can somehow land 'near' the existing behavior, so they can evolve with a changing environment. We need something like sparse autoencoders for the genome.
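A rough sketch of what the experiment proposed above could look like. Everything here is hypothetical: the flattened weight matrix, the scalar "behavior" score, and the choice of a simple per-weight regression as the association test.

```python
# Hypothetical "weight-wide association study": given K fine-tuned descendants
# of one base model, flatten their weights and test each weight's association
# with a scalar behavioral score, GWAS-style (one test per weight).
import numpy as np
from scipy import stats

def weight_association(weight_matrix, behavior_scores):
    """weight_matrix: (n_models, n_weights); behavior_scores: (n_models,)."""
    n_models, n_weights = weight_matrix.shape
    pvals = np.empty(n_weights)
    for j in range(n_weights):
        # Per-"locus" test, analogous to the per-SNP regression in a GWAS.
        _, _, _, p, _ = stats.linregress(weight_matrix[:, j], behavior_scores)
        pvals[j] = p
    return pvals

# Stand-in data: 40 fine-tuned "descendants", 10k weights, random scores.
rng = np.random.default_rng(0)
W = rng.normal(size=(40, 10_000))
scores = rng.normal(size=40)
pvals = weight_association(W, scores)
print("hits at 5e-8:", int((pvals < 5e-8).sum()))
```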
> you can’t, even in principle, peer into a cell with a microscope, and tell whether the two genes are “interacting” or not.
Actually you totally can! For example, a good molecular biologist could tell if a mutation in one gene causes its protein to not bind to the promoter of another gene.
Your comment is entirely missing the point, in a way that makes me think you either didn’t read or didn’t understand the rest of what I wrote.
…But thanks for the great illustration of my claim that “many people in the field have a tendency to think about those topics in an overly narrow way”. :-P
I think I got your point re GxG and would be interested in your thoughts on the following (apologies if you addressed this in your full piece, I have not yet read it in its entirety): the pathway you describe goes from genes to "traits" to "outcomes", where the paradigmatic trait is a purely "organic" property such as height or blood pressure and the paradigmatic outcome is a highly socially constructed property such as being divorced or being diagnosed with antisocial personality disorder (hence interactions causing the latter kind of property are not something you can observe in a cell).
My question is: do you think GxG on outcomes can reasonably be called heritability? The outcomes you describe are highly culturally dependent. E.g. diagnosis rates for mental disorders and divorce rates change over time even in the same culture. Surely this kind of "heritability" is massively environmentally modulated? I.e. the underlying traits might interact towards a propensity for the given outcome in one environment (high societal divorce rate) but not the other (low societal divorce rate). Sasha Gusev opined in the comments to the original post that GxG could reasonably be described as heritability but GxE not. This seems plausible for GxG on traits but it seems to me that GxG on outcomes would belong in the same bucket re heritability as GxE?
It just does not square with an ordinary language understanding of heritability that heritable properties would include things that are passed down for some generations (e.g. a propensity for sports injuries) but then simply stop being heritable when the culture shifts (e.g. VR gaming becomes the prime example of thrill-seeking behavior).
I think you have the wrong idea when you talk about “can reasonably be called heritability” or an “ordinary language understanding of heritability”. Heritability is a scientific term with a precise and universally-accepted definition. It means what it means.
I think what you’re getting at is: many people are confused about heritability—they have mistaken ideas about what you can and cannot infer when you learn that a Measurement X has high or low heritability in Population Y. …And that’s absolutely true! Lots of people are confused that way, and we should all work together to set them straight.
In the case at hand, if Population A is 50% American and 50% Yanomami, and Population B is 100% American, then the heritability of pretty much anything will be much lower in Population A than Population B. Indeed, that applies not just to mental health, personality, and behavior, but also to things like height and blood pressure.
> Consider that less than a century after his death, there are no longer any Rockefellers on the Forbes 400 list.
Depending on the time window, Forbes 400 turnover is roughly 5% a year. A Cato study showed that in the span from 1982–2014 (32 years), 71% who were on the list in 1982 (including heirs) had left the ranks. So it's not surprising that Rockefellers or other members of 19th-century and early 20th-century families are no longer on the list. However, the heirs to the Rockefeller fortune have a bunch of family trusts that keep (most of) the heirs wealthy, but not super wealthy.
Is parental favoritism accounted for in twin studies? Kin selection would predict that parental investment (maybe only for fathers) is mediated by indicators of genetic similarity, e.g. facial similarity between parent and child. If something like that happens, the EEA would be less true for fraternal twins, and non-randomly: a more intelligent parent may invest more in the fraternal twin who is more genetically similar to them, and that investment could plausibly increase IQ in that twin.
I think considering the genome as akin to computer code cannot be true, or even roughly true: normal computer code is extremely fragile, so any mutation would be either neutral (no effect) or very detrimental; there would be no chance of a single mutation being slightly positive or slightly negative, which is a must-have for progressive selection sorting through single mutations. But it can't really be a bunch of completely decorrelated variants either, each having a small (positive or negative) delta effect on fitness that you can just add up (the linear model); this makes no sense in a complex organism. So there should be correlations, probably complex ones, but this does not necessarily rule out selection; it just makes it slower and promotes stagnation, until environmental change is enough to make one particular change so positive that it exceeds all the negative cross-correlations. After fixation of this change, it triggers a cascading effect, with the correlated genes being optimized to cancel/reverse the negative cross-correlation effects of the initial change. This sounds a lot like punctuated equilibrium.
This is also what you get with food recipes or computer code tuned for optimisation through genetic algos: you have a lot of parameters with complex cross-correlations, else nothing interesting comes out of it, but also a way to change them one at a time mostly without complete failure (else the mutants will tell you nothing about how quality can be improved). I suspect the genome is largely like that, with some easy parts acting as boring uncorrelated traits and maybe some critical parts where changing anything leads to complete failure (those parts will not evolve, at least until the rest of the code makes them non-critical). A toy sketch of this one-change-at-a-time search appears below.
Somebody in the initial post also mentioned that selection acts mainly as a pushback against the accumulation of mutation load (the accumulated slight detrimental effects of many mutations, each having a very small negative effect individually; they accumulate because, their effects being small, their removal through natural selection is very slow). It means global fitness is quite far from optimal and affected a lot by general mutation load. And polygenic traits are often largely affected by a global mutation load rather than by many specific genes. This would explain a significant correlation between different (all?) polygenic traits, and maybe the missing heritability, because measuring mutation load is difficult (simply counting SNPs, even the rare or unique ones, may not be enough because of the cross-correlations), and, except for twins, the load with its cross-correlation effects may be more variable than the classic shared-genes count predicts: the load of siblings is not necessarily the average of the parents' loads. I think this is roughly what Greg Cochran has been saying for quite a few years or decades...
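A toy version of the one-mutation-at-a-time search described above, with an invented fitness function whose terms interact; it is only meant to illustrate the "small changes land near the existing behavior" point, not to model a genome.

```python
# Toy hill-climbing "genetic algo" that mutates one parameter at a time.
# The invented fitness function has interacting (cross-correlated) terms,
# but small mutations still land near the current behavior, so most are
# mildly good or mildly bad rather than catastrophic.
import random

def fitness(p):
    # Made-up objective with an interaction term between p[0] and p[2].
    return -(p[0] - 1) ** 2 - (p[1] + 0.5) ** 2 - 0.8 * p[0] * p[2] - p[2] ** 2

def evolve(n_params=5, steps=2000, step_size=0.05):
    params = [0.0] * n_params
    best = fitness(params)
    for _ in range(steps):
        i = random.randrange(n_params)              # pick one "gene"
        candidate = params[:]
        candidate[i] += random.gauss(0, step_size)  # small-effect mutation
        f = fitness(candidate)
        if f >= best:                               # selection keeps improvements
            params, best = candidate, f
    return params, best

print(evolve())
```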
Using AI to find potentially more advanced GxG interactions, ethnic complexes, etc. would be super useful moving forward.
Also, knowing GxE can help too. It means that genes which are adaptive under different conditions can be preserved, while the conditions are taken into account when caring for the individuals who have them. This reduces problems of genetic bottlenecking.
> But interracial children are just as healthy as within-race children. In fact, even interspecies hybrids like mules are pretty healthy (their inability to breed comes from an unrelated chromosome issue).
I mean, lots of closely-related pairs of species can't interbreed at all because the fetus won't develop properly, right? Which immediately makes me wonder, how much are we missing by only counting successfully-born children and omitting miscarriages, which are pretty common in humans? Seems unlikely it should be a big deal here, but...
I've seen someone make the observation that how well mammals can interbreed is strongly related to how invasive the placenta is, and that the human placenta is much more invasive than average.
That link just goes to bza9’s profile rather than any specific argument.
> Related: if GxG interactions are a big deal, wouldn’t you expect outbreeding depression?
That appears to be the case for Neandertals vs modern out-of-Africans. Most Neandertal genes got purged, only the rare ones that added value to moderns survived.
> In fact, even interspecies hybrids like mules are pretty healthy (their inability to breed comes from an unrelated chromosome issue). So evolution can’t be using interactions like this.
Or you can't generalize from mules to every other pairing, like Neandertal & neo-Africans.
I grew up in a second world country that a certain class of counter-culture rich people moved to, to live that authentic life (non-pejorative: these motherfuckers came to the jungle and started farming and ranching and conserving on the side; the money was there as an emergency safety device instead of a crutch. Full kudos.), and I can say:
You might not be on any lists, you might not even have much money on paper, but every single one of these dudes/ettes had enough income (not wealth, but the ability to spend money no matter what or where) to do whatever they wanted forever*, and their parents had, and their grandparents had, and their kids had, and their kids' kids had, and their uncles and cousins and nephews had, and their uncles' and nephews' kids had. One guy in particular, whom you might see in various photography galleries, knew who made the money his family had been living on and when, AND was willing to talk about it: a British industrialist, who started passing down the fortune around 1750. Same family and same fortune since then, long after the specific name and business faded from history.
Hundreds of people who have never needed to and will never need to collect a paycheck, the modern image of landed gentry.
*Proviso: only if you never go into business personally, or develop a gambling/drug habit. If you do that, you instantly get cut off from the indestructible safety line and become a normal rich person, as far as I can tell. It is purposefully mysterious; these types don't want you to know where they got it and what they got, and most of the time THEY don't want to know either, and why should they?
> Also, shouldn’t this show up as high shared environment in twin studies?
Yeah, when writing my own post I spent a while trying to make up an illustrative example situation where C would be ≈ 0 in twin studies but where gene × environment effects would be obviously present. And I totally failed to think of anything remotely plausible. IIRC, you basically need to assume that there are multiple big effects that almost perfectly cancel each other out by a remarkable coincidence.
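For anyone who wants to poke at this themselves, a bare-bones twin simulation under assumed, made-up parameters: a genetic value, a family-level environment shared by co-twins, a multiplicative G×E term, and Falconer's formulas at the end to see where the variance lands.

```python
# Minimal MZ/DZ twin simulation with an explicit GxE term, then Falconer's
# estimates: A = 2(rMZ - rDZ), C = 2rDZ - rMZ, E = 1 - rMZ.
# All weights below are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
n_pairs = 200_000
a, c, e, gxe = 0.6, 0.4, 0.6, 0.5   # genetic, shared-env, unique-env, GxE weights

def twin_correlation(r_g):
    # r_g: genetic correlation between co-twins (1.0 for MZ, 0.5 for DZ)
    g_shared = rng.normal(size=n_pairs)
    g1 = np.sqrt(r_g) * g_shared + np.sqrt(1 - r_g) * rng.normal(size=n_pairs)
    g2 = np.sqrt(r_g) * g_shared + np.sqrt(1 - r_g) * rng.normal(size=n_pairs)
    env = rng.normal(size=n_pairs)  # family-level environment, fully shared
    y1 = a * g1 + c * env + gxe * g1 * env + e * rng.normal(size=n_pairs)
    y2 = a * g2 + c * env + gxe * g2 * env + e * rng.normal(size=n_pairs)
    return np.corrcoef(y1, y2)[0, 1]

r_mz, r_dz = twin_correlation(1.0), twin_correlation(0.5)
A, C = 2 * (r_mz - r_dz), 2 * r_dz - r_mz
print(f"rMZ={r_mz:.3f} rDZ={r_dz:.3f}  A={A:.3f} C={C:.3f} E={1 - r_mz:.3f}")
```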
This is not my area and I'm not following well at all, but I'm confused by the idea that GWASes should be expected to capture much of anything? Like, why is this analogy wrong?:
"I think the proteins are all determined by the genes"
"Well, when we try to build statistical models that take a genome and predict all the proteins, they do very badly"
Like, yeah, protein folding is a super hard computational problem! Do we have any reason to expect genome to phenotype prediction to be any easier, or to expect these GWASes to have anything like the level of sophistication required to capture it? Obviously some traits are just one or two genes, we can get those, but traits that need hundreds or thousands of genes will have combinatorially vast sets of interactions that I wouldn't expect these models to capture any measurable proportion of!
If GWASes could only be run on huge multi-billion dollar datacentres stuffed with GPUs, I might take their lack of predictive power as meaningful, but it seems like they're not like that?
Or like, our very best statistical models to take in the source code of a video game and predict its review rating out of 10 on Steam will do very badly, and this says very little about the extent to which ratings are caused by source code.
I'm not taking a strong stance here because I'm clearly confused about at least one thing, but something I'm definitely confused about is how little discussion I see of how computationally difficult we should expect the task of a GWAS to be, when my intuition says it's "really really extremely computationally difficult".
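For what it's worth, here is a rough sense of what the core computation in a standard single-variant GWAS actually is (simulated data; no covariates, relatedness, or population-structure corrections): essentially one cheap regression per variant, rather than a search over combinations.

```python
# What a basic GWAS computes: one univariate association test per variant,
# not a combinatorial search over gene combinations. Simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_people, n_snps = 10_000, 1_000
genotypes = rng.binomial(2, 0.3, size=(n_people, n_snps)).astype(float)
# One truly associated SNP (effect size made up), the rest are noise.
phenotype = 0.3 * genotypes[:, 42] + rng.normal(size=n_people)

pvals = np.array([
    stats.linregress(genotypes[:, j], phenotype).pvalue
    for j in range(n_snps)
])
print("most significant SNP:", int(pvals.argmin()), "p =", pvals.min())
```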
Funny, I got exactly the opposite impression. All the fancy new methods of the last ~15 years show low heritability for behavioural traits, Scott (and other hereditarians) tries his best to nitpick them, and is still eventually reduced to saying something like "what convinces me more than any of the studies is my 'lived experience' with the top 1% criminally insane, criminal and insane adoptive children". The article made me update towards anti-hereditarianism quite a bit.
While we are talking meta-heuristics: why am I supposed to trust the studies from the very-failed-field-of-failing-to-replicate-studies from before when the field discovered it was oh so very failed over the studies from the formerly-very-failed-field-of-failing-to-replicate-studies done after it discovered it had oh so verily failed?
The discussion on people losing wealth over generations seems very much a US bias, and perhaps a recency bias - where by recency I mean the last two centuries. I wonder if the Rockefellers falling off the top 400 is a reduction in their wealth or an increase in the wealth of the rest of the 0.01%. Is it that surprising that the new tech and financial tycoons of this new gilded age have surpassed the old captains of industry over a century of enormous economic growth? For most of history this just doesn’t happen and I believe the wealthy families of the Roman Empire consolidated their wealth over time.
Here in Britain, remembering something I heard in a podcast recently, I looked up the wealth of the Duke of Westminster, Hugh Grosvenor. He's further down the list than I assumed, ranking 14th with approximately £9.88 billion of wealth. His family traces its lineage all the way back to Gilbert le Grosvenor, who rode in with William the Conqueror in 1066. That's nearly a thousand years of inherited power and land.
But I too may have a local British bias, because a French Grosvenor may not have survived the French Revolution, and a Russian Grosvenor definitely would not have survived the Bolsheviks. Probably most of Europe has seen more churn: if not revolution, then certainly wealth-destroying wars.
> Maybe the best we can do is blame autocorrelation? That is, for all the data points on the graph, there are really only three clusters - Europeans, Africans, and everyone else. So you really only need ~3 unlucky coincidences to get this finding. And three unlucky coincidences, if you admitted they were three unlucky coincidences, wouldn’t be statistically significant, let alone “p = 7e-08” (lol). So maybe all the technical issues just explain why we shouldn’t take the scores seriously, and the answer to why it matches reality is a combination of “bad luck” and “it doesn’t really match reality that well, cf. the Chinese vs. Puerto Rican issue, but with enough autocorrelated data points even small coincidental matches look very significant”.
Two unlucky coincidences, surely? Europeans have arbitrary value of EA4 that is X. Chinese/Puerto Rican/Rest of the World then have arbitrary value of EA4 which is lower than that of Europeans. Africans then have arbitrary value of EA4 which is lower than Rest of World. All you need is the RotW to be below Euros, and the Africans to be below RotW; Euro's actual value doesn't matter. If they had EA4 that was one kojillion or six, as long as the RotW is below them and the Africans are below that, you get this result. And these coincidences are quite likely; there are only 6 orders (ERA, EAR, RAE, REA, ARE, AER), so there's a 1/6 chance of ERA.
Source for "nature" being the dominant 19th-century view? (I assume The Nurture Assumption talks about this but I haven't read it.)
Though strangely, inheritance only applied to legitimate children.
"But interracial children are just as healthy as within-race children. In fact, even interspecies hybrids like mules are pretty healthy (their inability to breed comes from an unrelated chromosome issue)."
You're forgetting Mendel. We have paired chromosomes so the first generation cross has all the proteins of both parents (hybrid vigour). It's the next generation where you'd expect problems on that model and then only if you interbred the first generation offspring.
We have loads of data from plant and animal breeding. Humans are no different.
Of course if you're non-WEIRD you wouldn't even consider marrying somebody from another race. Of course, marriage isn't always involved.
Non-conformity always adds stress. A man and a woman is mixed marriage enough without adding to the differences.
As luck would have it, Scott has written a post on exactly that question (whether it's fair to describe schizophrenia as genetic): https://www.astralcodexten.com/p/its-fair-to-describe-schizophrenia
Georgia had a land lottery when it was being settled, and we've looked into the results on later generations:
https://westhunt.wordpress.com/2015/04/22/the-lottery/
A land lottery doesn’t really compare to winning the lottery.
Why? That's winning A lottery.
Because we are talking about “fortunes” in general when talking about lottery winners. As was the guy you were responding to.
No, there are no Vanderbilts on the Forbes 400. In aggregate, they don't even make the Forbes list of America's Richest Families:
https://www.forbes.com/sites/natalierobehmed/2014/07/14/the-vanderbilts-how-american-royalty-lost-their-crown-jewels/
Re "clear evidence for high heritability": the point of the post(s) is the opposite. Twin studies and heritability have fallen out of favour.
The idea that evolution is guaranteed to weed out nonlinear genetic combinations seems absurd. It is just too easy to conceive of ones that must exist. Height and metabolic rate would multiplicatively contribute to caloric requirements. Some neurotransmitter/neuron interactions must be multiplicative. I design IC's. We find that our distributions of key parameters (which are based on hundreds or thousands of physical parameters (mostly unmeasured) are seemingly normal distributions (additive). nonetheless, it is clear that numerous interactions are, in fact, multiplicative. They are just outweighed by many other additive factors. So, we know that nonlinearities don't have to tilt the scale and we know that a few do exist that can have real measurable impact(caloric requirement).
> (I do think villagers had one other advantage lots of moderns didn’t, which was that they closely observed animal breeding. But that’s not common sense - that’s access to important scientific data!)
Despite this advantage, it still took them millennia to understand heredity. For the longest time, people had similarly hilarious misconceptions about animal breeding as about human conception. See https://gwern.net/review/bakewell#the-invention-of-heritability
As someone without any background in genetics beyond high school biology, who read the original piece, skimmed this one, and has gone through several similar Substack pieces discussing heritability, I think I've determined I have no idea whether I believe pro- or anti-hereditarians are closer to the truth (though I do agree that elite liberals have hewed far too closely to a naive anti-hereditarian line in most ways, which isn't exactly a revelation).
It seems like there are just methodological disagreements piled on methodological disagreements, and there's no way of teasing things apart without getting something close to a Master's in the field.
It's frustrating to me (in a very self-centered and myopic way) reading debates where neither side is making a baseline error that allows me to explain away most of their arguments, lol. This is why I tend to stick closer to pure theory in my pleasure reading. I guess I'll just take the epistemically cautious option and half-heartedly adopt a position somewhere in the middle, while keeping an open mind.
This whole discourse has had the benefit of reminding me, though, that when I have kids, so long as my approach to raising them is well thought-out and executed, there should be a very good chance that they turn out similar to me and my partner, which I guess feels nice lol.
The jargon is far beyond what I can understand, which is disappointing since it is being mediated through a very smart layman who is good at explaining things, or at least his conception of them. If, reading one of his many paragraphs on this subject, I feel I "get" the gist, I abandon the paragraph on the assumption that that is as good as it gets and that I don't want to get muddled.
However, the baseline error, or rather confusion, I see here is whether people are *all* talking about heritability as a matter of one or two generations (which doesn't really make sense to me as a definition), and whether the anti-hereditarians are denying that all the information there *can be* about the value of a trait is in the genome, however it gets expressed.
"I do think villagers had one other advantage lots of moderns didn’t, which was that they closely observed animal breeding. But that’s not common sense - that’s access to important scientific data!"
A certain Dr Costin "BAP" Alamariu wrote his doctoral dissertation on this very matter...
LLMs (to use the new hotness) seem like a much better deterministic (temp=0) program comparison to the genome than, say, the Linux kernel. Like the genome, they are evolved systems that exhibit complex behaviors. Weights are like base pairs in the analogy, and just as with the genome, the systems at a high level are quite robust to some shuffling and tweaking of weights (lots of regularization techniques, e.g. dropout, map onto learning to learn in the way that organisms evolve to evolve, and are a bit like recombination in that respect).
I'd be really curious what you'd get if you took a bunch of differentially fine-tuned descendants of a large base model and did genome-wide-association-style studies on the weights, looking at their different behaviors. I bet you wouldn't find much, even though it's 100% weight-driven, and at the same time I bet recombination between the models would work quite well, with somewhat intuitive behavioral results. We largely know why, too: it's that we're only looking at weight interactions in the compressed latent space, and if we were able to see into the much higher-dimensional space, then it would look a lot more like linear combinations. Someone should run the experiment if it hasn't been done already.
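(A rough sketch of what that experiment could look like, with everything here hypothetical: synthetic stand-ins for the flattened weight vectors of the fine-tuned descendants and a scalar behavioural score per model, then a per-weight association test in the style of a single-variant GWAS.)

```python
import numpy as np

rng = np.random.default_rng(0)
n_models, n_weights = 200, 10_000            # made-up sizes

# Hypothetical inputs: row i = flattened weight deltas of fine-tuned model i,
# scores[i] = some scalar behavioural measurement of that model.
weights = rng.normal(size=(n_models, n_weights))
scores = rng.normal(size=n_models)

# "GWAS on the weights": correlate each individual weight with the behaviour,
# one weight at a time, just as a basic GWAS regresses the phenotype on one variant at a time.
W = (weights - weights.mean(axis=0)) / weights.std(axis=0)
s = (scores - scores.mean()) / scores.std()
r = W.T @ s / n_models                        # per-weight correlation with the behaviour
z = r * np.sqrt(n_models)                     # approximate z-scores under the null

hits = np.flatnonzero(np.abs(z) > 4.5)        # crude multiple-testing threshold
print(f"{hits.size} of {n_weights} weights pass the threshold")
```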
I think the genome must also be working in some higher-dimensional space; it's just not big enough, with not enough genes, to be anything else. The intuition pump for me here isn't even humans, it's organisms with complex instinctive behaviors (e.g. things like the hunting behaviors of snakes or the migrating behaviors of butterflies): just think mechanistically about how a genome can lead to such behaviors, and in an evolvable way, where a small mutation or recombination somehow lands 'near' the existing behavior so it can keep evolving with a changing environment. We need something like sparse autoencoders for the genome.
> you can’t, even in principle, peer into a cell with a microscope, and tell whether the two genes are “interacting” or not.
Actually you totally can! For example, a good molecular biologist could tell if a mutation in one gene causes its protein to not bind to the promoter of another gene.
For direct physical interactions between genes you could also look at things like Hi-C for chromatin looping. E.g. https://pmc.ncbi.nlm.nih.gov/articles/PMC7415676/
If you're being pedantic and talking about literal microscopes, there are things like FRET experiments which can look at protein-protein interactions.
Your comment is entirely missing the point, in a way that makes me think you either didn’t read or didn’t understand the rest of what I wrote.
…But thanks for the great illustration of my claim that “many people in the field have a tendency to think about those topics in an overly narrow way”. :-P
I think I got your point re GxG and would be interested in your thoughts on the following (apologies if you addressed this in your full piece, I have not yet read it in its entirety): the pathway you describe goes from genes to "traits" to "outcomes", where the paradigmatic trait is a purely "organic" property such as height or blood pressure and the paradigmatic outcome is a highly socially constructed property such as being divorced or being diagnosed with antisocial personality disorder (hence interactions causing the latter kind of property are not something you can observe in a cell).
My question is: do you think GxG on outcomes can reasonably be called heritability? The outcomes you describe are highly culturally dependent. E.g. diagnosis rates for mental disorders and divorce rates change over time even in the same culture. Surely this kind of "heritability" is massively environmentally modulated? I.e. the underlying traits might interact towards a propensity for the given outcome in one environment (high societal divorce rate) but not the other (low societal divorce rate). Sasha Gusev opined in the comments to the original post that GxG could reasonably be described as heritability but GxE not. This seems plausible for GxG on traits but it seems to me that GxG on outcomes would belong in the same bucket re heritability as GxE?
It just does not square with an ordinary-language understanding of heritability that heritable properties would include things that are passed down for some generations (e.g. a propensity for sports injuries) but then simply stop being heritable when the culture shifts (e.g. VR gaming becomes the prime example of thrill-seeking behavior).
I think you have the wrong idea when you talk about “can reasonably called heritability” or “ordinary language understanding of heritability”. Heritability is a scientific term with a precise and universally-accepted definition. It means what it means.
I think what you’re getting at is: many people are confused about heritability—they have mistaken ideas about what you can and cannot infer when you learn that a Measurement X has high or low heritability in Population Y. …And that’s absolutely true! Lots of people are confused that way, and we should all work together to set them straight.
In the case at hand, if Population A is 50% American and 50% Yanomami, and Population B is 100% American, then the heritability of pretty much anything will be much lower in Population A than Population B. Indeed, that applies not just to mental health, personality, and behavior, but also to things like height and blood pressure.
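(A toy version of that claim, with made-up numbers, assuming for simplicity that the between-group difference is purely environmental: mixing the two groups inflates the environmental variance and so drags h² = Var(G) / (Var(G) + Var(E)) down.)

```python
# Toy numbers only: same genetic variance in both groups, same within-group
# environmental variance, but a large gap in mean environment between groups.
var_g = 1.0
var_e_within = 1.0
env_gap = 4.0   # difference in mean environmental effect between the two groups

h2_within = var_g / (var_g + var_e_within)

# In a 50/50 mixture, the between-group environmental difference adds (gap/2)^2 of variance
var_e_mixed = var_e_within + (env_gap / 2) ** 2
h2_mixed = var_g / (var_g + var_e_mixed)

print(h2_within, h2_mixed)   # 0.5 within either group vs ~0.17 in the mixed population
```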
> Consider that less than a century after his death, there are no longer any Rockefellers on the Forbes 400 list.
Depending on the time window, Forbes 400 turnover is roughly 5% a year. A Cato study showed that in the span from 1982–2014 (32 years), 71% who were on the list in 1982 (including heirs) had left the ranks. So it's not surprising that Rockefellers or other members of 19th-century and early 20th-century families are no longer on the list. However, the heirs to the Rockefeller fortune have a bunch of family trusts that keep (most of) the heirs wealthy, but not super wealthy.
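(As a rough sanity check, treating the ~5%-a-year figure as an independent per-member exit probability, which is a simplification, gives attrition in the same ballpark as the Cato number.)

```python
# Crude compounding check: ~5%/year exit rate over 32 years
p_stay_per_year = 0.95
years = 32
print(1 - p_stay_per_year ** years)   # ~0.81, roughly the same ballpark as the 71% figure
```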
This agrees with the point! It's not surprising.
Is parental favoritism accounted for in twin studies? Kin selection would predict that parental investment (maybe only for fathers) is mediated by indicators of genetic similarity, e.g. facial similarity between parent and child. If something like that happens, the EEA would be less true for fraternal twins, and non-randomly: a more intelligent parent may invest more in the fraternal twin who is more genetically similar to them, and that investment could plausibly increase IQ in that twin.
I think the idea that the genome is akin to computer code cannot be true, or even roughly true: normal computer code is extremely fragile, so any mutation would be either neutral (no effect) or very detrimental; there would be no chance of a single mutation being slightly positive or slightly negative, which is a must-have for progressive selection sorting through single mutations. But the genome cannot really be a bunch of completely decorrelated variants, each having a small (positive or negative) delta effect on fitness that you can simply add up (the linear model); this makes no sense in a complex organism. So there should be correlations, probably complex ones, but this does not necessarily rule out selection; it just makes it slower and promotes stagnation, until environmental change makes one particular change so positive that it exceeds all the negative cross-correlations. After fixation of that change, it triggers a cascading effect as the correlated genes get optimized to cancel or reverse the negative cross-correlation effects of the initial change. This sounds a lot like punctuated equilibrium.
This is also what you get with food recipes or computer code tuned through genetic algorithms: you have a lot of parameters with complex cross-correlations, or else nothing interesting comes out of it, but also a way to change them one at a time mostly without complete failure (or else the mutants tell you nothing about how quality can be improved). I suspect the genome is largely like that, with some easy parts acting as boring uncorrelated traits and maybe some critical parts where changing anything leads to complete failure (those parts will not evolve, at least until the rest of the code makes them non-critical).
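(A toy version of that tuning picture, with an invented fitness function: mostly independent additive terms plus a few cross-correlated pairs, mutated one parameter at a time. Single mutations mostly cause small fitness changes rather than complete failure, and the climb still makes progress despite the interactions.)

```python
import numpy as np

rng = np.random.default_rng(1)
n_params = 50
w = rng.uniform(0.5, 2.0, size=n_params)   # additive weights (all positive)
pairs = [(0, 1), (2, 3), (4, 5)]           # a few cross-correlated parameter pairs

def fitness(x):
    f = -np.sum(w * (x - 1.0) ** 2)        # additive part: each parameter has its own optimum
    for i, j in pairs:
        f -= (x[i] * x[j] - 1.0) ** 2      # interacting part: only the product is constrained
    return f

x = rng.normal(size=n_params)              # random starting "recipe"
best = start = fitness(x)
deltas = []
for _ in range(5000):
    i = rng.integers(n_params)             # mutate one parameter at a time
    candidate = x.copy()
    candidate[i] += rng.normal(0.0, 0.1)
    f_new = fitness(candidate)
    deltas.append(f_new - best)
    if f_new > best:                       # keep only improving single mutations
        x, best = candidate, f_new

print("median |Δfitness| per single mutation:", np.median(np.abs(deltas)))
print("fitness improved from", round(start, 1), "to", round(best, 1))
```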
Somebody in the initial post also mentioned that selection acts mainly as a pushback against the accumulation of mutation load (the accumulated slight detrimental effects of many mutations, each individually having a very small negative effect; they accumulate because, their effects being so small individually, their removal through natural selection is very slow). It means global fitness is quite far from optimal and is affected a lot by general mutation load. And polygenic traits are often largely affected by global mutation load rather than by many specific genes. This would explain a significant correlation between different (all?) polygenic traits, and maybe the missing heritability, because measuring mutation load is difficult (simply counting SNPs, even rare or unique ones, may not be enough because of the cross-correlations), and, except for twins, the load with its cross-correlation effects may be more variable than the classic shared-gene count predicts: the load of siblings is not necessarily the average of the parents' loads. I think this is roughly what Greg Cochran has been saying for years or decades...
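(A minimal illustration of how a shared mutation load could correlate otherwise unrelated polygenic traits, with made-up effect sizes: each person carries a Poisson number of mildly deleterious mutations, and each trait takes a small hit from every one of them.)

```python
import numpy as np

rng = np.random.default_rng(2)
n_people = 20_000

# Each person's count of mildly deleterious mutations ("load")
load = rng.poisson(100, size=n_people)

# Two otherwise unrelated polygenic traits, each slightly degraded by every load mutation
trait1 = rng.normal(size=n_people) - 0.05 * load
trait2 = rng.normal(size=n_people) - 0.05 * load

print("correlation between the two traits:",
      np.corrcoef(trait1, trait2)[0, 1])   # ~0.2 with these made-up numbers
```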
Using AI to find potentially more advanced gXg interactions, ethnic complexes, etc. would be super useful moving forward.
Also, knowing gXe can help: it means that genes which are adaptive under different conditions can be preserved, while the conditions are taken into account when caring for the individuals who carry them. This reduces problems of genetic bottlenecking.
Your link to bza9's response just goes to bza9's profile on Substack instead. Can you fix this? Thanks!
I noticed the same thing. Here is a link to his highest-level comment on the previous ACX post:
https://www.astralcodexten.com/p/missing-heritability-much-more-than/comment/129589130
> But interracial children are just as healthy as within-race children. In fact, even interspecies hybrids like mules are pretty healthy (their inability to breed comes from an unrelated chromosome issue).
I mean, lots of closely-related pairs of species can't interbreed at all because the fetus won't develop properly, right? Which immediately makes me wonder, how much are we missing by only counting successfully-born children and omitting miscarriages, which are pretty common in humans? Seems unlikely it should be a big deal here, but...
See also: Lethal allele.
I've seen someone make the observation that how well mammals can interbreed is strongly related to how invasive the placenta is, and that the human placenta is much more invasive than average.
> But see bza9’s argument against here
That link just goes to bza9’s profile rather than any specific argument.
> Related: if GxG interactions are a big deal, wouldn’t you expect outbreeding depression?
That appears to be the case for Neandertals vs modern out-of-Africans. Most Neandertal genes got purged; only the rare ones that added value to moderns survived.
> In fact, even interspecies hybrids like mules are pretty healthy (their inability to breed comes from an unrelated chromosome issue). So evolution can’t be using interactions like this.
Or you can't generalize from mules to every other pairing, like Neandertal & neo-Africans.
I grew up in a second-world country that a certain class of counter-culture rich people moved to in order to live that authentic life (non-pejorative: these motherfuckers came to the jungle and started farming and ranching and conserving on the side; the money was there as an emergency safety device instead of a crutch. Full Kudos.) and I can say:
You might not be on any lists, you might not even have much money on paper, but every single one of these dudes/dudettes had enough income (not wealth, but the ability to spend money no matter what or where) to do whatever they wanted forever*, and their parents had, and their grandparents had, and their kids had, and their kids' kids had, and their uncles and cousins and nephews had, and their uncles' and nephews' kids had. One guy in particular, whom you might see in various photography galleries, knew who made the money his family had been living on and when, AND was willing to talk about it: a British industrialist, who started passing down the fortune around 1750. Same family and same fortune since then, long after the specific name and business faded from history.
Hundreds of people who have never needed to and will never need to collect a paycheck, the modern image of landed gentry.
*Proviso: only if you never go into business personally, or develop a gambling/drug habit. If you do that, you instantly get cut off from the indestructible safety line and become a normal rich person, as far as I can tell. It is purposefully mysterious: these types don't want you to know where they got it and what they got, and most of the time THEY don't want to know either, and why should they?
> Also, shouldn’t this show up as high shared environment in twin studies?
Yeah, when writing my own post I spent a while trying to make up an illustrative example situation where C would be ≈ 0 in twin studies but where gene × environment effects would be obviously present. And I totally failed to think of anything remotely plausible. IIRC, you basically need to assume that there are multiple big effects that almost perfectly cancel each other out by a remarkable coincidence.
So instead the thing I put into my post was a toy example (based on EA) where twin studies would measure C>>0 but in some sense the real C “should be” even bigger than that. See table in §1.5.3: https://www.lesswrong.com/posts/xXtDCeYLBR88QWebJ/heritability-five-battles#1_5_3_The_simplest_twin_study_analysis_assumes_no_interaction_between_genes_and_shared_environment
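(For reference, here is the textbook twin-study decomposition (Falconer's formulas) that C comes from; a big gene x shared-environment effect would normally be expected to surface there as inflated C. This is standard background, not anything specific to the linked post.)

```python
def falconer_ace(r_mz: float, r_dz: float) -> tuple[float, float, float]:
    """Classic ACE decomposition: r_MZ = A + C and r_DZ = A/2 + C."""
    a = 2 * (r_mz - r_dz)   # additive genetic share
    c = 2 * r_dz - r_mz     # shared-environment share
    e = 1 - r_mz            # non-shared environment (plus measurement error)
    return a, c, e

# Hypothetical twin correlations, just to show the arithmetic:
print(falconer_ace(0.75, 0.45))   # -> (0.6, 0.15, 0.25)
```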
This is not my area and I'm not following well at all, but I'm confused by the idea that GWASes should be expected to capture much of anything? Like, why is this analogy wrong?:
"I think the proteins are all determined by the genes"
"Well, when we try to build statistical models that take a genome and predict all the proteins, they do very badly"
Like, yeah, protein folding is a super hard computational problem! Do we have any reason to expect genome to phenotype prediction to be any easier, or to expect these GWASes to have anything like the level of sophistication required to capture it? Obviously some traits are just one or two genes, we can get those, but traits that need hundreds or thousands of genes will have combinatorially vast sets of interactions that I wouldn't expect these models to capture any measurable proportion of!
If GWASes could only be run on huge multi-billion dollar datacentres stuffed with GPUs, I might take their lack of predictive power as meaningful, but it seems like they're not like that?
Or like, our very best statistical models to take in the source code of a video game and predict its review rating out of 10 on Steam will do very badly, and this says very little about the extent to which ratings are caused by source code.
I'm not taking a strong stance here because I'm clearly confused about at least one thing, but something I'm definitely confused about is how little discussion I see of how computationally difficult we should expect the task of a GWAS to be, when my intuition says it's "really really extremely computationally difficult".
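(For what it's worth, the reason compute rarely comes up is that a basic GWAS never attempts the hard combinatorial problem in the first place: it is a mass of one-SNP-at-a-time additive regressions, so its limits are about the additive model and sample size rather than raw computation. A schematic sketch on synthetic data, just to show the shape of the calculation:)

```python
import numpy as np

# Synthetic data, only to show the shape of the computation a basic GWAS does:
# one simple additive regression per SNP, nothing combinatorial.
rng = np.random.default_rng(3)
n_people, n_snps = 5_000, 1_000
genotypes = rng.integers(0, 3, size=(n_people, n_snps)).astype(float)   # 0/1/2 allele counts
true_effects = rng.normal(0.0, 0.2, size=20)                            # 20 causal SNPs (invented)
phenotype = genotypes[:, :20] @ true_effects + rng.normal(size=n_people)

G = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
y = (phenotype - phenotype.mean()) / phenotype.std()
betas = G.T @ y / n_people           # per-SNP additive effect estimates (here, correlations)
z = betas * np.sqrt(n_people)        # approximate test statistics
print("SNPs passing |z| > 5:", np.sum(np.abs(z) > 5))
```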
Funny, I got exactly the opposite impression. All the fancy new methods of the last ~15 years show low heritability for behavioural traits; Scott (and other hereditarians) tries his best to nitpick them and is still eventually reduced to saying something like "what convinces me more than any of the studies is my 'lived experience' with the top 1% criminally insane, criminal and insane adoptive children". The article made me update towards anti-hereditarianism quite a bit.
While we are talking meta-heuristics: why am I supposed to trust the studies from the very-failed-field-of-failing-to-replicate-studies from before when the field discovered it was oh so very failed over the studies from the formerly-very-failed-field-of-failing-to-replicate-studies done after it discovered it had oh so verily failed?
What failures of replication are you talking about?
The discussion on people losing wealth over generations seems very much a US bias, and perhaps a recency bias - where by recency I mean the last two centuries. I wonder if the Rockefellers falling off the top 400 is a reduction in their wealth or an increase in the wealth of the rest of the 0.01%. Is it that surprising that the new tech and financial tycoons of this new gilded age have surpassed the old captains of industry over a century of enormous economic growth? For most of history this just doesn’t happen and I believe the wealthy families of the Roman Empire consolidated their wealth over time.
Here in Britain, remembering something I heard in a podcast recently, I looked up the wealth of the Duke of Westminster, Hugh Grosvenor. He’s further down the list than I assumed, ranking 14th with approximately £9.88 billion of wealth. His family traces its lineage all the way back to Gilbert le Grosvenor, who rode in with William the Conqueror in 1066. That’s nearly a thousand years of inherited power and land.
But I too may have a local British bias, because a French Grosvenor may not have survived the French Revolution, and a Russian Grosvenor definitely would not have survived the Bolsheviks. Probably most of Europe has seen more churn: if not revolution, then certainly wealth-destroying wars.
> Maybe the best we can do is blame autocorrelation? That is, for all the data points on the graph, there are really only three clusters - Europeans, Africans, and everyone else. So you really only need ~3 unlucky coincidences to get this finding. And three unlucky coincidences, if you admitted they were three unlucky coincidences, wouldn’t be statistically significant, let alone “p = 7e-08” (lol). So maybe all the technical issues just explain why we shouldn’t take the scores seriously, and the answer to why it matches reality is a combination of “bad luck” and “it doesn’t really match reality that well, cf. the Chinese vs. Puerto Rican issue, but with enough autocorrelated data points even small coincidental matches look very significant”.
Two unlucky coincidences, surely? Europeans have some arbitrary value of EA4, call it X. Chinese/Puerto Ricans/Rest of the World then have an arbitrary value of EA4 which is lower than that of Europeans. Africans then have an arbitrary value of EA4 which is lower than the Rest of the World's. All you need is the RotW to be below the Euros, and the Africans to be below the RotW; the Euros' actual value doesn't matter. If they had an EA4 of one kojillion or six, as long as the RotW is below them and the Africans are below that, you get this result. And these coincidences are quite likely; there are only 6 orders (ERA, EAR, RAE, REA, ARE, AER), so there's a 1/6 chance of ERA.
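(The same count in code form, purely for completeness: three clusters give 3! = 6 possible orderings, so any one specific ordering has probability 1/6 under the "arbitrary values" assumption.)

```python
from itertools import permutations

clusters = ["Europeans", "Rest of world", "Africans"]
orders = list(permutations(clusters))
print(len(orders), 1 / len(orders))   # 6 orderings, so P(any specific one) = 1/6
```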