[see footnote 4 for conflicts of interest]
In 2021, Genomic Prediction announced the first polygenically selected baby.
When a couple uses IVF, they may get as many as ten embryos. If they only want one child, which one do they implant? In the early days, doctors would just eyeball them and choose whichever looked healthiest. Later, they started testing for some of the most severe and easiest-to-detect genetic disorders like Down Syndrome and cystic fibrosis [1]. The final step was polygenic selection - genotyping each embryo and implanting the one with the best genes overall.
Best in what sense? Genomic Prediction claimed the ability to forecast health outcomes from diabetes to schizophrenia. For example, although the average person has a 30% chance of getting type II diabetes, if you genetically test five embryos and select the one with the lowest predicted risk, they’ll only have a 20% chance [2]. Since you’re taking the healthiest of many embryos, you should expect a child conceived via this method to be significantly healthier than one born naturally. Polygenic selection straddles the line between disease prevention and human enhancement.
In 2023, Orchid Health entered the field. Unlike Genomic Prediction, which tested only the most important genetic variants, Orchid offers whole genome sequencing, which can detect the de novo [3] mutations involved in autism, developmental disorders, and certain other genetic diseases.
Critics accused GP and Orchid of offering “designer babies”, but this was only true in the weakest sense - customers couldn’t “design” a baby for anything other than slightly lower risk of genetic disease. These companies refused to offer selection on “traits” - the industry term for the really controversial stuff like height, IQ, or eye color. Still, these were trivial extensions of their technology, and everybody knew it was just a matter of time before someone took the plunge.
Last month, a startup called Nucleus took the plunge. They had previously offered 23andMe style genetic tests for adults. Now they announced a partnership with Genomic Prediction focusing on embryos. Although GP would continue to only test for health outcomes, you could forward the raw data from GP to Nucleus, and Nucleus would predict extra traits, including height, BMI, eye color, hair color, ADHD, IQ, and even handedness.
And this week, Herasight [4] entered the space with the most impressive disease risk scores yet, an IQ predictor worth 6-9 extra points [5], and a series of challenges to competitors, whom they call out for insufficient scientific rigor. Their most scathing attack is on Nucleus itself, accusing its predictions of being misleading and unreliable.
Let’s start with the science, then move on to the companies and see if we can litigate their dispute.
In Theory, All Of This Should Work
Polygenic embryo screening is a natural extension of two well-validated technologies: genetic testing of embryos, and polygenic prediction of traits in adults.
Genetic testing of embryos has been done for decades, usually to detect chromosomal abnormalities like Down Syndrome or simple single-gene disorders like cystic fibrosis. It’s challenging - you need to take a very small number of cells (often only 5-10) from a tiny proto-placenta that may not have many cells to spare, and extract a readable amount of genetic material from this limited sample - but there are known solutions that mostly work.
But most traits are polygenic, requiring information about thousands or tens of thousands of genes to predict. These are too complicated to understand fully at current levels of technology, but some studies have chipped away at the problem and gotten a partial understanding. Often this looks like being able to predict a few percent of the variance in a trait, and determine whether someone’s genetic risk is slightly higher or lower than average.
Polygenic prediction of traits in adults is still young and full of hidden pitfalls. Last month, we discussed how some early studies unknowingly conflated direct genetic effects and various confounders [6] - for example, they tended to pick up on genes associated with well-off ethnic groups or families who had good health outcomes for social reasons. Pinpointing the direct component requires an additional step where researchers validate their algorithms within families (for example, on pairs of siblings where one has a higher polygenic score than the other) to see how much predictive power remains. This is especially important for embryo selection companies, whose entire value proposition depends on comparing two genomes from the same family.
How have they done? It depends on the number of embryos they have to work with; the more embryos, the better you can do by selecting the best.
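To make the "more embryos, better selection" point concrete, here is a minimal sketch of the order-statistics arithmetic. It assumes embryo scores are independent draws from a normal distribution, and it uses the common simplification that siblings share half of the population variance in a polygenic score, so the expected gain is E[max of n] × trait SD × sqrt(r²/2). The function names and the example value r² = 0.10 are hypothetical illustrations, not any company's figures.

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_max_of_n(n, lo=-8.0, hi=8.0, steps=16000):
    """Expected value of the largest of n independent standard normal draws.

    The maximum has density n * pdf(x) * cdf(x)**(n - 1); we integrate
    x times that density with the trapezoid rule.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x * n * normal_pdf(x) * normal_cdf(x) ** (n - 1)
    return total * h

def expected_gain(n_embryos, r2_direct, trait_sd):
    """Expected trait gain from implanting the top-scoring of n embryos.

    ASSUMPTION: r2_direct is the share of trait variance explained by the
    score's direct (within-family) effects, and sibling embryos vary with
    half the population variance of the score.
    """
    return expected_max_of_n(n_embryos) * trait_sd * math.sqrt(r2_direct / 2.0)

for n in (1, 2, 5, 10, 20):
    print(n, round(expected_max_of_n(n), 3))

# Hypothetical example: a score with r2_direct = 0.10 on a trait with SD 15.
print(round(expected_gain(5, 0.10, 15.0), 2))
```

The gain grows with the number of embryos but with sharply diminishing returns: going from one embryo to five adds about 1.16 standard deviations of between-embryo score spread, while going from five to ten adds much less.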

Here is a table of different companies’ reported risk reductions, slightly adjusted [7] for different reporting conventions but otherwise taking all claims at face value (we’ll talk about how wise that is later).

Some people might genuinely want to select on a single condition. For example, people with a strong family history of schizophrenia might want to minimize the chance of their children getting the disease; for these people, reducing schizophrenia risk by 58% (while keeping everything else constant) sounds pretty good.
Everyone else probably wants a generically healthy embryo with low risk of all conditions. Exactly how this works depends on the customer’s own values - would they prefer an embryo with lower cancer risk to one who will have fewer heart attacks? - and the exact benefits will depend on how parents make that decision. Genomic Prediction and Herasight try to help by providing semi-objective measures of which embryo is overall healthiest according to different conditions’ effects on longevity and patient-rated quality of life.
For Genomic Prediction, that’s the “embryo health score”. If you selected the single highest-health-score embryo from a set of five, here’s how they’d do:
For Herasight, it’s a “polygenic longevity index”. They don’t give exact risk reduction numbers for each disease, saying that it depends too much on a couple’s specific family history, but say that most people gain 1-4 years of healthy life (when I test it on a set of twenty embryos, the healthiest gets an extra 1.66 years).
How much would you pay to give your children an extra 1-4 years of healthy life?
This is no longer a hypothetical question. Here are the costs of the companies in this space:
Is it worth it?
If:
You’re already doing IVF
The claimed risk reductions are accurate
You value your kids’ health about as much as your own
You have a low time discount rate
You’re well-off enough that these aren’t extraordinary sums of money for you
You’re okay using expected utility calculations where a 50% chance of preventing X is half as good as fully preventing X.
…then I’ll go out on a limb and say yeah, obviously it’s worth it.
Consider e.g. Genomic Prediction, which costs $3,250 for five embryos and claims to lower absolute risk of Type 2 diabetes by 12%. That implies that not getting Type 2 diabetes is worth $27,000. Ask anybody dealing with regular insulin injections (let alone limb amputations) whether it would be worth $27,000 to wave a magic wand and not have Type 2 diabetes! It’s not a hard question! And that’s just one of a dozen conditions you can lower the risk for! Other ones, like not getting breast cancer, might be so valuable that it’s hard to even attach numbers!
(but maybe the low time discount rate is a mistake? Suppose you invest the $3,250 in an index fund that makes 7% over inflation, then give it to your future child when they turn 45 (average age of type 2 diabetes diagnoses). Now it’s worth $75,000. Is this the “true” cost of the intervention? Does it matter that this counterfactual is fake and most people don’t do this?)
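The arithmetic in the two paragraphs above can be checked with a short sketch. The break-even figure is simply cost divided by absolute risk reduction. For the index-fund counterfactual, the text does not state a compounding convention, so both annual and monthly compounding are shown as assumptions. Variable and function names are illustrative.

```python
cost = 3250.0                   # Genomic Prediction price for five embryos (from the text)
absolute_risk_reduction = 0.12  # claimed absolute reduction in Type 2 diabetes risk

# Price at which the purchase breaks even in expectation:
# you pay `cost` for a 12% chance of avoiding the disease.
break_even_value = cost / absolute_risk_reduction

def future_value(principal, annual_rate, years, periods_per_year=1):
    """Compound `principal` at `annual_rate` for `years`."""
    n_periods = years * periods_per_year
    return principal * (1.0 + annual_rate / periods_per_year) ** n_periods

# ASSUMPTION: 7% real return, held for 45 years; compounding frequency is a guess.
fv_annual = future_value(cost, 0.07, 45)
fv_monthly = future_value(cost, 0.07, 45, periods_per_year=12)

print(f"break-even value of avoiding diabetes: ${break_even_value:,.0f}")
print(f"future value, annual compounding:      ${fv_annual:,.0f}")
print(f"future value, monthly compounding:     ${fv_monthly:,.0f}")
```

Under these assumptions the break-even value is about $27,000, and the investment grows to roughly $68,000 with annual compounding or roughly $75,000 with monthly compounding, so the $75,000 figure is consistent with the latter.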
What about IQ? Six extra IQ points (Herasight’s estimate with five embryos) is about a quarter of the gap between the average person and the average Ivy League student. The benefits of intelligence are hard to quantify, but it’s been shown to have probably-causal positive effects on income, mortality, and achievement. Probably the income effects alone make up for the cost of intervention - again assuming total parent-child altruism and low discount rate [8].
So if we accept all of these claims and assumptions, the choice seems obvious. It’s probably even obvious for governments to pay for all citizens to get these, given how much they’d save on health care costs.
In Practice, It’s Complicated
Critics have raised both scientific and ethical objections to polygenic embryo screening.
Polygenic embryo selection has been condemned by various bodies including the Society For Psychiatric Genetics, the European Society of Human Genetics, and the Behavioral Genetics Society. Their statements are . . . not good. They tend towards vague language about how people are more than just their genes, or how no genetic test can be perfect, or how embryo screening is not exactly the same thing as some other form of screening which has a longer history and more proponents. “Although in general higher scores mean you are more likely to have a condition, many healthy people will have high scores; others might develop the condition even with a low score”, says the Society for Psychiatric Genetics, as if they have just blown the lid off some dastardly conspiracy. “Screening embryos for psychiatric conditions may increase stigma surrounding these diagnoses”, they continue - an objection which, taken seriously, could be used to ban every form of medical treatment. We will mostly ignore these people and try to imagine the objections that mildly competent critics might raise, some of which will coincidentally overlap with the content of the non-hypothetical statements.
Scientific Objection: Efficacy
Are we sure this works at all?
A typical polygenic score is created by collecting thousands or millions of adult genomes, then matching genetic information with surveys about who has the trait/condition of interest. Reputable studies then test these scores on holdout samples - adults who were not used to make the score - to see if they still accurately predict who has the trait/condition. Polygenic embryo selection depends on an assumption that the scores which work in these kinds of retrospective tests will also work prospectively on embryos. This assumption hasn’t been formally proven in studies (which would require years to decades to conduct), but seems common-sensical.
The strongest challenge to the application of polygenic scores for embryo selection comes from a recent body of research showing that most scores combine causal genetic effects with population stratification, and therefore can be expected to lose much of their predictive power when comparing two members of the same family (e.g. two embryos from the same couple). There is increasing agreement in the field that unless scores are validated within families, headline results like “decreases risk of X by Y%” will be large overestimates.
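A toy simulation can illustrate why within-family validation matters. This is a minimal sketch, not any company's method: it assumes a trait driven by a direct genetic effect plus a family-level environment that happens to track the family's average score (a stand-in for stratification and similar confounders). All parameter values and names are hypothetical.

```python
import random

def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

def simulate(n_families=20000, direct_effect=0.3, confound_effect=0.3, seed=0):
    """Return (population slope, sibling-difference slope) of trait on score."""
    rng = random.Random(seed)
    half_sd = 0.5 ** 0.5              # score variance: half shared, half sibling-specific
    scores, traits = [], []           # one entry per individual
    diff_scores, diff_traits = [], [] # one entry per sibling pair
    for _ in range(n_families):
        family_mean = rng.gauss(0, half_sd)
        family_env = family_mean      # ASSUMPTION: environment tracks the family's mean score
        pair = []
        for _ in range(2):
            score = family_mean + rng.gauss(0, half_sd)
            trait = (direct_effect * score
                     + confound_effect * family_env
                     + rng.gauss(0, 1))
            pair.append((score, trait))
            scores.append(score)
            traits.append(trait)
        diff_scores.append(pair[0][0] - pair[1][0])
        diff_traits.append(pair[0][1] - pair[1][1])
    return slope(scores, traits), slope(diff_scores, diff_traits)

population_slope, within_family_slope = simulate()
print(f"population slope:    {population_slope:.3f}")
print(f"within-family slope: {within_family_slope:.3f}")
```

In this setup the population-level slope comes out near 0.45, while the sibling-difference slope recovers the direct effect of 0.3, because the shared family environment cancels when siblings are compared. Embryos from one couple are in the sibling situation, so a score calibrated only on the population-level number would overstate the benefit of choosing between them.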
When I talked to company representatives, they all said that they took accuracy extremely seriously and had various white papers and journal articles where anyone could double-check their methodology. But I attended an industry conference a few months ago, and the gossip level was comparable to a high school cafeteria (minus the sex rumors - most of the attendees were having their own kids via IVF). Everyone had some story about someone being careless or fudging their numbers.
Some of the conflicts broke out into the open on Wednesday, when Herasight left stealth and published a white paper and associated blog post. They criticize Genomic Prediction for reporting between-family rather than within-family results [9], and Orchid for smuggling a term for age into their Alzheimer’s predictor (unsurprisingly, this makes it work better). We’ll get to their accusations against Nucleus below. Note that this was recent enough that competitors haven’t had time to respond or to air their own criticisms of Herasight; if this happens, I’ll try to keep you updated.
Maybe this is cope, but my optimistic perspective is that this bounds the damage. This obviously isn’t a field capable of maintaining a conspiracy of silence. But aside from the Nucleus allegations, the complaints aren’t existential. Maybe some numbers are too high, maybe some predictors are slightly rigged. But the more we learn about these admittedly concerning problems, the more we can hope that we’d have heard about it if nothing worked at all.
Overall my strongest opinion on the scientific criticisms is:
People on all sides have cited Alex Young [10] as an authority on how polygenic scores can be confounded or misleading.
Last week Alex Young revealed that he had been working with Herasight while it was in stealth mode, and endorses their research.
LOL.
Probably that means Herasight’s products are okay.
That serves as proof-of-concept that this technology can work, and means other companies’ claims are at least plausible.
Scientific Objection: Antagonistic Pleiotropy
This is a fancy term for “sometimes genes that are good in one way are bad in other ways”. For example, there is a gene that decreases the risk of lung cancer, but increases the risk of leukemia. If you selected against lung cancer, you might give your child higher leukemia risk. Several of the professional societies raise this concern, and Sasha Gusev gives several examples here, including a correlation between education/IQ and anorexia.
When I think about these concerns, I consider the following thought experiment: suppose that I had a natural, unselected child, and that child became high school valedictorian and got into Harvard. Would my first reaction be “Oh no! This slightly raises her risk of anorexia!”? If not, why should this be our reaction to artificially increasing IQ? Genetic selection isn’t doing some different, magical thing. It’s just picking from within the natural IQ/anorexia variation. If you would be happy to have higher IQ (or lower breast cancer risk, or lower schizophrenia risk) naturally, you should be happy to get it through selection too.
(Objection one: suppose that the genetic component of IQ is net negative, but the environmental component is net positive to an even greater degree. Then IQ itself might be net positive - so you could still celebrate your valedictorian child - but since the genetic component alone is bad you wouldn’t want to select for it. I have never heard anyone seriously claim this; most studies suggest that genetic components of good things are good in the expected ways, and most critics don’t get this far. I mention it for the sake of completeness only.)
(Objection two: is the example above just saying that I value IQ more than non-anorexia? If so, couldn’t I give an alternate example of learning that my child isn’t anorexic, celebrating this seemingly-obviously-good fact, but actually this means they have lower IQ and based on my stated values I should be sad? I don’t think so. There is no claim that the increased anorexia risk from raising IQ is exactly as bad as the IQ increase is good - for example, you could imagine a world where going from moron to supergenius only raises anorexia risk 0.0001%. More generally - although not rigorously - selecting for X should usually increase X more than it increases tangentially-correlated construct Y. So selecting for IQ should be net positive, even though it might slightly increase anorexia risk, and selecting for anorexia should be net negative, even though it might slightly increase IQ. I think this is the intuition that drives parents to be happy both when they learn that their child is smart, and when they learn their child doesn’t have anorexia - not just an intuition that one trait matters more than the other.)
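The "selecting for X moves X more than it moves Y" intuition can be sketched numerically. If X and Y are standardized scores with correlation r, then picking the embryo with the highest X shifts Y by about r times the shift in X, in standard-deviation units. The simulation below assumes jointly normal scores; the value r = 0.2 is a hypothetical placeholder, not a measured IQ-anorexia correlation, and comparing shifts in SD units says nothing about how much each unit matters to a family.

```python
import random

def select_on_x(n_embryos=5, r=0.2, trials=50000, seed=0):
    """Pick the embryo with the highest X; return the mean X and mean Y of the picks.

    X and Y are standardized scores with correlation r (hypothetical value).
    """
    rng = random.Random(seed)
    residual_sd = (1.0 - r * r) ** 0.5
    sum_x = 0.0
    sum_y = 0.0
    for _ in range(trials):
        best_x = None
        best_y = None
        for _ in range(n_embryos):
            x = rng.gauss(0, 1)
            y = r * x + residual_sd * rng.gauss(0, 1)  # Y correlated with X at r
            if best_x is None or x > best_x:
                best_x, best_y = x, y
        sum_x += best_x
        sum_y += best_y
    return sum_x / trials, sum_y / trials

mean_x, mean_y = select_on_x()
print(f"shift in selected trait X:   {mean_x:.2f} SD")
print(f"shift in correlated trait Y: {mean_y:.2f} SD")
```

With five embryos the selected trait moves by a bit over one standard deviation of between-embryo spread, while the correlated trait moves by roughly a fifth of that, matching the r-times-shift rule.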
But also, here’s the table of correlated genetic risks for psychiatric disorders:
…where blue means that lowering the risk of one disease also lowers the risk of the other, and red means the opposite (as in the IQ - anorexia example above).
Here’s the same table for other conditions, courtesy of Genomic Prediction (except I flipped the colors from the original, to match the one above):
Aside from two bright orange squares (gallstones vs. hypertension and hypothyroidism - I don’t know what’s up with this and it doesn’t seem to be a widely-appreciated result) we see that most correlations are zero or positive - that is, selecting against one disease selects against another or at worst does nothing. In this ocean of blue, worrying about those few orange squares feels a bit motivated.
Hans Jonas-ism says that no medical intervention may ever cause any harm, no matter how much benefit it produces. By this standard, perhaps slightly raising the risk of gallstones in the process of preventing various cancers and psychoses and other forms of human misery is unacceptable. To anyone with the more normal perspective where something with large benefits and tiny downsides is still pretty good, I don’t think the antagonistic pleiotropy argument carries much weight.
Ethical Objection: Cost
No way around this one: if these products work, they mean that rich people can have healthier/smarter/taller/prettier kids than poor people.
One might object that at least they’re in good company: other products which help rich kids get healthier/smarter/taller/prettier than poor kids include private tutors, gyms, hair salons, health insurance, clothing, books, and food. Is this really the time to declare ourselves against this kind of thing? But maybe we should fight against expanding this already-bloated category. Or maybe there’s something more final about a genetic advantage.
Maybe a stronger argument is that rich people get first crack at every new technology, but poor people usually follow close behind. The first cellphone, in 1982, cost $12,000 in today’s dollars. Now you can get something a thousand times better for $50, and Kenyan pastoralists use cell phones to call up the local shaman. The trajectory of genetics has been even more striking: sequencing a single genome cost about $100 million in 2000 and is somewhere around $100 today.
Polygenic embryo selection has the potential to follow a similar path. There are two associated costs - sequencing the embryos, and running the analysis. Sequencing costs are decreasing and may eventually be comparable to the sorts of genetic screening (for e.g. Down Syndrome) that most families get anyway. Analysis costs are mostly the one-time expense of inventing the predictor; we might expect these to follow the same pattern as generic medications, where cutting-edge technology is jealously guarded and expensive, but last decade’s technology has made its way off patent and is cheap-to-free. A few groups have already created free open-source predictors; so far these are much worse than the private companies’ versions, but one of last year’s ACX Grantees is working on a better one.
Also, it would be crazy for any forward-thinking government not to cover this; it could save hundreds of thousands of dollars in future health care expenses. In countries with public health care, this comes directly out of the government treasury; even in the US, it’s covered by Medicare after age 65. The government should be begging people to select embryos.
The most persistent cost barrier is likely to be in vitro fertilization itself, a necessary precursor. In the US, 2-3% of babies are born through IVF. For those kids, this is a no-brainer - even if the cost never comes down, the cheaper products are only a fraction of total IVF expense. What about the other 97-98%? If those parents feel like they have to get embryo selection (and therefore IVF) to keep up, this could be a significant burden. IVF isn’t fun - it requires pumping a woman full of mind-altering hormones for weeks, extracting eggs in a minor surgery, and then implanting embryos in another minor surgery, all with a decent chance that some step will fail and you’ll have to do it all again. It also costs $15,000 in the US (less in poorer countries), and unlike the genetics, the cost has barely gone down in the past twenty-five years.
Some countries, including Israel, offer free IVF for anybody who wants it. And universal basic IVF is surprisingly popular even in the usually government-phobic United States - Donald Trump made it part of his campaign platform. So there’s a plausible path to embryo selection for everyone who wants it.
But it’s still going to take a while, it will hit different people at different times, and so far [11] there’s no way around the month or two of various miserable medical procedures for women.
Ethical Objection: Personhood
Is it really correct to say that you have reduced someone’s risk of breast cancer by 46%, if what you’ve really done is closer to replacing them with a different person who is 46% less likely to have breast cancer? I cover this one in more depth here.
Ethical Objection: Race
This one is awkward: right now the technology works best for white people.
Most genetic data available for research/commercial use comes from the UK, US, and Europe - areas which are mostly white. Asian biobanks, and those serving US minority communities, have been more reluctant to share data. So we know a lot about the genetics of white people, and only a limited amount about the genetics of anyone else. Companies are suitably embarrassed about this, and researchers in the field are working hard to wring every ounce of information out of the minority data they have. But for now, white people are the clear winner.
Here’s data from Herasight:
A European family with five embryos and no family history can cut their diabetes risk by 47% and an African family by 29%, with everyone else in between.
As usual, all companies say that they adjust their scores based on the couple’s genetic ancestry. As usual, Herasight challenges them to publicly release data on exactly how they performed the adjustments and how well they work. All companies say they are working as hard as they can to improve cross-ancestry portability, but that progress will remain limited until governments collect/release better genetic data on non-white populations.
Ethical Objection: Selection
At some point, you’ve got to choose.
Genomic Prediction and Herasight offer scores that aggregate overall health risks.
And this is the best-case scenario! Herasight offers predictors for IQ, height and BMI; Nucleus offers those plus eye color and hair color [12]. A parent might encounter a situation where the embryo with their favorite eye color also has the highest cancer and schizophrenia risk, and choose to doom their child to cancer and schizophrenia because they really want pretty eyes.
On average, even if everyone in the world selected for eye color, it wouldn’t raise cancer and schizophrenia risk. No not-deliberately-perverse polygenic selection choice can make your child worse off in expectation. Still, suppose you got cancer, and your mom admitted that she selected you for pretty eyes and didn’t even check the cancer column of the embryo selection report. How would you feel?
And would you feel better or worse than someone whose parents didn’t do embryo selection at all, and spent the money on a Caribbean vacation? What if they selected your brother for everything great, then had you naturally? What if they selected you for IQ, but actually you are very stupid, and you were one of the 20% of cases where a predictor that’s right 80% of the time gets it wrong?
Mark my words, one day there will be entire subfields of therapy dedicated to these issues.
Going Nuclear
Even as outsiders criticize the whole field, Herasight has launched a full-scale attack on competitor Nucleus.
Herasight’s white paper compares its own predictors (favorably) to those of Orchid and Genomic Prediction…
…but refuses to acknowledge Nucleus at all. In a supplementary note, the authors explain why: they accuse Nucleus of being so bad that it would “not yield a reliable or meaningful addition to our analysis”.
They say Nucleus has inflated the accuracy of their scores. This is most dramatic for a few conditions like ADHD, where the leading published polygenic score is based on 2,300,000 variants but explains only ~1% of variance in the condition. Nucleus’ score is based on 12 variants [13] and (implicitly) claims to explain 3-6%. This doesn’t make sense.
Some of Nucleus’ other scores do use millions of variants. But many of these are 5-10 year old scores downloaded from open-source catalogs, whose accuracy statistics are easily available and far less than Nucleus claims. Here is what Herasight finds when they double-check Nucleus’ numbers:
On their Substack, Herasight also criticizes Nucleus’ monogenic screening product. They point out cases where it fails to properly screen for the conditions it claims. For example, the Nucleus website advertises screening for spinal muscular atrophy:
But on their gene list…
…they don’t screen for SMN, which causes 95% of spinal muscular atrophy cases. They only screen for UBA1, which causes a distinct and much rarer condition called x-linked infantile spinal muscular atrophy. Professional organizations publish guidelines for what genes need to be screened in a screening product, and Nucleus does not appear to be following them.
In further discussion, Herasight continued with exhaustive criticism of essentially everything Nucleus had ever done down to the smallest detail. Nucleus reports list the same baseline disease risk regardless of patient ancestry, but different ancestry groups should have different risks [14]. Nucleus’ physician reports sometimes list lower-than-average risk for patients with positive polygenic scores [15]. Nucleus’ age-based risk tables don’t distinguish between age and cohort effects (is this bad? see footnote [16]). My favorite critique is that Nucleus wrote a blog post criticizing competing company Orchid…
…which included a section on how Orchid is a polygenic selection company, and polygenic selection companies are inherently “sketchy” and “honestly should be illegal”. But Nucleus is also a polygenic selection company! This is like Marlboro attacking Camel on the grounds that cigarettes are addictive and should be banned! Obviously something went wrong here - my guess is AI - and it’s a really bad look, especially when these scientific issues are so hard to litigate, and so many of us will have to go off gestalt impressions of corporate culture.
Nucleus states that they validate their models internally and intend to make their results public soon.
A Foothill Of The Future
It’s hard not to love this technology. Lots of people (and the aforementioned professional organizations) manage anyway, but it’s hard.
If this were a single-use medical treatment, delivered by a doctor after someone got the relevant condition, it would be one of the biggest advances of the decade - imagine a drug that cures 10 - 40% [17] of breast cancers with no side effects! But in fact, it works for breast cancer, and schizophrenia, and heart attacks, and approximately everything else. The only things comparable are antibiotics and GLP-1RAs.
And then there’s the IQ effects. Even after studying the literature, people have wildly different opinions about the importance of IQ. One of the most important debates is to what degree IQ differences are a cause of poverty, a consequence of poverty, or both. I lean towards both - a country with limited access to schools and medical care will have low average IQ, but as a consequence it probably won’t become the next big semiconductor hub. This technology could close half the IQ gap between poor and middle-income countries, or between middle-income and rich. Or it could give rich countries average IQs that have never been seen before, and let us see what kind of O-ring technologies (and new forms of social cooperation) lie just beyond the frontier.
(this is the nice quantifiable argument in favor of IQ enhancement, but I find myself more convinced by fuzzier things - how much is it worth to be able to enjoy great art and literature? To fully comprehend what we know of nature, and be able to fully appreciate the mystery of the rest? To have a sense of why society works the way it does, instead of feeling like you’re being blown back and forth by institutions you don’t really understand? Amateur psychoanalysts like to say that the only people who care about IQ are those looking for an excuse to boast about how high their own is, but my experience is the opposite: I care about IQ because I bang up against the limits of my own a thousand times a day, and I hate it. I fantasize about ways to make my children smarter than I am for the same reason a dog confined in a tiny crate might fantasize about getting her puppies adopted out to a nice house with a big grassy yard.)
My biggest qualm is that it might not matter. This is such a tiny foothill, flanking such a vast and foreboding range of mountains, that it might be a mistake to care about it at all.
Selecting the best of five or ten embryos is not a very effective way to get the genes you want. There are things in the pipeline that will make this look like Hippocrates draining black bile. By the time the first polygenically selected children are adults, they’ll be old news.
And then there’s AI. The average age at diagnosis for Type II diabetes is 45 years. Will there still be people growing gradually older and getting Type II diabetes and taking insulin injections in 2070? If not, what are we even doing here?
Many people in the transhumanist community are still bullish on this technology. They think - well, there’s still an outside chance that something comes up and AGI takes another few decades. If we can enhance humans to be smarter, healthier, and more determined by the time it arrives, maybe we’ll have a better chance. Or maybe, if there’s a positive optimistic vision of a human-based high-tech future, people will be more willing to delay AI in the first place.
I like this argument, but I also think it’s worth stepping back. What’s the point of anything? Why have kids at all in a world that’s changing this fast? Why save for the future? At some point your answer has to be romantic and aesthetic - it’s never been clear whether anything you do matters in any ultimate sense, but you’ve got to act as if it does and hope for the best.
From that perspective, this is the most romantic technology of all. You’re not just giving a better life to your kids. Genes travel from generation to generation; you’re giving a better life to your grandkids, your great-grandkids and so on to the point 1.77*log₂(population) generations from now when you are the ancestor of everybody and nobody. Somebody in Macaronesia in 3525 AD will avoid getting breast cancer because of you (if there is still cancer; if there are still breasts).
Some combination of reasonable cost-benefit analysis and romantic/aesthetic commitments makes me want to have children despite the uncertainty, and the same combination made me sign up to use this technology despite the same. More later on how that’s going.
[1] I’m slightly mixing up two different things here - Down Syndrome can be detected with an aneuploidy test, but cystic fibrosis takes a more involved PGT-M test.
[2] There are two separate questions here. First, how much would diabetes risk decline if you selected the embryo with the lowest risk for diabetes - something you have no reason to do, since you have no reason to privilege diabetes risk over risk of any other disease? Second, how much would diabetes risk go down if you selected the embryo with the lowest health risk overall? Genomic Prediction’s risk calculator shows, seemingly paradoxically, that you get -38% relative risk by selecting against diabetes alone, but -41% relative risk by selecting against everything at once. Over email, they stand by this surprising result, saying that “for a couple of diseases (type II diabetes and CAD), the EHS actually accomplishes a larger risk reduction than the individual predictors. The explanation is that the EHS takes into account multiple PRS of diseases with high comorbidity”. See eg Figure 3 here:
…and the section of the post called “Antagonistic Pleiotropy” for more. However, this paradoxical benefit is only true for a few conditions like diabetes - for everything else, selecting on health index does better than you would naively think, but still does not decrease the risk of a given condition as much as selecting against that condition directly.
That is, new mutations in that particular baby, as opposed to older mutations already present in the parents.
Conflicts of interest: I have used Orchid’s and Herasight’s products on my own embryos (not the ones used to conceive my existing kids, but for a potential third child), employees of Genomic Prediction and Herasight have been extremely helpful in contributing expertise to ACX posts on genetics, and I might invest in this field at some point (though haven’t done so yet). This post started as Herasight asking me to write about their white paper, then spiraled out of control. There were some unexpected time pressures and the result is that I didn’t get a chance to run everything in Herasight’s white paper by their competitors as thoroughly as I would like. Although I talked to representatives of all four companies profiled here, I feel like this probably reflects Herasight’s perspective better than other companies’, and that this is a major flaw. If other companies have responses, I’ll publish them. Thanks to all companies involved for their assistance on this article.
Finally, I am favorably disposed toward Herasight because of how I learned about them: a professor named Jonathan Anomaly got cancelled from Penn for being too gung-ho about genetic enhancement, and used his newfound freedom to join a very-early-stage Herasight, raise their ambitions, and sell everyone (including me) on the idea. I grew up on a diet of books and movies about mad scientists, and I’m a sucker for a story about a guy named Doctor Anomaly pursuing revenge against the small-minded fools who destroyed his career by creating a race of superbabies.
The version of the tool I looked at said 5.9 points for five embryos, up to 9 points for twenty embryos. The version of the tool on their current site says 5.3 - 9, so they might have recalculated after I finalized this article.
Used in quotation marks because these scores were fine for the predictive tasks they were applied for - they just weren’t finding genes that directly caused the outcome of interest.
Conflict of interest notice: this table was originally unadjusted. A representative of Herasight claimed that this was unfair, because each company used slightly different reporting conventions, and offered to correct for this in a neutral way. I retraced their reasoning, confirmed that the correction did not especially benefit Herasight at the expense of other companies, and accepted the correction. The original unadjusted table is below:
Herasight was insufficiently comfortable with Nucleus’ methodology to even be willing to posit a corrected value, so I left their self-reported value in gray.
Zagorsky (2007) says an extra IQ point means $234-$616/year in higher salary. The midpoint of $425 equals $670 in today’s dollars; assuming a forty-year career, Nucleus’ +1 point estimate is worth $26,800 (vs. $9,249 Nucleus cost) and Herasight’s +6 point estimate is worth $160,800 (vs. $53,250 Herasight cost).
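For anyone who wants to check the arithmetic, here is a minimal sketch of the calculation above. It takes the $670 inflation-adjusted figure as given rather than recomputing it, and like the footnote it ignores discounting, taxes, and everything else. The variable and function names are just illustrative.

```python
# Back-of-envelope value of extra IQ points, using only figures quoted above.
LOW, HIGH = 234, 616          # Zagorsky (2007): extra $/year per IQ point
midpoint = (LOW + HIGH) / 2   # the $425 midpoint

PER_POINT_TODAY = 670         # footnote's inflation-adjusted value (taken as given)
CAREER_YEARS = 40             # footnote's assumed career length

def lifetime_value(iq_points, per_point=PER_POINT_TODAY, years=CAREER_YEARS):
    """Undiscounted lifetime earnings gain for a given IQ gain."""
    return iq_points * per_point * years

nucleus_value = lifetime_value(1)     # +1 point estimate
herasight_value = lifetime_value(6)   # +6 point estimate

# Prices quoted in the footnote
NUCLEUS_COST, HERASIGHT_COST = 9_249, 53_250

print(f"Nucleus:   ${nucleus_value:,} vs. ${NUCLEUS_COST:,} cost "
      f"({nucleus_value / NUCLEUS_COST:.1f}x)")
print(f"Herasight: ${herasight_value:,} vs. ${HERASIGHT_COST:,} cost "
      f"({herasight_value / HERASIGHT_COST:.1f}x)")
```

On these numbers both products come out at roughly three times their cost in undiscounted lifetime earnings, so the comparison between them turns mostly on whether you believe the +1 and +6 point estimates.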
As part of researching this article, I asked all four major companies about their within-family validation strategies. Here are some details:
Genomic Prediction discusses their strategy in this paper. The results are complicated to interpret - the within-family numbers often have such wide error bars that they overlap with both the across-family numbers and with zero - but looking qualitatively it seems like most scores on average lose about 25% of their risk reduction ability (though averages might not be the right way to do this, and some might be much more affected than others). Their website reports unadjusted, not within-family validated numbers; GP says they say this clearly on their site (which is true), Herasight counters that they still present their numbers as applicable to embryo selection (which is also true). To get the most applicable-to-embryo-selection numbers, you might want to adjust GP’s stated numbers down somewhat; it’s hard to say exactly how much, but maybe 20 - 25%?
Herasight has their within-family validation results in their white paper. They say there is no significant decrease in accuracy for 16 of their 17 disease risk predictors; the last, osteoporosis, has a minor decrease. They intend to publish more on their trait predictors soon.
Orchid says they validate within-family whenever possible, although certain conditions are too rare to do so properly. They present within-family results for seven of their scores here; none show any significant decrease in accuracy.
Before I got to this question, Nucleus asked me not to send them further questions. But their website says:
» “Research shows that polygenic scores for diseases are less likely to be impacted by factors that could confound predictions, like assortative mating, where people tend to marry those with similar characteristics (13). Furthermore, research shows that the heritability of these phenotypes don’t change across relatives and people who are unrelated, indicating true direct genetic effects (14). Recent research also shows, compared to behavioral genetics phenotypes like IQ, clinical phenotypes like migraines have negligible indirect effects (15).”
This is not how other experts I talked to described the state of the research, but it suggests Nucleus doesn’t see a need to within-family-validate their scores.
See Missing Heritability: Much More Than You Wanted To Know for more on Young’s research.
One typical way to quantify whether health interventions are worth it is through DALYs (and the very similar QALYs). US health economists usually support interventions that cost less than $100,000 per QALY gained. Herasight claims a gain of 1-4 QALYs; taking my 1.66 example, that’s a $53,250 cost for a $166,000 gain. But there are several reasons not to take this at face value. First, so far the government isn’t paying - you are - and you may value money differently than the government does (for example, if you have less than $100,000, you will not spend $100,000 for any number of QALYs; if you’re a billionaire, you might happily spend tens of millions on a single QALY). Second, if you apply a time discount, the intervention probably goes back below the $100,000 per DALY threshold again.
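To make the time-discount point concrete, here is a rough sketch. The cost, the 1.66 QALY gain, and the $100,000 benchmark come from the text above. Everything about the discounting is my own illustrative assumption, not anything Herasight or a health economist has signed off on: I assume the health benefit arrives evenly between ages 60 and 79 and use a 3% annual discount rate.

```python
# Cost per QALY with and without a time discount.
COST = 53_250          # Herasight price quoted above
QALYS_GAINED = 1.66    # example gain used above
THRESHOLD = 100_000    # conventional US benchmark, $ per QALY

def discounted_qalys(total_qalys, first_year, last_year, rate):
    """Spread total_qalys evenly over years [first_year, last_year)
    after birth and discount each year's share back to year 0."""
    years = range(first_year, last_year)
    per_year = total_qalys / len(years)
    return sum(per_year / (1 + rate) ** t for t in years)

# Undiscounted: the face-value calculation from the footnote
undiscounted_value = QALYS_GAINED * THRESHOLD
undiscounted_cost_per_qaly = COST / QALYS_GAINED

# ASSUMPTIONS (hypothetical, for illustration only):
# benefits land at ages 60-79, discounted at 3% per year
pv_qalys = discounted_qalys(QALYS_GAINED, 60, 80, 0.03)
discounted_cost_per_qaly = COST / pv_qalys

print(f"Undiscounted: ${undiscounted_cost_per_qaly:,.0f} per QALY")
print(f"Discounted:   ${discounted_cost_per_qaly:,.0f} per QALY "
      f"(present value of {pv_qalys:.2f} QALYs)")
```

Under those assumptions the intervention comfortably beats the benchmark at face value and fails it once discounted, because health gained sixty-plus years from now shrinks a lot at 3% per year. A lower discount rate, or benefits that arrive earlier in life, would soften this.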
Herasight says they are skeptical that Nucleus’ hair and eye color predictors work, partly based on their overall skepticism of Nucleus’ product, and partly because hair and eye color predictors have proven unexpectedly hard and Nucleus does not have the resources that it would take to solve this difficult problem.
Nucleus says they use 7,000 variants from 13 genes, but Herasight says they use 12 variants total. It looks like the discrepancy comes from Nucleus using two different tests for ADHD - a monogenic screen which looks for 7,000 different pathogenic variations in 13 relevant genes, and a polygenic score which includes 12 variants. The monogenic screen is not really related to the kind of polygenic scores we’re talking about here, and the 12-variant polygenic score is more comparable to the scores offered by other companies.
Most of these complaints are based on Nucleus’ adult results, which they have been offering longer than the embryo selection results and which Herasight had an easier time getting copies of. Their concerns are based on an assumption that Nucleus’ embryo selection technology is based on its adult genomic technology.
Positive polygenic scores usually mean higher-than-average risk. I asked Nucleus about this, and they said they are using it to mean higher risk than the midpoint of cases and controls. The geneticist I asked about this said this is a possible way to do things, but non-standard and potentially confusing. I wasn’t able to consult enough outsiders to have a strong opinion either way.
An age effect is when a disease is genuinely more common in someone of a certain age - for example, Alzheimer’s is more common in elderly people. A cohort effect is when a disease is more common in people of a certain generation, sometimes because of diagnostic changes - for example, ADHD diagnoses are more common in people born in the 1990s, since schools started screening for it in the 1990s. Nucleus’ adult report on ADHD looks like this:
…so it’s reporting a combination of both types of effect. If I imagine myself as a patient, I am fine with this - it is true that I, as a man between 18 and 44, am 4.5% likely to have an ADHD diagnosis (when my genes are taken into account). But if you were expecting this to tell you the true frequency of ADHD with age, you would get confused.
It would be much more irresponsible to present information this way with embryo selection (because it’s not true that an embryo born nine months from now will have 1.1% chance of an ADHD diagnosis at age 45, since they’re in a generation with a higher ADHD diagnosis rate), but Nucleus doesn’t seem to be doing this.
10% if you’re selecting against everything equally; up to 46% if you’re selecting against breast cancer in particular. The 10% number is probably closer to how most people will use it, but the 46% number might be more suitable for this specific analogy where it’s being used as a cure.