Substack cometh, and lo it is good. (Pricing)

Human to Neanderthal gene flow and high quality Denisovan genomes

Recurrent gene flow between Neanderthals and modern humans over the past 200,000 years:

Our understanding of admixture between humans and Neanderthals has changed dramatically over the past decade and a half. Once thought not to have occurred at all, there is now ample evidence for gene flow from Neanderthals to humans and vice versa. Li et al. used a new framework to model the increasingly complex dynamics of introgression between humans and Neanderthals and the ramifications for both populations. They identified regions of human ancestry in Neanderthals, estimated population sizes for Neanderthals were about 20% lower than previously thought, and proposed the possibility of two pulses of gene flow from humans to Neanderthals. This study comprehensively synthesizes our current knowledge of hominin admixture. —Corinne N. Simonti

Should we Invest in Curing Rare Diseases or Making Them Rarer?

A guest post from Noor Siddiqui and Nikki Teran of Orchid

Rare diseases cost Americans around 1 trillion dollars a year. About half of that is direct medical costs. If families are lucky, they may go bankrupt paying for treatment of often questionable effectiveness. The unlucky families may go bankrupt just trying to find a cure; unfortunately, paying for treatment is the less relevant scenario, as only about 5% of rare diseases have treatments.

What can we do to reduce this burden? 

Two options are: 

  1. Incentivize the development of more treatments

or

  2. Reduce the incidence in future generations by screening as early as possible

Towards option one, the FDA created a “golden ticket” to encourage the development of treatments for rare diseases. Their transferable golden ticket (technically a “priority review voucher”) allows its owner to shave four months off the FDA’s review process. Previous vouchers have sold for as high as $350 million. But the program is set to sunset this fall, raising the question of whether the problem of rare diseases will be dealt with at the federal level.

Towards option two, preimplantation genetic testing may provide its own “golden ticket” for preventing rare diseases. Preimplantation genetic testing in concert with in vitro fertilization has already been shown to be cost effective for inherited rare diseases like sickle cell and Huntington’s disease, with cost savings in the millions per individual (not to mention the priceless improvement in quality of life). Advances in whole genome sequencing open up this technology to potentially prevent all rare genetic diseases.

Rare Diseases Aren’t Actually Rare – 10% of Americans are affected 

In the United States, a rare disease is defined as one that affects fewer than 200,000 Americans. Although each disease affects no more than 0.06% of the population, there are about 7,000 different rare diseases, meaning roughly 10% of the U.S. population has a rare disease.
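As a rough sanity check on these figures, here is a minimal Python sketch of the arithmetic. The U.S. population value is an assumption (about 333 million); the other numbers are the ones quoted above.

```python
# Back-of-envelope check of the prevalence figures quoted above.
# ASSUMPTION: a U.S. population of about 333 million (not stated in the text).
US_POPULATION = 333_000_000
RARE_THRESHOLD = 200_000      # per-disease ceiling, from the text
NUM_RARE_DISEASES = 7_000     # approximate count, from the text
AFFECTED_SHARE = 0.10         # roughly 10% of Americans, from the text

# Largest share of the population any single rare disease can affect.
max_share_per_disease = RARE_THRESHOLD / US_POPULATION

# Total people affected, and the average per disease that this implies.
total_affected = AFFECTED_SHARE * US_POPULATION
avg_patients_per_disease = total_affected / NUM_RARE_DISEASES

print(f"Ceiling per disease: {max_share_per_disease:.2%} of the population")
print(f"Total affected: {total_affected / 1e6:.1f} million")
print(f"Average per disease: {avg_patients_per_disease:,.0f} people")
```

Under that population assumption, the average rare disease would affect only a few thousand people, far below the 200,000 ceiling, which is how thousands of individually tiny patient groups can add up to a tenth of the country.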

These numbers may seem large, but it makes sense if you consider the complexity of human genetics and development. Humans have around 20,000 genes. Mutations in some genes are lethal. Mutations in some genes lead to normal human variation; you may not even know the person has a mutation. And mutations in others lead to rare diseases; not lethal at or before birth, but often quite deleterious. Around 80% of rare diseases are genetic and many of these diseases are discovered in children – the mutations are severe enough to be noticed early.

30% of children with rare diseases do not live past the age of 10 

The devastation of rare diseases stems from their chronic, progressive nature and the fact that many involve multi-organ system failure or neurological impairments. For families, this can be unimaginably cruel. Children may be robbed of normal development or, even more brutally for families, may develop normally and then regress. In Sanfilippo syndrome, children may develop normally until age four before showing any behavioral changes. They then progressively develop dementia and lose motor function. Most do not make it to adulthood.

The emotional toll is compounded by a profound sense of isolation as healthcare providers often lack expertise with the particular rare condition. Making matters worse, the few treatments that do exist frequently carry price tags in the millions, straining families to the breaking point financially as well as emotionally and physically through caretaking demands.

Just last December a gene therapy treatment was approved for sickle cell disease, which affects around 100,000 Americans and as many as one in every 365 African-Americans. Unfortunately, gene therapy is not a perfect cure. For sickle cell disease, the chemotherapy administered as part of the treatment leads to infertility.

But many rare diseases are too complex for over-the-counter solutions and affect too few people a year to interest drug companies. The market just isn’t there.

The FDA Priority Review Voucher Program Incentivizes Orphan Drug Development

The priority review voucher program turns the inefficiencies in the FDA’s approvals process from a bug into a feature. The government created the priority review voucher program to incentivize the development of treatments of certain tropical diseases, rare pediatric diseases, and medical countermeasures against biological, chemical, radiological, or nuclear threats. These are known as “orphan” indications as they otherwise lack the financial incentive needed for a pharmaceutical company to invest in research.

The voucher allows for fast-tracked new drug application approval. Time is money for pharmaceutical companies. Being able to jump the approval line and get a drug approved in six months instead of ten can be huge for pharma companies’ bottom lines.

The first priority review voucher ever sold was used by Regeneron Pharmaceuticals to get their PCSK9 inhibitor to market a month ahead of Amgen’s cholesterol-lowering drug targeting the same pathway.

Pharmaceutical patents last 20 years, but that clock starts when the patent is filed, not when the drug is approved. Patents are typically filed during preclinical trials, which can take upwards of two to five years. These are followed by clinical trials (when the drug is actually tested in people). Clinical trials can take upwards of three to ten years. By the time a drug is ready for FDA evaluation, there may only be a handful of years left on the patent.

Some drugs can make billions of dollars per year, meaning it may make financial sense to spend hundreds of millions of dollars on an extra four months of sales.
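To make the incentive concrete, here is a minimal Python sketch of the two calculations described above. It rests on simplifying assumptions that are not from the text: the patent is treated as filed at the very start of preclinical work, and the annual sales figure in the example is purely hypothetical.

```python
PATENT_TERM_YEARS = 20  # patent length quoted in the text


def patent_years_left_at_review(preclinical_years, clinical_years):
    """Patent years remaining when the drug reaches FDA review.

    ASSUMPTION: the patent clock starts at the beginning of preclinical work.
    """
    return PATENT_TERM_YEARS - preclinical_years - clinical_years


def sales_gained_by_voucher(annual_sales_usd, months_saved=4):
    """On-patent sales gained by reaching the market `months_saved` earlier."""
    return annual_sales_usd * months_saved / 12


# Fastest and slowest timelines using the ranges given in the text.
best_case = patent_years_left_at_review(preclinical_years=2, clinical_years=3)
worst_case = patent_years_left_at_review(preclinical_years=5, clinical_years=10)

# HYPOTHETICAL drug earning $3 billion a year.
gain = sales_gained_by_voucher(3_000_000_000)

print(f"Patent years left at review: {worst_case} to {best_case}")
print(f"Sales gained from four months: ${gain:,.0f}")
```

On the slow end of those ranges only five years of patent life remain at review, so four extra months of protected sales for a multi-billion-dollar drug can plausibly exceed even the highest voucher prices.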

https://www.gao.gov/assets/gao-20-251.pdf

Vouchers are sold in more ways than one. Some larger pharma companies will acquire smaller biotechs, or their assets, when the smaller company is going after “voucherable” targets. These acquisitions can help bring to market drugs for orphaned diseases that may not otherwise have had the resources from venture capital to survive.

A friend of mine worked for small drug companies that were pursuing treatments for Ebola primarily because of the priority review voucher program. They weren’t targeting the financial incentive of the antiviral’s sale; they were targeting the voucher itself. 

Unfortunately, the priority review voucher program for neglected tropical diseases has already sunset. Despite (or more realistically due to) the recent pandemic, Congress was unable to reauthorize the Pandemic and All Hazards Preparedness Act this past fall. The neglected tropical disease and medical countermeasure priority review programs are casualties of congressional dysfunction. Without the financial incentives created by this program, many of these drug pipelines will dry up.

The pediatric rare disease program has a little more time — it’s set to sunset this fall — and its fate will likely be the same. Although the program has successfully contributed to 19 new treatments for rare pediatric diseases, even its continuation wouldn’t be enough. What about the other 6,981 rare diseases? For genetic disorders, prevention may be worth five thousand pounds of cure.

https://www.gao.gov/assets/gao-20-251.pdf

Preimplantation Genetic Testing (PGT) Can Prevent Many Rare Diseases

During typical in vitro fertilization (IVF), multiple embryos are created and the “highest quality” embryo is implanted. Historically, the “highest quality” embryo was determined subjectively based on appearance (“morphology”). This appearance often correlated with genetic issues and, inversely, later pregnancy success. Today, success can be increased by looking for the genetic issues directly: preimplantation genetic testing (PGT) has been shown to reduce the miscarriage rate from 9.1% to 2.6%. As genetic issues are more common with age, this particularly increases success in older mothers.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8984775/figure/jld220010f2/

Cutting-edge preimplantation genetic testing can read 99.6% of an embryo’s genome, allowing for the identification of mutations that can cause rare diseases. This process works both for mutations that parents may pass on to their children (inherited mutations) and for mutations that spontaneously occur (de novo mutations). Orchid performs this cutting-edge version of preimplantation genetic testing. The whole-genome sequencing has provided parents with information on these rare diseases, enabling improved decisions on which embryo to implant.

Doctors already recommend screening procedures that are similarly priced and arguably less impactful. The chance of a 35-year-old woman having a child with Down syndrome is just under 0.3%. The life expectancy of someone with Down syndrome is 60 years and they generally report living happy, fulfilling lives. In contrast, the combined prevalence of the monogenic disorders Orchid screens for is 3-4%, more than 10 times as likely, and many are fatal at birth. Whole genome screening is able to detect these mutations whether they are inherited from the parents or de novo mutations that are unique to the embryo. Additionally, preimplantation genetic screening also provides information on chromosomal abnormalities like Down syndrome or others that are incompatible with life. Amniocentesis typically costs $4,100 per fetus while Orchid’s whole-genome testing is $2,500 per embryo.
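The "more than 10 times" comparison can be checked with a short Python sketch. The inputs are the percentages quoted above; 0.3% is used as a round stand-in for "just under 0.3%", so the true ratios would be slightly higher.

```python
# Figures quoted in the text, expressed as probabilities.
DOWN_SYNDROME_RISK_AT_35 = 0.003            # "just under 0.3%" (rounded up)
MONOGENIC_PREVALENCE_RANGE = (0.03, 0.04)   # "3-4%"

# How many times more likely the screened monogenic disorders are, combined.
ratio_low, ratio_high = (
    prevalence / DOWN_SYNDROME_RISK_AT_35
    for prevalence in MONOGENIC_PREVALENCE_RANGE
)

# The same comparison as expected cases per 1,000 births.
down_per_1000 = DOWN_SYNDROME_RISK_AT_35 * 1000
monogenic_per_1000 = tuple(p * 1000 for p in MONOGENIC_PREVALENCE_RANGE)

print(f"Relative likelihood: {ratio_low:.0f}x to {ratio_high:.0f}x")
print(f"Per 1,000 births: {down_per_1000:.0f} vs "
      f"{monogenic_per_1000[0]:.0f} to {monogenic_per_1000[1]:.0f}")
```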

Many of the disorders detectable by whole-genome preimplantation genetic testing carry additional cost beyond just reduced lifespan. 40% of infants in the NICU have a detectable single-gene mutation, meaning many of these babies require intensive care due to their detectable genetic disease. A single day in the NICU costs upwards of $3,500.

Even alternative forms of diagnosis alone are comparable in cost to whole-genome preimplantation genetic testing. A 2014 study that looked at neurodevelopmental disorders, which affect more than 3% of children, found the average family spent $19,100 on inconclusive diagnostics alone. That means the families spent nearly $20,000 each on diagnostics, not treatments or other care, before the cost of the test that ultimately provided a diagnosis.

Parents and Patients Should Have As Many Options as Possible

Ideally, biotech companies would have access to priority review vouchers and parents would have access to whole-genome preimplantation genetic testing. The federal government should create policies to maximize the public welfare. The priority review voucher program has encouraged the development of treatments that will change millions of lives. Systematic incentives like the voucher program can only be executed by this sort of top-down approach.

Practically, parents want to provide the best lives possible for their children. Preimplantation genetic testing can prevent monogenic genetic diseases and mitigate risk for chronic diseases like schizophrenia, bipolar disorder, and diabetes. However, the cost of IVF is an impediment to universal access, as the US does not have as extensive coverage as other countries. The federal government could require insurance to cover both IVF and genetic screening of embryos created through IVF. Parents would then have the necessary information to make their own choices about their children.

The voucher program will not be able to create treatments for all rare diseases and genetic tests may never be able to prevent all rare diseases. But the voucher program may be able to make many conditions minor inconveniences and genetic testing may provide families the opportunity to make informed decisions, making both social and financial sense. 

Authors 

Dr. Nikki Teran has worked in biotech at Pfizer, AncestryDNA, and startups. She earned her bachelor’s degree in Molecular Biophysics and Biochemistry from Yale, and her PhD in Genetics from Stanford, focusing on rare disease diagnostics. 

Noor Siddiqui is the CEO of Orchid, a reproductive technology company that built the world’s first whole genome embryo reports. She taught Frontiers in Reproductive Technology at Stanford, where she was an AI and genomics researcher and earned her Master’s and Bachelor’s degrees in Computer Science. 

Razib Khan essay according to ChatGPT4o

My prompt: “Write a 5,000 word essay in the style and topic of Razib Khan”

Sure, I can create a detailed essay inspired by the style and themes that Razib Khan often explores, particularly those related to genetics, history, and human evolution. Razib Khan’s work frequently delves into the intersection of genetics with human history, population dynamics, and the migration patterns that have shaped our current world. I’ll create an essay that aligns with these themes.

## Tracing the Threads of Human History: A Genetic Odyssey

### Introduction

In the vast tapestry of human history, the threads of genetics provide a unique and powerful lens through which we can understand our past. The story of humanity is not only written in ancient texts or carved in stones but also encoded within our very DNA. Through the study of genetics, we can trace migrations, uncover relationships between ancient populations, and gain insights into the forces that have shaped our species over millennia. This essay will explore the fascinating interplay between genetics and human history, shedding light on how modern genetic research is revolutionizing our understanding of the past.

### The Genetic Landscape of Early Homo Sapiens

The journey of modern Homo sapiens began in Africa around 300,000 years ago. This period marks the emergence of anatomically modern humans, who would eventually spread across the globe. Genetic evidence suggests that all non-African populations descend from a small group of individuals who left Africa approximately 60,000 years ago. This Out of Africa theory, supported by mitochondrial DNA and Y-chromosome studies, posits that a relatively small founder population migrated out of Africa and gave rise to all non-African lineages.

Genetic markers, such as single nucleotide polymorphisms (SNPs), allow researchers to trace these ancient migrations. By comparing the genetic diversity of contemporary populations, scientists can infer historical population sizes and migration routes. For instance, the reduced genetic diversity observed in non-African populations compared to African populations indicates a genetic bottleneck—a period when the population size was drastically reduced—consistent with the Out of Africa migration.

### Interbreeding with Archaic Humans

As modern humans spread out of Africa, they encountered other hominin species, such as Neanderthals in Europe and Denisovans in Asia. Genetic evidence reveals that interbreeding occurred between these archaic humans and modern Homo sapiens. This admixture has left a lasting legacy in the genomes of contemporary non-African populations. Approximately 1-2% of the DNA in people of European and Asian descent can be traced back to Neanderthals, while Melanesians and Aboriginal Australians carry around 4-6% Denisovan DNA.

The discovery of these genetic contributions challenges the simplistic view of human evolution as a linear progression. Instead, it highlights a complex web of interactions between different hominin species. The genes inherited from Neanderthals and Denisovans have been linked to various traits, including immune system functions, skin pigmentation, and even the risk of certain diseases. This genetic legacy underscores the adaptive significance of these ancient interbreeding events, suggesting that they conferred certain advantages to modern humans in their new environments.

### The Peopling of the Americas

One of the most intriguing chapters in the story of human migration is the peopling of the Americas. Genetic and archaeological evidence indicates that the first Americans arrived via a land bridge known as Beringia, which connected Siberia to Alaska during the last Ice Age. This migration likely occurred around 15,000 to 20,000 years ago, although some evidence suggests even earlier settlements.

Genomic studies have revealed that Native Americans are primarily descended from a single founding population that migrated from Asia. However, the genetic diversity within Native American populations suggests subsequent waves of migration and complex population dynamics. For instance, recent genetic research has identified traces of Australasian ancestry in some Amazonian tribes, hinting at a previously unknown migration route or admixture event.

The study of ancient DNA (aDNA) from skeletal remains has further illuminated the early history of the Americas. For example, the analysis of the 12,600-year-old Anzick-1 child from Montana provided crucial insights into the genetic makeup of the Clovis culture, one of the earliest widespread cultures in North America. These ancient DNA studies continue to refine our understanding of the timing and routes of the initial peopling of the Americas.

### The Agricultural Revolution and Genetic Adaptations

The transition from a hunter-gatherer lifestyle to agriculture was a pivotal moment in human history, profoundly impacting genetic and cultural evolution. This Neolithic Revolution began around 10,000 years ago in the Fertile Crescent and subsequently spread to other parts of the world. The advent of agriculture led to significant demographic changes, including population growth, sedentism, and the rise of complex societies.

Genetic studies have identified numerous adaptations associated with the shift to agriculture. One of the most well-known examples is the evolution of lactase persistence—the ability to digest lactose, the sugar in milk, into adulthood. In most mammals, the expression of lactase declines after weaning, but in some human populations, a genetic mutation allows for continued lactase production. This adaptation is particularly prevalent in populations with a long history of dairy farming, such as those in Europe and parts of Africa.

The spread of agriculture also influenced the genetic landscape through processes such as gene flow and selection. For instance, the introduction of farming to Europe by Near Eastern migrants around 8,000 years ago led to genetic mixing between incoming farmers and local hunter-gatherers. This admixture is evident in the genomes of contemporary Europeans, who carry genetic markers from both ancestral populations.

### The Rise and Fall of Civilizations

Genetics provides a unique perspective on the rise and fall of ancient civilizations. By analyzing DNA from ancient remains, researchers can trace the movements and interactions of past populations, shedding light on historical events that shaped the world we live in today.

One notable example is the Bronze Age collapse, a period of societal upheaval around 1200 BCE that affected several ancient civilizations in the Eastern Mediterranean. Genetic studies of ancient remains from this period have revealed significant population movements and admixture events. For instance, the genomes of ancient Canaanites show a mixture of local Levantine ancestry and incoming populations from the Caucasus and Iran, reflecting the complex demographic changes during this tumultuous era.

Similarly, the genetic legacy of the Mongol Empire, which at its height in the 13th century was the largest contiguous land empire in history, can be traced in the genomes of contemporary populations across Eurasia. The expansion of the Mongol Empire facilitated extensive gene flow, leading to genetic contributions from Mongol warriors in populations as far west as Europe.

### The Impact of Disease on Human Evolution

Infectious diseases have been powerful selective forces throughout human history, shaping our genomes in profound ways. The co-evolution of humans and pathogens has left distinctive genetic signatures that provide insights into past epidemics and their impacts on populations.

One of the most well-studied examples is the Black Death, a devastating pandemic that swept through Europe in the 14th century. Genetic analyses of ancient DNA from plague victims have identified specific genetic variants associated with increased resistance to Yersinia pestis, the bacterium responsible for the plague. These variants, which likely provided a survival advantage during the pandemic, are found at higher frequencies in contemporary European populations.

The study of genetic adaptations to diseases extends beyond the Black Death. For example, the high prevalence of sickle cell trait in certain African populations is a classic example of balancing selection. The sickle cell trait, caused by a mutation in the hemoglobin gene, provides protection against malaria—a significant selective pressure in regions where the disease is endemic. However, individuals with two copies of the mutation develop sickle cell disease, illustrating the trade-offs inherent in genetic adaptations to infectious diseases.

### Modern Human Diversity and Its Implications

Today, advances in genomic technologies have revolutionized our understanding of human diversity and its implications for health and disease. Genome-wide association studies (GWAS) have identified thousands of genetic variants associated with complex traits and diseases, providing insights into the genetic architecture of common conditions such as heart disease, diabetes, and cancer.

However, much of this research has focused on populations of European descent, leading to a disparity in our understanding of genetic risk factors across different ethnic groups. Efforts to diversify genetic research are crucial for ensuring that all populations benefit from the advances in precision medicine. Initiatives such as the African Genome Variation Project and the H3Africa Consortium are working to address this gap by studying the genetic diversity of African populations, which harbor a wealth of genetic variation due to their deep evolutionary history.

Understanding genetic diversity also has implications beyond medicine. It informs our understanding of human evolution, migration, and the complex interplay between genes and the environment. For example, research on the genetic basis of traits such as skin pigmentation and adaptation to high-altitude environments has revealed the intricate ways in which humans have adapted to diverse ecological niches.

### Ethical Considerations in Genetic Research

The power of genetic research to uncover the past and inform the future comes with significant ethical considerations. Issues of privacy, consent, and the potential for genetic discrimination are paramount in the context of genomic research. The use of ancient DNA to study human history also raises questions about the ownership and stewardship of genetic heritage, particularly for indigenous populations.

Ethical frameworks and guidelines are essential for navigating these challenges. Engaging with communities, respecting their autonomy, and ensuring that research benefits are shared equitably are critical components of ethical genetic research. As we continue to explore the genetic underpinnings of human history, it is vital to balance scientific inquiry with respect for the rights and dignity of individuals and communities.

### Conclusion

The study of genetics offers a powerful tool for unraveling the complex history of our species. From the initial migrations out of Africa to the rise and fall of ancient civilizations, and from the adaptations to new environments and diseases to the modern diversity of human populations, genetics provides a unique window into our past. As we advance our understanding of the genetic basis of human history, we must also consider the ethical implications of this research and strive to ensure that its benefits are shared equitably.

Probably only 15 states would restrict most abortion if it was put to a popular vote

A few years ago Pew did a survey on attitudes to abortion by state, and you can see the results below in the table.

What I want to focus on is that Kansas and Ohio have both kept most abortions legal through the popular vote. This can give us a sense of the popular mood: probably no state more liberal than them in surveyed views would approve abortion restrictions if it were put to a plebiscite. In fact, the victories in Kansas and Ohio, two states where the earlier poll suggested people were split, indicate that currently by popular vote only about 15 states would probably be pro-life today.

Which brings me to the issue of legislatures and the parties. Oklahoma is as pro-choice a state as New Mexico on the popular level, but because Oklahoma is a very conservative Republican state, and New Mexico is a trending liberal Democratic state, their abortion policies are at opposite poles. I also want to highlight Nebraska and North Carolina, two states that are more pro-choice than Kansas and Ohio, but have very restrictive abortion laws. Finally, Florida is a very pro-choice state despite trending Republican since 2016. I think that Republican trend will be in jeopardy unless the party moderates on the abortion issue. The gap between public sentiment and the law seems too great.

Abortion laws as of Nov 14th 2022
| State | Legal in all/most cases | Illegal in all/most cases | Don’t know | N | Delta (illegal minus legal) |
|---|---|---|---|---|---|
| West Virginia | 35% | 58% | 6% | 309 | 23% |
| Mississippi | 36% | 59% | 5% | 309 | 23% |
| Arkansas | 38% | 60% | 2% | 311 | 22% |
| Kentucky | 36% | 57% | 7% | 439 | 21% |
| Alabama | 37% | 58% | 4% | 511 | 21% |
| Louisiana | 39% | 57% | 4% | 465 | 18% |
| Tennessee | 40% | 55% | 5% | 661 | 15% |
| South Carolina | 42% | 52% | 6% | 495 | 10% |
| Indiana | 43% | 51% | 6% | 654 | 8% |
| Missouri | 45% | 50% | 5% | 642 | 5% |
| Texas | 45% | 50% | 4% | 2,535 | 5% |
| North Dakota | 47% | 51% | 3% | 338 | 4% |
| Utah | 47% | 51% | 3% | 315 | 4% |
| Idaho | 45% | 49% | 6% | 320 | 4% |
| South Dakota | 48% | 50% | 3% | 305 | 2% |
| Georgia | 48% | 49% | 4% | 968 | 1% |
| Wyoming | 48% | 49% | 3% | 316 | 1% |
| Kansas | 49% | 49% | 3% | 307 | 0% |
| Ohio | 48% | 47% | 4% | 1,132 | -1% |
| Arizona | 49% | 46% | 4% | 653 | -3% |
| North Carolina | 49% | 45% | 6% | 1,022 | -4% |
| Nebraska | 50% | 46% | 5% | 312 | -4% |
| New Mexico | 51% | 45% | 4% | 312 | -6% |
| Oklahoma | 51% | 45% | 4% | 391 | -6% |
| Iowa | 52% | 46% | 2% | 330 | -6% |
| Pennsylvania | 51% | 44% | 5% | 1,366 | -7% |
| Minnesota | 52% | 45% | 4% | 563 | -7% |
| Wisconsin | 53% | 45% | 3% | 600 | -8% |
| Michigan | 54% | 42% | 4% | 982 | -12% |
| Virginia | 55% | 42% | 3% | 882 | -13% |
| Illinois | 56% | 41% | 3% | 1,326 | -15% |
| Delaware | 55% | 38% | 6% | 301 | -17% |
| Florida | 56% | 39% | 5% | 2,020 | -17% |
| Montana | 56% | 38% | 5% | 312 | -18% |
| California | 57% | 38% | 5% | 3,697 | -19% |
| Colorado | 59% | 36% | 5% | 504 | -23% |
| Washington | 60% | 36% | 5% | 714 | -24% |
| New Jersey | 61% | 35% | 4% | 886 | -26% |
| Nevada | 62% | 34% | 3% | 314 | -28% |
| Alaska | 63% | 34% | 3% | 310 | -29% |
| Oregon | 63% | 34% | 3% | 419 | -29% |
| Maine | 64% | 33% | 3% | 303 | -31% |
| Maryland | 64% | 33% | 3% | 644 | -31% |
| Rhode Island | 63% | 31% | 7% | 305 | -32% |
| New York | 64% | 32% | 4% | 1,966 | -32% |
| Hawaii | 66% | 29% | 4% | 312 | -37% |
| New Hampshire | 66% | 29% | 5% | 303 | -37% |
| Connecticut | 67% | 28% | 5% | 377 | -39% |
| District of Columbia | 70% | 26% | 4% | 303 | -44% |
| Vermont | 70% | 26% | 4% | 306 | -44% |
| Massachusetts | 74% | 22% | 4% | 704 | -52% |
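
For anyone who wants to replay the comparison, here is a minimal Python sketch of how the Delta column works and how a benchmark state such as Kansas can be used as a cutoff. Only a handful of rows are copied from the table, and the function names are hypothetical, chosen just for illustration.

```python
# A handful of rows copied from the table: (state, % legal, % illegal).
SAMPLE_ROWS = [
    ("West Virginia", 35, 58),
    ("South Dakota", 48, 50),
    ("Kansas", 49, 49),
    ("Ohio", 48, 47),
    ("Florida", 56, 39),
    ("Massachusetts", 74, 22),
]


def delta(pct_legal, pct_illegal):
    """Delta as used in the table: illegal share minus legal share."""
    return pct_illegal - pct_legal


def more_restrictive_than(rows, benchmark_state):
    """States whose Delta is strictly higher than the benchmark state's."""
    benchmark = next(
        delta(legal, illegal)
        for state, legal, illegal in rows
        if state == benchmark_state
    )
    return [
        state
        for state, legal, illegal in rows
        if delta(legal, illegal) > benchmark
    ]


print(more_restrictive_than(SAMPLE_ROWS, "Kansas"))
```

The idea is simply that a state is a candidate for approving restrictions by plebiscite only if its surveyed Delta sits above that of a state that has already voted the other way.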

People in Brazil are quite “mixed-race”

Probably the most famous Brazilian American is Gisele Bündchen, erstwhile supermodel and ex-wife of Tom Brady. Bündchen is a German Brazilian, and all the media I see say she is purely German. She grew up in a predominantly German town in the southern state of Rio Grande do Sul, which is often contrasted with the black-dominated areas of northeastern Brazil. About 80% of people in Rio Grande do Sul identify as white, about 10% mixed-race, 5% black and the remaining 5% indigenous, Asian, etc.

These sorts of facts are often used to recapitulate American racial dynamics in Brazil, except here you have a black majority and a white minority, though the latter are still socially, culturally and economically dominant. This is in contrast to the model that Brazilians themselves promoted in the 20th century of being a multiracial and mixed-race society, albeit defined by a fair amount of naked anti-black bias.

The main problem with the first narrative is it is just a plain fact that most Brazilians are mixed-race in the American context. Bündchen is the exception, not the rule. This has been hard to ascertain because of the lack of high density SNP array surveys in the early years of this blog, but I decided to go back and check now that these chips are very cheap, and a paper with 6,500 Brazilians typed on 370,000 SNPs exists to illustrate the ancestry distributions within: A minimum set of ancestry informative markers for determining admixture proportions in a mixed American population: the Brazilian set.

The admixture plot shows two things. First, under “11,” sampled in the far southern Brazilian city of Pelotas, only a few individuals on the right portion of the distribution show no trace of non-European ancestry. The prevalence of low but widespread Amerindian ancestry is not surprising in Brazil, where the early European settlers seem to have absorbed the natives. Second, under “12” you see samples from the city of Salvador, where 80% of people identify as mixed-race or black. Here you see lots of African ancestry, but only the individuals at the far left of the distribution are as African as the average African American (the rightmost panel is from a central Brazilian city).

This pattern is even more clear on PCA:

What’s the takeaway? By American standards most Brazilians are black, because they have African ancestry. This includes a majority of self-identified “white” Brazilians.

Computational linguistic phylogenetics and Indo-Europeans

A new paper, Language trees with sampled ancestors support a hybrid model for the origin of Indo-European languages, has made a splash by inferring a far older date of diversification of these languages than has been assumed by other linguists, archaeologists and geneticists. As you can see above, the splits start a bit earlier than 5000 BC in this model, 1,500-2,000 years before the “classic” Pontic-steppe hypothesis. There are divergences in the topology from what some have assumed, for example, the deep split of Indo-Iranian from other groups. Nemets in Proto-Indo-European Urheimat Debate has given some skeptical thoughts, while Iosif Lazaridis of the “Southern Arc” fame has also offered his two cents. Others on social media have pointed out what seem to be important errors in the paper:

I can’t speak to the linguistics. I will say that I was taught about Bayesian phylogenetic inference in graduate school, so I know some of the models and parameters they’re using, and I’ve even used BEAST 2 myself. It’s not my specialty at all, so I have weak intuition, but these are serious methods that allow for understanding the past and reconstruction of evolutionary relationships. But I will pass on Asya Pereltsvaig’s criticism in The Indo-European Controversy: Facts and Fallacies in Historical Linguistics that the reliance on lexicon as input data might be a major problem in these inferences; misleading data in, misleading result out. But I’ll comment on a few issues that jumped out at me informed by ancient DNA.

First, the position of Tocharian is not surprising…it often comes out as diverging early from the other languages. Tocharian languages were found in the northern and northeast regions of the Tarim basin. Historically, the southern rim of the basin was dominated by Iranian languages. It seems the most likely candidate for the people that gave rise to the Tocharian languages is the Afanasievo culture. The Afanasievo, we now know, were basically an eastern branch of the Yamnaya that shows up in the Altai around 3300 BC. This is roughly 5,300 years BP. In the paper, the Tocharian split from other Indo-Europeans falls 5,400 to 8,600 years BP over a 95% confidence interval. The only way this makes sense to me is if there was deep linguistic structure within the Yamnaya despite overall genetic homogeneity maintained through mate exchange. In the text the authors seem to imply that the Tocharians are an early eastward migration, perhaps from the south Caucasus region. This does not align very well with the ancient DNA. The Afanasievo early on are genetic copies of the Yamnaya. Were the Tocharians already there? Did the Afanasievo just adopt their language?
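The date arithmetic here is simple enough to check by hand, but a minimal sketch makes the mismatch explicit. It assumes the radiocarbon convention that "present" is AD 1950 (the paper may anchor its "present" differently, and I round to 5,300 above), and the function names are mine, purely for illustration:

```python
# Sketch: compare an archaeological date with a phylogenetic 95% interval.
# Assumption: "BP" means years before AD 1950; the paper's own anchor may differ.

def bc_to_bp(year_bc: int, present_ad: int = 1950) -> int:
    """Convert a BC calendar year to years before present (there is no year zero)."""
    return year_bc + present_ad - 1


def gap_to_interval(age_bp: int, younger_bp: int, older_bp: int) -> int:
    """Return 0 if age_bp lies inside [younger_bp, older_bp], else the distance to the nearest bound."""
    if age_bp < younger_bp:
        return younger_bp - age_bp
    if age_bp > older_bp:
        return age_bp - older_bp
    return 0


afanasievo_bp = bc_to_bp(3300)   # Afanasievo appear in the Altai around 3300 BC
tocharian_split = (5400, 8600)   # 95% interval quoted from the paper, in years BP
gap = gap_to_interval(afanasievo_bp, *tocharian_split)

print(f"Afanasievo: ~{afanasievo_bp} BP; gap to the interval: {gap} years")
```

Under these assumptions the Afanasievo date falls just outside the younger edge of the interval, which is why the inferred split only works if the language had already diverged before the Afanasievo left the steppe.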

The second issue I have broadly is with the Indo-Iranians. The authors propose that the Indic and Iranian branches separated in 3,500 BC. While earlier work indicates that the Indo-Iranian languages descend from the Sintashta language and the cultures of the Andronovo horizon, these authors emphasize the role of populations from the south Caucasus traversing Iran south of the Caspian Sea.

Below is a map that gets to the crux of my confusion about this ancient date and longstanding indigeneity of Indo-Iranians on the Iranian plateau:

We have some histories of the Middle Eastern Bronze Age. We know that the area of southwest Iran was dominated by the non-Indo-European Elamites as early as 3000 BC, and these people persisted down into the Common Era. Modern Armenia was dominated by non-Indo-European speaking Urartians after 1000 BC. This language is related to Hurrian, documented 4,000 years ago. Before the Indo-European Hittites ruled Hatti, the Hattians ruled Hatti. And the Hattians were not Indo-European. Judging by the obscure Eteocretan language that persisted into antiquity the Minoans were almost certainly not Indo-European speaking. Around 1500 BC it is true the ruling elite of the Hurrians, the Mittani, seem to have had an Indo-Aryan connection, but they were also likely intrusive, and their emergence as mobile warriors suspiciously post-dates the development of the light war chariot and the domestic horse thousands of miles to the north several centuries earlier by the Sintashta. The Assyrian royal annals date the arrival of Persians to the 9th century BC, but the results in the paper imply that the Iranians were already present in the Zagros for thousands of years before this (the south Caucasus being the Indo-European ur-heimat ultimately, the Indo-Iranians moving south and east very early on from that region).

I’m focusing on the Middle East because there is a rich history of textual evidence starting in the third millennium BC. These results imply that Indo-European languages are in fact native to the northern Middle East, in the southern Caucasus. And yet assorted obscure languages like Gutian, Kassite and Kaska are found where you might expect a stray Indo-European one here and there. To me this is curious and weird. Further to the west, these results seem to imply that Greek was brought with Caucasus ancestry, but Minoan was likely not Indo-European. There are all these non-Indo-European languages attested in the textual record…and only a few Indo-European ones (Hittite being the first).

One of the major points of this paper that contradicts some theories in historical linguistics is a rejection of the tentative connection between Balto-Slavic and Indo-Iranian. Genetically, the curious aspect of the two language families is that Y chromosomal haplogroup R1a is very frequent in both, but differentiated into two lineages that seem to have diverged 5,500-6,000 years ago. But there is more than just Y chromosomes here; over the past decade autosomal genome analyses have shown that many South Asians, in particular those in the northwest and in upper caste populations, are enriched for a minority ancestral component that resembles Eastern Europeans. We now know what happened due to ancient DNA: Genetic ancestry changes in Stone to Bronze Age transition in the East European plain. A branch of the Corded Ware Culture (CWC) migrated eastward, becoming the Fatyanovo Culture, then the Balanovo Culture, then the Abashevo Culture, and finally the Sintashta Culture. The Sintashta seem to have given rise to a group of societies known as Andronovo that are hypothesized to have evolved into Iranians and Indo-Aryans.

The result here does away with all this. Rather than Indo-Aryan speech being brought by steppe pastoralists between 3,500 and 4,000 years ago, as genetics would imply, Indo-Aryan speech was likely present during the Indus Valley Civilization (IVC). These results imply that Indo-Aryan arrived in India thousands of years before the intrusion of steppe pastoralists, and that it was carried eastward by farmers from the Caucasus. The Vedas and Sanskrit then come down from the IVC. And yet strangely the Vedas do not depict a very complex society like the IVC, but a simpler agro-pastoralist one. And Sanskrit, presumably the sacred language of the IVC people, was maintained in particular by a Brahmin priestly caste that is notable for having a very high fraction of steppe ancestry, much of which arrived later.

A massive issue with this paper is that it makes a hash of a major phenomenon that we know occurred between 3500 and 2500 BC, and that’s the spread of steppe people in all directions, especially out of the Corded Ware complex. The CWC are notable for having a major admixture of Globular Amphora Culture (GAC) Neolithic ancestry, about 25-35% of their genetics, and then spreading in all directions. As noted by the authors and other observers, ancient DNA suggests that Anatolian, Armenian, and perhaps Greek and Illyrian (Albanian), are exceptions to this, deriving directly from Yamnaya or pre-Yamnaya (in the case of Hittites) Indo-European people (remember, CWC is a mix of Yamnaya and GAC). The genetics is very clear that a major wave of post-CWC people went into Asia, and south into the Indian subcontinent and Iran. The Y chromosomes imply this was male mediated, and post-CWC Y chromosomes are found in appreciable quantities as far south as Sri Lanka. But these data place this demographic migration far too late to have been the origin of Sanskrit, which is associated with Arya culture.

As Lazaridis points out on social media, the divergence of European language groups like Germanic, Celtic, Italic and Balto-Slavic also predates the CWC expansion westward. For example, the Italic languages split off in 3500 BC, 500 years earlier than the expansion of CWC into Eastern Europe, with a 95% lower bound of 2200 BC, about when steppe ancestry shows up in the Italian peninsula according to ancient DNA. If the dates are true then it seems that the various Indo-European language groups were already differentiated very early on in the Yamnaya, and not later on through their expansion across Europe. In other words, this is a model of “ancient linguistic substructure.”

To be entirely candid, it’s very hard for me to reconcile the ancient DNA with this topology and time-depth in a parsimonious manner that holds together in my head. This doesn’t mean the other models don’t have holes. The “Southern Arc” theory is pretty complicated too, and everything would have been “easier” if the Hittites had steppe ancestry, which they do not seem to have. But there are too many things that are hard for me to understand with this new model. For example, the vast numbers of steppe Iranian people seem to be mostly descended from the CWC societies that gave rise to Europeans, but their languages diverged extremely early from their western neighbors, almost 2,000 years before the diversification of the CWC as an archaeological, demographic and genetic unit.

How non-Africans came to be….

Most of the history of the human species is in Africa, which is why I wrote a Substack trying to outline what I think are various alternative models about what’s happened. But the last 100,000 years has been defined by the migration out of Africa for many. The above chart is a simplified representation of what I think is a good model for what happened. For a more complex and thorough treatment, check out Genetics and Material Culture Support Repeated Expansions into Paleolithic Eurasia from a Population Hub Out of Africa.

Also, new PNAS paper, The role of genetic selection and climatic factors in the dispersal of anatomically modern humans out of Africa:

The evolutionarily recent dispersal of anatomically modern humans (AMH) out of Africa (OoA) and across Eurasia provides a unique opportunity to examine the impacts of genetic selection as humans adapted to multiple new environments. Analysis of ancient Eurasian genomic datasets (~1,000 to 45,000 y old) reveals signatures of strong selection, including at least 57 hard sweeps after the initial AMH movement OoA, which have been obscured in modern populations by extensive admixture during the Holocene. The spatiotemporal patterns of these hard sweeps provide a means to reconstruct early AMH population dispersals OoA. We identify a previously unsuspected extended period of genetic adaptation lasting ~30,000 y, potentially in the Arabian Peninsula area, prior to a major Neandertal genetic introgression and subsequent rapid dispersal across Eurasia as far as Australia. Consistent functional targets of selection initiated during this period, which we term the Arabian Standstill, include loci involved in the regulation of fat storage, neural development, skin physiology, and cilia function. Similar adaptive signatures are also evident in introgressed archaic hominin loci and modern Arctic human groups, and we suggest that this signal represents selection for cold adaptation. Surprisingly, many of the candidate selected loci across these groups appear to directly interact and coordinately regulate biological processes, with a number associated with major modern diseases including the ciliopathies, metabolic syndrome, and neurodegenerative disorders. This expands the potential for ancestral human adaptation to directly impact modern diseases, providing a platform for evolutionary medicine.

This “stand still” out of Africa has been identified for a few decades now in the genomic data. I’m not sure people necessarily believe it…but it’s clear that non-Africans are very homogeneous and closely related to each other compared to various diverse African groups. That indicates a long period of isolation and homogeneity. The “Basal Eurasians” were part of this.

But earlier “modern” or “para-stem” populations may not have been. The Altai Neanderthal samples that date to 100,000 years ago have some “modern” ancestry. How? The best assumption right now is that these were very early migrants out of the African ur-heimat that mixed with Neanderthals in their eastern range, likely coming up through East Asia? Possibly mixed with Denisovans?

The point here is that some elements of the “weakly structured model” probably apply to Eurasia too. Africa is a big continent and good habitat for humans. Pulses out of this continent probably happened all the time. It’s just that the last Eurasian expansion about 50,000 years ago erased everything earlier (or absorbed it). Though I still think if we get ancient DNA from Laos it may turn out that 1% or something of the ancestry of Andamanese may date to a population that diverged 85,000 years ago from the main out-of-Africa group, before their isolation in the Near East (since the divergence is not that great, I don’t think the models would have good power to detect this proportion).

Early Maghrebis are Eurasian back-migration


A new ancient DNA paper from Northwest Africa: Northwest African Neolithic initiated by migrants from Iberia and Levant:

In northwestern Africa, lifestyle transitioned from foraging to food production around 7,400 years ago but what sparked that change remains unclear. Archaeological data support conflicting views: (1) that migrant European Neolithic farmers brought the new way of life to North Africa or (2) that local hunter-gatherers adopted technological innovations. The latter view is also supported by archaeogenetic data. Here we fill key chronological and archaeogenetic gaps for the Maghreb, from Epipalaeolithic to Middle Neolithic, by sequencing the genomes of nine individuals (to between 45.8- and 0.2-fold genome coverage). Notably, we trace 8,000 years of population continuity and isolation from the Upper Palaeolithic, via the Epipaleolithic, to some Maghrebi Neolithic farming groups. However, remains from the earliest Neolithic contexts showed mostly European Neolithic ancestry. We suggest that farming was introduced by European migrants and was then rapidly adopted by local groups. During the Middle Neolithic a new ancestry from the Levant appears in the Maghreb, coinciding with the arrival of pastoralism in the region, and all three ancestries blend together during the Late Neolithic. Our results show ancestry shifts in the Neolithization of northwestern Africa that probably mirrored a heterogeneous economic and cultural landscape, in a more multifaceted process than observed in other regions.

Basically, the ancestors of the Berbers seem to be a mix of indigenous people whose ancestors were Eurasians who arrived during the Last Glacial Maximum, Early European Farmers (Anatolian) and Levantine farmers/pastoralists, in that order of timing, but reverse order of contribution. To me it is interesting that there is no Sub-Saharan African ancestry in the earliest populations, indicating that the Sahara was really not habitable for humans for much of the Ice Age.

Moderate population structure out-of-Africa

A few weeks ago I wrote on my Substack about a new model of African H. sapiens genesis that assumes recurrent gene flow between deeply divergent populations within the continent. In contrast, the out-of-Africa movement was defined by a rapid expansion from a small ancestral founding group that diversified over the last 50,000 years. But there is a twist to this: the out-of-Africa event was presaged by many earlier expansions outward. The Neandersovans were one clear example, about 600,000 years ago. But there were others. We know this because Neanderthal Y and mtDNA seem to be more closely related to modern humans than the whole genome is, and the Altai Neanderthal shows clear evidence of modern human admixture as early as 100,000 years ago, far earlier than the primary out-of-Africa event 50,000 years ago.

Laos cave fossils prompt rethink of human migration map:

Archaeologists have uncovered two new bone fragments in a cave in northern Laos, suggesting that Homo sapiens wandered southeast Asia up to 86,000 years ago. The findings, published this week in Nature Communications, indicate that humans migrated through the area earlier than previously thought.

Over more than a decade, excavations in the Tam Pà Ling cave have uncovered seven bone fragments sandwiched between layers of clay. Laura Shackelford, an anthropologist at the University of Illinois Urbana–Champaign, and her colleagues have regularly had to hike through sticky tropical heat to reach the mountain-top cave.

After digging 7 metres down, excavations have finally hit bedrock, and the team has been able to reconstruct a complete chronology of the cave, says Shackelford. Sediment and bones unearthed in the cave show that modern humans have inhabited the mountainous region for at least 68,000 years, and passed through even earlier.

These archaeological findings seem to solidify the idea of modern (African) humans in Southeast Asia. They don’t seem to have left an imprint, but that’s OK, the first modern Europeans didn’t either.

Open Thread – 5/8/2023 – GNXP

I’m in New York City until Thursday. Might do some meet-ups… Email or DM me.

Finally finished my Iran sequence…

Been listening to audiobooks, The Medici: Power, Money, and Ambition in the Italian Renaissance and The French Revolution: From Enlightenment to Tyranny. The former is a narrative that’s really focused on the biographies of the great Medici so it’s great for listening when you’re running or whatever, and the latter so far has focused on personalities.