Molecular De-Extinction: Mining Ancient Genomes for Tomorrow's Antibiotics

Somewhere in a laboratory at the University of Pennsylvania, a synthetic peptide derived from the woolly mammoth is killing drug-resistant bacteria. The mammoth has been extinct for four thousand years. The bacteria it is destroying evolved their resistance last decade. This is molecular de-extinction — and it may be the strangest, most promising frontier in the fight against superbugs.

The premise is counterintuitive: to fight the most modern medical crisis, look backward. Way backward. Into the proteomes of creatures that vanished millennia ago, into the genomes of archaea that have thrived in boiling acid for billions of years, into the venom glands of snakes and scorpions whose chemical arsenals evolved over hundreds of millions of years of predator-prey warfare. These organisms developed antimicrobial molecules that modern bacteria have never encountered — and therefore have never evolved resistance against.

APEX: Teaching AI to Read the Dead

The engine behind molecular de-extinction is a deep-learning platform called APEX — Antibiotic Peptide de-Extinction — developed by César de la Fuente's lab at the University of Pennsylvania. APEX uses a multitask neural network combining recurrent and attention architectures to predict which peptide sequences, buried within larger proteins, might have antimicrobial activity.
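To make "peptide sequences buried within larger proteins" concrete, here is a minimal sketch of how candidate fragments might be enumerated from a single protein before any model scores them. It is not APEX's actual pipeline: the function names and the 8 to 50 residue length bounds are assumptions chosen for illustration only.

```python
from typing import Iterator, Tuple


def encrypted_peptides(
    protein: str, min_len: int = 8, max_len: int = 50
) -> Iterator[Tuple[int, str]]:
    """Yield (start, fragment) for every contiguous fragment of `protein`
    whose length lies between min_len and max_len inclusive.

    The length bounds are illustrative assumptions, not values from the paper.
    """
    n = len(protein)
    for start in range(n):
        for length in range(min_len, max_len + 1):
            end = start + length
            if end > n:
                break  # longer fragments from this start would overrun the protein
            yield start, protein[start:end]


def count_fragments(protein_len: int, min_len: int = 8, max_len: int = 50) -> int:
    """Closed-form count of the fragments the generator above would yield."""
    return sum(max(protein_len - k + 1, 0) for k in range(min_len, max_len + 1))


if __name__ == "__main__":
    toy_protein = "ACDEFGHIKL"  # hypothetical 10-residue sequence
    for start, fragment in encrypted_peptides(toy_protein, min_len=8, max_len=9):
        print(start, fragment)
    print(count_fragments(300))
```

Under these assumed bounds, one 300-residue protein already yields 11,696 candidate fragments, which gives a sense of how whole proteomes reach into the millions.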

The scale of the search is staggering. In their landmark Nature Biomedical Engineering paper (2024), the team mined the proteomes of every extinct organism with available sequence data — the "extinctome." APEX processed 10.3 million encrypted peptides and identified 37,176 sequences predicted to have broad-spectrum antimicrobial activity. Of these, 11,035 were unique to extinct species — molecules that existed only in organisms no longer alive.

They synthesized 69 of the most promising candidates and tested them against drug-resistant bacteria. The results were striking.

10.3M: peptides mined from extinct proteomes
37,176: predicted antimicrobial sequences
11,035: unique to extinct organisms
69: synthesized and confirmed active

Lead compounds included mammuthusin-2 from the woolly mammoth, elephasin-2 from the straight-tusked elephant, hydrodamin-1 from an ancient sea cow, mylodonin-2 from the giant sloth, and megalocerin-1 from the extinct giant elk. In mouse models of skin abscess and thigh infections, the best performers — elephasin-2 and mylodonin-2 — showed anti-infective activity comparable to polymyxin B, a widely used last-resort antibiotic.

Most of these peptides kill bacteria by depolarizing the cytoplasmic membrane — the inner barrier of the cell. This is significant because known antimicrobial peptides typically target the outer membrane. A different mechanism means a different resistance landscape. Bacteria have never been selected to defend against these molecules. Evolution has not prepared them.

Three Source Pools: Extinct, Extreme, Venomous

The extinctome was only the beginning. De la Fuente's team has since turned APEX on two additional biological reservoirs, each offering a distinct chemical space.

Archaea: Life's Ancient Extremists

In a study published in Nature Microbiology (2025), the team updated APEX to version 1.1 and aimed it at 233 archaeal proteomes. Archaea — single-celled organisms that thrive in boiling hot springs, hypersaline lakes, and deep-sea hydrothermal vents — diverged from bacteria billions of years ago. Their proteins evolved under radically different selective pressures.

APEX identified 12,623 candidate antimicrobial peptides from these proteomes, which they named archaeasins. The amino acid composition was clearly distinct from traditional antimicrobial peptides — a signature of their alien evolutionary lineage. Of 80 archaeasins synthesized and tested, 93% showed antimicrobial activity against ESKAPE pathogens including A. baumannii, E. coli, K. pneumoniae, P. aeruginosa, S. aureus, and Enterococcus spp. Three were tested in mouse infection models and all arrested drug-resistant infections.

A 93% hit rate is extraordinary. For context, traditional high-throughput screening of synthetic chemical libraries typically yields hit rates below 1%. Even AI-driven virtual screening of existing compound libraries (like Genentech's GNEprop, which achieved a 90-fold improvement) operates in low single-digit percentages. Archaea, it turns out, are chemical factories for antimicrobial molecules — hiding in plain sight for billions of years.

Venoms: Nature's Oldest Weapons

The third source pool drew from a database of more than 40 million venom-encrypted peptides — molecular fragments hidden within the venom proteins of snakes, spiders, scorpions, and wasps. Published in Nature Communications (2025), the study used APEX to flag 386 compounds with the molecular hallmarks of antibiotics. The team synthesized 58, and 53 killed drug-resistant bacteria at doses harmless to human red blood cells. That is a 91% hit rate. More than 2,000 new antibacterial motifs were identified — structural patterns that could serve as templates for future drug design.
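The hit-rate arithmetic is simple enough to check directly. The sketch below recomputes the venom figure from the counts quoted above and compares it with a 1% baseline. That baseline is an assumption: it takes the "below 1%" figure cited for conventional screening as an upper bound. The function names are illustrative.

```python
def hit_rate(active: int, tested: int) -> float:
    """Fraction of tested compounds that showed activity."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return active / tested


def fold_over_baseline(rate: float, baseline: float) -> float:
    """How many times higher `rate` is than a reference hit rate."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return rate / baseline


if __name__ == "__main__":
    venom = hit_rate(53, 58)  # counts quoted in the text
    print(f"venom hit rate: {venom:.1%}")

    # Assumed baseline: 1%, the upper bound quoted for conventional screening.
    print(f"fold over a 1% baseline: {fold_over_baseline(venom, 0.01):.0f}x")
```

Because conventional hit rates are described as below 1%, the fold figure computed against that baseline is a lower bound on the gap.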

Three source pools. Three studies. Three Nature-family publications. Hit rates of 93% for archaeasins and 91% for venom peptides dwarf anything in conventional drug discovery. The common thread: biological reservoirs that modern pathogens have never encountered.

Our Extinct Relatives Had Antibiotics Too

Before the APEX extinctome work, de la Fuente's group had already explored a more personal branch of molecular de-extinction. In a 2023 study published in Cell Host & Microbe, they mined the proteomes of Neanderthals and Denisovans — our closest extinct relatives — for antimicrobial peptides.

Using a machine-learning model called panCleave, they identified 69 candidate peptides from ancient human proteins. Six showed in vivo activity against Acinetobacter baumannii, one of the most dangerous drug-resistant pathogens on Earth. The lead compound, Neanderthalin-1, performed comparably to polymyxin B in a mouse skin infection model.

The Neanderthal peptides had a strikingly different amino acid profile from known antimicrobial peptides: more polar, more acidic, more aromatic, less basic. They represent a chemistry that rational drug design alone would likely never have arrived at. Evolution optimized these molecules over hundreds of thousands of years of coexistence with pathogens. We lost that chemistry when the Neanderthals disappeared. Now AI is retrieving it.
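An amino acid profile of this kind can be summarized as the fraction of residues falling into each chemical class. The sketch below shows one way to compute it. The two sequences are hypothetical, not peptides from the study, and the class definitions follow one common textbook grouping (histidine counted as basic, for example) rather than whatever scheme the authors used.

```python
from collections import Counter
from typing import Dict

# One common grouping of residues by side-chain chemistry (an assumption here).
RESIDUE_CLASSES = {
    "acidic": "DE",
    "basic": "KRH",
    "aromatic": "FWY",
    "polar_uncharged": "STNQ",
}


def class_fractions(peptide: str) -> Dict[str, float]:
    """Return the fraction of residues in each class for one peptide."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must be non-empty")
    counts = Counter(peptide)
    return {
        name: sum(counts[aa] for aa in members) / len(peptide)
        for name, members in RESIDUE_CLASSES.items()
    }


if __name__ == "__main__":
    # Hypothetical sequences for illustration only.
    cationic_like = "KKLLRKAKRL"   # lysine/arginine-rich, like many known AMPs
    atypical_like = "DEWSYTEQFD"   # acidic, aromatic and polar
    for name, seq in [("cationic_like", cationic_like), ("atypical_like", atypical_like)]:
        print(name, class_fractions(seq))
```

Comparing these fractions across a set of known antimicrobial peptides and a set of newly mined ones is one simple way to see a shift like the one described.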

From Mining to Creating: The APEX Evolution

What makes this work more than a clever mining exercise is how the platform has evolved. APEX began as a classifier — a tool for finding needles in ancient haystacks. Over three years, de la Fuente's team has transformed it into a generative system capable of creating entirely new molecules inspired by ancient biology.

Platform | Year | Approach | Key Result
APEX | 2023–24 | Deep-learning classifier mining extinct proteomes | 10.3M peptides screened, 69 confirmed active
APEX-GO | 2024 | Generative Bayesian optimization on de-extinct templates | 85% hit rate, 72% potency improvement
AMP-Diffusion | 2025 | Latent diffusion model on ESM-2 protein embeddings | 50K candidates generated, 2 matched FDA-approved antibiotics in mice
ApexOracle | 2025–26 | Multimodal: pathogen genome + phenotype → molecule design | Designs antibiotics for pathogens that don't exist yet

APEX-GO took 10 de-extinct peptide templates and used generative Bayesian optimization to produce 100 optimized derivatives — achieving an 85% hit rate and 72% potency improvement against clinically relevant Gram-negative pathogens. In mouse models, several derivatives — particularly optimized versions of mammuthusin-3 and mylodonin-2 — performed comparably to or better than polymyxin B.

AMP-Diffusion, published in Cell Biomaterials in September 2025, went further. Built on ESM-2, Meta's protein language model, it used latent diffusion to generate 50,000 entirely new candidate molecules. APEX ranked them, 46 were synthesized, and two matched the efficacy of FDA-approved antibiotics levofloxacin and polymyxin B in mouse skin infection models. These were not ancient molecules — they were new molecules generated by a model trained on ancient chemical patterns.

The most ambitious iteration is ApexOracle, described in an arXiv preprint (July 2025) and featured in MIT Technology Review (February 2026). ApexOracle is multimodal: it takes a pathogen's genome (processed through the Evo2 DNA foundation model), its phenotypic characteristics (as text), and a candidate molecule as inputs. It then predicts efficacy and can generate de novo molecules optimized for that specific pathogen. The ambition is to design antibiotics for pathogens that do not exist yet — to anticipate future resistance before it emerges.

The trajectory is clear. In three years, the platform evolved from mining what evolution created to creating what evolution never imagined.

Paleomycin: Resurrecting Antibiotics from Deep Time

Meanwhile, at McMaster University, Gerry Wright's lab has pursued a complementary approach to molecular de-extinction — one focused not on peptides but on antibiotics themselves.

In a study published in Nature Communications (2023), the Wright lab used computational phylogenetics to reconstruct an ancestral glycopeptide antibiotic — a molecule they called paleomycin. By analyzing the evolutionary history of vancomycin-class antibiotics across hundreds of bacterial species, they inferred the sequence of the biosynthetic genes that would have existed 150 to 400 million years ago. Then they inserted those genes into a microbial host and produced the molecule.

Paleomycin is a different flavor of de-extinction. Where de la Fuente mines peptide fragments encrypted within proteins, Wright reconstructs entire biosynthetic pathways from deep evolutionary time. The two approaches share the same foundational insight: time is a dimension of chemical space. The further back you reach, the less likely modern bacteria are to have pre-existing defenses.

The Global Microbiome: 863,498 Hidden Antibiotics

The de-extinction paradigm extends beyond ancient and extinct organisms. In June 2024, a team reported in Cell the creation of AMPSphere, a database of 863,498 non-redundant antimicrobial peptide sequences mined from 63,410 metagenomes and 87,920 microbial genomes sampled from environments worldwide. Soil, ocean, human gut, extreme environments: all yield molecules that standard antibiotic discovery pipelines would never encounter. Of 100 candidates synthesized and tested, 79 showed antimicrobial activity.

AMPSphere represents the logical extension of molecular de-extinction: if time is one dimension of untapped chemical space, ecology is another. The global microbiome harbors nearly a million potential antibiotics. We have tested 100 of them.

The Translation Gap

Honesty requires acknowledging what stands between these discoveries and actual medicine.

The history of antimicrobial peptides as drugs is sobering. Despite more than 5,000 AMPs identified over decades of research, fewer than 50 have entered clinical trials, and fewer still have been approved. Pexiganan, iseganan, and omiganan — all promising peptide antibiotics — failed in Phase 3. The reasons are consistent: poor stability (peptides are rapidly degraded by proteases), toxicity at therapeutic doses (the same membrane-disrupting activity that kills bacteria can damage host cells), difficulty in delivery, and the persistent gap between in vitro activity and in vivo efficacy.

De la Fuente's team is aware of these challenges. Their best peptides show approximately 40% proteolytic stability at six hours — an improvement over many natural AMPs, but still a pharmacokinetic hurdle. They have filed one patent (WO2025054593) covering 41 antimicrobial peptides and their synergistic combinations. No IND-enabling studies have been announced. The path from mouse skin infections to human clinical trials remains long.
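To get a feel for what the stability figure implies, the sketch below converts "40% at six hours" into an approximate half-life. It rests on two assumptions that are not stated in the source: that the figure means 40% of the peptide remains intact after six hours, and that degradation follows simple first-order (single-exponential) kinetics. Real peptide degradation in serum can deviate from both.

```python
import math


def half_life_hours(fraction_remaining: float, elapsed_hours: float) -> float:
    """Half-life implied by first-order decay, given the fraction left after some time.

    Model (an assumption): N(t) = N0 * exp(-k * t), so k = -ln(fraction) / t
    and the half-life is ln(2) / k.
    """
    if not 0 < fraction_remaining < 1:
        raise ValueError("fraction_remaining must be between 0 and 1 (exclusive)")
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    rate_constant = -math.log(fraction_remaining) / elapsed_hours
    return math.log(2) / rate_constant


if __name__ == "__main__":
    # 40% intact after 6 hours, as read from the figure quoted in the text.
    print(f"implied half-life: {half_life_hours(0.40, 6.0):.1f} h")
```

Under those assumptions the implied half-life is roughly four and a half hours, which illustrates why stability remains a pharmacokinetic hurdle even for the better candidates.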

A recent review in ACS Omega, titled "Molecular Paleontology Meets Drug Discovery," frames the challenge precisely: the narrative is compelling, but stability, toxicity, and delivery remain unsolved problems. The review notes that only 14 papers on molecular de-extinction were published across the entire field between 2023 and 2025. This is a beginning, not a breakthrough.

Three Waves of AI Antibiotics

Molecular de-extinction is best understood as the third wave of a larger revolution in AI-driven antibiotic discovery.

The first wave was screening: using machine learning to find existing molecules with antibiotic potential. Halicin (2020), from the Collins lab at MIT, and abaucin (2023), from a collaboration between the Stokes lab at McMaster and the Collins lab, were discovered by training models on known compounds and scanning large chemical libraries. The AI found needles in existing haystacks.

The second wave was generation: using deep learning to design entirely new molecules from scratch. The Collins lab's generative AI work, published in Cell (August 2025), produced NG1 and DN1 — novel antibiotics with mechanisms never before seen in nature. Genentech's GNEprop screened 1.4 billion virtual compounds. These systems create new haystacks entirely.

The third wave — exemplified by de la Fuente's work — mines biology itself. Not synthetic chemical space, not existing drug libraries, but the molecular record of evolution. Extinct organisms, extreme environments, ancient human relatives, venom glands, the global metagenome. Each source offers chemical diversity that synthetic chemistry cannot replicate, because it was shaped by selective pressures that no laboratory can simulate.

The three waves are complementary, not competing. Each accesses a different region of chemical space. Together, they represent the most significant expansion of antibiotic discovery capability since the golden age of soil screening in the mid-20th century.

Why This Matters Now

The antibiotic pipeline dropped 35% in five years. NDM-producing superbugs surged 461% in the United States. The WHO reports that only 11 of 90 pipeline agents target the most critical pathogens. The traditional approach — modifying existing antibiotic scaffolds — is running out of road.

Molecular de-extinction offers something genuinely different: not a new variation on a known theme, but access to chemical diversity that modern bacteria have never been exposed to. Peptides from organisms that went extinct before the antibiotic era. Molecules from archaea that diverged from bacteria billions of years ago. Compounds shaped by evolutionary arms races in environments no drug company has ever screened.

These molecules are not drugs yet. They may never become drugs — the AMP translation gap is real, and the history of the field counsels humility. But the hit rates are extraordinary (93% for archaeasins, 91% for venom peptides), the mechanisms are novel (cytoplasmic membrane depolarization rather than outer membrane targeting), and the platform is evolving at remarkable speed — from classifier to generator to multimodal oracle in three years.

And there is something poetically fitting about the approach. Antibiotic resistance is an evolutionary problem. Bacteria evolve faster than we can develop drugs. The traditional response has been to try to out-engineer evolution — to design better molecules through chemistry. Molecular de-extinction takes a different stance: instead of trying to beat evolution, mine it. Reach into the deepest archives of biological time, where 3.8 billion years of molecular warfare have produced a chemical diversity we are only beginning to comprehend.

The woolly mammoth is not coming back. But its molecules might save us.

Key Sources: Nature Biomedical Engineering (2024) — APEX extinctome mining; Nature Microbiology (2025) — Archaeasins; Nature Communications (2025) — Venom-encrypted peptides; Cell Host & Microbe (2023) — Neanderthal/Denisovan peptides; bioRxiv (2024) — APEX-GO; Cell Biomaterials (2025) — AMP-Diffusion; MIT Technology Review (Feb 2026) — ApexOracle profile; Nature Communications (2023) — Paleomycin; ACS Omega (2025) — "Molecular Paleontology Meets Drug Discovery" review; Patent WO2025054593.