Babak Alipanahi
babaka.bsky.social
Chief Scientist at @exai.bio — I use computational biology, machine learning and large-scale datasets to improve human health
Reposted by Babak Alipanahi
AI hallucinations in science manuscripts are a nuisance. Paranormal citations, or paracites, will be a nightmare.

www.biorxiv.org/content/10.6... (w/ @sina.bio & @lauraluebbert.com).
February 3, 2026 at 5:19 PM
Reposted by Babak Alipanahi
Time for a thread on our Christmas preprint “Origin and evolution of acrocentric chromosomes in human and great apes”. I had so much fun with this project and paper. It will be hard to summarize in a thread, but I’ll try www.biorxiv.org/content/10.6... [1/21]
February 2, 2026 at 2:58 PM
A good, enjoyable paper on AUROC vs AUPRC under class imbalance. In a nutshell, AUPRC's superiority is a myth.

AUROC with bootstrapping all the way!

arxiv.org/abs/2401.06091
A Closer Look at AUROC and AUPRC under Class Imbalance
In machine learning (ML), a widespread claim is that the area under the precision-recall curve (AUPRC) is a superior metric for model comparison to the area under the receiver operating characteristic...
arxiv.org
January 27, 2026 at 6:59 PM
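The "AUROC with bootstrapping" suggestion in the post above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and is not taken from the linked paper: the function names `auroc` and `bootstrap_auroc_ci`, the percentile-interval method, and the defaults (`n_boot=1000`, `alpha=0.05`, `seed=0`) are all assumptions made for the example. Labels are assumed to be 0/1.

```python
import random


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) using a percentile bootstrap.

    Hypothetical helper for illustration: examples are resampled with
    replacement, and resamples containing a single class are skipped.
    """
    point = auroc(labels, scores)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # both classes present in this resample
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper


if __name__ == "__main__":
    y = [0, 0, 1, 1]
    s = [0.1, 0.4, 0.35, 0.8]
    print(bootstrap_auroc_ci(y, s))
```

The pairwise computation is quadratic in the number of examples, which keeps the sketch short; for large datasets a rank-based implementation from an established library would be the more practical choice.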
Reposted by Babak Alipanahi
TF-MINDI is out! A new method to learn cis-regulatory codes through rich embeddings of TF binding sites. TF-MINDI decomposes motif neighbourhoods, and works downstream of any sequence-to-function deep learning model. We deeply study the enhancer code in human neural development, check out the thread
January 15, 2026 at 12:32 PM
Reposted by Babak Alipanahi
Vaccines, the most impressive public health intervention in medical history, and where we could be headed if there were not efforts to negate truth, facts, and evidence
A great open-access review and perspective by @scientificdiscovery.dev
January 7, 2026 at 8:17 PM
Reposted by Babak Alipanahi
Now published in Algorithms for Molecular Biology: link.springer.com/article/10.1.... Key message: a tiny CNN model with 7k parameters can capture main splice signals across vertebrates+insect and halves the minimap2 & miniprot junction error rate. I always use this new feature now.
Preprint on "Improving spliced alignment by modeling splice sites with deep learning". It describes minisplice for modeling splice signals. Minimap2 and miniprot now optionally use the predicted scores to improve spliced alignment.
arxiv.org/abs/2506.12986
January 6, 2026 at 11:02 PM
Reposted by Babak Alipanahi
Now published in GigaScience: academic.oup.com/gigascience/.... Key messages: SVs are highly enriched in low-complexity/tandem-repeat regions and are harder to call. They behave differently from transposon insertions. Always stratify if you study SVs.
January 6, 2026 at 10:55 PM
Reposted by Babak Alipanahi
Excited to see this out www.nature.com/articles/s41...! Nonparametric kernel-based tests for spatially variable isoform usage in spatial transcriptomics. So many interesting examples in the CNS and cancer, we're only scratching the surface!
Mapping isoforms and regulatory mechanisms from spatial transcriptomics data with SPLISOSM - Nature Biotechnology
Differential isoform usage is identified with high statistical power from spatial transcriptomics data.
www.nature.com
January 6, 2026 at 7:12 PM
Reposted by Babak Alipanahi
Nature research paper: Uncovering the role of LINE-1 in the evolution of lung adenocarcinoma

go.nature.com/4oUHIPb
Uncovering the role of LINE-1 in the evolution of lung adenocarcinoma - Nature
Lung adenocarcinomas bearing the ID2 mutational signature display increased LINE-1 retrotransposon activity, which contributes to their fast evolutionary dynamics and aggressive phenotype.
go.nature.com
December 15, 2025 at 9:40 AM
Reposted by Babak Alipanahi
That’s a wrap on #SABCS25! Thank you to Dr. Lee Schwartzberg for presenting data demonstrating our platform’s ability to detect early stage breast cancer with high accuracy. #AI #RNA #earlydetection

Learn more here: www.exai.bio/publications...
December 12, 2025 at 5:35 PM
Reposted by Babak Alipanahi
Important new, large (N>28,000 women) randomized clinical trial of breast cancer screening: age-based vs risk-based by polygenic risk score, genomics
"opportunity to modernize screening"
jamanetwork.com/journals/jam...
Risk-Based vs Annual Breast Cancer Screening
This randomized clinical trial examines whether risk-based screening is a safe and effective alternative to annual mammography for detecting breast cancer in women 40 years and older.
jamanetwork.com
December 12, 2025 at 5:00 PM
Reposted by Babak Alipanahi
Even though the highest-profile names in today’s corporate Cambridge are in biotech and software, the influx of defense startups hearkens back to an earlier era — which, in 1922, saw the birth of Raytheon, now synonymous with the old guard of defense contractors. www.thecrimson.com/article/2025...
As Cambridge Faces a Life Sciences Downturn, Startups Turn to a New Industry: Warfare | News | The Harvard Crimson
As biotech firms shed jobs and life sciences funding dries up, policymakers have started to see defense technology as a way to buttress the Massachusetts economy. Industry experts say Cambridge may be...
www.thecrimson.com
December 8, 2025 at 3:41 PM
Reposted by Babak Alipanahi
SCIENCE SAVES LIVES.

Overall pediatric cancer survival rate increased from 63% in the mid-1970s to 87%‼️ between 2015 & 2021.

And this isn’t due to supplements, eating better or avoiding red food dye.

It’s due to science & industry working together to develop & approve therapies!
December 6, 2025 at 7:40 PM
Reposted by Babak Alipanahi
JASPAR 2026 is out 🎉

The new release massively expands the TF motif collections and adds a dedicated DeepLearning collection of motifs learned from deep learning models.

Database: jaspar.elixir.no
Paper (NAR): doi.org/10.1093/nar/...

🧵1/2
JASPAR: An open-access database of transcription factor binding profiles
JASPAR is the largest open-access database of curated and non-redundant transcription factor (TF) binding profiles from six different taxonomic groups.
jaspar.elixir.no
December 3, 2025 at 2:43 PM
Reposted by Babak Alipanahi
579 high-quality human genomes from @humanpangenome.bsky.social, Arab Pangenome and individual papers (CHM13, CN1, KSA001, I002C, YAO and KOREF1). Sequences available in the AGC format (3.7GB) and FM-index in the ropebwt3 format (20.3GB). For details, see github.com/lh3/human-asm
GitHub - lh3/human-asm: A collection of high-quality human genomes
A collection of high-quality human genomes. Contribute to lh3/human-asm development by creating an account on GitHub.
github.com
December 3, 2025 at 3:44 AM
A nice paper on distilling AI-based splicing models into much simpler additive models:

"[...] the distilled models achieve this without modeling RNA structure or feature interactions, indicating that [AI]-based splicing models recognize exons primarily through simple additive sequence features."
Interpretable Distillation Reveals that Deep-learning-based Splicing Models Suffer from Pervasive Confounders and Blind Spots [new]
Splicing models rely on confounders, failing on non-reference sequences, revealing training limits.
December 2, 2025 at 12:53 AM
Reposted by Babak Alipanahi
Abbott Laboratories is nearing a potential acquisition of Exact Sciences Corp, in what would be its largest deal in nearly a decade, people familiar with the matter said. www.bloomberg.com/news/article...
Abbott Nears Deal for Cancer Test Maker Exact Sciences
Abbott Laboratories is nearing a potential acquisition of medical-testing company Exact Sciences Corp., in what would be its largest deal in nearly a decade, people familiar with the matter said.
www.bloomberg.com
November 19, 2025 at 8:24 PM
Reposted by Babak Alipanahi
Are you an early-stage graduate student (2nd or 3rd year) or early-stage postdoc based in the US or Canada, working primarily in Drosophila? Would you like to help improve the experience of all trainees working in Drosophila research? If so, read on.

(Please repost to reach a broad audience.)
November 12, 2025 at 4:49 AM
Reposted by Babak Alipanahi
First time on Bsky and first big announcement!

I am excited to announce that our new study explaining the missing heritability of many phenotypes using WGS data from ~347,000 UK Biobank participants has just been published in @Nature.

Our manuscript is here: www.nature.com/articles/s41....
Estimation and mapping of the missing heritability of human phenotypes - Nature
WGS data were used from 347,630 individuals with European ancestry in the UK Biobank to obtain high-precision estimates of coding and non-coding rare variant heritability for 34 co...
www.nature.com
November 12, 2025 at 5:57 PM
Reposted by Babak Alipanahi
Yesterday our co-founder and CSO @babak-a.bsky.social presented at the Biotech-Pharma Statistics Workshop (BBSW) about the powerful combination of AI and cell-free RNA to detect early-stage lung cancer in the blood.
November 7, 2025 at 5:04 PM
Reposted by Babak Alipanahi
This version of the Persian dish tahchin incorporates common Thanksgiving ingredients. It is deeply savory and buttery, like stuffing, and some may say even better because it has a whole lot more texture coming from the crispy rice that everyone will be fighting over.
Caramelized Onion, Cranberry and Rosemary Tahchin Recipe
Tahchin is a Persian rice dish in which the rice is mixed with yogurt, oil, egg yolks and saffron and baked until a golden crust forms at the bottom (Persians refer to this as the tahdig). The rice on the inside becomes buttery and almost cake-like and is often layered with chicken and barberries, a tart dried fruit that has a beautiful crimson color. This version incorporates common Thanksgiving ingredients like rosemary, sweet-tart cranberries and buttery onions to make a striking dish that feels more like a main than a side.
nyti.ms
November 7, 2025 at 4:15 PM
Reposted by Babak Alipanahi
Introducing Molview - the ipython/jupyter widget version of nano-protein-viewer🔍:
November 4, 2025 at 2:00 AM
Reposted by Babak Alipanahi
Neurodevelopmental Outcomes of 3-Year-Old Children Exposed to Maternal Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) Infection in Utero

I hate to say “I told you so…” but nevertheless: I told you so.
journals.lww.com
October 31, 2025 at 2:52 AM
Reposted by Babak Alipanahi
Delighted to see our method, PRSformer, at #NeurIPS2025! PRSformer is an AI model for population-scale disease-risk prediction from individual genomes. It lays the groundwork for phenome-wide risk prediction.

www.biorxiv.org/content/10.1...
PRSformer: Disease Prediction from Million-Scale Individual Genotypes
Predicting disease risk from DNA presents an unprecedented emerging challenge as biobanks approach population scale sizes (N>10^6 individuals) with ultra-high-dimensional features (L>10^5 genotypes). Cu...
www.biorxiv.org
October 28, 2025 at 10:23 PM
Simultaneously comical and tragic.
I'm so tired.
October 22, 2025 at 6:56 PM