Yun S. Song
yun-s-song.bsky.social
Professor of EECS and Statistics at UC Berkeley. Mathematical and computational biologist.
An open-rank faculty search in AI + Engineering (Bioengineering included) at UC Berkeley.

Due date: Monday, Nov 3, 2025 at 11:59pm (PT)
Please help spread the news.

aprecruit.berkeley.edu/JPF05144
Assistant/Associate/Full Professor – Engineering + Artificial Intelligence - College of Engineering (host academic department(s) to be determined)
University of California, Berkeley is hiring. Apply now!
October 14, 2025 at 9:53 PM
Reposted by Yun S. Song
This is truly an incredible breakthrough IMO. Really exemplifies what you get when deep domain expertise (popgen/evolution/disease genetics in this case) fuses with cleverly crafted ML. What u get r sleek, well thought out architectures that absolutely destroy the behemoths. Wow!! 1/
September 22, 2025 at 8:34 AM
We are excited to share GPN-Star, a cost-effective, biologically grounded genomic language modeling framework that achieves state-of-the-art performance across a wide range of variant effect prediction tasks relevant to human genetics.
www.biorxiv.org/content/10.1...
(1/n)
September 22, 2025 at 5:29 AM
SINGER, our ARG inference method, is finally published and freely available online:

doi.org/10.1038/s415...

It was a long journey – 16 months from initial submission to acceptance. Is it just me, or has peer review gotten more arduous lately? 4+ rounds of review isn't so unusual these days...
Robust and accurate Bayesian inference of genome-wide genealogies for hundreds of genomes - Nature Genetics
SINGER is a method for creating ancestral recombination graphs to understand the genealogical history of genomes. The method has increased speed, and thus scalability, without sacrificing accuracy.
September 11, 2025 at 3:50 AM
Reposted by Yun S. Song
Hi Bluesky — Dedicating my first post to this work and software, led by the incredibly meticulous and capable @fandingzhou.bsky.social! An earlier version of this was shared at the 2022 Bioconductor Conference (bioc2022.bioconductor.org/schedule/).
September 5, 2025 at 1:32 PM
Reposted by Yun S. Song
Gene expression changes aren’t just about mean shifts — variability shifts matter too, especially for aging. We're thrilled to introduce QRscore, a flexible non-parametric framework for detecting shifts in mean and variance across conditions. doi.org/10.1016/j.cr...
September 5, 2025 at 2:15 AM
Reposted by Yun S. Song
In a new preprint we use deep learning on lineage trees to infer the functional form of the relationship between affinity and fitness that controls antibody evolution in germinal centers: arxiv.org/abs/2508.09871 🧵
Inference of germinal center evolutionary dynamics via simulation-based deep learning
B cells and the antibodies they produce are vital to health and survival, motivating research on the details of the mutational and evolutionary processes in the germinal centers (GC) from which mature...
August 16, 2025 at 10:56 PM
Antibodies are highly diverse, but most possible sequences are unstable or polyreactive. In this work, just published in Cell Syst., we propose a new source of data for modeling constraints from these properties. Our models show clear improvements in predicting Ab dysfunction. (1/n)
https://authors.elsevier.com/a/1lbX08YyDfuZWX
August 15, 2025 at 1:17 PM
Reposted by Yun S. Song
(1/4) 🧬 Why Sequence the Genomes of Earth’s Biodiversity?
The Earth BioGenome Project 🌍 is a global network of initiatives working together to create a complete genome library for all Eukaryotic life—from mushrooms 🍄 to mammals 🐘.
#biodiversity #genomes #sequence #earthbiogenome #education #stem
July 29, 2025 at 8:59 PM
Reposted by Yun S. Song
Germinal center clonal diversity trees as a musical score, a great image to start @victora.bsky.social's CCII seminar, "Replaying germinal center evolution on a quantified affinity landscape"
#GerminalCenter #Immunology
www.ccii.med.kyoto-u.ac.jp/en/event/the...
July 2, 2025 at 2:42 AM
Reposted by Yun S. Song
In vivo mapping of mutagenesis sensitivity of human enhancers

www.nature.com/articles/s41...
In vivo mapping of mutagenesis sensitivity of human enhancers - Nature
Human enhancers contain a high density of sequence features that are required for their normal in vivo function.
June 18, 2025 at 9:20 PM
The 2026 Probabilistic Modeling in Genomics (ProbGen) meeting will be held at UC Berkeley, March 25-28, 2026. We have an amazing list of keynote speakers and session chairs:
probgen2026.github.io

Please help spread the news.
Home - ProbGen 2026
June 6, 2025 at 5:52 PM
Reposted by Yun S. Song
Wanted to highlight our latest preprint--a huge effort by multiple people and labs, but led primarily by @wsdewitt.github.io, Tatsuya Araki, and Ashni Vora, in a very close wet-dry collaboration with @matsen.bsky.social’s lab at the Hutch

www.biorxiv.org/content/10.1...
Replaying germinal center evolution on a quantified affinity landscape
Darwinian evolution of immunoglobulin genes within germinal centers (GC) underlies the progressive increase in antibody affinity following antigen exposure. Whereas the mechanics of how competition be...
June 5, 2025 at 2:28 PM
Reposted by Yun S. Song
Check out CRISPRpedia, our resource on all things #CRISPR! The latest chapter is on CRISPR & ethics: innovativegenomics.org/crisprpedia/...

CRISPRpedia features 85+ original illustrations that are free to download & use for non-commercial purposes!

#STEMeducation #STEMed #bioethics #SciArt
June 4, 2025 at 5:13 PM
Reposted by Yun S. Song
How well can deep learning models predict the effect of modifying chromatin on gene expression???

Our work -- led by Sanjit Batra and Alan Cabrera when they were in @yun-s-song.bsky.social’s and Isaac Hilton’s labs -- tries to answer this.

🧵🧬🧪

elifesciences.org/reviewed-pre...
Predicting the effect of CRISPR-Cas9-based epigenome editing
May 30, 2025 at 2:45 AM
Reposted by Yun S. Song
New preprint in collaboration with @paulinanunezv.bsky.social supervised by @jonnyfrazer.bsky.social and Mafalda Dias – we propose a simple approach to improving zero-shot variant effect prediction in pre-existing protein and genome language models: 🧶 1/n

www.biorxiv.org/content/10.1...
From Likelihood to Fitness: Improving Variant Effect Prediction in Protein and Genome Language Models
Generative models trained on natural sequences are increasingly used to predict the effects of genetic variation, enabling progress in therapeutic design, disease risk prediction, and synthetic biolog...
May 26, 2025 at 5:30 PM
How can one efficiently simulate phylodynamics for populations with billions of individuals, as is typical in many applications, e.g., viral evolution and cancer genomics? In this work with M. Celentano, @wsdewitt.github.io, & S. Prillo, we provide a solution. doi.org/10.1073/pnas...
1/n
May 23, 2025 at 9:02 PM
Reposted by Yun S. Song
In a medical breakthrough, a team including IGI’s
@urnov.bsky.social & @giannikopoulosp.bsky.social created an on-demand #CRISPR therapy for an infant with a deadly gene mutation — developed, approved, and delivered to the patient in just 6 months.

Read more: ow.ly/G0Bg50VTonC

#RareDisease 🧬
May 15, 2025 at 5:04 PM
Reposted by Yun S. Song
Jennifer Doudna @jenniferdoudna.bsky.social @doudna-lab.bsky.social speaks with Cleo Abram on the history and future of #CRISPR 🧬. Watch here: youtu.be/0OXaanDHENI?...
You Can Fix Your DNA... Starting Now (feat. Nobel Prize Winner)
YouTube video by Cleo Abram
May 19, 2025 at 4:16 PM
Reposted by Yun S. Song
Overfitting is among the conceptually most interesting problems in machine learning.
I am happy about several new phenomena we began to understand with Pierfrancesco Urbani.
Alert: mostly non-rigorous! (Celebrating Jorge Kurchan)
web.stanford.edu/~montanar/OT...
April 30, 2025 at 8:23 PM
Reposted by Yun S. Song
If you want to check if a human gene has copy-number changes or lands in a complex region, try pangene.bioinweb.org. Recently updated with more and better assemblies.
April 26, 2025 at 1:06 AM
Thrilled to see my digital art on the cover of Trends Genet. The two binary strings represent reverse-complementary DNA sequences (00=A, 01=C, 10=G, 11=T) and the connecting rectangles represent “embeddings” learned by DNA language models. Pls check out our article as well: doi.org/10.1016/j.ti...
April 7, 2025 at 3:01 PM
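A note on the encoding in the cover-art post above: with 00=A, 01=C, 10=G, 11=T, complementing a base is the same as flipping both of its bits (A=00 <-> T=11, C=01 <-> G=10), so a reverse complement is a bit flip followed by reversing the order of the 2-bit pairs. The sketch below is only an illustration of that property. The function names are hypothetical, and only the 2-bit mapping itself comes from the post.

```python
# Illustrative sketch; only the 2-bit mapping is taken from the post.
# All function names here are hypothetical.
ENCODE = {"A": "00", "C": "01", "G": "10", "T": "11"}
DECODE = {bits: base for base, bits in ENCODE.items()}


def encode(seq: str) -> str:
    """Turn a DNA string into a binary string, two bits per base."""
    return "".join(ENCODE[base] for base in seq.upper())


def decode(bits: str) -> str:
    """Turn a binary string back into DNA, reading two bits at a time."""
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even")
    return "".join(DECODE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def reverse_complement_bits(bits: str) -> str:
    """Reverse-complement a sequence directly on its binary encoding."""
    # Complement: flip every bit (00 <-> 11, 01 <-> 10).
    flipped = "".join("1" if bit == "0" else "0" for bit in bits)
    # Reverse: reverse the order of the 2-bit pairs, not of single bits.
    pairs = [flipped[i:i + 2] for i in range(0, len(flipped), 2)]
    return "".join(reversed(pairs))


if __name__ == "__main__":
    forward = encode("AAC")
    reverse = reverse_complement_bits(forward)
    print(forward, reverse, decode(reverse))
```

Reversing pair by pair matters here: reversing individual bits would swap C (01) and G (10) a second time and give the wrong strand.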
In our updated TraitGym preprint (w/ @gonzalobenegas.bsky.social & Gökcen Eraslan), we evaluate Evo 2 on regulatory variants associated with human traits. We see marked performance gains with scale on Mendelian traits, although still a bit behind alignment-based methods.
doi.org/10.1101/2025...
1/n
March 4, 2025 at 7:54 PM
Reposted by Yun S. Song
Can DNA sequence models predict mutations affecting human traits?

We introduce TraitGym, a curated benchmark of causal regulatory variants for 113 Mendelian & 83 complex traits, and evaluate functional genomics and DNA language models. Joint work w/ Gökcen Eraslan and @yun-s-song.bsky.social 🧵👇
February 13, 2025 at 8:57 PM
Reposted by Yun S. Song
A month ago we @vevotherapeutics.bsky.social announced that we have generated the largest single-cell perturbation atlas in history, Tahoe-100M. Today, we announce that we will fully open-source Tahoe-100M in Feb, as part of a collaboration with NVidia health to train cell state models.
January 13, 2025 at 4:23 PM