Ethan Armand
ethan-armand.bsky.social
Bioinformatics PhD Student in the Ren lab at UCSD. Modeling gene regulation across species.
While they do predict a previously "unseen" cell type, fetal astrocytes are relatively similar to the central tendency of their dataset. The Pearson correlation between the dataset mean and fetal astrocytes (0.78) is greater than their leave-one-out validation.
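
The comparison above is a mean-baseline check: before crediting a model for predicting an "unseen" cell type, see how far the average of the other cell types already gets you. A rough sketch of that check follows, assuming a genes-by-cell-types expression matrix. The function name `mean_baseline_pearson` and the synthetic data are hypothetical and are not taken from the paper being discussed; the mean here also excludes the held-out cell type, which may differ from how the 0.78 figure was computed.

```python
import numpy as np


def mean_baseline_pearson(expr: np.ndarray, held_out: int) -> float:
    """Pearson r between one held-out cell type (column) and the
    mean profile of all remaining cell types."""
    others = np.delete(expr, held_out, axis=1)   # drop the held-out column
    baseline = others.mean(axis=1)               # per-gene mean of the rest
    return float(np.corrcoef(baseline, expr[:, held_out])[0, 1])


# Synthetic stand-in data (an assumption, not real expression values):
# every cell type shares a gene-level signal plus its own deviation.
rng = np.random.default_rng(0)
n_genes, n_cell_types = 2000, 8
shared = rng.normal(size=(n_genes, 1))
specific = rng.normal(size=(n_genes, n_cell_types))
expr = shared + 0.5 * specific

# One baseline correlation per held-out cell type.
baseline_r = [mean_baseline_pearson(expr, j) for j in range(n_cell_types)]
print([round(r, 2) for r in baseline_r])
```

If a model's held-out correlation does not clearly exceed this baseline, the "unseen cell type" result says little about generalization.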
April 26, 2025 at 7:18 PM
In contrast, Borzoi is a pure sequence model, which is easier to train, more performant, and predicts track resolution rather than expression abundance.

Decima, which also predicts pseudobulk, has a mean of 0.8 on the test set.
www.biorxiv.org/content/10.1...
Decoding sequence determinants of gene expression in diverse cellular and disease states
Sequence-to-function models that predict gene expression from genomic DNA sequence have proven valuable for many biological tasks, including understanding cis-regulatory syntax and interpreting non-c...
www.biorxiv.org
April 26, 2025 at 7:13 PM
I think their experiments don't offer strong support for this claim. Their model is trained across the whole genome, and when they use a validation set (held-out chromosomes), accuracy drops precipitously, to a mean of 0.75.

(Methods: Leave-out-chromosome evaluation)
April 26, 2025 at 7:10 PM