Ryan Z Friedman, PhD
rfriedman22.bsky.social
Gene regulation, machine learning, data viz | Postdoc with Cole Trapnell, comp bio PhD with Barak Cohen and Mike A White
🏳️‍🌈✡️ he/him
https://ryanzfriedman.com/
Thanks so much Tim!
January 9, 2025 at 6:40 AM
Thanks so much Jacob!
January 8, 2025 at 4:08 PM
Thank you Alex!!
January 8, 2025 at 5:33 AM
I first had this idea 6.5 years ago, early on in grad school. This journey has been a long one. I couldn't have done it without the support and guidance from @genologos.bsky.social and Barak Cohen, our collaboration with @corbolab.bsky.social, or help with the modeling and analysis from my coauthors
January 8, 2025 at 12:18 AM
Our solution is to let your model guide you. By focusing on uncertain sequences and testing them in functional genomic assays, you can *iteratively* train a model. We applied this to understand why the same DNA sequence motif has radically different effects in different contexts.
January 8, 2025 at 12:18 AM
Many thanks to Yawei Wu, Lloyd Tripp, and Daniel Lyon for their help with these analyses!

The manuscript itself is also restructured. Figs 2 and 4 are swapped, there's a 5th fig for the K562 analysis, and we reworked the Discussion.

Apologies if threading isn't the way to go on Bsky. 🧬🔄

8/8
February 20, 2024 at 7:00 PM
We analyzed a second pair of sequences with similar motif content. The model correctly predicts that the RORB motif must be 3' of the CRX motif.

These results show our model learns the context that distinguishes functionally non-equivalent motifs.

7/
February 20, 2024 at 6:59 PM
RORB motifs have a wide range of effects when mutated. Our model predicts this correctly & these effects are correlated with motif affinity.

Along with our other results, this shows active learning generates the data needed to learn regulatory grammars.

6/
February 20, 2024 at 6:58 PM
We have a new result showing that our model accurately predicts when CRX motifs increase vs. decrease expression. This is crucial because nc variants can change activity in unexpected directions, so it's important to have data that can tell when a motif has a positive vs negative effect.

5/
February 20, 2024 at 6:58 PM
Our experiments suggest that inactive sequences are low-information training examples. This is important because large libraries derived from random DNA are mostly inactive seqs. We think iteratively training models on smaller but more informative training data is more effective

4/
February 20, 2024 at 6:57 PM
When we did many rounds, active learning was more efficient, approached the upper bound with less data, and enriched for positive examples!

This demonstrates that active learning is broadly effective and illustrates that enriching for active sequences yields more informative training data.

3/
February 20, 2024 at 6:57 PM
We tested active learning in a second system using Nadav Ahituv and @jshendure.bsky.social's genome-wide MPRA in K562s. We downsampled the data, trained a CNN, then sampled from the remaining data. Active learning consistently outperformed random sampling across many starting conditions.

2/
February 20, 2024 at 6:57 PM
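
The selection step described in the post above (pick the next batch from the remaining pool, either by model uncertainty or at random) can be sketched roughly as follows. This is only an illustration, not the authors' pipeline: it assumes uncertainty is scored as the variance of predictions across an ensemble of models, which the thread does not specify, and the names `select_uncertain`, `select_random`, and the toy sequence ids are hypothetical.

```python
import random
import statistics


def select_uncertain(pool_ids, ensemble_preds, batch_size):
    """Return the batch_size pool ids whose ensemble predictions disagree most.

    ensemble_preds maps each id to a list of predictions, one per model.
    ASSUMPTION: disagreement (variance) across models stands in for uncertainty.
    """
    scores = {
        seq_id: statistics.pvariance(ensemble_preds[seq_id]) for seq_id in pool_ids
    }
    # Sort by descending uncertainty; ties keep their original pool order.
    ranked = sorted(pool_ids, key=lambda seq_id: scores[seq_id], reverse=True)
    return ranked[:batch_size]


def select_random(pool_ids, batch_size, seed=0):
    """Baseline: draw the next batch uniformly at random from the pool."""
    rng = random.Random(seed)
    return rng.sample(pool_ids, batch_size)


# Toy pool: four hypothetical sequences, three model predictions each.
pool = ["seqA", "seqB", "seqC", "seqD"]
preds = {
    "seqA": [1.0, 1.0, 1.0],  # models agree: low uncertainty
    "seqB": [0.0, 2.0, 4.0],  # models disagree strongly
    "seqC": [1.0, 1.5, 2.0],  # moderate disagreement
    "seqD": [3.0, 3.0, 3.1],  # near agreement
}
next_batch = select_uncertain(pool, preds, batch_size=2)
baseline_batch = select_random(pool, batch_size=2)
```

In a full loop, the selected batch would be measured (or, in a downsampling benchmark, looked up in the held-out data), added to the training set, and the model retrained before the next round.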
I should note that I set up my folders somewhere around my third year of grad school and haven't meaningfully reorganized them since then, so I'm open to a complete overhaul.
January 17, 2024 at 12:55 AM
Next time I meet a techie asking how to move into compbio I’ll connect them with you!
December 5, 2023 at 4:30 PM
Yeah, I struggle with finding a polite way to say “go learn a bunch of biology or take a huge pay cut to be an entry level bioinformatician for a few years” but there is definitely a mode of thought among some techies that they can watch 20 hours of YouTube videos and call it good
December 5, 2023 at 4:07 PM
Agreed that there is strong cell to cell variability. But changes in e.g. the ZRS enhancer of Shh can cause loss of limbs or gain of extra digits. Some of that is due to changes in Shh in space/time, but some is also due to how *much* Shh is produced
October 11, 2023 at 9:21 PM
hmm... A difference of 2 vs 3 fold change can definitely matter! Over/underactive CREs can cause developmental defects and disease. Agreed that cell culture won't tell you space+time, but it can tell you how sequence features encode activity.
October 11, 2023 at 9:13 PM