Etowah Adams
@etowah0.bsky.social
enjoying and bemoaning biology. phd student @columbia prev. @harvardmed @ginkgo @yale
Thank you for your help interpreting features :) It was really special having the community squint at these features with us. Lots of weird features mean that everyone sees different/new things!
February 10, 2025 at 4:13 PM
Thanks to our coauthors Minji Lee, Steven Yu, and @moalquraishi.bsky.social!

Check out Elana Pearl's work on using SAEs on pLMs too!
x.com/ElanaPearl/s...
February 10, 2025 at 4:12 PM
Such hard-to-interpret but predictive features could result from biases and limitations in our datasets and models. However, there is another intriguing possibility: these features may correspond to biological mechanisms that have yet to be discovered.
February 10, 2025 at 4:12 PM
Not all predictive features are easily interpretable. For instance, a top predictive feature for membrane localization, latent L28/3154, predominantly activates on poly-alanine sequences, whose functional relevance remains unclear.
February 10, 2025 at 4:12 PM
When trained to predict thermostability, the linear models assign their most positive weights to features that correlate with hydrophobic amino acids and their most negative weights to a feature activating on glutamate, an amino acid linked to instability.
February 10, 2025 at 4:12 PM
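As a toy illustration of reading off those weights, the sketch below fits a ridge regression on pooled SAE activations and sorts the coefficients in both directions. All data, sizes, and names are illustrative stand-ins, not the paper's actual setup.

```python
# Hedged sketch: rank SAE features by linear-regression weight for thermostability.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.random((300, 4096))   # toy stand-in for pooled per-protein SAE activations
y = rng.random(300)           # toy stand-in for thermostability labels

reg = Ridge(alpha=1.0).fit(X, y)
order = np.argsort(reg.coef_)
print("most negative (instability-associated) latents:", order[:5])
print("most positive (stability-associated) latents:", order[-5:])
```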
For the extracellular class, we uncover features that activate on signal peptides, which trigger the cellular mechanisms that transport proteins outside the cell.
February 10, 2025 at 4:12 PM
These sequence motifs consist of K/R-rich regions and act like a “passport” for proteins to enter the nucleus. As many NLS variants remain unknown, our SAE-based approach could become a novel method to aid their discovery and characterization.
February 10, 2025 at 4:12 PM
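For intuition, the classical monopartite NLS consensus is often written K(K/R)X(K/R); the toy scan below finds it in a sequence containing the SV40 large-T NLS (PKKKRKV). Real NLS variants are far more diverse, which is exactly why a discovery method is useful.

```python
# Illustrative only: scan for the classical monopartite NLS consensus K(K/R)X(K/R).
import re

NLS_PATTERN = re.compile(r"K[KR].[KR]")

seq = "MSTPKKKRKVEDP"  # contains the SV40 large-T NLS, PKKKRKV
for m in NLS_PATTERN.finditer(seq):
    print(m.start(), m.group())
```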
For example, we trained a linear classifier on subcellular localization labels (nucleus, cytoplasm, mitochondrion…). Excitingly, the most highly weighted features for the nucleus class recognize nuclear localization signals (NLS).
February 10, 2025 at 4:12 PM
Having established methods to interpret features, we sought to associate features with functional properties and, more generally, to understand pLM performance on downstream tasks. We train linear models on SAE features and analyze the features that contribute most to task performance.
February 10, 2025 at 4:12 PM
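A minimal sketch of such a probe, assuming SAE activations have already been pooled into one feature vector per protein (the data, sizes, and L1-regularized classifier here are hypothetical stand-ins):

```python
# Hypothetical sketch: probe SAE features with a linear classifier on
# subcellular-localization labels, then rank features by learned weight.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_proteins, n_latents = 500, 4096          # stand-in sizes
X = rng.random((n_proteins, n_latents))    # e.g. max-pooled SAE activations per protein
y = rng.integers(0, 3, n_proteins)         # toy labels: nucleus / cytoplasm / mitochondrion

clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)

# Features with the largest positive weight for the "nucleus" class are the
# candidates to inspect (e.g. do they fire on NLS-like K/R-rich motifs?).
nucleus = 0
top = np.argsort(clf.coef_[nucleus])[::-1][:10]
print("top nucleus-associated latents:", top)
```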
We categorize SAE latents by family specificity and by activation pattern across layers. We find more family-specific features in middle layers, and more single-token-activating (point) features in later layers. We also study how SAE hyperparameters change which features are uncovered.
February 10, 2025 at 4:12 PM
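One simple way to score family specificity, sketched below, is the fraction of a latent's total activation mass that falls on its single most-activating family. This is an illustrative metric; the paper's exact definition may differ.

```python
# Hedged sketch: how family-specific is one SAE latent?
import numpy as np

def family_specificity(activations, families):
    """Fraction of a latent's activation mass on its most-activating family."""
    mass = {}
    for a, fam in zip(activations, families):
        mass[fam] = mass.get(fam, 0.0) + float(a)
    total = sum(mass.values())
    return max(mass.values()) / total if total > 0 else 0.0

acts = np.array([0.9, 0.8, 0.0, 0.1])        # toy per-protein activations of one latent
fams = ["kinase", "kinase", "globin", "globin"]
print(family_specificity(acts, fams))         # ~0.94: mostly kinase-specific
```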
We find many features appear highly specific to certain protein families, suggesting pLMs contain an internal notion of protein families that mirrors known homology-based families. Other features correspond to broader, family-independent coevolutionary signals.
February 10, 2025 at 4:12 PM
(Regarding the recent discourse surrounding SAEs, we refer to this thread from @banburismus_)
x.com/banburismus_...
February 10, 2025 at 4:12 PM
Using sparse autoencoders (SAEs), we find many interpretable features in a protein language model (pLM), ESM-2, ranging from secondary structure to conserved domains to context-specific properties. Explore them at interprot.com!
February 10, 2025 at 4:12 PM
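For readers curious what an SAE on pLM embeddings looks like concretely, here is a minimal Python sketch. The dimensions (1280 matches ESM-2 650M's hidden size; 16384 latents) and the plain L1 sparsity penalty are illustrative assumptions, not necessarily the paper's exact architecture or training recipe.

```python
# Minimal SAE sketch: widen pLM embeddings into a large, sparse latent space,
# then reconstruct them. The sparse latents are the "features" discussed above.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model=1280, d_hidden=16384):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        z = torch.relu(self.encoder(x))   # non-negative latent activations
        return self.decoder(z), z

sae = SparseAutoencoder()
x = torch.randn(8, 1280)                  # stand-in for per-residue ESM-2 embeddings
x_hat, z = sae(x)
# Reconstruction loss plus an L1 penalty that pushes most latents toward zero.
loss = ((x_hat - x) ** 2).mean() + 1e-3 * z.abs().mean()
loss.backward()
```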
For more background on mechanistic interpretability and protein language models, check out this previous thread on our work (X link, sorry) x.com/liambai21/st...
February 10, 2025 at 4:12 PM