Junhao (Bear) Xiong
junhaobearxiong.bsky.social
Machine learning for computational biology. PhD student at Berkeley EECS.
The guided library in round 2 showed significantly higher activity than the initial unguided library in the experimental base editing assay.
May 31, 2025 at 3:46 PM
We didn't just validate in silico - we also synthesized & tested proteins in the lab. We used ProteinGuide to engineer an adenine base editor for high activity: generated 2,000 variants → tested in bacteria → used results to guide 2,000 new designs.
May 31, 2025 at 3:46 PM
In our third task, we demonstrate the generality of ProteinGuide beyond amino acid sequences, to structure tokens. In particular, we guide ESM3 to generate backbone structures (as tokens) with specified CATH fold class labels.
May 31, 2025 at 3:46 PM
In our second task, we guided ESM3 to re-design enzyme sequences so that they are predicted to belong to specific enzyme classes, based on a published classifier for enzyme commission number, CLEAN.
May 31, 2025 at 3:46 PM
In our first task, we guided ProteinMPNN with experimental stability measurements from the @grocklin.bsky.social lab to generate amino acid sequences encoding proteins that are more stable than those ProteinMPNN generates on its own.
May 31, 2025 at 3:46 PM
We leverage the fact that MLMs (e.g., ESM3), OA-AR models (e.g., ProteinMPNN), and masking-based diffusion models are actually equivalent. This allows us to apply our previously developed guidance methodology for discrete diffusion and flow models to MLMs and OA-AR models.
May 31, 2025 at 3:46 PM
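To make the guidance idea concrete, here is a minimal toy sketch of predictor-guided unmasking in random order. It is not the ProteinGuide implementation: `toy_model_probs`, `toy_predictor`, `guided_sample`, the `MASK` id, and the `gamma` guidance strength are all hypothetical names chosen for illustration, and the "pretrained model" is just a uniform distribution. The only point it shows is the Bayes-rule reweighting p(token | context, y) ∝ p(token | context) · p(y | context with token)^gamma applied at each unmasking step.

```python
import numpy as np

MASK = -1  # hypothetical mask token id (assumption, not from any real model)

def toy_model_probs(seq, pos, vocab_size):
    """Stand-in for a pretrained model's p(x_pos | unmasked tokens); uniform here."""
    return np.full(vocab_size, 1.0 / vocab_size)

def toy_predictor(seq, target_token=0):
    """Stand-in for p(y=1 | partially masked seq); grows with the fraction of target_token."""
    frac = np.mean(np.asarray(seq) == target_token)
    return 0.05 + 0.9 * frac

def guided_sample(length, vocab_size, rng, gamma=1.0):
    """Unmask positions in random order, reweighting each step by the predictor."""
    seq = [MASK] * length
    for pos in rng.permutation(length):
        prior = toy_model_probs(seq, pos, vocab_size)
        likelihood = np.empty(vocab_size)
        for token in range(vocab_size):
            candidate = list(seq)
            candidate[pos] = token
            likelihood[token] = toy_predictor(candidate)
        # Bayes-rule reweighting; gamma=0 recovers the unguided model.
        posterior = prior * likelihood ** gamma
        posterior /= posterior.sum()
        seq[pos] = int(rng.choice(vocab_size, p=posterior))
    return seq

rng = np.random.default_rng(0)
guided = [guided_sample(10, 4, rng, gamma=1.0) for _ in range(200)]
unguided = [guided_sample(10, 4, rng, gamma=0.0) for _ in range(200)]
guided_frac = float(np.mean(np.asarray(guided) == 0))
unguided_frac = float(np.mean(np.asarray(unguided) == 0))
print(f"fraction of target token: guided={guided_frac:.2f}, unguided={unguided_frac:.2f}")
```

In a real setting the uniform stand-in would be replaced by a pretrained model's conditional distribution and the toy predictor by a property predictor trained on partially masked sequences; how ProteinGuide does this in detail is not specified in this thread.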
Guide your favorite protein generative model with experimental data? Meet ProteinGuide - a method to condition pre-trained models on properties without retraining. We validated it both in silico by guiding ProteinMPNN and ESM3 on 3 tasks and in vitro by engineering base editors.
May 31, 2025 at 3:46 PM