@gregdnlp.bsky.social. I really like how they explore what happens during the alignment of LLMs with RLHF. This was so cool to see, having observed similar outcomes in my research.
@quanquangu.bsky.social. I really like how they explore new techniques for RLHF.
Aligning autoregressive pLMs to generate EGFR binders via Direct Preference Optimization (DPO), from the incredible @noeliaferruz.bsky.social, who gave a great talk as part of the MLSB workshop.