David van Dijk
@vandijklab.bsky.social
Learning the rules of life.
Assistant Professor of Medicine and Computer Science @ Yale
We thank our amazing team at Yale, Google Research, and Google DeepMind!
April 18, 2025 at 2:14 PM
Beyond standard training, we used Reinforcement Learning (RL) 🤖 to fine-tune C2S-Scale.
Using GRPO + biological rewards, we specifically improved:
• Perturbation prediction accuracy 🧪
• Biological Q&A relevance ❓
Aligning LLMs with biological goals! ✅
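The GRPO recipe can be sketched minimally: sample a group of completions per prompt, score each with a biological reward, and normalize rewards within the group to get advantages. The reward function and gene lists below are illustrative toys, not the paper's actual implementation.

```python
import numpy as np

def biological_reward(prediction, target_genes):
    """Toy reward (illustrative): fraction of target response genes
    recovered in the model's predicted gene list."""
    predicted = set(prediction.split())
    return len(predicted & set(target_genes)) / max(len(target_genes), 1)

def grpo_advantages(rewards, eps=1e-8):
    """GRPO-style group-relative advantage: normalize each sampled
    completion's reward by the group's mean and standard deviation."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# One prompt, a group of four sampled completions (toy gene lists)
target = ["JUN", "FOS", "EGR1"]
samples = ["JUN FOS EGR1", "JUN FOS", "MYC", "FOS"]
rewards = [biological_reward(s, target) for s in samples]
advantages = grpo_advantages(rewards)
```

Completions that recover more of the target genes get positive advantages relative to their group, which is the signal a GRPO-style update would reinforce.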
April 18, 2025 at 2:14 PM
Size matters! 📈 We observed clear scaling laws: as model size increased from 410M → 27B parameters, performance consistently improved across tasks.
This confirms that LLMs learn better biological representations at scale using the C2S approach. Even works with efficient LoRA tuning! 💪
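LoRA's efficiency comes from freezing the pretrained weight and training only a low-rank update W + (α/r)·BA. A minimal NumPy sketch, with illustrative dimensions and the standard zero-initialized up-projection:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 64, 8, 16

W = rng.normal(size=(d_out, d_in))          # frozen pretrained weight
A = rng.normal(scale=0.01, size=(r, d_in))  # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection (zero init)

def lora_forward(x):
    """Adapted layer: frozen W plus a low-rank update scaled by alpha/r."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
# With B zero-initialized, the adapted layer starts exactly equal
# to the frozen pretrained layer; training then only updates A and B.
assert np.allclose(lora_forward(x), W @ x)
```

Only A and B (2·r·d parameters) receive gradients, which is why tuning even a 27B model this way stays cheap.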
April 18, 2025 at 2:14 PM
And it works! 🎉 C2S-Scale achieves SOTA performance, surpassing specialized single-cell models AND general LLMs:
• 🎯 Cell type annotation
• 🧪 Predicting perturbation responses
• ✍️ Generating dataset summaries from cells
• 🗺️ Inferring spatial relationships
• ❓ Answering complex biological questions
April 18, 2025 at 2:14 PM
To truly "teach" biology to LLMs, we built a massive corpus: Over 1 BILLION tokens! 📚
This wasn't just cell sentences – it included:
• 🧬 50M+ cell profiles (human/mouse)
• 🏷️ Annotations & Metadata
• 📄 Biological Text (abstracts, etc.)
Result? One model, many tasks!
April 18, 2025 at 2:14 PM
We enable LLMs to "read" biology via Cell2Sentence (C2S) 🧬➡️📝: ranking each cell's genes by expression turns its profile into a sentence of gene names.
This lets us leverage massive pre-trained models, unifying transcriptomic data with biological text (annotations, papers) for richer understanding.
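A minimal sketch of the C2S transformation, assuming a simple top-k rank encoding; the gene names and counts below are toy values:

```python
import numpy as np

def cell_to_sentence(expression, gene_names, top_k=100):
    """Rank genes by expression (descending) and join their names
    into a space-separated 'cell sentence'."""
    order = np.argsort(expression)[::-1][:top_k]
    return " ".join(gene_names[i] for i in order)

# Toy example: 5 genes, made-up counts for one cell
genes = ["CD3D", "MS4A1", "GNLY", "CD14", "LYZ"]
counts = np.array([5.0, 0.0, 1.0, 9.0, 7.0])
print(cell_to_sentence(counts, genes, top_k=3))  # → CD14 LYZ CD3D
```

Because the result is plain text, it can be concatenated with annotations and paper abstracts and fed to any pretrained LLM unchanged.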
April 18, 2025 at 2:14 PM
This work highlights the power of CLM-based intelligent adaptive solvers for scalable operator learning of dynamical systems. Imagine more efficient and accurate simulations for everything from fluid dynamics to climate modeling! 🌍
February 13, 2025 at 7:23 PM
📈 Benchmarked on diverse systems, COAST consistently outperforms state-of-the-art methods in both accuracy and efficiency!
February 13, 2025 at 7:23 PM
🔑 Key finding: COAST generates variable step sizes that intelligently adapt to the current system's complexity! Smaller steps in complex regions, larger steps in simpler ones. Across systems, more complex dynamics get finer time resolution.
February 13, 2025 at 7:23 PM
Current ML methods for PDEs often use fixed time steps, which is inefficient, especially for complex dynamics. COAST, powered by a causal language model (CLM), predicts both the solution AND the optimal time step. 🧠
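The idea can be illustrated with a toy rollout loop. Here a simple heuristic stands in for COAST's learned CLM predictor, choosing smaller steps where the dynamics change fast; the dynamics, bounds, and step rule are all illustrative:

```python
import numpy as np

def f(y):
    """Toy dynamics: fast decay near the start, slow near equilibrium."""
    return -5.0 * y

def predict_step(y, dt_min=1e-3, dt_max=0.2):
    """Stand-in for the learned predictor: pick a step size inversely
    proportional to the local rate of change, then take an Euler step.
    (COAST's CLM predicts the solution and step size jointly.)"""
    rate = abs(f(y))
    dt = float(np.clip(1.0 / (1.0 + rate), dt_min, dt_max))
    return y + dt * f(y), dt

y, t, steps = 1.0, 0.0, []
while t < 1.0:
    y, dt = predict_step(y)
    t += dt
    steps.append(dt)
```

Early steps (large |f|) come out small; as the trajectory settles, the step size grows toward its cap, which is exactly the adaptive behavior a fixed-step solver cannot express.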
February 13, 2025 at 7:23 PM
Excited to share our new preprint: COAST: Intelligent Time-Adaptive Neural Operators! 🌊 We introduce a novel neural operator that learns to dynamically and intelligently adjust time step sizes for modeling dynamical systems from data. 🚀 doi.org/10.48550/arX...
February 13, 2025 at 7:23 PM
🔥🧠🌌 Now accepted at #ICLR2025 !

How does complexity shape intelligence? 🤔

In our new paper "Intelligence at the Edge of Chaos", we explore the relationship between complex systems and the emergence of intelligence in AI models. Can complexity alone unlock smarter systems?
arxiv.org/abs/2410.02536
February 12, 2025 at 8:38 AM