Gwen Cheni
@gwencheni.bsky.social
Building stealth AI+bio. Prev @KhoslaVentures @indbio @sosv🧬💻 @ucsf🌉 @jpmorgan @GoldmanSachs @yale @UChicago @LMU_Muenchen
Emergent properties:

Thinking time steadily improved throughout the training process 😳
January 21, 2025 at 2:19 AM
In addition to being open source, DeepSeek-R1 is significant because it’s trained with pure reinforcement learning (RL), with no supervised fine-tuning (SFT) “cold start.” Reminiscent of AlphaZero (which mastered Go, shogi, and chess from scratch via self-play, without learning from human grandmaster games).
January 21, 2025 at 2:18 AM
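A minimal Python sketch of what skipping the SFT cold start looks like in practice for DeepSeek-R1-Zero: RL runs directly on the base model with rule-based rewards, an accuracy check on the final answer plus a format check for the <think>/<answer> tags from the paper's prompt template. The equal weighting of the two terms is an assumption here.

```python
import re

# Rule-based rewards of the kind DeepSeek-R1-Zero's RL uses in place of
# an SFT "cold start". Tag names follow the paper's prompt template;
# the 1:1 weighting of the two terms is an assumption.

def accuracy_reward(completion: str, gold: str) -> float:
    # Reward the final answer only if it matches the reference exactly.
    m = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    return 1.0 if m and m.group(1).strip() == gold else 0.0

def format_reward(completion: str) -> float:
    # Reward keeping the reasoning inside <think> tags before the answer.
    ok = re.search(r"<think>.*?</think>\s*<answer>.*?</answer>",
                   completion, re.DOTALL)
    return 1.0 if ok else 0.0

def total_reward(completion: str, gold: str) -> float:
    return accuracy_reward(completion, gold) + format_reward(completion)

print(total_reward("<think>17 * 24 = 408</think> <answer>408</answer>", "408"))  # 2.0
```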
“Unlike solutions relying on superior LLMs for data synthesis, rStar-Math leverages smaller language models (SLMs) with Monte Carlo Tree Search (MCTS) to establish a self-evolutionary process, iteratively generating higher-quality training data.” 2/n
January 12, 2025 at 4:46 PM
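A compressed sketch of that self-evolution loop (not rStar-Math's actual code): MCTS searches over model-proposed reasoning steps, and the highest-value verified trajectory is harvested as a new training example. `propose_steps` and `is_correct` are hypothetical stand-ins for the SLM policy and the answer verifier.

```python
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

    def uct(self, c=1.4):
        # Upper-confidence bound: balance exploitation and exploration.
        if self.visits == 0:
            return float("inf")
        return self.value / self.visits + c * math.sqrt(
            math.log(self.parent.visits) / self.visits)

def search(question, propose_steps, is_correct, iters=200):
    root = Node([question])
    for _ in range(iters):
        node = root
        # Selection: descend via UCT until reaching a leaf.
        while node.children:
            node = max(node.children, key=Node.uct)
        # Expansion: the SLM proposes candidate next reasoning steps.
        for step in propose_steps(node.state):
            node.children.append(Node(node.state + [step], parent=node))
        # Evaluation: score a (possibly partial) trajectory with the verifier.
        leaf = random.choice(node.children) if node.children else node
        reward = 1.0 if is_correct(leaf.state) else 0.0
        # Backpropagation: update visit counts and values along the path.
        while leaf:
            leaf.visits += 1
            leaf.value += reward
            leaf = leaf.parent
    # Harvest the highest-value trajectory as a verified training example.
    best = max(root.children, key=lambda n: n.value / max(n.visits, 1))
    return best.state
```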
The reduced computation cost comes from needing only a fraction of the parameters of traditional MLPs: 5/n
January 8, 2025 at 2:05 AM
By incorporating signal processing, a Fourier Analysis Network (FAN)-based Transformer beats the MLP-based Transformer, LSTM, and Mamba at time-series forecasting and language modeling, while using fewer parameters and floating-point operations (FLOPs). 3/n
January 8, 2025 at 2:05 AM
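A minimal PyTorch sketch of the FAN layer idea: a fraction of the output dimensions come from cos/sin of a single shared linear projection (no bias, no activation parameters), which is where the parameter savings in 5/n come from; the rest come from an ordinary MLP branch. The split ratio `p` and the GELU activation are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FANLayer(nn.Module):
    """Sketch of a Fourier Analysis Network (FAN) layer.

    A fraction p of the output dims come from a Fourier branch (cos and
    sin of one shared, bias-free projection); the rest come from a
    standard MLP branch. Reusing one weight matrix for both cos and sin
    is what cuts the parameter count versus a plain MLP layer.
    """

    def __init__(self, dim_in: int, dim_out: int, p: float = 0.25):
        super().__init__()
        d_fourier = int(p * dim_out)        # dims produced by cos + sin
        d_mlp = dim_out - 2 * d_fourier     # dims produced by the MLP branch
        self.fourier = nn.Linear(dim_in, d_fourier, bias=False)
        self.mlp = nn.Linear(dim_in, d_mlp)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.fourier(x)
        return torch.cat([torch.cos(f), torch.sin(f), self.act(self.mlp(x))], dim=-1)

# Illustrative parameter count vs. a plain Linear layer of equal width.
fan, plain = FANLayer(512, 512), nn.Linear(512, 512)
print(sum(p.numel() for p in fan.parameters()),    # 196864
      sum(p.numel() for p in plain.parameters()))  # 262656
```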
Sequential revisions: like an internal editor, the model self-evaluates and revises its own output. 5/n
January 3, 2025 at 5:17 PM
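A toy sketch of that sequential-revision loop; the three model calls below are hypothetical stubs standing in for whatever LLM API is in use.

```python
def generate(question: str) -> str:
    return f"draft answer to: {question}"          # stub model call

def critique(question: str, draft: str) -> str:
    return "OK"                                    # stub self-evaluation

def revise(question: str, draft: str, feedback: str) -> str:
    return draft + f" (revised per: {feedback})"   # stub revision call

def answer_with_revisions(question: str, max_rounds: int = 3) -> str:
    """Sequential revisions: the model acts as its own internal editor."""
    draft = generate(question)
    for _ in range(max_rounds):
        feedback = critique(question, draft)
        if feedback == "OK":       # self-evaluation accepts the draft
            break
        draft = revise(question, draft, feedback)
    return draft

print(answer_with_revisions("What is 17 * 24?"))
```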
Outcome Reward (the entire answer is thrown out if the final result is wrong, so the training signal is sparse) was inferior to Process Reward (rewards intermediate steps, learns step-level policies, allows flexible step-level segmentation, and carries more information entropy). 4/n
January 3, 2025 at 5:17 PM
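The contrast in sketch form; `step_score` is a hypothetical stand-in for a step-level verifier (a process reward model).

```python
def outcome_reward(final_answer: str, gold_answer: str) -> float:
    # Sparse signal: the whole trajectory gets one scalar; if the final
    # answer is wrong, every intermediate step is discarded with it.
    return 1.0 if final_answer == gold_answer else 0.0

def process_reward(steps: list[str], step_score) -> float:
    # Dense signal: each intermediate step is scored, so step-level
    # policies can be learned even when the final answer is wrong.
    return sum(step_score(s) for s in steps) / len(steps)

steps = ["let x = 408 / 17", "so x = 24"]
print(outcome_reward("24", "24"))            # 1.0 or 0.0, nothing between
print(process_reward(steps, lambda s: 0.9))  # graded credit per step
```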