Jason Weston
jasonweston.bsky.social

Senior Director, Research Scientist @ Meta FAIR + Visiting Prof @ NYU.
Pretrain+SFT: NLP from Scratch (2011). Multilayer attention+position encode+LLM: MemNet (2015). Recent (2024): Self-Rewarding LLMs & more!


Reposted by Jason Weston

Training Large Language Models to Reason in a Continuous Latent Space

Introduces a new paradigm for LLM reasoning called Chain of Continuous Thought (COCONUT).

Rather than decoding each reasoning step into a word token, COCONUT feeds the last hidden state (a continuous thought) directly back in as the input embedding for the next step.

arxiv.org/abs/2412.06769
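
A minimal sketch of the feedback loop described in the post, not the paper's implementation. The tiny tanh "model", the weights `W`, the embedding table `EMBED`, and the helper names `step`, `decode`, `token_chain`, and `continuous_chain` are all hypothetical, chosen only to contrast decoding to a token at every step with reusing the hidden state as the next input embedding. It assumes the hidden state and the input embedding have the same dimension, which the feedback requires.

```python
import math

# Toy values for illustration only (assumptions, not taken from the paper).
EMBED = {0: [1.0, 0.0], 1: [0.0, 1.0]}   # token id -> input embedding
W = [[0.5, -0.3], [0.8, 0.2]]            # hypothetical stand-in for model weights


def step(x):
    """Stand-in for one forward pass: input embedding -> last hidden state."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]


def decode(h):
    """Pick the token whose embedding has the largest dot product with h."""
    return max(EMBED, key=lambda t: sum(a * b for a, b in zip(EMBED[t], h)))


def token_chain(start_token, n_steps):
    """Token-level reasoning: decode a token each step, then re-embed it."""
    x = EMBED[start_token]
    for _ in range(n_steps):
        x = EMBED[decode(step(x))]   # hidden state collapses to a discrete token
    return x


def continuous_chain(start_token, n_steps):
    """Continuous thought: the hidden state itself is the next input embedding."""
    x = EMBED[start_token]
    for _ in range(n_steps):
        x = step(x)                  # no decoding and no re-embedding
    return x
```

In `token_chain` the state passed between steps is always one of the rows of `EMBED`, while in `continuous_chain` it can be any vector the model produces, which is the distinction the post draws.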
