Karim Farid
@kifarid.bsky.social
2.2K followers 800 following 24 posts
PhD student @ELLIS.eu @UniFreiburg with Thomas Brox and Cordelia Schmid. Understanding intelligence and cultivating its societal benefits. https://kifarid.github.io
kifarid.bsky.social
Generative models that assume the underlying distribution is continuous, for example, flow matching and common diffusion models.
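To make "continuous" concrete: flow matching regresses a velocity field that transports noise to data along a straight-line path. A minimal sketch of that objective (illustrative names and shapes, not from any particular codebase; `predict_velocity` stands in for the network):

```python
import numpy as np

def flow_matching_loss(predict_velocity, x1, rng):
    """One conditional flow-matching objective: regress the velocity
    that transports noise x0 to data x1 along a straight-line path."""
    x0 = rng.standard_normal(x1.shape)        # noise sample, same shape as data
    t = rng.uniform(size=(x1.shape[0], 1))    # per-sample time in [0, 1]
    xt = (1 - t) * x0 + t * x1                # point on the interpolation path
    target_v = x1 - x0                        # constant target velocity along the path
    pred_v = predict_velocity(xt, t)          # the network's velocity prediction
    return np.mean((pred_v - target_v) ** 2)  # simple MSE regression loss
```

Diffusion models differ in the path and parameterization, but share this "noise in, continuous regression target out" structure.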
kifarid.bsky.social
I really hope someone can revive continuous models for language. They’ve taken over the visual domain by far, but getting them to work in language still feels like pure alchemy.
Reposted by Karim Farid
simon.ging.ai
Excited to release our models and preprint: "Using Knowledge Graphs to harvest datasets for efficient CLIP model training"

We propose a dataset collection method using knowledge graphs and web image search, and create EntityNet-33M: a dataset of 33M images paired with 46M texts.
Using Knowledge Graphs to harvest datasets for efficient CLIP model training
Training high-quality CLIP models typically requires enormous datasets, which limits the development of domain-specific models -- especially in areas that even the largest CLIP models do not cover well...
arxiv.org
Reposted by Karim Farid
phillipisola.bsky.social
Over the past year, my lab has been working on fleshing out theory + applications of the Platonic Representation Hypothesis.

Today I want to share two new works on this topic:

Eliciting higher alignment: arxiv.org/abs/2510.02425
Unpaired learning of unified reps: arxiv.org/abs/2510.08492

1/9
kifarid.bsky.social
Orbis shows that the objective matters.
Continuous modeling yields more stable and generalizable world models, yet true probabilistic coverage remains a challenge.

Immensely grateful to my co-authors @arianmousakhan.bsky.social, Sudhanshu Mittal, and Silvio Galesso, and to @thomasbrox.bsky.social
kifarid.bsky.social
Under the hood 🧠

Orbis uses a hybrid tokenizer with semantic + detail tokens that work in both continuous and discrete spaces.
The world model then predicts the next frame by gradually denoising or unmasking it, using past frames as context.
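The prediction loop described above can be sketched roughly like this (a hedged toy sketch, not the Orbis code; `denoise_step` is a placeholder for whatever denoising/unmasking update the model applies, conditioned on past frames):

```python
import numpy as np

def rollout(denoise_step, context_frames, horizon, num_steps, rng):
    """Autoregressive generation: each new frame starts as pure noise
    and is gradually denoised, conditioned on the frames so far."""
    frames = list(context_frames)
    for _ in range(horizon):
        x = rng.standard_normal(frames[-1].shape)  # start the next frame from noise
        for k in range(num_steps, 0, -1):
            t = k / num_steps                      # current noise level
            x = denoise_step(x, t, frames)         # one denoising/unmasking update
        frames.append(x)                           # commit the finished frame as context
    return frames
```

The key property is that denoising happens per frame while conditioning happens across frames, so the same loop works whether the per-frame update is continuous (denoising) or discrete (unmasking).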
kifarid.bsky.social
Realistic and Diverse Rollouts 4/4
kifarid.bsky.social
Realistic and Diverse Rollouts 3/4
kifarid.bsky.social
Realistic and Diverse Rollouts 2/4
kifarid.bsky.social
Realistic and Diverse Rollouts 1/4
kifarid.bsky.social
While other models drift or blur on turns, Orbis stays on track — generating realistic, stable futures beyond the training horizon.

On our curated nuPlan-turns dataset, Orbis achieves better FVD, precision, and recall, capturing both visual and dynamics realism.
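For reference, FVD is the Fréchet distance between Gaussian fits of video features (typically extracted by a pretrained I3D network). A minimal sketch of that distance, assuming features are already extracted, using only numpy:

```python
import numpy as np

def psd_sqrt(a):
    """Symmetric PSD matrix square root via eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.T

def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussian fits of two feature sets,
    the quantity behind FID/FVD-style metrics."""
    mu1, mu2 = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    s1 = np.cov(feats_real, rowvar=False)
    s2 = np.cov(feats_gen, rowvar=False)
    s1_half = psd_sqrt(s1)
    # tr((s1 s2)^(1/2)) computed via the symmetric form s1^(1/2) s2 s1^(1/2)
    covmean = psd_sqrt(s1_half @ s2 @ s1_half)
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(s1 + s2 - 2.0 * covmean))
```

Precision and recall for generative models are computed differently (manifold overlap in feature space), but operate on the same extracted features.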
kifarid.bsky.social
We ask how continuous vs. discrete models and their tokenizers shape long-horizon behavior.

Findings:
Continuous models (Flow Matching):
• Are far less brittle to design choices
• Produce realistic, stable rollouts up to 20s
• Generalize better to unseen driving conditions

Continuous > Discrete
kifarid.bsky.social
Driving world models look good for a few frames, then they drift, blur, or freeze, especially when a turn or complex scene appears. These failures reveal a deeper issue: models aren’t capturing real dynamics. We introduce new metrics to measure such breakdowns.
kifarid.bsky.social
Our work Orbis goes to #NeurIPS2025!

A continuous autoregressive driving world model that outperforms Cosmos, Vista, and GEM with far less compute.

469M parameters
Trained on ~280h of driving videos

📄 arxiv.org/pdf/2507.13162
🎬 lmb-freiburg.github.io/orbis.github...
💻 github.com/lmb-freiburg...
kifarid.bsky.social
The question raised here is whether this approach is a generalist, or a specialist that cannot scale up to a general-purpose foundation model.
kifarid.bsky.social
I think HRM is great too. I would say it contributed the main idea behind TRM: deep supervision.
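Deep supervision here means applying the loss at every recurrent refinement step rather than only at the final answer, so each step receives its own learning signal. A hedged toy sketch (illustrative names, not the HRM/TRM code):

```python
import numpy as np

def deep_supervision_loss(step_fn, readout, x, y, num_steps):
    """Deep supervision: supervise every recurrent refinement step,
    not just the last one, so each step gets a gradient signal."""
    h = np.zeros_like(x)                   # initial latent state
    total = 0.0
    for _ in range(num_steps):
        h = step_fn(h, x)                  # one recurrent refinement step
        pred = readout(h)                  # decode the current state
        total += np.mean((pred - y) ** 2)  # supervise this step's output
    return total / num_steps               # average over all steps
```

The inductive bias is that each step must already move toward the answer, which is what makes the recurrence behave like an iterative optimizer.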
kifarid.bsky.social
Transformers do not need something like "gradient descent" as an emergent property when it is effectively baked into the architecture.
kifarid.bsky.social
TRM works because it has an optimization algorithm as an inductive bias for finding the answer. I can't say anything about this work but "brilliant."
kifarid.bsky.social
We should normalize having an "Ideas That Failed" section. It would save enormous amounts of compute and time otherwise spent rediscovering what doesn't work.
Reposted by Karim Farid
atabb.bsky.social
I stumbled on @eugenevinitsky.bsky.social 's blog and his "Personal Rules of Productive Research" is very good. I now do a lot of things in the post, & wish I had done them when I was younger.

I share my "mini-paper" w ppl I hope will be co-authors.

www.eugenevinitsky.com/posts/person...
Eugene Vinitsky
www.eugenevinitsky.com
Reposted by Karim Farid
eugenevinitsky.bsky.social
My major realization of the past year of teaching is that a lot is forgiven if students believe you genuinely care about them and the topic
Reposted by Karim Farid
phillipisola.bsky.social
Possible challenge: getting a model of {X,Y,Z,...} that is much better than independent models of each individual modality {X}, {Y}, {Z}, ... i.e. where the whole is greater than the sum of the parts.
kifarid.bsky.social
I also really hope that the LAM from V1 is still there!