Karim Farid
@kifarid.bsky.social
2.2K followers 800 following 24 posts
PhD student @ELLIS.eu @UniFreiburg with Thomas Brox and Cordelia Schmid. Understanding intelligence and cultivating its societal benefits. https://kifarid.github.io
kifarid.bsky.social
Generative models that assume the underlying distribution is continuous, for example, flow matching and common diffusion models.
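To make "continuous" concrete: flow matching regresses a velocity field that transports noise to data along a straight-line path. A minimal sketch of that objective (illustrative names and shapes, not from any particular codebase; `predict_velocity` stands in for the network):

```python
import numpy as np

def flow_matching_loss(predict_velocity, x1, rng):
    """One conditional flow-matching objective: regress the velocity
    that transports noise x0 to data x1 along a straight-line path."""
    x0 = rng.standard_normal(x1.shape)        # noise sample, same shape as data
    t = rng.uniform(size=(x1.shape[0], 1))    # per-sample time in [0, 1]
    xt = (1 - t) * x0 + t * x1                # point on the interpolation path
    target_v = x1 - x0                        # constant target velocity along the path
    pred_v = predict_velocity(xt, t)          # the network's velocity prediction
    return np.mean((pred_v - target_v) ** 2)  # simple MSE regression loss
```

Diffusion models differ in the path and parameterization, but share this "noise in, continuous regression target out" structure.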
kifarid.bsky.social
I really hope someone can revive continuous models for language. They’ve taken over the visual domain by far, but getting them to work in language still feels like pure alchemy.
Reposted by Karim Farid
simon.ging.ai
Excited to release our models and preprint: "Using Knowledge Graphs to harvest datasets for efficient CLIP model training"

We propose a dataset collection method using knowledge graphs and web image search, and create EntityNet-33M: a dataset of 33M images paired with 46M texts.
Using Knowledge Graphs to harvest datasets for efficient CLIP model training
Training high-quality CLIP models typically requires enormous datasets, which limits the development of domain-specific models -- especially in areas that even the largest CLIP models do not cover well...
arxiv.org
Reposted by Karim Farid
phillipisola.bsky.social
Over the past year, my lab has been working on fleshing out theory + applications of the Platonic Representation Hypothesis.

Today I want to share two new works on this topic:

Eliciting higher alignment: arxiv.org/abs/2510.02425
Unpaired learning of unified reps: arxiv.org/abs/2510.08492

1/9
kifarid.bsky.social
Orbis shows that the objective matters.
Continuous modeling yields more stable and generalizable world models, yet true probabilistic coverage remains a challenge.

Immensely grateful to my co-authors @arianmousakhan.bsky.social, Sudhanshu Mittal, and Silvio Galesso, and to @thomasbrox.bsky.social
kifarid.bsky.social
Under the hood 🧠

Orbis uses a hybrid tokenizer with semantic + detail tokens that work in both continuous and discrete spaces.
The world model then predicts the next frame by gradually denoising or unmasking it, using past frames as context.
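The prediction loop described above can be sketched roughly like this (a hedged toy sketch, not the Orbis code; `denoise_step` is a placeholder for whatever denoising/unmasking update the model applies, conditioned on past frames):

```python
import numpy as np

def rollout(denoise_step, context_frames, horizon, num_steps, rng):
    """Autoregressive generation: each new frame starts as pure noise
    and is gradually denoised, conditioned on the frames so far."""
    frames = list(context_frames)
    for _ in range(horizon):
        x = rng.standard_normal(frames[-1].shape)  # start the next frame from noise
        for k in range(num_steps, 0, -1):
            t = k / num_steps                      # current noise level
            x = denoise_step(x, t, frames)         # one denoising/unmasking update
        frames.append(x)                           # commit the finished frame as context
    return frames
```

The key property is that denoising happens per frame while conditioning happens across frames, so the same loop works whether the per-frame update is continuous (denoising) or discrete (unmasking).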
kifarid.bsky.social
Realistic and Diverse Rollouts 4/4
kifarid.bsky.social
Realistic and Diverse Rollouts 3/4
kifarid.bsky.social
Realistic and Diverse Rollouts 2/4
kifarid.bsky.social
Realistic and Diverse Rollouts 1/4
kifarid.bsky.social
While other models drift or blur on turns, Orbis stays on track — generating realistic, stable futures beyond the training horizon.

On our curated nuPlan-turns dataset, Orbis achieves better FVD, precision, and recall, capturing both visual and dynamics realism.
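For reference, FVD is the Fréchet distance between Gaussian fits of video features (typically extracted by a pretrained I3D network). A minimal sketch of that distance, assuming features are already extracted, using only numpy:

```python
import numpy as np

def psd_sqrt(a):
    """Symmetric PSD matrix square root via eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.T

def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussian fits of two feature sets,
    the quantity behind FID/FVD-style metrics."""
    mu1, mu2 = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    s1 = np.cov(feats_real, rowvar=False)
    s2 = np.cov(feats_gen, rowvar=False)
    s1_half = psd_sqrt(s1)
    # tr((s1 s2)^(1/2)) computed via the symmetric form s1^(1/2) s2 s1^(1/2)
    covmean = psd_sqrt(s1_half @ s2 @ s1_half)
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(s1 + s2 - 2.0 * covmean))
```

Precision and recall for generative models are computed differently (manifold overlap in feature space), but operate on the same extracted features.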
kifarid.bsky.social
We ask how continuous vs. discrete models and their tokenizers shape long-horizon behavior.

Findings:
Continuous models (Flow Matching):
• Are far less brittle to design choices
• Produce realistic, stable rollouts up to 20s
• Generalize better to unseen driving conditions

Continuous > Discrete
kifarid.bsky.social
Driving world models look good for a few frames, then they drift, blur, or freeze, especially when a turn or complex scene appears. These failures reveal a deeper issue: models aren’t capturing real dynamics. We introduce new metrics to measure such breakdowns.
kifarid.bsky.social
Our work Orbis goes to #NeurIPS2025!

A continuous autoregressive driving world model that outperforms Cosmos, Vista, and GEM with far less compute.

469M parameters
Trained on ~280h of driving videos

📄 arxiv.org/pdf/2507.13162
🎬 lmb-freiburg.github.io/orbis.github...
💻 github.com/lmb-freiburg...
kifarid.bsky.social
The question raised here is whether this approach is a generalist, or a specialist that cannot scale up to a general-purpose foundation model.
kifarid.bsky.social
I think HRM is great too. I would say it contributed the main idea behind TRM: deep supervision.
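Deep supervision here means applying the loss at every recurrent refinement step rather than only at the final answer, so each step receives its own learning signal. A hedged toy sketch (illustrative names, not the HRM/TRM code):

```python
import numpy as np

def deep_supervision_loss(step_fn, readout, x, y, num_steps):
    """Deep supervision: supervise every recurrent refinement step,
    not just the last one, so each step gets a gradient signal."""
    h = np.zeros_like(x)                   # initial latent state
    total = 0.0
    for _ in range(num_steps):
        h = step_fn(h, x)                  # one recurrent refinement step
        pred = readout(h)                  # decode the current state
        total += np.mean((pred - y) ** 2)  # supervise this step's output
    return total / num_steps               # average over all steps
```

The inductive bias is that each step must already move toward the answer, which is what makes the recurrence behave like an iterative optimizer.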
kifarid.bsky.social
Transformers do not need something like "gradient descent" as an emergent property when it is effectively baked into the architecture.
kifarid.bsky.social
TRM works because it has an optimization algorithm as an inductive bias for finding the answer. I can't say anything about this work but "brilliant."
kifarid.bsky.social
We should normalize having an "Ideas That Failed" section. It would save enormous amounts of compute and time otherwise spent rediscovering what doesn't work.
Reposted by Karim Farid
atabb.bsky.social
I stumbled on @eugenevinitsky.bsky.social 's blog and his "Personal Rules of Productive Research" is very good. I now do a lot of things in the post, & wish I had done them when I was younger.

I share my "mini-paper" w ppl I hope will be co-authors.

www.eugenevinitsky.com/posts/person...
Eugene Vinitsky
www.eugenevinitsky.com
Reposted by Karim Farid
eugenevinitsky.bsky.social
My major realization of the past year of teaching is that a lot is forgiven if students believe you genuinely care about them and the topic
Reposted by Karim Farid
phillipisola.bsky.social
Possible challenge: getting a model of {X,Y,Z,...} that is much better than independent models of each individual modality {X}, {Y}, {Z}, ... i.e. where the whole is greater than the sum of the parts.
kifarid.bsky.social
I also really hope that the LAM from V1 is still there!