https://web.mit.edu/phillipi/
But just for the chance of a meal in Paris, happy to take that bet and probably end up wrong :)
This paper shows some effect of alignment increasing with scale, in a domain closer to remote sensing: www.arxiv.org/abs/2509.19453
9/9
8/9
If you concatenate datasets, the model “should” figure out all the synergies and cross-modal relationships, then exploit them to make better inferences. We now have some evidence this can happen.
7/9
We do the simplest thing: just train a model (e.g., a next-token predictor) on all elements of the concatenated dataset [X,Y,Z].
You end up with a better model of dataset X than if you had trained on X alone!
6/9
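A minimal sketch of that setup as I read it (toy code of mine, not the authors'; the datasets, tokenization, and model here are placeholders):

```python
# Hedged sketch: train one next-token predictor on the concatenation [X, Y, Z]
# of token sequences from different "modalities", then evaluate on held-out X.
import torch
import torch.nn as nn

VOCAB, SEQ_LEN = 512, 32

def toy_dataset(n, offset):
    # Stand-in for a real tokenized modality (e.g., text, audio codes, pixels).
    return torch.randint(offset, offset + 128, (n, SEQ_LEN))

X, Y, Z = toy_dataset(2000, 0), toy_dataset(2000, 128), toy_dataset(2000, 256)
train_data = torch.cat([X, Y, Z], dim=0)   # the concatenated dataset [X, Y, Z]
X_heldout = toy_dataset(200, 0)            # evaluate on modality X alone

class TinyLM(nn.Module):
    def __init__(self, d=128):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, d)
        self.rnn = nn.GRU(d, d, batch_first=True)
        self.head = nn.Linear(d, VOCAB)
    def forward(self, tokens):
        h, _ = self.rnn(self.emb(tokens))
        return self.head(h)

model = TinyLM()
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

for step in range(200):
    batch = train_data[torch.randint(len(train_data), (64,))]
    logits = model(batch[:, :-1])          # predict token t+1 from tokens <= t
    loss = loss_fn(logits.reshape(-1, VOCAB), batch[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    logits = model(X_heldout[:, :-1])
    nll = loss_fn(logits.reshape(-1, VOCAB), X_heldout[:, 1:].reshape(-1))
    print(f"held-out NLL on X after training on [X, Y, Z]: {nll.item():.3f}")
```

Testing the claim would mean comparing this held-out score on X against the same model trained on X alone.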
5/9
This tells us that LLMs know more about the sensory world than we might suspect; you just have to find ways to elicit the knowledge.
4/9
If you ask it to “imagine hearing,” its representation becomes more like that of an auditory model.
3/9
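One way to quantify that kind of shift (my own sketch; the prompts, the models, and the use of linear CKA as the alignment measure are assumptions, not necessarily the paper's protocol):

```python
# Hedged sketch: compare the representational alignment of an LLM with an
# audio model, with and without an "imagine hearing" prompt.
import numpy as np

def linear_cka(A, B):
    """Linear CKA between two representation matrices (n_samples x dim)."""
    A = A - A.mean(0); B = B - B.mean(0)
    num = np.linalg.norm(B.T @ A, "fro") ** 2
    den = np.linalg.norm(A.T @ A, "fro") * np.linalg.norm(B.T @ B, "fro")
    return num / den

# Placeholder embeddings: in practice these would come from an LLM run on
# "describe X" vs. "imagine hearing X" prompts, and from an audio encoder run
# on recordings of X, over the same list of n concepts.
rng = np.random.default_rng(0)
n, d_llm, d_audio = 500, 768, 512
audio_emb = rng.standard_normal((n, d_audio))
llm_plain = rng.standard_normal((n, d_llm))
llm_imagine = rng.standard_normal((n, d_llm))

print("alignment, plain prompt:     ", linear_cka(llm_plain, audio_emb))
print("alignment, 'imagine hearing':", linear_cka(llm_imagine, audio_emb))
# The claim would correspond to the second score being reliably higher.
```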
We are interested in identifying commonalities between different models and modalities, and providing unifications.
2/9
How weird it would be if an LLM (a Markov chain!) could explain "thinking".
It feels like it makes us less special, like Copernicus placing the sun at the center, rather than the Earth.
Wrong: the teacher could underfit and be more correct than the "GT" y's. This paper is about one version of this: arxiv.org/abs/2206.15477
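A toy illustration of that point (mine, not from the linked paper): fit a deliberately underfitting teacher to noisy labels and check which is closer to the true function.

```python
# Hedged sketch: a teacher that underfits noisy labels can end up closer to
# the true function than the "ground-truth" labels themselves.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 200)
f_true = np.sin(x)                              # the real target
y_gt = f_true + rng.normal(0, 0.5, x.shape)     # noisy "GT" labels

# An underfitting teacher: a low-degree polynomial fit to the noisy labels.
teacher = np.poly1d(np.polyfit(x, y_gt, deg=5))
y_teacher = teacher(x)

print("MSE of noisy GT labels vs. truth:", np.mean((y_gt - f_true) ** 2))
print("MSE of teacher outputs vs. truth:", np.mean((y_teacher - f_true) ** 2))
# Typically the teacher's smoothed predictions are the more accurate targets,
# so a student distilled from the teacher can beat one trained on the raw labels.
```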