Raphael Baena
raphaelbaena.bsky.social
Postdoc at @ImagineEnpc Research in Computer Vision | Ph.D at @IMTAtlantique
Research: Text Recognition (OCR, HTR, Chinese HTR)
Reposted by Raphael Baena
1/n🚀Gaussians > Differentiable function > Mesh?
Check out our new work: MILo: Mesh-In-the-Loop Gaussian Splatting!

🎉Accepted to SIGGRAPH Asia 2025 (TOG)
MILo is a novel differentiable framework that extracts meshes directly from Gaussian parameters during training.

🧵👇
September 8, 2025 at 11:35 AM
Reposted by Raphael Baena
I wrote a notebook for a lecture/exercise on image generation with flow matching. The idea is to use FM to render images composed of simple shapes using their attributes (type, size, color, etc). Not super useful but fun and easy to train!
colab.research.google.com/drive/16GJyb...

Comments welcome!
June 27, 2025 at 4:53 PM
Reposted by Raphael Baena
🔥🔥🔥 CV Folks, I have some news! We're organizing a 1-day meeting in central Paris on June 6th, before CVPR, called CVPR@Paris (similar to NeurIPS@Paris) 🥐🍾🥖🍷

Registration is open (it's free) with priority given to authors of accepted papers: cvprinparis.github.io/CVPR2025InPa...

Big 🧵👇 with details!
March 21, 2025 at 6:43 AM
Reposted by Raphael Baena
Starter pack including some of the lab members: go.bsky.app/QK8j87w
March 14, 2025 at 10:34 AM
Reposted by Raphael Baena
Guillaume Astruc, Nicolas Gonthier, Clement Mallet, Loic Landrieu
AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities
https://arxiv.org/abs/2412.14123
December 19, 2024 at 6:45 AM
Reposted by Raphael Baena
🍏 New preprint alert! 🍏
PoM: Efficient Image and Video Generation with the Polynomial Mixer
arxiv.org/abs/2411.12663
This is my latest "summer project" and it was so big I had to call in reinforcements (Thanks @nicolasdufour.bsky.social)

TL;DR Transformers are for boomers, welcome to the future
🧵👇
Diffusion models based on Multi-Head Attention (MHA) have become ubiquitous to generate high quality images and videos. However, encoding an image or a video as a sequence of patches results in costly...
November 20, 2024 at 8:08 AM
Reposted by Raphael Baena
I'm slowly putting my intro to ML course material on github, starting with the lab sessions: github.com/davidpicard/...
These are self-contained notebooks in which you have to implement famous algorithms from the literature (k-NN, SVM, DT, etc), with a custom dataset that I (painstakingly) made!
November 19, 2024 at 2:30 PM