merve
@merve.bsky.social
proud mediterranean 🧿 open-sourceress at hugging face 🤗 multimodality, zero-shot vision, vision language models, transformers
here's a good blog post on the successful DSE model MCDSE, covering compression and more huggingface.co/blog/marco/a...
Visually Multilingual: Introducing mcdse-2b
A Blog post by Marco Cimolai on Hugging Face
huggingface.co
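For intuition on the compression angle, here is a minimal numpy sketch of binary quantization, one common trick for shrinking DSE-style retrieval embeddings; the array shapes are illustrative stand-ins, not actual mcdse-2b outputs.

# Minimal sketch of binary quantization for embedding compression.
# The embeddings below are random stand-ins, not mcdse-2b outputs.
import numpy as np

rng = np.random.default_rng(0)
doc_embeddings = rng.standard_normal((1000, 1536)).astype(np.float32)

# Keep only the sign of each dimension and pack 8 dims per byte:
# 1536 float32 values (6144 bytes) become 192 bytes per document.
binary = np.packbits((doc_embeddings > 0).astype(np.uint8), axis=1)

# At query time, score with Hamming distance (fewer differing bits
# means more similar), e.g. against the first document:
query = binary[0]
hamming = np.unpackbits(binary ^ query, axis=1).sum(axis=1)
print(binary.shape, hamming[:5])  # (1000, 192); doc 0 has distance 0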
April 15, 2025 at 4:27 PM
the model also has impressive OCR capabilities ⬇️
April 11, 2025 at 7:10 PM
we'll give this model a test on agentic capabilities, but here's an example from the paper:
April 11, 2025 at 7:09 PM
This model consists of a MoonViT encoder with dynamic resolution handling, a projection layer, and a 16B MoE decoder (with 2.8B active params)
the paper introduces an interesting pre-training pipeline for handling long context, and the model saw 4.4T tokens arxiv.org/pdf/2504.07491
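For intuition on the "16B total, 2.8B active" split, here is a hedged top-k MoE routing sketch in PyTorch; the dimensions and expert count are made up for illustration, not this model's actual config.

# Sketch of top-k expert routing: each token only activates k of
# the experts, so only a fraction of the parameters runs per token.
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, dim=512, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, dim)
        # Route each token to its k highest-scoring experts.
        weights, idx = self.router(x).softmax(-1).topk(self.k, dim=-1)
        out = torch.zeros_like(x)
        for token, (w, e) in enumerate(zip(weights, idx)):
            for wi, ei in zip(w, e):
                out[token] += wi * self.experts[ei](x[token])
        return out

x = torch.randn(4, 512)
print(TopKMoE()(x).shape)  # torch.Size([4, 512])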
April 11, 2025 at 7:08 PM
Reposted by merve
Smol but mighty:
• 256M delivers 80% of the performance of our 2.2B model.
• 500M hits 90%.
Both beat our SOTA 80B model from 17 months ago! 🎉
Efficiency 🤝 Performance
Explore the collection here: huggingface.co/collections/...
Blog: huggingface.co/blog/smolervlm
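If you want to try the smallest one, here is a minimal transformers sketch, assuming the HuggingFaceTB/SmolVLM-256M-Instruct repo id; double-check the exact ids against the collection linked above.

# Quick sketch of running SmolVLM-256M on a local image.
from transformers import AutoProcessor, AutoModelForVision2Seq
from PIL import Image

model_id = "HuggingFaceTB/SmolVLM-256M-Instruct"  # assumed repo id
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id)

image = Image.open("example.jpg")  # any local image
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe this image briefly."},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])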
January 23, 2025 at 1:33 PM
Learn more from their blog post here huggingface.co/blog/vdr-2b-... 📖
Visual Document Retrieval Goes Multilingual
huggingface.co
January 13, 2025 at 11:12 AM