Be warned, I did fire up the meme generator for my slides...
Currently processing @natlibscot.bsky.social's 27,915-page handbook collection with one command.
Processing at ~350 images/sec on A100
Using @hf.co Jobs + uv - zero setup batch OCR!
Will share final time + cost when done!
This formalizes the existing maintenance structure, as I've personally led the project for the past two years on behalf of Hugging Face. I'm super excited about the transfer!
Details in 🧵
Modern vision-language models have transformed what's possible: handwriting, 100+ languages, math formulas, tables, signature extraction...
New @hf.co guide on OCR
huggingface.co/blog/ocr-ope...
With @wjbmattingly.bsky.social I'm launching small-models-for-glam on @hf.co to create/curate models that run on modest hardware and address GLAM use cases.
Follow the org to keep up-to-date!
huggingface.co/small-models...
📚 2.5bn tokens of mostly Latin and French texts
🕰️ 800→1600 CE
📜 23k manuscripts
🖥️ 18k on the reading interface: comma.inria.fr
🔍 Paper: inria.hal.science/hal-05299220v1
(1/🧵)
Nanonets just released OCR2 - a 3B parameter vision-language model for document OCR 📄
You can run it with one command on @hf.co Jobs (no local GPU needed)
I built a UV script so you can run SOTA multilingual OCR in seconds with zero setup using @hf.co Jobs
Tested on 1800s library cards - works great ✨
I uploaded two new @hf.co datasets (~470K cards) for training/evaluating models to extract structured metadata from catalogue cards.
jobs.gem.com/bluesky/am9i...
Libraries are starting to explore AI-assisted cataloguing, but we lack public evaluation data. Hoping this helps fill that gap.
huggingface.co/datasets/big...
iconclass-vlm generates museum catalog codes (fun fact: "71H7131" = "Bathsheba with David's letter"!)
@hf.co TRL + Jobs = magic ✨
Guide here: danielvanstrien.xyz/posts/2025/i...
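ICONCLASS notations are hierarchical: each extra character narrows the subject, so the broader categories of a code are roughly its prefixes. A minimal Python sketch, assuming a plain notation with no bracketed text or parenthesised keys (the helper name `iconclass_ancestors` is hypothetical and not part of iconclass-vlm):

```python
def iconclass_ancestors(code: str) -> list[str]:
    """Return every prefix of a plain ICONCLASS notation, broadest first.

    Assumption: the code contains only letters and digits. Notations with
    bracketed text or parenthesised keys need a real parser.
    """
    code = code.strip()
    if not code.isalnum():
        raise ValueError(f"only plain notations are handled here: {code!r}")
    return [code[:i] for i in range(1, len(code) + 1)]


# Walk from the broadest division down to the full code from the post.
for level in iconclass_ancestors("71H7131"):
    print(level)
```

Something like this could be handy for scoring a model's output by how deep in the hierarchy it agrees with the reference code, rather than by exact match only.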
iconclass-vlm: Qwen2.5-VL-3B trained using SFT to generate ICONCLASS codes (think Dewey Decimal for art!)
Trained with @hf.co TRL + Jobs - single UV script, no GPU needed!
Blog soon!
NuMarkdown-8B-Thinking from NuMind (YC S22) doesn't just extract text - it reasons through documents first.
Could be pretty valuable for weird historical documents?
Example here: davanstrien-ocr-time-capsule.static.hf.space/index.html?d...
One command, no setup:
hf jobs uv run --flavor l4x4 [script-url] \
--input-dataset your/dataset \
--output-dataset your/output
Works on L4 GPUs ⚡
huggingface.co/datasets/uv-...
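For anyone wondering what makes a script runnable this way: uv reads dependencies from an inline metadata comment at the top of the file (PEP 723), so there is no separate environment to set up. A rough skeleton, assuming only the two flags shown in the command above; the empty dependency list, the function names, and the placeholder body are illustrative and not taken from the actual script:

```python
# /// script
# requires-python = ">=3.10"
# dependencies = []
# ///
# The comment block above is PEP 723 inline metadata; uv reads it to build
# an environment before running the file. A real OCR script would list its
# model and dataset libraries there (left empty here as an assumption).
import argparse


def parse_args(argv=None):
    """Parse the two flags used in the command above."""
    parser = argparse.ArgumentParser(description="Batch OCR skeleton (illustrative)")
    parser.add_argument("--input-dataset", required=True,
                        help="dataset ID to read images from")
    parser.add_argument("--output-dataset", required=True,
                        help="dataset ID to write OCR results to")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Placeholder: a real script would load the images, run the OCR model,
    # and push the results to the output dataset.
    print(f"Would OCR {args.input_dataset} -> {args.output_dataset}")
    return args


# Demonstration call with explicit arguments; a real script would instead
# call main() under the usual `if __name__ == "__main__":` guard.
main(["--input-dataset", "your/dataset", "--output-dataset", "your/output"])
```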
carpentries.org/blog/2025/08...
How well do these models handle Victorian theatre playbills from @bldigischol.bsky.social?
RolmOCR vs traditional OCR on tricky playbills (ornate fonts, faded ink, DRAMATIC ALL CAPS!)
@hf.co Demo: huggingface.co/spaces/davan...
I made a quick Space to compare VLM OCR with "traditional" OCR using 11k Scottish exam papers from @natlibscot.bsky.social
huggingface.co/spaces/davanstrien/ocr-time-capsule