Zoom on the IIIF Community Calendar: iiif.io/community
Powered by a fine-tuned ModernBERT classifier. Full dataset stored in Lance format on the Hub with vector embeddings.
huggingface.co/spaces/libra...
Updated the Dataset Papers on ArXiv app to surface them: 52K+ papers classified as introducing new datasets from 212K CS papers.
SAM3 on HF Jobs → correct the errors → train YOLO → repeat.
Three rounds: 31% → 99% accuracy on historical index cards from @natlibscot.bsky.social
SAM3 script: huggingface.co/datasets/uv-...
I used a dataset I labelled in 2022 and left on @hf.co for 3 years 😬.
It finds illustrated pages in historical books. No server. No GPU.
Part of small-models-for-glam: small, efficient models for cultural heritage work.
Not everything needs GPT-4!
Try it: huggingface.co/spaces/small-models-for-glam/iiif-illustration-detector
Slides on their own probably aren't that useful, but they do feature one of my growing collection of libraries-and-AI memes, so there's that: danielvanstrien.xyz/slides.html
Very excited to see SAM3 massively lower that barrier. Describe the class you want to detect and get annotated datasets automatically!
Try it yourself: huggingface.co/datasets/uv-...!
Be warned, I did fire up the meme generator for my slides...
Full script at huggingface.co/datasets/uv-...
Currently processing @natlibscot.bsky.social's 27,915-page handbook collection with one command.
Processing at ~350 images/sec on an A100
Using @hf.co Jobs + uv: zero-setup batch OCR!
Will share final time + cost when done!
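For a rough sense of scale, here is a back-of-envelope estimate from the two numbers quoted above. It assumes the ~350 images/sec rate holds for the whole run and ignores model loading, downloads, and upload time, so it is only a lower bound on wall-clock time, not a reported result.

```python
# Back-of-envelope estimate only, using the figures quoted in the post.
# Assumption: throughput stays constant at ~350 images/sec for the full run.
pages = 27_915            # pages in the handbook collection
images_per_second = 350   # approximate throughput on an A100

seconds = pages / images_per_second
print(f"~{seconds:.0f} s of pure inference (~{seconds / 60:.1f} min)")
```

The actual job time will be longer once startup and I/O are included.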