I've tried Tesseract and EasyOCR, and neither performs well on this corpus. ABBYY FineReader probably does better than those, but I need to renew my license.
Preliminary checks of Tesseract plus an LLM for cleanup look very promising for printed works.
September 15, 2025 at 11:36 AM