In the first two posts of this series, we benchmarked OCR on two increasingly demanding tasks: Grounded (BBox) OCR, reading text AND returning its coordinates; and Image → Markdown OCR, plain-text...
Most OCR tools tell you what a document says. That’s fine for search indexing and RAG. But when your workflow needs to act on a specific piece of text (redact...
In our first benchmark, we showed that JSL Vision OCR is the #1 grounded OCR model overall, beating every closed-source frontier system on the FUNSD dataset. This post answers a different question: plain-text...
If you’ve shopped for an OCR model recently, you already know the problem: every vendor claims state-of-the-art accuracy, every benchmark uses a different dataset, and “VLMs can do OCR” is...
This post presents a comparative benchmark of medical Vision Language Models (VLMs) evaluated on a range of clinically relevant visual and multimodal tasks. The study focuses on assessing how well...