CommonForms: A Large, Diverse Dataset for Form Field Detection Paper • 2509.16506 • Published Sep 20, 2025 • 22
GutenOCR: A Grounded Vision-Language Front-End for Documents Paper • 2601.14490 • Published 28 days ago • 37
RICO Collection A collection of RICO screenshot-based datasets for training and evaluation. We've attempted to compile all surrounding metadata for the relevant tasks. • 8 items • Updated Jan 16 • 5
Gemma 3 Collection All versions of Google's new multimodal models, including QAT, in 1B, 4B, 12B, and 27B sizes. Available in GGUF, dynamic 4-bit and 16-bit formats. • 55 items • Updated 1 day ago • 105