OCR
Gemini 2.0 ingesting millions of PDFs
Result: Gemini 2.0 outperformed the other providers evaluated on table OCR.
Evaluation dataset: A subset of Reducto’s rd-tablebench, available as a Hugging Face dataset.
Evaluation metric: Needleman-Wunsch alignment accuracy. In table OCR, this algorithm aligns the predicted output with the ground-truth table and scores how similar the two are, giving an objective measure of accuracy.
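To make the metric concrete, here is a minimal sketch of Needleman-Wunsch scoring applied to tables. It is an illustration under stated assumptions, not the benchmark's actual scoring code: the function names `needleman_wunsch_score` and `table_similarity`, the choice to flatten each table into a row-major sequence of cells, the exact-match cell comparison, and the match/mismatch/gap values of 1/-1/-1 are all hypothetical choices made for this example.

```python
def needleman_wunsch_score(pred, truth, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of two sequences.

    The scoring values are illustrative assumptions, not the values
    used by any particular benchmark.
    """
    n, m = len(pred), len(truth)
    # score[i][j] holds the best score for aligning pred[:i] with truth[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # pred prefix aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # truth prefix aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = pred[i - 1] == truth[j - 1]
            diagonal = score[i - 1][j - 1] + (match if same else mismatch)
            skip_pred = score[i - 1][j] + gap   # pred item has no partner
            skip_truth = score[i][j - 1] + gap  # truth item has no partner
            score[i][j] = max(diagonal, skip_pred, skip_truth)
    return score[n][m]


def table_similarity(pred_rows, truth_rows):
    """Normalize the alignment score of two tables to the range [0, 1].

    Assumption: each table is a list of rows of strings, flattened
    row by row into one sequence of whitespace-stripped cells.
    """
    pred = [cell.strip() for row in pred_rows for cell in row]
    truth = [cell.strip() for row in truth_rows for cell in row]
    if not pred and not truth:
        return 1.0  # two empty tables are treated as identical
    # With match=1, the best possible score equals the longer length.
    best_possible = max(len(pred), len(truth))
    raw = needleman_wunsch_score(pred, truth)
    return max(0.0, raw / best_possible)


if __name__ == "__main__":
    truth = [["Name", "Qty"], ["Apple", "3"]]
    pred = [["Name", "Qty"], ["Apple", "8"]]  # one misread cell
    print(table_similarity(pred, truth))
```

Clamping negative scores to zero keeps the result readable as an accuracy-like value. A production metric would likely compare cells with a graded similarity rather than exact equality, so that a near-miss such as a single misread character is penalized less than a missing cell.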