Tag: ocr

Total 2 articles

SmolDocling: 256M OCR Model Processes Documents in 0.35s on Consumer GPUs

SmolDocling, OCR, VLM, Vision Language Model

IBM Research's SmolDocling, a 256M-parameter vision-language model, performs document OCR and multimodal processing at 0.35s per page on consumer GPUs, handling text, formulas, code, and charts.

Mar 18, 2025 • 3 min read

News

MinerU Beginner's Guide - Ultimate Open Source PDF Data Extraction Tool

MinerU, PDF Tools, Data Extraction, OCR, Document Processing, Open Source Tools, Machine Learning, Data Analysis

A detailed tutorial on using MinerU, covering both the online demo and local deployment. MinerU extracts text, images, tables, and mathematical formulas from PDF documents, making it suitable for academic research, data analysis, and more.

Nov 11, 2024 • 5 min read

Tools