29 / 481

GLM-OCR

GLM-OCR - Product Hunt launch logo and brand identity

SOTA document parsing & OCR in just 0.9B parameters

#Open Source #Artificial Intelligence

GLM-OCR – SOTA document parsing and OCR with a lightweight 0.9B parameter model

Summary: GLM-OCR is a 0.9B parameter OCR model achieving state-of-the-art accuracy (94.6 on OmniDocBench) on complex layouts, tables, handwriting, and mixed content. It supports ultra-fast inference via vLLM/SGLang and outputs structured data like Markdown and JSON.

What it does

GLM-OCR parses documents with complex layouts, reconstructing tables and mixed content into clean Markdown and JSON using a CogViT encoder and GLM-0.5B decoder architecture.

Who it's for

It is designed for developers needing efficient OCR in retrieval-augmented generation (RAG) pipelines and edge deployments requiring low latency.

Why it matters

It enables fast, accurate parsing of heavy document layouts and mixed content without high computational cost.