DocuFlow
Turn PDF Invoices into Data using Local AI Agents.
DocuFlow – Automate invoice data extraction with local AI
Summary: DocuFlow is an open-source, containerized pipeline that automates OCR, parsing, and analytics for invoices using Gemini 2.5 Flash. It extracts vendor, date, and amount data from PDFs and updates the database in real time.
What it does
DocuFlow ingests invoice files dropped into a folder, then uses AI to perform OCR and automatically extract key financial data such as vendor, date, and amount.
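
The folder-drop flow described above can be sketched roughly as follows. This is a minimal illustration, not DocuFlow's actual code: `InvoiceRecord`, `scan_inbox`, and `fake_extract` are hypothetical names, the field types are assumptions, and the AI OCR and extraction step is replaced by a stub that returns fixed dummy values.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class InvoiceRecord:
    # The three fields named in the summary; the types are assumptions.
    vendor: Optional[str]
    date: Optional[str]
    amount: Optional[float]


def scan_inbox(
    inbox: Path,
    seen: set[str],
    extract: Callable[[Path], InvoiceRecord],
) -> list[tuple[str, InvoiceRecord]]:
    """Run `extract` on every PDF in `inbox` that has not been handled yet."""
    results = []
    for pdf in sorted(inbox.glob("*.pdf")):
        if pdf.name in seen:
            continue  # already processed in an earlier pass
        results.append((pdf.name, extract(pdf)))
        seen.add(pdf.name)
    return results


def fake_extract(pdf: Path) -> InvoiceRecord:
    # Hypothetical stand-in for the AI OCR/extraction call.
    # It ignores the file contents and returns fixed dummy values.
    return InvoiceRecord(vendor="ACME", date="2024-01-31", amount=100.0)
```

Calling `scan_inbox(Path("inbox"), seen, fake_extract)` on a timer with a shared `seen` set would pick up each newly dropped PDF exactly once. A real deployment would swap `fake_extract` for the model call and write each record to the database.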
Who it's for
It is designed for users who need automated extraction and analysis of data from invoices and other financial documents.
Why it matters
It replaces manual, regex-based extraction with a real-time, AI-driven pipeline aimed at more accurate data extraction.