Winnow

Keep the signal. Drop the noise.

#Open Source #Developer Tools #Artificial Intelligence #GitHub

Winnow – Compress RAG prompts to reduce token costs

Summary: Winnow cuts token usage in retrieval-augmented generation (RAG) prompts by over 50%, combining question-guided filtering with LLMLingua-2 to preserve semantic accuracy. It provides a FastAPI server, a batch compression API, and Docker-based self-hosting.

What it does

Winnow compresses RAG prompts before they reach the LLM. It keeps the tokens most relevant to the question and drops the rest, which preserves meaning while cutting token costs.
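To make the idea of question-guided filtering concrete, here is a minimal Python sketch. It is not Winnow's implementation and does not use LLMLingua-2: it swaps in a simple lexical-overlap score to rank sentences of the retrieved context against the question, then keeps the best ones within a token budget. The function names, the `keep_ratio` parameter, and the stopword list are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"a", "an", "the", "is", "are", "was", "were", "in", "on",
             "of", "to", "and", "when", "what", "who", "how", "did"}


def tokenize(text):
    """Lowercase word/number tokens; a crude stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Split on sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def question_guided_filter(question, context, keep_ratio=0.5):
    """Keep the context sentences that overlap most with the question.

    Assumption: relevance is approximated by the share of a sentence's
    tokens that also appear in the question (stopwords excluded).
    Sentences with no overlap are dropped; the rest are added greedily,
    best first, until the token budget is used up.
    """
    q_terms = set(tokenize(question)) - STOPWORDS
    sentences = split_sentences(context)
    if not sentences:
        return ""

    total_tokens = sum(len(tokenize(s)) for s in sentences)
    budget = max(1, int(total_tokens * keep_ratio))

    scored = []
    for index, sentence in enumerate(sentences):
        tokens = tokenize(sentence)
        overlap = sum(1 for t in tokens if t in q_terms)
        score = overlap / len(tokens) if tokens else 0.0
        scored.append((score, index, len(tokens)))

    kept, used = set(), 0
    # Highest score first; ties resolved by original position.
    for score, index, n_tokens in sorted(scored, key=lambda x: (-x[0], x[1])):
        if score == 0:
            break  # nothing relevant left
        if kept and used + n_tokens > budget:
            continue  # would exceed the budget; try a shorter sentence
        kept.add(index)
        used += n_tokens

    # Restore the original order so the compressed context still reads naturally.
    return " ".join(sentences[i] for i in sorted(kept))


if __name__ == "__main__":
    context = (
        "The Eiffel Tower is in Paris. "
        "Bananas are rich in potassium and grow in warm climates. "
        "The tower was completed in 1889. "
        "Cats sleep most of the day."
    )
    question = "When was the Eiffel Tower completed?"
    print(question_guided_filter(question, context))
```

A production compressor would score relevance with a learned model rather than word overlap and would count tokens with the target LLM's tokenizer, but the overall shape (score against the question, keep within a budget, preserve order) is the same.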

Who it's for

Developers and teams using RAG pipelines who need to optimize token consumption and maintain answer relevance.

Why it matters

It addresses high token costs in RAG workflows by efficiently compressing prompts without losing semantic accuracy.
