LLMProxy

#Productivity #API #GitHub

LLMProxy – High-performance reverse proxy for LLM inference services

Summary: LLMProxy routes requests to LLM backends behind an OpenAI-compatible API, supporting both standard JSON responses (stream=false) and real-time token streaming over SSE (stream=true) with zero buffering.
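
Because the API surface is OpenAI-compatible, existing clients can simply point at the proxy. A minimal sketch in Go of consuming a streamed completion through it, assuming a proxy listening on localhost:8080 and a hypothetical llama-3 model upstream:

```go
// Hedged illustration: the endpoint, port, and model name are assumptions,
// not documented LLMProxy defaults.
package main

import (
	"bufio"
	"fmt"
	"log"
	"net/http"
	"strings"
)

func main() {
	body := strings.NewReader(`{"model":"llama-3","stream":true,` +
		`"messages":[{"role":"user","content":"Hello"}]}`)
	resp, err := http.Post("http://localhost:8080/v1/chat/completions",
		"application/json", body)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// With stream=true the proxy relays SSE lines ("data: {...}") as the
	// backend emits them; each line carries one token delta.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		if line := scanner.Text(); strings.HasPrefix(line, "data: ") {
			fmt.Println(line)
		}
	}
}
```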

What it does

It sits in front of LLM inference engines as a reverse proxy, accepting both stream=false (single JSON response) and stream=true (SSE token stream) requests, balancing load across backends and passing streamed tokens through without buffering; a sketch of this pattern follows.
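
The core mechanics can be sketched in a few lines of Go; this is an illustration of the pattern, not LLMProxy's actual implementation. The backend URLs are placeholders, and round-robin selection stands in for whatever balancing strategy LLMProxy really uses. The key detail is a negative FlushInterval on httputil.ReverseProxy, which forwards each SSE chunk the moment the backend emits it:

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"sync/atomic"
)

func main() {
	// Hypothetical upstream inference servers exposing an OpenAI-compatible API.
	backends := []*url.URL{
		mustParse("http://inference-1:8000"),
		mustParse("http://inference-2:8000"),
	}

	var next uint64
	proxy := &httputil.ReverseProxy{
		Director: func(r *http.Request) {
			// Round-robin load balancing across backends (an assumption,
			// standing in for the proxy's real strategy).
			target := backends[atomic.AddUint64(&next, 1)%uint64(len(backends))]
			r.URL.Scheme = target.Scheme
			r.URL.Host = target.Host
			r.Host = target.Host
		},
		// A negative FlushInterval flushes every chunk immediately, so SSE
		// tokens from stream=true responses are forwarded without buffering.
		FlushInterval: -1,
	}

	// OpenAI-compatible route; stream=false and stream=true both pass through.
	http.Handle("/v1/chat/completions", proxy)
	log.Fatal(http.ListenAndServe(":8080", nil))
}

func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		log.Fatal(err)
	}
	return u
}
```

Run in front of two inference servers, this gives clients one stable endpoint for both response modes while requests fan out across the pool.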

Who it's for

Developers and teams operating LLM inference services who need efficient request routing and first-class streaming support.

Why it matters

A single proxy endpoint spreads traffic across multiple backends and relays streamed tokens as soon as they are produced, giving clients lower perceived latency and letting operators scale inference capacity behind one stable, OpenAI-compatible API.