LLMProxy

#Productivity #API #GitHub

LLMProxy – High-performance reverse proxy for LLM inference services

Summary: LLMProxy routes requests to LLM backends behind an OpenAI-compatible API, supporting both standard JSON responses (stream=false) and real-time token streaming over SSE (stream=true) with zero buffering.
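
Because the API surface is OpenAI-compatible, existing clients can simply point at the proxy. A minimal sketch in Go of consuming a streamed completion through it, assuming a proxy listening on localhost:8080 and a hypothetical llama-3 model upstream:

```go
// Hedged illustration: the endpoint, port, and model name are assumptions,
// not documented LLMProxy defaults.
package main

import (
	"bufio"
	"fmt"
	"log"
	"net/http"
	"strings"
)

func main() {
	body := strings.NewReader(`{"model":"llama-3","stream":true,` +
		`"messages":[{"role":"user","content":"Hello"}]}`)
	resp, err := http.Post("http://localhost:8080/v1/chat/completions",
		"application/json", body)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// With stream=true the proxy relays SSE lines ("data: {...}") as the
	// backend emits them; each line carries one token delta.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		if line := scanner.Text(); strings.HasPrefix(line, "data: ") {
			fmt.Println(line)
		}
	}
}
```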

What it does

It sits in front of LLM inference engines as a reverse proxy, accepting both stream=false (single JSON response) and stream=true (SSE token stream) requests, balancing load across backends and passing streamed tokens through without buffering; a sketch of this pattern follows.
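
The core mechanics can be sketched in a few lines of Go; this is an illustration of the pattern, not LLMProxy's actual implementation. The backend URLs are placeholders, and round-robin selection stands in for whatever balancing strategy LLMProxy really uses. The key detail is a negative FlushInterval on httputil.ReverseProxy, which forwards each SSE chunk the moment the backend emits it:

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"sync/atomic"
)

func main() {
	// Hypothetical upstream inference servers exposing an OpenAI-compatible API.
	backends := []*url.URL{
		mustParse("http://inference-1:8000"),
		mustParse("http://inference-2:8000"),
	}

	var next uint64
	proxy := &httputil.ReverseProxy{
		Director: func(r *http.Request) {
			// Round-robin load balancing across backends (an assumption,
			// standing in for the proxy's real strategy).
			target := backends[atomic.AddUint64(&next, 1)%uint64(len(backends))]
			r.URL.Scheme = target.Scheme
			r.URL.Host = target.Host
			r.Host = target.Host
		},
		// A negative FlushInterval flushes every chunk immediately, so SSE
		// tokens from stream=true responses are forwarded without buffering.
		FlushInterval: -1,
	}

	// OpenAI-compatible route; stream=false and stream=true both pass through.
	http.Handle("/v1/chat/completions", proxy)
	log.Fatal(http.ListenAndServe(":8080", nil))
}

func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		log.Fatal(err)
	}
	return u
}
```

Run in front of two inference servers, this gives clients one stable endpoint for both response modes while requests fan out across the pool.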

Who it's for

Developers and teams operating LLM inference services who need efficient request routing and first-class streaming support.

Why it matters

A single proxy endpoint spreads traffic across multiple backends and relays streamed tokens as soon as they are produced, giving clients lower perceived latency and letting operators scale inference capacity behind one stable, OpenAI-compatible API.