429: Too Many Requests
Nozzle is a local proxy that sits between your AI tools and LLM providers, enforcing a single token budget across everything on your machine. No more 429s. No more rate-limit hell.
You're building. You're shipping. Then the platform decides you've had enough. Rate limits, 429s, and lockouts — not because you did anything wrong, but because big tech controls the tap.
HTTP 429: Claude Code, your IDE, and three running agents all share one API key. The moment they overlap, your whole stack stops dead. One key, one limit, zero mercy.
Account suspended: Platforms shadowban and throttle accounts they deem "high-risk" — often without notice, without appeal, and without recourse. You're dependent on their mood.
Degraded performance: Sometimes it's not a hard block — it's a slow strangulation. Responses crawl. Latency spikes. Your build slows to a halt while you debug a problem that isn't yours.
Nozzle sits between your tools and the LLM providers. It enforces one shared token budget across your whole machine — so your tools cooperate instead of competing.
Shared budget across all your tools. Tokens refill at your configured rate — no tool can hog the limit.
SSE streaming piped through immediately. Nozzle adds sub-millisecond overhead in the fast path.
All headers and paths forwarded unchanged. Your existing API keys work as-is. Change one env var, done.
Estimates tokens before the request, reads actual counts from the response, and corrects the bucket automatically.
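To picture the estimate-and-correct loop, here is a minimal sketch in Rust. Everything in it (the `Budget` struct, `charge`, `settle`, the 4-bytes-per-token heuristic) is hypothetical illustration under stated assumptions, not Nozzle's actual code: charge a rough estimate before the request goes out, then settle the difference once the provider reports real usage.

```rust
// Hypothetical sketch of estimate-then-correct accounting, not
// Nozzle's actual code: charge an estimate up front, then settle
// against the usage counts the provider returns.
struct Budget {
    tokens: i64,
}

impl Budget {
    /// Charge the bucket before the request goes out.
    fn charge(&mut self, estimate: i64) {
        self.tokens -= estimate;
    }

    /// After the response: refund an over-estimate, or charge the shortfall.
    fn settle(&mut self, estimate: i64, actual: i64) {
        self.tokens += estimate - actual;
    }
}

/// Rough pre-request heuristic: ~4 bytes per token for English-ish text.
fn estimate_tokens(body: &[u8]) -> i64 {
    (body.len() as i64 / 4).max(1)
}

fn main() {
    let mut budget = Budget { tokens: 10_000 };
    let body = br#"{"model":"...","messages":[{"role":"user","content":"hi"}]}"#;
    let est = estimate_tokens(body);
    budget.charge(est);
    // ...request is forwarded; the response reports real usage...
    let actual = 42; // e.g. input_tokens + output_tokens
    budget.settle(est, actual);
    println!("budget after settle: {}", budget.tokens);
}
```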
Every feature ships to solve a real problem developers hit every day.
A token bucket enforces your configured rate across all concurrent requests. Fast path for bursts, smart queuing when you're at the ceiling. Set it once to 80% of your API tier and never see a 429 again.
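To make the mechanism concrete, here is a minimal sketch of a debt-based token bucket in Rust. The names (`TokenBucket`, `acquire`) and the debt model are hypothetical, not Nozzle's internals: tokens refill continuously at your configured rate, a request that fits the current level passes immediately, and one that overshoots returns the delay to wait before sending.

```rust
// Minimal sketch of a debt-based token bucket (hypothetical names,
// not Nozzle's actual internals). Tokens refill continuously at
// `rate` per second up to `capacity`; `acquire` returns how long
// the caller should wait before sending the request.
use std::time::{Duration, Instant};

struct TokenBucket {
    capacity: f64, // e.g. 80% of your API tier's token limit
    tokens: f64,   // current level; may go negative (debt)
    rate: f64,     // refill rate in tokens per second
    last: Instant, // last refill timestamp
}

impl TokenBucket {
    fn new(capacity: f64, rate: f64) -> Self {
        Self { capacity, tokens: capacity, rate, last: Instant::now() }
    }

    fn acquire(&mut self, cost: f64) -> Duration {
        // Refill based on elapsed time, capped at capacity.
        let now = Instant::now();
        self.tokens = (self.tokens
            + now.duration_since(self.last).as_secs_f64() * self.rate)
            .min(self.capacity);
        self.last = now;

        self.tokens -= cost; // charge; debt is repaid by future refills
        if self.tokens >= 0.0 {
            Duration::ZERO // fast path: send immediately
        } else {
            // At the ceiling: wait until refills cover the debt.
            Duration::from_secs_f64(-self.tokens / self.rate)
        }
    }
}

fn main() {
    // A 40k tokens/min tier budgeted at 80%: 32k capacity, ~533 tokens/sec.
    let mut bucket = TokenBucket::new(32_000.0, 32_000.0 / 60.0);
    println!("wait: {:?}", bucket.acquire(5_000.0));  // fits: zero wait
    println!("wait: {:?}", bucket.acquire(30_000.0)); // overshoots: queued
}
```

The debt model gives you both behaviors in one bucket: bursts ride the fast path until the level hits zero, and anything beyond that is queued by exactly the refill shortfall.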
Anthropic, OpenAI, and Google out of the box — each on its own local port. Add any HTTP API as a provider; token extraction is built in for all three, and other providers fall back to body-length estimation.
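The response shapes involved are public: Anthropic reports usage.input_tokens and usage.output_tokens, OpenAI reports usage.total_tokens, and Gemini reports usageMetadata.totalTokenCount. Here is a sketch of how such a fallback chain might look; the JSON field names match those public shapes, but the surrounding structure is hypothetical, not Nozzle's code.

```rust
// Sketch of per-provider usage extraction with a byte-length fallback.
// The JSON field names match the public Anthropic/OpenAI/Gemini response
// shapes; the surrounding structure is hypothetical, not Nozzle's code.
// Requires the serde_json crate.
use serde_json::Value;

enum Provider {
    Anthropic,
    OpenAi,
    Google,
    Other,
}

fn actual_tokens(provider: &Provider, body: &[u8]) -> u64 {
    let json: Value = serde_json::from_slice(body).unwrap_or(Value::Null);
    let counted = match provider {
        Provider::Anthropic => json["usage"]["input_tokens"]
            .as_u64()
            .zip(json["usage"]["output_tokens"].as_u64())
            .map(|(i, o)| i + o),
        Provider::OpenAi => json["usage"]["total_tokens"].as_u64(),
        Provider::Google => json["usageMetadata"]["totalTokenCount"].as_u64(),
        Provider::Other => None,
    };
    // Unknown providers (or a missing usage block) fall back to a
    // body-length estimate: roughly 4 bytes per token.
    counted.unwrap_or_else(|| (body.len() as u64 / 4).max(1))
}

fn main() {
    let resp = br#"{"usage":{"input_tokens":12,"output_tokens":30}}"#;
    println!("{}", actual_tokens(&Provider::Anthropic, resp)); // prints 42
}
```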
MIT licensed. No black boxes. No paywalls. Community-driven and built in Rust for reliability. Add providers, tune limits, build a dashboard — it's all yours.
Works out of the box for Anthropic, OpenAI, and Google with sensible defaults. Installs as a macOS LaunchAgent in one command — starts on login, restarts on crash. Change one env var and you're proxying.
One binary. No runtime dependencies. Installs as a background service.
export ANTHROPIC_BASE_URL=http://127.0.0.1:8771 — drop it in your shell profile. Done.
curl localhost:8770/status — see token counts, rate-limit delays, and bucket level per provider in real time.
3 LLM providers · sub-millisecond added overhead · 100% open source · MIT license
Nozzle is early-stage and community-driven. There's real work to do — Linux support, Homebrew formula, per-provider rate limits, a proper dashboard, Prometheus metrics. If you've ever been throttled by a platform you depend on, you know why this matters. Come build with us.