
Run your own local LLM with rate limits via API-keys

⚡ TL;DR · AI summary

A small Ruby prototype runs an OpenAI-compatible proxy in front of a local LLM and rate-limits clients by API key. Each limit is enforced with a refillable token bucket, and the proxy depends only on the Ruby standard library. The project also covers testing the proxy and managing token limits for individual clients.
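To make the idea of a refillable token bucket concrete, here is a minimal sketch in plain Ruby. It is not taken from the project: the class name `TokenBucket`, its parameters, and the per-key hash are all hypothetical, and the real proxy may count and refill tokens differently.

```ruby
# Minimal refillable token bucket (illustrative; all names are hypothetical).
class TokenBucket
  def initialize(capacity:, refill_per_second:,
                 now: Process.clock_gettime(Process::CLOCK_MONOTONIC))
    @capacity = capacity.to_f
    @refill_per_second = refill_per_second.to_f
    @tokens = @capacity          # a new bucket starts full
    @updated_at = now
    @mutex = Mutex.new           # guards state if requests run in threads
  end

  # Tokens available after applying any refill owed up to `now`.
  def available(now: monotonic_now)
    @mutex.synchronize do
      refill(now)
      @tokens
    end
  end

  # Deducts `amount` tokens and returns true, or returns false
  # (leaving the bucket unchanged) when there are not enough tokens.
  def consume(amount, now: monotonic_now)
    @mutex.synchronize do
      refill(now)
      return false if amount > @tokens

      @tokens -= amount
      true
    end
  end

  private

  def monotonic_now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end

  # Adds tokens in proportion to elapsed time, capped at capacity.
  def refill(now)
    elapsed = [now - @updated_at, 0].max
    @tokens = [@capacity, @tokens + elapsed * @refill_per_second].min
    @updated_at = now
  end
end

# One bucket per client API key, created on first use (assumed limits).
buckets = Hash.new do |hash, api_key|
  hash[api_key] = TokenBucket.new(capacity: 1000, refill_per_second: 10)
end
```

A proxy built this way would call something like `buckets[api_key].consume(cost)` before forwarding a request and reject the request when it returns false. The `now:` parameter exists only so the refill arithmetic can be exercised without waiting on a real clock.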


LLM token bucket proxy

Small Ruby prototype for an OpenAI-compatible LLM proxy with a refillable token bucket. It uses only Ruby standard libraries: no gems, no Rack, no WEBrick.

Run

    BASE_API_URL=http://192.168.0.124:8888/v1 \
    BASE_API_KEY=1mmer \
    BASE_MODEL=gemma4 \
    ruby llm_proxy.rb

The proxy listens on 0.0.0.0:8899 by default.

For your local LLM at 192.168.0.124:8888, run the saved local setup:

    ./run_local_proxy.sh

That starts the Ruby proxy at http://127.0.0.1:8899/v1 and forwards to http://192.168.0.124:8888/v1.
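Once the proxy is running, a client could talk to it like any OpenAI-compatible endpoint. The sketch below builds such a request with the standard library only. Several details are assumptions not confirmed by the excerpt: the `chat/completions` path, passing the client key as a `Bearer` token, the placeholder key `client-key-1`, and sending the model name `gemma4` from the client. The helper `build_chat_request` is hypothetical.

```ruby
require "json"
require "net/http"
require "uri"

# Builds (but does not send) a chat completion request for the proxy.
# The path and header format follow the usual OpenAI convention (assumed).
def build_chat_request(base_url:, api_key:, model:, prompt:)
  base = base_url.end_with?("/") ? base_url : "#{base_url}/"
  uri = URI.join(base, "chat/completions")

  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/json"
  request["Authorization"] = "Bearer #{api_key}"
  request.body = JSON.generate(
    model: model,
    messages: [{ role: "user", content: prompt }]
  )
  [uri, request]
end

uri, request = build_chat_request(
  base_url: "http://127.0.0.1:8899/v1", # proxy address from the excerpt
  api_key: "client-key-1",              # placeholder client key
  model: "gemma4",
  prompt: "Hello"
)

# To actually send it, with the proxy running:
#   response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
#   puts response.code
```

Keeping construction separate from sending makes it easy to inspect the URL, headers, and body before pointing the client at a live proxy.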

Excerpt limited to ~120 words for fair-use compliance. The full article is at GitHub.
