Knarr — Langertha LLM Proxy
. * .
. _/|_ . KNARR
. /| |\ . Langertha LLM Proxy
~~~~~|______|~~~~~
~~ ~~~~~~~~~~~~~ ~~ Cargo transport for your LLM calls
~~~~~~~~~~~~~~~~~~~~
An LLM proxy that routes requests from any client to any backend — with automatic Langfuse tracing for every call.
Set your API key, start the container, done. All requests are traced.
Getting Started
docker run -e ANTHROPIC_API_KEY -p 8080:8080 raudssus/langertha-knarr
Now point Claude Code at it:
ANTHROPIC_BASE_URL=http://localhost:8080 claude
That's it. Claude Code sends requests to Knarr, Knarr forwards them to Anthropic using your API key (passthrough mode). Add Langfuse keys and every request gets traced automatically.
How it works
Knarr runs in passthrough mode by default: requests that don't match a configured model are forwarded to the upstream API (Anthropic, OpenAI) using the client's own API key. No key duplication, no configuration needed.
Claude Code Anthropic API
│ ▲
│ ANTHROPIC_BASE_URL=http://localhost:8080 │
▼ │
Knarr ──────── passthrough ──────────────────────►│
│ │
└── Langfuse trace (auto)
For explicit routing (e.g., send "gpt-4o" requests to OpenAI, "cheap" to Groq), configure models in a YAML file or let Knarr auto-detect from environment variables.
More examples
# OpenAI Python SDK
OPENAI_BASE_URL=http://localhost:8080/v1 python my_app.py
# curl
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}]}'
# Ollama clients (Open WebUI, etc.)
OLLAMA_HOST=http://localhost:11434 open-webui
Knarr listens on:
- Port 8080 — OpenAI + Anthropic API (passthrough + routing)
- Port 11434 — Ollama API (routing only)
- Health — http://localhost:8080/health
Windows
Use WSL2 — all commands work as-is inside a WSL terminal:
wsl
docker run --env-file .env -p 8080:8080 -p 11434:11434 raudssus/langertha-knarr
Or with Docker Desktop from PowerShell:
docker run --env-file .env -p 8080:8080 -p 11434:11434 raudssus/langertha-knarr
The --env-file .env approach works identically on Linux, macOS, and
Windows. Create your .env file once, run the same command everywhere.
Using a .env File
Create a .env file with your API keys (see .env.example):
# .env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
Then run with --env-file:
docker run --env-file .env -p 8080:8080 -p 11434:11434 raudssus/langertha-knarr
Knarr reads the file, detects which providers have keys, configures them with sensible default models, and starts serving.
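The detection step can be sketched roughly like this. This is an illustrative Python sketch with hypothetical helper names, not Knarr's actual code (Knarr itself is written in Perl), and it covers only a few of the supported providers:

```python
# Map of provider name -> API key variable, mirroring the table in this README.
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "groq": "GROQ_API_KEY",
}

def parse_dotenv(text):
    """Parse simple KEY=value lines, skipping comments and blank lines."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def detect_providers(env):
    """Return providers whose API key variable is present and non-empty."""
    return sorted(name for name, var in PROVIDER_KEYS.items() if env.get(var))
```

With a .env containing OPENAI_API_KEY and ANTHROPIC_API_KEY, detect_providers would report both providers as configured.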
Docker Build (Temporary CPAN Indexer Bypass)
The default build flow is unchanged:
docker build -t raudssus/langertha-knarr .
If CPAN indexers lag behind new releases, inject a direct CPAN dist path for Langertha:
docker build -t raudssus/langertha-knarr \
--build-arg LANGERTHA_SRC='GETTY/Langertha-0.307.tar.gz' \
.
LANGERTHA_SRC is passed directly to cpanm (for example AUTHOR/Dist-x.yyy.tar.gz or a tarball URL).
Docker Compose
The included docker-compose.yml starts Knarr with Langfuse tracing
out of the box:
cp .env.example .env
# Edit .env — add your API keys and Langfuse keys
docker compose up
This starts:
| Service | Port | Description |
|---------|------|-------------|
| Knarr | 8080, 11434 | LLM Proxy |
| Langfuse | 3000 | Tracing Dashboard |
| PostgreSQL | — | Langfuse storage |
The docker-compose.yml automatically loads .env and connects Knarr to
the Langfuse instance. Open http://localhost:3000 for the dashboard — every
LLM call through Knarr is traced with model, input, output, latency, and
token usage.
Minimal Docker Compose (without Langfuse)
If you don't need tracing:
services:
knarr:
image: raudssus/langertha-knarr
ports:
- "8080:8080"
- "11434:11434"
env_file: .env
Multiple Providers
Set multiple API keys — Knarr configures all of them automatically:
docker run --env-file .env -p 8080:8080 -p 11434:11434 raudssus/langertha-knarr
[knarr] Knarr LLM Proxy starting in container mode...
[knarr]
[knarr] Config: auto-detecting from environment variables
[knarr] Engines: 3 provider(s) configured
[knarr]
[knarr] anthropic => Anthropic / claude-sonnet-4-6 (key from $ANTHROPIC_API_KEY)
[knarr] groq => Groq / llama-3.3-70b-versatile (key from $GROQ_API_KEY)
[knarr] openai => OpenAI / gpt-4o-mini (key from $OPENAI_API_KEY)
[knarr]
[knarr] Auto-discover: enabled (will query provider model lists)
[knarr] Default engine: OpenAI
[knarr] Langfuse: disabled (set LANGFUSE_PUBLIC_KEY + LANGFUSE_SECRET_KEY to enable)
[knarr] Proxy auth: open (set KNARR_API_KEY to require authentication)
Each provider gets a default model:
| Provider | Default Model | ENV Variable |
|----------|---------------|--------------|
| OpenAI | gpt-4o-mini | OPENAI_API_KEY |
| Anthropic | claude-sonnet-4-6 | ANTHROPIC_API_KEY |
| Groq | llama-3.3-70b-versatile | GROQ_API_KEY |
| Mistral | mistral-large-latest | MISTRAL_API_KEY |
| DeepSeek | deepseek-chat | DEEPSEEK_API_KEY |
| MiniMax | MiniMax-M2.1 | MINIMAX_API_KEY |
| Gemini | gemini-2.0-flash | GEMINI_API_KEY |
| OpenRouter | openai/gpt-4o-mini | OPENROUTER_API_KEY |
| Perplexity | sonar | PERPLEXITY_API_KEY |
| Cerebras | llama-3.3-70b | CEREBRAS_API_KEY |
With auto-discover enabled (default), Knarr queries each provider's model list — so you can use any model they offer, not just the defaults.
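The effect of auto-discover on model resolution might look like the following sketch. The model lists and fallback behavior here are illustrative assumptions, not Knarr's internals:

```python
# Per-provider default models (a subset of the table above) and
# hypothetical auto-discovered model lists.
DEFAULTS = {"openai": "gpt-4o-mini", "anthropic": "claude-sonnet-4-6"}
DISCOVERED = {
    "openai": {"gpt-4o-mini", "gpt-4o"},
    "anthropic": {"claude-sonnet-4-6"},
}

def resolve(model, default_engine="openai"):
    """Any discovered model is usable, not just the per-provider default.
    Unknown names fall back to the default engine's default model."""
    for provider, models in DISCOVERED.items():
        if model in models:
            return provider, model
    return default_engine, DEFAULTS[default_engine]
```

So a request for gpt-4o resolves to OpenAI even though the default model is gpt-4o-mini, because auto-discover found it in the provider's model list.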
Langfuse Tracing
Knarr traces every request automatically when Langfuse credentials are set.
Add these to your .env:
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
That's it. Every proxy request creates:
- Trace with model name, engine type, API format, and full input/output
- Generation with start/end time, token usage, and model information
- Error tracking when backend calls fail
- Tag knarr on all traces
Langfuse Cloud
Just set the keys — Langfuse Cloud (https://cloud.langfuse.com) is the
default:
# .env
OPENAI_API_KEY=sk-...
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
Self-Hosted Langfuse
Use docker compose up for a local Langfuse stack, or point at your own:
# .env
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_URL=http://my-langfuse-server:3000
Proxy Authentication
Protect your proxy with an API key:
# .env
KNARR_API_KEY=my-secret-proxy-key
Clients must send Authorization: Bearer my-secret-proxy-key or
x-api-key: my-secret-proxy-key. The /health endpoint is always open.
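The check described above can be sketched as follows. This is a simplified illustration of the rule, not Knarr's implementation:

```python
def authorized(path, headers, knarr_api_key):
    """Accept either Authorization: Bearer <key> or x-api-key: <key>.
    /health is always open; with no KNARR_API_KEY set, everything is open."""
    if path == "/health" or knarr_api_key is None:
        return True
    if headers.get("authorization") == f"Bearer {knarr_api_key}":
        return True
    return headers.get("x-api-key") == knarr_api_key
```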
API Formats
Knarr accepts three API formats and routes them to any Langertha backend:
OpenAI (Port 8080)
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}]}'
curl http://localhost:8080/v1/models
Anthropic (Port 8080)
curl http://localhost:8080/v1/messages \
-H "Content-Type: application/json" \
-d '{"model":"claude-sonnet-4-6","messages":[{"role":"user","content":"Hello"}],"max_tokens":1024}'
Ollama (Port 11434)
curl http://localhost:11434/api/chat \
-d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}]}'
curl http://localhost:11434/api/tags
All formats support streaming — SSE for OpenAI/Anthropic, NDJSON for Ollama.
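The two stream shapes differ in framing. A minimal client-side sketch of both (generic data: handling for SSE, one JSON object per line for NDJSON; details like Anthropic's event: lines are omitted):

```python
import json

def parse_sse(raw):
    """Collect JSON payloads from data: lines of an SSE stream
    (OpenAI/Anthropic style), skipping the [DONE] sentinel."""
    out = []
    for line in raw.splitlines():
        if line.startswith("data: ") and line != "data: [DONE]":
            out.append(json.loads(line[len("data: "):]))
    return out

def parse_ndjson(raw):
    """Each non-empty line of an Ollama stream is a standalone JSON object."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]
```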
Tool Calling Bridge
Knarr can bridge tool-calling payloads across API formats when a client format and backend engine format differ.
- OpenAI client format to Anthropic-compatible backend:
  OpenAI tools/tool_choice and assistant tool_calls are mapped to Anthropic
  tools/tool_choice + tool_use/tool_result blocks.
- Anthropic client format to OpenAI-compatible backend:
  Anthropic tool blocks are mapped to OpenAI tool_calls and tool messages.
- Hermes-style tool output support:
  If a backend emits Hermes XML (<tool_call>{...}</tool_call>), Knarr parses it
  and exposes native tool-call structures to OpenAI and Anthropic clients.
This lets you test tool behavior through one endpoint while targeting different engine families behind Knarr.
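Two of the mappings above can be sketched like this. The field names follow the public OpenAI and Anthropic tool schemas; the function names are hypothetical and the sketch skips tool_choice, tool results, and edge cases:

```python
import json
import re

def openai_tools_to_anthropic(tools):
    """Map OpenAI function-style tool definitions to Anthropic's
    tool schema (name / description / input_schema)."""
    return [
        {
            "name": t["function"]["name"],
            "description": t["function"].get("description", ""),
            "input_schema": t["function"].get("parameters", {}),
        }
        for t in tools
        if t.get("type") == "function"
    ]

def parse_hermes(text):
    """Extract Hermes-style <tool_call>{...}</tool_call> blocks as dicts."""
    return [
        json.loads(m)
        for m in re.findall(r"<tool_call>(.*?)</tool_call>", text, re.S)
    ]
```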
Use Cases
Claude Code through any backend
docker run --env-file .env -p 8080:8080 raudssus/langertha-knarr
# In another terminal:
ANTHROPIC_BASE_URL=http://localhost:8080 claude
Every Claude Code request gets traced in Langfuse.
Ollama clients with cloud models
Use cloud LLMs from any Ollama-compatible client like Open WebUI:
docker run --env-file .env -p 11434:11434 raudssus/langertha-knarr
# Open WebUI connects to port 11434, thinks it's Ollama,
# but requests go to cloud providers through Knarr
Local + Cloud hybrid
Mount a config file for custom routing:
# knarr.yaml
models:
llama3.2:
engine: OllamaOpenAI
url: http://host.docker.internal:11434/v1
model: llama3.2
gpt-4o:
engine: OpenAI
default:
engine: OllamaOpenAI
url: http://host.docker.internal:11434/v1
docker run --env-file .env \
-v ./knarr.yaml:/etc/knarr/config.yaml \
-p 8080:8080 -p 11434:11434 \
raudssus/langertha-knarr start -c /etc/knarr/config.yaml
Using a Config File
For more control than auto-detection, create a knarr.yaml:
listen:
- "127.0.0.1:8080"
- "127.0.0.1:11434"
models:
gpt-4o:
engine: OpenAI
gpt-4o-mini:
engine: OpenAI
model: gpt-4o-mini
claude-sonnet:
engine: Anthropic
model: claude-sonnet-4-6
api_key: ${ANTHROPIC_API_KEY}
local-llama:
engine: OllamaOpenAI
url: http://localhost:11434/v1
model: llama3.2
deepseek:
engine: DeepSeek
model: deepseek-chat
default:
engine: OpenAI
auto_discover: true
# Passthrough: requests go directly to upstream APIs
# The client's own API key is used — no duplication needed
# Models with explicit config above are routed via Langertha,
# everything else passes through transparently
passthrough:
anthropic: https://api.anthropic.com
openai: https://api.openai.com
# Or point at a custom upstream:
# anthropic: https://my-anthropic-cache.internal
# proxy_api_key: your-secret
# langfuse:
# url: http://localhost:3000
# public_key: pk-lf-...
# secret_key: sk-lf-...
Config values support ${ENV_VAR} interpolation — variables are resolved
at startup.
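The interpolation can be sketched with a single regex pass. Treating unset variables as empty strings is an assumption of this sketch, not documented Knarr behavior:

```python
import re

def interpolate(value, env):
    """Replace ${VAR} occurrences with values from env.
    Unknown variables become empty strings (assumption)."""
    return re.sub(r"\$\{(\w+)\}", lambda m: env.get(m.group(1), ""), value)
```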
models.<name>.engine resolves in this order:
1. Langertha::Engine::<EngineName>
2. LangerthaX::Engine::<EngineName>
3. A fully-qualified class name, if you set one directly
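The resolution order can be sketched as a first-match lookup. Here loadable stands in for Perl's "can this class be loaded" check; the helper name is hypothetical:

```python
def resolve_engine_class(name, loadable):
    """Try candidate namespaces in order, falling back to the
    fully-qualified name itself."""
    candidates = (
        f"Langertha::Engine::{name}",
        f"LangerthaX::Engine::{name}",
        name,
    )
    for candidate in candidates:
        if candidate in loadable:
            return candidate
    raise LookupError(f"no engine class for {name}")
```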
Passthrough Mode
Passthrough is the default behavior: requests go directly to the upstream API (Anthropic, OpenAI) using the client's own API key. No key duplication, no model configuration needed. Knarr just sits in the middle and traces.
If you also configure explicit model routing (the models: section), those
specific models are handled by Langertha engines. Everything else still
passes through.
Enabled by default in container mode. In a config file:
# Enable with default upstream URLs
passthrough: true
# Or per format with custom upstreams
passthrough:
anthropic: https://api.anthropic.com
openai: https://my-openai-mirror.internal
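The routing decision reduces to a simple rule. A sketch of that rule, with illustrative data structures rather than Knarr's internals:

```python
def dispatch(model, configured, upstreams, api_format):
    """Explicitly configured models are routed via Langertha engines;
    everything else passes through to the upstream for its API format."""
    if model in configured:
        return ("route", configured[model])
    return ("passthrough", upstreams[api_format])
```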
Claude Code example — no Knarr API key needed, your existing key works:
docker run -p 8080:8080 raudssus/langertha-knarr
ANTHROPIC_BASE_URL=http://localhost:8080 claude
Generating a Config
Knarr can generate a config from your environment:
# Via Docker — pass your env vars through
docker run --rm --env-file .env raudssus/langertha-knarr init > knarr.yaml
# Or pass all API keys from your current shell
docker run --rm \
$(env | grep -E '_(API_KEY|API_TOKEN)=|^LANGFUSE_' | sed 's/^/-e /') \
raudssus/langertha-knarr init > knarr.yaml
Then mount it:
docker run --env-file .env \
-v ./knarr.yaml:/etc/knarr/config.yaml \
-p 8080:8080 -p 11434:11434 \
raudssus/langertha-knarr start -c /etc/knarr/config.yaml
All Environment Variables
API Keys
| Variable | Provider |
|----------|----------|
| OPENAI_API_KEY | OpenAI |
| ANTHROPIC_API_KEY | Anthropic |
| GROQ_API_KEY | Groq |
| MISTRAL_API_KEY | Mistral |
| DEEPSEEK_API_KEY | DeepSeek |
| MINIMAX_API_KEY | MiniMax |
| GEMINI_API_KEY | Gemini |
| OPENROUTER_API_KEY | OpenRouter |
| PERPLEXITY_API_KEY | Perplexity |
| CEREBRAS_API_KEY | Cerebras |
| REPLICATE_API_TOKEN | Replicate |
| HUGGINGFACE_API_KEY | HuggingFace |
LANGERTHA_-prefixed variants (e.g., LANGERTHA_OPENAI_API_KEY) take
priority over bare names.
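The priority rule is a two-step lookup, sketched here for illustration:

```python
def api_key_for(var, env):
    """The LANGERTHA_-prefixed variant wins over the bare name."""
    return env.get(f"LANGERTHA_{var}") or env.get(var)
```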
Langfuse
| Variable | Description | Default |
|----------|-------------|---------|
| LANGFUSE_PUBLIC_KEY | Public key (pk-lf-...) | — |
| LANGFUSE_SECRET_KEY | Secret key (sk-lf-...) | — |
| LANGFUSE_URL | Server URL | https://cloud.langfuse.com |
Proxy
| Variable | Description | Default |
|----------|-------------|---------|
| KNARR_API_KEY | Require client authentication | — (open) |
CLI Reference
knarr Show help
knarr container Auto-start from ENV (Docker default)
knarr start Start with config file (./knarr.yaml)
knarr start -p 9090 Custom port
knarr start -c prod.yaml Custom config
knarr init Generate config from environment
knarr init -e .env Include .env file in scan
knarr models List configured models
knarr models --format json
knarr check Validate config file
Installing as a Perl Module
Knarr is also a standard CPAN distribution:
cpanm Langertha::Knarr
Then use the knarr CLI directly:
export OPENAI_API_KEY=sk-...
knarr init > knarr.yaml
knarr start
Using Knarr Programmatically
use Langertha::Knarr;
use Langertha::Knarr::Config;
# Build from YAML config
my $config = Langertha::Knarr::Config->new(file => 'knarr.yaml');
my $app = Langertha::Knarr->build_app(config => $config);
# Or build from environment (like container mode)
my $config = Langertha::Knarr::Config->from_env;
my $app = Langertha::Knarr->build_app(config => $config);
# $app is a Mojolicious app — embed, test, or run as you like
use Mojo::Server::Daemon;
Mojo::Server::Daemon->new(
app => $app,
listen => ['http://127.0.0.1:8080'],
)->run;
You can also add request policy hooks when building the app:
my $app = Langertha::Knarr->build_app(
config => $config,
before_request => sub ($c, $ctx) {
# $ctx: proxy_class, type, format, body, model_name, stream, messages, params
return {
stop => 1,
status => 418,
message => 'embeddings disabled by policy',
type => 'policy_denied',
} if $ctx->{type} eq 'embedding';
return;
},
api_key_validator => sub ($c, $ctx) {
# $ctx: api_key, raw_auth, path, method, content_type
return { allow => 1 } if $ctx->{api_key} eq 'allow-key';
return { allow => 0, status => 403, message => 'forbidden' };
},
);
Using the Config and Router Independently
use Langertha::Knarr::Config;
use Langertha::Knarr::Router;
my $config = Langertha::Knarr::Config->new(file => 'knarr.yaml');
my $router = Langertha::Knarr::Router->new(config => $config);
# Resolve a model name to a Langertha engine
my ($engine, $model) = $router->resolve('gpt-4o-mini');
# $engine is a Langertha::Engine::OpenAI (or whatever the config maps to)
# $model is the resolved model name
my $response = $engine->simple_chat(
{ role => 'user', content => 'Hello!' },
);
Built With
- Langertha — Perl LLM framework with 22+ engine backends
- Mojolicious — Real-time web framework for Perl
- Langfuse — Open source LLM observability
License
This software is copyright (c) 2026 by Torsten Raudssus.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.