LLamaPass

Your Ollama.
One gateway. Every user.

Self-hosted API gateway for Ollama with multi-user support, API key management, rate limiting, and real-time usage tracking.

Get started

LLamaPass is open-source software you install on your server to manage access to Ollama.

Join this server

Use this instance

The admin of this server has shared LLamaPass with you. Register with your invite code to get access.

  • Register with an invite code for instant access
  • Generate your API key from the dashboard
  • Call models via the API or the OpenAI SDK
Register
Deploy your own

Run LLamaPass on your server

Install LLamaPass with Docker to manage multi-user access to your own Ollama instance.

  • Clone and run with a single Docker command
  • Connect to your Ollama instance
  • Invite users and manage their access
View on GitHub

Use from the terminal

Install the CLI and start chatting with models instantly

terminal
# Install the CLI
$ pip install llamapass

# Configure your API key
$ llamapass config set-key oah_your_key

# Chat with any model
$ llamapass run gemma3
LLamaPass - gemma3 (type /bye to quit)

>>> Hello!
Hi there! How can I help you today?

Deploy in minutes

Clone, configure, and run with Docker

terminal
# Clone and run with Docker
$ git clone https://github.com/edoardoted99/llamapass.git
$ cd llamapass
$ cp .env.example .env
$ docker compose up --build

# Create your first admin user
$ docker compose exec web python manage.py createsuperuser

# Ready at http://localhost:8000

Everything you need

A complete gateway between your users and Ollama

🔑

API Key Management

Create, revoke, and manage keys with expiration, per-key model restrictions, and rate limits.

📊

Usage Dashboard

30-day analytics with charts for requests, tokens, latency, errors, and model breakdown per key.

⏱️

Rate Limiting

Configurable per-key limits backed by Redis. Live monitoring shows how close each key is to its limit.
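When a key exceeds its limit, the gateway refuses further requests until the window resets (a 429 status is the usual convention for this; the exact status code is an assumption here, not documented behavior). A client can absorb these refusals with exponential backoff. A minimal sketch in Python, independent of any particular HTTP library:

```python
import time

def post_with_retry(send, max_retries=3, base_delay=0.5):
    """Call send() until it returns a non-429 response or retries run out.

    `send` is any zero-argument callable returning an object with a
    `status_code` attribute (e.g. a functools.partial around requests.post).
    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts.
    """
    for attempt in range(max_retries):
        resp = send()
        if resp.status_code != 429:  # not rate limited: hand it back
            return resp
        time.sleep(base_delay * (2 ** attempt))
    return send()  # final attempt, returned as-is even if still limited
```

With the `requests` library, `send` could be `functools.partial(requests.post, url, headers=headers, json=payload)`.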

🔄

Streaming Proxy

Transparent async proxy to Ollama. Full streaming support for chat and generate endpoints.
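Ollama's native chat endpoint streams newline-delimited JSON chunks, each carrying a fragment of the reply, with a final `"done": true` chunk. Because the proxy is transparent, a client can consume the stream from the gateway the same way. A minimal sketch of the chunk parsing (the chunk shape follows Ollama's streaming format; how you obtain the raw lines depends on your HTTP client):

```python
import json

def iter_stream_content(lines):
    """Yield assistant text fragments from an Ollama NDJSON chat stream.

    `lines` is any iterable of raw NDJSON lines (str or bytes), e.g.
    response.iter_lines() from an HTTP client calling the chat endpoint
    with "stream": true.
    """
    for line in lines:
        if not line or not line.strip():
            continue  # skip keep-alive blank lines
        chunk = json.loads(line)
        if chunk.get("done"):  # final chunk carries stats, no more text
            break
        yield chunk.get("message", {}).get("content", "")
```

Printing each fragment as it arrives (`print(part, end="", flush=True)`) reproduces the token-by-token output the CLI shows.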

🧪

Built-in Test Page

Test Chat, Generate, and Embeddings endpoints directly from the browser. No curl needed.

🤝

OpenAI Compatible

Works with OpenAI SDKs out of the box. Just point your base URL at the gateway and use your LLamaPass API key.

Simple to integrate

Use any HTTP client or the OpenAI SDK

curl
curl https://llamapass.org/ollama/api/chat \
  -H "Authorization: Bearer oah_your_key" \
  -d '{
    "model": "gemma3:1b",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "stream": false
  }'
Python — OpenAI SDK
from openai import OpenAI

client = OpenAI(
    base_url="https://llamapass.org/ollama/v1",
    api_key="oah_your_key",
)

response = client.chat.completions.create(
    model="gemma3:1b",
    messages=[
      {"role": "user", "content": "Hello!"}
    ],
)
print(response.choices[0].message.content)
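The curl call translates directly to any HTTP client. A sketch using only Python's standard library, with the same endpoint and placeholder key as the examples above:

```python
import json
import urllib.request

def chat_request(api_key, model, prompt,
                 url="https://llamapass.org/ollama/api/chat"):
    """Build the same request as the curl example, ready for urlopen()."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To send it:
# reply = json.load(urllib.request.urlopen(
#     chat_request("oah_your_key", "gemma3:1b", "Hello!")))
# print(reply["message"]["content"])
```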