Introduction
Debug your AI agents with complete visibility into every request. vLLora works out of the box with OpenAI-compatible endpoints, supports 300+ models with your own keys, and captures deep traces on latency, cost, and model output.

Installation
Install vLLora on Linux and macOS using Homebrew:
brew tap vllora/vllora
brew install vllora
Launch vLLora:
vllora
For more installation options (Rust crate, build from source, etc.), see the Installation page.
Send Your First Request
After starting vLLora, visit http://localhost:9091 to configure your API keys through the UI. Once configured, point your application to http://localhost:9090 as the base URL.
vLLora works as a drop-in replacement for the OpenAI API, so you can use any OpenAI-compatible client or SDK. Every request will be captured and visualized in the UI with full tracing details.
curl -X POST http://localhost:9090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Hello, vLLora!"
      }
    ]
  }'
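Because the endpoint is OpenAI-compatible, the same request also works through the official OpenAI Python SDK. A minimal sketch, assuming the openai package is installed; the api_key value is a placeholder, since provider keys are configured in the vLLora UI rather than passed by the client:
from openai import OpenAI

# Point the SDK at vLLora instead of api.openai.com.
# Provider keys live in the vLLora UI, so the SDK key is a placeholder.
client = OpenAI(base_url="http://localhost:9090/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello, vLLora!"}],
)
print(response.choices[0].message.content)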
Now check http://localhost:9091 to see your first trace with full request details, costs, and timing breakdowns.