Using vLLora to debug AI agents

Building agents takes many cycles of tweaking prompts, running evaluations, and figuring out what went wrong. vLLora lets you see traces in real time so you can debug, monitor, and optimize faster. It’s fair-code runs locally, and stores data in SQLite for fine-tuning later. Just swap out your OpenAI URL with your local vLLora endpoint, everything else works the same.

Debugging demo

Observe your agent in real time

Step 1: Setup vLLora

Download and install vLLora using Homebrew:

brew tap vllora/vllora
brew install vllora
vllora

Setup your provider
Change your OpenAI base URL to vLLora which is http://localhost:9090/v1

vLLora Dashboard showing Chat, Debug, and provider setup

Step 2: Send a request

You can now send a request to vLLora using the chat section or by using the curl command.

vLLora Chat section

Step 3: Debug your agent

Once your request runs through vLLora, you can debug your agent by going over the traces. You could see the exact request sent to the model, the response from the model, time taken to run the request, tool call arguments, and more.

vLLora Debugging Agent

Tip: Keep the Debug tab open while you experiment — every new request streams in instantly.

Using vLLora with your existing AI Agents

vLLora is OpenAI-compatible, so you can point your existing agent frameworks (LangChain, CrewAI, Google ADK, custom apps, etc.) to vLLora without code changes beyond the base URL.

LangChain (Python)
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

llm = ChatOpenAI(
    base_url="http://localhost:9090/v1",
    model="openai/gpt-4o-mini",
    api_key="no_key",
    temperature=0.2,
)

That's it. From here, you can debug your agents in real time by inspecting every call, tool invocation, and understand how your agents behave.

Enhanced Tracing with vLLora Python Package

For even deeper insights, we also provide the vLLora Python package that complements the OpenAI-compatible approach. This package offers enhanced tracing with framework-specific details like agent workflows, tool calls, and multi-step execution paths.

Example: OpenAI Agents SDK

Install the vLLora package with the OpenAI feature flag:

pip install 'vllora[openai]'

Then initialize vLLora before creating or running any OpenAI agents:

from vllora.openai import init
init()

# Then proceed with your normal OpenAI setup:
from openai import OpenAI
# ...define and run agents...

Once initialized, vLLora automatically captures all agent interactions, function calls, and streaming responses with full end-to-end tracing across your workflow.

Learn more: See our Working with Agent Frameworks guide for detailed integration instructions with OpenAI Agents SDK, Google ADK, and more frameworks.

Next steps

Read the Quickstart to install and send your first trace: Quickstart
Deeper integration with agent frameworks (optional): Working with Agent Frameworks
Overview of the product and setup details: Introduction

Observe your agent in real time​

Step 1: Setup vLLora​

Step 2: Send a request​

Step 3: Debug your agent​

Using vLLora with your existing AI Agents​

Enhanced Tracing with vLLora Python Package​

Example: OpenAI Agents SDK​

Next steps​