
Introducing the vLLora MCP Server

· 4 min read
Mrunmay
AI Engineer

If you’re building agents with tools like Claude Code or Cursor, or you prefer working in the terminal, you’ve probably hit this friction already. Your agent runs, something breaks partway through, and now you have to context-switch to a web UI to understand what happened. You search for the right trace, click through LLM calls, and then try to carry that context back into your editor.

vLLora’s MCP Server removes that context switch. Your coding agent becomes the interface for inspecting traces, understanding failures, and debugging agent behavior — without leaving your editor or terminal.

[Image: vLLora MCP Server]

Debugging Agents: Why Prompt Tweaks Can't Fix Stale State

· 8 min read
Mrunmay
AI Engineer

In the earlier deep-agent case study (Browsr), I focused on architecture. Here I'll stay grounded in one debugging failure I hit in a maps agent—a failure that looked like a prompt problem but wasn't. The agent behaved correctly in chat, the UI looked correct, and yet the results were consistently from the wrong area. I tried the usual prompt tweaks: stronger instructions, "be careful," "use the visible map," retries. None of it moved the needle.

Here's how map state flows through the agent loop and where it can drift:

[Image: Maps agent architecture]

Building AI-Powered Image Generation with OpenAI-Compatible Responses API

· 10 min read
Karolis Gudiškis

Introduction

The Responses API changes how applications interact with large language models. Where traditional chat completion APIs return plain text responses, the Responses API supports structured, multi-step workflows that orchestrate multiple tools and produce multi-modal outputs.

In this article, we'll explore how to build an AI-powered application that combines web search and image generation capabilities.

Source Code: The complete example is available on GitHub.

Documentation: For comprehensive Responses API documentation, see the Responses API guide and Image Generation guide.

Pause, Inspect, Edit: Debug Mode for LLM Requests in vLLora

· 4 min read
Mrunmay
AI Engineer

LLMs behave like black boxes. You send them a request, hope the prompt is right, hope your agent didn't mutate it, hope the framework packaged it correctly — and then hope the response makes sense. In simple one-shot queries this usually works fine. But when you're building agents, tools, multi-step workflows, or RAG pipelines, it becomes very hard to see what the model is actually receiving. A single unexpected message, parameter, or system prompt change can shift the entire run.

Today we're introducing Debug Mode for LLM requests in vLLora, which makes the request visible and editable before it reaches the model.

Here’s what debugging looks like in practice:

[Image: Debugging an LLM request using Debug Mode]

Debugging LiveKit Voice Agents with vLLora

· 2 min read
Matteo Pelati

Voice agents built with LiveKit Agents enable real-time, multimodal AI interactions that can handle voice, video, and text. These agents power everything from customer support bots to telehealth assistants, and debugging them requires visibility into the complex pipeline of speech-to-text, language model, and text-to-speech interactions.

In this video, we go over how you can debug voice agents built using LiveKit Agents with vLLora. You'll see how to trace every model call, tool execution, and response as your agent processes real-time audio streams.

Using vLLora with OpenAI Agents SDK

· 2 min read
Mrunmay
AI Engineer

The OpenAI Agents SDK makes it easy to build agents with handoffs, streaming, and function calling. The hard part? Seeing what's actually happening when things don't work as expected.

[Image: OpenAI Agents Tracing]

Using vLLora with Google ADK

· 2 min read
Mrunmay
AI Engineer

Google ADK (Agent Development Kit) lets you build multi-agent systems across different LLM providers—Gemini, OpenAI, Anthropic, and more. But when your planner agent produces a FunctionCall for an AgentTool that doesn't run correctly, or a nested sub-agent fails silently, debugging what happened across agents and sessions becomes nearly impossible.

[Image: Traces of Google ADK on vLLora]

Using vLLora to Debug Agents

· 3 min read
Mrunmay
AI Engineer

Building AI agents is hard. Debugging them locally across multiple SDKs, tools, and providers feels like flying blind. Logs give you only partial visibility. You need to see every call, along with its latency, cost, and output, in context and without rewriting code.

[Image: Debugging demo]