Responses API

The vllora Responses API provides a unified interface for building advanced AI agents capable of executing complex tasks autonomously. This API is compatible with OpenAI's Responses API format and supports multimodal inputs, reasoning capabilities, and seamless tool integration.

Overview

The Responses API is a more powerful alternative to the traditional Chat Completions API. It enables:

  • Structured, multi-step workflows with support for multiple built-in tools
  • Rich, multimodal outputs that can be easily processed programmatically
  • Tool orchestration including web search, image generation, and more
  • Streaming support for real-time response processing

Basic Usage

Non-Streaming Example

Here's a simple example that sends a text prompt and receives a structured response:

use vllora_llm::async_openai::types::responses::{
    CreateResponse, InputParam, OutputItem, OutputMessageContent,
};
use vllora_llm::client::VlloraLLMClient;
use vllora_llm::error::LLMResult;

#[tokio::main]
async fn main() -> LLMResult<()> {
    // 1) Build a Responses-style request using async-openai-compat types
    let responses_req = CreateResponse {
        model: Some("gpt-4o".to_string()),
        input: InputParam::Text("Stream numbers 1 to 20 in separate lines.".to_string()),
        max_output_tokens: Some(100),
        ..Default::default()
    };

    // 2) Construct a VlloraLLMClient
    let client = VlloraLLMClient::default();

    // 3) Non-streaming: send the request and print the final reply
    let response = client.responses().create(responses_req).await?;

    println!("Non-streaming reply:");
    for output in &response.output {
        if let OutputItem::Message(message) = output {
            for message_content in &message.content {
                if let OutputMessageContent::OutputText(text) = message_content {
                    println!("{}", text.text);
                }
            }
        }
    }

    Ok(())
}

Streaming Example

The Responses API also supports streaming for real-time processing:

use tokio_stream::StreamExt;
use vllora_llm::async_openai::types::responses::{CreateResponse, InputParam};
use vllora_llm::client::VlloraLLMClient;
use vllora_llm::error::LLMResult;

#[tokio::main]
async fn main() -> LLMResult<()> {
    let responses_req = CreateResponse {
        model: Some("gpt-4o".to_string()),
        input: InputParam::Text("Stream numbers 1 to 20 in separate lines.".to_string()),
        max_output_tokens: Some(100),
        ..Default::default()
    };

    let client = VlloraLLMClient::default();

    // Streaming: send the request and print chunks as they arrive
    // Note: streaming for responses is not yet fully implemented in all providers
    println!("\nStreaming response...");
    let mut stream = client.responses().create_stream(responses_req).await?;

    while let Some(chunk) = stream.next().await {
        let chunk = chunk?;
        // The ResponseEvent structure may vary - print the chunk for debugging
        println!("{:?}", chunk);
    }

    Ok(())
}
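In practice you usually want to assemble the streamed chunks into a final string rather than just print them. The sketch below shows one way to fold delta events into accumulated text. It uses a locally defined `ResponseEvent` enum as a hypothetical stand-in; the crate's actual event variants may differ, so treat this as an illustration of the accumulation pattern, not the real API.

```rust
// Hypothetical stand-in for streamed response events; the crate's
// actual ResponseEvent variants may be named differently.
enum ResponseEvent {
    OutputTextDelta(String),
    Completed,
}

/// Fold a sequence of delta events into the final text.
fn accumulate(events: impl IntoIterator<Item = ResponseEvent>) -> String {
    let mut text = String::new();
    for event in events {
        match event {
            // Append each text delta as it arrives
            ResponseEvent::OutputTextDelta(delta) => text.push_str(&delta),
            // Stop once the response signals completion
            ResponseEvent::Completed => break,
        }
    }
    text
}

fn main() {
    let events = vec![
        ResponseEvent::OutputTextDelta("1\n".into()),
        ResponseEvent::OutputTextDelta("2\n".into()),
        ResponseEvent::Completed,
    ];
    assert_eq!(accumulate(events), "1\n2\n");
    println!("accumulation ok");
}
```

The same loop structure applies inside the `while let Some(chunk) = stream.next().await` body above: match on the chunk's variant and append text deltas to a buffer.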

Understanding the Response Structure

The Response struct contains an output field, which is a vector of OutputItem variants. Each item represents a different type of output from the API:

  • OutputItem::Message - Text messages from the model
  • OutputItem::ImageGenerationCall - Image generation results
  • OutputItem::WebSearchCall - Web search results
  • Other tool outputs

Each output type can be pattern-matched to extract the relevant data.
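The filtering shown in the basic example generalizes to any mix of variants. As an illustrative sketch, the function below collects only the text messages from a heterogeneous output list. It uses a locally defined enum that mirrors the documented `OutputItem` shape rather than the crate's actual types, so field names here are assumptions for demonstration only.

```rust
// Illustrative stand-in for the documented OutputItem shape; the real
// types live in vllora_llm::async_openai::types::responses and carry
// richer payloads.
enum OutputItem {
    Message { text: String },
    ImageGenerationCall { image_url: String },
    WebSearchCall { query: String },
}

/// Collect only the text content from a mixed list of outputs,
/// skipping tool-call results.
fn collect_text(outputs: &[OutputItem]) -> Vec<String> {
    outputs
        .iter()
        .filter_map(|item| match item {
            OutputItem::Message { text } => Some(text.clone()),
            _ => None,
        })
        .collect()
}

fn main() {
    let outputs = vec![
        OutputItem::Message { text: "Hello".to_string() },
        OutputItem::WebSearchCall { query: "weather".to_string() },
        OutputItem::Message { text: "World".to_string() },
    ];
    assert_eq!(collect_text(&outputs), vec!["Hello", "World"]);
    println!("{:?}", collect_text(&outputs));
}
```

Against the real crate, the match arms would destructure the actual variant payloads (e.g. iterating a message's `content` as in the non-streaming example) instead of a bare `text` field.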

Working with Tools

The Responses API supports multiple built-in tools that enable powerful workflows:

  • Web Search - Search the web for current information
  • Image Generation - Generate images from text prompts
  • Custom Tools - Define your own tools for specific tasks

For a comprehensive guide on using tools, especially image generation, see the Image Generation Guide.

Next Steps