Provider Examples

There are runnable examples under llm/examples/ that mirror the patterns in the quick start and usage guides:

  • openai: Direct OpenAI chat completions using VlloraLLMClient (non-streaming + streaming); a minimal call sketch follows this list.
  • anthropic: Anthropic (Claude) chat completions via the unified client.
  • gemini: Gemini chat completions via the unified client.
  • bedrock: AWS Bedrock chat completions (e.g. Amazon Nova models) via the unified client.
  • proxy_langdb: Using InferenceModelProvider::Proxy("langdb") to call a LangDB OpenAI-compatible endpoint.
  • tracing: Same OpenAI-style flow as openai, but with tracing_subscriber::fmt() configured to emit spans and events to the console (stdout).
  • tracing_otlp: Shows how to wire vllora_telemetry::events::layer to an OTLP HTTP exporter (e.g. New Relic or any other OTLP-compatible collector) and emit spans from VlloraLLMClient calls to a remote telemetry backend.
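
All of these examples share roughly the call shape sketched below. This is a minimal, illustrative sketch rather than the library's exact API: the request is built with async-openai request types (the examples are async-openai-compatible), but the vllora_llm crate path, the VlloraLLMClient::new() constructor, and the chat().create() method names are assumptions made here for illustration. llm/examples/openai shows the real non-streaming and streaming calls, and llm/examples/proxy_langdb shows how InferenceModelProvider::Proxy("langdb") is wired in.

    use async_openai::types::{
        ChatCompletionRequestUserMessageArgs, CreateChatCompletionRequestArgs,
    };
    use vllora_llm::VlloraLLMClient; // assumed crate path, for illustration only

    #[tokio::main]
    async fn main() -> Result<(), Box<dyn std::error::Error>> {
        // Build an async-openai-compatible chat completion request.
        let request = CreateChatCompletionRequestArgs::default()
            .model("gpt-4o-mini") // any model id supported by the chosen provider
            .messages([ChatCompletionRequestUserMessageArgs::default()
                .content("Say hello in one sentence.")
                .build()?
                .into()])
            .build()?;

        // Hypothetical constructor and method names; the real ones may differ.
        // See llm/examples/openai for the actual, runnable version.
        let client = VlloraLLMClient::new();
        let response = client.chat().create(request).await?;
        println!("{:#?}", response);
        Ok(())
    }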

See detailed snippets for specific providers:

  • OpenAI Example: async-openai-compatible non-streaming + streaming example.
  • Anthropic Example: async-openai-compatible request routed to Anthropic with streaming.
  • Bedrock Example: async-openai-compatible request routed to AWS Bedrock with streaming.
  • Gemini Example: async-openai-compatible request routed to Gemini with streaming.
  • LangDB Proxy Example: async-openai-compatible request routed to a LangDB OpenAI-compatible endpoint with streaming.
  • Tracing (console) Example: OpenAI-style request with tracing_subscriber::fmt() logging spans/events to stdout (see the subscriber sketch after this list).
  • Tracing (OTLP) Example: OpenAI-style request emitting spans via OTLP HTTP exporter.
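
For the tracing (console) example, the key step is installing a console subscriber before any client calls are made. The subscriber setup below uses the real tracing_subscriber API; the client call that would follow is assumed to look like the sketch earlier on this page. The OTLP variant replaces the console output with vllora_telemetry::events::layer wired to an OTLP HTTP exporter, as shown in llm/examples/tracing_otlp.

    fn main() {
        // Emit spans and events from subsequent VlloraLLMClient calls to stdout.
        tracing_subscriber::fmt()
            .with_max_level(tracing::Level::DEBUG)
            .init();

        // ... build the request and call the client as in the earlier sketch ...
    }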

Each example is a standalone Cargo binary. After setting the provider-specific environment variables noted in the example's main.rs, cd into its directory and run:

cargo run
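
Each example expects its credentials to be present in the environment. A minimal, assumed pattern for checking one inside main.rs (the exact variable name is provider-specific and listed in each example) looks like:

    // The variable name differs per provider (e.g. OPENAI_API_KEY for the
    // openai example); check the example's main.rs for the one it expects.
    let api_key = std::env::var("OPENAI_API_KEY")
        .expect("set OPENAI_API_KEY before running this example");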

Source code for these examples lives in the main repo under llm/examples/: https://github.com/vllora/vllora/tree/main/llm/examples