Completion in Rig: LLM Interaction Layer

Rig’s completion system provides a layered approach to interacting with language models (LLMs), offering both high-level convenience and low-level control. The system is built around a set of traits that define different levels of abstraction for LLM interactions.

Core Traits

1. High-Level Interfaces

Prompt Trait

  • Simplest interface for one-shot interactions
  • Fire-and-forget prompting
  • Returns string responses
async fn prompt(&self, prompt: &str) -> Result<String, PromptError>;

Chat Trait

  • Conversation-aware interactions
  • Maintains chat history
  • Supports contextual responses
async fn chat(&self, prompt: &str, history: Vec<Message>) -> Result<String, PromptError>;

2. Low-Level Control

Completion Trait

  • Fine-grained request configuration
  • Access to raw completion responses
  • Tool call handling
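
For illustration, here is a minimal sketch of this flow; it assumes an agent value that implements Completion (such as rig::agent::Agent), an existing chat_history, and the builder's send() helper:

// The builder returned by the agent is already pre-populated with the
// agent's preamble; methods called here override those defaults.
let builder = agent
    .completion("What is the capital of France?", chat_history)
    .await?;

let response = builder
    .temperature(0.2)
    .send()
    .await?;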

Reference to implementation:

rig-core/src/completion.rs [165:246]
...
        chat_history: Vec<Message>,
    ) -> impl std::future::Future<Output = Result<String, PromptError>> + Send;
}
 
/// Trait defining a low-level LLM completion interface
pub trait Completion<M: CompletionModel> {
    /// Generates a completion request builder for the given `prompt` and `chat_history`.
    /// This function is meant to be called by the user to further customize the
    /// request at prompt time before sending it.
    ///
    /// ❗IMPORTANT: The type that implements this trait might have already
    /// populated fields in the builder (the exact fields depend on the type).
    /// For fields that have already been set by the model, calling the corresponding
    /// method on the builder will overwrite the value set by the model.
    ///
    /// For example, the request builder returned by [`Agent::completion`](crate::agent::Agent::completion) will already
    /// contain the `preamble` provided when creating the agent.
    fn completion(
        &self,
        prompt: &str,
        chat_history: Vec<Message>,
    ) -> impl std::future::Future<Output = Result<CompletionRequestBuilder<M>, CompletionError>> + Send;
}
 
/// General completion response struct that contains the high-level completion choice
/// and the raw response.
#[derive(Debug)]
pub struct CompletionResponse<T> {
    /// The completion choice returned by the completion model provider
    pub choice: ModelChoice,
    /// The raw response returned by the completion model provider
    pub raw_response: T,
}
 
/// Enum representing the high-level completion choice returned by the completion model provider.
#[derive(Debug)]
pub enum ModelChoice {
    /// Represents a completion response as a message
    Message(String),
    /// Represents a completion response as a tool call of the form
    /// `ToolCall(function_name, function_params)`.
    ToolCall(String, serde_json::Value),
}
 
/// Trait defining a completion model that can be used to generate completion responses.
/// This trait is meant to be implemented by the user to define a custom completion model,
/// either from a third party provider (e.g.: OpenAI) or a local model.
pub trait CompletionModel: Clone + Send + Sync {
    /// The raw response type returned by the underlying completion model.
    type Response: Send + Sync;
 
    /// Generates a completion response for the given completion request.
    fn completion(
        &self,
        request: CompletionRequest,
    ) -> impl std::future::Future<Output = Result<CompletionResponse<Self::Response>, CompletionError>>
           + Send;
 
    /// Generates a completion request builder for the given `prompt`.
    fn completion_request(&self, prompt: &str) -> CompletionRequestBuilder<Self> {
        CompletionRequestBuilder::new(self.clone(), prompt.to_string())
    }
}

CompletionModel Trait

  • Provider interface implementation
  • Raw request handling
  • Response parsing and error management

Request Building

CompletionRequestBuilder

A fluent API for constructing requests:

let request = model.completion_request("prompt")
    .preamble("system instructions")
    .temperature(0.7)
    .max_tokens(1000)
    .documents(context_docs)
    .tools(available_tools)
    .build();
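
Note that build() produces a CompletionRequest without sending it. A sketch of dispatching it through the model (or, equivalently, sending straight from the builder):

// Dispatch the built request through the model.
let response = model.completion(request).await?;

// Alternatively, skip build() and send directly from the builder:
// let response = model.completion_request("prompt").send().await?;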

Request Components

  1. Core Elements

    • Prompt text
    • System preamble
    • Chat history
    • Temperature
    • Max tokens
  2. Context Management

    • Document attachments (see the sketch after this list)
    • Metadata handling
    • Formatting controls
  3. Tool Integration

    • Tool definitions
    • Parameter validation
    • Response parsing
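
To make these components concrete, here is a hedged sketch of attaching a document and a tool definition; it assumes the Document and ToolDefinition types from rig::completion with the field layouts shown (check the source for the exact definitions):

use std::collections::HashMap;
use rig::completion::{Document, ToolDefinition};

// Context document injected into the prompt (assumed field layout).
let doc = Document {
    id: "doc-1".to_string(),
    text: "Paris is the capital of France.".to_string(),
    additional_props: HashMap::new(),
};

// Tool exposed to the model, described by a JSON schema (assumed field layout).
let tool = ToolDefinition {
    name: "lookup".to_string(),
    description: "Look up a fact by keyword".to_string(),
    parameters: serde_json::json!({
        "type": "object",
        "properties": { "keyword": { "type": "string" } }
    }),
};

let request = model
    .completion_request("What is the capital of France?")
    .documents(vec![doc])
    .tools(vec![tool])
    .build();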

Response Handling

CompletionResponse

A structured response type that pairs the parsed model choice with the provider’s raw response:

enum ModelChoice {
    Message(String),
    ToolCall(String, Value)
}
 
struct CompletionResponse<T> {
    choice: ModelChoice,
    raw_response: T,
}
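
Since a completion can come back either as plain text or as a tool call, callers typically match on choice. A minimal sketch:

match response.choice {
    // Plain text completion from the model.
    ModelChoice::Message(text) => println!("{text}"),
    // The model requested a tool invocation with JSON arguments.
    ModelChoice::ToolCall(name, params) => {
        println!("tool requested: {name} with args {params}");
    }
}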

Error Handling

Comprehensive error types:

enum CompletionError {
    HttpError(reqwest::Error),
    JsonError(serde_json::Error),
    RequestError(Box<dyn Error>),
    ResponseError(String),
    ProviderError(String),
}
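
A sketch of branching on these variants at a call site (this assumes the Display impl the enum derives in rig-core, so unmatched errors can simply be printed):

match model.completion(request).await {
    Ok(response) => println!("{:?}", response.choice),
    // Transport-level failure: often worth a retry with backoff.
    Err(CompletionError::HttpError(e)) => eprintln!("transport error: {e}"),
    // The provider returned an explicit error payload.
    Err(CompletionError::ProviderError(msg)) => eprintln!("provider error: {msg}"),
    // Anything else: surface the message for debugging.
    Err(e) => eprintln!("completion failed: {e}"),
}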

Usage Patterns

Basic Completion

let openai = Client::new(api_key);
let model = openai.completion_model("gpt-4");
 
let response = model
    .prompt("Explain quantum computing")
    .await?;

Contextual Chat

let chat_response = model
    .chat(
        "Continue the discussion",
        vec![Message::user("Previous context")]
    )
    .await?;

Advanced Request Configuration

let request = model
    .completion_request("Complex query")
    .preamble("Expert system")
    .temperature(0.8)
    .documents(context)
    .tools(available_tools)
    .send()
    .await?;

Provider Integration

Implementing New Providers

impl CompletionModel for CustomProvider {
    type Response = CustomResponse;
    
    async fn completion(
        &self, 
        request: CompletionRequest
    ) -> Result<CompletionResponse<Self::Response>, CompletionError> {
        // Provider-specific implementation
    }
}

Best Practices

  1. Interface Selection

    • Use Prompt for simple interactions
    • Use Chat for conversational flows
    • Use Completion for fine-grained control
  2. Error Handling

    • Handle provider-specific errors
    • Implement graceful fallbacks (see the sketch after this list)
    • Log raw responses for debugging
  3. Resource Management

    • Reuse model instances
    • Batch similar requests
    • Monitor token usage
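
As a concrete illustration of the fallback advice, a hedged sketch that retries a prompt on a secondary model when the primary fails (primary and secondary are hypothetical values implementing Prompt):

// Hypothetical models: any two values implementing Prompt work here.
let answer = match primary.prompt(query).await {
    Ok(text) => text,
    Err(e) => {
        // Log the raw failure before falling back.
        eprintln!("primary model failed: {e}; falling back");
        secondary.prompt(query).await?
    }
};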

See Also

API Reference (Completion)