Completion in Rig: LLM Interaction Layer

Rig’s completion system provides a layered approach to interacting with language models (LLMs), offering both high-level convenience and low-level control. The system is built around a set of traits that define different levels of abstraction for LLM interactions.

Core Traits

1. High-Level Interfaces

Prompt Trait

  • Simplest interface for one-shot interactions
  • Fire-and-forget prompting
  • Returns string responses
async fn prompt(&self, prompt: &str) -> Result<String, PromptError>;

Chat Trait

  • Conversation-aware interactions
  • Maintains chat history
  • Supports contextual responses
async fn chat(&self, prompt: &str, history: Vec<Message>) -> Result<String, PromptError>;

2. Low-Level Control

Completion Trait

  • Fine-grained request configuration
  • Access to raw completion responses
  • Tool call handling
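
For illustration, here is a minimal sketch of this flow; it assumes an agent value that implements Completion (such as rig::agent::Agent), an existing chat_history, and the builder's send() helper:

// The builder returned by the agent is already pre-populated with the
// agent's preamble; methods called here override those defaults.
let builder = agent
    .completion("What is the capital of France?", chat_history)
    .await?;

let response = builder
    .temperature(0.2)
    .send()
    .await?;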

Reference to implementation:

rig-core/src/completion.rs [165:246]
...
        chat_history: Vec<Message>,
    ) -> impl std::future::Future<Output = Result<String, PromptError>> + Send;
}
 
/// Trait defining a low-level LLM completion interface
pub trait Completion<M: CompletionModel> {
    /// Generates a completion request builder for the given `prompt` and `chat_history`.
    /// This function is meant to be called by the user to further customize the
    /// request at prompt time before sending it.
    ///
    /// ❗IMPORTANT: The type that implements this trait might have already
    /// populated fields in the builder (the exact fields depend on the type).
    /// For fields that have already been set by the model, calling the corresponding
    /// method on the builder will overwrite the value set by the model.
    ///
    /// For example, the request builder returned by [`Agent::completion`](crate::agent::Agent::completion) will already
    /// contain the `preamble` provided when creating the agent.
    fn completion(
        &self,
        prompt: &str,
        chat_history: Vec<Message>,
    ) -> impl std::future::Future<Output = Result<CompletionRequestBuilder<M>, CompletionError>> + Send;
}
 
/// General completion response struct that contains the high-level completion choice
/// and the raw response.
#[derive(Debug)]
pub struct CompletionResponse<T> {
    /// The completion choice returned by the completion model provider
    pub choice: ModelChoice,
    /// The raw response returned by the completion model provider
    pub raw_response: T,
}
 
/// Enum representing the high-level completion choice returned by the completion model provider.
#[derive(Debug)]
pub enum ModelChoice {
    /// Represents a completion response as a message
    Message(String),
    /// Represents a completion response as a tool call of the form
    /// `ToolCall(function_name, function_params)`.
    ToolCall(String, serde_json::Value),
}
 
/// Trait defining a completion model that can be used to generate completion responses.
/// This trait is meant to be implemented by the user to define a custom completion model,
/// either from a third party provider (e.g.: OpenAI) or a local model.
pub trait CompletionModel: Clone + Send + Sync {
    /// The raw response type returned by the underlying completion model.
    type Response: Send + Sync;
 
    /// Generates a completion response for the given completion request.
    fn completion(
        &self,
        request: CompletionRequest,
    ) -> impl std::future::Future<Output = Result<CompletionResponse<Self::Response>, CompletionError>>
           + Send;
 
    /// Generates a completion request builder for the given `prompt`.
    fn completion_request(&self, prompt: &str) -> CompletionRequestBuilder<Self> {
        CompletionRequestBuilder::new(self.clone(), prompt.to_string())
    }
}

CompletionModel Trait

  • Provider interface implementation
  • Raw request handling
  • Response parsing and error management

Request Building

CompletionRequestBuilder

A fluent API for constructing requests:

let request = model.completion_request("prompt")
    .preamble("system instructions")
    .temperature(0.7)
    .max_tokens(1000)
    .documents(context_docs)
    .tools(available_tools)
    .build();
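
Note that build() produces a CompletionRequest without sending it. A sketch of dispatching it through the model (or, equivalently, sending straight from the builder):

// Dispatch the built request through the model.
let response = model.completion(request).await?;

// Alternatively, skip build() and send directly from the builder:
// let response = model.completion_request("prompt").send().await?;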

Request Components

  1. Core Elements

    • Prompt text
    • System preamble
    • Chat history
    • Temperature
    • Max tokens
  2. Context Management

    • Document attachments (see the sketch after this list)
    • Metadata handling
    • Formatting controls
  3. Tool Integration

    • Tool definitions
    • Parameter validation
    • Response parsing
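
To make these components concrete, here is a hedged sketch of attaching a document and a tool definition; it assumes the Document and ToolDefinition types from rig::completion with the field layouts shown (check the source for the exact definitions):

use std::collections::HashMap;
use rig::completion::{Document, ToolDefinition};

// Context document injected into the prompt (assumed field layout).
let doc = Document {
    id: "doc-1".to_string(),
    text: "Paris is the capital of France.".to_string(),
    additional_props: HashMap::new(),
};

// Tool exposed to the model, described by a JSON schema (assumed field layout).
let tool = ToolDefinition {
    name: "lookup".to_string(),
    description: "Look up a fact by keyword".to_string(),
    parameters: serde_json::json!({
        "type": "object",
        "properties": { "keyword": { "type": "string" } }
    }),
};

let request = model
    .completion_request("What is the capital of France?")
    .documents(vec![doc])
    .tools(vec![tool])
    .build();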

Response Handling

CompletionResponse

A structured response type that pairs the parsed model choice with the provider’s raw response:

enum ModelChoice {
    Message(String),
    ToolCall(String, Value)
}
 
struct CompletionResponse<T> {
    choice: ModelChoice,
    raw_response: T,
}
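
Since a completion can come back either as plain text or as a tool call, callers typically match on choice. A minimal sketch:

match response.choice {
    // Plain text completion from the model.
    ModelChoice::Message(text) => println!("{text}"),
    // The model requested a tool invocation with JSON arguments.
    ModelChoice::ToolCall(name, params) => {
        println!("tool requested: {name} with args {params}");
    }
}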

Error Handling

Comprehensive error types:

enum CompletionError {
    HttpError(reqwest::Error),
    JsonError(serde_json::Error),
    RequestError(Box<dyn Error>),
    ResponseError(String),
    ProviderError(String),
}
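
A sketch of branching on these variants at a call site (this assumes the Display impl the enum derives in rig-core, so unmatched errors can simply be printed):

match model.completion(request).await {
    Ok(response) => println!("{:?}", response.choice),
    // Transport-level failure: often worth a retry with backoff.
    Err(CompletionError::HttpError(e)) => eprintln!("transport error: {e}"),
    // The provider returned an explicit error payload.
    Err(CompletionError::ProviderError(msg)) => eprintln!("provider error: {msg}"),
    // Anything else: surface the message for debugging.
    Err(e) => eprintln!("completion failed: {e}"),
}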

Usage Patterns

Basic Completion

let openai = Client::new(api_key);
let model = openai.completion_model("gpt-4");
 
let response = model
    .prompt("Explain quantum computing")
    .await?;

Contextual Chat

let chat_response = model
    .chat(
        "Continue the discussion",
        vec![Message::user("Previous context")]
    )
    .await?;

Advanced Request Configuration

let request = model
    .completion_request("Complex query")
    .preamble("Expert system")
    .temperature(0.8)
    .documents(context)
    .tools(available_tools)
    .send()
    .await?;

Provider Integration

Implementing New Providers

impl CompletionModel for CustomProvider {
    type Response = CustomResponse;
    
    async fn completion(
        &self, 
        request: CompletionRequest
    ) -> Result<CompletionResponse<Self::Response>, CompletionError> {
        // Provider-specific implementation
    }
}

Best Practices

  1. Interface Selection

    • Use Prompt for simple interactions
    • Use Chat for conversational flows
    • Use Completion for fine-grained control
  2. Error Handling

    • Handle provider-specific errors
    • Implement graceful fallbacks (see the sketch after this list)
    • Log raw responses for debugging
  3. Resource Management

    • Reuse model instances
    • Batch similar requests
    • Monitor token usage
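
As a concrete illustration of the fallback advice, a hedged sketch that retries a prompt on a secondary model when the primary fails (primary and secondary are hypothetical values implementing Prompt):

// Hypothetical models: any two values implementing Prompt work here.
let answer = match primary.prompt(query).await {
    Ok(text) => text,
    Err(e) => {
        // Log the raw failure before falling back.
        eprintln!("primary model failed: {e}; falling back");
        secondary.prompt(query).await?
    }
};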

See Also

API Reference (Completion)