Completion in Rig: LLM Interaction Layer
Rig’s completion system provides a layered approach to interacting with language models (LLMs), offering both high-level convenience and low-level control. The system is built around a set of traits that define different levels of abstraction for LLM interactions.
Core Traits
1. High-Level Interfaces
Prompt Trait
- Simplest interface for one-shot interactions
- Fire-and-forget prompting
- Returns string responses
async fn prompt(&self, prompt: &str) -> Result<String, PromptError>;
Chat Trait
- Conversation-aware interactions
- Maintains chat history
- Supports contextual responses
async fn chat(&self, prompt: &str, history: Vec<Message>) -> Result<String, PromptError>;
TypedPrompt Trait
- Structured output interface for typed completions
- Returns deserialized structured data instead of raw strings
- The target type must implement serde::Deserialize and schemars::JsonSchema
pub trait TypedPrompt: WasmCompatSend + WasmCompatSync {
type TypedRequest<'a, T>: IntoFuture<Output = Result<T, StructuredOutputError>>
where Self: 'a,
T: JsonSchema + DeserializeOwned + WasmCompatSend + 'a;
// Required method
fn prompt_typed<T>(
&self,
prompt: impl Into<Message> + WasmCompatSend,
) -> Self::TypedRequest<'_, T>
where T: JsonSchema + DeserializeOwned + WasmCompatSend;
}
This is useful when you need the LLM to return structured data (e.g., JSON conforming to a specific schema) rather than free-form text. See the Structured Output section below for more details.
2. Streaming Interfaces
Rig provides streaming counterparts for all high-level traits. See Streaming for full details.
- StreamingPrompt: Streaming one-shot prompts
- StreamingChat: Streaming chat with history
- StreamingCompletion: Low-level streaming completion interface
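Conceptually, all of these yield the response incrementally instead of in one final value. Rig's actual streaming traits are async and yield structured chunks; the accumulate-as-you-go consumption pattern can be illustrated with a minimal synchronous stand-in (not the real rig API):

```rust
// Minimal synchronous sketch of consuming a token stream.
// Rig's real streaming interfaces are async; this only illustrates
// the pattern of rendering each chunk while accumulating the full text.
fn consume_stream<I: Iterator<Item = String>>(chunks: I) -> String {
    let mut full_response = String::new();
    for chunk in chunks {
        // In a real UI you would render each chunk as it arrives
        print!("{chunk}");
        full_response.push_str(&chunk);
    }
    full_response
}

fn main() {
    let chunks = vec!["Hello".to_string(), ", ".to_string(), "world!".to_string()];
    let text = consume_stream(chunks.into_iter());
    assert_eq!(text, "Hello, world!");
}
```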
3. Low-Level Control
Completion Trait
- Fine-grained request configuration
- Access to raw completion responses
- Tool call handling
pub trait Completion<M: CompletionModel> {
/// Generates a completion request builder for the given `prompt` and `chat_history`.
/// Fields pre-populated by the implementing type (e.g., Agent preamble) can be
/// overwritten by calling the corresponding method on the builder.
fn completion(
&self,
prompt: &str,
chat_history: Vec<Message>,
) -> impl Future<Output = Result<CompletionRequestBuilder<M>, CompletionError>> + Send;
}
CompletionModel Trait
The provider interface that must be implemented for each LLM backend. In v0.31.0, this trait lives at rig::completion::request::CompletionModel (re-exported via rig::completion).
pub trait CompletionModel:
Clone
+ WasmCompatSend
+ WasmCompatSync {
type Response: WasmCompatSend + WasmCompatSync + Serialize + DeserializeOwned;
type StreamingResponse: Clone + Unpin + WasmCompatSend + WasmCompatSync + Serialize + DeserializeOwned + GetTokenUsage;
type Client;
// Required methods
fn make(client: &Self::Client, model: impl Into<String>) -> Self;
fn completion(
&self,
request: CompletionRequest,
) -> impl Future<Output = Result<CompletionResponse<Self::Response>, CompletionError>> + WasmCompatSend;
fn stream(
&self,
request: CompletionRequest,
) -> impl Future<Output = Result<StreamingCompletionResponse<Self::StreamingResponse>, CompletionError>> + WasmCompatSend;
// Provided method
fn completion_request(
&self,
prompt: impl Into<Message>,
) -> CompletionRequestBuilder<Self> { ... }
}
Request Building
CompletionRequestBuilder
Fluent API for constructing requests with:
let request = model.completion_request("prompt")
.preamble("system instructions")
.temperature(0.7)
.max_tokens(1000)
.documents(context_docs)
.tools(available_tools)
.build();
Response Handling
CompletionResponse
The CompletionResponse struct wraps the model’s response along with the raw provider-specific data:
pub struct CompletionResponse<T> {
/// One or more assistant content items (text, tool calls, reasoning, etc.)
pub choice: OneOrMany<AssistantContent>,
/// The raw response from the provider
pub raw_response: T,
}
AssistantContent
In v0.31.0, the old ModelChoice enum has been replaced by a richer AssistantContent enum (in rig::completion::message) that supports multimodal responses:
pub enum AssistantContent {
/// Plain text response
Text(Text),
/// A tool call requested by the model
ToolCall(ToolCall),
/// Reasoning/chain-of-thought content (for models that support it)
Reasoning(Reasoning),
}
The Text struct wraps a string, while ToolCall contains the tool call ID, function name, and arguments:
pub struct ToolCall {
pub id: String,
pub function: ToolFunction,
}
pub struct ToolFunction {
pub name: String,
pub arguments: serde_json::Value,
}
Message Types
The Message enum represents conversation messages with rich content support:
pub enum Message {
User { content: OneOrMany<UserContent> },
Assistant { content: OneOrMany<AssistantContent> },
}
UserContent supports text, images, audio, documents, video, and tool results:
pub enum UserContent {
Text(Text),
ToolResult(ToolResult),
Image(Image),
Audio(Audio),
Document(Document),
Video(Video),
}
Token Usage
v0.31.0 adds a Usage struct and the GetTokenUsage trait for tracking token consumption:
pub struct Usage {
pub prompt_tokens: u64,
pub completion_tokens: u64,
pub total_tokens: u64,
}
Implement the GetTokenUsage trait on your provider’s raw response type to expose token metrics.
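Per-request usage values can then be accumulated across a session. A small std-only sketch, using a struct that mirrors the fields shown above (not the rig type itself):

```rust
// Sketch: accumulating token usage across requests.
// `Usage` here mirrors the fields of rig's Usage struct, redefined
// locally so the example is self-contained.
#[derive(Default, Clone, Copy)]
struct Usage {
    prompt_tokens: u64,
    completion_tokens: u64,
    total_tokens: u64,
}

// Add one request's usage into a running session total.
fn accumulate(session: &mut Usage, request: Usage) {
    session.prompt_tokens += request.prompt_tokens;
    session.completion_tokens += request.completion_tokens;
    session.total_tokens += request.total_tokens;
}

fn main() {
    let mut session = Usage::default();
    accumulate(&mut session, Usage { prompt_tokens: 12, completion_tokens: 30, total_tokens: 42 });
    accumulate(&mut session, Usage { prompt_tokens: 8, completion_tokens: 20, total_tokens: 28 });
    assert_eq!(session.prompt_tokens, 20);
    assert_eq!(session.total_tokens, 70);
}
```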
Error Handling
Comprehensive error types:
pub enum CompletionError {
HttpError(reqwest::Error),
JsonError(serde_json::Error),
RequestError(Box<dyn Error>),
ResponseError(String),
ProviderError(String),
}
For structured output, there is an additional error type:
pub enum StructuredOutputError {
CompletionError(CompletionError),
JsonError(serde_json::Error),
// ...
}
Usage Patterns
Basic Completion
let openai = openai::Client::from_env();
let model = openai.completion_model("gpt-4o");
let response = model
.prompt("Explain quantum computing")
.await?;
Contextual Chat
use rig::completion::Message;
let chat_response = agent
.chat(
"Continue the discussion",
vec![Message::user("Previous context")]
)
.await?;
Advanced Request Configuration
let response = model
.completion_request("Complex query")
.preamble("Expert system")
.temperature(0.8)
.documents(context)
.tools(available_tools)
.send()
.await?;
Structured Output
Using the TypedPrompt trait (implemented by Agent), you can get structured responses:
use schemars::JsonSchema;
use serde::Deserialize;
#[derive(Deserialize, JsonSchema)]
struct SentimentAnalysis {
/// The sentiment score from -1.0 to 1.0
score: f64,
/// The sentiment label
label: String,
}
let result: SentimentAnalysis = agent
.prompt_typed("Analyze the sentiment of: 'I love this product!'")
.await?;
Provider Integration
Implementing New Providers
impl CompletionModel for CustomProvider {
type Response = CustomResponse;
type StreamingResponse = CustomStreamingResponse;
type Client = CustomClient;
// `make` and `stream` must also be implemented; see the trait definition above.
async fn completion(
&self,
request: CompletionRequest
) -> Result<CompletionResponse<Self::Response>, CompletionError> {
// Provider-specific implementation
}
}
Best Practices
- Interface Selection
  - Use Prompt for simple interactions
  - Use Chat for conversational flows
  - Use TypedPrompt for structured data extraction
  - Use Completion for fine-grained control
  - Use StreamingPrompt/StreamingChat when you need incremental output
- Error Handling
  - Handle provider-specific errors
  - Implement graceful fallbacks
  - Log raw responses for debugging
- Resource Management
  - Reuse model instances
  - Batch similar requests
  - Monitor token usage via the GetTokenUsage trait
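The graceful-fallback advice can be sketched with a simplified stand-in for CompletionError (only two of its variants) and a helper that tries models in order, logging each failure and returning the last error if every model fails. The helper and its closure-based `call` parameter are illustrative, not part of rig's API:

```rust
// Sketch: graceful fallback across models, using a simplified local
// stand-in for rig's CompletionError. Failures are logged and the next
// model is tried; the last error is returned if all models fail.
#[derive(Debug)]
enum CompletionError {
    ProviderError(String),
    ResponseError(String),
}

fn try_models(
    models: &[&str],
    call: impl Fn(&str) -> Result<String, CompletionError>,
) -> Result<String, CompletionError> {
    let mut last_err = CompletionError::ResponseError("no models configured".into());
    for model in models {
        match call(model) {
            Ok(response) => return Ok(response),
            Err(err) => {
                // Log the failure and fall through to the next model
                eprintln!("{model} failed: {err:?}");
                last_err = err;
            }
        }
    }
    Err(last_err)
}

fn main() {
    let result = try_models(&["primary", "backup"], |model| {
        if model == "primary" {
            Err(CompletionError::ProviderError("rate limited".into()))
        } else {
            Ok("ok".to_string())
        }
    });
    assert_eq!(result.unwrap(), "ok");
}
```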
See Also
API Reference (Completion)