# Rig-SurrealDB Integration
The `rig-surrealdb` crate provides a seamless vector store integration for SurrealDB, enabling similarity search over embedded documents using vector operations. SurrealDB is a fast, cloud-native, distributed database designed for modern apps, now supercharged with LLM vector search support via Rig.
## Key Features

- SurrealDB-native Search: Uses SurrealDB's built-in vector functions such as `vector::similarity::cosine`.
- Flexible Distance Metrics: Supports multiple distance functions including Cosine, Euclidean, Hamming, Jaccard, and KNN.
- Ergonomic Interface: Fully implements the `VectorStoreIndex` trait from Rig, with simple insert and query APIs.
- LLM-Aware Embeddings: Designed to work with embedding models like OpenAI's `text-embedding-3-small` out of the box.
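Under the hood, cosine similarity is simply the dot product of the two embedding vectors divided by the product of their magnitudes. The following stdlib-only Rust sketch illustrates the math that a function like `vector::similarity::cosine` computes; it is not the crate's or SurrealDB's actual implementation (the database evaluates this server-side):

```rust
// Cosine similarity: dot(a, b) / (|a| * |b|).
// Illustrative sketch only -- SurrealDB computes this natively.
fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    // Vectors pointing the same direction score 1.0, orthogonal ones 0.0.
    println!("{:.3}", cosine_similarity(&[1.0, 2.0], &[2.0, 4.0])); // 1.000
    println!("{:.3}", cosine_similarity(&[1.0, 0.0], &[0.0, 1.0])); // 0.000
}
```

Because the score depends only on direction, two documents with similar meaning but different embedding magnitudes still rank close together, which is why cosine is a common default for text embeddings.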
## Usage

### Setup

Add `rig-surrealdb` to your `Cargo.toml`:

```toml
[dependencies]
rig-surrealdb = "0.1.0"
```
### Example Workflow

- Connect to SurrealDB: Use an in-memory or remote connection.

```rust
let surreal = Surreal::new::<Mem>(()).await?;
surreal.use_ns("example").use_db("example").await?;
```

- Initialize Embedding Model: Create or import an embedding model using Rig's providers.

```rust
let model = rig::providers::openai::Client::from_env()
    .embedding_model(rig::providers::openai::TEXT_EMBEDDING_3_SMALL);
```

- Create Vector Store: Use defaults or customize the table and distance function.

```rust
let vector_store = SurrealVectorStore::with_defaults(model, surreal);
```

- Insert Documents: Automatically embed and insert documents into the vector index.

```rust
vector_store.insert_documents(documents).await?;
```

- Query Similarity: Perform top-N vector search using a natural language query.

```rust
let results = vector_store.top_n::<WordDefinition>("what is glarb-glarb", 3).await?;
```
### Example Code

```rust
use rig::{embeddings::EmbeddingsBuilder, vector_store::VectorStoreIndex, Embed};
use rig_surrealdb::{Mem, SurrealVectorStore};
use serde::{Deserialize, Serialize};
use surrealdb::Surreal;

#[derive(Embed, Serialize, Deserialize, Clone, Debug, Eq, PartialEq, Default)]
struct WordDefinition {
    word: String,
    #[serde(skip)]
    #[embed]
    definition: String,
}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let openai_client = rig::providers::openai::Client::from_env();
    let model = openai_client.embedding_model(rig::providers::openai::TEXT_EMBEDDING_3_SMALL);

    // In-memory SurrealDB instance; swap in a remote connection for production.
    let surreal = Surreal::new::<Mem>(()).await?;
    surreal.use_ns("example").use_db("example").await?;

    let words = vec![
        WordDefinition {
            word: "flurbo".to_string(),
            definition: "A fictional currency from Rick and Morty.".to_string(),
        },
        WordDefinition {
            word: "glarb-glarb".to_string(),
            definition: "A creature from the marshlands of Glibbo.".to_string(),
        },
    ];

    // Embed the `definition` field of each document (marked with #[embed]).
    let documents = EmbeddingsBuilder::new(model.clone())
        .documents(words)
        .unwrap()
        .build()
        .await?;

    let vector_store = SurrealVectorStore::with_defaults(model, surreal);
    vector_store.insert_documents(documents).await?;

    let query = "weird alien creature";
    let results = vector_store.top_n::<WordDefinition>(query, 2).await?;

    for (distance, _id, doc) in results {
        println!("Distance: {:.3}, Word: {}", distance, doc.word);
    }

    Ok(())
}
```
## Supported Distance Functions

You can customize similarity search using different distance metrics:

```rust
use rig_surrealdb::SurrealDistanceFunction;

let custom_store = SurrealVectorStore::new(
    model,
    surreal,
    Some("my_table".into()),
    SurrealDistanceFunction::Jaccard,
);
```

Available options:

- `Cosine` (default)
- `Euclidean`
- `Hamming`
- `Jaccard`
- `Knn`
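The metric you choose changes which neighbors rank highest. A stdlib-only Rust sketch (illustrative only, not crate code) showing that cosine similarity ignores vector magnitude while Euclidean distance does not:

```rust
// Comparing two of the listed metrics on the same pair of vectors.
fn euclidean(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>().sqrt()
}

fn cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    dot / (na * nb)
}

fn main() {
    let a = [1.0, 2.0, 3.0];
    let b = [10.0, 20.0, 30.0]; // same direction, 10x the magnitude

    // Cosine sees these as identical (same direction)...
    println!("cosine: {:.3}", cosine(&a, &b)); // 1.000
    // ...while Euclidean distance sees them as far apart.
    println!("euclidean: {:.3}", euclidean(&a, &b));
}
```

For normalized text embeddings the two often agree, but when magnitudes carry meaning (or vectors are binary, where Hamming and Jaccard apply), the choice matters.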
## Additional Resources
- Examples: Check the examples directory for advanced usage.
- SurrealDB Docs: Visit the SurrealDB documentation for information on query language and capabilities.