Quick Start Guide

This guide will help you get rag_control running in 5 minutes.

1. Install rag_control and Adapters

pip install rag_control pinecone_adapter openai_adapter

2. Create a Policy Configuration

Create a file named policy_config.yaml:

policies:
  - name: strict_citations
    description: Strict policy with citation enforcement
    generation:
      reasoning_level: limited
      allow_external_knowledge: false
      require_citations: true
      temperature: 0.0
      max_output_tokens: 512
    enforcement:
      validate_citations: true
      block_on_missing_citations: true
      prevent_external_knowledge: true
    logging:
      level: full

  - name: soft_research
    description: Relaxed policy for exploratory research
    generation:
      reasoning_level: full
      allow_external_knowledge: false
      require_citations: true
      temperature: 0.1
      max_output_tokens: 1024
    enforcement:
      validate_citations: true
      block_on_missing_citations: false
      prevent_external_knowledge: true
    logging:
      level: full

filters:
  - name: enterprise_only
    condition:
      field: org_tier
      operator: equals
      value: enterprise
      source: user

orgs:
  - org_id: default
    description: Default organization
    default_policy: soft_research
    document_policy:
      top_k: 5

Learn more: See the Configuration Guide for detailed policy options, filters, and multi-org setups.
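To make the filter semantics concrete, here is a minimal sketch of how a condition like `enterprise_only` could be evaluated against user attributes. This is illustrative only: the `evaluate_condition` helper and the attribute layout are assumptions, not part of the rag_control API.

```python
# Illustrative only: a toy evaluator for filter conditions like the
# `enterprise_only` example above. The function name and attribute layout
# are assumptions, not rag_control's actual implementation.

def evaluate_condition(condition: dict, user_attributes: dict) -> bool:
    """Return True when the user attribute named by `field` matches `value`."""
    actual = user_attributes.get(condition["field"])
    if condition["operator"] == "equals":
        return actual == condition["value"]
    raise ValueError(f"unsupported operator: {condition['operator']}")

enterprise_only = {
    "field": "org_tier",
    "operator": "equals",
    "value": "enterprise",
    "source": "user",  # the condition reads from user-supplied attributes
}

print(evaluate_condition(enterprise_only, {"org_tier": "enterprise"}))  # True
print(evaluate_condition(enterprise_only, {"org_tier": "free"}))        # False
```

A filter like this would exclude non-enterprise users from matching documents at retrieval time, before any generation happens.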

3. Initialize the Engine

from rag_control import RAGControl
from rag_control.models import UserContext
from openai_adapter import OpenAILLMAdapter, OpenAIQueryEmbeddingAdapter
from pinecone_adapter import PineconeVectorStoreAdapter

# Initialize adapters
llm_adapter = OpenAILLMAdapter(
api_key="sk-your-openai-key",
model="gpt-4"
)

embedding_adapter = OpenAIQueryEmbeddingAdapter(
api_key="sk-your-openai-key",
model="text-embedding-3-small"
)

vector_store = PineconeVectorStoreAdapter(
api_key="your-pinecone-key",
index_name="documents",
embedding_model="text-embedding-3-small"
)

# Initialize rag_control
engine = RAGControl(
llm=llm_adapter,
query_embedding=embedding_adapter,
vector_store=vector_store,
config_path="policy_config.yaml"
)

# Create a user context
user_context = UserContext(
org_id="default",
user_id="user-123",
attributes={
"namespace": "demo",
"dept": "hr"
},
)

Learn more: See the API Reference for all initialization options and methods.

4. Run a Query

# Execute a query with governance
# Execute a query with governance
result = engine.run(
    query="What are the key findings from our latest report?",
    user_context=user_context
)

print(f"Policy applied: {result.policy_name}")
print(f"Enforcement passed: {result.enforcement_passed}")
print(f"Response: {result.response.content}")
print(f"Tokens used: {result.response.token_count}")

5. Stream Responses (Optional)

For streaming responses:

stream_result = engine.stream(
    query="Summarize the financial impact...",
    user_context=user_context
)

for chunk in stream_result.response:
    print(chunk.content, end="", flush=True)
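If you want to accumulate the streamed text while printing it, the same loop can collect chunks. The sketch below simulates a stream with a plain generator so the pattern can be run standalone; the chunk shape (a single `content` attribute) is an assumption about rag_control's chunk type.

```python
from dataclasses import dataclass
from typing import Iterator

# Illustrative only: simulate a streaming response so the consumption
# pattern can be run without live adapters. The Chunk shape is an
# assumption about rag_control's chunk type.

@dataclass
class Chunk:
    content: str

def fake_stream() -> Iterator[Chunk]:
    for piece in ["The financial ", "impact was ", "minimal."]:
        yield Chunk(content=piece)

collected = []
for chunk in fake_stream():
    print(chunk.content, end="", flush=True)
    collected.append(chunk.content)

full_text = "".join(collected)  # keep the complete response for later use
```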

What Just Happened?

Your RAG system now has:

  • Policy Enforcement: The response was validated against the soft_research policy
  • Citation Tracking: Citations were required and verified
  • Audit Logging: All requests and decisions are logged for compliance
  • Token Optimization: Token usage is tracked and reported

Learn more: Explore Core Concepts to understand how policies, governance, and enforcement work together.
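The enforcement behavior above can be sketched in a few lines. This is a toy model showing how `block_on_missing_citations` could change outcomes between `strict_citations` and `soft_research`; the bracketed-ID citation format and function names are assumptions, not rag_control's actual validation logic.

```python
import re

# Illustrative only: a toy citation check modeling the effect of
# `block_on_missing_citations`. The [doc-id] citation format and these
# function names are assumptions for illustration.

def extract_citations(text: str) -> set[str]:
    """Find citations written as [doc-id] in the response text."""
    return set(re.findall(r"\[([^\]]+)\]", text))

def enforce(response: str, retrieved_ids: set[str], block_on_missing: bool) -> bool:
    """Pass when the response cites only retrieved documents."""
    cited = extract_citations(response)
    valid = bool(cited) and cited <= retrieved_ids
    if valid:
        return True
    # strict_citations (block_on_missing=True) blocks the response;
    # soft_research (block_on_missing=False) lets it through with a warning.
    return not block_on_missing

retrieved = {"report-2024"}
cited_answer = "Revenue grew 12% [report-2024]."
uncited_answer = "Revenue grew 12%."

print(enforce(cited_answer, retrieved, block_on_missing=True))    # True
print(enforce(uncited_answer, retrieved, block_on_missing=True))  # False (blocked)
print(enforce(uncited_answer, retrieved, block_on_missing=False)) # True (warn only)
```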

Implementing Adapters

rag_control uses three adapter interfaces to connect to your infrastructure:

  • LLM Adapter: Connect to language models (OpenAI, Anthropic, etc.)
  • Query Embedding Adapter: Convert queries to vectors (OpenAI, Cohere, etc.)
  • Vector Store Adapter: Search documents (Pinecone, Weaviate, etc.)

For complete implementation examples, see the Adapters documentation.
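As a rough sketch of the adapter pattern, a custom query embedding adapter is just a class exposing the method the engine calls. The method name `embed` and its signature here are hypothetical stand-ins; consult the Adapters documentation for the actual interfaces rag_control expects.

```python
# Illustrative only: the `embed` method name and signature are hypothetical;
# see the Adapters documentation for the real interface definitions.

class ToyQueryEmbeddingAdapter:
    """Toy embedding adapter that hashes characters into a tiny fixed vector."""

    def __init__(self, dim: int = 4):
        self.dim = dim

    def embed(self, query: str) -> list[float]:
        vec = [0.0] * self.dim
        for i, ch in enumerate(query):
            vec[i % self.dim] += ord(ch) / 1000.0
        return vec

adapter = ToyQueryEmbeddingAdapter()
print(len(adapter.embed("hello")))  # 4
```

A real adapter would call an embedding provider instead of hashing characters, but the shape is the same: accept a query string, return a vector.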

Next Steps

Getting Help