# Quick Start Guide
This guide will help you get rag_control running in 5 minutes.
## 1. Install rag_control and Adapters

```bash
pip install rag_control pinecone_adapter openai_adapter
```
## 2. Create a Policy Configuration

Create a file named `policy_config.yaml`:
```yaml
policies:
  - name: strict_citations
    description: Strict policy with citation enforcement
    generation:
      reasoning_level: limited
      allow_external_knowledge: false
      require_citations: true
      temperature: 0.0
      max_output_tokens: 512
    enforcement:
      validate_citations: true
      block_on_missing_citations: true
      prevent_external_knowledge: true
    logging:
      level: full
  - name: soft_research
    description: Relaxed policy for exploratory research
    generation:
      reasoning_level: full
      allow_external_knowledge: false
      require_citations: true
      temperature: 0.1
      max_output_tokens: 1024
    enforcement:
      validate_citations: true
      block_on_missing_citations: false
      prevent_external_knowledge: true
    logging:
      level: full

filters:
  - name: enterprise_only
    condition:
      field: org_tier
      operator: equals
      value: enterprise
      source: user

orgs:
  - org_id: default
    description: Default organization
    default_policy: soft_research
    document_policy:
      top_k: 5
```
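Before handing this file to the engine, it can be worth sanity-checking that it parses and that every org's `default_policy` names a defined policy. The snippet below uses PyYAML (`pip install pyyaml`), a general-purpose parser that is not part of rag_control; the config is inlined in trimmed form so the check is self-contained.

```python
import yaml

# A trimmed, inlined copy of policy_config.yaml so this check runs standalone.
# In practice you would read the file: yaml.safe_load(open("policy_config.yaml"))
config_text = """
policies:
  - name: strict_citations
    generation:
      temperature: 0.0
  - name: soft_research
    generation:
      temperature: 0.1
orgs:
  - org_id: default
    default_policy: soft_research
"""

config = yaml.safe_load(config_text)
policy_names = [p["name"] for p in config["policies"]]

# Every org's default_policy should reference a policy defined above.
for org in config["orgs"]:
    assert org["default_policy"] in policy_names

print(policy_names)
```

Catching a typo in a policy name here is much cheaper than debugging it at request time.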
Learn more: See the Configuration Guide for detailed policy options, filters, and multi-org setups.
## 3. Initialize the Engine
```python
from rag_control import RAGControl
from rag_control.models import UserContext
from openai_adapter import OpenAILLMAdapter, OpenAIQueryEmbeddingAdapter
from pinecone_adapter import PineconeVectorStoreAdapter

# Initialize adapters
llm_adapter = OpenAILLMAdapter(
    api_key="sk-your-openai-key",
    model="gpt-4"
)

embedding_adapter = OpenAIQueryEmbeddingAdapter(
    api_key="sk-your-openai-key",
    model="text-embedding-3-small"
)

vector_store = PineconeVectorStoreAdapter(
    api_key="your-pinecone-key",
    index_name="documents",
    embedding_model="text-embedding-3-small"
)

# Initialize rag_control
engine = RAGControl(
    llm=llm_adapter,
    query_embedding=embedding_adapter,
    vector_store=vector_store,
    config_path="policy_config.yaml"
)

# Create a user context
user_context = UserContext(
    org_id="default",
    user_id="user-123",
    attributes={
        "namespace": "demo",
        "dept": "hr"
    },
)
```
Learn more: See the API Reference for all initialization options and methods.
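Hard-coded keys are fine for a five-minute demo, but for anything shared you will want to read them from the environment. This is plain standard-library Python, independent of rag_control; the variable names are conventional, not mandated by the library.

```python
import os

def load_key(var_name: str) -> str:
    """Fetch a required API key from the environment, failing loudly if missing."""
    value = os.environ.get(var_name)
    if not value:
        raise RuntimeError(f"Set {var_name} before initializing adapters")
    return value

# Seed a value for demonstration only; normally this is exported in your shell.
os.environ.setdefault("OPENAI_API_KEY", "sk-your-openai-key")
print(load_key("OPENAI_API_KEY"))
```

You would then pass `api_key=load_key("OPENAI_API_KEY")` when constructing the adapters above.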
## 4. Run a Query

```python
# Execute a query with governance
result = engine.run(
    query="What are the key findings from our latest report?",
    user_context=user_context
)

print(f"Policy applied: {result.policy_name}")
print(f"Enforcement passed: {result.enforcement_passed}")
print(f"Response: {result.response.content}")
print(f"Tokens used: {result.response.token_count}")
```
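A common pattern is to gate on `enforcement_passed` before surfacing the answer. The classes below are stand-ins that mirror only the result fields used above (the real object comes from `engine.run`); the gating logic is the point.

```python
from dataclasses import dataclass

@dataclass
class FakeResponse:
    # Stand-in for the response object; only the fields used above.
    content: str
    token_count: int

@dataclass
class FakeResult:
    # Stand-in for the result of engine.run(), for illustration only.
    policy_name: str
    enforcement_passed: bool
    response: FakeResponse

def render(result: FakeResult) -> str:
    # Only surface content when the enforcement checks passed.
    if not result.enforcement_passed:
        return f"Blocked by policy '{result.policy_name}'"
    return result.response.content

ok = FakeResult("soft_research", True, FakeResponse("All good.", 12))
blocked = FakeResult("strict_citations", False, FakeResponse("", 0))
print(render(ok))
print(render(blocked))
```

With `block_on_missing_citations: true` (as in `strict_citations` above), a failed check is exactly the case this guard handles.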
## 5. Stream Responses (Optional)

For streaming responses:

```python
stream_result = engine.stream(
    query="Summarize the financial impact...",
    user_context=user_context
)

for chunk in stream_result.response:
    print(chunk.content, end="", flush=True)
```
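The chunk objects above come from rag_control; to illustrate the consumption pattern itself, here is the same loop run against stand-in chunks (the `Chunk` class below is a stub for illustration, not the real type). Accumulating into a string is useful when you need the full text after streaming it to the user.

```python
from dataclasses import dataclass
from typing import Iterator

@dataclass
class Chunk:
    # Stub for the streamed chunk type; only .content matters for this pattern.
    content: str

def fake_stream() -> Iterator[Chunk]:
    # Simulates a response arriving incrementally in pieces.
    for piece in ["The financial ", "impact was ", "minimal."]:
        yield Chunk(content=piece)

# Same consumption pattern as engine.stream(): iterate, print, accumulate.
full_response = ""
for chunk in fake_stream():
    print(chunk.content, end="", flush=True)
    full_response += chunk.content
print()
```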
## What Just Happened?

Your RAG system now has:

- ✅ **Policy Enforcement**: The response was validated against the `soft_research` policy
- ✅ **Citation Tracking**: Citations were required and verified
- ✅ **Audit Logging**: All requests and decisions are logged for compliance
- ✅ **Token Optimization**: Token usage is tracked and reported
Learn more: Explore Core Concepts to understand how policies, governance, and enforcement work together.
## Implementing Adapters

rag_control uses three adapter interfaces to connect to your infrastructure:

- **LLM Adapter**: Connect to language models (OpenAI, Anthropic, etc.)
- **Query Embedding Adapter**: Convert queries to vectors (OpenAI, Cohere, etc.)
- **Vector Store Adapter**: Search documents (Pinecone, Weaviate, etc.)
For complete implementation examples, see the Adapters documentation.
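The exact base classes live in the Adapters documentation; as a rough sketch, an adapter is just an object that satisfies the interface the engine calls. The `generate` method name below is an assumption for illustration, not the confirmed rag_control contract, and the echo adapter exists only to show the shape.

```python
from typing import Protocol

class LLMAdapterLike(Protocol):
    # Hypothetical adapter shape; consult the Adapters docs for the real interface.
    def generate(self, prompt: str) -> str: ...

class EchoLLMAdapter:
    """A trivial adapter-style object, handy for testing wiring offline."""

    def generate(self, prompt: str) -> str:
        # Echo the prompt back instead of calling a model provider.
        return f"[echo] {prompt}"

adapter: LLMAdapterLike = EchoLLMAdapter()
print(adapter.generate("hello"))
```

Because the engine only depends on the interface, a stub like this lets you exercise policies and logging without spending tokens.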
## Next Steps
- Learn more about Core Concepts
- Understand Configuration in detail
- Check API Reference
- Explore Observability
## Getting Help
- 📚 Read the documentation
- 🐛 Report issues on GitHub
- 💬 Ask questions on GitHub Discussions