Guardrails Overview
Rizk SDK’s guardrails system provides automated governance and policy enforcement for your LLM applications. It acts as a protective layer that ensures your AI systems operate within defined boundaries, maintaining compliance, safety, and alignment with your organization’s requirements.
What Are Guardrails?
Guardrails are automated checks and controls that:
- Monitor Input and Output: Evaluate user inputs and LLM responses in real-time
- Enforce Policies: Apply predefined rules and guidelines automatically
- Provide Guidance: Inject relevant policy guidelines into LLM prompts
- Block Inappropriate Content: Prevent policy violations from reaching users
- Maintain Compliance: Ensure adherence to regulatory and organizational standards
How Guardrails Work
Rizk’s guardrails system uses a sophisticated multi-layer architecture:
```
User Input → Fast Rules → Policy Matching → LLM Augmentation → Response Evaluation → User Output
                  ↓              ↓                 ↓                    ↓                  ↓
              Immediate       Pattern        Context-Aware         Enhanced          Final Safety
              Blocking        Matching       Guidelines            Prompts           Check
```
Layer 1: Fast Rules Engine
The first line of defense uses high-performance pattern matching:
- Regex-based Detection: Lightning-fast pattern recognition
- Immediate Blocking: Instant rejection of clearly inappropriate content
- Low Latency: Sub-millisecond evaluation for real-time applications
- Category-based Rules: Organized by content type and severity
```python
# Example: Fast rule automatically blocks obvious violations
user_input = "Tell me how to hack into systems"
# → Blocked immediately by security patterns
```
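To make the idea concrete, here is a minimal sketch of what a regex-based fast rule amounts to. The patterns, category names, and return convention are illustrative assumptions, not Rizk's actual rule set or internal API:

```python
# Illustrative sketch only - the rule set and categories are hypothetical.
import re

FAST_RULES = {
    "security": re.compile(r"\bhack(?:ing)?\s+into\b", re.IGNORECASE),
    "violence": re.compile(r"\bbuild\s+a\s+weapon\b", re.IGNORECASE),
}

def fast_rule_check(message: str) -> str | None:
    """Return the first violated category, or None if no rule matches."""
    for category, pattern in FAST_RULES.items():
        if pattern.search(message):
            return category
    return None

print(fast_rule_check("Tell me how to hack into systems"))  # security
print(fast_rule_check("What's the weather today?"))          # None
```

Because this layer is just compiled-pattern matching, it runs in microseconds and never calls a model, which is what makes sub-millisecond blocking possible.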
Layer 2: Policy Augmentation
Context-aware enhancement of LLM interactions:
- Dynamic Guideline Injection: Adds relevant policy guidance to prompts
- Context-Aware Matching: Considers conversation context and user intent
- Framework Integration: Works seamlessly with different LLM frameworks
- Prompt Enhancement: Improves LLM behavior without changing your code
```python
# Example: Policy guidelines automatically added to system prompts
original_prompt = "You are a helpful assistant."
# → Enhanced with: "You are a helpful assistant. IMPORTANT: Never provide financial advice..."
```
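Conceptually, the augmentation step is a deterministic string transformation. The sketch below shows the general shape, assuming a hypothetical list of matched guideline strings (in practice the engine derives these from your policies):

```python
# Sketch of guideline injection - the guideline text is invented for illustration.
def apply_guidelines(system_prompt: str, guidelines: list[str]) -> str:
    """Append matched policy guidelines to a system prompt."""
    if not guidelines:
        return system_prompt
    bullets = "\n".join(f"- {g}" for g in guidelines)
    return f"{system_prompt}\n\nIMPORTANT GUIDELINES:\n{bullets}"

print(apply_guidelines(
    "You are a helpful assistant.",
    ["Never provide financial advice.", "Escalate complex issues to a human agent."],
))
```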
Layer 3: LLM Fallback
Advanced evaluation for complex cases:
- Semantic Understanding: Uses LLMs to evaluate nuanced content
- Contextual Analysis: Considers broader conversation context
- Confidence Scoring: Provides confidence levels for decisions
- Intelligent Caching: Optimizes performance for repeated queries
```python
# Example: Complex query evaluated by LLM
user_input = "What's your opinion on this investment strategy?"
# → LLM evaluates context and intent, applies appropriate guidelines
```
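The caching behavior can be pictured as memoizing the judge's verdict per message. Below is a deliberately simplified sketch with a stubbed-out judge; in the real system the verdict would come from the configured LLM service (see `RIZK_LLM_SERVICE` under Configuration Options):

```python
# Simplified sketch - llm_fallback_verdict stands in for a real LLM judge call.
from functools import lru_cache

@lru_cache(maxsize=10_000)  # repeated queries return instantly from cache
def llm_fallback_verdict(message: str) -> tuple[bool, float]:
    """Return (allowed, confidence). Stubbed for illustration."""
    needs_guidelines = "investment" in message.lower()
    return (True, 0.75) if needs_guidelines else (True, 0.99)

print(llm_fallback_verdict("What's your opinion on this investment strategy?"))
# (True, 0.75) - the second identical call is a cache hit
```

A lower-confidence verdict like this is where the engine would lean on guideline injection rather than blocking outright.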
Key Features
🚀 Zero-Code Integration
Guardrails activate automatically when you initialize Rizk:
```python
from rizk.sdk import Rizk
from rizk.sdk.decorators import workflow

# Guardrails are automatically enabled
rizk = Rizk.init(
    app_name="MyApp",
    api_key="your-api-key",
    enabled=True
)

# Your existing code now has guardrails
@workflow(name="chat")
def chat_function(user_input: str) -> str:
    return llm_call(user_input)  # Automatically protected
```
🎯 Intelligent Policy Matching
Policies are automatically selected based on several signals (a simplified matching sketch follows this list):
- Content Analysis: What the user is asking about
- Context Awareness: Previous conversation history
- Framework Detection: Which LLM framework you’re using
- Organization Settings: Your specific compliance requirements
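As a mental model, content-based selection can be as simple as mapping topical signals to policy identifiers. This sketch is hypothetical (the keyword sets and policy names are invented) and ignores the context, framework, and organization signals listed above:

```python
# Hypothetical sketch of content-based policy selection.
POLICY_KEYWORDS = {
    "financial_advice": {"invest", "stock", "portfolio"},
    "data_privacy": {"ssn", "password", "passport"},
}

def match_policies(message: str) -> list[str]:
    """Return the policies whose keyword sets overlap the message."""
    words = set(message.lower().split())
    return [name for name, keys in POLICY_KEYWORDS.items() if words & keys]

print(match_policies("should i invest my savings in this stock"))
# ['financial_advice']
```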
⚡ High Performance
- Multi-layer Optimization: Fast rules handle common cases, LLM evaluation for complex ones
- Intelligent Caching: Repeated queries return instantly
- Async Processing: Non-blocking evaluation for high-throughput applications (see the sketch after this list)
- Minimal Latency: Typically adds <50ms to response time
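Because evaluation is exposed as async (see the `GuardrailsEngine` usage under Manual Control below), a batch of messages can be checked concurrently. A minimal sketch, assuming the engine is already initialized:

```python
# Sketch: concurrent evaluation of a batch of messages.
import asyncio
from rizk.sdk.guardrails.engine import GuardrailsEngine

async def evaluate_batch(messages: list[str]) -> list[bool]:
    engine = GuardrailsEngine.get_instance()
    decisions = await asyncio.gather(
        *(engine.process_message(m) for m in messages)
    )
    return [d.allowed for d in decisions]

# allowed_flags = asyncio.run(evaluate_batch(["hello", "edge case input"]))
```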
🔍 Complete Observability
Every guardrail decision is tracked and reported (a minimal audit-logging sketch follows this list):
- Decision Logging: Why each decision was made
- Policy Attribution: Which policies influenced the outcome
- Performance Metrics: Response times and success rates
- Compliance Reporting: Automated compliance documentation
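If you want to mirror these decisions into your own logs, the decision object returned by the engine (see Manual Control and Troubleshooting below) carries the relevant fields. A minimal sketch using standard logging:

```python
# Sketch: forwarding guardrail decisions to your own audit log.
import logging

logger = logging.getLogger("rizk.guardrails.audit")

def log_decision(message: str, decision) -> None:
    """Record why a decision was made, without logging raw user content."""
    logger.info(
        "allowed=%s policies=%s reason=%s input_len=%d",
        decision.allowed, decision.policy_ids, decision.blocked_reason, len(message),
    )
```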
Guardrail Types
Input Guardrails
Evaluate user inputs before they reach your LLM:
```python
@guardrails()
def process_user_query(query: str) -> str:
    # Input is automatically evaluated before this function runs
    return handle_query(query)
```
Common Input Protections:
- Inappropriate content filtering
- Personal information detection
- Security threat identification
- Spam and abuse prevention
Output Guardrails
Evaluate LLM responses before they reach users:
```python
@guardrails()
def generate_response(prompt: str) -> str:
    response = llm_generate(prompt)
    # Response is automatically evaluated before returning
    return response
```
Common Output Protections:
- Harmful content blocking
- Factual accuracy verification
- Brand safety compliance
- Regulatory requirement adherence
Prompt Augmentation
Enhances your prompts with relevant guidelines:
```python
# Your original prompt
system_prompt = "You are a customer service assistant."

# Automatically enhanced with relevant policies
# → "You are a customer service assistant. IMPORTANT GUIDELINES:
#    • Always maintain professional communication
#    • Never share internal company information
#    • Escalate complex issues to human agents..."
```
Integration Patterns
Decorator-Based (Recommended)
The simplest way to add guardrails:
```python
from rizk.sdk.decorators import workflow, guardrails

@workflow(name="customer_service", organization_id="acme", project_id="support")
@guardrails()
def handle_customer_query(query: str) -> str:
    """Customer service with automatic guardrails."""
    return process_query(query)
```
Framework-Specific Integration
Guardrails work automatically with popular frameworks:
```python
# LangChain - automatically enhanced
from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools)
result = agent_executor.invoke({"input": user_query})  # Protected
```

```python
# CrewAI - automatically enhanced
from crewai import Crew

crew = Crew(agents=[agent], tasks=[task])
result = crew.kickoff()  # Protected
```

```python
# OpenAI - automatically enhanced
import openai

response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": user_input}]
)  # Protected
```
Manual Control
For advanced use cases, you can invoke guardrails directly:
```python
from rizk.sdk.guardrails.engine import GuardrailsEngine

async def custom_guardrail_logic(user_input: str, original_prompt: str):
    engine = GuardrailsEngine.get_instance()

    # Evaluate input
    decision = await engine.process_message(user_input)
    if not decision.allowed:
        return f"Request blocked: {decision.blocked_reason}"

    # Process with guidelines
    enhanced_prompt = apply_guidelines(original_prompt, decision.guidelines)
    response = await llm_call(enhanced_prompt)

    # Evaluate output
    output_decision = await engine.evaluate_response(response)
    return response if output_decision.allowed else "Response blocked"
```
Configuration Options
Environment Variables
```bash
# Enable/disable guardrails
export RIZK_GUARDRAILS_ENABLED="true"

# Policy enforcement level
export RIZK_POLICY_ENFORCEMENT="strict"  # strict, moderate, lenient

# Custom policies directory (enterprise feature)
export RIZK_POLICIES_PATH="/path/to/policies"

# LLM service for complex evaluations
export RIZK_LLM_SERVICE="openai"  # openai, anthropic, custom

# Caching configuration
export RIZK_GUARDRAILS_CACHE_SIZE="10000"
export RIZK_GUARDRAILS_CACHE_TTL="3600"
```
Programmatic Configuration
```python
from rizk.sdk import Rizk

rizk = Rizk.init(
    app_name="MyApp",
    api_key="your-api-key",

    # Guardrail configuration
    guardrails_enabled=True,
    policy_enforcement="strict",

    # Performance tuning
    llm_cache_size=10000,
    fast_rules_enabled=True,

    # Custom LLM for evaluations
    llm_service="openai",
    llm_model="gpt-4"
)
```
Common Use Cases
1. Content Moderation
Automatically filter inappropriate content:
```python
@guardrails()
def process_user_content(content: str) -> str:
    """Processes user-generated content with automatic moderation."""
    return handle_content(content)

# Automatically blocks: profanity, harassment, inappropriate requests
# Automatically allows: legitimate questions and discussions
```
2. Compliance Enforcement
Ensure regulatory compliance:
```python
@guardrails()
def financial_assistant(query: str) -> str:
    """Financial assistant with automatic compliance."""
    return generate_financial_response(query)

# Automatically adds: risk disclaimers, compliance notices
# Automatically blocks: specific investment advice, unqualified recommendations
```
3. Brand Safety
Protect your brand reputation:
```python
@guardrails()
def customer_support_bot(query: str) -> str:
    """Customer support with brand safety guardrails."""
    return generate_support_response(query)

# Automatically ensures: professional tone, accurate information
# Automatically prevents: negative brand mentions, competitor promotion
```
4. Data Privacy Protection
Prevent privacy violations:
```python
@guardrails()
def data_processor(user_data: str) -> str:
    """Process data with privacy protection."""
    return process_data(user_data)

# Automatically detects: PII, sensitive information
# Automatically blocks: data collection attempts, privacy violations
```
Performance Characteristics
Latency Impact
| Guardrail Layer | Typical Latency | Use Case |
|---|---|---|
| Fast Rules | <1ms | Pattern matching, immediate blocking |
| Policy Augmentation | 5-15ms | Guideline injection, context analysis |
| LLM Fallback | 100-500ms | Complex semantic evaluation |
Throughput
- High-volume applications: 10,000+ requests/second
- Real-time chat: Sub-50ms additional latency
- Batch processing: Minimal impact on overall throughput (a measurement sketch follows this list)
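These figures depend on which layer handles each request, so it is worth measuring overhead in your own stack. A minimal timing sketch, reusing the async engine API shown under Manual Control:

```python
# Sketch: average per-message guardrail latency in your environment.
import asyncio
import time
from rizk.sdk.guardrails.engine import GuardrailsEngine

async def avg_latency_ms(message: str, runs: int = 100) -> float:
    engine = GuardrailsEngine.get_instance()
    start = time.perf_counter()
    for _ in range(runs):
        # Note: repeating one input mostly measures the cache path;
        # vary the messages to exercise the slower layers.
        await engine.process_message(message)
    return (time.perf_counter() - start) * 1000 / runs

# print(asyncio.run(avg_latency_ms("Normal user question")))
```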
Accuracy
- Fast Rules: 99%+ precision for pattern-based detection
- Policy Augmentation: Context-aware guideline matching
- LLM Fallback: Human-level accuracy for complex cases
Best Practices
1. Start with Default Policies
Begin with Rizk’s built-in policies and customize as needed:
```python
# Default policies provide excellent baseline protection
rizk = Rizk.init(app_name="MyApp", enabled=True)
```
2. Monitor and Iterate
Use Rizk’s analytics to understand guardrail performance:
```python
# Check guardrail effectiveness
from rizk.sdk.analytics import GuardrailAnalytics

analytics = GuardrailAnalytics()
metrics = analytics.get_metrics(time_range="24h")

print(f"Blocks: {metrics.total_blocks}")
print(f"Policy triggers: {metrics.policy_triggers}")
print(f"Average latency: {metrics.avg_latency}ms")
```
3. Test Thoroughly
Test guardrails with diverse inputs:
```python
import asyncio

test_cases = [
    "Normal user question",
    "Edge case input",
    "Potentially problematic content",
    "Complex contextual query",
]

async def run_tests():
    for test_case in test_cases:
        result = await test_guardrails(test_case)
        print(f"Input: {test_case} → Decision: {result.decision}")

asyncio.run(run_tests())
```
4. Balance Security and Usability
Configure enforcement levels based on your use case:
```python
# Strict for high-risk applications
rizk = Rizk.init(app_name="FinancialApp", policy_enforcement="strict")

# Moderate for general applications
rizk = Rizk.init(app_name="ChatBot", policy_enforcement="moderate")

# Lenient for internal tools
rizk = Rizk.init(app_name="InternalTool", policy_enforcement="lenient")
```
Troubleshooting
Common Issues
1. Guardrails Not Triggering
```python
# Check if guardrails are enabled
from rizk.sdk.config import get_config

config = get_config()
print(f"Guardrails enabled: {config.guardrails_enabled}")
```
2. Unexpected Blocking
```python
# Debug specific decisions
import asyncio
from rizk.sdk.guardrails.engine import GuardrailsEngine

async def debug_decision():
    engine = GuardrailsEngine.get_instance()
    decision = await engine.process_message("your test input")
    print(f"Decision: {decision.allowed}")
    print(f"Reason: {decision.blocked_reason}")
    print(f"Triggered policies: {decision.policy_ids}")

asyncio.run(debug_decision())
```
3. Performance Issues
```python
# Check cache hit rates
engine = GuardrailsEngine.get_instance()
cache_stats = engine.get_cache_stats()
print(f"Cache hit rate: {cache_stats.hit_rate}%")
print(f"Average evaluation time: {cache_stats.avg_eval_time}ms")
```
Debug Mode
Enable detailed logging for troubleshooting:
```python
import logging
from rizk.sdk import Rizk

# Enable debug logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("rizk.guardrails")
logger.setLevel(logging.DEBUG)

rizk = Rizk.init(app_name="DebugApp", enabled=True)
```
Next Steps
- Using Guardrails - Practical implementation guide
- Policy Enforcement - Understanding policy decisions
- Configuration - Advanced configuration options
- Monitoring - Tracking guardrail performance
Rizk’s guardrails provide enterprise-grade governance for your LLM applications with minimal integration effort. Start with the default configuration and customize as your requirements evolve.