Guardrails Monitoring
This guide covers monitoring and analyzing your guardrails system to ensure optimal performance, effectiveness, and user experience. Proper monitoring helps you understand policy impact, identify issues, and optimize your configuration.
Overview
Rizk provides comprehensive monitoring capabilities for guardrails:
- Real-time Metrics: Track performance and decisions as they happen
- Policy Analytics: Understand which policies are most effective
- Performance Monitoring: Monitor latency, cache hit rates, and resource usage
- Compliance Reporting: Generate reports for regulatory requirements
- Alerting: Get notified when thresholds are exceeded
Key Metrics
Decision Metrics
Track guardrail decisions and their outcomes:
from rizk.sdk.analytics import GuardrailAnalytics
analytics = GuardrailAnalytics()
# Get decision metrics for the last 24 hours
metrics = analytics.get_decision_metrics(time_range="24h")

print(f"Total decisions: {metrics.total_decisions}")
print(f"Allowed: {metrics.allowed_count} ({metrics.allowed_percentage:.1f}%)")
print(f"Blocked: {metrics.blocked_count} ({metrics.blocked_percentage:.1f}%)")
print(f"Average confidence: {metrics.avg_confidence:.2f}")
print(f"Decisions per minute: {metrics.decisions_per_minute:.1f}")
Performance Metrics
Monitor system performance and resource usage:
# Get performance metrics
perf_metrics = analytics.get_performance_metrics(time_range="24h")

print(f"Average latency: {perf_metrics.avg_latency_ms:.1f}ms")
print(f"95th percentile latency: {perf_metrics.p95_latency_ms:.1f}ms")
print(f"99th percentile latency: {perf_metrics.p99_latency_ms:.1f}ms")
print(f"Cache hit rate: {perf_metrics.cache_hit_rate:.1f}%")
print(f"Error rate: {perf_metrics.error_rate:.2f}%")
print(f"Throughput: {perf_metrics.requests_per_second:.1f} req/s")
Policy Effectiveness
Measure how well your policies are working:
# Get policy effectiveness metrics
policy_metrics = analytics.get_policy_effectiveness(
    policy_id="content_moderation",
    time_range="7d"
)

print(f"Policy triggers: {policy_metrics.trigger_count}")
print(f"True positives: {policy_metrics.true_positives}")
print(f"False positives: {policy_metrics.false_positives}")
print(f"Precision: {policy_metrics.precision:.2f}")
print(f"Recall: {policy_metrics.recall:.2f}")
print(f"F1 Score: {policy_metrics.f1_score:.2f}")
Real-Time Monitoring
Dashboard Setup
Set up a real-time monitoring dashboard:
from rizk.sdk.monitoring import GuardrailDashboard
# Create dashboard instance
dashboard = GuardrailDashboard()

# Configure dashboard widgets
dashboard.add_widget("decision_rate", {
    "title": "Decisions per Minute",
    "type": "line_chart",
    "metric": "decisions_per_minute",
    "time_range": "1h"
})

dashboard.add_widget("block_rate", {
    "title": "Block Rate",
    "type": "gauge",
    "metric": "block_percentage",
    "time_range": "5m",
    "alert_threshold": 20  # Alert if block rate > 20%
})

dashboard.add_widget("latency", {
    "title": "Response Latency",
    "type": "histogram",
    "metric": "latency_distribution",
    "time_range": "1h"
})

dashboard.add_widget("top_policies", {
    "title": "Most Active Policies",
    "type": "table",
    "metric": "policy_trigger_counts",
    "time_range": "24h",
    "limit": 10
})

# Start dashboard server
dashboard.start(port=8080)
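Once started, the dashboard serves the widgets configured above on the chosen port (8080 in this example), and the block-rate gauge raises an alert when its 20% threshold is exceeded, as set via alert_threshold.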
Best Practices
1. Monitor Key Metrics
Focus on the most important metrics:
# ✅ Essential metrics to monitor
essential_metrics = [
    "decision_rate",      # Throughput
    "block_rate",         # Policy effectiveness
    "avg_latency",        # Performance
    "cache_hit_rate",     # Efficiency
    "error_rate",         # Reliability
    "user_satisfaction"   # User experience
]

for metric in essential_metrics:
    analytics.add_to_dashboard(metric, alert_threshold=True)
2. Set Appropriate Thresholds
Configure meaningful alert thresholds:
# ✅ Contextual thresholds
thresholds = {
    "block_rate": {
        "warning": 15,    # warn above a 15% block rate
        "critical": 30    # critical above a 30% block rate
    },
    "avg_latency": {
        "warning": 200,   # warn above 200ms average latency
        "critical": 500   # critical above 500ms average latency
    },
    "cache_hit_rate": {
        "warning": 70,    # warn below a 70% cache hit rate
        "critical": 50    # critical below a 50% cache hit rate
    }
}
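How these thresholds feed into alerting depends on your setup. As a minimal, framework-agnostic sketch (notify_on_call is a placeholder, not part of the Rizk API; the metric attribute names reuse those from the analytics examples above), you could periodically compare current metrics against the table:

# Minimal threshold-check sketch; notify_on_call stands in for your real
# alerting channel (Slack webhook, PagerDuty, email, ...).
def notify_on_call(message: str) -> None:
    print(f"[ALERT] {message}")  # replace with a real notification call

def check_thresholds(analytics, thresholds):
    perf = analytics.get_performance_metrics(time_range="5m")
    decisions = analytics.get_decision_metrics(time_range="5m")
    current = {
        "block_rate": decisions.blocked_percentage,
        "avg_latency": perf.avg_latency_ms,
        "cache_hit_rate": perf.cache_hit_rate,
    }

    for name, limits in thresholds.items():
        value = current.get(name)
        if value is None:
            continue
        # cache_hit_rate is a "lower is worse" metric; the others alert when too high
        too_low = name == "cache_hit_rate"
        critical = value < limits["critical"] if too_low else value > limits["critical"]
        warning = value < limits["warning"] if too_low else value > limits["warning"]
        if critical:
            notify_on_call(f"CRITICAL: {name} = {value:.1f}")
        elif warning:
            notify_on_call(f"WARNING: {name} = {value:.1f}")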
3. Regular Review and Analysis
Establish regular monitoring reviews:
# ✅ Weekly monitoring review
def weekly_monitoring_review():
    """Perform weekly analysis of guardrail performance."""

    # Get weekly metrics
    metrics = analytics.get_metrics(time_range="7d")

    # Check for trends
    trends = analytics.get_trends(time_range="30d")

    # Generate insights
    insights = analytics.generate_insights(metrics, trends)

    # Create review report
    report = {
        "period": "week",
        "key_metrics": metrics,
        "trends": trends,
        "insights": insights,
        "recommendations": analytics.get_recommendations(insights)
    }

    return report
# Schedule weekly reviews
import schedule
schedule.every().monday.at("09:00").do(weekly_monitoring_review)
# Note: scheduled jobs only fire from a run loop that calls schedule.run_pending() periodically.
Next Steps
- Configuration - Optimize your guardrails configuration
- Policy Enforcement - Understand policy decisions
- Using Guardrails - Implement guardrails effectively
- Overview - Understand the guardrails system
Effective monitoring is essential for maintaining optimal guardrail performance. Regular analysis and proactive optimization ensure your guardrails continue to provide value while maintaining excellent user experience.