Performance Issues

This guide helps you identify, diagnose, and resolve performance problems when using Rizk SDK.

🚀 Performance Overview

Rizk SDK is designed for minimal overhead, but performance can be affected by:

  • Framework Detection: Pattern matching and caching
  • Guardrails Evaluation: Policy processing and LLM calls
  • Adapter Registration: Framework and LLM client patching
  • Telemetry Collection: Tracing and metrics overhead
  • Cache Management: Memory and Redis operations

📊 Performance Monitoring

Built-in Performance Instrumentation

Rizk SDK includes performance monitoring capabilities:

from rizk.sdk.performance import performance_instrumented
from rizk.sdk.analytics.processors import PerformanceMonitoringProcessor

# Enable performance monitoring
@performance_instrumented("my_operation")
def my_function():
    return "result"
# Check performance stats
processor = PerformanceMonitoringProcessor()
stats = processor.get_performance_stats()
print(f"Average latency: {stats.get('avg_latency_ms', 0):.2f}ms")
print(f"P95 latency: {stats.get('p95_latency_ms', 0):.2f}ms")
print(f"Total operations: {stats.get('total_operations', 0)}")

Custom Performance Profiler

Create a detailed profiler for SDK operations:

import time
import cProfile
import pstats
import io
from functools import wraps
from typing import Callable, Dict, List
from contextlib import contextmanager


class RizkPerformanceProfiler:
    """Performance profiler for Rizk SDK operations."""

    def __init__(self):
        self.operation_times: Dict[str, List[float]] = {}
        self.total_calls: Dict[str, int] = {}

    @contextmanager
    def profile_operation(self, operation_name: str):
        """Context manager to profile an operation."""
        start_time = time.time()
        try:
            yield
        finally:
            elapsed = time.time() - start_time
            if operation_name not in self.operation_times:
                self.operation_times[operation_name] = []
                self.total_calls[operation_name] = 0
            self.operation_times[operation_name].append(elapsed * 1000)  # Convert to ms
            self.total_calls[operation_name] += 1

    def get_stats(self) -> Dict[str, Dict[str, float]]:
        """Get performance statistics."""
        stats = {}
        for operation, times in self.operation_times.items():
            if times:
                stats[operation] = {
                    "total_calls": self.total_calls[operation],
                    "avg_time_ms": sum(times) / len(times),
                    "min_time_ms": min(times),
                    "max_time_ms": max(times),
                    "total_time_ms": sum(times),
                    "p95_time_ms": sorted(times)[int(len(times) * 0.95)] if len(times) > 1 else times[0],
                    "p99_time_ms": sorted(times)[int(len(times) * 0.99)] if len(times) > 1 else times[0],
                }
        return stats

    def print_report(self):
        """Print a formatted performance report."""
        stats = self.get_stats()
        print("\n📊 Rizk SDK Performance Report")
        print("=" * 80)
        if not stats:
            print("No performance data collected.")
            return

        # Sort by total time
        sorted_stats = sorted(stats.items(), key=lambda x: x[1]["total_time_ms"], reverse=True)
        print(f"{'Operation':<30} {'Calls':<8} {'Avg(ms)':<10} {'Total(ms)':<12} {'P95(ms)':<10}")
        print("-" * 80)
        for operation, data in sorted_stats:
            operation_short = operation[-28:] if len(operation) > 30 else operation
            print(f"{operation_short:<30} {data['total_calls']:<8} "
                  f"{data['avg_time_ms']:<10.2f} {data['total_time_ms']:<12.2f} "
                  f"{data['p95_time_ms']:<10.2f}")

        # Flag potential performance issues
        print("\n⚠️ Performance Issues:")
        issues_found = False
        for operation, data in stats.items():
            if data["avg_time_ms"] > 100:  # More than 100ms average
                print(f"  • {operation}: High average latency ({data['avg_time_ms']:.1f}ms)")
                issues_found = True
            if data["p95_time_ms"] > 500:  # P95 more than 500ms
                print(f"  • {operation}: High P95 latency ({data['p95_time_ms']:.1f}ms)")
                issues_found = True
            if data["total_calls"] > 1000:  # More than 1000 calls
                print(f"  • {operation}: High call frequency ({data['total_calls']} calls)")
                issues_found = True
        if not issues_found:
            print("  No significant performance issues detected.")


# Global profiler instance
profiler = RizkPerformanceProfiler()
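
To use the profiler, wrap the operations you want to measure in profile_operation and print a report afterwards. A minimal usage sketch (the operation names are arbitrary labels; detect_framework is the same helper used in the diagnosis snippets below):

from rizk.sdk.utils.framework_detection import detect_framework

# Time a framework detection call under an arbitrary label
with profiler.profile_operation("framework_detection"):
    detect_framework()

# Time any other block of work the same way
with profiler.profile_operation("my_business_logic"):
    sum(range(100_000))  # placeholder workload

# Print the aggregated report
profiler.print_report()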

🔧 Common Performance Issues

1. Slow Framework Detection

Symptoms:

  • High latency on first decorator calls
  • Repeated framework detection overhead

Diagnosis:

import time
from rizk.sdk.utils.framework_detection import detect_framework, get_detection_cache_stats

# Measure detection time
start = time.time()
framework = detect_framework()
detection_time = (time.time() - start) * 1000
print(f"Framework detection took: {detection_time:.2f}ms")
print(f"Detected framework: {framework}")

# Check cache performance
cache_stats = get_detection_cache_stats()
print(f"Cache hit rate: {cache_stats['hit_rate']:.1f}%")
print(f"Cache size: {cache_stats['size']}")

Solutions:

import os

# 1. Increase cache size
os.environ["RIZK_FRAMEWORK_CACHE_SIZE"] = "5000"  # Default: 1000

# 2. Enable lazy loading
os.environ["RIZK_LAZY_LOADING"] = "true"

# 3. Explicit framework specification (bypasses detection)
from rizk.sdk.decorators import workflow

@workflow(name="my_workflow", framework="langchain")  # Explicit
def my_function():
    pass

# 4. Pre-warm the cache during initialization
from rizk.sdk import Rizk
from rizk.sdk.utils.framework_detection import detect_framework_cached

Rizk.init(app_name="OptimizedApp")

# Pre-warm detection cache
for _ in range(10):
    detect_framework_cached()

2. Guardrails Performance Issues

Symptoms:

  • High latency on message processing
  • Blocking during policy evaluation
  • Memory growth with large message volumes

Diagnosis:

from rizk.sdk.guardrails.engine import GuardrailsEngine
import time

# Test guardrails performance
engine = GuardrailsEngine.get_instance()
test_messages = [
    "Short message",
    "A much longer message with more content to evaluate for policy violations",
    "SSN: 123-45-6789",  # Should trigger policies
    "Multiple sensitive items: SSN 123-45-6789, Credit Card 1234-5678-9012-3456",
]

for message in test_messages:
    start = time.time()
    result = engine.evaluate(message)
    latency = (time.time() - start) * 1000
    print(f"Message length: {len(message):3d} | "
          f"Latency: {latency:6.2f}ms | "
          f"Decision: {'BLOCKED' if not result.allowed else 'ALLOWED'}")

Solutions:

# 1. Optimize guardrails configuration
from rizk.sdk import Rizk
from rizk.sdk.guardrails.engine import GuardrailsEngine

Rizk.init(
    app_name="OptimizedApp",
    # Disable features you don't need
    telemetry_enabled=False,
    disable_batch=True,      # Reduce latency for individual calls
    # Optimize guardrails settings
    llm_cache_size=10000,    # Larger LLM cache
    state_ttl_seconds=300,   # Faster state cleanup
)

# 2. Use async evaluation for high throughput
import asyncio

async def process_messages_async(messages):
    engine = GuardrailsEngine.get_instance()
    tasks = [engine.process_message(msg) for msg in messages]
    results = await asyncio.gather(*tasks)
    return results

# 3. Batch processing for multiple messages
async def batch_process():
    messages = ["msg1", "msg2", "msg3", "msg4", "msg5"]
    results = await process_messages_async(messages)
    return results
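# To execute the async helpers above from synchronous code, one option is:
#   results = asyncio.run(batch_process())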
# 4. Confidence threshold tuning
from rizk.sdk.decorators import guardrails

@guardrails(confidence_threshold=0.8)  # Higher threshold = fewer evaluations
def optimized_function(message):
    return f"Processed: {message}"

3. Memory Performance Issues

Symptoms:

  • Growing memory usage over time
  • Slow performance with large conversation histories
  • Cache bloat

Diagnosis:

import psutil
import gc
from rizk.sdk.utils.framework_detection import clear_detection_cache

def memory_usage_mb():
    """Get current memory usage in MB."""
    return psutil.Process().memory_info().rss / 1024 / 1024

print(f"Initial memory: {memory_usage_mb():.1f} MB")

# Run your operations
for i in range(1000):
    # Your Rizk operations here
    pass

print(f"After operations: {memory_usage_mb():.1f} MB")

# Check cache sizes
try:
    from rizk.sdk.utils.framework_detection import get_detection_cache_stats
    cache_stats = get_detection_cache_stats()
    print(f"Detection cache size: {cache_stats['size']}")
except Exception:
    pass

# Manual cleanup
gc.collect()
clear_detection_cache()
print(f"After cleanup: {memory_usage_mb():.1f} MB")

Solutions:

import os
import gc

# 1. Configure cache limits
os.environ["RIZK_FRAMEWORK_CACHE_SIZE"] = "2000"  # Limit cache size
os.environ["RIZK_LLM_CACHE_SIZE"] = "5000"

# 2. Enable automatic cleanup
os.environ["RIZK_STATE_TTL_SECONDS"] = "300"  # 5 minutes

# 3. Regular cache maintenance
import schedule
import time

def cleanup_caches():
    """Periodic cache cleanup."""
    from rizk.sdk.utils.framework_detection import clear_detection_cache
    clear_detection_cache()
    gc.collect()
    # memory_usage_mb() is defined in the diagnosis snippet above
    print(f"Cache cleanup completed. Memory: {memory_usage_mb():.1f} MB")

# Schedule cleanup every hour
schedule.every().hour.do(cleanup_caches)
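# Note: the schedule library only executes jobs when schedule.run_pending()
# is called, so a long-running application also needs a runner loop, e.g.:
#   while True:
#       schedule.run_pending()
#       time.sleep(60)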
# 4. Use context managers for temporary operations
from contextlib import contextmanager

@contextmanager
def temporary_rizk_operation():
    """Context manager that cleans up after operations."""
    try:
        yield
    finally:
        gc.collect()

with temporary_rizk_operation():
    # Your operations here
    pass

4. Adapter Registration Overhead

Symptoms:

  • Slow SDK initialization
  • Import time delays
  • Multiple adapter registrations

Diagnosis:

import time
from rizk.sdk import Rizk
# Measure initialization time
start = time.time()
Rizk.init(app_name="PerfTest")
init_time = (time.time() - start) * 1000
print(f"SDK initialization took: {init_time:.2f}ms")
# Check registered adapters
from rizk.sdk.utils.framework_registry import FrameworkRegistry, LLMClientRegistry
framework_adapters = FrameworkRegistry.get_all_framework_names()
llm_adapters = LLMClientRegistry.get_all_adapter_instances()
print(f"Framework adapters: {len(framework_adapters)}")
print(f"LLM adapters: {len(llm_adapters)}")

Solutions:

# 1. Lazy initialization
import os
os.environ["RIZK_LAZY_LOADING"] = "true"

# 2. Initialize only once per application
# Store the Rizk instance and reuse it
from rizk.sdk import Rizk

_rizk_instance = None

def get_rizk_instance():
    global _rizk_instance
    if _rizk_instance is None:
        _rizk_instance = Rizk.init(app_name="MyApp")
    return _rizk_instance

# 3. Disable unused features during initialization
Rizk.init(
    app_name="FastApp",
    telemetry_enabled=False,  # Skip telemetry setup
    disable_batch=True,       # Skip batch processing setup
)

# 4. Profile initialization components
import cProfile
import pstats

def profile_initialization():
    profiler = cProfile.Profile()
    profiler.enable()
    Rizk.init(app_name="ProfiledApp")
    profiler.disable()
    # Print top time consumers
    stats = pstats.Stats(profiler)
    stats.sort_stats("cumulative")
    stats.print_stats(10)

profile_initialization()

⚡ Performance Optimization Techniques

1. Caching Strategies

# Framework detection caching
from rizk.sdk.utils.framework_detection import detect_framework_cached

# Use cached version for repeated calls
framework = detect_framework_cached()  # Much faster after first call

# Custom LRU cache for your functions
from functools import lru_cache

@lru_cache(maxsize=1000)
def expensive_operation(input_data):
    # Your expensive computation (placeholder)
    result = len(str(input_data)) * 1000
    return result

# Redis caching for distributed systems
from rizk.sdk.cache.redis import RedisCacheAdapter

cache = RedisCacheAdapter()
cache.set("key", "value", ttl=3600)  # 1 hour TTL

2. Async Processing

import asyncio
from rizk.sdk.decorators import workflow

@workflow(name="async_workflow")
async def async_function(data):
    # Async processing
    await asyncio.sleep(0.1)  # Simulate async work
    return f"Processed: {data}"

# Batch async processing
async def process_batch(items):
    tasks = [async_function(item) for item in items]
    results = await asyncio.gather(*tasks)
    return results

# Run async batch
items = ["item1", "item2", "item3"]
results = asyncio.run(process_batch(items))

3. Configuration Tuning

import os

# Performance-optimized configuration
performance_config = {
    # Framework detection
    "RIZK_FRAMEWORK_CACHE_SIZE": "5000",
    "RIZK_LAZY_LOADING": "true",
    # Guardrails
    "RIZK_LLM_CACHE_SIZE": "10000",
    "RIZK_STATE_TTL_SECONDS": "300",
    # Telemetry (disable for max performance)
    "RIZK_TRACING_ENABLED": "false",
    "RIZK_METRICS_ENABLED": "false",
    "RIZK_TELEMETRY": "false",
    # Memory management
    "RIZK_DEBUG": "false",    # Disable in production
    "RIZK_VERBOSE": "false",
}

# Apply configuration
for key, value in performance_config.items():
    os.environ[key] = value

from rizk.sdk import Rizk

Rizk.init(
    app_name="HighPerformanceApp",
    disable_batch=True,  # Reduce latency
)

📈 Performance Benchmarking

Benchmark Suite

import time
import statistics
from typing import List, Callable


class RizkBenchmark:
    """Benchmark suite for Rizk SDK operations."""

    def __init__(self, warmup_runs: int = 10, benchmark_runs: int = 100):
        self.warmup_runs = warmup_runs
        self.benchmark_runs = benchmark_runs

    def benchmark_function(self, func: Callable, *args, **kwargs) -> dict:
        """Benchmark a function and return statistics."""
        # Warmup runs
        for _ in range(self.warmup_runs):
            func(*args, **kwargs)
        # Benchmark runs
        times = []
        for _ in range(self.benchmark_runs):
            start = time.time()
            func(*args, **kwargs)
            times.append((time.time() - start) * 1000)  # Convert to ms
        return {
            "mean_ms": statistics.mean(times),
            "median_ms": statistics.median(times),
            "std_dev_ms": statistics.stdev(times) if len(times) > 1 else 0,
            "min_ms": min(times),
            "max_ms": max(times),
            "p95_ms": sorted(times)[int(len(times) * 0.95)],
            "p99_ms": sorted(times)[int(len(times) * 0.99)],
            "total_runs": len(times),
        }

    def run_comprehensive_benchmark(self):
        """Run comprehensive Rizk SDK benchmark."""
        print("🚀 Rizk SDK Performance Benchmark")
        print("=" * 50)

        from rizk.sdk.utils.framework_detection import detect_framework, detect_framework_cached
        from rizk.sdk.decorators import workflow
        from rizk.sdk.guardrails.engine import GuardrailsEngine

        # 1. Framework Detection Benchmark
        print("\n1. Framework Detection:")
        # Uncached detection
        uncached_stats = self.benchmark_function(detect_framework)
        print(f"   Uncached: {uncached_stats['mean_ms']:.2f}ms ± {uncached_stats['std_dev_ms']:.2f}ms")
        # Cached detection
        cached_stats = self.benchmark_function(detect_framework_cached)
        print(f"   Cached: {cached_stats['mean_ms']:.2f}ms ± {cached_stats['std_dev_ms']:.2f}ms")
        print(f"   Speedup: {uncached_stats['mean_ms'] / cached_stats['mean_ms']:.1f}x")

        # 2. Decorator Application Benchmark
        print("\n2. Decorator Application:")

        @workflow(name="benchmark_workflow")
        def simple_function():
            return "result"

        decorator_stats = self.benchmark_function(simple_function)
        print(f"   Decorated function: {decorator_stats['mean_ms']:.2f}ms ± {decorator_stats['std_dev_ms']:.2f}ms")

        # 3. Guardrails Benchmark
        print("\n3. Guardrails Evaluation:")
        try:
            engine = GuardrailsEngine.get_instance()
            # Simple message
            simple_stats = self.benchmark_function(engine.evaluate, "Hello world")
            print(f"   Simple message: {simple_stats['mean_ms']:.2f}ms ± {simple_stats['std_dev_ms']:.2f}ms")
            # Complex message
            complex_msg = "This is a longer message with more content to evaluate"
            complex_stats = self.benchmark_function(engine.evaluate, complex_msg)
            print(f"   Complex message: {complex_stats['mean_ms']:.2f}ms ± {complex_stats['std_dev_ms']:.2f}ms")
        except Exception as e:
            print(f"   Guardrails benchmark failed: {e}")

        print("\n📊 Performance Summary:")
        print(f"   Framework detection is {uncached_stats['mean_ms'] / cached_stats['mean_ms']:.1f}x faster when cached")
        print(f"   Decorator overhead: ~{decorator_stats['mean_ms']:.2f}ms per call")


# Run benchmark
benchmark = RizkBenchmark()
benchmark.run_comprehensive_benchmark()

🎯 Performance Best Practices

1. Application Startup

  • Initialize Rizk SDK once during application startup
  • Use lazy loading for better startup times
  • Pre-warm caches during initialization

2. Framework Detection

  • Use explicit framework specification when possible
  • Enable caching for repeated detection calls
  • Clear detection cache periodically in long-running applications

3. Guardrails Optimization

  • Tune confidence thresholds to reduce unnecessary evaluations
  • Use async processing for high-throughput scenarios
  • Implement proper caching for repeated policy evaluations

4. Memory Management

  • Monitor memory usage in production
  • Implement periodic cache cleanup
  • Use appropriate TTL values for cached data

5. Configuration

  • Disable unused features in production
  • Use environment variables for performance tuning
  • Profile your specific use case to identify bottlenecks

By following these guidelines and using the provided tools, you can optimize Rizk SDK performance for your specific use case and requirements.