Configuration Management

Service Layer Architecture

The configuration system is built on a service-based architecture with multiple specialized services:

from rag_solution.services.llm_parameters_service import LLMParametersService
from rag_solution.services.llm_provider_service import LLMProviderService
from rag_solution.services.provider_config_service import ProviderConfigService
from rag_solution.services.prompt_template_service import PromptTemplateService
from rag_solution.services.pipeline_service import PipelineService

Configuration Types

1. Environment Configuration

Essential settings loaded from environment variables:

# Authentication
JWT_SECRET_KEY=your-secure-jwt-secret-key

# WatsonX.ai Credentials
WATSONX_INSTANCE_ID=your-watsonx-instance-id
WATSONX_APIKEY=your-watsonx-key
WATSONX_URL=https://bam-api.res.ibm.com

# Infrastructure
VECTOR_DB=milvus
MILVUS_HOST=localhost
MILVUS_PORT=19530

# Database
COLLECTIONDB_USER=rag_modulo_user
COLLECTIONDB_PASS=rag_modulo_password
COLLECTIONDB_HOST=localhost
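
The variables above could be read and checked once at startup. The following is a minimal sketch, not the project's actual settings loader: `EnvSettings` and `load_env_settings` are hypothetical names, the choice of which variables are required is an assumption, and the fallback values simply reuse the sample values listed above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvSettings:
    """Hypothetical container for a subset of the variables listed above."""
    jwt_secret_key: str
    watsonx_apikey: str
    watsonx_url: str
    vector_db: str
    milvus_host: str
    milvus_port: int


def load_env_settings(env=None) -> EnvSettings:
    """Read settings from a mapping (os.environ by default) and fail fast on missing secrets."""
    env = os.environ if env is None else env
    # Assumption: the secrets have no safe default, so they are treated as required.
    missing = [key for key in ("JWT_SECRET_KEY", "WATSONX_APIKEY") if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return EnvSettings(
        jwt_secret_key=env["JWT_SECRET_KEY"],
        watsonx_apikey=env["WATSONX_APIKEY"],
        # Assumption: fall back to the sample values shown in this document.
        watsonx_url=env.get("WATSONX_URL", "https://bam-api.res.ibm.com"),
        vector_db=env.get("VECTOR_DB", "milvus"),
        milvus_host=env.get("MILVUS_HOST", "localhost"),
        milvus_port=int(env.get("MILVUS_PORT", "19530")),
    )
```

Passing the mapping explicitly, rather than always reading `os.environ`, keeps the loader easy to exercise in tests without touching the real environment.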

2. Runtime Configuration

Runtime configuration is created and managed through the service layer:

# LLM Parameters Configuration
parameters = LLMParametersInput(
    name="watsonx-default",
    provider="watsonx",
    model_id="granite-13b",
    temperature=0.7,
    max_new_tokens=1000
)
parameters_service.create_parameters(parameters)

# Provider Configuration
provider_config = ProviderConfigInput(
    provider="watsonx",
    api_key="${WATSONX_APIKEY}",
    project_id="${WATSONX_INSTANCE_ID}",
    active=True
)
config_service.create_provider_config(provider_config)

# Template Configuration
template = PromptTemplateInput(
    name="rag-query",
    provider="watsonx",
    template_type=PromptTemplateType.RAG_QUERY,
    template_format="Context:\n{context}\nQuestion:{question}"
)
template_service.create_template(template)

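# Pipeline Configuration
# Note: provider.id and parameters.id must come from stored records
# (for example, those fetched under "Provider Service Usage"), not from the input objects above.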
pipeline_config = PipelineConfigInput(
    name="default-pipeline",
    provider_id=provider.id,
    llm_parameters_id=parameters.id
)
pipeline_service.create_pipeline_config(pipeline_config)
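
The provider configuration above passes `"${WATSONX_APIKEY}"` and `"${WATSONX_INSTANCE_ID}"` as placeholders. How the project expands them is not shown here; one plausible approach, sketched below with the standard library's `string.Template`, substitutes them from the environment and fails loudly when a variable is missing. `resolve_placeholders` is a hypothetical helper, not part of the documented services.

```python
import os
from string import Template


def resolve_placeholders(value: str, env=None) -> str:
    """Expand ${NAME} placeholders in value from env (os.environ by default)."""
    env = os.environ if env is None else env
    try:
        # substitute() raises KeyError for an unset variable instead of leaving it as-is.
        return Template(value).substitute(env)
    except KeyError as exc:
        raise ValueError(f"Environment variable {exc.args[0]} is not set") from exc


if __name__ == "__main__":
    sample_env = {"WATSONX_APIKEY": "example-key"}
    print(resolve_placeholders("${WATSONX_APIKEY}", sample_env))
```

Failing on a missing variable is a deliberate choice in this sketch: an unexpanded placeholder sent to a provider as an API key is harder to diagnose than an error at configuration time.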

Service Integration

Provider Service Usage

# Initialize services
db = SessionLocal()
provider_service = LLMProviderService(db)
parameters_service = LLMParametersService(db)

# Get provider and parameters
provider = provider_service.get_provider_by_name("watsonx")
parameters = parameters_service.get_parameters(parameters_id)

# Generate text
response = provider.generate_text(
    prompt="Your prompt",
    model_parameters=parameters
)

Pipeline Service Configuration

# Pipeline Service Configuration
pipeline_config = PipelineConfigInput(
    name="default-pipeline",
    provider_id=provider.id,
    llm_parameters_id=parameters.id,
    evaluator_config={
        "enabled": True,
        "metrics": ["relevance", "coherence", "factuality"],
        "threshold": 0.8
    },
    performance_config={
        "max_concurrent_requests": 10,
        "timeout_seconds": 30,
        "batch_size": 5
    },
    retrieval_config={
        "top_k": 3,
        "similarity_threshold": 0.7,
        "reranking_enabled": True
    }
)
# Pipeline Service Usage
# Create the service first, then register the configuration
pipeline_service = PipelineService(db)
pipeline_service.create_pipeline_config(pipeline_config)

# Initialize with configuration
await pipeline_service.initialize(
    collection_id=collection_id,
    config_overrides={
        "max_concurrent_requests": 20,
        "timeout_seconds": 45
    }
)

# Execute pipeline with runtime options
result = await pipeline_service.execute_pipeline(
    search_input=SearchInput(
        question="What is RAG?",
        collection_id=collection_id,
        metadata={
            "max_length": 100,
            "temperature": 0.7
        }
    ),
    user_id=user_id,
    evaluation_enabled=True
)

# Access pipeline metrics
metrics = pipeline_service.get_performance_metrics()
print(f"Average latency: {metrics.avg_latency_ms}ms")
print(f"Success rate: {metrics.success_rate}%")
print(f"Throughput: {metrics.requests_per_second} req/s")
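
The `config_overrides` passed to `initialize` use the same keys as `performance_config`, which suggests that overrides replace individual stored settings. The merge semantics are not documented here, so the sketch below is only one reasonable interpretation: a shallow merge that rejects unknown keys. `apply_overrides` is a hypothetical helper.

```python
from typing import Any, Dict, Optional


def apply_overrides(base: Dict[str, Any], overrides: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of base with overrides applied; base itself is not modified."""
    overrides = overrides or {}
    # Assumption: a key that is absent from the stored config is a typo, not a new setting.
    unknown = set(overrides) - set(base)
    if unknown:
        raise KeyError(f"Unknown configuration keys: {sorted(unknown)}")
    return {**base, **overrides}


if __name__ == "__main__":
    stored = {"max_concurrent_requests": 10, "timeout_seconds": 30, "batch_size": 5}
    effective = apply_overrides(stored, {"max_concurrent_requests": 20, "timeout_seconds": 45})
    print(effective)
```

With the values used in this section, the effective configuration keeps `batch_size` at 5 while raising the concurrency limit to 20 and the timeout to 45 seconds.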

Error Handling

import logging

from core.custom_exceptions import (
    ConfigurationError,
    ValidationError,
    NotFoundError
)

logger = logging.getLogger(__name__)

try:
    # Get configuration
    provider = provider_service.get_provider_by_name("watsonx")
    parameters = parameters_service.get_parameters(parameters_id)
    template = template_service.get_by_type(PromptTemplateType.RAG_QUERY)

except ConfigurationError as e:
    logger.error(f"Configuration error: {e}")
except ValidationError as e:
    logger.error(f"Validation error: {e}")
except NotFoundError as e:
    logger.error(f"Not found error: {e}")
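
When several lookups share the same handling, the try/except pattern above can be factored into a small wrapper that logs what was being loaded and then re-raises. This is a sketch under assumptions: the three exception classes are local stand-ins for those in `core.custom_exceptions` (whose actual hierarchy is not shown here), and `fetch_config` is a hypothetical helper.

```python
import logging

logger = logging.getLogger(__name__)


# Stand-ins so the sketch runs on its own; the real classes live in core.custom_exceptions.
class ConfigurationError(Exception):
    pass


class ValidationError(Exception):
    pass


class NotFoundError(Exception):
    pass


def fetch_config(label, lookup, *args):
    """Call lookup(*args); on a known configuration failure, log it with context and re-raise."""
    try:
        return lookup(*args)
    except (ConfigurationError, ValidationError, NotFoundError) as exc:
        # Include what was being loaded and with which arguments, not only the message.
        logger.error("Failed to load %s (args=%r): %s", label, args, exc)
        raise
```

A call such as `fetch_config("LLM parameters", parameters_service.get_parameters, parameters_id)` would then log the failing lookup before the exception reaches the caller, which still decides how to recover.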

Migration Strategy

Phase 1: Service Migration (Current)

  • Move configuration to specialized services
  • Implement repository pattern
  • Add validation schemas
  • Support async operations

Phase 2: Runtime Configuration

  • All configuration in database
  • Service-based management
  • API endpoints for configuration
  • UI for configuration management

Phase 3: Legacy Removal

  • Remove legacy configuration
  • Update all services
  • Clean up old code
  • Update documentation

Best Practices

  1. Service Usage:

       • Use dependency injection
       • Initialize services properly
       • Handle database sessions
       • Clean up resources

  2. Configuration Management:

       • Use service layer for all operations
       • Validate configurations early
       • Handle provider-specific requirements
       • Document configuration changes

  3. Error Handling:

       • Use custom exceptions
       • Log errors with context
       • Provide clear error messages
       • Handle validation errors

  4. Security:

       • Store sensitive data securely
       • Use environment variables
       • Implement access control
       • Audit configuration changes
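
The session-handling and cleanup practices above can be combined in a context manager that always closes the session, with services receiving the session instead of creating it. This is a minimal sketch: `FakeSession` stands in for whatever `SessionLocal()` returns, and `session_scope` and `ExampleService` are hypothetical names.

```python
from contextlib import contextmanager


class FakeSession:
    """Stand-in for a database session; only close() matters for this sketch."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


@contextmanager
def session_scope(session_factory):
    """Yield a new session and close it afterwards, even if the body raises."""
    session = session_factory()
    try:
        yield session
    finally:
        session.close()


class ExampleService:
    """Hypothetical service that receives its session (dependency injection)."""

    def __init__(self, db):
        self.db = db


if __name__ == "__main__":
    with session_scope(FakeSession) as db:
        service = ExampleService(db)
        print("session open:", not service.db.closed)
    print("session closed:", db.closed)
```

In the real application, `session_scope(SessionLocal)` would play the same role, so that `LLMProviderService(db)` and the other services never outlive the session they were given.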

Testing

# Run configuration tests
pytest backend/tests/test_core_config.py
pytest backend/tests/test_provider_config.py
pytest backend/tests/test_llm_parameters.py
pytest backend/tests/test_prompt_template.py

# Run integration tests
pytest backend/tests/integration/test_configuration_flow.py
pytest backend/tests/integration/test_configuration_errors.py

Development Notes

  1. Adding New Configuration:

       • Create appropriate service
       • Implement repository pattern
       • Add validation schemas
       • Update documentation

  2. Using Configuration:

       • Always use service layer
       • Handle all error cases
       • Log configuration access
       • Clean up resources

  3. Security Considerations:

       • Validate all inputs
       • Sanitize outputs
       • Handle sensitive data
       • Implement access control

  4. Performance:

       • Use caching where appropriate
       • Optimize database queries
       • Handle concurrent access
       • Monitor performance
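
As an illustration of validating inputs before they reach a service, the sketch below checks LLM parameter values at construction time. It uses a plain dataclass, not the project's actual schema classes; `LLMParametersSketch` is a hypothetical name, and the accepted temperature range is an assumption that would need to match the provider's real limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMParametersSketch:
    """Hypothetical validated input, loosely modeled on the LLMParametersInput fields above."""
    name: str
    temperature: float
    max_new_tokens: int

    def __post_init__(self):
        # Reject bad values when the object is built, before any service or database call.
        if not self.name.strip():
            raise ValueError("name must not be empty")
        # Assumption: 0.0 to 2.0 is used as an illustrative range only.
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError(f"temperature out of range: {self.temperature}")
        if self.max_new_tokens <= 0:
            raise ValueError(f"max_new_tokens must be positive: {self.max_new_tokens}")


if __name__ == "__main__":
    params = LLMParametersSketch(name="watsonx-default", temperature=0.7, max_new_tokens=1000)
    print(params)
```

Validating in the constructor means an invalid configuration can never exist as an object, so downstream services do not need to repeat the same checks.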

Future Improvements

  1. Configuration Features:

       • Version control for configurations
       • Configuration inheritance and overrides
       • Dynamic validation with schemas
       • Real-time performance metrics
       • A/B testing support
       • Configuration rollback
       • Environment-specific configs

  2. Service Enhancements:

       • Distributed caching layer
       • Bulk operations optimization
       • Migration and backup tools
       • Usage analytics and insights
       • Auto-scaling support
       • Circuit breakers
       • Rate limiting
       • Request queuing

  3. Performance Monitoring:

       • Real-time metrics dashboard
       • Performance alerts
       • Resource utilization tracking
       • Bottleneck detection
       • Query optimization
       • Cache hit ratios
       • Error rate monitoring
       • Latency tracking

  4. Security Enhancements:

       • Enhanced encryption at rest
       • Key rotation automation
       • Fine-grained access policies
       • Comprehensive audit logging
       • Security scanning
       • Compliance reporting
       • Data anonymization
       • Access monitoring

  5. Integration Features:

       • UI configuration management
       • RESTful API endpoints
       • Monitoring dashboards
       • Testing utilities
       • CI/CD integration
       • Deployment automation
       • Health checks
       • Documentation generation

  6. Pipeline Optimizations:

       • Dynamic resource allocation
       • Parallel processing
       • Request batching
       • Response caching
       • Error recovery
       • Load balancing
       • Failover handling
       • Performance tuning