ADR-008: Pydantic BaseSettings for configuration
Status
Accepted
Context
WikiMind must manage several kinds of configuration: server host/port, data directory paths, LLM provider selection and model names, API keys for multiple providers, cloud sync settings, and database options. Configuration must support environment variables (for CI/CD and containers), .env files (for local development), and secure handling of API keys, which must never appear in logs.
The configuration layer was recently refactored to consolidate scattered settings into a single, validated, type-safe configuration object.
Decision
We use Pydantic BaseSettings (pydantic-settings) as the configuration foundation, with the following design:
- Environment-first: All settings are loaded from environment variables with the WIKIMIND_ prefix via SettingsConfigDict(env_prefix="WIKIMIND_"). A .env file is loaded automatically when present.
- SecretStr for API keys: All API key fields (anthropic_api_key, openai_api_key, google_api_key, AWS credentials) use SecretStr, which prevents accidental exposure in logs, repr output, or serialization.
- Keyring fallback: The get_api_key() function checks settings (env vars) first, then falls back to the OS keychain via the keyring library. This lets users store keys securely without .env files on their personal machine while still supporting env vars in CI/CD.
- Nested configuration: Related settings are grouped into BaseModel subclasses (LLMConfig, SyncConfig, DatabaseConfig, ServerConfig) for clarity and validation.
- Cached singleton: get_settings() uses @lru_cache(maxsize=1) to ensure settings are loaded once and reused across the application.
Alternatives Considered
Plain environment variables (os.environ) -- No validation, no type safety, no default values, no nesting, no protection against logging secrets. Requires manual parsing everywhere.
TOML/YAML config file only -- Works for static configuration but does not support environment variable overrides for deployment. Would require a separate mechanism for secrets management. We do plan to support a settings.toml for non-sensitive settings in the future, layered under env vars.
dynaconf -- Feature-rich configuration library with multi-source support. However, it adds a significant dependency, has its own learning curve, and duplicates functionality that Pydantic BaseSettings provides natively. Since we already depend on Pydantic for models, BaseSettings is the natural choice.
python-decouple -- Lightweight env/ini reader but lacks type validation, nested config support, and secret protection. Insufficient for our needs.
Vault/AWS Secrets Manager -- Enterprise secrets management. Overkill for a local-first application. The keyring fallback provides similar security for individual users without requiring cloud infrastructure.
Consequences
Enables:
- Type-safe configuration, with validation errors raised at startup rather than later, when a setting is first used
- API keys are masked in logs and error messages by SecretStr unless get_secret_value() is called explicitly
- Flexible key storage: env vars for CI/CD, keyring for personal machines
- Derived paths (wiki_dir, raw_dir, db_dir) computed from data_dir
- ensure_dirs(), called at startup, guarantees the directory structure exists
- get_security_status() provides a production readiness check
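The derived-path behaviour can be pictured with a small standard-library sketch. The DataPaths class is a hypothetical stand-in for the relevant part of the Settings class, and the subdirectory names ("wiki", "raw", "db") are assumptions; only the attribute names and ensure_dirs() come from this ADR.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class DataPaths:
    """Hypothetical stand-in for the path-related part of Settings."""

    data_dir: Path

    # Each derived path is computed from data_dir, so changing data_dir
    # moves the whole tree. Subdirectory names are assumptions.
    @property
    def wiki_dir(self) -> Path:
        return self.data_dir / "wiki"

    @property
    def raw_dir(self) -> Path:
        return self.data_dir / "raw"

    @property
    def db_dir(self) -> Path:
        return self.data_dir / "db"

    def ensure_dirs(self) -> None:
        # Idempotent: safe to call on every startup.
        for directory in (self.wiki_dir, self.raw_dir, self.db_dir):
            directory.mkdir(parents=True, exist_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        paths = DataPaths(data_dir=Path(tmp) / "wikimind")
        paths.ensure_dirs()
        print(sorted(p.name for p in paths.data_dir.iterdir()))
```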
Constrains:
- Adding a new setting requires updating the Settings class and documenting the corresponding WIKIMIND_* environment variable
- The lru_cache means settings cannot be changed at runtime without clearing the cache (acceptable for a daemon that restarts on config changes)
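The caching constraint is easiest to see in a standard-library sketch. The dataclass below stands in for the real Pydantic Settings class, and both the WIKIMIND_PORT variable and the reload_settings() helper are hypothetical names used only for illustration.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Settings:
    # Stand-in for the real Settings class; one field is enough here.
    port: int


@lru_cache(maxsize=1)
def get_settings() -> Settings:
    # WIKIMIND_PORT is a hypothetical variable name.
    return Settings(port=int(os.environ.get("WIKIMIND_PORT", "8000")))


def reload_settings() -> Settings:
    """Hypothetical helper: drop the cached instance and load again."""
    get_settings.cache_clear()
    return get_settings()


if __name__ == "__main__":
    os.environ["WIKIMIND_PORT"] = "8000"
    first = get_settings()

    os.environ["WIKIMIND_PORT"] = "9000"
    print(get_settings().port)     # still the cached value
    print(reload_settings().port)  # picks up the new environment
```

Tests that patch environment variables hit the same constraint and typically need to call get_settings.cache_clear() between cases.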
Risks:
- The keyring library behavior varies across operating systems; some Linux environments may not have a keyring backend configured. Mitigated by always checking env vars first and only falling back to keyring.
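A rough sketch of that lookup order, including the case where no keyring backend is available, might look like this. It is simplified: it reads os.environ directly instead of going through the Settings object, and the service name and keyring entry names are assumptions rather than the values WikiMind uses.

```python
import os
from typing import Optional

try:
    import keyring
    from keyring.errors import KeyringError
except ImportError:  # keyring is not installed at all
    keyring = None
    KeyringError = Exception

SERVICE_NAME = "wikimind"  # hypothetical keyring service name


def get_api_key(provider: str) -> Optional[str]:
    """Return the key for a provider: env var first, OS keychain second."""
    # 1. Environment variable always wins (CI/CD, containers).
    env_value = os.environ.get(f"WIKIMIND_{provider.upper()}_API_KEY")
    if env_value:
        return env_value

    # 2. Fall back to the OS keychain, if the library is installed.
    if keyring is None:
        return None
    try:
        # Entry name is an assumption made for this sketch.
        return keyring.get_password(SERVICE_NAME, f"{provider}_api_key")
    except KeyringError:
        # Covers hosts with no usable backend, such as headless Linux.
        return None


if __name__ == "__main__":
    os.environ["WIKIMIND_OPENAI_API_KEY"] = "sk-example"
    print(get_api_key("openai") is not None)
```

Because the environment check comes first, a missing or misconfigured keyring backend only matters on machines that rely on the keychain, and there it results in a missing key rather than a crash.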
Subsequent decisions
- ADR-010 — Auto-enable LLM providers when their API key is detected (extends the validator pattern introduced here)