# DBWarden Documentation

> Full documentation for DBWarden, a SQL-first database migration system
> Source: https://dbwarden.emiliano-go.com
> Pages: 85

========================================================================
PAGE: https://dbwarden.emiliano-go.com/advanced/checksum-integrity/
========================================================================

# Checksum Integrity

DBWarden stores a SHA-256 checksum of each migration file at apply time. On subsequent runs, it recalculates the checksum and compares. A mismatch means the file changed after it was applied.

## What checksums protect against

- Accidentally editing an applied migration file
- Copy-paste errors that silently modify historical SQL
- Merge conflicts that land inside already-applied migration files
- Tooling (formatters, editors) modifying migration files in place

## What triggers a mismatch error

```
ChecksumMismatchError: Migration '0004_add_indexes' checksum has changed.
  stored:   a3f8c2e1...
  current:  7b91d40c...
  database: primary

The migration file was modified after it was applied.
Use 'dbwarden history --database primary' to inspect applied migrations.
```

This error blocks `migrate` and `status` from running. DBWarden will not proceed while a checksum is inconsistent.

## Repeatable migrations

Migrations prefixed `RA__` (repeatable, always) or `ROC__` (repeatable on change) behave differently: they are designed to be re-applied. Checksum changes on `ROC__` files trigger re-application on the next `migrate` run. This is expected behavior, not an error.

Checksum mismatch errors only apply to versioned migrations (`V__` prefix).

## Diagnosing a mismatch

```bash
# See which migrations are applied and their checksums
$ dbwarden history --database primary

# Check the current status
$ dbwarden status --database primary
```

Common causes:

1. **Editor auto-format**: your editor reformatted whitespace in the file
2. **Merge conflict**: conflict markers were added/removed inside a migration file
3. **Intentional edit**: someone changed the migration to fix a typo or add a comment

## Resolution: dev environment

In development where the migration has not been applied to shared data:

1. Reset the database state and re-run migrations from scratch, **or**
2. If the change is trivial (whitespace, comment), revert the file to its original state:

```bash
git diff migrations/primary/V__0004_add_indexes.sql
git checkout migrations/primary/V__0004_add_indexes.sql
```

After reverting, the stored checksum and file checksum will match again.

## Resolution: production environment

In production, never modify applied migration files. The resolution is:

1. **Revert the file** to its exact applied state (use git history)
2. Create a **new migration** for any schema changes you need to make

If the file change was accidental and the schema is correct, reverting the file is safe; no data or schema change occurs.

If the file was intentionally changed to fix an error in a migration that was already applied in production, the database schema may already reflect the original (wrong) SQL. Coordinate carefully:

```bash
# 1. Revert the migration file to what was actually applied
git checkout -- migrations/primary/V__0004_add_indexes.sql

# 2. Verify status is clean
$ dbwarden status --database primary

# 3. Create a corrective migration for the actual schema fix
$ dbwarden new "fix index on users" --database primary
```

## When is it safe to ignore?

Never. A checksum mismatch means recorded history diverges from what is on disk. Even if the change appears harmless, proceeding without resolving the mismatch means your migration history is untrustworthy.

## Schema snapshot checksums

DBWarden also writes a **schema snapshot** after each migration: a JSON file at `.dbwarden/schemas/.schema.json`.
Each snapshot contains a `checksum` field computed from the full snapshot content via SHA-256, plus a `previous_checksum` field linking it to the prior snapshot:

```json
{
  "tables": { ... },
  "checksum": "abc123...",
  "previous_checksum": "def456..."
}
```

Snapshots are written atomically (write to temp file, verify, rename) and read with integrity validation; if the file content doesn't match the stored checksum, the snapshot is rejected.

The snapshot checksum chain serves a different purpose from the migration checksum. A migration checksum tells you a specific SQL file hasn't changed. A snapshot checksum tells you the full schema state at a given point is intact. If a snapshot is corrupted or manually edited, `find_latest_snapshot()` falls back to the previous intact snapshot.

## Preventing mismatches

- Treat versioned migration files as immutable once applied to any shared environment
- Configure your editor to exclude `migrations/` from auto-format
- Use a pre-commit hook to detect changes to applied migration files:

```bash
# .git/hooks/pre-commit (example, adapt to your setup)
$ dbwarden check --database primary
```

`dbwarden check` compares models to the live schema. While not a direct checksum check, it surfaces drift that often accompanies unintended migration file edits.

See also: [Migration Locking](migration-locking.md) | [Migration Files](../migration-files.md) | [Cookbook: Schema Inspection](../cookbook/05-schema-inspection.md)

========================================================================
PAGE: https://dbwarden.emiliano-go.com/advanced/ci-cd-patterns/
========================================================================

# CI/CD Patterns

Patterns for running DBWarden migrations in automated pipelines.

## Core principle

Run migrations from exactly one job. Serialize migration and deploy. Never run `migrate` in parallel across multiple agents or containers targeting the same database.
## GitHub Actions

### Minimal migration job

```yaml
name: Deploy

on:
  push:
    branches: [main]

jobs:
  migrate:
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - run: uv add -e ".[migrations]"

      - name: Check migration status
        run: dbwarden status --database primary
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}

      - name: Apply migrations
        run: dbwarden migrate --database primary
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}

      - name: Verify post-migration status
        run: dbwarden status --database primary
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}

  deploy:
    needs: migrate
    runs-on: ubuntu-latest
    steps:
      - name: Deploy application
        run: ...
```

The `needs: migrate` dependency ensures migrations are fully applied before the application starts.

### Preventing concurrent migration runs

```yaml
concurrency:
  group: deploy-${{ github.ref }}
  cancel-in-progress: false
```

`cancel-in-progress: false` queues duplicate runs instead of cancelling mid-flight, which avoids leaving a stale lock on the database.
### Multi-database migration

```yaml
- name: Apply all migrations
  run: dbwarden migrate --all
  env:
    PRIMARY_DATABASE_URL: ${{ secrets.PRIMARY_DATABASE_URL }}
    ANALYTICS_DATABASE_URL: ${{ secrets.ANALYTICS_DATABASE_URL }}
```

Or migrate databases sequentially to control order:

```yaml
- name: Migrate primary
  run: dbwarden migrate --database primary
  env:
    DATABASE_URL: ${{ secrets.DATABASE_URL }}

- name: Migrate analytics
  run: dbwarden migrate --database analytics
  env:
    ANALYTICS_DATABASE_URL: ${{ secrets.ANALYTICS_DATABASE_URL }}
```

### With backup before migration

```yaml
- name: Apply migrations with backup
  run: |
    dbwarden migrate --database primary \
      --with-backup \
      --backup-dir ./migration-backups
  env:
    DATABASE_URL: ${{ secrets.DATABASE_URL }}

- name: Upload backup artifact
  uses: actions/upload-artifact@v4
  with:
    name: migration-backup-${{ github.sha }}
    path: ./migration-backups/
    retention-days: 30
```

## GitLab CI

```yaml
stages:
  - migrate
  - deploy

migrate:
  stage: migrate
  image: python:3.12
  script:
    - uv add -e ".[migrations]"
    - dbwarden status --database primary
    - dbwarden migrate --database primary
    - dbwarden status --database primary
  variables:
    DATABASE_URL: $DATABASE_URL  # set in GitLab CI/CD settings as masked variable
  resource_group: production-database  # prevents concurrent runs

deploy:
  stage: deploy
  needs: [migrate]
  script:
    - ...
```

`resource_group` serializes the migrate job across concurrent pipelines.

## Sandbox testing in PR pipelines

Instead of running against a shared staging database, use `--sandbox` to apply migrations to a temporary in-memory SQLite database or a Docker-backed instance.
This isolates PR checks from each other:

```yaml
sandbox-check:
  runs-on: ubuntu-latest
  if: github.event_name == 'pull_request'
  steps:
    - uses: actions/checkout@v4
    - run: uv add -e ".[migrations,testcontainers]"
    - name: Apply migrations to sandbox
      run: dbwarden migrate --sandbox --database primary
```

The sandbox starts a fresh database, applies all pending migrations, reports results, and tears down. It never touches the real database.

## Dry-run check in PR pipelines

Use `--dry-run` to preview SQL without any database access:

```yaml
migration-check:
  runs-on: ubuntu-latest
  if: github.event_name == 'pull_request'
  steps:
    - uses: actions/checkout@v4
    - run: uv add -e ".[migrations]"
    - name: Check for pending migrations
      run: dbwarden status --database primary
      env:
        DATABASE_URL: ${{ secrets.STAGING_DATABASE_URL }}
```

This surfaces "pending migrations exist" warnings in PR checks without modifying the database.

For a deeper check that validates the SQL actually runs, chain `--dry-run` before `--sandbox`:

```yaml
- name: Preview SQL
  run: dbwarden migrate --dry-run --database primary

- name: Validate in sandbox
  run: dbwarden migrate --sandbox --database primary
```

## Plan output in deploy pipelines

The `make-migrations --plan` flag prints the generated migration plan as JSON without writing files. Use it in deploy pipelines to capture what would be generated as a deploy artifact:

```yaml
- name: Generate migration plan
  run: dbwarden make-migrations --database primary --plan > plan.json

- name: Upload plan artifact
  uses: actions/upload-artifact@v4
  with:
    name: migration-plan-${{ github.sha }}
    path: plan.json
```

The plan JSON includes detected changes, operation counts, and auto-generated migration names.

## Exit codes

DBWarden exits non-zero on:

- Migration failure
- Checksum mismatch
- Lock acquisition failure
- Configuration error

CI pipelines treat non-zero as job failure by default. No extra configuration needed.
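Outside a CI system that fails the job automatically (for example, a hand-written deploy script), the same exit-code contract has to be honoured explicitly. The following is a minimal sketch, assuming only that each step signals failure through a non-zero exit code as listed above. The helper names `run_step` and `run_pipeline` and the constant `DEPLOY_STEPS` are hypothetical, not part of DBWarden.

```python
import subprocess
import sys
from typing import Sequence


def run_step(cmd: Sequence[str]) -> int:
    """Run one pipeline step and return its exit code (hypothetical helper)."""
    completed = subprocess.run(list(cmd), check=False)
    return completed.returncode


def run_pipeline(steps: Sequence[Sequence[str]]) -> int:
    """Run steps in order and stop at the first non-zero exit code."""
    for cmd in steps:
        code = run_step(cmd)
        if code != 0:
            # Later steps (for example the deploy) must not run after a failure.
            print(f"step failed with exit code {code}: {' '.join(cmd)}", file=sys.stderr)
            return code
    return 0


# The status -> migrate -> status sequence used on this page.
DEPLOY_STEPS = [
    ["dbwarden", "status", "--database", "primary"],
    ["dbwarden", "migrate", "--database", "primary"],
    ["dbwarden", "status", "--database", "primary"],
]
```

A deploy script could end with `sys.exit(run_pipeline(DEPLOY_STEPS))` so that the calling shell or scheduler sees the same non-zero code that DBWarden returned.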
## Recommendations

- Store `DATABASE_URL` as an encrypted secret, not a plain environment variable
- Archive migration output logs as artifacts for audit trails
- Use `dbwarden history` output as a post-migration artifact
- Run `dbwarden status` before and after `migrate`; before confirms what will run, after confirms nothing is pending

See also: [Safe Deployment](safe-deployment.md) | [Credentials and Secrets](../configuration/credentials.md) | [Migration Locking](migration-locking.md) | [Cookbook: Offline & CI](../cookbook/04-offline-ci.md)

========================================================================
PAGE: https://dbwarden.emiliano-go.com/advanced/migration-locking/
========================================================================

# Migration Locking

DBWarden uses a database-level lock to prevent concurrent schema mutation. This page explains how it works, what happens when it fails, and how to recover from a stuck lock.

## How locking works

When `dbwarden migrate` runs, it:

1. Acquires a lock row in the `dbwarden_lock` table (created on first use; the table name is `dbwarden_lock`, not `_dbwarden_lock`)
2. Executes all pending migrations within that lock
3. Releases the lock on success or failure

The lock is stored in the target database itself: no external service (Redis, filesystem) is required.

If a second `migrate` invocation starts while the first holds the lock, it fails immediately with:

```
LockError: Migration lock is already held.
Another migration process may be running.
Use 'dbwarden unlock' to release the lock if necessary.
```

DBWarden does not retry on lock failure. The calling process (CI job, deploy script) must decide whether to retry or abort.

## Inspecting lock state

```bash
$ dbwarden lock-status --database primary
```

Output when unlocked:

```
Migration lock: INACTIVE
```

Output when locked:

```
Migration lock: ACTIVE
Another migration process may be running.
```

Use the `locked_at` timestamp in the lock table to determine whether the lock is held by a live process or is stale.

## When a migration fails mid-run

If a migration raises an error after partial execution:

1. DBWarden rolls back the in-flight transaction if the database supports transactional DDL (PostgreSQL does; MySQL does not)
2. The lock is released
3. The CLI exits non-zero

For PostgreSQL, partial application within a migration file is rolled back atomically. The migration remains in "pending" state.

For MySQL and databases without transactional DDL, partial application is possible. Inspect the database state manually before retrying.

## Stuck lock recovery

A lock becomes stale when:

- The migration process was killed (SIGKILL, OOM, machine restart)
- A CI job was cancelled mid-run
- A deploy container was stopped before migrate completed

**Before unlocking, confirm no migration is running:**

```bash
# Check if the PID from lock-status is still alive
ps aux | grep <PID>

# Or check your deployment logs / CI job status
```

If the process is genuinely dead:

```bash
# 1. Confirm lock state
$ dbwarden lock-status --database primary

# 2. Inspect migration history to see what ran last
$ dbwarden history --database primary

# 3. Check pending migrations
$ dbwarden status --database primary

# 4. Release the stale lock
$ dbwarden unlock --database primary

# 5. Retry migration
$ dbwarden migrate --database primary
```

## When NOT to use `unlock`

Do not run `unlock` if:

- You are unsure whether a migration process is still running
- The `locked_at` timestamp is recent (within seconds or minutes); the process may still be alive
- Multiple processes share a database and you cannot confirm all are idle

Releasing a lock held by a live migration process will allow a second migration to start concurrently, which can corrupt schema state.
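The `locked_at` recency rule above can be turned into a small guard in an operator script. This is a sketch under stated assumptions: how the timestamp is read from the lock table is left out, the function name `lock_may_be_live` is hypothetical, and the 30-minute default is an arbitrary illustration rather than a DBWarden setting.

```python
from datetime import datetime, timedelta, timezone


def lock_may_be_live(
    locked_at: datetime,
    now: datetime,
    min_age: timedelta = timedelta(minutes=30),  # assumed threshold, tune per project
) -> bool:
    """Return True when the lock is too recent to be treated as stale."""
    if locked_at.tzinfo is None or now.tzinfo is None:
        # Mixing naive and aware datetimes would raise or give a wrong age.
        raise ValueError("use timezone-aware datetimes")
    return (now - locked_at) < min_age
```

A `False` result only means the lock is old. It does not prove the owning process is dead, so the process and CI checks above still apply before running `unlock`.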
## Preventing concurrent migration in CI

In CI/CD, run migrations from a single job with no parallelism:

```yaml
# GitHub Actions: serialize via job dependency
jobs:
  migrate:
    runs-on: ubuntu-latest
    steps:
      - run: dbwarden migrate --database primary

  deploy:
    needs: migrate
    ...
```

If your pipeline can trigger multiple concurrent deploys, add a concurrency group:

```yaml
concurrency:
  group: migrate-${{ github.ref }}
  cancel-in-progress: false
```

`cancel-in-progress: false` queues the second run instead of cancelling it, which avoids orphaned locks from killed jobs.

### 6. Confirm status

Run `dbwarden status` to verify no pending migrations remain:

```bash
$ dbwarden status --database primary
```

## Distributed locking with Redis

For multi-instance deployments where multiple application replicas could trigger migrations concurrently, DBWarden provides a Redis-backed distributed lock through `dbwarden.fastapi.lock`:

```python
from dbwarden.fastapi import migration_lock

# Within a FastAPI route or lifespan:
async with migration_lock() as locked:
    if locked:
        await run_migration()
```

The Redis lock uses `SETNX` + `EXPIRE` with a default TTL of 60 seconds. If the application crashes while holding the lock, Redis releases it automatically after the TTL expires. Long-running migrations should specify a custom TTL or implement lock extension.

The lock is also used internally by the `POST /migrate` FastAPI endpoint to serialize migration requests across application instances.
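To see why a migration that outlives the TTL needs a longer TTL or lock extension, it can help to model the `SETNX` + `EXPIRE` semantics without a Redis server. The sketch below is an in-memory stand-in with an injectable clock. The class `TtlLockStore` and its methods are hypothetical illustrations and are not DBWarden's or Redis's API.

```python
import time
from typing import Callable, Dict, Tuple


class TtlLockStore:
    """In-memory model of a key-value lock with set-if-absent plus expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (owner token, expiry time on the injected clock)
        self._keys: Dict[str, Tuple[str, float]] = {}

    def acquire(self, key: str, owner: str, ttl_seconds: float) -> bool:
        """Take the lock if it is free or its TTL has elapsed."""
        now = self._clock()
        held = self._keys.get(key)
        if held is not None and held[1] > now:
            return False  # another owner holds a live lock
        self._keys[key] = (owner, now + ttl_seconds)
        return True

    def release(self, key: str, owner: str) -> bool:
        """Release only a live lock that this owner still holds."""
        held = self._keys.get(key)
        if held is not None and held[0] == owner and held[1] > self._clock():
            del self._keys[key]
            return True
        return False
```

With a 60-second TTL, a second caller is refused at second 30 but succeeds at second 61 even if the first migration is still running, which is the case the custom-TTL advice above guards against.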
### Database-level vs Redis lock

| Aspect | Database lock | Redis lock |
|--------|---------------|------------|
| Scope | CLI commands (`migrate`, `seed`) | FastAPI `POST /migrate` endpoint |
| Storage | `dbwarden_lock` table in the target database | Redis key |
| TTL | No TTL: manual `unlock` required after crash | 60-second default TTL |
| Failure mode | Blocks other CLI commands until released | Auto-released after TTL |
| External dependency | None (uses the database itself) | Redis required |

Both locks can be used independently or together; they guard different entry points. The database lock protects the CLI; the Redis lock protects the FastAPI endpoint.

## Lifespan integration

The `dbwarden_lifespan` context manager wraps migration logic and engine disposal into a single FastAPI-compatible lifespan. When using the Redis lock in a lifespan, acquire the lock before entering the migration context:

```python
from contextlib import asynccontextmanager

from fastapi import FastAPI

from dbwarden.fastapi import dbwarden_lifespan, migration_lock


@asynccontextmanager
async def lifespan(app: FastAPI):
    async with migration_lock():
        async with dbwarden_lifespan(mode="migrate", allow_in_production=True):
            yield
```

See also: [Safe Deployment](safe-deployment.md) | [CI/CD Patterns](ci-cd-patterns.md) | [`lock` commands](../commands/lock.md) | [FastAPI Lifespan](../fastapi/index.md)

========================================================================
PAGE: https://dbwarden.emiliano-go.com/advanced/safe-deployment/
========================================================================

# Safe Deployment

How to deploy schema changes with minimal risk and a clear recovery path.
## Pre-flight checklist

Before running migrations in production:

- [ ] Confirm no other migration is running (`dbwarden lock-status`)
- [ ] Review pending migrations (`dbwarden status`)
- [ ] Confirm migrations have been tested in staging
- [ ] Take a backup if your database does not have point-in-time recovery

## Standard deploy sequence

```bash
# 1. Verify lock is free
$ dbwarden lock-status --database primary

# 2. Check what will run
$ dbwarden status --database primary

# 3. Apply with backup
$ dbwarden migrate --database primary --with-backup --backup-dir ./backups

# 4. Confirm clean state post-migration
$ dbwarden status --database primary
$ dbwarden history --database primary
```

For multi-database deployments:

```bash
$ dbwarden migrate --all --with-backup --backup-dir ./backups
```

## What happens when a migration fails mid-run

### PostgreSQL (transactional DDL)

PostgreSQL wraps DDL in transactions. If a migration file fails partway through, the entire file is rolled back. The migration remains in "pending" state. The lock is released. You can safely fix the SQL and retry.

```bash
# After a failed migration:
$ dbwarden status --database primary       # confirm migration is still pending
$ dbwarden lock-status --database primary  # confirm lock was released

# Fix the migration file, then:
$ dbwarden migrate --database primary
```

### MySQL / databases without transactional DDL

DDL cannot be rolled back. A failed migration may have partially applied changes (e.g., a table was created but the index was not). Manual inspection is required before retrying.
```bash
# Check what the migration was supposed to do
cat migrations/primary/V__0012_add_payment_tables.sql

# Inspect current schema state via your database client
# Determine what was applied and what was not

# Either:
# a) Manually apply the remaining SQL
# b) Create a corrective migration
# c) Roll back manually and retry from scratch
```

## Recovery: stuck lock

If a migration process was killed and the lock was not released:

```bash
# 1. Confirm no migration process is running
$ dbwarden lock-status --database primary

# 2. Inspect history to see the last applied migration
$ dbwarden history --database primary

# 3. Inspect pending state
$ dbwarden status --database primary

# 4. Only if the process is confirmed dead:
$ dbwarden unlock --database primary

# 5. Retry
$ dbwarden migrate --database primary
```

See [Migration Locking](migration-locking.md) for full lock recovery guidance.

## Recovery: failed migration, data is wrong

If a migration applied successfully but produced incorrect data or schema:

**Option A: Rollback** (if the migration has a `-- rollback` section):

```bash
$ dbwarden rollback --database primary
```

This executes the rollback SQL defined in the migration file. Verify the rollback SQL was written when the migration was created; not all migrations include one.

**Option B: Forward fix** (preferred for data migrations):

```bash
# Create a corrective migration
$ dbwarden new "fix column type on payments" --database primary

# Edit the generated file with the corrective SQL
$ dbwarden migrate --database primary
```

Forward fixes are safer than rollbacks for data migrations, as rollback SQL is harder to write correctly after the fact.

## Baseline migrations

For databases that already have a schema (migrating from another tool or brownfield setup):

```bash
$ dbwarden migrate --database primary --baseline --to-version 0005
```

`--baseline` marks migrations as applied without executing them.
Use this to tell DBWarden "this database already has schema up to version 0005."

## Smoke test after deploy

After migrations complete, run a quick connectivity and schema check:

```bash
$ dbwarden check-db --database primary
```

`check-db` inspects the live database schema and reports what tables and columns exist. Use this to confirm the schema matches what your application expects.

See also: [Migration Locking](migration-locking.md) | [CI/CD Patterns](ci-cd-patterns.md) | [`rollback` command](../commands/rollback.md) | [Cookbook: Safety & Impact](../cookbook/06-safety-impact.md)

========================================================================
PAGE: https://dbwarden.emiliano-go.com/architecture-deep-dive/
========================================================================

# Architecture

This page explains DBWarden internals for contributors and advanced debugging.

## Layered architecture

```text
CLI (Typer)
  -> Commands layer
    -> Engine layer (planning/parsing/version/checksum/model discovery)
      -> Repository layer (migration + lock records)
        -> Database layer (SQLAlchemy connection + SQL execution)
```

## Responsibilities

- CLI: parse args, global flags (`--dev`, `--strict-translation`, `--help`)
- Commands: orchestrate workflows (`migrate`, `rollback`, `make-migrations`, `status`, `history`, `check`, `diff`, `generate-models`, `export-models`, `seed`, `lock-status`, `unlock`, `init`, `snapshot`, `settings`, `database`, `version`)
- Engine: parse files, resolve ordering, checksums, model discovery
- Repositories: read/write migration and lock metadata
- Database: execute SQL with backend-aware connections

## Configuration resolution pipeline

When runtime config is requested:

1. discover one config source (`dbwarden.py` or single callsite)
2. fallback to `DBWARDEN_CONFIG_MODULE` when configured
3. import source and execute `database_config(...)` calls
4. validate uniqueness/default/model-path rules
5. resolve selected database and apply `--dev` swap when enabled

Ambiguous sources fail fast.

## Migration execution lifecycle

For `migrate`:

1. ensure migrations metadata table exists
2. ensure lock table exists
3. acquire lock
4. build pending execution plan
5. execute SQL statements
6. record migration metadata/checksums
7. release lock

## Rollback lifecycle

Rollback uses the same lock discipline, selecting rollback SQL from applied files in reverse order.

## Model-to-SQL generation lifecycle

`make-migrations` pipeline:

1. discover model paths
2. import model modules
3. extract table/column metadata
4. load latest schema snapshot (`.dbwarden/schemas/*.schema.json`) if one exists
5. if snapshot exists: **snapshot-diff path**
   - diff model tables against snapshot tables
   - auto-detect table renames from dropped↔added table pairs (column overlap ≥ 0.6)
   - apply user `--rename-table` flags and/or interactive prompts to confirm table renames
   - emit `ALTER TABLE ... RENAME TO` (confirmed) or `DROP TABLE` + `CREATE TABLE` (not confirmed)
   - apply confirmed table renames to snapshot before column processing
   - auto-detect column renames from dropped↔added pairs of the same type
   - apply user `--rename` flags and/or interactive prompts to confirm renames
   - detect column-level changes: type, nullability, default (same-name columns)
   - emit `RENAME COLUMN` (confirmed) or `DROP` + `ADD` (not confirmed)
   - emit `ALTER COLUMN TYPE` / `SET NOT NULL` / `DROP NOT NULL` / `SET DEFAULT` / `DROP DEFAULT`
   - optionally use multi-step safe type change (`--safe-type-change`)
   - order all operations by `StatementOrder` (RENAME_TABLE first) and assemble upgrade/rollback
   - generate upgrade and rollback SQL from the ops
6. if no snapshot: **live-DB fallback path**
   - take a full schema snapshot from the live database via `extract_full_schema_snapshot()`
   - run standard snapshot-diff pipeline against it (type, nullability, default, FK, index changes)
   - only rename detection is unavailable without a cached snapshot
7. deduplicate against existing migration statements
8. write migration file
9. write companion `.plan.json` metadata file (with `resolved_from` on rename ops)

### Snapshot write lifecycle (in `migrate`)

After applying versioned migrations, `migrate` calls `_write_migration_snapshot()`:

1. connect to database (respecting sandbox override)
2. extract full schema: tables, columns, types, indexes, constraints, enums
3. compute SHA-256 checksum
4. write `.schema.json` to `.dbwarden/schemas/`
5. on failure: log warning (non-blocking)

Snapshots are not written during `--dry-run`, `--sandbox`, or for repeatable migrations.

## Repeatable migration model

Supported classes:

- versioned (`NNNN_`): run once in ordered sequence
- runs always (`RA__`): run each migrate execution
- runs on change (`ROC__`): run only when checksum changes

## Integrity model

Checksums are recorded for migration content and used for:

- repeatable migration change detection
- migration consistency checks
- audit/debug confidence

## Concurrency model

Migration-mutating commands are serialized by lock state stored in database tables.

Recovery commands:

- `dbwarden lock-status`
- `dbwarden unlock`

## Dev translation path

With `--dev` and SQLite target:

1. extract model types/defaults
2. translate backend-specific constructs
3. fallback behavior in non-strict mode
4. fail-fast behavior with `--strict-translation`

Translation happens during SQL generation, not by mutating existing migration files.
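Returning to the repeatable migration and integrity models above, the run-or-skip decision for the three migration classes can be sketched as a single function. This is an illustration under stated assumptions, not DBWarden's internal code: the helpers `sha256_hex` and `should_run` are hypothetical, the checksum is assumed to be SHA-256 over the UTF-8 file text with no normalization, and a plain `ValueError` stands in for the checksum mismatch error.

```python
import hashlib
from typing import Optional


def sha256_hex(sql_text: str) -> str:
    """Hex SHA-256 of the migration text (assumes UTF-8, no normalization)."""
    return hashlib.sha256(sql_text.encode("utf-8")).hexdigest()


def should_run(kind: str, stored_checksum: Optional[str], current_checksum: str) -> bool:
    """Decide whether a migration runs; kind is 'versioned', 'ra', or 'roc'."""
    if kind == "ra":
        # Runs always: executed on every migrate run.
        return True
    if kind == "roc":
        # Runs on change: executed when never applied or when content changed.
        return stored_checksum != current_checksum
    if kind == "versioned":
        if stored_checksum is None:
            return True  # never applied, so it is pending
        if stored_checksum != current_checksum:
            # Stand-in for the checksum mismatch error on an applied file.
            raise ValueError("checksum mismatch on applied versioned migration")
        return False
    raise ValueError(f"unknown migration kind: {kind}")
```

The point of the sketch is the asymmetry: a changed checksum is a trigger for `roc` files and an error for versioned files.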
## Error propagation strategy

- config/load validation errors fail early
- execution errors abort current run with context
- lock release is guarded in cleanup paths

========================================================================
PAGE: https://dbwarden.emiliano-go.com/cli-reference/
========================================================================

# CLI Reference

Pure command lookup for DBWarden CLI.

## Syntax

```bash
$ dbwarden [GLOBAL_OPTIONS] COMMAND [ARGS] [COMMAND_OPTIONS]
```

## Global options

| Option | Description |
|---|---|
| `--dev` | Use `dev_database_url` and `dev_database_type` for selected database |
| `--strict-translation` | Fail on unsupported/lossy dev SQLite translation |
| `--help` | Show help |

## Configuration

### `settings show`

```bash
$ dbwarden settings show
$ dbwarden settings show primary
$ dbwarden settings show --all
```

### `database list`

```bash
$ dbwarden database list
```

## Migration authoring

### `make-migrations`

```bash
$ dbwarden make-migrations "create users table" --database primary
$ dbwarden make-migrations --verbose --database primary
$ dbwarden make-migrations --plan --database primary
$ dbwarden make-migrations --rename users.username:email --database primary
$ dbwarden make-migrations --rename-table users:accounts --database primary
$ dbwarden make-migrations --safe-type-change --database primary
```

Options:

- `--database`/`-d`: Target database
- `--plan`: Print migration plan JSON without writing files
- `--offline`: Use model state file instead of live database (run `export-models` first)
- `--verbose`/`-v`: Verbose output
- `--rename`: Repeatable. Declare a column rename in format `table.old_name:new_name`.
- `--rename-table`: Repeatable. Declare a table rename in format `old_table:new_table`.
- `--safe-type-change`: Multi-step safe type change strategy.
- `--clickhouse-engine-recreate`: Allow automatic ClickHouse table rebuild on engine change.
- `--drop-preserved-clickhouse-table` / `--keep-preserved-clickhouse-table`: Drop or keep the preserved old ClickHouse table after engine-recreate swap.
- `--type`/`-t`: Output prefix: `versioned` (default), `ra`/`runs_always`, or `roc`/`runs_on_change`.

See [make-migrations](commands/make-migrations.md) for full documentation including rename detection, column-level changes, schema snapshots, and plan format.

### `new`

```bash
$ dbwarden new "manual hotfix" --database primary
$ dbwarden new "backfill" --database primary --version 0042
$ dbwarden new "seed data" --database primary --type ra
```

Options: `--database`, `--version`, `--type`/`-t`

### `generate-models`

```bash
$ dbwarden generate-models --output ./models/ --database primary
$ dbwarden generate-models --database primary --single-file
$ dbwarden generate-models --database primary --tables users,posts
$ dbwarden generate-models --database primary --exclude-tables logs,audit
```

Options: `--output`/`-o` (default `models`), `--tables`, `--exclude-tables`, `--clickhouse-engines`, `--relationships`, `--dialect`, `--single-file`, `--base`, `--database`/`-d`

### `export-models`

```bash
$ dbwarden export-models --database primary
$ dbwarden export-models --database primary --output .dbwarden/model_state.json
```

Exports current model definitions to a JSON state file for offline migration diffs.

Options: `--output`/`-o` (default `.dbwarden/model_state.json`), `--database`/`-d`

### `diff`

```bash
$ dbwarden diff --database primary
$ dbwarden diff --database primary --out json
$ dbwarden diff --database primary --out sql
$ dbwarden diff --database primary --offline
```

Read-only model-vs-database comparison. No files are written.
Options: `--database`/`-d`, `--out`/`-o` (`table`, `json`, `sql`), `--offline`, `--verbose`/`-v`

### `check-impact`

```bash
$ dbwarden check-impact 0042 --database primary
$ dbwarden check-impact 0042 --database primary --out json
$ dbwarden check-impact 0042 --database primary --scan-path app/
$ dbwarden check-impact path/to/primary__0042_add_bio.plan.json
```

Scans your codebase for references to schema elements affected by a migration.

| Option | Description |
|--------|-------------|
| `migration` | Migration version (e.g. `0042`) or plan file path (required) |
| `--out`/`-o` | Output format: `text` (default) or `json` |
| `--scan-path` | Directory to scan for affected code (default: `.`) |
| `--deep` | Enable deep introspection (imports models live) |
| `--verbose`/`-v` | Include INFO-level operations in the scan |
| `--database`/`-d` | Target database name |

## Migration execution

### `migrate`

```bash
$ dbwarden migrate --database primary
$ dbwarden migrate --all
$ dbwarden migrate --database primary --to-version 0010
$ dbwarden migrate --database primary --count 2
$ dbwarden migrate --database primary --with-backup
$ dbwarden migrate --database primary --baseline --to-version 0005
```

Options:

- `--database`, `--all`
- `--to-version`, `--count`
- `--baseline`
- `--with-backup`, `--backup-dir`
- `--dry-run` (show what would be applied without executing)
- `--sandbox` (apply in a temporary sandbox database)
- `--apply-seeds` (apply pending seeds after migrations, overrides config)
- `--verbose`

### `rollback`

```bash
$ dbwarden rollback --database primary
$ dbwarden rollback --database primary --count 2
$ dbwarden rollback --database primary --to-version 0007
```

Options: `--database`, `--count`, `--to-version`, `--verbose`

### `downgrade`

```bash
$ dbwarden downgrade --to 0005 --database primary
```

Options: `--to` (required), `--database`, `--verbose`

### `make-rollback`

```bash
$ dbwarden make-rollback migrations/primary__0005_add_table.sql
```

Generates a
`.rollback.sql` file for the given migration file. ### `snapshot` ```bash $ dbwarden snapshot users --database primary ``` Outputs the DDL schema of the specified table. ## Seed management ### `seed create` ```bash $ dbwarden seed create "seed initial data" --database primary $ dbwarden seed create "populate lookup tables" --database primary --type python ``` Options: `--database`, `--type` (`sql` or `python`, default `sql`), `--verbose` ### `seed apply` ```bash $ dbwarden seed apply --database primary $ dbwarden seed apply --database primary --version 0003 $ dbwarden seed apply --database primary --dry-run $ dbwarden seed apply --all ``` Options: `--database`, `--all` (`-a`), `--version`, `--dry-run`, `--verbose` ### `seed list` ```bash $ dbwarden seed list --database primary $ dbwarden seed list --all $ dbwarden seed list --prune ``` Options: `--database`, `--all`, `--prune`, `--verbose` ### `seed rollback` ```bash $ dbwarden seed rollback --database primary $ dbwarden seed rollback --database primary --count 2 $ dbwarden seed rollback --database primary --to-version 0003 $ dbwarden seed rollback --all ``` Options: `--database`, `--count`, `--to-version`, `--all`, `--verbose` ### `seed export` ```bash $ dbwarden seed export --database primary $ dbwarden seed export --all $ dbwarden seed export --database clickhouse --output-dir ./seeds ``` Export code seeds to ROC SQL files for stateless production application. 
Options: `--database`/`-d`, `--all`/`-a`, `--output-dir`/`-o` (default `seeds/`) ## Inspection and diagnostics ### `status` ```bash $ dbwarden status --database primary $ dbwarden status --all ``` ### `history` ```bash $ dbwarden history --database primary ``` ### `check-db` ```bash $ dbwarden check-db --database primary $ dbwarden check-db --database primary --out json ``` Output formats: `txt`, `json`, `yaml`, `sql` ### `check` ```bash $ dbwarden check --database primary $ dbwarden check --database primary --force $ dbwarden check --database primary --out json ``` Output formats: `txt`, `json` ## Locking ### `lock-status` ```bash $ dbwarden lock-status --database primary ``` ### `unlock` ```bash $ dbwarden unlock --database primary ``` ## Utility ### `config` ```bash $ dbwarden config ``` ### `version` ```bash $ dbwarden version ``` For worked command examples, see the [Cookbook & Examples](cookbook/index.md). ======================================================================== PAGE: https://dbwarden.emiliano-go.com/codebase/ ======================================================================== # Codebase Organization ## Top-Level Layout ``` dbwarden/ # The package itself tests/ # Test suite (~40 modules) docs/ # Documentation site (MkDocs) examples/ # Runnable example projects scripts/ # Development and CI tooling assets/ # Images, icons, branding site/ # Built documentation output (gitignored) ``` ## Package Layout (`dbwarden/`) | Directory / Module | Responsibility | |---|---| | `cli/` | Typer CLI definitions, argument parsing | | `commands/` | Command orchestration (migrate, generate-models, check, etc.) 
| | `engine/` | Core logic: model discovery, snapshot extraction, diff, offline migration, safety checks | | `database/` | Connection management, SQL queries by dialect | | `databases/` | Concrete backend specs: ClickHouse, MySQL, PostgreSQL, MariaDB, SQLite | | `schema/` | Dialect-agnostic metadata layer: table/column/field metadata classes | | `repositories/` | Migration and lock metadata persistence | | `fastapi/` | FastAPI integration (lifespan, health checks) | | `config*.py` | Configuration loading and resolution | | `constants.py` | Shared constants | | `exceptions.py` | Exception hierarchy | | `seed.py` | Seed data infrastructure | | `sandbox.py` | Module loading sandbox for user model files | ## The `schema/` vs `databases/` Boundary The `dbwarden/schema/` package is the abstract metadata layer. It defines dialect-agnostic constructs that make no assumptions about the target database: - `TableMeta` and `*ColumnMeta` classes (e.g. `PGColumnMeta`, `CHColumnMeta`, `MyColumnMeta`) - `DBWardenMeta`: the runtime metadata container attached to each model - `_MetaValidator`: metaclass that validates `class Meta` attribute names at import time - `IndexSpec`, `CheckSpec`, `UniqueSpec`: cross-database object specs - `_meta_reader.py`: logic that reads `class Meta` from user models and populates `DBWardenMeta` The `dbwarden/databases/` package is the concrete backend layer. It contains dialect-specific specs and helpers: - `clickhouse/`: `ChEngineSpec`, `ProjectionSpec`, `ChIndexSpec`, `ChTableSpec`, merge-tree helpers, `ChFieldSpec` - `mysql/`: `MyFieldSpec`, `MyTableSpec` - `pgsql/`: `PgFieldSpec`, `PgIndexSpec`, `PgTableSpec`, exclude/partition helpers - `mariadb/`: `MdbFieldSpec`, `MdbTableSpec` - `sqlite/`: `SqFieldSpec`, `SqTableSpec` ### The Import Contract The single most important rule in the codebase is: > **`schema/` must never import from `databases/`.** This keeps the metadata layer database-agnostic. 
`databases/` may import from `schema/` (and does, for `TableMeta`, `DBWardenMeta`, `IndexSpec`, etc.), but the reverse dependency is forbidden. Consequences of this boundary: - **`ChEngineSpec` and `ProjectionSpec` live in `databases/clickhouse/`**, not `schema/`. They are ClickHouse-specific types, not abstract schema concepts. - Backend specs (`ChTableSpec`, `MyTableSpec`, etc.) are defined per-database, not in `schema/`. - The `schema/__init__.py` only re-exports classes from `schema/` submodules. It does not re-export backend-specific types from `databases/`. - Users import backend types through `from dbwarden.databases.clickhouse import ChEngineSpec` or the top-level `from dbwarden import ChEngineSpec`. ### What Changed in the v0.13.0 Refactor The refactor tightened this boundary. Previously, `ChEngineSpec`, `ProjectionSpec`, and the `*FieldMeta` hierarchy lived in `schema/`. They were moved to their correct locations: - `ChEngineSpec`, `_split_engine_args`: now in `databases/clickhouse/engine.py` - `ProjectionSpec`: now in `databases/clickhouse/projection.py` - `*FieldMeta` classes (`PGFieldMeta`, `CHFieldMeta`, etc.): deleted; fields inlined directly into `*ColumnMeta` in `table_meta.py` The orphan `__pycache__` directories under `schema/{clickhouse,mysql,pgsql,mariadb,sqlite}/` were removed. ## Contribution Guidelines ### Before submitting a PR 1. Ensure your changes respect the `schema/` vs `databases/` import boundary (see above). 2. Run the full test suite before pushing: ``` python -m pytest tests/ -x -q ``` 3. If you add or remove a public export, update the corresponding `__all__` list in the module's `__init__.py`. 4. If you introduce a new top-level directory, add it to the table in this document. ### Code style - No comments in production code unless the logic is genuinely subtle. - Mimic existing patterns: same typing style, same docstring conventions, same import organization. 
- Prefer `from __future__ import annotations` at the top of every module. - Use `metaclass=_MetaValidator` for any new `class Meta`-like user-facing configuration class. ### Adding a new database backend 1. Create a new subpackage under `databases//` with `__init__.py`, `field.py`, and any backend-specific specs. 2. Define a `*TableSpec` dataclass and a `*FieldSpec` dataclass matching the existing backends. 3. Register the backend in `databases/__init__.py` and add the shortcut import (`sq`, `my`, etc.). 4. If the backend needs no column-level `Meta` attributes (like SQLite), add no `*ColumnMeta` class. 5. Do not touch files in `schema/` unless you are adding cross-database metadata fields. ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/check-db/ ======================================================================== # `check-db` Inspect live database schema. ## Usage ```bash $ dbwarden check-db --database primary $ dbwarden check-db --database primary --out json $ dbwarden check-db --database primary --out yaml ``` ## Options - `--database`, `-d` - `--out`, `-o` (`txt`, `json`, `yaml`, `sql`) ## Notes - useful for schema inspection and diagnostics - complements `status` and `history` See also: [Your First Migration](../getting-started/first-migration.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/check/ ======================================================================== # `check` Analyze schema differences between your SQLAlchemy models and the live database. 
## Usage ```bash $ dbwarden check --database primary $ dbwarden check --database primary --force $ dbwarden check --database primary --out json ``` ## Options - `--database`, `-d` - Target database - `--out`, `-o` - Output format: `txt` or `json` - `--force` - Allow warning-level changes to pass ## Severity model - `INFO` - safe changes like adding a projection or adding a new object - `WARNING` - risky changes that require `--force` - `ERROR` - blocked changes such as partition/order key changes ## Current behavior DBWarden runs generic safety checks for all backends, covering column type changes, nullability changes, default changes, and table operations. For ClickHouse specifically, additional checks classify changes for: - added or removed columns - type changes - engine changes - TTL changes - `ORDER BY` changes - `PARTITION BY` changes - materialized view query changes - projection additions/removals ## Notes - warning-level changes exit non-zero unless `--force` is provided - error-level changes remain blocking even with `--force` - output is based on live database inspection plus current model metadata ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/database/ ======================================================================== # `database` Display configured databases. Config is defined in Python code via `database_config()`, so `database list` is a read-only command for viewing what's registered. ## Usage ```bash $ dbwarden database list ``` ## See also - [`settings show`](./settings.md): detailed view of all configuration - [Configuration docs](../configuration/index.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/diff/ ======================================================================== # `diff` Show structural differences between SQLAlchemy models and a live database. 
Read-only: no files are written. ## Usage ```bash $ dbwarden diff --database primary $ dbwarden diff --database primary --out json $ dbwarden diff --database primary --out sql $ dbwarden diff --database primary --offline ``` ## Options | Option | Description | |--------|-------------| | `--database`, `-d` | Target database name | | `--out`, `-o` | Output format: `table` (default), `json`, `sql` | | `--offline` | Use exported model state file instead of live DB snapshot | | `--verbose`, `-v` | Enable verbose logging | ## Output formats ### `table` (default) Displays a Rich table with columns: Operation, Table, Target, Severity. ```text Schema Diff ┏━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━┳━━━━━━━━━┓ ┃ Operation ┃ Table ┃ Target ┃ Severity┃ ┡━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━╇━━━━━━━━━┩ │ add_column │ users │ email │ INFO │ │ drop_column │ users │ name │ WARNING │ └──────────────┴───────┴────────┴─────────┘ ``` ### `json` ```json [ {"operation": "add_column", "table": "users", "target": "email", "severity": "INFO"}, {"operation": "drop_column", "table": "users", "target": "name", "severity": "WARNING"} ] ``` ### `sql` Prints the raw migration SQL that would be generated. ## Offline mode Requires a model state file created by `dbwarden export-models`: ```bash $ dbwarden export-models --database primary # Switch to offline machine $ dbwarden diff --database primary --offline ``` ## See also - [`make-migrations`](./make-migrations.md): generates migration files from diffs - [`check`](./check.md): safety analyzer for schema changes - [`check-db`](./check-db.md): inspect live database schema ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/downgrade/ ======================================================================== # `downgrade` Revert applied migrations to reach a specific target version. 
## Usage ```bash $ dbwarden downgrade --to 0005 --database primary ``` ## Options - `--to`, `-t` (required) - Target version to downgrade to - `--database`, `-d` - `--verbose`, `-v` ## Notes - reads `-- rollback` sections from migration files and applies them in reverse order - only reverts versions after the target version; versions at or before the target are preserved - same lock discipline as `migrate` and `rollback` - fails if the target version has not been applied See also: [rollback](rollback.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/generate-models/ ======================================================================== # `generate-models` Reverse-engineer SQLAlchemy model code from a live database. ## Usage ```bash $ dbwarden generate-models --output ./models/ --database primary $ dbwarden generate-models --output ./models/ --database primary --single-file $ dbwarden generate-models --output ./models/ --database primary --base app.database:Base $ dbwarden generate-models --database primary --tables users,posts $ dbwarden generate-models --database primary --exclude-tables logs,audit ``` > **Note:** `generate-models` works for all supported databases: PostgreSQL, MySQL, MariaDB, ClickHouse, and SQLite. For ClickHouse, use `--clickhouse-engines` or rely on auto-detection from `database_type="clickhouse"`. SQLite produces basic table models without backend-specific metadata. ## Options | Option | Description | |--------|-------------| | `--output`, `-o` | Output directory (default: `models`) | | `--tables` | Comma-separated list of tables to include | | `--exclude-tables` | Comma-separated list of tables to exclude | | `--clickhouse-engines` | Include ClickHouse engine metadata. 
Auto-detected when `database_type="clickhouse"` | | `--relationships` | Generate `relationship()` attributes for foreign keys | | `--dialect` | SQL dialect for type mapping (auto-detected from database type) | | `--single-file` | Generate a single `models.py` instead of one file per table | | `--base` | Custom Base class import path (e.g. `app.database:Base`). Default: generates `declarative_base()` in each file | | `--database`, `-d` | Target database name | ## Output rules - **Default**: one `.py` file per table (e.g., `users.py`, `posts.py`) - **`--single-file`**: generates `models.py` with all models - Each file imports `declarative_base()` and defines `Base` (or imports from the path given by `--base`) ## Type mapping Database column types are mapped to SQLAlchemy types: | Database Type | SQLAlchemy Type | |---------------|----------------| | `INTEGER` | `Integer` | | `VARCHAR(N)` | `String(length=N)` | | `TEXT` | `Text` | | `BOOLEAN` / `TINYINT(1)` | `Boolean` | | `DECIMAL(P,S)` | `Numeric(precision=P, scale=S)` | | `DATETIME` / `TIMESTAMP` | `DateTime` | | `BIGINT` | `BigInteger` | | `FLOAT` / `DOUBLE` | `Float` | | `Nullable(...)` (ClickHouse) | Inner type (nullable is explicit) | ## PostgreSQL First-Class Output For PostgreSQL databases, `generate-models` reverse-engineers all supported metadata and emits it as `class Meta` inner classes with `PGTableMeta` and `PGColumnMeta`: ```python from sqlalchemy.orm import DeclarativeBase from dbwarden.databases.pgsql import PGTableMeta, PGColumnMeta, pg class Base(DeclarativeBase): pass class User(Base): __tablename__ = "users" id: Mapped[int] = mapped_column(primary_key=True) email: Mapped[str] = mapped_column(String(255), unique=True) bio: Mapped[str | None] = mapped_column(Text, nullable=True) class Meta(PGTableMeta): comment = "Core user accounts" pg_fillfactor = 80 class id(PGColumnMeta): pg = pg.field(identity="always", identity_start=100, identity_increment=1) class bio(PGColumnMeta): pg = 
pg.field(storage="EXTENDED", collation="en_US.UTF-8") ``` The following metadata is reverse-engineered: - **Identity columns**: `GENERATED ALWAYS/BY DEFAULT AS IDENTITY` with sequence options - **Collation**: per-column `COLLATE` setting - **Storage**: per-column `STORAGE` (PLAIN, MAIN, EXTERNAL, EXTENDED) - **Generated columns**: `GENERATED ALWAYS AS (...) STORED` - **Table fillfactor**: `WITH (fillfactor = N)` - **Tablespace**: `SET TABLESPACE` - **Inheritance**: `INHERITS (parent)` - **EXCLUDE constraints**: `EXCLUDE USING ...` - **FK options**: `ondelete`, `onupdate`, `deferrable` on `ForeignKey()` - **Index options**: `USING`, `WHERE`, `INCLUDE`, `WITH`, `TABLESPACE`, `NULLS NOT DISTINCT` - **Table and column comments** For the complete feature reference, see [PostgreSQL Deep Dive](../databases/postgresql.md). ## ClickHouse First-Class Output For ClickHouse databases, `generate-models` reverse-engineers all supported metadata and emits it as `class Meta` inner classes with `CHTableMeta`, `CHColumnMeta`, `ChEngineSpec`, and `ProjectionSpec`. Engine metadata is included automatically when `database_type="clickhouse"` (no `--clickhouse-engines` flag required). 
```python from sqlalchemy.orm import DeclarativeBase from dbwarden.databases.clickhouse import CHTableMeta, CHColumnMeta, ChEngineSpec, ProjectionSpec, ch class Base(DeclarativeBase): pass class Event(Base): __tablename__ = "events" id: Mapped[int] = mapped_column(Int64, primary_key=True) event_date: Mapped[date] = mapped_column(Date) payload: Mapped[str] = mapped_column(String) class Meta(CHTableMeta): ch_engine = ChEngineSpec("MergeTree") ch_order_by = ["event_date", "id"] ch_partition_by = "toYYYYMM(event_date)" ch_ttl = ["event_date + toIntervalYear(1)"] ch_settings = {"index_granularity": "8192"} ch_projections = [ ProjectionSpec("by_date", "SELECT event_date, sum(amount) GROUP BY event_date"), ] class payload(CHColumnMeta): ch = ch.field(codec="ZSTD(3)") ``` The following metadata is reverse-engineered: - **Engine spec**: engine name, arguments, ZooKeeper path, replica name, settings via `ChEngineSpec` - **Ordering and partitioning**: `ch_order_by`, `ch_primary_key`, `ch_partition_by`, `ch_sample_by` - **TTL**: table-level TTL expressions - **Projections**: named projections via `ProjectionSpec` - **Materialized views**: `ch_select_statement`, `ch_to_table` - **Dictionaries**: `ch_dictionary`, `ch_dict_layout`, `ch_dict_source`, `ch_dict_lifetime`, `ch_dict_primary_key` - **Column metadata**: codec, default expression, LowCardinality/Nullable wrappers via `CHColumnMeta` - **Skip indexes**: `ChIndexSpec` entries in `ch_indexes` - **Table and column comments** For the complete feature reference, see [ClickHouse Deep Dive](../databases/clickhouse.md). 
## Use cases - **Bootstrapping**: start a new project from an existing database - **Documentation**: generate model stubs to document the schema - **Recovery**: regenerate models when migration scripts are missing ## Warnings - Generated code requires manual review and cleanup - ClickHouse engine metadata is auto-detected; review the generated `ChEngineSpec` to ensure correctness ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/history/ ======================================================================== # `history` Show migration execution history. ## Usage ```bash $ dbwarden history --database primary ``` ## Options - `--database`, `-d` ## Notes - shows applied migrations, order, and timestamps - useful for audit and incident analysis See also: [Your First Migration](../getting-started/first-migration.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/init/ ======================================================================== # `init` Initialize DBWarden project scaffolding. ## Usage ```bash $ dbwarden init $ dbwarden init --database primary ``` ## What it does - creates `migrations/` and `migrations//` if missing - creates/updates config scaffold (`dbwarden.py`) if needed - does not mutate your database schema ## Notes - safe to run multiple times - first command to run in a new project See also: [Configuration](../configuration/index.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/lock/ ======================================================================== # `lock-status` and `unlock` Inspect and recover migration lock state. 
## Usage ```bash $ dbwarden lock-status --database primary $ dbwarden unlock --database primary ``` ## Options - `--database`, `-d` ## Notes - use `lock-status` to inspect lock state - use `unlock` only when lock is stale and no migration is running See also: [Migration Locking](../advanced/migration-locking.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/make-migrations/ ======================================================================== # `make-migrations` Generate SQL migration file(s) from SQLAlchemy models. ## Usage ```bash # Auto-generated name from schema changes $ dbwarden make-migrations # User-provided description $ dbwarden make-migrations "create users table" # With database option $ dbwarden make-migrations --database primary --verbose # Output plan JSON only $ dbwarden make-migrations --database primary --plan # Explicitly declare column renames $ dbwarden make-migrations --rename users.username:email --rename posts.title:headline # Explicitly declare table renames $ dbwarden make-migrations --rename-table users:accounts # Safe multi-step type changes $ dbwarden make-migrations --database primary --safe-type-change ``` ## Options - `description` (optional): Custom migration name. If not provided, automatically generated from schema changes. - `--database`, `-d`: Target database. - `--plan`: Print the migration plan JSON without writing files. - `--verbose`, `-v`: Verbose output. - `--rename`: Repeatable. Declare a column rename in the format `table.old_name:new_name`. See [Rename Detection](#rename-detection) below. - `--rename-table`: Repeatable. Declare a table rename in the format `old_table:new_table`. See [Table Rename Detection](#table-rename-detection) below. - `--safe-type-change`: Use a multi-step strategy for type changes: add a temporary column, data migration comment, verification step, then drop-and-rename. 
Useful for databases where `ALTER COLUMN TYPE` would lock the table. - `--concurrent` / `--no-concurrent`: Enable or disable `CREATE INDEX CONCURRENTLY` for PostgreSQL (default: `--concurrent`). Use `--no-concurrent` when the migration runs inside a transaction block. - `--offline`: Use a model state file (`.dbwarden/model_state.json`) instead of a live database or schema snapshot. Run `dbwarden export-models` first to establish a baseline. Useful for CI pipelines without a database service. - `--clickhouse-engine-recreate`: Allow automatic ClickHouse table rebuild when engine changes require recreation. Required to generate `recreate_ch_table` operations. See [ClickHouse Engine Recreate](#clickhouse-engine-recreate) below. - `--drop-preserved-clickhouse-table` / `--keep-preserved-clickhouse-table`: Control whether the preserved old ClickHouse table is dropped after the engine-recreate swap. If omitted, interactive terminals are prompted; non-TTY preserves by default. - `--postgres-auto-using`: Emit an active `USING col::newtype` clause on PostgreSQL `ALTER COLUMN TYPE` statements. Default is a commented-out line for manual review. See the PostgreSQL docs section on Column Type Changes for details. - `--type`, `-t`: Output prefix for the generated migration file: `versioned` (default), `ra` / `runs_always`, or `roc` / `runs_on_change`. Use `ra` for SQL that should run every migration cycle (e.g. grants, materialized view refreshes) and `roc` for SQL that should re-run when the file changes (e.g. stored procedures, triggers). ## Schema Snapshots After each migration is applied, DBWarden writes a **schema snapshot** to `.dbwarden/schemas/.schema.json`. These snapshots capture the full DDL state (tables, columns, types, indexes, constraints, enums) at that point in time. `make-migrations` diffs your SQLAlchemy models against the **latest** snapshot instead of the live database. 
This means: - You don't need a running database to generate migrations (after the first `migrate`). - Rename detection works by comparing dropped and added columns between the snapshot and your models. - If no snapshot exists, `make-migrations` falls back to diffing against the live database. See [Schema Snapshots](schema-snapshots.md) for details. ## Rename Detection When a column is dropped from the snapshot and a new column of the same type is added to the model, DBWarden auto-detects it as a potential **rename** and emits `ALTER TABLE ... RENAME COLUMN` instead of `DROP` + `ADD`. ### Auto-detection rules | Condition | Outcome | |-----------|---------| | Same table, 1 dropped + 1 added, same normalized type | Auto-detected as rename | | Different types | Never auto-detected (emits drop + add) | | Same name kept | Skipped (no op) | | 2+ dropped + 2+ added of same type | Paired sequentially (positional) | ### Rename detection edge cases - **Ambiguous multi-rename**: When 2+ dropped columns and 2+ added columns share the same normalized type, all are treated as renames (paired in insertion order). This is intentionally permissive; false positives can be declined interactively or overridden with `--rename`. - **Drop-add conversion with `resolved_from`**: A confirmed drop+add pair is converted to a `rename_column` op with `resolved_from` tracking the confirmation source (`"rename_flag"` or `"prompt"`). - **Non-matching confirmed set**: If a confirmed rename tuple does not match any op (e.g., table name mismatch), it is silently ignored. - **Table + column rename interaction**: Table renames are processed first (statement order 0). After the snapshot is updated with the new table name, column renames are detected against the renamed table. The column rename's `resolved_from` is independent of the table rename's `resolved_from`. 
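
The auto-detection rules and the multi-rename edge case above can be sketched in a few lines of Python. This is a minimal illustration, not DBWarden's implementation: `Column` and `detect_rename_candidates` are hypothetical names, types are assumed to be normalized already, and the handling of unequal drop/add counts (leftovers stay plain drops and adds) is an assumption that the rules above do not specify.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a column entry in a snapshot or model."""
    table: str
    name: str
    norm_type: str  # assumed to be the already-normalized type


def detect_rename_candidates(dropped, added):
    """Pair dropped and added columns that share a table and normalized type.

    Returns (table, old_name, new_name) tuples. Columns whose types differ
    are never paired, so they remain a plain DROP + ADD.
    """
    dropped_by_key = defaultdict(list)
    added_by_key = defaultdict(list)
    for col in dropped:
        dropped_by_key[(col.table, col.norm_type)].append(col)
    for col in added:
        added_by_key[(col.table, col.norm_type)].append(col)

    candidates = []
    for key, old_cols in dropped_by_key.items():
        new_cols = added_by_key.get(key, [])
        # Positional pairing in insertion order; zip() stops at the shorter
        # list, so unmatched leftovers are simply not paired (assumption).
        for old, new in zip(old_cols, new_cols):
            candidates.append((old.table, old.name, new.name))
    return candidates


if __name__ == "__main__":
    dropped = [Column("users", "username", "VARCHAR")]
    added = [Column("users", "email", "VARCHAR")]
    print(detect_rename_candidates(dropped, added))
```

Candidates produced this way are only proposals: as described in the following sections, each one still has to be confirmed interactively or with `--rename` before it is emitted as `RENAME COLUMN`.
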
### Interactive prompt (TTY)

When auto-detected renames are found and you're in an interactive terminal, `make-migrations` prompts you to confirm each one:

**Single rename:**

```
Detected rename: users.username → email. Confirm rename? [Y/n]:
```

**Multiple renames:**

```
Detected column renames:
  [1] users.username → email
  [2] posts.title → headline
  [s] Skip all
  [a] Accept all
Select renames to confirm (e.g. 1,3 or a or s):
```

### CI / non-TTY behavior

When not in an interactive terminal, auto-detected renames are **not applied**. Instead, a warning is printed suggesting the `--rename` flag:

```
The following auto-detected column renames were not confirmed:
  users.username → email (use --rename users.username:email to confirm)
  posts.title → headline (use --rename posts.title:headline to confirm)
These will be emitted as DROP + ADD instead of RENAME.
```

To apply them in CI, pass the corresponding `--rename` flags.

### `--rename` flag

The `--rename` flag explicitly tells `make-migrations` to treat a drop+add pair as a rename. It is required in non-TTY environments (CI) and can also be used to force renames that auto-detection would miss (e.g., type changes).

Format: `--rename table.old_name:new_name`

Examples:

```bash
# Single rename
$ dbwarden make-migrations --rename users.username:email

# Multiple renames
$ dbwarden make-migrations --rename users.username:email --rename posts.title:headline

# Force rename even when types differ
$ dbwarden make-migrations --rename users.phone:mobile_phone
```

### Resolution order

1. **`--rename` flags** are always applied as `RENAME COLUMN` with `resolved_from: "rename_flag"`.
2. **Auto-detected renames** confirmed via interactive prompt get `resolved_from: "prompt"`.
3. **Auto-detected renames** not in `--rename` flags and not confirmed (or in CI) are emitted as `DROP` + `ADD` instead.

## Column-Level Diff

When a schema snapshot exists, `make-migrations` does more than detect new and dropped columns.
It also compares columns that exist in both the snapshot and the model for three kinds of change: | Change | Snapshot vs Model | Generated SQL | |--------|-------------------|---------------| | **Type change** | `snapshot.type != model.type` (normalized) | `ALTER COLUMN ... TYPE ...` | | **Nullability change** | `snapshot.nullable != model.nullable` | `ALTER COLUMN ... SET NOT NULL` / `DROP NOT NULL` | | **Default change** | `snapshot.default != model.default` | `ALTER COLUMN ... SET DEFAULT` / `DROP DEFAULT` | ### Type change detection Types are normalized before comparison (see [Schema Snapshots](schema-snapshots.md)). If the normalized type differs between the snapshot and the model, an `alter_column_type` operation is emitted. Example: model changes `VARCHAR` to `TEXT`: ```sql -- upgrade ALTER TABLE users ALTER COLUMN bio TYPE TEXT -- rollback -- ALTER TABLE users ALTER COLUMN bio TYPE ``` ### Nullability change When a model column changes nullable, the corresponding `SET NOT NULL` or `DROP NOT NULL` is generated: ```sql -- upgrade ALTER TABLE users ALTER COLUMN email SET NOT NULL -- rollback ALTER TABLE users ALTER COLUMN email DROP NOT NULL ``` ### Default change ```sql -- upgrade ALTER TABLE users ALTER COLUMN role SET DEFAULT 'user' -- rollback ALTER TABLE users ALTER COLUMN role DROP DEFAULT ``` ### Safe type change (`--safe-type-change`) For databases that don't support in-place `ALTER COLUMN TYPE` (or when you want to avoid table locks), pass `--safe-type-change` to generate a multi-step strategy: 1. Add a temporary column with the new type 2. Comment indicating a data migration (`UPDATE ... SET temp = CAST(...)`) 3. Verification step comment 4. 
After manual verification, drop the old column and rename the temporary column **Limitations:** | Backend | Supported | Notes | |---------|-----------|-------| | PostgreSQL | Yes | Multi-step temp column strategy | | MySQL / MariaDB | Yes | Multi-step temp column strategy | | SQLite | No | Comment emitted (SQLite cannot drop columns before 3.35.0 and has limited ALTER TABLE) | | ClickHouse | No | Comment emitted | ## ClickHouse Engine Recreate When a ClickHouse table's engine changes (e.g., `MergeTree` → `ReplicatedMergeTree`), it cannot be altered in-place. DBWarden supports two strategies depending on the table type. ### Table strategy (CREATE + INSERT + RENAME) For regular `MergeTree`-family tables, DBWarden generates a multi-step operation: 1. Create the new table with the new engine as `
__dbw_new` 2. Copy data: `INSERT INTO __dbw_new SELECT ... FROM
` 3. Swap: `RENAME TABLE
TO
__dbw_old,
__dbw_new TO
` 4. (optional) Drop the preserved old table **Materialized view targets:** If a materialized view targets the table being recreated (via `TO
`), the MV is automatically detached before and reattached after the swap: ```sql DETACH TABLE events_mv; CREATE TABLE events__dbw_new (...); INSERT INTO events__dbw_new SELECT ... FROM events; RENAME TABLE events TO events__dbw_old, events__dbw_new TO events; ATTACH TABLE events_mv; ``` **Column renames:** If the table also has column renames, these should be performed in a separate migration (engine recreate and column rename in the same migration is not supported). ### Dictionary strategy (DROP + CREATE) Dictionaries are recreated with a simple DROP + CREATE since ClickHouse does not support `RENAME DICTIONARY`: ```sql -- upgrade DROP DICTIONARY my_dict; CREATE DICTIONARY my_dict (... ReplicatedMergeTree() ...); -- rollback DROP DICTIONARY my_dict; CREATE DICTIONARY my_dict (... MergeTree() ...); ``` > ⚠️ Dictionaries lose their cached data on recreation. The data will be re-fetched from the source. ### Materialized views and unsupported objects Engine recreation is **blocked** for tables that are themselves materialized views (`ch_select_statement` or `ch_object_type = materialized_view`). Handle these manually with a DROP/CREATE migration. Projections (`ch_projections`) are automatically preserved through the table rebuild and do not block it. ### Safety | Object type | Safety | Notes | |------------|--------|-------| | Regular table with engine change | INFO | Preserved old table by default | | Table with dependent MVs | INFO | MVs detached before, reattached after | | Dictionary | CRITICAL | Cached data lost on DROP/CREATE | ### Flags #### `--clickhouse-engine-recreate` **Required** to generate engine recreation operations. Without this flag, any detected engine change raises an error: ``` ClickHouse table 'events' cannot be automatically recreated: is a dictionary (current). This operation requires manual DROP/CREATE, or use --force to skip this check. 
``` #### `--drop-preserved-clickhouse-table` / `--keep-preserved-clickhouse-table` Controls whether the preserved old table (renamed to `
__dbw_old`) is dropped immediately after the swap: - `--drop-preserved-clickhouse-table`: Drop the old table after successful swap - `--keep-preserved-clickhouse-table`: Keep the old table (default in non-TTY) Interactive terminals are prompted to confirm. The preserved table name always ends with `__dbw_old` for easy identification. ### DROP COLUMN warning All `DROP COLUMN` statements are prefixed with a warning comment: ```sql -- WARNING: DROPPING COLUMN users.legacy_field ALTER TABLE users DROP COLUMN legacy_field ``` ## Table Rename Detection When a table is dropped from the snapshot and a new table with similar columns is added to the model, DBWarden auto-detects it as a potential **table rename** using a column-overlap heuristic. ### Auto-detection | Condition | Outcome | |-----------|---------| | A table present in the snapshot but absent from models AND a table absent from the snapshot but present in models | Overlap computed by matching column names and normalized types | | Overlap ratio ≥ 0.6 | Prompted as a rename candidate | | Overlap ratio < 0.6 | Emitted as drop+add with a warning comment | The overlap ratio is `matching_columns / max(len(snapshot_cols), len(model_cols))`. A 0.6 threshold is intentionally conservative: a 10-column table with 6 matching columns is a plausible rename, while a 2-column table with 1 match is not. ### Interactive prompt (TTY) **Single candidate:** ``` Possible table rename detected: users → accounts (78% columns match) Treat as rename? [Y/n]: ``` **Multiple candidates:** ``` Possible table renames detected: [1] users → accounts (78% columns match) [2] posts → articles (100% columns match) Treat as renames? (default: all yes) - Press Enter to rename all - Type numbers to drop+add instead (e.g. 
"1" or "1 2"): ``` Table rename prompts appear **before** column rename prompts to ensure table names are resolved before column-level changes. ### CI / non-interactive path ``` Warning: table rename candidates detected but running non-interactive. Emitting drop+add. users → accounts (78% columns match) Rerun with --rename-table users:accounts to resolve. ``` ### `--rename-table` flag Format: `--rename-table :` ```bash # Single table rename $ dbwarden make-migrations --rename-table users:accounts # Multiple renames $ dbwarden make-migrations --rename-table users:accounts --rename-table posts:articles # Combined with column rename $ dbwarden make-migrations --rename-table users:accounts --rename accounts.username:email ``` Note: when combining table and column renames, the column rename references the **new** table name. Table renames are applied to the snapshot before column-level processing. ### Table rename edge cases - **Empty tables**: If either the snapshot table or the model table has zero columns, the overlap ratio is `0.0` and the pair is not a rename candidate. - **Zero overlap**: If no columns match by name and normalized type, the ratio is `0.0`: emitted as drop+add. - **Exact match**: If all columns match, the ratio is `1.0`: always a rename candidate. - **Table rename + column changes in the same table**: After the table rename is applied to the snapshot, column diffs are computed against the new table name. Column renames, type changes, nullable changes, and default changes are all detected on the renamed table. - **ClickHouse**: `ALTER TABLE RENAME` emits a comment-only placeholder since ClickHouse does not support it. ### SQL generation All four supported backends (SQLite, PostgreSQL, MySQL, MariaDB) use the same syntax: ```sql -- upgrade ALTER TABLE users RENAME TO accounts; -- rollback ALTER TABLE accounts RENAME TO users; ``` ClickHouse emits `RENAME TABLE old TO new;` (ClickHouse supports this as a standalone statement). 
## Foreign Key and Index Diff When a schema snapshot exists, `make-migrations` also detects changes to foreign keys and indexes by comparing the snapshot's stored constraints and indexes against the model's declared relationships and indexes. ### Foreign Key vs Index limitations and edge cases - **Silent skip on missing ref**: If an FK references a table that does not exist in the snapshot, the FK is silently skipped (no error, no SQL). This prevents generating broken SQL but can be surprising. To ensure the FK is emitted, make sure the referenced table exists in the snapshot before running `make-migrations`. - **Content-based comparison (not name-based)**: Both FKs and indexes are compared by their structural properties, not their names. Renaming a constraint or index does not produce a drop+add. - **ClickHouse**: FK and index operations emit comment-only placeholders (not supported). - **SQLite FKs**: Not directly alterable. A comment suggesting table recreation is emitted. ### Foreign Key Detection | Change | Detection | Generated SQL | |--------|-----------|---------------| | **FK added** | FK present in model columns but absent from snapshot constraints | `ALTER TABLE ... ADD CONSTRAINT ... FOREIGN KEY ...` | | **FK dropped** | FK present in snapshot constraints but absent from model columns | `ALTER TABLE ... DROP CONSTRAINT ...` (or `DROP FOREIGN KEY` on MySQL/MariaDB) | FKs are compared by content (columns, referenced table, referenced columns), not by name. This means renaming an FK constraint is not treated as a drop+add. **Validation:** Before emitting an `ADD FOREIGN KEY`, the diff engine verifies that the referenced table and columns exist in the snapshot. If they don't, the FK is silently skipped to avoid generating broken SQL. **Deferrable constraints (Postgres only):** When detected, `DEFERRABLE INITIALLY DEFERRED` is appended to the constraint SQL. **SQLite:** FK constraints are not directly alterable. 
A comment is emitted suggesting table recreation. ### Index Detection | Change | Detection | Generated SQL | |--------|-----------|---------------| | **Index added** | Index present in model but absent from snapshot indexes | `CREATE [UNIQUE] INDEX ... ON table (columns)` | | **Index dropped** | Index present in snapshot but absent from model indexes | `DROP INDEX ...` | **Full-content comparison:** Indexes are compared by **all** of these attributes, not just columns + unique. Any difference triggers a drop+add: | Attribute | SQL Clause | Backend | Example | |-----------|-----------|---------|---------| | `using` | `USING ` | PostgreSQL, SQLite (partial) | `USING gin`, `USING gist`, `USING hash` | | `unique` | `UNIQUE` | All | `CREATE UNIQUE INDEX` | | `where` | `WHERE ` | PostgreSQL | `WHERE status = 'active'` | | `include` | `INCLUDE ()` | PostgreSQL | `INCLUDE (email, name)` | | `with_params` | `WITH ()` | PostgreSQL | `WITH (fillfactor = 70)` | | `tablespace` | `TABLESPACE ` | PostgreSQL | `TABLESPACE fast_space` | | `nulls_not_distinct` | `NULLS NOT DISTINCT` | PostgreSQL 15+ | On unique indexes | | `column_sorting` | Per-column `ASC/DESC NULLS FIRST/LAST` | PostgreSQL | `col1 DESC NULLS LAST, col2 ASC` | | `type` | `TYPE ` | ClickHouse | `TYPE minmax`, `TYPE bloom_filter` | | `granularity` | `GRANULARITY ` | ClickHouse | `GRANULARITY 3` | | `concurrently` | `CONCURRENTLY` | PostgreSQL | `--concurrent` / `--no-concurrent` | Omitted attributes or `None`-valued attributes are treated as defaults (btree, no partial, no INCLUDE, etc.), so a plain `Index("ix", "col")` produces the same signature across versions. **Name generation:** Auto-generated index names follow the pattern: - `idx_{table}_{col1}_{col2}` for non-unique indexes - `uq_{table}_{col1}_{col2}` for unique indexes - Non-btree `USING` methods append a suffix: `idx_{table}_{col}_{method}` **Backend specifics:** - PostgreSQL uses `CREATE INDEX CONCURRENTLY` by default (`--concurrent`). 
Use `--no-concurrent` inside transaction blocks. - SQLite and MySQL use standard `CREATE INDEX`. - ClickHouse generates `ALTER TABLE ... ADD INDEX ... TYPE GRANULARITY ` for `ChIndexSpec` entries in `ch_indexes`; standard SQL indexes still emit a comment. ## Statement ordering Operations in the generated migration are ordered consistently: ``` RENAME TABLE (0) : table renames first (all subsequent ops use new name) RENAME COLUMN (1) ALTER COLUMN TYPE (2) ALTER COLUMN NULLABLE (3) ALTER COLUMN DEFAULT (4) CREATE TABLE (5) ADD COLUMN (6) ALTER FOREIGN KEY (7) : FK adds and drops ALTER INDEX (8) : index adds and drops DROP COLUMN (9) DROP TABLE (10) ``` Table renames are ordered first so that all subsequent statements reference the new table name. ## Generated artifacts When a migration is generated, DBWarden writes two files side by side: - `{database_name}__{version}_{description}.sql` - `{database_name}__{version}_{description}.plan.json` The companion plan file contains machine-readable metadata about the generated migration: - `migration_id` - `operations`: each operation includes `type`, `table`, `severity` and optionally `resolved_from` (for rename operations) - `required_flags` - `checksum` Example with rename: ```json { "migration_id": "primary__0003_rename_column_users_username", "operations": [ { "type": "rename_column", "table": "users", "new_name": "email", "severity": "INFO", "resolved_from": "rename_flag" }, { "type": "add_column", "table": "users", "column": "phone", "severity": "INFO" } ], "required_flags": [], "checksum": "sha256..." } ``` Possible `resolved_from` values: | Value | Meaning | |-------|---------| | `"rename_flag"` | Explicitly declared via `--rename` or `--rename-table` CLI flag | | `"prompt"` | Confirmed interactively by the user | | (absent) | Auto-detected rename kept without prompt (currently unused, reserved) | `--plan` switches the command into JSON-output mode. 
In that mode DBWarden prints the plan to stdout and does not write the `.sql` or `.plan.json` files. ## Auto-Generated Names When no description is provided, DBWarden automatically generates a descriptive name from the schema changes: | Change | Generated Name | |--------|----------------| | Single CREATE TABLE | `create_table_tablename` | | Multiple CREATE TABLE | `create_tables_users_posts` | | Single ADD COLUMN | `add_column_tablename_columnname` | | Multiple ADD COLUMN (same table) | `add_columns_tablename_col1_col2` | | Single RENAME COLUMN | `rename_column_tablename_new_name` | | Multiple RENAME COLUMN (same table) | `rename_columns_tablename_col1_col2` | | Single RENAME TABLE | `rename_table_oldname_newname` | | Multiple RENAME TABLE | `rename_tables_old1_old2` | | Single ALTER COLUMN TYPE | `alter_column_type_tablename_col` | | Single ALTER COLUMN NULLABLE | `alter_column_nullable_tablename_col` | | Single ALTER COLUMN DEFAULT | `alter_column_default_tablename_col` | | Single ADD FOREIGN KEY | `add_foreign_key_tablename_ref_table` | | Single DROP FOREIGN KEY | `drop_foreign_key_tablename` | | Single ADD INDEX | `add_index_tablename_col` | | Single DROP INDEX | `drop_index_tablename` | | Single RECREATE CH TABLE | `recreate_ch_table_tablename` | | ADD + DROP (same table) | `alter_tablename_col1_col2` | | Changes across tables | `add_column_users_email_and_1_more_tables` | | Many targets | `add_columns_tablename_col1_col2_and_3_more` | ### Name Rules - Snake case throughout. - Operation words pluralized for multiple targets (e.g., `add_column` → `add_columns`). - Mixed operations use `alter`. - Max 72 characters (table/target names truncated as needed). 
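As a rough illustration of the name rules above, the following sketch builds a description for changes on a single table. It is a simplification under stated assumptions: `describe_changes` is a hypothetical helper, it only covers column operations on one table, it pluralizes by appending `s` (which fits `add_column`, `drop_column`, and `rename_column` but not every operation word), and it truncates the whole string rather than individual table or target names.

```python
from typing import List, Tuple

MAX_NAME_LENGTH = 72  # limit stated in the name rules


def describe_changes(table: str, changes: List[Tuple[str, str]]) -> str:
    """Build a snake_case description for changes on a single table.

    `changes` is a list of (operation, column) pairs,
    for example ("add_column", "email"). Hypothetical input shape.
    """
    if not changes:
        raise ValueError("at least one change is required")
    operations = {op for op, _ in changes}
    columns = [col for _, col in changes]
    if len(operations) > 1:
        prefix = "alter"                 # mixed operations use `alter`
    elif len(changes) > 1:
        prefix = operations.pop() + "s"  # add_column -> add_columns
    else:
        prefix = operations.pop()
    name = "_".join([prefix, table, *columns])
    # Naive truncation: cut the string and drop any trailing underscore.
    return name[:MAX_NAME_LENGTH].rstrip("_")


print(describe_changes("users", [("add_column", "email")]))
# add_column_users_email
print(describe_changes("users", [("add_column", "email"), ("add_column", "name")]))
# add_columns_users_email_name
print(describe_changes("users", [("add_column", "email"), ("drop_column", "legacy_field")]))
# alter_users_email_legacy_field
```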
## Examples ```bash # Creates primary__0001_create_table_users.sql + .plan.json $ dbwarden make-migrations --database primary # Creates primary__0002_add_column_users_email.sql + .plan.json $ dbwarden make-migrations --database primary # Creates primary__0003_rename_column_users_email.sql with a confirmed rename $ dbwarden make-migrations --database primary --rename users.username:email # Creates primary__0003_alter_column_type_users_bio.sql with type change $ dbwarden make-migrations --database primary # Creates primary__0004_add_columns_users_email_name.sql + .plan.json $ dbwarden make-migrations --database primary # Creates primary__0003_rename_table_users_accounts.sql with a confirmed table rename $ dbwarden make-migrations --database primary --rename-table users:accounts # Uses safe multi-step type change for PostgreSQL $ dbwarden make-migrations --database primary --safe-type-change # Uses custom name $ dbwarden make-migrations "initial_schema" --database primary # Preview plan JSON without writing files $ dbwarden make-migrations --database primary --plan ``` ## Notes - Generated file includes both `-- upgrade` and `-- rollback`. - Generated `.plan.json` files are useful for CI checks and debugging. - If no models are discovered, configure `model_paths` explicitly. - With `--dev`, translation can target dev SQLite behavior. - Schema snapshots are written to `.dbwarden/schemas/` after each successful `migrate`: see [Schema Snapshots](schema-snapshots.md). - Column-level diff (type/null/default changes) works with a cached schema snapshot, or via a live snapshot taken automatically by `make-migrations` when no cached snapshot exists. - Without a cached snapshot, `make-migrations` takes a full schema snapshot from the live database internally and detects column-level changes. Only rename detection requires a cached snapshot. - For authoring guidelines and the review checklist, see [Migration File Format](../migration-files.md). 
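Since `.plan.json` files are machine-readable, a CI job can inspect them before a migration is merged. The sketch below is one possible gate, not a DBWarden feature: the helper name `plan_problems` and the policy (fail on `CRITICAL` operations or any entry in `required_flags`) are assumptions, and only the `operations`, `type`, `table`, `severity`, and `required_flags` fields documented above are read.

```python
import json
from pathlib import Path
from typing import Iterable, List


def plan_problems(plan: dict, blocked: Iterable[str] = ("CRITICAL",)) -> List[str]:
    """Return human-readable reasons why a plan should fail a CI gate."""
    blocked = set(blocked)
    problems = []
    for op in plan.get("operations", []):
        if op.get("severity") in blocked:
            problems.append(
                f"{op.get('type')} on {op.get('table')} has severity {op.get('severity')}"
            )
    for flag in plan.get("required_flags", []):
        problems.append(f"requires flag {flag}")
    return problems


def check_plan_file(path: Path) -> List[str]:
    """Load one .plan.json file from disk and evaluate it."""
    return plan_problems(json.loads(path.read_text()))


example_plan = {
    "migration_id": "primary__0003_rename_column_users_username",
    "operations": [
        {"type": "rename_column", "table": "users", "severity": "INFO"},
        {"type": "add_column", "table": "users", "severity": "INFO"},
    ],
    "required_flags": [],
}
print(plan_problems(example_plan))  # []
```

A CI step could call `check_plan_file` for each plan file in the migrations directory and exit non-zero when any list is non-empty.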
See also: [Migration File Format](../migration-files.md), [Schema Snapshots](schema-snapshots.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/make-rollback/ ======================================================================== # `make-rollback` Generate a rollback SQL file for a given migration file. ## Usage ```bash $ dbwarden make-rollback migrations/primary__0005_add_table.sql ``` ## Arguments - `MIGRATION_FILE` (required) - Path to the migration SQL file ## Output Creates a `.rollback.sql` file next to the given migration file with auto-generated rollback statements. ## Supported reverse transformations | Upgrade Pattern | Generated Rollback | |----------------|-------------------| | `CREATE TABLE t (...)` | `DROP TABLE IF EXISTS t;` | | `CREATE MATERIALIZED VIEW v AS ...` | `DROP VIEW IF EXISTS v;` | | `CREATE DICTIONARY d (...)` | `DROP DICTIONARY IF EXISTS d;` | | `ALTER TABLE t ADD COLUMN c ...` | `ALTER TABLE t DROP COLUMN c;` | | `CREATE INDEX i ON t (...)` | `DROP INDEX IF EXISTS i;` | | `CREATE UNIQUE INDEX i ON t (...)` | `DROP INDEX IF EXISTS i;` | | Other patterns | Comment-only placeholder | ## Notes - generated rollback is conservative: it may not handle all edge cases - always review the generated rollback before using it - for best results, write manual rollback SQL in the `-- rollback` section of the original migration ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/migrate/ ======================================================================== # `migrate` Apply pending migrations. 
## Usage ```bash $ dbwarden migrate --database primary $ dbwarden migrate --all $ dbwarden migrate --database primary --to-version 0010 $ dbwarden migrate --database primary --count 2 $ dbwarden migrate --database primary --with-backup --backup-dir ./backups $ dbwarden migrate --database primary --baseline --to-version 0005 ``` ## Options - `--database`, `-d` - `--all`, `-a` - `--count`, `-c` - `--to-version`, `-t` - `--baseline` - `--with-backup`, `-b` - `--backup-dir` - `--dry-run`: preview changes without applying - `--sandbox`: apply in a temporary sandbox database - `--apply-seeds`: apply pending seeds after migrations - `--verbose`, `-v` ## Notes - creates metadata/lock tables if needed - executes versioned + repeatable migrations - uses lock protection to prevent concurrent migration mutation See also: [Your First Migration](../getting-started/first-migration.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/new/ ======================================================================== # `new` Create a manual migration file. 
## Usage ```bash $ dbwarden new "manual hotfix" --database primary $ dbwarden new "backfill users" --database primary --version 0042 $ dbwarden new "seed data" --database primary --type ra $ dbwarden new "update view" --database primary --type roc ``` ## Options - positional `description` - `--database`, `-d` - `--version` - `--type`, `-t`: Migration type: `versioned` (default), `ra` / `runs_always`, or `roc` / `runs_on_change` ## Notes - use when change is not model-driven - file is scaffolded with `-- upgrade` and `-- rollback` sections See also: [Your First Migration](../getting-started/first-migration.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/rollback/ ======================================================================== # `rollback` Rollback applied migrations using `-- rollback` SQL sections. ## Usage ```bash $ dbwarden rollback --database primary $ dbwarden rollback --database primary --count 2 $ dbwarden rollback --database primary --to-version 0007 ``` ## Options - `--database`, `-d` - `--count`, `-c` - `--to-version`, `-t` - `--verbose`, `-v` ## Notes - rollback runs in reverse order - same lock discipline as migrate See also: [Your First Migration](../getting-started/first-migration.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/schema-snapshots/ ======================================================================== # Schema Snapshots Schema snapshots are JSON files that record the full DDL state of a database at the point a migration was applied. They enable offline migration generation, intelligent rename detection, and column-level change detection (type, nullability, default). 
## How they work After each versioned migration is successfully applied by `migrate`, DBWarden extracts the complete schema from the live database and writes it as a `.schema.json` file: ``` .dbwarden/schemas/ primary__0001_init.schema.json primary__0002_add_email.schema.json primary__0003_create_posts.schema.json ``` ### Snapshot contents Each snapshot captures: ```json { "format_version": 2, "migration_id": "primary__0003_create_posts", "database_name": "primary", "database_type": "postgresql", "applied_at": "2026-06-07T10:30:00Z", "checksum": "sha256...", "tables": { "users": { "object_type": "table", "columns": { "id": { "name": "id", "type": "integer", "nullable": false, "primary_key": true, "default": null, "comment": null, "pg_column": {} }, "email": { "name": "email", "type": "varchar", "nullable": false, "primary_key": false, "default": null, "comment": null, "pg_column": {} } }, "backend_table_spec": {"backend": "postgresql"}, "comment": null } }, "enums": {}, "indexes": {}, "constraints": {} } ``` Fields added in format v2: | Field | Level | Purpose | |-------|-------|---------| | `backend_table_spec` | Per-table | Backend-specific table options (e.g., `ch_engine`, `pg_fillfactor`, `my_engine`) | | `pg_column` / `ch_column` / `my_column` | Per-column | Backend-specific column metadata (e.g., `ch_type`, `ch_codec`, `pg_type`, `my_unsigned`) | | `object_type` | Per-table | `"table"`, `"materialized_view"`, or `"dictionary"` | | `name` | Per-column | Column name (v1 stored names as keys; v2 stores them in both key and value) | **Backward compatibility:** Snapshots with `format_version: 1` are automatically normalized to v2 when read. V1 keys like `clickhouse_options`, `pg_table`, `my_table`, and per-table `indexes`/`primary_key` are mapped to their v2 equivalents. ### Integrity Snapshots are self-integrity-checked with a SHA-256 checksum. 
If a snapshot file is tampered with (manually edited, corrupted), `read_snapshot()` returns `None` and `make-migrations` falls back to the live database. ## Why snapshots? ### Offline diffing Once the first snapshot exists, `make-migrations` can generate new migrations **without a live database connection**. This is useful in: - CI pipelines where only model files are available - Air-gapped environments - Development setups without a full database ### Rename detection (columns) Rename detection relies on comparing the snapshot's columns against the model's columns: 1. A column present in the snapshot but absent from the model → **dropped**. 2. A column present in the model but absent from the snapshot → **added**. 3. If a dropped column and an added column share the same normalized type, they are candidates for a **rename**. Without snapshots (the legacy live-DB fallback path), rename detection is not possible because the live DB already reflects the current state and has no record of dropped columns. ### Rename detection (tables) Table renames are detected by comparing tables present in the snapshot but absent from models (dropped) against tables absent from the snapshot but present in models (added). A **column overlap heuristic** computes the ratio of matching column names+types between the two tables. If the ratio is ≥ 0.6, the pair is a rename candidate. Detected table renames are prompted interactively (TTY) or suggested via the `--rename-table` flag (CI). Confirmed renames emit `ALTER TABLE ... RENAME TO` and are applied to the snapshot before column-level processing, so subsequent column renames reference the new table name. ### Column-level change detection Snapshots also enable `make-migrations` to detect when a column's **type**, **nullability**, or **default** has changed. 
For columns present in both the snapshot and the model with the same name, the diff engine compares: - **Normalized type**: If `varchar` in the snapshot but `text` in the model, an `ALTER COLUMN TYPE` operation is emitted. - **Nullable flag**: If nullable differs, `SET NOT NULL` or `DROP NOT NULL` is generated. - **Default value**: If the default differs, `SET DEFAULT` or `DROP DEFAULT` is generated. When a cached snapshot exists, these operations are compared against the historical schema. Without a cached snapshot, `make-migrations` takes a full schema snapshot from the live database internally and detects column-level changes (type, nullability, default) from that live snapshot. Only rename detection requires a cached snapshot. ### Foreign key and index change detection Snapshots also store the database's foreign keys (in `constraints`) and indexes (in `indexes`). The diff engine compares these against the model's declared FK relationships (parsed from `ModelColumn.foreign_key`) and indexes (extracted from `__table__.indexes`). Comparisons are content-based (columns, referenced table, referenced columns for FKs; columns + unique flag for indexes), so constraint/index name changes are not treated as drop+add. ### Audit trail Every schema snapshot is an immutable record of the database schema at a specific migration version. You can inspect any historical snapshot to see exactly what the schema looked like. ## How they are created Snapshots are created automatically by the `migrate` command after applying a versioned migration: ``` $ dbwarden migrate --database primary ``` The snapshot is written **after** all pending migrations have been applied. If the write fails (permission issue, disk full, etc.), a warning is logged but the migration itself is not rolled back. Failure to write a snapshot is non-fatal. 
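Because each snapshot is plain JSON, two historical snapshots can be compared directly when auditing a schema. The following is a minimal sketch, assuming the v2 layout shown under "Snapshot contents" (`tables` → `columns` → `type` / `nullable` / `default`); the helper names are hypothetical and this is not DBWarden's own diff engine.

```python
import json
from pathlib import Path
from typing import Dict, List

COMPARED_FIELDS = ("type", "nullable", "default")


def load_tables(path: Path) -> Dict[str, dict]:
    """Read a .schema.json file and return its `tables` mapping."""
    return json.loads(path.read_text()).get("tables", {})


def column_changes(old_tables: Dict[str, dict], new_tables: Dict[str, dict]) -> List[str]:
    """List type/nullable/default differences for columns present in both snapshots."""
    changes = []
    for table, new_table in new_tables.items():
        old_cols = old_tables.get(table, {}).get("columns", {})
        for name, new_col in new_table.get("columns", {}).items():
            old_col = old_cols.get(name)
            if old_col is None:
                continue  # added column or new table: not a column-level change
            for field in COMPARED_FIELDS:
                if old_col.get(field) != new_col.get(field):
                    changes.append(
                        f"{table}.{name}: {field} "
                        f"{old_col.get(field)!r} -> {new_col.get(field)!r}"
                    )
    return changes


old = {"users": {"columns": {"bio": {"type": "varchar", "nullable": True, "default": None}}}}
new = {"users": {"columns": {"bio": {"type": "text", "nullable": True, "default": None}}}}
print(column_changes(old, new))  # ["users.bio: type 'varchar' -> 'text'"]
```

With files on disk, the same comparison would be `column_changes(load_tables(old_path), load_tables(new_path))`.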
### Snapshots are NOT created for - `--dry-run` or `--sandbox` runs (no real schema change) - Rollback operations (snapshot remains as-is for audit) - Repeatable migrations (`RA__`, `ROC__`) ## Rollback and re-apply - **Rollback does not delete the snapshot.** The snapshot stays as an audit record of what was applied. - **Re-applying a migration** overwrites the snapshot with the current schema state. ## Finding the latest snapshot `make-migrations` uses `find_latest_snapshot()` which scans `.dbwarden/schemas/` for snapshot files matching the current database name and picks the one with the highest version prefix (e.g., `0003` > `0002`). If no snapshot exists for the database, `make-migrations` takes a full schema snapshot from the live database internally and runs the standard diff pipeline against it. This enables column-level change detection (type, nullability, default). Only rename detection requires a cached snapshot. ## Snapshot lifecycle summary | Event | Snapshot | |-------|----------| | First `migrate` | Created after apply | | Subsequent `migrate` | Overwritten with latest schema | | `rollback` | Unchanged (kept as audit) | | Re-apply same version | Overwritten | | `--dry-run` / `--sandbox` | Not written | | `make-migrations` (snapshot exists) | Read for diff + rename detection | | `make-migrations` (no snapshot) | Fallback to live DB diff | ## DB-agnostic type normalization Column types in the snapshot are normalized to a canonical set so that equivalent types across databases are treated the same: | Canonical type | Matches | |----------------|---------| | `integer` | INT, INTEGER, INT4, TINYINT, SMALLINT | | `biginteger` | BIGINT, INT8 | | `varchar` | VARCHAR, CHARACTER VARYING | | `text` | TEXT, LONGTEXT, CLOB | | `boolean` | BOOLEAN, BOOL | | `timestamp` | TIMESTAMP, DATETIME | | `numeric` | NUMERIC, DECIMAL (with precision/scale) | | `float` | FLOAT, REAL, DOUBLE | | `bytes` | BYTEA, BLOB, BINARY | | `uuid` | UUID | | `enum` | ENUM | | (unknown) | 
Stored as-is with `"raw": true` | This normalization is what powers the rename detection: two columns with the same normalized type are candidates for rename, even if their raw SQL type strings differ. ## Edge Cases and Restrictions ### Rename detection - **Ambiguous renames**: When multiple columns of the same type are dropped and added, all possible pairs are treated as renames (not just one). This maximizes detection but may produce false positives that must be confirmed interactively or via `--rename`. - **Type change prevents rename**: If a dropped column and an added column have different normalized types, they are never auto-detected as renames. Use `--rename` to force the rename anyway. - **Same name**: If a column with the same name exists in both the snapshot and the model, no rename is detected even if its type changes (that is handled by type-change detection). ### Table rename detection - **Column-overlap heuristic**: The ratio is `matching_columns / max(len(snapshot_cols), len(model_cols))`. The 0.6 threshold is intentionally conservative. - **Empty tables**: Either table having zero columns results in a ratio of `0.0` (no candidate). - **Table rename + column diff interaction**: Confirmed table renames are applied to the snapshot before column diffs are computed. This ensures column renames and other column-level changes reference the new table name. ### Foreign key and index detection - **Silent skip on missing ref**: If an FK references a table that does not exist in the snapshot, the FK is silently skipped (no error, no SQL emitted). This prevents broken SQL but can be surprising. Ensure the referenced table exists in the snapshot first. - **Content-based comparison**: FKs are compared by `(columns, referenced_table, referenced_columns)`. Indexes are compared by `(frozenset(columns), unique)`. Renaming a constraint or index does not produce a drop+add. 
- **ClickHouse**: FK and index operations emit comment-only placeholders (not supported by ClickHouse). - **SQLite FKs**: A comment is emitted suggesting table recreation (not directly alterable). ### Column-level change detection - **Cached snapshot not required**: Column-level diff works with a live snapshot taken automatically by `make-migrations`. A cached snapshot enables rename detection in addition to column-level diff. - **Backend limits**: Type changes emit different SQL per backend. SQLite emits comment-only placeholders for type and nullable changes. ClickHouse auto-generates `MODIFY COLUMN` for type, nullable, and LowCardinality changes. Default changes work uniformly across all backends. ### Integrity - **Tampered snapshots**: If the checksum does not match, `read_snapshot()` returns `None`, and `make-migrations` falls back to the live database. A warning is logged. - **Checksum-excluded fields**: The `checksum` field itself is excluded from the hash computation, so checksum updates do not cascade. ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/seed/ ======================================================================== # `seed` Manage seed data for a database. ## Subcommands - `seed create`: create a new file seed (legacy) - `seed apply`: apply pending seeds (file + code seeds) - `seed list`: list seeds and their status - `seed rollback`: roll back applied seeds - `seed export`: export code seeds to ROC SQL files for stateless production application --- ## `seed create` Create a new file-based seed file (SQL or Python). For new projects, prefer [code seeds](../seeds.md#code-seeds-recommended) instead. 
### Usage ```bash $ dbwarden seed create "seed initial data" --database primary $ dbwarden seed create "populate lookup tables" --database primary --type python ``` ### Options - `--database`, `-d`: target database handle - `--type`: `sql` (default) or `python` - `--verbose`, `-v` --- ## `seed apply` Apply pending seeds. Both file seeds and [code seeds](../seeds.md#code-seeds-recommended) are discovered and applied. ### Usage ```bash $ dbwarden seed apply --database primary $ dbwarden seed apply --database primary --version 0003 $ dbwarden seed apply --database primary --dry-run $ dbwarden seed apply --all ``` ### Options - `--database`, `-d` - `--all`, `-a`: apply across all configured databases - `--version`: apply up to this seed version - `--dry-run`: preview without executing - `--verbose`, `-v` --- ## `seed list` List seeds and their applied status. Includes both file seeds and code seeds. ### Usage ```bash $ dbwarden seed list --database primary $ dbwarden seed list --all $ dbwarden seed list --prune # clean up orphaned tracking records ``` ### Options - `--database`, `-d` - `--all`, `-a` - `--prune`: remove tracking records for seed files that no longer exist on disk - `--verbose`, `-v` --- ## `seed rollback` Roll back applied seeds. Removes the tracking record, allowing the seed to be re-applied. Does **not** reverse data changes. ### Usage ```bash $ dbwarden seed rollback --database primary $ dbwarden seed rollback --database primary --count 2 $ dbwarden seed rollback --database primary --to-version 0003 ``` ### Options - `--database`, `-d` - `--all`, `-a`: rollback on all databases - `--count`, `-c`: number of seeds to roll back (default: 1) - `--to-version`, `-t`: roll back to this seed version - `--verbose`, `-v` See also: [Seed Management](../seeds.md) --- ## `seed export` Export code seeds to ROC (runs-on-change) SQL files for stateless application. The generated file contains `INSERT ... 
ON CONFLICT` statements rendered in the target database dialect. ROC files are re-applied when their content checksum changes. ### Usage ```bash $ dbwarden seed export --database primary $ dbwarden seed export --all $ dbwarden seed export --database clickhouse --output-dir ./seeds ``` ### Options - `--database`, `-d`: target database handle - `--all`, `-a`: export seeds for all configured databases - `--output-dir`, `-o`: output directory (default: `seeds/`) ### Behavior - **Row-based seeds** (`rows = [...]`): each row is rendered as an `INSERT` statement with `ON CONFLICT` matching the seed's `__seed_on_conflict__` - **Logic-based seeds** (`generate(session)`): executed in a temporary SQLite database with FK-closure tables created and preceding row-based seeds pre-loaded. The resulting rows are exported as INSERT statements - Seeds are ordered by FK dependency (topological sort) so foreign-key-safe insert order is preserved ### Dialect requirement Exporting requires the same dialect packages as connecting to that database. For ClickHouse, install `clickhouse-sqlalchemy`. Missing packages produce a clear error at export time. ### Non-handled problems - Removed rows are not deleted (no purge on re-export) - Logic seeds that depend on other logic seeds' output are unsupported - Non-deterministic `generate()` methods produce a new checksum every export ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/settings/ ======================================================================== # `settings` View DBWarden configuration. All database settings are defined in Python code via `database_config()`, so `settings show` is a read-only command for inspecting the current configuration. 
## `settings show` ### Usage ```bash $ dbwarden settings show $ dbwarden settings show primary $ dbwarden settings show --all ``` ### Options - `--all`, `-a`: show all configured databases ### Example output ``` Database: PRIMARY (default) • Default: True • Type: SQLite • URL: sqlite:///./app.db • Migrations Directory: migrations/primary • Migration Table: _dbwarden_migrations • Seed Table: _dbwarden_seeds • Model Paths: ['app'] • Dev Database Type: None • Dev Database URL: None • Overlap Models: False ``` ## See also - [`database list`](./database.md) - [Configuration docs](../configuration/index.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/snapshot/ ======================================================================== # `snapshot` Output the DDL schema of a specific database table. ## Usage ```bash $ dbwarden snapshot users --database primary ``` ## Options - `TABLE` (required) - Name of the table to snapshot - `--database`, `-d` ## Output For standard SQL databases (SQLite, PostgreSQL, MySQL, MariaDB): - `CREATE TABLE` statement with column types, nullability, and defaults - `CREATE INDEX` statements - Foreign key constraints For ClickHouse: - The raw `CREATE TABLE` query from `system.tables` ## Notes - output is printed to stdout - useful for debugging schema differences or documenting table structure - internally uses `sqlalchemy.inspect()` for generic databases ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/status/ ======================================================================== # `status` Show migration status (applied vs pending). 
## Usage ```bash $ dbwarden status --database primary $ dbwarden status --all ``` ## Options - `--database`, `-d` - `--all`, `-a` ## Notes - run before and after migration execution - supports multi-database status with `--all` See also: [Your First Migration](../getting-started/first-migration.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/commands/version/ ======================================================================== # `version` Show DBWarden version. ## Usage ```bash $ dbwarden version ``` ## Notes - useful for support/debug and release verification ======================================================================== PAGE: https://dbwarden.emiliano-go.com/configuration/concepts/ ======================================================================== # Configuration Concepts Understand how DBWarden configuration works under the hood. ## What is Configuration? Configuration tells DBWarden: - **Where** your databases are (connection URLs) - **What** kind of databases they are (PostgreSQL, SQLite, etc.) - **Where** your SQLAlchemy models live (for migration generation) - **Where** to store migrations (directories) ## Why Python Configuration? 
### Type Safety Your IDE can help you: ```python primary = database_config( database_name="primary", # IDE suggests parameter names default=True, # IDE knows this is boolean database_type="sqlite", # IDE can validate enum values database_url_sync="...", ) ``` ### Dynamic Configuration You can use Python logic: ```python import os # Different config per environment environment = os.getenv("ENV", "dev") if environment == "production": database_url = "postgresql://prod-host/myapp" else: database_url = "sqlite:///./dev.db" primary = database_config( database_name="primary", default=True, database_type="postgresql" if environment == "production" else "sqlite", database_url_sync=database_url, ) ``` ### Multiple Databases Easy to configure multiple databases: ```python DATABASES = { "primary": "postgresql://localhost/main", "analytics": "postgresql://localhost/analytics", "logging": "postgresql://localhost/logs", } for name, url in DATABASES.items(): db = database_config( database_name=name, default=(name == "primary"), database_type="postgresql", database_url_sync=url, model_paths=[f"app.models.{name}"], ) ``` ## Configuration Loading ### Discovery Order DBWarden searches for configuration in this order: ``` 1. dbwarden.py in current directory not found 2. dbwarden.py in parent directories not found 3. Full scan for files with database_config() not found 4. DBWARDEN_CONFIG_MODULE environment variable not found Error: No configuration found ``` ### When Configuration Loads Configuration loads when you run **any** DBWarden command: ```bash $ dbwarden migrate # Config loads here $ dbwarden status # Config loads here $ dbwarden history # Config loads here ``` **Load process:** 1. Python imports your config module 2. `database_config()` calls execute 3. Databases register in internal registry 4. Validation runs 5. 
Command executes with loaded config ### Validation Rules DBWarden validates configuration at load time: | Rule | Why It Matters | |------|----------------| | Exactly one `default=True` | CLI needs to know which DB to use when `--database` is omitted | | Unique `database_name` | Commands target databases by name | | Unique `database_url` | Prevents accidental duplicate configurations | | Unique physical targets | Prevents two configs pointing to same DB with different credentials | | Required `model_paths` in multi-DB | Keeps model discovery boundaries clear | | No overlapping `model_paths` | Prevents ambiguous model ownership (unless `overlap_models=True`) | ### Validation Timing ```python # dbwarden.py primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/myapp", ) primary = database_config( database_name="primary", # Duplicate! default=True, database_type="postgresql", database_url_sync="postgresql://localhost/other", ) ``` When you run `dbwarden migrate`: ``` Error: Duplicate database_name 'primary' ``` Validation happens **before** any commands execute. ### Config Source Precedence When looking for config, DBWarden uses this precedence: 1. **Top-level `dbwarden.py`**: the conventional standalone config file at your project root. This is the default scaffold created by `dbwarden init`, not the only valid location. Always sandboxed (only `dbwarden` imports allowed). 2. **`DBWARDEN_CONFIG_MODULE`**: an explicit environment variable override. Always imported normally as a Python module (no sandbox). This is the escape hatch for projects with ambiguous full-scan results or non-standard layouts. 3. **Full-scan discovery**: if neither of the above produces a config source, DBWarden walks your project tree looking for any `database_config(...)` call. This means `database_config(...)` can live in any discovered Python file inside your project. 
Files directly at the project root are sandboxed; files inside subdirectories are imported normally.

### Config Loading Security (Sandbox)

DBWarden loads each config file in one of two modes, **isolated** (sandboxed) or **in-package** (normal import), and applies import restrictions only in isolated mode:

| Mode | Import behavior | Applies to |
|------|----------------|------------|
| `isolated` | Sandboxed: only `dbwarden.*` imports allowed | Top-level `dbwarden.py`; any full-scan-discovered file at the project root |
| `in-package` | Normal Python import | Full-scan-discovered files inside subdirectories; `DBWARDEN_CONFIG_MODULE` modules |

An isolated config file runs in a sandbox that prevents accidental escalation of file-read access to arbitrary code execution. Only `dbwarden` and its submodules can be imported.

An in-package config file is imported as a normal Python module, with full access to `app.*` and any other project imports. This is the correct path when your `database_config(...)` call lives in an application package that imports other project modules.

**Import root detection.** For full-scan-discovered files, DBWarden tries to resolve the dotted module path. It checks two common import roots in order:

- `src/` (PEP 517/518, setuptools, poetry)
- The project root itself

For example, `src/myapp/databases.py` resolves as `myapp.databases` with import root `src/`. If neither root produces an importable path, the file falls back to `isolated` (sandboxed). Projects with other layouts should set `DBWARDEN_CONFIG_MODULE` explicitly.

**Path validation** (path traversal blocking) applies to all file-based sources regardless of mode.

For debugging, set `DBWARDEN_DISABLE_SANDBOX=1` to disable the sandbox for isolated files:

```bash
DBWARDEN_DISABLE_SANDBOX=1 dbwarden status  # Skip sandbox (debug only)
```

With the sandbox disabled, isolated config files can import any module, which can be useful in development. Keep the sandbox enabled in production.
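The import root detection described above can be pictured with a small sketch. This is an illustration only, not DBWarden's actual implementation: `resolve_module_path` is a hypothetical helper, and it treats a path as "importable" when every component is a valid Python identifier, which is a simplification. It also ignores the separate rule that files directly at the project root are sandboxed.

```python
from pathlib import Path
from typing import Optional


def resolve_module_path(project_root: Path, config_file: Path) -> Optional[str]:
    """Return a dotted module path for config_file, or None if no import root fits.

    Hypothetical helper: mirrors the documented order (src/ first, then the
    project root) but does not attempt a real import.
    """
    for import_root in (project_root / "src", project_root):
        try:
            relative = config_file.relative_to(import_root)
        except ValueError:
            continue  # the file does not live under this import root
        parts = relative.with_suffix("").parts
        # Simplified importability check: every component is an identifier.
        if parts and all(part.isidentifier() for part in parts):
            return ".".join(parts)
    return None  # a caller would fall back to isolated (sandboxed) loading


root = Path("/project")
print(resolve_module_path(root, root / "src" / "myapp" / "databases.py"))
print(resolve_module_path(root, root / "my-app" / "config.py"))
```

The first call yields `myapp.databases`, matching the example in the text; the second yields `None` because `my-app` cannot appear in a dotted module path, which corresponds to the fallback case.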
## The `default` Database

### Why `default=True` Exists

Consider these commands:

```bash
# Explicit database
$ dbwarden migrate --database primary

# Implicit database (uses default)
$ dbwarden migrate
```

Without `default=True`, DBWarden wouldn't know which database to use for the second command.

### Only One Default

```python
# Good - exactly one default
primary = database_config(database_name="primary", default=True, ...)
analytics = database_config(database_name="analytics", default=False, ...)  # or omit default

# Bad - two defaults
primary = database_config(database_name="primary", default=True, ...)
analytics = database_config(database_name="analytics", default=True, ...)  # Error!
```

### Default Affects CLI Behavior

```bash
# These are equivalent when primary is default:
$ dbwarden migrate
$ dbwarden migrate --database primary

# These are NOT equivalent:
$ dbwarden migrate
$ dbwarden migrate --database analytics  # Targets analytics, not primary
```

## Model Discovery

### What Are `model_paths`?

`model_paths` tells DBWarden where your SQLAlchemy models live:

```python
primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync="postgresql://localhost/myapp",
    model_paths=["app.models"],  # Look here for models
)
```

### How Discovery Works

```
1. Import each module in model_paths
2. Find all classes inheriting from DeclarativeBase
3. Extract table metadata (__tablename__, columns, etc.)
4. Build internal representation for migration generation
```

### Filtering by Table Name

When two databases share the same `model_paths` but should own different subsets of tables, use `model_tables`:

```python
primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync="postgresql://localhost/main",
    model_paths=["app.models"],
    model_tables=["users", "posts", "comments"],
)

analytics = database_config(
    database_name="analytics",
    database_type="clickhouse",
    database_url_sync="http://clickhouse-host:8123/analytics",
    model_paths=["app.models"],
    model_tables=["analytics_events", "analytics_sessions"],
)
```

This is useful when all models live under one shared package but each database only owns a subset. DBWarden validates that every name in `model_tables` exists among the discovered tables and prevents overlap between databases (unless `overlap_models=True`).

### When Is It Required?

**Single database:** Optional (DBWarden scans the entire codebase)

```python
# This works
primary = database_config(
    database_name="primary",
    default=True,
    database_type="sqlite",
    database_url_sync="sqlite:///./app.db",
    # No model_paths needed
)
```

**Multiple databases:** Required for each database

```python
# This is required
primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync="postgresql://localhost/main",
    model_paths=["app.models.primary"],  # Required
)

analytics = database_config(
    database_name="analytics",
    database_type="postgresql",
    database_url_sync="postgresql://localhost/analytics",
    model_paths=["app.models.analytics"],  # Required
)
```

**Why?** To prevent ambiguity about which models belong to which database.

## Dev Mode

### What Is Dev Mode?
Dev mode lets you use a different database for local development: ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", # Production database_url_sync="postgresql://prod/myapp", dev_database_type="sqlite", # Development dev_database_url="sqlite:///./dev.db", ) ``` Run commands with `--dev`: ```bash $ dbwarden --dev migrate # Uses SQLite $ dbwarden migrate # Uses PostgreSQL ``` ### How It Works ``` Command: dbwarden --dev migrate Check for --dev flag Swap database_type → dev_database_type Swap database_url → dev_database_url Connect to dev database Execute command ``` ### Why Use It? **Speed:** - SQLite is faster than PostgreSQL for local iteration - No network latency - No server setup **Safety:** - Can't accidentally affect production - Each developer has their own isolated database - Easy to reset (just delete the file) **Simplicity:** - No Docker containers needed - No database server installation - Works on all platforms ## Multi-Database Configuration ### Why Multiple Databases? 
Common scenarios:

- **Separation of concerns** - Transactions vs analytics
- **Performance** - Offload reporting to separate database
- **Compliance** - Audit logs in separate database
- **Legacy systems** - New and old databases coexist

### How It Works

Each `database_config()` call registers an independent database:

```python
primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync="postgresql://localhost/main",
    model_paths=["app.models.primary"],
)

analytics = database_config(
    database_name="analytics",
    database_type="postgresql",
    database_url_sync="postgresql://localhost/analytics",
    model_paths=["app.models.analytics"],
)
```

They're completely independent:

- Separate migration histories
- Separate migration directories
- Separate model sets
- Can use different database types

### Model Path Boundaries

```
app/
  models/
    primary/
      user.py       # Goes to primary database
      order.py
    analytics/
      event.py      # Goes to analytics database
      metric.py
```

Configuration:

```python
primary = database_config(
    database_name="primary",
    model_paths=["app.models.primary"],  # Only primary models
    ...
)

analytics = database_config(
    database_name="analytics",
    model_paths=["app.models.analytics"],  # Only analytics models
    ...
)
```

## Secure Values

### What Is `secure_values`?

Prevents credentials from appearing in terminal output:

```python
import os

DATABASE_URL = os.getenv("DATABASE_URL")

primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync=DATABASE_URL,
    secure_values=True,  # Hide credentials
)
```

### Without `secure_values`:

```bash
$ dbwarden settings show
Database: primary
URL: postgresql://user:SECRET_PASSWORD@prod-host/myapp
```

### With `secure_values`:

```bash
$ dbwarden settings show --all
Database: primary
URL: DATABASE_URL (expression)
```

Shows the variable name instead of resolved value.
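`secure_values` covers DBWarden's own output. If application code prints or logs the same URL, the password would still appear there. The following is a minimal sketch of masking it with the standard library; `mask_url_password` is a hypothetical helper for your own code, not a DBWarden function, and it does not describe how DBWarden redacts values internally.

```python
from urllib.parse import urlsplit, urlunsplit


def mask_url_password(url: str, mask: str = "***") -> str:
    """Return url with the password component replaced by mask.

    Hypothetical helper for application-side logging.
    """
    parts = urlsplit(url)
    if parts.password is None:
        return url  # nothing to hide (e.g. SQLite file URLs)
    # netloc looks like "user:password@host:port"; keep everything but the password.
    userinfo, _, hostinfo = parts.netloc.rpartition("@")
    username = userinfo.split(":", 1)[0]
    return urlunsplit(parts._replace(netloc=f"{username}:{mask}@{hostinfo}"))


print(mask_url_password("postgresql://user:SECRET_PASSWORD@prod-host/myapp"))
print(mask_url_password("sqlite:///./dev.db"))
```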
## Configuration vs Runtime

### Configuration Time

When config loads:

- `database_config()` calls execute
- Validation runs
- Internal registry populates
- **No database connections made**

### Runtime

When commands run:

- DBWarden reads from registry
- **Connects to database**
- Executes command logic

**Key point:** Configuration errors are caught early, before any database operations.

## What's Next?

- **[Connection URLs](connection-urls.md)** - URL format reference
- **[Model Discovery](model-discovery.md)** - Deep dive into model paths
- **[Dev Mode](dev-mode.md)** - Local development workflows
- **[Multi-Database](multi-database.md)** - Multi-database patterns

========================================================================
PAGE: https://dbwarden.emiliano-go.com/configuration/connection-urls/
========================================================================

# Connection URLs

Complete reference for database connection URL formats.

## URL Format

All database URLs follow this general structure:

```
[dialect[+driver]]://[username[:password]@][host][:port][/database][?option=value&...]
```

## PostgreSQL

### Basic Format

```
postgresql://[user[:password]@][host][:port]/database[?options]
```

### Examples

**Local default:**

```python
database_url_sync="postgresql://localhost/myapp"
```

**With credentials:**

```python
database_url_sync="postgresql://user:password@localhost:5432/myapp"
```

**Remote host:**

```python
database_url_sync="postgresql://user:password@db.example.com:5432/myapp"
```

**With SSL:**

```python
database_url_sync="postgresql://user:password@localhost/myapp?sslmode=require"
```

**With connection pool:**

```python
database_url_sync="postgresql://user:password@localhost/myapp?pool_size=20&max_overflow=10"
```

### SSL Modes

| Mode | Description |
|------|-------------|
| `disable` | No SSL |
| `allow` | Try SSL, fall back to non-SSL |
| `prefer` | Try SSL first (default) |
| `require` | Require SSL, fail if unavailable |
| `verify-ca` | Require SSL + verify CA |
| `verify-full` | Require SSL + verify CA + hostname |

**Example:**

```python
database_url_sync="postgresql://user:pass@host/db?sslmode=verify-full&sslrootcert=/path/to/ca.pem"
```

### Common Options

| Option | Description | Example |
|--------|-------------|---------|
| `sslmode` | SSL connection mode | `sslmode=require` |
| `sslcert` | Client certificate | `sslcert=/path/to/cert.pem` |
| `sslkey` | Client key | `sslkey=/path/to/key.pem` |
| `sslrootcert` | CA certificate | `sslrootcert=/path/to/ca.pem` |
| `connect_timeout` | Connection timeout (seconds) | `connect_timeout=10` |
| `application_name` | App name in pg_stat_activity | `application_name=myapp` |

### Cloud Providers

**AWS RDS:**

```python
database_url_sync="postgresql://user:pass@mydb.abc123.us-east-1.rds.amazonaws.com:5432/myapp?sslmode=require"
```

**Google Cloud SQL:**

```python
database_url_sync="postgresql://user:pass@/myapp?host=/cloudsql/project:region:instance"
```

**Azure Database:**

```python
# The username has the form user@server, so its @ must be encoded as %40
database_url_sync="postgresql://user%40server:pass@server.postgres.database.azure.com:5432/myapp?sslmode=require"
```

**Heroku:**

```python
import os

database_url_sync=os.getenv("DATABASE_URL")  # Provided by Heroku
```

## SQLite

### Basic Format

```
sqlite:///[path]
```

### Examples

**Relative path:**

```python
database_url_sync="sqlite:///./app.db"
database_url_sync="sqlite:///./data/app.db"
```

**Absolute path:**

```python
database_url_sync="sqlite:////absolute/path/to/app.db"
```

**In-memory (testing only):**

```python
database_url_sync="sqlite:///:memory:"
```

In-memory databases are lost when the connection closes. Only use for testing.

### Common Options

| Option | Description | Example |
|--------|-------------|---------|
| `timeout` | Lock timeout (seconds) | `?timeout=20` |
| `check_same_thread` | Thread safety check | `?check_same_thread=false` |

**Example:**

```python
database_url_sync="sqlite:///./app.db?timeout=20"
```

## MySQL / MariaDB

### Basic Format

```
mysql://[user[:password]@][host][:port]/database[?options]
```

### Examples

**Local:**

```python
database_url_sync="mysql://root:password@localhost:3306/myapp"
```

**With charset:**

```python
database_url_sync="mysql://user:pass@localhost/myapp?charset=utf8mb4"
```

**With SSL:**

```python
database_url_sync="mysql://user:pass@localhost/myapp?ssl_ca=/path/to/ca.pem"
```

### Common Options

| Option | Description | Example |
|--------|-------------|---------|
| `charset` | Character set | `charset=utf8mb4` |
| `ssl_ca` | CA certificate | `ssl_ca=/path/to/ca.pem` |
| `ssl_cert` | Client certificate | `ssl_cert=/path/to/cert.pem` |
| `ssl_key` | Client key | `ssl_key=/path/to/key.pem` |

### MariaDB

MariaDB uses the same URL format as MySQL:

```python
database_url_sync="mysql://user:pass@localhost:3306/myapp"
```

Configure with `database_type="mariadb"`:

```python
primary = database_config(
    database_name="primary",
    database_type="mariadb",
    database_url_sync="mysql://localhost/myapp",
)
```

## ClickHouse

### Basic Format

```
http://[user[:password]@]host[:port]/database[?options]
```

### Examples

**Local:**

```python
database_url_sync="http://default:@localhost:8123/myapp"
```

**With authentication:**

```python
database_url_sync="http://user:password@localhost:8123/myapp"
```

**With HTTPS:**

```python
database_url_sync="https://user:password@clickhouse.example.com:8443/myapp"
```

### Common Options

| Option | Description | Example |
|--------|-------------|---------|
| `compression` | Enable compression | `compression=1` |
| `connect_timeout` | Connection timeout | `connect_timeout=10` |
| `send_timeout` | Send timeout | `send_timeout=300` |
| `receive_timeout` | Receive timeout | `receive_timeout=300` |

**Example:**

```python
database_url_sync="http://user:pass@localhost:8123/myapp?compression=1&connect_timeout=10"
```

## Environment Variables

### Basic Pattern

```python
import os

primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync=os.getenv("DATABASE_URL"),
)
```

### With Fallback

```python
import os

database_url = os.getenv("DATABASE_URL", "sqlite:///./dev.db")

primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql" if "postgresql" in database_url else "sqlite",
    database_url_sync=database_url,
)
```

### Required Environment Variables

```python
import os

DATABASE_URL = os.getenv("DATABASE_URL")
if not DATABASE_URL:
    raise ValueError("DATABASE_URL environment variable is required")

primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync=DATABASE_URL,
)
```

### Multiple Databases

```python
import os

primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync=os.getenv("PRIMARY_DATABASE_URL"),
)

analytics = database_config(
    database_name="analytics",
    database_type="postgresql",
    database_url_sync=os.getenv("ANALYTICS_DATABASE_URL"),
)
```

## URL Encoding

### Special Characters

If your password contains special characters, URL-encode them:

| Character | Encoded |
|-----------|---------|
| `@` | `%40` |
| `:` | `%3A` |
| `/` | `%2F` |
| `?` | `%3F` |
| `#` | `%23` |
| `&` | `%26` |
| `%` | `%25` |

**Example:**

Password: `p@ss:word`

```python
database_url_sync="postgresql://user:p%40ss%3Aword@localhost/myapp"
```

### Python URL Encoding

```python
from urllib.parse import quote_plus

username = "user"
password = "p@ss:word"
host = "localhost"
database = "myapp"

database_url = f"postgresql://{username}:{quote_plus(password)}@{host}/{database}"
# Result: postgresql://user:p%40ss%3Aword@localhost/myapp
```

## Connection Pools

### PostgreSQL Pool Options

```python
database_url_sync="postgresql://user:pass@localhost/myapp?pool_size=20&max_overflow=10&pool_timeout=30"
```

| Option | Description | Default |
|--------|-------------|---------|
| `pool_size` | Max connections in pool | 5 |
| `max_overflow` | Extra connections if pool full | 10 |
| `pool_timeout` | Wait time for connection (seconds) | 30 |
| `pool_recycle` | Recycle connections after (seconds) | -1 (never) |

### Connection Lifetime

Recycle connections after 1 hour:

```python
database_url_sync="postgresql://user:pass@localhost/myapp?pool_recycle=3600"
```

## Testing Connections

### Verify URL Format

```python
from sqlalchemy import create_engine, text

try:
    engine = create_engine("postgresql://user:pass@localhost/myapp")
    with engine.connect() as conn:
        # SQLAlchemy 2.x requires raw SQL strings to be wrapped in text()
        result = conn.execute(text("SELECT 1"))
    print("Connection successful!")
except Exception as e:
    print(f"Connection failed: {e}")
```

### Test with DBWarden

```bash
# Check configuration
$ dbwarden settings show

# Test connection
$ dbwarden check-db
```

## Common Mistakes

### Forgetting Port

**Port omitted (only correct if the server listens on the default port):**

```python
database_url_sync="postgresql://user:pass@localhost/myapp"  # Uses default port 5432
```

**If you need a different port:**

```python
database_url_sync="postgresql://user:pass@localhost:5433/myapp"
```

### Missing Slashes

**Wrong:**

```python
database_url_sync="sqlite://./app.db"  # Only 2 slashes
```

**Correct:**

```python
database_url_sync="sqlite:///./app.db"  # 3 slashes for relative path
database_url_sync="sqlite:////absolute/path/app.db"  # 4 slashes for absolute path
```

### Special Characters Not Encoded

**Wrong:**

```python
database_url_sync="postgresql://user:p@ss@localhost/myapp"  # @ not encoded
```

**Correct:**

```python
database_url_sync="postgresql://user:p%40ss@localhost/myapp"  # @ encoded as %40
```

## What's Next?

- **[Model Discovery](model-discovery.md)** - Configure model paths
- **[Dev Mode](dev-mode.md)** - Local development URLs
- **[Production Patterns](production-patterns.md)** - Real-world examples
- **[Troubleshooting](troubleshooting.md)** - Connection issues

========================================================================
PAGE: https://dbwarden.emiliano-go.com/configuration/credentials/
========================================================================

# Credentials and Secrets

Never hardcode database credentials in `dbwarden.py`. This page covers how to inject secrets safely.

## The problem

The quick-start examples show inline connection strings:

```python
# Do not ship this
primary = database_config(
    database_url_sync="postgresql://admin:s3cr3t@localhost:5432/myapp",
    ...
)
```

`dbwarden.py` is Python source. It ends up in version control. Credentials in source are a liability.

## Environment variables

The standard pattern: read from the environment at config load time.

```python
import os
from dbwarden import database_config

primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync=os.getenv("DATABASE_URL"),
    model_paths=["app.models"],
    secure_values=True,
)
```

`secure_values=True` tells DBWarden to redact the URL in CLI output and logs.
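Some platforms hand out the user, password, and host as separate variables rather than one URL. A minimal sketch of assembling the URL in that case follows; the variable names `DB_USER`, `DB_PASSWORD`, `DB_HOST`, `DB_PORT`, and `DB_NAME` and the helper `build_postgres_url` are assumptions made for illustration, not names DBWarden defines. The credentials are URL-encoded so characters such as `@` and `:` do not break parsing.

```python
import os
from typing import Mapping
from urllib.parse import quote_plus


def build_postgres_url(env: Mapping[str, str] = os.environ) -> str:
    """Assemble a PostgreSQL URL from separate (hypothetical) env variables."""
    user = env["DB_USER"]          # KeyError here means the variable is missing
    password = env["DB_PASSWORD"]
    name = env["DB_NAME"]
    host = env.get("DB_HOST", "localhost")
    port = env.get("DB_PORT", "5432")
    # Encode only the credential parts; host, port, and name are used as-is.
    return f"postgresql://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{name}"


example_env = {"DB_USER": "user", "DB_PASSWORD": "p@ss:word", "DB_NAME": "myapp"}
print(build_postgres_url(example_env))
```

The resulting string can be passed to `database_url_sync` in place of `os.getenv("DATABASE_URL")`.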
### Fail fast on missing env var ```python import os from dbwarden import database_config DATABASE_URL = os.getenv("DATABASE_URL") if not DATABASE_URL: raise ValueError("DATABASE_URL is required") primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=DATABASE_URL, model_paths=["app.models"], secure_values=True, ) ``` Without the guard, a missing env var silently passes `None` to `database_url_sync`, which fails later with a confusing error. ## `.env` files with python-dotenv For local development, use a `.env` file to avoid setting vars manually each session. Install: ```bash uv add python-dotenv ``` Create `.env` in your project root: ``` DATABASE_URL=postgresql://dev_user:dev_pass@localhost:5432/myapp_dev ``` Load it at the top of `dbwarden.py`: ```python import os from dotenv import load_dotenv from dbwarden import database_config load_dotenv() primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=os.getenv("DATABASE_URL"), model_paths=["app.models"], secure_values=True, ) ``` Add `.env` to `.gitignore`. 
Commit a `.env.example` with placeholder values: ``` DATABASE_URL=postgresql://user:password@localhost:5432/myapp ``` ## Multi-database with separate secrets Each database gets its own env var: ```python import os from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=os.getenv("PRIMARY_DATABASE_URL"), model_paths=["app.models"], secure_values=True, ) analytics = database_config( database_name="analytics", database_type="clickhouse", database_url_sync=os.getenv("ANALYTICS_DATABASE_URL"), model_paths=["app.models.analytics"], secure_values=True, ) ``` ## Dev mode with secrets Dev mode can also use env vars, keeping SQLite paths out of source: ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=os.getenv("DATABASE_URL"), dev_database_type="sqlite", dev_database_url=os.getenv("DEV_DATABASE_URL", "sqlite:///./dev.db"), model_paths=["app.models"], secure_values=True, ) ``` `DEV_DATABASE_URL` defaults to a local SQLite path if not set, which is reasonable for development. 
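With several databases, the fail-fast guard shown earlier has to cover every URL variable, and checking them one by one reports only the first that is missing. A minimal sketch of checking them together follows; `require_env` is a hypothetical helper written for this page, not part of DBWarden.

```python
import os
from typing import Dict, Iterable, Mapping


def require_env(names: Iterable[str], env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the values for names, raising once with every missing variable listed."""
    names = list(names)
    # Treat unset and empty variables the same way.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise ValueError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in names}


# Example with an explicit mapping instead of the real environment.
urls = require_env(
    ["PRIMARY_DATABASE_URL", "ANALYTICS_DATABASE_URL"],
    env={
        "PRIMARY_DATABASE_URL": "postgresql://localhost/main",
        "ANALYTICS_DATABASE_URL": "http://localhost:8123/analytics",
    },
)
print(urls["PRIMARY_DATABASE_URL"])
```

In `dbwarden.py`, the returned values would be passed to each `database_config(...)` call as `database_url_sync`, so a misconfigured deployment stops at config load with one complete error message.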
## CI/CD environments

In GitHub Actions, set secrets in the repository settings and reference them in the workflow:

```yaml
- name: Run migrations
  env:
    DATABASE_URL: ${{ secrets.DATABASE_URL }}
  run: dbwarden migrate --database primary
```

In GitLab CI, use masked CI/CD variables:

```yaml
migrate:
  script:
    - dbwarden migrate --database primary
  variables:
    DATABASE_URL: $DATABASE_URL  # set in GitLab CI/CD settings
```

## Third-party secret managers

For production systems using Vault, AWS Secrets Manager, or Infisical, fetch the secret before passing it to `database_config()`:

```python
import os
import boto3
import json
from dbwarden import database_config


def get_secret(secret_name: str) -> str:
    client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId=secret_name)
    secret = json.loads(response["SecretString"])
    return secret["database_url"]


primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync=get_secret("myapp/production/database"),
    model_paths=["app.models"],
    secure_values=True,
)
```

`dbwarden.py` is plain Python, so any secret retrieval logic is valid here.

## `secure_values=True`

When set, DBWarden redacts the connection URL in:

- `dbwarden settings show` output
- Log lines

The URL is still used internally for connections. This prevents accidental credential exposure in terminal output shared in screenshots or logs.

See also: [Production Patterns](production-patterns.md) | [CI/CD Patterns](../advanced/ci-cd-patterns.md)

========================================================================
PAGE: https://dbwarden.emiliano-go.com/configuration/dev-mode/
========================================================================

# Dev Mode

Use SQLite for local development and PostgreSQL in production with the same codebase.

## What Is Dev Mode?
Dev mode lets you configure **two database URLs**: - **Production URL** - Used by default - **Dev URL** - Used when you pass `--dev` flag ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/myapp", # Production dev_database_type="sqlite", dev_database_url="sqlite:///./dev.db", # Development ) ``` Run commands with `--dev`: ```bash $ dbwarden --dev migrate # Uses SQLite $ dbwarden --dev status # Uses SQLite $ dbwarden migrate # Uses PostgreSQL ``` ## Why Use Dev Mode? ### Speed SQLite is **much faster** for local development: - No network latency - No authentication overhead - File-based, not server-based - Instant startup ### Simplicity No PostgreSQL server required: - No Docker setup - No installation - No configuration - Works on all platforms ### Safety Can't accidentally affect production: - Dev database is a local file - Each developer has their own database - Easy to reset (`rm dev.db`) - No shared state ### Portability Easy to share between developers: - One config file works everywhere - No server setup instructions - Fresh developers can start immediately ## Basic Setup ### Step 1: Configure Dev Database ```python # dbwarden.py from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/myapp", dev_database_type="sqlite", # Add this dev_database_url="sqlite:///./dev.db", # Add this model_paths=["app.models"], ) ``` ### Step 2: Use Dev Mode ```bash # Development workflow $ dbwarden --dev make-migrations "create users" $ dbwarden --dev migrate $ dbwarden --dev status # Production workflow $ dbwarden make-migrations "create users" $ dbwarden migrate $ dbwarden status ``` ## Dev Mode Workflow ### Daily Development ```bash # Morning: pull latest code git pull # Run migrations against dev database $ dbwarden --dev migrate # Work on features... 
# Add new models # Generate migration $ dbwarden --dev make-migrations "add orders table" # Test migration $ dbwarden --dev migrate # Verify $ dbwarden --dev status # Commit git add migrations/primary/0002_add_orders_table.sql git commit -m "Add orders table" ``` ### Testing Rollbacks ```bash # Apply migration $ dbwarden --dev migrate # Test rollback $ dbwarden --dev rollback # Re-apply $ dbwarden --dev migrate ``` ### Fresh Start Reset your dev database anytime: ```bash # Delete dev database rm dev.db # Re-run migrations $ dbwarden --dev migrate ``` ## Production Workflow Dev mode only affects **local development**. Production uses the main URL: ### CI/CD Pipeline ```yaml # .github/workflows/deploy.yml - name: Run migrations run: dbwarden migrate # No --dev flag env: DATABASE_URL: ${{ secrets.DATABASE_URL }} ``` ### Production Deployment ```bash # Staging $ dbwarden migrate --database primary # Production $ dbwarden migrate --database primary ``` Dev mode is **never used** in CI/CD or production. ## SQLite Limitations ### What Works Most features work in SQLite: - Tables, columns, indexes - Primary keys, foreign keys - Unique constraints - Basic data types - Transactions ### What Doesn't Work Some PostgreSQL features aren't available in SQLite: - Advanced types (JSONB, arrays, enums) - Partial indexes - Generated columns (in older SQLite) - Multiple schemas - Concurrent writes ### Translation DBWarden **doesn't translate** SQL between databases. Your migrations should work on both SQLite and PostgreSQL. 
**Approach 1:** Write portable SQL ```sql -- Works on both CREATE TABLE users ( id INTEGER PRIMARY KEY, email VARCHAR(255) NOT NULL ); -- PostgreSQL-specific CREATE TABLE users ( id SERIAL PRIMARY KEY, email VARCHAR(255) NOT NULL, metadata JSONB ); ``` **Approach 2:** Use PostgreSQL for dev too ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/myapp", dev_database_type="postgresql", # Same as prod dev_database_url="postgresql://localhost/myapp_dev", # Different database ) ``` ## Environment-Based Configuration ### Automatic Dev Mode Use environment variables to automatically detect dev: ```python import os is_dev = os.getenv("ENV", "dev") in ["dev", "development", "local"] if is_dev: database_url = "sqlite:///./dev.db" database_type = "sqlite" else: database_url = os.getenv("DATABASE_URL") database_type = "postgresql" primary = database_config( database_name="primary", default=True, database_type=database_type, database_url_sync=database_url, ) ``` Run commands: ```bash # Dev ENV=dev dbwarden migrate # Production ENV=production dbwarden migrate ``` ## Multiple Dev Databases If you have multiple databases, configure dev mode for each: ```python # Primary database primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/main", dev_database_type="sqlite", dev_database_url="sqlite:///./dev_primary.db", model_paths=["app.models.primary"], ) # Analytics database analytics = database_config( database_name="analytics", database_type="postgresql", database_url_sync="postgresql://localhost/analytics", dev_database_type="sqlite", dev_database_url="sqlite:///./dev_analytics.db", model_paths=["app.models.analytics"], ) ``` Run against all dev databases: ```bash $ dbwarden --dev migrate --all $ dbwarden --dev status --all ``` ## Common Patterns ### Pattern 1: SQLite for Dev, PostgreSQL for Prod 
(Recommended) ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/myapp", dev_database_type="sqlite", dev_database_url="sqlite:///./dev.db", ) ``` **Pros:** - Fast local iteration - No server setup - Easy to reset **Cons:** - SQL must be portable - Some features unavailable in dev ### Pattern 2: PostgreSQL for Both ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://prod-host/myapp", dev_database_type="postgresql", dev_database_url="postgresql://localhost/myapp_dev", ) ``` **Pros:** - Identical environments - Use all PostgreSQL features - Catches more bugs in dev **Cons:** - Requires PostgreSQL server locally - Slower than SQLite ### Pattern 3: Dynamic Based on Environment ```python import os environment = os.getenv("ENV", "dev") if environment == "production": database_url = os.getenv("DATABASE_URL") database_type = "postgresql" elif environment == "staging": database_url = os.getenv("STAGING_DATABASE_URL") database_type = "postgresql" else: database_url = "sqlite:///./dev.db" database_type = "sqlite" primary = database_config( database_name="primary", default=True, database_type=database_type, database_url_sync=database_url, ) ``` ## Testing with Dev Mode ### Unit Tests Use SQLite for fast unit tests: ```python # tests/conftest.py import pytest from sqlalchemy import create_engine from app.models import Base @pytest.fixture(scope="function") def db(): engine = create_engine("sqlite:///:memory:") Base.metadata.create_all(engine) yield engine Base.metadata.drop_all(engine) ``` ### Integration Tests Use `--dev` for integration tests: ```bash # Run integration tests $ dbwarden --dev migrate pytest tests/integration/ ``` ## Troubleshooting ### "SQL syntax error in SQLite" **Cause:** Using PostgreSQL-specific SQL. 
**Solution:** Make the SQL portable, or use PostgreSQL for dev:

```python
dev_database_type="postgresql"
dev_database_url="postgresql://localhost/myapp_dev"
```

### "dev_database_url is required"

**Cause:** `dev_database_type` is set without `dev_database_url`.

**Solution:** Set both:

```python
dev_database_type="sqlite"
dev_database_url="sqlite:///./dev.db"  # Add this
```

### Dev database not updating

**Cause:** The `--dev` flag was omitted, so the command ran against the main database URL.

**Solution:** Pass `--dev`:

```bash
$ dbwarden --dev migrate  # Add --dev
```

## What's Next?

- **[Multi-Database](multi-database.md)** - Multiple databases with dev mode
- **[Production Patterns](production-patterns.md)** - Deploy to production
- **[Troubleshooting](troubleshooting.md)** - Common issues

========================================================================
PAGE: https://dbwarden.emiliano-go.com/configuration/
========================================================================

# Configuration

DBWarden uses Python-based configuration with `database_config()` to define your databases. **One configuration source** for migrations, CLI tools, and runtime: no split configs.

## Quick Start

The simplest configuration possible:

```python
# dbwarden.py
from dbwarden import database_config

primary = database_config(
    database_name="primary",
    default=True,
    database_type="sqlite",
    database_url_sync="sqlite:///./app.db",
)
```

That's it! **4 parameters** to get started.

Run your first migration:

```bash
$ dbwarden init
$ dbwarden make-migrations "initial schema"
$ dbwarden migrate
```

## Learning Path

### New to DBWarden?

Start here to understand configuration basics:

1. **[Quick Start](quick-start.md)** - Your first configuration in 2 minutes
2. **[Concepts](concepts.md)** - How configuration works
3.
**[Connection URLs](connection-urls.md)** - Database connection formats ### Building Your Configuration Learn specific features: - **[Model Discovery](model-discovery.md)** - How DBWarden finds your SQLAlchemy models - **[Dev Mode](dev-mode.md)** - Local development with SQLite - **[Multi-Database](multi-database.md)** - Configure multiple databases ### Production Ready Deploy with confidence: - **[Production Patterns](production-patterns.md)** - Real-world examples - **[Troubleshooting](troubleshooting.md)** - Common issues and solutions ### Complete Reference - **[Configuration API](../reference/configuration-api.md)** - Complete function signature and parameters ## Key Features ### Simple Configuration Define once, use everywhere: ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/myapp", model_paths=["app.models"], ) ``` ### Dev Mode Use SQLite locally, PostgreSQL in production: ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/myapp", dev_database_type="sqlite", dev_database_url="sqlite:///./dev.db", ) ``` Run commands with `--dev`: ```bash $ dbwarden --dev migrate $ dbwarden --dev status ``` ### Multi-Database Configure as many databases as you need: ```python # Primary database primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/main", model_paths=["app.models.primary"], ) # Analytics database analytics = database_config( database_name="analytics", database_type="clickhouse", database_url_sync="http://localhost:8123/analytics", model_paths=["app.models.analytics"], ) ``` ### Security First Keep credentials out of code: ```python import os primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=os.getenv("DATABASE_URL"), 
secure_values=True, # Hide credentials in output ) ``` ### Validation DBWarden validates your configuration: - Exactly one `default=True` - Unique database names - No duplicate URLs - Required `model_paths` for multi-database - Consistent dev mode configuration ## Configuration Loading DBWarden discovers your configuration automatically: 1. **Looks for `dbwarden.py`** in current directory or parents 2. **Checks `DBWARDEN_CONFIG_MODULE`** environment variable 3. **Scans for `database_config()` calls** in your codebase (full project tree walk) 4. **Looks for `warden.toml`** as an alternative TOML-based config file `dbwarden.py` is the default convention and the file created by `dbwarden init`, but `database_config(...)` can live in any discovered Python file inside your project. ## Common Patterns ### Single Database (Minimal) ```python from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/myapp", ) ``` ### With Dev Mode (Recommended) ```python from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/myapp", dev_database_type="sqlite", dev_database_url="sqlite:///./dev.db", model_paths=["app.models"], ) ``` ### Multiple Databases ```python from dbwarden import database_config # Primary primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/main", model_paths=["app.models.primary"], ) # Analytics analytics = database_config( database_name="analytics", database_type="postgresql", database_url_sync="postgresql://localhost/analytics", model_paths=["app.models.analytics"], ) ``` ### Production with Environment Variables ```python import os from dbwarden import database_config primary = database_config( database_name="primary", default=True, 
database_type="postgresql", database_url_sync=os.getenv("DATABASE_URL"), model_paths=["app.models"], secure_values=True, ) ``` ## Why Python Configuration? **vs TOML/YAML/INI:** - Type checking with your IDE - Dynamic configuration (loops, conditionals) - Environment variable integration - No schema mismatches - Can compute values **vs Environment Variables Only:** - Version controlled - Self-documenting - Validation at load time - Multiple databases easy - Can reference code structures ## What's Next? Ready to configure your first database? Start here: - **[Quick Start](quick-start.md)** - Build your first configuration - **[Concepts](concepts.md)** - Understand how it works - **[Production Patterns](production-patterns.md)** - Real-world examples Already familiar with configuration? Jump to: - **[Connection URLs](connection-urls.md)** - URL format reference - **[Troubleshooting](troubleshooting.md)** - Common issues - **[Configuration API](../reference/configuration-api.md)** - Complete reference ======================================================================== PAGE: https://dbwarden.emiliano-go.com/configuration/model-discovery/ ======================================================================== # Model Discovery Learn how DBWarden discovers your SQLAlchemy models for migration generation. ## What Is Model Discovery? Model discovery is the process where DBWarden: 1. Imports Python modules 2. Finds SQLAlchemy model classes 3. Extracts table metadata 4. 
Uses metadata to generate migrations ## The `model_paths` Parameter ### Basic Usage ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/myapp", model_paths=["app.models"], # Discover models here ) ``` ### What Gets Discovered DBWarden looks for classes that inherit from: - `DeclarativeBase` (SQLAlchemy 2.0+) - `declarative_base()` return value (SQLAlchemy 1.4) **Example models:** ```python # app/models.py from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column from sqlalchemy import String, Integer class Base(DeclarativeBase): pass class User(Base): # Discovered __tablename__ = "users" id: Mapped[int] = mapped_column(Integer, primary_key=True) email: Mapped[str] = mapped_column(String(255)) class Order(Base): # Discovered __tablename__ = "orders" id: Mapped[int] = mapped_column(Integer, primary_key=True) ``` ## When Is It Required? ### Single Database (Optional) For single-database projects, `model_paths` is optional: ```python primary = database_config( database_name="primary", default=True, database_type="sqlite", database_url_sync="sqlite:///./app.db", # No model_paths - DBWarden scans entire codebase ) ``` DBWarden will scan your entire codebase for models. Even for single-database projects, specifying `model_paths` makes discovery faster and more predictable. ### Multiple Databases (Required) For multi-database projects, `model_paths` is **required** for each database: ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/main", model_paths=["app.models.primary"], # Required ) analytics = database_config( database_name="analytics", database_type="postgresql", database_url_sync="postgresql://localhost/analytics", model_paths=["app.models.analytics"], # Required ) ``` **Why required?** To prevent ambiguity about which models belong to which database. 
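The rule can be pictured as a small check over the configured databases. This is only an illustrative sketch: `DbConfig` and `check_model_paths` are hypothetical names, and DBWarden's own validation may be implemented differently.

```python
from dataclasses import dataclass, field


@dataclass
class DbConfig:
    """Hypothetical stand-in for one database_config(...) entry."""
    database_name: str
    model_paths: list[str] = field(default_factory=list)


def check_model_paths(configs: list[DbConfig]) -> list[str]:
    """Return the names of databases that are missing model_paths.

    A single database may omit model_paths; with two or more, each one needs them.
    """
    if len(configs) <= 1:
        return []
    return [c.database_name for c in configs if not c.model_paths]


single = [DbConfig("primary")]
multi = [
    DbConfig("primary", ["app.models.primary"]),
    DbConfig("analytics"),  # no model_paths: ambiguous in a multi-database project
]

print(check_model_paths(single))  # → []
print(check_model_paths(multi))   # → ['analytics']
```

With one database, a full scan is unambiguous because every model belongs to it. With two or more, a full scan cannot tell which database a model was meant for, which is why each entry has to name its own paths.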
## Discovery Algorithm ### Step 1: Import Modules DBWarden imports each module in `model_paths`: ```python model_paths=["app.models", "app.legacy.models"] ``` Becomes: ```python import app.models import app.legacy.models ``` ### Step 2: Recursive Discovery For each imported module, DBWarden recursively imports submodules: ``` app/ models/ __init__.py # Imported user.py # Imported order.py # Imported admin/ __init__.py # Imported admin_user.py # Imported ``` ### Step 3: Find Model Classes For each module, DBWarden inspects all classes and finds those inheriting from `DeclarativeBase`. ### Step 4: Extract Metadata For each model class, DBWarden extracts: - Table name - Columns (name, type, constraints) - Indexes - Foreign keys - Check constraints - Unique constraints ## Module Path Examples ### Single Module ```python model_paths=["app.models"] ``` Discovers models from: - `app/models.py` (if it's a file) - `app/models/__init__.py` (if it's a package) - `app/models/*.py` (all submodules) ### Multiple Modules ```python model_paths=["app.models", "app.legacy"] ``` Discovers models from both `app.models` and `app.legacy`. 
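For intuition, importing several module paths, walking their submodules, and collecting subclasses of a base class can be approximated with the standard library alone. This is a rough sketch, not DBWarden's actual implementation: `discover_subclasses` is a hypothetical helper, and the demo uses the standard `json` package with `ValueError` as the base so that it runs without SQLAlchemy installed.

```python
import importlib
import inspect
import json
import pkgutil


def discover_subclasses(module_paths, base):
    """Import each dotted path (and its submodules) and collect subclasses of `base`."""
    found = set()
    for path in module_paths:
        root = importlib.import_module(path)
        modules = [root]
        # Packages expose __path__; plain modules do not and have no submodules.
        if hasattr(root, "__path__"):
            prefix = root.__name__ + "."
            for info in pkgutil.walk_packages(root.__path__, prefix=prefix):
                if info.name.endswith("__main__"):
                    continue  # skip entry-point modules, which may run code on import
                modules.append(importlib.import_module(info.name))
        for module in modules:
            for _, cls in inspect.getmembers(module, inspect.isclass):
                if issubclass(cls, base) and cls is not base:
                    found.add(cls)  # the set removes classes re-exported by several modules
    return found


# Stand-in demo: find ValueError subclasses defined anywhere under the json package.
classes = discover_subclasses(["json"], ValueError)
print(json.JSONDecodeError in classes)  # → True
```

In a real project the same idea would be called with your own declarative `Base` and paths such as `["app.models", "app.legacy"]`. Note that importing a module runs its top-level code, so any import-time side effects are triggered during discovery.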
### Nested Modules ```python model_paths=["app.models.api", "app.models.admin"] ``` Discovers models from: - `app/models/api.py` or `app/models/api/*.py` - `app/models/admin.py` or `app/models/admin/*.py` ### Absolute vs Relative **Absolute (recommended):** ```python model_paths=["app.models"] # From project root ``` **Not supported:** ```python model_paths=["./models"] # Relative paths don't work model_paths=["models"] # May work if on PYTHONPATH ``` ## Common Patterns ### Pattern 1: Single Module ``` app/ models.py # All models in one file ``` ```python model_paths=["app.models"] ``` ### Pattern 2: Module Package ``` app/ models/ __init__.py user.py order.py product.py ``` ```python model_paths=["app.models"] ``` ### Pattern 3: Multi-Database ``` app/ models/ primary/ __init__.py user.py order.py analytics/ __init__.py event.py metric.py ``` ```python # Primary database model_paths=["app.models.primary"] # Analytics database model_paths=["app.models.analytics"] ``` ### Pattern 4: Legacy + New ``` app/ models/ # New models __init__.py user.py legacy/ # Legacy models models.py ``` ```python model_paths=["app.models", "app.legacy.models"] ``` ## Model Path Validation ### No Overlap (Default) By default, model paths cannot overlap between databases: ```python # Error: overlap detected primary = database_config( database_name="primary", model_paths=["app.models"], ) analytics = database_config( database_name="analytics", model_paths=["app.models"], # Same path! ) ``` ### Allow Overlap If models genuinely belong to multiple databases: ```python primary = database_config( database_name="primary", model_paths=["app.shared"], overlap_models=True, # Allow overlap ) analytics = database_config( database_name="analytics", model_paths=["app.shared"], overlap_models=True, # Allow overlap ) ``` Both databases will include the same tables. Make sure this is intentional. ## Troubleshooting ### "No SQLAlchemy models found" **Symptom:** DBWarden can't find your models. 
**Causes:** 1. **Models not imported** ```python # app/models/__init__.py # Wrong - models not imported from sqlalchemy.orm import DeclarativeBase class Base(DeclarativeBase): pass # Correct - import models from app.models.user import User from app.models.order import Order ``` 2. **Wrong module path** ```python # Wrong model_paths=["models"] # Not on PYTHONPATH # Correct model_paths=["app.models"] ``` 3. **Circular imports** ```python # app/models/user.py from app.models.order import Order # Circular import # app/models/order.py from app.models.user import User # Circular import ``` **Solution:** Use forward references: ```python from typing import TYPE_CHECKING if TYPE_CHECKING: from app.models.order import Order ``` ### "model_paths is required" **Symptom:** Error when running commands with multiple databases. **Cause:** Multiple databases configured without `model_paths`. **Solution:** Add `model_paths` to each database: ```python primary = database_config( database_name="primary", model_paths=["app.models.primary"], # Add this ... ) analytics = database_config( database_name="analytics", model_paths=["app.models.analytics"], # Add this ... ) ``` ### "model_paths overlap detected" **Symptom:** Two databases have overlapping model paths. **Cause:** Same path used for multiple databases. **Solution 1:** Use separate paths: ```python primary = database_config( database_name="primary", model_paths=["app.models.primary"], # Different path ... ) analytics = database_config( database_name="analytics", model_paths=["app.models.analytics"], # Different path ... ) ``` **Solution 2:** Allow overlap (if intentional): ```python primary = database_config( database_name="primary", model_paths=["app.shared"], overlap_models=True, # Allow overlap ... ) analytics = database_config( database_name="analytics", model_paths=["app.shared"], overlap_models=True, # Allow overlap ... ) ``` ### Import Errors **Symptom:** `ModuleNotFoundError` or `ImportError` when running commands. 
**Cause:** DBWarden tries to import a module that doesn't exist or isn't importable from the current environment.

**Solution:** Verify the module path:

```bash
python -c "import app.models"  # Test import
```

If the import fails, fix your module structure or `PYTHONPATH`.

## Performance Considerations

### Slow Discovery

If discovery is slow, reduce the search space:

**Before (slow):**

```python
model_paths=["app"]  # Scans entire app
```

**After (fast):**

```python
model_paths=["app.models"]  # Only scans models
```

### Import Side Effects

Discovery imports your model modules, so they should not do any work at import time:

```python
# Bad - side effects on import
class User(Base):
    __tablename__ = "users"
    ...

print("User model loaded!")  # Side effect

# Good - no side effects
class User(Base):
    __tablename__ = "users"
    ...
```

## Advanced: Dynamic Model Paths

You can compute `model_paths` dynamically:

```python
import os

from dbwarden import database_config

environment = os.getenv("ENV", "dev")

if environment == "production":
    model_paths = ["app.models.production"]
else:
    model_paths = ["app.models.dev"]

primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync="...",
    model_paths=model_paths,
)
```

## What's Next?

- **[Dev Mode](dev-mode.md)** - Local development workflows
- **[Multi-Database](multi-database.md)** - Organize multi-database models
- **[Troubleshooting](troubleshooting.md)** - Common configuration issues

========================================================================
PAGE: https://dbwarden.emiliano-go.com/configuration/multi-database/
========================================================================

# Multi-Database

Configure and manage multiple databases in a single project.
## When to Use Multiple Databases Common scenarios: - **Microservices** - Each service has its own database - **Read/Write Split** - Primary for writes, replica for reads - **Domain Separation** - Transactions, analytics, logs in separate databases - **Legacy Integration** - New and old databases coexist - **Multi-Tenancy** - One database per tenant ## Basic Setup Configure each database with `database_config()`: ```python # dbwarden.py from dbwarden import database_config # Primary database primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/main", model_paths=["app.models.primary"], ) # Analytics database analytics = database_config( database_name="analytics", database_type="postgresql", database_url_sync="postgresql://localhost/analytics", model_paths=["app.models.analytics"], ) # Logging database logging = database_config( database_name="logging", database_type="postgresql", database_url_sync="postgresql://localhost/logs", model_paths=["app.models.logging"], ) ``` ## Model Organization ### Pattern 1: Separate Modules ``` app/ models/ primary/ __init__.py user.py order.py analytics/ __init__.py event.py metric.py logging/ __init__.py audit_log.py ``` Configuration: ```python primary = database_config( database_name="primary", model_paths=["app.models.primary"], ... ) analytics = database_config( database_name="analytics", model_paths=["app.models.analytics"], ... ) logging = database_config( database_name="logging", model_paths=["app.models.logging"], ... ) ``` ### Pattern 2: Shared Base Classes ```python # app/models/base.py from sqlalchemy.orm import DeclarativeBase class PrimaryBase(DeclarativeBase): pass class AnalyticsBase(DeclarativeBase): pass # app/models/primary/user.py from app.models.base import PrimaryBase class User(PrimaryBase): __tablename__ = "users" ... 
# app/models/analytics/event.py from app.models.base import AnalyticsBase class Event(AnalyticsBase): __tablename__ = "events" ... ``` ## CLI Usage ### Target Specific Database ```bash # Migrate primary $ dbwarden migrate --database primary # Migrate analytics $ dbwarden migrate --database analytics # Status for logging $ dbwarden status --database logging ``` ### Target All Databases ```bash # Migrate all $ dbwarden migrate --all # Status for all $ dbwarden status --all # Rollback all $ dbwarden rollback --all ``` ### Default Database The database with `default=True` is used when `--database` is omitted: ```bash # These are equivalent when primary is default: $ dbwarden migrate $ dbwarden migrate --database primary ``` ## Migration Directories Each database has its own migration directory: ``` migrations/ primary/ 0001_create_users.sql 0002_create_orders.sql analytics/ 0001_create_events.sql 0002_create_metrics.sql logging/ 0001_create_audit_logs.sql ``` Configure custom directories: ```python primary = database_config( database_name="primary", migrations_dir="migrations/primary", # Custom path ... ) ``` ## Independent Migration Histories Each database maintains its own migration history: ```bash # Check primary history $ dbwarden history --database primary Applied Migrations (primary) 0001_create_users (2024-01-15 10:30:00) 0002_create_orders (2024-01-16 11:00:00) # Check analytics history $ dbwarden history --database analytics Applied Migrations (analytics) 0001_create_events (2024-01-15 10:35:00) ``` Migrations are **completely independent** - you can migrate one database without affecting others. 
## Dev Mode with Multiple Databases Configure dev mode for each database: ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/main", dev_database_type="sqlite", dev_database_url="sqlite:///./dev_primary.db", model_paths=["app.models.primary"], ) analytics = database_config( database_name="analytics", database_type="postgresql", database_url_sync="postgresql://localhost/analytics", dev_database_type="sqlite", dev_database_url="sqlite:///./dev_analytics.db", model_paths=["app.models.analytics"], ) ``` Use dev mode: ```bash # Dev mode for all databases $ dbwarden --dev migrate --all # Dev mode for specific database $ dbwarden --dev migrate --database analytics ``` ## Common Patterns ### Pattern 1: Read/Write Split ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://primary-host/myapp", model_paths=["app.models"], ) replica = database_config( database_name="replica", database_type="postgresql", database_url_sync="postgresql://replica-host/myapp", model_paths=["app.models"], # Same models overlap_models=True, # Allow overlap ) ``` **Note:** Run migrations only against primary; replica replicates automatically. 
### Pattern 2: Domain Separation

```python
# Transactions
transactions = database_config(
    database_name="transactions",
    default=True,
    database_type="postgresql",
    database_url_sync="postgresql://localhost/transactions",
    model_paths=["app.models.transactions"],
)

# Analytics
analytics = database_config(
    database_name="analytics",
    database_type="clickhouse",
    database_url_sync="http://localhost:8123/analytics",
    model_paths=["app.models.analytics"],
)

# Audit logs
audit = database_config(
    database_name="audit",
    database_type="postgresql",
    database_url_sync="postgresql://localhost/audit",
    model_paths=["app.models.audit"],
)
```

### Pattern 3: Multi-Tenant

```python
tenants = ["tenant_a", "tenant_b", "tenant_c"]

for tenant in tenants:
    db = database_config(
        database_name=tenant,
        default=(tenant == "tenant_a"),
        database_type="postgresql",
        database_url_sync=f"postgresql://localhost/{tenant}",
        model_paths=["app.models"],  # Same models for all tenants
    )
```

## Validation Rules

### Required: `model_paths`

When you have multiple databases, each **must** specify `model_paths`:

```python
# Error: model_paths required
analytics = database_config(database_name="analytics", ...)  # Missing model_paths

# Correct
primary = database_config(
    database_name="primary",
    model_paths=["app.models.primary"],
    ...
)

analytics = database_config(
    database_name="analytics",
    model_paths=["app.models.analytics"],
    ...
)
```

### No Overlap (Default)

Model paths cannot overlap:

```python
# Error: overlap detected
primary = database_config(
    database_name="primary",
    model_paths=["app.models"],
    ...
)

analytics = database_config(
    database_name="analytics",
    model_paths=["app.models"],  # Same path
    ...
)
```

### Allow Overlap

For read replicas or shared models:

```python
primary = database_config(
    database_name="primary",
    model_paths=["app.models"],
    overlap_models=True,  # Allow overlap
    ...
)

replica = database_config(
    database_name="replica",
    model_paths=["app.models"],
    overlap_models=True,  # Allow overlap
    ...
)
```

## Troubleshooting

### "model_paths is required"

**Solution:** Add `model_paths` to all databases:

```python
primary = database_config(
    database_name="primary",
    model_paths=["app.models.primary"],  # Add this
    ...
)
```

### "model_paths overlap detected"

**Solution 1:** Use separate paths:

```python
model_paths=["app.models.primary"]
model_paths=["app.models.analytics"]
```

**Solution 2:** Allow overlap:

```python
overlap_models=True
```

### Wrong database targeted

**Check default:**

```bash
$ dbwarden settings show  # Shows which is default
```

**Be explicit:**

```bash
$ dbwarden migrate --database analytics  # Specify database
```

## What's Next?

- **[Production Patterns](production-patterns.md)** - Deploy multi-database apps
- **[Troubleshooting](troubleshooting.md)** - Common issues

========================================================================
PAGE: https://dbwarden.emiliano-go.com/configuration/production-patterns/
========================================================================

# Production Patterns

Real-world configuration patterns for production deployments.
## Environment Variables

### Basic Pattern

```python
import os
from dbwarden import database_config

primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync=os.getenv("DATABASE_URL"),
    model_paths=["app.models"],
    secure_values=True,
)
```

### With Validation

```python
import os
from dbwarden import database_config

DATABASE_URL = os.getenv("DATABASE_URL")
if not DATABASE_URL:
    raise ValueError("DATABASE_URL environment variable is required")

primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync=DATABASE_URL,
    model_paths=["app.models"],
    secure_values=True,
)
```

### With Defaults

```python
import os
from dbwarden import database_config

primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync=os.getenv(
        "DATABASE_URL",
        "postgresql://localhost/myapp",  # Fallback
    ),
    model_paths=["app.models"],
)
```

## Docker

### Docker Compose

```yaml
# docker-compose.yml
services:
  app:
    build: .
    environment:
      DATABASE_URL: postgresql://user:password@db:5432/myapp
    depends_on:
      - db
  db:
    image: postgres:15
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
      POSTGRES_DB: myapp
```

```python
# dbwarden.py
import os
from dbwarden import database_config

primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync=os.getenv("DATABASE_URL"),
    model_paths=["app.models"],
)
```

### Dockerfile

```dockerfile
FROM python:3.12-slim

# uv is not part of the base image; copy the binary from the official uv image
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

WORKDIR /app

COPY pyproject.toml uv.lock ./
RUN uv sync

# Put the virtual environment created by `uv sync` on PATH
ENV PATH="/app/.venv/bin:$PATH"

COPY . .
# Run migrations on container start CMD ["sh", "-c", "dbwarden migrate && python app/main.py"] ``` ## Kubernetes ### Secrets ```yaml # secret.yaml apiVersion: v1 kind: Secret metadata: name: database-secret type: Opaque stringData: url: postgresql://user:password@postgres-service:5432/myapp ``` ### Deployment with Init Container ```yaml apiVersion: apps/v1 kind: Deployment metadata: name: myapp spec: replicas: 3 template: spec: initContainers: # Run migrations before app starts - name: migrate image: myapp:latest command: ["dbwarden", "migrate"] env: - name: DATABASE_URL valueFrom: secretKeyRef: name: database-secret key: url containers: - name: app image: myapp:latest env: - name: DATABASE_URL valueFrom: secretKeyRef: name: database-secret key: url ``` ### ConfigMap for Model Paths ```yaml # config.yaml apiVersion: v1 kind: ConfigMap metadata: name: app-config data: model_paths: "app.models" ``` ## AWS ### RDS with Secrets Manager ```python import os import json import boto3 def get_database_url(): secret_name = os.getenv("DB_SECRET_NAME") region = os.getenv("AWS_REGION", "us-east-1") client = boto3.client("secretsmanager", region_name=region) response = client.get_secret_value(SecretId=secret_name) secret = json.loads(response["SecretString"]) return ( f"postgresql://{secret['username']}:{secret['password']}" f"@{secret['host']}:{secret['port']}/{secret['dbname']}" ) primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=get_database_url(), model_paths=["app.models"], secure_values=True, ) ``` ### RDS Connection via IAM ```python import os import boto3 def get_rds_auth_token(): rds_client = boto3.client("rds") return rds_client.generate_db_auth_token( DBHostname=os.getenv("DB_HOST"), Port=5432, DBUsername=os.getenv("DB_USER"), ) database_url = ( f"postgresql://{os.getenv('DB_USER')}:{get_rds_auth_token()}" f"@{os.getenv('DB_HOST')}:5432/{os.getenv('DB_NAME')}" ) primary = database_config( 
database_name="primary", default=True, database_type="postgresql", database_url_sync=database_url, model_paths=["app.models"], ) ``` ## Multi-Environment ### Environment-Based Configuration ```python import os ENVIRONMENT = os.getenv("ENVIRONMENT", "dev") if ENVIRONMENT == "production": database_url = os.getenv("PROD_DATABASE_URL") database_type = "postgresql" elif ENVIRONMENT == "staging": database_url = os.getenv("STAGING_DATABASE_URL") database_type = "postgresql" else: database_url = "sqlite:///./dev.db" database_type = "sqlite" primary = database_config( database_name="primary", default=True, database_type=database_type, database_url_sync=database_url, model_paths=["app.models"], secure_values=(ENVIRONMENT != "dev"), ) ``` ### Separate Config Files ```python # dbwarden.py import os from importlib import import_module environment = os.getenv("ENVIRONMENT", "dev") config_module = import_module(f"config.{environment}") config_module.setup_databases() ``` ```python # config/production.py import os from dbwarden import database_config def setup_databases(): primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=os.getenv("DATABASE_URL"), model_paths=["app.models"], secure_values=True, ) ``` ## Connection Pools ### PostgreSQL with Pooling ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=( "postgresql://user:pass@localhost/myapp" "?pool_size=20" "&max_overflow=10" "&pool_timeout=30" "&pool_recycle=3600" ), model_paths=["app.models"], ) ``` ### External Pooler (PgBouncer) ```python # Connection through PgBouncer primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:pass@pgbouncer:6432/myapp", model_paths=["app.models"], ) ``` ## SSL/TLS ### PostgreSQL with SSL ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", 
database_url_sync=( "postgresql://user:pass@host/myapp" "?sslmode=require" "&sslrootcert=/path/to/ca.pem" "&sslcert=/path/to/client-cert.pem" "&sslkey=/path/to/client-key.pem" ), model_paths=["app.models"], ) ``` ### Environment-Based SSL ```python import os ssl_mode = os.getenv("DB_SSL_MODE", "prefer") ca_cert = os.getenv("DB_CA_CERT_PATH", "") ssl_params = f"?sslmode={ssl_mode}" if ca_cert: ssl_params += f"&sslrootcert={ca_cert}" database_url = f"postgresql://user:pass@host/myapp{ssl_params}" primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=database_url, model_paths=["app.models"], ) ``` ## High Availability ### Multiple Replicas ```python # Primary (writes) primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=os.getenv("PRIMARY_DATABASE_URL"), model_paths=["app.models"], ) # Replica (reads) replica = database_config( database_name="replica", database_type="postgresql", database_url_sync=os.getenv("REPLICA_DATABASE_URL"), model_paths=["app.models"], overlap_models=True, ) ``` ### Automatic Failover ```python import os # Try primary, fallback to replica primary_url = os.getenv("PRIMARY_DATABASE_URL") replica_url = os.getenv("REPLICA_DATABASE_URL") # Application logic handles failover database_url = primary_url # Start with primary primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=database_url, model_paths=["app.models"], ) ``` ## Monitoring ### Application Name ```python import os app_name = os.getenv("APP_NAME", "myapp") hostname = os.getenv("HOSTNAME", "unknown") database_url = ( f"postgresql://user:pass@host/myapp" f"?application_name={app_name}-{hostname}" ) primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=database_url, model_paths=["app.models"], ) ``` Check active connections: ```sql SELECT application_name, 
count(*) FROM pg_stat_activity GROUP BY application_name; ``` ## Security Best Practices ### Never Commit Credentials ```python # Bad database_url_sync="postgresql://user:password@localhost/myapp" # Good database_url_sync=os.getenv("DATABASE_URL") ``` ### Use Least Privilege Create application user with minimal permissions: ```sql CREATE USER myapp_user WITH PASSWORD 'secret'; GRANT CONNECT ON DATABASE myapp TO myapp_user; GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO myapp_user; -- Don't grant DROP, TRUNCATE, CREATE, etc. ``` ### Rotate Credentials ```python # Use short-lived tokens database_url = get_temporary_database_credentials() primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=database_url, model_paths=["app.models"], ) ``` ## CI/CD Integration ### GitHub Actions ```yaml name: Deploy on: push: branches: [main] jobs: migrate: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - name: Install dependencies run: uv add dbwarden - name: Run migrations run: dbwarden migrate --database primary env: DATABASE_URL: ${{ secrets.DATABASE_URL }} ``` ### GitLab CI ```yaml migrate: stage: deploy script: - uv add dbwarden - dbwarden migrate --database primary environment: name: production only: - main variables: DATABASE_URL: $DATABASE_URL ``` ## What's Next? - **[Troubleshooting](troubleshooting.md)** - Common production issues - **[Configuration API Reference](../reference/configuration-api.md)** - Complete parameter docs ======================================================================== PAGE: https://dbwarden.emiliano-go.com/configuration/quick-start/ ======================================================================== # Quick Start Configure your first database in **2 minutes**. 
## Prerequisites You should have: - Python 3.10+ installed - DBWarden installed (`uv add dbwarden`) - A database to connect to (or use SQLite) ## Step 1: Initialize Create project structure: ```bash $ dbwarden init ``` This creates: - `migrations/` directory - `dbwarden.py` configuration file ## Step 2: Your First Configuration Open `dbwarden.py` and add: ```python from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="sqlite", database_url_sync="sqlite:///./app.db", ) ``` That's it! **4 required parameters**: - `database_name` - What to call this database - `default` - Is this the default? - `database_type` - What kind of database? - `database_url_sync` - How to connect? (sync URL for CLI/migrations) Start with SQLite for the simplest setup. Switch to PostgreSQL later. ## Step 3: Test the Configuration Verify DBWarden can read your config: ```bash $ dbwarden settings show ``` You'll see: ``` Database Configuration ════════════════════════════════════════ primary (default) Type: sqlite URL: sqlite:///./app.db Migrations: migrations/primary ``` ## Step 4: Add Model Paths (Optional) If you have SQLAlchemy models, tell DBWarden where they are: ```python primary = database_config( database_name="primary", default=True, database_type="sqlite", database_url_sync="sqlite:///./app.db", model_paths=["app.models"], # Add this ) ``` DBWarden will discover models from `app.models` and its submodules. 
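
To picture what "`app.models` and its submodules" covers, the sketch below enumerates a dotted module path and everything beneath it using only the standard library. It illustrates the idea only and is not DBWarden's discovery code; `iter_module_names` is a hypothetical helper, and the standard-library `json` package stands in for `app.models` so the example runs anywhere.

```python
import importlib
import pkgutil


def iter_module_names(dotted_path: str) -> list[str]:
    """Return the module at dotted_path plus every submodule beneath it."""
    module = importlib.import_module(dotted_path)
    names = [module.__name__]
    # Only packages have __path__; a plain module has no submodules to walk.
    search_path = getattr(module, "__path__", None)
    if search_path is not None:
        prefix = module.__name__ + "."
        for info in pkgutil.walk_packages(search_path, prefix=prefix):
            names.append(info.name)
    return names


if __name__ == "__main__":
    # "json" is a stand-in for "app.models" (an assumption for this demo).
    for name in iter_module_names("json"):
        print(name)
```

Running the same helper against your own `model_paths` entry is a quick way to confirm that the path is importable from the project root before pointing DBWarden at it.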
## Step 5: Upgrade to PostgreSQL When you're ready for PostgreSQL: ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/myapp", model_paths=["app.models"], ) ``` ## Step 6: Add Dev Mode (Recommended) Keep SQLite for local dev, use PostgreSQL in production: ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/myapp", dev_database_type="sqlite", dev_database_url="sqlite:///./dev.db", model_paths=["app.models"], ) ``` Now you can run commands against SQLite locally: ```bash $ dbwarden --dev migrate $ dbwarden --dev status ``` And against PostgreSQL in production: ```bash $ dbwarden migrate $ dbwarden status ``` ## What Just Happened? ### `database_config` registered your database When Python loads `dbwarden.py`, it executes `database_config()` which: 1. Validates your parameters 2. Registers the database in DBWarden's internal registry 3. Sets up migration directories ### DBWarden Can Now Find Your Database All CLI commands now know about your database: ```bash $ dbwarden make-migrations "create users" $ dbwarden migrate $ dbwarden status $ dbwarden history ``` ## Common First-Time Issues ### "No configuration found" **Cause:** DBWarden can't find `dbwarden.py` **Solution:** Ensure you're in the project directory and `dbwarden.py` exists. 
### "No SQLAlchemy models found" **Cause:** DBWarden can't discover your models **Solution:** Add `model_paths` to your config: ```python model_paths=["app.models"] ``` ### "Exactly one default=True required" **Cause:** Multiple databases without one marked as default **Solution:** Set one database to `default=True` ## Complete Minimal Example ```python # dbwarden.py from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="sqlite", database_url_sync="sqlite:///./app.db", model_paths=["app.models"], ) ``` ## Complete Production Example ```python # dbwarden.py import os from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=os.getenv("DATABASE_URL"), dev_database_type="sqlite", dev_database_url="sqlite:///./dev.db", model_paths=["app.models"], secure_values=True, ) ``` ## What's Next? - **[Concepts](concepts.md)** - Understand how configuration works - **[Connection URLs](connection-urls.md)** - Learn URL formats for different databases - **[Dev Mode](dev-mode.md)** - Deep dive into dev workflows - **[Production Patterns](production-patterns.md)** - Real-world examples ======================================================================== PAGE: https://dbwarden.emiliano-go.com/configuration/troubleshooting/ ======================================================================== # Troubleshooting Solutions to common configuration issues. ## "No configuration found" ### Symptom ``` DBWardenConfigError: No configuration found ``` ### Causes & Solutions **Cause 1: No `dbwarden.py` file** ```bash # Check if file exists ls dbwarden.py ``` **Solution:** Create `dbwarden.py`: ```bash $ dbwarden init ``` **Cause 2: Wrong directory** DBWarden looks in current directory and parents. 
**Solution:** Navigate to the project root:

```bash
cd /path/to/project
$ dbwarden migrate
```

**Cause 3: No `database_config()` calls**

**Solution:** Add configuration:

```python
# dbwarden.py
from dbwarden import database_config

primary = database_config(
    database_name="primary",
    default=True,
    database_type="sqlite",
    database_url_sync="sqlite:///./app.db",
)
```

**Cause 4: Import error in config file**

```python
# dbwarden.py
from app.models import Base  # Import fails
```

**Solution:** Fix imports or use lazy loading:

```python
# Don't import models in config file
primary = database_config(
    database_name="primary",
    model_paths=["app.models"],  # Use model_paths instead
    ...
)
```

## "Exactly one default=True required"

### Symptom

```
ConfigurationError: Exactly one default=True required
```

### Causes & Solutions

**Cause 1: No default database**

```python
# Wrong
primary = database_config(database_name="primary", default=False, ...)
analytics = database_config(database_name="analytics", default=False, ...)
```

**Solution:** Set one database as default:

```python
# Correct
primary = database_config(database_name="primary", default=True, ...)
analytics = database_config(database_name="analytics", default=False, ...)
```

**Cause 2: Multiple defaults**

```python
# Wrong
primary = database_config(database_name="primary", default=True, ...)
analytics = database_config(database_name="analytics", default=True, ...)
```

**Solution:** Only one default:

```python
# Correct
primary = database_config(database_name="primary", default=True, ...)
analytics = database_config(database_name="analytics", ...)  # default=False implied
```

## "Duplicate database_name"

### Symptom

```
ConfigurationError: Duplicate database_name 'primary'
```

### Cause

Same `database_name` used twice:

```python
primary = database_config(database_name="primary", ...)
primary = database_config(database_name="primary", ...)  # Duplicate
```

### Solution

Use unique names:

```python
primary = database_config(database_name="primary", ...)
analytics = database_config(database_name="analytics", ...)
# Different name
```

## "No SQLAlchemy models found"

### Symptom

```
Warning: No SQLAlchemy models found
```

### Causes & Solutions

**Cause 1: Wrong `model_paths`**

```python
# Wrong
model_paths=["models"]  # Not on PYTHONPATH
```

**Solution:** Use the correct Python path:

```python
# Correct
model_paths=["app.models"]
```

**Cause 2: Models not imported**

```python
# app/models/__init__.py
# Wrong - models not imported
from sqlalchemy.orm import DeclarativeBase

class Base(DeclarativeBase):
    pass
```

**Solution:** Import models:

```python
# app/models/__init__.py
# Correct
from sqlalchemy.orm import DeclarativeBase


class Base(DeclarativeBase):
    pass


# Import models after Base is defined: model modules that import Base
# from this package need it to exist first.
from app.models.user import User
from app.models.order import Order
```

**Cause 3: Missing `model_paths`**

**Solution:** Add `model_paths`:

```python
primary = database_config(
    database_name="primary",
    model_paths=["app.models"],  # Add this
    ...
)
```

**Cause 4: Circular imports**

```python
# app/models/user.py
from app.models.order import Order  # Circular

# app/models/order.py
from app.models.user import User  # Circular
```

**Solution:** Use `TYPE_CHECKING` so the import is only evaluated by type checkers, not at runtime:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from app.models.order import Order
```

## "model_paths is required"

### Symptom

```
ConfigurationError: model_paths is required when more than one database is configured
```

### Cause

Multiple databases without `model_paths`:

```python
# Wrong
primary = database_config(database_name="primary", ...)  # No model_paths
analytics = database_config(database_name="analytics", ...)  # No model_paths
```

### Solution

Add `model_paths` to all databases:

```python
# Correct
primary = database_config(
    database_name="primary",
    model_paths=["app.models.primary"],
    ...
)

analytics = database_config(
    database_name="analytics",
    model_paths=["app.models.analytics"],
    ...
) ``` ## "model_paths overlap detected" ### Symptom ``` ConfigurationError: model_paths overlap detected between 'primary' and 'analytics' ``` ### Cause Same model paths for different databases: ```python # Wrong primary = database_config( database_name="primary", model_paths=["app.models"], ... ) analytics = database_config( database_name="analytics", model_paths=["app.models"], # Same path ... ) ``` ### Solutions **Solution 1: Use separate paths** ```python # Correct primary = database_config( database_name="primary", model_paths=["app.models.primary"], ... ) analytics = database_config( database_name="analytics", model_paths=["app.models.analytics"], ... ) ``` **Solution 2: Allow overlap (if intentional)** ```python # Correct for read replicas primary = database_config( database_name="primary", model_paths=["app.models"], overlap_models=True, ... ) replica = database_config( database_name="replica", model_paths=["app.models"], overlap_models=True, ... ) ``` ## "model_tables overlap detected" ### Symptom ``` ConfigurationError: model_tables overlap detected: table 'users' in 'primary' is also in 'analytics' ``` ### Cause Two databases have `model_tables` lists that share table names: ```python # Wrong primary = database_config( database_name="primary", model_paths=["app.models"], model_tables=["users", "posts"], ... ) analytics = database_config( database_name="analytics", model_paths=["other_models"], model_tables=["users"], # 'users' already owned by primary ... ) ``` ### Solutions **Solution 1: Remove duplicate table name** ```python # Correct analytics = database_config( database_name="analytics", model_paths=["other_models"], model_tables=["analytics_events"], # No overlap with primary ... ) ``` **Solution 2: Allow overlap (if intentional)** ```python # Correct for shared tables analytics = database_config( database_name="analytics", model_paths=["other_models"], model_tables=["users", "analytics_events"], overlap_models=True, # Allow overlap ... 
) ``` ## "dev_database_url is required" ### Symptom ``` ConfigurationError: dev_database_url is required when dev_database_type is set ``` ### Cause Set `dev_database_type` without `dev_database_url`: ```python # Wrong primary = database_config( database_name="primary", dev_database_type="sqlite", # Missing dev_database_url ... ) ``` ### Solution Add both dev parameters: ```python # Correct primary = database_config( database_name="primary", dev_database_type="sqlite", dev_database_url="sqlite:///./dev.db", # Add this ... ) ``` ## Connection Errors ### "could not connect to server" **Cause:** Database server not running or unreachable. **Solutions:** 1. **Check database is running:** ```bash # PostgreSQL sudo systemctl status postgresql # Docker docker ps | grep postgres ``` 2. **Check connection URL:** ```python # Verify host, port, credentials database_url_sync="postgresql://user:pass@localhost:5432/myapp" ``` 3. **Test connection:** ```bash # PostgreSQL psql -h localhost -U user -d myapp # MySQL mysql -h localhost -u user -p myapp ``` ### "authentication failed" **Cause:** Wrong credentials. **Solutions:** 1. **Check credentials:** ```bash # PostgreSQL psql -h localhost -U user -d myapp ``` 2. **Verify environment variable:** ```bash echo $DATABASE_URL ``` 3. **URL encode special characters:** ```python from urllib.parse import quote_plus password = "p@ss:word" encoded = quote_plus(password) # "p%40ss%3Aword" ``` ### "database does not exist" **Cause:** Database not created. **Solution:** Create database: ```sql -- PostgreSQL CREATE DATABASE myapp; -- MySQL CREATE DATABASE myapp CHARACTER SET utf8mb4; ``` ## Import Errors ### "ModuleNotFoundError" **Cause:** Python can't find module. **Solutions:** 1. **Check PYTHONPATH:** ```bash export PYTHONPATH=/path/to/project:$PYTHONPATH ``` 2. **Install package:** ```bash uv add -e . # Editable install ``` 3. 
**Verify import:**

```bash
python -c "import app.models"
```

## Performance Issues

### Slow configuration loading

**Cause:** Large codebase scan.

**Solution:** Specify `model_paths`:

```python
# Slow - scans everything
primary = database_config(database_name="primary", ...)

# Fast - targeted scan
primary = database_config(
    database_name="primary",
    model_paths=["app.models"],
    ...
)
```

### Slow imports

**Cause:** Heavy imports in config file.

**Solution:** Avoid application imports in `dbwarden.py`:

```python
# Slow
from app.models import Base
from app.services import setup

# Fast
from dbwarden import database_config

db = database_config(...)
```

## Debugging Tips

### Enable verbose output

```bash
$ dbwarden --verbose migrate
```

### Check configuration

```bash
# Show all configuration
$ dbwarden settings show

# Show specific database
$ dbwarden settings show --database primary
```

### Test imports

```bash
python -c "import dbwarden; print('OK')"
python -c "from dbwarden import database_config; print('OK')"
```

### Verify database connection

```bash
$ dbwarden check-db
$ dbwarden check-db --database primary
```

## What's Next?

- **[Configuration API Reference](../reference/configuration-api.md)** - Complete parameter docs
- **[Quick Start](quick-start.md)** - Start fresh with correct setup

========================================================================
PAGE: https://dbwarden.emiliano-go.com/cookbook/01-project-setup/
========================================================================

# 1. 
Project Setup ## What You'll Learn - How to initialize a DBWarden project with `dbwarden init` - How configuration is structured via `database_config()` - How to inspect your loaded configuration ## Prerequisites - Python 3.12+ with `uv add dbwarden sqlalchemy` - The `examples/core/` directory (see [Cookbook Index](index.md)) ## Step 1: Initialize the Project ```bash cd examples/core/ bash scripts/01-setup.sh ``` The `dbwarden init` command creates the directory structure DBWarden expects: ``` migrations/ primary/ # Migration files for the 'primary' database ``` It also writes a starter `dbwarden.py` if one doesn't exist. In our case, we already have one with our configuration. ## Step 2: Understanding the Configuration Our `examples/core/dbwarden.py`: ```python from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="sqlite", database_url_sync="sqlite:///./app.db", model_paths=["app"], model_tables=["users", "posts"], ) ``` Each parameter has a specific role: | Parameter | Value | Purpose | |-----------|-------|---------| | `database_name` | `"primary"` | Logical name used in `--database primary` CLI flags | | `default` | `True` | Used when no `--database` flag is given | | `database_type` | `"sqlite"` | Dialect for SQL generation and connection | | `database_url_sync` | `"sqlite:///./app.db"` | Synchronous connection URL | | `model_paths` | `["app"]` | Python module paths to scan for SQLAlchemy models | | `model_tables` | `["users", "posts"]` | Optional table-name filter for this database | The return value `primary` is a `DatabaseHandle` object. It's also used later for FastAPI dependency injection: the same object provides `primary.async_session` and `primary.sync_session`. 
## Step 3: Viewing the Configuration ```text $ dbwarden config Configuration: Databases: primary (default): Type: sqlite Sync URL: sqlite:///./app.db Model Paths: app Migrations Dir: migrations/primary ``` This confirms DBWarden has discovered and loaded your configuration. The `(default)` marker means `--database` can be omitted when targeting this database. ## What Happens Under the Hood When you import `dbwarden` and call `database_config()`: 1. The function call is registered in DBWarden's internal registry 2. On first CLI command, DBWarden discovers `dbwarden.py` via AST scanning 3. It imports the module and executes each `database_config()` call 4. It validates uniqueness, default rules, and model path resolution 5. The resolved configuration is cached for the session ## Key Takeaways - `dbwarden init` creates the directory skeleton: run it once per project - `dbwarden config` shows what DBWarden actually resolved (useful for debugging) - `database_config()` is the single entry point for all configuration - `model_paths` controls which Python modules are scanned for models - We chose SQLite here so the example runs with zero external services ## Next [Section 2: Models & Migrations](02-models-and-migrations.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/cookbook/02-models-and-migrations/ ======================================================================== # 2. 
Defining Models and Generating Migrations ## What You'll Learn - How to define SQLAlchemy models with `class Meta` annotations - How `make-migrations` generates SQL from model changes - How the generated SQL maps to database DDL - How to create manual migrations with `dbwarden new` - How to extract rollback SQL from an existing migration ## Prerequisites - Completed [Section 1: Project Setup](01-project-setup.md) - `examples/core/` with `app/models.py` ## Step 1: The Models Our example project defines four models in `examples/core/app/models.py`. Here they are with explanations: ### User ```python class User(Base): __tablename__ = "users" id: Mapped[int] = mapped_column(Integer, primary_key=True) email: Mapped[str] = mapped_column(String(255), unique=True, nullable=False) username: Mapped[str] = mapped_column(String(100), unique=True, nullable=False) full_name: Mapped[str | None] = mapped_column(String(200), nullable=True) is_active: Mapped[bool] = mapped_column(Boolean, default=True) created_at: Mapped[datetime] = mapped_column(DateTime, server_default=text("CURRENT_TIMESTAMP")) class Meta(TableMeta): comment = "Core user accounts" indexes = [ IndexSpec(name="ix_users_created_at", columns=["created_at"]), ] ``` Key points: - `unique=True` on `email` and `username` generates `UNIQUE` constraints - `nullable=True` (the default) allows `NULL`; `nullable=False` adds `NOT NULL` - `server_default=text(...)` becomes a database-level `DEFAULT` clause in the DDL; `default=` is a Python-level default and is not rendered in SQL - `class Meta(TableMeta)` is how we attach table-level metadata - `IndexSpec` generates a named `CREATE INDEX` statement ### Post ```python class Post(Base): __tablename__ = "posts" id: Mapped[int] = mapped_column(Integer, primary_key=True) title: Mapped[str] = mapped_column(String(255), nullable=False) body: Mapped[str] = mapped_column(Text, nullable=False) user_id: Mapped[int] = mapped_column(ForeignKey("users.id"), nullable=False) created_at: 
Mapped[datetime] = mapped_column(DateTime, server_default=text("CURRENT_TIMESTAMP")) class Meta(TableMeta): comment = "User blog posts" indexes = [ IndexSpec(name="ix_posts_user_id", columns=["user_id"]), IndexSpec(name="ix_posts_created_at", columns=["created_at"]), ] ``` Key points: - `ForeignKey("users.id")` generates a `REFERENCES` clause - Foreign key targets are rendered inline in `CREATE TABLE` ### Product (with CHECK constraint) ```python class Product(Base): __tablename__ = "products" id: Mapped[int] = mapped_column(Integer, primary_key=True) name: Mapped[str] = mapped_column(String(200), nullable=False) price: Mapped[float] = mapped_column(Float, nullable=False) description: Mapped[str | None] = mapped_column(Text, nullable=True) in_stock: Mapped[bool] = mapped_column(Boolean, default=True) created_at: Mapped[datetime] = mapped_column(DateTime, server_default=text("CURRENT_TIMESTAMP")) class Meta(TableMeta): comment = "Product catalog" checks = [ {"name": "ck_products_price_positive", "sql": "price > 0"}, ] ``` Key points: - `checks` in `class Meta` generates `CHECK` constraints - Each check needs a `name` (constraint name) and `sql` (the expression) - This prevents negative prices at the database level ### Tag ```python class Tag(Base): __tablename__ = "tags" id: Mapped[int] = mapped_column(Integer, primary_key=True) name: Mapped[str] = mapped_column(String(50), unique=True, nullable=False) class Meta(TableMeta): comment = "Taxonomy tags for products" ``` The simplest model: just an ID and a unique name. ## Step 2: Generating the Migration ```bash cd examples/core bash scripts/02-models-migrations.sh ``` The first script step runs: ```bash $ dbwarden make-migrations "create core tables" --database primary ``` This compares the current model state against the database (or a stored snapshot). 
Since this is a fresh project, it detects four new tables and generates: ```sql -- upgrade CREATE TABLE IF NOT EXISTS posts ( id INTEGER NOT NULL PRIMARY KEY, title VARCHAR(255) NOT NULL, body TEXT NOT NULL, user_id INTEGER NOT NULL REFERENCES users(id), created_at DATETIME ) CREATE TABLE IF NOT EXISTS products ( id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(200) NOT NULL, price FLOAT NOT NULL, description TEXT, in_stock BOOLEAN DEFAULT TRUE, created_at DATETIME ) CREATE TABLE IF NOT EXISTS tags ( id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(50) NOT NULL UNIQUE ) CREATE TABLE IF NOT EXISTS users ( id INTEGER NOT NULL PRIMARY KEY, email VARCHAR(255) NOT NULL UNIQUE, username VARCHAR(100) NOT NULL UNIQUE, full_name VARCHAR(200), is_active BOOLEAN DEFAULT TRUE, created_at DATETIME ) -- rollback DROP TABLE users DROP TABLE tags DROP TABLE products DROP TABLE posts ``` > **Note:** The example uses SQLite, which has limited DDL support. With PostgreSQL, DBWarden generates additional features: > - **`CREATE INDEX IF NOT EXISTS ...`**: from `IndexSpec` entries in `class Meta` > - **`COMMENT ON TABLE ...`**: from `Meta.comment` attributes > - **`CONSTRAINT ... CHECK (...)`**: from `Meta.checks` > - **`server_default`** expressions rendered as native SQL defaults > - Inline `REFERENCES` become table-level `FOREIGN KEY` constraints > > The generated SQL is always backend-specific. DBWarden adapts to the `database_type` configured in `dbwarden.py`. ### Reading the Generated SQL Let's walk through what each section does: **`-- upgrade`**: Applied when you run `dbwarden migrate` 1. **`CREATE TABLE IF NOT EXISTS posts (...)`**: Creates posts with a foreign key reference to `users(id)` (inline `REFERENCES` style for SQLite). The foreign key originates from `ForeignKey("users.id")` on the `user_id` column. 2. **`CREATE TABLE IF NOT EXISTS products (...)`**: Creates products with a `CHECK` constraint defined in `class Meta`. 
In SQLite, CHECK constraints must be inline; with PostgreSQL they become `CONSTRAINT ... CHECK (...)`. 3. **`CREATE TABLE IF NOT EXISTS tags (...)`**: Simple table with a unique constraint on `name`. 4. **`CREATE TABLE IF NOT EXISTS users (...)`**: Creates the users table with all columns, primary key, and unique constraints inline. Note that with this SQLite backend the table order differs from the order in our Python models, and some features are omitted: - **IndexSpec entries** generate `CREATE INDEX` only on PostgreSQL and ClickHouse - **`COMMENT ON TABLE`** is only generated for PostgreSQL - **`server_default`** expressions render as native SQL defaults on PostgreSQL **`-- rollback`**: Applied when you run `dbwarden rollback` 1. Drops tables. Order may vary by backend; DBWarden handles dependency ordering automatically. ### Auto-generated Migration Name The migration file is named automatically: ``` primary__0001_create_core_tables.sql ``` The naming pattern is: ``` {database_name}__{4-digit-version}_{auto-generated-description}.sql ``` ### PostgreSQL-Specific Model Metadata When your `database_type` is `"postgresql"`, DBWarden supports PostgreSQL-specific table and column metadata. 
The following model shows tablespace, fillfactor, identity columns, and column compression: ```python from dbwarden.databases.pgsql import PGTableMeta, PGColumnMeta, pg class Order(Base): __tablename__ = "orders" id: Mapped[int] = mapped_column(Integer, primary_key=True) total: Mapped[float] = mapped_column(Float) created_at: Mapped[datetime] = mapped_column(TIMESTAMP) class Meta(PGTableMeta): pg_tablespace = "fast_space" pg_fillfactor = 90 comment = "Customer orders" class id(PGColumnMeta): comment = "Order ID" pg = pg.field(identity="ALWAYS") class created_at(PGColumnMeta): pg = pg.field(compression="pglz") ``` The generated PostgreSQL DDL includes tablespace, fillfactor, identity columns, and column-level options: ```sql CREATE TABLE IF NOT EXISTS orders ( id INTEGER GENERATED ALWAYS AS IDENTITY NOT NULL, total FLOAT NOT NULL, created_at TIMESTAMP NOT NULL COMPRESSION pglz ) TABLESPACE fast_space WITH (fillfactor=90); COMMENT ON TABLE orders IS 'Customer orders'; COMMENT ON COLUMN orders.id IS 'Order ID'; ``` ## Step 3: Creating a Manual Migration Sometimes you need a migration that isn't model-driven: a data backfill, a stored procedure, or a complex SQL operation. ```bash $ dbwarden new add_custom_table --database primary ``` This creates a blank migration: ```sql -- upgrade -- TODO: write your upgrade SQL here -- rollback -- TODO: write your rollback SQL here ``` You fill in both sections manually. Manual migrations follow the same file naming convention and are tracked alongside auto-generated ones. ## Step 4: Extracting Rollback SQL If you have a migration file and need to extract just its rollback section: ```bash $ dbwarden make-rollback migrations/primary/primary__0001_create_core_tables.sql ``` This prints the rollback SQL to stdout. Useful for quickly verifying what a rollback will do before running it. 
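
Because a migration file is plain SQL with `-- upgrade` and `-- rollback` markers, pulling out the rollback section is a small text-processing task. The sketch below shows one way to do it, assuming each marker sits on its own line. It is not the implementation behind `dbwarden make-rollback`; `extract_rollback` is a hypothetical helper written for illustration.

```python
def extract_rollback(migration_sql: str) -> str:
    """Return the SQL that follows the '-- rollback' marker."""
    collected: list[str] = []
    in_rollback = False
    for line in migration_sql.splitlines():
        marker = line.strip().lower()
        if marker == "-- rollback":
            # Everything after this marker belongs to the rollback section.
            in_rollback = True
            continue
        if marker == "-- upgrade":
            # A new upgrade marker ends any rollback section in progress.
            in_rollback = False
            continue
        if in_rollback:
            collected.append(line)
    return "\n".join(collected).strip()


if __name__ == "__main__":
    sample = (
        "-- upgrade\n"
        "CREATE TABLE tags (id INTEGER NOT NULL PRIMARY KEY);\n"
        "\n"
        "-- rollback\n"
        "DROP TABLE tags;\n"
    )
    print(extract_rollback(sample))
```

A helper like this can be handy in code review tooling, for example to flag migrations whose rollback section is empty.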
## Key Takeaways - DBWarden generates explicit, reviewable SQL: no hidden runtime behavior - Every migration has both `-- upgrade` and `-- rollback` sections - `class Meta(TableMeta)` is where table-level metadata (comments, indexes, checks) lives - `IndexSpec` produces named `CREATE INDEX` statements; always prefer named indexes - `dbwarden new` creates blank migrations for non-model-driven changes - `dbwarden make-rollback` extracts rollback SQL for review ## Related Documentation - [SQLAlchemy Models Reference](../models.md) - [Modeling Guide](../getting-started/modeling.md) - [Migration File Format](../migration-files.md) - [`make-migrations` command](../commands/make-migrations.md) ## Next [Section 3: Apply & Inspect](03-apply-and-inspect.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/cookbook/03-apply-and-inspect/ ======================================================================== # 3. Applying and Inspecting Migrations ## What You'll Learn - How `dbwarden migrate` applies pending SQL - How to roll back and downgrade to specific versions - How to inspect migration history and status - How to validate schema integrity and database connectivity ## Prerequisites - Completed [Section 2](02-models-and-migrations.md) (migration file exists) - `examples/core/` project ## Step 1: Apply Migrations ```bash cd examples/core bash scripts/03-apply-inspect.sh ``` ### The Migrate Command ```bash $ dbwarden migrate --database primary ``` When you run `migrate`, DBWarden: 1. Creates the metadata table (`_dbwarden_migrations`) if it doesn't exist 2. Creates the lock table (`_dbwarden_lock`) if it doesn't exist 3. Acquires a migration lock (prevents concurrent runs) 4. Reads all migration files and filters to pending (unapplied) ones 5. Executes the `-- upgrade` SQL of each pending migration 6. Records each migration's version, checksum, and timestamp 7. Writes a schema snapshot file for future diffs 8. 
Releases the lock ``` [DBWarden] Applying primary__0001_create_core_tables... [DBWarden] Migration applied successfully (42ms) [DBWarden] All migrations applied. Pending: 0 ``` ### Verify Status ```bash $ dbwarden status --database primary ``` Output: ``` Database: primary Applied: 1 Pending: 0 Status: up-to-date ``` ### View History ```bash $ dbwarden history --database primary ``` Output: ``` Migration History (primary) V0001 create_core_tables 2025-01-15 10:30:00 a1b2c3d4... ``` The checksum (`a1b2c3d4...`) is a SHA-256 hash of the migration file content. This detects tampering or accidental edits after apply. ## Step 2: Rollback ```bash $ dbwarden rollback --database primary --count 1 ``` Rollback executes the `-- rollback` section of the most recently applied migration. After rollback: ``` [DBWarden] Rolling back primary__0001_create_core_tables... [DBWarden] Rollback complete ``` ```bash $ dbwarden status --database primary ``` ``` Database: primary Applied: 0 Pending: 1 Status: pending ``` ### Rollback Mechanics - Rollback always executes the `-- rollback` section of the file; never auto-generates reverse SQL - `--count` controls how many migrations to roll back (default: 1) - Rollbacks are also lock-protected - After rollback, the migration is considered "pending" again and can be re-applied ## Step 3: Re-apply ```bash $ dbwarden migrate --database primary ``` Re-applies the migration. Since rollback removed the tracking record, the migration runs again. ## Step 4: Downgrade to a Version ```bash $ dbwarden downgrade --to 0000 --database primary ``` `downgrade` is a bulk rollback: it rolls back all migrations down to (but not including) the target version. `--to 0000` rolls back everything. ``` [DBWarden] Rolling back primary__0001_create_core_tables... [DBWarden] Downgrade complete. At version: 0000 ``` ### migrate vs rollback vs downgrade | Command | What it does | Safe to run twice? 
| |---------|-------------|-------------------| | `migrate` | Applies pending migrations | Yes (idempotent) | | `rollback` | Reverses the last N applied migrations | Yes (tracks what's applied) | | `downgrade` | Rolls back to a specific target version | Yes | ## Step 5: Re-apply All ```bash $ dbwarden migrate --database primary $ dbwarden status --database primary ``` After the final apply, status should show: ``` Database: primary Applied: 1 Pending: 0 Status: up-to-date ``` ## Step 6: Schema Validation ```bash $ dbwarden check --database primary ``` `check` scans each migration file and classifies operations by safety level: - **SAFE**: Adding a nullable column, creating an index - **INFO**: Table comment changes - **WARN**: Dropping a default, changing column type - **CRITICAL**: Dropping a table or column, removing a NOT NULL ``` Checking migrations for 'primary'... primary__0001_create_core_tables: CREATE TABLE users SAFE CREATE TABLE posts SAFE CREATE TABLE products SAFE CREATE TABLE tags SAFE CREATE INDEX SAFE COMMENT ON TABLE INFO Result: 5 SAFE, 1 INFO, 0 WARN, 0 CRITICAL ``` ## Step 7: Database Connectivity Check ```bash $ dbwarden check-db --database primary ``` `check-db` connects to the live database and reports its schema: ``` Database: primary Connection: OK Tables: users (6 columns) posts (5 columns) products (6 columns) tags (2 columns) Migration table: _dbwarden_migrations (present) Lock table: _dbwarden_lock (present) ``` This confirms the database is reachable and has the expected schema. 
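The checksum shown by `dbwarden history` earlier in this section can be reasoned about with standard-library Python. This is a minimal sketch, assuming the digest covers the file's raw bytes with no normalization; the exact hash input is not documented here, and `content_checksum` and `checksum_matches` are hypothetical helper names, not part of DBWarden.

```python
import hashlib


def content_checksum(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def checksum_matches(data: bytes, stored: str) -> bool:
    """Compare a freshly computed digest with the one recorded at apply time."""
    return content_checksum(data) == stored


# Hypothetical migration content; any byte-level change alters the digest.
original = b"-- upgrade\nCREATE TABLE tags (id INTEGER PRIMARY KEY);\n"
stored = content_checksum(original)

# Simulate an editor rewriting line endings without touching the SQL itself.
reformatted = original.replace(b"\n", b"\r\n")

print(checksum_matches(original, stored))     # unchanged content
print(checksum_matches(reformatted, stored))  # same SQL, different bytes
```

Under this assumption the comparison is byte-for-byte, so a whitespace-only edit to an applied file is enough to cause a mismatch.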
## Key Takeaways - `migrate` applies pending SQL with lock protection and checksum recording - `rollback` and `downgrade` give you precise control over reversal - `status` and `history` are your windows into migration state - `check` classifies each operation by safety before it runs - `check-db` validates database connectivity and schema existence ## Related Documentation - [`migrate` command](../commands/migrate.md) - [`rollback` command](../commands/rollback.md) - [`downgrade` command](../commands/downgrade.md) - [`status` command](../commands/status.md) - [`history` command](../commands/history.md) - [`check` command](../commands/check.md) - [`check-db` command](../commands/check-db.md) ## Next [Section 4: Offline & CI Workflows](04-offline-ci.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/cookbook/04-offline-ci/ ======================================================================== # 4. Offline & CI Workflows ## What You'll Learn - How to export model state to JSON for offline use - How to generate migrations without a live database - How to integrate this into CI/CD pipelines ## Prerequisites - Completed [Section 3](03-apply-and-inspect.md) (migrations applied, models in sync) - `examples/core/` project ## The Problem In CI/CD pipelines, you often need to generate migrations as part of your build, but your CI runner may not have a database connection. DBWarden's offline mode solves this by serializing model state to a JSON file. 
## Step 1: Export Model State ```bash cd examples/core bash scripts/04-offline-ci.sh ``` The key command: ```bash $ dbwarden export-models --database primary ``` This connects to the live database, introspects the current schema, and writes a JSON file to `.dbwarden/model_state.json`: ```json { "version": "1.0", "exported_at": "2025-01-15T10:30:00", "database": "primary", "tables": { "users": { "columns": { "id": {"type": "INTEGER", "nullable": false, "primary_key": true}, "email": {"type": "VARCHAR(255)", "nullable": false, "unique": true}, "username": {"type": "VARCHAR(100)", "nullable": false, "unique": true}, "full_name": {"type": "VARCHAR(200)", "nullable": true}, "is_active": {"type": "BOOLEAN", "nullable": true, "default": "1"}, "created_at": {"type": "DATETIME", "nullable": true, "default": "CURRENT_TIMESTAMP"} }, "indexes": [ {"name": "ix_users_created_at", "columns": ["created_at"]} ], "checks": [], "comment": "Core user accounts" } } } ``` This file becomes your source of truth for future diffs; no database required. ## Step 2: Commit the State File ```bash git add .dbwarden/model_state.json git commit -m "Update model state snapshot" ``` ## Step 3: Generate Migrations Offline On any machine (including CI without a database): ```bash $ dbwarden make-migrations "offline schema change" --offline --database primary ``` The `--offline` flag tells DBWarden to: 1. Read the model state from `.dbwarden/model_state.json` instead of querying a live database 2. Introspect the current model definitions in your Python code 3. Diff the two and generate migration SQL 4. Write the migration file AND update the snapshot file This means the snapshot is always in sync after each generation. 
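The offline diff in Step 3 can be pictured as a comparison of two dictionaries shaped like `model_state.json`. The following is a rough sketch of that idea, not DBWarden's implementation: `diff_columns` is a hypothetical helper, it looks only at column names, and it ignores types, indexes, and checks.

```python
import json


def diff_columns(old_state: dict, new_state: dict) -> list[tuple[str, str, str]]:
    """Return (operation, table, column) tuples for columns added or removed."""
    changes = []
    old_tables = old_state.get("tables", {})
    new_tables = new_state.get("tables", {})
    for table in sorted(old_tables.keys() | new_tables.keys()):
        old_cols = old_tables.get(table, {}).get("columns", {})
        new_cols = new_tables.get(table, {}).get("columns", {})
        # Columns present only in the new state must be added.
        for column in sorted(new_cols.keys() - old_cols.keys()):
            changes.append(("add_column", table, column))
        # Columns present only in the old state would be dropped.
        for column in sorted(old_cols.keys() - new_cols.keys()):
            changes.append(("drop_column", table, column))
    return changes


# Committed snapshot (trimmed to two columns for brevity).
snapshot = json.loads("""
{"tables": {"users": {"columns": {
    "id": {"type": "INTEGER", "nullable": false},
    "email": {"type": "VARCHAR(255)", "nullable": false}
}}}}
""")

# State derived from the current models, with a new nullable column.
models = json.loads(json.dumps(snapshot))
models["tables"]["users"]["columns"]["bio"] = {"type": "TEXT", "nullable": True}

changes = diff_columns(snapshot, models)
print(changes)
```

A real generator would turn each tuple into SQL and then rewrite the snapshot, which is why the committed file stays in sync after every offline run.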
## CI/CD Integration In a GitHub Actions workflow: ```yaml jobs: generate-migrations: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - uses: actions/setup-python@v5 with: python-version: "3.12" - run: uv add dbwarden sqlalchemy # Generate migrations using the committed state file - name: Check for new migrations run: dbwarden make-migrations "ci change" --offline --database primary # Commit any newly generated migrations - uses: stefanzweifel/git-auto-commit-action@v5 with: commit_message: "Auto-generate migration" ``` The full CI pipeline can then `dbwarden migrate` against staging/production using the generated SQL files. ## Key Takeaways - `export-models` serializes the current database schema to JSON - `make-migrations --offline` generates migrations using the snapshot instead of a live database - Offline mode enables migration generation in CI without database access - The snapshot file should be committed and kept in sync ## Related Documentation - [CI/CD Patterns](../advanced/ci-cd-patterns.md) - [`export-models` command](../cli-reference.md) (see CLI reference) ## Next [Section 5: Schema Inspection](05-schema-inspection.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/cookbook/05-schema-inspection/ ======================================================================== # 5. Schema Inspection Schema inspection allows you to compare your SQLAlchemy model definitions against the live database, capture DDL snapshots of individual tables, and reverse-engineer models from an existing database. For complete documentation see the [`diff`](../commands/diff.md), [`snapshot`](../commands/snapshot.md), and [`generate-models`](../commands/generate-models.md) command references. 
## What You'll Learn - How to diff models against the live database - How to capture DDL snapshots of individual tables - How to reverse-engineer models from a live database ## Prerequisites - Completed [Section 3](03-apply-and-inspect.md) (migrations applied) - `examples/core/` project ## Step 1: Diff Models vs Database ```bash cd examples/core bash scripts/05-schema-inspection.sh ``` The key command: ```bash $ dbwarden diff --database primary ``` `diff` compares your SQLAlchemy model definitions against the current database schema and reports any discrepancies: ``` No differences found between models and database. ``` If you add a column to a model without running `make-migrations`, `diff` would report a schema diff table: ``` Schema Diff ┌───────────┬───────┬────────┬──────────┐ │ Operation │ Table │ Target │ Severity │ ├───────────┼───────┼────────┼──────────┤ │ add_column│ users │ bio │ INFO │ └───────────┴───────┴────────┴──────────┘ Total changes: 1 ``` This is useful for catching drift before deployments. 
## Step 2: Capture a DDL Snapshot

```bash
$ dbwarden snapshot users --database primary
```

The `snapshot` command captures the DDL for a specific table:

```sql
CREATE TABLE users (
    id INTEGER NOT NULL,
    email VARCHAR(255) NOT NULL,
    username VARCHAR(100) NOT NULL,
    full_name VARCHAR(200),
    is_active BOOLEAN DEFAULT true,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id),
    UNIQUE (email),
    UNIQUE (username)
);

-- Indexes:
CREATE INDEX ix_users_created_at ON users (created_at);
```

Useful for:

- Documenting schema for code reviews
- Comparing schemas across environments
- Debugging migration issues

## Step 3: Reverse-Engineer Models

```bash
$ dbwarden generate-models -d primary --tables users,posts
```

This connects to the live database and generates SQLAlchemy model code similar to the following:

```python
from datetime import UTC, datetime

from sqlalchemy import Integer, String, Boolean, DateTime, Text, ForeignKey
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    email: Mapped[str] = mapped_column(String(255), nullable=False, unique=True)
    username: Mapped[str] = mapped_column(String(100), nullable=False, unique=True)
    full_name: Mapped[str | None] = mapped_column(String(200), nullable=True)
    is_active: Mapped[bool] = mapped_column(Boolean, default=True)
    # Pass a callable so the timestamp is computed per insert, not once at import time
    created_at: Mapped[datetime] = mapped_column(DateTime, default=lambda: datetime.now(UTC))


class Post(Base):
    __tablename__ = "posts"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    title: Mapped[str] = mapped_column(String(255), nullable=False)
    body: Mapped[str] = mapped_column(Text, nullable=False)
    user_id: Mapped[int] = mapped_column(ForeignKey("users.id"), nullable=False)
    created_at: Mapped[datetime] = mapped_column(DateTime, nullable=False)
```

This is a quick way to bootstrap models from an existing database. You can review the output and annotate it with `class Meta` afterward.
Options: - `--tables users,posts`: limit to specific tables - `--exclude-tables`: exclude tables by pattern - `--single-file`: output all models in one file - `--output ./models/`: write to a directory instead of stdout ## Key Takeaways - `diff` detects drift between models and the live database - `snapshot` captures table DDL for documentation or debugging - `generate-models` reverse-engineers live tables into SQLAlchemy model code - These three commands form your schema inspection toolkit ## Related Documentation - [`diff` command](../commands/diff.md) - [`snapshot` command](../commands/snapshot.md) - [`generate-models` command](../commands/generate-models.md) - [SQLAlchemy Models Reference](../models.md) ## Next [Section 6: Safety & Impact Analysis](06-safety-impact.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/cookbook/06-safety-impact/ ======================================================================== # 6. Safety & Impact Analysis Schema changes are the highest-risk operation in most deployments. Dropping a column that application code still references causes runtime errors. Changing a column type can break queries. DBWarden provides two tools to detect these issues before deploy: `check` classifies every migration operation by danger level, and `check-impact` finds affected code references. For complete documentation see the [`check`](../commands/check.md) and [`check-impact`](../cli-reference.md) command references. 
## What You'll Learn - How `dbwarden check` classifies operations by danger level - How `dbwarden check-impact` finds code references affected by a migration - How to detect destructive changes before they reach production ## Prerequisites - Completed [Section 3](03-apply-and-inspect.md) (migrations applied) - `examples/core/` project ## Step 1: Safety Check ```bash cd examples/core bash scripts/06-safety-impact.sh ``` The key command: ```bash $ dbwarden check --database primary ``` This scans every migration file and classifies each SQL operation by safety level: | Level | Meaning | |-------|---------| | **SAFE** | No data loss risk (add table, add nullable column, create index) | | **INFO** | Metadata changes (comments, renames) | | **WARN** | Potential impact (change column type, drop default) | | **CRITICAL** | Destructive (drop table, drop column, remove NOT NULL) | Output for our baseline migrations: ``` Safety Check - primary ┌──────────┬──────────────┬──────────┬────────┬─────────┬───────────────┐ │ Severity │ Change │ Table │ Column │ Message │ Required Flag │ ├──────────┼──────────────┼──────────┼────────┼─────────┼───────────────┤ │ SAFE │ create_table │ users │ │ │ │ │ SAFE │ create_table │ posts │ │ │ │ │ SAFE │ create_table │ products │ │ │ │ │ SAFE │ create_table │ tags │ │ │ │ │ SAFE │ create_index │ │ │ │ │ │ INFO │ comment_on │ users │ │ │ │ └──────────┴──────────────┴──────────┴────────┴─────────┴───────────────┘ ``` A migration with a destructive change would show: ``` ┌──────────┬──────────────┬───────┬──────────┬─────────┬───────────────┐ │ Severity │ Change │ Table │ Column │ Message │ Required Flag │ ├──────────┼──────────────┼───────┼──────────┼─────────┼───────────────┤ │ CRITICAL │ drop_column │ users │ username │ │ │ └──────────┴──────────────┴───────┴──────────┴─────────┴───────────────┘ ``` This gives you a quick visual signal during code review: if a migration contains CRITICAL operations, it needs extra scrutiny. 
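To make the severity table concrete, here is a small rule-based classifier that maps single SQL statements to the four levels. It is a simplified sketch built from the table in this step, not DBWarden's actual rules: the regular expressions, the `classify` helper, and the `UNKNOWN` fallback are all assumptions made for illustration.

```python
import re

# Ordered rules: the first matching pattern decides the level.
RULES = [
    (re.compile(r"^\s*DROP\s+TABLE\b", re.I), "CRITICAL"),
    (re.compile(r"\bDROP\s+COLUMN\b", re.I), "CRITICAL"),
    (re.compile(r"\bDROP\s+NOT\s+NULL\b", re.I), "CRITICAL"),
    (re.compile(r"\bDROP\s+DEFAULT\b", re.I), "WARN"),
    (re.compile(r"\bALTER\s+COLUMN\s+\w+\s+(SET\s+DATA\s+)?TYPE\b", re.I), "WARN"),
    (re.compile(r"^\s*COMMENT\s+ON\b", re.I), "INFO"),
    (re.compile(r"\bRENAME\b", re.I), "INFO"),
    (re.compile(r"^\s*CREATE\s+(UNIQUE\s+)?INDEX\b", re.I), "SAFE"),
    (re.compile(r"^\s*CREATE\s+TABLE\b", re.I), "SAFE"),
]


def classify(statement: str) -> str:
    """Return SAFE, INFO, WARN, CRITICAL, or UNKNOWN for one SQL statement."""
    for pattern, level in RULES:
        if pattern.search(statement):
            return level
    # Only nullable ADD COLUMN is listed as SAFE in the table above.
    adds_column = re.search(r"\bADD\s+COLUMN\b", statement, re.I)
    not_null = re.search(r"\bNOT\s+NULL\b", statement, re.I)
    if adds_column and not not_null:
        return "SAFE"
    return "UNKNOWN"  # fallback chosen for this sketch only


for sql in [
    "CREATE TABLE users (id INTEGER NOT NULL)",
    "ALTER TABLE users ADD COLUMN bio TEXT",
    "ALTER TABLE users ALTER COLUMN age TYPE BIGINT",
    "ALTER TABLE users DROP COLUMN username",
]:
    print(f"{classify(sql):8} {sql}")
```

Pattern matching of this kind is deliberately coarse. It is useful as a review signal, which matches how the document describes CRITICAL findings: flagged for scrutiny rather than blocked.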
## Step 2: Code Impact Analysis ```bash $ dbwarden check-impact 0001 --database primary ``` `check-impact` scans your application code (not just migration files) for references that would be affected by a migration. It uses AST analysis with a grep fallback: ``` No impact detected Scanned: . ``` A more realistic scenario with a destructive change: ``` Migration: 0002_drop_username Impact detected: 1 operation(s) affect code drop_column on users.username References: 2 app/routes/users.py:34 attribute_access .username app/templates/profile.jinja2:12 grep user.username ``` The scan finds each reference, identifies the access pattern (attribute access in Python, grep match in templates), and reports the file and line number. ### How It Works 1. Reads the migration's plan file and parses the schema changes 2. Identifies schema changes (DROP COLUMN, ALTER COLUMN TYPE, etc.) 3. Scans `.py` files using Python's `ast` module for attribute access patterns 4. Falls back to grep for non-Python files (templates, configs, etc.) 5. Reports all references grouped by change type ### Flags - `--scan-path app/`: limit scanning to a specific directory (default: project root) - `--deep`: also scan dependencies (imported packages) - `--out json`: output as JSON for CI processing - `--verbose`: show scan progress ## Pre-Deploy Workflow Combine both tools for a safe deploy sequence: ```bash # 1. Check migration safety $ dbwarden check --database primary # 2. Check code impact $ dbwarden check-impact 0042 --database primary # 3. 
Only proceed if no unexpected CRITICAL or WARN items $ dbwarden migrate --database primary ``` In CI: ```yaml - name: Safety check run: dbwarden check --database primary - name: Impact analysis run: dbwarden check-impact 0042 --database primary - name: Apply (only if previous steps succeeded) run: dbwarden migrate --database primary ``` ## Key Takeaways - `check` classifies every migration operation by safety level using a severity table - `check-impact` finds code references affected by a migration using AST + grep - Together they catch breaking changes before deploy - CRITICAL operations aren't blocked; they're flagged for human review - Use `--out json` for CI integration ## Related Documentation - [`check` command](../commands/check.md) - [`check-impact` command](../cli-reference.md) (see CLI reference) - [Safe Deployment](../advanced/safe-deployment.md) ## Next [Section 7: Seeds](07-seeds.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/cookbook/07-seeds/ ======================================================================== # 7. Seeds ## What You'll Learn - How to define code seeds using the `Seed` base class - How to create and apply file-based SQL/Python seeds (legacy) - How to list, apply, and roll back seeds - How to auto-apply seeds after migrations ## Prerequisites - Completed [Section 3](03-apply-and-inspect.md) (migrations applied, tables exist) - `examples/core/` project ## Step 1: Define a Code Seed Code seeds are the recommended way to seed data. They live alongside your models and keep seed logic close to the schema it populates. Create a seed file in your models directory (e.g. 
`models/seeds.py`): ```python from dbwarden.seed import Seed class CountrySeed(Seed): __seed_database__ = "primary" __seed_description__ = "initial countries" __seed_on_conflict__ = "update" __seed_conflict_columns__ = ["code"] model = Country rows = [ Country(code="UY", name="Uruguay"), Country(code="AR", name="Argentina"), ] ``` Notice: - **No `version`**: versions are auto-assigned (`C0001`, `C0002`, ...) - **Model instances** in `rows`: your IDE gives full autocompletion on column names - **Keyword arguments required**: SQLAlchemy 2.0's `DeclarativeBase` does not accept positional args; always use `Model(col=val)` syntax - **`__seed_database__`**: route the seed to the correct database ### Logic-Based Seeds Define a `generate(session)` method for programmatic data: ```python class PermissionSeed(Seed): __seed_database__ = "primary" __seed_description__ = "load permissions" model = Permission @staticmethod def generate(session): for resource in ["users", "orders"]: for action in ["read", "write", "delete"]: session.add(Permission(name=f"{resource}:{action}")) ``` ## Step 2: Apply Seeds ```bash $ dbwarden seed apply --database primary ``` Output: ``` Applying code seed C0001: initial countries ``` Code seeds and file seeds are both discovered and applied together. Each seed version is tracked in `_dbwarden_seeds` and can only be applied once until rolled back. ## Step 3: List Applied Seeds ```bash $ dbwarden seed list --database primary ``` Output: ``` Seeds for database 'primary': C0001 initial countries applied 2025-01-15 10:30:00 (code seed) ``` ## Step 4: Auto-Apply Seeds After Migrations Configure seeds to be applied automatically after `dbwarden migrate`: ```python database_config( database_name="primary", default=True, database_type="sqlite", database_url_sync="sqlite:///./app.db", model_paths=["models"], auto_apply_seeds=True, ) ``` Now running `dbwarden migrate` will also apply any pending seeds. 
Or apply seeds once without changing config: ```bash $ dbwarden migrate --apply-seeds ``` ## Step 5: Traditional File Seeds (Legacy) For complex multi-statement SQL, you can still use file-based seeds. ### Create a SQL Seed ```bash $ dbwarden seed create "initial admin users" --database primary ``` This creates `seeds/V0001__initial_admin_users.sql`. Fill it with data: ```sql INSERT INTO users (email, username, full_name, is_active, created_at) VALUES ('admin@example.com', 'admin', 'Admin User', 1, CURRENT_TIMESTAMP); INSERT INTO users (email, username, full_name, is_active, created_at) VALUES ('moderator@example.com', 'moderator', 'Moderator User', 1, CURRENT_TIMESTAMP); ``` ### Apply and List ```bash $ dbwarden seed apply --database primary $ dbwarden seed list --database primary ``` Output: ``` Seeds for database 'primary': C0001 initial countries applied 2025-01-15 10:30:00 (code seed) V0001 initial_admin_users applied 2025-01-15 10:31:00 ``` ### Python File Seeds ```bash $ dbwarden seed create "generate sample data" --database primary --type python ``` Creates `seeds/V0002__generate_sample_data.py` with a `seed(connection, session)` function. ## Step 6: Roll Back a Seed ```bash $ dbwarden seed rollback --database primary --count 1 ``` Seed rollback removes the tracking record, allowing the seed to be re-applied. It does **not** reverse the data changes; that is your responsibility if needed. 
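The difference between removing a tracking record and undoing data can be demonstrated with an in-memory SQLite database. This is an illustrative sketch only: the `_dbwarden_seeds` table name comes from this page, but its columns here are hypothetical, and `apply_seed` and `rollback_seed` are stand-ins rather than DBWarden functions.

```python
import sqlite3


def apply_seed(conn, version, description, rows):
    """Apply a seed once; return False if its version is already tracked."""
    tracked = conn.execute(
        "SELECT 1 FROM _dbwarden_seeds WHERE version = ?", (version,)
    ).fetchone()
    if tracked:
        return False
    # Upsert so that re-applying after a rollback does not hit duplicate keys.
    conn.executemany(
        "INSERT OR REPLACE INTO countries (code, name) VALUES (?, ?)", rows
    )
    conn.execute(
        "INSERT INTO _dbwarden_seeds (version, description) VALUES (?, ?)",
        (version, description),
    )
    return True


def rollback_seed(conn, version):
    """Delete only the tracking record; seeded rows stay in place."""
    conn.execute("DELETE FROM _dbwarden_seeds WHERE version = ?", (version,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (code TEXT PRIMARY KEY, name TEXT)")
# Hypothetical tracking schema; the real table's columns may differ.
conn.execute(
    "CREATE TABLE _dbwarden_seeds (version TEXT PRIMARY KEY, description TEXT)"
)

rows = [("UY", "Uruguay"), ("AR", "Argentina")]
first = apply_seed(conn, "C0001", "initial countries", rows)
second = apply_seed(conn, "C0001", "initial countries", rows)  # skipped
rollback_seed(conn, "C0001")
remaining = conn.execute("SELECT COUNT(*) FROM countries").fetchone()[0]
third = apply_seed(conn, "C0001", "initial countries", rows)  # runs again
total = conn.execute("SELECT COUNT(*) FROM countries").fetchone()[0]
print(first, second, remaining, third, total)
```

The rows survive the rollback, so a seed that may be re-applied should tolerate existing data. The `__seed_on_conflict__ = "update"` setting shown in Step 1 addresses that case for code seeds.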
After rollback:

```
Seeds for database 'primary':
C0001 initial countries applied 2025-01-15 10:30:00 (code seed)
V0001 initial_admin_users pending
```

## Step 7: Prune Orphaned Records

Remove tracking records for seed files that no longer exist on disk:

```bash
$ dbwarden seed list --prune
```

## Key Takeaways

- **Code seeds (`Seed` base class) are the recommended approach**: no manual versions, full IDE support, stays in sync with models
- `auto_apply_seeds=True` or `dbwarden migrate --apply-seeds` applies seeds automatically after migrations
- File seeds (`.sql` / `.py`) are still available for complex multi-statement SQL
- `seed list --prune` cleans up orphaned tracking records
- Seed rollback removes the tracking record; it does not undo data

## Related Documentation

- [Seeds Reference](../seeds.md)
- [`seed` command](../commands/seed.md)
- [CLI Reference: Seed Management](../cli-reference.md#seed-management)

## Next

[Section 8: Multi-Database](08-multi-database.md)

========================================================================
PAGE: https://dbwarden.emiliano-go.com/cookbook/08-multi-database/
========================================================================

# 8. Multi-Database & Configuration

DBWarden supports managing multiple databases in a single project, each with its own migration directory, lock, tracking table, and model paths. You can mix PostgreSQL, MySQL, and ClickHouse backends in the same codebase. For complete documentation see the [Multi-Database Configuration](../configuration/multi-database.md) reference.
## What You'll Learn

- How to configure multiple databases in one project
- How to target specific databases with CLI flags
- How to manage PostgreSQL + MySQL + ClickHouse in the same codebase
- How to use `dbwarden settings` for runtime configuration changes

## Prerequisites

- Docker (for PostgreSQL, MySQL, and ClickHouse containers)
- `examples/multi-database/` directory

## Scenario

A project with three databases:

- **primary** (PostgreSQL): transactional user data
- **legacy** (MySQL): legacy CRM and reporting data
- **analytics** (ClickHouse): page view events for analysis

## Step 1: The Configuration

```python
from dbwarden import database_config

primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync="postgresql://user:password@localhost:5432/primary",
    database_url_async="postgresql+asyncpg://user:password@localhost:5432/primary",
    model_paths=["app.models.primary"],
)

legacy = database_config(
    database_name="legacy",
    database_type="mysql",
    database_url_sync="mysql+pymysql://user:password@localhost:3306/legacy",
    model_paths=["app.models.legacy"],
)

analytics = database_config(
    database_name="analytics",
    database_type="clickhouse",
    database_url_sync="http://localhost:8123/analytics",
    model_paths=["app.models.analytics"],
)
```

Key rules:

- Exactly one database must have `default=True` (used when `--database` is omitted)
- Each database must have separate `model_paths` (no overlap by default)
- Each database gets its own migration directory under `migrations/`
- MySQL models use `MyTableMeta` / `MyColumnMeta` for engine, charset, and column metadata

## Step 2: Start the Databases

```bash
cd examples/multi-database
docker compose up -d
```

## Step 3: Initialize and Migrate

```bash
$ dbwarden init
$ dbwarden migrate --all
```

This applies migrations to all three databases in sequence. Each has its own lock, its own tracking table, and its own migration history.
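The targeting rules (`default=True`, `--database`, `--all`) amount to a small resolution function. The sketch below models them in plain Python under stated assumptions: `DatabaseEntry` and `resolve_targets` are hypothetical names, and the error behavior is a guess rather than DBWarden's documented output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseEntry:
    name: str
    default: bool = False


def resolve_targets(entries, database=None, all_databases=False):
    """Return the database names a command should run against, in order."""
    defaults = [entry.name for entry in entries if entry.default]
    if len(defaults) != 1:
        raise ValueError("exactly one database must set default=True")
    names = [entry.name for entry in entries]
    if all_databases:
        return names  # --all: every configured database, in sequence
    if database is None:
        return defaults  # flag omitted: fall back to the default database
    if database not in names:
        raise ValueError(f"unknown database: {database}")
    return [database]  # --database NAME: just that one


entries = [
    DatabaseEntry("primary", default=True),
    DatabaseEntry("legacy"),
    DatabaseEntry("analytics"),
]

print(resolve_targets(entries))                        # no flag
print(resolve_targets(entries, database="analytics"))  # --database analytics
print(resolve_targets(entries, all_databases=True))    # --all
```

Validating the single-default rule up front means an ambiguous configuration fails before any database is touched.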
## Step 4: Target a Specific Database

```bash
# Generate migrations for primary only
$ dbwarden make-migrations "add user table" --database primary

# Apply to analytics only
$ dbwarden migrate --database analytics

# Check status of one database
$ dbwarden status --database primary
```

## Step 5: Check Status of All Databases

```bash
$ dbwarden status --all
```

Output:

```
Database: primary
Applied: 1
Pending: 0
Status: up-to-date

Database: analytics
Applied: 1
Pending: 0
Status: up-to-date
```

## Step 6: Using `dbwarden settings`

The settings commands allow runtime configuration changes without editing `dbwarden.py` directly:

```bash
# View current configuration
$ dbwarden settings show --all

# Set a default database
$ dbwarden settings default-database primary

# Add a new database entry
$ dbwarden settings database-add reporting postgresql://localhost:5432/reporting \
    --type postgresql \
    --model-path app.models.reporting

# Or add a MySQL database
$ dbwarden settings database-add legacy mysql+pymysql://localhost:3306/legacy \
    --type mysql \
    --model-path app.models.legacy

# Remove a database
$ dbwarden settings database-remove reporting

# Rename a database
$ dbwarden settings database-rename analytics analytics_v2
```

Settings commands rewrite the `dbwarden.py` file on disk using AST-based mutation. The changes persist, so review the resulting diff and commit it to version control like any other edit.
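AST-based mutation of a config file can be sketched with Python's `ast` module. This example renames a `database_name` value inside a `database_config(...)` call. It is an illustration of the general technique, not DBWarden's code: `RenameDatabase` and `rename_database` are hypothetical, and `ast.unparse` (Python 3.9+) discards comments and original formatting, which a production tool would need to preserve.

```python
import ast

SOURCE = '''
analytics = database_config(
    database_name="analytics",
    database_type="clickhouse",
)
'''


class RenameDatabase(ast.NodeTransformer):
    """Rewrite database_name="<old>" to database_name="<new>" in config calls."""

    def __init__(self, old: str, new: str):
        self.old = old
        self.new = new

    def visit_Call(self, node: ast.Call) -> ast.Call:
        self.generic_visit(node)
        is_config = isinstance(node.func, ast.Name) and node.func.id == "database_config"
        if is_config:
            for keyword in node.keywords:
                value = keyword.value
                if (
                    keyword.arg == "database_name"
                    and isinstance(value, ast.Constant)
                    and value.value == self.old
                ):
                    keyword.value = ast.Constant(value=self.new)
        return node


def rename_database(source: str, old: str, new: str) -> str:
    """Return the source with the matching database_name literal replaced."""
    tree = RenameDatabase(old, new).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


result = rename_database(SOURCE, "analytics", "analytics_v2")
print(result)
```

Working on the syntax tree instead of raw text means only the intended keyword argument changes, even if the same string appears elsewhere in the file. Note that this sketch leaves the Python variable name `analytics` untouched.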
## Step 7: Dev Mode with Multiple Databases Each database can independently configure dev mode: ```python primary = database_config( database_name="primary", database_type="postgresql", database_url_sync="postgresql://localhost/primary", dev_database_type="sqlite", dev_database_url="sqlite:///./dev_primary.db", model_paths=["app.models.primary"], ) legacy = database_config( database_name="legacy", database_type="mysql", database_url_sync="mysql+pymysql://localhost:3306/legacy", dev_database_type="sqlite", dev_database_url="sqlite:///./dev_legacy.db", model_paths=["app.models.legacy"], ) analytics = database_config( database_name="analytics", database_type="clickhouse", database_url_sync="http://localhost:8123/analytics", dev_database_type="sqlite", dev_database_url="sqlite:///./dev_analytics.db", model_paths=["app.models.analytics"], ) ``` ```bash # Dev mode for all databases $ dbwarden --dev migrate --all # Dev mode for a specific database $ dbwarden --dev migrate --database analytics ``` ## Step 8: Legacy Database with MySQL Metadata The legacy MySQL database uses `MyTableMeta` and `MyColumnMeta` for MySQL-specific features. 
Here is a sample model from `app/models/legacy/customer.py`: ```python from sqlalchemy import Integer, String, TIMESTAMP, Text from sqlalchemy.orm import Mapped, mapped_column from dbwarden.databases.mysql import MyTableMeta, MyColumnMeta, my class Customer(Base): __tablename__ = "customers" id: Mapped[int] = mapped_column(Integer, primary_key=True) name: Mapped[str] = mapped_column(String(200), nullable=False) notes: Mapped[str | None] = mapped_column(Text) created_at: Mapped[str] = mapped_column(TIMESTAMP) class Meta(MyTableMeta): my_engine = "InnoDB" my_charset = "utf8mb4" my_collate = "utf8mb4_unicode_ci" comment = "Legacy CRM customers" class id(MyColumnMeta): my = my.field(unsigned=True) class created_at(MyColumnMeta): my = my.field(on_update="CURRENT_TIMESTAMP") ``` Migrations for the legacy database work identically to other databases: ```bash # Generate migration for MySQL legacy database $ dbwarden make-migrations "add customer table" --database legacy # Apply to legacy only $ dbwarden migrate --database legacy ``` The generated DDL will target MySQL-native syntax: ```sql CREATE TABLE IF NOT EXISTS customers ( id INTEGER UNSIGNED NOT NULL AUTO_INCREMENT, name VARCHAR(200) NOT NULL, notes TEXT, created_at TIMESTAMP NOT NULL ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (id) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci COMMENT='Legacy CRM customers'; ``` ## Key Takeaways - Multiple `database_config()` calls create independent database targets - Each database has its own migration directory, lock, and history - `--database` targets a specific database; `--all` targets every database - `default=True` controls which database is used when `--database` is omitted - `settings` commands modify `dbwarden.py` at runtime without manual editing - Dev mode can be configured independently per database ## Related Documentation - [Multi-Database Configuration](../configuration/multi-database.md) - [Dev Mode](../configuration/dev-mode.md) - [Settings 
Command](../commands/settings.md)

## Next

[Section 9: FastAPI Integration](09-fastapi-integration.md)

========================================================================
PAGE: https://dbwarden.emiliano-go.com/cookbook/09-fastapi-integration/
========================================================================

# 9. FastAPI Integration

## What You'll Learn

- How to wire DBWarden into a FastAPI application lifecycle
- How to use `primary.async_session` as an injected dependency
- How to expose health check and migration endpoints
- How to validate schema on startup

## Prerequisites

- Docker (for PostgreSQL)
- `examples/fastapi-app/` directory

## Step 1: Configuration with Session Handles

```python
from dbwarden import database_config

primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync="postgresql://user:password@localhost:5432/myapp",
    database_url_async="postgresql+asyncpg://user:password@localhost:5432/myapp",
    model_paths=["app.models"],
)
```

The `primary` object is a `DatabaseHandle`. It exposes `primary.async_session` and `primary.sync_session` as FastAPI-compatible dependency annotations; no separate dependency module needed.

## Step 2: Lifespan Hook

```python
from contextlib import asynccontextmanager

from fastapi import FastAPI
from dbwarden.fastapi import dbwarden_lifespan


@asynccontextmanager
async def lifespan(app: FastAPI):
    async with dbwarden_lifespan(app, mode="check"):
        yield


app = FastAPI(lifespan=lifespan)
```

`dbwarden_lifespan` wraps every startup and shutdown:

1. **Schema validation** (mode `"check"`): verifies that no migrations are pending and the database is in a known state
2. **Readiness gate**: the app won't accept traffic until validation passes
3. **Connection pool warmup**: pre-connects to the database
4.
**On shutdown**: disposes all engine pools and ClickHouse clients

Available modes:

- `"check"`: validate schema, fail on pending migrations (recommended for production)
- `"migrate"`: apply pending migrations automatically on startup
- `"skip"`: no startup checks

## Step 3: Session Dependency in Routes

```python
from fastapi import APIRouter, HTTPException
from sqlalchemy import select

from config import primary
from app.models import User
from app.schemas import UserResponse

# Prefix assumed from the /api/v1/users/ URL used later in this section
router = APIRouter(prefix="/users")


@router.get("/{user_id}", response_model=UserResponse)
async def get_user(user_id: int, session: primary.async_session):
    result = await session.execute(
        select(User).where(User.id == user_id)
    )
    user = result.scalar_one_or_none()
    if not user:
        raise HTTPException(status_code=404, detail="User not found")
    return user
```

`primary.async_session` is a type alias for `Annotated[AsyncSession, Depends(...)]`. FastAPI resolves it to an actual database session using the engine configured in `database_config()`. The session is automatically:

- Opened when the route handler starts
- Committed (or rolled back on exception) when the handler finishes
- Closed and returned to the pool

## Step 4: Health Endpoints

```python
from dbwarden.fastapi import DBWardenHealthRouter

app.include_router(DBWardenHealthRouter(), prefix="/health")
```

This adds:

| Endpoint | Description |
|----------|-------------|
| `GET /health/` | Overall health status across all databases |
| `GET /health/liveness` | Is the app alive? (lightweight) |
| `GET /health/readiness` | Is the app ready for traffic?
(checks DB connectivity) | | `GET /health/{database_name}` | Per-database health status | Sample response: ```json { "status": "ok", "databases": { "primary": { "status": "ok", "connected": true, "pending_migrations": 0, "applied_migrations": 5, "lock_active": false } } } ``` ## Step 5: Migration Endpoints ```python from dbwarden.fastapi import DBWardenRouter app.include_router(DBWardenRouter(), prefix="/db") ``` | Endpoint | Description | |----------|-------------| | `GET /db/status` | JSON representation of `dbwarden status` | | `POST /db/migrate` | Trigger migration execution at runtime | These endpoints are useful for management UIs or automated deployment tooling. ## Step 6: The Complete App ```python from contextlib import asynccontextmanager from fastapi import FastAPI from dbwarden.fastapi import ( DBWardenHealthRouter, DBWardenRouter, dbwarden_lifespan, ) from app.routes import users @asynccontextmanager async def lifespan(app: FastAPI): async with dbwarden_lifespan(app, mode="check"): yield app = FastAPI( title="DBWarden FastAPI Example", lifespan=lifespan, ) app.include_router(users.router, prefix="/api/v1") app.include_router(DBWardenHealthRouter(), prefix="/health") app.include_router(DBWardenRouter(), prefix="/db") ``` ## Step 7: Run and Test ```bash # Install dependencies uv add dbwarden sqlalchemy fastapi uvicorn asyncpg # Start PostgreSQL docker run -d --name pg -e POSTGRES_USER=user \ -e POSTGRES_PASSWORD=password -e POSTGRES_DB=myapp \ -p 5432:5432 postgres:16 # Initialize and migrate $ dbwarden init $ dbwarden make-migrations "create users table" $ dbwarden migrate # Start the app uvicorn app.main:app --reload ``` ```bash # Health check curl http://localhost:8000/health/ # Create a user curl -X POST http://localhost:8000/api/v1/users/ \ -H "Content-Type: application/json" \ -d '{"email": "alice@example.com", "username": "alice"}' # Migration status curl http://localhost:8000/db/status ``` ## Key Takeaways - `database_config()` returns a 
`DatabaseHandle` with built-in FastAPI dependencies - `dbwarden_lifespan` integrates schema validation into the app lifecycle - `primary.async_session` works directly as a route parameter type annotation - `DBWardenHealthRouter` exposes liveness, readiness, and per-database health - `DBWardenRouter` exposes migration status and execution as HTTP endpoints ## Related Documentation - [FastAPI Integration Overview](../fastapi/index.md) - [FastAPI Tutorial: First Steps](../fastapi/tutorial/first-steps.md) - [FastAPI Tutorial: Complete Application](../fastapi/tutorial/complete-application.md) - [Session Dependency](../fastapi/tutorial/session-dependency.md) - [Health Endpoints](../fastapi/tutorial/health-endpoints.md) ## Next [Section 10: Auto Schemas](10-auto-schemas.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/cookbook/10-auto-schemas/ ======================================================================== # 10. Auto-Generated Pydantic Schemas ## What You'll Learn - How `@auto_schema` generates Pydantic schemas from model annotations - How `public = False` controls field visibility - How `CreateSchema`, `UpdateSchema`, and `PublicSchema` differ ## Prerequisites - `examples/auto-schema/` directory (no config file needed; `@auto_schema` works at model definition time) - `uv add dbwarden sqlalchemy` ## The Problem In FastAPI applications, you typically define SQLAlchemy models for the database and Pydantic schemas for the API. This means maintaining two parallel definitions for every entity: the ORM layer and the API layer. They drift apart over time. DBWarden's `@auto_schema` eliminates this duplication by deriving Pydantic schemas directly from model annotations. 
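To make the duplication problem concrete without relying on DBWarden internals, here is a minimal, stdlib-only sketch of deriving a "public" view from a single model definition. The `derive_public` helper and the `public` metadata key are hypothetical and only illustrate the principle; this is not how `@auto_schema` is implemented.

```python
from dataclasses import dataclass, field, fields, make_dataclass


@dataclass
class User:
    # Single source of truth for the entity's fields.
    id: int
    email: str
    # Hypothetical marker: metadata flags this field as non-public.
    password_hash: str = field(metadata={"public": False})


def derive_public(model: type) -> type:
    """Build a new dataclass containing only the public fields of `model`."""
    public_fields = [
        (f.name, f.type)
        for f in fields(model)
        if f.metadata.get("public", True)
    ]
    return make_dataclass(f"{model.__name__}Public", public_fields)


UserPublic = derive_public(User)
user = User(id=1, email="alice@example.com", password_hash="secret")

# Copy only the fields that the derived class declares.
public = UserPublic(**{f.name: getattr(user, f.name) for f in fields(UserPublic)})
print(public)
```

Because the public view is computed from the model, adding or hiding a field happens in one place, which is the property the decorator described in the following steps is meant to provide.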
## Step 1: Define a Model with @auto_schema

```python
from datetime import datetime

from sqlalchemy import Boolean, DateTime, Integer, String, text
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

from dbwarden.databases import TableMeta, auto_schema


class Base(DeclarativeBase):
    pass


@auto_schema
class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    email: Mapped[str] = mapped_column(String(255), unique=True, nullable=False)
    username: Mapped[str] = mapped_column(String(100), unique=True, nullable=False)
    full_name: Mapped[str | None] = mapped_column(String(200), nullable=True)
    password_hash: Mapped[str] = mapped_column(String(255), nullable=False)
    is_active: Mapped[bool] = mapped_column(Boolean, default=True)
    created_at: Mapped[datetime] = mapped_column(DateTime, server_default=text("CURRENT_TIMESTAMP"))

    class Meta(TableMeta):
        comment = "User accounts with auto-generated Pydantic schemas"

        class email:
            comment = "Login email"

        class password_hash:
            public = False  # Excluded from PublicSchema
```

## Step 2: What Gets Generated

The decorator creates four schema classes on the model:

### `User.CreateSchema`

Used for POST requests: includes all fields that the client should provide. Server-defaulted fields (like auto-increment `id`) are excluded.

```python
create = User.CreateSchema(
    email="alice@example.com",
    username="alice",
    password_hash="secret",
    full_name="Alice Smith",
    is_active=True,
)
```

### `User.UpdateSchema`

All fields optional: used for PATCH requests.

```python
update = User.UpdateSchema(full_name="Alice Johnson")
```

### `User.PublicSchema`

Excludes fields marked `public = False`. Perfect for API responses where you never want to leak `password_hash`.
```python user = User( id=1, email="alice@example.com", username="alice", password_hash="secret", is_active=True, ) public = User.PublicSchema.model_validate(user) print(public.model_dump()) # { # "id": 1, # "email": "alice@example.com", # "username": "alice", # "full_name": None, # "is_active": True, # "created_at": ..., # } # password_hash is NOT included ``` ### `User.Schema` All mapped columns, including those marked `public = False`. ## Step 3: Controlling Visibility | Technique | Effect | |-----------|--------| | `class Meta: class field: public = False` | Excluded from PublicSchema | | Field name starting with `_` | Implicitly `public = False` | | `SchemaConfig(exclude_public=["field"])` | Excluded from PublicSchema | | `SchemaConfig(exclude_create=["field"])` | Excluded from CreateSchema | ## Step 4: Customizing Schema Generation ```python from dbwarden.databases import auto_schema, SchemaConfig @auto_schema(config=SchemaConfig( exclude_public=["internal_note"], exclude_create=["created_at"], field_overrides={ "email": EmailStr, }, )) class User(Base): ... 
``` `SchemaConfig` supports: | Option | Description | |--------|-------------| | `exclude_always` | Excluded from all schemas | | `exclude_create` | Excluded from CreateSchema only | | `exclude_update` | Excluded from UpdateSchema only | | `exclude_public` | Excluded from PublicSchema only | | `field_overrides` | Override Pydantic field types | | `required_always` | Fields always required | | `optional_always` | Fields always optional | ## Key Takeaways - `@auto_schema` generates CreateSchema, UpdateSchema, PublicSchema, and Schema - `public = False` in `class Meta` controls API visibility; no manual filtering in routes - Fields starting with `_` are implicitly non-public - Use `User.PublicSchema.model_validate(instance)` to convert model instances to API responses - Customize with `SchemaConfig` for advanced use cases ## Related Documentation - [Modeling Guide: Auto-Generated Schemas](../getting-started/modeling.md#auto-generated-pydantic-schemas-with-auto_schema) - [SQLAlchemy Models Reference](../models.md) ## Next [Section 11: Observability](11-observability.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/cookbook/11-observability/ ======================================================================== # 11. 
Observability ## What You'll Learn - How to enable Prometheus metrics for DBWarden - How to use structured JSON logging - How to add query tracing middleware to FastAPI - How to monitor connection pool health ## Prerequisites - `examples/observability/` directory - Docker (for optional Prometheus + Grafana) ## Step 1: Enable Prometheus Metrics Install with metrics support: ```bash uv add "dbwarden[metrics]" ``` DBWarden exposes six Prometheus metric families: | Metric | Type | Labels | Description | |--------|------|--------|-------------| | `dbwarden_migrations_total` | Counter | `database`, `status` | Migration count | | `dbwarden_migration_duration_seconds` | Histogram | `database` | Execution time | | `dbwarden_schema_version` | Gauge | `database` | Current version | | `dbwarden_pending_migrations` | Gauge | `database` | Pending count | | `dbwarden_errors_total` | Counter | `database`, `error_type` | Error count | | `dbwarden_seed_version` | Gauge | `database` | Current seed version | ## Step 2: Add Metrics to FastAPI ```python from dbwarden.fastapi import MetricsMiddleware, MetricsRouter # Middleware captures request duration and counts app.add_middleware(MetricsMiddleware) # Router exposes /metrics endpoint app.include_router(MetricsRouter(), prefix="/metrics") ``` ```bash curl http://localhost:8000/metrics ``` Output: ``` # HELP dbwarden_migrations_total Total number of migrations # TYPE dbwarden_migrations_total counter dbwarden_migrations_total{database="primary",status="applied"} 5 # HELP dbwarden_pending_migrations Number of pending migrations # TYPE dbwarden_pending_migrations gauge dbwarden_pending_migrations{database="primary"} 0 ``` ## Step 3: Structured Logging ```python import os os.environ["DBWARDEN_LOG_JSON"] = "1" ``` Or via environment variable: ```bash DBWARDEN_LOG_JSON=1 uvicorn app.main:app ``` This switches from colored human-readable output to JSON: ```json {"timestamp": "2025-01-15T10:30:00Z", "level": "INFO", "event": 
"migration_applied", "database": "primary", "duration_ms": 42, "version": "0005"} ``` JSON logs are easier to ingest into ELK, Datadog, or other log aggregators. ## Step 4: Query Tracing ```python from dbwarden.fastapi import QueryTracingMiddleware app.add_middleware(QueryTracingMiddleware) ``` This logs every SQL query with its duration: ```json {"event": "query", "duration_ms": 3, "database": "primary", "statement": "SELECT ..."} ``` Useful for: - Identifying slow queries in development - Building a query performance baseline - Debugging N+1 query patterns ## Step 5: Pool Metrics Collector ```python from dbwarden.fastapi import PoolMetricsCollector ``` This monitors SQLAlchemy connection pool health and exposes: - Pool size (current/total) - Connections in use - Connections overflow - Pool timeouts ## Step 6: Full Setup ```python from contextlib import asynccontextmanager from fastapi import FastAPI from dbwarden.fastapi import ( DBWardenHealthRouter, dbwarden_lifespan, MetricsMiddleware, MetricsRouter, QueryTracingMiddleware, ) @asynccontextmanager async def lifespan(app: FastAPI): async with dbwarden_lifespan(app, mode="check"): yield app = FastAPI( title="DBWarden Observability Example", lifespan=lifespan, ) app.add_middleware(QueryTracingMiddleware) app.add_middleware(MetricsMiddleware) app.include_router(MetricsRouter(), prefix="/metrics") app.include_router(DBWardenHealthRouter(), prefix="/health") ``` ## Step 7: Prometheus + Grafana (Optional) ```yaml services: prometheus: image: prom/prometheus:latest ports: - "9090:9090" volumes: - ./prometheus.yml:/etc/prometheus/prometheus.yml grafana: image: grafana/grafana:latest ports: - "3000:3000" ``` ```bash docker compose up -d ``` - Prometheus: http://localhost:9090 - Grafana: http://localhost:3000 In Grafana, add Prometheus data source (`http://prometheus:9090`) and create dashboards using the `dbwarden_*` metrics. 
## Key Takeaways - Metrics are opt-in via `uv add "dbwarden[metrics]"` - Six metric families cover migration, schema, and error tracking - `DBWARDEN_LOG_JSON=1` switches to structured JSON logging - `QueryTracingMiddleware` logs every SQL query with duration - `PoolMetricsCollector` monitors connection pool health - Metrics are compatible with standard Prometheus + Grafana setup ## Related Documentation - [Observability Guide](../observability.md) - [FastAPI Metrics](../fastapi/tutorial/first-steps.md) (see FastAPI tutorial) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/cookbook/ ======================================================================== # Cookbook & Examples Practical, runnable examples that walk through the entire DBWarden workflow: from project setup through advanced observability patterns. ## How to Use Each cookbook section links to code under the [`examples/`](https://github.com/emiliano-gandini-outeda/DBWarden/tree/main/examples) directory. The **core examples** (sections 1–7) use SQLite and require only `uv add dbwarden`. Advanced examples may need Docker for PostgreSQL, ClickHouse, or Prometheus. 
``` examples/ ├── core/ # Sections 1–7: progressive SQL workflow ├── multi-database/ # Section 8 ├── fastapi-app/ # Section 9 ├── auto-schema/ # Section 10 └── observability/ # Section 11 ``` ## Sections | # | Section | What You'll Learn | Example Dir | |---|---------|-------------------|-------------| | 1 | [Project Setup](01-project-setup.md) | `init`, `config`, understanding `database_config()` | `examples/core/` | | 2 | [Models & Migrations](02-models-and-migrations.md) | Model definitions, `make-migrations`, `new`, `make-rollback` | `examples/core/` | | 3 | [Apply & Inspect](03-apply-and-inspect.md) | `migrate`, `rollback`, `downgrade`, `history`, `status`, `check`, `check-db` | `examples/core/` | | 4 | [Offline & CI](04-offline-ci.md) | `export-models`, `make-migrations --offline` | `examples/core/` | | 5 | [Schema Inspection](05-schema-inspection.md) | `diff`, `snapshot`, `generate-models` | `examples/core/` | | 6 | [Safety & Impact](06-safety-impact.md) | `check`, `check-impact`, destructive change detection | `examples/core/` | | 7 | [Seeds](07-seeds.md) | `seed create/apply/rollback/list`, SQL seeds, `@seed_data` | `examples/core/` | | 8 | [Multi-Database](08-multi-database.md) | Multiple `database_config()`, PG + ClickHouse, `--all` flag | `examples/multi-database/` | | 9 | [FastAPI Integration](09-fastapi-integration.md) | Lifespan hooks, health endpoints, session DI, migration endpoints | `examples/fastapi-app/` | | 10 | [Auto Schemas](10-auto-schemas.md) | `@auto_schema`, `CreateSchema`, `UpdateSchema`, `PublicSchema` | `examples/auto-schema/` | | 11 | [Observability](11-observability.md) | Prometheus metrics, structured logging, query tracing | `examples/observability/` | ## Quick Start (Core) ```bash cd examples/core uv add dbwarden sqlalchemy bash scripts/01-setup.sh bash scripts/02-models-migrations.sh bash scripts/03-apply-inspect.sh ``` Each section in the cookbook explains what these commands do, what SQL they produce, and why it matters. 
## Database-Specific Examples The core examples use SQLite for zero-dependency setup. For production, DBWarden fully supports PostgreSQL, MySQL, and ClickHouse — each with its own deep-dive guide and dedicated example patterns. ### PostgreSQL PostgreSQL is a first-class backend with full round-trip support (read and write schema). The FastAPI integration example in [Section 9](09-fastapi-integration.md) uses PostgreSQL, and [Section 8](08-multi-database.md) shows PostgreSQL + ClickHouse together. For the complete reference on PostgreSQL-specific metadata (identity columns, collation, compression, generated columns, tablespace, inheritance, exclusion constraints, deferrable FKs, advanced index options), see the [PostgreSQL Deep Dive](../databases/postgresql.md). ```python from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/myapp", database_url_async="postgresql+asyncpg://user:password@localhost:5432/myapp", model_paths=["app.models"], ) ``` ### MySQL / MariaDB MySQL and MariaDB are first-class backends with full round-trip support. All MySQL-specific metadata (engine, charset, collation, row format, auto_increment, unsigned columns, ON UPDATE, column comments) is captured by the snapshot, diffed correctly, and emitted as valid DDL. See the [MySQL Deep Dive](../databases/mysql.md) for the complete reference, including MySQL-specific model metadata via `class Meta(MyTableMeta)`. ```python from dbwarden import database_config legacy = database_config( database_name="legacy", database_type="mysql", database_url_sync="mysql+pymysql://user:password@localhost:3306/legacy", model_paths=["app.legacy_models"], ) ``` ### ClickHouse ClickHouse is supported with partial round-trip (read schema and auto-generate most DDL). 
DBWarden uses the ClickHouse HTTP client directly for DDL execution and supports full engine metadata via `class Meta(CHTableMeta)` with `ChEngineSpec`, `ProjectionSpec`, and `CHColumnMeta`. See the [ClickHouse Deep Dive](../databases/clickhouse.md) for full details on materialized views, projections, dictionaries, replicated engines, and ClickHouse-specific metadata. ClickHouse is typically configured alongside a transactional database (see [Section 8](08-multi-database.md) for a PostgreSQL + ClickHouse example). ======================================================================== PAGE: https://dbwarden.emiliano-go.com/databases/clickhouse/ ======================================================================== # ClickHouse DBWarden treats ClickHouse as a **first-class backend**: every natively supported feature is reverse-engineered, diffed, and emitted as correct DDL. ## First-Class Features "First-class" means the round-trip is verified: reverse-engineer a live database with `generate-models`, feed the output back into `make-migrations`, and get **zero diff**. 
```bash # Step 1: reverse-engineer your live ClickHouse database $ dbwarden generate-models -d analytics # Step 2: feed the generated models back in, zero diff $ dbwarden make-migrations -d analytics # -> "No new migrations to generate" (output is empty; your models match the DB exactly) ``` The following ClickHouse features are fully supported in this round-trip: | Category | Features | |----------|----------| | Engine Spec | `MergeTree`, `ReplicatedMergeTree`, `SummingMergeTree`, `AggregatingMergeTree`, `CollapsingMergeTree`, `VersionedCollapsingMergeTree`, `ReplacingMergeTree`, `Distributed` via `ChEngineSpec(name, args, zookeeper_path, replica_name, settings)` | | Ordering | `ORDER BY (col1, col2)` via `ch_order_by` (string or list) | | Primary Key | `PRIMARY KEY (col1)` via `ch_primary_key` (must be prefix of order by) | | Partitioning | `PARTITION BY toYYYYMM(col)` via `ch_partition_by` | | Sampling | `SAMPLE BY intHash64(col)` via `ch_sample_by` | | TTL | Table and column TTL via `ch_ttl` as list of expressions | | Settings | `SETTINGS index_granularity=8192` via `ch_settings` dict | | Materialized Views | `CREATE MATERIALIZED VIEW ... TO target AS SELECT ...` via `ch_select_statement`, `ch_to_table` | | Projections | `PROJECTION name (SELECT ...)` via `ProjectionSpec` list | | Dictionaries | `CREATE DICTIONARY ... SOURCE(...) LIFETIME(...) LAYOUT(...)` via `ch_dict_*` fields | | Skip Indexes | `ALTER TABLE ... ADD INDEX ... 
TYPE bloom_filter GRANULARITY N` via `ChIndexSpec` entries in `ch_indexes` | | Column Codecs | `CODEC(ZSTD(3))` via `ch.field(codec=...)` on `CHColumnMeta` | | LowCardinality / Nullable | Type wrappers via `ch.field(low_cardinality=..., nullable=...)` on `CHColumnMeta` | | Column Defaults | `DEFAULT expr`, `MATERIALIZED expr`, `ALIAS expr` via column Meta | | Type Normalization | `VARCHAR` -> `String`, `INTEGER` -> `Int32`, `BIGINT` -> `Int64`, `FLOAT(53)` -> `Float64`, `NUMERIC(p,s)` -> `Decimal(p,s)`, `BOOLEAN` -> `Bool`, `ARRAY(Integer)` -> `Array(Int32)`, `Enum` -> `Enum8/Enum16`, `UUID` -> `UUID`, `JSON` -> `JSON`, `DATETIME` -> `DateTime` / `DateTime64` | | Auto-detect | `generate-models` auto-enables ClickHouse engine metadata when `database_type="clickhouse"` (no `--clickhouse-engines` flag needed) | | Snapshot | Full `system.tables` / `system.columns` extraction with CH metadata in `ch_options` and `ch_column` | ## Declaring Metadata ClickHouse metadata is declared in a `class Meta` inner class on the model. This is the **only** supported surface. Passing options via `mapped_column(info=...)` raises `DBWardenConfigError`. 
### Table-Level Meta Inherit from `CHTableMeta` on your `class Meta`: ```python from datetime import date, datetime from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column from dbwarden.databases.clickhouse import CHTableMeta, ChEngineSpec, ProjectionSpec class Base(DeclarativeBase): pass class Event(Base): __tablename__ = "events" id: Mapped[int] = mapped_column(Int64, primary_key=True) event_date: Mapped[date] = mapped_column(Date) payload: Mapped[str] = mapped_column(String) class Meta(CHTableMeta): ch_engine = ChEngineSpec("MergeTree") ch_order_by = ["event_date", "id"] ch_primary_key = "event_date" ch_partition_by = "toYYYYMM(event_date)" ch_sample_by = "intHash64(id)" ch_ttl = ["event_date + toIntervalYear(1)"] ch_settings = {"index_granularity": "8192"} comment = "Core event store" ``` `CHTableMeta` inherits from `TableMeta`, which provides common attributes shared across all backends: | Attribute | Type | SQL | |-----------|------|-----| | `comment` | `str` | `COMMENT ON TABLE t IS '...'` | | `indexes` | `list[dict]` | Common cross-database indexes | | `checks` | `list[dict]` | `ALTER TABLE t ADD CONSTRAINT ... CHECK (...)` | | `uniques` | `list[dict]` | `ALTER TABLE t ADD CONSTRAINT ... 
UNIQUE (...)` | ClickHouse-specific `CHTableMeta` attributes: | Attribute | Type | SQL | |-----------|------|-----| | `ch_engine` | `ChEngineSpec` | `ENGINE = MergeTree()` (or other engine) | | `ch_order_by` | `str` or `list[str]` | `ORDER BY (col1, col2)` | | `ch_primary_key` | `str` or `list[str]` | `PRIMARY KEY (col1)` (must be prefix of order by) | | `ch_partition_by` | `str` | `PARTITION BY toYYYYMM(col)` | | `ch_sample_by` | `str` | `SAMPLE BY intHash64(col)` | | `ch_ttl` | `list[str]` | `TTL expr1, expr2` | | `ch_settings` | `dict[str, str]` | `SETTINGS key=value` (emitted last) | | `ch_object_type` | `str` | `"table"`, `"materialized_view"`, or `"dictionary"` (auto-detected) | | `ch_select_statement` | `str` | `AS SELECT ...` for materialized views | | `ch_to_table` | `str` | `TO target_table` for materialized views | | `ch_dictionary` | `bool` | Whether this model is a dictionary | | `ch_dict_layout` | `str` | Dictionary layout (e.g., `"FLAT()"`) | | `ch_dict_source` | `str` | Dictionary source (e.g., `"CLICKHOUSE(TABLE 'src')"`) | | `ch_dict_lifetime` | `int` or `str` | Dictionary cache lifetime | | `ch_dict_primary_key` | `str` or `list[str]` | Dictionary primary key | | `ch_indexes` | `list[ChIndexSpec]` | Skip indexes via `ALTER TABLE ... 
ADD INDEX ...` | | `ch_projections` | `list[ProjectionSpec]` | Named projections | | `ch_zookeeper_path` | `str` | ZooKeeper path for replicated engines | | `ch_replica_name` | `str` | Replica name for replicated engines | The `ChTableSpec` dataclass (from `dbwarden.databases.clickhouse` or `dbwarden`) mirrors these attributes for programmatic access: ```python from dbwarden.databases import ChTableSpec spec = ChTableSpec( engine="MergeTree", order_by=["event_date", "id"], partition_by="toYYYYMM(event_date)", ttl="event_date + toIntervalYear(1)", settings={"index_granularity": "8192"}, ) ``` ### Engine Spec Use `ChEngineSpec` to define the table engine: ```python from dbwarden.databases.clickhouse import ChEngineSpec # Simple engine ch_engine = ChEngineSpec("MergeTree") # Engine with positional arguments ch_engine = ChEngineSpec("ReplacingMergeTree", args=("version_column",)) # Replicated engine with ZooKeeper path and replica name ch_engine = ChEngineSpec("ReplicatedMergeTree", zookeeper_path="/clickhouse/tables/shard1/events", replica_name="{replica}") # Distributed engine with settings ch_engine = ChEngineSpec("Distributed", args=("cluster", "db", "events", "rand()"), settings={"insert_distributed_sync": "1"}) ``` The `ChEngineSpec` constructor fields: | Field | Type | Description | |-------|------|-------------| | `name` | `str` | Engine name (e.g., `"MergeTree"`, `"ReplicatedMergeTree"`) | | `args` | `tuple[str, ...]` | Positional engine arguments | | `zookeeper_path` | `str` or `None` | ZooKeeper path (injected as first engine arg) | | `replica_name` | `str` or `None` | Replica name (injected as second engine arg) | | `settings` | `dict[str, str]` or `None` | `SETTINGS key=value` pairs | ### Projections Use `ProjectionSpec` in `ch_projections` to attach named projections: ```python from dbwarden.databases.clickhouse import ProjectionSpec class Meta(CHTableMeta): ch_order_by = ["author", "created_at"] ch_projections = [ ProjectionSpec("by_author", "SELECT * 
ORDER BY author"), ProjectionSpec("daily_stats", "SELECT toDate(created_at) AS day, count() GROUP BY day"), ] ``` ### Skip Indexes Use `ChIndexSpec` in `ch_indexes`: ```python from dbwarden.databases.clickhouse import ChIndexSpec class Meta(CHTableMeta): ch_indexes = [ ChIndexSpec("ix_payload", ["payload"], type="bloom_filter", granularity=1), ChIndexSpec("ix_url", ["url"], type="minmax", granularity=3), ] ``` The `ChIndexSpec` constructor fields: | Field | Type | Description | |-------|------|-------------| | `name` | `str` | Index name | | `columns` | `list[str]` | Column names for the index | | `type` | `str` | Index type (`bloom_filter`, `minmax`, `set`, etc.) | | `granularity` | `int` | Index granularity (default: `1`) | | `expr` | `str` or `None` | Optional expression override for the index definition | Generated SQL: ```sql ALTER TABLE events ADD INDEX ix_payload (payload) TYPE bloom_filter GRANULARITY 1 ALTER TABLE events ADD INDEX ix_url (url) TYPE minmax GRANULARITY 3 ``` ### Column-Level Meta Use `CHColumnMeta` inner classes for per-column metadata. The inner class must be named after the column. 
Use `ch = ch.field(...)` to set column-level options: ```python from datetime import datetime from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column from dbwarden.databases.clickhouse import CHTableMeta, CHColumnMeta, ChEngineSpec, ch class Base(DeclarativeBase): pass class Event(Base): __tablename__ = "events" id: Mapped[int] = mapped_column(Int64, primary_key=True) payload: Mapped[str] = mapped_column(String) event_time: Mapped[datetime] = mapped_column(DateTime) tags: Mapped[list[str]] = mapped_column(ARRAY(String)) class Meta(CHTableMeta): ch_engine = ChEngineSpec("MergeTree") ch_order_by = "event_time" class payload(CHColumnMeta): ch = ch.field(codec="ZSTD(3)", nullable=False) class tags(CHColumnMeta): ch = ch.field(low_cardinality=True) class event_time(CHColumnMeta): ch = ch.field(default_expression="now()") ``` `CHColumnMeta` includes common column attributes shared across all backends: | Attribute | Type | SQL | |-----------|------|-----| | `comment` | `str` | `COMMENT ON COLUMN t.c IS '...'` | | `public` | `bool` | Controls field visibility in schemas generated by `@auto_schema` | | `ch` | `ChFieldSpec` | ClickHouse-specific column options (see table below) | ClickHouse-specific `ChFieldSpec` fields (set via `ch.field(...)`): | Keyword | Type | SQL | |---------|------|-----| | `codec` | `str` | `CODEC(ZSTD(3))` | | `default_expression` | `str` | `DEFAULT expr` | | `materialized` | `str` | `MATERIALIZED expr` | | `alias` | `str` | `ALIAS expr` | | `ttl` | `str` | Column-level TTL expression | | `low_cardinality` | `bool` | Wrap type in `LowCardinality(...)` | | `nullable` | `bool` | Wrap type in `Nullable(...)` | ### Materialized Views Materialized views use `ch_select_statement` instead of `ch_engine` for the target: ```python class EventRollup(Base): __tablename__ = "event_rollup_mv" event_date: Mapped[date] = mapped_column(Date) total: Mapped[int] = mapped_column(Int64) class Meta(CHTableMeta): ch_object_type = "materialized_view" ch_select_statement = ( "SELECT toDate(event_time) AS 
event_date, count() AS total " "FROM events GROUP BY event_date" ) ch_to_table = "mv_target" ``` When `ch_to_table` is set, the generated SQL omits the `ENGINE` clause (ClickHouse rejects `ENGINE` combined with a `TO` clause): ```sql CREATE MATERIALIZED VIEW IF NOT EXISTS event_rollup_mv TO dbwarden.mv_target ( event_date Date NOT NULL, total Int64 NOT NULL ) AS SELECT toDate(event_time) AS event_date, count() AS total FROM events GROUP BY event_date ``` ### Dictionaries ClickHouse dictionaries use `ch_dictionary = True` and related `ch_dict_*` fields: ```python class CountryCode(Base): __tablename__ = "country_codes" code: Mapped[int] = mapped_column(Int64) name: Mapped[str] = mapped_column(String) class Meta(CHTableMeta): ch_dictionary = True ch_dict_layout = "FLAT()" ch_dict_source = "CLICKHOUSE(HOST 'localhost' TABLE 'countries')" ch_dict_lifetime = "MIN 0 MAX 3600" ch_dict_primary_key = "code" ``` Required fields when `ch_dictionary = True`: | Field | Description | Example | |-------|-------------|---------| | `ch_dict_layout` | Dictionary layout | `"FLAT()"`, `"COMPLEX_KEY_HASHED()"` | | `ch_dict_source` | Source configuration | `"CLICKHOUSE(HOST '...' TABLE '...')"` | | `ch_dict_lifetime` | Cache lifetime | `"MIN 0 MAX 3600"` or `3600` | Optional field: | Field | Description | Default | |-------|-------------|---------| | `ch_dict_primary_key` | Primary key expression | First column | Generated SQL: ```sql CREATE DICTIONARY IF NOT EXISTS country_codes ( code Int64, name String ) PRIMARY KEY code SOURCE(CLICKHOUSE(HOST 'localhost' TABLE 'countries')) LIFETIME(MIN 0 MAX 3600) LAYOUT(FLAT()) ``` Column types render as CH-native types (`Int64`, `String`) rather than SQLAlchemy types (`BIGINT`, `VARCHAR`). ## DDL Behavior ### ADD COLUMN / DROP COLUMN Standard `ALTER TABLE ... ADD COLUMN` and `ALTER TABLE ... DROP COLUMN` work for ClickHouse. Column types render as CH-native type names (e.g., `ALTER TABLE events ADD COLUMN value Float64 NOT NULL`). 
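The type rendering described here can be pictured as a lookup from generic SQL type names to ClickHouse-native names. The sketch below is illustrative only: `to_clickhouse_type` is a hypothetical helper, not part of DBWarden, and the mapping simply restates pairs from the Type Normalization row of the feature table.

```python
import re

# Pairs restated from the Type Normalization row of the feature table.
SIMPLE_TYPES = {
    "VARCHAR": "String",
    "INTEGER": "Int32",
    "BIGINT": "Int64",
    "FLOAT(53)": "Float64",
    "BOOLEAN": "Bool",
    "UUID": "UUID",
    "JSON": "JSON",
}

NUMERIC_RE = re.compile(r"^NUMERIC\((\d+),\s*(\d+)\)$")


def to_clickhouse_type(sql_type: str) -> str:
    """Map a generic SQL type name to a ClickHouse-native name (hypothetical helper)."""
    normalized = sql_type.strip().upper()
    numeric = NUMERIC_RE.match(normalized)
    if numeric:
        precision, scale = numeric.groups()
        return f"Decimal({precision},{scale})"
    # Unknown names pass through unchanged rather than guessing.
    return SIMPLE_TYPES.get(normalized, sql_type)


for name in ["BIGINT", "varchar", "NUMERIC(10, 2)", "FLOAT(53)"]:
    print(name, "->", to_clickhouse_type(name))
```

Passing unknown names through unchanged keeps the sketch honest about its limits: it covers only the pairs the table lists and makes no claim about how DBWarden handles other types.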
### ALTER COLUMN DEFAULT ClickHouse supports `ALTER TABLE ... MODIFY COLUMN ... DEFAULT ...` (and `DROP DEFAULT`), which maps to the standard `SET DEFAULT` / `DROP DEFAULT` pattern. ### ALTER COLUMN TYPE ClickHouse supports in-place `ALTER TABLE ... MODIFY COLUMN ... type` for compatible type changes (e.g., `Int32` to `Int64`). Incompatible type changes (e.g., `String` to `Int64`) require table recreation. ### ALTER TABLE MODIFY SETTING Not all `SETTINGS` are dynamic. Some settings require table recreation. Review the ClickHouse documentation for your version to confirm which settings are `MODIFY SETTING` compatible. ### DDL Transactional Behavior ClickHouse executes DDL statements individually. There is no transactional DDL; partial failure during a multi-statement migration can leave the schema in an inconsistent state. Each statement is applied atomically, but there is no rollback across statements. ### Statement Ordering The standard statement ordering applies to ClickHouse. Operations that emit only comments (such as a safe type change) produce zero-effect placeholders. The ordering ensures the upgrade script remains structurally consistent even when backends skip operations. ## Previously Manual Operations Now Auto-Generated These operations were previously comment-only placeholders. DBWarden now auto-generates SQL for them: ### Table Rename ClickHouse supports `RENAME TABLE ... TO ...`. 
DBWarden emits real SQL: ```sql RENAME TABLE users TO accounts; ``` ### Nullable / LowCardinality Type Changes DBWarden computes the target type by stripping Nullable/LowCardinality wrappers from the base type and re-wrapping with the new flags: ```sql ALTER TABLE events MODIFY COLUMN payload Nullable(String); ``` ### Projections DBWarden diffs the projection lists by name and emits native `DROP/ADD PROJECTION`: ```sql ALTER TABLE events DROP PROJECTION by_date; ALTER TABLE events ADD PROJECTION by_date SELECT event_date, sum(amount) GROUP BY event_date; ``` ## What Emits Comments Only These operations are not supported by ClickHouse DDL. DBWarden emits comment placeholders or points to available flags: ### Safe Type Change `--safe-type-change` is not supported for ClickHouse. The multi-step temp-column strategy emits a comment. ```sql -- ClickHouse safe type change not supported. -- Manually recreate users with new type for bio. ``` ### Foreign Keys ClickHouse does not enforce foreign key constraints, and DBWarden **prohibits** `ForeignKey()` on models configured for a ClickHouse-backed database. Using `ForeignKey()` raises `DBWardenConfigError` at model discovery time: ``` DBWardenConfigError: Column 'sync.repo' uses ForeignKey constraint referencing 'repos(name)', but ClickHouse does not support foreign key constraints. Remove ForeignKey() and use a plain mapped_column instead; the relationship is logical only. If this model is shared across multiple databases, move it to its own module and configure separate model_paths per database. ``` Instead, use a plain `mapped_column` for columns that reference other tables; the relationship is logical only and should be enforced at the application layer. If you need to share model code between a ClickHouse database and another backend, keep the shared models in a module that is **only** included in the non-ClickHouse database's `model_paths`. 
Each `database_config()` entry has independent `model_paths`, so a model file with `ForeignKey()` will never be discovered for a database that shouldn't see it. ### Indexes (Standard) Standard SQL indexes are not supported. Only ClickHouse skip indexes declared via `ch_indexes` and `ChIndexSpec` generate real DDL. See [Skip Indexes](#skip-indexes) above. ## Snapshot Format The snapshot JSON captures all ClickHouse-specific metadata. Key sections: ### Column Extras ```json { "name": "payload", "type": "String", "ch_column": { "ch_codec": "ZSTD(3)", "ch_default_expression": null, "ch_materialized": null, "ch_alias": null, "ch_ttl": null, "ch_low_cardinality": false, "ch_nullable": false, "ch_type": "String" } } ``` ### Table Extras ```json { "ch_options": { "ch_engine_raw": {"name": "MergeTree", "args": [], "zookeeper_path": null, "replica_name": null, "settings": null}, "ch_engine": ["MergeTree"], "ch_order_by": ["event_date", "id"], "ch_primary_key": ["event_date"], "ch_partition_by": "toYYYYMM(event_date)", "ch_sample_by": null, "ch_ttl": ["event_date + toIntervalYear(1)"], "ch_settings": {"index_granularity": "8192"}, "ch_object_type": "table", "ch_projections": [{"name": "by_date", "query": "SELECT event_date, sum(amount) GROUP BY event_date"}], "ch_zookeeper_path": null, "ch_replica_name": null } } ``` ## Reverse Engineering `generate-models` queries `system.tables`, `system.columns`, and `system.data_skipping_indices` to reverse-engineer all ClickHouse metadata. The emitted model uses `class Meta` with `CHTableMeta`, `CHColumnMeta`, `ChEngineSpec`, and `ProjectionSpec`. ```bash $ dbwarden generate-models -d analytics ``` Auto-detection is the default: when `database_type="clickhouse"`, engine metadata is included automatically. The `--clickhouse-engines` flag is no longer required. 
Generated output for a table with engine, ordering, partitioning, codec, and projections:

```python
from datetime import date

from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
from dbwarden.databases.clickhouse import CHTableMeta, CHColumnMeta, ChEngineSpec, ProjectionSpec, ch


class Base(DeclarativeBase):
    pass


class Event(Base):
    __tablename__ = "events"

    id: Mapped[int] = mapped_column(Int64, primary_key=True)
    event_date: Mapped[date] = mapped_column(Date)
    payload: Mapped[str] = mapped_column(String)

    class Meta(CHTableMeta):
        ch_engine = ChEngineSpec("MergeTree")
        ch_order_by = ["event_date", "id"]
        ch_partition_by = "toYYYYMM(event_date)"
        ch_ttl = ["event_date + toIntervalYear(1)"]
        ch_settings = {"index_granularity": "8192"}
        ch_projections = [
            ProjectionSpec("by_date", "SELECT event_date, sum(amount) GROUP BY event_date"),
        ]

        class payload(CHColumnMeta):
            ch = ch.field(codec="ZSTD(3)")
```

## Safety Classification

DBWarden classifies migration changes using the `Safety` enum:

```python
from dbwarden.engine.safety import Safety

assert Safety.SAFE == "SAFE"
assert Safety.INFO == "INFO"
assert Safety.WARN == "WARN"
assert Safety.CRITICAL == "CRITICAL"
```

The following ClickHouse-specific safety classifiers are available for custom analysis:

```python
from dbwarden.engine.safety import (
    classify_ch_column_change,
    classify_ch_options_change,
    classify_ch_safety,
)
```

| Change Type | Severity | Flag Required |
|-------------|----------|---------------|
| Add column | `INFO` | None |
| Drop column | `WARN` | `--force` |
| Change column type (ch_type) | `CRITICAL` | `--force` |
| Change column codec | `WARN` | None |
| Change column default | `WARN` | None |
| Change column TTL | `WARN` | None |
| Change LowCardinality / Nullable | `CRITICAL` | `--force` |
| Change engine | `CRITICAL` | `--force` |
| Change order by | `CRITICAL` | `--force` |
| Change partition by | `WARN` | `--force` |
| Change sample by | `INFO` | None |
| Change TTL | `INFO` | None |
| Change settings | `WARN` | `--force` |
| Change object type | `CRITICAL` | `--force` |
| Change dictionary layout / source / lifetime | `WARN` | `--force` |
| Add projection | `INFO` | None |
| Drop projection | `WARN` | `--force` |
| Add skip index | `INFO` | None |
| Drop skip index | `WARN` | `--force` |
| Add materialized view | `INFO` | None |
| Drop materialized view | `CRITICAL` | `--force` |

## Verification Workflow

Because ClickHouse diverges from standard SQL DDL, always review auto-generated migrations before applying:

```bash
$ dbwarden make-migrations -d analytics
$ dbwarden migrate -d analytics
```

Use `dbwarden make-migrations --plan -d analytics` to preview the ops without writing files.

See [models.md](../models.md) for detailed ClickHouse model examples and [Modeling Guide](../getting-started/modeling.md) for a complete walkthrough.

========================================================================
PAGE: https://dbwarden.emiliano-go.com/databases/mysql/
========================================================================

# MySQL & MariaDB

DBWarden treats MySQL (and its fork MariaDB) as **first-class backends**: every natively supported feature is reverse-engineered, diffed, and emitted as correct DDL.

## First-Class Features

"First-class" means the round-trip is verified: reverse-engineer a live database with `generate-models`, feed the output back into `make-migrations`, and get **zero diff**.

```bash
# Step 1: reverse-engineer your live MySQL/MariaDB database
$ dbwarden generate-models -d primary

# Step 2: feed the generated models back in, zero diff
$ dbwarden make-migrations -d primary
# -> "No new migrations to generate" (output is empty; your models match the DB exactly)
```

The following MySQL/MariaDB features are fully supported in this round-trip:

| Category | Features |
|----------|----------|
| Engine | `ENGINE=InnoDB`, `MyISAM`, etc. via `my_engine` |
| Charset & Collation | Table: `DEFAULT CHARACTER SET` / `COLLATE` via `my_charset`, `my_collate`. Column: per-column `CHARACTER SET` / `COLLATE` via `my.field(charset=..., collate=...)` |
| Row Format | `ROW_FORMAT=DYNAMIC`, `COMPACT`, `COMPRESSED`, `REDUNDANT` via `my_row_format` |
| Auto Increment | Table-level `AUTO_INCREMENT=N` via `my_auto_increment`. Column-level toggle via `autoincrement` field |
| Unsigned | `UNSIGNED` on integer columns via `my.field(unsigned=True)` |
| ON UPDATE | `ON UPDATE CURRENT_TIMESTAMP` via `my.field(on_update="CURRENT_TIMESTAMP")` |
| Comments | Table: `ALTER TABLE t COMMENT = '...'`. Column: `MODIFY COLUMN ... COMMENT '...'` (full column definition preserved) |
| Foreign Keys | `ON DELETE` / `ON UPDATE` options; DROP uses `DROP FOREIGN KEY` (MySQL syntax) |
| Indexes | Full index support; `USING BTREE / HASH` preserved |
| Auto-increment Lifecycle | Toggle autoincrement on integer PKs via `autoincrement` field: generates `MODIFY COLUMN ... AUTO_INCREMENT` |
| Type Normalization | `TINYINT(1)` -> `BOOLEAN`, `INT`, `BIGINT`, `VARCHAR(n)`, `TEXT`, `DATETIME`, `TIMESTAMP`, `YEAR`, `DECIMAL(p,s)`, `FLOAT`, `DOUBLE`, `BLOB`, `JSON`, `ENUM`, `SET` |

### MariaDB-Specific Features

| Category | Features |
|----------|----------|
| Page Compression | `PAGE_COMPRESSED=1` / `PAGE_COMPRESSION_LEVEL=N` via `mdb_page_compressed`, `mdb_page_compression_level` on `MdbTableMeta` |
| Invisible Columns | Column invisibility via `mdb.field(invisible=True)` on `MdbColumnMeta` |
| Sequences | `CREATE SEQUENCE` support via `mdb.field(sequence=...)` |

## Installation

Install with the MySQL driver:

```bash
pip install "dbwarden[mysql]"
```

Or with uv:

```bash
uv add "dbwarden[mysql]"
```

## Configuration

The MySQL backend is enabled by setting `database_type="mysql"` (or `database_type="mariadb"`) in your dbwarden config:

```python
from dbwarden import database_config

database_config(
    database_name="primary",
    default=True,
    database_type="mysql",
    database_url_sync="mysql+pymysql://user:password@localhost:3306/mydb",
)
```

The connection URL uses the `mysql+pymysql://` scheme (the `pymysql` driver is included via the `[mysql]` extra). You can also use any SQLAlchemy-compatible MySQL driver such as `mysql+mysqlconnector://`.

## Declaring Metadata

MySQL/MariaDB metadata is declared in a `class Meta` inner class on the model. This is the **only** supported surface: `mapped_column(info=...)` raises `DBWardenConfigError`.
### Table-Level Meta

Inherit from `MyTableMeta` on your `class Meta`:

```python
from sqlalchemy import Integer, String
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
from dbwarden.databases.mysql import MyTableMeta


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    email: Mapped[str] = mapped_column(String(255))

    class Meta(MyTableMeta):
        my_engine = "InnoDB"
        my_charset = "utf8mb4"
        my_collate = "utf8mb4_unicode_ci"
        my_row_format = "DYNAMIC"
        my_auto_increment = 1000
        comment = "Core user accounts"
```

`MyTableMeta` inherits from `TableMeta`, which provides common attributes shared across all backends:

| Attribute | Type | SQL |
|-----------|------|-----|
| `comment` | `str` | `ALTER TABLE t COMMENT = '...'` |
| `indexes` | `list[dict]` | `CREATE INDEX ...` |
| `checks` | `list[dict]` | `ALTER TABLE t ADD CONSTRAINT ... CHECK (...)` |
| `uniques` | `list[dict]` | `ALTER TABLE t ADD CONSTRAINT ... UNIQUE (...)` |

MySQL-specific `MyTableMeta` attributes:

| Attribute | Type | SQL |
|-----------|------|-----|
| `my_engine` | `str` | `ALTER TABLE t ENGINE = name` |
| `my_charset` | `str` | `ALTER TABLE t DEFAULT CHARACTER SET name` |
| `my_collate` | `str` | `ALTER TABLE t COLLATE = name` |
| `my_row_format` | `str` | `ALTER TABLE t ROW_FORMAT = name` |
| `my_auto_increment` | `int` | `ALTER TABLE t AUTO_INCREMENT = N` |

For MariaDB, use `MdbTableMeta`:

```python
from dbwarden.databases.mariadb import MdbTableMeta


class Meta(MdbTableMeta):
    my_engine = "InnoDB"
    mdb_page_compressed = True
    mdb_page_compression_level = 3
```

MariaDB-specific `MdbTableMeta` attributes (in addition to all `MyTableMeta` attributes):

| Attribute | Type | SQL |
|-----------|------|-----|
| `mdb_page_compressed` | `bool` | `PAGE_COMPRESSED=1` |
| `mdb_page_compression_level` | `int` | `PAGE_COMPRESSION_LEVEL=N` |

### Column-Level Meta

Use `MyColumnMeta` inner classes for per-column metadata.
The inner class must be named after the column. Use `my = my.field(...)` to set column-level options:

```python
from sqlalchemy import Integer, String, Text, TIMESTAMP
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
from dbwarden.databases.mysql import MyTableMeta, MyColumnMeta, my


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    email: Mapped[str] = mapped_column(String(255))
    bio: Mapped[str] = mapped_column(Text)
    updated_at: Mapped[str] = mapped_column(TIMESTAMP)

    class Meta(MyTableMeta):
        class id(MyColumnMeta):
            comment = "Primary key"
            my = my.field(unsigned=True)

        class email(MyColumnMeta):
            my = my.field(charset="utf8mb4", collate="utf8mb4_unicode_ci")

        class updated_at(MyColumnMeta):
            my = my.field(on_update="CURRENT_TIMESTAMP")
```

`MyColumnMeta` includes common column attributes shared across all backends:

| Attribute | Type | SQL |
|-----------|------|-----|
| `comment` | `str` | `MODIFY COLUMN ... COMMENT '...'` |
| `public` | `bool` | Controls field visibility in schemap auto-schema |
| `my` | `MyFieldSpec` | MySQL-specific column options (see table below) |

MySQL-specific `MyFieldSpec` fields (set via `my.field(...)`):

| Keyword | Type | SQL |
|---------|------|-----|
| `unsigned` | `bool` | `UNSIGNED` on integer columns |
| `charset` | `str` | `CHARACTER SET name` (per-column charset) |
| `collate` | `str` | `COLLATE name` (per-column collation) |
| `on_update` | `str` | `ON UPDATE CURRENT_TIMESTAMP` (typically on TIMESTAMP columns) |

For MariaDB, use `MdbColumnMeta` and `mdb.field(...)`:

```python
from dbwarden.databases.mariadb import MdbColumnMeta, MdbTableMeta, mdb


class Meta(MdbTableMeta):
    class id(MdbColumnMeta):
        mdb = mdb.field(invisible=True)
```

MariaDB-specific `MdbFieldSpec` fields (set via `mdb.field(...)`):

| Keyword | Type | SQL |
|---------|------|-----|
| `invisible` | `bool` | `ALTER TABLE ... ALTER COLUMN c SET INVISIBLE` |
| `sequence` | `str` | Sequence name for MariaDB sequence support |
| `unsigned` | `bool` | `UNSIGNED` on integer columns |
| `charset` | `str` | `CHARACTER SET name` |
| `collate` | `str` | `COLLATE name` |
| `on_update` | `str` | `ON UPDATE CURRENT_TIMESTAMP` |

### Foreign Key Options

Foreign key options (`ondelete`, `onupdate`) are captured from the database by `generate-models` and emitted in the `ForeignKey` constructor:

```python
from sqlalchemy import ForeignKey
from sqlalchemy.orm import Mapped, mapped_column


class OrderItem(Base):
    __tablename__ = "order_items"

    order_id: Mapped[int] = mapped_column(ForeignKey("orders.id", ondelete="CASCADE"), nullable=False)
```

### Model Example (Generated)

Here is the complete generated model output for a MySQL table with engine, charset, unsigned PK, ON UPDATE, and per-column charset:

```python
from sqlalchemy import BigInteger, Column, Integer, String, TIMESTAMP, Text, text
from sqlalchemy.orm import declarative_base

Base = declarative_base()

from dbwarden.databases.mysql import MyColumnMeta, MyTableMeta, my


class User(Base):
    __tablename__ = 'users'

    id = Column('id', Integer, primary_key=True, nullable=False)
    email = Column('email', String(255), nullable=False)
    bio = Column('bio', Text)
    updated_at = Column('updated_at', TIMESTAMP, nullable=False, server_default=text('CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP'))

    class Meta(MyTableMeta):
        my_engine = 'InnoDB'
        my_charset = 'utf8mb4'
        my_collate = 'utf8mb4_unicode_ci'
        my_row_format = 'DYNAMIC'
        comment = 'Core user accounts'

        class id(MyColumnMeta):
            comment = 'Primary key'
            my = my.field(unsigned=True)

        class email(MyColumnMeta):
            my = my.field(charset='utf8mb4', collate='utf8mb4_unicode_ci')

        class updated_at(MyColumnMeta):
            my = my.field(on_update='CURRENT_TIMESTAMP')
```

## DDL Behavior

### DDL Is NOT Transactional

MySQL and MariaDB DDL is **non-transactional**: each DDL statement implicitly commits the current transaction.
If a migration file contains multiple statements and one fails, the prior DDL cannot be rolled back. This makes MySQL/MariaDB more fragile than PostgreSQL for automated migration runs.

### Column Type Changes

Emits `ALTER TABLE t MODIFY COLUMN c newtype`. Unlike PostgreSQL, MySQL requires the full column definition on every `MODIFY COLUMN`. DBWarden handles this by re-emitting all column attributes (type, unsigned, nullable, default, comment, charset, collate, auto_increment) in a single statement:

```sql
ALTER TABLE users MODIFY COLUMN email VARCHAR(255) NOT NULL COMMENT 'User email';
```

### Column Nullable Changes

Emits `ALTER TABLE t MODIFY COLUMN c type [NULL | NOT NULL]`, again with the full column type.

### Column Meta Changes

When MySQL-specific column metadata changes (unsigned, charset, collate, on_update), DBWarden generates a full `MODIFY COLUMN` that preserves the column's type, nullable, default, comment, and autoincrement state:

```sql
ALTER TABLE users MODIFY COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'Primary key';
```

### Table Option Changes

MySQL table-level option changes generate individual `ALTER TABLE` statements:

| Change | Generated SQL |
|--------|---------------|
| Engine | `ALTER TABLE t ENGINE = InnoDB` |
| Charset | `ALTER TABLE t DEFAULT CHARACTER SET utf8mb4` |
| Collation | `ALTER TABLE t COLLATE = utf8mb4_unicode_ci` |
| Row Format | `ALTER TABLE t ROW_FORMAT = DYNAMIC` |
| Auto Increment | `ALTER TABLE t AUTO_INCREMENT = 1000` |

### Auto-increment Lifecycle

DBWarden supports toggling auto-increment on integer primary key columns.
The `autoincrement` field in your model controls whether a column uses auto-increment:

```python
class User(Base):
    __tablename__ = "users"

    # Autoincrement enabled: same as default behavior
    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)

    class Meta(MyTableMeta):
        class id(MyColumnMeta):
            pass  # uses default autoincrement from model
```

To explicitly disable auto-increment on a PK column:

```python
class User(Base):
    __tablename__ = "users"

    # Plain integer PK: no auto-increment
    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=False)
```

**What happens when autoincrement changes:**

| Change | Generated SQL |
|--------|---------------|
| Add autoincrement | `ALTER TABLE t MODIFY COLUMN c INT NOT NULL AUTO_INCREMENT` |
| Remove autoincrement | `ALTER TABLE t MODIFY COLUMN c INT NOT NULL` |

### Comments

Unlike PostgreSQL, MySQL has no `COMMENT ON` syntax. DBWarden generates the correct MySQL syntax:

```sql
-- Table comment
ALTER TABLE users COMMENT = 'Core user accounts';

-- Column comment (full MODIFY COLUMN preserving all attributes)
ALTER TABLE users MODIFY COLUMN email VARCHAR(255) NOT NULL COMMENT 'User email address';
```

When a comment is cleared, MySQL syntax is used:

```sql
ALTER TABLE users COMMENT = '';
ALTER TABLE users MODIFY COLUMN email VARCHAR(255) NOT NULL COMMENT '';
```

## Snapshot Format

When `database_type` is `"mysql"` or `"mariadb"`, the snapshot captures MySQL-specific metadata in `my_column` and `my_table` blocks.
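As a rough illustration of how these blocks can be consumed, the sketch below walks a snapshot fragment and lists the columns flagged as unsigned. It assumes the layout shown in the Column Extras example in this section (a top-level `columns` mapping whose entries carry a `my_column` block). The helper name `unsigned_columns` and the inline sample are hypothetical and not part of DBWarden.

```python
import json

# Hypothetical snapshot fragment, shaped like the Column Extras example.
SNAPSHOT_JSON = """
{
  "columns": {
    "id": {
      "type": "int",
      "my_column": {"my_unsigned": true, "my_on_update": null}
    },
    "updated_at": {
      "type": "timestamp",
      "my_column": {"my_unsigned": false, "my_on_update": "CURRENT_TIMESTAMP"}
    }
  }
}
"""


def unsigned_columns(snapshot: dict) -> list[str]:
    """Return the sorted names of columns whose my_column block sets my_unsigned."""
    names = []
    for name, column in snapshot.get("columns", {}).items():
        # Treat a missing or null my_column block as "no MySQL extras".
        extras = column.get("my_column") or {}
        if extras.get("my_unsigned"):
            names.append(name)
    return sorted(names)


snapshot = json.loads(SNAPSHOT_JSON)
print(unsigned_columns(snapshot))
```

Reading the snapshot this way can be handy in review scripts, but the file layout is an implementation detail, so treat any such script as a convenience rather than a stable interface.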
### Column Extras

```json
{
  "columns": {
    "id": {
      "name": "id",
      "type": "int",
      "nullable": false,
      "default": null,
      "autoincrement": true,
      "primary_key": true,
      "comment": "Primary key",
      "my_column": {
        "my_unsigned": true,
        "my_charset": null,
        "my_collate": null,
        "my_on_update": null
      }
    },
    "updated_at": {
      "name": "updated_at",
      "type": "timestamp",
      "nullable": false,
      "default": "CURRENT_TIMESTAMP",
      "autoincrement": false,
      "primary_key": false,
      "my_column": {
        "my_unsigned": false,
        "my_on_update": "CURRENT_TIMESTAMP"
      }
    }
  }
}
```

### Table Extras

```json
{
  "my_table": {
    "my_engine": "InnoDB",
    "my_charset": "utf8mb4",
    "my_collate": "utf8mb4_unicode_ci",
    "my_row_format": "Dynamic",
    "my_auto_increment": 1000
  }
}
```

For MariaDB, additional fields appear:

```json
{
  "my_table": {
    "mdb_page_compressed": false,
    "mdb_page_compression_level": null
  }
}
```

## Reverse Engineering

`generate-models` queries `information_schema.TABLES` and `information_schema.COLUMNS` to reverse-engineer all MySQL/MariaDB metadata. The emitted model uses `class Meta` with `MyTableMeta` and `MyColumnMeta` inner classes.
```bash
$ dbwarden generate-models -d primary
```

Generated output includes automatic detection of:

- Engine, charset, collation, row format from `information_schema.TABLES`
- Column unsigned, charset, collation, on_update from `information_schema.COLUMNS`
- Foreign key options (`ON DELETE`, `ON UPDATE`)
- Auto-increment columns
- Column comments

## Safety Classification

DBWarden classifies migration changes using the `Safety` enum:

```python
from dbwarden.engine.safety import Safety

assert Safety.SAFE == "SAFE"
assert Safety.INFO == "INFO"
assert Safety.WARN == "WARN"
assert Safety.CRITICAL == "CRITICAL"
```

| Change Type | Severity | Flag Required |
|-------------|----------|---------------|
| Add column | `INFO` | None |
| Drop column | `WARN` | `--force` |
| Change column type | `WARN` | `--force` |
| Change column nullable | `WARN` | `--force` |
| Change column comment | `INFO` | None |
| Change MySQL column meta | `WARN` | `--force` |
| Change engine | `INFO` | None |
| Change charset | `INFO` | None |
| Change collation | `INFO` | None |
| Change row format | `INFO` | None |
| Change auto_increment | `INFO` | None |
| Change table comment | `INFO` | None |
| Add / drop index | `INFO` / `WARN` | `--force` |
| Add / drop FK | `INFO` / `WARN` | `--force` |

========================================================================
PAGE: https://dbwarden.emiliano-go.com/databases/postgresql/
========================================================================

# PostgreSQL

DBWarden treats PostgreSQL as a **first-class backend**: every natively supported feature is reverse-engineered, diffed, and emitted as correct DDL.

## First-Class Features

"First-class" means the round-trip is verified: reverse-engineer a live database with `generate-models`, feed the output back into `make-migrations`, and get **zero diff**.
```bash
# Step 1: reverse-engineer your live PostgreSQL database
$ dbwarden generate-models -d primary --tables users,orders,items

# Step 2: feed the generated models back in, zero diff
$ dbwarden make-migrations
# → "No changes detected" (output is empty; your models match the DB exactly)
```

The following PostgreSQL features are fully supported in this round-trip:

| Category | Features |
|----------|----------|
| Identity Columns | `GENERATED ALWAYS AS IDENTITY`, `GENERATED BY DEFAULT AS IDENTITY`, sequence options (`START WITH`, `INCREMENT BY`, `MINVALUE`, `MAXVALUE`) |
| Collation | Per-column `COLLATE` via `pg.field(collation=...)` |
| Storage | Per-column `STORAGE` setting (`PLAIN`, `MAIN`, `EXTERNAL`, `EXTENDED`) via `pg.field(storage=...)` |
| Compression | Per-column `COMPRESSION` (`pglz`, `lz4`) via `pg.field(compression=...)` (PG 14+) |
| Generated Columns | `GENERATED ALWAYS AS (...) STORED` via `pg.field(generated=...)` |
| Table Fillfactor | `WITH (fillfactor = N)` via `pg_fillfactor` |
| Tablespace | `SET TABLESPACE` via `pg_tablespace` |
| Unlogged Tables | `UNLOGGED` via `pg_unlogged` |
| Partitioning | `PARTITION BY RANGE / LIST / HASH (columns)` via `pg_partition` |
| Table Inheritance | `INHERITS (parent)` via `pg_inherits` |
| EXCLUDE Constraints | `EXCLUDE USING gist (...)` via `pg_excludes` |
| Check Constraints | `CHECK (...)` with `NO INHERIT` support via `pg_checks` |
| Unique Constraints | Full option diff: `NULLS NOT DISTINCT`, `DEFERRABLE INITIALLY DEFERRED`, `INCLUDE` via `pg_uniques` |
| Deferrable FK | `DEFERRABLE INITIALLY DEFERRED` with `ON DELETE` / `ON UPDATE` options |
| Index Options | `USING`, `WHERE`, `INCLUDE`, `WITH`, `TABLESPACE`, `NULLS NOT DISTINCT`, `CONCURRENTLY`, column sorting, operator classes via `postgresql_ops` |
| Enum Types | `CREATE TYPE ... AS ENUM`, `ALTER TYPE ... ADD VALUE ... AFTER ...` |
| Comments | Table and column `COMMENT ON` |
| Type Normalization | `SERIAL` → `integer` + autoincrement, `TIMESTAMPTZ`, `NUMERIC(p,s)`, `VARCHAR(n)`, `DOUBLE PRECISION`, `REAL`, `JSONB`, `UUID`, `ARRAY`, `ENUM`, `TSTZRANGE` |
| Auto-increment Lifecycle | Toggle autoincrement on integer PKs via `autoincrement` field: generates `CREATE SEQUENCE` / `DROP SEQUENCE` + `SET DEFAULT nextval` |

## Declaring Metadata

PostgreSQL metadata is declared in a `class Meta` inner class on the model. This is the **only** supported surface: `mapped_column(info=...)` raises `DBWardenConfigError`.

### Table-Level Meta

Inherit from `PGTableMeta` on your `class Meta`:

```python
from sqlalchemy import Integer
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
from dbwarden.databases.pgsql import PGTableMeta


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)

    class Meta(PGTableMeta):
        pg_fillfactor = 80
        pg_tablespace = "fastspace"
        pg_inherits = "base_entity"
        pg_excludes = [
            {"name": "excl_room_booking", "expression": "USING gist (room_id WITH =, during WITH &&)"},
        ]
```

`PGTableMeta` inherits from `TableMeta`, which provides common attributes shared across all backends:

| Attribute | Type | SQL |
|-----------|------|-----|
| `comment` | `str` | `COMMENT ON TABLE t IS '...'` |
| `indexes` | `list[dict]` | `CREATE INDEX ...` |
| `checks` | `list[dict]` | `ALTER TABLE t ADD CONSTRAINT ... CHECK (...)` |
| `uniques` | `list[dict]` | `ALTER TABLE t ADD CONSTRAINT ... UNIQUE (...)` |

PostgreSQL-specific `PGTableMeta` attributes:

| Attribute | Type | SQL |
|-----------|------|-----|
| `pg_fillfactor` | `int` | `ALTER TABLE t SET (fillfactor = N)` |
| `pg_tablespace` | `str` | `ALTER TABLE t SET TABLESPACE name` |
| `pg_unlogged` | `bool` | `CREATE UNLOGGED TABLE ...` / `ALTER TABLE t SET UNLOGGED` |
| `pg_partition` | `dict` | `PARTITION BY RANGE / LIST / HASH (columns)` |
| `pg_inherits` | `str \| list[str]` | `ALTER TABLE t INHERIT parent` |
| `pg_excludes` | `list[dict]` | `ALTER TABLE t ADD CONSTRAINT ... EXCLUDE USING ...` |
| `pg_indexes` | `list[PgIndexSpec]` | `CREATE INDEX ...` (with `USING`, `WHERE`, `INCLUDE`, `NULLS NOT DISTINCT`, column sorting) |
| `pg_checks` | `list[dict]` | `ALTER TABLE t ADD CONSTRAINT ... CHECK (...)` (with `NO INHERIT`) |
| `pg_uniques` | `list[dict]` | `ALTER TABLE t ADD CONSTRAINT ... UNIQUE (...)` (with `DEFERRABLE`, `NULLS NOT DISTINCT`, `INCLUDE`) |

The `pg_indexes` list uses `PgIndexSpec` objects (from `dbwarden.databases.pgsql` or `dbwarden`):

```python
from dbwarden.databases.pgsql import PGTableMeta, PgIndexSpec


class Meta(PGTableMeta):
    pg_indexes = [
        # PostgreSQL only allows UNIQUE on B-tree indexes.
        PgIndexSpec("ix_users_email", ["email"], unique=True, using="btree"),
    ]
```

`PgIndexSpec` constructor fields:

| Field | Type | Default | SQL |
|-------|------|---------|-----|
| `name` | `str` | (auto-generated) | Index name |
| `columns` | `list[str]` | (required) | Indexed columns |
| `unique` | `bool` | `False` | `CREATE UNIQUE INDEX` |
| `using` | `str \| None` | `None` | `USING btree` (default), `gin`, `gist`, `hash`, `brin` |
| `where` | `str \| None` | `None` | `WHERE predicate` |
| `include` | `list[str] \| None` | `None` | `INCLUDE (cols)` |
| `with_params` | `dict[str, Any] \| None` | `None` | `WITH (params)` |
| `tablespace` | `str \| None` | `None` | `TABLESPACE name` |
| `nulls_not_distinct` | `bool` | `False` | `NULLS NOT DISTINCT` |
| `column_sorting` | `dict[str, str] \| None` | `None` | Per-column `ASC` / `DESC` / `NULLS FIRST` / `NULLS LAST` |
| `postgresql_ops` | `dict[str, str] \| None` | `None` | Per-column operator class (e.g., `jsonb_path_ops`) |
| `concurrently` | `bool` | `True` | `CREATE INDEX CONCURRENTLY` |

Use `postgresql_ops` to specify operator classes for GIN, GiST, or BRIN indexes:

```python
PgIndexSpec("ix_users_data", ["data"], using="gin", postgresql_ops={"data": "jsonb_path_ops"})
```

This generates:

```sql
CREATE INDEX CONCURRENTLY ix_users_data ON users USING GIN (data jsonb_path_ops);
```

### JSONB Columns

JSONB columns are fully supported. Declare them with `from sqlalchemy.dialects.postgresql import JSONB` and use `PgIndexSpec` with `using="gin"` for GIN indexes:

```python
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base
from dbwarden.databases.pgsql import PGTableMeta, PgIndexSpec

Base = declarative_base()


class User(Base):
    __tablename__ = "users"

    id = Column(Integer, primary_key=True)
    name = Column(String(100))
    # `metadata` is a reserved attribute name in SQLAlchemy's Declarative API,
    # so the Python attribute gets a trailing underscore while the database
    # column keeps the name "metadata".
    metadata_ = Column("metadata", JSONB)

    class Meta(PGTableMeta):
        pg_indexes = [
            PgIndexSpec("ix_users_metadata", ["metadata"], using="gin"),
        ]
```

For advanced use with the `jsonb_path_ops` operator class (produces smaller indexes for path-based queries):

```python
PgIndexSpec("ix_users_metadata", ["metadata"], using="gin", postgresql_ops={"metadata": "jsonb_path_ops"})
```

Generated SQL:

```sql
CREATE INDEX CONCURRENTLY ix_users_metadata ON users USING GIN (metadata jsonb_path_ops);
```

JSONB column type changes (e.g., `json` → `jsonb`) are classified as **SAFE** by the safety analyzer. The `pg_type` annotation is automatically set on JSONB columns during model extraction and snapshot creation, ensuring zero-diff round-trips.

### Column-Level Meta

Use `PGColumnMeta` inner classes for per-column metadata. The inner class must be named after the column.
Use `pg = pg.field(...)` to set column-level options:

```python
from sqlalchemy import Integer, String, Text
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
from dbwarden.databases.pgsql import PGTableMeta, PGColumnMeta, pg


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    email: Mapped[str] = mapped_column(String(255))
    bio: Mapped[str] = mapped_column(Text)

    class Meta(PGTableMeta):
        class id(PGColumnMeta):
            pg = pg.field(identity="always", identity_start=100, identity_increment=1)

        class bio(PGColumnMeta):
            pg = pg.field(storage="EXTENDED", compression="pglz", collation="en_US.UTF-8")
```

`PGColumnMeta` includes common column attributes shared across all backends:

| Attribute | Type | SQL |
|-----------|------|-----|
| `comment` | `str` | `COMMENT ON COLUMN t.c IS '...'` |
| `public` | `bool` | Controls field visibility in schemap auto-schema |
| `pg` | `PgFieldSpec` | PostgreSQL-specific column options (see table below) |

PostgreSQL-specific `PgFieldSpec` fields (set via `pg.field(...)`):

| Keyword | Type | SQL |
|---------|------|-----|
| `collation` | `str` | `ALTER COLUMN c TYPE t COLLATE "name"` |
| `storage` | `str` | `ALTER COLUMN c SET STORAGE {PLAIN\|MAIN\|EXTERNAL\|EXTENDED}` |
| `compression` | `str` | `ALTER COLUMN c SET COMPRESSION {pglz\|lz4}` (PG 14+) |
| `generated` | `str` | `GENERATED ALWAYS AS (expr) STORED` |
| `identity` | `str` | `ADD GENERATED {ALWAYS\|BY DEFAULT} AS IDENTITY` |
| `identity_start` | `int` | Sequence `START WITH` |
| `identity_increment` | `int` | Sequence `INCREMENT BY` |
| `identity_min` | `int` | Sequence `MINVALUE` |
| `identity_max` | `int` | Sequence `MAXVALUE` |

### Foreign Key Options

Foreign key options (`ondelete`, `onupdate`, `deferrable`) are captured from the database by `generate-models` and emitted in the `ForeignKey` constructor:

```python
from sqlalchemy import ForeignKey
from sqlalchemy.orm import Mapped, mapped_column


class OrderItem(Base):
    __tablename__ = "order_items"

    order_id: Mapped[int] = mapped_column(
        ForeignKey("orders.id", ondelete="CASCADE", onupdate="CASCADE", deferrable=True),
        nullable=False,
    )
```

## DDL Behavior

### Transactional DDL

PostgreSQL DDL is transactional. If a migration file contains multiple statements and one fails, all prior DDL in that file is rolled back. This makes PostgreSQL the safest backend for automated migration runs.

### Index Creation

DBWarden defaults to `CREATE INDEX CONCURRENTLY` to avoid blocking writes to the table. Pass `--no-concurrent` when the migration must run inside a transaction block, because PostgreSQL does not allow `CREATE INDEX CONCURRENTLY` inside one.

### Column Type Changes

Emits `ALTER TABLE t ALTER COLUMN c TYPE newtype` with a commented-out `-- USING col::newtype` line. Pass `--postgres-auto-using` to emit an active `USING` clause. Without the flag, uncomment and verify the USING expression before running the migration against production.

### Safe Type Change

The `--safe-type-change` flag generates a multi-step strategy:

1. Add a temporary column with the new type
2. Emit a `--` comment with an `UPDATE` statement template
3. Emit a verification comment
4. After manual verification, drop the old column and rename the temporary column

### Generated Columns

PostgreSQL cannot turn an existing column into a generated column: `ALTER COLUMN column_name ADD GENERATED AS (expr) STORED` is not valid DDL. DBWarden emits a comment placeholder noting this limitation. Dropping the generation expression (`ALTER COLUMN c DROP EXPRESSION`) produces real DDL.

### Auto-increment Lifecycle

DBWarden supports toggling auto-increment on integer primary key columns.
The `autoincrement` field in your model controls whether a column uses SERIAL-style sequence auto-increment or is a plain integer:

```python
class User(Base):
    __tablename__ = "users"

    # Autoincrement enabled (SERIAL): same as default behavior
    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)

    class Meta(PGTableMeta):
        id = ColumnMeta(autoincrement=True)
```

To explicitly disable auto-increment on a PK column:

```python
class User(Base):
    __tablename__ = "users"

    # Plain integer PK: no sequence, no auto-increment
    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=False)
```

**What happens when autoincrement changes:**

| Change | Generated SQL |
|--------|---------------|
| Adding autoincrement | `CREATE SEQUENCE users_id_seq` + `ALTER COLUMN id SET DEFAULT nextval('users_id_seq')` + `ALTER SEQUENCE users_id_seq OWNED BY users.id` |
| Removing autoincrement | `ALTER COLUMN id DROP DEFAULT` + `DROP SEQUENCE IF EXISTS users_id_seq` |

The rollback SQL is the symmetric inverse: if an autoincrement addition is rolled back, the sequence is dropped and the default is removed.

**Detection from live databases:**

When reverse-engineering a live PostgreSQL database, DBWarden detects autoincrement by:

1. SERIAL/BIGSERIAL column types: type string contains `serial`
2. SQLAlchemy's `.autoincrement` attribute: set by the PG dialect for SERIAL columns
3. `nextval(...)` default patterns: PG SERIAL columns have `DEFAULT nextval('table_col_seq'::regclass)`

**Sequence lifecycle:**

Sequences created by SERIAL auto-increment follow the naming convention `table_column_seq`.
When autoincrement is removed:

- The column default (`nextval(...)`) is dropped
- The sequence is dropped via `DROP SEQUENCE IF EXISTS`
- The column type remains `INTEGER`; no data is lost

When autoincrement is added to an existing integer column:

- A new sequence is created starting at 1
- The column default is set to `nextval('table_column_seq')`
- The sequence is owned by the column for proper cleanup on table drop

**Type mapping behavior:**

The type mapping in `_map_sqlalchemy_type_to_backend` promotes `INTEGER PRIMARY KEY` to `SERIAL` only when `autoincrement` is not explicitly `False`:

| Condition | Resulting Type |
|-----------|---------------|
| `autoincrement=True` (default) | `SERIAL` / `BIGSERIAL` |
| `autoincrement=False` | `INTEGER` / `BIGINT` |
| `autoincrement=None` (unspecified) | `SERIAL` / `BIGSERIAL` (backward compatible) |

**Non-PostgreSQL backends:**

SQLite, MySQL, and ClickHouse do not support sequence-based auto-increment toggling via ALTER. The `alter_column_autoincrement` operation emits a comment explaining the limitation on these backends.

## Snapshot Format

The snapshot JSON captures all PostgreSQL-specific metadata.
Key sections: ### Column Extras ```json { "name": "bio", "type": "text", "pg_column": { "collation": "en_US.UTF-8", "storage": "EXTENDED", "compression": "pglz", "generated": null, "identity": "always", "identity_start": 1, "identity_increment": 1 } } ``` ### Table Extras ```json { "pg_table": { "pg_fillfactor": 80, "pg_tablespace": "fastspace", "pg_unlogged": false, "pg_inherits": "base_entity", "pg_partition": { "strategy": "RANGE", "columns": ["created_at"] }, "pg_excludes": [ {"name": "excl_room_booking", "expression": "EXCLUDE USING gist (room_id WITH =, during WITH &&)"} ] } } ``` ### Foreign Key Extras ```json { "type": "foreign_key", "table": "order_items", "columns": ["order_id"], "referenced_table": "orders", "referenced_columns": ["id"], "on_delete": "CASCADE", "on_update": "CASCADE", "deferrable": true } ``` ## Constraint Diffing Constraints (unique, check, foreign key, exclude) are compared by full attribute content. Any difference in signature: columns, expression, options, produces a `DROP` + `ADD`. Constraint name changes are detected as a new constraint (the old name is dropped, the new name is added). FK comparison uses a 6-tuple signature: `(columns, ref_table, ref_columns, on_delete, on_update, deferrable)`. ## Reverse Engineering `generate-models` queries `pg_class`, `pg_attribute`, `pg_constraint`, `pg_inherits`, `pg_tablespace`, `pg_partitioned_table`, and `pg_collation` to reverse-engineer all PostgreSQL metadata. The emitted model uses `class Meta` with `PGTableMeta` and `PGColumnMeta` inner classes. 
```bash $ dbwarden generate-models -d primary ``` Generated output for a table with identity, storage, compression, collation, and fillfactor: ```python from dbwarden.databases.pgsql import pg class User(Base): __tablename__ = "users" id: Mapped[int] = mapped_column(primary_key=True) email: Mapped[str] = mapped_column(String(255), unique=True) bio: Mapped[str | None] = mapped_column(Text, nullable=True) class Meta(PGTableMeta): comment = "Core user accounts" pg_fillfactor = 80 class id(PGColumnMeta): pg = pg.field(identity="always", identity_start=100, identity_increment=1) class bio(PGColumnMeta): pg = pg.field(storage="EXTENDED", compression="pglz", collation="en_US.UTF-8") ``` For a partitioned table, `generate-models` emits `pg_partition`: ```python class Event(Base): __tablename__ = "events" id: Mapped[int] = mapped_column(primary_key=True) created_at: Mapped[datetime] = mapped_column(DateTime) class Meta(PGTableMeta): pg_partition = {"strategy": "RANGE", "columns": ["created_at"]} ``` Unlogged tables, `NO INHERIT` check constraints, deferred unique constraints, and `ALTER TYPE ... ADD VALUE` for enums are all detected and emitted automatically. 
## Safety Classification DBWarden classifies migration changes using the `Safety` enum: ```python from dbwarden.engine.safety import Safety assert Safety.SAFE == "SAFE" assert Safety.INFO == "INFO" assert Safety.WARN == "WARN" assert Safety.CRITICAL == "CRITICAL" ``` | Change Type | Severity | Flag Required | |-------------|----------|---------------| | Add column | `INFO` | None | | Drop column | `WARNING` | `--force` | | Change column type (safe) | `INFO` | None | | Change column type (warn) | `WARNING` | `--force` | | Change column type (critical) | `WARNING` | `--force` | | Change column comment | `INFO` | None | | Change PG column meta | `WARNING` | `--force` | | Change fillfactor | `INFO` | None | | Change tablespace | `WARNING` | `--force` | | Change inheritance | `WARNING` | `--force` | | Change exclude constraints | `WARNING` | `--force` | | Change table comment | `INFO` | None | | Change object type | `WARNING` | `--force` | | Add / drop index | `INFO` / `WARNING` | `--force` | | Add / drop FK | `INFO` / `WARNING` | `--force` | ======================================================================== PAGE: https://dbwarden.emiliano-go.com/databases/round-trip/ ======================================================================== # Round Trip Support A **round-trip** backend is one where DBWarden can both read schema (via `generate-models`) and write schema (via `make-migrations` / `migrate`). ## Supported Backends | Backend | `database_type` | Round-Trip | |---------|------------------|------------| | PostgreSQL | `postgresql` | Yes | | MySQL | `mysql` | Yes | | ClickHouse | `clickhouse` | Yes | | SQLite | `sqlite` | Dev only | | MariaDB | `mariadb` | No | ## How Round-Trip Verification Works "First-class" means the round-trip is verified: reverse-engineer a live database with `generate-models`, feed the output back into `make-migrations`, and get **zero diff**. 
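The zero-diff check can be pictured with a small, self-contained sketch. The snapshot shape and the `diff_snapshots` helper below are illustrative assumptions, not DBWarden's actual snapshot format or API; the point is only that a model regenerated from a live database should compare equal to the snapshot it came from.

```python
from copy import deepcopy
from typing import Any

# Hypothetical shape: table name -> table metadata (not DBWarden's real format).
Snapshot = dict[str, dict[str, Any]]


def diff_snapshots(live: Snapshot, regenerated: Snapshot) -> list[str]:
    """Return one entry per table that differs between two snapshots."""
    changes: list[str] = []
    for table in sorted(live.keys() | regenerated.keys()):
        if table not in regenerated:
            changes.append(f"drop table {table}")
        elif table not in live:
            changes.append(f"create table {table}")
        elif live[table] != regenerated[table]:
            changes.append(f"alter table {table}")
    return changes


live = {
    "users": {
        "columns": {"id": "integer", "email": "varchar(255)"},
        "pg_fillfactor": 80,
    }
}

# A faithful round trip reproduces every attribute, so nothing is left to migrate.
faithful = deepcopy(live)
round_trip_diff = diff_snapshots(live, faithful)

# Losing a single attribute (here the fillfactor) shows up as a pending change.
lossy = deepcopy(live)
del lossy["users"]["pg_fillfactor"]
lossy_diff = diff_snapshots(live, lossy)
```

In this sketch a backend would count as round-trip only if the first comparison is empty for every table it can introspect.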
## Per-Backend Details ### PostgreSQL PostgreSQL is a **first-class backend** with full round-trip support. All metadata (identity columns, collation, storage, compression, generated columns, fillfactor, tablespace, inheritance, exclude constraints, deferrable foreign keys, and advanced index options) is captured by the snapshot, diffed correctly, and emitted as valid DDL. See [PostgreSQL Deep Dive](postgresql.md) for the complete list of supported features. ### MySQL MySQL is a **first-class backend** with full round-trip support. All metadata (engine, charset, collation, row format, auto_increment, unsigned columns, `ON UPDATE`, and column comments) is captured by the snapshot, diffed correctly, and emitted as valid DDL. See [MySQL Deep Dive](mysql.md) for the complete list of supported features. ### ClickHouse ClickHouse has full round-trip support: `generate-models` reads schema from a live ClickHouse server, and `make-migrations` / `migrate` auto-generates DDL for table operations. See [ClickHouse Deep Dive](clickhouse.md) for the complete list of supported features. ### SQLite SQLite is supported for **development workflows only** (`--dev` flag). It uses the same snapshot format as PostgreSQL but with SQLite-compatible DDL. SQLite is ideal for local iteration before running migrations against production. ### MariaDB MariaDB is supported as a separate `database_type` (`mariadb`), but it does **not** have round-trip support. You can use MariaDB as a target database for migrations, but `generate-models` and full schema introspection are not available. Use `make-migrations` to write migrations manually. ======================================================================== PAGE: https://dbwarden.emiliano-go.com/databases/sql-databases/ ======================================================================== # SQL Databases DBWarden supports PostgreSQL, MySQL, MariaDB, and SQLite. 
While all four share standard SQL DDL, each backend has distinct behaviors that affect generated migrations. This page documents backend-specific syntax, limitations, and edge cases. ## DDL Transactional Behavior | Backend | Transactional DDL | Impact | |---------|------------------|--------| | PostgreSQL | Yes | Entire migration file succeeds or rolls back atomically | | MySQL / MariaDB | No | Each DDL auto-commits; partial failure leaves schema inconsistent | | SQLite | Mostly | Per-statement auto-commit outside explicit transactions | **PostgreSQL**: DDL is transactional. If a migration file contains multiple statements and one fails, all prior DDL in that file is rolled back. This makes PostgreSQL the safest backend for automated migration runs. PostgreSQL is also a **first-class backend** with full support for identity columns, collation, storage, compression, generated columns, table fillfactor, tablespace, inheritance, EXCLUDE constraints, deferrable FK options, and advanced index parameters. See [PostgreSQL Deep Dive](postgresql.md) for complete details. **MySQL / MariaDB**: DDL statements auto-commit immediately. If a 5-statement migration fails on the 4th statement, the first 3 are already committed and cannot be rolled back. Manual inspection and recovery may be needed. Always test MySQL/MariaDB migrations in a staging environment first. **SQLite**: DDL is transactional only within explicit `BEGIN/COMMIT` blocks. DBWarden migration files are executed with each statement as a separate implicit transaction. This means SQLite is similar to MySQL in practice; partial failure is possible. 
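The SQLite behavior described above can be reproduced with Python's standard `sqlite3` module. This is a minimal sketch of the database's own semantics, not of how DBWarden executes migration files; the table names are made up for illustration.

```python
import sqlite3

# isolation_level=None puts the driver in autocommit mode, so transaction
# boundaries are exactly the BEGIN / COMMIT / ROLLBACK statements issued below.
conn = sqlite3.connect(":memory:", isolation_level=None)


def table_names(connection):
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Case 1: statements run one by one, each in its own implicit transaction.
try:
    conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY)")  # fails: table exists
except sqlite3.OperationalError:
    pass
after_autocommit = table_names(conn)  # the first CREATE TABLE stays committed

# Case 2: the same pattern inside an explicit transaction.
conn.execute("BEGIN")
try:
    conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY)")  # fails: table exists
    conn.execute("COMMIT")
except sqlite3.OperationalError:
    conn.execute("ROLLBACK")  # undoes the first CREATE TABLE b as well
after_explicit = table_names(conn)
```

Table `a` survives the partial failure, while table `b` is rolled back together with the failing statement, which is the difference between per-statement auto-commit and an explicit `BEGIN/COMMIT` block.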
## Column Rename

| Backend | Syntax | Supported |
|---------|--------|-----------|
| PostgreSQL | `ALTER TABLE t RENAME COLUMN old TO new` | Yes |
| SQLite | `ALTER TABLE t RENAME COLUMN old TO new` | Yes (3.25+) |
| MySQL | `ALTER TABLE t CHANGE old new type` | Workaround needed |
| MariaDB | `ALTER TABLE t CHANGE old new type` | Workaround needed |

PostgreSQL and SQLite (3.25+) support `RENAME COLUMN` natively. DBWarden emits the standard `RENAME COLUMN` form for all backends. If your target backend does not support native column rename (e.g., MySQL < 8.0, MariaDB), verify the generated SQL or write a manual migration that uses the `CHANGE` syntax from the table above.

## Column Type Change

| Backend | Syntax | Supported |
|---------|--------|-----------|
| PostgreSQL | `ALTER TABLE t ALTER COLUMN c TYPE newtype` (commented-out `USING` by default, active with `--postgres-auto-using`) | Yes |
| MySQL / MariaDB | `ALTER TABLE t MODIFY COLUMN c newtype` | Yes |
| SQLite | Not supported | Comment emitted |

**PostgreSQL**: Emits `ALTER TABLE t ALTER COLUMN c TYPE newtype` with a commented-out `-- USING col::newtype` line. Pass `--postgres-auto-using` on `make-migrations` to emit an active `USING` clause.

**MySQL / MariaDB**: Emits `ALTER TABLE t MODIFY COLUMN c newtype`. Note that `MODIFY COLUMN` requires specifying the entire column definition, not just the type. DBWarden includes only the type in the `MODIFY` statement; if you need additional attributes (e.g., `NOT NULL`, `DEFAULT`), add them manually.

**SQLite**: `ALTER COLUMN TYPE` is not supported. DBWarden emits a comment:

```sql
-- SQLite does not support ALTER COLUMN TYPE.
-- Use 'dbwarden new' to write a manual migration for:
-- ALTER TABLE users ALTER COLUMN name TYPE TEXT
```

Table recreation is required to change a column's type in SQLite.
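As a rough illustration of what such a manual migration does, the sketch below recreates a table with the new column type using Python's standard `sqlite3` module. The `users` / `users_new` names and the `VARCHAR(50)` to `TEXT` change are assumptions for the example; a real table may also need its indexes, triggers, and foreign keys recreated.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name VARCHAR(50))")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'ada')")

# Recreate the table inside one explicit transaction so a failure at any
# step leaves the original table untouched.
conn.execute("BEGIN")
try:
    conn.execute("CREATE TABLE users_new (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users_new (id, name) SELECT id, name FROM users")
    conn.execute("DROP TABLE users")
    conn.execute("ALTER TABLE users_new RENAME TO users")
    conn.execute("COMMIT")
except sqlite3.Error:
    conn.execute("ROLLBACK")
    raise

# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
column_types = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(users)")}
rows = conn.execute("SELECT id, name FROM users").fetchall()
```

The SQL statements inside the transaction are what would go into a migration file created with `dbwarden new`.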
## Column Nullable Change

| Backend | Syntax | Supported |
|---------|--------|-----------|
| PostgreSQL | `ALTER TABLE t ALTER COLUMN c [SET/DROP] NOT NULL` | Yes |
| MySQL / MariaDB | `ALTER TABLE t MODIFY COLUMN c coltype [NOT] NULL` | Yes |
| SQLite | Not supported | Comment emitted |

**PostgreSQL**: Uses `SET NOT NULL` / `DROP NOT NULL`. No column type needed.

**MySQL / MariaDB**: Uses `MODIFY COLUMN`, which requires the full column type. DBWarden includes the type from the model column definition. If the type is not available, nullable changes for MySQL/MariaDB may produce incomplete SQL.

**SQLite**: Not supported. A comment is emitted:

```sql
-- SQLite: ALTER TABLE users ALTER COLUMN email SET NOT NULL (not supported)
```

## Column Default Change

PostgreSQL, MySQL, and MariaDB support `ALTER TABLE t ALTER COLUMN c SET DEFAULT value` and `ALTER TABLE t ALTER COLUMN c DROP DEFAULT`, and default changes work uniformly across them. SQLite itself has no `ALTER COLUMN` statement, so review any generated default change before applying it to a SQLite database.

## Foreign Key Handling

| Backend | ADD FK | DROP FK | Notes |
|---------|--------|---------|-------|
| PostgreSQL | `ADD CONSTRAINT ... FOREIGN KEY` | `DROP CONSTRAINT ...` | Supports `ON DELETE`, `ON UPDATE`, `DEFERRABLE` |
| MySQL | `ADD CONSTRAINT ... FOREIGN KEY` | `DROP FOREIGN KEY ...` | Uses constraint name, not FK name |
| MariaDB | `ADD CONSTRAINT ... FOREIGN KEY` | `DROP FOREIGN KEY ...` | Same as MySQL |
| SQLite | Not supported (comment) | Not supported (comment) | Recreate table |
| ClickHouse | Not supported (error) | Not supported (error) | `ForeignKey()` raises `DBWardenConfigError` |

**PostgreSQL FK options**: `ON DELETE`, `ON UPDATE`, and `DEFERRABLE INITIALLY DEFERRED` are fully supported. See [PostgreSQL Deep Dive](postgresql.md).

**Validation**: Before emitting `ADD FOREIGN KEY`, DBWarden verifies that the referenced table and columns exist in the snapshot. If they don't, the FK is silently skipped to avoid generating broken SQL. This can happen when the referenced table is added in the same migration batch.
If an FK is unexpectedly missing from generated SQL, check whether the referenced table exists in the snapshot.

**Content-based comparison**: FKs are compared by a 6-tuple signature `(columns, ref_table, ref_columns, on_delete, on_update, deferrable)`, not by constraint name. Renaming an FK constraint or changing options produces a drop+add.

## Index Handling

| Backend | CREATE INDEX | DROP INDEX | Notes |
|---------|-------------|------------|-------|
| PostgreSQL | `CREATE [UNIQUE] INDEX [CONCURRENTLY] ... USING method INCLUDE (columns) WITH (params) WHERE predicate TABLESPACE name` | `DROP INDEX` | Full feature support |
| MySQL / MariaDB | `CREATE [UNIQUE] INDEX` | `DROP INDEX` | Standard |
| SQLite | `CREATE [UNIQUE] INDEX` | `DROP INDEX` | Standard |

**PostgreSQL advanced parameters**: `USING`, `INCLUDE`, `WITH`, `WHERE`, and `TABLESPACE` are all supported in `_build_index_sql`. See [PostgreSQL Deep Dive](postgresql.md) for full coverage.

**PostgreSQL `CONCURRENTLY`**: DBWarden defaults to `CREATE INDEX CONCURRENTLY` to avoid blocking writes to the table while the index builds. Pass `--no-concurrent` when the migration must run inside a transaction block, because PostgreSQL does not allow `CREATE INDEX CONCURRENTLY` inside one.

**Full-content comparison**: Indexes are compared by **all** attributes (using, unique, where, include, with_params, tablespace, nulls_not_distinct, column_sorting, concurrently), not just columns + name. Any difference produces a drop+add; ALTER INDEX is not used.

**Auto-generated names**: `idx_{table}_{col1}_{col2}` (non-unique), `uq_{table}_{col1}_{col2}` (unique). Non-btree `USING` methods append a suffix: `idx_{table}_{col}_{method}`.

**No ALTER INDEX**: All index parameter changes (adding a WHERE clause, switching USING methods, changing sort order) produce `DROP INDEX` + `CREATE INDEX`. This is intentional; `ALTER INDEX` support varies widely across backends and index attribute types.
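The naming convention for auto-generated indexes can be sketched as a small helper. The `index_name` function is hypothetical and only mirrors the patterns listed above; how DBWarden normalizes the method name, or whether the method suffix also applies to unique indexes, is an assumption here.

```python
def index_name(table, columns, unique=False, using="btree"):
    """Build an index name following the documented idx_/uq_ patterns."""
    prefix = "uq" if unique else "idx"
    parts = [prefix, table, *columns]
    # Assumption: any non-btree method is appended in lowercase as a suffix.
    if using.lower() != "btree":
        parts.append(using.lower())
    return "_".join(parts)


composite = index_name("orders", ["user_id", "created_at"])
unique_email = index_name("users", ["email"], unique=True)
gin_payload = index_name("events", ["payload"], using="gin")
```

Because index changes are always emitted as `DROP INDEX` + `CREATE INDEX`, a deterministic name like this lets the drop statement target the index created by an earlier migration.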
## Safe Type Change

| Backend | Supported | Behavior |
|---------|-----------|----------|
| PostgreSQL | Yes | Multi-step: add temp column, backfill comment, verify, drop+rename |
| MySQL / MariaDB | Yes | Multi-step: add temp column, backfill comment, verify, drop+rename |
| SQLite | No | Comment emitted |

The `--safe-type-change` flag generates a multi-step strategy:

1. Add a temporary column with the new type
2. Emit a `--` comment with an `UPDATE` statement template
3. Emit a verification comment
4. After manual verification, drop the old column and rename the temporary column

On SQLite, this strategy is not supported because SQLite cannot drop columns (before 3.35.0) and has limited ALTER TABLE support. A comment is emitted instead.

## Table Rename

All four SQL backends support `ALTER TABLE t RENAME TO newname`. The syntax is uniform. ClickHouse is the only supported backend that does not support table rename (see [ClickHouse](clickhouse.md)).

## DROP COLUMN Warning

All DROP COLUMN statements are prefixed with a warning comment:

```sql
-- WARNING: DROPPING COLUMN users.legacy_field
ALTER TABLE users DROP COLUMN legacy_field
```

This applies to all SQL backends. The warning is a comment only and does not affect execution. In MySQL/MariaDB, be especially careful with DROP COLUMN since the DDL auto-commits and cannot be rolled back.

## DROP TABLE

`DROP TABLE` emits a rollback comment that references restoring from snapshot. The actual rollback SQL is a placeholder and must be written manually if needed.

## Statement Ordering

```
RENAME TABLE (0)
RENAME COLUMN (1)
ALTER COLUMN TYPE (2)
ALTER COLUMN NULLABLE (3)
ALTER COLUMN DEFAULT (4)
CREATE TABLE (5)
ADD COLUMN (6)
ALTER FOREIGN KEY (7)
ALTER INDEX (8)
DROP COLUMN (9)
DROP TABLE (10)
ALTER TABLE COMMENT (11)
ALTER COLUMN COMMENT (12)
ALTER TABLE OPTIONS (13)
ALTER TABLE CONSTRAINT (14)
```

This ordering ensures safe execution across all SQL backends.
Table renames come first so all subsequent ops use the new name. Column and table drops run after every rename, alter, and create, which minimizes the risk of referencing dropped objects; only comment, option, and constraint changes follow them.

## Migration Name Generation

Auto-generated migration names are truncated to 72 characters. Operation words (`add`, `drop`, `alter`, `create`, `rename`, `add_columns`, etc.) are preserved during truncation; table and column names are shortened as needed. This applies uniformly across all backends.

## Resolved From Values

The plan JSON `resolved_from` field indicates how a rename was confirmed:

| Value | Meaning |
|-------|---------|
| `"rename_flag"` | Explicitly declared via `--rename` or `--rename-table` CLI flag |
| `"prompt"` | Confirmed interactively by the user |
| (absent) | Auto-detected and kept without explicit confirmation |

## Known Backend-Specific Limitations

### PostgreSQL

- The `USING` clause for type casts is emitted commented out by default; pass `--postgres-auto-using` on `make-migrations` to activate it, or write a manual migration for casts that need a custom expression
- `CREATE INDEX CONCURRENTLY` cannot run inside a transaction block; pass `--no-concurrent` when the migration must be transactional
- `ALTER COLUMN ADD GENERATED ...
AS (expr) STORED` is not supported by PostgreSQL; DBWarden emits a comment placeholder ### MySQL / MariaDB - DDL is not transactional; partial failure leaves schema inconsistent - `MODIFY COLUMN` requires full column definition; auto-generated nullable change SQL includes the column type but may omit other attributes - Column rename is not natively supported (requires `CHANGE` syntax with type) - Foreign key drop uses `DROP FOREIGN KEY` (constraint name is still the auto-generated name) ### SQLite - `ALTER COLUMN TYPE` is not supported (comment emitted) - `ALTER COLUMN [SET/DROP] NOT NULL` is not supported (comment emitted) - `--safe-type-change` is not supported (comment emitted) - FK constraints are not directly alterable (comment suggesting table recreation) - Column rename supported since 3.25.0; older versions need manual migration - Type affinity differs from server databases (e.g., `VARCHAR(255)` becomes `TEXT`) - Limited to single-writer; no concurrent writes ======================================================================== PAGE: https://dbwarden.emiliano-go.com/databases/ ======================================================================== # Supported Databases DBWarden supports PostgreSQL (the default and first-class backend), MySQL, MariaDB, SQLite, and ClickHouse. A **round-trip** backend is one where DBWarden can both read schema (via `generate-models`) and write schema (via `make-migrations` / `migrate`). 
## Backend Matrix | Backend | `database_type` | Typical URL | Round-Trip | |---------|------------------|-------------|------------| | PostgreSQL | `postgresql` | `postgresql://user:pass@host:5432/db` | Yes | | MySQL | `mysql` | `mysql://user:pass@host:3306/db` | Yes | | MariaDB | `mariadb` | `mariadb://user:pass@host:3306/db` | No | | ClickHouse | `clickhouse` | `clickhouse://user:pass@host:8123/db` | Yes | | SQLite | `sqlite` | `sqlite:///./app.db` | Dev only | ## Optional Dependency Groups When you install `dbwarden`, the `[postgres]` extra is included by default (providing the PostgreSQL driver). For other backends you must specify the corresponding extra: | Extra | Command | Driver | |-------|---------|--------| | `[postgres]` | Included by default | `psycopg2-binary` | | `[mysql]` | `uv add "dbwarden[mysql]"` | `pymysql` | | `[mariadb]` | `uv add "dbwarden[mariadb]"` | `pymysql` | | `[clickhouse]` | `uv add "dbwarden[clickhouse]"` | `clickhouse-connect` | See [Installation](installation.md) for full details. ## Config Examples PostgreSQL: ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/main", ) ``` MySQL: ```python legacy = database_config( database_name="legacy", database_type="mysql", database_url_sync="mysql://user:password@localhost:3306/legacy", ) ``` SQLite: ```python dev = database_config( database_name="dev", database_type="sqlite", database_url_sync="sqlite:///./development.db", ) ``` ClickHouse: ```python analytics = database_config( database_name="analytics", database_type="clickhouse", database_url_sync="clickhouse://user:password@localhost:8123/analytics", ) ``` ## Internal Connection Handling DBWarden uses SQLAlchemy engines, with backend-specific URL normalization where needed. 
Conceptual flow:

```python
def get_engine(config):
    url = config.sqlalchemy_url
    if config.database_type == "clickhouse":
        url = normalize_clickhouse_dialect(url)
    return create_engine(url)
```

Connections include retry logic: `get_db_connection()` wraps engine connections with up to 5 attempts and exponential backoff when the database is temporarily unavailable (e.g. during a restart or network hiccup). Engines are cached and reused across calls.

For PostgreSQL schema support, DBWarden sets `search_path` on connection when `postgres_schema` is configured.

## Development Database Strategy

Recommended pattern:

- Production-like primary DB (for example PostgreSQL)
- SQLite for dev DB via `dev_database_url`
- Run local commands with `--dev`

```python
primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync="postgresql://user:password@localhost:5432/main",
    dev_database_type="sqlite",
    dev_database_url="sqlite:///./development.db",
)
```

```bash
$ dbwarden --dev make-migrations "sync models" -d primary
$ dbwarden --dev migrate -d primary
```

## Translation Note

When targeting SQLite in dev mode, DBWarden translates unsupported backend-specific types/defaults.

- Unknown/unsupported types fall back to `TEXT` with warnings
- `--strict-translation` turns those warnings into errors

Details: [SQL Translation](sql-translation.md)

## Backend-Specific Notes

Each backend has deep-dive documentation:

| Backend | Guide |
|---------|-------|
| PostgreSQL | [PostgreSQL Deep Dive](databases/postgresql.md) |
| MySQL / MariaDB | [MySQL Deep Dive](databases/mysql.md) |
| SQLite | [SQL Databases](databases/sql-databases.md) |
| ClickHouse | [ClickHouse Deep Dive](databases/clickhouse.md) |

### PostgreSQL

PostgreSQL is a **first-class backend** with full round-trip support.
All metadata: identity columns, collation, storage, compression, generated columns, fillfactor, tablespace, inheritance, exclude constraints, deferrable FKs, and advanced index options, is captured by the snapshot, diffed correctly, and emitted as valid DDL. See [PostgreSQL Deep Dive](databases/postgresql.md) for the complete reference. ### MySQL MySQL is a **first-class backend** with full round-trip support. All metadata: engine, charset, collation, row format, auto_increment, unsigned columns, ON UPDATE, and column comments, is captured by the snapshot, diffed correctly, and emitted as valid DDL. Key MySQL DDL behavior: - **DDL is NOT transactional**: each statement auto-commits; partial failure possible - Column type/nullable changes use `MODIFY COLUMN` (requires full column definition) - Table comments use `ALTER TABLE t COMMENT = '...'` (not `COMMENT ON`) - Column comments use `MODIFY COLUMN ... COMMENT '...'` (full column definition preserved) - Auto-increment toggle uses `MODIFY COLUMN ... AUTO_INCREMENT` - FK drop uses `DROP FOREIGN KEY` (not `DROP CONSTRAINT`) See [MySQL Deep Dive](databases/mysql.md) for the complete reference. ### MariaDB MariaDB is supported as a separate `database_type` (`mariadb`), but it does **not** have round-trip support. You can use MariaDB as a target database for migrations, but `generate-models` and full schema introspection are not available. Use `make-migrations` to write migrations manually. See [MySQL Deep Dive](databases/mysql.md) for MariaDB-specific notes. 
### SQLite - Great for local tests and dev loops via `--dev` mode - Limited DDL: no `ALTER COLUMN TYPE`, no `SET/DROP NOT NULL`, no FK alterations - `--safe-type-change` emits a comment (not supported) - Type affinity differs from server databases - See [SQL Translation](sql-translation.md) for dev-mode type mapping ### ClickHouse ClickHouse has full round-trip support: `generate-models` reads schema from a live ClickHouse server, and `make-migrations` / `migrate` auto-generates DDL for table operations. - HTTP-based wire protocol; DBWarden uses ClickHouse client, not SQLAlchemy session - DDL operations now mostly auto-generated: table rename, column type change, nullable/LowCardinality changes, projections. FK, standard indexes, and safe type change still emit comment placeholders. - Full engine metadata support via `class Meta(CHTableMeta)` with `ChEngineSpec`, `ProjectionSpec`, `CHColumnMeta` - Supports materialized views, projections, dictionaries, replicated engines - See [ClickHouse Deep Dive](databases/clickhouse.md) for full details ## Recommended Verification Workflow ```bash # local loop on dev DB $ dbwarden --dev migrate -d primary # pre-release validation on production-like DB $ dbwarden migrate -d primary $ dbwarden status -d primary ``` ======================================================================== PAGE: https://dbwarden.emiliano-go.com/fastapi/advanced/engine-lifecycle/ ======================================================================== # Engine Lifecycle Learn how DBWarden manages database engines and connections. ## How Engines Are Created When you first use a session annotation like `primary.async_session`: 1. **First request arrives** with a session parameter annotation 2. **DBWarden checks engine cache** - Is there an engine for this database? 3. 
**If not cached:**
   - Reads database config from DBWarden registry
   - Converts URL to async version (`postgresql+asyncpg://...`)
   - Creates `AsyncEngine` with `create_async_engine()`
   - Caches engine by database URL
4. **Creates session factory** (`async_sessionmaker`)
5. **Opens new session** for this request
6. **Yields session** to your route
7. **Closes session** in finally block

### Engine Caching

Engines are cached per unique database URL:

```python
from sqlalchemy.ext.asyncio import create_async_engine

# Internal cache (simplified)
_engine_cache = {}

def get_engine(database_url: str):
    # Create the engine on first use, then reuse it for every later call
    if database_url not in _engine_cache:
        _engine_cache[database_url] = create_async_engine(database_url)
    return _engine_cache[database_url]
```

**Why cache engines?**

- Engines are expensive to create
- Each engine manages a connection pool
- Creating per-request would exhaust connections
- Single engine per database is SQLAlchemy best practice

## Connection Pooling

Each engine maintains a connection pool:

### Default Pool Settings

```python
create_async_engine(
    database_url,
    pool_size=5,      # Connections kept open in the pool
    max_overflow=10,  # Extra connections allowed when the pool is full
    pool_timeout=30,  # Seconds to wait for an available connection
    pool_recycle=-1,  # Never recycle connections by age (SQLAlchemy default)
)
```

### Custom Pool Settings

To customize, you need to create engines manually (advanced):

```python
from sqlalchemy.ext.asyncio import create_async_engine

custom_engine = create_async_engine(
    "postgresql+asyncpg://...",
    pool_size=20,        # More connections
    max_overflow=5,      # Fewer overflow
    pool_pre_ping=True,  # Test connections before use
)
```

DBWarden uses SQLAlchemy's defaults, which work well for most applications.

## Engine Disposal

Engines should be disposed when your app shuts down.
### Using `dispose_engines` DBWarden provides a built-in `dispose_engines()` function that closes all cached engines and clients: ```python from contextlib import asynccontextmanager from fastapi import FastAPI from dbwarden.fastapi import dispose_engines @asynccontextmanager async def lifespan(app: FastAPI): yield dispose_engines() app = FastAPI(lifespan=lifespan) ``` This closes async and sync session factories, connection pools, and ClickHouse clients for all configured databases. In Kubernetes, pods are terminated quickly, so disposal is less critical. The OS cleans up connections. However, calling `dispose_engines()` is still recommended for clean shutdowns. ## Session Lifecycle Each request gets its own session: ``` Request → get_session() → Session created → Route runs → Session closed ``` ### Session Settings DBWarden sessions use: ```python async_sessionmaker( engine, class_=AsyncSession, expire_on_commit=False, # Keep objects accessible after commit ) ``` ### Why `expire_on_commit=False`? **Without it:** ```python @app.post("/users") async def create_user(session: primary.async_session): user = User(email="test@example.com") session.add(user) await session.commit() # Error: Instance is not bound to a Session return user ``` **With it:** ```python @app.post("/users") async def create_user(session: primary.async_session): user = User(email="test@example.com") session.add(user) await session.commit() # Works: Object still accessible return user ``` FastAPI needs to serialize the object after the route returns, so `expire_on_commit=False` is essential. ## Connection Pool Exhaustion ### Symptoms ``` TimeoutError: QueuePool limit of size 5 overflow 10 reached ``` This happens when: - Too many concurrent requests - Connections not being released - Long-running queries - Connection leaks ### Solutions #### 1. Increase Pool Size ```python # In engine creation pool_size=20, max_overflow=10, ``` #### 2. 
Profile Connection Usage

```python
# Enable pool logging
import logging

logging.basicConfig()
logging.getLogger('sqlalchemy.pool').setLevel(logging.DEBUG)
```

#### 3. Close Connections Properly

Make sure sessions close:

```python
# Correct - session closes automatically
@app.get("/users")
async def list_users(session: primary.async_session):
    result = await session.execute(select(User))
    return result.scalars().all()
    # Session closes after the route returns

# Wrong - keeping session reference
_sessions = []

@app.get("/users")
async def list_users(session: primary.async_session):
    _sessions.append(session)  # Leak!
    ...
```

#### 4. Set Connection Timeout

```python
pool_timeout=30,  # Wait 30 seconds for connection
```

## Engine Per Database

With multiple databases, each gets its own engine:

```python
# Internal (simplified): one AsyncEngine per database URL
_engine_cache = {
    "postgresql://localhost/primary": ...,    # AsyncEngine
    "postgresql://localhost/analytics": ...,  # AsyncEngine
    "postgresql://localhost/logging": ...,    # AsyncEngine
}
```

Each engine has its own connection pool.

## Monitoring Connections

### Check Pool Status

```python
@app.get("/debug/pool-status")
async def pool_status():
    engine = get_engine_for_database("primary")  # Hypothetical
    pool = engine.pool
    return {
        "size": pool.size(),
        "checked_in": pool.checkedin(),
        "checked_out": pool.checkedout(),
        "overflow": pool.overflow(),
    }
```

### Log Pool Events

```python
import logging

logging.basicConfig()
logging.getLogger('sqlalchemy.pool').setLevel(logging.INFO)
```

## Connection Recycling

Connections are recycled after `pool_recycle` seconds (default: -1, never):

```python
create_async_engine(
    database_url,
    pool_recycle=3600,  # Recycle after 1 hour
)
```

**Why recycle?**

- Database closes idle connections
- Prevents stale connections
- Refreshes connection state

## Pre-Ping

Test connections before use:

```python
create_async_engine(
    database_url,
    pool_pre_ping=True,  # Test connection with SELECT 1
)
```

**Trade-off:**

- Pro: Prevents errors from stale connections
- Con: Adds latency
to every request ## What's Next? - **[Production Patterns](production-patterns.md)** - Deploy and monitor - **[Multi-Database](multi-database.md)** - Multiple connection pools ======================================================================== PAGE: https://dbwarden.emiliano-go.com/fastapi/advanced/multi-database/ ======================================================================== # Multi-Database Learn how to work with multiple databases in your FastAPI application. ## When to Use Multiple Databases Common scenarios: - **Primary + Analytics** - Separate reporting from transactional data - **Primary + Logging** - Dedicated audit/logging database - **Microservices** - Each service has its own database - **Multi-tenancy** - One database per tenant - **Read Replicas** - Separate read and write databases ## Quick Example Configure multiple databases: ```python # config.py from dbwarden import database_config # Primary database primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost/myapp", model_paths=["app.models.primary"], ) # Analytics database analytics = database_config( database_name="analytics", database_type="postgresql", database_url_sync="postgresql://user:password@localhost/analytics", model_paths=["app.models.analytics"], ) # Logging database logging = database_config( database_name="logging", database_type="postgresql", database_url_sync="postgresql://user:password@localhost/logs", model_paths=["app.models.logging"], ) ``` Each handle's `.async_session` is a FastAPI dependency annotation: use it directly in routes: ```python from config import primary, analytics, logging @app.get("/users") async def list_users(session: primary.async_session): result = await session.execute(select(User)) return result.scalars().all() @app.get("/analytics/events") async def list_events(session: analytics.async_session): result = await session.execute(select(Event)) return 
result.scalars().all() ``` ## Query Multiple Databases You can use multiple handles in the same route: ```python @app.get("/dashboard") async def get_dashboard( users_session: primary.async_session, events_session: analytics.async_session, logs_session: logging.async_session, ): # Query primary database users = await users_session.execute(select(User)) # Query analytics database events = await events_session.execute(select(Event)) # Query logging database logs = await logs_session.execute(select(AuditLog)) return { "users": users.scalars().all(), "events": events.scalars().all(), "logs": logs.scalars().all(), } ``` Each session has its own transaction. If one fails, others are unaffected. ## Cross-Database Queries SQLAlchemy doesn't support joining across different databases. Instead: ### Pattern 1: Query Then Combine ```python @app.get("/dashboard") async def get_dashboard( primary_session: primary.async_session, analytics_session: analytics.async_session, ): # Get user IDs from primary user_result = await primary_session.execute(select(User.id)) user_ids = [row[0] for row in user_result.all()] # Get events for those users from analytics event_result = await analytics_session.execute( select(Event).where(Event.user_id.in_(user_ids)) ) return {"events": event_result.scalars().all()} ``` ### Pattern 2: Denormalize Store redundant data in each database: ```python # Primary DB - User class User(Base): __tablename__ = "users" id: Mapped[int] = mapped_column(primary_key=True) email: Mapped[str] # Analytics DB - Event with user email class Event(Base): __tablename__ = "events" id: Mapped[int] = mapped_column(primary_key=True) user_email: Mapped[str] # Denormalized from User event_type: Mapped[str] ``` ### Pattern 3: Application-Level Join ```python @app.get("/enriched-events") async def get_enriched_events( primary_session: primary.async_session, analytics_session: analytics.async_session, ): # Get all users users_result = await primary_session.execute(select(User)) 
users = {u.id: u for u in users_result.scalars().all()} # Get all events events_result = await analytics_session.execute(select(Event)) events = events_result.scalars().all() # Join in Python enriched = [ { "event": event, "user": users.get(event.user_id) } for event in events ] return enriched ``` ## Startup Checks for All Databases Check all databases on startup: ```python from contextlib import asynccontextmanager from dbwarden.fastapi import migration_context @asynccontextmanager async def lifespan(app: FastAPI): async with migration_context( mode="check", all_databases=True, # Check all databases fail_fast=True, ): yield ``` ## Health Checks for All Databases Health endpoints automatically report all databases: ```python from dbwarden.fastapi import DBWardenHealthRouter app.include_router(DBWardenHealthRouter(), prefix="/health") ``` Response shows all databases: ```json { "status": "ok", "databases": [ { "database": "primary", "status": "ok", "connected": true, "pending_migrations": 0, "lock_active": false, "error": null }, { "database": "analytics", "status": "ok", "connected": true, "pending_migrations": 0, "lock_active": false, "error": null }, { "database": "logging", "status": "ok", "connected": true, "pending_migrations": 0, "lock_active": false, "error": null } ] } ``` Check a specific database: ```bash curl http://localhost:8000/health/analytics ``` ## Migrations for Multiple Databases Each database has its own migration history: ```bash # Create migration for primary database $ dbwarden make-migrations -d primary -m "add users table" # Create migration for analytics database $ dbwarden make-migrations -d analytics -m "add events table" # Apply migrations to all databases $ dbwarden migrate --all ``` ## Common Patterns ### Pattern 1: Primary + Read Replica ```python # config.py primary = database_config(database_name="primary", ...) replica = database_config(database_name="replica", ...) 
# Routes @app.post("/users") async def create_user(session: primary.async_session): # Write to primary ... @app.get("/users") async def list_users(session: replica.async_session): # Read from replica ... ``` ### Pattern 2: Tenant Per Database ```python def get_tenant_session(tenant_id: str): return Annotated[AsyncSession, Depends(get_session(f"tenant_{tenant_id}"))] @app.get("/data") async def get_data(tenant_id: str): TenantSessionDep = get_tenant_session(tenant_id) # Use tenant-specific database ... ``` ### Pattern 3: Audit Logging ```python @app.post("/users") async def create_user( user_data: UserCreate, primary_session: primary.async_session, logging_session: logging.async_session, ): # Create user in primary user = User(**user_data.model_dump()) primary_session.add(user) await primary_session.commit() # Log action in logging database log = AuditLog(action="create_user", user_id=user.id) logging_session.add(log) await logging_session.commit() return user ``` ## What's Next? - **[Testing](testing.md)** - Test multi-database applications - **[Transaction Management](transaction-management.md)** - Coordinate across databases - **[Production Patterns](production-patterns.md)** - Deploy multi-database apps ======================================================================== PAGE: https://dbwarden.emiliano-go.com/fastapi/advanced/production-patterns/ ======================================================================== # Production Patterns Best practices for deploying FastAPI applications with DBWarden in production. ## Deployment Strategy ### Pattern 1: Pre-Deploy Migrations (Recommended) Run migrations before deploying new code: ```bash # CI/CD pipeline 1. Run tests 2. Build Docker image 3. Run migrations (on staging) 4. Deploy new code 5. Run migrations (on production) 6. 
Deploy to production ``` **Kubernetes example:** ```yaml apiVersion: batch/v1 kind: Job metadata: name: migrate spec: template: spec: containers: - name: migrate image: myapp:v1.2.3 command: ["dbwarden", "migrate"] env: - name: DATABASE_URL valueFrom: secretKeyRef: name: db-secret key: url restartPolicy: OnFailure ``` Then deploy the app: ```yaml apiVersion: apps/v1 kind: Deployment metadata: name: myapp spec: replicas: 3 template: spec: containers: - name: app image: myapp:v1.2.3 # App uses mode="check" in lifespan ``` ### Pattern 2: Init Container Run migrations in init container before app starts: ```yaml apiVersion: apps/v1 kind: Deployment metadata: name: myapp spec: template: spec: initContainers: - name: migrate image: myapp:latest command: ["dbwarden", "migrate"] env: - name: DATABASE_URL valueFrom: secretKeyRef: name: db-secret key: url containers: - name: app image: myapp:latest env: - name: DATABASE_URL valueFrom: secretKeyRef: name: db-secret key: url ``` Multiple pods may try to migrate simultaneously. DBWarden's migration locking helps, but prefer Pattern 1 for large deployments. 
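To see why a lock matters when several pods start at once, here is a minimal, generic sketch. It is not DBWarden's locking implementation, which this page does not describe: threads stand in for pods, a `threading.Lock` stands in for a database-level migration lock, and the names `run_migrations`, `pending`, and `applied` are hypothetical.

```python
import threading

pending = ["0001_init", "0002_add_users"]
applied: dict[str, str] = {}       # migration name -> pod that applied it
migration_lock = threading.Lock()  # stands in for a database-level lock


def run_migrations(pod: str) -> None:
    """Apply every migration not yet recorded, holding the lock throughout."""
    with migration_lock:
        for name in pending:
            # The check and the write happen under the same lock,
            # so no two pods can both see a migration as unapplied.
            if name not in applied:
                applied[name] = pod


threads = [
    threading.Thread(target=run_migrations, args=(f"pod-{i}",)) for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(applied)  # each migration appears once, applied by whichever pod won the lock
```

Without the lock, two pods could both find a migration unapplied and run it twice. Pattern 1 sidesteps that race by running migrations from a single Job.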
### Pattern 3: Auto-Migrate on Startup (Not Recommended) Only for simple, single-instance deployments: ```python @asynccontextmanager async def lifespan(app: FastAPI): async with migration_context( mode="migrate", allow_in_production=True, # Risky ): yield ``` **Risks:** - Multiple pods race to migrate - No rollback on failure - Downtime during migration ## Environment Configuration ### Environment Variables ```python # config.py import os from pydantic_settings import BaseSettings class Settings(BaseSettings): database_url_sync: str environment: str = "production" log_level: str = "INFO" class Config: env_file = ".env" settings = Settings() ``` Use in DBWarden config: ```python # dbwarden.py from config import settings primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=settings.database_url_sync, model_paths=["app.models"], ) ``` ### Secrets Management **Kubernetes Secrets:** ```yaml apiVersion: v1 kind: Secret metadata: name: db-secret type: Opaque stringData: url: postgresql://user:password@db-host:5432/myapp ``` **AWS Secrets Manager:** ```python import boto3 import json def get_database_url(): client = boto3.client('secretsmanager') response = client.get_secret_value(SecretId='prod/database/url') secret = json.loads(response['SecretString']) return secret['DATABASE_URL'] db = database_config( database_url_sync=get_database_url(), ... ) ``` ## Health Checks ### Kubernetes Probes ```yaml apiVersion: apps/v1 kind: Deployment metadata: name: myapp spec: template: spec: containers: - name: app # Liveness: Is the app alive? livenessProbe: httpGet: path: /health/ port: 8000 initialDelaySeconds: 30 periodSeconds: 30 timeoutSeconds: 5 failureThreshold: 3 # Readiness: Is the app ready for traffic? 
readinessProbe: httpGet: path: /health/ port: 8000 initialDelaySeconds: 10 periodSeconds: 10 timeoutSeconds: 3 failureThreshold: 2 ``` ### Load Balancer Health Checks **AWS ALB:** ```yaml TargetGroup: HealthCheckEnabled: true HealthCheckPath: /health/ HealthCheckIntervalSeconds: 30 HealthCheckTimeoutSeconds: 5 HealthyThresholdCount: 2 UnhealthyThresholdCount: 3 Matcher: HttpCode: "200" ``` ## Monitoring ### Prometheus Metrics Export database metrics: ```python from prometheus_client import Counter, Histogram, Gauge import time # Metrics db_query_duration = Histogram( 'db_query_duration_seconds', 'Database query duration', ['operation'] ) db_connections_active = Gauge( 'db_connections_active', 'Active database connections' ) # Middleware to track queries @app.middleware("http") async def track_queries(request, call_next): start = time.time() response = await call_next(request) duration = time.time() - start if hasattr(request.state, 'db_queries'): db_query_duration.labels(operation='query').observe(duration) return response ``` ### Structured Logging ```python import structlog logger = structlog.get_logger() @app.post("/users") async def create_user(user_data: UserCreate, session: primary.async_session): logger.info( "creating_user", email=user_data.email, username=user_data.username ) user = User(**user_data.model_dump()) session.add(user) await session.commit() logger.info( "user_created", user_id=user.id, email=user.email ) return user ``` ### Distributed Tracing ```python from opentelemetry import trace from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor from opentelemetry.instrumentation.sqlalchemy import SQLAlchemyInstrumentor # Initialize tracing FastAPIInstrumentor.instrument_app(app) SQLAlchemyInstrumentor().instrument() # Traces will automatically include database queries ``` ## Performance Optimization ### Connection Pooling ```python # Increase pool size for high-traffic apps create_async_engine( database_url, pool_size=20, 
max_overflow=10, pool_pre_ping=True, ) ``` ### Query Optimization ```python # Eager load relationships @app.get("/users/{user_id}/posts") async def get_user_posts(user_id: int, session: primary.async_session): result = await session.execute( select(User) .options(selectinload(User.posts)) # Eager load .where(User.id == user_id) ) user = result.scalar_one() return user.posts ``` ### Caching ```python from functools import lru_cache @lru_cache(maxsize=100) def get_cached_user(user_id: int): # Cache expensive computations pass ``` ## Zero-Downtime Deployments ### Blue-Green Deployment 1. Run migrations (backward compatible) 2. Deploy new version (green) 3. Shift traffic to green 4. Keep blue as backup 5. Decommission blue after validation ### Rolling Updates ```yaml apiVersion: apps/v1 kind: Deployment metadata: name: myapp spec: replicas: 5 strategy: type: RollingUpdate rollingUpdate: maxSurge: 1 # One extra pod during update maxUnavailable: 0 # All pods must be ready ``` ### Backward-Compatible Migrations **Bad:** ```python # Breaking change op.drop_column('users', 'old_field') ``` **Good:** ```python # Backward compatible # Step 1: Add new field op.add_column('users', sa.Column('new_field', sa.String())) # Step 2: Deploy code using new_field # Step 3: Backfill data # Step 4: Deploy code no longer using old_field # Step 5: Drop old_field op.drop_column('users', 'old_field') ``` ## Disaster Recovery ### Backups ```bash # Automated backups before migrations $ dbwarden migrate --with-backup --backup-dir /backups ``` ### Rollback Plan ```bash # If deployment fails 1. Roll back code to previous version 2. Roll back migrations: dbwarden rollback --count 1 3. 
Verify health: curl /health/ ``` ## Security ### Connection String Security ```python # Never commit database_url_sync="postgresql://user:password@host/db" # Use environment variables database_url_sync=os.getenv("DATABASE_URL") ``` ### SSL/TLS ```python database_url_sync="postgresql://user:password@host/db?sslmode=require" ``` ### Least Privilege Create application database user with minimal permissions: ```sql CREATE USER myapp_user WITH PASSWORD 'secret'; GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO myapp_user; -- Don't grant DROP, TRUNCATE, etc. ``` ## CI/CD Pipeline Example ```yaml # .github/workflows/deploy.yml name: Deploy on: push: branches: [main] jobs: test: runs-on: ubuntu-latest steps: - uses: actions/checkout@v2 - name: Run tests run: pytest migrate: needs: test runs-on: ubuntu-latest steps: - uses: actions/checkout@v2 - name: Run migrations run: | dbwarden migrate env: DATABASE_URL: ${{ secrets.DATABASE_URL }} deploy: needs: migrate runs-on: ubuntu-latest steps: - name: Deploy to Kubernetes run: | kubectl set image deployment/myapp myapp=myapp:${{ github.sha }} ``` ## What's Next? - **[Multi-Database](multi-database.md)** - Scale with multiple databases - **[Testing](testing.md)** - Test production patterns in CI ======================================================================== PAGE: https://dbwarden.emiliano-go.com/fastapi/advanced/testing/ ======================================================================== # Testing Learn how to test FastAPI applications that use DBWarden. ## Quick Example The simplest way to test with DBWarden is to configure the test database via environment variables. 
No dependency overrides needed: ```python import os import pytest from fastapi.testclient import TestClient # Point DBWarden at an in-memory SQLite database for tests os.environ["ENVIRONMENT"] = "test" os.environ["DEV_DATABASE_URL"] = "sqlite:///:memory:" from app.main import app from app.models import Base @pytest.fixture def client(): from dbwarden.commands.migrate import migrate_single migrate_single(database="primary") yield TestClient(app) def test_create_user(client): response = client.post( "/api/v1/users/", json={ "email": "test@example.com", "username": "testuser", }, ) assert response.status_code == 201 data = response.json() assert data["email"] == "test@example.com" ``` When `ENVIRONMENT=test` is set and `dev_database_url` is configured in your `database_config()`, DBWarden automatically uses the test URL. No manual engine creation, session factories, or dependency overrides needed. ## Test Database Setup ### Option 1: SQLite In-Memory Fast, isolated, no cleanup needed: ```python import pytest from sqlalchemy import create_engine from sqlalchemy.orm import sessionmaker from sqlalchemy.pool import StaticPool from app.models import Base @pytest.fixture(scope="function") def test_db(): engine = create_engine( "sqlite:///:memory:", connect_args={"check_same_thread": False}, poolclass=StaticPool, ) Base.metadata.create_all(bind=engine) TestingSession = sessionmaker(bind=engine) yield TestingSession() Base.metadata.drop_all(bind=engine) ``` ### Option 2: PostgreSQL Test Database More realistic, slower: ```python import pytest from sqlalchemy import create_engine from sqlalchemy.orm import sessionmaker from app.models import Base TEST_DATABASE_URL = "postgresql://user:password@localhost/test_db" @pytest.fixture(scope="function") def test_db(): engine = create_engine(TEST_DATABASE_URL) Base.metadata.create_all(bind=engine) TestingSession = sessionmaker(bind=engine) session = TestingSession() yield session session.close() Base.metadata.drop_all(bind=engine) ``` 
### Option 3: Transaction Rollback Fastest for repeated tests: ```python @pytest.fixture(scope="function") def test_db(): connection = engine.connect() transaction = connection.begin() session = TestingSession(bind=connection) yield session session.close() transaction.rollback() connection.close() ``` ## Override Session Dependency ### Method 1: Environment Variables (Recommended) Configure a `dev_database_url` in your config, then set `ENVIRONMENT=test`: ```python # config.py primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/prod", dev_database_url="sqlite:///./test.db", model_paths=["app.models"], ) ``` ```python # conftest.py import os import pytest os.environ["ENVIRONMENT"] = "test" @pytest.fixture(autouse=True) def setup_test_db(): from dbwarden.commands.migrate import migrate_single migrate_single(database="primary", verbose=False) yield ``` This works with the `DatabaseHandle` pattern without any dependency overrides. 
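Assigning to `os.environ` at import time, as in the `conftest.py` above, leaves the variable set for the rest of the test process. Where that matters, one option is a small helper that restores the previous values afterwards. This is a standard-library sketch: `temp_env` is a hypothetical name, not part of DBWarden, and it only helps if the application's configuration is imported while the override is active.

```python
import os
from contextlib import contextmanager


@contextmanager
def temp_env(**overrides: str):
    """Set environment variables for the duration of a block, then restore them."""
    saved = {key: os.environ.get(key) for key in overrides}
    os.environ.update(overrides)
    try:
        yield
    finally:
        for key, old in saved.items():
            if old is None:
                os.environ.pop(key, None)  # variable did not exist before
            else:
                os.environ[key] = old      # put the previous value back


before = os.environ.get("ENVIRONMENT")
with temp_env(ENVIRONMENT="test"):
    inside = os.environ["ENVIRONMENT"]
after = os.environ.get("ENVIRONMENT")
```

Inside pytest, the built-in `monkeypatch.setenv` fixture method gives the same restore-on-exit behavior per test.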
### Method 2: `get_session` Override

For apps that use the `Annotated[AsyncSession, Depends(get_session())]` pattern:

```python
from app.dependencies import get_session  # The function, not a call


def override_get_session():
    # TestingSessionLocal is a sessionmaker bound to your test engine
    db = TestingSessionLocal()
    try:
        yield db
    finally:
        db.close()


app.dependency_overrides[get_session] = override_get_session
```

### Method 3: Fixture-Based Override

```python
import pytest
from fastapi.testclient import TestClient


@pytest.fixture
def client(test_db):
    def override():
        try:
            yield test_db
        finally:
            test_db.rollback()

    app.dependency_overrides[get_session] = override
    yield TestClient(app)
    app.dependency_overrides.clear()
```

### Method 4: Async Override

For async tests using `get_session()`:

```python
import pytest
from httpx import ASGITransport, AsyncClient
from sqlalchemy.ext.asyncio import create_async_engine, AsyncSession


@pytest.fixture
async def async_client():
    engine = create_async_engine("sqlite+aiosqlite:///:memory:")
    async with engine.begin() as conn:
        await conn.run_sync(Base.metadata.create_all)

    async def override():
        async with AsyncSession(engine) as session:
            yield session

    app.dependency_overrides[get_session] = override
    # httpx 0.28 removed the `app=` shortcut; pass an ASGITransport instead
    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as client:
        yield client
    app.dependency_overrides.clear()
```

## Testing CRUD Operations

### Test Create

```python
def test_create_user(client):
    response = client.post(
        "/api/v1/users/",
        json={"email": "test@example.com", "username": "test"}
    )
    assert response.status_code == 201
    assert response.json()["email"] == "test@example.com"
```

### Test Read

```python
def test_get_user(client, test_db):
    # Setup: create user
    user = User(email="test@example.com", username="test")
    test_db.add(user)
    test_db.commit()

    # Test: get user
    response = client.get(f"/api/v1/users/{user.id}")
    assert response.status_code == 200
    assert response.json()["email"] == "test@example.com"
```

### Test Update

```python
def test_update_user(client, test_db):
    user = User(email="test@example.com", username="test")
    test_db.add(user)
    test_db.commit()

    response = client.patch(
        f"/api/v1/users/{user.id}",
        json={"email": "new@example.com"}
    )
    assert response.status_code == 200
    assert response.json()["email"] == "new@example.com"
```

### Test Delete

```python
def test_delete_user(client, test_db):
    user = User(email="test@example.com", username="test")
    test_db.add(user)
    test_db.commit()
    user_id = user.id

    response = client.delete(f"/api/v1/users/{user_id}")
    assert response.status_code == 204

    # Verify deleted
    assert test_db.get(User, user_id) is None
```

## Test Fixtures

### User Fixture

```python
@pytest.fixture
def sample_user(test_db):
    user = User(
        email="test@example.com",
        username="testuser",
        is_active=True
    )
    test_db.add(user)
    test_db.commit()
    test_db.refresh(user)
    return user
```

### Multiple Users

```python
@pytest.fixture
def sample_users(test_db):
    users = [
        User(email=f"user{i}@example.com", username=f"user{i}")
        for i in range(5)
    ]
    test_db.add_all(users)
    test_db.commit()
    return users
```

## Testing Multi-Database

Use environment variables to configure test databases per handle:
```python import os import pytest from fastapi.testclient import TestClient os.environ["PRIMARY_DB_URL"] = "sqlite:///./test_primary.db" os.environ["ANALYTICS_DB_URL"] = "sqlite:///./test_analytics.db" os.environ["ENVIRONMENT"] = "test" from app.main import app # config loads after env vars are set @pytest.fixture def client(): from dbwarden.commands.migrate import migrate_single migrate_single(database="primary") migrate_single(database="analytics") yield TestClient(app) ``` ## Testing Error Cases ### Test 404 ```python def test_user_not_found(client): response = client.get("/api/v1/users/9999") assert response.status_code == 404 assert "not found" in response.json()["detail"].lower() ``` ### Test Duplicate ```python def test_duplicate_user(client, sample_user): response = client.post( "/api/v1/users/", json={ "email": sample_user.email, # Duplicate "username": "different" } ) assert response.status_code == 400 ``` ### Test Validation ```python def test_invalid_email(client): response = client.post( "/api/v1/users/", json={"email": "notanemail", "username": "test"} ) assert response.status_code == 422 ``` ## Async Testing ### With pytest-asyncio ```python import pytest from httpx import AsyncClient @pytest.mark.asyncio async def test_create_user_async(async_client): response = await async_client.post( "/api/v1/users/", json={"email": "test@example.com", "username": "test"} ) assert response.status_code == 201 ``` ### Async Fixtures ```python @pytest.fixture async def async_session(): engine = create_async_engine("sqlite+aiosqlite:///:memory:") async with engine.begin() as conn: await conn.run_sync(Base.metadata.create_all) async_session = AsyncSession(engine) yield async_session await async_session.close() ``` ## Testing Health Endpoints ```python def test_health_endpoint(client): response = client.get("/health/") assert response.status_code == 200 data = response.json() assert "status" in data assert "databases" in data ``` ## Mocking ### Mock External Service 
```python from unittest.mock import patch def test_user_with_external_service(client): with patch('app.services.external_api.call') as mock: mock.return_value = {"verified": True} response = client.post( "/api/v1/users/", json={"email": "test@example.com", "username": "test"} ) assert response.status_code == 201 mock.assert_called_once() ``` ## What's Next? - **[Transaction Management](transaction-management.md)** - Complex transaction patterns - **[Production Patterns](production-patterns.md)** - CI/CD and integration tests ======================================================================== PAGE: https://dbwarden.emiliano-go.com/fastapi/advanced/transaction-management/ ======================================================================== # Transaction Management Learn how to manage database transactions in FastAPI with DBWarden. In these examples, `primary` is a `DatabaseHandle` created with `database_config()`. Use `primary.async_session` as the route parameter annotation to get a request-scoped session. See [Session Dependency](../tutorial/session-dependency.md). 
## Automatic Transactions By default, DBWarden sessions handle transactions automatically: ```python @app.post("/users") async def create_user(user_data: UserCreate, session: primary.async_session): user = User(**user_data.model_dump()) session.add(user) await session.commit() # Explicit commit return user # Session automatically closes here ``` ## When to Commit ### Automatic (Session Autoflush) For simple operations, SQLAlchemy flushes changes automatically: ```python @app.get("/users/{user_id}") async def get_user(user_id: int, session: primary.async_session): result = await session.execute(select(User).where(User.id == user_id)) return result.scalar_one_or_none() # No commit needed for reads ``` ### Manual Commit For writes, explicitly commit: ```python @app.post("/users") async def create_user(user_data: UserCreate, session: primary.async_session): user = User(**user_data.model_dump()) session.add(user) await session.commit() # Explicit commit await session.refresh(user) # Get DB-generated values return user ``` ## Error Handling and Rollback ### Automatic Rollback If an exception occurs, the session rolls back automatically: ```python @app.post("/users") async def create_user(user_data: UserCreate, session: primary.async_session): user = User(**user_data.model_dump()) session.add(user) if not validate_email(user.email): raise HTTPException(400, "Invalid email") # Session automatically rolls back await session.commit() return user ``` ### Manual Rollback For explicit control: ```python @app.post("/users") async def create_user(user_data: UserCreate, session: primary.async_session): user = User(**user_data.model_dump()) session.add(user) try: await session.commit() except IntegrityError: await session.rollback() # Explicit rollback raise HTTPException(400, "User already exists") return user ``` ## Nested Transactions (Savepoints) Use savepoints for partial rollbacks: ```python from sqlalchemy.exc import IntegrityError @app.post("/batch") async def 
batch_create(users: list[UserCreate], session: primary.async_session): created = [] failed = [] for user_data in users: savepoint = await session.begin_nested() # Create savepoint try: user = User(**user_data.model_dump()) session.add(user) await session.flush() created.append(user) await savepoint.commit() except IntegrityError: await savepoint.rollback() # Rollback to savepoint failed.append(user_data) await session.commit() # Commit all successful inserts return {"created": created, "failed": failed} ``` ## Multiple Operations in One Transaction Group related operations: ```python @app.post("/orders") async def create_order(order_data: OrderCreate, session: primary.async_session): # All operations in one transaction # 1. Create order order = Order(user_id=order_data.user_id) session.add(order) await session.flush() # Get order ID # 2. Add order items for item_data in order_data.items: item = OrderItem(order_id=order.id, **item_data.dict()) session.add(item) # 3. Update inventory for item_data in order_data.items: await session.execute( update(Product) .where(Product.id == item_data.product_id) .values(stock=Product.stock - item_data.quantity) ) # Commit everything at once await session.commit() await session.refresh(order) return order ``` If any step fails, everything rolls back. 
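The all-or-nothing behavior can be seen in a small self-contained sketch. It uses the standard-library `sqlite3` module rather than SQLAlchemy or DBWarden so that it runs without extra dependencies; the table layout and the `place_order` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, stock INTEGER CHECK (stock >= 0))"
)
conn.execute("INSERT INTO products (id, stock) VALUES (1, 5)")
conn.commit()


def place_order(user_id: int, product_id: int, quantity: int) -> bool:
    """Insert an order and decrement stock in one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO orders (user_id) VALUES (?)", (user_id,))
            conn.execute(
                "UPDATE products SET stock = stock - ? WHERE id = ?",
                (quantity, product_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed; the order insert was rolled back as well.
        return False


ok = place_order(user_id=1, product_id=1, quantity=2)       # stock 5 -> 3
failed = place_order(user_id=1, product_id=1, quantity=10)  # would go negative

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
stock = conn.execute("SELECT stock FROM products WHERE id = 1").fetchone()[0]
print(ok, failed, order_count, stock)
```

The second call inserts an order row before the stock update fails, yet only one order exists afterwards, because the whole transaction was rolled back.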
## Isolation Levels

Set the transaction isolation level:

```python
from sqlalchemy.ext.asyncio import create_async_engine

# In engine creation (advanced)
engine = create_async_engine(
    database_url,
    isolation_level="SERIALIZABLE"  # Strictest isolation
)
```

Isolation levels:

- `READ UNCOMMITTED` - Dirty reads possible
- `READ COMMITTED` - Default for PostgreSQL
- `REPEATABLE READ` - No non-repeatable reads; the SQL standard still permits phantom reads at this level, although PostgreSQL's implementation prevents them
- `SERIALIZABLE` - Strictest, slowest

## Two-Phase Commit (Distributed Transactions)

For multi-database transactions (advanced):

```python
@app.post("/transfer")
async def transfer_funds(
    primary_session: primary.async_session,
    analytics_session: analytics.async_session,
):
    try:
        # Step 1: Stage changes on both sessions (flush, not commit)
        user = await primary_session.get(User, 1)
        user.balance -= 100
        await primary_session.flush()

        event = Event(type="transfer", amount=100)
        analytics_session.add(event)
        await analytics_session.flush()

        # Step 2: Commit both (these two commits are not atomic)
        await primary_session.commit()
        await analytics_session.commit()
    except Exception:
        # Roll back whatever has not been committed yet
        await primary_session.rollback()
        await analytics_session.rollback()
        raise
```

This example is a best-effort approximation, not a true two-phase commit: if the first commit succeeds and the second fails, the first can no longer be rolled back. True two-phase commit is complex and not fully supported by SQLAlchemy. Consider the saga pattern or event sourcing for distributed transactions.
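As a rough illustration of the saga pattern mentioned above: each step is paired with a compensating action that undoes it, and when a step fails, the steps already completed are compensated in reverse order. This is a generic in-memory sketch, not a DBWarden or SQLAlchemy API; `SagaStep`, `run_saga`, and the `balances` and `events` stores are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]      # performs the step (e.g. a commit on one database)
    compensate: Callable[[], None]  # undoes the step if a later one fails


def run_saga(steps: list[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    done: list[SagaStep] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            for finished in reversed(done):
                finished.compensate()
            return False
        done.append(step)
    return True


# In-memory stand-ins for two independent databases.
balances = {"user-1": 500}
events: list[str] = []


def debit() -> None:
    balances["user-1"] -= 100


def refund() -> None:
    balances["user-1"] += 100


def record_event() -> None:
    raise RuntimeError("analytics database unavailable")  # simulated failure


succeeded = run_saga([
    SagaStep("debit", debit, refund),
    SagaStep("record_event", record_event, lambda: events.clear()),
])
print(succeeded, balances, events)
```

In a real system each action and each compensation would be its own committed transaction, and compensations should be idempotent because they may be retried.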
## Optimistic Locking Prevent lost updates with version columns: ```python class User(Base): __tablename__ = "users" id: Mapped[int] = mapped_column(primary_key=True) email: Mapped[str] version: Mapped[int] = mapped_column(default=0) # Version column @app.patch("/users/{user_id}") async def update_user( user_id: int, user_data: UserUpdate, session: primary.async_session, ): user = await session.get(User, user_id) # Check version matches if user.version != user_data.expected_version: raise HTTPException(409, "User was modified by someone else") user.email = user_data.email user.version += 1 # Increment version await session.commit() return user ``` ## Pessimistic Locking Lock rows explicitly: ```python from sqlalchemy import select @app.post("/reserve") async def reserve_item(item_id: int, session: primary.async_session): # Lock the row for update result = await session.execute( select(Item) .where(Item.id == item_id) .with_for_update() # SELECT ... FOR UPDATE ) item = result.scalar_one() if item.reserved: raise HTTPException(400, "Already reserved") item.reserved = True await session.commit() return item ``` ## Idempotency Make operations idempotent: ```python @app.post("/orders", status_code=201) async def create_order( order_data: OrderCreate, idempotency_key: str, session: primary.async_session, ): # Check if order already exists existing = await session.execute( select(Order).where(Order.idempotency_key == idempotency_key) ) order = existing.scalar_one_or_none() if order: return order # Already created, return existing # Create new order order = Order(**order_data.dict(), idempotency_key=idempotency_key) session.add(order) await session.commit() return order ``` ## What's Next? 
- **[Engine Lifecycle](engine-lifecycle.md)** - Connection pooling and cleanup
- **[Production Patterns](production-patterns.md)** - Deploy with confidence

========================================================================
PAGE: https://dbwarden.emiliano-go.com/fastapi/concepts/
========================================================================

# Concepts

High-level explanations of how DBWarden's FastAPI integration works.

## What Problem Does It Solve?

Without DBWarden, FastAPI apps typically have **split configuration**:

```python
# migrations/env.py - Alembic config
SQLALCHEMY_DATABASE_URL = "postgresql://..."

# app/database.py - App config
SQLALCHEMY_DATABASE_URL = "postgresql://..."  # Duplicate!

# app/main.py - Manual engine creation
engine = create_async_engine(SQLALCHEMY_DATABASE_URL)
SessionLocal = async_sessionmaker(engine)  # async engines need an async session factory

# Manual dependency
async def get_db():
    async with SessionLocal() as session:
        yield session
```

With DBWarden, you have **one source of truth**:

```python
# dbwarden.py - Single config returns a DatabaseHandle
primary = database_config(
    database_url_sync="postgresql://...",
    model_paths=["app.models"],
)

# app/main.py - Use .async_session directly as a parameter annotation
@app.get("/users")
async def list_users(session: primary.async_session):
    ...
```

DBWarden handles:

- Engine creation
- Session factories
- Connection pooling
- Startup checks
- Health endpoints

## Dependency Injection

FastAPI uses **dependency injection** to provide resources to routes.

### Without DBWarden

```python
# Manual dependency
async def get_db():
    async with SessionLocal() as session:
        try:
            yield session
        except Exception:
            await session.rollback()
            raise

# Every route
@app.get("/users")
async def list_users(db: AsyncSession = Depends(get_db)):
    ...
```

### With DBWarden

```python
# One-time setup: database_config() returns a DatabaseHandle
primary = database_config(database_name="primary", ...)
# Every route: use .async_session directly
@app.get("/users")
async def list_users(session: primary.async_session):
    ...
```

The `DatabaseHandle` `.async_session` property is a FastAPI dependency annotation, ready to use in route parameters without `Annotated`, `Depends`, or type aliases.

## Engine Caching

### Why Cache Engines?

Creating database engines is expensive:

```python
# Bad: New engine per request
@app.get("/users")
async def list_users():
    engine = create_async_engine(...)  # Expensive!
    async with engine.connect() as conn:
        ...
```

Engines should be **created once and reused**:

```python
# Good: One engine for app lifetime
primary = database_config(database_name="primary", ...)  # Engine cached

@app.get("/users")
async def list_users(session: primary.async_session):  # Reuses cached engine
    ...
```

DBWarden caches engines automatically:

```
First request:
  1. primary.async_session resolves
  2. Engine created
  3. Engine cached
  4. Session created from engine
  5. Session yielded to route

Subsequent requests:
  1. primary.async_session resolves
  2. Engine retrieved from cache (fast)
  3. Session created from engine
  4. Session yielded to route
```

## Session Scope

### Request-Scoped Sessions

Each request gets its own session:

```
Request 1: Session A
Request 2: Session B
Request 3: Session C
```

Sessions are **never shared** between requests, ensuring:

- Transaction isolation
- No race conditions
- Predictable behavior

### Session Lifecycle

```
Request arrives
FastAPI calls get_session dependency
New session created
Session yielded to route
Route executes with session
Session automatically closed
Response returned
```

If an error occurs, the session is rolled back before closing.

## Health vs Liveness

Two Kubernetes probe types are relevant here:

### Liveness Probe

**Question:** Is the app alive?

**Answer:** If no, restart the pod.
**Use:** Basic health check ```yaml livenessProbe: httpGet: path: /ping # Simple endpoint, no DB failureThreshold: 3 ``` ### Readiness Probe **Question:** Is the app ready to serve traffic? **Answer:** If no, stop routing traffic (but don't restart). **Use:** Database health, migration state ```yaml readinessProbe: httpGet: path: /health/ # Full health check with DB failureThreshold: 2 ``` DBWarden's health endpoints are perfect for readiness probes because they check: - Database connectivity - Migration state - Lock status ## Async vs Sync DBWarden's FastAPI integration is **async-native**. ### Why Async? **Sync (blocking):** ```python result = session.execute(select(User)) # Blocks thread users = result.scalars().all() ``` While waiting for the database, the thread can't do anything else. **Async (non-blocking):** ```python result = await session.execute(select(User)) # Releases control users = result.scalars().all() ``` While waiting for the database, the event loop can handle other requests. **Result:** Async can handle **10-100x more concurrent requests** than sync with the same resources. ### Async Drivers DBWarden automatically uses async drivers: | Database | Async Driver | |----------|-------------| | PostgreSQL | `asyncpg` | | SQLite | `aiosqlite` | Your URLs are automatically upgraded: ```python # Your config database_url_sync="postgresql://localhost/myapp" # DBWarden derives an async URL for its internal async engine # (uses the configured database_url_async if provided, or upgrades from database_url_sync) database_url_async="postgresql+asyncpg://localhost/myapp" ``` ## expire_on_commit This is a session setting that affects object behavior after commit. ### Without expire_on_commit=False ```python user = User(email="test@example.com") session.add(user) await session.commit() # Error: Instance is not bound to a Session return user ``` After commit, SQLAlchemy **expires** all objects, meaning they're no longer accessible without a session. 
### With expire_on_commit=False ```python user = User(email="test@example.com") session.add(user) await session.commit() # Works: Object still accessible return user ``` Objects remain accessible after commit. **Why does DBWarden use this?** FastAPI serializes response objects **after** the route returns: ``` Route returns user Session closes FastAPI serializes user to JSON Needs to access user.email! Response sent ``` Without `expire_on_commit=False`, serialization would fail because the session is closed. ## Configuration Resolution DBWarden resolves configuration in this order: 1. **Explicit config** - `database_config(...)` calls 2. **Runtime flags** - `dev=True` parameter 3. **Environment variables** - `ENVIRONMENT=development` 4. **Default values** - Built-in defaults Example: ```python primary = database_config( database_url_sync="postgresql://prod-db/myapp", dev_database_url="sqlite:///dev.db", ) # In production: uses postgresql://prod-db/myapp # In development (ENVIRONMENT=development): uses sqlite:///dev.db # primary.async_session automatically picks the right URL ``` ## When to Use DBWarden **Use DBWarden when:** - You want migrations and runtime to share config - You need startup validation - You want built-in health endpoints - You're building a new FastAPI app - You use SQLAlchemy for models **Don't use DBWarden when:** - You don't use SQLAlchemy - You already have working migration infrastructure - You use an ORM other than SQLAlchemy (e.g., Tortoise, SQLModel standalone) ## Comparison to Alternatives ### vs. Alembic + Manual Setup | | **DBWarden** | **Alembic + Manual** | |---|---|---| | Configuration | One source | Split (env.py + app code) | | Engine creation | Automatic | Manual | | Session dependency | Built-in | Custom | | Startup checks | Built-in | Custom | | Health endpoints | Built-in | Custom | | Learning curve | Lower | Higher | ### vs. 
SQLModel SQLModel includes SQLAlchemy but doesn't provide: - Migration management - Startup checks - Health endpoints - Multi-database support DBWarden can work **with** SQLModel for migrations while you use SQLModel's ORM. ### vs. Django ORM Django's ORM is integrated with Django's migration system. DBWarden is for FastAPI + SQLAlchemy apps. ## What's Next? - **[API Reference](reference.md)** - Complete function signatures - **[Tutorial](tutorial/first-steps.md)** - Build your first app ======================================================================== PAGE: https://dbwarden.emiliano-go.com/fastapi/ ======================================================================== # FastAPI Integration DBWarden provides first-class FastAPI integration for database sessions, health checks, and migration management. **One configuration source** for both migrations and runtime: no more split configs. ## Quick Start Install the FastAPI integration: ```bash uv add "dbwarden[fastapi]" ``` Create your first FastAPI app with DBWarden: ```python from fastapi import FastAPI from dbwarden.fastapi import dbwarden_lifespan from contextlib import asynccontextmanager @asynccontextmanager async def lifespan(app: FastAPI): async with dbwarden_lifespan(mode="migrate", allow_in_production=True): yield app = FastAPI(lifespan=lifespan) ``` That's it! **5 lines** to integrate DBWarden. For fine-grained control, use `migration_context()` instead of `dbwarden_lifespan()`. ## Tutorial - First Steps New to DBWarden's FastAPI integration? Start with these tutorials: 1. **[First Steps](tutorial/first-steps.md)** - Get started in 2 minutes 2. **[Session Dependency](tutorial/session-dependency.md)** - Database sessions in routes 3. **[Startup Checks](tutorial/startup-checks.md)** - Validate migrations on boot 4. **[Health Endpoints](tutorial/health-endpoints.md)** - Runtime health monitoring 5. 
**[Complete Application](tutorial/complete-application.md)** - Full working example ## Advanced User Guide Ready for more? Learn advanced patterns: - **[Multi-Database](advanced/multi-database.md)** - Work with multiple databases - **[Testing](advanced/testing.md)** - Override dependencies and test isolation - **[Transaction Management](advanced/transaction-management.md)** - Commits, rollbacks, savepoints - **[Engine Lifecycle](advanced/engine-lifecycle.md)** - Caching, pooling, disposal - **[Production Patterns](advanced/production-patterns.md)** - Kubernetes, CI/CD, monitoring ## Learn Understanding the concepts behind DBWarden's FastAPI integration: - **[Concepts](concepts.md)** - High-level explanations of how it works - **[API Reference](reference.md)** - Complete function signatures and parameters ## Key Features ### Dependency Injection Get SQLAlchemy `AsyncSession` in your routes with proper lifecycle management: ```python from dbwarden import database_config primary = database_config(database_name="primary", ...) 
@app.get("/users") async def list_users(session: primary.async_session): result = await session.execute(select(User)) return result.scalars().all() ``` - No `Annotated`, `Depends`, or type aliases needed - Automatic engine creation and caching - Request-scoped sessions - Automatic cleanup and error handling - Multi-database support ### Health Endpoints Production-ready health checks out of the box: ```python from dbwarden.fastapi import DBWardenHealthRouter app.include_router(DBWardenHealthRouter(), prefix="/health") ``` - Database connectivity checks - Migration state monitoring - Kubernetes liveness/readiness probes - Per-database health status ### Migration Status and Execution Monitor and trigger migrations at runtime: ```python from dbwarden.fastapi import DBWardenRouter app.include_router(DBWardenRouter(), prefix="/db") # GET /db/status - migration and seed status # POST /db/migrate - trigger migrations ``` - Per-database migration and seed status - Runtime migration triggering - Dry-run support - Optional API key authentication ### Prometheus Metrics Expose migration metrics for monitoring: ```python from dbwarden.fastapi import MetricsRouter, MetricsMiddleware app.add_middleware(MetricsMiddleware) app.include_router(MetricsRouter(), prefix="/metrics") ``` - Migration counters and duration histograms - Schema and seed version gauges - Pending migration tracking - Request-scoped gauge refresh ### Distributed Migration Locking Coordinate migrations across multiple application instances with Redis: ```python from dbwarden.fastapi import migration_lock, sync_migration_lock ``` - Prevents concurrent migrations across pods - Async and sync variants available - Configurable key and TTL ### Engine Lifecycle Management Properly dispose engines on shutdown: ```python from dbwarden.fastapi import dispose_engines ``` - Closes all cached engines and clients - Clean shutdown for async and sync session factories - ClickHouse client cleanup ### Startup Validation Ensure 
your database is ready before accepting traffic: ```python from contextlib import asynccontextmanager from dbwarden.fastapi import migration_context @asynccontextmanager async def lifespan(app: FastAPI): async with migration_context(mode="check", fail_fast=True): yield app = FastAPI(lifespan=lifespan) ``` - Connectivity validation - Migration state checks - Optional auto-migration on startup - Dev/prod environment awareness ## Why Use DBWarden with FastAPI? **Without DBWarden**, you typically have: - One config for migrations (Alembic, etc.) - Another config for your FastAPI app (engines, sessions) - Manual startup checks - Custom health endpoints **With DBWarden**, you have: - **One configuration source** for everything - Sessions sourced from your migration config - Built-in startup validation - Production-ready health endpoints **Result:** Less boilerplate, fewer bugs, easier maintenance. See also: [Cookbook: FastAPI Integration](../cookbook/09-fastapi-integration.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/fastapi/reference/ ======================================================================== # API Reference Complete API documentation for DBWarden's FastAPI integration. For most applications, use the `DatabaseHandle` pattern instead of `get_session()`. Call `database_config()` and use `.async_session` directly in route parameters: no `Annotated`, `Depends`, or type aliases needed. See [Session Dependency](tutorial/session-dependency.md). ## `get_session` Returns a FastAPI dependency that yields an `AsyncSession`. 
### Signature ```python def get_session( database: str | None = None, *, dev: bool = False, ) -> Callable[[], AsyncGenerator[AsyncSession, None]] ``` ### Parameters **`database`** : `str | None`, optional - Database name from DBWarden config - If `None`, uses the default database - Default: `None` **`dev`** : `bool`, keyword-only, optional - If `True`, uses `dev_database_url` instead of `database_url` - Useful for local development - Default: `False` ### Returns **`Callable[[], AsyncGenerator[AsyncSession, None]]`** - A dependency function that FastAPI's `Depends()` can consume - The dependency yields an `AsyncSession` for each request - Sessions are automatically closed after the request ### Examples ```python # Default database SessionDep = Annotated[AsyncSession, Depends(get_session())] # Specific database AnalyticsSessionDep = Annotated[AsyncSession, Depends(get_session("analytics"))] # Dev mode DevSessionDep = Annotated[AsyncSession, Depends(get_session(dev=True))] ``` ### Raises - **`ValueError`**: If database type is not supported - **`DBWardenConfigError`**: If config is not loaded or database not found --- ## `migration_context` Async context manager for running startup migration checks or migrations. 
### Signature ```python @asynccontextmanager async def migration_context( *, mode: Literal["migrate", "check"] = "check", database: str | None = None, all_databases: bool = False, dev: bool = False, strict_translation: bool = False, with_backup: bool = False, backup_dir: str | None = None, verbose: bool = False, allow_in_production: bool = False, fail_fast: bool = True, only_dev: bool = False, ) -> AsyncGenerator[None, None] ``` ### Parameters **`mode`** : `Literal["migrate", "check"]`, keyword-only, optional - `"check"` - Read-only validation (recommended for production) - `"migrate"` - Apply pending migrations - Default: `"check"` **`database`** : `str | None`, keyword-only, optional - Database name to check/migrate - If `None`, uses default database - Default: `None` **`all_databases`** : `bool`, keyword-only, optional - If `True`, check/migrate all configured databases - Default: `False` **`dev`** : `bool`, keyword-only, optional - Use dev database URL - Default: `False` **`strict_translation`** : `bool`, keyword-only, optional - Enable strict SQL translation mode - Default: `False` **`with_backup`** : `bool`, keyword-only, optional - Create backup before migrations (migrate mode only) - Default: `False` **`backup_dir`** : `str | None`, keyword-only, optional - Directory for backups - If `None`, uses default location - Default: `None` **`verbose`** : `bool`, keyword-only, optional - Enable detailed logging - Default: `False` **`allow_in_production`** : `bool`, keyword-only, optional - Allow migrate mode in production environment - Default: `False` **`fail_fast`** : `bool`, keyword-only, optional - Exit immediately on failure - If `False`, logs warning but continues - Default: `True` **`only_dev`** : `bool`, keyword-only, optional - Only run in development environments - Skipped if `ENVIRONMENT` is production - Default: `False` ### Returns **`AsyncGenerator[None, None]`** - Async context manager for use in FastAPI lifespan ### Examples ```python # Check mode 
(recommended) @asynccontextmanager async def lifespan(app: FastAPI): async with migration_context(mode="check", all_databases=True): yield # Migrate mode (dev only) @asynccontextmanager async def lifespan(app: FastAPI): async with migration_context( mode="migrate", only_dev=True, with_backup=True, ): yield ``` ### Raises - **`RuntimeError`**: If checks fail and `fail_fast=True` - **`ValueError`**: If `mode` is invalid --- ## `check_schema_on_startup` Run read-only startup schema checks. ### Signature ```python def check_schema_on_startup( *, database: str | None = None, all_databases: bool = False, dev: bool = False, strict_translation: bool = False, only_dev: bool = False, fail_fast: bool = True, verbose: bool = False, ) -> list[HealthResult] ``` ### Parameters Same as `migration_context`, except no migration-specific parameters. ### Returns **`list[HealthResult]`** - List of health results, one per database checked - Each `HealthResult` contains: - `database`: str - Database name - `status`: str - "ok", "degraded", or "error" - `connected`: bool - Whether connection succeeded - `pending_migrations`: int - Number of unapplied migrations - `lock_active`: bool - Whether migration lock is held - `error`: str | None - Error message if failed ### Examples ```python @asynccontextmanager async def lifespan(app: FastAPI): results = check_schema_on_startup(all_databases=True, fail_fast=True) for result in results: print(f"{result.database}: {result.status}") yield ``` ### Raises - **`RuntimeError`**: If any check fails and `fail_fast=True` --- ## `migrate_on_startup` Run migration workflow at startup. 
### Signature ```python def migrate_on_startup( *, database: str | None = None, all_databases: bool = False, dev: bool = False, strict_translation: bool = False, with_backup: bool = False, backup_dir: str | None = None, verbose: bool = False, allow_in_production: bool = False, fail_fast: bool = True, only_dev: bool = False, ) -> None ``` ### Parameters Same as `migration_context` in migrate mode. ### Returns **`None`** ### Examples ```python @asynccontextmanager async def lifespan(app: FastAPI): migrate_on_startup( all_databases=True, with_backup=True, only_dev=True, ) yield ``` ### Raises - **`RuntimeError`**: If migration fails and `fail_fast=True` - **`RuntimeError`**: If in production and `allow_in_production=False` --- ## `DBWardenHealthRouter` Creates a FastAPI `APIRouter` with health, liveness, and readiness endpoints. ### Signature ```python def DBWardenHealthRouter( auth_mode: str = "open", api_key: str | None = None, ) -> APIRouter ``` ### Parameters **`auth_mode`** : `str`, optional - `"open"` (default) - no authentication required - `"authenticated"` - requires `X-API-Key` header - Can also be set via `DBWARDEN_HEALTH_AUTH` env var **`api_key`** : `str | None`, optional - API key for authenticated mode ### Returns **`APIRouter`** - Router with health endpoints configured - Routes: - `GET /` - Overall health for all databases - `GET /liveness` - Always returns 200 (app is alive) - `GET /readiness` - Returns 200 when all databases reachable, 503 otherwise - `GET /{database_name}` - Health for specific database ### Examples ```python from dbwarden.fastapi import DBWardenHealthRouter app = FastAPI() app.include_router(DBWardenHealthRouter(), prefix="/health") # Now available: # GET /health/ - All databases # GET /health/liveness - Liveness probe # GET /health/readiness - Readiness probe # GET /health/primary - Specific database ``` ### Response Schema ```python { "status": "ok" | "degraded" | "error", "databases": [ { "database": str, "status": "ok" | 
"degraded" | "error", "connected": bool, "pending_migrations": int, "lock_active": bool, "error": str | None } ] } ``` Liveness response: ```python {"status": "alive"} ``` ### HTTP Status Codes | Scenario | Status Code | |----------|-------------| | All healthy | 200 | | Degraded (pending migrations) | 200 | | Database unreachable | 503 | | Database not found | 404 (per-database route only) | | App is alive (liveness) | 200 | | Unauthenticated (auth mode) | 401 | | Invalid API key (auth mode) | 403 | --- ## `DBWardenRouter` Creates a FastAPI `APIRouter` with migration status and execution endpoints. ### Signature ```python def DBWardenRouter( auth_mode: str = "open", api_key: str | None = None, ) -> APIRouter ``` ### Parameters **`auth_mode`** : `str`, optional - `"open"` - No authentication required - `"authenticated"` - Require `X-API-Key` header - Default: `"open"` **`api_key`** : `str | None`, optional - API key for authenticated mode - If `None`, reads from `DBWARDEN_MIGRATE_AUTH` env var - Default: `None` ### Returns **`APIRouter`** - Router with status and migrate endpoints ### Endpoints **`GET /status`** - Returns per-database migration and seed status for all configured databases. **`POST /migrate`** - Triggers migration execution. 
Accepts JSON body: ```json { "database": "primary", "count": null, "to_version": null, "dry_run": false } ``` ### Examples ```python from dbwarden.fastapi import DBWardenRouter app = FastAPI() app.include_router(DBWardenRouter(), prefix="/db") # Now available: # GET /db/status # POST /db/migrate ``` With authentication: ```python app.include_router( DBWardenRouter(auth_mode="authenticated", api_key="my-secret-key"), prefix="/db", ) ``` ### Response Schema (GET /status) ```python { "databases": { "primary": { "database": "primary", "status": "ok" | "degraded" | "error", "connected": bool, "pending_migrations": int, "applied_migrations": int, "pending_seeds": int, "applied_seeds": int, "lock_active": bool, "error": str | None } } } ``` ### Response Schema (POST /migrate) ```python { "success": bool, "message": str, "database": str | None } ``` ### HTTP Status Codes | Scenario | Status Code | |----------|-------------| | Status retrieved successfully | 200 | | Migrate completed | 200 | | Migrate dry-run | 200 | | Auth failure | 403 | | Migration error | 500 | --- ## `MetricsRouter` Creates a FastAPI `APIRouter` with a Prometheus metrics endpoint. ### Signature ```python def MetricsRouter() -> APIRouter ``` ### Returns **`APIRouter`** - Router with metrics endpoint ### Endpoints **`GET /metrics`** - Returns Prometheus text-format metrics. Only active when `prometheus_client` is installed and `DBWARDEN_METRICS=true` is set. Returns 404 when disabled. 
### Examples

```python
from fastapi import FastAPI
from dbwarden.fastapi import MetricsRouter

app = FastAPI()
app.include_router(MetricsRouter(), prefix="/metrics")

# Now available:
# GET /metrics
```

### Response format

```
# HELP dbwarden_pending_migrations Number of pending migrations
# TYPE dbwarden_pending_migrations gauge
dbwarden_pending_migrations{database="primary"} 0
# HELP dbwarden_schema_version Current schema version
# TYPE dbwarden_schema_version gauge
dbwarden_schema_version{database="primary"} 5.0
```

---

## `MetricsMiddleware`

ASGI middleware that refreshes pending-migration gauges on each HTTP request.

### Signature

```python
class MetricsMiddleware
```

### Usage

```python
from fastapi import FastAPI
from dbwarden.fastapi import MetricsMiddleware, MetricsRouter

app = FastAPI()
app.add_middleware(MetricsMiddleware)
app.include_router(MetricsRouter(), prefix="/metrics")
```

The middleware also records HTTP request duration via the migration duration histogram.

---

## `dispose_engines`

Close all cached database engines and clients. Should be called during FastAPI shutdown.

### Signature

```python
def dispose_engines() -> None
```

### Examples

```python
from contextlib import asynccontextmanager
from fastapi import FastAPI
from dbwarden.fastapi import dispose_engines

@asynccontextmanager
async def lifespan(app: FastAPI):
    yield
    # Runs on shutdown, after the application stops serving requests
    dispose_engines()
```

This closes async and sync session factories, connection pools, and ClickHouse clients for all configured databases.

---

## `dbwarden_lifespan`

Async context manager that handles the full FastAPI engine lifecycle: startup schema validation (or auto-migration), readiness gate, seed application, connection pool warmup, and cleanup on shutdown.
### Signature ```python async def dbwarden_lifespan( app=None, *, mode: str = "check", # "check" | "migrate" | "none" database: str | None = None, all_databases: bool = False, dev: bool = False, strict_translation: bool = False, with_backup: bool = False, backup_dir: str | None = None, verbose: bool = False, allow_in_production: bool = False, fail_fast: bool = True, only_dev: bool = False, readiness_gate: bool = False, apply_seeds: bool = False, pool_warmup: bool = False, pool_warmup_size: int = 3, ) ``` ### Parameters | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `app` | `FastAPI` | `None` | FastAPI application instance (optional, for router registration) | | `mode` | `str` | `"check"` | Startup mode: `"check"` (read-only), `"migrate"` (auto-apply), `"none"` (skip) | | `database` | `str` | `None` | Target a single database by name | | `all_databases` | `bool` | `False` | Target all configured databases | | `readiness_gate` | `bool` | `False` | Raise if any database is unreachable after startup checks | | `apply_seeds` | `bool` | `False` | Apply pending seed data after migrations | | `pool_warmup` | `bool` | `False` | Acquire connections before yielding to reduce cold-start latency | | `pool_warmup_size` | `int` | `3` | Number of connections to acquire during warmup | The remaining parameters (`dev`, `strict_translation`, `with_backup`, etc.) are identical to `migration_context()` and control startup check/migration behavior. ### Usage ```python from contextlib import asynccontextmanager from fastapi import FastAPI from dbwarden.fastapi import dbwarden_lifespan @asynccontextmanager async def lifespan(app: FastAPI): async with dbwarden_lifespan( app, mode="check", readiness_gate=True, pool_warmup=True, pool_warmup_size=5, ): yield app = FastAPI(lifespan=lifespan) ``` --- ## `QueryTracingMiddleware` ASGI middleware that emits per-request structured query tracing logs. 
Tracks query count, total duration, slowest query, and slow query threshold breaches. ### Signature ```python def QueryTracingMiddleware( app, slow_query_threshold_ms: int = 100, ) ``` ### Usage ```python from dbwarden.fastapi import QueryTracingMiddleware app.add_middleware(QueryTracingMiddleware, slow_query_threshold_ms=100) ``` The middleware monkey-patches SQLAlchemy's `Engine.connect` around each request to count and time queries. On each response, it logs: | Field | Description | |-------|-------------| | `path` | Request path | | `method` | HTTP method | | `request_duration_ms` | Total request time | | `query_count` | Number of database queries | | `total_query_time_ms` | Cumulative query time | | `slowest_query_time_ms` | Duration of the slowest query | | `slow_queries` | Count of queries exceeding the threshold | Slow queries are logged at `WARNING` level; normal requests at `INFO`. --- ## `PoolMetricsCollector` Collects SQLAlchemy connection pool metrics for monitoring. ### Signature ```python class PoolMetricsCollector() ``` ### Methods **`register(name: str, engine)`** - Register an engine for metrics collection. **`collect() -> dict[str, dict[str, int]]`** - Collect pool metrics from all registered engines. ### Usage ```python from dbwarden.fastapi import PoolMetricsCollector from sqlalchemy import create_engine collector = PoolMetricsCollector() engine = create_engine("postgresql://localhost/db") collector.register("primary", engine) metrics = collector.collect() # { # "primary": { # "pool_size": 5, # "checked_out": 2, # "overflow": 0, # "checked_in": 3 # } # } ``` --- ## `override_database` Async context manager that temporarily overrides a database URL for testing. ### Signature ```python async def override_database( database: str, url: str, *, run_migrations: bool = False, verbose: bool = False, ) -> AsyncGenerator[Any, None] ``` ### Parameters **`database`** : `str` - Database name to override. **`url`** : `str` - Temporary database URL. 
**`run_migrations`** : `bool`, keyword-only, optional - Run pending migrations after override. Default: `False`. **`verbose`** : `bool`, keyword-only, optional - Enable verbose migration output. Default: `False`. ### Usage ```python from dbwarden.fastapi import override_database async with override_database("primary", "sqlite+aiosqlite:///:memory:", run_migrations=True): # Test code here uses the overridden database ... # Original URL is restored on exit ``` The original `sqlalchemy_url_sync` and `sqlalchemy_url_async` are restored when the context manager exits, even if an exception occurs. --- ## `migration_state` Async context manager that simulates a specific migration state for testing by inserting tracking records. ### Signature ```python async def migration_state( applied: list[str] | None = None, database: str | None = None, ) -> AsyncGenerator[None, None] ``` ### Parameters **`applied`** : `list[str] | None`, optional - List of version strings to mark as applied. Default: `None`. **`database`** : `str | None`, optional - Target database name. Default: `None` (uses default database). ### Usage ```python from dbwarden.fastapi import migration_state async with migration_state(applied=["0001", "0002"]): # Database appears to have migrations 0001 and 0002 applied ... # Tracking records are cleaned up on exit ``` --- ## `migration_lock` and `sync_migration_lock` Redis-backed distributed migration lock context managers for coordinating migrations across multiple application instances. 
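As a rough illustration of the general technique behind such a lock, the sketch below acquires a key with `setnx`, attaches a TTL with `expire`, and releases with `delete`. It is not DBWarden's implementation: `FakeRedis`, `sketch_migration_lock`, and the local `LockError` class are hypothetical stand-ins, and the in-memory client does not simulate expiry.

```python
from contextlib import contextmanager


class LockError(RuntimeError):
    """Hypothetical stand-in for the error raised when the lock is held."""


class FakeRedis:
    """In-memory stand-in exposing only the three calls the sketch needs."""

    def __init__(self):
        self._data = {}

    def setnx(self, key, value):
        # Set the key only if it does not exist yet; report whether it was set.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def expire(self, key, seconds):
        # A real server would drop the key after `seconds`; not simulated here.
        return key in self._data

    def delete(self, key):
        return 1 if self._data.pop(key, None) is not None else 0


@contextmanager
def sketch_migration_lock(client, key="dbwarden_migrate", ttl=60):
    """Hold `key` for the duration of the block, or raise LockError."""
    if not client.setnx(key, "locked"):
        raise LockError(f"lock {key!r} is already held")
    # The TTL guards against a crashed holder keeping the lock forever.
    client.expire(key, ttl)
    try:
        yield
    finally:
        client.delete(key)
```

While one caller is inside `with sketch_migration_lock(client):`, a second attempt on the same key raises `LockError`; once the block exits, the key is deleted and the lock can be taken again.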
### Async signature

```python
async def migration_lock(
    redis_client: Any,
    key: str = "dbwarden_migrate",
    ttl: int = 60,
) -> AsyncGenerator[None, None]
```

### Sync signature

```python
def sync_migration_lock(
    redis_client: Any,
    key: str = "dbwarden_migrate",
    ttl: int = 60,
) -> Generator[None, None]
```

### Parameters

**`redis_client`** : `Any`

- Redis client instance (any library with `setnx`, `expire`, `delete` methods)

**`key`** : `str`, optional

- Redis key for the lock
- Default: `"dbwarden_migrate"`

**`ttl`** : `int`, optional

- Lock TTL in seconds (auto-expiry)
- Default: `60`

### Raises

- **`LockError`**: If the lock is already held by another process

### Examples

```python
import redis.asyncio as aioredis
from contextlib import asynccontextmanager
from fastapi import FastAPI
from dbwarden.fastapi import migration_context, migration_lock

redis_client = aioredis.from_url("redis://localhost:6379")

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Only one instance at a time gets past the lock and runs migrations
    async with migration_lock(redis_client):
        async with migration_context(mode="migrate"):
            yield
```

```python
import redis
from contextlib import asynccontextmanager
from fastapi import FastAPI
from dbwarden.fastapi import migration_context, sync_migration_lock

redis_client = redis.from_url("redis://localhost:6379")

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Same pattern with a synchronous Redis client
    with sync_migration_lock(redis_client):
        async with migration_context(mode="migrate"):
            yield
```

---

## Type Aliases

### `DatabaseHandle` Pattern (Recommended)

Use `.async_session` and `.sync_session` directly, with no type aliases needed:

```python
from dbwarden import database_config

primary = database_config(database_name="primary", ...)
@app.get("/users") async def list_users(session: primary.async_session): result = await session.execute(select(User)) return result.scalars().all() ``` ### `SessionDep` (Alternative) If you prefer the `Annotated` pattern, use `get_session()`: ```python from typing import Annotated from fastapi import Depends from sqlalchemy.ext.asyncio import AsyncSession from dbwarden.fastapi import get_session SessionDep = Annotated[AsyncSession, Depends(get_session())] ``` --- ## Data Models ### `HealthResult` Returned by `check_schema_on_startup`: ```python @dataclass class HealthResult: database: str # Database name status: str # "ok", "degraded", or "error" connected: bool # Connection successful? pending_migrations: int # Number of unapplied migrations lock_active: bool # Migration lock held? error: str | None # Error message if failed ``` ### `DatabaseHealth` Pydantic model for health endpoints: ```python class DatabaseHealth(BaseModel): database: str status: str connected: bool pending_migrations: int lock_active: bool error: str | None = None ``` ### `HealthResponse` Pydantic model for health endpoints: ```python class HealthResponse(BaseModel): status: str databases: list[DatabaseHealth] ``` --- ## Constants ### Environment Detection DBWarden detects environment from `ENVIRONMENT` variable: **Development environments:** - `dev` - `development` - `local` - `test` - `testing` **Production environments:** - `prod` - `production` Used by `only_dev` and `allow_in_production` parameters. --- ## Exceptions ### `DBWardenNotInitializedError` Raised when DBWarden config hasn't been loaded. ```python # Fix by ensuring dbwarden.py is imported import dbwarden # Loads config ``` ### `DBWardenDatabaseNotFoundError` Raised when specified database name doesn't exist in config. 
```python # Fix by adding database to config db = database_config( ``` --- ======================================================================== PAGE: https://dbwarden.emiliano-go.com/fastapi/tutorial/complete-application/ ======================================================================== # Complete Application A full, production-ready FastAPI application using all of DBWarden's features. ## Overview This example shows: - Database configuration - Model definition - Session dependencies - CRUD operations - Startup checks - Health endpoints - Transaction management - Error handling ## Project Structure ``` my_app/ ├── config.py # Database configuration + handles ├── app/ │ ├── __init__.py │ ├── main.py # FastAPI app │ ├── models.py # SQLAlchemy models │ └── routes/ │ ├── __init__.py │ └── users.py # User endpoints └── pyproject.toml ``` ## Step 1: Database Configuration Create `config.py` in your project root: ```python # config.py from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/myapp", dev_database_type="sqlite", dev_database_url="sqlite:///./dev.db", model_paths=["app.models"], model_tables=["users", "posts"], ) ``` `primary` is a `DatabaseHandle`. Use `primary.async_session` in your route parameters and `primary.sync_session` for synchronous routes. 
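How the `ENVIRONMENT` variable could select between the two URLs in this config can be sketched in plain Python. This is only an illustration of the idea, not DBWarden's implementation: the helper `resolve_database_url` is hypothetical, the value sets are copied from the Environment Detection list in the API reference, and the handling of letter case and of unknown values is an assumption.

```python
# Value sets as listed under "Environment Detection" in the API reference.
DEV_ENVIRONMENTS = {"dev", "development", "local", "test", "testing"}
PROD_ENVIRONMENTS = {"prod", "production"}


def resolve_database_url(environment: str, prod_url: str, dev_url: str) -> str:
    """Pick the URL that matches the environment name (hypothetical helper)."""
    # Assumption: names are compared case-insensitively, ignoring whitespace.
    name = environment.strip().lower()
    if name in DEV_ENVIRONMENTS:
        return dev_url
    if name in PROD_ENVIRONMENTS:
        return prod_url
    # Assumption: an unrecognised name is rejected rather than silently defaulted.
    raise ValueError(f"unrecognised environment: {environment!r}")
```

Calling `resolve_database_url("development", prod_url=..., dev_url="sqlite:///./dev.db")` returns the SQLite URL, while `"production"` returns the PostgreSQL one.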
In production, use environment variables for sensitive data: ```python import os primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=os.getenv("DATABASE_URL"), model_paths=["app.models"], model_tables=["users", "posts"], ) ``` ## Step 2: Define Models Create `app/models.py`: ```python # app/models.py from datetime import datetime from sqlalchemy import Boolean, DateTime, Integer, String from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column class Base(DeclarativeBase): pass class User(Base): __tablename__ = "users" id: Mapped[int] = mapped_column(Integer, primary_key=True) email: Mapped[str] = mapped_column(String(255), unique=True, nullable=False) username: Mapped[str] = mapped_column(String(100), unique=True, nullable=False) full_name: Mapped[str | None] = mapped_column(String(200)) is_active: Mapped[bool] = mapped_column(Boolean, default=True) created_at: Mapped[datetime] = mapped_column( DateTime, default=datetime.utcnow ) updated_at: Mapped[datetime] = mapped_column( DateTime, default=datetime.utcnow, onupdate=datetime.utcnow ) ``` ## Step 3: Shared Dependencies The `DatabaseHandle` from `config.py` is already a shared dependency; import `primary` wherever you need a session: ```python # app/routes/users.py from config import primary ``` Use `primary.async_session` directly as a route parameter annotation. No separate `dependencies.py` module is needed. 
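Why an attribute of an object can serve as a parameter annotation may be easier to see with a standard-library sketch. Everything here is a stand-in and an assumption about the general mechanism, not DBWarden's actual code: `Handle`, `FakeSession`, and `SessionMarker` are hypothetical, with `SessionMarker` playing the role that dependency metadata such as FastAPI's `Depends` would play inside `typing.Annotated`.

```python
from typing import Annotated, get_args, get_type_hints


class FakeSession:
    """Stand-in for a real session class (hypothetical)."""


class SessionMarker:
    """Stand-in for dependency metadata carried inside Annotated (hypothetical)."""

    def __init__(self, database_name: str) -> None:
        self.database_name = database_name


class Handle:
    """Hypothetical handle whose attribute is a ready-made annotation."""

    def __init__(self, database_name: str) -> None:
        # The attribute is an Annotated type, so it is valid wherever a type hint is.
        self.async_session = Annotated[FakeSession, SessionMarker(database_name)]


primary = Handle("primary")


def list_users(session: primary.async_session) -> None:
    """Route-like function annotated with the handle attribute."""


def describe_parameter(func, name: str) -> tuple[type, str]:
    """Recover the base type and the database name from a parameter annotation."""
    hints = get_type_hints(func, include_extras=True)
    base_type, marker = get_args(hints[name])
    return base_type, marker.database_name
```

A framework that inspects annotations this way can tell both what type to inject and which database it belongs to, which is why no separate alias or `Depends(...)` call needs to appear in the route signature.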
## Step 4: Pydantic Schemas Create `app/schemas.py` for request/response models: ```python # app/schemas.py from datetime import datetime from pydantic import BaseModel, EmailStr class UserBase(BaseModel): email: EmailStr username: str full_name: str | None = None class UserCreate(UserBase): pass class UserUpdate(BaseModel): email: EmailStr | None = None username: str | None = None full_name: str | None = None is_active: bool | None = None class UserResponse(UserBase): id: int is_active: bool created_at: datetime updated_at: datetime class Config: from_attributes = True ``` ## Step 5: User Routes Create `app/routes/users.py`: ```python # app/routes/users.py from fastapi import APIRouter, HTTPException from sqlalchemy import select from sqlalchemy.exc import IntegrityError from config import primary from app.models import User from app.schemas import UserCreate, UserResponse, UserUpdate router = APIRouter(prefix="/users", tags=["users"]) @router.get("/", response_model=list[UserResponse]) async def list_users( session: primary.async_session, skip: int = 0, limit: int = 100, active_only: bool = False, ): """List all users with pagination.""" stmt = select(User).offset(skip).limit(limit) if active_only: stmt = stmt.where(User.is_active == True) result = await session.execute(stmt) return result.scalars().all() @router.get("/{user_id}", response_model=UserResponse) async def get_user(user_id: int, session: primary.async_session): """Get a single user by ID.""" result = await session.execute( select(User).where(User.id == user_id) ) user = result.scalar_one_or_none() if not user: raise HTTPException(status_code=404, detail="User not found") return user @router.post("/", response_model=UserResponse, status_code=201) async def create_user(user_data: UserCreate, session: primary.async_session): """Create a new user.""" user = User(**user_data.model_dump()) session.add(user) try: await session.commit() except IntegrityError: await session.rollback() raise HTTPException( 
status_code=400, detail="User with this email or username already exists" ) await session.refresh(user) return user @router.patch("/{user_id}", response_model=UserResponse) async def update_user( user_id: int, user_data: UserUpdate, session: primary.async_session, ): """Update a user.""" result = await session.execute( select(User).where(User.id == user_id) ) user = result.scalar_one_or_none() if not user: raise HTTPException(status_code=404, detail="User not found") # Update only provided fields update_data = user_data.model_dump(exclude_unset=True) for key, value in update_data.items(): setattr(user, key, value) try: await session.commit() except IntegrityError: await session.rollback() raise HTTPException( status_code=400, detail="Email or username already taken" ) await session.refresh(user) return user @router.delete("/{user_id}", status_code=204) async def delete_user(user_id: int, session: primary.async_session): """Delete a user.""" result = await session.execute( select(User).where(User.id == user_id) ) user = result.scalar_one_or_none() if not user: raise HTTPException(status_code=404, detail="User not found") await session.delete(user) await session.commit() ``` ## Step 6: Main Application Create `app/main.py`: ```python # app/main.py from contextlib import asynccontextmanager from fastapi import FastAPI from dbwarden.fastapi import DBWardenHealthRouter, migration_context from app.routes import users @asynccontextmanager async def lifespan(app: FastAPI): """Startup and shutdown logic.""" # Startup: check database migrations async with migration_context( mode="check", all_databases=True, fail_fast=True, verbose=True, ): yield # Shutdown: cleanup happens here # Create FastAPI app app = FastAPI( title="My App", description="Example app with DBWarden integration", version="1.0.0", lifespan=lifespan, ) # Include routers app.include_router(users.router, prefix="/api/v1") app.include_router(DBWardenHealthRouter(), prefix="/health") @app.get("/") async def 
root(): """Root endpoint.""" return { "message": "Welcome to My App", "docs": "/docs", "health": "/health/" } ``` ## Step 7: Create Migrations Initialize DBWarden and create your first migration: ```bash # Initialize DBWarden (if not already done) $ dbwarden init # Create migration for User model $ dbwarden make-migrations -m "create users table" ``` This generates a migration file like `0001_create_users_table.py`. ## Step 8: Apply Migrations Apply the migration to your database: ```bash # For development (SQLite) export ENVIRONMENT=development $ dbwarden migrate # For production (PostgreSQL) export ENVIRONMENT=production $ dbwarden migrate ``` ## Step 9: Run the Application Start your FastAPI app: ```bash uvicorn app.main:app --reload ``` You'll see: ``` INFO: Started server process [12345] INFO: Waiting for application startup. INFO: DBWarden: migration_context mode=check outcome=ok duration_ms=45 INFO: Application startup complete. INFO: Uvicorn running on http://127.0.0.1:8000 ``` ## Step 10: Test the API ### Interactive Documentation Open http://localhost:8000/docs to see the Swagger UI. ### Create a User ```bash curl -X POST http://localhost:8000/api/v1/users/ \ -H "Content-Type: application/json" \ -d '{ "email": "alice@example.com", "username": "alice", "full_name": "Alice Smith" }' ``` Response: ```json { "id": 1, "email": "alice@example.com", "username": "alice", "full_name": "Alice Smith", "is_active": true, "created_at": "2024-01-15T10:30:00", "updated_at": "2024-01-15T10:30:00" } ``` ### List Users ```bash curl http://localhost:8000/api/v1/users/ ``` ### Get a User ```bash curl http://localhost:8000/api/v1/users/1 ``` ### Update a User ```bash curl -X PATCH http://localhost:8000/api/v1/users/1 \ -H "Content-Type: application/json" \ -d '{ "full_name": "Alice Johnson" }' ``` ### Delete a User ```bash curl -X DELETE http://localhost:8000/api/v1/users/1 ``` ### Check Health ```bash curl http://localhost:8000/health/ ``` ## Key Features Demonstrated ### 1. 
Session Management ```python @router.post("/", response_model=UserResponse) async def create_user(user_data: UserCreate, session: primary.async_session): # Session automatically provided user = User(**user_data.model_dump()) session.add(user) await session.commit() await session.refresh(user) return user # Session automatically closed ``` ### 2. Error Handling ```python try: await session.commit() except IntegrityError: await session.rollback() # Explicit rollback raise HTTPException(400, "Duplicate entry") ``` ### 3. Query Patterns ```python # Select one result = await session.execute( select(User).where(User.id == user_id) ) user = result.scalar_one_or_none() # Select many result = await session.execute( select(User).offset(skip).limit(limit) ) users = result.scalars().all() ``` ### 4. Transaction Management ```python # Add to session session.add(user) # Commit changes await session.commit() # Refresh to get DB-generated values await session.refresh(user) # Delete await session.delete(user) await session.commit() ``` ### 5. Startup Validation ```python @asynccontextmanager async def lifespan(app: FastAPI): async with migration_context(mode="check"): # App only starts if database is healthy yield ``` ### 6. Health Endpoints ```python app.include_router(DBWardenHealthRouter(), prefix="/health") ``` Provides: - `GET /health/` - Overall health - `GET /health/{database_name}` - Per-database health ## Production Deployment ### Docker Create `Dockerfile`: ```dockerfile FROM python:3.11-slim WORKDIR /app COPY pyproject.toml uv.lock . RUN uv sync COPY . . 
# Run migrations before starting app CMD dbwarden migrate && uvicorn app.main:app --host 0.0.0.0 --port 8000 ``` ### Kubernetes Create `deployment.yaml`: ```yaml apiVersion: apps/v1 kind: Deployment metadata: name: myapp spec: replicas: 3 selector: matchLabels: app: myapp template: metadata: labels: app: myapp spec: initContainers: # Run migrations in init container - name: migrate image: myapp:latest command: ["dbwarden", "migrate"] env: - name: DATABASE_URL valueFrom: secretKeyRef: name: db-secret key: url containers: - name: app image: myapp:latest ports: - containerPort: 8000 env: - name: DATABASE_URL valueFrom: secretKeyRef: name: db-secret key: url - name: ENVIRONMENT value: "production" # Liveness probe livenessProbe: httpGet: path: /health/ port: 8000 initialDelaySeconds: 10 periodSeconds: 30 # Readiness probe readinessProbe: httpGet: path: /health/ port: 8000 initialDelaySeconds: 5 periodSeconds: 10 ``` ## Environment Variables Create `.env` for local development: ```bash # .env ENVIRONMENT=development DATABASE_URL=sqlite:///./dev.db ``` For production, set: ```bash ENVIRONMENT=production DATABASE_URL=postgresql://user:password@db-host:5432/myapp ``` ## Dependencies Create `pyproject.toml`: ```toml [project] name = "my-app" version = "0.1.0" requires-python = ">=3.12" dependencies = [ "fastapi>=0.104.0", "uvicorn[standard]>=0.24.0", "sqlalchemy>=2.0.0", "asyncpg>=0.29.0", "aiosqlite>=0.19.0", "pydantic[email]>=2.4.0", "dbwarden>=0.1.0", ] ``` Install: ```bash uv sync ``` ## Testing Create `tests/test_users.py`: ```python import pytest from httpx import AsyncClient from app.main import app @pytest.mark.asyncio async def test_create_user(): async with AsyncClient(app=app, base_url="http://test") as client: response = await client.post( "/api/v1/users/", json={ "email": "test@example.com", "username": "testuser", "full_name": "Test User" } ) assert response.status_code == 201 data = response.json() assert data["email"] == "test@example.com" assert 
data["username"] == "testuser" @pytest.mark.asyncio async def test_list_users(): async with AsyncClient(app=app, base_url="http://test") as client: response = await client.get("/api/v1/users/") assert response.status_code == 200 assert isinstance(response.json(), list) @pytest.mark.asyncio async def test_health(): async with AsyncClient(app=app, base_url="http://test") as client: response = await client.get("/health/") assert response.status_code == 200 data = response.json() assert data["status"] in ["ok", "degraded", "error"] ``` Run tests: ```bash pytest tests/ ``` ## What's Next? Take your app further: - **[Multi-Database](../advanced/multi-database.md)** - Add analytics or logging databases - **[Testing](../advanced/testing.md)** - Advanced testing patterns - **[Transaction Management](../advanced/transaction-management.md)** - Complex transactions - **[Production Patterns](../advanced/production-patterns.md)** - CI/CD and monitoring - **[Cookbook: FastAPI Integration](../../cookbook/09-fastapi-integration.md)** - Standalone FastAPI example ======================================================================== PAGE: https://dbwarden.emiliano-go.com/fastapi/tutorial/first-steps/ ======================================================================== # First Steps Let's create a FastAPI app with DBWarden in **2 minutes**. 
You'll create a simple API with: - Database sessions in routes - Startup migration checks - Health endpoints ## Prerequisites You should have: - **Python 3.10+** installed - **FastAPI** and **uvicorn** installed - **DBWarden installed** with the FastAPI extra ```bash uv add "dbwarden[fastapi]" ``` ## Create Your First App Create a single file `main.py`: ```python from contextlib import asynccontextmanager from fastapi import FastAPI from sqlalchemy import text from dbwarden import database_config from dbwarden.fastapi import migration_context @asynccontextmanager async def lifespan(app: FastAPI): async with migration_context(mode="check"): yield app = FastAPI(lifespan=lifespan) primary = database_config( database_name="primary", default=True, database_type="sqlite", database_url_sync="sqlite:///./app.db", model_paths=["app.models"], ) @app.get("/") async def root(session: primary.async_session): result = await session.execute( text("SELECT 'Hello from DBWarden!' as message") ) row = result.first() return {"message": row.message} ``` That's it! **15 lines** of meaningful code including imports. Key details: - `database_config()` returns a `DatabaseHandle` - The handle's `.async_session` is a FastAPI dependency annotation: use it **directly** in route parameters - No need for `Annotated`, `Depends`, session type aliases, or manual engine creation - `migration_context(mode="check")` validates the database on startup ## Run It Start your application: ```bash uvicorn main:app --reload ``` You'll see output like: ``` INFO: Started server process [12345] INFO: Waiting for application startup. INFO: Application startup complete. INFO: Uvicorn running on http://127.0.0.1:8000 ``` If there are pending migrations or database issues, the app will fail to start with a clear error message. ## Test It Open your browser to http://127.0.0.1:8000/. You'll see: ```json { "message": "Hello from DBWarden!" 
} ``` Or use `curl`: ```bash curl http://127.0.0.1:8000/ ``` ## Check the Docs FastAPI automatically generates interactive API documentation. Open http://127.0.0.1:8000/docs to see the Swagger UI with your route. ## Add Health Endpoints Let's add health checking in **one line**: ```python from dbwarden.fastapi import DBWardenHealthRouter # Add this line after creating the app app.include_router(DBWardenHealthRouter(), prefix="/health") ``` Now visit http://127.0.0.1:8000/health/ to see your database health status: ```json { "status": "ok", "databases": [ { "database": "primary", "status": "ok", "connected": true, "pending_migrations": 0, "lock_active": false, "error": null } ] } ``` ### Complete File with Health ```python from contextlib import asynccontextmanager from fastapi import FastAPI from sqlalchemy import text from dbwarden import database_config from dbwarden.fastapi import DBWardenHealthRouter, migration_context @asynccontextmanager async def lifespan(app: FastAPI): async with migration_context(mode="check"): yield app = FastAPI(lifespan=lifespan) app.include_router(DBWardenHealthRouter(), prefix="/health") primary = database_config( database_name="primary", default=True, database_type="sqlite", database_url_sync="sqlite:///./app.db", model_paths=["app.models"], ) @app.get("/") async def root(session: primary.async_session): result = await session.execute( text("SELECT 'Hello from DBWarden!' as message") ) row = result.first() return {"message": row.message} ``` ## What's Happening? 
Let's break down what each piece does: ### `migration_context(mode="check")` - Runs when your app starts (before accepting requests) - Checks database connectivity - Verifies migration state - Fails fast if there are issues ### `database_config` handle (e.g., `primary`) - Returns a `DatabaseHandle` with `.async_session` and `.sync_session` properties - Each property is a FastAPI dependency annotation, usable directly in route parameters - Create one handle per database, pass the right one to each route ### `primary.async_session` - A FastAPI dependency annotation with no `Annotated`, `Depends`, or type aliases needed - Creates a new `AsyncSession` per request automatically - Closes the session when the request finishes - For sync routes, use `primary.sync_session` instead ### `DBWardenHealthRouter` - Adds `/health/` endpoint for all databases - Adds `/health/{database_name}` for specific database - Returns connectivity and migration status - Perfect for Kubernetes probes ## What's Next? Now that you have a working app, learn more about each component: - **[Session Dependency](session-dependency.md)** - Deep dive into session handling - **[Startup Checks](startup-checks.md)** - All about `migration_context()` - **[Health Endpoints](health-endpoints.md)** - Complete health monitoring guide Or jump to: - **[Complete Application](complete-application.md)** - A real-world example with models - **[Multi-Database](../advanced/multi-database.md)** - Working with multiple databases ======================================================================== PAGE: https://dbwarden.emiliano-go.com/fastapi/tutorial/health-endpoints/ ======================================================================== # Health Endpoints Learn how to add database health monitoring to your FastAPI application. ## What Are Health Endpoints? Health endpoints are HTTP endpoints that report whether your application and its dependencies (like databases) are working correctly. 
They're essential for: - **Kubernetes liveness probes** - Is the app alive? - **Kubernetes readiness probes** - Is the app ready to serve traffic? - **Monitoring systems** - Prometheus, Datadog, New Relic, etc. - **Load balancers** - Should traffic be routed here? - **Debugging** - Quick status check during incidents ## Quick Example Add health endpoints to your app in **one line**: ```python from fastapi import FastAPI from dbwarden.fastapi import DBWardenHealthRouter app = FastAPI() app.include_router(DBWardenHealthRouter(), prefix="/health") ``` That's it! You now have: - `GET /health/` - Overall health for all databases - `GET /health/{database_name}` - Health for a specific database ## Your First Health Check Let's start with a complete minimal example: ```python from contextlib import asynccontextmanager from fastapi import FastAPI from dbwarden.fastapi import DBWardenHealthRouter, migration_context @asynccontextmanager async def lifespan(app: FastAPI): async with migration_context(mode="check"): yield app = FastAPI(lifespan=lifespan) # Add health endpoints app.include_router(DBWardenHealthRouter(), prefix="/health") ``` Start your app: ```bash uvicorn main:app --reload ``` ## Test Your Endpoints ### Check Overall Health ```bash curl http://localhost:8000/health/ ``` Response: ```json { "status": "ok", "databases": [ { "database": "primary", "status": "ok", "connected": true, "pending_migrations": 0, "lock_active": false, "error": null } ] } ``` ### Check Specific Database ```bash curl http://localhost:8000/health/primary ``` Response: ```json { "status": "ok", "databases": [ { "database": "primary", "status": "ok", "connected": true, "pending_migrations": 0, "lock_active": false, "error": null } ] } ``` For prettier output, use [httpie](https://httpie.io/): ```bash http :8000/health/ ``` ## Understanding the Response Let's break down what each field means: ### Response Schema ```python { "status": str, # Overall status: "ok" | "degraded" | "error" 
"databases": [ # List of database health details { "database": str, # Database name from config "status": str, # This database's status: "ok" | "degraded" | "error" "connected": bool, # Can we execute SELECT 1? "pending_migrations": int, # Number of unapplied migrations "lock_active": bool, # Is a migration currently running? "error": str | None # Error message if connection failed } ] } ``` ### Status Values **`"ok"`** - Everything is healthy - Database is connected - Zero pending migrations - No migration lock **`"degraded"`** - Functional but needs attention - Database is connected - Has pending migrations - App works but schema is outdated **`"error"`** - Not functional - Cannot connect to database - Health check failed ### Overall Status Logic The overall `status` field follows these rules: ```python # `databases` is the list of per-database entries from the response statuses = {db["status"] for db in databases} if "error" in statuses: overall_status = "error" elif "degraded" in statuses: overall_status = "degraded" else: overall_status = "ok" ``` ## HTTP Status Codes DBWarden health endpoints return these HTTP status codes: | Scenario | HTTP Status | Overall Status | |----------|-------------|----------------| | All databases healthy | **200** | `"ok"` | | Pending migrations exist | **200** | `"degraded"` | | Database unreachable | **503** | `"error"` | | Database name not found | **404** | N/A | Pending migrations are a state, not a failure. The app still works; the schema is just outdated. You decide whether to block traffic on degraded state. ### Why These Status Codes Matter **200 OK** - App is functional - Continue routing traffic - Report as healthy to load balancers - Use for readiness probes (optionally) **503 Service Unavailable** - App cannot function - Stop routing traffic - Restart the pod (liveness probe) - Alert the on-call engineer **404 Not Found** - Configuration error - Database name doesn't exist in config - Only returned for per-database route ## Health Check Flow Here's what happens when you hit `/health/`: ``` 1. 
Request arrives at /health/ 2. For each database in config: a. Get or create engine b. Attempt connection c. Execute SELECT 1 d. If connected: - Count applied migrations - Count total migration files - Calculate pending = total - applied - Check migration lock status e. If connection fails: - Set status = "error" - Set error message - Skip migration checks 3. Aggregate all database statuses 4. Return JSON response with appropriate HTTP status ``` ### What Gets Checked For **each database**, DBWarden checks: 1. **Connectivity** - Can we connect? (`SELECT 1`) 2. **Migration state** - Are migrations pending? 3. **Migration lock** - Is a migration currently running? Health checks are fast: they only run `SELECT 1` and query the configured migration tracking table (default: `_dbwarden_migrations`). No expensive queries or full schema scans. ## Common Use Cases ### Kubernetes Liveness Probe Liveness probes check if your app is alive. If one fails, Kubernetes restarts the pod. ```yaml # deployment.yaml apiVersion: apps/v1 kind: Deployment metadata: name: myapp spec: template: spec: containers: - name: app image: myapp:latest livenessProbe: httpGet: path: /health/ port: 8000 initialDelaySeconds: 10 periodSeconds: 30 failureThreshold: 3 ``` This checks health every 30 seconds. If it fails 3 times in a row, Kubernetes restarts the pod. ### Kubernetes Readiness Probe Readiness probes check if your app is ready to serve traffic. If one fails, Kubernetes stops routing to the pod (but doesn't restart it). ```yaml # deployment.yaml apiVersion: apps/v1 kind: Deployment metadata: name: myapp spec: template: spec: containers: - name: app image: myapp:latest readinessProbe: httpGet: path: /health/ port: 8000 initialDelaySeconds: 5 periodSeconds: 10 ``` This checks health every 10 seconds. Kubernetes treats any response from 200 to 399 as success, so per the status-code table above only an `error` state (503) takes the pod out of rotation; a `degraded` state still returns 200 and passes the probe. If you also use `/health/` for liveness, a database outage will trigger pod restarts as well as stopping traffic. 
This might be what you want, or you might want a separate `/ping` endpoint for basic liveness. ### Separate Liveness and Readiness For more control, use different endpoints: ```python from fastapi import FastAPI from dbwarden.fastapi import DBWardenHealthRouter app = FastAPI() # Database health at /health/ (for readiness) app.include_router(DBWardenHealthRouter(), prefix="/health") # Simple ping at /ping (for liveness) @app.get("/ping") async def ping(): return {"status": "ok"} ``` Then configure Kubernetes: ```yaml livenessProbe: httpGet: path: /ping # Simple check - app is alive port: 8000 readinessProbe: httpGet: path: /health/ # Full check - app is ready port: 8000 ``` ### Monitoring and Alerting Use health endpoints to feed monitoring systems: #### Prometheus Create a script that exports metrics: ```python # monitoring/exporter.py import httpx from prometheus_client import Gauge, start_http_server database_health = Gauge('dbwarden_database_health', 'Database health status', ['database']) pending_migrations = Gauge('dbwarden_pending_migrations', 'Pending migrations', ['database']) async def collect_metrics(): async with httpx.AsyncClient() as client: response = await client.get("http://localhost:8000/health/") data = response.json() for db in data["databases"]: # 1 = ok, 0.5 = degraded, 0 = error health_value = 1 if db["status"] == "ok" else (0.5 if db["status"] == "degraded" else 0) database_health.labels(database=db["database"]).set(health_value) pending_migrations.labels(database=db["database"]).set(db["pending_migrations"]) if __name__ == "__main__": start_http_server(9090) # Run collect_metrics periodically ``` #### Datadog ```python # monitoring/datadog.py import httpx from datadog import statsd async def report_health(): async with httpx.AsyncClient() as client: response = await client.get("http://localhost:8000/health/") data = response.json() for db in data["databases"]: statsd.gauge( 'app.database.pending_migrations', db["pending_migrations"], 
tags=[f'database:{db["database"]}'] ) statsd.service_check( 'app.database.health', statsd.OK if db["status"] == "ok" else statsd.WARNING, tags=[f'database:{db["database"]}'] ) ``` ### Pre-Query Health Check Check database health before running expensive operations: ```python from fastapi import FastAPI, HTTPException import httpx app = FastAPI() @app.post("/analytics/generate-report") async def generate_report(): # Check if analytics database is healthy async with httpx.AsyncClient() as client: response = await client.get("http://localhost:8000/health/analytics") data = response.json() db_health = data["databases"][0] if not db_health["connected"]: raise HTTPException( status_code=503, detail="Analytics database unavailable" ) if db_health["pending_migrations"] > 0: raise HTTPException( status_code=503, detail=f"Analytics database has {db_health['pending_migrations']} pending migrations" ) # Proceed with expensive report generation ... ``` ### Load Balancer Health Checks Configure your load balancer (ALB, nginx, etc.) 
to check `/health/`: #### AWS Application Load Balancer ```yaml # target-group.yaml TargetGroup: HealthCheckPath: /health/ HealthCheckIntervalSeconds: 30 HealthCheckTimeoutSeconds: 5 HealthyThresholdCount: 2 UnhealthyThresholdCount: 3 Matcher: HttpCode: 200 ``` #### Nginx ```nginx # nginx.conf upstream myapp { server app1:8000; server app2:8000; server app3:8000; # Health check check interval=10000 rise=2 fall=3 timeout=5000 type=http; check_http_send "GET /health/ HTTP/1.0\r\n\r\n"; check_http_expect_alive http_2xx; } ``` ## Relationship to Startup Checks Health endpoints and startup checks serve different purposes: | | **Health Endpoints** | **Startup Checks** | |---|---|---| | **When** | On demand (HTTP request) | Once at app boot | | **Failure** | Returns HTTP 503 | Blocks app startup | | **Use case** | Runtime monitoring | Enforce readiness before traffic | | **Frequency** | Every request | Once | ### Startup Check (Lifespan) ```python from contextlib import asynccontextmanager from dbwarden.fastapi import migration_context @asynccontextmanager async def lifespan(app: FastAPI): async with migration_context(mode="check", fail_fast=True): # If this fails, app won't start yield ``` ### Runtime Health (HTTP Endpoint) ```python from dbwarden.fastapi import DBWardenHealthRouter # Always available - returns status code based on health app.include_router(DBWardenHealthRouter(), prefix="/health") ``` Both use the same underlying `check_database_health()` function, so behavior is consistent. 
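The difference in the comparison table can be sketched as two consumers of one status value. This is a minimal illustration, not DBWarden's code: `http_status_for` and `startup_gate` are hypothetical names, the HTTP mapping follows the status-code table earlier on this page, and the assumption that a startup check rejects anything other than `"ok"` comes from the First Steps statement that pending migrations prevent startup.

```python
def http_status_for(overall_status: str) -> int:
    """Runtime view: translate the overall status into an HTTP status code."""
    # Per the status-code table: only "error" maps to 503; "degraded" stays 200.
    return 503 if overall_status == "error" else 200


def startup_gate(overall_status: str, fail_fast: bool = True) -> None:
    """Boot-time view: raise instead of answering with a status code."""
    # Assumption: with fail_fast enabled, any non-"ok" status aborts startup.
    if fail_fast and overall_status != "ok":
        raise RuntimeError(
            f"startup blocked: database status is {overall_status!r}"
        )
```

The same `"degraded"` value therefore yields a 200 response at runtime but stops the application from booting, which is the asymmetry the table describes.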
## Common Patterns ### Pattern 1: Basic Health Only ```python from fastapi import FastAPI from dbwarden.fastapi import DBWardenHealthRouter app = FastAPI() app.include_router(DBWardenHealthRouter(), prefix="/health") ``` ### Pattern 2: Health + Simple Ping ```python from fastapi import FastAPI from dbwarden.fastapi import DBWardenHealthRouter app = FastAPI() # Database health (readiness) app.include_router(DBWardenHealthRouter(), prefix="/health") # Simple ping (liveness) @app.get("/ping") async def ping(): return {"status": "ok"} ``` ### Pattern 3: Custom Prefix ```python # Health at /api/v1/health/ app.include_router(DBWardenHealthRouter(), prefix="/api/v1/health") ``` ### Pattern 4: With Tags for Documentation ```python # Show in OpenAPI docs with "Health" tag router = DBWardenHealthRouter() router.tags = ["Health"] app.include_router(router, prefix="/health") ``` ## Troubleshooting ### 503 Errors in Production If you're getting 503 errors, check: 1. **Is the database reachable?** ```bash # Check database connectivity curl http://localhost:8000/health/ | jq '.databases[].error' ``` 2. **Are migrations pending?** ```bash # Check pending migrations curl http://localhost:8000/health/ | jq '.databases[].pending_migrations' ``` 3. **Is a migration lock active?** ```bash # Check lock status curl http://localhost:8000/health/ | jq '.databases[].lock_active' ``` ### Degraded State in Production If your app is marked degraded: ```json { "status": "degraded", "databases": [ { "pending_migrations": 3, ... 
} ] } ``` This means migrations need to be applied: ```bash # Apply migrations $ dbwarden migrate # Or let the app auto-migrate (if configured) # See: Startup Checks documentation ``` ### 404 on Per-Database Route ```bash curl http://localhost:8000/health/analytics # 404: Database 'analytics' not found ``` Check your DBWarden config - the database name must be defined: ```python # dbwarden.py analytics = database_config( database_name="analytics", # Must match route parameter ... ) ``` ### Health Check Too Slow Health checks should be fast (< 100ms). If they're slow: 1. **Database connection is slow** - Check network latency 2. **Migration table is huge** - Consider squashing migrations 3. **Multiple databases** - Each one adds latency For very fast health checks, consider: ```python # Simple ping (no database check) @app.get("/ping") async def ping(): return {"status": "ok"} ``` ## What's Next? - **[Startup Checks](startup-checks.md)** - Validate on app boot - **[Complete Application](complete-application.md)** - Full working example - **[Production Patterns](../advanced/production-patterns.md)** - K8s, CI/CD, monitoring - **[Multi-Database](../advanced/multi-database.md)** - Multiple databases ======================================================================== PAGE: https://dbwarden.emiliano-go.com/fastapi/tutorial/session-dependency/ ======================================================================== # Session Dependency Learn how to get database sessions in your FastAPI routes using `DatabaseHandle`. ## The Handle Pattern `database_config()` returns a `DatabaseHandle`. Its `.async_session` (and `.sync_session`) properties are FastAPI dependency annotations: use them **directly** in route parameters: ```python from dbwarden import database_config primary = database_config(database_name="primary", ...) 
@app.get("/users") async def list_users(session: primary.async_session): result = await session.execute(select(User)) return result.scalars().all() ``` No `Annotated`, no `Depends`, no type aliases, no manual session creation. ## Quick Example ```python from fastapi import FastAPI from dbwarden import database_config app = FastAPI() primary = database_config( database_name="primary", default=True, database_type="sqlite", database_url_sync="sqlite:///./app.db", ) @app.get("/users") async def list_users(session: primary.async_session): result = await session.execute(select(User)) return result.scalars().all() ``` That is everything you need: 1. Call `database_config()` and store the handle 2. Use `handle.async_session` as a route parameter type hint 3. FastAPI injects a fresh `AsyncSession` per request ## Recommended Project Structure For multi-file projects, define the handle in one place and import it: ```python # dbwarden.py from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost/myapp", model_paths=["app.models"], ) ``` ```python # app/routes/users.py from dbwarden import primary from sqlalchemy import select from app.models import User router = APIRouter() @router.get("/users") async def list_users(session: primary.async_session): result = await session.execute(select(User)) return result.scalars().all() @router.get("/users/{user_id}") async def get_user(user_id: int, session: primary.async_session): result = await session.execute( select(User).where(User.id == user_id) ) user = result.scalar_one_or_none() if not user: raise HTTPException(404, "User not found") return user @router.post("/users") async def create_user(user_data: UserCreate, session: primary.async_session): user = User(**user_data.model_dump()) session.add(user) await session.commit() await session.refresh(user) return user ``` ## How It Works ### 1. 
First Request When the first request comes in: ``` 1. Route parameter primary.async_session resolves 2. Engine is created from your config 3. Engine is cached for reuse 4. Session factory is created 5. A new AsyncSession opens for this request 6. Your route code runs with the session 7. Session closes automatically when the request finishes ``` ### 2. Subsequent Requests ``` 1. primary.async_session resolves 2. Cached engine is reused; no new engine created 3. A fresh session opens for this request 4. Your route runs 5. Session closes automatically ``` Engines are created **once per database** and cached for the application lifetime. ## Multi-Database Projects Create one handle per database: ```python from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost/main", model_paths=["app.models.primary"], ) analytics = database_config( database_name="analytics", database_type="clickhouse", database_url_sync="http://user:password@clickhouse-host:8123/analytics", model_paths=["app.models.analytics"], ) logging = database_config( database_name="logging", database_type="postgresql", database_url_sync="postgresql://user:password@localhost/logs", model_paths=["app.models.logging"], ) ``` Use the appropriate handle in each route: ```python @app.get("/users") async def list_users(session: primary.async_session): result = await session.execute(select(User)) return result.scalars().all() @app.get("/analytics/events") async def list_events(session: analytics.async_session): result = await session.execute(select(Event)) return result.scalars().all() ``` ### Multiple Sessions in One Route ```python @app.get("/dashboard") async def get_dashboard( users_session: primary.async_session, events_session: analytics.async_session, ): users = await users_session.execute(select(User)) events = await events_session.execute(select(Event)) return { "users": 
users.scalars().all(), "events": events.scalars().all(), } ``` Each session is independent and properly managed. ## Sync Sessions For synchronous route handlers, use `.sync_session`: ```python @app.get("/report") def generate_report(session: primary.sync_session): result = session.execute(select(Report)) return result.scalars().all() ``` `.sync_session` works with any sync database driver (psycopg2, mysql-connector, etc.). ## Dev Mode Configure a dev database and the handle automatically resolves the right URL based on the `ENVIRONMENT` environment variable: ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost/prod", dev_database_type="sqlite", dev_database_url="sqlite:///./dev.db", model_paths=["app.models"], ) ``` - `ENVIRONMENT=development` or `local` or `test` uses `dev_database_url` - Otherwise, uses `database_url_sync` No code changes needed between environments. ## Session Lifecycle ### Request-Scoped Sessions Each request gets its own session: ``` Request A ── Session A (independent) Request B ── Session B (independent) Request C ── Session C (independent) ``` ### Automatic Cleanup Sessions are automatically closed in a `finally` block. You never need manual cleanup. ### Session Settings DBWarden sessions use `expire_on_commit=False` so that Pydantic response models can access attributes after commit. ## Troubleshooting ### "RuntimeError: Working outside of application context" This happens if you try to use the session outside a request handler: ```python # Wrong: used outside a request session = primary.async_session ``` Solution: only use `primary.async_session` as a **route parameter type hint**: ```python # Correct @app.get("/users") async def list_users(session: primary.async_session): ... 
``` ### "Config not loaded" Make sure `dbwarden.py` (or whichever file calls `database_config()`) is imported before FastAPI starts: ```python # main.py import dbwarden # Loads config from fastapi import FastAPI ``` ### "Cannot connect to database" Check: - Is the database running? - Is the connection URL correct? - Are credentials valid? - Is the network reachable? ### "expire_on_commit" Errors If you see errors about accessing attributes after commit, ensure you are using the session from the handle (which sets `expire_on_commit=False`). ## Using `get_session` Directly The `get_session()` function is also available from `dbwarden.fastapi` for advanced cases where you need to create session dependencies dynamically: ```python from typing import Annotated from fastapi import Depends from sqlalchemy.ext.asyncio import AsyncSession from dbwarden.fastapi import get_session # Named database SessionDep = Annotated[AsyncSession, Depends(get_session())] AnalyticsDep = Annotated[AsyncSession, Depends(get_session("analytics"))] # Dev mode override DevSessionDep = Annotated[AsyncSession, Depends(get_session(dev=True))] ``` This is useful when: - You need to override sessions in tests (see [Testing](../advanced/testing.md)) - You want to use the `Annotated` type alias pattern - You need programmatic database selection at the dependency level For most cases, the `DatabaseHandle` pattern (`.async_session` / `.sync_session`) is simpler and recommended. ## What's Next? 
- **[Startup Checks](startup-checks.md)** - Validate your database on app boot - **[Transaction Management](../advanced/transaction-management.md)** - Manual commits and rollbacks - **[Testing](../advanced/testing.md)** - Override dependencies in tests ======================================================================== PAGE: https://dbwarden.emiliano-go.com/fastapi/tutorial/startup-checks/ ======================================================================== # Startup Checks Learn how to validate your database before your FastAPI app accepts traffic. ## What Are Startup Checks? Startup checks run when your app boots, **before** it starts accepting requests. They verify: - Database connectivity - Migration state - Schema integrity If checks fail, your app won't start. This prevents serving traffic with a broken or outdated database. ## Why Use Startup Checks? **Without startup checks:** - App starts even if database is down - First requests fail with connection errors - Users see errors while you debug - Hard to diagnose deployment issues **With startup checks:** - App fails to start if database has issues - Kubernetes restarts the pod automatically - No user-facing errors - Clear logs showing what's wrong ## Quick Example Add a startup check in **3 lines**: ```python from contextlib import asynccontextmanager from fastapi import FastAPI from dbwarden.fastapi import migration_context @asynccontextmanager async def lifespan(app: FastAPI): async with migration_context(mode="check"): yield # App runs here app = FastAPI(lifespan=lifespan) ``` That's it! Your app now validates the database on startup. 
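To see why a failing check keeps the app from serving traffic, here is a stdlib-only sketch of the lifespan pattern used above. It makes two assumptions: `fake_migration_context` is a hypothetical stand-in for `migration_context` (it is not DBWarden's implementation), and `run_app` only imitates what an ASGI server does with a lifespan.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def fake_migration_context(healthy: bool):
    """Hypothetical stand-in for migration_context: validate first, then yield."""
    if not healthy:
        # Raising before `yield` means the body of `async with` never runs.
        raise RuntimeError("Startup check failed: primary: could not connect to server")
    yield


def make_lifespan(healthy: bool, events: list[str]):
    @asynccontextmanager
    async def lifespan(app):
        async with fake_migration_context(healthy):
            events.append("startup complete")
            yield  # the app would serve traffic here
        events.append("shutdown")

    return lifespan


async def run_app(healthy: bool) -> list[str]:
    """Imitate a server: enter the lifespan, serve, then leave it."""
    events: list[str] = []
    lifespan = make_lifespan(healthy, events)
    try:
        async with lifespan(app=None):
            events.append("serving requests")
    except RuntimeError as exc:
        events.append(f"startup failed: {exc}")
    return events


healthy_events = asyncio.run(run_app(healthy=True))
broken_events = asyncio.run(run_app(healthy=False))
```

In the healthy run the events are recorded in startup, serving, shutdown order. In the broken run the only event is the startup failure, because the exception is raised before the lifespan ever reaches its `yield`.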
## Your First Startup Check Let's start with a complete minimal example: ```python from contextlib import asynccontextmanager from fastapi import FastAPI from dbwarden.fastapi import migration_context @asynccontextmanager async def lifespan(app: FastAPI): # Runs before app starts accepting requests async with migration_context(mode="check"): yield # App serves traffic # Runs on shutdown (cleanup) app = FastAPI(lifespan=lifespan) @app.get("/") async def root(): return {"message": "Hello World"} ``` Start your app: ```bash uvicorn main:app ``` ### If Database Is Healthy You'll see: ``` INFO: Started server process [12345] INFO: Waiting for application startup. INFO: DBWarden: migration_context mode=check outcome=ok duration_ms=45 INFO: Application startup complete. INFO: Uvicorn running on http://127.0.0.1:8000 ``` The app starts successfully! ### If Database Has Issues If the database is unreachable or has pending migrations: ``` INFO: Started server process [12345] INFO: Waiting for application startup. ERROR: Application startup failed. Exiting. RuntimeError: Startup check failed: primary: could not connect to server ``` The app **exits immediately**. No requests are served. Failing fast on startup is better than serving broken requests. Kubernetes will restart your pod automatically. 
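If you forward startup logs to a monitoring system, the `key=value` fields in the success line are easy to extract. The sketch below assumes the line keeps the shape shown in the sample output above, which is an observation from that sample rather than a documented contract. `parse_startup_log` is a hypothetical helper, not part of DBWarden.

```python
import re
from typing import Optional

# Matches key=value pairs such as mode=check or duration_ms=45.
_PAIR = re.compile(r"(\w+)=(\S+)")


def parse_startup_log(line: str) -> Optional[dict[str, str]]:
    """Return the key=value fields of a migration_context log line, else None."""
    if "DBWarden: migration_context" not in line:
        return None
    return dict(_PAIR.findall(line))


fields = parse_startup_log(
    "INFO: DBWarden: migration_context mode=check outcome=ok duration_ms=45"
)
unrelated = parse_startup_log("INFO: Application startup complete.")
```

All values come back as strings, so convert `duration_ms` with `int(...)` before comparing it against a threshold.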
## Check Mode vs Migrate Mode `migration_context` has two modes: ### Check Mode (Recommended) Validates without making changes: ```python async with migration_context(mode="check"): yield ``` **What it does:** - Checks database connectivity - Verifies migration state - Reports pending migrations - Does **not** apply migrations - Does **not** modify schema **Use for:** - Production deployments - Staging environments - When migrations run in separate jobs ### Migrate Mode Applies migrations on startup: ```python async with migration_context(mode="migrate"): yield ``` **What it does:** - Checks database connectivity - Applies pending migrations - Updates schema - Modifies your database **Use for:** - Local development - Simple deployments - Single-instance apps - When you want auto-migration Migrate mode is blocked in production by default. Set `allow_in_production=True` to override (not recommended for most apps). ## Complete Function Signature ```python async def migration_context( *, mode: Literal["migrate", "check"] = "check", database: str | None = None, all_databases: bool = False, dev: bool = False, strict_translation: bool = False, with_backup: bool = False, backup_dir: str | None = None, verbose: bool = False, allow_in_production: bool = False, fail_fast: bool = True, only_dev: bool = False, ) -> AsyncContextManager: """FastAPI lifespan helper for startup migration/check logic.""" ``` ## All Parameters ### `mode` **Type:** `"check"` | `"migrate"` **Default:** `"check"` What to do on startup: - `"check"` - Read-only validation (recommended) - `"migrate"` - Apply pending migrations ```python # Check only (recommended for production) async with migration_context(mode="check"): yield # Apply migrations (useful for dev) async with migration_context(mode="migrate"): yield ``` ### `database` **Type:** `str | None` **Default:** `None` (uses default database) Which database to check/migrate: ```python # Check default database async with 
migration_context(mode="check"): yield # Check specific database async with migration_context(mode="check", database="analytics"): yield ``` ### `all_databases` **Type:** `bool` **Default:** `False` Check/migrate all configured databases: ```python # Check all databases async with migration_context(mode="check", all_databases=True): yield ``` Use this when you have multiple databases and want to validate all of them on startup. For apps with multiple databases, always use `all_databases=True` in production so that every database is verified as healthy. ### `dev` **Type:** `bool` **Default:** `False` Use `dev_database_url` instead of `database_url_sync`: ```python # Use dev database async with migration_context(mode="check", dev=True): yield ``` Or set the environment variable: ```bash export ENVIRONMENT=development ``` ### `strict_translation` **Type:** `bool` **Default:** `False` Enable strict SQL translation mode (advanced): ```python async with migration_context(mode="check", strict_translation=True): yield ``` ### `with_backup` **Type:** `bool` **Default:** `False` Create a backup before migrations (migrate mode only): ```python async with migration_context( mode="migrate", with_backup=True, backup_dir="./backups" ): yield ``` This parameter only applies when `mode="migrate"`; it is ignored in check mode. ### `backup_dir` **Type:** `str | None` **Default:** `None` (uses default backup location) Where to store backups: ```python async with migration_context( mode="migrate", with_backup=True, backup_dir="/var/backups/dbwarden" ): yield ``` ### `verbose` **Type:** `bool` **Default:** `False` Enable detailed logging: ```python async with migration_context(mode="check", verbose=True): yield ``` Useful for debugging startup issues.
### `allow_in_production` **Type:** `bool` **Default:** `False` Allow migrate mode in production: ```python async with migration_context( mode="migrate", allow_in_production=True # Use with caution ): yield ``` By default, `mode="migrate"` is **blocked** when `ENVIRONMENT` is `prod` or `production`. This prevents accidental schema changes in production. Only enable this if you understand the risks: no rollback on migration failure, downtime during migration, potential data loss, and race conditions with multiple pods. **Better approach:** Run migrations in a separate job before deployment. ### `fail_fast` **Type:** `bool` **Default:** `True` Exit immediately on failure: ```python # Fail fast (recommended) async with migration_context(mode="check", fail_fast=True): yield # Continue on failure (not recommended) async with migration_context(mode="check", fail_fast=False): yield ``` When `fail_fast=True`: - App exits if checks fail - Clear error message in logs - Kubernetes restarts pod When `fail_fast=False`: - Logs warning but continues - App starts even with database issues - First requests may fail `fail_fast=True` is the right default for production. If you can't start, you shouldn't serve traffic. ### `only_dev` **Type:** `bool` **Default:** `False` Only run checks in development environments: ```python # Only check in dev, skip in prod async with migration_context(mode="check", only_dev=True): yield ``` This skips checks unless `ENVIRONMENT` is one of: - `dev` - `development` - `local` - `test` - `testing` **When to use:** - You run migrations in CI/CD before deployment - You have separate health checks in production - You want faster production startup If you use `only_dev=True`, make sure you have other mechanisms to validate database health in production (like health endpoints or separate migration jobs). 
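The interaction between `mode`, `only_dev`, and `allow_in_production` can be summarized as a small decision function. This is a sketch of the behavior described above, not DBWarden's code. `startup_action` is a hypothetical name, the `ENVIRONMENT` value is compared verbatim (whether DBWarden normalizes case is not stated here), and evaluating `only_dev` before the production block is an assumption.

```python
DEV_ENVIRONMENTS = {"dev", "development", "local", "test", "testing"}
PROD_ENVIRONMENTS = {"prod", "production"}


def startup_action(
    environment: str,
    mode: str = "check",
    only_dev: bool = False,
    allow_in_production: bool = False,
) -> str:
    """Return "skip", "blocked", or the mode that would actually run."""
    # only_dev: do nothing unless ENVIRONMENT names a development environment.
    if only_dev and environment not in DEV_ENVIRONMENTS:
        return "skip"
    # Migrate mode is refused in production unless explicitly allowed.
    if mode == "migrate" and environment in PROD_ENVIRONMENTS and not allow_in_production:
        return "blocked"
    return mode


blocked = startup_action("production", mode="migrate")
overridden = startup_action("production", mode="migrate", allow_in_production=True)
skipped = startup_action("production", mode="migrate", only_dev=True)
local_migrate = startup_action("local", mode="migrate", only_dev=True)
prod_check = startup_action("production", mode="check")
```

Under these assumptions, `mode="migrate"` combined with `only_dev=True` never touches a production database: it is skipped before the production block is even considered.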
## Common Patterns ### Pattern 1: Production - Check Only Recommended for most production apps: ```python @asynccontextmanager async def lifespan(app: FastAPI): async with migration_context( mode="check", all_databases=True, fail_fast=True, ): yield app = FastAPI(lifespan=lifespan) ``` - Validates all databases - Fails fast on issues - No schema changes - Safe for multiple pods ### Pattern 2: Development - Auto Migrate Convenient for local development: ```python @asynccontextmanager async def lifespan(app: FastAPI): async with migration_context( mode="migrate", only_dev=True, # Only in dev with_backup=True, verbose=True, ): yield app = FastAPI(lifespan=lifespan) ``` - Auto-applies migrations locally - Creates backups - Skipped in production - Detailed logging ### Pattern 3: Hybrid - Dev Migrates, Prod Checks Different behavior per environment: ```python import os @asynccontextmanager async def lifespan(app: FastAPI): is_dev = os.getenv("ENVIRONMENT", "").lower() in ["dev", "development", "local"] async with migration_context( mode="migrate" if is_dev else "check", all_databases=True, fail_fast=True, ): yield app = FastAPI(lifespan=lifespan) ``` - Migrates automatically in dev - Only checks in production - One configuration for all environments ### Pattern 4: No Checks (CI/CD Handles It) If you run migrations in a separate job: ```python @asynccontextmanager async def lifespan(app: FastAPI): # No migration_context - migrations handled by CI/CD yield # Just cleanup on shutdown if needed app = FastAPI(lifespan=lifespan) ``` Use this when: - Migrations run in Kubernetes init containers - CI/CD applies migrations before deployment - You use tools like Flyway or Liquibase ## Direct Helper Functions If you don't want to use `migration_context`, you can call the helpers directly: ### `check_schema_on_startup` Read-only validation: ```python from dbwarden.fastapi import check_schema_on_startup @asynccontextmanager async def lifespan(app: FastAPI): results = 
check_schema_on_startup( all_databases=True, fail_fast=True, ) # results is a list of HealthResult objects yield ``` **Function signature:** ```python def check_schema_on_startup( *, database: str | None = None, all_databases: bool = False, dev: bool = False, strict_translation: bool = False, only_dev: bool = False, fail_fast: bool = True, verbose: bool = False, ) -> list[HealthResult]: """Run read-only startup schema checks.""" ``` **Returns:** List of `HealthResult` objects with health status per database. ### `migrate_on_startup` Apply migrations: ```python from dbwarden.fastapi import migrate_on_startup @asynccontextmanager async def lifespan(app: FastAPI): migrate_on_startup( all_databases=True, with_backup=True, only_dev=True, ) yield ``` **Function signature:** ```python def migrate_on_startup( *, database: str | None = None, all_databases: bool = False, dev: bool = False, strict_translation: bool = False, with_backup: bool = False, backup_dir: str | None = None, verbose: bool = False, allow_in_production: bool = False, fail_fast: bool = True, only_dev: bool = False, ) -> None: """Run migration workflow at startup.""" ``` Use these when you need more control or want to access the health results. For most cases, `migration_context` is simpler. 
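Both helpers are plain synchronous functions according to the signatures above, so calling them directly inside an async lifespan blocks the event loop while they run. That is usually harmless during startup. If other startup tasks need to make progress at the same time, one option is `asyncio.to_thread`, sketched below. `slow_check` is a hypothetical stand-in for a synchronous helper such as `check_schema_on_startup`, and it returns plain strings rather than real `HealthResult` objects.

```python
import asyncio
import time


def slow_check(all_databases: bool = False) -> list[str]:
    """Hypothetical stand-in for a synchronous startup check."""
    time.sleep(0.05)  # pretend to talk to the database
    return ["primary: ok", "analytics: ok"] if all_databases else ["primary: ok"]


async def startup() -> tuple[list[str], int]:
    ticks = 0

    async def heartbeat() -> None:
        # Stands in for any other coroutine that should keep running.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    beat = asyncio.create_task(heartbeat())
    # Run the blocking check in a worker thread so the loop stays responsive.
    results = await asyncio.to_thread(slow_check, all_databases=True)
    beat.cancel()
    return results, ticks


results, ticks = asyncio.run(startup())
```

Had `slow_check` been called directly instead of through `asyncio.to_thread`, the heartbeat task would not have run at all before the check returned.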
## Error Handling ### Connection Errors If database is unreachable: ``` RuntimeError: Startup check failed: primary: could not connect to server: Connection refused (host:5432) ``` **Solution:** - Check database is running - Verify connection URL - Check network/firewall - Ensure credentials are correct ### Pending Migrations If migrations are pending and `mode="check"`: ``` RuntimeError: Startup check failed: primary: 3 pending migrations ``` **Solution:** ```bash # Apply migrations manually $ dbwarden migrate # Or use migrate mode # migration_context(mode="migrate") ``` ### Production Migration Blocked If you try `mode="migrate"` in production: ``` RuntimeError: migrate_on_startup is blocked in production unless allow_in_production=True ``` **Solution:** - Run migrations in a separate job - Or add `allow_in_production=True` (not recommended) ### Multiple Databases, One Fails If `all_databases=True` and one database fails: ``` RuntimeError: Startup check failed: primary: ok; analytics: connection refused ``` The app exits even if some databases are healthy. Fix all databases before starting. 
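The combined message in the multi-database case lists every database, healthy or not. A minimal sketch of that aggregation follows. `raise_if_unhealthy` and its input shape (database name mapped to an error string, or `None` when healthy) are hypothetical; only the message format is taken from the example above.

```python
from typing import Optional


def raise_if_unhealthy(results: dict[str, Optional[str]]) -> None:
    """Raise one RuntimeError summarizing all databases if any of them failed."""
    if all(error is None for error in results.values()):
        return
    summary = "; ".join(
        f"{name}: {'ok' if error is None else error}"
        for name, error in results.items()
    )
    raise RuntimeError(f"Startup check failed: {summary}")


# All healthy: returns without raising.
raise_if_unhealthy({"primary": None, "analytics": None})

message = None
try:
    raise_if_unhealthy({"primary": None, "analytics": "connection refused"})
except RuntimeError as exc:
    message = str(exc)
```

Reporting the healthy databases alongside the failing one makes it clear from a single log line which targets still need attention.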
## Comparison: Check vs Migrate

| | **Check Mode** | **Migrate Mode** |
|---|---|---|
| **Reads schema** | Yes | Yes |
| **Checks connectivity** | Yes | Yes |
| **Reports pending migrations** | Yes | |
| **Applies migrations** | No | Yes |
| **Modifies database** | No | Yes |
| **Production safe (multi-pod)** | Yes | Risky |
| **Can rollback** | N/A | No |
| **Requires lock** | | Yes |
| **Fast** | Yes (< 100ms) | Depends on migrations |

## Environment Detection

DBWarden detects your environment from the `ENVIRONMENT` variable:

### Development Environments

Detected as "development":

- `dev`
- `development`
- `local`
- `test`
- `testing`

```bash
export ENVIRONMENT=development
```

### Production Environments

Detected as "production":

- `prod`
- `production`

```bash
export ENVIRONMENT=production
```

### Why It Matters

Some parameters behave differently based on environment:

- **`only_dev=True`**: checks are skipped in production
- **`allow_in_production=False`**: migrate mode is blocked in production

## Troubleshooting

### App Starts But Migrations Not Checked

Check that `migration_context` is actually running:

```python
from contextlib import asynccontextmanager

from fastapi import FastAPI
from dbwarden.fastapi import migration_context


@asynccontextmanager
async def lifespan(app: FastAPI):
    print("Lifespan starting...")  # Debug
    async with migration_context(mode="check", verbose=True):
        print("Passed checks!")  # Debug
        yield
    print("Lifespan ending...")  # Debug
```

Make sure you're passing `lifespan` to `FastAPI`:

```python
# Correct
app = FastAPI(lifespan=lifespan)

# Wrong - lifespan not used
app = FastAPI()
```

### Checks Pass But Routes Fail

If startup checks pass but routes fail with connection errors:

1. **Different databases?** - Startup checks one database, routes use another
2. **Connection pool exhausted?** - Too many concurrent requests
3. **Database restarted?** - Connection was valid at startup but not now

### Slow Startup

If startup is slow:

1. **Migrations taking time?** - Use `mode="check"` instead of `mode="migrate"`
2. **Multiple databases?** - Each one adds latency
3. **Network latency?** - Database is far away
4.
**First connection slow?** - Normal for some databases (initial SSL handshake) Use `verbose=True` to see timing: ```python async with migration_context(mode="check", verbose=True): yield ``` ### Production Blocked If you see "blocked in production" errors: ```python # This is blocked async with migration_context(mode="migrate"): yield # Solutions: # 1. Use check mode async with migration_context(mode="check"): yield # 2. Use only_dev async with migration_context(mode="migrate", only_dev=True): yield # 3. Override (not recommended) async with migration_context(mode="migrate", allow_in_production=True): yield ``` ## What's Next? - **[Complete Application](complete-application.md)** - Full working example - **[Health Endpoints](health-endpoints.md)** - Runtime health monitoring - **[Production Patterns](../advanced/production-patterns.md)** - K8s, CI/CD strategies - **[Multi-Database](../advanced/multi-database.md)** - Multiple databases ======================================================================== PAGE: https://dbwarden.emiliano-go.com/features/ ======================================================================== # Features This page gives a compact overview of the main DBWarden features, with short examples. Use it to understand the surface area of the tool before diving into the guides and reference pages. ## SQL-First Migrations DBWarden writes migrations as plain SQL files. Each file contains both a `-- upgrade` section and a `-- rollback` section. ```sql -- upgrade CREATE TABLE IF NOT EXISTS users ( id SERIAL PRIMARY KEY, email VARCHAR(255) NOT NULL UNIQUE ); -- rollback DROP TABLE users; ``` This keeps schema changes reviewable in code review and runnable without hidden ORM magic. ## Typed Database Configuration Configure one or many databases with explicit `database_config(...)` calls.
```python from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/main", model_paths=["app.models"], model_tables=["users", "posts", "comments"], ) analytics = database_config( database_name="analytics", database_type="clickhouse", database_url_sync="clickhouse://default:@localhost:8123/analytics", model_paths=["app.analytics_models"], model_tables=["events", "page_views"], ) ``` Each entry is validated before use, including database names, table names, `model_paths`, `model_tables`, and duplicate target detection. ## Model-Driven Migration Generation DBWarden reads SQLAlchemy models, diffs them against the live schema or an offline model state, and emits SQL. ```text $ dbwarden make-migrations "add posts table" --database primary Created migration: migrations/primary/primary__0002_add_posts_table.sql ``` ## Backend-Specific Metadata PostgreSQL, MySQL/MariaDB, and ClickHouse support first-class metadata through `class Meta`. PostgreSQL ```python from dbwarden.databases.pgsql import PGTableMeta, PGColumnMeta class Meta(PGTableMeta): pg_fillfactor = 80 class id(PGColumnMeta): pg_identity = "always" ``` MySQL ```python from dbwarden.databases.mysql import MyTableMeta, MyColumnMeta, my class Meta(MyTableMeta): my_engine = "InnoDB" my_charset = "utf8mb4" class id(MyColumnMeta): my = my.field(unsigned=True) ``` ClickHouse example: ```python from dbwarden.databases.clickhouse import CHTableMeta, ChEngineSpec, ChIndexSpec class Meta(CHTableMeta): ch_engine = ChEngineSpec("MergeTree") ch_order_by = ["event_date", "id"] ch_indexes = [ ChIndexSpec("ix_payload", ["payload"], type="bloom_filter", granularity=1), ] ``` ## Safety Classification Use `check` to classify changes before generating or applying SQL. 
```text $ dbwarden check --database primary SAFE add column users.bio WARN shrink varchar users.email CRITICAL drop table audit_log ``` Safety levels are `SAFE`, `INFO`, `WARN`, and `CRITICAL`. ## Read-Only Schema Diffing Use `diff` when you want to inspect differences without writing migration files. ```bash $ dbwarden diff --database primary ``` This is useful during reviews, debugging, and CI checks. ## Reverse-Engineering Models Generate SQLAlchemy model code from a live database. ```bash $ dbwarden generate-models --database primary --tables users,posts ``` This is useful when adopting DBWarden in an existing project, documenting an inherited schema, or recovering model definitions. ## Offline Migrations DBWarden can generate migrations without connecting to a live database, by diffing current models against an exported model state. ```bash $ dbwarden export-models --database primary $ dbwarden make-migrations "offline change" --offline --database primary ``` This is useful for CI pipelines and restricted environments. ## Multi-Database Workflows Manage multiple backends from one repository, each with its own migration directory and model set. ```bash $ dbwarden migrate --database primary $ dbwarden migrate --database analytics ``` You can also show status across all configured databases: ```bash $ dbwarden status --all ``` ## Dev Mode Use `--dev` to point a configured database at a separate development target. ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/main", dev_database_type="sqlite", dev_database_url="sqlite:///./development.db", ) ``` ```bash $ dbwarden --dev make-migrations "test locally" --database primary $ dbwarden --dev migrate --database primary ``` ## Seed Management DBWarden tracks SQL and Python seed files separately from schema migrations. 
```bash $ dbwarden seed create "load countries" --type sql --database primary $ dbwarden seed apply --database primary $ dbwarden seed list --database primary ``` This is useful for reference data, lookup tables, and repeatable environment setup. ## FastAPI Integration `database_config(...)` returns a `DatabaseHandle` that can be used directly in FastAPI dependencies. ```python from fastapi import APIRouter from .dbwarden import primary router = APIRouter() @router.get("/users") async def list_users(session: primary.async_session): ... ``` This gives one shared source of truth for migrations, runtime connections, and session injection. ## Sandbox Testing Apply migrations in a temporary sandbox database before applying them for real. ```bash $ dbwarden migrate --sandbox --database primary ``` This is useful when validating generated SQL against a throwaway environment. ## Status, History, and Rollback DBWarden includes built-in commands for operational visibility. ```bash $ dbwarden status --database primary $ dbwarden history --database primary $ dbwarden rollback --count 1 --database primary ``` These commands make the migration lifecycle inspectable and reversible. ## Next Steps - Follow [Get Started](getting-started/setup.md) - Explore [Cookbook & Examples](cookbook/index.md) - Use [CLI Reference](cli-reference.md) for command lookup ======================================================================== PAGE: https://dbwarden.emiliano-go.com/getting-started/developing-locally/ ======================================================================== # Developing Locally This guide covers the local development workflow: using a development database, checking diffs safely, reverse-engineering models, and generating offline migrations. ## Use Dev Mode Dev mode swaps the configured database target for `dev_database_url` and `dev_database_type`. 
Configuration example: ```python from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/main", model_paths=["app.models"], model_tables=["users", "posts", "comments"], dev_database_type="sqlite", dev_database_url="sqlite:///./development.db", ) ``` Run local commands against the development target: ```text $ dbwarden --dev make-migrations "test local change" --database primary Created migration: migrations/primary/primary__0002_test_local_change.sql $ dbwarden --dev migrate --database primary Applying migration: primary__0002_test_local_change.sql Migration applied successfully ``` ## SQLite Translation Using SQLite in dev mode is common, but not every type or default translates perfectly from server databases. DBWarden handles this by translating backend-specific types and warning when fidelity is reduced. If you want those warnings to become hard failures, use: ```bash $ dbwarden --dev --strict-translation make-migrations "validate translation" --database primary ``` ## Check the Planned Changes Use `diff` when you want to inspect differences without writing files: ```bash $ dbwarden diff --database primary ``` Use `check` when you want a safety classification: ```text $ dbwarden check --database primary SAFE add column users.bio WARN shrink varchar users.email CRITICAL drop table audit_log ``` ## Offline Migrations Offline migrations let you generate SQL without connecting to a live database. The workflow is: 1. Export the current model state. 2. Change your models. 3. Generate a migration with `--offline`. Commands: ```bash $ dbwarden export-models --database primary $ dbwarden make-migrations "offline schema change" --offline --database primary ``` This is useful for CI, restricted environments, and workflows where the migration plan should not depend on a live database connection. 
For a full walkthrough, see [Cookbook: Offline & CI](../cookbook/04-offline-ci.md). ## Local Validation Loop A practical local loop looks like this: ```bash $ dbwarden --dev diff --database primary $ dbwarden --dev check --database primary $ dbwarden --dev make-migrations "local change" --database primary $ dbwarden --dev migrate --database primary $ dbwarden --dev status --database primary ``` This keeps feedback fast while still using the same toolchain you use in production. Next, continue with [Workflows](workflows.md). ======================================================================== PAGE: https://dbwarden.emiliano-go.com/getting-started/first-migration/ ======================================================================== # Your First Migration This guide walks through the core DBWarden workflow: define models, generate SQL, apply the migration, inspect the result, and roll it back. ## Create the Models Create `app/models.py`: ```python from datetime import datetime from sqlalchemy import DateTime, ForeignKey, Integer, String, Text from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column from dbwarden.databases import IndexSpec, TableMeta class Base(DeclarativeBase): pass class User(Base): __tablename__ = "users" id: Mapped[int] = mapped_column(Integer, primary_key=True) email: Mapped[str] = mapped_column(String(255), unique=True, nullable=False) bio: Mapped[str | None] = mapped_column(Text, nullable=True) class Meta(TableMeta): comment = "Core user accounts" class Post(Base): __tablename__ = "posts" id: Mapped[int] = mapped_column(Integer, primary_key=True) title: Mapped[str] = mapped_column(String(255), nullable=False) body: Mapped[str] = mapped_column(Text, nullable=False) user_id: Mapped[int] = mapped_column(ForeignKey("users.id"), nullable=False) created_at: Mapped[datetime] = mapped_column(DateTime, nullable=False) class Meta(TableMeta): indexes = [ IndexSpec(name="ix_posts_created_at", columns=["created_at"]), ] ``` ## Generate the 
Migration Run: ```text $ dbwarden make-migrations "create core tables" --database primary Created migration: migrations/primary/primary__0001_create_core_tables.sql ``` DBWarden compares your current models against the live schema, or snapshot state, and writes a new SQL migration file. ## Review the Generated SQL Open the new file. It will look roughly like this: ```sql -- upgrade CREATE TABLE IF NOT EXISTS users ( id SERIAL PRIMARY KEY, email VARCHAR(255) NOT NULL UNIQUE, bio TEXT ); CREATE TABLE IF NOT EXISTS posts ( id SERIAL PRIMARY KEY, title VARCHAR(255) NOT NULL, body TEXT NOT NULL, user_id INTEGER NOT NULL REFERENCES users(id), created_at TIMESTAMP NOT NULL ); CREATE INDEX IF NOT EXISTS ix_posts_created_at ON posts (created_at); -- rollback DROP INDEX IF EXISTS ix_posts_created_at; DROP TABLE posts; DROP TABLE users; ``` The exact SQL depends on the backend, but the structure is always the same: - `-- upgrade` contains the forward change - `-- rollback` contains the reverse change ## Apply the Migration Run: ```text $ dbwarden migrate --database primary Applying migration: primary__0001_create_core_tables.sql Migration applied successfully ``` Internally, DBWarden resolves the config, acquires the migration lock, executes the upgrade SQL, records the checksum, and releases the lock. ## Verify the Result Run: ```text $ dbwarden status --database primary Database: primary Applied migrations: 1 Pending migrations: 0 $ dbwarden history --database primary 1 primary__0001_create_core_tables.sql applied ``` Use `status` to see the current state of the migration queue. Use `history` to see what has been applied and in what order. You can also inspect the live schema directly: ```bash $ dbwarden check-db --database primary ``` This is useful when you want a read-only view of what the database currently contains. 
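To make the `-- upgrade` / `-- rollback` layout from "Review the Generated SQL" concrete, here is a minimal sketch of splitting a migration file's text into its two sections. The function name `split_migration` and the exact marker matching are assumptions for illustration; this is not DBWarden's own parser.

```python
import re

# Assumption: a section starts at a line that contains only "-- upgrade" or "-- rollback".
SECTION_MARKER = re.compile(r"^--\s*(upgrade|rollback)\s*$", re.IGNORECASE)


def split_migration(sql_text: str) -> dict[str, str]:
    """Return the SQL found under each section marker (hypothetical helper)."""
    sections: dict[str, list[str]] = {"upgrade": [], "rollback": []}
    current = None
    for line in sql_text.splitlines():
        match = SECTION_MARKER.match(line.strip())
        if match:
            current = match.group(1).lower()
            continue
        if current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}
```

A helper like this is handy in review tooling, for example to fail a pull request when the rollback section of a new migration is empty.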
## Roll Back the Migration

Run:

```text
$ dbwarden rollback --count 1 --database primary
Rolling back migration: primary__0001_create_core_tables.sql
Rollback completed successfully
```

After that, the database is back to its previous schema state.

## Step by Step

### Step 1: Define the Base Class

```python
from sqlalchemy.orm import DeclarativeBase


class Base(DeclarativeBase):
    pass
```

DBWarden does not export a shared `Base`. You define a local SQLAlchemy declarative base in your project.

### Step 2: Define the Models

```python
class User(Base):
    __tablename__ = "users"
```

Every model maps to a table. Columns come from normal SQLAlchemy field declarations. Table-level migration metadata lives in `class Meta`.

### Step 3: Add Table Metadata

```python
class Meta(TableMeta):
    comment = "Core user accounts"
```

`TableMeta` is the cross-database surface for comments, indexes, checks, and unique constraints.

### Step 4: Generate SQL

```bash
$ dbwarden make-migrations "create core tables" --database primary
```

This command inspects the configured models, compares them with the current schema, and emits a SQL file. The file becomes part of your normal code review and deployment workflow.

### Step 5: Review Upgrade and Rollback

Every migration file contains both directions. This is one of DBWarden's core design choices: a migration is not complete until the rollback exists.

### Step 6: Apply the Migration

```bash
$ dbwarden migrate --database primary
```

This executes pending migrations in order and records them in the migration table.

### Step 7: Verify the State

```bash
$ dbwarden status --database primary
$ dbwarden history --database primary
```

Verification is part of the workflow, not optional cleanup.

### Step 8: Roll It Back

```bash
$ dbwarden rollback --count 1 --database primary
```

Rolling back the first migration confirms that the file contains valid reverse SQL, not just valid forward SQL.
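The idea behind Step 8 can be illustrated without DBWarden at all. The sketch below applies an upgrade script and then its rollback script to an in-memory SQLite database and compares the table list before and after. The SQL strings, `table_names`, and `round_trip` are illustrative assumptions, not DBWarden output or API.

```python
import sqlite3

# Hypothetical upgrade/rollback pair, written in SQLite-compatible SQL.
UPGRADE_SQL = """
CREATE TABLE IF NOT EXISTS users (
    id INTEGER PRIMARY KEY,
    email VARCHAR(255) NOT NULL UNIQUE,
    bio TEXT
);
CREATE INDEX IF NOT EXISTS ix_users_bio ON users (bio);
"""

ROLLBACK_SQL = """
DROP INDEX IF EXISTS ix_users_bio;
DROP TABLE users;
"""


def table_names(conn: sqlite3.Connection) -> list[str]:
    """List user tables currently present in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def round_trip(upgrade_sql: str, rollback_sql: str):
    """Apply upgrade then rollback; return the table lists at each stage."""
    conn = sqlite3.connect(":memory:")
    try:
        before = table_names(conn)
        conn.executescript(upgrade_sql)
        during = table_names(conn)
        conn.executescript(rollback_sql)
        after = table_names(conn)
        return before, during, after
    finally:
        conn.close()


before, during, after = round_trip(UPGRADE_SQL, ROLLBACK_SQL)
print("rollback restored the original table list:", before == after)
```

If the rollback section forgot a `DROP`, `after` would differ from `before`, which is exactly the class of mistake the rollback step is meant to surface.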
## Manual Migrations Not every schema or data change should be auto-generated. When the change is not model-driven, create a manual migration file: ```bash $ dbwarden new "manual hotfix" --database primary ``` Use manual migrations for cases like: - data backfills - type changes that require custom `USING` expressions - backend-specific operations that need hand-written SQL DBWarden will track these files the same way it tracks generated migrations. Next, continue with [Developing Locally](developing-locally.md). ======================================================================== PAGE: https://dbwarden.emiliano-go.com/getting-started/first-steps/ ======================================================================== # First Steps This walkthrough is the foundation of the DBWarden workflow. The goal is not just to run commands, but to understand why each step exists and how it fits the migration lifecycle. ## Step 1: Initialize the project ```bash $ dbwarden init ``` This creates: - a migrations directory structure - a Python configuration scaffold (`dbwarden.py`) Why it matters: DBWarden expects a project-local migration layout and config source so migration behavior is deterministic per repository. ## Step 2: Define one explicit database entry ```python from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/main", model_paths=["app.models"], model_tables=["users"], ) ``` Why it matters: DBWarden resolves migration targets from explicit typed entries, not inferred environment state. ## Step 3: Add SQLAlchemy models DBWarden uses model metadata to generate migration SQL. 
A minimal model example:

```python
from datetime import datetime

from sqlalchemy import DateTime, Integer, String
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    email: Mapped[str] = mapped_column(String(255), unique=True, nullable=False)
    created_at: Mapped[datetime] = mapped_column(DateTime, default=datetime.utcnow)
```

Why it matters: model metadata is the input to `make-migrations`.

## Step 4: Generate migration SQL

```bash
$ dbwarden make-migrations "create users table" --database primary
```

DBWarden creates a versioned SQL file under `migrations/primary/`.

Why it matters: this file is now part of your code review process and a deployment artifact.

## Step 5: Review the generated migration

Open the file and validate both sections:

```sql
-- upgrade
-- rollback
```

Why it matters: rollback quality determines recovery quality.

## Step 6: Apply migrations

```bash
$ dbwarden migrate --database primary
```

During execution DBWarden:

1. resolves config and target database
2. acquires migration lock
3. executes pending SQL
4. stores migration record and checksum
5. releases lock

## Step 7: Verify the result

```bash
$ dbwarden status --database primary
$ dbwarden history --database primary
```

Use `status` to confirm pending/applied counts and `history` to confirm execution order.
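The five execution steps listed under Step 6 can be sketched as a toy runner on SQLite. Everything here is an assumption made for illustration: the `toy_lock` and `toy_history` tables, the `apply_pending` function, the single-row lock, and hashing the SQL string instead of the file. DBWarden's real lock and record format are not shown in this guide.

```python
import hashlib
import sqlite3


def apply_pending(conn, migrations):
    """Apply (name, upgrade_sql) pairs that are not yet recorded; return their names."""
    conn.execute("CREATE TABLE IF NOT EXISTS toy_lock (id INTEGER PRIMARY KEY CHECK (id = 1))")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS toy_history (name TEXT PRIMARY KEY, checksum TEXT NOT NULL)"
    )
    conn.commit()

    # Acquire the lock: the single permitted row can only be inserted once.
    try:
        conn.execute("INSERT INTO toy_lock (id) VALUES (1)")
        conn.commit()
    except sqlite3.IntegrityError as exc:
        conn.rollback()
        raise RuntimeError("migration lock is already held") from exc

    applied_now = []
    try:
        already = {row[0] for row in conn.execute("SELECT name FROM toy_history")}
        for name, upgrade_sql in migrations:
            if name in already:
                continue  # not pending
            conn.executescript(upgrade_sql)  # execute pending SQL
            checksum = hashlib.sha256(upgrade_sql.encode("utf-8")).hexdigest()
            conn.execute(
                "INSERT INTO toy_history (name, checksum) VALUES (?, ?)", (name, checksum)
            )
            conn.commit()  # store record and checksum
            applied_now.append(name)
    finally:
        # Release the lock even if a migration failed.
        conn.execute("DELETE FROM toy_lock")
        conn.commit()
    return applied_now
```

Releasing the lock in a `finally` block mirrors the ordering in the list above: a failed migration should not leave the lock held.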
## Common first-run issues

- `No configuration found`: ensure your project has one discovered config source with `database_config(...)`
- `Database '' not found`: ensure `--database` matches configured `database_name`
- `No SQLAlchemy models found`: set `model_paths` explicitly in config

## Next Steps

- [Configuration](../configuration/index.md)
- [Your First Migration](first-migration.md)
- [Developing Locally](developing-locally.md)

========================================================================
PAGE: https://dbwarden.emiliano-go.com/getting-started/modeling/
========================================================================

# Modeling Guide

This guide walks through the process of defining SQLAlchemy models that DBWarden can read to generate migration SQL. For the complete reference of all supported Meta attributes, see [SQLAlchemy Models Reference](../models.md).

## How DBWarden Reads Models

DBWarden discovers models in the directories specified by `model_paths` in your `database_config(...)`. It reads two sources of metadata from each model:

1. **Column definitions**: typed SQLAlchemy `Mapped[...] = mapped_column(...)` fields, nullability, defaults, primary keys
2. **`class Meta` inner class**: backend-specific options like engine specs, partitioning, codecs

All backend-specific metadata uses the `class Meta` pattern. The `__table_args__` approach is not supported for PostgreSQL metadata. Using `mapped_column(info=...)` for backend-specific options raises `DBWardenConfigError`.

## Common Meta Attributes

Every backend supports a core set of cross-database attributes. These work with any `database_type`.
### Table-Level

```python
from sqlalchemy import Integer, String
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

from dbwarden.databases import TableMeta


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    email: Mapped[str] = mapped_column(String(255))

    class Meta(TableMeta):
        comment = "Core user accounts"
        indexes = [
            {"name": "ix_users_email", "columns": ["email"]},
        ]
```

Available table-level attributes: `comment`, `indexes`, `checks`, `uniques`. See [Common Meta Attributes](../models.md#common-meta-attributes) for details.

### Column-Level

```python
class Meta(TableMeta):
    class internal_note:
        comment = "Internal system note"
        public = False
```

Available column-level attributes: `comment`, `public`. Fields named with a leading `_` are implicitly `public=False`. For backend-specific column options, use `pg = pg.field(...)` for PostgreSQL. See [Column-Level Meta Base Class](../models.md#column-level-meta-base-class) for details.

## PostgreSQL Models

When `database_type="postgresql"`, use `class Meta(PGTableMeta)` for table-level metadata and `PGColumnMeta` inner classes for column-level metadata.

```python
from sqlalchemy import Integer, Text
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

from dbwarden.databases.pgsql import PGTableMeta, PGColumnMeta, pg


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    bio: Mapped[str] = mapped_column(Text)

    class Meta(PGTableMeta):
        pg_fillfactor = 80

        class id(PGColumnMeta):
            pg = pg.field(identity="always", identity_start=100)

        class bio(PGColumnMeta):
            pg = pg.field(storage="EXTENDED", compression="pglz")
```

See the [reference](../models.md#postgresql-model-metadata) for the full list of `PGTableMeta` and `PGColumnMeta` attributes, or the [PostgreSQL Deep Dive](../databases/postgresql.md) for DDL behavior and snapshot format.
## Using `generate-models` as a Starting Point

> **Note**: `generate-models` only works for databases with round trip support (PostgreSQL, SQLite, MySQL, ClickHouse). See [Round Trip Support](../databases/round-trip.md) for details.

The fastest way to get a correct model is to reverse-engineer it from your live database:

```bash
$ dbwarden generate-models -d primary --tables users,orders
```

DBWarden produces one `.py` file per table (or a single `models.py` with `--single-file`). The generated output includes `class Meta` with all detected backend-specific metadata.

Review the generated code before using it:

- Column types are mapped from database types to SQLAlchemy types. Verify the mapping is correct for your use case.
- Generated `class Meta` attributes are complete but may need adjustment (for example, you might want different index names or additional column hints).
- Partitioning, TTL, and engine settings are captured from the live database. If the database schema has drifted from what you intend, edit the model before running `make-migrations`.
## Auto-Generated Pydantic Schemas with `@auto_schema` Use `@auto_schema` to generate four Pydantic schema classes on your model: | Attribute | Contents | |-----------|---------| | `Model.Schema` | All mapped columns | | `Model.CreateSchema` | Excludes server-defaulted columns (PKs with identity, `server_default`) | | `Model.UpdateSchema` | All fields optional | | `Model.PublicSchema` | Excludes fields where `public=False` or name starts with `_` | ```python from sqlalchemy import Integer, String from sqlalchemy.orm import Mapped, mapped_column from dbwarden.databases import auto_schema @auto_schema class User(Base): __tablename__ = "users" id: Mapped[int] = mapped_column(Integer, primary_key=True) email: Mapped[str] = mapped_column(String(255)) password_hash: Mapped[str] = mapped_column(String(255)) class Meta: class email: comment = "Primary contact email" public = True class password_hash: public = False # PublicSchema excludes password_hash and any _prefixed fields public = User.PublicSchema(email="alice@example.com") ``` The decorator reads `class Meta` to infer `SchemaConfig`, then calls `schemap` to build the Pydantic models. Column `comment` values are injected into Pydantic field descriptions, and backend-specific metadata (`pg_*`, `my_*`, `ch_*`, `mdb_*`, `sq_*`) is included in `json_schema_extra.dbwarden_backend_meta`. To customize schema generation, pass a `SchemaConfig` explicitly: ```python from dbwarden.databases import auto_schema, SchemaConfig @auto_schema(config=SchemaConfig(exclude_public=["internal_note"])) class Order(Base): ... 
``` `SchemaConfig` supports the following fields: | Field | Type | Description | |-------|------|-------------| | `exclude_always` | `list[str]` | Excluded from all schemas | | `exclude_create` | `list[str]` | Excluded from CreateSchema only | | `exclude_update` | `list[str]` | Excluded from UpdateSchema only | | `exclude_public` | `list[str]` | Excluded from PublicSchema only | | `field_overrides` | `dict` | Override field types in generated schemas | | `required_always` | `list[str]` | Fields that are always required | | `optional_always` | `list[str]` | Fields that are always optional | ## When to Use Manual Migrations Auto-generated migrations handle most cases, but some schema changes still need manual intervention via `dbwarden new`: - PostgreSQL `USING` clause for type casts (e.g., casting `TEXT` to `INTEGER`). DBWarden emits `ALTER COLUMN ... TYPE` with a commented-out `-- USING col::newtype` line. Pass `--postgres-auto-using` to emit an active `USING` clause. - Column renames not caught by the heuristic auto-detection. Use `--rename old_name:new_name` flags for deterministic renames, or rename in a manual migration. - Data migrations (backfilling, transforming existing data). DBWarden emits a SQL comment placeholder. For these cases run `dbwarden new` and write the SQL by hand, or use the relevant flag for auto-generation. ## Best Practices - **One model class per table**: DBWarden discovers models by scanning directories. Each table should have exactly one model class. - **Use `model_paths`**: always set `model_paths` explicitly in `database_config(...)`. Auto-discovery is available but explicit paths are more predictable. - **Review generated migrations**: always read the `.sql` file before running `dbwarden migrate`. - **Use `--dev` for local development**: configure a `dev_database_url` (SQLite works well) and use `dbwarden --dev` to iterate quickly without touching your real database. 
- **Keep Meta classes minimal**: only set attributes that differ from the default. Default values are omitted from generated migrations, reducing noise.
- **Use `@auto_schema` for API projects**: it generates Pydantic schemas from your model annotations. Fields with `public=False` or a leading `_` are excluded from `PublicSchema`.

See also: [Cookbook: Models & Migrations](../cookbook/02-models-and-migrations.md)

Next, continue with [Your First Migration](first-migration.md).

========================================================================
PAGE: https://dbwarden.emiliano-go.com/getting-started/setup/
========================================================================

# Setup

This guide shows the initial project setup for DBWarden. By the end, you will have DBWarden installed, a project-local config file, and one verified database entry.

## Requirements

- Python 3.12 or higher
- A project that uses SQLAlchemy models, or plans to
- A supported backend: PostgreSQL, MySQL, MariaDB, SQLite, or ClickHouse

## Install DBWarden

Install the base package:

```bash
uv add dbwarden
```

Optional dependency groups:

| Group | Command | Use case |
|---|---|---|
| `fastapi` | `uv add "dbwarden[fastapi]"` | FastAPI session dependencies and runtime integration |
| `metrics` | `uv add "dbwarden[metrics]"` | Prometheus metrics |
| `sandbox` | `uv add "dbwarden[sandbox]"` | Sandbox migration testing |

You can combine them:

```bash
uv add "dbwarden[fastapi,metrics,sandbox]"
```

## Initialize the Project

```text
$ dbwarden init
Initialized DBWarden project structure
Created migrations directory
Created dbwarden.py
```

`init` creates the local migration layout and a config scaffold. It is safe to run again: DBWarden will not destroy existing `database_config(...)` definitions.

## Create the Configuration File

By default, `dbwarden init` creates `dbwarden.py`, and that is the simplest place to start.
However, `database_config(...)` can live in any discovered Python file inside your project. The simplest `dbwarden.py` looks like this: ```python from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/main", model_paths=["app.models"], model_tables=["users", "posts", "comments"], dev_database_type="sqlite", dev_database_url="sqlite:///./development.db", ) ``` Copy this into `dbwarden.py`, or another discovered Python module in your project, then adjust the URLs, `model_paths`, and `model_tables` for your project. ## Step by Step ### Step 1: Import `database_config` ```python from dbwarden import database_config ``` `database_config(...)` is the entry point for defining databases. Every configured database becomes part of the validated runtime config. ### Step 2: Define `database_name` ```python database_name="primary" ``` This is the stable name you will use in CLI commands such as: ```bash $ dbwarden status --database primary ``` ### Step 3: Mark the default database ```python default=True ``` Exactly one configured database must be the default. Commands without `--database` use that entry. ### Step 4: Set the backend type ```python database_type="postgresql" ``` This controls backend-specific SQL generation, schema inspection, and metadata behavior. Supported values: - `postgresql` - `mysql` - `mariadb` - `sqlite` - `clickhouse` ### Step 5: Set the runtime URL ```python database_url_sync="postgresql://user:password@localhost:5432/main" ``` This is the database URL used by CLI commands such as `make-migrations`, `migrate`, `status`, and `check`. 
If your application also uses async SQLAlchemy sessions, you can define an async URL too:

```python
database_url_async="postgresql+asyncpg://user:password@localhost:5432/main"
```

DBWarden keeps sync and async URLs separate so the CLI and FastAPI runtime can share one config source without forcing a single driver choice.

### Step 6: Point to your models

```python
model_paths=["app.models"]
```

This tells DBWarden where to discover SQLAlchemy models. In multi-database projects, explicit `model_paths` are required.

### Step 7: Filter tables for this database

```python
model_tables=["users", "posts", "comments"]
```

Optional. When set, DBWarden only includes the listed tables from the discovered models. All other discovered tables are ignored.

This is useful when multiple databases share the same `model_paths` but own different subsets of tables:

```python
from dbwarden import database_config

primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync="postgresql://user:password@localhost:5432/main",
    model_paths=["app.models"],
    model_tables=["users", "posts", "comments"],
)

analytics = database_config(
    database_name="analytics",
    database_type="clickhouse",
    database_url_sync="clickhouse://default:@localhost:8123/analytics",
    model_paths=["app.models"],
    model_tables=["events", "page_views"],
)

logs = database_config(
    database_name="logs",
    database_type="mysql",
    database_url_sync="mysql+pymysql://user:password@localhost:3306/logs",
    model_paths=["app.models"],
    model_tables=["audit_log", "error_log"],
)
```

If `model_tables` is not set, all discovered tables in `model_paths` belong to that database.

### Step 8: Configure a dev database

```python
dev_database_type="sqlite"
dev_database_url="sqlite:///./development.db"
```

This is optional, but recommended. It lets you run commands locally with `--dev`, without touching your main database.
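The `model_tables` rule from Step 7 can be modeled in a few lines. This is a sketch of the described behavior only: `tables_for_database` is a hypothetical helper, and treating `None` as "not set" is an assumption about how an omitted `model_tables` is represented.

```python
def tables_for_database(discovered, model_tables=None):
    """Return the discovered tables one database entry would own.

    Assumption: model_tables=None means "not set", so every discovered table belongs
    to the database; otherwise only the listed tables are kept.
    """
    if model_tables is None:
        return list(discovered)
    wanted = set(model_tables)
    return [table for table in discovered if table in wanted]


# Hypothetical discovery result for a shared "app.models" package.
discovered = ["users", "posts", "comments", "events", "page_views", "audit_log", "error_log"]

primary_tables = tables_for_database(discovered, ["users", "posts", "comments"])
analytics_tables = tables_for_database(discovered, ["events", "page_views"])
unfiltered_tables = tables_for_database(discovered)
```

With the three-database configuration above, each entry ends up with a disjoint subset of the same discovered models, which is the point of sharing `model_paths` while splitting by `model_tables`.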
## Verify the Configuration Run: ```text $ dbwarden settings show --all Database: primary Type: postgresql Default: true Sync URL: postgresql://user:password@localhost:5432/main Model paths: ['app.models'] Model tables: ['users', 'posts', 'comments'] Dev database type: sqlite Dev database URL: sqlite:///./development.db ``` If this command works, DBWarden can resolve and validate your config. ## Common Problems ### `No configuration found` DBWarden could not locate a config source. Make sure your project contains at least one discovered file with a `database_config(...)` call. `dbwarden.py` is the default convention, but it does not have to be the only location. ### `Exactly one default=True required` If you configure more than one database, only one can be the default. ### `model_paths is required when more than one database is configured` In multi-database setups, each database must declare the models that belong to it. Next, continue with [Modeling](modeling.md). ======================================================================== PAGE: https://dbwarden.emiliano-go.com/getting-started/workflows/ ======================================================================== # Workflows This guide covers larger day-to-day workflows once the basics are in place. ## Multi-Database Projects DBWarden can manage more than one database from one config source. 
```python
from dbwarden import database_config

primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync="postgresql://user:password@localhost:5432/main",
    model_paths=["app.models"],
    model_tables=["users", "posts", "comments"],
)

analytics = database_config(
    database_name="analytics",
    database_type="clickhouse",
    database_url_sync="clickhouse://default:@localhost:8123/analytics",
    model_paths=["app.analytics_models"],
    model_tables=["events", "page_views"],
)
```

Apply migrations per database:

```bash
$ dbwarden migrate --database primary
$ dbwarden migrate --database analytics
```

Show status across all configured databases:

```bash
$ dbwarden status --all
```

## Separate Model Sets

Each database should usually own a distinct model set through `model_paths`. When databases share the same models package, use `model_tables` to split ownership by table name.

DBWarden validates overlapping paths unless `overlap_models=True` is set explicitly. This prevents one model tree from being interpreted as belonging to multiple databases by accident.

## CI Workflows

A common CI pattern is:

```bash
$ dbwarden export-models --database primary
$ dbwarden make-migrations "ci validation" --offline --database primary
$ dbwarden check --database primary
```

This keeps schema generation deterministic and avoids depending on a live database in every pipeline step.

For a full example, see [Cookbook: Offline & CI](../cookbook/04-offline-ci.md).

## Sandbox Validation

Before applying migrations to a real environment, you can validate them in a temporary sandbox database.

```bash
$ dbwarden migrate --sandbox --database primary
```

This is especially useful for complex migrations, risky type changes, and CI gates.

See the [Architecture Deep Dive](../architecture-deep-dive.md) for a thorough explanation of sandbox validation.
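For the CI pattern above, a pipeline may want to fail when `check` reports a high-risk operation. The sketch below parses captured `check` output and decides whether to block. It assumes one `LEVEL description` line per operation with the labels `SAFE`, `WARN`, and `CRITICAL`, as in the sample output in Developing Locally; the actual label set and output format may differ, so treat `KNOWN_LEVELS`, `parse_check_output`, and `should_block` as hypothetical.

```python
# Assumption: these labels match what `dbwarden check` prints in your version.
KNOWN_LEVELS = {"SAFE", "WARN", "CRITICAL"}
BLOCKING_LEVELS = {"CRITICAL"}


def parse_check_output(text):
    """Turn 'LEVEL description' lines into (level, description) tuples."""
    findings = []
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in KNOWN_LEVELS:
            findings.append((parts[0], parts[1].strip()))
    return findings


def should_block(findings, blocking=BLOCKING_LEVELS):
    """Return True when any finding has a blocking level."""
    return any(level in blocking for level, _ in findings)


sample = """\
SAFE add column users.bio
WARN shrink varchar users.email
CRITICAL drop table audit_log
"""
findings = parse_check_output(sample)
```

In a pipeline, the text would come from the captured standard output of the `check` step, and a `True` result from `should_block` would translate into a non-zero exit code.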
## Baselines and Partial Applies

When integrating DBWarden into an existing environment, or when applying only part of a migration sequence, these patterns are common:

- `--baseline` marks the target migration as already applied without actually running it, useful for onboarding an existing database.
- `--partial` (via `--count` or `--to-version`) applies a subset of pending migrations instead of all of them.

```bash
$ dbwarden migrate --database primary --baseline --to-version 0005
$ dbwarden migrate --database primary --count 2
$ dbwarden rollback --database primary --to-version 0007
```

See the [CLI Reference](../cli-reference.md) for a full breakdown of these flags.

Use these modes carefully. They are operational tools, not everyday authoring commands.

## Operational Command Pattern

A typical production-safe pattern is:

```bash
$ dbwarden check --database primary
$ dbwarden make-migrations "release change" --database primary
$ dbwarden migrate --database primary
$ dbwarden status --database primary
$ dbwarden history --database primary
```

This keeps planning, execution, and verification as separate visible steps.

## Rollback Command Pattern

When validating rollback quality, use a loop like this:

```bash
$ dbwarden migrate --database primary
$ dbwarden rollback --count 1 --database primary
$ dbwarden migrate --database primary
```

This verifies both directions of the migration before a release depends on them.
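The rollback loop above is easy to script, for example as a pre-release check. The sketch below only builds and runs the three commands already shown; `rollback_loop_commands` and `run_rollback_loop` are hypothetical helper names, and the `runner` parameter exists so the sequence can be exercised without a database.

```python
import subprocess


def rollback_loop_commands(database):
    """Return the migrate -> rollback -> migrate command sequence as argv lists."""
    return [
        ["dbwarden", "migrate", "--database", database],
        ["dbwarden", "rollback", "--count", "1", "--database", database],
        ["dbwarden", "migrate", "--database", database],
    ]


def run_rollback_loop(database, runner=subprocess.run):
    """Run each command in order; with subprocess.run, check=True stops at the first failure."""
    for argv in rollback_loop_commands(database):
        runner(argv, check=True)
```

Calling `run_rollback_loop("primary")` in a staging job would stop at the first failing command, so a broken rollback section surfaces before a release depends on it.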
## Where to Go Next - Use [Cookbook Overview](../cookbook/index.md) for full working flows - Use [Configuration](../configuration/index.md) for deeper config behavior - Use [CLI Reference](../cli-reference.md) for command details ======================================================================== PAGE: https://dbwarden.emiliano-go.com/glossary/ ======================================================================== # Glossary ## A **Auto Schema** : A feature that generates Pydantic schemas from SQLAlchemy model annotations using `@auto_schema`, eliminating duplication between ORM and API layers in FastAPI applications. ## B **Backend** : A supported database type (PostgreSQL, MySQL, MariaDB, SQLite, ClickHouse). Each backend has specific DDL syntax, feature support, and round-trip capability. **Baseline** : A migration version used as a starting point, marking the point up to which existing migrations are considered already applied without executing the upgrade SQL. ## C **Checksum** : A SHA-256 hash stored when a migration file is applied. On subsequent runs, the checksum is recalculated to detect file tampering or accidental edits. **Check (command)** : Analyzes schema differences between SQLAlchemy models and a live database, classifying every operation by danger level. **Code Seed** : A seed defined as a Python class extending `Seed`, with `run()` and optional `rollback()` methods. The recommended way to manage seed data. **Configuration (dbwarden.py)** : DBWarden uses a Python file (`dbwarden.py`) with `database_config()` calls, providing type safety, runtime flexibility, and IDE support for configuring databases. ## D **Database Config** : A single `database_config()` call that defines one database. Each config includes the database name, type, connection URL, model paths, and optional dev mode settings. 
**Dev Mode**
: Using a different database type (typically SQLite) for local development while targeting a production database (e.g., PostgreSQL) in deployment, enabled via the `dev_database_type` and `dev_database_url` config options.

**Diff**
: A read-only command showing structural differences between SQLAlchemy models and a live database, with table, json, and sql output formats.

**Downgrade**
: Revert applied migrations to reach a specific target version by reading `-- rollback` sections and applying them in reverse order.

## F

**FastAPI Integration**
: DBWarden's built-in support for FastAPI including async sessions, health endpoints, migration endpoints, Prometheus metrics, and distributed locking. Configured once via `database_config()`.

## G

**generate-models**
: A command that reverse-engineers SQLAlchemy model code from a live database. It works for backends with round-trip support (PostgreSQL, SQLite, MySQL, ClickHouse).

## H

**Health Endpoints**
: Production-ready HTTP endpoints for database connectivity checks, migration state monitoring, and Kubernetes liveness/readiness probes.

## I

**Impact Analysis**
: The ability to find affected Python code references (function calls, class references, variable names) before deploying a migration, via the `check-impact` command.

**Index Spec**
: A class (`IndexSpec`) used in model `Meta` to define database indexes, including columns, uniqueness, and types.

## L

**Lock (Migration Lock)**
: A database-level lock that prevents concurrent schema mutations across multiple application instances or CLI invocations.

## M

**make-migrations**
: The command that generates SQL migration files by comparing current SQLAlchemy model definitions against either a live database or stored schema snapshots.

**Manual Migration**
: A migration file created by hand (via `dbwarden new`) rather than auto-generated. Useful for data migrations, stored procedures, or any DDL outside model diffs.
**Meta (class Meta)** : An inner class on SQLAlchemy models that provides DBWarden with backend-specific metadata like table comments, indexes, partitioning, engine options, and more. **Migration File** : A plain SQL file with `-- upgrade` and `-- rollback` sections. Each file represents one atomic schema change. **Multi-Database** : Managing multiple database configurations (e.g., primary + analytics, or microservice per database) from a single `dbwarden.py` config file. ## O **Observability** : DBWarden's monitoring capabilities including Prometheus metrics (migration counters, schema version gauges, connection pool health) and structured JSON logging. **Offline Mode** : Generating migrations using stored JSON schema snapshots instead of connecting to a live database, enabling CI/CD pipelines without database access. ## P **Pydantic Schema (auto-generated)** : Request/response schemas automatically generated from SQLAlchemy model annotations using `@auto_schema`, keeping API contracts in sync with database models. ## R **Rename Detection** : DBWarden can detect column and table renames by comparing schema snapshots, generating `ALTER TABLE ... RENAME` instead of `DROP` + `ADD`. **Rollback** : The `-- rollback` section of a migration file containing SQL to undo the upgrade. DBWarden enforces that every migration has a corresponding rollback. **Round-Trip** : A backend that supports both reading schema (via `generate-models`) and writing schema (via `make-migrations`/`migrate`). Verified when reverse-engineering a database and re-generating produces zero diff. **RA (Runs-Always) Migration** : A migration type that runs every time `migrate` is executed, regardless of previous application. Useful for idempotent operations like views or functions. **ROC (Runs-on-Change) Migration** : A migration type that runs only when its content has changed (detected via checksum). Useful for stored procedures that evolve over time. 
## S **Safety Check** : A feature that classifies every migration operation into danger levels (safe, caution, danger, manual) so teams can review high-risk changes before production. **Sandbox** : An isolated test environment using Testcontainers to validate migrations against a real database before applying them to production or staging. **Schema Snapshot** : A JSON file recording the full DDL state of a database at the point a migration was applied. Enables offline migration generation, rename detection, and CI workflows. **Seed** : Data population mechanism using either Python code seeds (recommended) or file-based SQL/Python seeds. Seeds are tracked and versioned like migrations. **SQL-First** : A design philosophy where all schema changes are expressed as explicit SQL files that can be reviewed, tested, and rolled back, rather than being abstracted away by an ORM. **SQL Translation** : The ability to generate SQL for one dialect (e.g., PostgreSQL) while operating against a different development database (e.g., SQLite), enabling local development without full infrastructure. ## T **TableMeta** : The base class for model `Meta` inner classes, providing type-safe configuration of table-level metadata like comments, indexes, partitioning, and engine options. ## U **Unlock** : The `dbwarden unlock` command to recover from a stale migration lock when no migration is actually running and the lock was not released properly. ======================================================================== PAGE: https://dbwarden.emiliano-go.com/ ========================================================================

# DBWarden

The SQL-first database toolkit for SQLAlchemy.

DBWarden is a SQL-first migration system for Python and SQLAlchemy projects. It is built for teams that want schema changes to remain explicit, reviewable, and operationally safe, from local development to production. ## What DBWarden Does - Generates migration files as plain SQL, with `-- upgrade` and `-- rollback` sections - Reads SQLAlchemy models and backend-specific metadata from `class Meta` - Supports PostgreSQL, MySQL, MariaDB, SQLite, and ClickHouse - Manages one or many databases from one typed config source - Adds safety tooling, schema diffing, seed tracking, status commands, and FastAPI integration ## The Core Workflow DBWarden keeps the migration lifecycle simple: 1. Define your models. 2. Generate SQL from the model diff. 3. Review the SQL file. 4. Apply it with the CLI. 5. Verify the result with status and history commands. 6. Roll it back when you need to validate recovery. ## Install ```bash uv add dbwarden ``` Optional groups: ```bash uv add "dbwarden[fastapi,metrics,sandbox]" ``` ## Quick Start ### Step 1: Configure a database Create `dbwarden.py`: ```python from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/main", model_paths=["app.models"], model_tables=["users", "posts"], dev_database_type="sqlite", dev_database_url="sqlite:///./development.db", ) ``` ### Step 2: Define your models ```python from datetime import datetime from sqlalchemy import DateTime, ForeignKey, Integer, String, Text from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column from dbwarden.databases import IndexSpec, TableMeta class Base(DeclarativeBase): pass class User(Base): __tablename__ = "users" id: Mapped[int] = mapped_column(Integer, primary_key=True) email: Mapped[str] = mapped_column(String(255), unique=True, nullable=False) bio: Mapped[str | None] = mapped_column(Text, nullable=True) class Meta(TableMeta): comment = 
"Core user accounts" class Post(Base): __tablename__ = "posts" id: Mapped[int] = mapped_column(Integer, primary_key=True) title: Mapped[str] = mapped_column(String(255), nullable=False) body: Mapped[str] = mapped_column(Text, nullable=False) user_id: Mapped[int] = mapped_column(ForeignKey("users.id"), nullable=False) created_at: Mapped[datetime] = mapped_column(DateTime, nullable=False) class Meta(TableMeta): indexes = [ IndexSpec(name="ix_posts_created_at", columns=["created_at"]), ] ``` ### Step 3: Generate a migration ```text $ dbwarden make-migrations "create core tables" --database primary Created migration: migrations/primary/primary__0001_create_core_tables.sql ``` ### Step 4: Apply it ```text $ dbwarden migrate --database primary Applying migration: primary__0001_create_core_tables.sql Migration applied successfully ``` ### Step 5: Verify the state ```text $ dbwarden status --database primary Database: primary Applied migrations: 1 Pending migrations: 0 ``` ## Why Teams Use It - SQL remains the source of truth - Rollback SQL is part of the workflow, not an afterthought - Multi-database projects stay under one migration tool - Safety tooling is built in, not bolted on later - FastAPI projects can use the same config for sessions, health checks, and migration endpoints ## Requirements - Python 3.12 or higher - SQLAlchemy models for model-driven migration generation - A supported backend: PostgreSQL, MySQL, MariaDB, SQLite, or ClickHouse ## Next Steps - Start with [Features](features.md) - Follow the guides in [Get Started](getting-started/setup.md) - Explore [Cookbook & Examples](cookbook/index.md) - Use [CLI Reference](cli-reference.md) as command lookup ======================================================================== PAGE: https://dbwarden.emiliano-go.com/installation/ ======================================================================== # Installation This guide covers installing DBWarden in your project and verifying it works correctly. 
## Requirements - Python 3.10 or higher - A project that uses SQLAlchemy for database models - uv or another Python package manager ## Install using uv ```bash uv add dbwarden ``` ### Development dependencies To also install testing and linting tools: ```bash uv add "dbwarden[dev]" ``` ### Optional dependency groups The `[postgres]` extra is the most commonly used. Install it if you are targeting PostgreSQL. | Group | Command | Provides | |------------------------|-------|-------| | `postgres` | `uv add "dbwarden[postgres]"` | PostgreSQL driver (`psycopg2-binary`) | | `mysql` | `uv add "dbwarden[mysql]"` | MySQL/MariaDB driver (`pymysql`) | | `clickhouse` | `uv add "dbwarden[clickhouse]"` | ClickHouse driver (`clickhouse-connect`) | | `fastapi` | `uv add "dbwarden[fastapi]"` | FastAPI session dependencies, health router, migration router, metrics router, Redis lock | | `metrics` | `uv add "dbwarden[metrics]"` | Prometheus metrics endpoint (`prometheus-client`) | | `sandbox` | `uv add "dbwarden[sandbox]"` | Sandbox migration testing via testcontainers | Combine groups as needed: ```bash uv add "dbwarden[postgres,mysql,fastapi]" ``` ## Database drivers DBWarden uses SQLAlchemy under the hood. 
The recommended way to install drivers is via the extras above: ```bash # PostgreSQL uv add "dbwarden[postgres]" # MySQL/MariaDB uv add "dbwarden[mysql]" # ClickHouse uv add "dbwarden[clickhouse]" # SQLite comes bundled with Python ``` You can also install drivers directly if you prefer: ```bash uv add psycopg2-binary # PostgreSQL uv add pymysql # MySQL / MariaDB uv add clickhouse-connect # ClickHouse ``` ## Verify installation After installing, confirm DBWarden is available: ```bash $ dbwarden version ``` You should see output: ``` 0.9.4 ``` ## Initialize in your project Create the DBWarden structure in your project directory: ```bash $ dbwarden init ``` This creates: - a `migrations/` directory structure - a `dbwarden.py` config scaffold (or discovers your existing config source) ## What happens during init When you run `init`, DBWarden: 1. Creates `migrations/` if missing 2. Creates `dbwarden.py` (or updates existing config source) with import scaffolding 3. Does not overwrite existing `database_config(...)` definitions you have added You can run `init` safely on an existing project - it is idempotent. ## Quick configuration After init, configure your first database by editing `dbwarden.py` (or your existing config file that contains `database_config(...)` calls): ```python from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/mydb", dev_database_type="sqlite", dev_database_url="sqlite:///./development.db", ) ``` The `dev_database_*` fields are optional but recommended - they enable fast local iterations with `--dev`. ## Verify configuration loads ```bash $ dbwarden settings show --all ``` You should see your database entry printed with type and URL. 
## Common installation issues **Command not found after uv add** - Ensure your virtual environment is activated - Try removing and re-adding: `uv remove dbwarden && uv add dbwarden` **Import errors or missing module warnings** - Upgrade uv and reinstall: `uv add --upgrade dbwarden` **Database driver errors** - Install the appropriate driver for your target database (see Database drivers section above) ## Upgrading To update to a newer version: ```bash uv add --upgrade dbwarden ``` Or with poetry: ```bash poetry update dbwarden ``` Check the release notes when upgrading major versions - there may be configuration or workflow changes. ======================================================================== PAGE: https://dbwarden.emiliano-go.com/migration-files/ ======================================================================== # Migration File Format Migration files are the execution contract in DBWarden. Everything that changes your database should be represented in explicit SQL files that can be reviewed, tested, and rolled back. ## File naming and location Versioned migrations are stored under each database migrations directory (default: `migrations/`). Canonical filename pattern: ```text {database_name}__{version}_{description}.sql ``` A legacy format without the `{database_name}__` prefix (e.g. `0001_create_users_table.sql`) is also accepted for backward compatibility. Examples: ```text primary__0001_initial_schema.sql primary__0002_add_users_table.sql analytics__0001_create_events.sql ``` When a migration is auto-generated with `make-migrations`, DBWarden also writes a companion plan file: ```text primary__0001_initial_schema.plan.json ``` That file captures machine-readable metadata for CI and debugging and is not executed by `migrate`. 
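As a rough illustration of the naming rules above, the sketch below classifies filenames with two regular expressions. The `parse_migration_filename` helper, the `MigrationName` tuple, and both patterns are hypothetical and not part of DBWarden's API. They assume the version is a run of digits and that repeatable files carry an `RA__` or `ROC__` marker after the optional database prefix, which is inferred from the filename examples on this page.

```python
import re
from typing import NamedTuple, Optional


class MigrationName(NamedTuple):
    database: Optional[str]  # None for the legacy format without a prefix
    kind: str                # "versioned", "RA", or "ROC"
    version: Optional[str]   # None for repeatable migrations
    description: str


# Assumed patterns, inferred from the documented filename examples.
_REPEATABLE = re.compile(r"^(?:(?P<db>.+?)__)?(?P<kind>RA|ROC)__(?P<desc>.+)\.sql$")
_VERSIONED = re.compile(r"^(?:(?P<db>.+?)__)?(?P<version>\d+)_(?P<desc>.+)\.sql$")


def parse_migration_filename(filename: str) -> Optional[MigrationName]:
    """Return the parts of a migration filename, or None if it is not one."""
    match = _REPEATABLE.match(filename)
    if match:
        return MigrationName(match["db"], match["kind"], None, match["desc"])
    match = _VERSIONED.match(filename)
    if match:
        return MigrationName(match["db"], "versioned", match["version"], match["desc"])
    return None  # for example, companion *.plan.json files


if __name__ == "__main__":
    for name in (
        "primary__0001_initial_schema.sql",
        "0001_create_users_table.sql",
        "primary__RA__refresh_active_users_view.sql",
        "primary__0001_initial_schema.plan.json",
    ):
        print(name, "->", parse_migration_filename(name))
```

The repeatable pattern is tried first so that an `RA__` or `ROC__` marker is never mistaken for part of a database name.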
## Required sections Each migration file must define both: ```sql -- upgrade -- rollback ``` - `-- upgrade`: statements applied during `migrate` - `-- rollback`: statements applied during `rollback` If rollback is weak or incomplete, production recovery is weak or incomplete. ## Migration classes DBWarden supports three execution classes: | Prefix | Class | Behavior | |--------|-------|----------| | `NNNN_` | Versioned | Runs once in ordered version sequence | | `RA__` | Runs always | Runs on every `migrate` execution | | `ROC__` | Runs on change | Runs when checksum changed | ### When to use each - `NNNN_`: schema evolution (tables, columns, indexes, constraints) - `RA__`: objects that should always be refreshed (views, grants) - `ROC__`: routines/policies that should apply only when content changes ## Execution model At runtime, DBWarden builds an execution plan from file discovery + migration metadata: 1. read versioned files and filter already-applied versions 2. include `RA__` files 3. include changed `ROC__` files 4. execute with lock protection 5. 
record metadata and checksums Conceptual plan: ```python def build_plan(directory, applied_versions): versioned = parse_versioned_files(directory) repeatable = parse_repeatable_files(directory) pending_versioned = [m for m in versioned if m.version not in applied_versions] pending_ra = repeatable.runs_always pending_roc = changed_only(repeatable.runs_on_change) return pending_versioned + pending_ra + pending_roc ``` ## Examples ### Versioned migration ```sql -- upgrade CREATE TABLE IF NOT EXISTS users ( id INTEGER PRIMARY KEY, email VARCHAR(255) NOT NULL UNIQUE, created_at DATETIME ); -- rollback DROP TABLE users; ``` ### Runs-always migration (`RA__`) Filename example: `primary__RA__refresh_active_users_view.sql` ```sql -- upgrade CREATE OR REPLACE VIEW active_users AS SELECT id, email FROM users WHERE is_active = TRUE; -- rollback DROP VIEW IF EXISTS active_users; ``` ### Runs-on-change migration (`ROC__`) Filename example: `primary__ROC__update_timestamp_trigger.sql` ```sql -- upgrade CREATE OR REPLACE FUNCTION update_updated_at() RETURNS TRIGGER AS $$ BEGIN NEW.updated_at = NOW(); RETURN NEW; END; $$ LANGUAGE plpgsql; -- rollback DROP FUNCTION IF EXISTS update_updated_at(); ``` ## Metadata headers Headers are parsed from migration file comments. The `-- seed` marker is recognised by tools; `-- depends_on` parsing is implemented but **not yet enforced** during migration execution (migrations run in filesystem-sort order by version). 
Dependency header (parsed but not enforced): ```sql -- depends_on: ["0004", "0005"] ``` Seed marker: ```sql -- seed ``` ## Authoring guidelines - One logical change per migration file - Keep DDL explicit; avoid hidden application-side schema effects - Keep rollback idempotent when possible (`IF EXISTS`, safe predicates) - For data migrations, use bounded, reversible operations - Prefer small migrations over large monolithic SQL scripts ## Review checklist Before merge: - upgrade section matches intended schema change - rollback section restores prior valid state - indexes/constraints/defaults are explicit - no environment-specific literals accidentally committed Before release: ```bash $ dbwarden status --database primary $ dbwarden migrate --database primary $ dbwarden rollback --database primary --count 1 $ dbwarden migrate --database primary ``` ======================================================================== PAGE: https://dbwarden.emiliano-go.com/models/ ======================================================================== # SQLAlchemy Models Reference This page is the **reference** for all supported Meta attributes across every backend. For a step-by-step walkthrough of defining models, see the [Modeling Guide](getting-started/modeling.md). DBWarden reads SQLAlchemy model metadata to generate migration SQL. Use `model_paths` in your `database_config(...)` entries to control discovery. 
```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:pass@localhost:5432/main", model_paths=["app.models"], model_tables=["users", "posts", "comments"], ) ``` ## Common Meta Attributes Every backend supports a core set of cross-database attributes via `class Meta(TableMeta)`: ### Table-level | Attribute | Type | SQL | Backends | |-----------|------|-----|----------| | `comment` | `str` | `COMMENT ON TABLE t IS '...'` | All | | `indexes` | `list[IndexSpec \| dict]` | `CREATE INDEX ...` | All | | `checks` | `list[dict]` | `ALTER TABLE t ADD CONSTRAINT ... CHECK (...)` | All | | `uniques` | `list[dict]` | `ALTER TABLE t ADD CONSTRAINT ... UNIQUE (...)` | All | ```python from sqlalchemy import Integer, String from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column from dbwarden.databases import TableMeta, IndexSpec class Base(DeclarativeBase): pass class User(Base): __tablename__ = "users" id: Mapped[int] = mapped_column(Integer, primary_key=True) email: Mapped[str] = mapped_column(String(255)) age: Mapped[int] = mapped_column(Integer) class Meta(TableMeta): comment = "Core user accounts" indexes = [ IndexSpec(name="ix_users_email", columns=["email"]), ] checks = [ {"name": "ck_users_age", "sql": "age >= 0"}, ] uniques = [ {"name": "uq_users_email", "columns": ["email"]}, ] ``` For a lighter syntax without the `IndexSpec` import, pass plain dicts for indexes: ```python indexes = [ {"name": "ix_users_email", "columns": ["email"]}, ] ``` `IndexSpec` accepts the same fields as the dict form with IDE autocomplete. Use it for cross-backend indexes shared by any `database_type`. 
The same dict shorthand applies to `checks` and `uniques`: ```python checks = [ {"name": "ck_users_age", "sql": "age >= 0"}, ] uniques = [ {"name": "uq_users_email", "columns": ["email"]}, ] ``` ### Column-level | Attribute | Type | SQL | Backends | |-----------|------|-----|----------| | `comment` | `str` | `COMMENT ON COLUMN t.c IS '...'` | All | | `public` | `bool` | Controls field visibility in schemap auto-schema | All | ```python class Meta(TableMeta): class internal_note: comment = "Internal system note" public = False ``` These attributes work with any `database_type`. Backend-specific subclasses (`PGTableMeta`, `MyTableMeta`, `CHTableMeta`) inherit all common attributes and add their own. ### Column-Level Meta Base Class For IDE autocomplete on column-level inner classes, use `PGColumnMeta` for PostgreSQL, `MyColumnMeta` for MySQL, `MdbColumnMeta` for MariaDB, or `CHColumnMeta` for ClickHouse. All inherit from `FieldMeta`, which defines cross-database attributes (`comment`, `public`) and backend-specific spec objects (`pg`, `ch`, `my`, `mdb`, `sq`): ```python from dbwarden.databases import FieldMeta from dbwarden.databases.pgsql import pg # Use typed spec objects for backend-specific column attributes: # pg = pg.field(collation=..., storage=..., ...) # ch = ch.field(codec=..., nullable=..., ...) ``` Backend-specific options are always set via a typed spec object attribute, never as flat attributes. For example, use `pg = pg.field(collation="en_US.UTF-8")` instead of the old `pg_collation = "en_US.UTF-8"`. 
### Backend Subpackages DBWarden organizes backend-specific types into subpackages under `dbwarden.databases`, also available there as short aliases: | Alias | Subpackage | Key types | |-------|------------|-----------| | `pg` | `dbwarden.databases.pgsql` | `PgFieldSpec`, `PgIndexSpec`, `PgTableSpec` | | `ch` | `dbwarden.databases.clickhouse` | `ChFieldSpec`, `ChIndexSpec`, `ChTableSpec` | | `my` | `dbwarden.databases.mysql` | `MyFieldSpec`, `MyTableSpec` | | `mdb` | `dbwarden.databases.mariadb` | `MdbFieldSpec`, `MdbTableSpec` | | `sq` | `dbwarden.databases.sqlite` | `SqFieldSpec`, `SqTableSpec` | Only `IndexSpec`, `PgIndexSpec`, and `ChIndexSpec` exist as typed index spec classes. MySQL, MariaDB, and SQLite use the base `IndexSpec` with the `indexes` attribute or plain dicts in their backend-specific index list (`my_indexes`, `sq_indexes`). ```python from dbwarden.databases.pgsql import pg from dbwarden.databases.clickhouse import ch from dbwarden.databases.mysql import my from dbwarden.databases.mariadb import mdb from dbwarden.databases.sqlite import sq # Use pg.field(), ch.field() for column-level metadata pg_spec = pg.field(collation="en_US.UTF-8", storage="PLAIN") ch_spec = ch.field(codec="ZSTD(3)", nullable=True) ``` ## PostgreSQL Model Metadata When `database_type="postgresql"`, DBWarden supports first-class PostgreSQL metadata via `class Meta(PGTableMeta)` inner classes. This is the **only** supported surface: `mapped_column(info=...)` raises `DBWardenConfigError`. 
### Table-Level Meta Inherit from `PGTableMeta` on your `class Meta`: ```python from sqlalchemy import Integer from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column from dbwarden.databases.pgsql import PGTableMeta class Base(DeclarativeBase): pass class User(Base): __tablename__ = "users" id: Mapped[int] = mapped_column(Integer, primary_key=True) class Meta(PGTableMeta): pg_fillfactor = 80 pg_tablespace = "fastspace" ``` `PGTableMeta` inherits all common `TableMeta` attributes (`comment`, `indexes`, `checks`, `uniques`) and adds PostgreSQL-specific ones (`pg_fillfactor`, `pg_tablespace`, `pg_unlogged`, `pg_partition`, `pg_inherits`, `pg_excludes`, `pg_indexes`, `pg_checks`, `pg_uniques`). For PostgreSQL-specific indexes, use `PgIndexSpec` in `pg_indexes`: ```python from dbwarden.databases.pgsql import PgIndexSpec class Meta(PGTableMeta): pg_indexes = [ PgIndexSpec("ix_users_email", ["email"], unique=True, using="gin"), ] ``` `PgIndexSpec` supports operator classes via `postgresql_ops` for GIN indexes on JSONB columns: ```python PgIndexSpec("ix_users_data", ["data"], using="gin", postgresql_ops={"data": "jsonb_path_ops"}) ``` This generates `CREATE INDEX ... ON users USING GIN (data jsonb_path_ops)`. Full `PgIndexSpec` constructor fields: `name`, `columns`, `unique`, `using`, `where`, `include`, `with_params`, `tablespace`, `nulls_not_distinct`, `column_sorting`, `postgresql_ops`, `concurrently`. See [PostgreSQL Deep Dive](databases/postgresql.md) for details. ### Column-Level Meta Use `PGColumnMeta` inner classes named after the column. 
Use `pg = pg.field(...)` to set column-level options:

```python
from sqlalchemy import Integer, Text
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

from dbwarden.databases.pgsql import PGTableMeta, PGColumnMeta, pg


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    bio: Mapped[str] = mapped_column(Text)

    class Meta(PGTableMeta):
        class id(PGColumnMeta):
            pg = pg.field(identity="always", identity_start=100)

        class bio(PGColumnMeta):
            pg = pg.field(storage="EXTENDED", collation="en_US.UTF-8")
```

`PGColumnMeta` includes the common `comment` and `public` attributes plus a `pg` attribute of type `PgFieldSpec` that bundles all PostgreSQL-specific column options. For the full list of supported attributes, see [PostgreSQL Deep Dive](databases/postgresql.md).

## ClickHouse Model Metadata

When `database_type="clickhouse"`, DBWarden supports first-class ClickHouse metadata via `class Meta(CHTableMeta)` inner classes. This is the **only** supported surface: passing options via `mapped_column(info=...)` raises `DBWardenConfigError`.
### Table-Level Meta Inherit from `CHTableMeta` on your `class Meta`: ```python from datetime import date from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column from dbwarden.databases.clickhouse import CHTableMeta, ChEngineSpec class Base(DeclarativeBase): pass class Event(Base): __tablename__ = "events" id: Mapped[int] = mapped_column(Int64, primary_key=True) event_date: Mapped[date] = mapped_column(Date) payload: Mapped[str] = mapped_column(String) class Meta(CHTableMeta): ch_engine = ChEngineSpec("ReplacingMergeTree", args=("version_column",)) ch_order_by = ["region", "event_time"] ch_primary_key = "region" ch_partition_by = "toYYYYMM(event_time)" ch_sample_by = "intHash64(user_id)" ch_ttl = [ "event_time + INTERVAL 1 MONTH DELETE", "event_time + INTERVAL 1 YEAR TO DISK 'cold'", ] ch_settings = {"index_granularity": "8192"} ``` `CHTableMeta` inherits all common `TableMeta` attributes (`comment`, `indexes`, `checks`, `uniques`) and adds ClickHouse-specific ones (`ch_engine`, `ch_order_by`, `ch_primary_key`, `ch_partition_by`, `ch_sample_by`, `ch_ttl`, `ch_settings`, `ch_object_type`, `ch_select_statement`, `ch_to_table`, `ch_dictionary`, `ch_dict_layout`, `ch_dict_source`, `ch_dict_lifetime`, `ch_dict_primary_key`, `ch_projections`, `ch_zookeeper_path`, `ch_replica_name`). For the full list of supported attributes, see [ClickHouse Deep Dive](databases/clickhouse.md). ### Column-Level Meta Use `CHColumnMeta` inner classes named after the column. 
Use `ch = ch.field(...)` to set column-level options: ```python from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column from dbwarden.databases.clickhouse import CHTableMeta, CHColumnMeta, ChEngineSpec, ch class Base(DeclarativeBase): pass class Event(Base): __tablename__ = "events" id: Mapped[int] = mapped_column(Int64, primary_key=True) payload: Mapped[str] = mapped_column(String) tags: Mapped[list[str]] = mapped_column(ARRAY(String)) class Meta(CHTableMeta): ch_engine = ChEngineSpec("MergeTree") ch_order_by = "event_time" class payload(CHColumnMeta): ch = ch.field(codec="ZSTD(3)", nullable=False) class tags(CHColumnMeta): ch = ch.field(low_cardinality=True) ``` `CHColumnMeta` includes the common `comment` and `public` attributes plus a `ch` attribute of type `ChFieldSpec` that bundles all ClickHouse-specific column options. ### Engine Spec Use `ChEngineSpec` for the table engine: ```python from dbwarden.databases.clickhouse import ChEngineSpec # Simple engine ch_engine = ChEngineSpec("MergeTree") # Engine with arguments ch_engine = ChEngineSpec("ReplacingMergeTree", args=("version_column",)) # Replicated engine ch_engine = ChEngineSpec("ReplicatedMergeTree", zookeeper_path="/clickhouse/tables/shard1/events", replica_name="{replica}") # Distributed engine with settings ch_engine = ChEngineSpec("Distributed", args=("cluster", "db", "events", "rand()"), settings={"insert_distributed_sync": "1"}) ``` For replicated engines, `ch_zookeeper_path` and `ch_replica_name` are injected as the first two engine arguments. If `args` contains existing positional arguments, they come after the ZooKeeper path and replica name. 
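The argument ordering for replicated engines can be pictured with a small sketch. The `render_engine` function below is a hypothetical stand-in written for this page, not DBWarden's renderer; in particular, the quoting of the ZooKeeper path and replica name and the unquoted pass-through of other arguments are assumptions.

```python
from typing import Optional, Sequence


def render_engine(
    name: str,
    args: Sequence[str] = (),
    zookeeper_path: Optional[str] = None,
    replica_name: Optional[str] = None,
) -> str:
    """Render an ENGINE expression with the replication arguments placed first."""
    rendered = []
    if zookeeper_path is not None and replica_name is not None:
        # Replication parameters are string literals, so they are quoted here.
        rendered.append(f"'{zookeeper_path}'")
        rendered.append(f"'{replica_name}'")
    # Remaining positional arguments follow, emitted as given (assumed to be
    # column names or expressions that need no quoting).
    rendered.extend(args)
    return f"{name}({', '.join(rendered)})"


if __name__ == "__main__":
    print(
        render_engine(
            "ReplicatedReplacingMergeTree",
            args=("version_column",),
            zookeeper_path="/clickhouse/tables/shard1/events",
            replica_name="{replica}",
        )
    )
```

Under these assumptions the existing `version_column` argument ends up third, after the path and the replica name.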
### Projections Use `ProjectionSpec` in `ch_projections`: ```python from dbwarden.databases.clickhouse import ProjectionSpec class Meta(CHTableMeta): ch_order_by = ["author", "created_at"] ch_projections = [ ProjectionSpec("by_author", "SELECT * ORDER BY author"), ProjectionSpec("daily_stats", "SELECT toDate(created_at) AS day, count() GROUP BY day"), ] ``` Current behavior: - projection definitions are rendered into generated ClickHouse DDL - safety checks classify added projections as `INFO` - removed projections are classified as `WARNING` ### Skip Indexes Use `ChIndexSpec` in `ch_indexes`: ```python from dbwarden.databases.clickhouse import ChIndexSpec class Meta(CHTableMeta): ch_indexes = [ ChIndexSpec("ix_payload", ["payload"], type="bloom_filter", granularity=1), ] ``` ### Materialized Views Materialized views use `ch_select_statement` and optionally `ch_to_table`: ```python class EventRollup(Base): __tablename__ = "event_rollup_mv" event_date: Mapped[date] = mapped_column(Date) total: Mapped[int] = mapped_column(Int64) class Meta(CHTableMeta): ch_object_type = "materialized_view" ch_select_statement = ( "SELECT toDate(event_time) AS event_date, count() AS total " "FROM events GROUP BY event_date" ) ch_to_table = "mv_target" ``` When `ch_to_table` is set, the `ENGINE` clause is omitted (ClickHouse rejects `ENGINE` with `TO`). 
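A minimal sketch of the `TO` versus `ENGINE` rule follows. `render_materialized_view` is a hypothetical helper written for illustration and does not reflect how DBWarden builds its DDL; it also ignores clauses such as `ORDER BY` that a real `MergeTree` view would need.

```python
from typing import Optional


def render_materialized_view(
    name: str,
    select_statement: str,
    to_table: Optional[str] = None,
    engine: Optional[str] = None,
) -> str:
    """Build a simplified CREATE MATERIALIZED VIEW statement."""
    parts = [f"CREATE MATERIALIZED VIEW {name}"]
    if to_table is not None:
        # With a TO target the view has no storage of its own, so any
        # ENGINE value is dropped.
        parts.append(f"TO {to_table}")
    elif engine is not None:
        parts.append(f"ENGINE = {engine}")
    parts.append(f"AS {select_statement}")
    return " ".join(parts)


if __name__ == "__main__":
    select = (
        "SELECT toDate(event_time) AS event_date, count() AS total "
        "FROM events GROUP BY event_date"
    )
    print(render_materialized_view("event_rollup_mv", select, to_table="mv_target", engine="SummingMergeTree"))
```

Passing both `to_table` and `engine` yields a statement with only the `TO` clause, mirroring the behavior described above.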
### Dictionaries ClickHouse dictionaries use `ch_dictionary = True` with related `ch_dict_*` fields: ```python class CountryCode(Base): __tablename__ = "country_codes" code: Mapped[str] = mapped_column(String) name: Mapped[str] = mapped_column(String) class Meta(CHTableMeta): ch_dictionary = True ch_dict_layout = "FLAT()" ch_dict_source = "CLICKHOUSE(HOST 'localhost' TABLE 'countries')" ch_dict_lifetime = "MIN 0 MAX 3600" ch_dict_primary_key = "code" ``` Required fields when `ch_dictionary = True`: | Field | Description | Example | |-------|-------------|---------| | `ch_dict_layout` | Dictionary layout | `"FLAT()"`, `"COMPLEX_KEY_HASHED()"` | | `ch_dict_source` | Source configuration | `"CLICKHOUSE(HOST '...' TABLE '...')"` | | `ch_dict_lifetime` | Cache lifetime | `"MIN 0 MAX 3600"` or `3600` | Optional field: | Field | Description | Default | |-------|-------------|---------| | `ch_dict_primary_key` | Primary key expression | First column | Column types render as CH-native types (`Int64`, `String`). 
### Column Hints Use `CHColumnMeta` inner classes for per-column hints instead of `info={}`: ```python from dbwarden.databases.clickhouse import ch class Meta(CHTableMeta): class payload(CHColumnMeta): ch = ch.field(codec="ZSTD(3)", low_cardinality=True, nullable=False) ``` Supported `ch.field()` options: | Keyword | Type | Description | Example | |---------|------|-------------|---------| | `codec` | `str` | Compression codec | `"ZSTD(3)"` | | `default_expression` | `str` | Default value expression | `"now()"` | | `materialized` | `str` | Materialized expression | `"lower(name)"` | | `alias` | `str` | Alias expression | `"concat(a, b)"` | | `ttl` | `str` | Column TTL expression | `"event_time + INTERVAL 1 YEAR"` | | `low_cardinality` | `bool` | Wrap type in LowCardinality | `True` | | `nullable` | `bool` | Wrap type in Nullable | `True` | ## MySQL Model Metadata When `database_type="mysql"` (or `"mariadb"`), DBWarden supports first-class MySQL metadata via `class Meta(MyTableMeta)` inner classes. This is the **only** supported surface: `mapped_column(info=...)` raises `DBWardenConfigError`. ### Table-Level Meta Inherit from `MyTableMeta` on your `class Meta`: ```python from sqlalchemy import Integer, String from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column from dbwarden.databases.mysql import MyTableMeta class Base(DeclarativeBase): pass class User(Base): __tablename__ = "users" id: Mapped[int] = mapped_column(Integer, primary_key=True) email: Mapped[str] = mapped_column(String(255)) class Meta(MyTableMeta): my_engine = "InnoDB" my_charset = "utf8mb4" my_collate = "utf8mb4_unicode_ci" my_row_format = "DYNAMIC" my_auto_increment = 1000 comment = "Core user accounts" ``` `MyTableMeta` inherits all common `TableMeta` attributes (`comment`, `indexes`, `checks`, `uniques`) and adds MySQL-specific ones (`my_engine`, `my_charset`, `my_collate`, `my_row_format`, `my_auto_increment`). 
For MariaDB, use `MdbTableMeta` which extends `MyTableMeta` with `mdb_page_compressed` and `mdb_page_compression_level`. ### Column-Level Meta Use `MyColumnMeta` inner classes named after the column. Use `my = my.field(...)` to set column-level MySQL options: ```python from sqlalchemy import Integer, String, TIMESTAMP from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column from dbwarden.databases.mysql import MyTableMeta, MyColumnMeta, my class Base(DeclarativeBase): pass class User(Base): __tablename__ = "users" id: Mapped[int] = mapped_column(Integer, primary_key=True) email: Mapped[str] = mapped_column(String(255)) updated_at: Mapped[str] = mapped_column(TIMESTAMP) class Meta(MyTableMeta): class id(MyColumnMeta): comment = "Primary key" my = my.field(unsigned=True) class email(MyColumnMeta): my = my.field(charset="utf8mb4", collate="utf8mb4_unicode_ci") class updated_at(MyColumnMeta): my = my.field(on_update="CURRENT_TIMESTAMP") ``` Supported `my.field()` options: | Keyword | Type | Description | Example | |---------|------|-------------|---------| | `unsigned` | `bool` | `UNSIGNED` on integer columns | `unsigned=True` | | `charset` | `str` | Per-column character set | `charset="utf8mb4"` | | `collate` | `str` | Per-column collation | `collate="utf8mb4_unicode_ci"` | | `on_update` | `str` | `ON UPDATE` expression (typically for TIMESTAMP) | `on_update="CURRENT_TIMESTAMP"` | For MariaDB, use `MdbColumnMeta` and `mdb.field()` which extends `my.field()` with `invisible` and `sequence` options. Cross-backend column attributes (`comment`, `public`) are set directly on the inner class, not on the spec object. ======================================================================== PAGE: https://dbwarden.emiliano-go.com/observability/ ======================================================================== # Observability DBWarden provides Prometheus metrics and structured JSON logging for monitoring and debugging. 
## Prometheus metrics ### Installation Install the optional metrics dependency: ```bash uv add "dbwarden[metrics]" ``` This installs `prometheus-client` which is required for metric collection and exposition. ### Enabling metrics Set the `DBWARDEN_METRICS` environment variable to `true`: ```bash export DBWARDEN_METRICS=true ``` When enabled, DBWarden instruments the `migrate` and `seed apply` commands with Prometheus metric recording. When disabled (or when `prometheus_client` is not installed), all metric functions are safe no-ops. ### Available metrics | Metric | Type | Labels | Description | |--------|------|--------|-------------| | `dbwarden_migrations_total` | Counter | `database`, `version` | Total migrations applied | | `dbwarden_migration_duration_seconds` | Histogram | `database` | Duration of migration operations | | `dbwarden_schema_version` | Gauge | `database` | Current schema version | | `dbwarden_seed_version` | Gauge | `database` | Current seed version | | `dbwarden_pending_migrations` | Gauge | `database` | Number of pending migrations | | `dbwarden_migration_errors_total` | Counter | `database` | Total migration errors | ### FastAPI metrics endpoint The `MetricsRouter` exposes a `GET /metrics` endpoint in Prometheus text format: ```python from fastapi import FastAPI from dbwarden.fastapi import MetricsRouter app = FastAPI() app.include_router(MetricsRouter(), prefix="/metrics") ``` The endpoint returns: ``` # HELP dbwarden_pending_migrations Number of pending migrations # TYPE dbwarden_pending_migrations gauge dbwarden_pending_migrations{database="primary"} 0 # HELP dbwarden_schema_version Current schema version # TYPE dbwarden_schema_version gauge dbwarden_schema_version{database="primary"} 5.0 ``` Only active when `prometheus_client` is installed and `DBWARDEN_METRICS=true` is set. Returns 404 when disabled. 
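The "safe no-op" behavior can be approximated with a small gate. The names `metrics_enabled` and `noop_when_disabled` are hypothetical and not taken from DBWarden's source, and treating only the case-insensitive value `true` as "on" is an assumption; the sketch only shows the general pattern of checking both the environment variable and the optional dependency.

```python
import functools
import importlib.util
import os
from typing import Any, Callable


def metrics_enabled() -> bool:
    """True only when the flag is set and prometheus_client can be imported."""
    # Assumption: only the literal value "true" (any case) enables metrics.
    flag = os.environ.get("DBWARDEN_METRICS", "").strip().lower() == "true"
    has_client = importlib.util.find_spec("prometheus_client") is not None
    return flag and has_client


def noop_when_disabled(func: Callable[..., Any]) -> Callable[..., Any]:
    """Skip the wrapped metric function entirely when metrics are disabled."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        if not metrics_enabled():
            return None  # safe no-op
        return func(*args, **kwargs)

    return wrapper


@noop_when_disabled
def record_migration(database: str, version: str) -> str:
    # Placeholder body: a real implementation would increment a counter here.
    return f"recorded {database}@{version}"


if __name__ == "__main__":
    os.environ["DBWARDEN_METRICS"] = "false"
    print(record_migration("primary", "0003"))
```

Because the check runs on every call, toggling the environment variable takes effect without re-importing anything; a real implementation might cache the result instead.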
### MetricsMiddleware The `MetricsMiddleware` is an ASGI middleware that refreshes pending-migration gauges on each HTTP request: ```python from fastapi import FastAPI from dbwarden.fastapi import MetricsMiddleware, MetricsRouter app = FastAPI() app.add_middleware(MetricsMiddleware) app.include_router(MetricsRouter(), prefix="/metrics") ``` The middleware also records HTTP request duration via the migration duration histogram. ## JSON logging DBWarden supports structured JSON logging for integration with log aggregation systems (ELK, Loki, Datadog, etc.). ### Enabling JSON logging Set the `DBWARDEN_LOG_JSON` environment variable to `true`: ```bash export DBWARDEN_LOG_JSON=true ``` When enabled, all DBWarden log output uses newline-delimited JSON format: ```json {"timestamp": "2025-06-01T10:00:00.123456", "level": "INFO", "logger": "dbwarden", "message": "Applying migration 0003", "db_name": "primary", "db_type": "postgresql"} {"timestamp": "2025-06-01T10:00:01.234567", "level": "INFO", "logger": "dbwarden", "message": "Migration 0003 applied successfully", "db_name": "primary", "db_type": "postgresql"} ``` ### JSON log fields | Field | Description | |-------|-------------| | `timestamp` | ISO-8601 timestamp with microseconds | | `level` | Log level (DEBUG, INFO, WARNING, ERROR) | | `logger` | Logger name | | `message` | Log message text | | `db_name` | Database name (when applicable) | | `db_type` | Database type (when applicable) | | `exception` | Exception traceback (when applicable) | ## Environment variables reference | Variable | Value | Effect | |----------|-------|--------| | `DBWARDEN_METRICS` | `true` | Enable Prometheus metric recording and exposition | | `DBWARDEN_LOG_JSON` | `true` | Enable JSON-formatted log output | | `DBWARDEN_MIGRATE_AUTH` | API key string | Require `X-API-Key` header for `POST /migrate` endpoint | | `DBWARDEN_HEALTH_AUTH` | API key string | Require `X-API-Key` header for health endpoints | See also: [Cookbook: 
Observability](../cookbook/11-observability.md) ======================================================================== PAGE: https://dbwarden.emiliano-go.com/reference/configuration-api/ ======================================================================== # Configuration API Reference Complete reference for the `database_config()` function. This is a reference page. For step-by-step guides, see [Quick Start](../configuration/quick-start.md), [Concepts](../configuration/concepts.md), or [Production Patterns](../configuration/production-patterns.md). ## Function Signature ```python def database_config( *, database_name: str, database_type: Literal["sqlite", "postgresql", "mysql", "mariadb", "clickhouse"] = "sqlite", database_url_sync: str | None = None, database_url_async: str | None = None, default: bool = False, migrations_dir: str | None = None, migration_table: str | None = None, seed_table: str | None = None, auto_apply_seeds: bool = False, model_paths: list[str] | None = None, model_tables: list[str] | None = None, dev_database_type: str | None = None, dev_database_url: str | None = None, overlap_models: bool = False, secure_values: bool = False, ) -> DatabaseHandle: """Register a database in DBWarden and return a handle with session dependencies.""" ``` ## Required arguments | Argument | Type | Description | |----------|------|-------------| | `database_name` | `str` | unique name for this database in your project | | `database_type` | `str` | backend type: `sqlite`, `postgresql`, `mysql`, `mariadb`, or `clickhouse` (default: `"sqlite"`) | At least one of `database_url_sync` or `database_url_async` must be provided. 
## Optional arguments | Argument | Type | Default | Description | |----------|------|---------|-------------| | `database_url_sync` | `str | None` | `None` | synchronous connection URL (used by migrations, CLI, and sync sessions) | | `database_url_async` | `str | None` | `None` | async connection URL (used by async sessions; falls back to `database_url_sync` if omitted) | | `default` | `bool` | `False` | if `True`, this database is used when `--database` is omitted | | `migrations_dir` | `str | None` | `None` | custom migration directory path (defaults to `migrations/`) | | `migration_table` | `str | None` | `None` | custom migration tracking table name (defaults to `_dbwarden_migrations`) | | `seed_table` | `str | None` | `None` | custom seed tracking table name (defaults to `_dbwarden_seeds`) | | `auto_apply_seeds` | `bool` | `False` | if `True`, automatically apply pending code seeds after `migrate` | | `model_paths` | `list[str] | None` | `None` | list of Python import paths containing SQLAlchemy models for this database | | `model_tables` | `list[str] | None` | `None` | optional filter: only these table names are owned by this database | | `dev_database_type` | `str | None` | `None` | backend type for local development (used with `--dev`) | | `dev_database_url` | `str | None` | `None` | connection URL for local development (used with `--dev`) | | `overlap_models` | `bool` | `False` | if `True`, allow model path overlap with other databases | | `secure_values` | `bool` | `False` | if `True`, display commands show variable names instead of resolved values | ## Field descriptions ### `database_name` A unique identifier for this database within your project. 
**Requirements:**

- Must be unique across all entries in your config source
- Used in CLI `--database` / `-d` flags to select this database
- Becomes part of the migration filename prefix (for versioned migrations)

**Examples:**

```python
database_name="primary"
database_name="analytics"
database_name="legacy"
```

Use descriptive names that reflect the database's purpose: `primary`, `analytics`, `audit_logs`, etc.

### `database_type`

The database backend technology. Each value determines:

- URL parsing behavior
- SQL dialect and syntax handling
- Available features (transactions, DDL, constraints)

Valid values: `sqlite`, `postgresql`, `mysql`, `mariadb`, `clickhouse`

### `database_url_sync` and `database_url_async`

Connection URL strings in the format:

```
[dialect+driver://user:password@]host[:port][/database][?options]
```

- **`database_url_sync`**: used by CLI commands (`migrate`, `init`, etc.) and any sync session
- **`database_url_async`**: used by async sessions (FastAPI); falls back to `database_url_sync` if omitted

At least one must be provided. If only `database_url_sync` is given, async sessions will use it (with async driver substitution, for example `postgresql://...` → `postgresql+asyncpg://...`).

Examples:

```python
# Both sync and async (recommended for FastAPI projects)
database_url_sync = "postgresql://user:password@localhost:5432/mydb"
database_url_async = "postgresql+asyncpg://user:password@localhost:5432/mydb"

# Sync only (CLI-only projects)
database_url_sync = "postgresql://user:password@localhost:5432/mydb"

# SQLite (relative path)
database_url_sync = "sqlite:///./development.db"

# MySQL
database_url_sync = "mysql://user:password@localhost:3306/mydb"

# ClickHouse
database_url_sync = "http://user:password@clickhouse-host:8123/mydb"
```

### `default`

When `True`, this database is selected when `--database` / `-d` is not specified.

**Rule:** Exactly one entry must have `default=True`.
**Example:**

```python
# Primary is default
primary = database_config(database_name="primary", default=True, ...)
analytics = database_config(database_name="analytics", ...)  # default=False implied
```

Exactly one database must have `default=True`. Having zero or multiple defaults will cause a validation error.

### `migrations_dir`

Path where this database's migration files are stored.

- Defaults to `migrations/`
- Each database should have its own directory to avoid collision
- Versioned migration files go here (`NNNN_description.sql`)
- Repeatable migration files go here (`RA__*.sql`, `ROC__*.sql`)

### `model_paths`

A list of Python import paths where DBWarden should discover SQLAlchemy model definitions.

**When required:**

- **Single database:** Optional (DBWarden scans entire codebase)
- **Multiple databases:** Required for each database

**How it works:** DBWarden imports each path and inspects classes inheriting from `DeclarativeBase` or `declarative_base()`.

**Examples:**

```python
# Single module
model_paths=["app.models"]

# Multiple modules
model_paths=["app.models.primary", "app.legacy"]

# Nested modules
model_paths=["app.models.api.v1", "app.models.api.v2"]
```

Specifying `model_paths` makes discovery faster and more predictable, even for single-database projects.

See [Multi-Database Guide](../configuration/multi-database.md) for organizing models across databases.

### `model_tables`

A downstream filter applied after model discovery. Only tables whose string name appears in this list are owned by this database. All other discovered tables are ignored.

**When to use:**

- **Multi-database shared `model_paths`:** Two databases share the same import path but own different subsets of tables.
- **Selective deployment:** A microservice owns only a few tables from a shared models package.

**How it works:**

1. DBWarden discovers all models via `model_paths`
2. If `model_tables` is set, it validates every name exists among the discovered tables
3.
Only the matching tables participate in migrations, diffs, and exports **Example:** ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/myapp", model_paths=["app.models"], model_tables=["users", "posts", "comments"], ) audit = database_config( database_name="audit", database_type="postgresql", database_url_sync="postgresql://localhost/audit", model_paths=["app.models"], model_tables=["audit_logs"], ) ``` **Overlap validation:** If two databases both set `model_tables` with overlapping names, DBWarden raises an error (same behavior as `model_paths` overlap). Set `overlap_models=True` to allow it. **Must be valid SQL identifiers.** Dotted (schema-qualified) names are not supported in the initial release. ### `migration_table` Name of the table DBWarden uses to record applied migrations and repeatable migration checksums. - Defaults to `_dbwarden_migrations` - Must be a valid SQL identifier - Applies per database entry - Only affects migration tracking metadata; lock tables are separate **Example:** ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/myapp", migration_table="custom_migrations", ) ``` Use this when: - integrating with an existing database that already reserves a migrations table name - isolating DBWarden metadata under a project-specific convention ### `seed_table` Name of the table DBWarden uses to record applied seeds. - Defaults to `_dbwarden_seeds` - Must be a valid SQL identifier - Applies per database entry **Example:** ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/myapp", seed_table="custom_seeds", ) ``` Use this when integrating with an existing database that already reserves the seed table name. 
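The "valid SQL identifier" requirement on `model_tables`, `migration_table`, and `seed_table` can be checked before the config is loaded. The sketch below assumes a conservative rule (an ASCII letter or underscore, followed by letters, digits, or underscores); DBWarden's own check is not documented here and may differ, and `is_plain_identifier` is a hypothetical helper.

```python
import re

# Assumed rule, not DBWarden's actual implementation: no dots, spaces,
# quotes, or leading digits.
_PLAIN_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_plain_identifier(name: str) -> bool:
    """Return True when the whole string is an unquoted, unqualified identifier."""
    return _PLAIN_IDENTIFIER.fullmatch(name) is not None


candidates = ["custom_seeds", "_dbwarden_migrations", "audit.audit_logs", "2024_seeds"]
for candidate in candidates:
    print(f"{candidate!r}: {is_plain_identifier(candidate)}")
```

Under this rule the schema-qualified name `audit.audit_logs` is rejected, which matches the note that dotted names are not supported in `model_tables`.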
### `auto_apply_seeds` When `True`, DBWarden automatically applies pending code seeds after each successful `migrate` run. - Defaults to `False` - Applies per database entry - Can be overridden per-run with `--apply-seeds` / `--no-apply-seeds` CLI flags on `migrate` **Example:** ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://localhost/myapp", auto_apply_seeds=True, ) ``` Use this when: - you want seeds to stay in sync with schema changes without manual `seed apply` steps - deploying code seeds that define reference data or lookup tables - running in CI/CD where every migration cycle should also re-seed ### `dev_database_type` and `dev_database_url` These define an alternate connection for local development workflows. When `--dev` is passed to any DBWarden command: - `database_type` is swapped to `dev_database_type` - `database_url_sync` / `database_url_async` are swapped to `dev_database_url` **Benefits:** - Use SQLite locally for speed (if production is PostgreSQL) - Target a separate development database instance - Test migrations safely before running against production - Each developer has isolated database - Easy to reset (just delete the file) **Example:** ```python primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://prod-host/myapp", dev_database_type="sqlite", dev_database_url="sqlite:///./dev.db", ) ``` Use with: ```bash $ dbwarden --dev migrate # Uses SQLite $ dbwarden migrate # Uses PostgreSQL ``` Use SQLite with `dev_database_url="sqlite:///./dev.db"` for the fastest local iteration loop. See [Dev Mode](../configuration/dev-mode.md) for complete workflow and patterns. ### `overlap_models` By default, DBWarden prevents model path overlap between databases. 
Set `overlap_models=True` when: - Two databases legitimately share model definitions - You understand the behavior implications (both databases will include overlapping tables) ### `secure_values` When enabled, CLI display commands show the original variable/expression for non-literal arguments instead of resolved values. **Use when:** - Your config uses environment variables or expressions for secrets - You want terminal output to avoid exposing credentials - Running commands in CI/CD with logged output **Example:** ```python import os DATABASE_URL = os.getenv("DATABASE_URL") primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=DATABASE_URL, secure_values=True, # Enable secure display ) ``` **Without `secure_values`:** ```bash $ dbwarden settings show URL: postgresql://user:SECRET_PASSWORD@prod-host/myapp ``` **With `secure_values=True`:** ```bash $ dbwarden settings show --all URL: DATABASE_URL (expression) ``` Always set `secure_values=True` in production to prevent credential exposure in logs. ## Return value: `DatabaseHandle` `database_config()` returns a `DatabaseHandle` object with two properties designed as FastAPI dependency annotations: | Property | Resolves to (SQL) | Resolves to (ClickHouse) | |----------|-------------------|--------------------------| | `.async_session` | `Annotated[AsyncSession, Depends(...)]` | `Annotated[AsyncClient, Depends(...)]` | | `.sync_session` | `Annotated[Session, Depends(...)]` | `Annotated[Client, Depends(...)]` | Use them as type hints in FastAPI route parameters. The handle serves as a namespace so you never confuse which database a session belongs to: ```python # dbwarden.py from dbwarden import database_config primary = database_config(database_name="primary", ...) analytics = database_config(database_name="analytics", ...) 
``` ```python # routes.py from ..dbwarden import primary, analytics @router.get("/users") async def get_users(session: primary.async_session): return await session.execute(...) @router.get("/reports") def get_reports(session: analytics.sync_session): return session.execute(...) ``` Use `.async_session` for async route handlers and `.sync_session` for sync handlers. The deprecated `.session` property (aliased to `.async_session`) will be removed in a future version. `DatabaseHandle` is still useful as a typed container even without FastAPI. Access `handle._name` and `handle._db_type` for the raw config values. ## Configuration rules (enforced at load time) DBWarden validates your config to prevent dangerous misconfigurations: | Rule | Error message (if violated) | |------|---------------------------| | Exactly one `default=True` | `Exactly one default=True required` | | Unique `database_name` across all entries | `Duplicate database_name` | | Unique `database_url_sync` across all entries | `Duplicate database_url_sync` | | Unique physical target (even across credentials) | `Duplicate database target detected` | | Required `model_paths` when multiple databases | `model_paths is required when more than one database is configured` | | Explicit `overlap_models` when paths overlap | `model_paths overlap detected` | | `model_tables` (if set) must not overlap across databases | `model_tables overlap detected` | | If `dev_database_type` set, `dev_database_url` also required | `dev_database_url is required when dev_database_type is set` | ## Loading and resolution Config is loaded by importing your Python config source and executing `database_config(...)` calls. The resolution priority is: 1. Look for `dbwarden.py` in the current directory or parent directories 2. If `DBWARDEN_CONFIG_MODULE` environment variable is set, use that module 3. 
Full scan for any file containing `database_config(...)` calls If more than one discovery source is found, DBWarden fails with an ambiguity error. `dbwarden.py` is the default convention, but it is not the only valid location. Any discovered Python file inside the project can call `database_config(...)`. ### Security sandbox Config files are loaded with path traversal protection that ensures the file is within the project tree. See [Configuration Concepts Config Loading Security](../configuration/concepts.md#config-loading-security-sandbox). ## Examples ### Minimal single-database setup ```python from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/mydb", ) ``` ### With local development (recommended) ```python from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/mydb", dev_database_type="sqlite", dev_database_url="sqlite:///./development.db", ) ``` ### Multi-database setup ```python from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/main", model_paths=["app.models.api"], ) analytics = database_config( database_name="analytics", database_type="clickhouse", database_url_sync="http://clickhouse:password@clickhouse-host:8123/analytics", model_paths=["app.models.analytics"], ) ``` ## Quick Reference | Parameter | Required? 
| Default | Use When | |-----------|-----------|---------|----------| | `database_name` | Yes | - | Always | | `database_type` | No | `"sqlite"` | Non-SQLite backends | | `database_url_sync` | Conditional | `None` | CLI or sync sessions | | `database_url_async` | No | `None` | Async sessions (FastAPI) | | `default` | No | `False` | Mark one database as default | | `migrations_dir` | No | `migrations/` | Custom migration directory | | `seed_table` | No | `_dbwarden_seeds` | Custom seed tracking table | | `auto_apply_seeds` | No | `False` | Auto-apply seeds after migrate | | `migration_table` | No | `_dbwarden_migrations` | Custom migration tracking table | | `model_paths` | Conditional | `None` | Multi-database or explicit discovery | | `model_tables` | No | `None` | Filter discovered tables by name | | `dev_database_type` | No | `None` | Local development | | `dev_database_url` | No | `None` | Local development | | `overlap_models` | No | `False` | Shared models (read replicas) | | `secure_values` | No | `False` | Hide credentials in output | ## Related Documentation **Getting Started:** - [Quick Start](../configuration/quick-start.md) - Your first configuration - [Concepts](../configuration/concepts.md) - How configuration works **Guides:** - [Connection URLs](../configuration/connection-urls.md) - Database URL formats - [Model Discovery](../configuration/model-discovery.md) - How `model_paths` works - [Dev Mode](../configuration/dev-mode.md) - Local development - [Multi-Database](../configuration/multi-database.md) - Multiple databases - [Production Patterns](../configuration/production-patterns.md) - Real-world examples **Help:** - [Troubleshooting](../configuration/troubleshooting.md) - Common issues ======================================================================== PAGE: https://dbwarden.emiliano-go.com/reference/migrate-from-toml/ ======================================================================== # Migrate from TOML If your project currently uses 
`warden.toml` for DBWarden configuration, this guide walks through transitioning to the Python-based `database_config(...)` approach.

## Why migrate

The Python configuration model offers:

- **Type validation** - catches misconfigurations at import time, not runtime
- **Runtime flexibility** - use environment variables, conditional logic, and expressions in your config
- **IDE support** - autocomplete, type hints, and inline documentation
- **Consistency** - same Python codebase powers your app and your migrations

## Before: TOML configuration

```toml
default = "primary"

[database.primary]
database_type = "postgresql"
sqlalchemy_url = "postgresql://user:password@localhost:5432/main"
migrations_dir = "migrations/primary"
model_paths = ["app.models.api"]
```

## After: Python configuration

```python
from dbwarden import database_config

primary = database_config(
    database_name="primary",
    default=True,
    database_type="postgresql",
    database_url_sync="postgresql://user:password@localhost:5432/main",
    migrations_dir="migrations/primary",
    model_paths=["app.models.api"],
)
```

## What changed in field names

| TOML field | Python field | Notes |
|------------|--------------|-------|
| `default` | `default=True` | Boolean flag per entry instead of top-level |
| `database.<name>.database_type` | `database_type` | Passed directly to function |
| `database.<name>.sqlalchemy_url` | `database_url_sync` | Renamed for clarity |
| `database.<name>.migrations_dir` | `migrations_dir` | Same concept, different syntax |
| `database.<name>.model_paths` | `model_paths` | List syntax in Python |
| `database.<name>.dev_database_type` | `dev_database_type` | Optional dev swap entries |
| `database.<name>.dev_database_url` | `dev_database_url` | Optional dev swap entries |

## Migration checklist

### Step 1: Identify current databases

Check your existing `warden.toml`:

```bash
$ dbwarden database list
```

Note each database entry and its configuration.
### Step 2: Create new config source Create your new config file (or update an existing config file that will contain `database_config(...)` calls). Typical options: - `dbwarden.py` (recommended for new projects) - `app/core/config.py` (if you already have one) - Any Python file that DBWarden can discover ### Step 3: Map each database entry For each database in your TOML config, create a corresponding `database_config(...)` call. **Example transformation:** ```python # From TOML: # [database.primary] # database_type = "postgresql" # sqlalchemy_url = "postgresql://user:password@localhost:5432/main" # dev_database_type = "sqlite" # dev_database_url = "sqlite:///./development.db" # To Python: from dbwarden import database_config primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/main", dev_database_type="sqlite", dev_database_url="sqlite:///./development.db", ) ``` ### Step 4: Verify configuration loads ```bash $ dbwarden settings show --all ``` You should see all your migrated databases listed with correct types and URLs. ### Step 5: Test core commands Confirm the migration works by running key commands: ```bash $ dbwarden status --database primary $ dbwarden history --database primary ``` ### Step 6: Remove TOML file Once verified, delete your old `warden.toml`: ```bash rm warden.toml ``` DBWarden now uses only your Python configuration source. ## Advanced migration patterns ### Using environment variables ```python import os DATABASE_URL = os.getenv("DATABASE_URL") primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=DATABASE_URL, dev_database_type="sqlite", dev_database_url="sqlite:///./development.db", ) ``` This replaces the old pattern of reading URL from environment via TOML (which TOML cannot do natively). 
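One caveat with this pattern is that `os.getenv` returns `None` when the variable is unset, so a missing `DATABASE_URL` would only surface later as a configuration error. A small guard makes the failure explicit at import time. This is a sketch; `require_env` is a hypothetical helper and not part of DBWarden.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Demo only: set the variable so the example runs on its own.
os.environ["DATABASE_URL"] = "postgresql://user:password@localhost:5432/main"

DATABASE_URL = require_env("DATABASE_URL")
print(DATABASE_URL.split("@")[-1])  # print the host part without credentials
```

In a real config file the demo assignment is dropped and `DATABASE_URL` is passed to `database_url_sync` as in the example above.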
### Using conditional configuration ```python import os ENV = os.getenv("ENVIRONMENT", "development") primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync="postgresql://user:password@localhost:5432/main", dev_database_type="sqlite", dev_database_url="sqlite:///./development.db" if ENV == "development" else None, ) ``` This is impossible with TOML but natural with Python. ### Adding `secure_values` for sensitive URLs If your configuration uses variables or expressions for credentials: ```python import os DB_USER = os.getenv("DB_USER") DB_PASS = os.getenv("DB_PASS") primary = database_config( database_name="primary", default=True, database_type="postgresql", database_url_sync=f"postgresql://{DB_USER}:{DB_PASS}@localhost:5432/main", secure_values=True, ) ``` With `secure_values=True`, display commands show the expression rather than resolved credentials. ## Common issues during migration ### "Exactly one default=True required" Ensure exactly one entry has `default=True`. Unlike TOML's top-level `default` field, you must set this on exactly one entry. ### "model_paths is required when more than one database is configured" When using multiple databases, each needs explicit `model_paths` to keep model discovery boundaries clear. ### "Duplicate database_name" or "Duplicate database_url" Python loads all `database_config(...)` calls, so ensure each call has a unique `database_name` and no duplicate URLs. ### TOML-specific features not supported Some TOML features don't map 1:1 to Python: - Inline tables → use separate key-value dictionary in Python (more verbose but explicit) - Multiline strings → use triple-quoted strings in Python - Arrays of tables → use separate `model_paths=[...]` list entries If you used advanced TOML features, manually translate those to equivalent Python constructs. 
## Verification commands after migration

After completing the migration, run these to confirm everything works:

```bash
# Confirm all databases visible
$ dbwarden settings show --all

# Confirm status works
$ dbwarden status --database <name>

# Confirm history works
$ dbwarden history --database <name>

# If using --dev, confirm it works
$ dbwarden --dev status --database <name>
```

## Rollback if needed

If the migration causes issues, you can:

1. Recreate `warden.toml` with the original configuration
2. Run a DBWarden version that supports TOML (pre-0.5)

However, the Python-based approach is the recommended direction (0.9+) and offers significant benefits.

========================================================================
PAGE: https://dbwarden.emiliano-go.com/seeds/
========================================================================

# Seed Management

DBWarden provides built-in seed data management for populating databases with initial or reference data. Seeds complement migrations by handling data that belongs in version control.

## Overview

There are two ways to define seeds, listed in order of preference:

1. **Code seeds** (recommended): define seeds inline alongside your SQLAlchemy models using the `Seed` base class or `@seed_data` decorator. No separate files, no manual versioning.
2. **File seeds**: traditional `.sql` or `.py` files in a `seeds/` directory, useful for complex multi-statement SQL.

Both are tracked in the `_dbwarden_seeds` table and applied via `dbwarden seed apply`.

---

## Code Seeds (Recommended)

Code seeds live alongside your models in your `model_paths` directories. They are the recommended way to define seed data because they stay in sync with your schema, support IDE autocompletion, and do not require manual version management.
### Seed Base Class Inherit from `Seed` and set a `model` + `rows`: ```python from dbwarden.seed import Seed class CountrySeed(Seed): __seed_database__ = "primary" __seed_description__ = "initial countries" __seed_on_conflict__ = "update" __seed_conflict_columns__ = ["code"] model = Country rows = [ Country(code="UY", name="Uruguay"), Country(code="AR", name="Argentina"), ] ``` Key advantages over the old decorator approach: - **Full IDE autocompletion**: `rows` uses model instances directly, so your editor knows the column names and types - **No `version` parameter**: versions are auto-assigned (`C0001`, `C0002`, ...) based on deterministic class ordering - **No manual import of `SeedRow`**: though `SeedRow` is still available if you prefer dict-like rows ### Seed Class Reference | Attribute | Default | Description | |-----------|---------|-------------| | `__seed_database__` | `"default"` | Routes the seed to the named database handle configured in `database_config(...)`. | | `__seed_description__` | `""` | Human-readable label shown in `dbwarden seed list` output. | | `__seed_on_conflict__` | `"ignore"` | What to do when a row with matching columns exists: `"ignore"` (skip silently), `"update"` (overwrite), or `"error"` (raise). | | `__seed_conflict_columns__` | `None` | List of column names used for conflict detection. Required when `__seed_on_conflict__` is `"update"`. | ### Model Instances in Rows Because `rows` accepts model instances, you get full autocompletion from your `Mapped` annotations: ```python from dbwarden.seed import Seed class RepoSeed(Seed): __seed_database__ = "clickhouse" __seed_description__ = "Tracked Repos" model = Repo rows = [ Repo(name="dbwarden", owner="anomalyco", is_org=True, default_branch="main"), Repo(name="vigil", owner="anomalyco", is_org=True, default_branch="master"), ] ``` Your editor will suggest `name`, `owner`, `is_org`, `default_branch` etc. as you type. 
> SQLAlchemy 2.0's `DeclarativeBase` does not accept positional arguments in the constructor. Always use keyword arguments when instantiating models in `rows`: `Repo(name="dbwarden", ...)` instead of `Repo("dbwarden", ...)`. ### SeedRow (Alternative) If you prefer dict-like rows, `SeedRow` still works: ```python from dbwarden.seed import Seed, SeedRow class CountrySeed(Seed): __seed_database__ = "primary" __seed_description__ = "initial countries" __seed_on_conflict__ = "update" __seed_conflict_columns__ = ["code"] model = Country rows = [ SeedRow(code="UY", name="Uruguay"), SeedRow(code="AR", name="Argentina"), ] ``` ### `on_conflict` Behavior | Value | Behavior | |-------|----------| | `"ignore"` (default) | Skips existing rows silently | | `"update"` | Updates existing rows with new values | | `"error"` | Raises an error on conflict | ### Logic-Based Seeds Define a `generate(session)` static/class method for programmatic data: ```python class PermissionSeed(Seed): __seed_database__ = "primary" __seed_description__ = "load permissions" __seed_on_conflict__ = "ignore" model = Permission @staticmethod def generate(session): for resource in ["users", "orders"]: for action in ["read", "write", "delete"]: session.add(Permission(name=f"{resource}:{action}")) ``` ### `@seed_data` Decorator (Deprecated) The old decorator still works but is deprecated in favour of the `Seed` base class: ```python from dbwarden.seed import seed_data, SeedRow @seed_data( database="primary", description="initial countries", on_conflict="update", conflict_columns=["code"], ) class CountrySeed: model = Country rows = [SeedRow(code="UY", name="Uruguay")] ``` Note that `version` is **no longer required**; it is auto-assigned. ### Discovery and Ordering Code seeds are discovered through the same `model_paths` scan as models. They use auto-assigned versions in the `C` namespace (`C0001`, `C0002`, ...) and are sorted deterministically by module and class name. 
Pending detection compares the class qualified name against the `_dbwarden_seeds` tracking table.

---

## File Seeds (Traditional)

File seeds live in a `seeds/` directory and are useful for complex multi-statement SQL or when you need to hand-craft seed files.

### Directory Structure

```
seeds/
  V0001__seed_initial_users.sql
  V0002__seed_lookup_tables.sql
  V0003__seed_sample_data.py
```

Each file follows the naming convention:

```
V<4-digit-version>__<description>.<extension>
```

### Creating File Seeds

```bash
$ dbwarden seed create "seed initial users" --database primary
```

Creates a file like `seeds/V0001__seed_initial_users.sql`:

```sql
-- INSERT statements go here
```

### Python File Seeds

```bash
$ dbwarden seed create "generate sample data" --database primary --type python
```

Creates `seeds/V0001__generate_sample_data.py` with a `seed(connection, session)` function.

The `seed()` function receives both a raw SQLAlchemy `Connection` and an ORM `Session` bound to the same transaction. A seed file defines a single `seed()`; the two definitions below are alternatives:

```python
from sqlalchemy import text

# Option 1: using the raw connection.
# SQLAlchemy 2.0 requires textual SQL to be wrapped in text().
def seed(connection, session):
    for i in range(100):
        connection.execute(
            text("INSERT INTO users (name) VALUES (:name)"),
            {"name": f"user_{i}"},
        )

# Option 2: using the ORM session.
# Assumes User is imported from your own models module.
def seed(connection, session):
    for i in range(100):
        session.add(User(name=f"user_{i}"))
    session.flush()
```

---

## Applying Seeds

Apply all pending seeds (file + code seeds are both discovered):

```bash
$ dbwarden seed apply --database primary
```

Apply a specific version:

```bash
$ dbwarden seed apply --database primary --version 0003
```

Apply to all databases:

```bash
$ dbwarden seed apply --all
```

### Dry Run

Preview what would be applied without executing:

```bash
$ dbwarden seed apply --database primary --dry-run
```

### Auto-Apply After Migrations

Configure seeds to be applied automatically after each `dbwarden migrate`:

```python
database_config(
    database_name="primary",
    default=True,
    database_type="sqlite",
    database_url_sync="sqlite:///./app.db",
    model_paths=["models"],
    auto_apply_seeds=True,
)
```

Or apply
seeds once after a migration without changing config: ```bash $ dbwarden migrate --apply-seeds ``` --- ## Listing Seeds ```bash $ dbwarden seed list --database primary ``` Output: ``` Seeds for database 'primary': V0001 seed_initial_users applied 2025-06-01 10:00:00 C0001 initial countries pending (code seed) ``` List across all databases: ```bash $ dbwarden seed list --all ``` ### Pruning Orphaned Records Remove tracking records for seed files that no longer exist on disk: ```bash $ dbwarden seed list --prune ``` --- ## Rolling Back Seeds Rollback removes the applied tracking record, allowing the seed to be re-applied. It does **not** reverse data changes. ```bash # Rollback the most recent seed $ dbwarden seed rollback --database primary # Rollback a specific number $ dbwarden seed rollback --database primary --count 2 # Rollback to a specific version $ dbwarden seed rollback --database primary --to-version 0002 ``` --- ## Seed Tracking Table DBWarden tracks applied seeds in `_dbwarden_seeds` (configurable via `seed_table`): | Column | Description | |--------|-------------| | `version` | 4-digit seed version (`V0001`) or code seed ID (`C0001`) | | `description` | Human-readable description | | `filename` | File path or code seed identifier | | `seed_type` | `sql`, `python`, or `code` | | `checksum` | SHA-256 hash of file/class source | | `applied_at` | Timestamp of application | The tracking table is created automatically on first seed apply. Each version can only be applied once until rolled back. ### Checksum Drift When a seed file has been modified since it was last applied, DBWarden emits a warning: ``` Warning: Seed V0001 has been modified since last apply (checksum mismatch). ``` This helps detect accidental changes to already-applied seeds. --- ## Exporting Seeds for Production Code seeds require your full application environment to execute. 
For Dockerized deployments where you don't want to copy the application code into a container just to seed data, use `dbwarden seed export` to produce stateless ROC (repeatable on change) SQL files.

```bash
$ dbwarden seed export --database clickhouse
```

This writes `seeds/ROC__clickhouse__code_seeds.sql` containing `INSERT ... ON CONFLICT` statements rendered in the target database dialect. In production, apply with:

```bash
$ dbwarden seed apply --database clickhouse
```

Because the file is ROC, updating the code seed and re-exporting produces a new content checksum, which triggers re-application. The `ON CONFLICT DO UPDATE` clause updates existing rows in place, so there is no need to delete and recreate them.

**Known limitations:**

- Rows removed from a code seed are not automatically deleted in the target database.
- Logic seeds that depend on other logic seeds' output are not supported (preceding row-based seeds are pre-loaded, but logic-to-logic ordering is not).
- Non-deterministic `generate()` methods (e.g. using `datetime.now()`) produce a new checksum on every export, which causes a re-apply on every deploy. This is acceptable for idempotent upserts but wasteful for pure inserts, so keep `generate()` deterministic where possible.

**Dialect requirement:** Exporting requires the same dialect packages as connecting to that database. For ClickHouse, install `clickhouse-sqlalchemy`. Missing packages produce a clear error at export time.

## Seeds and Migrations

Seeds are independent of migrations. You can:

- Apply migrations without seeds
- Apply seeds without migrations
- Mix both in your workflow

The `dbwarden status` command and the FastAPI `GET /status` endpoint report both pending migrations and pending seeds.
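
The checksum-driven re-application described under Exporting Seeds for Production can be illustrated with a small standalone sketch. It is not DBWarden's implementation: the `render_rows` and `content_checksum` helpers, the `countries` table, and the choice to hash the rendered SQL text as UTF-8 are all assumptions made for illustration. It shows why stable output keeps the checksum unchanged, while embedding a timestamp changes it on every export.

```python
import hashlib


def content_checksum(sql_text: str) -> str:
    """Return the SHA-256 hex digest of the rendered SQL text (assumed input)."""
    return hashlib.sha256(sql_text.encode("utf-8")).hexdigest()


def render_rows(rows, generated_at=None):
    """Hypothetical renderer: one INSERT per row, sorted for a stable order.

    Illustration only: values are interpolated without escaping.
    """
    lines = [
        f"INSERT INTO countries (code, name) VALUES ('{code}', '{name}');"
        for code, name in sorted(rows)
    ]
    if generated_at is not None:
        # A timestamp in the output makes every export differ.
        lines.append(f"-- generated at {generated_at}")
    return "\n".join(lines)


rows = [("US", "United States"), ("DE", "Germany")]

# Deterministic rendering: input order does not affect the checksum.
stable = content_checksum(render_rows(rows)) == content_checksum(
    render_rows(list(reversed(rows)))
)

# Non-deterministic rendering: same rows, different timestamps, new checksum.
drifting = content_checksum(
    render_rows(rows, generated_at="2025-06-01T10:00:00")
) != content_checksum(render_rows(rows, generated_at="2025-06-01T10:00:01"))
```

Here `stable` and `drifting` are both `True`: identical rows hash identically regardless of order, while a changing timestamp alone is enough to produce a new checksum and therefore a re-apply.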
---

## Seeds in FastAPI

The `DBWardenRouter` includes seed status in its `GET /status` response:

```json
{
  "databases": {
    "primary": {
      "status": "ok",
      "connected": true,
      "pending_migrations": 0,
      "applied_migrations": 5,
      "pending_seeds": 2,
      "applied_seeds": 1,
      "lock_active": false,
      "error": null
    }
  }
}
```

See [FastAPI Reference](fastapi/reference.md) for details.

See also: [Cookbook: Seeds](../cookbook/07-seeds.md)

========================================================================
PAGE: https://dbwarden.emiliano-go.com/sql-translation/
========================================================================

# SQL Translation

DBWarden includes a SQL translation layer to support development workflows where your primary database differs from your development database.

The most common case is:

- Primary database: PostgreSQL/MySQL/MariaDB/ClickHouse
- Development database: SQLite (`--dev` mode)

This keeps local development fast while still allowing production-targeted schemas.

## Why SQL Translation Exists

SQLite does not support all backend-specific SQL types and default expressions used by other databases. Without translation, generated migrations can fail in local development when they contain backend-specific types like `UUID`, `JSONB`, or default expressions like `now()`.

DBWarden translation solves this by adapting generated SQL for SQLite compatibility.

## When translation is active

Translation is applied when all of the following are true:

- the command runs with `--dev`
- the selected database resolves to a SQLite `dev_database_url`
- the command path generates SQL from models (`make-migrations`)

It is not a runtime SQL proxy for arbitrary manual SQL.

## How It Works

When you run commands in development mode and target a SQLite dev database:

```bash
$ dbwarden --dev make-migrations "sync models" -d primary
```

DBWarden uses this flow:

1. Loads the selected database config and resolves `dev_database_url`.
2. Detects that the active target backend is SQLite.
3.
Extracts model metadata from SQLAlchemy models.
4. Translates backend-specific types/defaults to SQLite-compatible SQL.
5. Generates migration SQL with translated definitions.

Translation is applied during migration generation, not as a post-processing regex pass.

## Type conversion behavior

Common conversions:

| Source type | SQLite output |
|-------------|---------------|
| `UUID` | `TEXT` |
| `JSON` / `JSONB` | `TEXT` |
| `TIMESTAMPTZ` | `DATETIME` |
| `SERIAL` / `BIGSERIAL` | `INTEGER` |
| ClickHouse nullable numeric forms | `INTEGER`/`REAL` depending on source |

If a type cannot be translated safely:

- non-strict mode: fall back to `TEXT` and emit a warning
- strict mode: fail migration generation

## Default expression handling

Backend expressions such as `now()` or `gen_random_uuid()` may not have direct SQLite equivalents.

In non-strict mode, unsupported defaults are dropped with a warning. In strict mode, unsupported defaults fail generation.

## Strict Translation Mode

If you want hard failures instead of fallback behavior:

```bash
$ dbwarden --dev --strict-translation make-migrations "sync models" -d primary
```

In strict mode:

- Unknown/unsupported type conversions raise errors
- Unsupported default expression conversions raise errors

Use this when you want to catch every lossy conversion early.

## Recommended team workflow

1. Iterate quickly with `--dev` (SQLite).
2. Keep strict checks in CI (`--strict-translation`).
3. Validate release candidate migrations against a production-like database.

This balances speed and correctness.
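
The difference between strict and non-strict type translation described above can be sketched as a lookup with a fallback. This is a minimal illustration, not DBWarden's actual code: `SQLITE_TYPE_MAP`, `TranslationError`, and `translate_type` are hypothetical names, and the mapping contains only the entries from the conversion table.

```python
import warnings

# Entries copied from the "Common conversions" table; not exhaustive.
SQLITE_TYPE_MAP = {
    "UUID": "TEXT",
    "JSON": "TEXT",
    "JSONB": "TEXT",
    "TIMESTAMPTZ": "DATETIME",
    "SERIAL": "INTEGER",
    "BIGSERIAL": "INTEGER",
}


class TranslationError(Exception):
    """Hypothetical error raised when strict mode rejects a type."""


def translate_type(source_type: str, strict: bool = False) -> str:
    """Map a backend-specific type name to a SQLite-compatible one."""
    key = source_type.strip().upper()
    if key in SQLITE_TYPE_MAP:
        return SQLITE_TYPE_MAP[key]
    if strict:
        # Strict mode: fail instead of silently losing type information.
        raise TranslationError(f"no SQLite equivalent for type {source_type!r}")
    # Non-strict mode: fall back to TEXT and tell the user about it.
    warnings.warn(f"falling back to TEXT for unsupported type {source_type!r}")
    return "TEXT"
```

With this sketch, `translate_type("jsonb")` returns `"TEXT"` from the table, an unknown type returns `"TEXT"` with a warning in non-strict mode, and the same unknown type raises `TranslationError` when `strict=True`. Running the strict variant in CI is what surfaces the lossy conversions that the fallback would otherwise hide.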
## Troubleshooting

`--dev mode is enabled, but database '' has no dev_database_url configured`:

- Add `dev_database_url` to that database entry.

Unexpected type fallback to `TEXT`:

- Inspect the model's column type for a backend-specific declaration.
- Re-run with `--strict-translation` to fail fast and fix the type explicitly.

Generated SQL differs from production expectations:

- This is expected in SQLite compatibility mode; validate final release migrations on a production-like backend.

## Notes and Limitations

- Translation focuses on compatibility for local development.
- Some backend features cannot be represented exactly in SQLite.
- For production accuracy, always test migrations against a production-like database as well.