Seed Management¶

DBWarden provides built-in seed data management for populating databases with initial or reference data. Seeds complement migrations by handling data that belongs in version control.

Overview¶

There are two ways to define seeds, listed in order of preference:

Code seeds (recommended): define seeds inline alongside your SQLAlchemy models using the Seed base class or @seed_data decorator. No separate files, no manual versioning.
File seeds: traditional .sql or .py files in a seeds/ directory, useful for complex multi-statement SQL.

Both are tracked in the _dbwarden_seeds table and applied via dbwarden seed apply.

Code Seeds (Recommended)¶

Code seeds live alongside your models in your model_paths directories. They are the recommended way to define seed data because they stay in sync with your schema, support IDE autocompletion, and do not require manual version management.

Seed Base Class¶

Inherit from Seed and set a model + rows:

from dbwarden.seed import Seed

class CountrySeed(Seed):
    __seed_database__ = "primary"
    __seed_description__ = "initial countries"
    __seed_on_conflict__ = "update"
    __seed_conflict_columns__ = ["code"]

    model = Country
    rows = [
        Country(code="UY", name="Uruguay"),
        Country(code="AR", name="Argentina"),
    ]

Key advantages over the old decorator approach:

Full IDE autocompletion: rows uses model instances directly, so your editor knows the column names and types
No version parameter: versions are auto-assigned (C0001, C0002, ...) based on deterministic class ordering
No manual import of SeedRow: though SeedRow is still available if you prefer dict-like rows

Seed Class Reference¶

Attribute	Default	Description
`__seed_database__`	`"default"`	Routes the seed to the named database handle configured in `database_config(...)`.
`__seed_description__`	`""`	Human-readable label shown in `dbwarden seed list` output.
`__seed_on_conflict__`	`"ignore"`	What to do when a row with matching columns exists: `"ignore"` (skip silently), `"update"` (overwrite), or `"error"` (raise).
`__seed_conflict_columns__`	`None`	List of column names used for conflict detection. Required when `__seed_on_conflict__` is `"update"`.

Model Instances in Rows¶

Because rows accepts model instances, you get full autocompletion from your Mapped annotations:

from dbwarden.seed import Seed

class RepoSeed(Seed):
    __seed_database__ = "clickhouse"
    __seed_description__ = "Tracked Repos"

    model = Repo
    rows = [
        Repo(name="dbwarden", owner="anomalyco", is_org=True, default_branch="main"),
        Repo(name="vigil", owner="anomalyco", is_org=True, default_branch="master"),
    ]

Your editor will suggest name, owner, is_org, default_branch etc. as you type.

SQLAlchemy 2.0's DeclarativeBase does not accept positional arguments in the constructor. Always use keyword arguments when instantiating models in rows: Repo(name="dbwarden", ...) instead of Repo("dbwarden", ...).

SeedRow (Alternative)¶

If you prefer dict-like rows, SeedRow still works:

from dbwarden.seed import Seed, SeedRow

class CountrySeed(Seed):
    __seed_database__ = "primary"
    __seed_description__ = "initial countries"
    __seed_on_conflict__ = "update"
    __seed_conflict_columns__ = ["code"]

    model = Country
    rows = [
        SeedRow(code="UY", name="Uruguay"),
        SeedRow(code="AR", name="Argentina"),
    ]

`on_conflict` Behavior¶

Value	Behavior
`"ignore"` (default)	Skips existing rows silently
`"update"`	Updates existing rows with new values
`"error"`	Raises an error on conflict

Logic-Based Seeds¶

Define a generate(session) static/class method for programmatic data:

class PermissionSeed(Seed):
    __seed_database__ = "primary"
    __seed_description__ = "load permissions"
    __seed_on_conflict__ = "ignore"

    model = Permission

    @staticmethod
    def generate(session):
        for resource in ["users", "orders"]:
            for action in ["read", "write", "delete"]:
                session.add(Permission(name=f"{resource}:{action}"))

`@seed_data` Decorator (Deprecated)¶

The old decorator still works but is deprecated in favour of the Seed base class:

from dbwarden.seed import seed_data, SeedRow

@seed_data(
    database="primary",
    description="initial countries",
    on_conflict="update",
    conflict_columns=["code"],
)
class CountrySeed:
    model = Country
    rows = [SeedRow(code="UY", name="Uruguay")]

Note that version is no longer required; it is auto-assigned.

Discovery and Ordering¶

Code seeds are discovered through the same model_paths scan as models. They use auto-assigned versions in the C namespace (C0001, C0002, ...) and are sorted deterministically by module and class name. Pending detection compares the class qualified name against the _dbwarden_seeds tracking table.

File Seeds (Traditional)¶

File seeds live in a seeds/ directory and are useful for complex multi-statement SQL or when you need to hand-craft seed files.

Directory Structure¶

seeds/
  V0001__seed_initial_users.sql
  V0002__seed_lookup_tables.sql
  V0003__seed_sample_data.py

Each file follows the naming convention:

V<4-digit-version>__<description>.<sql|py>

Creating File Seeds¶

$ dbwarden seed create "seed initial users" --database primary

Creates a file like seeds/V0001__seed_initial_users.sql:

-- INSERT statements go here

Python File Seeds¶

$ dbwarden seed create "generate sample data" --database primary --type python

Creates seeds/V0001__generate_sample_data.py with a seed(connection, session) function.

The seed() function receives both a raw SQLAlchemy Connection and an ORM Session bound to the same transaction:

# Using raw connection
def seed(connection, session):
    for i in range(100):
        connection.execute(
            "INSERT INTO users (name) VALUES (:name)",
            {"name": f"user_{i}"},
        )

# Using ORM session
def seed(connection, session):
    for i in range(100):
        session.add(User(name=f"user_{i}"))
    session.flush()

Applying Seeds¶

Apply all pending seeds (file + code seeds are both discovered):

$ dbwarden seed apply --database primary

Apply a specific version:

$ dbwarden seed apply --database primary --version 0003

Apply to all databases:

$ dbwarden seed apply --all

Dry Run¶

Preview what would be applied without executing:

$ dbwarden seed apply --database primary --dry-run

Auto-Apply After Migrations¶

Configure seeds to be applied automatically after each dbwarden migrate:

database_config(
    database_name="primary",
    default=True,
    database_type="sqlite",
    database_url_sync="sqlite:///./app.db",
    model_paths=["models"],
    auto_apply_seeds=True,
)

Or apply seeds once after a migration without changing config:

$ dbwarden migrate --apply-seeds

Listing Seeds¶

$ dbwarden seed list --database primary

Output:

Seeds for database 'primary':
  V0001  seed_initial_users                   applied  2025-06-01 10:00:00
  C0001  initial countries                    pending   (code seed)

List across all databases:

$ dbwarden seed list --all

Pruning Orphaned Records¶

Remove tracking records for seed files that no longer exist on disk:

$ dbwarden seed list --prune

Rolling Back Seeds¶

Rollback removes the applied tracking record, allowing the seed to be re-applied. It does not reverse data changes.

# Rollback the most recent seed
$ dbwarden seed rollback --database primary

# Rollback a specific number
$ dbwarden seed rollback --database primary --count 2

# Rollback to a specific version
$ dbwarden seed rollback --database primary --to-version 0002

Seed Tracking Table¶

DBWarden tracks applied seeds in _dbwarden_seeds (configurable via seed_table):

Column	Description
`version`	4-digit seed version (`V0001`) or code seed ID (`C0001`)
`description`	Human-readable description
`filename`	File path or code seed identifier
`seed_type`	`sql`, `python`, or `code`
`checksum`	SHA-256 hash of file/class source
`applied_at`	Timestamp of application

The tracking table is created automatically on first seed apply. Each version can only be applied once until rolled back.

Checksum Drift¶

When a seed file has been modified since it was last applied, DBWarden emits a warning:

Warning: Seed V0001 has been modified since last apply (checksum mismatch).

This helps detect accidental changes to already-applied seeds.

Exporting Seeds for Production¶

Code seeds require your full application environment to execute. For Dockerized deployments where you don't want to copy the application code into a container just to seed data, use dbwarden seed export to produce stateless ROC (runs-on-change) SQL files.

$ dbwarden seed export --database clickhouse

This writes seeds/ROC__clickhouse__code_seeds.sql containing INSERT ... ON CONFLICT statements rendered in the target database dialect. In production, apply with:

$ dbwarden seed apply --database clickhouse

Because the file is ROC, updating the code seed and re-exporting produces a new content checksum, which triggers re-application. The ON CONFLICT DO UPDATE clause handles updating existing rows; no need to delete and recreate.

Non-handled problems:

Rows removed from a code seed are not automatically deleted in the target database
Logic seeds that depend on other logic seeds' output are not supported (preceding row-based seeds are pre-loaded, but logic-to-logic ordering is not)
Non-deterministic generate() methods (e.g. using datetime.now()) produce a new checksum every export, causing re-apply on every deploy: acceptable for idempotent upserts, wasteful for pure inserts. Use deterministic generate() where possible

Dialect requirement: Exporting requires the same dialect packages as connecting to that database. For ClickHouse, install clickhouse-sqlalchemy. Missing packages produce a clear error at export time.

Seeds and Migrations¶

Seeds are independent from migrations. You can:

Apply migrations without seeds
Apply seeds without migrations
Mix both in your workflow

The dbwarden status command and the FastAPI GET /status endpoint report both pending migrations and pending seeds.

Seeds in FastAPI¶

The DBWardenRouter includes seed status in its GET /status response:

{
  "databases": {
    "primary": {
      "status": "ok",
      "connected": true,
      "pending_migrations": 0,
      "applied_migrations": 5,
      "pending_seeds": 2,
      "applied_seeds": 1,
      "lock_active": false,
      "error": null
    }
  }
}

See FastAPI Reference for details.