Skip to content

Seed Management

DBWarden provides built-in seed data management for populating databases with initial or reference data. Seeds complement migrations by handling data that belongs in version control.

Overview

There are two ways to define seeds, listed in order of preference:

  1. Code seeds (recommended): define seeds inline alongside your SQLAlchemy models using the Seed base class or @seed_data decorator. No separate files, no manual versioning.
  2. File seeds: traditional .sql or .py files in a seeds/ directory, useful for complex multi-statement SQL.

Both are tracked in the _dbwarden_seeds table and applied via dbwarden seed apply.


Code seeds live alongside your models in your model_paths directories. They are the recommended way to define seed data because they stay in sync with your schema, support IDE autocompletion, and do not require manual version management.

Seed Base Class

Inherit from Seed and set a model + rows:

from dbwarden.seed import Seed

class CountrySeed(Seed):
    __seed_database__ = "primary"
    __seed_description__ = "initial countries"
    __seed_on_conflict__ = "update"
    __seed_conflict_columns__ = ["code"]

    model = Country
    rows = [
        Country(code="UY", name="Uruguay"),
        Country(code="AR", name="Argentina"),
    ]

Key advantages over the old decorator approach:

  • Full IDE autocompletion: rows uses model instances directly, so your editor knows the column names and types
  • No version parameter: versions are auto-assigned (C0001, C0002, ...) based on deterministic class ordering
  • No manual import of SeedRow: though SeedRow is still available if you prefer dict-like rows

Seed Class Reference

Attribute Default Description
__seed_database__ "default" Routes the seed to the named database handle configured in database_config(...).
__seed_description__ "" Human-readable label shown in dbwarden seed list output.
__seed_on_conflict__ "ignore" What to do when a row with matching columns exists: "ignore" (skip silently), "update" (overwrite), or "error" (raise).
__seed_conflict_columns__ None List of column names used for conflict detection. Required when __seed_on_conflict__ is "update".

Model Instances in Rows

Because rows accepts model instances, you get full autocompletion from your Mapped annotations:

from dbwarden.seed import Seed

class RepoSeed(Seed):
    __seed_database__ = "clickhouse"
    __seed_description__ = "Tracked Repos"

    model = Repo
    rows = [
        Repo(name="dbwarden", owner="anomalyco", is_org=True, default_branch="main"),
        Repo(name="vigil", owner="anomalyco", is_org=True, default_branch="master"),
    ]

Your editor will suggest name, owner, is_org, default_branch etc. as you type.

SQLAlchemy 2.0's DeclarativeBase does not accept positional arguments in the constructor. Always use keyword arguments when instantiating models in rows: Repo(name="dbwarden", ...) instead of Repo("dbwarden", ...).

SeedRow (Alternative)

If you prefer dict-like rows, SeedRow still works:

from dbwarden.seed import Seed, SeedRow

class CountrySeed(Seed):
    __seed_database__ = "primary"
    __seed_description__ = "initial countries"
    __seed_on_conflict__ = "update"
    __seed_conflict_columns__ = ["code"]

    model = Country
    rows = [
        SeedRow(code="UY", name="Uruguay"),
        SeedRow(code="AR", name="Argentina"),
    ]

on_conflict Behavior

Value Behavior
"ignore" (default) Skips existing rows silently
"update" Updates existing rows with new values
"error" Raises an error on conflict

Logic-Based Seeds

Define a generate(session) static/class method for programmatic data:

class PermissionSeed(Seed):
    __seed_database__ = "primary"
    __seed_description__ = "load permissions"
    __seed_on_conflict__ = "ignore"

    model = Permission

    @staticmethod
    def generate(session):
        for resource in ["users", "orders"]:
            for action in ["read", "write", "delete"]:
                session.add(Permission(name=f"{resource}:{action}"))

@seed_data Decorator (Deprecated)

The old decorator still works but is deprecated in favour of the Seed base class:

from dbwarden.seed import seed_data, SeedRow

@seed_data(
    database="primary",
    description="initial countries",
    on_conflict="update",
    conflict_columns=["code"],
)
class CountrySeed:
    model = Country
    rows = [SeedRow(code="UY", name="Uruguay")]

Note that version is no longer required; it is auto-assigned.

Discovery and Ordering

Code seeds are discovered through the same model_paths scan as models. They use auto-assigned versions in the C namespace (C0001, C0002, ...) and are sorted deterministically by module and class name. Pending detection compares the class qualified name against the _dbwarden_seeds tracking table.


File Seeds (Traditional)

File seeds live in a seeds/ directory and are useful for complex multi-statement SQL or when you need to hand-craft seed files.

Directory Structure

seeds/
  V0001__seed_initial_users.sql
  V0002__seed_lookup_tables.sql
  V0003__seed_sample_data.py

Each file follows the naming convention:

V<4-digit-version>__<description>.<sql|py>

Creating File Seeds

$ dbwarden seed create "seed initial users" --database primary

Creates a file like seeds/V0001__seed_initial_users.sql:

-- INSERT statements go here

Python File Seeds

$ dbwarden seed create "generate sample data" --database primary --type python

Creates seeds/V0001__generate_sample_data.py with a seed(connection, session) function.

The seed() function receives both a raw SQLAlchemy Connection and an ORM Session bound to the same transaction:

# Using raw connection
def seed(connection, session):
    for i in range(100):
        connection.execute(
            "INSERT INTO users (name) VALUES (:name)",
            {"name": f"user_{i}"},
        )

# Using ORM session
def seed(connection, session):
    for i in range(100):
        session.add(User(name=f"user_{i}"))
    session.flush()

Applying Seeds

Apply all pending seeds (file + code seeds are both discovered):

$ dbwarden seed apply --database primary

Apply a specific version:

$ dbwarden seed apply --database primary --version 0003

Apply to all databases:

$ dbwarden seed apply --all

Dry Run

Preview what would be applied without executing:

$ dbwarden seed apply --database primary --dry-run

Auto-Apply After Migrations

Configure seeds to be applied automatically after each dbwarden migrate:

database_config(
    database_name="primary",
    default=True,
    database_type="sqlite",
    database_url_sync="sqlite:///./app.db",
    model_paths=["models"],
    auto_apply_seeds=True,
)

Or apply seeds once after a migration without changing config:

$ dbwarden migrate --apply-seeds

Listing Seeds

$ dbwarden seed list --database primary

Output:

Seeds for database 'primary':
  V0001  seed_initial_users                   applied  2025-06-01 10:00:00
  C0001  initial countries                    pending   (code seed)

List across all databases:

$ dbwarden seed list --all

Pruning Orphaned Records

Remove tracking records for seed files that no longer exist on disk:

$ dbwarden seed list --prune

Rolling Back Seeds

Rollback removes the applied tracking record, allowing the seed to be re-applied. It does not reverse data changes.

# Rollback the most recent seed
$ dbwarden seed rollback --database primary

# Rollback a specific number
$ dbwarden seed rollback --database primary --count 2

# Rollback to a specific version
$ dbwarden seed rollback --database primary --to-version 0002

Seed Tracking Table

DBWarden tracks applied seeds in _dbwarden_seeds (configurable via seed_table):

Column Description
version 4-digit seed version (V0001) or code seed ID (C0001)
description Human-readable description
filename File path or code seed identifier
seed_type sql, python, or code
checksum SHA-256 hash of file/class source
applied_at Timestamp of application

The tracking table is created automatically on first seed apply. Each version can only be applied once until rolled back.

Checksum Drift

When a seed file has been modified since it was last applied, DBWarden emits a warning:

Warning: Seed V0001 has been modified since last apply (checksum mismatch).

This helps detect accidental changes to already-applied seeds.


Exporting Seeds for Production

Code seeds require your full application environment to execute. For Dockerized deployments where you don't want to copy the application code into a container just to seed data, use dbwarden seed export to produce stateless ROC (runs-on-change) SQL files.

$ dbwarden seed export --database clickhouse
This writes seeds/ROC__clickhouse__code_seeds.sql containing INSERT ... ON CONFLICT statements rendered in the target database dialect. In production, apply with:
$ dbwarden seed apply --database clickhouse

Because the file is ROC, updating the code seed and re-exporting produces a new content checksum, which triggers re-application. The ON CONFLICT DO UPDATE clause handles updating existing rows; no need to delete and recreate.

Non-handled problems:

  • Rows removed from a code seed are not automatically deleted in the target database
  • Logic seeds that depend on other logic seeds' output are not supported (preceding row-based seeds are pre-loaded, but logic-to-logic ordering is not)
  • Non-deterministic generate() methods (e.g. using datetime.now()) produce a new checksum every export, causing re-apply on every deploy: acceptable for idempotent upserts, wasteful for pure inserts. Use deterministic generate() where possible

Dialect requirement: Exporting requires the same dialect packages as connecting to that database. For ClickHouse, install clickhouse-sqlalchemy. Missing packages produce a clear error at export time.

Seeds and Migrations

Seeds are independent from migrations. You can:

  • Apply migrations without seeds
  • Apply seeds without migrations
  • Mix both in your workflow

The dbwarden status command and the FastAPI GET /status endpoint report both pending migrations and pending seeds.


Seeds in FastAPI

The DBWardenRouter includes seed status in its GET /status response:

{
  "databases": {
    "primary": {
      "status": "ok",
      "connected": true,
      "pending_migrations": 0,
      "applied_migrations": 5,
      "pending_seeds": 2,
      "applied_seeds": 1,
      "lock_active": false,
      "error": null
    }
  }
}

See FastAPI Reference for details.

See also: Cookbook: Seeds