Refactor to a unified v3 to v4 master migrator #284
ParticularlyPythonicBS wants to merge 5 commits into unstable from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. Use the following commands to manage reviews, or the checkboxes below for quick actions.
Walkthrough: Consolidates migration logic into a new master v3→v4 migrator, removes legacy per-version migration scripts, refactors the CLI to call the master API (single-file and directory batch modes), adds a parameterized batch runner, and updates tests and fixtures to exercise the new flows.
Sequence Diagram(s)
sequenceDiagram
participant User
participant CLI as Temoa CLI
participant Runner as run_all_v4_migrations.run_migrations
participant Master as master_migration
participant Old as v3 Source
participant Schema as v4 Schema
participant New as v4 Target
User->>CLI: migrate <input_path> --schema <schema> --output <out> [--type]
alt input is directory
CLI->>Runner: run_migrations(input_dir, migration_script, schema_path, ...)
Runner->>Master: invoke master_migration per file (type derived from suffix)
else single file
CLI->>Master: migrate_database(...) or migrate_sql_dump(...)
end
Master->>Schema: load v4 schema
Master->>Old: open/connect/read v3 source
Master->>New: create/populate v4 target
Note over Master: phases — operator-table transforms, mapped table copies, custom transforms, metadata/vacuum
Master-->>CLI: return status / summary
CLI->>User: print completion / errors
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
🚥 Pre-merge checks: ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@temoa/cli.py`:
- Around line 454-459: The directory migration path calls run_migrations(...)
but doesn't pass the CLI silent flag so scan/progress output remains printed;
update the CLI call to forward the silent flag (e.g., run_migrations(...,
dry_run=False, silent=silent)) and, if needed, add a corresponding silent
parameter to the run_migrations function signature and its internal logging so
it suppresses progress/scan prints when silent is true (or alternately make
run_migrations accept a logger parameter and respect its level).
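The flag-forwarding fix above amounts to threading one parameter a level down. A minimal sketch, assuming a simplified `run_migrations` signature (the real Temoa function takes more arguments, and `cli_migrate` is a hypothetical stand-in for the CLI handler):

```python
from pathlib import Path


def run_migrations(
    input_dir: Path, dry_run: bool = False, silent: bool = False
) -> list[str]:
    """Migrate every candidate file in input_dir, printing progress
    only when silent is False."""
    processed: list[str] = []
    for path in sorted(input_dir.glob('*.sqlite')):
        if not silent:
            print(f'Migrating {path.name}...')
        processed.append(path.name)
    return processed


def cli_migrate(input_dir: Path, silent: bool = False) -> list[str]:
    """CLI entry point: forward the user's silent flag instead of dropping it."""
    return run_migrations(input_dir, dry_run=False, silent=silent)
```

With this shape, `silent=True` at the CLI suppresses all per-file progress output without changing the batch-processing behavior.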
In `@temoa/utilities/master_migration.py`:
- Around line 375-388: The current migrate_database opens output_path and writes
the v4 schema before running execute_v3_to_v4_migration, risking a half-written
DB if migration fails; change migrate_database to create and open a temporary
database file (e.g., via tempfile.NamedTemporaryFile or
tempfile.TemporaryDirectory), apply the schema and run
execute_v3_to_v4_migration using con_old and con_new pointed at the temp file,
commit/vacuum and close both connections, then atomically replace the final
output_path with the temp file (os.replace) so no partially migrated DB is left
behind on failure.
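The write-to-temp-then-swap pattern described above can be sketched as follows. This is a simplified illustration, not the actual Temoa migrator: the schema/data work is replaced by a placeholder table, and the temp file is created next to the target so `os.replace()` never crosses a filesystem boundary:

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def migrate_database(source_path: Path, output_path: Path) -> None:
    """Migrate into a temp file, then move it into place atomically so a
    failed run never leaves a half-written database at output_path."""
    # Create the temp file next to the target so os.replace() stays on one
    # filesystem; a cross-device rename would raise OSError (EXDEV).
    fd, tmp = tempfile.mkstemp(suffix='.sqlite', dir=output_path.parent)
    os.close(fd)  # sqlite3 reopens the file by path
    try:
        con_old = sqlite3.connect(source_path)
        con_new = sqlite3.connect(tmp)
        try:
            # Placeholder for: apply the v4 schema, then copy/transform rows.
            con_new.execute('CREATE TABLE migrated (src TEXT)')
            con_new.execute('INSERT INTO migrated VALUES (?)', (str(source_path),))
            con_new.commit()
        finally:
            con_old.close()
            con_new.close()
        os.replace(tmp, output_path)  # atomic swap into the final name
    except Exception:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
```

On any failure the temp file is deleted and `output_path` is untouched, which is the property the review comment asks for.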
- Around line 240-242: The SQL string in the loop using con_old.execute with an
f-string and row[0]/row[1] is vulnerable to syntax errors when values contain
quotes; change the query to use parameterized SQL (pass the region and tech as
parameters to con_old.execute rather than interpolating into the string) in the
block where the vintage is selected (the for v in con_old.execute(...) loop) so
the DB driver safely escapes values and avoids broken SQL.
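The parameterization point can be shown in a few lines; the table and column names here are illustrative, not the real Temoa schema:

```python
import sqlite3

con_old = sqlite3.connect(':memory:')
con_old.execute('CREATE TABLE Efficiency (region TEXT, tech TEXT, vintage INTEGER)')
# A value containing an apostrophe would break naive f-string interpolation.
con_old.execute('INSERT INTO Efficiency VALUES (?, ?, ?)', ("R'1", 'T1', 2030))

region, tech = "R'1", 'T1'
# Parameterized query: the driver escapes the values, so quotes are safe.
vintages = [
    v
    for (v,) in con_old.execute(
        'SELECT vintage FROM Efficiency WHERE region = ? AND tech = ?',
        (region, tech),
    )
]
```

The equivalent f-string query would emit `WHERE region = 'R'1'`, which is a SQL syntax error.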
In `@temoa/utilities/run_all_v4_migrations.py`:
- Around line 39-50: run_migrations currently calls sys.exit on validation
failures which prevents callers (like temoa.cli) from handling errors; change
the three input checks in run_migrations to raise descriptive exceptions (e.g.,
FileNotFoundError or ValueError) instead of printing and calling sys.exit, and
update the summary/failure reporting logic at the end of the function to either
raise a consolidated exception on overall failure or return a structured result
object (status + error details) so the caller can decide how to handle
termination. Ensure error messages include the invalid Path and keep the
function signature (run_migrations) unchanged so temoa.cli can catch and handle
the exceptions or inspect the returned result.
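A minimal sketch of the raise-instead-of-exit shape, with a structured result the caller can inspect (the real `run_migrations` signature and return contract differ; this only illustrates the error-handling change):

```python
from pathlib import Path


def run_migrations(input_dir: Path, schema_path: Path, silent: bool = True) -> dict:
    """Validate inputs by raising, so a caller such as temoa.cli can catch
    the error instead of the process dying inside sys.exit()."""
    if not input_dir.is_dir():
        raise FileNotFoundError(f'Input directory does not exist: {input_dir}')
    if not schema_path.is_file():
        raise FileNotFoundError(f'Schema file does not exist: {schema_path}')
    failed: list[str] = []
    # ... per-file migration would append failures to `failed` here ...
    return {'status': 'ok' if not failed else 'failed', 'failed_files': failed}
```

The CLI can then decide whether a validation failure is fatal, while programmatic callers get an exception with the offending path in the message.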
- Around line 86-98: The migration subprocess invocation hardcodes 'python3' in
the migration_cmd list, which can bypass the active virtualenv or fail on
platforms without that binary; replace the literal with sys.executable so the
current Python interpreter is used. Update the construction of migration_cmd
(where migration_script, target_file, schema_path, temp_output_file and mig_type
are assembled) to use sys.executable, and ensure you import sys at top of the
module if not already present; leave the call to run_command(result =
run_command(migration_cmd, cwd=Path.cwd())) unchanged.
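The `sys.executable` substitution is a one-line change; a self-contained sketch:

```python
import subprocess
import sys

# sys.executable is the interpreter currently running, so subprocesses stay
# inside the active virtualenv; a literal 'python3' may resolve to a
# different environment or not exist at all (e.g. on Windows).
migration_cmd = [sys.executable, '-c', 'print("running in the same interpreter")']
result = subprocess.run(migration_cmd, capture_output=True, text=True, check=False)
```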
- Around line 77-79: The temp file descriptors returned by tempfile.mkstemp()
are being discarded and leaked when creating temp_output_file and
original_backup_file; capture the returned file descriptor and path (fd, path =
mkstemp(...)), call os.close(fd) immediately to close the open descriptor, then
wrap Path(path) for temp_output_file and original_backup_file so the files can
be reopened later (especially on Windows); keep mig_type logic unchanged.
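The fd-handling fix looks like this in isolation:

```python
import os
import tempfile
from pathlib import Path

# mkstemp() returns (fd, path). Discarding the fd leaks an open descriptor,
# and on Windows the open handle can block later reopening or deletion.
fd, raw = tempfile.mkstemp(suffix='.sqlite')
os.close(fd)  # close immediately; only the path is needed
temp_output_file = Path(raw)

# The file can now be reopened freely, e.g. by a migration subprocess.
temp_output_file.write_text('placeholder')
```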
In `@tests/test_cli.py`:
- Around line 175-189: The test test_cli_migrate_accepts_directory_input only
exercises the empty-directory fast path; create a migratable fixture inside
dummy_dir (e.g., add a small .sql or .sqlite file in the dummy_dir before
calling runner.invoke(app, args)) so the batch migration flow is exercised;
update any expectations if the presence of the file changes output (keep the
assertions that look for 'Batch migrating directory' and include the new file
name in the stdout check) and ensure the fixture file is written using Path
methods on dummy_dir so the test still cleans up correctly.
In `@tests/test_v4_migration.py`:
- Around line 17-24: The test currently only creates a v3.1 SQLite DB (db_v3_1
using SCHEMA_V3_1 and MOCK_DATA) so the direct v3→v4 migration path isn't
exercised; update the master-migration test to be parametrized to run against
both the existing v3.1 fixture and a real v3 fixture (e.g., add SCHEMA_V3 and
MOCK_DATA_V3 or a v3_fixture) by creating a second temporary DB (db_v3_real)
from the real v3 schema/data and invoking the same migration/assertion logic for
both fixtures; ensure the test harness (the parametrized test or loop)
references db_v3_1 and the new db_v3_real so the migrator code paths for v3→v4
are covered.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b4213a37-2f78-4dcc-a0a2-096bef61a766
📒 Files selected for processing (5)
- temoa/cli.py
- temoa/utilities/master_migration.py
- temoa/utilities/run_all_v4_migrations.py
- tests/test_cli.py
- tests/test_v4_migration.py
Actionable comments posted: 1
♻️ Duplicate comments (1)
tests/test_cli.py (1)
175-189: ⚠️ Potential issue | 🟡 Minor: Directory migration test still misses the real batch-processing path.
This still exercises the empty-directory early-exit branch only, so it doesn’t validate per-file migration behavior in directory mode.
Suggested test adjustment

```diff
 def test_cli_migrate_accepts_directory_input(tmp_path: Path) -> None:
     """Test that the migrate command accepts a directory as input for batch processing."""
     dummy_dir = tmp_path / 'my_dummy_dir'
     dummy_dir.mkdir()
+    fixture_sql = dummy_dir / UTOPIA_SQL_FIXTURE.name
+    shutil.copy2(UTOPIA_SQL_FIXTURE, fixture_sql)
     args = ['migrate', str(dummy_dir)]
     result = runner.invoke(app, args)
     assert result.exit_code == 0
     # Normalize whitespace to handle platform-specific line breaks from rich.print()
     normalized_output = ' '.join(result.stdout.split())
     assert 'Batch migrating directory' in normalized_output
     # Check for the directory name in the original output (paths may be split across lines)
     assert 'my_dummy_dir' in result.stdout
-    assert 'No .sql, .sqlite, or .db files found' in result.stdout
+    assert fixture_sql.name in result.stdout
+    assert 'No .sql, .sqlite, or .db files found' not in result.stdout
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_cli.py` around lines 175 - 189, The test test_cli_migrate_accepts_directory_input currently only exercises the empty-directory early-exit path; modify it to create at least one database/sql file inside dummy_dir (e.g., my_dummy_dir/test.sql or test.db) with minimal content so the CLI runs the per-file batch-processing path, then invoke runner.invoke(app, args) and update assertions to check for per-file migration output (look for messages emitted by the migrate command such as file-specific "migrated" or "processing" text) instead of the "No .sql, .sqlite, or .db files found" message; locate this change around the dummy_dir creation and the runner.invoke(app, args) call in test_cli_migrate_accepts_directory_input.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_cli.py`:
- Around line 252-259: The test's path assertion is too weak because it only
checks tmp_path.name; update the assertion to assert the normalized full
tmp_path string (e.g., tmp_path.as_posix() or str(tmp_path)) appears in the
normalized output instead of tmp_path.name, using the same normalization
approach you applied to result.stdout (clean_output/normalized_output) so path
separators and whitespace are normalized consistently; locate the assertions
around clean_output / normalized_output and replace the tmp_path.name check with
a membership check for the normalized full tmp_path.
---
Duplicate comments:
In `@tests/test_cli.py`:
- Around line 175-189: The test test_cli_migrate_accepts_directory_input
currently only exercises the empty-directory early-exit path; modify it to
create at least one database/sql file inside dummy_dir (e.g.,
my_dummy_dir/test.sql or test.db) with minimal content so the CLI runs the
per-file batch-processing path, then invoke runner.invoke(app, args) and update
assertions to check for per-file migration output (look for messages emitted by
the migrate command such as file-specific "migrated" or "processing" text)
instead of the "No .sql, .sqlite, or .db files found" message; locate this
change around the dummy_dir creation and the runner.invoke(app, args) call in
test_cli_migrate_accepts_directory_input.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0d07b330-5194-4407-862b-3f4c970cdd09
📒 Files selected for processing (1)
tests/test_cli.py
```python
clean_output = re.sub(r'\s+', '', result.stdout)

assert 'SQL dump migration completed' in normalized_output
assert 'Warning: Input directory' in normalized_output
assert str(non_writable_mock_parent) in normalized_output
assert 'is not writable.' in normalized_output
assert 'Saving output to current directory:' in normalized_output
assert str(tmp_path) in normalized_output
assert 'mock_non_writable_input_parent' in clean_output
assert 'is not writable' in normalized_output
assert 'Saving output to current directory' in normalized_output
assert tmp_path.name in clean_output
```
Path assertion became too weak after normalization.
Line 259 now checks only tmp_path.name, which can match unrelated output. Assert the normalized full path instead.
Suggested assertion fix

```diff
- assert tmp_path.name in clean_output
+ assert str(tmp_path).replace(' ', '') in clean_output
```

📝 Committable suggestion
‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
clean_output = re.sub(r'\s+', '', result.stdout)
assert 'SQL dump migration completed' in normalized_output
assert 'Warning: Input directory' in normalized_output
assert 'mock_non_writable_input_parent' in clean_output
assert 'is not writable' in normalized_output
assert 'Saving output to current directory' in normalized_output
assert str(tmp_path).replace(' ', '') in clean_output
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_cli.py` around lines 252 - 259, The test's path assertion is too
weak because it only checks tmp_path.name; update the assertion to assert the
normalized full tmp_path string (e.g., tmp_path.as_posix() or str(tmp_path))
appears in the normalized output instead of tmp_path.name, using the same
normalization approach you applied to result.stdout
(clean_output/normalized_output) so path separators and whitespace are
normalized consistently; locate the assertions around clean_output /
normalized_output and replace the tmp_path.name check with a membership check
for the normalized full tmp_path.
Actionable comments posted: 7
♻️ Duplicate comments (1)
tests/test_cli.py (1)
257-264: 🧹 Nitpick | 🔵 Trivial: Path assertion remains weak after normalization.
Line 264 checks only `tmp_path.name` in `clean_output`, which could match unrelated substrings. Consider asserting a more specific path pattern:

```diff
- assert tmp_path.name in clean_output
+ # Normalize path for comparison (handles Windows backslashes)
+ normalized_tmp_path = str(tmp_path).replace('\\', '/').replace(' ', '')
+ assert normalized_tmp_path in clean_output or tmp_path.name in clean_output
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_cli.py` around lines 257 - 264, The assertion using only tmp_path.name is too weak and may match unrelated substrings; update the test in tests/test_cli.py to assert a more specific path presence by checking the full temporary path (e.g., str(tmp_path) or tmp_path.as_posix()) or by asserting the basename with surrounding separators (e.g., path separator + tmp_path.name) in clean_output/normalized_output; locate the assertions around variables clean_output, normalized_output and result and replace the tmp_path.name check with a check for the full tmp_path string or a separator-anchored pattern to ensure the actual directory path is present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@temoa/utilities/master_migration.py`:
- Around line 124-375: The function execute_v3_to_v4_migration is too large and
should be split into focused helper functions to reduce cyclomatic complexity:
extract the operator-added logic into _migrate_operator_tables(con_old,
con_new), the standard/copy logic into _migrate_standard_tables(con_old,
con_new), the LoanLifetimeTech logic into _migrate_loan_lifetime(con_old,
con_new), all time-related aggregates into _migrate_time_tables(con_old,
con_new), and CapacityFactorProcess into _migrate_capacity_factor(con_old,
con_new); each helper should encapsulate its current try/except and return the
number of rows it migrated (or accept and update a mutable total), and
execute_v3_to_v4_migration should call these helpers in sequence (keeping the
final metadata updates) and sum/print the total.
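The decomposition suggested above can be sketched as a thin orchestrator over per-phase helpers, each returning its row count. This is a toy illustration of the structure only; the helper bodies stand in for the real operator-table, standard-table, and time-table logic:

```python
import sqlite3


def _migrate_standard_tables(
    con_old: sqlite3.Connection, con_new: sqlite3.Connection
) -> int:
    """Copy a table whose shape is unchanged; return rows migrated."""
    rows = 0
    for (val,) in con_old.execute('SELECT val FROM Sample'):
        con_new.execute('INSERT INTO sample (val) VALUES (?)', (val,))
        rows += 1
    return rows


def execute_v3_to_v4_migration(
    con_old: sqlite3.Connection, con_new: sqlite3.Connection
) -> int:
    """Thin orchestrator: run each phase helper in order, sum row counts."""
    phases = (_migrate_standard_tables,)  # real code would list every phase
    total = sum(phase(con_old, con_new) for phase in phases)
    con_new.commit()
    return total
```

Each helper can keep its own try/except, so the orchestrator stays short enough to read at a glance.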
- Around line 111-114: The function creates an intermediate variable mapped_once
just to apply re.sub and lower(), so simplify by inlining: replace the
mapped_once assignments and return with a single return that applies
re.sub(r'__+', '_', ''.join(out)).lower() (or directly return re.sub(...,
to_snake_case(token)).lower() if appropriate) so you remove the unnecessary
mapped_once variable; references: mapped_once, to_snake_case, token, re.sub.
- Around line 408-428: The migrate_sql_dump function currently writes the SQL
dump directly to output_path which can leave a partial file on failure; change
it to write to a temporary file in the same directory (e.g., using
tempfile.NamedTemporaryFile or mkstemp with dir=output_path.parent and a unique
suffix), flush and fsync the temp file after writing the iterdump output from
con_new_in_memory, close it, then atomically replace output_path with
os.replace(temp_path, output_path); ensure the temp is removed on exceptions and
permissions/umask are respected so the final state matches the original
behavior.
- Around line 145-172: The migration builds new_cols and computes op_index =
new_cols.index('operator') but doesn't defensively assert the result; add a
defensive assertion after computing op_index to ensure the operator column is
present and within bounds (e.g., assert 0 <= op_index < len(new_cols), include
new_name in the message) so migrations for tables in OPERATOR_ADDED_TABLES fail
fast on unexpected schema changes rather than producing incorrect row mappings;
reference op_index, new_cols, new_name and OPERATOR_ADDED_TABLES when adding
this check.
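The defensive check is small; a sketch with illustrative column names (the real `new_cols` would come from inspecting the v4 schema):

```python
# Illustrative values standing in for the schema-derived column list.
new_name = 'limit_capacity'
new_cols = ['region', 'period', 'tech', 'operator', 'value', 'units']

op_index = new_cols.index('operator')  # already raises ValueError if absent
# Fail fast with context if a future schema change moves or drops the column.
assert 0 <= op_index < len(new_cols), (
    f"'operator' index {op_index} out of bounds for table {new_name}"
)
```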
In `@temoa/utilities/run_all_v4_migrations.py`:
- Around line 136-140: The raise RuntimeError when failed_files is truthy makes
the following else unnecessary; remove the else block and unindent the print so
that after the if failed_files: raise RuntimeError(...) branch, you check if not
silent: print('All files migrated successfully.') at the same indentation level.
Update the block that references failed_files and silent (the RuntimeError raise
and the print statement) accordingly.
In `@tests/test_v4_migration.py`:
- Around line 19-25: pytest.mark.parametrize currently passes the parameter
names as a single string; change its first argument to a tuple of parameter
names (e.g., ('schema_file', 'mock_data_file')) so pytest recognizes each
parameter correctly; update the call where pytest.mark.parametrize is used
around the test function referencing SCHEMA_V3, MOCK_DATA_V3, SCHEMA_V3_1, and
MOCK_DATA_V3_1 to use the tuple form.
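For reference, pytest also accepts the comma-separated string form for argnames, so the tuple form is a readability preference rather than a correctness fix. A sketch with hypothetical fixture constants standing in for the real schema/data paths:

```python
import pytest

# Hypothetical stand-ins for the real schema/data fixture constants.
SCHEMA_V3, MOCK_DATA_V3 = 'schema_v3.sql', 'migration_v3_mock.sql'
SCHEMA_V3_1, MOCK_DATA_V3_1 = 'schema_v3_1.sql', 'migration_v3_1_mock.sql'


@pytest.mark.parametrize(
    ('schema_file', 'mock_data_file'),  # tuple form: one entry per parameter
    [
        (SCHEMA_V3, MOCK_DATA_V3),
        (SCHEMA_V3_1, MOCK_DATA_V3_1),
    ],
)
def test_v4_migrations(schema_file: str, mock_data_file: str) -> None:
    assert schema_file.endswith('.sql')
    assert mock_data_file.endswith('.sql')
```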
In `@tests/testing_data/migration_v3_mock.sql`:
- Around line 1-30: Add a representative operator-constraint row so the
operator-injection and period-to-vintage branches in master_migration.py (the
OPERATOR_ADDED_TABLES flow) are exercised; e.g., insert a single test row into
MinCapacity or EmissionLimit in the mock SQL (use the same region 'R1', tech
'T1', a vintage/period value present in TimePeriod like 2030, and matching
commodity/units) so the migration code paths that handle operator-added tables
and period->vintage translation (operator-injection and period-to-vintage around
master_migration.py lines 135-172) are covered. Ensure the inserted column names
match the schema for the chosen table (MinCapacity or EmissionLimit) and use
values consistent with existing test rows (region 'R1', tech 'T1', vintage 2030)
so the migration will process it.
---
Duplicate comments:
In `@tests/test_cli.py`:
- Around line 257-264: The assertion using only tmp_path.name is too weak and
may match unrelated substrings; update the test in tests/test_cli.py to assert a
more specific path presence by checking the full temporary path (e.g.,
str(tmp_path) or tmp_path.as_posix()) or by asserting the basename with
surrounding separators (e.g., path separator + tmp_path.name) in
clean_output/normalized_output; locate the assertions around variables
clean_output, normalized_output and result and replace the tmp_path.name check
with a check for the full tmp_path string or a separator-anchored pattern to
ensure the actual directory path is present.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 296aa71e-421b-45c9-bc17-8a56fd837bdf
📒 Files selected for processing (9)
- temoa/cli.py
- temoa/utilities/db_migration_v3_1_to_v4.py
- temoa/utilities/db_migration_v3_to_v3_1.py
- temoa/utilities/master_migration.py
- temoa/utilities/run_all_v4_migrations.py
- temoa/utilities/sql_migration_v3_1_to_v4.py
- tests/test_cli.py
- tests/test_v4_migration.py
- tests/testing_data/migration_v3_mock.sql
💤 Files with no reviewable changes (3)
- temoa/utilities/db_migration_v3_to_v3_1.py
- temoa/utilities/db_migration_v3_1_to_v4.py
- temoa/utilities/sql_migration_v3_1_to_v4.py
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
temoa/utilities/run_all_v4_migrations.py (1)
26-36: ⚠️ Potential issue | 🟡 Minor: `silent=True` still emits subprocess logs.
`run_migrations()` now looks like a reusable programmatic entry point, but `run_command()` still prints the command and captured stdout/stderr unconditionally. Silent callers will still get per-file noise.

🔇 Minimal fix

```diff
 def run_command(
-    cmd: list[str], cwd: Path | None = None, capture_output: bool = True
+    cmd: list[str],
+    cwd: Path | None = None,
+    capture_output: bool = True,
+    silent: bool = False,
 ) -> subprocess.CompletedProcess[str]:
     """Helper to run shell commands."""
-    print(f'Executing: {" ".join(cmd)}')
+    if not silent:
+        print(f'Executing: {" ".join(cmd)}')
     result = subprocess.run(cmd, cwd=cwd, capture_output=capture_output, text=True, check=False)
-    if result.returncode != 0 and capture_output:
+    if result.returncode != 0 and capture_output and not silent:
         print(f'COMMAND FAILED (exit code {result.returncode}):')
         print('STDOUT:\n', result.stdout)
         print('STDERR:\n', result.stderr)
     return result
@@
-    result = run_command(migration_cmd, cwd=Path.cwd())
+    result = run_command(migration_cmd, cwd=Path.cwd(), silent=silent)
```

Also applies to: 39-41
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@temoa/utilities/run_all_v4_migrations.py` around lines 26 - 36, run_command currently prints the executed command and captured stdout/stderr unconditionally which breaks callers that expect silence (e.g., run_migrations). Add a boolean parameter (e.g., silent: bool = False) to run_command and use it to suppress all prints: only print the "Executing: ..." line and the failure diagnostics when silent is False; keep existing behavior otherwise. Update callers (notably run_migrations) to pass silent=True when they want no output. Ensure function signature (run_command) and all call sites are updated accordingly to avoid regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@temoa/utilities/master_migration.py`:
- Around line 404-410: The migrate_database function currently calls
sqlite3.connect(source_path) which will silently create a new empty DB if
source_path is missing; add an explicit validation at the top of
migrate_database to verify source_path.exists() and source_path.is_file() (and
optionally that schema_path exists) and raise a clear FileNotFoundError or
ValueError if the input file is missing, before calling
sqlite3.connect(source_path), so a typo in --input fails fast instead of
producing an empty migration.
- Around line 405-407: migrate_database currently creates the temp sqlite file
in the system temp dir (tempfile.mkstemp) and then calls os.replace(temp_path,
output_path), which can fail with EXDEV when output_path is on a different
filesystem; update migrate_database to create the temp file in
output_path.parent (pass dir=output_path.parent to mkstemp or use
NamedTemporaryFile in that dir) so temp_path and output_path are on the same
filesystem, and wrap the os.replace(temp_path, output_path) call in the same
exception-handling/cleanup pattern used by migrate_sql_dump (catch exceptions,
remove temp_path on failure, and re-raise or log appropriately) to ensure atomic
replacement and proper cleanup of temp files.
In `@tests/test_v4_migration.py`:
- Around line 19-26: The test_v4_migrations parametrized fixture currently seeds
V3/V3.1 data but never asserts the operator-table migration path; update the
test to explicitly verify that _migrate_operator_tables() produced the expected
rows by asserting the migrated limit_capacity and limit_emission entries and
that each row contains the injected operator values (match against
SCHEMA_V3/SCHEMA_V3_1 mock data expectations); locate the assertions near the
existing checks for time-table/capacity-factor/metadata and add checks for
presence, correct values of limit_capacity.limit and limit_emission.limit (or
equivalent column names) and their associated operator identifiers.
---
Outside diff comments:
In `@temoa/utilities/run_all_v4_migrations.py`:
- Around line 26-36: run_command currently prints the executed command and
captured stdout/stderr unconditionally which breaks callers that expect silence
(e.g., run_migrations). Add a boolean parameter (e.g., silent: bool = False) to
run_command and use it to suppress all prints: only print the "Executing: ..."
line and the failure diagnostics when silent is False; keep existing behavior
otherwise. Update callers (notably run_migrations) to pass silent=True when they
want no output. Ensure function signature (run_command) and all call sites are
updated accordingly to avoid regressions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: be199f5f-00b8-4e73-80af-e41be7d373a9
📒 Files selected for processing (4)
- temoa/utilities/master_migration.py
- temoa/utilities/run_all_v4_migrations.py
- tests/test_v4_migration.py
- tests/testing_data/migration_v3_mock.sql
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@temoa/utilities/master_migration.py`:
- Around line 179-190: The set custom_handled_old_tables currently unions
OPERATOR_ADDED_TABLES.keys() but intentionally does not list v3.1 names like
LimitCapacity/LimitEmission so they fall through to _migrate_standard_tables()
because those v3.1 tables already include the operator column; add a short
clarifying comment next to the custom_handled_old_tables definition explaining
that v3.1 operator tables (e.g., LimitCapacity, LimitEmission) are intentionally
handled by the standard migration path and thus do not need to be listed in
OPERATOR_ADDED_TABLES.keys() to avoid confusion.
- Around line 430-439: The try block currently closes con_old and con_new before
os.replace(), then the finally block closes them again; remove the explicit
con_old.close() and con_new.close() calls from the try section so the finally
block is the single cleanup path; keep the existing exception handling that
removes temp_path on error and ensure the finally block still unconditionally
(or defensively check for None) closes con_old and con_new to guarantee proper
cleanup.
In `@tests/test_v4_migration.py`:
- Line 30: Rename the variable db_v3_1 in tests/test_v4_migration.py to a
neutral name (e.g., test_db or source_db) to reflect the parametrized test that
runs against both v3 and v3.1 schemas, and update every occurrence (assignments,
path uses, and any function calls or assertions referencing db_v3_1) to the new
name so references like db_v3_1 = tmp_path / 'test_db.sqlite' and subsequent
uses are consistently replaced.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7bb0f9f3-64fc-4d82-a190-8a5fdaff6374
📒 Files selected for processing (4)
- temoa/utilities/master_migration.py
- temoa/utilities/run_all_v4_migrations.py
- tests/test_v4_migration.py
- tests/testing_data/migration_v3_1_mock.sql
```python
custom_handled_old_tables = {
    'MetaData',
    'MetaDataReal',
    'TimeSeason',
    'time_season',
    'time_of_day',
    'time_season_sequential',
    'TimeSeasonSequential',
    'TimeSegmentFraction',
    'LoanLifetimeTech',
    'CapacityFactorProcess',
}.union(OPERATOR_ADDED_TABLES.keys())
```
🧹 Nitpick | 🔵 Trivial
Confirm v3.1 operator tables are intentionally handled via standard table migration.
The custom_handled_old_tables set excludes tables in OPERATOR_ADDED_TABLES.keys() (v3 names like MinCapacity, EmissionLimit). However, v3.1 table names (LimitCapacity, LimitEmission) are not excluded, meaning they will be processed by _migrate_standard_tables().
This appears correct—v3.1 tables already have the operator column, so they can be copied directly without transformation. However, adding a comment would clarify this intentional design:
```python
# Note: v3.1 tables (LimitCapacity, LimitEmission, etc.) are handled here
# since they already have the operator column and don't need transformation.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@temoa/utilities/master_migration.py` around lines 179 - 190, The set
custom_handled_old_tables currently unions OPERATOR_ADDED_TABLES.keys() but
intentionally does not list v3.1 names like LimitCapacity/LimitEmission so they
fall through to _migrate_standard_tables() because those v3.1 tables already
include the operator column; add a short clarifying comment next to the
custom_handled_old_tables definition explaining that v3.1 operator tables (e.g.,
LimitCapacity, LimitEmission) are intentionally handled by the standard
migration path and thus do not need to be listed in OPERATOR_ADDED_TABLES.keys()
to avoid confusion.
```python
        con_old.close()
        con_new.close()
        os.replace(temp_path, output_path)
    except Exception:
        if temp_path.exists():
            os.remove(temp_path)
        raise
    finally:
        con_old.close()
        con_new.close()
```
🧹 Nitpick | 🔵 Trivial
Potential double-close of database connections.
The finally block closes con_old and con_new unconditionally, but these connections are also closed in the try block (lines 430-431) before os.replace(). If migration succeeds, the connections are closed twice. While SQLite handles double-close gracefully, this is unnecessary.
💡 Suggested fix: Remove explicit close in try block

```diff
     con_new.commit()
     con_new.execute('VACUUM;')
     con_new.execute('PRAGMA foreign_keys = 1;')
-    con_old.close()
-    con_new.close()
     os.replace(temp_path, output_path)
 except Exception:
     if temp_path.exists():
         os.remove(temp_path)
     raise
 finally:
     con_old.close()
     con_new.close()
```

Let the finally block handle all cleanup.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@temoa/utilities/master_migration.py` around lines 430 - 439, The try block
currently closes con_old and con_new before os.replace(), then the finally block
closes them again; remove the explicit con_old.close() and con_new.close() calls
from the try section so the finally block is the single cleanup path; keep the
existing exception handling that removes temp_path on error and ensure the
finally block still unconditionally (or defensively check for None) closes
con_old and con_new to guarantee proper cleanup.
| """Test both SQL and SQLite master migrators.""" | ||
|
|
||
| # 1. Create SQLite DB | ||
| db_v3_1 = tmp_path / 'test_db.sqlite' |
🧹 Nitpick | 🔵 Trivial
Rename db_v3_1 to reflect parametrized usage.
The variable db_v3_1 is misleading since this test is now parametrized to run with both v3 and v3.1 schemas. Consider renaming to test_db or source_db for clarity.
💡 Suggested rename

```diff
- db_v3_1 = tmp_path / 'test_db.sqlite'
- with sqlite3.connect(db_v3_1) as conn:
+ source_db = tmp_path / 'test_db.sqlite'
+ with sqlite3.connect(source_db) as conn:
```

And update all subsequent references to db_v3_1 accordingly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_v4_migration.py` at line 30, Rename the variable db_v3_1 in
tests/test_v4_migration.py to a neutral name (e.g., test_db or source_db) to
reflect the parametrized test that runs against both v3 and v3.1 schemas, and
update every occurrence (assignments, path uses, and any function calls or
assertions referencing db_v3_1) to the new name so references like db_v3_1 =
tmp_path / 'test_db.sqlite' and subsequent uses are consistently replaced.
- Created master_migration.py, which streamlines direct structure translation from v3 and v3.1 to v4.
- Refactored run_all_v4_migrations.py to expose a reusable batch migration function.
- Updated cli.py to seamlessly accept directory migrations leveraging the new master hook.
- Fixed scope and schema edge cases, ensuring robust validation across the tests.
…s, and harden testing
…xpanded test coverage
Force-pushed 206d758 to a20dc7c
Consolidates and streamlines the database migration process.
Moves from a multi-step v3 -> v3.1 -> v4 migration to a master migrator script that directly manages structural transformations to the v4 schema, bypassing intermediate states.
Additionally, the command-line interface has been updated to support thread-safe batch directory migrations out of the box.
Closes #197
Summary by CodeRabbit
- New Features
- Refactor
- Bug Fixes
- Tests