Task/Standardize development environment with uv and ruff#34
Archit381 merged 50 commits into AOSSIE-Org:main from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
Walkthrough: Adds a CI workflow using uv and Ruff, updates project tooling (pyproject.toml), introduces …
Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Dev as Developer / CI
    participant Repo as Repository
    participant Verify as openverifiablellm.verify
    participant Utils as openverifiablellm.utils
    participant FS as Filesystem / Manifest
    Dev->>Repo: push / open PR (triggers CI)
    Repo->>Verify: run verification CLI / tests
    Verify->>FS: check for manifest file
    alt manifest missing
        Verify->>Utils: extract_text_from_xml(input, write_manifest=True)
        Utils->>FS: write processed output + manifest
        Utils-->>Verify: return manifest + merkle data
        Verify->>Verify: run checks (emit PASS / FAIL / SKIP)
    else manifest present
        Verify->>FS: load manifest
        Verify->>Utils: verify_merkle_proof / compute checks
        Verify->>Verify: aggregate PASS / FAIL / SKIP results
    end
    Verify->>Dev: emit report (PASS / FAIL / SKIP)
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks: ✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches: 🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
openverifiablellm/verify.py (1)
88-90: ⚠️ Potential issue | 🟡 Minor

Report with only SKIP checks incorrectly returns `all_passed=True`.

The current logic returns `True` when there are no FAILed checks (line 90), regardless of whether any checks actually PASSed. A report containing only SKIPped checks would report "ALL CHECKS PASSED" in the summary (line 135) and exit with code 0 (line 504), which is misleading when verification was never attempted. This affects user-facing output, exit codes, and the JSON report. Since SKIP status is used in production code (e.g., when verifying older manifests missing merkle fields), clarify the intended behavior: should success require at least one PASS, or is "no failures" sufficient?
Add a test case for the all-SKIP scenario to make the intended behavior explicit.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openverifiablellm/verify.py` around lines 88 - 90, The all_passed property currently returns True when there are zero failures, which lets a report with only SKIPped checks be treated as success; update the Report.all_passed implementation to require at least one PASS in addition to zero FAILs (i.e., check both len(self.failed) == 0 and len(self.passed) > 0 or equivalent), reference the Report.all_passed property and the collections self.failed / self.passed (and self.skipped) in your change, and add a unit test that creates a report containing only SKIPped checks to assert all_passed is False and that the summary/exit behavior reflects a non-successful verification.
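The all-SKIP test case the review asks for can be sketched as follows. The `Report`, `CheckResult`, and `CheckStatus` shapes here are minimal stand-ins inferred from the review comments, not the project's actual classes; the point is the proposed predicate, which requires at least one PASS in addition to zero FAILs:

```python
from dataclasses import dataclass, field
from enum import Enum


class CheckStatus(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    SKIP = "SKIP"


@dataclass
class CheckResult:
    name: str
    status: CheckStatus
    detail: str = ""


@dataclass
class Report:
    checks: list = field(default_factory=list)

    def add(self, check: CheckResult) -> None:
        self.checks.append(check)

    @property
    def failed(self):
        return [c for c in self.checks if c.status is CheckStatus.FAIL]

    @property
    def passed(self):
        return [c for c in self.checks if c.status is CheckStatus.PASS]

    @property
    def all_passed(self) -> bool:
        # Success requires zero failures AND at least one actual PASS,
        # so an all-SKIP report no longer counts as a green run.
        return not self.failed and bool(self.passed)


# All-SKIP scenario: verification was never really attempted.
report = Report()
report.add(CheckResult("raw_merkle_root", CheckStatus.SKIP, "field absent"))
report.add(CheckResult("processed_merkle_root", CheckStatus.SKIP, "field absent"))
assert report.all_passed is False
```

Under this predicate a mixed PASS + SKIP report still succeeds; whether that is acceptable is exactly the policy question the review raises.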
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/ci.yml:
- Around line 5-7: The YAML has extra spaces inside the array brackets for the
branches keys; update both occurrences of the branches entries (the top-level
branches and the pull_request.branches) so the arrays use no inner spaces (e.g.,
change "[ main ]" to "[main]") to satisfy YAMLlint consistency.
- Around line 10-32: The CI job named lint-and-test currently only runs lint
steps (actions named "Run Ruff Check" and "Run Ruff Format"); either add a test
step or rename the job to avoid mismatch. To fix, either add a test step that
runs your test runner (e.g., a step named "Run Tests" that executes uv run
pytest or uv run python -m pytest) after the dependency installation and
before/after the lint steps, or change the job key/name from lint-and-test to
lint and update any references; locate the job by the job identifier
lint-and-test and the step names "Install dependencies", "Run Ruff Check", and
"Run Ruff Format" to implement the change.
- Around line 17-20: The CI step named "Install uv" currently pins the action to
astral-sh/setup-uv@v5 but sets version: "latest", which makes builds
non-reproducible; update the step to replace version: "latest" with a specific
stable uv release (e.g., "0.10.8") so the action uses a fixed uv binary
version—locate the GitHub Actions step with name "Install uv" and the uses entry
"astral-sh/setup-uv@v5" and change the version input accordingly.
In `@docs/setup.md`:
- Around line 5-12: The markdown has lint errors: add a blank line above each
heading (e.g., "## Installation" and "## Linting"), remove the trailing space at
the end of the "Auto-fix issues: `uv run ruff check . --fix`" line, and ensure
the file ends with a single trailing newline; update the docs/setup.md content
accordingly so headings are separated by blank lines, no trailing spaces remain,
and the file ends with a newline.
In `@openverifiablellm/utils.py`:
- Around line 283-284: Remove the two orphaned comment lines left in utils.py
that read "# Content before line 270 remains unchanged" and "# Entire function
definition from lines 270-314 should be deleted" — these are stale post-refactor
comments and should be deleted so only relevant code and valid comments remain;
locate them near the orphaned function area and remove them without changing
surrounding logic.
---
Outside diff comments:
In `@openverifiablellm/verify.py`:
- Around line 88-90: The all_passed property currently returns True when there
are zero failures, which lets a report with only SKIPped checks be treated as
success; update the Report.all_passed implementation to require at least one
PASS in addition to zero FAILs (i.e., check both len(self.failed) == 0 and
len(self.passed) > 0 or equivalent), reference the Report.all_passed property
and the collections self.failed / self.passed (and self.skipped) in your change,
and add a unit test that creates a report containing only SKIPped checks to
assert all_passed is False and that the summary/exit behavior reflects a
non-successful verification.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b196b3d7-f877-4e12-a3a0-4245a3104f79
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
`.github/workflows/ci.yml`, `docs/setup.md`, `examples/demo_util.py`, `examples/sample_wiki.py`, `openverifiablellm/utils.py`, `openverifiablellm/verify.py`, `pyproject.toml`, `scripts/download_dump.py`, `tests/test_util.py`, `tests/test_verify.py`
docs/setup.md (Outdated)

```markdown
## Installation
1. Install [uv](https://github.com/astral-sh/uv).
2. Run `uv sync` to install dependencies and create a virtual environment.

## Linting
We use `ruff` to maintain code quality.
- Run checks: `uv run ruff check .`
- Auto-fix issues: `uv run ruff check . --fix`
```

(No newline at end of file.)
🧹 Nitpick | 🔵 Trivial
Fix markdown formatting issues.
Per static analysis (markdownlint), there are formatting issues: missing blank lines around headings, trailing space on line 10, and missing trailing newline.
📝 Proposed fix

```diff
 # Developer Setup
 This project uses `uv` for dependency management.
 ## Installation
+
 1. Install [uv](https://github.com/astral-sh/uv).
 2. Run `uv sync` to install dependencies and create a virtual environment.
 ## Linting
-We use `ruff` to maintain code quality.
+
+We use `ruff` to maintain code quality.
 - Run checks: `uv run ruff check .`
 - Auto-fix issues: `uv run ruff check . --fix`
+- Check formatting: `uv run ruff format --check .`
+- Auto-format: `uv run ruff format .`
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```markdown
## Installation

1. Install [uv](https://github.com/astral-sh/uv).
2. Run `uv sync` to install dependencies and create a virtual environment.

## Linting

We use `ruff` to maintain code quality.
- Run checks: `uv run ruff check .`
- Auto-fix issues: `uv run ruff check . --fix`
- Check formatting: `uv run ruff format --check .`
- Auto-format: `uv run ruff format .`
```
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 5-5: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 9-9: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 10-10: Trailing spaces
Expected: 0 or 2; Actual: 1
(MD009, no-trailing-spaces)
[warning] 12-12: Files should end with a single newline character
(MD047, single-trailing-newline)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/setup.md` around lines 5 - 12, The markdown has lint errors: add a blank
line above each heading (e.g., "## Installation" and "## Linting"), remove the
trailing space at the end of the "Auto-fix issues: `uv run ruff check . --fix`"
line, and ensure the file ends with a single trailing newline; update the
docs/setup.md content accordingly so headings are separated by blank lines, no
trailing spaces remain, and the file ends with a newline.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/ci.yml:
- Around line 22-23: The CI step "Set up Python" currently runs "uv python
install" with no version, which can pull the latest Python and cause
inconsistencies with pyproject.toml's requires-python and ruff's target-version;
update the "Set up Python" step to pin the Python version (e.g., install 3.9) by
passing the explicit version to the uv command or alternatively add a
.python-version file with "3.9" at repo root so "uv python install" will use
that pinned version.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8368e372-9366-4f18-a6d1-1ceac59b31b0
📒 Files selected for processing (1)
.github/workflows/ci.yml
✅ Actions performed: Reviews resumed. Review triggered.
@CodeRabbit trigger
The PR description image shows a terminal screenshot of …
✅ Actions performed: Review triggered.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
openverifiablellm/verify.py (1)
459-478: ⚠️ Potential issue | 🟠 Major

Fail when the re-run does not regenerate a manifest.

The current SKIP here weakens the verifier: `report.all_passed` can still be `True` even though `data/dataset_manifest.json` was never produced, so none of the manifest-derived checks ran. Missing regenerated artifacts should be a verification failure, not a skipped check.

Suggested fix

```diff
         else:
             report.add(
                 CheckResult(
                     name="manifest_regenerated",
-                    status=CheckStatus.SKIP,
-                    detail="Reproduced manifest not found — skipping manifest field checks",
+                    status=CheckStatus.FAIL,
+                    detail="Reproduced manifest not found after re-running preprocessing",
                 )
             )
+            return report
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openverifiablellm/verify.py` around lines 459 - 478, The verifier currently treats a missing reproduced manifest as SKIP; change this to a failure: in the branch where reproduced_manifest_path (the Path "data/dataset_manifest.json") does not exist, add a CheckResult to report with name "manifest_regenerated" and status CheckStatus.FAIL (instead of SKIP) and a clear detail message like "Reproduced manifest not found — expected regenerated dataset_manifest.json", so missing regenerated artifacts cause verification to fail; update any tests or callers that expect the previous SKIP behavior accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@openverifiablellm/verify.py`:
- Around line 278-290: The code currently validates manifest["chunk_size_bytes"]
only when raw_merkle_root exists, but later reuses
manifest.get("chunk_size_bytes", utils.MERKLE_CHUNK_SIZE_BYTES) before Merkle
checks (including processed_merkle_root) and can pass invalid values into
utils.compute_merkle_root(); update the logic so chunk_size_bytes is validated
(must be an int > 0) before any Merkle computation: extract chunk_size_bytes
once from manifest with default utils.MERKLE_CHUNK_SIZE_BYTES, run the
positive-integer guard (adding a CheckResult failure and returning the report)
prior to invoking utils.compute_merkle_root() for both raw_merkle_root and
processed_merkle_root checks, and reference manifest, chunk_size_bytes,
utils.MERKLE_CHUNK_SIZE_BYTES, processed_merkle_root, raw_merkle_root, and
utils.compute_merkle_root() while making the change.
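The reordering the prompt above describes can be sketched as follows. The default of 1 MiB for `MERKLE_CHUNK_SIZE_BYTES` and the toy `compute_merkle_root` are assumptions for illustration, not the project's actual values or implementation; the point is that the chunk size is extracted and validated once, before any Merkle computation:

```python
import hashlib

MERKLE_CHUNK_SIZE_BYTES = 1_048_576  # assumed default, standing in for utils.MERKLE_CHUNK_SIZE_BYTES


def validated_chunk_size(manifest: dict) -> int:
    """Extract chunk_size_bytes once and validate it before any Merkle work.

    Raises ValueError for anything that is not a positive int. Note the
    exact type check: bool is a subclass of int in Python, so a JSON
    true/false would pass an isinstance-based guard.
    """
    chunk_size = manifest.get("chunk_size_bytes", MERKLE_CHUNK_SIZE_BYTES)
    if type(chunk_size) is not int or chunk_size <= 0:
        raise ValueError(f"invalid chunk_size_bytes: {chunk_size!r}")
    return chunk_size


def compute_merkle_root(data: bytes, chunk_size: int) -> str:
    # Toy Merkle root over fixed-size chunks, for illustration only.
    hashes = [
        hashlib.sha256(data[i : i + chunk_size]).digest()
        for i in range(0, max(len(data), 1), chunk_size)
    ]
    while len(hashes) > 1:
        if len(hashes) % 2:
            hashes.append(hashes[-1])  # duplicate last hash on odd levels
        hashes = [
            hashlib.sha256(hashes[i] + hashes[i + 1]).digest()
            for i in range(0, len(hashes), 2)
        ]
    return hashes[0].hex()
```

With this split, both the `raw_merkle_root` and `processed_merkle_root` checks call `validated_chunk_size(manifest)` first and can record a FAIL before any hashing starts.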
---
Outside diff comments:
In `@openverifiablellm/verify.py`:
- Around line 459-478: The verifier currently treats a missing reproduced
manifest as SKIP; change this to a failure: in the branch where
reproduced_manifest_path (the Path "data/dataset_manifest.json") does not exist,
add a CheckResult to report with name "manifest_regenerated" and status
CheckStatus.FAIL (instead of SKIP) and a clear detail message like "Reproduced
manifest not found — expected regenerated dataset_manifest.json", so missing
regenerated artifacts cause verification to fail; update any tests or callers
that expect the previous SKIP behavior accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 439934eb-0fbf-42d5-aff6-cc5fbfee1e52
📒 Files selected for processing (1)
openverifiablellm/verify.py
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@openverifiablellm/verify.py`:
- Around line 351-369: The python_version check currently marks PASS when
expected_python is missing, causing false positives; change the logic in the
block using expected_python and python_ver so that if expected_python is
None/empty you add a CheckResult with name="python_version" and
status=CheckStatus.SKIP (rather than PASS), mirroring the legacy-field handling
elsewhere; otherwise keep the existing FAIL when python_ver != expected_python
and PASS when they match, using report.add(...) with CheckResult and the same
expected/actual/detail fields and reference CheckStatus.SKIP for the skipped
case.
- Around line 385-392: In the subprocess re-run in verify.py (the subprocess.run
call that builds the argv list including sys.executable, "-m",
"openverifiablellm.utils", str(input_dump), "--write-manifest"), remove the
unsupported "--write-manifest" argument so the CLI uses its default manifest
behavior (or explicitly pass "--no-manifest" only if you intend to disable it);
update the argument list passed to subprocess.run accordingly.
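The fix above amounts to building the re-run argv without the unsupported flag. This is a sketch under the review's premise that the module CLI writes a manifest by default; the `dumps/enwiki.xml` path in the usage note is hypothetical:

```python
import sys
from pathlib import Path


def build_rerun_argv(input_dump: Path) -> list:
    """Build the argv for re-running preprocessing via the module CLI.

    Per the review, `python -m openverifiablellm.utils` writes a manifest
    by default, so no `--write-manifest` flag is passed (the CLI does not
    accept one).
    """
    return [sys.executable, "-m", "openverifiablellm.utils", str(input_dump)]
```

The result would then be handed to `subprocess.run(argv, check=False)`, with the verifier inspecting the return code rather than letting a bad flag abort the re-run.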
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 86bba3fa-d177-458c-a67e-758e13958a06
📒 Files selected for processing (1)
openverifiablellm/verify.py
Actionable comments posted: 2
♻️ Duplicate comments (1)
openverifiablellm/verify.py (1)
351-369: ⚠️ Potential issue | 🟠 Major

Don't mark a missing `python_version` as PASS.

The `else` branch still records PASS when `manifest.get("python_version")` is absent, even though nothing was compared. That should be SKIP, like the other legacy-field paths.

🩹 Proposed fix

```diff
     python_ver = platform.python_version()
     expected_python = manifest.get("python_version")
-    if expected_python and python_ver != expected_python:
+    if expected_python is None:
+        report.add(
+            CheckResult(
+                name="python_version",
+                status=CheckStatus.SKIP,
+                detail="Field absent from manifest (older version)",
+            )
+        )
+    elif python_ver != expected_python:
         report.add(
             CheckResult(
                 name="python_version",
                 status=CheckStatus.FAIL,
                 expected=expected_python,
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openverifiablellm/verify.py` around lines 351 - 369, The current logic in the python version check uses an else branch that records CheckStatus.PASS even when manifest.get("python_version") is missing (expected_python is None); update the branching in verify.py so that when expected_python is falsy you add a CheckResult with status CheckStatus.SKIP instead of PASS; modify the block around expected_python/python_ver where report.add is called (the CheckResult with name="python_version") to emit SKIP when expected_python is not provided, retain FAIL when values mismatch and PASS only when expected_python is present and equals python_ver.
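The three-way branching the prompt describes reduces to a small pure function, sketched here with plain tuples instead of the project's `CheckResult`/`CheckStatus` types (an assumption made to keep the example self-contained):

```python
def python_version_check(manifest: dict, running_version: str) -> tuple:
    """SKIP when the manifest predates the field, FAIL on mismatch,
    PASS only when a recorded version actually matches the runtime."""
    expected = manifest.get("python_version")
    if expected is None:
        return ("python_version", "SKIP", "field absent from manifest (older version)")
    if running_version != expected:
        return ("python_version", "FAIL", f"expected {expected}, got {running_version}")
    return ("python_version", "PASS", "")
```

Factoring the branch out this way also makes the requested unit tests trivial, since no manifest file or real interpreter version is needed.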
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@openverifiablellm/verify.py`:
- Around line 304-322: The check for "manifest_chunk_size_bytes" is tautological
because it compares manifest["chunk_size_bytes"] to the local variable
chunk_size which was just read from the same manifest; change the comparison to
validate the manifest value against the actual computed/used chunk size (the
value produced by preprocessing or the Merkle chunker) rather than the manifest
copy. Locate the check that calls _check_field(...,
expected=manifest["chunk_size_bytes"], actual=chunk_size) and replace
expected/actual so that actual is the runtime/computed chunk size (e.g., the
chunk size returned or used by the preprocessing/Merkle functions) and expected
is manifest["chunk_size_bytes"]; apply the same fix to the other instance around
the second block (lines 469-475) so the manifest value is being validated
against the true computed chunk size. Ensure you reference the computed
chunk-size value used elsewhere in this module (the variable or function that
performs preprocessing/Merkle chunking) when rewriting the comparison.
- Around line 234-244: The manifest loading currently only catches
FileNotFoundError and lets json.JSONDecodeError propagate; update the try/except
around _load_manifest(manifest_path) to also catch json.JSONDecodeError (and
optionally ValueError) and convert it into a structured verification failure by
calling report.add(CheckResult(name="manifest_exists", status=CheckStatus.FAIL,
detail=str(exc))) and then return report; reference the _load_manifest call and
the CheckResult/CheckStatus usage to locate where to add the additional except
branch.
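The widened exception handling described above can be sketched as below. The function name `load_manifest_checked` is hypothetical, and the report is modeled as a plain list of `(name, status, detail)` tuples rather than the project's `CheckResult` objects:

```python
import json
from pathlib import Path


def load_manifest_checked(manifest_path: Path, report: list):
    """Load a manifest, turning both a missing file and malformed JSON
    into structured FAIL results instead of an unhandled traceback."""
    try:
        with manifest_path.open() as f:
            return json.load(f)
    except FileNotFoundError as exc:
        report.append(("manifest_exists", "FAIL", str(exc)))
    except json.JSONDecodeError as exc:
        report.append(("manifest_exists", "FAIL", f"not valid JSON: {exc}"))
    return None
```

Callers check for `None` and return the report early, so the verifier always emits a structured result rather than crashing on a corrupt `dataset_manifest.json`.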
---
Duplicate comments:
In `@openverifiablellm/verify.py`:
- Around line 351-369: The current logic in the python version check uses an
else branch that records CheckStatus.PASS even when
manifest.get("python_version") is missing (expected_python is None); update the
branching in verify.py so that when expected_python is falsy you add a
CheckResult with status CheckStatus.SKIP instead of PASS; modify the block
around expected_python/python_ver where report.add is called (the CheckResult
with name="python_version") to emit SKIP when expected_python is not provided,
retain FAIL when values mismatch and PASS only when expected_python is present
and equals python_ver.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 025c37b0-919c-4b7b-80f8-3071f1a77cc8
📒 Files selected for processing (1)
openverifiablellm/verify.py
|
@CodeRabbit resume
✅ Actions performed: Reviews resumed.
@CodeRabbit trigger
The PR description image shows a terminal screenshot of running …
✅ Actions performed: Review triggered.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
openverifiablellm/verify.py (1)
473-475: ⚠️ Potential issue | 🟠 Major

Handle malformed regenerated manifests as a structured failure.

If the re-run writes `dataset_manifest.json` but leaves it malformed, `json.load()` raises here and verification exits without a `VerificationReport`. This path should mirror the earlier source-manifest handling and convert `JSONDecodeError` into a FAIL.

Suggested fix

```diff
     if reproduced_manifest_path.exists():
-        with reproduced_manifest_path.open() as f:
-            reproduced_manifest = json.load(f)
+        try:
+            with reproduced_manifest_path.open() as f:
+                reproduced_manifest = json.load(f)
+        except json.JSONDecodeError as exc:
+            report.add(
+                CheckResult(
+                    name="manifest_regenerated",
+                    status=CheckStatus.FAIL,
+                    detail=f"Reproduced manifest is not valid JSON: {exc}",
+                )
+            )
+            return report
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openverifiablellm/verify.py` around lines 473 - 475, If reproduced_manifest_path.exists() may contain malformed JSON, wrap the reproduced_manifest_path.open() + json.load(f) in a try/except that catches json.JSONDecodeError and converts that into a structured verification failure (a VerificationReport FAIL) rather than letting the exception propagate; mirror the earlier source-manifest handling by logging/creating a VerificationReport with a clear failure reason referencing reproduced_manifest_path and return that report from the current verification function so callers always receive a VerificationReport on malformed regenerated manifests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@openverifiablellm/verify.py`:
- Around line 477-483: The check for "manifest_preprocessing_version" currently
passes a missing source value into _check_field, causing None to be stringified
and producing incorrect PASS/FAIL; update the branch around the _check_field
call that references "manifest_preprocessing_version" (the call using
manifest.get("preprocessing_version") and
reproduced_manifest.get("preprocessing_version")) to first detect a missing
source value (manifest.get("preprocessing_version") is None) and emit a SKIP
result for that field (matching the other legacy-field branches) instead of
calling _check_field when the source is absent; otherwise continue to call
_check_field as before.
---
Outside diff comments:
In `@openverifiablellm/verify.py`:
- Around line 473-475: If reproduced_manifest_path.exists() may contain
malformed JSON, wrap the reproduced_manifest_path.open() + json.load(f) in a
try/except that catches json.JSONDecodeError and converts that into a structured
verification failure (a VerificationReport FAIL) rather than letting the
exception propagate; mirror the earlier source-manifest handling by
logging/creating a VerificationReport with a clear failure reason referencing
reproduced_manifest_path and return that report from the current verification
function so callers always receive a VerificationReport on malformed regenerated
manifests.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 091a1d0d-2d42-4956-8d59-f0da4947e1b0
📒 Files selected for processing (1)
openverifiablellm/verify.py
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
openverifiablellm/verify.py (1)
36-39: ⚠️ Potential issue | 🟠 Major

Don't report ALL CHECKS PASSED when some checks were skipped.

Now that SKIP is part of the result model, `all_passed = len(self.failed) == 0` makes the summary, serialized `all_passed`, and CLI exit path treat PASS + SKIP the same as PASS-only. That produces false-green output for legacy manifests.

Proposed fix

```diff
     @property
-    def all_passed(self) -> bool:
+    def succeeded(self) -> bool:
         return len(self.failed) == 0
+
+    @property
+    def all_passed(self) -> bool:
+        return len(self.failed) == 0 and len(self.skipped) == 0
@@
-        verdict = "ALL CHECKS PASSED" if self.all_passed else "VERIFICATION FAILED"
+        if self.failed:
+            verdict = "VERIFICATION FAILED"
+        elif self.skipped:
+            verdict = "VERIFICATION PASSED WITH SKIPS"
+        else:
+            verdict = "ALL CHECKS PASSED"
@@
-            "all_passed": self.all_passed,
+            "all_passed": self.all_passed,
+            "succeeded": self.succeeded,
@@
-    sys.exit(0 if report.all_passed else 1)
+    sys.exit(0 if report.succeeded else 1)
```

Also applies to: 88-90, 135-136, 141-151, 557-557
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openverifiablellm/verify.py` around lines 36 - 39, The summary/serialization/CLI logic currently treats PASS+SKIP the same as all-pass; change all places computing all_passed (where they use expressions like all_passed = len(self.failed) == 0) to require no failures AND no skips (e.g., all_passed = len(self.failed) == 0 and len(self.skipped) == 0), update the human summary text generation (the code that prints "ALL CHECKS PASSED") to only emit that message when this new all_passed is true, ensure the serialized result field for all_passed uses the new value, and adjust the CLI exit path so a SKIP present does not produce the same exit code as a true all-pass (treat SKIP as a non-success for exit handling). Reference symbols to change: CheckStatus, any uses of self.failed and self.skipped, the summary/serialize method that sets all_passed, and the CLI exit branch that checks all_passed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@openverifiablellm/verify.py`:
- Around line 471-475: The code reading the regenerated manifest at
reproduced_manifest_path uses json.load(f) without error handling; wrap that
call in a try/except catching json.JSONDecodeError (and optionally ValueError)
and, on exception, record a structured FAIL for the regenerated-manifest step
the same way the earlier manifest-load guard does (use the same result
key/format and include the error message), set reproduced_manifest to None or
skip further manifest-dependent checks, and continue to return the verification
report instead of letting the exception bubble.
---
Outside diff comments:
In `@openverifiablellm/verify.py`:
- Around line 36-39: The summary/serialization/CLI logic currently treats
PASS+SKIP the same as all-pass; change all places computing all_passed (where
they use expressions like all_passed = len(self.failed) == 0) to require no
failures AND no skips (e.g., all_passed = len(self.failed) == 0 and
len(self.skipped) == 0), update the human summary text generation (the code that
prints "ALL CHECKS PASSED") to only emit that message when this new all_passed
is true, ensure the serialized result field for all_passed uses the new value,
and adjust the CLI exit path so a SKIP present does not produce the same exit
code as a true all-pass (treat SKIP as a non-success for exit handling).
Reference symbols to change: CheckStatus, any uses of self.failed and
self.skipped, the summary/serialize method that sets all_passed, and the CLI
exit branch that checks all_passed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 40439e3d-a2be-44c5-a323-a5407993c961
📒 Files selected for processing (1)
openverifiablellm/verify.py
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
openverifiablellm/verify.py (1)
35-39: ⚠️ Potential issue | 🟡 Minor

Do not log SKIP results at warning level.

With `CheckStatus.SKIP` added, this method now warns on expected legacy-field skips because every non-PASS result shares the same branch. That makes successful runs look noisy and unhealthy.

🩹 Proposed fix

```diff
-        log = logger.info if check.status == CheckStatus.PASS else logger.warning
+        log = logger.warning if check.status == CheckStatus.FAIL else logger.info
```

Also applies to: 71-74
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openverifiablellm/verify.py` around lines 35 - 39, The logging branch currently treats every non-PASS the same, causing CheckStatus.SKIP to be logged as a warning; update the conditional where statuses are logged (the code handling CheckStatus results around the logging logic at ~71-74) to only emit warnings for CheckStatus.FAIL, and log SKIP results at a lower level (info or debug) or skip logging entirely, while keeping PASS logging unchanged; ensure you reference CheckStatus when changing the condition so SKIP is excluded from the warning branch.
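The one-line fix above is just a level-selection function: only FAIL routes to `warning`, everything else (PASS and expected SKIPs) stays at `info`. A minimal sketch, using string statuses as stand-ins for the project's `CheckStatus` enum:

```python
import logging

logging.basicConfig()
logger = logging.getLogger("verify")

PASS, FAIL, SKIP = "PASS", "FAIL", "SKIP"


def log_for(status):
    # Only genuine failures warrant logger.warning; PASS and the expected
    # legacy-field SKIPs stay at info so healthy runs are not noisy.
    return logger.warning if status == FAIL else logger.info
```

Inverting the condition this way (match on FAIL instead of on PASS) means any future status added to the enum defaults to quiet logging rather than to a spurious warning.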
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@openverifiablellm/verify.py`:
- Around line 486-510: The current check only validates preprocessing_version
and chunk_size_bytes but must validate the full regenerated manifest against all
fields the writer emits; update the comparison in verify.py to iterate and
assert equality for the complete set of manifest keys produced by the writer
(e.g., "wikipedia_dump", "dump_date", "raw_sha256", "processed_sha256",
"raw_merkle_root", "processed_merkle_root", "python_version", plus
"preprocessing_version" and "chunk_size_bytes" when present) using the existing
_check_field helper (preserve names like "manifest_preprocessing_version" and
"manifest_chunk_size_bytes" or add clearly named check keys for each field) so
the test fails if any writer-emitted field is missing or mismatches
reproduced_manifest; ensure you read the expected values from manifest and
actual from reproduced_manifest.
- Around line 286-290: The chunk_size_bytes validation in verify.py incorrectly
accepts JSON booleans because bool is a subclass of int; update the guard that
currently uses isinstance(chunk_size, int) to use an exact type check
(type(chunk_size) is not int) so that true/false are rejected while preserving
valid integers; adjust the conditional around manifest lookup (the code
referencing manifest, chunk_size, and utils.MERKLE_CHUNK_SIZE_BYTES) to use this
exact-type check and keep the existing error path unchanged.
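The full-field comparison asked for in the first inline comment above can be sketched as a loop over the writer-emitted keys. The field list is taken from the review comment itself, and the inline tuple results stand in for the project's `_check_field` helper, which is not shown in this excerpt:

```python
MANIFEST_FIELDS = [
    "wikipedia_dump", "dump_date", "raw_sha256", "processed_sha256",
    "raw_merkle_root", "processed_merkle_root", "python_version",
    "preprocessing_version", "chunk_size_bytes",
]


def compare_manifests(manifest: dict, reproduced: dict) -> list:
    """Compare every writer-emitted field instead of just two of them.

    Returns (check_name, status) tuples: a field absent from the source
    manifest is SKIP (legacy manifest); a mismatch, or a field missing
    from the regenerated manifest, is FAIL; otherwise PASS.
    """
    results = []
    for key in MANIFEST_FIELDS:
        name = f"manifest_{key}"
        if key not in manifest:
            results.append((name, "SKIP"))
        elif reproduced.get(key) != manifest[key]:
            results.append((name, "FAIL"))
        else:
            results.append((name, "PASS"))
    return results
```

Driving the checks from one list keeps the check names (`manifest_preprocessing_version`, `manifest_chunk_size_bytes`, and so on) consistent and makes the verifier fail automatically if the writer starts emitting a field the re-run does not reproduce.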
---
Outside diff comments:
In `@openverifiablellm/verify.py`:
- Around line 35-39: The logging branch currently treats every non-PASS the
same, causing CheckStatus.SKIP to be logged as a warning; update the conditional
where statuses are logged (the code handling CheckStatus results around the
logging logic at ~71-74) to only emit warnings for CheckStatus.FAIL, and log
SKIP results at a lower level (info or debug) or skip logging entirely, while
keeping PASS logging unchanged; ensure you reference CheckStatus when changing
the condition so SKIP is excluded from the warning branch.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8652a08a-2283-4ad5-ab63-34046e318e78
📒 Files selected for processing (1)
openverifiablellm/verify.py
```python
        # Shared Merkle chunk size validation
        chunk_size = manifest.get("chunk_size_bytes", utils.MERKLE_CHUNK_SIZE_BYTES)
        if ("raw_merkle_root" in manifest or "processed_merkle_root" in manifest) and (
            not isinstance(chunk_size, int) or chunk_size <= 0
        ):
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Verifies the current predicate treats True as a positive int.
# Expected output includes: True => True
python - <<'PY'
for value in [True, False, 1048576, "1048576"]:
    print(repr(value), "=>", isinstance(value, int) and value > 0)
PY
```

Repository: AOSSIE-Org/OpenVerifiableLLM

Length of output: 134

🏁 Script executed:

```bash
sed -n '286,290p' openverifiablellm/verify.py
```

Length of output: 344

🏁 Script executed:

```bash
python - <<'PY'
# Verify proposed fix rejects booleans
for value in [True, False, 1048576, "1048576"]:
    print(repr(value), "=>", type(value) is not int and value > 0)
PY
```

Length of output: 272

🏁 Script executed:

```bash
python - <<'PY'
# Verify: does proposed fix actually reject booleans?
# Error condition: type(chunk_size) is not int or chunk_size <= 0
# (true = raise error, false = accept)
print("Proposed fix error condition results:")
for value in [True, False, 1048576, "1048576"]:
    try:
        # Simulate the error condition
        would_error = (type(value) is not int or value <= 0)
        print(f"{repr(value):10} => would_error={would_error}")
    except TypeError as e:
        print(f"{repr(value):10} => TypeError: {e}")
PY
```

Length of output: 234

🏁 Script executed:

```bash
python - <<'PY'
# Also verify type() vs isinstance() distinction
print("Type checks:")
print(f"type(True) = {type(True)}")
print(f"type(True) is int = {type(True) is int}")
print(f"type(True) is not int = {type(True) is not int}")
print(f"isinstance(True, int) = {isinstance(True, int)}")
PY
```

Length of output: 196
Reject JSON booleans for chunk_size_bytes.
bool is a subclass of int in Python, so {"chunk_size_bytes": true} currently passes this guard as 1. This allows a malformed manifest to silently proceed with Merkle validation using a 1-byte chunk size instead of raising a validation error.
Use type(chunk_size) is not int instead of isinstance(chunk_size, int) to reject booleans while preserving valid integers:
Fix

```diff
-        if ("raw_merkle_root" in manifest or "processed_merkle_root" in manifest) and (
-            not isinstance(chunk_size, int) or chunk_size <= 0
-        ):
+        if ("raw_merkle_root" in manifest or "processed_merkle_root" in manifest) and (
+            type(chunk_size) is not int or chunk_size <= 0
+        ):
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
        # Shared Merkle chunk size validation
        chunk_size = manifest.get("chunk_size_bytes", utils.MERKLE_CHUNK_SIZE_BYTES)
        if ("raw_merkle_root" in manifest or "processed_merkle_root" in manifest) and (
            type(chunk_size) is not int or chunk_size <= 0
        ):
```
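The isinstance/type distinction can be exercised directly. The following is a minimal illustrative sketch, using a hypothetical `validate_chunk_size` helper rather than any function from the project, that shows why the exact type check rejects booleans while keeping valid integers:

```python
def validate_chunk_size(chunk_size) -> bool:
    """Accept only genuine positive ints; reject bools and non-int values."""
    return type(chunk_size) is int and chunk_size > 0


# bool is a subclass of int, so an isinstance check alone is too permissive:
assert isinstance(True, int)          # True slips through isinstance(..., int)
assert not validate_chunk_size(True)  # but the exact type check rejects it
assert validate_chunk_size(1048576)
assert not validate_chunk_size(0)
assert not validate_chunk_size("1048576")
```

Note that checking `type(...) is int` first also short-circuits the `> 0` comparison, so a string value fails cleanly instead of raising a `TypeError`.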
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@openverifiablellm/verify.py` around lines 286 - 290, The chunk_size_bytes
validation in verify.py incorrectly accepts JSON booleans because bool is a
subclass of int; update the guard that currently uses isinstance(chunk_size,
int) to use an exact type check (type(chunk_size) is not int) so that true/false
are rejected while preserving valid integers; adjust the conditional around
manifest lookup (the code referencing manifest, chunk_size, and
utils.MERKLE_CHUNK_SIZE_BYTES) to use this exact-type check and keep the
existing error path unchanged.
```python
        expected_preprocessing_version = manifest.get("preprocessing_version")
        if expected_preprocessing_version is None:
            report.add(
                CheckResult(
                    name="manifest_preprocessing_version",
                    status=CheckStatus.SKIP,
                    detail="Field absent from manifest (older version)",
                )
            )
        else:
            _check_field(
                report,
                "manifest_preprocessing_version",
                expected=expected_preprocessing_version,
                actual=reproduced_manifest.get("preprocessing_version"),
                detail="Preprocessing version tag",
            )
        if "chunk_size_bytes" in manifest:
            _check_field(
                report,
                "manifest_chunk_size_bytes",
                expected=manifest["chunk_size_bytes"],
                actual=reproduced_manifest.get("chunk_size_bytes"),
                detail="Merkle chunk size used during preprocessing",
            )
```
Validate the regenerated manifest itself, not just two fields.
This block only compares preprocessing_version and chunk_size_bytes, but openverifiablellm/utils.py:215-252 also writes wikipedia_dump, dump_date, raw_sha256, processed_sha256, raw_merkle_root, processed_merkle_root, and python_version. A manifest-writer regression can therefore still go green here as long as the files themselves are correct.
🩹 Suggested direction

```diff
+    report.add(
+        CheckResult(
+            name="manifest_regenerated",
+            status=CheckStatus.PASS,
+            detail=str(reproduced_manifest_path),
+        )
+    )
+
+    for field_name, detail in [
+        ("wikipedia_dump", "Raw filename recorded in manifest"),
+        ("dump_date", "Dump date parsed from filename"),
+        ("raw_sha256", "SHA256 of the raw input dump"),
+        ("processed_sha256", "SHA256 of reproduced wiki_clean.txt"),
+        ("python_version", "Python version recorded in manifest"),
+    ]:
+        if field_name in manifest:
+            _check_field(
+                report,
+                f"manifest_{field_name}",
+                expected=manifest[field_name],
+                actual=reproduced_manifest.get(field_name),
+                detail=detail,
+            )
+    if "raw_merkle_root" in manifest:
+        _check_field(
+            report,
+            "manifest_raw_merkle_root",
+            expected=manifest["raw_merkle_root"],
+            actual=reproduced_manifest.get("raw_merkle_root"),
+            detail="Merkle root of raw dump recorded in manifest",
+        )
+    if "processed_merkle_root" in manifest:
+        _check_field(
+            report,
+            "manifest_processed_merkle_root",
+            expected=manifest["processed_merkle_root"],
+            actual=reproduced_manifest.get("processed_merkle_root"),
+            detail="Merkle root of processed file recorded in manifest",
+        )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@openverifiablellm/verify.py` around lines 486 - 510, The current check only
validates preprocessing_version and chunk_size_bytes but must validate the full
regenerated manifest against all fields the writer emits; update the comparison
in verify.py to iterate and assert equality for the complete set of manifest
keys produced by the writer (e.g., "wikipedia_dump", "dump_date", "raw_sha256",
"processed_sha256", "raw_merkle_root", "processed_merkle_root",
"python_version", plus "preprocessing_version" and "chunk_size_bytes" when
present) using the existing _check_field helper (preserve names like
"manifest_preprocessing_version" and "manifest_chunk_size_bytes" or add clearly
named check keys for each field) so the test fails if any writer-emitted field
is missing or mismatches reproduced_manifest; ensure you read the expected
values from manifest and actual from reproduced_manifest.
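A data-driven variant of the suggested direction keeps the full-field comparison in one place, so a newly added writer field cannot silently escape verification. The sketch below is illustrative only: it assumes both manifests are plain dicts, and the `compare_manifests` helper is hypothetical (the project's `_check_field` and report plumbing are omitted):

```python
# Writer-emitted manifest fields mapped to human-readable descriptions.
# Field names are taken from the review comment; the helper itself is a sketch.
MANIFEST_FIELDS = {
    "wikipedia_dump": "Raw filename recorded in manifest",
    "dump_date": "Dump date parsed from filename",
    "raw_sha256": "SHA256 of the raw input dump",
    "processed_sha256": "SHA256 of the reproduced processed file",
    "raw_merkle_root": "Merkle root of raw dump",
    "processed_merkle_root": "Merkle root of processed file",
    "python_version": "Python version recorded in manifest",
    "preprocessing_version": "Preprocessing version tag",
    "chunk_size_bytes": "Merkle chunk size used during preprocessing",
}


def compare_manifests(manifest: dict, reproduced: dict) -> list[str]:
    """Return fields present in the original manifest that are missing from,
    or different in, the regenerated manifest."""
    return [
        field
        for field in MANIFEST_FIELDS
        if field in manifest and manifest[field] != reproduced.get(field)
    ]
```

Each returned field name could then be routed through `_check_field`, preserving one named check per field in the report.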
@ryoari Still failing CI checks, pls fix. Also, since there are dev dependencies, simple
…fest-bug Fix NameError in verify_preprocessing when validating chunk_size_bytes
…OpenVerifiableLLM into task/standardize-dev-env
Hey @archit - small note regarding PR #34. My PR touches many files, so every time something is merged into `main`, Ruff detects formatting differences again, which makes the CI fail until I rebase and re-run Ruff.

Since this PR establishes the formatting baseline, merging it earlier would prevent repeated CI failures caused by subsequent merges. Otherwise I can keep rebasing and updating it when needed - just wanted to explain why the CI keeps flipping between passing and failing.
I guess. What am I looking at?
Addressed Issues:
Need to standardize development environment
Fixes #32
Screenshots/Recordings:
Ruff linting was run across the repository to enforce consistent formatting.
Issues detected included:
I001: Import block is un-sorted or un-formatted
F541: f-string without any placeholders
F401: `sys` imported but unused

These were resolved using:

```shell
ruff check . --fix
ruff format .
```
All files now pass lint and formatting checks.
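To keep those rule families enforced consistently, the Ruff settings can live in `pyproject.toml`. The fragment below is an illustrative baseline only - the rule selection and line length are assumptions, not necessarily this project's actual configuration:

```toml
[tool.ruff]
line-length = 88

[tool.ruff.lint]
# E = pycodestyle errors, F = pyflakes (covers F401, F541, F821), I = isort (covers I001)
select = ["E", "F", "I"]
```

With this in place, `ruff check .` and `ruff format .` apply the same baseline locally and in CI.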
Additional Notes:
This PR introduces a standardized development environment and improves code consistency.
Changes include:
Bug fix applied in verify.py:
changed:
to
This resolves the undefined variable error reported by ruff (F821).
Checklist
Summary by CodeRabbit
New Features
Documentation
Chores
Tests
Style / Refactor