chore(prefect-plugin): upgrade Prefect v2→v3 and clean up pydantic v1 usage #17536
dinesh-verma-datahub wants to merge 3 commits into
… usage Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
57012fd to 56322cc
Connector Tests Results: all connector tests passed. Autogenerated by the connector-tests CI pipeline.
Linear: ING-2733
Your PR has been assigned to @sgomezvillamor (sergio.gomez) for review (ING-2733).
sgomezvillamor left a comment
Great work on the Prefect v2→v3 and pydantic v1→v2 migration! The asyncio.run() → run_coro_as_sync() approach is exactly right for Prefect v3's event loop model, and replacing task_run_futures with paginated read_task_runs calls is a clean, correct fix. A few items before merge:
🔴 BLOCKER — Breaking change not documented in updating-datahub.md
This PR drops Prefect v2 support (>=2.0.0,<3.0.0 → >=3.0.0,<4.0.0) and changes the entry point group from prefect.block to prefect.collections. Both are user-facing breaking changes. Please add an entry to docs/how/updating-datahub.md before merge.
🟡 WARNING — Silent timeout in _fetch_all_task_runs_stable
The 50-iteration × 0.1 s polling loop exits silently when exhausted. If Prefect's state persistence is slow, tasks could be silently dropped from DataHub lineage. Please add a warning log when the loop ends without stabilising.
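A minimal sketch of the requested change, assuming the loop compares consecutive result counts and stops once they match. The names `fetch_until_stable`, `fetch`, `MAX_POLLS`, and `POLL_INTERVAL_S` are hypothetical and not taken from the PR; only the 50 × 0.1 s budget comes from the comment above.

```python
import asyncio
import logging
from typing import Awaitable, Callable, List

logger = logging.getLogger(__name__)

MAX_POLLS = 50         # iteration cap mentioned in the review
POLL_INTERVAL_S = 0.1  # delay between polls mentioned in the review


async def fetch_until_stable(
    fetch: Callable[[], Awaitable[List[str]]],
    max_polls: int = MAX_POLLS,
    interval_s: float = POLL_INTERVAL_S,
) -> List[str]:
    """Poll `fetch` until two consecutive calls return the same number of items."""
    previous_count = -1
    items: List[str] = []
    for _ in range(max_polls):
        items = await fetch()
        if len(items) == previous_count:
            return items  # count stopped changing: treat the result as stable
        previous_count = len(items)
        await asyncio.sleep(interval_s)
    # Loop exhausted without two matching counts: log it instead of exiting silently.
    logger.warning(
        "Task run count did not stabilise after %d polls (last count: %d); "
        "some task runs may be missing from lineage.",
        max_polls,
        len(items),
    )
    return items
```

Returning the last result after the warning keeps the existing best-effort behaviour while making the timeout visible in logs; raising instead would be a stricter alternative.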
🟡 WARNING — # type: ignore[arg-type] on FlowRunFilterId
flow_run_id is typed as str but FlowRunFilterId.any_ likely expects List[UUID]. Worth fixing the type rather than suppressing — e.g. any_=[UUID(flow_run_id)].
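The conversion suggested above can be sketched with the standard library alone. This assumes, as the comment does, that the filter takes a list of `UUID` values; that signature is not confirmed here, and `to_uuid_list` is a hypothetical helper name.

```python
from typing import List
from uuid import UUID


def to_uuid_list(flow_run_id: str) -> List[UUID]:
    """Convert a string flow run ID into a single-element list of UUIDs."""
    # UUID() raises ValueError on malformed input, so a bad ID fails loudly
    # here rather than being hidden behind a `# type: ignore`.
    return [UUID(flow_run_id)]
```

The result could then be passed as `any_=to_uuid_list(flow_run_id)`, which would let the type checker verify the argument instead of suppressing the error.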
ℹ️ SUGGESTION — import asyncio inside method body
asyncio was removed from the top-level imports but re-introduced locally inside _fetch_all_task_runs_stable. It should stay at the module level per project conventions.
❓ QUESTION — Can -> Any be narrowed for the validator factory functions?
In files like connection_resolver.py, validate_field_*.py, etc. the return type changed from V1RootValidator/V1Validator → Any. In pydantic v2, model_validator(mode="before")(fn) wraps validators as classmethod descriptors — would -> classmethod be a viable narrower annotation here?
Generated by Claude Code
Summary
Replaced asyncio.run(...) calls with run_coro_as_sync(...). Updated entry point group from prefect.block to
prefect.collections and module reference from prefect_datahub.prefect_datahub:DatahubEmitter to
prefect_datahub.datahub_emitter. Fixed _block_type_name to use ClassVar.
… map, which was removed in Prefect v3. Replaced with read_task_runs API calls, adding offset pagination
(Prefect server returns at most 200 results per request) and stable-count polling to handle Prefect's
asynchronous task run state persistence.
… import SecretStr). Bumped dev dependency from pydantic>=1.10 to pydantic>=2.0.0. Removed
V1RootValidator/V1Validator type references and dead TYPE_CHECKING guards across 8 files
(connection_resolver.py, import_resolver.py, validate_field_deprecation.py, validate_field_removal.py,
validate_field_rename.py, validate_multiline_string.py, snowflake_lineage_v2.py,
entity_removal_state.py). Replaced FlowRun.parse_obj() and TaskRun.parse_obj() (pydantic v1 methods)
with direct await client.read_flow_run/read_task_run calls.