Add support for video (MOT + CVAT Video) and COCO image (BBOX and Segmentation) import and export#652
Add support for video (MOT + CVAT Video) and COCO image (BBOX and Segmentation) import and export#652
Conversation
…before opening a PR
…o query_result.py
Review Summary by QodoAdd video and COCO annotation import/export support
WalkthroughsDescription• Add support for importing and exporting video annotations (MOT, CVAT Video) • Add support for importing and exporting COCO image annotations (bbox, segmentation) • Implement video annotation flattening and Label Studio video task conversion • Fix error handling in blob download with try-catch moved to query_result.py • Add annotation field resolution utility for automatic field detection Diagramflowchart LR
A["Video Formats<br/>MOT, CVAT Video"] --> B["AnnotationImporter"]
C["Image Formats<br/>COCO"] --> B
B --> D["Flatten Video<br/>Annotations"]
B --> E["Convert to<br/>Label Studio"]
D --> F["QueryResult<br/>Export Methods"]
E --> F
F --> G["export_as_mot<br/>export_as_coco<br/>export_as_cvat_video"]
File Changes1. dagshub/data_engine/annotation/importer.py
|
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis pull request extends annotation format support in the data engine by adding COCO, MOT, and CVAT video formats alongside existing YOLO and CVAT support. Changes include new import/export loaders, video annotation flattening logic, Label Studio conversion for video annotations, and corresponding test coverage. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Importer as AnnotationImporter
participant Loader as Format Loader
participant Converter as IR Converter
participant Output as Export/LS
User->>Importer: import_annotations(format=coco/mot/cvat_video)
Importer->>Loader: load_coco_from_file/load_mot_from_dir/load_cvat_from_zip
Loader->>Converter: Parse format to raw data
Converter->>Converter: Convert to IRAnnotation objects
alt is_video_format
Converter->>Importer: Return frame-wise annotations
Importer->>Importer: _flatten_video_annotations()
Importer->>Output: Return video-keyed dict
else image-based format
Converter->>Output: Return filename-keyed dict
end
Output-->>User: Annotations ready for metadata/export
sequenceDiagram
participant User
participant QueryResult as QueryResult
participant Importer as AnnotationImporter
participant Exporter as Format Exporter
participant FileSystem as Output Files
User->>QueryResult: export_as_coco/mot/cvat_video(download_dir)
QueryResult->>QueryResult: _resolve_annotation_field()
QueryResult->>Importer: _get_all_video_annotations() or gather annotations
QueryResult->>Exporter: Create context (CocoContext/MOTContext)
alt Video Format (MOT/CVAT Video)
Exporter->>FileSystem: Write video tracks/XML
else COCO Format
Exporter->>FileSystem: Write COCO JSON with image/annotation records
end
FileSystem-->>User: Return path to exported file/directory
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 1✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
CI Feedback 🧐A test triggered by this PR failed. Here is an AI-generated analysis of the failure:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (6)
dagshub/data_engine/model/query_result.py (1)
410-416: Consider catching more specific exceptions for blob download failures.The broad
Exceptioncatch is flagged by static analysis (BLE001). While this prevents crashes from download errors, consider catching more specific exceptions (e.g.,requests.RequestException,IOError) to avoid masking unexpected errors.The current approach of logging and removing the field from metadata is reasonable for resilience.
♻️ Optional: More specific exception handling
def _get_blob_fn(dp: Datapoint, field: str, url: str, blob_path: Path): try: blob_or_path = _get_blob(url, blob_path, auth, cache_on_disk, load_into_memory, path_format) - except Exception as e: + except (IOError, OSError) as e: logger.warning(f"Error while downloading blob for field {field} in datapoint {dp.path}: {e}") dp.metadata.pop(field, None) return dp.metadata[field] = blob_or_pathdagshub/data_engine/annotation/importer.py (2)
97-131: Consider extracting CVAT video detection logic to reduce duplication.The pattern of calling loader, then checking
_is_video_annotation_dict, then optionally flattening is duplicated betweencvat(lines 98-102) andcvat_video(lines 124-131) handlers.♻️ Optional: Extract common CVAT processing logic
def _process_cvat_result(self, result) -> Mapping[str, Sequence[IRAnnotationBase]]: if self._is_video_annotation_dict(result): return self._flatten_video_annotations(result) return resultThen use in both handlers:
elif self.annotations_type == "cvat": result = load_cvat_from_zip(annotations_file) annotation_dict = self._process_cvat_result(result)
222-226: Assertion may be too strict for edge cases.The
assert self.is_video_formatwill raiseAssertionErrorif a non-video annotation somehow hasNonefilename. Consider using a more informative exception for production code.♻️ Suggested improvement
for ann in anns: if ann.filename is not None: ann.filename = remap_func(ann.filename) else: - assert self.is_video_format, f"Non-video annotation has no filename: {ann}" + if not self.is_video_format: + raise ValueError(f"Non-video annotation unexpectedly has no filename: {ann}") ann.filename = new_filenametests/data_engine/annotation_import/test_mot.py (1)
108-108: Minor: Usenext(iter())for single-element access.Static analysis suggests using
next(iter(result.values()))instead oflist(result.values())[0]for slightly better performance and clarity when accessing the single expected element.♻️ Suggested fix
- anns = list(result.values())[0] + anns = next(iter(result.values()))Also applies to: 122-122
tests/data_engine/annotation_import/test_cvat_video.py (2)
35-35: Minor: Usenext(iter())for single-element access.Same suggestion as in test_mot.py - use
next(iter(result.values()))for cleaner single-element access.♻️ Suggested fix
- anns = list(result.values())[0] + anns = next(iter(result.values()))
151-183: Consider extracting shared test helpers to conftest.py.The
_make_video_bboxand_make_qrhelper functions are duplicated across test_mot.py, test_cvat_video.py, and test_coco.py. Consider moving these to a sharedconftest.pyor test utilities module to reduce duplication.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
.gitignoredagshub/data_engine/annotation/importer.pydagshub/data_engine/annotation/metadata.pydagshub/data_engine/model/datapoint.pydagshub/data_engine/model/query_result.pytests/data_engine/annotation_import/test_coco.pytests/data_engine/annotation_import/test_cvat_video.pytests/data_engine/annotation_import/test_mot.py
🧰 Additional context used
🧬 Code graph analysis (3)
tests/data_engine/annotation_import/test_mot.py (3)
dagshub/data_engine/annotation/importer.py (3)
_is_video_annotation_dict(138-143)is_video_format(68-69)_flatten_video_annotations(145-154)dagshub/data_engine/client/models.py (1)
MetadataSelectFieldSchema(82-101)dagshub/data_engine/dtypes.py (1)
MetadataFieldType(20-36)
tests/data_engine/annotation_import/test_coco.py (3)
dagshub/data_engine/annotation/importer.py (2)
AnnotationsNotFoundError(30-32)import_annotations(71-135)dagshub/data_engine/annotation/metadata.py (2)
add_coco_annotation(277-297)value(107-117)dagshub/data_engine/client/models.py (1)
MetadataSelectFieldSchema(82-101)
dagshub/data_engine/annotation/importer.py (1)
dagshub/data_engine/model/datasource_state.py (1)
raw_path(99-104)
🪛 Ruff (0.14.14)
tests/data_engine/annotation_import/test_cvat_video.py
[warning] 35-35: Prefer next(iter(result.values())) over single element slice
Replace with next(iter(result.values()))
(RUF015)
tests/data_engine/annotation_import/test_mot.py
[warning] 108-108: Prefer next(iter(result.values())) over single element slice
Replace with next(iter(result.values()))
(RUF015)
[warning] 122-122: Prefer next(iter(result.values())) over single element slice
Replace with next(iter(result.values()))
(RUF015)
tests/data_engine/annotation_import/test_coco.py
[warning] 45-45: Unused function argument: mock_dagshub_auth
(ARG001)
[warning] 64-64: Unused function argument: mock_dagshub_auth
(ARG001)
dagshub/data_engine/model/query_result.py
[warning] 412-412: Do not catch blind exception: Exception
(BLE001)
[warning] 781-781: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 892-892: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 947-947: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 993-993: Avoid specifying long messages outside the exception class
(TRY003)
dagshub/data_engine/annotation/importer.py
[warning] 133-133: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: build (3.10)
- GitHub Check: build (3.11)
- GitHub Check: build (3.9)
- GitHub Check: build (3.12)
- GitHub Check: build (3.13)
🔇 Additional comments (20)
dagshub/data_engine/annotation/metadata.py (2)
23-26: LGTM - VideoRectangle task type registration.Module-level registration of
videorectanglein_task_lookupenables Label Studio video annotation parsing. This follows the established pattern for task type registration.
277-297: LGTM - COCO annotation import with filename normalization.The implementation correctly loads COCO annotations, normalizes filenames to the datapoint's path, and updates the datapoint state. The local import avoids circular dependency issues.
One observation: all annotations from the COCO JSON are assigned to the current datapoint's path regardless of their original image grouping. This appears intentional for the use case of adding annotations to a single datapoint.
dagshub/data_engine/model/query_result.py (4)
772-784: LGTM - Well-implemented helper methods.
_get_all_video_annotationscorrectly filters for video-specific annotation types._resolve_annotation_fieldprovides deterministic field selection via sorted order with proper error handling when no annotation fields exist.
864-910: LGTM - COCO export follows established patterns.The implementation correctly:
- Resolves annotation field with fallback
- Validates non-empty annotations before export
- Prefixes filenames with source prefix
- Creates appropriate directory structure
949-957: Dimension inference from first annotation requires verification of external library constraints.The empty list case is properly guarded at lines 946-947. However, the concern about
Nonedimensions deserves clarification: the code assumesvideo_annotations[0].image_widthandvideo_annotations[0].image_heightare non-Nonewhen falling back from the provided parameters. All test cases constructIRVideoBBoxAnnotationwith explicit non-Nonedimensions (e.g.,image_width=1920, image_height=1080), but sinceIRVideoBBoxAnnotationis from the externaldagshub_annotation_converterlibrary, its type constraints cannot be verified within this repository. If the external library allowsNonefor these fields, they could propagate toMOTContext.
964-1005: No action needed. The function properly handlesNonedimensions by inferring them from the video annotation objects, which include width and height metadata. Tests confirm this behavior works as expected.dagshub/data_engine/annotation/importer.py (3)
67-69: LGTM - Clean video format detection.The
is_video_formatproperty provides a clear way to check if the annotation type is video-oriented.
137-143: LGTM - Robust video annotation detection.The method correctly handles edge cases: non-dict inputs, empty dicts, and determines video vs image annotations by checking the type of the first key.
371-385: LGTM - Video task conversion with proper filtering.The method correctly filters for
IRVideoBBoxAnnotationinstances and skips entries without video annotations, preventing empty tasks from being created.tests/data_engine/annotation_import/test_mot.py (4)
1-24: LGTM - Well-structured test setup.Good use of
autousefixture to mocksource_prefixfor all tests in the module, ensuring consistent test isolation.
29-46: LGTM - Comprehensive edge case coverage for _is_video_annotation_dict.Tests cover all key scenarios: int keys, str keys, empty dict, non-dict input, and mixed keys (checking first key behavior).
153-180: LGTM - Thorough export tests.Good coverage of MOT export including directory structure validation, explicit dimension handling, and error cases for missing annotations.
186-236: LGTM - Clean helper functions.The helper functions are well-designed for test reusability, with sensible defaults and clear parameter names.
tests/data_engine/annotation_import/test_cvat_video.py (2)
43-81: LGTM - Good coverage of video annotation filtering and aggregation.Tests properly verify that
_get_all_video_annotationsfilters out image annotations and correctly aggregates across multiple datapoints.
87-145: LGTM - Comprehensive CVAT video export tests.Good coverage of success cases, error handling (no annotations, image-only annotations), custom parameters, and multi-datapoint scenarios.
tests/data_engine/annotation_import/test_coco.py (5)
29-42: LGTM - Solid COCO import tests.Tests cover successful import and proper error handling for non-existent files.
45-58: LGTM - The unused mock_dagshub_auth parameter is intentional.The static analysis warning (ARG001) is a false positive. The
mock_dagshub_authfixture is included to ensure authentication is mocked during these tests via pytest's fixture mechanism, even though it's not directly referenced in the test body.Also applies to: 64-71
77-103: LGTM - Thorough _resolve_annotation_field tests.Excellent coverage including explicit field override, auto-resolution, error on no fields, and alphabetical ordering verification.
109-182: LGTM - Comprehensive COCO export tests.Good coverage including:
- Bbox coordinate verification (denormalized format)
- Error handling for empty annotations
- Explicit class mapping
- Custom output filename
- Multi-datapoint export with count verification
188-218: Helper functions follow established pattern.Same comment as for other test files - these helpers could be shared via conftest.py to reduce duplication across the test modules.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Code Review by Qodo
✅ 1.
|
| # Add the source prefix to all annotations | ||
| for ann in annotations: | ||
| ann.filename = os.path.join(self.datasource.source.source_prefix, ann.filename) | ||
|
|
There was a problem hiding this comment.
5. Export mutates filenames 🐞 Bug ✓ Correctness
export_as_yolo/export_as_coco modify annotation objects in-place by rewriting ann.filename to include source_prefix. This permanently mutates datapoint metadata (double-prefixing on repeated exports) and can generate OS-specific separators due to os.path.join with a PurePosixPath prefix.
Agent Prompt
### Issue description
`export_as_yolo()` / `export_as_coco()` rewrite `ann.filename` in-place, which mutates datapoint metadata and can cause repeated exports to double-prefix paths. Additionally, using `os.path.join()` with a `PurePosixPath` prefix can produce platform-specific separators.
### Issue Context
Annotations come from `_get_all_annotations()`, which returns references to the `MetadataAnnotations.annotations` list stored on datapoints.
### Fix Focus Areas
- dagshub/data_engine/model/query_result.py[762-770]
- dagshub/data_engine/model/query_result.py[846-852]
- dagshub/data_engine/model/query_result.py[898-905]
- dagshub/data_engine/model/datasource_state.py[106-113]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
|
Note to ourselves - need to update the docs: To have the correct list of supported formats |
…before opening a PR
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
dagshub/data_engine/model/query_result.py (1)
954-962: Minor redundancy in dimension checks.The
elif video_annotations:checks at lines 957 and 961 are always true since we've already verifiedvideo_annotationsis non-empty at line 951. This is harmless but slightly redundant.Suggested simplification
context = MOTContext() if image_width is not None: context.image_width = image_width - elif video_annotations: + else: context.image_width = video_annotations[0].image_width if image_height is not None: context.image_height = image_height - elif video_annotations: + else: context.image_height = video_annotations[0].image_heighttests/data_engine/annotation_import/test_mot.py (1)
100-129: Consider usingnext(iter(...))for single-element access.Static analysis suggests using
next(iter(result.values()))instead oflist(result.values())[0]for better performance and clarity when accessing a single element.Suggested fix
- anns = list(result.values())[0] + anns = next(iter(result.values()))tests/data_engine/annotation_import/test_cvat_video.py (1)
28-39: Consider usingnext(iter(...))for single-element access.Same pattern as MOT tests -
next(iter(result.values()))is preferred overlist(result.values())[0].
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
dagshub/data_engine/annotation/importer.pydagshub/data_engine/annotation/metadata.pydagshub/data_engine/model/datapoint.pydagshub/data_engine/model/query_result.pytests/data_engine/annotation_import/test_coco.pytests/data_engine/annotation_import/test_cvat_video.pytests/data_engine/annotation_import/test_mot.py
🧰 Additional context used
🧬 Code graph analysis (3)
dagshub/data_engine/annotation/metadata.py (2)
dagshub/streaming/filesystem.py (1)
path(1119-1124)dagshub/common/helpers.py (1)
log_message(100-107)
dagshub/data_engine/annotation/importer.py (2)
dagshub/data_engine/model/datasource.py (1)
source(180-181)dagshub/data_engine/model/datasource_state.py (1)
raw_path(99-104)
dagshub/data_engine/model/query_result.py (2)
dagshub/data_engine/model/datapoint.py (1)
_get_blob(264-322)dagshub/common/helpers.py (1)
log_message(100-107)
🪛 Ruff (0.15.1)
dagshub/data_engine/annotation/importer.py
[warning] 131-131: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 224-224: Avoid specifying long messages outside the exception class
(TRY003)
tests/data_engine/annotation_import/test_cvat_video.py
[warning] 36-36: Prefer next(iter(result.values())) over single element slice
Replace with next(iter(result.values()))
(RUF015)
dagshub/data_engine/model/query_result.py
[warning] 412-412: Do not catch blind exception: Exception
(BLE001)
[warning] 786-786: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 897-897: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 952-952: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 998-998: Avoid specifying long messages outside the exception class
(TRY003)
tests/data_engine/annotation_import/test_coco.py
[warning] 28-28: Expected a statement
(invalid-syntax)
[warning] 28-28: Expected a statement
(invalid-syntax)
[warning] 28-28: Expected a statement
(invalid-syntax)
[warning] 28-28: Expected a statement
(invalid-syntax)
[warning] 35-35: Expected a statement
(invalid-syntax)
[warning] 35-35: Expected a statement
(invalid-syntax)
[warning] 35-35: Expected a statement
(invalid-syntax)
[warning] 35-35: Expected a statement
(invalid-syntax)
[warning] 35-36: Expected a statement
(invalid-syntax)
[warning] 44-44: Expected a statement
(invalid-syntax)
[warning] 44-44: Expected a statement
(invalid-syntax)
[warning] 44-44: Expected a statement
(invalid-syntax)
[warning] 44-44: Expected a statement
(invalid-syntax)
[warning] 44-44: Simple statements must be separated by newlines or semicolons
(invalid-syntax)
[warning] 44-44: Expected ,, found name
(invalid-syntax)
[warning] 45-45: Unexpected indentation
(invalid-syntax)
[warning] 52-52: Expected a statement
(invalid-syntax)
[warning] 52-52: Expected a statement
(invalid-syntax)
[warning] 52-52: Expected a statement
(invalid-syntax)
[warning] 52-52: Expected a statement
(invalid-syntax)
[warning] 63-63: Expected ,, found ==
(invalid-syntax)
[warning] 63-63: Expected ,, found ==
(invalid-syntax)
[warning] 63-63: Expected ,, found ==
(invalid-syntax)
[warning] 63-63: Expected ,, found =
(invalid-syntax)
[warning] 63-64: Expected ), found newline
(invalid-syntax)
[warning] 69-69: Expected ,, found >>
(invalid-syntax)
[warning] 69-69: Expected ,, found >>
(invalid-syntax)
[warning] 69-69: Expected ,, found >>
(invalid-syntax)
[warning] 69-69: Expected ,, found >
(invalid-syntax)
[warning] 69-69: Positional argument cannot follow keyword argument
(invalid-syntax)
[warning] 69-69: Expected ,, found name
(invalid-syntax)
[warning] 69-69: Positional argument cannot follow keyword argument
(invalid-syntax)
[warning] 69-69: Expected ,, found name
(invalid-syntax)
[warning] 70-70: Expected ,, found name
(invalid-syntax)
[warning] 74-74: Expected a statement
(invalid-syntax)
[warning] 74-74: Expected a statement
(invalid-syntax)
[warning] 74-74: Expected a statement
(invalid-syntax)
[warning] 74-74: Expected a statement
(invalid-syntax)
[warning] 75-75: Expected a statement
(invalid-syntax)
[warning] 75-75: Expected a statement
(invalid-syntax)
[warning] 75-75: Expected a statement
(invalid-syntax)
[warning] 75-75: Expected a statement
(invalid-syntax)
[warning] 75-76: Expected a statement
(invalid-syntax)
[warning] 77-77: Expected a statement
(invalid-syntax)
[warning] 77-77: Expected a statement
(invalid-syntax)
[warning] 77-77: Expected a statement
(invalid-syntax)
[warning] 77-77: Expected a statement
(invalid-syntax)
[warning] 77-77: Simple statements must be separated by newlines or semicolons
(invalid-syntax)
[warning] 77-77: Expected ,, found name
(invalid-syntax)
[warning] 78-78: Unexpected indentation
(invalid-syntax)
[warning] 83-83: Expected a statement
(invalid-syntax)
[warning] 83-83: Expected a statement
(invalid-syntax)
[warning] 83-83: Expected a statement
(invalid-syntax)
[warning] 83-83: Expected a statement
(invalid-syntax)
[warning] 84-84: Unexpected indentation
(invalid-syntax)
[warning] 85-85: Expected a statement
(invalid-syntax)
[warning] 85-85: Expected a statement
(invalid-syntax)
[warning] 85-85: Expected a statement
(invalid-syntax)
[warning] 85-85: Expected a statement
(invalid-syntax)
[warning] 85-86: Expected a statement
(invalid-syntax)
[warning] 86-86: Expected a statement
(invalid-syntax)
[warning] 86-86: Expected a statement
(invalid-syntax)
[warning] 86-86: Expected a statement
(invalid-syntax)
[warning] 86-86: Expected a statement
(invalid-syntax)
[warning] 86-86: Simple statements must be separated by newlines or semicolons
(invalid-syntax)
[warning] 86-86: Expected ,, found name
(invalid-syntax)
[warning] 92-92: Expected a statement
(invalid-syntax)
[warning] 92-92: Expected a statement
(invalid-syntax)
[warning] 92-92: Expected a statement
(invalid-syntax)
[warning] 92-92: Expected a statement
(invalid-syntax)
[warning] 95-95: Expected a statement
(invalid-syntax)
[warning] 95-95: Expected a statement
(invalid-syntax)
[warning] 95-95: Expected a statement
(invalid-syntax)
[warning] 95-95: Expected a statement
(invalid-syntax)
[warning] 95-96: Expected a statement
(invalid-syntax)
[warning] 98-98: Expected a statement
(invalid-syntax)
[warning] 98-98: Expected a statement
(invalid-syntax)
[warning] 98-98: Expected a statement
(invalid-syntax)
[warning] 98-98: Expected a statement
(invalid-syntax)
[warning] 98-98: Simple statements must be separated by newlines or semicolons
(invalid-syntax)
[warning] 98-98: Expected ,, found name
(invalid-syntax)
[warning] 99-99: Unexpected indentation
(invalid-syntax)
[warning] 104-104: Expected a statement
(invalid-syntax)
[warning] 104-104: Expected a statement
(invalid-syntax)
[warning] 104-104: Expected a statement
(invalid-syntax)
[warning] 104-104: Expected a statement
(invalid-syntax)
[warning] 105-105: Unexpected indentation
(invalid-syntax)
[warning] 106-106: Expected a statement
(invalid-syntax)
[warning] 106-106: Expected a statement
(invalid-syntax)
[warning] 106-106: Expected a statement
(invalid-syntax)
[warning] 106-106: Expected a statement
(invalid-syntax)
[warning] 106-107: Expected a statement
(invalid-syntax)
[warning] 107-107: Unexpected indentation
(invalid-syntax)
[warning] 110-110: Expected a statement
(invalid-syntax)
[warning] 123-123: Expected a statement
(invalid-syntax)
[warning] 123-123: Expected a statement
(invalid-syntax)
[warning] 123-123: Expected a statement
(invalid-syntax)
[warning] 123-123: Expected a statement
(invalid-syntax)
[warning] 123-123: Simple statements must be separated by newlines or semicolons
(invalid-syntax)
[warning] 123-123: Expected ,, found name
(invalid-syntax)
[warning] 129-129: Expected a statement
(invalid-syntax)
[warning] 129-129: Expected a statement
(invalid-syntax)
[warning] 129-129: Expected a statement
(invalid-syntax)
[warning] 129-129: Expected a statement
(invalid-syntax)
[warning] 131-131: Expected an indented block after function definition
(invalid-syntax)
[warning] 131-131: Expected a statement
(invalid-syntax)
[warning] 131-131: Expected a statement
(invalid-syntax)
[warning] 131-131: Expected a statement
(invalid-syntax)
[warning] 131-132: Expected a statement
(invalid-syntax)
[warning] 133-133: Expected an indented block after function definition
(invalid-syntax)
[warning] 133-133: Expected a statement
(invalid-syntax)
[warning] 133-133: Expected a statement
(invalid-syntax)
[warning] 133-133: Expected a statement
(invalid-syntax)
[warning] 133-133: Simple statements must be separated by newlines or semicolons
(invalid-syntax)
[warning] 133-133: Expected ,, found name
(invalid-syntax)
[warning] 134-134: Unexpected indentation
(invalid-syntax)
[warning] 138-138: Expected a statement
(invalid-syntax)
[warning] 138-138: Expected a statement
(invalid-syntax)
[warning] 138-138: Expected a statement
(invalid-syntax)
[warning] 138-138: Expected a statement
(invalid-syntax)
[warning] 140-140: Expected an indented block after function definition
(invalid-syntax)
[warning] 140-140: Expected a statement
(invalid-syntax)
[warning] 140-140: Expected a statement
(invalid-syntax)
[warning] 140-140: Expected a statement
(invalid-syntax)
[warning] 140-141: Expected a statement
(invalid-syntax)
[warning] 142-142: Expected an indented block after function definition
(invalid-syntax)
[warning] 142-142: Expected a statement
(invalid-syntax)
[warning] 142-142: Expected a statement
(invalid-syntax)
[warning] 142-142: Expected a statement
(invalid-syntax)
[warning] 142-142: Simple statements must be separated by newlines or semicolons
(invalid-syntax)
[warning] 142-142: Expected ,, found name
(invalid-syntax)
[warning] 143-143: Unexpected indentation
(invalid-syntax)
[warning] 147-147: Expected a statement
(invalid-syntax)
[warning] 147-147: Expected a statement
(invalid-syntax)
[warning] 147-147: Expected a statement
(invalid-syntax)
[warning] 147-147: Expected a statement
(invalid-syntax)
[warning] 149-149: Expected an indented block after function definition
(invalid-syntax)
[warning] 149-149: Expected a statement
(invalid-syntax)
[warning] 149-149: Expected a statement
(invalid-syntax)
[warning] 149-149: Expected a statement
(invalid-syntax)
[warning] 149-150: Expected a statement
(invalid-syntax)
[warning] 151-151: Expected an indented block after function definition
(invalid-syntax)
[warning] 151-151: Expected a statement
(invalid-syntax)
[warning] 151-151: Expected a statement
(invalid-syntax)
[warning] 151-151: Expected a statement
(invalid-syntax)
[warning] 151-151: Simple statements must be separated by newlines or semicolons
(invalid-syntax)
[warning] 151-151: Expected ,, found name
(invalid-syntax)
[warning] 152-152: Unexpected indentation
(invalid-syntax)
[warning] 157-157: Expected a statement
(invalid-syntax)
[warning] 157-157: Expected a statement
(invalid-syntax)
[warning] 157-157: Expected a statement
(invalid-syntax)
[warning] 157-157: Expected a statement
(invalid-syntax)
[warning] 175-175: Expected an indented block after function definition
(invalid-syntax)
[warning] 175-175: Expected a statement
(invalid-syntax)
[warning] 175-175: Expected a statement
(invalid-syntax)
[warning] 175-175: Expected a statement
(invalid-syntax)
[warning] 175-176: Expected a statement
(invalid-syntax)
[warning] 180-180: Expected an indented block after function definition
(invalid-syntax)
[warning] 180-180: Expected a statement
(invalid-syntax)
[warning] 180-180: Expected a statement
(invalid-syntax)
[warning] 180-180: Expected a statement
(invalid-syntax)
[warning] 180-180: Simple statements must be separated by newlines or semicolons
(invalid-syntax)
[warning] 180-180: Expected ,, found name
(invalid-syntax)
[warning] 181-181: Unexpected indentation
(invalid-syntax)
[warning] 194-194: Expected a statement
(invalid-syntax)
[warning] 194-194: Expected a statement
(invalid-syntax)
[warning] 194-194: Expected a statement
(invalid-syntax)
[warning] 194-194: Expected a statement
(invalid-syntax)
[warning] 195-195: Unexpected indentation
(invalid-syntax)
[warning] 199-199: Expected a statement
(invalid-syntax)
[warning] 200-200: Expected an indented block after function definition
(invalid-syntax)
[warning] 200-200: Expected a statement
(invalid-syntax)
[warning] 200-200: Expected a statement
(invalid-syntax)
[warning] 200-200: Expected a statement
(invalid-syntax)
[warning] 200-201: Expected a statement
(invalid-syntax)
[warning] 201-201: Unexpected indentation
(invalid-syntax)
[warning] 208-208: Expected a statement
(invalid-syntax)
[warning] 228-228: Expected an indented block after function definition
(invalid-syntax)
[warning] 228-228: Expected a statement
(invalid-syntax)
[warning] 228-228: Expected a statement
(invalid-syntax)
[warning] 228-228: Expected a statement
(invalid-syntax)
[warning] 228-228: Simple statements must be separated by newlines or semicolons
(invalid-syntax)
[warning] 228-228: Expected ,, found name
(invalid-syntax)
[warning] 229-229: Unexpected indentation
(invalid-syntax)
[warning] 237-237: Expected a statement
(invalid-syntax)
[warning] 237-237: Expected a statement
(invalid-syntax)
[warning] 237-237: Expected a statement
(invalid-syntax)
[warning] 237-237: Expected a statement
(invalid-syntax)
[warning] 288-288: Expected a statement
(invalid-syntax)
[warning] 288-288: Expected a statement
(invalid-syntax)
[warning] 288-288: Expected a statement
(invalid-syntax)
[warning] 288-288: Expected a statement
(invalid-syntax)
[warning] 288-289: Expected a statement
(invalid-syntax)
[warning] 309-309: Expected a statement
(invalid-syntax)
[warning] 309-309: Expected a statement
(invalid-syntax)
[warning] 309-309: Expected a statement
(invalid-syntax)
[warning] 309-309: Expected a statement
(invalid-syntax)
[warning] 309-309: Simple statements must be separated by newlines or semicolons
(invalid-syntax)
[warning] 309-309: Expected ,, found name
(invalid-syntax)
[warning] 320-320: Expected a statement
(invalid-syntax)
[warning] 320-320: Expected a statement
(invalid-syntax)
[warning] 320-320: Expected a statement
(invalid-syntax)
[warning] 320-320: Expected a statement
(invalid-syntax)
[warning] 334-334: Expected a statement
(invalid-syntax)
[warning] 334-334: Expected a statement
(invalid-syntax)
[warning] 334-334: Expected a statement
(invalid-syntax)
[warning] 334-334: Expected a statement
(invalid-syntax)
[warning] 334-335: Expected a statement
(invalid-syntax)
[warning] 335-335: Expected a statement
(invalid-syntax)
[warning] 335-335: Expected a statement
(invalid-syntax)
[warning] 335-335: Expected a statement
(invalid-syntax)
[warning] 335-335: Expected a statement
(invalid-syntax)
[warning] 335-335: Simple statements must be separated by newlines or semicolons
(invalid-syntax)
[warning] 335-335: Expected ,, found name
(invalid-syntax)
tests/data_engine/annotation_import/test_mot.py
[warning] 108-108: Prefer next(iter(result.values())) over single element slice
Replace with next(iter(result.values()))
(RUF015)
[warning] 127-127: Prefer next(iter(result.values())) over single element slice
Replace with next(iter(result.values()))
(RUF015)
🔇 Additional comments (15)
dagshub/data_engine/annotation/metadata.py (1)
282-302: LGTM!The
add_coco_annotationmethod correctly loads COCO annotations, rewrites filenames to match the datapoint path, and triggers the datapoint update. The lazy import helps avoid circular dependencies.dagshub/data_engine/model/query_result.py (4)
410-414: Catching broadExceptionis appropriate here given the retry behavior.The broad exception catch handles both the original exception and potential
RetryErrorfrom tenacity. However, this would be cleaner ifreraise=Truewas added to theRetryingcall in_get_blob, allowing a more specific catch. The current approach works as a defensive measure.
777-789: LGTM!The helper methods are well-designed.
_get_all_video_annotationscorrectly filters for video annotations, and_resolve_annotation_fieldprovides deterministic field selection with proper error handling.
869-915: LGTM!The
export_as_cocomethod follows established patterns fromexport_as_yolo, with proper file downloads, annotation field resolution, and category handling.
969-1010: LGTM!The
export_as_cvat_videomethod follows the established export pattern with proper annotation field resolution and error handling.dagshub/data_engine/annotation/importer.py (5)
67-69: LGTM!The
is_video_formatproperty correctly identifies video-oriented annotation types.
97-131: LGTM!The format-specific handling is well-organized with proper branching for different input types (zip vs directory/file) and video annotation flattening.
135-152: LGTM!The helper methods correctly detect video annotation dictionaries and flatten frame-indexed annotations appropriately.
220-225: LGTM!Good defensive handling for video annotations where
ann.filenamemay beNone, falling back tonew_filenamewith proper validation for non-video formats.
370-384: LGTM!The video task conversion correctly filters for video annotations and uses the appropriate converter. Taking only the first task is acceptable since annotations are grouped by filename.
tests/data_engine/annotation_import/test_mot.py (2)
1-24: LGTM!Good test setup with proper fixtures for mocking source prefix.
134-172: LGTM!Good test coverage for MOT export including success cases, error handling for missing annotations, and explicit dimension handling.
tests/data_engine/annotation_import/test_cvat_video.py (3)
1-23: LGTM!Proper test setup mirroring the MOT test structure.
73-86: LGTM!The Label Studio round-trip test is valuable for validating that video annotations can be serialized and deserialized correctly.
92-127: LGTM!Good test coverage for CVAT video export including success cases, error handling, and custom video name propagation.
| from dagshub_annotation_converter.formats.label_studio.videorectangle import VideoRectangleAnnotation | ||
| from dagshub_annotation_converter.formats.label_studio.task import task_lookup as _task_lookup | ||
|
|
||
| _task_lookup["videorectangle"] = VideoRectangleAnnotation | ||
|
|
||
| from dagshub_annotation_converter.formats.label_studio.videorectangle import VideoRectangleAnnotation | ||
| from dagshub_annotation_converter.formats.label_studio.task import task_lookup as _task_lookup | ||
|
|
||
| _task_lookup["videorectangle"] = VideoRectangleAnnotation |
There was a problem hiding this comment.
Remove duplicate imports and registration.
Lines 23-26 and 28-31 are exact duplicates. This appears to be a merge conflict artifact or copy-paste error that needs to be cleaned up.
Proposed fix
-from dagshub_annotation_converter.formats.label_studio.videorectangle import VideoRectangleAnnotation
-from dagshub_annotation_converter.formats.label_studio.task import task_lookup as _task_lookup
-
-_task_lookup["videorectangle"] = VideoRectangleAnnotation
-
from dagshub_annotation_converter.formats.label_studio.videorectangle import VideoRectangleAnnotation
from dagshub_annotation_converter.formats.label_studio.task import task_lookup as _task_lookup
_task_lookup["videorectangle"] = VideoRectangleAnnotation📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| from dagshub_annotation_converter.formats.label_studio.videorectangle import VideoRectangleAnnotation | |
| from dagshub_annotation_converter.formats.label_studio.task import task_lookup as _task_lookup | |
| _task_lookup["videorectangle"] = VideoRectangleAnnotation | |
| from dagshub_annotation_converter.formats.label_studio.videorectangle import VideoRectangleAnnotation | |
| from dagshub_annotation_converter.formats.label_studio.task import task_lookup as _task_lookup | |
| _task_lookup["videorectangle"] = VideoRectangleAnnotation | |
| from dagshub_annotation_converter.formats.label_studio.videorectangle import VideoRectangleAnnotation | |
| from dagshub_annotation_converter.formats.label_studio.task import task_lookup as _task_lookup | |
| _task_lookup["videorectangle"] = VideoRectangleAnnotation |
| <<<<<<< HEAD | ||
| # --- import --- | ||
|
|
||
|
|
||
| def test_import_coco_from_file(ds, tmp_path): | ||
| _write_coco(tmp_path, _make_coco_json()) | ||
| importer = AnnotationImporter(ds, "coco", tmp_path / "annotations.json", load_from="disk") | ||
| ======= | ||
| # --- COCO import --- | ||
|
|
||
|
|
||
| def test_import_coco_from_file(ds, tmp_path): | ||
| coco_file = tmp_path / "annotations.json" | ||
| coco_file.write_text(json.dumps(_make_coco_json())) | ||
|
|
||
| importer = AnnotationImporter(ds, "coco", coco_file, load_from="disk") | ||
| >>>>>>> 550a453 (initial commite) |
There was a problem hiding this comment.
Unresolved merge conflicts must be resolved before merging.
This file contains git merge conflict markers (<<<<<<< HEAD, =======, >>>>>>> 550a453) throughout. The file has invalid Python syntax and will fail to parse/execute. Please resolve all merge conflicts.
Affected regions include lines 28-44, 52-77, 74-86, 92-107, 129-155, 157-180, 194-228, 237-335, and more.
🧰 Tools
🪛 Ruff (0.15.1)
[warning] 28-28: Expected a statement
(invalid-syntax)
[warning] 28-28: Expected a statement
(invalid-syntax)
[warning] 28-28: Expected a statement
(invalid-syntax)
[warning] 28-28: Expected a statement
(invalid-syntax)
[warning] 35-35: Expected a statement
(invalid-syntax)
[warning] 35-35: Expected a statement
(invalid-syntax)
[warning] 35-35: Expected a statement
(invalid-syntax)
[warning] 35-35: Expected a statement
(invalid-syntax)
[warning] 35-36: Expected a statement
(invalid-syntax)
[warning] 44-44: Expected a statement
(invalid-syntax)
[warning] 44-44: Expected a statement
(invalid-syntax)
[warning] 44-44: Expected a statement
(invalid-syntax)
[warning] 44-44: Expected a statement
(invalid-syntax)
[warning] 44-44: Simple statements must be separated by newlines or semicolons
(invalid-syntax)
[warning] 44-44: Expected ,, found name
(invalid-syntax)
# Conflicts: # dagshub/data_engine/annotation/importer.py # dagshub/data_engine/annotation/metadata.py # dagshub/data_engine/model/datapoint.py # dagshub/data_engine/model/query_result.py # tests/data_engine/annotation_import/test_cvat_video.py # tests/data_engine/annotation_import/test_mot.py
55dc220 to
89d6395
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (4)
tests/data_engine/annotation_import/test_mot.py (1)
107-109: Consider usingnext(iter(...))for single-element access.For accessing the first (and only expected) element from a dict,
next(iter(result.values()))is more idiomatic and avoids creating an intermediate list.♻️ Proposed fix
- anns = list(result.values())[0] + anns = next(iter(result.values()))And similarly at line 122:
- assert len(list(result.values())[0]) == 2 + assert len(next(iter(result.values()))) == 2Also applies to: 121-122
dagshub/data_engine/model/query_result.py (1)
1117-1120: Dead variableann_path.The variable
ann_pathon line 1117 is assigned but only used to extract.nameand.stem. However,ann_filenameis a string and you could usePath(ann_filename).nameandPath(ann_filename).stemdirectly. This isn't a bug, but the assignment could be simplified.♻️ Suggested simplification
assert ann_filename is not None local_video = self._prepare_video_file_for_export(local_download_root, ann_filename) if local_video is None: raise FileNotFoundError( f"Could not find local downloaded video file for '{ann_filename}' " f"under '{local_download_root}'." ) - ann_path = Path(ann_filename) video_files[ann_filename] = local_video - video_files[ann_path.name] = local_video - video_files[ann_path.stem] = local_video + video_files[Path(ann_filename).name] = local_video + video_files[Path(ann_filename).stem] = local_videotests/data_engine/annotation_import/test_cvat_video.py (2)
235-236: Use raw string for regex pattern inmatch=.The pattern
"missing.mp4"contains a dot which is a regex metacharacter matching any character. While it works here, using a raw string makes the intent clearer.♻️ Proposed fix
- with pytest.raises(FileNotFoundError, match="missing.mp4"): + with pytest.raises(FileNotFoundError, match=r"missing\.mp4"):
36-36: Consider usingnext(iter(...))for single-element access.Same as in
test_mot.py,next(iter(result.values()))is more idiomatic thanlist(result.values())[0].♻️ Proposed fix
- anns = list(result.values())[0] + anns = next(iter(result.values()))
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
dagshub/auth/token_auth.pydagshub/data_engine/annotation/importer.pydagshub/data_engine/annotation/metadata.pydagshub/data_engine/model/query_result.pytests/data_engine/annotation_import/test_cvat_video.pytests/data_engine/annotation_import/test_mot.pytests/data_engine/conftest.pytests/mocks/repo_api.py
✅ Files skipped from review due to trivial changes (1)
- dagshub/auth/token_auth.py
🧰 Additional context used
🧬 Code graph analysis (2)
tests/data_engine/conftest.py (1)
dagshub/data_engine/client/models.py (3)
DatasourceType(39-42)MetadataSelectFieldSchema(82-101)PreprocessingStatus(32-36)
dagshub/data_engine/annotation/metadata.py (1)
dagshub/common/helpers.py (1)
log_message(100-107)
🪛 Ruff (0.15.1)
tests/data_engine/annotation_import/test_mot.py
[warning] 108-108: Prefer next(iter(result.values())) over single element slice
Replace with next(iter(result.values()))
(RUF015)
[warning] 122-122: Prefer next(iter(result.values())) over single element slice
Replace with next(iter(result.values()))
(RUF015)
[warning] 190-190: Unused function argument: self
(ARG001)
[warning] 190-190: Unused function argument: args
(ARG001)
[warning] 190-190: Unused function argument: kwargs
(ARG001)
[warning] 195-195: Unused function argument: video_annotations
(ARG001)
[warning] 195-195: Unused function argument: context
(ARG001)
[warning] 195-195: Unused function argument: video_file
(ARG001)
[warning] 223-223: Unused function argument: video_annotations
(ARG001)
[warning] 223-223: Unused function argument: video_file
(ARG001)
[warning] 268-268: Unused function argument: self
(ARG001)
[warning] 268-268: Unused function argument: args
(ARG001)
[warning] 268-268: Unused function argument: kwargs
(ARG001)
[warning] 274-274: Unused function argument: video_annotations
(ARG001)
[warning] 274-274: Unused function argument: context
(ARG001)
[warning] 274-274: Unused function argument: video_files
(ARG001)
[warning] 308-308: Unused function argument: self
(ARG001)
[warning] 308-308: Unused function argument: args
(ARG001)
[warning] 308-308: Unused function argument: kwargs
(ARG001)
[warning] 314-314: Unused function argument: video_annotations
(ARG001)
[warning] 314-314: Unused function argument: context
(ARG001)
tests/data_engine/annotation_import/test_cvat_video.py
[warning] 36-36: Prefer next(iter(result.values())) over single element slice
Replace with next(iter(result.values()))
(RUF015)
[warning] 91-91: Unused function argument: self
(ARG001)
[warning] 91-91: Unused function argument: args
(ARG001)
[warning] 91-91: Unused function argument: kwargs
(ARG001)
[warning] 119-119: Unused function argument: self
(ARG001)
[warning] 119-119: Unused function argument: args
(ARG001)
[warning] 119-119: Unused function argument: kwargs
(ARG001)
[warning] 161-161: Unused function argument: self
(ARG001)
[warning] 161-161: Unused function argument: args
(ARG001)
[warning] 161-161: Unused function argument: kwargs
(ARG001)
[warning] 189-189: Unused function argument: self
(ARG001)
[warning] 189-189: Unused function argument: args
(ARG001)
[warning] 189-189: Unused function argument: kwargs
(ARG001)
[warning] 196-196: Unused function argument: video_annotations
(ARG001)
[warning] 198-198: Unused function argument: video_name
(ARG001)
[warning] 199-199: Unused function argument: image_width
(ARG001)
[warning] 200-200: Unused function argument: image_height
(ARG001)
[warning] 229-229: Unused function argument: self
(ARG001)
[warning] 229-229: Unused function argument: args
(ARG001)
[warning] 229-229: Unused function argument: kwargs
(ARG001)
[warning] 235-235: Pattern passed to match= contains metacharacters but is neither escaped nor raw
(RUF043)
dagshub/data_engine/model/query_result.py
[warning] 811-811: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 820-820: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 931-931: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 988-988: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 1018-1021: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 1034-1034: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 1037-1040: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 1082-1082: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 1113-1116: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 1137-1137: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 1140-1143: Avoid specifying long messages outside the exception class
(TRY003)
dagshub/data_engine/annotation/importer.py
[warning] 161-161: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 289-289: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: build (3.12)
- GitHub Check: build (3.9)
- GitHub Check: build (3.11)
- GitHub Check: build (3.13)
- GitHub Check: build (3.10)
🔇 Additional comments (14)
tests/mocks/repo_api.py (1)
116-118: LGTM!The new
idproperty follows the existing mock pattern and provides a simple constant return value appropriate for testing.tests/data_engine/conftest.py (1)
8-8: LGTM!The
DatasourceTypeimport andsource_typeassignment properly enhance the mock datasource fixture to represent repository-backed sources, providing more complete test coverage for the new annotation workflows.Also applies to: 28-29
dagshub/data_engine/annotation/metadata.py (2)
25-28: LGTM!The
VideoRectangleAnnotationimport and task lookup registration enables parsing of video rectangle Label Studio tasks, which is needed for the new video annotation support.
279-299: No action required—discarding the second return value is intentional.The
load_coco_from_json_stringfunction returns a tuple(grouped, context). The individual annotation objects ingroupedalready embed their category information (e.g.,categories={"cat": 1.0}), so the global category metadata (second return value) is not needed. This pattern is consistent throughout the codebase (e.g.,load_coco_from_fileinimporter.pyalso discards it), and tests confirm the method works correctly without accessing the second return value.tests/data_engine/annotation_import/test_mot.py (1)
1-372: LGTM - Comprehensive test coverage for MOT format.The test file provides thorough coverage of MOT import/export functionality including:
- Video annotation dict detection and flattening
- Import from directory and zip archives
- Datasource path handling
- Label Studio task conversion
- Export with proper directory structure and dimensions handling
- Multi-video export scenarios
dagshub/data_engine/annotation/importer.py (4)
68-70: LGTM!Clean property to identify video formats, centralizing the format check logic.
165-171: LGTM!Good defensive handling for the video annotation detection - checking dict type, non-empty, and using first key type to distinguish frame-indexed (int keys) from file-indexed (str keys) annotations.
285-290: Appropriate handling of None filenames for video formats.The fallback to
new_filenamefor video formats withNoneannotation filenames is correct, as video annotations may not have per-frame filenames set.
447-449: Thevideo_ir_to_ls_video_tasksfunction is from an undocumented/internal API indagshub-annotation-converter(version ≥0.1.12); its behavior cannot be verified from public documentation.The function being called is not part of the public API documentation, and there are no tests or comments in this codebase explaining why only the first task is used. Verify with the annotation-converter maintainers whether this function:
- Always returns exactly one task per video file
- Can return multiple tasks (and if so, whether taking
[0]is intentional or a data loss issue)dagshub/data_engine/model/query_result.py (4)
787-813: LGTM!Well-designed helper methods for video export:
_get_all_video_annotationsproperly filters forIRVideoBBoxAnnotation_prepare_video_file_for_exporthandles both direct path and source-prefix-prefixed paths_get_annotation_filenamecorrectly handles None, single, and list filename cases
815-823: LGTM!Clean helper that centralizes annotation field resolution logic and improves maintainability of the export methods.
903-949: LGTM!The COCO export follows the established pattern from YOLO export, with proper handling of source prefix, optional class mapping, and appropriate logging.
951-1050: LGTM - Well-structured video export methods.Both
export_as_motandexport_as_cvat_videofollow consistent patterns:
- Properly resolve annotation field
- Handle single vs multi-source scenarios
- Download videos when dimensions are missing for converter probing
- Use appropriate converter functions
- Provide clear error messages for missing files
Also applies to: 1052-1161
tests/data_engine/annotation_import/test_cvat_video.py (1)
1-276: LGTM - Thorough test coverage for CVAT video format.Excellent coverage including:
- Import from XML
- Video annotation filtering and aggregation
- Export with XML structure verification
- Error handling for no/image-only annotations
- Custom video name propagation
- Multi-datapoint scenarios
- Missing dimension and file handling
Based on changes to dagshub-annotation-converter to the same effect: DagsHub/dagshub-annotation-converter#25
Implementation Details
The PR extends annotation support through a multi-layered approach:
Importer Module Enhancements (
importer.py):AnnotationTypeliteral to include "coco", "mot", and "cvat_video"is_video_formatproperty to identify video-based annotation types_flatten_video_annotationsto convert per-frame annotations into single video-entry dictionariesconvert_to_ls_tasksfor Label Studio video task generationremap_annotationsto handle cases where video format annotations lack explicit filenames_is_video_annotation_dict,_flatten_cvat_fs_annotations,_flatten_mot_fs_annotationsMetadata Annotations Support (
metadata.py):VideoRectangleAnnotationsupport in Label Studio task lookup registryadd_coco_annotationmethod to load COCO-format JSON and integrate annotations into metadataQuery Result Export API (
query_result.py):export_as_coco()- generates COCO JSON with optional class mapping and custom output filenameexport_as_mot()- supports single/multi-video export with optional dimension parametersexport_as_cvat_video()- handles video CVAT XML generation with video name customization_get_all_video_annotations,_prepare_video_file_for_export,_get_annotation_filename_resolve_annotation_fieldto auto-select available annotation field with alphabetical fallbackTest Coverage:
test_coco.py(218 lines): Import/export flows, filename rewriting, field resolution, class mapping, output customizationtest_mot.py(383 lines): Video annotation flattening, directory/zip import, MOT export structure, dimension handlingtest_cvat_video.py(276 lines): CVAT video XML import/export, track/box element generation, multi-datapoint supportCode Quality and Testing Infrastructure:
ruff.tomllinting configuration with comprehensive rule set including:__init__.pyandconftest.pyDatasourceType.REPOSITORYassignment in test fixtures (tests/data_engine/conftest.py)idproperty toMockRepoAPItest utilitytoken_auth.py(no behavioral change - logical equivalence in boolean expression)Documentation Note: Per PR comments, the supported formats list at dagshub.com/docs/use_cases/annotation/#supported-formats-for-annotation_imports requires manual update to reflect MOT, CVAT Video, COCO BBOX, and COCO Segmentation support.