Skip to content

Torchvision crop#6353

Open
mdabek-nvidia wants to merge 10 commits into
NVIDIA:mainfrom
mdabek-nvidia:torchvision_crop
Open

Torchvision crop#6353
mdabek-nvidia wants to merge 10 commits into
NVIDIA:mainfrom
mdabek-nvidia:torchvision_crop

Conversation

@mdabek-nvidia
Copy link
Copy Markdown
Collaborator

Category:

New feature

Description:

Torchvision API operators RandomCrop and functional crop.
Unit tests checking conformance with Torchvision.
Known limitations: tensor and PIL.Image based paddings are not supported

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

Signed-off-by: Marek Dabek <mdabek@nvidia.com>
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
@mdabek-nvidia
Copy link
Copy Markdown
Collaborator Author

@greptileai please review

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 18, 2026

Greptile Summary

This PR adds torchvision-compatible RandomCrop and functional crop operators to DALI's experimental torchvision module. Both operators wrap fn.slice with layout-aware axis detection and fuse padding into the slice call to avoid a separate pass.

  • RandomCrop: Supports explicit padding, pad_if_needed, and multiple padding modes by computing a random (top, left) offset and passing a negative anchor into fn.slice with out_of_bounds_policy. Dict-fill (fill={type: value}) is stubbed out in the type annotation but disabled at runtime.
  • Functional crop: Stateless wrapper over ndd.slice with coordinate-type validation and layout-based axis selection; delegates size validation to RandomCrop.verify_args.
  • Tests: Comprehensive torchvision parity tests across CPU/GPU, tensor/PIL, and padding combinations; dict-fill tests are skipped with @unittest.skip.

Confidence Score: 3/5

The _verify_fill_value error message actively misleads users about what fill values are accepted after dict support was commented out.

When dict fill is passed (which the __init__ type annotation explicitly allows), _verify_fill_value raises a TypeError telling the user that a dict is valid — but they already passed a dict. This misdirects debugging and means the advertised API surface is silently broken for any caller who trusts the type hints.

dali/python/nvidia/dali/experimental/torchvision/v2/randomcrop.py — specifically _verify_fill_value and the commented-out validation block in _ValidateFill.verify.

Important Files Changed

Filename Overview
dali/python/nvidia/dali/experimental/torchvision/v2/randomcrop.py New RandomCrop operator with random offset computation, padding support, and fill validation — contains a misleading error message that lists dict as a valid fill type after dict support was commented out, plus TODO/commented-out code in a docstring.
dali/python/nvidia/dali/experimental/torchvision/v2/functional/crop.py New functional crop API using ndd.slice with layout-aware axis detection; reuses RandomCrop.verify_args for size/coordinate validation. Logic is sound.
dali/test/python/torchvision/test_tv_randomcrop.py Comprehensive test suite covering identity, padding, asymmetric padding, shape, and error-rejection cases; two # TODO comments for skipped dict-fill tests violate doc-no-todo-on-merge.
dali/test/python/torchvision/test_tv_crop.py Tests for functional crop covering tensors, PIL images, batches, dtype preservation, and error cases across CPU/GPU; correctness validated against torchvision reference.
dali/python/nvidia/dali/experimental/torchvision/init.py Exports RandomCrop and crop in the public API; trivial change, no issues.
dali/python/nvidia/dali/experimental/torchvision/v2/functional/init.py Adds crop to the functional module's public __all__; trivial, no issues.

Sequence Diagram

sequenceDiagram
    participant User
    participant RandomCrop
    participant _ValidateFill
    participant fn_slice as fn.slice

    User->>RandomCrop: __init__(size, padding, fill, ...)
    RandomCrop->>_ValidateFill: "verify(fill=fill)"
    Note over _ValidateFill: _verify_fill_value(fill)<br/>rejects dicts but error<br/>message says dict is valid
    RandomCrop->>RandomCrop: adjust_fill(fill)
    RandomCrop->>RandomCrop: adjust_padding(padding)

    User->>RandomCrop: __call__(image)
    RandomCrop->>RandomCrop: _kernel(in_h, in_w, c, tensor)
    Note over RandomCrop: compute padded_h/w,<br/>pad_if_needed adjustment
    RandomCrop->>RandomCrop: _randint(max_top) → top
    RandomCrop->>RandomCrop: _randint(max_left) → left
    RandomCrop->>fn_slice: "fn.slice(tensor, anchor, shape, axis_names=WH)"
    fn_slice-->>User: cropped tensor

    User->>crop: crop(inpt, top, left, height, width)
    crop->>crop: _verify_crop_coordinate(top/left)
    crop->>RandomCrop: "verify_args(size=(height,width), ...)"
    crop->>ndd_slice: "ndd.slice(inpt, axes=_get_crop_axes(inpt))"
    ndd_slice-->>User: cropped output
Loading

Reviews (5): Last reviewed commit: "Review comments" | Re-trigger Greptile

Comment thread dali/python/nvidia/dali/experimental/torchvision/v2/randomcrop.py Outdated
Comment thread dali/python/nvidia/dali/experimental/torchvision/v2/randomcrop.py
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
Comment thread dali/python/nvidia/dali/experimental/torchvision/v2/randomcrop.py Fixed
Comment thread dali/python/nvidia/dali/experimental/torchvision/v2/randomcrop.py Fixed
Comment thread dali/python/nvidia/dali/experimental/torchvision/v2/randomcrop.py Fixed
@mdabek-nvidia
Copy link
Copy Markdown
Collaborator Author

@greptileai re-review please

Signed-off-by: Marek Dabek <mdabek@nvidia.com>
Comment thread dali/python/nvidia/dali/experimental/torchvision/v2/randomcrop.py Outdated
@mdabek-nvidia mdabek-nvidia marked this pull request as ready for review May 19, 2026 13:37
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
Comment thread dali/python/nvidia/dali/experimental/torchvision/v2/randomcrop.py
Comment thread dali/python/nvidia/dali/experimental/torchvision/v2/randomcrop.py
self.pad_if_needed = pad_if_needed
self.fill = RandomCrop.adjust_fill(fill)
self.padding_mode = padding_mode
self.needs_padding = pad_if_needed or any(self.padding)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Question] needs_padding = pad_if_needed or any(self.padding) — when pad_if_needed=True but the input is already large enough, pad_h = pad_w = 0 and the fn.slice block (lines 229-240) becomes an effective no-op that still runs in the graph (start=0, size=in_w/in_h). Worth a comment confirming that's intentional / cheap enough, or short-circuiting at graph construction time when both any(self.padding) == False and we don't actually need extra padding? Not blocking — just want to flag the extra graph node for large-input/pad_if_needed=True callers.

Copy link
Copy Markdown
Collaborator Author

@mdabek-nvidia mdabek-nvidia May 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added checking if padding is needed

Comment thread dali/test/python/torchvision/test_tv_randomcrop.py Outdated
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
@mdabek-nvidia
Copy link
Copy Markdown
Collaborator Author

mdabek-nvidia commented May 22, 2026

@greptileai, re-review

pad_h = dali.math.max(crop_h - padded_h, 0)
pad_w = dali.math.max(crop_w - padded_w, 0)
pad_top = pad_top + pad_h
pad_bottom = pad_bottom + pad_h
pad_top = pad_top + pad_h
pad_bottom = pad_bottom + pad_h
pad_left = pad_left + pad_w
pad_right = pad_right + pad_w
Comment on lines +86 to +100
raise TypeError(f"fill must be a number, sequence of numbers, None or a dict, got {fill!r}")

@classmethod
def verify(cls, *, fill, **_) -> None:
"""
TODO: Fill using dictionary pattern is currently not supported
if isinstance(fill, dict):
for key, value in fill.items():
if not isinstance(key, (type, str)):
raise TypeError(f"fill dictionary keys must be types or strings, got {key!r}")
cls._verify_fill_value(value)
else:
cls._verify_fill_value(fill)
"""
cls._verify_fill_value(fill)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 [Bug] _verify_fill_value error message falsely lists dict as a valid fill type

The error message on line 86 says "fill must be a number, sequence of numbers, None or a dict", but dict is not handled by _verify_fill_value and falls through to this same raise — so a user who passes fill={torch.Tensor: 9} (which matches the __init__ type annotation) gets TypeError: fill must be a number, sequence of numbers, None or a dict and then tries a dict again, stuck in a loop. The "or a dict" fragment is a leftover from when dict support was enabled; now that it's commented out, the message should be corrected to remove it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants