Preserve arbitrary pose HDF5 attributes in NWB conversion #402

Merged: gbeane merged 2 commits into main from feature/nwb-preserve-hdf5-attributes on Jun 22, 2026

Conversation

gbeane commented Jun 19, 2026

Collaborator

Summary

When converting a JABS pose HDF5 file to NWB, any attributes stored in the source file that JABS does not explicitly parse were previously dropped. This change captures every attribute from the entire pose HDF5 file and carries it into the NWB output as metadata, so arbitrary provenance attributes are not lost.

What changed

src/jabs/scripts/cli/convert_to_nwb.py:

  • _collect_hdf5_attributes(path) walks the whole file (root group, all sub-groups, all datasets) via h5py's Group.visititems and records each object's attributes keyed by its HDF5 path ("/", "poseest", "poseest/points", …). Objects with no attributes are omitted.
  • _h5_attr_to_jsonable(value) normalizes h5py return types (numpy scalars/arrays, bytes from fixed-length string attrs) into plain JSON-friendly types. An unrecognized type is preserved as str(value) with a warning rather than dropped.
  • pose_to_pose_data stores the result under metadata["hdf5_attributes"].
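
The two helpers described above could look roughly like the following. This is a minimal sketch written from the description, not the merged code: the function names drop the leading underscore to mark them as illustrative, the normalizer duck-types on `.tolist()` (which numpy scalars and arrays both provide) instead of importing numpy, and the exact warning text is an assumption.

```python
import warnings
from typing import Any


def h5_attr_to_jsonable(value: Any) -> Any:
    """Convert an HDF5 attribute value into a JSON-friendly Python object."""
    # numpy scalars and arrays both expose .tolist(), which yields plain
    # Python scalars or (nested) lists. Duck-typing avoids a numpy import.
    if hasattr(value, "tolist"):
        value = value.tolist()
    # Fixed-length string attributes come back from h5py as bytes.
    if isinstance(value, bytes):
        return value.decode("utf-8", errors="replace")
    if value is None or isinstance(value, (str, bool, int, float)):
        return value
    if isinstance(value, (list, tuple)):
        return [h5_attr_to_jsonable(item) for item in value]
    # Assumed fallback behavior: keep a string form rather than drop the value.
    warnings.warn(
        f"unsupported HDF5 attribute type {type(value).__name__}; storing str(value)",
        stacklevel=2,
    )
    return str(value)


def collect_hdf5_attributes(path: str) -> dict[str, dict[str, Any]]:
    """Return {hdf5_path: {attr_name: value}} for every object that has attributes."""
    import h5py  # imported lazily so the normalizer works without h5py installed

    collected: dict[str, dict[str, Any]] = {}

    def record(name: str, obj: Any) -> None:
        attrs = {key: h5_attr_to_jsonable(val) for key, val in obj.attrs.items()}
        if attrs:  # objects with no attributes are omitted
            collected[name] = attrs
        # Returning None tells visititems to keep walking.

    with h5py.File(path, "r") as h5:
        record("/", h5)  # visititems does not visit the group it is called on
        h5.visititems(record)
    return collected
```

Handling the root group explicitly matters because `visititems` only visits the members below the group it is called on, so root-level attributes would otherwise be missed.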

No NWB-adapter changes were needed: PoseData.metadata already flows through the writer into the jabs_metadata scratch JSON and is recovered on read, giving a lossless round-trip.
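
The lossless round-trip depends on every stored value being JSON-native. The sketch below illustrates that property with the standard `json` module only; it does not use the NWB adapter, and the attribute values are made up for illustration.

```python
import json

# Hypothetical metadata payload; keys follow the HDF5-path convention
# described above, values are invented for this example.
metadata = {
    "hdf5_attributes": {
        "/": {"source": "example"},
        "poseest": {"version": [6], "cm_per_pixel": 0.0789},
        "poseest/points": {"config": "example.yaml"},
    }
}

# Values that are already plain str/int/float/list/dict survive unchanged.
restored = json.loads(json.dumps(metadata))
assert restored == metadata

# A tuple, by contrast, comes back as a list, so un-normalized values
# would not compare equal after a round-trip.
lossy = json.loads(json.dumps({"version": (6,)}))
assert lossy == {"version": [6]}
```

This is why normalizing to lists and plain scalars before storing the metadata is enough for the read-back dict to compare equal to the original.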

Design note

All attributes are captured verbatim, including ones JABS already parses separately (version, cm_per_pixel). This preserves the raw on-disk values exactly and avoids a fragile allowlist that would need maintenance as pose formats evolve.

Testing

  • New unit tests cover scalar/array/bytes normalization, the unsupported-type fallback, full-file collection (incl. empty-group omission), and JSON-serializability.
  • All 52 existing NWB adapter tests pass (no regression).
  • A real PoseData → NWB → PoseData round-trip confirms a nested hdf5_attributes dict comes back identical.
  • ruff check and ruff format clean.

🤖 Generated with Claude Code

gbeane self-assigned this Jun 19, 2026
gbeane requested review from Copilot and ptuan5 June 19, 2026 16:17

Copilot AI left a comment

Contributor

Pull request overview

This PR updates the NWB conversion pipeline to preserve all HDF5 attributes from JABS pose files by collecting them across the entire HDF5 object tree and embedding them into PoseData.metadata["hdf5_attributes"], with normalization to JSON-serializable types.

Changes:

  • Added _collect_hdf5_attributes(path) to traverse an HDF5 file (root/group/dataset) and collect attributes keyed by HDF5 object path.
  • Added _h5_attr_to_jsonable(value) to normalize h5py/numpy attribute values into JSON-friendly Python types (with a warning-based string fallback).
  • Added unit tests covering scalar/array/bytes normalization and full-file attribute collection + JSON serializability.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/jabs/scripts/cli/convert_to_nwb.py | Collects and serializes all pose-file HDF5 attributes into PoseData.metadata during NWB conversion. |
| tests/scripts/test_convert_to_nwb.py | Adds tests for attribute normalization and attribute collection behavior. |

Comment thread src/jabs/scripts/cli/convert_to_nwb.py
Comment thread tests/scripts/test_convert_to_nwb.py
gbeane merged commit 954d1b7 into main Jun 22, 2026
5 checks passed
gbeane deleted the feature/nwb-preserve-hdf5-attributes branch June 22, 2026 19:26