refactor(dada): dedup ASV serialization across dada handlers (#13) #31

Merged: cjfields merged 1 commit into main from refactor/dedup-dada-handlers on Jun 9, 2026

cjfields commented Jun 8, 2026

Summary

Addresses #13. During dada-pooled codegen, redundant code was flagged across the dada handlers; dada-pseudo has since been added and is now folded into scope.

This is a behavior-preserving dedup pass. Output JSON schema and values are byte-identical — verified by the existing dada_from_fastq_matches_dada_from_derep_json, dada_multi_input_matches_per_file_runs, and pooled/pseudo determinism + downstream-feed tests.
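
A byte-identity check of this kind can be sketched as follows. This is an illustration only: the tests named above are not reproduced here, and `first_diff` is a hypothetical helper, not something this PR adds. It reports where two serialized outputs first diverge, which is usually more useful in a failing assertion than a bare `assert_eq!` on two large JSON strings.

```rust
/// Hypothetical helper: returns the offset of the first differing byte,
/// or `None` when the two buffers are byte-identical.
fn first_diff(a: &[u8], b: &[u8]) -> Option<usize> {
    let common = a.len().min(b.len());
    match a.iter().zip(b).position(|(x, y)| x != y) {
        // A mismatch inside the shared prefix.
        Some(i) => Some(i),
        // Shared prefix matches but one buffer is longer.
        None if a.len() != b.len() => Some(common),
        // Same length, same bytes.
        None => None,
    }
}
```

In an equivalence test, the two buffers would be the JSON emitted before and after the refactor (or by two code paths expected to agree), with the assertion being `first_diff(old, new) == None`.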

What changed

Hoisted the per-handler-local AsvEntry / DadaStats structs to module level and extracted three duplicated blocks (each appeared 3×) into shared helpers:

  • birth_type_str(&BirthType) — the 4-arm BirthType match
  • asv_entry_from_cluster(cluster, abundance) — decode + birth_type + field copy; abundance is explicit so the pooled per-sample path passes its recomputed read count
  • to_json(value, compact) — compact-vs-pretty serialization
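
As a rough illustration of the first two helpers, here is a minimal std-only sketch. Only the helper names and the explicit `abundance` parameter come from this PR. The `BirthType` variant names, the label strings, the `Cluster` fields, and the `decode` function are assumptions made so the example compiles on its own; the crate's actual definitions may differ. `to_json` is left out because it depends on the crate's serializer.

```rust
// Hypothetical stand-ins for the crate's real types (not shown in this PR).
#[derive(Debug, Clone, Copy, PartialEq)]
enum BirthType {
    Initial,
    Abundance,
    Prior,
    Singleton,
}

struct Cluster {
    encoded_seq: Vec<u8>, // assumed encoding: 0..=3 maps to A, C, G, T
    birth_type: BirthType,
    birth_pval: f64,
}

// Module-level entry type shared by all handlers (field set is assumed).
#[derive(Debug, PartialEq)]
struct AsvEntry {
    sequence: String,
    abundance: u64,
    birth_type: &'static str,
    birth_pval: f64,
}

/// The 4-arm match, written once. The label strings are assumptions.
fn birth_type_str(bt: &BirthType) -> &'static str {
    match bt {
        BirthType::Initial => "initial",
        BirthType::Abundance => "abundance",
        BirthType::Prior => "prior",
        BirthType::Singleton => "singleton",
    }
}

/// Assumed decoder; unknown codes become 'N'.
fn decode(encoded: &[u8]) -> String {
    encoded
        .iter()
        .map(|&b| match b {
            0 => 'A',
            1 => 'C',
            2 => 'G',
            3 => 'T',
            _ => 'N',
        })
        .collect()
}

/// Decode + birth_type + field copy. `abundance` is a parameter rather than
/// read from the cluster, so a caller can supply a recomputed count.
fn asv_entry_from_cluster(cluster: &Cluster, abundance: u64) -> AsvEntry {
    AsvEntry {
        sequence: decode(&cluster.encoded_seq),
        abundance,
        birth_type: birth_type_str(&cluster.birth_type),
        birth_pval: cluster.birth_pval,
    }
}
```

Under this shape, a pooled per-sample caller would pass its recomputed read count as `abundance`, while the other handlers would pass whatever count they already hold for the cluster.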

Net −67 lines (62 insertions, 129 deletions).

The single-input dada handler keeps its own DadaOutput because it carries aux-only fields (ClusterStatJson/BirthSubJson/AuxJson); fully hoisting that would pull in the aux types for marginal gain.

Deliberately out of scope

Lower-value / higher-churn items left for a possible follow-up: rayon pool-init boilerplate, the [dada]/[dada-pooled]/[dada-pseudo] log tags, prior-FASTA loading, and sample-name resolution.

Verification

  • cargo build clean
  • cargo clippy clean
  • all 54 tests pass

🤖 Generated with Claude Code

…pseudo (#13)

Hoist the per-handler-local `AsvEntry` and `DadaStats` structs to module
level and extract the duplicated cluster→ASV conversion and JSON
serialization into shared helpers:

- `birth_type_str()` — the 4-arm BirthType match (was duplicated 3x)
- `asv_entry_from_cluster()` — decode + birth_type + field copy; takes
  `abundance` explicitly so the pooled path passes its recomputed
  per-sample read count (was duplicated 3x)
- `to_json()` — compact-vs-pretty serialization (was duplicated 3x)

No behavioral change: output JSON schema and values are byte-identical.
Net -67 lines. The single-input `dada` handler keeps its own
`DadaOutput` since it carries the aux-only fields.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
cjfields merged commit 9a0c5da into main on Jun 9, 2026
5 checks passed
cjfields deleted the refactor/dedup-dada-handlers branch on June 9, 2026 at 19:33