v3.3: serialization, canonical form, local-part normalizer #52

Merged
mmucklo merged 1 commit into master from feature/v3.3-polish
Apr 13, 2026

@mmucklo mmucklo commented Apr 13, 2026

Summary

Delivers the v3.3 roadmap (ecosystem bridges deferred per direction). Fully additive — no breaking changes, no behavior changes for v3.2 callers. Also updates README examples to use the typed parseSingle() API throughout.

What's new

Serialization on value objects

  • ParsedEmailAddress::toArray(): round-trips to the legacy parse() array shape exactly (assertable via assertSame).
  • ParsedEmailAddress::toJson(int $flags = 0): wraps json_encode with JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES always set; any extra $flags (e.g. JSON_PRETTY_PRINT) are passed through. ParseErrorCode serializes to its backing string value.
  • ParseResult::toArray() + toJson(): same pattern for the multi-address container; each address is serialized via ParsedEmailAddress::toArray().
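
To make the toJson() behaviour concrete, here is a minimal sketch of the pattern described above. The class, enum case, and array keys (DemoParsed, DemoErrorCode, 'simple_address', etc.) are hypothetical stand-ins, not the library's actual code, and the JSON_THROW_ON_ERROR flag is an assumption added for the sketch.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the library's error-code enum.
enum DemoErrorCode: string
{
    case MultipleOpeningAngle = 'multiple_opening_angle';
}

// Hypothetical stand-in for a parsed-address value object.
final class DemoParsed
{
    public function __construct(
        public readonly string $simpleAddress,
        public readonly bool $invalid,
        public readonly ?DemoErrorCode $invalidReasonCode = null,
    ) {}

    /** @return array<string, mixed> Array keys here are illustrative only. */
    public function toArray(): array
    {
        return [
            'simple_address'      => $this->simpleAddress,
            'invalid'             => $this->invalid,
            'invalid_reason_code' => $this->invalidReasonCode,
        ];
    }

    public function toJson(int $flags = 0): string
    {
        // The two readability flags are always set; caller flags are OR-ed in.
        // JSON_THROW_ON_ERROR is an assumption of this sketch.
        return json_encode(
            $this->toArray(),
            JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR | $flags,
        );
    }
}

$bad  = new DemoParsed('', true, DemoErrorCode::MultipleOpeningAngle);
$json = $bad->toJson();

// A backed enum is encoded as its backing value by json_encode (PHP 8.1+).
$decoded = json_decode($json, true);
```

Because the enum is backed, no custom JsonSerializable implementation is needed for the error code to appear as a plain string in the output.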

Stringable

ParsedEmailAddress now implements \Stringable. (string) $parsed returns simpleAddress for valid addresses, empty string otherwise.
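
The Stringable contract described here can be sketched as follows. DemoAddress and its $invalid property are hypothetical names used only for illustration; the real value object may be shaped differently.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in; property names are assumptions of this sketch.
final class DemoAddress implements \Stringable
{
    public function __construct(
        public readonly string $simpleAddress,
        public readonly bool $invalid,
    ) {}

    public function __toString(): string
    {
        // Valid addresses render as their simple address; invalid ones as ''.
        return $this->invalid ? '' : $this->simpleAddress;
    }
}

$ok  = new DemoAddress('j@example.com', false);
$bad = new DemoAddress('', true);

// Drops straight into string contexts such as logging or templates.
$line = sprintf('to=%s', $ok);
```

Since an invalid address casts to an empty string rather than throwing, callers that need to distinguish the two cases would still check validity before relying on the cast.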

Canonical RFC 5322 display form

ParsedEmailAddress::canonical() applies minimal quoting per §3.2.4 (local-part) and §3.2.5 (phrase). Drops unnecessary quotes from the input and adds them only when required:

"John Doe" <j@example.com>        → John Doe <j@example.com>        (name quotes unnecessary)
"John Q. Public" <j@example.com>  → "John Q. Public" <j@example.com> (period requires quoting)
"with space"@example.com          → "with space"@example.com         (local-part quoting required)
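
The minimal-quoting rule behind these three examples can be sketched roughly as below. This is an illustration under simplifying assumptions (already-unquoted inputs, single spaces between phrase words, no obsolete syntax), not the library's canonical() implementation; all function names are hypothetical.

```php
<?php
declare(strict_types=1);

// atext characters from RFC 5322 section 3.2.3.
const ATEXT_CHARS = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!#$%&\'*+-/=?^_`{|}~';

function isAtom(string $s): bool
{
    return $s !== '' && strspn($s, ATEXT_CHARS) === strlen($s);
}

// dot-atom: atoms joined by single dots (e.g. "john.doe").
function isDotAtom(string $s): bool
{
    foreach (explode('.', $s) as $part) {
        if (!isAtom($part)) {
            return false;
        }
    }
    return true;
}

// Simplified phrase check: atoms separated by single spaces.
function isPhraseOfAtoms(string $s): bool
{
    foreach (explode(' ', $s) as $word) {
        if (!isAtom($word)) {
            return false;
        }
    }
    return true;
}

function quoteString(string $s): string
{
    // Escape backslashes and double quotes, then wrap in double quotes.
    return '"' . addcslashes($s, '"\\') . '"';
}

function canonicalSketch(string $name, string $local, string $domain): string
{
    $addr = (isDotAtom($local) ? $local : quoteString($local)) . '@' . $domain;
    if ($name === '') {
        return $addr;
    }
    $phrase = isPhraseOfAtoms($name) ? $name : quoteString($name);
    return $phrase . ' <' . $addr . '>';
}
```

With these definitions, a name such as John Q. Public stays quoted because the period is not an atext character, while John Doe is emitted bare.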

Local-part normalizer callback

New opt-in hook on ParseOptions:

$opts = ParseOptions::rfc5322()->withLocalPartNormalizer(
    fn (string $local, string $domain): string =>
        $domain === 'gmail.com'
            ? strtolower(str_replace('.', '', strtok($local, '+')))
            : $local,
);
$parser = new Parse(null, $opts);
$result = $parser->parseSingle('John.Doe+spam@gmail.com');
$result->localPartParsed;    // 'johndoe'
$result->simpleAddress;      // 'johndoe@gmail.com'
$result->originalAddress;    // 'John.Doe+spam@gmail.com' — verbatim preserved

The callback runs only after local-part validation succeeds, so it never sees invalid input. Returning the input unchanged is a safe no-op. The hook is stored on a ParseOptions property, so normalization is opt-in per parser instance.
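
Because the callback is a plain (string, string): string function, it can be extracted and unit-tested on its own. The sketch below is one possible standalone variant of the Gmail example; the function name is hypothetical. Unlike the inline example, it compares the domain case-insensitively and falls back to the original local-part when stripping would leave nothing (e.g. a local-part of '+tag'). Both are choices of this sketch, not documented library behaviour.

```php
<?php
declare(strict_types=1);

// Hypothetical helper; not part of the library.
function normalizeGmailLocal(string $local, string $domain): string
{
    if (strtolower($domain) !== 'gmail.com') {
        return $local; // pass-through for every other domain
    }

    // Keep everything before the first '+', then drop dots and lowercase.
    $base       = explode('+', $local, 2)[0];
    $normalized = strtolower(str_replace('.', '', $base));

    // Never return an empty local-part; keep the input instead.
    return $normalized === '' ? $local : $normalized;
}
```

Since the description states that any callable is accepted, a function like this could presumably be passed as withLocalPartNormalizer(normalizeGmailLocal(...)) using first-class callable syntax.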

Docs

  • README: nine parse($email, false) call sites rewritten to use parseSingle() / typed property access. One legacy example preserved under "Other Examples" to document the array shape, with a note pointing new code at the typed API.
  • Basic Usage section gains a v3.3 serialization + canonical block.
  • Rule-properties table gets localPartNormalizer.
  • CHANGELOG [3.3.0] entry.
  • ROADMAP v3.3 items flipped to [x]. Ecosystem bridges (Symfony, Laravel, PSR-14) marked deferred.
  • UPGRADE v3.2 → v3.3 section.

Test plan

  • composer ci passes (cs:check, PHPStan with --memory-limit=512M, 60 tests / 472 assertions)
  • toArray() round-trip verified with assertSame against the legacy parse() output
  • toJson() decodes cleanly; ParseErrorCode serializes to its backing string ('multiple_opening_angle', etc.)
  • Stringable returns simpleAddress for valid addresses, empty string for invalid
  • canonical() covered: addr-spec only, atext name (no quotes), unnecessary-quotes stripping, required-quotes retention for periods, quoted local-part with space, empty result for invalid
  • Normalizer covered: Gmail-style rewrite (dots + +tag), domain gating (pass-through for non-gmail), not invoked on invalid addresses, withLocalPartNormalizer(null) clears a previously-set callback

What's left (not in this PR)

Deferred ecosystem bridges (Symfony Constraint, Laravel rule, PSR-14 events) — listed but unchecked in ROADMAP; they'd be separate packages anyway.

v4.0 now trimmed to just the genuinely breaking items: DNS/MX validation callback and RFC 6854 group syntax.

Delivers all v3.3 roadmap items (ecosystem bridges deferred by user direction).
Fully additive — no breaking changes for v3.2 callers.

Serialization on value objects:
- ParsedEmailAddress::toArray(): round-trips to the legacy parse() array
  shape, field order matching the parser output. Useful when mixing typed
  and array-based code.
- ParsedEmailAddress::toJson(int $flags = 0): json_encode wrapper with
  JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES always set; additional
  flags (e.g. JSON_PRETTY_PRINT) passed through. ParseErrorCode is a
  BackedEnum so it serializes to its backing string value automatically.
- ParseResult::toArray() and ParseResult::toJson(): same pattern; each
  address in the batch is serialized via ParsedEmailAddress::toArray().

Stringable:
- ParsedEmailAddress now implements \Stringable. (string) $parsed returns
  simpleAddress for valid addresses, empty string otherwise. Lets a
  parsed address drop directly into string contexts (logging, templates).

Canonical RFC 5322 display form:
- ParsedEmailAddress::canonical(): returns the minimal-quoting canonical
  form per RFC 5322 §3.2.4 (local-part) and §3.2.5 (phrase). Drops
  unnecessary quotes that the input may have carried (e.g.
  '"John Doe" <a@b>' -> 'John Doe <a@b>') and adds quotes only where
  required (e.g. '"John Q. Public" <a@b>' stays quoted due to the
  non-atext period). Returns empty string for invalid addresses.
- Helpers isAtextDotAtom() and isPhraseAtoms() inspect the content
  against the ABNF character classes.

Local-part normalizer callback:
- New ParseOptions property $localPartNormalizer (readonly ?\Closure)
  and withLocalPartNormalizer(?callable) fluent builder. Any callable is
  accepted and wrapped via Closure::fromCallable for uniform storage.
- The callback fn(string $localPart, string $domain): string is invoked
  after local-part validation succeeds; its return value replaces
  local_part_parsed in the output (and the quoted display form is
  re-derived). originalAddress still preserves the verbatim input for
  audit/logging.
- Typical uses: Gmail dot-insensitivity and +tag stripping, or any
  domain-specific canonicalization. The callback is gated behind the
  validation success check so it only sees addresses that conform to
  the configured ParseOptions rules.
- cloneWith() handles localPartNormalizer specially: the $get() closure
  uses ?? which would treat explicit null as "fall back", so an explicit
  array_key_exists() check is used instead to support clearing.
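
The null-clearing issue in the last bullet can be illustrated with a small sketch. DemoOptions, its properties, and the override keys are hypothetical; only the ?? versus array_key_exists() distinction and the Closure::fromCallable wrapping are taken from the description above.

```php
<?php
declare(strict_types=1);

// Hypothetical options object; not the library's ParseOptions.
final class DemoOptions
{
    public function __construct(
        public readonly bool $strict = true,
        public readonly ?\Closure $normalizer = null,
    ) {}

    /** @param array<string, mixed> $overrides */
    public function cloneWith(array $overrides): self
    {
        // ?? works for non-nullable fields: a missing key falls back.
        $strict = $overrides['strict'] ?? $this->strict;

        // For a nullable field, ?? cannot tell "key absent" from "explicit null",
        // so array_key_exists() is used to let callers clear the value.
        $normalizer = array_key_exists('normalizer', $overrides)
            ? $overrides['normalizer']
            : $this->normalizer;

        return new self($strict, $normalizer);
    }

    public function withNormalizer(?callable $fn): self
    {
        // Any callable is wrapped into a Closure for uniform storage.
        return $this->cloneWith([
            'normalizer' => $fn === null ? null : \Closure::fromCallable($fn),
        ]);
    }
}

$a = (new DemoOptions())->withNormalizer('strtolower');
$b = $a->withNormalizer(null);            // clears the callback
$c = $a->cloneWith(['strict' => false]);  // keeps the callback
```

Had cloneWith() used ?? for the normalizer, $b would silently keep the callback from $a instead of clearing it.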

Docs:
- README: switch the nine parse($email, false) call sites in examples
  to parseSingle($email); keep one under "Other Examples" to document
  the legacy array shape with a note pointing new code at the typed
  API. Add serialization examples in Basic Usage. Add localPartNormalizer
  to the rule-properties table.
- ROADMAP: v3.3 section expanded with checkbox items; serialization,
  canonicalization, and normalizer items flipped to [x]. Ecosystem
  bridges listed but unchecked with a note marking them deferred.
  New "Quality and Infrastructure (ongoing)" section covers mutation
  testing, property-based tests, PHPStan level bump, Psalm, PhpBench,
  CONTRIBUTING.md and related cross-release work.
- UPGRADE: new v3.2 -> v3.3 section covering the additions.
- CHANGELOG: v3.3.0 entry with Added / Changed sections.

Tests: 60 tests / 472 assertions (up from 42 / 445 in v3.2). New tests:
- toArray round-trips to legacy parse() output exactly (assertSame)
- toJson produces parseable JSON with ParseErrorCode as backing string
- Stringable returns simpleAddress when valid, '' when invalid
- canonical() for six forms: addr-spec only, with atext name, stripping
  unnecessary name quotes, keeping required name quotes, quoted
  local-part, invalid -> empty string
- Local-part normalizer: Gmail-style rewrite, domain gating, not
  invoked on invalid inputs, null-clearing via withLocalPartNormalizer
codecov bot commented Apr 13, 2026

Codecov Report

❌ Patch coverage is 98.61111% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 90.02%. Comparing base (77d4d20) to head (f37b437).
⚠️ Report is 1 commit behind head on master.

Files with missing lines | Patch % | Lines
src/Parse.php            | 87.50%  | 1 Missing ⚠️

@@             Coverage Diff              @@
##             master      #52      +/-   ##
============================================
+ Coverage     89.34%   90.02%   +0.67%     
- Complexity      357      380      +23     
============================================
  Files             6        6              
  Lines           910      982      +72     
============================================
+ Hits            813      884      +71     
- Misses           97       98       +1     
Files with missing lines   | Coverage Δ
src/ParseOptions.php       | 98.93% <100.00%> (+0.04%) ⬆️
src/ParseResult.php        | 100.00% <100.00%> (ø)
src/ParsedEmailAddress.php | 100.00% <100.00%> (ø)
src/Parse.php              | 85.98% <87.50%> (+0.01%) ⬆️

@mmucklo mmucklo merged commit 084b5a3 into master Apr 13, 2026
10 checks passed
@mmucklo mmucklo deleted the feature/v3.3-polish branch April 13, 2026 05:18