v3.3: serialization, canonical form, local-part normalizer #52
Merged
All v3.3 roadmap items (ecosystem bridges deferred by user direction). Fully additive: no breaking changes for v3.2 callers.

Serialization on value objects:
- ParsedEmailAddress::toArray(): round-trips to the legacy parse() array shape, field order matching the parser output. Useful when mixing typed and array-based code.
- ParsedEmailAddress::toJson(int $flags = 0): json_encode wrapper with JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES always set; additional flags (e.g. JSON_PRETTY_PRINT) passed through. ParseErrorCode is a BackedEnum so it serializes to its backing string value automatically.
- ParseResult::toArray() and ParseResult::toJson(): same pattern; each address in the batch is serialized via ParsedEmailAddress::toArray().

Stringable:
- ParsedEmailAddress now implements \Stringable. (string) $parsed returns simpleAddress for valid addresses, empty string otherwise. Lets a parsed address drop directly into string contexts (logging, templates).

Canonical RFC 5322 display form:
- ParsedEmailAddress::canonical(): returns the minimal-quoting canonical form per RFC 5322 §3.2.4 (local-part) and §3.2.5 (phrase). Drops unnecessary quotes that the input may have carried (e.g. '"John Doe" <a@b>' -> 'John Doe <a@b>') and adds quotes only where required (e.g. '"John Q. Public" <a@b>' stays quoted due to the non-atext period). Returns empty string for invalid addresses.
- Helpers isAtextDotAtom() and isPhraseAtoms() inspect the content against the ABNF character classes.

Local-part normalizer callback:
- New ParseOptions property $localPartNormalizer (readonly ?\Closure) and withLocalPartNormalizer(?callable) fluent builder. Any callable is accepted and wrapped via Closure::fromCallable for uniform storage.
- The callback fn(string $localPart, string $domain): string is invoked after local-part validation succeeds; its return value replaces local_part_parsed in the output (and the quoted display form is re-derived). originalAddress still preserves the verbatim input for audit/logging.
- Typical uses: Gmail dot-insensitivity and +tag stripping, or any domain-specific canonicalization. The callback is gated behind the validation success check so it only sees addresses that conform to the configured ParseOptions rules.
- cloneWith() handles localPartNormalizer specially: the $get() closure uses ?? which would treat explicit null as "fall back", so an explicit array_key_exists() check is used instead to support clearing.

Docs:
- README: switch the nine parse($email, false) call sites in examples to parseSingle($email); keep one under "Other Examples" to document the legacy array shape with a note pointing new code at the typed API. Add serialization examples in Basic Usage. Add localPartNormalizer to the rule-properties table.
- ROADMAP: v3.3 section expanded with checkbox items; serialization, canonicalization, and normalizer items flipped to [x]. Ecosystem bridges listed but unchecked with a note marking them deferred. New "Quality and Infrastructure (ongoing)" section covers mutation testing, property-based tests, PHPStan level bump, Psalm, PhpBench, CONTRIBUTING.md and related cross-release work.
- UPGRADE: new v3.2 -> v3.3 section covering the additions.
- CHANGELOG: v3.3.0 entry with Added / Changed sections.

Tests: 60 tests / 472 assertions (up from 42 / 445 in v3.2). New tests:
- toArray round-trips to legacy parse() output exactly (assertSame)
- toJson produces parseable JSON with ParseErrorCode as backing string
- Stringable returns simpleAddress when valid, '' when invalid
- canonical() for six forms: addr-spec only, with atext name, stripping unnecessary name quotes, keeping required name quotes, quoted local-part, invalid -> empty string
- Local-part normalizer: Gmail-style rewrite, domain gating, not invoked on invalid inputs, null-clearing via withLocalPartNormalizer
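The minimal-quoting rule that the commit message describes for canonical() can be sketched in standalone PHP. This is an illustration only, not the library's implementation: the function names (sketchIsDotAtom, sketchIsPhraseAtoms, sketchQuote, sketchCanonical) are hypothetical, the inputs are assumed to be already-unquoted parsed values, and the domain is passed through untouched.

```php
<?php
declare(strict_types=1);

// RFC 5322 atext characters, written for use inside a regex character class.
const SKETCH_ATEXT = 'A-Za-z0-9!#$%&\'*+\/=?^_`{|}~-';

/** True if $s is a dot-atom: runs of atext separated by single dots (RFC 5322 §3.2.3). */
function sketchIsDotAtom(string $s): bool
{
    return preg_match('/\A[' . SKETCH_ATEXT . ']+(?:\.[' . SKETCH_ATEXT . ']+)*\z/', $s) === 1;
}

/** True if $s is a sequence of atoms separated by single spaces, so it needs no quotes. */
function sketchIsPhraseAtoms(string $s): bool
{
    return preg_match('/\A[' . SKETCH_ATEXT . ']+(?: [' . SKETCH_ATEXT . ']+)*\z/', $s) === 1;
}

/** Wrap $s as a quoted-string, escaping the double quote and the backslash. */
function sketchQuote(string $s): string
{
    return '"' . addcslashes($s, '"\\') . '"';
}

/** Build a display form, quoting the name and local-part only where required. */
function sketchCanonical(string $name, string $localPart, string $domain): string
{
    $local = sketchIsDotAtom($localPart) ? $localPart : sketchQuote($localPart);
    $addrSpec = $local . '@' . $domain;
    if ($name === '') {
        return $addrSpec; // addr-spec only, no angle brackets
    }
    $phrase = sketchIsPhraseAtoms($name) ? $name : sketchQuote($name);

    return $phrase . ' <' . $addrSpec . '>';
}

echo sketchCanonical('John Doe', 'a', 'b'), "\n";       // John Doe <a@b>
echo sketchCanonical('John Q. Public', 'a', 'b'), "\n"; // "John Q. Public" <a@b>
echo sketchCanonical('', 'john doe', 'b'), "\n";        // "john doe"@b
```

The period in "Q." is not an atext character, which is why the second name keeps its quotes while the first loses them.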
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #52 +/- ##
============================================
+ Coverage 89.34% 90.02% +0.67%
- Complexity 357 380 +23
============================================
Files 6 6
Lines 910 982 +72
============================================
+ Hits 813 884 +71
- Misses 97 98 +1
Summary
Delivers the v3.3 roadmap (ecosystem bridges deferred per direction). Fully additive: no breaking changes and no behavior changes for v3.2 callers. Also updates README examples to use the typed parseSingle() API throughout.
What's new
Serialization on value objects
- ParsedEmailAddress::toArray(): round-trips to the legacy parse() array shape exactly (assertable via assertSame).
- ParsedEmailAddress::toJson(int $flags = 0): wraps json_encode with JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES. ParseErrorCode serializes to its backing string value.
- ParseResult::toArray() + toJson(): same pattern for the multi-address container.
Stringable
ParsedEmailAddress now implements \Stringable. (string) $parsed returns simpleAddress for valid addresses, empty string otherwise.
Canonical RFC 5322 display form
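ParsedEmailAddress::canonical() applies minimal quoting per §3.2.4 (local-part) and §3.2.5 (phrase). It drops unnecessary quotes from the input and adds them only when required: '"John Doe" <a@b>' becomes 'John Doe <a@b>', while '"John Q. Public" <a@b>' stays quoted because of the non-atext period. Invalid addresses yield an empty string.
Local-part normalizer callback
New opt-in hook on ParseOptions: the $localPartNormalizer property, set through withLocalPartNormalizer(?callable), takes a callback of the form fn(string $localPart, string $domain): string. The callback runs after local-part validation succeeds; it never sees invalid input. Returning the input unchanged is a safe no-op. Gated behind a property, so normalization is opt-in per parser instance.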
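A normalizer of the documented shape fn(string $localPart, string $domain): string might look like the sketch below. The variable name $gmailNormalizer, the choice to gate on gmail.com alone, and the guard against an empty result are assumptions made for illustration; this is not code from the PR.

```php
<?php
declare(strict_types=1);

// Hypothetical Gmail-style normalizer: strips a +tag suffix and removes dots,
// but only for the gmail.com domain (which domains to gate on is an assumption).
$gmailNormalizer = static function (string $localPart, string $domain): string {
    // Domain gating: every other domain passes through unchanged.
    if (strtolower($domain) !== 'gmail.com') {
        return $localPart;
    }

    // Keep everything before the first '+', then drop the dots.
    $base = str_replace('.', '', explode('+', $localPart, 2)[0]);

    // Never return an empty local-part; fall back to the input instead.
    return $base === '' ? $localPart : $base;
};

echo $gmailNormalizer('john.doe+news', 'gmail.com'), "\n";   // johndoe
echo $gmailNormalizer('john.doe+news', 'example.com'), "\n"; // john.doe+news
```

Such a closure would be handed to ParseOptions through withLocalPartNormalizer(); because the hook only runs after validation succeeds, the callback itself does not need to re-check the local-part syntax.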
Docs
- README: parse($email, false) call sites rewritten to use parseSingle() / typed property access. One legacy example preserved under "Other Examples" to document the array shape, with a note pointing new code at the typed API.
- README: localPartNormalizer added to the rule-properties table.
- CHANGELOG: [3.3.0] entry.
- ROADMAP: serialization, canonicalization, and normalizer items flipped to [x]. Ecosystem bridges (Symfony, Laravel, PSR-14) marked deferred.
Test plan
- composer ci passes (cs:check, PHPStan with --memory-limit=512M, 60 tests / 472 assertions)
- toArray() round-trip verified with assertSame against the legacy parse() output
- toJson() decodes cleanly; ParseErrorCode serializes to its backing string ('multiple_opening_angle', etc.)
- Stringable returns simpleAddress for valid addresses, empty string for invalid
- canonical() covered: addr-spec only, atext name (no quotes), unnecessary-quotes stripping, required-quotes retention for periods, quoted local-part with space, empty result for invalid
- Local-part normalizer covered: Gmail-style rewrite (dot-insensitivity, +tag), domain gating (pass-through for non-gmail), not invoked on invalid addresses, withLocalPartNormalizer(null) clears a previously-set callback
What's left (not in this PR)
Deferred ecosystem bridges (Symfony Constraint, Laravel rule, PSR-14 events) — listed but unchecked in ROADMAP; they'd be separate packages anyway.
v4.0 now trimmed to just the genuinely breaking items: DNS/MX validation callback and RFC 6854 group syntax.