Make deep backfill's Phase C (orphan-contact discovery) opt-in #21

Open
DiscoBard wants to merge 1 commit into MaxGhenis:main from DiscoBard:feat/opt-in-orphan-contact-discovery

Summary

DeepBackfill runs three phases:

  • Phase A: paginate INBOX/ARCHIVE/SPAM_BLOCKED, store conversations
  • Phase B: for each discovered conversation, paginate messages
  • Phase C: discoverFromContacts calls GetOrCreateConversation for every contact's phone number to surface threads not visible from folder listings

Phases A and B are pure history reads. Phase C has a write-shaped side effect: Google Messages treats GetOrCreateConversation as a thread-creation, so for each contact without a prior thread, an empty SMS thread is created on the user's phone. A user with several dozen contacts who runs deep backfill ends up with several dozen blank conversations they have to delete by hand.

This PR keeps Phase C available but makes it opt-in via a new env var:

OPENMESSAGES_BACKFILL_DISCOVER_ORPHANS=1

Default behavior changes from "Phase C runs on every deep backfill" to "Phase C runs only when explicitly enabled." Users who want the orphan-discovery behavior get it back with one env var; users running deep backfill purely as a history sync no longer get unintended new threads.
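The gating described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `phase` and `runDeepBackfill` are hypothetical names, the early return on a phase error is an assumption, and the log wording is invented for the example.

```go
package main

import "log"

// phase is a hypothetical stand-in for one backfill phase.
type phase func() error

// runDeepBackfill is an illustrative sketch, not the project's deepBackfill.
// Phases A and B always run; Phase C runs only when discoverOrphans is true.
func runDeepBackfill(phaseA, phaseB, phaseC phase, discoverOrphans bool) error {
	// Phase A: folder pagination (history read only).
	if err := phaseA(); err != nil {
		return err
	}
	// Phase B: per-conversation message pagination (history read only).
	if err := phaseB(); err != nil {
		return err
	}
	if !discoverOrphans {
		// Off path: point the user at the opt-in variable. Wording is assumed.
		log.Println("skipping orphan-contact discovery; set OPENMESSAGES_BACKFILL_DISCOVER_ORPHANS=1 to enable")
		return nil
	}
	// Phase C: orphan-contact discovery, which may create empty threads.
	return phaseC()
}
```

Passing the flag in as a parameter keeps the write-shaped phase easy to exercise in tests without touching the process environment.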

Implementation

  • internal/app/backfill.go — add orphanContactDiscoveryEnabled() helper (parses 1/true/yes/on, case- and whitespace-tolerant). Gate the Phase C call in deepBackfill() on this. Off-path logs an informational line referencing the env var.
  • internal/app/backfill_test.go:
    • TestOrphanContactDiscoveryEnabled — table-driven coverage of the parser (truthy, falsy, mixed case, whitespace, unrecognized values)
    • TestDeepBackfillSkipsPhaseCWhenOptOut — asserts no contact-discovery side effects on the default-off path (no contacts checked, no orphan conversations stored)
    • TestDeepBackfillContactDiscovery, TestDeepBackfillContactDiscoverySkipsAlreadySeen, TestDeepBackfillGetOrCreateError — opt in via t.Setenv since they specifically exercise Phase C
  • README.md — document the new env var with a clear warning about the empty-thread side effect
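For readers who want a picture of the parser, a minimal sketch of a helper with the behavior listed above (1/true/yes/on, case- and whitespace-tolerant) might look like this. Only the name orphanContactDiscoveryEnabled and the env var come from this PR; the body, the `isTruthy` helper, and the `discoverOrphansEnv` constant are assumptions made for illustration.

```go
package main

import (
	"os"
	"strings"
)

// discoverOrphansEnv is a hypothetical constant name for the env var in this PR.
const discoverOrphansEnv = "OPENMESSAGES_BACKFILL_DISCOVER_ORPHANS"

// isTruthy (hypothetical helper) reports whether v is one of 1/true/yes/on,
// ignoring letter case and surrounding whitespace. Anything else, including
// the empty string, counts as false.
func isTruthy(v string) bool {
	switch strings.ToLower(strings.TrimSpace(v)) {
	case "1", "true", "yes", "on":
		return true
	default:
		return false
	}
}

// orphanContactDiscoveryEnabled sketches the gate: Phase C is enabled only
// when the env var holds a recognized truthy value.
func orphanContactDiscoveryEnabled() bool {
	return isTruthy(os.Getenv(discoverOrphansEnv))
}
```

Treating unrecognized values as false keeps the default safe: a typo in the variable leaves Phase C off rather than creating threads unexpectedly.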

Test plan

  • go build ./... — clean
  • go test ./internal/app/ ./internal/client/ — passes (existing + new helper test + new opt-out regression test)
  • Manual: run with the env var unset, confirm the informational Phase C skip log line appears and no new threads are created on the phone after deep backfill
  • Manual: run with OPENMESSAGES_BACKFILL_DISCOVER_ORPHANS=1, confirm previous behavior is preserved

Compat

No API change. Default behavior changes; users relying on Phase C running automatically will need to set the env var. The off-path log line tells them exactly what to do.

🤖 Generated with Claude Code

Deep backfill's Phase C iterates the user's contacts and calls
GetOrCreateConversation for every phone number. Google Messages treats
that call as a thread-creation: for each contact lacking a prior
thread, an empty SMS thread is created on the user's phone.

For users running deep backfill primarily to sync existing history,
that side effect is unwanted -- it produces dozens of new "blank"
threads on the device with no way to undo individually.

Make Phase C opt-in via a new env var:

    OPENMESSAGES_BACKFILL_DISCOVER_ORPHANS=1

When unset (default), deepBackfill runs Phases A (folder pagination)
and B (per-conversation messages) only and logs an informational line
explaining the gate. When set to a truthy value, behavior is unchanged
from the current default.

- internal/app/backfill.go: add orphanContactDiscoveryEnabled() helper;
  gate Phase C call with informational log line on the off path
- internal/app/backfill_test.go: add TestOrphanContactDiscoveryEnabled
  with truthy/falsy parsing cases; add
  TestDeepBackfillSkipsPhaseCWhenOptOut to assert the default-off path
  produces no contact-discovery side effects; opt the existing Phase C
  tests in via t.Setenv
- README.md: document the new env var with the side-effect warning

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>