Skip to content

Conversation

@kaiming-cheng
Copy link
Contributor

Summary

Introduces searching/ module with interfaces for optimization history tracking and search strategy.

Structure

opt_worker_component/searching/
-- history/ # AttemptRecord + AttemptStore (keep track of optimization)
-- sampling/ # Sampler protocol (parent/inspiration selection)
-- mutation/ # Mutator protocol (evolve prompt)
-- strategy/ # Strategy protocol (search strategy)

Design Decisions

  • Protocol-first: Interfaces defined as Protocols with simple implementations for MVP
  • Integration-aware: Support Optimization Orchestrator and end-to-end Optimization testing
  • Future PR will include detailed implementation for each protocol

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Jan 30, 2026

# Run pytest with the correct PYTHONPATH
env = {"PYTHONPATH": str(SEARCHING_DIR)}
env.update({k: v for k, v in __import__("os").environ.items()})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

env.update overwrites your PYTHONPATH if the user already had one set, which defeats “Run pytest with the correct PYTHONPATH”.

Suggestion: merge instead of overwrite, e.g. prepend SEARCHING_DIR to existing PYTHONPATH.

...


class SimpleMutator:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JsonAttemptStore.get_recent() returns newest first (reversed(self._attempts[-n:])).

SimpleMutator.build_prompt() prints history using history[-3:].

If callers pass store.get_recent(...) directly into the mutator, then history[-3:] will actually take the oldest of the “recent” slice (because it’s already newest-first). That’s subtle and will confuse prompt context.

Suggestion: either define a clear convention (history always oldest→newest), or have SimpleMutator treat history as newest-first and use history[:3], or sort by created_at inside the mutator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Update it to ordered as oldest-first

self._load()

def _load(self) -> None:
"""Load attempts from JSON file if it exists."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_load() assumes JSON is valid; a partially written file (crash mid-write) will blow up the whole initialization.

Suggestion: wrap JSON parsing with a small try/except and either:
fall back to empty store (and maybe warn), or
write to a temp file + atomic rename in _save().

kernel_code: str
time_ms: float
outcome: Outcome
created_at: datetime = field(default_factory=datetime.now)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AttemptRecord.created_at uses datetime.now() (local time, naive).
If these records ever move across machines/timezones, sorting becomes tricky.

Suggestion: use UTC (datetime.now(timezone.utc)) or store epoch seconds.


"""Searching infrastructure for Optimization Kernel."""

__all__ = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__all__ = [] in the package root is unusual if you want consumers to import “searching” components.

Suggestion: either remove all entirely, or export the public surface (e.g., AttemptRecord/AttemptStore/Mutator/Sampler/Strategy).

def __init__(self, store: AttemptStore) -> None:
self.store = store

def next_parent(self) -> AttemptRecord | None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strategy.next_parent() selects the parent.
Sampler.sample_parent() also selects the parent.

That’s duplicative and will confuse future integrators: “Do I use Strategy or Sampler to pick the parent?” Suggestion: Strategy owns control flow + uses a Sampler internally. Then Strategy has no next_parent(); instead it exposes run_round(...) or similar, or get_sampler() / sampler property.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a simple fix to let SimpleStrategy delegates next_parent()

self,
n: int,
) -> list[AttemptRecord]:
"""Sample diverse inspirations for few-shot prompting."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SimpleSampler.sample_inspirations() returns:
return self.store.get_top_k(n)
That’s not “diverse”, it’s “best-performing”.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In future we could add more support to diverse inspiration; renaming this to get_top_inspirations to align with the functionality

self,
n: int,
) -> list[AttemptRecord]:
return self.store.get_top_k(n)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ensure AttemptStore.get_top_k is deterministic on ties (e.g., break ties by created_at or id). If that’s already true, document it in the store.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update to use (time_ms, created_at) for deterministic tie-breaking

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants