Skip to content

feat(dist-git): Construct dist-git with lock file history#121

Draft
Tonisal-byte wants to merge 2 commits intomicrosoft:mainfrom
Tonisal-byte:asalinas/dist-git-2.0
Draft

feat(dist-git): Construct dist-git with lock file history#121
Tonisal-byte wants to merge 2 commits intomicrosoft:mainfrom
Tonisal-byte:asalinas/dist-git-2.0

Conversation

@Tonisal-byte
Copy link
Copy Markdown
Contributor

Replaces the Affects: commit trailer approach for synthetic dist-git history with lock file fingerprint change detection. Instead of scanning commit messages for Affects: <component> lines, synthetic commits are now derived from changes to a component's input-fingerprint field in its lock file (specs///.lock).

Copilot AI review requested due to automatic review settings April 24, 2026 22:55
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR changes synthetic dist-git history generation to derive synthetic commits from project lock file history (based on input-fingerprint changes) rather than scanning project commit messages for Affects: <component> trailers.

Changes:

  • Add a git.RunInDir helper to execute git -C <dir> ... via the repo command factory.
  • Rework synthistory to discover FingerprintChange entries by reading lock files from project git history and to rebuild dist-git history with interleaved upstream/synthetic commits.
  • Update source preparation to pass a CmdFactory for git-based fingerprint detection and adjust call sites accordingly.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
internal/utils/git/git.go Adds RunInDir helper for running git commands in a specific directory.
internal/app/azldev/core/sources/synthistory.go Implements lock-file fingerprint change detection and interleaved dist-git history replay.
internal/app/azldev/core/sources/sourceprep.go Switches synthetic history support to require a CmdFactory for git operations.
internal/app/azldev/core/sources/synthistory_test.go Replaces Affects-trailer tests with tests for lock-path, interleaving, orphan handling, and metadata parsing.
internal/app/azldev/cmds/component/render.go Updates WithGitRepo call to pass env as the command factory.
internal/app/azldev/cmds/component/preparesources.go Updates WithGitRepo call to pass env as the command factory.
internal/app/azldev/cmds/component/build.go Updates WithGitRepo call to pass env as the command factory.

Comment on lines +601 to +607
// that touched the given file path in the repository at repoDir. Each commit's
// metadata is separated by a NUL byte in the git log output.
func gitLogFileMetadata(
ctx context.Context, cmdFactory opctx.CmdFactory, repoDir, filePath string,
) ([]CommitMetadata, error) {
output, err := git.RunInDir(ctx, cmdFactory, repoDir,
"log", "--format=%H%n%an%n%ae%n%at%n%s%x00", "--follow", "--", filePath)
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gitLogFileMetadata uses 'git log --follow -- ' (tracking renames), but gitShowLockFile always attempts to read ':'. If the file was renamed, the log can include commits where the lock file path differs, and gitShowLockFile will fail and be skipped. Either drop '--follow' (if renames aren't supported) or make the show step rename-aware so history is not partially lost.

Suggested change
// that touched the given file path in the repository at repoDir. Each commit's
// metadata is separated by a NUL byte in the git log output.
func gitLogFileMetadata(
ctx context.Context, cmdFactory opctx.CmdFactory, repoDir, filePath string,
) ([]CommitMetadata, error) {
output, err := git.RunInDir(ctx, cmdFactory, repoDir,
"log", "--format=%H%n%an%n%ae%n%at%n%s%x00", "--follow", "--", filePath)
// that touched the given file path in the repository at repoDir, using the
// path exactly as provided and without following renames. Each commit's
// metadata is separated by a NUL byte in the git log output.
func gitLogFileMetadata(
ctx context.Context, cmdFactory opctx.CmdFactory, repoDir, filePath string,
) ([]CommitMetadata, error) {
output, err := git.RunInDir(ctx, cmdFactory, repoDir,
"log", "--format=%H%n%an%n%ae%n%at%n%s%x00", "--", filePath)

Copilot uses AI. Check for mistakes.
Comment on lines 60 to 74
// WithGitRepo returns a [PreparerOption] that enables dist-git repository
// creation during source preparation. When set, the upstream .git directory
// is preserved and synthetic commit history is generated on top of it. This
// requires the project configuration to reside inside a git repository.
// Without this option, no dist-git is created and synthetic history is skipped.
//
// The defaultAuthorEmail is used for synthetic changelog entries and commits
// when no author email is available from git history.
func WithGitRepo(defaultAuthorEmail string) PreparerOption {
// The cmdFactory is used to shell out to git for fingerprint change detection.
func WithGitRepo(
cmdFactory opctx.CmdFactory,
) PreparerOption {
return func(p *sourcePreparerImpl) {
p.withGitRepo = true
p.defaultAuthorEmail = defaultAuthorEmail
p.cmdFactory = cmdFactory
}
}
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WithGitRepo enables synthetic history generation but does not validate that cmdFactory is non-nil. If a caller accidentally passes nil, later code will panic when attempting to run git commands. Consider validating this in WithGitRepo (e.g., ignore the option) or in NewPreparer (return an error if withGitRepo is set but cmdFactory is nil).

Copilot uses AI. Check for mistakes.
Comment thread internal/utils/git/git.go
Comment on lines +227 to +249
// RunInDir executes a git command in the given directory and returns its
// trimmed stdout output. The dir argument is passed via 'git -C dir'.
func RunInDir(
ctx context.Context, cmdFactory opctx.CmdFactory, dir string, args ...string,
) (string, error) {
var stderr bytes.Buffer

fullArgs := make([]string, 0, len(args)+2)
fullArgs = append(fullArgs, "-C", dir)
fullArgs = append(fullArgs, args...)

rawCmd := exec.CommandContext(ctx, "git", fullArgs...)
rawCmd.Stderr = &stderr

cmd, err := cmdFactory.Command(rawCmd)
if err != nil {
return "", fmt.Errorf("failed to create git command:\n%w", err)
}

output, err := cmd.RunAndGetOutput(ctx)
if err != nil {
return "", fmt.Errorf("git %s failed:\n%v\n%w", args[0], stderr.String(), err)
}
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RunInDir formats the failure message using args[0], which will panic if called with no args (even though current call sites pass at least one). Please add an explicit guard for len(args)==0 and produce a clearer error (e.g., include the full arg list) rather than indexing blindly.

Copilot uses AI. Check for mistakes.
Comment on lines +52 to +59
// LockFilePath returns the relative path to a component's lock file within the
// project repository. The path follows the same letter-prefix convention used by
// [components.RenderedSpecDir]: specs/<letter>/<name>/<name>.lock.
func LockFilePath(componentName string) string {
prefix := strings.ToLower(componentName[:1])

return filepath.Join("specs", prefix, componentName, componentName+".lock")
}
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LockFilePath slices componentName[:1] and uses componentName directly in a path without any validation. This can panic on empty component names and also allows unsafe names (path separators/traversal) to influence the git path. Consider validating with fileutils.ValidateFilename (similar to components.RenderedSpecDir) and returning an error if invalid.

Copilot uses AI. Check for mistakes.
Comment on lines +152 to +167
func CommitInterleavedHistory(
repo *gogit.Repository,
commits []CommitMetadata,
changes []FingerprintChange,
importCommit string,
) error {
// The latest fingerprint change's UpstreamCommit is the commit we're
// pinned to — use it as the upper bound for the upstream walk instead
// of HEAD, which may be ahead (e.g., at the branch tip).
upstreamCommit := changes[len(changes)-1].UpstreamCommit

// Collect upstream commits BEFORE staging, so the temporary commit
// created by stageAndCaptureOverlayTree is not included.
upstreamCommits, err := collectUpstreamCommits(repo, importCommit, upstreamCommit)
if err != nil {
return err
}
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CommitInterleavedHistory indexes changes[len(changes)-1] without checking len(changes)>0. Since this is an exported helper, it should defensively return an error on empty input rather than panicking (callers may not always pre-check).

Copilot uses AI. Check for mistakes.
Comment on lines +181 to +207
// stageAndCaptureOverlayTree stages all working tree changes and creates a
// temporary commit to capture the resulting tree hash. The tree hash is used
// later to set the content of the final synthetic commit.
func stageAndCaptureOverlayTree(repo *gogit.Repository) (plumbing.Hash, error) {
worktree, err := repo.Worktree()
if err != nil {
return fmt.Errorf("failed to get worktree:\n%w", err)
return plumbing.ZeroHash, fmt.Errorf("failed to get worktree:\n%w", err)
}

// Stage all working tree changes once — overlays have already been applied.
if err := worktree.AddWithOptions(&gogit.AddOptions{All: true}); err != nil {
return fmt.Errorf("failed to stage changes:\n%w", err)
return plumbing.ZeroHash, fmt.Errorf("failed to stage changes:\n%w", err)
}

tempHash, err := worktree.Commit("temp: capture overlay tree", &gogit.CommitOptions{
AllowEmptyCommits: true,
Author: &object.Signature{Name: "azldev", When: time.Unix(0, 0).UTC()},
})
if err != nil {
return plumbing.ZeroHash, fmt.Errorf("failed to create temporary commit:\n%w", err)
}

tempCommit, err := repo.CommitObject(tempHash)
if err != nil {
return plumbing.ZeroHash, fmt.Errorf("failed to read temporary commit:\n%w", err)
}

return tempCommit.TreeHash, nil
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stageAndCaptureOverlayTree creates a temporary commit, which advances HEAD. If any later step fails (e.g., replayInterleavedHistory returns an error), the repo will be left pointing at this temp commit and the caller gets a failure with a mutated repo state. Please preserve the original HEAD ref and restore it on error (or avoid creating a temp commit by deriving the tree hash directly).

Copilot uses AI. Check for mistakes.
Comment on lines +300 to +313
if entry.upstreamCommit != nil {
hash, err := createCommitObject(repo,
entry.upstreamCommit.TreeHash, lastHash,
entry.upstreamCommit.Author, entry.upstreamCommit.Committer,
entry.upstreamCommit.Message)
if err != nil {
return fmt.Errorf("failed to replay upstream commit:\n%w", err)
}

lastHash = hash
lastTreeHash = entry.upstreamCommit.TreeHash

continue
}
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replayInterleavedHistory rewrites upstream commits by chaining them with a single parent (lastHash). This implicitly assumes upstream history is strictly linear; merge commits (multiple parents) would be flattened/incorrectly replayed. Please either explicitly detect and error on merges, or extend the replay logic to preserve multiple parents.

Copilot uses AI. Check for mistakes.
Comment on lines +98 to 105
for _, meta := range metas {
fields, err := gitShowLockFile(ctx, cmdFactory, projectRepoDir, meta.Hash, lockFileRelPath)
if err != nil {
slog.Warn("Failed to read lock file at commit; skipping",
"commit", meta.Hash, "error", err)

continue
}
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FindFingerprintChanges currently logs and skips any error reading/parsing the lock file at a commit. This can silently drop real failures (e.g., corrupt TOML, unexpected git errors) and produce an incomplete synthetic history. Consider only skipping the specific "file not present at that commit" case and returning an error for parse/command failures.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants