feat(dist-git): Construct dist-git with lock file history by Tonisal-byte · Pull Request #121 · microsoft/azure-linux-dev-tools

Tonisal-byte · 2026-04-24T22:55:55Z

Replaces the Affects: commit trailer approach for synthetic dist-git history with lock file fingerprint change detection. Instead of scanning commit messages for Affects: <component> lines, synthetic commits are now derived from changes to a component's input-fingerprint field in its lock file (specs///.lock).

Copilot

Pull request overview

This PR changes synthetic dist-git history generation to derive synthetic commits from project lock file history (based on input-fingerprint changes) rather than scanning project commit messages for Affects: <component> trailers.

Changes:

Add a git.RunInDir helper to execute git -C <dir> ... via the repo command factory.
Rework synthistory to discover FingerprintChange entries by reading lock files from project git history and to rebuild dist-git history with interleaved upstream/synthetic commits.
Update source preparation to pass a CmdFactory for git-based fingerprint detection and adjust call sites accordingly.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
internal/utils/git/git.go	Adds `RunInDir` helper for running git commands in a specific directory.
internal/app/azldev/core/sources/synthistory.go	Implements lock-file fingerprint change detection and interleaved dist-git history replay.
internal/app/azldev/core/sources/sourceprep.go	Switches synthetic history support to require a `CmdFactory` for git operations.
internal/app/azldev/core/sources/synthistory_test.go	Replaces Affects-trailer tests with tests for lock-path, interleaving, orphan handling, and metadata parsing.
internal/app/azldev/cmds/component/render.go	Updates `WithGitRepo` call to pass `env` as the command factory.
internal/app/azldev/cmds/component/preparesources.go	Updates `WithGitRepo` call to pass `env` as the command factory.
internal/app/azldev/cmds/component/build.go	Updates `WithGitRepo` call to pass `env` as the command factory.

Copilot · 2026-04-24T23:03:43Z

+// that touched the given file path in the repository at repoDir. Each commit's
+// metadata is separated by a NUL byte in the git log output.
+func gitLogFileMetadata(
+	ctx context.Context, cmdFactory opctx.CmdFactory, repoDir, filePath string,
+) ([]CommitMetadata, error) {
+	output, err := git.RunInDir(ctx, cmdFactory, repoDir,
+		"log", "--format=%H%n%an%n%ae%n%at%n%s%x00", "--follow", "--", filePath)


gitLogFileMetadata uses 'git log --follow -- ' (tracking renames), but gitShowLockFile always attempts to read ':'. If the file was renamed, the log can include commits where the lock file path differs, and gitShowLockFile will fail and be skipped. Either drop '--follow' (if renames aren't supported) or make the show step rename-aware so history is not partially lost.

Suggested change

// that touched the given file path in the repository at repoDir. Each commit's

// metadata is separated by a NUL byte in the git log output.

func gitLogFileMetadata(

ctx context.Context, cmdFactory opctx.CmdFactory, repoDir, filePath string,

) ([]CommitMetadata, error) {

output, err := git.RunInDir(ctx, cmdFactory, repoDir,

"log", "--format=%H%n%an%n%ae%n%at%n%s%x00", "--follow", "--", filePath)

// that touched the given file path in the repository at repoDir, using the

// path exactly as provided and without following renames. Each commit's

// metadata is separated by a NUL byte in the git log output.

func gitLogFileMetadata(

ctx context.Context, cmdFactory opctx.CmdFactory, repoDir, filePath string,

) ([]CommitMetadata, error) {

output, err := git.RunInDir(ctx, cmdFactory, repoDir,

"log", "--format=%H%n%an%n%ae%n%at%n%s%x00", "--", filePath)

Copilot · 2026-04-24T23:03:43Z

 // WithGitRepo returns a [PreparerOption] that enables dist-git repository
 // creation during source preparation. When set, the upstream .git directory
 // is preserved and synthetic commit history is generated on top of it. This
 // requires the project configuration to reside inside a git repository.
 // Without this option, no dist-git is created and synthetic history is skipped.
 //
-// The defaultAuthorEmail is used for synthetic changelog entries and commits
-// when no author email is available from git history.
-func WithGitRepo(defaultAuthorEmail string) PreparerOption {
+// The cmdFactory is used to shell out to git for fingerprint change detection.
+func WithGitRepo(
+	cmdFactory opctx.CmdFactory,
+) PreparerOption {
 	return func(p *sourcePreparerImpl) {
 		p.withGitRepo = true
-		p.defaultAuthorEmail = defaultAuthorEmail
+		p.cmdFactory = cmdFactory
 	}
 }


WithGitRepo enables synthetic history generation but does not validate that cmdFactory is non-nil. If a caller accidentally passes nil, later code will panic when attempting to run git commands. Consider validating this in WithGitRepo (e.g., ignore the option) or in NewPreparer (return an error if withGitRepo is set but cmdFactory is nil).

Copilot · 2026-04-24T23:03:44Z

+// RunInDir executes a git command in the given directory and returns its
+// trimmed stdout output. The dir argument is passed via 'git -C dir'.
+func RunInDir(
+	ctx context.Context, cmdFactory opctx.CmdFactory, dir string, args ...string,
+) (string, error) {
+	var stderr bytes.Buffer
+
+	fullArgs := make([]string, 0, len(args)+2)
+	fullArgs = append(fullArgs, "-C", dir)
+	fullArgs = append(fullArgs, args...)
+
+	rawCmd := exec.CommandContext(ctx, "git", fullArgs...)
+	rawCmd.Stderr = &stderr
+
+	cmd, err := cmdFactory.Command(rawCmd)
+	if err != nil {
+		return "", fmt.Errorf("failed to create git command:\n%w", err)
+	}
+
+	output, err := cmd.RunAndGetOutput(ctx)
+	if err != nil {
+		return "", fmt.Errorf("git %s failed:\n%v\n%w", args[0], stderr.String(), err)
+	}


RunInDir formats the failure message using args[0], which will panic if called with no args (even though current call sites pass at least one). Please add an explicit guard for len(args)==0 and produce a clearer error (e.g., include the full arg list) rather than indexing blindly.

Copilot · 2026-04-24T23:03:44Z

+// LockFilePath returns the relative path to a component's lock file within the
+// project repository. The path follows the same letter-prefix convention used by
+// [components.RenderedSpecDir]: specs/<letter>/<name>/<name>.lock.
+func LockFilePath(componentName string) string {
+	prefix := strings.ToLower(componentName[:1])
+
+	return filepath.Join("specs", prefix, componentName, componentName+".lock")
+}


LockFilePath slices componentName[:1] and uses componentName directly in a path without any validation. This can panic on empty component names and also allows unsafe names (path separators/traversal) to influence the git path. Consider validating with fileutils.ValidateFilename (similar to components.RenderedSpecDir) and returning an error if invalid.

Copilot · 2026-04-24T23:03:44Z

+func CommitInterleavedHistory(
 	repo *gogit.Repository,
-	commits []CommitMetadata,
+	changes []FingerprintChange,
+	importCommit string,
 ) error {
+	// The latest fingerprint change's UpstreamCommit is the commit we're
+	// pinned to — use it as the upper bound for the upstream walk instead
+	// of HEAD, which may be ahead (e.g., at the branch tip).
+	upstreamCommit := changes[len(changes)-1].UpstreamCommit
+
+	// Collect upstream commits BEFORE staging, so the temporary commit
+	// created by stageAndCaptureOverlayTree is not included.
+	upstreamCommits, err := collectUpstreamCommits(repo, importCommit, upstreamCommit)
+	if err != nil {
+		return err
+	}


CommitInterleavedHistory indexes changes[len(changes)-1] without checking len(changes)>0. Since this is an exported helper, it should defensively return an error on empty input rather than panicking (callers may not always pre-check).

Copilot · 2026-04-24T23:03:44Z

+// stageAndCaptureOverlayTree stages all working tree changes and creates a
+// temporary commit to capture the resulting tree hash. The tree hash is used
+// later to set the content of the final synthetic commit.
+func stageAndCaptureOverlayTree(repo *gogit.Repository) (plumbing.Hash, error) {
 	worktree, err := repo.Worktree()
 	if err != nil {
-		return fmt.Errorf("failed to get worktree:\n%w", err)
+		return plumbing.ZeroHash, fmt.Errorf("failed to get worktree:\n%w", err)
 	}

-	// Stage all working tree changes once — overlays have already been applied.
 	if err := worktree.AddWithOptions(&gogit.AddOptions{All: true}); err != nil {
-		return fmt.Errorf("failed to stage changes:\n%w", err)
+		return plumbing.ZeroHash, fmt.Errorf("failed to stage changes:\n%w", err)
+	}
+
+	tempHash, err := worktree.Commit("temp: capture overlay tree", &gogit.CommitOptions{
+		AllowEmptyCommits: true,
+		Author:            &object.Signature{Name: "azldev", When: time.Unix(0, 0).UTC()},
+	})
+	if err != nil {
+		return plumbing.ZeroHash, fmt.Errorf("failed to create temporary commit:\n%w", err)
+	}
+
+	tempCommit, err := repo.CommitObject(tempHash)
+	if err != nil {
+		return plumbing.ZeroHash, fmt.Errorf("failed to read temporary commit:\n%w", err)
+	}
+
+	return tempCommit.TreeHash, nil


stageAndCaptureOverlayTree creates a temporary commit, which advances HEAD. If any later step fails (e.g., replayInterleavedHistory returns an error), the repo will be left pointing at this temp commit and the caller gets a failure with a mutated repo state. Please preserve the original HEAD ref and restore it on error (or avoid creating a temp commit by deriving the tree hash directly).

Copilot · 2026-04-24T23:03:45Z

+		if entry.upstreamCommit != nil {
+			hash, err := createCommitObject(repo,
+				entry.upstreamCommit.TreeHash, lastHash,
+				entry.upstreamCommit.Author, entry.upstreamCommit.Committer,
+				entry.upstreamCommit.Message)
+			if err != nil {
+				return fmt.Errorf("failed to replay upstream commit:\n%w", err)
+			}
+
+			lastHash = hash
+			lastTreeHash = entry.upstreamCommit.TreeHash
+
+			continue
+		}


replayInterleavedHistory rewrites upstream commits by chaining them with a single parent (lastHash). This implicitly assumes upstream history is strictly linear; merge commits (multiple parents) would be flattened/incorrectly replayed. Please either explicitly detect and error on merges, or extend the replay logic to preserve multiple parents.

Copilot · 2026-04-24T23:03:45Z

+	for _, meta := range metas {
+		fields, err := gitShowLockFile(ctx, cmdFactory, projectRepoDir, meta.Hash, lockFileRelPath)
+		if err != nil {
+			slog.Warn("Failed to read lock file at commit; skipping",
+				"commit", meta.Hash, "error", err)
+
+			continue
 		}


FindFingerprintChanges currently logs and skips any error reading/parsing the lock file at a commit. This can silently drop real failures (e.g., corrupt TOML, unexpected git errors) and produce an incomplete synthetic history. Consider only skipping the specific "file not present at that commit" case and returning an error for parse/command failures.

Antonio Salinas added 2 commits April 24, 2026 22:51

feat: synth dist-git /w lock (test)

1ee4e65

fixes

d0b1ece

Copilot AI review requested due to automatic review settings April 24, 2026 22:55

Copilot started reviewing on behalf of Tonisal-byte April 24, 2026 22:56 View session

Copilot AI reviewed Apr 24, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(dist-git): Construct dist-git with lock file history#121

feat(dist-git): Construct dist-git with lock file history#121
Tonisal-byte wants to merge 2 commits intomicrosoft:mainfrom
Tonisal-byte:asalinas/dist-git-2.0

Tonisal-byte commented Apr 24, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 24, 2026

Uh oh!

Copilot AI Apr 24, 2026

Uh oh!

Copilot AI Apr 24, 2026

Uh oh!

Copilot AI Apr 24, 2026

Uh oh!

Copilot AI Apr 24, 2026

Uh oh!

Copilot AI Apr 24, 2026

Uh oh!

Copilot AI Apr 24, 2026

Uh oh!

Copilot AI Apr 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Tonisal-byte commented Apr 24, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants