From 0cfb09c44d196347f37005f0cc11f64ed496f4d6 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin
Date: Wed, 6 Nov 2024 20:34:50 +0100
Subject: [PATCH 001/218] sideband: mask control characters

The output of `git clone` is a vital component for understanding what
has happened when things go wrong. However, these logs are partially
under the control of the remote server (via the "sideband", which
typically contains what the remote `git pack-objects` process sends to
`stderr`), and are currently not sanitized by Git.

This makes Git susceptible to ANSI escape sequence injection (see
CWE-150, https://cwe.mitre.org/data/definitions/150.html), which allows
attackers to corrupt terminal state, to hide information, and even to
insert characters into the input buffer (i.e. as if the user had typed
those characters).

To plug this vulnerability, disallow any control character in the
sideband, replacing them instead with the common `^` (e.g. `^[` for
`\x1b`, `^A` for `\x01`).

There is likely a need for more fine-grained controls instead of using a
"heavy hammer" like this, which will be introduced subsequently.

Signed-off-by: Johannes Schindelin
---
 sideband.c                          | 17 +++++++++++++++--
 t/t5409-colorize-remote-messages.sh | 12 ++++++++++++
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/sideband.c b/sideband.c
index ea7c25211ef7e1..d2e6023e60e5ed 100644
--- a/sideband.c
+++ b/sideband.c
@@ -66,6 +66,19 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
 		list_config_item(list, prefix, keywords[i].keyword);
 }
 
+static void strbuf_add_sanitized(struct strbuf *dest, const char *src, int n)
+{
+	strbuf_grow(dest, n);
+	for (; n && *src; src++, n--) {
+		if (!iscntrl(*src) || *src == '\t' || *src == '\n')
+			strbuf_addch(dest, *src);
+		else {
+			strbuf_addch(dest, '^');
+			strbuf_addch(dest, 0x40 + *src);
+		}
+	}
+}
+
 /*
  * Optionally highlight one keyword in remote output if it appears at the start
  * of the line. This should be called for a single line only, which is
@@ -81,7 +94,7 @@ static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
 	int i;
 
 	if (!want_color_stderr(use_sideband_colors())) {
-		strbuf_add(dest, src, n);
+		strbuf_add_sanitized(dest, src, n);
 		return;
 	}
 
@@ -114,7 +127,7 @@ static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
 		}
 	}
 
-	strbuf_add(dest, src, n);
+	strbuf_add_sanitized(dest, src, n);
 }
 
 
diff --git a/t/t5409-colorize-remote-messages.sh b/t/t5409-colorize-remote-messages.sh
index fa5de4500a4f50..d0745c391b2625 100755
--- a/t/t5409-colorize-remote-messages.sh
+++ b/t/t5409-colorize-remote-messages.sh
@@ -98,4 +98,16 @@ test_expect_success 'fallback to color.ui' '
 	grep "error: error" decoded
 '
 
+test_expect_success 'disallow (color) control sequences in sideband' '
+	write_script .git/color-me-surprised <<-\EOF &&
+	printf "error: Have you \\033[31mread\\033[m this?\\n" >&2
+	exec "$@"
+	EOF
+	test_config_global uploadPack.packObjectshook ./color-me-surprised &&
+	test_commit need-at-least-one-commit &&
+	git clone --no-local . throw-away 2>stderr &&
+	test_decode_color <stderr >decoded &&
+	test_grep ! RED decoded
+'
+
 test_done

From 875ecc8fb0187c17ffc87c9bdfc8836e14b2a795 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin
Date: Sat, 29 Nov 2025 09:21:58 +0100
Subject: [PATCH 002/218] ci(dockerized): reduce the PID limit for private repositories

Every once in a while I need to verify that Microsoft Git's test suite
passes for changes that are not yet meant for public consumption, and
since it was (made) too difficult to keep up a working Azure Pipeline
definition, I have to use GitHub Actions in a private GitHub repository
for that purpose.

In these tests, basically all Dockerized CI jobs fail consistently. The
symptom is something like:

	error: cannot create async thread: Resource temporarily unavailable

in the middle of a test, typically in the t5xxx-t6xxx range.
The first such error is immediately followed by plenty more of these
errors, and not a single test succeeds afterwards.

At first, I thought that maybe the massive parallelism I enjoy there is
the problem, and I thought that the cgroups limits might be shared
between the many containers that run on essentially the same physical
machine. But even reducing the matrix to just a single one of those
Dockerized jobs runs into the very same problems.

The underlying reason seems to be a substantial difference in the
hosted runners that execute these Dockerized jobs: forcing the PID limit
of the container to a high number lets the jobs pass, even when running
the complete matrix of all 13 Dockerized jobs concurrently.

But that's not the only difference: The jobs seem to take a lot longer
in these containers than, say, in the containers made available to
https://github.com/git/git. When forcing a PID limit of 64k in that
private repository, the jobs completed successfully, but they also took
between 2x and 2.5x longer, i.e. painfully much longer. Reducing the PID
limit to 16k, the CI jobs still passed, but took an equally long amount
of time. Reducing the PID limit to 8k caused the errors to reappear.

Here are the numbers from three example runs, the first one forcing the
PID and nproc limit to 65536, the second one to 16384, the third run is
from the public git/git repository:

Job                           | 64k     | 16k     | reference
------------------------------|---------|---------|---------
almalinux-8                   | 19m 3s  | 16m 0s  | 9m 36s
debian-11                     | 20m 31s | 20m 3s  | 8m 5s
fedora-breaking-changes-meson | 16m 29s | 19m 19s | 9m 40s
linux-asan-ubsan              | 1h 10m  | 1h 11m  | 34m 36s
linux-breaking-changes        | 25m 39s | 25m 58s | 13m 15s
linux-leaks                   | 1h 9m   | 1h 10m  | 33m 30s
linux-meson                   | 28m 9s  | 27m 4s  | 13m 45s
linux-musl-meson              | 16m 32s | 13m 39s | 8m 6s
linux-reftable-leaks          | 1h 13m  | 1h 13m  | 34m 34s
linux-reftable                | 26m 2s  | 25m 48s | 13m 31s
linux-sha256                  | 26m 12s | 26m 3s  | 12m 36s
linux-TEST-vars               | 26m 5s  | 25m 21s | 13m 25s
linux32                       | 21m 16s | 19m 57s | 10m 44s

It does not look as if the PID limit is the reason for the longer
runtime, seeing as the 64k vs 16k timings deviate no more than as is
usual with GitHub workflows. So let's go for 16k.

Signed-off-by: Johannes Schindelin
---
 .github/workflows/main.yml | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 6f3d94e3a60cdd..96d19581129ec2 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -420,7 +420,9 @@ jobs:
       CI_JOB_IMAGE: ${{matrix.vector.image}}
       CUSTOM_PATH: /custom
     runs-on: ubuntu-latest
-    container: ${{matrix.vector.image}}
+    container:
+      image: ${{ matrix.vector.image }}
+      options: ${{ github.repository_visibility == 'private' && '--pids-limit 16384 --ulimit nproc=16384:16384 --ulimit nofile=32768:32768' || '' }}
     steps:
     - name: prepare libc6 for actions
       if: matrix.vector.jobname == 'linux32'

From c9ee3b4102a41351976a44f205abb2f6f359d460 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin
Date: Wed, 6 Nov 2024 21:07:51 +0100
Subject: [PATCH 003/218] sideband: introduce an "escape hatch" to allow control characters

The preceding commit fixed the vulnerability whereby sideband messages
(that are under the control of the remote server) could contain ANSI
escape sequences that would be sent to the terminal verbatim.

However, this fix may not be desirable under all circumstances, e.g.
when remote servers deliberately add coloring to their messages to
increase their urgency.

To help with those use cases, give users a way to opt out of the
protections: `sideband.allowControlCharacters`.
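
For illustration only (this is not the code in the patch), the masking
with an opt-out might be sketched in plain C as below. The name
`mask_control_chars()` and its `allow` flag are hypothetical stand-ins
for `strbuf_add_sanitized()` and the `sideband.allowControlCharacters`
setting, and the sketch writes into a caller-provided buffer instead of
a `struct strbuf`. It also assumes XOR-based caret notation, so that DEL
would be shown as `^?`.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of Git: copy `src` into `dst`, replacing
 * every control character except TAB and LF with caret notation, unless
 * `allow` is non-zero. `dst` must have room for 2 * strlen(src) + 1 bytes.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
size_t mask_control_chars(char *dst, const char *src, int allow)
{
	size_t len = 0;

	for (; *src; src++) {
		unsigned char c = (unsigned char)*src;

		if (allow || !iscntrl(c) || c == '\t' || c == '\n') {
			dst[len++] = (char)c;
		} else {
			dst[len++] = '^';
			/* e.g. 0x1b becomes '[', 0x01 becomes 'A' */
			dst[len++] = (char)(c ^ 0x40);
		}
	}
	dst[len] = '\0';
	return len;
}
```

With `allow` set to 0, an input containing `\x1b[31m` comes out as the
visible text `^[[31m`; with `allow` set to 1, the input is copied
unchanged.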

Signed-off-by: Johannes Schindelin
---
 Documentation/config.adoc           |  2 ++
 Documentation/config/sideband.adoc  |  5 +++++
 sideband.c                          | 10 ++++++++++
 t/t5409-colorize-remote-messages.sh |  8 +++++++-
 4 files changed, 24 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/config/sideband.adoc

diff --git a/Documentation/config.adoc b/Documentation/config.adoc
index 62eebe7c54501c..dcea3c0c15e2a9 100644
--- a/Documentation/config.adoc
+++ b/Documentation/config.adoc
@@ -523,6 +523,8 @@ include::config/sequencer.adoc[]
 
 include::config/showbranch.adoc[]
 
+include::config/sideband.adoc[]
+
 include::config/sparse.adoc[]
 
 include::config/splitindex.adoc[]
diff --git a/Documentation/config/sideband.adoc b/Documentation/config/sideband.adoc
new file mode 100644
index 00000000000000..3fb5045cd79581
--- /dev/null
+++ b/Documentation/config/sideband.adoc
@@ -0,0 +1,5 @@
+sideband.allowControlCharacters::
+	By default, control characters that are delivered via the sideband
+	are masked, to prevent potentially unwanted ANSI escape sequences
+	from being sent to the terminal. Use this config setting to override
+	this behavior.
diff --git a/sideband.c b/sideband.c
index d2e6023e60e5ed..ecba71e6610dc4 100644
--- a/sideband.c
+++ b/sideband.c
@@ -26,6 +26,8 @@ static struct keyword_entry keywords[] = {
 	{ "error", GIT_COLOR_BOLD_RED },
 };
 
+static int allow_control_characters;
+
 /* Returns a color setting (GIT_COLOR_NEVER, etc). */
 static enum git_colorbool use_sideband_colors(void)
 {
@@ -39,6 +41,9 @@ static enum git_colorbool use_sideband_colors(void)
 	if (use_sideband_colors_cached != GIT_COLOR_UNKNOWN)
 		return use_sideband_colors_cached;
 
+	repo_config_get_bool(the_repository, "sideband.allowcontrolcharacters",
+			     &allow_control_characters);
+
 	if (!repo_config_get_string_tmp(the_repository, key, &value))
 		use_sideband_colors_cached = git_config_colorbool(key, value);
 	else if (!repo_config_get_string_tmp(the_repository, "color.ui", &value))
@@ -68,6 +73,11 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
 
 static void strbuf_add_sanitized(struct strbuf *dest, const char *src, int n)
 {
+	if (allow_control_characters) {
+		strbuf_add(dest, src, n);
+		return;
+	}
+
 	strbuf_grow(dest, n);
 	for (; n && *src; src++, n--) {
 		if (!iscntrl(*src) || *src == '\t' || *src == '\n')
diff --git a/t/t5409-colorize-remote-messages.sh b/t/t5409-colorize-remote-messages.sh
index d0745c391b2625..fb31e8525418a1 100755
--- a/t/t5409-colorize-remote-messages.sh
+++ b/t/t5409-colorize-remote-messages.sh
@@ -105,9 +105,15 @@ test_expect_success 'disallow (color) control sequences in sideband' '
 	EOF
 	test_config_global uploadPack.packObjectshook ./color-me-surprised &&
 	test_commit need-at-least-one-commit &&
+
 	git clone --no-local . throw-away 2>stderr &&
 	test_decode_color <stderr >decoded &&
-	test_grep ! RED decoded
+	test_grep ! RED decoded &&
+
+	rm -rf throw-away &&
+	git -c sideband.allowControlCharacters clone --no-local . throw-away 2>stderr &&
+	test_decode_color <stderr >decoded &&
+	test_grep RED decoded
 '
 
 test_done

From 6ed025115b52f44ee1ddee9d67753cd9f0239076 Mon Sep 17 00:00:00 2001
From: Sverre Rabbelier
Date: Sun, 24 Jul 2011 15:54:04 +0200
Subject: [PATCH 004/218] t9350: point out that refs are not updated correctly

This happens only when the corresponding commits are not exported in
the current fast-export run.
This can happen either when the relevant commit is already marked, or
when the commit is explicitly marked as UNINTERESTING with a negative
ref by another argument.

This breaks fast-export based remote helpers.

Signed-off-by: Sverre Rabbelier
---
 t/t9350-fast-export.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
index 784d68b6e5006f..d4e2222e032e1d 100755
--- a/t/t9350-fast-export.sh
+++ b/t/t9350-fast-export.sh
@@ -1010,4 +1010,15 @@ test_expect_success GPG,RUST 'export and import of doubly signed commit' '
 	fi
 '
 
+cat > expected << EOF
+reset refs/heads/master
+from $(git rev-parse master)
+
+EOF
+
+test_expect_failure 'refs are updated even if no commits need to be exported' '
+	git fast-export master..master > actual &&
+	test_cmp expected actual
+'
+
 test_done

From 997eb4b52b79d3a18c3e3bba4764d0b9b52d62c8 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin
Date: Mon, 16 Mar 2026 10:20:23 +0100
Subject: [PATCH 005/218] mingw: skip symlink type auto-detection for network share targets

On Windows, symbolic links come in two flavors: file symlinks and
directory symlinks. Since Git was born on Linux where this distinction
does not exist, Git for Windows has to auto-detect the type by looking
at the target. When the target does not yet exist at symlink creation
time, Git for Windows creates a "phantom" file symlink and later, once
checkout is complete, calls `CreateFileW()` on the target to check
whether it is actually a directory.

If the symlink target is a UNC path (e.g. `\\attacker\share`), this
auto-detection triggers an SMB connection to the remote host. Windows
performs NTLM authentication by default for such connections, which
means a crafted repository can exfiltrate the cloning user's NTLMv2
hash to an attacker-controlled server without any user interaction
beyond `git clone -c core.symlinks=true `.

There are ways to specify UNC paths that start with only a single
backslash (e.g. `\??\UNC\host\share`); all of them do start like that,
though, so let's use that as a tell-tale that we should skip the
auto-detection in `process_phantom_symlink()`. The symlink is then left
as a file symlink (the `mklink` default), and a warning is emitted
suggesting the user set the `symlink` gitattribute to `dir` if a
directory symlink is needed. When the attribute is already set,
auto-detection is never invoked in the first place, so that code path
is unaffected.

This is the same class of vulnerability as CVE-2025-66413
(https://github.com/git-for-windows/git/security/advisories/GHSA-hv9c-4jm9-jh3x)
and follows the same general mitigation pattern that MinTTY adopted for
ANSI escape sequences referencing network share paths
(https://github.com/mintty/mintty/security/advisories/GHSA-jf4m-m6rv-p6c5).

Note that there are legitimate paths starting with a single backslash
that are _not_ network paths: drive-less absolute paths are interpreted
as relative to the current working directory's drive. In practice,
these are highly uncommon (and brittle, just one working directory
change away from breaking). In any case, the only consequence is now
that the symlink type of those has to be specified via Git attributes,
is all.

Reported-by: Justin Lee
Addresses: CVE-2026-32631
Addresses: https://github.com/git-for-windows/git/security/advisories/GHSA-9j5h-h4m7-85hx
Assisted-by: Claude Opus 4.6
Signed-off-by: Johannes Schindelin
---
 compat/mingw.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/compat/mingw.c b/compat/mingw.c
index 2023c16db65742..feefa2cd0eb12a 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -351,6 +351,29 @@ process_phantom_symlink(const wchar_t *wtarget, const wchar_t *wlink)
 	wchar_t relative[MAX_PATH];
 	const wchar_t *rel;
 
+	/*
+	 * Do not follow symlinks to network shares, to avoid NTLM credential
+	 * leak from crafted repositories (e.g. \\attacker-server\share).
+	 * Since paths come in all kinds of enterprising shapes and forms (in
+	 * addition to the canonical `\\host\share` form, there's also
+	 * `\??\UNC\host\share`, `\GLOBAL??\UNC\host\share` and also
+	 * `\Device\Mup\host\share`, just to name a few), we simply avoid
+	 * following every symlink target that starts with a slash.
+	 *
+	 * This also catches drive-less absolute paths, of course. These are
+	 * uncommon in practice (and also fragile because they are relative to
+	 * the current working directory's drive). The only "harm" this does
+	 * is that it now requires users to specify via the Git attributes if
+	 * they have such an uncommon symbolic link and need it to be a
+	 * directory type link.
+	 */
+	if (is_wdir_sep(wtarget[0])) {
+		warning("created file symlink '%ls' pointing to '%ls';\n"
+			"set the `symlink` gitattribute to `dir` if a "
+			"directory symlink is required", wlink, wtarget);
+		return PHANTOM_SYMLINK_DONE;
+	}
+
 	/* check that wlink is still a file symlink */
 	if ((GetFileAttributesW(wlink)
 		& (FILE_ATTRIBUTE_REPARSE_POINT | FILE_ATTRIBUTE_DIRECTORY))

From 7a2cef3b18bb8de947700079419c26b3582bc0a7 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin
Date: Mon, 18 Nov 2024 21:42:57 +0100
Subject: [PATCH 006/218] sideband: do allow ANSI color sequences by default

The preceding two commits introduced special handling of the sideband
channel to neutralize ANSI escape sequences before sending the payload
to the terminal, and `sideband.allowControlCharacters` to override that
behavior.

However, some `pre-receive` hooks that are actively used in practice
want to color their messages and therefore rely on the fact that Git
passes them through to the terminal.

In contrast to other ANSI escape sequences, it is highly unlikely that
coloring sequences can be essential tools in attack vectors that mislead
Git users e.g. by hiding crucial information.

Therefore we can have both: Continue to allow ANSI coloring sequences to
be passed to the terminal, and neutralize all other ANSI escape
sequences.

Signed-off-by: Johannes Schindelin
---
 Documentation/config/sideband.adoc  | 17 ++++++--
 sideband.c                          | 61 ++++++++++++++++++++++++++---
 t/t5409-colorize-remote-messages.sh | 16 +++++++-
 3 files changed, 84 insertions(+), 10 deletions(-)

diff --git a/Documentation/config/sideband.adoc b/Documentation/config/sideband.adoc
index 3fb5045cd79581..f347fd6b33004a 100644
--- a/Documentation/config/sideband.adoc
+++ b/Documentation/config/sideband.adoc
@@ -1,5 +1,16 @@
 sideband.allowControlCharacters::
 	By default, control characters that are delivered via the sideband
-	are masked, to prevent potentially unwanted ANSI escape sequences
-	from being sent to the terminal. Use this config setting to override
-	this behavior.
+	are masked, except ANSI color sequences. This prevents potentially
+	unwanted ANSI escape sequences from being sent to the terminal. Use
+	this config setting to override this behavior:
++
+--
+	color::
+		Allow ANSI color sequences, line feeds and horizontal tabs,
+		but mask all other control characters. This is the default.
+	false::
+		Mask all control characters other than line feeds and
+		horizontal tabs.
+	true::
+		Allow all control characters to be sent to the terminal.
+--
diff --git a/sideband.c b/sideband.c
index ecba71e6610dc4..17d0d5b7198332 100644
--- a/sideband.c
+++ b/sideband.c
@@ -26,7 +26,11 @@ static struct keyword_entry keywords[] = {
 	{ "error", GIT_COLOR_BOLD_RED },
 };
 
-static int allow_control_characters;
+static enum {
+	ALLOW_NO_CONTROL_CHARACTERS = 0,
+	ALLOW_ALL_CONTROL_CHARACTERS = 1,
+	ALLOW_ANSI_COLOR_SEQUENCES = 2
+} allow_control_characters = ALLOW_ANSI_COLOR_SEQUENCES;
 
 /* Returns a color setting (GIT_COLOR_NEVER, etc). */
 static enum git_colorbool use_sideband_colors(void)
@@ -41,8 +45,24 @@ static enum git_colorbool use_sideband_colors(void)
 	if (use_sideband_colors_cached != GIT_COLOR_UNKNOWN)
 		return use_sideband_colors_cached;
 
-	repo_config_get_bool(the_repository, "sideband.allowcontrolcharacters",
-			     &allow_control_characters);
+	switch (repo_config_get_maybe_bool(the_repository, "sideband.allowcontrolcharacters", &i)) {
+	case 0: /* Boolean value */
+		allow_control_characters = i ? ALLOW_ALL_CONTROL_CHARACTERS :
+			ALLOW_NO_CONTROL_CHARACTERS;
+		break;
+	case -1: /* non-Boolean value */
+		if (repo_config_get_string_tmp(the_repository, "sideband.allowcontrolcharacters",
+				&value))
+			; /* huh? `get_maybe_bool()` returned -1 */
+		else if (!strcmp(value, "color"))
+			allow_control_characters = ALLOW_ANSI_COLOR_SEQUENCES;
+		else
+			warning(_("unrecognized value for `sideband."
+				  "allowControlCharacters`: '%s'"), value);
+		break;
+	default:
+		break; /* not configured */
+	}
 
 	if (!repo_config_get_string_tmp(the_repository, key, &value))
 		use_sideband_colors_cached = git_config_colorbool(key, value);
@@ -71,9 +91,37 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
 		list_config_item(list, prefix, keywords[i].keyword);
 }
 
+static int handle_ansi_color_sequence(struct strbuf *dest, const char *src, int n)
+{
+	int i;
+
+	/*
+	 * Valid ANSI color sequences are of the form
+	 *
+	 * ESC [ [<n> [; <n>]*] m
+	 */
+
+	if (allow_control_characters != ALLOW_ANSI_COLOR_SEQUENCES ||
+	    n < 3 || src[0] != '\x1b' || src[1] != '[')
+		return 0;
+
+	for (i = 2; i < n; i++) {
+		if (src[i] == 'm') {
+			strbuf_add(dest, src, i + 1);
+			return i;
+		}
+		if (!isdigit(src[i]) && src[i] != ';')
+			break;
+	}
+
+	return 0;
+}
+
 static void strbuf_add_sanitized(struct strbuf *dest, const char *src, int n)
 {
-	if (allow_control_characters) {
+	int i;
+
+	if (allow_control_characters == ALLOW_ALL_CONTROL_CHARACTERS) {
 		strbuf_add(dest, src, n);
 		return;
 	}
@@ -82,7 +130,10 @@ static void strbuf_add_sanitized(struct strbuf *dest, const char *src, int n)
 	for (; n && *src; src++, n--) {
 		if (!iscntrl(*src) || *src == '\t' || *src == '\n')
 			strbuf_addch(dest, *src);
-		else {
+		else if ((i = handle_ansi_color_sequence(dest, src, n))) {
+			src += i;
+			n -= i;
+		} else {
 			strbuf_addch(dest, '^');
 			strbuf_addch(dest, 0x40 + *src);
 		}
diff --git a/t/t5409-colorize-remote-messages.sh b/t/t5409-colorize-remote-messages.sh
index fb31e8525418a1..a755c49a74e634 100755
--- a/t/t5409-colorize-remote-messages.sh
+++ b/t/t5409-colorize-remote-messages.sh
@@ -100,7 +100,7 @@ test_expect_success 'fallback to color.ui' '
 
 test_expect_success 'disallow (color) control sequences in sideband' '
 	write_script .git/color-me-surprised <<-\EOF &&
-	printf "error: Have you \\033[31mread\\033[m this?\\n" >&2
+	printf "error: Have you \\033[31mread\\033[m this?\\a\\n" >&2
 	exec "$@"
 	EOF
 	test_config_global uploadPack.packObjectshook ./color-me-surprised &&
@@ -108,12 +108,24 @@ test_expect_success 'disallow (color) control sequences in sideband' '
 
 	git clone --no-local . throw-away 2>stderr &&
 	test_decode_color <stderr >decoded &&
+	test_grep RED decoded &&
+	test_grep "\\^G" stderr &&
+	tr -dc "\\007" <stderr >actual &&
+	test_must_be_empty actual &&
+
+	rm -rf throw-away &&
+	git -c sideband.allowControlCharacters=false \
+		clone --no-local . throw-away 2>stderr &&
+	test_decode_color <stderr >decoded &&
 	test_grep ! RED decoded &&
+	test_grep "\\^G" stderr &&
 
 	rm -rf throw-away &&
 	git -c sideband.allowControlCharacters clone --no-local . throw-away 2>stderr &&
 	test_decode_color <stderr >decoded &&
-	test_grep RED decoded
+	test_grep RED decoded &&
+	tr -dc "\\007" <stderr >actual &&
+	test_file_not_empty actual
 '
 
 test_done

From cf0536eba8ad81134d9076d34862181bc4d6c478 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin
Date: Wed, 30 Oct 2024 19:48:46 +0100
Subject: [PATCH 007/218] unix-socket: avoid leak when initialization fails

When a Unix socket is initialized, the current directory's path is
stored so that the cleanup code can `chdir()` back to where it was
before exit.

If the path that needs to be stored exceeds the default size of the
`sun_path` attribute of `struct sockaddr_un` (which is defined as a
108-sized byte array on Linux), a larger buffer needs to be allocated so
that it can hold the path, and it is the responsibility of the
`unix_sockaddr_cleanup()` function to release that allocated memory.

In Git's CI, this stack allocation is not necessary because the code is
checked out to `/home/runner/work/git/git`. Concatenate the path
`t/trash directory.t0301-credential-cache/.cache/git/credential/socket`
and a terminating NUL, and you end up with 96 bytes, 12 shy of the
default `sun_path` size.

However, I use worktrees with slightly longer paths:
`/home/me/projects/git/yes/i/nest/worktrees/to/organize/them/` is more
in line with what I have. When I recently tried to locally reproduce a
failure of the `linux-leaks` CI job, this t0301 test failed (where it
had not failed in CI).

The reason: When `credential-cache` tries to reach its daemon initially
by calling `unix_sockaddr_init()`, it is expected that the daemon cannot
be reached (the idea is to spin up the daemon in that case and try
again). However, when this first call to `unix_sockaddr_init()` fails,
the code returns early from the `unix_stream_connect()` function
_without_ giving the cleanup code a chance to run, skipping the
deallocation of the above-mentioned path.

The fix is easy: do not return early but instead go directly to the
cleanup code.
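
As a minimal, self-contained sketch of that pattern (not the actual
`unix-socket.c` code): `struct ctx`, `ctx_init()`, `ctx_cleanup()`,
`connect_sketch()` and the `live_allocations` counter are all
hypothetical names, chosen only to show why jumping to the cleanup label
releases a buffer that an early `return -1` would leak.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the context that owns the saved path. */
struct ctx {
	char *orig_dir;
};

/* Counts buffers that were allocated but not yet freed. */
static int live_allocations;

/* Allocates the saved path first, then optionally reports failure. */
static int ctx_init(struct ctx *ctx, const char *dir, int should_fail)
{
	ctx->orig_dir = malloc(strlen(dir) + 1);
	if (!ctx->orig_dir)
		return -1;
	strcpy(ctx->orig_dir, dir);
	live_allocations++;
	return should_fail ? -1 : 0;
}

/* Safe to call whether or not ctx_init() got as far as allocating. */
static void ctx_cleanup(struct ctx *ctx)
{
	if (!ctx->orig_dir)
		return;
	free(ctx->orig_dir);
	ctx->orig_dir = NULL;
	live_allocations--;
}

int connect_sketch(const char *dir, int should_fail)
{
	struct ctx ctx = { NULL };
	int ret = -1;

	if (ctx_init(&ctx, dir, should_fail) < 0)
		goto fail; /* a plain `return -1` here would skip the cleanup */
	ret = 0;
fail:
	ctx_cleanup(&ctx);
	return ret;
}
```

In this sketch, a failing `connect_sketch()` call still leaves
`live_allocations` at zero, because the failure path shares the cleanup
with the success path.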

Signed-off-by: Johannes Schindelin
---
 unix-socket.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/unix-socket.c b/unix-socket.c
index 8860203c3f46dc..1fa0cf6c15c721 100644
--- a/unix-socket.c
+++ b/unix-socket.c
@@ -84,7 +84,7 @@ int unix_stream_connect(const char *path, int disallow_chdir)
 	struct unix_sockaddr_context ctx;
 
 	if (unix_sockaddr_init(&sa, path, &ctx, disallow_chdir) < 0)
-		return -1;
+		goto fail;
 	fd = socket(AF_UNIX, SOCK_STREAM, 0);
 	if (fd < 0)
 		goto fail;

From 3a82b88e5a39ba4ff193848574a12357ea322edc Mon Sep 17 00:00:00 2001
From: Sverre Rabbelier
Date: Sat, 28 Aug 2010 20:49:01 -0500
Subject: [PATCH 008/218] transport-helper: add trailing --

[PT: ensure we add an additional element to the argv array]

Signed-off-by: Sverre Rabbelier
Signed-off-by: Johannes Schindelin
---
 transport-helper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/transport-helper.c b/transport-helper.c
index 4e5d1d914fb12a..26ff24c93ac1a0 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -501,6 +501,8 @@ static int get_exporter(struct transport *transport,
 	for (size_t i = 0; i < revlist_args->nr; i++)
 		strvec_push(&fastexport->args, revlist_args->items[i].string);
 
+	strvec_push(&fastexport->args, "--");
+
 	fastexport->git_cmd = 1;
 	return start_command(fastexport);
 }

From 099c5da143f7197f58a7be1ea14b82c97832b987 Mon Sep 17 00:00:00 2001
From: Sverre Rabbelier
Date: Sun, 24 Jul 2011 00:06:00 +0200
Subject: [PATCH 009/218] remote-helper: check helper status after import/export

Signed-off-by: Johannes Schindelin
Signed-off-by: Sverre Rabbelier
---
 t/t5801-remote-helpers.sh |  2 +-
 transport-helper.c        | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/t/t5801-remote-helpers.sh b/t/t5801-remote-helpers.sh
index d21877150ed82e..3917da47276825 100755
--- a/t/t5801-remote-helpers.sh
+++ b/t/t5801-remote-helpers.sh
@@ -262,7 +262,7 @@ test_expect_success 'push update refs failure' '
 	echo "update fail" >>file &&
 	git commit -a -m "update fail" &&
 	git rev-parse --verify testgit/origin/heads/update >expect &&
-	test_expect_code 1 env GIT_REMOTE_TESTGIT_FAILURE="non-fast forward" \
+	test_must_fail env GIT_REMOTE_TESTGIT_FAILURE="non-fast forward" \
 		git push origin update &&
 	git rev-parse --verify testgit/origin/heads/update >actual &&
 	test_cmp expect actual
diff --git a/transport-helper.c b/transport-helper.c
index 26ff24c93ac1a0..e733c30e2c5826 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -507,6 +507,19 @@ static int get_exporter(struct transport *transport,
 	return start_command(fastexport);
 }
 
+static void check_helper_status(struct helper_data *data)
+{
+	int pid, status;
+
+	pid = waitpid(data->helper->pid, &status, WNOHANG);
+	if (pid < 0)
+		die("Could not retrieve status of remote helper '%s'",
+		    data->name);
+	if (pid > 0 && WIFEXITED(status))
+		die("Remote helper '%s' died with %d",
+		    data->name, WEXITSTATUS(status));
+}
+
 static int fetch_with_import(struct transport *transport,
 			     int nr_heads, struct ref **to_fetch)
 {
@@ -543,6 +556,7 @@ static int fetch_with_import(struct transport *transport,
 
 	if (finish_command(&fastimport))
 		die(_("error while running fast-import"));
+	check_helper_status(data);
 
 	/*
 	 * The fast-import stream of a remote helper that advertises
@@ -1163,6 +1177,7 @@ static int push_refs_with_export(struct transport *transport,
 
 	if (finish_command(&exporter))
 		die(_("error while running fast-export"));
+	check_helper_status(data);
 	if (push_update_refs_status(data, remote_refs, flags))
 		return 1;
 

From d5dc86ae052b7e74cc5b0dfe3b8d94fac30b9507 Mon Sep 17 00:00:00 2001
From: Jeff King
Date: Mon, 13 Jan 2025 01:26:01 -0500
Subject: [PATCH 010/218] grep: prevent `^$` false match at end of file

In some implementations, `regexec_buf()` assumes that it is fed lines;
without `REG_NOTEOL` it thinks the end of the buffer is the end of a
line. Which makes sense, but trips up this case because we are not
feeding lines, but rather a whole buffer.
So the final newline is not the start of an empty line, but the true end
of the buffer.

This causes an interesting bug:

	$ echo content >file.txt
	$ git grep --no-index -n '^$' file.txt
	file.txt:2:

This bug is fixed by making the end of the buffer consistently the end
of the final line.

The patch was applied from
https://lore.kernel.org/git/20250113062601.GD767856@coredump.intra.peff.net/

Reported-by: Olly Betts
Signed-off-by: Johannes Schindelin
---
 grep.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/grep.c b/grep.c
index c7e1dc1e0ee4fe..4fc12251880544 100644
--- a/grep.c
+++ b/grep.c
@@ -1646,6 +1646,8 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 
 	bol = gs->buf;
 	left = gs->size;
+	if (left && gs->buf[left-1] == '\n')
+		left--;
 	while (left) {
 		const char *eol;
 		int hit;

From 0d32026bb5c0c4804cd015ab4ca18227860fa37c Mon Sep 17 00:00:00 2001
From: Johannes Schindelin
Date: Mon, 9 Apr 2012 13:04:35 -0500
Subject: [PATCH 011/218] Always auto-gc after calling a fast-import transport

After importing anything with fast-import, we should always let the
garbage collector do its job, since the objects are written to disk
inefficiently.

This brings down an initial import of http://selenic.com/hg from about
230 megabytes to about 14.

In the future, we may want to make this configurable on a per-remote
basis, or maybe teach fast-import about it in the first place.

Signed-off-by: Johannes Schindelin
---
 transport-helper.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/transport-helper.c b/transport-helper.c
index e733c30e2c5826..cea20c3cb22133 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -22,6 +22,8 @@
 #include "packfile.h"
 
 static int debug;
+/* TODO: put somewhere sensible, e.g. git_transport_options? */
+static int auto_gc = 1;
 
 struct helper_data {
 	char *name;
@@ -590,6 +592,13 @@ static int fetch_with_import(struct transport *transport,
 		}
 	}
 	strbuf_release(&buf);
+	if (auto_gc) {
+		struct child_process cmd = CHILD_PROCESS_INIT;
+
+		cmd.git_cmd = 1;
+		strvec_pushl(&cmd.args, "gc", "--auto", "--quiet", NULL);
+		run_command(&cmd);
+	}
 	return 0;
 }
 

From acd9c70894954245fd02e9f7f32f1f9709d96eab Mon Sep 17 00:00:00 2001
From: Johannes Schindelin
Date: Tue, 18 Apr 2017 12:09:08 +0200
Subject: [PATCH 012/218] mingw: prevent regressions with "drive-less" absolute paths

On Windows, there are several categories of absolute paths. One such
category starts with a backslash and is implicitly relative to the
drive associated with the current working directory. Example:

	c:
	git clone https://github.com/git-for-windows/git \G4W

should clone into C:\G4W.

Back in 2017, Juan Carlos Arevalo Baeza reported a bug in Git's handling
of those absolute paths, which was subsequently fixed. Let's make sure
that it stays fixed.

Signed-off-by: Johannes Schindelin
---
 t/t5580-unc-paths.sh | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/t/t5580-unc-paths.sh b/t/t5580-unc-paths.sh
index 65ef1a3628ee94..e9df367d5777fd 100755
--- a/t/t5580-unc-paths.sh
+++ b/t/t5580-unc-paths.sh
@@ -20,14 +20,11 @@ fi
 UNCPATH="$(winpwd)"
 case "$UNCPATH" in
 [A-Z]:*)
+	WITHOUTDRIVE="${UNCPATH#?:}"
 	# Use administrative share e.g. \\localhost\C$\git-sdk-64\usr\src\git
 	# (we use forward slashes here because MSYS2 and Git accept them, and
 	# they are easier on the eyes)
-	UNCPATH="//localhost/${UNCPATH%%:*}\$/${UNCPATH#?:}"
-	test -d "$UNCPATH" || {
-		skip_all='could not access administrative share; skipping'
-		test_done
-	}
+	UNCPATH="//localhost/${UNCPATH%%:*}\$$WITHOUTDRIVE"
 	;;
 *)
 	skip_all='skipping UNC path tests, cannot determine current path as UNC'
@@ -35,6 +32,18 @@ case "$UNCPATH" in
 	;;
 esac
 
+test_expect_success 'clone into absolute path lacking a drive prefix' '
+	USINGBACKSLASHES="$(echo "$WITHOUTDRIVE"/without-drive-prefix |
+		tr / \\\\)" &&
+	git clone . "$USINGBACKSLASHES" &&
+	test -f without-drive-prefix/.git/HEAD
+'
+
+test -d "$UNCPATH" || {
+	skip_all='could not access administrative share; skipping'
+	test_done
+}
+
 test_expect_success setup '
 	test_commit initial
 '

From f8d7e5704e710912c2dd5b8410c1b42af9aebf1e Mon Sep 17 00:00:00 2001
From: Johannes Schindelin
Date: Mon, 16 Feb 2015 14:06:59 +0100
Subject: [PATCH 013/218] mingw: include the Python parts in the build

While Git for Windows does not _ship_ Python (in order to save on
bandwidth), MSYS2 provides very fine Python interpreters that users can
easily take advantage of, by using Git for Windows within its SDK.
Signed-off-by: Johannes Schindelin --- config.mak.uname | 1 + 1 file changed, 1 insertion(+) diff --git a/config.mak.uname b/config.mak.uname index 5feb5825587e65..d643a3a5fbbacc 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -761,6 +761,7 @@ ifeq ($(uname_S),MINGW) ifneq (CLANGARM64,$(MSYSTEM)) USE_NED_ALLOCATOR = YesPlease endif + NO_PYTHON = ifeq (/mingw64,$(subst 32,64,$(subst clangarm,mingw,$(prefix)))) # Move system config into top-level /etc/ ETC_GITCONFIG = ../etc/gitconfig From 9bc73d048c8002250b05e39ba9d717cba26ccea6 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 31 Jan 2020 12:02:47 +0100 Subject: [PATCH 014/218] mingw: demonstrate a `git add` issue with NTFS junctions NTFS junctions are somewhat similar in spirit to Unix bind mounts: they point to a different directory and are resolved by the filesystem driver. As such, they appear to `lstat()` as if they are directories, not as if they are symbolic links. _Any_ user can create junctions, while symbolic links can only be created by non-administrators in Developer Mode on Windows 10. Hence NTFS junctions are much more common "in the wild" than NTFS symbolic links. It was reported in https://github.com/git-for-windows/git/issues/2481 that adding files via an absolute path that traverses an NTFS junction: since 1e64d18 (mingw: do resolve symlinks in `getcwd()`), we resolve not only symbolic links but also NTFS junctions when determining the absolute path of the current directory. The same is not true for `git add `, where symbolic links are resolved in ``, but not NTFS junctions. 
Signed-off-by: Johannes Schindelin --- t/t3700-add.sh | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/t/t3700-add.sh b/t/t3700-add.sh index 2947bf9a6b1404..c40d16d9149526 100755 --- a/t/t3700-add.sh +++ b/t/t3700-add.sh @@ -587,4 +587,15 @@ test_expect_success CASE_INSENSITIVE_FS 'path is case-insensitive' ' git add "$downcased" ' +test_expect_failure MINGW 'can add files via NTFS junctions' ' + test_when_finished "cmd //c rmdir junction && rm -rf target" && + test_create_repo target && + cmd //c "mklink /j junction target" && + >target/via-junction && + git -C junction add "$(pwd)/junction/via-junction" && + echo via-junction >expect && + git -C target diff --cached --name-only >actual && + test_cmp expect actual +' + test_done From 72a6d08141dfe496a8868f605204530866369e3f Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 31 Jan 2020 11:44:31 +0100 Subject: [PATCH 015/218] strbuf_realpath(): use platform-dependent API if available Some platforms (e.g. Windows) provide API functions to resolve paths much quicker. Let's offer a way to short-cut `strbuf_realpath()` on those platforms. 
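As a rough illustration of the opt-in pattern described above (a sketch only, not part of this patch): a platform header may define a macro naming its fast resolver, and generic code falls back to a macro that evaluates to NULL when no such definition exists. All `demo_*` names and `DEMO_HAVE_PLATFORM_REALPATH` are hypothetical and exist only for this example; the real hook operates on `struct strbuf`.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a platform-specific resolver. A real port
 * would provide the macro from its compat header, before the fallback.
 */
#ifdef DEMO_HAVE_PLATFORM_REALPATH
static const char *demo_platform_realpath(char *out, size_t n, const char *path)
{
	snprintf(out, n, "/resolved%s", path);
	return out;
}
#define platform_realpath(out, n, path) demo_platform_realpath(out, n, path)
#endif

/* Fallback: platforms without a fast path report "not handled". */
#ifndef platform_realpath
#define platform_realpath(out, n, path) NULL
#endif

static const char *demo_realpath(char *out, size_t n, const char *path)
{
	/* Try the platform short-cut first ... */
	if (platform_realpath(out, n, path))
		return out;
	/* ... otherwise use the generic (here: trivial) resolution. */
	snprintf(out, n, "%s", path);
	return out;
}
```

Without `DEMO_HAVE_PLATFORM_REALPATH`, the short-cut compiles down to a constant NULL test, so platforms that do not opt in pay no run-time cost.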
Signed-off-by: Johannes Schindelin --- abspath.c | 3 +++ git-compat-util.h | 4 ++++ 2 files changed, 7 insertions(+) diff --git a/abspath.c b/abspath.c index 1202cde23dbc9b..0c17e98654e4b0 100644 --- a/abspath.c +++ b/abspath.c @@ -93,6 +93,9 @@ static char *strbuf_realpath_1(struct strbuf *resolved, const char *path, goto error_out; } + if (platform_strbuf_realpath(resolved, path)) + return resolved->buf; + strbuf_addstr(&remaining, path); get_root_part(resolved, &remaining); diff --git a/git-compat-util.h b/git-compat-util.h index ae1bdc90a4cd6a..de4f4308ff831f 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -350,6 +350,10 @@ static inline int git_has_dir_sep(const char *path) #define query_user_email() NULL #endif +#ifndef platform_strbuf_realpath +#define platform_strbuf_realpath(resolved, path) NULL +#endif + #ifdef __TANDEM #include #include From 0fa618efa34c48c6edb9c630c227e748631a2253 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sat, 9 May 2020 16:19:06 +0200 Subject: [PATCH 016/218] t5505/t5516: allow running without `.git/branches/` in the templates When we commit the template directory as part of `make vcxproj`, the `branches/` directory is not actually committed, as it is empty. Two tests were not prepared for that situation. This developer tried to get rid of the support for `.git/branches/` a long time ago, but that effort did not bear fruit, so the best we can do is work around in these here tests.
Signed-off-by: Johannes Schindelin --- t/t5505-remote.sh | 4 ++-- t/t5516-fetch-push.sh | 8 ++++---- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh index e592c0bcde91e9..ed8ef69863ddd8 100755 --- a/t/t5505-remote.sh +++ b/t/t5505-remote.sh @@ -1155,7 +1155,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'migrate a remote from named file in ( cd six && git remote rm origin && - mkdir .git/branches && + mkdir -p .git/branches && echo "$origin_url#main" >.git/branches/origin && git remote rename origin origin && test_path_is_missing .git/branches/origin && @@ -1170,7 +1170,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'migrate a remote from named file in ( cd seven && git remote rm origin && - mkdir .git/branches && + mkdir -p .git/branches && echo "quux#foom" > .git/branches/origin && git remote rename origin origin && test_path_is_missing .git/branches/origin && diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh index 117cfa051f33e2..d8a42fa64c292b 100755 --- a/t/t5516-fetch-push.sh +++ b/t/t5516-fetch-push.sh @@ -933,7 +933,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'fetch with branches' ' mk_empty testrepo && git branch second $the_first_commit && git checkout second && - mkdir testrepo/.git/branches && + mkdir -p testrepo/.git/branches && echo ".." 
> testrepo/.git/branches/branch1 && ( cd testrepo && @@ -947,7 +947,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'fetch with branches' ' test_expect_success !WITH_BREAKING_CHANGES 'fetch with branches containing #' ' mk_empty testrepo && - mkdir testrepo/.git/branches && + mkdir -p testrepo/.git/branches && echo "..#second" > testrepo/.git/branches/branch2 && ( cd testrepo && @@ -964,7 +964,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'push with branches' ' git checkout second && test_when_finished "rm -rf .git/branches" && - mkdir .git/branches && + mkdir -p .git/branches && echo "testrepo" > .git/branches/branch1 && git push branch1 && @@ -980,7 +980,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'push with branches containing #' ' mk_empty testrepo && test_when_finished "rm -rf .git/branches" && - mkdir .git/branches && + mkdir -p .git/branches && echo "testrepo#branch3" > .git/branches/branch2 && git push branch2 && From a5b135a952fb8f67d4da5286567601ffdb0457ce Mon Sep 17 00:00:00 2001 From: Thomas Braun Date: Thu, 8 May 2014 21:43:24 +0200 Subject: [PATCH 017/218] transport: optionally disable side-band-64k Since commit 0c499ea60fda (send-pack: demultiplex a sideband stream with status data, 2010-02-05) the send-pack builtin uses the side-band-64k capability if advertised by the server. Unfortunately this breaks pushing over the dump git protocol if used over a network connection. The detailed reasons for this breakage are (by courtesy of Jeff Preshing, quoted from https://groups.google.com/d/msg/msysgit/at8D7J-h7mw/eaLujILGUWoJ): MinGW wraps Windows sockets in CRT file descriptors in order to mimic the functionality of POSIX sockets. This causes msvcrt.dll to treat sockets as Installable File System (IFS) handles, calling ReadFile, WriteFile, DuplicateHandle and CloseHandle on them. This approach works well in simple cases on recent versions of Windows, but does not support all usage patterns. 
In particular, using this approach, any attempt to read & write concurrently on the same socket (from one or more processes) will deadlock in a scenario where the read waits for a response from the server which is only invoked after the write. This is what send_pack currently attempts to do in the use_sideband codepath. The new config option `sendpack.sideband` allows overriding the side-band-64k capability of the server, and thus makes the dumb git protocol work. Other transportation methods like ssh and http/https still benefit from the sideband channel, therefore the default value of `sendpack.sideband` is still true. Signed-off-by: Thomas Braun Signed-off-by: Oliver Schneider Signed-off-by: Johannes Schindelin --- Documentation/config.adoc | 2 ++ Documentation/config/sendpack.adoc | 5 +++++ send-pack.c | 6 +++--- 3 files changed, 10 insertions(+), 3 deletions(-) create mode 100644 Documentation/config/sendpack.adoc diff --git a/Documentation/config.adoc b/Documentation/config.adoc index dcea3c0c15e2a9..4332ce35154be0 100644 --- a/Documentation/config.adoc +++ b/Documentation/config.adoc @@ -519,6 +519,8 @@ include::config/safe.adoc[] include::config/sendemail.adoc[] +include::config/sendpack.adoc[] + include::config/sequencer.adoc[] include::config/showbranch.adoc[] diff --git a/Documentation/config/sendpack.adoc b/Documentation/config/sendpack.adoc new file mode 100644 index 00000000000000..e306f657fba7dd --- /dev/null +++ b/Documentation/config/sendpack.adoc @@ -0,0 +1,5 @@ +sendpack.sideband:: + Allows to disable the side-band-64k capability for send-pack even + when it is advertised by the server. Makes it possible to work + around a limitation in the git for windows implementation together + with the dumb git protocol. Defaults to true.
diff --git a/send-pack.c b/send-pack.c index b4361d5610dc91..0e17ec552e033f 100644 --- a/send-pack.c +++ b/send-pack.c @@ -502,7 +502,7 @@ int send_pack(struct repository *r, int need_pack_data = 0; int allow_deleting_refs = 0; int status_report = 0; - int use_sideband = 0; + int use_sideband = 1; int quiet_supported = 0; int agent_supported = 0; int advertise_sid = 0; @@ -526,6 +526,7 @@ int send_pack(struct repository *r, goto out; } + repo_config_get_bool(r, "sendpack.sideband", &use_sideband); repo_config_get_bool(r, "push.negotiate", &push_negotiate); if (push_negotiate) { trace2_region_enter("send_pack", "push_negotiate", r); @@ -547,8 +548,7 @@ int send_pack(struct repository *r, allow_deleting_refs = 1; if (server_supports("ofs-delta")) args->use_ofs_delta = 1; - if (server_supports("side-band-64k")) - use_sideband = 1; + use_sideband = use_sideband && server_supports("side-band-64k"); if (server_supports("quiet")) quiet_supported = 1; if (server_supports("agent")) From 1a7f4a15a84d84047976ca6b70b4f9327ef4a886 Mon Sep 17 00:00:00 2001 From: Bjoern Mueller Date: Wed, 22 Jan 2020 13:49:13 +0100 Subject: [PATCH 018/218] mingw: fix fatal error working on mapped network drives on Windows In 1e64d18 (mingw: do resolve symlinks in `getcwd()`) a problem was introduced that causes git for Windows to stop working with certain mapped network drives (in particular, drives that are mapped to locations with long path names). Error message was "fatal: Unable to read current working directory: No such file or directory". 
Present change fixes this issue as discussed in https://github.com/git-for-windows/git/issues/2480 Signed-off-by: Bjoern Mueller --- compat/mingw.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index feefa2cd0eb12a..87d3c9ddf20e9d 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1516,8 +1516,13 @@ char *mingw_getcwd(char *pointer, int len) if (hnd != INVALID_HANDLE_VALUE) { ret = GetFinalPathNameByHandleW(hnd, wpointer, ARRAY_SIZE(wpointer), 0); CloseHandle(hnd); - if (!ret || ret >= ARRAY_SIZE(wpointer)) - return NULL; + if (!ret || ret >= ARRAY_SIZE(wpointer)) { + ret = GetLongPathNameW(cwd, wpointer, ARRAY_SIZE(wpointer)); + if (!ret || ret >= ARRAY_SIZE(wpointer)) { + errno = ret ? ENAMETOOLONG : err_win_to_posix(GetLastError()); + return NULL; + } + } if (xwcstoutf(pointer, normalize_ntpath(wpointer), len) < 0) return NULL; return pointer; From bad3b4a0b0a6f0948c116c18fd77afa45cd38d1f Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Thu, 30 Jan 2020 14:22:27 -0500 Subject: [PATCH 019/218] clink.pl: fix MSVC compile script to handle libcurl-d.lib Update clink.pl to link with either libcurl.lib or libcurl-d.lib depending on whether DEBUG=1 is set. Signed-off-by: Jeff Hostetler Signed-off-by: Johannes Schindelin --- compat/vcbuild/scripts/clink.pl | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/compat/vcbuild/scripts/clink.pl b/compat/vcbuild/scripts/clink.pl index 3bd824154be381..c4c99d1a11f18c 100755 --- a/compat/vcbuild/scripts/clink.pl +++ b/compat/vcbuild/scripts/clink.pl @@ -56,7 +56,8 @@ # need to use that instead? foreach my $flag (@lflags) { if ($flag =~ /^-LIBPATH:(.*)/) { - foreach my $l ("libcurl_imp.lib", "libcurl.lib") { + my $libcurl = $is_debug ? 
"libcurl-d.lib" : "libcurl.lib"; + foreach my $l ("libcurl_imp.lib", $libcurl) { if (-f "$1/$l") { $lib = $l; last; From 59aa92e44e4ba40efb2880cc2b28d99ad40f81d3 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 31 Jan 2020 11:49:04 +0100 Subject: [PATCH 020/218] mingw: implement a platform-specific `strbuf_realpath()` There is a Win32 API function to resolve symbolic links, and we can use that instead of resolving them manually. Even better, this function also resolves NTFS junction points (which are somewhat similar to bind mounts). This fixes https://github.com/git-for-windows/git/issues/2481. Signed-off-by: Johannes Schindelin --- compat/mingw.c | 76 +++++++++++++++++++++++++++++++++++++++++++ compat/mingw.h | 3 ++ t/t0060-path-utils.sh | 8 +++++ t/t3700-add.sh | 2 +- t/t5601-clone.sh | 7 ++++ 5 files changed, 95 insertions(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index feefa2cd0eb12a..4777efe57ec12f 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1500,6 +1500,82 @@ struct tm *localtime_r(const time_t *timep, struct tm *result) } #endif +char *mingw_strbuf_realpath(struct strbuf *resolved, const char *path) +{ + wchar_t wpath[MAX_PATH]; + HANDLE h; + DWORD ret; + int len; + const char *last_component = NULL; + char *append = NULL; + + if (xutftowcs_path(wpath, path) < 0) + return NULL; + + h = CreateFileW(wpath, 0, + FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL, + OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL); + + /* + * strbuf_realpath() allows the last path component to not exist. If + * that is the case, now it's time to try without last component. 
+ */ + if (h == INVALID_HANDLE_VALUE && + GetLastError() == ERROR_FILE_NOT_FOUND) { + /* cut last component off of `wpath` */ + wchar_t *p = wpath + wcslen(wpath); + + while (p != wpath) + if (*(--p) == L'/' || *p == L'\\') + break; /* found start of last component */ + + if (p != wpath && (last_component = find_last_dir_sep(path))) { + append = xstrdup(last_component + 1); /* skip directory separator */ + /* + * Do not strip the trailing slash at the drive root, otherwise + * the path would be e.g. `C:` (which resolves to the + * _current_ directory on that drive). + */ + if (p[-1] == L':') + p[1] = L'\0'; + else + *p = L'\0'; + h = CreateFileW(wpath, 0, FILE_SHARE_READ | + FILE_SHARE_WRITE | FILE_SHARE_DELETE, + NULL, OPEN_EXISTING, + FILE_FLAG_BACKUP_SEMANTICS, NULL); + } + } + + if (h == INVALID_HANDLE_VALUE) { +realpath_failed: + FREE_AND_NULL(append); + return NULL; + } + + ret = GetFinalPathNameByHandleW(h, wpath, ARRAY_SIZE(wpath), 0); + CloseHandle(h); + if (!ret || ret >= ARRAY_SIZE(wpath)) + goto realpath_failed; + + len = wcslen(wpath) * 3; + strbuf_grow(resolved, len); + len = xwcstoutf(resolved->buf, normalize_ntpath(wpath), len); + if (len < 0) + goto realpath_failed; + resolved->len = len; + + if (append) { + /* Use forward-slash, like `normalize_ntpath()` */ + strbuf_complete(resolved, '/'); + strbuf_addstr(resolved, append); + FREE_AND_NULL(append); + } + + return resolved->buf; + +} + char *mingw_getcwd(char *pointer, int len) { wchar_t cwd[MAX_PATH], wpointer[MAX_PATH]; diff --git a/compat/mingw.h b/compat/mingw.h index 444daedfa52469..f6daf47ee4e0a7 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -39,6 +39,9 @@ static inline void convert_slashes(char *path) #define PATH_SEP ';' char *mingw_query_user_email(void); #define query_user_email mingw_query_user_email +struct strbuf; +char *mingw_strbuf_realpath(struct strbuf *resolved, const char *path); +#define platform_strbuf_realpath mingw_strbuf_realpath /** * Verifies that the specified path 
is owned by the user running the diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh index 8545cdfab559b4..eb2ab9d437ea8e 100755 --- a/t/t0060-path-utils.sh +++ b/t/t0060-path-utils.sh @@ -281,6 +281,14 @@ test_expect_success SYMLINKS 'real path works on symlinks' ' test_cmp expect actual ' +test_expect_success MINGW 'real path works near drive root' ' + # we need a non-existing path at the drive root; simply skip if C:/xyz exists + if test ! -e C:/xyz + then + test C:/xyz = $(test-tool path-utils real_path C:/xyz) + fi +' + test_expect_success SYMLINKS 'prefix_path works with absolute paths to work tree symlinks' ' ln -s target symlink && echo "symlink" >expect && diff --git a/t/t3700-add.sh b/t/t3700-add.sh index c40d16d9149526..b9495e5cf00724 100755 --- a/t/t3700-add.sh +++ b/t/t3700-add.sh @@ -587,7 +587,7 @@ test_expect_success CASE_INSENSITIVE_FS 'path is case-insensitive' ' git add "$downcased" ' -test_expect_failure MINGW 'can add files via NTFS junctions' ' +test_expect_success MINGW 'can add files via NTFS junctions' ' test_when_finished "cmd //c rmdir junction && rm -rf target" && test_create_repo target && cmd //c "mklink /j junction target" && diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh index d743d986c401a0..f70d99016ea2f7 100755 --- a/t/t5601-clone.sh +++ b/t/t5601-clone.sh @@ -78,6 +78,13 @@ test_expect_success 'clone respects GIT_WORK_TREE' ' ' +test_expect_success CASE_INSENSITIVE_FS 'core.worktree is not added due to path case' ' + + mkdir UPPERCASE && + git clone src "$(pwd)/uppercase" && + test "unset" = "$(git -C UPPERCASE config --default unset core.worktree)" +' + test_expect_success 'clone from hooks' ' test_create_repo r0 && From 0ef6adcbdc375d8d64dd6e016a94cc227590b4bb Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 4 Mar 2020 21:55:28 +0100 Subject: [PATCH 021/218] http: use new "best effort" strategy for Secure Channel revoke checking MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 
Content-Transfer-Encoding: 8bit The native Windows HTTPS backend is based on Secure Channel which lets the caller decide how to handle revocation checking problems caused by missing information in the certificate or offline CRL distribution points. Unfortunately, cURL chose to handle these problems differently than OpenSSL by default: while OpenSSL happily ignores those problems (essentially saying "¯\_(ツ)_/¯"), the Secure Channel backend will error out instead. As a remedy, the "no revoke" mode was introduced, which turns off revocation checking altogether. This is a bit heavy-handed. We support this via the `http.schannelCheckRevoke` setting. In https://github.com/curl/curl/pull/4981, we contributed an opt-in "best effort" strategy that emulates what OpenSSL seems to do. In Git for Windows, we actually want this to be the default. This patch makes it so, introducing it as a new value for the `http.schannelCheckRevoke` setting, which now becomes a tristate: it accepts the values "false", "true" or "best-effort" (defaulting to the last one). Signed-off-by: Johannes Schindelin --- Documentation/config/http.adoc | 12 +++++++----- http.c | 25 +++++++++++++++++++++---- 2 files changed, 28 insertions(+), 9 deletions(-) diff --git a/Documentation/config/http.adoc b/Documentation/config/http.adoc index 849c89f36c5ad8..e5d8d14ab72845 100644 --- a/Documentation/config/http.adoc +++ b/Documentation/config/http.adoc @@ -233,11 +233,13 @@ http.sslKeyType:: http.schannelCheckRevoke:: Used to enforce or disable certificate revocation checks in cURL - when http.sslBackend is set to "schannel". Defaults to `true` if - unset. Only necessary to disable this if Git consistently errors - and the message is about checking the revocation status of a - certificate. This option is ignored if cURL lacks support for - setting the relevant SSL option at runtime. + when http.sslBackend is set to "schannel" via "true" and "false", + respectively.
Another accepted value is "best-effort" (the default) + in which case revocation checks are performed, but errors due to + revocation list distribution points that are offline are silently + ignored, as well as errors due to certificates missing revocation + list distribution points. This option is ignored if cURL lacks + support for setting the relevant SSL option at runtime. http.schannelUseSSLCAInfo:: As of cURL v7.60.0, the Secure Channel backend can use the diff --git a/http.c b/http.c index 67c9c6fc60673d..65000b3749aaac 100644 --- a/http.c +++ b/http.c @@ -150,7 +150,12 @@ static char *cached_accept_language; static char *http_ssl_backend; -static int http_schannel_check_revoke = 1; +static long http_schannel_check_revoke_mode = +#ifdef CURLSSLOPT_REVOKE_BEST_EFFORT + CURLSSLOPT_REVOKE_BEST_EFFORT; +#else + CURLSSLOPT_NO_REVOKE; +#endif static long http_retry_after = 0; static long http_max_retries = 0; @@ -430,7 +435,19 @@ static int http_options(const char *var, const char *value, } if (!strcmp("http.schannelcheckrevoke", var)) { - http_schannel_check_revoke = git_config_bool(var, value); + if (value && !strcmp(value, "best-effort")) { + http_schannel_check_revoke_mode = +#ifdef CURLSSLOPT_REVOKE_BEST_EFFORT + CURLSSLOPT_REVOKE_BEST_EFFORT; +#else + CURLSSLOPT_NO_REVOKE; + warning(_("%s=%s unsupported by current cURL"), + var, value); +#endif + } else + http_schannel_check_revoke_mode = + (git_config_bool(var, value) ? 
+ 0 : CURLSSLOPT_NO_REVOKE); return 0; } @@ -1079,8 +1096,8 @@ static CURL *get_curl_handle(void) #endif if (http_ssl_backend && !strcmp("schannel", http_ssl_backend) && - !http_schannel_check_revoke) { - curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, (long)CURLSSLOPT_NO_REVOKE); + http_schannel_check_revoke_mode) { + curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, http_schannel_check_revoke_mode); } if (http_proactive_auth != PROACTIVE_AUTH_NONE) From 73df800100b98bfbab39cf8263ddc50abbc60a85 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sat, 9 May 2020 19:24:23 +0200 Subject: [PATCH 022/218] t5505/t5516: fix white-space around redirectors The convention in Git project's shell scripts is to have white-space _before_, but not _after_ the `>` (or `<`). Signed-off-by: Johannes Schindelin --- t/t5505-remote.sh | 6 +++--- t/t5516-fetch-push.sh | 10 +++++----- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh index ed8ef69863ddd8..187a5206e17758 100755 --- a/t/t5505-remote.sh +++ b/t/t5505-remote.sh @@ -951,8 +951,8 @@ test_expect_success '"remote show" does not show symbolic refs' ' ( cd three && git remote show origin >output && - ! grep "^ *HEAD$" < output && - ! grep -i stale < output + ! grep "^ *HEAD$" .git/branches/origin && + echo "quux#foom" >.git/branches/origin && git remote rename origin origin && test_path_is_missing .git/branches/origin && test "$(git config remote.origin.url)" = "quux" && diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh index d8a42fa64c292b..6ee4ccc826433a 100755 --- a/t/t5516-fetch-push.sh +++ b/t/t5516-fetch-push.sh @@ -934,7 +934,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'fetch with branches' ' git branch second $the_first_commit && git checkout second && mkdir -p testrepo/.git/branches && - echo ".." > testrepo/.git/branches/branch1 && + echo ".." 
>testrepo/.git/branches/branch1 && ( cd testrepo && git fetch branch1 && @@ -948,7 +948,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'fetch with branches' ' test_expect_success !WITH_BREAKING_CHANGES 'fetch with branches containing #' ' mk_empty testrepo && mkdir -p testrepo/.git/branches && - echo "..#second" > testrepo/.git/branches/branch2 && + echo "..#second" >testrepo/.git/branches/branch2 && ( cd testrepo && git fetch branch2 && @@ -965,7 +965,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'push with branches' ' test_when_finished "rm -rf .git/branches" && mkdir -p .git/branches && - echo "testrepo" > .git/branches/branch1 && + echo "testrepo" >.git/branches/branch1 && git push branch1 && ( @@ -981,7 +981,7 @@ test_expect_success !WITH_BREAKING_CHANGES 'push with branches containing #' ' test_when_finished "rm -rf .git/branches" && mkdir -p .git/branches && - echo "testrepo#branch3" > .git/branches/branch2 && + echo "testrepo#branch3" >.git/branches/branch2 && git push branch2 && ( @@ -1511,7 +1511,7 @@ EOF git init no-thin && git --git-dir=no-thin/.git config receive.unpacklimit 0 && git push no-thin/.git refs/heads/main:refs/heads/foo && - echo modified >> path1 && + echo modified >>path1 && git commit -am modified && git repack -adf && rcvpck="git receive-pack --reject-thin-pack-for-testing" && From 336f3ca256a11abc46004086b0814403aeea9b79 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sat, 12 Sep 2015 12:25:47 +0200 Subject: [PATCH 023/218] t3701: verify that we can add *lots* of files interactively Signed-off-by: Johannes Schindelin --- t/t3701-add-interactive.sh | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh index 6e120a40011238..cb09158c214768 100755 --- a/t/t3701-add-interactive.sh +++ b/t/t3701-add-interactive.sh @@ -1204,6 +1204,27 @@ test_expect_success 'checkout -p patch editing of added file' ' ) ' +test_expect_success EXPENSIVE 'add -i with a lot of 
files' ' + git reset --hard && + x160=0123456789012345678901234567890123456789 && + x160=$x160$x160$x160$x160 && + y= && + i=0 && + while test $i -le 200 + do + name=$(printf "%s%03d" $x160 $i) && + echo $name >$name && + git add -N $name && + y="${y}y$LF" && + i=$(($i+1)) || + exit 1 + done && + echo "$y" | git add -p -- . && + git diff --cached >staged && + test_line_count = 1407 staged && + git reset --hard +' + test_expect_success 'show help from add--helper' ' git reset --hard && cat >expect <<-EOF && From 48682d578172d8a126f913df68fcd41e40bf726a Mon Sep 17 00:00:00 2001 From: Luke Bonanomi Date: Wed, 24 Jun 2020 07:45:52 -0400 Subject: [PATCH 024/218] commit: accept "scissors" with CR/LF line endings This change enhances `git commit --cleanup=scissors` by detecting scissors lines ending in either LF (UNIX-style) or CR/LF (DOS-style). Regression tests are included to specifically test for trailing comments after a CR/LF-terminated scissors line. Signed-off-by: Luke Bonanomi Signed-off-by: Johannes Schindelin --- t/t7502-commit-porcelain.sh | 42 +++++++++++++++++++++++++++++++++++++ wt-status.c | 13 +++++++++--- 2 files changed, 52 insertions(+), 3 deletions(-) diff --git a/t/t7502-commit-porcelain.sh b/t/t7502-commit-porcelain.sh index 05f6da4ad98448..8a013669a5aa95 100755 --- a/t/t7502-commit-porcelain.sh +++ b/t/t7502-commit-porcelain.sh @@ -623,6 +623,48 @@ test_expect_success 'cleanup commit messages (scissors option,-F,-e, scissors on test_must_be_empty actual ' +test_expect_success 'helper-editor' ' + + write_script lf-to-crlf.sh <<-\EOF + sed "s/\$/Q/" <"$1" | tr Q "\\015" >"$1".new && + mv -f "$1".new "$1" + EOF +' + +test_expect_success 'cleanup commit messages (scissors option,-F,-e, CR/LF line endings)' ' + + test_config core.editor "\"$PWD/lf-to-crlf.sh\"" && + scissors="# ------------------------ >8 ------------------------" && + + test_write_lines >text \ + "# Keep this comment" "" " $scissors" \ + "# Keep this comment, too" "$scissors" \ + "# 
Remove this comment" "$scissors" \ + "Remove this comment, too" && + + test_write_lines >expect \ + "# Keep this comment" "" " $scissors" \ + "# Keep this comment, too" && + + git commit --cleanup=scissors -e -F text --allow-empty && + git cat-file -p HEAD >raw && + sed -e "1,/^\$/d" raw >actual && + test_cmp expect actual +' + +test_expect_success 'cleanup commit messages (scissors option,-F,-e, scissors on first line, CR/LF line endings)' ' + + scissors="# ------------------------ >8 ------------------------" && + test_write_lines >text \ + "$scissors" \ + "# Remove this comment and any following lines" && + cp text /tmp/test2-text && + git commit --cleanup=scissors -e -F text --allow-empty --allow-empty-message && + git cat-file -p HEAD >raw && + sed -e "1,/^\$/d" raw >actual && + test_must_be_empty actual +' + test_expect_success 'cleanup commit messages (strip option,-F)' ' echo >>negative && diff --git a/wt-status.c b/wt-status.c index 479ccc3304bc33..f4d2984374146e 100644 --- a/wt-status.c +++ b/wt-status.c @@ -40,7 +40,7 @@ #define UF_DELAY_WARNING_IN_MS (2 * 1000) static const char cut_line[] = -"------------------------ >8 ------------------------\n"; +"------------------------ >8 ------------------------"; static char default_wt_status_colors[][COLOR_MAXLEN] = { GIT_COLOR_NORMAL, /* WT_STATUS_HEADER */ @@ -1121,15 +1121,22 @@ static void wt_longstatus_print_other(struct wt_status *s, status_printf_ln(s, GIT_COLOR_NORMAL, "%s", ""); } +static inline int starts_with_newline(const char *p) +{ + return *p == '\n' || (*p == '\r' && p[1] == '\n'); +} + size_t wt_status_locate_end(const char *s, size_t len) { const char *p; struct strbuf pattern = STRBUF_INIT; strbuf_addf(&pattern, "\n%s %s", comment_line_str, cut_line); - if (starts_with(s, pattern.buf + 1)) + if (starts_with(s, pattern.buf + 1) && + starts_with_newline(s + pattern.len - 1)) len = 0; - else if ((p = strstr(s, pattern.buf))) { + else if ((p = strstr(s, pattern.buf)) && + starts_with_newline(p + 
pattern.len)) { size_t newlen = p - s + 1; if (newlen < len) len = newlen; From 5b62ca7de972642ed911021bbac76a32756e772b Mon Sep 17 00:00:00 2001 From: Jens Glathe Date: Tue, 2 Jun 2020 12:12:25 +0200 Subject: [PATCH 025/218] t0014: fix indentation For some reason, this test case was indented with 4 spaces instead of 1 horizontal tab. The other test cases in the same test script are fine. Signed-off-by: Jens Glathe Signed-off-by: Johannes Schindelin --- t/t0014-alias.sh | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/t/t0014-alias.sh b/t/t0014-alias.sh index 68b4903cbfa595..156265d8d07cc8 100755 --- a/t/t0014-alias.sh +++ b/t/t0014-alias.sh @@ -52,10 +52,10 @@ test_expect_success 'looping aliases - deprecated builtins' ' #' test_expect_success 'run-command formats empty args properly' ' - test_must_fail env GIT_TRACE=1 git frotz a "" b " " c 2>actual.raw && - sed -ne "/run_command:/s/.*trace: run_command: //p" actual.raw >actual && - echo "git-frotz a '\'''\'' b '\'' '\'' c" >expect && - test_cmp expect actual + test_must_fail env GIT_TRACE=1 git frotz a "" b " " c 2>actual.raw && + sed -ne "/run_command:/s/.*trace: run_command: //p" actual.raw >actual && + echo "git-frotz a '\'''\'' b '\'' '\'' c" >expect && + test_cmp expect actual ' test_expect_success 'tracing a shell alias with arguments shows trace of prepared command' ' From 599e9afab6305e8fab92e39fe1d4903a4602c236 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 12 Aug 2020 15:06:17 +0000 Subject: [PATCH 026/218] git-gui: accommodate for intent-to-add files As of Git v2.28.0, the diff for files staged via `git add -N` marks them as new files. Git GUI was ill-prepared for that, and this patch teaches Git GUI about them. Please note that this will not even fix things with v2.28.0, as the `rp/apply-cached-with-i-t-a` patches are required on Git's side, too. 
This fixes https://github.com/git-for-windows/git/issues/2779 Signed-off-by: Johannes Schindelin Signed-off-by: Pratyush Yadav --- git-gui/git-gui.sh | 2 ++ git-gui/lib/diff.tcl | 12 ++++++++---- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/git-gui/git-gui.sh b/git-gui/git-gui.sh index 23fe76e498bd17..799b564b926d0f 100755 --- a/git-gui/git-gui.sh +++ b/git-gui/git-gui.sh @@ -1934,6 +1934,7 @@ set all_icons(U$ui_index) file_merge set all_icons(T$ui_index) file_statechange set all_icons(_$ui_workdir) file_plain +set all_icons(A$ui_workdir) file_plain set all_icons(M$ui_workdir) file_mod set all_icons(D$ui_workdir) file_question set all_icons(U$ui_workdir) file_merge @@ -1960,6 +1961,7 @@ foreach i { {A_ {mc "Staged for commit"}} {AM {mc "Portions staged for commit"}} {AD {mc "Staged for commit, missing"}} + {AA {mc "Intended to be added"}} {_D {mc "Missing"}} {D_ {mc "Staged for removal"}} diff --git a/git-gui/lib/diff.tcl b/git-gui/lib/diff.tcl index 8be1a613fbe01f..d25a9bbdc4abde 100644 --- a/git-gui/lib/diff.tcl +++ b/git-gui/lib/diff.tcl @@ -556,7 +556,8 @@ proc apply_or_revert_hunk {x y revert} { if {$current_diff_side eq $ui_index} { set failed_msg [mc "Failed to unstage selected hunk."] lappend apply_cmd --reverse --cached - if {[string index $mi 0] ne {M}} { + set file_state [string index $mi 0] + if {$file_state ne {M} && $file_state ne {A}} { unlock_index return } @@ -569,7 +570,8 @@ proc apply_or_revert_hunk {x y revert} { lappend apply_cmd --cached } - if {[string index $mi 1] ne {M}} { + set file_state [string index $mi 1] + if {$file_state ne {M} && $file_state ne {A}} { unlock_index return } @@ -661,7 +663,8 @@ proc apply_or_revert_range_or_line {x y revert} { set failed_msg [mc "Failed to unstage selected line."] set to_context {+} lappend apply_cmd --reverse --cached - if {[string index $mi 0] ne {M}} { + set file_state [string index $mi 0] + if {$file_state ne {M} && $file_state ne {A}} { unlock_index return } @@ -676,7 +679,8 @@ 
proc apply_or_revert_range_or_line {x y revert} { lappend apply_cmd --cached } - if {[string index $mi 1] ne {M}} { + set file_state [string index $mi 1] + if {$file_state ne {M} && $file_state ne {A}} { unlock_index return } From 2894e56c5ac34303516eb4c517c98600d83cdb61 Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Sun, 31 Oct 2021 23:15:13 +0000 Subject: [PATCH 027/218] hash-object: demonstrate a >4GB/LLP64 problem On LLP64 systems, such as Windows, the size of `long`, `int`, etc. is only 32 bits (for backward compatibility). Git's use of `unsigned long` for file memory sizes in many places, rather than size_t, limits the handling of large files on LLP64 systems (commonly given as `>4GB`). Provide a minimum test for handling a >4GB file. The `hash-object` command, with the `--literally` and without `-w` option avoids writing the object, either loose or packed. This avoids the code paths hitting the `bigFileThreshold` config test code, the zlib code, and the pack code. Subsequent patches will walk the test's call chain, converting types to `size_t` (which is larger in LLP64 data models) where appropriate. 
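To illustrate the truncation described above, here is a minimal sketch that is not part of this patch. It assumes a hypothetical `llp64_ulong` typedef standing in for a 32-bit `unsigned long`, so that the effect can be reproduced on any platform; the helper names are made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for `unsigned long` on an LLP64 platform (32 bits). */
typedef uint32_t llp64_ulong;

/* Size as seen after being passed through a 32-bit `unsigned long`. */
uint64_t size_via_llp64_ulong(uint64_t size)
{
	llp64_ulong truncated = (llp64_ulong)size; /* high 32 bits are dropped */
	return truncated;
}

/* Size as seen after being passed through a 64-bit type such as size_t. */
uint64_t size_via_64bit(uint64_t size)
{
	uint64_t kept = size; /* no bits are lost */
	return kept;
}
```

With the 5GB file used in the test, the 32-bit path would report a length of 1GB (5GB modulo 4GB), which is why the hash cannot match until the whole call chain uses `size_t`.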
Signed-off-by: Philip Oakley Signed-off-by: Johannes Schindelin --- t/t1007-hash-object.sh | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh index de076293b62a76..7867fd1dbf940c 100755 --- a/t/t1007-hash-object.sh +++ b/t/t1007-hash-object.sh @@ -49,6 +49,9 @@ test_expect_success 'setup' ' example sha1:ddd3f836d3e3fbb7ae289aa9ae83536f76956399 example sha256:b44fe1fe65589848253737db859bd490453510719d7424daab03daf0767b85ae + + large5GB sha1:0be2be10a4c8764f32c4bf372a98edc731a4b204 + large5GB sha256:dc18ca621300c8d3cfa505a275641ebab00de189859e022a975056882d313e64 EOF ' @@ -258,4 +261,12 @@ test_expect_success '--stdin outside of repository (uses default hash)' ' test_cmp expect actual ' +test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ + 'files over 4GB hash literally' ' + test-tool genzeros $((5*1024*1024*1024)) >big && + test_oid large5GB >expect && + git hash-object --stdin --literally <big >actual && + test_cmp expect actual +' + test_done From bcb5666befbb80e5bbbb0d80ea25fd464ea6161f Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Fri, 12 Nov 2021 21:14:50 +0000 Subject: [PATCH 028/218] object-file.c: use size_t for header lengths Continue walking the code path for the >4GB `hash-object --literally` test. The `hash_object_file_literally()` function internally uses both `hash_object_file()` and `write_object_file_prepare()`. Both function signatures use `unsigned long` rather than `size_t` for the mem buffer sizes. Use `size_t` instead, for LLP64 compatibility. While at it, convert the object header buffer length in those functions to `size_t` for consistency. The value is already upcast to `uintmax_t` for print format compatibility. Note: The hash-object test still does not pass. A subsequent commit continues to walk the call tree's lower level hash functions to identify further fixes. 
Signed-off-by: Philip Oakley Signed-off-by: Johannes Schindelin --- object-file.c | 14 +++++++------- object-file.h | 4 ++-- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/object-file.c b/object-file.c index 2acc9522df2daa..1d511a0058c716 100644 --- a/object-file.c +++ b/object-file.c @@ -562,7 +562,7 @@ int odb_source_loose_read_object_info(struct odb_source *source, static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c, const void *buf, unsigned long len, struct object_id *oid, - char *hdr, int *hdrlen) + char *hdr, size_t *hdrlen) { algo->init_fn(c); git_hash_update(c, hdr, *hdrlen); @@ -571,9 +571,9 @@ static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_c } static void write_object_file_prepare(const struct git_hash_algo *algo, - const void *buf, unsigned long len, + const void *buf, size_t len, enum object_type type, struct object_id *oid, - char *hdr, int *hdrlen) + char *hdr, size_t *hdrlen) { struct git_hash_ctx c; @@ -716,11 +716,11 @@ int finalize_object_file_flags(struct repository *repo, } void hash_object_file(const struct git_hash_algo *algo, const void *buf, - unsigned long len, enum object_type type, + size_t len, enum object_type type, struct object_id *oid) { char hdr[MAX_HEADER_LEN]; - int hdrlen = sizeof(hdr); + size_t hdrlen = sizeof(hdr); write_object_file_prepare(algo, buf, len, type, oid, hdr, &hdrlen); } @@ -1167,7 +1167,7 @@ int odb_source_loose_write_stream(struct odb_source *source, } int odb_source_loose_write_object(struct odb_source *source, - const void *buf, unsigned long len, + const void *buf, size_t len, enum object_type type, struct object_id *oid, struct object_id *compat_oid_in, enum odb_write_object_flags flags) @@ -1176,7 +1176,7 @@ int odb_source_loose_write_object(struct odb_source *source, const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo; struct object_id compat_oid; char hdr[MAX_HEADER_LEN]; - int hdrlen = sizeof(hdr); + 
size_t hdrlen = sizeof(hdr); /* Generate compat_oid */ if (compat) { diff --git a/object-file.h b/object-file.h index 5241b8dd5c564d..e1e22d512d7e10 100644 --- a/object-file.h +++ b/object-file.h @@ -66,7 +66,7 @@ int odb_source_loose_freshen_object(struct odb_source *source, const struct object_id *oid); int odb_source_loose_write_object(struct odb_source *source, - const void *buf, unsigned long len, + const void *buf, size_t len, enum object_type type, struct object_id *oid, struct object_id *compat_oid_in, enum odb_write_object_flags flags); @@ -201,7 +201,7 @@ int finalize_object_file_flags(struct repository *repo, enum finalize_object_file_flags flags); void hash_object_file(const struct git_hash_algo *algo, const void *buf, - unsigned long len, enum object_type type, + size_t len, enum object_type type, struct object_id *oid); /* Helper to check and "touch" a file */ From d7c54a731d5423a335d6ed7abc83fb8914f7a1f1 Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Fri, 12 Nov 2021 21:16:51 +0000 Subject: [PATCH 029/218] hash algorithms: use size_t for section lengths Continue walking the code path for the >4GB `hash-object --literally` test to the hash algorithm step for LLP64 systems. This patch lets the SHA1DC code use `size_t`, making it compatible with LLP64 data models (as used e.g. by Windows). The interested reader of this patch will note that we adjust the signature of the `git_SHA1DCUpdate()` function without updating _any_ call site. This certainly puzzled at least one reviewer already, so here is an explanation: This function is never called directly, but always via the macro `platform_SHA1_Update`, which is usually called via the macro `git_SHA1_Update`. However, we never call `git_SHA1_Update()` directly in `struct git_hash_algo`. Instead, we call `git_hash_sha1_update()`, which is defined thusly: static void git_hash_sha1_update(git_hash_ctx *ctx, const void *data, size_t len) { git_SHA1_Update(&ctx->sha1, data, len); } i.e. 
it contains an implicit downcast from `size_t` to `unsigned long` (before this here patch). With this patch, there is no downcast anymore. With this patch, finally, the t1007-hash-object.sh "files over 4GB hash literally" test case is fixed. Signed-off-by: Philip Oakley Signed-off-by: Johannes Schindelin --- object-file.c | 4 ++-- sha1dc_git.c | 3 +-- sha1dc_git.h | 2 +- t/t1007-hash-object.sh | 2 +- 4 files changed, 5 insertions(+), 6 deletions(-) diff --git a/object-file.c b/object-file.c index 1d511a0058c716..2dbb805028962c 100644 --- a/object-file.c +++ b/object-file.c @@ -560,7 +560,7 @@ int odb_source_loose_read_object_info(struct odb_source *source, } static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c, - const void *buf, unsigned long len, + const void *buf, size_t len, struct object_id *oid, char *hdr, size_t *hdrlen) { @@ -580,7 +580,7 @@ static void write_object_file_prepare(const struct git_hash_algo *algo, /* Generate the header */ *hdrlen = format_object_header(hdr, *hdrlen, type, len); - /* Sha1.. */ + /* Hash (function pointers) computation */ hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen); } diff --git a/sha1dc_git.c b/sha1dc_git.c index 9b675a046ee699..fe58d7962a30c9 100644 --- a/sha1dc_git.c +++ b/sha1dc_git.c @@ -27,10 +27,9 @@ void git_SHA1DCFinal(unsigned char hash[20], SHA1_CTX *ctx) /* * Same as SHA1DCUpdate, but adjust types to match git's usual interface. 
*/ -void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, unsigned long len) +void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, size_t len) { const char *data = vdata; - /* We expect an unsigned long, but sha1dc only takes an int */ while (len > INT_MAX) { SHA1DCUpdate(ctx, data, INT_MAX); data += INT_MAX; diff --git a/sha1dc_git.h b/sha1dc_git.h index f6f880cabea382..0bcf1aa84b7241 100644 --- a/sha1dc_git.h +++ b/sha1dc_git.h @@ -15,7 +15,7 @@ void git_SHA1DCInit(SHA1_CTX *); #endif void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *); -void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len); +void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, size_t len); #define platform_SHA_IS_SHA1DC /* used by "test-tool sha1-is-sha1dc" */ diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh index 7867fd1dbf940c..10382a815e4c14 100755 --- a/t/t1007-hash-object.sh +++ b/t/t1007-hash-object.sh @@ -261,7 +261,7 @@ test_expect_success '--stdin outside of repository (uses default hash)' ' test_cmp expect actual ' -test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ 'files over 4GB hash literally' ' test-tool genzeros $((5*1024*1024*1024)) >big && test_oid large5GB >expect && From e8d50343539d223b3bf69fbf04de01e113e9f85a Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Mon, 6 Dec 2021 22:26:50 +0000 Subject: [PATCH 030/218] hash-object --stdin: verify that it works with >4GB/LLP64 Just like the `hash-object --literally` code path, the `--stdin` code path also needs to use `size_t` instead of `unsigned long` to represent memory sizes, otherwise it would cause problems on platforms using the LLP64 data model (such as Windows). To limit the scope of the test case, the object is explicitly not written to the object store, nor are any filters applied. 
The `big` file from the previous test case is reused to save setup time. To avoid relying on that side effect, it is generated if it does not exist (e.g. when running via `sh t1007-*.sh --long --run=1,41`). Signed-off-by: Philip Oakley Signed-off-by: Johannes Schindelin --- t/t1007-hash-object.sh | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh index 10382a815e4c14..59efee3affcff4 100755 --- a/t/t1007-hash-object.sh +++ b/t/t1007-hash-object.sh @@ -269,4 +269,12 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ test_cmp expect actual ' +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ + 'files over 4GB hash correctly via --stdin' ' + { test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } && + test_oid large5GB >expect && + git hash-object --stdin <big >actual && + test_cmp expect actual +' + test_done From d99fd4804f77fe47d8d698fa21230c9ddd503526 Mon Sep 17 00:00:00 2001 From: Victoria Dye Date: Thu, 5 Aug 2021 19:04:13 -0400 Subject: [PATCH 031/218] subtree: update `contrib/subtree` `test` target The intention of this change is to align with how the top-level git `Makefile` defines its own test target (which also internally calls `$(MAKE) -C t/ all`). This change also ensures the consistency of `make -C contrib/subtree test` with other testing in CI executions (which rely on `$DEFAULT_TEST_TARGET` being defined as `prove`). 
Signed-off-by: Victoria Dye --- contrib/subtree/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/contrib/subtree/Makefile b/contrib/subtree/Makefile index c0c9f21cb78022..dab2dfc08ee222 100644 --- a/contrib/subtree/Makefile +++ b/contrib/subtree/Makefile @@ -95,7 +95,7 @@ $(GIT_SUBTREE_TEST): $(GIT_SUBTREE) cp $< $@ test: $(GIT_SUBTREE_TEST) - $(MAKE) -C t/ test + $(MAKE) -C t/ all clean: $(RM) $(GIT_SUBTREE) From ce892c8090c5be0fa463d5b716449034009c504c Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Mon, 6 Dec 2021 22:42:46 +0000 Subject: [PATCH 032/218] hash-object: add another >4GB/LLP64 test case To complement the `--stdin` and `--literally` test cases that verify that we can hash files larger than 4GB on 64-bit platforms using the LLP64 data model, here is a test case that exercises `hash-object` _without_ any options. Just as before, we use the `big` file from the previous test case if it exists to save on setup time, otherwise generate it. Signed-off-by: Philip Oakley Signed-off-by: Johannes Schindelin --- t/t1007-hash-object.sh | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh index 59efee3affcff4..f2722380ee1436 100755 --- a/t/t1007-hash-object.sh +++ b/t/t1007-hash-object.sh @@ -277,4 +277,12 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ test_cmp expect actual ' +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ + 'files over 4GB hash correctly' ' + { test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } && + test_oid large5GB >expect && + git hash-object -- big >actual && + test_cmp expect actual +' + test_done From 16b5c6574841a691b6eae077a2b2cbda1aaa07f4 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Wed, 13 Apr 2022 14:49:17 -0400 Subject: [PATCH 033/218] setup: properly use "%(prefix)/" when in WSL Signed-off-by: Derrick Stolee --- setup.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git 
a/setup.c b/setup.c index 7ec4427368a2a7..1f60e058cae35f 100644 --- a/setup.c +++ b/setup.c @@ -1919,10 +1919,19 @@ const char *setup_git_directory_gently(int *nongit_ok) break; case GIT_DIR_INVALID_OWNERSHIP: if (!nongit_ok) { + struct strbuf prequoted = STRBUF_INIT; struct strbuf quoted = STRBUF_INIT; strbuf_complete(&report, '\n'); - sq_quote_buf_pretty(&quoted, dir.buf); + +#ifdef __MINGW32__ + if (dir.buf[0] == '/') + strbuf_addstr(&prequoted, "%(prefix)/"); +#endif + + strbuf_add(&prequoted, dir.buf, dir.len); + sq_quote_buf_pretty(&quoted, prequoted.buf); + die(_("detected dubious ownership in repository at '%s'\n" "%s" "To add an exception for this directory, call:\n" From 55b3fbdec982a4f209bc2dda4094fe7797ba6e0e Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 2 Apr 2021 22:50:54 +0200 Subject: [PATCH 034/218] mingw: allow for longer paths in `parse_interpreter()` MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit As reported in https://github.com/newren/git-filter-repo/pull/225, it looks like 99 bytes is not really sufficient to represent e.g. the full path to Python when installed via Windows Store (and this path is used in the hash bang line when installing scripts via `pip`). Let's increase it to what is probably the maximum sensible path size: MAX_PATH. This brings `parse_interpreter()` in line with what `lookup_prog()` handles. 
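As a rough illustration of why the buffer size matters, consider the following sketch. It is not Git's `parse_interpreter()`; the function name `sketch_interpreter()` and its behavior are assumptions made for this example. It only looks at the first `bufsize - 1` bytes of a script and gives up if the hash bang line does not end within them.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not Git's parse_interpreter(): extracts the
 * interpreter path from the start of a script into a caller-provided
 * buffer (bufsize must be at least 1). Returns NULL if there is no
 * "#!" line or if that line does not fit into the buffer.
 */
const char *sketch_interpreter(const char *script, char *buf, size_t bufsize)
{
	size_t n = strlen(script);
	char *eol;

	if (n > bufsize - 1)
		n = bufsize - 1;	/* only the first bufsize - 1 bytes are seen */
	memcpy(buf, script, n);
	buf[n] = '\0';

	if (buf[0] != '#' || buf[1] != '!')
		return NULL;
	eol = strchr(buf, '\n');
	if (!eol)
		return NULL;		/* the line was cut off by the buffer size */
	*eol = '\0';
	return buf + 2;
}
```

Under these assumptions, a 150-character interpreter path is rejected with a 100-byte buffer but accepted with a 260-byte one, which is the kind of difference the switch to MAX_PATH is meant to address.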
Signed-off-by: Johannes Schindelin Signed-off-by: Vilius Šumskas --- compat/mingw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index feefa2cd0eb12a..b2595111ed6294 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1628,7 +1628,7 @@ static const char *quote_arg_msys2(const char *arg) static const char *parse_interpreter(const char *cmd) { - static char buf[100]; + static char buf[MAX_PATH]; char *p, *opt; ssize_t n; /* read() can return negative values */ int fd; From 4163a8b7200556c02b93e07f92bb29be24da3b83 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 17 May 2021 10:46:52 +0200 Subject: [PATCH 035/218] compat/vcbuild: document preferred way to build in Visual Studio We used to have that `make vcxproj` hack, but a hack it is. In the meantime, we have a much cleaner solution: using CMake, either explicitly, or even more conveniently via Visual Studio's built-in CMake support (simply open Git's top-level directory via File>Open>Folder...). Let's let the `README` reflect this. Signed-off-by: Johannes Schindelin --- compat/vcbuild/README | 28 +++++++++------------------- 1 file changed, 9 insertions(+), 19 deletions(-) diff --git a/compat/vcbuild/README b/compat/vcbuild/README index 29ec1d0f104b80..5c71ea2daa4017 100644 --- a/compat/vcbuild/README +++ b/compat/vcbuild/README @@ -37,27 +37,17 @@ The Steps to Build Git with VS2015 or VS2017 from the command line. ================================================================ -Alternatively, run `make vcxproj` and then load the generated `git.sln` in -Visual Studio. The initial build will install the vcpkg system and build the +Alternatively, just open Git's top-level directory in Visual Studio, via +`File>Open>Folder...`. This will use CMake internally to generate the +project definitions. It will also install the vcpkg system and build the dependencies automatically. This will take a while. 
-Instead of generating the `git.sln` file yourself (which requires a full Git -for Windows SDK), you may want to consider fetching the `vs/master` branch of -https://github.com/git-for-windows/git instead (which is updated automatically -via CI running `make vcxproj`). The `vs/master` branch does not require a Git -for Windows to build, but you can run the test scripts in a regular Git Bash. - -Note that `make vcxproj` will automatically add and commit the generated `.sln` -and `.vcxproj` files to the repo. This is necessary to allow building a -fully-testable Git in Visual Studio, where a regular Git Bash can be used to -run the test scripts (as opposed to a full Git for Windows SDK): a number of -build targets, such as Git commands implemented as Unix shell scripts (where -`@@SHELL_PATH@@` and other placeholders are interpolated) require a full-blown -Git for Windows SDK (which is about 10x the size of a regular Git for Windows -installation). - -If your plan is to open a Pull Request with Git for Windows, it is a good idea -to drop this commit before submitting. +You can also generate the Visual Studio solution manually by downloading +and running CMake explicitly rather than letting Visual Studio doing +that implicitly. + +Another, deprecated option is to run `make vcxproj`. This option is +superseded by the CMake-based build, and will be removed at some point. ================================================================ The Steps of Build Git with VS2008 From 4c6d4abe787b21a72c59c176326c28539d0f8f36 Mon Sep 17 00:00:00 2001 From: Pascal Muller Date: Wed, 23 Jun 2021 21:21:10 +0200 Subject: [PATCH 036/218] http: optionally send SSL client certificate This adds support for a new http.sslAutoClientCert config value. In cURL 7.77 or later the schannel backend does not automatically send client certificates from the Windows Certificate Store anymore. 
This config value is only used if http.sslBackend is set to "schannel", and can be used to opt in to the old behavior and force cURL to send client certificates. This fixes https://github.com/git-for-windows/git/issues/3292 Signed-off-by: Pascal Muller --- Documentation/config/http.adoc | 5 +++++ git-curl-compat.h | 8 ++++++++ http.c | 24 +++++++++++++++++++++--- 3 files changed, 34 insertions(+), 3 deletions(-) diff --git a/Documentation/config/http.adoc b/Documentation/config/http.adoc index e5d8d14ab72845..1e722fbf1f0f25 100644 --- a/Documentation/config/http.adoc +++ b/Documentation/config/http.adoc @@ -249,6 +249,11 @@ http.schannelUseSSLCAInfo:: when the `schannel` backend was configured via `http.sslBackend`, unless `http.schannelUseSSLCAInfo` overrides this behavior. +http.sslAutoClientCert:: + As of cURL v7.77.0, the Secure Channel backend won't automatically + send client certificates from the Windows Certificate Store anymore. + To opt in to the old behavior, http.sslAutoClientCert can be set. + http.pinnedPubkey:: Public key of the https service. It may either be the filename of a PEM or DER encoded public key file or a string starting with diff --git a/git-curl-compat.h b/git-curl-compat.h index dccdd4d6e54158..5c8ceb076adea2 100644 --- a/git-curl-compat.h +++ b/git-curl-compat.h @@ -45,6 +45,14 @@ #define GIT_CURL_HAVE_CURLINFO_RETRY_AFTER 1 #endif +/** + * CURLSSLOPT_AUTO_CLIENT_CERT was added in 7.77.0, released in May + * 2021. + */ +#if LIBCURL_VERSION_NUM >= 0x074d00 +#define GIT_CURL_HAVE_CURLSSLOPT_AUTO_CLIENT_CERT +#endif + /** * CURLOPT_PROTOCOLS_STR and CURLOPT_REDIR_PROTOCOLS_STR were added in 7.85.0, * released in August 2022. 
diff --git a/http.c b/http.c index 65000b3749aaac..588d3140e14eba 100644 --- a/http.c +++ b/http.c @@ -168,6 +168,8 @@ static long http_max_retry_time = 300; */ static int http_schannel_use_ssl_cainfo; +static int http_auto_client_cert; + static int always_auth_proactively(void) { return http_proactive_auth != PROACTIVE_AUTH_NONE && @@ -456,6 +458,11 @@ static int http_options(const char *var, const char *value, return 0; } + if (!strcmp("http.sslautoclientcert", var)) { + http_auto_client_cert = git_config_bool(var, value); + return 0; + } + if (!strcmp("http.minsessions", var)) { min_curl_sessions = git_config_int(var, value, ctx->kvi); if (min_curl_sessions > 1) @@ -1095,9 +1102,20 @@ static CURL *get_curl_handle(void) } #endif - if (http_ssl_backend && !strcmp("schannel", http_ssl_backend) && - http_schannel_check_revoke_mode) { - curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, http_schannel_check_revoke_mode); + if (http_ssl_backend && !strcmp("schannel", http_ssl_backend)) { + long ssl_options = 0; + if (http_schannel_check_revoke_mode) { + ssl_options |= http_schannel_check_revoke_mode; + } + + if (http_auto_client_cert) { +#ifdef GIT_CURL_HAVE_CURLSSLOPT_AUTO_CLIENT_CERT + ssl_options |= CURLSSLOPT_AUTO_CLIENT_CERT; +#endif + } + + if (ssl_options) + curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, ssl_options); } if (http_proactive_auth != PROACTIVE_AUTH_NONE) From b030b44f8ffd13cde98814802acdd46a03aa2420 Mon Sep 17 00:00:00 2001 From: Victoria Dye Date: Thu, 5 Aug 2021 19:11:59 -0400 Subject: [PATCH 037/218] ci: run `contrib/subtree` tests in CI builds Because `git subtree` (unlike most other `contrib` modules) is included as part of the standard release of Git for Windows, its stability should be verified as consistently as it is for the rest of git. By including the `git subtree` tests in the CI workflow, these tests are as much of a gate to merging and indicator of stability as the standard test suite. 
Signed-off-by: Victoria Dye --- ci/run-build-and-tests.sh | 4 ++++ ci/run-test-slice.sh | 3 +++ 2 files changed, 7 insertions(+) diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh index 28cfe730ee5aed..9bdfac128dbf55 100755 --- a/ci/run-build-and-tests.sh +++ b/ci/run-build-and-tests.sh @@ -63,5 +63,9 @@ case "$jobname" in ;; esac +case " $MAKE_TARGETS " in +*" all "*) make -C contrib/subtree test;; +esac + check_unignored_build_artifacts save_good_tree diff --git a/ci/run-test-slice.sh b/ci/run-test-slice.sh index ff948e397fcb70..f84190e7b73180 100755 --- a/ci/run-test-slice.sh +++ b/ci/run-test-slice.sh @@ -15,4 +15,7 @@ if [ "$1" == "0" ] ; then group "Run unit tests" make --quiet -C t unit-tests-test-tool fi +# Run the git subtree tests only if main tests succeeded +test 0 != "$1" || make -C contrib/subtree test + check_unignored_build_artifacts From d8d5be9c84a9c6690851b48a16b2c98ad33dd0d3 Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Tue, 7 Dec 2021 09:53:41 +0000 Subject: [PATCH 038/218] hash-object: add a >4GB/LLP64 test case using filtered input To verify that the `clean` side of the `clean`/`smudge` filter code is correct with regards to LLP64 (read: to ensure that `size_t` is used instead of `unsigned long`), here is a test case using a trivial filter, specifically _not_ writing anything to the object store to limit the scope of the test case. As in previous commits, the `big` file from previous test cases is reused if available, to save setup time, otherwise re-generated. 
Signed-off-by: Philip Oakley Signed-off-by: Johannes Schindelin --- t/t1007-hash-object.sh | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh index f2722380ee1436..841a6671d1a3c1 100755 --- a/t/t1007-hash-object.sh +++ b/t/t1007-hash-object.sh @@ -285,4 +285,16 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ test_cmp expect actual ' +# This clean filter does nothing, other than excercising the interface. +# We ensure that cleaning doesn't mangle large files on 64-bit Windows. +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ + 'hash filtered files over 4GB correctly' ' + { test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } && + test_oid large5GB >expect && + test_config filter.null-filter.clean "cat" && + echo "big filter=null-filter" >.gitattributes && + git hash-object -- big >actual && + test_cmp expect actual +' + test_done From c99c6673583eed58d0bcde00363c0a7a3ebc6cde Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Wed, 13 Apr 2022 14:54:43 -0400 Subject: [PATCH 039/218] compat/mingw.c: do not warn when failing to get owner In the case of Git for Windows (say, in a Git Bash window) running in a Windows Subsystem for Linux (WSL) directory, the GetNamedSecurityInfoW() call in is_path_owned_by_current_sid() returns an error code other than ERROR_SUCCESS. This is consistent behavior across this boundary. In these cases, the owner would always be different because the WSL owner is a different entity than the Windows user. The change here is to suppress the error message that looks like this: error: failed to get owner for '//wsl.localhost/...' (1) Before this change, this warning happens for every Git command, regardless of whether the directory is marked with safe.directory. 
Signed-off-by: Derrick Stolee --- compat/mingw.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index feefa2cd0eb12a..fabb3a30364376 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -3277,9 +3277,7 @@ int is_path_owned_by_current_sid(const char *path, struct strbuf *report) DACL_SECURITY_INFORMATION, &sid, NULL, NULL, NULL, &descriptor); - if (err != ERROR_SUCCESS) - error(_("failed to get owner for '%s' (%ld)"), path, err); - else if (sid && IsValidSid(sid)) { + if (err == ERROR_SUCCESS && sid && IsValidSid(sid)) { /* Now, verify that the SID matches the current user's */ static PSID current_user_sid; static HANDLE linked_token; From ba64eab5c21d3ba52b0607eab34fa80e1c488a95 Mon Sep 17 00:00:00 2001 From: Rafael Kitover Date: Tue, 12 Apr 2022 19:53:33 +0000 Subject: [PATCH 040/218] mingw: $env:TERM="xterm-256color" for newer OSes For Windows builds >= 15063 set $env:TERM to "xterm-256color" instead of "cygwin" because they have a more capable console system that supports this. Also set $env:COLORTERM="truecolor" if unset. $env:TERM is initialized so that ANSI colors in color.c work, see 29a3963484 (Win32: patch Windows environment on startup, 2012-01-15). See git-for-windows/git#3629 regarding problems caused by always setting $env:TERM="cygwin". This is the same heuristic used by the Cygwin runtime. Signed-off-by: Rafael Kitover Signed-off-by: Johannes Schindelin --- compat/mingw.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index feefa2cd0eb12a..9964ec01beff58 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -3126,9 +3126,20 @@ static void setup_windows_environment(void) convert_slashes(tmp); } - /* simulate TERM to enable auto-color (see color.c) */ - if (!getenv("TERM")) - setenv("TERM", "cygwin", 1); + + /* + * Make sure TERM is set up correctly to enable auto-color + * (see color.c .) 
Use "cygwin" for older OS releases which + * works correctly with MSYS2 utilities on older consoles. + */ + if (!getenv("TERM")) { + if ((GetVersion() >> 16) < 15063) + setenv("TERM", "cygwin", 0); + else { + setenv("TERM", "xterm-256color", 0); + setenv("COLORTERM", "truecolor", 0); + } + } /* calculate HOME if not set */ if (!getenv("HOME")) { From 38442a4a5acc0909dfb74033dbdd11190d675585 Mon Sep 17 00:00:00 2001 From: Christopher Degawa Date: Sat, 28 May 2022 14:53:54 -0500 Subject: [PATCH 041/218] winansi: check result and Buffer before using Name NtQueryObject under Wine can return a success but fill out no name. In those situations, Wine will set Buffer to NULL, and set result to the sizeof(OBJECT_NAME_INFORMATION). Running a command such as echo "$(git.exe --version 2>/dev/null)" will crash due to a NULL pointer dereference when the code attempts to null terminate the buffer, although, weirdly, removing the subshell or redirecting stdout to a file will not trigger the crash. Code has been added to also check Buffer and Length to ensure the check is as robust as possible, since the current behavior is fragile at best and could potentially change in the future. This code is based on the behavior of NtQueryObject under Wine and ReactOS. 
Signed-off-by: Christopher Degawa --- compat/winansi.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/compat/winansi.c b/compat/winansi.c index 3ce190093901b4..601f5c01446e5d 100644 --- a/compat/winansi.c +++ b/compat/winansi.c @@ -546,6 +546,9 @@ static void detect_msys_tty(int fd) if (!NT_SUCCESS(NtQueryObject(h, ObjectNameInformation, buffer, sizeof(buffer) - 2, &result))) return; + if (result < sizeof(*nameinfo) || !nameinfo->Name.Buffer || + !nameinfo->Name.Length) + return; name = nameinfo->Name.Buffer; name[nameinfo->Name.Length / sizeof(*name)] = 0; From 2dd8bc20b3fb76a1422ba31614a104f5f29ec4ef Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E5=AD=99=E5=8D=93=E8=AF=86?= Date: Sun, 16 Jan 2022 03:38:33 +0800 Subject: [PATCH 042/218] Add config option `windows.appendAtomically` MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Atomic append on windows is only supported on local disk files, and it may cause errors in other situations, e.g. network file system. If that is the case, this config option should be used to turn atomic append off. 
Co-Authored-By: Johannes Schindelin Signed-off-by: 孙卓识 Signed-off-by: Johannes Schindelin --- Documentation/config.adoc | 2 ++ Documentation/config/windows.adoc | 4 ++++ compat/mingw.c | 36 ++++++++++++++++++++++++++++--- 3 files changed, 39 insertions(+), 3 deletions(-) create mode 100644 Documentation/config/windows.adoc diff --git a/Documentation/config.adoc b/Documentation/config.adoc index dcea3c0c15e2a9..40c68a1162fd3d 100644 --- a/Documentation/config.adoc +++ b/Documentation/config.adoc @@ -559,4 +559,6 @@ include::config/versionsort.adoc[] include::config/web.adoc[] +include::config/windows.adoc[] + include::config/worktree.adoc[] diff --git a/Documentation/config/windows.adoc b/Documentation/config/windows.adoc new file mode 100644 index 00000000000000..fdaaf1c65504f3 --- /dev/null +++ b/Documentation/config/windows.adoc @@ -0,0 +1,4 @@ +windows.appendAtomically:: + By default, append atomic API is used on windows. But it works only with + local disk files, if you're working on a network file system, you should + set it false to turn it off. diff --git a/compat/mingw.c b/compat/mingw.c index feefa2cd0eb12a..9c89acbca886dc 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -8,6 +8,7 @@ #include "dir.h" #include "environment.h" #include "gettext.h" +#include "repository.h" #include "run-command.h" #include "strbuf.h" #include "symlinks.h" @@ -789,6 +790,7 @@ static int is_local_named_pipe_path(const char *filename) int mingw_open (const char *filename, int oflags, ...) { + static int append_atomically = -1; typedef int (*open_fn_t)(wchar_t const *wfilename, int oflags, ...); va_list args; unsigned mode; @@ -808,7 +810,16 @@ int mingw_open (const char *filename, int oflags, ...) 
return -1; } - if ((oflags & O_APPEND) && !is_local_named_pipe_path(filename)) + /* + * Only set append_atomically to default value(1) when repo is initialized + * and fail to get config value + */ + if (append_atomically < 0 && the_repository && the_repository->commondir && + repo_config_get_bool(the_repository, "windows.appendatomically", &append_atomically)) + append_atomically = 1; + + if (append_atomically && (oflags & O_APPEND) && + !is_local_named_pipe_path(filename)) open_fn = mingw_open_append; else if (!(oflags & ~(O_ACCMODE | O_NOINHERIT))) open_fn = mingw_open_existing; @@ -987,9 +998,28 @@ ssize_t mingw_write(int fd, const void *buf, size_t len) /* check if fd is a pipe */ HANDLE h = (HANDLE) _get_osfhandle(fd); - if (GetFileType(h) != FILE_TYPE_PIPE) + if (GetFileType(h) != FILE_TYPE_PIPE) { + if (orig == EINVAL) { + wchar_t path[MAX_PATH]; + DWORD ret = GetFinalPathNameByHandleW(h, path, + ARRAY_SIZE(path), 0); + UINT drive_type = ret > 0 && ret < ARRAY_SIZE(path) ? + GetDriveTypeW(path) : DRIVE_UNKNOWN; + + /* + * The default atomic append causes such an error on + * network file systems, in such a case, it should be + * turned off via config. + * + * `drive_type` of UNC path: DRIVE_NO_ROOT_DIR + */ + if (DRIVE_NO_ROOT_DIR == drive_type || DRIVE_REMOTE == drive_type) + warning("invalid write operation detected; you may try:\n" + "\n\tgit config windows.appendAtomically false"); + } + errno = orig; - else if (orig == EINVAL) + } else if (orig == EINVAL) errno = EPIPE; else { DWORD buf_size; From 7fa52ab195d18bb6083c31c42309d2ec1c8dc2a3 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 4 Sep 2017 11:59:45 +0200 Subject: [PATCH 043/218] mingw: change core.fsyncObjectFiles = 1 by default MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit From the documentation of said setting: This boolean will enable fsync() when writing object files. 
This is a total waste of time and effort on a filesystem that orders data writes properly, but can be useful for filesystems that do not use journalling (traditional UNIX filesystems) or that only journal metadata and not file contents (OS X’s HFS+, or Linux ext3 with "data=writeback"). The most common file system on Windows (NTFS) does not guarantee that order, therefore a sudden loss of power (or any other event causing an unclean shutdown) would cause corrupt files (i.e. files filled with NULs). Therefore we need to change the default. Note that the documentation makes it sound as if this causes really bad performance. In reality, writing loose objects is something that is done only rarely, and only a handful of files at a time. Signed-off-by: Johannes Schindelin --- compat/mingw.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/compat/mingw.c b/compat/mingw.c index 9c89acbca886dc..b715ca53a1ca16 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -16,6 +16,7 @@ #include "win32.h" #include "win32/lazyload.h" #include "wrapper.h" +#include "write-or-die.h" #include #include #include @@ -3659,6 +3660,7 @@ int wmain(int argc, const wchar_t **wargv) maybe_redirect_std_handles(); adjust_symlink_flags(); + fsync_object_files = 1; /* determine size of argv and environ conversion buffer */ maxlen = wcslen(wargv[0]); From 07572a1590ca73890a04b2da04a621bba4c92a35 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= Date: Sun, 10 Jul 2022 11:27:25 +0200 Subject: [PATCH 044/218] MinGW: link as terminal server aware MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit With Windows 2000, Microsoft introduced a flag to the PE header to mark executables as "terminal server aware". Windows terminal servers provide a redirected Windows directory and redirected registry hives when launching legacy applications without this flag set.
Since we do not use any INI files in the Windows directory and don't write to the registry, we don't need this additional preparation. Telling the OS that we don't need this should provide slightly improved startup times in terminal server environments. When building for supported Windows Versions with MSVC the /TSAWARE linker flag is automatically set, but MinGW requires us to set the --tsaware flag manually. This partially addresses https://github.com/git-for-windows/git/issues/3935. Signed-off-by: Matthias Aßhauer --- config.mak.uname | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/config.mak.uname b/config.mak.uname index 5feb5825587e65..bbafcc816dc0d4 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -707,7 +707,7 @@ ifeq ($(uname_S),MINGW) DEFAULT_HELP_FORMAT = html HAVE_PLATFORM_PROCINFO = YesPlease CSPRNG_METHOD = rtlgenrandom - BASIC_LDFLAGS += -municode + BASIC_LDFLAGS += -municode -Wl,--tsaware COMPAT_CFLAGS += -DNOGDI -Icompat -Icompat/win32 COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\" COMPAT_OBJS += compat/mingw.o compat/winansi.o \ From 4fe4648c3d89aa7964213a563b7292ec641eacaf Mon Sep 17 00:00:00 2001 From: Kiel Hurley Date: Wed, 2 Nov 2022 22:56:16 +1300 Subject: [PATCH 045/218] Fix Windows version resources Add FileVersion, which is a required field As not all required fields were present, none were being included Fixes #4090 Signed-off-by: Kiel Hurley --- git.rc.in | 1 + 1 file changed, 1 insertion(+) diff --git a/git.rc.in b/git.rc.in index e69444eef3f0c5..460ea39561b87f 100644 --- a/git.rc.in +++ b/git.rc.in @@ -12,6 +12,7 @@ BEGIN VALUE "OriginalFilename", "git.exe\0" VALUE "ProductName", "Git\0" VALUE "ProductVersion", "@GIT_VERSION@\0" + VALUE "FileVersion", "@GIT_VERSION@\0" END END From 63fa02a71cc0e5b271a85d9eecf0478a42edba59 Mon Sep 17 00:00:00 2001 From: Andrey Zabavnikov Date: Fri, 28 Oct 2022 17:12:06 +0300 Subject: [PATCH 046/218] status: fix for old-style submodules with commondir In f9b7573f6b00 (repository: 
free fields before overwriting them, 2017-09-05), Git was taught to release memory before overwriting it, but 357a03ebe9e0 (repository.c: move env-related setup code back to environment.c, 2018-03-03) changed the code so that it would not _always_ be overwritten. As a consequence, the `commondir` attribute would point to already-free()d memory. This seems not to cause problems in core Git, but there are add-on patches in Git for Windows where the `commondir` attribute is subsequently used, causing invalid memory accesses, e.g. in setups containing old-style submodules (i.e. the ones with a `.git` directory within their worktrees) that have `commondir` configured. This fixes https://github.com/git-for-windows/git/pull/4083. Signed-off-by: Andrey Zabavnikov Signed-off-by: Johannes Schindelin --- repository.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/repository.c b/repository.c index 9e5537f53961ed..2a520c46573c3e 100644 --- a/repository.c +++ b/repository.c @@ -153,7 +153,7 @@ static void repo_set_commondir(struct repository *repo, { struct strbuf sb = STRBUF_INIT; - free(repo->commondir); + FREE_AND_NULL(repo->commondir); if (commondir) { repo->different_commondir = 1; From bbde338d55cb535ad69d6c5c7bf5dcb04c4caf9e Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sat, 6 May 2023 22:26:15 +0200 Subject: [PATCH 047/218] http: optionally load libcurl lazily This compile-time option allows asking Git to load libcurl dynamically at runtime. Together with a follow-up patch that optionally overrides the file name depending on the `http.sslBackend` setting, this kicks open the door for installing multiple libcurl flavors side by side, and loading the one corresponding to the (runtime-)configured SSL/TLS backend.
Signed-off-by: Johannes Schindelin --- Makefile | 28 +++- compat/lazyload-curl.c | 364 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 385 insertions(+), 7 deletions(-) create mode 100644 compat/lazyload-curl.c diff --git a/Makefile b/Makefile index cedc234173e377..91c22e9ae05c2f 100644 --- a/Makefile +++ b/Makefile @@ -483,6 +483,11 @@ include shared.mak # # CURL_LDFLAGS=-lcurl # +# Define LAZYLOAD_LIBCURL to dynamically load the libcurl; This can be useful +# if Multiple libcurl versions exist (with different file names) that link to +# various SSL/TLS backends, to support the `http.sslBackend` runtime switch in +# such a scenario. +# # === Optional library: libpcre2 === # # Define USE_LIBPCRE if you have and want to use libpcre. Various @@ -1788,10 +1793,19 @@ else CURL_LIBCURL = endif - ifndef CURL_LDFLAGS - CURL_LDFLAGS = $(eval CURL_LDFLAGS := $$(shell $$(CURL_CONFIG) --libs))$(CURL_LDFLAGS) + ifdef LAZYLOAD_LIBCURL + LAZYLOAD_LIBCURL_OBJ = compat/lazyload-curl.o + OBJECTS += $(LAZYLOAD_LIBCURL_OBJ) + # The `CURL_STATICLIB` constant must be defined to avoid seeing the functions + # declared as DLL imports + CURL_CFLAGS = -DCURL_STATICLIB + CURL_LIBCURL = -ldl + else + ifndef CURL_LDFLAGS + CURL_LDFLAGS = $(eval CURL_LDFLAGS := $$(shell $$(CURL_CONFIG) --libs))$(CURL_LDFLAGS) + endif + CURL_LIBCURL += $(CURL_LDFLAGS) endif - CURL_LIBCURL += $(CURL_LDFLAGS) ifndef CURL_CFLAGS CURL_CFLAGS = $(eval CURL_CFLAGS := $$(shell $$(CURL_CONFIG) --cflags))$(CURL_CFLAGS) @@ -1812,7 +1826,7 @@ else endif ifdef USE_CURL_FOR_IMAP_SEND BASIC_CFLAGS += -DUSE_CURL_FOR_IMAP_SEND - IMAP_SEND_BUILDDEPS = http.o + IMAP_SEND_BUILDDEPS = http.o $(LAZYLOAD_LIBCURL_OBJ) IMAP_SEND_LDFLAGS += $(CURL_LIBCURL) endif ifndef NO_EXPAT @@ -3003,10 +3017,10 @@ git-imap-send$X: imap-send.o $(IMAP_SEND_BUILDDEPS) GIT-LDFLAGS $(GITLIBS) $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \ $(IMAP_SEND_LDFLAGS) $(LIBS) -git-http-fetch$X: http.o http-walker.o 
http-fetch.o GIT-LDFLAGS $(GITLIBS) +git-http-fetch$X: http.o http-walker.o http-fetch.o $(LAZYLOAD_LIBCURL_OBJ) GIT-LDFLAGS $(GITLIBS) $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \ $(CURL_LIBCURL) $(LIBS) -git-http-push$X: http.o http-push.o GIT-LDFLAGS $(GITLIBS) +git-http-push$X: http.o http-push.o $(LAZYLOAD_LIBCURL_OBJ) GIT-LDFLAGS $(GITLIBS) $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \ $(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS) @@ -3016,7 +3030,7 @@ $(REMOTE_CURL_ALIASES): $(REMOTE_CURL_PRIMARY) ln -s $< $@ 2>/dev/null || \ cp $< $@ -$(REMOTE_CURL_PRIMARY): remote-curl.o http.o http-walker.o GIT-LDFLAGS $(GITLIBS) +$(REMOTE_CURL_PRIMARY): remote-curl.o http.o http-walker.o $(LAZYLOAD_LIBCURL_OBJ) GIT-LDFLAGS $(GITLIBS) $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \ $(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS) diff --git a/compat/lazyload-curl.c b/compat/lazyload-curl.c new file mode 100644 index 00000000000000..f4e08f76dfcd7f --- /dev/null +++ b/compat/lazyload-curl.c @@ -0,0 +1,364 @@ +#include "../git-compat-util.h" +#include "../git-curl-compat.h" +#include <dlfcn.h> + +/* + * The ABI version of libcurl is encoded in its shared libraries' file names. + * This ABI version has not changed since October 2006 and is unlikely to be + * changed in the future. See https://curl.se/libcurl/abi.html for details. + */ +#define LIBCURL_ABI_VERSION "4" + +typedef void (*func_t)(void); + +#ifdef __APPLE__ +#define LIBCURL_FILE_NAME(base) base "." LIBCURL_ABI_VERSION ".dylib" +#else +#define LIBCURL_FILE_NAME(base) base ".so." LIBCURL_ABI_VERSION +#endif + +static void *load_library(const char *name) +{ + return dlopen(name, RTLD_LAZY); +} + +static func_t load_function(void *handle, const char *name) +{ + /* + * Casting the return value of `dlsym()` to a function pointer is + * explicitly allowed in recent POSIX standards, but GCC complains + * about this in pedantic mode nevertheless.
For more about this issue, + * see https://stackoverflow.com/q/31526876/1860823 and + * http://stackoverflow.com/a/36385690/1905491. + */ + func_t f; + *(void **)&f = dlsym(handle, name); + return f; +} + +typedef struct curl_version_info_data *(*curl_version_info_type)(CURLversion version); +static curl_version_info_type curl_version_info_func; + +typedef char *(*curl_easy_escape_type)(CURL *handle, const char *string, int length); +static curl_easy_escape_type curl_easy_escape_func; + +typedef void (*curl_free_type)(void *p); +static curl_free_type curl_free_func; + +typedef CURLcode (*curl_global_init_type)(long flags); +static curl_global_init_type curl_global_init_func; + +typedef CURLsslset (*curl_global_sslset_type)(curl_sslbackend id, const char *name, const curl_ssl_backend ***avail); +static curl_global_sslset_type curl_global_sslset_func; + +typedef void (*curl_global_cleanup_type)(void); +static curl_global_cleanup_type curl_global_cleanup_func; + +typedef CURLcode (*curl_global_trace_type)(const char *config); +static curl_global_trace_type curl_global_trace_func; + +typedef struct curl_slist *(*curl_slist_append_type)(struct curl_slist *list, const char *data); +static curl_slist_append_type curl_slist_append_func; + +typedef void (*curl_slist_free_all_type)(struct curl_slist *list); +static curl_slist_free_all_type curl_slist_free_all_func; + +typedef const char *(*curl_easy_strerror_type)(CURLcode error); +static curl_easy_strerror_type curl_easy_strerror_func; + +typedef CURLM *(*curl_multi_init_type)(void); +static curl_multi_init_type curl_multi_init_func; + +typedef CURLMcode (*curl_multi_add_handle_type)(CURLM *multi_handle, CURL *curl_handle); +static curl_multi_add_handle_type curl_multi_add_handle_func; + +typedef CURLMcode (*curl_multi_remove_handle_type)(CURLM *multi_handle, CURL *curl_handle); +static curl_multi_remove_handle_type curl_multi_remove_handle_func; + +typedef CURLMcode (*curl_multi_fdset_type)(CURLM *multi_handle, fd_set 
*read_fd_set, fd_set *write_fd_set, fd_set *exc_fd_set, int *max_fd); +static curl_multi_fdset_type curl_multi_fdset_func; + +typedef CURLMcode (*curl_multi_perform_type)(CURLM *multi_handle, int *running_handles); +static curl_multi_perform_type curl_multi_perform_func; + +typedef CURLMcode (*curl_multi_cleanup_type)(CURLM *multi_handle); +static curl_multi_cleanup_type curl_multi_cleanup_func; + +typedef CURLMsg *(*curl_multi_info_read_type)(CURLM *multi_handle, int *msgs_in_queue); +static curl_multi_info_read_type curl_multi_info_read_func; + +typedef const char *(*curl_multi_strerror_type)(CURLMcode error); +static curl_multi_strerror_type curl_multi_strerror_func; + +typedef CURLMcode (*curl_multi_timeout_type)(CURLM *multi_handle, long *milliseconds); +static curl_multi_timeout_type curl_multi_timeout_func; + +typedef CURL *(*curl_easy_init_type)(void); +static curl_easy_init_type curl_easy_init_func; + +typedef CURLcode (*curl_easy_perform_type)(CURL *curl); +static curl_easy_perform_type curl_easy_perform_func; + +typedef void (*curl_easy_cleanup_type)(CURL *curl); +static curl_easy_cleanup_type curl_easy_cleanup_func; + +typedef CURL *(*curl_easy_duphandle_type)(CURL *curl); +static curl_easy_duphandle_type curl_easy_duphandle_func; + +typedef CURLcode (*curl_easy_getinfo_long_type)(CURL *curl, CURLINFO info, long *value); +static curl_easy_getinfo_long_type curl_easy_getinfo_long_func; + +typedef CURLcode (*curl_easy_getinfo_pointer_type)(CURL *curl, CURLINFO info, void **value); +static curl_easy_getinfo_pointer_type curl_easy_getinfo_pointer_func; + +typedef CURLcode (*curl_easy_getinfo_off_t_type)(CURL *curl, CURLINFO info, curl_off_t *value); +static curl_easy_getinfo_off_t_type curl_easy_getinfo_off_t_func; + +typedef CURLcode (*curl_easy_setopt_long_type)(CURL *curl, CURLoption opt, long value); +static curl_easy_setopt_long_type curl_easy_setopt_long_func; + +typedef CURLcode (*curl_easy_setopt_pointer_type)(CURL *curl, CURLoption opt, void 
*value); +static curl_easy_setopt_pointer_type curl_easy_setopt_pointer_func; + +typedef CURLcode (*curl_easy_setopt_off_t_type)(CURL *curl, CURLoption opt, curl_off_t value); +static curl_easy_setopt_off_t_type curl_easy_setopt_off_t_func; + +static void lazy_load_curl(void) +{ + static int initialized; + void *libcurl; + func_t curl_easy_getinfo_func, curl_easy_setopt_func; + + if (initialized) + return; + + initialized = 1; + libcurl = load_library(LIBCURL_FILE_NAME("libcurl")); + if (!libcurl) + die("failed to load library '%s'", LIBCURL_FILE_NAME("libcurl")); + + curl_version_info_func = (curl_version_info_type)load_function(libcurl, "curl_version_info"); + curl_easy_escape_func = (curl_easy_escape_type)load_function(libcurl, "curl_easy_escape"); + curl_free_func = (curl_free_type)load_function(libcurl, "curl_free"); + curl_global_init_func = (curl_global_init_type)load_function(libcurl, "curl_global_init"); + curl_global_sslset_func = (curl_global_sslset_type)load_function(libcurl, "curl_global_sslset"); + curl_global_cleanup_func = (curl_global_cleanup_type)load_function(libcurl, "curl_global_cleanup"); + curl_global_trace_func = (curl_global_trace_type)load_function(libcurl, "curl_global_trace"); + curl_slist_append_func = (curl_slist_append_type)load_function(libcurl, "curl_slist_append"); + curl_slist_free_all_func = (curl_slist_free_all_type)load_function(libcurl, "curl_slist_free_all"); + curl_easy_strerror_func = (curl_easy_strerror_type)load_function(libcurl, "curl_easy_strerror"); + curl_multi_init_func = (curl_multi_init_type)load_function(libcurl, "curl_multi_init"); + curl_multi_add_handle_func = (curl_multi_add_handle_type)load_function(libcurl, "curl_multi_add_handle"); + curl_multi_remove_handle_func = (curl_multi_remove_handle_type)load_function(libcurl, "curl_multi_remove_handle"); + curl_multi_fdset_func = (curl_multi_fdset_type)load_function(libcurl, "curl_multi_fdset"); + curl_multi_perform_func = 
(curl_multi_perform_type)load_function(libcurl, "curl_multi_perform"); + curl_multi_cleanup_func = (curl_multi_cleanup_type)load_function(libcurl, "curl_multi_cleanup"); + curl_multi_info_read_func = (curl_multi_info_read_type)load_function(libcurl, "curl_multi_info_read"); + curl_multi_strerror_func = (curl_multi_strerror_type)load_function(libcurl, "curl_multi_strerror"); + curl_multi_timeout_func = (curl_multi_timeout_type)load_function(libcurl, "curl_multi_timeout"); + curl_easy_init_func = (curl_easy_init_type)load_function(libcurl, "curl_easy_init"); + curl_easy_perform_func = (curl_easy_perform_type)load_function(libcurl, "curl_easy_perform"); + curl_easy_cleanup_func = (curl_easy_cleanup_type)load_function(libcurl, "curl_easy_cleanup"); + curl_easy_duphandle_func = (curl_easy_duphandle_type)load_function(libcurl, "curl_easy_duphandle"); + + curl_easy_getinfo_func = load_function(libcurl, "curl_easy_getinfo"); + curl_easy_getinfo_long_func = (curl_easy_getinfo_long_type)curl_easy_getinfo_func; + curl_easy_getinfo_pointer_func = (curl_easy_getinfo_pointer_type)curl_easy_getinfo_func; + curl_easy_getinfo_off_t_func = (curl_easy_getinfo_off_t_type)curl_easy_getinfo_func; + + curl_easy_setopt_func = load_function(libcurl, "curl_easy_setopt"); + curl_easy_setopt_long_func = (curl_easy_setopt_long_type)curl_easy_setopt_func; + curl_easy_setopt_pointer_func = (curl_easy_setopt_pointer_type)curl_easy_setopt_func; + curl_easy_setopt_off_t_func = (curl_easy_setopt_off_t_type)curl_easy_setopt_func; +} + +struct curl_version_info_data *curl_version_info(CURLversion version) +{ + lazy_load_curl(); + return curl_version_info_func(version); +} + +char *curl_easy_escape(CURL *handle, const char *string, int length) +{ + lazy_load_curl(); + return curl_easy_escape_func(handle, string, length); +} + +void curl_free(void *p) +{ + lazy_load_curl(); + curl_free_func(p); +} + +CURLcode curl_global_init(long flags) +{ + lazy_load_curl(); + return curl_global_init_func(flags); +} + 
+CURLsslset curl_global_sslset(curl_sslbackend id, const char *name, const curl_ssl_backend ***avail) +{ + lazy_load_curl(); + return curl_global_sslset_func(id, name, avail); +} + +void curl_global_cleanup(void) +{ + lazy_load_curl(); + curl_global_cleanup_func(); +} + +CURLcode curl_global_trace(const char *config) +{ + lazy_load_curl(); + return curl_global_trace_func(config); +} + +struct curl_slist *curl_slist_append(struct curl_slist *list, const char *data) +{ + lazy_load_curl(); + return curl_slist_append_func(list, data); +} + +void curl_slist_free_all(struct curl_slist *list) +{ + lazy_load_curl(); + curl_slist_free_all_func(list); +} + +const char *curl_easy_strerror(CURLcode error) +{ + lazy_load_curl(); + return curl_easy_strerror_func(error); +} + +CURLM *curl_multi_init(void) +{ + lazy_load_curl(); + return curl_multi_init_func(); +} + +CURLMcode curl_multi_add_handle(CURLM *multi_handle, CURL *curl_handle) +{ + lazy_load_curl(); + return curl_multi_add_handle_func(multi_handle, curl_handle); +} + +CURLMcode curl_multi_remove_handle(CURLM *multi_handle, CURL *curl_handle) +{ + lazy_load_curl(); + return curl_multi_remove_handle_func(multi_handle, curl_handle); +} + +CURLMcode curl_multi_fdset(CURLM *multi_handle, fd_set *read_fd_set, fd_set *write_fd_set, fd_set *exc_fd_set, int *max_fd) +{ + lazy_load_curl(); + return curl_multi_fdset_func(multi_handle, read_fd_set, write_fd_set, exc_fd_set, max_fd); +} + +CURLMcode curl_multi_perform(CURLM *multi_handle, int *running_handles) +{ + lazy_load_curl(); + return curl_multi_perform_func(multi_handle, running_handles); +} + +CURLMcode curl_multi_cleanup(CURLM *multi_handle) +{ + lazy_load_curl(); + return curl_multi_cleanup_func(multi_handle); +} + +CURLMsg *curl_multi_info_read(CURLM *multi_handle, int *msgs_in_queue) +{ + lazy_load_curl(); + return curl_multi_info_read_func(multi_handle, msgs_in_queue); +} + +const char *curl_multi_strerror(CURLMcode error) +{ + lazy_load_curl(); + return 
curl_multi_strerror_func(error); +} + +CURLMcode curl_multi_timeout(CURLM *multi_handle, long *milliseconds) +{ + lazy_load_curl(); + return curl_multi_timeout_func(multi_handle, milliseconds); +} + +CURL *curl_easy_init(void) +{ + lazy_load_curl(); + return curl_easy_init_func(); +} + +CURLcode curl_easy_perform(CURL *curl) +{ + lazy_load_curl(); + return curl_easy_perform_func(curl); +} + +void curl_easy_cleanup(CURL *curl) +{ + lazy_load_curl(); + curl_easy_cleanup_func(curl); +} + +CURL *curl_easy_duphandle(CURL *curl) +{ + lazy_load_curl(); + return curl_easy_duphandle_func(curl); +} + +#ifndef CURL_IGNORE_DEPRECATION +#define CURL_IGNORE_DEPRECATION(x) x +#endif + +#ifndef CURLOPTTYPE_BLOB +#define CURLOPTTYPE_BLOB 40000 +#endif + +#undef curl_easy_getinfo +CURLcode curl_easy_getinfo(CURL *curl, CURLINFO info, ...) +{ + va_list ap; + CURLcode res; + + va_start(ap, info); + lazy_load_curl(); + CURL_IGNORE_DEPRECATION( + if (info >= CURLINFO_LONG && info < CURLINFO_DOUBLE) + res = curl_easy_getinfo_long_func(curl, info, va_arg(ap, long *)); + else if ((info >= CURLINFO_STRING && info < CURLINFO_LONG) || + (info >= CURLINFO_SLIST && info < CURLINFO_SOCKET)) + res = curl_easy_getinfo_pointer_func(curl, info, va_arg(ap, void **)); + else if (info >= CURLINFO_OFF_T) + res = curl_easy_getinfo_off_t_func(curl, info, va_arg(ap, curl_off_t *)); + else + die("%s:%d: TODO (info: %d)!", __FILE__, __LINE__, info); + ) + va_end(ap); + return res; +} + +#undef curl_easy_setopt +CURLcode curl_easy_setopt(CURL *curl, CURLoption opt, ...) 
+{ + va_list ap; + CURLcode res; + + va_start(ap, opt); + lazy_load_curl(); + CURL_IGNORE_DEPRECATION( + if (opt >= CURLOPTTYPE_LONG && opt < CURLOPTTYPE_OBJECTPOINT) + res = curl_easy_setopt_long_func(curl, opt, va_arg(ap, long)); + else if (opt >= CURLOPTTYPE_OBJECTPOINT && opt < CURLOPTTYPE_OFF_T) + res = curl_easy_setopt_pointer_func(curl, opt, va_arg(ap, void *)); + else if (opt >= CURLOPTTYPE_OFF_T && opt < CURLOPTTYPE_BLOB) + res = curl_easy_setopt_off_t_func(curl, opt, va_arg(ap, curl_off_t)); + else + die("%s:%d: TODO (opt: %d)!", __FILE__, __LINE__, opt); + ) + va_end(ap); + return res; +} From 05760514b1e83cd55931f85ef86bed291bb45230 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sun, 7 May 2023 22:51:52 +0200 Subject: [PATCH 048/218] http: support lazy-loading libcurl also on Windows This implements the Windows-specific support code, because everything is slightly different on Windows, even loading shared libraries. Note: I specifically do _not_ use the code from `compat/win32/lazyload.h` here because that code is optimized for loading individual functions from various system DLLs, while we specifically want to load _many_ functions from _one_ DLL here, and distinctly not a system DLL (we expect libcurl to be located outside `C:\Windows\system32`, something `INIT_PROC_ADDR` refuses to work with). Also, the `curl_easy_getinfo()`/`curl_easy_setopt()` functions are declared as vararg functions, which `lazyload.h` cannot handle. Finally, we are about to optionally override the exact file name that is to be loaded, which is a goal contrary to `lazyload.h`'s design. 
Signed-off-by: Johannes Schindelin --- Makefile | 4 ++++ compat/lazyload-curl.c | 52 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 56 insertions(+) diff --git a/Makefile b/Makefile index 91c22e9ae05c2f..df898b2acdca89 100644 --- a/Makefile +++ b/Makefile @@ -1799,7 +1799,11 @@ else # The `CURL_STATICLIB` constant must be defined to avoid seeing the functions # declared as DLL imports CURL_CFLAGS = -DCURL_STATICLIB +ifneq ($(uname_S),MINGW) +ifneq ($(uname_S),Windows) CURL_LIBCURL = -ldl +endif +endif else ifndef CURL_LDFLAGS CURL_LDFLAGS = $(eval CURL_LDFLAGS := $$(shell $$(CURL_CONFIG) --libs))$(CURL_LDFLAGS) diff --git a/compat/lazyload-curl.c b/compat/lazyload-curl.c index f4e08f76dfcd7f..82ab11de43a0fb 100644 --- a/compat/lazyload-curl.c +++ b/compat/lazyload-curl.c @@ -1,6 +1,8 @@ #include "../git-compat-util.h" #include "../git-curl-compat.h" +#ifndef WIN32 #include <dlfcn.h> +#endif /* * The ABI version of libcurl is encoded in its shared libraries' file names. @@ -11,6 +13,7 @@ typedef void (*func_t)(void); +#ifndef WIN32 #ifdef __APPLE__ #define LIBCURL_FILE_NAME(base) base "." LIBCURL_ABI_VERSION ".dylib" #else @@ -35,6 +38,55 @@ static func_t load_function(void *handle, const char *name) *(void **)&f = dlsym(handle, name); return f; } +#else +#define LIBCURL_FILE_NAME(base) base "-" LIBCURL_ABI_VERSION ".dll" + +static void *load_library(const char *name) +{ + size_t name_size = strlen(name) + 1; + const char *path = getenv("PATH"); + char dll_path[MAX_PATH]; + + while (path && *path) { + const char *sep = strchrnul(path, ';'); + size_t len = sep - path; + + if (len && len + name_size < sizeof(dll_path)) { + memcpy(dll_path, path, len); + dll_path[len] = '/'; + memcpy(dll_path + len + 1, name, name_size); + + if (!access(dll_path, R_OK)) { + wchar_t wpath[MAX_PATH]; + int wlen = MultiByteToWideChar(CP_UTF8, 0, dll_path, -1, wpath, ARRAY_SIZE(wpath)); + void *res = wlen ?
(void *)LoadLibraryExW(wpath, NULL, 0) : NULL; + if (!res) { + DWORD err = GetLastError(); + char buf[1024]; + + if (!FormatMessageA(FORMAT_MESSAGE_FROM_SYSTEM | + FORMAT_MESSAGE_ARGUMENT_ARRAY | + FORMAT_MESSAGE_IGNORE_INSERTS, + NULL, err, LANG_NEUTRAL, + buf, sizeof(buf) - 1, NULL)) + xsnprintf(buf, sizeof(buf), "last error: %ld", err); + error("LoadLibraryExW() failed with: %s", buf); + } + return res; + } + } + + path = *sep ? sep + 1 : NULL; + } + + return NULL; +} + +static func_t load_function(void *handle, const char *name) +{ + return (func_t)GetProcAddress((HANDLE)handle, name); +} +#endif typedef struct curl_version_info_data *(*curl_version_info_type)(CURLversion version); static curl_version_info_type curl_version_info_func; From f2d02f888173fcaa4d0c36042668e558043ba30d Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sun, 7 May 2023 22:05:33 +0200 Subject: [PATCH 049/218] http: when loading libcurl lazily, allow for multiple SSL backends The previous commits introduced a compile-time option to load libcurl lazily, but it uses the hard-coded name "libcurl-4.dll" (or equivalent on platforms other than Windows). To allow for installing multiple libcurl flavors side by side, where each supports one specific SSL/TLS backend, let's first look whether `libcurl-<backend>-4.dll` exists, and only use `libcurl-4.dll` as a fallback. That will allow us to ship with a libcurl by default that only supports the Secure Channel backend for the `https://` protocol. This libcurl won't suffer from any dependency problem when upgrading OpenSSL to a new major version (which will change the DLL name, and hence break every program and library that depends on it). This is crucial because Git for Windows relies on libcurl to keep working when building and deploying a new OpenSSL package because that library is used by `git fetch` and `git clone`. Note that this feature is by no means specific to Windows.
On Ubuntu, for example, a `git` built using `LAZYLOAD_LIBCURL` will use `libcurl.so.4` for `http.sslbackend=openssl` and `libcurl-gnutls.so.4` for `http.sslbackend=gnutls`. Signed-off-by: Johannes Schindelin --- compat/lazyload-curl.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/compat/lazyload-curl.c b/compat/lazyload-curl.c index 82ab11de43a0fb..a6a3f7e3a7aeaa 100644 --- a/compat/lazyload-curl.c +++ b/compat/lazyload-curl.c @@ -175,17 +175,26 @@ static curl_easy_setopt_pointer_type curl_easy_setopt_pointer_func; typedef CURLcode (*curl_easy_setopt_off_t_type)(CURL *curl, CURLoption opt, curl_off_t value); static curl_easy_setopt_off_t_type curl_easy_setopt_off_t_func; +static char ssl_backend[64]; + static void lazy_load_curl(void) { static int initialized; - void *libcurl; + void *libcurl = NULL; func_t curl_easy_getinfo_func, curl_easy_setopt_func; if (initialized) return; initialized = 1; - libcurl = load_library(LIBCURL_FILE_NAME("libcurl")); + if (ssl_backend[0]) { + char dll_name[64 + 16]; + snprintf(dll_name, sizeof(dll_name) - 1, + LIBCURL_FILE_NAME("libcurl-%s"), ssl_backend); + libcurl = load_library(dll_name); + } + if (!libcurl) + libcurl = load_library(LIBCURL_FILE_NAME("libcurl")); if (!libcurl) die("failed to load library '%s'", LIBCURL_FILE_NAME("libcurl")); @@ -250,6 +259,9 @@ CURLcode curl_global_init(long flags) CURLsslset curl_global_sslset(curl_sslbackend id, const char *name, const curl_ssl_backend ***avail) { + if (name && strlen(name) < sizeof(ssl_backend)) + strlcpy(ssl_backend, name, sizeof(ssl_backend)); + lazy_load_curl(); return curl_global_sslset_func(id, name, avail); } From fee35b1c43d7f13c38722b6eb4ea5c274763bc5e Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sun, 7 May 2023 22:43:37 +0200 Subject: [PATCH 050/218] mingw: do load libcurl dynamically by default This will help with Git for Windows' maintenance going forward: It allows Git for Windows to switch its primary libcurl to a
variant without the OpenSSL backend, while still loading an alternate when setting `http.sslBackend = openssl`. This is necessary to avoid maintenance headaches with upgrading OpenSSL: its major version name is encoded in the shared library's file name and hence major version updates (temporarily) break libraries that are linked against the OpenSSL library. Signed-off-by: Johannes Schindelin --- config.mak.uname | 1 + 1 file changed, 1 insertion(+) diff --git a/config.mak.uname b/config.mak.uname index bbafcc816dc0d4..3cbc1c4a189fae 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -708,6 +708,7 @@ ifeq ($(uname_S),MINGW) HAVE_PLATFORM_PROCINFO = YesPlease CSPRNG_METHOD = rtlgenrandom BASIC_LDFLAGS += -municode -Wl,--tsaware + LAZYLOAD_LIBCURL = YesDoThatPlease COMPAT_CFLAGS += -DNOGDI -Icompat -Icompat/win32 COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\" COMPAT_OBJS += compat/mingw.o compat/winansi.o \ From e6a6ebcb1b905a7da0852a882c5ecc4c0e50d576 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 2 Nov 2022 16:23:58 +0100 Subject: [PATCH 051/218] Add a GitHub workflow to verify that Git/Scalar work in Nano Server In Git for Windows v2.39.0, we fixed a regression where `git.exe` would no longer work in Windows Nano Server (frequently used in Docker containers). This GitHub workflow can be used to verify manually that the Git/Scalar executables work in Nano Server. 
Signed-off-by: Johannes Schindelin --- .github/workflows/nano-server.yml | 76 +++++++++++++++++++++++++++++++ 1 file changed, 76 insertions(+) create mode 100644 .github/workflows/nano-server.yml diff --git a/.github/workflows/nano-server.yml b/.github/workflows/nano-server.yml new file mode 100644 index 00000000000000..a9cf026efeb2a6 --- /dev/null +++ b/.github/workflows/nano-server.yml @@ -0,0 +1,76 @@ +name: Windows Nano Server tests + +on: + workflow_dispatch: + +env: + DEVELOPER: 1 + +jobs: + test-nano-server: + runs-on: windows-2022 + env: + WINDBG_DIR: "C:/Program Files (x86)/Windows Kits/10/Debuggers/x64" + IMAGE: mcr.microsoft.com/powershell:nanoserver-ltsc2022 + + steps: + - uses: actions/checkout@v6 + - uses: git-for-windows/setup-git-for-windows-sdk@v2 + - name: build Git + shell: bash + run: make -j15 + - name: pull nanoserver image + shell: bash + run: docker pull $IMAGE + - name: run nano-server test + shell: bash + run: | + docker run \ + --user "ContainerAdministrator" \ + -v "$WINDBG_DIR:C:/dbg" \ + -v "$(cygpath -aw /mingw64/bin):C:/mingw64-bin" \ + -v "$(cygpath -aw .):C:/test" \ + $IMAGE pwsh.exe -Command ' + # Extend the PATH to include the `.dll` files in /mingw64/bin/ + $env:PATH += ";C:\mingw64-bin" + + # For each executable to test pick some no-operation set of + # flags/subcommands or something that should quickly result in an + # error with known exit code that is not a negative 32-bit + # number, and set the expected return code appropriately. + # + # Only test executables that could be expected to run in a UI + # less environment. 
+ # + # ( Executable path, arguments, expected return code ) + # also note space is required before close parenthesis (a + # powershell quirk when defining nested arrays like this) + + $executables_to_test = @( + ("C:\test\git.exe", "", 1 ), + ("C:\test\scalar.exe", "version", 0 ) + ) + + foreach ($executable in $executables_to_test) + { + Write-Output "Now testing $($executable[0])" + &$executable[0] $executable[1] + if ($LASTEXITCODE -ne $executable[2]) { + # if we failed, run the debugger to find out what function + # or DLL could not be found and then exit the script with + # failure The missing DLL or EXE will be referenced near + # the end of the output + + # Set a flag to have the debugger show loader stub + # diagnostics. This requires running as administrator, + # otherwise the flag will be ignored. + C:\dbg\gflags -i $executable[0] +SLS + + C:\dbg\cdb.exe -c "g" -c "q" $executable[0] $executable[1] + + exit 1 + } + } + + exit 0 + ' From 1990147f3776adab2c1656a9864b3f267fdf31d6 Mon Sep 17 00:00:00 2001 From: David Lomas Date: Fri, 28 Jul 2023 15:31:25 +0100 Subject: [PATCH 052/218] mingw: suggest `windows.appendAtomically` in more cases When running Git for Windows on a remote APFS filesystem, it would appear that the `mingw_open_append()`/`write()` combination would fail almost exactly like on some CIFS-mounted shares as had been reported in https://github.com/git-for-windows/git/issues/2753, albeit with a different `errno` value. Let's handle that `errno` value just the same, by suggesting to set `windows.appendAtomically=false`. 
Signed-off-by: David Lomas Signed-off-by: Johannes Schindelin --- compat/mingw.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 9c89acbca886dc..172bac7ed80d09 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -993,7 +993,7 @@ ssize_t mingw_write(int fd, const void *buf, size_t len) { ssize_t result = write(fd, buf, len); - if (result < 0 && (errno == EINVAL || errno == ENOSPC) && buf) { + if (result < 0 && (errno == EINVAL || errno == EBADF || errno == ENOSPC) && buf) { int orig = errno; /* check if fd is a pipe */ @@ -1019,7 +1019,7 @@ ssize_t mingw_write(int fd, const void *buf, size_t len) } errno = orig; - } else if (orig == EINVAL) + } else if (orig == EINVAL || orig == EBADF) errno = EPIPE; else { DWORD buf_size; From 37612e1a15e496213caa9c7f0152919615dd397a Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 22 Nov 2023 22:57:38 +0100 Subject: [PATCH 053/218] win32: use native ANSI sequence processing, if possible Windows 10 version 1511 (also known as November Update), according to https://learn.microsoft.com/en-us/windows/console/console-virtual-terminal-sequences introduced native support for ANSI sequence processing. This allows using colors from the entire 24-bit color range. All we need to do is test whether the console's "virtual processing support" can be enabled. If it can, we do not even need to start the `console_thread` to handle ANSI sequences. Or, almost all we need to do: When `console_thread()` does its work, it uses the Unicode-aware `write_console()` function to write to the Win32 Console, which supports Git for Windows' implicit convention that all text that is written is encoded in UTF-8. The same is not necessarily true if native ANSI sequence processing is used, as the output is then subject to the current code page. Let's ensure that the code page is set to `CP_UTF8` as long as Git writes to it.
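For illustration only (this sketch is not part of the patch, and `format_truecolor_fg()` is a made-up helper name), a 24-bit foreground color is selected with the `38;2;<r>;<g>;<b>` form of the SGR sequence, assuming the console honors the sequences described on the page linked above:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of Git: format the SGR sequence that
 * selects a 24-bit foreground color into `buf`.
 *
 * Returns the number of characters written (not counting the NUL), or
 * -1 if a color component is out of range or `buf` is too small.
 */
static int format_truecolor_fg(char *buf, size_t size,
			       unsigned r, unsigned g, unsigned b)
{
	int len;

	assert(buf);

	if (r > 255 || g > 255 || b > 255)
		return -1;

	/* ESC [ 38 ; 2 ; r ; g ; b m */
	len = snprintf(buf, size, "\x1b[38;2;%u;%u;%um", r, g, b);
	if (len < 0 || (size_t)len >= size)
		return -1;

	return len;
}
```

A caller would write the resulting string to the console, followed by the text and by `\x1b[m` to reset the attributes.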
Signed-off-by: Johannes Schindelin --- compat/winansi.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/compat/winansi.c b/compat/winansi.c index 3ce190093901b4..147e88e1781300 100644 --- a/compat/winansi.c +++ b/compat/winansi.c @@ -564,6 +564,49 @@ static void detect_msys_tty(int fd) #endif +static HANDLE std_console_handle; +static DWORD std_console_mode = ENABLE_VIRTUAL_TERMINAL_PROCESSING; +static UINT std_console_code_page = CP_UTF8; + +static void reset_std_console(void) +{ + if (std_console_mode != ENABLE_VIRTUAL_TERMINAL_PROCESSING) + SetConsoleMode(std_console_handle, std_console_mode); + if (std_console_code_page != CP_UTF8) + SetConsoleOutputCP(std_console_code_page); +} + +static int enable_virtual_processing(void) +{ + std_console_handle = GetStdHandle(STD_OUTPUT_HANDLE); + if (std_console_handle == INVALID_HANDLE_VALUE || + !GetConsoleMode(std_console_handle, &std_console_mode)) { + std_console_handle = GetStdHandle(STD_ERROR_HANDLE); + if (std_console_handle == INVALID_HANDLE_VALUE || + !GetConsoleMode(std_console_handle, &std_console_mode)) + return 0; + } + + std_console_code_page = GetConsoleOutputCP(); + if (std_console_code_page != CP_UTF8) + SetConsoleOutputCP(CP_UTF8); + if (!std_console_code_page) + std_console_code_page = CP_UTF8; + + atexit(reset_std_console); + + if (std_console_mode & ENABLE_VIRTUAL_TERMINAL_PROCESSING) + return 1; + + if (!SetConsoleMode(std_console_handle, + std_console_mode | + ENABLE_PROCESSED_OUTPUT | + ENABLE_VIRTUAL_TERMINAL_PROCESSING)) + return 0; + + return 1; +} + /* * Wrapper for isatty(). 
Most calls in the main git code * call isatty(1 or 2) to see if the instance is interactive @@ -602,6 +645,9 @@ void winansi_init(void) return; } + if (enable_virtual_processing()) + return; + /* create a named pipe to communicate with the console thread */ if (swprintf(name, ARRAY_SIZE(name) - 1, L"\\\\.\\pipe\\winansi%lu", GetCurrentProcessId()) < 0) From 728b1db7c7a4fa9f9e5c194a38614e89b2561078 Mon Sep 17 00:00:00 2001 From: MinarKotonoha Date: Mon, 8 Apr 2024 16:41:10 +0800 Subject: [PATCH 054/218] common-main.c: fflush stdout buffer upon exit By default, the buffer type of Windows' `stdout` is unbuffered (_IONBF), and there is no need to manually fflush `stdout`. But some programs, such as the Windows Filtering Platform driver provided by the security software, may change the buffer type of `stdout` to full buffering. This requires `fflush(stdout)` to be called manually, otherwise there will be no output to `stdout`. Signed-off-by: MinarKotonoha Signed-off-by: Johannes Schindelin --- common-exit.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/common-exit.c b/common-exit.c index 1aaa538be3ed67..609f32abed8b53 100644 --- a/common-exit.c +++ b/common-exit.c @@ -11,6 +11,13 @@ static void check_bug_if_BUG(void) /* We wrap exit() to call common_exit() in git-compat-util.h */ int common_exit(const char *file, int line, int code) { + /* + * A Windows Filtering Platform driver provided by security software + * may change the buffer type of stdout from _IONBF to _IOFBF. + * In that case, there is no output without a manual fflush(). + */ + fflush(stdout); + /* * For non-POSIX systems: Take the lowest 8 bits of the "code" * to e.g. turn -1 into 255.
On a POSIX system this is From 4ecf3161002feb6b07a955825ae8d3a9c0811e13 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 9 Apr 2024 16:50:56 +0200 Subject: [PATCH 055/218] t5601/t7406(mingw): do run tests with symlink support A long time ago, we decided to run tests in Git for Windows' SDK with the default `winsymlinks` mode: copying instead of linking. This is still the default mode of MSYS2 to this day. However, this is not how most users run Git for Windows: As the majority of Git for Windows' users seem to be on Windows 10 and newer, likely having enabled Developer Mode (which allows creating symbolic links without administrator privileges), they will run with symlink support enabled. This is the reason why it is crucial to get the fixes for CVE-2024-? to the users, and also why it is crucial to ensure that the test suite exercises the related test cases. This commit ensures the latter. Signed-off-by: Johannes Schindelin --- t/t5601-clone.sh | 10 ++++++++++ t/t7406-submodule-update.sh | 9 +++++++++ 2 files changed, 19 insertions(+) diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh index d743d986c401a0..a859e09956222c 100755 --- a/t/t5601-clone.sh +++ b/t/t5601-clone.sh @@ -7,6 +7,16 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME . ./test-lib.sh +# This test script contains test cases that need to create symbolic links. To +# make sure that these test cases are exercised in Git for Windows, where (for +# historical reasons) `ln -s` creates copies by default, let's specifically ask +# for `ln -s` to create symbolic links whenever possible. +if test_have_prereq MINGW +then + MSYS=${MSYS+$MSYS }winsymlinks:nativestrict + export MSYS +fi + X= test_have_prereq !MINGW || X=.exe diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh index 3adab12091a5f0..a3e0dc198ab646 100755 --- a/t/t7406-submodule-update.sh +++ b/t/t7406-submodule-update.sh @@ -14,6 +14,15 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME . 
./test-lib.sh +# This test script contains test cases that need to create symbolic links. To +# make sure that these test cases are exercised in Git for Windows, where (for +# historical reasons) `ln -s` creates copies by default, let's specifically ask +# for `ln -s` to create symbolic links whenever possible. +if test_have_prereq MINGW +then + MSYS=${MSYS+$MSYS }winsymlinks:nativestrict + export MSYS +fi compare_head() { From 18f1363dd06a813f03ee5549bbdd81d8ca6dc290 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 21 May 2024 13:55:26 +0200 Subject: [PATCH 056/218] win32: ensure that `localtime_r()` is declared even in i686 builds The `__MINGW64__` constant is defined, surprise, surprise, only when building for a 64-bit CPU architecture. Therefore using it as a guard to define `_POSIX_C_SOURCE` (so that `localtime_r()` is declared, among other functions) is not enough, we also need to check `__MINGW32__`. Technically, the latter constant is defined even for 64-bit builds. But let's make things a bit easier to understand by testing for both constants. Making it so fixes this compile warning (turned error in GCC v14.1): archive-zip.c: In function 'dos_time': archive-zip.c:612:9: error: implicit declaration of function 'localtime_r'; did you mean 'localtime_s'? 
[-Wimplicit-function-declaration] 612 | localtime_r(&time, &tm); | ^~~~~~~~~~~ | localtime_s Signed-off-by: Johannes Schindelin --- compat/posix.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/compat/posix.h b/compat/posix.h index faaae1b6555d1b..9040b27e1f3dd7 100644 --- a/compat/posix.h +++ b/compat/posix.h @@ -45,7 +45,7 @@ #define UNUSED #endif -#ifdef __MINGW64__ +#if defined(__MINGW32__) || defined(__MINGW64__) #define _POSIX_C_SOURCE 1 #elif defined(__sun__) /* From d44a3f61adf5896a7e8441cb173997b46d34ff49 Mon Sep 17 00:00:00 2001 From: Ariel Lourenco Date: Tue, 2 Jul 2024 18:09:43 -0300 Subject: [PATCH 057/218] Fall back to AppData if XDG_CONFIG_HOME is unset In order to be a better Windows citizen, Git should save its configuration files in the AppData folder. This enables Git configuration files to be replicated between machines using the same Microsoft account logon, which would reduce the friction of setting up Git on new systems. Therefore, if %APPDATA%\Git\config exists, we use it; otherwise $HOME/.config/git/config is used.
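For illustration only (this sketch is not part of the patch; `pick_config_path()`, the `exists_fn` callback type and the two `pretend_*` stubs are made-up names), the lookup order described above could be expressed as follows. It leaves out the `XDG_CONFIG_HOME` check that comes first, hard-codes the `git` subdirectory, and skips the warning that is shown when both files exist:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: returns non-zero if `path` exists. */
typedef int (*exists_fn)(const char *path);

/* Stub for demonstration: pretend only files below ".../Git/" exist. */
static int pretend_appdata_exists(const char *path)
{
	return strstr(path, "/Git/") != NULL;
}

/* Stub for demonstration: pretend no file exists at all. */
static int pretend_nothing_exists(const char *path)
{
	(void)path;
	return 0;
}

/*
 * Prefer "<appdata>/Git/<filename>" if that file exists; otherwise fall
 * back to "<home>/.config/git/<filename>" (whether or not it exists).
 *
 * Returns 0 and fills `out` on success, -1 if neither location can be
 * used or `out` is too small.
 */
static int pick_config_path(char *out, size_t size,
			    const char *appdata, const char *home,
			    const char *filename, exists_fn exists)
{
	int len;

	assert(out && filename && exists);

	if (appdata && *appdata) {
		len = snprintf(out, size, "%s/Git/%s", appdata, filename);
		if (len >= 0 && (size_t)len < size && exists(out))
			return 0;
	}

	if (home && *home) {
		len = snprintf(out, size, "%s/.config/git/%s", home, filename);
		if (len >= 0 && (size_t)len < size)
			return 0;
	}

	if (size)
		out[0] = '\0';
	return -1;
}
```

Passing the existence check as a callback merely keeps the sketch independent of the file system; the patch itself calls `file_exists()` directly.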
Signed-off-by: Ariel Lourenco --- path.c | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/path.c b/path.c index d7e17bf17404de..8c9b6fa27632e0 100644 --- a/path.c +++ b/path.c @@ -1545,6 +1545,7 @@ int looks_like_command_line_option(const char *str) char *xdg_config_home_for(const char *subdir, const char *filename) { const char *home, *config_home; + char *home_config = NULL; assert(subdir); assert(filename); @@ -1553,10 +1554,26 @@ char *xdg_config_home_for(const char *subdir, const char *filename) return mkpathdup("%s/%s/%s", config_home, subdir, filename); home = getenv("HOME"); - if (home) - return mkpathdup("%s/.config/%s/%s", home, subdir, filename); + if (home && *home) + home_config = mkpathdup("%s/.config/%s/%s", home, subdir, filename); + + #ifdef WIN32 + { + const char *appdata = getenv("APPDATA"); + if (appdata && *appdata) { + char *appdata_config = mkpathdup("%s/Git/%s", appdata, filename); + if (file_exists(appdata_config)) { + if (home_config && file_exists(home_config)) + warning("'%s' was ignored because '%s' exists.", home_config, appdata_config); + free(home_config); + return appdata_config; + } + free(appdata_config); + } + } + #endif - return NULL; + return home_config; } char *xdg_config_home(const char *filename) From 6b4736186f6b138b3c45d41893f200ea749ac0c7 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 4 Jul 2024 22:41:56 +0200 Subject: [PATCH 058/218] run-command: be helpful with Git LFS fails on Windows 7 Git LFS is now built with Go 1.21 which no longer supports Windows 7. However, Git for Windows still wants to support Windows 7. Ideally, Git LFS would re-introduce Windows 7 support until Git for Windows drops support for Windows 7, but that's not going to happen: https://github.com/git-for-windows/git/issues/4996#issuecomment-2176152565 The next best thing we can do is to let the users know what is happening, and how to get out of their fix, at least. 
This is not quite as easy as it would first seem because programs compiled with Go 1.21 or newer will simply throw an exception and fail with an Access Violation on Windows 7. The only way I found to address this is to replicate the logic from Go's very own `version` command (which can determine the Go version with which a given executable was built) to detect the situation, and in that case offer a helpful error message. This addresses https://github.com/git-for-windows/git/issues/4996. Signed-off-by: Johannes Schindelin --- compat/win32/path-utils.c | 199 ++++++++++++++++++++++++++++++++++++++ compat/win32/path-utils.h | 3 + git-compat-util.h | 7 ++ run-command.c | 1 + 4 files changed, 210 insertions(+) diff --git a/compat/win32/path-utils.c b/compat/win32/path-utils.c index 966ef779b9ca9b..c4fea0301b5ecc 100644 --- a/compat/win32/path-utils.c +++ b/compat/win32/path-utils.c @@ -2,6 +2,9 @@ #include "../../git-compat-util.h" #include "../../environment.h" +#include "../../wrapper.h" +#include "../../strbuf.h" +#include "../../versioncmp.h" int win32_has_dos_drive_prefix(const char *path) { @@ -89,3 +92,199 @@ int win32_fspathcmp(const char *a, const char *b) { return win32_fspathncmp(a, b, (size_t)-1); } + +static int read_at(int fd, char *buffer, size_t offset, size_t size) +{ + if (lseek(fd, offset, SEEK_SET) < 0) { + fprintf(stderr, "could not seek to 0x%x\n", (unsigned int)offset); + return -1; + } + + return read_in_full(fd, buffer, size); +} + +static size_t le16(const char *buffer) +{ + unsigned char *u = (unsigned char *)buffer; + return u[0] | (u[1] << 8); +} + +static size_t le32(const char *buffer) +{ + return le16(buffer) | (le16(buffer + 2) << 16); +} + +/* + * Determine the Go version of a given executable, if it was built with Go. + * + * This recapitulates the logic from + * https://github.com/golang/go/blob/master/src/cmd/go/internal/version/version.go + * (without requiring the user to install `go.exe` to find out). 
+ */ +static ssize_t get_go_version(const char *path, char *go_version, size_t go_version_size) +{ + int fd = open(path, O_RDONLY); + char buffer[1024]; + off_t offset; + size_t num_sections, opt_header_size, i; + char *p = NULL, *q; + ssize_t res = -1; + + if (fd < 0) + return -1; + + if (read_in_full(fd, buffer, 2) < 0) + goto fail; + + /* + * Parse the PE file format, for more details, see + * https://en.wikipedia.org/wiki/Portable_Executable#Layout and + * https://learn.microsoft.com/en-us/windows/win32/debug/pe-format + */ + if (buffer[0] != 'M' || buffer[1] != 'Z') + goto fail; + + if (read_at(fd, buffer, 0x3c, 4) < 0) + goto fail; + + /* Read the `PE\0\0` signature and the COFF file header */ + offset = le32(buffer); + if (read_at(fd, buffer, offset, 24) < 0) + goto fail; + + if (buffer[0] != 'P' || buffer[1] != 'E' || buffer[2] != '\0' || buffer[3] != '\0') + goto fail; + + num_sections = le16(buffer + 6); + opt_header_size = le16(buffer + 20); + offset += 24; /* skip file header */ + + /* + * Validate magic number 0x10b or 0x20b, for full details see + * https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#optional-header-standard-fields-image-only + */ + if (read_at(fd, buffer, offset, 2) < 0 || + ((i = le16(buffer)) != 0x10b && i != 0x20b)) + goto fail; + + offset += opt_header_size; + + for (i = 0; i < num_sections; i++) { + if (read_at(fd, buffer, offset + i * 40, 40) < 0) + goto fail; + + /* + * For full details about the section headers, see + * https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#section-table-section-headers + */ + if ((le32(buffer + 36) /* characteristics */ & ~0x600000) /* IMAGE_SCN_ALIGN_32BYTES */ == + (/* IMAGE_SCN_CNT_INITIALIZED_DATA */ 0x00000040 | + /* IMAGE_SCN_MEM_READ */ 0x40000000 | + /* IMAGE_SCN_MEM_WRITE */ 0x80000000)) { + size_t size = le32(buffer + 16); /* "SizeOfRawData " */ + size_t pointer = le32(buffer + 20); /* "PointerToRawData " */ + + /* + * Skip the section if either size or pointer 
is 0, see + * https://github.com/golang/go/blob/go1.21.0/src/debug/buildinfo/buildinfo.go#L333 + * for full details. + * + * Merely seeing a non-zero size will not actually do, + * though: the size must be at least `buildInfoSize`, + * i.e. 32, and we expect a UVarint (at least another + * byte) _and_ the bytes representing the string, + * which we expect to start with the letters "go" and + * continue with the Go version number. + */ + if (size < 32 + 1 + 2 + 1 || !pointer) + continue; + + p = malloc(size); + + if (!p || read_at(fd, p, pointer, size) < 0) + goto fail; + + /* + * Look for the build information embedded by Go, see + * https://github.com/golang/go/blob/go1.21.0/src/debug/buildinfo/buildinfo.go#L165-L175 + * for full details. + * + * Note: Go contains code to enforce alignment along a + * 16-byte boundary. In practice, no `.exe` has been + * observed that required any adjustment, therefore + * this here code skips that logic for simplicity. + */ + q = memmem(p, size - 18, "\xff Go buildinf:", 14); + if (!q) + goto fail; + /* + * Decode the build blob. For full details, see + * https://github.com/golang/go/blob/go1.21.0/src/debug/buildinfo/buildinfo.go#L177-L191 + * + * Note: The `endianness` values observed in practice + * were always 2, therefore the complex logic to handle + * any other value is skipped for simplicity. + */ + if ((q[14] == 8 || q[14] == 4) && q[15] == 2) { + /* + * Only handle a Go version string with fewer + * than 128 characters, so the Go UVarint at + * q[32] that indicates the string's length must + * be only one byte (without the high bit set).
+ */ + if ((q[32] & 0x80) || + !q[32] || + (q + 33 + q[32] - p) > (ssize_t)size || + q[32] + 1 > (ssize_t)go_version_size) + goto fail; + res = q[32]; + memcpy(go_version, q + 33, res); + go_version[res] = '\0'; + break; + } + } + } + +fail: + free(p); + close(fd); + return res; +} + +void win32_warn_about_git_lfs_on_windows7(int exit_code, const char *argv0) +{ + char buffer[128], *git_lfs = NULL; + const char *p; + + /* + * Git LFS v3.5.1 fails with an Access Violation on Windows 7; That + * would usually show up as an exit code 0xc0000005. For some reason + * (probably because at this point, we no longer have the _original_ + * HANDLE that was returned by `CreateProcess()`) we observe other + * values like 0xb00 and 0x2 instead. Since the exact exit code + * seems to be inconsistent, we check for a non-zero exit status. + */ + if (exit_code == 0) + return; + if (GetVersion() >> 16 > 7601) + return; /* Warn only on Windows 7 or older */ + if (!istarts_with(argv0, "git-lfs ") && + strcasecmp(argv0, "git-lfs")) + return; + if (!(git_lfs = locate_in_PATH("git-lfs"))) + return; + if (get_go_version(git_lfs, buffer, sizeof(buffer)) > 0 && + skip_prefix(buffer, "go", &p) && + versioncmp("1.21.0", p) <= 0) + warning("This program was built with Go v%s\n" + "i.e. 
without support for this Windows version:\n" + "\n\t%s\n" + "\n" + "To work around this, you can download and install a " + "working version from\n" + "\n" + "\thttps://github.com/git-lfs/git-lfs/releases/tag/" + "v3.4.1\n", + p, git_lfs); + free(git_lfs); +} diff --git a/compat/win32/path-utils.h b/compat/win32/path-utils.h index a561c700e75713..a69483c332c1a7 100644 --- a/compat/win32/path-utils.h +++ b/compat/win32/path-utils.h @@ -34,4 +34,7 @@ int win32_fspathcmp(const char *a, const char *b); int win32_fspathncmp(const char *a, const char *b, size_t count); #define fspathncmp win32_fspathncmp +void win32_warn_about_git_lfs_on_windows7(int exit_code, const char *argv0); +#define warn_about_git_lfs_on_windows7 win32_warn_about_git_lfs_on_windows7 + #endif diff --git a/git-compat-util.h b/git-compat-util.h index ae1bdc90a4cd6a..0f3ea2d4854959 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -261,6 +261,13 @@ static inline int git_offset_1st_component(const char *path) #define fspathncmp git_fspathncmp #endif +#ifndef warn_about_git_lfs_on_windows7 +static inline void warn_about_git_lfs_on_windows7(int exit_code UNUSED, + const char *argv0 UNUSED) +{ +} +#endif + #ifndef is_valid_path #define is_valid_path(path) 1 #endif diff --git a/run-command.c b/run-command.c index c146a56532a139..9bdbf1fec41545 100644 --- a/run-command.c +++ b/run-command.c @@ -581,6 +581,7 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal) */ code += 128; } else if (WIFEXITED(status)) { + warn_about_git_lfs_on_windows7(status, argv0); code = WEXITSTATUS(status); } else { if (!in_signal) From 7806bd1849dcf2e50231f4ffaf9d367df127f234 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Mon, 29 Apr 2024 08:55:03 -0400 Subject: [PATCH 059/218] survey: stub in new experimental 'git-survey' command Start work on a new 'git survey' command to scan the repository for monorepo performance and scaling problems. 
The goal is to measure the various known "dimensions of scale" and serve as a foundation for adding additional measurements as we learn more about Git monorepo scaling problems. The initial goal is to complement the scanning and analysis performed by the GO-based 'git-sizer' (https://github.com/github/git-sizer) tool. It is hoped that by creating a builtin command, we may be able to take advantage of internal Git data structures and code that is not accessible from GO to gain further insight into potential scaling problems. Co-authored-by: Derrick Stolee Signed-off-by: Jeff Hostetler Signed-off-by: Derrick Stolee --- .gitignore | 1 + Documentation/config.adoc | 2 + Documentation/config/survey.adoc | 11 +++++ Documentation/git-survey.adoc | 36 +++++++++++++++ Documentation/meson.build | 1 + Makefile | 1 + builtin.h | 1 + builtin/survey.c | 76 ++++++++++++++++++++++++++++++++ command-list.txt | 1 + git.c | 1 + meson.build | 1 + t/meson.build | 1 + t/t1517-outside-repo.sh | 2 +- t/t8100-git-survey.sh | 18 ++++++++ 14 files changed, 152 insertions(+), 1 deletion(-) create mode 100644 Documentation/config/survey.adoc create mode 100644 Documentation/git-survey.adoc create mode 100644 builtin/survey.c create mode 100755 t/t8100-git-survey.sh diff --git a/.gitignore b/.gitignore index 24635cf2d6f4a3..d3b2dfcff3f26a 100644 --- a/.gitignore +++ b/.gitignore @@ -171,6 +171,7 @@ /git-submodule /git-submodule--helper /git-subtree +/git-survey /git-svn /git-switch /git-symbolic-ref diff --git a/Documentation/config.adoc b/Documentation/config.adoc index dcea3c0c15e2a9..11f4a23c56ee28 100644 --- a/Documentation/config.adoc +++ b/Documentation/config.adoc @@ -537,6 +537,8 @@ include::config/status.adoc[] include::config/submodule.adoc[] +include::config/survey.adoc[] + include::config/tag.adoc[] include::config/tar.adoc[] diff --git a/Documentation/config/survey.adoc b/Documentation/config/survey.adoc new file mode 100644 index 00000000000000..c1b0f852a1250e --- /dev/null +++ 
b/Documentation/config/survey.adoc @@ -0,0 +1,11 @@ +survey.*:: + These variables adjust the default behavior of the `git survey` + command. The intention is that this command could be run in the + background with these options. ++ +-- + verbose:: + This boolean value implies the `--[no-]verbose` option. + progress:: + This boolean value implies the `--[no-]progress` option. +-- diff --git a/Documentation/git-survey.adoc b/Documentation/git-survey.adoc new file mode 100644 index 00000000000000..5f8ec9bfea673b --- /dev/null +++ b/Documentation/git-survey.adoc @@ -0,0 +1,36 @@ +git-survey(1) +============= + +NAME +---- +git-survey - EXPERIMENTAL: Measure various repository dimensions of scale + +SYNOPSIS +-------- +[verse] +(EXPERIMENTAL!) 'git survey' + +DESCRIPTION +----------- + +Survey the repository and measure various dimensions of scale. + +As repositories grow to "monorepo" size, certain data shapes can cause +performance problems. `git-survey` attempts to measure and report on +known problem areas. + +OPTIONS +------- + +--progress:: + Show progress. This is automatically enabled when interactive. + +OUTPUT +------ + +By default, `git survey` will print information about the repository in a +human-readable format that includes overviews and tables. 
+ +GIT +--- +Part of the linkgit:git[1] suite diff --git a/Documentation/meson.build b/Documentation/meson.build index d6365b888bbed3..17d437d8ded045 100644 --- a/Documentation/meson.build +++ b/Documentation/meson.build @@ -144,6 +144,7 @@ manpages = { 'git-status.adoc' : 1, 'git-stripspace.adoc' : 1, 'git-submodule.adoc' : 1, + 'git-survey.adoc' : 1, 'git-svn.adoc' : 1, 'git-switch.adoc' : 1, 'git-symbolic-ref.adoc' : 1, diff --git a/Makefile b/Makefile index cedc234173e377..e1abe7ffed0446 100644 --- a/Makefile +++ b/Makefile @@ -1488,6 +1488,7 @@ BUILTIN_OBJS += builtin/sparse-checkout.o BUILTIN_OBJS += builtin/stash.o BUILTIN_OBJS += builtin/stripspace.o BUILTIN_OBJS += builtin/submodule--helper.o +BUILTIN_OBJS += builtin/survey.o BUILTIN_OBJS += builtin/symbolic-ref.o BUILTIN_OBJS += builtin/tag.o BUILTIN_OBJS += builtin/unpack-file.o diff --git a/builtin.h b/builtin.h index 235c51f30e5380..85aa3e62892fb5 100644 --- a/builtin.h +++ b/builtin.h @@ -259,6 +259,7 @@ int cmd_sparse_checkout(int argc, const char **argv, const char *prefix, struct int cmd_status(int argc, const char **argv, const char *prefix, struct repository *repo); int cmd_stash(int argc, const char **argv, const char *prefix, struct repository *repo); int cmd_stripspace(int argc, const char **argv, const char *prefix, struct repository *repo); +int cmd_survey(int argc, const char **argv, const char *prefix, struct repository *repo); int cmd_submodule__helper(int argc, const char **argv, const char *prefix, struct repository *repo); int cmd_switch(int argc, const char **argv, const char *prefix, struct repository *repo); int cmd_symbolic_ref(int argc, const char **argv, const char *prefix, struct repository *repo); diff --git a/builtin/survey.c b/builtin/survey.c new file mode 100644 index 00000000000000..f43c9e405ca8be --- /dev/null +++ b/builtin/survey.c @@ -0,0 +1,76 @@ +#define USE_THE_REPOSITORY_VARIABLE + +#include "builtin.h" +#include "config.h" +#include "environment.h" +#include 
"parse-options.h" + +static const char * const survey_usage[] = { + N_("(EXPERIMENTAL!) git survey "), + NULL, +}; + +struct survey_opts { + int verbose; + int show_progress; +}; + +struct survey_context { + struct repository *repo; + + /* Options that control what is done. */ + struct survey_opts opts; +}; + +static int survey_load_config_cb(const char *var, const char *value, + const struct config_context *cctx, void *pvoid) +{ + struct survey_context *ctx = pvoid; + + if (!strcmp(var, "survey.verbose")) { + ctx->opts.verbose = git_config_bool(var, value); + return 0; + } + if (!strcmp(var, "survey.progress")) { + ctx->opts.show_progress = git_config_bool(var, value); + return 0; + } + + return git_default_config(var, value, cctx, pvoid); +} + +static void survey_load_config(struct survey_context *ctx) +{ + repo_config(the_repository, survey_load_config_cb, ctx); +} + +int cmd_survey(int argc, const char **argv, const char *prefix, struct repository *repo) +{ + static struct survey_context ctx = { + .opts = { + .verbose = 0, + .show_progress = -1, /* defaults to isatty(2) */ + }, + }; + + static struct option survey_options[] = { + OPT__VERBOSE(&ctx.opts.verbose, N_("verbose output")), + OPT_BOOL(0, "progress", &ctx.opts.show_progress, N_("show progress")), + OPT_END(), + }; + + show_usage_with_options_if_asked(argc, argv, + survey_usage, survey_options); + + ctx.repo = repo; + + prepare_repo_settings(ctx.repo); + survey_load_config(&ctx); + + argc = parse_options(argc, argv, prefix, survey_options, survey_usage, 0); + + if (ctx.opts.show_progress < 0) + ctx.opts.show_progress = isatty(2); + + return 0; +} diff --git a/command-list.txt b/command-list.txt index f9005cf45979f1..d32cc32ef247ac 100644 --- a/command-list.txt +++ b/command-list.txt @@ -191,6 +191,7 @@ git-stash mainporcelain git-status mainporcelain info git-stripspace purehelpers git-submodule mainporcelain +git-survey mainporcelain git-svn foreignscminterface git-switch mainporcelain history 
git-symbolic-ref plumbingmanipulators diff --git a/git.c b/git.c index 5a40eab8a26a66..8f5f92f137b38a 100644 --- a/git.c +++ b/git.c @@ -659,6 +659,7 @@ static struct cmd_struct commands[] = { { "status", cmd_status, RUN_SETUP | NEED_WORK_TREE }, { "stripspace", cmd_stripspace }, { "submodule--helper", cmd_submodule__helper, RUN_SETUP }, + { "survey", cmd_survey, RUN_SETUP }, { "switch", cmd_switch, RUN_SETUP | NEED_WORK_TREE }, { "symbolic-ref", cmd_symbolic_ref, RUN_SETUP }, { "tag", cmd_tag, RUN_SETUP | DELAY_PAGER_CONFIG }, diff --git a/meson.build b/meson.build index 11488623bfd8f8..b68bead11c51e6 100644 --- a/meson.build +++ b/meson.build @@ -677,6 +677,7 @@ builtin_sources = [ 'builtin/stash.c', 'builtin/stripspace.c', 'builtin/submodule--helper.c', + 'builtin/survey.c', 'builtin/symbolic-ref.c', 'builtin/tag.c', 'builtin/unpack-file.c', diff --git a/t/meson.build b/t/meson.build index 7528e5cda5fef0..17b0456216f9cf 100644 --- a/t/meson.build +++ b/t/meson.build @@ -977,6 +977,7 @@ integration_tests = [ 't8014-blame-ignore-fuzzy.sh', 't8015-blame-diff-algorithm.sh', 't8020-last-modified.sh', + 't8100-git-survey.sh', 't9001-send-email.sh', 't9002-column.sh', 't9003-help-autocorrect.sh', diff --git a/t/t1517-outside-repo.sh b/t/t1517-outside-repo.sh index e1d35170de61a3..b829099e6c2725 100755 --- a/t/t1517-outside-repo.sh +++ b/t/t1517-outside-repo.sh @@ -128,7 +128,7 @@ do merge-octopus | merge-one-file | merge-resolve | mergetool | \ mktag | p4 | p4.py | pickaxe | remote-ftp | remote-ftps | \ remote-http | remote-https | replay | send-email | \ - sh-i18n--envsubst | shell | show | stage | submodule | svn | \ + sh-i18n--envsubst | shell | show | stage | submodule | survey | svn | \ upload-archive--writer | upload-pack | web--browse | whatchanged) expect_outcome=expect_failure ;; *) diff --git a/t/t8100-git-survey.sh b/t/t8100-git-survey.sh new file mode 100755 index 00000000000000..d9816419855d1a --- /dev/null +++ b/t/t8100-git-survey.sh @@ -0,0 +1,18 @@ 
+#!/bin/sh + +test_description='git survey' + +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME + +TEST_PASSES_SANITIZE_LEAK=0 +export TEST_PASSES_SANITIZE_LEAK + +. ./test-lib.sh + +test_expect_success 'git survey -h shows experimental warning' ' + test_expect_code 129 git survey -h >usage && + grep "EXPERIMENTAL!" usage +' + +test_done From 1592927a5e0c5e632bc4b0910efc471e45b45151 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Mon, 29 Apr 2024 09:51:34 -0400 Subject: [PATCH 060/218] survey: add command line opts to select references By default we will scan all references in "refs/heads/", "refs/tags/" and "refs/remotes/". Add command line options to let the user ask for all refs or a subset of them and to include a detached HEAD. Signed-off-by: Jeff Hostetler Signed-off-by: Derrick Stolee Signed-off-by: Johannes Schindelin --- Documentation/git-survey.adoc | 34 +++++ builtin/survey.c | 248 ++++++++++++++++++++++++++++++++++ t/t8100-git-survey.sh | 9 ++ 3 files changed, 291 insertions(+) diff --git a/Documentation/git-survey.adoc b/Documentation/git-survey.adoc index 5f8ec9bfea673b..56060d14b5cfef 100644 --- a/Documentation/git-survey.adoc +++ b/Documentation/git-survey.adoc @@ -19,12 +19,46 @@ As repositories grow to "monorepo" size, certain data shapes can cause performance problems. `git-survey` attempts to measure and report on known problem areas. +Ref Selection and Reachable Objects +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In this first analysis phase, `git survey` will iterate over the set of +requested branches, tags, and other refs and treewalk over all of the +reachable commits, trees, and blobs and generate various statistics. + OPTIONS ------- --progress:: Show progress. This is automatically enabled when interactive. +Ref Selection +~~~~~~~~~~~~~ + +The following options control the set of refs that `git survey` will examine. +By default, `git survey` will look at tags, local branches, and remote refs.
+If any of the following options are given, the default set is cleared and +only refs for the given options are added. + +--all-refs:: + Use all refs. This includes local branches, tags, remote refs, + notes, and stashes. This option overrides all of the following. + +--branches:: + Add local branches (`refs/heads/`) to the set. + +--tags:: + Add tags (`refs/tags/`) to the set. + +--remotes:: + Add remote branches (`refs/remotes/`) to the set. + +--detached:: + Add HEAD to the set. + +--other:: + Add notes (`refs/notes/`) and stashes (`refs/stash/`) to the set. + OUTPUT ------ diff --git a/builtin/survey.c b/builtin/survey.c index f43c9e405ca8be..cb45d900869fe2 100644 --- a/builtin/survey.c +++ b/builtin/survey.c @@ -3,16 +3,55 @@ #include "builtin.h" #include "config.h" #include "environment.h" +#include "object.h" +#include "odb.h" #include "parse-options.h" +#include "progress.h" +#include "ref-filter.h" +#include "strvec.h" +#include "trace2.h" static const char * const survey_usage[] = { N_("(EXPERIMENTAL!) git survey "), NULL, }; +struct survey_refs_wanted { + int want_all_refs; /* special override */ + + int want_branches; + int want_tags; + int want_remotes; + int want_detached; + int want_other; /* see FILTER_REFS_OTHERS -- refs/notes/, refs/stash/ */ +}; + +static struct survey_refs_wanted default_ref_options = { + .want_all_refs = 1, +}; + struct survey_opts { int verbose; int show_progress; + struct survey_refs_wanted refs; +}; + +struct survey_report_ref_summary { + size_t refs_nr; + size_t branches_nr; + size_t remote_refs_nr; + size_t tags_nr; + size_t tags_annotated_nr; + size_t others_nr; + size_t unknown_nr; +}; + +/** + * This struct contains all of the information that needs to be printed + * at the end of the exploration of the repository and its references. + */ +struct survey_report { + struct survey_report_ref_summary refs; }; struct survey_context { @@ -20,8 +59,84 @@ struct survey_context { /* Options that control what is done.
*/ struct survey_opts opts; + + /* Info for output only. */ + struct survey_report report; + + /* + * The rest of the members are about enabling the activity + * of the 'git survey' command, including ref listings, object + * pointers, and progress. + */ + + struct progress *progress; + size_t progress_nr; + size_t progress_total; + + struct strvec refs; }; +static void clear_survey_context(struct survey_context *ctx) +{ + strvec_clear(&ctx->refs); +} + +/* + * After parsing the command line arguments, figure out which refs we + * should scan. + * + * If ANY were given in positive sense, then we ONLY include them and + * do not use the builtin values. + */ +static void fixup_refs_wanted(struct survey_context *ctx) +{ + struct survey_refs_wanted *rw = &ctx->opts.refs; + + /* + * `--all-refs` overrides and enables everything. + */ + if (rw->want_all_refs == 1) { + rw->want_branches = 1; + rw->want_tags = 1; + rw->want_remotes = 1; + rw->want_detached = 1; + rw->want_other = 1; + return; + } + + /* + * If none of the `--` were given, we assume all + * of the builtin unspecified values. + */ + if (rw->want_branches == -1 && + rw->want_tags == -1 && + rw->want_remotes == -1 && + rw->want_detached == -1 && + rw->want_other == -1) { + *rw = default_ref_options; + return; + } + + /* + * Since we only allow positive boolean values on the command + * line, we will only have true values where they specified + * a `--`. + * + * So anything that still has an unspecified value should be + * set to false. 
+ */ + if (rw->want_branches == -1) + rw->want_branches = 0; + if (rw->want_tags == -1) + rw->want_tags = 0; + if (rw->want_remotes == -1) + rw->want_remotes = 0; + if (rw->want_detached == -1) + rw->want_detached = 0; + if (rw->want_other == -1) + rw->want_other = 0; +} + static int survey_load_config_cb(const char *var, const char *value, const struct config_context *cctx, void *pvoid) { @@ -44,18 +159,146 @@ static void survey_load_config(struct survey_context *ctx) repo_config(the_repository, survey_load_config_cb, ctx); } +static void do_load_refs(struct survey_context *ctx, + struct ref_array *ref_array) +{ + struct ref_filter filter = REF_FILTER_INIT; + struct ref_sorting *sorting; + struct string_list sorting_options = STRING_LIST_INIT_DUP; + + string_list_append(&sorting_options, "objectname"); + sorting = ref_sorting_options(&sorting_options); + + if (ctx->opts.refs.want_detached) + strvec_push(&ctx->refs, "HEAD"); + + if (ctx->opts.refs.want_all_refs) { + strvec_push(&ctx->refs, "refs/"); + } else { + if (ctx->opts.refs.want_branches) + strvec_push(&ctx->refs, "refs/heads/"); + if (ctx->opts.refs.want_tags) + strvec_push(&ctx->refs, "refs/tags/"); + if (ctx->opts.refs.want_remotes) + strvec_push(&ctx->refs, "refs/remotes/"); + if (ctx->opts.refs.want_other) { + strvec_push(&ctx->refs, "refs/notes/"); + strvec_push(&ctx->refs, "refs/stash/"); + } + } + + filter.name_patterns = ctx->refs.v; + filter.ignore_case = 0; + filter.match_as_path = 1; + + if (ctx->opts.show_progress) { + ctx->progress_total = 0; + ctx->progress = start_progress(ctx->repo, + _("Scanning refs..."), 0); + } + + filter_refs(ref_array, &filter, FILTER_REFS_KIND_MASK); + + if (ctx->opts.show_progress) { + ctx->progress_total = ref_array->nr; + display_progress(ctx->progress, ctx->progress_total); + } + + ref_array_sort(sorting, ref_array); + + stop_progress(&ctx->progress); + ref_filter_clear(&filter); + ref_sorting_release(sorting); +} + +/* + * The REFS phase: + * + * Load the set of 
requested refs and assess them for scalability problems. + * Use that set to start a treewalk to all reachable objects and assess + * them. + * + * This data will give us insights into the repository itself (the number + * of refs, the size and shape of the DAG, the number and size of the + * objects). + * + * Theoretically, this data is independent of the on-disk representation + * (e.g. independent of packing concerns). + */ +static void survey_phase_refs(struct survey_context *ctx) +{ + struct ref_array ref_array = { 0 }; + + trace2_region_enter("survey", "phase/refs", ctx->repo); + do_load_refs(ctx, &ref_array); + + ctx->report.refs.refs_nr = ref_array.nr; + for (int i = 0; i < ref_array.nr; i++) { + unsigned long size; + struct ref_array_item *item = ref_array.items[i]; + + switch (item->kind) { + case FILTER_REFS_TAGS: + ctx->report.refs.tags_nr++; + if (odb_read_object_info(ctx->repo->objects, + &item->objectname, + &size) == OBJ_TAG) + ctx->report.refs.tags_annotated_nr++; + break; + + case FILTER_REFS_BRANCHES: + ctx->report.refs.branches_nr++; + break; + + case FILTER_REFS_REMOTES: + ctx->report.refs.remote_refs_nr++; + break; + + case FILTER_REFS_OTHERS: + ctx->report.refs.others_nr++; + break; + + default: + ctx->report.refs.unknown_nr++; + break; + } + } + + trace2_region_leave("survey", "phase/refs", ctx->repo); + + ref_array_clear(&ref_array); +} + int cmd_survey(int argc, const char **argv, const char *prefix, struct repository *repo) { static struct survey_context ctx = { .opts = { .verbose = 0, .show_progress = -1, /* defaults to isatty(2) */ + + .refs.want_all_refs = -1, + + .refs.want_branches = -1, /* default these to undefined */ + .refs.want_tags = -1, + .refs.want_remotes = -1, + .refs.want_detached = -1, + .refs.want_other = -1, }, + .refs = STRVEC_INIT, }; static struct option survey_options[] = { OPT__VERBOSE(&ctx.opts.verbose, N_("verbose output")), OPT_BOOL(0, "progress", &ctx.opts.show_progress, N_("show progress")), + + OPT_BOOL_F(0,
"all-refs", &ctx.opts.refs.want_all_refs, N_("include all refs"), PARSE_OPT_NONEG), + + OPT_BOOL_F(0, "branches", &ctx.opts.refs.want_branches, N_("include branches"), PARSE_OPT_NONEG), + OPT_BOOL_F(0, "tags", &ctx.opts.refs.want_tags, N_("include tags"), PARSE_OPT_NONEG), + OPT_BOOL_F(0, "remotes", &ctx.opts.refs.want_remotes, N_("include all remotes refs"), PARSE_OPT_NONEG), + OPT_BOOL_F(0, "detached", &ctx.opts.refs.want_detached, N_("include detached HEAD"), PARSE_OPT_NONEG), + OPT_BOOL_F(0, "other", &ctx.opts.refs.want_other, N_("include notes and stashes"), PARSE_OPT_NONEG), + OPT_END(), }; @@ -72,5 +315,10 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor if (ctx.opts.show_progress < 0) ctx.opts.show_progress = isatty(2); + fixup_refs_wanted(&ctx); + + survey_phase_refs(&ctx); + + clear_survey_context(&ctx); return 0; } diff --git a/t/t8100-git-survey.sh b/t/t8100-git-survey.sh index d9816419855d1a..9bac3c2ba47e2c 100755 --- a/t/t8100-git-survey.sh +++ b/t/t8100-git-survey.sh @@ -15,4 +15,13 @@ test_expect_success 'git survey -h shows experimental warning' ' grep "EXPERIMENTAL!" usage ' +test_expect_success 'create a semi-interesting repo' ' + test_commit_bulk 10 +' + +test_expect_success 'git survey (default)' ' + git survey >out 2>err && + test_line_count = 0 err +' + test_done From 3ca4cab9712a2a7993bbdaa55caff76da62c69c3 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Sun, 1 Sep 2024 15:58:32 -0400 Subject: [PATCH 061/218] survey: start pretty printing data in table form When 'git survey' provides information to the user, this will be presented in one of two formats: plaintext and JSON. The JSON implementation will be delayed until the functionality is complete for the plaintext format. The most important parts of the plaintext format are headers specifying the different sections of the report and tables providing concrete data.
Create a custom table data structure that allows specifying a list of strings for the row values. When printing the table, check each column for the maximum width so we can create a table of the correct size from the start. The table structure is designed to be flexible to the different kinds of output that will be implemented in future changes. Signed-off-by: Derrick Stolee --- Documentation/git-survey.adoc | 7 ++ builtin/survey.c | 157 ++++++++++++++++++++++++++++++++++ t/t8100-git-survey.sh | 18 +++- 3 files changed, 181 insertions(+), 1 deletion(-) diff --git a/Documentation/git-survey.adoc b/Documentation/git-survey.adoc index 56060d14b5cfef..120ecb9a4d49f2 100644 --- a/Documentation/git-survey.adoc +++ b/Documentation/git-survey.adoc @@ -65,6 +65,13 @@ OUTPUT By default, `git survey` will print information about the repository in a human-readable format that includes overviews and tables. +References Summary +~~~~~~~~~~~~~~~~~~ + +The references summary includes a count of each kind of reference, +including branches, remote refs, and tags (split by "all" and +"annotated"). + GIT --- Part of the linkgit:git[1] suite diff --git a/builtin/survey.c b/builtin/survey.c index cb45d900869fe2..4e150c31e9de08 100644 --- a/builtin/survey.c +++ b/builtin/survey.c @@ -8,6 +8,7 @@ #include "parse-options.h" #include "progress.h" #include "ref-filter.h" +#include "strbuf.h" #include "strvec.h" #include "trace2.h" @@ -81,6 +82,160 @@ static void clear_survey_context(struct survey_context *ctx) strvec_clear(&ctx->refs); } +struct survey_table { + const char *table_name; + struct strvec header; + struct strvec *rows; + size_t rows_nr; + size_t rows_alloc; +}; + +#define SURVEY_TABLE_INIT { \ + .header = STRVEC_INIT, \ +} + +static void clear_table(struct survey_table *table) +{ + strvec_clear(&table->header); + for (size_t i = 0; i < table->rows_nr; i++) + strvec_clear(&table->rows[i]); + free(table->rows); +} + +static void insert_table_rowv(struct survey_table *table, ...) 
+{ + va_list ap; + char *arg; + ALLOC_GROW(table->rows, table->rows_nr + 1, table->rows_alloc); + + memset(&table->rows[table->rows_nr], 0, sizeof(struct strvec)); + + va_start(ap, table); + while ((arg = va_arg(ap, char *))) + strvec_push(&table->rows[table->rows_nr], arg); + va_end(ap); + + table->rows_nr++; +} + +#define SECTION_SEGMENT "========================================" +#define SECTION_SEGMENT_LEN 40 +static const char *section_line = SECTION_SEGMENT + SECTION_SEGMENT + SECTION_SEGMENT + SECTION_SEGMENT; +static const size_t section_len = 4 * SECTION_SEGMENT_LEN; + +static void print_table_title(const char *name, size_t *widths, size_t nr) +{ + size_t width = 3 * (nr - 1); + + for (size_t i = 0; i < nr; i++) + width += widths[i]; + + if (width > section_len) + width = section_len; + + printf("\n%s\n%.*s\n", name, (int)width, section_line); +} + +static void print_row_plaintext(struct strvec *row, size_t *widths) +{ + static struct strbuf line = STRBUF_INIT; + strbuf_setlen(&line, 0); + + for (size_t i = 0; i < row->nr; i++) { + const char *str = row->v[i]; + size_t len = strlen(str); + if (i) + strbuf_add(&line, " | ", 3); + strbuf_addchars(&line, ' ', widths[i] - len); + strbuf_add(&line, str, len); + } + printf("%s\n", line.buf); +} + +static void print_divider_plaintext(size_t *widths, size_t nr) +{ + static struct strbuf line = STRBUF_INIT; + strbuf_setlen(&line, 0); + + for (size_t i = 0; i < nr; i++) { + if (i) + strbuf_add(&line, "-+-", 3); + strbuf_addchars(&line, '-', widths[i]); + } + printf("%s\n", line.buf); +} + +static void print_table_plaintext(struct survey_table *table) +{ + size_t *column_widths; + size_t columns_nr = table->header.nr; + CALLOC_ARRAY(column_widths, columns_nr); + + for (size_t i = 0; i < columns_nr; i++) { + column_widths[i] = strlen(table->header.v[i]); + + for (size_t j = 0; j < table->rows_nr; j++) { + size_t rowlen = strlen(table->rows[j].v[i]); + if (column_widths[i] < rowlen) + column_widths[i] = rowlen; + } + } 
+ + print_table_title(table->table_name, column_widths, columns_nr); + print_row_plaintext(&table->header, column_widths); + print_divider_plaintext(column_widths, columns_nr); + + for (size_t j = 0; j < table->rows_nr; j++) + print_row_plaintext(&table->rows[j], column_widths); + + free(column_widths); +} + +static void survey_report_plaintext_refs(struct survey_context *ctx) +{ + struct survey_report_ref_summary *refs = &ctx->report.refs; + struct survey_table table = SURVEY_TABLE_INIT; + + table.table_name = _("REFERENCES SUMMARY"); + + strvec_push(&table.header, _("Ref Type")); + strvec_push(&table.header, _("Count")); + + if (ctx->opts.refs.want_all_refs || ctx->opts.refs.want_branches) { + char *fmt = xstrfmt("%"PRIuMAX"", (uintmax_t)refs->branches_nr); + insert_table_rowv(&table, _("Branches"), fmt, NULL); + free(fmt); + } + + if (ctx->opts.refs.want_all_refs || ctx->opts.refs.want_remotes) { + char *fmt = xstrfmt("%"PRIuMAX"", (uintmax_t)refs->remote_refs_nr); + insert_table_rowv(&table, _("Remote refs"), fmt, NULL); + free(fmt); + } + + if (ctx->opts.refs.want_all_refs || ctx->opts.refs.want_tags) { + char *fmt = xstrfmt("%"PRIuMAX"", (uintmax_t)refs->tags_nr); + insert_table_rowv(&table, _("Tags (all)"), fmt, NULL); + free(fmt); + fmt = xstrfmt("%"PRIuMAX"", (uintmax_t)refs->tags_annotated_nr); + insert_table_rowv(&table, _("Tags (annotated)"), fmt, NULL); + free(fmt); + } + + print_table_plaintext(&table); + clear_table(&table); +} + +static void survey_report_plaintext(struct survey_context *ctx) +{ + printf("GIT SURVEY for \"%s\"\n", ctx->repo->worktree); + printf("-----------------------------------------------------\n"); + survey_report_plaintext_refs(ctx); +} + /* * After parsing the command line arguments, figure out which refs we * should scan. 
@@ -319,6 +474,8 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor survey_phase_refs(&ctx); + survey_report_plaintext(&ctx); + clear_survey_context(&ctx); return 0; } diff --git a/t/t8100-git-survey.sh b/t/t8100-git-survey.sh index 9bac3c2ba47e2c..e518e4844fe2d0 100755 --- a/t/t8100-git-survey.sh +++ b/t/t8100-git-survey.sh @@ -21,7 +21,23 @@ test_expect_success 'create a semi-interesting repo' ' test_expect_success 'git survey (default)' ' git survey >out 2>err && - test_line_count = 0 err + test_line_count = 0 err && + + tr , " " >expect <<-EOF && + GIT SURVEY for "$(pwd)" + ----------------------------------------------------- + + REFERENCES SUMMARY + ======================== + , Ref Type | Count + -----------------+------ + , Branches | 1 + Remote refs | 0 + Tags (all) | 0 + Tags (annotated) | 0 + EOF + + test_cmp expect out ' test_done From 572cfa65bd615b39229c5ded14c611a2548588b2 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Sun, 1 Sep 2024 20:33:47 -0400 Subject: [PATCH 062/218] survey: add object count summary At the moment, nothing is obvious about the reason for the use of the path-walk API, but this will become more prevalent in future iterations. For now, use the path-walk API to sum up the counts of each kind of object.
For example, this is the reachable object summary output for my local repo: REACHABLE OBJECT SUMMARY ======================== Object Type | Count ------------+------- Tags | 1343 Commits | 179344 Trees | 314350 Blobs | 184030 Signed-off-by: Derrick Stolee --- Documentation/git-survey.adoc | 6 ++ builtin/survey.c | 130 ++++++++++++++++++++++++++++++++-- t/t8100-git-survey.sh | 23 ++++-- 3 files changed, 148 insertions(+), 11 deletions(-) diff --git a/Documentation/git-survey.adoc b/Documentation/git-survey.adoc index 120ecb9a4d49f2..44f3a0568b7697 100644 --- a/Documentation/git-survey.adoc +++ b/Documentation/git-survey.adoc @@ -72,6 +72,12 @@ The references summary includes a count of each kind of reference, including branches, remote refs, and tags (split by "all" and "annotated"). +Reachable Object Summary +~~~~~~~~~~~~~~~~~~~~~~~~ + +The reachable object summary shows the total number of each kind of Git +object, including tags, commits, trees, and blobs. + GIT --- Part of the linkgit:git[1] suite diff --git a/builtin/survey.c b/builtin/survey.c index 4e150c31e9de08..1e8b9c1e5492aa 100644 --- a/builtin/survey.c +++ b/builtin/survey.c @@ -3,13 +3,19 @@ #include "builtin.h" #include "config.h" #include "environment.h" +#include "hex.h" #include "object.h" #include "odb.h" +#include "object-name.h" #include "parse-options.h" +#include "path-walk.h" #include "progress.h" #include "ref-filter.h" +#include "refs.h" +#include "revision.h" #include "strbuf.h" #include "strvec.h" +#include "tag.h" #include "trace2.h" static const char * const survey_usage[] = { @@ -47,12 +53,20 @@ struct survey_report_ref_summary { size_t unknown_nr; }; +struct survey_report_object_summary { + size_t commits_nr; + size_t tags_nr; + size_t trees_nr; + size_t blobs_nr; +}; + /** * This struct contains all of the information that needs to be printed * at the end of the exploration of the repository and its references. 
*/ struct survey_report { struct survey_report_ref_summary refs; + struct survey_report_object_summary reachable_objects; }; struct survey_context { @@ -75,10 +89,12 @@ struct survey_context { size_t progress_total; struct strvec refs; + struct ref_array ref_array; }; static void clear_survey_context(struct survey_context *ctx) { + ref_array_clear(&ctx->ref_array); strvec_clear(&ctx->refs); } @@ -129,10 +145,14 @@ static const size_t section_len = 4 * SECTION_SEGMENT_LEN; static void print_table_title(const char *name, size_t *widths, size_t nr) { size_t width = 3 * (nr - 1); + size_t min_width = strlen(name); for (size_t i = 0; i < nr; i++) width += widths[i]; + if (width < min_width) + width = min_width; + if (width > section_len) width = section_len; @@ -229,11 +249,43 @@ static void survey_report_plaintext_refs(struct survey_context *ctx) clear_table(&table); } +static void survey_report_plaintext_reachable_object_summary(struct survey_context *ctx) +{ + struct survey_report_object_summary *objs = &ctx->report.reachable_objects; + struct survey_table table = SURVEY_TABLE_INIT; + char *fmt; + + table.table_name = _("REACHABLE OBJECT SUMMARY"); + + strvec_push(&table.header, _("Object Type")); + strvec_push(&table.header, _("Count")); + + fmt = xstrfmt("%"PRIuMAX"", (uintmax_t)objs->tags_nr); + insert_table_rowv(&table, _("Tags"), fmt, NULL); + free(fmt); + + fmt = xstrfmt("%"PRIuMAX"", (uintmax_t)objs->commits_nr); + insert_table_rowv(&table, _("Commits"), fmt, NULL); + free(fmt); + + fmt = xstrfmt("%"PRIuMAX"", (uintmax_t)objs->trees_nr); + insert_table_rowv(&table, _("Trees"), fmt, NULL); + free(fmt); + + fmt = xstrfmt("%"PRIuMAX"", (uintmax_t)objs->blobs_nr); + insert_table_rowv(&table, _("Blobs"), fmt, NULL); + free(fmt); + + print_table_plaintext(&table); + clear_table(&table); +} + static void survey_report_plaintext(struct survey_context *ctx) { printf("GIT SURVEY for \"%s\"\n", ctx->repo->worktree); 
printf("-----------------------------------------------------\n"); survey_report_plaintext_refs(ctx); + survey_report_plaintext_reachable_object_summary(ctx); } /* @@ -382,15 +434,13 @@ static void do_load_refs(struct survey_context *ctx, */ static void survey_phase_refs(struct survey_context *ctx) { - struct ref_array ref_array = { 0 }; - trace2_region_enter("survey", "phase/refs", ctx->repo); - do_load_refs(ctx, &ref_array); + do_load_refs(ctx, &ctx->ref_array); - ctx->report.refs.refs_nr = ref_array.nr; - for (int i = 0; i < ref_array.nr; i++) { + ctx->report.refs.refs_nr = ctx->ref_array.nr; + for (int i = 0; i < ctx->ref_array.nr; i++) { unsigned long size; - struct ref_array_item *item = ref_array.items[i]; + struct ref_array_item *item = ctx->ref_array.items[i]; switch (item->kind) { case FILTER_REFS_TAGS: @@ -420,8 +470,72 @@ static void survey_phase_refs(struct survey_context *ctx) } trace2_region_leave("survey", "phase/refs", ctx->repo); +} + +static void increment_object_counts( + struct survey_report_object_summary *summary, + enum object_type type, + size_t nr) +{ + switch (type) { + case OBJ_COMMIT: + summary->commits_nr += nr; + break; - ref_array_clear(&ref_array); + case OBJ_TREE: + summary->trees_nr += nr; + break; + + case OBJ_BLOB: + summary->blobs_nr += nr; + break; + + case OBJ_TAG: + summary->tags_nr += nr; + break; + + default: + break; + } +} + +static int survey_objects_path_walk_fn(const char *path UNUSED, + struct oid_array *oids, + enum object_type type, + void *data) +{ + struct survey_context *ctx = data; + + increment_object_counts(&ctx->report.reachable_objects, + type, oids->nr); + + return 0; +} + +static void survey_phase_objects(struct survey_context *ctx) +{ + struct rev_info revs = REV_INFO_INIT; + struct path_walk_info info = PATH_WALK_INFO_INIT; + unsigned int add_flags = 0; + + trace2_region_enter("survey", "phase/objects", ctx->repo); + + info.revs = &revs; + info.path_fn = survey_objects_path_walk_fn; + info.path_fn_data 
= ctx; + + repo_init_revisions(ctx->repo, &revs, ""); + revs.tag_objects = 1; + + for (int i = 0; i < ctx->ref_array.nr; i++) { + struct ref_array_item *item = ctx->ref_array.items[i]; + add_pending_oid(&revs, NULL, &item->objectname, add_flags); + } + + walk_objects_by_path(&info); + + release_revisions(&revs); + trace2_region_leave("survey", "phase/objects", ctx->repo); } int cmd_survey(int argc, const char **argv, const char *prefix, struct repository *repo) @@ -474,6 +588,8 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor survey_phase_refs(&ctx); + survey_phase_objects(&ctx); + survey_report_plaintext(&ctx); clear_survey_context(&ctx); diff --git a/t/t8100-git-survey.sh b/t/t8100-git-survey.sh index e518e4844fe2d0..d3086784090352 100755 --- a/t/t8100-git-survey.sh +++ b/t/t8100-git-survey.sh @@ -16,11 +16,17 @@ test_expect_success 'git survey -h shows experimental warning' ' ' test_expect_success 'create a semi-interesting repo' ' - test_commit_bulk 10 + test_commit_bulk 10 && + git tag -a -m one one HEAD~5 && + git tag -a -m two two HEAD~3 && + git tag -a -m three three two && + git tag -a -m four four three && + git update-ref -d refs/tags/three && + git update-ref -d refs/tags/two ' test_expect_success 'git survey (default)' ' - git survey >out 2>err && + git survey --all-refs >out 2>err && test_line_count = 0 err && tr , " " >expect <<-EOF && @@ -33,8 +39,17 @@ test_expect_success 'git survey (default)' ' -----------------+------ , Branches | 1 Remote refs | 0 - Tags (all) | 0 - Tags (annotated) | 0 + Tags (all) | 2 + Tags (annotated) | 2 + + REACHABLE OBJECT SUMMARY + ======================== + Object Type | Count + ------------+------ + Tags | 4 + Commits | 10 + Trees | 10 + Blobs | 10 EOF test_cmp expect out From 7af6b9145efeb9e9576db06db9cc293e403197bf Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Sun, 1 Sep 2024 20:58:35 -0400 Subject: [PATCH 063/218] survey: summarize total sizes by object type Now that we have 
explored objects by count, we can expand that a bit more to summarize the data for the on-disk and inflated size of those objects. This information is helpful for diagnosing both why disk space (and perhaps clone or fetch times) is growing but also why certain operations are slow because the inflated size of the abstract objects that must be processed is so large. Note: zlib-ng is slightly more efficient even at those small sizes. Even between zlib versions, there are slight differences in compression. To accommodate for that in the tests, not the exact numbers but some rough approximations are validated (the test should validate `git survey`, after all, not zlib). Signed-off-by: Derrick Stolee Signed-off-by: Johannes Schindelin --- builtin/survey.c | 133 ++++++++++++++++++++++++++++++++++++++++++ t/t8100-git-survey.sh | 37 +++++++++++- 2 files changed, 169 insertions(+), 1 deletion(-) diff --git a/builtin/survey.c b/builtin/survey.c index 1e8b9c1e5492aa..1d1290553250a1 100644 --- a/builtin/survey.c +++ b/builtin/survey.c @@ -60,6 +60,19 @@ struct survey_report_object_summary { size_t blobs_nr; }; +/** + * For some category given by 'label', count the number of objects + * that match that label along with the on-disk size and the size + * after decompressing (both with delta bases and zlib). + */ +struct survey_report_object_size_summary { + char *label; + size_t nr; + size_t disk_size; + size_t inflated_size; + size_t num_missing; +}; + /** * This struct contains all of the information that needs to be printed * at the end of the exploration of the repository and its references. 
@@ -67,8 +80,16 @@ struct survey_report_object_summary { struct survey_report { struct survey_report_ref_summary refs; struct survey_report_object_summary reachable_objects; + + struct survey_report_object_size_summary *by_type; }; +#define REPORT_TYPE_COMMIT 0 +#define REPORT_TYPE_TREE 1 +#define REPORT_TYPE_BLOB 2 +#define REPORT_TYPE_TAG 3 +#define REPORT_TYPE_COUNT 4 + struct survey_context { struct repository *repo; @@ -280,12 +301,48 @@ static void survey_report_plaintext_reachable_object_summary(struct survey_conte clear_table(&table); } +static void survey_report_object_sizes(const char *title, + const char *categories, + struct survey_report_object_size_summary *summary, + size_t summary_nr) +{ + struct survey_table table = SURVEY_TABLE_INIT; + table.table_name = title; + + strvec_push(&table.header, categories); + strvec_push(&table.header, _("Count")); + strvec_push(&table.header, _("Disk Size")); + strvec_push(&table.header, _("Inflated Size")); + + for (size_t i = 0; i < summary_nr; i++) { + char *label_str = xstrdup(summary[i].label); + char *nr_str = xstrfmt("%"PRIuMAX, (uintmax_t)summary[i].nr); + char *disk_str = xstrfmt("%"PRIuMAX, (uintmax_t)summary[i].disk_size); + char *inflate_str = xstrfmt("%"PRIuMAX, (uintmax_t)summary[i].inflated_size); + + insert_table_rowv(&table, label_str, nr_str, + disk_str, inflate_str, NULL); + + free(label_str); + free(nr_str); + free(disk_str); + free(inflate_str); + } + + print_table_plaintext(&table); + clear_table(&table); +} + static void survey_report_plaintext(struct survey_context *ctx) { printf("GIT SURVEY for \"%s\"\n", ctx->repo->worktree); printf("-----------------------------------------------------\n"); survey_report_plaintext_refs(ctx); survey_report_plaintext_reachable_object_summary(ctx); + survey_report_object_sizes(_("TOTAL OBJECT SIZES BY TYPE"), + _("Object Type"), + ctx->report.by_type, + REPORT_TYPE_COUNT); } /* @@ -499,6 +556,69 @@ static void increment_object_counts( } } +static void 
increment_totals(struct survey_context *ctx, + struct oid_array *oids, + struct survey_report_object_size_summary *summary) +{ + for (size_t i = 0; i < oids->nr; i++) { + struct object_info oi = OBJECT_INFO_INIT; + unsigned oi_flags = OBJECT_INFO_FOR_PREFETCH; + unsigned long object_length = 0; + off_t disk_sizep = 0; + enum object_type type; + + oi.typep = &type; + oi.sizep = &object_length; + oi.disk_sizep = &disk_sizep; + + if (odb_read_object_info_extended(ctx->repo->objects, + &oids->oid[i], + &oi, oi_flags) < 0) { + summary->num_missing++; + } else { + summary->nr++; + summary->disk_size += disk_sizep; + summary->inflated_size += object_length; + } + } +} + +static void increment_object_totals(struct survey_context *ctx, + struct oid_array *oids, + enum object_type type) +{ + struct survey_report_object_size_summary *total; + struct survey_report_object_size_summary summary = { 0 }; + + increment_totals(ctx, oids, &summary); + + switch (type) { + case OBJ_COMMIT: + total = &ctx->report.by_type[REPORT_TYPE_COMMIT]; + break; + + case OBJ_TREE: + total = &ctx->report.by_type[REPORT_TYPE_TREE]; + break; + + case OBJ_BLOB: + total = &ctx->report.by_type[REPORT_TYPE_BLOB]; + break; + + case OBJ_TAG: + total = &ctx->report.by_type[REPORT_TYPE_TAG]; + break; + + default: + BUG("No other type allowed"); + } + + total->nr += summary.nr; + total->disk_size += summary.disk_size; + total->inflated_size += summary.inflated_size; + total->num_missing += summary.num_missing; +} + static int survey_objects_path_walk_fn(const char *path UNUSED, struct oid_array *oids, enum object_type type, @@ -508,10 +628,20 @@ static int survey_objects_path_walk_fn(const char *path UNUSED, increment_object_counts(&ctx->report.reachable_objects, type, oids->nr); + increment_object_totals(ctx, oids, type); return 0; } +static void initialize_report(struct survey_context *ctx) +{ + CALLOC_ARRAY(ctx->report.by_type, REPORT_TYPE_COUNT); + ctx->report.by_type[REPORT_TYPE_COMMIT].label = 
xstrdup(_("Commits")); + ctx->report.by_type[REPORT_TYPE_TREE].label = xstrdup(_("Trees")); + ctx->report.by_type[REPORT_TYPE_BLOB].label = xstrdup(_("Blobs")); + ctx->report.by_type[REPORT_TYPE_TAG].label = xstrdup(_("Tags")); +} + static void survey_phase_objects(struct survey_context *ctx) { struct rev_info revs = REV_INFO_INIT; @@ -524,12 +654,15 @@ static void survey_phase_objects(struct survey_context *ctx) info.path_fn = survey_objects_path_walk_fn; info.path_fn_data = ctx; + initialize_report(ctx); + repo_init_revisions(ctx->repo, &revs, ""); revs.tag_objects = 1; for (int i = 0; i < ctx->ref_array.nr; i++) { struct ref_array_item *item = ctx->ref_array.items[i]; add_pending_oid(&revs, NULL, &item->objectname, add_flags); + display_progress(ctx->progress, ++(ctx->progress_nr)); } walk_objects_by_path(&info); diff --git a/t/t8100-git-survey.sh b/t/t8100-git-survey.sh index d3086784090352..c2a6333145bac1 100755 --- a/t/t8100-git-survey.sh +++ b/t/t8100-git-survey.sh @@ -25,10 +25,35 @@ test_expect_success 'create a semi-interesting repo' ' git update-ref -d refs/tags/two ' +approximate_sizes() { + # very simplistic approximate rounding + sed -Ee "s/ *(1[0-9][0-9])( |$)/ ~0.1kB\2/g" \ + -e "s/ *(4[6-9][0-9]|5[0-6][0-9])( |$)/ ~0.5kB\2/g" \ + -e "s/ *(5[6-9][0-9]|6[0-6][0-9])( |$)/ ~0.6kB\2/g" \ + -e "s/ *1(4[89][0-9]|5[0-8][0-9])( |$)/ ~1.5kB\2/g" \ + -e "s/ *1(69[0-9]|7[0-9][0-9])( |$)/ ~1.7kB\2/g" \ + -e "s/ *1(79[0-9]|8[0-9][0-9])( |$)/ ~1.8kB\2/g" \ + -e "s/ *2(1[0-9][0-9]|20[0-1])( |$)/ ~2.1kB\2/g" \ + -e "s/ *2(3[0-9][0-9]|4[0-1][0-9])( |$)/ ~2.3kB\2/g" \ + -e "s/ *2(5[0-9][0-9]|6[0-1][0-9])( |$)/ ~2.5kB\2/g" \ + "$@" +} + test_expect_success 'git survey (default)' ' git survey --all-refs >out 2>err && test_line_count = 0 err && + test_oid_cache <<-EOF && + commits_sizes sha1:~1.5kB | ~2.1kB + commits_sizes sha256:~1.8kB | ~2.5kB + trees_sizes sha1:~0.5kB | ~1.7kB + trees_sizes sha256:~0.6kB | ~2.3kB + blobs_sizes sha1:~0.1kB | ~0.1kB + blobs_sizes 
sha256:~0.1kB | ~0.1kB + tags_sizes sha1:~0.5kB | ~0.5kB + tags_sizes sha256:~0.5kB | ~0.6kB + EOF + tr , " " >expect <<-EOF && GIT SURVEY for "$(pwd)" ----------------------------------------------------- @@ -50,9 +75,19 @@ test_expect_success 'git survey (default)' ' Commits | 10 Trees | 10 Blobs | 10 + + TOTAL OBJECT SIZES BY TYPE + =============================================== + Object Type | Count | Disk Size | Inflated Size + ------------+-------+-----------+-------------- + Commits | 10 | $(test_oid commits_sizes) + Trees | 10 | $(test_oid trees_sizes) + Blobs | 10 | $(test_oid blobs_sizes) + Tags | 4 | $(test_oid tags_sizes) EOF - test_cmp expect out + approximate_sizes out >out-edited && + test_cmp expect out-edited ' test_done From 32d24eb60a0bfc20c3fec5a3edbdf223b75e2ed5 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Sun, 1 Sep 2024 21:21:54 -0400 Subject: [PATCH 064/218] survey: show progress during object walk Signed-off-by: Derrick Stolee --- builtin/survey.c | 16 ++++++++++++++++ t/t8100-git-survey.sh | 5 +++++ 2 files changed, 21 insertions(+) diff --git a/builtin/survey.c b/builtin/survey.c index 1d1290553250a1..c570a1470122f4 100644 --- a/builtin/survey.c +++ b/builtin/survey.c @@ -630,6 +630,9 @@ static int survey_objects_path_walk_fn(const char *path UNUSED, type, oids->nr); increment_object_totals(ctx, oids, type); + ctx->progress_nr += oids->nr; + display_progress(ctx->progress, ctx->progress_nr); + return 0; } @@ -659,13 +662,26 @@ static void survey_phase_objects(struct survey_context *ctx) repo_init_revisions(ctx->repo, &revs, ""); revs.tag_objects = 1; + ctx->progress_nr = 0; + ctx->progress_total = ctx->ref_array.nr; + if (ctx->opts.show_progress) + ctx->progress = start_progress(ctx->repo, + _("Preparing object walk"), + ctx->progress_total); for (int i = 0; i < ctx->ref_array.nr; i++) { struct ref_array_item *item = ctx->ref_array.items[i]; add_pending_oid(&revs, NULL, &item->objectname, add_flags); 
display_progress(ctx->progress, ++(ctx->progress_nr)); } + stop_progress(&ctx->progress); + ctx->progress_nr = 0; + ctx->progress_total = 0; + if (ctx->opts.show_progress) + ctx->progress = start_progress(ctx->repo, + _("Walking objects"), 0); walk_objects_by_path(&info); + stop_progress(&ctx->progress); release_revisions(&revs); trace2_region_leave("survey", "phase/objects", ctx->repo); diff --git a/t/t8100-git-survey.sh b/t/t8100-git-survey.sh index c2a6333145bac1..118410be55cc2a 100755 --- a/t/t8100-git-survey.sh +++ b/t/t8100-git-survey.sh @@ -25,6 +25,11 @@ test_expect_success 'create a semi-interesting repo' ' git update-ref -d refs/tags/two ' +test_expect_success 'git survey --progress' ' + GIT_PROGRESS_DELAY=0 git survey --all-refs --progress >out 2>err && + grep "Preparing object walk" err +' + approximate_sizes() { # very simplistic approximate rounding sed -Ee "s/ *(1[0-9][0-9])( |$)/ ~0.1kB\2/g" \ From 90fa6a626e37480266850a696e0698f517f29262 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Sun, 1 Sep 2024 22:35:06 -0400 Subject: [PATCH 065/218] survey: add ability to track prioritized lists In future changes, we will make use of these methods. The intention is to keep track of the top contributors according to some metric. We don't want to store all of the entries and do a sort at the end, so track a constant-size table and remove rows that get pushed out depending on the chosen sorting algorithm. 
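As an illustration of the constant-size table described above, here is a minimal sketch in C. It is not the code this patch adds: `struct row`, `struct top_table`, `TOP_LIMIT` and `top_table_insert()` are hypothetical names chosen for the example, and the ordering is hard-coded to "largest value first" instead of going through a comparison callback.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical row type for illustration; not the struct used in builtin/survey.c. */
struct row {
	const char *label;
	size_t value;
};

#define TOP_LIMIT 3 /* illustrative capacity, fixed at compile time */

struct top_table {
	struct row rows[TOP_LIMIT];
	size_t nr; /* number of rows currently in use */
};

/*
 * Insert 'candidate' if it ranks among the TOP_LIMIT largest values seen
 * so far. Rows are kept in descending order of 'value'; when the table is
 * full, the smallest row is pushed out.
 */
void top_table_insert(struct top_table *t, struct row candidate)
{
	size_t pos = t->nr;

	/* Walk up from the bottom past every row smaller than the candidate. */
	while (pos > 0 && t->rows[pos - 1].value < candidate.value)
		pos--;

	/* The table is full and the candidate ranks below every kept row. */
	if (pos >= TOP_LIMIT)
		return;

	/* Grow while there is room; otherwise the last row gets overwritten. */
	if (t->nr < TOP_LIMIT)
		t->nr++;

	/* Shift the rows from 'pos' onwards down by one slot. */
	memmove(&t->rows[pos + 1], &t->rows[pos],
		(t->nr - 1 - pos) * sizeof(t->rows[0]));
	t->rows[pos] = candidate;
}
```

Feeding the values 5, 9, 1, 7 and 0 into an empty table leaves the rows 9, 7, 5: the row with value 1 is pushed out when 7 arrives, and 0 never enters. Each insertion costs at most TOP_LIMIT comparisons and moves, so no full list has to be stored and sorted at the end.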
Co-authored-by: Jeff Hostetler Signed-off-by: Jeff Hostetler Signed-off-by: Derrick Stolee --- builtin/survey.c | 113 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 113 insertions(+) diff --git a/builtin/survey.c b/builtin/survey.c index c570a1470122f4..5ff62fa4ab921c 100644 --- a/builtin/survey.c +++ b/builtin/survey.c @@ -73,6 +73,119 @@ struct survey_report_object_size_summary { size_t num_missing; }; +typedef int (*survey_top_cmp)(void *v1, void *v2); + +MAYBE_UNUSED +static int cmp_by_nr(void *v1, void *v2) +{ + struct survey_report_object_size_summary *s1 = v1; + struct survey_report_object_size_summary *s2 = v2; + + if (s1->nr < s2->nr) + return -1; + if (s1->nr > s2->nr) + return 1; + return 0; +} + +MAYBE_UNUSED +static int cmp_by_disk_size(void *v1, void *v2) +{ + struct survey_report_object_size_summary *s1 = v1; + struct survey_report_object_size_summary *s2 = v2; + + if (s1->disk_size < s2->disk_size) + return -1; + if (s1->disk_size > s2->disk_size) + return 1; + return 0; +} + +MAYBE_UNUSED +static int cmp_by_inflated_size(void *v1, void *v2) +{ + struct survey_report_object_size_summary *s1 = v1; + struct survey_report_object_size_summary *s2 = v2; + + if (s1->inflated_size < s2->inflated_size) + return -1; + if (s1->inflated_size > s2->inflated_size) + return 1; + return 0; +} + +/** + * Store a list of "top" categories by some sorting function. When + * inserting a new category, reorder the list and free the one that + * got ejected (if any). + */ +struct survey_report_top_table { + const char *name; + survey_top_cmp cmp_fn; + size_t nr; + size_t alloc; + + /** + * 'data' stores an array of structs and must be cast into + * the proper array type before evaluating an index. 
+ */ + void *data; +}; + +MAYBE_UNUSED +static void init_top_sizes(struct survey_report_top_table *top, + size_t limit, const char *name, + survey_top_cmp cmp) +{ + struct survey_report_object_size_summary *sz_array; + + top->name = name; + top->cmp_fn = cmp; + top->alloc = limit; + top->nr = 0; + + CALLOC_ARRAY(sz_array, limit); + top->data = sz_array; +} + +MAYBE_UNUSED +static void clear_top_sizes(struct survey_report_top_table *top) +{ + struct survey_report_object_size_summary *sz_array = top->data; + + for (size_t i = 0; i < top->nr; i++) + free(sz_array[i].label); + free(sz_array); +} + +MAYBE_UNUSED +static void maybe_insert_into_top_size(struct survey_report_top_table *top, + struct survey_report_object_size_summary *summary) +{ + struct survey_report_object_size_summary *sz_array = top->data; + size_t pos = top->nr; + + /* Compare against list from the bottom. */ + while (pos > 0 && top->cmp_fn(&sz_array[pos - 1], summary) < 0) + pos--; + + /* Not big enough! */ + if (pos >= top->alloc) + return; + + /* We need to shift the data. */ + if (top->nr == top->alloc) + free(sz_array[top->nr - 1].label); + else + top->nr++; + + for (size_t i = top->nr - 1; i > pos; i--) + memcpy(&sz_array[i], &sz_array[i - 1], sizeof(*sz_array)); + + memcpy(&sz_array[pos], summary, sizeof(*summary)); + sz_array[pos].label = xstrdup(summary->label); +} + /** * This struct contains all of the information that needs to be printed * at the end of the exploration of the repository and its references. From d8e4362d030197f8c85979579a141c288107d834 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Sun, 1 Sep 2024 22:35:40 -0400 Subject: [PATCH 066/218] survey: add report of "largest" paths MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Since we are already walking our reachable objects using the path-walk API, let's now collect lists of the paths that contribute most to different metrics. Specifically, we care about * Number of versions. 
* Total size on disk. * Total inflated size (no delta or zlib compression). This information can be critical to discovering which parts of the repository are causing the most growth, especially on-disk size. Different packing strategies might help compress data more efficiently, but the total inflated size is a representation of the raw size of all snapshots of those paths. Even when stored efficiently on disk, that size represents how much information must be processed to complete a command such as 'git blame'. The exact disk size seems to be not quite robust enough for testing, as could be seen by the `linux-musl-meson` job consistently failing, possibly because zlib-ng deflates differently: t8100.4(git survey (default)) was failing with a symptom like this: TOTAL OBJECT SIZES BY TYPE =============================================== Object Type | Count | Disk Size | Inflated Size ------------+-------+-----------+-------------- - Commits | 10 | 1523 | 2153 + Commits | 10 | 1528 | 2153 Trees | 10 | 495 | 1706 Blobs | 10 | 191 | 101 - Tags | 4 | 510 | 528 + Tags | 4 | 547 | 528 This means: the disk size is unlikely to be something we can verify robustly. Since zlib-ng seems to increase the disk size of the tags from 510 to 547, which is above their inflated size of 528, we cannot even assume that the disk size is always smaller than the inflated size. We will most likely want to either skip verifying the disk size altogether, or go for some kind of fuzzy matching, say, by replacing `s/ 1[45][0-9][0-9] / ~1.5k /` and `s/ [45][0-9][0-9] / ~½k /` or something like that.
Signed-off-by: Derrick Stolee Signed-off-by: Johannes Schindelin --- builtin/survey.c | 79 ++++++++++++++++++++++++++++++++++++++----- t/t8100-git-survey.sh | 12 ++++++- 2 files changed, 82 insertions(+), 9 deletions(-) diff --git a/builtin/survey.c b/builtin/survey.c index 5ff62fa4ab921c..2dd1eedfda74f1 100644 --- a/builtin/survey.c +++ b/builtin/survey.c @@ -75,7 +75,6 @@ struct survey_report_object_size_summary { typedef int (*survey_top_cmp)(void *v1, void *v2); -MAYBE_UNUSED static int cmp_by_nr(void *v1, void *v2) { struct survey_report_object_size_summary *s1 = v1; @@ -88,7 +87,6 @@ static int cmp_by_nr(void *v1, void *v2) return 0; } -MAYBE_UNUSED static int cmp_by_disk_size(void *v1, void *v2) { struct survey_report_object_size_summary *s1 = v1; @@ -101,7 +99,6 @@ static int cmp_by_disk_size(void *v1, void *v2) return 0; } -MAYBE_UNUSED static int cmp_by_inflated_size(void *v1, void *v2) { struct survey_report_object_size_summary *s1 = v1; @@ -132,7 +129,6 @@ struct survey_report_top_table { void *data; }; -MAYBE_UNUSED static void init_top_sizes(struct survey_report_top_table *top, size_t limit, const char *name, survey_top_cmp cmp) @@ -158,7 +154,6 @@ static void clear_top_sizes(struct survey_report_top_table *top) free(sz_array); } -MAYBE_UNUSED static void maybe_insert_into_top_size(struct survey_report_top_table *top, struct survey_report_object_size_summary *summary) { @@ -195,6 +190,10 @@ struct survey_report { struct survey_report_object_summary reachable_objects; struct survey_report_object_size_summary *by_type; + + struct survey_report_top_table *top_paths_by_count; + struct survey_report_top_table *top_paths_by_disk; + struct survey_report_top_table *top_paths_by_inflate; }; #define REPORT_TYPE_COMMIT 0 @@ -446,6 +445,13 @@ static void survey_report_object_sizes(const char *title, clear_table(&table); } +static void survey_report_plaintext_sorted_size( + struct survey_report_top_table *top) +{ + survey_report_object_sizes(top->name, _("Path"), 
+ top->data, top->nr); +} + static void survey_report_plaintext(struct survey_context *ctx) { printf("GIT SURVEY for \"%s\"\n", ctx->repo->worktree); @@ -456,6 +462,21 @@ static void survey_report_plaintext(struct survey_context *ctx) _("Object Type"), ctx->report.by_type, REPORT_TYPE_COUNT); + + survey_report_plaintext_sorted_size( + &ctx->report.top_paths_by_count[REPORT_TYPE_TREE]); + survey_report_plaintext_sorted_size( + &ctx->report.top_paths_by_count[REPORT_TYPE_BLOB]); + + survey_report_plaintext_sorted_size( + &ctx->report.top_paths_by_disk[REPORT_TYPE_TREE]); + survey_report_plaintext_sorted_size( + &ctx->report.top_paths_by_disk[REPORT_TYPE_BLOB]); + + survey_report_plaintext_sorted_size( + &ctx->report.top_paths_by_inflate[REPORT_TYPE_TREE]); + survey_report_plaintext_sorted_size( + &ctx->report.top_paths_by_inflate[REPORT_TYPE_BLOB]); } /* @@ -698,7 +719,8 @@ static void increment_totals(struct survey_context *ctx, static void increment_object_totals(struct survey_context *ctx, struct oid_array *oids, - enum object_type type) + enum object_type type, + const char *path) { struct survey_report_object_size_summary *total; struct survey_report_object_size_summary summary = { 0 }; @@ -730,9 +752,30 @@ static void increment_object_totals(struct survey_context *ctx, total->disk_size += summary.disk_size; total->inflated_size += summary.inflated_size; total->num_missing += summary.num_missing; + + if (type == OBJ_TREE || type == OBJ_BLOB) { + int index = type == OBJ_TREE ? + REPORT_TYPE_TREE : REPORT_TYPE_BLOB; + struct survey_report_top_table *top; + + /* + * Temporarily store (const char *) here, but it will + * be duped if inserted and will not be freed. 
+ */ + summary.label = (char *)path; + + top = ctx->report.top_paths_by_count; + maybe_insert_into_top_size(&top[index], &summary); + + top = ctx->report.top_paths_by_disk; + maybe_insert_into_top_size(&top[index], &summary); + + top = ctx->report.top_paths_by_inflate; + maybe_insert_into_top_size(&top[index], &summary); + } } -static int survey_objects_path_walk_fn(const char *path UNUSED, +static int survey_objects_path_walk_fn(const char *path, struct oid_array *oids, enum object_type type, void *data) @@ -741,7 +784,7 @@ static int survey_objects_path_walk_fn(const char *path UNUSED, increment_object_counts(&ctx->report.reachable_objects, type, oids->nr); - increment_object_totals(ctx, oids, type); + increment_object_totals(ctx, oids, type, path); ctx->progress_nr += oids->nr; display_progress(ctx->progress, ctx->progress_nr); @@ -751,11 +794,31 @@ static int survey_objects_path_walk_fn(const char *path UNUSED, static void initialize_report(struct survey_context *ctx) { + const int top_limit = 100; + CALLOC_ARRAY(ctx->report.by_type, REPORT_TYPE_COUNT); ctx->report.by_type[REPORT_TYPE_COMMIT].label = xstrdup(_("Commits")); ctx->report.by_type[REPORT_TYPE_TREE].label = xstrdup(_("Trees")); ctx->report.by_type[REPORT_TYPE_BLOB].label = xstrdup(_("Blobs")); ctx->report.by_type[REPORT_TYPE_TAG].label = xstrdup(_("Tags")); + + CALLOC_ARRAY(ctx->report.top_paths_by_count, REPORT_TYPE_COUNT); + init_top_sizes(&ctx->report.top_paths_by_count[REPORT_TYPE_TREE], + top_limit, _("TOP DIRECTORIES BY COUNT"), cmp_by_nr); + init_top_sizes(&ctx->report.top_paths_by_count[REPORT_TYPE_BLOB], + top_limit, _("TOP FILES BY COUNT"), cmp_by_nr); + + CALLOC_ARRAY(ctx->report.top_paths_by_disk, REPORT_TYPE_COUNT); + init_top_sizes(&ctx->report.top_paths_by_disk[REPORT_TYPE_TREE], + top_limit, _("TOP DIRECTORIES BY DISK SIZE"), cmp_by_disk_size); + init_top_sizes(&ctx->report.top_paths_by_disk[REPORT_TYPE_BLOB], + top_limit, _("TOP FILES BY DISK SIZE"), cmp_by_disk_size); + + 
CALLOC_ARRAY(ctx->report.top_paths_by_inflate, REPORT_TYPE_COUNT); + init_top_sizes(&ctx->report.top_paths_by_inflate[REPORT_TYPE_TREE], + top_limit, _("TOP DIRECTORIES BY INFLATED SIZE"), cmp_by_inflated_size); + init_top_sizes(&ctx->report.top_paths_by_inflate[REPORT_TYPE_BLOB], + top_limit, _("TOP FILES BY INFLATED SIZE"), cmp_by_inflated_size); } static void survey_phase_objects(struct survey_context *ctx) diff --git a/t/t8100-git-survey.sh b/t/t8100-git-survey.sh index 118410be55cc2a..1ba48cc47e1b35 100755 --- a/t/t8100-git-survey.sh +++ b/t/t8100-git-survey.sh @@ -92,7 +92,17 @@ test_expect_success 'git survey (default)' ' EOF approximate_sizes out >out-edited && - test_cmp expect out-edited + lines=$(wc -l out-trimmed && + test_cmp expect out-trimmed && + + for type in "DIRECTORIES" "FILES" + do + for metric in "COUNT" "DISK SIZE" "INFLATED SIZE" + do + grep "TOP $type BY $metric" out || return 1 + done || return 1 + done ' test_done From 48d9e81164a9c13c664b6e00d430bef3095feb5b Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Mon, 23 Sep 2024 15:38:25 -0400 Subject: [PATCH 067/218] survey: add --top= option and config The 'git survey' builtin provides several detail tables, such as "top files by on-disk size". The size of these tables defaults to 10, currently. Allow the user to specify this number via a new --top= option or the new survey.top config key. Signed-off-by: Derrick Stolee Signed-off-by: Johannes Schindelin --- Documentation/config/survey.adoc | 3 +++ builtin/survey.c | 22 ++++++++++++++-------- 2 files changed, 17 insertions(+), 8 deletions(-) diff --git a/Documentation/config/survey.adoc b/Documentation/config/survey.adoc index c1b0f852a1250e..9e594a2092f225 100644 --- a/Documentation/config/survey.adoc +++ b/Documentation/config/survey.adoc @@ -8,4 +8,7 @@ survey.*:: This boolean value implies the `--[no-]verbose` option. progress:: This boolean value implies the `--[no-]progress` option. 
+ top:: + This integer value implies `--top=`, specifying the + number of entries in the detail tables. -- diff --git a/builtin/survey.c b/builtin/survey.c index 2dd1eedfda74f1..c1d78222146628 100644 --- a/builtin/survey.c +++ b/builtin/survey.c @@ -40,6 +40,7 @@ static struct survey_refs_wanted default_ref_options = { struct survey_opts { int verbose; int show_progress; + int top_nr; struct survey_refs_wanted refs; }; @@ -548,6 +549,10 @@ static int survey_load_config_cb(const char *var, const char *value, ctx->opts.show_progress = git_config_bool(var, value); return 0; } + if (!strcmp(var, "survey.top")) { + ctx->opts.top_nr = git_config_int(var, value, cctx->kvi); + return 0; + } return git_default_config(var, value, cctx, pvoid); } @@ -794,8 +799,6 @@ static int survey_objects_path_walk_fn(const char *path, static void initialize_report(struct survey_context *ctx) { - const int top_limit = 100; - CALLOC_ARRAY(ctx->report.by_type, REPORT_TYPE_COUNT); ctx->report.by_type[REPORT_TYPE_COMMIT].label = xstrdup(_("Commits")); ctx->report.by_type[REPORT_TYPE_TREE].label = xstrdup(_("Trees")); @@ -804,21 +807,21 @@ static void initialize_report(struct survey_context *ctx) CALLOC_ARRAY(ctx->report.top_paths_by_count, REPORT_TYPE_COUNT); init_top_sizes(&ctx->report.top_paths_by_count[REPORT_TYPE_TREE], - top_limit, _("TOP DIRECTORIES BY COUNT"), cmp_by_nr); + ctx->opts.top_nr, _("TOP DIRECTORIES BY COUNT"), cmp_by_nr); init_top_sizes(&ctx->report.top_paths_by_count[REPORT_TYPE_BLOB], - top_limit, _("TOP FILES BY COUNT"), cmp_by_nr); + ctx->opts.top_nr, _("TOP FILES BY COUNT"), cmp_by_nr); CALLOC_ARRAY(ctx->report.top_paths_by_disk, REPORT_TYPE_COUNT); init_top_sizes(&ctx->report.top_paths_by_disk[REPORT_TYPE_TREE], - top_limit, _("TOP DIRECTORIES BY DISK SIZE"), cmp_by_disk_size); + ctx->opts.top_nr, _("TOP DIRECTORIES BY DISK SIZE"), cmp_by_disk_size); init_top_sizes(&ctx->report.top_paths_by_disk[REPORT_TYPE_BLOB], - top_limit, _("TOP FILES BY DISK SIZE"), cmp_by_disk_size); + 
ctx->opts.top_nr, _("TOP FILES BY DISK SIZE"), cmp_by_disk_size); CALLOC_ARRAY(ctx->report.top_paths_by_inflate, REPORT_TYPE_COUNT); init_top_sizes(&ctx->report.top_paths_by_inflate[REPORT_TYPE_TREE], - top_limit, _("TOP DIRECTORIES BY INFLATED SIZE"), cmp_by_inflated_size); + ctx->opts.top_nr, _("TOP DIRECTORIES BY INFLATED SIZE"), cmp_by_inflated_size); init_top_sizes(&ctx->report.top_paths_by_inflate[REPORT_TYPE_BLOB], - top_limit, _("TOP FILES BY INFLATED SIZE"), cmp_by_inflated_size); + ctx->opts.top_nr, _("TOP FILES BY INFLATED SIZE"), cmp_by_inflated_size); } static void survey_phase_objects(struct survey_context *ctx) @@ -869,6 +872,7 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor .opts = { .verbose = 0, .show_progress = -1, /* defaults to isatty(2) */ + .top_nr = 10, .refs.want_all_refs = -1, @@ -884,6 +888,8 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor static struct option survey_options[] = { OPT__VERBOSE(&ctx.opts.verbose, N_("verbose output")), OPT_BOOL(0, "progress", &ctx.opts.show_progress, N_("show progress")), + OPT_INTEGER('n', "top", &ctx.opts.top_nr, + N_("number of entries to include in detail tables")), OPT_BOOL_F(0, "all-refs", &ctx.opts.refs.want_all_refs, N_("include all refs"), PARSE_OPT_NONEG), From 5d4e035411f1f5443697980768afb5b9113c0a60 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Fri, 6 Sep 2024 14:16:13 -0400 Subject: [PATCH 068/218] revision: create mark_trees_uninteresting_dense() The sparse tree walk algorithm was created in d5d2e93577e (revision: implement sparse algorithm, 2019-01-16) and involves using the mark_trees_uninteresting_sparse() method. This method takes a repository and an oidset of tree IDs, some of which have the UNINTERESTING flag and some of which do not. 
Create a method that has an equivalent set of preconditions but uses a "dense" walk (recursively visits all reachable trees, as long as they have not previously been marked UNINTERESTING). This is an important difference from mark_tree_uninteresting(), which short-circuits if the given tree has the UNINTERESTING flag. A use of this method will be added in a later change, with a condition that selects whether the sparse or dense approach should be used. Signed-off-by: Derrick Stolee From 03bcb28fd8a8b0e5d5dbb3ebfa4587eb9ec450e0 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 1 Jul 2024 23:28:45 +0200 Subject: [PATCH 069/218] survey: clearly note the experimental nature in the output While this command is definitely something we _want_, chances are that upstreaming this will require substantial changes. We still want to be able to experiment with this before that, to focus on what we need out of this command: To assist with diagnosing issues with large repositories, as well as to help monitor the growth and the associated pain points of such repositories. To that end, we are about to integrate this command into `microsoft/git`, to get the tool into the hands of users who need it most, with the idea to iterate in close collaboration between these users and the developers familiar with Git's internals. However, we will definitely want to avoid letting anybody have the impression that this command, its exact inner workings, as well as its output format, are anywhere close to stable. To make that fact utterly clear (and thereby protect the freedom to iterate and innovate freely before upstreaming the command), let's mark its output as experimental in all-caps, as the first thing we do.
Signed-off-by: Johannes Schindelin --- builtin/survey.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/builtin/survey.c b/builtin/survey.c index c1d78222146628..f40905fb2fd57a 100644 --- a/builtin/survey.c +++ b/builtin/survey.c @@ -17,6 +17,7 @@ #include "strvec.h" #include "tag.h" #include "trace2.h" +#include "color.h" static const char * const survey_usage[] = { N_("(EXPERIMENTAL!) git survey "), @@ -905,6 +906,11 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor show_usage_with_options_if_asked(argc, argv, survey_usage, survey_options); + if (isatty(2)) + color_fprintf_ln(stderr, + want_color_fd(2, GIT_COLOR_AUTO) ? GIT_COLOR_YELLOW : "", + "(THIS IS EXPERIMENTAL, EXPECT THE OUTPUT FORMAT TO CHANGE!)"); + ctx.repo = repo; prepare_repo_settings(ctx.repo); From 11d00975cf58efbf1ebafc526a45153a30fb29d9 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 14 Nov 2019 20:09:23 +0100 Subject: [PATCH 070/218] mingw: make sure `errno` is set correctly when socket operations fail The winsock2 library provides functions that work on different data types than file descriptors, therefore we wrap them. But that is not the only difference: they also do not set `errno` but expect the callers to enquire about errors via `WSAGetLastError()`. Let's translate that into appropriate `errno` values whenever the socket operations fail so that Git's code base does not have to change its expectations. 
This closes https://github.com/git-for-windows/git/issues/2404 Helped-by: Jeff Hostetler Signed-off-by: Johannes Schindelin --- compat/mingw.c | 157 +++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 147 insertions(+), 10 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index feefa2cd0eb12a..95707a8cfd1ed2 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2364,18 +2364,150 @@ static void ensure_socket_initialization(void) initialized = 1; } +static int winsock_error_to_errno(DWORD err) +{ + switch (err) { + case WSAEINTR: return EINTR; + case WSAEBADF: return EBADF; + case WSAEACCES: return EACCES; + case WSAEFAULT: return EFAULT; + case WSAEINVAL: return EINVAL; + case WSAEMFILE: return EMFILE; + case WSAEWOULDBLOCK: return EWOULDBLOCK; + case WSAEINPROGRESS: return EINPROGRESS; + case WSAEALREADY: return EALREADY; + case WSAENOTSOCK: return ENOTSOCK; + case WSAEDESTADDRREQ: return EDESTADDRREQ; + case WSAEMSGSIZE: return EMSGSIZE; + case WSAEPROTOTYPE: return EPROTOTYPE; + case WSAENOPROTOOPT: return ENOPROTOOPT; + case WSAEPROTONOSUPPORT: return EPROTONOSUPPORT; + case WSAEOPNOTSUPP: return EOPNOTSUPP; + case WSAEAFNOSUPPORT: return EAFNOSUPPORT; + case WSAEADDRINUSE: return EADDRINUSE; + case WSAEADDRNOTAVAIL: return EADDRNOTAVAIL; + case WSAENETDOWN: return ENETDOWN; + case WSAENETUNREACH: return ENETUNREACH; + case WSAENETRESET: return ENETRESET; + case WSAECONNABORTED: return ECONNABORTED; + case WSAECONNRESET: return ECONNRESET; + case WSAENOBUFS: return ENOBUFS; + case WSAEISCONN: return EISCONN; + case WSAENOTCONN: return ENOTCONN; + case WSAETIMEDOUT: return ETIMEDOUT; + case WSAECONNREFUSED: return ECONNREFUSED; + case WSAELOOP: return ELOOP; + case WSAENAMETOOLONG: return ENAMETOOLONG; + case WSAEHOSTUNREACH: return EHOSTUNREACH; + case WSAENOTEMPTY: return ENOTEMPTY; + /* No errno equivalent; default to EIO */ + case WSAESOCKTNOSUPPORT: + case WSAEPFNOSUPPORT: + case WSAESHUTDOWN: + case WSAETOOMANYREFS: + case WSAEHOSTDOWN: 
+ case WSAEPROCLIM: + case WSAEUSERS: + case WSAEDQUOT: + case WSAESTALE: + case WSAEREMOTE: + case WSASYSNOTREADY: + case WSAVERNOTSUPPORTED: + case WSANOTINITIALISED: + case WSAEDISCON: + case WSAENOMORE: + case WSAECANCELLED: + case WSAEINVALIDPROCTABLE: + case WSAEINVALIDPROVIDER: + case WSAEPROVIDERFAILEDINIT: + case WSASYSCALLFAILURE: + case WSASERVICE_NOT_FOUND: + case WSATYPE_NOT_FOUND: + case WSA_E_NO_MORE: + case WSA_E_CANCELLED: + case WSAEREFUSED: + case WSAHOST_NOT_FOUND: + case WSATRY_AGAIN: + case WSANO_RECOVERY: + case WSANO_DATA: + case WSA_QOS_RECEIVERS: + case WSA_QOS_SENDERS: + case WSA_QOS_NO_SENDERS: + case WSA_QOS_NO_RECEIVERS: + case WSA_QOS_REQUEST_CONFIRMED: + case WSA_QOS_ADMISSION_FAILURE: + case WSA_QOS_POLICY_FAILURE: + case WSA_QOS_BAD_STYLE: + case WSA_QOS_BAD_OBJECT: + case WSA_QOS_TRAFFIC_CTRL_ERROR: + case WSA_QOS_GENERIC_ERROR: + case WSA_QOS_ESERVICETYPE: + case WSA_QOS_EFLOWSPEC: + case WSA_QOS_EPROVSPECBUF: + case WSA_QOS_EFILTERSTYLE: + case WSA_QOS_EFILTERTYPE: + case WSA_QOS_EFILTERCOUNT: + case WSA_QOS_EOBJLENGTH: + case WSA_QOS_EFLOWCOUNT: +#ifndef _MSC_VER + case WSA_QOS_EUNKNOWNPSOBJ: +#endif + case WSA_QOS_EPOLICYOBJ: + case WSA_QOS_EFLOWDESC: + case WSA_QOS_EPSFLOWSPEC: + case WSA_QOS_EPSFILTERSPEC: + case WSA_QOS_ESDMODEOBJ: + case WSA_QOS_ESHAPERATEOBJ: + case WSA_QOS_RESERVED_PETYPE: + default: return EIO; + } +} + +/* + * On Windows, `errno` is a global macro to a function call. + * This makes it difficult to debug and single-step our mappings. 
+ */ +static inline void set_wsa_errno(void) +{ + DWORD wsa = WSAGetLastError(); + int e = winsock_error_to_errno(wsa); + errno = e; + +#ifdef DEBUG_WSA_ERRNO + fprintf(stderr, "winsock error: %d -> %d\n", wsa, e); + fflush(stderr); +#endif +} + +static inline int winsock_return(int ret) +{ + if (ret < 0) + set_wsa_errno(); + + return ret; +} + +#define WINSOCK_RETURN(x) do { return winsock_return(x); } while (0) + #undef gethostname int mingw_gethostname(char *name, int namelen) { - ensure_socket_initialization(); - return gethostname(name, namelen); + ensure_socket_initialization(); + WINSOCK_RETURN(gethostname(name, namelen)); } #undef gethostbyname struct hostent *mingw_gethostbyname(const char *host) { + struct hostent *ret; + ensure_socket_initialization(); - return gethostbyname(host); + + ret = gethostbyname(host); + if (!ret) + set_wsa_errno(); + + return ret; } #undef getaddrinfo @@ -2383,7 +2515,7 @@ int mingw_getaddrinfo(const char *node, const char *service, const struct addrinfo *hints, struct addrinfo **res) { ensure_socket_initialization(); - return getaddrinfo(node, service, hints, res); + WINSOCK_RETURN(getaddrinfo(node, service, hints, res)); } int mingw_socket(int domain, int type, int protocol) @@ -2403,7 +2535,7 @@ int mingw_socket(int domain, int type, int protocol) * in errno so that _if_ someone looks up the code somewhere, * then it is at least the number that are usually listed. 
*/ - errno = WSAGetLastError(); + set_wsa_errno(); return -1; } /* convert into a file descriptor */ @@ -2419,35 +2551,35 @@ int mingw_socket(int domain, int type, int protocol) int mingw_connect(int sockfd, struct sockaddr *sa, size_t sz) { SOCKET s = (SOCKET)_get_osfhandle(sockfd); - return connect(s, sa, sz); + WINSOCK_RETURN(connect(s, sa, sz)); } #undef bind int mingw_bind(int sockfd, struct sockaddr *sa, size_t sz) { SOCKET s = (SOCKET)_get_osfhandle(sockfd); - return bind(s, sa, sz); + WINSOCK_RETURN(bind(s, sa, sz)); } #undef setsockopt int mingw_setsockopt(int sockfd, int lvl, int optname, void *optval, int optlen) { SOCKET s = (SOCKET)_get_osfhandle(sockfd); - return setsockopt(s, lvl, optname, (const char*)optval, optlen); + WINSOCK_RETURN(setsockopt(s, lvl, optname, (const char*)optval, optlen)); } #undef shutdown int mingw_shutdown(int sockfd, int how) { SOCKET s = (SOCKET)_get_osfhandle(sockfd); - return shutdown(s, how); + WINSOCK_RETURN(shutdown(s, how)); } #undef listen int mingw_listen(int sockfd, int backlog) { SOCKET s = (SOCKET)_get_osfhandle(sockfd); - return listen(s, backlog); + WINSOCK_RETURN(listen(s, backlog)); } #undef accept @@ -2458,6 +2590,11 @@ int mingw_accept(int sockfd1, struct sockaddr *sa, socklen_t *sz) SOCKET s1 = (SOCKET)_get_osfhandle(sockfd1); SOCKET s2 = accept(s1, sa, sz); + if (s2 == INVALID_SOCKET) { + set_wsa_errno(); + return -1; + } + /* convert into a file descriptor */ if ((sockfd2 = _open_osfhandle(s2, O_RDWR|O_BINARY)) < 0) { int err = errno; From c21db0a7a050622dea5f41cf94286a7c6232ba5b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= Date: Sun, 22 Dec 2024 17:15:39 +0100 Subject: [PATCH 071/218] compat/mingw: handle WSA errors in strerror MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit We map WSAGetLastError() errors to errno errors in winsock_error_to_errno(), but the MSVC strerror() implementation only produces "Unknown error" for most of them. 
Produce some more meaningful error messages in these cases. Our builds for ARM64 link against the newer UCRT strerror() that does know these errors, so we won't change the strerror() used there. The wording of the messages is copied from glibc strerror() messages. Reported-by: M Hickford Signed-off-by: Matthias Aßhauer Signed-off-by: Johannes Schindelin --- Makefile | 1 + compat/mingw-posix.h | 5 +++ compat/mingw.c | 85 ++++++++++++++++++++++++++++++++++++++++++ t/meson.build | 1 + t/unit-tests/u-mingw.c | 72 +++++++++++++++++++++++++++++++++++ 5 files changed, 164 insertions(+) create mode 100644 t/unit-tests/u-mingw.c diff --git a/Makefile b/Makefile index cedc234173e377..05fd5d24445c21 100644 --- a/Makefile +++ b/Makefile @@ -1527,6 +1527,7 @@ CLAR_TEST_SUITES += u-hash CLAR_TEST_SUITES += u-hashmap CLAR_TEST_SUITES += u-list-objects-filter-options CLAR_TEST_SUITES += u-mem-pool +CLAR_TEST_SUITES += u-mingw CLAR_TEST_SUITES += u-oid-array CLAR_TEST_SUITES += u-oidmap CLAR_TEST_SUITES += u-oidtree diff --git a/compat/mingw-posix.h b/compat/mingw-posix.h index 2d989fd762474e..da934834a1a177 100644 --- a/compat/mingw-posix.h +++ b/compat/mingw-posix.h @@ -288,6 +288,11 @@ int mingw_socket(int domain, int type, int protocol); int mingw_connect(int sockfd, struct sockaddr *sa, size_t sz); #define connect mingw_connect +char *mingw_strerror(int errnum); +#ifndef _UCRT +#define strerror mingw_strerror +#endif + int mingw_bind(int sockfd, struct sockaddr *sa, size_t sz); #define bind mingw_bind diff --git a/compat/mingw.c b/compat/mingw.c index 95707a8cfd1ed2..6b2b3948e33b05 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2489,6 +2489,91 @@ static inline int winsock_return(int ret) #define WINSOCK_RETURN(x) do { return winsock_return(x); } while (0) +#undef strerror +char *mingw_strerror(int errnum) +{ + static char buf[41] =""; + switch (errnum) { + case EWOULDBLOCK: + xsnprintf(buf, 41, "%s", "Operation would block"); + break; + case EINPROGRESS: + xsnprintf(buf, 
41, "%s", "Operation now in progress"); + break; + case EALREADY: + xsnprintf(buf, 41, "%s", "Operation already in progress"); + break; + case ENOTSOCK: + xsnprintf(buf, 41, "%s", "Socket operation on non-socket"); + break; + case EDESTADDRREQ: + xsnprintf(buf, 41, "%s", "Destination address required"); + break; + case EMSGSIZE: + xsnprintf(buf, 41, "%s", "Message too long"); + break; + case EPROTOTYPE: + xsnprintf(buf, 41, "%s", "Protocol wrong type for socket"); + break; + case ENOPROTOOPT: + xsnprintf(buf, 41, "%s", "Protocol not available"); + break; + case EPROTONOSUPPORT: + xsnprintf(buf, 41, "%s", "Protocol not supported"); + break; + case EOPNOTSUPP: + xsnprintf(buf, 41, "%s", "Operation not supported"); + break; + case EAFNOSUPPORT: + xsnprintf(buf, 41, "%s", "Address family not supported by protocol"); + break; + case EADDRINUSE: + xsnprintf(buf, 41, "%s", "Address already in use"); + break; + case EADDRNOTAVAIL: + xsnprintf(buf, 41, "%s", "Cannot assign requested address"); + break; + case ENETDOWN: + xsnprintf(buf, 41, "%s", "Network is down"); + break; + case ENETUNREACH: + xsnprintf(buf, 41, "%s", "Network is unreachable"); + break; + case ENETRESET: + xsnprintf(buf, 41, "%s", "Network dropped connection on reset"); + break; + case ECONNABORTED: + xsnprintf(buf, 41, "%s", "Software caused connection abort"); + break; + case ECONNRESET: + xsnprintf(buf, 41, "%s", "Connection reset by peer"); + break; + case ENOBUFS: + xsnprintf(buf, 41, "%s", "No buffer space available"); + break; + case EISCONN: + xsnprintf(buf, 41, "%s", "Transport endpoint is already connected"); + break; + case ENOTCONN: + xsnprintf(buf, 41, "%s", "Transport endpoint is not connected"); + break; + case ETIMEDOUT: + xsnprintf(buf, 41, "%s", "Connection timed out"); + break; + case ECONNREFUSED: + xsnprintf(buf, 41, "%s", "Connection refused"); + break; + case ELOOP: + xsnprintf(buf, 41, "%s", "Too many levels of symbolic links"); + break; + case EHOSTUNREACH: + xsnprintf(buf, 41, 
"%s", "No route to host"); + break; + default: return strerror(errnum); + } + return buf; +} + #undef gethostname int mingw_gethostname(char *name, int namelen) { diff --git a/t/meson.build b/t/meson.build index 7528e5cda5fef0..d88b494db5e924 100644 --- a/t/meson.build +++ b/t/meson.build @@ -6,6 +6,7 @@ clar_test_suites = [ 'unit-tests/u-hashmap.c', 'unit-tests/u-list-objects-filter-options.c', 'unit-tests/u-mem-pool.c', + 'unit-tests/u-mingw.c', 'unit-tests/u-oid-array.c', 'unit-tests/u-oidmap.c', 'unit-tests/u-oidtree.c', diff --git a/t/unit-tests/u-mingw.c b/t/unit-tests/u-mingw.c new file mode 100644 index 00000000000000..cb74da5e793a33 --- /dev/null +++ b/t/unit-tests/u-mingw.c @@ -0,0 +1,72 @@ +#include "unit-test.h" + +#if defined(GIT_WINDOWS_NATIVE) && !defined(_UCRT) +#undef strerror +int errnos_contains(int); +static int errnos [53]={ + /* errnos in err_win_to_posix */ + EACCES, EBUSY, EEXIST, ERANGE, EIO, ENODEV, ENXIO, ENOEXEC, EINVAL, ENOENT, + EPIPE, ENAMETOOLONG, ENOSYS, ENOTEMPTY, ENOSPC, EFAULT, EBADF, EPERM, EINTR, + E2BIG, ESPIPE, ENOMEM, EXDEV, EAGAIN, ENFILE, EMFILE, ECHILD, EROFS, + /* errnos only in winsock_error_to_errno */ + EWOULDBLOCK, EINPROGRESS, EALREADY, ENOTSOCK, EDESTADDRREQ, EMSGSIZE, + EPROTOTYPE, ENOPROTOOPT, EPROTONOSUPPORT, EOPNOTSUPP, EAFNOSUPPORT, + EADDRINUSE, EADDRNOTAVAIL, ENETDOWN, ENETUNREACH, ENETRESET, ECONNABORTED, + ECONNRESET, ENOBUFS, EISCONN, ENOTCONN, ETIMEDOUT, ECONNREFUSED, ELOOP, + EHOSTUNREACH + }; + +int errnos_contains(int errnum) +{ + for(int i=0;i<53;i++) + if(errnos[i]==errnum) + return 1; + return 0; +} +#endif + +void test_mingw__no_strerror_shim_on_ucrt(void) +{ +#if defined(GIT_WINDOWS_NATIVE) && defined(_UCRT) + cl_assert_(strerror != mingw_strerror, + "mingw_strerror is unnescessary when building against UCRT"); +#else + cl_skip(); +#endif +} + +void test_mingw__strerror(void) +{ +#if defined(GIT_WINDOWS_NATIVE) && !defined(_UCRT) + for(int i=0;i<53;i++) + { + char *crt; + char *mingw; + mingw = 
mingw_strerror(errnos[i]); + crt = strerror(errnos[i]); + cl_assert_(!strcasestr(mingw, "unknown error"), + "mingw_strerror should know all errno values we care about"); + if(!strcasestr(crt, "unknown error")) + cl_assert_equal_s(crt,mingw); + } +#else + cl_skip(); +#endif +} + +void test_mingw__errno_translation(void) +{ +#if defined(GIT_WINDOWS_NATIVE) && !defined(_UCRT) + /* GetLastError() return values are currently defined from 0 to 15841, + testing up to 20000 covers some room for future expansion */ + for (int i=0;i<20000;i++) + { + if(i!=ERROR_SUCCESS) + cl_assert_(errnos_contains(err_win_to_posix(i)), + "all err_win_to_posix return values should be tested against mingw_strerror"); + /* ideally we'd test the same for winsock_error_to_errno, but it's static */ + } +#else + cl_skip(); +#endif +} From 93612befded880a0dd7f680aa789d6d434a0b8ff Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 26 Nov 2025 17:51:41 +0100 Subject: [PATCH 072/218] t5563: verify that NTLM authentication works Although NTLM authentication is considered weak (extending even to NTLMv2, which purportedly allows brute-forcing reasonably complex 8-character passwords in a matter of days, given ample compute resources), it _is_ one of the authentication methods supported by libcurl. Note: The added test case *cannot* reuse the existing `custom_auth` facility. The reason is that that facility is backed by an NPH script ("No Parse Headers"), which does not allow handling the 3-phase NTLM authentication correctly (in my hands, the NPH script would not even be called upon the Type 3 message, a "200 OK" would be returned, but no headers, let alone the `git http-backend` output as payload). Having a separate NTLM authentication script makes the exact workings clearer and more readable, anyway. 
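For illustration, the three phases of the handshake can be told apart by the prefix of the Base64 payload in the `Authorization` header, because every NTLM message starts with the signature `NTLMSSP\0` followed by a little-endian message type. The following is a minimal sketch only; `ntlm_phase()` is a hypothetical helper and is not part of Git or libcurl:

```c
#include <string.h>

/*
 * Hypothetical helper, for illustration only: classify the value of an
 * HTTP Authorization header by NTLM handshake phase.
 *
 * Returns 0 if there is no header (the server should send the bare
 * "NTLM" challenge), 1 for a Type 1 (negotiate) message, 3 for a
 * Type 3 (authenticate) message, and -1 for anything else.
 */
static int ntlm_phase(const char *authorization)
{
	/* Base64 of "NTLMSSP\0" plus the first byte of the message type */
	static const char type1[] = "NTLM TlRMTVNTUAAB";
	static const char type3[] = "NTLM TlRMTVNTUAAD";

	if (!authorization || !*authorization)
		return 0;
	if (!strncmp(authorization, type1, sizeof(type1) - 1))
		return 1;
	if (!strncmp(authorization, type3, sizeof(type3) - 1))
		return 3;
	return -1;
}
```

A server-side script would answer phase 0 and phase 1 with a "401" status and only hand over to the actual backend in phase 3.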
Co-authored-by: Matthew John Cheetham Signed-off-by: Johannes Schindelin --- t/lib-httpd.sh | 1 + t/lib-httpd/apache.conf | 8 ++++++++ t/lib-httpd/ntlm-handshake.sh | 38 +++++++++++++++++++++++++++++++++++ t/t5563-simple-http-auth.sh | 15 ++++++++++++++ 4 files changed, 62 insertions(+) create mode 100755 t/lib-httpd/ntlm-handshake.sh diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh index 4c76e813e396bf..7150a2a2f2c5ce 100644 --- a/t/lib-httpd.sh +++ b/t/lib-httpd.sh @@ -168,6 +168,7 @@ prepare_httpd() { install_script apply-one-time-script.sh install_script nph-custom-auth.sh install_script http-429.sh + install_script ntlm-handshake.sh ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules" diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf index 40a690b0bb7c9b..7a5c3620cfe901 100644 --- a/t/lib-httpd/apache.conf +++ b/t/lib-httpd/apache.conf @@ -155,6 +155,13 @@ SetEnv PERL_PATH ${PERL_PATH} CGIPassAuth on + + SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH} + SetEnv GIT_HTTP_EXPORT_ALL + + CGIPassAuth on + + ScriptAlias /smart/incomplete_length/git-upload-pack incomplete-length-upload-pack-v2-http.sh/ ScriptAlias /smart/incomplete_body/git-upload-pack incomplete-body-upload-pack-v2-http.sh/ ScriptAlias /smart/no_report/git-receive-pack error-no-report.sh/ @@ -166,6 +173,7 @@ ScriptAlias /error/ error.sh/ ScriptAliasMatch /one_time_script/(.*) apply-one-time-script.sh/$1 ScriptAliasMatch /http_429/(.*) http-429.sh/$1 ScriptAliasMatch /custom_auth/(.*) nph-custom-auth.sh/$1 +ScriptAliasMatch /ntlm_auth/(.*) ntlm-handshake.sh/$1 Options FollowSymlinks diff --git a/t/lib-httpd/ntlm-handshake.sh b/t/lib-httpd/ntlm-handshake.sh new file mode 100755 index 00000000000000..3cf1266e40f20a --- /dev/null +++ b/t/lib-httpd/ntlm-handshake.sh @@ -0,0 +1,38 @@ +#!/bin/sh + +case "$HTTP_AUTHORIZATION" in +'') + # No Authorization header -> send NTLM challenge + echo "Status: 401 Unauthorized" + echo "WWW-Authenticate: NTLM" + echo + ;; +"NTLM TlRMTVNTUAAB"*) + # Type 1 -> 
respond with Type 2 challenge (hardcoded) + echo "Status: 401 Unauthorized" + # Base64-encoded version of the Type 2 challenge: + # signature: 'NTLMSSP\0' + # message_type: 2 + # target_name: 'NTLM-GIT-SERVER' + # flags: 0xa2898205 = + # NEGOTIATE_UNICODE, REQUEST_TARGET, NEGOTIATE_NT_ONLY, + # TARGET_TYPE_SERVER, TARGET_TYPE_SHARE, REQUEST_NON_NT_SESSION_KEY, + # NEGOTIATE_VERSION, NEGOTIATE_128, NEGOTIATE_56 + # challenge: 0xfa3dec518896295b + # context: '0000000000000000' + # target_info_present: true + # target_info_len: 128 + # version: '10.0 (build 19041)' + echo "WWW-Authenticate: NTLM TlRMTVNTUAACAAAAHgAeADgAAAAFgomi+j3sUYiWKVsAAAAAAAAAAIAAgABWAAAACgBhSgAAAA9OAFQATABNAC0ARwBJAFQALQBTAEUAUgBWAEUAUgACABIAVwBPAFIASwBHAFIATwBVAFAAAQAeAE4AVABMAE0ALQBHAEkAVAAtAFMARQBSAFYARQBSAAQAEgBXAE8AUgBLAEcAUgBPAFUAUAADAB4ATgBUAEwATQAtAEcASQBUAC0AUwBFAFIAVgBFAFIABwAIAACfOcZKYNwBAAAAAA==" + echo + ;; +"NTLM TlRMTVNTUAAD"*) + # Type 3 -> accept without validation + exec "$GIT_EXEC_PATH"/git-http-backend + ;; +*) + echo "Status: 500 Unrecognized" + echo + echo "Unhandled auth: '$HTTP_AUTHORIZATION'" + ;; +esac diff --git a/t/t5563-simple-http-auth.sh b/t/t5563-simple-http-auth.sh index 00635816156ba3..b8cef9dd5b1413 100755 --- a/t/t5563-simple-http-auth.sh +++ b/t/t5563-simple-http-auth.sh @@ -719,4 +719,19 @@ test_expect_success 'access using three-legged auth' ' EOF ' +test_lazy_prereq NTLM 'curl --version | grep -q NTLM' + +test_expect_success NTLM 'access using NTLM auth' ' + test_when_finished "per_test_cleanup" && + + set_credential_reply get <<-EOF && + username=user + password=pwd + EOF + + test_config_global credential.helper test-helper && + GIT_TRACE_CURL=1 \ + git ls-remote "$HTTPD_URL/ntlm_auth/repo.git" +' + test_done From 4a6bb0d52a1c30d2806ae6c63a8806d86d443335 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= Date: Sun, 22 Dec 2024 17:43:45 +0100 Subject: [PATCH 073/218] compat/mingw: drop outdated comment MIME-Version: 1.0 Content-Type: 
text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This comment has been true for the longest time; The combination of the two preceding commits made it incorrect, so let's drop that comment. Signed-off-by: Matthias Aßhauer Signed-off-by: Johannes Schindelin --- compat/mingw.c | 9 --------- 1 file changed, 9 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 6b2b3948e33b05..c398e5962a8446 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2611,15 +2611,6 @@ int mingw_socket(int domain, int type, int protocol) ensure_socket_initialization(); s = WSASocket(domain, type, protocol, NULL, 0, 0); if (s == INVALID_SOCKET) { - /* - * WSAGetLastError() values are regular BSD error codes - * biased by WSABASEERR. - * However, strerror() does not know about networking - * specific errors, which are values beginning at 38 or so. - * Therefore, we choose to leave the biased error code - * in errno so that _if_ someone looks up the code somewhere, - * then it is at least the number that are usually listed. - */ set_wsa_errno(); return -1; } From a4f847e7aab9afa449cf558cb0e49d8199340d58 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 26 Nov 2025 18:47:19 +0100 Subject: [PATCH 074/218] http: disallow NTLM authentication by default NTLM authentication is relatively weak. This is the case even with the default setting of modern Windows versions, where NTLMv1 and LanManager are disabled and only NTLMv2 is enabled: NTLMv2 hashes of even reasonably complex 8-character passwords can be broken in a matter of days, given enough compute resources. Even worse: On Windows, NTLM authentication uses Security Support Provider Interface ("SSPI"), which provides the credentials without requiring the user to type them in. Which means that an attacker could talk an unsuspecting user into cloning from a server that is under the attacker's control and extracts the user's NTLMv2 hash without their knowledge. 
For that reason, let's disallow NTLM authentication by default.

NTLM authentication is quite simple to set up, though, and therefore
there are still some on-prem Azure DevOps setups out there whose users
and/or automation rely on this type of authentication. To give them an
escape hatch, introduce the `http.<url>.allowNTLMAuth` config setting
that can be set to `true` to opt back into using NTLM for a specific
remote repository.

Signed-off-by: Johannes Schindelin
---
 Documentation/config/http.adoc |  5 +++++
 http.c                         | 20 ++++++++++++++++----
 t/t5563-simple-http-auth.sh    |  6 ++++--
 3 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/Documentation/config/http.adoc b/Documentation/config/http.adoc
index 849c89f36c5ad8..36f331d301dec4 100644
--- a/Documentation/config/http.adoc
+++ b/Documentation/config/http.adoc
@@ -231,6 +231,11 @@ http.sslKeyType::
 	See also libcurl `CURLOPT_SSLKEYTYPE`. Can be overridden by the
 	`GIT_SSL_KEY_TYPE` environment variable.
 
+http.allowNTLMAuth::
+	Whether or not to allow NTLM authentication. While very convenient to set
+	up, and therefore still used in many on-prem scenarios, NTLM is a weak
+	authentication method and therefore deprecated. Defaults to "false".
+
 http.schannelCheckRevoke::
 	Used to enforce or disable certificate revocation checks in cURL
 	when http.sslBackend is set to "schannel". Defaults to `true` if
diff --git a/http.c b/http.c
index 67c9c6fc60673d..c189222fdb5941 100644
--- a/http.c
+++ b/http.c
@@ -131,7 +131,8 @@ enum http_follow_config http_follow_config = HTTP_FOLLOW_INITIAL;
 
 static struct credential cert_auth = CREDENTIAL_INIT;
 static int ssl_cert_password_required;
-static unsigned long http_auth_methods = CURLAUTH_ANY;
+static unsigned long http_auth_any = CURLAUTH_ANY & ~CURLAUTH_NTLM;
+static unsigned long http_auth_methods;
 static int http_auth_methods_restricted;
 /* Modes for which empty_auth cannot actually help us.
*/ static unsigned long empty_auth_useless = @@ -429,6 +430,15 @@ static int http_options(const char *var, const char *value, return 0; } + if (!strcmp("http.allowntlmauth", var)) { + if (git_config_bool(var, value)) { + http_auth_any |= CURLAUTH_NTLM; + } else { + http_auth_any &= ~CURLAUTH_NTLM; + } + return 0; + } + if (!strcmp("http.schannelcheckrevoke", var)) { http_schannel_check_revoke = git_config_bool(var, value); return 0; @@ -709,11 +719,11 @@ static void init_curl_proxy_auth(CURL *result) if (i == ARRAY_SIZE(proxy_authmethods)) { warning("unsupported proxy authentication method %s: using anyauth", http_proxy_authmethod); - curl_easy_setopt(result, CURLOPT_PROXYAUTH, CURLAUTH_ANY); + curl_easy_setopt(result, CURLOPT_PROXYAUTH, http_auth_any); } } else - curl_easy_setopt(result, CURLOPT_PROXYAUTH, CURLAUTH_ANY); + curl_easy_setopt(result, CURLOPT_PROXYAUTH, http_auth_any); } static int has_cert_password(void) @@ -1060,7 +1070,7 @@ static CURL *get_curl_handle(void) } curl_easy_setopt(result, CURLOPT_NETRC, CURL_NETRC_OPTIONAL); - curl_easy_setopt(result, CURLOPT_HTTPAUTH, CURLAUTH_ANY); + curl_easy_setopt(result, CURLOPT_HTTPAUTH, http_auth_any); #ifdef CURLGSSAPI_DELEGATION_FLAG if (curl_deleg) { @@ -1448,6 +1458,8 @@ void http_init(struct remote *remote, const char *url, int proactive_auth) set_long_from_env(&http_max_retries, "GIT_HTTP_MAX_RETRIES"); set_long_from_env(&http_max_retry_time, "GIT_HTTP_MAX_RETRY_TIME"); + http_auth_methods = http_auth_any; + curl_default = get_curl_handle(); } diff --git a/t/t5563-simple-http-auth.sh b/t/t5563-simple-http-auth.sh index b8cef9dd5b1413..822d64ed5ec9cb 100755 --- a/t/t5563-simple-http-auth.sh +++ b/t/t5563-simple-http-auth.sh @@ -730,8 +730,10 @@ test_expect_success NTLM 'access using NTLM auth' ' EOF test_config_global credential.helper test-helper && - GIT_TRACE_CURL=1 \ - git ls-remote "$HTTPD_URL/ntlm_auth/repo.git" + test_must_fail env GIT_TRACE_CURL=1 git \ + ls-remote "$HTTPD_URL/ntlm_auth/repo.git" 
&& + GIT_TRACE_CURL=1 git -c http.$HTTPD_URL.allowNTLMAuth=true \ + ls-remote "$HTTPD_URL/ntlm_auth/repo.git" ' test_done From ac20928ab76a244eee805b20b2b66390b8fba905 Mon Sep 17 00:00:00 2001 From: Matthew John Cheetham Date: Mon, 13 Apr 2026 12:46:14 +0100 Subject: [PATCH 075/218] http: extract http_reauth_prepare() from retry paths All three HTTP retry paths (http_request_recoverable, post_rpc, probe_rpc) call credential_fill() directly when handling HTTP_REAUTH. Extract this into a helper function so that a subsequent commit can add pre-fill logic (such as attempting empty-auth before prompting) in one place. No functional change. Signed-off-by: Matthew John Cheetham --- http.c | 7 ++++++- http.h | 6 ++++++ remote-curl.c | 4 ++-- 3 files changed, 14 insertions(+), 3 deletions(-) diff --git a/http.c b/http.c index 9086fa55b35f3e..6344a07a018312 100644 --- a/http.c +++ b/http.c @@ -680,6 +680,11 @@ static void init_curl_http_auth(CURL *result) } } +void http_reauth_prepare(int all_capabilities) +{ + credential_fill(the_repository, &http_auth, all_capabilities); +} + /* *var must be free-able */ static void var_override(char **var, char *value) { @@ -2427,7 +2432,7 @@ static int http_request_recoverable(const char *url, sleep(retry_delay); } } else if (ret == HTTP_REAUTH) { - credential_fill(the_repository, &http_auth, 1); + http_reauth_prepare(1); } /* diff --git a/http.h b/http.h index f9ee888c3ed67e..729c51904d39ad 100644 --- a/http.h +++ b/http.h @@ -76,6 +76,12 @@ extern int http_is_verbose; extern ssize_t http_post_buffer; extern struct credential http_auth; +/** + * Prepare for an HTTP re-authentication retry. This fills credentials + * via credential_fill() so the next request can include them. 
+ */ +void http_reauth_prepare(int all_capabilities); + extern char curl_errorstr[CURL_ERROR_SIZE]; enum http_follow_config { diff --git a/remote-curl.c b/remote-curl.c index aba60d571282d3..affdb880f7b3bf 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -946,7 +946,7 @@ static int post_rpc(struct rpc_state *rpc, int stateless_connect, int flush_rece do { err = probe_rpc(rpc, &results); if (err == HTTP_REAUTH) - credential_fill(the_repository, &http_auth, 0); + http_reauth_prepare(0); } while (err == HTTP_REAUTH); if (err != HTTP_OK) return -1; @@ -1068,7 +1068,7 @@ static int post_rpc(struct rpc_state *rpc, int stateless_connect, int flush_rece rpc->any_written = 0; err = run_slot(slot, NULL); if (err == HTTP_REAUTH && !large_request) { - credential_fill(the_repository, &http_auth, 0); + http_reauth_prepare(0); curl_slist_free_all(headers); goto retry; } From 0ae3505a2f9ef2b645cefcf12000861e31688da0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= Date: Sun, 29 Dec 2024 11:48:34 +0100 Subject: [PATCH 076/218] t0301: actually test credential-cache on Windows MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Commit 2406bf5 (Win32: detect unix socket support at runtime, 2024-04-03) introduced a runtime detection for whether the operating system supports unix sockets for Windows, but a mistake snuck into the tests. When building and testing Git without NO_UNIX_SOCKETS we currently skip t0301-credential-cache on Windows if unix sockets are supported and run the tests if they aren't. Flip that logic to actually work the way it was intended. 
Signed-off-by: Matthias Aßhauer Signed-off-by: Johannes Schindelin --- t/t0301-credential-cache.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t0301-credential-cache.sh b/t/t0301-credential-cache.sh index 6f7cfd9e33f633..a14032626192d0 100755 --- a/t/t0301-credential-cache.sh +++ b/t/t0301-credential-cache.sh @@ -12,7 +12,7 @@ test -z "$NO_UNIX_SOCKETS" || { if test_have_prereq MINGW then service_running=$(sc query afunix | grep "4 RUNNING") - test -z "$service_running" || { + test -n "$service_running" || { skip_all='skipping credential-cache tests, unix sockets not available' test_done } From c91c82adc2588c3ea81008299c6bfc6ab98475e4 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 26 Nov 2025 19:18:35 +0100 Subject: [PATCH 077/218] http: warn if might have failed because of NTLM The new default of Git is to disable NTLM authentication by default. To help users find the escape hatch of that config setting, should they need it, suggest it when the authentication failed and the server had offered NTLM, i.e. if re-enabling it would fix the problem. Helped-by: Patrick Steinhardt Signed-off-by: Johannes Schindelin --- http.c | 11 +++++++++++ t/t5563-simple-http-auth.sh | 3 ++- 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/http.c b/http.c index c189222fdb5941..41fcc5f713f521 100644 --- a/http.c +++ b/http.c @@ -1900,6 +1900,17 @@ static int handle_curl_result(struct slot_results *results) credential_reject(the_repository, &http_auth); if (always_auth_proactively()) http_proactive_auth = PROACTIVE_AUTH_NONE; + if ((results->auth_avail & CURLAUTH_NTLM) && + !(http_auth_any & CURLAUTH_NTLM)) { + warning(_("Due to its cryptographic weaknesses, " + "NTLM authentication has been\n" + "disabled in Git by default. 
You can " + "re-enable it for trusted servers\n" + "by running:\n\n" + "git config set " + "http.%s://%s.allowNTLMAuth true"), + http_auth.protocol, http_auth.host); + } return HTTP_NOAUTH; } else { http_auth_methods &= ~CURLAUTH_GSSNEGOTIATE; diff --git a/t/t5563-simple-http-auth.sh b/t/t5563-simple-http-auth.sh index 822d64ed5ec9cb..303f8589640aa2 100755 --- a/t/t5563-simple-http-auth.sh +++ b/t/t5563-simple-http-auth.sh @@ -731,7 +731,8 @@ test_expect_success NTLM 'access using NTLM auth' ' test_config_global credential.helper test-helper && test_must_fail env GIT_TRACE_CURL=1 git \ - ls-remote "$HTTPD_URL/ntlm_auth/repo.git" && + ls-remote "$HTTPD_URL/ntlm_auth/repo.git" 2>err && + test_grep "allowNTLMAuth" err && GIT_TRACE_CURL=1 git -c http.$HTTPD_URL.allowNTLMAuth=true \ ls-remote "$HTTPD_URL/ntlm_auth/repo.git" ' From 6dd257a9d949f4c6f02941c2c2fd2a403c7a6b24 Mon Sep 17 00:00:00 2001 From: Matthew John Cheetham Date: Mon, 13 Apr 2026 12:52:11 +0100 Subject: [PATCH 078/218] http: attempt Negotiate auth in http.emptyAuth=auto mode When a server advertises Negotiate (SPNEGO) authentication, the "auto" mode of http.emptyAuth should detect this as an "exotic" method and proactively send empty credentials, allowing libcurl to use the system Kerberos ticket without prompting the user. However, two features interact to prevent this from working: The Negotiate-stripping logic, introduced in 4dbe66464b (remote-curl: fall back to Basic auth if Negotiate fails, 2015-01-08), removes CURLAUTH_GSSNEGOTIATE from the allowed methods on the first 401 response. The empty-auth auto-detection, introduced in 40a18fc77c (http: add an "auto" mode for http.emptyauth, 2017-02-25), then checks the remaining methods for anything "exotic" -- but Negotiate has already been removed, so auto mode never activates for servers whose only non-Basic/Digest method is Negotiate (e.g., Apache with mod_auth_kerb offering Basic + Negotiate). 
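The interaction can be sketched with plain bitmasks. What follows is a simplified model, not Git's code: the `AUTH_*` constants and `auto_empty_auth_applies()` are made up for illustration and merely stand in for libcurl's `CURLAUTH_*` bits and for the check that auto mode performs.

```c
/* Illustrative stand-ins for libcurl's CURLAUTH_* bits (values are arbitrary). */
enum {
	AUTH_BASIC     = 1 << 0,
	AUTH_DIGEST    = 1 << 1,
	AUTH_NEGOTIATE = 1 << 2,
	AUTH_NTLM      = 1 << 3
};

/* Methods for which sending empty credentials cannot help. */
#define AUTH_EMPTY_USELESS (AUTH_BASIC | AUTH_DIGEST)

/*
 * Model of the first 401 response: optionally strip Negotiate from the
 * methods offered by the server, then report whether anything "exotic"
 * remains that would make auto mode retry with empty credentials.
 */
static int auto_empty_auth_applies(unsigned offered, int strip_negotiate_first)
{
	unsigned allowed = offered;

	if (strip_negotiate_first)
		allowed &= ~(unsigned)AUTH_NEGOTIATE;
	return (allowed & ~(unsigned)AUTH_EMPTY_USELESS) != 0;
}
```

In this model, a server offering Basic + Negotiate yields 0 when Negotiate is stripped first, so auto mode never triggers, and 1 when Negotiate is kept for one more round.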
Fix this by delaying the Negotiate stripping in auto mode: on the first 401, keep Negotiate in the allowed methods so that auto mode can detect it and retry with empty credentials. If that attempt fails (no valid Kerberos ticket), strip Negotiate on the second 401 and fall through to credential_fill() as usual. To support this, also teach http_reauth_prepare() to skip credential_fill() when empty auth is about to be attempted, since filling real credentials would bypass the empty-auth mechanism. The true and false modes are unchanged: true sends empty credentials on the very first request (before any 401), and false never sends them. Signed-off-by: Matthew John Cheetham --- http.c | 25 ++++++++++++++++++++++++- 1 file changed, 24 insertions(+), 1 deletion(-) diff --git a/http.c b/http.c index 6344a07a018312..6c3207cc52a905 100644 --- a/http.c +++ b/http.c @@ -139,6 +139,7 @@ static unsigned long empty_auth_useless = CURLAUTH_BASIC | CURLAUTH_DIGEST_IE | CURLAUTH_DIGEST; +static int empty_auth_try_negotiate; static struct curl_slist *pragma_header; static struct string_list extra_http_headers = STRING_LIST_INIT_DUP; @@ -682,6 +683,17 @@ static void init_curl_http_auth(CURL *result) void http_reauth_prepare(int all_capabilities) { + /* + * If we deferred stripping Negotiate to give empty auth a + * chance (auto mode), skip credential_fill on this retry so + * that init_curl_http_auth() sends empty credentials and + * libcurl can attempt Negotiate with the system ticket cache. 
+ */ + if (empty_auth_try_negotiate && + !http_auth.password && !http_auth.credential && + (http_auth_methods & CURLAUTH_GSSNEGOTIATE)) + return; + credential_fill(the_repository, &http_auth, all_capabilities); } @@ -1924,7 +1936,18 @@ static int handle_curl_result(struct slot_results *results) } return HTTP_NOAUTH; } else { - http_auth_methods &= ~CURLAUTH_GSSNEGOTIATE; + if (curl_empty_auth == -1 && + !empty_auth_try_negotiate && + (results->auth_avail & CURLAUTH_GSSNEGOTIATE)) { + /* + * In auto mode, give Negotiate a chance via + * empty auth before stripping it. If it fails, + * we will strip it on the next 401. + */ + empty_auth_try_negotiate = 1; + } else { + http_auth_methods &= ~CURLAUTH_GSSNEGOTIATE; + } if (results->auth_avail) { http_auth_methods &= results->auth_avail; http_auth_methods_restricted = 1; From 152380dcbcc9a4f9fd4e687aeb18d79bae697a93 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= Date: Sun, 22 Dec 2024 17:24:24 +0100 Subject: [PATCH 079/218] credential-cache: handle ECONNREFUSED gracefully MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In 245670c (credential-cache: check for windows specific errors, 2021-09-14) we concluded that on Windows we would always encounter ENETDOWN where we would expect ECONNREFUSED on POSIX systems, when connecting to unix sockets. As reported in [1], we do encounter ECONNREFUSED on Windows if the socket file doesn't exist, but the containing directory does and ENETDOWN if neither exists. We should handle this case like we do on non-windows systems. 
[1] https://github.com/git-for-windows/git/pull/4762#issuecomment-2545498245 This fixes https://github.com/git-for-windows/git/issues/5314 Helped-by: M Hickford Signed-off-by: Matthias Aßhauer Signed-off-by: Johannes Schindelin --- builtin/credential-cache.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/credential-cache.c b/builtin/credential-cache.c index 7f733cb756e03c..3b8130d3d64f9c 100644 --- a/builtin/credential-cache.c +++ b/builtin/credential-cache.c @@ -23,7 +23,7 @@ static int connection_closed(int error) static int connection_fatally_broken(int error) { - return (error != ENOENT) && (error != ENETDOWN); + return (error != ENOENT) && (error != ENETDOWN) && (error != ECONNREFUSED); } #else From 8668f54d7c25fd76e7e776abda22e2ad807eef2a Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 6 Mar 2025 14:05:03 +0100 Subject: [PATCH 080/218] reftable: do make sure to use custom allocators The reftable library goes out of its way to use its own set of allocator functions that can be configured using `reftable_set_alloc()`. However, Git does not configure this. That is not typically a problem, except when Git uses a custom allocator via some definitions in `git-compat-util.h`, as is the case in Git for Windows (which switched away from the long-unmaintained nedmalloc to mimalloc). Then, it is quite possible that Git assigns a `strbuf` (allocated via the custom allocator) to, say, the `refname` field of a `reftable_log_record` in `write_transaction_table()`, and later on asks the reftable library function `reftable_log_record_release()` to release it, but that function was compiled without using `git-compat-util.h` and hence calls regular `free()` (i.e. _not_ the custom allocator's own function). 
This has been a problem for a long time and it was a matter of some sort of "luck" that 1) reftables are not commonly used on Windows, and 2) mimalloc can often ignore gracefully when it is asked to release memory that it has not allocated. However, a recent update to `seen` brought this problem to the forefront, letting t1460 fail in Git for Windows, with symptoms much in the same way as the problem I had to address in d02c37c3e6ba (t-reftable-basics: allow for `malloc` to be `#define`d, 2025-01-08) where exit code 127 was also produced in lieu of `STATUS_HEAP_CORRUPTION` (C0000374) because exit codes are only 7 bits wide. It was not possible to figure out what change in particular caused these new failures within a reasonable time frame, as there are too many changes in `seen` that conflict with Git for Windows' patches, I had to stop the investigation after spending four hours on it fruitlessly. To verify that this patch fixes the issue, I avoided using mimalloc and temporarily patched in a "custom allocator" that would more reliably point out problems, like this: diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index 68f38291f84c..9421d630b9f5 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -353,6 +353,69 @@ static int reftable_be_fsync(int fd) return fsync_component(FSYNC_COMPONENT_REFERENCE, fd); } +#define DEBUG_REFTABLE_ALLOC +#ifdef DEBUG_REFTABLE_ALLOC +#include "khash.h" + +static inline khint_t __ac_X31_hash_ptr(void *ptr) +{ + union { + void *ptr; + char s[sizeof(void *)]; + } u; + size_t i; + khint_t h; + + u.ptr = ptr; + h = (khint_t)*u.s; + for (i = 0; i < sizeof(void *); i++) + h = (h << 5) - h + (khint_t)u.s[i]; + return h; +} + +#define kh_ptr_hash_func(key) __ac_X31_hash_ptr(key) +#define kh_ptr_hash_equal(a, b) ((a) == (b)) + +KHASH_INIT(ptr, void *, int, 0, kh_ptr_hash_func, kh_ptr_hash_equal) + +static kh_ptr_t *my_malloced; + +static void *my_malloc(size_t sz) +{ + int dummy; + void *ptr = malloc(sz); + if 
(ptr) + kh_put_ptr(my_malloced, ptr, &dummy); + return ptr; +} + +static void *my_realloc(void *ptr, size_t sz) +{ + int dummy; + if (ptr) { + khiter_t pos = kh_get_ptr(my_malloced, ptr); + if (pos >= kh_end(my_malloced)) + die("Was not my_malloc()ed: %p", ptr); + kh_del_ptr(my_malloced, pos); + } + ptr = realloc(ptr, sz); + if (ptr) + kh_put_ptr(my_malloced, ptr, &dummy); + return ptr; +} + +static void my_free(void *ptr) +{ + if (ptr) { + khiter_t pos = kh_get_ptr(my_malloced, ptr); + if (pos >= kh_end(my_malloced)) + die("Was not my_malloc()ed: %p", ptr); + kh_del_ptr(my_malloced, pos); + } + free(ptr); +} +#endif + static struct ref_store *reftable_be_init(struct repository *repo, const char *gitdir, unsigned int store_flags) @@ -362,6 +425,11 @@ static struct ref_store *reftable_be_init(struct repository *repo, int is_worktree; mode_t mask; +#ifdef DEBUG_REFTABLE_ALLOC + my_malloced = kh_init_ptr(); + reftable_set_alloc(my_malloc, my_realloc, my_free); +#endif + mask = umask(0); umask(mask); I briefly considered contributing this "custom allocator" patch, too, but it is unwieldy (for example, it would not work at all when compiling with mimalloc support) and it would only waste space (or even time, if a compile flag was introduced and exercised as part of the CI builds). Given that it is highly unlikely that Git will lose the new `reftable_set_alloc()` call by mistake, I rejected that idea as simply too wasteful. 
Signed-off-by: Johannes Schindelin --- refs/reftable-backend.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index daea30a5b4cad9..2f438385da01b6 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -381,6 +381,8 @@ static struct ref_store *reftable_be_init(struct repository *repo, mask = umask(0); umask(mask); + reftable_set_alloc(malloc, realloc, free); + refs_compute_filesystem_location(gitdir, payload, &is_worktree, &refdir, &ref_common_dir); From 2238da5fa8ed2073f919cb1b492cd564346215cf Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 3 Jun 2025 12:45:39 +0200 Subject: [PATCH 081/218] check-whitespace: avoid alerts about upstream commits Every once in a while, whitespace errors are introduced in Git for Windows' rebases to newer Git versions, simply by virtue of integrating upstream commits that do not follow upstream Git's own whitespace rule. In Git v2.50.0-rc0, for example, 03f2915541a4 (xdiff: disable cleanup_records heuristic with --minimal, 2025-04-29) introduced a trailing space. Arguably, non-actionable alerts are worse than no alerts at all, so let's suppress those alerts that we cannot do anything about, anyway. Signed-off-by: Johannes Schindelin --- ci/check-whitespace.sh | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/ci/check-whitespace.sh b/ci/check-whitespace.sh index c40804394cb079..e590ac0dfd765e 100755 --- a/ci/check-whitespace.sh +++ b/ci/check-whitespace.sh @@ -19,6 +19,7 @@ problems=() commit= commitText= commitTextmd= +committerEmail= goodParent= if ! git rev-parse --quiet --verify "${baseCommit}" @@ -27,7 +28,7 @@ then exit 1 fi -while read dash sha etc +while read dash email sha etc do case "${dash}" in "---") # Line contains commit information. 
@@ -40,10 +41,14 @@ do
 		commit="${sha}"
 		commitText="${sha} ${etc}"
 		commitTextmd="[${sha}](${url}/commit/${sha}) ${etc}"
+		committerEmail="${email}"
 		;;
 	"")
 		;;
 	*) # Line contains whitespace error information for current commit.
+		# Quod licet Iovi non licet bovi
+		test gitster@pobox.com != "$committerEmail" || break
+
 		if test -n "${goodParent}"
 		then
 			problems+=("1) --- ${commitTextmd}")
@@ -64,7 +69,7 @@ do
 		echo "${dash} ${sha} ${etc}"
 		;;
 	esac
-done <<< "$(git log --check --pretty=format:"---% h% s" "${baseCommit}"..)"
+done <<< "$(git log --check --pretty=format:"---% ce% h% s" "${baseCommit}"..)"
 
 if test ${#problems[*]} -gt 0
 then

From f44b29427929dee27eca5353b4c0851f4b508d25 Mon Sep 17 00:00:00 2001
From: Thomas Braun
Date: Mon, 26 Jan 2026 19:02:44 +0100
Subject: [PATCH 082/218] t/t5571-pre-push-hook.sh: Add test with writing to stderr

The 2.53.0.rc0.windows release candidate had a regression where writing
to stderr from a pre-push hook would error out. The regression was fixed
in 2.53.0.rc1.windows and the test here ensures that this stays fixed.
Signed-off-by: Thomas Braun
---
 t/t5571-pre-push-hook.sh | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/t/t5571-pre-push-hook.sh b/t/t5571-pre-push-hook.sh
index a11b20e378223e..25b8d50c9428a7 100755
--- a/t/t5571-pre-push-hook.sh
+++ b/t/t5571-pre-push-hook.sh
@@ -138,4 +138,16 @@ test_expect_success 'sigpipe does not cause pre-push hook failure' '
 	git push parent1 "refs/heads/b/*:refs/heads/b/*"
 '
 
+test_expect_success 'can write to stderr' '
+	test_hook --clobber pre-push <<-\EOF &&
+	echo foo >/dev/stderr &&
+	exit 0
+	EOF
+
+	test_commit third &&
+	echo foo >expect &&
+	git push --quiet parent1 HEAD 2>actual &&
+	test_cmp expect actual
+'
+
 test_done

From d8031edcf2b94dd6ba927b5339ec9f7f59ba9b48 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin
Date: Mon, 9 Feb 2026 18:21:48 +0100
Subject: [PATCH 083/218] credential: advertise NTLM suppression and allow helpers to re-enable

The previous commits disabled NTLM authentication by default due to its
cryptographic weaknesses. Users can re-enable it via the config setting
http.<url>.allowNTLMAuth, but this requires manual intervention.

Credential helpers may have knowledge about which servers are trusted
for NTLM authentication (e.g., known on-prem Azure DevOps instances).
To allow them to signal this trust, introduce a simple negotiation: when
NTLM is suppressed and the server offered it, Git advertises
ntlm=suppressed to the credential helper. The helper can respond with
ntlm=allow to re-enable NTLM for this request.

This happens precisely at the point where we would otherwise warn the
user about NTLM being suppressed, ensuring the capability is only
advertised when relevant.
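From a helper's point of view, the negotiation could look roughly like the sketch below. It assumes the usual line-based `key=value` credential protocol; the function `ntlm_reply()` and the notion of a single trusted host are hypothetical and serve only as illustration. A real helper would also return the username and password as usual.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper logic, for illustration only: scan the
 * newline-separated key=value request and write "ntlm=allow\n" into
 * `out` only if Git advertised "ntlm=suppressed" and the host matches
 * `trusted_host`. Returns 1 if NTLM was allowed, 0 otherwise.
 */
static int ntlm_reply(const char *request, const char *trusted_host,
		      char *out, size_t outsz)
{
	int suppressed = 0, trusted = 0;
	const char *line = request;

	if (outsz)
		out[0] = '\0';
	while (*line) {
		size_t len = strcspn(line, "\n");

		if (len == strlen("ntlm=suppressed") &&
		    !strncmp(line, "ntlm=suppressed", len))
			suppressed = 1;
		else if (len > 5 && !strncmp(line, "host=", 5) &&
			 strlen(trusted_host) == len - 5 &&
			 !strncmp(line + 5, trusted_host, len - 5))
			trusted = 1;
		line += len;
		if (*line == '\n')
			line++;
	}
	if (!suppressed || !trusted)
		return 0;
	snprintf(out, outsz, "ntlm=allow\n");
	return 1;
}
```

The helper stays silent about NTLM unless Git explicitly advertised the suppression, which mirrors the intent that the capability only matters when NTLM was actually offered and withheld.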
Helped-by: Matthew John Cheetham Signed-off-by: Johannes Schindelin --- credential.c | 5 +++++ credential.h | 3 +++ http.c | 17 +++++++++++++++-- t/t5563-simple-http-auth.sh | 13 ++++++++++++- 4 files changed, 35 insertions(+), 3 deletions(-) diff --git a/credential.c b/credential.c index 2594c0c4229ba0..af964189363b28 100644 --- a/credential.c +++ b/credential.c @@ -360,6 +360,9 @@ int credential_read(struct credential *c, FILE *fp, credential_set_capability(&c->capa_authtype, op_type); else if (!strcmp(value, "state")) credential_set_capability(&c->capa_state, op_type); + } else if (!strcmp(key, "ntlm")) { + if (!strcmp(value, "allow")) + c->ntlm_allow = 1; } else if (!strcmp(key, "continue")) { c->multistage = !!git_config_bool("continue", value); } else if (!strcmp(key, "password_expiry_utc")) { @@ -420,6 +423,8 @@ void credential_write(const struct credential *c, FILE *fp, if (c->ephemeral) credential_write_item(c, fp, "ephemeral", "1", 0); } + if (c->ntlm_suppressed) + credential_write_item(c, fp, "ntlm", "suppressed", 0); credential_write_item(c, fp, "protocol", c->protocol, 1); credential_write_item(c, fp, "host", c->host, 1); credential_write_item(c, fp, "path", c->path, 0); diff --git a/credential.h b/credential.h index c78b72d110eaac..95244d5375dfe9 100644 --- a/credential.h +++ b/credential.h @@ -177,6 +177,9 @@ struct credential { struct credential_capability capa_authtype; struct credential_capability capa_state; + unsigned ntlm_suppressed:1, + ntlm_allow:1; + char *username; char *password; char *credential; diff --git a/http.c b/http.c index 41fcc5f713f521..9086fa55b35f3e 100644 --- a/http.c +++ b/http.c @@ -660,6 +660,11 @@ static void init_curl_http_auth(CURL *result) credential_fill(the_repository, &http_auth, 1); + if (http_auth.ntlm_allow && !(http_auth_methods & CURLAUTH_NTLM)) { + http_auth_methods |= CURLAUTH_NTLM; + curl_easy_setopt(result, CURLOPT_HTTPAUTH, http_auth_methods); + } + if (http_auth.password) { if (always_auth_proactively()) 
{ /* @@ -1891,6 +1896,8 @@ static int handle_curl_result(struct slot_results *results) } else if (missing_target(results)) return HTTP_MISSING_TARGET; else if (results->http_code == 401) { + http_auth.ntlm_suppressed = (results->auth_avail & CURLAUTH_NTLM) && + !(http_auth_any & CURLAUTH_NTLM); if ((http_auth.username && http_auth.password) ||\ (http_auth.authtype && http_auth.credential)) { if (http_auth.multistage) { @@ -1900,8 +1907,7 @@ static int handle_curl_result(struct slot_results *results) credential_reject(the_repository, &http_auth); if (always_auth_proactively()) http_proactive_auth = PROACTIVE_AUTH_NONE; - if ((results->auth_avail & CURLAUTH_NTLM) && - !(http_auth_any & CURLAUTH_NTLM)) { + if (http_auth.ntlm_suppressed) { warning(_("Due to its cryptographic weaknesses, " "NTLM authentication has been\n" "disabled in Git by default. You can " @@ -2424,6 +2430,13 @@ static int http_request_recoverable(const char *url, credential_fill(the_repository, &http_auth, 1); } + /* + * Re-enable NTLM auth if the helper allows it and we would + * otherwise suppress authentication via NTLM. 
+ */ + if (http_auth.ntlm_suppressed && http_auth.ntlm_allow) + http_auth_methods |= CURLAUTH_NTLM; + ret = http_request(url, result, target, options); } if (ret == HTTP_RATE_LIMITED) { diff --git a/t/t5563-simple-http-auth.sh b/t/t5563-simple-http-auth.sh index 303f8589640aa2..35e6f4b397da7c 100755 --- a/t/t5563-simple-http-auth.sh +++ b/t/t5563-simple-http-auth.sh @@ -733,8 +733,19 @@ test_expect_success NTLM 'access using NTLM auth' ' test_must_fail env GIT_TRACE_CURL=1 git \ ls-remote "$HTTPD_URL/ntlm_auth/repo.git" 2>err && test_grep "allowNTLMAuth" err && + + # Can be enabled via config GIT_TRACE_CURL=1 git -c http.$HTTPD_URL.allowNTLMAuth=true \ - ls-remote "$HTTPD_URL/ntlm_auth/repo.git" + ls-remote "$HTTPD_URL/ntlm_auth/repo.git" && + + # Or via credential helper responding with ntlm=allow + set_credential_reply get <<-EOF && + username=user + password=pwd + ntlm=allow + EOF + + git ls-remote "$HTTPD_URL/ntlm_auth/repo.git" ' test_done From bd5ce11e6a7569254c6513c9a14311f97ac9a181 Mon Sep 17 00:00:00 2001 From: Maks Kuznia Date: Mon, 30 Mar 2026 23:18:31 +0200 Subject: [PATCH 084/218] dir: do not traverse mount points It was already decided in ef22148 (clean: do not traverse mount points, 2018-12-07) that we shouldn't traverse NTFS junctions/bind mounts when using `git clean`, partly because they're sometimes used in worktrees. But the same check wasn't applied to `remove_dir_recurse()` in `dir.c`, which `git worktree remove` uses. So removing a worktree suffers the same problem we had previously with `git clean`. Let's add the same guard from ef22148. 
Signed-off-by: Maks Kuznia --- dir.c | 7 +++++++ t/t2403-worktree-move.sh | 9 +++++++++ 2 files changed, 16 insertions(+) diff --git a/dir.c b/dir.c index fcb8f6dd2aa969..2f72dc865eea59 100644 --- a/dir.c +++ b/dir.c @@ -3411,6 +3411,13 @@ static int remove_dir_recurse(struct strbuf *path, int flag, int *kept_up) return 0; } + if (is_mount_point(path)) { + /* Do not descend and nuke a mount point or junction. */ + if (kept_up) + *kept_up = 1; + return 0; + } + flag &= ~REMOVE_DIR_KEEP_TOPLEVEL; dir = opendir(path->buf); if (!dir) { diff --git a/t/t2403-worktree-move.sh b/t/t2403-worktree-move.sh index 0bb33e8b1b90fb..56faef26aa3bb1 100755 --- a/t/t2403-worktree-move.sh +++ b/t/t2403-worktree-move.sh @@ -271,4 +271,13 @@ test_expect_success 'move worktree with relative path to absolute path' ' test_cmp expect .git/worktrees/absolute/gitdir ' +test_expect_success MINGW 'worktree remove does not traverse mount points' ' + mkdir target && + >target/dont-remove-me && + git worktree add --detach wt-junction && + cmd //c "mklink /j wt-junction\\mnt target" && + git worktree remove --force wt-junction && + test_path_is_file target/dont-remove-me +' + test_done From 0f789244ce645080691c3135a6138561d249853d Mon Sep 17 00:00:00 2001 From: Matthew John Cheetham Date: Mon, 13 Apr 2026 12:57:12 +0100 Subject: [PATCH 085/218] t5563: add tests for http.emptyAuth with Negotiate Add tests exercising the interaction between http.emptyAuth and servers that advertise Negotiate (SPNEGO) authentication. Verify that auto mode gives Negotiate a chance via empty auth (resulting in two 401 responses before falling through to credential_fill with Basic credentials), and that false mode strips Negotiate immediately (only one 401 response). 
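The tests in this patch hard-code the Basic token for `alice:secret-passwd`. A quick way to double-check such a token, assuming a `base64` utility is available (it is not part of POSIX, and `basic_token` is a made-up helper name):

```shell
# Hypothetical helper: print the token that follows "Basic " in an
# Authorization header for the given user and password.
basic_token () {
	# printf (rather than echo) avoids a trailing newline in the input.
	printf '%s:%s' "$1" "$2" | base64
}
```

For example, `basic_token alice secret-passwd` prints `YWxpY2U6c2VjcmV0LXBhc3N3ZA==`, which is the value written to `custom-auth.valid` in the tests.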
Signed-off-by: Matthew John Cheetham --- t/t5563-simple-http-auth.sh | 74 +++++++++++++++++++++++++++++++++++++ 1 file changed, 74 insertions(+) diff --git a/t/t5563-simple-http-auth.sh b/t/t5563-simple-http-auth.sh index 35e6f4b397da7c..911c716aaff2f4 100755 --- a/t/t5563-simple-http-auth.sh +++ b/t/t5563-simple-http-auth.sh @@ -748,4 +748,78 @@ test_expect_success NTLM 'access using NTLM auth' ' git ls-remote "$HTTPD_URL/ntlm_auth/repo.git" ' +test_lazy_prereq SPNEGO 'curl --version | grep -qi "SPNEGO\|GSS-API\|Kerberos\|negotiate"' + +test_expect_success SPNEGO 'http.emptyAuth=auto attempts Negotiate before credential_fill' ' + test_when_finished "per_test_cleanup" && + + set_credential_reply get <<-EOF && + username=alice + password=secret-passwd + EOF + + # Basic base64(alice:secret-passwd) + cat >"$HTTPD_ROOT_PATH/custom-auth.valid" <<-EOF && + id=1 creds=Basic YWxpY2U6c2VjcmV0LXBhc3N3ZA== + EOF + + cat >"$HTTPD_ROOT_PATH/custom-auth.challenge" <<-EOF && + id=1 status=200 + id=default response=WWW-Authenticate: Negotiate + id=default response=WWW-Authenticate: Basic realm="example.com" + EOF + + test_config_global credential.helper test-helper && + GIT_TRACE_CURL="$TRASH_DIRECTORY/trace-auto" \ + git -c http.emptyAuth=auto \ + ls-remote "$HTTPD_URL/custom_auth/repo.git" && + + # In auto mode with a Negotiate+Basic server, there should be + # three 401 responses: (1) initial no-auth request, (2) empty-auth + # retry where Negotiate fails (no Kerberos ticket), (3) libcurl + # internal Negotiate retry. The fourth attempt uses Basic + # credentials from credential_fill and succeeds. 
+ grep "HTTP/[0-9.]* 401" "$TRASH_DIRECTORY/trace-auto" >actual_401s && + test_line_count = 3 actual_401s && + + expect_credential_query get <<-EOF + capability[]=authtype + capability[]=state + protocol=http + host=$HTTPD_DEST + wwwauth[]=Negotiate + wwwauth[]=Basic realm="example.com" + EOF +' + +test_expect_success SPNEGO 'http.emptyAuth=false skips Negotiate' ' + test_when_finished "per_test_cleanup" && + + set_credential_reply get <<-EOF && + username=alice + password=secret-passwd + EOF + + # Basic base64(alice:secret-passwd) + cat >"$HTTPD_ROOT_PATH/custom-auth.valid" <<-EOF && + id=1 creds=Basic YWxpY2U6c2VjcmV0LXBhc3N3ZA== + EOF + + cat >"$HTTPD_ROOT_PATH/custom-auth.challenge" <<-EOF && + id=1 status=200 + id=default response=WWW-Authenticate: Negotiate + id=default response=WWW-Authenticate: Basic realm="example.com" + EOF + + test_config_global credential.helper test-helper && + GIT_TRACE_CURL="$TRASH_DIRECTORY/trace-false" \ + git -c http.emptyAuth=false \ + ls-remote "$HTTPD_URL/custom_auth/repo.git" && + + # With emptyAuth=false, Negotiate is stripped immediately and + # credential_fill is called right away. Only one 401 response. + grep "HTTP/[0-9.]* 401" "$TRASH_DIRECTORY/trace-false" >actual_401s && + test_line_count = 1 actual_401s +' + test_done From f260472f7b7147d3458e0be3b9eff3a5df5334bb Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 24 Jun 2019 23:45:21 +0200 Subject: [PATCH 086/218] mingw: stop using nedmalloc The vendored nedmalloc allocator under compat/nedmalloc/ has been unmaintained upstream for a very long time: the original repository at https://github.com/ned14/nedmalloc received its last commit on July 5, 2014, and was archived (made read-only) by its owner on March 15, 2019. Our copy has been carried forward unchanged ever since. 
The Git for Windows commit that introduced mimalloc as a replacement on Windows ("mingw: use mimalloc", 2019-06-24, present in the Git for Windows branch thicket but not upstream) already observed at that time that nedmalloc had ceased to see any updates for several years. This came to a head when the Git for Windows SDK upgraded to GCC 16: the `add_segment()` function in `compat/nedmalloc/malloc.c.h` declares `int nfences = 0` and only references it inside an `assert()`, which GCC 16 now flags as `-Wunused-but-set-variable`. Combined with the `-Werror` enabled by `DEVELOPER=1`, this turns into a hard build failure: compat/nedmalloc/malloc.c.h: In function 'add_segment': compat/nedmalloc/malloc.c.h:3897:7: error: variable 'nfences' set but not used [-Werror=unused-but-set-variable=] 3897 | int nfences = 0; | ^~~~~~~ cc1.exe: all warnings being treated as errors The same source built without complaint under GCC 15.2.0; the regression was bisected to the SDK package update at https://github.com/git-for-windows/git-sdk-64/commit/188d93dd455 (`mingw-w64-x86_64-gcc 15.2.0-14 -> 16.1.0-1`), with the failing CI run captured at https://github.com/git-for-windows/git-sdk-64/actions/runs/25244795074. Rather than patch the unmaintained vendored sources to silence the warning, stop opting into nedmalloc altogether on MINGW. The platform allocator is what every non-MINGW build already uses, and a fresh build of git.git's master against a minimal Git for Windows SDK upgraded to GCC 16, with `USE_NED_ALLOCATOR` removed from the MINGW section, completes successfully. The compat/nedmalloc/ subtree itself is left in place to keep this change minimal; nothing in the build links against it any longer, so it can be removed in a follow-up if desired. 
Assisted-by: Claude Opus 4.7 Signed-off-by: Johannes Schindelin --- config.mak.uname | 3 --- 1 file changed, 3 deletions(-) diff --git a/config.mak.uname b/config.mak.uname index d643a3a5fbbacc..8a391157e48c5b 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -758,9 +758,6 @@ ifeq ($(uname_S),MINGW) HAVE_LIBCHARSET_H = YesPlease USE_GETTEXT_SCHEME = fallthrough USE_LIBPCRE = YesPlease - ifneq (CLANGARM64,$(MSYSTEM)) - USE_NED_ALLOCATOR = YesPlease - endif NO_PYTHON = ifeq (/mingw64,$(subst 32,64,$(subst clangarm,mingw,$(prefix)))) # Move system config into top-level /etc/ From 4d1fb6b187326204375134eab5472e322b724f0f Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sun, 23 Nov 2025 11:11:01 +0100 Subject: [PATCH 087/218] mingw: stop hard-coding `CC = gcc` This is no longer true in general, not with supporting Clang out of the box. Signed-off-by: Johannes Schindelin --- config.mak.uname | 1 - 1 file changed, 1 deletion(-) diff --git a/config.mak.uname b/config.mak.uname index 9aef8175a058dd..7636452c56bd61 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -750,7 +750,6 @@ ifeq ($(uname_S),MINGW) COMPAT_CFLAGS += -D_USE_32BIT_TIME_T BASIC_LDFLAGS += -Wl,--large-address-aware endif - CC = gcc COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \ -fstack-protector-strong EXTLIBS += -lntdll From 0d244323f6a34f1cac9d1504a6a696cf5662e045 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 21 Nov 2025 12:15:12 +0100 Subject: [PATCH 088/218] mingw: drop the -D_USE_32BIT_TIME_T option This option was added in fa93bb20d72 (MinGW: Fix stat definitions to work with MinGW runtime version 4.0, 2013-09-11), i.e. a _long_ time ago. So long, in fact, that it still targeted MinGW. But we switched to mingw-w64 in 2015, which seems not to share the problem, and therefore does not require a fix. 
Even worse: This flag is incompatible with UCRT64, which we are about to support by way of upstreaming `mingw-w64-git` to the MSYS2 project, see https://github.com/msys2/MINGW-packages/pull/26470 for details. So let's send that option into its well-deserved retirement. Signed-off-by: Johannes Schindelin --- config.mak.uname | 1 - 1 file changed, 1 deletion(-) diff --git a/config.mak.uname b/config.mak.uname index 7636452c56bd61..72a1a592119858 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -747,7 +747,6 @@ ifeq ($(uname_S),MINGW) HOST_CPU = aarch64 BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup else - COMPAT_CFLAGS += -D_USE_32BIT_TIME_T BASIC_LDFLAGS += -Wl,--large-address-aware endif COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \ From c470f27ac6dafa7ca91b40978a2d01a4ea767d1a Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 21 Nov 2025 12:38:21 +0100 Subject: [PATCH 089/218] mingw: only use -Wl,--large-address-aware for 32-bit builds That option only matters there, and is in fact only really understood in those builds; UCRT64 versions of GCC, for example, do not know what to do with that option. 
Signed-off-by: Johannes Schindelin --- config.mak.uname | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/config.mak.uname b/config.mak.uname index 72a1a592119858..a9ace4af44be04 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -736,9 +736,8 @@ ifeq ($(uname_S),MINGW) ifeq (MINGW32,$(MSYSTEM)) prefix = /mingw32 HOST_CPU = i686 - BASIC_LDFLAGS += -Wl,--pic-executable,-e,_mainCRTStartup - endif - ifeq (MINGW64,$(MSYSTEM)) + BASIC_LDFLAGS += -Wl,--pic-executable,-e,_mainCRTStartup -Wl,--large-address-aware + else ifeq (MINGW64,$(MSYSTEM)) prefix = /mingw64 HOST_CPU = x86_64 BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup @@ -747,7 +746,6 @@ ifeq ($(uname_S),MINGW) HOST_CPU = aarch64 BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup else - BASIC_LDFLAGS += -Wl,--large-address-aware endif COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \ -fstack-protector-strong From 05f2ffade4ff06ac6a476674ebbcbf6cf32b5726 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Tue, 30 Mar 2021 14:25:31 -0400 Subject: [PATCH 090/218] clink.pl: fix libexpatd.lib link error when using MSVC When building with `make MSVC=1 DEBUG=1`, link to `libexpatd.lib` rather than `libexpat.lib`. It appears that the `vcpkg` package for "libexpat" has changed and now creates `libexpatd.lib` for debug mode builds. Previously, both debug and release builds created a ".lib" with the same basename. 
Signed-off-by: Jeff Hostetler --- compat/vcbuild/scripts/clink.pl | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/compat/vcbuild/scripts/clink.pl b/compat/vcbuild/scripts/clink.pl index 3bd824154be381..2768ae15f1879f 100755 --- a/compat/vcbuild/scripts/clink.pl +++ b/compat/vcbuild/scripts/clink.pl @@ -66,7 +66,11 @@ } push(@args, $lib); } elsif ("$arg" eq "-lexpat") { + if ($is_debug) { + push(@args, "libexpatd.lib"); + } else { push(@args, "libexpat.lib"); + } } elsif ("$arg" =~ /^-L/ && "$arg" ne "-LTCG") { $arg =~ s/^-L/-LIBPATH:/; push(@lflags, $arg); From 3606ba6e1d1a5f9eb8ed1a6b4f3010a657d439ef Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 21 Nov 2025 13:44:56 +0100 Subject: [PATCH 091/218] mingw: avoid over-specifying `--pic-executable` In bf2d5d8239e (Don't let ld strip relocations, 2016-01-16) (picked from https://github.com/git-for-windows/git/pull/612/commits/6a237925bf10), Git for Windows introduced the `-Wl,-pic-executable` flag, specifying the exact entry point via `-e`. This required discerning between i686 and x86_64 code because the former required the symbol to be prefixed with an underscore, the latter did not. As per https://sourceware.org/bugzilla/show_bug.cgi?id=10865, the specified symbols are already the default, though. So let's drop the overly-specific definition. 
Signed-off-by: Johannes Schindelin --- config.mak.uname | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/config.mak.uname b/config.mak.uname index a9ace4af44be04..019c7e514a067a 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -736,15 +736,15 @@ ifeq ($(uname_S),MINGW) ifeq (MINGW32,$(MSYSTEM)) prefix = /mingw32 HOST_CPU = i686 - BASIC_LDFLAGS += -Wl,--pic-executable,-e,_mainCRTStartup -Wl,--large-address-aware + BASIC_LDFLAGS += -Wl,--pic-executable -Wl,--large-address-aware else ifeq (MINGW64,$(MSYSTEM)) prefix = /mingw64 HOST_CPU = x86_64 - BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup + BASIC_LDFLAGS += -Wl,--pic-executable else ifeq (CLANGARM64,$(MSYSTEM)) prefix = /clangarm64 HOST_CPU = aarch64 - BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup + BASIC_LDFLAGS += -Wl,--pic-executable else endif COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \ From d08f8241b9769815309a44428f6425c0d62a8836 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Mon, 5 Apr 2021 15:27:38 -0400 Subject: [PATCH 092/218] Makefile: clean up .ilk files when MSVC=1 Signed-off-by: Jeff Hostetler --- Makefile | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Makefile b/Makefile index cedc234173e377..d77d78584d3e45 100644 --- a/Makefile +++ b/Makefile @@ -3911,12 +3911,15 @@ ifdef MSVC $(RM) $(patsubst %.o,%.o.pdb,$(OBJECTS)) $(RM) headless-git.o.pdb $(RM) $(patsubst %.exe,%.pdb,$(OTHER_PROGRAMS)) + $(RM) $(patsubst %.exe,%.ilk,$(OTHER_PROGRAMS)) $(RM) $(patsubst %.exe,%.iobj,$(OTHER_PROGRAMS)) $(RM) $(patsubst %.exe,%.ipdb,$(OTHER_PROGRAMS)) $(RM) $(patsubst %.exe,%.pdb,$(PROGRAMS)) + $(RM) $(patsubst %.exe,%.ilk,$(PROGRAMS)) $(RM) $(patsubst %.exe,%.iobj,$(PROGRAMS)) $(RM) $(patsubst %.exe,%.ipdb,$(PROGRAMS)) $(RM) $(patsubst %.exe,%.pdb,$(TEST_PROGRAMS)) + $(RM) $(patsubst %.exe,%.ilk,$(TEST_PROGRAMS)) $(RM) $(patsubst %.exe,%.iobj,$(TEST_PROGRAMS)) $(RM) $(patsubst %.exe,%.ipdb,$(TEST_PROGRAMS)) $(RM) 
compat/vcbuild/MSVC-DEFS-GEN From e8f59ba24425218577ac228f13df9f674543add3 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 21 Nov 2025 13:53:19 +0100 Subject: [PATCH 093/218] mingw: set the prefix and HOST_CPU as per MSYS2's settings MSYS2 already defines a couple of helpful environment variables, and we can use those to infer the installation location as well as the CPU. No need for hard-coding ;-) Signed-off-by: Johannes Schindelin --- config.mak.uname | 18 ++++++------------ 1 file changed, 6 insertions(+), 12 deletions(-) diff --git a/config.mak.uname b/config.mak.uname index 019c7e514a067a..39e11cb980fc1d 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -733,19 +733,13 @@ ifeq ($(uname_S),MINGW) ifneq (,$(findstring -O,$(filter-out -O0 -Og,$(CFLAGS)))) BASIC_LDFLAGS += -Wl,--dynamicbase endif - ifeq (MINGW32,$(MSYSTEM)) - prefix = /mingw32 - HOST_CPU = i686 - BASIC_LDFLAGS += -Wl,--pic-executable -Wl,--large-address-aware - else ifeq (MINGW64,$(MSYSTEM)) - prefix = /mingw64 - HOST_CPU = x86_64 - BASIC_LDFLAGS += -Wl,--pic-executable - else ifeq (CLANGARM64,$(MSYSTEM)) - prefix = /clangarm64 - HOST_CPU = aarch64 + ifneq (,$(MSYSTEM)) + prefix = $(MINGW_PREFIX) + HOST_CPU = $(patsubst %-w64-mingw32,%,$(MINGW_CHOST)) BASIC_LDFLAGS += -Wl,--pic-executable - else + ifeq (MINGW32,$(MSYSTEM)) + BASIC_LDFLAGS += -Wl,--large-address-aware + endif endif COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \ -fstack-protector-strong From 852bc764c8ad2b7babaace36f307a3b5b6477aee Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Mon, 5 Apr 2021 14:08:22 -0400 Subject: [PATCH 094/218] vcbuild: add support for compiling Windows resource files Create a wrapper for the Windows Resource Compiler (RC.EXE) for use by the MSVC=1 builds. This is similar to the CL.EXE and LIB.EXE wrappers used for the MSVC=1 builds. 
Signed-off-by: Jeff Hostetler --- compat/vcbuild/find_vs_env.bat | 7 ++++++ compat/vcbuild/scripts/rc.pl | 46 ++++++++++++++++++++++++++++++++++ config.mak.uname | 3 ++- 3 files changed, 55 insertions(+), 1 deletion(-) create mode 100644 compat/vcbuild/scripts/rc.pl diff --git a/compat/vcbuild/find_vs_env.bat b/compat/vcbuild/find_vs_env.bat index b35d264c0e6bed..379b16296e09c2 100644 --- a/compat/vcbuild/find_vs_env.bat +++ b/compat/vcbuild/find_vs_env.bat @@ -99,6 +99,7 @@ REM ================================================================ SET sdk_dir=%WindowsSdkDir% SET sdk_ver=%WindowsSDKVersion% + SET sdk_ver_bin_dir=%WindowsSdkVerBinPath%%tgt% SET si=%sdk_dir%Include\%sdk_ver% SET sdk_includes=-I"%si%ucrt" -I"%si%um" -I"%si%shared" SET sl=%sdk_dir%lib\%sdk_ver% @@ -130,6 +131,7 @@ REM ================================================================ SET sdk_dir=%WindowsSdkDir% SET sdk_ver=%WindowsSDKVersion% + SET sdk_ver_bin_dir=%WindowsSdkVerBinPath%bin\amd64 SET si=%sdk_dir%Include\%sdk_ver% SET sdk_includes=-I"%si%ucrt" -I"%si%um" -I"%si%shared" -I"%si%winrt" SET sl=%sdk_dir%lib\%sdk_ver% @@ -160,6 +162,11 @@ REM ================================================================ echo msvc_includes=%msvc_includes% echo msvc_libs=%msvc_libs% + echo sdk_ver_bin_dir=%sdk_ver_bin_dir% + SET X1=%sdk_ver_bin_dir:C:=/C% + SET X2=%X1:\=/% + echo sdk_ver_bin_dir_msys=%X2% + echo sdk_includes=%sdk_includes% echo sdk_libs=%sdk_libs% diff --git a/compat/vcbuild/scripts/rc.pl b/compat/vcbuild/scripts/rc.pl new file mode 100644 index 00000000000000..7bca4cd81c6c63 --- /dev/null +++ b/compat/vcbuild/scripts/rc.pl @@ -0,0 +1,46 @@ +#!/usr/bin/perl -w +###################################################################### +# Compile Resources on Windows +# +# This is a wrapper to facilitate the compilation of Git with MSVC +# using GNU Make as the build system. 
So, instead of manipulating the +# Makefile into something nasty, just to support non-space arguments +# etc, we use this wrapper to fix the command line options +# +###################################################################### +use strict; +my @args = (); +my @input = (); + +while (@ARGV) { + my $arg = shift @ARGV; + if ("$arg" =~ /^-[dD]/) { + # GIT_VERSION gets passed with too many + # layers of dquote escaping. + $arg =~ s/\\"/"/g; + + push(@args, $arg); + + } elsif ("$arg" eq "-i") { + my $arg = shift @ARGV; + # TODO complain if NULL or is dashed ?? + push(@input, $arg); + + } elsif ("$arg" eq "-o") { + my $arg = shift @ARGV; + # TODO complain if NULL or is dashed ?? + push(@args, "-fo$arg"); + + } else { + push(@args, $arg); + } +} + +push(@args, "-nologo"); +push(@args, "-v"); +push(@args, @input); + +unshift(@args, "rc.exe"); +printf("**** @args\n"); + +exit (system(@args) != 0); diff --git a/config.mak.uname b/config.mak.uname index 8cbd55c96f38a4..d428dc33560849 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -450,7 +450,7 @@ ifeq ($(uname_S),Windows) # link.exe next to, and required by, cl.exe, we have to prepend this # onto the existing $PATH. 
# - SANE_TOOL_PATH ?= $(msvc_bin_dir_msys) + SANE_TOOL_PATH ?= $(msvc_bin_dir_msys):$(sdk_ver_bin_dir_msys) HAVE_ALLOCA_H = YesPlease NO_PREAD = YesPlease NEEDS_CRYPTO_WITH_SSL = YesPlease @@ -521,6 +521,7 @@ endif # See https://msdn.microsoft.com/en-us/library/ms235330.aspx EXTLIBS = user32.lib advapi32.lib shell32.lib wininet.lib ws2_32.lib invalidcontinue.obj kernel32.lib ntdll.lib PTHREAD_LIBS = + RC = compat/vcbuild/scripts/rc.pl lib = BASIC_CFLAGS += $(vcpkg_inc) $(sdk_includes) $(msvc_includes) ifndef DEBUG From 6e984f774e146a90f3213dd3c7a759a692de2f89 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 21 Nov 2025 14:09:40 +0100 Subject: [PATCH 095/218] mingw: only enable the MSYS2-specific stuff when compiling in MSYS2 The tell-tale is the presence of the `MSYSTEM` value while compiling, of course. In that case, we want to ensure that `MSYSTEM` is set when running `git.exe`, and also enable the magic MSYS2 tty detection. Signed-off-by: Johannes Schindelin --- config.mak.uname | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/config.mak.uname b/config.mak.uname index 39e11cb980fc1d..1b75cda2e2ac0e 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -737,12 +737,12 @@ ifeq ($(uname_S),MINGW) prefix = $(MINGW_PREFIX) HOST_CPU = $(patsubst %-w64-mingw32,%,$(MINGW_CHOST)) BASIC_LDFLAGS += -Wl,--pic-executable + COMPAT_CFLAGS += -DDETECT_MSYS_TTY ifeq (MINGW32,$(MSYSTEM)) BASIC_LDFLAGS += -Wl,--large-address-aware endif endif - COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \ - -fstack-protector-strong + COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -fstack-protector-strong EXTLIBS += -lntdll EXTRA_PROGRAMS += headless-git$X INSTALL = /bin/install From de1dd1bd902c53fa8a5c1a338d4c3a7a6871e920 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Mon, 5 Apr 2021 14:12:14 -0400 Subject: [PATCH 096/218] config.mak.uname: add git.rc to MSVC builds Teach MSVC=1 builds to depend on the `git.rc` file so that the resulting 
executables have Windows-style resources and version number information within them. Signed-off-by: Jeff Hostetler --- config.mak.uname | 1 + 1 file changed, 1 insertion(+) diff --git a/config.mak.uname b/config.mak.uname index d428dc33560849..f25e60c3752658 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -520,6 +520,7 @@ endif # handle twice, or to access the osfhandle of an already-closed stdout # See https://msdn.microsoft.com/en-us/library/ms235330.aspx EXTLIBS = user32.lib advapi32.lib shell32.lib wininet.lib ws2_32.lib invalidcontinue.obj kernel32.lib ntdll.lib + GITLIBS += git.res PTHREAD_LIBS = RC = compat/vcbuild/scripts/rc.pl lib = From 546d38dcfa069d85befb04a6c2403d38b52321e2 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 21 Nov 2025 14:17:24 +0100 Subject: [PATCH 097/218] mingw: rely on MSYS2's metadata instead of hard-coding it MSYS2 defines some helpful environment variables, e.g. `MSYSTEM`. There is code in Git for Windows to ensure that that `MSYSTEM` variable is set, hard-coding a default. However, the existing solution jumps through hoops to reconstruct the proper default, and is even incomplete doing so, as we found out when we extended it to support CLANGARM64. This is absolutely unnecessary because there is already a perfectly valid `MSYSTEM` value we can use at build time. This is even true when building the MINGW32 variant on a MINGW64 system because `makepkg-mingw` will override the `MSYSTEM` value as per the `MINGW_ARCH` array. The same is equally true for the `/mingw64`, `/mingw32` and `/clangarm64` prefix: those values are already available via the `MINGW_PREFIX` environment variable, and we just need to pass that setting through. Only when `MINGW_PREFIX` is not set (as is the case in Git for Windows' minimal SDK, where only `MSYSTEM` is guaranteed to be set correctly), we use as fall-back the top-level directory whose name is the down-cased value of the `MSYSTEM` variable. 
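The fall-back rule can be sketched in plain shell. This is only an illustration of the logic that the `config.mak.uname` hunk in this patch expresses with `filter-out`; the function name `resolve_mingw_prefix` is made up:

```shell
# Hypothetical sketch: $1 is the MSYSTEM value, $2 the MINGW_PREFIX
# value (possibly empty).
resolve_mingw_prefix () {
	case "$2" in
	/*)
		# Already an absolute path: pass it through unchanged.
		printf '%s\n' "$2"
		;;
	*)
		# Empty, or not starting with a slash: derive the prefix
		# from the down-cased MSYSTEM value.
		printf '/%s\n' "$(printf '%s' "$1" | tr A-Z a-z)"
		;;
	esac
}
```

With this, `resolve_mingw_prefix UCRT64 ""` yields `/ucrt64`, while `resolve_mingw_prefix MINGW64 /mingw64` leaves the value from the environment alone.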
Incidentally, this also broadens the support to all the configurations supported by the MSYS2 project, i.e. clang64 & ucrt64, too. Note: This keeps the same, hard-coded MSYSTEM platform support for CMake as before, but drops it for Meson (because it is unclear how Meson could do this in a more flexible manner). Signed-off-by: Johannes Schindelin --- config.mak.uname | 14 ++++++-------- contrib/buildsystems/CMakeLists.txt | 9 ++++++++- meson.build | 13 ++++++++++++- meson_options.txt | 4 ++++ 4 files changed, 30 insertions(+), 10 deletions(-) diff --git a/config.mak.uname b/config.mak.uname index 1b75cda2e2ac0e..8e64d0db20c9b0 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -441,14 +441,8 @@ ifeq ($(uname_S),Windows) GIT_VERSION := $(GIT_VERSION).MSVC pathsep = ; # Assume that this is built in Git for Windows' SDK - ifeq (MINGW32,$(MSYSTEM)) - prefix = /mingw32 - else - ifeq (CLANGARM64,$(MSYSTEM)) - prefix = /clangarm64 - else - prefix = /mingw64 - endif + ifneq (,$(MSYSTEM)) + prefix = $(MINGW_PREFIX) endif # Prepend MSVC 64-bit tool-chain to PATH. 
# @@ -734,6 +728,10 @@ ifeq ($(uname_S),MINGW) BASIC_LDFLAGS += -Wl,--dynamicbase endif ifneq (,$(MSYSTEM)) + ifeq ($(MINGW_PREFIX),$(filter-out /%,$(MINGW_PREFIX))) + # Override if empty or does not start with a slash + MINGW_PREFIX := /$(shell echo '$(MSYSTEM)' | tr A-Z a-z) + endif prefix = $(MINGW_PREFIX) HOST_CPU = $(patsubst %-w64-mingw32,%,$(MINGW_CHOST)) BASIC_LDFLAGS += -Wl,--pic-executable diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index 81b4306e72046c..f3cbf743d20269 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -256,7 +256,14 @@ if(CMAKE_SYSTEM_NAME STREQUAL "Windows") _CONSOLE DETECT_MSYS_TTY STRIP_EXTENSION=".exe" NO_SYMLINK_HEAD UNRELIABLE_FSTAT NOGDI OBJECT_CREATION_MODE=1 __USE_MINGW_ANSI_STDIO=0 USE_NED_ALLOCATOR OVERRIDE_STRDUP MMAP_PREVENTS_DELETE USE_WIN32_MMAP - HAVE_WPGMPTR ENSURE_MSYSTEM_IS_SET HAVE_RTLGENRANDOM) + HAVE_WPGMPTR HAVE_RTLGENRANDOM) + if(CMAKE_GENERATOR_PLATFORM STREQUAL "x64") + add_compile_definitions(ENSURE_MSYSTEM_IS_SET="MINGW64" MINGW_PREFIX="mingw64") + elseif(CMAKE_GENERATOR_PLATFORM STREQUAL "arm64") + add_compile_definitions(ENSURE_MSYSTEM_IS_SET="CLANGARM64" MINGW_PREFIX="clangarm64") + elseif(CMAKE_GENERATOR_PLATFORM STREQUAL "x86") + add_compile_definitions(ENSURE_MSYSTEM_IS_SET="MINGW32" MINGW_PREFIX="mingw32") + endif() list(APPEND compat_SOURCES compat/mingw.c compat/winansi.c diff --git a/meson.build b/meson.build index 11488623bfd8f8..7a6b54df6bec84 100644 --- a/meson.build +++ b/meson.build @@ -1290,7 +1290,6 @@ elif host_machine.system() == 'windows' libgit_c_args += [ '-DDETECT_MSYS_TTY', - '-DENSURE_MSYSTEM_IS_SET', '-DNATIVE_CRLF', '-DNOGDI', '-DNO_POSIX_GOODIES', @@ -1300,6 +1299,18 @@ elif host_machine.system() == 'windows' '-D__USE_MINGW_ANSI_STDIO=0', ] + msystem = get_option('msystem') + if msystem != '' + mingw_prefix = get_option('mingw_prefix') + if mingw_prefix == '' + mingw_prefix = '/' + 
msystem.to_lower() + endif + libgit_c_args += [ + '-DENSURE_MSYSTEM_IS_SET="' + msystem + '"', + '-DMINGW_PREFIX="' + mingw_prefix + '"' + ] + endif + libgit_dependencies += compiler.find_library('ntdll') libgit_include_directories += 'compat/win32' if compiler.get_id() == 'msvc' diff --git a/meson_options.txt b/meson_options.txt index 659cbb218f46e0..4c77708a28aa01 100644 --- a/meson_options.txt +++ b/meson_options.txt @@ -21,6 +21,10 @@ option('runtime_prefix', type: 'boolean', value: false, description: 'Resolve ancillary tooling and support files relative to the location of the runtime binary instead of hard-coding them into the binary.') option('sane_tool_path', type: 'array', value: [], description: 'An array of paths to pick up tools from in case the normal tools are broken or lacking.') +option('msystem', type: 'string', value: '', + description: 'Fall-back on Windows when MSYSTEM is not set.') +option('mingw_prefix', type: 'string', value: '', + description: 'Fall-back on Windows when MINGW_PREFIX is not set.') # Build information compiled into Git and other parts like documentation. option('build_date', type: 'string', value: '', From b1c84adabefa9daf69ed0974b0f2fa926342b8c7 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 21 Feb 2017 13:28:58 +0100 Subject: [PATCH 098/218] mingw: ensure valid CTYPE A change between versions 2.4.1 and 2.6.0 of the MSYS2 runtime modified how Cygwin's runtime (and hence Git for Windows' MSYS2 runtime derivative) handles locales: d16a56306d (Consolidate wctomb/mbtowc calls for POSIX-1.2008, 2016-07-20). An unintended side-effect is that "cold-calling" into the POSIX emulation will start with a locale based on the current code page, something that Git for Windows is very ill-prepared for, as it expects to be able to pass a command-line containing non-ASCII characters to the shell without having those characters munged. 
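In shell terms, the guard this patch adds to `setup_windows_environment()` amounts to roughly the following. It is a sketch of the logic only, with a made-up function name, and it assumes that "set" means `getenv()` returning non-NULL, so that a variable set to the empty string still counts:

```shell
# Hypothetical sketch of the C logic: default LC_CTYPE to C.UTF-8 only
# when none of LC_ALL, LC_CTYPE and LANG is set.
ensure_ctype () {
	# ${VAR+set} expands to "set" when VAR is set (even if empty)
	# and to nothing when it is unset.
	if test -z "${LC_ALL+set}${LC_CTYPE+set}${LANG+set}"
	then
		LC_CTYPE=C.UTF-8
		export LC_CTYPE
	fi
}
```

Any locale setting chosen by the user therefore takes precedence; the default only applies when the environment gives no hint at all.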
One symptom of this behavior: when `git clone` or `git fetch` shell out to call `git-upload-pack` with a path that contains non-ASCII characters, the shell tried to interpret the entire command-line (including command-line parameters) as executable path, which obviously must fail. This fixes https://github.com/git-for-windows/git/issues/1036 Signed-off-by: Johannes Schindelin --- compat/mingw.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/compat/mingw.c b/compat/mingw.c index feefa2cd0eb12a..c3e16e6b9cbe72 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -3154,6 +3154,9 @@ static void setup_windows_environment(void) setenv("HOME", tmp, 1); } + if (!getenv("LC_ALL") && !getenv("LC_CTYPE") && !getenv("LANG")) + setenv("LC_CTYPE", "C.UTF-8", 1); + /* * Change 'core.symlinks' default to false, unless native symlinks are * enabled in MSys2 (via 'MSYS=winsymlinks:nativestrict'). Thus we can From f3d894008ab5f4146645bfacb2a4fd9291f4340e Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Mon, 5 Apr 2021 14:24:52 -0400 Subject: [PATCH 099/218] clink.pl: ignore no-stack-protector arg on MSVC=1 builds Ignore the `-fno-stack-protector` compiler argument when building with MSVC. This will be used in a later commit that needs to build a Win32 GUI app. 
Signed-off-by: Jeff Hostetler --- compat/vcbuild/scripts/clink.pl | 2 ++ 1 file changed, 2 insertions(+) diff --git a/compat/vcbuild/scripts/clink.pl b/compat/vcbuild/scripts/clink.pl index 2768ae15f1879f..73c8a2b184f38b 100755 --- a/compat/vcbuild/scripts/clink.pl +++ b/compat/vcbuild/scripts/clink.pl @@ -122,6 +122,8 @@ push(@cflags, "-wd4996"); } elsif ("$arg" =~ /^-W[a-z]/) { # let's ignore those + } elsif ("$arg" eq "-fno-stack-protector") { + # eat this } else { push(@args, $arg); } From 1fb6dab0761e07ba6ccf15a077b0c3f5a33c280e Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 21 Nov 2025 14:45:45 +0100 Subject: [PATCH 100/218] mingw: always define `ETC_*` for MSYS2 environments Special-casing even more configurations simply does not make sense. Signed-off-by: Johannes Schindelin --- config.mak.uname | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/config.mak.uname b/config.mak.uname index 8e64d0db20c9b0..9759198b0148b7 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -496,7 +496,7 @@ ifeq ($(uname_S),Windows) NATIVE_CRLF = YesPlease DEFAULT_HELP_FORMAT = html SKIP_DASHED_BUILT_INS = YabbaDabbaDoo -ifeq (/mingw64,$(subst 32,64,$(subst clangarm,mingw,$(prefix)))) +ifneq (,$(MINGW_PREFIX)) # Move system config into top-level /etc/ ETC_GITCONFIG = ../etc/gitconfig ETC_GITATTRIBUTES = ../etc/gitattributes @@ -739,6 +739,9 @@ ifeq ($(uname_S),MINGW) ifeq (MINGW32,$(MSYSTEM)) BASIC_LDFLAGS += -Wl,--large-address-aware endif + # Move system config into top-level /etc/ + ETC_GITCONFIG = ../etc/gitconfig + ETC_GITATTRIBUTES = ../etc/gitattributes endif COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -fstack-protector-strong EXTLIBS += -lntdll @@ -749,11 +752,6 @@ ifeq ($(uname_S),MINGW) USE_GETTEXT_SCHEME = fallthrough USE_LIBPCRE = YesPlease NO_PYTHON = - ifeq (/mingw64,$(subst 32,64,$(subst clangarm,mingw,$(prefix)))) - # Move system config into top-level /etc/ - ETC_GITCONFIG = ../etc/gitconfig - ETC_GITATTRIBUTES = 
../etc/gitattributes - endif endif ifeq ($(uname_S),QNX) COMPAT_CFLAGS += -DSA_RESTART=0 From be2b2f4256759238e74612760cd9936eb80dc34f Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sat, 1 Feb 2020 00:31:16 +0100 Subject: [PATCH 101/218] mingw: allow `git.exe` to be used instead of the "Git wrapper" Git for Windows wants to add `git.exe` to the users' `PATH`, without cluttering the latter with unnecessary executables such as `wish.exe`. To that end, it invented the concept of its "Git wrapper", i.e. a tiny executable located in `C:\Program Files\Git\cmd\git.exe` (originally a CMD script) whose sole purpose is to set up a couple of environment variables and then spawn the _actual_ `git.exe` (which nowadays lives in `C:\Program Files\Git\mingw64\bin\git.exe` for 64-bit, and the obvious equivalent for 32-bit installations). Currently, the following environment variables are set unless already initialized: - `MSYSTEM`, to make sure that the MSYS2 Bash and the MSYS2 Perl interpreter behave as expected, and - `PLINK_PROTOCOL`, to force PuTTY's `plink.exe` to use the SSH protocol instead of Telnet, - `PATH`, to make sure that the `bin` folder in the user's home directory, as well as the `/mingw64/bin` and the `/usr/bin` directories are included. The trick here is that the `/mingw64/bin/` and `/usr/bin/` directories are relative to the top-level installation directory of Git for Windows (which the included Bash interprets as `/`, i.e. as the MSYS pseudo root directory). Using the absence of `MSYSTEM` as a tell-tale, we can detect in `git.exe` whether these environment variables have been initialized properly. Therefore we can call `C:\Program Files\Git\mingw64\bin\git` in-place after this change, without having to call Git through the Git wrapper. Obviously, above-mentioned directories must be _prepended_ to the `PATH` variable, otherwise we risk picking up executables from unrelated Git installations. 
We do that by constructing the new `PATH` value from scratch, appending `$HOME/bin` (if `HOME` is set), then the MSYS2 system directories, and then appending the original `PATH`.

Side note: this modification of the `PATH` variable is independent of the modification necessary to reach the executables and scripts in `/mingw64/libexec/git-core/`, i.e. the `GIT_EXEC_PATH`. That modification is still performed by Git, elsewhere, long after making the changes described above.

While we _still_ cannot simply hard-link `mingw64\bin\git.exe` to `cmd` (because the former depends on a couple of `.dll` files that are only in `mingw64\bin`, i.e. calling `...\cmd\git.exe` would fail to load due to missing dependencies), at least we can now avoid that extra process of running the Git wrapper (which then has to wait for the spawned `git.exe` to finish) by calling `...\mingw64\bin\git.exe` directly, via its absolute path.

Testing this in Git's test suite is tricky: we set up a "new" MSYS pseudo-root and copy the `git.exe` file into the appropriate location, then verify that `MSYSTEM` is set properly, and also that the `PATH` is modified so that scripts can be found in `$HOME/bin`, `/mingw64/bin/` and `/usr/bin/`.
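
The `PATH` ordering described above (`$HOME/bin` first, then the MSYS2 system directories, then the original `PATH`) can be sketched in portable C as follows. This is only an illustration under stated assumptions, not the code in `compat/mingw.c`: the helpers `append()` and `build_path()` and their parameters are hypothetical, plain `vsnprintf()` stands in for Git's `xsnprintf()`, and the installation root is passed in by the caller instead of being derived from `_wpgmptr`.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: append formatted text at offset *off.
 * Returns 0 on success, -1 if the result would not fit.
 */
static int append(char *buf, size_t size, size_t *off, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf + *off, size - *off, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= size - *off)
		return -1;
	*off += (size_t)n;
	return 0;
}

/*
 * Hypothetical sketch of the ordering only:
 *   <home>\bin;<root>\<mingw_prefix>\bin;<root>\usr\bin;<orig_path>
 * Any of home, root/mingw_prefix and orig_path may be NULL.
 */
static int build_path(char *buf, size_t size, const char *home,
		      const char *root, const char *mingw_prefix,
		      const char *orig_path)
{
	size_t off = 0;

	if (!size)
		return -1;
	buf[0] = '\0';

	/* 1. the user's own bin directory, if HOME is known */
	if (home && append(buf, size, &off, "%s\\bin;", home) < 0)
		return -1;

	/* 2. the MSYS2 system directories below the installation root */
	if (root && mingw_prefix &&
	    append(buf, size, &off, "%s\\%s\\bin;%s\\usr\\bin;",
		   root, mingw_prefix, root) < 0)
		return -1;

	/* 3. the original PATH comes last, so it cannot shadow the above */
	if (orig_path)
		return append(buf, size, &off, "%s", orig_path);

	/* no original PATH: drop the trailing semicolon, if any */
	if (off > 0)
		buf[off - 1] = '\0';
	return 0;
}
```

Calling `build_path()` with a home of `C:\Users\me`, a root of `C:\Git`, a prefix of `mingw64` and an original `PATH` of `C:\Windows` would place the three new directories in front of `C:\Windows`, which is the point of prepending: executables from unrelated Git installations later in `PATH` are not picked up first.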
This addresses https://github.com/git-for-windows/git/issues/2283 Signed-off-by: Johannes Schindelin --- compat/mingw.c | 65 +++++++++++++++++++++++++++++++++++++++++++ config.mak.uname | 8 ++++-- t/t0060-path-utils.sh | 33 +++++++++++++++++++++- 3 files changed, 103 insertions(+), 3 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index c3e16e6b9cbe72..70e5a853932317 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -3102,6 +3102,45 @@ int xwcstoutf(char *utf, const wchar_t *wcs, size_t utflen) return -1; } +#ifdef ENSURE_MSYSTEM_IS_SET +#if !defined(RUNTIME_PREFIX) || !defined(HAVE_WPGMPTR) || !defined(MINGW_PREFIX) +static size_t append_system_bin_dirs(char *path UNUSED, size_t size UNUSED) +{ + return 0; +} +#else +static size_t append_system_bin_dirs(char *path, size_t size) +{ + char prefix[32768]; + const char *slash; + size_t len = xwcstoutf(prefix, _wpgmptr, sizeof(prefix)), off = 0; + + if (len == 0 || len >= sizeof(prefix) || + !(slash = find_last_dir_sep(prefix))) + return 0; + /* strip trailing `git.exe` */ + len = slash - prefix; + + /* strip trailing `cmd` or `\bin` or `bin` or `libexec\git-core` */ + if (strip_suffix_mem(prefix, &len, "\\" MINGW_PREFIX "\\libexec\\git-core") || + strip_suffix_mem(prefix, &len, "\\" MINGW_PREFIX "\\bin")) + off += xsnprintf(path + off, size - off, + "%.*s\\" MINGW_PREFIX "\\bin;", (int)len, prefix); + else if (strip_suffix_mem(prefix, &len, "\\cmd") || + strip_suffix_mem(prefix, &len, "\\bin") || + strip_suffix_mem(prefix, &len, "\\libexec\\git-core")) + off += xsnprintf(path + off, size - off, + "%.*s\\" MINGW_PREFIX "\\bin;", (int)len, prefix); + else + return 0; + + off += xsnprintf(path + off, size - off, + "%.*s\\usr\\bin;", (int)len, prefix); + return off; +} +#endif +#endif + static void setup_windows_environment(void) { char *tmp = getenv("TMPDIR"); @@ -3154,6 +3193,32 @@ static void setup_windows_environment(void) setenv("HOME", tmp, 1); } + if (!getenv("PLINK_PROTOCOL")) + 
setenv("PLINK_PROTOCOL", "ssh", 0); + +#ifdef ENSURE_MSYSTEM_IS_SET + if (!(tmp = getenv("MSYSTEM")) || !tmp[0]) { + const char *home = getenv("HOME"), *path = getenv("PATH"); + char buf[32768]; + size_t off = 0; + + setenv("MSYSTEM", ENSURE_MSYSTEM_IS_SET, 1); + + if (home) + off += xsnprintf(buf + off, sizeof(buf) - off, + "%s\\bin;", home); + off += append_system_bin_dirs(buf + off, sizeof(buf) - off); + if (path) + off += xsnprintf(buf + off, sizeof(buf) - off, + "%s", path); + else if (off > 0) + buf[off - 1] = '\0'; + else + buf[0] = '\0'; + setenv("PATH", buf, 1); + } +#endif + if (!getenv("LC_ALL") && !getenv("LC_CTYPE") && !getenv("LANG")) setenv("LC_CTYPE", "C.UTF-8", 1); diff --git a/config.mak.uname b/config.mak.uname index 9759198b0148b7..8cbd55c96f38a4 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -512,7 +512,9 @@ endif compat/win32/pthread.o compat/win32/syslog.o \ compat/win32/trace2_win32_process_info.o \ compat/win32/dirent.o - COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DDETECT_MSYS_TTY -DNOGDI -DHAVE_STRING_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\" + COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DDETECT_MSYS_TTY \ + -DENSURE_MSYSTEM_IS_SET="\"$(MSYSTEM)\"" -DMINGW_PREFIX="\"$(patsubst /%,%,$(MINGW_PREFIX))\"" \ + -DNOGDI -DHAVE_STRING_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\" BASIC_LDFLAGS = -IGNORE:4217 -IGNORE:4049 -NOLOGO -ENTRY:wmainCRTStartup -SUBSYSTEM:CONSOLE # invalidcontinue.obj allows Git's source code to close the same file # handle twice, or to access the osfhandle of an already-closed stdout @@ -735,7 +737,9 @@ ifeq ($(uname_S),MINGW) prefix = $(MINGW_PREFIX) HOST_CPU = $(patsubst %-w64-mingw32,%,$(MINGW_CHOST)) BASIC_LDFLAGS += -Wl,--pic-executable - COMPAT_CFLAGS += -DDETECT_MSYS_TTY + COMPAT_CFLAGS += -DDETECT_MSYS_TTY \ + -DENSURE_MSYSTEM_IS_SET="\"$(MSYSTEM)\"" \ + -DMINGW_PREFIX="\"$(patsubst /%,%,$(MINGW_PREFIX))\"" ifeq (MINGW32,$(MSYSTEM)) BASIC_LDFLAGS += 
-Wl,--large-address-aware endif diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh index 8545cdfab559b4..56faf5fe732ee0 100755 --- a/t/t0060-path-utils.sh +++ b/t/t0060-path-utils.sh @@ -602,7 +602,8 @@ test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'RUNTIME_PREFIX wor echo "echo HERE" | write_script pretend/libexec/git-core/git-here && GIT_EXEC_PATH= ./pretend/bin/git here >actual && echo HERE >expect && - test_cmp expect actual' + test_cmp expect actual +' test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD '%(prefix)/ works' ' git config yes.path "%(prefix)/yes" && @@ -611,4 +612,34 @@ test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD '%(prefix)/ works' test_cmp expect actual ' +test_expect_success MINGW,RUNTIME_PREFIX 'MSYSTEM/PATH is adjusted if necessary' ' + if test -z "$MINGW_PREFIX" + then + MINGW_PREFIX="/$(echo "${MSYSTEM:-MINGW64}" | tr A-Z a-z)" + fi && + mkdir -p "$HOME"/bin pretend"$MINGW_PREFIX"/bin \ + pretend"$MINGW_PREFIX"/libexec/git-core pretend/usr/bin && + cp "$GIT_EXEC_PATH"/git.exe pretend"$MINGW_PREFIX"/bin/ && + cp "$GIT_EXEC_PATH"/git.exe pretend"$MINGW_PREFIX"/libexec/git-core/ && + # copy the .dll files, if any (happens when building via CMake) + if test -n "$(ls "$GIT_EXEC_PATH"/*.dll 2>/dev/null)" + then + cp "$GIT_EXEC_PATH"/*.dll pretend"$MINGW_PREFIX"/bin/ && + cp "$GIT_EXEC_PATH"/*.dll pretend"$MINGW_PREFIX"/libexec/git-core/ + fi && + echo "env | grep MSYSTEM=" | write_script "$HOME"/bin/git-test-home && + echo "echo ${MINGW_PREFIX#/}" | write_script pretend"$MINGW_PREFIX"/bin/git-test-bin && + echo "echo usr" | write_script pretend/usr/bin/git-test-bin2 && + + ( + MSYSTEM= && + GIT_EXEC_PATH= && + pretend"$MINGW_PREFIX"/libexec/git-core/git.exe test-home >actual && + pretend"$MINGW_PREFIX"/libexec/git-core/git.exe test-bin >>actual && + pretend"$MINGW_PREFIX"/bin/git.exe test-bin2 >>actual + ) && + test_write_lines MSYSTEM=$MSYSTEM "${MINGW_PREFIX#/}" usr >expect && + test_cmp expect 
actual +' + test_done From f717f38a33a2a0d6bec09aec576b1f2dc6ce2caa Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Mon, 5 Apr 2021 14:39:33 -0400 Subject: [PATCH 102/218] clink.pl: move default linker options for MSVC=1 builds Move the default `-ENTRY` and `-SUBSYSTEM` arguments for MSVC=1 builds from `config.mak.uname` into `clink.pl`. These args are constant for console-mode executables. Add support to `clink.pl` for generating a Win32 GUI application using the `-mwindows` argument (to match how GCC does it). This changes the `-ENTRY` and `-SUBSYSTEM` arguments accordingly. Signed-off-by: Jeff Hostetler --- compat/vcbuild/scripts/clink.pl | 11 +++++++++++ config.mak.uname | 2 +- 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/compat/vcbuild/scripts/clink.pl b/compat/vcbuild/scripts/clink.pl index 73c8a2b184f38b..a38b360015ece9 100755 --- a/compat/vcbuild/scripts/clink.pl +++ b/compat/vcbuild/scripts/clink.pl @@ -15,6 +15,7 @@ my @lflags = (); my $is_linking = 0; my $is_debug = 0; +my $is_gui = 0; while (@ARGV) { my $arg = shift @ARGV; if ("$arg" eq "-DDEBUG") { @@ -124,11 +125,21 @@ # let's ignore those } elsif ("$arg" eq "-fno-stack-protector") { # eat this + } elsif ("$arg" eq "-mwindows") { + $is_gui = 1; } else { push(@args, $arg); } } if ($is_linking) { + if ($is_gui) { + push(@args, "-ENTRY:wWinMainCRTStartup"); + push(@args, "-SUBSYSTEM:WINDOWS"); + } else { + push(@args, "-ENTRY:wmainCRTStartup"); + push(@args, "-SUBSYSTEM:CONSOLE"); + } + push(@args, @lflags); unshift(@args, "link.exe"); } else { diff --git a/config.mak.uname b/config.mak.uname index f25e60c3752658..fa8d36f1e31823 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -515,7 +515,7 @@ endif COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DDETECT_MSYS_TTY \ -DENSURE_MSYSTEM_IS_SET="\"$(MSYSTEM)\"" -DMINGW_PREFIX="\"$(patsubst /%,%,$(MINGW_PREFIX))\"" \ -DNOGDI -DHAVE_STRING_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\" - BASIC_LDFLAGS = -IGNORE:4217 
-IGNORE:4049 -NOLOGO -ENTRY:wmainCRTStartup -SUBSYSTEM:CONSOLE + BASIC_LDFLAGS = -IGNORE:4217 -IGNORE:4049 -NOLOGO # invalidcontinue.obj allows Git's source code to close the same file # handle twice, or to access the osfhandle of an already-closed stdout # See https://msdn.microsoft.com/en-us/library/ms235330.aspx From 5db8d3bdcf778d3a7c7da34ba0e3b62fa3b10f3e Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 27 Jan 2023 08:55:21 +0100 Subject: [PATCH 103/218] windows: skip linking `git-` for built-ins It is merely a historical wart that, say, `git-commit` exists in the `libexec/git-core/` directory, a tribute to the original idea to let Git be essentially a bunch of Unix shell scripts revolving around very few "plumbing" (AKA low-level) commands. Git has evolved a lot from there. These days, most of Git's functionality is contained within the `git` executable, in the form of "built-in" commands. To accommodate for scripts that use the "dashed" form of Git commands, even today, Git provides hard-links that make the `git` executable available as, say, `git-commit`, just in case that an old script has not been updated to invoke `git commit`. Those hard-links do not come cheap: they take about half a minute for every build of Git on Windows, they are mistaken for taking up huge amounts of space by some Windows Explorer versions that do not understand hard-links, and therefore many a "bug" report had to be addressed. The "dashed form" has been officially deprecated in Git version 1.5.4, which was released on February 2nd, 2008, i.e. a very long time ago. This deprecation was never finalized by skipping these hard-links, but we can start the process now, in Git for Windows. 
Signed-off-by: Johannes Schindelin --- config.mak.uname | 2 ++ 1 file changed, 2 insertions(+) diff --git a/config.mak.uname b/config.mak.uname index 8a391157e48c5b..9aef8175a058dd 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -501,6 +501,7 @@ ifeq ($(uname_S),Windows) NO_POSIX_GOODIES = UnfortunatelyYes NATIVE_CRLF = YesPlease DEFAULT_HELP_FORMAT = html + SKIP_DASHED_BUILT_INS = YabbaDabbaDoo ifeq (/mingw64,$(subst 32,64,$(subst clangarm,mingw,$(prefix)))) # Move system config into top-level /etc/ ETC_GITCONFIG = ../etc/gitconfig @@ -693,6 +694,7 @@ ifeq ($(uname_S),MINGW) FSMONITOR_DAEMON_BACKEND = win32 FSMONITOR_OS_SETTINGS = win32 + SKIP_DASHED_BUILT_INS = YabbaDabbaDoo RUNTIME_PREFIX = YesPlease HAVE_WPGMPTR = YesWeDo NO_ST_BLOCKS_IN_STRUCT_STAT = YesPlease From fabf7df59c8a93eae736f73a6886d4a951007271 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 21 Nov 2025 15:15:42 +0100 Subject: [PATCH 104/218] max_tree_depth: lower it for clang builds in general on Windows In 436a42215e5 (max_tree_depth: lower it for clangarm64 on Windows, 2025-04-23), I provided a work-around for a nasty issue with clangarm builds, where the stack is exhausted before the maximal tree depth is reached, and the resulting error cannot easily be handled by Git (because it would require Windows-specific handling). Turns out that this is not at all limited to ARM64. In my tests with CLANG64 in MSYS2 on the GitHub Actions runners, the test t6700.4 failed in the exact same way. What's worse: The limit needs to be quite a bit lower for x86_64 than for aarch64. In aforementioned tests, the breaking point was 1232: With 1231 it still worked as expected, with 1232 it would fail with the `STATUS_STACK_OVERFLOW` incorrectly mapped to exit code 127. For comparison, in my tests on GitHub Actions' Windows/ARM64 runners, the breaking point was 1439 instead. 
Therefore the condition needs to be adapted once more, to accommodate (with some safety margin) both aarch64 and x86_64 in clang-based builds on Windows, to let that test pass. Signed-off-by: Johannes Schindelin --- git-compat-util.h | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/git-compat-util.h b/git-compat-util.h index ae1bdc90a4cd6a..15a5c92bd0e5fd 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -597,17 +597,23 @@ static inline bool strip_suffix(const char *str, const char *suffix, * the stack overflow can occur. */ #define DEFAULT_MAX_ALLOWED_TREE_DEPTH 512 -#elif defined(GIT_WINDOWS_NATIVE) && defined(__clang__) && defined(__aarch64__) +#elif defined(GIT_WINDOWS_NATIVE) && defined(__clang__) /* - * Similar to Visual C, it seems that on Windows/ARM64 the clang-based - * builds have a smaller stack space available. When running out of - * that stack space, a `STATUS_STACK_OVERFLOW` is produced. When the + * Similar to Visual C, it seems that clang-based builds on Windows + * have a smaller stack space available. When running out of that + * stack space, a `STATUS_STACK_OVERFLOW` is produced. When the * Git command was run from an MSYS2 Bash, this unfortunately results * in an exit code 127. Let's prevent that by lowering the maximal - * tree depth; This value seems to be low enough. + * tree depth; Unfortunately, it seems that the exact limit differs + * for aarch64 vs x86_64, and the difference is too large to simply + * use a single limit. 
*/ +#if defined(__aarch64__) #define DEFAULT_MAX_ALLOWED_TREE_DEPTH 1280 #else +#define DEFAULT_MAX_ALLOWED_TREE_DEPTH 1152 +#endif +#else #define DEFAULT_MAX_ALLOWED_TREE_DEPTH 2048 #endif From 90e4e77b8affc2a9a0ebb0472b25eb43261cdcb8 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 25 Aug 2020 12:13:26 +0200 Subject: [PATCH 105/218] mingw: ignore HOMEDRIVE/HOMEPATH if it points to Windows' system directory Internally, Git expects the environment variable `HOME` to be set, and to point to the current user's home directory. This environment variable is not set by default on Windows, and therefore Git tries its best to construct one if it finds `HOME` unset. There are actually two different approaches Git tries: first, it looks at `HOMEDRIVE`/`HOMEPATH` because this is widely used in corporate environments with roaming profiles, and a user generally wants their global Git settings to be in a roaming profile. Only when `HOMEDRIVE`/`HOMEPATH` is either unset or does not point to a valid location, Git will fall back to using `USERPROFILE` instead. However, starting with Windows Vista, for secondary logons and services, the environment variables `HOMEDRIVE`/`HOMEPATH` point to Windows' system directory (usually `C:\Windows\system32`). That is undesirable, and that location is usually write-protected anyway. So let's verify that the `HOMEDRIVE`/`HOMEPATH` combo does not point to Windows' system directory before using it, falling back to `USERPROFILE` if it does. 
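
The selection order described above can be sketched in portable C as follows. This is a simplified illustration, not the patch itself: `same_path()` and `pick_home()` are hypothetical names, the system directory is passed in as a string where the real code asks `GetSystemDirectoryW()`, a plain flag replaces the `is_directory()` check, and the comparison is done on narrow strings rather than with `_wcsicmp()` on wide ones.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: case-insensitive equality of two paths. */
static int same_path(const char *a, const char *b)
{
	for (; *a && *b; a++, b++)
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
	return !*a && !*b;
}

/*
 * Hypothetical sketch: prefer HOMEDRIVE + HOMEPATH, unless the combination
 * is not an existing directory or is the Windows system directory; in that
 * case fall back to USERPROFILE (which may be NULL).
 *
 * candidate_is_directory is an assumption of this sketch: the caller tells
 * us whether HOMEDRIVE + HOMEPATH names an existing directory.
 */
static const char *pick_home(char *buf, size_t size,
			     const char *homedrive, const char *homepath,
			     int candidate_is_directory,
			     const char *system_dir,
			     const char *userprofile)
{
	if (homedrive && homepath) {
		int n = snprintf(buf, size, "%s%s", homedrive, homepath);

		if (n >= 0 && (size_t)n < size &&
		    !(system_dir && same_path(buf, system_dir)) &&
		    candidate_is_directory)
			return buf;
	}
	return userprofile;
}
```

With `HOMEDRIVE=C:` and `HOMEPATH=\WINDOWS\System32`, as seen for secondary logons and services, the sketch rejects the candidate even though the directory exists and returns the `USERPROFILE` value instead.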
This fixes git-for-windows#2709 Initial-Path-by: Ivan Pozdeev Signed-off-by: Johannes Schindelin --- compat/mingw.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index 70e5a853932317..c592b9d218a66a 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -3141,6 +3141,18 @@ static size_t append_system_bin_dirs(char *path, size_t size) #endif #endif +static int is_system32_path(const char *path) +{ + WCHAR system32[MAX_PATH], wpath[MAX_PATH]; + + if (xutftowcs_path(wpath, path) < 0 || + !GetSystemDirectoryW(system32, ARRAY_SIZE(system32)) || + _wcsicmp(system32, wpath)) + return 0; + + return 1; +} + static void setup_windows_environment(void) { char *tmp = getenv("TMPDIR"); @@ -3181,7 +3193,8 @@ static void setup_windows_environment(void) strbuf_addstr(&buf, tmp); if ((tmp = getenv("HOMEPATH"))) { strbuf_addstr(&buf, tmp); - if (is_directory(buf.buf)) + if (!is_system32_path(buf.buf) && + is_directory(buf.buf)) setenv("HOME", buf.buf, 1); else tmp = NULL; /* use $USERPROFILE */ From 5fac063d1f021a55696192b2dc49bfb4c8cff9d7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= Date: Sat, 2 Dec 2023 12:10:00 +0100 Subject: [PATCH 106/218] git.rc: include winuser.h MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit winuser.h contains the definition of RT_MANIFEST that our LLVM based toolchain needs to understand that we want to embed compat/win32/git.manifest as an application manifest. It currently just embeds it as additional data that Windows doesn't understand. This also helps our GCC based toolchain understand that we only want one copy embedded. It currently embeds one working assembly manifest and one nearly identical, but useless copy as additional data. This also teaches our Visual Studio based buildsystems to pick up the manifest file from git.rc. 
This means we don't have to explicitly specify it in contrib/buildsystems/Generators/Vcxproj.pm anymore. Slightly counter-intuitively this also means we have to explicitly tell CMake not to embed a default manifest. This fixes https://github.com/git-for-windows/git/issues/4707 Signed-off-by: Matthias Aßhauer Signed-off-by: Johannes Schindelin --- contrib/buildsystems/CMakeLists.txt | 1 + git.rc.in | 1 + 2 files changed, 2 insertions(+) diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index e802c21f180ea5..4fb9473b877026 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -208,6 +208,7 @@ if(CMAKE_C_COMPILER_ID STREQUAL "MSVC") set(CMAKE_RUNTIME_OUTPUT_DIRECTORY_DEBUG ${CMAKE_BINARY_DIR}) set(CMAKE_RUNTIME_OUTPUT_DIRECTORY_RELEASE ${CMAKE_BINARY_DIR}) add_compile_options(/MP /std:c11) + add_link_options(/MANIFEST:NO) endif() #default behaviour diff --git a/git.rc.in b/git.rc.in index e69444eef3f0c5..1d5b627b610549 100644 --- a/git.rc.in +++ b/git.rc.in @@ -1,3 +1,4 @@ +#include <winuser.h> 1 VERSIONINFO FILEVERSION @GIT_MAJOR_VERSION@,@GIT_MINOR_VERSION@,@GIT_MICRO_VERSION@,@GIT_PATCH_LEVEL@ PRODUCTVERSION @GIT_MAJOR_VERSION@,@GIT_MINOR_VERSION@,@GIT_MICRO_VERSION@,@GIT_PATCH_LEVEL@ From 7ec458d1b98b5c45fe5c433ae21bd5a9aed8f788 Mon Sep 17 00:00:00 2001 From: Yuyi Wang Date: Sat, 11 Mar 2023 17:51:18 +0800 Subject: [PATCH 107/218] cmake: install headless-git. headless-git is a Git executable that runs without opening a console window. It is useful when other GUI executables want to call git. We should install it together with git on Windows.
Signed-off-by: Yuyi Wang --- contrib/buildsystems/CMakeLists.txt | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index f3cbf743d20269..e802c21f180ea5 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -738,6 +738,7 @@ if(WIN32) endif() add_executable(headless-git ${CMAKE_SOURCE_DIR}/compat/win32/headless.c) + list(APPEND PROGRAMS_BUILT headless-git) if(CMAKE_C_COMPILER_ID STREQUAL "GNU" OR CMAKE_C_COMPILER_ID STREQUAL "Clang") target_link_options(headless-git PUBLIC -municode -Wl,-subsystem,windows) elseif(CMAKE_C_COMPILER_ID STREQUAL "MSVC") @@ -938,7 +939,7 @@ list(TRANSFORM git_perl_scripts PREPEND "${CMAKE_BINARY_DIR}/") #install foreach(program ${PROGRAMS_BUILT}) -if(program MATCHES "^(git|git-shell|scalar)$") +if(program MATCHES "^(git|git-shell|headless-git|scalar)$") install(TARGETS ${program} RUNTIME DESTINATION bin) else() From c5a212090fd636e2e790f32861dfd743f0bcf807 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 21 Apr 2026 08:58:16 +0000 Subject: [PATCH 108/218] ci: bump microsoft/setup-msbuild from v2 to v3 The v2 of `microsoft/setup-msbuild` runs on Node.js 20, which GitHub is phasing out of the Actions runners. v3 is a minimal release whose only substantive change is moving the action's runtime to Node.js 24, so that our Visual Studio build jobs keep working once Node.js 20 is removed from the runners. The risk of this bump is very low: v3 contains no functional changes to the action itself -- it merely adds `msbuild.exe` to `PATH`, with no change to command-line flags, inputs, outputs, or default tool resolution. The only precondition is a recent-enough Actions Runner, which the github.com-hosted runners already satisfy. 
See also: - Release notes: https://github.com/microsoft/setup-msbuild/releases - Compare: https://github.com/microsoft/setup-msbuild/compare/v2...v3 Originally-authored-by: dependabot[bot] Signed-off-by: Johannes Schindelin --- .github/workflows/main.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 96d19581129ec2..db30c6812b03f0 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -186,7 +186,7 @@ jobs: repository: git/git definitionId: 9 - name: add msbuild to PATH - uses: microsoft/setup-msbuild@v2 + uses: microsoft/setup-msbuild@v3 - name: copy dlls to root shell: cmd run: compat\vcbuild\vcpkg_copy_dlls.bat release From 236cb4d9ff73ca4a29e5d9b46455af405672cca0 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 21 Apr 2026 08:58:57 +0000 Subject: [PATCH 109/218] ci: bump actions/{upload,download}-artifact to v7 and v8 `actions/upload-artifact` and `actions/download-artifact` are tightly coupled: the upload action writes artifact archives in a format that the download action then reads. Because of this coupling, the two actions should always be bumped together so that the artifact format contract between them is satisfied. All of our `actions/upload-artifact` uses are still on v5, with one stray v4 occurrence. Keeping them on these versions would leave the artifact-upload steps running on Node.js 20, which GitHub is phasing out, and would eventually cause all upload steps to fail. Going from v5 directly to v7 folds in two release bumps: - v6 switches the action's default runtime from Node.js 20 to Node.js 24 (v5 had preliminary Node 24 support but still defaulted to Node 20). This is the main motivation for bumping now: it gets us off the deprecated runtime. 
- v7 adds two opt-in features: direct (unzipped) single-file uploads via a new `archive: false` parameter, and an internal conversion of the action to ESM to match the updated `@actions/*` packages.

Risk analysis: we never pass `archive`, so the zip-as-usual behavior is unchanged. We also do not `require('@actions/*')` from any calling workflow, so the ESM migration cannot affect us. The upload steps we care about -- tracked files/build artifacts and failing-test directories -- keep the same inputs (`name`, `path`) and outputs, so the diff is purely the `@vN` identifier. The main precondition is a recent Actions Runner (>= 2.327.1), which the github.com-hosted runners used by our CI already satisfy.

While at it, align the one remaining `@v4` occurrence with the rest so that every `upload-artifact` step uses the same version.

See also:

- Release notes: https://github.com/actions/upload-artifact/releases
- Compare: https://github.com/actions/upload-artifact/compare/v5...v7

We use `actions/download-artifact` to pass build artifacts between the "windows-build" / "vs-build" / "windows-meson-build" jobs and their corresponding test jobs. All callers are currently on v6; bumping to v8 keeps this action in lockstep with the `upload-artifact` bump above.

What v7 and v8 change:

- v7 switches the default runtime from Node.js 20 to Node.js 24 (v6 had preliminary Node 24 support but still defaulted to Node 20). This is the main motivation: it gets us off the deprecated runtime.
- v8 makes three further changes:
  * The package is converted to ESM (invisible to workflow authors).
  * The action now checks the `Content-Type` header before attempting to unzip a download, so that directly-uploaded (unzipped) artifacts from `upload-artifact` v7 are downloaded correctly.
  * The `digest-mismatch` behaviour is changed from warn-and-continue to a hard failure by default.
Risk analysis: defaulting hash-mismatch to a hard failure is strictly safer than the previous warn-and-continue behaviour -- a mismatch points to real corruption or tampering and should stop the run. We download archives that the same workflow just uploaded, on the same runner fleet, so false positives are not expected. Our usage is limited to the `name` and `path` inputs, which are unchanged between v6 and v8, so the diff is purely the `@vN` identifier. See also: - Release notes: https://github.com/actions/download-artifact/releases - Compare: https://github.com/actions/download-artifact/compare/v6...v8 Originally-authored-by: dependabot[bot] Signed-off-by: Johannes Schindelin --- .github/workflows/main.yml | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index db30c6812b03f0..5994f40cc0f71d 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -123,7 +123,7 @@ jobs: - name: zip up tracked files run: git archive -o artifacts/tracked.tar.gz HEAD - name: upload tracked files and build artifacts - uses: actions/upload-artifact@v5 + uses: actions/upload-artifact@v7 with: name: windows-artifacts path: artifacts @@ -140,7 +140,7 @@ jobs: cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - name: download tracked files and build artifacts - uses: actions/download-artifact@v6 + uses: actions/download-artifact@v8 with: name: windows-artifacts path: ${{github.workspace}} @@ -157,7 +157,7 @@ jobs: run: ci/print-test-failures.sh - name: Upload failed tests' directories if: failure() && env.FAILED_TEST_ARTIFACTS != '' - uses: actions/upload-artifact@v5 + uses: actions/upload-artifact@v7 with: name: failed-tests-windows-${{ matrix.nr }} path: ${{env.FAILED_TEST_ARTIFACTS}} @@ -208,7 +208,7 @@ jobs: - name: zip up tracked files run: git archive -o artifacts/tracked.tar.gz HEAD - name: upload tracked files and build artifacts - 
uses: actions/upload-artifact@v5 + uses: actions/upload-artifact@v7 with: name: vs-artifacts path: artifacts @@ -226,7 +226,7 @@ jobs: steps: - uses: git-for-windows/setup-git-for-windows-sdk@v1 - name: download tracked files and build artifacts - uses: actions/download-artifact@v6 + uses: actions/download-artifact@v8 with: name: vs-artifacts path: ${{github.workspace}} @@ -244,7 +244,7 @@ jobs: run: ci/print-test-failures.sh - name: Upload failed tests' directories if: failure() && env.FAILED_TEST_ARTIFACTS != '' - uses: actions/upload-artifact@v5 + uses: actions/upload-artifact@v7 with: name: failed-tests-windows-vs-${{ matrix.nr }} path: ${{env.FAILED_TEST_ARTIFACTS}} @@ -270,7 +270,7 @@ jobs: shell: pwsh run: meson compile -C build - name: Upload build artifacts - uses: actions/upload-artifact@v5 + uses: actions/upload-artifact@v7 with: name: windows-meson-artifacts path: build @@ -292,7 +292,7 @@ jobs: shell: pwsh run: pip install meson ninja - name: Download build artifacts - uses: actions/download-artifact@v6 + uses: actions/download-artifact@v8 with: name: windows-meson-artifacts path: build @@ -305,7 +305,7 @@ jobs: run: ci/print-test-failures.sh - name: Upload failed tests' directories if: failure() && env.FAILED_TEST_ARTIFACTS != '' - uses: actions/upload-artifact@v4 + uses: actions/upload-artifact@v7 with: name: failed-tests-windows-meson-${{ matrix.nr }} path: ${{env.FAILED_TEST_ARTIFACTS}} @@ -349,7 +349,7 @@ jobs: run: ci/print-test-failures.sh - name: Upload failed tests' directories if: failure() && env.FAILED_TEST_ARTIFACTS != '' - uses: actions/upload-artifact@v5 + uses: actions/upload-artifact@v7 with: name: failed-tests-${{matrix.vector.jobname}} path: ${{env.FAILED_TEST_ARTIFACTS}} @@ -451,7 +451,7 @@ jobs: run: sudo --preserve-env --set-home --user=builder ci/print-test-failures.sh - name: Upload failed tests' directories if: failure() && env.FAILED_TEST_ARTIFACTS != '' - uses: actions/upload-artifact@v5 + uses: actions/upload-artifact@v7 
with: name: failed-tests-${{matrix.vector.jobname}} path: ${{env.FAILED_TEST_ARTIFACTS}} From c202109d3ed778422084ad5343fa62bf3186773b Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 21 Apr 2026 08:59:35 +0000 Subject: [PATCH 110/218] ci: bump actions/github-script from v8 to v9 The only use we have of `actions/github-script` is the "skip if the commit or tree was already tested" step in `main.yml`, which checks whether an identical tree-SHA was already built successfully. It currently pins v8; v9 is the latest release. What v9 changes: - The `ACTIONS_ORCHESTRATION_ID` environment variable is now appended to the HTTP user-agent string. This is transparent to our script. - A new injected `getOctokit` factory lets scripts create additional authenticated clients in the same step without importing `@actions/github`. We do not use it. - Two breaking changes affect scripts that either call `require('@actions/github')` (fails at runtime, because `@actions/github` v9 is now ESM-only) or that shadow the implicit `getOctokit` parameter via `const`/`let` (syntax error). Our script does neither -- it only uses the pre-supplied `github` REST client and `core` helpers -- so the upgrade is safe. Risk analysis: the step is advisory. It sets `enabled=' but skip'` as an optimization to avoid re-running CI on a tree that was already tested successfully. Even if the v9 upgrade broke the script, the surrounding `try { ... } catch (e) { core.warning(e); }` block would degrade it to a warning and CI would still run normally. In practice the script continues to work identically on v9. 
See also: - Release notes: https://github.com/actions/github-script/releases - Compare: https://github.com/actions/github-script/compare/v8...v9 Originally-authored-by: dependabot[bot] Signed-off-by: Johannes Schindelin --- .github/workflows/main.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 5994f40cc0f71d..1269b4d41c63b4 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -63,7 +63,7 @@ jobs: echo "skip_concurrent=$skip_concurrent" >>$GITHUB_OUTPUT - name: skip if the commit or tree was already tested id: skip-if-redundant - uses: actions/github-script@v8 + uses: actions/github-script@v9 if: steps.check-ref.outputs.enabled == 'yes' with: github-token: ${{secrets.GITHUB_TOKEN}} From 275f23cc93d8354ab85ab52a755a538085bd4306 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 21 Apr 2026 09:00:35 +0000 Subject: [PATCH 111/218] ci: bump actions/checkout from v5 to v6 Every workflow currently pins `actions/checkout` to v5, which was introduced primarily to move to the Node.js 24 runtime. v6 is the next release and worth picking up so we stay on a maintained version of the action. The one behaviorally interesting change in v6: `persist-credentials` now stores the helper credentials under `$RUNNER_TEMP` instead of writing them directly into the local `.git/config`. Two implications follow: 1. In the normal case this is an unambiguous improvement -- the token no longer lands in `.git/config`, reducing the risk of inadvertently leaking it through workspace archiving (`upload-artifact` snapshots, cache entries, core dumps, ...). 2. Docker container actions require an Actions Runner of at least v2.329.0 to find the credentials in their new location. The github.com-hosted runners our CI uses are already past that version, so this does not affect us. 
Downstream users running self-hosted runners may need to update them before adopting this version of the action. Risk analysis: our checkout steps either check out the default repository (no special credential requirements) or, in the `vs-build` job, explicitly set `repository: microsoft/vcpkg` and `path: compat/vcbuild/vcpkg`. Neither case relies on the precise location of the persisted credentials -- subsequent steps interact with the API via the runner-provided `GITHUB_TOKEN` directly -- so the v6 credential-storage change is transparent to our workflows. The diff is purely the `@vN` identifier; there are no input or output changes. See also: - Release notes: https://github.com/actions/checkout/releases - Changelog: https://github.com/actions/checkout/blob/main/CHANGELOG.md - Compare: https://github.com/actions/checkout/compare/v5...v6 Originally-authored-by: dependabot[bot] Signed-off-by: Johannes Schindelin --- .github/workflows/check-style.yml | 2 +- .github/workflows/check-whitespace.yml | 2 +- .github/workflows/coverity.yml | 2 +- .github/workflows/main.yml | 24 ++++++++++++------------ 4 files changed, 15 insertions(+), 15 deletions(-) diff --git a/.github/workflows/check-style.yml b/.github/workflows/check-style.yml index 19a145d4ad0c5a..108a2de903310c 100644 --- a/.github/workflows/check-style.yml +++ b/.github/workflows/check-style.yml @@ -20,7 +20,7 @@ jobs: jobname: ClangFormat runs-on: ubuntu-latest steps: - - uses: actions/checkout@v5 + - uses: actions/checkout@v6 with: fetch-depth: 0 diff --git a/.github/workflows/check-whitespace.yml b/.github/workflows/check-whitespace.yml index 928fd4cfe2456d..ea6f49f742108e 100644 --- a/.github/workflows/check-whitespace.yml +++ b/.github/workflows/check-whitespace.yml @@ -19,7 +19,7 @@ jobs: check-whitespace: runs-on: ubuntu-latest steps: - - uses: actions/checkout@v5 + - uses: actions/checkout@v6 with: fetch-depth: 0 diff --git a/.github/workflows/coverity.yml b/.github/workflows/coverity.yml index 
3435baeca29a55..89bef267275aee 100644 --- a/.github/workflows/coverity.yml +++ b/.github/workflows/coverity.yml @@ -38,7 +38,7 @@ jobs: COVERITY_LANGUAGE: cxx COVERITY_PLATFORM: overridden-below steps: - - uses: actions/checkout@v5 + - uses: actions/checkout@v6 - name: install minimal Git for Windows SDK if: contains(matrix.os, 'windows') uses: git-for-windows/setup-git-for-windows-sdk@v1 diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 1269b4d41c63b4..2decc1143a2df6 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -112,7 +112,7 @@ jobs: group: windows-build-${{ github.ref }} cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - - uses: actions/checkout@v5 + - uses: actions/checkout@v6 - uses: git-for-windows/setup-git-for-windows-sdk@v1 - name: build shell: bash @@ -173,10 +173,10 @@ jobs: group: vs-build-${{ github.ref }} cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - - uses: actions/checkout@v5 + - uses: actions/checkout@v6 - uses: git-for-windows/setup-git-for-windows-sdk@v1 - name: initialize vcpkg - uses: actions/checkout@v5 + uses: actions/checkout@v6 with: repository: 'microsoft/vcpkg' path: 'compat/vcbuild/vcpkg' @@ -258,7 +258,7 @@ jobs: group: windows-meson-build-${{ github.ref }} cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - - uses: actions/checkout@v5 + - uses: actions/checkout@v6 - uses: actions/setup-python@v6 - name: Set up dependencies shell: pwsh @@ -286,7 +286,7 @@ jobs: group: windows-meson-test-${{ matrix.nr }}-${{ github.ref }} cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - - uses: actions/checkout@v5 + - uses: actions/checkout@v6 - uses: actions/setup-python@v6 - name: Set up dependencies shell: pwsh @@ -341,7 +341,7 @@ jobs: TEST_OUTPUT_DIRECTORY: ${{github.workspace}}/t runs-on: ${{matrix.vector.pool}} steps: - - uses: actions/checkout@v5 + - 
uses: actions/checkout@v6 - run: ci/install-dependencies.sh - run: ci/run-build-and-tests.sh - name: print test failures @@ -362,7 +362,7 @@ jobs: CI_JOB_IMAGE: ubuntu-latest runs-on: ubuntu-latest steps: - - uses: actions/checkout@v5 + - uses: actions/checkout@v6 - run: ci/install-dependencies.sh - run: ci/run-build-and-minimal-fuzzers.sh dockerized: @@ -441,7 +441,7 @@ jobs: else apt-get -q update && apt-get -q -y install git fi - - uses: actions/checkout@v5 + - uses: actions/checkout@v6 - run: ci/install-dependencies.sh - run: useradd builder --create-home - run: chown -R builder . @@ -466,7 +466,7 @@ jobs: group: static-analysis-${{ github.ref }} cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - - uses: actions/checkout@v5 + - uses: actions/checkout@v6 - run: ci/install-dependencies.sh - run: ci/run-static-analysis.sh - run: ci/check-directional-formatting.bash @@ -482,7 +482,7 @@ jobs: group: rust-analysis-${{ github.ref }} cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - - uses: actions/checkout@v5 + - uses: actions/checkout@v6 - run: ci/install-dependencies.sh - run: ci/run-rust-checks.sh sparse: @@ -496,7 +496,7 @@ jobs: group: sparse-${{ github.ref }} cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - - uses: actions/checkout@v5 + - uses: actions/checkout@v6 - name: Install other dependencies run: ci/install-dependencies.sh - run: make sparse @@ -512,6 +512,6 @@ jobs: CI_JOB_IMAGE: ubuntu-latest runs-on: ubuntu-latest steps: - - uses: actions/checkout@v5 + - uses: actions/checkout@v6 - run: ci/install-dependencies.sh - run: ci/test-documentation.sh From 8bcfbb95adcaca853b7eb623484b90f927deec9c Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 28 Apr 2026 01:55:49 +0200 Subject: [PATCH 112/218] mingw: optionally use legacy (non-POSIX) delete semantics At some point between Windows 10 Build 17134.1304 and Build 18363.657, the default behavior 
of `DeleteFileW()` was changed to use POSIX semantics (https://stackoverflow.com/a/60512798). Under those semantics, a file can be deleted even when another process holds an active `MapViewOfFile` view on it: the directory entry is removed immediately, but the underlying data persists until the last handle is closed. On older Windows versions (and Windows 10 builds before that change), `DeleteFileW()` uses legacy semantics where deletion fails outright if any process holds a file mapping. To allow testing code paths that depend on the legacy behavior, introduce a `GIT_TEST_LEGACY_DELETE` environment variable. When set, `mingw_unlink()` uses `SetFileInformationByHandle()` with `FileDispositionInfo` (the non-POSIX variant) instead of `DeleteFileW()`, forcing legacy delete semantics regardless of the Windows version. Signed-off-by: Johannes Schindelin --- compat/mingw.c | 47 +++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 45 insertions(+), 2 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index feefa2cd0eb12a..035914566513b4 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -472,20 +472,63 @@ static wchar_t *normalize_ntpath(wchar_t *wbuf) return wbuf; } +/* + * Use SetFileInformationByHandle(FileDispositionInfo) to force legacy + * (non-POSIX) delete semantics. On Windows 11, DeleteFileW() uses POSIX + * delete semantics internally, allowing deletion even with active + * MapViewOfFile views. This helper simulates Windows 10 behavior where + * deletion fails if a file mapping exists. + * + * Returns nonzero on success (like DeleteFileW), 0 on failure. 
+ */ +static int legacy_delete_file(const wchar_t *wpathname) +{ + FILE_DISPOSITION_INFO fdi = { TRUE }; + DWORD gle; + HANDLE h = CreateFileW(wpathname, DELETE, + FILE_SHARE_READ | FILE_SHARE_WRITE | + FILE_SHARE_DELETE, + NULL, OPEN_EXISTING, + FILE_FLAG_OPEN_REPARSE_POINT, NULL); + if (h == INVALID_HANDLE_VALUE) + return 0; + + if (SetFileInformationByHandle(h, FileDispositionInfo, + &fdi, sizeof(fdi))) { + CloseHandle(h); + return 1; + } + gle = GetLastError(); + CloseHandle(h); + SetLastError(gle); + return 0; +} + +static int try_delete_file(const wchar_t *wpathname, int use_legacy) +{ + if (use_legacy) + return legacy_delete_file(wpathname); + return DeleteFileW(wpathname); +} + int mingw_unlink(const char *pathname, int handle_in_use_error) { + static int use_legacy_delete = -1; int tries = 0; wchar_t wpathname[MAX_PATH]; if (xutftowcs_path(wpathname, pathname) < 0) return -1; - if (DeleteFileW(wpathname)) + if (use_legacy_delete < 0) + use_legacy_delete = !!getenv("GIT_TEST_LEGACY_DELETE"); + + if (try_delete_file(wpathname, use_legacy_delete)) return 0; do { /* read-only files cannot be removed */ _wchmod(wpathname, 0666); - if (!_wunlink(wpathname)) + if (try_delete_file(wpathname, use_legacy_delete)) return 0; if (!is_file_in_use_error(GetLastError())) break; From 06f0ce46e888f80db98a3226d9c01f607f306e63 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 29 Apr 2026 15:44:06 +0200 Subject: [PATCH 113/218] ci: bump git-for-windows/setup-git-for-windows-sdk from v1 to v2 The v1 of `git-for-windows/setup-git-for-windows-sdk` runs on Node.js 20, which GitHub is phasing out of the Actions runners. v2 moves the action to Node.js 24 so that the CI jobs relying on a Git for Windows SDK keep working once Node.js 20 is removed. The risk is very low: v2 contains no functional changes to the SDK setup itself, only the runtime upgrade. The action still provisions the same minimal SDK and exposes the same outputs. 
The sole precondition is a recent Actions Runner (>= 2.327.1), which the github.com-hosted runners already satisfy. Signed-off-by: Johannes Schindelin --- .github/workflows/coverity.yml | 2 +- .github/workflows/main.yml | 8 ++++---- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/.github/workflows/coverity.yml b/.github/workflows/coverity.yml index 89bef267275aee..58a78f1eb3f836 100644 --- a/.github/workflows/coverity.yml +++ b/.github/workflows/coverity.yml @@ -41,7 +41,7 @@ jobs: - uses: actions/checkout@v6 - name: install minimal Git for Windows SDK if: contains(matrix.os, 'windows') - uses: git-for-windows/setup-git-for-windows-sdk@v1 + uses: git-for-windows/setup-git-for-windows-sdk@v2 - run: ci/install-dependencies.sh if: contains(matrix.os, 'ubuntu') || contains(matrix.os, 'macos') env: diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 2decc1143a2df6..9dbdd29ca5aeb8 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -113,7 +113,7 @@ jobs: cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - uses: actions/checkout@v6 - - uses: git-for-windows/setup-git-for-windows-sdk@v1 + - uses: git-for-windows/setup-git-for-windows-sdk@v2 - name: build shell: bash env: @@ -147,7 +147,7 @@ jobs: - name: extract tracked files and build artifacts shell: bash run: tar xf artifacts.tar.gz && tar xf tracked.tar.gz - - uses: git-for-windows/setup-git-for-windows-sdk@v1 + - uses: git-for-windows/setup-git-for-windows-sdk@v2 - name: test shell: bash run: . 
/etc/profile && ci/run-test-slice.sh $((${{matrix.nr}} + 1)) 10 @@ -174,7 +174,7 @@ jobs: cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - uses: actions/checkout@v6 - - uses: git-for-windows/setup-git-for-windows-sdk@v1 + - uses: git-for-windows/setup-git-for-windows-sdk@v2 - name: initialize vcpkg uses: actions/checkout@v6 with: @@ -224,7 +224,7 @@ jobs: group: vs-test-${{ matrix.nr }}-${{ github.ref }} cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - - uses: git-for-windows/setup-git-for-windows-sdk@v1 + - uses: git-for-windows/setup-git-for-windows-sdk@v2 - name: download tracked files and build artifacts uses: actions/download-artifact@v8 with: From eb3911ba5e9f93cf71db68b21f9028e67b340640 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= Date: Sat, 21 Feb 2026 12:11:02 +0100 Subject: [PATCH 114/218] win32: thread-utils: handle multi-socket systems MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit While the currently used way to detect the number of CPU cores on Windows is nice and straight-forward, GetSystemInfo() only gives us access to the number of processors within the current group. [1] While that is usually fine for systems with a single physical CPU, separate physical sockets are typically separate groups. Switch to using GetLogicalProcessorInformationEx() to handle multi-socket systems better. 
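As a rough illustration of why a per-group enumeration matters: the total is the sum of the set bits in each core's affinity mask, across all groups. The POSIX shell sketch below only models that arithmetic. `count_bits`, `count_logical_cpus` and the sample mask values are hypothetical; the actual change is written in C against the Windows API.

```shell
# Count the set bits in one affinity mask (a non-negative integer).
count_bits () {
	mask=$(($1))
	bits=0
	while [ "$mask" -ne 0 ]
	do
		bits=$((bits + (mask & 1)))
		mask=$((mask >> 1))
	done
	echo "$bits"
}

# Sum the bit counts of all masks given as arguments, one mask per core.
count_logical_cpus () {
	total=0
	for m in "$@"
	do
		total=$((total + $(count_bits "$m")))
	done
	echo "$total"
}
```

For example, `count_logical_cpus 3 3 3 3` models four cores with two hardware threads each and prints `8`.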
[1] https://learn.microsoft.com/en-us/windows/win32/api/sysinfoapi/ns-sysinfoapi-system_info#members This fixes https://github.com/git-for-windows/git/issues/4766 Co-Authored-by: Herman Semenov Signed-off-by: Matthias Aßhauer --- thread-utils.c | 27 ++++++++++++++++++++++----- 1 file changed, 22 insertions(+), 5 deletions(-) diff --git a/thread-utils.c b/thread-utils.c index 374890e6b05b69..00e7e9192b3e0b 100644 --- a/thread-utils.c +++ b/thread-utils.c @@ -28,11 +28,28 @@ int online_cpus(void) #endif #ifdef GIT_WINDOWS_NATIVE - SYSTEM_INFO info; - GetSystemInfo(&info); - - if ((int)info.dwNumberOfProcessors > 0) - return (int)info.dwNumberOfProcessors; + DWORD len = 0; + if (!GetLogicalProcessorInformationEx(RelationProcessorCore, NULL, &len) && GetLastError() == ERROR_INSUFFICIENT_BUFFER) { + uint8_t *buf = malloc(len); + if (buf) { + if (GetLogicalProcessorInformationEx(RelationProcessorCore, (PSYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX) buf, &len)) { + DWORD offset = 0; + int n_cores = 0; + while (offset < len) { + PSYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX info = (PSYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX) (buf + offset); + offset += info->Size; + /* The threads within a core always share a single group. We need to count the bits in the mask to get a thread count. */ + for (KAFFINITY mask = info->Processor.GroupMask[0].Mask; mask; mask >>= 1) + n_cores += mask &1; + } + if (n_cores) { + free(buf); + return n_cores; + } + } + free(buf); + } + } #elif defined(hpux) || defined(__hpux) || defined(_hpux) struct pst_dynamic psd; From bb58bb522c2c1fcba8df129a0214b5c60bd47824 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 28 Apr 2026 00:10:41 +0200 Subject: [PATCH 115/218] maintenance(geometric): do release the `.idx` files before repacking As is done for all the other maintenance tasks, let's release the ODB also before starting the geometric repacking. 
That way, the `.idx` files won't be `mmap()`ed when they are to be deleted (which does not work on Windows because you cannot delete files on that platform as long as they are kept open by a process). This regression was introduced by 9bc151850c1c (builtin/maintenance: introduce "geometric-repack" task, 2025-10-24), but was only noticed once geometric repacking was made the default in 452b12c2e0fe (builtin/ maintenance: use "geometric" strategy by default, 2026-02-24). The fix recapitulates my work from df76ee7b77f0 (run-command: offer to close the object store before running, 2021-09-09) & friends. To guard against future regressions of this kind, add a check to `run_and_verify_geometric_pack()` in `t7900` that detects orphaned `.idx` files left behind after repacking. Contrary to interactive calls, the `git maintenance` call in that test case would _not_ block on Windows, asking whether to retry deleting that file, which is the reason why this bug was not caught earlier. Furthermore, since the default behavior of `DeleteFileW()` was changed at some point between Windows 10 Build 17134.1304 and Build 18363.657 to use POSIX semantics (see https://stackoverflow.com/a/60512798), the added orphaned-`.idx` check would be insufficient to catch this regression on modern Windows without emulating legacy delete semantics via `GIT_TEST_LEGACY_DELETE=1`. This fixes https://github.com/git-for-windows/git/issues/6210. 
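The orphaned-`.idx` check added to `t7900` can be tried in isolation. The sketch below wraps the same `ls | sed | sort | uniq -u` pipeline in a hypothetical helper, `find_orphaned_idx`, and assumes it is pointed at a directory laid out like `.git/objects/pack`. The helper name is illustrative and not part of the patch.

```shell
# Print the .idx name of every pack file whose counterpart is missing.
# Once ".pack" is rewritten to ".idx", each complete pair shows up twice,
# so "uniq -u" keeps only the names that occur a single time.
find_orphaned_idx () {
	(
		cd "$1" &&
		ls pack-*.idx pack-*.pack 2>/dev/null |
		sed 's/\.pack$/.idx/' |
		sort |
		uniq -u
	)
}
```

In a scratch directory containing `pack-a.pack`, `pack-a.idx` and `pack-b.idx`, the helper prints `pack-b.idx`; with only complete pairs it prints nothing, which is what `test_must_be_empty` expects.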
Signed-off-by: Johannes Schindelin --- builtin/gc.c | 1 + t/t7900-maintenance.sh | 22 +++++++++++++++++++++- 2 files changed, 22 insertions(+), 1 deletion(-) diff --git a/builtin/gc.c b/builtin/gc.c index 3a71e314c975af..84a66d32404e4d 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -1590,6 +1590,7 @@ static int maintenance_task_geometric_repack(struct maintenance_run_opts *opts, pack_geometry_split(&geometry); child.git_cmd = 1; + child.odb_to_close = the_repository->objects; strvec_pushl(&child.args, "repack", "-d", "-l", NULL); if (geometry.split < geometry.pack_nr) diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 4700beacc18281..f497f51b2348c8 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -532,7 +532,16 @@ run_and_verify_geometric_pack () { # And verify that there are no loose objects anymore. git count-objects -v >count && - test_grep '^count: 0$' count + test_grep '^count: 0$' count && + + # Verify that no orphaned .idx files were left behind. On + # Windows, a missing odb_to_close causes the parent to hold + # mmap handles on .idx files, silently preventing their + # deletion by the child git-repack process. + ls .git/objects/pack/pack-*.idx .git/objects/pack/pack-*.pack | + sed "s/\.pack$/.idx/" | + sort | uniq -u >orphaned-idx && + test_must_be_empty orphaned-idx } test_expect_success 'geometric repacking task' ' @@ -580,8 +589,19 @@ test_expect_success 'geometric repacking task' ' # And these two small packs should now be merged via the # geometric repack. The large packfile should remain intact. + cp -R .git/objects .git/objects.save && run_and_verify_geometric_pack 2 && + # On Windows, verify the same with legacy delete semantics + # that reject deletion of mmap-held .idx files. 
+ if test_have_prereq MINGW + then + rm -rf .git/objects && + mv .git/objects.save .git/objects && + test_env GIT_TEST_LEGACY_DELETE=1 \ + run_and_verify_geometric_pack 2 + fi && + # If we now add two more objects and repack twice we should # then see another all-into-one repack. This time around # though, as we have unreachable objects, we should also see a From 8ed8f033c36a6ec72759ae3f2d43949175075045 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sat, 25 Apr 2026 12:21:45 +0200 Subject: [PATCH 116/218] l10n: bump mshick/add-pr-comment from v2 to v3 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The l10n workflow uses `mshick/add-pr-comment` to post git-po-helper reports as comments on translation pull requests. It was still pinned to v2, which runs on Node.js 20. GitHub is phasing out the Node.js 20 runtime on Actions runners, so staying on v2 will eventually cause the "Create comment in pull request for report" step to fail. The sole breaking change in v3 is the switch from Node.js 20 to Node.js 24 (https://github.com/mshick/add-pr-comment/releases/tag/v3.0.0). The action's inputs and outputs are unchanged, so the upgrade is a drop-in replacement. Subsequent v3.x releases added new opt-in features (message truncation, retry with exponential backoff, file attachments, commit comment support, "delete on status") but none of them affect existing callers that do not opt in. 
See also: - Changelog: https://github.com/mshick/add-pr-comment/blob/main/CHANGELOG.md - Compare: https://github.com/mshick/add-pr-comment/compare/v2...v3 Pointed-out-by: Christoph Grüninger Assisted-by: Claude Opus 4.6 Signed-off-by: Johannes Schindelin --- .github/workflows/l10n.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/l10n.yml b/.github/workflows/l10n.yml index 95e55134bdbed4..114a12a9e59f60 100644 --- a/.github/workflows/l10n.yml +++ b/.github/workflows/l10n.yml @@ -92,7 +92,7 @@ jobs: cat git-po-helper.out exit $exit_code - name: Create comment in pull request for report - uses: mshick/add-pr-comment@v2 + uses: mshick/add-pr-comment@v3 if: >- always() && github.event_name == 'pull_request_target' && From c6daaaddfa5bf314cc4361b9dbad4643a9a239f7 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Tue, 5 May 2026 14:26:03 +0200 Subject: [PATCH 117/218] build: tolerate use of _Generic from glibc 2.43 with Clang When building with `make DEVELOPER=1` we explicitly pass "-std=gnu99" to the compiler so that we don't start leaning on features exposed by more recent versions of the C standard. Unfortunately though, glibc 2.43 started to use type-generic expressions. This works alright with GCC, but when compiling with Clang this leads to errors: $ make DEVELOPER=1 CC=clang CC daemon.o In file included from daemon.c:3: ./git-compat-util.h:344:11: error: '_Generic' is a C11 extension [-Werror,-Wc11-extensions] 344 | return !!strchr(path, '/'); | ^ /usr/include/string.h:265:3: note: expanded from macro 'strchr' 265 | __glibc_const_generic (S, const char *, strchr (S, C)) | ^ /usr/include/x86_64-linux-gnu/sys/cdefs.h:838:3: note: expanded from macro '__glibc_const_generic' 838 | _Generic (0 ? 
(PTR) : (void *) 1, \ | ^ In theory, the `__glibc_const_generic` macro does have feature gating: #if !defined __cplusplus \ && (__GNUC_PREREQ (4, 9) \ || __glibc_has_extension (c_generic_selections) \ || (!defined __GNUC__ && defined __STDC_VERSION__ \ && __STDC_VERSION__ >= 201112L)) # define __HAVE_GENERIC_SELECTION 1 #else # define __HAVE_GENERIC_SELECTION 0 #endif But this feature gating isn't effective because `__has_extension()` will always evaluate to true as C generics _are_ available as a language extension to GNU C99 when using Clang. This would have been different if `__has_feature()` was used instead, in which case it would have properly evaluated to `false`. GCC has a workaround to squelch this warning from standard system headers, but Clang lacks the corresponding workaround and therefore fails due to [-Werror,-Wc11-extensions]. For both meson and Makefile, pass -Wno-c11-extensions when we are building with clang. Signed-off-by: Patrick Steinhardt Helped-by: Shardul Natu [jc: replaced Makefile side with Shardul's approach] Signed-off-by: Junio C Hamano Assisted-by: Opus 4.7 Signed-off-by: Johannes Schindelin --- config.mak.dev | 7 +++++++ meson.build | 6 ++++++ 2 files changed, 13 insertions(+) diff --git a/config.mak.dev b/config.mak.dev index c8dcf78779e60b..eecb12c1111ff4 100644 --- a/config.mak.dev +++ b/config.mak.dev @@ -98,6 +98,13 @@ endif endif endif +# glibc 2.43 headers unconditionally use _Generic even when we ask the +# compiler to stick to -std=gnu99 and unlike GCC, clang lacks a +# workaround to squelch warnings from system headers.
+ifneq ($(filter clang1,$(COMPILER_FEATURES)),) # if we are using clang +DEVELOPER_CFLAGS += -Wno-c11-extensions +endif + # https://bugzilla.redhat.com/show_bug.cgi?id=2075786 ifneq ($(filter gcc12,$(COMPILER_FEATURES)),) DEVELOPER_CFLAGS += -Wno-error=stringop-overread diff --git a/meson.build b/meson.build index 11488623bfd8f8..2997d4f90faaf2 100644 --- a/meson.build +++ b/meson.build @@ -866,6 +866,12 @@ if get_option('warning_level') in ['2','3', 'everything'] and compiler.get_argum libgit_c_args += cflag endif endforeach + + # Clang generates warnings when compiling glibc 2.43 because of the use of + # _Generic. + if compiler.get_id() == 'clang' + libgit_c_args += '-Wno-c11-extensions' + endif endif if get_option('breaking_changes') From 9536b137f9718fdae6e3f3d6f6248f48adc06a36 Mon Sep 17 00:00:00 2001 From: Tyrie Vella Date: Mon, 18 May 2026 09:50:39 -0700 Subject: [PATCH 118/218] entry: flush fscache after creating directories and writing files When checkout.workers > 1 and core.fscache is enabled on Windows, 'git checkout -- ' fails when restoring files into directories that do not yet exist on disk. Two failure modes occur: 1. create_directories(): the fscache returns a stale directory listing that does not include a just-created directory. has_dirs_only_path() reports it as non-existent, triggering the unlink+mkdir recovery path which fails with 'cannot create directory: Directory not empty'. 2. write_pc_item(): after writing and closing a file, lstat() cannot see it through the stale fscache, failing with 'unable to stat just-written file'. With workers=1, write_entry() calls flush_fscache() after each file, keeping the cache in sync. With workers>1, enqueue_checkout() defers the write (and the flush), leaving the cache stale for subsequent entries. Fix both by adding flush_fscache() calls after mkdir() in create_directories() and before lstat() in write_pc_item(). On non-Windows platforms flush_fscache() is a no-op. 
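The stale-listing problem can be modelled with a toy cache in POSIX shell. This is only a sketch of the failure mode: `cached_exists`, `flush_cache` and the two variables are hypothetical names and do not correspond to Git's fscache implementation.

```shell
# Toy model of a cached directory listing; none of these names exist in Git.
listing_cache=
listing_cached=no

# Answer "does NAME exist in DIR?" from the cache, filling it on first use.
cached_exists () {
	if [ "$listing_cached" = no ]
	then
		listing_cache=$(ls -A "$1")
		listing_cached=yes
	fi
	printf '%s\n' "$listing_cache" | grep -qxF -- "$2"
}

# Drop the cached listing so that the next lookup re-reads the directory.
flush_cache () {
	listing_cached=no
}
```

If `cached_exists "$dir" sub` is called once before `mkdir "$dir/sub"`, it keeps reporting failure afterwards, because the listing was captured before the directory existed. Only after `flush_cache` does the lookup succeed, which mirrors why the flush has to follow the `mkdir()` and precede the `lstat()`.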
Assisted-by: Claude Opus 4.6 Signed-off-by: Tyrie Vella --- entry.c | 15 +++++++++- parallel-checkout.c | 7 +++++ t/t2080-parallel-checkout-basics.sh | 46 +++++++++++++++++++++++++++++ 3 files changed, 67 insertions(+), 1 deletion(-) diff --git a/entry.c b/entry.c index 7817aee362ed9e..91a573ed1c0ad5 100644 --- a/entry.c +++ b/entry.c @@ -49,10 +49,23 @@ static void create_directories(const char *path, int path_len, */ if (mkdir(buf, 0777)) { if (errno == EEXIST && state->force && - !unlink_or_warn(buf) && !mkdir(buf, 0777)) + !unlink_or_warn(buf) && !mkdir(buf, 0777)) { + flush_fscache(); continue; + } die_errno("cannot create directory at '%s'", buf); } + + /* + * Flush the lstat cache of directory listings so that + * subsequent has_dirs_only_path() calls see the + * just-created directory. Without this, the Windows + * fscache returns stale ENOENT for the new directory, + * causing the next entry sharing this parent to + * incorrectly hit the mkdir/unlink recovery path + * above, which then fails with "Directory not empty". + */ + flush_fscache(); } free(buf); } diff --git a/parallel-checkout.c b/parallel-checkout.c index 0bf4bd6d4abd8c..a6d07dcb1805b9 100644 --- a/parallel-checkout.c +++ b/parallel-checkout.c @@ -395,6 +395,13 @@ void write_pc_item(struct parallel_checkout_item *pc_item, goto out; } + /* + * Flush the Windows fscache so that the lstat() below sees the + * file we just wrote. Without this, the cached parent directory + * listing may not yet include the new file entry. + */ + flush_fscache(); + if (state->refresh_cache && !fstat_done && lstat(path.buf, &pc_item->st) < 0) { error_errno("unable to stat just-written file '%s'", path.buf); pc_item->status = PC_ITEM_FAILED; diff --git a/t/t2080-parallel-checkout-basics.sh b/t/t2080-parallel-checkout-basics.sh index 5ffe1a41e2cd72..7ad96cd5cd24a3 100755 --- a/t/t2080-parallel-checkout-basics.sh +++ b/t/t2080-parallel-checkout-basics.sh @@ -274,4 +274,50 @@ test_expect_success '"git checkout ." 
report should not include failed entries' ) ' +# Regression test: parallel checkout + fscache stale directory listing. +# +# When checkout.workers > 1, checkout_entry_ca() enqueues files for deferred +# writing instead of writing them inline. The inline write_entry() path calls +# flush_fscache() after each file, keeping the Windows fscache in sync with +# newly-created directories. The deferred path skips this flush, so +# has_dirs_only_path() sees stale ENOENT for directories that mkdir() just +# created. The recovery path in create_directories() then tries to unlink+ +# recreate the directory, which fails because it already has children. +# +# The trigger is: two files sharing a parent directory that does not yet exist +# on disk when `git checkout -- ` runs. +test_expect_success MINGW 'parallel checkout with fscache does not fail on new directories' ' + git init fscache-pc && + ( + cd fscache-pc && + git config core.fscache true && + + # Commit B1: files in a nested directory + mkdir -p sub/deep/dir && + echo one >sub/deep/dir/file1.txt && + echo two >sub/deep/dir/file2.txt && + git add sub && + git commit -m "B1: with sub/deep/dir" && + git tag B1 && + + # Commit B2: the directory is gone + git rm -rf sub && + git commit -m "B2: without sub" && + + # Now restore both files from B1 with parallel checkout. + # This is the pathspec checkout path (checkout_worktree in + # builtin/checkout.c), which defers writes via enqueue_checkout + # when workers > 1 and does not flush fscache between entries. 
+ git -c checkout.workers=2 \ + -c checkout.thresholdForParallelism=0 \ + checkout B1 -- sub/deep/dir/file1.txt sub/deep/dir/file2.txt && + + # Verify both files are correctly restored + echo one >expect1 && + echo two >expect2 && + test_cmp expect1 sub/deep/dir/file1.txt && + test_cmp expect2 sub/deep/dir/file2.txt + ) +' + test_done From e1756e3becc47546330bac697654a26a3aafac34 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 10 Jun 2026 17:09:01 +0200 Subject: [PATCH 119/218] ci(vs-build): adapt to Visual Studio 2026 default on windows-latest The `windows-latest` runner image migration that began on June 8, 2026 and completes on June 15, 2026 switches the default Visual Studio install from VS 2022 (v17) to VS 2026 (v18), per https://github.com/actions/runner-images/issues/14017. CMake 4.x picks up the new generator name "Visual Studio 18 2026" automatically and, crucially, writes the solution file with the new `.slnx` (XML) extension rather than `.sln`. See https://github.com/Kitware/CMake/blob/v4.3.2/Source/cmGlobalVisualStudioGenerator.cxx#L1147-L1159 where `GetSLNFile()` appends an "x" to the filename when the generator version is `VS18` or newer. As a result, the `MSBuild` step in the `vs-build` job fails with MSBUILD : error MSB1009: Project file does not exist. Switch: git.sln because the file CMake actually wrote is `git.slnx`. An example of the failure can be seen at https://github.com/git-for-windows/git/actions/runs/27264770241/job/80556419519. Teach the step to prefer `git.slnx` and fall back to `git.sln` so that it works on both the new image and any runner still on VS 2022 during the week-long staggered rollout. 
The conditional is written in PowerShell rather than bash so the step stays on the default shell: `microsoft/setup-msbuild@v3` adds `msbuild` to the Windows `PATH` only, and an MSYS2 bash spawned by the SDK does not pick it up (an earlier attempt at this fix using `shell: bash` failed with `msbuild: command not found`, see https://github.com/git-for-windows/git/actions/runs/27290221733/job/80608493655). Letting MSBuild itself discover the solution by omitting the project argument is not an option here either: CMake emits all `*.vcxproj` files (one per `add_executable`/`add_library`, e.g. `git-daemon.vcxproj`, `common-main.vcxproj`, `ALL_BUILD.vcxproj`, ...) into the same build root as the solution file, and MSBuild's auto-discovery in `ProcessProjectSwitch()` (`dotnet/msbuild`, `src/MSBuild/XMake.cs`) rejects that combination as `AmbiguousProjectError` because it only disambiguates the special case of exactly two projects where one has a `.proj` extension. Additionally, drop the `-property:PlatformToolset=v142` argument that had been carried since 889cacb6 (ci: configure GitHub Actions for CI/PR, 2020-04-11), when this job was first configured for VS 2019. The VS 2026 install on `windows-latest` only ships the v144 toolset along with a v143 compatibility component (`Microsoft.VisualStudio.Component.VC.14.44.17.14.x86.x64`); v142 is no longer present, so the explicit pin would now also fail in its own right. Removing it lets MSBuild use whatever toolset CMake selected during configuration (v143 on a VS 2022 runner, v144 on a VS 2026 one), which keeps the configure and build steps consistent with each other regardless of which image picked up the job. 
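For illustration only, the "prefer `git.slnx`, fall back to `git.sln`" selection described above can be sketched in POSIX shell. This is not what the workflow step runs (it stays in PowerShell for the reason given above); `pick_solution` and its build-directory argument are hypothetical names that are not part of this patch:

```shell
# Hypothetical helper: print the name of the solution file that CMake
# is expected to have written into the build directory given as $1.
pick_solution () {
	if test -f "$1/git.slnx"
	then
		# Written by the "Visual Studio 18 2026" generator.
		echo git.slnx
	else
		# Older generators; like the PowerShell one-liner, this
		# does not verify that git.sln actually exists.
		echo git.sln
	fi
}
```

The sketch only picks the file name; per the message above, calling `msbuild` from an MSYS2 bash would still fail because `msbuild` is not on that shell's `PATH`.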
Signed-off-by: Johannes Schindelin Assisted-by: Opus 4.7 --- .github/workflows/main.yml | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 96d19581129ec2..9dc4806ad72905 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -196,7 +196,9 @@ jobs: cmake `pwd`/contrib/buildsystems/ -DCMAKE_PREFIX_PATH=`pwd`/compat/vcbuild/vcpkg/installed/x64-windows \ -DNO_GETTEXT=YesPlease -DPERL_TESTS=OFF -DPYTHON_TESTS=OFF -DCURL_NO_CURL_CMAKE=ON - name: MSBuild - run: msbuild git.sln -property:Configuration=Release -property:Platform=x64 -maxCpuCount:4 -property:PlatformToolset=v142 + run: | + $sln = if (Test-Path git.slnx) { 'git.slnx' } else { 'git.sln' } + msbuild $sln -property:Configuration=Release -property:Platform=x64 -maxCpuCount:4 - name: bundle artifact tar shell: bash env: From 3f4a9141834dc5558e824ad61c8f98d962c9896f Mon Sep 17 00:00:00 2001 From: Ian Bearman Date: Fri, 31 Jan 2020 16:00:25 -0800 Subject: [PATCH 120/218] vcbuild: install ARM64 dependencies when building ARM64 binaries Co-authored-by: Dennis Ameling Signed-off-by: Ian Bearman Signed-off-by: Dennis Ameling Signed-off-by: Johannes Schindelin --- compat/vcbuild/README | 6 +++++- compat/vcbuild/vcpkg_copy_dlls.bat | 7 ++++++- compat/vcbuild/vcpkg_install.bat | 9 +++++++-- 3 files changed, 18 insertions(+), 4 deletions(-) diff --git a/compat/vcbuild/README b/compat/vcbuild/README index 29ec1d0f104b80..1df1cabb1ebbbd 100644 --- a/compat/vcbuild/README +++ b/compat/vcbuild/README @@ -6,7 +6,11 @@ The Steps to Build Git with VS2015 or VS2017 from the command line. 
Prompt or from an SDK bash window: $ cd - $ ./compat/vcbuild/vcpkg_install.bat + $ ./compat/vcbuild/vcpkg_install.bat x64-windows + + or + + $ ./compat/vcbuild/vcpkg_install.bat arm64-windows The vcpkg tools and all of the third-party sources will be installed in this folder: diff --git a/compat/vcbuild/vcpkg_copy_dlls.bat b/compat/vcbuild/vcpkg_copy_dlls.bat index 13661c14f8705c..8bea0cbf83b6cf 100644 --- a/compat/vcbuild/vcpkg_copy_dlls.bat +++ b/compat/vcbuild/vcpkg_copy_dlls.bat @@ -15,7 +15,12 @@ REM ================================================================ @FOR /F "delims=" %%D IN ("%~dp0") DO @SET cwd=%%~fD cd %cwd% - SET arch=x64-windows + SET arch=%2 + IF NOT DEFINED arch ( + echo defaulting to 'x64-windows'. Invoke %0 with 'x86-windows', 'x64-windows', or 'arm64-windows' + set arch=x64-windows + ) + SET inst=%cwd%vcpkg\installed\%arch% IF [%1]==[release] ( diff --git a/compat/vcbuild/vcpkg_install.bat b/compat/vcbuild/vcpkg_install.bat index 8330d8120fb511..cacef18c11dc79 100644 --- a/compat/vcbuild/vcpkg_install.bat +++ b/compat/vcbuild/vcpkg_install.bat @@ -31,6 +31,12 @@ REM ================================================================ SETLOCAL EnableDelayedExpansion + SET arch=%1 + IF NOT DEFINED arch ( + echo defaulting to 'x64-windows'. Invoke %0 with 'x86-windows', 'x64-windows', or 'arm64-windows' + set arch=x64-windows + ) + @FOR /F "delims=" %%D IN ("%~dp0") DO @SET cwd=%%~fD cd %cwd% @@ -55,9 +61,8 @@ REM ================================================================ echo Successfully installed %cwd%vcpkg\vcpkg.exe :install_libraries - SET arch=x64-windows - echo Installing third-party libraries... + echo Installing third-party libraries(%arch%)...
FOR %%i IN (zlib expat libiconv openssl libssh2 curl) DO ( cd %cwd%vcpkg IF NOT EXIST "packages\%%i_%arch%" CALL :sub__install_one %%i From 3095a33d4dc3f08113e7e4a7b56a01c3ac2ac347 Mon Sep 17 00:00:00 2001 From: Ian Bearman Date: Tue, 4 Feb 2020 10:34:40 -0800 Subject: [PATCH 121/218] vcbuild: add an option to install individual 'features' In this context, a "feature" is a dependency combined with its own dependencies. Signed-off-by: Ian Bearman Signed-off-by: Johannes Schindelin --- compat/vcbuild/vcpkg_install.bat | 35 +++++++++++++++++++++++++++++++- 1 file changed, 34 insertions(+), 1 deletion(-) diff --git a/compat/vcbuild/vcpkg_install.bat b/compat/vcbuild/vcpkg_install.bat index cacef18c11dc79..8da212487ae97d 100644 --- a/compat/vcbuild/vcpkg_install.bat +++ b/compat/vcbuild/vcpkg_install.bat @@ -85,14 +85,47 @@ REM ================================================================ :sub__install_one echo Installing package %1... + call :%1_features + REM vcpkg may not be reliable on slow, intermittent or proxy REM connections, see e.g. 
REM https://social.msdn.microsoft.com/Forums/windowsdesktop/en-US/4a8f7be5-5e15-4213-a7bb-ddf424a954e6/winhttpsendrequest-ends-with-12002-errorhttptimeout-after-21-seconds-no-matter-what-timeout?forum=windowssdk REM which explains the hidden 21 second timeout REM (last post by Dave : Microsoft - Windows Networking team) - .\vcpkg.exe install %1:%arch% + .\vcpkg.exe install %1%features%:%arch% IF ERRORLEVEL 1 ( EXIT /B 1 ) echo Finished %1 goto :EOF + +:: +:: features for each vcpkg to install +:: there should be an entry here for each package to install +:: 'set features=' means use the default otherwise +:: 'set features=[comma-delimited-feature-set]' is the syntax +:: + +:zlib_features +set features= +goto :EOF + +:expat_features +set features= +goto :EOF + +:libiconv_features +set features= +goto :EOF + +:openssl_features +set features= +goto :EOF + +:libssh2_features +set features= +goto :EOF + +:curl_features +set features=[core,openssl] +goto :EOF From 0235fd9ecf07a9d9d8c3e4cd1eacf6ff75baf1c5 Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Sun, 6 Oct 2019 18:40:55 +0100 Subject: [PATCH 122/218] vcpkg_install: detect lack of Git The vcpkg_install batch file depends on the availability of a working Git on the CMD path. This may not be present if the user has selected the 'bash only' option during Git-for-Windows install. Detect and tell the user about their lack of a working Git in the CMD window. Fixes #2348. A separate PR https://github.com/git-for-windows/build-extra/pull/258 now highlights the recommended path setting during install. 
Signed-off-by: Philip Oakley --- compat/vcbuild/vcpkg_install.bat | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/compat/vcbuild/vcpkg_install.bat b/compat/vcbuild/vcpkg_install.bat index ebd0bad242a8ca..bcbbf536af3141 100644 --- a/compat/vcbuild/vcpkg_install.bat +++ b/compat/vcbuild/vcpkg_install.bat @@ -36,6 +36,13 @@ REM ================================================================ dir vcpkg\vcpkg.exe >nul 2>nul && GOTO :install_libraries + git.exe version 2>nul + IF ERRORLEVEL 1 ( + echo "***" + echo "Git not found. Please adjust your CMD path or Git install option." + echo "***" + EXIT /B 1 ) + echo Fetching vcpkg in %cwd%vcpkg git.exe clone https://github.com/Microsoft/vcpkg vcpkg IF ERRORLEVEL 1 ( EXIT /B 1 ) From e48b4e664306a389c096a07da59cec660c9f39ff Mon Sep 17 00:00:00 2001 From: Dennis Ameling Date: Fri, 4 Dec 2020 14:11:34 +0100 Subject: [PATCH 123/218] cmake: allow building for Windows/ARM64 Signed-off-by: Dennis Ameling Signed-off-by: Johannes Schindelin --- contrib/buildsystems/CMakeLists.txt | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index 81b4306e72046c..98500521641afd 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -65,9 +65,9 @@ if(USE_VCPKG) set(VCPKG_DIR "${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg") if(NOT EXISTS ${VCPKG_DIR}) message("Initializing vcpkg and building the Git's dependencies (this will take a while...)") - execute_process(COMMAND ${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg_install.bat) + execute_process(COMMAND ${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg_install.bat ${VCPKG_ARCH}) endif() - list(APPEND CMAKE_PREFIX_PATH "${VCPKG_DIR}/installed/x64-windows") + list(APPEND CMAKE_PREFIX_PATH "${VCPKG_DIR}/installed/${VCPKG_ARCH}") # In the vcpkg edition, we need this to be able to link to libcurl set(CURL_NO_CURL_CMAKE ON) @@ -1199,7 +1199,7 @@ string(REPLACE "@USE_LIBPCRE2@" "" 
git_build_options "${git_build_options}") string(REPLACE "@WITH_BREAKING_CHANGES@" "" git_build_options "${git_build_options}") string(REPLACE "@X@" "${EXE_EXTENSION}" git_build_options "${git_build_options}") if(USE_VCPKG) - string(APPEND git_build_options "PATH=\"$PATH:$TEST_DIRECTORY/../compat/vcbuild/vcpkg/installed/x64-windows/bin\"\n") + string(APPEND git_build_options "PATH=\"$PATH:$TEST_DIRECTORY/../compat/vcbuild/vcpkg/installed/${VCPKG_ARCH}/bin\"\n") endif() file(WRITE ${CMAKE_BINARY_DIR}/GIT-BUILD-OPTIONS ${git_build_options}) From bfbd8b6658ee597094fb7fb76e912e647d4913dd Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Sun, 6 Oct 2019 18:43:57 +0100 Subject: [PATCH 124/218] vcpkg_install: add comment regarding slow network connections The vcpkg downloads may not succeed. Warn careful readers of the time out. A simple retry will usually resolve the issue. Signed-off-by: Philip Oakley Signed-off-by: Johannes Schindelin --- compat/vcbuild/vcpkg_install.bat | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/compat/vcbuild/vcpkg_install.bat b/compat/vcbuild/vcpkg_install.bat index bcbbf536af3141..8330d8120fb511 100644 --- a/compat/vcbuild/vcpkg_install.bat +++ b/compat/vcbuild/vcpkg_install.bat @@ -80,6 +80,12 @@ REM ================================================================ :sub__install_one echo Installing package %1... + REM vcpkg may not be reliable on slow, intermittent or proxy + REM connections, see e.g. 
+ REM https://social.msdn.microsoft.com/Forums/windowsdesktop/en-US/4a8f7be5-5e15-4213-a7bb-ddf424a954e6/winhttpsendrequest-ends-with-12002-errorhttptimeout-after-21-seconds-no-matter-what-timeout?forum=windowssdk + REM which explains the hidden 21 second timeout + REM (last post by Dave : Microsoft - Windows Networking team) + .\vcpkg.exe install %1:%arch% IF ERRORLEVEL 1 ( EXIT /B 1 ) From ea633e08a0bd167cdac06b1d23a423113a0cce68 Mon Sep 17 00:00:00 2001 From: Dennis Ameling Date: Sun, 29 Nov 2020 00:12:26 +0100 Subject: [PATCH 125/218] ci(vs-build) also build Windows/ARM64 artifacts There are no Windows/ARM64 agents in GitHub Actions yet, therefore we just skip adjusting the `vs-test` job for now. Signed-off-by: Dennis Ameling Signed-off-by: Johannes Schindelin --- .github/workflows/main.yml | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 9dc4806ad72905..c463e79f44931f 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -169,8 +169,11 @@ jobs: NO_PERL: 1 GIT_CONFIG_PARAMETERS: "'user.name=CI' 'user.email=ci@git'" runs-on: windows-latest + strategy: + matrix: + arch: [x64, arm64] concurrency: - group: vs-build-${{ github.ref }} + group: vs-build-${{ github.ref }}-${{ matrix.arch }} cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - uses: actions/checkout@v5 @@ -189,16 +192,16 @@ jobs: uses: microsoft/setup-msbuild@v2 - name: copy dlls to root shell: cmd - run: compat\vcbuild\vcpkg_copy_dlls.bat release + run: compat\vcbuild\vcpkg_copy_dlls.bat release ${{ matrix.arch }}-windows - name: generate Visual Studio solution shell: bash run: | - cmake `pwd`/contrib/buildsystems/ -DCMAKE_PREFIX_PATH=`pwd`/compat/vcbuild/vcpkg/installed/x64-windows \ - -DNO_GETTEXT=YesPlease -DPERL_TESTS=OFF -DPYTHON_TESTS=OFF -DCURL_NO_CURL_CMAKE=ON + cmake `pwd`/contrib/buildsystems/ 
-DCMAKE_PREFIX_PATH=`pwd`/compat/vcbuild/vcpkg/installed/${{ matrix.arch }}-windows \ + -DNO_GETTEXT=YesPlease -DPERL_TESTS=OFF -DPYTHON_TESTS=OFF -DCURL_NO_CURL_CMAKE=ON -DCMAKE_GENERATOR_PLATFORM=${{ matrix.arch }} -DVCPKG_ARCH=${{ matrix.arch }}-windows - name: MSBuild run: | $sln = if (Test-Path git.slnx) { 'git.slnx' } else { 'git.sln' } - msbuild $sln -property:Configuration=Release -property:Platform=x64 -maxCpuCount:4 + msbuild $sln -property:Configuration=Release -property:Platform=${{ matrix.arch }} -maxCpuCount:4 - name: bundle artifact tar shell: bash env: @@ -212,7 +215,7 @@ jobs: - name: upload tracked files and build artifacts uses: actions/upload-artifact@v5 with: - name: vs-artifacts + name: vs-artifacts-${{ matrix.arch }} path: artifacts vs-test: name: win+VS test @@ -230,7 +233,7 @@ jobs: - name: download tracked files and build artifacts uses: actions/download-artifact@v6 with: - name: vs-artifacts + name: vs-artifacts-x64 path: ${{github.workspace}} - name: extract tracked files and build artifacts shell: bash From f826b94c566660cd13deae02eebce84a09d02317 Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Fri, 2 Jul 2021 00:30:24 +0100 Subject: [PATCH 126/218] CMake: default Visual Studio generator has changed Correct some wording and inform users regarding the Visual Studio changes (from V16.6) to the default generator. Subsequent commits ensure that Git for Windows can be directly opened in modern Visual Studio without needing special configuration of the CMakeLists settings. It appears that internally Visual Studio creates its own version of the .sln file (etc.) for extension tools that expect them. The large number of references below documents the shifting of Visual Studio default and CMake setting options. refs: https://docs.microsoft.com/en-us/search/?scope=C%2B%2B&view=msvc-150&terms=Ninja 1.
https://docs.microsoft.com/en-us/cpp/linux/cmake-linux-configure?view=msvc-160 (note the linux bit) "In Visual Studio 2019 version 16.6 or later ***, Ninja is the default generator for configurations targeting a remote system or WSL. For more information, see this post on the C++ Team Blog [https://devblogs.microsoft.com/cppblog/linux-development-with-visual-studio-first-class-support-for-gdbserver-improved-build-times-with-ninja-and-updates-to-the-connection-manager/]. For more information about these settings, see CMakeSettings.json reference [https://docs.microsoft.com/en-us/cpp/build/cmakesettings-reference?view=msvc-160]." 2. https://docs.microsoft.com/en-us/cpp/build/cmake-presets-vs?view=msvc-160 "CMake supports two files that allow users to specify common configure, build, and test options and share them with others: CMakePresets.json and CMakeUserPresets.json." " Both files are supported in Visual Studio 2019 version 16.10 or later. ***" 3. https://devblogs.microsoft.com/cppblog/linux-development-with-visual-studio-first-class-support-for-gdbserver-improved-build-times-with-ninja-and-updates-to-the-connection-manager/ " Ninja has been the default generator (underlying build system) for CMake configurations targeting Windows for some time***, but in Visual Studio 2019 version 16.6 Preview 3*** we added support for Ninja on Linux." 4. https://docs.microsoft.com/en-us/cpp/build/cmakesettings-reference?view=msvc-160 " `generator`: specifies CMake generator to use for this configuration. May be one of: Visual Studio 2019 only: Visual Studio 16 2019 Visual Studio 16 2019 Win64 Visual Studio 16 2019 ARM Visual Studio 2017 and later: Visual Studio 15 2017 Visual Studio 15 2017 Win64 Visual Studio 15 2017 ARM Visual Studio 14 2015 Visual Studio 14 2015 Win64 Visual Studio 14 2015 ARM Unix Makefiles Ninja Because Ninja is designed for fast build speeds instead of flexibility and function, it is set as the default. 
However, some CMake projects may be unable to correctly build using Ninja. If this occurs, you can instruct CMake to generate Visual Studio projects instead. To specify a Visual Studio generator in Visual Studio 2017, open the settings editor from the main menu by choosing CMake | Change CMake Settings. Delete "Ninja" and type "V". This activates IntelliSense, which enables you to choose the generator you want." "To specify a Visual Studio generator in Visual Studio 2019, right-click on the CMakeLists.txt file in Solution Explorer and choose CMake Settings for project > Show Advanced Settings > CMake Generator. When the active configuration specifies a Visual Studio generator, by default MSBuild.exe is invoked with` -m -v:minimal` arguments." 5. https://docs.microsoft.com/en-us/cpp/build/cmake-presets-vs?view=msvc-160#enable-cmakepresetsjson-integration-in-visual-studio-2019 "Enable CMakePresets.json integration in Visual Studio 2019 CMakePresets.json integration isn't enabled by default in Visual Studio 2019. You can enable it for all CMake projects in Tools > Options > CMake > General: (tick a box)" ... see more. 6. https://docs.microsoft.com/en-us/cpp/build/cmakesettings-reference?view=msvc-140 (whichever v140 is..) "CMake projects are supported in Visual Studio 2017 and later." 7. https://docs.microsoft.com/en-us/cpp/overview/what-s-new-for-cpp-2017?view=msvc-150 "Support added for the CMake Ninja generator." 8. https://docs.microsoft.com/en-us/cpp/overview/what-s-new-for-cpp-2017?view=msvc-150#cmake-support-via-open-folder "CMake support via Open Folder Visual Studio 2017 introduces support for using CMake projects without converting to MSBuild project files (.vcxproj). For more information, see CMake projects in Visual Studio[https://docs.microsoft.com/en-us/cpp/build/cmake-projects-in-visual-studio?view=msvc-150]. Opening CMake projects with Open Folder automatically configures the environment for C++ editing, building, and debugging." ... +more! 9. 
https://docs.microsoft.com/en-us/cpp/build/cmake-presets-vs?view=msvc-160#supported-cmake-and-cmakepresetsjson-versions "Visual Studio reads and evaluates CMakePresets.json and CMakeUserPresets.json itself and doesn't invoke CMake directly with the --preset option. So, CMake version 3.20 or later isn't strictly required when you're building with CMakePresets.json inside Visual Studio. We recommend using CMake version 3.14 or later." 10. https://docs.microsoft.com/en-us/cpp/build/cmake-presets-vs?view=msvc-160#enable-cmakepresetsjson-integration-in-visual-studio-2019 "If you don't want to enable CMakePresets.json integration for all CMake projects, you can enable CMakePresets.json integration for a single CMake project by adding a CMakePresets.json file to the root of the open folder. You must close and reopen the folder in Visual Studio to activate the integration. 11. https://docs.microsoft.com/en-us/cpp/build/cmake-presets-vs?view=msvc-160#default-configure-presets ***(doesn't actually say which version..) 
"Default Configure Presets If no CMakePresets.json or CMakeUserPresets.json file exists, or if CMakePresets.json or CMakeUserPresets.json is invalid, Visual Studio will fall back*** on the following default Configure Presets: Windows example JSON { "name": "windows-default", "displayName": "Windows x64 Debug", "description": "Sets Ninja generator, compilers, x64 architecture, build and install directory, debug build type", "generator": "Ninja", "binaryDir": "${sourceDir}/out/build/${presetName}", "architecture": { "value": "x64", "strategy": "external" }, "cacheVariables": { "CMAKE_BUILD_TYPE": "Debug", "CMAKE_INSTALL_PREFIX": "${sourceDir}/out/install/${presetName}" }, "vendor": { "microsoft.com/VisualStudioSettings/CMake/1.0": { "hostOS": [ "Windows" ] } } }, " Signed-off-by: Philip Oakley --- contrib/buildsystems/CMakeLists.txt | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index 14bc843610a64c..d91d97a9d650dd 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -14,6 +14,11 @@ Note: Visual Studio also has the option of opening `CMakeLists.txt` directly; Using this option, Visual Studio will not find the source code, though, therefore the `File>Open>Folder...` option is preferred. +Visual Studio does not produce a .sln solution file nor the .vcxproj files +that may be required by VS extension tools. + +To generate the .sln/.vcxproj files run CMake manually, as described below. + Instructions to run CMake manually: mkdir -p contrib/buildsystems/out @@ -22,7 +27,7 @@ Instructions to run CMake manually: This will build the git binaries in contrib/buildsystems/out directory (our top-level .gitignore file knows to ignore contents of -this directory). +this directory). The project .sln and .vcxproj files are also generated. 
Possible build configurations(-DCMAKE_BUILD_TYPE) with corresponding compiler flags @@ -35,17 +40,16 @@ empty(default) : NOTE: -DCMAKE_BUILD_TYPE is optional. For multi-config generators like Visual Studio this option is ignored -This process generates a Makefile(Linux/*BSD/MacOS) , Visual Studio solution(Windows) by default. +This process generates a Makefile(Linux/*BSD/MacOS), Visual Studio solution(Windows) by default. Run `make` to build Git on Linux/*BSD/MacOS. Open git.sln on Windows and build Git. -NOTE: By default CMake uses Makefile as the build tool on Linux and Visual Studio in Windows, -to use another tool say `ninja` add this to the command line when configuring. -`-G Ninja` - NOTE: By default CMake will install vcpkg locally to your source tree on configuration, to avoid this, add `-DNO_VCPKG=TRUE` to the command line when configuring. +The Visual Studio default generator changed in v16.6 from its Visual Studio +implementation to `Ninja`. This required changes to many CMake scripts. + ]] cmake_minimum_required(VERSION 3.14) From ac822571402e12d61c4a186b47f4243848ad14ba Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Sat, 24 Apr 2021 11:09:58 +0100 Subject: [PATCH 127/218] .gitignore: add Visual Studio CMakeSettings.json file The CMakeSettings.json file is tool generated. Developers may track it should they provide additional settings. Signed-off-by: Philip Oakley --- .gitignore | 1 + 1 file changed, 1 insertion(+) diff --git a/.gitignore b/.gitignore index 24635cf2d6f4a3..8bf10eb878024f 100644 --- a/.gitignore +++ b/.gitignore @@ -257,5 +257,6 @@ Release/ /git.VC.db *.dSYM /contrib/buildsystems/out +CMakeSettings.json /contrib/libgit-rs/target /contrib/libgit-sys/target From d8d25d545debb1998cff4a959d8309fad5fabe8f Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Thu, 22 Apr 2021 11:11:38 +0100 Subject: [PATCH 128/218] CMakeLists: add default "x64-windows" arch for Visual Studio In Git-for-Windows, work on using ARM64 has progressed.
The commit 2d94b77b27 (cmake: allow building for Windows/ARM64, 2020-12-04) failed to notice that /compat/vcbuild/vcpkg_install.bat will default to using the "x64-windows" architecture for the vcpkg installation if not set, but CMake is not told of this default. Commit 635b6d99b3 (vcbuild: install ARM64 dependencies when building ARM64 binaries, 2020-01-31) later updated vcpkg_install.bat to accept an arch (%1) parameter, but retained the default. This default is necessary for the use case where the project directory is opened directly in Visual Studio, which will find and build a CMakeLists.txt file without any parameters, thus expecting use of the default setting. Also, Visual Studio will generate internal .sln solution and .vcxproj project files needed for some extension tools. Inform users of the additional .sln/.vcxproj generation. ** How to test: rm -rf '.vs' # remove old Visual Studio settings rm -rf 'compat/vcbuild/vcpkg' # remove any vcpkg downloads rm -rf 'contrib/buildsystems/out' # remove builds & CMake artifacts With a fresh Visual Studio Community Edition, File>>Open>>(git *folder*) to load the project (which will take some time!). Check for successful compilation. The implicit .sln (etc.) are in the hidden .vs directory created by Visual Studio.
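As a rough sketch of the defaulting rule described above (an unset architecture means "x64-windows", an explicitly given one is kept), written in shell rather than batch or CMake, and using the hypothetical helper name `effective_arch`:

```shell
# Hypothetical helper: print the vcpkg architecture triplet to use.
effective_arch () {
	# ${1:-default} expands to the default when $1 is unset or empty,
	# mirroring the "IF NOT DEFINED arch" fallback in vcpkg_install.bat.
	echo "${1:-x64-windows}"
}
```

Called without an argument this prints `x64-windows`; called as `effective_arch arm64-windows` it prints the argument unchanged.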
Signed-off-by: Philip Oakley --- contrib/buildsystems/CMakeLists.txt | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index d91d97a9d650dd..3976c37fca806f 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -71,6 +71,10 @@ if(USE_VCPKG) message("Initializing vcpkg and building the Git's dependencies (this will take a while...)") execute_process(COMMAND ${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg_install.bat ${VCPKG_ARCH}) endif() + if(NOT EXISTS ${VCPKG_ARCH}) + message("VCPKG_ARCH: unset, using 'x64-windows'") + set(VCPKG_ARCH "x64-windows") # default from vcpkg_install.bat + endif() list(APPEND CMAKE_PREFIX_PATH "${VCPKG_DIR}/installed/${VCPKG_ARCH}") # In the vcpkg edition, we need this to be able to link to libcurl From 399be1001f62dc8f914a5d217e01beff3cf9a952 Mon Sep 17 00:00:00 2001 From: Dennis Ameling Date: Sun, 6 Dec 2020 18:39:26 +0100 Subject: [PATCH 129/218] Add schannel to curl installation Signed-off-by: Dennis Ameling --- compat/vcbuild/vcpkg_install.bat | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/compat/vcbuild/vcpkg_install.bat b/compat/vcbuild/vcpkg_install.bat index 8da212487ae97d..575c65c20ba307 100644 --- a/compat/vcbuild/vcpkg_install.bat +++ b/compat/vcbuild/vcpkg_install.bat @@ -127,5 +127,5 @@ set features= goto :EOF :curl_features -set features=[core,openssl] +set features=[core,openssl,schannel] goto :EOF From 5859b251c259a77375323e519dd655c2418e3543 Mon Sep 17 00:00:00 2001 From: Dennis Ameling Date: Mon, 19 Jul 2021 13:02:16 +0200 Subject: [PATCH 130/218] cmake(): allow setting HOST_CPU for cross-compilation Git's regular Makefile mentions that HOST_CPU should be defined when cross-compiling Git: https://github.com/git-for-windows/git/blob/37796bca76ef4180c39ee508ca3e42c0777ba444/Makefile#L438-L439 This is then used to set the GIT_HOST_CPU variable when compiling Git: 
https://github.com/git-for-windows/git/blob/37796bca76ef4180c39ee508ca3e42c0777ba444/Makefile#L1337-L1341 Then, when the user runs `git version --build-options`, it returns that value: https://github.com/git-for-windows/git/blob/37796bca76ef4180c39ee508ca3e42c0777ba444/help.c#L658 This commit adds the same functionality to the CMake configuration. Users can now set -DHOST_CPU= to set the target architecture. Signed-off-by: Dennis Ameling --- .github/workflows/main.yml | 2 +- contrib/buildsystems/CMakeLists.txt | 9 ++++++++- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index c463e79f44931f..bb98e0785792a8 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -197,7 +197,7 @@ jobs: shell: bash run: | cmake `pwd`/contrib/buildsystems/ -DCMAKE_PREFIX_PATH=`pwd`/compat/vcbuild/vcpkg/installed/${{ matrix.arch }}-windows \ - -DNO_GETTEXT=YesPlease -DPERL_TESTS=OFF -DPYTHON_TESTS=OFF -DCURL_NO_CURL_CMAKE=ON -DCMAKE_GENERATOR_PLATFORM=${{ matrix.arch }} -DVCPKG_ARCH=${{ matrix.arch }}-windows + -DNO_GETTEXT=YesPlease -DPERL_TESTS=OFF -DPYTHON_TESTS=OFF -DCURL_NO_CURL_CMAKE=ON -DCMAKE_GENERATOR_PLATFORM=${{ matrix.arch }} -DVCPKG_ARCH=${{ matrix.arch }}-windows -DHOST_CPU=${{ matrix.arch }} - name: MSBuild run: | $sln = if (Test-Path git.slnx) { 'git.slnx' } else { 'git.sln' } diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index 98500521641afd..14bc843610a64c 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -212,7 +212,14 @@ endif() #default behaviour include_directories(${CMAKE_SOURCE_DIR}) -add_compile_definitions(GIT_HOST_CPU="${CMAKE_SYSTEM_PROCESSOR}") + +# When cross-compiling, define HOST_CPU as the canonical name of the CPU on +# which the built Git will run (for instance "x86_64"). 
+if(NOT HOST_CPU) + add_compile_definitions(GIT_HOST_CPU="${CMAKE_SYSTEM_PROCESSOR}") +else() + add_compile_definitions(GIT_HOST_CPU="${HOST_CPU}") +endif() add_compile_definitions(SHA256_BLK INTERNAL_QSORT RUNTIME_PREFIX) add_compile_definitions(NO_OPENSSL SHA1_DC SHA1DC_NO_STANDARD_INCLUDES SHA1DC_INIT_SAFE_HASH_DEFAULT=0 From f4ce0a5385dc4e4116abb95e029f954024665568 Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Mon, 10 May 2021 16:47:40 +0100 Subject: [PATCH 131/218] CMake: show Win32 and Generator_platform build-option values Ensure key CMake option values are part of the CMake output to facilitate user support when tool updates impact the wider CMake actions, particularly ongoing 'improvements' in Visual Studio. These CMake displays perform the same function as the build-options.txt provided in the main Git for Windows. CMake is already chatty. The setting of CMAKE_EXPORT_COMPILE_COMMANDS is also reported. Include the environment's CMAKE_EXPORT_COMPILE_COMMANDS value which may have been propagated to CMake's internal value. Testing the CMAKE_EXPORT_COMPILE_COMMANDS processing can be difficult in the Visual Studio environment, as it may be cached in many places. The 'environment' may include the OS, the user shell, CMake's own environment, along with the Visual Studio presets and caches. See the previous commit for the artifacts that need removing for a clean test.
Signed-off-by: Philip Oakley --- contrib/buildsystems/CMakeLists.txt | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index 3976c37fca806f..49888fa38e1541 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -63,10 +63,20 @@ endif() if(NOT DEFINED CMAKE_EXPORT_COMPILE_COMMANDS) set(CMAKE_EXPORT_COMPILE_COMMANDS TRUE) + message("setting CMAKE_EXPORT_COMPILE_COMMANDS: ${CMAKE_EXPORT_COMPILE_COMMANDS}") endif() if(USE_VCPKG) set(VCPKG_DIR "${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg") + message("WIN32: ${WIN32}") # show its underlying text values + message("VCPKG_DIR: ${VCPKG_DIR}") + message("VCPKG_ARCH: ${VCPKG_ARCH}") # maybe unset + message("MSVC: ${MSVC}") + message("CMAKE_GENERATOR: ${CMAKE_GENERATOR}") + message("CMAKE_CXX_COMPILER_ID: ${CMAKE_CXX_COMPILER_ID}") + message("CMAKE_GENERATOR_PLATFORM: ${CMAKE_GENERATOR_PLATFORM}") + message("CMAKE_EXPORT_COMPILE_COMMANDS: ${CMAKE_EXPORT_COMPILE_COMMANDS}") + message("ENV(CMAKE_EXPORT_COMPILE_COMMANDS): $ENV{CMAKE_EXPORT_COMPILE_COMMANDS}") if(NOT EXISTS ${VCPKG_DIR}) message("Initializing vcpkg and building the Git's dependencies (this will take a while...)") execute_process(COMMAND ${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg_install.bat ${VCPKG_ARCH}) From db36eaf9fedd099312c9821c5826552a74f69073 Mon Sep 17 00:00:00 2001 From: JiSeop Moon Date: Mon, 23 Apr 2018 22:30:18 +0900 Subject: [PATCH 132/218] mingw: introduce code to detect whether we're inside a Windows container This will come in handy in the next commit.
Signed-off-by: JiSeop Moon Signed-off-by: Johannes Schindelin --- compat/mingw.c | 32 ++++++++++++++++++++++++++++++++ compat/mingw.h | 5 +++++ 2 files changed, 37 insertions(+) diff --git a/compat/mingw.c b/compat/mingw.c index feefa2cd0eb12a..3fa7d75b74919e 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -3713,3 +3713,35 @@ int mingw_have_unix_sockets(void) return ret; } #endif + +/* + * Based on https://stackoverflow.com/questions/43002803 + * + * [HKLM\SYSTEM\CurrentControlSet\Services\cexecsvc] + * "DisplayName"="@%systemroot%\\system32\\cexecsvc.exe,-100" + * "ErrorControl"=dword:00000001 + * "ImagePath"=hex(2):25,00,73,00,79,00,73,00,74,00,65,00,6d,00,72,00,6f,00, + * 6f,00,74,00,25,00,5c,00,73,00,79,00,73,00,74,00,65,00,6d,00,33,00,32,00, + * 5c,00,63,00,65,00,78,00,65,00,63,00,73,00,76,00,63,00,2e,00,65,00,78,00, + * 65,00,00,00 + * "Start"=dword:00000002 + * "Type"=dword:00000010 + * "Description"="@%systemroot%\\system32\\cexecsvc.exe,-101" + * "ObjectName"="LocalSystem" + * "ServiceSidType"=dword:00000001 + */ +int is_inside_windows_container(void) +{ + static int inside_container = -1; /* -1 uninitialized */ + const char *key = "SYSTEM\\CurrentControlSet\\Services\\cexecsvc"; + HKEY handle = NULL; + + if (inside_container != -1) + return inside_container; + + inside_container = ERROR_SUCCESS == + RegOpenKeyExA(HKEY_LOCAL_MACHINE, key, 0, KEY_READ, &handle); + RegCloseKey(handle); + + return inside_container; +} diff --git a/compat/mingw.h b/compat/mingw.h index 444daedfa52469..20af7bd5496ffe 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -213,3 +213,8 @@ int mingw_have_unix_sockets(void); #undef have_unix_sockets #define have_unix_sockets mingw_have_unix_sockets #endif + +/* + * Check current process is inside Windows Container. 
+ */ +int is_inside_windows_container(void); From 325ac04eca52767869957fad7a653f84a8a562c0 Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Thu, 19 Mar 2015 16:33:44 +0100 Subject: [PATCH 133/218] mingw: Support `git_terminal_prompt` with more terminals The `git_terminal_prompt()` function expects the terminal window to be attached to a Win32 Console. However, this is not the case with terminal windows other than `cmd.exe`'s, e.g. with MSys2's own `mintty`. Non-cmd terminals such as `mintty` still have to have a Win32 Console to be proper console programs, but have to hide the Win32 Console to be able to provide more flexibility (such as being resizeable not only vertically but also horizontally). By writing to that Win32 Console, `git_terminal_prompt()` manages only to send the prompt to nowhere and to wait for input from a Console to which the user has no access. This commit introduces a function specifically to support `mintty` -- or other terminals that are compatible with MSys2's `/dev/tty` emulation. We use the `TERM` environment variable as an indicator for that: if the value starts with "xterm" (such as `mintty`'s "xterm_256color"), we prefer to let `xterm_prompt()` handle the user interaction. The most prominent user of `git_terminal_prompt()` is certainly `git-remote-https.exe`. It is an interesting use case because both `stdin` and `stdout` are redirected when Git calls said executable, yet it still wants to access the terminal. When running inside a `mintty`, the terminal is not accessible to the `git-remote-https.exe` program, though, because it is a MinGW program and the `mintty` terminal is not backed by a Win32 console. To solve that problem, we simply call out to the shell -- which is an *MSys2* program and can therefore access `/dev/tty`. 
Helped-by: nalla Signed-off-by: Karsten Blees Signed-off-by: Johannes Schindelin --- compat/terminal.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 54 insertions(+) diff --git a/compat/terminal.c b/compat/terminal.c index 584f27bf7e1078..cdcde283644e41 100644 --- a/compat/terminal.c +++ b/compat/terminal.c @@ -418,6 +418,54 @@ static int getchar_with_timeout(int timeout) return getchar(); } +static char *shell_prompt(const char *prompt, int echo) +{ + const char *read_input[] = { + /* Note: call 'bash' explicitly, as 'read -s' is bash-specific */ + "bash", "-c", echo ? + "cat >/dev/tty && read -r line /dev/tty && read -r -s line /dev/tty", + NULL + }; + struct child_process child = CHILD_PROCESS_INIT; + static struct strbuf buffer = STRBUF_INIT; + int prompt_len = strlen(prompt), len = -1, code; + + strvec_pushv(&child.args, read_input); + child.in = -1; + child.out = -1; + + if (start_command(&child)) + return NULL; + + if (write_in_full(child.in, prompt, prompt_len) != prompt_len) { + error("could not write to prompt script"); + close(child.in); + goto ret; + } + close(child.in); + + strbuf_reset(&buffer); + len = strbuf_read(&buffer, child.out, 1024); + if (len < 0) { + error("could not read from prompt script"); + goto ret; + } + + strbuf_strip_suffix(&buffer, "\n"); + strbuf_strip_suffix(&buffer, "\r"); + +ret: + close(child.out); + code = finish_command(&child); + if (code) { + error("failed to execute prompt script (exit code %d)", code); + return NULL; + } + + return len < 0 ? 
NULL : buffer.buf; +} + #endif #ifndef FORCE_TEXT @@ -429,6 +477,12 @@ char *git_terminal_prompt(const char *prompt, int echo) static struct strbuf buf = STRBUF_INIT; int r; FILE *input_fh, *output_fh; +#ifdef GIT_WINDOWS_NATIVE + const char *term = getenv("TERM"); + + if (term && starts_with(term, "xterm")) + return shell_prompt(prompt, echo); +#endif input_fh = fopen(INPUT_PATH, "r" FORCE_TEXT); if (!input_fh) From 7ad618a785bcc2575b14db79135aaa2c9028aa72 Mon Sep 17 00:00:00 2001 From: JiSeop Moon Date: Mon, 23 Apr 2018 22:31:42 +0200 Subject: [PATCH 134/218] mingw: when running in a Windows container, try to rename() harder It is a known issue that a rename() can fail with an "Access denied" error at times, when copying followed by deleting the original file works. Let's just fall back to that behavior. Signed-off-by: JiSeop Moon Signed-off-by: Johannes Schindelin --- compat/mingw.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/compat/mingw.c b/compat/mingw.c index 3fa7d75b74919e..0a00837b45a909 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2564,6 +2564,13 @@ int mingw_rename(const char *pold, const char *pnew) gle = GetLastError(); } + if (gle == ERROR_ACCESS_DENIED && is_inside_windows_container()) { + /* Fall back to copy to destination & remove source */ + if (CopyFileW(wpold, wpnew, FALSE) && !mingw_unlink(pold, 1)) + return 0; + gle = GetLastError(); + } + /* revert file attributes on failure */ if (attrs != INVALID_FILE_ATTRIBUTES) SetFileAttributesW(wpnew, attrs); From 0b126e975cb05d4096eefb3c274aedf9233b8875 Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Sat, 9 May 2015 02:11:48 +0200 Subject: [PATCH 135/218] compat/terminal.c: only use the Windows console if bash 'read -r' fails MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Accessing the Windows console through the special CONIN$ / CONOUT$ devices doesn't work properly for non-ASCII usernames and passwords.
It also doesn't work for terminal emulators that hide the native console window (such as mintty), and 'TERM=xterm*' is not necessarily a reliable indicator for such terminals. The new shell_prompt() function, on the other hand, works fine for both MSys1 and MSys2, in native console windows as well as mintty, and properly supports Unicode. It just needs bash on the path (for 'read -s', which is bash-specific). On Windows, try to use the shell to read from the terminal. If that fails with ENOENT (i.e. bash was not found), use CONIN/OUT as fallback. Note: To test this, create a UTF-8 credential file with non-ASCII chars, e.g. in git-bash: 'echo url=http://täst.com > cred.txt'. Then in git-cmd, 'git credential fill --- compat/terminal.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/compat/terminal.c b/compat/terminal.c index cdcde283644e41..a89c5cd9ccf604 100644 --- a/compat/terminal.c +++ b/compat/terminal.c @@ -434,6 +434,7 @@ static char *shell_prompt(const char *prompt, int echo) strvec_pushv(&child.args, read_input); child.in = -1; child.out = -1; + child.silent_exec_failure = 1; if (start_command(&child)) return NULL; @@ -477,11 +478,14 @@ char *git_terminal_prompt(const char *prompt, int echo) static struct strbuf buf = STRBUF_INIT; int r; FILE *input_fh, *output_fh; + #ifdef GIT_WINDOWS_NATIVE - const char *term = getenv("TERM"); - if (term && starts_with(term, "xterm")) - return shell_prompt(prompt, echo); + /* try shell_prompt first, fall back to CONIN/OUT if bash is missing */ + char *result = shell_prompt(prompt, echo); + if (result || errno != ENOENT) + return result; + #endif input_fh = fopen(INPUT_PATH, "r" FORCE_TEXT); From 984bc2fbf01ff010b0603cd4b5d084456afdeb89 Mon Sep 17 00:00:00 2001 From: JiSeop Moon Date: Mon, 23 Apr 2018 22:35:26 +0200 Subject: [PATCH 136/218] mingw: move the file_attr_to_st_mode() function definition In preparation for making this function a bit more complicated (to allow for special-casing the 
`ContainerMappedDirectories` in Windows containers, which look like a symbolic link, but are not), let's move it out of the header. Signed-off-by: JiSeop Moon Signed-off-by: Johannes Schindelin --- compat/mingw.c | 14 ++++++++++++++ compat/win32.h | 14 +------------- 2 files changed, 15 insertions(+), 13 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 0a00837b45a909..b6ce98dbe2181b 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -3752,3 +3752,17 @@ int is_inside_windows_container(void) return inside_container; } + +int file_attr_to_st_mode (DWORD attr, DWORD tag) +{ + int fMode = S_IREAD; + if ((attr & FILE_ATTRIBUTE_REPARSE_POINT) && tag == IO_REPARSE_TAG_SYMLINK) + fMode |= S_IFLNK; + else if (attr & FILE_ATTRIBUTE_DIRECTORY) + fMode |= S_IFDIR; + else + fMode |= S_IFREG; + if (!(attr & FILE_ATTRIBUTE_READONLY)) + fMode |= S_IWRITE; + return fMode; +} diff --git a/compat/win32.h b/compat/win32.h index 671bcc81f93351..52169ae19f4371 100644 --- a/compat/win32.h +++ b/compat/win32.h @@ -6,19 +6,7 @@ #include #endif -static inline int file_attr_to_st_mode (DWORD attr, DWORD tag) -{ - int fMode = S_IREAD; - if ((attr & FILE_ATTRIBUTE_REPARSE_POINT) && tag == IO_REPARSE_TAG_SYMLINK) - fMode |= S_IFLNK; - else if (attr & FILE_ATTRIBUTE_DIRECTORY) - fMode |= S_IFDIR; - else - fMode |= S_IFREG; - if (!(attr & FILE_ATTRIBUTE_READONLY)) - fMode |= S_IWRITE; - return fMode; -} +extern int file_attr_to_st_mode (DWORD attr, DWORD tag); static inline int get_file_attr(const char *fname, WIN32_FILE_ATTRIBUTE_DATA *fdata) { From e92c12c8275baedff821df5d3670818037d28274 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 13 Nov 2025 11:23:29 +0100 Subject: [PATCH 137/218] ci(macos): skip the `git p4` tests Historically, the macOS jobs have always been among the longest-running ones, and recently the `git p4` tests became another liability: They started to fail much more often (maybe as of the switch away from the `macos-13` pool?), requiring re-runs 
of the jobs that already were responsible for long CI build times. Of the 35 test scripts that exercise `git p4`, 32 are actually run on macOS (3 are skipped for reasons like case-sensitive filesystem), and they take an accumulated runtime of over half an hour. Furthermore, the `git p4` command is not really affected by Git for Windows' patches, at least not as far as macOS is concerned, therefore it is not only causing developer friction to have these long-running, frequently failing tests, it is also quite wasteful: There has not been a single instance so far where any `git p4` test failure in Git for Windows had demonstrated an actionable bug. While upstream Git is confident to have addressed the flakiness of the `git p4` tests via ffff0bb0dac1 (Use Perforce arm64 binary on macOS CI jobs, 2025-11-16) (which got slipped in at the 11th hour into the v2.52.0 release, fast-tracked without ever hitting `seen` even after -rc2 was released), I am not quite so confident, and besides, the runtime penalty of running those tests in Git for Windows' CI runs is still a worrisome burden. So let's just disable those tests in the CI runs, at least on macOS.
Signed-off-by: Johannes Schindelin --- ci/install-dependencies.sh | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh index c55441d9df91fd..745e39997935a4 100755 --- a/ci/install-dependencies.sh +++ b/ci/install-dependencies.sh @@ -119,11 +119,12 @@ macos-*) # brew install gnu-time brew link --force gettext - mkdir -p "$CUSTOM_PATH" - wget -q "$P4WHENCE/bin.macosx12arm64/helix-core-server.tgz" && - tar -xf helix-core-server.tgz -C "$CUSTOM_PATH" p4 p4d && - sudo xattr -d com.apple.quarantine "$CUSTOM_PATH/p4" "$CUSTOM_PATH/p4d" 2>/dev/null || true - rm helix-core-server.tgz + # Uncomment this block if you want to run `git p4` tests: + # mkdir -p "$CUSTOM_PATH" + # wget -q "$P4WHENCE/bin.macosx12arm64/helix-core-server.tgz" && + # tar -xf helix-core-server.tgz -C "$CUSTOM_PATH" p4 p4d && + # sudo xattr -d com.apple.quarantine "$CUSTOM_PATH/p4" "$CUSTOM_PATH/p4d" 2>/dev/null || true + # rm helix-core-server.tgz case "$jobname" in osx-meson) From cc8be7bbd3ad79451aa5bbc7323d4aef43992563 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 23 Feb 2018 02:50:03 +0100 Subject: [PATCH 138/218] mingw (git_terminal_prompt): do fall back to CONIN$/CONOUT$ method To support Git Bash running in a MinTTY, we use a dirty trick to access the MSYS2 pseudo terminal: we execute a Bash snippet that accesses /dev/tty. The idea was to fall back to writing to/reading from CONOUT$/CONIN$ if that Bash call failed because Bash was not found. However, we should fall back even in other error conditions, because we have not successfully read the user input. Let's make it so. 
Signed-off-by: Johannes Schindelin --- compat/terminal.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/compat/terminal.c b/compat/terminal.c index a89c5cd9ccf604..882b027e41e52b 100644 --- a/compat/terminal.c +++ b/compat/terminal.c @@ -483,7 +483,7 @@ char *git_terminal_prompt(const char *prompt, int echo) /* try shell_prompt first, fall back to CONIN/OUT if bash is missing */ char *result = shell_prompt(prompt, echo); - if (result || errno != ENOENT) + if (result) return result; #endif From 938fb92539207a40201c69d96ee7450c58c6640f Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 23 Apr 2018 23:20:00 +0200 Subject: [PATCH 139/218] mingw: Windows Docker volumes are *not* symbolic links ... even if they may look like them. As looking up the target of the "symbolic link" (just to see whether it starts with `/ContainerMappedDirectories/`) is pretty expensive, we do it when we can be *really* sure that there is a possibility that this might be the case. Signed-off-by: Johannes Schindelin Signed-off-by: JiSeop Moon --- compat/mingw.c | 25 +++++++++++++++++++------ compat/win32.h | 2 +- 2 files changed, 20 insertions(+), 7 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index b6ce98dbe2181b..f868c29e883a32 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1224,7 +1224,7 @@ int mingw_lstat(const char *file_name, struct stat *buf) buf->st_uid = 0; buf->st_nlink = 1; buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes, - reparse_tag); + reparse_tag, file_name); buf->st_size = S_ISLNK(buf->st_mode) ? 
link_len : fdata.nFileSizeLow | (((off_t) fdata.nFileSizeHigh) << 32); buf->st_dev = buf->st_rdev = 0; /* not used by Git */ @@ -1273,7 +1273,7 @@ static int get_file_info_by_handle(HANDLE hnd, struct stat *buf) buf->st_gid = 0; buf->st_uid = 0; buf->st_nlink = 1; - buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes, 0); + buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes, 0, NULL); buf->st_size = fdata.nFileSizeLow | (((off_t)fdata.nFileSizeHigh)<<32); buf->st_dev = buf->st_rdev = 0; /* not used by Git */ @@ -3753,12 +3753,25 @@ int is_inside_windows_container(void) return inside_container; } -int file_attr_to_st_mode (DWORD attr, DWORD tag) +int file_attr_to_st_mode (DWORD attr, DWORD tag, const char *path) { int fMode = S_IREAD; - if ((attr & FILE_ATTRIBUTE_REPARSE_POINT) && tag == IO_REPARSE_TAG_SYMLINK) - fMode |= S_IFLNK; - else if (attr & FILE_ATTRIBUTE_DIRECTORY) + if ((attr & FILE_ATTRIBUTE_REPARSE_POINT) && + tag == IO_REPARSE_TAG_SYMLINK) { + int flag = S_IFLNK; + char buf[MAX_LONG_PATH]; + + /* + * Windows containers' mapped volumes are marked as reparse + * points and look like symbolic links, but they are not. 
+ */ + if (path && is_inside_windows_container() && + readlink(path, buf, sizeof(buf)) > 27 && + starts_with(buf, "/ContainerMappedDirectories/")) + flag = S_IFDIR; + + fMode |= flag; + } else if (attr & FILE_ATTRIBUTE_DIRECTORY) fMode |= S_IFDIR; else fMode |= S_IFREG; diff --git a/compat/win32.h b/compat/win32.h index 52169ae19f4371..299f01bdf0f5a4 100644 --- a/compat/win32.h +++ b/compat/win32.h @@ -6,7 +6,7 @@ #include #endif -extern int file_attr_to_st_mode (DWORD attr, DWORD tag); +extern int file_attr_to_st_mode (DWORD attr, DWORD tag, const char *path); static inline int get_file_attr(const char *fname, WIN32_FILE_ATTRIBUTE_DATA *fdata) { From 697e3979cadd4e8871b0ca67d9a8cab0aa507e02 Mon Sep 17 00:00:00 2001 From: Bert Belder Date: Fri, 26 Oct 2018 11:13:45 +0200 Subject: [PATCH 140/218] Win32: symlink: move phantom symlink creation to a separate function Signed-off-by: Bert Belder --- compat/mingw.c | 91 +++++++++++++++++++++++++++----------------------- 1 file changed, 49 insertions(+), 42 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index feefa2cd0eb12a..2768e40f78b199 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -447,6 +447,54 @@ static void process_phantom_symlinks(void) LeaveCriticalSection(&phantom_symlinks_cs); } +static int create_phantom_symlink(wchar_t *wtarget, wchar_t *wlink) +{ + int len; + + /* create file symlink */ + if (!CreateSymbolicLinkW(wlink, wtarget, symlink_file_flags)) { + errno = err_win_to_posix(GetLastError()); + return -1; + } + + /* convert to directory symlink if target exists */ + switch (process_phantom_symlink(wtarget, wlink)) { + case PHANTOM_SYMLINK_RETRY: { + /* if target doesn't exist, add to phantom symlinks list */ + wchar_t wfullpath[MAX_LONG_PATH]; + struct phantom_symlink_info *psi; + + /* convert to absolute path to be independent of cwd */ + len = GetFullPathNameW(wlink, MAX_LONG_PATH, wfullpath, NULL); + if (!len || len >= MAX_LONG_PATH) { + errno = err_win_to_posix(GetLastError()); + return 
-1; + } + + /* over-allocate and fill phantom_symlink_info structure */ + psi = xmalloc(sizeof(struct phantom_symlink_info) + + sizeof(wchar_t) * (len + wcslen(wtarget) + 2)); + psi->wlink = (wchar_t *)(psi + 1); + wcscpy(psi->wlink, wfullpath); + psi->wtarget = psi->wlink + len + 1; + wcscpy(psi->wtarget, wtarget); + + EnterCriticalSection(&phantom_symlinks_cs); + psi->next = phantom_symlinks; + phantom_symlinks = psi; + LeaveCriticalSection(&phantom_symlinks_cs); + break; + } + case PHANTOM_SYMLINK_DIRECTORY: + /* if we created a dir symlink, process other phantom symlinks */ + process_phantom_symlinks(); + break; + default: + break; + } + return 0; +} + /* Normalizes NT paths as returned by some low-level APIs. */ static wchar_t *normalize_ntpath(wchar_t *wbuf) { @@ -2897,48 +2945,7 @@ int symlink(const char *target, const char *link) if (wtarget[len] == '/') wtarget[len] = '\\'; - /* create file symlink */ - if (!CreateSymbolicLinkW(wlink, wtarget, symlink_file_flags)) { - errno = err_win_to_posix(GetLastError()); - return -1; - } - - /* convert to directory symlink if target exists */ - switch (process_phantom_symlink(wtarget, wlink)) { - case PHANTOM_SYMLINK_RETRY: { - /* if target doesn't exist, add to phantom symlinks list */ - wchar_t wfullpath[MAX_PATH]; - struct phantom_symlink_info *psi; - - /* convert to absolute path to be independent of cwd */ - len = GetFullPathNameW(wlink, MAX_PATH, wfullpath, NULL); - if (!len || len >= MAX_PATH) { - errno = err_win_to_posix(GetLastError()); - return -1; - } - - /* over-allocate and fill phantom_symlink_info structure */ - psi = xmalloc(sizeof(struct phantom_symlink_info) - + sizeof(wchar_t) * (len + wcslen(wtarget) + 2)); - psi->wlink = (wchar_t *)(psi + 1); - wcscpy(psi->wlink, wfullpath); - psi->wtarget = psi->wlink + len + 1; - wcscpy(psi->wtarget, wtarget); - - EnterCriticalSection(&phantom_symlinks_cs); - psi->next = phantom_symlinks; - phantom_symlinks = psi; - LeaveCriticalSection(&phantom_symlinks_cs); - 
break; - } - case PHANTOM_SYMLINK_DIRECTORY: - /* if we created a dir symlink, process other phantom symlinks */ - process_phantom_symlinks(); - break; - default: - break; - } - return 0; + return create_phantom_symlink(wtarget, wlink); } int readlink(const char *path, char *buf, size_t bufsiz) From 192974520a259f75edcb1076ab69dad4c7b9dd75 Mon Sep 17 00:00:00 2001 From: David Lomas Date: Fri, 28 Jul 2023 15:20:43 +0100 Subject: [PATCH 141/218] mingw: work around rename() failing on a read-only file At least on _some_ APFS network shares, Git fails to rename the object files because they are marked as read-only, because that has the effect of setting the uchg flag on APFS, which then means the file can't be renamed or deleted. To work around that, when a rename failed, and the read-only flag is set, try to turn it off and on again. This fixes https://github.com/git-for-windows/git/issues/4482 Signed-off-by: David Lomas Signed-off-by: Johannes Schindelin --- compat/mingw.c | 25 +++++++++++++++++++------ 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index f868c29e883a32..52821d5496bdc4 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2472,7 +2472,7 @@ int mingw_accept(int sockfd1, struct sockaddr *sa, socklen_t *sz) int mingw_rename(const char *pold, const char *pnew) { static int supports_file_rename_info_ex = 1; - DWORD attrs = INVALID_FILE_ATTRIBUTES, gle; + DWORD attrs = INVALID_FILE_ATTRIBUTES, gle, attrsold; int tries = 0; wchar_t wpold[MAX_PATH], wpnew[MAX_PATH]; int wpnew_len; @@ -2564,11 +2564,24 @@ int mingw_rename(const char *pold, const char *pnew) gle = GetLastError(); } - if (gle == ERROR_ACCESS_DENIED && is_inside_windows_container()) { - /* Fall back to copy to destination & remove source */ - if (CopyFileW(wpold, wpnew, FALSE) && !mingw_unlink(pold, 1)) - return 0; - gle = GetLastError(); + if (gle == ERROR_ACCESS_DENIED) { + if (is_inside_windows_container()) { + /* Fall back to copy to destination 
& remove source */ + if (CopyFileW(wpold, wpnew, FALSE) && !mingw_unlink(pold, 1)) + return 0; + gle = GetLastError(); + } else if ((attrsold = GetFileAttributesW(wpold)) & FILE_ATTRIBUTE_READONLY) { + /* if file is read-only, change and retry */ + SetFileAttributesW(wpold, attrsold & ~FILE_ATTRIBUTE_READONLY); + if (MoveFileExW(wpold, wpnew, + MOVEFILE_REPLACE_EXISTING | MOVEFILE_COPY_ALLOWED)) { + SetFileAttributesW(wpnew, attrsold); + return 0; + } + gle = GetLastError(); + /* revert attribute change on failure */ + SetFileAttributesW(wpold, attrsold); + } } /* revert file attributes on failure */ From 084d73aed035802b73fbef912a472bd1d37f8a8f Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 11 Feb 2019 14:19:18 +0100 Subject: [PATCH 142/218] Introduce helper to create symlinks that knows about index_state On Windows, symbolic links actually have a type depending on the target: it can be a file or a directory. In certain circumstances, this poses problems, e.g. when a symbolic link is supposed to point into a submodule that is not checked out, so there is no way for Git to auto-detect the type. To help with that, we will add support over the course of the next commits to specify that symlink type via the Git attributes. This requires an index_state, though, something that Git for Windows' `symlink()` replacement cannot know about because the function signature is defined by the POSIX standard and not ours to change. So let's introduce a helper function to create symbolic links that *does* know about the index_state. 
Signed-off-by: Johannes Schindelin --- apply.c | 2 +- builtin/difftool.c | 2 +- compat/mingw-posix.h | 4 +++- compat/mingw.c | 2 +- entry.c | 2 +- git-compat-util.h | 10 ++++++++++ refs/files-backend.c | 2 +- setup.c | 4 ++-- 8 files changed, 20 insertions(+), 8 deletions(-) diff --git a/apply.c b/apply.c index 4aa1694cfaa2f0..986b17a8a65c37 100644 --- a/apply.c +++ b/apply.c @@ -4515,7 +4515,7 @@ static int try_create_file(struct apply_state *state, const char *path, /* Although buf:size is counted string, it also is NUL * terminated. */ - return !!symlink(buf, path); + return !!create_symlink(state && state->repo ? state->repo->index : NULL, buf, path); fd = open(path, O_CREAT | O_EXCL | O_WRONLY, (mode & 0100) ? 0777 : 0666); if (fd < 0) diff --git a/builtin/difftool.c b/builtin/difftool.c index e4bc1f831696a8..8d10e2489f088e 100644 --- a/builtin/difftool.c +++ b/builtin/difftool.c @@ -544,7 +544,7 @@ static int run_dir_diff(struct repository *repo, } add_path(&wtdir, wtdir_len, dst_path); if (dt_options->symlinks) { - if (symlink(wtdir.buf, rdir.buf)) { + if (create_symlink(lstate.istate, wtdir.buf, rdir.buf)) { ret = error_errno("could not symlink '%s' to '%s'", wtdir.buf, rdir.buf); goto finish; } diff --git a/compat/mingw-posix.h b/compat/mingw-posix.h index 2d989fd762474e..2c96303a83e720 100644 --- a/compat/mingw-posix.h +++ b/compat/mingw-posix.h @@ -193,8 +193,10 @@ int setitimer(int type, struct itimerval *in, struct itimerval *out); int sigaction(int sig, struct sigaction *in, struct sigaction *out); int link(const char *oldpath, const char *newpath); int uname(struct utsname *buf); -int symlink(const char *target, const char *link); int readlink(const char *path, char *buf, size_t bufsiz); +struct index_state; +int mingw_create_symlink(struct index_state *index, const char *target, const char *link); +#define create_symlink mingw_create_symlink /* * replacements of existing functions diff --git a/compat/mingw.c b/compat/mingw.c index 
2768e40f78b199..cb9b28b2d9cb27 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2925,7 +2925,7 @@ int link(const char *oldpath, const char *newpath) return 0; } -int symlink(const char *target, const char *link) +int mingw_create_symlink(struct index_state *index UNUSED, const char *target, const char *link) { wchar_t wtarget[MAX_PATH], wlink[MAX_PATH]; int len; diff --git a/entry.c b/entry.c index 7817aee362ed9e..89f2d8ef141027 100644 --- a/entry.c +++ b/entry.c @@ -324,7 +324,7 @@ static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca if (!has_symlinks || to_tempfile) goto write_file_entry; - ret = symlink(new_blob, path); + ret = create_symlink(state->istate, new_blob, path); free(new_blob); if (ret) return error_errno("unable to create symlink %s", path); diff --git a/git-compat-util.h b/git-compat-util.h index ae1bdc90a4cd6a..7dce2787ae4d22 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -346,6 +346,16 @@ static inline int git_has_dir_sep(const char *path) #define has_dir_sep(path) git_has_dir_sep(path) #endif +#ifndef create_symlink +struct index_state; +static inline int git_create_symlink(struct index_state *index UNUSED, + const char *target, const char *link) +{ + return symlink(target, link); +} +#define create_symlink git_create_symlink +#endif + #ifndef query_user_email #define query_user_email() NULL #endif diff --git a/refs/files-backend.c b/refs/files-backend.c index b3b0c25f84e503..80985f5e06d87d 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -2112,7 +2112,7 @@ static int create_ref_symlink(struct ref_lock *lock, const char *target) ref_path = get_locked_file_path(&lock->lk); unlink(ref_path); - ret = symlink(target, ref_path); + ret = create_symlink(NULL, target, ref_path); free(ref_path); if (ret) diff --git a/setup.c b/setup.c index 7ec4427368a2a7..71f276b620139f 100644 --- a/setup.c +++ b/setup.c @@ -2291,7 +2291,7 @@ static void copy_templates_1(struct strbuf *path, struct strbuf 
*template_path, if (strbuf_readlink(&lnk, template_path->buf, st_template.st_size) < 0) die_errno(_("cannot readlink '%s'"), template_path->buf); - if (symlink(lnk.buf, path->buf)) + if (create_symlink(NULL, lnk.buf, path->buf)) die_errno(_("cannot symlink '%s' '%s'"), lnk.buf, path->buf); strbuf_release(&lnk); @@ -2570,7 +2570,7 @@ static int create_default_files(const char *template_path, repo_git_path_replace(the_repository, &path, "tXXXXXX"); if (!close(xmkstemp(path.buf)) && !unlink(path.buf) && - !symlink("testing", path.buf) && + !create_symlink(NULL, "testing", path.buf) && !lstat(path.buf, &st1) && S_ISLNK(st1.st_mode)) unlink(path.buf); /* good */ From d9a41f95ee02306371072c245c1e8a31a4888fbd Mon Sep 17 00:00:00 2001 From: Bert Belder Date: Fri, 26 Oct 2018 11:51:51 +0200 Subject: [PATCH 143/218] mingw: allow to specify the symlink type in .gitattributes On Windows, symbolic links have a type: a "file symlink" must point at a file, and a "directory symlink" must point at a directory. If the type of symlink does not match its target, it doesn't work. Git does not record the type of symlink in the index or in a tree. On checkout it'll guess the type, which only works if the target exists at the time the symlink is created. This may often not be the case, for example when the link points at a directory inside a submodule. By specifying `symlink=file` or `symlink=dir` the user can specify what type of symlink Git should create, so Git doesn't have to rely on unreliable heuristics. Signed-off-by: Bert Belder Signed-off-by: Johannes Schindelin --- Documentation/gitattributes.adoc | 30 ++++++++++++++++ compat/mingw.c | 60 ++++++++++++++++++++++++++++++-- 2 files changed, 88 insertions(+), 2 deletions(-) diff --git a/Documentation/gitattributes.adoc b/Documentation/gitattributes.adoc index f20041a323d174..7794bf0fd98dad 100644 --- a/Documentation/gitattributes.adoc +++ b/Documentation/gitattributes.adoc @@ -403,6 +403,36 @@ sign `$` upon checkout. 
Any byte sequence that begins with with `$Id$` upon check-in. +`symlink` +^^^^^^^^^ + +On Windows, symbolic links have a type: a "file symlink" must point at +a file, and a "directory symlink" must point at a directory. If the +type of symlink does not match its target, it doesn't work. + +Git does not record the type of symlink in the index or in a tree. On +checkout it'll guess the type, which only works if the target exists +at the time the symlink is created. This may often not be the case, +for example when the link points at a directory inside a submodule. + +The `symlink` attribute allows you to explicitly set the type of symlink +to `file` or `dir`, so Git doesn't have to guess. If you have a set of +symlinks that point at other files, you can do: + +------------------------ +*.gif symlink=file +------------------------ + +To tell Git that a symlink points at a directory, use: + +------------------------ +tools_folder symlink=dir +------------------------ + +The `symlink` attribute is ignored on platforms other than Windows, +since they don't distinguish between different types of symlinks. 
+ + `filter` ^^^^^^^^ diff --git a/compat/mingw.c b/compat/mingw.c index cb9b28b2d9cb27..7b4f4254987dd2 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -4,6 +4,7 @@ #include "git-compat-util.h" #include "abspath.h" #include "alloc.h" +#include "attr.h" #include "config.h" #include "dir.h" #include "environment.h" @@ -2925,7 +2926,38 @@ int link(const char *oldpath, const char *newpath) return 0; } -int mingw_create_symlink(struct index_state *index UNUSED, const char *target, const char *link) +enum symlink_type { + SYMLINK_TYPE_UNSPECIFIED = 0, + SYMLINK_TYPE_FILE, + SYMLINK_TYPE_DIRECTORY, +}; + +static enum symlink_type check_symlink_attr(struct index_state *index, const char *link) +{ + static struct attr_check *check; + const char *value; + + if (!index) + return SYMLINK_TYPE_UNSPECIFIED; + + if (!check) + check = attr_check_initl("symlink", NULL); + + git_check_attr(index, link, check); + + value = check->items[0].value; + if (ATTR_UNSET(value)) + return SYMLINK_TYPE_UNSPECIFIED; + if (!strcmp(value, "file")) + return SYMLINK_TYPE_FILE; + if (!strcmp(value, "dir") || !strcmp(value, "directory")) + return SYMLINK_TYPE_DIRECTORY; + + warning(_("ignoring invalid symlink type '%s' for '%s'"), value, link); + return SYMLINK_TYPE_UNSPECIFIED; +} + +int mingw_create_symlink(struct index_state *index, const char *target, const char *link) { wchar_t wtarget[MAX_PATH], wlink[MAX_PATH]; int len; @@ -2945,7 +2977,31 @@ int mingw_create_symlink(struct index_state *index UNUSED, const char *target, c if (wtarget[len] == '/') wtarget[len] = '\\'; - return create_phantom_symlink(wtarget, wlink); + switch (check_symlink_attr(index, link)) { + case SYMLINK_TYPE_UNSPECIFIED: + /* Create a phantom symlink: it is initially created as a file + * symlink, but may change to a directory symlink later if/when + * the target exists. 
*/ + return create_phantom_symlink(wtarget, wlink); + case SYMLINK_TYPE_FILE: + if (!CreateSymbolicLinkW(wlink, wtarget, symlink_file_flags)) + break; + return 0; + case SYMLINK_TYPE_DIRECTORY: + if (!CreateSymbolicLinkW(wlink, wtarget, + symlink_directory_flags)) + break; + /* There may be dangling phantom symlinks that point at this + * one, which should now morph into directory symlinks. */ + process_phantom_symlinks(); + return 0; + default: + BUG("unhandled symlink type"); + } + + /* CreateSymbolicLinkW failed. */ + errno = err_win_to_posix(GetLastError()); + return -1; } int readlink(const char *path, char *buf, size_t bufsiz) From b95bcb1a09d536462b7a8daf35745df31227e26f Mon Sep 17 00:00:00 2001 From: Bert Belder Date: Fri, 26 Oct 2018 23:42:09 +0200 Subject: [PATCH 144/218] Win32: symlink: add test for `symlink` attribute To verify that the symlink is resolved correctly, we use the fact that `git.exe` is a native Win32 program, and that `git.exe config -f ` therefore uses the native symlink resolution. 
Signed-off-by: Bert Belder Signed-off-by: Johannes Schindelin --- t/meson.build | 1 + t/t2040-checkout-symlink-attr.sh | 46 ++++++++++++++++++++++++++++++++ 2 files changed, 47 insertions(+) create mode 100755 t/t2040-checkout-symlink-attr.sh diff --git a/t/meson.build b/t/meson.build index 7528e5cda5fef0..6e77422ae808fa 100644 --- a/t/meson.build +++ b/t/meson.build @@ -274,6 +274,7 @@ integration_tests = [ 't2026-checkout-pathspec-file.sh', 't2027-checkout-track.sh', 't2030-unresolve-info.sh', + 't2040-checkout-symlink-attr.sh', 't2050-git-dir-relative.sh', 't2060-switch.sh', 't2070-restore.sh', diff --git a/t/t2040-checkout-symlink-attr.sh b/t/t2040-checkout-symlink-attr.sh new file mode 100755 index 00000000000000..e00c31d096ce88 --- /dev/null +++ b/t/t2040-checkout-symlink-attr.sh @@ -0,0 +1,46 @@ +#!/bin/sh + +test_description='checkout symlinks with `symlink` attribute on Windows + +Ensures that Git for Windows creates symlinks of the right type, +as specified by the `symlink` attribute in `.gitattributes`.' + +# Tell MSYS to create native symlinks. Without this flag test-lib's +# prerequisite detection for SYMLINKS doesn't detect the right thing. +MSYS=winsymlinks:nativestrict && export MSYS + +. ./test-lib.sh + +if ! test_have_prereq MINGW,SYMLINKS +then + skip_all='skipping $0: MinGW-only test, which requires symlink support.' + test_done +fi + +# Adds a symlink to the index without clobbering the work tree. +cache_symlink () { + sha=$(printf '%s' "$1" | git hash-object --stdin -w) && + git update-index --add --cacheinfo 120000,$sha,"$2" +} + +test_expect_success 'checkout symlinks with attr' ' + cache_symlink file1 file-link && + cache_symlink dir dir-link && + + printf "file-link symlink=file\ndir-link symlink=dir\n" >.gitattributes && + git add .gitattributes && + + git checkout . && + + mkdir dir && + echo "[a]b=c" >file1 && + echo "[x]y=z" >dir/file2 && + + # MSYS2 is very forgiving, it will resolve symlinks even if the + # symlink type is incorrect. 
To make this test meaningful, try + # them with a native, non-MSYS executable, such as `git config`. + test "$(git config -f file-link a.b)" = "c" && + test "$(git config -f dir-link/file2 x.y)" = "z" +' + +test_done From 698ef1f78618247698b41608396a36fb3a4946cb Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 17 May 2017 17:05:09 +0200 Subject: [PATCH 145/218] mingw: kill child processes in a gentler way The TerminateProcess() function does not actually leave the child processes any chance to perform any cleanup operations. This is bad insofar as Git itself expects its signal handlers to run. A symptom is e.g. a left-behind .lock file that would not be left behind if the same operation was run, say, on Linux. To remedy this situation, we use an obscure trick: we inject a thread into the process that needs to be killed and let that thread run the ExitProcess() function with the desired exit status. Thanks J Wyman for describing this trick. The advantage is that the ExitProcess() function lets the atexit handlers run. While this is still different from what Git expects (i.e. running a signal handler), in practice Git sets up signal handlers and atexit handlers that call the same code to clean up after itself. If the gentle method of terminating the process fails, we still fall back to calling TerminateProcess(), but in that case we now also make sure that processes spawned by the spawned process are terminated; TerminateProcess() does not give the spawned process a chance to do so itself. Please note that this change only affects how Git for Windows tries to terminate processes spawned by Git's own executables. Third-party software that *calls* Git and wants to terminate it *still* needs to make sure to imitate this gentle method, otherwise this patch will not have any effect.
Signed-off-by: Johannes Schindelin --- compat/mingw.c | 29 +++++-- compat/win32/exit-process.h | 165 ++++++++++++++++++++++++++++++++++++ 2 files changed, 186 insertions(+), 8 deletions(-) create mode 100644 compat/win32/exit-process.h diff --git a/compat/mingw.c b/compat/mingw.c index feefa2cd0eb12a..bd36d47e7fae9c 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -13,6 +13,7 @@ #include "symlinks.h" #include "trace2.h" #include "win32.h" +#include "win32/exit-process.h" #include "win32/lazyload.h" #include "wrapper.h" #include @@ -2231,16 +2232,28 @@ int mingw_execvp(const char *cmd, char *const *argv) int mingw_kill(pid_t pid, int sig) { if (pid > 0 && sig == SIGTERM) { - HANDLE h = OpenProcess(PROCESS_TERMINATE, FALSE, pid); - - if (TerminateProcess(h, -1)) { + HANDLE h = OpenProcess(PROCESS_CREATE_THREAD | + PROCESS_QUERY_INFORMATION | + PROCESS_VM_OPERATION | PROCESS_VM_WRITE | + PROCESS_VM_READ | PROCESS_TERMINATE, + FALSE, pid); + int ret; + + if (h) + ret = exit_process(h, 128 + sig); + else { + h = OpenProcess(PROCESS_TERMINATE, FALSE, pid); + if (!h) { + errno = err_win_to_posix(GetLastError()); + return -1; + } + ret = terminate_process_tree(h, 128 + sig); + } + if (ret) { + errno = err_win_to_posix(GetLastError()); CloseHandle(h); - return 0; } - - errno = err_win_to_posix(GetLastError()); - CloseHandle(h); - return -1; + return ret; } else if (pid > 0 && sig == 0) { HANDLE h = OpenProcess(PROCESS_QUERY_INFORMATION, FALSE, pid); if (h) { diff --git a/compat/win32/exit-process.h b/compat/win32/exit-process.h new file mode 100644 index 00000000000000..d53989884cfb0c --- /dev/null +++ b/compat/win32/exit-process.h @@ -0,0 +1,165 @@ +#ifndef EXIT_PROCESS_H +#define EXIT_PROCESS_H + +/* + * This file contains functions to terminate a Win32 process, as gently as + * possible. + * + * At first, we will attempt to inject a thread that calls ExitProcess(). If + * that fails, we will fall back to terminating the entire process tree. 
+ * + * For simplicity, these functions are marked as file-local. + */ + +#include <tlhelp32.h> + +/* + * Terminates the process corresponding to the process ID and all of its + * directly and indirectly spawned subprocesses. + * + * This way of terminating the processes is not gentle: the processes get + * no chance of cleaning up after themselves (closing file handles, removing + * .lock files, terminating spawned processes (if any), etc). + */ +static int terminate_process_tree(HANDLE main_process, int exit_status) +{ + HANDLE snapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0); + PROCESSENTRY32 entry; + DWORD pids[16384]; + int max_len = sizeof(pids) / sizeof(*pids), i, len, ret = 0; + pid_t pid = GetProcessId(main_process); + + pids[0] = (DWORD)pid; + len = 1; + + /* + * Even if Process32First()/Process32Next() seem to traverse the + * processes in topological order (i.e. parent processes before + * child processes), there is nothing in the Win32 API documentation + * suggesting that this is guaranteed. + * + * Therefore, run through them at least twice and stop when no more + * process IDs were added to the list.
+ */ + for (;;) { + int orig_len = len; + + memset(&entry, 0, sizeof(entry)); + entry.dwSize = sizeof(entry); + + if (!Process32First(snapshot, &entry)) + break; + + do { + for (i = len - 1; i >= 0; i--) { + if (pids[i] == entry.th32ProcessID) + break; + if (pids[i] == entry.th32ParentProcessID) + pids[len++] = entry.th32ProcessID; + } + } while (len < max_len && Process32Next(snapshot, &entry)); + + if (orig_len == len || len >= max_len) + break; + } + + for (i = len - 1; i > 0; i--) { + HANDLE process = OpenProcess(PROCESS_TERMINATE, FALSE, pids[i]); + + if (process) { + if (!TerminateProcess(process, exit_status)) + ret = -1; + CloseHandle(process); + } + } + if (!TerminateProcess(main_process, exit_status)) + ret = -1; + CloseHandle(main_process); + + return ret; +} + +/** + * Determine whether a process runs in the same architecture as the current + * one. That test is required before we assume that GetProcAddress() returns + * a valid address *for the target process*. + */ +static inline int process_architecture_matches_current(HANDLE process) +{ + static BOOL current_is_wow = -1; + BOOL is_wow; + + if (current_is_wow == -1 && + !IsWow64Process (GetCurrentProcess(), &current_is_wow)) + current_is_wow = -2; + if (current_is_wow == -2) + return 0; /* could not determine current process' WoW-ness */ + if (!IsWow64Process (process, &is_wow)) + return 0; /* cannot determine */ + return is_wow == current_is_wow; +} + +/** + * Inject a thread into the given process that runs ExitProcess(). + * + * Note: as kernel32.dll is loaded before any process, the other process and + * this process will have ExitProcess() at the same address. + * + * This function expects the process handle to have the access rights for + * CreateRemoteThread(): PROCESS_CREATE_THREAD, PROCESS_QUERY_INFORMATION, + * PROCESS_VM_OPERATION, PROCESS_VM_WRITE, and PROCESS_VM_READ.
+ * + * The idea comes from the Dr Dobb's article "A Safer Alternative to + * TerminateProcess()" by Andrew Tucker (July 1, 1999), + * http://www.drdobbs.com/a-safer-alternative-to-terminateprocess/184416547 + * + * If this method fails, we fall back to running terminate_process_tree(). + */ +static int exit_process(HANDLE process, int exit_code) +{ + DWORD code; + + if (GetExitCodeProcess(process, &code) && code == STILL_ACTIVE) { + static int initialized; + static LPTHREAD_START_ROUTINE exit_process_address; + PVOID arg = (PVOID)(intptr_t)exit_code; + DWORD thread_id; + HANDLE thread = NULL; + + if (!initialized) { + HINSTANCE kernel32 = GetModuleHandleA("kernel32"); + if (!kernel32) + die("BUG: cannot find kernel32"); + exit_process_address = + (LPTHREAD_START_ROUTINE)(void (*)(void)) + GetProcAddress(kernel32, "ExitProcess"); + initialized = 1; + } + if (!exit_process_address || + !process_architecture_matches_current(process)) + return terminate_process_tree(process, exit_code); + + thread = CreateRemoteThread(process, NULL, 0, + exit_process_address, + arg, 0, &thread_id); + if (thread) { + CloseHandle(thread); + /* + * If the process survives for 10 seconds (a completely + * arbitrary value picked from thin air), fall back to + * killing the process tree via TerminateProcess(). + */ + if (WaitForSingleObject(process, 10000) == + WAIT_OBJECT_0) { + CloseHandle(process); + return 0; + } + } + + return terminate_process_tree(process, exit_code); + } + + return 0; +} + +#endif From ed0c744e64ba7e0d11188114e4ebd3af8bfc15b7 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 23 Apr 2018 00:24:29 +0200 Subject: [PATCH 146/218] mingw: really handle SIGINT Previously, we did not install any handler for Ctrl+C, but now we really want to because the MSYS2 runtime learned the trick to call the ConsoleCtrlHandler when Ctrl+C was pressed. With this, hitting Ctrl+C while `git log` is running will only terminate the Git process, but not the pager. 
This finally matches the behavior on Linux and on macOS. Signed-off-by: Johannes Schindelin --- compat/mingw.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/compat/mingw.c b/compat/mingw.c index bd36d47e7fae9c..3363b74ef5ea35 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -3603,7 +3603,14 @@ static void adjust_symlink_flags(void) symlink_file_flags |= 2; symlink_directory_flags |= 2; } +} +static BOOL WINAPI handle_ctrl_c(DWORD ctrl_type) +{ + if (ctrl_type != CTRL_C_EVENT) + return FALSE; /* we did not handle this */ + mingw_raise(SIGINT); + return TRUE; /* we did handle this */ } #ifdef _MSC_VER @@ -3640,6 +3647,8 @@ int wmain(int argc, const wchar_t **wargv) #endif #endif + SetConsoleCtrlHandler(handle_ctrl_c, TRUE); + maybe_redirect_std_handles(); adjust_symlink_flags(); From 0c4517985adc2be2fbf1bb4023f3fe7373dab270 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 7 Dec 2018 13:39:30 +0100 Subject: [PATCH 147/218] clean: do not traverse mount points It seems to be not exactly rare on Windows to install NTFS junction points (the equivalent of "bind mounts" on Linux/Unix) in worktrees, e.g. to map some development tools into a subdirectory. In such a scenario, it is pretty horrible if `git clean -dfx` traverses into the mapped directory and starts to "clean up". Let's just not do that. Let's make sure before we traverse into a directory that it is not a mount point (or junction). 
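For illustration only: on platforms without a native check, the fallback added to `path.c` compares the device IDs of `<dir>/.` and `<dir>/..`. The following is a minimal standalone sketch of that idea, not the code from this patch; the names `lstat_joined` and `looks_like_mount_point` and the fixed-size buffer are assumptions made for the example (the patch itself operates on a `struct strbuf`).

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: lstat() the path formed by concatenating dir and suffix. */
static int lstat_joined(const char *dir, const char *suffix, struct stat *st)
{
	char buf[4096];
	int n = snprintf(buf, sizeof(buf), "%s%s", dir, suffix);

	if (n < 0 || (size_t)n >= sizeof(buf))
		return -1; /* formatting error or path too long for this sketch */
	return lstat(buf, st);
}

/*
 * Hypothetical standalone variant of the stat-based check: returns 1 if
 * `dir` lives on a different device than its parent directory (or is the
 * root directory), and 0 otherwise, including when either lstat() fails.
 */
static int looks_like_mount_point(const char *dir)
{
	struct stat self, parent;

	if (!strcmp(dir, "/"))
		return 1;
	if (lstat_joined(dir, "/.", &self) || lstat_joined(dir, "/..", &parent))
		return 0;
	return self.st_dev != parent.st_dev;
}
```

A caller would skip traversal whenever the function returns 1. Because only `st_dev` is compared, a directory bind-mounted from the same filesystem would not be detected by this sketch.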
This addresses https://github.com/git-for-windows/git/issues/607 Signed-off-by: Johannes Schindelin --- builtin/clean.c | 14 ++++++++++++++ compat/mingw.c | 22 ++++++++++++++++++++++ compat/mingw.h | 3 +++ git-compat-util.h | 4 ++++ path.c | 39 +++++++++++++++++++++++++++++++++++++++ path.h | 1 + t/t7300-clean.sh | 9 +++++++++ 7 files changed, 92 insertions(+) diff --git a/builtin/clean.c b/builtin/clean.c index 1d5e7e5366bf09..e4f2d56d3210ba 100644 --- a/builtin/clean.c +++ b/builtin/clean.c @@ -41,6 +41,8 @@ static const char *msg_remove = N_("Removing %s\n"); static const char *msg_would_remove = N_("Would remove %s\n"); static const char *msg_skip_git_dir = N_("Skipping repository %s\n"); static const char *msg_would_skip_git_dir = N_("Would skip repository %s\n"); +static const char *msg_skip_mount_point = N_("Skipping mount point %s\n"); +static const char *msg_would_skip_mount_point = N_("Would skip mount point %s\n"); static const char *msg_warn_remove_failed = N_("failed to remove %s"); static const char *msg_warn_lstat_failed = N_("could not lstat %s\n"); static const char *msg_skip_cwd = N_("Refusing to remove current working directory\n"); @@ -185,6 +187,18 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag, goto out; } + if (is_mount_point(path)) { + if (!quiet) { + quote_path(path->buf, prefix, &quoted, 0); + printf(dry_run ?
+ _(msg_would_skip_mount_point) : + _(msg_skip_mount_point), quoted.buf); + } + *dir_gone = 0; + + goto out; + } + dir = opendir(path->buf); if (!dir) { /* an empty dir could be removed even if it is unreadble */ diff --git a/compat/mingw.c b/compat/mingw.c index 7b4f4254987dd2..385904d71d9321 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -3080,6 +3080,28 @@ pid_t waitpid(pid_t pid, int *status, int options) return -1; } +int mingw_is_mount_point(struct strbuf *path) +{ + WIN32_FIND_DATAW findbuf = { 0 }; + HANDLE handle; + wchar_t wfilename[MAX_PATH]; + int wlen = xutftowcs_path(wfilename, path->buf); + if (wlen < 0) + die(_("could not get long path for '%s'"), path->buf); + + /* remove trailing slash, if any */ + if (wlen > 0 && wfilename[wlen - 1] == L'/') + wfilename[--wlen] = L'\0'; + + handle = FindFirstFileW(wfilename, &findbuf); + if (handle == INVALID_HANDLE_VALUE) + return 0; + FindClose(handle); + + return (findbuf.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) && + (findbuf.dwReserved0 == IO_REPARSE_TAG_MOUNT_POINT); +} + int xutftowcsn(wchar_t *wcs, const char *utfs, size_t wcslen, int utflen) { int upos = 0, wpos = 0; diff --git a/compat/mingw.h b/compat/mingw.h index 444daedfa52469..af6fc3f12970bf 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -36,6 +36,9 @@ static inline void convert_slashes(char *path) if (*path == '\\') *path = '/'; } +struct strbuf; +int mingw_is_mount_point(struct strbuf *path); +#define is_mount_point mingw_is_mount_point #define PATH_SEP ';' char *mingw_query_user_email(void); #define query_user_email mingw_query_user_email diff --git a/git-compat-util.h b/git-compat-util.h index 7dce2787ae4d22..4ce1e539719c17 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -346,6 +346,10 @@ static inline int git_has_dir_sep(const char *path) #define has_dir_sep(path) git_has_dir_sep(path) #endif +#ifndef is_mount_point +#define is_mount_point is_mount_point_via_stat +#endif + #ifndef create_symlink struct index_state; 
static inline int git_create_symlink(struct index_state *index UNUSED, diff --git a/path.c b/path.c index d7e17bf17404de..d76aa19ef8baf6 100644 --- a/path.c +++ b/path.c @@ -1328,6 +1328,45 @@ char *strip_path_suffix(const char *path, const char *suffix) return offset == -1 ? NULL : xstrndup(path, offset); } +int is_mount_point_via_stat(struct strbuf *path) +{ + size_t len = path->len; + dev_t current_dev; + struct stat st; + + if (!strcmp("/", path->buf)) + return 1; + + strbuf_addstr(path, "/."); + if (lstat(path->buf, &st)) { + /* + * If we cannot access the current directory, we cannot say + * that it is a bind mount. + */ + strbuf_setlen(path, len); + return 0; + } + current_dev = st.st_dev; + + /* Now look at the parent directory */ + strbuf_addch(path, '.'); + if (lstat(path->buf, &st)) { + /* + * If we cannot access the parent directory, we cannot say + * that it is a bind mount. + */ + strbuf_setlen(path, len); + return 0; + } + strbuf_setlen(path, len); + + /* + * If the device ID differs between current and parent directory, + * then it is a bind mount. 
+ */ + return current_dev != st.st_dev; +} + int daemon_avoid_alias(const char *p) { int sl, ndot; diff --git a/path.h b/path.h index 0434ba5e07e806..85713809f63624 100644 --- a/path.h +++ b/path.h @@ -161,6 +161,7 @@ int normalize_path_copy(char *dst, const char *src); int strbuf_normalize_path(struct strbuf *src); int longest_ancestor_length(const char *path, struct string_list *prefixes); char *strip_path_suffix(const char *path, const char *suffix); +int is_mount_point_via_stat(struct strbuf *path); int daemon_avoid_alias(const char *path); /* diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh index 00d4070156243b..7c3a1ca91df534 100755 --- a/t/t7300-clean.sh +++ b/t/t7300-clean.sh @@ -800,4 +800,13 @@ test_expect_success 'traverse into directories that may have ignored entries' ' ) ' +test_expect_success MINGW 'clean does not traverse mount points' ' + mkdir target && + >target/dont-clean-me && + git init with-mountpoint && + cmd //c "mklink /j with-mountpoint\\mountpoint target" && + git -C with-mountpoint clean -dfx && + test_path_is_file target/dont-clean-me +' + test_done From 2293ecb70198ed1ef5fde3f969a6b8472caacff7 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 11 Dec 2018 12:55:26 +0100 Subject: [PATCH 148/218] clean: remove mount points when possible Windows' equivalent to "bind mounts", NTFS junction points, can be unlinked without affecting the mount target. This is clearly what users expect to happen when they call `git clean -dfx` in a worktree that contains NTFS junction points: the junction should be removed, and the target directory of said junction should be left alone (unless it is inside the worktree). 
Signed-off-by: Johannes Schindelin --- builtin/clean.c | 13 +++++++++++++ compat/mingw.h | 1 + t/t7300-clean.sh | 1 + 3 files changed, 15 insertions(+) diff --git a/builtin/clean.c b/builtin/clean.c index e4f2d56d3210ba..6ed555000f9a41 100644 --- a/builtin/clean.c +++ b/builtin/clean.c @@ -41,8 +41,10 @@ static const char *msg_remove = N_("Removing %s\n"); static const char *msg_would_remove = N_("Would remove %s\n"); static const char *msg_skip_git_dir = N_("Skipping repository %s\n"); static const char *msg_would_skip_git_dir = N_("Would skip repository %s\n"); +#ifndef CAN_UNLINK_MOUNT_POINTS static const char *msg_skip_mount_point = N_("Skipping mount point %s\n"); static const char *msg_would_skip_mount_point = N_("Would skip mount point %s\n"); +#endif static const char *msg_warn_remove_failed = N_("failed to remove %s"); static const char *msg_warn_lstat_failed = N_("could not lstat %s\n"); static const char *msg_skip_cwd = N_("Refusing to remove current working directory\n"); @@ -188,6 +190,7 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag, } if (is_mount_point(path)) { +#ifndef CAN_UNLINK_MOUNT_POINTS if (!quiet) { quote_path(path->buf, prefix, &quoted, 0); printf(dry_run ?
@@ -195,6 +198,16 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag, _(msg_skip_mount_point), quoted.buf); } *dir_gone = 0; +#else + if (!dry_run && unlink(path->buf)) { + int saved_errno = errno; + quote_path(path->buf, prefix, &quoted, 0); + errno = saved_errno; + warning_errno(_(msg_warn_remove_failed), quoted.buf); + *dir_gone = 0; + ret = -1; + } +#endif goto out; } diff --git a/compat/mingw.h b/compat/mingw.h index af6fc3f12970bf..fb83cdaf4e982c 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -39,6 +39,7 @@ static inline void convert_slashes(char *path) struct strbuf; int mingw_is_mount_point(struct strbuf *path); #define is_mount_point mingw_is_mount_point +#define CAN_UNLINK_MOUNT_POINTS 1 #define PATH_SEP ';' char *mingw_query_user_email(void); #define query_user_email mingw_query_user_email diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh index 7c3a1ca91df534..6f16f3893191e7 100755 --- a/t/t7300-clean.sh +++ b/t/t7300-clean.sh @@ -806,6 +806,7 @@ test_expect_success MINGW 'clean does not traverse mount points' ' git init with-mountpoint && cmd //c "mklink /j with-mountpoint\\mountpoint target" && git -C with-mountpoint clean -dfx && + test_path_is_missing with-mountpoint/mountpoint && test_path_is_file target/dont-clean-me ' From 48a4cf47087d71aa8ad3513161d1c92b0e86607a Mon Sep 17 00:00:00 2001 From: xungeng li Date: Wed, 7 Jun 2023 20:26:33 +0800 Subject: [PATCH 149/218] mingw: optionally enable WSL compatibility file mode bits The Windows Subsystem for Linux (WSL) version 2 allows using `chmod` on NTFS volumes provided that they are mounted with metadata enabled (see https://devblogs.microsoft.com/commandline/chmod-chown-wsl-improvements/ for details), for example: $ chmod 0755 /mnt/d/test/a.sh In order to facilitate better collaboration between the Windows version of Git and the WSL version of Git, we can make the Windows version of Git also support reading and writing NTFS file modes in a manner compatible with WSL.
Since this slightly slows down operations where lots of files are created (such as an initial checkout), this feature is only enabled when `core.WSLCompat` is set to true. Note that you also have to set `core.fileMode=true` in repositories that have been initialized without enabling WSL compatibility. There are several ways to enable metadata loading for NTFS volumes in WSL, one of which is to modify `/etc/wsl.conf` by adding: ``` [automount] enabled = true options = "metadata,umask=027,fmask=117" ``` And reboot WSL. It can also be enabled temporarily by this incantation: $ sudo umount /mnt/c && sudo mount -t drvfs C: /mnt/c -o metadata,uid=1000,gid=1000,umask=22,fmask=111 It's important to note that this modification is compatible with, but does not depend on, WSL. The helper functions in this commit operate independently and function normally on devices where WSL is not installed or properly configured. Signed-off-by: xungeng li Signed-off-by: Johannes Schindelin --- Documentation/config/core.adoc | 6 ++ compat/mingw.c | 13 +++ compat/win32/wsl.c | 142 ++++++++++++++++++++++++++++ compat/win32/wsl.h | 12 +++ config.mak.uname | 4 +- contrib/buildsystems/CMakeLists.txt | 1 + meson.build | 1 + 7 files changed, 177 insertions(+), 2 deletions(-) create mode 100644 compat/win32/wsl.c create mode 100644 compat/win32/wsl.h diff --git a/Documentation/config/core.adoc b/Documentation/config/core.adoc index a0ebf03e2eb050..c870e7f7fe3a6f 100644 --- a/Documentation/config/core.adoc +++ b/Documentation/config/core.adoc @@ -788,3 +788,9 @@ core.maxTreeDepth:: to allow Git to abort cleanly, and should not generally need to be adjusted. When Git is compiled with MSVC, the default is 512. Otherwise, the default is 2048. + +core.WSLCompat:: + Tells Git whether to enable WSL compatibility mode. + The default value is false. When set to true, Git will set the mode + bits of the file the way WSL does, so that the executable flag of + files can be set or read correctly.
diff --git a/compat/mingw.c b/compat/mingw.c index c592b9d218a66a..72ede93c4aaf21 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -14,6 +14,7 @@ #include "trace2.h" #include "win32.h" #include "win32/lazyload.h" +#include "win32/wsl.h" #include "wrapper.h" #include #include @@ -850,6 +851,11 @@ int mingw_open (const char *filename, int oflags, ...) if (fd < 0 && create && GetLastError() == ERROR_ACCESS_DENIED && INIT_PROC_ADDR(RtlGetLastNtStatus) && RtlGetLastNtStatus() == STATUS_DELETE_PENDING) errno = EEXIST; + else if ((oflags & O_CREAT) && fd >= 0 && are_wsl_compatible_mode_bits_enabled()) { + _mode_t wsl_mode = S_IFREG | (mode&0777); + set_wsl_mode_bits_by_handle((HANDLE)_get_osfhandle(fd), wsl_mode); + } + if (fd < 0 && (oflags & O_ACCMODE) != O_RDONLY && errno == EACCES) { DWORD attrs = GetFileAttributesW(wfilename); if (attrs != INVALID_FILE_ATTRIBUTES && (attrs & FILE_ATTRIBUTE_DIRECTORY)) @@ -1231,6 +1237,11 @@ int mingw_lstat(const char *file_name, struct stat *buf) filetime_to_timespec(&(fdata.ftLastAccessTime), &(buf->st_atim)); filetime_to_timespec(&(fdata.ftLastWriteTime), &(buf->st_mtim)); filetime_to_timespec(&(fdata.ftCreationTime), &(buf->st_ctim)); + if (S_ISREG(buf->st_mode) && + are_wsl_compatible_mode_bits_enabled()) { + copy_wsl_mode_bits_from_disk(wfilename, -1, + &buf->st_mode); + } return 0; } @@ -1280,6 +1291,8 @@ static int get_file_info_by_handle(HANDLE hnd, struct stat *buf) filetime_to_timespec(&(fdata.ftLastAccessTime), &(buf->st_atim)); filetime_to_timespec(&(fdata.ftLastWriteTime), &(buf->st_mtim)); filetime_to_timespec(&(fdata.ftCreationTime), &(buf->st_ctim)); + if (are_wsl_compatible_mode_bits_enabled()) + get_wsl_mode_bits_by_handle(hnd, &buf->st_mode); return 0; } diff --git a/compat/win32/wsl.c b/compat/win32/wsl.c new file mode 100644 index 00000000000000..ab599770138b4e --- /dev/null +++ b/compat/win32/wsl.c @@ -0,0 +1,142 @@ +#define USE_THE_REPOSITORY_VARIABLE +#include "../../git-compat-util.h" +#include 
"../win32.h" +#include "../../repository.h" +#include "config.h" +#include "ntifs.h" +#include "wsl.h" + +int are_wsl_compatible_mode_bits_enabled(void) +{ + /* default to `false` during initialization */ + static const int fallback = 0; + static int enabled = -1; + + if (enabled < 0) { + /* avoid infinite recursion */ + if (!the_repository) + return fallback; + + if (the_repository->config && + the_repository->config->hash_initialized && + repo_config_get_bool(the_repository, "core.wslcompat", &enabled) < 0) + enabled = 0; + } + + return enabled < 0 ? fallback : enabled; +} + +int copy_wsl_mode_bits_from_disk(const wchar_t *wpath, ssize_t wpathlen, + _mode_t *mode) +{ + int ret = -1; + HANDLE h; + if (wpathlen >= 0) { + /* + * It's caller's duty to make sure wpathlen is reasonable so + * it does not overflow. + */ + wchar_t *fn2 = (wchar_t*)alloca((wpathlen + 1) * sizeof(wchar_t)); + memcpy(fn2, wpath, wpathlen * sizeof(wchar_t)); + fn2[wpathlen] = 0; + wpath = fn2; + } + h = CreateFileW(wpath, FILE_READ_EA | SYNCHRONIZE, + FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, + NULL, OPEN_EXISTING, + FILE_FLAG_BACKUP_SEMANTICS | + FILE_FLAG_OPEN_REPARSE_POINT, + NULL); + if (h != INVALID_HANDLE_VALUE) { + ret = get_wsl_mode_bits_by_handle(h, mode); + CloseHandle(h); + } + return ret; +} + +#ifndef LX_FILE_METADATA_HAS_UID +#define LX_FILE_METADATA_HAS_UID 0x1 +#define LX_FILE_METADATA_HAS_GID 0x2 +#define LX_FILE_METADATA_HAS_MODE 0x4 +#define LX_FILE_METADATA_HAS_DEVICE_ID 0x8 +#define LX_FILE_CASE_SENSITIVE_DIR 0x10 +typedef struct _FILE_STAT_LX_INFORMATION { + LARGE_INTEGER FileId; + LARGE_INTEGER CreationTime; + LARGE_INTEGER LastAccessTime; + LARGE_INTEGER LastWriteTime; + LARGE_INTEGER ChangeTime; + LARGE_INTEGER AllocationSize; + LARGE_INTEGER EndOfFile; + uint32_t FileAttributes; + uint32_t ReparseTag; + uint32_t NumberOfLinks; + ACCESS_MASK EffectiveAccess; + uint32_t LxFlags; + uint32_t LxUid; + uint32_t LxGid; + uint32_t LxMode; + uint32_t 
LxDeviceIdMajor; + uint32_t LxDeviceIdMinor; +} FILE_STAT_LX_INFORMATION, *PFILE_STAT_LX_INFORMATION; +#endif + +/* + * This struct is extended from the original FILE_FULL_EA_INFORMATION of + * Microsoft Windows. + */ +struct wsl_full_ea_info_t { + uint32_t NextEntryOffset; + uint8_t Flags; + uint8_t EaNameLength; + uint16_t EaValueLength; + char EaName[7]; + char EaValue[4]; + char Padding[1]; +}; + +enum { + FileStatLxInformation = 70, +}; +__declspec(dllimport) NTSTATUS WINAPI + NtQueryInformationFile(HANDLE FileHandle, + PIO_STATUS_BLOCK IoStatusBlock, + PVOID FileInformation, ULONG Length, + uint32_t FileInformationClass); +__declspec(dllimport) NTSTATUS WINAPI + NtSetInformationFile(HANDLE FileHandle, PIO_STATUS_BLOCK IoStatusBlock, + PVOID FileInformation, ULONG Length, + uint32_t FileInformationClass); +__declspec(dllimport) NTSTATUS WINAPI + NtSetEaFile(HANDLE FileHandle, PIO_STATUS_BLOCK IoStatusBlock, + PVOID EaBuffer, ULONG EaBufferSize); + +int set_wsl_mode_bits_by_handle(HANDLE h, _mode_t mode) +{ + uint32_t value = mode; + struct wsl_full_ea_info_t ea_info; + IO_STATUS_BLOCK iob; + /* mode should be valid to make WSL happy */ + assert(S_ISREG(mode) || S_ISDIR(mode)); + ea_info.NextEntryOffset = 0; + ea_info.Flags = 0; + ea_info.EaNameLength = 6; + ea_info.EaValueLength = sizeof(value); /* 4 */ + strlcpy(ea_info.EaName, "$LXMOD", sizeof(ea_info.EaName)); + memcpy(ea_info.EaValue, &value, sizeof(value)); + ea_info.Padding[0] = 0; + return NtSetEaFile(h, &iob, &ea_info, sizeof(ea_info)); +} + +int get_wsl_mode_bits_by_handle(HANDLE h, _mode_t *mode) +{ + FILE_STAT_LX_INFORMATION fxi; + IO_STATUS_BLOCK iob; + if (NtQueryInformationFile(h, &iob, &fxi, sizeof(fxi), + FileStatLxInformation) == 0) { + if (fxi.LxFlags & LX_FILE_METADATA_HAS_MODE) + *mode = (_mode_t)fxi.LxMode; + return 0; + } + return -1; +} diff --git a/compat/win32/wsl.h b/compat/win32/wsl.h new file mode 100644 index 00000000000000..1f5ad7e67a4fc2 --- /dev/null +++ b/compat/win32/wsl.h @@ 
-0,0 +1,12 @@ +#ifndef COMPAT_WIN32_WSL_H +#define COMPAT_WIN32_WSL_H + +int are_wsl_compatible_mode_bits_enabled(void); + +int copy_wsl_mode_bits_from_disk(const wchar_t *wpath, ssize_t wpathlen, + _mode_t *mode); + +int get_wsl_mode_bits_by_handle(HANDLE h, _mode_t *mode); +int set_wsl_mode_bits_by_handle(HANDLE h, _mode_t mode); + +#endif diff --git a/config.mak.uname b/config.mak.uname index fa8d36f1e31823..83f27b0ad2bf85 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -511,7 +511,7 @@ endif compat/win32/path-utils.o \ compat/win32/pthread.o compat/win32/syslog.o \ compat/win32/trace2_win32_process_info.o \ - compat/win32/dirent.o + compat/win32/dirent.o compat/win32/wsl.o COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DDETECT_MSYS_TTY \ -DENSURE_MSYSTEM_IS_SET="\"$(MSYSTEM)\"" -DMINGW_PREFIX="\"$(patsubst /%,%,$(MINGW_PREFIX))\"" \ -DNOGDI -DHAVE_STRING_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\" @@ -715,7 +715,7 @@ ifeq ($(uname_S),MINGW) compat/win32/flush.o \ compat/win32/path-utils.o \ compat/win32/pthread.o compat/win32/syslog.o \ - compat/win32/dirent.o + compat/win32/dirent.o compat/win32/wsl.o BASIC_CFLAGS += -DWIN32 EXTLIBS += -lws2_32 GITLIBS += git.res diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index e802c21f180ea5..b9516fa6bcdd4f 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -274,6 +274,7 @@ if(CMAKE_SYSTEM_NAME STREQUAL "Windows") compat/win32/syslog.c compat/win32/trace2_win32_process_info.c compat/win32/dirent.c + compat/win32/wsl.c compat/nedmalloc/nedmalloc.c compat/strdup.c) set(NO_UNIX_SOCKETS 1) diff --git a/meson.build b/meson.build index 7a6b54df6bec84..8520e16472d7ae 100644 --- a/meson.build +++ b/meson.build @@ -1284,6 +1284,7 @@ elif host_machine.system() == 'windows' 'compat/win32/path-utils.c', 'compat/win32/pthread.c', 'compat/win32/syslog.c', + 'compat/win32/wsl.c', 'compat/win32mmap.c', 'compat/nedmalloc/nedmalloc.c', 
] From 6d5d3b6169d2f3fc1d218ab0fb36689745f660a0 Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Sat, 6 Jul 2013 02:09:35 +0200 Subject: [PATCH 150/218] Win32: make FILETIME conversion functions public We will use them in the upcoming "FSCache" patches (to accelerate sequential lstat() calls). Signed-off-by: Karsten Blees Signed-off-by: Johannes Schindelin --- compat/mingw-posix.h | 18 ++++++++++++++++++ compat/mingw.c | 18 ------------------ 2 files changed, 18 insertions(+), 18 deletions(-) diff --git a/compat/mingw-posix.h b/compat/mingw-posix.h index 9dc70e1c49cdf5..96df96f8e49c14 100644 --- a/compat/mingw-posix.h +++ b/compat/mingw-posix.h @@ -340,6 +340,17 @@ static inline int getrlimit(int resource, struct rlimit *rlp) return 0; } +/* + * The unit of FILETIME is 100-nanoseconds since January 1, 1601, UTC. + * Returns the 100-nanoseconds ("hekto nanoseconds") since the epoch. + */ +static inline long long filetime_to_hnsec(const FILETIME *ft) +{ + long long winTime = ((long long)ft->dwHighDateTime << 32) + ft->dwLowDateTime; + /* Windows to Unix Epoch conversion */ + return winTime - 116444736000000000LL; +} + /* * Use mingw specific stat()/lstat()/fstat() implementations on Windows, * including our own struct stat with 64 bit st_size and nanosecond-precision @@ -356,6 +367,13 @@ struct timespec { #endif #endif +static inline void filetime_to_timespec(const FILETIME *ft, struct timespec *ts) +{ + long long hnsec = filetime_to_hnsec(ft); + ts->tv_sec = (time_t)(hnsec / 10000000); + ts->tv_nsec = (hnsec % 10000000) * 100; +} + struct mingw_stat { _dev_t st_dev; _ino_t st_ino; diff --git a/compat/mingw.c b/compat/mingw.c index 6cc34e13f0ad2c..89b39106514933 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1180,24 +1180,6 @@ int mingw_chmod(const char *filename, int mode) return _wchmod(wfilename, mode); } -/* - * The unit of FILETIME is 100-nanoseconds since January 1, 1601, UTC. - * Returns the 100-nanoseconds ("hekto nanoseconds") since the epoch. 
- */ -static inline long long filetime_to_hnsec(const FILETIME *ft) -{ - long long winTime = ((long long)ft->dwHighDateTime << 32) + ft->dwLowDateTime; - /* Windows to Unix Epoch conversion */ - return winTime - 116444736000000000LL; -} - -static inline void filetime_to_timespec(const FILETIME *ft, struct timespec *ts) -{ - long long hnsec = filetime_to_hnsec(ft); - ts->tv_sec = (time_t)(hnsec / 10000000); - ts->tv_nsec = (hnsec % 10000000) * 100; -} - /** * Verifies that safe_create_leading_directories() would succeed. */ From 28d1fc4ead4d45c94852dbb3e8769c180aedf15f Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Sun, 8 Sep 2013 14:17:31 +0200 Subject: [PATCH 151/218] Win32: dirent.c: Move opendir down Move opendir down in preparation for the next patch. Signed-off-by: Karsten Blees --- compat/win32/dirent.c | 68 +++++++++++++++++++++---------------------- 1 file changed, 34 insertions(+), 34 deletions(-) diff --git a/compat/win32/dirent.c b/compat/win32/dirent.c index 24ee9b814d6adf..2f7d64369f9eda 100644 --- a/compat/win32/dirent.c +++ b/compat/win32/dirent.c @@ -21,40 +21,6 @@ static inline void finddata2dirent(struct dirent *ent, WIN32_FIND_DATAW *fdata) ent->d_type = DT_REG; } -DIR *opendir(const char *name) -{ - wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */ - WIN32_FIND_DATAW fdata; - HANDLE h; - int len; - DIR *dir; - - /* convert name to UTF-16 and check length < MAX_PATH */ - if ((len = xutftowcs_path(pattern, name)) < 0) - return NULL; - - /* append optional '/' and wildcard '*' */ - if (len && !is_dir_sep(pattern[len - 1])) - pattern[len++] = '/'; - pattern[len++] = '*'; - pattern[len] = 0; - - /* open find handle */ - h = FindFirstFileW(pattern, &fdata); - if (h == INVALID_HANDLE_VALUE) { - DWORD err = GetLastError(); - errno = (err == ERROR_DIRECTORY) ? 
ENOTDIR : err_win_to_posix(err); - return NULL; - } - - /* initialize DIR structure and copy first dir entry */ - dir = xmalloc(sizeof(DIR)); - dir->dd_handle = h; - dir->dd_stat = 0; - finddata2dirent(&dir->dd_dir, &fdata); - return dir; -} - struct dirent *readdir(DIR *dir) { if (!dir) { @@ -93,3 +59,37 @@ int closedir(DIR *dir) free(dir); return 0; } + +DIR *opendir(const char *name) +{ + wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */ + WIN32_FIND_DATAW fdata; + HANDLE h; + int len; + DIR *dir; + + /* convert name to UTF-16 and check length < MAX_PATH */ + if ((len = xutftowcs_path(pattern, name)) < 0) + return NULL; + + /* append optional '/' and wildcard '*' */ + if (len && !is_dir_sep(pattern[len - 1])) + pattern[len++] = '/'; + pattern[len++] = '*'; + pattern[len] = 0; + + /* open find handle */ + h = FindFirstFileW(pattern, &fdata); + if (h == INVALID_HANDLE_VALUE) { + DWORD err = GetLastError(); + errno = (err == ERROR_DIRECTORY) ? ENOTDIR : err_win_to_posix(err); + return NULL; + } + + /* initialize DIR structure and copy first dir entry */ + dir = xmalloc(sizeof(DIR)); + dir->dd_handle = h; + dir->dd_stat = 0; + finddata2dirent(&dir->dd_dir, &fdata); + return dir; +} From 1004faca05ac9ca55804c565fe06d33cd31a1f7c Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Sun, 8 Sep 2013 14:18:40 +0200 Subject: [PATCH 152/218] mingw: make the dirent implementation pluggable Emulating the POSIX `dirent` API on Windows via `FindFirstFile()`/`FindNextFile()` is pretty straightforward, however, most of the information provided in the `WIN32_FIND_DATA` structure is thrown away in the process. A more sophisticated implementation may cache this data, e.g. for later reuse in calls to `lstat()`. Make the `dirent` implementation pluggable so that it can be switched at runtime, e.g. based on a config option.
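As a rough illustration of what "pluggable" means here, the following portable sketch stores the `readdir()`/`closedir()` callbacks in the directory handle itself. It is not the code of this patch: `demo_dir`, `demo_dirent`, `array_dir` and the `array_*()` functions are hypothetical names invented for the example, and the "directory" is just an array of strings instead of a `FindFirstFile()` handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for `struct dirent` and `DIR`. */
struct demo_dirent {
	const char *d_name;
};

/* Base handle: each implementation stores its own callbacks here. */
typedef struct demo_dir {
	struct demo_dirent *(*preaddir)(struct demo_dir *dir);
	int (*pclosedir)(struct demo_dir *dir);
} demo_dir;

/* Generic entry points dispatch through the handle, like a vtable. */
#define demo_readdir(dir) ((dir)->preaddir(dir))
#define demo_closedir(dir) ((dir)->pclosedir(dir))

/* One concrete implementation: iterate over a fixed array of names. */
struct array_dir {
	demo_dir base; /* first member, so base and derived pointers convert */
	const char *const *names;
	size_t count, pos;
	struct demo_dirent ent;
};

static struct demo_dirent *array_readdir(demo_dir *base)
{
	struct array_dir *dir = (struct array_dir *)base;

	if (dir->pos >= dir->count)
		return NULL; /* end of listing */
	dir->ent.d_name = dir->names[dir->pos++];
	return &dir->ent;
}

static int array_closedir(demo_dir *base)
{
	free(base);
	return 0;
}

/* The "opendir" of this implementation wires up the matching callbacks. */
static demo_dir *array_opendir(const char *const *names, size_t count)
{
	struct array_dir *dir;

	assert(names || !count);
	dir = malloc(sizeof(*dir));
	if (!dir)
		return NULL;
	dir->base.preaddir = array_readdir;
	dir->base.pclosedir = array_closedir;
	dir->names = names;
	dir->count = count;
	dir->pos = 0;
	return &dir->base;
}
```

Callers only ever use `demo_readdir()` and `demo_closedir()`, so a second implementation (say, one backed by a cache) could hand out handles with different callbacks without any change on the caller's side.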
Define a base DIR structure with pointers to `readdir()`/`closedir()` that match the `opendir()` implementation (similar to vtable pointers in Object-Oriented Programming). Define `readdir()`/`closedir()` so that they call the function pointers in the `DIR` structure. This allows choosing the `opendir()` implementation on a call-by-call basis. Make the fixed-size `dirent.d_name` buffer a flex array, as `d_name` may be implementation specific (e.g. a caching implementation may allocate a `struct dirent` with _just_ the size needed to hold the `d_name` in question). Signed-off-by: Karsten Blees Signed-off-by: Johannes Schindelin --- compat/win32/dirent.c | 30 +++++++++++++++++++----------- compat/win32/dirent.h | 28 +++++++++++++++++++++------- 2 files changed, 40 insertions(+), 18 deletions(-) diff --git a/compat/win32/dirent.c b/compat/win32/dirent.c index 2f7d64369f9eda..f17e1595468f44 100644 --- a/compat/win32/dirent.c +++ b/compat/win32/dirent.c @@ -1,15 +1,21 @@ #include "../../git-compat-util.h" -struct DIR { - struct dirent dd_dir; /* includes d_type */ +#pragma GCC diagnostic push +#pragma GCC diagnostic ignored "-Wpedantic" +typedef struct dirent_DIR { + struct DIR base_dir; /* extend base struct DIR */ HANDLE dd_handle; /* FindFirstFile handle */ int dd_stat; /* 0-based index */ -}; + struct dirent dd_dir; /* includes d_type */ +} dirent_DIR; +#pragma GCC diagnostic pop + +DIR *(*opendir)(const char *dirname) = dirent_opendir; static inline void finddata2dirent(struct dirent *ent, WIN32_FIND_DATAW *fdata) { - /* convert UTF-16 name to UTF-8 */ - xwcstoutf(ent->d_name, fdata->cFileName, sizeof(ent->d_name)); + /* convert UTF-16 name to UTF-8 (d_name points to dirent_DIR.dd_name) */ + xwcstoutf(ent->d_name, fdata->cFileName, MAX_PATH * 3); /* Set file type, based on WIN32_FIND_DATA */ if ((fdata->dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) @@ -21,7 +27,7 @@ static inline void finddata2dirent(struct dirent *ent, WIN32_FIND_DATAW *fdata) ent->d_type =
DT_REG; } -struct dirent *readdir(DIR *dir) +static struct dirent *dirent_readdir(dirent_DIR *dir) { if (!dir) { errno = EBADF; /* No set_errno for mingw */ @@ -48,7 +54,7 @@ struct dirent *readdir(DIR *dir) return &dir->dd_dir; } -int closedir(DIR *dir) +static int dirent_closedir(dirent_DIR *dir) { if (!dir) { errno = EBADF; @@ -60,13 +66,13 @@ int closedir(DIR *dir) return 0; } -DIR *opendir(const char *name) +DIR *dirent_opendir(const char *name) { wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */ WIN32_FIND_DATAW fdata; HANDLE h; int len; - DIR *dir; + dirent_DIR *dir; /* convert name to UTF-16 and check length < MAX_PATH */ if ((len = xutftowcs_path(pattern, name)) < 0) @@ -87,9 +93,11 @@ DIR *opendir(const char *name) } /* initialize DIR structure and copy first dir entry */ - dir = xmalloc(sizeof(DIR)); + dir = xmalloc(sizeof(dirent_DIR) + MAX_PATH); + dir->base_dir.preaddir = (struct dirent *(*)(DIR *dir)) dirent_readdir; + dir->base_dir.pclosedir = (int (*)(DIR *dir)) dirent_closedir; dir->dd_handle = h; dir->dd_stat = 0; finddata2dirent(&dir->dd_dir, &fdata); - return dir; + return (DIR*) dir; } diff --git a/compat/win32/dirent.h b/compat/win32/dirent.h index 058207e4bfed62..a58a8075fd70e3 100644 --- a/compat/win32/dirent.h +++ b/compat/win32/dirent.h @@ -1,20 +1,34 @@ #ifndef DIRENT_H #define DIRENT_H -typedef struct DIR DIR; - #define DT_UNKNOWN 0 #define DT_DIR 1 #define DT_REG 2 #define DT_LNK 3 struct dirent { - unsigned char d_type; /* file type to prevent lstat after readdir */ - char d_name[MAX_PATH * 3]; /* file name (* 3 for UTF-8 conversion) */ + unsigned char d_type; /* file type to prevent lstat after readdir */ + char d_name[/* FLEX_ARRAY */]; /* file name */ }; -DIR *opendir(const char *dirname); -struct dirent *readdir(DIR *dir); -int closedir(DIR *dir); +/* + * Base DIR structure, contains pointers to readdir/closedir implementations so + * that opendir may choose a concrete implementation on a call-by-call basis. 
+ */ +typedef struct DIR { + struct dirent *(*preaddir)(struct DIR *dir); + int (*pclosedir)(struct DIR *dir); +} DIR; + +/* default dirent implementation */ +extern DIR *dirent_opendir(const char *dirname); + +#define opendir git_opendir + +/* current dirent implementation */ +extern DIR *(*opendir)(const char *dirname); + +#define readdir(dir) (dir->preaddir(dir)) +#define closedir(dir) (dir->pclosedir(dir)) #endif /* DIRENT_H */ From d503c8c280a3bcea29e676636705183d77ddffe4 Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Sun, 8 Sep 2013 14:21:30 +0200 Subject: [PATCH 153/218] Win32: make the lstat implementation pluggable Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite slow. Windows operating system APIs seem to be much better at scanning the status of entire directories than checking single files. A caching implementation may improve performance by bulk-reading entire directories or reusing data obtained via opendir / readdir. Make the lstat implementation pluggable so that it can be switched at runtime, e.g. based on a config option. 
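A minimal sketch of the idea, assuming nothing beyond standard C: callers go through a function pointer, and a single assignment switches every later call to another implementation. The names `demo_stat`, `slow_lstat()`, `cached_lstat()`, `demo_lstat` and `demo_use_cache()` are hypothetical and exist only for this illustration; the stat values are placeholders, not real file data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stat result; real code would use `struct stat`. */
struct demo_stat {
	long size;
	int from_cache;
};

/* Default implementation: stands in for asking the file system each time. */
static int slow_lstat(const char *path, struct demo_stat *st)
{
	assert(path && st);
	st->size = 42; /* placeholder value */
	st->from_cache = 0;
	return 0;
}

/* Alternative implementation: stands in for answering from a cache. */
static int cached_lstat(const char *path, struct demo_stat *st)
{
	assert(path && st);
	st->size = 42; /* placeholder value */
	st->from_cache = 1;
	return 0;
}

/* Callers use this pointer, so the target can be changed at runtime. */
static int (*demo_lstat)(const char *path, struct demo_stat *st) = slow_lstat;

/* Select the implementation, e.g. after reading a config option. */
static void demo_use_cache(int enable)
{
	demo_lstat = enable ? cached_lstat : slow_lstat;
}
```

Existing call sites keep their `lstat(path, &st)` shape under this design; only the code that decides which implementation is active needs to know that a pointer is involved.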
Signed-off-by: Karsten Blees Signed-off-by: Johannes Schindelin --- compat/mingw-posix.h | 2 +- compat/mingw.c | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/compat/mingw-posix.h b/compat/mingw-posix.h index 96df96f8e49c14..9158f89d89d239 100644 --- a/compat/mingw-posix.h +++ b/compat/mingw-posix.h @@ -406,7 +406,7 @@ int mingw_fstat(int fd, struct stat *buf); #ifdef lstat #undef lstat #endif -#define lstat mingw_lstat +extern int (*lstat)(const char *file_name, struct stat *buf); int mingw_utime(const char *file_name, const struct utimbuf *times); diff --git a/compat/mingw.c b/compat/mingw.c index 89b39106514933..e37723d307a67b 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1377,6 +1377,8 @@ int mingw_lstat(const char *file_name, struct stat *buf) return -1; } +int (*lstat)(const char *file_name, struct stat *buf) = mingw_lstat; + static int get_file_info_by_handle(HANDLE hnd, struct stat *buf) { BY_HANDLE_FILE_INFORMATION fdata; From 2a27e75758a8b51e8953fd7d8fe112e2a1b4b348 Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Sun, 8 Sep 2013 14:23:27 +0200 Subject: [PATCH 154/218] mingw: add infrastructure for read-only file system level caches Add a macro to mark code sections that only read from the file system, along with a config option and documentation. This facilitates implementation of relatively simple file system level caches without the need to synchronize with the file system. Enable read-only sections for 'git status' and preload_index. Signed-off-by: Karsten Blees --- Documentation/config/core.adoc | 6 ++++++ builtin/commit.c | 1 + compat/mingw.c | 6 ++++++ compat/mingw.h | 2 ++ git-compat-util.h | 15 +++++++++++++++ preload-index.c | 3 +++ 6 files changed, 33 insertions(+) diff --git a/Documentation/config/core.adoc b/Documentation/config/core.adoc index c870e7f7fe3a6f..ebdebd094bf461 100644 --- a/Documentation/config/core.adoc +++ b/Documentation/config/core.adoc @@ -721,6 +721,12 @@ relatively high IO latencies. 
When enabled, Git will do the index comparison to the filesystem data in parallel, allowing overlapping IO's. Defaults to true. +core.fscache:: + Enable additional caching of file system data for some operations. ++ +Git for Windows uses this to bulk-read and cache lstat data of entire +directories (instead of doing lstat file by file). + core.unsetenvvars:: Windows-only: comma-separated list of environment variables' names that need to be unset before spawning any other process. diff --git a/builtin/commit.c b/builtin/commit.c index a3e52ac9ca6607..3a182adc9c319b 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -1623,6 +1623,7 @@ struct repository *repo UNUSED) PATHSPEC_PREFER_FULL, prefix, argv); + enable_fscache(1); if (status_format != STATUS_FORMAT_PORCELAIN && status_format != STATUS_FORMAT_PORCELAIN_V2) progress_flag = REFRESH_PROGRESS; diff --git a/compat/mingw.c b/compat/mingw.c index e37723d307a67b..2f2852ec7eb305 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -277,6 +277,7 @@ enum hide_dotfiles_type { static enum hide_dotfiles_type hide_dotfiles = HIDE_DOTFILES_DOTGITONLY; static char *unset_environment_variables; +int core_fscache; int mingw_core_config(const char *var, const char *value, const struct config_context *ctx UNUSED, @@ -290,6 +291,11 @@ int mingw_core_config(const char *var, const char *value, return 0; } + if (!strcmp(var, "core.fscache")) { + core_fscache = git_config_bool(var, value); + return 0; + } + if (!strcmp(var, "core.unsetenvvars")) { if (!value) return config_error_nonbool(var); diff --git a/compat/mingw.h b/compat/mingw.h index 2fd688b886a805..0e9295006025c5 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -1,5 +1,7 @@ #include "mingw-posix.h" +extern int core_fscache; + struct config_context; int mingw_core_config(const char *var, const char *value, const struct config_context *ctx, void *cb); diff --git a/git-compat-util.h b/git-compat-util.h index c0146824b6a847..795677d4773cb2 100644 --- a/git-compat-util.h 
+++ b/git-compat-util.h @@ -1071,6 +1071,21 @@ static inline int is_missing_file_error(int errno_) return (errno_ == ENOENT || errno_ == ENOTDIR); } +/* + * Enable/disable a read-only cache for file system data on platforms that + * support it. + * + * Implementing a live-cache is complicated and requires special platform + * support (inotify, ReadDirectoryChangesW...). enable_fscache shall be used + * to mark sections of git code that extensively read from the file system + * without modifying anything. Implementations can use this to cache e.g. stat + * data or even file content without the need to synchronize with the file + * system. + */ +#ifndef enable_fscache +#define enable_fscache(x) /* noop */ +#endif + int cmd_main(int, const char **); /* diff --git a/preload-index.c b/preload-index.c index b222821b448526..61e8f3a1f6ec84 100644 --- a/preload-index.c +++ b/preload-index.c @@ -141,6 +141,7 @@ void preload_index(struct index_state *index, pthread_mutex_init(&pd.mutex, NULL); } + enable_fscache(1); for (i = 0; i < threads; i++) { struct thread_data *p = data+i; int err; @@ -176,6 +177,8 @@ void preload_index(struct index_state *index, trace2_data_intmax("index", NULL, "preload/sum_lstat", t2_sum_lstat); trace2_region_leave("index", "preload", NULL); + + enable_fscache(0); } int repo_read_index_preload(struct repository *repo, From edd55c24ec5e0e88966b00609e7fc2be55d914b1 Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Tue, 1 Oct 2013 12:51:54 +0200 Subject: [PATCH 155/218] mingw: add a cache below mingw's lstat and dirent implementations Checking the work tree status is quite slow on Windows, due to slow `lstat()` emulation (git calls `lstat()` once for each file in the index). Windows operating system APIs seem to be much better at scanning the status of entire directories than checking single files. Add an `lstat()` implementation that uses a cache for lstat data. Cache misses read the entire parent directory and add it to the cache. 
Subsequent `lstat()` calls for the same directory are served directly from the cache. Also implement `opendir()`/`readdir()`/`closedir()` so that they create and use directory listings in the cache. The cache doesn't track file system changes and doesn't plug into any modifying file APIs, so it has to be explicitly enabled for git functions that don't modify the working copy. Note: in an earlier version of this patch, the cache was always active and tracked file system changes via ReadDirectoryChangesW. However, this was much more complex and had negative impact on the performance of modifying git commands such as 'git checkout'. Signed-off-by: Karsten Blees Signed-off-by: Johannes Schindelin --- compat/win32/fscache.c | 487 ++++++++++++++++++++++++++++ compat/win32/fscache.h | 10 + config.mak.uname | 4 +- contrib/buildsystems/CMakeLists.txt | 3 +- git-compat-util.h | 2 + meson.build | 1 + 6 files changed, 504 insertions(+), 3 deletions(-) create mode 100644 compat/win32/fscache.c create mode 100644 compat/win32/fscache.h diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c new file mode 100644 index 00000000000000..4305514a8bf763 --- /dev/null +++ b/compat/win32/fscache.c @@ -0,0 +1,487 @@ +#include "../../git-compat-util.h" +#include "../../hashmap.h" +#include "../win32.h" +#include "fscache.h" +#include "../../dir.h" +#include "../../abspath.h" + +static int initialized; +static volatile long enabled; +static struct hashmap map; +static CRITICAL_SECTION mutex; + +/* + * An entry in the file system cache. Used for both entire directory listings + * and file entries. + */ +#pragma GCC diagnostic push +#pragma GCC diagnostic ignored "-Wpedantic" +struct fsentry { + struct hashmap_entry ent; + mode_t st_mode; + /* Pointer to the directory listing, or NULL for the listing itself. */ + struct fsentry *list; + /* Pointer to the next file entry of the list. */ + struct fsentry *next; + + union { + /* Reference count of the directory listing. 
*/ + volatile long refcnt; + struct { + /* More stat members (only used for file entries). */ + off64_t st_size; + struct timespec st_atim; + struct timespec st_mtim; + struct timespec st_ctim; + } s; + } u; + + /* Length of name. */ + unsigned short len; + /* + * Name of the entry. For directory listings: relative path of the + * directory, without trailing '/' (empty for cwd()). For file entries: + * name of the file. Typically points to the end of the structure if + * the fsentry is allocated on the heap (see fsentry_alloc), or to a + * local variable if on the stack (see fsentry_init). + */ + struct dirent dirent; +}; +#pragma GCC diagnostic pop + +#pragma GCC diagnostic push +#ifdef __clang__ +#pragma GCC diagnostic ignored "-Wflexible-array-extensions" +#endif +struct heap_fsentry { + union { + struct fsentry ent; + char dummy[sizeof(struct fsentry) + MAX_PATH]; + } u; +}; +#pragma GCC diagnostic pop + +/* + * Compares the paths of two fsentry structures for equality. + */ +static int fsentry_cmp(void *cmp_data UNUSED, + const struct fsentry *fse1, const struct fsentry *fse2, + void *keydata UNUSED) +{ + int res; + if (fse1 == fse2) + return 0; + + /* compare the list parts first */ + if (fse1->list != fse2->list && + (res = fsentry_cmp(NULL, fse1->list ? fse1->list : fse1, + fse2->list ? fse2->list : fse2, NULL))) + return res; + + /* if list parts are equal, compare len and name */ + if (fse1->len != fse2->len) + return fse1->len - fse2->len; + return fspathncmp(fse1->dirent.d_name, fse2->dirent.d_name, fse1->len); +} + +/* + * Calculates the hash code of an fsentry structure's path. + */ +static unsigned int fsentry_hash(const struct fsentry *fse) +{ + unsigned int hash = fse->list ? fse->list->ent.hash : 0; + return hash ^ memihash(fse->dirent.d_name, fse->len); +} + +/* + * Initialize an fsentry structure for use by fsentry_hash and fsentry_cmp. 
+ */ +static void fsentry_init(struct fsentry *fse, struct fsentry *list, + const char *name, size_t len) +{ + fse->list = list; + if (len > MAX_PATH) + BUG("Trying to allocate fsentry for long path '%.*s'", + (int)len, name); + memcpy(fse->dirent.d_name, name, len); + fse->dirent.d_name[len] = 0; + fse->len = len; + hashmap_entry_init(&fse->ent, fsentry_hash(fse)); +} + +/* + * Allocate an fsentry structure on the heap. + */ +static struct fsentry *fsentry_alloc(struct fsentry *list, const char *name, + size_t len) +{ + /* overallocate fsentry and copy the name to the end */ + struct fsentry *fse = xmalloc(sizeof(struct fsentry) + len + 1); + /* init the rest of the structure */ + fsentry_init(fse, list, name, len); + fse->next = NULL; + fse->u.refcnt = 1; + return fse; +} + +/* + * Add a reference to an fsentry. + */ +inline static void fsentry_addref(struct fsentry *fse) +{ + if (fse->list) + fse = fse->list; + + InterlockedIncrement(&(fse->u.refcnt)); +} + +/* + * Release the reference to an fsentry, frees the memory if its the last ref. + */ +static void fsentry_release(struct fsentry *fse) +{ + if (fse->list) + fse = fse->list; + + if (InterlockedDecrement(&(fse->u.refcnt))) + return; + + while (fse) { + struct fsentry *next = fse->next; + free(fse); + fse = next; + } +} + +/* + * Allocate and initialize an fsentry from a WIN32_FIND_DATA structure. + */ +static struct fsentry *fseentry_create_entry(struct fsentry *list, + const WIN32_FIND_DATAW *fdata) +{ + char buf[MAX_PATH * 3]; + int len; + struct fsentry *fse; + len = xwcstoutf(buf, fdata->cFileName, ARRAY_SIZE(buf)); + + fse = fsentry_alloc(list, buf, len); + + fse->st_mode = file_attr_to_st_mode(fdata->dwFileAttributes, + IO_REPARSE_TAG_SYMLINK); + fse->dirent.d_type = S_ISREG(fse->st_mode) ? DT_REG : + S_ISDIR(fse->st_mode) ? 
DT_DIR : DT_LNK; + fse->u.s.st_size = (((off64_t) (fdata->nFileSizeHigh)) << 32) + | fdata->nFileSizeLow; + filetime_to_timespec(&(fdata->ftLastAccessTime), &(fse->u.s.st_atim)); + filetime_to_timespec(&(fdata->ftLastWriteTime), &(fse->u.s.st_mtim)); + filetime_to_timespec(&(fdata->ftCreationTime), &(fse->u.s.st_ctim)); + + return fse; +} + +/* + * Create an fsentry-based directory listing (similar to opendir / readdir). + * Dir should not contain trailing '/'. Use an empty string for the current + * directory (not "."!). + */ +static struct fsentry *fsentry_create_list(const struct fsentry *dir) +{ + wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */ + WIN32_FIND_DATAW fdata; + HANDLE h; + int wlen; + struct fsentry *list, **phead; + DWORD err; + + /* convert name to UTF-16 and check length < MAX_PATH */ + if ((wlen = xutftowcsn(pattern, dir->dirent.d_name, MAX_PATH, + dir->len)) < 0) { + if (errno == ERANGE) + errno = ENAMETOOLONG; + return NULL; + } + + /* append optional '/' and wildcard '*' */ + if (wlen) + pattern[wlen++] = '/'; + pattern[wlen++] = '*'; + pattern[wlen] = 0; + + /* open find handle */ + h = FindFirstFileW(pattern, &fdata); + if (h == INVALID_HANDLE_VALUE) { + err = GetLastError(); + errno = (err == ERROR_DIRECTORY) ? 
ENOTDIR : err_win_to_posix(err); + return NULL; + } + + /* allocate object to hold directory listing */ + list = fsentry_alloc(NULL, dir->dirent.d_name, dir->len); + + /* walk directory and build linked list of fsentry structures */ + phead = &list->next; + do { + *phead = fseentry_create_entry(list, &fdata); + phead = &(*phead)->next; + } while (FindNextFileW(h, &fdata)); + + /* remember result of last FindNextFile, then close find handle */ + err = GetLastError(); + FindClose(h); + + /* return the list if we've got all the files */ + if (err == ERROR_NO_MORE_FILES) + return list; + + /* otherwise free the list and return error */ + fsentry_release(list); + errno = err_win_to_posix(err); + return NULL; +} + +/* + * Adds a directory listing to the cache. + */ +static void fscache_add(struct fsentry *fse) +{ + if (fse->list) + fse = fse->list; + + for (; fse; fse = fse->next) + hashmap_add(&map, &fse->ent); +} + +/* + * Clears the cache. + */ +static void fscache_clear(void) +{ + hashmap_clear_and_free(&map, struct fsentry, ent); + hashmap_init(&map, (hashmap_cmp_fn)fsentry_cmp, NULL, 0); +} + +/* + * Checks if the cache is enabled for the given path. + */ +static inline int fscache_enabled(const char *path) +{ + return enabled > 0 && !is_absolute_path(path); +} + +/* + * Looks up or creates a cache entry for the specified key. + */ +static struct fsentry *fscache_get(struct fsentry *key) +{ + struct fsentry *fse; + + EnterCriticalSection(&mutex); + /* check if entry is in cache */ + fse = hashmap_get_entry(&map, key, ent, NULL); + if (fse) { + fsentry_addref(fse); + LeaveCriticalSection(&mutex); + return fse; + } + /* if looking for a file, check if directory listing is in cache */ + if (!fse && key->list) { + fse = hashmap_get_entry(&map, key->list, ent, NULL); + if (fse) { + LeaveCriticalSection(&mutex); + /* dir entry without file entry -> file doesn't exist */ + errno = ENOENT; + return NULL; + } + } + + /* create the directory listing (outside mutex!) 
*/ + LeaveCriticalSection(&mutex); + fse = fsentry_create_list(key->list ? key->list : key); + if (!fse) + return NULL; + + EnterCriticalSection(&mutex); + /* add directory listing if it hasn't been added by some other thread */ + if (!hashmap_get_entry(&map, key, ent, NULL)) + fscache_add(fse); + + /* lookup file entry if requested (fse already points to directory) */ + if (key->list) + fse = hashmap_get_entry(&map, key, ent, NULL); + + /* return entry or ENOENT */ + if (fse) + fsentry_addref(fse); + else + errno = ENOENT; + + LeaveCriticalSection(&mutex); + return fse; +} + +/* + * Enables or disables the cache. Note that the cache is read-only, changes to + * the working directory are NOT reflected in the cache while enabled. + */ +int fscache_enable(int enable) +{ + int result; + + if (!initialized) { + /* allow the cache to be disabled entirely */ + if (!core_fscache) + return 0; + + InitializeCriticalSection(&mutex); + hashmap_init(&map, (hashmap_cmp_fn) fsentry_cmp, NULL, 0); + initialized = 1; + } + + result = enable ? InterlockedIncrement(&enabled) + : InterlockedDecrement(&enabled); + + if (enable && result == 1) { + /* redirect opendir and lstat to the fscache implementations */ + opendir = fscache_opendir; + lstat = fscache_lstat; + } else if (!enable && !result) { + /* reset opendir and lstat to the original implementations */ + opendir = dirent_opendir; + lstat = mingw_lstat; + EnterCriticalSection(&mutex); + fscache_clear(); + LeaveCriticalSection(&mutex); + } + return result; +} + +/* + * Lstat replacement, uses the cache if enabled, otherwise redirects to + * mingw_lstat. 
+ */ +int fscache_lstat(const char *filename, struct stat *st) +{ + int dirlen, base, len; +#pragma GCC diagnostic push +#ifdef __clang__ +#pragma GCC diagnostic ignored "-Wflexible-array-extensions" +#endif + struct heap_fsentry key[2]; +#pragma GCC diagnostic pop + struct fsentry *fse; + + if (!fscache_enabled(filename)) + return mingw_lstat(filename, st); + + /* split filename into path + name */ + len = strlen(filename); + if (len && is_dir_sep(filename[len - 1])) + len--; + base = len; + while (base && !is_dir_sep(filename[base - 1])) + base--; + dirlen = base ? base - 1 : 0; + + /* lookup entry for path + name in cache */ + fsentry_init(&key[0].u.ent, NULL, filename, dirlen); + fsentry_init(&key[1].u.ent, &key[0].u.ent, filename + base, len - base); + fse = fscache_get(&key[1].u.ent); + if (!fse) { + errno = ENOENT; + return -1; + } + + /* + * Special case symbolic links: FindFirstFile()/FindNextFile() did not + * provide us with the length of the target path. + */ + if (fse->u.s.st_size == MAX_PATH && S_ISLNK(fse->st_mode)) { + char buf[MAX_PATH]; + int len = readlink(filename, buf, sizeof(buf) - 1); + + if (len > 0) + fse->u.s.st_size = len; + } + + /* copy stat data */ + st->st_ino = 0; + st->st_gid = 0; + st->st_uid = 0; + st->st_dev = 0; + st->st_rdev = 0; + st->st_nlink = 1; + st->st_mode = fse->st_mode; + st->st_size = fse->u.s.st_size; + st->st_atim = fse->u.s.st_atim; + st->st_mtim = fse->u.s.st_mtim; + st->st_ctim = fse->u.s.st_ctim; + + /* don't forget to release fsentry */ + fsentry_release(fse); + return 0; +} + +typedef struct fscache_DIR { + struct DIR base_dir; /* extend base struct DIR */ + struct fsentry *pfsentry; + struct dirent *dirent; +} fscache_DIR; + +/* + * Readdir replacement. 
+ */ +static struct dirent *fscache_readdir(DIR *base_dir) +{ + fscache_DIR *dir = (fscache_DIR*) base_dir; + struct fsentry *next = dir->pfsentry->next; + if (!next) + return NULL; + dir->pfsentry = next; + dir->dirent = &next->dirent; + return dir->dirent; +} + +/* + * Closedir replacement. + */ +static int fscache_closedir(DIR *base_dir) +{ + fscache_DIR *dir = (fscache_DIR*) base_dir; + fsentry_release(dir->pfsentry); + free(dir); + return 0; +} + +/* + * Opendir replacement, uses a directory listing from the cache if enabled, + * otherwise calls original dirent implementation. + */ +DIR *fscache_opendir(const char *dirname) +{ + struct heap_fsentry key; + struct fsentry *list; + fscache_DIR *dir; + int len; + + if (!fscache_enabled(dirname)) + return dirent_opendir(dirname); + + /* prepare name (strip trailing '/', replace '.') */ + len = strlen(dirname); + if ((len == 1 && dirname[0] == '.') || + (len && is_dir_sep(dirname[len - 1]))) + len--; + + /* get directory listing from cache */ + fsentry_init(&key.u.ent, NULL, dirname, len); + list = fscache_get(&key.u.ent); + if (!list) + return NULL; + + /* alloc and return DIR structure */ + dir = (fscache_DIR*) xmalloc(sizeof(fscache_DIR)); + dir->base_dir.preaddir = fscache_readdir; + dir->base_dir.pclosedir = fscache_closedir; + dir->pfsentry = list; + return (DIR*) dir; +} diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h new file mode 100644 index 00000000000000..ed518b422d705e --- /dev/null +++ b/compat/win32/fscache.h @@ -0,0 +1,10 @@ +#ifndef FSCACHE_H +#define FSCACHE_H + +int fscache_enable(int enable); +#define enable_fscache(x) fscache_enable(x) + +DIR *fscache_opendir(const char *dir); +int fscache_lstat(const char *file_name, struct stat *buf); + +#endif diff --git a/config.mak.uname b/config.mak.uname index 346770ba7e7891..95fd7539d6c14d 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -511,7 +511,7 @@ endif compat/win32/path-utils.o \ compat/win32/pthread.o compat/win32/syslog.o \ 
compat/win32/trace2_win32_process_info.o \ - compat/win32/dirent.o compat/win32/wsl.o + compat/win32/dirent.o compat/win32/fscache.o compat/win32/wsl.o COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DDETECT_MSYS_TTY \ -DENSURE_MSYSTEM_IS_SET="\"$(MSYSTEM)\"" -DMINGW_PREFIX="\"$(patsubst /%,%,$(MINGW_PREFIX))\"" \ -DNOGDI -DHAVE_STRING_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\" @@ -716,7 +716,7 @@ ifeq ($(uname_S),MINGW) compat/win32/flush.o \ compat/win32/path-utils.o \ compat/win32/pthread.o compat/win32/syslog.o \ - compat/win32/dirent.o compat/win32/wsl.o + compat/win32/dirent.o compat/win32/fscache.o compat/win32/wsl.o BASIC_CFLAGS += -DWIN32 EXTLIBS += -lws2_32 GITLIBS += git.res diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index b3e5545ad876fe..b6053e7c75f71f 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -302,7 +302,8 @@ if(CMAKE_SYSTEM_NAME STREQUAL "Windows") compat/win32/dirent.c compat/win32/wsl.c compat/nedmalloc/nedmalloc.c - compat/strdup.c) + compat/strdup.c + compat/win32/fscache.c) set(NO_UNIX_SOCKETS 1) elseif(CMAKE_SYSTEM_NAME STREQUAL "Linux") diff --git a/git-compat-util.h b/git-compat-util.h index 795677d4773cb2..8c53d79f8dca2f 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -158,9 +158,11 @@ static inline int is_xplatform_dir_sep(int c) /* pull in Windows compatibility stuff */ #include "compat/win32/path-utils.h" #include "compat/mingw.h" +#include "compat/win32/fscache.h" #elif defined(_MSC_VER) #include "compat/win32/path-utils.h" #include "compat/msvc.h" +#include "compat/win32/fscache.h" #endif /* used on Mac OS X */ diff --git a/meson.build b/meson.build index 9aa19583e75d07..2540100788c875 100644 --- a/meson.build +++ b/meson.build @@ -1288,6 +1288,7 @@ elif host_machine.system() == 'windows' 'compat/winansi.c', 'compat/win32/dirent.c', 'compat/win32/flush.c', + 'compat/win32/fscache.c', 'compat/win32/path-utils.c', 
'compat/win32/pthread.c', 'compat/win32/syslog.c', From 2b9e0125634552e69732c753233785f33ff31aa0 Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Tue, 24 Jun 2014 13:22:35 +0200 Subject: [PATCH 156/218] fscache: load directories only once If multiple threads access a directory that is not yet in the cache, the directory will be loaded by each thread. Only one of the results is added to the cache, all others are leaked. This wastes performance and memory. On cache miss, add a future object to the cache to indicate that the directory is currently being loaded. Subsequent threads register themselves with the future object and wait. When the first thread has loaded the directory, it replaces the future object with the result and notifies waiting threads. Signed-off-by: Karsten Blees --- compat/win32/fscache.c | 65 ++++++++++++++++++++++++++++++++++++------ 1 file changed, 56 insertions(+), 9 deletions(-) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 4305514a8bf763..b87e19f287f566 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -27,6 +27,8 @@ struct fsentry { union { /* Reference count of the directory listing. */ volatile long refcnt; + /* Handle to wait on the loading thread. */ + HANDLE hwait; struct { /* More stat members (only used for file entries). */ off64_t st_size; @@ -268,16 +270,43 @@ static inline int fscache_enabled(const char *path) return enabled > 0 && !is_absolute_path(path); } +/* + * Looks up a cache entry, waits if its being loaded by another thread. + * The mutex must be owned by the calling thread. 
+ */ +static struct fsentry *fscache_get_wait(struct fsentry *key) +{ + struct fsentry *fse = hashmap_get_entry(&map, key, ent, NULL); + + /* return if its a 'real' entry (future entries have refcnt == 0) */ + if (!fse || fse->list || fse->u.refcnt) + return fse; + + /* create an event and link our key to the future entry */ + key->u.hwait = CreateEvent(NULL, TRUE, FALSE, NULL); + key->next = fse->next; + fse->next = key; + + /* wait for the loading thread to signal us */ + LeaveCriticalSection(&mutex); + WaitForSingleObject(key->u.hwait, INFINITE); + CloseHandle(key->u.hwait); + EnterCriticalSection(&mutex); + + /* repeat cache lookup */ + return hashmap_get_entry(&map, key, ent, NULL); +} + /* * Looks up or creates a cache entry for the specified key. */ static struct fsentry *fscache_get(struct fsentry *key) { - struct fsentry *fse; + struct fsentry *fse, *future, *waiter; EnterCriticalSection(&mutex); /* check if entry is in cache */ - fse = hashmap_get_entry(&map, key, ent, NULL); + fse = fscache_get_wait(key); if (fse) { fsentry_addref(fse); LeaveCriticalSection(&mutex); @@ -285,7 +314,7 @@ static struct fsentry *fscache_get(struct fsentry *key) } /* if looking for a file, check if directory listing is in cache */ if (!fse && key->list) { - fse = hashmap_get_entry(&map, key->list, ent, NULL); + fse = fscache_get_wait(key->list); if (fse) { LeaveCriticalSection(&mutex); /* dir entry without file entry -> file doesn't exist */ @@ -294,16 +323,34 @@ static struct fsentry *fscache_get(struct fsentry *key) } } + /* add future entry to indicate that we're loading it */ + future = key->list ? key->list : key; + future->next = NULL; + future->u.refcnt = 0; + hashmap_add(&map, &future->ent); + /* create the directory listing (outside mutex!) */ LeaveCriticalSection(&mutex); - fse = fsentry_create_list(key->list ? 
key->list : key); - if (!fse) + fse = fsentry_create_list(future); + EnterCriticalSection(&mutex); + + /* remove future entry and signal waiting threads */ + hashmap_remove(&map, &future->ent, NULL); + waiter = future->next; + while (waiter) { + HANDLE h = waiter->u.hwait; + waiter = waiter->next; + SetEvent(h); + } + + /* leave on error (errno set by fsentry_create_list) */ + if (!fse) { + LeaveCriticalSection(&mutex); return NULL; + } - EnterCriticalSection(&mutex); - /* add directory listing if it hasn't been added by some other thread */ - if (!hashmap_get_entry(&map, key, ent, NULL)) - fscache_add(fse); + /* add directory listing to the cache */ + fscache_add(fse); /* lookup file entry if requested (fse already points to directory) */ if (key->list) From 4a4c40520f891f2998213eb13db605ed30908380 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Tue, 24 Jan 2017 15:12:13 -0500 Subject: [PATCH 157/218] fscache: add key for GIT_TRACE_FSCACHE Signed-off-by: Jeff Hostetler Signed-off-by: Johannes Schindelin --- compat/win32/fscache.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index b87e19f287f566..a4095c17c9417a 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -4,11 +4,13 @@ #include "fscache.h" #include "../../dir.h" #include "../../abspath.h" +#include "../../trace.h" static int initialized; static volatile long enabled; static struct hashmap map; static CRITICAL_SECTION mutex; +static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE); /* * An entry in the file system cache. Used for both entire directory listings @@ -214,6 +216,8 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir) if (h == INVALID_HANDLE_VALUE) { err = GetLastError(); errno = (err == ERROR_DIRECTORY) ? 
ENOTDIR : err_win_to_posix(err); + trace_printf_key(&trace_fscache, "fscache: error(%d) '%s'\n", + errno, dir->dirent.d_name); return NULL; } @@ -399,6 +403,7 @@ int fscache_enable(int enable) fscache_clear(); LeaveCriticalSection(&mutex); } + trace_printf_key(&trace_fscache, "fscache: enable(%d)\n", enable); return result; } From 88d00b8234033a3f55ea7ab181ff619d8fdf291f Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Tue, 13 Dec 2016 14:05:32 -0500 Subject: [PATCH 158/218] fscache: remember not-found directories Teach FSCACHE to remember "not found" directories. This is a performance optimization. FSCACHE is a performance optimization available for Windows. It intercepts Posix-style lstat() calls into an in-memory directory using FindFirst/FindNext. It improves performance on Windows by catching the first lstat() call in a directory, using FindFirst/ FindNext to read the list of files (and attribute data) for the entire directory into the cache, and short-cut subsequent lstat() calls in the same directory. This gives a major performance boost on Windows. However, it does not remember "not found" directories. When STATUS runs and there are missing directories, the lstat() interception fails to find the parent directory and simply return ENOENT for the file -- it does not remember that the FindFirst on the directory failed. Thus subsequent lstat() calls in the same directory, each re-attempt the FindFirst. This completely defeats any performance gains. This can be seen by doing a sparse-checkout on a large repo and then doing a read-tree to reset the skip-worktree bits and then running status. This change reduced status times for my very large repo by 60%. 
Signed-off-by: Jeff Hostetler Signed-off-by: Johannes Schindelin --- compat/win32/fscache.c | 36 ++++++++++++++++++++++++++++++++---- 1 file changed, 32 insertions(+), 4 deletions(-) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index a4095c17c9417a..f16a6fb07e360d 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -188,7 +188,8 @@ static struct fsentry *fseentry_create_entry(struct fsentry *list, * Dir should not contain trailing '/'. Use an empty string for the current * directory (not "."!). */ -static struct fsentry *fsentry_create_list(const struct fsentry *dir) +static struct fsentry *fsentry_create_list(const struct fsentry *dir, + int *dir_not_found) { wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */ WIN32_FIND_DATAW fdata; @@ -197,6 +198,8 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir) struct fsentry *list, **phead; DWORD err; + *dir_not_found = 0; + /* convert name to UTF-16 and check length < MAX_PATH */ if ((wlen = xutftowcsn(pattern, dir->dirent.d_name, MAX_PATH, dir->len)) < 0) { @@ -215,6 +218,7 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir) h = FindFirstFileW(pattern, &fdata); if (h == INVALID_HANDLE_VALUE) { err = GetLastError(); + *dir_not_found = 1; /* or empty directory */ errno = (err == ERROR_DIRECTORY) ? 
ENOTDIR : err_win_to_posix(err); trace_printf_key(&trace_fscache, "fscache: error(%d) '%s'\n", errno, dir->dirent.d_name); @@ -223,6 +227,8 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir) /* allocate object to hold directory listing */ list = fsentry_alloc(NULL, dir->dirent.d_name, dir->len); + list->st_mode = S_IFDIR; + list->dirent.d_type = DT_DIR; /* walk directory and build linked list of fsentry structures */ phead = &list->next; @@ -307,12 +313,16 @@ static struct fsentry *fscache_get_wait(struct fsentry *key) static struct fsentry *fscache_get(struct fsentry *key) { struct fsentry *fse, *future, *waiter; + int dir_not_found; EnterCriticalSection(&mutex); /* check if entry is in cache */ fse = fscache_get_wait(key); if (fse) { - fsentry_addref(fse); + if (fse->st_mode) + fsentry_addref(fse); + else + fse = NULL; /* non-existing directory */ LeaveCriticalSection(&mutex); return fse; } @@ -321,7 +331,10 @@ static struct fsentry *fscache_get(struct fsentry *key) fse = fscache_get_wait(key->list); if (fse) { LeaveCriticalSection(&mutex); - /* dir entry without file entry -> file doesn't exist */ + /* + * dir entry without file entry, or dir does not + * exist -> file doesn't exist + */ errno = ENOENT; return NULL; } @@ -335,7 +348,7 @@ static struct fsentry *fscache_get(struct fsentry *key) /* create the directory listing (outside mutex!) */ LeaveCriticalSection(&mutex); - fse = fsentry_create_list(future); + fse = fsentry_create_list(future, &dir_not_found); EnterCriticalSection(&mutex); /* remove future entry and signal waiting threads */ @@ -349,6 +362,18 @@ static struct fsentry *fscache_get(struct fsentry *key) /* leave on error (errno set by fsentry_create_list) */ if (!fse) { + if (dir_not_found && key->list) { + /* + * Record that the directory does not exist (or is + * empty, which for all practical matters is the same + * thing as far as fscache is concerned). 
+ */ + fse = fsentry_alloc(key->list->list, + key->list->dirent.d_name, + key->list->len); + fse->st_mode = 0; + hashmap_add(&map, &fse->ent); + } LeaveCriticalSection(&mutex); return NULL; } @@ -360,6 +385,9 @@ static struct fsentry *fscache_get(struct fsentry *key) if (key->list) fse = hashmap_get_entry(&map, key, ent, NULL); + if (fse && !fse->st_mode) + fse = NULL; /* non-existing directory */ + /* return entry or ENOENT */ if (fse) fsentry_addref(fse); From e29aa2ea2eaa4633fa5eafec32b8110afd625fda Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 25 Jan 2017 18:39:16 +0100 Subject: [PATCH 159/218] fscache: add a test for the dir-not-found optimization Signed-off-by: Johannes Schindelin --- t/t1090-sparse-checkout-scope.sh | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/t/t1090-sparse-checkout-scope.sh b/t/t1090-sparse-checkout-scope.sh index 3a14218b245d4c..529844e2862c74 100755 --- a/t/t1090-sparse-checkout-scope.sh +++ b/t/t1090-sparse-checkout-scope.sh @@ -106,4 +106,24 @@ test_expect_success 'in partial clone, sparse checkout only fetches needed blobs test_cmp expect actual ' +test_expect_success MINGW 'no unnecessary opendir() with fscache' ' + git clone . fscache-test && + ( + cd fscache-test && + git config core.fscache 1 && + echo "/excluded/*" >.git/info/sparse-checkout && + for f in $(test_seq 10) + do + sha1=$(echo $f | git hash-object -w --stdin) && + git update-index --add \ + --cacheinfo 100644,$sha1,excluded/$f || exit 1 + done && + test_tick && + git commit -m excluded && + GIT_TRACE_FSCACHE=1 git status >out 2>err && + grep excluded err >grep.out && + test_line_count = 1 grep.out + ) +' + test_done From 578fc0cb843b604d54ac7be4000a6bce836f8cc2 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Tue, 22 Nov 2016 11:26:38 -0500 Subject: [PATCH 160/218] add: use preload-index and fscache for performance Teach "add" to use preload-index and fscache features to improve performance on very large repositories. 
During an "add", a call is made to run_diff_files() which calls check_remove() for each index-entry. This calls lstat(). On Windows, the fscache code intercepts the lstat() calls and builds a private cache using the FindFirst/FindNext routines, which are much faster. Somewhat independent of this, is the preload-index code which distributes some of the start-up costs across multiple threads. We need to keep the call to read_cache() before parsing the pathspecs (and hence cannot use the pathspecs to limit any preload) because parse_pathspec() is using the index to determine whether a pathspec is, in fact, in a submodule. If we would not read the index first, parse_pathspec() would not error out on a path that is inside a submodule, and t7400-submodule-basic.sh would fail with not ok 47 - do not add files from a submodule We still want the nice preload performance boost, though, so we simply call read_cache_preload(&pathspecs) after parsing the pathspecs. Signed-off-by: Jeff Hostetler Signed-off-by: Johannes Schindelin --- builtin/add.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/builtin/add.c b/builtin/add.c index 7737ab878bfceb..554399e7416885 100644 --- a/builtin/add.c +++ b/builtin/add.c @@ -491,12 +491,16 @@ int cmd_add(int argc, (!(addremove || take_worktree_changes) ? 
ADD_CACHE_IGNORE_REMOVAL : 0)); + enable_fscache(1); if (repo_read_index_preload(repo, &pathspec, 0) < 0) die(_("index file corrupt")); die_in_unpopulated_submodule(repo->index, prefix); die_path_inside_submodule(repo->index, &pathspec); + /* We do not really re-read the index but update the up-to-date flags */ + preload_index(repo->index, &pathspec, 0); + if (add_new_files) { int baselen; @@ -609,5 +613,6 @@ int cmd_add(int argc, free(ps_matched); dir_clear(&dir); clear_pathspec(&pathspec); + enable_fscache(0); return exit_status; } From 7269bddea64dd7bc5be34c43104102851912a09d Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Wed, 1 Nov 2017 15:05:44 -0400 Subject: [PATCH 161/218] dir.c: make add_excludes aware of fscache during status Teach read_directory_recursive() and add_excludes() to be aware of optional fscache and avoid trying to open() and fstat() non-existent ".gitignore" files in every directory in the worktree. The current code in add_excludes() calls open() and then fstat() for a ".gitignore" file in each directory present in the worktree. When fscache is enabled, change that to call lstat() first and call open() only if the file is present. This seems backwards because lstat() needs to do more work than fstat(). But when fscache is enabled, fscache will already know if the .gitignore file exists and can completely avoid the IO calls. This works because of the lstat diversion to mingw_lstat when fscache is enabled. This reduced status times on a 350K file enlistment of the Windows repo on an NVMe SSD by 0.25 seconds.
Signed-off-by: Jeff Hostetler --- compat/win32/fscache.c | 5 +++++ compat/win32/fscache.h | 3 +++ dir.c | 39 ++++++++++++++++++++++++++++++--------- git-compat-util.h | 4 ++++ 4 files changed, 42 insertions(+), 9 deletions(-) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index f16a6fb07e360d..7d7699d3322471 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -12,6 +12,11 @@ static struct hashmap map; static CRITICAL_SECTION mutex; static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE); +int fscache_is_enabled(void) +{ + return enabled; +} + /* * An entry in the file system cache. Used for both entire directory listings * and file entries. diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h index ed518b422d705e..9a21fd5709c5bc 100644 --- a/compat/win32/fscache.h +++ b/compat/win32/fscache.h @@ -4,6 +4,9 @@ int fscache_enable(int enable); #define enable_fscache(x) fscache_enable(x) +int fscache_is_enabled(void); +#define is_fscache_enabled() (fscache_is_enabled()) + DIR *fscache_opendir(const char *dir); int fscache_lstat(const char *file_name, struct stat *buf); diff --git a/dir.c b/dir.c index 2f72dc865eea59..f4c3f87a2264e9 100644 --- a/dir.c +++ b/dir.c @@ -1156,16 +1156,37 @@ static int add_patterns(const char *fname, const char *base, int baselen, size_t size = 0; char *buf; - if (flags & PATTERN_NOFOLLOW) - fd = open_nofollow(fname, O_RDONLY); - else - fd = open(fname, O_RDONLY); - - if (fd < 0 || fstat(fd, &st) < 0) { - if (fd < 0) - warn_on_fopen_errors(fname); + /* + * Since `clang`'s `-Wunreachable-code` mode is clever, it would figure + * out that on non-Windows platforms, this `lstat()` is unreachable. + * We do want to keep the conditional block for the sake of Windows, + * though, so let's use the `NOT_CONSTANT()` trick to suppress that error. 
+ */ + if (NOT_CONSTANT(is_fscache_enabled())) { + if (lstat(fname, &st) < 0) { + fd = -1; + } else { + fd = open(fname, O_RDONLY); + if (fd < 0) + warn_on_fopen_errors(fname); + } + } else { + if (flags & PATTERN_NOFOLLOW) + fd = open_nofollow(fname, O_RDONLY); else - close(fd); + fd = open(fname, O_RDONLY); + + if (fd < 0 || fstat(fd, &st) < 0) { + if (fd < 0) + warn_on_fopen_errors(fname); + else { + close(fd); + fd = -1; + } + } + } + + if (fd < 0) { if (!istate) return -1; r = read_skip_worktree_file_from_index(istate, fname, diff --git a/git-compat-util.h b/git-compat-util.h index 8c53d79f8dca2f..b22d865bfdba1f 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1088,6 +1088,10 @@ static inline int is_missing_file_error(int errno_) #define enable_fscache(x) /* noop */ #endif +#ifndef is_fscache_enabled +#define is_fscache_enabled() (0) +#endif + int cmd_main(int, const char **); /* From 97e33df758ea47b3957ac2bc55a860e72208e97a Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Wed, 20 Dec 2017 10:43:41 -0500 Subject: [PATCH 162/218] fscache: make fscache_enabled() public Make the fscache_enabled() function public rather than static. Remove the unneeded fscache_is_enabled() function. Change the is_fscache_enabled() macro to call fscache_enabled(). is_fscache_enabled() now takes a pathname so that the answer is more precise and means "is fscache enabled for this pathname". Since fscache only stores repo-relative paths and not absolute paths, we can avoid attempting lookups for absolute paths.
Signed-off-by: Jeff Hostetler --- compat/win32/fscache.c | 7 +------ compat/win32/fscache.h | 4 ++-- dir.c | 2 +- git-compat-util.h | 2 +- 4 files changed, 5 insertions(+), 10 deletions(-) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 7d7699d3322471..ef0a2686a66fb9 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -12,11 +12,6 @@ static struct hashmap map; static CRITICAL_SECTION mutex; static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE); -int fscache_is_enabled(void) -{ - return enabled; -} - /* * An entry in the file system cache. Used for both entire directory listings * and file entries. @@ -280,7 +275,7 @@ static void fscache_clear(void) /* * Checks if the cache is enabled for the given path. */ -static inline int fscache_enabled(const char *path) +int fscache_enabled(const char *path) { return enabled > 0 && !is_absolute_path(path); } diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h index 9a21fd5709c5bc..660ada053b4309 100644 --- a/compat/win32/fscache.h +++ b/compat/win32/fscache.h @@ -4,8 +4,8 @@ int fscache_enable(int enable); #define enable_fscache(x) fscache_enable(x) -int fscache_is_enabled(void); -#define is_fscache_enabled() (fscache_is_enabled()) +int fscache_enabled(const char *path); +#define is_fscache_enabled(path) fscache_enabled(path) DIR *fscache_opendir(const char *dir); int fscache_lstat(const char *file_name, struct stat *buf); diff --git a/dir.c b/dir.c index f4c3f87a2264e9..3b0c686b08411b 100644 --- a/dir.c +++ b/dir.c @@ -1162,7 +1162,7 @@ static int add_patterns(const char *fname, const char *base, int baselen, * We do want to keep the conditional block for the sake of Windows, * though, so let's use the `NOT_CONSTANT()` trick to suppress that error. 
+ */ - if (NOT_CONSTANT(is_fscache_enabled())) { + if (NOT_CONSTANT(is_fscache_enabled(fname))) { if (lstat(fname, &st) < 0) { fd = -1; } else { diff --git a/git-compat-util.h b/git-compat-util.h index b22d865bfdba1f..b140500661f102 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1089,7 +1089,7 @@ static inline int is_missing_file_error(int errno_) #endif #ifndef is_fscache_enabled -#define is_fscache_enabled() (0) +#define is_fscache_enabled(path) (0) #endif int cmd_main(int, const char **); From 7057269e2334692c286c841f89c1d60c28dd23b7 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Wed, 20 Dec 2017 11:19:27 -0500 Subject: [PATCH 163/218] dir.c: regression fix for add_excludes with fscache Fix regression described in: https://github.com/git-for-windows/git/issues/1392 which was introduced in: https://github.com/git-for-windows/git/commit/b2353379bba414e6c00dde913497cc9c827366f2 Problem Symptoms ================ When the user has a .gitignore file that is a symlink, the fscache optimization introduced above caused the stat-data from the symlink, rather than that of the target file, to be returned. Later when the ignore file was read, the buffer length did not match the stat.st_size field and we called die("cannot use as an exclude file") Optimization Rationale ====================== The above optimization calls lstat() before open() primarily to ask fscache if the file exists. It gets the current stat-data as a side effect essentially for free (since we already have it in memory). If the file does not exist, it does not need to call open(). And since very few directories have .gitignore files, we can greatly reduce time spent in the filesystem. Discussion of Fix ================= The above optimization calls lstat() rather than stat() because the fscache only intercepts lstat() calls. Calls to stat() stay directed to mingw_stat(), completely bypassing fscache.
Furthermore, calls to mingw_stat() always call {open, fstat, close} so that symlinks are properly dereferenced, which adds *additional* open/close calls on top of what the original code in dir.c is doing. Since the problem only manifests for symlinks, we add code to overwrite the stat-data when the path is a symlink. This preserves the effect of the performance gains provided by the fscache in the normal case. Signed-off-by: Jeff Hostetler --- dir.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/dir.c b/dir.c index 3b0c686b08411b..11ea2ce044096c 100644 --- a/dir.c +++ b/dir.c @@ -1157,6 +1157,28 @@ static int add_patterns(const char *fname, const char *base, int baselen, char *buf; /* + * A performance optimization for status. + * + * During a status scan, git looks in each directory for a .gitignore + * file before scanning the directory. Since .gitignore files are not + * that common, we can waste a lot of time looking for files that are + * not there. Fortunately, the fscache already knows if the directory + * contains a .gitignore file, since it has already read the directory + * and it already has the stat-data. + * + * If the fscache is enabled, use the fscache-lstat() interlude to see + * if the file exists (in the fscache hash maps) before trying to open() + * it. + * + * This causes problem when the .gitignore file is a symlink, because + * we call lstat() rather than stat() on the symlnk and the resulting + * stat-data is for the symlink itself rather than the target file. + * We CANNOT use stat() here because the fscache DOES NOT install an + * interlude for stat() and mingw_stat() always calls "open-fstat-close" + * on the file and defeats the purpose of the optimization here. Since + * symlinks are even more rare than .gitignore files, we force a fstat() + * after our open() to get stat-data for the target file. 
+ * * Since `clang`'s `-Wunreachable-code` mode is clever, it would figure * out that on non-Windows platforms, this `lstat()` is unreachable. * We do want to keep the conditional block for the sake of Windows, @@ -1169,6 +1191,11 @@ static int add_patterns(const char *fname, const char *base, int baselen, fd = open(fname, O_RDONLY); if (fd < 0) warn_on_fopen_errors(fname); + else if (S_ISLNK(st.st_mode) && fstat(fd, &st) < 0) { + warn_on_fopen_errors(fname); + close(fd); + fd = -1; + } } } else { if (flags & PATTERN_NOFOLLOW) From a3ec13a81115a5dafed107f5406549f7f35b25be Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 20 Sep 2017 21:52:28 +0200 Subject: [PATCH 164/218] git-gui--askyesno: fix funny text wrapping The text wrapping seems to be aligned to the right side of the Yes button, leaving an awful lot of empty space. Let's try to counter this by using pixel units. Signed-off-by: Johannes Schindelin --- git-gui/git-gui--askyesno.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/git-gui/git-gui--askyesno.sh b/git-gui/git-gui--askyesno.sh index 142d1bc3de229b..837281fe337b6f 100755 --- a/git-gui/git-gui--askyesno.sh +++ b/git-gui/git-gui--askyesno.sh @@ -29,8 +29,8 @@ if {$argc < 1} { } ${NS}::frame .t -${NS}::label .t.m -text $prompt -justify center -width 40 -.t.m configure -wraplength 400 +${NS}::label .t.m -text $prompt -justify center -width 400px +.t.m configure -wraplength 400px pack .t.m -side top -fill x -padx 20 -pady 20 -expand 1 pack .t -side top -fill x -ipadx 20 -ipady 20 -expand 1 From 53b726f1e80b4a97a7a4fa29144786cec6e46863 Mon Sep 17 00:00:00 2001 From: Takuto Ikuta Date: Wed, 22 Nov 2017 20:39:38 +0900 Subject: [PATCH 165/218] fetch-pack.c: enable fscache for stats under .git/objects When I do git fetch, git call file stats under .git/objects for each refs. This takes time when there are many refs. 
By enabling fscache, git takes file stats by directory traversing and that improved the speed of fetch-pack for repository having large number of refs. In my windows workstation, this improves the time of `git fetch` for chromium repository like below. I took stats 3 times. * With this patch TotalSeconds: 9.9825165 TotalSeconds: 9.1862075 TotalSeconds: 10.1956256 Avg: 9.78811653333333 * Without this patch TotalSeconds: 15.8406702 TotalSeconds: 15.6248053 TotalSeconds: 15.2085938 Avg: 15.5580231 Signed-off-by: Takuto Ikuta Signed-off-by: Johannes Schindelin --- fetch-pack.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fetch-pack.c b/fetch-pack.c index c8fa0a609ac6de..1221db1fb12b9c 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -763,6 +763,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, save_commit_buffer = 0; trace2_region_enter("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL); + enable_fscache(1); for (ref = *refs; ref; ref = ref->next) { struct commit *commit; @@ -787,6 +788,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, if (!cutoff || cutoff < commit->date) cutoff = commit->date; } + enable_fscache(0); trace2_region_leave("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL); /* From 7ac68eddb47a771774cc57af400f3d8a882d7891 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 20 Sep 2017 21:55:45 +0200 Subject: [PATCH 166/218] git-gui--askyesno (mingw): use Git for Windows' icon, if available For additional GUI goodness. 
Signed-off-by: Johannes Schindelin --- git-gui/git-gui--askyesno.sh | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/git-gui/git-gui--askyesno.sh b/git-gui/git-gui--askyesno.sh index 837281fe337b6f..e431f86a8e16ae 100755 --- a/git-gui/git-gui--askyesno.sh +++ b/git-gui/git-gui--askyesno.sh @@ -59,5 +59,17 @@ if {$::tcl_platform(platform) eq {windows}} { } } +if {$::tcl_platform(platform) eq {windows}} { + set icopath [file dirname [file normalize $argv0]] + if {[file tail $icopath] eq {git-core}} { + set icopath [file dirname $icopath] + } + set icopath [file dirname $icopath] + set icopath [file join $icopath share git git-for-windows.ico] + if {[file exists $icopath]} { + wm iconbitmap . -default $icopath + } +} + wm title . $title tk::PlaceWindow . From 4a65108cf68a6878115487c79d76949ecde1eb30 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 20 Jul 2017 22:45:01 +0200 Subject: [PATCH 167/218] mingw: explicitly specify with which cmd to prefix the cmdline The main idea of this patch is that even if we have to look up the absolute path of the script, if only the basename was specified as argv[0], then we should use that basename on the command line, too, not the absolute path. This patch will also help with the upcoming patch where we automatically substitute "sh ..." by "busybox sh ..." if "sh" is not in the PATH but "busybox" is: we will do that by substituting the actual executable, but still keep prepending "sh" to the command line. 
Signed-off-by: Johannes Schindelin --- compat/mingw.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 76481ae864203c..00aaee83d20590 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2157,8 +2157,8 @@ static int is_msys2_sh(const char *cmd) } static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaenv, - const char *dir, - int prepend_cmd, int fhin, int fhout, int fherr) + const char *dir, const char *prepend_cmd, + int fhin, int fhout, int fherr) { STARTUPINFOEXW si; PROCESS_INFORMATION pi; @@ -2238,9 +2238,9 @@ static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaen /* concatenate argv, quoting args as we go */ strbuf_init(&args, 0); if (prepend_cmd) { - char *quoted = (char *)quote_arg(cmd); + char *quoted = (char *)quote_arg(prepend_cmd); strbuf_addstr(&args, quoted); - if (quoted != cmd) + if (quoted != prepend_cmd) free(quoted); } for (; *argv; argv++) { @@ -2360,7 +2360,8 @@ static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaen return (pid_t)pi.dwProcessId; } -static pid_t mingw_spawnv(const char *cmd, const char **argv, int prepend_cmd) +static pid_t mingw_spawnv(const char *cmd, const char **argv, + const char *prepend_cmd) { return mingw_spawnve_fd(cmd, argv, NULL, NULL, prepend_cmd, 0, 1, 2); } @@ -2388,14 +2389,14 @@ pid_t mingw_spawnvpe(const char *cmd, const char **argv, char **deltaenv, pid = -1; } else { - pid = mingw_spawnve_fd(iprog, argv, deltaenv, dir, 1, + pid = mingw_spawnve_fd(iprog, argv, deltaenv, dir, interpr, fhin, fhout, fherr); free(iprog); } argv[0] = argv0; } else - pid = mingw_spawnve_fd(prog, argv, deltaenv, dir, 0, + pid = mingw_spawnve_fd(prog, argv, deltaenv, dir, NULL, fhin, fhout, fherr); free(prog); } @@ -2420,7 +2421,7 @@ static int try_shell_exec(const char *cmd, char *const *argv) argv2[0] = (char *)cmd; /* full path to the script file */ COPY_ARRAY(&argv2[1], &argv[1], 
argc); exec_id = trace2_exec(prog, (const char **)argv2); - pid = mingw_spawnv(prog, (const char **)argv2, 1); + pid = mingw_spawnv(prog, (const char **)argv2, interpr); if (pid >= 0) { int status; if (waitpid(pid, &status, 0) < 0) @@ -2444,7 +2445,7 @@ int mingw_execv(const char *cmd, char *const *argv) int exec_id; exec_id = trace2_exec(cmd, (const char **)argv); - pid = mingw_spawnv(cmd, (const char **)argv, 0); + pid = mingw_spawnv(cmd, (const char **)argv, NULL); if (pid < 0) { trace2_exec_result(exec_id, -1); return -1; From d3dd725ca478f27f563da6efd00c8a7c0effa4bc Mon Sep 17 00:00:00 2001 From: Takuto Ikuta Date: Tue, 30 Jan 2018 22:42:58 +0900 Subject: [PATCH 168/218] checkout.c: enable fscache for checkout again This is a retry of #1419. I added a flush_fscache macro to flush cached stats after writing to disk, with tests for the regressions reported in #1438 and #1442. git checkout checks each file path in sorted order, so cache flushing does not make performance worse unless we have a large number of modified files in a directory containing many files. Using the chromium repository, I tested `git checkout .` performance after deleting 10 files in different directories. With this patch: TotalSeconds: 4.307272 TotalSeconds: 4.4863595 TotalSeconds: 4.2975562 Avg: 4.36372923333333 Without this patch: TotalSeconds: 20.9705431 TotalSeconds: 22.4867685 TotalSeconds: 18.8968292 Avg: 20.7847136 I confirmed this patch passed all tests in t/ with core_fscache=1.
Signed-off-by: Takuto Ikuta --- builtin/checkout.c | 2 ++ compat/win32/fscache.c | 12 ++++++++++++ compat/win32/fscache.h | 3 +++ entry.c | 3 +++ git-compat-util.h | 4 ++++ parallel-checkout.c | 1 + t/t7201-co.sh | 36 ++++++++++++++++++++++++++++++++++++ 7 files changed, 61 insertions(+) diff --git a/builtin/checkout.c b/builtin/checkout.c index e031e6188613a6..00306a4593013f 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -401,6 +401,7 @@ static int checkout_worktree(const struct checkout_opts *opts, if (pc_workers > 1) init_parallel_checkout(); + enable_fscache(1); for (pos = 0; pos < the_repository->index->cache_nr; pos++) { struct cache_entry *ce = the_repository->index->cache[pos]; if (ce->ce_flags & CE_MATCHED) { @@ -426,6 +427,7 @@ static int checkout_worktree(const struct checkout_opts *opts, errs |= run_parallel_checkout(&state, pc_workers, pc_threshold, NULL, NULL); mem_pool_discard(&ce_mem_pool, should_validate_cache_entries()); + enable_fscache(0); remove_marked_cache_entries(the_repository->index, 1); remove_scheduled_dirs(); errs |= finish_delayed_checkout(&state, opts->show_progress); diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index ef0a2686a66fb9..a91482d19cb1a4 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -435,6 +435,18 @@ int fscache_enable(int enable) return result; } +/* + * Flush cached stats result when fscache is enabled. + */ +void fscache_flush(void) +{ + if (enabled) { + EnterCriticalSection(&mutex); + fscache_clear(); + LeaveCriticalSection(&mutex); + } +} + /* * Lstat replacement, uses the cache if enabled, otherwise redirects to * mingw_lstat. 
diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h index 660ada053b4309..2f06f8df97dcd0 100644 --- a/compat/win32/fscache.h +++ b/compat/win32/fscache.h @@ -7,6 +7,9 @@ int fscache_enable(int enable); int fscache_enabled(const char *path); #define is_fscache_enabled(path) fscache_enabled(path) +void fscache_flush(void); +#define flush_fscache() fscache_flush() + DIR *fscache_opendir(const char *dir); int fscache_lstat(const char *file_name, struct stat *buf); diff --git a/entry.c b/entry.c index 5982d838e3a487..c14ab29f3ad315 100644 --- a/entry.c +++ b/entry.c @@ -424,6 +424,9 @@ static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca } finish: + /* Flush cached lstat in fscache after writing to disk. */ + flush_fscache(); + if (state->refresh_cache) { if (!fstat_done && lstat(ce->name, &st) < 0) return error_errno("unable to stat just-written file %s", diff --git a/git-compat-util.h b/git-compat-util.h index b140500661f102..69f86792b69a7a 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1092,6 +1092,10 @@ static inline int is_missing_file_error(int errno_) #define is_fscache_enabled(path) (0) #endif +#ifndef flush_fscache +#define flush_fscache() /* noop */ +#endif + int cmd_main(int, const char **); /* diff --git a/parallel-checkout.c b/parallel-checkout.c index a6d07dcb1805b9..1eb277a0fc0a55 100644 --- a/parallel-checkout.c +++ b/parallel-checkout.c @@ -647,6 +647,7 @@ static void write_items_sequentially(struct checkout *state) { size_t i; + flush_fscache(); for (i = 0; i < parallel_checkout.nr; i++) { struct parallel_checkout_item *pc_item = &parallel_checkout.items[i]; write_pc_item(pc_item, state); diff --git a/t/t7201-co.sh b/t/t7201-co.sh index 9bcf7c0b40461f..545f388c44a515 100755 --- a/t/t7201-co.sh +++ b/t/t7201-co.sh @@ -35,6 +35,42 @@ fill () { } +test_expect_success MINGW 'fscache flush cache' ' + + git init fscache-test && + cd fscache-test && + git config core.fscache 1 && + echo A > test.txt && + git add
test.txt && + git commit -m A && + echo B >> test.txt && + git checkout . && + test -z "$(git status -s)" && + echo A > expect.txt && + test_cmp expect.txt test.txt && + cd .. && + rm -rf fscache-test +' + +test_expect_success MINGW 'fscache flush cache dir' ' + + git init fscache-test && + cd fscache-test && + git config core.fscache 1 && + echo A > test.txt && + git add test.txt && + git commit -m A && + rm test.txt && + mkdir test.txt && + touch test.txt/test.txt && + git checkout . && + test -z "$(git status -s)" && + echo A > expect.txt && + test_cmp expect.txt test.txt && + cd .. && + rm -rf fscache-test +' + test_expect_success setup ' fill x y z >same && fill 1 2 3 4 5 6 7 8 >one && From 276b42986bb5800bb77309d0339b5ad3f84636bf Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 20 Jul 2017 20:41:29 +0200 Subject: [PATCH 169/218] mingw: when path_lookup() failed, try BusyBox BusyBox comes with a ton of applets ("applet" being the identical concept to Git's "builtins"). And similar to Git's builtins, the applets can be called via `busybox `, or the BusyBox executable can be copied/hard-linked to the command name. The similarities do not end here. Just as with Git's builtins, it is problematic that BusyBox' hard-linked applets cannot easily be put into a .zip file: .zip archives have no concept of hard-links and therefore would store identical copies (and also extract identical copies, "inflating" the archive unnecessarily). To counteract that issue, MinGit already ships without hard-linked copies of the builtins, and the plan is to do the same with BusyBox' applets: simply ship busybox.exe as single executable, without hard-linked applets. 
To accommodate that, Git is being taught by this commit a very special trick, exploiting the fact that it is possible to call an executable with a command-line whose argv[0] is different from the executable's name: when `sh` is to be spawned, and no `sh` is found in the PATH, but busybox.exe is, use that executable (with unchanged argv). Likewise, if any executable to be spawned is not on the PATH, but busybox.exe is found, parse the output of `busybox.exe --help` to find out what applets are included, and if the command matches an included applet name, use busybox.exe to execute it. Signed-off-by: Johannes Schindelin --- compat/mingw.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++ t/t0014-alias.sh | 2 +- 2 files changed, 64 insertions(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index 00aaee83d20590..5d9227cf5bae0a 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -12,6 +12,7 @@ #include "repository.h" #include "run-command.h" #include "strbuf.h" +#include "string-list.h" #include "symlinks.h" #include "trace2.h" #include "win32.h" @@ -1926,6 +1927,65 @@ static char *lookup_prog(const char *dir, int dirlen, const char *cmd, return NULL; } +static char *path_lookup(const char *cmd, int exe_only); + +static char *is_busybox_applet(const char *cmd) +{ + static struct string_list applets = STRING_LIST_INIT_DUP; + static char *busybox_path; + static int busybox_path_initialized; + + /* Avoid infinite loop */ + if (!strncasecmp(cmd, "busybox", 7) && + (!cmd[7] || !strcasecmp(cmd + 7, ".exe"))) + return NULL; + + if (!busybox_path_initialized) { + busybox_path = path_lookup("busybox.exe", 1); + busybox_path_initialized = 1; + } + + /* Assume that sh is compiled in... 
*/ + if (!busybox_path || !strcasecmp(cmd, "sh")) + return xstrdup_or_null(busybox_path); + + if (!applets.nr) { + struct child_process cp = CHILD_PROCESS_INIT; + struct strbuf buf = STRBUF_INIT; + char *p; + + strvec_pushl(&cp.args, busybox_path, "--help", NULL); + + if (capture_command(&cp, &buf, 2048)) { + string_list_append(&applets, ""); + return NULL; + } + + /* parse output */ + p = strstr(buf.buf, "Currently defined functions:\n"); + if (!p) { + warning("Could not parse output of busybox --help"); + string_list_append(&applets, ""); + return NULL; + } + p = strchrnul(p, '\n'); + for (;;) { + size_t len; + + p += strspn(p, "\n\t ,"); + len = strcspn(p, "\n\t ,"); + if (!len) + break; + p[len] = '\0'; + string_list_insert(&applets, p); + p = p + len + 1; + } + } + + return string_list_has_string(&applets, cmd) ? + xstrdup(busybox_path) : NULL; +} + /* * Determines the absolute path of cmd using the split path in path. * If cmd contains a slash or backslash, no lookup is performed. @@ -1954,6 +2014,9 @@ static char *path_lookup(const char *cmd, int exe_only) path = sep + 1; } + if (!prog && !isexe) + prog = is_busybox_applet(cmd); + return prog; } diff --git a/t/t0014-alias.sh b/t/t0014-alias.sh index 156265d8d07cc8..6df1890d4b23e4 100755 --- a/t/t0014-alias.sh +++ b/t/t0014-alias.sh @@ -53,7 +53,7 @@ test_expect_success 'looping aliases - deprecated builtins' ' test_expect_success 'run-command formats empty args properly' ' test_must_fail env GIT_TRACE=1 git frotz a "" b " " c 2>actual.raw && - sed -ne "/run_command:/s/.*trace: run_command: //p" actual.raw >actual && + sed -ne "/run_command: git-frotz/s/.*trace: run_command: //p" actual.raw >actual && echo "git-frotz a '\'''\'' b '\'' '\'' c" >expect && test_cmp expect actual ' From 294bae6586aabc092ce2a8ff56fae9692cf09895 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Fri, 7 Sep 2018 11:39:57 -0400 Subject: [PATCH 170/218] Enable the filesystem cache (fscache) in refresh_index(). 
On file systems that support it, this can dramatically speed up operations like add, commit, describe, rebase, reset, rm that would otherwise have to lstat() every file to "re-match" the stat information in the index to that of the file system. On a synthetic repo with 1M files, "git reset" dropped from 52.02 seconds to 14.42 seconds for a savings of 72%. Signed-off-by: Ben Peart --- read-cache.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/read-cache.c b/read-cache.c index 38a04b8de3d7fb..27fc8af9ce92fe 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1515,6 +1515,7 @@ int refresh_index(struct index_state *istate, unsigned int flags, typechange_fmt = in_porcelain ? "T\t%s\n" : "%s: needs update\n"; added_fmt = in_porcelain ? "A\t%s\n" : "%s: needs update\n"; unmerged_fmt = in_porcelain ? "U\t%s\n" : "%s: needs merge\n"; + enable_fscache(1); /* * Use the multi-threaded preload_index() to refresh most of the * cache entries quickly then in the single threaded loop below, @@ -1609,6 +1610,7 @@ int refresh_index(struct index_state *istate, unsigned int flags, display_progress(progress, istate->cache_nr); stop_progress(&progress); trace_performance_leave("refresh index"); + enable_fscache(0); return has_errors; } From afeaca3db4b59150887bcea7a80ad8c8724c2fcc Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 20 Jul 2017 22:18:56 +0200 Subject: [PATCH 171/218] test-tool: learn to act as a drop-in replacement for `iconv` It is convenient to assume that everybody who wants to build & test Git has access to a working `iconv` executable (after all, we already pretty much require libiconv). However, that limits esoteric test scenarios such as Git for Windows', where an end user installation has to ship with `iconv` for the sole purpose of being testable. That payload serves no other purpose. So let's just have a test helper (to be able to test Git, the test helpers have to be available, after all) to act as `iconv` replacement. 
Signed-off-by: Johannes Schindelin --- Makefile | 1 + t/helper/meson.build | 1 + t/helper/test-iconv.c | 47 +++++++++++++++++++++++++++++++++++++++++++ t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 + 5 files changed, 51 insertions(+) create mode 100644 t/helper/test-iconv.c diff --git a/Makefile b/Makefile index 8c381f8ccae9e5..e7e400ab23b4e0 100644 --- a/Makefile +++ b/Makefile @@ -837,6 +837,7 @@ TEST_BUILTINS_OBJS += test-hash-speed.o TEST_BUILTINS_OBJS += test-hash.o TEST_BUILTINS_OBJS += test-hashmap.o TEST_BUILTINS_OBJS += test-hexdump.o +TEST_BUILTINS_OBJS += test-iconv.o TEST_BUILTINS_OBJS += test-json-writer.o TEST_BUILTINS_OBJS += test-lazy-init-name-hash.o TEST_BUILTINS_OBJS += test-match-trees.o diff --git a/t/helper/meson.build b/t/helper/meson.build index 675e64c0101b61..cba4a9bf4f1434 100644 --- a/t/helper/meson.build +++ b/t/helper/meson.build @@ -29,6 +29,7 @@ test_tool_sources = [ 'test-hash.c', 'test-hashmap.c', 'test-hexdump.c', + 'test-iconv.c', 'test-json-writer.c', 'test-lazy-init-name-hash.c', 'test-match-trees.c', diff --git a/t/helper/test-iconv.c b/t/helper/test-iconv.c new file mode 100644 index 00000000000000..d3c772fddf990b --- /dev/null +++ b/t/helper/test-iconv.c @@ -0,0 +1,47 @@ +#include "test-tool.h" +#include "git-compat-util.h" +#include "strbuf.h" +#include "gettext.h" +#include "parse-options.h" +#include "utf8.h" + +int cmd__iconv(int argc, const char **argv) +{ + struct strbuf buf = STRBUF_INIT; + char *from = NULL, *to = NULL, *p; + size_t len; + int ret = 0; + const char * const iconv_usage[] = { + N_("test-helper --iconv []"), + NULL + }; + struct option options[] = { + OPT_STRING('f', "from-code", &from, "encoding", "from"), + OPT_STRING('t', "to-code", &to, "encoding", "to"), + OPT_END() + }; + + argc = parse_options(argc, argv, NULL, options, + iconv_usage, 0); + + if (argc > 1 || !from || !to) + usage_with_options(iconv_usage, options); + + if (!argc) { + if (strbuf_read(&buf, 0, 2048) < 0) + die_errno("Could 
not read from stdin"); + } else if (strbuf_read_file(&buf, argv[0], 2048) < 0) + die_errno("Could not read from '%s'", argv[0]); + + p = reencode_string_len(buf.buf, buf.len, to, from, &len); + if (!p) + die_errno("Could not reencode"); + if (write(1, p, len) < 0) + ret = !!error_errno("Could not write %"PRIuMAX" bytes", + (uintmax_t)len); + + strbuf_release(&buf); + free(p); + + return ret; +} diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c index a7abc618b3887e..9d1b41c8e39b89 100644 --- a/t/helper/test-tool.c +++ b/t/helper/test-tool.c @@ -39,6 +39,7 @@ static struct test_cmd cmds[] = { { "hashmap", cmd__hashmap }, { "hash-speed", cmd__hash_speed }, { "hexdump", cmd__hexdump }, + { "iconv", cmd__iconv }, { "json-writer", cmd__json_writer }, { "lazy-init-name-hash", cmd__lazy_init_name_hash }, { "match-trees", cmd__match_trees }, diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h index 7f150fa1eb9ad2..e18e5a9ed9de81 100644 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@ -32,6 +32,7 @@ int cmd__getcwd(int argc, const char **argv); int cmd__hashmap(int argc, const char **argv); int cmd__hash_speed(int argc, const char **argv); int cmd__hexdump(int argc, const char **argv); +int cmd__iconv(int argc, const char **argv); int cmd__json_writer(int argc, const char **argv); int cmd__lazy_init_name_hash(int argc, const char **argv); int cmd__match_trees(int argc, const char **argv); From 91321e95fb48e9f03f4280bc1717de491dbf6984 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Thu, 4 Oct 2018 18:10:21 -0400 Subject: [PATCH 172/218] mem_pool: add GIT_TRACE_MEMPOOL support Add tracing around initializing and discarding mempools. In discard report on the amount of memory unused in the current block to help tune setting the initial_size. 
Signed-off-by: Ben Peart --- mem-pool.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/mem-pool.c b/mem-pool.c index 8bc77cb0e80a35..89bca70f713692 100644 --- a/mem-pool.c +++ b/mem-pool.c @@ -7,7 +7,9 @@ #include "git-compat-util.h" #include "mem-pool.h" #include "gettext.h" +#include "trace.h" +static struct trace_key trace_mem_pool = TRACE_KEY_INIT(MEMPOOL); #define BLOCK_GROWTH_SIZE (1024 * 1024 - sizeof(struct mp_block)) /* @@ -65,12 +67,20 @@ void mem_pool_init(struct mem_pool *pool, size_t initial_size) if (initial_size > 0) mem_pool_alloc_block(pool, initial_size, NULL); + + trace_printf_key(&trace_mem_pool, + "mem_pool (%p): init (%"PRIuMAX") initial size\n", + (void *)pool, (uintmax_t)initial_size); } void mem_pool_discard(struct mem_pool *pool, int invalidate_memory) { struct mp_block *block, *block_to_free; + trace_printf_key(&trace_mem_pool, + "mem_pool (%p): discard (%"PRIuMAX") unused\n", + (void *)pool, + (uintmax_t)(pool->mp_block->end - pool->mp_block->next_free)); block = pool->mp_block; while (block) { From 9de02ee87b61261a33fd0cdf1b8927538f5dfead Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Tue, 23 Oct 2018 11:42:06 -0400 Subject: [PATCH 173/218] fscache: use FindFirstFileExW to avoid retrieving the short name Use FindFirstFileExW with FindExInfoBasic to avoid forcing NTFS to look up the short name. Also switch to a larger (64K vs 4K) buffer using FIND_FIRST_EX_LARGE_FETCH to minimize round trips to the kernel. In a repo with ~200K files, this drops warm cache status times from 3.19 seconds to 2.67 seconds for a 16% savings. 
Signed-off-by: Ben Peart --- compat/win32/fscache.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index a91482d19cb1a4..a414794ea6a275 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -215,7 +215,8 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir, pattern[wlen] = 0; /* open find handle */ - h = FindFirstFileW(pattern, &fdata); + h = FindFirstFileExW(pattern, FindExInfoBasic, &fdata, FindExSearchNameMatch, + NULL, FIND_FIRST_EX_LARGE_FETCH); if (h == INVALID_HANDLE_VALUE) { err = GetLastError(); *dir_not_found = 1; /* or empty directory */ From c760849bdb5781dfb2e2fcbae426ac1513955d38 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 20 Jul 2017 22:25:21 +0200 Subject: [PATCH 174/218] tests(mingw): if `iconv` is unavailable, use `test-helper --iconv` Signed-off-by: Johannes Schindelin --- t/test-lib.sh | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/t/test-lib.sh b/t/test-lib.sh index 70fd3e9bafb800..c4639e1f600ed1 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1682,6 +1682,12 @@ Darwin) test_set_prereq GREP_STRIPS_CR test_set_prereq WINDOWS GIT_TEST_CMP="GIT_DIR=/dev/null git diff --no-index --ignore-cr-at-eol --" + if ! type iconv >/dev/null 2>&1 + then + iconv () { + test-tool iconv "$@" + } + fi ;; *CYGWIN*) test_set_prereq POSIXPERM From d2182614c012488f85f75479e04f6c712f1e2fec Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Fri, 2 Nov 2018 11:19:10 -0400 Subject: [PATCH 175/218] fscache: fscache takes an initial size Update enable_fscache() to take an optional initial size parameter which is used to initialize the hashmap so that it can avoid having to rehash as additional entries are added. Add a separate disable_fscache() macro to make the code clearer and easier to read. 
Signed-off-by: Ben Peart Signed-off-by: Johannes Schindelin --- builtin/add.c | 4 ++-- builtin/checkout.c | 4 ++-- builtin/commit.c | 4 ++-- compat/win32/fscache.c | 8 ++++++-- compat/win32/fscache.h | 5 +++-- fetch-pack.c | 4 ++-- git-compat-util.h | 4 ++++ preload-index.c | 4 ++-- read-cache.c | 4 ++-- unpack-trees.c | 2 +- 10 files changed, 26 insertions(+), 17 deletions(-) diff --git a/builtin/add.c b/builtin/add.c index 554399e7416885..0b510ee1732c80 100644 --- a/builtin/add.c +++ b/builtin/add.c @@ -491,7 +491,7 @@ int cmd_add(int argc, (!(addremove || take_worktree_changes) ? ADD_CACHE_IGNORE_REMOVAL : 0)); - enable_fscache(1); + enable_fscache(0); if (repo_read_index_preload(repo, &pathspec, 0) < 0) die(_("index file corrupt")); @@ -613,6 +613,6 @@ int cmd_add(int argc, free(ps_matched); dir_clear(&dir); clear_pathspec(&pathspec); - enable_fscache(0); + disable_fscache(); return exit_status; } diff --git a/builtin/checkout.c b/builtin/checkout.c index 00306a4593013f..c46405c0e39ed1 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -401,7 +401,7 @@ static int checkout_worktree(const struct checkout_opts *opts, if (pc_workers > 1) init_parallel_checkout(); - enable_fscache(1); + enable_fscache(the_repository->index->cache_nr); for (pos = 0; pos < the_repository->index->cache_nr; pos++) { struct cache_entry *ce = the_repository->index->cache[pos]; if (ce->ce_flags & CE_MATCHED) { @@ -427,7 +427,7 @@ static int checkout_worktree(const struct checkout_opts *opts, errs |= run_parallel_checkout(&state, pc_workers, pc_threshold, NULL, NULL); mem_pool_discard(&ce_mem_pool, should_validate_cache_entries()); - enable_fscache(0); + disable_fscache(); remove_marked_cache_entries(the_repository->index, 1); remove_scheduled_dirs(); errs |= finish_delayed_checkout(&state, opts->show_progress); diff --git a/builtin/commit.c b/builtin/commit.c index 98751de3f2ea68..04adf2ef9dd801 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -1623,7 +1623,7 @@ struct 
repository *repo UNUSED) PATHSPEC_PREFER_FULL, prefix, argv); - enable_fscache(1); + enable_fscache(0); if (status_format != STATUS_FORMAT_PORCELAIN && status_format != STATUS_FORMAT_PORCELAIN_V2) progress_flag = REFRESH_PROGRESS; @@ -1664,7 +1664,7 @@ struct repository *repo UNUSED) wt_status_print(&s); wt_status_collect_free_buffers(&s); - enable_fscache(0); + disable_fscache(); return 0; } diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index ad9fdd36964d93..db37cc930a7ab6 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -412,7 +412,7 @@ static struct fsentry *fscache_get(struct fsentry *key) * Enables or disables the cache. Note that the cache is read-only, changes to * the working directory are NOT reflected in the cache while enabled. */ -int fscache_enable(int enable) +int fscache_enable(int enable, size_t initial_size) { int result; @@ -428,7 +428,11 @@ int fscache_enable(int enable) InitializeCriticalSection(&mutex); lstat_requests = opendir_requests = 0; fscache_misses = fscache_requests = 0; - hashmap_init(&map, (hashmap_cmp_fn) fsentry_cmp, NULL, 0); + /* + * avoid having to rehash by leaving room for the parent dirs. 
+ * '4' was determined empirically by testing several repos + */ + hashmap_init(&map, (hashmap_cmp_fn) fsentry_cmp, NULL, initial_size * 4); initialized = 1; } diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h index 2f06f8df97dcd0..d49c9381114da6 100644 --- a/compat/win32/fscache.h +++ b/compat/win32/fscache.h @@ -1,8 +1,9 @@ #ifndef FSCACHE_H #define FSCACHE_H -int fscache_enable(int enable); -#define enable_fscache(x) fscache_enable(x) +int fscache_enable(int enable, size_t initial_size); +#define enable_fscache(initial_size) fscache_enable(1, initial_size) +#define disable_fscache() fscache_enable(0, 0) int fscache_enabled(const char *path); #define is_fscache_enabled(path) fscache_enabled(path) diff --git a/fetch-pack.c b/fetch-pack.c index 1221db1fb12b9c..1cd5b7d9c9f936 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -763,7 +763,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, save_commit_buffer = 0; trace2_region_enter("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL); - enable_fscache(1); + enable_fscache(0); for (ref = *refs; ref; ref = ref->next) { struct commit *commit; @@ -788,7 +788,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, if (!cutoff || cutoff < commit->date) cutoff = commit->date; } - enable_fscache(0); + disable_fscache(); trace2_region_leave("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL); /* diff --git a/git-compat-util.h b/git-compat-util.h index 69f86792b69a7a..6d737405299340 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1088,6 +1088,10 @@ static inline int is_missing_file_error(int errno_) #define enable_fscache(x) /* noop */ #endif +#ifndef disable_fscache +#define disable_fscache() /* noop */ +#endif + #ifndef is_fscache_enabled #define is_fscache_enabled(path) (0) #endif diff --git a/preload-index.c b/preload-index.c index 61e8f3a1f6ec84..e466fef15bcd79 100644 --- a/preload-index.c +++ b/preload-index.c @@ -141,7 +141,7 @@ void 
preload_index(struct index_state *index, pthread_mutex_init(&pd.mutex, NULL); } - enable_fscache(1); + enable_fscache(index->cache_nr); for (i = 0; i < threads; i++) { struct thread_data *p = data+i; int err; @@ -178,7 +178,7 @@ void preload_index(struct index_state *index, trace2_data_intmax("index", NULL, "preload/sum_lstat", t2_sum_lstat); trace2_region_leave("index", "preload", NULL); - enable_fscache(0); + disable_fscache(); } int repo_read_index_preload(struct repository *repo, diff --git a/read-cache.c b/read-cache.c index 27fc8af9ce92fe..3274f55ca3719d 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1515,7 +1515,7 @@ int refresh_index(struct index_state *istate, unsigned int flags, typechange_fmt = in_porcelain ? "T\t%s\n" : "%s: needs update\n"; added_fmt = in_porcelain ? "A\t%s\n" : "%s: needs update\n"; unmerged_fmt = in_porcelain ? "U\t%s\n" : "%s: needs merge\n"; - enable_fscache(1); + enable_fscache(0); /* * Use the multi-threaded preload_index() to refresh most of the * cache entries quickly then in the single threaded loop below, @@ -1610,7 +1610,7 @@ int refresh_index(struct index_state *istate, unsigned int flags, display_progress(progress, istate->cache_nr); stop_progress(&progress); trace_performance_leave("refresh index"); - enable_fscache(0); + disable_fscache(); return has_errors; } diff --git a/unpack-trees.c b/unpack-trees.c index 2acfe71899b7e6..ff2f0fe57d5359 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1825,7 +1825,7 @@ static void mark_new_skip_worktree(struct pattern_list *pl, */ enable_fscache(istate->cache_nr); clear_ce_flags(istate, select_flag, skip_wt_flag, pl, show_progress); - enable_fscache(0); + disable_fscache(); } static void populate_from_existing_patterns(struct unpack_trees_options *o, From 16f8f997bfa2f30de81d13e437bf705418476495 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Thu, 4 Oct 2018 18:10:21 -0400 Subject: [PATCH 176/218] fscache: add GIT_TEST_FSCACHE support Add support to fscache to enable running the 
entire test suite with the fscache enabled. Signed-off-by: Ben Peart --- compat/win32/fscache.c | 5 +++++ t/README | 3 +++ 2 files changed, 8 insertions(+) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index a414794ea6a275..935e36e5955c98 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -5,6 +5,7 @@ #include "../../dir.h" #include "../../abspath.h" #include "../../trace.h" +#include "config.h" static int initialized; static volatile long enabled; @@ -408,7 +409,11 @@ int fscache_enable(int enable) int result; if (!initialized) { + int fscache = git_env_bool("GIT_TEST_FSCACHE", -1); + /* allow the cache to be disabled entirely */ + if (fscache != -1) + core_fscache = fscache; if (!core_fscache) return 0; diff --git a/t/README b/t/README index adbbd9acf4ab27..f19468151410eb 100644 --- a/t/README +++ b/t/README @@ -479,6 +479,9 @@ GIT_TEST_NAME_HASH_VERSION=, when set, causes 'git pack-objects' to assume '--name-hash-version='. +GIT_TEST_FSCACHE= exercises the uncommon fscache code path +which adds a cache below mingw's lstat and dirent implementations. 
+ Naming Tests ------------ From ea5f40e8e5fa28131568fbdcfb219266b3cc4581 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 11 Oct 2018 23:55:44 +0200 Subject: [PATCH 177/218] gitattributes: mark .png files as binary Signed-off-by: Johannes Schindelin --- .gitattributes | 1 + 1 file changed, 1 insertion(+) diff --git a/.gitattributes b/.gitattributes index 556322be01b4a8..69dcb5bb2d0cde 100644 --- a/.gitattributes +++ b/.gitattributes @@ -6,6 +6,7 @@ *.pm text eol=lf diff=perl *.py text eol=lf diff=python *.bat text eol=crlf +*.png binary CODE_OF_CONDUCT.md -whitespace /Documentation/**/*.adoc text eol=lf whitespace=trail,space,incomplete /command-list.txt text eol=lf From 3fba11cb296a762c61672b7b332666eae8a0bca0 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 11 Dec 2018 12:59:29 +0100 Subject: [PATCH 178/218] fscache: remember the reparse tag for each entry We will use this in the next commit to implement an FSCache-aware version of is_mount_point(). Signed-off-by: Johannes Schindelin --- compat/win32/fscache.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 973ae7efb246c9..46dca7a5635faf 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -46,6 +46,7 @@ static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE); struct fsentry { struct hashmap_entry ent; mode_t st_mode; + ULONG reparse_tag; /* Pointer to the directory listing, or NULL for the listing itself. */ struct fsentry *list; /* Pointer to the next file entry of the list. */ @@ -202,6 +203,10 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache, fse = fsentry_alloc(cache, list, buf, len); + fse->reparse_tag = + fdata->FileAttributes & FILE_ATTRIBUTE_REPARSE_POINT ? + fdata->EaSize : 0; + fse->st_mode = file_attr_to_st_mode(fdata->FileAttributes, fdata->EaSize); fse->dirent.d_type = S_ISREG(fse->st_mode) ? 
DT_REG : From 13126e23edb3dd2cf1e8cbdd7e0c99a497ba2137 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Thu, 4 Oct 2018 15:38:08 -0400 Subject: [PATCH 179/218] fscache: update fscache to be thread specific instead of global The threading model for fscache has been to have a single, global cache. This puts requirements on it to be thread safe so that callers like preload-index can call it from multiple threads. This was implemented with a single mutex and completion events which introduces contention between the calling threads. Simplify the threading model by making fscache thread specific. This allows us to remove the global mutex and synchronization events entirely and instead associate a fscache with every thread that requests one. This works well with the current multi-threading which divides the cache entries into blocks with a separate thread processing each block. At the end of each worker thread, if there is a fscache on the primary thread, merge the cached results from the worker into the primary thread cache. This enables us to reuse the cache later especially when scanning for untracked files. In testing, this reduced the time spent in preload_index() by about 25% and also reduced the CPU utilization significantly. On a repo with ~200K files, it reduced overall status times by ~12%. 
Signed-off-by: Ben Peart --- compat/win32/fscache.c | 294 +++++++++++++++++++++++++---------------- compat/win32/fscache.h | 22 ++- git-compat-util.h | 12 ++ preload-index.c | 8 +- 4 files changed, 215 insertions(+), 121 deletions(-) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index db37cc930a7ab6..cd57901aaf67b4 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -7,14 +7,24 @@ #include "../../trace.h" #include "config.h" -static int initialized; -static volatile long enabled; -static struct hashmap map; +static volatile long initialized; +static DWORD dwTlsIndex; static CRITICAL_SECTION mutex; -static unsigned int lstat_requests; -static unsigned int opendir_requests; -static unsigned int fscache_requests; -static unsigned int fscache_misses; + +/* + * Store one fscache per thread to avoid thread contention and locking. + * This is ok because multi-threaded access is 1) uncommon and 2) always + * splitting up the cache entries across multiple threads so there isn't + * any overlap between threads anyway. + */ +struct fscache { + volatile long enabled; + struct hashmap map; + unsigned int lstat_requests; + unsigned int opendir_requests; + unsigned int fscache_requests; + unsigned int fscache_misses; +}; static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE); /* @@ -34,8 +44,6 @@ struct fsentry { union { /* Reference count of the directory listing. */ volatile long refcnt; - /* Handle to wait on the loading thread. */ - HANDLE hwait; struct { /* More stat members (only used for file entries). */ off64_t st_size; @@ -260,86 +268,63 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir, /* * Adds a directory listing to the cache. */ -static void fscache_add(struct fsentry *fse) +static void fscache_add(struct fscache *cache, struct fsentry *fse) { if (fse->list) fse = fse->list; for (; fse; fse = fse->next) - hashmap_add(&map, &fse->ent); + hashmap_add(&cache->map, &fse->ent); } /* * Clears the cache. 
*/ -static void fscache_clear(void) +static void fscache_clear(struct fscache *cache) { - hashmap_clear_and_free(&map, struct fsentry, ent); - hashmap_init(&map, (hashmap_cmp_fn)fsentry_cmp, NULL, 0); - lstat_requests = opendir_requests = 0; - fscache_misses = fscache_requests = 0; + hashmap_clear_and_free(&cache->map, struct fsentry, ent); + hashmap_init(&cache->map, (hashmap_cmp_fn)fsentry_cmp, NULL, 0); + cache->lstat_requests = cache->opendir_requests = 0; + cache->fscache_misses = cache->fscache_requests = 0; } /* * Checks if the cache is enabled for the given path. */ -int fscache_enabled(const char *path) +static int do_fscache_enabled(struct fscache *cache, const char *path) { - return enabled > 0 && !is_absolute_path(path); + return cache->enabled > 0 && !is_absolute_path(path); } -/* - * Looks up a cache entry, waits if its being loaded by another thread. - * The mutex must be owned by the calling thread. - */ -static struct fsentry *fscache_get_wait(struct fsentry *key) +int fscache_enabled(const char *path) { - struct fsentry *fse = hashmap_get_entry(&map, key, ent, NULL); - - /* return if its a 'real' entry (future entries have refcnt == 0) */ - if (!fse || fse->list || fse->u.refcnt) - return fse; - - /* create an event and link our key to the future entry */ - key->u.hwait = CreateEvent(NULL, TRUE, FALSE, NULL); - key->next = fse->next; - fse->next = key; - - /* wait for the loading thread to signal us */ - LeaveCriticalSection(&mutex); - WaitForSingleObject(key->u.hwait, INFINITE); - CloseHandle(key->u.hwait); - EnterCriticalSection(&mutex); + struct fscache *cache = fscache_getcache(); - /* repeat cache lookup */ - return hashmap_get_entry(&map, key, ent, NULL); + return cache ? do_fscache_enabled(cache, path) : 0; } /* * Looks up or creates a cache entry for the specified key. 
*/ -static struct fsentry *fscache_get(struct fsentry *key) +static struct fsentry *fscache_get(struct fscache *cache, struct fsentry *key) { - struct fsentry *fse, *future, *waiter; + struct fsentry *fse; int dir_not_found; - EnterCriticalSection(&mutex); - fscache_requests++; + cache->fscache_requests++; /* check if entry is in cache */ - fse = fscache_get_wait(key); + fse = hashmap_get_entry(&cache->map, key, ent, NULL); if (fse) { if (fse->st_mode) fsentry_addref(fse); else fse = NULL; /* non-existing directory */ - LeaveCriticalSection(&mutex); return fse; } /* if looking for a file, check if directory listing is in cache */ if (!fse && key->list) { - fse = fscache_get_wait(key->list); + fse = hashmap_get_entry(&cache->map, key->list, ent, NULL); if (fse) { - LeaveCriticalSection(&mutex); /* * dir entry without file entry, or dir does not * exist -> file doesn't exist @@ -349,25 +334,8 @@ static struct fsentry *fscache_get(struct fsentry *key) } } - /* add future entry to indicate that we're loading it */ - future = key->list ? key->list : key; - future->next = NULL; - future->u.refcnt = 0; - hashmap_add(&map, &future->ent); - - /* create the directory listing (outside mutex!) */ - LeaveCriticalSection(&mutex); - fse = fsentry_create_list(future, &dir_not_found); - EnterCriticalSection(&mutex); - - /* remove future entry and signal waiting threads */ - hashmap_remove(&map, &future->ent, NULL); - waiter = future->next; - while (waiter) { - HANDLE h = waiter->u.hwait; - waiter = waiter->next; - SetEvent(h); - } + /* create the directory listing */ + fse = fsentry_create_list(key->list ? 
key->list : key, &dir_not_found); /* leave on error (errno set by fsentry_create_list) */ if (!fse) { @@ -381,19 +349,18 @@ static struct fsentry *fscache_get(struct fsentry *key) key->list->dirent.d_name, key->list->len); fse->st_mode = 0; - hashmap_add(&map, &fse->ent); + hashmap_add(&cache->map, &fse->ent); } - LeaveCriticalSection(&mutex); return NULL; } /* add directory listing to the cache */ - fscache_misses++; - fscache_add(fse); + cache->fscache_misses++; + fscache_add(cache, fse); /* lookup file entry if requested (fse already points to directory) */ if (key->list) - fse = hashmap_get_entry(&map, key, ent, NULL); + fse = hashmap_get_entry(&cache->map, key, ent, NULL); if (fse && !fse->st_mode) fse = NULL; /* non-existing directory */ @@ -404,59 +371,104 @@ static struct fsentry *fscache_get(struct fsentry *key) else errno = ENOENT; - LeaveCriticalSection(&mutex); return fse; } /* - * Enables or disables the cache. Note that the cache is read-only, changes to + * Enables the cache. Note that the cache is read-only, changes to * the working directory are NOT reflected in the cache while enabled. */ -int fscache_enable(int enable, size_t initial_size) +int fscache_enable(size_t initial_size) { - int result; + int fscache; + struct fscache *cache; + int result = 0; + + /* allow the cache to be disabled entirely */ + fscache = git_env_bool("GIT_TEST_FSCACHE", -1); + if (fscache != -1) + core_fscache = fscache; + if (!core_fscache) + return 0; + /* + * refcount the global fscache initialization so that the + * opendir and lstat function pointers are redirected if + * any threads are using the fscache. 
+ */ if (!initialized) { - int fscache = git_env_bool("GIT_TEST_FSCACHE", -1); - - /* allow the cache to be disabled entirely */ - if (fscache != -1) - core_fscache = fscache; - if (!core_fscache) - return 0; - InitializeCriticalSection(&mutex); - lstat_requests = opendir_requests = 0; - fscache_misses = fscache_requests = 0; + if (!dwTlsIndex) { + dwTlsIndex = TlsAlloc(); + if (dwTlsIndex == TLS_OUT_OF_INDEXES) { + LeaveCriticalSection(&mutex); + return 0; + } + } + + /* redirect opendir and lstat to the fscache implementations */ + opendir = fscache_opendir; + lstat = fscache_lstat; + } + InterlockedIncrement(&initialized); + + /* refcount the thread specific initialization */ + cache = fscache_getcache(); + if (cache) { + InterlockedIncrement(&cache->enabled); + } else { + cache = (struct fscache *)xcalloc(1, sizeof(*cache)); + cache->enabled = 1; /* * avoid having to rehash by leaving room for the parent dirs. * '4' was determined empirically by testing several repos */ - hashmap_init(&map, (hashmap_cmp_fn) fsentry_cmp, NULL, initial_size * 4); - initialized = 1; + hashmap_init(&cache->map, (hashmap_cmp_fn)fsentry_cmp, NULL, initial_size * 4); + if (!TlsSetValue(dwTlsIndex, cache)) + BUG("TlsSetValue error"); } - result = enable ? InterlockedIncrement(&enabled) - : InterlockedDecrement(&enabled); + trace_printf_key(&trace_fscache, "fscache: enable\n"); + return result; +} - if (enable && result == 1) { - /* redirect opendir and lstat to the fscache implementations */ - opendir = fscache_opendir; - lstat = fscache_lstat; - } else if (!enable && !result) { +/* + * Disables the cache. 
+ */ +void fscache_disable(void) +{ + struct fscache *cache; + + if (!core_fscache) + return; + + /* update the thread specific fscache initialization */ + cache = fscache_getcache(); + if (!cache) + BUG("fscache_disable() called on a thread where fscache has not been initialized"); + if (!cache->enabled) + BUG("fscache_disable() called on an fscache that is already disabled"); + InterlockedDecrement(&cache->enabled); + if (!cache->enabled) { + TlsSetValue(dwTlsIndex, NULL); + trace_printf_key(&trace_fscache, "fscache_disable: lstat %u, opendir %u, " + "total requests/misses %u/%u\n", + cache->lstat_requests, cache->opendir_requests, + cache->fscache_requests, cache->fscache_misses); + fscache_clear(cache); + free(cache); + } + + /* update the global fscache initialization */ + InterlockedDecrement(&initialized); + if (!initialized) { /* reset opendir and lstat to the original implementations */ opendir = dirent_opendir; lstat = mingw_lstat; - EnterCriticalSection(&mutex); - trace_printf_key(&trace_fscache, "fscache: lstat %u, opendir %u, " - "total requests/misses %u/%u\n", - lstat_requests, opendir_requests, - fscache_requests, fscache_misses); - fscache_clear(); - LeaveCriticalSection(&mutex); } - trace_printf_key(&trace_fscache, "fscache: enable(%d)\n", enable); - return result; + + trace_printf_key(&trace_fscache, "fscache: disable\n"); + return; } /* @@ -464,10 +476,10 @@ int fscache_enable(int enable, size_t initial_size) */ void fscache_flush(void) { - if (enabled) { - EnterCriticalSection(&mutex); - fscache_clear(); - LeaveCriticalSection(&mutex); + struct fscache *cache = fscache_getcache(); + + if (cache && cache->enabled) { + fscache_clear(cache); } } @@ -485,11 +497,12 @@ int fscache_lstat(const char *filename, struct stat *st) struct heap_fsentry key[2]; #pragma GCC diagnostic pop struct fsentry *fse; + struct fscache *cache = fscache_getcache(); - if (!fscache_enabled(filename)) + if (!cache || !do_fscache_enabled(cache, filename)) return 
mingw_lstat(filename, st); - lstat_requests++; + cache->lstat_requests++; /* split filename into path + name */ len = strlen(filename); if (len && is_dir_sep(filename[len - 1])) @@ -502,7 +515,7 @@ int fscache_lstat(const char *filename, struct stat *st) /* lookup entry for path + name in cache */ fsentry_init(&key[0].u.ent, NULL, filename, dirlen); fsentry_init(&key[1].u.ent, &key[0].u.ent, filename + base, len - base); - fse = fscache_get(&key[1].u.ent); + fse = fscache_get(cache, &key[1].u.ent); if (!fse) { errno = ENOENT; return -1; @@ -579,11 +592,12 @@ DIR *fscache_opendir(const char *dirname) struct fsentry *list; fscache_DIR *dir; int len; + struct fscache *cache = fscache_getcache(); - if (!fscache_enabled(dirname)) + if (!cache || !do_fscache_enabled(cache, dirname)) return dirent_opendir(dirname); - opendir_requests++; + cache->opendir_requests++; /* prepare name (strip trailing '/', replace '.') */ len = strlen(dirname); if ((len == 1 && dirname[0] == '.') || @@ -592,7 +606,7 @@ DIR *fscache_opendir(const char *dirname) /* get directory listing from cache */ fsentry_init(&key.u.ent, NULL, dirname, len); - list = fscache_get(&key.u.ent); + list = fscache_get(cache, &key.u.ent); if (!list) return NULL; @@ -603,3 +617,53 @@ DIR *fscache_opendir(const char *dirname) dir->pfsentry = list; return (DIR*) dir; } + +struct fscache *fscache_getcache(void) +{ + return (struct fscache *)TlsGetValue(dwTlsIndex); +} + +void fscache_merge(struct fscache *dest) +{ + struct hashmap_iter iter; + struct hashmap_entry *e; + struct fscache *cache = fscache_getcache(); + + /* + * Only do the merge if fscache was enabled and we have a dest + * cache to merge into. 
+ */ + if (!dest) { + fscache_enable(0); + return; + } + if (!cache) + BUG("fscache_merge() called on a thread where fscache has not been initialized"); + + TlsSetValue(dwTlsIndex, NULL); + trace_printf_key(&trace_fscache, "fscache_merge: lstat %u, opendir %u, " + "total requests/misses %u/%u\n", + cache->lstat_requests, cache->opendir_requests, + cache->fscache_requests, cache->fscache_misses); + + /* + * This is only safe because the primary thread we're merging into + * isn't being used so the critical section only needs to prevent + * the child threads from stomping on each other. + */ + EnterCriticalSection(&mutex); + + hashmap_iter_init(&cache->map, &iter); + while ((e = hashmap_iter_next(&iter))) + hashmap_add(&dest->map, e); + + dest->lstat_requests += cache->lstat_requests; + dest->opendir_requests += cache->opendir_requests; + dest->fscache_requests += cache->fscache_requests; + dest->fscache_misses += cache->fscache_misses; + LeaveCriticalSection(&mutex); + + free(cache); + + InterlockedDecrement(&initialized); +} diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h index d49c9381114da6..2eb8bf3f5cfee8 100644 --- a/compat/win32/fscache.h +++ b/compat/win32/fscache.h @@ -1,9 +1,16 @@ #ifndef FSCACHE_H #define FSCACHE_H -int fscache_enable(int enable, size_t initial_size); -#define enable_fscache(initial_size) fscache_enable(1, initial_size) -#define disable_fscache() fscache_enable(0, 0) +/* + * The fscache is thread specific. enable_fscache() must be called + * for each thread where caching is desired.
+ */ + +int fscache_enable(size_t initial_size); +#define enable_fscache(initial_size) fscache_enable(initial_size) + +void fscache_disable(void); +#define disable_fscache() fscache_disable() int fscache_enabled(const char *path); #define is_fscache_enabled(path) fscache_enabled(path) @@ -14,4 +21,13 @@ void fscache_flush(void); DIR *fscache_opendir(const char *dir); int fscache_lstat(const char *file_name, struct stat *buf); +/* opaque fscache structure */ +struct fscache; + +struct fscache *fscache_getcache(void); +#define getcache_fscache() fscache_getcache() + +void fscache_merge(struct fscache *dest); +#define merge_fscache(dest) fscache_merge(dest) + #endif diff --git a/git-compat-util.h b/git-compat-util.h index 6d737405299340..fa3cead03c006c 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1084,6 +1084,10 @@ static inline int is_missing_file_error(int errno_) * data or even file content without the need to synchronize with the file * system. */ + + /* opaque fscache structure */ +struct fscache; + #ifndef enable_fscache #define enable_fscache(x) /* noop */ #endif @@ -1100,6 +1104,14 @@ static inline int is_missing_file_error(int errno_) #define flush_fscache() /* noop */ #endif +#ifndef getcache_fscache +#define getcache_fscache() (NULL) /* noop */ +#endif + +#ifndef merge_fscache +#define merge_fscache(dest) /* noop */ +#endif + int cmd_main(int, const char **); /* diff --git a/preload-index.c b/preload-index.c index e466fef15bcd79..ac0310008754a3 100644 --- a/preload-index.c +++ b/preload-index.c @@ -20,6 +20,8 @@ #include "trace2.h" #include "config.h" +static struct fscache *fscache; + /* * Mostly randomly chosen maximum thread counts: we * cap the parallelism to 20 threads, and we want @@ -57,6 +59,7 @@ static void *preload_thread(void *_data) nr = index->cache_nr - p->offset; last_nr = nr; + enable_fscache(nr); do { struct cache_entry *ce = *cep++; struct stat st; @@ -100,6 +103,7 @@ static void *preload_thread(void *_data) 
pthread_mutex_unlock(&pd->mutex); } cache_def_clear(&cache); + merge_fscache(fscache); return NULL; } @@ -118,6 +122,7 @@ void preload_index(struct index_state *index, if (!HAVE_THREADS || !core_preload_index) return; + fscache = getcache_fscache(); threads = index->cache_nr / THREAD_COST; if ((index->cache_nr > 1) && (threads < 2) && git_env_bool("GIT_TEST_PRELOAD_INDEX", 0)) threads = 2; @@ -141,7 +146,6 @@ void preload_index(struct index_state *index, pthread_mutex_init(&pd.mutex, NULL); } - enable_fscache(index->cache_nr); for (i = 0; i < threads; i++) { struct thread_data *p = data+i; int err; @@ -177,8 +181,6 @@ void preload_index(struct index_state *index, trace2_data_intmax("index", NULL, "preload/sum_lstat", t2_sum_lstat); trace2_region_leave("index", "preload", NULL); - - disable_fscache(); } int repo_read_index_preload(struct repository *repo, From fb699af7de899cfe2891945b64a3706ebf8c1fcf Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Tue, 25 Sep 2018 16:28:16 -0400 Subject: [PATCH 180/218] fscache: add fscache hit statistics Track fscache hits and misses for lstat and opendir requests. Reporting of statistics is done when the cache is disabled for the last time and freed and is only reported if GIT_TRACE_FSCACHE is set. 
Sample output is: 11:33:11.836428 compat/win32/fscache.c:433 fscache: lstat 3775, opendir 263, total requests/misses 4052/269 Signed-off-by: Ben Peart --- compat/win32/fscache.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 935e36e5955c98..ad9fdd36964d93 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -11,6 +11,10 @@ static int initialized; static volatile long enabled; static struct hashmap map; static CRITICAL_SECTION mutex; +static unsigned int lstat_requests; +static unsigned int opendir_requests; +static unsigned int fscache_requests; +static unsigned int fscache_misses; static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE); /* @@ -272,6 +276,8 @@ static void fscache_clear(void) { hashmap_clear_and_free(&map, struct fsentry, ent); hashmap_init(&map, (hashmap_cmp_fn)fsentry_cmp, NULL, 0); + lstat_requests = opendir_requests = 0; + fscache_misses = fscache_requests = 0; } /* @@ -318,6 +324,7 @@ static struct fsentry *fscache_get(struct fsentry *key) int dir_not_found; EnterCriticalSection(&mutex); + fscache_requests++; /* check if entry is in cache */ fse = fscache_get_wait(key); if (fse) { @@ -381,6 +388,7 @@ static struct fsentry *fscache_get(struct fsentry *key) } /* add directory listing to the cache */ + fscache_misses++; fscache_add(fse); /* lookup file entry if requested (fse already points to directory) */ @@ -418,6 +426,8 @@ int fscache_enable(int enable) return 0; InitializeCriticalSection(&mutex); + lstat_requests = opendir_requests = 0; + fscache_misses = fscache_requests = 0; hashmap_init(&map, (hashmap_cmp_fn) fsentry_cmp, NULL, 0); initialized = 1; } @@ -434,6 +444,10 @@ int fscache_enable(int enable) opendir = dirent_opendir; lstat = mingw_lstat; EnterCriticalSection(&mutex); + trace_printf_key(&trace_fscache, "fscache: lstat %u, opendir %u, " + "total requests/misses %u/%u\n", + lstat_requests, opendir_requests, + fscache_requests, 
fscache_misses); fscache_clear(); LeaveCriticalSection(&mutex); } @@ -471,6 +485,7 @@ int fscache_lstat(const char *filename, struct stat *st) if (!fscache_enabled(filename)) return mingw_lstat(filename, st); + lstat_requests++; /* split filename into path + name */ len = strlen(filename); if (len && is_dir_sep(filename[len - 1])) @@ -564,6 +579,7 @@ DIR *fscache_opendir(const char *dirname) if (!fscache_enabled(dirname)) return dirent_opendir(dirname); + opendir_requests++; /* prepare name (strip trailing '/', replace '.') */ len = strlen(dirname); if ((len == 1 && dirname[0] == '.') || From 1e0607af33b4e5dfaa43a42fd1576a3b64eae926 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sat, 5 Aug 2017 20:28:37 +0200 Subject: [PATCH 181/218] tests: move test PNGs into t/lib-diff/ We already have a directory where we store files intended for use by multiple test scripts. The same directory is a better home for the test-binary-*.png files than t/. Signed-off-by: Johannes Schindelin --- t/{ => lib-diff}/test-binary-1.png | Bin t/{ => lib-diff}/test-binary-2.png | Bin t/t3307-notes-man.sh | 2 +- t/t3903-stash.sh | 2 +- t/t4012-diff-binary.sh | 2 +- t/t4049-diff-stat-count.sh | 2 +- t/t4108-apply-threeway.sh | 12 ++++++------ t/t6403-merge-file.sh | 4 ++-- t/t6407-merge-binary.sh | 2 +- t/t9200-git-cvsexportcommit.sh | 14 +++++++------- 10 files changed, 20 insertions(+), 20 deletions(-) rename t/{ => lib-diff}/test-binary-1.png (100%) rename t/{ => lib-diff}/test-binary-2.png (100%) diff --git a/t/test-binary-1.png b/t/lib-diff/test-binary-1.png similarity index 100% rename from t/test-binary-1.png rename to t/lib-diff/test-binary-1.png diff --git a/t/test-binary-2.png b/t/lib-diff/test-binary-2.png similarity index 100% rename from t/test-binary-2.png rename to t/lib-diff/test-binary-2.png diff --git a/t/t3307-notes-man.sh b/t/t3307-notes-man.sh index 1aa366a410e9a3..7e5c06e6615d7a 100755 --- a/t/t3307-notes-man.sh +++ b/t/t3307-notes-man.sh @@ -26,7 +26,7 @@ 
test_expect_success 'example 1: notes to add an Acked-by line' ' ' test_expect_success 'example 2: binary notes' ' - cp "$TEST_DIRECTORY"/test-binary-1.png . && + cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png . && git checkout B && blob=$(git hash-object -w test-binary-1.png) && git notes --ref=logo add -C "$blob" && diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh index 70879941c22f8c..0c9022290fad0f 100755 --- a/t/t3903-stash.sh +++ b/t/t3903-stash.sh @@ -1377,7 +1377,7 @@ test_expect_success 'stash -- works with binary files' ' mkdir -p subdir && >subdir/untracked && >subdir/tracked && - cp "$TEST_DIRECTORY"/test-binary-1.png subdir/tracked-binary && + cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png subdir/tracked-binary && git add subdir/tracked* && git stash -- subdir/ && test_path_is_missing subdir/tracked && diff --git a/t/t4012-diff-binary.sh b/t/t4012-diff-binary.sh index 97b5ac04071d36..0fb50d2ffc91d9 100755 --- a/t/t4012-diff-binary.sh +++ b/t/t4012-diff-binary.sh @@ -19,7 +19,7 @@ test_expect_success 'prepare repository' ' echo AIT >a && echo BIT >b && echo CIT >c && echo DIT >d && git update-index --add a b c d && echo git >a && - cat "$TEST_DIRECTORY"/test-binary-1.png >b && + cat "$TEST_DIRECTORY"/lib-diff/test-binary-1.png >b && echo git >c && cat b b >d ' diff --git a/t/t4049-diff-stat-count.sh b/t/t4049-diff-stat-count.sh index eceb47c8594416..2161a1e8cf5ba6 100755 --- a/t/t4049-diff-stat-count.sh +++ b/t/t4049-diff-stat-count.sh @@ -33,7 +33,7 @@ test_expect_success 'binary changes do not count in lines' ' git reset --hard && echo a >a && echo c >c && - cat "$TEST_DIRECTORY"/test-binary-1.png >d && + cat "$TEST_DIRECTORY"/lib-diff/test-binary-1.png >d && cat >expect <<-\EOF && a | 1 + c | 1 + diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh index f30e85659dbb87..7f84edd9653a7d 100755 --- a/t/t4108-apply-threeway.sh +++ b/t/t4108-apply-threeway.sh @@ -272,11 +272,11 @@ test_expect_success 'apply with --3way --cached and 
conflicts' ' test_expect_success 'apply binary file patch' ' git reset --hard main && - cp "$TEST_DIRECTORY/test-binary-1.png" bin.png && + cp "$TEST_DIRECTORY/lib-diff/test-binary-1.png" bin.png && git add bin.png && git commit -m "add binary file" && - cp "$TEST_DIRECTORY/test-binary-2.png" bin.png && + cp "$TEST_DIRECTORY/lib-diff/test-binary-2.png" bin.png && git diff --binary >bin.diff && git reset --hard && @@ -287,11 +287,11 @@ test_expect_success 'apply binary file patch' ' test_expect_success 'apply binary file patch with 3way' ' git reset --hard main && - cp "$TEST_DIRECTORY/test-binary-1.png" bin.png && + cp "$TEST_DIRECTORY/lib-diff/test-binary-1.png" bin.png && git add bin.png && git commit -m "add binary file" && - cp "$TEST_DIRECTORY/test-binary-2.png" bin.png && + cp "$TEST_DIRECTORY/lib-diff/test-binary-2.png" bin.png && git diff --binary >bin.diff && git reset --hard && @@ -302,11 +302,11 @@ test_expect_success 'apply binary file patch with 3way' ' test_expect_success 'apply full-index patch with 3way' ' git reset --hard main && - cp "$TEST_DIRECTORY/test-binary-1.png" bin.png && + cp "$TEST_DIRECTORY/lib-diff/test-binary-1.png" bin.png && git add bin.png && git commit -m "add binary file" && - cp "$TEST_DIRECTORY/test-binary-2.png" bin.png && + cp "$TEST_DIRECTORY/lib-diff/test-binary-2.png" bin.png && git diff --full-index >bin.diff && git reset --hard && diff --git a/t/t6403-merge-file.sh b/t/t6403-merge-file.sh index 801284cf8fcde5..cc39753ad45810 100755 --- a/t/t6403-merge-file.sh +++ b/t/t6403-merge-file.sh @@ -355,12 +355,12 @@ test_expect_success "expected conflict markers" ' test_expect_success 'binary files cannot be merged' ' test_must_fail git merge-file -p \ - orig.txt "$TEST_DIRECTORY"/test-binary-1.png new1.txt 2> merge.err && + orig.txt "$TEST_DIRECTORY"/lib-diff/test-binary-1.png new1.txt 2> merge.err && grep "Cannot merge binary files" merge.err ' test_expect_success 'binary files cannot be merged with --object-id' ' - cp 
"$TEST_DIRECTORY"/test-binary-1.png . && + cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png . && git add orig.txt new1.txt test-binary-1.png && test_must_fail git merge-file --object-id \ :orig.txt :test-binary-1.png :new1.txt 2> merge.err && diff --git a/t/t6407-merge-binary.sh b/t/t6407-merge-binary.sh index e8a28717cece32..2547f1d504a2c5 100755 --- a/t/t6407-merge-binary.sh +++ b/t/t6407-merge-binary.sh @@ -9,7 +9,7 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME test_expect_success setup ' - cat "$TEST_DIRECTORY"/test-binary-1.png >m && + cat "$TEST_DIRECTORY"/lib-diff/test-binary-1.png >m && git add m && git ls-files -s | sed -e "s/ 0 / 1 /" >E1 && test_tick && diff --git a/t/t9200-git-cvsexportcommit.sh b/t/t9200-git-cvsexportcommit.sh index 14cbe9652779bc..415ac008fd7118 100755 --- a/t/t9200-git-cvsexportcommit.sh +++ b/t/t9200-git-cvsexportcommit.sh @@ -58,8 +58,8 @@ test_expect_success 'New file' ' mkdir A B C D E F && echo hello1 >A/newfile1.txt && echo hello2 >B/newfile2.txt && - cp "$TEST_DIRECTORY"/test-binary-1.png C/newfile3.png && - cp "$TEST_DIRECTORY"/test-binary-1.png D/newfile4.png && + cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png C/newfile3.png && + cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png D/newfile4.png && git add A/newfile1.txt && git add B/newfile2.txt && git add C/newfile3.png && @@ -84,8 +84,8 @@ test_expect_success 'Remove two files, add two and update two' ' rm -f B/newfile2.txt && rm -f C/newfile3.png && echo Hello5 >E/newfile5.txt && - cp "$TEST_DIRECTORY"/test-binary-2.png D/newfile4.png && - cp "$TEST_DIRECTORY"/test-binary-1.png F/newfile6.png && + cp "$TEST_DIRECTORY"/lib-diff/test-binary-2.png D/newfile4.png && + cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png F/newfile6.png && git add E/newfile5.txt && git add F/newfile6.png && git commit -a -m "Test: Remove, add and update" && @@ -173,7 +173,7 @@ test_expect_success 'New file with spaces in file name' ' mkdir "G g" && echo ok then >"G g/with spaces.txt" && git add "G 
g/with spaces.txt" && \ - cp "$TEST_DIRECTORY"/test-binary-1.png "G g/with spaces.png" && \ + cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png "G g/with spaces.png" && \ git add "G g/with spaces.png" && git commit -a -m "With spaces" && id=$(git rev-list --max-count=1 HEAD) && @@ -185,7 +185,7 @@ test_expect_success 'New file with spaces in file name' ' test_expect_success 'Update file with spaces in file name' ' echo Ok then >>"G g/with spaces.txt" && - cat "$TEST_DIRECTORY"/test-binary-1.png >>"G g/with spaces.png" && \ + cat "$TEST_DIRECTORY"/lib-diff/test-binary-1.png >>"G g/with spaces.png" && \ git add "G g/with spaces.png" && git commit -a -m "Update with spaces" && id=$(git rev-list --max-count=1 HEAD) && @@ -210,7 +210,7 @@ test_expect_success !MINGW 'File with non-ascii file name' ' mkdir -p Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö && echo Foo >Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö/gårdetsågårdet.txt && git add Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö/gårdetsågårdet.txt && - cp "$TEST_DIRECTORY"/test-binary-1.png Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö/gårdetsågårdet.png && + cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö/gårdetsågårdet.png && git add Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö/gårdetsågårdet.png && git commit -a -m "Går det så går det" && \ id=$(git rev-list --max-count=1 HEAD) && From c5561acbe40249b7cb414ceee61acdd44a3c49ae Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 23 Apr 2018 23:20:00 +0200 Subject: [PATCH 182/218] fscache: Windows Docker volumes are *not* symbolic links ... even if they may look like them. As looking up the target of the "symbolic link" (just to see whether it starts with `/ContainerMappedDirectories/`) is pretty expensive, we do it when we can be *really* sure that there is a possibility that this might be the case. 
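For illustration only (this is not part of the patch): before the expensive container check runs, the code below assembles "dir/name" in a fixed-size buffer, guarded by a size test. A minimal sketch of that bounded join, assuming a hypothetical helper `join_dir_entry()` that does not exist in Git and checks for exactly the bytes needed (the patch itself is slightly more conservative):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not a Git function: writes "dir/name" (or just
 * "name" when dir is NULL) into buf. Returns 0 on success, or -1 when
 * buf cannot hold the result plus the trailing NUL.
 */
static int join_dir_entry(char *buf, size_t bufsize,
			  const char *dir, size_t dirlen,
			  const char *name, size_t namelen)
{
	size_t off = 0;
	/* bytes taken by "dir/" when a directory is given */
	size_t prefix = dir ? dirlen + 1 : 0;

	/* cheap size check first, so nothing is copied on failure */
	if (bufsize < prefix + namelen + 1)
		return -1;

	if (dir) {
		memcpy(buf, dir, dirlen);
		buf[dirlen] = '/';
		off = prefix;
	}
	memcpy(buf + off, name, namelen);
	buf[off + namelen] = '\0';
	return 0;
}
```

Ordering the tests from cheapest to most expensive, as the hunk does, means the target lookup is only attempted for entries that already look like symbolic links and whose path fits the buffer.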
Signed-off-by: Johannes Schindelin Signed-off-by: JiSeop Moon --- compat/win32/fscache.c | 24 +++++++++++++++++++++++- 1 file changed, 23 insertions(+), 1 deletion(-) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 46dca7a5635faf..efe6be7e5ef0a5 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -207,8 +207,30 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache, fdata->FileAttributes & FILE_ATTRIBUTE_REPARSE_POINT ? fdata->EaSize : 0; + /* + * On certain Windows versions, host directories mapped into + * Windows Containers ("Volumes", see https://docs.docker.com/storage/volumes/) + * look like symbolic links, but their targets are paths that + * are valid only in kernel mode. + * + * Let's work around this by detecting that situation and + * telling Git that these are *not* symbolic links. + */ + if (fse->reparse_tag == IO_REPARSE_TAG_SYMLINK && + sizeof(buf) > (size_t)(list ? list->len + 1 : 0) + fse->len + 1 && + is_inside_windows_container()) { + size_t off = 0; + if (list) { + memcpy(buf, list->dirent.d_name, list->len); + buf[list->len] = '/'; + off = list->len + 1; + } + memcpy(buf + off, fse->dirent.d_name, fse->len); + buf[off + fse->len] = '\0'; + } + fse->st_mode = file_attr_to_st_mode(fdata->FileAttributes, - fdata->EaSize); + fdata->EaSize, buf); fse->dirent.d_type = S_ISREG(fse->st_mode) ? DT_REG : S_ISDIR(fse->st_mode) ? DT_DIR : DT_LNK; fse->u.s.st_size = S_ISLNK(fse->st_mode) ? MAX_PATH : From a95aa8f5a0d9f1f5245ca612cdb80d26035dcd15 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Fri, 2 Nov 2018 11:19:10 -0400 Subject: [PATCH 183/218] fscache: teach fscache to use mempool Now that the fscache is single threaded, take advantage of the mem_pool as the allocator to significantly reduce the cost of allocations and frees. With the reduced cost of free, in future patches, we can start freeing the fscache at the end of commands instead of just leaking it. 
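For illustration only (not part of this patch): the saving comes from replacing one malloc()/free() pair per entry with bump allocation out of large blocks that are released together. A minimal sketch of that idea, assuming a hypothetical `toy_pool` that is not Git's `struct mem_pool` and only guarantees pointer-size alignment:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy pool for illustration; NOT Git's mem_pool API. */
struct toy_block {
	struct toy_block *next;
	size_t used, cap;
	unsigned char data[];	/* payload follows the header */
};

struct toy_pool {
	struct toy_block *head;
	size_t block_size;
	size_t nr_blocks;
};

static void toy_pool_init(struct toy_pool *p, size_t block_size)
{
	p->head = NULL;
	p->block_size = block_size;
	p->nr_blocks = 0;
}

static void *toy_pool_alloc(struct toy_pool *p, size_t len)
{
	struct toy_block *b = p->head;
	void *ret;

	/* round up so every allocation stays pointer-aligned */
	len = (len + sizeof(void *) - 1) & ~(sizeof(void *) - 1);

	/* start a new block only when the current one is full */
	if (!b || b->cap - b->used < len) {
		size_t cap = len > p->block_size ? len : p->block_size;

		b = malloc(sizeof(*b) + cap);
		if (!b)
			return NULL;
		b->next = p->head;
		b->used = 0;
		b->cap = cap;
		p->head = b;
		p->nr_blocks++;
	}
	ret = b->data + b->used;
	b->used += len;
	return ret;
}

/* Copy a name into the pool, as an fsentry-style allocation would. */
static char *toy_pool_strdup(struct toy_pool *p, const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = toy_pool_alloc(p, len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Release every allocation at once; there is no per-entry free. */
static void toy_pool_discard(struct toy_pool *p)
{
	while (p->head) {
		struct toy_block *next = p->head->next;

		free(p->head);
		p->head = next;
	}
	p->nr_blocks = 0;
}
```

With such a pool, dropping a reference no longer needs to walk and free a list of entries; discarding the pool is the only deallocation step, which is why fsentry_release() can shrink to a plain decrement in the diff below.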
Signed-off-by: Ben Peart Signed-off-by: Johannes Schindelin --- compat/win32/fscache.c | 45 ++++++++++++++++++++++-------------------- 1 file changed, 24 insertions(+), 21 deletions(-) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index cd57901aaf67b4..6da112e4f90382 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -6,6 +6,7 @@ #include "../../abspath.h" #include "../../trace.h" #include "config.h" +#include "../../mem-pool.h" static volatile long initialized; static DWORD dwTlsIndex; @@ -20,6 +21,7 @@ static CRITICAL_SECTION mutex; struct fscache { volatile long enabled; struct hashmap map; + struct mem_pool mem_pool; unsigned int lstat_requests; unsigned int opendir_requests; unsigned int fscache_requests; @@ -129,11 +131,12 @@ static void fsentry_init(struct fsentry *fse, struct fsentry *list, /* * Allocate an fsentry structure on the heap. */ -static struct fsentry *fsentry_alloc(struct fsentry *list, const char *name, +static struct fsentry *fsentry_alloc(struct fscache *cache, struct fsentry *list, const char *name, size_t len) { /* overallocate fsentry and copy the name to the end */ - struct fsentry *fse = xmalloc(sizeof(struct fsentry) + len + 1); + struct fsentry *fse = + mem_pool_alloc(&cache->mem_pool, sizeof(*fse) + len + 1); /* init the rest of the structure */ fsentry_init(fse, list, name, len); fse->next = NULL; @@ -153,27 +156,21 @@ inline static void fsentry_addref(struct fsentry *fse) } /* - * Release the reference to an fsentry, frees the memory if its the last ref. + * Release the reference to an fsentry. */ static void fsentry_release(struct fsentry *fse) { if (fse->list) fse = fse->list; - if (InterlockedDecrement(&(fse->u.refcnt))) - return; - - while (fse) { - struct fsentry *next = fse->next; - free(fse); - fse = next; - } + InterlockedDecrement(&(fse->u.refcnt)); } /* * Allocate and initialize an fsentry from a WIN32_FIND_DATA structure. 
*/ -static struct fsentry *fseentry_create_entry(struct fsentry *list, +static struct fsentry *fseentry_create_entry(struct fscache *cache, + struct fsentry *list, const WIN32_FIND_DATAW *fdata) { char buf[MAX_PATH * 3]; @@ -181,7 +178,7 @@ static struct fsentry *fseentry_create_entry(struct fsentry *list, struct fsentry *fse; len = xwcstoutf(buf, fdata->cFileName, ARRAY_SIZE(buf)); - fse = fsentry_alloc(list, buf, len); + fse = fsentry_alloc(cache, list, buf, len); fse->st_mode = file_attr_to_st_mode(fdata->dwFileAttributes, IO_REPARSE_TAG_SYMLINK); @@ -201,7 +198,7 @@ static struct fsentry *fseentry_create_entry(struct fsentry *list, * Dir should not contain trailing '/'. Use an empty string for the current * directory (not "."!). */ -static struct fsentry *fsentry_create_list(const struct fsentry *dir, +static struct fsentry *fsentry_create_list(struct fscache *cache, const struct fsentry *dir, int *dir_not_found) { wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */ @@ -240,14 +237,14 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir, } /* allocate object to hold directory listing */ - list = fsentry_alloc(NULL, dir->dirent.d_name, dir->len); + list = fsentry_alloc(cache, NULL, dir->dirent.d_name, dir->len); list->st_mode = S_IFDIR; list->dirent.d_type = DT_DIR; /* walk directory and build linked list of fsentry structures */ phead = &list->next; do { - *phead = fseentry_create_entry(list, &fdata); + *phead = fseentry_create_entry(cache, list, &fdata); phead = &(*phead)->next; } while (FindNextFileW(h, &fdata)); @@ -259,7 +256,7 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir, if (err == ERROR_NO_MORE_FILES) return list; - /* otherwise free the list and return error */ + /* otherwise release the list and return error */ fsentry_release(list); errno = err_win_to_posix(err); return NULL; @@ -282,7 +279,9 @@ static void fscache_add(struct fscache *cache, struct fsentry *fse) */ static void fscache_clear(struct fscache 
*cache) { - hashmap_clear_and_free(&cache->map, struct fsentry, ent); + mem_pool_discard(&cache->mem_pool, 0); + mem_pool_init(&cache->mem_pool, 0); + hashmap_clear(&cache->map); hashmap_init(&cache->map, (hashmap_cmp_fn)fsentry_cmp, NULL, 0); cache->lstat_requests = cache->opendir_requests = 0; cache->fscache_misses = cache->fscache_requests = 0; @@ -335,7 +334,7 @@ static struct fsentry *fscache_get(struct fscache *cache, struct fsentry *key) } /* create the directory listing */ - fse = fsentry_create_list(key->list ? key->list : key, &dir_not_found); + fse = fsentry_create_list(cache, key->list ? key->list : key, &dir_not_found); /* leave on error (errno set by fsentry_create_list) */ if (!fse) { @@ -345,7 +344,7 @@ static struct fsentry *fscache_get(struct fscache *cache, struct fsentry *key) * empty, which for all practical matters is the same * thing as far as fscache is concerned). */ - fse = fsentry_alloc(key->list->list, + fse = fsentry_alloc(cache, key->list->list, key->list->dirent.d_name, key->list->len); fse->st_mode = 0; @@ -424,6 +423,7 @@ int fscache_enable(size_t initial_size) * '4' was determined empirically by testing several repos */ hashmap_init(&cache->map, (hashmap_cmp_fn)fsentry_cmp, NULL, initial_size * 4); + mem_pool_init(&cache->mem_pool, 0); if (!TlsSetValue(dwTlsIndex, cache)) BUG("TlsSetValue error"); } @@ -455,7 +455,8 @@ void fscache_disable(void) "total requests/misses %u/%u\n", cache->lstat_requests, cache->opendir_requests, cache->fscache_requests, cache->fscache_misses); - fscache_clear(cache); + mem_pool_discard(&cache->mem_pool, 0); + hashmap_clear(&cache->map); free(cache); } @@ -657,6 +658,8 @@ void fscache_merge(struct fscache *dest) while ((e = hashmap_iter_next(&iter))) hashmap_add(&dest->map, e); + mem_pool_combine(&dest->mem_pool, &cache->mem_pool); + dest->lstat_requests += cache->lstat_requests; dest->opendir_requests += cache->opendir_requests; dest->fscache_requests += cache->fscache_requests; From 
405120a730bf1829668defd78f490d7c3bb7aa52 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Wed, 12 Jun 2019 00:58:49 +0000 Subject: [PATCH 184/218] unpack-trees: enable fscache for sparse-checkout When updating the skip-worktree bits in the index to align with new values in a sparse-checkout file, Git scans the entire working directory with lstat() calls. In a sparse-checkout, many of these lstat() calls are for paths that do not exist. Enable the fscache feature during this scan. Since enable_fscache() calls nest, the disable_fscache() method decrements a counter and would only clear the cache if that counter reaches zero. In a local test of a repo with ~2.2 million paths, updating the index with git read-tree -m -u HEAD with a sparse-checkout file containing only /.gitattributes improved from 2-3 minutes to ~6 seconds. Signed-off-by: Derrick Stolee Signed-off-by: Johannes Schindelin --- unpack-trees.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/unpack-trees.c b/unpack-trees.c index 998a1e6dc70cae..2acfe71899b7e6 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1823,7 +1823,9 @@ static void mark_new_skip_worktree(struct pattern_list *pl, * 2. Widen worktree according to sparse-checkout file. * Matched entries will have skip_wt_flag cleared (i.e. "in") */ + enable_fscache(istate->cache_nr); clear_ce_flags(istate, select_flag, skip_wt_flag, pl, show_progress); + disable_fscache(); } static void populate_from_existing_patterns(struct unpack_trees_options *o, From 8157fe4321f7898e9865fe3d518c85bfe77b97bb Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 18 Jul 2017 01:15:40 +0200 Subject: [PATCH 185/218] tests: only override sort & find if there are usable ones in /usr/bin/ The idea is to allow running the test suite on MinGit with BusyBox installed in /mingw64/bin/sh.exe. In that case, we will want to exclude sort & find (and other Unix utilities) from being bundled.
Signed-off-by: Johannes Schindelin --- git-sh-setup.sh | 21 ++++++++++++++------- t/test-lib.sh | 21 ++++++++++++++------- 2 files changed, 28 insertions(+), 14 deletions(-) diff --git a/git-sh-setup.sh b/git-sh-setup.sh index 19aef72ec25530..fad4f9df94e143 100644 --- a/git-sh-setup.sh +++ b/git-sh-setup.sh @@ -292,13 +292,20 @@ create_virtual_base() { # Platform specific tweaks to work around some commands case $(uname -s) in *MINGW*) - # Windows has its own (incompatible) sort and find - sort () { - /usr/bin/sort "$@" - } - find () { - /usr/bin/find "$@" - } + if test -x /usr/bin/sort + then + # Windows has its own (incompatible) sort; override + sort () { + /usr/bin/sort "$@" + } + fi + if test -x /usr/bin/find + then + # Windows has its own (incompatible) find; override + find () { + /usr/bin/find "$@" + } + fi # git sees Windows-style pwd pwd () { builtin pwd -W diff --git a/t/test-lib.sh b/t/test-lib.sh index c4639e1f600ed1..770f4affb29d58 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1662,13 +1662,20 @@ Darwin) test_set_prereq EXECKEEPSPID ;; *MINGW*) - # Windows has its own (incompatible) sort and find - sort () { - /usr/bin/sort "$@" - } - find () { - /usr/bin/find "$@" - } + if test -x /usr/bin/sort + then + # Windows has its own (incompatible) sort; override + sort () { + /usr/bin/sort "$@" + } + fi + if test -x /usr/bin/find + then + # Windows has its own (incompatible) find; override + find () { + /usr/bin/find "$@" + } + fi # git sees Windows-style pwd pwd () { builtin pwd -W From 447db237661e0c37c4fd5accdcdc189d1bd409c7 Mon Sep 17 00:00:00 2001 From: xungeng li Date: Wed, 7 Jun 2023 20:26:33 +0800 Subject: [PATCH 186/218] fscache: optionally enable WSL compatibility file mode bits The Windows Subsystem for Linux (WSL) version 2 allows the use of `chmod` on NTFS volumes provided that they are mounted with metadata enabled (see https://devblogs.microsoft.com/commandline/chmod-chown-wsl-improvements/ for details), for example: $ chmod 0755
/mnt/d/test/a.sh In order to facilitate better collaboration between the Windows version of Git and the WSL version of Git, we can make the Windows version of Git also support reading and writing NTFS file modes in a manner compatible with WSL. Since this slightly slows down operations where lots of files are created (such as an initial checkout), this feature is only enabled when `core.WSLCompat` is set to true. Note that you also have to set `core.fileMode=true` in repositories that have been initialized without enabling WSL compatibility. There are several ways to enable metadata loading for NTFS volumes in WSL, one of which is to modify `/etc/wsl.conf` by adding: ``` [automount] enabled = true options = "metadata,umask=027,fmask=117" ``` and then restarting WSL. It can also be enabled temporarily by this incantation: $ sudo umount /mnt/c && sudo mount -t drvfs C: /mnt/c -o metadata,uid=1000,gid=1000,umask=22,fmask=111 It's important to note that this modification is compatible with, but does not depend on, WSL. The helper functions in this commit operate independently and function normally on devices where WSL is not installed or not properly configured. Signed-off-by: xungeng li Signed-off-by: Johannes Schindelin --- compat/win32/fscache.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index efe6be7e5ef0a5..e247e9a6663131 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -8,6 +8,7 @@ #include "config.h" #include "../../mem-pool.h" #include "ntifs.h" +#include "wsl.h" static volatile long initialized; static DWORD dwTlsIndex; @@ -242,6 +243,21 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache, &(fse->u.s.st_mtim)); filetime_to_timespec((FILETIME *)&(fdata->CreationTime), &(fse->u.s.st_ctim)); + if (fdata->EaSize > 0 && + sizeof(buf) >= (size_t)(list ?
list->len+1 : 0) + fse->len+1 && + are_wsl_compatible_mode_bits_enabled()) { + size_t off = 0; + wchar_t wpath[MAX_LONG_PATH]; + if (list && list->len) { + memcpy(buf, list->dirent.d_name, list->len); + buf[list->len] = '/'; + off = list->len + 1; + } + memcpy(buf + off, fse->dirent.d_name, fse->len); + buf[off + fse->len] = '\0'; + if (xutftowcs_long_path(wpath, buf) >= 0) + copy_wsl_mode_bits_from_disk(wpath, -1, &fse->st_mode); + } return fse; } From 893ef6f64acb141c145859a18cca0a676433ed99 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Fri, 16 Nov 2018 10:59:18 -0500 Subject: [PATCH 187/218] fscache: make fscache_enable() thread safe The recent change to make fscache thread specific relied on fscache_enable() being called first from the primary thread before being called in parallel from worker threads. Make that more robust and protect it with a critical section to avoid any issues. Helped-by: Johannes Schindelin Signed-off-by: Ben Peart --- compat/mingw.c | 4 ++++ compat/win32/fscache.c | 23 +++++++++++++---------- compat/win32/fscache.h | 2 ++ 3 files changed, 19 insertions(+), 10 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 2f2852ec7eb305..f037489e5c3d50 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -16,6 +16,7 @@ #include "trace2.h" #include "win32.h" #include "win32/exit-process.h" +#include "win32/fscache.h" #include "win32/lazyload.h" #include "win32/wsl.h" #include "wrapper.h" @@ -4249,6 +4250,9 @@ int wmain(int argc, const wchar_t **wargv) InitializeCriticalSection(&pinfo_cs); InitializeCriticalSection(&phantom_symlinks_cs); + /* initialize critical section for fscache */ + InitializeCriticalSection(&fscache_cs); + /* set up default file mode and file modes for stdin/out/err */ _fmode = _O_BINARY; _setmode(_fileno(stdin), _O_BINARY); diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 6da112e4f90382..335c46dd36c544 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -10,7 +10,7 @@ static volatile 
long initialized; static DWORD dwTlsIndex; -static CRITICAL_SECTION mutex; +CRITICAL_SECTION fscache_cs; /* * Store one fscache per thread to avoid thread contention and locking. @@ -395,12 +395,12 @@ int fscache_enable(size_t initial_size) * opendir and lstat function pointers are redirected if * any threads are using the fscache. */ + EnterCriticalSection(&fscache_cs); if (!initialized) { - InitializeCriticalSection(&mutex); if (!dwTlsIndex) { dwTlsIndex = TlsAlloc(); if (dwTlsIndex == TLS_OUT_OF_INDEXES) { - LeaveCriticalSection(&mutex); + LeaveCriticalSection(&fscache_cs); return 0; } } @@ -409,12 +409,13 @@ int fscache_enable(size_t initial_size) opendir = fscache_opendir; lstat = fscache_lstat; } - InterlockedIncrement(&initialized); + initialized++; + LeaveCriticalSection(&fscache_cs); /* refcount the thread specific initialization */ cache = fscache_getcache(); if (cache) { - InterlockedIncrement(&cache->enabled); + cache->enabled++; } else { cache = (struct fscache *)xcalloc(1, sizeof(*cache)); cache->enabled = 1; @@ -448,7 +449,7 @@ void fscache_disable(void) BUG("fscache_disable() called on a thread where fscache has not been initialized"); if (!cache->enabled) BUG("fscache_disable() called on an fscache that is already disabled"); - InterlockedDecrement(&cache->enabled); + cache->enabled--; if (!cache->enabled) { TlsSetValue(dwTlsIndex, NULL); trace_printf_key(&trace_fscache, "fscache_disable: lstat %u, opendir %u, " @@ -461,12 +462,14 @@ void fscache_disable(void) } /* update the global fscache initialization */ - InterlockedDecrement(&initialized); + EnterCriticalSection(&fscache_cs); + initialized--; if (!initialized) { /* reset opendir and lstat to the original implementations */ opendir = dirent_opendir; lstat = mingw_lstat; } + LeaveCriticalSection(&fscache_cs); trace_printf_key(&trace_fscache, "fscache: disable\n"); return; @@ -652,7 +655,7 @@ void fscache_merge(struct fscache *dest) * isn't being used so the critical section only needs to 
prevent * the the child threads from stomping on each other. */ - EnterCriticalSection(&mutex); + EnterCriticalSection(&fscache_cs); hashmap_iter_init(&cache->map, &iter); while ((e = hashmap_iter_next(&iter))) @@ -664,9 +667,9 @@ void fscache_merge(struct fscache *dest) dest->opendir_requests += cache->opendir_requests; dest->fscache_requests += cache->fscache_requests; dest->fscache_misses += cache->fscache_misses; - LeaveCriticalSection(&mutex); + initialized--; + LeaveCriticalSection(&fscache_cs); free(cache); - InterlockedDecrement(&initialized); } diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h index 2eb8bf3f5cfee8..042b247a542554 100644 --- a/compat/win32/fscache.h +++ b/compat/win32/fscache.h @@ -6,6 +6,8 @@ * for each thread where caching is desired. */ +extern CRITICAL_SECTION fscache_cs; + int fscache_enable(size_t initial_size); #define enable_fscache(initial_size) fscache_enable(initial_size) From 9fef3a3be2fedaab2a62aa4891cead1737b05868 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Thu, 1 Nov 2018 11:40:51 -0400 Subject: [PATCH 188/218] status: disable and free fscache at the end of the status command At the end of the status command, disable and free the fscache so that we don't leak the memory and so that we can dump the fscache statistics. Signed-off-by: Ben Peart --- builtin/commit.c | 1 + 1 file changed, 1 insertion(+) diff --git a/builtin/commit.c b/builtin/commit.c index 3a182adc9c319b..98751de3f2ea68 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -1664,6 +1664,7 @@ struct repository *repo UNUSED) wt_status_print(&s); wt_status_collect_free_buffers(&s); + disable_fscache(); return 0; } From 53f478e497c527c75e694972a1a40f948087af5a Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 19 Nov 2018 20:34:13 +0100 Subject: [PATCH 189/218] tests: use the correct path separator with BusyBox BusyBox-w32 is a true Win32 application, i.e. it does not come with a POSIX emulation layer.
That also means that it does *not* use the Unix convention of separating the entries in the PATH variable using colons, but semicolons. However, there are also BusyBox ports to Windows which use a POSIX emulation layer such as Cygwin's or MSYS2's runtime, i.e. using colons as PATH separators. As a tell-tale, let's use the presence of semicolons in the PATH variable: on Unix, it is highly unlikely that it contains semicolons, and on Windows (without POSIX emulation), it is virtually guaranteed, as everybody should have both $SYSTEMROOT and $SYSTEMROOT/system32 in their PATH. Signed-off-by: Johannes Schindelin --- t/interop/interop-lib.sh | 8 ++++++-- t/lib-proto-disable.sh | 2 +- t/t0021-conversion.sh | 2 +- t/t0060-path-utils.sh | 24 ++++++++++++------------ t/t0061-run-command.sh | 6 +++--- t/t0300-credentials.sh | 2 +- t/t1504-ceiling-dirs.sh | 10 +++++----- t/t2300-cd-to-toplevel.sh | 2 +- t/t3418-rebase-continue.sh | 4 ++-- t/t5615-alternate-env.sh | 4 ++-- t/t5802-connect-helper.sh | 2 +- t/t7006-pager.sh | 4 ++-- t/t7606-merge-custom.sh | 2 +- t/t7811-grep-open.sh | 2 +- t/t9003-help-autocorrect.sh | 2 +- t/t9800-git-p4-basic.sh | 2 +- t/test-lib.sh | 17 +++++++++++++---- 17 files changed, 54 insertions(+), 41 deletions(-) diff --git a/t/interop/interop-lib.sh b/t/interop/interop-lib.sh index 1b5864d2a7f22c..1facc69d97741a 100644 --- a/t/interop/interop-lib.sh +++ b/t/interop/interop-lib.sh @@ -4,6 +4,10 @@ . 
../../GIT-BUILD-OPTIONS INTEROP_ROOT=$(pwd) BUILD_ROOT=$INTEROP_ROOT/build +case "$PATH" in +*\;*) PATH_SEP=\; ;; +*) PATH_SEP=: ;; +esac build_version () { if test -z "$1" @@ -57,7 +61,7 @@ wrap_git () { write_script "$1" <<-EOF GIT_EXEC_PATH="$2" export GIT_EXEC_PATH - PATH="$2:\$PATH" + PATH="$2$PATH_SEP\$PATH" export GIT_EXEC_PATH exec git "\$@" EOF @@ -71,7 +75,7 @@ generate_wrappers () { echo >&2 fatal: test tried to run generic git: $* exit 1 EOF - PATH=$(pwd)/.bin:$PATH + PATH=$(pwd)/.bin$PATH_SEP$PATH } VERSION_A=${GIT_TEST_VERSION_A:-$VERSION_A} diff --git a/t/lib-proto-disable.sh b/t/lib-proto-disable.sh index 890622be81642b..9db481e1be15b2 100644 --- a/t/lib-proto-disable.sh +++ b/t/lib-proto-disable.sh @@ -214,7 +214,7 @@ setup_ext_wrapper () { cd "$TRASH_DIRECTORY/remote" && eval "$*" EOF - PATH=$TRASH_DIRECTORY:$PATH && + PATH=$TRASH_DIRECTORY$PATH_SEP$PATH && export TRASH_DIRECTORY ' } diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh index f0d50d769e9fc5..0c5975336f2104 100755 --- a/t/t0021-conversion.sh +++ b/t/t0021-conversion.sh @@ -8,7 +8,7 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME . ./test-lib.sh . 
"$TEST_DIRECTORY"/lib-terminal.sh -PATH=$PWD:$PATH +PATH=$PWD$PATH_SEP$PATH TEST_ROOT="$(pwd)" write_script <<\EOF "$TEST_ROOT/rot13.sh" diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh index 3cdc4738644dbc..5abfa202c19dca 100755 --- a/t/t0060-path-utils.sh +++ b/t/t0060-path-utils.sh @@ -147,25 +147,25 @@ ancestor /foo /fo -1 ancestor /foo /foo -1 ancestor /foo /bar -1 ancestor /foo /foo/bar -1 -ancestor /foo /foo:/bar -1 -ancestor /foo /:/foo:/bar 0 -ancestor /foo /foo:/:/bar 0 -ancestor /foo /:/bar:/foo 0 +ancestor /foo "/foo$PATH_SEP/bar" -1 +ancestor /foo "/$PATH_SEP/foo$PATH_SEP/bar" 0 +ancestor /foo "/foo$PATH_SEP/$PATH_SEP/bar" 0 +ancestor /foo "/$PATH_SEP/bar$PATH_SEP/foo" 0 ancestor /foo/bar / 0 ancestor /foo/bar /fo -1 ancestor /foo/bar /foo 4 ancestor /foo/bar /foo/ba -1 -ancestor /foo/bar /:/fo 0 -ancestor /foo/bar /foo:/foo/ba 4 +ancestor /foo/bar "/$PATH_SEP/fo" 0 +ancestor /foo/bar "/foo$PATH_SEP/foo/ba" 4 ancestor /foo/bar /bar -1 ancestor /foo/bar /fo -1 -ancestor /foo/bar /foo:/bar 4 -ancestor /foo/bar /:/foo:/bar 4 -ancestor /foo/bar /foo:/:/bar 4 -ancestor /foo/bar /:/bar:/fo 0 -ancestor /foo/bar /:/bar 0 +ancestor /foo/bar "/foo$PATH_SEP/bar" 4 +ancestor /foo/bar "/$PATH_SEP/foo$PATH_SEP/bar" 4 +ancestor /foo/bar "/foo$PATH_SEP/$PATH_SEP/bar" 4 +ancestor /foo/bar "/$PATH_SEP/bar$PATH_SEP/fo" 0 +ancestor /foo/bar "/$PATH_SEP/bar" 0 ancestor /foo/bar /foo 4 -ancestor /foo/bar /foo:/bar 4 +ancestor /foo/bar "/foo$PATH_SEP/bar" 4 ancestor /foo/bar /bar -1 # Windows-specific: DOS drives, network shares diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh index 60cfe65979e215..905e90e1f72541 100755 --- a/t/t0061-run-command.sh +++ b/t/t0061-run-command.sh @@ -69,7 +69,7 @@ test_expect_success 'run_command does not try to execute a directory' ' cat bin2/greet EOF - PATH=$PWD/bin1:$PWD/bin2:$PATH \ + PATH=$PWD/bin1$PATH_SEP$PWD/bin2$PATH_SEP$PATH \ test-tool run-command run-command greet >actual 2>err && test_cmp bin2/greet actual && 
test_must_be_empty err @@ -86,7 +86,7 @@ test_expect_success POSIXPERM 'run_command passes over non-executable file' ' cat bin2/greet EOF - PATH=$PWD/bin1:$PWD/bin2:$PATH \ + PATH=$PWD/bin1$PATH_SEP$PWD/bin2$PATH_SEP$PATH \ test-tool run-command run-command greet >actual 2>err && test_cmp bin2/greet actual && test_must_be_empty err @@ -106,7 +106,7 @@ test_expect_success POSIXPERM,SANITY 'unreadable directory in PATH' ' git config alias.nitfol "!echo frotz" && chmod a-rx local-command && ( - PATH=./local-command:$PATH && + PATH=./local-command$PATH_SEP$PATH && git nitfol >actual ) && echo frotz >expect && diff --git a/t/t0300-credentials.sh b/t/t0300-credentials.sh index 64ead1571ae1e1..add4aeb6f92fd1 100755 --- a/t/t0300-credentials.sh +++ b/t/t0300-credentials.sh @@ -80,7 +80,7 @@ test_expect_success 'setup helper scripts' ' printf "username=\\007latrix Lestrange\\n" EOF - PATH="$PWD:$PATH" + PATH="$PWD$PATH_SEP$PATH" ' test_expect_success 'credential_fill invokes helper' ' diff --git a/t/t1504-ceiling-dirs.sh b/t/t1504-ceiling-dirs.sh index e04420f4368b93..ff9fb804827b59 100755 --- a/t/t1504-ceiling-dirs.sh +++ b/t/t1504-ceiling-dirs.sh @@ -84,9 +84,9 @@ then GIT_CEILING_DIRECTORIES="$TRASH_ROOT/top/" test_fail subdir_ceil_at_top_slash - GIT_CEILING_DIRECTORIES=":$TRASH_ROOT/top" + GIT_CEILING_DIRECTORIES="$PATH_SEP$TRASH_ROOT/top" test_prefix subdir_ceil_at_top_no_resolve "sub/dir/" - GIT_CEILING_DIRECTORIES=":$TRASH_ROOT/top/" + GIT_CEILING_DIRECTORIES="$PATH_SEP$TRASH_ROOT/top/" test_prefix subdir_ceil_at_top_slash_no_resolve "sub/dir/" fi @@ -116,13 +116,13 @@ GIT_CEILING_DIRECTORIES="$TRASH_ROOT/subdi" test_prefix subdir_ceil_at_subdi_slash "sub/dir/" -GIT_CEILING_DIRECTORIES="/foo:$TRASH_ROOT/sub" +GIT_CEILING_DIRECTORIES="/foo$PATH_SEP$TRASH_ROOT/sub" test_fail second_of_two -GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sub:/bar" +GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sub$PATH_SEP/bar" test_fail first_of_two -GIT_CEILING_DIRECTORIES="/foo:$TRASH_ROOT/sub:/bar" 
+GIT_CEILING_DIRECTORIES="/foo$PATH_SEP$TRASH_ROOT/sub$PATH_SEP/bar" test_fail second_of_three diff --git a/t/t2300-cd-to-toplevel.sh b/t/t2300-cd-to-toplevel.sh index c8de6d8a190220..91f523d5198d8d 100755 --- a/t/t2300-cd-to-toplevel.sh +++ b/t/t2300-cd-to-toplevel.sh @@ -16,7 +16,7 @@ test_cd_to_toplevel () { test_expect_success $3 "$2" ' ( cd '"'$1'"' && - PATH="$EXEC_PATH:$PATH" && + PATH="$EXEC_PATH$PATH_SEP$PATH" && . git-sh-setup && cd_to_toplevel && [ "$(pwd -P)" = "$TOPLEVEL" ] diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh index f9b8999db50f1b..e03a28c0aaad24 100755 --- a/t/t3418-rebase-continue.sh +++ b/t/t3418-rebase-continue.sh @@ -82,7 +82,7 @@ test_expect_success 'rebase --continue remembers merge strategy and options' ' rm -f actual && ( - PATH=./test-bin:$PATH && + PATH=./test-bin$PATH_SEP$PATH && test_must_fail git rebase -s funny -X"option=arg with space" \ -Xop\"tion\\ -X"new${LF}line " main topic ) && @@ -91,7 +91,7 @@ test_expect_success 'rebase --continue remembers merge strategy and options' ' echo "Resolved" >F2 && git add F2 && ( - PATH=./test-bin:$PATH && + PATH=./test-bin$PATH_SEP$PATH && git rebase --continue ) && test_cmp expect actual diff --git a/t/t5615-alternate-env.sh b/t/t5615-alternate-env.sh index 9d6aa2187f2aaa..1bfeccdeb49958 100755 --- a/t/t5615-alternate-env.sh +++ b/t/t5615-alternate-env.sh @@ -39,7 +39,7 @@ test_expect_success 'access alternate via absolute path' ' ' test_expect_success 'access multiple alternates' ' - check_obj "$PWD/one.git/objects:$PWD/two.git/objects" <<-EOF + check_obj "$PWD/one.git/objects$PATH_SEP$PWD/two.git/objects" <<-EOF $one blob $two blob EOF @@ -75,7 +75,7 @@ test_expect_success 'access alternate via relative path (subdir)' ' quoted='"one.git\057objects"' unquoted='two.git/objects' test_expect_success 'mix of quoted and unquoted alternates' ' - check_obj "$quoted:$unquoted" <<-EOF + check_obj "$quoted$PATH_SEP$unquoted" <<-EOF $one blob $two blob EOF diff --git 
a/t/t5802-connect-helper.sh b/t/t5802-connect-helper.sh index a7be375bceb8d3..26cbcebf3b2b24 100755 --- a/t/t5802-connect-helper.sh +++ b/t/t5802-connect-helper.sh @@ -86,7 +86,7 @@ test_expect_success 'set up fake git-daemon' ' "$TRASH_DIRECTORY/remote" EOF export TRASH_DIRECTORY && - PATH=$TRASH_DIRECTORY:$PATH + PATH=$TRASH_DIRECTORY$PATH_SEP$PATH ' test_expect_success 'ext command can connect to git daemon (no vhost)' ' diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh index 9717e825f0d7a5..e3aa496a286331 100755 --- a/t/t7006-pager.sh +++ b/t/t7006-pager.sh @@ -54,7 +54,7 @@ test_expect_success !MINGW,TTY 'LESS and LV envvars set by git-sh-setup' ' sane_unset LESS LV && PAGER="env >pager-env.out; wc" && export PAGER && - PATH="$(git --exec-path):$PATH" && + PATH="$(git --exec-path)$PATH_SEP$PATH" && export PATH && test_terminal sh -c ". git-sh-setup && git_pager" ) && @@ -388,7 +388,7 @@ test_default_pager() { EOF chmod +x \$less && ( - PATH=.:\$PATH && + PATH=.$PATH_SEP\$PATH && export PATH && $full_command ) && diff --git a/t/t7606-merge-custom.sh b/t/t7606-merge-custom.sh index 81fb7c474c14c1..8197a1c46bb5b6 100755 --- a/t/t7606-merge-custom.sh +++ b/t/t7606-merge-custom.sh @@ -23,7 +23,7 @@ test_expect_success 'set up custom strategy' ' EOF chmod +x git-merge-theirs && - PATH=.:$PATH && + PATH=.$PATH_SEP$PATH && export PATH ' diff --git a/t/t7811-grep-open.sh b/t/t7811-grep-open.sh index 3160be59fd2e26..1a98d733dceb86 100755 --- a/t/t7811-grep-open.sh +++ b/t/t7811-grep-open.sh @@ -52,7 +52,7 @@ test_expect_success SIMPLEPAGER 'git grep -O' ' EOF echo grep.h >expect.notless && - PATH=.:$PATH git grep -O GREP_PATTERN >out && + PATH=.$PATH_SEP$PATH git grep -O GREP_PATTERN >out && { test_cmp expect.less pager-args || test_cmp expect.notless pager-args diff --git a/t/t9003-help-autocorrect.sh b/t/t9003-help-autocorrect.sh index 8da318d2b543da..c7a03aae697ac0 100755 --- a/t/t9003-help-autocorrect.sh +++ b/t/t9003-help-autocorrect.sh @@ -13,7 +13,7 @@ 
test_expect_success 'setup' ' echo distimdistim was called EOF - PATH="$PATH:." && + PATH="$PATH$PATH_SEP." && export PATH && git commit --allow-empty -m "a single log entry" && diff --git a/t/t9800-git-p4-basic.sh b/t/t9800-git-p4-basic.sh index 0816763e46639c..b3dbd02961fae3 100755 --- a/t/t9800-git-p4-basic.sh +++ b/t/t9800-git-p4-basic.sh @@ -286,7 +286,7 @@ test_expect_success 'exit when p4 fails to produce marshaled output' ' EOF chmod 755 badp4dir/p4 && ( - PATH="$TRASH_DIRECTORY/badp4dir:$PATH" && + PATH="$TRASH_DIRECTORY/badp4dir$PATH_SEP$PATH" && export PATH && test_expect_code 1 git p4 clone --dest="$git" //depot >errs 2>&1 ) && diff --git a/t/test-lib.sh b/t/test-lib.sh index 770f4affb29d58..41e623f6460103 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -15,6 +15,15 @@ # You should have received a copy of the GNU General Public License # along with this program. If not, see https://www.gnu.org/licenses/ . +# On Unix/Linux, the path separator is the colon, on other systems it +# may be different, though. On Windows, for example, it is a semicolon. +# If the PATH variable contains semicolons, it is pretty safe to assume +# that the path separator is a semicolon. +case "$PATH" in +*\;*) PATH_SEP=\; ;; +*) PATH_SEP=: ;; +esac + # Test the binaries we have just built. The tests are kept in # t/ subdirectory and are run in 'trash directory' subdirectory. if test -z "$TEST_DIRECTORY" @@ -1392,7 +1401,7 @@ then done done IFS=$OLDIFS - PATH=$GIT_VALGRIND/bin:$PATH + PATH=$GIT_VALGRIND/bin$PATH_SEP$PATH GIT_EXEC_PATH=$GIT_VALGRIND/bin export GIT_VALGRIND GIT_VALGRIND_MODE="$valgrind" @@ -1404,7 +1413,7 @@ elif test -n "$GIT_TEST_INSTALLED" then GIT_EXEC_PATH=$($GIT_TEST_INSTALLED/git --exec-path) || error "Cannot run git from $GIT_TEST_INSTALLED." 
- PATH=$GIT_TEST_INSTALLED:$GIT_BUILD_DIR/t/helper:$PATH + PATH=$GIT_TEST_INSTALLED$PATH_SEP$GIT_BUILD_DIR/t/helper$PATH_SEP$PATH GIT_EXEC_PATH=${GIT_TEST_EXEC_PATH:-$GIT_EXEC_PATH} else # normal case, use ../bin-wrappers only unless $with_dashes: if test -n "$no_bin_wrappers" @@ -1420,12 +1429,12 @@ else # normal case, use ../bin-wrappers only unless $with_dashes: fi with_dashes=t fi - PATH="$git_bin_dir:$PATH" + PATH="$git_bin_dir$PATH_SEP$PATH" fi GIT_EXEC_PATH=$GIT_BUILD_DIR if test -n "$with_dashes" then - PATH="$GIT_BUILD_DIR:$GIT_BUILD_DIR/t/helper:$PATH" + PATH="$GIT_BUILD_DIR$PATH_SEP$GIT_BUILD_DIR/t/helper$PATH_SEP$PATH" fi fi GIT_TEMPLATE_DIR="$GIT_TEST_TEMPLATE_DIR" From b122cb907f11179ca6bb9706fd195780561e98fd Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 11 Dec 2018 12:17:49 +0100 Subject: [PATCH 190/218] fscache: implement an FSCache-aware is_mount_point() When FSCache is active, we can cache the reparse tag and use it directly to determine whether a path refers to an NTFS junction, without any additional, costly I/O. Note: this change only makes a difference with the next commit, which will make use of the FSCache in `git clean` (contingent on `core.fscache` set, of course). 
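As a rough, portable illustration of the dispatch idea (not the actual Windows code): a function pointer is switched to a cache-backed variant while the cache is enabled and switched back afterwards. All names below (`slow_is_mount_point`, `cached_is_mount_point`, `cache_enable`, `cache_disable`) are hypothetical stand-ins, a plain flag models the cached reparse tag, and a counter models the extra I/O that the uncached path would incur.

```c
#include <stddef.h>

/* Counts how often the "costly" uncached lookup ran (hypothetical). */
static int slow_calls;

/* Stand-in for the uncached check; the real one would query the disk. */
static int slow_is_mount_point(const char *path)
{
	(void)path;
	slow_calls++;
	return 0;
}

static int cache_active;
/* Stand-in for a reparse tag that was recorded while listing a directory. */
static int cached_is_junction;

/* Answer from the cache when it is active, otherwise fall back. */
static int cached_is_mount_point(const char *path)
{
	if (!cache_active)
		return slow_is_mount_point(path);
	return cached_is_junction;
}

/* Callers go through this pointer; it starts out uncached. */
static int (*is_mount_point_fn)(const char *path) = slow_is_mount_point;

/* Enabling the cache redirects the pointer to the cached variant. */
void cache_enable(void)
{
	cache_active = 1;
	is_mount_point_fn = cached_is_mount_point;
}

/* Disabling the cache restores the original implementation. */
void cache_disable(void)
{
	cache_active = 0;
	is_mount_point_fn = slow_is_mount_point;
}
```

In this sketch, a caller that always invokes `is_mount_point_fn(path)` picks up the cached answer only between `cache_enable()` and `cache_disable()`, without needing to know which implementation is active.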
Signed-off-by: Johannes Schindelin --- compat/mingw.c | 2 ++ compat/mingw.h | 3 ++- compat/win32/fscache.c | 40 ++++++++++++++++++++++++++++++++++++++++ compat/win32/fscache.h | 1 + 4 files changed, 45 insertions(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index f037489e5c3d50..563bf86e88baa5 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -3485,6 +3485,8 @@ pid_t waitpid(pid_t pid, int *status, int options) return -1; } +int (*win32_is_mount_point)(struct strbuf *path) = mingw_is_mount_point; + int mingw_is_mount_point(struct strbuf *path) { WIN32_FIND_DATAW findbuf = { 0 }; diff --git a/compat/mingw.h b/compat/mingw.h index 0e9295006025c5..62b15f12d1cf06 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -40,7 +40,8 @@ static inline void convert_slashes(char *path) } struct strbuf; int mingw_is_mount_point(struct strbuf *path); -#define is_mount_point mingw_is_mount_point +extern int (*win32_is_mount_point)(struct strbuf *path); +#define is_mount_point win32_is_mount_point #define CAN_UNLINK_MOUNT_POINTS 1 #define PATH_SEP ';' char *mingw_query_user_email(void); diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index e247e9a6663131..26ae9ab1c1a464 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -515,6 +515,7 @@ int fscache_enable(size_t initial_size) /* redirect opendir and lstat to the fscache implementations */ opendir = fscache_opendir; lstat = fscache_lstat; + win32_is_mount_point = fscache_is_mount_point; } initialized++; LeaveCriticalSection(&fscache_cs); @@ -575,6 +576,7 @@ void fscache_disable(void) /* reset opendir and lstat to the original implementations */ opendir = dirent_opendir; lstat = mingw_lstat; + win32_is_mount_point = mingw_is_mount_point; } LeaveCriticalSection(&fscache_cs); @@ -662,6 +664,44 @@ int fscache_lstat(const char *filename, struct stat *st) return 0; } +/* + * is_mount_point() replacement, uses cache if enabled, otherwise falls + * back to mingw_is_mount_point(). 
+ */ +int fscache_is_mount_point(struct strbuf *path) +{ + int dirlen, base, len; +#pragma GCC diagnostic push +#ifdef __clang__ +#pragma GCC diagnostic ignored "-Wflexible-array-extensions" +#endif + struct heap_fsentry key[2]; +#pragma GCC diagnostic pop + struct fsentry *fse; + struct fscache *cache = fscache_getcache(); + + if (!cache || !do_fscache_enabled(cache, path->buf)) + return mingw_is_mount_point(path); + + cache->lstat_requests++; + /* split path into path + name */ + len = path->len; + if (len && is_dir_sep(path->buf[len - 1])) + len--; + base = len; + while (base && !is_dir_sep(path->buf[base - 1])) + base--; + dirlen = base ? base - 1 : 0; + + /* lookup entry for path + name in cache */ + fsentry_init(&key[0].u.ent, NULL, path->buf, dirlen); + fsentry_init(&key[1].u.ent, &key[0].u.ent, path->buf + base, len - base); + fse = fscache_get(cache, &key[1].u.ent); + if (!fse) + return mingw_is_mount_point(path); + return fse->reparse_tag == IO_REPARSE_TAG_MOUNT_POINT; +} + typedef struct fscache_DIR { struct DIR base_dir; /* extend base struct DIR */ struct fsentry *pfsentry; diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h index 042b247a542554..386c770a85d321 100644 --- a/compat/win32/fscache.h +++ b/compat/win32/fscache.h @@ -22,6 +22,7 @@ void fscache_flush(void); DIR *fscache_opendir(const char *dir); int fscache_lstat(const char *file_name, struct stat *buf); +int fscache_is_mount_point(struct strbuf *path); /* opaque fscache structure */ struct fscache; From 1f339a4db07eb2c9e588d60d12b0a5443542de70 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Thu, 15 Nov 2018 14:15:40 -0500 Subject: [PATCH 191/218] fscache: teach fscache to use NtQueryDirectoryFile Using FindFirstFileExW() requires the OS to allocate a 64K buffer for each directory and then free it when we call FindClose(). Update fscache to call the underlying kernel API NtQueryDirectoryFile so that we can do the buffer management ourselves. 
That allows us to allocate a single buffer for the lifetime of the cache and reuse it for each directory. This change improves performance of 'git status' by 18% in a repo with ~200k files and 30k folders. Documentation for NtQueryDirectoryFile can be found at: https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/ntifs/nf-ntifs-ntquerydirectoryfile https://docs.microsoft.com/en-us/windows/desktop/FileIO/file-attribute-constants https://docs.microsoft.com/en-us/windows/desktop/fileio/reparse-point-tags To determine if the specified directory is a symbolic link, inspect the FileAttributes member to see if the FILE_ATTRIBUTE_REPARSE_POINT flag is set. If so, EaSize will contain the reparse tag (this is a so-far undocumented feature, but confirmed by the NTFS developers). To determine if the reparse point is a symbolic link (and not some other form of reparse point), test whether the tag value equals the value IO_REPARSE_TAG_SYMLINK. The NtQueryDirectoryFile() call works best (and on Windows 8.1 and earlier, it works *only*) with buffer sizes up to 64kB, i.e. 32k wide characters, so let's use that as our buffer size.
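The buffer handling can be pictured with a small, portable sketch. It assumes only that the buffer holds variable-length records, each carrying the byte offset to the next one, with 0 marking the last. `struct record_header`, `put_record` and `walk_records` are hypothetical and much simpler than the real FILE_FULL_DIR_INFORMATION layout; the walk merely shows how one caller-owned buffer is traversed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical record header; the real structure has more fields. */
struct record_header {
	uint32_t next_entry_offset; /* bytes from this record to the next, 0 = last */
	uint32_t name_length;       /* length in bytes of the name that follows */
};

/* Write one record at buf + off and return the offset of the next record. */
size_t put_record(unsigned char *buf, size_t off, const char *name, int is_last)
{
	struct record_header h;
	size_t size = sizeof(h) + strlen(name);

	size = (size + 7) & ~(size_t)7; /* keep records 8-byte aligned */
	h.next_entry_offset = is_last ? 0 : (uint32_t)size;
	h.name_length = (uint32_t)strlen(name);
	memcpy(buf + off, &h, sizeof(h));
	memcpy(buf + off + sizeof(h), name, h.name_length);
	return off + size;
}

/* Walk the chain of records; return the total name bytes and set *count. */
size_t walk_records(const unsigned char *buf, size_t *count)
{
	size_t off = 0, total = 0;

	*count = 0;
	for (;;) {
		struct record_header h;

		/* copy the header out to avoid unaligned access */
		memcpy(&h, buf + off, sizeof(h));
		total += h.name_length;
		(*count)++;
		if (!h.next_entry_offset)
			break; /* buffer exhausted; real code would refill it here */
		off += h.next_entry_offset;
	}
	return total;
}
```

Because the caller owns the buffer, the same memory can be refilled and walked again for the next batch or the next directory, which is the point of doing the buffer management ourselves.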
Signed-off-by: Ben Peart Signed-off-by: Johannes Schindelin --- compat/win32/fscache.c | 126 +++++++++++++++++++++++++++++---------- compat/win32/ntifs.h | 131 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 226 insertions(+), 31 deletions(-) create mode 100644 compat/win32/ntifs.h diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 335c46dd36c544..973ae7efb246c9 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -7,6 +7,7 @@ #include "../../trace.h" #include "config.h" #include "../../mem-pool.h" +#include "ntifs.h" static volatile long initialized; static DWORD dwTlsIndex; @@ -26,6 +27,13 @@ struct fscache { unsigned int opendir_requests; unsigned int fscache_requests; unsigned int fscache_misses; + /* + * 32k wide characters translates to 64kB, which is the maximum that + * Windows 8.1 and earlier can handle. On network drives, not only + * the client's Windows version matters, but also the server's, + * therefore we need to keep this to 64kB. + */ + WCHAR buffer[32 * 1024]; }; static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE); @@ -166,29 +174,47 @@ static void fsentry_release(struct fsentry *fse) InterlockedDecrement(&(fse->u.refcnt)); } +static int xwcstoutfn(char *utf, int utflen, const wchar_t *wcs, int wcslen) +{ + if (!wcs || !utf || utflen < 1) { + errno = EINVAL; + return -1; + } + utflen = WideCharToMultiByte(CP_UTF8, 0, wcs, wcslen, utf, utflen, NULL, NULL); + if (utflen) + return utflen; + errno = ERANGE; + return -1; +} + /* - * Allocate and initialize an fsentry from a WIN32_FIND_DATA structure. + * Allocate and initialize an fsentry from a FILE_FULL_DIR_INFORMATION structure. 
*/ static struct fsentry *fseentry_create_entry(struct fscache *cache, struct fsentry *list, - const WIN32_FIND_DATAW *fdata) + PFILE_FULL_DIR_INFORMATION fdata) { char buf[MAX_PATH * 3]; int len; struct fsentry *fse; - len = xwcstoutf(buf, fdata->cFileName, ARRAY_SIZE(buf)); + + len = xwcstoutfn(buf, ARRAY_SIZE(buf), fdata->FileName, fdata->FileNameLength / sizeof(wchar_t)); fse = fsentry_alloc(cache, list, buf, len); - fse->st_mode = file_attr_to_st_mode(fdata->dwFileAttributes, - IO_REPARSE_TAG_SYMLINK); + fse->st_mode = file_attr_to_st_mode(fdata->FileAttributes, + fdata->EaSize); fse->dirent.d_type = S_ISREG(fse->st_mode) ? DT_REG : S_ISDIR(fse->st_mode) ? DT_DIR : DT_LNK; - fse->u.s.st_size = (((off64_t) (fdata->nFileSizeHigh)) << 32) - | fdata->nFileSizeLow; - filetime_to_timespec(&(fdata->ftLastAccessTime), &(fse->u.s.st_atim)); - filetime_to_timespec(&(fdata->ftLastWriteTime), &(fse->u.s.st_mtim)); - filetime_to_timespec(&(fdata->ftCreationTime), &(fse->u.s.st_ctim)); + fse->u.s.st_size = S_ISLNK(fse->st_mode) ? 
MAX_PATH : + fdata->EndOfFile.LowPart | + (((off_t)fdata->EndOfFile.HighPart) << 32); + filetime_to_timespec((FILETIME *)&(fdata->LastAccessTime), + &(fse->u.s.st_atim)); + filetime_to_timespec((FILETIME *)&(fdata->LastWriteTime), + &(fse->u.s.st_mtim)); + filetime_to_timespec((FILETIME *)&(fdata->CreationTime), + &(fse->u.s.st_ctim)); return fse; } @@ -201,8 +227,10 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache, static struct fsentry *fsentry_create_list(struct fscache *cache, const struct fsentry *dir, int *dir_not_found) { - wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */ - WIN32_FIND_DATAW fdata; + wchar_t pattern[MAX_PATH]; + NTSTATUS status; + IO_STATUS_BLOCK iosb; + PFILE_FULL_DIR_INFORMATION di; HANDLE h; int wlen; struct fsentry *list, **phead; @@ -218,15 +246,18 @@ static struct fsentry *fsentry_create_list(struct fscache *cache, const struct f return NULL; } - /* append optional '/' and wildcard '*' */ - if (wlen) - pattern[wlen++] = '/'; - pattern[wlen++] = '*'; - pattern[wlen] = 0; + /* handle CWD */ + if (!wlen) { + wlen = GetCurrentDirectoryW(ARRAY_SIZE(pattern), pattern); + if (!wlen || wlen >= (ssize_t)ARRAY_SIZE(pattern)) { + errno = wlen ? 
ENAMETOOLONG : err_win_to_posix(GetLastError()); + return NULL; + } + } - /* open find handle */ - h = FindFirstFileExW(pattern, FindExInfoBasic, &fdata, FindExSearchNameMatch, - NULL, FIND_FIRST_EX_LARGE_FETCH); + h = CreateFileW(pattern, FILE_LIST_DIRECTORY, + FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, + NULL, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL); if (h == INVALID_HANDLE_VALUE) { err = GetLastError(); *dir_not_found = 1; /* or empty directory */ @@ -243,22 +274,55 @@ static struct fsentry *fsentry_create_list(struct fscache *cache, const struct f /* walk directory and build linked list of fsentry structures */ phead = &list->next; - do { - *phead = fseentry_create_entry(cache, list, &fdata); + status = NtQueryDirectoryFile(h, NULL, 0, 0, &iosb, cache->buffer, + sizeof(cache->buffer), FileFullDirectoryInformation, FALSE, NULL, FALSE); + if (!NT_SUCCESS(status)) { + /* + * NtQueryDirectoryFile returns STATUS_INVALID_PARAMETER when + * asked to enumerate an invalid directory (ie it is a file + * instead of a directory). Verify that is the actual cause + * of the error. + */ + if (status == (NTSTATUS)STATUS_INVALID_PARAMETER) { + DWORD attributes = GetFileAttributesW(pattern); + if (!(attributes & FILE_ATTRIBUTE_DIRECTORY)) + status = ERROR_DIRECTORY; + } + goto Error; + } + di = (PFILE_FULL_DIR_INFORMATION)(cache->buffer); + for (;;) { + + *phead = fseentry_create_entry(cache, list, di); phead = &(*phead)->next; - } while (FindNextFileW(h, &fdata)); - /* remember result of last FindNextFile, then close find handle */ - err = GetLastError(); - FindClose(h); + /* If there is no offset in the entry, the buffer has been exhausted. 
*/ + if (di->NextEntryOffset == 0) { + status = NtQueryDirectoryFile(h, NULL, 0, 0, &iosb, cache->buffer, + sizeof(cache->buffer), FileFullDirectoryInformation, FALSE, NULL, FALSE); + if (!NT_SUCCESS(status)) { + if (status == STATUS_NO_MORE_FILES) + break; + goto Error; + } + + di = (PFILE_FULL_DIR_INFORMATION)(cache->buffer); + continue; + } + + /* Advance to the next entry. */ + di = (PFILE_FULL_DIR_INFORMATION)(((PUCHAR)di) + di->NextEntryOffset); + } - /* return the list if we've got all the files */ - if (err == ERROR_NO_MORE_FILES) - return list; + CloseHandle(h); + return list; - /* otherwise release the list and return error */ +Error: + trace_printf_key(&trace_fscache, + "fscache: status(%ld) unable to query directory " + "contents '%s'\n", status, dir->dirent.d_name); + CloseHandle(h); fsentry_release(list); - errno = err_win_to_posix(err); return NULL; } diff --git a/compat/win32/ntifs.h b/compat/win32/ntifs.h new file mode 100644 index 00000000000000..64ed792c52f352 --- /dev/null +++ b/compat/win32/ntifs.h @@ -0,0 +1,131 @@ +#ifndef _NTIFS_ +#define _NTIFS_ + +/* + * Copy necessary structures and definitions out of the Windows DDK + * to enable calling NtQueryDirectoryFile() + */ + +typedef _Return_type_success_(return >= 0) LONG NTSTATUS; +#define NT_SUCCESS(Status) (((NTSTATUS)(Status)) >= 0) + +#if !defined(_NTSECAPI_) && !defined(_WINTERNL_) && \ + !defined(__UNICODE_STRING_DEFINED) +#define __UNICODE_STRING_DEFINED +typedef struct _UNICODE_STRING { + USHORT Length; + USHORT MaximumLength; + PWSTR Buffer; +} UNICODE_STRING; +typedef UNICODE_STRING *PUNICODE_STRING; +typedef const UNICODE_STRING *PCUNICODE_STRING; +#endif /* !_NTSECAPI_ && !_WINTERNL_ && !__UNICODE_STRING_DEFINED */ + +typedef enum _FILE_INFORMATION_CLASS { + FileDirectoryInformation = 1, + FileFullDirectoryInformation, + FileBothDirectoryInformation, + FileBasicInformation, + FileStandardInformation, + FileInternalInformation, + FileEaInformation, + FileAccessInformation, + 
FileNameInformation, + FileRenameInformation, + FileLinkInformation, + FileNamesInformation, + FileDispositionInformation, + FilePositionInformation, + FileFullEaInformation, + FileModeInformation, + FileAlignmentInformation, + FileAllInformation, + FileAllocationInformation, + FileEndOfFileInformation, + FileAlternateNameInformation, + FileStreamInformation, + FilePipeInformation, + FilePipeLocalInformation, + FilePipeRemoteInformation, + FileMailslotQueryInformation, + FileMailslotSetInformation, + FileCompressionInformation, + FileObjectIdInformation, + FileCompletionInformation, + FileMoveClusterInformation, + FileQuotaInformation, + FileReparsePointInformation, + FileNetworkOpenInformation, + FileAttributeTagInformation, + FileTrackingInformation, + FileIdBothDirectoryInformation, + FileIdFullDirectoryInformation, + FileValidDataLengthInformation, + FileShortNameInformation, + FileIoCompletionNotificationInformation, + FileIoStatusBlockRangeInformation, + FileIoPriorityHintInformation, + FileSfioReserveInformation, + FileSfioVolumeInformation, + FileHardLinkInformation, + FileProcessIdsUsingFileInformation, + FileNormalizedNameInformation, + FileNetworkPhysicalNameInformation, + FileIdGlobalTxDirectoryInformation, + FileIsRemoteDeviceInformation, + FileAttributeCacheInformation, + FileNumaNodeInformation, + FileStandardLinkInformation, + FileRemoteProtocolInformation, + FileMaximumInformation +} FILE_INFORMATION_CLASS, *PFILE_INFORMATION_CLASS; + +typedef struct _FILE_FULL_DIR_INFORMATION { + ULONG NextEntryOffset; + ULONG FileIndex; + LARGE_INTEGER CreationTime; + LARGE_INTEGER LastAccessTime; + LARGE_INTEGER LastWriteTime; + LARGE_INTEGER ChangeTime; + LARGE_INTEGER EndOfFile; + LARGE_INTEGER AllocationSize; + ULONG FileAttributes; + ULONG FileNameLength; + ULONG EaSize; + WCHAR FileName[1]; +} FILE_FULL_DIR_INFORMATION, *PFILE_FULL_DIR_INFORMATION; + +typedef struct _IO_STATUS_BLOCK { + union { + NTSTATUS Status; + PVOID Pointer; + } u; + ULONG_PTR 
Information; +} IO_STATUS_BLOCK, *PIO_STATUS_BLOCK; + +typedef VOID +(NTAPI *PIO_APC_ROUTINE)( + IN PVOID ApcContext, + IN PIO_STATUS_BLOCK IoStatusBlock, + IN ULONG Reserved); + +NTSYSCALLAPI +NTSTATUS +NTAPI +NtQueryDirectoryFile( + _In_ HANDLE FileHandle, + _In_opt_ HANDLE Event, + _In_opt_ PIO_APC_ROUTINE ApcRoutine, + _In_opt_ PVOID ApcContext, + _Out_ PIO_STATUS_BLOCK IoStatusBlock, + _Out_writes_bytes_(Length) PVOID FileInformation, + _In_ ULONG Length, + _In_ FILE_INFORMATION_CLASS FileInformationClass, + _In_ BOOLEAN ReturnSingleEntry, + _In_opt_ PUNICODE_STRING FileName, + _In_ BOOLEAN RestartScan +); + +#define STATUS_NO_MORE_FILES ((NTSTATUS)0x80000006L) + +#endif From 11a47d2047461374c964e7d9ae7ed25f3dc7500f Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 30 Jun 2017 00:35:40 +0200 Subject: [PATCH 192/218] mingw: only use Bash-ism `builtin pwd -W` when available Traditionally, Git for Windows' SDK uses Bash as its default shell. However, other Unix shells are available, too. Most notably, the Win32 port of BusyBox comes with `ash` whose `pwd` command already prints Windows paths as Git for Windows wants them, while there is not even a `builtin` command. Therefore, let's be careful not to override `pwd` unless we know that the `builtin` command is available. Signed-off-by: Johannes Schindelin --- git-sh-setup.sh | 14 ++++++++++---- t/test-lib.sh | 14 ++++++++++---- 2 files changed, 20 insertions(+), 8 deletions(-) diff --git a/git-sh-setup.sh b/git-sh-setup.sh index fad4f9df94e143..c51ad34148ccf3 100644 --- a/git-sh-setup.sh +++ b/git-sh-setup.sh @@ -306,10 +306,16 @@ case $(uname -s) in /usr/bin/find "$@" } fi - # git sees Windows-style pwd - pwd () { - builtin pwd -W - } + # On Windows, Git wants Windows paths. But /usr/bin/pwd spits out + # Unix-style paths. At least in Bash, we have a builtin pwd that + # understands the -W option to force "mixed" paths, i.e. with drive + # prefix but still with forward slashes. 
Let's use that, if available. + if type builtin >/dev/null 2>&1 + then + pwd () { + builtin pwd -W + } + fi is_absolute_path () { case "$1" in [/\\]* | [A-Za-z]:*) diff --git a/t/test-lib.sh b/t/test-lib.sh index 41e623f6460103..ba9d89781dc6a5 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1685,10 +1685,16 @@ Darwin) /usr/bin/find "$@" } fi - # git sees Windows-style pwd - pwd () { - builtin pwd -W - } + # On Windows, Git wants Windows paths. But /usr/bin/pwd spits out + # Unix-style paths. At least in Bash, we have a builtin pwd that + # understands the -W option to force "mixed" paths, i.e. with drive + # prefix but still with forward slashes. Let's use that, if available. + if type builtin >/dev/null 2>&1 + then + pwd () { + builtin pwd -W + } + fi # no POSIX permissions # backslashes in pathspec are converted to '/' # exec does not inherit the PID From ce6f9a8ebb938481e20113fb6a43cd9fb6f43fe3 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 11 Dec 2018 12:17:49 +0100 Subject: [PATCH 193/218] clean: make use of FSCache The `git clean` command needs to enumerate plenty of files and directories, and can therefore benefit from the FSCache. 
Signed-off-by: Johannes Schindelin --- builtin/clean.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/builtin/clean.c b/builtin/clean.c index 6ed555000f9a41..e15d595c3dc7cc 100644 --- a/builtin/clean.c +++ b/builtin/clean.c @@ -1042,6 +1042,7 @@ int cmd_clean(int argc, if (repo_read_index(the_repository) < 0) die(_("index file corrupt")); + enable_fscache(the_repository->index->cache_nr); pl = add_pattern_list(&dir, EXC_CMDL, "--exclude option"); for (i = 0; i < exclude_list.nr; i++) @@ -1116,6 +1117,7 @@ int cmd_clean(int argc, } } + disable_fscache(); strbuf_release(&abs_path); strbuf_release(&buf); string_list_clear(&del_list, 0); From 507c935cc21fc5d9ce102b81970141531ad410ba Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 30 Jun 2017 22:32:33 +0200 Subject: [PATCH 194/218] tests (mingw): remove Bash-specific pwd option The -W option is only understood by MSYS2 Bash's pwd command. We already make sure to override `pwd` by `builtin pwd -W` for MINGW, so let's not double the effort here. This will also help when switching the shell to another one (such as BusyBox' ash) whose pwd does *not* understand the -W option. 
Signed-off-by: Johannes Schindelin --- t/t9902-completion.sh | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh index 2f9a597ec7f493..e3c8259ce73b95 100755 --- a/t/t9902-completion.sh +++ b/t/t9902-completion.sh @@ -139,12 +139,7 @@ invalid_variable_name='${foo.bar}' actual="$TRASH_DIRECTORY/actual" -if test_have_prereq MINGW -then - ROOT="$(pwd -W)" -else - ROOT="$(pwd)" -fi +ROOT="$(pwd)" test_expect_success 'setup for __git_find_repo_path/__gitdir tests' ' mkdir -p subdir/subsubdir && From f0c43368b1561d50d963286fe98d0795b2b2c180 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 19 Jul 2017 17:07:56 +0200 Subject: [PATCH 195/218] test-lib: add BUSYBOX prerequisite When running with BusyBox, we will want to avoid calling executables on the PATH that are implemented in BusyBox itself. Signed-off-by: Johannes Schindelin --- t/test-lib.sh | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/t/test-lib.sh b/t/test-lib.sh index ba9d89781dc6a5..fe4f624ee11d40 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1889,6 +1889,10 @@ test_lazy_prereq UNZIP ' test $? -ne 127 ' +test_lazy_prereq BUSYBOX ' + case "$($SHELL --help 2>&1)" in *BusyBox*) true;; *) false;; esac +' + run_with_limited_cmdline () { (ulimit -s 128 && "$@") } From c67399fa644409df7dedf74d41482e3165714b59 Mon Sep 17 00:00:00 2001 From: Doug Kelly Date: Wed, 8 Jan 2014 20:28:15 -0600 Subject: [PATCH 196/218] pack-objects (mingw): demonstrate a segmentation fault with large deltas There is a problem in the way 9ac3f0e5b3e4 (pack-objects: fix performance issues on packing large deltas, 2018-07-22) initializes that mutex in the `packing_data` struct. The problem manifests in a segmentation fault on Windows, when a mutex (AKA critical section) is accessed without being initialized. (With pthreads, you apparently do not really have to initialize them?) This was reported in https://github.com/git-for-windows/git/issues/1839. 
Signed-off-by: Doug Kelly Signed-off-by: Johannes Schindelin --- t/meson.build | 1 + t/t7429-submodule-long-path.sh | 106 +++++++++++++++++++++++++++++++++ 2 files changed, 107 insertions(+) create mode 100755 t/t7429-submodule-long-path.sh diff --git a/t/meson.build b/t/meson.build index 94bc6ccd1bd72a..4a6a8704b615e2 100644 --- a/t/meson.build +++ b/t/meson.build @@ -905,6 +905,7 @@ integration_tests = [ 't7424-submodule-mixed-ref-formats.sh', 't7425-submodule-gitdir-path-extension.sh', 't7426-submodule-get-default-remote.sh', + 't7429-submodule-long-path.sh', 't7450-bad-git-dotfiles.sh', 't7500-commit-template-squash-signoff.sh', 't7501-commit-basic-functionality.sh', diff --git a/t/t7429-submodule-long-path.sh b/t/t7429-submodule-long-path.sh new file mode 100755 index 00000000000000..f692cedbff7ff8 --- /dev/null +++ b/t/t7429-submodule-long-path.sh @@ -0,0 +1,106 @@ +#!/bin/sh +# +# Copyright (c) 2013 Doug Kelly +# + +test_description='Test submodules with a path near PATH_MAX + +This test verifies that "git submodule" initialization, update and clones work, including with recursive submodules and paths approaching PATH_MAX (260 characters on Windows) +' + +TEST_NO_CREATE_REPO=1 +. 
./test-lib.sh + +longpath="" +for (( i=0; i<4; i++ )); do + longpath="0123456789abcdefghijklmnopqrstuvwxyz$longpath" +done +# Pick a substring maximum of 90 characters +# This should be good, since we'll add on a lot for temp directories +longpath=${longpath:0:90}; export longpath + +test_expect_failure 'submodule with a long path' ' + git config --global protocol.file.allow always && + GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \ + git -c init.defaultBranch=long init --bare remote && + test_create_repo bundle1 && + ( + cd bundle1 && + test_commit "shoot" && + git rev-parse --verify HEAD >../expect + ) && + mkdir home && + ( + cd home && + git clone ../remote test && + cd test && + git checkout -B long && + git submodule add ../bundle1 $longpath && + test_commit "sogood" && + ( + cd $longpath && + git rev-parse --verify HEAD >actual && + test_cmp ../../../expect actual + ) && + git push origin long + ) && + mkdir home2 && + ( + cd home2 && + git clone ../remote test && + cd test && + git checkout long && + git submodule update --init && + ( + cd $longpath && + git rev-parse --verify HEAD >actual && + test_cmp ../../../expect actual + ) + ) +' + +test_expect_failure 'recursive submodule with a long path' ' + GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \ + git -c init.defaultBranch=long init --bare super && + test_create_repo child && + ( + cd child && + test_commit "shoot" && + git rev-parse --verify HEAD >../expect + ) && + test_create_repo parent && + ( + cd parent && + git submodule add ../child $longpath && + test_commit "aim" + ) && + mkdir home3 && + ( + cd home3 && + git clone ../super test && + cd test && + git checkout -B long && + git submodule add ../parent foo && + git submodule update --init --recursive && + test_commit "sogood" && + ( + cd foo/$longpath && + git rev-parse --verify HEAD >actual && + test_cmp ../../../../expect actual + ) && + git push origin long + ) && + mkdir home4 && + ( + cd home4 && + git clone ../super test --recursive && + ( + cd 
test/foo/$longpath && + git rev-parse --verify HEAD >actual && + test_cmp ../../../../expect actual + ) + ) +' +unset longpath + +test_done From dd3c78f8d39127787ff20cf4db3f49c701913446 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sat, 5 Aug 2017 21:36:01 +0200 Subject: [PATCH 197/218] t5003: use binary file from t/lib-diff/ At some stage, t5003-archive-zip wants to add a file that is not ASCII. To that end, it uses /bin/sh. But that file may actually not exist (it is too easy to forget that not all the world is Unix/Linux...)! Besides, we already have perfectly fine binary files intended for use solely by the tests. So let's use one of them instead. Signed-off-by: Johannes Schindelin --- t/t5003-archive-zip.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh index c8c1c5c06b6037..8f2a2cbc6b8103 100755 --- a/t/t5003-archive-zip.sh +++ b/t/t5003-archive-zip.sh @@ -88,7 +88,7 @@ test_expect_success \ 'mkdir a && echo simple textfile >a/a && mkdir a/bin && - cp /bin/sh a/bin && + cp "$TEST_DIRECTORY/lib-diff/test-binary-1.png" a/bin && printf "text\r" >a/text.cr && printf "text\r\n" >a/text.crlf && printf "text\n" >a/text.lf && From 651cc1123446f0f62d71cd585c8cc201ec1492af Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Tue, 28 Jul 2015 21:07:41 +0200 Subject: [PATCH 198/218] mingw: support long paths Windows paths are typically limited to MAX_PATH = 260 characters, even though the underlying NTFS file system supports paths up to 32,767 chars. This limitation is also evident in Windows Explorer, cmd.exe and many other applications (including IDEs). Particularly annoying is that most Windows APIs return bogus error codes if a relative path only barely exceeds MAX_PATH in conjunction with the current directory, e.g. ERROR_PATH_NOT_FOUND / ENOENT instead of the infinitely more helpful ERROR_FILENAME_EXCED_RANGE / ENAMETOOLONG. 
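The interaction between a relative path and the current directory described above can be illustrated with a small, portable sketch. It is not code from this patch: the helper name `sketch_may_exceed_max_path`, the constant `SKETCH_MAX_PATH` and the convention that `cwd_len` already counts one separator are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Windows' classic limit, redefined so the sketch builds on any platform. */
#define SKETCH_MAX_PATH 260

/*
 * Hypothetical helper (not part of this patch): returns 1 if `path` may
 * hit the limit once the current directory is taken into account, and 0
 * otherwise. `cwd_len` is assumed to be the length of the current
 * directory including one path separator.
 */
static int sketch_may_exceed_max_path(const wchar_t *path, size_t cwd_len)
{
	size_t len;

	assert(path != NULL);
	len = wcslen(path);

	/*
	 * Paths starting with a separator or carrying a drive prefix are
	 * not simply appended to the current directory, so only their own
	 * length is checked in this simplified sketch.
	 */
	if (len >= 2 &&
	    (path[0] == L'\\' || path[0] == L'/' || path[1] == L':'))
		return len >= SKETCH_MAX_PATH;

	/* Relative path: the current directory counts against the limit. */
	return cwd_len + len >= SKETCH_MAX_PATH;
}
```

A short relative path is harmless in a shallow directory but can cross the limit in a deep one, which is the case where the misleading error codes show up.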
Many Windows wide char APIs support paths longer than MAX_PATH through the file namespace prefix ('\\?\' or '\\?\UNC\') followed by an absolute path. Notable exceptions include functions dealing with executables and the current directory (CreateProcess, LoadLibrary, Get/SetCurrentDirectory) as well as the entire shell API (ShellExecute, SHGetSpecialFolderPath...). Introduce a handle_long_path function to check the length of a specified path properly (and fail with ENAMETOOLONG), and to optionally expand long paths using the '\\?\' file namespace prefix. Short paths will not be modified, so we don't need to worry about device names (NUL, CON, AUX). Contrary to MSDN docs, the GetFullPathNameW function doesn't seem to be limited to MAX_PATH (at least not on Win7), so we can use it to do the heavy lifting of the conversion (translate '/' to '\', eliminate '.' and '..', and make an absolute path). Add long path error checking to xutftowcs_path for APIs with a hard MAX_PATH limit. Add a new MAX_LONG_PATH constant and xutftowcs_long_path function for APIs that support long paths. While improved error checking is always active, long path support must be explicitly enabled via the 'core.longpaths' option. This is to prevent end users from shooting themselves in the foot by checking out files that Windows Explorer, cmd/bash or their favorite IDE cannot handle. Test suite: Test the case when the full pathname length of a dir is close to 260 (MAX_PATH). Bug report and an original reproducer by Andrey Rogozhnikov: https://github.com/msysgit/git/pull/122#issuecomment-43604199 [jes: adjusted test number to avoid conflicts, added support for chdir(), etc] Thanks-to: Martin W. 
Kirst Thanks-to: Doug Kelly Original-test-by: Andrey Rogozhnikov Signed-off-by: Karsten Blees Signed-off-by: Stepan Kasal Signed-off-by: Johannes Schindelin Signed-off-by: Josh Soref --- Documentation/config/core.adoc | 7 ++ compat/mingw.c | 194 +++++++++++++++++++++++++-------- compat/mingw.h | 75 ++++++++++++- compat/win32/dirent.c | 17 ++- compat/win32/fscache.c | 22 ++-- t/meson.build | 1 + t/t2031-checkout-long-paths.sh | 102 +++++++++++++++++ t/t7429-submodule-long-path.sh | 24 ++-- 8 files changed, 361 insertions(+), 81 deletions(-) create mode 100755 t/t2031-checkout-long-paths.sh diff --git a/Documentation/config/core.adoc b/Documentation/config/core.adoc index ebdebd094bf461..cac7438e7de505 100644 --- a/Documentation/config/core.adoc +++ b/Documentation/config/core.adoc @@ -727,6 +727,13 @@ core.fscache:: Git for Windows uses this to bulk-read and cache lstat data of entire directories (instead of doing lstat file by file). +core.longpaths:: + Enable long path (> 260) support for builtin commands in Git for + Windows. This is disabled by default, as long paths are not supported + by Windows Explorer, cmd.exe and the Git for Windows tool chain + (msys, bash, tcl, perl...). Only enable this if you know what you're + doing and are prepared to live with a few quirks. + core.unsetenvvars:: Windows-only: comma-separated list of environment variables' names that need to be unset before spawning any other process. 
diff --git a/compat/mingw.c b/compat/mingw.c index 563bf86e88baa5..213a33c5c99c48 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -280,6 +280,27 @@ static enum hide_dotfiles_type hide_dotfiles = HIDE_DOTFILES_DOTGITONLY; static char *unset_environment_variables; int core_fscache; +int are_long_paths_enabled(void) +{ + /* default to `false` during initialization */ + static const int fallback = 0; + + static int enabled = -1; + + if (enabled < 0) { + /* avoid infinite recursion */ + if (!the_repository) + return fallback; + + if (the_repository->config && + the_repository->config->hash_initialized && + repo_config_get_bool(the_repository, "core.longpaths", &enabled) < 0) + enabled = 0; + } + + return enabled < 0 ? fallback : enabled; +} + int mingw_core_config(const char *var, const char *value, const struct config_context *ctx UNUSED, void *cb UNUSED) @@ -360,7 +381,7 @@ process_phantom_symlink(const wchar_t *wtarget, const wchar_t *wlink) { HANDLE hnd; BY_HANDLE_FILE_INFORMATION fdata; - wchar_t relative[MAX_PATH]; + wchar_t relative[MAX_LONG_PATH]; const wchar_t *rel; /* @@ -575,8 +596,8 @@ int mingw_unlink(const char *pathname, int handle_in_use_error) { static int use_legacy_delete = -1; int tries = 0; - wchar_t wpathname[MAX_PATH]; - if (xutftowcs_path(wpathname, pathname) < 0) + wchar_t wpathname[MAX_LONG_PATH]; + if (xutftowcs_long_path(wpathname, pathname) < 0) return -1; if (use_legacy_delete < 0) @@ -611,7 +632,7 @@ static int is_dir_empty(const wchar_t *wpath) { WIN32_FIND_DATAW findbuf; HANDLE handle; - wchar_t wbuf[MAX_PATH + 2]; + wchar_t wbuf[MAX_LONG_PATH + 2]; wcscpy(wbuf, wpath); wcscat(wbuf, L"\\*"); handle = FindFirstFileW(wbuf, &findbuf); @@ -632,7 +653,7 @@ static int is_dir_empty(const wchar_t *wpath) int mingw_rmdir(const char *pathname) { int tries = 0; - wchar_t wpathname[MAX_PATH]; + wchar_t wpathname[MAX_LONG_PATH]; struct stat st; /* @@ -654,7 +675,7 @@ int mingw_rmdir(const char *pathname) return -1; } - if 
(xutftowcs_path(wpathname, pathname) < 0) + if (xutftowcs_long_path(wpathname, pathname) < 0) return -1; do { @@ -723,15 +744,18 @@ static int set_hidden_flag(const wchar_t *path, int set) int mingw_mkdir(const char *path, int mode UNUSED) { int ret; - wchar_t wpath[MAX_PATH]; + wchar_t wpath[MAX_LONG_PATH]; if (!is_valid_win32_path(path, 0)) { errno = EINVAL; return -1; } - if (xutftowcs_path(wpath, path) < 0) + /* CreateDirectoryW path limit is 248 (MAX_PATH - 8.3 file name) */ + if (xutftowcs_path_ex(wpath, path, MAX_LONG_PATH, -1, 248, + are_long_paths_enabled()) < 0) return -1; + ret = _wmkdir(wpath); if (!ret) process_phantom_symlinks(); @@ -897,7 +921,7 @@ int mingw_open (const char *filename, int oflags, ...) va_list args; unsigned mode; int fd, create = (oflags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL); - wchar_t wfilename[MAX_PATH]; + wchar_t wfilename[MAX_LONG_PATH]; open_fn_t open_fn; WIN32_FILE_ATTRIBUTE_DATA fdata; @@ -930,7 +954,7 @@ int mingw_open (const char *filename, int oflags, ...) if (filename && !strcmp(filename, "/dev/null")) wcscpy(wfilename, L"nul"); - else if (xutftowcs_path(wfilename, filename) < 0) + else if (xutftowcs_long_path(wfilename, filename) < 0) return -1; /* @@ -1021,14 +1045,14 @@ FILE *mingw_fopen (const char *filename, const char *otype) { int hide = needs_hiding(filename); FILE *file; - wchar_t wfilename[MAX_PATH], wotype[4]; + wchar_t wfilename[MAX_LONG_PATH], wotype[4]; if (filename && !strcmp(filename, "/dev/null")) wcscpy(wfilename, L"nul"); else if (!is_valid_win32_path(filename, 1)) { int create = otype && strchr(otype, 'w'); errno = create ? 
EINVAL : ENOENT; return NULL; - } else if (xutftowcs_path(wfilename, filename) < 0) + } else if (xutftowcs_long_path(wfilename, filename) < 0) return NULL; if (xutftowcs(wotype, otype, ARRAY_SIZE(wotype)) < 0) @@ -1050,14 +1074,14 @@ FILE *mingw_freopen (const char *filename, const char *otype, FILE *stream) { int hide = needs_hiding(filename); FILE *file; - wchar_t wfilename[MAX_PATH], wotype[4]; + wchar_t wfilename[MAX_LONG_PATH], wotype[4]; if (filename && !strcmp(filename, "/dev/null")) wcscpy(wfilename, L"nul"); else if (!is_valid_win32_path(filename, 1)) { int create = otype && strchr(otype, 'w'); errno = create ? EINVAL : ENOENT; return NULL; - } else if (xutftowcs_path(wfilename, filename) < 0) + } else if (xutftowcs_long_path(wfilename, filename) < 0) return NULL; if (xutftowcs(wotype, otype, ARRAY_SIZE(wotype)) < 0) @@ -1107,7 +1131,7 @@ ssize_t mingw_write(int fd, const void *buf, size_t len) HANDLE h = (HANDLE) _get_osfhandle(fd); if (GetFileType(h) != FILE_TYPE_PIPE) { if (orig == EINVAL) { - wchar_t path[MAX_PATH]; + wchar_t path[MAX_LONG_PATH]; DWORD ret = GetFinalPathNameByHandleW(h, path, ARRAY_SIZE(path), 0); UINT drive_type = ret > 0 && ret < ARRAY_SIZE(path) ? 
@@ -1144,20 +1168,23 @@ ssize_t mingw_write(int fd, const void *buf, size_t len) int mingw_access(const char *filename, int mode) { - wchar_t wfilename[MAX_PATH]; + wchar_t wfilename[MAX_LONG_PATH]; if (!strcmp("nul", filename) || !strcmp("/dev/null", filename)) return 0; - if (xutftowcs_path(wfilename, filename) < 0) + if (xutftowcs_long_path(wfilename, filename) < 0) return -1; /* X_OK is not supported by the MSVCRT version */ return _waccess(wfilename, mode & ~X_OK); } +/* cached length of current directory for handle_long_path */ +static int current_directory_len = 0; + int mingw_chdir(const char *dirname) { - wchar_t wdirname[MAX_PATH]; - - if (xutftowcs_path(wdirname, dirname) < 0) + int result; + wchar_t wdirname[MAX_LONG_PATH]; + if (xutftowcs_long_path(wdirname, dirname) < 0) return -1; if (has_symlinks) { @@ -1176,13 +1203,15 @@ int mingw_chdir(const char *dirname) CloseHandle(hnd); } - return _wchdir(normalize_ntpath(wdirname)); + result = _wchdir(normalize_ntpath(wdirname)); + current_directory_len = GetCurrentDirectoryW(0, NULL); + return result; } int mingw_chmod(const char *filename, int mode) { - wchar_t wfilename[MAX_PATH]; - if (xutftowcs_path(wfilename, filename) < 0) + wchar_t wfilename[MAX_LONG_PATH]; + if (xutftowcs_long_path(wfilename, filename) < 0) return -1; return _wchmod(wfilename, mode); } @@ -1316,8 +1345,8 @@ int mingw_lstat(const char *file_name, struct stat *buf) WIN32_FILE_ATTRIBUTE_DATA fdata; DWORD reparse_tag = 0; int link_len = 0; - wchar_t wfilename[MAX_PATH]; - int wlen = xutftowcs_path(wfilename, file_name); + wchar_t wfilename[MAX_LONG_PATH]; + int wlen = xutftowcs_long_path(wfilename, file_name); if (wlen < 0) return -1; @@ -1332,7 +1361,7 @@ int mingw_lstat(const char *file_name, struct stat *buf) if (GetFileAttributesExW(wfilename, GetFileExInfoStandard, &fdata)) { /* for reparse points, get the link tag and length */ if (fdata.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) { - char tmpbuf[MAX_PATH]; + char 
tmpbuf[MAX_LONG_PATH]; if (read_reparse_point(wfilename, FALSE, tmpbuf, &link_len, &reparse_tag) < 0) @@ -1413,12 +1442,12 @@ static int get_file_info_by_handle(HANDLE hnd, struct stat *buf) int mingw_stat(const char *file_name, struct stat *buf) { - wchar_t wfile_name[MAX_PATH]; + wchar_t wfile_name[MAX_LONG_PATH]; HANDLE hnd; int result; /* open the file and let Windows resolve the links */ - if (xutftowcs_path(wfile_name, file_name) < 0) + if (xutftowcs_long_path(wfile_name, file_name) < 0) return -1; hnd = CreateFileW(wfile_name, 0, FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL, @@ -1486,10 +1515,10 @@ int mingw_utime (const char *file_name, const struct utimbuf *times) FILETIME mft, aft; int rc; DWORD attrs; - wchar_t wfilename[MAX_PATH]; + wchar_t wfilename[MAX_LONG_PATH]; HANDLE osfilehandle; - if (xutftowcs_path(wfilename, file_name) < 0) + if (xutftowcs_long_path(wfilename, file_name) < 0) return -1; /* must have write permission */ @@ -2197,6 +2226,10 @@ static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaen if (*argv && !strcmp(cmd, *argv)) wcmd[0] = L'\0'; + /* + * Paths to executables and to the current directory do not support + * long paths, therefore we cannot use xutftowcs_long_path() here. 
+ */ else if (xutftowcs_path(wcmd, cmd) < 0) return -1; if (dir && xutftowcs_path(wdir, dir) < 0) @@ -2908,12 +2941,12 @@ int mingw_rename(const char *pold, const char *pnew) static int supports_file_rename_info_ex = 1; DWORD attrs = INVALID_FILE_ATTRIBUTES, gle, attrsold; int tries = 0; - wchar_t wpold[MAX_PATH], wpnew[MAX_PATH]; + wchar_t wpold[MAX_LONG_PATH], wpnew[MAX_LONG_PATH]; int wpnew_len; - if (xutftowcs_path(wpold, pold) < 0) + if (xutftowcs_long_path(wpold, pold) < 0) return -1; - wpnew_len = xutftowcs_path(wpnew, pnew); + wpnew_len = xutftowcs_long_path(wpnew, pnew); if (wpnew_len < 0) return -1; @@ -2943,9 +2976,9 @@ int mingw_rename(const char *pold, const char *pnew) * flex array so that the structure has to be allocated on * the heap. As we declare this structure ourselves though * we can avoid the allocation and define FileName to have - * MAX_PATH bytes. + * MAX_LONG_PATH bytes. */ - WCHAR FileName[MAX_PATH]; + WCHAR FileName[MAX_LONG_PATH]; } rename_info = { 0 }; HANDLE old_handle = INVALID_HANDLE_VALUE; BOOL success; @@ -3319,9 +3352,9 @@ int mingw_raise(int sig) int link(const char *oldpath, const char *newpath) { - wchar_t woldpath[MAX_PATH], wnewpath[MAX_PATH]; - if (xutftowcs_path(woldpath, oldpath) < 0 || - xutftowcs_path(wnewpath, newpath) < 0) + wchar_t woldpath[MAX_LONG_PATH], wnewpath[MAX_LONG_PATH]; + if (xutftowcs_long_path(woldpath, oldpath) < 0 || + xutftowcs_long_path(wnewpath, newpath) < 0) return -1; if (!CreateHardLinkW(wnewpath, woldpath, NULL)) { @@ -3364,7 +3397,7 @@ static enum symlink_type check_symlink_attr(struct index_state *index, const cha int mingw_create_symlink(struct index_state *index, const char *target, const char *link) { - wchar_t wtarget[MAX_PATH], wlink[MAX_PATH]; + wchar_t wtarget[MAX_LONG_PATH], wlink[MAX_LONG_PATH]; int len; /* fail if symlinks are disabled or API is not supported (WinXP) */ @@ -3373,8 +3406,8 @@ int mingw_create_symlink(struct index_state *index, const char *target, const ch return -1; 
} - if ((len = xutftowcs_path(wtarget, target)) < 0 - || xutftowcs_path(wlink, link) < 0) + if ((len = xutftowcs_long_path(wtarget, target)) < 0 + || xutftowcs_long_path(wlink, link) < 0) return -1; /* convert target dir separators to backslashes */ @@ -3411,12 +3444,12 @@ int mingw_create_symlink(struct index_state *index, const char *target, const ch int readlink(const char *path, char *buf, size_t bufsiz) { - WCHAR wpath[MAX_PATH]; - char tmpbuf[MAX_PATH]; + WCHAR wpath[MAX_LONG_PATH]; + char tmpbuf[MAX_LONG_PATH]; int len; DWORD tag; - if (xutftowcs_path(wpath, path) < 0) + if (xutftowcs_long_path(wpath, path) < 0) return -1; if (read_reparse_point(wpath, TRUE, tmpbuf, &len, &tag) < 0) @@ -3491,8 +3524,8 @@ int mingw_is_mount_point(struct strbuf *path) { WIN32_FIND_DATAW findbuf = { 0 }; HANDLE handle; - wchar_t wfilename[MAX_PATH]; - int wlen = xutftowcs_path(wfilename, path->buf); + wchar_t wfilename[MAX_LONG_PATH]; + int wlen = xutftowcs_long_path(wfilename, path->buf); if (wlen < 0) die(_("could not get long path for '%s'"), path->buf); @@ -3635,9 +3668,9 @@ static size_t append_system_bin_dirs(char *path, size_t size) static int is_system32_path(const char *path) { - WCHAR system32[MAX_PATH], wpath[MAX_PATH]; + WCHAR system32[MAX_LONG_PATH], wpath[MAX_LONG_PATH]; - if (xutftowcs_path(wpath, path) < 0 || + if (xutftowcs_long_path(wpath, path) < 0 || !GetSystemDirectoryW(system32, ARRAY_SIZE(system32)) || _wcsicmp(system32, wpath)) return 0; @@ -4073,6 +4106,68 @@ int is_valid_win32_path(const char *path, int allow_literal_nul) } } +int handle_long_path(wchar_t *path, int len, int max_path, int expand) +{ + int result; + wchar_t buf[MAX_LONG_PATH]; + + /* + * we don't need special handling if path is relative to the current + * directory, and current directory + path don't exceed the desired + * max_path limit. This should cover > 99 % of cases with minimal + * performance impact (git almost always uses relative paths). 
+ */ + if ((len < 2 || (!is_dir_sep(path[0]) && path[1] != ':')) && + (current_directory_len + len < max_path)) + return len; + + /* + * handle everything else: + * - absolute paths: "C:\dir\file" + * - absolute UNC paths: "\\server\share\dir\file" + * - absolute paths on current drive: "\dir\file" + * - relative paths on other drive: "X:file" + * - prefixed paths: "\\?\...", "\\.\..." + */ + + /* convert to absolute path using GetFullPathNameW */ + result = GetFullPathNameW(path, MAX_LONG_PATH, buf, NULL); + if (!result) { + errno = err_win_to_posix(GetLastError()); + return -1; + } + + /* + * return absolute path if it fits within max_path (even if + * "cwd + path" doesn't due to '..' components) + */ + if (result < max_path) { + wcscpy(path, buf); + return result; + } + + /* error out if we shouldn't expand the path or buf is too small */ + if (!expand || result >= MAX_LONG_PATH - 6) { + errno = ENAMETOOLONG; + return -1; + } + + /* prefix full path with "\\?\" or "\\?\UNC\" */ + if (buf[0] == '\\') { + /* ...unless already prefixed */ + if (buf[1] == '\\' && (buf[2] == '?' || buf[2] == '.')) + return len; + + wcscpy(path, L"\\\\?\\UNC\\"); + wcscpy(path + 8, buf + 2); + return result + 6; + } else { + wcscpy(path, L"\\\\?\\"); + wcscpy(path + 4, buf); + return result + 4; + } +} + #if !defined(_MSC_VER) /* * Disable MSVCRT command line wildcard expansion (__getmainargs called from @@ -4264,6 +4359,9 @@ int wmain(int argc, const wchar_t **wargv) /* initialize Unicode console */ winansi_init(); + /* init length of current directory for handle_long_path */ + current_directory_len = GetCurrentDirectoryW(0, NULL); + /* invoke the real main() using our utf8 version of argv. 
*/ exit_status = main(argc, argv); diff --git a/compat/mingw.h b/compat/mingw.h index 62b15f12d1cf06..807ee7b7e2e573 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -1,6 +1,7 @@ #include "mingw-posix.h" extern int core_fscache; +int are_long_paths_enabled(void); struct config_context; int mingw_core_config(const char *var, const char *value, @@ -78,6 +79,42 @@ int is_path_owned_by_current_sid(const char *path, struct strbuf *report); int is_valid_win32_path(const char *path, int allow_literal_nul); #define is_valid_path(path) is_valid_win32_path(path, 0) +/** + * Max length of long paths (exceeding MAX_PATH). The actual maximum supported + * by NTFS is 32,767 (* sizeof(wchar_t)), but we choose an arbitrary smaller + * value to limit required stack memory. + */ +#define MAX_LONG_PATH 4096 + +/** + * Handles paths that would exceed the MAX_PATH limit of Windows Unicode APIs. + * + * With expand == false, the function checks for over-long paths and fails + * with ENAMETOOLONG. The path parameter is not modified, except if cwd + path + * exceeds max_path, but the resulting absolute path doesn't (e.g. due to + * eliminating '..' components). The path parameter must point to a buffer + * of max_path wide characters. + * + * With expand == true, an over-long path is automatically converted in place + * to an absolute path prefixed with '\\?\', and the new length is returned. + * The path parameter must point to a buffer of MAX_LONG_PATH wide characters. 
+ * + * Parameters: + * path: path to check and / or convert + * len: size of path on input (number of wide chars without \0) + * max_path: max short path length to check (usually MAX_PATH = 260, but just + * 248 for CreateDirectoryW) + * expand: false to only check the length, true to expand the path to a + * '\\?\'-prefixed absolute path + * + * Return: + * length of the resulting path, or -1 on failure + * + * Errors: + * ENAMETOOLONG if path is too long + */ +int handle_long_path(wchar_t *path, int len, int max_path, int expand); + /** * Converts UTF-8 encoded string to UTF-16LE. * @@ -136,18 +173,46 @@ static inline int xutftowcs(wchar_t *wcs, const char *utf, size_t wcslen) } /** - * Simplified file system specific variant of xutftowcsn, assumes output - * buffer size is MAX_PATH wide chars and input string is \0-terminated, - * fails with ENAMETOOLONG if input string is too long. + * Simplified file system specific wrapper of xutftowcsn and handle_long_path. + * Converts ERANGE to ENAMETOOLONG. If expand is true, wcs must be at least + * MAX_LONG_PATH wide chars (see handle_long_path). */ -static inline int xutftowcs_path(wchar_t *wcs, const char *utf) +static inline int xutftowcs_path_ex(wchar_t *wcs, const char *utf, + size_t wcslen, int utflen, int max_path, int expand) { - int result = xutftowcsn(wcs, utf, MAX_PATH, -1); + int result = xutftowcsn(wcs, utf, wcslen, utflen); if (result < 0 && errno == ERANGE) errno = ENAMETOOLONG; + if (result >= 0) + result = handle_long_path(wcs, result, max_path, expand); return result; } +/** + * Simplified file system specific variant of xutftowcsn, assumes output + * buffer size is MAX_PATH wide chars and input string is \0-terminated, + * fails with ENAMETOOLONG if input string is too long. Typically used for + * Windows APIs that don't support long paths, e.g. SetCurrentDirectory, + * LoadLibrary, CreateProcess... 
+ */ +static inline int xutftowcs_path(wchar_t *wcs, const char *utf) +{ + return xutftowcs_path_ex(wcs, utf, MAX_PATH, -1, MAX_PATH, 0); +} + +/** + * Simplified file system specific variant of xutftowcsn for Windows APIs + * that support long paths via '\\?\'-prefix, assumes output buffer size is + * MAX_LONG_PATH wide chars, fails with ENAMETOOLONG if input string is too + * long. The 'core.longpaths' git-config option controls whether the path + * is only checked or expanded to a long path. + */ +static inline int xutftowcs_long_path(wchar_t *wcs, const char *utf) +{ + return xutftowcs_path_ex(wcs, utf, MAX_LONG_PATH, -1, MAX_PATH, + are_long_paths_enabled()); +} + /** * Converts UTF-16LE encoded string to UTF-8. * diff --git a/compat/win32/dirent.c b/compat/win32/dirent.c index f17e1595468f44..87063101f57202 100644 --- a/compat/win32/dirent.c +++ b/compat/win32/dirent.c @@ -68,19 +68,24 @@ static int dirent_closedir(dirent_DIR *dir) DIR *dirent_opendir(const char *name) { - wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */ + wchar_t pattern[MAX_LONG_PATH + 2]; /* + 2 for "\*" */ WIN32_FIND_DATAW fdata; HANDLE h; int len; dirent_DIR *dir; - /* convert name to UTF-16 and check length < MAX_PATH */ - if ((len = xutftowcs_path(pattern, name)) < 0) + /* convert name to UTF-16 and check length */ + if ((len = xutftowcs_path_ex(pattern, name, MAX_LONG_PATH, -1, + MAX_PATH - 2, + are_long_paths_enabled())) < 0) return NULL; - /* append optional '/' and wildcard '*' */ + /* + * append optional '\' and wildcard '*'. Note: we need to use '\' as + * Windows doesn't translate '/' to '\' for "\\?\"-prefixed paths. 
+ */ if (len && !is_dir_sep(pattern[len - 1])) - pattern[len++] = '/'; + pattern[len++] = '\\'; pattern[len++] = '*'; pattern[len] = 0; @@ -93,7 +98,7 @@ DIR *dirent_opendir(const char *name) } /* initialize DIR structure and copy first dir entry */ - dir = xmalloc(sizeof(dirent_DIR) + MAX_PATH); + dir = xmalloc(sizeof(dirent_DIR) + MAX_LONG_PATH); dir->base_dir.preaddir = (struct dirent *(*)(DIR *dir)) dirent_readdir; dir->base_dir.pclosedir = (int (*)(DIR *dir)) dirent_closedir; dir->dd_handle = h; diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 26ae9ab1c1a464..cbd90ececf6b37 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -85,7 +85,7 @@ struct fsentry { struct heap_fsentry { union { struct fsentry ent; - char dummy[sizeof(struct fsentry) + MAX_PATH]; + char dummy[sizeof(struct fsentry) + MAX_LONG_PATH]; } u; }; #pragma GCC diagnostic pop @@ -129,7 +129,7 @@ static void fsentry_init(struct fsentry *fse, struct fsentry *list, const char *name, size_t len) { fse->list = list; - if (len > MAX_PATH) + if (len > MAX_LONG_PATH) BUG("Trying to allocate fsentry for long path '%.*s'", (int)len, name); memcpy(fse->dirent.d_name, name, len); @@ -234,7 +234,7 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache, fdata->EaSize, buf); fse->dirent.d_type = S_ISREG(fse->st_mode) ? DT_REG : S_ISDIR(fse->st_mode) ? DT_DIR : DT_LNK; - fse->u.s.st_size = S_ISLNK(fse->st_mode) ? MAX_PATH : + fse->u.s.st_size = S_ISLNK(fse->st_mode) ? 
MAX_LONG_PATH : fdata->EndOfFile.LowPart | (((off_t)fdata->EndOfFile.HighPart) << 32); filetime_to_timespec((FILETIME *)&(fdata->LastAccessTime), @@ -270,7 +270,7 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache, static struct fsentry *fsentry_create_list(struct fscache *cache, const struct fsentry *dir, int *dir_not_found) { - wchar_t pattern[MAX_PATH]; + wchar_t pattern[MAX_LONG_PATH]; NTSTATUS status; IO_STATUS_BLOCK iosb; PFILE_FULL_DIR_INFORMATION di; @@ -281,13 +281,11 @@ static struct fsentry *fsentry_create_list(struct fscache *cache, const struct f *dir_not_found = 0; - /* convert name to UTF-16 and check length < MAX_PATH */ - if ((wlen = xutftowcsn(pattern, dir->dirent.d_name, MAX_PATH, - dir->len)) < 0) { - if (errno == ERANGE) - errno = ENAMETOOLONG; + /* convert name to UTF-16 and check length */ + if ((wlen = xutftowcs_path_ex(pattern, dir->dirent.d_name, + MAX_LONG_PATH, dir->len, MAX_PATH - 2, + are_long_paths_enabled())) < 0) return NULL; - } /* handle CWD */ if (!wlen) { @@ -638,8 +636,8 @@ int fscache_lstat(const char *filename, struct stat *st) * Special case symbolic links: FindFirstFile()/FindNextFile() did not * provide us with the length of the target path. 
*/ - if (fse->u.s.st_size == MAX_PATH && S_ISLNK(fse->st_mode)) { - char buf[MAX_PATH]; + if (fse->u.s.st_size == MAX_LONG_PATH && S_ISLNK(fse->st_mode)) { + char buf[MAX_LONG_PATH]; int len = readlink(filename, buf, sizeof(buf) - 1); if (len > 0) diff --git a/t/meson.build b/t/meson.build index 4a6a8704b615e2..7d53e5ba6574e9 100644 --- a/t/meson.build +++ b/t/meson.build @@ -275,6 +275,7 @@ integration_tests = [ 't2026-checkout-pathspec-file.sh', 't2027-checkout-track.sh', 't2030-unresolve-info.sh', + 't2031-checkout-long-paths.sh', 't2040-checkout-symlink-attr.sh', 't2050-git-dir-relative.sh', 't2060-switch.sh', diff --git a/t/t2031-checkout-long-paths.sh b/t/t2031-checkout-long-paths.sh new file mode 100755 index 00000000000000..f30f8920ca689c --- /dev/null +++ b/t/t2031-checkout-long-paths.sh @@ -0,0 +1,102 @@ +#!/bin/sh + +test_description='checkout long paths on Windows + +Ensures that Git for Windows can deal with long paths (>260) enabled via core.longpaths' + +. ./test-lib.sh + +if test_have_prereq !MINGW +then + skip_all='skipping MINGW specific long paths test' + test_done +fi + +test_expect_success setup ' + p=longpathxx && # -> 10 + p=$p$p$p$p$p && # -> 50 + p=$p$p$p$p$p && # -> 250 + + path=${p}/longtestfile && # -> 263 (MAX_PATH = 260) + + blob=$(echo foobar | git hash-object -w --stdin) && + + printf "100644 %s 0\t%s\n" "$blob" "$path" | + git update-index --add --index-info && + git commit -m initial -q +' + +test_expect_success 'checkout of long paths without core.longpaths fails' ' + git config core.longpaths false && + test_must_fail git checkout -f 2>error && + grep -q "Filename too long" error && + test ! 
-d longpa* +' + +test_expect_success 'checkout of long paths with core.longpaths works' ' + git config core.longpaths true && + git checkout -f && + test_path_is_file longpa*/longtestfile +' + +test_expect_success 'update of long paths' ' + echo frotz >>$(ls longpa*/longtestfile) && + echo $path > expect && + git ls-files -m > actual && + test_cmp expect actual && + git add $path && + git commit -m second && + git grep "frotz" HEAD -- $path +' + +test_expect_success cleanup ' + # bash cannot delete the trash dir if it contains a long path + # lets help cleaning up (unless in debug mode) + if test -z "$debug" + then + rm -rf longpa~1 + fi +' + +# check that the template used in the test won't be too long: +abspath="$(pwd)"/testdir +test ${#abspath} -gt 230 || +test_set_prereq SHORTABSPATH + +test_expect_success SHORTABSPATH 'clean up path close to MAX_PATH' ' + p=/123456789abcdef/123456789abcdef/123456789abcdef/123456789abc/ef && + p=y$p$p$p$p && + subdir="x$(echo "$p" | tail -c $((253 - ${#abspath})) - )" && + # Now, $abspath/$subdir has exactly 254 characters, and is inside CWD + p2="$abspath/$subdir" && + test 254 = ${#p2} && + + # Be careful to overcome path limitations of the MSys tools and split + # the $subdir into two parts. ($subdir2 has to contain 16 chars and a + # slash somewhere following; that is why we asked for abspath <= 230 and + # why we placed a slash near the end of the $subdir template.) + subdir2=${subdir#????????????????*/} && + subdir1=testdir/${subdir%/$subdir2} && + mkdir -p "$subdir1" && + i=0 && + # The most important case is when absolute path is 258 characters long, + # and that will be when i == 4. + while test $i -le 7 + do + mkdir -p $subdir2 && + touch $subdir2/one-file && + mv ${subdir2%%/*} "$subdir1/" && + subdir2=z${subdir2} && + i=$(($i+1)) || + exit 1 + done && + + # now check that git is able to clear the tree: + (cd testdir && + git init && + git config core.longpaths yes && + git clean -fdx) && + test ! 
-d "$subdir1" +' + +test_done diff --git a/t/t7429-submodule-long-path.sh b/t/t7429-submodule-long-path.sh index f692cedbff7ff8..458519eafd6f03 100755 --- a/t/t7429-submodule-long-path.sh +++ b/t/t7429-submodule-long-path.sh @@ -11,15 +11,20 @@ This test verifies that "git submodule" initialization, update and clones work, TEST_NO_CREATE_REPO=1 . ./test-lib.sh -longpath="" -for (( i=0; i<4; i++ )); do - longpath="0123456789abcdefghijklmnopqrstuvwxyz$longpath" -done -# Pick a substring maximum of 90 characters -# This should be good, since we'll add on a lot for temp directories -longpath=${longpath:0:90}; export longpath +# cloning a submodule calls is_git_directory("$path/../.git/modules/$path"), +# which effectively limits the maximum length to PATH_MAX / 2 minus some +# overhead; start with 3 * 36 = 108 chars (test 2 fails if >= 110) +longpath36=0123456789abcdefghijklmnopqrstuvwxyz +longpath180=$longpath36$longpath36$longpath36$longpath36$longpath36 -test_expect_failure 'submodule with a long path' ' +# the git database must fit within PATH_MAX, which limits the submodule name +# to PATH_MAX - len(pwd) - ~90 (= len("/objects//") + 40-byte sha1 + some +# overhead from the test case) +pwd=$(pwd) +pwdlen=$(echo "$pwd" | wc -c) +longpath=$(echo $longpath180 | cut -c 1-$((170-$pwdlen))) + +test_expect_success 'submodule with a long path' ' git config --global protocol.file.allow always && GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \ git -c init.defaultBranch=long init --bare remote && @@ -59,7 +64,7 @@ test_expect_failure 'submodule with a long path' ' ) ' -test_expect_failure 'recursive submodule with a long path' ' +test_expect_success 'recursive submodule with a long path' ' GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \ git -c init.defaultBranch=long init --bare super && test_create_repo child && @@ -101,6 +106,5 @@ test_expect_failure 'recursive submodule with a long path' ' ) ) ' -unset longpath test_done From 82551d45051fa2c28bfaf6c3977eb49085dfadb7 Mon Sep 17 00:00:00 
2001 From: Johannes Schindelin Date: Fri, 21 Jul 2017 12:48:33 +0200 Subject: [PATCH 199/218] t5532: workaround for BusyBox on Windows While it may seem super convenient to some old Unix hands to simply require Perl to be available when running the test suite, this is a major hassle on Windows, where we want to verify that Perl is not, actually, required in a NO_PERL build. As a super ugly workaround, we "install" a script into /usr/bin/perl reading like this: #!/bin/sh # We'd much rather avoid requiring Perl altogether when testing # an installed Git. Oh well, that's why we cannot have nice # things. exec c:/git-sdk-64/usr/bin/perl.exe "$@" The problem with that is that BusyBox assumes that the #! line in a script refers to an executable, not to a script. So when it encounters the line #!/usr/bin/perl in t5532's proxy-get-cmd, it barfs. Let's help this situation by simply executing the Perl script with the "interpreter" specified explicitly. Signed-off-by: Johannes Schindelin --- t/t5532-fetch-proxy.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t5532-fetch-proxy.sh b/t/t5532-fetch-proxy.sh index 95d0f33b29531c..86fe5d8f752147 100755 --- a/t/t5532-fetch-proxy.sh +++ b/t/t5532-fetch-proxy.sh @@ -32,7 +32,7 @@ test_expect_success 'setup proxy script' ' write_script proxy <<-\EOF echo >&2 "proxying for $*" - cmd=$(./proxy-get-cmd) + cmd=$("$PERL_PATH" ./proxy-get-cmd) echo >&2 "Running $cmd" exec $cmd EOF From 64c6fcea0685fcb4c5c0c11fdff6a0c4ddee79ab Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 6 Sep 2023 09:14:47 +0200 Subject: [PATCH 200/218] win32(long path support): leave drive-less absolute paths intact When trying to ensure that long paths are handled correctly, we first normalize absolute paths as we encounter them. However, if the path is a so-called "drive-less" absolute path, i.e.
if it is relative to the current drive but _does_ start with a directory separator, we would want the normalized path to be such a drive-less absolute path, too. Let's do that, being careful to still include the drive prefix when we need to go through the `\\?\` dance (because there, the drive prefix is absolutely required). This fixes https://github.com/git-for-windows/git/issues/4586. Signed-off-by: Johannes Schindelin --- compat/mingw.c | 7 ++++++- t/t2031-checkout-long-paths.sh | 9 +++++++++ 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index 213a33c5c99c48..76481ae864203c 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -4142,7 +4142,12 @@ int handle_long_path(wchar_t *path, int len, int max_path, int expand) * "cwd + path" doesn't due to '..' components) */ if (result < max_path) { - wcscpy(path, buf); + /* Be careful not to add a drive prefix if there was none */ + if (is_wdir_sep(path[0]) && + !is_wdir_sep(buf[0]) && buf[1] == L':' && is_wdir_sep(buf[2])) + wcscpy(path, buf + 2); + else + wcscpy(path, buf); return result; } diff --git a/t/t2031-checkout-long-paths.sh b/t/t2031-checkout-long-paths.sh index f30f8920ca689c..15416a1d6ee8c7 100755 --- a/t/t2031-checkout-long-paths.sh +++ b/t/t2031-checkout-long-paths.sh @@ -99,4 +99,13 @@ test_expect_success SHORTABSPATH 'clean up path close to MAX_PATH' ' test ! -d "$subdir1" ' +test_expect_success SYMLINKS_WINDOWS 'leave drive-less, short paths intact' ' + printf "/Program Files" >symlink-target && + symlink_target_oid="$(git hash-object -w --stdin actual && + grep " *PF *\\[\\\\Program Files\\]" actual +' + test_done From 9a20b670757e87b1ecfe3f8c08b9cf571b56a6db Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 21 Jul 2017 13:24:55 +0200 Subject: [PATCH 201/218] t5605: special-case hardlink test for BusyBox-w32 When t5605 tries to verify that files are hardlinked (or that they are not), it uses the `-links` option of the `find` utility. 
BusyBox' implementation does not support that option, and BusyBox-w32's lstat() does not even report the number of hard links correctly (for performance reasons). So let's just switch to a different method that actually works on Windows. Signed-off-by: Johannes Schindelin --- t/t5605-clone-local.sh | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/t/t5605-clone-local.sh b/t/t5605-clone-local.sh index 2397f8fa618054..a7444acc5f89e4 100755 --- a/t/t5605-clone-local.sh +++ b/t/t5605-clone-local.sh @@ -11,6 +11,21 @@ repo_is_hardlinked() { test_line_count = 0 output } +if test_have_prereq MINGW,BUSYBOX +then + # BusyBox' `find` does not support `-links`. Besides, BusyBox-w32's + # lstat() does not report hard links, just like Git's mingw_lstat() + # (from where BusyBox-w32 got its initial implementation). + repo_is_hardlinked() { + for f in $(find "$1/objects" -type f) + do + "$SYSTEMROOT"/system32/fsutil.exe \ + hardlink list $f >links && + test_line_count -gt 1 links || return 1 + done + } +fi + test_expect_success 'preparing origin repository' ' : >file && git add . && git commit -m1 && git clone --bare . a.git && From 5066fcd33b28cd9db2fc9a0dc66201682c590a8f Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Fri, 25 Mar 2022 16:56:04 -0400 Subject: [PATCH 202/218] compat/fsmonitor/fsm-*-win32: support long paths Update wchar_t buffers to use MAX_LONG_PATH instead of MAX_PATH and call xutftowcs_long_path() in the Win32 backend source files. 
Signed-off-by: Jeff Hostetler --- compat/fsmonitor/fsm-health-win32.c | 6 +++--- compat/fsmonitor/fsm-listen-win32.c | 18 +++++++++--------- compat/fsmonitor/fsm-path-utils-win32.c | 8 ++++---- 3 files changed, 16 insertions(+), 16 deletions(-) diff --git a/compat/fsmonitor/fsm-health-win32.c b/compat/fsmonitor/fsm-health-win32.c index 2aa8c219acee4d..4b53360d194105 100644 --- a/compat/fsmonitor/fsm-health-win32.c +++ b/compat/fsmonitor/fsm-health-win32.c @@ -34,7 +34,7 @@ struct fsm_health_data struct wt_moved { - wchar_t wpath[MAX_PATH + 1]; + wchar_t wpath[MAX_LONG_PATH + 1]; BY_HANDLE_FILE_INFORMATION bhfi; } wt_moved; }; @@ -143,8 +143,8 @@ static int has_worktree_moved(struct fsmonitor_daemon_state *state, return 0; case CTX_INIT: - if (xutftowcs_path(data->wt_moved.wpath, - state->path_worktree_watch.buf) < 0) { + if (xutftowcs_long_path(data->wt_moved.wpath, + state->path_worktree_watch.buf) < 0) { error(_("could not convert to wide characters: '%s'"), state->path_worktree_watch.buf); return -1; diff --git a/compat/fsmonitor/fsm-listen-win32.c b/compat/fsmonitor/fsm-listen-win32.c index 9a6efc9bea340b..afcc172750af10 100644 --- a/compat/fsmonitor/fsm-listen-win32.c +++ b/compat/fsmonitor/fsm-listen-win32.c @@ -28,7 +28,7 @@ struct one_watch DWORD count; struct strbuf path; - wchar_t wpath_longname[MAX_PATH + 1]; + wchar_t wpath_longname[MAX_LONG_PATH + 1]; DWORD wpath_longname_len; HANDLE hDir; @@ -131,8 +131,8 @@ static int normalize_path_in_utf8(wchar_t *wpath, DWORD wpath_len, */ static void check_for_shortnames(struct one_watch *watch) { - wchar_t buf_in[MAX_PATH + 1]; - wchar_t buf_out[MAX_PATH + 1]; + wchar_t buf_in[MAX_LONG_PATH + 1]; + wchar_t buf_out[MAX_LONG_PATH + 1]; wchar_t *last; wchar_t *p; @@ -197,8 +197,8 @@ static enum get_relative_result get_relative_longname( const wchar_t *wpath, DWORD wpath_len, wchar_t *wpath_longname, size_t bufsize_wpath_longname) { - wchar_t buf_in[2 * MAX_PATH + 1]; - wchar_t buf_out[MAX_PATH + 1]; + wchar_t 
buf_in[2 * MAX_LONG_PATH + 1]; + wchar_t buf_out[MAX_LONG_PATH + 1]; DWORD root_len; DWORD out_len; @@ -298,10 +298,10 @@ static struct one_watch *create_watch(const char *path) FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE; HANDLE hDir; DWORD len_longname; - wchar_t wpath[MAX_PATH + 1]; - wchar_t wpath_longname[MAX_PATH + 1]; + wchar_t wpath[MAX_LONG_PATH + 1]; + wchar_t wpath_longname[MAX_LONG_PATH + 1]; - if (xutftowcs_path(wpath, path) < 0) { + if (xutftowcs_long_path(wpath, path) < 0) { error(_("could not convert to wide characters: '%s'"), path); return NULL; } @@ -545,7 +545,7 @@ static int process_worktree_events(struct fsmonitor_daemon_state *state) struct string_list cookie_list = STRING_LIST_INIT_DUP; struct fsmonitor_batch *batch = NULL; const char *p = watch->buffer; - wchar_t wpath_longname[MAX_PATH + 1]; + wchar_t wpath_longname[MAX_LONG_PATH + 1]; /* * If the kernel gets more events than will fit in the kernel diff --git a/compat/fsmonitor/fsm-path-utils-win32.c b/compat/fsmonitor/fsm-path-utils-win32.c index f4f9cc1f336720..c6eb065bde48b4 100644 --- a/compat/fsmonitor/fsm-path-utils-win32.c +++ b/compat/fsmonitor/fsm-path-utils-win32.c @@ -69,8 +69,8 @@ static int check_remote_protocol(wchar_t *wpath) */ int fsmonitor__get_fs_info(const char *path, struct fs_info *fs_info) { - wchar_t wpath[MAX_PATH]; - wchar_t wfullpath[MAX_PATH]; + wchar_t wpath[MAX_LONG_PATH]; + wchar_t wfullpath[MAX_LONG_PATH]; size_t wlen; UINT driveType; @@ -78,7 +78,7 @@ int fsmonitor__get_fs_info(const char *path, struct fs_info *fs_info) * Do everything in wide chars because the drive letter might be * a multi-byte sequence. See win32_has_dos_drive_prefix(). */ - if (xutftowcs_path(wpath, path) < 0) { + if (xutftowcs_long_path(wpath, path) < 0) { return -1; } @@ -97,7 +97,7 @@ int fsmonitor__get_fs_info(const char *path, struct fs_info *fs_info) * slashes to backslashes. This is essential to get GetDriveTypeW() * correctly handle some UNC "\\server\share\..." 
paths. */ - if (!GetFullPathNameW(wpath, MAX_PATH, wfullpath, NULL)) { + if (!GetFullPathNameW(wpath, MAX_LONG_PATH, wfullpath, NULL)) { return -1; } From 7e0fcca4b61b2680b73e43930f734597237ca878 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 29 Sep 2020 13:50:59 +0200 Subject: [PATCH 203/218] Add a GitHub workflow to monitor component updates MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Rather than using private IFTTT Applets that send mails to this maintainer whenever a new version of a Git for Windows component was released, let's use the power of GitHub workflows to make this process publicly visible. This workflow monitors the Atom/RSS feeds, and opens a ticket whenever a new version was released. Note: Bash sometimes releases multiple patched versions within a few minutes of each other (i.e. 5.1p1 through 5.1p4, 5.0p15 and 5.0p16). The MSYS2 runtime also has a similar system. We can address those patches as a group, so we shouldn't get multiple issues about them. Note further: We're not acting on newlib releases, OpenSSL alphas, Perl release candidates or non-stable Perl releases. There's no need to open issues about them. Co-authored-by: Matthias Aßhauer Signed-off-by: Johannes Schindelin --- .github/workflows/monitor-components.yml | 94 ++++++++++++++++++++++++ 1 file changed, 94 insertions(+) create mode 100644 .github/workflows/monitor-components.yml diff --git a/.github/workflows/monitor-components.yml b/.github/workflows/monitor-components.yml new file mode 100644 index 00000000000000..f15ff218d28b81 --- /dev/null +++ b/.github/workflows/monitor-components.yml @@ -0,0 +1,94 @@ +name: Monitor component updates + +# Git for Windows is a slightly modified subset of MSYS2. Some of its +# components are maintained by Git for Windows, others by MSYS2. To help +# keeping the former up to date, this workflow monitors the Atom/RSS feeds +# and opens new tickets for each new component version. 
+ +on: + schedule: + - cron: "23 8,11,14,17 * * *" + workflow_dispatch: + +env: + CHARACTER_LIMIT: 5000 + MAX_AGE: 7d + +jobs: + job: + # Only run this in Git for Windows' fork + if: github.event.repository.owner.login == 'git-for-windows' + runs-on: ubuntu-latest + permissions: + issues: write + strategy: + matrix: + component: + - label: git + feed: https://github.com/git/git/tags.atom + - label: git-lfs + feed: https://github.com/git-lfs/git-lfs/tags.atom + - label: git-credential-manager + feed: https://github.com/git-ecosystem/git-credential-manager/tags.atom + - label: tig + feed: https://github.com/jonas/tig/tags.atom + - label: cygwin + feed: https://github.com/cygwin/cygwin/releases.atom + title-pattern: ^(?!.*newlib) + - label: msys2-runtime-package + feed: https://github.com/msys2/MSYS2-packages/commits/master/msys2-runtime.atom + - label: msys2-runtime + feed: https://github.com/msys2/msys2-runtime/commits/HEAD.atom + aggregate: true + - label: openssh + feed: https://github.com/openssh/openssh-portable/tags.atom + - label: libfido2 + feed: https://github.com/Yubico/libfido2/tags.atom + - label: libcbor + feed: https://github.com/PJK/libcbor/tags.atom + - label: openssl + feed: https://github.com/openssl/openssl/tags.atom + title-pattern: ^(?!.*alpha) + - label: gnutls + feed: https://gnutls.org/news.atom + - label: heimdal + feed: https://github.com/heimdal/heimdal/tags.atom + - label: git-sizer + feed: https://github.com/github/git-sizer/tags.atom + - label: gitflow + feed: https://github.com/petervanderdoes/gitflow-avh/tags.atom + - label: curl + feed: https://github.com/curl/curl/tags.atom + title-pattern: ^(?!rc-) + - label: mintty + feed: https://github.com/mintty/mintty/releases.atom + - label: 7-zip + feed: https://sourceforge.net/projects/sevenzip/rss?path=/7-Zip + aggregate: true + - label: bash + feed: https://git.savannah.gnu.org/cgit/bash.git/atom/?h=master + aggregate: true + - label: perl + feed: https://github.com/Perl/perl5/tags.atom + 
title-pattern: ^(?!.*(5\.[0-9]+[13579]|RC)) + - label: pcre2 + feed: https://github.com/PCRE2Project/pcre2/tags.atom + - label: mingw-w64-llvm + feed: https://github.com/msys2/MINGW-packages/commits/master/mingw-w64-llvm.atom + - label: innosetup + feed: https://github.com/jrsoftware/issrc/tags.atom + - label: mimalloc + feed: https://github.com/microsoft/mimalloc/tags.atom + title-pattern: ^(?!v1\.|v3\.[01]\.) + fail-fast: false + steps: + - uses: git-for-windows/rss-to-issues@v0 + with: + feed: ${{matrix.component.feed}} + prefix: "[New ${{matrix.component.label}} version]" + labels: component-update + github-token: ${{ secrets.GITHUB_TOKEN }} + character-limit: ${{ env.CHARACTER_LIMIT }} + max-age: ${{ env.MAX_AGE }} + aggregate: ${{matrix.component.aggregate}} + title-pattern: ${{matrix.component.title-pattern}} From 25d52faf3fb62a495875851ce7c97dd89b3fa9b4 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 5 Jul 2017 15:14:50 +0200 Subject: [PATCH 204/218] t5813: allow for $PWD to be a Windows path Git for Windows uses MSYS2's Bash to run the test suite, which comes with benefits but also at a heavy price: on the plus side, MSYS2's POSIX emulation layer allows us to continue pretending that we are on a Unix system, e.g. use Unix paths instead of Windows ones, yet this is bought at a rather noticeable performance penalty. There *are* some more native ports of Unix shells out there, though, most notably BusyBox-w32's ash. These native ports do not use any POSIX emulation layer (or at most a *very* thin one, choosing to avoid features such as fork() that are expensive to emulate on Windows), and they use native Windows paths (usually with forward slashes instead of backslashes, which is perfectly legal in almost all use cases). And here comes the problem: with a $PWD looking like, say, C:/git-sdk-64/usr/src/git/t/trash directory.t5813-proto-disable-ssh Git's test scripts get quite a bit confused, as their assumptions have been shattered. 
Not only does this path contain a colon (oh no!), it also does not start with a slash. This is a problem e.g. when constructing a URL as t5813 does it: ssh://remote$PWD. Not only is it impossible to separate the "host" from the path with a $PWD as above, even prefixing $PWD by a slash won't work, as /C:/git-sdk-64/... is not a valid path. As a workaround, detect when $PWD does not start with a slash on Windows, and simply strip the drive prefix, using an obscure feature of Windows paths: if an absolute Windows path starts with a slash, it is implicitly prefixed by the drive prefix of the current directory. As we are talking about the current directory here, anyway, that strategy works. Signed-off-by: Johannes Schindelin --- t/t5813-proto-disable-ssh.sh | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/t/t5813-proto-disable-ssh.sh b/t/t5813-proto-disable-ssh.sh index 045e2fe6ce376a..c78581dc9f4a1e 100755 --- a/t/t5813-proto-disable-ssh.sh +++ b/t/t5813-proto-disable-ssh.sh @@ -15,8 +15,23 @@ test_expect_success 'setup repository to clone' ' ' test_proto "host:path" ssh "remote:repo.git" -test_proto "ssh://" ssh "ssh://remote$PWD/remote/repo.git" -test_proto "git+ssh://" ssh "git+ssh://remote$PWD/remote/repo.git" + +hostdir="$PWD" +if test_have_prereq MINGW && test "/${PWD#/}" != "$PWD" +then + case "$PWD" in + [A-Za-z]:/*) + hostdir="${PWD#?:}" + ;; + *) + skip_all="Unhandled PWD '$PWD'; skipping rest" + test_done + ;; + esac +fi + +test_proto "ssh://" ssh "ssh://remote$hostdir/remote/repo.git" +test_proto "git+ssh://" ssh "git+ssh://remote$hostdir/remote/repo.git" # Don't even bother setting up a "-remote" directory, as ssh would generally # complain about the bogus option rather than completing our request. 
Our From 2981f01510d2238a05dfab620a08a72508eae4f4 Mon Sep 17 00:00:00 2001 From: Ben Boeckel Date: Fri, 22 Apr 2022 09:06:23 -0400 Subject: [PATCH 205/218] clean: suggest using `core.longPaths` if paths are too long to remove On Windows, git repositories may have extra files which need cleaned (e.g., a build directory) that may be arbitrarily deep. Suggest using `core.longPaths` if such situations are encountered. Fixes: #2715 Signed-off-by: Ben Boeckel --- Documentation/config/advice.adoc | 3 +++ advice.c | 1 + advice.h | 1 + builtin/clean.c | 13 +++++++++++++ 4 files changed, 18 insertions(+) diff --git a/Documentation/config/advice.adoc b/Documentation/config/advice.adoc index 257db58918179a..0b3199f4660886 100644 --- a/Documentation/config/advice.adoc +++ b/Documentation/config/advice.adoc @@ -64,6 +64,9 @@ all advice messages. set their identity configuration. mergeConflict:: Shown when various commands stop because of conflicts. + nameTooLong:: + Advice shown if a filepath operation is attempted where the + path was too long. nestedTag:: Shown when a user attempts to recursively tag a tag object. 
 pushAlreadyExists:: diff --git a/advice.c b/advice.c index 0018501b7bc103..fec2b37627d2df 100644 --- a/advice.c +++ b/advice.c @@ -61,6 +61,7 @@ static struct { [ADVICE_IGNORED_HOOK] = { "ignoredHook" }, [ADVICE_IMPLICIT_IDENTITY] = { "implicitIdentity" }, [ADVICE_MERGE_CONFLICT] = { "mergeConflict" }, + [ADVICE_NAME_TOO_LONG] = { "nameTooLong" }, [ADVICE_NESTED_TAG] = { "nestedTag" }, [ADVICE_OBJECT_NAME_WARNING] = { "objectNameWarning" }, [ADVICE_PUSH_ALREADY_EXISTS] = { "pushAlreadyExists" }, diff --git a/advice.h b/advice.h index 8def28068861df..b826620fb45916 100644 --- a/advice.h +++ b/advice.h @@ -28,6 +28,7 @@ enum advice_type { ADVICE_IGNORED_HOOK, ADVICE_IMPLICIT_IDENTITY, ADVICE_MERGE_CONFLICT, + ADVICE_NAME_TOO_LONG, ADVICE_NESTED_TAG, ADVICE_OBJECT_NAME_WARNING, ADVICE_PUSH_ALREADY_EXISTS, diff --git a/builtin/clean.c b/builtin/clean.c index e15d595c3dc7cc..f8a54a4a47bc7b 100644 --- a/builtin/clean.c +++ b/builtin/clean.c @@ -26,6 +26,7 @@ #include "pathspec.h" #include "help.h" #include "prompt.h" +#include "advice.h" static int require_force = -1; /* unset */ static int interactive; @@ -221,6 +222,9 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag, quote_path(path->buf, prefix, &quoted, 0); errno = saved_errno; warning_errno(_(msg_warn_remove_failed), quoted.buf); + if (saved_errno == ENAMETOOLONG) { + advise_if_enabled(ADVICE_NAME_TOO_LONG, _("Setting `core.longPaths` may allow the deletion to succeed.")); + } *dir_gone = 0; } ret = res; @@ -256,6 +260,9 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag, quote_path(path->buf, prefix, &quoted, 0); errno = saved_errno; warning_errno(_(msg_warn_remove_failed), quoted.buf); + if (saved_errno == ENAMETOOLONG) { + advise_if_enabled(ADVICE_NAME_TOO_LONG, _("Setting `core.longPaths` may allow the deletion to succeed.")); + } *dir_gone = 0; ret = 1; } @@ -299,6 +306,9 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag, 
 quote_path(path->buf, prefix, &quoted, 0); errno = saved_errno; warning_errno(_(msg_warn_remove_failed), quoted.buf); + if (saved_errno == ENAMETOOLONG) { + advise_if_enabled(ADVICE_NAME_TOO_LONG, _("Setting `core.longPaths` may allow the deletion to succeed.")); + } *dir_gone = 0; ret = 1; } @@ -1109,6 +1119,9 @@ int cmd_clean(int argc, qname = quote_path(item->string, NULL, &buf, 0); errno = saved_errno; warning_errno(_(msg_warn_remove_failed), qname); + if (saved_errno == ENAMETOOLONG) { + advise_if_enabled(ADVICE_NAME_TOO_LONG, _("Setting `core.longPaths` may allow the deletion to succeed.")); + } errors++; } else if (!quiet) { qname = quote_path(item->string, NULL, &buf, 0); From 2bafcd2cadb3d83b9c6ce88da8591651cd0e10f8 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 25 Nov 2021 11:26:41 +0100 Subject: [PATCH 206/218] Partially un-revert "editor: save and reset terminal after calling EDITOR" In e3f7e01b50be (Revert "editor: save and reset terminal after calling EDITOR", 2021-11-22), we reverted the commit wholesale where the terminal state would be saved and restored before/after calling an editor. The reverted commit was intended to fix a problem with Windows Terminal where simply calling `vi` would cause problems afterwards. To fix the problem addressed by the revert, but _still_ keep the problem with Windows Terminal fixed, let's revert the revert, with a twist: we restrict the save/restore _specifically_ to the case where `vi` (or `vim`) is called, and do not do the same for any other editor. This should still catch the majority of the cases, and will bridge the time until the original patch is re-done in a way that addresses all concerns. 
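To make the restriction concrete, here is a minimal sketch of the exact-match check, assuming a hypothetical helper named `wants_term_save()` (the patch itself inlines the comparison in `launch_specified_editor()` rather than adding a helper). Because the comparison is a plain `strcmp()`, editor values such as `vim -f` or `/usr/bin/vi` would not match, which is presumably part of why only "the majority of the cases" is claimed:

```c
#include <string.h>

/*
 * Hypothetical helper, for illustration only: returns 1 when the
 * configured editor string is exactly "vi" or "vim", 0 otherwise.
 * Arguments or a leading path make the string differ, so those
 * invocations are left alone.
 */
static int wants_term_save(const char *editor)
{
	return !strcmp(editor, "vi") || !strcmp(editor, "vim");
}
```

Under this sketch, `wants_term_save("vim")` is true, while `wants_term_save("vim -f")` and `wants_term_save("nano")` are false.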
Signed-off-by: Johannes Schindelin --- editor.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/editor.c b/editor.c index fd174e6a034f1c..f6d960c6f30782 100644 --- a/editor.c +++ b/editor.c @@ -13,6 +13,7 @@ #include "strvec.h" #include "run-command.h" #include "sigchain.h" +#include "compat/terminal.h" #ifndef DEFAULT_EDITOR #define DEFAULT_EDITOR "vi" @@ -64,6 +65,7 @@ static int launch_specified_editor(const char *editor, const char *path, return error("Terminal is dumb, but EDITOR unset"); if (strcmp(editor, ":")) { + int save_and_restore_term = !strcmp(editor, "vi") || !strcmp(editor, "vim"); struct strbuf realpath = STRBUF_INIT; struct child_process p = CHILD_PROCESS_INIT; int ret, sig; @@ -92,7 +94,11 @@ static int launch_specified_editor(const char *editor, const char *path, strvec_pushv(&p.env, (const char **)env); p.use_shell = 1; p.trace2_child_class = "editor"; + if (save_and_restore_term) + save_and_restore_term = !save_term(1); if (start_command(&p) < 0) { + if (save_and_restore_term) + restore_term(); strbuf_release(&realpath); return error("unable to start editor '%s'", editor); } @@ -100,6 +106,8 @@ static int launch_specified_editor(const char *editor, const char *path, sigchain_push(SIGINT, SIG_IGN); sigchain_push(SIGQUIT, SIG_IGN); ret = finish_command(&p); + if (save_and_restore_term) + restore_term(); strbuf_release(&realpath); sig = ret - 128; sigchain_pop(SIGINT); From 2a47ae0582f0f8bd16e4bc56e1e55017c46e632d Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 10 Dec 2019 21:41:57 +0100 Subject: [PATCH 207/218] reset: reinstate support for the deprecated --stdin option The `--stdin` option was a well-established paradigm in other commands, therefore we implemented it in `git reset` for use by Visual Studio. Unfortunately, upstream Git decided that it is time to introduce `--pathspec-from-file` instead. 
To keep backwards-compatibility for some grace period, we therefore reinstate the `--stdin` option on top of the `--pathspec-from-file` option, but mark it firmly as deprecated. Helped-by: Victoria Dye Helped-by: Matthew John Cheetham Signed-off-by: Johannes Schindelin --- Documentation/git-reset.adoc | 11 +++++++++++ builtin/reset.c | 16 ++++++++++++++++ t/meson.build | 1 + t/t7108-reset-stdin.sh | 32 ++++++++++++++++++++++++++++++++ 4 files changed, 60 insertions(+) create mode 100755 t/t7108-reset-stdin.sh diff --git a/Documentation/git-reset.adoc b/Documentation/git-reset.adoc index 5023b5069972ca..933e2fac7dd662 100644 --- a/Documentation/git-reset.adoc +++ b/Documentation/git-reset.adoc @@ -12,6 +12,7 @@ git reset [--soft | --mixed [-N] | --hard | --merge | --keep] [-q] [] git reset [-q] [] [--] ... git reset [-q] [--pathspec-from-file= [--pathspec-file-nul]] [] git reset (--patch | -p) [] [--] [...] +DEPRECATED: git reset [-q] [--stdin [-z]] [] DESCRIPTION ----------- @@ -139,6 +140,16 @@ include::diff-context-options.adoc[] + For more details, see the 'pathspec' entry in linkgit:gitglossary[7]. +`--stdin`:: + DEPRECATED (use `--pathspec-from-file=-` instead): Instead of taking + list of paths from the command line, read list of paths from the + standard input. Paths are separated by LF (i.e. one path per line) by + default. + +`-z`:: + DEPRECATED (use `--pathspec-file-nul` instead): Only meaningful with + `--stdin`; paths are separated with NUL character instead of LF. 
+ EXAMPLES -------- diff --git a/builtin/reset.c b/builtin/reset.c index 3590be57a5f03c..1cd7e61fe45e90 100644 --- a/builtin/reset.c +++ b/builtin/reset.c @@ -38,6 +38,8 @@ #include "trace2.h" #include "dir.h" #include "add-interactive.h" +#include "strbuf.h" +#include "quote.h" #define REFRESH_INDEX_DELAY_WARNING_IN_MS (2 * 1000) @@ -46,6 +48,7 @@ static const char * const git_reset_usage[] = { N_("git reset [-q] [] [--] ..."), N_("git reset [-q] [--pathspec-from-file [--pathspec-file-nul]] []"), N_("git reset --patch [] [--] [...]"), + N_("DEPRECATED: git reset [-q] [--stdin [-z]] []"), NULL }; @@ -347,6 +350,7 @@ int cmd_reset(int argc, struct pathspec pathspec; int intent_to_add = 0; struct interactive_options interactive_opts = INTERACTIVE_OPTIONS_INIT; + int nul_term_line = 0, read_from_stdin = 0; const struct option options[] = { OPT__QUIET(&quiet, N_("be quiet, only report errors")), OPT_BOOL(0, "no-refresh", &no_refresh, @@ -379,6 +383,10 @@ int cmd_reset(int argc, N_("record only the fact that removed paths will be added later")), OPT_PATHSPEC_FROM_FILE(&pathspec_from_file), OPT_PATHSPEC_FILE_NUL(&pathspec_file_nul), + OPT_BOOL('z', NULL, &nul_term_line, + N_("DEPRECATED (use --pathspec-file-nul instead): paths are separated with NUL character")), + OPT_BOOL(0, "stdin", &read_from_stdin, + N_("DEPRECATED (use --pathspec-from-file=- instead): read paths from ")), OPT_END() }; @@ -388,6 +396,14 @@ int cmd_reset(int argc, PARSE_OPT_KEEP_DASHDASH); parse_args(&pathspec, argv, prefix, patch_mode, &rev); + if (read_from_stdin) { + warning(_("--stdin is deprecated, please use --pathspec-from-file=- instead")); + free(pathspec_from_file); + pathspec_from_file = xstrdup("-"); + if (nul_term_line) + pathspec_file_nul = 1; + } + if (pathspec_from_file) { if (patch_mode) die(_("options '%s' and '%s' cannot be used together"), "--pathspec-from-file", "--patch"); diff --git a/t/meson.build b/t/meson.build index 94bc6ccd1bd72a..c2c7df576b19f1 100644 --- a/t/meson.build 
+++ b/t/meson.build @@ -875,6 +875,7 @@ integration_tests = [ 't7105-reset-patch.sh', 't7106-reset-unborn-branch.sh', 't7107-reset-pathspec-file.sh', + 't7108-reset-stdin.sh', 't7110-reset-merge.sh', 't7111-reset-table.sh', 't7112-reset-submodule.sh', diff --git a/t/t7108-reset-stdin.sh b/t/t7108-reset-stdin.sh new file mode 100755 index 00000000000000..b7cbcbf869296c --- /dev/null +++ b/t/t7108-reset-stdin.sh @@ -0,0 +1,32 @@ +#!/bin/sh + +test_description='reset --stdin' + +. ./test-lib.sh + +test_expect_success 'reset --stdin' ' + test_commit hello && + git rm hello.t && + test -z "$(git ls-files hello.t)" && + echo hello.t | git reset --stdin && + test hello.t = "$(git ls-files hello.t)" +' + +test_expect_success 'reset --stdin -z' ' + test_commit world && + git rm hello.t world.t && + test -z "$(git ls-files hello.t world.t)" && + printf world.tQworld.tQhello.tQ | q_to_nul | git reset --stdin -z && + printf "hello.t\nworld.t\n" >expect && + git ls-files >actual && + test_cmp expect actual +' + +test_expect_success '--stdin requires --mixed' ' + echo hello.t >list && + test_must_fail git reset --soft --stdin Date: Mon, 4 Apr 2022 15:38:58 -0700 Subject: [PATCH 208/218] fsmonitor: reintroduce core.useBuiltinFSMonitor Reintroduce the 'core.useBuiltinFSMonitor' config setting (originally added in 0a756b2a25 (fsmonitor: config settings are repository-specific, 2021-03-05)) after its removal from the upstream version of FSMonitor. Upstream, the 'core.useBuiltinFSMonitor' setting was rendered obsolete by "overloading" the 'core.fsmonitor' setting to take a boolean value. However, several applications (e.g., 'scalar') utilize the original config setting, so it should be preserved for a deprecation period before complete removal: * if 'core.fsmonitor' is a boolean, the user is correctly using the new config syntax; do not use 'core.useBuiltinFSMonitor'. * if 'core.fsmonitor' is unspecified, use 'core.useBuiltinFSMonitor'. 
* if 'core.fsmonitor' is a path, override and use the builtin FSMonitor if 'core.useBuiltinFSMonitor' is 'true'; otherwise, use the FSMonitor hook indicated by the path. Additionally, for this deprecation period, advise users to switch to using 'core.fsmonitor' to specify their use of the builtin FSMonitor. Signed-off-by: Victoria Dye --- Documentation/config/advice.adoc | 4 ++++ advice.c | 1 + advice.h | 1 + fsmonitor-settings.c | 34 ++++++++++++++++++++++++++++++-- 4 files changed, 38 insertions(+), 2 deletions(-) diff --git a/Documentation/config/advice.adoc b/Documentation/config/advice.adoc index 257db58918179a..f156f638dcd5ee 100644 --- a/Documentation/config/advice.adoc +++ b/Documentation/config/advice.adoc @@ -166,4 +166,8 @@ all advice messages. Shown when the user tries to create a worktree from an invalid reference, to tell the user how to create a new unborn branch instead. + + useCoreFSMonitorConfig:: + Advice shown if the deprecated 'core.useBuiltinFSMonitor' config + setting is in use. 
-- diff --git a/advice.c b/advice.c index 0018501b7bc103..01f0fe407e84a4 100644 --- a/advice.c +++ b/advice.c @@ -89,6 +89,7 @@ static struct { [ADVICE_SUBMODULE_MERGE_CONFLICT] = { "submoduleMergeConflict" }, [ADVICE_SUGGEST_DETACHING_HEAD] = { "suggestDetachingHead" }, [ADVICE_UPDATE_SPARSE_PATH] = { "updateSparsePath" }, + [ADVICE_USE_CORE_FSMONITOR_CONFIG] = { "useCoreFSMonitorConfig" }, [ADVICE_WAITING_FOR_EDITOR] = { "waitingForEditor" }, [ADVICE_WORKTREE_ADD_ORPHAN] = { "worktreeAddOrphan" }, }; diff --git a/advice.h b/advice.h index 8def28068861df..d5d7696897351e 100644 --- a/advice.h +++ b/advice.h @@ -56,6 +56,7 @@ enum advice_type { ADVICE_SUBMODULE_MERGE_CONFLICT, ADVICE_SUGGEST_DETACHING_HEAD, ADVICE_UPDATE_SPARSE_PATH, + ADVICE_USE_CORE_FSMONITOR_CONFIG, ADVICE_WAITING_FOR_EDITOR, ADVICE_WORKTREE_ADD_ORPHAN, }; diff --git a/fsmonitor-settings.c b/fsmonitor-settings.c index a6587a8972b184..b4c29f44a27827 100644 --- a/fsmonitor-settings.c +++ b/fsmonitor-settings.c @@ -5,6 +5,7 @@ #include "fsmonitor-ipc.h" #include "fsmonitor-settings.h" #include "fsmonitor-path-utils.h" +#include "advice.h" /* * We keep this structure definition private and have getters @@ -100,6 +101,31 @@ static struct fsmonitor_settings *alloc_settings(void) return s; } +static int check_deprecated_builtin_config(struct repository *r) +{ + int core_use_builtin_fsmonitor = 0; + + /* + * If 'core.useBuiltinFSMonitor' is set, print a deprecation warning + * suggesting the use of 'core.fsmonitor' instead. If the config is + * set to true, set the appropriate mode and return 1 indicating that + * the check resulted the config being set by this (deprecated) setting. 
+ */ + if(!repo_config_get_bool(r, "core.useBuiltinFSMonitor", &core_use_builtin_fsmonitor) && + core_use_builtin_fsmonitor) { + if (!git_env_bool("GIT_SUPPRESS_USEBUILTINFSMONITOR_ADVICE", 0)) { + advise_if_enabled(ADVICE_USE_CORE_FSMONITOR_CONFIG, + _("core.useBuiltinFSMonitor=true is deprecated; " + "please set core.fsmonitor=true instead")); + setenv("GIT_SUPPRESS_USEBUILTINFSMONITOR_ADVICE", "1", 1); + } + fsm_settings__set_ipc(r); + return 1; + } + + return 0; +} + static void lookup_fsmonitor_settings(struct repository *r) { const char *const_str; @@ -126,12 +152,16 @@ static void lookup_fsmonitor_settings(struct repository *r) return; case 1: /* config value was unset */ + if (check_deprecated_builtin_config(r)) + return; + const_str = getenv("GIT_TEST_FSMONITOR"); break; case -1: /* config value set to an arbitrary string */ - if (repo_config_get_pathname(r, "core.fsmonitor", &to_free)) - return; /* should not happen */ + if (check_deprecated_builtin_config(r) || + repo_config_get_pathname(r, "core.fsmonitor", &to_free)) + return; const_str = to_free; break; From b133bda58de07a4bfe3e8ee109007c17cd046ef6 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 6 Feb 2024 18:45:35 +0100 Subject: [PATCH 209/218] dependabot: help keeping GitHub Actions versions up to date See https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot#enabling-dependabot-version-updates-for-actions for details. Signed-off-by: Johannes Schindelin --- .github/dependabot.yml | 13 +++++++++++++ 1 file changed, 13 insertions(+) create mode 100644 .github/dependabot.yml diff --git a/.github/dependabot.yml b/.github/dependabot.yml new file mode 100644 index 00000000000000..22d5376407abf1 --- /dev/null +++ b/.github/dependabot.yml @@ -0,0 +1,13 @@ +# To get started with Dependabot version updates, you'll need to specify which +# package ecosystems to update and where the package manifests are located. 
+# Please see the documentation for all configuration options: +# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file +# especially +# https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot#enabling-dependabot-version-updates-for-actions + +version: 2 +updates: + - package-ecosystem: "github-actions" # See documentation for possible values + directory: "/" # Location of package manifests + schedule: + interval: "weekly" From fa18d8c944889942e59337f47443f998639662ac Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 7 Jul 2017 10:15:36 +0200 Subject: [PATCH 210/218] t9200: skip tests when $PWD contains a colon On Windows, the current working directory is pretty much guaranteed to contain a colon. If we feed that path to CVS, it mistakes it for a separator between host and port, though. This has not been a problem so far because Git for Windows uses MSYS2's Bash using a POSIX emulation layer that also pretends that the current directory is a Unix path (at least as long as we're in a shell script). However, that is rather limiting, as Git for Windows also explores other ports of other Unix shells. One of those is BusyBox-w32's ash, which is a native port (i.e. *not* using any POSIX emulation layer, and certainly not emulating Unix paths). So let's just detect if there is a colon in $PWD and punt in that case. Signed-off-by: Johannes Schindelin --- t/t9200-git-cvsexportcommit.sh | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/t/t9200-git-cvsexportcommit.sh b/t/t9200-git-cvsexportcommit.sh index 415ac008fd7118..5f827626e31eb7 100755 --- a/t/t9200-git-cvsexportcommit.sh +++ b/t/t9200-git-cvsexportcommit.sh @@ -11,6 +11,13 @@ if ! 
test_have_prereq PERL; then test_done fi +case "$PWD" in +*:*) + skip_all='cvs would get confused by the colon in `pwd`; skipping tests' + test_done + ;; +esac + cvs >/dev/null 2>&1 if test $? -ne 1 then From 489d6613cecbc733356c2865b66661d70aee3326 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 13 Feb 2023 13:31:35 +0100 Subject: [PATCH 211/218] Describe Git for Windows' architecture The Git for Windows project has grown quite complex over the years, certainly much more complex than during the first years where the `msysgit.git` repository was abusing Git for package management purposes and the `git/git` fork was called `4msysgit.git`. Let's describe the status quo in a thorough way. Signed-off-by: Johannes Schindelin --- ARCHITECTURE.md | 116 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 116 insertions(+) create mode 100644 ARCHITECTURE.md diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md new file mode 100644 index 00000000000000..7de4f99bf71ec4 --- /dev/null +++ b/ARCHITECTURE.md @@ -0,0 +1,116 @@ +# Architecture of Git for Windows + +Git for Windows is a complex project. + +## What _is_ Git for Windows? + +### A fork of `git/git` + +First and foremost, it is a friendly fork of [`git/git`](https://github.com/git/git), aiming to improve Git's Windows support. The [`git-for-windows/git`](https://github.com/git-for-windows/git) repository contains dozens of topics on top of `git/git`, some awaiting to be "upstreamed" (i.e. to be contributed to `git/git`), some still being stabilized, and a few topics are specific to the Git for Windows project and are not intended to be integrated into `git/git` at all. + +### Enhancing and maintaining Git's support for Windows + +On the source code side, Git's Windows support is made a bit more tricky than strictly necessary by the fact that Git does not have any platform abstraction layer (unlike other version control systems, such as Subversion). 
It relies on the presence of POSIX features such as the `hstrerror()` function, and on platforms lacking that functionality, Git provides shims. That leads to some challenges e.g. with the `stat()` function which is very slow on Windows because it has to collect much more metadata than what e.g. the very quick `GetFileAttributesExW()` Win32 API function provides, even when Git calls `stat()` merely to test for the presence of a file (for which all that gathered metadata is totally irrelevant). + +### Providing more than just source code + +In contrast to the Git project, Git for Windows not only publishes tagged source code versions, but full builds of Git. In fact, Git for Windows' primary purpose, as far as most users are concerned, is to provide a convenient installer that end-users can run to have Git on their computer, without ever having to check out `git-for-windows/git` let alone build it. In essence, Git for Windows has to maintain a separate project altogether in addition to the fork of `git/git`, just to build these release artifacts: [`git-for-windows/build-extra`](https://github.com/git-for-windows/build-extra). This repository also contains the definition for a couple of other release artifacts published by Git for Windows, e.g. the "portable" edition of Git for Windows which is a self-extracting 7-Zip archive that does not need to be installed. + +### A software distribution, really + +Another aspect that contributes to the complexity of Git for Windows is that it is not just building `git.exe` and distributes that. Due to its heritage within the Linux project, Git takes certain things for granted, such as the presence of a Unix shell, or for that matter, a package management system from which dependencies can be fetched and updated independently of Git itself. Things that are distinctly not present in most Windows setups. 
To accommodate for that, Git for Windows originally relied on the MSys project, a minimal fork of Cygwin providing a Unix shell ("Bash"), a Perl interpreter and similar Unix-like tools, and on the MINGW project, a project to build libraries and executables using a GNU C Compiler that relies only on Win32 API functions. As of Git for Windows v2.x, the project has switched away from [MSys](https://sourceforge.net/projects/mingw/files/MSYS/)/[MinGW](https://osdn.net/projects/mingw/) (due to less-than-active maintenance) to [the MSYS2 project](https://msys2.org). That switch brought along the benefit of a robust package management system based on [Pacman](https://archlinux.org/pacman/) (hailing from Arch Linux). To support Windows users, who are in general unfamiliar with Linux-like package management and the need to update installed packages frequently, Git for Windows bundles a subset of its own fork of MSYS2. To put things in perspective: Git for Windows bundles files from ~170 packages, one of which contains Git, and another one contains Git's help files. In that respect, Git for Windows acts like a distribution more than like a mere single software application. + +Most of MSYS2's packages that are bundled in Git for Windows are consumed directly from MSYS2. Others need forks that are maintained by Git for Windows project, to support Git for Windows better. These forks live in the [`git-for-windows/MSYS2-packages`](https://github.com/git-for-windows/MSYS2-packages) and [`git-for-windows/MINGW-packages`](https://github.com/git-for-windows/MINGW-packages) repositories. There are several reasons justifying these forks. For example, the Git for Windows' flavor of the MSYS2 runtime behaves like Git's test suite expects it while MSYS2's flavor does not. Another example: The Bash executable bundled in Git for Windows is code-signed with the same certificate as `git.exe` to help anti-malware programs get out of the users' way. 
That is why Git for Windows maintains its own `bash` Pacman package. And since MSYS2 dropped 32-bit support already, Git for Windows has to update the 32-bit Pacman packages itself, which is done in the git-for-windows/MSYS2-packages repository. (Side note: the 32-bit issue is a bit more complicated, actually: MSYS2 _still_ builds _MINGW_ packages targeting i686 processors, but no longer any _MSYS_ packages for said processor architecture, and Git for Windows does not keep all of the 32-bit MSYS packages up to date but instead judiciously decides which packages are vital enough as far as Git is concerned to justify the maintenance cost.) + +### Supporting third-party applications that use Git's functionality + +Since the infrastructure required by Git is non-trivial the installer (or for that matter, the Portable Git) is not exactly light-weight: As of January 2023, both artifacts are over fifty megabytes. This is a problem for third-party applications wishing to bundle a version of Git for Windows, which is often advisable given that applications may depend on features that have been introduced only in recent Git versions and therefore relying on an installed Git for Windows could break things. To help with that, the Git for Windows project also provides MinGit as a release artifact, a zip file that is much smaller than the full installer and that contains only the parts of Git for Windows relevant for third-party applications. It lacks Git GUI, for example, as well as the terminal program MinTTY, or for that matter, the documentation. + +### Supporting `git/git`'s GitHub workflows + +The Git for Windows project is also responsible for keeping the Windows part of `git/git`'s automated builds up and running. On Windows, there is no canonical and easy way to get a build environment necessary to build Git and run its test suite, therefore this is a non-trivial task that comes with its own maintenance cost. 
Git for Windows provides two GitHub Actions to help with that: [`git-for-windows/setup-git-for-windows-sdk`](https://github.com/git-for-windows/setup-git-for-windows-sdk) to set up a tiny subset of Git for Windows' full SDK (which would require about 500MB to be cloned, as opposed to the ~75MB of that subset) and [`git-for-windows/get-azure-pipelines-artifact`](https://github.com/git-for-windows/get-azure-pipelines-artifact) e.g. to download some regularly pre-built artifacts (for example, when `git/git`'s automated tests ran on an Ubuntu version that did not provide an up to date [Coccinelle](https://coccinelle.gitlabpages.inria.fr/website/) package, this GitHub Action was used to download a pre-built version of that Debian package). + +## Maintaining Git for Windows' components + +Git for Windows uses a combination of [a GitHub App called GitForWindowsHelper](https://github.com/git-for-windows/gfw-helper-github-app) (to listen for so-called [slash commands](https://github.com/git-for-windows/gfw-helper-github-app#slash-commands)) combined with workflows in [the `git-for-windows-automation` repository](https://github.com/git-for-windows/git-for-windows-automation/) (for computationally heavy tasks) to support Git for Windows' repetitive tasks. + +This heavy automation serves two purposes: + +1. Document the knowledge about "how things are done" in the Git for Windows project. +2. Make Git for Windows' maintenance less tedious by off-loading as many tasks onto machines as possible. + +One neat trick of some `git-for-windows-automation` workflows is that they "mirror back" check runs to the targeted PRs in another repository. This essentially allows versioning the source code independently of the workflow definition. + +Here is a diagram showing how the bits and pieces fit together. + +```mermaid +graph LR + A[`monitor-components`] --> |opens| B + B{issues labeled<br />`component-update`} --> |/open pr| C + C((GitForWindowsHelper)) --> |triggers| D + D[`open-pr`] --> |opens| E + E{PR in<br />MINGW-packages<br />MSYS2-packages<br />build-extra} --> |closes| B + E --> |/deploy| F + F((GitForWindowsHelper)) --> |triggers| G + G[`build-and-deploy`] --> |deploys to| H + H{Pacman repository} + C --> |backed by| I + F --> |backed by| I + I[[Azure Function]] + D --> |running in| J + G --> | running in| J + J[[git-for-windows-automation]] + K[[git-sdk-32<br />git-sdk-64<br />git-sdk-arm64]] --> |syncing from| H + B --> |/add release note| L + L[`add-release-note`] +``` + +For the curious mind, here are [detailed instructions how the Azure Function backing the GitForWindowsHelper GitHub App was set up](https://github.com/git-for-windows/gfw-helper-github-app#how-this-github-app-was-set-up). + +### The `monitor-components` workflow + +When new versions of components that Git for Windows builds become available, new Pacman packages have to be built. To this end, [the `monitor-components` workflow](https://github.com/git-for-windows/git/blob/main/.github/workflows/monitor-components.yml) monitors a couple of RSS feeds and opens new tickets labeled `component-update` for such new versions. + +### Opening Pull Requests to update Git for Windows' components + +After determining that such a ticket indeed indicates the need for a new Pacman package build, a Git for Windows maintainer issues the `/open pr` command via an issue comment ([example](https://github.com/git-for-windows/git/issues/4281#issuecomment-1426859787)), which gets picked up by the GitForWindowsHelper GitHub App, which in turn triggers [the `open-pr` workflow](https://github.com/git-for-windows/git-for-windows-automation/blob/main/.github/workflows/open-pr.yml) in the `git-for-windows-automation` repository. + +### Deploying the Pacman packages + +This will open a Pull Request in one of Git for Windows' repositories, and once the PR build passes, a Git for Windows maintainer issues the `/deploy` command ([example](https://github.com/git-for-windows/MINGW-packages/pull/69#issuecomment-1427591890)), which gets picked up by the GitForWindowsHelper GitHub App, which triggers [the `build-and-deploy` workflow](https://github.com/git-for-windows/git-for-windows-automation/blob/main/.github/workflows/build-and-deploy.yml). 
+ +### Adding release notes + +Finally, once the packages have been built and deployed to the Pacman repository (which is hosted in Azure Blob Storage), a Git for Windows maintainer will merge the PR(s), which in turn will close the ticket, and the maintainer then issues an `/add release note` command ([example](https://github.com/git-for-windows/MINGW-packages/pull/69#issuecomment-1427782230)), which again gets picked up by the GitForWindowsHelper GitHub App that triggers [the `add-release-note` workflow](https://github.com/git-for-windows/build-extra/blob/main/.github/workflows/add-release-note.yml) that creates and pushes a new commit to the `ReleaseNotes.md` file in `build-extra` ([example](https://github.com/git-for-windows/build-extra/commit/b39c148ff8dc0e987afdb677d17c46a8e99fd0ef)). + +## Releasing official Git for Windows versions + +A relatively infrequent part of Git for Windows' maintainers' duties, if the most rewarding part, is the task of releasing new versions of Git for Windows. + +Most commonly, this is done in response to the "upstream" Git project releasing a new version. When that happens, a Git for Windows maintainer runs [the helper script](https://github.com/git-for-windows/build-extra/blob/main/shears.sh) to perform a "merging rebase" (i.e. a rebase that starts with a fake-merge of the previous tip commit, to maintain both a clean set of commits as well as a [fast-forwarding](https://git-scm.com/docs/git-merge#Documentation/git-merge.txt---ff-only) commit history). + +Once that is done, the maintainer will open a Pull Request to benefit from the automated builds and tests ([example](https://github.com/git-for-windows/git/pull/4160)) as well as from reviews of the [`range-diff`](https://git-scm.com/docs/git-range-diff) relative to the current `main` branch. + +Once everything looks good, the maintainer will issue the `/git-artifacts` command ([example](https://github.com/git-for-windows/git/pull/4160#issuecomment-1346801735)). 
This will trigger an automated workflow that builds all of the release artifacts: installers, Portable Git, MinGit, `.tar.xz` archive and a NuGet package. Apart from the NuGet package, two sets of artifacts are built: targeting 32-bit ("x86") and 64-bit ("amd64"). + +Once these artifacts are built, the maintainer will download the installer and run [the "pre-flight checklist"](https://github.com/git-for-windows/build-extra/blob/main/installer/checklist.txt). + +If everything looks good, a `/release` command will be issued, which triggers yet another workflow that will download the just-built-and-verified release artifacts, publish them as a new GitHub release, publish the NuGet packages, deploy the Pacman packages to the Pacman repository, send out an announcement mail, and update the respective repositories including [Git for Windows' website](https://gitforwindows.org/). + +As mentioned [before](#architecture-of-git-for-windows), the `/git-artifacts` and `/release` commands are picked up by the GitForWindowsHelper GitHub App which subsequently triggers the respective workflows in the `git-for-windows-automation` repository. Here is a diagram: + +```mermaid +graph LR + A{Pull Request
updating to
new Git version} --> |/git-artifacts| B + B((GitForWindowsHelper)) --> |triggers| C + C[`tag-git`] --> |upon successful build
triggers| D + D((GitForWindowsHelper)) --> |triggers| E + E[`git-artifacts`] + E --> |maintainer verifies artifacts| E + A --> |upon verified `git-artifacts`
/release| F + F[`release-git`] + C --> |running in| J + E --> | running in| J + F --> | running in| J + J[[git-for-windows-automation]] +``` \ No newline at end of file From 458b8eee59d8c4133350fb4aeb1a3a0341118d3a Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 26 Jan 2026 19:18:40 +0100 Subject: [PATCH 212/218] Add an AGENTS.md file to help with AI-assisted debugging/development MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In this time and age, AI is everywhere. However, it's sometimes not very easy to use. For green-field projects it works quite a bit better than for existing legacy projects. And Git's source code is _quite_ as legacy code as they come... 😁 Now, the only way how AI can be used efficiently with legacy code is by providing enough information by way of prompt context for the AI to have a chance to make any sense of the code. The structure and the architecture is, after all, not designed for AI, but rather the opposite: By virtue of having grown organically over two decades, there is no design that AI coding models would readily grasp. So here is a document that describes all kinds of aspects about this project. The idea is to help AI by providing information that it does not have ingrained in its weights. The idea is to provide information that a human prompter might take for granted, but no coding model will have been trained on specifically. 
Assisted-by: Claude Opus 4.5 Signed-off-by: Johannes Schindelin --- .gitattributes | 1 + AGENTS.md | 1204 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 1205 insertions(+) create mode 100644 AGENTS.md diff --git a/.gitattributes b/.gitattributes index 69dcb5bb2d0cde..53ccb9407d57bc 100644 --- a/.gitattributes +++ b/.gitattributes @@ -7,6 +7,7 @@ *.py text eol=lf diff=python *.bat text eol=crlf *.png binary +/AGENTS.md conflict-marker-size=32 CODE_OF_CONDUCT.md -whitespace /Documentation/**/*.adoc text eol=lf whitespace=trail,space,incomplete /command-list.txt text eol=lf diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 00000000000000..c60945448ff42b --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,1204 @@ +# Git for Windows - Development Guide + +## Background + +Git for Windows is a fork of upstream Git that provides the necessary +adaptations to make Git work well on Windows. While the primary target is +Windows, the project also maintains working builds on other platforms (Linux, +macOS) because cross-platform builds often catch mistakes that might be missed +when testing only on Windows. + +There are downstream projects that build on Git for Windows, such as Microsoft +Git, which adds features for large monorepos hosted on Azure DevOps. + +## Overview + +This document provides guidance for developing and debugging in +Git for Windows. + +## Repository Structure + +### Branch Naming Patterns + +Based on actual repository usage: + +- `main` - The primary development branch +- Feature branches use descriptive topic names, targeting the main branch + +## Building and Testing + +### Build + +```bash +make -j$(nproc) +``` + +On Windows (in a Git for Windows SDK shell): + +```bash +make -j15 +``` + +### Run Specific Tests + +```bash +cd t && sh t0001-init.sh # Run normally +cd t && sh t0001-init.sh -v # Verbose +cd t && sh t0001-init.sh -ivx # verbose, trace, fail-fast +``` + +Some tests are expensive and skipped by default. 
When a test exits immediately +with "skip all", check the test script header for `test_bool_env GIT_TEST_*` +to find which environment variable enables it. + +## Git Source Code Structure + +This section provides a bird's eye view of Git's source code layout. For +more details, see "A birds-eye view of Git's source code" in +`Documentation/user-manual.adoc`. + +### Key Directories + +| Directory | Purpose | +|------------------|----------------------------------------------------| +| `builtin/` | Built-in command implementations (`cmd_()`) | +| `xdiff/` | Low-level diff algorithms (libxdiff) | +| `t/` | Test suite (shell scripts, helpers, libraries) | +| `Documentation/` | Man pages, guides, technical docs (AsciiDoc) | +| `contrib/` | Optional extras, not part of core Git | +| `compat/` | Platform compatibility shims | +| `refs/` | Reference backends (files, reftable) | +| `reftable/` | Reftable format implementation | + +### Built-in Commands + +Built-in commands are implemented in `builtin/.c` with a function +`cmd_()`. To add a new built-in: + +1. Create `builtin/.c` implementing `cmd_()` +2. Add entry to the `commands[]` array in `git.c`: + ```c + { "", cmd_, RUN_SETUP }, + ``` +3. Add to `BUILTIN_OBJS` in `Makefile` +4. Add to `command-list.txt` with appropriate category +5. Run `make check-builtins` to verify consistency + +### Object Data Model + +Git stores four types of objects, defined in `object.h`: + +```c +enum object_type { + OBJ_COMMIT = 1, /* Points to tree, has parent commits, metadata */ + OBJ_TREE = 2, /* Directory listing: names -> blob/tree OIDs */ + OBJ_BLOB = 3, /* File contents */ + OBJ_TAG = 4, /* Annotated tag pointing to another object */ +}; +``` + +Objects are addressed by their SHA (OID) and stored in the Object Database. 
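
The four object types can be observed from the command line. The following is a minimal sketch, assuming a throw-away repository; the directory, file name, tag name and identity values are hypothetical and only serve the illustration:

```bash
# Create a throw-away repository (path and contents are illustrative)
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Provide an identity so that the example does not depend on global config
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

echo hello >file.txt
git add file.txt
git commit -q -m "initial commit"
git tag -a -m "annotated tag" v0.1

# Ask the object database for the type of each kind of object
commit_type=$(git cat-file -t HEAD)
tree_type=$(git cat-file -t 'HEAD^{tree}')
blob_type=$(git cat-file -t HEAD:file.txt)
tag_type=$(git cat-file -t v0.1)

printf '%s\n' "$commit_type" "$tree_type" "$blob_type" "$tag_type"
```

`git cat-file -p` prints the content of any of these objects in the same way, which is often the quickest way to check what a given OID refers to.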
+ +### Object Database (ODB) + +The ODB is defined in `odb.h` and implemented in `odb.c`: + +- **`struct object_database`**: Top-level container, owned by a repository + - `sources`: Linked list of `odb_source` (primary + alternates) + - `replace_map`: Object replacements (see `git-replace(1)`) + - `commit_graph`: Commit-graph cache for faster traversal + +- **`struct odb_source`**: A single object store location + - `path`: Directory (e.g., `.git/objects` or an alternate) + - `loose`: Loose object cache + - `packfiles`: Packfile store (idx + pack files) + +Key functions: +- `odb_read_object()`: Read an object by OID +- `odb_write_object()`: Write an object, returns OID +- `odb_read_object_info()`: Get object type/size without reading content + +### Documentation + +Documentation lives in `Documentation/` as AsciiDoc (`.adoc`) files: + +- `git-.adoc` - Man pages for commands +- `config/.adoc` - Config option documentation (included by others) +- `technical/` - Technical specifications and internals + +To build documentation: +```bash +make -C Documentation html # Build HTML docs +make -C Documentation man # Build man pages +``` + +To add documentation for a new config option, add it to the appropriate +file in `Documentation/config/`. These are included by other docs. + +To lint documentation: +```bash +make -C Documentation lint-docs +``` + +## Debugging Techniques + +### Debugging Philosophy + +Debugging is not about guessing fixes and seeing if they work. It is about +building a complete understanding of the problem before attempting any fix. +The goal is not speed to a "fix" but confidence that you understand and have +addressed the root cause. + +**Respect turnaround time.** If seeing the result of an attempted fix takes +7-10 minutes (e.g., a CI workflow run), you cannot afford to guess. Each +iteration costs human time and attention. Before pushing any change: + +1. Ask: "What information am I missing to competently assess this situation?" +2. 
Add diagnostic output that will provide that information if the fix fails. +3. Consider whether you can reproduce the issue locally where turnaround is + seconds, not minutes. + +**Understand before acting.** Before attempting any fix: + +1. When investigating a regression between two versions, start by examining + the code diff. Analyze what actually changed before running any tests. + Tests confirm hypotheses; reading the diff gives you the hypothesis. +2. Trace the code flow completely. Read the relevant Makefiles, scripts, and + source files. Understand what each component does and how they interact. +3. Identify all changes that could have contributed: upstream commits, + downstream patches, infrastructure changes (CI runner updates, dependency + upgrades). +4. For each potential cause, find the specific commit, its date, its intent, + and how it interacts with other components. +5. Build a hypothesis. Then ask: "How would I confirm or disprove this?" + +**Do not assume root cause from symptoms.** A symptom appearing on one +platform does not mean the bug is platform-specific. The cause may be in +shared code that manifests differently across platforms. Similarly, a passing +test on one platform when it fails on another is data to investigate, not +grounds to conclude "works for me." + +**When a fix does not work, investigate why.** If you expected a fix to work +and it did not, that is valuable information. Do not abandon that line of +thinking and try something else. Instead: + +1. Ask: "Why didn't that work? What does this tell me about my understanding?" +2. Add more targeted diagnostics to understand the discrepancy. +3. Re-examine your assumptions. Something you believed to be true is false. + +**Add diagnostics proactively.** Before pushing a fix attempt, add diagnostic +output that will: + +1. Confirm the state you expect to see if the fix works. +2. Reveal the actual state if it does not. +3. 
Provide enough context to understand the next step without another round + trip. + +For build failures, this might include: library paths, compiler flags, +architecture information, symbol tables, file existence checks, environment +variables. + +**Build confidence before pushing.** A fix should not be a guess. You should +be able to explain: + +1. What was the root cause? +2. Why does this fix address it? +3. What other ways could this problem be solved? +4. Am I choosing the "most correct" or "most effective" approach? +5. What evidence confirms your understanding? +6. What could still go wrong, and how would you detect it? + +### Searching the Codebase + +When debugging failures that printed error messages, it is often useful to +search the code for those messages. If parts of a message seem mutable +(e.g. commit OIDs), those parts will not be hard-coded, so the search needs +to accommodate that by using regular expressions or prefix matches. + +Use `git grep` for fast code searches: + +```bash +git grep -n -i "pattern" # Case-insensitive search with line numbers +git grep -n -w "word" # Whole-word matches only +git grep -n -i "pattern" -- "*.c" # Search only C files +``` + +### Trace2 + +Enable tracing to see command execution patterns: +```bash +GIT_TRACE2_EVENT=/path/to/trace.txt git +``` + +### Instrumenting Git Internals During Tests + +When adding debug output to Git's C code during test investigation, +`fprintf(stderr, ...)` from git subprocesses spawned by the test framework +is typically swallowed (redirected or discarded by the test harness). Use +Trace2 instead: + +```c +trace2_data_intmax("index", NULL, "my_debug/cache_nr", istate->cache_nr); +trace2_data_string("index", NULL, "my_debug/state", some_string); +``` + +Then run the test with `GIT_TRACE2_EVENT` or `GIT_TRACE2_PERF` pointing to +a file, and grep the output. This integrates with Git's existing tracing +infrastructure and survives the test framework's output management.
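
As a rough illustration of "point the variable to a file, then grep", the sketch below captures the event stream of a single `git status` call in a throw-away repository. The repository and the trace file name are hypothetical; only the `GIT_TRACE2_EVENT` variable comes from the text above:

```bash
# Work in a throw-away repository (path is illustrative)
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Trace2 file targets need to be absolute paths
trace="$repo/trace2-events.json"
GIT_TRACE2_EVENT="$trace" git status >/dev/null

# Every line of the file is one JSON event; filter for the ones of interest
cmd_events=$(grep -c '"event":"cmd_name"' "$trace")
grep '"event":"cmd_name"' "$trace"
```

Custom data points added with `trace2_data_intmax()` or `trace2_data_string()` should show up in the same file, so a pattern such as `my_debug/` is usually enough to find them.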
+ +As a last resort (e.g. when Trace2 is not initialized yet at the point you +need to instrument), write to a fixed file path: + +```c +FILE *f = fopen("/tmp/debug.log", "a"); +if (f) { fprintf(f, "state: %u\n", value); fclose(f); } +``` + +### Comparing Branches After Rebase + +```bash +# See what patches exist in a new branch but not old +git log --oneline old-branch..new-branch +# or +git range-diff -s --right-only old-branch...new-branch + +# Compare specific files between branches +git diff old-branch..new-branch -- path/to/file.c +# or +git log -p old-branch..new-branch -- path/to/file.c +# or even +git log -L start-line,end-line:path/to/file.c old-branch..new-branch -- + +# Find upstream changes between tags +git log --oneline --first-parent v2.52.0..v2.53.0 +``` + +### Test Failure Investigation + +1. **Reproduce with tracing**: Run test with `-ivx` flags +2. **Check timestamps**: Look at `t_abs` in trace to understand ordering +3. **Compare with working version**: Build and test the previous version +4. **Bisect if needed**: Use `git bisect` to find the breaking commit + +Bisecting failures introduced by upstream commits requires extra work, because +the downstream changes have to be applied at every bisection step. This can be +done by squashing all downstream changes into one throw-away commit and then +cherry-picking that commit at each step. Merge conflicts become more likely the +farther away from the original branch point the commit is cherry-picked, so it +often makes sense to squash both the old and the new downstream changes, and +then to "interpolate" between them when encountering merge conflicts. + +### Bisecting Failures in `seen` + +When a topic passes on its own but fails after being merged to `seen`, the +failure is caused by interaction with another in-flight topic. To identify +the culprit: + +1. Fetch the exact `seen` commit from the failing CI run (get the SHA from + the workflow run metadata via the GitHub API). +2. Use a worktree checked out at that `seen` commit. +3.
Bisect the first-parent history between `upstream/master` and `seen~1` + (excluding the topic's own merge). At each bisection step, merge the + topic in temporarily, build, run the test, then undo the merge. +4. Write a `git bisect run` script that automates this. Key pitfalls: + - The script must `unset` test environment variables (especially + `GIT_TEST_SPLIT_INDEX`) before cleanup operations like + `git checkout -f`, otherwise the worktree's own index can get + corrupted. + - Use `git checkout -f "$ORIG"` (not `git reset --hard`) to undo the + temporary merge, since `reset --hard` under split-index can corrupt. + - Save the current commit OID at the start (`ORIG=$(git rev-parse HEAD)`) + because `ORIG_HEAD` is unreliable during bisect. + - On merge conflict, return 125 (skip) and `git merge --abort`. +5. Store the alias for running with the full set of CI test variables as a + repository-local alias (to avoid repeating the long export list and to + allow the user to approve the tool call once). + +### CI/Workflow Failure Investigation + +When a CI workflow fails, the debugging process has a high cost per iteration. +Approach these failures methodically: + +**1. Establish what changed.** Before looking at the error, identify: + +- What was the last successful run? What version/commit was it based on? +- What changed between then and now? (upstream commits, downstream patches, + runner image updates, dependency changes) +- Use the GitHub API to retrieve run metadata and compare. + +**2. Analyze the error deeply.** Read the full error message and surrounding +context. Understand: + +- What command failed? +- What were its inputs (flags, environment, paths)? +- What did it expect vs. what did it get? + +**3. Trace the code flow locally.** Before making any CI changes: + +- Read the workflow YAML, Makefiles, and scripts involved. +- Understand how variables flow from one to another. +- Identify where the failing values come from. + +**4. 
Reproduce locally if possible.** Many CI failures can be reproduced +locally with faster turnaround: + +- For build failures: replicate the build environment and commands. +- For macOS issues: if you lack a Mac, at least trace the Makefile logic + to understand what flags should be set and why. +- For test failures that only appear in specific CI jobs (like + `linux-TEST-vars`): reproduce with the _exact_ set of environment + variables that job sets. Check `ci/run-build-and-tests.sh` for the + job's variable block. Do not assume a single variable (e.g. + `GIT_TEST_SPLIT_INDEX`) is sufficient; other variables may contribute + to the failure path. +- When a test fails in `seen` but not on the topic branch alone, check + out the exact `seen` commit from the failing CI run (get the SHA from + the workflow run metadata) and reproduce against that. The interaction + with other in-flight topics is the likely cause. + +**5. Do not assume CI coverage from platform support.** When asking "why +does platform X not see this bug?", verify whether CI actually tests that +combination on that platform. For example, `GIT_TEST_SPLIT_INDEX=yes` is +only set by `linux-TEST-vars`; there is no equivalent `osx-TEST-vars` or +`windows-TEST-vars` job. A bug that only manifests under split-index +testing may be present on all platforms but only caught on Linux. + +**6. Add comprehensive diagnostics on first attempt.** If you must push to +CI to test, make that push count: + +- Add diagnostic output for every hypothesis you have. +- Print the values of key variables, paths, flags. +- Show the state before and after key operations. +- Design diagnostics to distinguish between your hypotheses. + +**7. Do not remove diagnostics until the problem is solved.** Keep them in +"drop!" commits so they can be easily removed later but provide information +if subsequent fixes also fail. + +**8. When a fix fails, treat it as data.** The failure tells you something. +Your mental model was wrong.
Figure out what before trying again. + +## Git Workflow + +This repository is a shared development environment, not a sandbox. Exercise +caution with all Git operations. + +### Committing Changes + +Never use `git add -A` or `git add .` - these commands will stage untracked +build artifacts, editor swap files, and other detritus that should not be +committed. Always specify pathspecs explicitly: + +```bash +# Good: stage and commit specific files +git commit -sm "your message here" path/to/file.c other/file.h + +# Bad: stages everything, including untracked garbage +git add -A && git commit -m "message" +``` + +The `-s` flag adds a Signed-off-by trailer, which is required for this +project. + +When AI assistance is used to author or co-author a commit, add a +Co-authored-by trailer identifying the model: + +```bash +git commit -s --trailer "Co-authored-by: " -m "message" file.c +``` + +### Pushing Changes + +Never push without explicit user permission. The user controls when and +where changes are pushed. This is especially critical because: + +- The repository has multiple remotes with different purposes +- Force-pushing to the wrong remote can cause significant damage +- Tags require special handling (`git push --tags` or explicit tag pushes) + +Wait for the user to push, or ask explicitly before pushing. + +### Making Code Changes + +**Minimal, surgical changes.** Make the smallest possible change to achieve +the goal. Do not rewrite entire files or functions when a targeted edit +suffices. When removing functionality: + +1. Remove the code paths that invoke the unwanted functionality +2. Compile to identify what is now unused +3. Remove the unused functions one at a time +4. Repeat until clean + +**No fly-by changes.** Do not make changes that were not requested, even if +they seem like improvements (renaming variables, reformatting untouched code, +"fixing" things not part of the task). 
If you believe a change would be +beneficial but it was not requested, ask for permission first. + +**The human is the driver.** Execute what is asked. If you think something +should be done differently, ask---do not just do it. + +### Commit Message Quality + +Good commit messages use flowing English prose, not bullet points. They +clearly state: + +- **Context**: What situation prompted this change? Include URLs to failing + CI runs, issue numbers, or other references that future readers will need. +- **Intent**: What is this change trying to accomplish? +- **Justification**: Why is this the right approach? What alternatives were + considered? When choosing between approaches based on performance, + include measured timings so future readers understand the tradeoffs. +- **Implementation**: How does the change work? (Only for non-obvious parts; + don't describe what's clear from the diff.) + +Include exact error messages rather than vague descriptions. If a build +failed with `Undefined symbols for architecture arm64: "_iconv"`, put that +in the commit message - don't just say "fixed a linker error." + +Wrap commit messages at 76 columns per line. + +### Commit Prefixes for Rebase Workflows + +This repository uses interactive rebase with autosquash. Commit prefixes +signal intent: + +- **`fixup! `**: Will be squashed into the referenced commit + during rebase. The title after `fixup!` must match the original commit's + title exactly. +- **`drop!`**: Indicates a commit that should be dropped before the final + merge. Used for debugging, temporary workarounds, or experiments. + +To find the correct title for a fixup commit: + +```bash +git log --oneline path/to/changed/file | head -10 +``` + +Then use the exact title: + +```bash +git commit -sm "fixup! release: add Mac OSX installer build" path/to/file +``` + +## Rebasing Workflow + +Rebases are the bread and butter of Git for Windows: topic branches are +rebased every time upstream Git releases a new version. 
This section covers +the workflow for managing downstream patches through repeated rebases. + +### Merging-Rebases + +Git for Windows uses "merging-rebases" to maintain downstream patches. Unlike +a flat series of commits, the downstream changes are organized as topic +branches merged together, preserving the logical grouping of related changes. + +Each integration branch (`main`, `shears/next`, `shears/seen`) contains a +marker commit with the message "Start the merging-rebase to \". This +commit separates upstream history from downstream patches. Reference it with: + +```bash +# Find the marker commit +git log --oneline --grep="Start the merging-rebase" -1 + +# Reference it using commit message search syntax +origin/main^{/Start.the.merging-rebase} +``` + +When working with merging-rebases: + +- **Downstream patches start after the marker**: Use + `origin/main^{/Start.the.merging-rebase}..origin/main` to see all + downstream commits +- **Topic branches are merged, not rebased flat**: Each logical feature or + fix is a branch merged into the integration branch +- **Merge commits are preserved**: The rebase recreates the merge structure + on top of the new upstream base + +To compare downstream patches before and after a rebase: + +```bash +# Compare the old and new downstream patch series +git range-diff \ + old-base^{/Start.the.merging-rebase}..old-branch \ + new-base^{/Start.the.merging-rebase}..new-branch +``` + +### Starting a Merging-Rebase + +To rebase the downstream patches onto a new upstream version, create a marker +commit and use it as the base for an interactive rebase: + +```bash +# Variables for the commit message +tag=v2.53.0 +# The previous marker - this becomes the exclusion point for --onto +previousMergeOid=$(git rev-parse origin/main^{/Start.the.merging-rebase}) +tagOid=$(git rev-parse "$tag") +tipOid=$(git rev-parse origin/main) + +# Create the marker commit with two parents: the tag and the current tip +markerOid=$(git commit-tree 
"$tag^{tree}" -p "$tag" -p "$tipOid" -m "Start the merging-rebase to $tag + +This commit starts the rebase of $previousMergeOid to $tagOid") + +# Graft the marker to appear as if it has only the tag as parent +git replace --graft "$markerOid" "$tag" + +# Use the marker as the base for rebasing (only commits after previousMergeOid) +git rebase -r --onto "$markerOid" "$previousMergeOid" origin/main + +# After the rebase completes, delete the replace ref +git replace -d "$markerOid" +``` + +The marker commit is created with two parents: the upstream tag and the +current branch tip. The `git replace --graft` makes Git see only the tag as +parent during the rebase, allowing the downstream commits to be cleanly +rebased onto the new upstream. After the rebase completes, the replace ref +is deleted to clean up. + +#### The shears/* Branches + +Upstream Git has four integration branches: `seen`, `next`, `master`, and +`maint`. Git for Windows maintains a corresponding `shears/*` branch for each +(`shears/seen`, `shears/next`, `shears/master`, `shears/maint`) that +continuously rebases Git for Windows' `main` onto the respective upstream +branch. + +These branches are updated incrementally rather than from scratch, avoiding +re-resolution of merge conflicts. The update process leverages reachability: + +1. **Integrate new downstream commits**: If `origin/main` has commits not yet + in the shears branch, rebase them on top (using `-r` to preserve branch + structure). Update the marker commit's message and second parent. + +2. **Integrate new upstream commits**: If the upstream branch has commits not + yet integrated, rebase onto the new upstream tip. Update the marker commit + accordingly. + +The marker commit's second parent always points to the current `origin/main` +tip, making it trivial to identify what downstream commits are included. +Similarly, the marker's first parent (the upstream base) shows exactly which +upstream version is integrated. 
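
The role of the marker commit's two parents can be checked with a small experiment. This is only a sketch in a throw-away repository: the tag `v1.0` stands in for an upstream version, the empty commits stand in for upstream and downstream history, and the identity values are hypothetical:

```bash
# Build a tiny stand-in history (all names are illustrative)
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git commit -q --allow-empty -m "upstream release"
git tag v1.0                      # plays the role of the upstream base
git commit -q --allow-empty -m "downstream patch"
tipOid=$(git rev-parse HEAD)      # plays the role of the origin/main tip

# Marker commit: first parent is the upstream base, second parent the tip
markerOid=$(git commit-tree 'v1.0^{tree}' -p v1.0 -p "$tipOid" \
	-m "Start the merging-rebase to v1.0")

# Read the two parents back
upstreamBase=$(git rev-parse "$markerOid^1")
downstreamTip=$(git rev-parse "$markerOid^2")
echo "upstream base:  $upstreamBase"
echo "downstream tip: $downstreamTip"
```

In a real checkout, the same `^1` and `^2` suffixes can presumably be appended to the commit message search syntax shown earlier (for example `shears/next^{/Start.the.merging-rebase}^2`) to ask which downstream tip a shears branch already integrates.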
+ +### When to Skip a Patch + +Use `git rebase --skip` when the patch is already in the new base: + +- **Upstreamed**: The patch was accepted upstream and is now in `seen` +- **Backported**: A fix we backported is now included in the upstream base +- **Superseded**: HEAD already contains evolved code that includes this + change + +Signs to skip rather than resolve: HEAD has the functionality, the +conflict would discard the patch entirely, or `git range-diff` shows +the downstream and upstream patches are equivalent. + +To find the corresponding upstream commit for a conflicting patch: + +```bash +git range-diff --left-only REBASE_HEAD^! REBASE_HEAD.. +``` + +### Resolving Merge Conflicts + +When resolving merge conflicts during a rebase (especially when squashing +fixups), the goal is to **apply the minimal surgical change** that the +patch intended, not to reconstruct entire functions or add duplicate code. + +#### 1. Understand What the Patch Wants + +First, examine the patch being applied: + +```bash +git show REBASE_HEAD +``` + +Look at the actual changes (lines starting with `-` and `+`): +- What lines are being removed? +- What lines are being added? +- What is the context (function name, nearby code)? + +**Key insight**: The patch shows the *intent*---a specific small change to +make. Focus on this, not on the conflict markers' content. + +**Code movement detection**: If the patch shows large changes, check with +`--ignore-space-change`: + +```bash +git show --ignore-space-change +``` + +This reveals whether the commit is primarily **moving code** (lots of +whitespace changes) or making **logic changes** (actual code modifications). +When code was moved and re-indented, focus only on the non-whitespace +changes when resolving the conflict. + +#### 2. Understand Where the Code Is Now + +The conflict occurred because the code moved or changed since the patch was +created. 
Find where that code actually exists now: + +```bash +# If the patch was changing a specific pattern, find all occurrences +git grep -n "pattern from patch" + +# View the conflicted file around those locations +``` + +**Common mistake**: Assuming the conflict markers show you what to do. They +do not---they just show where Git got confused. + +#### 3. Apply the Surgical Change + +Make **only** the change the patch intended, but in the current location: + +- If the patch adds `--abbrev=12` to a range-diff call, find where that + range-diff call is NOW and add it there +- If the patch changes a `.split()` pattern, find where that pattern is NOW + and change it +- Do not copy entire functions from the conflict markers +- Do not create duplicates + +#### 4. Remove ALL Conflict Markers + +Conflict markers make the file invalid code: +``` +<<<<<<< HEAD +======= +>>>>>>> commit-hash +``` + +**All three types of markers must be completely removed.** + +#### 5. Verify the Resolution + +**Critical**: After staging your resolution, verify it matches the patch +intent: + +```bash +# Compare your staged changes to the original patch +git diff --cached +git rebase --show-current-patch + +# Or more directly, compare to REBASE_HEAD +git diff --cached +git show REBASE_HEAD + +# For code that was moved/re-indented, ignore whitespace +git diff --cached --ignore-space-change +git show REBASE_HEAD --ignore-space-change +``` + +**Verify, verify, verify**: The output of `git diff --cached` should +correspond closely to the diff in `git show REBASE_HEAD`. The line numbers +and context will differ (because code moved), but the actual changes (the +`-` and `+` lines) should match the patch intent. + +**After completing a rebase**, always verify the final result: + +```bash +# Compare tree before and after rebase +git diff @{1} + +# Shows what changed in each rebased commit +git range-diff @{1}... 
+``` + +If the rebase was onto the same base commit (e.g., squashing fixups), the +`git diff @{1}` should be empty---this proves the rebase only reorganized +commits without changing the end result. If the rebase was onto a new base +commit (e.g., rebasing onto a new upstream release), the diff should match +the difference between the old and new base commits, modulo any changes +from upstreamed or backported patches. The `git range-diff @{1}...` shows +the intended amendments (like adding `--abbrev=12`) were correctly applied +to each commit. + +### Conflict Resolution Red Flags + +These indicate you are doing it wrong: + +- Your diff adds hundreds of lines when the patch only changed 3 +- Conflict markers remain in the file +- Functions appear twice in the file +- You added `<<<<<<< HEAD` or `=======` to the staged changes +- Syntax check fails after resolution + +### Key Conflict Resolution Lessons + +1. **Context changes, intent does not** - The patch's line numbers are + wrong, but the change is right +2. **Conflict markers lie** - They show you where Git got confused, not + what you should do +3. **One change at a time** - If the patch adds one line, your resolution + should add one line +4. **Verify, verify, verify** - `git diff --cached` should match + `git show REBASE_HEAD` (modulo context) +5. **Post-rebase verification** - `git diff @{1}` (empty) and + `git range-diff @{1}...` (shows amendments) +6. **Ignore whitespace for code moves** - Use `--ignore-space-change` to + see the actual logic changes when code was moved and re-indented +7. 
**When in doubt, look at the range-diff** - `git range-diff` shows if + you matched the intent + +### Useful Rebase Tools + +- `git rebase --show-current-patch` - See what change is being applied +- `git show REBASE_HEAD` - Alternative to above, works better with + `--ignore-space-change` +- `git show --ignore-space-change` - See only logic changes, not + whitespace/indentation +- `git grep -n "pattern"` - Find where code moved to +- `git log -L ,: REBASE_HEAD..HEAD` - See how upstream + modified a line range since the original patch; invaluable for + understanding how conflicting lines changed +- `git diff --cached` - After staging resolution, verify it matches + REBASE_HEAD +- `git diff @{1}` - After rebase, compare tree before/after +- `git range-diff @{1}...` - After rebase, verify intended changes were made +- `git range-diff A^! B^!` - Compare original patch to your resolution + +### Leveraging Rerere + +Git's "reuse recorded resolution" (`rerere`) feature automatically records +how you resolve conflicts and replays those resolutions when the same +conflict recurs. This is invaluable for repeated rebases where the same +downstream patches conflict with similar upstream changes. + +When you see `Staged 'file' using previous resolution`, Git has applied a +previously recorded resolution. Always verify these auto-resolutions are +still correct---upstream context may have changed enough that the old +resolution no longer applies cleanly. + +To enable rerere: +```bash +git config --global rerere.enabled true +``` + +### Automation Tips + +When running rebases in automated or scripted contexts, disable the pager +to avoid hangs: + +```bash +GIT_PAGER=cat git range-diff ... +# or +git --no-pager log ... +``` + +### Non-interactive "Interactive" Rebases + +AI agents cannot drive interactive editors reliably. 
Instead, insert a +`break` as the first todo command so the rebase stops immediately, then +edit the todo file directly: + +```bash +# Start the rebase, stopping before any picks execute +GIT_SEQUENCE_EDITOR='sed -i 1ib' git rebase -ir + +# Find and edit the todo file with the view/edit tools +git rev-parse --git-path rebase-merge/git-rebase-todo + +# After editing the todo, continue (GIT_EDITOR=true suppresses the +# editor that fixup -C and amend! commands would otherwise open) +GIT_EDITOR=true git rebase --continue +``` + +### Scripted Hunk Staging + +`git add -p` is interactive by default, but its prompts follow a +predictable protocol. To stage the first hunk of a file without +human interaction: + +```bash +printf '%s\n' s y q | git add -p +``` + +The `s` splits a large hunk, `y` stages the first sub-hunk, and `q` +quits. Adjust the sequence for different hunk selections (e.g., +`y y n q` to stage the first two hunks but skip the third). + +### Finding Which Commit to Amend + +When a working-tree change belongs in an earlier commit (an `hg absorb` +workflow), use `git log -L` to find which commit last touched the +relevant lines: + +```bash +git log -L ,+: +``` + +This shows the full history of a line range, making it easy to identify +the commit whose title you need for a `fixup!` commit. This is far more +surgical than grepping through full diffs. + +### Fixup Commits + +Downstream patches sometimes require adjustment due to changes in the +environment they operate in. These changes may come from: + +- **Upstream code changes**: API modifications, struct field moves, + declarations relocating between headers, or semantic changes in functions + that downstream code depends on. +- **External environment changes**: CI runner image updates, toolchain + upgrades, dependency version changes, or platform behavior shifts. + +In both cases, create a `fixup!` commit that will be squashed into the +original downstream patch during the next interactive rebase. 
The commit +message body must precisely document the change that necessitated the fix: + +- For upstream changes: reference the specific upstream commit (by OID or + title) and explain what it changed. +- For external changes: include URLs to failing CI runs, document what + changed in the environment (e.g., "GitHub Actions macos-latest runner + upgraded from macOS 14 to macOS 15"), and note the exact error message. + +This documentation is essential because the fixup will be squashed away, +and the context will be lost if not recorded in the commit message that +gets squashed into. + +Run affected tests before finalizing. + +### `amend!` Commits + +A `fixup!` commit keeps the target's commit message and merely combines +its diff into the target. An `amend!` commit additionally **replaces** +the target's commit message with its own body. Use `amend!` when the +fix changes the meaning of the target sufficiently that the original +subject or body is no longer accurate, or when the goal is to align a +downstream commit with a specific upstream replacement. + +The format is rigid: the first line of an `amend!` commit must be +exactly `amend! `, followed by a blank line and then +the **new** commit message that should replace the target's, starting +with the new subject line: + +``` +amend! mingw: use mimalloc + +mingw: stop using nedmalloc + +The vendored nedmalloc allocator under compat/nedmalloc/ has been +unmaintained upstream... +``` + +After autosquash, the resulting commit has the new subject (`mingw: +stop using nedmalloc`), the new body, and a diff that is the +composition of the target's diff and the `amend!`'s diff. Crafting the +`amend!` diff so that the composition equals a known upstream commit's +diff is the canonical way to align a downstream branch-thicket commit +with an in-flight upstream replacement: when the next merging-rebase +picks up the upstream commit, the byte-identical downstream commit +collapses into it cleanly. 
+ +### PRs Composed Entirely of `fixup!` and `amend!` Commits + +Adjusting or removing a feature that lives in the branch thicket is +often best expressed as a PR that consists *only* of `fixup!` and +`amend!` commits targeting the existing thicket commits. Each pair +autosquashes during the next merging-rebase. Pairs whose diffs cancel +exactly produce empty commits, which the rebase drops with +`--empty=drop`. The end state is *as if the original commits had been +edited or removed in place*, while preserving review-friendly atomic +patches in the PR. + +This is the preferred pattern for reverting a multi-commit downstream +feature. Order the fixups in **reverse** of the originals so each +revert applies cleanly to the worktree as you build the series. + +### Common Adaptation Patterns + +**Struct field moves**: When upstream moves fields between structs, update +all downstream code that accesses those fields. + +**API changes**: When upstream changes function signatures, update callers +and verify semantics are preserved. + +**New abstractions**: When upstream introduces new layers, ensure downstream +code uses the correct instance. + +## Coding Conventions + +The Git project maintains a charmingly old-school, Unix-greybeard aesthetic +when it comes to text encoding. In the spirit of the PDP-11 and Bell Labs +terminal sessions of yore: + +- **ASCII only**: Avoid Unicode characters in source code, comments, and + documentation. Use `->` instead of `→`, `--` instead of `—`, and so on. + To verify your changes contain no non-ASCII characters: + ``` + git diff | LC_ALL=C grep '[^ -~]' + ``` +- **80 columns per line**: The mailing list veterans will "kindly" remind you + that lines should not exceed 80 characters (they do mean columns, but + let's not split beards or hairs about wide glyphs). 
+ First, check for whitespace errors (trailing whitespace, mid-line tabs, etc.): + ``` + git diff --check + ``` + Once that passes, you know tabs only appear at line beginnings, so each + tab equals exactly 8 columns. To find lines exceeding 80 columns: + ``` + git diff --no-color | grep '^+' | sed 's/\t/ /g' | grep '.\{82\}' + ``` + (We use 82 because diff output prefixes added lines with `+`.) +- **Tabs for indentation**: The codebase uses tabs, not spaces. +- **No trailing whitespace**: Clean up your lines. + +**Pre-commit checklist.** Run all three checks before every commit: + +```bash +git diff --check && +git diff --no-color | LC_ALL=C grep '[^ -~]' && + echo "ERROR: non-ASCII characters found" && +git diff --no-color | grep '^+' | sed 's/\t/ /g' | + grep '.\{82\}' && + echo "ERROR: lines exceed 80 columns" +``` + +The first command catches whitespace errors. If either of the latter +two produces output, fix the offending lines before committing. Note +that these checks apply to commit messages as well (wrap at 76 columns +for messages, 80 for code). + +See `Documentation/CodingGuidelines` for the full set of conventions. + +### strbuf patterns + +Use `strbuf_addf()` with string continuation for multi-line content instead +of multiple `strbuf_addstr()` calls: + +```c +/* Good */ +strbuf_addf(&buf, + "tree %s\n" + "author %s\n" + "committer %s\n" + "\ncommit message\n", + tree_hex, author, committer); + +/* Avoid */ +strbuf_addstr(&buf, "tree "); +strbuf_addstr(&buf, tree_hex); +strbuf_addstr(&buf, "\nauthor "); +/* ... */ +``` + +Choose descriptive variable names (`header` for pack headers, not generic +`buf`; use `buf` for the secondary strbuf if you cannot reuse the first). + +## Platform Considerations + +### Windows-specific issues + +On Windows, `unsigned long` is 32 bits even on 64-bit systems. Use `size_t` +for sizes that may exceed 4GB. Be careful with format strings: use `PRIuMAX` +with a cast for `size_t` values. 
+ +## Contributing to Git for Windows + +The primary contribution path for this fork is a PR against +`git-for-windows/git`'s `main` branch. The repository is laid out as a +branch thicket on top of an upstream Git base; see +[Merging-Rebases](#merging-rebases) and +[Analyzing Branch Thickets](#analyzing-branch-thickets) for the +mechanics. + +### Opening a PR + +Push the topic branch to a personal fork on GitHub, then: + +```bash +gh pr create \ + --repo git-for-windows/git \ + --base main \ + --head : \ + --title "" \ + --body-file +``` + +Unlike upstream contributions, the PR body is rendered as Markdown on +GitHub, not sent as email. Use the formatting that aids review: +fenced code blocks, tables, links to workflow runs. + +### When the PR Adjusts the Thicket Itself + +If the PR's purpose is to edit, remove, or replace existing +branch-thicket commits, the natural form is a series of `fixup!` or +`amend!` commits targeting the affected originals. See +[Fixup Commits](#fixup-commits), +[`amend!` Commits](#amend-commits), and +[PRs Composed Entirely of `fixup!` and `amend!` Commits](#prs-composed-entirely-of-fixup-and-amend-commits). +The merging-rebase that produces the next `main` autosquashes these +into the thicket; the PR exists for review of the individual +adjustments. + +### When an Upstream Patch Will Replace a Thicket Commit + +If an upstream patch is in flight (for instance, on `gitgitgadget/git` +in `seen` or `next`) that replaces a downstream thicket commit, an +`amend!` commit whose body is a verbatim copy of the upstream commit +message and whose diff aligns the autosquashed target with the +upstream commit's diff is the canonical pattern. The next +merging-rebase that picks up the upstream commit will recognize the +two as byte-identical and collapse them. + +## Contributing to Upstream Git via GitGitGadget + +### Overview + +The upstream Git project accepts contributions via the mailing list +(`git@vger.kernel.org`). 
[GitGitGadget](https://gitgitgadget.github.io/) +bridges GitHub PRs to the mailing list: you push a branch to your GitHub +fork, open a PR against https://github.com/gitgitgadget/git, and +GitGitGadget formats and sends the patches. + +### Workflow + +1. Push the topic branch to your personal fork on GitHub (the remote + that points at `https://github.com//git`). +2. Open a PR from `:` against `gitgitgadget/git`'s `master`. +3. The PR title becomes the patch series subject; the PR body becomes the + cover letter. Use + `gh pr create --repo gitgitgadget/git --head :`. +4. Use `/submit` as a PR comment to send patches to the mailing list. +5. After review feedback, update the branch, force-push, and `/submit` again. + +### Branch Naming + +Do **not** use an initials prefix (like `ds/` or `js/`). That convention is +used by the Git maintainer when picking up topics, not by contributors. Use +descriptive names like `tests-explicit-bare-repo`. + +### Cover Letter Style + +The PR body is the cover letter. It should be plain text (not Markdown with +headers or bullet formatting), since it will be sent as email. Structure: + +- A brief subject line (the PR title, e.g. "tests: access bare repositories + explicitly") +- Motivation: why is this change needed? +- Summary: what does the series do? What patterns/techniques does it use? +- Scope: is this part of a larger effort? If so, link to the tracking PR. + +Keep it factual and measured. Avoid framing changes in terms of security +when contributing to upstream Git; frame them as robustness, correctness, +or preparation for future defaults. + +### Commit Message Conventions (Upstream Git) + +Upstream Git commit messages follow stricter conventions than the Microsoft +Git fork: + +- **Subject line**: `: ` (lowercase after the colon). + The `` is typically a file name without extension (e.g. `t0001`, + `setup`, `scalar`) or a subsystem name (e.g. `tests`, `refs`). +- **Body**: Flowing English prose, no bullet points. 
Wrap at 76 columns. +- **ASCII only**: No Unicode characters anywhere in the message. +- **Trailers**: `Signed-off-by` is mandatory. `Assisted-by` for AI. +- The subject line must accurately describe the diff content. If a commit + adds `--git-dir=.` to one invocation, do not title it "wrap bare repo + commands in subshell with `GIT_DIR`". + +### Patch Series with Dependencies + +When contributing a branch thicket (multiple related patch series with +dependencies), submit the foundation series first and note the overall +effort in the cover letter with a link to the tracking PR or `compare` +URL. Submit dependent series after earlier ones land in `seen`. + +Use `git replay --onto ..` to test whether a +sub-branch applies cleanly to a given base (e.g., `upstream/master` or +`upstream/seen`) without touching the working tree. By default (since +the `--ref-action` default changed to `update`), `git replay` updates +named refs in the range directly, producing no stdout output. Use +`--ref-action=print` to get the old behavior of printing `update-ref` +commands to stdout instead. Always verify that `git replay` actually +did something by checking the reflog of the affected branches. + +## Working with Worktrees + +### General Principles + +Use worktrees to work on multiple topics simultaneously without stashing +or switching branches. Keep worktrees as subdirectories of the main +repository and add them to `.git/info/exclude` so they do not show up +as untracked files. + +```bash +git worktree add +echo "" >> .git/info/exclude +``` + +### Rewriting Commits with `--update-refs` + +When rewriting history in a worktree (e.g., fixing a commit message via +`amend!` + autosquash), use `--update-refs` so that other local branches +pointing into the rewritten range are updated automatically: + +```bash +# Create a local branch at the commit to be pushed +git branch + +# Create the amend! 
commit and autosquash +git commit --allow-empty -F +GIT_SEQUENCE_EDITOR=true GIT_EDITOR=true \ + git rebase -i --autosquash --update-refs + +# Verify: tree should be identical +git diff @{1}.. + +# Force-push the updated branch +git push --force-with-lease +``` + +The `--update-refs` flag is essential: without it, only the checked-out +branch is rewritten and other branches become stale, pointing at +pre-rewrite commits. + +### Verifying Rebase Results + +After any rebase, verify that the tree content is unchanged (unless you +intentionally modified it): + +```bash +git diff @{1} # Should be empty for pure rewording +git range-diff @{1}... # Shows per-commit changes +``` + +## Analyzing Branch Thickets + +When a branch is structured as a sequence of merged sub-branches (a +"branch thicket"), use the merge structure to extract sub-branches: + +```bash +# List the merge commits (sub-branches) +git log --oneline --first-parent ...upstream/master | grep 'Merge branch' + +# Extract commits for a specific sub-branch (second parent of its merge) +git log --oneline ^1..^2 + +# Find what each sub-branch forks from +git log -1 --format='%H %s' ^ +``` + +Use `git replay` to test whether sub-branches can be rebased onto a new +base without conflicts. This replaces speculation about "overlapping files" +with actual evidence: + +```bash +git replay --onto upstream/master .. +``` + +If the range contains merge commits, `git replay` will fail with "replaying +merge commits is not supported yet!" In that case, identify the linear +commit range and replay just those commits. 
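
The merge-structure queries above can be exercised on a tiny synthetic thicket. This is a sketch under stated assumptions: the branch name `topic`, the commit subjects, and the use of empty commits are made up for illustration, and `git merge-base` is used here as one way to find the fork point rather than the parent-of-first-commit form shown earlier.

```shell
#!/usr/bin/env bash
set -e

# Throwaway repository with one merged sub-branch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

git commit -q --allow-empty -m "base"
base_branch=$(git symbolic-ref --short HEAD)

# Hypothetical sub-branch with two commits.
git checkout -q -b topic
git commit -q --allow-empty -m "topic: first"
git commit -q --allow-empty -m "topic: second"

# Merge it back with an explicit merge commit, as in a branch thicket.
git checkout -q "$base_branch"
git merge -q --no-ff -m "Merge branch 'topic'" topic
merge=$(git rev-parse HEAD)

# List the merges along the first-parent chain.
git log --oneline --first-parent --merges

# Commits that belong to the sub-branch: reachable from the merge's
# second parent but not from its first parent.
git log --oneline "$merge^1..$merge^2"
sub_count=$(git rev-list --count "$merge^1..$merge^2")

# Where the sub-branch forked from.
fork_subject=$(git log -1 --format=%s "$(git merge-base "$merge^1" "$merge^2")")
echo "sub-branch commits: $sub_count, forked from: $fork_subject"
```

The same `^1..^2` range is what one would hand to `git replay --onto` when testing whether a sub-branch applies to a new base, provided the range is linear.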
+ +## Resources + +- [Git for Windows](https://gitforwindows.org/) +- [Git Internals](https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain) +- [GitGitGadget](https://gitgitgadget.github.io/) - Bridge GitHub PRs to + the Git mailing list +- [Git Mailing List Archive](https://lore.kernel.org/git/) - Searchable + archive of all upstream discussion From ba76b109db64864051bc1eabe632132876711237 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 11 Oct 2019 13:22:24 +0200 Subject: [PATCH 213/218] Modify the Code of Conduct for Git for Windows The Git project followed Git for Windows' lead and added their Code of Conduct, based on the Contributor Covenant v1.4, later updated to v2.0. We adapt it slightly to Git for Windows. Signed-off-by: Johannes Schindelin --- CODE_OF_CONDUCT.md | 58 +++++++++++++++++++++------------------------- 1 file changed, 26 insertions(+), 32 deletions(-) diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md index e58917c50a96dc..4daef7e3ce9196 100644 --- a/CODE_OF_CONDUCT.md +++ b/CODE_OF_CONDUCT.md @@ -1,9 +1,9 @@ -# Git Code of Conduct +# Git for Windows Code of Conduct This code of conduct outlines our expectations for participants within -the Git community, as well as steps for reporting unacceptable behavior. -We are committed to providing a welcoming and inspiring community for -all and expect our code of conduct to be honored. Anyone who violates +the **Git for Windows** community, as well as steps for reporting unacceptable +behavior. We are committed to providing a welcoming and inspiring community +for all and expect our code of conduct to be honored. Anyone who violates this code of conduct may be banned from the community. 
## Our Pledge @@ -12,8 +12,8 @@ We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, -nationality, personal appearance, race, religion, or sexual identity -and orientation. +nationality, personal appearance, race, caste, color, religion, or sexual +identity and orientation. We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community. @@ -28,17 +28,17 @@ community include: * Giving and gracefully accepting constructive feedback * Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience -* Focusing on what is best not just for us as individuals, but for the - overall community +* Focusing on what is best not just for us as individuals, but for the overall + community Examples of unacceptable behavior include: -* The use of sexualized language or imagery, and sexual attention or - advances of any kind +* The use of sexualized language or imagery, and sexual attention or advances of + any kind * Trolling, insulting or derogatory comments, and personal or political attacks * Public or private harassment -* Publishing others' private information, such as a physical or email - address, without their explicit permission +* Publishing others' private information, such as a physical or email address, + without their explicit permission * Other conduct which could reasonably be considered inappropriate in a professional setting @@ -58,20 +58,14 @@ decisions when appropriate. This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. 
-Examples of representing our community include using an official e-mail address, +Examples of representing our community include using an official email address, posting via an official social media account, or acting as an appointed representative at an online or offline event. ## Enforcement Instances of abusive, harassing, or otherwise unacceptable behavior may be -reported to the community leaders responsible for enforcement at -git@sfconservancy.org, or individually: - - - Ævar Arnfjörð Bjarmason - - Christian Couder - - Junio C Hamano - - Taylor Blau +reported by contacting the Git for Windows maintainer. All complaints will be reviewed and investigated promptly and fairly. @@ -94,15 +88,15 @@ behavior was inappropriate. A public apology may be requested. ### 2. Warning -**Community Impact**: A violation through a single incident or series -of actions. +**Community Impact**: A violation through a single incident or series of +actions. **Consequence**: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels -like social media. Violating these terms may lead to a temporary or -permanent ban. +like social media. Violating these terms may lead to a temporary or permanent +ban. ### 3. Temporary Ban @@ -118,27 +112,27 @@ Violating these terms may lead to a permanent ban. ### 4. Permanent Ban **Community Impact**: Demonstrating a pattern of violation of community -standards, including sustained inappropriate behavior, harassment of an +standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals. -**Consequence**: A permanent ban from any sort of public interaction within -the community. 
+**Consequence**: A permanent ban from any sort of public interaction within the +community. ## Attribution This Code of Conduct is adapted from the [Contributor Covenant][homepage], -version 2.0, available at -[https://www.contributor-covenant.org/version/2/0/code_of_conduct.html][v2.0]. +version 2.1, available at +[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1]. Community Impact Guidelines were inspired by [Mozilla's code of conduct enforcement ladder][Mozilla CoC]. For answers to common questions about this code of conduct, see the FAQ at -[https://www.contributor-covenant.org/faq][FAQ]. Translations are available -at [https://www.contributor-covenant.org/translations][translations]. +[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at +[https://www.contributor-covenant.org/translations][translations]. [homepage]: https://www.contributor-covenant.org -[v2.0]: https://www.contributor-covenant.org/version/2/0/code_of_conduct.html +[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html [Mozilla CoC]: https://github.com/mozilla/diversity [FAQ]: https://www.contributor-covenant.org/faq [translations]: https://www.contributor-covenant.org/translations From d982e793be1bd5222d38146ae1d66e1076c97fe1 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Thu, 1 Mar 2018 12:10:14 -0500 Subject: [PATCH 214/218] CONTRIBUTING.md: add guide for first-time contributors Getting started contributing to Git can be difficult on a Windows machine. CONTRIBUTING.md contains a guide to getting started, including detailed steps for setting up build tools, running tests, and submitting patches to upstream. [includes an example by Pratik Karki how to submit v2, v3, v4, etc.] 
Signed-off-by: Derrick Stolee --- CONTRIBUTING.md | 417 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 417 insertions(+) create mode 100644 CONTRIBUTING.md diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 00000000000000..48ff9029374df3 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,417 @@ +How to Contribute to Git for Windows +==================================== + +Git was originally designed for Unix systems and still today, all the build tools for the Git +codebase assume you have standard Unix tools available in your path. If you have an open-source +mindset and want to start contributing to Git, but primarily use a Windows machine, then you may +have trouble getting started. This guide is for you. + +Get the Source +-------------- + +Clone the [GitForWindows repository on GitHub](https://github.com/git-for-windows/git). +It is helpful to create your own fork for storing your development branches. + +Windows uses different line endings than Unix systems. See +[this GitHub article on working with line endings](https://help.github.com/articles/dealing-with-line-endings/#refreshing-a-repository-after-changing-line-endings) +if you have trouble with line endings. + +Build the Source +---------------- + +First, download and install the latest [Git for Windows SDK (64-bit)](https://github.com/git-for-windows/build-extra/releases/latest). +When complete, you can run the Git SDK, which creates a new Git Bash terminal window with +the additional development commands, such as `make`. + + As of time of writing, the SDK uses a different credential manager, so you may still want to use normal Git + Bash for interacting with your remotes. Alternatively, use SSH rather than HTTPS and + avoid credential manager problems. + +You should now be ready to type `make` from the root of your `git` source directory. +Here are some helpful variations: + +* `make -j[N] DEVELOPER=1`: Compile new sources using up to N concurrent processes. 
+ The `DEVELOPER` flag turns on all warnings; code failing these warnings will not be + accepted upstream ("upstream" = "the core Git project"). +* `make clean`: Delete all compiled files. + +When running `make`, you can use `-j$(nproc)` to automatically use the number of processors +on your machine as the number of concurrent build processes. + +You can go deeper on the Windows-specific build process by reading the +[technical overview](https://gitforwindows.org/technical-overview) or the +[guide to compiling Git with Visual Studio](https://gitforwindows.org/compiling-git-with-visual-studio). + +## Building `git` on Windows with Visual Studio + +The typical approach to building `git` is to use the standard `Makefile` with GCC, as +above. Developers working in a Windows environment may want to instead build with the +[Microsoft Visual C++ compiler and libraries toolset (MSVC)](https://blogs.msdn.microsoft.com/vcblog/2017/03/07/msvc-the-best-choice-for-windows/). +There are a few benefits to using MSVC over GCC during your development, including creating +symbols for debugging and [performance tracing](https://github.com/Microsoft/perfview#perfview-overview). + +There are two ways to build Git for Windows using MSVC. Each has its own merits. + +### Using SDK Command Line + +Use one of the following commands from the SDK Bash window to build Git for Windows: + +``` + make MSVC=1 -j12 + make MSVC=1 DEBUG=1 -j12 +``` + +The first form produces release-mode binaries; the second produces debug-mode binaries. +Both forms produce PDB files and can be debugged. However, the first is best for perf +tracing and the second is best for single-stepping. + +You can then open Visual Studio and select File -> Open -> Project/Solution and select +the compiled `git.exe` file. This creates a basic solution and you can use the debugging +and performance tracing tools in Visual Studio to monitor a Git process.
Use the Debug +Properties page to set the working directory and command line arguments. + +Be sure to clean up before switching back to GCC (or to switch between debug and +release MSVC builds): + +``` + make MSVC=1 -j12 clean + make MSVC=1 DEBUG=1 -j12 clean +``` + +### Using the IDE + +If you prefer working in Visual Studio with a solution full of projects, then you can use +CMake, either by letting Visual Studio configure it automatically (simply open Git's +top-level directory via `File>Open>Folder...`) or by (downloading and) running +[CMake](https://cmake.org) manually. + +What to Change? +--------------- + +Many new contributors ask: What should I start working on? + +One way to win big with the open-source community is to look at the +[issues page](https://github.com/git-for-windows/git/issues) and see if there are any issues that +you can fix quickly, or if anything catches your eye. + +You can also look at [the unofficial Chromium issues page](https://crbug.com/git) for +multi-platform issues. You can look at recent user questions on +[the Git mailing list](https://public-inbox.org/git). + +Or you can "scratch your own itch", i.e. address an issue you have with Git. The team at Microsoft where the Git for Windows maintainer works, for example, is focused almost entirely on [improving performance](https://blogs.msdn.microsoft.com/devops/2018/01/11/microsofts-performance-contributions-to-git-in-2017/). +We approach our work by finding something that is slow and try to speed it up. We start our +investigation by reliably reproducing the slow behavior, then running that example using +the MSVC build and tracing the results in PerfView. + +You could also think of something you wish Git could do, and make it do that thing! The +only concern I would have with this approach is whether or not that feature is something +the community also wants. If this excites you though, go for it! 
Don't be afraid to +[get involved in the mailing list](http://vger.kernel.org/vger-lists.html#git) early for +feedback on the idea. + +Test Your Changes +----------------- + +After you make your changes, it is important that you test your changes. Manual testing is +important, but checking and extending the existing test suite is even more important. You +want to run the functional tests to see if you broke something else during your change, and +you want to extend the functional tests to be sure no one breaks your feature in the future. + +### Functional Tests + +Navigate to the `t/` directory and type `make` to run all tests or use `prove` as +[described on this Git for Windows page](https://gitforwindows.org/building-git): + +``` +prove -j12 --state=failed,save ./t[0-9]*.sh +``` + +You can also run each test directly by running the corresponding shell script with a name +like `tNNNN-descriptor.sh`. + +If you are adding new functionality, you may need to create unit tests by creating +helper commands that test a very limited action. These commands are stored in `t/helper`. +When adding a helper, be sure to add a line to `t/Makefile` and to the `.gitignore` for the +binary file you add. The Git community prefers functional tests using the full `git` +executable, so try to exercise your new code using `git` commands before creating a test +helper. + +To find out why a test failed, repeat the test with the `-x -v -d -i` options and then +navigate to the appropriate "trash" directory to see the data shape that was used for the +failed test step. + +Read [`t/README`](t/README) for more details. + +### Performance Tests + +If you are working on improving performance, you will need to be acquainted with the +performance tests in `t/perf`. There are not too many performance tests yet, but adding one +as your first commit in a patch series helps to communicate the boost your change provides.
+ +To check the change in performance across multiple versions of `git`, you can use the +`t/perf/run` script. For example, to compare the performance of `git rev-list` across the +`core/master` and `core/next` branches compared to a `topic` branch, you can run + +``` +cd t/perf +./run core/master core/next topic -- p0001-rev-list.sh +``` + +You can also set certain environment variables to help test the performance on different +repositories or with more repetitions. The full list is available in +[the `t/perf/README` file](t/perf/README), +but here are a few important ones: + +``` +GIT_PERF_REPO=/path/to/repo +GIT_PERF_LARGE_REPO=/path/to/large/repo +GIT_PERF_REPEAT_COUNT=10 +``` + +When running the performance tests on Linux, you may see a message "Can't locate JSON.pm in +@INC" and that means you need to run `sudo cpanm install JSON` to get the JSON perl package. + +For running performance tests, it can be helpful to set up a few repositories with strange +data shapes, such as: + +**Many objects:** Clone repos such as [Kotlin](https://github.com/jetbrains/kotlin), [Linux](https://github.com/torvalds/linux), or [Android](https://source.android.com/setup/downloading). + +**Many pack-files:** You can split a fresh clone into multiple pack-files of size at most +16MB by running `git repack -adfF --max-pack-size=16m`. See the +[`git repack` documentation](https://git-scm.com/docs/git-repack) for more information. +You can count the number of pack-files using `ls .git/objects/pack/*.pack | wc -l`. + +**Many loose objects:** If you already split your repository into multiple pack-files, then +you can pick one to split into loose objects using `cat .git/objects/pack/[id].pack | git unpack-objects`; +delete the `[id].pack` and `[id].idx` files after this. You can count the number of loose +objects using `ls .git/objects/??/* | wc -l`.
+ +**Deep history:** Usually large repositories also have deep histories, but you can use the +[test-many-commits-1m repo](https://github.com/cirosantilli/test-many-commits-1m/) to +target deep histories without the overhead of many objects. One issue with this repository: +there are no merge commits, so you will need to use a different repository to test a "wide" +commit history. + +**Large Index:** You can generate a large index and repo by using the scripts in +`t/perf/repos`. There are two scripts. `many-files.sh` which will generate a repo with +same tree and blobs but different paths. Using `many-files.sh -d 5 -w 10 -f 9` will create +a repo with ~1 million entries in the index. `inflate-repo.sh` will use an existing repo +and copy the current work tree until it is a specified size. + +Test Your Changes on Linux +-------------------------- + +It can be important to work directly on the [core Git codebase](https://github.com/git/git), +such as a recent commit into the `master` or `next` branch that has not been incorporated +into Git for Windows. Also, it can help to run functional and performance tests on your +code in Linux before submitting patches to the mailing list, which focuses on many platforms. +The differences between Windows and Linux are usually enough to catch most cross-platform +issues. + +### Using the Windows Subsystem for Linux + +The [Windows Subsystem for Linux (WSL)](https://docs.microsoft.com/en-us/windows/wsl/install-win10) +allows you to [install Ubuntu Linux as an app](https://www.microsoft.com/en-us/store/p/ubuntu/9nblggh4msv6) +that can run Linux executables on top of the Windows kernel. Internally, +Linux syscalls are interpreted by the WSL, everything else is plain Ubuntu. + +First, open WSL (either type "Bash" in Cortana, or execute "bash.exe" in a CMD window). 
+Then install the prerequisites, and `git` for the initial clone: + +``` +sudo apt-get update +sudo apt-get install git gcc make libssl-dev libcurl4-openssl-dev \ + libexpat-dev tcl tk gettext git-email zlib1g-dev +``` + +Then, clone and build: + +``` +git clone https://github.com/git-for-windows/git +cd git +git remote add -f upstream https://github.com/git/git +make +``` + +Be sure to clone into `/home/[user]/` and not into any folder under `/mnt/?/` or your build +will fail due to colons in file names. + +### Using a Linux Virtual Machine with Hyper-V + +If you prefer, you can use a virtual machine (VM) to run Linux and test your changes in the +full environment. The test suite runs a lot faster on Linux than on Windows or with the WSL. +You can connect to the VM using an SSH terminal like +[PuTTY](https://www.chiark.greenend.org.uk/~sgtatham/putty/). + +The following instructions are for using Hyper-V, which is available in some versions of Windows. +There are many virtual machine alternatives available, if you do not have such a version installed. + +* [Download an Ubuntu Server ISO](https://www.ubuntu.com/download/server). +* Open [Hyper-V Manager](https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/quick-start/enable-hyper-v). +* [Set up a virtual switch](https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/quick-start/connect-to-network) + so your VM can reach the network. +* Select "Quick Create", name your machine, select the ISO as installation source, and un-check + "This virtual machine will run Windows." +* Go through the Ubuntu install process, being sure to select to install OpenSSH Server. +* When install is complete, log in and check the SSH server status with `sudo service ssh status`. + * If the service is not found, install with `sudo apt-get install openssh-server`. + * If the service is not running, then use `sudo service ssh start`. 
+* Use `shutdown -h now` to shut down the VM, go to the Hyper-V settings for the VM, expand Network Adapter + to select "Advanced Features", and set the MAC address to be static (this can save your VM from losing + network if shut down incorrectly). +* Provide as many cores to your VM as you can (for parallel builds). +* Restart your VM, but do not connect. +* Use `ssh` in Git Bash, download [PuTTY](https://www.chiark.greenend.org.uk/~sgtatham/putty/), or use your favorite SSH client to connect to the VM through SSH. + +In order to build and use `git`, you will need the following libraries via `apt-get`: + +``` +sudo apt-get update +sudo apt-get install git gcc make libssl-dev libcurl4-openssl-dev \ + libexpat-dev tcl tk gettext git-email zlib1g-dev +``` + +To get your code from your Windows machine to the Linux VM, it is easiest to push the branch to your fork of Git and clone your fork in the Linux VM. + +Don't forget to set your `git` config with your preferred name, email, and editor. + +Polish Your Commits +------------------- + +Before submitting your patch, be sure to read the [coding guidelines](https://github.com/git/git/blob/master/Documentation/CodingGuidelines) +and check your code to match as best you can. This can be a lot of effort, but it saves +time during review to avoid style issues. + +The other possibly major difference between the mailing list submissions and GitHub PR workflows +is that each commit will be reviewed independently. Even if you are submitting a +patch series with multiple commits, each commit must stand on its own and be reviewable +by itself. Make sure the commit message clearly explains the why of the commit, not the how. +Describe what is wrong with the current code and how your changes have made the code better. + +When preparing your patch, it is important to put yourself in the shoes of the Git community. +Accepting a patch requires more justification than approving a pull request from someone on +your team.
The community has a stable product and is responsible for keeping it stable. If +you introduce a bug, then they cannot count on you being around to fix it. When you decided +to start work on a new feature, they were not part of the design discussion and may not +even believe the feature is worth introducing. + +Questions to answer in your patch message (and commit messages) may include: +* Why is this patch necessary? +* How does the current behavior cause pain for users? +* What kinds of repositories are necessary for noticing a difference? +* What design options did you consider before writing this version? Do you have links to + code for those alternate designs? +* Is this a performance fix? Provide clear performance numbers for various well-known repos. + +Here are some other tips that we use when cleaning up our commits: + +* Commit messages should be wrapped at 76 columns per line (or less; 72 is also a + common choice). +* Make sure the commits are signed off using `git commit (-s|--signoff)`. See + [SubmittingPatches](https://github.com/git/git/blob/v2.8.1/Documentation/SubmittingPatches#L234-L286) + for more details about what this sign-off means. +* Check for whitespace errors using `git diff --check [base]...HEAD` or `git log --check`. +* Run `git rebase --whitespace=fix` to correct upstream issues with whitespace. +* Become familiar with interactive rebase (`git rebase -i`) because you will be reordering, + squashing, and editing commits as your patch or series of patches is reviewed. +* Make sure any shell scripts that you add have the executable bit set on them. This is + usually for test files that you add in the `/t` directory. You can use + `git add --chmod=+x [file]` to update it. You can test whether a file is marked as executable + using `git ls-files --stage \*.sh`; the first number is 100755 for executable files. +* Your commit titles should match the "area: change description" format. Rules of thumb: + * Choose ": " prefix appropriately. 
+ * Keep the description short and to the point. + * The word that follows the ": " prefix is not capitalized. + * Do not include a full-stop at the end of the title. + * Read a few commit messages -- using `git log origin/master`, for instance -- to + become acquainted with the preferred commit message style. +* Build source using `make DEVELOPER=1` for extra-strict compiler warnings. + +Submit Your Patch +----------------- + +Git for Windows [accepts pull requests on GitHub](https://github.com/git-for-windows/git/pulls), but +these are reserved for Windows-specific improvements. For core Git, submissions are accepted on +[the Git mailing list](https://public-inbox.org/git). + +### Configure Git to Send Emails + +There are a bunch of options for configuring the `git send-email` command. These options can +be found in the documentation for +[`git config`](https://git-scm.com/docs/git-config) and +[`git send-email`](https://git-scm.com/docs/git-send-email). + +``` +git config --global sendemail.smtpserver +git config --global sendemail.smtpserverport 587 +git config --global sendemail.smtpencryption tls +git config --global sendemail.smtpuser +``` + +To avoid storing your password in the config file, store it in the Git credential manager: + +``` +$ git credential fill +protocol=smtp +host= +username= +password=password +``` + +Before submitting a patch, read the [Git documentation on submitting patches](https://github.com/git/git/blob/master/Documentation/SubmittingPatches). + +To construct a patch set, use the `git format-patch` command. There are three important options: + +* `--cover-letter`: If specified, create a `[v#-]0000-cover-letter.patch` file that can be + edited to describe the patch as a whole. If you previously added a branch description using + `git branch --edit-description`, you will end up with a 0/N mail with that description and + a nice overall diffstat. 
+* `--in-reply-to=[Message-ID]`: This will mark your cover letter as replying to the given + message (which should correspond to your previous iteration). To determine the correct Message-ID, + find the message you are replying to on [public-inbox.org/git](https://public-inbox.org/git) and take + the ID from between the angle brackets. + +* `--subject-prefix=[prefix]`: This defaults to [PATCH]. For subsequent iterations, you will want to + override it like `--subject-prefix="[PATCH v2]"`. You can also use the `-v` option to have it + automatically generate the version number in the patches. + +If you have multiple commits and use the `--cover-letter` option, be sure to open the +`0000-cover-letter.patch` file to update the subject and add some details about the overall purpose +of the patch series. + +### Examples + +To generate a single commit patch file: +``` +git format-patch -s -o [dir] -1 +``` +To generate four patch files from the last four commits with a cover letter: +``` +git format-patch --cover-letter -s -o [dir] HEAD~4 +``` +To generate version 3 with four patch files from the last four commits with a cover letter: +``` +git format-patch --cover-letter -s -o [dir] -v 3 HEAD~4 +``` + +### Submit the Patch + +Run [`git send-email`](https://git-scm.com/docs/git-send-email), starting with a test email: + +``` +git send-email --to=yourself@address.com [dir with patches]/*.patch +``` + +After checking the receipt of your test email, you can send to the list and to any +potentially interested reviewers.
+ +``` +git send-email --to=git@vger.kernel.org --cc= --cc= [dir with patches]/*.patch +``` + +To submit a nth version patch (say version 3): + +``` +git send-email --to=git@vger.kernel.org --cc= --cc= \ + --in-reply-to= [dir with patches]/*.patch +``` From 6735645a45c9a5a0c3516c988a97e6a727bec78e Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 10 Jan 2014 16:16:03 -0600 Subject: [PATCH 215/218] README.md: Add a Windows-specific preamble MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Includes touch-ups by 마누엘, Philip Oakley and 孙卓识. Signed-off-by: Johannes Schindelin --- README.md | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 76 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index d87bca1b8c3ebf..026d5d85caef09 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,77 @@ -[![Build status](https://github.com/git/git/workflows/CI/badge.svg)](https://github.com/git/git/actions?query=branch%3Amaster+event%3Apush) +Git for Windows +=============== + +[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-2.1-4baaaa.svg)](CODE_OF_CONDUCT.md) +[![Open in Visual Studio Code](https://img.shields.io/static/v1?logo=visualstudiocode&label=&message=Open%20in%20Visual%20Studio%20Code&labelColor=2c2c32&color=007acc&logoColor=007acc)](https://open.vscode.dev/git-for-windows/git) +[![Build status](https://github.com/git-for-windows/git/workflows/CI/badge.svg)](https://github.com/git-for-windows/git/actions?query=branch%3Amain+event%3Apush) +[![Join the chat at https://gitter.im/git-for-windows/git](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/git-for-windows/git?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) + +This is [Git for Windows](http://git-for-windows.github.io/), the Windows port +of [Git](http://git-scm.com/). 
+ +The Git for Windows project is run using a [governance +model](http://git-for-windows.github.io/governance-model.html). If you +encounter problems, you can report them as [GitHub +issues](https://github.com/git-for-windows/git/issues), discuss them in Git +for Windows' [Discussions](https://github.com/git-for-windows/git/discussions) +or on the [Git mailing list](mailto:git@vger.kernel.org), and [contribute bug +fixes](https://gitforwindows.org/how-to-participate). + +To build Git for Windows, please either install [Git for Windows' +SDK](https://gitforwindows.org/#download-sdk), start its `git-bash.exe`, `cd` +to your Git worktree and run `make`, or open the Git worktree as a folder in +Visual Studio. + +To verify that your build works, use one of the following methods: + +- If you want to test the built executables within Git for Windows' SDK, + prepend `/bin-wrappers` to the `PATH`. +- Alternatively, run `make install` in the Git worktree. +- If you need to test this in a full installer, run `sdk build + git-and-installer`. +- You can also "install" Git into an existing portable Git via `make install + DESTDIR=` where `` refers to the top-level directory of the + portable Git. In this instance, you will want to prepend that portable Git's + `/cmd` directory to the `PATH`, or test by running that portable Git's + `git-bash.exe` or `git-cmd.exe`. +- If you built using a recent Visual Studio, you can use the menu item + `Build>Install git` (you will want to click on `Project>CMake Settings for + Git` first, then click on `Edit JSON` and then point `installRoot` to the + `mingw64` directory of an already-unpacked portable Git). + + As in the previous bullet point, you will then prepend `/cmd` to the `PATH` + or run using the portable Git's `git-bash.exe` or `git-cmd.exe`. 
+- If you want to run the built executables in-place, but in a CMD instead of + inside a Bash, you can run a snippet like this in the `git-bash.exe` window + where Git was built (ensure that the `EOF` line has no leading spaces), and + then paste into the CMD window what was put in the clipboard: + + ```sh + clip.exe < (see https://subspace.kernel.org/subscribing.html for details). The mailing list archives are available at , and other archival sites. +The core git mailing list is plain text (no HTML!). Issues which are security relevant should be disclosed privately to the Git Security mailing list . From ed80239778d4525dd8618b8333894571712dbac9 Mon Sep 17 00:00:00 2001 From: Brendan Forster Date: Thu, 18 Feb 2016 21:29:50 +1100 Subject: [PATCH 216/218] Add an issue template With improvements by Clive Chan, Adric Norris, Ben Bodenmiller and Philip Oakley. Helped-by: Clive Chan Helped-by: Adric Norris Helped-by: Ben Bodenmiller Helped-by: Philip Oakley Signed-off-by: Brendan Forster Signed-off-by: Johannes Schindelin --- .github/ISSUE_TEMPLATE/bug-report.yml | 105 ++++++++++++++++++++++++++ .github/ISSUE_TEMPLATE/config.yml | 1 + 2 files changed, 106 insertions(+) create mode 100644 .github/ISSUE_TEMPLATE/bug-report.yml create mode 100644 .github/ISSUE_TEMPLATE/config.yml diff --git a/.github/ISSUE_TEMPLATE/bug-report.yml b/.github/ISSUE_TEMPLATE/bug-report.yml new file mode 100644 index 00000000000000..b49593339932b2 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bug-report.yml @@ -0,0 +1,105 @@ +name: Bug report +description: Use this template to report bugs. +body: + - type: checkboxes + id: search + attributes: + label: Existing issues matching what you're seeing + description: Please search for [open](https://github.com/git-for-windows/git/issues?q=is%3Aopen) or [closed](https://github.com/git-for-windows/git/issues?q=is%3Aclosed) issue matching what you're seeing before submitting a new issue. 
+ options: + - label: I was not able to find an open or closed issue matching what I'm seeing + - type: textarea + id: git-for-windows-version + attributes: + label: Git for Windows version + description: Which version of Git for Windows are you using? + placeholder: Please insert the output of `git --version --build-options` here + render: shell + validations: + required: true + - type: dropdown + id: windows-version + attributes: + label: Windows version + description: Which version of Windows are you running? + options: + - Windows 8.1 + - Windows 10 + - Windows 11 + - Other + default: 2 + validations: + required: true + - type: dropdown + id: windows-arch + attributes: + label: Windows CPU architecture + description: What CPU Architecture does your Windows target? + options: + - i686 (32-bit) + - x86_64 (64-bit) + - ARM64 + default: 1 + validations: + required: true + - type: textarea + id: windows-version-cmd + attributes: + label: Additional Windows version information + description: This provides us with further information about your Windows such as the build number + placeholder: Please insert the output of `cmd.exe /c ver` here + render: shell + - type: textarea + id: options + attributes: + label: Options set during installation + description: What options did you set as part of the installation? Or did you choose the defaults? + placeholder: | + One of the following: + > type "C:\Program Files\Git\etc\install-options.txt" + > type "C:\Program Files (x86)\Git\etc\install-options.txt" + > type "%USERPROFILE%\AppData\Local\Programs\Git\etc\install-options.txt" + > type "$env:USERPROFILE\AppData\Local\Programs\Git\etc\install-options.txt" + $ cat /etc/install-options.txt + render: shell + validations: + required: true + - type: textarea + id: other-things + attributes: + label: Other interesting things + description: Any other interesting things about your environment that might be related to the issue you're seeing?
+ - type: input + id: terminal + attributes: + label: Terminal/shell + description: Which terminal/shell are you running Git from? e.g Bash/CMD/PowerShell/other + validations: + required: true + - type: textarea + id: commands + attributes: + label: Commands that trigger the issue + description: What commands did you run to trigger this issue? If you can provide a [Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve) this will help us understand the issue. + render: shell + validations: + required: true + - type: textarea + id: expected-behaviour + attributes: + label: Expected behaviour + description: What did you expect to occur after running these commands? + validations: + required: true + - type: textarea + id: actual-behaviour + attributes: + label: Actual behaviour + description: What actually happened instead? + validations: + required: true + - type: textarea + id: repository + attributes: + label: Repository + description: If the problem was occurring with a specific repository, can you provide the URL to that repository to help us with testing? \ No newline at end of file diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml new file mode 100644 index 00000000000000..ec4bb386bcf8a4 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -0,0 +1 @@ +blank_issues_enabled: false \ No newline at end of file From 25c2e8ccacd22a855609c4e2e74d72efad2c8532 Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Fri, 22 Dec 2017 17:15:50 +0000 Subject: [PATCH 217/218] Modify the GitHub Pull Request template (to reflect Git for Windows) Git for Windows accepts pull requests; Core Git does not. Therefore we need to adjust the template (because it only matches core Git's project management style, not ours). Also: direct Git for Windows enhancements to their contributions page, space out the text for easy reading, and clarify that the mailing list is plain text, not HTML. 
Signed-off-by: Philip Oakley Signed-off-by: Johannes Schindelin --- .github/PULL_REQUEST_TEMPLATE.md | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-) diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index 37654cdfd7abcf..7baf31f2c471ec 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -1,7 +1,19 @@ -Thanks for taking the time to contribute to Git! Please be advised that the -Git community does not use github.com for their contributions. Instead, we use -a mailing list (git@vger.kernel.org) for code submissions, code reviews, and -bug reports. Nevertheless, you can use GitGitGadget (https://gitgitgadget.github.io/) +Thanks for taking the time to contribute to Git! + +Those seeking to contribute to the Git for Windows fork should see +http://gitforwindows.org/#contribute on how to contribute Windows specific +enhancements. + +If your contribution is for the core Git functions and documentation +please be aware that the Git community does not use the github.com issues +or pull request mechanism for their contributions. + +Instead, we use the Git mailing list (git@vger.kernel.org) for code and +documentation submissions, code reviews, and bug reports. The +mailing list is plain text only (anything with HTML is sent directly +to the spam folder). + +Nevertheless, you can use GitGitGadget (https://gitgitgadget.github.io/) to conveniently send your Pull Requests commits to our mailing list. For a single-commit pull request, please *leave the pull request description From 22ad5b3e2b33b2a36974038f7819f13941abb43a Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 23 Aug 2019 14:14:42 +0200 Subject: [PATCH 218/218] SECURITY.md: document Git for Windows' policies This is the recommended way on GitHub to describe policies revolving around security issues and about supported versions. 
Helped-by: Sven Strickroth Signed-off-by: Johannes Schindelin --- SECURITY.md | 56 +++++++++++++++++++++++++++++++++-------------------- 1 file changed, 35 insertions(+), 21 deletions(-) diff --git a/SECURITY.md b/SECURITY.md index c720c2ae7f9580..42b6d458bfd557 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -28,24 +28,38 @@ Examples for details to include: ## Supported Versions -There are no official "Long Term Support" versions in Git. -Instead, the maintenance track (i.e. the versions based on the -most recently published feature release, also known as ".0" -version) sees occasional updates with bug fixes. - -Fixes to vulnerabilities are made for the maintenance track for -the latest feature release and merged up to the in-development -branches. The Git project makes no formal guarantee for any -older maintenance tracks to receive updates. In practice, -though, critical vulnerability fixes are applied not only to the -most recent track, but to at least a couple more maintenance -tracks. - -This is typically done by making the fix on the oldest and still -relevant maintenance track, and merging it upwards to newer and -newer maintenance tracks. - -For example, v2.24.1 was released to address a couple of -[CVEs](https://cve.mitre.org/), and at the same time v2.14.6, -v2.15.4, v2.16.6, v2.17.3, v2.18.2, v2.19.3, v2.20.2, v2.21.1, -v2.22.2 and v2.23.1 were released. +Git for Windows is a "friendly fork" of [Git](https://git-scm.com/), i.e. changes in Git for Windows are frequently contributed back, and Git for Windows' release cycle closely following Git's. + +While Git maintains several release trains (when v2.19.1 was released, there were updates to v2.14.x-v2.18.x, too, for example), Git for Windows follows only the latest Git release. For example, there is no Git for Windows release corresponding to Git v2.16.5 (which was released after v2.19.0). 
+ +One exception is [MinGit for Windows](https://gitforwindows.org/mingit) (a minimal subset of Git for Windows, intended for bundling with third-party applications that do not need any interactive commands nor support for `git svn`): critical security fixes are backported to the v2.11.x, v2.14.x, v2.19.x, v2.21.x and v2.23.x release trains. + +## Version number scheme + +The Git for Windows versions reflect the Git version on which they are based. For example, Git for Windows v2.21.0 is based on Git v2.21.0. + +As Git for Windows bundles more than just Git (such as Bash, OpenSSL, OpenSSH, GNU Privacy Guard), sometimes there are interim releases without corresponding Git releases. In these cases, Git for Windows appends a number in parentheses, starting with the number 2, then 3, etc. For example, both Git for Windows v2.17.1 and v2.17.1(2) were based on Git v2.17.1, but the latter included updates for Git Credential Manager and Git LFS, fixing critical regressions. + +## Tag naming scheme + +Every Git for Windows version is tagged using a name that starts with the Git version on which it is based, with the suffix `.windows.` appended. For example, Git for Windows v2.17.1's source code is tagged as [`v2.17.1.windows.1`](https://github.com/git-for-windows/git/releases/tag/v2.17.1.windows.1) (the patch level is always at least 1, given that Git for Windows always has patches on top of Git). Likewise, Git for Windows v2.17.1(2)'s source code is tagged as [`v2.17.1.windows.2`](https://github.com/git-for-windows/git/releases/tag/v2.17.1.windows.2). + +## Release Candidate (rc) versions + +As a friendly fork of Git (the "upstream" project), Git for Windows is closely correlated to that project. + +Consequently, Git for Windows publishes versions based on Git's release candidates (for upcoming "`.0`" versions, see [Git's release schedule](https://tinyurl.com/gitCal)).
These versions end in `-rc`, starting with `-rc0` for a very early preview of what is to come, and as with regular versions, Git for Windows tries to follow Git's releases as quickly as possible. + +Note: there is currently a bug in the "Check daily for updates" code, where it mistakes the final version for a downgrade from release candidates. Example: if you installed Git for Windows v2.23.0-rc3 and enabled the auto-updater, it would ask you whether you want to "downgrade" to v2.23.0 when that version was available. + +[All releases](https://github.com/git-for-windows/git/releases/), including release candidates, are listed via a link at the footer of the [Git for Windows](https://gitforwindows.org/) home page. + +## Snapshot versions ('nightly builds') + +Git for Windows also provides snapshots (these are not releases) of the current development as per git-for-windows/git's `master` branch at the [Snapshots](https://gitforwindows.org/git-snapshots/) page. This link is also listed in the footer of the [Git for Windows](https://gitforwindows.org/) home page. + +Note: even if those builds are not exactly "nightly", they are sometimes referred to as "nightly builds" to keep with other projects' nomenclature. + +## Following upstream's developments + +The [git-for-windows/git repository](https://github.com/git-for-windows/git) also provides the `shears/*` branches. The `shears/*` branches reflect Git for Windows' patches, rebased onto the upstream integration branches, [updated (mostly) via automated CI builds](https://dev.azure.com/git-for-windows/git/_build?definitionId=25).