Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

Gc6026
Copy link
Owner

@Gc6026 Gc6026 commented Jun 15, 2024

Thanks for taking the time to contribute to Git! Please be advised that the
Git community does not use github.com for their contributions. Instead, we use
a mailing list ([email protected]) for code submissions, code reviews, and
bug reports. Nevertheless, you can use GitGitGadget (https://gitgitgadget.github.io/)
to conveniently send your Pull Requests commits to our mailing list.

For a single-commit pull request, please leave the pull request description
empty
: your commit message itself should describe your changes.

Please read the "guidelines for contributing" linked above!

Code cleanup.

* rs/archive-use-object-id:
  archive: convert queue_directory to struct object_id
Optimize code that handles large number of refs in the "git fetch"
code path.

* ps/fetch-optim:
  fetch: avoid second connectivity check if we already have all objects
  fetch: merge fetching and consuming refs
  fetch: refactor fetch refs to be more extendable
  fetch-pack: optimize loading of refs via commit graph
  connected: refactor iterator to return next object ID directly
  fetch: avoid unpacking headers in object existence check
  fetch: speed up lookup of want refs via commit-graph
The reachability bitmap file used to be generated only for a single
pack, but now we've learned to generate bitmaps for history that
span across multiple packfiles.

* tb/multi-pack-bitmaps: (27 commits)
  p5326: perf tests for MIDX bitmaps
  p5310: extract full and partial bitmap tests
  midx: respect 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
  t7700: update to work with MIDX bitmap test knob
  t5319: don't write MIDX bitmaps in t5319
  t5310: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP
  t0410: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP
  t5326: test multi-pack bitmap behavior
  t/helper/test-read-midx.c: add --checksum mode
  t5310: move some tests to lib-bitmap.sh
  pack-bitmap: write multi-pack bitmaps
  pack-bitmap: read multi-pack bitmaps
  pack-bitmap.c: avoid redundant calls to try_partial_reuse
  pack-bitmap.c: introduce 'bitmap_is_preferred_refname()'
  pack-bitmap.c: introduce 'nth_bitmap_object_oid()'
  pack-bitmap.c: introduce 'bitmap_num_objects()'
  midx: avoid opening multiple MIDXs when writing
  midx: close linked MIDXs, avoid leaking memory
  midx: infer preferred pack when not given one
  midx: reject empty `--preferred-pack`'s
  ...
Recent "diff -m" changes broke "gitk", which has been corrected.

* so/diff-index-regression-fix:
  diff-index: restore -c/--cc options handling
Regression fix.

* ab/send-email-config-fix:
  send-email: fix a "first config key wins" regression in v2.33.0
Build simplification.

* ab/no-more-check-bindir:
  Makefile: remove the check_bindir script
Docfix.

* bs/doc-bugreport-outdir:
  Documentation: fix default directory of git bugreport -o
Tie-break branches that point at the same object in the list of
branches on GitWeb to show the one pointed at by HEAD early.

* gh/gitweb-branch-sort:
  gitweb: use HEAD as secondary sort key in git_get_heads_list()
The code to make "git grep" recurse into submodules has been
updated to migrate away from the "add submodule's object store as
an alternate object store" mechanism (which is suboptimal).

* jt/grep-wo-submodule-odb-as-alternate:
  t7814: show lack of alternate ODB-adding
  submodule-config: pass repo upon blob config read
  grep: add repository to OID grep sources
  grep: allocate subrepos on heap
  grep: read submodule entry with explicit repo
  grep: typesafe versions of grep_source_init
  grep: use submodule-ODB-as-alternate lazy-addition
  submodule: lazily add submodule ODBs as alternates
The "git apply -3" code path learned not to bother the lower level
merge machinery when the three-way merge can be trivially resolved
without the content level merge.

* jc/trivial-threeway-binary-merge:
  apply: resolve trivial merge without hitting ll-merge with "--3way"
CI update.

* cb/ci-build-pedantic:
  ci: run a pedantic build as part of the GitHub workflow
The logic for auto-correction of misspelt subcommands learned to go
interactive when the help.autocorrect configuration variable is set
to 'prompt'.

* ab/help-autocorrect-prompt:
  help.c: help.autocorrect=prompt waits for user action
Teach "test_pause" and "debug" helpers to allow using the HOME and
TERM environment variables the user usually uses.

* pb/test-use-user-env:
  test-lib-functions: keep user's debugger config files and TERM in 'debug'
  test-lib-functions: optionally keep HOME, TERM and SHELL in 'test_pause'
  test-lib-functions: use 'TEST_SHELL_PATH' in 'test_pause'
"make INSTALL_STRIP=-s install" allows the installation step to use
"install -s" to strip the binaries as they get installed.

* bs/install-strip:
  make: add INSTALL_STRIP option variable
The code that optionally creates the *.rev reverse index file has
been optimized to avoid needless computation when it is not writing
the file out.

* ab/reverse-midx-optim:
  pack-write: skip *.rev work when not writing *.rev
The tracing of process ancestry information has been enhanced.

* ab/tr2-leaks-and-fixes:
  tr2: log N parent process names on Linux
  tr2: do compiler enum check in trace2_collect_process_info()
  tr2: leave the parent list empty upon failure & don't leak memory
  tr2: stop leaking "thread_name" memory
  tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux
  tr2: remove NEEDSWORK comment for "non-procfs" implementations
"git maintenance" scheduler learned to use systemd timers as a
possible backend.

* lh/systemd-timers:
  maintenance: add support for systemd timers on Linux
  maintenance: `git maintenance run` learned `--scheduler=<scheduler>`
  cache.h: Introduce a generic "xdg_config_home_for(…)" function
Reduce number of write(2) system calls while sending the
ref advertisement.

* jv/pkt-line-batch:
  upload-pack: use stdio in send_ref callbacks
  pkt-line: add stdio packet write functions
"git diff --submodule=diff" showed failure from run_command() when
trying to run diff inside a submodule, when the user manually
removes the submodule directory.

* dt/submodule-diff-fixes:
  diff --submodule=diff: don't print failure message twice
  diff --submodule=diff: do not fail on ever-initialied deleted submodules
  t4060: remove unused variable
The code to show progress indicator in a few codepaths did not
cover between 0-100%, which has been corrected.

* ab/progress-users-adjust-counters:
  entry: show finer-grained counter in "Filtering content" progress line
  commit-graph: fix bogus counter in "Scanning merged commits" progress line
"git update-ref --stdin" failed to flush its output as needed,
which potentially led the conversation to a deadlock.

* ps/update-ref-batch-flush:
  update-ref: fix streaming of status updates
Update the build procedure to use the "-pedantic" build when
DEVELOPER makefile macro is in effect.

* cb/pedantic-build-for-developers:
  developer: enable pedantic by default
  win32: allow building with pedantic mode enabled
  gettext: remove optional non-standard parens in N_() definition
"git range-diff -I... <range> <range>" segfaulted, which has been
corrected.

* rs/range-diff-avoid-segfault-with-I:
  range-diff: avoid segfault with -I
Leakfix.

* jc/prefix-filename-allocates:
  hash-object: prefix_filename() returns allocated memory these days
The "--preserve-merges" option of "git rebase" has been removed.

* js/retire-preserve-merges:
  sequencer: restrict scope of a formerly public function
  rebase: remove a no-longer-used function
  rebase: stop mentioning the -p option in comments
  rebase: remove obsolete code comment
  rebase: drop the internal `rebase--interactive` command
  git-svn: drop support for `--preserve-merges`
  rebase: drop support for `--preserve-merges`
  pull: remove support for `--rebase=preserve`
  tests: stop testing `git rebase --preserve-merges`
  remote: warn about unhandled branch.<name>.rebase values
  t5520: do not use `pull.rebase=preserve`
The order in which various files that make up a single (conceptual)
packfile has been reevaluated and straightened up.  This matters in
correctness, as an incomplete set of files must not be shown to a
running Git.

* tb/pack-finalize-ordering:
  pack-objects: rename .idx files into place after .bitmap files
  pack-write: split up finish_tmp_packfile() function
  builtin/index-pack.c: move `.idx` files into place last
  index-pack: refactor renaming in final()
  builtin/repack.c: move `.idx` files into place last
  pack-write.c: rename `.idx` files after `*.rev`
  pack-write: refactor renaming in finish_tmp_packfile()
  bulk-checkin.c: store checksum directly
  pack.h: line-wrap the definition of finish_tmp_packfile()
The reachability bitmap file used to be generated only for a single
pack, but now we've learned to generate bitmaps for history that
span across multiple packfiles.

* tb/multi-pack-bitmaps:
  pack-bitmap: drop bitmap_index argument from try_partial_reuse()
  pack-bitmap: drop repository argument from prepare_midx_bitmap_git()
This reverts commit eed4e81, reversing
changes made to 348fe07, so that we
can eventually replace it with a newer iteration.  As it stands, the
topic just makes it segfault when used with OpenSSH 8.7 (the latest).
Add progress display to "git bundle unbundle".

* ab/unbundle-progress:
  bundle: show progress on "unbundle"
  index-pack: add --progress-title option
  bundle API: change "flags" to be "extra_index_pack_args"
  bundle API: start writing API documentation
The compatibility implementation for unsetenv(3) were written to
mimic ancient, non-POSIX, variant seen in an old glibc; it has been
changed to return an integer to match the more modern era.

* jc/unsetenv-returns-an-int:
  unsetenv(3) returns int, not void
The command line complation for "git send-email" options have been
tweaked to make it easier to keep it in sync with the command itself.

* tp/send-email-completion:
  send-email docs: add format-patch options
  send-email: programmatically generate bash completions
Leakfix.

* tb/plug-pack-bitmap-leaks:
  pack-bitmap.c: more aggressively free in free_bitmap_index()
  pack-bitmap.c: don't leak type-level bitmaps
  midx.c: write MIDX filenames to strbuf
  builtin/multi-pack-index.c: don't leak concatenated options
  builtin/repack.c: avoid leaking child arguments
  builtin/pack-objects.c: don't leak memory via arguments
  t/helper/test-read-midx.c: free MIDX within read_midx_file()
  midx.c: don't leak MIDX from verify_midx_file
  midx.c: clean up chunkfile after reading the MIDX
Code simplification.

* rd/http-backend-code-simplification:
  http-backend: remove a duplicated code branch
"git add", "git mv", and "git rm" have been adjusted to avoid
updating paths outside of the sparse-checkout definition unless
the user specifies a "--sparse" option.

* ds/add-rm-with-sparse-index:
  dir: fix directory-matching bug
Various bugs in "git rebase -r" have been fixed.

* pw/rebase-r-fixes:
  rebase -i: fix rewording with --committer-date-is-author-date
Git 2.34-rc1

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAmGC84AACgkQsLXohpav
# 5svBIg/9FjlAoy9osVAAnbg/TbGiHIDeaeEf+0oLmfq9dDX5ie9DzzCUGI6yibsw
# 0dZ3Ps2zmkcwzl26BPlNviONwIhKx1pA/F9ogvLt1SpXc1FGP/AXvM/O1cUhH87K
# 4IixKhemtlIV3U2QNoLpIKLEwcLe/409/BpgBLjQRroDnkNLOFcayHUDbUPxyvyn
# 8bluYMQRhmnBNKUyOv4BeKWT8ICk0+DUOePswEdh4hJgp8NrzNuBmf9IlEetxL/I
# S24/nr8/XFLTvjx3PVsBPHH6tJ1nUaX2vmvhTEGlcNWMQhSpxTRylOFSucRzfdxf
# 0EOjfoY7eM5gq8P3Wf0ZJdVYElg0RcXlHRBcf0bcr73hm1hwOOoJGrD44KJTqZL9
# ry2jGwWLpsdJLdJODMIDFuUQfOl4CtDVDFIeO/gxITGKm+0GlKDLks4YR87uVha4
# jfi42Tlx1fFDcvQKTuqj1y+Pste9jFyza/uDrStZME2pbZYLcwYzmvZnxyKq7QhE
# jSUCJH/bbKkjVAkgTgVxqHa94WJu/H7srFHePTnzlW/4cz0qXIDGveE2Nufp2qSw
# JXG8KksL8LrstWxFTHhKkAOhWG/1i3YbQ3PEcjff62VSBMWODwJazBGSH//oaNP3
# aUmPpw8NV30el2qPm5u18YSIwnoWQ3gfhoWETb8fnKorR3joK40=
# =qQhE
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 03 Nov 2021 01:39:28 PM PDT
# gpg:                using RSA key E1F036B1FEE7221FC778ECEFB0B5E88696AFE6CB
# gpg: Good signature from "Junio C Hamano <[email protected]>" [ultimate]
# gpg:                 aka "Junio C Hamano <[email protected]>" [ultimate]
# gpg:                 aka "Junio C Hamano <[email protected]>" [ultimate]

* tag 'v2.34.0-rc1':
  Git 2.34-rc1
* vd/pthread-setspecific-g11-fix:
  async_die_is_recursing: work around GCC v11.x issue on Fedora
Make a few helper functions unused and then lose them.

* ab/sh-retire-helper-functions:
  git-sh-setup: remove "sane_grep", it's not needed anymore
  git-sh-setup: remove unused sane_egrep() function
  git-instaweb: unconditionally assume that gitweb is mod_perl capable
  Makefile: remove $(NO_CURL) from $(SCRIPT_DEFINES)
  Makefile: remove $(GIT_VERSION) from $(SCRIPT_DEFINES)
  Makefile: move git-SCRIPT-DEFINES adjacent to $(SCRIPT_DEFINES)
The clean/smudge conversion code path has been prepared to better
work on platforms where ulong is narrower than size_t.

* mc/clean-smudge-with-llp64:
  clean/smudge: allow clean filters to process extremely large files
  odb: guard against data loss checking out a huge file
  git-compat-util: introduce more size_t helpers
  odb: teach read_blob_entry to use size_t
  t1051: introduce a smudge filter test for extremely large files
  test-lib: add prerequisite for 64-bit platforms
  test-tool genzeros: generate large amounts of data more efficiently
  test-genzeros: allow more than 2G zeros in Windows
Fix ssh-signing test to work on a platform where the default ACL is
overly loose to upset OpenSSH (reported on an installation of Cygwin).

* ad/ssh-signing-testfix:
  t/lib-git.sh: fix ACL-related permissions failure
Random changes to parse-options implementation.

* ab/parse-options-cleanup:
  parse-options.[ch]: revert use of "enum" for parse_options()
* ds/no-usable-cron-on-macos:
  maintenance: disable cron on macOS
* js/simple-ipc-cygwin-socket-fix:
  simple-ipc: work around issues with Cygwin's Unix socket emulation
* jk/ssh-signing-fix:
  t/lib-gpg: avoid broken versions of ssh-keygen
The revision traversal API has been optimized by taking advantage
of the commit-graph, when available, to determine if a commit is
reachable from any of the existing refs.

* ps/connectivity-optim:
  Revert "connected: do not sort input revisions"
"git fsck" has been taught to report mismatch between expected and
actual types of an object better.

* ab/fsck-unexpected-type:
  object-file: free(*contents) only in read_loose_object() caller
  object-file: fix SEGV on free() regression in v2.34.0-rc2
* js/trace2-raise-format-version:
  trace2: increment event format version
"git grep" looking in a blob that has non-UTF8 payload was
completely broken when linked with certain versions of PCREv2
library in the latest release.

* hm/paint-hits-in-log-grep:
  Revert "grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data"
"git pull" with any strategy when the other side is behind us
should succeed as it is a no-op, but doesn't.

* ev/pull-already-up-to-date-is-noop:
  pull: should be noop when already-up-to-date
Doc fix.

* ab/update-submitting-patches:
  SubmittingPatches: fix Asciidoc syntax in "GitHub CI" section
* master:
  0th batch for early fixes
Gc6026 pushed a commit that referenced this pull request Nov 30, 2024
It was recently reported that concurrent reads and writes may cause the
reftable backend to segfault. The root cause of this is that we do not
properly keep track of reftable readers across reloads.

Suppose that you have a reftable iterator and then decide to reload the
stack while iterating through the iterator. When the stack has been
rewritten since we have created the iterator, then we would end up
discarding a subset of readers that may still be in use by the iterator.
The consequence is that we now try to reference deallocated memory,
which of course segfaults.

One way to trigger this is in t5616, where some background maintenance
jobs have been leaking from one test into another. This leads to stack
traces like the following one:

  + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin
  AddressSanitizer:DEADLYSIGNAL
  =================================================================
  ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp
0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0)
  ==657994==The signal is caused by a READ memory access.
      #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29
      #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170
      #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194
      #3 0x55f23e54e72e in block_iter_next reftable/block.c:398
      #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240
      #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355
      git#6 0x55f23e5573dc in table_iter_next reftable/reader.c:339
      #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69
      git#8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123
      git#9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172
      git#10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175
      git#11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464
      git#12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13
      git#13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452
      git#14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623
      git#15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659
      git#16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133
      git#17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432
      git#18 0x55f23dba7764 in run_builtin git.c:484
      git#19 0x55f23dba7764 in handle_builtin git.c:741
      git#20 0x55f23dbab61e in run_argv git.c:805
      git#21 0x55f23dbab61e in cmd_main git.c:1000
      git#22 0x55f23dba4781 in main common-main.c:64
      git#23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
      git#24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360
      git#25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27)

While it is somewhat awkward that the maintenance processes survive
tests in the first place, it is totally expected that reftables should
work alright with concurrent writers. Seemingly they don't.

The only underlying resource that we need to care about in this context
is the reftable reader, which is responsible for reading a single table
from disk. These readers get discarded immediately (unless reused) when
calling `reftable_stack_reload()`, which is wrong. We can only close
them once we know that there are no iterators using them anymore.

Prepare for a fix by converting the reftable readers to be refcounted.

Reported-by: Jeff King <[email protected]>
Signed-off-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Gc6026 pushed a commit that referenced this pull request Nov 30, 2024
This one is a little bit more curious. In t6112, we have a test that
exercises the `git rev-list --filter` option with invalid filters. We
execute git-rev-list(1) via `test_must_fail`, which means that we check
for leaks even though Git exits with an error code. This causes the
following leak:

    Direct leak of 27 byte(s) in 1 object(s) allocated from:
        #0 0x5555555e6946 in realloc.part.0 lsan_interceptors.cpp.o
        #1 0x5555558fb4b6 in xrealloc wrapper.c:137:8
        #2 0x5555558b6e06 in strbuf_grow strbuf.c:112:2
        #3 0x5555558b7550 in strbuf_add strbuf.c:311:2
        #4 0x5555557c1a88 in strbuf_addstr strbuf.h:310:2
        #5 0x5555557c1d4c in parse_list_objects_filter list-objects-filter-options.c:261:3
        git#6 0x555555885ead in handle_revision_pseudo_opt revision.c:2899:3
        #7 0x555555884e20 in setup_revisions revision.c:3014:11
        git#8 0x5555556c4b42 in cmd_rev_list builtin/rev-list.c:588:9
        git#9 0x5555555ec5e3 in run_builtin git.c:483:11
        git#10 0x5555555eb1e4 in handle_builtin git.c:749:13
        git#11 0x5555555ec001 in run_argv git.c:819:4
        git#12 0x5555555eaf94 in cmd_main git.c:954:19
        git#13 0x5555556fd569 in main common-main.c:64:11
        git#14 0x7ffff7ca714d in __libc_start_call_main (.../lib/libc.so.6+0x2a14d)
        git#15 0x7ffff7ca7208 in __libc_start_main@GLIBC_2.2.5 (.../libc.so.6+0x2a208)
        git#16 0x5555555ad064 in _start (git+0x59064)

This leak is valid, as we call `die()` and do not clean up the memory at
all. But what's curious is that this is the only leak reported, because
we don't clean up any other allocated memory, either, and I have no idea
why the leak sanitizer treats this buffer specially.

In any case, we can work around the leak by shuffling things around a
bit. Instead of calling `gently_parse_list_objects_filter()` and dying
after we have modified the filter spec, we simply do so beforehand. Like
this we don't allocate the buffer in the error case, which makes the
reported leak go away.

It's not pretty, but it manages to make t6112 leak free.

Signed-off-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants