Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

@tmcgilchrist
Copy link

This PR makes the minimal amount of changes to support ocaml/ocaml#12923

I want to fix the raw_backtrace code to use the new functions in ocaml/ocaml#9663

@tmcgilchrist
Copy link
Author

Removing the use of Obj.magic ends up being rather awkward, leaving it as is.

@tmcgilchrist tmcgilchrist marked this pull request as ready for review March 26, 2024 23:06
Copy link
Collaborator

@stedolan stedolan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good, thanks! (Couple of minor docs/opam comments though)

@ncik-roberts
Copy link

@stedolan Does it make sense to merge this?

@talex5
Copy link

talex5 commented Jul 1, 2024

I tried this (with the 5.2 backport):

$ opam switch create 5.2.0-statmemprof 5.2.0+statmemprof
$ opam pin memtrace.dev 'https://github.com/dra27/memtrace.git#5_1_statmemprof'

Running it on my program then failed when using 7 worker domains:

$ MEMTRACE=solver.ctf ./_build/default/stress/stress.exe --count=3 --internal-workers=7
Running in domain mode
[0install-solver.2.18; base-bigarray.base; base-domains.base; base-nnp.base;
 base-threads.base; base-unix.base; dune.3.16.0; host-arch-x86_64.1;
 host-system-other.1; ocaml.5.3.0; ocaml-config.3;
 ocaml-variants.5.3.0+trunk]

Solved warm-up request in: 1.39s
Running another 3 * 7 solves...
local: internal error, uncaught exception:
       File "src/memprof_tracer.ml", line 37, characters 2-8: Assertion failed

(code is at https://github.com/talex5/solver-service/tree/memtrace)

It did record a trace successfully with 3 domains. But then, the viewer wouldn't read it:

$ memtrace-viewer solver.ctf
Processing solver.ctf...
(monitor.ml.Error "Assert_failure src/location_codec.ml:27:17"
 ("Raised at Memtrace__Location_codec.Mtf_table.swap in file \"src/location_codec.ml\", line 27, characters 17-29"
  "Called from Memtrace__Location_codec.Mtf_table.decode in file \"src/location_codec.ml\", line 59, characters 6-16"
  "Called from Memtrace__Location_codec.Reader.get_location.(fun) in file \"src/location_codec.ml\", lines 163-175, characters 8-23"
  "Called from Stdlib__List.init in file \"list.ml\", line 64, characters 25-28"
  "Called from Memtrace__Location_codec.Reader.get_location in file \"src/location_codec.ml\", lines 150-182, characters 15-66"
  "Called from Memtrace__Trace.iter.iter_events_of_packet in file \"src/trace.ml\", line 699, characters 24-71"
  "Called from Memtrace__Trace.iter.iter_packets in file \"src/trace.ml\", line 743, characters 8-50"
  "Called from Memtrace_viewer_native__Raw_trace.iter in file \"server/src/raw_trace.ml\", lines 89-93, characters 2-17"
  "Called from Memtrace_viewer_native__Filtered_trace.iter_and_return_interpreter in file \"server/src/filtered_trace.ml\", lines 765-766, characters 2-55"
  "Called from Memtrace_viewer_native__Filtered_trace.iter in file \"server/src/filtered_trace.ml\" (inlined), line 771, characters 20-82"
  "Called from Memtrace_viewer_native__Graph.full_graph_and_max_time in file \"server/src/graph.ml\", lines 12-23, characters 2-45"
  "Called from Memtrace_viewer_native__Graph.build in file \"server/src/graph.ml\", line 57, characters 29-59"
  "Called from Memtrace_viewer_native__User_state.Initial.of_trace in file \"server/src/user_state.ml\", line 32, characters 16-66"
  "Called from Memtrace_viewer_native.main in file \"server/src/memtrace_viewer_native.ml\", line 72, characters 22-55"
  "Called from Async_kernel__Monitor.Exported_for_scheduler.schedule'.upon_work_fill_i in file \"src/monitor.ml\", line 293, characters 42-51"
  "Called from Async_kernel__Job_queue.run_jobs in file \"src/job_queue.ml\", line 180, characters 6-47"))

Is there something else I need to do to make this work?

@OlivierNicole
Copy link

I also got the same assert failure when trying to view a trace created using this fork. I created an issue at janestreet/memtrace_viewer#14. It’s unclear to me whether the problem lies in this PR or in memtrace_viewer.

@tmcgilchrist
Copy link
Author

I can reproduce both failures using OCaml 5.2.0 with base.v0.16.3 on macOS (ARM64).

This one for captures that errors when running on multiple domains:

$ OCAMLRUNPARAM=b MEMTRACE=solver.ctf ./_build/default/stress/stress.exe --count=3 --internal-workers=4
Running in domain mode
[0install-solver.2.18; base-bigarray.base; base-domains.base; base-nnp.base;
 base-threads.base; base-unix.base; dune.3.16.0; host-arch-x86_64.1;
 host-system-other.1; ocaml.5.3.0; ocaml-config.3;
 ocaml-variants.5.3.0+trunk]

Solved warm-up request in: 2.14s
Running another 3 * 4 solves...

local: internal error, uncaught exception:
       File "memtrace/src/memprof_tracer.ml", line 37, characters 2-8: Assertion failed
       Raised at Stdlib__Domain.join in file "domain.ml", line 296, characters 16-24
       Called from Stdlib__List.iter in file "list.ml", line 112, characters 12-15
       Called from Dune__exe__Stress.main in file "stress/stress.ml", line 44, characters 2-35
       Called from Cmdliner_term.app.(fun) in file "cmdliner_term.ml", line 24, characters 19-24
       Called from Cmdliner_eval.run_parser in file "cmdliner_eval.ml", line 35, characters 37-44

And the memtrace-viewer error:

$ dune exec memtrace-viewer  solver.ctf
Processing solver.ctf...               
(monitor.ml.Error "Assert_failure memtrace/src/location_codec.ml:27:17"
 ("Raised at Memtrace__Location_codec.Mtf_table.swap in file \"memtrace/src/location_codec.ml\", line 27, characters 17-29"
  "Called from Memtrace__Location_codec.Mtf_table.decode in file \"memtrace/src/location_codec.ml\", line 59, characters 6-16"
  "Called from Memtrace__Location_codec.Reader.get_location.(fun) in file \"memtrace/src/location_codec.ml\", lines 177-181, characters 8-22"
  "Called from Stdlib__List.init in file \"list.ml\", line 64, characters 25-28"
  "Called from Memtrace__Location_codec.Reader.get_location in file \"memtrace/src/location_codec.ml\", lines 150-182, characters 15-66"
  "Called from Memtrace__Trace.iter.iter_events_of_packet in file \"memtrace/src/trace.ml\", line 699, characters 24-71"
  "Called from Memtrace__Trace.iter.iter_packets in file \"memtrace/src/trace.ml\", line 743, characters 8-50"
  "Called from Memtrace_viewer_native__Raw_trace.iter in file \"memtrace_viewer/server/src/raw_trace.ml\", lines 89-93, characters 2-17"
  "Called from Memtrace_viewer_native__Filtered_trace.iter_and_return_interpreter in file \"memtrace_viewer/server/src/filtered_trace.ml\", lines 765-766, characters 2-55"
  "Called from Memtrace_viewer_native__Filtered_trace.iter in file \"memtrace_viewer/server/src/filtered_trace.ml\", line 771, characters 20-82"
  "Called from Memtrace_viewer_native__Graph.full_graph_and_max_time in file \"memtrace_viewer/server/src/graph.ml\", lines 12-23, characters 2-45"
  "Called from Memtrace_viewer_native__Graph.build in file \"memtrace_viewer/server/src/graph.ml\", line 57, characters 29-59"
  "Called from Memtrace_viewer_native__User_state.Initial.of_trace in file \"memtrace_viewer/server/src/user_state.ml\", line 32, characters 16-66"
  "Called from Memtrace_viewer_native.main in file \"memtrace_viewer/server/src/memtrace_viewer_native.ml\", line 72, characters 22-55"
  "Called from Async_kernel__Monitor.Exported_for_scheduler.schedule'.upon_work_fill_i in file \"src/monitor.ml\", line 293, characters 42-51"
  "Called from Async_kernel__Job_queue.run_jobs in file \"src/job_queue.ml\", line 180, characters 6-47"))

@tmcgilchrist
Copy link
Author

TL;DR Recording single domain traces seems fine, handling multiple domains requires a new trace file version.

Reporting what I've found so far in long form.

The error recording a trace file in File "memtrace/src/memprof_tracer.ml", line 37, characters 2-8: Assertion failed is caused by concurrent access to the internal mutable data structures at src/memprof_tracer.ml#L1-L8. When running across multiple domains the callbacks can run on different domains and try to access/modify that structure putting it into an inconsistent state. Wrapping it with a mutex or Kcas like tmcgilchrist@d0cc680 will avoid the error. However the current file format for memtrace is not designed for recording multiple domains with differing time stamps. The file reader asserts certain monotonic timestamp properties that do not hold. On a simpler test program that spawn domains like

(* fib_par.ml *)
let num_domains = try int_of_string Sys.argv.(1) with _ -> 1
let n = try int_of_string Sys.argv.(2) with _ -> 1

(* Sequential Fibonacci *)
let rec fib n =
  if n < 2 then 1 else fib (n - 1) + fib (n - 2)

module T = Domainslib.Task

let rec fib_par pool n =
  let _ = Buffer.create 10000 in
  if n > 20 then begin
      let a = T.async pool (fun _ -> fib_par pool (n-1)) in
      let b = T.async pool (fun _ -> fib_par pool (n-2)) in
      T.await pool a + T.await pool b
    end else
    (* Call sequential Fibonacci if the available work is small *)
    fib n

let main () =
  Memtrace.trace_if_requested ~context:"fib" ();
  let pool = T.setup_pool ~num_domains:(num_domains - 1) () in
  let res = T.run pool (fun _ -> fib_par pool n) in
  T.teardown_pool pool;
  Printf.printf "fib(%d) = %d\n" n res

let _ = main ()

I can generate traces and read them successfully, but I think that is just good luck.

Limiting the trace capture to a single domain usually yields a valid trace file that can be opened with memtrace_viewer. This capture from running MEMTRACE=solver.ctf ./_build/default/stress/stress.exe --count=30 --internal-workers=1
Screenshot 2024-08-02 at 10 59 58 AM

The second error in "Assert_failure memtrace/src/location_codec.ml:27:17" is caused by missing backtraces when recording the Move-to-front encoding. I haven't found the root cause for that yet, but if you creatively relax the file read constraints tmcgilchrist@06154cd you can read @OlivierNicole's file
Screenshot 2024-08-02 at 11 12 30 AM

@lukemaurer
Copy link
Contributor

The second error in "Assert_failure memtrace/src/location_codec.ml:27:17" is caused by missing backtraces when recording the Move-to-front encoding. I haven't found the root cause for that yet, but if you creatively relax the file read constraints [tmcgilchrist/memtrace@06154cd]

There was indeed a bug (fixed by #23) involving the MTF cache, and having looked at @OlivierNicole's trace I think it was at play there. @stedolan will know more than me, but I imagine for multi-domain support we'll just need a proper mutex instead of this locked flag in memprof_tracer.ml. Then timestamp monotonicity shouldn't be an issue because taking the timestamp and adding the sample to the trace happen in one atomic operation.

@tmcgilchrist
Copy link
Author

Thanks @lukemaurer I'll pick this back up and add a proper mutex (rather than using kcas) around things.
It would be useful to have this fully working and tested for the official 5.3 release.

@tmcgilchrist tmcgilchrist force-pushed the 5_1_statmemprof branch 2 times, most recently from 20f586e to 3789a1f Compare October 10, 2024 02:28
@tmcgilchrist
Copy link
Author

3789a1f puts a single Mutex around writing the CTF file and removes the two mutable fields locked and locked_ext. That (plus #23) is giving me consistent CTF files across multiple domains. I don't have an application that uses the ext_alloc and ext_free functions so I could be missing a use case there.

Running @talex5 's example above from solver-service as:

# Run with 6 domains
 MEMTRACE=solver-mutex-6-opam.2.3.0.ctf ./_build/default/stress/stress.exe local --count=120 --internal-workers=6 --cache-dir=./cache -q 

is working.

Note, since this is using a Mutex there might be more blocking occurring than you would want. The upside is it'll be consistent across 4.14 and 5.3.

@kit-ty-kate
Copy link

is there any chance for this PR to get merged and released before the final release of OCaml 5.3 planned for mid-december? It would be a great publicity for both 5.3 and memtrace if users could use them during the holiday season and be encouraged by the added value of both projects towards production grade software.

@kit-ty-kate
Copy link

ping

@christoph-cullmann
Copy link

Would appreciate if this could be merged, too.

README.md Outdated
Comment on lines 28 to 33
These instructions are for using statmemprof with OCaml 5.3.0+trunk

``` shell
# Setup a new Blank switch
opam switch create 5.3.0~alpha1 --no-install
eval $(opam env --switch=5.3.0~alpha1 --set-switch)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
These instructions are for using statmemprof with OCaml 5.3.0+trunk
``` shell
# Setup a new Blank switch
opam switch create 5.3.0~alpha1 --no-install
eval $(opam env --switch=5.3.0~alpha1 --set-switch)
These instructions are for using statmemprof with OCaml >= 5.3
``` shell
# Setup a new Blank switch
opam switch create ocaml --packages 'ocaml>=5.3' --no-install
eval $(opam env --switch=ocaml --set-switch)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @MisterDA

I don't expect these instructions to be included, the intention was to show how to test this PR.
Now that 5.3 is out it should be:

$ opam switch create 5.3.0-statmemprof 5.3.0
$ eval $(opam env --switch=5.3.0-statmemprof--set-switch)
$ opam pin memtrace.dev 'https://github.com/tmcgilchrist/memtrace.git#5_1_statmemprof'

once this gets released then a simple opam install memtrace will be enough.

(description "Generates compact traces of a program's memory use.")
(depends
(ocaml (>= 4.11.0))))
(ocaml (or (>= 5.3.0) (and (>= 4.11.0) (< 5.0))))))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this consider the ocaml-variants 5.2.0+statmemprof?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was treating that variant as an alpha release until 5.3.0 was available. I don't see much appeal in supporting that now that 5.3 is available.

@adrien-n
Copy link

I've happily been able to use memtrace on 5.3 with this PR. Thanks!

Comment on lines 148 to 149
let n = Atomic.get bytes_before_ext_sample - bytes in
Atomic.set bytes_before_ext_sample n;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it seems more appropriate to do:

Atomic.fetch_and_add bytes_before_ext_sample (- bytes)

Unless if I'm missing some detail on how ext_alloc can be used.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have much insight into how ext_alloc is being used by JaneStreet but changing this to Atomic.fetch_and_add seems safer.

Comment on lines 128 to 129
Atomic.set bytes_before_ext_sample
(Atomic.get bytes_before_ext_sample + draw_sampler_bytes s);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it seems safe to do Atomic.get, set but still Atomic.fetch_and_add is probably more efficient?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe since bytes_before_ext_sample can be mutated outside of the critical section (via ext_alloc) you actually should use fetch_and_add here too.

Comment on lines +123 to +140
let samples = Atomic.make 0 in
while Atomic.get bytes_before_ext_sample <= 0 do
ignore (Atomic.fetch_and_add bytes_before_ext_sample (draw_sampler_bytes s));
Atomic.incr samples
done;
assert (!samples > 0);
assert (Atomic.get samples > 0);
let callstack = Printexc.get_callstack max_int in
Some (Trace.Writer.put_alloc_with_raw_backtrace s.trace
(Trace.Timestamp.now ())
~length:size_words
~nsamples:!samples
~nsamples:(Atomic.get samples)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that samples can in fact be a ref as it was originally, because it's only used locally.

@tmcgilchrist tmcgilchrist force-pushed the 5_1_statmemprof branch 2 times, most recently from 8ad6a10 to f90106c Compare May 27, 2025 00:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.