nondeterminism in tor simulations with --model-unblocked-syscall-latency=true #3538

@sporksmith

Description

The tor minimal test is normally quite close to fully deterministic, but enabling --model-unblocked-syscall-latency=true causes the strace logs to diverge. The divergence usually seems to start with the return time of a blocked epoll_wait call.

Repro steps

First enable both --model-unblocked-syscall-latency and deterministic strace logging for the tor test:

diff --git a/src/test/tor/minimal/CMakeLists.txt b/src/test/tor/minimal/CMakeLists.txt
index 246363d21..c73ae4836 100644
--- a/src/test/tor/minimal/CMakeLists.txt
+++ b/src/test/tor/minimal/CMakeLists.txt
@@ -15,7 +15,8 @@ add_shadow_tests(BASENAME tor-minimal
                  ARGS
                    --use-cpu-pinning true
                    --parallelism 2
-                   --strace-logging-mode off
+                   --strace-logging-mode deterministic
+                   --model-unblocked-syscall-latency=true
                    # Disable to support fork
                    --use-memory-manager false
                    --template-directory "shadow.data.template"

Build and run the test as usual. Locally I've also hacked up the tests' CMake file to omit the golang tests, since building them requires root (an unshaven yak for another day).

# build
./setup build --debug --test --extra
# run the test
./setup test --extra tor-minimal
# save the results
mv build/src/test/tor/minimal/tor-minimal-shadow.data/ build/src/test/tor/minimal/tor-minimal-shadow.data.0/
# run the test again
./setup test --extra tor-minimal
# move again
mv build/src/test/tor/minimal/tor-minimal-shadow.data/ build/src/test/tor/minimal/tor-minimal-shadow.data.1/
# diff any relay strace file
diff --side-by-side --suppress-common-lines build/src/test/tor/minimal/tor-minimal-shadow.data.[01]/hosts/relay1/tor.1000.strace | head -n10
00:15:04.206800750 [tid 1000] epoll_wait(3, <pointer>, 32, 55 |	00:15:04.206794100 [tid 1000] epoll_wait(3, <pointer>, 32, 55
000000904206800750 [tid 1000] clock_gettime(...) = 0	      |	000000904206794100 [tid 1000] clock_gettime(...) = 0
000000904206801760 [tid 1000] time(...) = 946685704	      |	000000904206795110 [tid 1000] time(...) = 946685704
000000904206801760 [tid 1000] clock_gettime(...) = 0	      |	000000904206795110 [tid 1000] clock_gettime(...) = 0
000000904206801760 [tid 1000] clock_gettime(...) = 0	      |	000000904206795110 [tid 1000] clock_gettime(...) = 0
00:15:04.206801760 [tid 1000] read(23, <pointer>, 5) = 5      |	00:15:04.206795110 [tid 1000] read(23, <pointer>, 5) = 5
00:15:04.206802790 [tid 1000] read(23, <pointer>, 531) = 531  |	00:15:04.206796140 [tid 1000] read(23, <pointer>, 531) = 531
00:15:04.206802790 [tid 1000] read(23, <pointer>, 5) = -11 (E |	00:15:04.206796140 [tid 1000] read(23, <pointer>, 5) = -11 (E
000000904206804790 [tid 1000] time(...) = 946685704	      |	000000904206798140 [tid 1000] time(...) = 946685704
000000904206804790 [tid 1000] clock_gettime(...) = 0	      |	000000904206798140 [tid 1000] clock_gettime(...) = 0

Conversely, without --model-unblocked-syscall-latency=true, the strace diffs of tor processes are clean (except for those that fork an obfs4proxy, which has other unexplained nondeterminism).
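
To check every host rather than a single relay, the comparison can be looped over all strace files in the two saved directories. The following is only a sketch: the function name first_divergence is hypothetical, and it assumes both directories contain the hosts/ layout used in the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Shadow): for each strace file that differs
# between two saved data directories, print its path and the first diff hunk.
first_divergence() {
    local dir0="$1" dir1="$2" f0 f1 rel
    # Walk every strace file in the first run and look up its twin in the second.
    find "$dir0/hosts" -name '*.strace' | sort | while read -r f0; do
        rel="${f0#"$dir0"/}"
        f1="$dir1/$rel"
        if [ ! -f "$f1" ]; then
            echo "$rel: missing in second run"
            continue
        fi
        # cmp -s prints nothing and exits non-zero when the files differ.
        if ! cmp -s "$f0" "$f1"; then
            echo "== $rel"
            # diff exits 1 on differences; "|| true" keeps strict shells happy.
            diff "$f0" "$f1" | head -n 3 || true
        fi
    done
}
```

Called as `first_divergence build/src/test/tor/minimal/tor-minimal-shadow.data.0 build/src/test/tor/minimal/tor-minimal-shadow.data.1`, it lists only the files that differ. In the relay1 excerpt above, every timestamp after the epoll_wait return differs by the same 6650 ns between the two runs, which is consistent with a single early divergence shifting all later events.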

Labels: Type: Bug