rerun rrd stats
#10593
looks super useful!
code lgtm, seemed all very sane
match res {
    Ok(msg) => {
        num_chunks += 1;
Technically that counts arbitrary messages and not chunks, since not every message is a chunk. Not that it makes much difference in practice. Shouldn't that go only in the Ok(Some()) below? Since it's only used for rate printing, it would be more accurate to just call it num_msg_processed and print as much.
woops
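A minimal sketch of the suggestion in this thread, assuming a hypothetical decoder that yields `Result<Option<Msg>, String>` per item. `Msg` and `count_processed` are illustrative names, not the actual types or functions in the PR. The counter is only bumped in the `Some` arm and is named after what it really counts:

```rust
/// Hypothetical stand-in for a decoded message; not the actual rerun type.
#[derive(Debug, Clone, PartialEq)]
pub enum Msg {
    /// A message that carries chunk data.
    Chunk(Vec<u8>),
    /// Any other kind of message.
    Other,
}

/// Returns `(num_msg_processed, num_chunks)`.
///
/// Only items that actually decoded to a message (`Ok(Some(_))`) are counted,
/// and chunks are tracked separately since not every message is a chunk.
pub fn count_processed(
    results: impl IntoIterator<Item = Result<Option<Msg>, String>>,
) -> Result<(u64, u64), String> {
    let mut num_msg_processed = 0u64;
    let mut num_chunks = 0u64;

    for res in results {
        match res? {
            Some(msg) => {
                num_msg_processed += 1;
                if matches!(msg, Msg::Chunk(_)) {
                    num_chunks += 1;
                }
            }
            // Nothing was decoded for this item, so nothing is counted.
            None => {}
        }
    }

    Ok((num_msg_processed, num_chunks))
}
```

With this split, the rate log can report messages per second while the chunk count stays accurate for anything that needs it.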
        uncompressed.as_slice()
    }

    huh => anyhow::bail!("unknown Compression: {huh}"),
huh :D
let schema = decoded.batch.schema();

let entity_path = {
    let entity_path = schema.metadata().get("rerun:entity_path");
Do we have access to constants for all these metadata names here? It's not like we never rename those, and you don't have any tests in here, so the tool might silently break :/
I've purposefully written this as if I were an external user building their own little independent data-crunching tool. Part of the value of Sorbet is that it's supposed to give me a stable ABI that I can target: I think it's more than time to put that promise to the test, especially now that we are starting to build datasets that cannot possibly be regenerated from scratch.
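One way an external tool could target the metadata key directly while still addressing the "silently break" concern is to fail loudly when the key is missing. This is only a sketch: a plain `HashMap<String, String>` stands in for the schema metadata, and `ENTITY_PATH_KEY` and `entity_path` are hypothetical names introduced for illustration. Only the `"rerun:entity_path"` string comes from the snippet under review.

```rust
use std::collections::HashMap;

/// Metadata key as it appears in the snippet under review.
/// The constant itself is a hypothetical helper, not a rerun API.
pub const ENTITY_PATH_KEY: &str = "rerun:entity_path";

/// Looks up the entity path in schema-level metadata.
///
/// Returns an error instead of silently continuing when the key is absent,
/// so a rename of the key shows up as a hard failure.
pub fn entity_path(metadata: &HashMap<String, String>) -> Result<&str, String> {
    metadata
        .get(ENTITY_PATH_KEY)
        .map(String::as_str)
        .ok_or_else(|| format!("missing `{ENTITY_PATH_KEY}` in schema metadata"))
}
```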
) {
    let path_to_input_rrds = paths
        .iter()
        .filter(|s| !s.is_empty()) // Avoid a problem with `pixi run check-backwards-compatibility`
... it passes in empty quoted paths?
I have no idea, this is very old code
Title. In particular this:
* Fixes a nasty `O(n**2)` that slipped in, bringing back performance to sub-linear levels.
* Introduces compaction passes in the CLI, which is super important to get the most out of compaction in offline scenarios.

```sh
$ pixi run rerun-release rrd compact problematic.rrd -o compacted.rrd --max-rows 999999999 --max-bytes 1048576 --num-pass 50
# [...]
[2025-07-10T17:44:58Z INFO rerun::commands::rrd::merge_compact] processed 1289999 messages so far, current speed is 19312.92 msg/s
[2025-07-10T17:44:58Z INFO rerun::commands::rrd::merge_compact] processed 1299999 messages so far, current speed is 19985.97 msg/s
[2025-07-10T17:44:59Z INFO rerun::commands::rrd::merge_compact] processed 1309999 messages so far, current speed is 19466.20 msg/s
[2025-07-10T17:44:59Z INFO rerun::commands::rrd::merge_compact] processed 1319999 messages so far, current speed is 18687.46 msg/s
[2025-07-10T17:45:00Z INFO rerun::commands::rrd::merge_compact] processed 1329999 messages so far, current speed is 17226.69 msg/s
[2025-07-10T17:45:00Z INFO rerun::commands::rrd::merge_compact] processed 1339999 messages so far, current speed is 18906.00 msg/s
[2025-07-10T17:45:01Z INFO rerun::commands::rrd::merge_compact] processed 1349999 messages so far, current speed is 17622.72 msg/s
[2025-07-10T17:45:01Z INFO rerun::commands::rrd::merge_compact] processed 1359999 messages so far, current speed is 18320.93 msg/s
[2025-07-10T17:45:02Z INFO rerun::commands::rrd::merge_compact] processed 1369999 messages so far, current speed is 19278.27 msg/s
[2025-07-10T17:45:03Z INFO rerun::commands::rrd::merge_compact] processed 1379999 messages so far, current speed is 19101.29 msg/s
[2025-07-10T17:45:03Z INFO rerun::commands::rrd::merge_compact] processed 1389999 messages so far, current speed is 20092.77 msg/s
[2025-07-10T17:45:04Z INFO rerun::commands::rrd::merge_compact] processed 1399999 messages so far, current speed is 19879.84 msg/s
[2025-07-10T17:45:04Z INFO rerun::commands::rrd::merge_compact] processed 1409999 messages so far, current speed is 19367.66 msg/s
[2025-07-10T17:45:05Z INFO rerun::commands::rrd::merge_compact] processed 1419999 messages so far, current speed is 19952.79 msg/s
[2025-07-10T17:45:05Z INFO rerun::commands::rrd::merge_compact] processed 1429999 messages so far, current speed is 20056.93 msg/s
[2025-07-10T17:45:06Z INFO rerun::commands::rrd::merge_compact] processed 1439999 messages so far, current speed is 20823.57 msg/s
[2025-07-10T17:45:06Z INFO rerun::commands::rrd::merge_compact] running extra compaction pass… pass=0
[2025-07-10T17:45:13Z INFO rerun::commands::rrd::merge_compact] extra compaction pass completed pass=0 num_chunks_before=361505 num_chunks_after=184543 num_chunks_reduction="-48.951%" time=6.730054646s
[2025-07-10T17:45:13Z INFO rerun::commands::rrd::merge_compact] running extra compaction pass… pass=1
[2025-07-10T17:45:16Z INFO rerun::commands::rrd::merge_compact] extra compaction pass completed pass=1 num_chunks_before=184543 num_chunks_after=96366 num_chunks_reduction="-47.781%" time=3.892597363s
[2025-07-10T17:45:16Z INFO rerun::commands::rrd::merge_compact] running extra compaction pass… pass=2
[2025-07-10T17:45:18Z INFO rerun::commands::rrd::merge_compact] extra compaction pass completed pass=2 num_chunks_before=96366 num_chunks_after=54254 num_chunks_reduction="-43.700%" time=1.549344247s
[2025-07-10T17:45:18Z INFO rerun::commands::rrd::merge_compact] running extra compaction pass… pass=3
[2025-07-10T17:45:19Z INFO rerun::commands::rrd::merge_compact] extra compaction pass completed pass=3 num_chunks_before=54254 num_chunks_after=33331 num_chunks_reduction="-38.565%" time=762.644953ms
[2025-07-10T17:45:19Z INFO rerun::commands::rrd::merge_compact] running extra compaction pass… pass=4
[2025-07-10T17:45:19Z INFO rerun::commands::rrd::merge_compact] extra compaction pass completed pass=4 num_chunks_before=33331 num_chunks_after=22876 num_chunks_reduction="-31.367%" time=486.387978ms
[2025-07-10T17:45:19Z INFO rerun::commands::rrd::merge_compact] running extra compaction pass… pass=5
[2025-07-10T17:45:20Z INFO rerun::commands::rrd::merge_compact] extra compaction pass completed pass=5 num_chunks_before=22876 num_chunks_after=17652 num_chunks_reduction="-22.836%" time=313.659082ms
[2025-07-10T17:45:20Z INFO rerun::commands::rrd::merge_compact] running extra compaction pass… pass=6
[2025-07-10T17:45:20Z INFO rerun::commands::rrd::merge_compact] extra compaction pass completed pass=6 num_chunks_before=17652 num_chunks_after=15042 num_chunks_reduction="-14.786%" time=232.602897ms
[2025-07-10T17:45:20Z INFO rerun::commands::rrd::merge_compact] running extra compaction pass… pass=7
[2025-07-10T17:45:20Z INFO rerun::commands::rrd::merge_compact] extra compaction pass completed pass=7 num_chunks_before=15042 num_chunks_after=13739 num_chunks_reduction="-8.662%" time=192.348912ms
[2025-07-10T17:45:20Z INFO rerun::commands::rrd::merge_compact] running extra compaction pass… pass=8
[2025-07-10T17:45:20Z INFO rerun::commands::rrd::merge_compact] extra compaction pass completed pass=8 num_chunks_before=13739 num_chunks_after=13088 num_chunks_reduction="-4.738%" time=163.27111ms
[2025-07-10T17:45:20Z INFO rerun::commands::rrd::merge_compact] running extra compaction pass… pass=9
[2025-07-10T17:45:20Z INFO rerun::commands::rrd::merge_compact] extra compaction pass completed pass=9 num_chunks_before=13088 num_chunks_after=12822 num_chunks_reduction="-2.032%" time=124.721993ms
[2025-07-10T17:45:20Z INFO rerun::commands::rrd::merge_compact] running extra compaction pass… pass=10
[2025-07-10T17:45:20Z INFO rerun::commands::rrd::merge_compact] extra compaction pass completed pass=10 num_chunks_before=12822 num_chunks_after=12686 num_chunks_reduction="-1.061%" time=136.477701ms
[2025-07-10T17:45:20Z INFO rerun::commands::rrd::merge_compact] running extra compaction pass… pass=11
[2025-07-10T17:45:21Z INFO rerun::commands::rrd::merge_compact] extra compaction pass completed pass=11 num_chunks_before=12686 num_chunks_after=12635 num_chunks_reduction="-0.402%" time=140.33912ms
[2025-07-10T17:45:21Z INFO rerun::commands::rrd::merge_compact] running extra compaction pass… pass=12
[2025-07-10T17:45:21Z INFO rerun::commands::rrd::merge_compact] extra compaction pass completed pass=12 num_chunks_before=12635 num_chunks_after=12634 num_chunks_reduction="-0.008%" time=87.742732ms
[2025-07-10T17:45:21Z INFO rerun::commands::rrd::merge_compact] running extra compaction pass… pass=13
[2025-07-10T17:45:21Z INFO rerun::commands::rrd::merge_compact] extra compaction pass completed pass=13 num_chunks_before=12634 num_chunks_after=12634 num_chunks_reduction="-0.000%" time=85.48635ms
[2025-07-10T17:45:21Z INFO rerun::commands::rrd::merge_compact] cannot possibly improve further, stopping early pass=13 time=85.50556ms
[2025-07-10T17:45:22Z INFO rerun::commands::rrd::merge_compact] preparing output…
[2025-07-10T17:45:22Z INFO rerun::commands::rrd::merge_compact] encoding…
[2025-07-10T17:45:28Z INFO rerun::commands::rrd::merge_compact] merge/compaction finished srcs=["problematic.rrd"] time=94.480722097s num_chunks_before=1446008 num_chunks_after=12634 num_chunks_reduction="-99.126%" srcs_size_bytes=9.2 GiB dst_size_bytes=7.3 GiB size_reduction="-20.822%"
```

---

* DNM: requires #10593
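The pass loop visible in the log (run a pass, compare chunk counts, stop early once a pass no longer reduces anything) can be sketched with a toy model. Everything here is an assumption made for illustration: chunks are reduced to bare row counts, a pass merges adjacent pairs that fit under `max_rows`, and the function names are hypothetical. It is not the actual rerun compaction algorithm.

```rust
/// One toy compaction pass: merge each chunk with its right neighbour when the
/// combined row count stays within `max_rows`. Chunks are modelled as row counts.
pub fn compact_pass(chunks: &[u64], max_rows: u64) -> Vec<u64> {
    let mut out = Vec::with_capacity(chunks.len());
    let mut i = 0;
    while i < chunks.len() {
        if i + 1 < chunks.len() && chunks[i] + chunks[i + 1] <= max_rows {
            out.push(chunks[i] + chunks[i + 1]);
            i += 2;
        } else {
            out.push(chunks[i]);
            i += 1;
        }
    }
    out
}

/// Runs up to `num_pass` passes and stops early as soon as a pass leaves the
/// chunk count unchanged. Returns the final chunks and the number of passes run.
pub fn compact(mut chunks: Vec<u64>, max_rows: u64, num_pass: u32) -> (Vec<u64>, u32) {
    let mut passes_run = 0;
    for _ in 0..num_pass {
        let before = chunks.len();
        chunks = compact_pass(&chunks, max_rows);
        passes_run += 1;
        if chunks.len() == before {
            // Cannot improve further: the last pass merged nothing.
            break;
        }
    }
    (chunks, passes_run)
}

/// Relative change in chunk count, in percent (negative means fewer chunks).
pub fn reduction_percent(before: usize, after: usize) -> f64 {
    if before == 0 {
        return 0.0;
    }
    (after as f64 - before as f64) / before as f64 * 100.0
}
```

In this toy model the last pass is always a no-op that only confirms the fixed point, which mirrors pass 13 in the log. `reduction_percent` reproduces the `num_chunks_reduction` arithmetic, e.g. 361505 to 184543 chunks is roughly -48.951%.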
It does what you think it does... but with a twist! It can also compute stats at the transport layer, which is very handy as it allows us to measure things that are very important for Redap, e.g. how much of the on-disk data is schema vs. data.
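As a rough illustration of the schema vs. data measurement, here is a sketch that aggregates per-message byte counts into a schema fraction. `TransportSizes` and `schema_fraction` are hypothetical names, and how the byte counts are obtained from an RRD file is left out entirely; this does not reflect the actual output or implementation of `rerun rrd stats`.

```rust
/// Hypothetical per-message byte counts measured at the transport layer.
/// The struct and its field names are illustrative only.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct TransportSizes {
    pub schema_bytes: u64,
    pub data_bytes: u64,
}

/// Fraction of the total bytes taken up by schema, in `[0, 1]`.
/// Returns `None` when there are no bytes at all.
pub fn schema_fraction(msgs: &[TransportSizes]) -> Option<f64> {
    let schema: u64 = msgs.iter().map(|m| m.schema_bytes).sum();
    let data: u64 = msgs.iter().map(|m| m.data_bytes).sum();
    let total = schema + data;
    (total > 0).then(|| schema as f64 / total as f64)
}
```

A high fraction would suggest many small messages, each paying the schema overhead again, which is the kind of situation compaction is meant to improve.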