Update tools.md for missing color meaning issue #1491 #1624

varshneydevansh · 2023-07-06T23:16:53Z

Issue >

Null hypothesis test discrepancy #1491

Solution Proposed >

The documentation needs to be enhanced.

Changes made >

Added lines to the tools.md file that do the following:

1. Clarify the meaning of the colors in the output.
2. Define 'success' and 'failure' explicitly.
3. Clarify the interpretation of the p-value and the null hypothesis.

LebedevRI

Thank you for looking into this!
I'm not sure how best to write this, but I do have some thoughts.

docs/tools.md

Changes are based on my understanding from the code review from the PR.

docs/tools.md

When comparing benchmarks, `compare.py` uses statistical tests to determine whether there is a statistically-significant difference between the measurements being compared. The result of said statistical test is additionally communicated through color coding:
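To make the color coding concrete, here is a minimal Python sketch of how a p-value from such a test could be mapped to a verdict and a color. It is not the code from `compare.py`: the function name `classify_pvalue`, the 0.05 threshold, and the green/red assignment are assumptions made for illustration, so check the tool's source for the actual rule.

```python
from typing import Tuple

# Assumed significance level; hypothetical default chosen for this sketch.
ALPHA = 0.05


def classify_pvalue(p_value: float, alpha: float = ALPHA) -> Tuple[str, str]:
    """Map a p-value to a (verdict, color) pair.

    Null hypothesis: the two sets of measurements do not differ.
    The color assignment below is an assumption, not confirmed behavior.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p_value < alpha:
        # The null hypothesis is rejected: the difference is significant.
        return "statistically significant difference", "green"
    # The null hypothesis is not rejected: no significant difference shown.
    return "no statistically significant difference", "red"


if __name__ == "__main__":
    for p in (0.001, 0.049, 0.05, 0.6):
        verdict, color = classify_pvalue(p)
        print(f"p={p:<6} {color:<5} {verdict}")
```

Note that a p-value at or above the threshold does not prove the benchmarks are equal; it only means the test found no evidence of a difference at the chosen significance level.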

Restructured the text and made a few tweaks for better logical reading.

varshneydevansh

Thanks for your guidance, Roman. The opening paragraph was very interesting to read, and the changes you suggested made the text more polished.

This was the first time I came across benchmarking, and I learned so much in the process.

If anything still feels inconsistent or incomplete, I would be glad to make those changes.

LebedevRI · 2023-07-07T15:28:44Z

@dmah42 someone more fluent in English will need to proofread the wording;
I can mostly only help with the content here.

dmah42 · 2023-07-07T15:37:08Z

> @dmah42 someone more fluent in English will need to proofread the wording; I can mostly only help with the content here.

seems good to me. feel free to merge it whenever you're happy with the content.

Added an output summary from a benchmark comparison with statistics provided for a multi-threaded process. Furthermore, added the breakdown of each row.

LebedevRI

@varshneydevansh thank you!
I suppose this could use some more editorial work,
but this is certainly much better than what we have now :)

varshneydevansh · 2023-07-09T16:49:59Z

Thank you so much @LebedevRI for guiding me. I recently encountered benchmarking for the first time and began learning about it. Then I realized that this is something where, alongside learning, I can even contribute.

I learned a lot from this process, not just about benchmarking, but also about how to organize thoughts so that they are easier for others to understand.

Regards.

varshneydevansh mentioned this pull request Jul 6, 2023

Null hypothesis test discrepancy #1491

Closed

LebedevRI reviewed Jul 6, 2023


added more human-facing explanation first

0303599
