[MXNET-547] Tutorial explaining how to use the profiler #11274
@ThomasDelteil @thomelane @Ishitori @safrooze
| It is often helpful to understand what operations take how much time while running a model. This helps optimize the model to run faster. In this tutorial, we will learn how to profile MXNet models to measure their running time and memory consumption using the MXNet profiler. | ||
| ## The incorrect way to profile |
This is not incorrect. You can still use wait_to_read to time the dot operation
Agree. But I don't want to suggest wait_to_read as the recommended way to measure the time taken by operations. While it might work for toy problems like this,
- it is harder to use for measuring the execution time of multiple operations (it requires wait_to_read both before and after the measured operation in multiple places),
- it is hard to use for measuring the running time of a block inside a sequence (which is common), and
- it won't work for hybrid networks.
The goal of this tutorial is to point people to a recommended way of profiling that works for almost all cases.
However, I can add a note along the lines of "While it is possible to use wait_to_read() before and after an operation to get its running time, it is not a scalable method for measuring the running time of multiple operations, especially in a Sequential or Hybrid network."
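To illustrate why explicit synchronization matters when timing asynchronous work, here is a minimal standard-library analogy. It is not MXNet code: the thread pool stands in for an asynchronous execution engine, `slow_op` is a hypothetical placeholder for an expensive operator, and `future.result()` plays the role that wait_to_read() plays in the discussion above.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_op():
    """Hypothetical stand-in for an expensive operator such as a large dot product."""
    time.sleep(0.2)
    return 42


with ThreadPoolExecutor(max_workers=1) as engine:
    start = time.perf_counter()

    # Submitting returns almost immediately: this only measures enqueueing.
    future = engine.submit(slow_op)
    enqueue_time = time.perf_counter() - start

    # Blocking on the result (the analogue of wait_to_read) measures the work itself.
    result = future.result()
    total_time = time.perf_counter() - start

print(f"enqueue only: {enqueue_time:.4f}s, with sync: {total_time:.4f}s")
```

Every measured region needs its own pair of synchronization points, which is the scalability concern raised in the bullets above.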
| Check [this](http://mxnet.incubator.apache.org/install/index.html?device=Linux&language=Python&processor=CPU) page for more information on building from source for various environments. | ||
| After building with `USE_PROFILER=True` and installing, you can import the profiler and configure it from Python code. |
USE_PROFILER=True or USE_PROFILER=1 or it doesn't matter?
| To use the profiler, you need to build MXNet with `USE_PROFILER` enabled. For example, this command will build the CPU version of MXNet on Linux, | ||
| ``` | ||
| make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_PROFILER=1 |
It would be useful to provide GPU version as well, as I assume many people would want to use it.
| Let's define a method that will run one training iteration given data and label. | ||
| ```python | ||
| # Use GPU is available |
Typo? "Use GPU if available"
Also, I am not sure it is safe to include this line. Earlier you compile the CPU version of MXNet, but here you may actually end up using gpu(), and it will fail.
| You can also dump the information collected by the profiler into a `json` file using the `profiler.dump()` function and view it in a browser. | ||
| ```python | ||
| profiler.dump() |
So, the difference between getting a plain text version vs. json is in calling "dumps()" vs "dump()"? Is it possible to change this signature?
I guess the s in dumps indicates the method returns a string. Like pickle.dumps() or json.dumps()
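The naming convention mentioned here can be seen in the standard library's `json` module. This short sketch uses a placeholder dictionary (not real profiler output) to show that `dumps` returns a string while `dump` writes to a file-like object and returns nothing.

```python
import io
import json

# Placeholder data for illustration only; not actual profiler output.
stats = {"operator": "dot", "calls": 1}

# dumps: serialize and return the result as a string.
as_text = json.dumps(stats)

# dump: serialize into a file-like object; the return value is None.
buffer = io.StringIO()
returned = json.dump(stats, buffer)

print(type(as_text).__name__, returned)
```

By the same convention, the `profiler.dumps()` and `profiler.dump()` calls discussed above would be expected to return text and write a file, respectively.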
| ```python | ||
| # Use GPU if available | ||
| ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu() |
This API has a bug. It doesn't detect GPUs on Windows (due to usage of nvidia-smi command that may not exist in Windows). I have Github issue for this. Pasting here as FYI.
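The failure mode described here can be sketched with the standard library alone. The helper below is a hypothetical illustration, not the MXNet implementation: it assumes GPU detection works by running `nvidia-smi -L` and counting the listed devices, so a machine where that command is missing from the PATH reports zero GPUs even if one is installed.

```python
import subprocess


def count_gpus_via_nvidia_smi(command="nvidia-smi"):
    """Count GPUs by parsing `<command> -L`; return 0 if the command cannot run.

    Hypothetical helper for illustration; the command name is a parameter
    only so that the missing-binary case can be demonstrated.
    """
    try:
        output = subprocess.check_output([command, "-L"], universal_newlines=True)
    except (OSError, subprocess.CalledProcessError):
        # A missing or failing binary is indistinguishable from "no GPUs".
        return 0
    return sum(1 for line in output.splitlines() if line.startswith("GPU"))


gpu_count = count_gpus_via_nvidia_smi()
missing_count = count_gpus_via_nvidia_smi("no-such-nvidia-smi-binary")
print(gpu_count, missing_count)
```

Because a missing binary silently yields zero, code that selects a context from such a count would fall back to the CPU without any warning.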
sandeep-krishnamurthy
left a comment
Thanks Indu for your contributions. Overall, LGTM. I have a few follow-up questions before we merge the changes:
- Do you want to talk about environment variables? In particular, the MXNET_EXEC_BULK_EXEC_TRAIN environment variable was very useful to me for profiling the smallest independent operations. Without it, profiler outputs are for fused operators.
- What is the plan for the existing docs on the profiler? Do you want to link this tutorial - https://mxnet.incubator.apache.org/faq/perf.html
- Would it help to include the profiler output image for this example in the tutorial, to make it fully self-contained from objective to end result?
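As a rough sketch of the environment-variable suggestion, the variable can be set from Python before MXNet is loaded. The value `"0"` (disable bulk execution during training) and the need to set it before the import are assumptions here, not something stated in this thread.

```python
import os

# Assumption: "0" disables bulk execution so that operators are profiled
# individually instead of as fused segments.
os.environ["MXNET_EXEC_BULK_EXEC_TRAIN"] = "0"

# Assumption: MXNet should be imported only after the variable is set,
# so the setting is visible when the library initializes.
# import mxnet as mx

print(os.environ["MXNET_EXEC_BULK_EXEC_TRAIN"])
```

Setting the variable in the shell before launching the script would achieve the same effect without touching the code.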
@sandeep-krishnamurthy. Thanks for the valuable inputs.
* Add first draft of profiler tutorial
* Minor changes
  - Add images from web-data
  - Add <!--notebook-skip-line-->
* Language corrections
* Minor changes
  - Fix image URLs
  - Fix formatting of output
* Minor changes
  - Add download button.
  - Hide profile_stats.png in notebook.
* Add tutorial to index.
* Add tutorial to tests.
* Add a note about nd.waitall()
* Remove the example build command. Link to installation page is sufficient.
* Fix typo
* Include info about env variables related to profiling
* Add a further reading section
Description
Tutorial explaining how to use the profiler.
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.