
indhub · 2018-06-14T06:03:33Z

Description

Tutorial explaining how to use the profiler.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage:
Code is well-documented:
Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html

indhub · 2018-06-14T06:06:39Z

@ThomasDelteil @thomelane @Ishitori @safrooze
Please take a look when you get time.

piiswrong · 2018-06-14T17:17:51Z

+
+It is often helpful to understand what operations take how much time while running a model. This helps optimize the model to run faster. In this tutorial, we will learn how to profile MXNet models to measure their running time and memory consumption using the MXNet profiler.
+
+## The incorrect way to profile


This is not incorrect. You can still use wait_to_read to time the dot operation

Agree. But I don't want to suggest wait_to_read as the recommended way to measure the time taken by operations. While it might work for toy problems like this:

- It is harder to use for measuring the execution time of multiple operations (it requires wait_to_read both before and after the measured operation in multiple places).
- It is hard to use for measuring the running time of a block inside a sequence (which is common).
- It won't work for hybrid networks.

The goal of this tutorial is to point people to a recommended way of profiling that works for almost all cases.

However I can add a note along the lines of "While it is possible to use wait_to_read() before and after an operation to get running time of an operation, it is not a scalable method to measure running time of multiple operations, especially in a Sequential or Hybrid network"
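To make the "synchronize before and after" burden concrete, here is a minimal sketch of manual timing. The `timed` helper and its `sync` parameter are hypothetical names introduced for illustration; with MXNet one would presumably pass something like `mx.nd.waitall` as `sync`, which is not exercised here so that the snippet runs without MXNet installed.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results, sync=lambda: None):
    """Store the wall-clock seconds of the enclosed block in results[label].

    `sync` is called once before the clock starts and once before it stops,
    so that asynchronously queued work is attributed to the right block.
    """
    sync()  # drain work that was queued before this block
    start = time.perf_counter()
    try:
        yield
    finally:
        sync()  # wait for the block's own work to finish
        results[label] = time.perf_counter() - start


# Stand-in synchronization hook that only counts how often it is called.
# Assumption: with MXNet this would be replaced by e.g. mx.nd.waitall.
sync_calls = []
timings = {}
with timed("dot", timings, sync=lambda: sync_calls.append(1)):
    total = sum(i * i for i in range(10_000))
```

Every measured region needs its own `with timed(...)` wrapper and two synchronization points, which is why this approach gets unwieldy for blocks inside a Sequential network and cannot reach inside a hybridized graph.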

Ishitori · 2018-06-14T21:06:08Z

+
+Check [this](http://mxnet.incubator.apache.org/install/index.html?device=Linux&language=Python&processor=CPU) page for more information on building from source for various environments. 
+
+After building with `USE_PROFILER=True` and installing, you can import the profiler and configure it from Python code.


USE_PROFILER=True or USE_PROFILER=1 or it doesn't matter?

Ishitori · 2018-06-14T21:07:15Z

+To use the profiler, you need to build MXNet with `USE_PROFILER` enabled. For example, this command will build the CPU version of MXNet on Linux,
+
+```
+make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_PROFILER=1


It would be useful to provide GPU version as well, as I assume many people would want to use it.

Ishitori · 2018-06-14T21:08:33Z

+Let's define a method that will run one training iteration given data and label.
+
+```python
+# Use GPU is available


Typo? "Use GPU if available"

Also, I am not sure that it is safe to include this line. Earlier in the tutorial you compile the CPU version of MXNet, but here you may end up using gpu(), which will fail.

Ishitori · 2018-06-14T21:14:45Z

+You can also dump the information collected by the profiler into a `json` file using the `profiler.dump()` function and view it in a browser.
+
+```python
+profiler.dump()


So, the difference between getting a plain text version vs. json is in calling "dumps()" vs "dump()"? Is it possible to change this signature?

I guess the s in dumps indicates the method returns a string. Like pickle.dumps() or json.dumps()
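For reference, the standard-library convention alluded to here looks like the sketch below. It only demonstrates `json.dumps()` versus `json.dump()`; whether the MXNet profiler's `dumps()` and `dump()` mirror this exactly is the guess made in the comment above, not something this snippet verifies.

```python
import io
import json

stats = {"operator": "dot", "calls": 1}

# dumps(): serialize and return the result as a string.
text = json.dumps(stats)

# dump(): serialize into a file-like object; the call itself returns None.
buffer = io.StringIO()
result = json.dump(stats, buffer)
```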


sandeep-krishnamurthy · 2018-06-17T00:15:59Z

+
+```python
+# Use GPU if available
+ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()


This API has a bug. It doesn't detect GPUs on Windows (it relies on the nvidia-smi command, which may not exist on Windows). I have a GitHub issue for this. Pasting here as FYI.
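A defensive variant of this kind of detection could look like the sketch below. The helper names `count_gpus_from_listing` and `detect_gpu_count` are hypothetical and this is not MXNet's implementation; it assumes that `nvidia-smi -L` prints one line per device starting with `GPU `, and it simply reports zero GPUs when the command is missing or fails.

```python
import shutil
import subprocess


def count_gpus_from_listing(listing):
    """Count the lines of `nvidia-smi -L` style output that describe a GPU."""
    return sum(1 for line in listing.splitlines() if line.startswith("GPU "))


def detect_gpu_count():
    """Return the number of GPUs reported by nvidia-smi, or 0 if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # The command is not on PATH (common on Windows): report no GPUs.
        return 0
    try:
        listing = subprocess.check_output(
            [exe, "-L"], universal_newlines=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        # The command exists but could not run, failed, or timed out.
        return 0
    return count_gpus_from_listing(listing)
```

A tutorial could then pick the context with something like `mx.gpu() if detect_gpu_count() > 0 else mx.cpu()`. Note that this still does not address the earlier concern: a CPU-only build of MXNet on a machine with a GPU would detect the device and then fail when using `mx.gpu()`.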

sandeep-krishnamurthy

Thanks Indu for your contributions. Overall, LGTM. I have a few follow-up questions before we merge the changes:

Do you want to talk about environment variables? In particular, the MXNET_EXEC_BULK_EXEC_TRAIN environment variable was very useful to me for profiling the smallest individual operations. Without it, the profiler output is for fused operators.
What is the plan for the existing docs on the profiler? Do you want to link this tutorial with the perf FAQ (https://mxnet.incubator.apache.org/faq/perf.html)?
Will it help to include the profiler output image for this example in the tutorial, to make it a fully self-contained tutorial from objective to end result?
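As a sketch of the environment-variable point above: the variable can be set from Python before MXNet is imported. Using the value `"0"` to turn bulk execution off and setting it before the import are assumptions made here as the conservative choice; the MXNet environment-variable documentation is the authority on accepted values and timing.

```python
import os

# Assumption: "0" disables bulk execution during training so that the
# profiler reports individual operators instead of bulked segments.
# Set it before `import mxnet` so the library sees it from the start.
os.environ["MXNET_EXEC_BULK_EXEC_TRAIN"] = "0"

# import mxnet as mx  # would follow here in a real script
```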

indhub · 2018-06-18T20:24:24Z

@sandeep-krishnamurthy. Thanks for the valuable inputs.

I've added a section on the environment variables.
I've linked to the perf faq.
The tutorial already has an image that shows what the profiler output looks like. To generate the visualization that Chrome produces inside a Jupyter notebook, we would need a library that can do that, and I'm not aware of any. Note also that Chrome offers ways to navigate the trace output (like zooming into a specific timeframe). Unless there is a way to do all that from a Jupyter notebook, I would recommend the Chrome trace viewer as the preferred way to view tracing information.
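Short of an interactive viewer, a notebook can at least aggregate the dumped trace into per-operator totals. The sketch below assumes the file follows the Chrome Trace Event Format (either a bare list of events or an object with a `traceEvents` list, using `B`/`E` begin and end pairs or `X` complete events); the exact fields MXNet emits are an assumption, and `summarize_trace` is a hypothetical helper, shown here on hand-written sample data rather than real profiler output.

```python
import json
from collections import defaultdict


def summarize_trace(trace):
    """Return {event name: total duration in microseconds}."""
    events = trace["traceEvents"] if isinstance(trace, dict) else trace
    totals = defaultdict(float)
    open_events = {}  # (pid, tid, name) -> stack of begin timestamps
    for ev in events:
        phase = ev.get("ph")
        key = (ev.get("pid"), ev.get("tid"), ev.get("name"))
        if phase == "X":
            # Complete event: the duration is stored directly.
            totals[ev["name"]] += ev.get("dur", 0)
        elif phase == "B":
            open_events.setdefault(key, []).append(ev["ts"])
        elif phase == "E" and open_events.get(key):
            # End event: pair it with the most recent matching begin.
            totals[ev["name"]] += ev["ts"] - open_events[key].pop()
    return dict(totals)


# Hand-written sample data, not real profiler output.
sample = json.loads("""
{"traceEvents": [
  {"name": "dot",  "ph": "B", "ts": 100, "pid": 0, "tid": 1},
  {"name": "dot",  "ph": "E", "ts": 350, "pid": 0, "tid": 1},
  {"name": "relu", "ph": "X", "ts": 400, "dur": 50, "pid": 0, "tid": 1},
  {"name": "dot",  "ph": "B", "ts": 500, "pid": 0, "tid": 1},
  {"name": "dot",  "ph": "E", "ts": 600, "pid": 0, "tid": 1}
]}
""")
summary = summarize_trace(sample)
```

This gives a static table only; zooming and timeline navigation still require the Chrome trace viewer.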

sandeep-krishnamurthy

LGTM! Thanks.
The build failed because of an unrelated flaky test. Can you please restart it? We cannot merge without green builds.

* Add first draft of profiler tutorial
* Minor changes
  - Add images from web-data
  - Add
* Language corrections
* Minor changes
  - Fix image URLs
  - Fix formatting of output
* Minor changes
  - Add download button.
  - Hide profile_stats.png in notebook.
* Add tutorial to index.
* Add tutorial to tests.
* Add a note about nd.waitall()
* Remove the example build command. Link to installation page is sufficient.
* Fix typo
* Include info about env variables related to profiling
* Add a further reading section

indhub added 7 commits June 11, 2018 18:06

Add first draft of profiler tutorial

d55d0d7

Minor changes

273797d

- Add images from web-data - Add

Language corrections

3e8a999

Minor changes

e177c70

- Fix image URLs - Fix formatting of output

Minor changes

823ea3e

- Add download button. - Hide profile_stats.png in notebook.

Add tutorial to index.

5041391

Add tutorial to tests.

d9e6540

indhub requested a review from szha as a code owner June 14, 2018 06:03

piiswrong reviewed Jun 14, 2018


Add a note about nd.waitall()

553211f

Ishitori reviewed Jun 14, 2018


indhub added 2 commits June 15, 2018 07:33

Remove the example build command.

3d272a0

Link to installation page is sufficient.

Fix typo

8f76e28

sandeep-krishnamurthy reviewed Jun 17, 2018


sandeep-krishnamurthy suggested changes Jun 17, 2018


indhub added 2 commits June 18, 2018 19:24

Include info about env variables related to profiling

54eae81

Add a further reading section

0ae6df1

sandeep-krishnamurthy approved these changes Jun 18, 2018


sandeep-krishnamurthy merged commit 00681c3 into apache:master Jun 19, 2018


