Expanded Hardware Utilization Information #800
TravisCI failed because you didn't add Can you make it work with psutil v1.2.1? That's what's available on 14.04:
It's probably a good idea to just not display the information if you can't retrieve it on a certain system for whatever reason. |
Forgot to commit that one.
Yep, that's what I'm already checking for on the front-end. On the back-end, on Windows, I do not even attempt to obtain it (and therefore it will not be shown in the front-end).
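For readers following along, the "skip it if you can't get it" behaviour described above could look roughly like the sketch below. This is not the PR's actual code: `disk_io_stats` and its `system` parameter are hypothetical names, and the only psutil calls assumed are `Process.io_counters()` (psutil 2.0 and later) and its 1.x name `Process.get_io_counters()`.

```python
import platform


def disk_io_stats(proc, system=None):
    """Return disk byte counters for `proc`, or {} if they are unavailable.

    `proc` is expected to behave like a psutil.Process; `system` exists only
    so the platform check can be exercised without running on Windows.
    """
    system = system or platform.system()
    if system == "Windows":
        # Mirrors the comment above: not attempted on Windows at all.
        return {}
    # psutil >= 2.0 names the method io_counters(); 1.x used get_io_counters().
    for name in ("io_counters", "get_io_counters"):
        method = getattr(proc, name, None)
        if not callable(method):
            continue
        try:
            counters = method()
        except Exception:
            # Any failure (access denied, process gone, unsupported platform)
            # means the front-end simply gets no disk section.
            return {}
        return {"read_bytes": counters.read_bytes,
                "write_bytes": counters.write_bytes}
    return {}
```

An empty dict is what lets a template guard such as `{% if data_cpu %}` hide the section without any special casing.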
|
This is great. Do you think we could show the CPU utilization for dataset creation too? |
|
Hmm, probably. But is that useful? I haven't found a need for that myself.
|
|
It's useful if you want to make sure you are utilizing your CPUs efficiently when creating a large dataset. But don't go out of your way to support that if it's not trivial. |
|
I implemented this hardware utilization display because I had a very distinct need for it. I do not have a need for this when creating datasets at all. Secondly, implementing it is not trivial, as I do not see a way to accurately separate the hardware metrics that correspond exactly to the job; whereas currently the hardware utilization is reported only and exactly for that distinct, specific job, which is [to me and other digitizers] really neat and useful. N.B. I would also love to log these kinds of metrics, but that's for another PR. For example, my favorite metric is the GPU temperature, because it gives insight into some kind of running average of the usage, i.e. <70 deg = inefficient settings/model.
|
This looks good to me, thanks. There are conflicts that must be resolved before merging. Question: the disk write info looks OK, however the read statistics appear to be underestimated (I have a 4GB dataset and after several epochs the read counter shows only 96kB). Did I misunderstand what it's supposed to show? Or perhaps the process was reading the database from cache and it didn't count?
|
Hmm, yes indeed, it seems the disk statistics are unreliable. Possibly due to
|
|
Okido ready when you are, fixed, squashed & rebased. |
|
That still looks very good to me, let's see what @lukeyeager thinks :-) |
</dl>
{% endfor %}
{% if data_cpu %}
<h3>CPU ({% if 'pid' in data_cpu %}#{{data_cpu.pid}}{% endif %})</h3>
This is a little misleading. When I saw CPU (#10291) I thought it was talking about a CPU core or something. How about Process 20291 instead?
|
Oops I caused a merge conflict with #825. While you're rebasing, can you also:
|
|
I'm seeing some training jobs fail to finish with this change. Are you seeing the same? The Caffe task goes to 100% complete, but is stuck at |
|
No haven't seen that. Will try to reproduce. Might be because process variable |
|
I think I nailed it now @lukeyeager : replaced the shitty try block with a nice
|
|
I'm trying this now and am getting this error:
It kills the background socketio thread and now I'm not getting any GPU or CPU information.
$ python -c 'import psutil;print psutil.__file__;print psutil.__version__'
/usr/lib/python2.7/dist-packages/psutil/__init__.pyc
3.4.2
Have you tried this with older versions of
|
|
Sorry, my bad. Yeah, too new. I'll just use the old functions.
|
|
I think I fixed it. |
|
The Travis build is failing, but I think it's related to https://www.traviscistatus.com/incidents/2p40l49r3yxd. I'm following up with them ... |
Version fallback for psutil, tested for versions 1, 3 and 4, and added some checks. Implemented showing hw info also for CPU-only systems
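As an illustration of what a psutil version fallback can look like, here is a minimal sketch; it is not the code from this commit. It relies on one fact about psutil's history: version 2.0 renamed the `Process.get_*()` methods (for example `get_cpu_percent()` and `get_memory_info()`) to `cpu_percent()` and `memory_info()`, and the old names were later removed. The helpers `call_first` and `process_stats` are hypothetical names chosen for this example.

```python
import os

try:
    import psutil  # third-party; may be missing or an old distro package
except ImportError:
    psutil = None


def call_first(obj, names):
    """Call the first method in `names` that `obj` provides; None if none exist."""
    for name in names:
        method = getattr(obj, name, None)
        if callable(method):
            return method()
    return None


def process_stats(proc):
    """Collect CPU and memory figures using whichever method names exist."""
    stats = {}
    # New-style names first, then the psutil 1.x spellings.
    cpu = call_first(proc, ("cpu_percent", "get_cpu_percent"))
    if cpu is not None:
        stats["cpu_pct"] = cpu
    mem = call_first(proc, ("memory_info", "get_memory_info"))
    if mem is not None:
        stats["mem_rss"] = mem.rss  # resident set size in bytes
    return stats


if __name__ == "__main__" and psutil is not None:
    # Inspect the current process when psutil is installed.
    print(process_stats(psutil.Process(os.getpid())))
```

Probing for the method with `getattr` instead of comparing `psutil.__version__` keeps the check independent of how a distribution numbers its package.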
|
Mkay. I just squashed and rebased hoping to trigger Travis again. edit: yep, worked. I do advise to run a real test like you did before just to be sure. |
|
Looks good to me! Thanks for the nice addition and for supporting multiple versions of psutil! |
|
Does Travis also check on Windows OS? If nope, maybe ask @IsaacYangSLA to check this PR's functionality. Some |
|
I just ran a simple training task on Windows 7. The CPU / Memory usage was shown and updated correctly. So that basically concludes it works in Windows. |
…il_info Expanded Hardware Utilization Information
Needed this for identifying potential CPU (and disk) bottlenecks (for example, when testing preprocessing for #777).
An issue that this immediately reveals is, for example, the 1200% CPU usage I had due to some over-optimization in my BLAS library.
Works for Caffe and Torch, but the psutil manual tells me disk usage is not supported on Windows, so I'm checking for that one.
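A note on reading figures like the 1200% above: psutil's per-process CPU percentage is summed over cores, so a multi-threaded process can legitimately report more than 100%. The sketch below shows one way to turn that into a share of the whole machine. `share_of_machine` is a hypothetical helper, not part of this PR, and the 12-core figure in the usage note is only an illustration.

```python
def share_of_machine(process_percent, logical_cores):
    """Convert a summed per-process CPU percentage to a 0-100 machine share.

    `process_percent` is a value such as psutil reports for one process;
    `logical_cores` is the number of logical CPUs on the host.
    """
    if logical_cores < 1:
        raise ValueError("logical_cores must be at least 1")
    return process_percent / float(logical_cores)
```

For example, `share_of_machine(1200.0, 12)` gives `100.0`, meaning the process saturates every core of a 12-core host; `multiprocessing.cpu_count()` is one way to obtain the core count.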
## Caffe

## Torch