Add optimization to norm for common norms #5722

zou3519 · 2018-03-12T22:29:18Z

The reported issue is that on the CPU path, norm(value, dim) is slower than manually using pow, sqrt, and summing.

It turns out that the CPU path for norm(value, dim) is missing optimizations for the value = 1 and value = 2 cases. I added those, plus an optimization for value = 3 (not sure if this one is necessary, but the same optimization is used for tensor.pow(3)).
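To make the idea concrete, here is a minimal sketch of what a fast path for common norm values can look like. It is plain Python over a flat list, not the CPU kernel code this PR touches; the function names `norm_fastpath` and `norm_generic` are hypothetical and exist only for illustration. The point is that p = 1, 2, 3 can be computed with `abs` and multiplications instead of a general power call per element.

```python
import math


def norm_generic(values, p):
    """General p-norm (p > 0 assumed): one power call per element."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def norm_fastpath(values, p):
    """p-norm with special cases for p = 1, 2, 3.

    Hypothetical illustration only; the real change lives in the
    library's CPU reduction code, which is not reproduced here.
    """
    if p == 1:
        # Sum of absolute values, no power call at all.
        return sum(abs(v) for v in values)
    if p == 2:
        # v * v instead of pow(v, 2), then a single square root.
        return math.sqrt(sum(v * v for v in values))
    if p == 3:
        # |v| * |v| * |v| instead of pow(|v|, 3), then one cube root.
        total = 0.0
        for v in values:
            a = abs(v)
            total += a * a * a
        return total ** (1.0 / 3.0)
    # Everything else falls back to the general formula.
    return norm_generic(values, p)
```

The special cases only change how each element's contribution is computed; the final root is still taken once per reduction, so the result should agree with the general formula up to floating-point rounding.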

@li-roy could you take a look?

Perf numbers:


In [1]: import torch
   ...: x = torch.randn(1024, 256)
   ...: y = torch.randn(1024, 256)
   ...:
   ...: %timeit torch.norm(x-y, 1, 1)
   ...: %timeit (x-y).sum(1)
   ...:
   ...: %timeit torch.norm(x-y, 2, 1)
   ...: %timeit torch.sqrt((x-y).pow(2).sum(1))
   ...:
   ...: %timeit torch.norm(x-y, 3, 1)
   ...: %timeit torch.pow((x - y).abs().pow(3).sum(1), 1/3)
   ...:
362 µs ± 56.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
332 µs ± 33.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

340 µs ± 8.42 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
358 µs ± 5.87 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

352 µs ± 4.55 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
691 µs ± 49.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
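
For reference, each pair of timings above compares torch.norm(v, p, 1) against a hand-written expression, with v = x - y reduced over dim 1. The standard definitions being compared are sketched below. One caveat about the snippet as written: `(x-y).sum(1)` has no `abs()`, so it matches the 1-norm only when every entry is non-negative; `(x-y).abs().sum(1)` would be the exact manual equivalent. This is an observation about the snippet only, and the timings were not re-run for it.

```latex
\|v\|_p = \Bigl(\sum_i |v_i|^p\Bigr)^{1/p}, \qquad
\|v\|_1 = \sum_i |v_i|, \qquad
\|v\|_2 = \sqrt{\sum_i v_i^2}, \qquad
\|v\|_3 = \Bigl(\sum_i |v_i|^3\Bigr)^{1/3}
```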

apaszke

Does CUDA have optimized implementations?

zou3519 · 2018-03-12T23:13:32Z

Yup!

JaeDukSeo · 2019-04-19T07:04:51Z

wait so is this fixed...?

soumith · 2019-04-19T15:15:13Z

@JaeDukSeo yes

Add optimization to norm for common norms

69cdea6

onnxbot-worker-1 mentioned this pull request Mar 12, 2018

[auto] pytorch-pr-5722 onnxbot/onnx-fb-universe#1078

Closed

apaszke approved these changes Mar 12, 2018

soumith merged commit 542fbcc into pytorch:master Mar 12, 2018

fmassa mentioned this pull request Mar 15, 2018

Feature Request: CPU performance optimization with MKL-DNN #4186

Open

ksanjeevan mentioned this pull request May 11, 2019

first attempt of mono downmix in the magnitude domain keunwoochoi/torchaudio-contrib#45

Open

laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026

Add optimization to norm for common norms (pytorch#5722)

1780b40

soumith merged 1 commit into pytorch:master from zou3519:pow-fastpath

4 participants
