Commit 0c2c6ee
Added more fusion and vectorized kernel for transducer (#1125)
* Added support for fused ReLU and dropout into transducer joint
* Reorganized code selection path in transducer joint fwd
* Added support for fused ReLU+dropout into transducer joint
* Vectorize transducer loss backward with fused softmax (#3)
* Nanz/transducer loss (#4)
* Vectorize transducer loss backward with fused softmax
* Added a predicate to avoid potential IMA
* Nanz/transducer loss (#5)
* Vectorize transducer loss backward with fused softmax
* Added a predicate to avoid potential IMA
* Added more predicates to avoid IMAs
* Updated documentation for newly added features.
* Fixed an error in transducer.py
1 parent ed71996 commit 0c2c6ee
8 files changed
Lines changed: 662 additions & 185 deletions
File tree
- apex/contrib
- csrc/transducer
- test/transducer
- transducer
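
For orientation, the fused ReLU and dropout in the transducer joint named in the commit message can be compared against an unfused reference. The sketch below is not the code from this commit and does not use the apex API. It assumes the joint is the usual broadcast sum of an encoder output `f` (shape T x H) and a predictor output `g` (shape U x H) for a single batch element, followed by ReLU and inverted dropout. The function name `joint_relu_dropout_reference` and its parameters are hypothetical, chosen only for illustration.

```python
import random


def joint_relu_dropout_reference(f, g, relu=True, dropout_prob=0.0, rng=None):
    """Unfused reference (hypothetical helper, not the apex kernel).

    Computes out[t][u][h] = dropout(relu(f[t][h] + g[u][h])) for one
    batch element. Requires 0.0 <= dropout_prob < 1.0.
    """
    rng = rng or random.Random(0)
    keep_prob = 1.0 - dropout_prob
    out = []
    for f_t in f:                      # loop over T time steps
        row = []
        for g_u in g:                  # loop over U label steps
            cell = []
            for f_val, g_val in zip(f_t, g_u):
                x = f_val + g_val      # broadcast add of encoder and predictor
                if relu:
                    x = max(x, 0.0)
                if dropout_prob > 0.0:
                    # Inverted dropout: keep with probability keep_prob and
                    # rescale survivors so the expected value is unchanged.
                    x = x / keep_prob if rng.random() < keep_prob else 0.0
                cell.append(x)
            row.append(cell)
        out.append(row)
    return out


f = [[1.0, -2.0], [0.5, 0.5]]                # T=2, H=2
g = [[-3.0, 1.0], [0.0, 0.0], [1.0, 1.0]]    # U=3, H=2
out = joint_relu_dropout_reference(f, g)     # ReLU only, no dropout
dropped = joint_relu_dropout_reference(f, g, dropout_prob=0.5)
```

A fused kernel would be expected to produce the same values in one pass without materializing the intermediate sum, which is why a reference like this is mainly useful as a correctness check in tests.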