sycl: Remove not needed copy f16->f32 for dnnl mul mat #14125
Merged
This PR proposes that, when `GGML_SYCL_F16=ON`, oneDNN handle the conversion itself: the mul_mat result is written directly into f32, with the fpmath mode set to f16. The current approach uses the memory pool to pass an f16 `dst` to the oneDNN matmul and then performs a copy from f16 into the actual f32 output, `dst_dd_i`.
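The difference in data flow between the two approaches can be sketched in plain C++. This is only an illustration under stated assumptions, not the code from this PR: it uses no SYCL or oneDNN, the function names `mul_mat_staged` and `mul_mat_direct` are hypothetical, a `std::vector` stands in for the pool allocation, and the f16 helpers are a simplified encoding (truncating, subnormals flushed to zero).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified float -> half encoding (illustration only): truncates the
// mantissa, flushes subnormals to zero, maps overflow and NaN to infinity.
inline uint16_t float_to_half(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    const uint32_t sign = (bits >> 16) & 0x8000u;
    const int32_t  exp  = static_cast<int32_t>((bits >> 23) & 0xFFu) - 127 + 15;
    const uint32_t mant = (bits >> 13) & 0x3FFu;
    if (exp <= 0)  return static_cast<uint16_t>(sign);
    if (exp >= 31) return static_cast<uint16_t>(sign | 0x7C00u);
    return static_cast<uint16_t>(sign | (static_cast<uint32_t>(exp) << 10) | mant);
}

// Matching half -> float decoding for the simplified encoding above.
inline float half_to_float(uint16_t h) {
    const uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
    const uint32_t exp  = (h >> 10) & 0x1Fu;
    const uint32_t mant = h & 0x3FFu;
    uint32_t bits;
    if (exp == 0)       bits = sign;  // zero (subnormals were flushed)
    else if (exp == 31) bits = sign | 0x7F800000u | (mant << 13);
    else                bits = sign | ((exp - 15 + 127) << 23) | (mant << 13);
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// Dot product of row i of a (m x k) with column j of b (k x n), both stored
// row-major as f16 and accumulated in f32.
inline float dot_row_col(const std::vector<uint16_t>& a, const std::vector<uint16_t>& b,
                         std::size_t i, std::size_t j, std::size_t k, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t p = 0; p < k; ++p) {
        acc += half_to_float(a[i * k + p]) * half_to_float(b[p * n + j]);
    }
    return acc;
}

// "Current approach" data flow: matmul into a temporary f16 dst, then a
// separate f16 -> f32 copy pass into the real output.
inline void mul_mat_staged(const std::vector<uint16_t>& a, const std::vector<uint16_t>& b,
                           std::size_t m, std::size_t k, std::size_t n, std::vector<float>& dst) {
    assert(a.size() == m * k && b.size() == k * n);
    std::vector<uint16_t> dst_f16(m * n);  // stand-in for the pool-allocated f16 dst
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = 0; j < n; ++j)
            dst_f16[i * n + j] = float_to_half(dot_row_col(a, b, i, j, k, n));
    dst.resize(m * n);
    for (std::size_t idx = 0; idx < m * n; ++idx)
        dst[idx] = half_to_float(dst_f16[idx]);  // the extra copy this PR removes
}

// "Proposed" data flow: the matmul writes its f32 result straight into dst,
// with no temporary buffer and no second pass.
inline void mul_mat_direct(const std::vector<uint16_t>& a, const std::vector<uint16_t>& b,
                           std::size_t m, std::size_t k, std::size_t n, std::vector<float>& dst) {
    assert(a.size() == m * k && b.size() == k * n);
    dst.resize(m * n);
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = 0; j < n; ++j)
            dst[i * n + j] = dot_row_col(a, b, i, j, k, n);
}
```

Both functions take the same f16 inputs and fill the same f32 output; the staged variant simply needs one more allocation and one more pass over `dst`, which is the work the proposed change hands over to oneDNN.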
Example of performance difference observed:
Lunar Lake
current approach:
proposed changes:
Battlemage
current approach:
proposed changes: