oneDPL 2022.3.0 release
New Features
- Added an experimental feature to dynamically select an execution context, e.g., a SYCL queue.
The feature provides selection functions such as select, submit and submit_and_wait,
and several selection policies: fixed_resource_policy, round_robin_policy,
dynamic_load_policy, and auto_tune_policy.
- unseq and par_unseq policies now enable vectorization also for Intel® oneAPI DPC++/C++ Compiler.
- Added support for passing zip iterators as segment value data in reduce_by_segment,
exclusive_scan_by_segment, and inclusive_scan_by_segment.
- Improved performance of the merge, sort, stable_sort, sort_by_key,
reduce, min_element, max_element, minmax_element, is_partitioned, and
lexicographical_compare algorithms with DPC++ execution policies.
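To illustrate what "zip iterators as segment value data" means for reduce_by_segment, here is a minimal sketch in plain standard C++. It is not oneDPL code and does not use the oneDPL API: the helper name reduce_by_segment_reference and the Value alias are hypothetical, a std::tuple stands in for the element a zip iterator would yield, and the element-wise sum is just one possible binary operation chosen for illustration. It assumes a segment is a run of equal adjacent keys.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Stands in for the value obtained by dereferencing a zip iterator
// over an int sequence and a double sequence (illustrative assumption).
using Value = std::tuple<int, double>;

// Hypothetical reference helper: for each run of equal adjacent keys,
// emit the key once together with the element-wise sum of its values.
std::pair<std::vector<int>, std::vector<Value>>
reduce_by_segment_reference(const std::vector<int>& keys,
                            const std::vector<Value>& values) {
    assert(keys.size() == values.size());
    std::vector<int> out_keys;
    std::vector<Value> out_values;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        if (i == 0 || keys[i] != keys[i - 1]) {
            // A new segment starts here.
            out_keys.push_back(keys[i]);
            out_values.push_back(values[i]);
        } else {
            // Same segment: accumulate each tuple component separately.
            Value& acc = out_values.back();
            std::get<0>(acc) += std::get<0>(values[i]);
            std::get<1>(acc) += std::get<1>(values[i]);
        }
    }
    return {std::move(out_keys), std::move(out_values)};
}
```

With keys 1, 1, 2, 2, 2, 3, the helper produces three output keys and three reduced tuples, one per segment.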
Fixed Issues
- Fixed the reduce_async function to not ignore the provided binary operation.
New Known Issues and Limitations
- When compiled with the -fsycl-pstl-offload option of the Intel® oneAPI DPC++/C++ Compiler and with
libstdc++ version 8 or libc++, oneapi::dpl::execution::par_unseq offloads
standard parallel algorithms to the SYCL device similarly to std::execution::par_unseq,
in accordance with the -fsycl-pstl-offload option value.
- When using the dpl modulefile to initialize the user's environment and compiling with the
-fsycl-pstl-offload option of the Intel® oneAPI DPC++/C++ Compiler, a linking issue or program crash
may be encountered because the directory containing libpstloffload.so is not included in the search path.
Use env/vars.sh to configure the working environment to avoid the issue.
- Compilation issues may be encountered when passing zip iterators to
exclusive_scan_by_segment on Windows.
- Incorrect results may be produced by set_intersection with a DPC++ execution policy,
where elements are copied from the second input range rather than the first input range.
- For transform_exclusive_scan and exclusive_scan to run in-place (that is, with the same data
used for both input and destination) with an execution policy of unseq or par_unseq,
the provided input and destination iterators must be equality comparable.
Furthermore, the equality comparison of the input and destination iterators must evaluate to true.
If these conditions are not met, the result of these algorithm calls is undefined.
- The sort, stable_sort, sort_by_key, and partial_sort_copy algorithms may work incorrectly or cause
a segmentation fault when used with a DPC++ execution policy for a CPU device and built
on Linux with the Intel® oneAPI DPC++/C++ Compiler and the -O0 -g compiler options.
To avoid the issue, pass the -fsycl-device-code-split=per_kernel option to the compiler.
- Incorrect results may be produced by exclusive_scan, inclusive_scan, transform_exclusive_scan,
transform_inclusive_scan, exclusive_scan_by_segment, inclusive_scan_by_segment, and reduce_by_segment
with the unseq or par_unseq policy when compiled by the Intel® oneAPI DPC++/C++ Compiler
with the -fiopenmp, -fiopenmp-simd, -qopenmp, or -qopenmp-simd options on Linux.
To avoid the issue, pass the -fopenmp or -fopenmp-simd option instead.
- Incorrect results may be produced by reduce and transform_reduce with 64-bit types and std::multiplies or
sycl::multiplies operations when compiled by Intel® C++ Compiler 2021.3 and newer and executed on GPU devices.
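The set_intersection item above depends on which input range the output elements come from. This only matters when equivalent elements are distinguishable. The following sketch uses the serial std::set_intersection from the C++ standard library, not oneDPL, to show the expected behavior: with a comparator that looks at keys only, the output elements are taken from the first range. The Item alias and the intersect_by_key helper are hypothetical names introduced for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <utility>
#include <vector>

// (key, tag) pair; the tag records which input range the element came from.
using Item = std::pair<int, char>;

// Hypothetical helper: intersects two key-sorted ranges, comparing keys only.
std::vector<Item> intersect_by_key(const std::vector<Item>& first,
                                   const std::vector<Item>& second) {
    auto by_key = [](const Item& a, const Item& b) { return a.first < b.first; };
    // std::set_intersection requires both inputs to be sorted by the comparator.
    assert(std::is_sorted(first.begin(), first.end(), by_key));
    assert(std::is_sorted(second.begin(), second.end(), by_key));

    std::vector<Item> out;
    std::set_intersection(first.begin(), first.end(),
                          second.begin(), second.end(),
                          std::back_inserter(out), by_key);
    return out;
}
```

Given a first range tagged 'a' and a second range tagged 'b' that share the keys 2 and 4, every element of the result carries the tag 'a'. The known issue describes the case where a DPC++ execution policy yields the 'b' elements instead.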