Provide detection for SIMD features in autoconf and at runtime #125022

Open
@picnixz

Feature or enhancement

Proposal:

In #124951, there has been some initial discussion on improving the performance of base64 and possibly {bytearray,bytes,str}.translate using SIMD instructions.

More generally, if we want to use specific SIMD instructions, it would be good to at least know whether the processor supports them. Note that we already use SIMD in blake2 when possible. As such, I suggest an internal framework for detecting SIMD features at runtime, usable by other parts of the library, as well as build-time detection of compiler flag support.
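As a rough illustration of what runtime detection could look like, here is a minimal sketch that assumes GCC or Clang on x86, where the `__builtin_cpu_init` and `__builtin_cpu_supports` builtins are available. The `simd_features` struct and `detect_simd_features` function are hypothetical names made up for this example, not an existing CPython API, and other compilers or architectures would need their own detection path (here they simply report no features).

```c
#include <stdbool.h>

/* Hypothetical feature set, for illustration only. */
typedef struct {
    bool sse2;
    bool sse4_1;
    bool avx;
    bool avx2;
} simd_features;

/* Query the CPU once and report which SIMD extensions are usable.
   On unsupported compilers or architectures every flag stays false,
   so callers fall back to scalar code. */
static simd_features
detect_simd_features(void)
{
    simd_features f = {false, false, false, false};
#if (defined(__GNUC__) || defined(__clang__)) \
    && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    f.sse2 = __builtin_cpu_supports("sse2") != 0;
    f.sse4_1 = __builtin_cpu_supports("sse4.1") != 0;
    f.avx = __builtin_cpu_supports("avx") != 0;
    f.avx2 = __builtin_cpu_supports("avx2") != 0;
#endif
    return f;
}
```

A caller would typically run the detection once at startup, cache the result, and branch on the cached flags in hot paths.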

Note that a single part of the code could benefit from a few SIMD calls without having to compile the entire library for the full SIMD-128 or SIMD-256 instruction sets. Having a way to detect SIMD support should probably be independent of whether we end up using SIMD outside the blake2 module, since detection can only benefit the standard library if we later decide to use those instructions.
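To make the "single part of the code" idea concrete, here is a hedged sketch, again assuming GCC or Clang on x86-64. Only one function is compiled with AVX2 enabled (through the `target` function attribute), and a small dispatcher picks it at runtime when the CPU supports it. The byte-counting kernel and all function names are hypothetical and chosen purely for illustration; they are not taken from base64, blake2, or any existing CPython code.

```c
#include <stddef.h>
#include <stdint.h>

#if (defined(__GNUC__) || defined(__clang__)) && defined(__x86_64__)
#  include <immintrin.h>
#  define HAVE_AVX2_KERNEL 1

/* Only this function is built with AVX2 enabled; the rest of the
   file keeps the baseline instruction set. */
__attribute__((target("avx2")))
static size_t
count_byte_avx2(const uint8_t *buf, size_t len, uint8_t needle)
{
    const __m256i pattern = _mm256_set1_epi8((char)needle);
    size_t count = 0, i = 0;
    /* Compare 32 bytes at a time and count the matching lanes. */
    for (; i + 32 <= len; i += 32) {
        __m256i chunk = _mm256_loadu_si256((const __m256i *)(buf + i));
        __m256i eq = _mm256_cmpeq_epi8(chunk, pattern);
        unsigned mask = (unsigned)_mm256_movemask_epi8(eq);
        count += (size_t)__builtin_popcount(mask);
    }
    /* Scalar tail for the remaining bytes. */
    for (; i < len; i++) {
        count += (buf[i] == needle);
    }
    return count;
}
#endif

/* Portable fallback used when AVX2 is unavailable. */
static size_t
count_byte_scalar(const uint8_t *buf, size_t len, uint8_t needle)
{
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        count += (buf[i] == needle);
    }
    return count;
}

/* Runtime dispatch: use the AVX2 kernel only if the CPU supports it. */
static size_t
count_byte(const uint8_t *buf, size_t len, uint8_t needle)
{
#ifdef HAVE_AVX2_KERNEL
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) {
        return count_byte_avx2(buf, len, needle);
    }
#endif
    return count_byte_scalar(buf, len, needle);
}
```

The design choice here is that the binary as a whole still runs on CPUs without AVX2, because the AVX2 instructions are confined to one function that is only reached after the runtime check succeeds.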


The blake2 module's SIMD support is fairly... complicated, due to the wide variety of platforms that need to be supported and the mixture of many SIMD instructions. So I don't think I want to touch that part and make it work under the new interface (at least, not for now). While I'm confident in detecting features on "widely used" systems, there are definitely systems that I don't know, so I'd appreciate any help on this topic.

Has this already been discussed elsewhere?

I don't want to open a Discourse thread for now since it's mainly something that will be used internally and not to be exposed to the world.

Links to previous discussion of this feature:

There has already been some discussion on Discourse about SIMD in general and whether to include it (e.g., https://discuss.python.org/t/standard-library-support-for-simd/35138), but the number of results containing "SIMD" or "AVX" is very small. Either this is because the topic is too advanced (detecting CPU features is NOT fun and there is a lack of documentation, the best source being the Wikipedia page) or because the feature request is too broad.

Labels: build, interpreter-core, performance, type-feature

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions