-
Notifications
You must be signed in to change notification settings - Fork 2
Exactly-rounded summation of floating point values
License
radfordneal/xsum
Folders and files
| Name | Name | Last commit message | Last commit date | |
|---|---|---|---|---|
Repository files navigation
ROUTINES AND PROGRAMS FOR EXACT SUMMATION. Written by Radford M. Neal. Source repository is at gitlab.com/radfordneal/xsum See NEWS for information on release history. This software allows exact summation of double-precision (FP64) floating-point values, storing the sum in a "superaccumulator", with the final sum then being rounded to the closest double-precision representation of the sum. Specialized versions of the summation functions for computing vector norms and dot products are also provided. Functions for finding the exact rouding of a superaccumulator divided by an integer are also provided. These methods (except the division functions) are described in the paper "Fast Exact Summation Using Small and Large Superaccumulators", by Radford M. Neal, available at https://arxiv.org/abs/1505.05571 (and also here as xsum-paper.pdf). The software here and can be used to reproduce the results therein. The routines and programs are licensed under the "MIT" license, which is in the file "LICENSE". See the 'api-doc' file for documentation on how to use the routines in an application program. Various test and timing programs can be built using one of the several Makefiles included. For instance, the following should work on Intel/AMD Linux systems with gcc installed: make -f Makefile-gcc-intel See the documentation at the start of the Makefile-xxx files for more details. For some systems, another Makefile may need to be created. Performance on a particular system may be improved by tuning compiler options, and perhaps the settings of the flags defined at the beginning of the xsum.c file. A Makefile-generic is supplied for using a generic 'cc' compiler on any system, but may not give the best performance. It may be best to make the programs using the "compile" script described below, rather than use "make" directly. The "fuzz" target for make will make a fuzzer program for testing, written by Kevin Gibbons. See comments in fuzz.c for how to use it. The xsum-check program does automatic correctness checks. The xsum-test program can be used to test the functions manually. The xsum-time program does timing tests. See comments in the source files for documentation on how to use these programs. The xsum-time-zhu and xsum-time-perm-zhu programs do timing tests that include times for the iFastSum and OnlineExact methods of Zhu and Hayes (ACM Transactions on Mathematical Software, Algorithm 908). Making this program requires their ExactSum.cpp and ExactSum.h source files, from the supplemental information for their paper (found at https://dl.acm.org/doi/10.1145/1824801.1824815). A C++ compiler is also needed. If you have all this, you can compile them (and the other programs) using the with-zhu make target. The programs can be remade by using the "compile" script, which runs make -B to recompile them with the makefile with a specified suffix. It also stores information on the git commit, diffs from that commit, and how they were compiled in Make-info. The "compile-zhu" script does the same with the Zhu and Hayes programs also compiled. The checks and timing tests reported in the paper can be run with the run-tests-zhu script (except that the tests of sizes 10 and 100 were done with fewer repetitions on the Raspberry Pi, to avoid very long runs). The run-tests script does the same but without the Zhu and Hayes methods. The run-checks script runs only the correctness checks. The output includes Make-info, produced by "compile" or "compile-zhu", so that it will be clear what version was tested. In the directory paper-results, the systems on which tests were done are described in the files with names of the form system-xxx, and the results are in corresponding files named results-xxx-ccc, where ccc is the compiler used. The plots in the paper are also there, produced by the plots.r script. The directories newer-results and newer-results2 contain results of more recent timing tests, similarly to paper-results. Runs were made with several compilers, with the best for vectors of size 1000 then being chosen (separately for each method). The plots show the best choice in dark colour and the others in light colour.
About
Exactly-rounded summation of floating point values
Resources
License
Stars
Watchers
Forks
Packages 0
No packages published