Work on VecGeom Benchmark Program
Kyungdon Choi
INTERNSHIP REPORT
June 2015
Index
Abstract
1. Introduction
2. GEANT-4 Applications
A. High Energy physics
B. Space and radiation
C. Medical
D. Others
6. Conclusion
Abstract
1. Introduction
1 GEANT-V official webpage http://geant.cern.ch/content/about-geant5
R, Cilk Plus, and NumPy (a Python module) are examples of languages and
libraries that support array programming. Single instruction, multiple data
(SIMD) capabilities, available on Intel processors since MMX, allow one
instruction to operate on several array elements at once, which is how
vectorized parameters are computed.
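As a small illustration of the array-programming style described above, the following C++ sketch writes the same computation once as an element-by-element loop and once as a whole-array expression. The function names and the use of `std::valarray` are choices made for this example only; they are not taken from GEANT-V or VecGeom, and whether SIMD instructions are actually emitted depends on the compiler and its flags.

```cpp
#include <cstddef>
#include <valarray>

// Scalar style: one element is handled per loop iteration.
std::valarray<double> scaleAndAddLoop(const std::valarray<double>& x,
                                      const std::valarray<double>& y,
                                      double a) {
  std::valarray<double> out(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    out[i] = a * x[i] + y[i];
  }
  return out;
}

// Array style: the operation is expressed on whole arrays at once.
// x and y are assumed to have the same length.
std::valarray<double> scaleAndAdd(const std::valarray<double>& x,
                                  const std::valarray<double>& y,
                                  double a) {
  return a * x + y;
}
```

Both functions return the same values; the second form leaves the iteration to the library and compiler, which is the property that vectorized code relies on.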
CUDA is the GPGPU technology used for GEANT-V. This technology was
developed by NVIDIA 2 and can be used in various fields. The CUDA library lets
code written in C, C++ or Fortran be sent directly to the GPU and executed
there, without using assembly language. As a GPU has more calculation units
than a CPU, it is expected to process arrays and matrices faster than a CPU.
This means that a GPU can be more powerful for array or matrix (vectorized)
variables.
Although GEANT-V is designed for fast computing, its speed has to be verified
with a benchmark program. For this reason I worked on a benchmark program for
VecGeom, which is a geometry library for GEANT-V. The purpose of the benchmark
program is to find out whether GEANT-V is significantly faster than GEANT-4.
With this benchmark, the GEANT-V team was able to draw an image by applying the
same algorithm to ROOT6, GEANT-4, and VecGeom and to compare the results.
Cylindrical and spherical mapping, added to the benchmark program recently,
produce segmentation faults with ROOT6, especially in the DistToCone function
of TGeoCone, but they work fine with GEANT-4. The team verified this problem
using Valgrind and the GNU Debugger (gdb). Reproducible debugging programs will
be developed at the end of June 2015.
2. GEANT-4 Applications
2 http://www.nvidia.com/object/cuda_home_new.html
GEANT-4 is a great tool for simulating particles passing through matter, and it
can be used in various fields. It was first designed for HEP, so it simulates
detectors very well. GEANT-4 can also be used for medical, biological (e.g.,
radiation effects on DNA 3), space and radiation simulation.
3 The GEANT4-DNA project http://geant4-dna.org/
4 GEANT4 HEP application http://geant4.web.cern.ch/geant4/applications/hepapp.shtml
5 ArgoNeuT webpage http://t962.fnal.gov/
6 Calorimeter R&D webpage http://drcalorimetry.fnal.gov/
7 MINERvA webpage http://nusoft.fnal.gov/minerva/minervadat/software_doxygen/HEAD/MINERVA/index.html
VI. Rare Isotope Science Project (RISP) 8 : RISP, which will be built in
Daejeon, South Korea, is using GEANT-4 for its beamline in the RF/IF
team.
The European Space Agency (ESA) is the heaviest user of GEANT-4 in the space
field. Its projects that use GEANT-4 are XMM-Newton Radiation Environment, Space
Environment Information System (SPENVIS), Dose Estimation by Simulation of the
ISS Radiation Environment (DESIRE) and Physics Models for Biological Effects of
Radiation and Shielding. Other projects, such as the Gamma Ray Large Area
Space Telescope (GLAST), are also simulated with GEANT-4.
Space is full of radiation that can damage humans, computer cores,
welding points, etc. GEANT-4 is suitable software for simulating such damage.
C. Medical fields
8 RISP main webpage http://www.risp.re.kr/eng/pMainPage.do
The GEANT4-DNA project is a very interesting project in the medical field. It
models early biological damage induced by ionizing radiation at the DNA scale.
The goal of the Geant-4 based Architecture for Medicine-Oriented Simulation
(GAMOS) project is to carry out GEANT-4 based simulation without C++ coding.
GATE is a simulation tool based on GEANT-4 that supports positron emission
tomography (PET), single photon emission computed tomography (SPECT),
computed tomography (CT) and radiotherapy experiments.
D. Other applications
After the Fukushima reactor accident, alternative energy sources have become a
hot research topic. One of the proposed solutions is the thorium reactor.
Thorium reactors have some benefits compared to uranium- or plutonium-based
reactors. The expected nuclear waste is less than 1/1000 of that of ordinary
reactors, and the waste cannot be used to make nuclear weapons. Also, thorium
is not a material that sustains a chain reaction, so if there is an accident
like the Fukushima reactor accident, a thorium reactor will automatically turn
off instead of melting down. GEANT-4 is used in this field to simulate the
life cycle of thorium 9.
9 GEANT4 Studies of the Thorium Fuel Cycle, Proceedings of 2011 Particle Accelerator Conference, New York, NY, USA
clock and power stay still. This is because, as time passes, transistors get
smaller, so more of them can be placed in a limited area. However, the amount
of heat they produce depends on the amount of power the transistors consume.
So unless the power usage of transistors is reduced or a more efficient cooling
method is found, the clock cannot be increased. This is why it is hard to
expect clocks above 4 GHz for the x86 architecture. The relation between CPU
power and input voltage is as follows 10:
$$P_{dynamic} = \alpha C V_{DD}^{2} f A$$
$P_{dynamic}$ is the total power applied to the CPU, $C$ and $A$ are transistor
density factors, $f$ is the operating frequency, and $V_{DD}$ is the input
voltage from the power supply or motherboard. If a processor has an extremely
low input voltage, it can be run at an extremely high frequency. However, from
the relation above, it can be concluded that there is a threshold beyond which
the frequency cannot be increased, because the input power is limited. This
explains why CPUs do not have higher operating frequencies despite gaining
more cores.
10 Power Consumption Analysis http://forums.anandtech.com/showthread.php?t=2281195
threshold. GEANT-V, which is designed to handle vectorized parameters, is
expected to break through the limits of GEANT-4 and to benefit the various
fields mentioned in chapter 2.
4. Important software
CUDA works with the following processing flow. 11 First, the CUDA code copies
data from main memory to GPU memory. Then the CPU instructs the GPU to start
processing. After receiving this instruction, the GPU executes the work on its
cores in parallel. Finally, the results are copied from GPU memory back to main
memory. However, this does not work with all kinds of GPUs; only selected
NVIDIA GPUs support CUDA. 12
Advantages
I. Scattered reads : code can read from arbitrary addresses in
memory
II. Shared memory : CUDA exposes a fast shared memory region
that can be shared amongst threads. This can be used as a user-
managed cache, enabling higher bandwidth than is possible using
texture lookups.
III. Faster downloads and readbacks to and from the GPU
IV. Full support for integer and bitwise operations, including integer texture
lookups
11 CUDA Overview by Cliff Woolley, NVIDIA, page 5 to 7
12 CUDA developer page https://developer.nvidia.com/cuda-gpus
Disadvantages
I. CUDA does not support the full C standard.
II. Copying between host and device memory may incur a
performance hit due to limitations in system bus bandwidth and
latency
III. Unlike OpenCL, CUDA is only available on NVIDIA GPUs, and only
on some of the NVIDIA GPU lineups
IV. CUDA (with compute capability 2.x) allows a subset of C++ class
functionality. For example, member functions may not be virtual. 13
V. Valid C/C++ may sometimes be flagged and prevent compilation
due to the optimization techniques the compiler is required to
B. Jenkins
13 CUDA C Programming Guide 3.1 – Appendix D.6
14 CUDA application list http://www.geforce.com/games-applications/pc-applications/setihome
15 Generating SU(Nc) pure gauge lattice QCD configurations on GPUs with CUDA
16 Jenkins Official page http://jenkins-ci.org/
17 Open Source MIT License http://opensource.org/licenses/MIT
The GEANT-V team used Jenkins as a testing platform, mostly for nightly tests,
because computing resources are limited during the daytime, when people use
the computing power for the development of GEANT-V and VecGeom.
C. JIRA
JIRA is an agile software development tool which provides bug tracking, issue
tracking and project management functions. It is written in Java and is
therefore platform-independent software. It integrates well with source control
programs such as CVS, Git, Clearcase, etc. CERN officially provides JIRA, and
the GEANT-V team uses JIRA as a project management tool to control the project
efficiently.
VecGeom is the new geometry library that will be used for GEANT-V. This
library is built around fully vectorized parameters, but it has not yet been
proven to be faster than code using scalar parameters. For this reason,
building and maintaining the benchmark program is important for developing
VecGeom.
To compare runtime with existing simulators, ROOT6, GEANT-4 and GEANT-V are the
obvious choices, but other software is required too 18. The GCC version should
be 4.8.x or higher (the current version is 5.1.1). It is advisable to use
Devtoolset 2.0 or higher, which is provided on the CERN webpage 19. Devtoolset
2.0 provides the following tools:
These are the main programs used to write the benchmark program, though there
are also programs such as binutils, elfutils, dwz, systemtap, oprofile and
eclipse.
18 GEANT-V installation Guide http://geant.cern.ch/content/installation
19 Linux at CERN http://linux.web.cern.ch/linux/devtoolset/
One of the goals of GEANT-V is fast simulation, to the degree that VecGeom
runs faster than other simulation programs such as GEANT-4. To check this, the
benchmark program depends on many other programs. ROOT6 and GEANT-4 are the
main competitors for the runtime comparison. In addition, vectorization
support libraries are required so that compilers can handle vectorized
parameters. The following programs are required to use the benchmark program:
I. ROOT6 : The only usable version of ROOT is ROOT6, tag v6-03-02.
II. GEANT-4 : Use the most recent version of GEANT-4. 20
III. VC : SIMD library for C++. The version should be higher than 2.20
(the current version is 2.24)
IV. VecGeom : Geometry library for GEANT-V
V. PYTHIA8 : The current version of PYTHIA is 8.200, but it is not
supported by ROOT. The version supported by LHC is PYTHIA 8.186.
VI. HepMC : An object-oriented event record written in C++ for high
energy physics Monte Carlo generators. 21
20 GEANT-4 download page http://geant4.web.cern.ch/geant4/support/download.shtml
in 2003.
B. Benchmark program
21 M. Dobbs and J.B. Hansen, Comput. Phys. Commun. 134 (2001) 41.
II. Exception handling for the parameters. At the beginning this
benchmark had 5 parameters, and now there are 6 parameters,
including the scale factor of the image.
III. Read the direction of the X-ray and set the direction and scale of the
output image.
IV. Load each function written in ROOT6, GEANT-4 and VecGeom to
measure the runtime of each function and to draw the section of
the part of the detector.
V. In the end, draw the image of the part and print colors depending on
the X-ray decay length.
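The runtime measurement in step IV can be sketched as a small timing harness. This is a minimal sketch and not the benchmark's actual code: `TimingResult` and `timeBackends` are hypothetical names, and the callables stand in for the ROOT6, GEANT-4 and VecGeom code paths, which are not reproduced here.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Wall-clock time measured for one backend.
struct TimingResult {
  std::string name;
  double seconds;
};

// Runs every callable once, in order, and records how long each one took.
std::vector<TimingResult> timeBackends(
    const std::vector<std::pair<std::string, std::function<void()>>>& backends) {
  std::vector<TimingResult> results;
  for (const auto& backend : backends) {
    const auto start = std::chrono::steady_clock::now();
    backend.second();  // the code path under test
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double> elapsed = stop - start;
    results.push_back({backend.first, elapsed.count()});
  }
  return results;
}
```

`std::chrono::steady_clock` is used because it is monotonic, so the measured interval is not affected by system clock adjustments. In a real comparison each backend would typically be run several times on the same input before the timings are compared.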
Figure 6 Monotone image of Calorimeter in Z axis
benchmark code was changed to use the total mass passed through. Figure 6
shows the monotone image of the calorimeter on the z axis, with the total mass
in each pixel. The color of a pixel is close to white if the pixel has a
heavier mass, and it is close to black if the pixel has zero mass. The
color-mapped images are shown in Figure 7, where the left one is on the z axis
and the right one is on the x axis. However, the colored images do not
describe the structure of this calorimeter clearly.
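The monotone mapping described above (zero mass is black, the heaviest pixel is white) can be sketched as follows. The function names, the 8-bit gray range and the normalization by a maximum mass are assumptions made for this example. The logarithmic variant is likewise only one plausible way to compress a large dynamic range, and it is not claimed to be the formula used for the log-scale images in the report.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Linear mapping: zero mass -> 0 (black), maxMass -> 255 (white).
std::uint8_t grayLinear(double mass, double maxMass) {
  if (maxMass <= 0.0) return 0;
  const double t = std::min(std::max(mass / maxMass, 0.0), 1.0);
  return static_cast<std::uint8_t>(std::lround(255.0 * t));
}

// Logarithmic mapping (assumed form): log(1 + mass) is normalized to [0, 1],
// which brightens pixels with small mass relative to the linear mapping.
std::uint8_t grayLog(double mass, double maxMass) {
  if (maxMass <= 0.0) return 0;
  const double clamped = std::min(std::max(mass, 0.0), maxMass);
  const double t = std::log1p(clamped) / std::log1p(maxMass);
  return static_cast<std::uint8_t>(std::lround(255.0 * t));
}
```

With the linear mapping, a few very heavy pixels push most of the image toward black; the logarithmic form spreads the low-mass region over more gray levels, which is the usual reason for switching to a log scale.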
Figure 8 Log 10 and Log 2 image
After the log scale mapping was done, a new coordinate algorithm was required,
namely spherical mapping. The difference between spherical mapping and
ordinary mapping is the change of direction. The X-ray direction does not
change while running with Cartesian coordinates, but it changes continuously
while running with spherical mapping from the center. The notation is as
follows:
Cylindrical mapping is similar to the spherical mapping code, with one
difference: cylindrical mapping uses 2-dimensional vectors instead of
3-dimensional ones. The code for cylindrical mapping is as follows:
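The original listing is not reproduced here. As a stand-in, the following is a minimal sketch of how ray directions could be generated for the two mappings, assuming the standard spherical-coordinate convention (polar angle theta from the +z axis, azimuth phi from the +x axis) and assuming that the cylindrical case varies the direction only in the x-y plane. `Vec3`, `sphericalDirection`, `cylindricalDirection` and `pixelToAngle` are hypothetical names for this example, not functions of the benchmark.

```cpp
#include <cmath>

struct Vec3 {
  double x, y, z;
};

// Spherical mapping: unit direction from the center for polar angle theta
// (measured from +z) and azimuth phi (measured from +x in the x-y plane).
Vec3 sphericalDirection(double theta, double phi) {
  return {std::sin(theta) * std::cos(phi),
          std::sin(theta) * std::sin(phi),
          std::cos(theta)};
}

// Cylindrical mapping (assumed form): the direction is a 2-dimensional unit
// vector in the x-y plane, fixed by the azimuth alone; z stays zero.
Vec3 cylindricalDirection(double phi) {
  return {std::cos(phi), std::sin(phi), 0.0};
}

// Maps pixel index i of n to an angle in [0, range), sampled at pixel centers.
double pixelToAngle(int i, int n, double range) {
  return (i + 0.5) * range / n;
}
```

In contrast to the Cartesian case, where every pixel shares one fixed direction, here each pixel gets its own direction, for example `sphericalDirection(pixelToAngle(row, rows, pi), pixelToAngle(col, cols, 2 * pi))` with `pi` defined by the caller.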
To check whether this problem occurs only with ROOT6, the mapping code was also
run with GEANT-4. Since the GEANT-4 code test succeeded, the final objective is
to write a reproducible debugging code for the DistToCone function in TGeoCone.
6. Conclusions