Security Risks in Deep Learning Implementations

1 Qihoo 360 Security Research Lab   2 University of Georgia   3 University of Virginia
Abstract—Advances in deep learning algorithms overshadow their security risk in software implementations. This paper discloses a set of vulnerabilities in popular deep learning frameworks including Caffe, TensorFlow, and Torch. In contrast to the small code size of deep learning models, these deep learning frameworks are complex software systems built over numerous third-party packages.

We made a preliminary study on the threats and risks caused by these vulnerabilities. With a wide variety of deep learning applications being built over these frameworks, we consider a range of attack surfaces including malformed data in application inputs, training data, and models. The potential risks include denial-of-service, evasion of detection, and system compromise.
This passionate adoption of new machine learning algorithms has sparked the development of multiple deep learning frameworks, such as Caffe [3], TensorFlow [1], and Torch [6]. These frameworks enable fast development of deep learning applications. A framework provides common building blocks for the layers of a neural network. By using these frameworks, developers can focus on model design and application-specific logic without worrying about the coding details of input parsing, matrix multiplication, or GPU optimizations.

Deep learning frameworks enable fast development of machine learning applications. Equipped with pre-implemented neural network layers, deep learning frameworks allow developers to focus on application logic. Developers can design, build, and train scenario-specific models on a deep learning framework without worrying about the coding details of input parsing, matrix multiplication, or GPU optimizations.
[Figure: the layered structure of deep learning systems: DL Apps (program logic, model, data) are built over DL Frameworks (TensorFlow, Caffe, Torch, Theano), which in turn rely on framework dependencies such as GNU LibC and NumPy.]

In this paper, we examine the implementation of three popular deep learning frameworks: Caffe, TensorFlow, and Torch, and we collected their software dependencies based on the sample applications released along with each framework. The implementations of these frameworks are complex (often hundreds of thousands of lines of code) and are often built over numerous third-party software packages, such as image and video processing and scientific computation libraries.
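As a rough illustration of how such dependency lists can be gathered, the sketch below queries the requirements that an installed framework package declares. This is our own illustration rather than the methodology used for the study: it assumes Python 3.8+ with the packages installed locally, and it only covers Python-level dependencies, not native libraries (such as OpenCV or libjpeg) pulled in through C extensions.

# Sketch: list the declared Python-level dependencies of installed frameworks.
# Assumption: Python 3.8+ and the packages are installed locally.
from importlib.metadata import requires, PackageNotFoundError

for pkg in ("tensorflow", "torch"):
    try:
        deps = requires(pkg) or []
        print(f"{pkg}: {len(deps)} declared requirements")
        for dep in deps:
            print("   ", dep)
    except PackageNotFoundError:
        print(f"{pkg} is not installed")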
• Threat 1 – Denial-of-Service: The most direct consequences caused by such bugs are denial-of-service attacks against applications running on top of the framework. The list below shows the patch to a bug found in the numpy python package, which is a building block for the TensorFlow framework. The numpy package is used for matrix multiplication and related processing, and it is commonly used by applications built over TensorFlow. The particular bug occurs in the pad() function, which contains a while loop that would not terminate for inputs not anticipated by the developers. The flaw occurs because the variable safe_pad in the loop condition is set to a negative value when an empty vector is passed from a caller. Because of this bug, we showed that popular sample TensorFlow applications, such as the Urban Sound Classification [7], will hang with specially crafted sound files.

Listing 1: numpy patch example
--- a/numpy/lib/arraypad.py
+++ b/numpy/lib/arraypad.py
@@ -1406,7 +1406,10 @@ def pad(array, pad_width, mode, **kwargs):
         newmat = _append_min(newmat, pad_after, chunk_after, axis)

     elif mode == 'reflect':
-        for axis, (pad_before, pad_after) in enumerate(pad_width):
+        if narray.size == 0:
+            raise ValueError("There aren't any elements to reflect in 'array'!")
+
+        for axis, (pad_before, pad_after) in enumerate(pad_width):
    ... ...
            method = kwargs['reflect_type']
            safe_pad = newmat.shape[axis] - 1
            while ((pad_before > safe_pad) or (pad_after > safe_pad)):
    ... ...
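For reference, the triggering condition for the pad() flaw can be reproduced with a one-line call. This is a minimal sketch assuming an unpatched numpy release; on patched versions the same call raises a ValueError instead of hanging.

# Sketch: trigger condition for the non-terminating loop in numpy's pad().
# On unpatched numpy releases this call hangs; patched releases raise ValueError.
import numpy as np

try:
    np.pad(np.array([]), 1, mode='reflect')
except ValueError as err:
    print("patched numpy rejects the empty input:", err)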
Listing 2: OpenCV patch example
+static Size validateInputImageSize(const Size& size)
+{
+    CV_Assert(size.width > 0);
+    CV_Assert(size.width <= CV_IO_MAX_IMAGE_WIDTH);
+    CV_Assert(size.height > 0);
+    CV_Assert(size.height <= CV_IO_MAX_IMAGE_HEIGHT);
+    uint64 pixels = (uint64)size.width * (uint64)size.height;
+    CV_Assert(pixels <= CV_IO_MAX_IMAGE_PIXELS);
+    return size;
+}

@@ -408,14 +426,26 @@ imread_( const String& filename, int flags, int hdrtype, Mat* mat=0 )
     // established the required input image size
-    CvSize size;
-    size.width = decoder->width();
-    size.height = decoder->height();
+    Size size = validateInputImageSize(Size(decoder->width(), decoder->height()));
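As an application-level analogue of the validateInputImageSize() check in Listing 2, the sketch below rejects oversized images before they reach the decoding pipeline. It assumes Pillow is available, and MAX_PIXELS is an arbitrary example bound of ours rather than a constant taken from OpenCV.

# Sketch: application-level size check before decoding, analogous to Listing 2.
# Assumption: Pillow is installed; MAX_PIXELS is an example bound, not OpenCV's.
from PIL import Image

MAX_PIXELS = 1 << 26  # about 67 million pixels

def check_image_size(path: str) -> None:
    with Image.open(path) as im:       # parses the header only, no pixel decode
        width, height = im.size
        if width <= 0 or height <= 0 or width * height > MAX_PIXELS:
            raise ValueError(f"refusing to decode {width}x{height} image: {path}")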
• Threat 2 – Evasion attacks: Evasion attacks occur when an attacker can construct inputs that should be classified as one category but are misclassified by deep learning applications as a different category. Machine learning evasion has been studied in prior work [5], [10].

• Threat 3 – System Compromise: For software bugs that allow an attacker to hijack control flow, attackers can potentially leverage the software bug and remotely compromise the system that hosts deep learning applications. This occurs when deep learning applications run as a cloud service taking input feeds from the network.

The list below shows a patch to a simple buffer overflow found in the OpenCV library. The OpenCV library is a computer vision library designed for computational efficiency and with a strong focus on real-time applications. OpenCV supports deep learning frameworks such as TensorFlow, Torch/PyTorch, and Caffe. The buffer overflow occurs in the readHeader function in grfmt_bmp.cpp. The variable m_palette represents a buffer whose size is 256*4 bytes; however, the value of clrused comes from an input image and can be set to an arbitrary value by attackers. Therefore, a malformed BMP image could result in a buffer overflow from the getBytes() call. Through our investigation, this vulnerability provides the ability to make arbitrary memory writes, and we have successfully forced sample programs (such as cpp_classification [2] in Caffe) to spawn a remote shell based on our crafted image input.
Listing 3: OpenCV patch example
index 86cacd3..257f97c 100644
--- a/modules/imgcodecs/src/grfmt_bmp.cpp
+++ b/modules/imgcodecs/src/grfmt_bmp.cpp
@@ -118,8 +118,9 @@ bool BmpDecoder::readHeader()
             if( m_bpp <= 8 )
             {
-                memset( m_palette, 0, sizeof(m_palette));
-                m_strm.getBytes( m_palette, (clrused == 0? 1<<m_bpp : clrused)*4 );
+                CV_Assert(clrused < 256);
+                memset(m_palette, 0, sizeof(m_palette));
+                m_strm.getBytes(m_palette, (clrused == 0? 1<<m_bpp : clrused)*4 );
                 iscolor = IsColorPalette( m_palette, m_bpp );
             }
             else if( m_bpp == 16 && m_rle_code == BMP_BITFIELDS )
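To exercise the patched check in Listing 3, a regression-test input only needs a BMP header whose color-table count (the source of clrused) exceeds the 256-entry palette buffer. The sketch below builds such a header with an oversized biClrUsed field; the helper name and field values are our own choices, and the deliberately truncated file is meant for testing a patched decoder, not as a working exploit.

# Sketch: minimal BMP header with an oversized biClrUsed field, for regression
# testing a patched BmpDecoder::readHeader(). Field values are illustrative.
import struct

def oversized_clrused_bmp(width=2, height=2, clr_used=0x40000000):
    info = struct.pack('<IiiHHIIiiII',
                       40, width, height,  # biSize, biWidth, biHeight
                       1, 8,               # biPlanes, biBitCount (palettised)
                       0, 0, 0, 0,         # biCompression, biSizeImage, X/YPelsPerMeter
                       clr_used, 0)        # biClrUsed (oversized), biClrImportant
    offset = 14 + 40                       # file header + info header, no palette
    header = struct.pack('<2sIHHI', b'BM', offset + width * height, 0, 0, offset)
    return header + info                   # palette and pixel data intentionally omitted

with open('oversized_clrused.bmp', 'wb') as f:
    f.write(oversized_clrused_bmp())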
While doing this work, we found another group of researchers [8] who have also studied the vulnerabilities and impact of OpenCV on machine learning applications. Although their idea of exploring OpenCV for system compromise shares a similar goal with our effort, they did not find or release vulnerabilities that are confirmed by OpenCV developers [4]. In contrast, our findings have been confirmed by the corresponding developers, and many of them have been patched based on our suggestions. In addition, we have also developed proof-of-concept exploitation that has successfully demonstrated remote system compromise (by remotely gaining a shell) through the vulnerabilities we found.

IV. DISCUSSION AND FUTURE WORK

The previous section presents software vulnerabilities in the implementations of deep learning frameworks. These vulnerabilities are only one set of factors that affect overall application security. There are multiple other factors to consider, such as where an application takes its input from and whether training data are well formatted, that also affect the security risks. We briefly discuss a few related issues here.
A. Security Risks for Applications in Closed Environments

Many sample deep learning applications are designed to be used in a closed environment, in which the application acquires input directly from sensors closely coupled with the application. For example, the machine learning implementation running on a camera only takes data output by the built-in camera sensor. Arguably, the risk of malformed input is lower than for an application that takes input from the network or from files controlled by users. However, a closely coupled sensor does not eliminate the threat of malformed input. For example, there are risks associated with sensor integrity, which can be compromised. If the sensor communicates with a cloud server where the deep learning applications run, attackers could reverse the communication protocol and directly attack the backend.
B. Detect Vulnerabilities in Deep Learning Applications

We applied traditional bug finding methods, especially fuzzing, to find the software vulnerabilities presented in this paper. We expect all conventional static and dynamic analysis methods to apply to deep learning framework implementations. However, we found that coverage-based fuzzing tools are not ideal for testing deep learning applications, especially for discovering errors in the execution of models. Taking the MNIST image classifier [11] as an example, almost all images cover the same execution path because all inputs go through the same layers of calculation. Therefore, simple errors such as divide-by-zero would not be easily found by coverage-based fuzzers, since the path coverage feedback is less effective in this case.
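To make this limitation concrete, the sketch below shows the kind of mutation-based harness we have in mind around an image-decoding entry point. It is our own illustration rather than the exact tooling used for this study: it assumes opencv-python on a Unix-like system (signal.alarm), uses a hypothetical seed.jpg, and notices hangs through a timeout rather than coverage feedback; a production fuzzer would also isolate each run in a subprocess to survive native crashes.

# Sketch: mutation-based harness around cv2.imdecode with a hang watchdog.
# Assumptions: opencv-python installed, Unix-like OS, a local seed.jpg file.
import random
import signal
import numpy as np
import cv2

def _on_alarm(signum, frame):
    raise TimeoutError("decoder appears to hang")

signal.signal(signal.SIGALRM, _on_alarm)

def mutate(data: bytes, n_flips: int = 8) -> bytes:
    buf = bytearray(data)
    for _ in range(n_flips):
        buf[random.randrange(len(buf))] ^= 1 << random.randrange(8)
    return bytes(buf)

def run_once(data: bytes, timeout: int = 5) -> None:
    signal.alarm(timeout)                  # hangs are reported by the alarm,
    try:                                   # not by coverage feedback
        cv2.imdecode(np.frombuffer(data, dtype=np.uint8), cv2.IMREAD_COLOR)
    except cv2.error:
        pass                               # malformed input rejected cleanly
    finally:
        signal.alarm(0)

seed = open('seed.jpg', 'rb').read()
for i in range(10000):
    try:
        run_once(mutate(seed))
    except TimeoutError as err:
        print(f"input {i} hangs the decoder: {err}")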
C. Security Risks due to Logical Errors or Data Manipulation

Our preliminary work focused on the “conventional” software vulnerabilities that lead to program crashes, control flow hijacking, or denial-of-service. It is interesting to consider whether there are types of bugs that are specific to deep learning and need special detection methods. Evasion attacks or data poisoning attacks do not have to rely on conventional software flaws such as buffer overflows. It is enough to create an evasion if there are mistakes allowing training or classification to use more data than what an application is supposed to have. The mismatch of data consumption can be caused by a small inconsistency in data parsing between the framework implementation and conventional desktop software.

One additional challenge for detecting logical errors in deep learning applications is the difficulty of differentiating insufficient training from intended manipulation, which aims to have a particular group of inputs misclassified. We plan to investigate methods to detect such types of errors.
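One lightweight way to surface such parsing inconsistencies is differential testing: decode the same file with two independent parsers and flag any disagreement. The sketch below is our own illustration, not a method evaluated in this paper; it assumes Pillow and opencv-python are installed and uses the reported image dimensions as the observable signal.

# Sketch: differential decoding check between OpenCV and Pillow.
# Assumptions: opencv-python and Pillow installed; dimensions as the signal.
import sys
import numpy as np
import cv2
from PIL import Image

def parsers_disagree(path: str) -> bool:
    cv_img = cv2.imdecode(np.fromfile(path, dtype=np.uint8), cv2.IMREAD_COLOR)
    try:
        with Image.open(path) as pil_img:
            pil_size = (pil_img.height, pil_img.width)
    except Exception:
        pil_size = None
    if cv_img is None or pil_size is None:
        return (cv_img is None) != (pil_size is None)   # one accepted, one rejected
    return cv_img.shape[:2] != pil_size

if __name__ == '__main__':
    for path in sys.argv[1:]:
        if parsers_disagree(path):
            print("parser mismatch:", path)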
V. CONCLUSION

The purpose of this work is to raise awareness of the security threats caused by software implementation mistakes. Deep learning frameworks are complex software, and thus it is almost unavoidable for them to contain implementation bugs. This paper presents an overview of the implementation vulnerabilities and the corresponding risks in popular deep learning frameworks. We discovered multiple vulnerabilities in popular deep learning frameworks and the libraries they use. The types of potential risks include denial-of-service, evasion of detection, and system compromise. Although closed applications are less risky in terms of their control of the input, they are not completely immune to these attacks. Considering the opaque nature of deep learning applications, which buries the implicit logic in their training data, the security risks caused by implementation flaws can be difficult to detect. We hope our preliminary results in this paper can remind researchers not to forget conventional threats and to actively look for ways to detect flaws in the software implementations of deep learning applications.

REFERENCES

[1] Gardener and Benoitsteiner, “An open-source software library for Machine Intelligence,” https://www.tensorflow.org/, 2017.
[2] Y. Jia, “Classifying ImageNet: using the C++ API,” https://github.com/BVLC/caffe/tree/master/examples/cpp_classification, 2017.
[3] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, “Caffe: Convolutional architecture for fast feature embedding,” arXiv preprint arXiv:1408.5093, 2014.
[4] OpenCV Developers, “OpenCV issue 5956,” https://github.com/opencv/opencv/issues/5956, 2017, accessed 2017-09-03.
[5] N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami, “Practical black-box attacks against machine learning,” in Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, ser. ASIA CCS ’17. New York, NY, USA: ACM, 2017, pp. 506–519.
[6] Ronan, Clément, Koray, and Soumith, “Torch: A Scientific Computing Framework for LuaJIT,” http://torch.ch/, 2017.
[7] A. Saeed, “Urban Sound Classification,” https://devhub.io/zh/repos/aqibsaeed-Urban-Sound-Classification, 2017.
[8] R. Stevens, O. Suciu, A. Ruef, S. Hong, M. W. Hicks, and T. Dumitras, “Summoning demons: The pursuit of exploitable bugs in machine learning,” CoRR, vol. abs/1701.04739, 2017. [Online]. Available: http://arxiv.org/abs/1701.04739
[9] VisionLabs, “OpenCV bindings for LuaJIT+Torch,” https://github.com/VisionLabs/torch-opencv, 2017.
[10] W. Xu, Y. Qi, and D. Evans, “Automatically evading classifiers,” in Network and Distributed System Security Symposium, 2016.
[11] Y. LeCun, C. Cortes, and C. J. Burges, “The MNIST Database of handwritten digits,” http://yann.lecun.com/exdb/mnist/, 2017.