
Behavioral Malware Classification using Convolutional Recurrent Neural Networks

Bander Alsulami, Drexel University
Spiros Mancoridis, Drexel University

Abstract

Behavioral malware detection aims to improve on the performance of static signature-based techniques used by anti-virus systems, which are less effective against modern polymorphic and metamorphic malware. Behavioral malware classification aims to go beyond the detection of malware by also identifying a malware's family according to a naming scheme such as the ones used by anti-virus vendors. Behavioral malware classification techniques use run-time features, such as file system or network activities, to capture the behavioral characteristics of running processes. The increasing volume of malware samples, the diversity of malware families, and the variety of naming schemes given to malware samples by anti-virus vendors present challenges to behavioral malware classifiers. We describe a behavioral classifier that uses a Convolutional Recurrent Neural Network and data from Microsoft Windows Prefetch files. We demonstrate the model's improvement on the state-of-the-art using a large dataset of malware families and four major anti-virus vendor naming schemes. The model is effective in classifying malware samples that belong to common and rare malware families and can incrementally accommodate the introduction of new malware samples and families.

978-1-7281-0155-2/18/$31.00 ©2018 IEEE

1 Introduction and Background

Malware classification is the process of assigning a malware sample to a specific malware family. Malware within a family shares similar properties that can be used to create signatures for detection and classification. Signatures can be categorized as static or dynamic based on how they are extracted. A static signature can be based on a byte-code sequence [24], binary assembly instructions [31], or an imported Dynamic Link Library (DLL) [38]. Dynamic signatures can be based on file system activities [18, 40], terminal commands [43], network communications [26, 46], or function and system call sequences [37, 20, 2].

Behavioral signatures have become a useful complement to static signatures, which can be obfuscated easily and automatically [42]. For example, polymorphic and metamorphic malware mutate their appearance and structure without affecting their behavior [41, 14]. Behavioral features capture run-time information such as file system activities, memory allocations, network communications, and system calls during the execution of a program. Such features make behavioral malware classifiers more resilient to static obfuscation methods.

Each anti-virus vendor has a unique labeling format for malware families. The format often includes the target platform (e.g., Windows, Linux), the malware category (e.g., trojan, worm, ransomware), and an arbitrary character that describes the generation. For example, a malware sample that belongs to the ransomware family Cerber is labeled Ransom:Win32/Cerber.a according to the naming scheme of the Microsoft Windows Defender anti-virus system. Such naming schemes are used to simplify the classification of malware samples, track their evolution, and associate their effective counter-responses. The performance of behavioral classification models depends on the ground truth labels assigned by the various anti-virus naming schemes at training. Unfortunately, the naming schemes are inconsistent across anti-virus vendors [35], which complicates the training and evaluation process. This work describes a new malware classification model that performs consistently better than other models described in previous work using various anti-virus ground truth labeling schemes.

This paper presents our contributions to behavioral malware classification using information gathered from Microsoft Windows Prefetch files. We demonstrate that our technique achieves a high classification score on common malware families for a large number of samples. We measure the generalization of our malware classification model on four different anti-virus scan engines. We demonstrate the robustness of our model on rare malware families with small sample sizes. We also evaluate the ability of our model to include the correct malware family in its top predictions. Finally, we present our model's capacity to learn the behavior of newly discovered malware samples and families.

The paper is organized as follows: Section 2 describes previous related work. Section 3 explains Microsoft Windows Prefetch files, which are used as dynamic features in our model. Section 4 describes the architecture of our behavioral malware classification model. Section 5 explains how the dataset used in the experiment was created from the ground truth labeled data. Section 6 evaluates our model against previous work on behavioral malware classification. Finally, Section 7 outlines our conclusions and future work.

2 Related Work

Behavioral malware classification has been researched extensively to mitigate the shortcomings of static malware classification. Malware that uses advanced obfuscation techniques, such as polymorphism and metamorphism, is a challenge for detection and classification using static analysis techniques [15, 29]. Researchers have introduced new dynamic features to profile the behavior of malware samples. They extract program control graphs [25] and measure the similarity between malware within the same family. The work described in [36, 9, 20] used sequences of function/system calls to model the behavior of malware and applied machine learning techniques to group malware with similar behavior into a common family.

The disparity between anti-virus vendors' naming schemes affects the performance of behavioral malware classifiers [3, 8, 21]. A common solution is to cluster malware based on their observed behavior using unsupervised machine learning [3]. However, malware samples that are difficult to cluster are often left out [27]. A method to overcome the disparity between anti-virus scan engine labels is to cluster multiple ground truth labels into a single valid ground truth source [34]. Another solution uses a method to aggregate labels in conjunction with supervised and unsupervised machine learning techniques to infer suitable labels [21].

Our work is distinct from previous efforts in that we build a Convolutional Recurrent Neural Network that uses new dynamic features extracted from Windows Prefetch files to classify malware. The model should outperform previous work using any anti-virus labeling scheme, should perform consistently regardless of the ground truth labels, and should be able to classify malware into both common and rare malware families.

3 Microsoft Windows Prefetch Files

Prefetch files contain a summary of the behavior of Windows applications. The Windows operating system uses Prefetch files to speed up the booting process and the launch time of Windows programs. The Windows Cache Manager (WCM) monitors the first two minutes of the booting process and another sixty seconds after all system services are loaded. Similarly, WCM monitors each application for ten seconds after it is launched. The prefetching process analyzes the usage patterns of Windows applications while they load their dependency files, such as dynamic link libraries, configuration files, and executable binary files. WCM stores the information for each application in files with a .PF extension inside the system directory named Prefetch.

Prefetch files store relevant information about the behaviors of an application, which can be used for memory security forensics, system resource auditing, and rootkit detection [4, 30, 28]. Many malicious activities can leave distinguishable traces in Prefetch files [28, 30]. Even fileless malware, which are memory-resident malicious programs, can leave residual trails in Prefetch files after deleting their presence from the file system [13, 19, 7]. Poweliks is one of the first fileless malware samples that can infect a computer with ransomware [19]. The malware employs evasive techniques to avoid detection by traditional anti-virus software.

Figure 1 shows an example Prefetch file for the CMD.EXE program. The first section has runtime information such as the last-execution timestamp. The second section contains storage information. The third section lists the directories accessed by the program. The final section lists the resource files loaded by the program. The exact format of Prefetch files may vary across versions of Windows, but the general structure is consistent across all versions. In our model, we only use the list of loaded files from the final section of each Prefetch file.

4 Malware Classification Model

Our model classifies malware into families using information gathered from Prefetch files stored in the Windows Prefetch folder. We use Convolutional Recurrent Neural Networks to implement the components of our classifier. This section describes the architecture of the model and the training process used to create the model.

4.1 Model Architecture

Figure 2 shows the general architecture of our behavioral malware classifier. The first layer is the embedding layer. This layer receives a sequence of resource file names and maps them to embedding vectors of arbitrary sizes. The number of embedding vectors represents the size of the vocabulary of the model. Each file name corresponds to a unique embedding vector. Embedding vectors generally improve the performance of large neural networks for complex
104 2018 13th International Conference on Malicious and Unwanted Software: “Know Your Enemy” (MALWARE)
learning problems [33].

The second layer is a convolutional layer. The layer applies a one-dimensional (1D) sequential filter of a particular size. The layer then slides the filter over the entire list to extract adjacent file names. This helps the model learn the local relations between embedding vectors. 1D convolutional layers have been used successfully in sequence classification and text classification [23] problems.

The third layer is Max Pooling. This layer reduces the size of the data from the previous layer. It is designed to improve the computational performance and the accuracy of our model and its respective training process. We use the maximum function to select the important representations out of the data.

The fourth layer is Bidirectional LSTM. Bidirectional LSTM (BiLSTM) is an architecture of recurrent neural networks [16]. Recurrent neural networks learn the long-term dependencies between the embedding vectors. In our context, they model the relationships between the resource file names loaded in each Prefetch file. The bidirectional structure consists of a forward and a reversed LSTM, a structure that has been successful in NLP and sequence classification problems [45, 17].

The fifth layer is Global Max Pooling. This layer propagates only relevant information from the sequence of outputs of the BiLSTM. It reduces the size of the output of the BiLSTM layer.

The sixth, and final, layer is Softmax. This layer outputs the probability that a malware sample belongs to a specific malware family.

To improve the generalization of our model, we apply different regularization techniques. First, we apply dropout between our model layers. Dropout is a commonly used technique in training large neural networks to reduce overfitting [39]. Dropout has been shown to improve the training and classification performance of large neural networks. The goal is to learn hidden patterns without merely memorizing the samples in the training data. This improves the robustness of the model on unseen (i.e., zero-day) malware samples.

Figure 1: Example of a Prefetch file for the CMD.EXE program.

Type         Size     Malware Family Samples
Adware       0.79%    MultiPlug, SoftPulse, DomaIQ
Backdoor     2.25%    Advml, Fynloski, Cycbot, Hlux
Trojan       89.18%   AntiFW, Buzus, Invader, Kovter
Virus        1.44%    Lamer, Parite, Nimnul, Virut
Worm         4.28%    AutoIt, Socks, VBNA, Generic
Ransomware   2.07%    Xorist, Zerber, Blocker, Bitman

Table 1: Malware types, sizes, and examples of malware families according to the Kaspersky, EsetNod32, Microsoft, and McAfee labelings.

5 Experimental Setup

This section describes how the dataset and the ground truth labeling used in our experiment were created.

5.1 Dataset Collection

We successfully executed around 100,000 malware samples obtained from the public malware repository VirusShare. Malware samples were deployed on a freshly installed Windows 7 executing on a virtual machine. After each Prefetch file is collected, the virtual machine is reset to a clean (non-infected) state. In order for Windows to generate a Prefetch file for a malware sample, the sample needs to be executed. Once the sample is loaded, Windows generates a Prefetch file automatically. This simplifies the task of extracting the Prefetch files for malicious programs. Our experiments only included malware samples that produced Prefetch files and were identified by major anti-virus engines, such as Kaspersky, EsetNod32, Microsoft, and McAfee.

1 VirusShare, http://www.virusshare.com

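The input to the classifier is the list of loaded resource files from the final section of each Prefetch file, with each file name mapped to an integer index into the embedding vocabulary. The paper does not publish its preprocessing code, so the sketch below is only an illustration of that tokenization step; the sample file lists and the out-of-vocabulary policy (reserving index 0 for unseen names) are our own assumptions.

```python
# Illustrative tokenization of Prefetch loaded-file lists (Sections 3 and 4.1).
# The file names and the OOV policy (index 0) are assumptions for this sketch.

def build_vocabulary(file_lists):
    """Assign a unique integer (starting at 1) to every file name seen in training."""
    vocab = {}
    for file_list in file_lists:
        for name in file_list:
            if name not in vocab:
                vocab[name] = len(vocab) + 1  # 0 is reserved for unknown names
    return vocab

def encode(file_list, vocab):
    """Map one Prefetch loaded-file list to a sequence of vocabulary indices."""
    return [vocab.get(name, 0) for name in file_list]

# Two hypothetical loaded-file lists, standing in for parsed Prefetch files:
samples = [
    ["NTDLL.DLL", "KERNEL32.DLL", "CMD.EXE"],
    ["NTDLL.DLL", "WS2_32.DLL"],
]
vocab = build_vocabulary(samples)
print(encode(samples[0], vocab))                  # indices of known names
print(encode(["NTDLL.DLL", "EVIL.DLL"], vocab))   # unseen name maps to 0
```

These integer sequences are what the embedding layer of Section 4.1 consumes, one embedding vector per index.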
Figure 2: 1D-Conv-BiLSTM model architecture.

5.2 Ground Truth Labeling

Ground truth labels for malware were obtained through an online third-party virus scanning service called VirusTotal. Given an MD5, SHA1, or SHA256 hash of a malware file, VirusTotal provides the detection information of popular anti-virus engines. This information also includes metadata such as target platforms, malware types, and malware families for each anti-virus scan engine. Table 1 illustrates malware types, sample sizes, and examples of malware families according to EsetNod32, Kaspersky, Microsoft, and McAfee.

3 VirusTotal, http://www.virustotal.com

6 Evaluation

This section describes the experimental evaluation of our model against a model from previous work.

6.1 Performance Measurements

The classification accuracy of our classification model is measured by the F1 score. F1 captures the trade-off between Recall and Precision and combines them into a single metric ranging from 0.0 to 1.0. Recall is the fraction of retrieved examples over all relevant examples. Precision is the fraction of relevant examples over all retrieved ones. The F1 score formula is:

F1 = 2 * (Precision * Recall) / (Precision + Recall)

A classifier is superior when its F1 score is higher. We chose the F1 score because it is less prone to unbalanced classes in training data [11]. Malware training datasets often contain unbalanced samples for different malware families. The ratio between malware family sizes sometimes varies by 1:100. Table 1 shows each malware type, its size, and a few examples of its malware families.

6.2 Classification Performance with Common Malware Families

We evaluate our malware classification model against the models of previous work on behavioral malware classification [9]. The previous work examined multiple types of feature extraction, feature selection, and classification models based on large datasets extracted from sequences of OS system calls. The top performing models were Logistic Regression (LR) and Random Forests (RF). LR and RF were used with n-gram feature extraction and Term Frequency-Inverse Document Frequency (TF-IDF) feature transformation [10]. RF also used Singular Value Decomposition (SVD) for feature dimensionality reduction [22].

We implemented our new model using the Keras and Tensorflow [12, 1] deep learning frameworks. We configured our model using the following parameters:

• Embedding layer: 100 hidden units
• 1D Convolutional layer: 250 filters, kernel size of five, one stride, and RELU activation function
• 1D Max Pooling: pool size of four
• Bidirectional LSTM: 250 hidden units
• L2 regularization: 0.02
• Dropout regularization: 0.5
• Recurrent Dropout regularization: 0.2

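To make the listed hyperparameters concrete, the helper below traces how a length-L sequence of file-name indices flows through the layers of Figure 2. This is shape bookkeeping only, not the trained model; we assume 'valid' (no-padding) convolution, non-overlapping pooling, and a sequence-returning BiLSTM before the global max pooling, since the paper does not state those details.

```python
def conv_bilstm_shapes(seq_len, n_families,
                       embed_dim=100, filters=250, kernel=5,
                       pool=4, lstm_units=250):
    """Trace tensor shapes through the 1D-Conv-BiLSTM layers of Figure 2.

    Assumes 'valid' convolution with stride 1 and a sequence-returning
    BiLSTM, which the paper does not state explicitly.
    """
    shapes = {}
    shapes["embedding"] = (seq_len, embed_dim)            # one vector per file name
    conv_len = seq_len - kernel + 1                       # valid conv, stride 1
    shapes["conv1d"] = (conv_len, filters)
    shapes["max_pool"] = (conv_len // pool, filters)      # non-overlapping pooling
    shapes["bilstm"] = (conv_len // pool, 2 * lstm_units) # forward + reversed LSTM
    shapes["global_max_pool"] = (2 * lstm_units,)         # one value per channel
    shapes["softmax"] = (n_families,)                     # family probabilities
    return shapes

# Example: a 100-file Prefetch sequence classified into 50 families.
for layer, shape in conv_bilstm_shapes(seq_len=100, n_families=50).items():
    print(layer, shape)
```

Note how the convolution and the two pooling stages shrink the sequence dimension before the BiLSTM, while the bidirectional structure doubles the channel dimension to 500.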
                 Anti-virus label (# of malware families)
                 Kaspersky (50)  EsetNod32 (53)  Microsoft (38)  McAfee (55)  F1 mean
1D-Conv-BiLSTM   0.734           0.854           0.754           0.765        0.777
LR 2-grams       0.711           0.821           0.734           0.756        0.756
LR 3-grams       0.718           0.822           0.726           0.756        0.756
RF 2-grams       0.702           0.792           0.731           0.755        0.745
RF 3-grams       0.671           0.699           0.720           0.724        0.704

Table 2: F1 scores for the 1D-Conv-BiLSTM, LR (2,3)-grams, and RF (2,3)-grams models using the Kaspersky, EsetNod32, Microsoft, and McAfee labelings.

We implemented the previous work's LR and RF models using Scikit-learn [32]. We applied a grid search to select the best hyperparameters for the LR and RF models.

We train our model using Stochastic Gradient Descent (SGD) with a batch size of 32 samples and 300 epochs [47]. SGD is an iterative optimization algorithm commonly used in training large neural networks. SGD can operate on large training sets using one sample or a small batch of samples at a time. Thus, it is efficient for large training sets and for online training [5].

We use 10-fold cross-validation with stratified sampling to create a balanced distribution of malware samples over malware families in each fold. We train the models on 9 splits of our dataset and test on the held-out split. We repeat this experiment 10 times and take the average metric score as the final output. We include any malware family that has a minimum of 50 malware samples.

Table 2 shows the F1 score results of our experiment using four major anti-virus scan engines: Kaspersky, EsetNod32, Microsoft, and McAfee. The results show that our model outperforms all other models using any anti-virus engine labeling. The second best are the LR models, which outperform the RF models on all anti-virus scan engines and reproduce the results described in [9]. It is noteworthy that 3-gram feature extraction usually provides better results than 2-gram features in the LR models. However, the 2-gram features outperform the 3-gram features in the RF models.

As shown, the performance of behavioral classification models depends on the anti-virus engine labeling scheme used during training. LR 3-grams shows better performance using the Kaspersky and EsetNod32 labelings, and worse performance using the Microsoft labeling scheme. Moreover, RF 2-grams underperforms all LR models except when using the Microsoft naming scheme. The inconsistency of the results leads researchers to use the anti-virus engine that produces the highest classification score. However, our model shows consistent performance across all major anti-virus engines and outperforms previous work on all of them.

6.3 Classification Performance with Rare Malware Families

Rare malware families with small sample sizes represent a significant percentage of all malware families. This makes it difficult for models to extract useful behavioral patterns, due to insufficient samples during training. In this experiment, we include any malware family that has at least 10 malware samples. This presents a challenge for classification models because the number of malware families increases greatly while, at the same time, the number of malware samples for each family decreases. We aim to show the robustness of our classification model when applied to rare malware families.

Table 3 shows the classification performance of our model against the LR and RF models using the four anti-virus labeling schemes. The table shows that our model consistently outperforms all other models despite the increased number of malware families with low sample sizes. For example, on the EsetNod32 labeling scheme, our model's performance decreases by only 1.0% when the number of families increases from 53 to 180, while the other models exhibit larger classification performance degradations. Our model shows the smallest decrease in classification performance on every anti-virus labeling scheme.

Figure 3 shows the average F1 scores of malware families for the LR 3-grams, RF 2-grams, and 1D-Conv-BiLSTM models using EsetNod32 ground truth labels. We study the performance of the behavioral classification models on individual malware families to demonstrate the strength of the classification models on common and rare malware families. As shown, the LR model struggles with rare malware families. However, it outperforms the RF model when the number of malware samples in a family increases. Conversely, the RF model performs reasonably on rare malware families, but it underperforms the LR models on common malware families. Ultimately, our 1D-Conv-BiLSTM model outperforms both the LR and RF models on almost all common and rare malware families.

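The evaluation protocol used above, stratified 10-fold cross-validation with the per-fold F1 scores averaged, can be sketched without any ML library. The round-robin assignment below is one simple way to keep each family's samples spread evenly across folds; the paper's exact splitting code is not published (scikit-learn's StratifiedKFold would be the natural choice), so treat this as an illustration, with the toy labels as assumptions.

```python
from collections import defaultdict

def stratified_folds(labels, k=10):
    """Deal each family's sample indices round-robin into k folds, so every
    fold keeps roughly the family proportions of the whole dataset.
    A sketch of the stratified sampling described in Section 6.2."""
    by_family = defaultdict(list)
    for idx, family in enumerate(labels):
        by_family[family].append(idx)
    folds = [[] for _ in range(k)]
    for family, indices in by_family.items():
        for i, idx in enumerate(indices):
            folds[i % k].append(idx)
    return folds

def mean_f1(per_fold_f1):
    """Average the per-fold F1 scores, as done for the reported results."""
    return sum(per_fold_f1) / len(per_fold_f1)

# Hypothetical toy labels: 20 'trojan' samples and 10 'worm' samples.
labels = ["trojan"] * 20 + ["worm"] * 10
folds = stratified_folds(labels, k=10)
print([len(f) for f in folds])  # each fold holds 2 trojans and 1 worm
```

Training then runs k times, each time holding out one fold for testing, and the final reported metric is mean_f1 over the k held-out F1 scores.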
                 Anti-virus label (# of malware families)
                 Kaspersky (192)    EsetNod32 (180)    Microsoft (137)    McAfee (209)
                 F1     Diff        F1     Diff        F1     Diff        F1     Diff        F1 mean  Diff (%)
1D-Conv-BiLSTM   0.647  -0.088      0.844  -0.010      0.727  -0.027      0.720  -0.045      0.735    -4.25%
LR 2-grams       0.586  -0.124      0.790  -0.032      0.656  -0.078      0.652  -0.104      0.671    -8.45%
LR 3-grams       0.594  -0.124      0.790  -0.032      0.651  -0.075      0.656  -0.100      0.673    -8.28%
RF 2-grams       0.588  -0.114      0.760  -0.031      0.664  -0.067      0.658  -0.097      0.668    -7.73%
RF 3-grams       0.527  -0.144      0.650  -0.049      0.627  -0.093      0.587  -0.137      0.598    -10.58%

Table 3: F1 scores for the 1D-Conv-BiLSTM, LR (2,3)-grams, and RF (2,3)-grams models using the Kaspersky, EsetNod32, Microsoft, and McAfee labelings. Diff shows the change in F1 score from the previous section after adding the rare malware families.

Figure 3: Average F1 scores as a function of the log number of malware samples per family for the 1D-Conv-BiLSTM, LR 3-grams, and RF 2-grams models, using EsetNod32 ground truth labels.

6.4 Top Predictions Performance

We also evaluate the capacity of the classification models to find the correct malware family label within their top k predictions. That is, we measure how the F1 score improves when the top [1, 2, ..., k] predictions are allowed to include the correct malware family label. As shown in Figure 4, 1D-Conv-BiLSTM consistently outperforms all of the other models for the top [1, 2, ..., 25] predictions. 1D-Conv-BiLSTM achieves around 0.91, 0.95, and 0.99 F1 on the top 2, 5, and 25 predictions, respectively. This demonstrates that the correct malware family label falls within the top 25 predictions of our model about 99% of the time. The performance of the RF models varies between the 2-gram and 3-gram variants, while the LR models achieve similar F1 scores for the 2-gram and 3-gram variants when top predictions are used.

The LR (2,3)-grams models outperform the RF models up to the top 5 predictions. Then, the RF 2-grams model outperforms the LR models on the top 5 or higher predictions. The RF 3-grams model, which achieves the lowest classification performance in our experiment, matches the corresponding LR model's performance when considering the top 25 predictions. This shows that RF models have a higher capacity to find the correct malware families within the top candidates. The reason might be related to the fact that a Random Forest is an ensemble of decision trees [6], and it is known that ensemble models often overcome the limitations of stand-alone classification models [44]. Our model consistently outperforms the LR and RF models on the top k predictions.

Figure 4: The F1 scores for the behavioral classification models when the top k predictions are used to find the correct malware family label, according to EsetNod32 ground truth labeling.

6.5 Classification Performance with New Malware Families

Behavioral malware classification models need to learn the behavior of newly discovered malware continuously. This presents a challenge since the rate of malware sample discovery is high. Therefore, it is more efficient and practical to incrementally train an existing model than to re-train it from scratch on newly discovered samples. Incremental training provides a practical way to assimilate new malware behavioral information into the classification models without impacting classification performance.

In this experiment, we evaluate our pre-trained model's ability to learn the behavior of new malware samples quickly. We train our model on all malware families that were discovered from 2010-2016. Then, we add malware families that were discovered in 2017 to the training dataset and incrementally retrain the model to create a new classification model. We aim to show that incrementally re-training an existing model is more efficient and adaptive than training a new model from scratch.

Figure 5 shows the classification performance of our models during training. The experiment shows that the incrementally re-trained model achieves a higher F1 at early stages of training than the newly trained model. Therefore, the training process can be shortened to reduce the overhead of training on new malware samples. Moreover, incremental re-training of our model is efficient and recommended over fully re-training the model.

Figure 5: The F1 scores for the newly trained and incrementally trained 1D-Conv-BiLSTM models on the test dataset during training.

7 Conclusion

We introduce a new behavioral malware classification model for the Microsoft Windows platform. Our model extracts features from Windows Prefetch files. We show the effectiveness of our classification technique on a large malware collection and ground truth labels from 4 major anti-virus vendors.

We also evaluate our model on rare malware families with small numbers of malware samples. Despite the increasing number of malware families, our model still outperforms other state-of-the-art models. Moreover, we demonstrate our model's ability to continuously learn the behavior of new malware families, which reduces the time and overhead of the training process.

In the future, we would like to improve our ground truth labeling by combining all major scan engine labels to increase the performance and robustness of our classification model. We would also like to test our model on malware families evolving over time.

References

[1] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al. TensorFlow: A system for large-scale machine learning. In OSDI, volume 16, pages 265–283, 2016.
[2] M. Alazab, S. Venkatraman, P. Watters, and M. Alazab. Zero-day malware detection based on supervised learning algorithms of API call signatures. In Proceedings of the Ninth Australasian Data Mining Conference, Volume 121, pages 171–182. Australian Computer Society, Inc., 2011.
[3] M. Bailey, J. Oberheide, J. Andersen, Z. M. Mao, F. Jahanian, and J. Nazario. Automated classification and analysis of internet malware. In International Workshop on Recent Advances in Intrusion Detection, pages 178–197. Springer, 2007.
[4] B. Blunden. The Rootkit Arsenal: Escape and Evasion in the Dark Corners of the System. Jones & Bartlett Publishers, 2012.
[5] L. Bottou. Large-scale machine learning with stochastic gradient descent. In Proceedings of COMPSTAT'2010, pages 177–186. Springer, 2010.
[6] L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.
[7] C. Wueest and H. Anand. The increased use of PowerShell in attacks. https://www.symantec.com/content/dam/symantec/docs/security-center/white-papers/increased-use-of-powershell-in-attacks-16-en.pdf, 2016. [Online; accessed 10-Jan-2017].
[8] J. Canto, M. Dacier, E. Kirda, and C. Leita. Large scale malware collection: lessons learned. In IEEE SRDS Workshop on Sharing Field Data and Experiment Measurements on Resilience of Distributed Computing Systems. Citeseer, 2008.
[9] R. Canzanese, S. Mancoridis, and M. Kam. Run-time classification of malicious processes using system call analysis. In Malicious and Unwanted Software (MALWARE), 2015 10th International Conference on, pages 21–28. IEEE, 2015.
[10] W. Cavnar. Using an n-gram-based document representation with a vector processing retrieval model. NIST Special Publication SP, pages 269–269, 1995.
[11] N. V. Chawla. Data mining for imbalanced datasets: An overview. In Data Mining and Knowledge Discovery Handbook, pages 853–867. Springer, 2005.
[12] F. Chollet et al. Keras (2015), 2017.
[13] A. Dove. Fileless malware – a behavioural analysis of Kovter persistence. 2016.
[14] M. Egele, T. Scholte, E. Kirda, and C. Kruegel. A survey on automated dynamic malware-analysis techniques and tools. ACM Computing Surveys (CSUR), 44(2):6, 2012.
[15] E. Filiol. Malware pattern scanning schemes secure against black-box analysis. Journal in Computer Virology, 2(1):35–50, 2006.
[16] I. Goodfellow, Y. Bengio, and A. Courville. Deep Learning, volume 1. MIT Press, Cambridge, 2016.
[17] A. Graves and J. Schmidhuber. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks, 18(5):602–610, 2005.
[18] K. Heller, K. Svore, A. D. Keromytis, and S. Stolfo. One class support vector machines for detecting anomalous Windows registry accesses. In Workshop on Data Mining for Computer Security (DMSEC), Melbourne, FL, November 19, 2003, pages 2–9, 2003.
[19] B. S. R. R. U. Inocencio. Doing more with less: A study of fileless infection attacks. https://www.virusbulletin.com/uploads/pdf/conference_slides/2015/RiveraInocencio-VB2015.pdf, September 30, 2015. [Online; accessed 19-Jan-2017].
[20] G. Jacob, H. Debar, and E. Filiol. Behavioral detection of malware: from a survey towards an established taxonomy. Journal in Computer Virology, 4(3):251–266, 2008.
[21] A. Kantchelian, M. C. Tschantz, S. Afroz, B. Miller, V. Shankar, R. Bachwani, A. D. Joseph, and J. D. Tygar. Better malware ground truth: Techniques for weighting anti-virus vendor labels. In Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security, pages 45–56. ACM, 2015.
[22] H. Kim, P. Howland, and H. Park. Dimension reduction in
[24] J. Z. Kolter and M. A. Maloof. Learning to detect and classify malicious executables in the wild. Journal of Machine Learning Research, 7(Dec):2721–2744, 2006.
[25] C. Kruegel, E. Kirda, D. Mutz, W. Robertson, and G. Vigna. Polymorphic worm detection using structural information of executables. In International Workshop on Recent Advances in Intrusion Detection, pages 207–226. Springer, 2005.
[26] W. Lee, S. J. Stolfo, and K. W. Mok. A data mining framework for building intrusion detection models. In Security and Privacy, 1999. Proceedings of the 1999 IEEE Symposium on, pages 120–132. IEEE, 1999.
[27] P. Li, L. Liu, D. Gao, and M. K. Reiter. On challenges in evaluating malware clustering. In International Workshop on Recent Advances in Intrusion Detection, pages 238–255. Springer, 2010.
[28] C. H. Malin, E. Casey, and J. M. Aquilina. Malware Forensics Field Guide for Windows Systems: Digital Forensics Field Guides. Elsevier, 2011.
[29] J. A. Marpaung, M. Sain, and H.-J. Lee. Survey on malware evasion techniques: State of the art and challenges. In Advanced Communication Technology (ICACT), 2012 14th International Conference on, pages 744–749. IEEE, 2012.
[30] D. Molina, M. Zimmerman, G. Roberts, M. Eaddie, and G. Peterson. Timely rootkit detection during live response. In IFIP International Conference on Digital Forensics, pages 139–148. Springer, 2008.
[31] R. Moskovitch, C. Feher, N. Tzachar, E. Berger, M. Gitelman, S. Dolev, and Y. Elovici. Unknown malcode detection using opcode representation. In Intelligence and Security Informatics, pages 204–215. Springer, 2008.
[32] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12(Oct):2825–2830, 2011.
[33] J. Pennington, R. Socher, and C. Manning. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543, 2014.
[34] R. Perdisci et al. VAMO: towards a fully automated malware clustering validity analysis. In Proceedings of the 28th Annual Computer Security Applications Conference, pages 329–338. ACM, 2012.
[35] C. Raiu. A virus by any other name: Virus naming practices. Security Focus, 2002.
[36] K. Rieck, P. Trinius, C. Willems, and T. Holz. Automatic analysis of malware behavior using machine learning. Journal of Computer Security, 19(4):639–668, 2011.
[37] A.-D. Schmidt, R. Bye, H.-G. Schmidt, J. Clausen, O. Kiraz, K. A. Yuksel, S. A. Camtepe, and S. Albayrak. Static analysis of executables for collaborative malware detection on Android. In Communications, 2009. ICC'09. IEEE International Conference on, pages 1–5. IEEE, 2009.
[38] M. G. Schultz, E. Eskin, E. Zadok, and S. J. Stolfo. Data mining methods for detection of new malicious executables.
text classification with support vector machines. In Journal bles. In Security and Privacy, 2001. S&P 2001. Proceed-
of Machine Learning Research, pages 37–53, 2005. ings. 2001 IEEE Symposium on, pages 38–49. IEEE, 2001.
[23] Y. Kim. Convolutional neural networks for sentence classi- [39] N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and
fication. arXiv preprint arXiv:1408.5882, 2014. R. Salakhutdinov. Dropout: a simple way to prevent neural

110 2018 13th International Conference on Malicious and Unwanted Software: “Know Your Enemy” (MALWARE)
networks from overfitting. Journal of Machine Learning Re-
search, 15(1):1929–1958, 2014.
[40] S. J. Stolfo, F. Apap, E. Eskin, K. Heller, S. Hershkop, A. Honig, and K. Svore. A comparative evaluation of two algorithms for Windows registry anomaly detection. Journal of Computer Security, 13(4):659–693, 2005.
[41] P. Szor. The art of computer virus research and defense. Pearson Education, 2005.
[42] M. Venable, A. Walenstein, M. Hayes, C. Thompson, and A. Lakhotia. Vilo: a shield in the malware variation battle. Virus Bulletin, pages 5–10, 2007.
[43] K. Wang and S. Stolfo. One-class training for masquerade detection. 2003.
[44] M. Woźniak, M. Graña, and E. Corchado. A survey of multiple classifier systems as hybrid systems. Information Fusion, 16:3–17, 2014.
[45] Y. Wu, M. Schuster, Z. Chen, Q. V. Le, M. Norouzi, W. Macherey, M. Krikun, Y. Cao, Q. Gao, K. Macherey, et al. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144, 2016.
[46] N. Ye and Q. Chen. An anomaly detection technique based on a chi-square statistic for detecting intrusions into information systems. Quality and Reliability Engineering International, 17(2):105–112, 2001.
[47] T. Zhang. Solving large scale linear prediction problems using stochastic gradient descent algorithms. In Proceedings of the Twenty-First International Conference on Machine Learning, page 116. ACM, 2004.
2018 13th International Conference on Malicious and Unwanted Software: “Know Your Enemy” (MALWARE)