Metamorph 2
Abstract: With the rapid growth of the IT sector in the 21st century, system security has become an equally pressing concern. While the IT field continues to grow, malware attacks are rising alongside it, and zero-day malware attacks pose a particular challenge. Authors of metamorphic and polymorphic malware gain an extra advantage from mutation engines and virus generation toolkits, which let them produce as many malware variants as they want. Our approach focuses on the detection and classification of metamorphic malware (MM) according to their families. MM are the hardest for antivirus scanners (AVS) to detect because their variants differ structurally. We gathered a total of 600 malware samples, including some that bypass the AVS, and 150 benign files. These files are disassembled and preprocessed, and control flow graphs (CFGs) and API call graphs are generated. We propose the Gourmand Feature Selection algorithm for selecting the desired features from the call graphs. Classification is done with the WEKA tool, in which J-48 gave the highest accuracy, 99.10%. Once the metamorphic malware are detected, they are classified according to their families using histograms and the Chi-square distance measurement formula.

Keywords: metamorphic malware, polymorphic malware, mutation engine, code obfuscation, histogram.

I. INTRODUCTION
Malware, a small seven-letter word, is powerful enough to cause substantial security attacks in the computer world. Malware takes various forms: it is a comprehensive term for worms, Trojans, viruses, spyware, adware, crimeware, botnets, and more. In other words, malware, short for MALicious softWARE, is software designed to compromise a system’s security, integrity and confidentiality.

A. Malware Analysis
Malware can be analyzed by two techniques: static analysis and dynamic analysis.
Static analysis uses pattern (byte-code signature) recognition to analyze malware. The suspected code is analyzed without actually executing it. Antivirus scanners follow the traditional signature-based detection method to detect malware. Signatures are byte sequences stored in the database. A major disadvantage of signature-based detection is that it fails to detect malware whose signature is not present in the database. Also, one should have proper knowledge of the software for static analysis, which is not usually possible. Hence, signature-based detection fails against zero-day malware attacks.
Dynamic analysis is the technique of analyzing the infected file while it is executing, in order to keep watch on its behavior or the actions it performs. Malware today are very smart: some of them stop working as soon as they detect an emulated or virtual environment, and can thus easily bypass the malware detection scheme. For this technique, proper knowledge of the software is not necessary. Dynamic analysis has proven able to detect unknown malware. At present, dynamic analysis is the technique most commonly used to detect malware, but it is not adequate.

B. Metamorphic Malware
The term metamorphic means ‘self-change in shape’. Metamorphism is the process of generating copies of itself while changing its shape in each copy. Metamorphic malware are self-generated malware that are functionally the same but structurally different; that is, every new variant generated on each iteration differs in size, syntax, structure, and instructions, but its behavior remains constant because the semantics are preserved. The variants therefore belong to the same family of malware.

C. Polymorphic Malware
The term polymorphic means ‘multitudinal: the appearance of more than one form’. Polymorphic malware is similar to metamorphic malware in that, to fool detectors, it also makes changes to its initial code on each iteration.

TABLE 1: COMPARISON OF METAMORPHIC MALWARE AND POLYMORPHIC MALWARE
S. No. | Metamorphic malware (MM) | Polymorphic malware (PM)
1 | MM has the ability to rewrite its own source code. | PM does not have any such quality.
2 | Tougher than PM to detect. | If the signature is present in the AVS, these malware can be easily detected.
3 | Cannot be easily detected, as these malware follow the semantic-preserving scheme. | Can be detected by AVS, as one of its parts, called the decryptor, remains the same in every generated new variant of the malware, serving as a signature to AVS.
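The signature-based detection discussed above can be illustrated with a minimal sketch. This is not the scanner of any particular AVS: the names SIGNATURE_DB and scan, and the byte patterns themselves, are made-up placeholders. The sketch only shows why a fixed byte signature matches a known sample but misses a variant whose bytes have been rearranged.

```python
# Minimal sketch of signature-based scanning (illustrative only).
# SIGNATURE_DB and its byte patterns are hypothetical placeholders,
# not real malware signatures.
SIGNATURE_DB = {
    "Example.Family.A": bytes.fromhex("deadbeef90"),
    "Example.Family.B": bytes.fromhex("cafebabe0102"),
}


def scan(data: bytes, signature_db=SIGNATURE_DB):
    """Return the sorted names of all signatures whose bytes occur in data."""
    return sorted(name for name, sig in signature_db.items() if sig in data)


# A sample that contains a stored signature verbatim.
known_sample = b"\x00\x01" + bytes.fromhex("deadbeef90") + b"\x55"

# The same placeholder bytes with one extra byte inserted mid-pattern,
# standing in for a structurally changed variant.
mutated_sample = b"\x00\x01" + bytes.fromhex("deadbe90ef90") + b"\x55"

print(scan(known_sample))
print(scan(mutated_sample))
```

The second call finds nothing because the stored byte sequence no longer appears contiguously in the variant, which is the weakness of signature matching that the text describes.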
Authorized licensed use limited to: Amrita School of Engineering. Downloaded on November 05,2024 at 11:53:07 UTC from IEEE Xplore. Restrictions apply.
V. METAMORPHIC MALWARE CODE OBFUSCATION TECHNIQUES
Code obfuscation is a technique for making code more difficult to understand and detect. For metamorphic malware, code obfuscation techniques make the code different from its source code and much harder to understand, while keeping it behaviorally the same. The code obfuscation techniques for metamorphic malware are as follows; they differ according to the work they perform on the targeted code.

VI. METAMORPHIC MALWARE GENERATOR (MUTATION ENGINE)
When the code is fed to the mutation engine, variants of metamorphic malware are generated, which are functionally the same but differ structurally, as shown in Figure 1.

FIGURE 1. METAMORPHIC MALWARE VARIANTS OUT OF MUTATION ENGINE.
as heart of the mutation engine. The obfuscator uses the information provided by the code analyzer to transform the binary of the input code, so that a new variant of the malware is generated that is functionally the same but structurally different. The code is then compressed and manificate by the code compressor and assembled into machine code by the assembler. The code is now obfuscated and is attached. The new code does not contain any instruction sequence that matches, as a signature, its previous version, which makes it difficult for AVS to detect.

FIGURE 2. METAMORPHIC MALWARE GENERATOR
[Flowchart of the proposed approach. Recoverable labels: Portable Executables (B: 150; M: 332; Eng.: 265); Disassemble through IDA Pro Disassembler; Preprocess; Build CFGs; Classification (Benign / Malware); Create Histogram; Chi-square; Evaluate according to family as -1, 0, and 1.]
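The family-classification step, which compares histograms using a chi-square distance, can be sketched as follows. The exact formula used in this work is not reproduced in the text, so the sketch assumes one commonly used symmetric form, d(P, Q) = 1/2 * sum_i (p_i - q_i)^2 / (p_i + q_i), applied to normalized histograms. The function names, the family names, and the feature counts are all hypothetical.

```python
def normalize(hist):
    """Scale a {feature: count} histogram so its values sum to 1."""
    total = sum(hist.values())
    if total == 0:
        raise ValueError("histogram must contain at least one count")
    return {key: value / total for key, value in hist.items()}


def chi_square_distance(hist_a, hist_b):
    """Symmetric chi-square distance between two histograms (assumed form)."""
    p, q = normalize(hist_a), normalize(hist_b)
    distance = 0.0
    for key in set(p) | set(q):
        a, b = p.get(key, 0.0), q.get(key, 0.0)
        if a + b > 0:  # skip bins that are empty in both histograms
            distance += (a - b) ** 2 / (a + b)
    return 0.5 * distance


def nearest_family(sample_hist, family_hists):
    """Return the family whose reference histogram is closest to the sample."""
    return min(
        family_hists,
        key=lambda family: chi_square_distance(sample_hist, family_hists[family]),
    )


# Hypothetical reference histograms of API-call counts per family.
family_hists = {
    "family_A": {"CreateFileA": 8, "WriteFile": 6, "RegSetValueExA": 1},
    "family_B": {"socket": 7, "connect": 5, "send": 9},
}
sample = {"CreateFileA": 7, "WriteFile": 5, "send": 1}

print(nearest_family(sample, family_hists))
```

With this form the distance is 0 for identical histograms and 1 for histograms with no features in common, so a smaller value indicates a closer match to a family.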
The disassembler’s job is to disassemble the code into assembly code. Once the code is disassembled, it is preprocessed to remove comments, blank lines, labels, etc. The residue is the essential code from which control flow graphs (CFGs) are generated. CFGs illustrate the flow of code segments, and from them API call graphs are generated. To select the desired features from the API call graphs, we propose the Gourmand Feature Selection Algorithm, shown in Algorithm 2 below.

WEKA is a classification tool for classifying benign and malicious files using various algorithms such as J-48, Voted Perceptron, Naïve Bayes, KNN, etc. We passed our dataset to all these algorithms, of which J-48 gave the most accurate result, 99.10%, as illustrated in Table 3 and Graph 1.

GRAPH 1: WEKA ALGORITHMIC GRAPHICAL CLASSIFICATION RESULT OF MALICIOUS FILE
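The preprocessing step described above (removing comments, blank lines, and labels from the disassembly) can be sketched as below. This is an illustration under stated assumptions, not the preprocessing code used in this work: it assumes a listing in which comments start with ";" and labels are identifiers ending in ":" at the start of a line. The function names and the sample listing are hypothetical, and the call extraction is only a crude stand-in for the first step toward an API call graph.

```python
import re

# Assumed label syntax: an identifier followed by ':' at the start of a line.
LABEL_RE = re.compile(r"^\s*[A-Za-z_.$@?][\w.$@?]*:\s*")


def preprocess(listing: str):
    """Strip comments, leading labels, and blank lines from a listing."""
    cleaned = []
    for line in listing.splitlines():
        line = line.split(";", 1)[0]   # drop everything after a comment marker
        line = LABEL_RE.sub("", line)  # drop a leading label, if any
        line = line.strip()
        if line:                       # skip lines that are now empty
            cleaned.append(line)
    return cleaned


def api_calls(instructions):
    """Return the operands of 'call' instructions, in order of appearance."""
    calls = []
    for instruction in instructions:
        parts = instruction.split(None, 1)
        if len(parts) == 2 and parts[0].lower() == "call":
            calls.append(parts[1].strip())
    return calls


# Made-up listing used only to exercise the two helpers.
SAMPLE_LISTING = """\
; sample listing (made up)
start:
    push ebp            ; prologue

loc_401000: call CreateFileA
    mov eax, 1
    call WriteFile
"""

instructions = preprocess(SAMPLE_LISTING)
print(instructions)
print(api_calls(instructions))
```

A real pipeline would take the listing from the disassembler and would also need to handle details this sketch ignores, such as ";" inside string literals and indirect calls.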
VIII. CONCLUSION AND FUTURE WORK
Our approach is divided into two phases. The first phase detects metamorphic malware and classifies it against benign files with an accuracy of 99.10%, using our proposed Gourmand Feature Selection Algorithm and the WEKA [7] classification tool. This process follows the dynamic approach. In the second phase, the detected metamorphic malware are further classified according to their families using histograms and the chi-square measurement formula. The detection accuracy for metamorphic malware is 99.10%; this percentage may be improved toward 100% in future by introducing more novel detection approaches. Our approach is limited to portable executables; in future it could be extended to PDF files and various web browsers by analyzing these files thoroughly.

REFERENCES
[1] Vinod P., Harshit Jain, Vijay Laxmi, Yashwant K. Golecha, Manoj Singh Gaur. MEDUSA: MEtamorphic malware Dynamic analysis Using Signature from API. In 5th International Conference on Malicious and Unwanted Software, MALWARE 2010, pages 263-269. ACM, 2010.
[2] Qinghua Zhang, Douglas S. Reeves. MetaAware: Identifying Metamorphic Malware. In Computer Security Application Conference, Annual, pages 411-420, 2007.
[3] Andrew Walenstein, Rachit Mathur, Mohamed R. Chouchane, and Arun Lakhotia. Normalizing Metamorphic Malware Using Term Rewriting. In International Workshop on Source Code Analysis and Manipulation, SCAM ’06, pages 75-84. IEEE, 2006.
[4] VXHeavens, http://vx.netlux.org/vl.php
[5] IDA Pro Disassembler and Debugger, http://www.hex-rays.com
[6] Guillaume Bonfante, Matthieu Kaczmarek, and Jean-Yves Marion, "Architecture of a Morphological Malware Detector", Computer Virology, 2009, pp. 263-270.
[7] WEKA. http://www.cs.waikato.ac.nz/ml/weka.
[8] Matthieu Kaczmarek, Guillaume Bonfante, and Jean-Yves Marion, "Control Flow Graphs as Malware Signatures", 2007.
[9] Andrew H. Sung, Jianyun Xu, Patrick Chavez, and Srinivas Mukkamala, "Static Analyzer of Vicious Executables (SAVE)", In Proc. of 20th Annual Computer Security Applications Conference (ACSAC 2004), 6-10 December 2004, Tucson, AZ, pp. 326-334.
[10] Heejo Lee, Kyoochang Jeong, "Code Graph for Metamorphic Malware Detection", In International Conference on Information Networking, ICOIN, 2008, pp. 1-5. IEEE.
[11] Wong, W. and Stamp, "Hunting for metamorphic engines", Journal in Computer Virology, 2006, 2:21122.
[12] Essam Al Daoud, Ahid Al-Shbail, Adnan M. Al-Smadi. Detecting Metamorphic Malware Using Arbitrary Length of Control Flow Graphs and Nodes Alignment. In ICIT conference, Bioinformatics and Image, 2009.