Signture 044303
INTRODUCTION
1.1 Background of the Study
Malware or malicious code refers to the broad class of software threats to computer systems and
networks. It includes any code that modifies, destroys or steals data, allows unauthorized access,
exploits or damages a system, or does something that the user does not intend to do. Perhaps the
most sophisticated types of threats to computer systems are presented by malicious codes that
exploit vulnerabilities in applications. Pattern based signatures are the most common technique
employed for malware detection. Implicit in a signature-based method is a priori knowledge of
distinctive patterns of malicious code. The advantage of such malware detectors lies in their
simplicity and speed. While the signature-based approach is successful in detecting known
malware, it does not work for new malware for which signatures have not yet been prepared.
There is a need to train the detector often in order to detect new malware.
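The pattern-based matching described above can be illustrated with a minimal sketch. The signature names and byte patterns below are hypothetical values chosen for illustration only; they are not taken from any real signature database.

```python
from typing import Dict, List

# Hypothetical signature database: name -> byte pattern (made-up values).
SIGNATURES: Dict[str, bytes] = {
    "Example.A": bytes.fromhex("deadbeef"),
    "Example.B": bytes.fromhex("cafebabe0102"),
}


def scan(data: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the names of all signatures whose pattern occurs in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


known = b"\x00\x01" + bytes.fromhex("deadbeef") + b"\x90"
unknown = b"\x00\x01\x02\x03"

print(scan(known))    # ['Example.A']
print(scan(unknown))  # []
```

The second call shows the limitation noted above: a sample for which no signature has been prepared is reported as clean.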
Malware authors often obfuscate the executable to make analysis difficult and to
evade detection. Four techniques are commonly employed for obfuscating executables. The first
approach, dead-code insertion, adds code that does not change the malware's
behavior, such as a sequence of NOPs (no-operation instructions). The second approach, register
reassignment, replaces one register with another, such as eax with ebx, to
evade detection. The third approach, instruction substitution, replaces a sequence of instructions
with an equivalent instruction sequence. Finally, the fourth approach, code transposition,
reorders sequences of instructions in such a way that the behavior of the code remains the
same. Although all of these approaches change the code pattern in order to evade
detection, the behavior of the malware remains the same.
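As a small illustration of dead-code insertion, the sketch below inserts a NOP between the instructions of a toy 32-bit x86 routine and checks an exact byte signature against the result. The routine, the signature and the helper names are assumptions made for this example; the NOP-stripping step is a deliberately naive normalization, not a method proposed in this work.

```python
NOP = b"\x90"  # single-byte x86 no-operation instruction

# Toy 32-bit x86 routine, one entry per instruction (illustrative only).
original = [b"\x31\xc0", b"\x40", b"\xc3"]  # xor eax, eax ; inc eax ; ret

signature = b"".join(original)   # byte pattern taken from the known sample
plain = b"".join(original)       # the unmodified code
obfuscated = NOP.join(original)  # same behavior, a NOP between instructions


def matches(code: bytes, pattern: bytes) -> bool:
    """Exact byte-pattern match, as in a simple signature scanner."""
    return pattern in code


def strip_nops(code: bytes) -> bytes:
    """Naive normalization: drop every 0x90 byte.

    This can also remove 0x90 bytes that are operands rather than NOPs,
    so it is only suitable for this toy example.
    """
    return code.replace(NOP, b"")


print(matches(plain, signature))                   # True
print(matches(obfuscated, signature))              # False
print(matches(strip_nops(obfuscated), signature))  # True
```

The obfuscated routine behaves the same as the original but no longer contains the signature bytes, which is why the code pattern alone is a fragile basis for detection.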
Past research has focused on modeling program behavior for intrusion and malware detection.
Such modeling of program behavior was first studied by Forrest et al. (2017). Their approach,
called N-Grams, used short sequences of system calls to model normal program behavior. Sekar
et al. (2018) used system calls to construct a control flow graph of normal program behavior.
Peisert et al. (2017) used sequences of function calls to represent program behavior. Based on such
results, our approach uses API calls as a measure of malware program behavior.
Specifically, we use only a subset of API calls, called critical API calls, in our analysis. These
critical API calls are the ones that can possibly cause malicious behavior. API calls have been
used in past research for modeling program behavior and for detecting malware.
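The two ideas above, short call sequences as a behavior model and filtering a trace down to critical API calls, can be sketched as follows. The call traces are invented, and the contents of CRITICAL_API_CALLS are an assumption for illustration; the text does not list which API calls are treated as critical.

```python
from typing import List, Set, Tuple


def ngrams(calls: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of all length-n windows over a call trace."""
    return {tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)}


# Hypothetical critical set; the real selection is not given in the text.
CRITICAL_API_CALLS = {"WriteProcessMemory", "CreateRemoteThread", "RegSetValueExA"}


def critical_calls(calls: List[str]) -> List[str]:
    """Keep only the calls that belong to the critical subset, in order."""
    return [c for c in calls if c in CRITICAL_API_CALLS]


# Model of normal behavior built from an invented system-call trace.
normal_trace = ["open", "read", "mmap", "mmap", "open", "read"]
normal = ngrams(normal_trace, 3)

# Windows in a new trace that were never seen in normal behavior.
observed = ngrams(["open", "read", "exec", "mmap"], 3)
unseen = observed - normal
print(len(normal), len(unseen))

api_trace = ["CreateFileA", "WriteProcessMemory", "CloseHandle", "CreateRemoteThread"]
print(critical_calls(api_trace))
```

Windows that never occur in the normal model are candidates for anomalous behavior; restricting a trace to critical calls reduces the amount of data that has to be compared.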
We use static analysis to extract critical API calls from known malicious programs to construct
signatures for an entire malware class rather than for a single specimen of malware. In our
approach, a malicious program is detected by statistical comparison of its API calls with those of a
malware class. The technique presented in this paper aims to detect known and unknown
malicious programs, including self-mutating malware. It is also capable of detecting malware
that uses common obfuscations. Our approach relies on the fact that the behavior of the malicious
programs in a specific malware class differs considerably from that of programs in other malware
classes and of benign programs.
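One possible form of such a statistical comparison is sketched below. It is an assumption made for illustration: the text does not state which statistic is used, so this sketch represents each program by the relative frequencies of its critical API calls, averages those profiles over a class, and compares a suspect program to the class with cosine similarity. The traces and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def profile(calls: List[str]) -> Dict[str, float]:
    """Relative frequency of each API call in one program's trace."""
    counts = Counter(calls)
    total = sum(counts.values())
    return {api: c / total for api, c in counts.items()} if total else {}


def class_signature(samples: List[List[str]]) -> Dict[str, float]:
    """Average the profiles of all known samples of one malware class."""
    profiles = [profile(s) for s in samples]
    apis = set().union(*profiles)
    return {a: sum(p.get(a, 0.0) for p in profiles) / len(profiles) for a in apis}


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity of two sparse frequency vectors (0.0 if either is empty)."""
    dot = sum(x * v.get(k, 0.0) for k, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Invented traces of critical API calls for two samples of one class.
family = [
    ["WriteProcessMemory", "CreateRemoteThread", "WriteProcessMemory"],
    ["WriteProcessMemory", "CreateRemoteThread"],
]
signature = class_signature(family)

suspect = ["CreateRemoteThread", "WriteProcessMemory",
           "WriteProcessMemory", "CreateRemoteThread"]
benign = ["RegSetValueExA", "RegSetValueExA"]

print(cosine(profile(suspect), signature))
print(cosine(profile(benign), signature))
```

A program whose similarity to a class signature exceeds a chosen threshold would be flagged as a member of that class. The threshold would have to be tuned on real data, and reordering or padding the code does not change the frequency profile as long as the same critical calls are made.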
One of the most common reasons that signature-based approaches fail is that the malware
mutates, making signature-based detection difficult. Such metamorphism has
already been witnessed in the past. Mobile devices, such as cell phones and tablets, have
become very popular because of a decrease in their cost and an increase in
their functionality and the availability of services. The growing trend of implementing
bring your own device (BYOD) policies in organizations has also contributed to the
adoption of these technologies, not only for everyday communication but also to support enterprise
systems, industrial applications, and business transactions, which raises new security issues.
Malware infections have plagued organizations and users for years and are becoming
stealthier and more numerous. Thus, this application is used to secure the user's
device.
1.2 Statement of the Problem
1. Detecting malware remains difficult.
2. Malware causes systems to malfunction.
3. Malware interferes with access to data.
4. Malware is difficult to identify efficiently.
Database: A database is a system intended to organize, store, and retrieve large amounts of data
easily. It consists of an organized collection of data for one or more uses, typically in digital
form.
Design: The art or process of deciding how something will look and work.
Documentation: Material that provides official information or evidence that serves as record;
written specifications and information that describes the product.
Domain Name System (DNS): The mechanism for tracking and regulating Internet domain
names and addresses.
Packet: A small unit of data sent over a computer network, such as the Internet; a larger
message is split into packets for transmission.
Security: This helps to prevent unauthorized users from illegally accessing certain data within
the database; it protects data and files.
Software: A set of logically related instructions given to the computer to perform some
specific tasks.
3.1 Methodology
System development methodologies are promoted as a means of improving the management and
control of the software development process, structuring and simplifying the process, and
standardizing the development process and product by specifying activities to be done and
techniques to be used. It is often tacitly assumed that the use of a system development
methodology will improve system development productivity and quality. One software
development methodology framework is not necessarily suitable for use by all projects. Each of
the available methodology frameworks is best suited to specific kinds of projects, based on
various technical, organizational, project and team considerations. These software development
frameworks are often bound to some kind of organization, which further develops, supports the
use, and promotes the methodology framework. The methodology framework is often defined in
include:
3.1.1 Prototyping
portions of the solutions to demonstrate functionality and make needed refinement before
developing the final solution. Software prototyping produces a ‘throw away’ solution that is
designed for the sole purpose of verifying user functionality and for demonstrating capability. It
is an excellent way for the development team to confirm understanding of the requirements and
ensure that the proposed solution is consistent with business expectation. It works well with web-
based development and can quickly help confirm page navigation and other user interaction
requirement.
i. Throwaway
ii. Evolutionary
iii. Incremental
Throwaway or rapid prototyping refers to the creation of a model that will eventually be
discarded rather than becoming part of the final delivered software. ... When this goal has been
achieved, the prototype model is 'thrown away', and the system is formally developed based on
development team first constructs a prototype. Note that evolutionary prototyping is similar to
incremental development in that parts of the system may be inspected or delivered to the
In an incremental prototyping model the product features are added into each of several
prototypes. Typically development starts with the external features and user interface, and then
Advantages of Prototyping
iv. Identifies any problems with the efficiency of earliest design, requirement analysis and
coding activities.
Disadvantages of Prototyping:
This methodology models a system as a group of interacting objects. Each object represents
some entity of interest in the system being modeled, and is characterized by its class, its state
(data elements) and its behavior. Various models can be created to show the static structure,
dynamic behavior, and run-time deployment of these collaborating objects. Object-oriented
analysis (OOA) applies object-modeling techniques to analyze the functional requirements for a
system; it focuses on what a system does. Object-oriented design (OOD) elaborates the analysis
model to produce an implementation specification. It focuses on how the system does things.
Benefits of Object-Oriented
assurance that the system will enjoy a longer life while having far smaller maintenance
costs. Because most of the processes within the system are encapsulated, the behaviors
ii. Real-World Modeling: Object-oriented systems tend to model the real world in a more
complete fashion than do traditional methods. Objects are organized into classes of
objects, and objects are associated with behaviors. The model is based on objects, rather
data attributes and characteristics of the class from which it was spawned. The new
object will also inherit the data and behaviors from all super classes in which it
participates. When a user creates a new type of a widget, the new object behaves
"wigitty", while having new behaviors which are defined to the system.
Demerits of Object-Oriented
systems are still unproved, and many bread-and-butter information systems applications
(i.e. payroll, accounting), may not benefit from the object-oriented approach.
religious in their fervor for object-oriented systems, remember that all the "HOOPLA" is
directed at the object-oriented approach to problem solving, and not to any specific
technology.
An expert system is a computer program that simulates an expert's thinking to solve a particular
problem. It is based on the principles of artificial intelligence, and its reasoning processes are
similar to normal human thinking. An expert system consists of a knowledge base, which
contains encoded knowledge, and an inference engine, which uses the knowledge base in
reasoning about a particular problem. Expert systems are usually built for a specific problem
domain and are a traditional application or subfield of artificial intelligence. Expert systems are
most valuable to organizations that have a high level of know-how, experience and expertise that
cannot be easily transferred to other members. They are designed to carry the intelligence and
information found in the intellect of experts and provide this knowledge to other members of
This is a Systematic approach to the Analysis and Design of Information System. SSADM
divides an application development project into modules, stages, steps and tasks and it provides a
In detail, SSADM sets out a cascade or waterfall view of system development in which there
is a series of steps, each of which leads to the next step. The SSADM stages are:
iv. Design
v. Implementation
vi. Maintenance
Benefits of SSADM
i. Timelines: Theoretically, SSADM allows one to plan, manage and control a project
ii. Usability: Within SSADM special emphasis is put on the analysis of user needs.
carried out. Both are tried to see if they are well suited to each other.
the project’s progress is taken very seriously, issues like business objectives and business
needs are considered while the project is being developed. This offers the possibility to
tailor the planning of the project to the actual requirements of the business.
iv. Effective use of skills: SSADM does not require very special skills and can easily be
taught to the staff. Normally, common modeling and diagramming tools are used.
Commercial CASE tools are also offered in order to be able to set up SSADM easily.
v. Better quality: SSADM reduces the error rate of IS by defining a certain quality level in
Disadvantages of SSADM
SSADM puts special emphasis on the analysis of the system and its documentation. This creates
the danger of over-analyzing, which can be very time-consuming and costly. Because various types
of description methods are used, consistency checks cannot be carried out. Especially with large
systems, the outline diagram can become very unclear, because all relevant data flows have to be
included.
Chosen Methodology
Based on the above explanations of some system development methodologies, the most suitable
methodology for the system under study is SSADM. This method gives the project
development team the opportunity to interact with the system users in order to find out their
mode of operations, the problem they encounter which will lead to the team decision of
A System is the collection of interrelated units, facts and information that are joined together to
accomplish a specific objective. It can also be seen as a set of interrelated components or parts
that interact to achieve a special/desired goal. Analysis is a process of separating a whole into its
System analysis is therefore the process of determining the requirements for a new system. Data is
collected about the present system, the data is then analyzed, and new requirements are determined.
The three tasks of this phase are gathering data, analyzing the data, and documenting the analysis.
i. Data Gathering: The prerequisite information is obtained in this phase. Sources
include observation, interviews, questionnaires and helpful documents, one
of which is the organization chart, which shows an organization's foundations and levels of
management.
ii. Analyzing data:- there are several tools for the analyzing of data which includes
iii. Documenting system analysis. To document and to communicate the finding of phase 2,
During the research work, data needed for the project was gathered from various sources. In
gathering and collecting necessary data and information needed for system analysis, two major
fact-finding techniques were used in this work and they are:
Primary Source
This refers to sources of original data, for which the researcher used
empirical approaches such as personal interviews and questionnaires.
i. Oral Interview: This is the act of obtaining information through face-to-face
conversation. The questions asked may or may not be outlined in advance. In the latter case, the
next question is usually based on the response of the interviewee.
ii. Observation: The researcher personally observes the operations and
activities going on in the data collection field. Data obtained this way are first-hand and
are very useful and important in the course of the study.
iii. Questionnaire: The questionnaire is a written form of interview. The questions are
prepared as a document and given to respondents. The researcher then collates the
responses after collection.
Secondary Source
The secondary data were obtained by the researcher from magazines, journals, newspapers,
library sources and Internet downloads. The data collected by these means are covered in
the literature review in chapter two of the project.
[Figure: diagram with the labels USER and SYSTEM (Signature-Based Malware Detection).]
processes or functions or grouped together or decomposed into multiple processes. There can be
physical DFD’s that represent the physical files and transactions, or they can be business DFD’s
PROCESS NOTATIONS
(Yourdon and Coad process notation; Gane and Sarson process notation)
Process (aka Activity, Function): A process transforms incoming data flow into outgoing data
flow. Processes transform or manipulate data. Each box has a unique number as identifier (top
left) and a unique name (an imperative statement, e.g. 'do this', in the main box area). The top
line is used for the location of, or the people responsible for, the process.
* Process (represented by a circle or rounded rectangle)
DATAFLOW NOTATIONS
Dataflows are pipelines through which packets of information flow. Label the arrows with the
name of the data that moves through them. Data flows depict data/information flowing to or from a
process. The arrows used to represent the flows must either start and/or end at a process box.
* Data Flow (represented by an arrow)
EXTERNAL ENTITY NOTATIONS
External Entity(s) (aka Sink, Source, Terminator): External entities are objects outside the
system, with which the system communicates. External entities are sources and destinations of
the system's inputs and outputs. External entities, also known as 'external source/recipients',
are things (e.g. people, machines, organisations etc.) which contribute data or information to
the system or which receive data/information from it.
* External Entity (represented by a square or oval, also called a 'Source/Sink')
DATASTORE NOTATIONS
(Yourdon and Coad datastore notation)
Datastores are repositories of data in the system. They are sometimes also referred to as files.
In the physical model, a datastore represents a file, table, etc. In the logical model, a
datastore is an object or entity. Data stores are locations where data is held temporarily or
permanently.
* Data Store (represented by two parallel lines, sometimes connected by a vertical)
The proposed system is a computerized Commodity Exchange Information System. This system
reduces or eliminates the chances of inaccurate information in commodity exchange. It enables
information to be accessed easily and in a timely manner from the database system.
3.8 High Level Model of the Proposed System
[Figure: menu hierarchy. Welcome, then Authorization, then Main Menu. Labels recoverable from the
diagram: New Daily Record, Completed Job, Retrieve Record, Print All, Print One, All Daily Record,
Manual, About, Exit.]
SYSTEM DESIGN
Design is the process of defining the architecture, components, modules, interfaces and data for a
system to satisfy specified requirements. It can also be seen as the act of building a proposed
system from the facts collected through system investigation. System implementation is the
activity of proceeding from a given design of a system to a working version of that system, or the
specific way in which some part of the system is made to fulfill its function.
[Figure: data flow diagram. Recoverable labels: process 0.1 Detection; Approve.]
The Data Flow Diagram below (fig. 4.2) presents an expanded version of fig. 4.1 above. It shows
the complete entities, processes, data flows and data stores in the proposed computerized
[Figure: expanded data flow diagram. Recoverable labels: Admin; user; process 0.0 (Commodity);
process 0.1 Detection; data store D1 Database; flows Filled Registration Form, Registered Data,
Data, Approve, Signature, Malware.]
4.3 Database Specification
The Database: This is a collection of related data that work together for ease of reference.
The database used in the new system is MS Access, a relational
database management system that was used to link the program to the database.
The database can also be migrated to MySQL, which can hold far larger tables than Access.
In the proposed system, many tools are used in planning, design and implementation.
These are:
1. Database table: This is a collection of related fields that can be referenced for a specific
purpose. The main table is the table holding the list of all registered students.
2. Query: This is a set of records conforming to selected criteria, drawn from a table.
Below are a few examples of the queries used in the proposed system.
b. List of all male students, also obtained from the main table.
3. Visual Basic .NET program: This is an object-oriented programming language which
supports the use of an enhanced graphical user interface to showcase the functionality of an
online admission system. This programming language was chosen because of its general
4. Data flow diagram: This is a tool used to represent the flow of information in the new
system.
5. Internet connection tool: The web browser that automatically pops up when the
program runs is an internet tool that was embedded to the visual basic environment to
Database Design
Field Name    Data Type    Field Size
Password      Text         30
Lastname      Text         30
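The two fields listed above can be expressed as a table definition. The sketch below uses Python's built-in sqlite3 module in place of MS Access so that it runs without extra software; the table name user_account and the sample row are assumptions for illustration. SQLite does not enforce a text length by itself, so the 30-character field size is modeled with CHECK constraints.

```python
import sqlite3


def create_user_table(conn: sqlite3.Connection) -> None:
    """Create a table with the two listed fields, each limited to 30 characters."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS user_account (
            Lastname TEXT NOT NULL CHECK (length(Lastname) <= 30),
            Password TEXT NOT NULL CHECK (length(Password) <= 30)
        )
        """
    )


conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
create_user_table(conn)

# Hypothetical sample row, inserted with a parameterized query.
conn.execute(
    "INSERT INTO user_account (Lastname, Password) VALUES (?, ?)",
    ("Okafor", "s3cret"),
)
rows = conn.execute("SELECT Lastname FROM user_account").fetchall()
print(rows)  # [('Okafor',)]

# A value longer than the field size is rejected by the CHECK constraint.
try:
    conn.execute(
        "INSERT INTO user_account (Lastname, Password) VALUES (?, ?)",
        ("x" * 31, "pw"),
    )
    too_long_rejected = False
except sqlite3.IntegrityError:
    too_long_rejected = True
print(too_long_rejected)  # True
```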
[Figure: program structure. Recoverable labels: Login Form, Main Form, Detection, User Account.]
The output of a program determines the input and procedure format. It is necessary to consider
what is required from a system before deciding on how to set about producing it. The system
analyst will need to consider content, format, frequency of documents to be produced. Reports
The program design involved some input forms in order to derive the required
output. The forms designed in this system are used to capture program inputs.
The forms include: login form, main form, file form, transaction form and report form.
Login Screen
PASSWORD CHARACTER 10
4.6 Algorithms
According to Aguboshim (2005), an algorithm is a step-by-step procedure organized into a
correct and logical sequence, suitable for solving a problem, that can be transferred to a computer.
i. Flowcharts
[Flowchart labels: File Document; Check for Correction; Error One; Corrected; Master File and
Transaction; Program Processing; Display on Monitor]
Fig 4.3 System Flowchart
[Flowchart: Start; Enter Password; Is password correct? (NO / YES); Register Exchange; Detection;
Save Record; Display Report; Stop]
Fig 4.2 Program Flowchart
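The control flow in the program flowchart can be sketched in code. This is a minimal illustration under stated assumptions, not the implemented Visual Basic .NET program: the function names, the detect callback and the sample files are hypothetical, the NO branch is assumed to return to the password prompt, and the "Register Exchange" step is omitted because the flowchart does not describe what it does.

```python
from typing import Callable, Dict, Iterable, List


def authenticate(attempts: Iterable[str], correct: str) -> bool:
    """'Is password correct?': accept as soon as one attempt matches.

    Assumption: the NO branch loops back to 'Enter Password'.
    """
    return any(attempt == correct for attempt in attempts)


def run(attempts: Iterable[str], correct: str,
        files: Dict[str, bytes], detect: Callable[[bytes], bool]) -> List[str]:
    """Follow the flowchart: password check, detection, save record, display report."""
    if not authenticate(attempts, correct):
        return ["access denied"]
    # Detection + Save Record: one (file name, verdict) record per file.
    records = [(name, detect(data)) for name, data in files.items()]
    # Display Report: one line per saved record.
    return [f"{name}: {'malware' if hit else 'clean'}" for name, hit in records]


# Hypothetical detector: flags any file containing the bytes DE AD.
def sample_detect(data: bytes) -> bool:
    return b"\xde\xad" in data


report = run(["bad", "pw"], "pw",
             {"a.exe": b"\x00\xde\xad", "b.txt": b"hi"}, sample_detect)
print(report)  # ['a.exe: malware', 'b.txt: clean']
```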
Data dictionary is a traditional and a separate entity understood to contain the description of
Data dictionary contains the list of all files in the database, the number of records in each file and
In this new system, the data dictionary for each type of data record stored includes:
Field Name    Data Type    Field Size
Password      Text         30
Lastname      Text         30
CHAPTER FIVE
SYSTEM IMPLEMENTATION AND DOCUMENTATION