A Framework For Crime Prediction and Prevention in Nigeria Using Deep Learning

This document presents a seminar report on proposing a framework for crime prediction and prevention in Nigeria using deep learning. It discusses how rising crime rates are a major issue in Nigeria. The report proposes using a Long Short-Term Memory (LSTM) deep learning model to help law enforcement agencies better detect, prevent, and solve crimes. The framework would analyze past crime data to accurately predict future crimes and identify patterns, helping improve security across Nigeria. If successful, it could significantly reduce crime rates and make Nigeria safer.

Uploaded by

isima peter
Copyright
© © All Rights Reserved

A FRAMEWORK FOR CRIME PREDICTION AND PREVENTION IN NIGERIA USING DEEP LEARNING

BY

OGHU EMUGHEDI
(PG/MAS/19/229)

A SEMINAR REPORT PRESENTED TO THE DEPARTMENT OF COMPUTER SCIENCE, FEDERAL UNIVERSITY, LOKOJA

SUPERVISOR: DR. TAIWO KOLAJO

SEPTEMBER, 2021

ABSTRACT

The rising incidence of kidnapping and other heinous crimes has become prevalent and is a source of
concern to members of society, government at all levels, public and private organizations, and the
Western world. Researchers in the literature have used Convolutional Neural Networks (CNN),
Stacked Auto-Encoders and Recurrent Neural Networks, but Long Short-Term Memory (LSTM) was
found to perform better than other deep learning algorithms in predicting the occurrence of crime
incidents with respect to time. Previous work on LSTM shows a knowledge gap: the models could not
handle unstructured data, which would increase the volume of data available for crime prediction.
The larger the dataset, the more precise the prediction is expected to be, and the result can be
used to benchmark other crime prevention models. Predicting crimes accurately helps to improve
crime prevention and service delivery in the justice system. The sole purpose of this study is to
determine how deep learning can be used by law enforcement agencies or authorities to detect,
prevent, and solve crimes at a much more accurate and faster rate. In summary, the proposed
framework is based on Long Short-Term Memory (LSTM), an aspect of deep learning that can help the
police and other law enforcement agencies to fight crime and make Nigeria the safest place to live
and work in the world.

1.0 INTRODUCTION

1.1 Background to the study

The organized method of policing Nigeria has failed, and there is an urgent need to seek an
alternative approach to this life-threatening societal ill. Ameh et al. (2020) stated that the
rising incidence of kidnapping has become prevalent and is a source of concern to the government,
members of the public and the international community. Benjamin and Daniel (2014) stated that Boko
Haram kidnapped over 200 schoolgirls sitting for their final exams in the town of Chibok in Borno
State. Nnam and Otu (2015) noted that several studies on kidnapping in Nigeria trace the opening of
the floodgate of kidnapping activities to 12th January, 2006, when MEND (Movement for the
Emancipation of the Niger Delta) captured four (4) expatriate oil workers in Ekeremor Local
Government Area of Bayelsa State. They described the current pattern of kidnapping in contemporary
Nigeria as a basket case; little wonder, therefore, that the latest global ranking placed Nigeria
as the fourth most notorious kidnapping nation, where people can be easily kidnapped with impunity.
The activities of these men of the underworld have made the life of the average Nigerian
meaningless and unpredictable.

Big Data and Machine Learning have been used in the USA, the UK, India and other countries to bring
crime problems down to a manageable level (Manengadan et al., 2021; Bello-Orgaz, 2016). Chowdary et
al. (2020) stated that crime analysis and prevention is a systematic approach for identifying and
analyzing patterns and trends in crime. With the increasing advent of computerized systems, crime
data analysts can help law enforcement officers to speed up the process of solving crimes. Using
data mining, previously unknown and useful information can be extracted from unstructured data
(Werner, Yang & McConky, 2017).

1.2 Motivation for the study

The prevalence of crime in society is a bad omen for all citizens of the country. If the wave of
crime ravaging society is not checked, it can lead to chaos and anarchy and make Nigeria
ungovernable. Given this opportunity to carry out research, no issue was more pressing to me than
the security challenges that have become a hydra-headed problem for all our security agencies. A
deep learning model can help to predict and prevent crime.

1.3 Statement of Research Problem

Major crimes in Nigeria, including rape, kidnapping, murder, burglary, fraud, terrorism, robbery,
cyber-crime, bribery and corruption, and money laundering, have made the life of the average
Nigerian unbearable and precarious. These crimes occur in every part of Nigeria on an hourly,
daily, weekly, monthly and yearly basis (Oguntunde et al., 2018; Ameh et al., 2020). Automatic
crime prediction using deep learning algorithms has played an increasingly critical role in criminal
justice systems and crime prevention efforts. Previous work, such as Luo et al. (2017), shows that
deep learning can assist decision-making by processing very high volumes of data that are
difficult for human researchers to analyze efficiently (Khan et al., 2021). Seven types of deep
learning architecture are used for time series forecasting (TSF): multilayer perceptron, Elman
recurrent, long short-term memory, echo state, gated recurrent unit, convolutional, and temporal
convolutional networks. For all these models, the number of hyperparameters that have to be
configured is high compared to traditional machine-learning techniques. Therefore, the proper
tuning of these parameters is a complex task that requires considerable expertise (Lara-Benitez,
Carranza-Garcia and Riquelme, 2020). Haseeb et al. (2021), in Employing Deep Learning and Time
Series Analysis to Tackle the Accuracy and Robustness of the Forecasting Problem, leave scalability
analysis and the application of their proposed method to different datasets as further work.

1.4 Aim and Objectives of the Study

The aim of the study is to propose a framework for crime prediction and prevention in Nigeria using
deep learning.

The objectives are:

1. To identify the features necessary to develop a framework for crime prediction and
prevention in Nigeria.
2. To propose a framework for crime prediction and prevention in Nigeria.

1.5 Significance of the Study

(i) It will restore peace to the country.
(ii) It will increase job creation.
(iii) It will increase the flow of foreign direct investment into the country.
(iv) Tourist activity in Nigeria will boom.
(v) People will be willing to travel and live in any part of Nigeria without fear or threat to
their lives.
(vi) Nigerians will be very proud of their country.
(vii) People will be able to travel to their remote villages to visit family without fear of
kidnapping, robbery, etc.

2.0 LITERATURE REVIEW

This section presents a background on security issues in Nigeria and the research efforts made to
combat these security challenges.

2.1 Security Challenges

Over time, the security of the lives and property of Nigerian citizens and other residents of the
country has been seriously threatened by recurrent security challenges, which have collectively
undermined socio-economic development in the country (Nwokwu and Ogayi, 2021). Nigeria has faced
multiple crises since independence, including problems of citizenship, bad governance, ethnicity,
and economic and political instability. These negative factors have lingered in Nigeria and grow
worse every year. Militancy, Boko Haram, banditry and kidnapping are some of the causes of the
above-mentioned problems. An examination of the nexus between banditry and human security in
north-western Nigeria shows that the country's present social and political challenges are a
serious problem that requires all necessary solutions (Abdulyakeen Abdulrasheed, 2021).

2.2 Law Enforcement Agents

This section covers the efforts of law enforcement agencies that are working assiduously to see
that crimes committed in the country are nipped in the bud and that preventive measures are put in
place. Incorporating a crime prediction model into their work would help make Nigeria the most
secure place in the world.

The Nigeria Police Force is the principal law enforcement agency in Nigeria. It has staff deployed
across the 36 states of the country and the Federal Capital Territory (FCT). Its duty is derived
from the Police Act and Regulations, CAP 359, Laws of the Federation of Nigeria 1990. Section 4 of
the Act provides that “the police shall be employed for the prevention and detection of crime, the
apprehension of offenders, the preservation of law and order, the protection of life and property
and the due enforcement of all laws and regulations with which they are directly charged, and shall
perform such military duties within or without Nigeria as may be required of them by, or under the
authority of, this or any other Acts” (Saviour O. A, 2008).

The National Drug Law Enforcement Agency (NDLEA) is a Federal agency in Nigeria charged
with eliminating the growing, processing, manufacturing, selling, exporting, and trafficking of hard
drugs.

The Defence Intelligence Agency (DIA) is the primary military intelligence agency of Nigeria.


The DIA promotes Nigeria's Defence Policy, enhances military cooperation with other countries,
protects the lives of Nigerian citizens, and maintains the territorial integrity of Nigeria.

Nigerian Army (NA): Its duties are to guard areas of high importance; offer some emergency
services; gain the professional skills required for performing its duties; ensure that the army's
equipment and establishments are up to date and in good working condition; coordinate the
enforcement of immigration laws and customs; defend the country from external aggression; protect
the country’s borders; restore order when needed and in cases of insurrection; and perform any
other duties set out in an Act of the National Assembly or as directed.

The Nigerian Navy: The Nigerian Navy (NN) has its headquarters in Abuja and is the sea branch of
the Nigerian Armed Forces, protecting the country’s territorial waters. Its three operational
commands are in Lagos, Calabar, and Bayelsa.

The Nigerian Air Force: The Nigerian Air Force operates in the air and also provides support for
both the Army and the Navy. The primary roles of the Nigerian military are to maintain the
territorial integrity of Nigeria by defending the nation from external aggression on land, at sea
or in the air, and to aid other security and paramilitary outfits in tackling internal threats.
This is subject to the command of the President as approved by the National Assembly. The military
also carries out other functions assigned to it by the Federal Republic of Nigeria.

The State Security Service (SSS), self-styled as the Department of State Services (DSS): Its main
responsibilities are within the country and include counter-intelligence, internal security,
counter-terrorism, and surveillance, as well as investigating some other types of serious crimes
against the state. It is also charged with the protection of senior government officials,
particularly the President, Vice President, state governors, and visiting heads of state and
government with their respective families.

Nigeria Customs Service: Its core functions are preventing and suppressing smuggling, customs
clearance, collecting revenue (duties), and accounting for revenue.

Nigeria Security and Civil Defence Corps (NSCDC): The statutory function of the NSCDC is to protect
lives and property in conjunction with the Nigeria Police. One of the crucial functions of the
corps is to protect pipelines from vandalism. The agency is also involved in crisis resolution.

Correctional Centres (formerly called Nigerian Prisons): The seven functions of the Nigerian
prisons in managing inmates and associated processes are: social isolation and confinement;
enforcing repentance; punishing to serve as deterrence; protecting citizens from criminals;
enforcing reformation; producing suspects in court; and keeping convicts in lawful custody. The
Nigerian prison service was founded with the mandate of managing offenders and criminals in prison
custody. Convicted offenders are held in one prison yard and awaiting-trial offenders in another
until they are required by a court of law.

2.3 Crime Prediction

Crime influences people in many ways. Previous studies have shown a relationship between time and
crime incidence. Tsion Yidnekachew, Nigus, Muluken and Tagele (2020) examined the relationship
between time, crime incident types and locations using a neural network model for time series
data, the Long Short-Term Memory (LSTM) network. The collected data was pre-processed, analyzed and
tested with an LSTM recurrent neural network model, and the R-square score was used to test
accuracy. Their results show that applying an LSTM recurrent neural network (LSTM RNN) gives more
accurate predictions of crime incidence with respect to time. Predicting crimes accurately helps to
improve crime prevention and decision-making and to advance the justice system.
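
The R-square score mentioned above can be illustrated with a minimal sketch. It computes the
coefficient of determination, 1 - SS_res / SS_tot, in plain Python. The function name and the
incident counts are hypothetical and chosen only for illustration; they are not taken from the
cited study.

```python
def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: error left after using the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical monthly incident counts and model forecasts (illustrative only).
actual_counts = [12, 15, 11, 18, 20]
predicted_counts = [13, 14, 12, 17, 19]
score = r2_score(actual_counts, predicted_counts)
print(round(score, 3))
```

A score close to 1 means the predictions explain most of the variation in the observed counts; a
score near 0 means they do no better than predicting the mean.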

Haseeb et al. (2021), in Employing Deep Learning and Time Series Analysis to Tackle the Accuracy
and Robustness of the Forecasting Problem, leave scalability analysis and the application of their
proposed method to different datasets as further work.

2.3.1 Time Series Forecasting

A time series is a structured series of data points recorded at equally spaced points in time. Time
series analysis can be separated into two parts. The first part is to obtain the underlying
structure or pattern of the ordered data. The second part is to fit a model for future prediction.
Fitting is the most challenging part because it involves the mathematical calculations. Time series
can be used for univariate and multivariate analyses.
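
Before a model can be fitted to a univariate time series, the series is commonly reshaped into
input/target pairs with a sliding window. The sketch below shows one way to do this in plain
Python. The function name, the window size and the weekly counts are assumptions made for
illustration and are not part of the report.

```python
def make_windows(series, window_size):
    """Turn a univariate series into (inputs, target) pairs for one-step-ahead forecasting."""
    if window_size < 1 or window_size >= len(series):
        raise ValueError("window_size must be between 1 and len(series) - 1")
    samples = []
    for start in range(len(series) - window_size):
        inputs = series[start:start + window_size]   # the last `window_size` observations
        target = series[start + window_size]         # the value that follows them
        samples.append((inputs, target))
    return samples


# Hypothetical weekly crime counts (illustrative only).
weekly_counts = [5, 7, 6, 9, 8, 10]
windows = make_windows(weekly_counts, window_size=3)
for inputs, target in windows:
    print(inputs, "->", target)
```

Each pair can then be passed to a forecasting model such as an LSTM, which learns to map the
window of past values to the next value.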

2.4 Related Work

Deep learning (that is, machine learning via deep neural networks, or DNNs) has been applied to
image classification, speech recognition and neural machine translation, reporting increasing
levels of accuracy over the last years. In the era of big data, deep learning is also being
established as one of the pillars of scientific discovery, complementary to theory, experimentation
and scientific simulation. DNNs are thus envisioned as potential key technologies in areas as
diverse as quantum technologies, solid state lighting, nanoelectronics and nanomechanics, high
throughput screening of new materials, computer vision in microscopy, radiography and tomography,
and astrophysics simulation, to name only a few. However, the application of deep learning to
current data science still falls short of the volume of machine learning techniques leveraged by
companies such as Google, Baidu and Facebook in their daily business. A neural network (deep or
not) can be viewed as a generic algorithm which (semi-)automatically adapts itself (i.e., learns)
to solve a specific problem. In supervised neural networks (NNs), the adaptation occurs via an
off-line learning process (or training), which is then followed by the use of the NN to solve the
problem (or inference). In general, inference is an inexpensive process, which can often be
performed using low precision (e.g., fixed-point, integer or even binary arithmetic) on low-cost
hardware (Castello, Dolz, Quintana-Orti and Duato, 2019).

Lawrence and Natarajan (2015), in Machine Learning and Applications, used machine learning
algorithms to analyze crime data. The researchers implemented the Linear Regression, Additive
Regression, and Decision Stump algorithms using the same finite set of features on the Communities
and Crime Dataset. Overall, the linear regression algorithm performed the best among the three
selected algorithms. It was shown to be very effective and accurate in predicting the crime data
based on the training set input, and it could handle randomness in the test samples to a certain
extent (without incurring too much prediction error). Data mining has become a vital part of crime
detection and prevention. Even though the scope of this work was to prove how effective and
accurate machine learning algorithms can be at predicting violent crimes, there are other
applications of data mining in the realm of law enforcement, such as determining criminal "hot
spots", creating criminal profiles, and learning crime trends. Utilizing these applications of
data mining can be a long and tedious process for law enforcement officials who have to sift
through large volumes of data. However, the precision with which one could infer and create new
knowledge on how to slow down crime is well worth the effort for the safety and security of people.

Prabakaran and Shilpa (2018), in their Survey of Analysis of Crime Detection Techniques Using Data
Mining and Machine Learning, describe data mining as the field containing procedures for finding
designs or patterns in a huge dataset; it includes strategies at the convergence of machine
learning and database frameworks. It can be applied to various fields such as future healthcare,
market basket analysis, education, manufacturing engineering and crime investigation. Among these,
crime investigation is an interesting application that processes crime characteristics to help
society live better. The authors surveyed various data mining techniques used in this domain and
posited that the survey may be helpful in designing new strategies for crime prediction and
analysis.

Alkesh and Sarvanaguru (2018), in Crime Prediction and Analysis Using Machine Learning, state that
crime is one of the biggest and most dominant problems in our society and that its prevention is an
important task. A huge number of crimes are committed daily. This requires keeping track of all
crimes and maintaining a database of them that may be used for future reference. The current
problems are maintaining a proper crime dataset and analyzing this data to help in predicting and
solving crimes in future. The researchers' objective was to analyze a dataset consisting of
numerous crimes and to predict the type of crime that may happen in future depending on various
conditions. Machine learning and data science techniques were applied to the Chicago crime dataset
for crime prediction. The crime data was extracted from the official portal of the Chicago police.
It consists of crime information such as location description, type of crime, date, time, latitude
and longitude.

Maryam et al. (2018), in Crime Data Mining, Threat Analysis and Prediction, note that the continuous
increase in the number of on-line users and the emergence of a variety of new on-line business
models require more intelligent and proactive techniques for tackling cybercrime. In the past few
years, machine learning and data mining techniques have been applied to datasets derived from
different industries.

ToppiReddy et al. (2018), in Crime Prediction & Monitoring Framework Based on Spatial Analysis,
describe crime as a treacherous and common social problem faced worldwide. Crime affects the
quality of life, economic growth, and reputation of a nation. There has been an enormous increase
in the crime rate in the last few years. In order to reduce the crime rate, law enforcement needs
to take preventive measures. To secure society from crime, there is a need for advanced systems and
new approaches that improve crime analytics for protecting communities. Accurate real-time crime
prediction helps to reduce the crime rate but remains a challenging problem for the scientific
community, as crime occurrences depend on many complex factors. Various visualization techniques
and machine learning algorithms were adopted for predicting the crime distribution over an area.
In the first step, the raw datasets were processed and visualized based on the need.

Chowdary et al. (2020) stated that crime analysis and prevention is a systematic approach for
identifying and analyzing patterns and trends in crime. Their system can predict the type of crime
that has a high probability of occurring at a given location (in terms of latitude and longitude)
and date, and it can also visualize crime-prone areas. With the increasing advent of computerized
systems, crime data analysts can help law enforcement officers to speed up the process of solving
crimes.

Dhinakaran et al. (2020) stated that as technology has advanced in all aspects, the speed and
variety of crimes have increased proportionately. This poses a major threat to people. By examining
the available information, the authors aim to predict upcoming crimes with respect to location,
time and other factors.

Neil, Nandish and Manan (2021), in Crime forecasting: a machine learning and computer vision
approach to crime prediction and prevention, define crime as a deliberate act that can cause
physical or psychological harm, as well as property damage or loss, and can lead to punishment by
a state or other authority according to the severity of the crime. The number and forms of criminal
activities are increasing at an alarming rate, forcing agencies to develop efficient methods to
take preventive measures. In the current scenario of rapidly increasing crime, traditional
crime-solving techniques are unable to deliver results, being slow paced and less efficient. Thus,
finding ways to predict crime in detail before it occurs, or building a “machine” that can assist
police officers, would lift the burden on the police and help in preventing crimes. To achieve
this, the authors suggest including machine learning (ML) and computer vision algorithms and
techniques. They describe the results of certain cases where such approaches were used, which
motivated them to pursue further research in this field. The main reason for the change in crime
detection and prevention lies in the statistical observations made by the authorities before and
after using such techniques. The sole purpose of their study is to determine how a combination of
ML and computer vision can be used by law agencies or authorities to detect, prevent, and solve
crimes at a much more accurate and faster rate.

Manengadan et al. (2021) expressed that crimes are common social problems that can affect the
quality of life and even the economic growth of a country. Big Data Analytics (BDA) is used for
analyzing and identifying different crime patterns, their relations, and the trends within a large
amount of crime data. In their work, BDA is applied to criminal data, and data analysis is
conducted for the purpose of visualization. Big data analytics and visualization techniques were
utilized to analyze crime big data from different parts of India. All the states of India were
included in the analysis, visualization and prediction. The series of operations performed was data
collection, data pre-processing, visualization and trend prediction, for which an LSTM model was
used. The data covers different cases of crime across different years, including crimes against
women and children such as kidnapping, murder and rape. The predictive results show that the LSTM
performs better than other neural network models. Hence, the generated outcomes will help police
and law enforcement organizations to understand crime issues clearly, track activities, predict
similar incidents, and optimize the decision-making process.

Data preparation integrates and reduces the data and gives it predefined attributes. Crime
prediction is carried out by applying calculations to the series: its moving average, difference,
and auto-regression. The forecasting model gives 80% correct values and is therefore considered an
accurate model. This work supports the London police in decision-making against crime (Khawar and
Akhter, 2021).
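
The three calculations named above (moving average, difference and auto-regression) can be sketched
in plain Python. This is an illustration only: the cited work does not specify its exact
procedure, so the first-order autoregressive model AR(1), the function names and the sample counts
are all assumptions.

```python
def moving_average(series, window):
    """Mean of each run of `window` consecutive values."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]


def first_difference(series):
    """Change from one observation to the next; removes a linear trend."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    previous, following = series[:-1], series[1:]
    n = len(previous)
    mean_prev = sum(previous) / n
    mean_next = sum(following) / n
    covariance = sum((p - mean_prev) * (q - mean_next)
                     for p, q in zip(previous, following))
    variance = sum((p - mean_prev) ** 2 for p in previous)
    if variance == 0:
        raise ValueError("cannot fit AR(1) to a constant series")
    phi = covariance / variance
    c = mean_next - phi * mean_prev
    return c, phi


# Hypothetical monthly crime counts with a steady upward trend (illustrative only).
monthly_counts = [10, 12, 14, 16, 18]
smoothed = moving_average(monthly_counts, window=3)
changes = first_difference(monthly_counts)
c, phi = fit_ar1(monthly_counts)
next_month = c + phi * monthly_counts[-1]   # one-step-ahead forecast
print(smoothed, changes, next_month)
```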

Predictive policing and crime analytics with a spatiotemporal focus are receiving increasing
attention among a variety of scientific communities and are already being implemented as effective
policing tools (Ourania Kounadi, Alina, Ristea, Adelson, Araujo and Michael, 2020).

Table 2.4.1: Summary of Related Works

S/N | AUTHOR(S) | OBJECTIVE(S) | METHODOLOGY | OUTCOME(S) | LIMITATION
1 | Lawrence and Natarajan (2015) | Focus of this project is towards analyzing the crime patterns of the four violent crimes. | Linear Regression; Additive Regression; Decision Stump | Regression algorithms mentioned in the previous section were used for the predictions and comparison | Non-predictive (identifying variables) features could possibly hinder implementation
2 | Prabakaran and Shilpa (2018) | Helpful in designing new strategies for crime prediction and analysis. | General techniques used in fraud detection; Genetic algorithm; Hidden Markov Model (HMM); Naive Bayesian | Crime investigation is an interesting application to process crime characteristics | Designing new strategies for determining future crime prediction and analysis.
3 | Alkesh and Sarvanaguru (2018) | Train a model for prediction. Building the model will be done using better algorithm | Data collection; Data preprocessing; Feature selection; Building and training model; Prediction; Visualization | The model predicts the type of crime with accuracy of 0.789. | The current problem faced is the aspect of maintaining a proper dataset of crime
4 | ToppiReddy et al. (2018) | Visualizing techniques and machine learning algorithms are adopted for predicting the crime distribution over an area. Accurate real-time crime predictions help to reduce the crime rate but remains challenging problem | Data collection & preprocessing; Data visualization; Visualization of Crime Data Using Google Maps; Visualization of Exact Location of Crime with 3D View; Visualization based on type of Crime | Visualization techniques and classification algorithms that can be used for predicting the crimes | Future plan for applying other classification algorithms on the crime
5 | Dhinakaran et al. (2020) | Tendency to compare the placement of the particular hotspots with those found by the results | Data processing; Clustering; K-means agglomeration Algorithm | Crime knowledge is that the dynamic and rising analysis. | Combined techniques are needed to make a higher crime prediction
6 | Neil, Nandish and Manan (2021) | Ways to predict crime, in detail. | Data gathering; Data preprocessing; Choose the model; Train model; Test model; Tuning model; Prediction | Aim at assisting body of researchers to make crime prediction a reality and implement such advanced technology in real life | First, the correct and complete building of the whole system has to be done in the near future
7 | Manengadan et al. (2021) | Big data analytics and visualization techniques were utilized to analyze crime big data | Data Collection; Data Preprocessing; Narrative Visualization; Long Short Term Memory (LSTM) | LSTM, we found that LSTM performs better than other conventional neural network model | Future is deployment of Long Short Term Memory in security framework of all India cities.

2.5 Scalability Analysis of Deep Neural Network

2.5.1 Scalability

One of the main challenges in big data streaming analysis is scalability. Big data streams are
growing exponentially, much faster than computing resources: processors follow Moore’s law, but the
size of data is exploding. Therefore, research efforts should be geared towards developing scalable
frameworks and algorithms that accommodate the data stream computing mode, effective resource
allocation strategies and parallelization, in order to cope with the ever-growing size and
complexity of data (Taiwo K., Olawande and Ayodele 2019).

Open-source tools and technologies for scalable big data stream analysis include BlockMon, NISQL,
Spark Streaming, Apache, Kafka, Yahoo! S4, Apache Samza, Photon, Apache Aurora, MavEstream,
EsperTech, Redis, P-SPAROL, SAMOA, CQELS, ETALIS, XSEQ, Apache Kylin and Splunk Stream.

Deployment may be considered for predictable or consistent loads. But if workloads are mixed (i.e.
consistent flows or spikes), a combination of cloud and on-premise approaches may be considered, so
as to give room for easy integration of web-based services or software and access to critical
functions on the go.

2.6 Complementarity Analysis

This section presents a complementarity analysis of the crime datasets. The analysis checks how the
datasets complement each other by testing whether their records intersect. It uses the following
record attributes: date, the period of the day in which the crime occurred, the victim’s sex, the
crime type, and latitude and longitude. Because an address may be entered differently in different
systems, and each system may handle addresses differently, the same criminal record can appear with
different latitudes and longitudes. To determine whether records in the two datasets refer to the
same crime, latitude and longitude are therefore compared with the precision of one geographical
block. Combining the records of the two datasets gives a combined dataset, called the crime
dataset. Here, combination means taking the union of the records of the two datasets and removing
the duplicate records (Ursula, Castro, Marcos, Rodrigues and Wladmir, 2020).
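
The combination step can be sketched as follows. This is a minimal illustration, not the cited
authors' implementation: the field names are hypothetical, and rounding coordinates to three
decimal places (roughly 100 m) is only a crude stand-in for "the precision of one geographical
block", since two nearby points can still fall on different sides of a rounding boundary.

```python
def record_key(record, decimals=3):
    """Matching key: date, period of day, victim's sex, crime type and rounded coordinates."""
    return (
        record["date"],
        record["period"],
        record["victim_sex"],
        record["crime_type"],
        round(record["latitude"], decimals),    # assumption: 3 decimals ~ one block
        round(record["longitude"], decimals),
    )


def combine_datasets(first, second, decimals=3):
    """Union of two record lists with duplicates removed; also returns the overlap size."""
    combined = {}
    for record in list(first) + list(second):
        # Keep the first record seen for each key, drop later duplicates.
        combined.setdefault(record_key(record, decimals), record)
    keys_first = {record_key(r, decimals) for r in first}
    keys_second = {record_key(r, decimals) for r in second}
    return list(combined.values()), len(keys_first & keys_second)


# Hypothetical records from two sources (illustrative only).
dataset_a = [
    {"date": "2021-03-01", "period": "night", "victim_sex": "F", "crime_type": "robbery",
     "latitude": 6.52441, "longitude": 3.37921},
    {"date": "2021-03-02", "period": "morning", "victim_sex": "M", "crime_type": "burglary",
     "latitude": 9.0765, "longitude": 7.3986},
]
dataset_b = [
    # Same incident as the first record above, geocoded slightly differently.
    {"date": "2021-03-01", "period": "night", "victim_sex": "F", "crime_type": "robbery",
     "latitude": 6.52438, "longitude": 3.37889},
    {"date": "2021-03-05", "period": "evening", "victim_sex": "M", "crime_type": "kidnapping",
     "latitude": 12.0022, "longitude": 8.5920},
]
crime_dataset, shared = combine_datasets(dataset_a, dataset_b)
print(len(crime_dataset), shared)
```

A small overlap relative to the size of each dataset indicates that the sources complement each
other, because each contributes records the other lacks.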

3.0 METHODOLOGY

3.1 Description of the Proposed Model

This section describes the methods, materials, tools, techniques and technologies behind the crime
prediction and prevention framework shown in Figure 3.1, which is to be built from crime datasets.
The stages of crime dataset analysis are as follows:

3.1.1 Data Collection: Data for the research are expected to be collected from the crime records of
the Nigeria Police, the prisons (correctional) database, and the Twitter API (Application
Programming Interface), and saved in comma-separated values (CSV) files.

Featured Attributes: For each entry of crime incidents in the datasets, the following 13 featured
attributes are included: 1) IncidentNum - Case number of each incident; 2) Dates - Date and
timestamp of the crime incident; 3) Category - Type of the crime. This is the target/label that we
need to predict in the classification stage; 4) Descript - A brief note describing any pertinent
details of the crime; 5) DayOfWeek - Day of the week on which the crime occurred; 6) PdDistrict -
Police Department District ID to which the crime is assigned; 7) Resolution - How the crime incident
was resolved (with the perpetrator being, say, arrested or booked); 8) Address - The approximate
street address of the crime incident; 9) X - Longitude of the location of the crime; 10) Y -
Latitude of the location of the crime; 11) Coordinate - Pair of longitude and latitude; 12) Dome -
Whether the crime is domestic or not; 13) Arrest - Whether an arrest was made or not.

The following are some of the problems that can arise in data collection:

• Inaccurate data. The collected data could be unrelated to the problem statement.
• Missing data. Parts of the data could be missing. That could take the form of empty values in
columns or missing images for some class of prediction.
• Data imbalance. Some classes or categories in the data may have a disproportionately high
or low number of corresponding samples. As a result, they risk being under-represented in the
model.
• Data bias. Depending on how the data, subjects and labels themselves are chosen, the model
could propagate inherent biases on gender, politics, age or region, for example. Data bias is
difficult to detect and remove.

3.1.2 Data preprocessing: This is the process of preparing the raw data and making it suitable for
deep learning modelling. It involves removing null values using df = df.dropna(), where df is the
data frame. The categorical attributes (Location, Block, Crime Type, Community Area) are converted
into numeric values using Label Encoder. The date attribute is split into new attributes such as
month and hour, which can be used as features for the model. Preprocessing includes the following
steps:

(i) Feature Extraction: This is a type of dimensionality reduction in which a large number of
pixels of an image are represented efficiently, so that the interesting parts are captured
effectively. Dimensionality reduction techniques reduce high-dimensional data to
low-dimensional data. In this study, we apply the principal component analysis (PCA) method,
which provides a linear mapping based on an eigenvector search. PCA provides different
approaches to reduce the feature space dimensionality [35, 36].
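As an illustration of the eigenvector search behind PCA, the following sketch uses NumPy only. The feature matrix is made-up toy data and the function name pca is a hypothetical helper; this is not the implementation used in the study.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto the top eigenvectors of its covariance matrix."""
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)                # remove the mean of each feature
    cov = np.cov(centred, rowvar=False)         # feature-by-feature covariance
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1][:n_components]  # largest variance first
    explained = eigvals[order] / eigvals.sum()  # share of variance kept per component
    return centred @ eigvecs[:, order], explained

# Toy feature matrix: 5 records, 3 numeric features (illustrative values only).
features = [[2, 0, 1], [4, 1, 1], [6, 1, 2], [8, 2, 2], [10, 2, 3]]
reduced, explained = pca(features, n_components=2)
print(reduced.shape)
print(explained)
```

The explained-variance ratios give a simple way to choose how many components to keep.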

[Flowchart: data collection; data preprocessing (feature extraction, data cleaning); data
conversion in time series; check stationarity of time-series data; structured and unstructured
data; 80% training dataset and 20% testing dataset; crime prediction models (ARIMA, SES, HW,
RNN-LSTM); evaluate forecasting performance.]

Figure 3.1.0: Proposed Methodology

(ii) Data Cleaning: Data cleaning is the process of fixing or removing incorrect,
corrupted, incorrectly formatted, duplicate, or incomplete data within a dataset.
When combining multiple data sources, there are many opportunities for data to be
duplicated or mislabeled. If data is incorrect, outcomes and algorithms are
unreliable, even though they may look correct. There is no one absolute way to
prescribe the exact steps in the data cleaning process because the processes will
vary from dataset to dataset. But it is crucial to establish a template for your data
cleaning process so you know you are doing it the right way every time. The
following are steps in cleaning the data:
• Dropping unnecessary columns in a DataFrame
• Changing the index of a DataFrame
• Using .str methods to clean columns
• Using the DataFrame.applymap() function to clean the entire dataset, element-wise
• Renaming columns to a more recognizable set of labels
• Skipping unnecessary rows in a CSV file
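A few of these cleaning steps, together with the null removal, date splitting and label encoding described under data preprocessing, can be sketched with pandas as below. The sample records are invented, the function name is hypothetical, and pandas category codes are used here in place of scikit-learn's LabelEncoder to keep the example dependency-free.

```python
import pandas as pd

def clean_and_encode(df):
    """Apply a few cleaning and preprocessing steps to a crime table."""
    out = df.drop(columns=["Descript"], errors="ignore")        # drop an unneeded column
    out = out.rename(columns={"X": "Longitude", "Y": "Latitude"})  # clearer labels
    out["Category"] = out["Category"].str.strip().str.upper()   # .str methods to tidy text
    out = out.dropna().drop_duplicates()                        # remove nulls and duplicates
    out["Dates"] = pd.to_datetime(out["Dates"])
    out["Month"] = out["Dates"].dt.month                        # split the date attribute
    out["Hour"] = out["Dates"].dt.hour
    for col in ["Category", "PdDistrict"]:
        # Integer code per distinct label (stand-in for a label encoder).
        out[col + "_code"] = out[col].astype("category").cat.codes
    return out.reset_index(drop=True)

raw = pd.DataFrame({
    "Dates": ["2021-03-01 22:15:00", "2021-03-01 22:15:00", "2021-07-14 08:30:00", None],
    "Category": [" theft", "THEFT ", "Assault", "Robbery"],
    "Descript": ["a", "b", "c", "d"],
    "PdDistrict": ["Ikeja", "Ikeja", "Garki", "Garki"],
    "X": [3.35, 3.35, 7.49, 7.49],
    "Y": [6.60, 6.60, 9.03, 9.03],
})
clean = clean_and_encode(raw)
print(clean[["Category", "Month", "Hour", "Category_code"]])
```

Text is normalised before duplicates are dropped, so that " theft" and "THEFT " are recognised as the same record.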
3.2.1 Data conversion in time series:
(i) Data conversion is the process of translating data from one format to another. While the
concept itself may seem simple, data conversion is a critical step in the process of data
integration. This step enables the data to be read, altered, and executed in an application
or database other than that in which it was created.
(ii) A time series is a sequence of numbers that are ordered by a time index. This can be
thought of as a list or column of ordered values.
(iii) A supervised learning problem is comprised of input patterns (X) and output patterns (y),
such that an algorithm can learn how to predict the output patterns from the input
patterns.
(iv) A key function to help transform time series data into a supervised learning problem is
the Pandas shift() function.
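The use of shift() to frame a series as a supervised learning problem can be sketched as follows. The daily counts are invented for illustration, and series_to_supervised is a hypothetical helper name.

```python
import pandas as pd

def series_to_supervised(values, n_lags=1):
    """Build lagged input columns (X) and the current value (y) from a series."""
    s = pd.Series(values, dtype=float)
    # shift(k) moves the series down by k rows, so row t holds the value at t-k.
    frame = pd.DataFrame({f"t-{k}": s.shift(k) for k in range(n_lags, 0, -1)})
    frame["t"] = s
    # The first n_lags rows have no complete history and are dropped.
    return frame.dropna().reset_index(drop=True)

daily_counts = [12, 15, 11, 18, 20]
supervised = series_to_supervised(daily_counts, n_lags=2)
print(supervised)
```

Each row then pairs the two previous observations (inputs) with the current one (output).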

3.2.2 Stationarity of time series: A time series is stationary if it does not have trend or seasonal
effects. Summary statistics calculated on a stationary time series, such as the mean or the variance
of the observations, are consistent over time. When a time series is stationary, it can be easier to
model. Statistical modeling methods assume or require the time series to be stationary to be effective.
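A quick, informal way to look for non-stationarity is to compare summary statistics over different parts of the series, as in the sketch below. This is only a rough check on invented data; a formal unit-root test such as the augmented Dickey-Fuller test would normally be preferred.

```python
from statistics import mean, pvariance

def summary_by_half(series):
    """Mean and variance of the first and second halves of a series."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (mean(first), pvariance(first)), (mean(second), pvariance(second))

def difference(series, lag=1):
    """Differencing: subtract the observation `lag` steps earlier."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

trending = [10, 12, 14, 16, 18, 20, 22, 24]    # steadily rising counts
print(summary_by_half(trending))               # the means differ: a trend is present
print(summary_by_half(difference(trending)))   # after differencing the means agree
```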

3.2.3 Structured data: Structured data comprises clearly defined data types with patterns that make
them easily searchable.

3.2.4 Unstructured data analysis and information extraction: Unstructured data comprises data from
emails, social media feeds and crime-related calls. Semi-structured and unstructured crime datasets
have to be analysed to extract structured insights for improved crime prediction. Big data analytics
tools and technologies such as Hadoop and NoSQL databases are expected to be deployed for this
purpose.
3.2.5 Training data: This is the data used to train an algorithm or machine learning model to predict
the outcome the model is designed to predict. 80% of the data is usually taken as the training dataset.

3.2.6 Testing data: Test data provides a final, real-world check on an unseen dataset to confirm that
the machine learning algorithm was trained effectively. The test set is a set of observations used to
evaluate the performance of the model using some performance metric. It is important that no
observations from the training set are included in the test set. If the test set does contain examples
from the training set, it will be difficult to assess whether the algorithm has learned to generalize
from the training set or has simply memorized it.
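For time-ordered data, one simple way to keep the two sets separate is to split chronologically rather than at random, as sketched below. The function name and the sample list are illustrative assumptions.

```python
def chronological_split(records, train_fraction=0.8):
    """Split time-ordered records: earliest part for training, the rest for testing."""
    cut = int(len(records) * train_fraction)
    return records[:cut], records[cut:]

daily_counts = list(range(10))   # stand-in for ten time-ordered observations
train, test = chronological_split(daily_counts)
print(len(train), len(test))
```

Because the cut is made at a single point in time, no observation can appear in both sets, and the model is always evaluated on data that comes after what it was trained on.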

3.3 ARIMA (Auto Regressive Integrated Moving Average) Model


ARIMA is an acronym that stands for Auto Regressive Integrated Moving Average. It is a
generalization of the simpler Auto Regressive Moving Average and adds the notion of integration.
This acronym is descriptive, capturing the key aspects of the model itself. Briefly, they are:

• AR: Autoregression. A model that uses the dependent relationship between an observation
and some number of lagged observations.
• I: Integrated. The use of differencing of raw observations (e.g. subtracting an observation
from an observation at the previous time step) in order to make the time series stationary.
• MA: Moving Average. A model that uses the dependency between an observation and a
residual error from a moving average model applied to lagged observations.

Each of these components is explicitly specified in the model as a parameter. A standard notation
is used of ARIMA(p,d,q) where the parameters are substituted with integer values to quickly
indicate the specific ARIMA model being used.

The parameters of the ARIMA model are defined as follows:

• p: The number of lag observations included in the model, also called the lag order.
• d: The number of times that the raw observations are differenced, also called the degree of
differencing.
• q: The size of the moving average window, also called the order of moving average.

A linear regression model is constructed including the specified number and type of terms, and the
data is prepared by a degree of differencing in order to make it stationary, i.e. to remove trend and
seasonal structures that negatively affect the regression. The predicted value of the crime series Y
is a constant plus a weighted sum of one or more recent values of Y and of one or more recent values
of the forecast errors of Y.

In the crime model, lags of the stationary series are called autoregressive terms, lags of the
forecast errors are called moving average terms, and a series which needs to be differenced to be
made stationary is called integrated. The ARIMA model is constructed with (p, d, q), where p is the
number of autoregressive terms, d is the number of nonseasonal differences, and q is the number of
lagged forecast errors.
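To make the roles of d and p concrete, the following hand-rolled sketch fits an ARIMA(1,1,0) model: the series is differenced once (d = 1), an autoregression of order one (p = 1) is fitted to the differences by least squares, and no moving average term is used (q = 0). The one-step forecast is then y_hat(t) = y(t-1) + c + phi * (y(t-1) - y(t-2)). This is an illustration on invented data, not the fitting procedure of any particular library, and a full ARIMA implementation would also estimate the moving average terms.

```python
def difference(series):
    """First-order differencing (the 'I' step with d = 1)."""
    return [series[i] - series[i - 1] for i in range(1, len(series))]

def fit_ar1(z):
    """Least-squares fit of z[t] = c + phi * z[t-1] (the 'AR' step with p = 1).

    Assumes the lagged values are not all identical (otherwise phi is undefined).
    """
    x, y = z[:-1], z[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    return mean_y - phi * mean_x, phi

def forecast_arima_110(series, steps=1):
    """Forecast by predicting the next differences and adding them back up."""
    z = difference(series)
    c, phi = fit_ar1(z)
    level, last_diff = series[-1], z[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = c + phi * last_diff   # predicted next difference
        level += last_diff                # integrate back to the original scale
        forecasts.append(level)
    return forecasts

monthly_counts = [10, 14, 17, 19.5, 21.75, 23.875]   # invented series
print(forecast_arima_110(monthly_counts, steps=2))
```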

3.4 Simple Exponential Smoothing (SES)

Simple exponential smoothing is the simplest of the exponential smoothing methods and is suitable
for stationary series. It is a time-series forecasting approach for a single variable without trend
or seasonality. SES models are generally based on the assumption that the time series oscillates
around a constant level or changes slowly over time.

This method is therefore suitable for forecasting data that display no clear trend or seasonal
pattern.
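In its standard form, SES updates a single level as a weighted average of the newest observation and the previous level, and uses that level as the forecast. A minimal sketch follows; the smoothing parameter alpha and the sample series are illustrative choices.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead SES forecast for a series."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]                       # initialise with the first observation
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level                            # flat forecast for all future steps

weekly_counts = [10, 12, 11, 13]
print(simple_exponential_smoothing(weekly_counts, alpha=0.5))
```

A larger alpha makes the forecast follow recent observations more closely, while a smaller alpha gives a smoother, slower-moving level.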

3.5 Holt–Winters Exponential Smoothing Method

The Holt–Winters (HW) exponential smoothing method was designed in 1960 by extending the
exponential smoothing method. For the calculation of the prediction measures, all the data values
need to be in series. Unlike SES, HW does not require the series to be stationary; it is suitable
when the data show trend and seasonality.

It applies three exponential smoothing formulae and is therefore called triple exponential
smoothing. First, the level is smoothed to give a local average of the series. Second, the trend is
smoothed. Finally, the seasonal estimates are smoothed, separately for each season. The exponential
smoothing formulae apply to a series with trend and seasonal elements using the HW additive and
multiplicative methods. The additive method is applied when the seasonal variations are roughly
constant through the series. The multiplicative method is employed when the seasonal variations
change in proportion to the level of the series.
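The three smoothing steps of the additive method can be sketched as below. The initialisation from the first two seasons is one simple heuristic among several, and the parameter values and sample series are illustrative assumptions rather than settings from this study.

```python
def holt_winters_additive(series, season_len, alpha, beta, gamma, horizon):
    """Additive Holt-Winters: smooth level, trend and seasonal terms, then forecast."""
    m = season_len
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons to initialise")
    first_mean = sum(series[:m]) / m
    second_mean = sum(series[m:2 * m]) / m
    level = first_mean                                   # initial local average
    trend = (second_mean - first_mean) / m               # average change per step
    seasonal = [series[i] - first_mean for i in range(m)]  # initial seasonal offsets
    for t in range(m, len(series)):
        y, s = series[t], seasonal[t % m]
        prev_level = level
        level = alpha * (y - s) + (1 - alpha) * (prev_level + trend)   # 1) level
        trend = beta * (level - prev_level) + (1 - beta) * trend       # 2) trend
        seasonal[t % m] = gamma * (y - level) + (1 - gamma) * s        # 3) season
    n = len(series)
    return [level + h * trend + seasonal[(n + h - 1) % m]
            for h in range(1, horizon + 1)]

quarterly_counts = [5, 9, 7, 3] * 3   # invented series with a repeating pattern
print(holt_winters_additive(quarterly_counts, 4, alpha=0.5, beta=0.3, gamma=0.2, horizon=4))
```

The multiplicative variant replaces the seasonal additions and subtractions with multiplications and divisions.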

3.6 Recurrent Neural Network (RNN)

RNN is a type of artificial neural network (ANN) which has input, hidden, and output units.
Generally, information in the RNN model flows from the input layer to the hidden layers, and a
directional loop feeds the hidden state back into the network. This loop helps the network to
remember, when making a decision, both the input of the current node and what it has learned from
the inputs received previously. Using the previous samples in a sequence may help in understanding
the current sample. RNN can work well on time series because of its capability of remembering
previously received inputs using its internal memory, which can help the RNN to forecast accurately.
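The role of the internal memory can be illustrated with the forward pass of a basic (Elman-style) recurrent layer, in which the new hidden state depends on the current input and the previous hidden state. The weights below are arbitrary fixed numbers chosen for illustration, not trained values.

```python
import numpy as np

def rnn_forward(inputs, W_x, W_h, b):
    """Run a simple recurrent layer over a sequence and return all hidden states."""
    h = np.zeros(W_h.shape[0])              # internal memory starts empty
    states = []
    for x in inputs:
        # The new state mixes the current input with the previous state.
        h = np.tanh(W_x @ x + W_h @ h + b)
        states.append(h)
    return np.array(states)

W_x = np.array([[0.5], [-0.3]])             # input-to-hidden weights (2 hidden units)
W_h = np.array([[0.1, 0.2], [0.0, 0.4]])    # hidden-to-hidden (recurrent) weights
b = np.zeros(2)

seq_a = np.array([[1.0], [2.0], [3.0]])
seq_b = np.array([[0.0], [0.0], [3.0]])     # same final input, different history
print(rnn_forward(seq_a, W_x, W_h, b)[-1])
print(rnn_forward(seq_b, W_x, W_h, b)[-1])
```

The two sequences end with the same input but produce different final states, because the hidden state carries information about what came before.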

3.6.1 Long short-term memory (LSTM) networks are modified versions of the RNN that can handle
both short- and long-term dependencies, which makes it easier to remember previous data. LSTM
networks are trained using backpropagation through time, and their gated structure helps to overcome
the vanishing gradient problem. Traditional neural networks have neurons, while LSTM networks have
memory blocks connected through sequential layers. Each block contains gates that manage the
block's state and output. The gated formation of the LSTM network manages its memory state.
The use of neural networks reduces the need for extensive feature engineering and allows training
on large datasets.

The difference between LSTM and RNN is an internal cell state which is transmitted along with the
hidden state. The LSTM block receives the input sequence and then uses the gate activation units to
decide whether it becomes active. This action creates a state change and adds information that
conditionally passes through the block. Gates make blocks much more capable than classic neurons and
enable them to memorize recent sequences. The weights of the gates are learned during the training
phase. The gating functions control the input, retain the content in the internal state variables,
and handle the output, which makes the LSTM unit flexible. In LSTM cells, there are three types of
gates, i.e., input, forget, and output (Figure 3.1.1). Each LSTM unit has a cell which has a state
c_t at time t. The cell read/modify action is controlled using the input gate i_t, forget gate f_t,
and output gate o_t. At each time step, the LSTM unit receives input from two external sources (the
current input x_t and the previous hidden state h_{t-1}) at each of its four terminals, i.e., the
three gates and the input.

[Diagram of an LSTM unit: input x_t, input gate i_t, forget gate f_t, cell state c_t, output gate
o_t and output h_t.]

Figure 3.1.1: LSTM unit and its components.
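A single time step of a standard LSTM cell, with its three gates and candidate input, can be sketched in NumPy as follows. The stacked weight layout and the zero weights in the example are assumptions made for illustration; in practice the weights are learned during training.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step.

    W has shape (4 * hidden, hidden + input); its rows are stacked in the order
    input gate, forget gate, output gate, candidate input (an assumed layout).
    """
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b   # both external sources feed all four terminals
    i_t = sigmoid(z[:hidden])                   # input gate: how much new content to write
    f_t = sigmoid(z[hidden:2 * hidden])         # forget gate: how much old state to keep
    o_t = sigmoid(z[2 * hidden:3 * hidden])     # output gate: how much state to expose
    g_t = np.tanh(z[3 * hidden:])               # candidate cell content
    c_t = f_t * c_prev + i_t * g_t              # updated cell state
    h_t = o_t * np.tanh(c_t)                    # updated hidden state / output
    return h_t, c_t

hidden_size, input_size = 2, 1
W = np.zeros((4 * hidden_size, hidden_size + input_size))   # untrained placeholder weights
b = np.zeros(4 * hidden_size)
h_t, c_t = lstm_step(np.array([1.0]), np.zeros(hidden_size), np.array([2.0, -2.0]), W, b)
print(h_t, c_t)
```

With all weights at zero every gate is half open, so the cell state is simply halved, which makes the effect of the forget gate easy to see.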

The proposed method will incorporate past criminal activity records and model them with an
RNN-LSTM to predict the occurrence of crimes. The research approach consists of five phases.

Firstly, crime data are collected from the Nigerian police, correctional centres, the Twitter API,
etc. Time series data are used to extract context information. Secondly, the data undergo
pre-processing to generate more efficient data that can be used in predicting crime.

Thirdly, the LSTM RNN model is employed to predict crime incidences, after the training dataset has
undergone the data cleaning and data transformation processes. Previous experiments and analyses
observed that the LSTM model gives high-quality predictions, with high prediction accuracy and a
low error rate, showing the best generalization capability. The R-squared score is used to evaluate
the model. The peak points where crime can happen in a specific period of time are defined and
discussed with respect to crime locations.

LSTM can read, write, and delete information, or retain information in its memory. RNN is applied
along with LSTM to avoid the exploding gradient and vanishing gradient problems. RNN uses
short-term memory, whereas LSTM works like a gated cell whose gates are sigmoid functions ranging
from 0 to 1; this helps backpropagation and keeps the gradient steep, so the training is short and
the accuracy is high. RNN is used to handle the sequence-dependent variables of daily and monthly
violent crimes.

The normalization technique is applied to the data to make them uniform. LSTM for regression with
time steps is applied to the violent crime series, in which the previous time steps in the series
are taken as the input to forecast the output at the next time step. This is done by setting the
columns to be the time-step dimension and changing the feature dimension back to 1. In this method,
mapping is applied by finding the end of the data pattern, checking the limits of the sequence, and
gathering the input and output parts of the pattern. The model can be trained using the datasets
and can then generate the future values of a time series. As usual, the data are split into
training data and test data so that we can later assess how well the final model performs. We take
80% of the dataset as training data and 20% as test data.
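The normalization and windowing steps described above can be sketched as follows. Min-max scaling is assumed as the normalization technique (the text does not name one), the counts are invented, and the function names are hypothetical; the output arrays have the [samples, time steps, features] layout commonly expected by LSTM layers.

```python
import numpy as np

def min_max_scale(values):
    """Scale values to the range [0, 1] (assumes they are not all equal)."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    return (values - lo) / (hi - lo)

def split_sequence(sequence, n_steps):
    """Turn a series into input windows and next-step targets."""
    X, y = [], []
    for start in range(len(sequence)):
        end = start + n_steps              # find the end of this pattern
        if end > len(sequence) - 1:        # check the limits of the sequence
            break
        X.append(sequence[start:end])      # input part of the pattern
        y.append(sequence[end])            # output part of the pattern
    X = np.array(X, dtype=float)
    # Columns become the time-step dimension; the feature dimension is 1.
    return X.reshape(X.shape[0], n_steps, 1), np.array(y, dtype=float)

daily_violent_crimes = [10, 20, 30, 40, 50]
scaled = min_max_scale(daily_violent_crimes)
X, y = split_sequence(scaled, n_steps=3)
print(X.shape, y)
```

Predictions made on the scaled series need to be mapped back to the original units before they are reported.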

3.7 Evaluation of Model Accuracy

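The Root Mean Square Error (RMSE) is the standard deviation of the residuals (prediction errors).
It is commonly used in forecasting and regression analysis to verify an implemented model.

The Mean Absolute Percentage Error (MAPE) is a measure of the prediction accuracy of a forecasting
model.

Below are the mathematical formulae used in the analysis, given here in their standard form:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}$$

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right|$$

where n is the total number of days, y_t is the observed value on day t, and \hat{y}_t is the
predicted value for day t.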
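Under their standard definitions, the two measures can be computed as in the sketch below. The observed and predicted values are invented for illustration.

```python
import math

def rmse(actual, predicted):
    """Root mean square error over n paired observations."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - p) / a) for a, p in zip(actual, predicted))

observed = [10, 20, 40]    # daily crime counts
forecast = [12, 18, 40]    # model predictions for the same days
print(rmse(observed, forecast))
print(mape(observed, forecast))
```

RMSE is expressed in the units of the data and penalises large errors more heavily, while MAPE is scale-free but undefined for days on which the observed count is zero.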

4.0 CONCLUSION AND FUTURE WORKS

4.1 Conclusion

From this research, deep learning using different models can help predict crimes. This will aid
the police and other security agencies to become more proactive in enforcing the law, and so help
make Nigerian cities, towns, villages and borders among the safest places to live and work in the
world.

How to source these data, where the crime data banks in Nigeria are located, and how to access them
should be of concern to all scientists and other professionals alike. Data is the present-day gold
that advanced countries are using to solve the myriad societal problems that confront them on a
daily basis.

4.2 Future Works

The next stage of the research is to fully implement the proposed framework for the crime
prediction and prevention model with real-life datasets of selected Nigerian cities and townships.
The model will be evaluated to confirm its accuracy and performance before full deployment for
crime prediction and prevention.

REFERENCES

Abdulyakeen Abdulrasheed, (2021). Armed Banditry and Human Security in North Western
Nigeria: The Impacts and the Way Forward, Journal of Humanities Social and Management
Sciences Edwin Clark University Vol 1 NO 1, pp-1-19.

Ameh et al. (2020). Kidnapping in Nigeria: Dimensions, causes and consequences; International
Journal of Scientific Research and Reviews; JSRR Volume 9 , Issue(2), pp-135-155

Alkesh B. & Sarvanaguru R. K. (2018). Crime Prediction and Analysis Using Machine Learning;
International Research Journal of Engineering and Technology (IRJET) Volume: 05 Issue:
09 pp-1037-1042

Bello-Orgaz, (2016). Social big data: Recent achievements and new challenges, Journal of
Information fusion volume 28, page 45

Benjamin M. & Daniel A., (2014). Boko Haram Kidnaps Women and Young Girls in North-Eastern
Nigeria, pp-51-56

Bryan, L & Stefan, Z (2021). Time-series forecasting with deep learning: a survey, Philosophical
Transactions of the Royal Society, pp-1-14.

Castello, A., Dolz, M. F., Quintana-Orti, E. S., & Duato, J. (2019). Theoretical Scalability Analysis
of Distributed Deep Convolutional Neural Networks. 2019 19th IEEE/ACM International
Symposium on Cluster, Cloud and Grid Computing (CCGRID), pp-532-541

Chowdary et al.(2020). An Approach For Crime Analysis Using Clustering Algorithm A Project
Report, Department Of Computer Science& Engineering, (Affiliated To Andhra University),
Pp-1-52

Dhinakaran et al; (2020). Various approaches to analyzing Crime and Prediction using data
analytics; Journal of Xi'an University of Architecture & Technology, Volume XII, Issue V,
pp-142-152

Han et al; (2020). Risk Prediction of Theft Crimes in Urban Communities: An Integrated Model of
LSTM and ST-GCN, IEEE access volume 8, pp-217222-217230

Haseeb T., Muhammad K. H., Muhammad U. S., Sabeen B., Muhammad S. S., & Rozita J. O.,
(2021). Employing Deep Learning and Time Series Analysis to Tackle the Accuracy and
Robustness of the Forecasting Problem, Hindawi Security and Communication Networks
Volume 2021, pp-1-10.

Hicham et al; (2018). A crime prediction model based on spatial and temporal data. Periodicals of
Engineering and Natural Sciences Vol. 6, No. 2, pp.360-364

Ibidun, C. O. & Ademola P. A.,(2021). South Africa Crime Visualization, Trends Analysis, and
Prediction Using Machine Learning Linear Regression Technique, Applied Computational
Intelligence and Soft Computing, volume 2021, pp-1-14

Khan et al; (2021). Interpreting Criminal Charge Prediction and Its Algorithmic Bias via Quantum-
Inspired Complex Valued Networks, Proceedings of the 38th International Conference on
Machine Learning, PMLR 139, pp-1-5

Khawar I., & Akhter R; (2021). Forecasting Crime Using ARIMA Model, pp-1-14

Kim et al; (2018). Crime Analysis Through Machine Learning, conference paper.

Lara-Benitez, P., Carranza-Garcia, M., & Riquelme, J. C. (2020) An Experimental Review on Deep
Learning Architectures for Time Series Forecasting, International Journal of Neural Systems.
Pp-1-27

Lawrence M.& Natarajan M; (2015). Using Machine Learning Algorithms To Analyze Crime Data.
Machine Learning and Applications: An International Journal (MLAIJ) Vol.2, No.1, pp-1-
12

Luo et al (2017). Learning to predict charges for criminal cases with legal basis. In Proceedings of
the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 2727–2736

Maryam et al; (2018). Crime Data Mining, Threat Analysis and Prediction, pp-183-204

Mufeeda Manengadan et al.; (2021). Crime Data Analysis, Visualization and Prediction Using Long
Short Term Memory, LSTM, International Journal of Data Science and Analysis, Volume7,
Issue3 pages 51-59

Neil S; Nandish B.& Manan S; (2021). Crime forecasting: a machine learning and computer vision
approach to crime prediction and prevention; Visual Computing for Industry, Biomedicine,
and Art; Volume 4, Issue 9 pp- 1 of 14.

Nnam, M. U. & Otu, M. S. (2015). Predictors and Incidence of Kidnapping in Contemporary
Nigeria: A Socio-Criminological Analysis; International Journal of Recent Research in
Social Sciences and Humanities (IJRRSSH), Vol. 2, Issue 1, pp-38-43

Nwokwu, P. M. & Ogayi, G. O. (2021). Security Challenges As Threat To Socio-Economic
Development In Nigeria, African Journal Of Politics And Administrative Studies, Volume
14, Issue(1), pp-18-32

Ourania et al; (2020). A systematic review on spatial crime forecasting, Journal of BiomMedical
Centre, BMC article7, PP-1-15

Ourania Kounadi, Alina Ristea, Adelson Araujo Jr. & Michael L. (2020). A systematic review on
spatial crime forecasting, pp-1 of 22

Prabakaran, S. & Shilpa M; (2018). Survey of Analysis of Crime Detection Techniques Using Data
Mining and Machine Learning, Journal of Physics: Conference Series 1000, pp-1-10

Saviour O. Akpan, (2008). Enhancing the Effectiveness of the Nigerian Security Agencies Before,
During and After Elections in Nigeria - The Way out, Bassey Andah Journal, Volume 1,
page 119-132

Safat et al; (2021). Empirical Analysis for Crime Prediction and Forecasting Using Machine Learning and
Deep Learning Techniques, IEEE Access, volume 9, pp-70080-70094

Stefan S., Thanh B. B., Christian D., Alexander H., Ludwig W., Steven P., & Klaus-Robert M.,
(2021). Towards CRISP-ML(Q): A Machine Learning Process Model with Quality
Assurance Methodology; Machine Learning and Knowledge Extraction, Volume 3, pp-392-413

Sri et al; (2020). FBI crime analysis and prediction using machine learning, Journal of Engineering
Sciences, Vol 11, Issue 4, ISSN NO:0377-9254; pp-441-448

Taiwo K., Olawande D., and Ayodele (2019). Big data stream analysis: a systematic literature
review, Journal of Big Data, Volume 6, issue 47, pp-1 of 30.

ToppiReddy et al; (2018). Crime Prediction & Monitoring Framework Based on Spatial Analysis,
International Conference on Computational Intelligence and Data Science (ICCIDS),
volume 132, pp-696-705

Tsion E. M., Yidnekachew K. A., Nigus A. A., Muluken W. T., & Tagele B. M. (2020). Designing
Time Series Crime Prediction Model using Long Short-Term Memory Recurrent Neural
Network; International Journal of Recent Technology and Engineering (IJRTE) ISSN: 2277-
3878, Volume-9 Issue-4, pp-402-405.

Ursula R. M. Castro, Marcos W. Rodrigues and Wladmir C. Brandão (2020). Predicting
Crime by Exploiting Supervised Learning on Heterogeneous Data, pp-524-531.

Werner, G., Yang, S., & McConky, K. (2017). Time series forecasting of cyber attack intensity.
Proceedings of the 12th Annual Conference on Cyber and Information Security Research -
CISRC pp-1-3
