Data Mining

Organizations use data mining to analyze large amounts of data and discover useful patterns and relationships. Data mining involves collecting and processing data, then analyzing it to identify trends. This allows organizations to improve marketing and sales and to reduce costs. Data mining is used in fields such as banking, healthcare, and telecommunications to detect fraud, improve treatments, and design marketing campaigns.

Uploaded by

Rodwel Leo

Data Mining

Name of the student

Course
Tutor
Date

Data Mining

Introduction:
Organizations use data mining to transform raw data into meaningful
information. Data mining is the practice of examining vast amounts of data to identify trends
and patterns. Data mining tools identify relationships in the data based on the
variables that users request or supply. By using software to search for patterns in large
amounts of data, businesses can learn more about their customers. This
allows them to design more successful marketing campaigns, improve sales, and reduce
costs. Data mining requires effective data collection, storage, and computational capability.
Organizations can use data mining to learn what their customers are interested in or
would like to purchase, as well as to detect fraud and scan for
malware (Sumathi & Sivanandam, 2006).

Data mining is about exploring and evaluating large collections of information to uncover
relevant relationships and correlations. It may be used for marketing
strategy, credit risk management, fraud detection, spam email filtering,
and even determining user sentiment or opinion. Digital media
companies use data mining tools to turn their users into a source of revenue (Sumathi &
Sivanandam, 2006). This
application of data mining has recently been heavily criticised because consumers are frequently
unaware that their private details are being analysed, particularly when the analysis is used
to influence opinions.

The data mining process breaks down into five steps. First, organizations collect data
and load it into their data warehouses. Next, they store and manage the data, either on
in-house servers or in the cloud (Sumathi & Sivanandam, 2006). Business analysts, management teams, and
information technology professionals access the data and determine how they want to
organize it. Then, application software sorts the data based on the user's results, and finally,
the end-user presents the data in an easy-to-share format, such as a graph or table.

Literature review

Data mining, also known as knowledge discovery in databases, can be defined as the
process of analyzing large information repositories and of discovering implicit, but
potentially useful information (Han, Kamber, & Pei, 2011). Data mining has the capability to
uncover hidden relationships and to reveal unknown patterns and trends by digging into large
amounts of data (Sumathi & Sivanandam, 2006). The functions, or models, of data mining
can be categorized according to the task performed: association, classification, clustering, and
regression (Hui & Jha, 2000; Kao, Chang, & Lin, 2003; Nicholson, 2006b). Data mining
analysis is normally based on three techniques: classical statistics, artificial intelligence, and
machine learning (Girija & Srivatsa, 2006).

Classical statistics is mainly used for studying data and data relationships, as well as for
dealing with numeric data in large databases (Hand, 1998). Examples of classical
statistics include regression analysis, cluster analysis, and discriminant analysis. Artificial
intelligence (AI) applies “human-thought-like” processing to statistical problems (Girija &
Srivatsa, 2006). AI uses several techniques such as genetic algorithms, fuzzy logic, and
neural computing. Finally, machine learning is the combination of advanced statistical
methods and AI heuristics, used for data analysis and knowledge discovery (Kononenko &
Kukar, 2007). Machine learning uses several classes of techniques: neural networks,
symbolic learning, genetic algorithms, and swarm optimization.

Data mining benefits from these technologies but differs in the objective pursued:
extracting patterns, describing trends, and predicting behavior. In a typical data mining
process, raw data arrive in a variety of formats. These raw data are cleansed in order to
remove noise and duplicated and inconsistent data (Han et al., 2011). The cleansed data are
then transformed into appropriate formats that can be understood by other data mining tools,
and filtration and aggregation techniques are applied to the data in order to extract
summarized data. Interesting knowledge is then extracted from the transformed data. This
information is analyzed in order to identify the truly interesting patterns. Finally, the
knowledge is presented visually to the user. More detailed information regarding the data mining
process can be found in Han et al. (2011).
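
The cleansing, transformation, and aggregation steps described above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library; the record fields (`patron`, `item`, `days_on_loan`) and the sample values are hypothetical and are not taken from any of the cited studies.

```python
from collections import defaultdict

# Hypothetical raw loan records; field names and values are illustrative only.
raw_records = [
    {"patron": "A01", "item": "book", "days_on_loan": "14"},
    {"patron": "A01", "item": "book", "days_on_loan": "14"},  # exact duplicate
    {"patron": "B02", "item": "Book ", "days_on_loan": "7"},  # inconsistent label
    {"patron": "C03", "item": "dvd", "days_on_loan": "-3"},   # noise: negative duration
    {"patron": "C03", "item": "dvd", "days_on_loan": "5"},
]


def cleanse(records):
    """Drop exact duplicates and records with impossible values."""
    seen = set()
    cleaned = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue  # duplicate record
        seen.add(key)
        if int(rec["days_on_loan"]) < 0:
            continue  # a loan cannot last a negative number of days
        cleaned.append(rec)
    return cleaned


def transform(records):
    """Normalize labels and convert numeric strings to integers."""
    return [
        {
            "patron": rec["patron"],
            "item": rec["item"].strip().lower(),
            "days_on_loan": int(rec["days_on_loan"]),
        }
        for rec in records
    ]


def aggregate(records):
    """Summarize the data: average loan duration per item type."""
    durations = defaultdict(list)
    for rec in records:
        durations[rec["item"]].append(rec["days_on_loan"])
    return {item: sum(days) / len(days) for item, days in durations.items()}


summary = aggregate(transform(cleanse(raw_records)))
print(summary)
```

In a real project the pattern extraction and visualization steps would follow; they are left out here because they depend on the tool chosen.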

Data mining techniques are applied in a wide range of domains where large amounts
of data are available for the identification of unknown or hidden information. In this sense,
Girija and Srivatsa (2006) indicate that data mining techniques applied to the World Wide Web
are called web mining, those applied to text are called text mining, and those applied to
libraries are called bibliomining.
The term bibliomining, or data mining for libraries, was first used by Scott Nicholson and
Jeffrey Stanton (2003) to describe the combination of data warehousing, data mining, and
bibliometrics. The term is used for tracking patterns, behavior changes, and trends in library
system transactions. Although the concept is not new, the term bibliomining was created to
make it easier to search for the terms “library” and “data mining” in the context of libraries
rather than software libraries.

Interesting patterns are analyzed and visualized through reports. The mining process
is iterated until the resulting information has been verified and approved by key users such as
librarians and library managers (Shieh, 2010). The application of bibliomining tools is an
emerging trend that can be used to understand patterns of behavior among library users and
staff, and patterns of information resource use throughout the library (Nicholson & Stanton,
2006). Bibliomining is highly recommended as a source of useful and necessary information for
library management; it focuses on professional librarianship issues but
depends heavily on database expertise (Shieh, 2010). Bibliomining can also be used to provide
a comprehensive overview of the library workflow in order to monitor staff performance,
determine areas of deficiency, and predict future user requirements (Prakash, Chand, &
Gohel, 2004).

The resulting information makes it possible to perform scenario analysis of the
library system, in which the different situations that need to be taken into account during a
decision-making process are evaluated (Nicholson, 2006a). An additional application is to standardize
structures and reports in order to share data warehouses among groups of libraries, allowing
libraries to benchmark their information (Nicholson, 2006a). The aim of this study is to
investigate the extent to which academic libraries are using data mining tools in practice, and in
which library aspects librarians are implementing them.

Impact of Data Mining on the Field of Nursing:

In health care, data mining is becoming increasingly popular, if not increasingly
essential. Health care organizations generate heterogeneous medical data
every day, including payer and provider records, pharmaceutical information,
prescription information, doctors' notes, and clinical records. These
data can be used for clinical text mining, predictive modeling, survival
analysis, patient similarity analysis, and clustering, to improve treatment and reduce
waste. Association analysis, clustering, and outlier analysis can also be
applied in health care. Treatment record data can be mined to explore ways to cut costs and deliver
better medicine (Koh & Tan, 2005).

Data mining can also be used to identify and understand high-cost patients, and it can be
applied to the mass of data generated by millions of prescriptions, operations, and treatment
courses to identify unusual patterns and uncover fraud. Data mining can also improve
treatments: by continuously comparing symptoms, causes, and medicines, analysts
can identify which treatments are effective. Data mining is also used to study the
treatment of specific diseases and the side effects associated with treatments. Data
mining applications are used to find abnormal patterns in laboratory and physician
results, inappropriate prescriptions, and fraudulent medical claims (Koh & Tan, 2005).
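
As a rough illustration of how unusual patterns might be flagged in claims data, the sketch below applies a simple outlier test based on the median absolute deviation (the "modified z-score"). This is one possible technique chosen for illustration, not the method used in the cited source; the claim amounts are hypothetical, and the 3.5 cut-off is a commonly quoted rule of thumb rather than a clinical standard.

```python
from statistics import median


def flag_outliers(amounts, threshold=3.5):
    """Return the amounts whose modified z-score exceeds the threshold."""
    med = median(amounts)
    # Median absolute deviation: a spread measure that one extreme value cannot distort.
    mad = median([abs(x - med) for x in amounts])
    if mad == 0:
        return []  # no spread in the data, so nothing can be flagged this way
    return [x for x in amounts if 0.6745 * abs(x - med) / mad > threshold]


# Hypothetical claim amounts for the same procedure (illustrative values only).
claim_amounts = [120, 135, 128, 140, 131, 125, 980]

suspicious = flag_outliers(claim_amounts)
print(suspicious)
```

A flagged claim is only a candidate for review; deciding whether it is fraudulent still requires a human investigation.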

Implementations of Data Mining

Mobile service providers use data mining to design their marketing campaigns and to
keep customers from moving to other vendors. From a large amount of data such as billing
information, email, text messages, web data transmissions, and customer service records, data
mining tools can predict “churn”, identifying the customers who are looking to change
vendors. Based on these results, each customer is given a probability score. The mobile service
providers are
then able to offer incentives to customers who are at higher risk of churning. This
kind of mining is often used by major service providers such as broadband, phone, and gas
providers (Matillion 2020).
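
One way such a probability score could be produced is with logistic regression. The sketch below trains a very small model by gradient descent in plain Python. It is an assumption-laden illustration only: the two features (complaints per month and years as a customer), the six training examples, and the learning settings are all hypothetical, and real providers would use far more data and a tested library.

```python
import math

# Hypothetical training data: (complaints per month, years as a customer) -> churned (1) or stayed (0).
customers = [
    ((0, 5.0), 0),
    ((1, 4.0), 0),
    ((0, 3.0), 0),
    ((3, 1.0), 1),
    ((4, 0.5), 1),
    ((2, 1.5), 1),
]


def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train(data, learning_rate=0.1, epochs=2000):
    """Fit logistic regression weights with stochastic gradient descent."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            predicted = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
            error = predicted - label
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def churn_score(features, weights, bias):
    """Probability-like score that a customer with these features will churn."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


weights, bias = train(customers)
at_risk = churn_score((3, 1.0), weights, bias)   # many complaints, short tenure
loyal = churn_score((0, 4.0), weights, bias)     # no complaints, long tenure
print(round(at_risk, 3), round(loyal, 3))
```

Customers whose score is above a chosen cut-off, for example 0.5, would be the ones offered incentives.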

The IT team gains stronger data mining skills, and the return on investment can be measured.
Researchers leverage association analysis and clustering to provide insight into which
product combinations were purchased; this encourages customers to purchase related products
that they may have missed or overlooked. Users' behavior is monitored and analyzed
to find similarities and patterns in Web surfing behavior so that the Web can be more
successful in meeting user needs (Matillion 2020).
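
The idea behind association analysis of product combinations can be shown with two standard measures, support and confidence. The sketch below counts them for pairs of items in a handful of shopping baskets; the baskets themselves are hypothetical and serve only as an example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (illustrative data only).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]

# How many baskets contain each item, and each pair of items.
item_counts = Counter(item for basket in transactions for item in basket)
pair_counts = Counter(
    pair for basket in transactions for pair in combinations(sorted(basket), 2)
)


def support(item_a, item_b):
    """Share of all baskets that contain both items."""
    return pair_counts[tuple(sorted((item_a, item_b)))] / len(transactions)


def confidence(antecedent, consequent):
    """Share of baskets with the antecedent that also contain the consequent."""
    both = pair_counts[tuple(sorted((antecedent, consequent)))]
    return both / item_counts[antecedent]


print(support("bread", "butter"))     # 3 of 5 baskets
print(confidence("bread", "butter"))  # 3 of the 4 baskets that contain bread
```

A pair with high support and high confidence, such as bread and butter here, is a candidate for a "customers also bought" suggestion.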

Data mining detects outliers across vast amounts of data. Criminal data include
all the details of crimes that have occurred. Data mining studies the patterns and trends in
these data and
predicts future events with better accuracy. Agencies can find out which areas are more prone
to crime, how many police personnel should be deployed, which age groups should be
targeted, and which vehicle numbers should be scrutinized (Matillion 2020).

Advantages of Data Mining

Data mining benefits include:

• It helps companies gather reliable information.

• It’s an efficient, cost-effective solution compared to other data applications.

• It helps businesses make profitable production and operational adjustments.

• Data mining uses both new and legacy systems (Simplilearn 2021).

• It helps businesses make informed decisions.

• It helps detect credit risks and fraud.

• It helps data scientists analyze enormous amounts of data quickly.

• Data scientists can use the information to detect fraud, build risk models, and
improve product safety.

• It helps data scientists quickly initiate automated predictions of behaviors and
trends and discover hidden patterns (Simplilearn 2021).

Disadvantages of Data Mining

These are the major issues in data mining:

• Many data analytics tools are complex and challenging to use. Data scientists need
the right training to use the tools effectively. Data mining requires large databases,
making the process hard to manage.

• Different tools work with different types of data mining,
depending on the algorithms they employ. Thus, data analysts must be sure to
choose the correct tools (Mishal 2021).

• Data mining techniques are not infallible, so there’s always the risk that the
information isn’t entirely accurate. This obstacle is especially relevant if there’s a
lack of diversity in the dataset.

• Companies can potentially sell the customer data they have gleaned to other
businesses and organizations, raising privacy concerns (Mishal 2021).

Possible Future Directions:

Some of the key data mining trends for the future include:

1. Multimedia Data Mining

This is one of the newer methods, and it is gaining ground because of the growing
ability to capture useful data accurately. It involves extracting data
from different kinds of multimedia sources such as audio, text, hypertext,
video, and images, and converting the data into a numerical representation
in different formats. It can be used for clustering and classification, for performing
similarity checks, and for identifying associations (Data Entry 2021).

2. Distributed Data Mining

It involves mining huge amounts of information stored in different
company locations or at different organizations. Highly sophisticated
algorithms are used to extract data from the different locations and provide
proper insights and reports based on them.

3. Spatial and Geographic Data Mining

It includes extracting information from environmental, astronomical, and
geographical data, including images taken from outer space. This
type of data mining can reveal aspects such as distance and topology,
and it is mainly used in geographic information systems and other navigation
applications.

4. Time Series and Sequence Data Mining

The primary application of this type of data mining is the study of cyclical and
seasonal trends. It is also helpful in analyzing random events that occur
outside the normal series of events. This method is mainly used by retail
companies to assess customers' buying patterns and behaviors (Data
Entry 2021).
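
To make the idea of seasonal trends and out-of-series events concrete, the sketch below builds a simple seasonal profile from quarterly sales figures and flags any quarter that departs sharply from it. The sales numbers, the four-quarter season, and the 25% tolerance are hypothetical choices made for illustration; they do not come from the cited source.

```python
from statistics import median

SEASON_LENGTH = 4  # four quarters per year

# Hypothetical quarterly sales for three years (illustrative values only).
quarterly_sales = [100, 120, 90, 200, 105, 125, 95, 210, 110, 60, 100, 220]


def seasonal_profile(series, season_length):
    """Typical value for each position in the season (median across years)."""
    return [median(series[pos::season_length]) for pos in range(season_length)]


def unusual_points(series, season_length, tolerance=0.25):
    """Observations that differ from their seasonal norm by more than the tolerance."""
    profile = seasonal_profile(series, season_length)
    flagged = []
    for index, value in enumerate(series):
        typical = profile[index % season_length]
        if abs(value - typical) / typical > tolerance:
            flagged.append((index, value))
    return flagged


profile = seasonal_profile(quarterly_sales, SEASON_LENGTH)
events = unusual_points(quarterly_sales, SEASON_LENGTH)
print(profile)
print(events)
```

The profile shows the recurring fourth-quarter peak, while the flagged point is a quarter that falls well outside the usual pattern and would merit a closer look.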

Conclusion:

Data mining is used in diverse applications such as banking, marketing, healthcare,
telecom industries, and many other areas. Data mining techniques help companies
gain useful information and increase their profitability by making adjustments in processes
and operations. It is a fast process that helps businesses make decisions through analysis
of hidden patterns and trends. In health care, the insights mined from such data can prove
invaluable
in improving care delivery, early diagnosis, disease identification, and hospital staffing. There
is nearly limitless potential for leveraging data across the spectrums of patient care and
safety, as well as operational decision-making and academia.

References
Siguenza-Guzman, L., Saquicela, V., Avila-Ordoñez, E., Vandewalle, J., & Cattrysse, D.
(2015). Literature review of data mining applications in academic libraries. The Journal
of Academic Librarianship, 41, 499-510. doi:10.1016/j.acalib.2015.06.007.
https://www.researchgate.net/publication/280101455_Literature_Review_of_Data_Mining_Applications_in_Academic_Libraries
Koh, H. C., & Tan, G. (2005). Data mining applications in healthcare. Journal of Healthcare
Information Management: JHIM, 19(2), 64-72.
https://pubmed.ncbi.nlm.nih.gov/15869215/
Han, J., Pei, J., & Kamber, M. (2011). Data mining: Concepts and techniques. Elsevier.
https://www.sciencedirect.com/book/9780123814791/data-mining-concepts-and-techniques
Nicholson, J. K. (2006). Global systems biology, personalized medicine and molecular
epidemiology. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1682018/
Hand, D. J. (1998). Data mining: Statistics and more? The American Statistician, 52(2),
112-118. https://www.tandfonline.com/doi/abs/10.1080/00031305.1998.10480549
Prakash, K., Chand, P., & Gohel, U. (2004). Application of data mining in library and
information services.
https://www.researchgate.net/publication/265496914_Application_of_Data_Mining_in_Library_and_Information_Services
Kononenko, I., & Kukar, M. (2007). Machine learning and data mining. Horwood
Publishing. https://www.sciencedirect.com/book/9781904275213/machine-learning-and-data-mining
Yeh, J. R., Shieh, J. S., & Huang, N. E. (2010). Complementary ensemble empirical mode
decomposition: A novel noise enhanced data analysis method. Advances in Adaptive
Data Analysis, 2(02), 135-156.
https://www.worldscientific.com/doi/abs/10.1142/S1793536910000422
Girija, N., & Srivatsa, S. K. (2006). A research study: Using data mining in knowledge base
business strategies. Information Technology Journal, 5(3), 590-600.
https://www.semanticscholar.org/paper/A-Research-Study%3A-Using-Data-Mining-in-Knowledge-Girija-S.K.Srivatsa/9678aa65cb7d01c14b19d95ac4fae3b2d2433953
Sumathi, S., & Sivanandam, S. N. (2006). Introduction to data mining and its
applications (Vol. 29). Springer.
https://ir.inflibnet.ac.in:8443/ir/bitstream/1944/435/1/04Planner_22.pdf
Data Entry Services (2021). 5 important future trends in data mining. Retrieved from
https://www.flatworldsolutions.com/data-management/articles/data-mining-future-trends.php
Simplilearn (June 2021). What is data mining: Definition, benefits, applications, top
techniques, and more. Retrieved from https://www.simplilearn.com/what-is-data-ming
Matillion (June 2020). 5 real life applications of data mining and business intelligence.
Retrieved from
https://www.matillion.com/resources/blog/5-real-life-applications-of-data-mining-and-business-intelligence
Mishal Roomi (April 2021). 7 advantages and disadvantages of data mining | Limitations
& benefits of data mining. Retrieved from
https://www.hitechwhizz.com/2021/04/7-advantages-and-disadvantages-limitations-benefits-of-data-mining.html?m=1
