Module 1 Introduction
Reference: Jeffrey Strickland, Predictive Analytics Using R, Simulation
Educators, Colorado Springs, 2015.
Introduction to Analytics
• Analytics is the systematic computational analysis of data or statistics.
• It is used for the discovery, interpretation, and communication of
meaningful patterns in data.
• It also entails applying data patterns toward effective decision-making.
• Analytics relies on the simultaneous application of statistics, computer
programming, and operations research to quantify performance.
Analytics vs Analysis
• Analysis focuses on the process of examining past data through business
understanding, data understanding, data preparation, modeling and evaluation, and
deployment.
• It is a subset of data analytics, which combines multiple data analysis processes to focus on why
an event happened and what may happen in the future based on the previous data.
• Data analytics is used to formulate larger organizational decisions.
• Data analytics is a multidisciplinary field.
• It makes extensive use of computer skills, mathematics, and statistics.
• It uses descriptive techniques and predictive models to gain valuable knowledge from data.
• The term advanced analytics is increasingly used to describe the technical
aspects of analytics, especially in emerging fields such as the use of machine learning
techniques like neural networks, decision trees, logistic regression, linear and multiple regression
analysis, and classification to do predictive modeling.
• It also includes unsupervised machine learning techniques like cluster analysis, Principal
Component Analysis, segmentation profile analysis and association analysis.
Types of Data Analytics
Descriptive
• Aims to uncover valuable insight from the data
• Answers the question “What happened?”
Predictive
• Helps forecast behavior of people and markets
• Answers the question “What could happen?”
Prescriptive
• Suggests conclusions or actions that may be taken based on the analysis
• Answers the question “What should be done?”
Types of Data Analytics – another version
• Descriptive Analytics: what has happened?
• Diagnostic Analytics: why it happened?
• Predictive Analytics: what is likely to happen in the future?
• Prescriptive Analytics: what is the best course of action?
• Cognitive Analytics: applies human-like intelligence to drive results.
Predictive Analytics
• Data is a core strategic asset
• It encodes your business's collective experience
• It is imperative to:
• Learn from your data.
• Learn as much as possible about your customers.
• Learn how to treat each customer individually.
Predictive Analytics
• Predictive analytics is the branch of advanced analytics that is
used to make predictions about unknown future events.
• Predictive analytics encompasses a variety of statistical techniques
from modeling, machine learning, artificial intelligence and data
mining that analyze current and historical facts to make predictions
about future, or otherwise unknown, events.
• Predictive modeling is a process used in predictive analytics to create
a statistical model of future behavior.
• It is the area of data mining concerned with forecasting probabilities and trends.
Predictive Analytics
• Predictive Analytics is sometimes used synonymously with predictive
modeling.
• Predictive modeling is the process of developing a mathematical tool or
model that generates an accurate prediction.
Predictive Analytics
• Predictive analytics is the practice of extracting insights from an
existing data set with the help of data mining, statistical modeling and
machine learning techniques, and using those insights to predict
unobserved or unknown events. It involves:
• Identifying cause-effect relationships across the variables in the
historical data.
• Discovering hidden insights and patterns with the help of data mining
techniques.
• Applying observed patterns to unknowns in the past, present or future.
Predictive Analytics
• Some of the models used are:
• Forecasting
• Simulation
• Regression
• Classification
• Clustering
Examples of Predictive Analytics
• Finance: Forecasting Future Cash Flow
• Entertainment & Hospitality: Determining Staffing Needs
• Marketing: Behavioural Targeting
• Manufacturing: Preventing Malfunction
• Health Care: Early Detection of Allergic Reactions
Predictive Analytics - objective
• The predictive analytics process involves:
• collecting and cleaning massive amounts of data
• building predictive models using sophisticated predictive
algorithms and techniques.
Business Analytics
• Business Analytics (BA) refers to the skills, technologies, and practices for continuous iterative
exploration and investigation of past business performance to gain insight and drive
business planning.
• It makes extensive use of statistical analysis, including descriptive and predictive
modeling, and fact-based management to drive decision making.
• Analyzes historical data and gains new insight to improve strategic decision-making.
• Employs business intelligence and several methodologies such as data mining, statistical
analysis, and predictive analytics.
• Prescriptive modeling has also taken a role in BA.
• It is therefore closely related to management science and operations research.
Business Analytics
• Various business analytics types help analyze and transform data
into useful information, identifying and anticipating the current
trends and outcomes for smarter, data-driven business decisions.
• BA can answer questions like:
• Why is this happening?
• What if these trends continue?
• What will happen next (that is, predict)?
• What is the best that can happen (that is, optimize)?
Business Analytics
“At a time when companies in many industries offer similar products
and use comparable technology, high-performance business
processes are among the last remaining points of differentiation.”
- Tom Davenport
Business Analytics - Types
• Business analytics comprises descriptive, predictive and
prescriptive analytics.
• These are generally understood to be descriptive modeling, predictive
modeling, and prescriptive modeling.
• Descriptive analytics – understand the past.
• Predictive analytics – predict the future.
• Prescriptive analytics – recommend an action.
Descriptive Analytics
• Descriptive models quantify relationships in data in a way that is often used to
classify customers or prospects into groups.
• Unlike predictive models that focus on predicting a single customer behavior (such
as credit risk), descriptive models identify many different relationships between
customers or products.
• Descriptive analytics provides simple summaries about the sample audience and
about the observations that have been made. Such summaries may be either
quantitative, i.e. summary statistics, or visual, i.e. simple-to-understand graphs.
• These summaries may either form the basis of the initial description of the data as
part of a more extensive statistical analysis, or they may be sufficient in and of
themselves for a particular investigation.
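The quantitative summaries mentioned above can be sketched in a few lines. This is only an illustration: the `monthly_sales` figures and the `summarize` helper are hypothetical, and the example relies on Python's standard `statistics` module.

```python
import statistics

# Hypothetical monthly sales figures (illustrative data, not from the source).
monthly_sales = [120, 135, 128, 150, 142, 160]

def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

summary = summarize(monthly_sales)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A table like this is often the first description of a data set, before any predictive modeling is attempted.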
Descriptive Analysis
• Uses reports and visual displays to explain or understand past and
current business performance
• Contains statistical summaries of metrics such as sales and revenue
• Intended to provide an outline of trends in current and past
performance
• Answers the question: What happened?
Example:
• Summarizing past events, exchange of data, and social media usage
• Reporting general trends
Predictive analytics
• Predictive analytics encompasses a variety of statistical techniques from
modeling, machine learning, and data mining that analyze current and
historical facts to make predictions about future, or otherwise unknown,
events.
• Predictive models describe the relation between the specific performance
of a unit in a sample and one or more known attributes of that unit.
• The objective of the model is to assess the likelihood that a similar unit in
a different sample will exhibit the specific performance.
• This category encompasses models used in many areas, such as
marketing, where they seek out subtle data patterns to answer questions
about customer performance, or fraud detection.
Predictive Analysis
• Ability to predict future performance
• Detecting patterns or relationships in historical data
• Projecting these relationships into the future
• Using domain knowledge to construct a simplified representation
Example:
• Predicting customer preferences
• Detection of employee intentions
• Recommending products
• Predicting staff and resources
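Detecting a relationship in historical data and projecting it forward can be sketched with an ordinary least-squares trend line. The data and the `fit_line` helper below are hypothetical, and a straight-line trend is only one assumption among many possible model forms.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: sales observed in periods 1 to 6.
periods = [1, 2, 3, 4, 5, 6]
sales = [102, 110, 118, 131, 139, 150]

slope, intercept = fit_line(periods, sales)

# Project the fitted relationship one period into the future.
next_period = 7
forecast = intercept + slope * next_period
print(f"forecast for period {next_period}: {forecast:.1f}")
```

The forecast is only as good as the assumption that the historical trend continues, which is where domain knowledge comes in.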
Prescriptive analytics
• Prescriptive analytics not only anticipates what will happen and when it will
happen, but also why it happens.
• Prescriptive analytics suggests decision options on how to take advantage of
a future opportunity or mitigate a future risk and shows the implication of
each decision option.
• Prescriptive analytics can take in new data to re-predict and re-prescribe,
thus automatically improving prediction accuracy and prescribing better
decision options.
• Prescriptive analytics ingests hybrid data, a combination of structured
(numbers, categories) and unstructured data (videos, images, sounds, texts),
and business rules to predict what lies ahead and to prescribe how to take
advantage of this predicted future without compromising other priorities.
Prescriptive Analysis
• Recommend a choice of action from predictions of future
performance
• Optimum decision based on the need to maximize (or minimize)
some aspect of performance
• Many different scenarios can be tested until the optimality criterion is
met
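Testing scenarios to find the one that maximizes a performance measure can be sketched as a simple search over candidate prices. The demand curve, the unit cost and the function names here are all hypothetical; in practice the demand prediction would come from a fitted predictive model.

```python
def predicted_demand(price):
    """Hypothetical demand curve: units sold fall as the price rises."""
    return max(0.0, 1000.0 - 20.0 * price)

def profit(price, unit_cost=10.0):
    """Profit for one pricing scenario, using the predicted demand."""
    return (price - unit_cost) * predicted_demand(price)

# Scenarios to test: every whole-number price from 10 to 50.
candidate_prices = list(range(10, 51))

# Prescribe the scenario with the highest predicted profit.
best_price = max(candidate_prices, key=profit)
print(f"recommended price: {best_price}, predicted profit: {profit(best_price):.0f}")
```

Exhaustive search works for a handful of scenarios; larger problems usually call for optimization methods from operations research.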
Example of Prescriptive Analysis
• Tracking dynamic prices in manufacturing
• Improving equipment management
• Suggesting the best course of action
• Price modeling
• Evaluating rates of readmission
• Identifying testing
Models
Data driven vs Model driven approach
1. Data-driven modelling approach
• Aims to derive a description of behavior from observations of a
system so that it can describe how that system behaves (its output)
under different conditions or scenarios (its input).
• Generally, using more data (observations) to form the description
improves accuracy, hence the shift in analytics toward handling
big data.
• Machine learning uses a set of learning algorithms that can handle
large data sets.
Data driven vs Model driven approach
2. Model-driven modelling approach
• Aims to explain a system’s behavior not just from its inputs but
through a representation of the system’s internal structure.
• A real system is simplified into its essential elements (its processes) and
relationships between these elements (its structure).
• In addition to input data, information is required on the system's processes,
the function of these processes and the essential parts of the relationships
between these processes.
• Also called explanatory models as they represent the real system and
attempt to explain the behavior that occurs.
• Generally have far smaller data needs than data-driven models because of
the key role of the representation of structure.
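The contrast between the two approaches can be sketched on a toy system. Everything below is hypothetical: the observations, the nearest-observation averaging used as the data-driven predictor, and the assumed structure `output = rate * input` used as the model-driven predictor.

```python
# Observations of a hypothetical system: (input, output) pairs.
observations = [(0.0, 0.0), (1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

def data_driven_predict(x, data):
    """Predict by averaging the outputs of the two observations nearest to x.

    Uses no knowledge of how the system works, only what was observed.
    """
    nearest = sorted(data, key=lambda pair: abs(pair[0] - x))[:2]
    return sum(y for _, y in nearest) / len(nearest)

def model_driven_predict(x, rate=2.0):
    """Predict from an assumed internal structure: output = rate * input."""
    return rate * x

for x in (2.5, 10.0):
    print(x, data_driven_predict(x, observations), model_driven_predict(x))
```

Inside the observed range both predictors give similar answers. Far outside it, the data-driven predictor can only repeat what it has seen, while the model-driven one depends entirely on the assumed structure being right.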
Descriptive Models
• Descriptive models do not rank-order customers by their likelihood of
taking a particular action the way predictive models do.
• Instead, descriptive models can be used, for example, to categorize
customers by their product preferences and life stage.
• Descriptive modeling tools can be utilized to develop further models
that can simulate a large number of individualized agents and make
predictions.
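Categorizing customers by product preference and life stage can be sketched as a simple grouping. The customer records, the age cut-offs in `life_stage` and the field names are hypothetical; real segmentation often uses clustering rather than fixed rules.

```python
from collections import defaultdict

# Hypothetical customer records.
customers = [
    {"id": 1, "age": 24, "favorite_category": "electronics"},
    {"id": 2, "age": 41, "favorite_category": "home"},
    {"id": 3, "age": 27, "favorite_category": "electronics"},
    {"id": 4, "age": 67, "favorite_category": "travel"},
    {"id": 5, "age": 45, "favorite_category": "home"},
]

def life_stage(age):
    """Map an age to a coarse life stage (cut-offs are illustrative)."""
    if age < 30:
        return "young adult"
    if age < 60:
        return "midlife"
    return "senior"

def segment(records):
    """Group customer ids by (life stage, preferred product category)."""
    groups = defaultdict(list)
    for record in records:
        key = (life_stage(record["age"]), record["favorite_category"])
        groups[key].append(record["id"])
    return dict(groups)

segments = segment(customers)
for key, ids in segments.items():
    print(key, ids)
```

Note that no customer is ranked or scored here; the model only describes which group each customer falls into.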
Predictive Models
• Predictive models often perform calculations during live transactions,
for example, to evaluate the risk or opportunity of a given customer
or transaction, in order to guide a decision.
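Scoring a transaction as it happens can be sketched with a logistic scoring function. The coefficients, feature names and threshold below are hypothetical stand-ins for values that would be estimated offline from historical data.

```python
import math

# Hypothetical coefficients, as if estimated offline from historical data.
INTERCEPT = -4.0
WEIGHTS = {"amount_thousands": 0.8, "is_foreign": 1.5, "night_time": 0.7}

def risk_score(transaction):
    """Return a risk probability between 0 and 1 for one transaction."""
    z = INTERCEPT + sum(WEIGHTS[name] * transaction[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def decide(transaction, threshold=0.5):
    """Guide the decision: flag risky transactions for manual review."""
    return "review" if risk_score(transaction) >= threshold else "approve"

small_local = {"amount_thousands": 0.1, "is_foreign": 0, "night_time": 0}
large_foreign = {"amount_thousands": 5.0, "is_foreign": 1, "night_time": 1}

print(decide(small_local), round(risk_score(small_local), 3))
print(decide(large_foreign), round(risk_score(large_foreign), 3))
```

Because scoring is a single weighted sum and one exponential, it is cheap enough to run during the transaction itself.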
Decision Models
• Decision models describe the relationship between all the elements
of a decision—the known data (including results of predictive
models), the decision, and the forecast results of the decision—in
order to predict the results of decisions involving many variables.
• These models can be used in optimization, maximizing certain
outcomes while minimizing others.
• Decision models are generally used to develop decision logic or a set
of business rules that will produce the desired action for every
customer or circumstance.
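A decision model that combines a predictive model's output with the forecast result of each action can be sketched as an expected-value rule. The offer cost, the revenue figure and the function names are hypothetical.

```python
def expected_values(p_respond, offer_cost, revenue_if_respond):
    """Forecast the result of each possible decision for one customer."""
    return {
        "send_offer": p_respond * revenue_if_respond - offer_cost,
        "do_nothing": 0.0,
    }

def choose_action(p_respond, offer_cost=2.0, revenue_if_respond=40.0):
    """Decision logic: pick the action with the highest expected value.

    p_respond is the response probability from a predictive model.
    """
    values = expected_values(p_respond, offer_cost, revenue_if_respond)
    return max(values, key=values.get)

print(choose_action(0.10))  # likely responder
print(choose_action(0.02))  # unlikely responder
```

The same rule applied to every customer acts as a business rule: it produces an action for each case from the known data and the predicted probability.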
Business Analytics Applications
• Predictive analytics can be put to use in many applications; a few are:
• Analytical Customer Relationship Management (CRM)
• Clinical decision support systems
• Collection analytics
• Cross-sell
• Customer retention
• Direct marketing
• Fraud detection
• Portfolio, product or economy-level prediction
• Risk management
• Underwriting
• Credit Card Companies
• Finance
• Human Resources
• Manufacturing
Application in business
• Predictive models exploit patterns found in historical and
transactional data to identify risks and opportunities.
• Models capture relationships among many factors to allow
assessment of risk or potential associated with a particular set of
conditions, guiding decision making for candidate transactions.
• Used in actuarial science, marketing, financial services, insurance,
telecommunications, retail, travel, healthcare, pharmaceuticals and
other fields.
Applications of Predictive Analytics
Analytical customer relationship management (CRM)
- marketing campaigns, sales, and customer services
Clinical decision support systems
- to determine which patients are at risk of developing certain conditions
Collection analytics
- identifying customers who do not make their payments on time
Cross-sell
- selling additional products to current customers
Customer retention
- maintaining continuous consumer satisfaction
Applications of Predictive Analytics (contd)
Direct marketing
- to identify the most effective combination of product versions, marketing material,
communication channels and timing
Fraud detection
- fraudulent transactions, identity thefts and false insurance claims
Portfolio, product or economy-level prediction
- predicting store-level demand for inventory management purposes, predicting the unemployment
rate for the next year
Risk management
- predicts the best portfolio to maximize return
Underwriting
- predicting the chances of illness, default, bankruptcy
Analytical Techniques
The approaches and techniques used to conduct predictive analytics
can broadly be grouped into
• regression techniques
• machine learning techniques.
Regression techniques
• The focus lies on establishing a mathematical equation as a model to
represent the interactions between the different variables in
consideration.
• Types:
• Linear regression model
• Discrete choice models
• Logistic regression
• Multinomial logistic regression
• Probit regression
• Time series models
• Survival or duration analysis
• Classification and regression trees
• Multivariate adaptive regression splines
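As one instance of establishing an equation as a model, the sketch below fits a one-variable logistic regression by plain gradient descent on the log-loss. The data, learning rate and epoch count are hypothetical choices; statistical packages normally use more refined fitting methods.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(b0 + b1 * x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y  # gradient of log-loss w.r.t. z
            grad0 += error
            grad1 += error * x
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1

# Hypothetical data: x is months of inactivity, y is 1 if the customer left.
months_inactive = [1, 2, 3, 4, 5, 6, 7, 8]
left = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(months_inactive, left)
for x in (1, 8):
    print(f"x={x}: estimated probability {sigmoid(b0 + b1 * x):.2f}")
```

The fitted equation gives a probability for any value of x, which is what makes it usable for ranking or thresholding.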
Machine learning techniques
• Neural networks
• Multilayer Perceptron (MLP)
• Radial basis functions
• Support vector machines
• Naïve Bayes
• k-nearest neighbors
• Geospatial predictive modeling
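Of the techniques listed, k-nearest neighbors is the simplest to sketch: a new case is given the majority label of the k most similar known cases. The training points, labels and the `knn_predict` helper are hypothetical, and Euclidean distance is an assumed similarity measure.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify query by majority vote among its k nearest training points.

    train is a list of ((feature1, feature2), label) pairs.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    top_labels = [label for _, label in by_distance[:k]]
    return Counter(top_labels).most_common(1)[0][0]

# Hypothetical labelled customers: (support calls, months since purchase).
train = [
    ((1.0, 1.0), "stays"),
    ((1.5, 2.0), "stays"),
    ((2.0, 1.5), "stays"),
    ((6.0, 6.5), "churns"),
    ((7.0, 6.0), "churns"),
    ((6.5, 7.0), "churns"),
]

print(knn_predict(train, (1.2, 1.4)))
print(knn_predict(train, (6.4, 6.2)))
```

There is no fitted equation here; the training data itself is the model, which is typical of this family of techniques.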
Open source predictive analytic tools
• scikit-learn
• KNIME
• OpenNN
• Orange
• R
• RapidMiner
• Weka
• GNU Octave
• Apache Mahout
Commercial predictive analytic tools
• Alpine Data Labs
• BIRT Analytics
• Angoss KnowledgeSTUDIO
• IBM SPSS Statistics and IBM SPSS Modeler
• KXEN Modeler
• Mathematica
• MATLAB
• Minitab
• Oracle Data Mining (ODM)
• Pervasive
• Predixion Software
• Revolution Analytics
• SAP
• SAS and SAS Enterprise Miner
• STATA
• STATISTICA
• TIBCO
More Applications
Healthcare Domain - Improving Patient Outcomes
- patient demographics, patient vitals, past medication history,
visits to the hospital, lab test results, and any claims data.
Manufacturing - Predictive Maintenance
- maintenance data logs maintained by the technicians,
especially for older machines.
- For newer machines, data coming in from the different
sensors of the machine—including temperature, running time,
power level durations, and error messages
More Applications
• Finance - Predicting Late Payments
- company or individual demographics, products they purchased/used,
past payment history, customer support logs, and any recent adverse
events.
• Insurance - Preventing Fraud
- the location where the claim originated, time of day, claimant history,
claim amount, and even public data such as the National Fraud Database.
• Company – Customer churn prediction
- customer demographics, products purchased, product usage, customer
calls, time since last contact, past transaction history, industry, company
size, and revenue.
Other References
• https://en.wikipedia.org/wiki/Analytics