EXPLORATORY DATA ANALYSIS
BASIC CONCEPTS AND INTERPRETATION
OUTLINE
DESCRIPTIVE STATISTICS
CORRELATIONS
HISTOGRAMS
EXPLORATORY DATA ANALYSIS
Exploratory data analysis is the initial analysis of the data, carried out before building actual models.
It is a good idea to study your data carefully first, using descriptive statistics, correlations and various graphs.
In this way, you can identify potential relationships in the data, locate errors and other data issues, and select the most suitable modeling technique.
Data preparation and initial analysis can be time-consuming, but there are a few quick approaches, which we consider in this presentation.
SUMMARY STATISTICS
We typically start by looking at descriptive statistics, which you can obtain from Data > Data Analysis > Descriptive Statistics.
We can select multiple variables for analysis in “Input Range” and then tick “Summary statistics”.
Then we can look at the mean, median, maximum, etc.
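For readers who want to cross-check the Excel output outside the spreadsheet, the same kind of summary can be sketched in Python. This is only an illustration: the `ages` values are made up, the `describe` helper is a hypothetical name, and only a handful of the statistics Excel reports are reproduced.

```python
import statistics

# Hypothetical sample values standing in for one column of the data set.
ages = [42, 41, 42, 39, 43, 44, 50, 29, 44, 27]

def describe(values):
    """Return a few of the summary statistics discussed above."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation
        "minimum": min(values),
        "maximum": max(values),
    }

summary = describe(ages)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Comparing the mean with the median is a quick first check: a large gap between the two usually points to skewed data or outliers worth a closer look.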
CORRELATION
Similarly, we can select Correlation from Data > Data Analysis to study potential relationships between variables.
We can also use Home > Conditional Formatting > Data Bars to highlight the strength of each correlation.
Age and Balance are positively correlated with "Exited" (churn), while Active Members and Number of Products are negatively correlated with it.
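The correlation table can also be approximated in a few lines of Python. The sketch below computes the Pearson coefficient by hand for every pair of columns. The column names and values are invented for illustration and are not taken from the data set in the slides; `pearson` and `correlation_matrix` are hypothetical helper names.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Correlate every column with every other column (including itself)."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical columns; the values are made up for this example.
columns = {
    "Age": [25, 32, 47, 51, 38, 60, 29, 44],
    "IsActiveMember": [1, 1, 0, 0, 1, 0, 1, 1],
    "Exited": [0, 0, 1, 1, 0, 1, 0, 1],
}

matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

In this made-up sample, "Age" correlates positively with "Exited" and "IsActiveMember" negatively, mirroring the direction of the relationships described above. A correlation only signals a linear association; it does not show that one variable causes the other.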
HISTOGRAM
A histogram shows the frequency distribution of a variable.
To construct a histogram, click on Data > Data Analysis > Histogram.
Then select the Input Range and tick Cumulative Percentage and Chart Output.
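The frequency and cumulative percentage columns can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `scores` values and `bin_edges` are made up, `histogram` is a hypothetical helper name, and each bin is assumed to count values greater than the previous edge and up to and including its own edge, with everything above the last edge going into "More".

```python
from bisect import bisect_left

def histogram(values, bin_edges):
    """Return (bin label, frequency, cumulative %) rows for the given bins."""
    if not values:
        raise ValueError("need at least one value")
    edges = sorted(bin_edges)
    counts = [0] * (len(edges) + 1)  # the extra slot is the "More" bin
    for v in values:
        # Index of the first edge that is >= v; len(edges) means "More".
        counts[bisect_left(edges, v)] += 1

    labels = [str(edge) for edge in edges] + ["More"]
    rows = []
    running = 0
    for label, count in zip(labels, counts):
        running += count
        rows.append((label, count, 100.0 * running / len(values)))
    return rows

# Hypothetical values and bins, chosen only for this example.
scores = [350, 420, 455, 500, 500, 610, 640, 700, 845, 850]
bin_edges = [400, 500, 600, 700, 800]

rows = histogram(scores, bin_edges)
for label, count, cumulative in rows:
    print(f"{label:>5}  {count:>3}  {cumulative:6.2f}%")
```

The cumulative percentage always ends at 100% in the last row, which is a handy sanity check that no value was dropped during binning.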
HISTOGRAM
[Chart: Histogram. Horizontal axis: Bin (in steps of 10, ending with "More"); left vertical axis: Frequency; right vertical axis: Cumulative %. Series: Frequency, Cumulative %.]
THANK YOU!
Front and back photo by Scott Graham on Unsplash