CREDIT CARD DATA ANALYSIS
1. SYNOPSIS
The aim of this project is to analyze the data and predict the type of customers who are likely to
keep their credit card services.
The whole project is divided into four major parts:
1. Reading data from the source
2. Data analysis using Pandas
3. Data visualization using Matplotlib
4. Exporting data to other formats
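The four parts listed above can be sketched end to end as follows. This is only a minimal illustration, not the project's actual code: the in-memory CSV text, the column names, and the output file names are all hypothetical stand-ins, and the non-interactive "Agg" backend is chosen so the sketch runs without a display.

```python
import io
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: render to files instead of opening a window
import matplotlib.pyplot as plt
import pandas as pd

# 1. Read: a hypothetical in-memory CSV stands in for the real data file
csv_text = "Gender,Card_Category\nM,Blue\nF,Blue\nF,Silver\nM,Blue\nF,Gold\n"
df = pd.read_csv(io.StringIO(csv_text))

# 2. Analyze: count the customers per gender, sorted by label
counts = df["Gender"].value_counts().sort_index()

# 3. Visualize: bar chart of the counts, saved as an image
fig, ax = plt.subplots()
ax.bar(list(counts.index), counts.values)
ax.set_xlabel("Gender")
ax.set_ylabel("Total Credit Card Users")
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, "gender_counts.png")
fig.savefig(png_path)
plt.close(fig)

# 4. Export: write the data back out in CSV format
csv_path = os.path.join(out_dir, "backup.csv")
df.to_csv(csv_path, index=False)
print(counts)
```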
2. USER MANUAL
2.1 HARDWARE REQUIREMENTS
CPU : Pentium(R) Dual-Core CPU, 3 GHz
Keyboard : 102 keys
Monitor : Generic PnP Monitor
RAM : 2.00 GB
Hard Disk : 500 GB
2.2 SOFTWARE REQUIREMENTS
Operating System : Windows
Language : Python
IDE : Python IDLE
Version : 3.7
Backend : MS Excel
3. ABOUT THE SOFTWARE
Python
Python is a dynamic, object-oriented programming language that can be used for many kinds of software
development. It offers strong support for integration with other languages and tools, comes with extensive
standard libraries, and its basic syntax can be picked up quickly. Many Python programmers report substantial
productivity gains and feel the language encourages the development of better code.
Python is the language used to build the Django framework. It is a dynamic scripting language similar to
Perl and Ruby. The principal author of Python is Guido van Rossum. Python supports dynamic typing and has a
garbage collector for automatic memory management. Another important feature of Python is dynamic name
resolution (late binding), which binds the names of functions and variables during execution.
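Dynamic typing and name binding at execution time can be illustrated with a few lines of plain Python. The names value, helper and caller below are purely illustrative and do not appear in the project.

```python
# Dynamic typing: the same name can be rebound to objects of different types.
value = 42
first_type = type(value).__name__    # type name of the integer object
value = "forty-two"
second_type = type(value).__name__   # type name of the string object


# Name binding during execution: `helper` is looked up each time
# caller() runs, not when caller() is defined.
def helper():
    return "original"


def caller():
    return helper()


before = caller()


def helper():  # rebinding the global name changes what caller() finds
    return "replacement"


after = caller()
print(first_type, second_type, before, after)
```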
What's Pandas for?
Pandas has so many uses that it might make sense to list the things it can't do instead of what it can do.
This tool is essentially your data’s home. Through pandas, you get acquainted with your data by cleaning,
transforming, and analyzing it.
For example, say you want to explore a dataset stored in a CSV on your computer. Pandas will extract the
data from that CSV into a DataFrame — a table, basically — then let you do things like:
- Calculate statistics and answer questions about the data, like:
    o What's the average, median, max, or min of each column?
    o Does column A correlate with column B?
    o What does the distribution of data in column C look like?
- Clean the data by doing things like removing missing values and filtering rows or columns by some
  criteria.
- Visualize the data with help from Matplotlib. Plot bars, lines, histograms, bubbles, and more.
- Store the cleaned, transformed data back into a CSV, another file or a database.
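The operations described above might look like this in practice. The sketch uses a small hypothetical dataset held in memory; the column names are assumptions chosen for illustration and need not match the project's CSV file.

```python
import io

import pandas as pd

# Hypothetical sample data; one Credit_Limit value is deliberately missing.
csv_text = """Customer_Age,Credit_Limit,Card_Category
45,12000,Blue
38,8000,Silver
52,,Blue
29,4000,Blue
"""
df = pd.read_csv(io.StringIO(csv_text))

# Statistics: average, median, max and min of one column
summary = df["Customer_Age"].agg(["mean", "median", "max", "min"])

# Correlation between two columns (rows with missing values are skipped)
corr = df["Customer_Age"].corr(df["Credit_Limit"])

# Cleaning: drop rows with missing values, then filter rows by a criterion
cleaned = df.dropna()
blue_only = cleaned[cleaned["Card_Category"] == "Blue"]

# Store the cleaned data back as CSV text (pass a file path to write a file)
csv_out = cleaned.to_csv(index=False)

print(summary)
print(blue_only)
```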
Install and import
Pandas is an easy package to install. Open up your terminal program (for Mac users) or
command line (for PC users) and install it using the following command:
pip install pandas
Since pandas is used so often, it is usually imported under a shorter name:
import pandas as pd
Matplotlib
Matplotlib is an amazing visualization library in Python for 2D plots of arrays. Matplotlib is a multi-
platform data visualization library built on NumPy arrays and designed to work with the broader SciPy stack. It
was introduced by John Hunter in the year 2002.
One of the greatest benefits of visualization is that it presents huge amounts of data in
easily digestible visuals. Matplotlib supports several plot types, such as line, bar, scatter and histogram.
matplotlib.pyplot is a collection of command style functions that make matplotlib work like MATLAB.
Each pyplot function makes some change to a figure: e.g., creates a figure, creates a plotting area in a figure, plots
some lines in a plotting area, decorates the plot with labels, etc.
In matplotlib.pyplot various states are preserved across function calls, so that it keeps track of things like the
current figure and plotting area, and the plotting functions are directed to the current axes (please note that "axes"
here and in most places in the documentation refers to the axes part of a figure and not the strict mathematical
term for more than one axis).
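The way pyplot keeps track of the current figure and axes can be seen in a short sketch. The data values and labels are made up for illustration, and the non-interactive "Agg" backend is an assumption so that the example runs without opening a window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display is needed for this sketch
import matplotlib.pyplot as plt

plt.figure()                     # creates a figure and makes it the current one
plt.plot([1, 2, 3], [2, 4, 8])   # plots on the current axes (created on demand)
plt.xlabel("x")                  # decorates the same current axes
plt.ylabel("y")
plt.title("Implicit state demo")

ax = plt.gca()                   # the axes that the calls above acted on
fig = plt.gcf()                  # the figure that contains those axes
same_figure = ax.figure is fig
n_lines = len(ax.lines)
x_label = ax.get_xlabel()
title = ax.get_title()
plt.close(fig)
print(same_figure, n_lines, x_label, title)
```

None of the plotting calls names a figure or axes explicitly; pyplot directs each of them to whichever axes is current at that moment.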
Install and import
python -m pip install -U pip
python -m pip install -U matplotlib
import matplotlib.pyplot as plt
4. MODULES
MODULES USED:
1. Python Pandas for data analysis and various other purposes
2. Matplotlib for data visualization
Functions in teacher.py
S.No.  Function               Used for
1      introduction()         Displays information about the project
2      made_by()              Displays who made the project
3      read_csv_file()        Reads data from the CSV file
4      clear()                Clears the output screen
5      data_analysis_menu()   Displays and analyzes the content of the CSV file
6      graph()                Draws line, bar, scatter and pie graphs of credit card users
7      export_menu()          Displays the export menu
8      main_menu()            Displays the main menu
5. LIMITATIONS
Limitations of CREDIT CARD DATA ANALYSIS:
- Reports are not generated
- Product codes are not generated automatically
- No online support
- It supports only a single organization
- An authentication phase has not been implemented
7. SOURCE CODE
import pandas as pd
import time
import sqlalchemy
import matplotlib.pyplot as plt
df = pd.DataFrame()
csv_file = "d:/sathya/BankChurners.csv"
def introduction():
    msg = '''
    A manager at the bank is disturbed with more and more customers leaving their credit card services.
    They would really appreciate if one could predict for them who is going to churn so they can
    proactively go to the customer to provide them better services and turn customers' decisions in
    the opposite direction.
    I got this dataset from a website with the URL as https://leaps.analyttica.com/home. I have been
    using this for a while to get datasets and accordingly work on them to produce fruitful results.
    The site explains how to solve a particular business problem.
    Now, this dataset consists of 10,000 customers mentioning their age, salary, marital_status,
    credit card limit, credit card category, etc. There are nearly 18 features.
    We have only 16.07 % of customers who have churned. Thus, it's a bit difficult to train our model
    to predict churning customers.
    In this project we are going to analyse the same dataset using Python Pandas on a Windows machine, but
    the project can be run on any machine that supports Python and Pandas. Besides pandas we also used
    the matplotlib Python module for visualization of this dataset.
    The whole project is divided into four major parts, i.e. reading, analysis, visualization and export.
    All these parts are further divided into menus for easy navigation.
    NOTE: Python is case-SENSITIVE so type exact Column Name wherever required.
    If you have any query or suggestions please contact me at [email protected] \n\n\n\n'''
    # Print the message one character at a time for a typewriter effect
    for x in msg:
        print(x, end='')
        time.sleep(0.002)
    wait = input('Press any key to continue.....')
def made_by():
    msg = '''
    Credit Card Analysis made by : muthamil
    Roll No : 1234
    School Name : Sathya Vidyalaya cbse
    session : 2020-21
    Thanks for evaluating my Project.
    \n\n\n
    '''
    # Print the message one character at a time for a typewriter effect
    for x in msg:
        print(x, end='')
        time.sleep(0.002)
    wait = input('Press any key to continue.....')
def read_csv_file():
    df = pd.read_csv(csv_file)
    print(df)


# name of function : clear
# purpose : clear output screen
def clear():
    for x in range(65):
        print()
def data_analysis_menu():
    df = pd.read_csv(csv_file)
    while True:
        clear()
        print('\n\nD A T A   A N A L Y S I S   M E N U ')
        print('_'*100, '\n')
        print('1. Show Whole DataFrame')
        print('2. Show Columns')
        print('3. Show Top Rows')
        print('4. Show Bottom Rows')
        print('5. Show Specific Column')
        print('6. Add a New Record')
        print('7. Add a New Column')
        print('8. Delete a Column')
        print('9. Delete a Record')
        print('10. Card Type User')
        print('11. Gender wise User')
        print('12. Data Summary')
        print('13. Exit (Move to main menu)')
        ch = int(input('\n\nEnter your choice:'))
        if ch == 1:
            print(df)
            wait = input('\n\n\n Press any key to continue.....')
        elif ch == 2:
            print(df.columns)
            wait = input('\n\n\n Press any key to continue.....')
        elif ch == 3:
            n = int(input('Enter Total rows you want to show :'))
            print(df.head(n))
            wait = input('\n\n\n Press any key to continue.....')
        elif ch == 4:
            n = int(input('Enter Total rows you want to show :'))
            print(df.tail(n))
            wait = input('\n\n\n Press any key to continue.....')
        elif ch == 5:
            print(df.columns)
            col_name = input('Enter Column Name that You want to print : ')
            print(df[col_name])
            wait = input('\n\n\n Press any key to continue.....')
        elif ch == 6:
            a = input('Enter Customer ID :')
            b = input('Enter Customer Type :')
            c = input('Enter Customer Age :')
            d = input('Enter Customer Gender :')
            e = input('Enter Customer Dependent Count :')
            f = input('Enter Education Level :')
            g = input('Enter Marital Status :')
            h = input('Enter Income Category :')
            i = input('Enter Card Category :')
            j = input('Enter Month on Book :')
            k = input('Enter Total Relationship count :')
            l = input('Enter Total Month Inactive in last 12 month :')
            m = input('Enter Total Contacted in last 12 months :')
            n = input('Enter Credit Limit :')
            o = input('Enter Revolving Balance :')
            p = input('Enter Average Open to Buy Card :')
            q = input('Enter Total amount change Q4 to Q1 :')
            r = input('Enter Total Transaction amount :')
            s = input('Enter Total Transaction Credit :')
            t = input('Enter Total Credit Change Q4 Q1 :')
            u = input('Enter Average Utilization Ratio :')
            # NOTE: the keys must match the column names of the CSV file exactly;
            # any key that does not match is added as a new column
            data = {'clientID': a, 'Type': b, 'age': c,
                    'gender': d, 'Dependent_count': e, 'Educational_Level': f, 'Marital_Status': g,
                    'Income_Category': h, 'Card_Category': i, 'Months_on_book': j,
                    'Total_Relationship_count': k, 'Month_Inactive_12_month': l,
                    'Contacts_count_12_mon': m, 'Credit_Limit': n,
                    'Total_Revolving_Bal': o, 'Avg_Open_To_Buy': p, 'Total_Amt_chng_Q4_Q1': q,
                    'Total_Trans_Amt': r, 'Total_Trans_Ct': s, 'Total_Ct_Chng_Q4_Q1': t,
                    'Average_Utilization_Ration': u}
            # DataFrame.append() was removed in pandas 2.0; pd.concat() works in old and new versions
            df = pd.concat([df, pd.DataFrame([data])], ignore_index=True)
            print(df)
            wait = input('\n\n\n Press any key to continue.....')
        elif ch == 7:
            col_name = input('Enter new column name :')
            col_value = int(input('Enter default column value :'))
            df[col_name] = col_value
            print(df)
            print('\n\n\n Press any key to continue....')
            wait = input()
        elif ch == 8:
            col_name = input('Enter column Name to delete :')
            del df[col_name]
            print(df)
            print('\n\n\n Press any key to continue....')
            wait = input()
        elif ch == 9:
            index_no = int(
                input('Enter the Index Number that You want to delete :'))
            df = df.drop(df.index[index_no])
            print(df)
            print('\n\n\n Press any key to continue....')
            wait = input()
        elif ch == 10:
            print(df.columns)
            print(df['Type'].unique())
            tipe = input('Enter Card Type ')
            g = df.groupby('Type')
            print('Card Type : ', tipe)
            print(g['Type'].count())
            print('\n\n\n Press any key to continue....')
            wait = input()
        elif ch == 11:
            df1 = df.Gender.unique()
            print('Available Gender :', df1)
            print('\n\n')
            schName = input('Enter Gender Type :')
            df1 = df[df.Gender == schName]
            print(df1)
            print('\n\n\n Press any key to continue....')
            wait = input()
        elif ch == 12:
            print(df.describe())
            print("\n\n\nPress any key to continue....")
            wait = input()
        elif ch == 13:
            break
# name of function : graph
# purpose : To generate a Graph menu
def graph():
    df = pd.read_csv(csv_file)
    while True:
        clear()
        print('\nGRAPH MENU ')
        print('_'*100)
        print('1. Whole Data LINE Graph\n')
        print('2. Whole Data Bar Graph\n')
        print('3. Whole Data Scatter Graph\n')
        print('4. Whole Data Pie Chart\n')
        print('5. Bar Graph By Education Level\n')
        print('6. Bar Graph By Income Level\n')
        print('7. Exit (Move to main menu)\n')
        ch = int(input('Enter your choice:'))
        # In every branch the labels (x) are taken from the index of the grouped
        # counts (y). unique() returns values in order of appearance, which need
        # not match the sorted order of the groupby result.
        if ch == 1:
            g = df.groupby('Gender')
            y = g['Gender'].count()
            x = y.index
            #plt.xticks(rotation='vertical')
            plt.xlabel('Gender')
            plt.ylabel('Total Credit Card Users')
            plt.title('Credit Card User- Gender wise')
            plt.grid(True)
            plt.plot(x, y)  # line graph
            plt.show()
        if ch == 2:
            g = df.groupby('Gender')
            y = g['Gender'].count()
            x = y.index
            #plt.xticks(rotation='vertical')
            plt.xlabel('Gender')
            plt.ylabel('Total Credit Card Users')
            plt.title('Credit Card User- Gender wise')
            plt.bar(x, y)  # bar graph
            plt.grid(True)
            plt.show()
            wait = input()
        if ch == 3:
            g = df.groupby('Gender')
            y = g['Gender'].count()
            x = y.index
            #plt.xticks(rotation='vertical')
            plt.xlabel('Gender')
            plt.ylabel('Total Credit Card Users')
            plt.title('Credit Card User- Gender wise')
            plt.grid(True)
            plt.scatter(x, y)  # scatter graph
            plt.show()
            wait = input()
        if ch == 4:
            g = df.groupby("Card_Category")
            y = g['Card_Category'].count()
            x = y.index
            plt.pie(y, labels=x, autopct='% .2f', startangle=90)  # pie graph
            plt.xticks(rotation='vertical')
            plt.show()
        if ch == 5:
            g = df.groupby("Education_Level")
            y = g['Education_Level'].count()
            x = y.index
            plt.bar(x, y)
            #plt.xticks(rotation='vertical')
            plt.grid(True)
            plt.title('Education Level wise Card User')
            plt.xlabel('Education Level')
            plt.show()
            wait = input()
        if ch == 6:
            g = df.groupby("Income_Category")
            y = g['Income_Category'].count()
            x = y.index
            plt.grid(True)
            plt.title('Credit Card User- Income Group')
            plt.xlabel('Income Group')
            plt.ylabel('Card Users')
            plt.bar(x, y)
            plt.show()
        if ch == 7:
            break
# function name : export_menu
# purpose : function to generate export menu
def export_menu():
    df = pd.read_csv(csv_file)
    while True:
        clear()
        print('\n\nEXPORT MENU ')
        print('_'*100)
        print()
        print('1. CSV File\n')
        print('2. Excel File\n')
        print('3. Exit (Move to main menu)')
        ch = int(input('Enter your Choice : '))
        if ch == 1:
            df.to_csv('c:/backup/bankchurner_backup.csv')
            print('\n\nCheck your new file "bankchurner_backup.csv" on C: Drive.....')
            wait = input('\n\n\n Press any key to continue.....')
        if ch == 2:
            df.to_excel('c:/backup/bankchurner_backup.xlsx')
            print('\n\nCheck your new file "bankchurner_backup.xlsx" on C: Drive.....')
            wait = input('\n\n\n Press any key to continue.....')
        if ch == 3:
            break
def main_menu():
    clear()
    introduction()
    while True:
        clear()
        print('MAIN MENU ')
        print('_'*100)
        print()
        print('1. Read CSV File\n')
        print('2. Data Analysis Menu\n')
        print('3. Graph Menu\n')
        print('4. Export Data\n')
        print('5. Exit\n')
        choice = int(input('Enter your choice :'))
        if choice == 1:
            read_csv_file()
            wait = input('\n\n Press any key to continue....')
        if choice == 2:
            data_analysis_menu()
            wait = input('\n\n Press any key to continue....')
        if choice == 3:
            graph()
            wait = input('\n\n Press any key to continue....')
        if choice == 4:
            export_menu()
            wait = input('\n\n Press any key to continue....')
        if choice == 5:
            break
    clear()
    made_by()


# call your main menu
main_menu()
8. SCREEN LAYOUT
IDLE:
1. Introduction
2. Read CSV File
3. Data Analysis Menu
4. Show Whole DataFrame
5. Show Columns
6. Show Top Rows
7. Show Bottom Rows
8. Show Specific Column
9. Add a New Record
10. Add a New Column
11. Delete a Column
12. Delete a Record
13. Card Type User
14. Gender Wise User
15. Data Summary
16. Exit
17. Graph Menu
18. Whole Data LINE Graph
19. Whole Data Bar Graph
20. Whole Data Scatter Chart
21. Whole Data Pie Chart
22. Bar Graph By Education Level
23. Bar Graph By Income Level
24. Exit
25. Export Data
26. CSV File
27. Excel File
28. Exit
29. Exit
9. CONCLUSION
Credit Card Data Analysis is a complex issue that requires a substantial
amount of planning before throwing machine learning algorithms at it.
Nonetheless, it is also an application of data science and machine
learning for the good, which makes sure that the customer’s money is
safe and not easily tampered with.
10. BIBLIOGRAPHY
In order to work on this project titled "CREDIT CARD DATA ANALYSIS", the following books and
literature were referred to by me during the various phases of development of the project.
Reference Books:
Informatics Practices with Python, Sumita Arora.
Reference Websites:
www.python.org
www.tutorialspoint.com
www.w3schools.com
www.docs.python.org
www.freehtmltemplates.com
www.google.com/Python project
www.wikepedia.com/Python and Pandas projects
www.data.world
www.youTube.com