Advertising Data Analysis

This document analyzes advertising data using Python libraries like Pandas, NumPy, Matplotlib, and Seaborn. Various visualizations are created to explore relationships between different advertising mediums and sales. A logistic regression model is fit to the data to predict sales based on advertising expenditures.

Uploaded by

Vishal Sharma

3/23/2019 Advertising

In [2]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [3]:

import seaborn as sns


sns.set_style('darkgrid')

In [4]:

%matplotlib inline

In [6]:

file = r'C:\Users\hp\Desktop\advertising.csv'
data = pd.read_csv(file)
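The `Unnamed: 0` column in the output below most likely comes from a row-number column with an empty header in the CSV. A minimal sketch of reading such a file with `index_col=0`, so that column becomes the index and needs no later drop. The inline text holds only the five rows shown in the notebook's `data.head()` output, and the empty first header cell is an assumption about the file layout:

```python
import io
import pandas as pd

# First five rows as shown in data.head(); the real file has 200 rows.
# Assumption: the first header cell is empty, which is why pandas
# labels that column "Unnamed: 0".
csv_text = """,TV,radio,newspaper,sales
1,230.1,37.8,69.2,22.1
2,44.5,39.3,45.1,10.4
3,17.2,45.9,69.3,9.3
4,151.5,41.3,58.5,18.5
5,180.8,10.8,58.4,12.9
"""

# index_col=0 uses the row-number column as the index instead of data.
sample = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(sample.columns.tolist())
```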

In [7]:

data.head()

Out[7]:

   Unnamed: 0     TV  radio  newspaper  sales
0           1  230.1   37.8       69.2   22.1
1           2   44.5   39.3       45.1   10.4
2           3   17.2   45.9       69.3    9.3
3           4  151.5   41.3       58.5   18.5
4           5  180.8   10.8       58.4   12.9

In [20]:

data.drop(['Unnamed: 0'],axis = 1,inplace = True)

In [11]:

data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 5 columns):
Unnamed: 0 200 non-null int64
TV 200 non-null float64
radio 200 non-null float64
newspaper 200 non-null float64
sales 200 non-null float64
dtypes: float64(4), int64(1)
memory usage: 7.9 KB

http://localhost:8888/nbconvert/html/Advertising.ipynb?download=false 1/12

In [21]:

data.describe()

Out[21]:

               TV       radio   newspaper       sales
count  200.000000  200.000000  200.000000  200.000000
mean   147.042500   23.264000   30.554000   14.022500
std     85.854236   14.846809   21.778621    5.217457
min      0.700000    0.000000    0.300000    1.600000
25%     74.375000    9.975000   12.750000   10.375000
50%    149.750000   22.900000   25.750000   12.900000
75%    218.825000   36.525000   45.100000   17.400000
max    296.400000   49.600000  114.000000   27.000000

In [22]:

sns.heatmap(data.corr(),cmap = 'magma',lw = .7,linecolor = 'black',alpha = 0.8,annot = True)

Out[22]:

<matplotlib.axes._subplots.AxesSubplot at 0x1ed5ba54630>

In [30]:

sns.distplot(data['sales'],hist = True)

Out[30]:

<matplotlib.axes._subplots.AxesSubplot at 0x1ed5bf0be10>


In [29]:

sns.lmplot(x = 'TV',y = 'radio',data = data)

Out[29]:

<seaborn.axisgrid.FacetGrid at 0x1ed5beaa7f0>


In [35]:

sns.pairplot(data)

Out[35]:

<seaborn.axisgrid.PairGrid at 0x1ed5f436978>

In [36]:

data.head()

Out[36]:

      TV  radio  newspaper  sales
0  230.1   37.8       69.2   22.1
1   44.5   39.3       45.1   10.4
2   17.2   45.9       69.3    9.3
3  151.5   41.3       58.5   18.5
4  180.8   10.8       58.4   12.9

In [39]:

X = data.drop(['sales'],axis = 1)
y = data['sales']

In [40]:

from sklearn.model_selection import train_test_split

In [41]:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=101)
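With 200 rows and `test_size=0.3`, the split should leave 140 rows for training and 60 for testing, which agrees with the support total of 60 in the classification report further down. A small sketch that checks the sizes on placeholder arrays (the `_demo` names and values are illustrative, not the advertising data):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays with the same number of rows (200) as the dataset.
X_demo = np.arange(200 * 3).reshape(200, 3)
y_demo = np.arange(200)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=101
)
print(X_tr.shape, X_te.shape)  # (140, 3) (60, 3)
```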

In [42]:

from sklearn.linear_model import LogisticRegression

In [44]:

logmodel = LogisticRegression()

In [56]:

from sklearn import preprocessing


from sklearn import utils

lab_enc = preprocessing.LabelEncoder()
encoded = lab_enc.fit_transform(y_train)

In [66]:

lab_enc = preprocessing.LabelEncoder()
encoded2 = lab_enc.fit_transform(y_test)
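One thing to watch in the two cells above: each `LabelEncoder` is fit separately, so the same sales value can receive a different integer code in `encoded` and in `encoded2`. A minimal sketch of the effect, using hypothetical sales values chosen only for illustration:

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical sales values, not taken from the actual train/test split.
y_train_demo = np.array([9.3, 10.4, 22.1])
y_test_demo = np.array([10.4, 18.5])

# Each encoder numbers the sorted unique values it was fit on.
train_codes = LabelEncoder().fit_transform(y_train_demo)
test_codes = LabelEncoder().fit_transform(y_test_demo)

print(train_codes)  # 10.4 is encoded as 1 here
print(test_codes)   # 10.4 is encoded as 0 here
```

Because the codes are not comparable across the two encoders, predictions expressed in the training codes cannot be scored meaningfully against test labels in the test codes.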


In [60]:

logmodel.fit(X_train,encoded)

C:\Users\hp\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:433: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.
  FutureWarning)
C:\Users\hp\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:460: FutureWarning: Default multi_class will be changed to 'auto' in 0.22. Specify the multi_class option to silence this warning.
  "this warning.", FutureWarning)

Out[60]:

LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='warn',
          n_jobs=None, penalty='l2', random_state=None, solver='warn',
          tol=0.0001, verbose=0, warm_start=False)

In [61]:

predictions = logmodel.predict(X_test)


In [69]:

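from sklearn.metrics import classification_report,confusion_matrix

print(classification_report(encoded2,predictions))
# confusion_matrix has to be called with the labels; the recorded run printed the bare function object.
print(confusion_matrix(encoded2,predictions))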

precision recall f1-score support

0 0.00 0.00 0.00 1
1 0.00 0.00 0.00 1
2 0.00 0.00 0.00 1
3 0.00 0.00 0.00 1
4 0.00 0.00 0.00 1
5 0.00 0.00 0.00 1
6 0.00 0.00 0.00 1
7 0.00 0.00 0.00 1
8 0.00 0.00 0.00 1
9 0.50 0.50 0.50 2
10 0.00 0.00 0.00 2
11 0.00 0.00 0.00 1
12 0.00 0.00 0.00 2
13 0.00 0.00 0.00 1
14 0.00 0.00 0.00 2
15 0.00 0.00 0.00 1
16 0.00 0.00 0.00 1
17 0.00 0.00 0.00 2
18 0.00 0.00 0.00 1
19 0.00 0.00 0.00 1
20 0.00 0.00 0.00 1
21 0.00 0.00 0.00 1
22 0.00 0.00 0.00 2
23 0.00 0.00 0.00 1
24 0.00 0.00 0.00 1
25 0.00 0.00 0.00 2
26 0.00 0.00 0.00 1
27 0.00 0.00 0.00 1
28 0.00 0.00 0.00 1
29 0.00 0.00 0.00 2
30 0.00 0.00 0.00 1
31 0.00 0.00 0.00 1
32 0.00 0.00 0.00 2
33 0.00 0.00 0.00 1
34 0.00 0.00 0.00 1
35 0.00 0.00 0.00 1
36 0.00 0.00 0.00 1
37 0.00 0.00 0.00 1
38 0.00 0.00 0.00 1
39 0.00 0.00 0.00 1
40 0.00 0.00 0.00 1
41 0.00 0.00 0.00 1
42 0.00 0.00 0.00 1
43 0.00 0.00 0.00 1
44 0.00 0.00 0.00 1
45 0.00 0.00 0.00 1
46 0.00 0.00 0.00 1
47 0.00 0.00 0.00 1
48 0.00 0.00 0.00 1
49 0.00 0.00 0.00 1
50 0.00 0.00 0.00 1
54 0.00 0.00 0.00 0
63 0.00 0.00 0.00 0
73 0.00 0.00 0.00 0
81 0.00 0.00 0.00 0
84 0.00 0.00 0.00 0
95 0.00 0.00 0.00 0
98 0.00 0.00 0.00 0

micro avg 0.02 0.02 0.02 60
macro avg 0.01 0.01 0.01 60
weighted avg 0.02 0.02 0.02 60

<function confusion_matrix at 0x000001ED6141F7B8>
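The line above is the repr of the `confusion_matrix` function itself, which is what `print` shows when the function is passed without being called. A minimal sketch of calling it with true and predicted labels; the label lists here are hypothetical and only illustrate the call:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels, for illustration only.
y_true_demo = [0, 1, 1, 0]
y_pred_demo = [0, 1, 0, 0]

# Rows correspond to true labels, columns to predicted labels.
cm = confusion_matrix(y_true_demo, y_pred_demo)
print(cm)
# [[2 0]
#  [1 1]]
```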

C:\Users\hp\Anaconda3\lib\site-packages\sklearn\metrics\classification.py:1143: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
C:\Users\hp\Anaconda3\lib\site-packages\sklearn\metrics\classification.py:1145: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples.
  'recall', 'true', average, warn_for)
C:\Users\hp\Anaconda3\lib\site-packages\sklearn\metrics\classification.py:1143: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
C:\Users\hp\Anaconda3\lib\site-packages\sklearn\metrics\classification.py:1145: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples.
  'recall', 'true', average, warn_for)
C:\Users\hp\Anaconda3\lib\site-packages\sklearn\metrics\classification.py:1143: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
C:\Users\hp\Anaconda3\lib\site-packages\sklearn\metrics\classification.py:1145: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples.
  'recall', 'true', average, warn_for)
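The near-zero scores and the warnings above are what one would expect when a continuous target such as `sales` is treated as a set of classes: almost every distinct value becomes its own label. A regression model is the more usual fit for this kind of target. The sketch below swaps in `LinearRegression`, which is not what the notebook uses, and runs on synthetic stand-in data with made-up coefficients because the original CSV is not available here:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the advertising.csv values): three spend
# columns and a continuous target built from hypothetical coefficients.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 100, size=(200, 3))
y_demo = 3.0 + 0.05 * X_demo[:, 0] + 0.2 * X_demo[:, 1] + rng.normal(0, 1, 200)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=101
)

# Fit on the continuous target directly; no label encoding is needed.
linmodel = LinearRegression()
linmodel.fit(X_tr, y_tr)
pred = linmodel.predict(X_te)

# Regression metrics replace the classification report.
rmse = np.sqrt(mean_squared_error(y_te, pred))
r2 = r2_score(y_te, pred)
print("RMSE:", rmse)
print("R^2:", r2)
```

On the real data the same pattern would apply to the `X_train`, `X_test`, `y_train`, and `y_test` objects created earlier, with `y_train` passed as is rather than label-encoded.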
