DA Lab ANSWERS

Uploaded by

sakthi

AD8412 - DATA ANALYTICS LAB

1. Apply the following functions to the list of BMI values for people living in a rural area:

bmi_list = [29, 18, 20, 22, 19, 25, 30, 28, 22, 21, 18, 19, 20, 20, 22, 23]

(i) random.choice()
(ii) random.sample()
(iii) random.randint()
PROGRAM:

import random
from random import sample

# BMI = weight (kg) divided by the square of the height (m)
def BMI(height, weight):
    bmi = weight / (height ** 2)
    return bmi

bmi_list = [29, 18, 20, 22, 19, 25, 30, 28, 22, 21, 18, 19, 20, 20, 22, 23]
height = 1.79832
weight = 70
bmi = BMI(height, weight)
print("The BMI is", format(bmi), "so ", end='')

# classify the computed BMI
if bmi < 18.5:
    print("underweight")
elif bmi >= 18.5 and bmi < 24.9:
    print("Healthy")
elif bmi >= 24.9 and bmi < 30:
    print("overweight")
elif bmi >= 30:
    print("Suffering from Obesity")

output:
The BMI is 21.64532402096181 so Healthy

(i) random.choice()

print(random.choice(bmi_list))

output:
30
(ii) random.sample()

print(sample(bmi_list,3))

output: [18, 25, 22]

(iii) random.randint()

print(random.randint(0, 12))

output: 9
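
The outputs above change on every run because no seed is set. As a minimal sketch (the seed value 42 and the helper name draw are assumptions, not part of the original answer), seeding the generator makes the three calls repeatable, and randint() can be turned into a list index so that the result is an actual BMI value:

```python
import random

bmi_list = [29, 18, 20, 22, 19, 25, 30, 28, 22, 21, 18, 19, 20, 20, 22, 23]

def draw():
    # hypothetical helper: runs the three functions from this exercise once
    return (random.choice(bmi_list),
            random.sample(bmi_list, 3),
            random.randint(0, 12))

# the same seed gives the same pseudo-random sequence (42 is arbitrary)
random.seed(42)
first_run = draw()
random.seed(42)
second_run = draw()
print(first_run == second_run)  # True

# randint(a, b) includes both endpoints, so len - 1 is the last valid index
index = random.randint(0, len(bmi_list) - 1)
picked = bmi_list[index]
print(picked)
```

Note that random.randint(0, 12) on its own returns an integer between 0 and 12 and does not read from bmi_list at all.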

2. Use the random.choices() function to select multiple random items from a sequence with
repetition.

For example, you have a list of names and want to choose four of them at random, and it is acceptable for a name to repeat.

names = ["Roger", "Nadal", "Novac", "Andre", "Sarena", "Mariya", "Martina", "Kumar"]

PROGRAM:

import random

names = ["Roger", "Nadal", "Novac", "Andre", "Sarena", "Mariya", "Martina", "Kumar"]

# choose four random names with replacement, so a name may repeat
sample_list3 = random.choices(names, k=4)
print(sample_list3)

Output:
['Novac', 'Novac', 'Martina', 'Sarena']
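
To make the "with repetition" part concrete, the following sketch contrasts random.choices() with random.sample() on the same list. The value k=12 is an assumption chosen only because it is larger than the number of names:

```python
import random

names = ["Roger", "Nadal", "Novac", "Andre", "Sarena", "Mariya", "Martina", "Kumar"]

# choices() draws with replacement, so k may exceed len(names)
with_repeats = random.choices(names, k=12)
print(len(with_repeats))  # 12

# sample() draws without replacement: each position is used at most once
without_repeats = random.sample(names, k=len(names))
print(sorted(without_repeats) == sorted(names))  # True

# asking sample() for more items than the list holds raises ValueError
try:
    random.sample(names, k=12)
    sample_failed = False
except ValueError:
    sample_failed = True
print(sample_failed)  # True
```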

3. Write a Python program to demonstrate the use of sample() function for string and tuple
types.

import random

string = "Welcome World"
print("With string:", random.sample(string, 4))

output:
With string: ['r', 'm', 'W', 'W']

tuple1 = ("Selshia", "AI", "computer", "science", "Jansons", "Engineering", "btech")
print("With tuple:", random.sample(tuple1, 4))

output:
With tuple: ['Jansons', 'Selshia', 'btech', 'Engineering']

4. Write a Python script to implement the Z-Test for the following problem:

A school claims that its students are more intelligent than those of the average school.
The IQ scores of 50 of its students have a mean of 110. The mean of
the population IQ is 100 and the standard deviation is 15. Check whether the principal's
claim is right or not at a 5% significance level.

PROGRAM:

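import math
import numpy as np
from numpy.random import randn
from statsmodels.stats.weightstats import ztest

mean_iq = 110
# 15/sqrt(50) is the standard error of the mean for n = 50;
# it is used here as the spread of the simulated scores
sd_iq = 15/math.sqrt(50)
alpha = 0.05
null_mean = 100

# simulate 50 scores centred on the observed sample mean
data = sd_iq*randn(50) + mean_iq
print('mean=%.2f stdv=%.2f' % (np.mean(data), np.std(data)))

# one-sided test: the alternative is that the mean is larger than null_mean
ztest_Score, p_value = ztest(data, value=null_mean, alternative='larger')

if p_value < alpha:
    print("Reject Null Hypothesis")
else:
    print("Fail to Reject Null Hypothesis")

OUTPUT:
mean=109.65 stdv=2.06
Reject Null Hypothesis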
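
Since the problem already gives the sample mean, the population mean, the population standard deviation and the sample size, the z statistic can also be computed directly, without simulating any scores. The sketch below uses only the math module; the variable names are assumptions, and the upper-tail p-value uses the identity P(Z > z) = 0.5 * erfc(z / sqrt(2)):

```python
import math

sample_mean = 110
population_mean = 100
population_sd = 15
n = 50
alpha = 0.05

# standard error of the mean and the z statistic
standard_error = population_sd / math.sqrt(n)
z_score = (sample_mean - population_mean) / standard_error

# upper-tail p-value of the standard normal distribution
p_value = 0.5 * math.erfc(z_score / math.sqrt(2))

print("z = %.3f" % z_score)  # z = 4.714
if p_value < alpha:
    print("Reject Null Hypothesis")
else:
    print("Fail to Reject Null Hypothesis")
```

With z far above the one-sided 5% critical value of 1.645, this agrees with the conclusion printed by the program above.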

5. Write a Python program to demonstrate the ‘T-Test’ with suitable libraries for a sample
student’s data. (Create and use dataset of your own)

import pandas as pd
from scipy import stats

df = pd.read_csv("paired_ttest - paired_ttest.csv")

# paired (dependent-samples) t-test on the two columns
tscore, pvalue = stats.ttest_rel(df['Brand 1'], df['Brand 2'])
alpha = 0.20
print(tscore, pvalue)

if pvalue > alpha:
    print("Failed to reject or do not reject null hypothesis")
else:
    print("Reject null hypothesis")
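
The exercise asks for a self-made student dataset, so here is a minimal sketch of the paired t statistic worked out by hand. The marks below are hypothetical, invented purely for illustration, and the critical value 2.571 is the standard two-sided table value of Student's t for 5 degrees of freedom at the 5% level:

```python
import math
import statistics

# hypothetical marks of six students before and after a revision class
before = [60, 65, 70, 58, 72, 68]
after = [66, 70, 71, 64, 75, 74]

# a paired t-test is a one-sample t-test on the per-student differences
differences = [a - b for a, b in zip(after, before)]
n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)  # sample standard deviation (n - 1)
t_score = mean_diff / (sd_diff / math.sqrt(n))

t_critical = 2.571  # two-sided, df = n - 1 = 5, alpha = 0.05
print("t = %.3f" % t_score)
if abs(t_score) > t_critical:
    print("Reject null hypothesis")
else:
    print("Failed to reject null hypothesis")
```

Calling scipy's stats.ttest_rel on the same two lists should give the same t statistic together with an exact p-value.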

6. Import the necessary libraries in Python for implementing ‘One-Way ANOVA Test’ in a
sample dataset. (Create and use dataset of your own)

Program:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# load data file
df = pd.read_csv("https://reneshbedre.github.io/assets/posts/anova/onewayanova.txt", sep="\t")

# reshape the dataframe into long format, suitable for the statsmodels package
df_melt = pd.melt(df.reset_index(), id_vars=['index'], value_vars=['A', 'B', 'C', 'D'])

# replace column names
df_melt.columns = ['index', 'treatments', 'value']

# generate a boxplot to see the data distribution by treatments. Using boxplot, we can
# easily detect the differences between different treatments
ax = sns.boxplot(x='treatments', y='value', data=df_melt)
ax = sns.swarmplot(x="treatments", y="value", data=df_melt)
plt.show()

output: [plot: value on the y-axis for each of the treatments A, B, C and D on the x-axis]
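
The program above only loads and plots the data; the one-way ANOVA itself is an F test on the group means. As a hedged sketch of what that test computes, the code below works out the F statistic by hand for three small hypothetical groups (not the data in the file above):

```python
import statistics

# hypothetical measurements for three treatments
groups = {
    "A": [1, 2, 3],
    "B": [2, 3, 4],
    "C": [5, 6, 7],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = statistics.mean(all_values)

# variation of the group means around the grand mean
ss_between = sum(len(values) * (statistics.mean(values) - grand_mean) ** 2
                 for values in groups.values())

# variation of the observations around their own group mean
ss_within = sum((v - statistics.mean(values)) ** 2
                for values in groups.values() for v in values)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

# F = mean square between groups / mean square within groups
f_stat = (ss_between / df_between) / (ss_within / df_within)
print(round(f_stat, 2))  # 13.0
```

For real data, scipy.stats.f_oneway takes the groups as separate arguments and returns this F statistic along with its p-value.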

7. Import the necessary libraries in Python for implementing ‘Two-Way ANOVA Test’ in a
sample dataset. (Create and use dataset of your own)
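
No program is listed for this exercise. As a minimal sketch only, the code below computes the two-way ANOVA F statistics by hand for a hypothetical balanced design (two fertilizers, two watering levels, two plants per cell); the factor names and values are invented for illustration. In practice a library route such as statsmodels' ols() followed by anova_lm() would usually be preferred.

```python
from statistics import mean

# hypothetical balanced design: (fertilizer, watering) -> observations
cells = {
    ("F1", "low"): [1, 3],
    ("F1", "high"): [3, 5],
    ("F2", "low"): [5, 7],
    ("F2", "high"): [7, 9],
}
a_levels = sorted({a for a, _ in cells})
b_levels = sorted({b for _, b in cells})
r = 2  # replicates per cell

grand = mean([v for obs in cells.values() for v in obs])
a_mean = {a: mean([v for (ai, _), obs in cells.items() if ai == a for v in obs])
          for a in a_levels}
b_mean = {b: mean([v for (_, bi), obs in cells.items() if bi == b for v in obs])
          for b in b_levels}
cell_mean = {key: mean(obs) for key, obs in cells.items()}

# sums of squares for the two main effects, the interaction and the error
ss_a = r * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
ss_b = r * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
ss_ab = r * sum((cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                for a in a_levels for b in b_levels)
ss_error = sum((v - cell_mean[key]) ** 2
               for key, obs in cells.items() for v in obs)

df_a = len(a_levels) - 1
df_b = len(b_levels) - 1
df_ab = df_a * df_b
df_error = len(a_levels) * len(b_levels) * (r - 1)

# each F statistic is an effect mean square divided by the error mean square
ms_error = ss_error / df_error
f_a = (ss_a / df_a) / ms_error
f_b = (ss_b / df_b) / ms_error
f_ab = (ss_ab / df_ab) / ms_error
print(f_a, f_b, f_ab)  # 16.0 4.0 0.0
```

The hypothetical data were chosen so that the two factors act additively, which is why the interaction F statistic is zero.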

8. Let us consider a dataset where we have a value of response y for every feature x:

x: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
y: 1, 3, 2, 5, 7, 8, 8, 9, 10, 12

Generate a regression line for this sample data using Python.

PROGRAM:

import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)

    # mean of x and y vector
    m_x = np.mean(x)
    m_y = np.mean(y)

    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x

    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x

    return (b_0, b_1)

def plot_regression_line(x, y, b):
    # plot the observed points
    plt.scatter(x, y, color = "m", marker = "o", s = 30)

    # predicted response vector
    y_pred = b[0] + b[1]*x

    # plot the regression line
    plt.plot(x, y_pred, color = "g")

    plt.xlabel('x')
    plt.ylabel('y')
    plt.show()

def main():
    # observations
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])

    # estimate the coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))

    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()

OUTPUT:
Estimated coefficients:
b_0 = 1.2363636363636363
b_1 = 1.1696969696969697
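
As a cross-check on the coefficients, the same slope and intercept can be obtained from the deviation form of the formulas, SS_xy = sum((x_i - m_x)(y_i - m_y)) and SS_xx = sum((x_i - m_x)^2), which is algebraically equal to the form used in estimate_coef. The sketch below uses plain Python lists (no NumPy) and lower-case names that are assumptions:

```python
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1, 3, 2, 5, 7, 8, 8, 9, 10, 12]

n = len(x)
m_x = sum(x) / n
m_y = sum(y) / n

# deviation form of the cross-deviation and the deviation about x
ss_xy = sum((xi - m_x) * (yi - m_y) for xi, yi in zip(x, y))
ss_xx = sum((xi - m_x) ** 2 for xi in x)

b_1 = ss_xy / ss_xx
b_0 = m_y - b_1 * m_x

# for a least-squares line with an intercept the residuals sum to zero
residual_sum = sum(yi - (b_0 + b_1 * xi) for xi, yi in zip(x, y))

print(round(b_0, 4), round(b_1, 4))  # 1.2364 1.1697
```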

9. Import scipy and draw the line of Linear Regression for the following data:

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

Here the x-axis represents age and the y-axis represents speed. We have registered the
age and speed of 13 cars as they were passing a tollbooth.

PROGRAM:

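import matplotlib.pyplot as plt
from scipy import stats

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

# slope and intercept of the least-squares line; r is the correlation coefficient
slope, intercept, r, p, std_err = stats.linregress(x, y)

def myfunc(x):
    return slope * x + intercept

# predicted speed for every age in x
mymodel = list(map(myfunc, x))

plt.scatter(x, y)
plt.plot(x, mymodel)
plt.xlabel('Age')
plt.ylabel('Speed Of Cars')
plt.show()

OUTPUT: [plot: scatter of Age against Speed Of Cars with the fitted regression line]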
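
The program unpacks r from linregress but never uses it. As a hedged sketch, the code below recomputes the slope, intercept and correlation coefficient for the same 13 cars in plain Python and uses the line to predict the speed of a 10-year-old car; the function name predict_speed and the age 10 are assumptions added for illustration:

```python
import math

x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]
y = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)

slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Pearson correlation: the sign gives the direction, the size the strength
r = s_xy / math.sqrt(s_xx * s_yy)

def predict_speed(age):
    # hypothetical helper: point on the fitted line for a given age
    return slope * age + intercept

print(round(r, 2))                  # -0.76
print(round(predict_speed(10), 2))  # 85.59
```

The negative r indicates that, in this sample, older cars tend to pass the tollbooth more slowly.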

10. Implement the time series analysis concept for a sample dataset using Pandas.
(Create and use dataset of your own)

Refer to the 12th program.

11. Write a Python program to visualize the time series concepts using Matplotlib.
(Create and use dataset of your own)

Refer to the 12th program.

12. Demonstrate various time series models using Python.(Create and use dataset of your own)

PROGRAM:

import pandas as pd
import matplotlib.pyplot as plt

# load the series with the date column parsed and used as the index
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/a10.csv',
                 parse_dates=['date'], index_col='date')

# Draw Plot
def plot_df(df, x, y, title="", xlabel='Date', ylabel='Value', dpi=100):
    plt.figure(figsize=(16,5), dpi=dpi)
    plt.plot(x, y, color='tab:red')
    plt.gca().set(title=title, xlabel=xlabel, ylabel=ylabel)
    plt.show()

plot_df(df, x=df.index, y=df.value, title='Monthly anti-diabetic drug sales in Australia from 1992 to 2008.')

OUTPUT: [plot: monthly drug sales value against date]
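
The program above plots the raw series only. One of the simplest time series models is a moving average, which smooths the series by averaging each point with its recent neighbours. The sketch below is an assumption-laden illustration: the sales figures are hypothetical, the function name moving_average is invented here, and a window of 3 is an arbitrary choice.

```python
# hypothetical monthly sales figures
sales = [10, 12, 14, 16, 18, 20]
window = 3

def moving_average(values, window):
    # average of each run of `window` consecutive values
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]

smoothed = moving_average(sales, window)
print(smoothed)  # [12.0, 14.0, 16.0, 18.0]
```

On a pandas Series, rolling(window).mean() gives comparable values, with NaN in the first window - 1 positions.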
