Batch2 Ds

Uploaded by

ece apce

1)i. Write a NumPy program to convert a list and tuple into arrays.

Program:
import numpy as np

# Convert a list to a NumPy array

list_data = [1, 2, 3, 4, 5]

array_from_list = np.array(list_data)

print("Array from list:", array_from_list)

# Convert a tuple to a NumPy array

tuple_data = (10, 20, 30, 40, 50)

array_from_tuple = np.array(tuple_data)

print("Array from tuple:", array_from_tuple)

output:
Array from list: [1 2 3 4 5]

Array from tuple: [10 20 30 40 50]
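A short sketch of two related details: nested sequences become multi-dimensional arrays, and `np.asarray` differs from `np.array` in whether it copies an existing array. The variable names here are illustrative and not part of the exercise.

```python
import numpy as np

# A list of tuples becomes a 2-D array; the dtype is inferred from the values
nested = [(1, 2, 3), (4, 5, 6)]
array_2d = np.array(nested)
print(array_2d.shape)   # (2, 3)

# np.asarray returns the same object when the input is already an ndarray,
# whereas np.array makes a copy by default
same = np.asarray(array_2d)
copy = np.array(array_2d)
print(same is array_2d, copy is array_2d)

# An explicit dtype converts the values during construction
as_float = np.array((10, 20, 30), dtype=float)
print(as_float)
```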

ii. Write a NumPy program to convert Centigrade values into Fahrenheit and vice versa. The values have to be stored in a NumPy array.
Program:
import numpy as np

# Function to convert Centigrade to Fahrenheit

def centigrade_to_fahrenheit(celsius):
    return (celsius * 9/5) + 32

# Function to convert Fahrenheit to Centigrade

def fahrenheit_to_centigrade(fahrenheit):
    return (fahrenheit - 32) * 5/9

# Create a NumPy array of Centigrade temperatures

centigrade_values = np.array([0, 10, 20, 30, 40, 50])

# Convert Centigrade to Fahrenheit

fahrenheit_values = centigrade_to_fahrenheit(centigrade_values)

# Create a NumPy array of Fahrenheit temperatures

fahrenheit_array = np.array([32, 50, 68, 86, 104, 122])

# Convert Fahrenheit to Centigrade

centigrade_from_fahrenheit = fahrenheit_to_centigrade(fahrenheit_array)

# Print the results

print("Centigrade values:", centigrade_values)

print("Converted Fahrenheit values:", fahrenheit_values)

print("\nFahrenheit values:", fahrenheit_array)

print("Converted Centigrade values:", centigrade_from_fahrenheit)

output:
Centigrade values: [ 0 10 20 30 40 50]

Converted Fahrenheit values: [ 32. 50. 68. 86. 104. 122.]

Fahrenheit values: [ 32 50 68 86 104 122]

Converted Centigrade values: [ 0. 10. 20. 30. 40. 50.]
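One way to sanity-check the two conversion functions is a round trip: converting to Fahrenheit and back should return the starting values. The sketch below restates the functions so it runs on its own; the test temperatures are chosen for illustration.

```python
import numpy as np

def centigrade_to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32

def fahrenheit_to_centigrade(fahrenheit):
    return (fahrenheit - 32) * 5 / 9

centigrade_values = np.array([-40.0, 0.0, 36.6, 100.0])
round_trip = fahrenheit_to_centigrade(centigrade_to_fahrenheit(centigrade_values))

# np.allclose tolerates the small floating-point error a round trip can introduce
print(np.allclose(round_trip, centigrade_values))

# -40 is the one temperature that reads the same on both scales
print(centigrade_to_fahrenheit(np.array([-40.0])))
```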

2. i. Write a NumPy program to find the real and imaginary parts of an array of
complex numbers.
Program:
import numpy as np
# Create a NumPy array of complex numbers
complex_array = np.array([2 + 3j, 4 - 5j, -1 + 2j, 3 + 4j])
# Extract the real parts of the complex numbers
real_parts = np.real(complex_array)
# Extract the imaginary parts of the complex numbers
imaginary_parts = np.imag(complex_array)
# Print the results
print("Complex array:", complex_array)
print("Real parts:", real_parts)
print("Imaginary parts:", imaginary_parts)

output:
Complex array: [ 2.+3.j 4.-5.j -1.+2.j 3.+4.j]
Real parts: [ 2. 4. -1. 3.]
Imaginary parts: [ 3. -5. 2. 4.]
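The same parts are also available as array attributes. A minimal sketch, which additionally computes magnitudes and conjugates (these go beyond what the exercise asks for):

```python
import numpy as np

complex_array = np.array([2 + 3j, 4 - 5j, -1 + 2j, 3 + 4j])

# .real and .imag are attribute equivalents of np.real and np.imag
real_parts = complex_array.real
imaginary_parts = complex_array.imag

# Magnitude and complex conjugate of each element
magnitudes = np.abs(complex_array)
conjugates = np.conj(complex_array)

# Recombine the parts to check that nothing was lost
rebuilt = real_parts + 1j * imaginary_parts
print(np.array_equal(rebuilt, complex_array))
print(magnitudes)
```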

ii. Write a NumPy program to convert a NumPy array into a CSV file.
Program:
import numpy as np

# Create a NumPy array

array_data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Save the NumPy array into a CSV file

np.savetxt('array_data.csv', array_data, delimiter=',', fmt='%d')

print("Array has been saved to 'array_data.csv'.")

output:
Array has been saved to 'array_data.csv'.

Contents of array_data.csv:
1,2,3
4,5,6
7,8,9
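To confirm that the file holds what was written, it can be read back with `np.loadtxt`. This sketch writes to a temporary directory (an assumption made so that it leaves no files behind) rather than the working directory used in the exercise.

```python
import os
import tempfile
import numpy as np

array_data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Write to a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "array_data.csv")
    np.savetxt(csv_path, array_data, delimiter=",", fmt="%d")

    # loadtxt returns floats by default; dtype=int restores the integer values
    loaded = np.loadtxt(csv_path, delimiter=",", dtype=int)

print(np.array_equal(loaded, array_data))
```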
3. i. Write a NumPy program to perform the basic arithmetic operations
Program:
import numpy as np

# Create two NumPy arrays

array1 = np.array([10, 20, 30, 40, 50])

array2 = np.array([1, 2, 3, 4, 5])

# Addition

addition_result = array1 + array2

# Subtraction

subtraction_result = array1 - array2

# Multiplication

multiplication_result = array1 * array2

# Division

division_result = array1 / array2

# Exponentiation (array1 raised to the power of array2)

exponentiation_result = array1 ** array2

# Print the results

print("Array 1:", array1)

print("Array 2:", array2)

print("\nAddition (Array1 + Array2):", addition_result)

print("Subtraction (Array1 - Array2):", subtraction_result)

print("Multiplication (Array1 * Array2):", multiplication_result)

print("Division (Array1 / Array2):", division_result)

print("Exponentiation (Array1 ** Array2):", exponentiation_result)

output:
Array 1: [10 20 30 40 50]

Array 2: [1 2 3 4 5]

Addition (Array1 + Array2): [11 22 33 44 55]

Subtraction (Array1 - Array2): [ 9 18 27 36 45]

Multiplication (Array1 * Array2): [ 10 40 90 160 250]

Division (Array1 / Array2): [10. 10. 10. 10. 10.]

Exponentiation (Array1 ** Array2): [ 10 400 27000 2560000 312500000]
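Each operator has a named ufunc equivalent, and integer powers grow quickly enough that the dtype matters. The sketch below casts to `int64` before exponentiating, which is a precaution and not something the exercise requires; floor division and remainder are extra operations added for illustration.

```python
import numpy as np

array1 = np.array([10, 20, 30, 40, 50])
array2 = np.array([1, 2, 3, 4, 5])

# Named ufuncs give the same results as the operators
same_sum = np.array_equal(np.add(array1, array2), array1 + array2)
same_power = np.array_equal(np.power(array1, array2), array1 ** array2)
print(same_sum, same_power)

# Floor division and remainder
quotient = array1 // 3
remainder = array1 % 3
print(quotient, remainder)

# Integer powers can overflow a fixed-width dtype; casting to int64 first
# keeps 50 ** 5 exact regardless of the platform's default integer type
powers = array1.astype(np.int64) ** array2
print(powers)
```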

ii. Write a NumPy program to transpose an array.

Program:
import numpy as np

# Create a 2D NumPy array

array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Transpose the array

transposed_array = np.transpose(array)

# Alternatively, you can also use the shorthand `.T` to transpose

# transposed_array = array.T

# Print the original and transposed arrays

print("Original Array:")

print(array)

print("\nTransposed Array:")

print(transposed_array)

output:
Original Array:

[[1 2 3]

[4 5 6]

[7 8 9]]

Transposed Array:

[[1 4 7]

[2 5 8]

[3 6 9]]
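Two properties of the transpose are easy to overlook: it is a view onto the same memory, and for arrays with more than two dimensions the axis order can be given explicitly. A minimal sketch with illustrative shapes:

```python
import numpy as np

array = np.arange(6).reshape(2, 3)
print(array.T.shape)   # (3, 2)

# The transpose is a view: it shares memory with the original array
print(np.shares_memory(array, array.T))

# For more than two dimensions, np.transpose accepts an explicit axis order
cube = np.zeros((2, 3, 4))
reordered = np.transpose(cube, axes=(1, 0, 2))
print(reordered.shape)  # (3, 2, 4)
```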
4) i. Using NumPy, create an array with 5 dimensions and verify that it has 5 dimensions.
Program:
import numpy as np

# Create a 5-dimensional NumPy array with random integers

array_5d = np.random.randint(1, 10, size=(2, 3, 4, 5, 6))

# Verify the number of dimensions using .ndim

print("Array Shape:", array_5d.shape)

print("Number of Dimensions:", array_5d.ndim)

output:
Array Shape: (2, 3, 4, 5, 6)

Number of Dimensions: 5
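Random data is not the only route to five dimensions. As a sketch of two alternatives, `ndmin` pads a small array with leading axes and `reshape` redistributes existing elements; the shapes used here are arbitrary.

```python
import numpy as np

# ndmin pads the shape with leading axes of length 1
array_5d = np.array([1, 2, 3], ndmin=5)
print(array_5d.shape, array_5d.ndim)

# reshape gives the same dimensionality with chosen axis lengths
reshaped = np.arange(24).reshape(1, 2, 3, 2, 2)
print(reshaped.shape, reshaped.ndim)
```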

ii. Using NumPy, sort a boolean array.

Program:
import numpy as np

# Create a boolean NumPy array

boolean_array = np.array([True, False, True, False, True, False])

# Sort the boolean array

sorted_array = np.sort(boolean_array)

# Print the original and sorted arrays

print("Original Boolean Array:", boolean_array)

print("Sorted Boolean Array:", sorted_array)

output:
Original Boolean Array: [ True False True False True False]

Sorted Boolean Array: [False False False True True True]
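`np.sort` only sorts in ascending order, so a descending result has to be derived from it. A minimal sketch, which also counts the `True` values to show where the boundary falls:

```python
import numpy as np

boolean_array = np.array([True, False, True, False, True, False])

# False sorts before True because booleans compare as 0 and 1
ascending = np.sort(boolean_array)

# Reversing the ascending result gives a descending sort
descending = ascending[::-1]

# The number of True values marks the boundary between the two groups
true_count = np.count_nonzero(boolean_array)
print(descending, true_count)
```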

5) i. Create your own simple Pandas DataFrame and print its values.
Program:
import pandas as pd

# Create a simple dictionary with data

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']
}

# Create a DataFrame from the dictionary

df = pd.DataFrame(data)

# Print the DataFrame

print(df)

output:
Name Age City
0 Alice 24 New York
1 Bob 27 Los Angeles
2 Charlie 22 Chicago
3 David 32 Houston
4 Eve 29 Phoenix
ii. Create your own DataFrame from a dict of ndarrays/lists.
Program:
import pandas as pd

import numpy as np

# Create a dictionary with NumPy arrays or lists

data = {
    'Product': ['Laptop', 'Phone', 'Tablet', 'Monitor', 'Keyboard'],
    'Price': np.array([1000, 600, 300, 250, 100]),
    'Stock': np.array([50, 200, 150, 80, 500])
}

# Create a DataFrame from the dictionary

df = pd.DataFrame(data)

# Print the DataFrame

print(df)

output:

Product Price Stock

0 Laptop 1000 50
1 Phone 600 200
2 Tablet 300 150
3 Monitor 250 80
4 Keyboard 100 500
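Once the DataFrame exists, its values can be pulled back out as NumPy arrays or combined into new columns. The sketch below uses a shortened version of the product table; the `Inventory Value` column is a made-up example and not part of the exercise.

```python
import numpy as np
import pandas as pd

data = {
    'Product': ['Laptop', 'Phone', 'Tablet'],
    'Price': np.array([1000, 600, 300]),
    'Stock': [50, 200, 150],
}
df = pd.DataFrame(data)

# Column dtypes are inferred per column
print(df.dtypes)

# to_numpy() returns a column's values as a NumPy array
price_values = df['Price'].to_numpy()
print(price_values)

# A derived column can be computed from existing ones
df['Inventory Value'] = df['Price'] * df['Stock']
print(df)
```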
6. Perform appending, slicing, addition and deletion of rows with a Pandas
DataFrame.

Program:
import pandas as pd

# Create a simple DataFrame

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}

df = pd.DataFrame(data)

# Print the original DataFrame

print("Original DataFrame:")

print(df)

# 1. Appending a new row to the DataFrame

new_row = {'Name': 'Eve', 'Age': 29, 'City': 'Phoenix'}

# DataFrame.append was removed in pandas 2.0; pd.concat adds the row instead
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)

print("\nDataFrame after appending a new row:")

print(df)

# 2. Slicing the DataFrame (selecting specific rows)

sliced_df = df[1:3] # Selecting rows 1 and 2 (indexing starts from 0)

print("\nSliced DataFrame (rows 1 to 2):")

print(sliced_df)

# 3. Adding a new row with 'loc'

df.loc[len(df)] = ['Frank', 30, 'Dallas']

print("\nDataFrame after adding a new row with 'loc':")

print(df)

# 4. Deleting a row (deleting row with index 2)

df = df.drop(2)
print("\nDataFrame after deleting row with index 2:")

print(df)

output:
Original DataFrame:

Name Age City

0 Alice 24 New York

1 Bob 27 Los Angeles

2 Charlie 22 Chicago

3 David 32 Houston

DataFrame after appending a new row:

Name Age City

0 Alice 24 New York

1 Bob 27 Los Angeles

2 Charlie 22 Chicago

3 David 32 Houston

4 Eve 29 Phoenix

Sliced DataFrame (rows 1 to 2):

Name Age City

1 Bob 27 Los Angeles

2 Charlie 22 Chicago

DataFrame after adding a new row with 'loc':

Name Age City

0 Alice 24 New York

1 Bob 27 Los Angeles

2 Charlie 22 Chicago

3 David 32 Houston

4 Eve 29 Phoenix

5 Frank 30 Dallas
DataFrame after deleting row with index 2:

Name Age City

0 Alice 24 New York

1 Bob 27 Los Angeles

3 David 32 Houston

4 Eve 29 Phoenix

5 Frank 30 Dallas
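A few variations on the same row operations, sketched on a two-column version of the table: appending several rows at once, the difference between positional and label-based slicing, and renumbering the index after a deletion. The extra rows are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
})

# Append several rows at once by concatenating a second DataFrame
extra = pd.DataFrame([{'Name': 'Eve', 'Age': 29}, {'Name': 'Frank', 'Age': 30}])
df = pd.concat([df, extra], ignore_index=True)

# iloc slices by position and excludes the end; loc slices by label and includes it
by_position = df.iloc[1:3]
by_label = df.loc[1:3]
print(len(by_position), len(by_label))

# After a drop, reset_index renumbers the remaining rows from 0
df = df.drop(2).reset_index(drop=True)
print(df)
```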

7.i. Using Pandas, Create a DataFrame with a list of dictionaries, row indices,
and column indices.
Program:

import pandas as pd

# Create a list of dictionaries

data = [
    {'Name': 'Alice', 'Age': 24, 'City': 'New York'},
    {'Name': 'Bob', 'Age': 27, 'City': 'Los Angeles'},
    {'Name': 'Charlie', 'Age': 22, 'City': 'Chicago'},
    {'Name': 'David', 'Age': 32, 'City': 'Houston'}
]

# Define custom row indices and column indices

row_indices = ['A', 'B', 'C', 'D']

column_indices = ['Name', 'Age', 'City']

# Create the DataFrame

df = pd.DataFrame(data, index=row_indices, columns=column_indices)

# Print the DataFrame

print(df)
output:
Name Age City

A Alice 24 New York

B Bob 27 Los Angeles

C Charlie 22 Chicago

D David 32 Houston

ii. Use index label to delete or drop rows from a Pandas DataFrame.
Program:
import pandas as pd

# Create a simple DataFrame

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}

df = pd.DataFrame(data)

# Set custom row indices

df.index = ['A', 'B', 'C', 'D']

# Print the original DataFrame

print("Original DataFrame:")

print(df)

# 1. Drop a row by index label (e.g., drop row with index 'B')

df_dropped = df.drop('B')

print("\nDataFrame after dropping row with index 'B':")

print(df_dropped)

# 2. Drop multiple rows by index labels (e.g., drop rows with index 'A' and 'D')

df_dropped_multiple = df.drop(['A', 'D'])

print("\nDataFrame after dropping rows with index 'A' and 'D':")

print(df_dropped_multiple)
# 3. Drop a row in-place (this will modify the original DataFrame)

df.drop('C', inplace=True)

print("\nDataFrame after dropping row with index 'C' in-place:")

print(df)

output:
Original DataFrame:

Name Age City

A Alice 24 New York

B Bob 27 Los Angeles

C Charlie 22 Chicago

D David 32 Houston

DataFrame after dropping row with index 'B':

Name Age City

A Alice 24 New York

C Charlie 22 Chicago

D David 32 Houston

DataFrame after dropping rows with index 'A' and 'D':

Name Age City

B Bob 27 Los Angeles

C Charlie 22 Chicago

DataFrame after dropping row with index 'C' in-place:

Name Age City

A Alice 24 New York

B Bob 27 Los Angeles

D David 32 Houston
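`drop` raises a `KeyError` when a label is missing, and rows are often removed by condition instead of by label. A minimal sketch of both cases; the label `'Z'` and the age threshold are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie', 'David'], 'Age': [24, 27, 22, 32]},
    index=['A', 'B', 'C', 'D'],
)

# errors='ignore' skips labels that are not present instead of raising KeyError
safe_drop = df.drop(['B', 'Z'], errors='ignore')

# Boolean filtering keeps rows by condition rather than by label
over_25 = df[df['Age'] > 25]

print(safe_drop.index.tolist())
print(over_25.index.tolist())
```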
8. Using the Pandas library:
i. Load the iris.csv file.
ii. Convert it into a DataFrame and read it.
iii. Display only the records with species "Iris-setosa".
Program:
import pandas as pd

# Step 1: Load the iris CSV file into a Pandas DataFrame

# Replace 'iris.csv' with the correct file path if necessary

df = pd.read_csv('iris.csv')

# Step 2: Display the entire DataFrame or the first few rows to ensure it's loaded correctly

print("First few records of the DataFrame:")

print(df.head())

# Step 3: Display only the records with species 'Iris-setosa'

setosa_df = df[df['species'] == 'Iris-setosa']

# Display the filtered DataFrame

print("\nRecords with species 'Iris-setosa':")

print(setosa_df)

output:
First few records of the DataFrame:

sepal_length sepal_width petal_length petal_width species

0 5.1 3.5 1.4 0.2 Iris-setosa

1 4.9 3.0 1.4 0.2 Iris-setosa

2 4.7 3.2 1.3 0.2 Iris-setosa

3 4.6 3.1 1.5 0.2 Iris-setosa

4 5.0 3.6 1.4 0.2 Iris-setosa

Records with species 'Iris-setosa':

sepal_length sepal_width petal_length petal_width species

0 5.1 3.5 1.4 0.2 Iris-setosa

1 4.9 3.0 1.4 0.2 Iris-setosa

2 4.7 3.2 1.3 0.2 Iris-setosa

3 4.6 3.1 1.5 0.2 Iris-setosa

4 5.0 3.6 1.4 0.2 Iris-setosa

...
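Copies of the iris file differ in their column names and species labels, so it can help to inspect the labels before filtering. The sketch below assumes the `species` column and `Iris-` prefixed labels used in this exercise, and reads a few illustrative rows from an in-memory string instead of the real file.

```python
import io
import pandas as pd

# Illustrative rows only; this stands in for the real iris.csv
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""
df = pd.read_csv(io.StringIO(csv_text))

# Inspect the labels actually present before filtering on one of them
print(df['species'].value_counts())

# query() is an alternative to the boolean-mask filter
setosa_df = df.query("species == 'Iris-setosa'")
print(setosa_df)
```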

9. Using the diabetes data set from UCI, perform univariate analysis.
Program:

import pandas as pd

import matplotlib.pyplot as plt

import seaborn as sns

# Step 1: Load the diabetes dataset from the UCI repository

# You can replace this URL with the actual URL of the dataset or load it from a local file.

url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv'

columns = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin',

'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome']

df = pd.read_csv(url, names=columns)

# Step 2: Check the first few rows of the dataset

print(df.head())

# Step 3: Summary statistics for numerical features

print("\nSummary Statistics:")

print(df.describe())

# Step 4: Visualizing the distribution of each feature (Univariate Analysis)

# Histograms for all features

df.hist(bins=20, figsize=(15,10))

plt.tight_layout()

plt.show()

# Step 5: Boxplots for all features to check for outliers

plt.figure(figsize=(15, 10))
sns.boxplot(data=df)

plt.xticks(rotation=45)

plt.tight_layout()

plt.show()

# Step 6: Checking the distribution of 'Outcome' (Diabetes status)

sns.countplot(x='Outcome', data=df)

plt.title('Distribution of Outcome (Diabetes Status)')

plt.show()

output:
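As a numeric complement to the plots in this exercise, per-column statistics can be collected in one table. The sketch uses a handful of illustrative rows in the dataset's column layout so that it runs without a download. Treating zeros in the measurement columns as missing values is a common convention for this dataset, stated here as an assumption.

```python
import pandas as pd

columns = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin',
           'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome']

# Illustrative rows in the dataset's column layout (not the full data)
rows = [
    [6, 148, 72, 35, 0, 33.6, 0.627, 50, 1],
    [1, 85, 66, 29, 0, 26.6, 0.351, 31, 0],
    [8, 183, 64, 0, 0, 23.3, 0.672, 32, 1],
    [1, 89, 66, 23, 94, 28.1, 0.167, 21, 0],
    [0, 137, 40, 35, 168, 43.1, 2.288, 33, 1],
]
df = pd.DataFrame(rows, columns=columns)

# Per-column centre, spread, and shape measures
summary = pd.DataFrame({
    'median': df.median(),
    'iqr': df.quantile(0.75) - df.quantile(0.25),
    'skew': df.skew(),
})
print(summary)

# Assumption: a zero in these columns stands for a missing measurement
measurement_columns = ['Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI']
zero_counts = (df[measurement_columns] == 0).sum()
print(zero_counts)
```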

10. Using the Pima Indians Diabetes data set, perform bivariate analysis.
Program:

import pandas as pd

import seaborn as sns

import matplotlib.pyplot as plt

# Load the dataset from the UCI repository or local file

url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv'

columns = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin',

'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome']

df = pd.read_csv(url, names=columns)

# Display first few rows of the dataset

print(df.head())

# Step 1: Correlation Heatmap to analyze relationships between numerical features

plt.figure(figsize=(10, 8))

correlation_matrix = df.corr()

sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f', linewidths=0.5)

plt.title('Correlation Heatmap of Diabetes Dataset')

plt.show()

# Step 2: Scatter plots between features and target variable 'Outcome'

plt.figure(figsize=(15, 10))

# Plotting scatter plot for 'Glucose' vs 'Outcome'

plt.subplot(2, 3, 1)

sns.scatterplot(x='Glucose', y='Outcome', data=df)

plt.title('Glucose vs Outcome')

# Plotting scatter plot for 'BMI' vs 'Outcome'

plt.subplot(2, 3, 2)

sns.scatterplot(x='BMI', y='Outcome', data=df)

plt.title('BMI vs Outcome')

# Plotting scatter plot for 'Age' vs 'Outcome'

plt.subplot(2, 3, 3)
sns.scatterplot(x='Age', y='Outcome', data=df)

plt.title('Age vs Outcome')

# Plotting scatter plot for 'Insulin' vs 'Outcome'

plt.subplot(2, 3, 4)

sns.scatterplot(x='Insulin', y='Outcome', data=df)

plt.title('Insulin vs Outcome')

# Plotting scatter plot for 'BloodPressure' vs 'Outcome'

plt.subplot(2, 3, 5)

sns.scatterplot(x='BloodPressure', y='Outcome', data=df)

plt.title('BloodPressure vs Outcome')

# Plotting scatter plot for 'Pregnancies' vs 'Outcome'

plt.subplot(2, 3, 6)

sns.scatterplot(x='Pregnancies', y='Outcome', data=df)

plt.title('Pregnancies vs Outcome')

plt.tight_layout()

plt.show()

# Step 3: Pairplot to visualize the relationships between multiple features and 'Outcome'

sns.pairplot(df, hue='Outcome', diag_kind='hist', markers=["o", "s"])

plt.suptitle('Pairplot of Features with Outcome', y=1.02)

plt.show()
output:
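Because `Outcome` is binary, a feature-versus-outcome relationship can also be summarised with group means and a correlation coefficient. The sketch below uses a few illustrative rows with three columns, not the full dataset.

```python
import pandas as pd

# Illustrative rows only (Glucose, BMI, Outcome), not the full dataset
df = pd.DataFrame({
    'Glucose': [148, 85, 183, 89, 137],
    'BMI': [33.6, 26.6, 23.3, 28.1, 43.1],
    'Outcome': [1, 0, 1, 0, 1],
})

# Mean of each feature within each outcome group
group_means = df.groupby('Outcome')[['Glucose', 'BMI']].mean()
print(group_means)

# Pearson correlation of each feature with the binary outcome
# (for a 0/1 variable this equals the point-biserial correlation)
outcome_corr = df.corr()['Outcome'].drop('Outcome')
print(outcome_corr)
```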

11. Perform multiple regression analysis on your own dataset (for example, a car dataset with the fields Company Name, Model, Volume, Weight, CO2) with more than one independent variable, to predict a value based on two or more variables.
Program:
# Import necessary libraries

import pandas as pd

import statsmodels.api as sm

from sklearn.model_selection import train_test_split

from sklearn.linear_model import LinearRegression

from sklearn.metrics import mean_squared_error, r2_score

import matplotlib.pyplot as plt

# Step 1: Create or Load your Dataset

# Sample data representing car information

data = {
    'Company Name': ['Toyota', 'Honda', 'Ford', 'BMW', 'Audi'],
    'Model': ['Corolla', 'Civic', 'Focus', 'X5', 'A4'],
    'Volume': [1.8, 2.0, 1.5, 3.0, 2.5],  # Engine volume in liters
    'Weight': [1300, 1200, 1400, 2000, 1800],  # Weight in kilograms
    'CO2': [120, 110, 140, 200, 180]  # CO2 emissions in grams per km
}

# Convert to DataFrame

df = pd.DataFrame(data)

# Step 2: Preprocess the Data

# Since we are predicting CO2 based on Volume and Weight, we can drop 'Company Name' and 'Model' for now

df = df.drop(columns=['Company Name', 'Model'])

# Independent variables (Volume, Weight)

X = df[['Volume', 'Weight']]

# Dependent variable (CO2)

y = df['CO2']

# Step 3: Add a constant to the independent variables (for intercept)

X = sm.add_constant(X)

# Step 4: Perform Multiple Regression using statsmodels

model = sm.OLS(y, X).fit()

# Step 5: Display the summary of the regression analysis

print("Multiple Regression Analysis Summary (statsmodels):")

print(model.summary())

# Step 6: Perform Multiple Regression using scikit-learn

# Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(df[['Volume', 'Weight']], df['CO2'], test_size=0.2,

random_state=42)

# Initialize the Linear Regression model

regressor = LinearRegression()
# Train the model

regressor.fit(X_train, y_train)

# Predict on the test set

y_pred = regressor.predict(X_test)

# Step 7: Evaluate the model

print("\nMultiple Regression Analysis using scikit-learn:")

print(f"Coefficients: {regressor.coef_}")

print(f"Intercept: {regressor.intercept_}")

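# Calculate R-squared value and Mean Squared Error (MSE)
# Note: with 5 rows and test_size=0.2 the test set contains a single sample, so
# R-squared is not well-defined there (scikit-learn warns and returns nan)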

r2 = r2_score(y_test, y_pred)

mse = mean_squared_error(y_test, y_pred)

print(f"R-squared: {r2}")

print(f"Mean Squared Error: {mse}")

# Step 8: Plotting the results

plt.scatter(y_test, y_pred)

plt.xlabel("Actual CO2")

plt.ylabel("Predicted CO2")

plt.title("Actual vs Predicted CO2")

plt.show()

output:
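The coefficients can also be obtained directly from the least-squares problem, which shows what the two libraries are solving. This is a sketch using NumPy only, fitted on all five rows, so the R-squared it prints describes the training data and not predictive accuracy.

```python
import numpy as np

volume = np.array([1.8, 2.0, 1.5, 3.0, 2.5])
weight = np.array([1300, 1200, 1400, 2000, 1800], dtype=float)
co2 = np.array([120, 110, 140, 200, 180], dtype=float)

# Design matrix: a column of ones for the intercept, then the two predictors
X = np.column_stack([np.ones_like(volume), volume, weight])

# Ordinary least squares: minimise ||X @ beta - co2||^2
beta, _, _, _ = np.linalg.lstsq(X, co2, rcond=None)
intercept, volume_coef, weight_coef = beta

predicted = X @ beta
ss_res = np.sum((co2 - predicted) ** 2)
ss_tot = np.sum((co2 - co2.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"intercept={intercept:.3f}, volume={volume_coef:.3f}, weight={weight_coef:.4f}")
print(f"R-squared on the training data: {r_squared:.4f}")
```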
12. Perform bivariate analysis using a pandas DataFrame that contains information about two variables: (1) hours spent studying and (2) exam score received by 20 different students.
Program:
import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

import seaborn as sns

from scipy.stats import pearsonr

# Step 1: Create the DataFrame

data = {
    'Hours Studying': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
    'Exam Score': [35, 40, 50, 60, 65, 70, 75, 80, 85, 88, 90, 92, 94, 95, 96, 98, 99, 99, 100, 100]
}

# Convert the dictionary to a pandas DataFrame

df = pd.DataFrame(data)

# Step 2: Descriptive Statistics

print("Descriptive Statistics:")

print(df.describe())

# Step 3: Calculate Correlation

correlation, _ = pearsonr(df['Hours Studying'], df['Exam Score'])

print(f"\nCorrelation between Hours Studying and Exam Score: {correlation:.2f}")

# Step 4: Scatter Plot

plt.figure(figsize=(8, 6))

plt.scatter(df['Hours Studying'], df['Exam Score'], color='blue', label='Data Points')

plt.title('Hours Studying vs Exam Score')

plt.xlabel('Hours Studying')

plt.ylabel('Exam Score')

plt.grid(True)

plt.legend()
plt.show()

# Step 5: Linear Regression Line (Fit a regression line)

sns.regplot(x='Hours Studying', y='Exam Score', data=df, scatter_kws={'color':'blue'},

line_kws={'color':'red'})

plt.title('Linear Regression Line: Hours Studying vs Exam Score')

plt.xlabel('Hours Studying')

plt.ylabel('Exam Score')

plt.show()

output:
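The Pearson coefficient reported by `pearsonr` can be reproduced from its definition, and the regression line drawn by `regplot` corresponds to a degree-1 least-squares fit. A minimal NumPy-only sketch on the same 20 data points:

```python
import numpy as np

hours = np.arange(1, 21)
scores = np.array([35, 40, 50, 60, 65, 70, 75, 80, 85, 88,
                   90, 92, 94, 95, 96, 98, 99, 99, 100, 100])

# Pearson r = sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2)), written out explicitly
dx = hours - hours.mean()
dy = scores - scores.mean()
r_manual = np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))

# np.corrcoef returns the full 2x2 correlation matrix
r_numpy = np.corrcoef(hours, scores)[0, 1]

# Least-squares line: scores ~ slope * hours + intercept
slope, intercept = np.polyfit(hours, scores, deg=1)

print(f"r = {r_manual:.2f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```

The scores level off near 100, so a straight line is only an approximation of this relationship.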

13. Perform univariate analysis with the following pandas DataFrame:

'points': [1, 1, 2, 3.5, 4, 4, 4, 5, 5, 6.5, 7, 7.4, 8, 13, 14.2]
'assists': [5, 7, 7, 9, 12, 9, 9, 4, 6, 8, 8, 9, 3, 2, 6]
'rebounds': [11, 8, 10, 6, 6, 5, 9, 12, 6, 6, 7, 8, 7, 9, 15]

Program:
import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

import seaborn as sns

# Step 1: Create the DataFrame

data = {
    'points': [1, 1, 2, 3.5, 4, 4, 4, 5, 5, 6.5, 7, 7.4, 8, 13, 14.2],
    'assists': [5, 7, 7, 9, 12, 9, 9, 4, 6, 8, 8, 9, 3, 2, 6],
    'rebounds': [11, 8, 10, 6, 6, 5, 9, 12, 6, 6, 7, 8, 7, 9, 15]
}

# Convert the dictionary to a pandas DataFrame

df = pd.DataFrame(data)

# Step 2: Descriptive Statistics for each column

print("Descriptive Statistics:")

print(df.describe())

# Step 3: Visualizing the Distribution of each variable

# Plot histograms for each variable

plt.figure(figsize=(12, 6))

# Histogram for 'points'

plt.subplot(1, 3, 1)

sns.histplot(df['points'], kde=True, color='blue', bins=10)

plt.title('Distribution of Points')

plt.xlabel('Points')

plt.ylabel('Frequency')

# Histogram for 'assists'

plt.subplot(1, 3, 2)

sns.histplot(df['assists'], kde=True, color='green', bins=10)

plt.title('Distribution of Assists')
plt.xlabel('Assists')

plt.ylabel('Frequency')

# Histogram for 'rebounds'

plt.subplot(1, 3, 3)

sns.histplot(df['rebounds'], kde=True, color='red', bins=10)

plt.title('Distribution of Rebounds')

plt.xlabel('Rebounds')

plt.ylabel('Frequency')

plt.tight_layout()

plt.show()

# Step 4: Box plots to visualize outliers

plt.figure(figsize=(12, 6))

# Box plot for 'points'

plt.subplot(1, 3, 1)

sns.boxplot(y=df['points'], color='blue')

plt.title('Boxplot of Points')

# Box plot for 'assists'

plt.subplot(1, 3, 2)

sns.boxplot(y=df['assists'], color='green')

plt.title('Boxplot of Assists')

# Box plot for 'rebounds'

plt.subplot(1, 3, 3)

sns.boxplot(y=df['rebounds'], color='red')

plt.title('Boxplot of Rebounds')
plt.tight_layout()

plt.show()

# Step 5: Skewness and Kurtosis

from scipy.stats import skew, kurtosis

# Skewness and Kurtosis for 'points'

points_skew = skew(df['points'])

points_kurt = kurtosis(df['points'])

# Skewness and Kurtosis for 'assists'

assists_skew = skew(df['assists'])

assists_kurt = kurtosis(df['assists'])

# Skewness and Kurtosis for 'rebounds'

rebounds_skew = skew(df['rebounds'])

rebounds_kurt = kurtosis(df['rebounds'])

print("\nSkewness and Kurtosis:")

print(f"Points: Skewness = {points_skew:.2f}, Kurtosis = {points_kurt:.2f}")

print(f"Assists: Skewness = {assists_skew:.2f}, Kurtosis = {assists_kurt:.2f}")

print(f"Rebounds: Skewness = {rebounds_skew:.2f}, Kurtosis = {rebounds_kurt:.2f}")

output:
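The outliers that a box plot marks can be listed numerically with the 1.5 × IQR rule, which is the default whisker rule in matplotlib and seaborn. A sketch for the `points` column only:

```python
import pandas as pd

points = pd.Series([1, 1, 2, 3.5, 4, 4, 4, 5, 5, 6.5, 7, 7.4, 8, 13, 14.2])

# Central tendency
median = points.median()
mode = points.mode().iloc[0]

# Interquartile range and the 1.5 * IQR fences
q1, q3 = points.quantile([0.25, 0.75])
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values outside the fences are the ones a box plot draws as individual points
outliers = points[(points < lower_fence) | (points > upper_fence)]
print(f"median={median}, mode={mode}, IQR={iqr:.2f}")
print("Outliers:", outliers.tolist())
```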
14. i) Using various functions in the NumPy library, mathematically calculate the values for a normal distribution and create histograms to plot the probability distribution curve.
Program:
import numpy as np

import matplotlib.pyplot as plt

# Step 1: Parameters for the normal distribution

mu = 0 # Mean of the distribution

sigma = 1 # Standard deviation

size = 10000 # Number of data points to generate

# Step 2: Generate random samples from a normal distribution

data = np.random.normal(mu, sigma, size)

# Step 3: Plot the histogram

plt.figure(figsize=(10, 6))
count, bins, ignored = plt.hist(data, bins=30, density=True, alpha=0.6, color='g')

# Step 4: Calculate the Probability Density Function (PDF)

# Define the normal distribution function

def normal_distribution(x, mu, sigma):
    return (1/np.sqrt(2 * np.pi * sigma**2)) * np.exp(-0.5 * ((x - mu) / sigma)**2)

# Step 5: Generate points for the normal distribution curve

x_values = np.linspace(min(bins), max(bins), 100)

pdf_values = normal_distribution(x_values, mu, sigma)

# Step 6: Plot the PDF curve over the histogram

plt.plot(x_values, pdf_values, 'k', linewidth=2)

plt.title("Normal Distribution with Histogram")

plt.xlabel("Data points")

plt.ylabel("Density")

plt.grid(True)

plt.show()

output:
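Two numeric checks support the plot: the PDF formula should integrate to 1, and a histogram drawn with `density=True` should have a total bar area of 1. The sketch below uses a seeded `default_rng` generator (an assumption, chosen for reproducibility) in place of `np.random.normal`.

```python
import numpy as np

mu, sigma = 0.0, 1.0

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# A density must integrate to 1; approximate the integral with a Riemann sum
x = np.linspace(mu - 6 * sigma, mu + 6 * sigma, 2001)
dx = x[1] - x[0]
area = np.sum(normal_pdf(x, mu, sigma)) * dx

# Seeded generator so the sample statistics are reproducible
rng = np.random.default_rng(0)
samples = rng.normal(mu, sigma, 10_000)

# With density=True, bar heights times bin widths also sum to 1
density, edges = np.histogram(samples, bins=30, density=True)
hist_area = np.sum(density * np.diff(edges))

print(f"PDF area ~ {area:.4f}, histogram area = {hist_area:.4f}")
print(f"sample mean = {samples.mean():.3f}, sample std = {samples.std():.3f}")
```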
14. ii) Using the plt.contour(), plt.contourf(), plt.imshow(), plt.colorbar(), and plt.clabel() functions, visualize a contour plot.
Program:
import numpy as np

import matplotlib.pyplot as plt

# Create some sample data

x = np.linspace(-3, 3, 100)

y = np.linspace(-3, 3, 100)

X, Y = np.meshgrid(x, y)

Z = np.sin(X**2 + Y**2) / (X**2 + Y**2)

# Create a contour plot

plt.contour(X, Y, Z, levels=20, cmap='viridis')

# Create a filled contour plot

plt.contourf(X, Y, Z, levels=20, cmap='viridis', alpha=0.7)

# Add a colorbar

plt.colorbar()

# Add labels to the contour lines

plt.clabel(plt.contour(X, Y, Z, levels=20, colors='k'), inline=True, fontsize=10)

# Display the plot

plt.show()

output:
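The task statement lists `plt.imshow()`, which the program does not call. The following sketch shows one way it could be combined with labelled contour lines. It uses the object-oriented `Axes` methods that correspond to the `plt.*` functions. The `Agg` backend and the in-memory PNG are assumptions made so that the sketch runs without a display.

```python
import io
import matplotlib
matplotlib.use("Agg")   # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
R2 = X**2 + Y**2
Z = np.sin(R2) / R2

fig, ax = plt.subplots(figsize=(6, 5))

# imshow draws Z as an image; extent maps pixel indices to data coordinates,
# and origin='lower' puts y = -3 at the bottom to match the contour axes
im = ax.imshow(Z, extent=[-3, 3, -3, 3], origin='lower', cmap='viridis')
fig.colorbar(im, ax=ax)

# One contour set, kept in a variable so clabel labels those same lines
cs = ax.contour(X, Y, Z, levels=8, colors='k', linewidths=0.8)
ax.clabel(cs, inline=True, fontsize=8)

# Render to an in-memory PNG instead of calling plt.show()
buffer = io.BytesIO()
fig.savefig(buffer, format='png')
plt.close(fig)
```

In an interactive session, the backend line can be dropped and `plt.show()` used in place of `savefig`.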
15. Make a three-dimensional plot with 50 randomly generated data points for x, y, and z. Set the point color to red and the point size to 50.

Program:
import matplotlib.pyplot as plt

from mpl_toolkits.mplot3d import Axes3D

import numpy as np

# Generate 50 random data points for x, y, and z

np.random.seed(42) # Set a seed for reproducibility

x = np.random.rand(50) * 10

y = np.random.rand(50) * 10

z = np.random.rand(50) * 10

# Create a 3D plot

fig = plt.figure()

ax = fig.add_subplot(111, projection='3d')

# Plot the points with specified color and size

ax.scatter(x, y, z, c='red', s=50)

# Set labels for axes

ax.set_xlabel('X')

ax.set_ylabel('Y')

ax.set_zlabel('Z')

# Show the plot

plt.show()

output:
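The coordinates can also be generated in one call with NumPy's `default_rng` generator. This is a sketch of an alternative to `np.random.seed` and `np.random.rand`, not the method the exercise uses, and its seeded values differ from those of the legacy functions.

```python
import numpy as np

# default_rng is NumPy's newer random API; the seed value 42 is arbitrary
rng = np.random.default_rng(42)

# One (50, 3) array holds all coordinates; the columns are x, y, z
points = rng.uniform(0, 10, size=(50, 3))
x, y, z = points.T

print(points.shape)
print(points.min() >= 0, points.max() < 10)
```

The resulting `x`, `y`, and `z` arrays can be passed to `ax.scatter(x, y, z, c='red', s=50)` in the same way as before.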
