Experiment 1 To 4

EXPERIMENT NO 1

Aim: - To install Anaconda on our system.

Theory : -
Steps to install Anaconda: -

Step 1: Download Anaconda


1. Go to the Anaconda website: Open your browser and go to the official Anaconda website.
2. Download the installer: Choose the appropriate version for your operating system (Windows, macOS, or Linux) and click on the download link. This will download the Anaconda installer.

Step 2: Install Anaconda


1. Run the installer: Locate the downloaded installer file and run it.
2. Follow the installation prompts:
1. Windows: Follow the prompts in the Anaconda Installer. You can use the default settings unless you have specific needs.
2. macOS: Open the downloaded .pkg file and follow the instructions on the screen.
3. Linux: Open a terminal and run the downloaded .sh file using the command bash Anaconda3-*.sh. Follow the prompts in the terminal.

Step 3: Verify the Installation


1. Open a terminal or command prompt:
1. Windows: Open the Anaconda Prompt from the Start Menu.
2. macOS/Linux: Open the Terminal application.
2. Check the installation: Type conda list and press Enter. If the installation was successful, you will see a list of installed packages.

Step 4: Launch Jupyter Notebook


1. Open Anaconda Navigator:
1. Windows: Open Anaconda Navigator from the Start Menu.
2. macOS/Linux: Open Anaconda Navigator from the Applications folder or by typing anaconda-navigator in the terminal.
2. Launch Jupyter Notebook: In Anaconda Navigator, find the Jupyter Notebook option and click "Launch." This will open Jupyter Notebook in your default web browser.

Step 5: Using Jupyter Notebook


1. Create a new notebook: In the Jupyter Notebook interface, navigate to the directory where you want to save your notebooks. Click the "New" button on the right and select "Python 3" to create a new notebook.
2. Start coding: You can now start writing and executing Python code in the Jupyter Notebook.
Optional: Setting Up a Virtual Environment
1. Create a virtual environment: Open Anaconda Prompt or Terminal and run conda create -n myenv python=3.8 (replace "myenv" with your desired environment name).
2. Activate the environment: Run conda activate myenv.
3. Install Jupyter in the environment: Run conda install jupyter.

Step 6: Launch Jupyter Notebook in a Virtual Environment


1. Activate your virtual environment: Run conda activate myenv.
2. Launch Jupyter Notebook: Run jupyter notebook.
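
A quick optional check (not part of the original steps) that can be run in the first notebook cell to confirm which interpreter the notebook is using; if the notebook was launched from the activated environment, the printed path should point inside that environment:
python
import sys

# Show the Python version and the path of the interpreter running the notebook
print(sys.version)
print(sys.executable)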

Basic Python Programs

1) Simple Programs
Example 1: Print Statement
python
print("helloworld")
Explanation:
The print function is used to display the specified message or value to the screen.
In this example, the string "helloworld" is passed as an argument to the print function, which outputs the text "helloworld".

Example 2: Variable Assignment and Addition


python
a = 5
b = 7
print("a + b =", a + b)
Explanation:
Variables a and b are assigned the values 5 and 7, respectively.
The expression a + b adds the values of a and b.
The print function outputs the result of the addition along with the string "a + b =".

2) Decision Making Program


Example: Check Even or Odd Number
python
a = 100
if a % 2 == 0:
    print("Even number")
else:
    print("Odd number")
Explanation:
Variable a is assigned the value 100.
The if statement checks whether a is divisible by 2 using the modulus operator %.
If a % 2 equals 0, it means a is an even number, and the program prints "Even number".
Otherwise, the program prints "Odd number" using the else clause.

3) Program Using Loops


Example: While Loop to Print Numbers
python
count = 10
while count < 15:
    count = count + 1
    print(count)
Explanation:
Variable count is assigned the initial value of 10.
The while loop continues to execute as long as the condition count < 15 is true.
Inside the loop, the value of count is incremented by 1 in each iteration.
The print function outputs the current value of count during each iteration.
The loop stops when count reaches 15, printing the numbers 11 to 15.

Screen View: -
EXPERIMENT NO 02
AIM: - To work with Python libraries (NumPy, Pandas, Matplotlib).
Theory: -
NumPy Library in Python

Introduction

NumPy (Numerical Python) is a library for working with arrays and mathematical operations in Python. It's a fundamental package for scientific computing and is widely used in various fields.

Key Features

Multi-dimensional arrays: Efficient and flexible arrays for representing complex data structures.
Vectorized operations: Fast element-wise operations on entire arrays.
Matrix operations: Extensive set of matrix operations, including multiplication and inversion.
Random number generation: Functions for generating random numbers, including uniform and normal distributions.
Advantages

Speed: Faster than Python lists for numerical computations.
Memory efficiency: Uses less memory than Python lists.
Vectorized operations: Easy to perform complex calculations on entire arrays.
Common Functions

numpy.array(): Creates a NumPy array from a Python list.
numpy.zeros(): Creates a NumPy array filled with zeros.
numpy.sum(): Calculates the sum of all elements in a NumPy array.
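
A minimal sketch of these functions; the array values are chosen only for illustration:
python
import numpy as np

# numpy.array(): create a NumPy array from a Python list
arr = np.array([1, 2, 3, 4, 5])

# numpy.zeros(): create an array of five zeros
zeros = np.zeros(5)

# numpy.sum(): add up all elements of the array
total = np.sum(arr)

print(arr)    # [1 2 3 4 5]
print(zeros)  # [0. 0. 0. 0. 0.]
print(total)  # 15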
Real-World Applications
OUTPUT: -

Theory: Pandas Library in Python

Introduction

Pandas is a library for data manipulation and analysis in Python. It provides data structures and functions to efficiently handle and process large datasets, making it a fundamental tool for data science and analytics.

Key Features
DataFrames: Two-dimensional labeled data structures with columns of
potentially different types.
Series: One-dimensional labeled array capable of holding any data type.
Indexing and Selecting: Efficient indexing and selecting of data using
labels or conditional statements.
Data Alignment: Automatic alignment of data during operations,
eliminating the need for manual data matching.
GroupBy and Reshaping: Functions for grouping, aggregating, and
reshaping data.
Advantages

Efficient Data Handling: Optimized for performance, making it suitable for large datasets.
Flexible Data Structures: DataFrames and Series can handle various data types and structures.
Easy Data Manipulation: Intuitive API for data filtering, sorting, and grouping.
Integration with Other Libraries: Seamless integration with other popular data science libraries, such as NumPy and Matplotlib.
Common Functions

pandas.read_csv(): Reads a CSV file into a DataFrame.
pandas.DataFrame(): Creates a DataFrame from a dictionary or other data structure.
pandas.Series(): Creates a Series from a list or other data structure.
df.head(): Displays the first few rows of a DataFrame.
df.groupby(): Groups a DataFrame by one or more columns.
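
A minimal sketch of these functions on a small, made-up dictionary; the column names and values are illustrative only, and pandas.read_csv() is omitted because it needs a CSV file on disk:
python
import pandas as pd

# pandas.DataFrame(): create a DataFrame from a dictionary (illustrative data)
df = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "dept": ["HR", "IT", "IT", "HR"],
    "salary": [30000, 45000, 50000, 32000]
})

# pandas.Series(): create a Series from a list
s = pd.Series([10, 20, 30])

# df.head(): display the first few rows
print(df.head())

# df.groupby(): group by department and compute the average salary
print(df.groupby("dept")["salary"].mean())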
Real-World Applications

Data Analysis: Data cleaning, filtering, and visualization.
Data Science: Data preprocessing, feature engineering, and model implementation.
Business Intelligence: Data reporting, dashboard creation, and business analytics.

OUTPUT: -
Theory: Matplotlib Library in Python

Introduction

Matplotlib is a popular data visualization library in Python that provides a comprehensive set of tools for creating high-quality 2D and 3D plots, charts, and graphs. It's widely used in various fields, including scientific research, data analysis, and machine learning.
Key Features

Plotting: Creates a wide range of plots, including line plots, scatter plots,
bar charts, histograms, and more.
Customization: Offers extensive customization options for plot
appearance, including colors, fonts, labels, and titles.
Interactive Plots: Supports interactive plots with zooming, panning, and
hover-over text.
3D Plotting: Creates 3D plots and charts for visualizing complex data.
Integration: Seamlessly integrates with other popular Python libraries,
such as NumPy, Pandas, and Scikit-learn.
Advantages

Easy to Use: Simple and intuitive API for creating plots and charts.
High-Quality Output: Produces high-quality, publication-ready plots and
charts.
Customizable: Offers extensive customization options for plot
appearance.
Flexible: Supports a wide range of plot types and data formats.
Large Community: Active community and extensive documentation
make it easy to find help and resources.
Common Functions

matplotlib.pyplot.plot(): Creates a line plot.
matplotlib.pyplot.scatter(): Creates a scatter plot.
matplotlib.pyplot.bar(): Creates a bar chart.
matplotlib.pyplot.hist(): Creates a histogram.
matplotlib.pyplot.show(): Displays the plot.
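
A minimal sketch using a few of these functions; the data values are only for illustration:
python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

# matplotlib.pyplot.plot(): line plot of y against x
plt.plot(x, y, label="line")

# matplotlib.pyplot.scatter(): scatter plot of the same points
plt.scatter(x, y, color="red", label="points")

plt.xlabel("x")
plt.ylabel("y")
plt.title("Simple Matplotlib example")
plt.legend()

# matplotlib.pyplot.show(): display the plot
plt.show()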

Real-World Applications

Data Analysis: Visualizing data to identify trends, patterns, and correlations.
Scientific Research: Creating plots and charts to present research findings.

OUTPUT: -
EXPERIMENT NO 03
AIM: - DATA PROCESSING AND DATA CLEANING
THEORY:
Data Processing And Cleaning

Introduction

Data Processing is a crucial step in the data science workflow that involves transforming raw data into a format that is suitable for analysis and modeling. It's an essential step that ensures data quality, completeness, and consistency, and is a critical component of data preparation.

Key Concepts

Data Cleaning: Identifying and correcting errors, inconsistencies, and inaccuracies in the data.
Data Transformation: Converting data from one format to another to make it more suitable for analysis.
Data Reduction: Selecting a subset of the most relevant data to reduce dimensionality and improve model performance.
Data Integration: Combining data from multiple sources into a single, unified view.

Data Processing Steps

Data Ingestion: Collecting and gathering data from various sources.
Data Cleaning: Identifying and correcting errors, inconsistencies, and inaccuracies in the data.
Data Transformation: Converting data from one format to another to make it more suitable for analysis.
Data Reduction: Selecting a subset of the most relevant data to reduce dimensionality and improve model performance.
Data Integration: Combining data from multiple sources into a single, unified view.
Data Quality Check: Verifying the quality and integrity of the processed data.
Data Processing Techniques

Handling Missing Values: Replacing missing values with mean, median, or imputed values.
Data Normalization: Scaling numerical data to a common range to prevent feature dominance.
Data Standardization: Transforming data to have a mean of 0 and a standard deviation of 1.
Data Aggregation: Combining multiple data points into a single value, such as sum or average.
Data Encoding: Converting categorical data into numerical data using techniques like one-hot encoding or label encoding.
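
A minimal sketch combining a few of these techniques on a small, made-up DataFrame; the column names and values are assumptions for illustration only:
python
import pandas as pd

# Small illustrative dataset with one missing value
df = pd.DataFrame({
    "age": [25, 30, None, 40],
    "salary": [30000, 45000, 50000, 32000],
    "city": ["Pune", "Mumbai", "Pune", "Delhi"]
})

# Handling missing values: fill the missing age with the column mean
df["age"] = df["age"].fillna(df["age"].mean())

# Data normalization: scale salary to the 0-1 range
df["salary_norm"] = (df["salary"] - df["salary"].min()) / (df["salary"].max() - df["salary"].min())

# Data standardization: transform salary to mean 0 and standard deviation 1
df["salary_std"] = (df["salary"] - df["salary"].mean()) / df["salary"].std()

# Data encoding: one-hot encode the categorical "city" column
df = pd.get_dummies(df, columns=["city"])

print(df)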

OUTPUT: -
EXPERIMENT NO 04
AIM: - Simple Linear Regression
Theory: -

Simple Linear Regression in Machine Learning

Introduction

Simple Linear Regression is a fundamental concept in Machine Learning that involves predicting a continuous output variable based on a single input feature. It's a linear model that assumes a straight-line relationship between the input feature and the output variable.

Key Concepts

Linear Relationship: The relationship between the input feature and output variable is assumed to be linear.
Single Input Feature: Only one input feature is used to predict the output variable.
Continuous Output Variable: The output variable is continuous, meaning it can take on any value within a certain range.
Regression Equation: The linear equation that predicts the output variable based on the input feature.

Mathematical Formulation

The simple linear regression equation is given by:

y = β0 + β1x + ε

where:

y is the output variable
x is the input feature
β0 is the intercept or bias term
β1 is the slope coefficient
ε is the error term

Assumptions
Linearity: The relationship between the input feature and output variable is linear.
Independence: Each data point is independent of the others.
Homoscedasticity: The variance of the error term is constant across all levels of the input feature.
Normality: The error term is normally distributed.
No or little multicollinearity: This mainly concerns multiple regression; with only a single input feature, multicollinearity is not an issue.
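
A minimal sketch fitting the equation y = β0 + β1x on a small, made-up dataset with NumPy's least-squares polynomial fit; the data points are assumptions for illustration only:
python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative data: hours studied (x) vs. exam score (y)
x = np.array([1, 2, 3, 4, 5])
y = np.array([52, 55, 61, 64, 70])

# Degree-1 least-squares fit; polyfit returns the slope first, then the intercept
beta1, beta0 = np.polyfit(x, y, 1)
print("Intercept (beta0):", beta0)
print("Slope (beta1):", beta1)

# Predictions from the regression equation y = beta0 + beta1 * x
y_pred = beta0 + beta1 * x

# Plot the data points and the fitted line
plt.scatter(x, y, label="data")
plt.plot(x, y_pred, color="red", label="fitted line")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
plt.show()

The same fit could also be obtained with scikit-learn's LinearRegression; NumPy is used here simply because it was already introduced in Experiment 2.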

OUTPUT: -
MULTIPLE LINEAR REGRESSION
