1. Create a pandas Series from a dictionary of values and an ndarray
import pandas as pd
import numpy as np
# Create a Series from a dictionary
data_dict = {'A': 10, 'B': 20, 'C': 30}
series_from_dict = pd.Series(data_dict)
print("Series from Dictionary:")
print(series_from_dict)
# Create a Series from an ndarray
data_ndarray = np.array([40, 50, 60])
series_from_ndarray = pd.Series(data_ndarray)
print("\nSeries from ndarray:")
print(series_from_ndarray)
output
Series from Dictionary:
A 10
B 20
C 30
dtype: int64
Series from ndarray:
0 40
1 50
2 60
dtype: int64
2. Print elements above the 75th percentile:
import pandas as pd
# Create a Series
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
series = pd.Series(data)
percentile_75 = series.quantile(0.75)
above_percentile_75 = series[series > percentile_75]
print("Elements above the 75th percentile:")
print(above_percentile_75)
output
Elements above the 75th percentile:
7 80
8 90
9 100
dtype: int64
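The threshold itself can also be inspected. The sketch below assumes the default linear interpolation of `Series.quantile`; the name `mask` is illustrative and not part of the original program.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])

# Default interpolation is linear: the 75th percentile sits at
# position 0.75 * (10 - 1) = 6.75, three quarters of the way from 70 to 80.
percentile_75 = series.quantile(0.75)
print(percentile_75)

# Boolean mask marking the elements strictly above the threshold
mask = series > percentile_75
print(mask.sum())
```

The mask is what `series[series > percentile_75]` uses internally to pick the rows.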
3. Create a DataFrame of quarterly sales where each row contains the item
category, item name, and expenditure. Group the rows by category and
print the total expenditure per category.
import pandas as pd
data = {
'Category': ['A', 'B', 'A', 'B', 'C'],
'Item Name': ['Item1', 'Item2', 'Item3', 'Item4', 'Item5'],
'Expenditure': [100, 150, 200, 120, 80]
}
df = pd.DataFrame(data)
total_expenditure = df.groupby('Category')['Expenditure'].sum()
print("Total expenditure per category:")
print(total_expenditure)
output
Total expenditure per category:
Category
A 300
B 270
C 80
Name: Expenditure, dtype: int64
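If more than the total is needed, several aggregates can be computed in one pass. This is a minimal sketch on the same data; the name `summary` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': ['A', 'B', 'A', 'B', 'C'],
    'Item Name': ['Item1', 'Item2', 'Item3', 'Item4', 'Item5'],
    'Expenditure': [100, 150, 200, 120, 80],
})

# One row per category, one column per aggregate
summary = df.groupby('Category')['Expenditure'].agg(['sum', 'mean', 'count'])
print(summary)
```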
4. Create a DataFrame for examination results and display the row labels,
column labels, data types of each column, and the dimensions.
import pandas as pd
data = {
'Name': ['Alice', 'Bob', 'Charlie'],
'Math': [85, 70, 92],
'Science': [90, 88, 78]
}
df = pd.DataFrame(data)
print("DataFrame for examination results:")
print(df)
print("\nColumn data types:")
print(df.dtypes)
print("\nDimensions (rows, columns):")
print(df.shape)
output
DataFrame for examination results:
Name Math Science
0 Alice 85 90
1 Bob 70 88
2 Charlie 92 78
Column data types:
Name object
Math int64
Science int64
dtype: object
Dimensions (rows, columns):
(3, 3)
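The task also asks for the row and column labels. A minimal sketch, reusing the same data, reads them from the `index` and `columns` attributes; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Math': [85, 70, 92],
    'Science': [90, 88, 78],
})

row_labels = list(df.index)        # default integer labels
column_labels = list(df.columns)   # column names in insertion order
print("Row labels:", row_labels)
print("Column labels:", column_labels)
```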
5. Filter out rows based on different criteria such as duplicate rows
import pandas as pd
data = {
'Name': ['Alice', 'Bob', 'Alice', 'Charlie', 'Bob'],
'Score': [85, 70, 85, 92, 70]
}
df = pd.DataFrame(data)
# Remove duplicate rows based on all columns
df_no_duplicates = df.drop_duplicates()
print("DataFrame without duplicate rows:")
print(df_no_duplicates)
output
DataFrame without duplicate rows:
Name Score
0 Alice 85
1 Bob 70
3 Charlie 92
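Other criteria follow the same pattern. The sketch below, on the same data, filters by a condition, flags repeated rows, and keeps the last occurrence of each name instead of the first; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Alice', 'Charlie', 'Bob'],
    'Score': [85, 70, 85, 92, 70],
})

# Rows whose score is above 80
high_scores = df[df['Score'] > 80]

# True for every row that repeats an earlier row
is_repeat = df.duplicated()

# Compare on 'Name' only and keep the last occurrence
last_per_name = df.drop_duplicates(subset='Name', keep='last')

print(high_scores)
print(is_repeat.tolist())
print(last_per_name)
```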
6. Importing and exporting data between pandas and CSV file
import pandas as pd
# Export DataFrame to CSV file
data = {'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 28]}
df = pd.DataFrame(data)
df.to_csv('output.csv', index=False)
# Import CSV file into DataFrame
imported_df = pd.read_csv('output.csv')
print("Imported DataFrame from CSV:")
print(imported_df)
Output
Imported DataFrame from CSV:
Name Age
0 Alice 25
1 Bob 30
2 Charlie 28
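The same round trip can be sketched without touching the disk. This assumes `io.StringIO` as an in-memory stand-in for the CSV file; the names `buffer` and `round_trip` are illustrative.

```python
import io
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 28]})

# Export: write the CSV text into an in-memory buffer
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Import: read the same text back into a new DataFrame
round_trip = pd.read_csv(io.StringIO(csv_text))
print(round_trip)
```

Passing `index=False` keeps the row labels out of the file, so the imported DataFrame gets a fresh default index.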
7) Create a Series from ndarray
i) Without index
import pandas as pd1
import numpy as np1
data = np1.array(['a','b','c','d'])
s = pd1.Series(data)
print(s)
Output
0 a
1 b
2 c
3 d
dtype: object
Note: the default index starts from 0.
ii) With index position
import pandas as p1
import numpy as np1
data = np1.array(['a','b','c','d'])
s = p1.Series(data,index=[100,101,102,103])
print(s)
Output
100 a
101 b
102 c
103 d
dtype: object
iii) Create a Series from dict
Eg.1 (without index)
import pandas as pd1
import numpy as np1
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd1.Series(data)
print(s)
Output
a 0.0
b 1.0
c 2.0
dtype: float64
Eg.2 (with index)
import pandas as pd1
import numpy as np1
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd1.Series(data,index=['b','c','d','a'])
print(s)
Output
b 1.0
c 2.0
d NaN
a 0.0
dtype: float64
8. (A) Creation of Series from Scalar Values
A Series can be created using scalar values as shown in
the example below:
>>> import pandas as pd #import Pandas with alias pd
>>> series1 = pd.Series([10,20,30]) #create a Series
>>> print(series1) #Display the series
Output:
0 10
1 20
2 30
dtype: int64
We can also assign user-defined labels to the index
and use them to access elements of a Series.
The following example has a numeric index in random order.
>>> series2 = pd.Series(["Kavi","Shyam","Ravi"], index=[3,5,1])
>>> print(series2) #Display the series
Output:
3 Kavi
5 Shyam
1 Ravi
dtype: object
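Accessing elements through these labels can be sketched as follows. With an integer index in random order, label and position differ, so the sketch uses `.loc` for labels and `.iloc` for positions; the variable names are illustrative.

```python
import pandas as pd

series2 = pd.Series(["Kavi", "Shyam", "Ravi"], index=[3, 5, 1])

by_label = series2.loc[1]       # the element whose label is 1
by_position = series2.iloc[1]   # the second element, whatever its label
print(by_label)
print(by_position)
```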
(B) Creation of Series from NumPy Arrays
We can create a series from a one-dimensional (1D)
NumPy array, as shown below:
>>> import numpy as np # import NumPy with alias np
>>> import pandas as pd
>>> array1 = np.array([1,2,3,4])
>>> series3 = pd.Series(array1)
>>> print(series3)
Output:
0 1
1 2
2 3
3 4
dtype: int32
(C) Creation of Series from Dictionary
Dictionary keys can be used to construct an index for a
Series. Here, keys of the dictionary dict1 become indices in the series.
>>> dict1 = {'India': 'NewDelhi', 'UK': 'London', 'Japan': 'Tokyo'}
>>> print(dict1) #Display the dictionary
{'India': 'NewDelhi', 'UK': 'London', 'Japan': 'Tokyo'}
>>> series8 = pd.Series(dict1)
>>> print(series8) #Display the series
Output:
India NewDelhi
UK London
Japan Tokyo
dtype: object
9) Pandas Series head function
import pandas as pd1
s = pd1.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
print (s.head(3))
Output
a 1
b 2
c 3
dtype: int64
10) Pandas Series tail function
import pandas as pd1
s = pd1.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
print (s.tail(3))
Output
c 3
d 4
e 5
dtype: int64
11) Accessing Data from Series with indexing and slicing
import pandas as pd1
s = pd1.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
print (s.iloc[0]) # element at position 0 (s[0] on a labelled index is deprecated)
print (s[:3]) # slicing for first 3 index values
print (s[-3:]) # slicing for last 3 index values
Output
1
a 1
b 2
c 3
dtype: int64
c 3
d 4
e 5
dtype: int64
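Positional and label-based slicing treat the end point differently. A small sketch on the same Series, with illustrative variable names:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

by_position = s.iloc[0:3]    # stop position 3 is excluded: a, b, c
by_label = s.loc['a':'c']    # stop label 'c' is included: a, b, c
single = s.loc['d']          # one element, looked up by label

print(by_position)
print(by_label)
print(single)
```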
12) Create a DataFrame from Dict of ndarrays / Lists
e.g.1
import pandas as pd1
data1 = {'Name':['Freya', 'Mohak'],'Age':[9,10]}
df1 = pd1.DataFrame(data1)
print (df1)
Output
Name Age
0 Freya 9
1 Mohak 10
13) Create a DataFrame from List of Dicts
e.g.1
import pandas as pd1
data1 = [{'x': 1, 'y': 2},{'x': 5, 'y': 4, 'z': 5}]
df1 = pd1.DataFrame(data1)
print (df1)
Output
x y z
0 1 2 NaN
1 5 4 5.0
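The first dictionary has no 'z' key, so pandas stores NaN there and the column becomes float. One possible way to handle the gap, sketched here with 0 as an assumed replacement value:

```python
import pandas as pd

data1 = [{'x': 1, 'y': 2}, {'x': 5, 'y': 4, 'z': 5}]
df1 = pd.DataFrame(data1)

# Which rows are missing a value in column 'z'?
missing = df1['z'].isna().tolist()

# Replace the missing value with 0 (an assumed default) and restore integers
filled = df1.fillna({'z': 0}).astype({'z': int})
print(missing)
print(filled)
```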
14) Row Selection, Addition, and Deletion
# Selection by Label
import pandas as pd1
d1 = {'one' : pd1.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd1.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df1 = pd1.DataFrame(d1)
print (df1.loc['b'])
Output
one 2.0
two 2.0
Name: b, dtype: float64
15) Selection by integer location
import pandas as pd1
d1 = {'one' : pd1.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd1.Series([1, 2, 3, 4], index=['a', 'b', 'c','d'])}
df1 = pd1.DataFrame(d1)
print (df1.iloc[2])
Output
one 3.0
two 3.0
Name: c, dtype: float64
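Row addition and deletion, named in the heading of item 14, can be sketched as below. The DataFrame is a simplified version of the one above (an assumption made to keep the example short), and `df2` is an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({'one': [1, 2, 3], 'two': [1, 2, 3]},
                   index=['a', 'b', 'c'])

# Addition: assigning to a new label with .loc appends a row
df1.loc['d'] = [4, 4]

# Deletion: drop returns a new DataFrame without the given row label
df2 = df1.drop('a')
print(df2)
```

`drop` leaves `df1` unchanged, so the result has to be assigned to a name to be kept.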
16) Iterate over rows in a dataframe
import pandas as pd1
import numpy as np1
raw_data1 = {'name': ['freya', 'mohak'],
'age': [10, 1],
'favorite_color': ['pink', 'blue'],
'grade': [88, 92]}
df1 = pd1.DataFrame(raw_data1, columns = ['name', 'age',
'favorite_color', 'grade'])
for index, row in df1.iterrows():
    print (row["name"], row["age"])
Output
freya 10
mohak 1
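An alternative way to walk the rows is `itertuples`, which yields one named tuple per row. A minimal sketch on the same data; `pairs` is an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({'name': ['freya', 'mohak'],
                    'age': [10, 1],
                    'favorite_color': ['pink', 'blue'],
                    'grade': [88, 92]})

# Column names become attributes of each row tuple
pairs = [(row.name, row.age) for row in df1.itertuples(index=False)]
for name, age in pairs:
    print(name, age)
```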
17) Binary operation over dataframe with series
import pandas as pd
x = pd.DataFrame({0: [1,2,3], 1: [4,5,6], 2: [7,8,9] })
y = pd.Series([1, 2, 3])
new_x = x.add(y, axis=0)
print(new_x)
Output
0 1 2
0 2 5 8
1 4 7 10
2 6 9 12
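The `axis` argument decides how the Series is lined up with the DataFrame. The sketch below contrasts the two choices on the same data; `by_rows` and `by_cols` are illustrative names.

```python
import pandas as pd

x = pd.DataFrame({0: [1, 2, 3], 1: [4, 5, 6], 2: [7, 8, 9]})
y = pd.Series([1, 2, 3])

by_rows = x.add(y, axis=0)   # y[i] is added to every value in row i
by_cols = x.add(y, axis=1)   # y[j] is added to every value in column j

print(by_rows)
print(by_cols)
```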
18) Binary operation over dataframe with dataframe
import pandas as pd
x = pd.DataFrame({0: [1,2,3], 1: [4,5,6], 2: [7,8,9] })
y = pd.DataFrame({0: [1,2,3], 1: [4,5,6], 2: [7,8,9] })
new_x = x.add(y, axis=0)
print(new_x)
Output
0 1 2
0 2 8 14
1 4 10 16
2 6 12 18
19) Merging/joining dataframe
import pandas as pd
left = pd.DataFrame({
'id':[1,2],
'Name': ['anil', 'vishal'],
'subject_id':['sub1','sub2']})
right = pd.DataFrame(
{'id':[1,2],
'Name': ['sumer', 'salil'],
'subject_id':['sub2','sub4']})
print (pd.merge(left,right,on='id'))
Output
id Name_x subject_id_x Name_y subject_id_y
0 1 anil sub1 sumer sub2
1 2 vishal sub2 salil sub4
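The `_x` and `_y` suffixes are the defaults for columns that appear in both frames. A sketch of two variations on the same data: custom suffixes, and a join on `subject_id` instead of `id`. The names `merged` and `common` are illustrative.

```python
import pandas as pd

left = pd.DataFrame({'id': [1, 2],
                     'Name': ['anil', 'vishal'],
                     'subject_id': ['sub1', 'sub2']})
right = pd.DataFrame({'id': [1, 2],
                      'Name': ['sumer', 'salil'],
                      'subject_id': ['sub2', 'sub4']})

# Same join as above, with readable suffixes for the overlapping columns
merged = pd.merge(left, right, on='id', suffixes=('_left', '_right'))

# Inner join on subject_id keeps only subjects present in both frames
common = pd.merge(left, right, on='subject_id', how='inner')

print(merged)
print(common)
```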
1) Plot the following data on a line chart:
Day     Monday  Tuesday  Wednesday  Thursday  Friday
Income  510     350      475        580       600
1. Write a title for the chart “The Weekly Income Report”.
2. Write appropriate titles for both axes.
3. Write code to display the legend.
4. Display the line in red.
5. Use a dashed line style.
6. Display diamond-style markers on the data points.
input
import matplotlib.pyplot as pp
day = ['Monday','Tuesday','Wednesday','Thursday','Friday']
inc = [510,350,475,580,600]
pp.plot(day,inc,label='Income',color='r',linestyle='dashed',marker='D')
pp.title("The Weekly Income Report")
pp.xlabel("Days")
pp.ylabel("Income")
pp.legend()
pp.show()
Output: [line chart of income by day, red dashed line with diamond markers]
2) The Shivalik restaurant has recorded the following data in its register
for income from Drinks and Food. Plot it on a line chart.
Day     Monday  Tuesday  Wednesday  Thursday  Friday
Drinks  450     560      400        605       580
Food    490     600      425        610       625
input
import matplotlib.pyplot as pp
day = ['Monday','Tuesday','Wednesday','Thursday','Friday']
dr = [450,560,400,605,580]
fd = [490,600,425,610,625]
pp.plot(day,dr,label='Drinks',color='g',linestyle='dotted',marker='+')
pp.plot(day,fd,label='Food',color='m',linestyle='dashdot',marker='x')
pp.title("The Weekly Restaurant Orders")
pp.xlabel("Days")
pp.ylabel("Orders")
pp.legend()
pp.show()
Output: [line chart with two lines, Drinks and Food, plotted by day]
3) Write a program to plot a range from 1 to 30 with step value 4. Use
the following algebraic expression to show data.
y = 5*x+2
import matplotlib.pyplot as pp
import numpy as np
x = np.arange(1,30,4)
y = 5 * x + 2
pp.plot(x,y)
pp.show()
Output: [line chart of y = 5*x + 2]
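Printing the arrays makes the plotted points explicit. This sketch uses NumPy only and leaves the plotting call out:

```python
import numpy as np

x = np.arange(1, 30, 4)   # starts at 1, steps by 4, stop value 30 is excluded
y = 5 * x + 2             # applied element by element

print(x)
print(y)
```

The range stops at 29 because the next step, 33, would pass the stop value.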
4) Display the following bowling figures in a bar chart:
Overs  Runs
1      6
2      18
3      10
4      5
import matplotlib.pyplot as pp
overs =[1,2,3,4]
runs=[6,18,10,5]
pp.bar(overs,runs,color='m')
pp.xlabel('Overs')
pp.ylabel('Runs')
pp.title('Bowling Spell Analysis')
pp.xticks([1,2,3,4])
pp.yticks([5,10,15,20])
pp.show()
Output: [bar chart of runs per over]
5) Given the school result data, analyse the performance of the students on
different parameters, e.g. subject-wise.
The following program analyses the performance of the students on
different parameters, e.g. subject-wise or class-wise:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
marks = { "English" :[67,89,90,55],
"Maths":[55,67,45,56],
"IP":[66,78,89,90],
"Chemistry" :[45,56,67,65],
"Biology":[54,65,76,87]}
df = pd.DataFrame(marks,index=['Sumedh','Athang','Sushil','Sujata'])
print("******************Marksheet****************")
print(df)
df.plot(kind='bar')
plt.xlabel(" ")
plt.ylabel(" ")
plt.show()
Below is output:
******************Marksheet****************
English Maths IP Chemistry Biology
Sumedh 67 55 66 45 54
Athang 89 67 78 56 65
Sushil 90 45 89 67 76
Sujata 55 56 90 65 87
Below is bar chart showing performance of students
subject-wise
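Besides the chart, the same marks can be summarised numerically. A minimal sketch, assuming the average per subject and the total per student are the parameters of interest; the variable names are illustrative.

```python
import pandas as pd

marks = {"English": [67, 89, 90, 55],
         "Maths": [55, 67, 45, 56],
         "IP": [66, 78, 89, 90],
         "Chemistry": [45, 56, 67, 65],
         "Biology": [54, 65, 76, 87]}
df = pd.DataFrame(marks, index=['Sumedh', 'Athang', 'Sushil', 'Sujata'])

subject_avg = df.mean()            # one average per subject (column)
student_total = df.sum(axis=1)     # one total per student (row)
best_subject = subject_avg.idxmax()
top_student = student_total.idxmax()

print(subject_avg)
print(student_total)
print(best_subject, top_student)
```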
MYSQL
1. Create a student table with the student id, name, and
marks as attributes where the student id is the primary key.
i) Create table student
Input:
CREATE TABLE student
( student_id INT PRIMARY KEY, name VARCHAR(50), marks INT );
Output: Table student is created
ii) Insert 5 values:
Input:
INSERT INTO student VALUES (1, 'John', 85),
(2, 'Jane', 75),
(3, 'Michael', 90),
(4, 'Emma', 95),
(5, 'William', 78);
2. Insert the details of a new student in the above table
Input:
INSERT INTO student (student_id, name, marks) VALUES (6, 'Olivia', 88);
Output: 1 new record is inserted
3. Delete the details of a student in the above table.
Input:
DELETE FROM student WHERE student_id = 3;
Output: Student with student_id 3 is deleted.
4. Use the select command to get the details of the students with marks
more than 80.
Input:
SELECT * FROM student WHERE marks > 80;
Output:
| student_id | name   | marks |
|------------|--------|-------|
| 1          | John   | 85    |
| 4          | Emma   | 95    |
| 6          | Olivia | 88    |
5. Find the min, max, sum, and average of the marks in a student
marks table.
Input:
SELECT MIN(marks) AS min_marks, MAX(marks) AS max_marks, SUM(marks) AS
sum_marks, AVG(marks) AS avg_marks
FROM student;
Output:
| min_marks | max_marks | sum_marks | avg_marks |
|-----------|-----------|-----------|-----------|
| 75        | 95        | 421       | 84.2000   |
6. Find the total number of customers from each country in the
table (customer ID, customer Name, country) using group by.
i) Create table customers
Input:
CREATE TABLE customers ( customer_id INT PRIMARY KEY,
customer_name VARCHAR(50), country VARCHAR(50) );
ii) Describe the structure of the customers table:
Input:
DESCRIBE customers;
Output:
| Field         | Type        | Null | Key | Default | Extra |
|---------------|-------------|------|-----|---------|-------|
| customer_id   | int(11)     | NO   | PRI | NULL    |       |
| customer_name | varchar(50) | YES  |     | NULL    |       |
| country       | varchar(50) | YES  |     | NULL    |       |
iii) Insert 6 values into the customers table:
Input:
INSERT INTO customers VALUES(1, 'Alice', 'USA'),
(2, 'Bob', 'Canada'),
(3, 'Charlie', 'USA'),
(4, 'David', 'UK'),
(5, 'Eva', 'Canada'),
(6, 'Frank', 'Germany');
Output: 6 records are inserted.
iv) Display all values in the customers table:
Input:
SELECT * FROM customers;
Output:
| customer_id | customer_name | country |
|-------------|---------------|---------|
| 1           | Alice         | USA     |
| 2           | Bob           | Canada  |
| 3           | Charlie       | USA     |
| 4           | David         | UK      |
| 5           | Eva           | Canada  |
| 6           | Frank         | Germany |
v) Find the total number of customers from each country using GROUP BY:
Input:
SELECT country, COUNT(*) AS num_customers FROM
customers GROUP BY country;
Output:
| country | num_customers |
|---------|---------------|
| Canada  | 2             |
| Germany | 1             |
| UK      | 1             |
| USA     | 2             |
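The GROUP BY query can also be tried without a MySQL server. The sketch below assumes Python's built-in `sqlite3` module as a stand-in; SQLite is not MySQL, and the `ORDER BY country` clause is added here only to make the row order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # temporary in-memory database
conn.execute("CREATE TABLE customers (customer_id INT PRIMARY KEY, "
             "customer_name VARCHAR(50), country VARCHAR(50))")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", [
    (1, 'Alice', 'USA'), (2, 'Bob', 'Canada'), (3, 'Charlie', 'USA'),
    (4, 'David', 'UK'), (5, 'Eva', 'Canada'), (6, 'Frank', 'Germany'),
])

# One output row per country, with the number of customers in it
rows = conn.execute("SELECT country, COUNT(*) AS num_customers "
                    "FROM customers GROUP BY country "
                    "ORDER BY country").fetchall()
conn.close()
print(rows)
```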
7. Write a SQL query to order the (student ID, marks) table in
descending order of the marks.
Input:
SELECT student_id, marks
FROM student
ORDER BY marks DESC;
Output:
| student_id | marks |
|------------|-------|
| 4          | 95    |
| 6          | 88    |
| 1          | 85    |
| 5          | 78    |
| 2          | 75    |
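The student exercises can be replayed in order to check how the table changes from step to step. This sketch again assumes Python's built-in `sqlite3` module as a stand-in for MySQL, so output formatting differs from the MySQL client; the statements follow items 1 to 7 above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # temporary in-memory database
conn.execute("CREATE TABLE student (student_id INT PRIMARY KEY, "
             "name VARCHAR(50), marks INT)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", [
    (1, 'John', 85), (2, 'Jane', 75), (3, 'Michael', 90),
    (4, 'Emma', 95), (5, 'William', 78),
])
conn.execute("INSERT INTO student (student_id, name, marks) "
             "VALUES (6, 'Olivia', 88)")
conn.execute("DELETE FROM student WHERE student_id = 3")

# Students with marks above 80, after the insert and the delete
above_80 = conn.execute("SELECT student_id FROM student "
                        "WHERE marks > 80 ORDER BY student_id").fetchall()

# Aggregates over the five remaining rows
stats = conn.execute("SELECT MIN(marks), MAX(marks), SUM(marks), AVG(marks) "
                     "FROM student").fetchone()

# Descending order of marks
ordered = conn.execute("SELECT student_id, marks FROM student "
                       "ORDER BY marks DESC").fetchall()
conn.close()

print(above_80)
print(stats)
print(ordered)
```

Because the row with student_id 3 is deleted in item 3, it no longer appears in any later query.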