Lect-07 and 08, Week-02
Week # 02
Day # 01
Lecture # 07 & 08
Class: Summer’25
Your Facilitator, Adil Khan
NUMPY
NUMPY INTRODUCTION
In Python we have lists that serve the purpose of arrays, but they are
slow to process.
NumPy aims to provide an array object that is up to 50x faster than
traditional Python lists.
The array object in NumPy is called ndarray. It provides many
supporting functions that make working with ndarray very easy.
Arrays are very frequently used in data science, where speed and
resources are very important.
Data Science is a branch of computer science where we study how to
store, use and analyse data to derive information from it.
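To get a feel for the speed difference mentioned above, here is a minimal sketch that doubles every element of a list and of an ndarray and times both. The size n and the repeat count are assumed values; the actual speed-up depends on the machine and the operation, so no timings are shown.

```python
import timeit
import numpy as np

n = 100_000  # assumed problem size, chosen only for illustration
values_list = list(range(n))
values_arr = np.arange(n)

# The same computation two ways: double every element.
doubled_list = [v * 2 for v in values_list]   # Python-level loop
doubled_arr = values_arr * 2                  # one vectorized NumPy operation

# Time 20 repetitions of each approach.
list_time = timeit.timeit(lambda: [v * 2 for v in values_list], number=20)
arr_time = timeit.timeit(lambda: values_arr * 2, number=20)
print(f"list: {list_time:.4f} s, ndarray: {arr_time:.4f} s")
```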
WHY IS NUMPY FASTER THAN LISTS?
NumPy arrays are stored in one contiguous block of memory, unlike lists,
so they can be accessed and processed very efficiently.
alias: In Python, an alias is an alternate name for referring to the same thing.
Create an alias with the as keyword while importing:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
NUMPY CREATING ARRAYS
We can create a NumPy ndarray object by using the array() function.
To create an ndarray, we can pass a list, tuple or any array-like object
into the array() function, and it will be converted into an ndarray:
Example
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))
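As a small illustration of the point that lists and tuples are both accepted, the sketch below builds the same array from each. The variable names from_list and from_tuple are chosen here for illustration.

```python
import numpy as np

from_list = np.array([1, 2, 3, 4, 5])    # built from a list
from_tuple = np.array((1, 2, 3, 4, 5))   # built from a tuple

print(type(from_list))                        # <class 'numpy.ndarray'>
print(type(from_tuple))                       # <class 'numpy.ndarray'>
print(np.array_equal(from_list, from_tuple))  # True
```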
An array that has 1-D arrays as its elements is called a 2-D array.
These are often used to represent matrices or 2nd-order tensors.
An array that has 2-D arrays (matrices) as its elements is called a 3-D array.
Example 3D arrays
import numpy as np
arr = np.array([[[1, 2, 3], [4, 5, 6]],
                [[1, 2, 3], [4, 5, 6]]])
print(arr)
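One way to check how many dimensions an array has is the ndim attribute. The slides do not mention ndim, so treat this as a supplementary sketch; the sample arrays are assumed values.

```python
import numpy as np

one_d = np.array([1, 2, 3])
two_d = np.array([[1, 2, 3], [4, 5, 6]])
three_d = np.array([[[1, 2, 3], [4, 5, 6]],
                    [[1, 2, 3], [4, 5, 6]]])

# ndim reports the number of dimensions (levels of nesting).
print(one_d.ndim)    # 1
print(two_d.ndim)    # 2
print(three_d.ndim)  # 3
```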
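NUMPY ARRAY INDEXING
Array indexing means accessing an array element by its index number. Indexes start at 0.
Example: Add the third and fourth elements of a 1-D array:
import numpy as np
arr = np.array([1, 2, 3, 4])  # sample values (assumed)
print(arr[2] + arr[3])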
Slicing in Python means taking elements from one given index to another given index.
We pass a slice instead of an index, like this: [start:end].
We can also define the step, like this: [start:end:step].
If we don't pass start, it is considered 0.
If we don't pass end, it is considered the length of the array in that dimension.
If we don't pass step, it is considered 1.
Example
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[4:])
Example
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[:4])
Example
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5:2])
Example
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[::2])
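The four slices used in these examples can be run together to compare their results. This sketch reuses the same seven-element array; the variable names for the results are chosen here for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

from_index_4 = arr[4:]     # start given, end defaults to the array length
up_to_index_4 = arr[:4]    # start defaults to 0, end index 4 is excluded
every_second = arr[1:5:2]  # indexes 1 and 3
whole_step_2 = arr[::2]    # whole array, every second element

print(from_index_4)   # [5 6 7]
print(up_to_index_4)  # [1 2 3 4]
print(every_second)   # [2 4]
print(whole_step_2)   # [1 3 5 7]
```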
NUMPY DATA TYPES
By default Python has these data types:
strings - used to represent text data; the text is given in quote marks, e.g. "ABCD"
integer - used to represent integer numbers. e.g. -1, -2, -3
float - used to represent real numbers. e.g. 1.2, 42.42
boolean - used to represent True or False.
complex - used to represent complex numbers. e.g. 1.0 + 2.0j, 1.5 + 2.5j
NUMPY DATA TYPES
NumPy has some extra data types, and refers to data types with one character, like i for
integers, u for unsigned integers, etc.
Below is a list of all data types in NumPy and the characters used to represent them.
i - integer
b - boolean
u - unsigned integer
f - float
c - complex float
m - timedelta
M - datetime
O - object
S - string
U - unicode string
V - fixed chunk of memory for other type (void)
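The one-character codes in the list above can be read back from an existing array through dtype.kind. This is a small sketch with assumed sample values; the dictionary name kinds is chosen here for illustration.

```python
import numpy as np

# dtype.kind returns the one-character code of an array's data type.
kinds = {
    "integers": np.array([1, 2, 3]).dtype.kind,
    "floats": np.array([1.2, 42.42]).dtype.kind,
    "booleans": np.array([True, False]).dtype.kind,
    "complex": np.array([1.0 + 2.0j]).dtype.kind,
    "text": np.array(["ABCD"]).dtype.kind,
}

for name, code in kinds.items():
    print(name, "->", code)
```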
CREATING ARRAYS WITH A DEFINED DATA TYPE
We use the array() function to create arrays; this function can take an optional
argument, dtype, that allows us to define the expected data type of the array elements:
Example
import numpy as np
arr = np.array([1, 2, 3, 4], dtype='S')
print(arr)
print(arr.dtype)
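To show what the dtype argument changes, the sketch below creates the same values once as byte strings and once as 4-byte integers. The 'i4' code (a character plus a size in bytes) is an extra illustration that is not in the slides.

```python
import numpy as np

# Same values stored as byte strings.
arr_s = np.array([1, 2, 3, 4], dtype='S')
print(arr_s)        # [b'1' b'2' b'3' b'4']
print(arr_s.dtype)  # |S1

# Same values stored as 4-byte integers.
arr_i = np.array([1, 2, 3, 4], dtype='i4')
print(arr_i)        # [1 2 3 4]
print(arr_i.dtype)  # int32
```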
NUMPY ARRAY SHAPE
The shape of an array is the number of elements in each dimension.
NumPy arrays have an attribute called shape that returns a tuple with each index
having the number of corresponding elements.
Example
Print the shape of a 2-D array:
import numpy as np
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr.shape)
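As a quick check of how shape reads, the sketch below prints the shape of a 2-D array and of a 3-D array. The 3-D values are assumed sample data.

```python
import numpy as np

two_d = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
three_d = np.array([[[1, 2, 3], [4, 5, 6]],
                    [[1, 2, 3], [4, 5, 6]]])

# 2 rows with 4 elements each.
print(two_d.shape)    # (2, 4)
# 2 blocks, each with 2 rows of 3 elements.
print(three_d.shape)  # (2, 2, 3)
```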
NUMPY ARRAY ITERATING
Iterating means going through elements one by one.
Example: Iterate on the elements of the following 1-D array:
import numpy as np
arr = np.array([1, 2, 3])  # sample values (assumed)
for x in arr:
    print(x)
Example: Iterate on the elements of the following 2-D array:
import numpy as np
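A minimal sketch of iterating over a 1-D and a 2-D array. The array values are assumed for illustration and are not taken from the slides. Note that a plain for loop over a 2-D array yields whole rows, so a nested loop is needed to reach the individual values.

```python
import numpy as np

one_d = np.array([1, 2, 3])
two_d = np.array([[1, 2, 3], [4, 5, 6]])

# 1-D: each step yields one element.
for x in one_d:
    print(x)

# 2-D: each step of the outer loop yields one row (a 1-D array).
flat = []
for row in two_d:
    for value in row:
        flat.append(int(value))
print(flat)  # [1, 2, 3, 4, 5, 6]
```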
NUMPY SEARCHING ARRAYS
There is a method called searchsorted() which performs a binary search in the array and
returns the index where the specified value would be inserted to maintain the sort order.
The searchsorted() method is assumed to be used on sorted arrays.
To search for more than one value, use an array with the specified values.
Example: Find the index where the value 7 should be inserted:
import numpy as np
arr = np.array([6, 7, 8, 9])
x = np.searchsorted(arr, 7)
print(x)
Example: Find the indexes where the values 2, 4, and 6 should be inserted:
import numpy as np
arr = np.array([1, 3, 5, 7])
x = np.searchsorted(arr, [2, 4, 6])
print(x)
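The two searches above can be run side by side to see the returned positions. The side='right' call is an extra illustration, not part of the slides: by default the search returns the left-most valid position, and side='right' returns the right-most one.

```python
import numpy as np

arr = np.array([6, 7, 8, 9])
left = np.searchsorted(arr, 7)                 # insert before the existing 7
right = np.searchsorted(arr, 7, side='right')  # insert after the existing 7
print(left, right)  # 1 2

# Several values at once: one insertion index per value.
many = np.searchsorted(np.array([1, 3, 5, 7]), [2, 4, 6])
print(many)  # [1 2 3]
```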
NUMPY SORTING ARRAYS
Example
Sort the array:
import numpy as np
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))
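One detail worth seeing in code is that np.sort() returns a sorted copy and leaves the original array unchanged. The string array below is an assumed extra example showing that sorting is not limited to numbers.

```python
import numpy as np

arr = np.array([3, 2, 0, 1])
sorted_arr = np.sort(arr)

print(sorted_arr)  # [0 1 2 3]
print(arr)         # [3 2 0 1]  (the original is untouched)

# Strings are sorted alphabetically.
fruits = np.sort(np.array(['banana', 'cherry', 'apple']))
print(fruits)      # ['apple' 'banana' 'cherry']
```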
PANDAS SERIES
A Pandas Series is like a column in a table.
It is a one-dimensional array holding data of any type.
Example: Create a simple Pandas Series from a list:
import pandas as pd
a = [1, 7, 2]
myvar = pd.Series(a)
print(myvar)
If nothing else is specified, the values are labelled with their index number.
First value has index 0, second value has index 1 etc.
This label can be used to access a specified value.
Example: Return the first value of the Series:
print(myvar[0])
With the index argument, you can name your own labels.
Example: Create your own labels:
import pandas as pd
a = [1, 7, 2]
myvar = pd.Series(a, index = ["x", "y", "z"])
print(myvar)
When you have created labels, you can access an item by referring to the label.
Example: Return the value of "y":
print(myvar["y"])
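The default and custom labels can be compared in one short sketch. The variable names default_labels and custom_labels are chosen here for illustration.

```python
import pandas as pd

a = [1, 7, 2]

default_labels = pd.Series(a)                         # labels 0, 1, 2
custom_labels = pd.Series(a, index=["x", "y", "z"])   # labels x, y, z

print(default_labels[0])          # 1
print(custom_labels["y"])         # 7
print(list(custom_labels.index))  # ['x', 'y', 'z']
```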
You can also use a key/value object, like a dictionary, when creating a Series.
Example
Create a simple Pandas Series from a dictionary:
import pandas as pd
calories = {"day1": 420, "day2": 380, "day3": 390}
myvar = pd.Series(calories)
print(myvar)
Note: The keys of the dictionary become the labels.
To select only some of the items in the dictionary, use the index argument and
specify only the items you want to include in the Series.
Example
Create a Series using only data from "day1" and "day2":
import pandas as pd
calories = {"day1": 420, "day2": 380, "day3": 390}
myvar = pd.Series(calories, index = ["day1", "day2"])
print(myvar)
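A short sketch of how the index argument selects from a dictionary. The label "day4" is a hypothetical key added here to show what happens when a requested label is not in the dictionary: Pandas fills it with NaN.

```python
import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

# Keep only two of the three keys.
subset = pd.Series(calories, index=["day1", "day2"])
print(subset["day1"], subset["day2"])  # 420 380

# "day4" is hypothetical: it is not a key in the dictionary.
with_missing = pd.Series(calories, index=["day1", "day4"])
print(pd.isna(with_missing["day4"]))   # True
```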
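Data sets in Pandas are usually multi-dimensional tables, called DataFrames.
Example
Create a DataFrame from two Series:
import pandas as pd
# sample data (assumed): each key holds one column of values
data = {"calories": [420, 380, 390], "duration": [50, 40, 45]}
myvar = pd.DataFrame(data)
print(myvar)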
PANDAS DATAFRAMES
A Pandas DataFrame is a 2-dimensional data structure, like a 2-dimensional array,
or a table with rows and columns.
Example
Create a simple Pandas DataFrame:
import pandas as pd
data = {
  "calories": [420, 380, 390],
  "duration": [50, 40, 45]
}
#load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)
Result
   calories  duration
0       420        50
1       380        40
2       390        45
As you can see from the result above, the DataFrame is like a table with rows and columns.
Pandas uses the loc attribute to return one or more specified row(s).
Example
Return row 0:
#refer to the row index:
print(df.loc[0])
Result
calories    420
duration     50
Name: 0, dtype: int64
Example
Return rows 0 and 1:
#use a list of indexes:
print(df.loc[[0, 1]])
Result
   calories  duration
0       420        50
1       380        40
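The two loc calls above return different kinds of objects, which this sketch makes visible: a single label gives a Series, while a list of labels gives a DataFrame. The variable names one_row and two_rows are chosen here for illustration.

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
}
df = pd.DataFrame(data)

one_row = df.loc[0]        # single label -> Series
two_rows = df.loc[[0, 1]]  # list of labels -> DataFrame

print(type(one_row).__name__)   # Series
print(type(two_rows).__name__)  # DataFrame
print(two_rows.shape)           # (2, 2)
```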
Example
Add a list of names to give each row a name:
import pandas as pd
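A minimal sketch of naming rows with the index argument. The row names "day1", "day2" and "day3" are assumed for illustration, and the data values are the ones used in the DataFrame examples in these slides.

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
}

# The index argument gives each row a name instead of 0, 1, 2.
df = pd.DataFrame(data, index=["day1", "day2", "day3"])
print(df)

# Named rows can then be looked up with loc.
day2 = df.loc["day2"]
print(day2["calories"], day2["duration"])  # 380 40
```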
Example
Load a comma separated file (CSV file) into a DataFrame:
import pandas as pd
df = pd.read_csv('data.csv')
print(df)
PANDAS READ CSV
A simple way to store big data sets is to use CSV files (comma separated files).
CSV files contain plain text and are a well-known format that can be read by
everyone, including Pandas.
In our examples we will be using a CSV file called 'data.csv'.
Download data.csv, or open data.csv from the Data Science course on elearning.
Tip: use to_string() to print the entire DataFrame.
Example: Load the CSV into a DataFrame
import pandas as pd
df = pd.read_csv('data.csv')
print(df.to_string())
If you have a large DataFrame with many rows, Pandas will only return the first 5 rows,
and the last 5 rows:
Example: Print the DataFrame without the to_string() method:
import pandas as pd
df = pd.read_csv('data.csv')
print(df)
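To try read_csv() without the course file, the sketch below builds a small CSV text in memory and reads it through io.StringIO. The column names Duration and Calories and all values are made up for illustration; they are not the contents of data.csv.

```python
import io
import pandas as pd

# Build a 100-row CSV text in memory (hypothetical columns and values).
lines = ["Duration,Calories"]
for i in range(100):
    lines.append(f"{30 + i},{200 + 2 * i}")
csv_text = "\n".join(lines)

# read_csv accepts a file path or a file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

print(df)                  # shortened view for a large DataFrame
full_text = df.to_string() # every row, as one string
print(len(full_text.splitlines()))  # 101 (one header line plus 100 rows)
```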
max_rows: The number of rows returned is defined in Pandas option settings.
Example
Check the number of maximum returned rows:
import pandas as pd
print(pd.options.display.max_rows)
The head() method returns the headers and a specified number of rows, starting from the top.
Example: Print the first 10 rows of the DataFrame:
import pandas as pd
df = pd.read_csv('data.csv')
print(df.head(10))
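A small sketch of head() on a DataFrame built in memory, so it runs without data.csv. The column name value and its contents are assumed for illustration.

```python
import pandas as pd

# Hypothetical 20-row DataFrame standing in for the CSV data.
df = pd.DataFrame({"value": range(20)})

first_ten = df.head(10)  # explicit row count
first_five = df.head()   # no argument: 5 rows by default

print(len(first_ten))    # 10
print(len(first_five))   # 5
print(first_five["value"].tolist())  # [0, 1, 2, 3, 4]
```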
Example: Print the first 5 rows of the DataFrame:
import pandas as pd
df = pd.read_csv('data.csv')
print(df.head())
There is also a tail() method for viewing the last rows of the DataFrame.
The tail() method returns the headers and a specified number of rows, starting from the bottom.
Example