08-07-2021
I N D I A N I N S T I T U T E O F R E M O T E S E N S I N G, D E H R A D U N
Python for
Machine/Deep Learning Models
Kamal Pandey
Geoweb Services, IT & Distance Learning Department
Indian Institute of Remote Sensing (IIRS), ISRO Dehradun
[email protected]
Content
Brief about ML/DL
Why Python for ML/DL
Features of Python for ML/DL
Libraries in Python for ML/DL
Working environment of Python for ML/DL
Example and use cases
Linear Regression: Wheat Crop yield estimation
Splitting the dataset to Train and Test
Machine Learning and Deep Learning
Machine learning is a set of algorithms that parse data, learn from it, and apply what they have learnt to make intelligent decisions.
Deep learning is gaining popularity because of its superior accuracy when trained on huge amounts of data.
Why Python for Machine Learning and Deep Learning
• Free and open source
• Strong community support, which drives long-term improvement
• Extensive libraries
• Libraries available for most common problems
• Smooth implementation and integration
• Accessible to people of varying skill levels
Why Python for Machine Learning and Deep Learning
• Increased productivity by reducing the time to code and debug,
  e.g. mymodel.LinearRegression()
• Can also be used for soft computing and natural language processing
• Works seamlessly with C and C++ code modules
Features of Python w.r.t. ML/DL
• Easy learning curve
• General-purpose language
• Ready-made packages and libraries for machine learning
  • Matplotlib
  • NumPy
  • SciPy
  • scikit-learn
  • TensorFlow
• Excellent support for deep learning models
• Interactive data analysis and modelling using IPython (Jupyter) notebooks
• Industry-standard language for AI and machine learning
Python Libraries : Machine Learning and Deep Learning
• NumPy and Matplotlib
• Pandas
• MXNet
• TensorFlow
• Theano
• Keras
• Spark MLlib
• Scikit-learn
• PyTorch
Python Libraries : Deep Learning
TensorFlow
• Handling deep neural networks
• Natural language processing
• Partial differential equations
• Abstraction capabilities
• Image, text, and speech recognition
• Effortless collaboration of ideas and code
Offered by Google: a fast, flexible, and scalable open-source library.
Python Libraries : Deep Learning
Keras
• Neural layers
• Activation and cost functions
• Objectives
• Batch normalization
• Dropout
• Pooling
Convolutional and recurrent neural networks are supported alongside standard neural networks.
Keras is the high-level API of TensorFlow 2.
Python Libraries : Deep Learning
PyTorch
• Tensor computing with acceleration on graphics processing units (GPUs)
• Easy to learn, use, and integrate with the Python ecosystem
• Tensors: torch.Tensor
• Optimizers
• Neural networks
• Autograd: automatic gradient calculation
Python Libraries: Machine Learning
Scikit-learn
• Classification
• Regression
• Clustering
• Dimensionality reduction
• Model selection
• Preprocessing
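To make three of these capabilities concrete, here is a minimal sketch that chains preprocessing, classification, and a simple model-selection step (a hold-out split). The iris dataset and the choice of logistic regression are assumptions made only for illustration; they are not part of the lecture material.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumed example data: the iris dataset bundled with scikit-learn
X, y = load_iris(return_X_y=True)

# Model selection: hold out 30% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Preprocessing (standardise features) followed by a classifier
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
clf.fit(X_train, y_train)

# Mean accuracy on the held-out samples
accuracy = clf.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Wrapping the scaler and the classifier in one pipeline means the scaling parameters are learnt from the training split only.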
Python Libraries: Data preparation/processing
Pandas
• Dataset reshaping and pivoting
• Merging and joining of datasets
• Handling of missing data and data alignment
• Various indexing options such as hierarchical axis indexing and fancy indexing
• Data filtration options
Pandas makes use of DataFrames (two-dimensional data structures).
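The operations listed above can be sketched on two tiny DataFrames. All column names and values below are hypothetical, invented purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical yield records; one value is missing on purpose
yields = pd.DataFrame({
    "district": ["A", "A", "B", "B"],
    "year": [2019, 2020, 2019, 2020],
    "yield_t_ha": [3.1, np.nan, 2.8, 3.0],
})
# Hypothetical rainfall per district
rain = pd.DataFrame({"district": ["A", "B"], "rain_mm": [850, 620]})

# Handling missing data: drop rows without a yield value
clean = yields.dropna(subset=["yield_t_ha"])

# Merging: attach rainfall to every remaining yield record
merged = clean.merge(rain, on="district", how="left")

# Filtering: keep records with a yield of at least 3.0
high = merged[merged["yield_t_ha"] >= 3.0]

# Reshaping and pivoting: one row per district, one column per year
wide = yields.pivot(index="district", columns="year", values="yield_t_ha")

print(merged)
print(high)
print(wide)
```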
Python Libraries: Machine Learning
Spark MLlib
• Regression
• Clustering
• Optimization
• Dimensionality reduction
• Classification
• Basic statistics
• Feature extraction
Spark MLlib is a machine learning library that makes it easy to scale your computations.
Python Libraries: Machine Learning
Theano
• Support for GPUs
• Strong integration with NumPy
• Ability to create custom C code for your mathematical operations
Theano is a robust library for carrying out large-scale scientific calculations.
Python Libraries: Deep Learning
MXNet
• Highly scalable and supports quick model training
• Used to train and deploy deep neural networks
• Excellent portability and scalability
• Amazon Web Services (AWS) chose MXNet as its preferred deep learning framework.
Python Libraries: Data preparation/processing
NumPy & Matplotlib
• Shape manipulation
• Sorting and selecting capabilities
• Discrete Fourier transforms
• Basic linear algebra and statistical operations
• Random simulations
• Support for n-dimensional arrays
The NumPy library for Python concentrates on handling extensive multi-dimensional data and the intricate mathematical functions operating on the data.
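Each NumPy capability in the list can be shown in a line or two. The following is a minimal sketch; the arrays and the test signal are made up for illustration.

```python
import numpy as np

a = np.arange(12)                  # values 0..11

# Shape manipulation and n-dimensional arrays
grid = a.reshape(3, 4)

# Basic statistics: mean of each column
col_means = grid.mean(axis=0)

# Sorting (descending) and selecting (even values only)
descending = np.sort(a)[::-1]
evens = a[a % 2 == 0]

# Basic linear algebra: solve 2x + y = 5, x + 3y = 10
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([5.0, 10.0])
solution = np.linalg.solve(A, b)

# Discrete Fourier transform of a sine with 4 cycles over 64 samples
t = np.arange(64)
signal = np.sin(2 * np.pi * 4 * t / 64)
spectrum = np.abs(np.fft.rfft(signal))
peak_bin = int(np.argmax(spectrum))  # index of the strongest frequency

# Random simulation with a seeded generator for repeatability
rng = np.random.default_rng(seed=0)
samples = rng.normal(size=1000)

print(grid.shape, col_means, solution, peak_bin, samples.mean())
```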
Python IDE for Machine Learning
• Jupyter Notebook in Anaconda
• Google Colab
Python Code Examples
• Linear regression using least squares estimation
In machine learning, predicting future outcomes from data is a central task.
Machine Learning : Linear Regression
• The term regression is used when you try to find the relationship between variables.
• In machine learning, and in statistical modelling, that relationship is used to predict the outcome of future events.
• Linear regression uses the relationship between the data points to draw the straight line that best fits all of them.
Machine Learning : Linear Regression
• This line can be used to predict future values.
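As a minimal sketch of predicting with a fitted line, the snippet below uses NumPy's polyfit on a handful of made-up observations (the values are hypothetical and not taken from the lecture data).

```python
import numpy as np

# Hypothetical observations, invented for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# A degree-1 polyfit returns [slope, intercept] of the least squares line
slope, intercept = np.polyfit(x, y, 1)

# Use the line to predict y at a value of x that was not observed
x_new = 6.0
y_new = intercept + slope * x_new

print(f"fitted line: y = {intercept:.2f} + {slope:.2f} x")
print(f"predicted y at x = {x_new}: {y_new:.2f}")
```

Predictions far outside the observed range of x rely on the linear relationship continuing to hold there.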
Least Squares Estimation
A common problem in experimental work is to obtain a mathematical relationship $y = f(x)$ between two variables $x$ and $y$ by fitting a curve to the points in the plane corresponding to various experimentally determined values of $x$ and $y$, say:
$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$$
Polynomials
On the basis of theory, or simply from the pattern of the points, one decides on the general form of the curve $y = f(x)$ to be fitted.
Some possibilities are:
• Straight line: $y = a + bx$
• Quadratic polynomial: $y = a + bx + cx^2$
• Cubic polynomial: $y = a + bx + cx^2 + dx^3$
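The three candidate forms differ only in how many coefficients they carry. A small sketch, assuming arbitrary illustrative coefficient values, evaluates each form with NumPy's polyval, which takes coefficients in the same increasing order (a, b, c, d) used above.

```python
import numpy as np
from numpy.polynomial import polynomial as P

x = np.array([0.0, 1.0, 2.0])

# Coefficients are hypothetical, listed in increasing order of power
line = P.polyval(x, [1.0, 2.0])               # y = 1 + 2x
quad = P.polyval(x, [1.0, 2.0, 0.5])          # y = 1 + 2x + 0.5x^2
cubic = P.polyval(x, [1.0, 2.0, 0.5, -1.0])   # y = 1 + 2x + 0.5x^2 - x^3

print(line)
print(quad)
print(cubic)
```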
Straight Line
Suppose we want to fit a straight line
$$y = a + bx$$
to the experimentally determined points:
$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$$
If these points are on the straight line, then the following equalities hold for these points:
$$\begin{aligned}
y_1 &= a + b x_1 \\
y_2 &= a + b x_2 \\
&\;\;\vdots \\
y_n &= a + b x_n
\end{aligned}$$
Straight Line
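We can write this in matrix form as:
$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} =
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}
\begin{bmatrix} a \\ b \end{bmatrix}$$
or, more compactly, as $\bar{y} = M\bar{v}$, where:
$$\bar{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
M = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \quad
\bar{v} = \begin{bmatrix} a \\ b \end{bmatrix}$$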
Least Squares Fitting
If the data points are not collinear, then it is impossible to find coefficients $a$ and $b$ that satisfy $\bar{y} = M\bar{v}$ exactly.
Instead, we minimize the length of the error vector $\bar{y} - M\bar{v}$, using a technique called least squares fitting.
A solution to the least squares fit can be found using the following formula:
$$\bar{v} = (M^T M)^{-1} M^T \bar{y}$$
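A short sketch can check this formula numerically. It assumes four made-up, non-collinear points, solves the normal equations directly, and compares the result with NumPy's built-in lstsq solver. It also confirms that the remaining error vector is orthogonal to the columns of M, which is the property that characterises the least squares solution.

```python
import numpy as np

# Hypothetical, non-collinear observations for illustration
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, 1.0, 3.0])

# Design matrix: a column of ones (for a) and a column of x (for b)
M = np.column_stack([np.ones_like(x), x])

# Normal equations (M^T M) v = M^T y, solved without forming the inverse
v_normal = np.linalg.solve(M.T @ M, M.T @ y)

# Same problem handed to NumPy's least squares solver
v_lstsq, *_ = np.linalg.lstsq(M, y, rcond=None)

# Error vector of the fitted line
error = y - M @ v_normal

print("normal equations:", v_normal)
print("np.linalg.lstsq: ", v_lstsq)
print("M^T error:       ", M.T @ error)
```

Solving the linear system with np.linalg.solve is usually preferred over computing the inverse explicitly, although both follow the same formula.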
Example
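Suppose we have the following 4 observations:
$$(0, 1), (1, 3), (2, 4), (3, 4)$$
From the data:
$$\bar{y} = \begin{bmatrix} 1 \\ 3 \\ 4 \\ 4 \end{bmatrix}, \quad
M = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}$$
Solution:
$$\bar{v} = (M^T M)^{-1} M^T \bar{y} = \begin{bmatrix} 1.5 \\ 1 \end{bmatrix}$$
And so the desired best-fitting line is $y = 1.5 + x$.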
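For readers who want to follow the arithmetic by hand, the intermediate products for these four observations work out as follows:

```latex
M^T M = \begin{bmatrix} 4 & 6 \\ 6 & 14 \end{bmatrix}, \qquad
M^T \bar{y} = \begin{bmatrix} 12 \\ 23 \end{bmatrix}

(M^T M)^{-1} = \frac{1}{4 \cdot 14 - 6 \cdot 6}
\begin{bmatrix} 14 & -6 \\ -6 & 4 \end{bmatrix}
= \frac{1}{20} \begin{bmatrix} 14 & -6 \\ -6 & 4 \end{bmatrix}

\bar{v} = \frac{1}{20}
\begin{bmatrix} 14 \cdot 12 - 6 \cdot 23 \\ -6 \cdot 12 + 4 \cdot 23 \end{bmatrix}
= \frac{1}{20} \begin{bmatrix} 30 \\ 20 \end{bmatrix}
= \begin{bmatrix} 1.5 \\ 1 \end{bmatrix}
```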
In Python
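import numpy as np

# Observations as column vectors
x = np.array([[0], [1], [2], [3]])
y = np.array([[1], [3], [4], [4]])
# Design matrix: a column of ones (intercept a) next to x (slope b)
M = np.hstack([np.ones_like(x), x])
# Least squares solution v = (M^T M)^-1 M^T y
v = np.linalg.inv(M.T @ M) @ M.T @ y
print(v)
# Output:
# [[1.5]
#  [1. ]]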
Plotting the result
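import numpy as np
import matplotlib.pyplot as plt

plt.plot(x, y, 'ro')                    # observed points in red
x_line = np.arange(-1, 5)               # x values for drawing the fitted line
plt.plot(x_line, 1.5 + x_line, 'b-')    # fitted line y = 1.5 + x in blue
plt.show()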
Quadratic Curve
The same mathematical trick can be applied to higher-order polynomials.
Fit a quadratic curve
$$s = a_0 + a_1 t + a_2 t^2$$
to five data points $(t, s)$:
$$(0.1, -0.18), (0.2, 0.31), (0.3, 1.03), (0.4, 2.48), (0.5, 3.73)$$
Matrix Form
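Again, the equations
$$\begin{aligned}
s_1 &= a_0 + a_1 t_1 + a_2 t_1^2 \\
s_2 &= a_0 + a_1 t_2 + a_2 t_2^2 \\
s_3 &= a_0 + a_1 t_3 + a_2 t_3^2 \\
s_4 &= a_0 + a_1 t_4 + a_2 t_4^2 \\
s_5 &= a_0 + a_1 t_5 + a_2 t_5^2
\end{aligned}$$
can be written in matrix form $\bar{y} = M\bar{v}$, where:
$$\bar{y} = \begin{bmatrix} s_1 \\ s_2 \\ s_3 \\ s_4 \\ s_5 \end{bmatrix}, \quad
M = \begin{bmatrix} 1 & t_1 & t_1^2 \\ 1 & t_2 & t_2^2 \\ 1 & t_3 & t_3^2 \\ 1 & t_4 & t_4^2 \\ 1 & t_5 & t_5^2 \end{bmatrix}, \quad
\bar{v} = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}$$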
In Python
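import numpy as np

# Five (t, s) observations as column vectors
x = np.array([[0.1], [0.2], [0.3], [0.4], [0.5]])
y = np.array([[-0.18], [0.31], [1.03], [2.48], [3.73]])
# Design matrix with columns 1, t, t^2
M = np.ones((5, 3))
M[:, 1] = x[:, 0]
M[:, 2] = x[:, 0] ** 2
# Least squares solution v = (M^T M)^-1 M^T y
v = np.linalg.inv(M.T @ M) @ M.T @ y
print(v)
# Output:
# [[-0.398     ]
#  [ 0.34714286]
#  [16.07142857]]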
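Plotting the results
$$s = -0.40 + 0.35t + 16.1t^2$$
import numpy as np
import matplotlib.pyplot as plt

plt.plot(x, y, 'ro')                          # observed points in red
t = np.arange(0.0, 0.52, 0.01)                # fine grid for a smooth curve
s = v[0, 0] + v[1, 0] * t + v[2, 0] * t**2    # fitted quadratic evaluated on the grid
plt.plot(t, s, 'b-')                          # fitted curve in blue
plt.show()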
Multiple Regression
The mathematical “trick” of finding a matrix
form for the equations can be extended to
multiple dimensions.
In that case we call it multiple regression
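A minimal sketch of this extension, assuming two predictors and noise-free targets generated from y = 3 + x_0 + 2 x_1: the design matrix simply gains one column per predictor, and the same least squares machinery recovers the coefficients. The four sample points match those in the scikit-learn example in this deck; the variable names are illustrative.

```python
import numpy as np

# Four samples with two predictors each (x_0, x_1)
X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]], dtype=float)

# Noise-free targets from y = 3 + 1 * x_0 + 2 * x_1
y = 3.0 + X @ np.array([1.0, 2.0])

# Design matrix: a column of ones for the intercept, then one column per predictor
M = np.column_stack([np.ones(len(X)), X])

# Least squares solution; v holds [intercept, coefficient of x_0, coefficient of x_1]
v, *_ = np.linalg.lstsq(M, y, rcond=None)
print(v)
```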
Machine Learning : Linear Regression
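import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
# y = 1 * x_0 + 2 * x_1 + 3, plus one random offset in [0, 1) shared by all samples
y = np.dot(X, np.array([1, 2])) + 3 + np.random.rand()
reg = LinearRegression()
reg.fit(X, y)
print(reg.score(X, y))                    # R^2 on the training data
print(reg.coef_)                          # fitted coefficients, close to [1, 2]
print(reg.intercept_)                     # 3 plus the random offset
print(reg.predict(np.array([[3, 5]])))    # prediction for x_0 = 3, x_1 = 5
Underlying model: $y = x_0 + 2x_1 + 3$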
Machine Learning : Linear Regression
Crop Yield Estimation
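import pandas
from sklearn import linear_model

df = pandas.read_excel("/content/drive/MyDrive/MLData/CropYeildPredictionSampleData.xlsx")
df = df.dropna(how='any')    # drop rows with missing values
# Drop the columns that are not used as predictors
df = df.drop(columns=['District', 'Year', 'Crop', 'Production(Tonnes)'])
y = df.pop('Yield(Tonnes/Ha)')    # target column; the remaining columns are the features
x = df.values
y = y.to_numpy()
y = y.reshape(-1, 1)              # target as a column vector
# The normalize argument was removed in scikit-learn 1.2, so it is omitted here
model_yield = linear_model.LinearRegression()
model_yield.fit(x, y)
print(model_yield.predict(x[0].reshape(1, -1)))    # prediction for the first row
pred_y = model_yield.predict(x)                     # predictions for all rows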
Machine Learning : Linear Regression
Data Splitting: Train and Test
import pandas
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn import linear_model

df = pandas.read_excel("/content/drive/MyDrive/MLData/CropYeildPredictionSampleData.xlsx")
df = df.dropna(how='any')    # drop rows with missing values
df = df.drop(columns=['District', 'Year', 'Crop', 'Production(Tonnes)'])
y = df.pop('Yield(Tonnes/Ha)')
x = df.values
y = y.to_numpy()
# Hold out 33% of the rows for testing; random_state makes the split repeatable
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42)
# The normalize argument was removed in scikit-learn 1.2, so it is omitted here
model_yield = linear_model.LinearRegression()
model_yield.fit(X_train, y_train)
y_predicted = model_yield.predict(X_test)
# Predicted against actual yield on the unseen test rows
plt.scatter(y_predicted, y_test)
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.show()
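Besides the scatter plot, the quality of the fit on the test split can be summarised with numbers. The sketch below assumes synthetic data generated inside the script (the crop yield spreadsheet is not reproduced here) and reports R² and the root mean squared error on the held-out rows.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 200 rows, 3 features, known linear relation plus noise
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 3))
true_coef = np.array([0.5, -0.2, 0.1])   # hypothetical coefficients
y = 2.0 + X @ true_coef + rng.normal(scale=0.1, size=200)

# Same split settings as in the slide
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# Scores computed only on rows the model has not seen
r2 = r2_score(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"R^2 on test data:  {r2:.3f}")
print(f"RMSE on test data: {rmse:.3f}")
```

Scoring on the test rows rather than the training rows gives a fairer picture of how the model would behave on new data.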
Machine Learning : Clustering
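from sklearn.cluster import KMeans
import numpy as np

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
kmeans = KMeans(n_clusters=2, random_state=0).fit(X)
print(kmeans.labels_)                       # cluster index assigned to each training point
print(kmeans.predict([[0, 0], [12, 3]]))    # nearest cluster for two new points
print(kmeans.cluster_centers_)              # coordinates of the two cluster centres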
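To show what happens behind the fit call, here is a rough sketch of the basic k-means iteration (assign each point to its nearest centre, then move each centre to the mean of its points) on the same six points. It is not scikit-learn's implementation; in particular, starting from the first and last points is an assumption made to keep the example deterministic.

```python
import numpy as np

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]], dtype=float)

# Assumed initialisation: use the first and last points as starting centres
centers = X[[0, -1]].copy()

for _ in range(10):
    # Distance from every point to every centre, shape (n_points, n_centers)
    distances = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    # Assignment step: index of the nearest centre for each point
    labels = distances.argmin(axis=1)
    # Update step: each centre moves to the mean of its assigned points
    # (this sketch assumes no cluster becomes empty, which holds for this data)
    new_centers = np.array([X[labels == k].mean(axis=0) for k in range(len(centers))])
    if np.allclose(new_centers, centers):
        break   # centres stopped moving, so the assignment is stable
    centers = new_centers

print(labels)
print(centers)
```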
THANK YOU