Lecture 2

The document provides an overview of machine learning using Python, highlighting its advantages, basic operations, and efficient coding practices. It also covers version control with Git, testing methodologies, and continuous integration processes. Key topics include Python data structures, package management, and unit testing frameworks like pytest.

Uploaded by

Edoardo Maschio

Machine learning with Python

ESCP-Paris 2021
Slides (or images, contents) adapted from D. Dligach, C. Müller, E. Duchesnay, M. Defferrard, S. Sankararaman and many others (who made their course materials freely available online).

Anh-Phuong TA
Chief Data Scientist at Le Figaro CCM-Benchmark group
[email protected]


Python review
Why Python?

• Python is a widely used, general-purpose programming language.
• Easy to start working with.
• Scientific computation functionality similar to Matlab and Octave.
• Used by major deep learning frameworks such as PyTorch and TensorFlow.

Common Operations
• + : Adds the values on either side of the operator.
• - : Subtracts the right-hand operand from the left-hand operand.
• * : Multiplies the values on either side of the operator.
• / : Divides the left-hand operand by the right-hand operand (in Python 3 the result is always a float).
• % : Divides the left-hand operand by the right-hand operand and returns the remainder.
• ** : Raises the left-hand operand to the power of the right-hand operand.
• // : Floor division. The result is the quotient rounded down to the nearest integer. For positive results this removes the digits after the decimal point; when the result is negative it is rounded away from zero (towards negative infinity), e.g. -7 // 2 is -4.
• +=, -=, *=, /=, %=, **=, //= : Compound assignment. x op= y applies the operator and assigns the result back to x, i.e. x = x op y.
• Comparison operators: ==, !=, >, <, >=, <=. The condition becomes true if the two operands are equal (not equal, greater than, less than, etc.). The <> spelling of "not equal" exists only in Python 2.
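The operators listed above can be tried out in a few lines. This is a minimal sketch; the variable names and values are chosen only for illustration.

```python
# Arithmetic operators on two integers (values are arbitrary)
a, b = 7, 2
print(a + b, a - b, a * b)   # 9 5 14
print(a / b)                 # 3.5 (true division returns a float)
print(a % b)                 # 1
print(a ** b)                # 49

# Floor division rounds toward negative infinity, not toward zero
print(7 // 2)                # 3
print(-7 // 2)               # -4
print(-7 % 2)                # 1, so that (-7 // 2) * 2 + (-7 % 2) == -7

# Compound assignment: x op= y is shorthand for x = x op y
x = 10
x += 5      # x is now 15
x //= 4     # x is now 3
x **= 2     # x is now 9

# Comparisons evaluate to the built-in booleans True / False
print(x == 9, x != 9, x > 3, x <= 3)   # True False True False
```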

Python review
• Collections: list, tuple, dictionary
  • Lists are mutable arrays:
    • names = ['a', 'b']
    • names.append('c')
    • names.extend(['d', 'e'])
    • len(names) == 5
  • Tuples are immutable arrays:
    • names = ('a', 'b')
    • len(names) == 2
    • names[0] = 'c' => TypeError
  • Dictionaries are hash maps:
    • d = dict() # d = {}
    • d = {'name': 'escp'}
    • d['addr'] = 'montparnass'
    • for k, v in d.items(): # iteritems() exists only in Python 2
• Built-in values: None, True, False
• Loops:
  • while condition:
  • for i in range(n):
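The collection and loop snippets above can be combined into one runnable sketch. The names `point`, `pairs`, `total` and `countdown` are introduced here only for illustration and are not part of the slides.

```python
# Lists are mutable: they can grow and be modified in place
names = ["a", "b"]
names.append("c")            # add a single element
names.extend(["d", "e"])     # add several elements at once
print(len(names))            # 5

# Tuples are immutable: item assignment raises TypeError
point = ("a", "b")
try:
    point[0] = "c"
    tuple_error = None
except TypeError as exc:
    tuple_error = exc
print(type(tuple_error).__name__)   # TypeError

# Dictionaries map keys to values
d = {"name": "escp"}
d["addr"] = "montparnass"
pairs = []
for k, v in d.items():       # Python 3 spelling; iteritems() is Python 2 only
    pairs.append((k, v))

# for loop over range(n) visits 0, 1, ..., n - 1
n = 3
total = 0
for i in range(n):
    total += i

# while loop runs until its condition becomes False
countdown = n
while countdown > 0:
    countdown -= 1
```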

Python review
• Install packages from the terminal with pip install [package_name]
• Import modules with import module_name
  • import os, time
  • import numpy as np # np is an alias
  • from numpy import linalg as la
• np.ndarray operations
  • np.array([])
  • np.max, np.min, np.argmax, np.sum, np.mean, …
  • np.dot, np.linalg.norm, .T, +, -, *, …
  • np.random.random
  • np.random.shuffle(x)
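A short sketch of the NumPy calls listed above, assuming NumPy is installed (pip install numpy). The arrays and the random seed are arbitrary values chosen for illustration.

```python
import numpy as np
from numpy import linalg as la

x = np.array([3.0, 4.0])
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Reductions
print(np.max(x), np.min(x), np.argmax(x))   # 4.0 3.0 1
print(np.sum(A), np.mean(A))                # 10.0 2.5

# Linear algebra
print(np.dot(A, x))      # matrix-vector product: [11. 25.]
print(la.norm(x))        # Euclidean norm: 5.0
print(A.T)               # transpose

# +, - and * act element-wise on arrays of the same shape
print(A + A)
print(A * A)             # element-wise product, not a matrix product

# Random numbers (the seed is fixed only to make runs repeatable)
np.random.seed(0)
r = np.random.random(3)  # three floats drawn uniformly from [0, 1)
perm = np.arange(5)
np.random.shuffle(perm)  # shuffles in place and returns None
```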

Efficient code
• Avoid explicit for-loops over indices where possible. For example, the following two snippets are equivalent, but the latter is shorter and usually faster:
  • squares = []
    for i in range(n):
        squares.append(i**2)
  • squares = [i**2 for i in range(n)] # list comprehension
• Use zip() to iterate over a pair of lists. For example, given
    numbers = [1, 2, 3]
    letters = ["A", "B", "C"]
  prefer the second loop to the first:
  • for index in range(len(numbers)):
        print(numbers[index], letters[index])
  • for numbers_val, letters_val in zip(numbers, letters):
        print(numbers_val, letters_val)
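The two pairs of snippets above can be checked for equivalence, and the speed claim can be measured rather than assumed. This is a rough sketch: the helper names `with_loop` and `with_comprehension` and the problem sizes are made up for illustration, and timings will vary by machine and Python version.

```python
import timeit


def with_loop(n):
    """Build the list of squares with an explicit loop and append()."""
    squares = []
    for i in range(n):
        squares.append(i ** 2)
    return squares


def with_comprehension(n):
    """Build the same list with a list comprehension."""
    return [i ** 2 for i in range(n)]


# Both versions produce the same result
same_result = with_loop(10) == with_comprehension(10)

# Time each version (sizes are arbitrary; results depend on the machine)
t_loop = timeit.timeit(lambda: with_loop(1000), number=200)
t_comp = timeit.timeit(lambda: with_comprehension(1000), number=200)
print(f"loop: {t_loop:.4f}s  comprehension: {t_comp:.4f}s")

# zip() pairs elements without manual indexing
numbers = [1, 2, 3]
letters = ["A", "B", "C"]
pairs_by_index = [(numbers[i], letters[i]) for i in range(len(numbers))]
pairs_by_zip = list(zip(numbers, letters))

# zip() stops at the shortest input instead of raising IndexError
short = list(zip([1, 2, 3], ["A", "B"]))   # [(1, 'A'), (2, 'B')]
```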

Efficient code
• Nothing is wrong with the following code, but it is more verbose than necessary and looks the key up twice:
    d = {"nlp": "data-science course"}
    ret_data = "default_course"
    if "nlp" in d:
        ret_data = d["nlp"]
    print(ret_data)
  => use dict.get(key, default)
    ret_data = d.get("nlp", "default_course")
• Avoid building a string by repeated concatenation with "+" in a loop:
    ret_str = ""
    for i in range(10):
        ret_str += " " + str(i*2)
  => use " ".join(), which builds the string in one pass (and without the leading space):
    ret_str = " ".join([str(i*2) for i in range(10)])
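A small runnable sketch of both idioms. The key "cv" is a made-up example of a missing key, not something from the slides.

```python
d = {"nlp": "data-science course"}

# dict.get returns the stored value when the key exists ...
found = d.get("nlp", "default_course")      # "data-science course"
# ... and the supplied default when it does not ("cv" is a hypothetical key)
missing = d.get("cv", "default_course")     # "default_course"

# Repeated concatenation creates a new string object on each iteration
concatenated = ""
for i in range(10):
    concatenated += " " + str(i * 2)

# join() assembles the result from all the pieces in a single call
joined = " ".join(str(i * 2) for i in range(10))
print(joined)   # 0 2 4 6 8 10 12 14 16 18

# The loop version differs only by its leading separator
print(concatenated == " " + joined)   # True
```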

Version control, Git & Testing
Git and GitHub
https://guides.github.com/
http://rogerdudler.github.io/git-guide/
Configuration
git config --global user.name "AP TA"
git config --global user.email "[email protected]"
git config --global color.ui "auto"
git config --global core.editor "vim"

Show the configuration

git config --list


Git commands
Clone
Branch
Add, commit, add, commit, add, commit, ...
Merge / rebase
Push

Unit Tests and integration tests


Why?
Ensure that code works correctly
Ensure that changes don’t break anything
Ensure that bugs are not reintroduced
Ensure robustness to user errors
Ensure code is reachable.

Types of tests
Unit tests – function does the right thing
Integration tests – system / process does the right thing
Non-regression tests – a bug that got removed stays removed.
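To make the three kinds of test concrete, here is a minimal sketch. The functions `parse_numbers`, `mean` and `mean_from_text` are hypothetical, invented for this illustration, as is the bug scenario (an earlier version of `mean` is imagined to have crashed on empty input).

```python
def parse_numbers(text):
    """Turn a comma-separated string into a list of floats."""
    return [float(tok) for tok in text.split(",") if tok.strip()]


def mean(values):
    """Arithmetic mean; returns 0.0 for an empty list."""
    if not values:   # guard imagined to have been added after a bug report
        return 0.0
    return sum(values) / len(values)


def mean_from_text(text):
    """Small pipeline that combines the two functions above."""
    return mean(parse_numbers(text))


# Unit test: one function in isolation does the right thing
def test_mean_unit():
    assert mean([1.0, 2.0, 3.0]) == 2.0


# Integration test: the functions work correctly together
def test_mean_from_text_integration():
    assert mean_from_text("1, 2, 3") == 2.0


# Non-regression test: pins down the fixed behaviour so the bug cannot return
def test_mean_empty_non_regression():
    assert mean([]) == 0.0
    assert mean_from_text("") == 0.0


if __name__ == "__main__":
    # pytest would discover these automatically in a test_*.py file;
    # calling them directly keeps the sketch runnable without pytest.
    test_mean_unit()
    test_mean_from_text_integration()
    test_mean_empty_non_regression()
```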

How to test?
pytest – http://doc.pytest.org
Searches for all test*.py files, runs all test* methods
Reports nice errors!

Example
# content of inc.py

def inc(x):
    return x + 2

# content of test_inc.py
from inc import inc

def test_answer():
    assert inc(3) == 4  # inc(3) returns 5 here, so pytest reports this assertion as a failure

Type: pytest test_inc.py

Test coverage

# inc.py
def inc(x):
    if x < 0:
        return 0
    return x + 1

def dec(x):
    return x - 1

# test_inc.py
from inc import inc

def test_inc():
    assert inc(3) == 4

Run the tests:
pytest

Measure coverage (the --cov option is provided by the pytest-cov plugin):
pytest --cov .
pytest --cov . --cov-report=html
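With the single test from the slide, the `return 0` branch of `inc` and the body of `dec` are never executed, which is what a coverage report would flag. A sketch of tests that exercise those lines follows. The two functions are repeated inline so the block runs on its own (in a real project they would be imported from inc.py), and the test names are made up for illustration.

```python
def inc(x):
    if x < 0:
        return 0
    return x + 1


def dec(x):
    return x - 1


def test_inc_positive():
    # the case already covered by the original test
    assert inc(3) == 4


def test_inc_negative():
    # exercises the x < 0 branch that the original test never reaches
    assert inc(-5) == 0


def test_dec():
    # exercises dec, which the original test file does not import at all
    assert dec(3) == 2


if __name__ == "__main__":
    # Direct calls so the sketch runs without pytest installed
    test_inc_positive()
    test_inc_negative()
    test_dec()
```

Full line coverage only shows that every line ran at least once; it does not show that every behaviour was checked.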

Continuous integration (with GitHub)
What is continuous integration?
Runs commands on each commit (or each PR)
Unit testing and integration testing
Can act as a build farm (for binaries or documentation)
Requires clear declaration of dependencies
Build matrix: can run on many environments
Standard services: TravisCI, AppVeyor, Azure Pipelines and CircleCI.

End (today)
