Python as a First Programming Language
Luciano Baresi
[email protected]
https://baresi.faculty.polimi.it
lbaresi @ instagram
Luciano Baresi
• Professor @ DEIB
• Previously
– Researcher at Cefriel
– Visiting researcher
• University of Oregon (USA)
• University of Paderborn (Germany)
– Visiting professor
• University of Oregon (USA)
• Tongji University (China)
• Research interests
– Software engineering
• Dynamic software architectures
• Service- and cloud-based systems
• Mobile applications
Why Python?
• Focus on readability and coherence
• Developer Productivity
– 1/3 to 1/5 the size of traditional C++ or Java
– Less to type, less to debug
• Support Libraries
– Both standard and third-party
• Component integration
– Support for various integration mechanisms
– Invoke C/C++ libraries
– Get called from C/C++, Java/.NET, etc
What is Python?
• Python is an object-oriented scripting language
– Blends procedural, functional, and object-oriented
programming paradigms
• Common uses
– Shell tools
– Simple language for simple coding tasks
– General-purpose programming language
• Downsides?
– Not fully compiled means slower execution
Python 2 or 3
We have decided that January 1, 2020, was the day
that we sunset Python 2. That means that we will
not improve it anymore after that day, even if
someone finds a security problem in it. You should
upgrade to Python 3 as soon as you can
python.org
Executing Python
• Just type python to start an interactive prompt
>>> print('Hello World')
Hello World
>>> print(2**8)
256
>>> lumberjack = 'okay'
>>> lumberjack
'okay'
>>> 2**8
256
External file
• script.py
import sys
print(sys.platform)
print(2 ** 100)
x = 'Spam!'
print(x * 8)
python script.py
Dynamic typing
• We do not declare the specific types of our objects
– Types are determined automatically at run time
• Variable Creation
– A var is created when your code first assigns it a value
– Future assignments change the value
• Variable Types
– A var never has any type information or constraints
associated with it
– The notion of type is associated with objects, not names
– Variables simply refer to an object
• Variable Use
– When a var appears in an expression it is replaced
immediately by the object it references
Object creation
• Each object also carries two pieces of metadata:
– A type designator used to mark the type of the object
– A reference counter to determine when it is ok to
reclaim the object
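Both pieces of metadata can be observed from Python itself. The following is a minimal sketch that assumes CPython, where the reference counter is an implementation detail exposed by sys.getrefcount; absolute counts vary between versions, so only the difference is inspected.

```python
import sys

a = [1, 2, 3]            # a new list object; the name a refers to it
print(type(a))           # the type designator lives with the object

before = sys.getrefcount(a)
b = a                    # a second name for the same object
after = sys.getrefcount(a)
print(after - before)    # the counter grew because of the new reference

a = 'spam'               # rebinding a: the list itself is unchanged
print(type(a), type(b))  # a now refers to a str, b still to the list
```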
Types live with objects
>>> a = 3
>>> a = 'spam'
>>> a = 1.23
• What is happening?
• Names have no types
– We are simply changing the reference
• What about the objects? What happens to them?
– Objects are garbage collected
– This happens thanks to the reference counter on each
object
Shared references
Types (part I)
Conceptual hierarchy
• Programs are composed of modules
• Modules contain statements
• Statements contain expressions
• Expressions create and process objects
• Everything is an object in a Python script
• We will follow a bottom up approach
Core data types
• Numbers
• Strings
• Lists
• Dictionaries
• Tuples
• Files
• Sets
• Other (booleans, types, None)
• Program unit types (functions, modules, classes)
Numbers
• Very basic
– support for integers, floating-point, complex, decimals
with fixed precision, and rationals
– support for +, *, **, -, /, %, and //
– some existing modules
>>> 123 + 222
345
>>> 1.5 * 4
6.0
>>> 2 ** 100
1267650600…
Modules
>>> import math
>>> math.pi
3.141592653589793
>>> math.sqrt(85)
9.219544457292887
>>> import random
>>> random.random()
0.39609123973370275
>>> random.choice([1,2,3,4])
1
Strings
• Used for text and for arbitrary collections of bytes
(e.g., content of image file)
• Strings are examples of sequences
– positionally ordered collection of objects
– stored and fetched by position
• Strings are Immutable
Examples
>>> S = 'Spam'
>>> len(S)
4
>>> S[0]
'S'
>>> S[1]
'p'
>>> S[-1]
'm'
>>> S[-2]
'a'
>>> S[1:3]
'pa'
>>> S[1:]
'pam'
>>> S[:]
'Spam'
>>> S[:3]
'Spa'
>>> S[:-1]
'Spa'
Slicing
>>> S = 'abcdefghijklmnop'
>>> S[1:10:2]
'bdfhj'
>>> S[::2]
'acegikmo'
>>> S = 'hello'
>>> S[::-1]
'olleh'
• The third parameter can be used for skipping
• With a negative stride the meanings of the first
two parameters are reversed
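A short sketch of what "reversed" means in practice: with a negative stride the first bound is the higher offset and the second bound is the lower one, still excluded. The string used here is only an illustration.

```python
S = 'abcdefg'

print(S[::-1])        # whole string reversed: 'gfedcba'
print(S[5:1:-1])      # offsets 5, 4, 3, 2: 'fedc'
print(S[4:0:-1])      # offsets 4, 3, 2, 1: 'edcb'
print(S[1:5][::-1])   # same result, slicing forward and then reversing
```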
Changing strings
>>> S = 'spam'
>>> S[0] = 'x'    # error!!! (TypeError: strings are immutable)
• So how can I change a string? I can’t!
– I can use concatenation to create a new string
– I can use slicing, indexing and concatenation
– but I’m still assigning a new string
>>> S = S + 'SPAM!'
>>> S
'spamSPAM!'
>>> S = S[:4] + 'Burger' + S[-1]
>>> S
'spamBurger!'
Type-specific methods
S = 'Spam'
S.find('pa') -> 1
S.replace('pa', 'XYZ') -> 'SXYZm'
>>> S
'Spam'
line = 'aaa,bbb,ccccc,dd'
line.split(',') -> ['aaa', 'bbb', 'ccccc', 'dd']
S = 'spam'
S.upper() -> 'SPAM'
S.isalpha() -> True
'%s, eggs, and %s' % ('spam', 'SPAM!')
-> 'spam, eggs, and SPAM!'
'{0}, eggs, and {1}'.format('spam', 'SPAM!')
-> 'spam, eggs, and SPAM!'
Some commonly used methods for strings
• split()
• join()
• lower()/upper()
• find()
• replace()
• strip()/lstrip()/rstrip()
• isalpha()/isdecimal()/isdigit()
• islower()
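The methods listed above are often chained. A small sketch, using a made-up input line, of how split, strip, lower, and join combine to normalise text:

```python
line = '  Spam, eggs ,HAM  \n'

# split on commas, then trim whitespace and normalise the case
words = [w.strip().lower() for w in line.split(',')]
print(words)                               # ['spam', 'eggs', 'ham']

joined = '-'.join(words)                   # join is called on the separator
print(joined)                              # spam-eggs-ham
print(joined.find('eggs'))                 # 5
print(joined.replace('-', ' ').islower())  # True
print('42'.isdigit(), 'x42'.isdigit())     # True False
```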
Strings: formatting
• String formatting expressions (a-la C)
– % is a binary operator - useful for encoding
multiple string substitutions all at once
• On the left
– a format string containing one or more
substitution targets
• On the right
– a tuple of objects
'That is %d %s bird!' % (1, 'dead')
'That is 1 dead bird!'
Lists
• Most general sequence
– Positionally ordered collections of arbitrary objects
– Accessed by offset
– Variable length, heterogenous, arbitrarily nestable
– Mutable
– Arrays of object references
Basics
>>> L = [123, 'spam', 1.23]
>>> len(L)
3
>>> L[0]
123
>>> L[:-1]
[123, 'spam']
>>> L + [4,5,6]
[123, 'spam', 1.23, 4, 5, 6]
>>> L*2
[123, 'spam', 1.23, 123, 'spam', 1.23]
>>> L
[123, 'spam', 1.23]
• L is not modified
• slicing/concat/repeat create new lists
Slicing (I)
• Replacement/insertion
>>> L = [1,2,3]
>>> L[1:2] = [4,5]
>>> L
[1,4,5,3]
• Insertion (replace nothing)
>>> L[1:1] = [6,7]
>>> L
[1,6,7,4,5,3]
• Deletion (insert nothing)
>>> L[1:2] = []
>>> L
[1,7,4,5,3]
Slicing (II)
• Empty slice at front
>>> L = [1]
>>> L[:0] = [2,3,4]
>>> L
[2,3,4,1]
• Empty slice at end
>>> L[len(L):] = [5,6,7]
>>> L
[2,3,4,1,5,6,7]
• Insert all at end
>>> L.extend([8,9,10])
>>> L
[2,3,4,1,5,6,7, 8, 9, 10]
Type-specific methods
• append()/extend()
– + creates a new list
– append modifies an existing list
• insert()
• pop()/remove()
• reverse()
• sort()
List-specific operations
>>> L
[123, 'spam', 1.23]
>>> L.append('NI')
>>> L
[123, 'spam', 1.23, 'NI']
>>> L.pop(2)
1.23
>>> L
[123, 'spam', 'NI']
>>> del L[0]
>>> L
['spam', 'NI']
>>> M = ['bb', 'aa', 'cc']
>>> M.sort()
>>> M
['aa', 'bb', 'cc']
>>> M.reverse()
>>> M
['cc', 'bb', 'aa']
Comprehensions
• Comprehensions are one of the most common
uses of iteration
• It is a way of creating a new list from an existing
one
– [ expression for var in list]
L = [1,2,3,4,5]
res = [x + 10 for x in L]
• is equivalent to
res = []
for x in L:
    res.append(x+10)
Extended Comprehensions
• Nested loops
[x+y for x in 'abc' for y in 'lmn']
['al', 'am', 'an', 'bl', 'bm', 'bn',
'cl', 'cm', 'cn']
• With conditions
[-x for x in [1, 12, 4, 7, 8] if x > 5]
[-12, -7, -8]
Shared references with mutable types
• What about mutable types with in-place changes?
>>> L1 = [2, 3, 4]
>>> L2 = L1          # L2 refers to the same list
>>> L1[0] = 24
>>> L1
[24, 3, 4]
>>> L2
[24, 3, 4]
>>> L1 = [2, 3, 4]
>>> L2 = L1[:]       # L2 refers to a copy
>>> L1[0] = 24
>>> L1
[24, 3, 4]
>>> L2
[2, 3, 4]
Shared references and Equality
• == tests whether the two referenced objects have
the same values
• is tests object identity
– Do they point to the same object?
>>> L = [1,2,3]
>>> M = L
>>> L == M
True
>>> L is M
True
>>> M = [1,2,3]
>>> L == M
True
>>> L is M
False
>>> X = 42
>>> Y = 42
>>> X == Y
True
>>> X is Y           # small integers are cached and shared
True
All statements
Assignments
• Assignments create object references
• Names are created when first assigned
• Names must be assigned before being referenced
– Python raises an exception (NameError) if you do not
• Some operations perform assignments implicitly
Examples
>>> nudge = 1
>>> wink = 2
>>> A,B = nudge,wink
>>> A,B
(1,2)
>>> [C,D] = [nudge, wink]
>>> C,D
(1,2)
Temporary tuples
• Python creates a temporary tuple to save the
original values of the variables on the right of the
assignment
>>> nudge = 1
>>> wink = 2
>>> nudge, wink = wink, nudge
>>> nudge, wink
(2,1)
• Assignments are always done by position! Be sure
you have the right number of items!
Sequence assignments
>>> string = 'SPAM'
>>> a,b,c = string # error!!! (ValueError: too many values to unpack)
>>> a,b,c = string[0], string[1], string[2:]
>>> a,b,c
('S', 'P', 'AM')
>>> a,b = string[:2]
>>> c = string[2:]
>>> a,b,c
('S', 'P', 'AM')
>>> (a,b),c = string[:2], string[2:]
>>> a,b,c
('S', 'P', 'AM')
Multiple-target assignments
>>> a = b = c = 'spam'
>>> a,b,c
('spam', 'spam', 'spam')
>>> a = b = []
>>> b.append(42)
>>> a, b
([42], [42])
>>> a = b = 0
>>> b = b + 1
>>> a,b
(0,1)
Output
• print outputs its argument(s) as a string
• Multiple arguments separated by commas are
printed separated by single blanks
>>> print(5)
5
>>> print("Hello")
Hello
>>> print('Hello')
Hello
>>> print('Hello', "students")
Hello students
Input and Output
• We can use a C-like syntax
• To read a value we must use input()
– It always returns a string
– int(), float() allow one to convert the string into a
different datum
>>> x=18
>>> y=15
>>> print("x= %d y= %d" % (x,y))
x= 18 y= 15
>>> print("x=", x, "y=", y)
x= 18 y= 15
>>> v=input("insert a number: ")
insert a number: 45
>>> n=int(v)
If-elif-else
• Python uses compound statements (nested
statements)
– Parentheses are optional
– End-of-line is end of statement
– End of indentation is end of block
• There is no switch statement (since Python 3.10, match/case offers structural pattern matching)
if test1:
    statements1
elif test2:
    statements2
else:
    statements3
Truth values and boolean tests
• All objects have an inherent boolean value
– Any nonzero number or nonempty object is true
– Zero numbers, empty objects, and the special object None
are false
– A non-empty string means true
• Comparisons and equality tests are applied recursively
to data structures
• Comparisons and equality tests return True or False
• Boolean and and or operators return a true or false operand object, not necessarily True or False
• Boolean operators stop evaluating as soon as the final
result is known
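The rules above can be checked directly. In this sketch the helper check and the list calls are hypothetical names, introduced only to make short-circuit evaluation visible.

```python
# Every object has a truth value.
falsy = [0, 0.0, '', [], {}, None]
truthy = [1, -2.5, 'spam', [0], {'a': 1}]
print(any(falsy), all(truthy))      # False True

# and/or return one of their operands, not necessarily True or False.
print(2 or 3)      # 2
print([] or {})    # {}
print(2 and 3)     # 3
print([] and 3)    # []

# Short-circuit: the right operand is skipped when the left one
# already decides the result.
calls = []
def check(value):
    calls.append(value)
    return value

result = check(0) and check(1)
print(result, calls)                # 0 [0]

# Comparisons recurse into nested structures.
print([1, ('a', 3)] == [1, ('a', 3)])   # True
print([1, ('a', 3)] < [1, ('a', 4)])    # True
```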
if/else ternary expression
if X:
    A = Y
else:
    A = Z
• can be stated using the following ternary
expression
A = Y if X else Z
• Example:
>>> A = 't' if 'spam' else 'f'
>>> A
't'
>>> A = 't' if '' else 'f'
>>> A
'f'
while loops
• Python does not have do/until loops
– We can however simulate them using a test and a break
• break - jumps out of the closest enclosing loop
• continue - jumps to the top of the closest
enclosing loop
• pass - does nothing
• loop else block - runs if and only if the loop is
exited normally
while test:
    statements
else:
    statements

x = 'spam'
while x:
    print(x, end=' ')
    x = x[1:]
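A do/until loop can be simulated with while True, a test at the bottom of the body, and a break. A minimal sketch, where the list of values stands in for any input source and is purely illustrative:

```python
values = iter([3, 7, -1, 5])   # hypothetical input stream

seen = []
while True:
    v = next(values)
    seen.append(v)             # the body always runs at least once
    if v < 0:                  # the "until" condition, tested at the end
        break

print(seen)                    # [3, 7, -1]
```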
Loop else
x = y // 2 # floor division
while x > 1:
    if y % x == 0:
        print(y, 'has factor', x)
        break
    x -= 1
else:
    print(y, 'is prime')
for loops
• The objects in the iterable object are assigned to
the target one-by-one
– The target is used to refer to the current item in the
sequence
for target in object:
    statements
    if test: break
    if test: continue
else:
    statements
T = [(1,2), (3,4), (5,6)]
for (a,b) in T:
    print(a,b)
Range
• Produces an iterable that generates items on
demand
– This means it needs to be wrapped in a list to see all its
results at once
>>> list(range(5)), list(range(2,5))
([0, 1, 2, 3, 4], [ 2, 3, 4])
for i in range(3):
    print(i, 'Pythons')
x = 'spam'
for i in range(len(x)):
    print(x[i], end=' ')
Enumerate
• Returns an iterator (an enumerate object) whose __next__
method returns an (index, value) tuple for each
loop iteration
s = 'spam'
for (offset, item) in enumerate(s):
    print(item, 'appears at offset', offset)
s appears at offset 0
p appears at offset 1
a appears at offset 2
m appears at offset 3
Iterable objects
• The for we saw previously works with any
Iterable object
– The iterable object is a generalisation of the notion of
sequence
• An object is iterable
– If it physically stores a sequence, or
– If it can produce one result at a time in the context of
an iteration tool
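The iteration protocol behind for can be driven by hand with the built-ins iter and next. A brief sketch with illustrative data:

```python
L = [10, 20, 30]

it = iter(L)               # ask the iterable for an iterator
first = next(it)           # 10
second = next(it)          # 20

# What a for loop does behind the scenes, written out by hand.
collected = []
it = iter('spam')
while True:
    try:
        item = next(it)
    except StopIteration:  # raised when no items are left
        break
    collected.append(item)

print(first, second, collected)

# Objects that produce results on demand are iterable as well.
print(list(range(3)), list({'a': 1, 'b': 2}))
```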
Dictionaries
• Accessed by key (not offset)
• Collection of arbitrary objects with no positional order (since Python 3.7, iteration follows insertion order)
• Variable length, heterogenous, arbitrarily nestable
• Mutable
• Tables of object references (hash tables)
Basics
>>> D = {'food':'Spam', 'quantity':4,
...      'color':'pink'}
>>> D['food']
'Spam'
>>> D['quantity'] += 1
>>> D
{'food':'Spam', 'quantity':5, 'color':'pink'}
>>> len(D)
3
>>> list(D.keys())
['food', 'quantity', 'color']
>>> list(D.values())
['Spam', 5, 'pink']
Creating Dictionaries
>>> D = {}
>>> D['name'] = 'Bob'
>>> D['job'] = 'developer'
>>> D['age'] = 40
>>> D
{'name':'Bob', 'job':'developer', 'age':40}
>>> print(D['name'])
Bob
>>> d1 = dict(name='Bob', job='developer',
...           age=40)
>>> d1
{'name':'Bob', 'job':'developer', 'age':40}
Changing Dictionaries
>>> D
{'eggs':3, 'spam':2, 'ham':1}
>>> D['ham'] = ['grill', 'bake', 'fry']
>>> D
{'eggs':3, 'spam':2,
'ham':['grill', 'bake', 'fry']}
>>> del D['eggs']
>>> D
{'spam':2, 'ham':['grill', 'bake', 'fry']}
>>> D['brunch'] = 'Bacon'
>>> D
{'spam':2, 'ham':['grill', 'bake', 'fry'],
'brunch':'Bacon'}
Non-string keys
>>> L = []
>>> L[99] = 'spam'
error! index out of range!
>>> D = {}
>>> D[99] = 'spam'
>>> D[99]
'spam'
>>> D
{99:'spam'}
Test for availability
d = {1:'how', 2:'are', 3:'you?'}
if 1 in d:
    print(d[1])
else:
    print(0)
Iterating with Dictionaries
D = {'a':1, 'b':2, 'c':3}
for key in D:
    print(key, '=>', D[key])
for (key,value) in D.items():
    print(key, '=>', value)
T = [(1,2), (3,4), (5,6)]
for both in T:
    a,b = both
    print(a,b)
Counting with a dictionary
colors = ['red', 'green', 'red',
          'blue', 'green', 'red']
d = {}
for color in colors:
    if color not in d:
        d[color] = 0
    d[color] += 1
d = {}
for color in colors:
    d[color] = d.get(color,0) + 1
Dictionary Comprehensions
• It is a way of creating a new dictionary from an
existing iterable
– {key_expression : value_expression for var in list}
• Requires two expressions
L = ['a', 'b', 'c', 'd']
d = {}
n = 1
for i in L:
    d[i] = n
    n = n + 1
L = ['a', 'b', 'c', 'd']
{letter: i+1 for i,letter in enumerate(L)}
Tuples
• Tuples are sequences
– Ordered collections of arbitrary objects
– Accessed by offset
– Immutable (like strings)
– Fixed length, heterogenous and arbitrarily nested
– Arrays of object references
Tuples: basics
>>> T = (1,2,3,4)
>>> len(T)
4
>>> T + (5,6)
(1,2,3,4,5,6)
>>> T[0]
1
>>> T.index(4)
3
>>>T.count(4)
1
>>>T[0] = 2 -> error
Syntax Issues
• Parentheses are used to enclose both expressions
and tuples in Python
• Commas are used to separate elements in a tuple
– Commas have no place in expressions
>>> x = (40)
>>> x
40
>>> x = (40,)
>>> x
(40,)
Sets
• Unordered collections of unique and immutable
objects
>>> X = set('spam')
>>> Y = {'h', 'a', 'm'}
>>> X, Y
({'m', 'a', 'p', 's'}, {'m', 'a', 'h'})
>>> X & Y
{'m', 'a'}
>>> X | Y
{'m', 'h', 'a', 'p', 's'}
>>> X-Y
{'p', 's'}
>>> X > Y
False
Examples
• Filtering out duplicates
>>> list(set([1,2,1,3,1]))
[1,2,3]
• Finding Differences
>>> set('spam') - set('ham')
{'p', 's'}
• Order-neutral equality
>>> set('spam') == set('asmp')
True
Set Comprehensions
Very similar to dictionary comprehensions, except
we only have values
{expression for var in list}
>>> a_set = set(range(10))
>>> a_set
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
>>> {x ** 2 for x in a_set}
{0, 1, 4, 81, 64, 9, 16, 49, 25, 36}
>>> {x for x in a_set if x % 2 == 0}
{0, 8, 2, 4, 6}
Functions
• Functions are the most basic program structure
• def is executable code - it is run to generate a
function
– The function does not exist until we reach the def
• def creates a function object and assigns a name
– The function name becomes the reference to the
function object
• Function names can be assigned, used in lists, etc.
• return sends a result object back to the caller
– A return without a value simply returns to the caller
(and sends back a None)
Function Basics
• Arguments, return values, and variable types are
not declared
• Arguments are passed by position (unless you say
otherwise)
• Scopes
– By default all names assigned in a function are local to
that function
– global declares module-level variables that are to be
assigned
A simple example
def times(x,y):
    return x * y
times(2,4)
times(3.14, 4)
times('Hello', 4)
[x for x in s1 if x in s2]
Another example
• A single function can be generally applied to a
wide variety of objects, as long as they implement
the correct interface
– As long as the first object supports the for loop, and the
second object supports the in keyword
def intersect(seq1, seq2):
    res = []
    for x in seq1:
        if x in seq2:
            res.append(x)
    return res
intersect("SPAM", "SCAM")
intersect([1,2,3], (1,4))
Scopes
• Where we define a variable will determine its
visibility
– By default, all names assigned inside a function are
associated with that function’s scope
• They can only be seen from within the function
• They do not clash with names defined outside the
function
• Variables can be defined in 3 different places
– Inside a def -> local to that function
– Inside an enclosing def -> nonlocal to the nested functions
– Outside of all defs -> global to the entire file
Summarising Scopes
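As a summary, name lookup proceeds from the local scope to the enclosing function scopes, then to the global (module) scope, and finally to the built-ins. The sketch below illustrates this standard lookup order; all function names are hypothetical.

```python
X = 'global'

def outer():
    X = 'enclosing'
    def inner():
        X = 'local'
        return X          # found in the local scope
    def reader():
        return X          # no local X: found in the enclosing def
    return inner(), reader()

def module_reader():
    return X              # no local or enclosing X: found at module level

print(outer())            # ('local', 'enclosing')
print(module_reader())    # global
print(len('abc'))         # len is found last, in the built-in scope
```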
Keyword global
• Try to minimise the use of global variables
X = 88
def func():
    X = 99      # local X, the global one is untouched
func()
print(X)        # 88

X = 88
def func():
    global X
    X = 99      # rebinds the module-level X
func()
print(X)        # 99
Keyword nonlocal
• Using keyword nonlocal we can reference a
variable within an enclosing function scope
X = 99
def f1():
    X = 88
    def f2():
        nonlocal X
        X = 3
    f2()
    print(X) #3
f1()
print(X) #99
Arguments
• Arguments are passed by automatically assigning
objects to local variable names
– Immutable arguments are effectively passed “by value”
– Mutable arguments (lists/dictionaries) are effectively
passed “by pointer”
Examples
• Immutable objects
def f(a):
    a = 99
b = 88
f(b)
print(b) # 88
• Mutable objects
def changer(a, b):
    a = 2
    b[0] = 'spam'
x = 1
l = [1, 2]
changer(x, l) # afterwards x, l is (1, ['spam', 2])
How to avoid changes
• There may be times when we want to avoid
change
– We need to pass a copy of the data
L = [1,2]
changer(X, L[:])
Output params and multiple Values
• Multiple output parameters can be packed into a
tuple, or a different collection type
def multiple(x,y):
    x = 2
    y = [3,4]
    return x, y
x = 1
l = [1,2]
x, l = multiple(x, l)
Argument matching modes
• The mapping between objects and argument
names is achieved by position
• Python also provides additional tools to alter this:
– Keywords - matched by argument name
– Defaults - specify values for optional arguments that are
not passed
– Varargs - arbitrarily many positional or keyword
arguments
Details
• In a function call, arguments must appear in this
order:
– Positional arguments (value)
– Followed by keyword arguments (name=value)
– Followed …
• In a function header, arguments must appear in
this order:
– Normal arguments (name)
– Followed by default arguments (name=value)
– Followed …
Example
def func(spam, eggs, toast=0, ham=0):
    print((spam, eggs, toast, ham))
func(1, 2)
func(1, ham=1, eggs=0)
func(spam=1, eggs=0)
func(toast=1, eggs=2, spam=3)
func(1, 2, 3, 4)
Arbitrary arguments
def f(*args):
    print(args)
f() # ()
f(1) # (1,)
f(1,2,3,4) # (1,2,3,4)
def f(**args):
    print(args)
f() # {}
f(a=1, b=2) # {'a':1, 'b':2}
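The matching modes can be combined in one header. The sketch below uses a hypothetical function report to show how positional, default, *rest and **options arguments are collected, and how * and ** unpack collections at the call site.

```python
def report(first, *rest, sep=', ', **options):
    # first: normal argument, rest: extra positionals (tuple),
    # sep: default value, options: extra keywords (dict)
    return first, rest, sep, options

print(report(1))
# (1, (), ', ', {})
print(report(1, 2, 3, sep='-', verbose=True))
# (1, (2, 3), '-', {'verbose': True})

# The same stars unpack arguments in a call.
args = (1, 2)
kwargs = {'sep': '/'}
print(report(*args, **kwargs))
# (1, (2,), '/', {})
```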
Factory functions
def f1():
    x = 88
    def f2():
        print(x)
    return f2

a = f1()
a()

def maker(n):
    def action(x):
        return x**n
    return action

f = maker(2)
print(f(3))
print(f(4))
Another example
def counter_factory():
    count = 0 # here we create the variable
    def counter():
        nonlocal count
        count += 1
        return count
    return counter
counter1 = counter_factory()
print(counter1()) # 1
print(counter1()) # 2
counter2 = counter_factory()
print(counter2()) # 1
print(counter2()) # 2
print(counter2()) # 3
lambdas
• An expression form that generates function
objects
– lambda arg1, arg2, arg3: expression over args
– Creates and returns a function, it does not assign a
name (anonymous)
– Commonly used to inline a function definition, or to
defer execution of a piece of code
– lambda is an expression, not a statement
– lambda bodies are single expressions, not blocks of
statements
f = lambda x, y, z: x + y + z
f(2, 3, 4)
Examples
l = [lambda x: x ** 2,
lambda x: x ** 3,
lambda x: x ** 4]
for f in l:
print(f(2)) # 4, 8, 16
print(l[0](3)) # 9
key = 'got'
d = {'already': (lambda: 2 + 2),
'got': (lambda: 2 * 4),
'one': (lambda: 2 ** 6)}
print(d[key]()) # 8
Files
• Function open creates a file object
– A link to a file residing on your machine
Modes and example
• read(r), write(w), append(a), binary(b)
• both input and output (+)
myFile = open('myFile.txt', 'w')
myFile.write('hello text file\n') # 16
myFile.write('goodbye text file\n') # 18
myFile.close()
myFile = open('myFile.txt')
print(myFile.readline()) # 'hello text file\n'
print(myFile.readline()) # 'goodbye text file\n'
print(myFile.readline()) # ''
for line in open('myFile.txt'):
    print(line)
nomeFile = input("File name: ")
file = open(nomeFile, "r")
content = file.read()
file.close()
words = content.split()
d = {}
for w in words:
    w = cleanup(w)   # cleanup: either cleanupV1 or cleanupV2 below
    if w in d:
        d[w] = d[w] + 1
    else:
        d[w] = 1
for w in d:
    print(w, '->', d[w])
def cleanupV1(p):
    bads = "!@#$%^&*()_-+={[}]|;:,<.>?/~`'"
    tmp = p.lower()
    return tmp.strip(bads)
def cleanupV2(p):
    tmp = p.lower()
    n = 0
    while n < len(tmp):
        if tmp[n] < "a" or tmp[n] > "z":
            tmp = tmp.replace(tmp[n], '')
        else:
            n += 1
    return tmp
Comprehensions and files
lines = [line.upper() for line in open('input.txt')]
• Python scans the file lazily, one line at a time, instead
of loading it with a single read(); the resulting list still
holds every line in memory
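To make the memory behaviour concrete, the sketch below contrasts a list comprehension with a generator expression over the same file. The scratch file and its content are made up for the example, and the with statement is used only so that the files are closed promptly.

```python
import os
import tempfile

# Create a small scratch file to read (hypothetical content).
path = os.path.join(tempfile.mkdtemp(), 'input.txt')
with open(path, 'w') as f:
    f.write('spam\neggs\nham\n')

# List comprehension: lines are read one by one, but all results are kept.
with open(path) as f:
    lines = [line.rstrip().upper() for line in f]
print(lines)      # ['SPAM', 'EGGS', 'HAM']

# Generator expression: only one line is held at a time.
with open(path) as f:
    longest = max(len(line.rstrip()) for line in f)
print(longest)    # 4
```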
#01.py
from M import func
from N import func
func() # We have a problem!
#02.py
import M
import N
M.func()
N.func()
#03.py
from M import func as mfunc
from N import func as nfunc
mfunc()
nfunc()
Error handling with Exceptions
• Exceptions are events that can modify the control
flow
– In Python they are triggered on errors and intercepted
by our code
• Statements:
– try/except - catch and recover from exceptions
– try/finally - perform cleanup actions
– try/except/else/finally – catch, recover, perform
cleanup actions
Example
obj = 'spam'
print(obj[3])
• What happens if we try with an index that is too
large?
– print(obj[4])
• We get an exception!
• Goes all the way up to the top level of the
program and invokes the default exception
handler which prints the standard error message
and stops the program
Traceback (most recent call last):
IndexError: string index out of range
Let’s try to catch It
obj = 'spam'
try:
    print(obj[4])
except IndexError:
    print('got the exception!')
print('continuing…')
OUTPUT:
got the exception!
continuing…
try/except/else/finally
• The search for what code to run is done top to
bottom and from left to right!
try:
    statements
except name1:
    statements
except (name2, name3):
    statements
except name4 as var:
    statements
except:
    statements
else:
    statements
finally:
    statements
Empty except
• The empty except will capture all exceptions
– This could be an easy way out!
• However, it will also capture exceptions that are
unrelated to our code
– It will also capture genuine programming mistakes for
which we want to see an error!
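The risk can be shown with a deliberate bug. In this sketch the function lookup is hypothetical and contains a typo on purpose: the bare except hides it, while naming the expected exception lets the mistake surface.

```python
def lookup(seq, i):
    return sequence[i]   # bug: the parameter is called seq, not sequence

# Bare except: the NameError caused by the typo is silently swallowed.
swallowed = False
try:
    lookup('spam', 10)
except:
    swallowed = True

# Named exception: only IndexError is handled, so the bug shows up.
surfaced = None
try:
    try:
        lookup('spam', 10)
    except IndexError:
        print('index out of range')
except NameError as exc:
    surfaced = str(exc)

print(swallowed, surfaced)
```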
Else
• Similar to the else clause in while and for loops!
• It helps us understand how I got to that specific
place in code!
– Did I successfully execute everything I wanted?
– Did I go through an except block of code?
• It can only appear if we have at least one except!
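A small sketch of which clauses run in each case. The function fetch and the list trace are hypothetical helpers that record the path taken through the statement.

```python
def fetch(seq, index):
    trace = []
    try:
        item = seq[index]
    except IndexError:
        trace.append('except')    # runs only if the try block failed
        item = None
    else:
        trace.append('else')      # runs only if the try block succeeded
    finally:
        trace.append('finally')   # runs in both cases
    return item, trace

print(fetch('spam', 3))   # ('m', ['else', 'finally'])
print(fetch('spam', 4))   # (None, ['except', 'finally'])
```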
Generator Functions
• Generators support procrastination
– Results are only developed when they are requested and
not all together
– They return results, but they can resume where they left
off when another call comes in
– They are defined using def but they are compiled into
an object that supports the iteration protocol
– They do not exit, they suspend their state around the
point of value generation
• The main difference with funcs lies in the use of
yield instead of return
– I can use the function anywhere I would expect an
iterable
Examples
def gensquares(N):
    for i in range(N):
        yield i**2
for i in gensquares(5):
    print(i)
x = gensquares(4)
print(next(x))
print(next(x))
print(next(x))
print(next(x))
Generator expressions
• Are syntactically like normal list comprehensions,
but with parentheses instead of square brackets
– Their objective is to create a generator object
print(list((x**2 for x in range(4))))
for num in (x**2 for x in range(4)):
    print(num, num/2)
Python as object-oriented language
• OO is optional in Python
– A lot can be done with pure functions and procedural
code
• However
– It promotes code organization, reuse and extension
– Real-life concepts and objects can be easily represented
through classes and objects
– Code gets more manageable as it grows in size
Running example – Car
Attributes
• brand: string
• model: string
• driver: string
• isOn: boolean
• seatbeltFastened: boolean
• radioPlaying: boolean
• wheels: int
Methods
• turnOn()
• turnOff()
• setDriver(string)
• move()
• fastenSeatbelt()
• switchRadio()
A class is defined by declaring
• a name (mandatory)
– usually in CamelCase style (e.g., Car)
• a “constructor” method __init__() (optional)
– defines how the class is instantiated
• other methods (optional)
– instance, class or static methods (later..)
– define object behavior
• class variables
– shared among all the class instances
class Car:
    pass # an empty block
Constructor (1)
class Car:
    def __init__(self, brand, model, driver = None):
        self.model = model
        self.brand = brand
        self.driver = driver
        self.isOn = False
myCar = Car('Fiat', 'Panda', 'Luciano')
print(myCar.model)
Constructor (2)
• __init__ is a reserved method of all Python
classes
• It is called every time an instance object is created,
even if not explicitly defined in the class code
– if the class inherits one, the superclass __init__ method is used
– otherwise, the object.__init__ method is used
• __init__ must return None; returning anything else raises a TypeError
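These rules can be observed with a few throwaway classes. All class names below are hypothetical and chosen only for this sketch.

```python
class Base:
    def __init__(self, brand):
        self.brand = brand

class Child(Base):
    pass                  # no __init__ here: Base.__init__ is used

class Plain:
    pass                  # no __init__ anywhere: object.__init__ is used

class Broken:
    def __init__(self):
        return 42         # __init__ must return None

child = Child('Fiat')
print(child.brand)        # Fiat
plain = Plain()           # works without arguments

message = None
try:
    Broken()
except TypeError as exc:
    message = str(exc)
print(message)
```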
Methods
• Act as regular functions encountered so far
– self is, by convention, the first argument of any method
class Car:
    …
    def turnOn(self):
        print("turning on..")
        self.isOn = True
    def setDriver(self, name):
        self.driver = name
myCar.turnOn()
myCar.setDriver('Giovanni')
Method types
class MyClass:
    def imeth(self):
        return 'instance method called', self
    @classmethod
    def cmeth(cls):
        return 'class method called', cls
    @staticmethod
    def smeth():
        return 'static method called'
obj = MyClass()
obj.imeth()
MyClass.imeth(obj)
MyClass.cmeth()
obj.cmeth()
MyClass.smeth()
obj.smeth()
Instance methods
• Are normally called through an instance (bound to self)
• If called through the class the instance reference
must be passed manually
• They can modify object state, but also class state
(through self.__class__)
Class methods
• Marked with @classmethod
• The class is the first parameter (cls),
• Can be called through classes or objects
• Cannot modify object states
• Can modify class state
Static methods
• Marked with @staticmethod
• Are called without an instance argument (neither
self, nor cls)
• Names are local to the scope of the classes in
which they are defined
• Can neither modify object state nor class state
• Used to group “functions” that have something to
do with the class
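A typical division of labour between the three kinds of methods is sketched below. The class Temperature and its members are hypothetical: the class method acts as an alternative constructor and the static method is a plain helper grouped with the class.

```python
class Temperature:
    count = 0                          # class state

    def __init__(self, celsius):
        self.celsius = celsius         # object state
        Temperature.count += 1

    def warmer(self, delta):           # instance method: changes object state
        self.celsius += delta

    @classmethod
    def from_fahrenheit(cls, f):       # class method: receives the class
        return cls(cls.to_celsius(f))

    @staticmethod
    def to_celsius(f):                 # static method: no self, no cls
        return (f - 32) * 5 / 9

t = Temperature.from_fahrenheit(212)
print(t.celsius)                       # 100.0
t.warmer(1.5)
print(t.celsius)                       # 101.5
print(Temperature.to_celsius(32))      # 0.0
print(Temperature.count)               # 1
```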
Class vs instance attributes (1)
class Car:
    wheels = 4
    def __init__(…):
        …
myCar = Car('Fiat', 'Panda', 'Luciano')
yourCar = Car('Renault', '5', 'Francesco')
print(Car.wheels)
Car.wheels = 5
print(myCar.wheels)
print(yourCar.wheels)
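The difference between class and instance attributes becomes visible when an instance assigns the same name. The sketch below assumes a simplified one-argument constructor, since the original one is elided.

```python
class Car:
    wheels = 4                     # class attribute, shared by all instances

    def __init__(self, brand):     # simplified constructor (assumption)
        self.brand = brand         # instance attribute

my_car = Car('Fiat')
your_car = Car('Renault')

Car.wheels = 5                     # visible through every instance
print(my_car.wheels, your_car.wheels)               # 5 5

my_car.wheels = 3                  # creates an instance attribute on my_car only
print(my_car.wheels, your_car.wheels, Car.wheels)   # 3 5 5

del my_car.wheels                  # the class attribute shows through again
print(my_car.wheels)               # 5
```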
Motorcycles
Attributes
• brand: string
• model: string
• name: string
• isOn: boolean
• elmetOn: boolean
• wheels: int
Methods
• turnOn()
• turnOff()
• setDriver(string)
• putElmet()
Inheritance - Vehicle
class Vehicle:
    def __init__(self, brand, model, driver):
        self.model = model
        self.brand = brand
        self.driver = driver
        self.isOn = False
    def turnOnOff(self):
        self.isOn = not self.isOn
    def setDriver(self, name):
        self.driver = name
Inheritance - Car
class Car(Vehicle):
    wheels = 4
    def __init__(self, brand, model, driver = None):
        super().__init__(brand, model, driver)
        self.radioOn = False
        self.seatbelt = False
    def turnOnOffRadio(self):
        self.radioOn = not self.radioOn
    def toggleSeatbelt(self):
        self.seatbelt = not self.seatbelt
Inheritance - Motorcycle
class Motorcycle(Vehicle):
    wheels = 2
    def __init__(self, brand, model, driver = None):
        super().__init__(brand, model, driver)
        self.elmetOn = False
    def putElmet(self):
        self.elmetOn = not self.elmetOn
Polymorphism
class Vehicle:
    …
    def accelerate(self):
        print("accelerating")
class Car(Vehicle):
    …
    def accelerate(self):
        print("pushing the pedal")
class Motorcycle(Vehicle):
    …
    def accelerate(self):
        print("twisting the throttle")
car1 = Car('Fiat', 'Panda', 'Francesco')
moto1 = Motorcycle('Garelli', 'Gulp', 'Matteo')
veh1 = Vehicle('Generic', 'Vehicle', 'Giovanni')
vehicles = [car1, moto1, veh1]
for i in vehicles:
    i.accelerate()
Protocols
from typing import Protocol
from abc import abstractmethod
class PColor(Protocol):
    @abstractmethod
    def draw(self) -> str:
        raise NotImplementedError
class NiceColor(PColor):
    def draw(self) -> str:
        return "deep blue"
class BadColor(PColor):
    def draw(self) -> str:
        return super().draw() # Error, no implementation
Multiple inheritance
class C2: …
class C3: …
class C1(C2, C3): …
o1 = C1()
o2 = C1()
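With multiple inheritance, attribute lookup follows the method resolution order (MRO), which starts from the class itself and then visits the superclasses left to right. A minimal sketch, adding hypothetical methods to the classes named above:

```python
class C2:
    def who(self):
        return 'C2'

class C3:
    def who(self):
        return 'C3'
    def only_c3(self):
        return 'only in C3'

class C1(C2, C3):
    pass

o1 = C1()
print(o1.who())        # C2: the leftmost superclass wins
print(o1.only_c3())    # found further along the search path
print([c.__name__ for c in C1.__mro__])
# ['C1', 'C2', 'C3', 'object']
```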
Intercepting Operators
• Classes can intercept operators that work on built-
in types
– Addition, slicing, printing, etc.
• This allows our objects to be more tightly
integrated in the Python object model
• Method names with double underscores (__X__)
are special hooks
– These methods are called automatically when instances
appear in built-in operations
• Classes may override most built-in operations
Operator overloading
+ __add__(self, other) Addition
* __mul__(self, other) Multiplication
- __sub__(self, other) Subtraction
% __mod__(self, other) Remainder
/ __truediv__(self, other) Division
< __lt__(self, other) Less than
<= __le__(self, other) Less than or equal to
== __eq__(self, other) Equal to
!= __ne__(self, other) Not equal to
> __gt__(self, other) Greater than
>= __ge__(self, other) Greater than or equal to
[index] __getitem__(self, index) Index operator
in __contains__(self, value) Check membership
len __len__(self) The number of elements
str __str__(self) The string representation
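Several of the hooks in the table can be combined to make an object behave like a built-in sequence. The class Deck below is a hypothetical container, written only to show which method each operation triggers.

```python
class Deck:
    def __init__(self, cards):
        self.cards = list(cards)

    def __len__(self):                 # len(d)
        return len(self.cards)

    def __getitem__(self, index):      # d[index]
        return self.cards[index]

    def __contains__(self, value):     # value in d
        return value in self.cards

    def __eq__(self, other):           # d == other
        return isinstance(other, Deck) and self.cards == other.cards

    def __add__(self, other):          # d + other, returns a new Deck
        return Deck(self.cards + other.cards)

    def __str__(self):                 # print(d), str(d)
        return 'Deck(%s)' % ', '.join(map(str, self.cards))

d = Deck([1, 2, 3])
print(len(d), d[0], d[-1], 2 in d)       # 3 1 3 True
print(d + Deck([4]) == Deck([1, 2, 3, 4]))  # True
print(d)                                 # Deck(1, 2, 3)
```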
Example
class MyClass:
    def __init__(self, value):
        self.data = value
    def __add__(self, other):
        return MyClass(self.data + other)
    def __str__(self):
        return '[MyClass: %s]' % self.data
    def __mul__(self, other):
        self.data *= other # changes the object in place and returns None
a = MyClass('abc')
print(a) # produces [MyClass: abc]
b = a + 'xyz'
print(b) # produces [MyClass: abcxyz]
a * 3
print(a) # produces [MyClass: abcabcabc]
User defined exceptions
class AlreadyGotOne(Exception):
    pass
def grail():
    raise AlreadyGotOne()
try:
    grail()
except AlreadyGotOne:
    print('got exception')
class Career(Exception):
    def __str__(self):
        return 'I became a waiter'
raise Career()
# __main__.Career: I became a waiter
Scientific packages
pip install <module>
NumPy
• The key to NumPy is the array object, an n-
dimensional array of homogeneous data types
– With many operations being performed in compiled
code for performance
– As a matter of fact, NumPy is a Python C extension
• There are several important differences between
NumPy arrays and standard Python sequences:
– NumPy arrays have a fixed size: modifying the size
means creating a new array
– The elements of a NumPy array must all be of the same data type, but this
can include Python objects
– More efficient mathematical operations than built-in
sequence types
NumPy arrays
>>> import numpy as np
>>> a = np.array([1,2,3,4,5,6,7,8,9])
>>> a
array([1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> b = a.reshape((3,3))
>>> b
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
>>> b * 10 + 4
array([[14, 24, 34],
[44, 54, 64],
[74, 84, 94]])
NumPy arrays
>>> np.zeros((2, 3))
array([[0., 0., 0.],
[0., 0., 0.]])
>>> np.ones((3,5))
array([[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.]])
>>> np.arange(10)
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.arange(2, 3, 0.1)
array([2. , 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9])
>>> np.arange(2, 10, dtype=np.double)
array([2., 3., 4., 5., 6., 7., 8., 9.])
Printing an array
>>> b = np.arange(9).reshape(3,3)
>>> print(b)
[[0 1 2]
[3 4 5]
[6 7 8]]
>>> c = np.arange(8).reshape(2,2,2)
>>> print(c)
[[[0 1]
[2 3]]
[[4 5]
[6 7]]]
Indexing
>>> x = np.arange(10)
>>> x[-2]
8
>>> x = x.reshape (2,5)
>>> x
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
>>> x[1,3]
8
>>> x[1,-1]
9
>>> y = np.arange(35).reshape(5,7)
>>> y
array([[ 0, 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27],
[28, 29, 30, 31, 32, 33, 34]])
>>> y[1:4:2,::3]
array([[ 7, 10, 13],
[21, 24, 27]])
Array operations
• Basic operations apply element-wise
– Result is a new array with the resulting elements
• Operations like *= and += modify the existing array
>>> a = np.arange(6).reshape(2,3)
>>> b = np.arange(6).reshape(2,3)
>>> a+b
array([[ 0,  2,  4],
       [ 6,  8, 10]])
>>> a-b
array([[0, 0, 0],
       [0, 0, 0]])
>>> a**2
array([[ 0,  1,  4],
       [ 9, 16, 25]])
>>> a>3
array([[False, False, False],
       [False,  True,  True]])
>>> 10*np.sin(a)
array([[ 0. , 8.41470985, 9.09297427],
[ 1.41120008, -7.56802495, -9.58924275]])
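The difference between an operation that builds a new array and one that modifies the existing array can be checked with a second name bound to the same object. A small sketch, with alias as an illustrative name:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
alias = a    # second name for the same array object

b = a * 2    # creates a new array; a is unchanged at this point
a *= 2       # modifies a in place, so alias sees the change too

print(alias is a)
print(np.array_equal(a, b))
```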
Multiplication
• Multiplication is done element-wise
• We must perform a dot product to perform
matrix multiplication
>>> a
array([[0, 1, 2],
[3, 4, 5]])
>>> b
array([[0, 1, 2],
[3, 4, 5]])
>>> b = b.reshape((3,2))
>>> np.dot(b,a)
array([[ 3, 4, 5],
[ 9, 14, 19],
[15, 24, 33]])
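In Python 3.5 and later, the @ operator is an alternative spelling of the matrix product for arrays. The sketch below rebuilds the arrays of this slide and compares both forms.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.arange(6).reshape(3, 2)

elementwise = a * a   # same shape, multiplied entry by entry
product = b @ a       # (3, 2) @ (2, 3) gives a (3, 3) matrix

print(elementwise)
print(np.array_equal(product, np.dot(b, a)))
```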
Linear algebra
>>> import numpy as np
>>> import numpy.linalg as la
>>> a = np.array([[1, 2], [3, 4]])
>>> a
array([[1, 2],
[3, 4]])
>>> a.transpose()
array([[1, 3],
[2, 4]])
>>> la.inv(a)
array([[-2. , 1. ],
[ 1.5, -0.5]])
>>> np.trace(a)
5
>>> b = np.eye(3)
>>> b
array([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]])
Matrices
• There is also a class matrix that inherits from the
ndarray
• There are some slight differences but matrices are
very similar to general arrays
• In NumPy’s own words, the question of whether
to use arrays or matrices comes down to the short
answer of “use arrays”
SciPy
• SciPy is a collection of mathematical algorithms
and convenience functions
• Provides the user with high-level commands and
classes for manipulating and visualizing data
• Contains various tools and functions for solving
common problems in scientific computing
SciPy modules
• Special mathematical functions (scipy.special) -- airy, elliptic,
bessel, etc.
• Integration (scipy.integrate)
• Optimization (scipy.optimize)
• Interpolation (scipy.interpolate)
• Fourier Transforms (scipy.fft; scipy.fftpack is the legacy interface)
• Signal Processing (scipy.signal)
• Linear Algebra (scipy.linalg)
• Compressed Sparse Graph Routines (scipy.sparse.csgraph)
• Spatial data structures and algorithms (scipy.spatial)
• Statistics (scipy.stats)
• Multidimensional image processing (scipy.ndimage)
• Data IO (scipy.io)
• Weave (scipy.weave), found only in old releases: it has since been
removed from SciPy
• and more!
SciPy
• Let’s start with a simple little integration example
$$\int_0^{\pi} \sin x \, dx$$
• Obviously, the first place we should look is
scipy.integrate!
Solution
• np.sin defines the sin function for us
• We can compute the definite integral from 𝑥 = 0
to 𝑥 = 𝜋 using the quad function
>>> import numpy as np
>>> import scipy.integrate as integrate
>>> result = integrate.quad(np.sin, 0, np.pi)
>>> result
(2.0, 2.220446049250313e-14)
>>> result = integrate.quad(np.sin, -np.inf, +np.inf)
>>> result
(0.0, 0.0)
Integrate with parameters
• We need to integrate a parametric function
>>> import numpy as np
>>> import scipy.integrate as integrate
>>> def integrand(x, a, b):
...     return a*x**2 + b
...
>>> result = integrate.quad(integrand, 0, 1, args=(2,1))
>>> print (result)
(1.6666666666666667, 1.8503717077085944e-14)
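The numerical result can be compared with the closed form: the antiderivative of a*x**2 + b is a*x**3/3 + b*x, so the integral over [0, 1] equals a/3 + b. A short sketch of that check, where the tolerance 1e-9 is an arbitrary choice:

```python
import scipy.integrate as integrate


def integrand(x, a, b):
    return a * x**2 + b


a, b = 2, 1
value, error_estimate = integrate.quad(integrand, 0, 1, args=(a, b))
exact = a / 3 + b   # a*x**3/3 + b*x evaluated between 0 and 1
print(value, exact, abs(value - exact) <= 1e-9)
```

quad returns a pair: the estimate of the integral and an estimate of the absolute error.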
Matplotlib
• Matplotlib is an incredibly powerful (and
beautiful!) 2-D plotting library
• It’s easy to use and provides a huge number of
examples for tackling unique problems
PyPlot
• When a single sequence is passed, it generates the
x-values for us starting with 0
import matplotlib.pyplot as plt
plt.plot([1,2,3,4,5])
plt.ylabel('some significant numbers')
plt.show()
PyPlot
• The plot function can actually take any number
of arguments.
• Common usage of plot: plt.plot(x, y, format[, x, y,
format, ...])
• Generally speaking, the x and y values will be NumPy
arrays; if they are not, they are converted to NumPy
arrays internally
• Line properties can be set via keyword arguments
to the plot function
– Examples include label, linewidth, animated, color,
etc…
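Setting line properties through keyword arguments might look like the sketch below. The Agg backend is an assumption made here so that the code runs without a display; in an interactive session that line can be dropped and plt.show() called at the end.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

t = np.arange(0., 5., 0.2)

# plot returns a list with one Line2D object per plotted line.
lines = plt.plot(t, t**2, color='green', linewidth=2.0, label='t squared')
plt.legend()   # uses the label given above

print(lines[0].get_linewidth(), lines[0].get_label())
```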
PyPlot
import numpy as np
import matplotlib.pyplot as plt
t = np.arange(0., 5., 0.2)
# red dashes, blue squares and green triangles
plt.plot(t, t, 'r--', t, t**2, 'bs', t, t**3, 'g^')
plt.axis([0, 6, 0, 150]) # x and y range of axis
plt.show()
PyPlot formatting
Bar charts
import matplotlib.pyplot as plt
import numpy as np
labels = ["Baseline", "System"]
data = [3.75, 4.75]
yerror = [0.3497, 0.3108]
xerror = [0.2, 0.2]
xlocations = np.array(range(len(data)))
width = 0.5
ec = 'r'
plt.bar(xlocations, data, yerr=yerror, width=width,
xerr=xerror, ecolor=ec)
plt.yticks(range(0, 8))
plt.xticks(xlocations, labels)
plt.title("Average Ratings on the Training Set")
plt.show()
hist(x, bins=n)
• x is a sequence of numbers
• If keyword argument bins is an integer, it’s the
number of (equally spaced) bins
– Default is 10
import matplotlib.pyplot as plt
import numpy as np
x = np.random.normal(2, 0.5, 1000)
plt.hist(x, bins=50)
plt.show()
scatter(x, y)
• x and y are arrays of numbers of the same length
• Makes a scatter plot of x vs. y
from pylab import *
N = 20
x = 0.9*rand(N)
y = 0.9*rand(N)
scatter(x,y)
show()
Scatter plots
from pylab import *
N = 30
x = 0.9*rand(N)
y = 0.9*rand(N)
area = pi * (10 * rand(N))**2 # 0 to 10 point radius
scatter(x,y,s=area, marker='^', c='r')
savefig('scatter_demo')
3D plots
from mpl_toolkits.mplot3d import Axes3D  # only needed by older Matplotlib releases
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
# fig.gca(projection='3d') is not accepted by recent Matplotlib releases
ax = fig.add_subplot(projection='3d')
theta = np.linspace(-4 * np.pi, 4 * np.pi, 100)
z = np.linspace(-2, 2, 100)
r = z**2 + 1
x = r * np.sin(theta)
y = r * np.cos(theta)
ax.plot(x, y, z, label='parametric curve')
ax.legend()  # shows the label set above
plt.show()
Pandas
Material courtesy of Katia Oleinik ([email protected])
Pandas
• Adds data structures and tools designed to work
with table-like data
• Provides tools for data manipulation: reshaping,
merging, sorting, slicing, aggregation etc.
• Allows handling missing data
Reading data using pandas
• pandas provides several functions to read
other data formats
import pandas as pd
pd.read_excel('myfile.xlsx', sheet_name='Sheet1',
              index_col=None, na_values=['NA'])
pd.read_stata('myfile.dta')
pd.read_sas('myfile.sas7bdat')
pd.read_hdf('myfile.h5','df')
Exploring data frames
import pandas as pd
data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}
#load data into a DataFrame object
df = pd.DataFrame(data)
print(df)
Data Frames methods
df.method()           description
head([n]), tail([n])  first/last n rows
describe()            generate descriptive statistics (for numeric columns only)
max(), min()          return max/min values for all numeric columns
mean(), median()      return mean/median values for all numeric columns
std()                 standard deviation
sample([n])           return a random sample of the data frame
dropna()              drop all the records with missing values
Selecting a column in a Data Frame
• Subset the data frame using column name
– df['sex']
• Use the column name as an attribute
– df.sex
Data Frames groupby method
• The groupby method allows us to
– Split the data into groups based on some criteria
– Calculate statistics (or apply a function) to each group
import pandas as pd
data = {
"calories": [420, 380, 420, 390],
"duration": [50, 40, 30, 45]
}
df = pd.DataFrame(data)
df_rank = df.groupby(['calories'])
print(df_rank.mean())
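Several statistics can be computed per group in one step with agg. The sketch below reuses the data of this slide; the choice of the three statistics is only an example.

```python
import pandas as pd

df = pd.DataFrame({
    "calories": [420, 380, 420, 390],
    "duration": [50, 40, 30, 45],
})

# Split on calories, then compute several statistics of duration per group.
summary = df.groupby('calories')['duration'].agg(['mean', 'max', 'count'])
print(summary)
```

The two rows with 420 calories fall into the same group, so their durations (50 and 30) are averaged together.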
Data Frame: filtering
• Any Boolean expression, such as a comparison on a
column, can be used to subset the data
df_sub = df[df['salary'] > 120000]
df_f = df[df['sex'] == 'Female']
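Conditions can be combined with & (and) and | (or), with each comparison wrapped in parentheses. The data frame below is hypothetical, since the slides do not show the data set they use.

```python
import pandas as pd

# Hypothetical data: only the column names come from the slides.
df = pd.DataFrame({
    'sex': ['Female', 'Male', 'Female', 'Male'],
    'salary': [130000, 125000, 95000, 80000],
})

mask = df['salary'] > 120000   # Boolean Series, one value per row
high_paid_women = df[mask & (df['sex'] == 'Female')]
print(mask.tolist())
print(high_paid_women)
```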
Data Frames: Slicing
• There are a number of ways to subset the Data
Frame
– one or more columns
– one or more rows
– a subset of rows and columns
• Rows and columns can be selected by their
position or label
Data Frames: Slicing
• When selecting one column, it is possible to use a
single set of brackets, but the resulting object will
be a Series (not a DataFrame)
• When we need to select more than one column
and/or want the output to be a DataFrame, we
should use double brackets
df['salary']
df[['rank','salary']]
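The type of the result can be checked directly. A minimal sketch on a hypothetical two-row data frame:

```python
import pandas as pd

# Hypothetical data: only the column names come from the slides.
df = pd.DataFrame({'rank': ['Prof', 'AsstProf'], 'salary': [150000, 90000]})

one = df['salary']     # single brackets: Series
two = df[['salary']]   # double brackets: DataFrame with one column
print(type(one).__name__, type(two).__name__)
print(two.shape)
```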
Data Frames: Selecting rows
• If we need to select a range of rows, we can
specify the range using ":"
– Notice that the first row has position 0 and the last
value in the range is omitted: 0:10 means positions
0 through 9
df[10:20]
Data Frames: method loc
• If we need to select a range of rows using their
labels, we can use loc (the end label is included)
• If we need to select a range of rows and/or
columns using their positions, we can use iloc
(the end position is excluded):
df_sub.loc[10:20,['rank','sex','salary']]
df_sub.iloc[10:20,[0, 3, 4, 5]]
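The difference between labels and positions is easiest to see when the two do not coincide. The sketch below uses a hypothetical data frame whose row labels start at 10.

```python
import pandas as pd

# Hypothetical frame whose labels (10..15) differ from its positions (0..5).
df = pd.DataFrame({'salary': [10, 20, 30, 40, 50, 60]}, index=range(10, 16))

by_label = df.loc[11:13, ['salary']]   # labels 11, 12 and 13: end included
by_position = df.iloc[1:3, [0]]        # positions 1 and 2: end excluded
print(len(by_label), len(by_position))
```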
Data Frames: method iloc (summary)
df.iloc[0]             # First row of a data frame
df.iloc[i]             # (i+1)th row
df.iloc[-1]            # Last row
df.iloc[:, 0]          # First column
df.iloc[:, -1]         # Last column
df.iloc[0:7]           # First 7 rows
df.iloc[:, 0:2]        # First 2 columns
df.iloc[1:3, 0:2]      # Second through third rows and first 2 columns
df.iloc[[0,5], [1,3]]  # 1st and 6th rows and 2nd and 4th columns
Data Frames: Sorting
• We can sort the data by a value in the column
– By default the sorting will occur in ascending order and
a new data frame is returned
df_sorted = df.sort_values(by='service')
df_sorted = df.sort_values(by=['service', 'salary'],
                           ascending=[True, False])
df_sorted.head(10)
Missing Values
• There are a number of methods to deal with
missing values in the data frame:
df.method()                 description
dropna()                    Drop missing observations
dropna(how='all')           Drop observations where all cells are NA
dropna(axis=1, how='all')   Drop a column if all its values are missing
dropna(thresh=5)            Drop rows that contain fewer than 5 non-missing values
fillna(0)                   Replace missing values with zeros
isnull()                    Return True if the value is missing
notnull()                   Return True for non-missing values
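A few of these methods applied to a small data frame. The data is hypothetical and was chosen so that each column has one missing cell.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two missing cells.
df = pd.DataFrame({'a': [1.0, np.nan, 3.0], 'b': [4.0, 5.0, np.nan]})

print(df.isnull().sum())   # number of missing values per column
complete = df.dropna()     # keeps only the rows without missing values
filled = df.fillna(0)      # replaces every missing value with 0
print(len(complete), filled['a'].tolist())
```

Both dropna and fillna return new data frames here; df itself still contains the missing values.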
SQLAlchemy
• A library to ease the usage of SQL in Python
• A toolkit for SQL
• An Object Relational Mapper
• It allows writing schema-based object oriented
applications
Questions?
Thank you very much!
[email protected]
Write a simple Python 3 function that takes a string as
parameter. The function must return a list containing the
first 5 positions of the character z, uppercase or lowercase, in
the string. If the string doesn’t contain z, the function should
return an empty list and if it contains z fewer than 5 times, the
function should use the last position found to fill the missing
cells to arrive at a list of 5 items. For example, if the string
were azvbzc, the function would return [1,4,4,4,4]. Also
write a simple main program that uses the function.
Write a simple Python 3 function that takes a dictionary of
words and their numbers of occurrences in some unknown text. The
function must return a list containing all words that begin
with a consonant and occur in the text an odd number of
times greater than 5. Also write a simple main program that
uses the defined function.
Write a simple Python 3 function that is passed a list of
words as a parameter and prompts the user for one more
word. The function will have to compare the word entered
by the user with those present in the passed list, looking for
rhymes, understood as words whose last 3 letters are the same
as those of the word entered by the user. The function must
then print the rhymes it finds.
The “scoundrel’s language” consists of doubling each
consonant in a word and inserting an “o” in between. For
example the word “mangiare” becomes
“momanongogiarore”. Write a Python 3 function that can
translate a word into the aforementioned language. The main
program should continue translating the words supplied by
the user until the word “stop” is entered.