Introduction To NumPy
Travis Oliphant and Eric Jones
Enthought, Inc.
www.enthought.com
PyCon 2008
NumPy
Website -- http://numpy.scipy.org/
Offers Matlab-ish capabilities within Python
NumPy replaces Numeric and Numarray
Developed by Travis Oliphant
27 svn committers to the project
NumPy 1.0 released October, 2006
~16K downloads/month from Sourceforge.
This does not count:
Linux distributions that include numpy
Enthought distributions that include numpy
Getting Started
IMPORT NUMPY
>>> from numpy import *
>>> __version__
1.0.2.dev3487
or
>>> from numpy import array, ...
Often at the command line, it is
handy to import everything from
numpy into the command shell.
However, if you are writing scripts,
it is easier for others to read and
debug in the future if you use
explicit imports.
USING IPYTHON -PYLAB
C:\> ipython -pylab
In [1]: array((1,2,3))
Out[1]: array([1, 2, 3])
IPython has a pylab mode where
it imports all of numpy, matplotlib,
and scipy into the namespace for
you as a convenience.
While IPython is used for all the
demos, >>> is used on future
slides instead of In [1]:
because it takes up less room.
Array Operations
SIMPLE ARRAY MATH
>>> a = array([1,2,3,4])
>>> b = array([2,3,4,5])
>>> a + b
array([3, 5, 7, 9])
NumPy defines the following
constants:
pi = 3.14159265359
e = 2.71828182846
MATH FUNCTIONS
# Create array from 0 to 10
>>> x = arange(11.)
# multiply entire array by
# scalar value
>>> a = (2*pi)/10.
>>> a
0.62831853071795862
>>> a*x
array([ 0., 0.628, ..., 6.283])
# inplace operations
>>> x *= a
>>> x
array([ 0., 0.628, ..., 6.283])
# apply functions to array.
>>> y = sin(x)
Plotting Arrays
MATPLOTLIB
>>> plot(x,y)
CHACO SHELL
>>> from enthought.chaco2 \
...     import shell
>>> shell.plot(x,y)
Introducing NumPy Arrays
SIMPLE ARRAY CREATION
>>> a = array([0,1,2,3])
>>> a
array([0, 1, 2, 3])
CHECKING THE TYPE
>>> type(a)
<type 'numpy.ndarray'>
NUMERIC TYPE OF ELEMENTS
>>> a.dtype
dtype('int32')
BYTES PER ELEMENT
>>> a.itemsize # per element
4
ARRAY SHAPE
# shape returns a tuple
# listing the length of the
# array along each dimension.
>>> a.shape
(4,)
>>> shape(a)
(4,)
ARRAY SIZE
# size reports the entire
# number of elements in an
# array.
>>> a.size
4
>>> size(a)
4
Introducing NumPy Arrays
BYTES OF MEMORY USED
# returns the number of bytes
# used by the data portion of
# the array.
>>> a.nbytes
16
NUMBER OF DIMENSIONS
>>> a.ndim
1
CONVERSION TO LIST
# convert a numpy array to a
# python list.
>>> a.tolist()
[0, 1, 2, 3]
# For 1D arrays, list also
# works equivalently, but
# is slower.
>>> list(a)
[0, 1, 2, 3]
ARRAY COPY
# create a copy of the array
>>> b = a.copy()
>>> b
array([0, 1, 2, 3])
Setting Array Elements
ARRAY INDEXING
>>> a[0]
0
>>> a[0] = 10
>>> a
array([10, 1, 2, 3])
FILL
# set all values in an array.
>>> a.fill(0)
>>> a
array([0, 0, 0, 0])
# This also works, but may
# be slower.
>>> a[:] = 1
>>> a
array([1, 1, 1, 1])
BEWARE OF TYPE COERCION
>>> a.dtype
dtype('int32')
# assigning a float into
# an int32 array will
# truncate the decimal part.
>>> a[0] = 10.6
>>> a
array([10, 1, 1, 1])
# fill has the same behavior
>>> a.fill(-4.8)
>>> a
array([-4, -4, -4, -4])
Multi-Dimensional Arrays
MULTI-DIMENSIONAL ARRAYS
>>> a = array([[ 0, 1, 2, 3],
[10,11,12,13]])
>>> a
array([[ 0, 1, 2, 3],
[10,11,12,13]])
NUMBER OF DIMENSIONS
>>> a.ndim
2
GET/SET ELEMENTS
# index as a[row, column]
>>> a[1,3]
13
>>> a[1,3] = -1
>>> a
array([[ 0,  1,  2,  3],
       [10, 11, 12, -1]])
(ROWS,COLUMNS)
>>> a.shape
(2, 4)
>>> shape(a)
(2, 4)
ELEMENT COUNT
>>> a.size
8
>>> size(a)
8
ADDRESS SECOND ROW USING
SINGLE INDEX
>>> a[1]
array([10, 11, 12, -1])
Array Slicing
SLICING WORKS MUCH LIKE
STANDARD PYTHON SLICING
>>> a[0,3:5]
array([3, 4])
>>> a[4:,4:]
array([[44, 45],
[54, 55]])
>>> a[:,2]
array([2,12,22,32,42,52])
STRIDES ARE ALSO POSSIBLE
(Figure: the 6x6 array a, with rows 0-5, 10-15, 20-25, 30-35, 40-45 and 50-55.)
>>> a[2::2,::2]
array([[20, 22, 24],
[40, 42, 44]])
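The slide never shows how a is built, so the following is a minimal sketch that assumes the 6x6 layout implied by the outputs above (row i holds 10*i + 0..5). The construction via broadcasting is an assumption for illustration; the four slices are the ones from the slide.

```python
import numpy as np

# Assumed construction of the 6x6 example array: row i holds 10*i + 0..5.
a = np.arange(6) + 10 * np.arange(6)[:, None]

row_part = a[0, 3:5]     # columns 3 and 4 of row 0
corner = a[4:, 4:]       # bottom-right 2x2 block
column = a[:, 2]         # column 2 of every row
strided = a[2::2, ::2]   # every other row from row 2, every other column

print(row_part)
print(corner)
print(column)
print(strided)
```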
Slices Are References
Slices are references to memory in original array. Changing values in a slice also
changes the original array.
>>> a = array((0,1,2,3,4))
# create a slice containing only the
# third element of a
>>> b = a[2:3]
>>> b[0] = 10
# changing b changed a!
>>> a
array([ 0,  1, 10,  3,  4])
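When an independent array is needed, an explicit copy() breaks the link to the original. A small sketch of the difference, using np.shares_memory (a helper in current NumPy releases, not part of the original slides) to check what is shared:

```python
import numpy as np

a = np.array([0, 1, 2, 3, 4])
view = a[2:3]            # slice: shares memory with a
copy = a[2:3].copy()     # explicit copy: independent storage

view[0] = 10             # writes through to a
copy[0] = -1             # leaves a untouched

print(a)
print(np.shares_memory(a, view), np.shares_memory(a, copy))
```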
Fancy Indexing
INDEXING BY POSITION
>>> a = arange(0,80,10)
# fancy indexing
>>> y = a[[1, 2, -3]]
>>> print y
[10 20 50]
# using take
>>> y = take(a,[1,2,-3])
>>> print y
[10 20 50]
INDEXING WITH BOOLEANS
>>> mask = array([0,1,1,0,0,1,0,0],
...              dtype=bool)
# fancy indexing
>>> y = a[mask]
>>> print y
[10 20 50]
# using compress
>>> y = compress(mask, a)
>>> print y
[10 20 50]
(Figure: a holds 0, 10, ..., 70; y holds the selected values 10, 20, 50.)
Fancy Indexing in 2D
>>> a[[0,1,2,3,4],[1,2,3,4,5]]
array([ 1, 12, 23, 34, 45])
>>> a[3:,[0, 2, 5]]
array([[30, 32, 35],
       [40, 42, 45],
       [50, 52, 55]])
>>> mask = array([1,0,1,0,0,1],
...              dtype=bool)
>>> a[mask,2]
array([ 2, 22, 52])
(Figure: the 6x6 array a, with rows 0-5, 10-15, 20-25, 30-35, 40-45 and 50-55.)
Unlike slicing, fancy indexing
creates copies instead of
views into original arrays.
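A short sketch of what the copy semantics mean in practice: changing the result of a fancy-indexing read leaves the source alone, while assigning through a fancy index still writes into the source. The array values are reused from the 1D example; the threshold 40 is only illustrative.

```python
import numpy as np

a = np.arange(0, 80, 10)

picked = a[[1, 2, -3]]    # fancy indexing on the right-hand side: a copy
picked[0] = -1            # does not touch a
unchanged = a.tolist()

a[a > 40] = 0             # fancy indexing on the left-hand side: writes into a

print(unchanged)
print(a)
```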
Indexing with None
None is a special index that inserts a new axis in the array at the specified
location. Each None increases the array's dimensionality by 1.
(Figure: the three-element 1D array a, holding 0, 1, 2, and the reshaped results below.)
1 X 3
>>> y = a[None,:]
>>> shape(y)
(1, 3)
3 X 1
>>> y = a[:,None]
>>> shape(y)
(3, 1)
3 X 1 X 1
>>> y = a[:,None, None]
>>> shape(y)
(3, 1, 1)
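A minimal sketch of why the extra axis is useful, assuming a is the three-element array 0, 1, 2: a column and a row built with None combine through broadcasting into a full table. np.newaxis is simply another name for None.

```python
import numpy as np

a = np.arange(3)             # assumed example array: [0, 1, 2]

row = a[None, :]             # shape (1, 3)
col = a[:, np.newaxis]       # shape (3, 1); np.newaxis is None
table = col + row            # broadcasts to shape (3, 3)

print(row.shape, col.shape, table.shape)
print(table)
```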
3D Example
MULTIDIMENSIONAL
# Retrieve two slices from a
# 3D cube via indexing.
>>> y = a[:,:,[2,-2]]
# The take() function also works.
>>> y = take(a,[2,-2], axis=2)
(Figure: the 3D cube a with axes 0, 1 and 2, and y, the two slices taken along axis 2.)
Where (version 1)
WHERE
# Find the indices in array
# where expression is True.
>>> a = array([0, 12, 5, 20])
>>> a>10
array([False,  True, False,  True], dtype=bool)
# where returns a tuple with one
# index array per dimension.
>>> where(a>10)
(array([1, 3]),)
Flattening Arrays
a.flatten()
a.flatten() converts a multi-dimensional
array into a 1D array. The new array is a copy of
the original data.
# Create a 2D array
>>> a = array([[0,1],
...            [2,3]])
# Flatten out elements to 1D
>>> b = a.flatten()
>>> b
array([0, 1, 2, 3])
# Changing b does not change a
>>> b[0] = 10
>>> b
array([10,  1,  2,  3])
>>> a              # no change
array([[0, 1],
       [2, 3]])
a.flat
a.flat is an attribute that returns an iterator
object that accesses the multi-dimensional array
data as a 1D array. It references the original
memory.
>>> a.flat
<numpy.flatiter obj...>
>>> a.flat[:]
array([0, 1, 2, 3])
>>> b = a.flat
>>> b[0] = 10
>>> a              # changed!
array([[10,  1],
       [ 2,  3]])
(Un)raveling Arrays
a.ravel()
a.ravel() is the same as a.flatten(),
but it returns a reference (or view) of the array
if it is possible (i.e. the memory is contiguous).
Otherwise the new array copies the data.
# Create a 2D array
>>> a = array([[0,1],
[2,3]])
# Flatten out elements to 1D
>>> b = a.ravel()
>>> b
array([0, 1, 2, 3])
# Changing b does change a
>>> b[0] = 10
>>> b
array([10,  1,  2,  3])
>>> a              # changed!
array([[10,  1],
       [ 2,  3]])
# Create a 2D array
>>> a = array([[0,1],
[2,3]])
# Transpose array so memory
# layout is no longer contiguous
>>> aa = a.transpose()
>>> aa
array([[0, 2],
[1, 3]])
# ravel will create a copy of data
>>> b = aa.ravel()
>>> b
array([0, 2, 1, 3])
# changing b doesn't change aa (or a).
>>> b[0] = 10
>>> b
array([10,  2,  1,  3])
>>> aa
array([[0, 2],
       [1, 3]])
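The copy-versus-view behaviour of flatten() and ravel() can be checked directly rather than inferred from side effects. A small sketch, again using np.shares_memory from current NumPy (not something the slides rely on):

```python
import numpy as np

a = np.array([[0, 1], [2, 3]])

flat_copy = a.flatten()      # always a copy
flat_view = a.ravel()        # a view here, because a is contiguous
t_ravel = a.T.ravel()        # a copy: the transpose is not C-contiguous

print(np.shares_memory(a, flat_copy))
print(np.shares_memory(a, flat_view))
print(np.shares_memory(a, t_ravel))

flat_view[0] = 10            # writes through to a
print(a)
```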
Reshaping Arrays
SHAPE AND RESHAPE
RESHAPE
>>> a = arange(6)
>>> a
array([0, 1, 2, 3, 4, 5])
>>> a.shape
(6,)
# return a new array with a
# different shape
>>> a.reshape(3,2)
array([[0, 1],
[2, 3],
[4, 5]])
# reshape array inplace to 2x3
>>> a.shape = (2,3)
>>> a
array([[0, 1, 2],
[3, 4, 5]])
# reshape cannot change the
# number of elements in an
# array.
>>> a.reshape(4,2)
ValueError: total size of new
array must be unchanged
Transpose
TRANSPOSE
>>> a = array([[0,1,2],
...            [3,4,5]])
>>> a.shape
(2,3)
# Transpose swaps the order
# of axes. For 2D this
# swaps rows and columns
>>> a.transpose()
array([[0, 3],
[1, 4],
[2, 5]])
# The .T attribute is
# equivalent to transpose()
>>> a.T
array([[0, 3],
[1, 4],
[2, 5]])
TRANSPOSE RETURNS VIEWS
>>> b = a.T
# changes to b alter a
>>> b[0,1] = 30
>>> a
array([[ 0,  1,  2],
       [30,  4,  5]])
TRANSPOSE AND STRIDES
# Transpose does not move
# values around in memory. It
# only changes the order of
# strides in the array
>>> a.strides
(12, 4)
>>> a.T.strides
(4, 12)
Squeeze
SQUEEZE
>>> a = array([[1,2,3],
...            [4,5,6]])
>>> a.shape
(2,3)
# insert an extra dimension
>>> a.shape = (2,1,3)
>>> a
array([[[1, 2, 3]],
       [[4, 5, 6]]])
# squeeze removes any
# dimension with length=1.
# It returns a new array
# rather than changing a.
>>> a = a.squeeze()
>>> a.shape
(2,3)
Diagonals
DIAGONAL
>>> a = array([[11,21,31],
...            [12,22,32],
...            [13,23,33]])
# Extract the diagonal from
# an array.
>>> a.diagonal()
array([11, 22, 33])
# Use offset to move off the
# main diagonal.
>>> a.diagonal(offset=1)
array([21, 32])
DIAGONALS WITH INDEXING
# Fancy indexing also works.
>>> i = [0,1,2]
>>> a[i,i]
array([11, 22, 33])
# Indexing can also be used
# to set diagonal values
>>> a[i,i] = 2
>>> i = array([0,1])
# upper diagonal
>>> a[i,i+1] = 1
# lower diagonal
>>> a[i+1,i] = -1
>>> a
array([[ 2,  1, 31],
       [-1,  2,  1],
       [13, -1,  2]])
Complex Numbers
COMPLEX ARRAY ATTRIBUTES
>>> a = array([1+1j,2,3,4])
>>> a
array([1.+1.j, 2.+0.j, 3.+0.j,
       4.+0.j])
>>> a.dtype
dtype('complex128')
# real and imaginary parts
>>> a.real
array([ 1., 2., 3., 4.])
>>> a.imag
array([ 1., 0., 0., 0.])
# set imaginary part to a
# different set of values.
>>> a.imag = (1,2,3,4)
>>> a
array([1.+1.j, 2.+2.j, 3.+3.j,
4.+4.j])
CONJUGATION
>>> a.conj()
array([1.-1.j, 2.-2.j, 3.-3.j,
       4.-4.j])
FLOAT (AND OTHER) ARRAYS
>>> a = array([0.,1,2,3])
# .real and .imag attributes
# are available.
>>> a.real
array([ 0., 1., 2., 3.])
>>> a.imag
array([ 0., 0., 0., 0.])
# But .imag is read-only.
>>> a.imag = (1,2,3,4)
TypeError: array does not
have imaginary part to set
Array Constructor Examples
FLOATING POINT ARRAYS
DEFAULT TO DOUBLE PRECISION
>>> a = array([0,1.,2,3])   # notice decimal
>>> a.dtype
dtype('float64')
>>> a.nbytes
32
UNSIGNED INTEGER BYTE
>>> a = array([0,1,2,3],
...           dtype=uint8)
>>> a.dtype
dtype('uint8')
>>> a.nbytes
4
REDUCING PRECISION
>>> a = array([0,1.,2,3],
...           dtype=float32)
>>> a.dtype
dtype('float32')
>>> a.nbytes
16
NumPy dtypes
Basic Type         Available NumPy types              Comments
Boolean            bool                               Elements are 1 byte in size.
Integer            int8, int16, int32, int64,         int defaults to the size of int in
                   int128, int                        C for the platform.
Unsigned Integer   uint8, uint16, uint32, uint64,     uint defaults to the size of
                   uint128, uint                      unsigned int in C for the platform.
Float              float32, float64, float,           float is always a double precision
                   longfloat                          floating point value (64 bits).
                                                      longfloat represents large
                                                      precision floats. Its size is
                                                      platform dependent.
Complex            complex64, complex128, complex     The real and imaginary elements of a
                                                      complex64 are each represented by
                                                      a single precision (32 bit) value
                                                      for a total size of 64 bits.
Strings            str, unicode
Object             object                             Represent items in array as Python
                                                      objects.
Records            void                               Used for arbitrary data structures
                                                      in record arrays.
Type Casting
ASARRAY
>>> a = array([1.5, -3],
...           dtype=float32)
>>> a
array([ 1.5, -3.], dtype=float32)
# upcast
>>> asarray(a, dtype=float64)
array([ 1.5,-3.])
# downcast
>>> asarray(a, dtype=uint8)
array([ 1, 253], dtype=uint8)
# asarray is efficient.
# It *does not* make a
# copy if the type is the same.
>>> b = asarray(a, dtype=float32)
>>> b[0] = 2.0
>>> a
array([ 2., -3.], dtype=float32)
ASTYPE
>>> a = array([1.5, -3],
...           dtype=float64)
>>> a.astype(float32)
array([ 1.5, -3.], dtype=float32)
>>> a.astype(uint8)
array([ 1, 253], dtype=uint8)
# astype is safe.
# It always returns a copy of
# the array.
>>> b = a.astype(float64)
>>> b[0] = 2.0
>>> a
array([1.5, -3.])
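The difference between the two can be made explicit with identity and memory checks. This is a small sketch against current NumPy, where asarray hands back the very same object when no conversion is needed and astype copies by default:

```python
import numpy as np

a = np.array([1.5, -3], dtype=np.float32)

same = np.asarray(a, dtype=np.float32)   # no conversion needed: returns a itself
up = np.asarray(a, dtype=np.float64)     # conversion needed: new array
forced = a.astype(np.float32)            # astype copies even for the same dtype

print(same is a)
print(up is a, up.dtype)
print(np.shares_memory(forced, a))
```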
Array Calculation Methods
SUM FUNCTION
>>> a = array([[1,2,3],
[4,5,6]], float)
# sum defaults to summing
# *all* array values.
>>> sum(a)
21.
# supply the keyword axis to
# sum along the 0th axis.
>>> sum(a, axis=0)
array([5., 7., 9.])
# supply the keyword axis to
# sum along the last axis.
>>> sum(a, axis=-1)
array([6., 15.])
SUM ARRAY METHOD
# The a.sum() defaults to
# summing *all* array values
>>> a.sum()
21.
# Supply an axis argument to
# sum along a specific axis.
>>> a.sum(axis=0)
array([5., 7., 9.])
PRODUCT
# product along columns.
>>> a.prod(axis=0)
array([ 4., 10., 18.])
# functional form.
>>> prod(a, axis=0)
array([ 4., 10., 18.])
Min/Max
MIN
>>> a = array([2.,3.,0.,1.])
>>> a.min(axis=0)
0.
# use NumPy's amin() instead
# of Python's builtin min()
# for speedy operations on
# multi-dimensional arrays.
>>> amin(a, axis=0)
0.
MAX
>>> a = array([2.,1.,0.,3.])
>>> a.max(axis=0)
3.
# functional form
>>> amax(a, axis=0)
3.
ARGMIN
# Find index of minimum value.
>>> a.argmin(axis=0)
2
# functional form
>>> argmin(a, axis=0)
2
ARGMAX
# Find index of maximum value.
>>> a.argmax(axis=0)
3
# functional form
>>> argmax(a, axis=0)
3
Statistics Array Methods
MEAN
>>> a = array([[1,2,3],
[4,5,6]], float)
# mean value of each column
>>> a.mean(axis=0)
array([ 2.5, 3.5, 4.5])
>>> mean(a, axis=0)
array([ 2.5, 3.5, 4.5])
>>> average(a, axis=0)
array([ 2.5, 3.5, 4.5])
STANDARD DEV./VARIANCE
# Standard Deviation
>>> a.std(axis=0)
array([ 1.5, 1.5, 1.5])
# Variance
>>> a.var(axis=0)
array([2.25, 2.25, 2.25])
>>> var(a, axis=0)
array([2.25, 2.25, 2.25])
# average can also calculate
# a weighted average
>>> average(a, weights=[1,2],
...         axis=0)
array([ 3., 4., 5.])
Other Array Methods
CLIP
# Limit values to a range
>>> a = array([[1,2,3],
...            [4,5,6]], float)
# Set values < 3 equal to 3.
# Set values > 5 equal to 5.
# clip returns a new array.
>>> a.clip(3,5)
array([[ 3., 3., 3.],
       [ 4., 5., 5.]])
POINT TO POINT
# Calculate max - min for
# array along columns
>>> a.ptp(axis=0)
array([ 3.0, 3.0, 3.0])
# max - min for entire array.
>>> a.ptp(axis=None)
5.0
ROUND
# Round values in an array.
# NumPy rounds to even, so
# 1.5 and 2.5 both round to 2.
>>> a = array([1.35, 2.5, 1.5])
>>> a.round()
array([ 1., 2., 2.])
# Round to first decimal place.
>>> a.round(decimals=1)
array([ 1.4, 2.5, 1.5])
Summary of (most) array
attributes/methods
BASIC ATTRIBUTES
a.dtype Numerical type of array elements. float32, uint8, etc.
a.shape Shape of the array. (m,n,o,...)
a.size Number of elements in entire array.
a.itemsize Number of bytes used by a single element in the array.
a.nbytes Number of bytes used by entire array (data only).
a.ndim Number of dimensions in the array.
SHAPE OPERATIONS
a.flat An iterator to step through array as if it is 1D.
a.flatten() Returns a 1D copy of a multi-dimensional array.
a.ravel() Same as flatten(), but returns a view if possible.
a.resize(new_size) Change the size/shape of an array in-place.
a.swapaxes(axis1, axis2) Swap the order of two axes in an array.
a.transpose(*axes) Swap the order of any number of array axes.
a.T Shorthand for a.transpose()
a.squeeze() Remove any length=1 dimensions from an array.
FILL AND COPY
a.copy() Return a copy of the array.
a.fill(value) Fill array with a scalar value.
CONVERSION / COERCION
a.tolist() Convert array into nested lists of values.
a.tostring() raw copy of array memory into a python string.
a.astype(dtype) Return array coerced to given dtype.
a.byteswap(False) Convert byte order (big <-> little endian).
COMPLEX NUMBERS
a.real Return the real part of the array.
a.imag Return the imaginary part of the array.
a.conjugate() Return the complex conjugate of the array.
a.conj() Return the complex conjugate of an array.(same as conjugate)
SAVING
a.dump(file) Store a binary array data out to the given file.
a.dumps() returns the binary pickle of the array as a string.
a.tofile(fid, sep="", format="%s") Formatted ascii output to file.
SEARCH / SORT
a.nonzero() Return indices for all non-zero elements in a.
a.sort(axis=-1) Inplace sort of array elements along axis.
a.argsort(axis=-1) Return indices for element sort order along axis.
a.searchsorted(b) Return index where elements from b would go in a.
ELEMENT MATH OPERATIONS
a.clip(low, high) Limit values in array to the specified range.
a.round(decimals=0) Round to the specified number of digits.
a.cumsum(axis=None) Cumulative sum of elements along axis.
a.cumprod(axis=None) Cumulative product of elements along axis.
REDUCTION METHODS
All the following methods reduce the size of the array by 1
dimension by carrying out an operation along the specified axis. If
axis is None, the operation is carried out across the entire array.
a.sum(axis=None) Sum up values along axis.
a.prod(axis=None) Find the product of all values along axis.
a.min(axis=None) Find the minimum value along axis.
a.max(axis=None) Find the maximum value along axis.
a.argmin(axis=None) Find the index of the minimum value along axis.
a.argmax(axis=None) Find the index of the maximum value along axis.
a.ptp(axis=None) Calculate a.max(axis) - a.min(axis)
a.mean(axis=None) Find the mean (average) value along axis.
a.std(axis=None) Find the standard deviation along axis.
a.var(axis=None) Find the variance along axis.
a.any(axis=None) True if any value along axis is non-zero. (or)
a.all(axis=None) True if all values along axis are non-zero. (and)
Array Creation Functions
arange(start,stop=None,step=1,dtype=None)
Nearly identical to Python's range(). Creates an array of values
in the range [start,stop) with the specified step value. Allows
non-integer values for start, stop, and step. When not specified,
dtype is derived from the start, stop, and step values.
>>> arange(0,2*pi,pi/4)
array([ 0.000, 0.785, 1.571, 2.356, 3.142,
3.927, 4.712, 5.497])
ones(shape,dtype=float64)
zeros(shape,dtype=float64)
shape is a number or sequence specifying the dimensions of the
array. If dtype is not specified, it defaults to float64
>>> ones((2,3),dtype=float32)
array([[ 1., 1., 1.],
       [ 1., 1., 1.]],dtype=float32)
Array Creation Functions (cont.)
IDENTITY
# Generate an n by n identity
# array. The default dtype is
# float64.
>>> a = identity(4)
>>> a
array([[ 1., 0., 0., 0.],
[ 0., 1., 0., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 0., 1.]])
>>> a.dtype
dtype('float64')
>>> identity(4, dtype=int)
array([[ 1, 0, 0, 0],
[ 0, 1, 0, 0],
[ 0, 0, 1, 0],
[ 0, 0, 0, 1]])
EMPTY AND FILL
# empty(shape, dtype=float64,
#       order='C')
>>> a = empty(2)
>>> a
array([1.78021120e-306,
       6.95357225e-308])
# fill array with 5.0
>>> a.fill(5.0)
>>> a
array([5., 5.])
# alternative approach
# (slightly slower)
>>> a[:] = 4.0
>>> a
array([4., 4.])
Array Creation Functions (cont.)
LINSPACE
# Generate N evenly spaced
# elements between (and
# including) start and
# stop values.
>>> linspace(0,1,5)
array([0.,0.25,0.5,0.75,
       1.0])
LOGSPACE
# Generate N evenly spaced
# elements on a log scale
# between base**start and
# base**stop (default base=10)
>>> logspace(0,1,5)
array([ 1., 1.78, 3.16, 5.62,
       10.])
ROW SHORTCUT
# r_ and c_ are handy tools
# (cough hacks) for creating
# row and column arrays.
# Used like arange.
# -- real stride value.
>>> r_[0:1:.25]
array([ 0., 0.25, 0.5, 0.75])
# Used like linspace.
# -- complex stride value.
>>> r_[0:1:5j]
array([0.,0.25,0.5,0.75,1.0])
# concatenate elements
>>> r_[(1,2,3),0,0,(4,5)]
array([1, 2, 3, 0, 0, 4, 5])
Array Creation Functions (cont.)
MGRID
# get equally spaced points
# in N output arrays for an
# N-dimensional (mesh) grid
>>> x,y = mgrid[0:5,0:5]
>>> x
array([[0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3],
       [4, 4, 4, 4, 4]])
>>> y
array([[0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4]])
OGRID
# construct an open grid
# of points (not filled in
# but correctly shaped for
# math operations to be
# broadcast correctly).
>>> x,y = ogrid[0:3,0:3]
>>> x
array([[0],
       [1],
       [2]])
>>> y
array([[0, 1, 2]])
>>> print x+y
[[0 1 2]
 [1 2 3]
 [2 3 4]]
Matrix Objects
MATRIX CREATION
# Matlab-like creation from string
>>> A = mat('1,2,4;2,5,3;7,8,9')
>>> print A
[[1 2 4]
 [2 5 3]
 [7 8 9]]
# matrix exponents
>>> print A**4
[[ 6497  9580  9836]
 [ 7138 10561 10818]
 [18434 27220 27945]]
# matrix multiplication
>>> print A*A.I
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]
BMAT
# Create a matrix from
# sub-matrices.
>>> a = array([[1,2],
...            [3,4]])
>>> b = array([[10,20],
...            [30,40]])
>>> bmat('a,b;b,a')
matrix([[ 1,  2, 10, 20],
        [ 3,  4, 30, 40],
        [10, 20,  1,  2],
        [30, 40,  3,  4]])
Trig and Other Functions
TRIGONOMETRIC
sin(x)         sinh(x)
cos(x)         cosh(x)
arccos(x)      arccosh(x)
arctan(x)      arctanh(x)
arcsin(x)      arcsinh(x)
arctan2(x,y)
OTHERS
exp(x)         log(x)
log10(x)       sqrt(x)
absolute(x)    conjugate(x)
negative(x)    ceil(x)
floor(x)       fabs(x)
hypot(x,y)     fmod(x,y)
maximum(x,y)   minimum(x,y)
hypot(x,y)
Element by element distance
calculation using $\sqrt{x^2 + y^2}$
More Basic Functions
TYPE HANDLING
iscomplexobj  iscomplex  isrealobj  isreal
imag  real  real_if_close  isscalar
isneginf  isposinf  isinf  isfinite
isnan  nan_to_num  common_type  typename
OTHER USEFUL FUNCTIONS
fix  mod  amax  amin  ptp  sum  cumsum  prod  cumprod  diff  angle
unwrap  sort_complex  trim_zeros  fliplr  flipud  rot90  eye  diag
select  extract  insert  roots  poly  any  all  disp  unique
nansum  nanmax  nanargmax  nanargmin  nanmin
SHAPE MANIPULATION
atleast_1d  atleast_2d  atleast_3d  expand_dims
apply_over_axes  apply_along_axis
hstack  vstack  dstack  column_stack
hsplit  vsplit  dsplit  split  squeeze
Vectorizing Functions
Example
# special.sinc already available
# This is just for show.
def sinc(x):
    if x == 0.0:
        return 1.0
    else:
        w = pi*x
        return sin(w) / w
# attempt
>>> sinc([1.3,1.5])
TypeError: can't multiply
sequence to non-int
SOLUTION
>>> from numpy import vectorize
>>> vsinc = vectorize(sinc)
>>> vsinc([1.3,1.5])
array([-0.1981, -0.2122])
>>> x = r_[-5:5:100j]
>>> y = vsinc(x)
>>> plot(x, y)
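As a sanity check on the vectorized function, the sketch below compares it with NumPy's own np.sinc, which computes the same normalized sinc, sin(pi*x)/(pi*x). The sample points are arbitrary; vectorize is a convenience wrapper around a Python-level loop, so the built-in is the better choice when one exists.

```python
import numpy as np

def sinc(x):
    """Scalar-only sinc, as on the slide."""
    if x == 0.0:
        return 1.0
    w = np.pi * x
    return np.sin(w) / w

vsinc = np.vectorize(sinc)            # now accepts sequences and arrays

x = np.array([-1.5, 0.0, 1.3, 1.5])   # arbitrary sample points
y_vec = vsinc(x)
y_builtin = np.sinc(x)                # NumPy's normalized sinc

print(y_vec)
print(np.allclose(y_vec, y_builtin))
```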
Helpful Sites
SCIPY DOCUMENTATION PAGE
http://www.scipy.org/Documentation
NUMPY EXAMPLES
http://www.scipy.org/Numpy_Example_List_With_Doc
Mathematic Binary Operators
Operator    Function
a + b       add(a,b)
a - b       subtract(a,b)
a % b       remainder(a,b)
a * b       multiply(a,b)
a / b       divide(a,b)
a ** b      power(a,b)
MULTIPLY BY A SCALAR
>>> a = array((1,2))
>>> a*3.
array([3., 6.])
ELEMENT BY ELEMENT ADDITION
>>> a = array([1,2])
>>> b = array([3,4])
>>> a + b
array([4, 6])
ADDITION USING AN OPERATOR
FUNCTION
>>> add(a,b)
array([4, 6])
IN PLACE OPERATION
# Overwrite contents of a.
# Saves array creation
# overhead
>>> add(a,b,a) # a += b
array([4, 6])
>>> a
array([4, 6])
Comparison and Logical Operators
equal (==)           not_equal (!=)     greater (>)
greater_equal (>=)   less (<)           less_equal (<=)
logical_and          logical_or         logical_xor
logical_not
2D EXAMPLE
>>> a = array(((1,2,3,4),(2,3,4,5)))
>>> b = array(((1,2,5,4),(1,3,4,5)))
>>> a == b
array([[True, True, False, True],
[False, True, True, True]])
# functional equivalent
>>> equal(a,b)
array([[True, True, False, True],
[False, True, True, True]])
Bitwise Operators
bitwise_and (&)   invert (~)        right_shift(a,shifts)
bitwise_or (|)    bitwise_xor (^)   left_shift(a,shifts)
BITWISE EXAMPLES
>>> a = array((1,2,4,8))
>>> b = array((16,32,64,128))
>>> bitwise_or(a,b)
array([ 17, 34, 68, 136])
# bit inversion
>>> a = array((1,2,3,4), uint8)
>>> invert(a)
array([254, 253, 252, 251], dtype=uint8)
# left shift operation
>>> left_shift(a,3)
array([ 8, 16, 24, 32], dtype=uint8)
Bitwise and Comparison Together
PRECEDENCE ISSUES
# When combining comparisons with bitwise operations,
# precedence requires parentheses around the comparisons.
>>> a = array([1,2,4,8])
>>> b = array([16,32,64,128])
>>> (a > 3) & (b < 100)
array([ False, False, True, False])
LOGICAL AND ISSUES
# Note that the Python keyword "and" isn't supported for arrays;
# call the logical_and function (or use &) instead.
>>> a>3 and b<100
Traceback (most recent call last):
ValueError: The truth value of an array with more than one
element is ambiguous. Use a.any() or a.all()
# Also, you cannot currently use the short version of
# comparison with numpy arrays.
>>> 2<a<4
Traceback (most recent call last):
ValueError: The truth value of an array with more than one
element is ambiguous. Use a.any() or a.all()
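Both failing expressions have element-wise equivalents. A minimal sketch, using the same a and b as above; the bounds 1 and 8 in the range test are illustrative values chosen so that the result is not all False:

```python
import numpy as np

a = np.array([1, 2, 4, 8])
b = np.array([16, 32, 64, 128])

# Instead of "a>3 and b<100": element-wise logical and.
both = np.logical_and(a > 3, b < 100)

# Instead of a chained comparison such as "1<a<8": two comparisons joined by &.
between = (1 < a) & (a < 8)

print(both)
print(between)
```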
Universal Function Methods
The mathematical, comparison, logical, and bitwise operators that
take two arguments (binary operators) have special methods that
operate on arrays:
op.reduce(a,axis=0)
op.accumulate(a,axis=0)
op.outer(a,b)
op.reduceat(a,indices)
op.reduce()
op.reduce(a) applies op
to all the elements in the 1d
array a, reducing it to a single
value. Using add as an
example:
$$y = \mathrm{add.reduce}(a) = \sum_{n=0}^{N-1} a[n] = a[0] + a[1] + \dots + a[N-1]$$
ADD EXAMPLE
>>> a = array([1,2,3,4])
>>> add.reduce(a)
10
STRING LIST EXAMPLE
>>> a = array(['ab','cd','ef'],
...           dtype=object)
>>> add.reduce(a)
'abcdef'
LOGICAL OP EXAMPLES
>>> a = array([1,1,0,1])
>>> logical_and.reduce(a)
False
>>> logical_or.reduce(a)
True
op.reduce()
For multidimensional arrays, op.reduce(a,axis) applies op to the
elements of a along the specified axis. The resulting array has
dimensionality one less than a. The default value for axis is 0.
SUM COLUMNS BY DEFAULT
>>> add.reduce(a, 0)
array([60, 64, 68])
SUMMING UP EACH ROW
>>> add.reduce(a,1)
array([ 3, 33, 63, 93])
(Figure: the 4x3 array a with rows 0 1 2, 10 11 12, 20 21 22 and 30 31 32, its column sums 60, 64, 68 and its row sums 3, 33, 63, 93.)
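The array a is only shown in the figure, so this sketch rebuilds it under the assumption that it is the 4x3 array with rows 0 1 2, 10 11 12, 20 21 22, 30 31 32 (which matches the sums above). It also shows that add.reduce along an axis agrees with the sum method.

```python
import numpy as np

# Assumed 4x3 example array: row i holds 10*i + 0..2.
a = np.arange(3) + 10 * np.arange(4)[:, None]

col_sums = np.add.reduce(a, 0)   # collapse axis 0: one sum per column
row_sums = np.add.reduce(a, 1)   # collapse axis 1: one sum per row

print(col_sums)
print(row_sums)
print(np.array_equal(col_sums, a.sum(axis=0)))
```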
op.accumulate()
op.accumulate(a) creates
a new array containing the
intermediate results of the
reduce operation at each
element in a.
$$y = \mathrm{add.accumulate}(a) = \left[\sum_{n=0}^{0} a[n],\ \sum_{n=0}^{1} a[n],\ \dots,\ \sum_{n=0}^{N-1} a[n]\right]$$
ADD EXAMPLE
>>> a = array([1,2,3,4])
>>> add.accumulate(a)
array([ 1, 3, 6, 10])
STRING LIST EXAMPLE
>>> a = array(['ab','cd','ef'],
...           dtype=object)
>>> add.accumulate(a)
array(['ab','abcd','abcdef'],
      dtype=object)
LOGICAL OP EXAMPLES
>>> a = array([1,1,0])
>>> logical_and.accumulate(a)
array([True, True, False])
>>> logical_or.accumulate(a)
array([True, True, True])
op.reduceat()
op.reduceat(a,indices)
applies op to ranges in the 1d
array a defined by the values in
indices. The resulting array
has the same length as
indices.
For y = add.reduceat(a, indices):
$$y[i] = \sum_{n=\mathrm{indices}[i]}^{\mathrm{indices}[i+1]-1} a[n]$$
The last range runs to the end of a.
EXAMPLE
>>> a = array([0,10,20,30,
...            40,50])
>>> indices = array([1,4])
>>> add.reduceat(a,indices)
array([60, 90])
(Figure: a holds 0, 10, 20, 30, 40, 50; the ranges starting at indices 1 and 4 sum to 60 and 90.)
For multidimensional arrays,
reduceat() is always applied
along the last axis (sum of rows
for 2D arrays). This is
inconsistent with the default for
reduce() and accumulate().
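For the 1D example, the ranges can be spelled out by hand to see which elements each output covers. A small sketch, assuming increasing indices as in the slide's example; the bounds list is a helper introduced here for illustration:

```python
import numpy as np

a = np.array([0, 10, 20, 30, 40, 50])
indices = np.array([1, 4])

y = np.add.reduceat(a, indices)

# The same sums written out: each range runs from indices[i] up to
# (but not including) indices[i+1]; the last range runs to the end of a.
bounds = list(indices) + [len(a)]
manual = [int(a[bounds[i]:bounds[i + 1]].sum()) for i in range(len(indices))]

print(y)
print(manual)
```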
op.outer()
op.outer(a,b) forms all possible combinations of elements
between a and b using op. The shape of the resulting array results
from concatenating the shapes of a and b. (order matters)
>>> add.outer(a,b)
>>> add.outer(b,a)
(Figure: the result grids for add.outer(a,b) and add.outer(b,a).)
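The values of a and b appear only in the figure, so the sketch below uses hypothetical 1D inputs to show how the argument order determines the result shape:

```python
import numpy as np

# Hypothetical inputs; the slide's own values are only shown in its figure.
a = np.array([0, 10, 20, 30])
b = np.array([0, 1, 2])

ab = np.add.outer(a, b)   # shape a.shape + b.shape = (4, 3); ab[i, j] = a[i] + b[j]
ba = np.add.outer(b, a)   # shape b.shape + a.shape = (3, 4); ba[j, i] = b[j] + a[i]

print(ab.shape, ba.shape)
print(ab)
```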
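Array Functions - choose()
>>> y = choose(choice_array,(c0,c1,c2,c3))
(Figure: c0 is the 3x3 array with rows 0 1 2, 3 4 5 and 6 7 8; c1 is filled with 5, c2 with 2, and c3 supplies the value 9. choice_array, with rows 0 0 0, 1 2 3 and 0 1 3, selects element by element which choice supplies each value, giving y with rows 0 1 2, 5 2 9 and 6 5 9.)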
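The figure's example can be reproduced as follows. This is a sketch that assumes the values read off the figure; c1, c2 and c3 are written as full 3x3 arrays here, although scalars would broadcast as well.

```python
import numpy as np

c0 = np.arange(9).reshape(3, 3)
c1 = np.full((3, 3), 5)
c2 = np.full((3, 3), 2)
c3 = np.full((3, 3), 9)          # assumed: the figure only shows 9 in the result

choice_array = np.array([[0, 0, 0],
                         [1, 2, 3],
                         [0, 1, 3]])

# y[i, j] is taken from the choice numbered choice_array[i, j].
y = np.choose(choice_array, (c0, c1, c2, c3))
print(y)
```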
Example - choose()
CLIP LOWER VALUES TO 10
>>> a
array([[ 0,  1,  2],
       [10, 11, 12],
       [20, 21, 22]])
>>> a < 10
array([[ True,  True,  True],
       [False, False, False],
       [False, False, False]],
      dtype=bool)
>>> choose(a<10,(a,10))
array([[10, 10, 10],
       [10, 11, 12],
       [20, 21, 22]])
CLIP LOWER AND UPPER
VALUES
>>> lt = a < 10
>>> gt = a > 15
>>> choice = lt + 2 * gt
>>> choice
array([[1, 1, 1],
       [0, 0, 0],
       [2, 2, 2]])
>>> choose(choice,(a,10,15))
array([[10, 10, 10],
       [10, 11, 12],
       [15, 15, 15]])
Array Functions - where()
# y takes values from true where condition is True
# and from false elsewhere.
>>> y = where(condition,true,false)
Array Functions - concatenate()
concatenate((a0,a1,...,aN),axis=0)
The input arrays (a0,a1,...,aN) will be concatenated along the
given axis. They must have the same shape along every axis
except the one given.
(Figure: x is the 2x3 array with rows 0 1 2 and 10 11 12; y is the 2x3 array with rows 50 51 52 and 60 61 62.)
>>> concatenate((x,y))      # rows of x followed by rows of y
>>> concatenate((x,y),1)    # x and y side by side
>>> array((x,y))            # x and y stacked along a new first axis
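A runnable version of the three calls, assuming the x and y values from the figure, makes the resulting shapes explicit:

```python
import numpy as np

# Assumed values, read from the figure.
x = np.array([[0, 1, 2], [10, 11, 12]])
y = np.array([[50, 51, 52], [60, 61, 62]])

stacked_rows = np.concatenate((x, y))      # along axis 0 -> shape (4, 3)
side_by_side = np.concatenate((x, y), 1)   # along axis 1 -> shape (2, 6)
new_axis = np.array((x, y))                # new leading axis -> shape (2, 2, 3)

print(stacked_rows.shape, side_by_side.shape, new_axis.shape)
```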
Array Broadcasting
(Figure: three additions that give the same 4x3 result, with rows 0 1 2, 10 11 12, 20 21 22 and 30 31 32. A 4x3 array plus a 4x3 array needs no stretching; a 4x3 array plus a length-3 array stretches the length-3 array down the rows; a 4x1 array plus a length-3 array stretches both.)
Broadcasting Rules
The trailing axes of both arrays must either be 1 or have the same
size for broadcasting to occur. Otherwise, a
"ValueError: frames are not aligned" exception is thrown.
(Figure: mismatch! A 4x3 array cannot be combined with a length-4 array.)
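The rule can be tried out directly. In this sketch the shapes follow the figures; the exact wording of the ValueError depends on the NumPy version, so only the exception type is checked.

```python
import numpy as np

a = np.zeros((4, 3))

ok_row = a + np.arange(3)            # trailing axes 3 and 3 match
ok_col = a + np.arange(4)[:, None]   # (4, 3) with (4, 1): the 1 is stretched

try:
    a + np.arange(4)                 # trailing axes 3 and 4: mismatch
    mismatch = False
except ValueError as exc:
    mismatch = True
    print("ValueError:", exc)

print(ok_row.shape, ok_col.shape, mismatch)
```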
Broadcasting in Action
>>> a = array((0,10,20,30))
>>> b = array((0,1,2))
>>> y = a[:, None] + b
(Figure: the 4x1 column 0, 10, 20, 30 plus the row 0, 1, 2 gives the 4x3 array y with rows 0 1 2, 10 11 12, 20 21 22 and 30 31 32.)
Vector Quantization Example
(Figure: observations and two targets plotted against Feature 1 and Feature 2; each observation is assigned to the target at minimum distance.)
Observations (obs): 10x3 array, one row per observation o0 ... o9, columns x, y, z
Code Book (book): 5x3 array, one row per code c0 ... c4, columns x, y, z
diff = obs[None,:,:] - book[:,None,:]    # 1x10x3 - 5x1x3 -> 5x10x3
distance = sqrt(sum(diff**2,axis=-1))    # 5x10
code_index = argmin(distance,axis=0)     # one code index per observation
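The three lines above can be run as they stand once obs and book exist. The sketch below fills them with random data (an assumption made only to have something to run on) and checks the broadcast result against an explicit loop over observations and codes.

```python
import numpy as np

rng = np.random.default_rng(0)
obs = rng.random((10, 3))     # 10 observations with 3 features (random stand-ins)
book = rng.random((5, 3))     # 5 codes with 3 features (random stand-ins)

diff = obs[None, :, :] - book[:, None, :]        # (1,10,3) - (5,1,3) -> (5,10,3)
distance = np.sqrt(np.sum(diff**2, axis=-1))     # (5,10): code-to-observation distances
code_index = np.argmin(distance, axis=0)         # (10,): nearest code per observation

# Reference: the same computation with explicit Python loops.
loop_index = [
    int(np.argmin([np.sqrt(np.sum((o - c) ** 2)) for c in book]))
    for o in obs
]

print(code_index)
print(loop_index)
```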
VQ Speed Comparisons
Method               Run Time (sec)   Speed Up
Matlab 5.3           1.611            -
Python VQ1, double   2.245            0.71
Python VQ1, float    1.138            1.42
Python VQ2, double   1.637            0.98
Python VQ2, float    0.954            1.69
C, double            0.066            24.40
C, float             0.064            24.40
4000 observations with 16 features categorized
into 40 codes. Pentium III 500 MHz.
VQ1 uses the technique described on the previous slide
verbatim.
VQ2 applies broadcasting on an observation by observation
basis.
Broadcasting Indexes
# Broadcasting can also be used to slice elements from
# different depths in a 3D (or any other shape) array.
# This is a very powerful feature of indexing.
>>> data_cube = ones((3,3,3), dtype=float32)
>>> xi,yi = ogrid[:3,:3]
>>> zi = array([[0, 1, 2],
[1, 1, 2],
[2, 2, 2]])
>>> horizon = data_cube[xi,yi,zi]
(Figure: the index arrays xi, yi and zi select one element of data_cube per (x, y) position; the selected data form the 2D array horizon.)
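With a cube of ones the selection is hard to see, so this sketch swaps in a cube of distinct values (an assumption for illustration) and compares the broadcast index against an explicit double loop:

```python
import numpy as np

data_cube = np.arange(27).reshape(3, 3, 3)   # distinct values instead of ones
xi, yi = np.ogrid[:3, :3]                    # shapes (3, 1) and (1, 3)
zi = np.array([[0, 1, 2],
               [1, 1, 2],
               [2, 2, 2]])

# xi, yi and zi broadcast to (3, 3); horizon[i, j] = data_cube[i, j, zi[i, j]].
horizon = data_cube[xi, yi, zi]

expected = [[int(data_cube[i, j, zi[i, j]]) for j in range(3)] for i in range(3)]
print(horizon)
```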
Pickling
When pickling arrays, use binary storage when possible to save space.
>>> a = zeros((100,100),dtype=float32)
# total storage
>>> a.nbytes
40000
# standard pickling balloons 4x
>>> ascii = cPickle.dumps(a)
>>> len(ascii)
160061
# binary pickling is very nearly 1x
>>> binary = cPickle.dumps(a,2)
>>> len(binary)
40051
Controlling Output Format
set_printoptions(precision=None,threshold=None,
                 edgeitems=None, linewidth=None,
                 suppress=None)
precision
The number of digits of precision to use for floating point
output. The default is 8.
threshold
array length where numpy starts truncating the output and
prints only the beginning and end of the array. The default
is 1000.
edgeitems
number of array elements to print at beginning and end of
array when threshold is exceeded. The default is 3.
linewidth
characters to print per line of output. The default is 75.
suppress
Indicates whether numpy suppresses printing small floating
point values in scientific notation. The default is False.
Controlling Output Formats
PRECISION
>>> a = arange(1e6)
>>> a
array([ 0.00000000e+00, 1.00000000e+00, 2.00000000e+00, ...,
9.99997000e+05, 9.99998000e+05, 9.99999000e+05])
>>> set_printoptions(precision=3)
>>> a
array([ 0.000e+00, 1.000e+00, 2.000e+00, ...,
        1.000e+06, 1.000e+06, 1.000e+06])
SUPPRESSING SMALL NUMBERS
>>> set_printoptions(precision=8)
>>> a = array((1, 2, 3, 1e-15))
>>> a
array([ 1.00000000e+00, 2.00000000e+00, 3.00000000e+00,
        1.00000000e-15])
>>> set_printoptions(suppress=True)
>>> a
array([ 1., 2., 3., 0.])
Controlling Error Handling
seterr(all=None, divide=None, over=None,
under=None, invalid=None)
Set the error handling flags in ufunc operations on a per thread basis. Each
of the keyword arguments can be set to 'ignore', 'warn', 'print', 'log', 'raise',
or 'call'.
all
Set the error handling mode for all error types to the specified
value.
divide
Set the error handling mode for divide-by-zero errors.
over
Set the error handling mode for overflow errors.
under
Set the error handling mode for underflow errors.
invalid
Set the error handling mode for invalid floating point errors.
Controlling Error Handling
>>> a = array((1,2,3))
>>> a/0.
Warning: divide by zero encountered in divide
array([ 1.#INF0000e+000, 1.#INF0000e+000, 1.#INF0000e+000])
# ignore division-by-zero. Also, save old values so that
# we can restore them.
>>> old_err = seterr(divide='ignore')
>>> a/0.
array([ 1.#INF0000e+000, 1.#INF0000e+000, 1.#INF0000e+000])
# Restore original error handling mode.
>>> old_err
{'divide': 'print', 'invalid': 'print', 'over': 'print',
'under': 'ignore'}
>>> seterr(**old_err)
>>> a/0.
Warning: divide by zero encountered in divide
array([ 1.#INF0000e+000, 1.#INF0000e+000, 1.#INF0000e+000])
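Current NumPy releases also provide np.errstate, a context manager that changes the mode for one block and restores the previous settings on exit; it is not part of the slides. A brief sketch of the save-and-restore pattern written that way:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])

# Silence divide-by-zero inside the block only.
with np.errstate(divide="ignore"):
    quiet = a / 0.0

# Turn divide-by-zero into an exception inside the block only.
raised = False
with np.errstate(divide="raise"):
    try:
        a / 0.0
    except FloatingPointError:
        raised = True

print(quiet, raised)
```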
Structured Arrays
# Create a data structure (dtype) that describes the fields and
# type of the items in each array element.
>>> particle_dtype = dtype([('mass','f4'), ('velocity', 'f4')])
# This must be a list of tuples. numpy (currently) doesn't like
# a list of arrays or a tuple of tuples.
>>> particles = array([(1,1), (1,2), (2,1), (1,3)],
dtype=particle_dtype)
>>> particles
[(1.0, 1.0) (1.0, 2.0) (2.0, 1.0) (1.0, 3.0)]
# retrieve the mass for all particles through indexing.
>>> particles['mass']
[ 1. 1. 2. 1.]
# retrieve particle 0 through indexing.
>>> particles[0]
(1.0, 1.0)
# sort particles in-place, with velocity as the primary field and
# mass as the secondary field.
>>> particles.sort(order=('velocity','mass'))
>>> particles
[(1.0, 1.0) (2.0, 1.0) (1.0, 2.0) (1.0, 3.0)]
# see demo/mutlitype_array/particle.py
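The same session as a self-contained script for current NumPy. The order argument is passed as a list of field names here; the data and field names are those from the slide.

```python
import numpy as np

particle_dtype = np.dtype([("mass", "f4"), ("velocity", "f4")])
particles = np.array([(1, 1), (1, 2), (2, 1), (1, 3)], dtype=particle_dtype)

masses = particles["mass"].tolist()    # field access returns a plain float32 array
first = particles[0]                   # one record: (mass, velocity)

# Sort in place: velocity is the primary key, mass breaks ties.
particles.sort(order=["velocity", "mass"])

print(masses)
print(particles)
```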