Random Numpy
What is a Random Number? A random number does NOT mean a different number every time. Random
means something that cannot be predicted logically.
Pseudo Random and True Random. Computers work on programs, and programs are a definite set of
instructions. So there must be some algorithm to generate a random number as well.
If there is a program to generate a random number, it can be predicted, thus it is not truly random.
Random numbers generated through a generation algorithm are called pseudo random.
Can we make truly random numbers? Yes. In order to generate a truly random number on our computers,
we need to get the random data from some outside source. This outside source is generally our
keystrokes, mouse movements, data on the network, etc.
We do not need truly random numbers unless it is related to security (e.g. encryption keys) or the basis
of the application is the randomness (e.g. digital roulette wheels).
Generate Random Number NumPy offers the random module to work with random numbers.
In [1]:
# Example
# Generate a random integer from 0 to 100:
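A minimal sketch of this example, assuming the legacy `numpy.random` interface that the later cells import. Note that `randint(100)` draws from 0 up to, but not including, 100.

```python
from numpy import random

# Draw one integer from the half-open range [0, 100)
x = random.randint(100)
print(x)
```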
Generate Random Float The random module's rand() method returns a random float between 0 and 1.
In [2]: # Example
# Generate a random float from 0 to 1:
0.9932839826871388
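A sketch of a call that produces output of this kind; the value differs on every run.

```python
from numpy import random

# Draw one float from the half-open interval [0.0, 1.0)
x = random.rand()
print(x)
```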
Generate Random Array In NumPy we work with arrays, and you can use the two methods from the
above examples to make random arrays.
Integers The randint() method takes a size parameter where you can specify the shape of an array.
In [3]:
# Example
# Generate a 1-D array containing 5 random integers from 0 to 100:
[45 42 91 5 2]
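A sketch of a call that yields an array of this shape; the drawn values will vary between runs.

```python
from numpy import random

# size=5 requests a 1-D array of five integers, each in [0, 100)
x = random.randint(100, size=5)
print(x)
```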
In [4]:
# Example
# Generate a 2-D array with 3 rows and 5 columns containing random integers from 0 to 99:
from numpy import random
x = random.randint(100, size=(3, 5))
print(x)
[[ 5 18 67 40 59]
[11 59 83 72 3]
[66 44 15 47 64]]
Floats The rand() method also allows you to specify the shape of the array.
In [5]: # Example
# Generate a 1-D array containing 5 random floats:
In [6]: # Example
# Generate a 2-D array with 3 rows, each row containing 5 random numbers:
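Both float examples can be sketched as follows. Unlike `randint()`, `rand()` takes the dimensions as separate positional arguments rather than a `size` tuple. The names `a` and `b` are illustrative.

```python
from numpy import random

a = random.rand(5)      # 1-D array of 5 floats in [0.0, 1.0)
b = random.rand(3, 5)   # 2-D array: 3 rows, 5 columns
print(a)
print(b)
```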
Generate Random Number From Array The choice() method allows you to generate a random value
based on an array of values.
The choice() method takes an array as a parameter and randomly returns one of the values.
In [7]: # Example
# Return one of the values in an array:
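A minimal sketch, assuming the same values `[3, 5, 7, 9]` that the later `choice()` cells use.

```python
from numpy import random

values = [3, 5, 7, 9]  # assumed sample values
x = random.choice(values)  # one element, each equally likely
print(x)
```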
In [8]:
# Example
# Generate a 2-D array with 3 rows and 5 columns consisting of the values in the array parameter (3, 5, 7, and 9):
from numpy import random
x = random.choice([3, 5, 7, 9], size=(3, 5))
print(x)
[[3 3 5 5 9]
[3 5 3 7 3]
[7 3 7 3 3]]
What is Data Distribution? Data Distribution is a list of all possible values, and how often each value
occurs.
Such lists are important when working with statistics and data science.
The random module offers methods that return randomly generated data distributions.
Random Distribution A random distribution is a set of random numbers that follow a certain probability
density function.
Probability Density Function: A function that describes a continuous probability. i.e. probability of all
values in an array.
We can generate random numbers based on defined probabilities using the choice() method of the
random module.
The choice() method allows us to specify the probability for each value.
The probability is set by a number between 0 and 1, where 0 means that the value will never occur and
1 means that the value will always occur.
In [18]:
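# Example
# Generate a 1-D array containing 100 values, where each value has to be 3, 5, 7 or 9,
# with probabilities 0.1, 0.3, 0.6 and 0.0 respectively:
from numpy import random
x = random.choice([3, 5, 7, 9], p=[0.1, 0.3, 0.6, 0.0], size=100)
print(x)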
[5 7 7 7 7 5 3 5 7 5 7 7 7 5 7 7 7 5 7 7 3 7 5 7 7 7 7 7 3 7 7 7 5 7 7 5 7
5 3 5 5 7 7 5 7 7 7 7 5 7 7 5 7 5 7 5 7 7 7 7 5 7 7 5 3 7 7 7 5 7 7 7 3 7
7 5 3 7 7 7 3 7 5 3 7 7 5 3 7 7 7 7 7 7 7 3 5 7 5 7]
Even if you run the example above 100 times, the value 9 will never occur.
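One way to check that the drawn values follow the requested probabilities is to count them over a larger sample. This is a sketch: the fixed seed and the sample size of 10000 are assumptions made for reproducibility.

```python
import numpy as np
from numpy import random

random.seed(0)  # fixed seed (assumption) so the counts are repeatable
x = random.choice([3, 5, 7, 9], p=[0.1, 0.3, 0.6, 0.0], size=10000)

# Count how often each value was drawn and convert counts to frequencies
values, counts = np.unique(x, return_counts=True)
freqs = dict(zip(values.tolist(), (counts / x.size).tolist()))
print(freqs)
```

The frequencies should land close to 0.1, 0.3 and 0.6, and 9 should be absent from the result.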
You can return arrays of any shape and size by specifying the shape in the size parameter.
In [9]:
# Example
# Same example as above, but return a 2-D array with 3 rows and 5 columns:
from numpy import random
x = random.choice([3, 5, 7, 9], p=[0.1, 0.3, 0.6, 0.0], size=(3, 5))
print(x)
[[7 3 5 5 3]
[5 5 7 7 7]
[7 3 7 7 7]]
Random Permutations
A permutation is an arrangement of elements, e.g. [3, 2, 1] is a permutation of [1, 2, 3].
The NumPy random module provides two methods for this: shuffle() and permutation().
Shuffling Arrays Shuffle means changing the arrangement of elements in place, i.e. in the array itself.
In [10]: # Example
# Randomly shuffle elements of the following array:
[5 3 4 1 2]
[1 2 5 3 4]
The permutation() method returns a re-arranged array (and leaves the original array un-changed).
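The difference between the two methods can be sketched as follows. The array `[1, 2, 3, 4, 5]` is an assumed input, consistent with the outputs shown above.

```python
import numpy as np
from numpy import random

arr = np.array([1, 2, 3, 4, 5])

# permutation() returns a shuffled copy; arr itself is unchanged
shuffled_copy = random.permutation(arr)
print(shuffled_copy, arr)

# shuffle() reorders arr in place and returns None
random.shuffle(arr)
print(arr)
```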
Normal Distribution The Normal Distribution is one of the most important distributions.
It is also called the Gaussian Distribution after the German mathematician Carl Friedrich Gauss.
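It fits the probability distribution of many events, e.g. IQ scores, heartbeat, etc.
Use the random.normal() method to get a normal data distribution. It has three parameters:
loc - (Mean) where the peak of the bell exists.
scale - (Standard Deviation) how flat the graph distribution should be.
size - the shape of the returned array.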
In [12]:
# Example
# Generate a random normal distribution of size 2x3:
from numpy import random
x = random.normal(size=(2, 3))
print(x)
In [13]: # Example
# Generate a random normal distribution of size 2x3 with mean 1 and standard deviation 2:
from numpy import random
x = random.normal(loc=1, scale=2, size=(2, 3))
print(x)
In [14]:
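# Visualize a normal distribution (a sketch: plots a kernel density estimate of 1000 samples)
from numpy import random
import matplotlib.pyplot as plt
import seaborn as sns
sns.kdeplot(random.normal(size=1000))
plt.show()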
Note: The curve of a Normal Distribution is also known as the Bell Curve because of the bell-shaped
curve.
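The roles of loc and scale can be checked numerically without plotting. This is a sketch: the seed and the sample size are assumptions chosen so the result is repeatable.

```python
from numpy import random

random.seed(0)  # fixed seed (assumption) for reproducibility
samples = random.normal(loc=1, scale=2, size=100000)

# With many samples, the sample mean and standard deviation approach loc and scale
print(samples.mean(), samples.std())
```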
Binomial Distribution
Binomial Distribution is a discrete distribution. It describes the outcome of binary scenarios, e.g. the
toss of a coin, which will be either heads or tails.
It has three parameters:
n - number of trials.
p - probability of occurrence of each trial (e.g. 0.5 for the toss of a coin).
size - the shape of the returned array.
Discrete Distribution: The distribution is defined at a separate set of events, e.g. a coin toss's result is
discrete as it can be only heads or tails, whereas the height of people is continuous as it can be 170,
170.1, 170.11 and so on.
In [15]: # Example
# Given 10 trials for coin toss generate 10 data points:
[6 7 5 3 4 7 7 5 7 7]
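A sketch of a call that produces output of this kind. Each data point is the number of successes in 10 tosses of a fair coin, so every value lies between 0 and 10.

```python
from numpy import random

# 10 data points; each counts the heads in n=10 tosses with p=0.5
x = random.binomial(n=10, p=0.5, size=10)
print(x)
```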
NumPy ufuncs
What are ufuncs? ufunc stands for "universal function"; ufuncs are NumPy functions that operate
on the ndarray object.
Why use ufuncs? ufuncs are used to implement vectorization in NumPy, which is much faster than
iterating over elements.
They also provide broadcasting and additional methods like reduce, accumulate, etc. that are very
helpful for computation.
ufuncs also take additional arguments, such as:
where - a boolean array or condition defining where the operations should take place.
out - an output array where the return value should be copied.
What is Vectorization? Converting iterative statements into a vector-based operation is called
vectorization.
For example, to add the elements of two lists:
list 1: [1, 2, 3, 4]
list 2: [4, 5, 6, 7]
One way of doing it is to iterate over both of the lists and sum each pair of elements.
In [17]:
# Example
# Without ufunc, we can use Python's built-in zip() method:
x = [1, 2, 3, 4]
y = [4, 5, 6, 7]
z = []
for i, j in zip(x, y):
    z.append(i + j)
print(z)
[5, 7, 9, 11]
In [18]: # Example
# With ufunc, we can use the add() function:
import numpy as np
# Define two lists
x = [1, 2, 3, 4]
y = [4, 5, 6, 7]
# Use numpy's add() function to add the elements of the two lists
z = np.add(x, y)
print(z)
[ 5 7 9 11]
How To Create Your Own ufunc To create your own ufunc, you define a function, as you would any
normal function in Python, and then convert it to a ufunc with the frompyfunc() method.
The frompyfunc() method takes the following arguments:
function - the name of the function.
inputs - the number of input arguments (arrays).
outputs - the number of output arrays.
In [19]:
#Example
# Create your own ufunc for addition:
import numpy as np
[6 8 10 12]
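A minimal sketch of this example. The names `myadd` and `myadd_ufunc` and the two input lists are assumptions, chosen so that the result matches the output shown above.

```python
import numpy as np

def myadd(x, y):
    return x + y

# frompyfunc(function, number_of_inputs, number_of_outputs)
myadd_ufunc = np.frompyfunc(myadd, 2, 1)

result = myadd_ufunc([1, 2, 3, 4], [5, 6, 7, 8])
print(result)
```

Note that ufuncs created this way return arrays of dtype object, since the wrapped function is ordinary Python.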
Check if a Function is a ufunc Check the type of a function to check if it is a ufunc or not.
In [32]:
#Example
#Check if a function is a ufunc:
import numpy as np
print(type(np.add))
<class 'numpy.ufunc'>
In [20]: # Example
# Use an if statement to check if the function is a ufunc or not:
import numpy as np
add is ufunc
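A sketch of such a check. It uses `isinstance()` against `np.ufunc`, which is one reasonable way to write the test; the variable `message` is illustrative.

```python
import numpy as np

# isinstance() against np.ufunc distinguishes ufuncs from ordinary functions
if isinstance(np.add, np.ufunc):
    message = "add is ufunc"
else:
    message = "add is not ufunc"
print(message)
```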
Simple Arithmetic
Simple Arithmetic You could use arithmetic operators + - * / directly between NumPy arrays, but this
section discusses an extension of the same where we have functions that can take any array-like
objects e.g. lists, tuples etc. and perform arithmetic conditionally.
Arithmetic Conditionally: means that we can define conditions where the arithmetic operation should
happen.
All of the discussed arithmetic functions take a where parameter in which we can specify that
condition.
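Conditional arithmetic with the where parameter can be sketched as follows. The arrays and the even-number condition are assumptions for illustration. An out array is supplied because positions where the condition is False are left untouched, so they need a defined starting value.

```python
import numpy as np

arr1 = np.array([10, 11, 12, 13, 14, 15])
arr2 = np.array([20, 21, 22, 23, 24, 25])

# Add only where arr1 is even; other positions keep the value already in `out`
mask = arr1 % 2 == 0
out = arr1.copy()
np.add(arr1, arr2, out=out, where=mask)
print(out)
```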
Addition The add() function sums the content of two arrays, and returns the results in a new array.
In [21]: # Example
# Add the values in arr1 to the values in arr2:
import numpy as np
[30 32 34 36 38 40]
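A sketch of this example; the two input arrays are assumptions, chosen so that the result matches the output shown above.

```python
import numpy as np

arr1 = np.array([10, 11, 12, 13, 14, 15])
arr2 = np.array([20, 21, 22, 23, 24, 25])

newarr = np.add(arr1, arr2)  # element-wise sum
print(newarr)
```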
Subtraction The subtract() function subtracts the values of one array from the values of another
array, and returns the results in a new array.
In [22]:
# Example
# Subtract the values in arr2 from the values in arr1:
import numpy as np
[-10 -1 8 17 26 35]
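A sketch of this example; the input arrays are assumptions, chosen so that the result matches the output shown above.

```python
import numpy as np

arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([20, 21, 22, 23, 24, 25])

newarr = np.subtract(arr1, arr2)  # arr1 - arr2, element-wise
print(newarr)
```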
Multiplication The multiply() function multiplies the values of one array by the values of another
array, and returns the results in a new array.
In [23]: # Example
# Multiply the values in arr1 with the values in arr2:
import numpy as np
Division The divide() function divides the values of one array by the values of another array, and
returns the results in a new array.
In [24]:
# Example
# Divide the values in arr1 by the values in arr2:
import numpy as np
Power The power() function raises the values of the first array to the power of the values of the
second array, and returns the results in a new array.
In [25]:
# Example
# Raise the values in arr1 to the power of values in arr2:
import numpy as np
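The multiplication, division and power examples can be sketched together. The small input arrays are assumptions, kept small so that the powers stay well inside the integer range.

```python
import numpy as np

arr1 = np.array([10, 20, 30])
arr2 = np.array([2, 4, 5])

product = np.multiply(arr1, arr2)  # element-wise product
quotient = np.divide(arr1, arr2)   # element-wise true division (float result)
powers = np.power(arr1, arr2)      # arr1 raised to arr2, element-wise
print(product)
print(quotient)
print(powers)
```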
Remainder Both the mod() and the remainder() functions compute the remainder of the values in the
first array divided by the corresponding values in the second array, and return the results in a new array.
In [26]:
# Example
# Return the remainders:
import numpy as np
[ 1 6 3 0 0 27]
You get the same result when using the remainder() function:
In [27]: # Example
# Return the remainders:
import numpy as np
[ 1 6 3 0 0 27]
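A sketch covering both functions; the input arrays are assumptions, chosen so that the result matches the output shown above.

```python
import numpy as np

arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 7, 9, 8, 2, 33])

by_mod = np.mod(arr1, arr2)
by_remainder = np.remainder(arr1, arr2)  # same operation under another name
print(by_mod)
print(by_remainder)
```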
Quotient and Mod The divmod() function returns both the quotient and the mod. The return value is two
arrays: the first contains the quotient and the second contains the mod.
In [28]: # Example
# Return the quotient and mod:
import numpy as np
Quotient: [ 3 2 3 5 25 1]
Remainder: [ 1 6 3 0 0 27]
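A sketch of this example; the input arrays are assumptions, chosen so that the result matches the output shown above.

```python
import numpy as np

arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 7, 9, 8, 2, 33])

# divmod() returns a tuple: (floor quotient, remainder)
quotient, remainder = np.divmod(arr1, arr2)
print("Quotient:", quotient)
print("Remainder:", remainder)
```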
Absolute Values Both the absolute() and the abs() functions perform the same element-wise absolute
operation, but we should use absolute() to avoid confusion with Python's built-in abs().
In [29]:
# Example
# Return the absolute values:
import numpy as np
[1 2 1 2 3 4]
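A sketch of this example; the input array is an assumption, chosen so that the result matches the output shown above.

```python
import numpy as np

arr = np.array([-1, -2, 1, 2, 3, -4])

newarr = np.absolute(arr)  # element-wise absolute value
print(newarr)
```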
Rounding Decimals
Rounding Decimals There are primarily five ways of rounding off decimals in NumPy: truncation, fix,
rounding, floor, and ceil.
Truncation Remove the decimals, and return the float number closest to zero. Use the trunc() and
fix() functions.
In [30]:
# Example
# Truncate elements of the following array:
import numpy as np
[-3. 3.]
In [31]:
# Example
# Same example, using fix():
import numpy as np
[-3. 3.]
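Both truncation functions can be sketched on one array. The input values are assumptions, chosen so that the result matches the output shown above.

```python
import numpy as np

values = [-3.1666, 3.6667]

truncated = np.trunc(values)  # drop the fractional part
fixed = np.fix(values)        # same result: round toward zero
print(truncated)
print(fixed)
```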
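Rounding The around() function rounds to the given number of decimals: the preceding digit is
incremented by 1 if the remainder is more than half, and left unchanged if it is less. Values exactly
halfway between two candidates are rounded to the nearest even value.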
In [32]:
# Example
# Round off 3.1666 to 2 decimal places:
import numpy as np
3.17
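A sketch of this example, with a second call added to show how around() treats values that lie exactly halfway: they go to the nearest even value.

```python
import numpy as np

rounded = np.around(3.1666, 2)  # round to 2 decimal places
print(rounded)

# Halfway cases go to the nearest even value
halves = np.around([0.5, 1.5, 2.5])
print(halves)
```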
Floor The floor() function rounds a decimal down to the nearest lower integer.
E.g. the floor of 3.166 is 3.
In [33]:
# Example
# Floor the elements of the following array:
import numpy as np
[-4. 3.]
Ceil The ceil() function rounds a decimal up to the nearest upper integer.
In [34]:
# Example
# Ceil the elements of the following array:
import numpy as np
[-3. 4.]
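Floor and ceil can be compared on the same array. The input values are assumptions, chosen so that the results match the outputs shown above.

```python
import numpy as np

values = [-3.1666, 3.6667]

floored = np.floor(values)  # toward negative infinity
ceiled = np.ceil(values)    # toward positive infinity
print(floored)
print(ceiled)
```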
NumPy Logs
Logs NumPy provides functions to perform log at base 2, e and 10.
We will also explore how we can take the log for any base by creating a custom ufunc.
The log functions place -inf in an element when the input is 0 and nan when the input is negative,
since the log cannot be computed there.
Log at Base 2 Use the log2() function to perform log at the base 2.
In [35]: # Example
# Find log at base 2 of all elements of the following array:
import numpy as np
Log at Base 10 Use the log10() function to perform log at the base 10.
In [36]: #Example
#Find log at base 10 of all elements of following array:
import numpy as np
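The base 2 and base 10 examples can be sketched together. The input array is an assumption, with values picked so that some results are whole numbers.

```python
import numpy as np

arr = np.array([1, 2, 4, 8, 10, 100])  # assumed sample inputs

base2 = np.log2(arr)
base10 = np.log10(arr)
print(base2)
print(base10)
```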
Natural Log, or Log at Base e Use the log() function to perform log at the base e.
In [37]:
# Example
# Find log at base e of all elements of the following array:
import numpy as np
arr = np.arange(1, 11)  # sample input (assumed): the integers 1 to 10
# Find the natural log (log at base e) of all elements in the array
log_arr = np.log(arr)
print(log_arr)
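NumPy has no ufunc for a log at an arbitrary base, but one can be built from Python's math.log(), which accepts a base as its second argument. The following is a sketch; the name `nplog` and the sample values are assumptions.

```python
import math
import numpy as np

# math.log(x, base) handles one scalar pair; frompyfunc lifts it to a ufunc
nplog = np.frompyfunc(math.log, 2, 1)

single = nplog(100, 15)         # log of 100 at base 15
many = nplog([8, 81], [2, 3])   # element-wise: log2(8), log3(81)
print(single)
print(many)
```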
NumPy Summations
Summations What is the difference between summation and addition?
Addition is done between two arguments whereas summation happens over n elements.
In [38]:
#Example
# Add the values in arr1 to the values in arr2:
import numpy as np
[2 4 6]
In [39]:
# Example
# Sum the values in arr1 and the values in arr2:
import numpy as np
12
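The two cells above can be sketched side by side. The input arrays are assumptions, chosen so that the results match the outputs shown.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([1, 2, 3])

added = np.add(arr1, arr2)    # element-wise: one result per pair
total = np.sum([arr1, arr2])  # one total over every element
print(added)
print(total)
```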
Summation Over an Axis If you specify axis=1, NumPy will sum the numbers in each array.
In [40]: # Example
# Perform summation in the following array over 1st axis:
import numpy as np
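A sketch of summation over axis 1, using assumed input arrays. Each row is summed separately, giving one total per array.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([1, 2, 3])

row_sums = np.sum([arr1, arr2], axis=1)  # one sum per row
print(row_sums)
```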
Cumulative Sum Cumulative sum means partially adding the elements in an array.
E.g. the partial sum of [1, 2, 3, 4] would be [1, 1+2, 1+2+3, 1+2+3+4] = [1, 3, 6, 10].
Perform a partial sum with the cumsum() function.
In [4]:
# Example
# Perform cumulative summation in the following array:
import numpy as np
[1 3 6]
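A sketch of this example; the input array is an assumption, chosen so that the result matches the output shown above.

```python
import numpy as np

arr = np.array([1, 2, 3])

running_total = np.cumsum(arr)  # [1, 1+2, 1+2+3]
print(running_total)
```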
NumPy Products
Products To find the product of the elements in an array, use the prod() function.
In [71]:
# Example
# Find the product of the elements of this array:
import numpy as np
24
In [70]:
# Example
# Find the product of the elements of two arrays:
import numpy as np
40320
Product Over an Axis If you specify axis=1, NumPy will return the product of each array.
In [69]:
# Example
# Perform product in the following array over 1st axis:
import numpy as np
[ 24 1680]
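The three product examples can be sketched together. The input arrays are assumptions, chosen so that the results match the outputs shown above.

```python
import numpy as np

arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5, 6, 7, 8])

single = np.prod(arr1)                      # product of one array
combined = np.prod([arr1, arr2])            # product over both arrays
per_row = np.prod([arr1, arr2], axis=1)     # one product per array
print(single)
print(combined)
print(per_row)
```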
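Cumulative Product Cumulative product means taking the product partially.
E.g. the partial product of [1, 2, 3, 4] is [1, 1*2, 1*2*3, 1*2*3*4] = [1, 2, 6, 24].
Perform a partial product with the cumprod() function.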
In [68]:
# Example
# Take cumulative product of all elements for the following array:
import numpy as np
[ 5 30 210 1680]
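A sketch of this example; the input array is an assumption, chosen so that the result matches the output shown above.

```python
import numpy as np

arr = np.array([5, 6, 7, 8])

running_product = np.cumprod(arr)  # [5, 5*6, 5*6*7, 5*6*7*8]
print(running_product)
```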
NumPy Differences
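Differences A discrete difference means subtracting two successive elements.
E.g. for [1, 2, 3, 4], the discrete difference would be [2-1, 3-2, 4-3] = [1, 1, 1].
To find the discrete difference, use the diff() function.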
In [67]:
# Example
# Compute discrete difference of the following array:
import numpy as np
[ 5 10 -20]
We can perform this operation repeatedly by passing the parameter n.
E.g. for [1, 2, 3, 4], the discrete difference with n = 2 would be [2-1, 3-2, 4-3] = [1, 1, 1]; then, since
n = 2, we do it once more with the new result: [1-1, 1-1] = [0, 0].
In [66]: # Example
# Compute discrete difference of the following array twice:
import numpy as np
[ 5 -30]
Returns: [5 -30] because: 15-10=5, 25-15=10, and 5-25=-20 AND 10-5=5 and -20-10=-30
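Both difference examples can be sketched on the array implied by the explanation above, [10, 15, 25, 5].

```python
import numpy as np

arr = np.array([10, 15, 25, 5])

once = np.diff(arr)        # differences of successive elements
twice = np.diff(arr, n=2)  # the same operation applied to its own result
print(once)
print(twice)
```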
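Finding LCM (Lowest Common Multiple) The lowest common multiple is the smallest number that is a
common multiple of two numbers. Use the lcm() function.
In [65]: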
# Example
# Find the LCM of the following two numbers:
import numpy as np
12
Finding LCM in Arrays To find the Lowest Common Multiple of all values in an array, you can use the
reduce() method.
The reduce() method will use the ufunc, in this case the lcm() function, on each element, and reduce
the array by one dimension.
In [64]:
# Example
# Find the LCM of the values of the following array:
import numpy as np
18
In [63]: # Example
# Find the LCM of all values of an array where the array contains all integers from 1 to 10:
import numpy as np
2520
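The three LCM examples can be sketched together. The inputs are assumptions, chosen so that the results match the outputs shown above (12, 18 and 2520).

```python
import numpy as np

pair = np.lcm(4, 6)                                # LCM of two numbers
of_array = np.lcm.reduce(np.array([3, 6, 9]))      # LCM across one array
one_to_ten = np.lcm.reduce(np.arange(1, 11))       # LCM of the integers 1..10
print(pair)
print(of_array)
print(one_to_ten)
```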
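Finding GCD (Greatest Common Divisor) The GCD, also known as the HCF (Highest Common Factor),
is the biggest number that is a common factor of both numbers. Use the gcd() function.
In [62]: # Example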
# Find the HCF (GCD) of the following two numbers:
import numpy as np
Finding GCD in Arrays To find the Highest Common Factor of all values in an array, you can use the
reduce() method.
The reduce() method will use the ufunc, in this case the gcd() function, on each element, and reduce
the array by one dimension.
In [61]:
# Example
# Find the GCD for all of the numbers in the following array:
import numpy as np
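Both GCD examples can be sketched together. The input values are assumptions for illustration.

```python
import numpy as np

pair = np.gcd(6, 9)  # GCD of two numbers

arr = np.array([20, 8, 32, 36, 16])
of_array = np.gcd.reduce(arr)  # GCD across the whole array
print(pair)
print(of_array)
```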
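Trigonometric Functions NumPy provides the ufuncs sin(), cos() and tan() that take values in radians
and produce the corresponding sin, cos and tan values.
In [60]: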
# Example
# Find sine value of PI/2:
import numpy as np
x = np.sin(np.pi / 2)
print(x)
1.0
In [59]:
# Example
# Find sine values for all of the values in arr:
import numpy as np
arr = np.array([np.pi/2, np.pi/3, np.pi/4, np.pi/5])  # sample angles in radians (assumed)
# Find the sine values for all of the values in the array
x = np.sin(arr)
print(x)
Convert Degrees Into Radians By default all of the trigonometric functions take radians as parameters
but we can convert radians to degrees and vice versa as well in NumPy.
In [58]:
# Example
# Convert all of the values in the following array arr to radians:
import numpy as np
In [57]: # Example
# Convert all of the values in the following array arr to degrees:
import numpy as np
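Both conversions can be sketched with deg2rad() and rad2deg(). The input angles are assumptions for illustration.

```python
import numpy as np

degrees = np.array([90, 180, 270, 360])

radians = np.deg2rad(degrees)  # degrees * pi / 180
print(radians)

# rad2deg() is the inverse conversion
back_to_degrees = np.rad2deg(radians)
print(back_to_degrees)
```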
Finding Angles Finding angles from values of sine, cos, tan. E.g. sin, cos and tan inverse (arcsin,
arccos, arctan).
NumPy provides ufuncs arcsin(), arccos() and arctan() that produce radian values for corresponding
sin, cos and tan values given.
In [56]: # Example
# Find the angle of 1.0:
import numpy as np
x = np.arcsin(1.0)
print(x)
1.5707963267948966
import numpy as np
arr = np.array([1, -1, 0.1])  # sample sine values (assumed)
# Find the angle for all of the sine values in the array
x = np.arcsin(arr)
# Print the resulting array
print(x)
NumPy provides the hypot() function that takes the base and perpendicular values and produces the
hypotenuse based on the Pythagorean theorem.
In [54]:
# Example
# Find the hypotenuse for 4 base and 3 perpendicular:
import numpy as np
base = 4
perp = 3
x = np.hypot(base, perp)
print(x)
5.0
Hyperbolic Functions NumPy provides the ufuncs sinh(), cosh() and tanh() that take values in radians
and produce the corresponding sinh, cosh and tanh values.
In [53]:
# Example
# Find sinh value of PI/2:
import numpy as np
x = np.sinh(np.pi / 2)
print(x)
2.3012989023072947
In [52]:
# Example
# Find cosh values for all of the values in arr:
import numpy as np
arr = np.array([np.pi/2, np.pi/3, np.pi/4, np.pi/5])  # sample values in radians (assumed)
# Find the cosh values for all of the values in the array
x = np.cosh(arr)
print(x)
Finding Angles Finding angles from values of hyperbolic sine, cos, tan. E.g. sinh, cosh and tanh inverse
(arcsinh, arccosh, arctanh).
NumPy provides the ufuncs arcsinh(), arccosh() and arctanh() that produce radian values for the
corresponding sinh, cosh and tanh values given.
In [51]:
# Example
# Find the angle of 1.0 using arcsinh:
import numpy as np
x = np.arcsinh(1.0)
print(x)
0.881373587019543
In [50]:
# Example
# Find the angle for all of the tanh values in the array:
import numpy as np
arr = np.array([0.1, 0.2, 0.5])  # sample tanh values (assumed)
# Find the angle for all of the tanh values in the array
x = np.arctanh(arr)
print(x)
What is a Set? A set in mathematics is a collection of unique elements.
Sets are used for operations involving frequent intersection, union and difference operations.
Create Sets in NumPy We can use NumPy's unique() method to find unique elements from any array.
E.g. create a set array, but remember that the set arrays should only be 1-D arrays.
In [49]:
# Example
# Convert the following array with repeated elements to a set:
import numpy as np
[1 2 3 4 5 6 7]
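A sketch of this example; the input array is an assumption, chosen so that the result matches the output shown above.

```python
import numpy as np

arr = np.array([1, 1, 1, 2, 3, 4, 5, 5, 6, 7])

# unique() drops repeats and returns the remaining values sorted
set_arr = np.unique(arr)
print(set_arr)
```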
Finding Union
To find the unique values of two arrays, use the union1d() method.
In [48]:
# Example
# Find union of the following two set arrays:
import numpy as np
[1 2 3 4 5 6]
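A sketch of this example; the two set arrays are assumptions, chosen so that the result matches the output shown above.

```python
import numpy as np

set1 = np.array([1, 2, 3, 4])
set2 = np.array([3, 4, 5, 6])

# Every value that appears in either array, once each, sorted
union = np.union1d(set1, set2)
print(union)
```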
Finding Intersection To find only the values that are present in both arrays, use the intersect1d()
method.
In [47]:
import numpy as np
[3 4]
Note: the intersect1d() method takes an optional argument assume_unique, which if set to True can
speed up computation. It should always be set to True when dealing with sets.
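A sketch of the intersection with assume_unique set. The two set arrays are assumptions, chosen so that the result matches the output shown above; both contain no repeated values, which is what assume_unique=True requires.

```python
import numpy as np

set1 = np.array([1, 2, 3, 4])
set2 = np.array([3, 4, 5, 6])

# Values present in both arrays
common = np.intersect1d(set1, set2, assume_unique=True)
print(common)
```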
Finding Difference To find only the values in the first set that are NOT present in the second set, use the
setdiff1d() method.
In [33]:
# Example
# Find the difference of set1 from set2:
import numpy as np
[1 2]
Note: the setdiff1d() method takes an optional argument assume_unique, which if set to True can
speed up computation. It should always be set to True when dealing with sets.
Finding Symmetric Difference To find only the values that are NOT present in BOTH sets, use the
setxor1d() method.
In [34]:
# Example
# Find the symmetric difference of set1 and set2:
import numpy as np
[1 2 5 6]
Note: the setxor1d() method takes an optional argument assume_unique, which if set to True can
speed up computation. It should always be set to True when dealing with sets.
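The difference and symmetric difference can be sketched on the same pair of set arrays. The inputs are assumptions, chosen so that the results match the outputs shown above.

```python
import numpy as np

set1 = np.array([1, 2, 3, 4])
set2 = np.array([3, 4, 5, 6])

# Values in set1 that are not in set2
difference = np.setdiff1d(set1, set2, assume_unique=True)
# Values that are in exactly one of the two sets
symmetric_difference = np.setxor1d(set1, set2, assume_unique=True)
print(difference)
print(symmetric_difference)
```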