NumPy Interview Questions

The document provides a comprehensive list of interview questions and answers related to NumPy, a Python library for numerical and scientific computing. Key topics include array creation, mathematical functions, data manipulation, and performance features of NumPy. It also covers advanced concepts like matrix inversion, random number generation, and handling outliers in data.

Uploaded by

dnaresh2323

NumPy Interview questions

(covers almost 70% of all interview questions on NumPy)

Q1. What is NumPy?

A:
NumPy stands for "Numerical Python." It's a popular Python library used for numerical
and scientific computing. NumPy helps you create and work with arrays and matrices,
and it provides many mathematical functions to perform operations on these arrays.
It's a key tool for data manipulation and analysis in Python and is also used by other
libraries in data science and machine learning.

Q2. How do I create a NumPy array?

A:
You can create NumPy arrays in several ways. Here are some common methods:

 Using np.array()

import numpy as np

array = np.array([1, 2, 3])

 Using np.zeros() (creates an array filled with zeros)

array = np.zeros((3, 4)) # 3 rows, 4 columns

 Using np.ones() (creates an array filled with ones)

array = np.ones((2, 3))

 Using np.full() (creates an array filled with a specific value)

array = np.full((2, 2), 7)

 Using np.arange() (creates an array with a range of numbers)

array = np.arange(0, 10, 2) # Start at 0, stop before 10, step by 2

 Using np.linspace() (creates an array with evenly spaced numbers)

array = np.linspace(0, 1, 5) # 5 numbers from 0 to 1

Q3. What are the main features of NumPy?

A:
NumPy has several important features:

 Arrays: Efficiently store and handle large amounts of data.

 Efficiency: Fast operations on large datasets.

 Mathematical Functions: A wide range of built-in math functions.

 Broadcasting: Perform operations on arrays of different shapes.


 Integration: Works well with other Python libraries.

 Multi-dimensional Arrays: Support for 2D, 3D, and higher-dimensional arrays.

 Indexing and Slicing: Easily access and modify parts of arrays.

 Memory Management: Efficient use of memory for large data.
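Broadcasting is the least self-explanatory feature in the list above, so here is a minimal sketch of it. The array names and values are made up for illustration.

```python
import numpy as np

# A (3, 1) column and a (4,) row broadcast to a common (3, 4) shape.
column = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3, 4])           # shape (4,)

table = column + row                   # shape (3, 4), no explicit loop
print(table.shape)
print(table)

# A scalar broadcasts against every element.
scaled = table * 2
```

NumPy compares shapes from the trailing dimension backwards; two dimensions are compatible when they are equal or when one of them is 1.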

Q4. How do you calculate the dot product of two NumPy arrays?

A:
You can calculate the dot product of two NumPy arrays using either the numpy.dot()
function or the @ operator.

 Using numpy.dot() function:

import numpy as np

a = np.array([1, 2])

b = np.array([3, 4])

dot_product = np.dot(a, b)

 Using the @ operator:

dot_product = a @ b

Both methods will give you the dot product as a single number.

Q5. How do I access elements in a NumPy array?

A:
You can access elements in a NumPy array using indexing and slicing:

 Indexing:
Access a specific element using its index (starting from 0).

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

first_element = arr[0] # 10

 Slicing:
Access a range of elements.

subset = arr[1:4] # [20, 30, 40]

 Boolean Indexing:
Access elements that meet a condition.

condition = arr > 25

selected = arr[condition] # [30, 40, 50]


Q6. What is the difference between a shallow copy and a deep copy in
NumPy?

A:
In NumPy, there are two ways to copy an array:

 Shallow Copy:

o Creates a new array that views the same data as the original.

o No data is actually duplicated.

o Changes in the original array will affect the shallow copy, and vice versa.

original = np.array([1, 2, 3])

shallow = original.view()

 Deep Copy:

o Creates a completely new and independent array.

o Data is duplicated in memory.

o Changes in the original array do not affect the deep copy, and vice versa.

original = np.array([1, 2, 3])

deep = original.copy()
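The difference becomes visible once the original array is modified. The following sketch reuses the names from the snippets above and uses np.shares_memory() to check whether two arrays overlap in memory.

```python
import numpy as np

original = np.array([1, 2, 3])
shallow = original.view()   # shares the data buffer with `original`
deep = original.copy()      # owns its own data buffer

original[0] = 99            # modify the original in place

print(shallow[0])           # the view sees the change
print(deep[0])              # the copy keeps the old value

# np.shares_memory reports whether two arrays overlap in memory.
print(np.shares_memory(original, shallow))
print(np.shares_memory(original, deep))
```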

Q7. How do you reshape a NumPy array?

A:
You can change the shape of a NumPy array using the reshape() method or the
np.reshape() function. This changes the dimensions of the array without changing its
data.

 Using the reshape() method:

import numpy as np

original = np.array([1, 2, 3, 4, 5, 6])

reshaped = original.reshape((2, 3)) # 2 rows, 3 columns

 Using the np.reshape() function:

reshaped = np.reshape(original, (3, 2))

In both cases, original is your existing array, and (rows, columns) is the new shape
you want. The new shape must hold the same total number of elements as the original.
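Two details of reshape() often come up in interviews: a dimension given as -1 is inferred from the element count, and a shape that does not fit raises an error. A small sketch with illustrative variable names:

```python
import numpy as np

original = np.arange(1, 7)           # [1, 2, 3, 4, 5, 6]

# -1 asks NumPy to infer that dimension from the total element count.
inferred = original.reshape(2, -1)   # becomes shape (2, 3)

# A shape whose element count does not match raises ValueError.
try:
    original.reshape(4, 2)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(inferred.shape, mismatch_raised)
```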

Q8. How to perform element-wise operations on NumPy arrays?

A:
You can perform element-wise operations (like addition, subtraction, multiplication,
division) on NumPy arrays using standard arithmetic operators. NumPy applies these
operations to each corresponding element.

import numpy as np

# Create two NumPy arrays

array1 = np.array([1, 2, 3, 4, 5])

array2 = np.array([6, 7, 8, 9, 10])

# Perform element-wise operations

result_add = array1 + array2 # [7, 9, 11, 13, 15]

result_subtract = array1 - array2 # [-5, -5, -5, -5, -5]

result_multiply = array1 * array2 # [6, 14, 24, 36, 50]

result_divide = array1 / array2 # [0.166..., 0.285..., 0.375, 0.444..., 0.5]

result_power = np.power(array1, 2) # [1, 4, 9, 16, 25]

Q9. Define the var function in NumPy.

A:
The numpy.var() function calculates the variance of the elements in a NumPy array.
Variance measures how much the numbers in the array are spread out.

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

variance = np.var(arr)

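Parameters:

• a: The input array.

• axis: The axis along which to compute the variance. If not specified, it calculates
the variance of the entire (flattened) array.

• dtype: The data type used to compute the variance. If not specified, integer input is
computed in float64, and floating-point input keeps its own data type.

• ddof: Delta degrees of freedom. The divisor is N - ddof, so the default ddof=0 gives
the population variance and ddof=1 gives the sample variance.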
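A short sketch of how the divisor and the axis argument change the result. The ddof argument is part of np.var(); the variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

population_var = np.var(arr)          # divides by N (default, ddof=0)
sample_var = np.var(arr, ddof=1)      # divides by N - 1

matrix = np.array([[1, 2], [3, 4]])
var_per_column = np.var(matrix, axis=0)

print(population_var, sample_var, var_per_column)
```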

Q10. Define the min and max functions in NumPy.

A:
NumPy provides np.min() and np.max() functions to find the smallest and largest
values in an array, respectively.

 np.min() Function:
Finds the minimum value.
import numpy as np

arr = np.array([3, 1, 4, 1, 5, 9])

minimum = np.min(arr) # 1

 np.max() Function:
Finds the maximum value.

maximum = np.max(arr) # 9

Parameters:

 a: The input array.

 axis: The axis along which to find the min or max. If not specified, it finds the
min or max of the entire array.

Q11. How to generate random numbers with NumPy?

A:
NumPy has several functions to generate random numbers:

• np.random.rand()
Generates random floats in the half-open interval [0, 1).

random_float = np.random.rand() # e.g., 0.5488135

• np.random.randint()
Generates random integers from low (inclusive) to high (exclusive).

random_integer = np.random.randint(1, 10) # an integer from 1 to 9, e.g., 7

• np.random.randn()
Generates random numbers from a standard normal distribution (mean=0,
std=1).

random_normal = np.random.randn() # e.g., -0.234153

• np.random.seed()
Sets the seed for the random number generator to ensure reproducibility.

np.random.seed(42)
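Current NumPy documentation recommends the Generator interface, created with np.random.default_rng(), for new code. The sketch below shows rough equivalents of the calls above; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

random_float = rng.random()              # float in [0.0, 1.0)
random_integer = rng.integers(1, 10)     # integer from 1 to 9 (10 is excluded)
random_normal = rng.standard_normal(3)   # three draws with mean 0 and std 1

# A second generator with the same seed reproduces the same stream.
same_seed = np.random.default_rng(seed=42)
print(same_seed.random() == random_float)
```

Passing a generator object around, instead of seeding global state, keeps the randomness of different parts of a program independent.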

Q12. What is the purpose of NumPy in Python?

A:
NumPy is a key library in Python for scientific computing and data analysis. Its main
purpose is to provide support for large and multi-dimensional arrays and matrices,
along with a collection of mathematical functions to operate on these arrays
efficiently.

Q13. How can you create a NumPy array from a Python list?
A:
You can convert a Python list to a NumPy array using the np.array() function.

import numpy as np

python_list = [1, 2, 3, 4, 5]

# Convert the Python list to a NumPy array

numpy_array = np.array(python_list)

Steps:

1. Create a Python list with the elements you want.

2. Use np.array() and pass the list as an argument to create a NumPy array.

Q14. How can you access elements in a NumPy array based on specific
conditions?

A:
You can use boolean indexing to select elements that meet certain conditions.

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Define a condition (e.g., elements greater than 3)

condition = arr > 3

# Use the condition to select elements

selected_elements = arr[condition] # [4, 5]

Steps:

1. Create or use an existing NumPy array.

2. Define a condition that returns True or False for each element.

3. Use this condition inside the array's indexing brackets to select the elements
that satisfy the condition.

Q15. What are some common data types supported by NumPy?

A:
NumPy supports various data types to control how data is stored and processed. Some
common data types include:

• int: Integer numbers (e.g., int32, int64).

• float: Floating-point numbers (e.g., float32, float64).

• complex: Complex numbers with real and imaginary parts (e.g., complex128).

• bool: Boolean values (True or False).

• object: References to arbitrary Python objects (can store any data type).

• datetime64: Dates and times.

Q16. How can you concatenate two NumPy arrays vertically?

A:
You can combine two NumPy arrays vertically (stack them on top of each other) using
np.vstack() or np.concatenate() with axis=0.

 Using np.vstack():

import numpy as np

array1 = np.array([[1, 2], [3, 4]])

array2 = np.array([[5, 6], [7, 8]])

combined = np.vstack((array1, array2))

# Result:

# [[1, 2],

# [3, 4],

# [5, 6],

# [7, 8]]

 Using np.concatenate() with axis=0:

combined = np.concatenate((array1, array2), axis=0)

Both methods will stack the arrays vertically.

Q17. What is the significance of the random module in NumPy?

A:
The random module in NumPy is important for generating random numbers and
performing random operations. Its main uses include:

• Random Number Generation: Creating random integers, floats, and other numbers.

• Random Sequences: Generating random sequences of numbers.

• Probability Distributions: Sampling from different statistical distributions like
normal, binomial, etc.

• Random Choices: Selecting random elements from arrays or lists.


This module is widely used in simulations, data analysis, and testing scenarios where
randomness is needed.

Q18. How can you generate random numbers following a normal distribution
using NumPy?

A:
You can generate random numbers that follow a normal (Gaussian) distribution using
the numpy.random.normal() function.

import numpy as np

# Generate random numbers from a normal distribution

mean = 0 # Mean of the distribution

std_dev = 1 # Standard deviation

size = 5 # Number of random numbers to generate

random_numbers = np.random.normal(mean, std_dev, size)

# Example output: [ 0.49671415 -0.1382643 0.64768854 1.52302986 -0.23415337]

Parameters:

 loc: Mean of the distribution.

 scale: Standard deviation of the distribution.

 size: Number of random numbers to generate.

Q19. What is Matrix Inversion in NumPy?

A:
Matrix inversion is the process of finding the inverse of a square matrix. If you multiply
a matrix by its inverse, you get the identity matrix.

In NumPy, you can invert a matrix using the numpy.linalg.inv() function.

import numpy as np

# Define a square matrix

A = np.array([[1, 2],

[3, 4]])

# Calculate the inverse of the matrix


A_inverse = np.linalg.inv(A)

# Verify the inversion

identity = A @ A_inverse # Should be close to [[1, 0], [0, 1]]

Note: Only square matrices (same number of rows and columns) that are non-singular
(nonzero determinant) can be inverted.
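Two practical points follow from the note above. When the goal is to solve A x = b, np.linalg.solve() avoids forming the inverse explicitly, and inverting a singular matrix raises np.linalg.LinAlgError. A minimal sketch with made-up matrices:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([5.0, 6.0])

x = np.linalg.solve(A, b)   # solves A @ x = b without computing inv(A)

# The second row is twice the first, so this matrix has no inverse.
singular = np.array([[1.0, 2.0], [2.0, 4.0]])
try:
    np.linalg.inv(singular)
    inversion_failed = False
except np.linalg.LinAlgError:
    inversion_failed = True

print(x, inversion_failed)
```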

Q20. Define the mean function in NumPy.

A:
The numpy.mean() function calculates the average (arithmetic mean) of the elements
in a NumPy array.

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Calculate the mean of the entire array

average = np.mean(arr) # 3.0

# Calculate the mean along a specific axis (for multi-dimensional arrays)

matrix = np.array([[1, 2], [3, 4]])

mean_axis0 = np.mean(matrix, axis=0) # [2.0, 3.0]

mean_axis1 = np.mean(matrix, axis=1) # [1.5, 3.5]

Parameters:

 a: The input array.

 axis: The axis along which to compute the mean. If not specified, it calculates
the mean of the entire array.

Q21. Write a NumPy code snippet to create an array of zeros.

A:
To create an array of zeros, use the numpy.zeros() function:

import numpy as np

# Create a 1D array of zeros with a specified length (e.g., 5)

zeros_1d = np.zeros(5)

print(zeros_1d)

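Q22. How can you identify outliers in a NumPy array?

A:
You can identify outliers in two steps:

1. Calculate descriptive statistics such as the mean and standard deviation (or the
quartiles).

2. Define a threshold: points that lie farther from the center than the threshold (for
example, more than 3 standard deviations from the mean) can be considered outliers.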
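The two steps can be sketched as follows. The data, the z-score cutoff of 2.0, and the 1.5 x IQR rule are illustrative assumptions; suitable thresholds depend on the dataset.

```python
import numpy as np

data = np.array([10, 12, 11, 13, 12, 11, 10, 95])

# Z-score rule: flag points far from the mean, measured in standard deviations.
z_threshold = 2.0  # assumed cutoff; 3.0 is also common
z_scores = (data - data.mean()) / data.std()
z_outliers = data[np.abs(z_scores) > z_threshold]

# IQR rule: flag points more than 1.5 * IQR outside the quartiles.
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
iqr_outliers = data[(data < q1 - 1.5 * iqr) | (data > q3 + 1.5 * iqr)]

print(z_outliers, iqr_outliers)
```

The IQR rule is less sensitive to the outliers themselves, because quartiles shift far less than the mean and standard deviation when an extreme value is present.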

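Q23. How do you remove missing or null values from a NumPy array?

A:
Missing values in a float array are usually stored as NaN. Use numpy.isnan() to create a
mask and filter them out:

import numpy as np

my_array = np.array([1.0, np.nan, 3.0])

mask = ~np.isnan(my_array)

filtered_array = my_array[mask]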

Q24. What is the difference between slicing and indexing in NumPy?

A:

• Slicing: Extracts a portion of the array using a start:stop:step range of indices.

• Indexing: Accesses specific elements directly using their indices.

Q25. How do you compute the Fourier transform of a signal using NumPy?

A:
Use the numpy.fft module to compute the Fourier transform:

import numpy as np

t = np.linspace(0, 1, 1000, endpoint=False) # Time vector: 1 second, 1000 samples

signal = np.sin(2 * np.pi * 5 * t) # 5 Hz sine wave

fft_result = np.fft.fft(signal)
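The raw FFT output is an array of complex numbers, so a common follow-up is to map it to frequencies. The sketch below uses np.fft.rfft() and np.fft.rfftfreq(), which suit real-valued signals, to recover the dominant frequency; the variable names are illustrative.

```python
import numpy as np

sample_count = 1000
t = np.linspace(0, 1, sample_count, endpoint=False)  # 1 second sampled at 1000 Hz
signal = np.sin(2 * np.pi * 5 * t)                   # 5 Hz sine wave

# rfft returns only the non-negative frequencies of a real-valued signal.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(sample_count, d=t[1] - t[0])

# The frequency bin with the largest magnitude is the dominant frequency.
dominant_freq = freqs[np.argmax(np.abs(spectrum))]
print(dominant_freq)
```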

Q26. How can you create an array with the same values?

A:
Use numpy.full(), or multiply an array of ones by a scalar:

import numpy as np

arr = np.full(5, 7)

arr_2d = 2.0 * np.ones((3, 4))

Q27. How can you modify the data type of a NumPy array?

A:
Use astype() to change the data type. It returns a new array and leaves the original
unchanged:

new_array = original_array.astype(float)

Q28. What is a masked array in NumPy?

A:
A masked array (numpy.ma.MaskedArray) is an array paired with a Boolean mask that marks
some elements as invalid. Masked elements are skipped in computations, which helps
handle datasets with missing values.

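Q29. What are some of the limitations of NumPy?

A:

• Homogeneous data types: every element of an array shares one dtype.

• Limited support for missing data and labels.

• Mostly single-threaded: element-wise operations run on one core, although linear
algebra routines may use several threads through the underlying BLAS/LAPACK library.

• No built-in GPU support: arrays live in CPU memory.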

Q30. How do you sort a NumPy array in ascending or descending order?

A:
Use numpy.sort() for ascending order:

sorted_array = np.sort(my_array)

For descending order, reverse the sorted array with [::-1], or reverse the indices
returned by numpy.argsort().
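Both descending-order approaches can be sketched as follows; the sample array is illustrative.

```python
import numpy as np

my_array = np.array([3, 1, 4, 1, 5, 9])

ascending = np.sort(my_array)
descending = np.sort(my_array)[::-1]   # reverse the ascending result

# argsort returns the indices that would sort the array;
# reversing them orders the positions from largest to smallest value.
order = np.argsort(my_array)[::-1]
descending_via_argsort = my_array[order]

print(ascending, descending, descending_via_argsort)
```

The argsort() form is useful when the same ordering has to be applied to a second, related array.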

Q31. How to use NumPy with Matplotlib?

A:
Use NumPy to create data and Matplotlib to plot:

import numpy as np

import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

y = np.sin(x)

plt.plot(x, y)

plt.show()

Q32. What is the use of diag() in a square matrix?

A:
Given a 2-D array, diag() extracts its diagonal elements as a 1-D array. Given a 1-D
array, it creates a 2-D matrix with those values on the diagonal and zeros elsewhere.
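A minimal sketch of both behaviours of np.diag(), including the optional k offset; the matrix is illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

main_diagonal = np.diag(matrix)         # 2-D input gives the 1-D main diagonal
upper_diagonal = np.diag(matrix, k=1)   # k=1 selects the diagonal above the main one
diagonal_matrix = np.diag([1, 2, 3])    # 1-D input gives a 2-D diagonal matrix

print(main_diagonal, upper_diagonal)
print(diagonal_matrix)
```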

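Q33. How are NumPy arrays better than Python lists?

A:

• Performance: NumPy arrays are faster for numerical work because operations run in
compiled code.

• Vectorization: NumPy operations apply to whole arrays element-wise, without explicit
Python loops.

• Memory Efficiency: Arrays store values of one type in a contiguous block, so they use
less memory than lists of Python objects.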

Q34. What is negative indexing in NumPy arrays?

A:
Negative indexing accesses elements from the end of the array, e.g., -1 accesses the
last element.
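A short sketch of negative indices on 1-D and 2-D arrays, with illustrative values:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

last = arr[-1]        # last element
last_two = arr[-2:]   # slice covering the final two elements

matrix = np.array([[1, 2, 3], [4, 5, 6]])
bottom_right = matrix[-1, -1]   # last row, last column

print(last, last_two, bottom_right)
```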

Q35. Can you create a plot in NumPy?

A:
Not with NumPy alone. NumPy handles the data, and plotting is done using Matplotlib:

import numpy as np

import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

y = np.sin(x)

plt.plot(x, y)

plt.show()

Q36. Discuss uses of vstack() and hstack() functions.

A:

• vstack() stacks arrays vertically (top to bottom).

• hstack() stacks arrays horizontally (side by side).
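The resulting shapes make the difference concrete. A small sketch with illustrative arrays:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

stacked_rows = np.vstack((a, b))   # each 1-D array becomes a row: shape (2, 3)
joined = np.hstack((a, b))         # 1-D arrays are joined end to end: shape (6,)

left = np.array([[1], [2]])
right = np.array([[3], [4]])
side_by_side = np.hstack((left, right))   # 2-D arrays gain columns: shape (2, 2)

print(stacked_rows.shape, joined.shape, side_by_side.shape)
```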

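Q37. How does NumPy handle numerical exceptions?

A:
For floating-point arrays, NumPy handles numerical exceptions (e.g., division by zero)
by issuing a RuntimeWarning and returning inf, -inf, or nan instead of raising an
exception. This behavior can be changed with np.seterr() or the np.errstate() context
manager.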
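The np.errstate() context manager controls this behaviour for a block of code. The sketch below first silences the warnings and then asks NumPy to raise instead; the array names are illustrative.

```python
import numpy as np

numerator = np.array([1.0, -1.0, 0.0])
denominator = np.zeros(3)

# Silence the warnings locally and inspect the special values instead.
with np.errstate(divide="ignore", invalid="ignore"):
    quotient = numerator / denominator   # inf, -inf, nan

# Ask NumPy to raise an exception instead of warning.
try:
    with np.errstate(divide="raise"):
        np.array([1.0]) / np.array([0.0])
    raised = False
except FloatingPointError:
    raised = True

print(quotient, raised)
```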

Q38. What is the significance of the random module in NumPy?

A:
The random module in NumPy is used for generating random numbers for simulations,
testing, or initializing algorithms (like machine learning).

Q39. How to get the eigenvalues of a matrix?

A:
Use numpy.linalg.eigvals() to compute the eigenvalues:

eigenvalues = np.linalg.eigvals(matrix)
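When the eigenvectors are needed as well, np.linalg.eig() returns both. The sketch below uses a made-up symmetric matrix and checks the defining relation A v = lambda v.

```python
import numpy as np

matrix = np.array([[2.0, 1.0],
                   [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(matrix)

# Column j of `eigenvectors` belongs to eigenvalues[j], so
# matrix @ v equals eigenvalue * v for every column v.
relation_holds = np.allclose(matrix @ eigenvectors, eigenvectors * eigenvalues)

print(np.sort(eigenvalues), relation_holds)
```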

Q40. How to calculate the determinant of a matrix using NumPy?

A:
Use numpy.linalg.det() to compute the determinant:

determinant = np.linalg.det(matrix)

Q41. How do you find a matrix or vector norm using NumPy?


A:
You can find the norm (a measure of the size) of a matrix or vector using the
numpy.linalg.norm() function.

import numpy as np

# For a vector

vector = np.array([1, 2, 3])

vector_norm = np.linalg.norm(vector)

print("Vector Norm:", vector_norm)

# For a matrix

matrix = np.array([[1, 2], [3, 4]])

matrix_norm = np.linalg.norm(matrix)

print("Matrix Norm:", matrix_norm)

Parameters:

• x: The input vector or matrix.

• ord: Specifies the type of norm (e.g., 1 for the L1 norm, 2 for the L2 norm). If not
specified, the default is the L2 norm for vectors and the Frobenius norm for matrices.

• axis: Specifies the axis along which to compute the norm. If not specified, the
norm is computed over the entire array.
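The effect of ord and axis is easiest to see on small inputs. A sketch with illustrative values:

```python
import numpy as np

vector = np.array([3.0, -4.0])

l1 = np.linalg.norm(vector, ord=1)              # |3| + |-4|
l2 = np.linalg.norm(vector)                     # sqrt(3**2 + 4**2)
max_norm = np.linalg.norm(vector, ord=np.inf)   # largest absolute value

matrix = np.array([[1.0, 2.0], [3.0, 4.0]])
frobenius = np.linalg.norm(matrix)              # default for 2-D input
row_norms = np.linalg.norm(matrix, axis=1)      # one L2 norm per row

print(l1, l2, max_norm, frobenius, row_norms)
```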

Q42. How do you compare two NumPy arrays?

A:
You can compare two NumPy arrays using either the == operator combined with .all(),
or the numpy.array_equal() function.

Method 1: Using == and .all()

import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([1, 2, 3])

comparison = (arr1 == arr2).all()


print("Are arrays equal?", comparison)

Method 2: Using numpy.array_equal()

import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([1, 2, 4])

comparison = np.array_equal(arr1, arr2)

print("Are arrays equal?", comparison)

• == Operator: Compares each element and returns a new array of boolean values.

• .all() Method: Checks if all elements in the boolean array are True.

• numpy.array_equal(): Directly checks if both arrays have the same shape and
elements.
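Exact comparison can be misleading for floating-point data, where rounding makes mathematically equal values differ slightly. In that case np.allclose() compares within a tolerance. A minimal sketch with illustrative values:

```python
import numpy as np

a = np.array([0.1 + 0.2, 1.0])
b = np.array([0.3, 1.0])

exact = np.array_equal(a, b)   # False: 0.1 + 0.2 is not exactly 0.3
close = np.allclose(a, b)      # True within the default tolerances

# Arrays with different shapes are never equal under array_equal.
different_shape = np.array_equal(np.array([1, 2, 3]), np.array([1, 2]))

print(exact, close, different_shape)
```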

Q43. How do you calculate the QR decomposition of a given matrix using
NumPy?

A:
QR decomposition breaks down a matrix A into two matrices: Q (orthogonal) and R
(upper triangular). Use numpy.linalg.qr() to perform QR decomposition.

import numpy as np

# Define a matrix

A = np.array([[1, 2], [3, 4]])

# Perform QR decomposition

Q, R = np.linalg.qr(A)

print("Q Matrix:\n", Q)

print("R Matrix:\n", R)

Parameters:

• a: The input matrix you want to decompose.

• mode: Optional parameter that selects which matrices are returned. The default mode
is 'reduced'.

Result:

• Q: A matrix with orthonormal columns (an orthogonal matrix when A is square).

• R: An upper triangular matrix.
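The stated properties of Q and R can be checked numerically. The sketch below verifies them for the same 2x2 matrix; the boolean variable names are illustrative.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
Q, R = np.linalg.qr(A)

reconstructs_a = np.allclose(Q @ R, A)             # Q @ R gives back A
q_is_orthogonal = np.allclose(Q.T @ Q, np.eye(2))  # columns of Q are orthonormal
r_is_upper = np.allclose(R, np.triu(R))            # nothing below the diagonal

print(reconstructs_a, q_is_orthogonal, r_is_upper)
```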

Q44. How do you filter out integers from a float NumPy array?

A:
A NumPy float array stores every element as a float, so "integers" here means
integer-valued elements such as 3.0. You can filter them out using methods like
astype(int), numpy.mod(), numpy.isclose(), or numpy.round().

Method 1: Using astype(int)

import numpy as np

arr = np.array([1.0, 2.5, 3.0, 4.75, 5.0])

filtered = arr[arr.astype(int) != arr]

print("Filtered Array:", filtered)

Method 2: Using numpy.mod()

import numpy as np

arr = np.array([1.0, 2.5, 3.0, 4.75, 5.0])

filtered = arr[np.mod(arr, 1) != 0]

print("Filtered Array:", filtered)

Method 3: Using numpy.isclose()

import numpy as np

arr = np.array([1.0, 2.5, 3.0, 4.75, 5.0])

filtered = arr[~np.isclose(arr, arr.astype(int))]

print("Filtered Array:", filtered)

Method 4: Using numpy.round()

import numpy as np

arr = np.array([1.0, 2.5, 3.0, 4.75, 5.0])

filtered = arr[arr != np.round(arr)]

print("Filtered Array:", filtered)


Q45. How do you define a polynomial function in NumPy?

A:
You can define a polynomial using the numpy.poly1d class by providing the
coefficients of the polynomial.

import numpy as np

# Define a polynomial 2x^2 + 3x + 1

coefficients = [2, 3, 1]

poly = np.poly1d(coefficients)

print("Polynomial Function:\n", poly)

print("Value at x=2:", poly(2))

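Parameters:

• c_or_r: The coefficients of the polynomial, highest degree first (or its roots when
r=True).

• r: (Optional) Boolean, False by default. If True, c_or_r is interpreted as the roots
of the polynomial instead of its coefficients.

• variable: (Optional) Variable name used when printing (default is x).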
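A poly1d object also exposes its roots and derivative, and it can be built from roots by passing r=True. A short sketch with illustrative names:

```python
import numpy as np

poly = np.poly1d([2, 3, 1])     # 2x^2 + 3x + 1

value_at_2 = poly(2)            # 2*4 + 3*2 + 1
roots = poly.roots              # solutions of 2x^2 + 3x + 1 = 0
derivative = poly.deriv()       # 4x + 3

# r=True means the first argument lists roots, not coefficients.
from_roots = np.poly1d([1, 2], r=True)   # (x - 1)(x - 2) = x^2 - 3x + 2

print(value_at_2, roots, derivative.coeffs, from_roots.coeffs)
```

The NumPy documentation describes poly1d as part of the older polynomial API and points new code to the numpy.polynomial package.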

Q46. What are ndarrays in NumPy?

A:
An ndarray (N-dimensional array) is the main data structure in NumPy used to store
and manipulate numerical data. It can have multiple dimensions (1D, 2D, 3D, etc.)
and supports efficient mathematical operations.

import numpy as np

# Create a 2D ndarray

array_2d = np.array([[1, 2, 3], [4, 5, 6]])

print("2D ndarray:\n", array_2d)

Features:

 Multidimensional: Supports arrays with multiple dimensions.

 Homogeneous: All elements must be of the same data type.

 Efficient: Optimized for performance in numerical computations.

Q47. What are the main features that make NumPy unique?
A:
NumPy stands out due to several key features:

1. ndarray (N-dimensional array): Efficient storage and manipulation of large datasets.

2. Performance: Optimized for fast numerical computations.

3. Vectorization: Allows operations on entire arrays without explicit loops.

4. Broadcasting: Enables arithmetic operations on arrays of different shapes.

5. Mathematical Functions: Provides a wide range of built-in mathematical functions.

6. Integration: Works seamlessly with other Python libraries like Pandas, Matplotlib,
and SciPy.

7. Memory Efficiency: Uses less memory compared to standard Python lists.

Q48. What is the difference between the shape and size attributes of a
NumPy array?

A:

 shape:

o Definition: A tuple representing the dimensions of the array.

o Example: A 2x3 array has shape = (2, 3).

o Usage: Helps understand the structure of the array.

 size:

o Definition: Total number of elements in the array.

o Example: A 2x3 array has size = 6.

o Usage: Useful for knowing how much data is stored.

Example:

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print("Shape:", arr.shape) # Output: (2, 3)

print("Size:", arr.size) # Output: 6

Q49. What are some important differences between standard Python
sequences and NumPy arrays?

A:
NumPy Arrays:

 Homogeneous: All elements must be of the same type.

 Fixed Size: Once created, the size cannot change.

 Efficient: Optimized for numerical operations and performance.

 Vectorized Operations: Supports element-wise operations without loops.

Python Sequences (Lists, Tuples):

 Heterogeneous: Can contain elements of different types.

 Dynamic Size: Can grow or shrink as needed.

 Less Efficient: Not optimized for large numerical computations.

 No Built-in Vectorization: Requires loops for element-wise operations.

Example:

import numpy as np

# NumPy array

np_arr = np.array([1, 2, 3])

print("NumPy Array:", np_arr)

# Python list

py_list = [1, 2, 3]

print("Python List:", py_list)

Q50. What are Universal Functions (ufuncs) in NumPy?

A:
Universal functions, or ufuncs, are fast, element-wise functions provided by NumPy
that operate on ndarray objects. They support a wide range of mathematical
operations like addition, multiplication, trigonometric functions, etc., and are
optimized for performance.

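Examples of ufuncs:

• Arithmetic Operations: np.add(), np.subtract(), np.multiply(), np.divide()

• Trigonometric Functions: np.sin(), np.cos(), np.tan()

• Exponential and Logarithmic Functions: np.exp(), np.log()

• Comparison Functions: np.greater(), np.maximum()

Note that statistical functions such as np.mean() and np.std() are regular NumPy
functions that reduce an array, not ufuncs.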

Example:
import numpy as np

arr = np.array([1, 2, 3, 4])

# Apply a ufunc (square each element)

squared = np.square(arr)

print("Squared Array:", squared)

# Apply trigonometric ufunc

sine = np.sin(arr)

print("Sine of Array:", sine)
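Beyond element-wise calls, every ufunc is an instance of np.ufunc and carries methods such as reduce(), accumulate(), and outer(). A minimal sketch with illustrative names:

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

# np.add is a ufunc object, so it has ufunc methods attached.
add_is_ufunc = isinstance(np.add, np.ufunc)

total = np.add.reduce(arr)               # 1 + 2 + 3 + 4
running_total = np.add.accumulate(arr)   # cumulative sums
products = np.multiply.outer(arr, arr)   # 4 x 4 multiplication table

print(add_is_ufunc, total, running_total)
print(products)
```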

Q51. What is the difference between ndarray and array in NumPy?

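A:

• ndarray:

o Definition: numpy.ndarray is the class of NumPy's N-dimensional array object.

o Usage: Every NumPy array is an instance of this class. Calling np.ndarray()
directly is a low-level operation that allocates an uninitialized array, so it is
rarely used.

• array:

o Definition: numpy.array() is a function that builds an ndarray from existing data
such as a list or tuple.

o Usage: The usual way to create arrays. Informally, "array" is also used in
documentation and conversation to mean ndarray.

Key Points:

• np.array() returns an ndarray, so both terms describe the same data structure.

• ndarray is the type, while array is the function that creates it.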

Example:

import numpy as np

# np.array() builds an ndarray from a list

nd_arr = np.array([1, 2, 3])

print("ndarray:", nd_arr)

# Any array created this way is an instance of numpy.ndarray

arr = np.array([4, 5, 6])

print("Array:", arr)

Q52. How would you convert a pandas DataFrame into a NumPy array?

A:
You can convert a Pandas DataFrame to a NumPy array using the .values attribute or
the .to_numpy() method. The Pandas documentation recommends .to_numpy().

Using .values:

import pandas as pd

import numpy as np

# Create a Pandas DataFrame

data = {'A': [1, 2, 3], 'B': [4, 5, 6]}

df = pd.DataFrame(data)

# Convert to NumPy array

numpy_array = df.values

print("NumPy Array:\n", numpy_array)

Using .to_numpy():

import pandas as pd

import numpy as np

# Create a Pandas DataFrame

data = {'A': [7, 8, 9], 'B': [10, 11, 12]}

df = pd.DataFrame(data)

# Convert to NumPy array

numpy_array = df.to_numpy()

print("NumPy Array:\n", numpy_array)

Q53. Explain vectorization in NumPy.

A:
Vectorization in NumPy refers to performing operations on entire arrays or large
sections of arrays without using explicit Python loops. This allows for faster and more
efficient computations by leveraging optimized C and Fortran code under the hood.

Benefits:

 Speed: Vectorized operations are much faster than using loops.

 Code Simplicity: Leads to cleaner and more readable code.

 Efficiency: Takes advantage of low-level optimizations.

Example Without Vectorization (Using Loops):

import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

result = np.empty_like(arr1)

for i in range(len(arr1)):

    result[i] = arr1[i] + arr2[i]

print("Result:", result)

Example With Vectorization:

import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

# Vectorized addition

result = arr1 + arr2

print("Result:", result)

Q54. How would you reverse a NumPy array?

A:
You can reverse a NumPy array using slicing with a step of -1.

import numpy as np

# Create a NumPy array


original_array = np.array([1, 2, 3, 4, 5])

# Reverse the array

reversed_array = original_array[::-1]

print("Reversed Array:", reversed_array)

Output:

Reversed Array: [5 4 3 2 1]

Q55. How do you remove missing or null values from a NumPy array?

A:
Although NumPy doesn't natively support missing values like Pandas, you can handle
them using masked arrays or by filtering with boolean masks.

Method 1: Using Masked Arrays

import numpy as np

# Create an array with NaN values

arr = np.array([1.0, 2.0, np.nan, 4.0, 5.0])

# Create a masked array where NaN values are masked

masked_arr = np.ma.masked_invalid(arr)

# Get only the non-masked (valid) data

clean_data = masked_arr.compressed()

print("Clean Data:", clean_data)

Method 2: Using Boolean Mask

import numpy as np

# Create an array with NaN values

arr = np.array([1.0, 2.0, np.nan, 4.0, 5.0])


# Create a mask for non-NaN values

mask = ~np.isnan(arr)

# Filter out NaN values

filtered_array = arr[mask]

print("Filtered Array:", filtered_array)

Q56. What is the difference between slicing and indexing in NumPy?

A:

• Indexing:

o Definition: Accesses specific elements or subsets using exact positions.

o Usage: Retrieve or modify individual elements or specific parts of the array.

o Example:

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

element = arr[2] # 30

• Slicing:

o Definition: Extracts a range of elements using start:stop:step indices (stop is
excluded).

o Usage: Select a contiguous or evenly strided subset. A basic slice is a view that
shares memory with the original array, not an independent copy.

o Example:

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

subset = arr[1:4] # [20, 30, 40]

Key Differences:

• Precision: Indexing selects exact positions, while slicing selects a range.

• Result: Indexing with a single integer returns one element; slicing returns an
array that is a view of the selected range.
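One practical consequence is worth a sketch: a basic slice shares memory with the original array, whereas indexing with a list of integers produces a copy. The variable names are illustrative.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

sliced = arr[1:4]         # basic slice: shares memory with arr
picked = arr[[1, 2, 3]]   # integer-array indexing: independent copy

sliced[0] = -1            # writes through to arr[1]
picked[1] = -2            # leaves arr untouched

print(arr)
print(np.shares_memory(arr, sliced), np.shares_memory(arr, picked))
```

Call .copy() on a slice when later changes must not affect the original array.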

Q57. How do you create a masked array in NumPy, and what is its purpose?

A:
A masked array in NumPy allows you to mark certain elements as invalid or
"masked," which can be ignored in computations. This is useful when dealing with
data that has missing or corrupt values.

Creating a Masked Array:

import numpy as np

# Create a regular array with some invalid values (e.g., -999)

data = np.array([1, 2, -999, 4, -999, 6])

# Create a masked array where -999 is considered invalid

masked_data = np.ma.masked_where(data == -999, data)

print("Masked Array:", masked_data)

Purpose:

• Handle Missing Data: Easily ignore or exclude invalid/missing entries in calculations.

• Maintain Data Integrity: Perform operations without altering the original data.

Example of Using Masked Array:

import numpy as np

# Define masked array

masked_data = np.ma.masked_array([1, 2, -999, 4, -999, 6],
                                 mask=[False, False, True, False, True, False])

# Calculate the mean, ignoring masked values

mean = masked_data.mean()

print("Mean (ignoring masked values):", mean)

Q58. What are some common techniques for normalizing data in a NumPy
array?

A:
Normalization scales data to a standard range, which often improves the performance of
machine learning algorithms. Common normalization techniques include:

1. Min-Max Scaling:

o Description: Scales data to a range between 0 and 1.

o Formula: (x - min) / (max - min)

o Example:

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

normalized = (arr - arr.min()) / (arr.max() - arr.min())

print("Min-Max Normalized:", normalized)

2. Z-Score Normalization (Standardization):

o Description: Rescales data to have a mean of 0 and a standard deviation of 1.

o Formula: (x - mean) / std

o Example:

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

standardized = (arr - arr.mean()) / arr.std()

print("Z-Score Standardized:", standardized)

3. Log Transformation:

o Description: Applies the logarithm to compress the range of data. Requires strictly
positive values.

o Example:

import numpy as np

arr = np.array([1, 10, 100, 1000, 10000])

log_transformed = np.log(arr)

print("Log Transformed:", log_transformed)

4. Box-Cox Transformation:

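o Description: A power transformation that makes skewed, non-normal data more closely
resemble a normal distribution.

o Note: Requires all data to be positive. NumPy has no built-in Box-Cox function;
SciPy provides scipy.stats.boxcox().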

5. Robust Scaling:

o Description: Uses statistics that are robust to outliers (e.g., median and
interquartile range).

o Example:

import numpy as np

arr = np.array([1, 2, 3, 4, 100])


median = np.median(arr)

iqr = np.percentile(arr, 75) - np.percentile(arr, 25)

robust_scaled = (arr - median) / iqr

print("Robust Scaled:", robust_scaled)

Q59. How do you remove missing or null values from a NumPy array?

A:
Since NumPy doesn't natively support missing values like Pandas, you can handle
them by either filtering them out or replacing them with suitable values.

Method 1: Replacing Missing Values

import numpy as np

# Create an array with missing values represented by -999

arr = np.array([1, 2, -999, 4, -999, 6])

# Replace -999 with a specific value, e.g., 0

arr[arr == -999] = 0

print("Array after replacement:", arr)

Method 2: Filtering Out Missing Values Using a Mask

import numpy as np

# Create an array with NaN values

arr = np.array([1.0, 2.0, np.nan, 4.0, 5.0])

# Create a mask for non-NaN values

mask = ~np.isnan(arr)

# Filter out NaN values

filtered_array = arr[mask]

print("Filtered Array:", filtered_array)


Q60. How do you create two 2-D arrays and plot them using Matplotlib?

A:
You can create two 2-D NumPy arrays and visualize them using Matplotlib's plotting
functions.

Example: Creating and Plotting 2-D Arrays

import numpy as np

import matplotlib.pyplot as plt

# Create two 2-D arrays

array1 = np.array([[34, 43, 73],

[82, 22, 12],

[53, 94, 66]])

array2 = np.array([[10, 10, 10],

[20, 20, 20],

[30, 30, 30]])

# Plotting the arrays

plt.figure(figsize=(8, 6))

# Plot array1 as a heatmap

plt.subplot(1, 2, 1)

plt.imshow(array1, cmap='viridis', aspect='auto')

plt.title('Array 1 Heatmap')

plt.colorbar()

# Plot array2 as a heatmap

plt.subplot(1, 2, 2)

plt.imshow(array2, cmap='plasma', aspect='auto')

plt.title('Array 2 Heatmap')

plt.colorbar()

plt.tight_layout()
plt.show()

Explanation:

1. Creating Arrays:

o array1 and array2 are defined as 2-D NumPy arrays.

2. Plotting:

o plt.imshow(): Displays the array as an image (heatmap).

o cmap: Specifies the color map.

o aspect='auto': Adjusts the aspect ratio.

o plt.colorbar(): Adds a color bar to indicate the scale.

o plt.subplot(): Creates subplots to display multiple plots side by side.

o plt.tight_layout(): Adjusts subplot parameters for a neat layout.

o plt.show(): Displays the plots.

Output: Two heatmaps representing array1 and array2 side by side.

Q61. What is the difference between NumPy and Pandas?

A:
NumPy and Pandas are both essential Python libraries used for data manipulation
and analysis, but they serve different purposes.

 NumPy (Numerical Python):

o Purpose: Primarily used for numerical and scientific computing.

o Data Structure: Provides the ndarray, a powerful N-dimensional array for
storing homogeneous data (all elements are the same type).

o Features: Efficient operations on large arrays, mathematical functions,
and support for multi-dimensional data.

 Pandas:

o Purpose: Designed for data analysis and manipulation.

o Data Structures: Introduces Series (1D) and DataFrame (2D), which can
handle heterogeneous data (different data types in each column).

o Features: Easy handling of missing data, data alignment, merging/joining
datasets, and powerful data aggregation and transformation tools.

Example:

import numpy as np

import pandas as pd

# NumPy array
np_array = np.array([1, 2, 3, 4, 5])

print("NumPy Array:", np_array)

# Pandas DataFrame

data = {'A': [1, 2, 3], 'B': [4, 5, 6]}

df = pd.DataFrame(data)

print("\nPandas DataFrame:\n", df)
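The "homogeneous data" point can be seen directly: when values of different types go into one array, NumPy converts them all to a single common dtype. A small sketch using only NumPy (the variable names are illustrative):

```python
import numpy as np

# Mixing ints and floats: NumPy upcasts everything to one float dtype
mixed_numbers = np.array([1, 2.5, 3])
print(mixed_numbers.dtype)

# Mixing numbers and strings: every element is stored as a string
mixed_types = np.array([1, "two", 3.0])
print(mixed_types.dtype.kind)  # kind 'U' denotes a Unicode string dtype
```

A DataFrame avoids this by giving each column its own dtype.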

Q62. Why is NumPy faster than Python lists?

A:
NumPy is faster than Python lists for several reasons:

1. Uniform Data Types:

o NumPy: Stores elements of the same data type, allowing for optimized
memory usage and faster computations.

o Lists: Can store elements of different data types, which requires more
memory and makes access slower.

2. Optimized C Implementation:

o NumPy: Its core is implemented in C, and linear algebra routines are delegated
to BLAS/LAPACK libraries, so operations run in compiled code, which is much
faster than Python's interpreted loops.

3. Vectorized Operations:

o NumPy: Supports vectorized operations, allowing you to perform batch
operations on entire arrays without explicit Python loops.

o Lists: Require explicit loops for similar operations, which are slower.

4. Memory Efficiency:

o NumPy: Uses contiguous memory blocks, improving cache locality and
access speed.

o Lists: Use pointers to objects, which are scattered in memory, leading to
slower access times.

Example:

import numpy as np

import time

# NumPy array

np_array = np.arange(1000000)
# Python list

py_list = list(range(1000000))

# NumPy addition

start_time = time.time()

np_result = np_array + 1

print("NumPy Time:", time.time() - start_time, "seconds")

# Python list addition using list comprehension

start_time = time.time()

py_result = [x + 1 for x in py_list]

print("Python List Time:", time.time() - start_time, "seconds")

Output (illustrative; exact timings depend on hardware and NumPy version):

NumPy Time: 0.025 seconds

Python List Time: 0.25 seconds
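A single time.time() measurement can be noisy. As a hedged alternative, the sketch below uses the standard-library timeit module, which repeats each statement, and also compares memory use. The array length n and the repeat count of 20 are arbitrary choices made so the example runs quickly.

```python
import sys
import timeit

import numpy as np

n = 100_000  # smaller than the example above so the benchmark runs quickly
np_array = np.arange(n)
py_list = list(range(n))

# timeit runs each callable `number` times and returns the total elapsed time
numpy_time = timeit.timeit(lambda: np_array + 1, number=20)
list_time = timeit.timeit(lambda: [x + 1 for x in py_list], number=20)
print("NumPy:", numpy_time, "seconds for 20 runs")
print("List: ", list_time, "seconds for 20 runs")

# Memory: the array stores raw values; the list stores pointers to int objects
print("Array data bytes:", np_array.nbytes)
print("List container bytes (excluding the int objects):", sys.getsizeof(py_list))
```

No timings are quoted here because they vary from machine to machine.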

Q63. How do you check for an empty (zero-element) array in NumPy?

A:
To check if a NumPy array is empty (has zero elements), you can use the .size
attribute. If size is 0, the array is empty.

Example:

import numpy as np

# Create an empty array with shape (1, 0)

empty_array = np.zeros((1, 0))

print("Empty Array:\n", empty_array)

print("Size:", empty_array.size) # Output: 0

# Check if the array is empty

if empty_array.size == 0:
    print("The array is empty.")
else:
    print("The array is not empty.")

Output:


Empty Array:

[]

Size: 0

The array is empty.
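The reason to prefer .size over len() is that len() only reports the length of the first axis. A short sketch with the same (1, 0) shape used above:

```python
import numpy as np

arr = np.zeros((1, 0))  # one row, zero columns

print(len(arr))   # length of the first axis only
print(arr.size)   # total number of elements across all axes
```

Here len(arr) is 1 even though the array holds no elements, so a len()-based check would wrongly report it as non-empty.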

Q64. How do you count the number of times a given value appears in an
array of integers in NumPy?

A:
You can use the numpy.bincount() function to count the occurrences of each
non-negative integer in an array; the count for a given value v is then counts[v].
Note that bincount() only works with non-negative integers.

Example:

import numpy as np

# Create an array of integers

arr = np.array([0, 5, 4, 0, 4, 4, 3, 0, 0, 5, 2, 1, 1, 9])

# Count the occurrences of each integer

counts = np.bincount(arr)

print("Counts of each integer:", counts)

Output:


Counts of each integer: [4 2 1 1 3 2 0 0 0 1]

Explanation:

 0 appears 4 times.

 1 appears 2 times.

 2 appears 1 time.

 3 appears 1 time.
 4 appears 3 times.

 5 appears 2 times.

 9 appears 1 time.
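If only one value is of interest, or the array contains negative integers, two other standard NumPy calls may fit better. This sketch reuses the array above; the second array, signed, is made up for illustration.

```python
import numpy as np

arr = np.array([0, 5, 4, 0, 4, 4, 3, 0, 0, 5, 2, 1, 1, 9])

# Count one specific value by comparing and counting the True entries
count_of_4 = np.count_nonzero(arr == 4)
print("Occurrences of 4:", count_of_4)

# np.unique handles negative integers, which np.bincount rejects
signed = np.array([-2, -2, 0, 3, 3, 3])
values, counts = np.unique(signed, return_counts=True)
print(dict(zip(values.tolist(), counts.tolist())))
```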

Q65. How can you sort an array in NumPy?

A:
You can sort a NumPy array using the .sort() method or the numpy.sort() function.

 In-place Sorting with .sort():

o Sorts the array itself and returns None.

 Creating a Sorted Copy with numpy.sort():

o Returns a new sorted array, leaving the original array unchanged.

Example:

import numpy as np

# Create an unsorted array

arr = np.array([3, 2, 1, 5, 4])

# In-place sort

arr.sort()

print("Sorted Array (in-place):", arr)

# Create a new sorted array without changing the original

original = np.array([10, 7, 8, 9, 1])

sorted_copy = np.sort(original)

print("Original Array:", original)

print("Sorted Copy:", sorted_copy)

Output:


Sorted Array (in-place): [1 2 3 4 5]

Original Array: [10  7  8  9  1]

Sorted Copy: [ 1  7  8  9 10]

Sorting in Descending Order:


To sort in descending order, you can sort the array and then reverse it.

import numpy as np

# Create an array

arr = np.array([3, 1, 4, 2, 5])

# Sort in ascending order and then reverse

sorted_desc = np.sort(arr)[::-1]

print("Sorted in Descending Order:", sorted_desc)

Output:


Sorted in Descending Order: [5 4 3 2 1]
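Two related tools are often asked about alongside sorting: np.argsort(), which returns the ordering as indices so a second array can be reordered the same way, and the axis argument of np.sort() for 2-D arrays. A minimal sketch; the scores and names arrays are hypothetical sample data.

```python
import numpy as np

scores = np.array([72, 95, 88])
names = np.array(["Ana", "Ben", "Cy"])  # hypothetical labels for illustration

# argsort returns the indices that would sort the array
order = np.argsort(scores)
print("Sorted scores:", scores[order])
print("Names in the same order:", names[order])

# Sort each row of a 2-D array independently (axis=1)
matrix = np.array([[3, 1, 2],
                   [9, 7, 8]])
sorted_rows = np.sort(matrix, axis=1)
print(sorted_rows)
```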

Q66. How can you find the maximum or minimum value of an array in
NumPy?

A:
You can use numpy.max() and numpy.min() functions to find the maximum and
minimum values in a NumPy array, respectively.

Example:

import numpy as np

# Create an array

arr = np.array([3, 2, 1, 5, 4])

# Find the maximum value

max_value = np.max(arr)

print("Maximum Value:", max_value) # Output: 5

# Find the minimum value

min_value = np.min(arr)

print("Minimum Value:", min_value) # Output: 1

Finding Max/Min Along an Axis:


For multi-dimensional arrays, you can specify the axis along which to find the max or
min.

import numpy as np

# Create a 2D array

matrix = np.array([[3, 2, 1],
                   [5, 4, 6]])

# Find the maximum value in each column (axis=0)

max_cols = np.max(matrix, axis=0)

print("Max of each column:", max_cols) # Output: [5 4 6]

# Find the minimum value in each row (axis=1)

min_rows = np.min(matrix, axis=1)

print("Min of each row:", min_rows) # Output: [1 4]
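To find where the maximum or minimum sits rather than its value, np.argmax() and np.argmin() can be used. The sketch below reuses the 2D matrix from the example above and converts the flat index back to row and column coordinates with np.unravel_index().

```python
import numpy as np

matrix = np.array([[3, 2, 1],
                   [5, 4, 6]])

# argmax returns the position of the maximum in the flattened array
flat_index = np.argmax(matrix)

# unravel_index converts that position back to (row, column) coordinates
row, col = np.unravel_index(flat_index, matrix.shape)
print("Max value", matrix[row, col], "is at row", row, "column", col)

# Column index of the minimum within each row
argmin_rows = np.argmin(matrix, axis=1)
print("Argmin per row:", argmin_rows)
```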

Q67. How can slicing and indexing be used for data cleaning in NumPy?

A:
Slicing and indexing are powerful techniques in NumPy that help in cleaning data by
allowing you to access, modify, and filter specific parts of an array based on certain
conditions.

Example Scenario:

 Problem: You have a dataset with some negative values that are invalid and
you want to replace them with zeros. Additionally, you want to extract all values
greater than 2 for further analysis.

Step-by-Step Solution:

1. Replace Negative Values with Zeros (Indexing):

import numpy as np

# Sample NumPy array with negative values

data = np.array([1, 2, -1, 4, 5, -2, 7])

# Use indexing to replace negative values with zeros

data[data < 0] = 0
print("Data after replacing negatives with zeros:", data)

Output:


Data after replacing negatives with zeros: [1 2 0 4 5 0 7]

2. Extract Elements Greater Than 2 (Boolean Indexing):

# Use a boolean mask to extract elements greater than 2

subset = data[data > 2]

print("Elements greater than 2:", subset)

Output:


Elements greater than 2: [4 5 7]

Explanation:

 Boolean indexing for assignment (data[data < 0] = 0):

o Creates a boolean mask where each element is checked against the
condition data < 0.

o Elements that satisfy the condition are replaced with 0.

 Boolean indexing for extraction (subset = data[data > 2]):

o Creates a boolean mask where each element is checked against the
condition data > 2.

o Extracts and creates a new array subset containing only the elements that
satisfy the condition. Unlike a positional slice such as data[1:4], which
returns a view, this always returns a copy.

Benefits:

 Efficiency: Performs operations on the entire array without explicit loops.

 Readability: Clear and concise code for data manipulation.

 Flexibility: Easily apply multiple conditions and transformations.
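Positional slicing and boolean indexing are distinct mechanisms, and both are useful when cleaning data. A brief sketch, starting from the cleaned array produced in step 1; the variable names are illustrative.

```python
import numpy as np

data = np.array([1, 2, 0, 4, 5, 0, 7])

# Slicing: select by position with start:stop:step
first_three = data[:3]
every_other = data[::2]

# Boolean indexing with two conditions: parenthesize each, combine with &
in_range = data[(data > 2) & (data < 7)]

print(first_three, every_other, in_range)
```

Use & and | rather than Python's and/or when combining masks, because the comparison produces an array of booleans, not a single value.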

Q68. What is the difference between using the shape and size attributes of a
NumPy array?

A:
Both shape and size are attributes of a NumPy array that provide information about
the array's dimensions and size, but they serve different purposes.

 shape:

o Definition: A tuple that describes the dimensions of the array.

o Example: For a 2x3 array, shape is (2, 3).


o Usage: Helps understand the structure of the array, such as the number
of rows and columns.

 size:

o Definition: An integer representing the total number of elements in the
array.

o Example: For a 2x3 array, size is 6 (2 * 3).

o Usage: Useful for knowing how much data is stored in the array,
regardless of its shape.

Example:

import numpy as np

# Create a 2D NumPy array

arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12]])

# Get the shape of the array

shape = arr.shape

print("Shape:", shape) # Output: (3, 4)

# Access individual dimensions

num_rows = shape[0]

num_cols = shape[1]

print("Number of Rows:", num_rows) # Output: 3

print("Number of Columns:", num_cols) # Output: 4

# Get the size of the array

size = arr.size

print("Size:", size) # Output: 12

# Calculate the size manually

calculated_size = num_rows * num_cols

print("Calculated Size:", calculated_size) # Output: 12


Key Differences:

 shape provides a breakdown of the array's dimensions.

 size gives the total count of all elements in the array.
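The same relationship holds beyond two dimensions: size is the product of all entries in shape, and reshaping changes shape but never size. A small sketch with an arbitrary 3-D array:

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)  # a 3-D array

print("ndim:", arr.ndim)    # number of axes, equal to len(arr.shape)
print("shape:", arr.shape)
print("size:", arr.size)

# Flattening changes the shape but keeps the same number of elements
flat = arr.reshape(-1)
print("flat shape:", flat.shape, "flat size:", flat.size)
```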

Q69. What is a NumPy array and how is it different from a NumPy matrix?

A:
NumPy Array (ndarray):

 Definition: A powerful N-dimensional array object in NumPy used for storing
and manipulating numerical data.

 Features:

o Multidimensional: Supports arrays with multiple dimensions (1D, 2D,
3D, etc.).

o Homogeneous: All elements must be of the same data type.

o Flexible Operations: Supports element-wise operations, slicing,
indexing, and more.

Example:

import numpy as np

# Create a 2D NumPy array

array = np.array([[1, 2, 3],
                  [4, 5, 6]])

print("NumPy Array:\n", array)

NumPy Matrix:

 Definition: A specialized 2-dimensional array subclass in NumPy specifically
designed for linear algebra operations.

 Features:

o Always 2D: Matrices are strictly 2-dimensional.

o Operator Overloading: The * operator performs matrix multiplication
instead of element-wise multiplication.

o Built-in Linear Algebra Operations: Provides methods like .I for
inverse, .T for transpose, etc.

Example:

import numpy as np
# Create a NumPy matrix

matrix = np.matrix([[1, 2],
                    [3, 4]])

print("NumPy Matrix:\n", matrix)

# Matrix multiplication using *

matrix_mult = matrix * matrix

print("Matrix Multiplication:\n", matrix_mult)

Key Differences:

 Dimensions:

o Array: Can be multi-dimensional (1D, 2D, etc.).

o Matrix: Always 2-dimensional.

 Operations:

o Array: * performs element-wise multiplication.

o Matrix: * performs matrix multiplication.

 Use Cases:

o Array: General-purpose, suitable for a wide range of applications.

o Matrix: Specialized for linear algebra tasks.

Note:
While matrices can be useful for linear algebra, the NumPy documentation no longer
recommends np.matrix, even for linear algebra, and notes that the class may be removed
in the future. Prefer ndarray, using the @ operator for matrix multiplication and
functions from numpy.linalg for operations such as inversion.
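For comparison, here is the same 2x2 example written with a plain ndarray. This is a minimal sketch showing how the @ operator and numpy.linalg cover what np.matrix offers through * and .I.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

elementwise = a * a          # element-wise product
matmul = a @ a               # matrix product (same as np.matmul(a, a))
inverse = np.linalg.inv(a)   # takes the place of the matrix attribute .I

print(elementwise)
print(matmul)
print(inverse)
```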

Q70. How can you find the unique elements in an array in NumPy?

A:
You can use the numpy.unique() function to find the unique elements in a NumPy
array. This function also provides the counts of each unique element if needed.

Example:

import numpy as np

# Create an array with duplicate elements

array = np.array([1, 2, 3, 1, 2, 3, 3, 4, 5, 6, 7, 5])


# Find unique elements

unique_elements = np.unique(array)

print("Unique Elements:", unique_elements)

Output:


Unique Elements: [1 2 3 4 5 6 7]

Getting Counts of Unique Elements:

If you also want to know how many times each unique element appears, use the
return_counts parameter.

import numpy as np

# Create an array with duplicate elements

array = np.array([1, 2, 3, 1, 2, 3, 3, 4, 5, 6, 7, 5])

# Find unique elements and their counts

unique_elements, counts = np.unique(array, return_counts=True)

print("Unique Elements:", unique_elements)

print("Counts:", counts)

Output:


Unique Elements: [1 2 3 4 5 6 7]

Counts: [2 2 3 1 2 1 1]

Explanation:

 unique_elements contains all the unique values in the array, sorted in
ascending order.

 counts shows how many times each unique element appears in the original
array.

Finding Unique Rows in a 2D Array:

For multi-dimensional arrays, numpy.unique() can also find unique rows.

import numpy as np

# Create a 2D array with duplicate rows

array_2d = np.array([[1, 2],
                     [3, 4],
                     [1, 2],
                     [5, 6]])

# Find unique rows

unique_rows = np.unique(array_2d, axis=0)

print("Unique Rows:\n", unique_rows)

Output:


Unique Rows:

[[1 2]
 [3 4]
 [5 6]]

Parameters of numpy.unique():

 ar: Input array.

 return_index: If True, also return the indices of the first occurrences of the
unique values.

 return_inverse: If True, also return the indices to reconstruct the original array
from the unique array.

 return_counts: If True, also return the counts of each unique value.

 axis: Axis along which to find unique elements. If None, the array is flattened.

Additional Example with All Parameters:

import numpy as np

array = np.array([2, 1, 2, 4, 3, 4, 5])

unique_elements, indices, inverse, counts = np.unique(
    array, return_index=True, return_inverse=True, return_counts=True)

print("Unique Elements:", unique_elements)

print("Indices:", indices)

print("Inverse:", inverse)

print("Counts:", counts)

Output:

Unique Elements: [1 2 3 4 5]

Indices: [1 0 4 3 6]

Inverse: [1 0 1 3 2 3 4]

Counts: [1 2 1 2 1]

Summary:

 Use numpy.unique() to easily identify all unique elements in an array.

 Utilize additional parameters to get more information like counts or indices.
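The return_inverse output is easiest to understand by using it: indexing the unique values with the inverse indices rebuilds the original array. A short sketch with the same input as the example above:

```python
import numpy as np

array = np.array([2, 1, 2, 4, 3, 4, 5])
unique_elements, inverse = np.unique(array, return_inverse=True)

# Indexing the unique values with the inverse indices rebuilds the original
rebuilt = unique_elements[inverse]
print(rebuilt)
```

This is handy for encoding values as integer codes, since inverse gives each element's position in the sorted list of unique values.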
