NumPy - Indexing with Datetimes



Indexing with Datetimes in NumPy

Indexing with datetimes in NumPy allows you to easily select and manipulate specific time-based data. This is helpful when dealing with time series data, like stock prices or temperature readings.

Arrays of the datetime64 type behave like any other NumPy array, so you can index, slice, and filter them in the usual ways. This lets you focus on specific time periods, such as a particular day or a range of dates, and compare dates directly when filtering data for analysis.
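A common pattern in time series work is to keep the dates in one array and the measurements in a parallel array, then use a condition on the dates to pick out the matching measurements. The following is a minimal sketch of that idea; the temperature values and the variable names are made up for illustration.

```python
import numpy as np

# Five consecutive days (np.arange excludes the stop date)
dates = np.arange('2024-01-01', '2024-01-06', dtype='datetime64[D]')

# Hypothetical temperature readings, one per date (illustrative values only)
temps = np.array([3.5, 4.1, 2.8, 5.0, 4.4])

# Build a boolean mask from the dates, then apply it to the readings
mask = dates >= np.datetime64('2024-01-03')
recent_temps = temps[mask]

print(recent_temps)
```

Because the mask is computed from the dates but applied to the readings, the two arrays must have the same length and the same ordering.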

Basic Indexing with Datetime Arrays

Indexing and slicing with datetime arrays in NumPy allow you to easily access specific dates or ranges of dates. You can index a single date from a datetime array by specifying its position, just like with regular arrays.

For slicing, you can select a contiguous range of dates by providing a start and end index; as with regular arrays, the element at the end index is not included. Additionally, NumPy supports boolean indexing, which allows you to filter dates based on conditions (e.g., selecting all dates after a specific day).

Example

In the following example, we are slicing a datetime64 array to select specific ranges of dates −

import numpy as np

# Define a datetime array
dates = np.array(['2024-01-01', '2024-01-02', '2024-01-03', '2024-01-04', '2024-01-05'], dtype='datetime64[D]')

# Slice the datetime array
selected_dates = dates[1:4]

print(selected_dates)

This produces the following output −

['2024-01-02' '2024-01-03' '2024-01-04']
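Besides a start:end slice, the same array can be indexed by a single position, by a negative position, or with a step. The short sketch below reuses the dates array from the example above; the variable names are only illustrative.

```python
import numpy as np

dates = np.array(['2024-01-01', '2024-01-02', '2024-01-03', '2024-01-04', '2024-01-05'], dtype='datetime64[D]')

# A single position returns one np.datetime64 value
first_date = dates[0]

# Negative positions count from the end of the array
last_date = dates[-1]

# A step of 2 takes every second date
every_other = dates[::2]

print(first_date)
print(last_date)
print(every_other)
```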

Filtering with Boolean Indexing

Boolean indexing in NumPy allows you to filter elements in an array based on conditions. When working with datetime arrays, this feature is useful for selecting data within specific time ranges or satisfying certain time-based criteria.

To perform boolean indexing, you create a condition that evaluates to a boolean array (a mask) with the same shape as the original datetime array. The condition can be any element-wise expression that compares dates (or other data), and it returns an array of True or False values. When the mask is used as an index, the elements at the True positions are kept and the rest are dropped.

Example

In this example, we are filtering a datetime array to select only the dates after a specific date, using boolean indexing −

import numpy as np

# Define a datetime array
dates = np.array(['2024-01-01', '2024-01-02', '2024-01-03', '2024-01-04', '2024-01-05'], dtype='datetime64[D]')

# Define the filter condition
filtered_dates = dates[dates > np.datetime64('2024-01-02')]

print(filtered_dates)

The output for this operation will be −

['2024-01-03' '2024-01-04' '2024-01-05']
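It can help to look at the mask on its own before using it as an index. The sketch below builds the same condition as the example above, stores it in a variable, and inspects it; np.flatnonzero is used here only as one convenient way to get the matching positions.

```python
import numpy as np

dates = np.array(['2024-01-01', '2024-01-02', '2024-01-03', '2024-01-04', '2024-01-05'], dtype='datetime64[D]')

# The comparison is evaluated element by element and yields a boolean array
mask = dates > np.datetime64('2024-01-02')
print(mask)

# True counts as 1, so summing the mask gives the number of matches
match_count = int(mask.sum())
print(match_count)

# Integer positions of the matching dates
positions = np.flatnonzero(mask)
print(positions)
```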

Indexing with Date Ranges

Indexing with date ranges in NumPy allows you to select and work with subsets of datetime data that fall within specific time intervals.

To index with date ranges, you define a condition that specifies the start and end of the range you are interested in, using comparison operators such as >= and <=. You can combine conditions with the element-wise operators & (and), | (or), and ~ (not) to filter data more precisely. Each comparison must be wrapped in parentheses, because these operators bind more tightly than the comparisons.

Example

In this example, we are selecting data within a specific date range −

import numpy as np

# Define a datetime array
dates = np.array(['2024-01-01', '2024-01-02', '2024-01-03', '2024-01-04', '2024-01-05'], dtype='datetime64[D]')

# Define the start and end dates
start_date = np.datetime64('2024-01-02')
end_date = np.datetime64('2024-01-04')

# Select dates within the range
range_dates = dates[(dates >= start_date) & (dates <= end_date)]

print(range_dates)

The result produced is as follows −

['2024-01-02' '2024-01-03' '2024-01-04']
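The range above includes both endpoints. Two common variations are a half-open range, which excludes the end date, and the complement of a range. The following sketch shows both under the same sample data; the variable names are illustrative.

```python
import numpy as np

dates = np.array(['2024-01-01', '2024-01-02', '2024-01-03', '2024-01-04', '2024-01-05'], dtype='datetime64[D]')

start_date = np.datetime64('2024-01-02')
end_date = np.datetime64('2024-01-04')

# Half-open range: includes start_date, excludes end_date
half_open = dates[(dates >= start_date) & (dates < end_date)]

# Dates outside the closed range, combining two conditions with | (or)
outside = dates[(dates < start_date) | (dates > end_date)]

# The same selection, written by negating the in-range mask with ~ (not)
in_range = (dates >= start_date) & (dates <= end_date)
outside_alt = dates[~in_range]

print(half_open)
print(outside)
print(outside_alt)
```

Half-open ranges are convenient when consecutive ranges must not overlap, since each date then belongs to exactly one of them.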

Working with Different Time Units

In NumPy, the datetime64 and timedelta64 types support time units ranging from years down to attoseconds. This allows precise manipulation and analysis of time data at different scales, such as days, hours, and minutes, or much smaller units like nanoseconds and femtoseconds.

The time units in NumPy are represented by strings, such as 'Y' for years, 'M' for months, 'D' for days, 'h' for hours, 'm' for minutes, 's' for seconds, 'ms' for milliseconds, 'us' for microseconds, and 'ns' for nanoseconds. You can use these units to create datetime and timedelta objects or perform arithmetic operations involving time intervals.
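As a rough sketch of how these unit strings are used, the snippet below creates values at day and minute precision and performs simple arithmetic with timedelta64. The specific dates are arbitrary and chosen only for illustration.

```python
import numpy as np

# A date stored at day precision and a timestamp stored at minute precision
day = np.datetime64('2024-01-31', 'D')
stamp = np.datetime64('2024-01-31T14:30', 'm')

# Adding a timedelta64 shifts the value by that interval
next_week = day + np.timedelta64(7, 'D')
later = stamp + np.timedelta64(90, 'm')

# Subtracting two datetime64 values yields a timedelta64
elapsed = np.datetime64('2024-03-01') - np.datetime64('2024-02-01')

print(next_week)
print(later)
print(elapsed)
```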

Example

In this example, we store the dates at month precision (datetime64[M]), so the day part of each date is discarded, and then select the entries that fall in January 2024 −

import numpy as np

# Define a datetime array
dates = np.array(['2024-01-01', '2024-02-01', '2024-03-01', '2024-04-01'], dtype='datetime64[M]')

# Filter dates by the month of January
january_dates = dates[dates == np.datetime64('2024-01', 'M')]

print(january_dates)

After executing the above code, we get the following output −

['2024-01']
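When the array is stored at day precision and the days need to be kept, one possible approach is to convert a copy to month precision with astype and compare that copy against the target month. The sketch below assumes a small made-up array named daily.

```python
import numpy as np

# Dates stored at day precision (sample values for illustration)
daily = np.array(['2024-01-15', '2024-01-31', '2024-02-01', '2024-03-10'], dtype='datetime64[D]')

# Truncate each date to its month; the original array is left unchanged
months = daily.astype('datetime64[M]')

# Keep the full dates whose month is January 2024
january_days = daily[months == np.datetime64('2024-01')]

print(january_days)
```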

Advanced Indexing with Structured Arrays

Structured arrays in NumPy allow you to store and manipulate records with multiple named fields. A structured array is similar to a regular array, but each element has several fields, and each field has its own data type, such as an integer, a float, a string, or a datetime64.

Structured arrays are created using the np.array() function with a dtype argument that specifies the names and types of the fields. You can access a single field by name, and you can use a condition on one field, such as a date, to select or modify whole records.

Example

In this example, we create a structured array and index it by date, selecting specific records based on the datetime values −

import numpy as np

# Define a structured array with dates and associated data
data = np.array([('2024-01-01', 100), ('2024-01-02', 200), ('2024-01-03', 300)],
                dtype=[('date', 'datetime64[D]'), ('value', 'i4')])

# Filter data where the date is after '2024-01-01'
filtered_data = data[data['date'] > np.datetime64('2024-01-01')]

print(filtered_data)

The output will be −

[('2024-01-02', 200) ('2024-01-03', 300)]
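Beyond filtering, the date field can also be used to order records and to update another field selectively. The following is a minimal sketch of those two operations; the records are deliberately out of order, and the amount added to the values is arbitrary.

```python
import numpy as np

# Records deliberately listed out of date order
data = np.array([('2024-01-03', 300), ('2024-01-01', 100), ('2024-01-02', 200)],
                dtype=[('date', 'datetime64[D]'), ('value', 'i4')])

# Accessing a field by name returns that column as an array
print(data['value'])

# Order the records by date; indexing with argsort returns a sorted copy
ordered = data[np.argsort(data['date'])]
print(ordered)

# Update the value field only for records on or after a given date
late = ordered['date'] >= np.datetime64('2024-01-02')
ordered['value'][late] += 1000
print(ordered['value'])
```

Since ordered is a copy, the update does not change the original data array.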