2. CUPTI Python API Reference#

2.1. Documentation Issues#

The CUPTI Python API Reference section of the document is automatically generated and has some issues:

  • All the CUPTI Python enumerations, functions and classes are listed together in a single section.

  • The members of the inner struct/union classes (_py_anon_pod*) are not adequately documented. For more information about a member, please refer to the CUPTI C documentation.

  • The kind member of Python classes has type int instead of cupti.cupti.ActivityKind. When using the kind member, convert it with cupti.cupti.ActivityKind to get the enum value.

  • Some parts of the API documentation still cite C enums/data structures instead of mapping them to their Python counterparts.
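Because the kind member is a plain int, one way to work with it is to convert it to the enumeration before comparing or printing it. The following is a minimal sketch, not part of the generated reference. It uses a local stand-in enum that copies a few values from the ActivityKind listing in this document so that it runs without CUPTI installed; with the real package, cupti.cupti.ActivityKind would take its place. The helper name describe_kind is hypothetical.

```python
from enum import IntEnum


class ActivityKind(IntEnum):
    """Stand-in for cupti.cupti.ActivityKind (subset of the documented values)."""
    INVALID = 0
    MEMCPY = 1
    KERNEL = 3
    DRIVER = 4
    RUNTIME = 5
    CONCURRENT_KERNEL = 10


def describe_kind(kind: int) -> str:
    """Return the enum member name for a raw kind value, or a fallback string."""
    try:
        return ActivityKind(kind).name
    except ValueError:
        # The value is not present in this partial stand-in enum.
        return f"UNKNOWN({kind})"


print(describe_kind(10))
print(describe_kind(999))
```

Comparisons such as record.kind == ActivityKind.RUNTIME also work directly, because IntEnum members compare equal to their integer values.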

2.2. API Reference#

exception cupti.cupti.cuptiError(status: int)#

Bases: Exception

class cupti.cupti.ActivityAPI#

Bases: object

Empty-initialize an instance of CUpti_ActivityAPI.

cbid#

The ID of the driver or runtime function.

Type:

int

correlation_id#

The correlation ID of the driver or runtime CUDA function. Each function invocation is assigned a unique correlation ID that is identical to the correlation ID in the memcpy, memset, or kernel activity record that is associated with this function.

Type:

int

end#

The end timestamp for the function, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the function.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_DRIVER, CUPTI_ACTIVITY_KIND_RUNTIME, or CUPTI_ACTIVITY_KIND_INTERNAL_LAUNCH_API.

Type:

int

process_id#

The ID of the process where the driver or runtime CUDA function is executing.

Type:

int

ptr#

Get the pointer address to the data as Python int.

return_value#

The return value for the function. For a CUDA driver function this will be a CUresult value, and for a CUDA runtime function this will be a cudaError_t value.

Type:

int

start#

The start timestamp for the function, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the function.

Type:

int

thread_id#

The ID of the thread where the driver or runtime CUDA function is executing.

Type:

int
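The start and end members above can be combined into a duration, taking care of the case where both are 0. The following is a minimal sketch under stated assumptions: the records are hypothetical SimpleNamespace objects with made-up values that stand in for ActivityAPI instances, and the helper name api_duration_ns is not part of the CUPTI Python API.

```python
from types import SimpleNamespace
from typing import Optional


def api_duration_ns(record) -> Optional[int]:
    """Return end - start in ns, or None when no timestamps were collected."""
    if record.start == 0 and record.end == 0:
        # Documented convention: 0 for both means timestamps are unavailable.
        return None
    return record.end - record.start


# Hypothetical records standing in for ActivityAPI instances.
timed = SimpleNamespace(cbid=1, correlation_id=42, start=1_000, end=4_500)
untimed = SimpleNamespace(cbid=2, correlation_id=43, start=0, end=0)

print(api_duration_ns(timed))
print(api_duration_ns(untimed))
```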

class cupti.cupti.ActivityAttribute(value)#

Bases: IntEnum

See CUpti_ActivityAttribute.

ATTR_DEVICE_BUFFER_FORCE_INT = 2147483647#
ATTR_DEVICE_BUFFER_POOL_LIMIT = 2#
ATTR_DEVICE_BUFFER_PRE_ALLOCATE_VALUE = 6#
ATTR_DEVICE_BUFFER_SIZE = 0#
ATTR_DEVICE_BUFFER_SIZE_CDP = 1#
ATTR_DEVICE_BUFFER_SIZE_DEVICE_GRAPHS = 10#
ATTR_MEM_ALLOCATION_TYPE_HOST_PINNED = 8#
ATTR_PER_THREAD_BUFFER = 9#
ATTR_PROFILING_SEMAPHORE_POOL_LIMIT = 4#
ATTR_PROFILING_SEMAPHORE_POOL_SIZE = 3#
ATTR_PROFILING_SEMAPHORE_PRE_ALLOCATE_VALUE = 7#
ATTR_ZEROED_OUT_BUFFER = 5#
class cupti.cupti.ActivityAutoBoostState#

Bases: object

Empty-initialize an instance of CUpti_ActivityAutoBoostState.

enabled#

Returned auto boost state: 1 if auto boost is enabled, 0 otherwise.

Type:

int

pid#

ID of the process that has set the current boost state. The value will be CUPTI_AUTO_BOOST_INVALID_CLIENT_PID if the user does not have permission to query process IDs or if there is an error in querying the process ID.

Type:

int

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti.ActivityCdpKernel#

Bases: object

Empty-initialize an instance of CUpti_ActivityCdpKernel.

block_x#

The X-dimension block size for the kernel.

Type:

int

block_y#

The Y-dimension block size for the kernel.

Type:

int

block_z#

The Z-dimension block size for the kernel.

Type:

int

cache_config#

_py_anon_pod7:

completed#

The timestamp when kernel is marked as completed, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.

Type:

int

context_id#

The ID of the context where the kernel is executing.

Type:

int

correlation_id#

The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the kernel.

Type:

int

device_id#

The ID of the device where the kernel is executing.

Type:

int

dynamic_shared_memory#

The dynamic shared memory reserved for the kernel, in bytes.

Type:

int

end#

The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

Type:

int

grid_id#

The grid ID of the kernel. Each kernel execution is assigned a unique grid ID.

Type:

int

grid_x#

The X-dimension grid size for the kernel.

Type:

int

grid_y#

The Y-dimension grid size for the kernel.

Type:

int

grid_z#

The Z-dimension grid size for the kernel.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_CDP_KERNEL.

Type:

int

local_memory_per_thread#

The amount of local memory reserved for each thread, in bytes.

Type:

int

local_memory_total#

The total amount of local memory reserved for the kernel, in bytes.

Type:

int

name#

The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.

Type:

str

pad#

Undefined. Reserved for internal use.

Type:

int

parent_block_x#

The X-dimension of the parent block.

Type:

int

parent_block_y#

The Y-dimension of the parent block.

Type:

int

parent_block_z#

The Z-dimension of the parent block.

Type:

int

parent_grid_id#

The grid ID of the parent kernel.

Type:

int

ptr#

Get the pointer address to the data as Python int.

queued#

The timestamp when kernel is queued up, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time is unknown.

Type:

int

registers_per_thread#

The number of registers required for each thread executing the kernel.

Type:

int

shared_memory_config#

The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.

Type:

int

start#

The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

Type:

int

static_shared_memory#

The static shared memory allocated for the kernel, in bytes.

Type:

int

stream_id#

The ID of the stream where the kernel is executing.

Type:

int

submitted#

The timestamp when kernel is submitted to the gpu, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the submission time is unknown.

Type:

int
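The grid and block dimension members describe the launch geometry of the kernel. As a minimal sketch (the record below is a hypothetical SimpleNamespace with made-up values standing in for an ActivityCdpKernel instance, and the helper names are assumptions), the number of threads per block and the total number of threads can be derived as follows:

```python
from types import SimpleNamespace


def threads_per_block(kernel) -> int:
    """Product of the three block dimensions."""
    return kernel.block_x * kernel.block_y * kernel.block_z


def total_threads(kernel) -> int:
    """Number of blocks in the grid times the threads in each block."""
    blocks = kernel.grid_x * kernel.grid_y * kernel.grid_z
    return blocks * threads_per_block(kernel)


# Hypothetical record standing in for an ActivityCdpKernel instance.
kernel = SimpleNamespace(
    grid_x=4, grid_y=2, grid_z=1,
    block_x=128, block_y=1, block_z=1,
)

print(threads_per_block(kernel))
print(total_threads(kernel))
```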

class cupti.cupti.ActivityComputeApiKind(value)#

Bases: IntEnum

See CUpti_ActivityComputeApiKind.

CUDA = 1#
CUDA_MPS = 2#
FORCE_INT = 2147483647#
UNKNOWN = 0#
class cupti.cupti.ActivityConfidentialComputeRotation#

Bases: object

Empty-initialize an instance of CUpti_ActivityConfidentialComputeRotation.

channel_id#

Channel ID

Type:

int

channel_type#

Channel type. See CUpti_ChannelType.

Type:

int

context_id#

Context ID

Type:

int

device_id#

Device ID

Type:

int

event_type#

Type of event. See CUpti_ConfidentialComputeRotationEventType.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_CONFIDENTIAL_COMPUTE_ROTATION.

Type:

int

ptr#

Get the pointer address to the data as Python int.

timestamp#

Timestamp in ns

Type:

int

class cupti.cupti.ActivityContext3#

Bases: object

Empty-initialize an instance of CUpti_ActivityContext3.

cig_mode#

This field indicates the CIG mode

Type:

int

compute_api_kind#

The compute API kind. See CUpti_ActivityComputeApiKind.

Type:

int

context_id#

The context ID.

Type:

int

device_id#

The device ID.

Type:

int

is_green_context#

This field indicates whether the context is a green context

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_CONTEXT.

Type:

int

null_stream_id#

The ID for the NULL stream in this context

Type:

int

num_multiprocessors#

Number of multiprocessors assigned to the green context. Invalid if the field ‘isGreenContext’ is 0.

Type:

int

padding#

int:

padding2#

int:

parent_context_id#

The ID of the parent context. It is 0 if the context does not have a parent.

Type:

int

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti.ActivityCudaEvent2#

Bases: object

Empty-initialize an instance of CUpti_ActivityCudaEvent2.

context_id#

The ID of the context where the event was recorded.

Type:

int

correlation_id#

The correlation ID of the API to which this result is associated.

Type:

int

cuda_event_sync_id#

A unique ID to associate event synchronization records with the latest CUDA Event record. A similar field is added in CUpti_ActivitySynchronization2 to associate the CUDA Event record with the synchronization record. The same CUDA event can be used multiple times, so the event ID alone cannot uniquely correlate the synchronization record with the latest CUDA Event record. This field is unique and can be used for that correlation.

Type:

int

device_id#

The ID of the device where the event was recorded.

Type:

int

device_timestamp#

The device-side timestamp of the CUDA event record, in ns. Collection of this field is disabled by default. It can be enabled by calling the CUPTI API cuptiActivityEnableCudaEventDeviceTimestamps.

Type:

int

event_id#

A unique event ID to identify the event record.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_CUDA_EVENT.

Type:

int

pad#

Undefined. Reserved for internal use.

Type:

int

pad2#

Undefined. Reserved for internal use.

Type:

int

ptr#

Get the pointer address to the data as Python int.

reserved0#

Undefined. Reserved for internal use.

Type:

int

stream_id#

The compute stream where the event was recorded.

Type:

int

class cupti.cupti.ActivityDevice5#

Bases: object

Empty-initialize an instance of CUpti_ActivityDevice5.

compute_capability_major#

Compute capability for the device, major number.

Type:

int

compute_capability_minor#

Compute capability for the device, minor number.

Type:

int

compute_instance_id#

Compute instance ID for MIG-enabled devices. If MIG mode is disabled, the value is set to UINT32_MAX.

Type:

int

constant_memory_size#

The amount of constant memory on the device, in bytes.

Type:

int

core_clock_rate#

The core clock rate of the device, in kHz.

Type:

int

ecc_enabled#

ECC enabled flag for device

Type:

int

flags_#

The flags associated with the device. See CUpti_ActivityFlag.

Type:

int

global_memory_bandwidth#

The global memory bandwidth available on the device, in kBytes/sec.

Type:

int

global_memory_size#

The amount of global memory on the device, in bytes.

Type:

int

gpu_instance_id#

GPU instance ID for MIG-enabled devices. If MIG mode is disabled, the value is set to UINT32_MAX.

Type:

int

id#

The device ID.

Type:

int

is_cuda_visible#

Flag to indicate whether the device is visible to CUDA. Users can set the device visibility using the CUDA_VISIBLE_DEVICES environment variable.

Type:

int

is_mig_enabled#

MIG enabled flag for device

Type:

int

is_numa_node#

NUMA (non-uniform memory access) information for the device: whether the GPU is a NUMA node.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE.

Type:

int

l2cache_size#

The size of the L2 cache on the device, in bytes.

Type:

int

max_block_dim_x#

Maximum allowed X dimension for a block.

Type:

int

max_block_dim_y#

Maximum allowed Y dimension for a block.

Type:

int

max_block_dim_z#

Maximum allowed Z dimension for a block.

Type:

int

max_blocks_per_multiprocessor#

Maximum number of blocks that can be present on a multiprocessor at any given time.

Type:

int

max_grid_dim_x#

Maximum allowed X dimension for a grid.

Type:

int

max_grid_dim_y#

Maximum allowed Y dimension for a grid.

Type:

int

max_grid_dim_z#

Maximum allowed Z dimension for a grid.

Type:

int

max_ipc#

The maximum “instructions per cycle” possible on each device multiprocessor.

Type:

int

max_registers_per_block#

Maximum number of registers that can be allocated to a block.

Type:

int

max_registers_per_multiprocessor#

Maximum number of 32-bit registers available per multiprocessor.

Type:

int

max_shared_memory_per_block#

Maximum amount of shared memory that can be assigned to a block, in bytes.

Type:

int

max_shared_memory_per_multiprocessor#

Maximum amount of shared memory available per multiprocessor, in bytes.

Type:

int

max_threads_per_block#

Maximum number of threads allowed in a block.

Type:

int

max_warps_per_multiprocessor#

Maximum number of warps that can be present on a multiprocessor at any given time.

Type:

int

mig_uuid#

The MIG UUID. This value is the globally unique immutable alphanumeric identifier of the device.

name#

The device name. The client is responsible for freeing this memory using the free function when done.

Type:

str

num_memcpy_engines#

Number of memory copy engines on the device.

Type:

int

num_multiprocessors#

Number of multiprocessors on the device.

Type:

int

num_threads_per_warp#

The number of threads per warp on the device.

Type:

int

numa_id#

NUMA (non-uniform memory access) information for the device: the NUMA node ID of the GPU memory. If the GPU is not a NUMA node, this is invalidNumaId.

Type:

int

ptr#

Get the pointer address to the data as Python int.

uuid#

The device UUID. This value is the globally unique immutable alphanumeric identifier of the device.

class cupti.cupti.ActivityDeviceAttribute#

Bases: object

Empty-initialize an instance of CUpti_ActivityDeviceAttribute.

attribute#

The attribute, either a CUpti_DeviceAttribute or CUdevice_attribute. Flag CUPTI_ACTIVITY_FLAG_DEVICE_ATTRIBUTE_CUDEVICE is used to indicate what kind of attribute this is. If CUPTI_ACTIVITY_FLAG_DEVICE_ATTRIBUTE_CUDEVICE is 1 then the CUdevice_attribute field is valid, otherwise the CUpti_DeviceAttribute field is valid.

Type:

_py_anon_pod9

device_id#

The ID of the device that this attribute applies to.

Type:

int

flags_#

The flags associated with the device. See CUpti_ActivityFlag.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE_ATTRIBUTE.

Type:

int

ptr#

Get the pointer address to the data as Python int.

value#

The value for the attribute. See CUpti_DeviceAttribute and CUdevice_attribute for the type of the value for a given attribute.

Type:

_py_anon_pod10

class cupti.cupti.ActivityDeviceGraphTrace#

Bases: object

Empty-initialize an instance of CUpti_ActivityDeviceGraphTrace.

context_id#

The ID of the context where the first node of the graph is executed.

Type:

int

device_id#

The ID of the device where the first node of the graph is executed.

Type:

int

device_launch_mode#

The type of launch. See CUpti_DeviceGraphLaunchMode.

Type:

int

end#

The end timestamp for the graph execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the graph.

Type:

int

graph_id#

The unique ID of the graph that is launched.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE_GRAPH_TRACE.

Type:

int

launcher_graph_id#

The unique ID of the graph that has launched this graph.

Type:

int

ptr#

Get the pointer address to the data as Python int.

start#

The start timestamp for the graph execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the graph.

Type:

int

stream_id#

The ID of the stream where the graph is being launched.

Type:

int

class cupti.cupti.ActivityEnvironment#

Bases: object

Empty-initialize an instance of CUpti_ActivityEnvironment.

data_#

_py_anon_pod11:

device_id#

The ID of the device

Type:

int

environment_kind#

The kind of data reported in this record.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_ENVIRONMENT.

Type:

int

ptr#

Get the pointer address to the data as Python int.

timestamp#

The timestamp when this sample was retrieved, in ns. A value of 0 indicates that timestamp information could not be collected for the sample.

Type:

int

class cupti.cupti.ActivityEnvironmentKind(value)#

Bases: IntEnum

See CUpti_ActivityEnvironmentKind.

COOLING = 4#
COUNT = 5#
FORCE_INT = 2147483647#
POWER = 3#
SPEED = 1#
TEMPERATURE = 2#
UNKNOWN = 0#
class cupti.cupti.ActivityExternalCorrelation#

Bases: object

Empty-initialize an instance of CUpti_ActivityExternalCorrelation.

correlation_id#

The correlation ID of the associated CUDA driver or runtime API record.

Type:

int

external_id#

The correlation ID of the associated non-CUDA API record. The exact field in the associated external record depends on that record’s activity kind (externalKind).

Type:

int

external_kind#

The kind of external API this record is correlated to.

Type:

int

kind#

The kind of this activity.

Type:

int

ptr#

Get the pointer address to the data as Python int.
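The correlation_id member links an external correlation record to the CUDA driver or runtime API record with the same correlation ID. As a minimal sketch of that join (all records below are hypothetical SimpleNamespace objects with made-up values, and the helper name index_external_ids is an assumption, not part of the CUPTI Python API):

```python
from types import SimpleNamespace


def index_external_ids(external_records):
    """Map each CUDA correlation ID to its (external_kind, external_id) pairs."""
    index = {}
    for rec in external_records:
        pair = (rec.external_kind, rec.external_id)
        index.setdefault(rec.correlation_id, []).append(pair)
    return index


# Hypothetical stand-ins for ActivityExternalCorrelation records.
external = [
    SimpleNamespace(correlation_id=7, external_kind=1, external_id=1001),
    SimpleNamespace(correlation_id=7, external_kind=2, external_id=55),
    SimpleNamespace(correlation_id=9, external_kind=1, external_id=1002),
]

# Hypothetical stand-in for an ActivityAPI record.
api_record = SimpleNamespace(correlation_id=7, start=100, end=250)

index = index_external_ids(external)
print(index.get(api_record.correlation_id, []))
```

A list is kept per correlation ID because more than one external record may refer to the same CUDA API call.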

class cupti.cupti.ActivityFlag(value)#

Bases: IntEnum

See CUpti_ActivityFlag.

DEVICE_ATTRIBUTE_CUDEVICE = 1#
DEVICE_CONCURRENT_KERNELS = 1#
FLUSH_FORCED = 1#
FORCE_INT = 2147483647#
GLOBAL_ACCESS_KIND_CACHED = 512#
GLOBAL_ACCESS_KIND_LOAD = 256#
GLOBAL_ACCESS_KIND_SIZE_MASK = 255#
INSTRUCTION_CLASS_MASK = 510#
INSTRUCTION_VALUE_INVALID = 1#
MARKER_COLOR_ARGB = 2#
MARKER_COLOR_NONE = 1#
MARKER_INSTANTANEOUS = 1#
MARKER_START = 2#
MARKER_SYNC_ACQUIRE = 8#
MARKER_SYNC_ACQUIRE_FAILED = 32#
MARKER_SYNC_ACQUIRE_SUCCESS = 16#
MARKER_SYNC_RELEASE = 64#
MEMCPY_ASYNC = 1#
MEMSET_ASYNC = 1#
METRIC_OVERFLOWED = 1#
METRIC_VALUE_INVALID = 2#
NONE = 0#
SHARED_ACCESS_KIND_LOAD = 256#
SHARED_ACCESS_KIND_SIZE_MASK = 255#
THRASHING_IN_CPU = 1#
THROTTLING_IN_CPU = 1#
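Several ActivityFlag members share the same numeric value (for example, many are 1), so a flags_ value has to be interpreted in the context of the record type it came from; looking a value up in the enumeration alone cannot tell the aliases apart. The following is a minimal sketch of decoding the global access bits with the mask values listed above. It uses a local stand-in enum so that it runs without CUPTI installed, the helper name decode_global_access is hypothetical, and the reading of the masked value as an access size is an assumption based on the member name.

```python
from enum import IntEnum


class ActivityFlag(IntEnum):
    """Stand-in for cupti.cupti.ActivityFlag (global access members only)."""
    GLOBAL_ACCESS_KIND_SIZE_MASK = 255
    GLOBAL_ACCESS_KIND_LOAD = 256
    GLOBAL_ACCESS_KIND_CACHED = 512


def decode_global_access(flags: int) -> dict:
    """Split a global access flags value into its size and kind bits."""
    return {
        # Low 8 bits: access size (assumed meaning of SIZE_MASK).
        "size": flags & ActivityFlag.GLOBAL_ACCESS_KIND_SIZE_MASK,
        "is_load": bool(flags & ActivityFlag.GLOBAL_ACCESS_KIND_LOAD),
        "is_cached": bool(flags & ActivityFlag.GLOBAL_ACCESS_KIND_CACHED),
    }


print(decode_global_access(0x304))
```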
class cupti.cupti.ActivityFunction#

Bases: object

Empty-initialize an instance of CUpti_ActivityFunction.

context_id#

The ID of the context where the function is launched.

Type:

int

function_ind_ex#

The function’s unique symbol index in the module.

Type:

int

id#

ID to uniquely identify the record

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_FUNCTION.

Type:

int

module_id#

The module ID in which this global/device function is present.

Type:

int

name#

The name of the function. This name is shared across all activity records representing the same kernel, and so should not be modified.

Type:

str

pad#

Undefined. Reserved for internal use.

Type:

int

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti.ActivityGraphTrace2#

Bases: object

Empty-initialize an instance of CUpti_ActivityGraphTrace2.

context_id#

The ID of the context where the first node of the graph is executed. If this is INT_MAX, then the start is on the host.

Type:

int

correlation_id#

The correlation ID of the graph launch. Each graph launch is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the graph.

Type:

int

device_id#

The ID of the device where the first node of the graph is executed. If this is INT_MAX, then the start is on the host.

Type:

int

end#

The end timestamp for the graph execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the graph.

Type:

int

end_context_id#

The ID of the context where the last node of the graph is executed.

Type:

int

end_device_id#

The ID of the device where the last node of the graph is executed.

Type:

int

graph_id#

The unique ID of the graph that is launched.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_GRAPH_TRACE.

Type:

int

ptr#

Get the pointer address to the data as Python int.

start#

The start timestamp for the graph execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the graph.

Type:

int

stream_id#

The ID of the stream where the graph is being launched.

Type:

int

class cupti.cupti.ActivityInstructionClass(value)#

Bases: IntEnum

See CUpti_ActivityInstructionClass.

BARRIER = 17#
BIT_CONVERSION = 4#
CONSTANT = 11#
CONTROL_FLOW = 5#
FP_16 = 19#
FP_32 = 1#
FP_64 = 2#
GENERIC = 9#
GLOBAL = 6#
GLOBAL_ATOMIC = 13#
INTEGER = 3#
INTER_THREAD_COMMUNICATION = 16#
KIND_FORCE_INT = 2147483647#
LOCAL = 8#
MISCELLANEOUS = 18#
SHARED = 7#
SHARED_ATOMIC = 14#
SURFACE = 10#
SURFACE_ATOMIC = 15#
TEXTURE = 12#
UNIFORM = 20#
UNKNOWN = 0#
class cupti.cupti.ActivityJit2#

Bases: object

Empty-initialize an instance of CUpti_ActivityJit2.

cache_path#

The path where the fat binary is cached.

Type:

str

cache_size#

The size of compute cache.

Type:

int

correlation_id#

The correlation ID of the JIT operation to which the records belong. Each JIT operation is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the JIT operation.

Type:

int

device_id#

The device ID.

Type:

int

end#

The end timestamp for the JIT operation, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the JIT operation.

Type:

int

jit_entry_type#

The JIT entry type.

Type:

int

jit_operation_correlation_id#

The correlation ID to correlate JIT compilation, load and store operations. Each JIT compilation unit is assigned a unique correlation ID at the time of the JIT compilation. This correlation id can be used to find the matching JIT cache load/store records.

Type:

int

jit_operation_type#

The JIT operation type.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_JIT.

Type:

int

padding#

Internal use.

Type:

int

process_id#

The ID of the process where the JIT operation is executing.

Type:

int

ptr#

Get the pointer address to the data as Python int.

start#

The start timestamp for the JIT operation, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the JIT operation.

Type:

int

thread_id#

The ID of the thread where the JIT operation is executing.

Type:

int

class cupti.cupti.ActivityJitEntryType(value)#

Bases: IntEnum

See CUpti_ActivityJitEntryType.

FORCE_INT = 2147483647#
INVALID = 0#
NVVM_IR_TO_PTX = 2#
PTX_TO_CUBIN = 1#
class cupti.cupti.ActivityJitOperationType(value)#

Bases: IntEnum

See CUpti_ActivityJitOperationType.

CACHE_LOAD = 1#
CACHE_STORE = 2#
COMPILE = 3#
FORCE_INT = 2147483647#
INVALID = 0#
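The jit_operation_correlation_id member of ActivityJit2 ties a JIT compilation to its matching cache load and store records. As a minimal sketch of that grouping (the records are hypothetical SimpleNamespace objects with made-up values, the enum is a local stand-in that copies the ActivityJitOperationType values listed above, and the helper name group_jit_operations is an assumption):

```python
from collections import defaultdict
from enum import IntEnum
from types import SimpleNamespace


class ActivityJitOperationType(IntEnum):
    """Stand-in for cupti.cupti.ActivityJitOperationType."""
    INVALID = 0
    CACHE_LOAD = 1
    CACHE_STORE = 2
    COMPILE = 3


def group_jit_operations(records):
    """Group operation type names by jit_operation_correlation_id."""
    groups = defaultdict(list)
    for rec in records:
        op = ActivityJitOperationType(rec.jit_operation_type)
        groups[rec.jit_operation_correlation_id].append(op.name)
    return dict(groups)


# Hypothetical stand-ins for ActivityJit2 records.
jit_records = [
    SimpleNamespace(jit_operation_correlation_id=1, jit_operation_type=3),
    SimpleNamespace(jit_operation_correlation_id=1, jit_operation_type=2),
    SimpleNamespace(jit_operation_correlation_id=2, jit_operation_type=1),
]

print(group_jit_operations(jit_records))
```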
class cupti.cupti.ActivityKernel10#

Bases: object

Empty-initialize an instance of CUpti_ActivityKernel10.

block_x#

The X-dimension block size for the kernel.

Type:

int

block_y#

The Y-dimension block size for the kernel.

Type:

int

block_z#

The Z-dimension block size for the kernel.

Type:

int

cache_config#

For devices with compute capability 7.5+, cacheConfig values are not updated if the field isSharedMemoryCarveoutRequested is set.

Type:

_py_anon_pod24

channel_id#

The ID of the HW channel on which the kernel is launched.

Type:

int

channel_type#

The type of the channel

Type:

int

cluster_scheduling_policy#

The cluster scheduling policy for the kernel. Refer to CUclusterSchedulingPolicy. This field is valid for devices with compute capability 9.0 and higher.

Type:

int

cluster_x#

The X-dimension cluster size for the kernel. This field is valid for devices with compute capability 9.0 and higher.

Type:

int

cluster_y#

The Y-dimension cluster size for the kernel. This field is valid for devices with compute capability 9.0 and higher.

Type:

int

cluster_z#

The Z-dimension cluster size for the kernel. This field is valid for devices with compute capability 9.0 and higher.

Type:

int

completed#

The completed timestamp for the kernel execution, in ns. It represents the completion of all its child kernels and the kernel itself. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.

Type:

int

context_id#

The ID of the context where the kernel is executing.

Type:

int

correlation_id#

The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the kernel.

Type:

int

device_id#

The ID of the device where the kernel is executing.

Type:

int

dynamic_shared_memory#

The dynamic shared memory reserved for the kernel, in bytes.

Type:

int

end#

The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

Type:

int

graph_id#

The unique ID of the graph that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.

Type:

int

graph_node_id#

The unique ID of the graph node that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.

Type:

int

grid_id#

The grid ID of the kernel. Each kernel is assigned a unique grid ID at runtime.

Type:

int

grid_x#

The X-dimension grid size for the kernel.

Type:

int

grid_y#

The Y-dimension grid size for the kernel.

Type:

int

grid_z#

The Z-dimension grid size for the kernel.

Type:

int

is_device_launched#

This field is set to 1 if the kernel is part of a device launched graph.

Type:

int

is_shared_memory_carveout_requested#

This indicates whether CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT was updated for the kernel launch.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.

Type:

int

launch_type#

This indicates whether the kernel was executed via a regular launch or via a single/multi device cooperative launch. See CUpti_ActivityLaunchType.

Type:

int

local_memory_per_thread#

The amount of local memory reserved for each thread, in bytes.

Type:

int

local_memory_total#

The total amount of local memory reserved for the kernel, in bytes (deprecated in CUDA 11.8). Refer to the field localMemoryTotal_v2.

Type:

int

local_memory_total_v2#

The total amount of local memory reserved for the kernel, in bytes.

Type:

int

max_active_clusters#

The maximum number of clusters that could co-exist on the target device for the kernel.

Type:

int

max_potential_cluster_size#

The maximum cluster size for the kernel.

Type:

int

name#

The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.

Type:

str

p_access_policy_window#

The pointer to the access policy window. The structure CUaccessPolicyWindow is defined in cuda.h.

Type:

int

padding#

Undefined. Reserved for internal use.

Type:

int

padding3#

(array of length 7).

Type:

uint8

partitioned_global_cache_executed#

The partitioned global caching executed for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2. Partitioned global caching can be automatically disabled if the occupancy requirement of the launch cannot support caching.

Type:

int

partitioned_global_cache_requested#

The partitioned global caching requested for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2.

Type:

int

ptr#

Get the pointer address to the data as Python int.

queued#

The timestamp when the kernel is queued up in the command buffer, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection. The command buffer is a buffer written by the CUDA driver to send commands such as kernel launches and memory copies to the GPU. All launches of CUDA kernels are asynchronous with respect to the host: the host requests the launch by writing commands into the command buffer, then returns without checking the GPU’s progress.

Type:

int

registers_per_thread#

The number of registers required for each thread executing the kernel.

Type:

int

reserved0#

Undefined. Reserved for internal use.

Type:

int

shared_memory_carveout_requested#

Shared memory carveout value requested for the function in percentage of the total resource. The value will be updated only if field isSharedMemoryCarveoutRequested is set.

Type:

int

shared_memory_config#

The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.

Type:

int

shared_memory_executed#

Shared memory size set by the driver.

Type:

int

shmem_limit_config#

The shared memory limit config for the kernel. This field shows whether user has opted for a higher per block limit of dynamic shared memory.

Type:

int

start#

The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

Type:

int

static_shared_memory#

The static shared memory allocated for the kernel, in bytes.

Type:

int

stream_id#

The ID of the stream where the kernel is executing.

Type:

int

submitted#

The timestamp when the command buffer containing the kernel launch is submitted to the GPU, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the submitted time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.

Type:

int
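The queued, submitted, start, and end members of ActivityKernel10 describe successive stages of a kernel launch, and some of them may be unknown. The following is a minimal sketch of computing the intervals between them. It assumes that CUPTI_TIMESTAMP_UNKNOWN is 0, as in the CUPTI C headers; that value should be verified for the CUPTI version in use. The record objects are hypothetical SimpleNamespace stand-ins with made-up values, and the helper names are not part of the CUPTI Python API.

```python
from types import SimpleNamespace
from typing import Dict, Optional

# Assumption: CUPTI_TIMESTAMP_UNKNOWN is 0; verify against your CUPTI version.
TIMESTAMP_UNKNOWN = 0


def interval_ns(begin: int, end: int) -> Optional[int]:
    """Return end - begin in ns, or None if either timestamp is unknown."""
    if begin == TIMESTAMP_UNKNOWN or end == TIMESTAMP_UNKNOWN:
        return None
    return end - begin


def kernel_timeline(kernel) -> Dict[str, Optional[int]]:
    """Break a kernel record into launch-latency and execution intervals."""
    return {
        "queue_to_submit": interval_ns(kernel.queued, kernel.submitted),
        "submit_to_start": interval_ns(kernel.submitted, kernel.start),
        "execution": interval_ns(kernel.start, kernel.end),
    }


# Latency timestamps are not collected by default, so queued/submitted may be unknown.
default_record = SimpleNamespace(queued=0, submitted=0, start=100, end=350)
latency_record = SimpleNamespace(queued=10, submitted=40, start=100, end=350)

print(kernel_timeline(default_record))
print(kernel_timeline(latency_record))
```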

class cupti.cupti.ActivityKind(value)#

Bases: IntEnum

See CUpti_ActivityKind.

BRANCH = 16#
CDP_KERNEL = 18#
CONCURRENT_KERNEL = 10#
CONTEXT = 9#
COUNT = 56#
CUDA_EVENT = 36#
DEVICE = 8#
DEVICE_ATTRIBUTE = 28#
DEVICE_GRAPH_TRACE = 53#
DRIVER = 4#
ENVIRONMENT = 20#
EVENT = 6#
EVENT_INSTANCE = 21#
EXTERNAL_CORRELATION = 39#
FORCE_INT = 2147483647#
FUNCTION = 26#
GLOBAL_ACCESS = 15#
GRAPH_TRACE = 51#
INSTANTANEOUS_EVENT = 41#
INSTANTANEOUS_EVENT_INSTANCE = 42#
INSTANTANEOUS_METRIC = 43#
INSTANTANEOUS_METRIC_INSTANCE = 44#
INSTRUCTION_CORRELATION = 32#
INSTRUCTION_EXECUTION = 24#
INTERNAL_LAUNCH_API = 48#
INVALID = 0#
JIT = 52#
KERNEL = 3#
MARKER = 12#
MARKER_DATA = 13#
MEMCPY = 1#
MEMCPY2 = 22#
MEMORY = 45#
MEMORY2 = 49#
MEMORY_POOL = 50#
MEMSET = 2#
MEM_DECOMPRESS = 54#
METRIC = 7#
METRIC_INSTANCE = 23#
MODULE = 27#
NAME = 11#
OPENACC_DATA = 33#
OPENACC_LAUNCH = 34#
OPENACC_OTHER = 35#
OPENMP = 47#
OVERHEAD = 17#
PCIE = 46#
PC_SAMPLING = 30#
PC_SAMPLING_RECORD_INFO = 31#
PREEMPTION = 19#
ROTATION = 55#
RUNTIME = 5#
SHARED_ACCESS = 29#
SOURCE_LOCATOR = 14#
STREAM = 37#
SYNCHRONIZATION = 38#
UNIFIED_MEMORY_COUNTER = 25#
class cupti.cupti.ActivityLaunchType(value)#

Bases: IntEnum

See CUpti_ActivityLaunchType.

CBL_COMMANDLIST = 3#
COOPERATIVE_MULTI_DEVICE = 2#
COOPERATIVE_SINGLE_DEVICE = 1#
REGULAR = 0#
class cupti.cupti.ActivityMarker2#

Bases: object

Empty-initialize an instance of CUpti_ActivityMarker2.

domain#

The name of the domain to which this marker belongs. This will be NULL for the default domain.

Type:

str

flags_#

The flags associated with the marker. See CUpti_ActivityFlag.

Type:

int

id#

The marker ID.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_MARKER.

Type:

int

name#

The marker name for an instantaneous or start marker. This will be NULL for an end marker.

Type:

str

object_id#

The identifier for the activity object associated with this marker. ‘objectKind’ indicates which ID is valid for this record.

Type:

ActivityObjectKindId

object_kind#

The kind of activity object associated with this marker.

Type:

int

pad#

Undefined. Reserved for internal use.

Type:

int

ptr#

Get the pointer address to the data as Python int.

timestamp#

The timestamp for the marker, in ns. A value of 0 indicates that timestamp information could not be collected for the marker.

Type:

int

class cupti.cupti.ActivityMarkerData#

Bases: object

Empty-initialize an instance of CUpti_ActivityMarkerData.

category#

The category for the marker.

Type:

int

color#

The color for the marker.

Type:

int

flags_#

The flags associated with the marker. See CUpti_ActivityFlag.

Type:

int

id#

The marker ID.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_MARKER_DATA.

Type:

int

payload#

The payload value.

Type:

MetricValue

payload_kind#

Defines the payload format for the value associated with the marker.

Type:

int

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti.ActivityMemDecompress#

Bases: object

Empty-initialize an instance of CUpti_ActivityMemDecompress.

channel_id#

The ID of the HW channel on which the memory copy is occurring.

Type:

int

channel_type#

The type of the channel

Type:

int

context_id#

The ID of the context.

Type:

int

correlation_id#

The correlation ID of the decompression operations. Each operation is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the operation.

Type:

int

device_id#

The ID of the device.

Type:

int

end#

The end timestamp. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the end time is unknown.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_MEM_DECOMPRESS

Type:

int

number_of_operations#

The number of operations in the batch.

Type:

int

ptr#

Get the pointer address to the data as Python int.

reserved0#

This field is reserved for internal use

Type:

int

source_bytes#

The number of bytes to be read and decompressed in the batch operation.

Type:

int

start#

The start timestamp. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the start time is unknown.

Type:

int

stream_id#

The ID of the stream.

Type:

int

class cupti.cupti.ActivityMemcpy6#

Bases: object

Empty-initialize an instance of CUpti_ActivityMemcpy6.

bytes#

The number of bytes transferred by the memory copy.

Type:

int

channel_id#

The ID of the HW channel on which the memory copy is occurring.

Type:

int

channel_type#

The type of the channel

Type:

int

context_id#

The ID of the context where the memory copy is occurring.

Type:

int

copy_count#

The total number of memcopy operations traced in this record. This field is valid for memcpy operations happening using MemcpyBatchAsync APIs in CUDA. In MemcpyBatchAsync APIs, multiple memcpy operations are batched together for optimization purposes based on certain heuristics. For other memcpy operations, this field will be 1.

Type:

int

copy_kind#

The kind of the memory copy, stored as a byte to reduce record size. See CUpti_ActivityMemcpyKind.

Type:

int

correlation_id#

The correlation ID of the memory copy. Each memory copy is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the memory copy.

Type:

int

device_id#

The ID of the device where the memory copy is occurring.

Type:

int

dst_kind#

The destination memory kind written by the memory copy, stored as a byte to reduce record size. See CUpti_ActivityMemoryKind.

Type:

int

end#

The end timestamp for the memory copy, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the memory copy.

Type:

int

flags_#

The flags associated with the memory copy. See CUpti_ActivityFlag.

Type:

int

graph_id#

The unique ID of the graph that executed this memcpy through graph launch. This field will be 0 if the memcpy is not done through graph launch.

Type:

int

graph_node_id#

The unique ID of the graph node that executed this memcpy through graph launch. This field will be 0 if the memcpy is not done through graph launch.

Type:

int

is_device_launched#

This field is used to indicate if the memcpy operation is part of a device graph launch.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_MEMCPY.

Type:

int

pad#

Undefined. Reserved for internal use.

Type:

int

pad2#

(array of length 3). Reserved for internal use.

Type:

uint8

ptr#

Get the pointer address to the data as Python int.

reserved0#

Undefined. Reserved for internal use.

Type:

int

runtime_correlation_id#

The runtime correlation ID of the memory copy. Each memory copy is assigned a unique runtime correlation ID that is identical to the correlation ID in the runtime API activity record that launched the memory copy.

Type:

int

src_kind#

The source memory kind read by the memory copy, stored as a byte to reduce record size. See CUpti_ActivityMemoryKind.

Type:

int

start#

The start timestamp for the memory copy, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the memory copy.

Type:

int

stream_id#

The ID of the stream where the memory copy is occurring.

Type:

int
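
The start, end and bytes members are enough to derive a copy duration and an effective throughput. The following is a minimal sketch of that arithmetic, including the documented case where both timestamps are 0. It uses a `SimpleNamespace` as a hypothetical stand-in for an ActivityMemcpy6 record, and the helper names are illustrative only.

```python
from types import SimpleNamespace
from typing import Optional


def memcpy_duration_ns(record) -> Optional[int]:
    """Return end - start in ns, or None when both timestamps are 0 (not collected)."""
    if record.start == 0 and record.end == 0:
        return None
    return record.end - record.start


def memcpy_throughput_gb_per_s(record) -> Optional[float]:
    """Return bytes / duration in GB/s, or None if no positive duration is available.

    One byte per nanosecond equals 1e9 bytes per second, i.e. 1 GB/s (decimal).
    """
    duration = memcpy_duration_ns(record)
    if not duration or duration <= 0:
        return None
    return record.bytes / duration


# Hypothetical record standing in for an ActivityMemcpy6 instance.
rec = SimpleNamespace(bytes=4096, start=1_000, end=3_048)
print(memcpy_duration_ns(rec), memcpy_throughput_gb_per_s(rec))
```

Returning None rather than 0 keeps "timestamps were not collected" distinguishable from a genuinely instantaneous copy.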

class cupti.cupti.ActivityMemcpyKind(value)#

Bases: IntEnum

See CUpti_ActivityMemcpyKind.

ATOA = 5#
ATOD = 6#
ATOH = 4#
DTOA = 7#
DTOD = 8#
DTOH = 2#
FORCE_INT = 2147483647#
HTOA = 3#
HTOD = 1#
HTOH = 9#
PTOP = 10#
UNKNOWN = 0#
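
The copy_kind member of the memcpy records is a raw int that corresponds to this enumeration. The following is a minimal sketch of decoding it into a source/destination pair. It assumes a local mirror of the values listed above (`MemcpyKind`, a hypothetical name) so it runs without CUPTI installed, and it assumes the usual reading of the letters (H for host, D for device, A for device array), which should be checked against the CUPTI C documentation.

```python
from enum import IntEnum
from typing import Optional, Tuple


class MemcpyKind(IntEnum):
    """Local mirror of the cupti.cupti.ActivityMemcpyKind values listed above."""
    UNKNOWN = 0
    HTOD = 1
    DTOH = 2
    HTOA = 3
    ATOH = 4
    ATOA = 5
    ATOD = 6
    DTOA = 7
    DTOD = 8
    HTOH = 9
    PTOP = 10


def split_direction(copy_kind: int) -> Optional[Tuple[str, str]]:
    """Split a member name such as HTOD into ('H', 'D').

    UNKNOWN and PTOP carry no host/device/array direction and return None.
    """
    kind = MemcpyKind(copy_kind)
    if kind in (MemcpyKind.UNKNOWN, MemcpyKind.PTOP):
        return None
    src, dst = kind.name.split("TO")
    return src, dst


print(split_direction(1))
print(split_direction(10))
```
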
class cupti.cupti.ActivityMemcpyPtoP4#

Bases: object

Empty-initialize an instance of CUpti_ActivityMemcpyPtoP4.

bytes#

The number of bytes transferred by the memory copy.

Type:

int

channel_id#

The ID of the HW channel on which the memory copy is occurring.

Type:

int

channel_type#

The type of the channel

Type:

int

context_id#

The ID of the context where the memory copy is occurring.

Type:

int

copy_kind#

The kind of the memory copy, stored as a byte to reduce record size. See CUpti_ActivityMemcpyKind.

Type:

int

correlation_id#

The correlation ID of the memory copy. Each memory copy is assigned a unique correlation ID that is identical to the correlation ID in the driver and runtime API activity record that launched the memory copy.

Type:

int

device_id#

The ID of the device where the memory copy is occurring.

Type:

int

dst_context_id#

The ID of the context owning the memory being copied to.

Type:

int

dst_device_id#

The ID of the device where memory is being copied to.

Type:

int

dst_kind#

The destination memory kind written by the memory copy, stored as a byte to reduce record size. See CUpti_ActivityMemoryKind.

Type:

int

end#

The end timestamp for the memory copy, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the memory copy.

Type:

int

flags_#

The flags associated with the memory copy. See CUpti_ActivityFlag.

Type:

int

graph_id#

The unique ID of the graph that executed this memcpy through graph launch. This field will be 0 if the memcpy is not done through graph launch.

Type:

int

graph_node_id#

The unique ID of the graph node that executed the memcpy through graph launch. This field will be 0 if memcpy is not done using graph launch.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_MEMCPY2.

Type:

int

ptr#

Get the pointer address to the data as Python int.

reserved0#

Undefined. Reserved for internal use.

Type:

int

src_context_id#

The ID of the context owning the memory being copied from.

Type:

int

src_device_id#

The ID of the device where memory is being copied from.

Type:

int

src_kind#

The source memory kind read by the memory copy, stored as a byte to reduce record size. See CUpti_ActivityMemoryKind.

Type:

int

start#

The start timestamp for the memory copy, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the memory copy.

Type:

int

stream_id#

The ID of the stream where the memory copy is occurring.

Type:

int

class cupti.cupti.ActivityMemory#

Bases: object

Empty-initialize an instance of CUpti_ActivityMemory.

address#

The virtual address of the allocation

Type:

int

alloc_pc#

The program counter of the allocation of memory

Type:

int

bytes#

The number of bytes of memory allocated.

Type:

int

context_id#

The ID of the context. If context is NULL, `contextId` is set to CUPTI_INVALID_CONTEXT_ID.

Type:

int

device_id#

The ID of the device where the memory allocation is taking place.

Type:

int

end#

The end timestamp for the memory operation, i.e. the time when memory was freed, in ns. This will be 0 if memory is not freed in the application

Type:

int

free_pc#

The program counter of the freeing of memory. This will be 0 if memory is not freed in the application

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_MEMORY

Type:

int

memory_kind#

The memory kind requested by the user

Type:

int

name#

Variable name. This name is shared across all activity records representing the same symbol, and so should not be modified.

Type:

str

pad#

Undefined. Reserved for internal use.

Type:

int

process_id#

The ID of the process to which this record belongs.

Type:

int

ptr#

Get the pointer address to the data as Python int.

start#

The start timestamp for the memory operation, i.e. the time when memory was allocated, in ns.

Type:

int
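
Because end is 0 when the memory is never freed, the lifetime of an allocation cannot be computed by a blind subtraction. The following is a minimal sketch of the check, using a `SimpleNamespace` as a hypothetical stand-in for an ActivityMemory record; the helper name is illustrative.

```python
from types import SimpleNamespace
from typing import Optional


def allocation_lifetime_ns(record) -> Optional[int]:
    """Return end - start in ns, or None if the memory was never freed (end == 0)."""
    if record.end == 0:
        return None
    return record.end - record.start


# Hypothetical records standing in for ActivityMemory instances.
freed = SimpleNamespace(start=5_000, end=9_500)
leaked = SimpleNamespace(start=5_000, end=0)
print(allocation_lifetime_ns(freed), allocation_lifetime_ns(leaked))
```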

class cupti.cupti.ActivityMemory4#

Bases: object

Empty-initialize an instance of CUpti_ActivityMemory4.

address#

The virtual address of the allocation. The base address of the memory pool.

Type:

int

bytes#

The number of bytes of memory allocated.

Type:

int

context_id#

The ID of the context. If context is NULL, `contextId` is set to CUPTI_INVALID_CONTEXT_ID.

Type:

int

correlation_id#

The correlation ID of the memory operation. Each memory operation is assigned a unique correlation ID that is identical to the correlation ID in the driver and runtime API activity record that launched the memory operation.

Type:

int

device_id#

The ID of the device where the memory operation is taking place.

Type:

int

is_async#

`isAsync` is set if memory operation happens through async memory APIs.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_MEMORY2

Type:

int

memory_kind#

The memory kind requested by the user, CUpti_ActivityMemoryKind.

Type:

int

memory_operation_type#

The memory operation requested by the user, CUpti_ActivityMemoryOperationType.

Type:

int

memory_pool_config#

The memory pool configuration used for the memory operations.

Type:

_py_anon_pod5

name#

Variable name. This name is shared across all activity records representing the same symbol, and so should not be modified.

Type:

str

pad1#

Undefined. Reserved for internal use.

Type:

int

pc#

The program counter of the memory operation.

Type:

int

process_id#

Type:

int

ptr#

Get the pointer address to the data as Python int.

source#

The shared object or binary that the memory allocation request comes from.

Type:

str

stream_id#

The ID of the stream. If memory operation is not async, `streamId` is set to CUPTI_INVALID_STREAM_ID.

Type:

int

timestamp#

The start timestamp for the memory operation, in ns.

Type:

int

class cupti.cupti.ActivityMemoryKind(value)#

Bases: IntEnum

See CUpti_ActivityMemoryKind.

ARRAY = 4#
DEVICE = 3#
DEVICE_STATIC = 6#
FORCE_INT = 2147483647#
MANAGED = 5#
MANAGED_STATIC = 7#
PAGEABLE = 1#
PINNED = 2#
UNKNOWN = 0#
class cupti.cupti.ActivityMemoryOperationType(value)#

Bases: IntEnum

See CUpti_ActivityMemoryOperationType.

ALLOCATION = 1#
FORCE_INT = 2147483647#
INVALID = 0#
RELEASE = 2#
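
Records that carry a memory_operation_type, such as ActivityMemory4, report allocations and releases separately, so outstanding allocations have to be reconstructed by the consumer. The following is a minimal sketch of one way to do that. It assumes a local mirror of the enum values listed above (`MemoryOperationType`, a hypothetical name), uses `SimpleNamespace` objects as stand-ins for the records, and assumes that a release can be matched to its allocation by address.

```python
from enum import IntEnum
from types import SimpleNamespace


class MemoryOperationType(IntEnum):
    """Local mirror of the cupti.cupti.ActivityMemoryOperationType values listed above."""
    INVALID = 0
    ALLOCATION = 1
    RELEASE = 2


def live_allocations(records):
    """Return {address: bytes} for allocations that have no later release."""
    live = {}
    # Process records in timestamp order so a release follows its allocation.
    for rec in sorted(records, key=lambda r: r.timestamp):
        if rec.memory_operation_type == MemoryOperationType.ALLOCATION:
            live[rec.address] = rec.bytes
        elif rec.memory_operation_type == MemoryOperationType.RELEASE:
            live.pop(rec.address, None)
    return live


# Hypothetical records standing in for ActivityMemory4 instances.
records = [
    SimpleNamespace(timestamp=10, address=0x1000, bytes=256, memory_operation_type=1),
    SimpleNamespace(timestamp=20, address=0x2000, bytes=512, memory_operation_type=1),
    SimpleNamespace(timestamp=30, address=0x1000, bytes=256, memory_operation_type=2),
]
print(live_allocations(records))
```
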
class cupti.cupti.ActivityMemoryPool3#

Bases: object

Empty-initialize an instance of CUpti_ActivityMemoryPool3.

address#

The virtual address of the allocation.

Type:

int

correlation_id#

The correlation ID of the memory pool operation. Each memory pool operation is assigned a unique correlation ID that is identical to the correlation ID in the driver and runtime API activity record that launched the memory operation.

Type:

int

device_id#

The ID of the device where the memory pool is created.

Type:

int

is_managed_pool#

Whether the pool is of managed memory allocation or pinned memory allocation. If it is 0, it is pinned and if it is 1, the memory pool allocation is of managed memory type.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_MEMORY_POOL

Type:

int

memory_pool_operation_type#

The memory operation requested by the user, CUpti_ActivityMemoryPoolOperationType.

Type:

int

memory_pool_type#

The type of the memory pool. See CUpti_ActivityMemoryPoolType.

Type:

int

min_bytes_to_keep#

The minimum number of bytes to keep in the memory pool. `minBytesToKeep` is valid for the operation type CUPTI_ACTIVITY_MEMORY_POOL_OPERATION_TYPE_TRIMMED. See CUpti_ActivityMemoryPoolOperationType.

Type:

int

pad2#

(array of length 7). Undefined. Reserved for internal use.

Type:

uint8

process_id#

The ID of the process to which this record belongs.

Type:

int

ptr#

Get the pointer address to the data as Python int.

release_threshold#

The release threshold of the memory pool. `releaseThreshold` is valid for CUPTI_ACTIVITY_MEMORY_POOL_TYPE_LOCAL, CUpti_ActivityMemoryPoolType.

Type:

int

size_#

The size of the memory pool operation in bytes. `size` is valid for CUPTI_ACTIVITY_MEMORY_POOL_TYPE_LOCAL, CUpti_ActivityMemoryPoolType.

Type:

int

timestamp#

The start timestamp for the memory operation, in ns.

Type:

int

utilized_size#

The utilized size of the memory pool. `utilizedSize` is valid for CUPTI_ACTIVITY_MEMORY_POOL_TYPE_LOCAL, CUpti_ActivityMemoryPoolType.

Type:

int

class cupti.cupti.ActivityMemoryPoolOperationType(value)#

Bases: IntEnum

See CUpti_ActivityMemoryPoolOperationType.

CREATED = 1#
DESTROYED = 2#
FORCE_INT = 2147483647#
INVALID = 0#
TRIMMED = 3#
class cupti.cupti.ActivityMemoryPoolType(value)#

Bases: IntEnum

See CUpti_ActivityMemoryPoolType.

FORCE_INT = 2147483647#
IMPORTED = 2#
INVALID = 0#
LOCAL = 1#
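
Several ActivityMemoryPool3 members are documented as valid only for a particular pool type or operation type. The following is a minimal sketch of a helper that returns only the members that are meaningful for a given record. It assumes local mirrors of the two enums listed above (`PoolOperationType` and `PoolType`, hypothetical names) and a `SimpleNamespace` stand-in for the record.

```python
from enum import IntEnum
from types import SimpleNamespace


class PoolOperationType(IntEnum):
    """Local mirror of cupti.cupti.ActivityMemoryPoolOperationType."""
    INVALID = 0
    CREATED = 1
    DESTROYED = 2
    TRIMMED = 3


class PoolType(IntEnum):
    """Local mirror of cupti.cupti.ActivityMemoryPoolType."""
    INVALID = 0
    LOCAL = 1
    IMPORTED = 2


def valid_pool_fields(record) -> dict:
    """Collect only the members documented as valid for this record."""
    fields = {}
    if record.memory_pool_type == PoolType.LOCAL:
        # Documented as valid for local pools.
        fields["release_threshold"] = record.release_threshold
        fields["size_"] = record.size_
        fields["utilized_size"] = record.utilized_size
    if record.memory_pool_operation_type == PoolOperationType.TRIMMED:
        # Documented as valid for the trim operation.
        fields["min_bytes_to_keep"] = record.min_bytes_to_keep
    return fields


# Hypothetical record standing in for an ActivityMemoryPool3 instance.
rec = SimpleNamespace(
    memory_pool_type=2, memory_pool_operation_type=3,
    release_threshold=0, size_=0, utilized_size=0, min_bytes_to_keep=1024,
)
print(valid_pool_fields(rec))
```
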
class cupti.cupti.ActivityMemset4#

Bases: object

Empty-initialize an instance of CUpti_ActivityMemset4.

bytes#

The number of bytes being set by the memory set.

Type:

int

channel_id#

The ID of the HW channel on which the memory set is occurring.

Type:

int

channel_type#

The type of the channel

Type:

int

context_id#

The ID of the context where the memory set is occurring.

Type:

int

correlation_id#

The correlation ID of the memory set. Each memory set is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the memory set.

Type:

int

device_id#

The ID of the device where the memory set is occurring.

Type:

int

end#

The end timestamp for the memory set, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the memory set.

Type:

int

flags_#

The flags associated with the memset. CUpti_ActivityFlag

Type:

int

graph_id#

The unique ID of the graph that executed this memset through graph launch. This field will be 0 if the memset is not executed through graph launch.

Type:

int

graph_node_id#

The unique ID of the graph node that executed this memset through graph launch. This field will be 0 if the memset is not executed through graph launch.

Type:

int

is_device_launched#

This field is used to indicate if the memset operation is part of a device graph launch.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_MEMSET.

Type:

int

memory_kind#

The memory kind of the memory set. See CUpti_ActivityMemoryKind.

Type:

int

pad#

Undefined. Reserved for internal use.

Type:

int

pad2#

(array of length 3). Undefined. Reserved for internal use.

Type:

uint8

ptr#

Get the pointer address to the data as Python int.

reserved0#

Undefined. Reserved for internal use.

Type:

int

start#

The start timestamp for the memory set, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the memory set.

Type:

int

stream_id#

The ID of the stream where the memory set is occurring.

Type:

int

value#

The value being assigned to memory by the memory set.

Type:

int

class cupti.cupti.ActivityModule#

Bases: object

Empty-initialize an instance of CUpti_ActivityModule.

context_id#

The ID of the context where the module is loaded.

Type:

int

cubin#

The pointer to cubin.

Type:

int

cubin_size#

The cubin size.

Type:

int

id#

The module ID.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_MODULE.

Type:

int

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti.ActivityName#

Bases: object

Empty-initialize an instance of CUpti_ActivityName.

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_NAME.

Type:

int

name#

The name.

Type:

str

object_id#

The identifier for the activity object. ‘objectKind’ indicates which ID is valid for this record.

Type:

ActivityObjectKindId

object_kind#

The kind of activity object being named.

Type:

int

pad#

Undefined. Reserved for internal use.

Type:

int

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti.ActivityObjectKind(value)#

Bases: IntEnum

See CUpti_ActivityObjectKind.

CONTEXT = 4#
DEVICE = 3#
FORCE_INT = 2147483647#
PROCESS = 1#
STREAM = 5#
THREAD = 2#
UNKNOWN = 0#
class cupti.cupti.ActivityObjectKindId#

Bases: object

Empty-initialize an instance of CUpti_ActivityObjectKindId.

dcs#

A device object requires that we identify the device ID. A context object requires that we identify both the device and context ID. A stream object requires that we identify device, context, and stream ID.

Type:

_py_anon_pod4

pt#

A process object requires that we identify the process ID. A thread object requires that we identify both the process and thread ID.

Type:

_py_anon_pod3

ptr#

Get the pointer address to the data as Python int.
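
Which of the two members is meaningful depends on the object kind stored next to the ID (for example, the object_kind member of ActivityName). The following is a minimal sketch of that selection. It assumes a local mirror of the ActivityObjectKind values (`ObjectKind`, a hypothetical name) and returns only the member name, because the members of the inner _py_anon_pod* classes are not documented here.

```python
from enum import IntEnum
from typing import Optional


class ObjectKind(IntEnum):
    """Local mirror of the cupti.cupti.ActivityObjectKind values."""
    UNKNOWN = 0
    PROCESS = 1
    THREAD = 2
    DEVICE = 3
    CONTEXT = 4
    STREAM = 5


def valid_union_member(object_kind: int) -> Optional[str]:
    """Return 'pt', 'dcs', or None for kinds that identify no object."""
    kind = ObjectKind(object_kind)
    if kind in (ObjectKind.PROCESS, ObjectKind.THREAD):
        return "pt"   # process/thread identifiers
    if kind in (ObjectKind.DEVICE, ObjectKind.CONTEXT, ObjectKind.STREAM):
        return "dcs"  # device/context/stream identifiers
    return None


print(valid_union_member(2))
print(valid_union_member(5))
```

With a real record, the result could be passed to `getattr(record.object_id, ...)` to reach the valid member.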

class cupti.cupti.ActivityOpenAccData#

Bases: object

Empty-initialize an instance of CUpti_ActivityOpenAccData.

async_#

Type:

int

async_map#

Type:

int

bytes#

Number of bytes

Type:

int

cu_context_id#

CUDA context ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_device_id#

CUDA device ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_process_id#

The ID of the process where the OpenACC activity is executing.

Type:

int

cu_stream_id#

CUDA stream ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_thread_id#

The ID of the thread where the OpenACC activity is executing.

Type:

int

device_number#

Type:

int

device_ptr#

Device pointer if available

Type:

int

device_type#

Type:

int

end#

CUPTI end timestamp

Type:

int

end_line_no#

Type:

int

event_kind#

CUPTI OpenACC event kind (CUpti_OpenAccEventKind)

Type:

int

external_id#

The OpenACC correlation ID. Valid only if deviceType is acc_device_nvidia. If not 0, it uniquely identifies this record. It is identical to the externalId in the preceding external correlation record of type CUPTI_EXTERNAL_CORRELATION_KIND_OPENACC.

Type:

int

func_end_line_no#

Type:

int

func_line_no#

Type:

int

func_name#

Type:

str

host_ptr#

Host pointer if available

Type:

int

implicit#

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_OPENACC_DATA.

Type:

int

line_no#

Type:

int

parent_construct#

Type:

int

ptr#

Get the pointer address to the data as Python int.

src_file#

Type:

str

start#

CUPTI start timestamp

Type:

int

thread_id#

ThreadId

Type:

int

var_name#

Type:

str

version#

Type:

int

class cupti.cupti.ActivityOpenAccLaunch#

Bases: object

Empty-initialize an instance of CUpti_ActivityOpenAccLaunch.

async_#

Value of async() clause of the corresponding directive

Type:

int

async_map#

Internal asynchronous queue number used

Type:

int

cu_context_id#

CUDA context ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_device_id#

CUDA device ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_process_id#

The ID of the process where the OpenACC activity is executing.

Type:

int

cu_stream_id#

CUDA stream ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_thread_id#

The ID of the thread where the OpenACC activity is executing.

Type:

int

device_number#

Device number

Type:

int

device_type#

Device type

Type:

int

end#

CUPTI end timestamp

Type:

int

end_line_no#

For an OpenACC construct, this contains the line number of the end of the construct. A negative or zero value means the line number is not known.

Type:

int

event_kind#

CUPTI OpenACC event kind (CUpti_OpenAccEventKind)

Type:

int

external_id#

The OpenACC correlation ID. Valid only if deviceType is acc_device_nvidia. If not 0, it uniquely identifies this record. It is identical to the externalId in the preceding external correlation record of type CUPTI_EXTERNAL_CORRELATION_KIND_OPENACC.

Type:

int

func_end_line_no#

The last line number of the function named in func_name. A negative or zero value means the line number is not known.

Type:

int

func_line_no#

The line number of the first line of the function named in func_name. A negative or zero value means the line number is not known.

Type:

int

func_name#

A pointer to a null-terminated string containing the name of the function in which the event occurred.

Type:

str

implicit#

1 for any implicit event, such as an implicit wait at a synchronous data construct; 0 otherwise.

Type:

int

kernel_name#

A pointer to null-terminated string containing the name of the kernel being launched, if known, or a null pointer if not.

Type:

str

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_OPENACC_LAUNCH.

Type:

int

line_no#

The line number of the directive or program construct or the starting line number of the OpenACC construct corresponding to the event. A negative or zero value means the line number is not known.

Type:

int

num_gangs#

The number of gangs created for this kernel launch

Type:

int

num_workers#

The number of workers created for this kernel launch

Type:

int

parent_construct#

CUPTI OpenACC parent construct kind (CUpti_OpenAccConstructKind). Note that for applications using PGI OpenACC runtime < 16.1, this will always be CUPTI_OPENACC_CONSTRUCT_KIND_UNKNOWN.

Type:

int

ptr#

Get the pointer address to the data as Python int.

src_file#

A pointer to null-terminated string containing the name of or path to the source file, if known, or a null pointer if not.

Type:

str

start#

CUPTI start timestamp

Type:

int

thread_id#

ThreadId

Type:

int

vector_length#

The number of vector lanes created for this kernel launch

Type:

int

version#

Version number

Type:

int
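
The source-location members (src_file, line_no, func_name) can each be unknown: line numbers are documented as unknown when negative or zero, and the strings may be null pointers. The following is a minimal sketch of formatting a location defensively. It uses a `SimpleNamespace` as a hypothetical stand-in for an ActivityOpenAccLaunch record and assumes that a null string surfaces in Python as None or an empty string, which this reference does not state.

```python
from types import SimpleNamespace


def format_location(record) -> str:
    """Build 'file:line (function)', substituting '?' for unknown parts."""
    # Assumption: a null C string arrives as None or "" in Python.
    src = record.src_file or "?"
    # A negative or zero line number means the line is not known.
    line = str(record.line_no) if record.line_no > 0 else "?"
    func = record.func_name or "?"
    return f"{src}:{line} ({func})"


# Hypothetical records standing in for ActivityOpenAccLaunch instances.
known = SimpleNamespace(src_file="saxpy.c", line_no=42, func_name="main")
unknown = SimpleNamespace(src_file=None, line_no=0, func_name="")
print(format_location(known))
print(format_location(unknown))
```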

class cupti.cupti.ActivityOpenAccOther#

Bases: object

Empty-initialize an instance of CUpti_ActivityOpenAccOther.

async_#

Value of async() clause of the corresponding directive

Type:

int

async_map#

Internal asynchronous queue number used

Type:

int

cu_context_id#

CUDA context ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_device_id#

CUDA device ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_process_id#

The ID of the process where the OpenACC activity is executing.

Type:

int

cu_stream_id#

CUDA stream ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_thread_id#

The ID of the thread where the OpenACC activity is executing.

Type:

int

device_number#

Device number

Type:

int

device_type#

Device type

Type:

int

end#

CUPTI end timestamp

Type:

int

end_line_no#

For an OpenACC construct, this contains the line number of the end of the construct. A negative or zero value means the line number is not known.

Type:

int

event_kind#

CUPTI OpenACC event kind (CUpti_OpenAccEventKind)

Type:

int

external_id#

The OpenACC correlation ID. Valid only if deviceType is acc_device_nvidia. If not 0, it uniquely identifies this record. It is identical to the externalId in the preceding external correlation record of type CUPTI_EXTERNAL_CORRELATION_KIND_OPENACC.

Type:

int

func_end_line_no#

The last line number of the function named in func_name. A negative or zero value means the line number is not known.

Type:

int

func_line_no#

The line number of the first line of the function named in func_name. A negative or zero value means the line number is not known.

Type:

int

func_name#

A pointer to a null-terminated string containing the name of the function in which the event occurred.

Type:

str

implicit#

1 for any implicit event, such as an implicit wait at a synchronous data construct; 0 otherwise.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_OPENACC_OTHER.

Type:

int

line_no#

The line number of the directive or program construct or the starting line number of the OpenACC construct corresponding to the event. A negative or zero value means the line number is not known.

Type:

int

parent_construct#

CUPTI OpenACC parent construct kind (CUpti_OpenAccConstructKind). Note that for applications using PGI OpenACC runtime < 16.1, this will always be CUPTI_OPENACC_CONSTRUCT_KIND_UNKNOWN.

Type:

int

ptr#

Get the pointer address to the data as Python int.

src_file#

A pointer to null-terminated string containing the name of or path to the source file, if known, or a null pointer if not.

Type:

str

start#

CUPTI start timestamp

Type:

int

thread_id#

ThreadId

Type:

int

version#

Version number

Type:

int

class cupti.cupti.ActivityOpenMp#

Bases: object

Empty-initialize an instance of CUpti_ActivityOpenMp.

cu_process_id#

The ID of the process where the OpenMP activity is executing.

Type:

int

cu_thread_id#

The ID of the thread where the OpenMP activity is executing.

Type:

int

end#

CUPTI end timestamp

Type:

int

event_kind#

CUPTI OpenMP event kind (CUpti_OpenMpEventKind)

Type:

int

kind#

The kind of this activity.

Type:

int

ptr#

Get the pointer address to the data as Python int.

start#

CUPTI start timestamp

Type:

int

thread_id#

ThreadId

Type:

int

version#

Version number

Type:

int

class cupti.cupti.ActivityOverhead3#

Bases: object

Empty-initialize an instance of CUpti_ActivityOverhead3.

correlation_id#

The correlation ID of the overhead operation to which the records belong. This ID is identical to the correlation ID in the driver or runtime API activity record that launched the overhead operation. In some cases, it can be zero, such as for CUPTI_ACTIVITY_OVERHEAD_CUPTI_BUFFER_FLUSH records.

Type:

int

end#

The end timestamp for the overhead, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the overhead.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_OVERHEAD.

Type:

int

object_id#

The identifier for the activity object. ‘objectKind’ indicates which ID is valid for this record.

Type:

ActivityObjectKindId

object_kind#

The kind of activity object that the overhead is associated with.

Type:

int

overhead_data#

Pointer to the struct with additional details about the overhead. Refer to the CUpti_ActivityOverheadKind enum and the corresponding structure to typecast and access the additional overhead data. The client is responsible for freeing this memory using the free function when done.

Type:

int

overhead_kind#

The kind of overhead, CUPTI, DRIVER, COMPILER etc.

Type:

int

ptr#

Get the pointer address to the data as Python int.

reserved0#

Reserved for internal use.

Type:

int

start#

The start timestamp for the overhead, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the overhead.

Type:

int

class cupti.cupti.ActivityOverheadKind(value)#

Bases: IntEnum

See CUpti_ActivityOverheadKind.

ACTIVITY_BUFFER_REQUEST = 458752#
COMMAND_BUFFER_FULL = 393216#
CUPTI_BUFFER_FLUSH = 65536#
CUPTI_INSTRUMENTATION = 131072#
CUPTI_RESOURCE = 196608#
DRIVER_COMPILER = 1#
FORCE_INT = 2147483647#
LAZY_FUNCTION_LOADING = 327680#
RUNTIME_TRIGGERED_MODULE_LOADING = 262144#
UNKNOWN = 0#
UVM_ACTIVITY_INIT = 524288#
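
Overhead records carry an overhead_kind together with start and end timestamps, so a common first step is to total the time spent per kind. The following is a minimal sketch of that aggregation. It assumes a local mirror of the values listed above (`OverheadKind`, a hypothetical name) and `SimpleNamespace` objects as stand-ins for ActivityOverhead3 records; records whose timestamps are both 0 are skipped because no timing was collected for them.

```python
from collections import defaultdict
from enum import IntEnum
from types import SimpleNamespace


class OverheadKind(IntEnum):
    """Local mirror of the cupti.cupti.ActivityOverheadKind values listed above."""
    UNKNOWN = 0
    DRIVER_COMPILER = 1
    CUPTI_BUFFER_FLUSH = 65536
    CUPTI_INSTRUMENTATION = 131072
    CUPTI_RESOURCE = 196608
    RUNTIME_TRIGGERED_MODULE_LOADING = 262144
    LAZY_FUNCTION_LOADING = 327680
    COMMAND_BUFFER_FULL = 393216
    ACTIVITY_BUFFER_REQUEST = 458752
    UVM_ACTIVITY_INIT = 524288


def overhead_ns_by_kind(records) -> dict:
    """Sum end - start per overhead kind name, skipping untimed records."""
    totals = defaultdict(int)
    for rec in records:
        if rec.start == 0 and rec.end == 0:
            continue  # timestamps could not be collected
        totals[OverheadKind(rec.overhead_kind).name] += rec.end - rec.start
    return dict(totals)


# Hypothetical records standing in for ActivityOverhead3 instances.
records = [
    SimpleNamespace(overhead_kind=1, start=10, end=30),
    SimpleNamespace(overhead_kind=65536, start=0, end=0),
    SimpleNamespace(overhead_kind=1, start=100, end=150),
]
print(overhead_ns_by_kind(records))
```
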
class cupti.cupti.ActivityPCSamplingPeriod(value)#

Bases: IntEnum

See CUpti_ActivityPCSamplingPeriod.

FORCE_INT = 2147483647#
HIGH = 4#
INVALID = 0#
LOW = 2#
MAX = 5#
MID = 3#
MIN = 1#
class cupti.cupti.ActivityPCSamplingStallReason(value)#

Bases: IntEnum

See CUpti_ActivityPCSamplingStallReason.

CONSTANT_MEMORY_DEPENDENCY = 7#
EXEC_DEPENDENCY = 3#
FORCE_INT = 2147483647#
INST_FETCH = 2#
INVALID = 0#
MEMORY_DEPENDENCY = 4#
MEMORY_THROTTLE = 9#
NONE = 1#
NOT_SELECTED = 10#
OTHER = 11#
PIPE_BUSY = 8#
SLEEPING = 12#
SYNC = 6#
TEXTURE = 5#
class cupti.cupti.ActivityPartitionedGlobalCacheConfig(value)#

Bases: IntEnum

See CUpti_ActivityPartitionedGlobalCacheConfig.

FORCE_INT = 2147483647#
NOT_SUPPORTED = 1#
OFF = 2#
ON = 3#
UNKNOWN = 0#
class cupti.cupti.ActivityPreemption#

Bases: object

Empty-initialize an instance of CUpti_ActivityPreemption.

block_x#

The X-dimension of the block that is preempted

Type:

int

block_y#

The Y-dimension of the block that is preempted

Type:

int

block_z#

The Z-dimension of the block that is preempted

Type:

int

grid_id#

The grid-id of the block that is preempted

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_PREEMPTION

Type:

int

pad#

Undefined. Reserved for internal use.

Type:

int

preemption_kind#

The kind of the preemption.

Type:

int

ptr#

Get the pointer address to the data as Python int.

timestamp#

The timestamp of the preemption, in ns. A value of 0 indicates that timestamp information could not be collected for the preemption.

Type:

int

class cupti.cupti.ActivityPreemptionKind(value)#

Bases: IntEnum

See CUpti_ActivityPreemptionKind.

FORCE_INT = 2147483647#
RESTORE = 2#
SAVE = 1#
UNKNOWN = 0#
class cupti.cupti.ActivityStream#

Bases: object

Empty-initialize an instance of CUpti_ActivityStream.

context_id#

The ID of the context where the stream was created.

Type:

int

correlation_id#

The correlation ID of the API to which this result is associated.

Type:

int

flag#

Flags associated with the stream.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_STREAM.

Type:

int

priority#

The clamped priority for the stream.

Type:

int

ptr#

Get the pointer address to the data as Python int.

stream_id#

A unique stream ID to identify the stream.

Type:

int

class cupti.cupti.ActivityStreamFlag(value)#

Bases: IntEnum

See CUpti_ActivityStreamFlag.

FLAG_DEFAULT = 1#
FLAG_FORCE_INT = 2147483647#
FLAG_NON_BLOCKING = 2#
FLAG_NULL = 3#
FLAG_UNKNOWN = 0#
MASK = 65535#
class cupti.cupti.ActivitySynchronization2#

Bases: object

Empty-initialize an instance of CUpti_ActivitySynchronization2.

context_id#

The ID of the context for which the synchronization API is called. In case of context synchronization API it is the context id for which the API is called. In case of stream/event synchronization it is the ID of the context where the stream/event was created.

Type:

int

correlation_id#

The correlation ID of the API to which this result is associated.

Type:

int

cuda_event_id#

The event ID for which the synchronization API is called. A CUPTI_SYNCHRONIZATION_INVALID_VALUE value indicates that the field is not applicable for this record. Not valid for cuCtxSynchronize, cuStreamSynchronize.

Type:

int

cuda_event_sync_id#

A unique ID to associate event synchronization records with the latest CUDA Event record. A similar field is added to CUpti_ActivityCudaEvent2 to associate the synchronization record with the CUDA Event record. Because the same CUDA event can be used multiple times, the event ID alone is not unique enough to correlate the synchronization record with the latest CUDA Event record. This field is unique and can be used for that correlation. A CUPTI_SYNCHRONIZATION_INVALID_VALUE value indicates that the field is not applicable for this record. Valid only for synchronization records related to CUDA Events.

Type:

int

end#

The end timestamp for the function, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the function.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_SYNCHRONIZATION.

Type:

int

pad#

Undefined. Reserved for internal use.

Type:

int

ptr#

Get the pointer address to the data as Python int.

return_value#

The return value for the synchronization record. Use cuptiActivityEnableAllSyncRecords API to enable/disable collection of synchronization records with return value being non-zero. This will be a CUresult value.

Type:

int

start#

The start timestamp for the function, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the function.

Type:

int

stream_id#

The compute stream for which the synchronization API is called. A CUPTI_SYNCHRONIZATION_INVALID_VALUE value indicates that the field is not applicable for this record. Not valid for cuCtxSynchronize, cuEventSynchronize.

Type:

int

type#

The type of record.

Type:

int

class cupti.cupti.ActivitySynchronizationType(value)#

Bases: IntEnum

See CUpti_ActivitySynchronizationType.

CONTEXT_SYNCHRONIZE = 4#
EVENT_SYNCHRONIZE = 1#
FORCE_INT = 2147483647#
STREAM_SYNCHRONIZE = 3#
STREAM_WAIT_EVENT = 2#
UNKNOWN = 0#
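
As a rough illustration of how the type, start and end members of a synchronization record might be read together, the sketch below names the synchronization type and computes a duration, treating 0 for both timestamps as "not collected" as described above. The record is a SimpleNamespace stand-in, the enum is a local copy of the listed values, and summarize_sync is a hypothetical helper.

```python
from enum import IntEnum
from types import SimpleNamespace
from typing import Optional, Tuple


class ActivitySynchronizationType(IntEnum):
    """Local stand-in mirroring the values listed in this reference."""
    UNKNOWN = 0
    EVENT_SYNCHRONIZE = 1
    STREAM_WAIT_EVENT = 2
    STREAM_SYNCHRONIZE = 3
    CONTEXT_SYNCHRONIZE = 4
    FORCE_INT = 2147483647


def summarize_sync(record) -> Tuple[str, Optional[int]]:
    """Return (type name, duration in ns or None if timestamps are missing)."""
    try:
        type_name = ActivitySynchronizationType(record.type).name
    except ValueError:
        type_name = ActivitySynchronizationType.UNKNOWN.name
    # A value of 0 for both timestamps means they could not be collected.
    if record.start == 0 and record.end == 0:
        return type_name, None
    return type_name, record.end - record.start


# Stand-in for an ActivitySynchronization2 record (only the fields used here).
example = SimpleNamespace(type=3, start=1_000, end=4_500)
summary = summarize_sync(example)
```
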
class cupti.cupti.ActivityThreadIdType(value)#

Bases: IntEnum

See CUpti_ActivityThreadIdType.

DEFAULT = 0#
FORCE_INT = 2147483647#
SIZE = 2#
SYSTEM = 1#
class cupti.cupti.ActivityUnifiedMemoryAccessType(value)#

Bases: IntEnum

See CUpti_ActivityUnifiedMemoryAccessType.

ATOMIC = 3#
PREFETCH = 4#
READ = 1#
UNKNOWN = 0#
WRITE = 2#
class cupti.cupti.ActivityUnifiedMemoryCounter3#

Bases: object

Empty-initialize an instance of CUpti_ActivityUnifiedMemoryCounter3.

address#

The virtual base address of the page(s) being transferred. For CPU and GPU faults, the virtual address of the page that faulted.

Type:

int

counter_kind#

The Unified Memory counter kind

Type:

int

dst_id#

The ID of the destination CPU/device involved in the memory transfer or remote map operation. Ignore this field if counterKind is CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_GPU_PAGE_FAULT or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_CPU_PAGE_FAULT_COUNT or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THRASHING or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THROTTLING

Type:

int

end#

The end timestamp of the counter, in ns. Ignore this field if counterKind is CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_CPU_PAGE_FAULT_COUNT or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THRASHING or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_REMOTE_MAP. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_HTOD and CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_DTOH, the timestamp is captured when the activity finishes on the GPU. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_GPU_PAGE_FAULT, the timestamp is captured when the CUDA driver queues the replay of faulting memory accesses on the GPU. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THROTTLING, the timestamp is captured when the throttling operation is finished by the CUDA driver.

Type:

int

flags_#

The flags associated with this record. See CUpti_ActivityUnifiedMemoryAccessType if counterKind is CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_GPU_PAGE_FAULT; CUpti_ActivityUnifiedMemoryMigrationCause if counterKind is CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_HTOD or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_DTOH; CUpti_ActivityUnifiedMemoryRemoteMapCause if counterKind is CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_REMOTE_MAP; and CUpti_ActivityFlag if counterKind is CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THRASHING or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THROTTLING.

Type:

int

kind#

The activity record kind, must be CUPTI_ACTIVITY_KIND_UNIFIED_MEMORY_COUNTER

Type:

int

pad#

Undefined. Reserved for internal use.

Type:

int

process_id#

The ID of the process to which this record belongs.

Type:

int

processors#

(array of length 5). The bitmask of devices involved in the operation. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THRASHING, it is a bitwise ORing of the device IDs fighting for the memory region. processors[0] represents the device ID of the device 0 to device 63, processors[1] represents device ID of device 64 to device 127 and so on. Ignore this field if counterKind is CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_CPU_PAGE_FAULT_COUNT or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_GPU_PAGE_FAULT or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THROTTLING or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_REMOTE_MAP or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_HTOD or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_DTOH or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_DTOD or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_GPU_FAULT_REPLAY

Type:

uint64
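
The word layout described for processors (word 0 covers devices 0 to 63, word 1 covers devices 64 to 127, and so on) can be decoded roughly as follows. This is only a sketch: it assumes that bit i of processors[k] stands for device 64*k + i, with the least significant bit as the lowest device, which the text above does not state explicitly. The helper name devices_in_mask is hypothetical.

```python
from typing import Iterable, List

BITS_PER_WORD = 64  # each entry of processors is a uint64


def devices_in_mask(processors: Iterable[int]) -> List[int]:
    """List the device IDs whose bits are set in the processors bitmask.

    Assumption: bit i of word k corresponds to device 64 * k + i.
    """
    device_ids = []
    for word_index, word in enumerate(processors):
        word = int(word)  # accept NumPy integers as well as Python ints
        for bit in range(BITS_PER_WORD):
            if (word >> bit) & 1:
                device_ids.append(word_index * BITS_PER_WORD + bit)
    return device_ids


# Devices 0 and 2 in the first word, device 65 in the second word.
thrashing_devices = devices_in_mask([0b101, 0b10, 0, 0, 0])
```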

ptr#

Get the pointer address to the data as Python int.

src_id#

The ID of the source CPU/device involved in the memory transfer, page fault, thrashing, throttling or remote map operation. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THRASHING, it is a bitwise ORing of the device IDs fighting for the memory region, ONLY if there are fewer than 32 devices. Ignore this field if counterKind is CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_CPU_PAGE_FAULT_COUNT.

Type:

int

start#

The start timestamp of the counter, in ns. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_HTOD and CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_DTOH, timestamp is captured when activity starts on GPU. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_GPU_PAGE_FAULT and CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_CPU_PAGE_FAULT_COUNT, timestamp is captured when CUDA driver started processing the fault. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THRASHING, timestamp is captured when CUDA driver detected thrashing of memory region. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THROTTLING, timestamp is captured when throttling operation was started by CUDA driver. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_REMOTE_MAP, timestamp is captured when CUDA driver has pushed all required operations to the processor specified by dstId.

Type:

int

stream_id#

The ID of the stream causing the transfer. The value of this field is invalid.

Type:

int

value#

Value of the counter. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_HTOD, CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_DTOH, CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THRASHING and CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_REMOTE_MAP, it is the size of the memory region in bytes. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_GPU_PAGE_FAULT, it is the number of page fault groups for the same page. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_CPU_PAGE_FAULT_COUNT, it is the program counter for the instruction that caused the fault.

Type:

int

class cupti.cupti.ActivityUnifiedMemoryCounterConfig#

Bases: object

Empty-initialize an array of CUpti_ActivityUnifiedMemoryCounterConfig.

The resulting object is of length size and of dtype activity_unified_memory_counter_config_dtype. If default-constructed, the instance represents a single struct.

Parameters:

size (int) – number of structs, default=1.

device_id#

Device id of the target device. This is relevant only for single device scopes. (deprecated in CUDA 7.0)

Type:

Union[uint32, int]

enable#

Control to enable or disable the counter. Set it to a non-zero value to enable the counter, or to zero to disable it.

Type:

Union[uint32, int]

kind#

Unified Memory counter kind.

Type:

Union[int32, int]

ptr#

Get the pointer address to the data as Python int.

scope#

Unified Memory counter scope. (deprecated in CUDA 7.0)

Type:

Union[int32, int]

class cupti.cupti.ActivityUnifiedMemoryCounterKind(value)#

Bases: IntEnum

See CUpti_ActivityUnifiedMemoryCounterKind.

BYTES_TRANSFER_DTOD = 8#
BYTES_TRANSFER_DTOH = 2#
BYTES_TRANSFER_HTOD = 1#
COUNT = 9#
CPU_PAGE_FAULT_COUNT = 3#
FORCE_INT = 2147483647#
GPU_PAGE_FAULT = 4#
REMOTE_MAP = 7#
THRASHING = 5#
THROTTLING = 6#
UNKNOWN = 0#
class cupti.cupti.ActivityUnifiedMemoryCounterScope(value)#

Bases: IntEnum

See CUpti_ActivityUnifiedMemoryCounterScope.

COUNT = 3#
FORCE_INT = 2147483647#
PROCESS_ALL_DEVICES = 2#
PROCESS_SINGLE_DEVICE = 1#
UNKNOWN = 0#
class cupti.cupti.ActivityUnifiedMemoryMigrationCause(value)#

Bases: IntEnum

See CUpti_ActivityUnifiedMemoryMigrationCause.

ACCESS_COUNTERS = 5#
COHERENCE = 2#
EVICTION = 4#
PREFETCH = 3#
UNKNOWN = 0#
USER = 1#
class cupti.cupti.ActivityUnifiedMemoryRemoteMapCause(value)#

Bases: IntEnum

See CUpti_ActivityUnifiedMemoryRemoteMapCause.

COHERENCE = 1#
EVICTION = 5#
OUT_OF_MEMORY = 4#
POLICY = 3#
THRASHING = 2#
UNKNOWN = 0#
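
The flags_ member of ActivityUnifiedMemoryCounter3 is interpreted through a different enumeration depending on counter_kind. A minimal sketch of that dispatch is shown below. The enums are local stand-ins copying a subset of the values listed in this reference, describe_flags is a hypothetical helper, and treating both transfer directions (HTOD and DTOH) as migration causes is an assumption. Kinds that map to CUpti_ActivityFlag (thrashing, throttling) are left as raw integers here.

```python
from enum import IntEnum


class CounterKind(IntEnum):
    """Subset of ActivityUnifiedMemoryCounterKind (values as listed)."""
    BYTES_TRANSFER_HTOD = 1
    BYTES_TRANSFER_DTOH = 2
    GPU_PAGE_FAULT = 4
    REMOTE_MAP = 7


class AccessType(IntEnum):
    """Mirror of ActivityUnifiedMemoryAccessType."""
    UNKNOWN = 0
    READ = 1
    WRITE = 2
    ATOMIC = 3
    PREFETCH = 4


class MigrationCause(IntEnum):
    """Mirror of ActivityUnifiedMemoryMigrationCause."""
    UNKNOWN = 0
    USER = 1
    COHERENCE = 2
    PREFETCH = 3
    EVICTION = 4
    ACCESS_COUNTERS = 5


class RemoteMapCause(IntEnum):
    """Mirror of ActivityUnifiedMemoryRemoteMapCause."""
    UNKNOWN = 0
    COHERENCE = 1
    THRASHING = 2
    POLICY = 3
    OUT_OF_MEMORY = 4
    EVICTION = 5


# Which enum explains flags_ for each counter kind.
_FLAG_ENUM_BY_KIND = {
    CounterKind.GPU_PAGE_FAULT: AccessType,
    CounterKind.BYTES_TRANSFER_HTOD: MigrationCause,
    CounterKind.BYTES_TRANSFER_DTOH: MigrationCause,
    CounterKind.REMOTE_MAP: RemoteMapCause,
}


def describe_flags(counter_kind: int, flags: int) -> str:
    """Render flags_ using the enum that matches counter_kind, if any."""
    enum_cls = _FLAG_ENUM_BY_KIND.get(counter_kind)
    if enum_cls is None:
        return f"raw flags {flags:#x}"
    try:
        return f"{enum_cls.__name__}.{enum_cls(flags).name}"
    except ValueError:
        return f"{enum_cls.__name__}(unrecognized {flags})"
```
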
class cupti.cupti.ApiCallbackSite(value)#

Bases: IntEnum

See CUpti_ApiCallbackSite.

API_CBSITE_FORCE_INT = 2147483647#
API_ENTER = 0#
API_EXIT = 1#
class cupti.cupti.CallbackData#

Bases: object

Empty-initialize an instance of CUpti_CallbackData.

callback_site#

Point in the runtime or driver function from where the callback was issued.

Type:

int

context#

Driver context current to the thread, or null if no context is current. This value can change from the entry to exit callback of a runtime API function if the runtime initializes a context.

Type:

int

context_uid#

Unique ID for the CUDA context associated with the thread. The UIDs are assigned sequentially as contexts are created and are unique within a process.

Type:

int

correlation_data#

Pointer to data shared between the entry and exit callbacks of a given runtime or driver API function invocation. This field can be used to pass 64-bit values from the entry callback to the corresponding exit callback.

Type:

int

correlation_id#

The activity record correlation ID for this callback. For a driver domain callback (i.e. `domain` CUPTI_CB_DOMAIN_DRIVER_API) this ID will equal the correlation ID in the CUpti_ActivityAPI record corresponding to the CUDA driver function call. For a runtime domain callback (i.e. `domain` CUPTI_CB_DOMAIN_RUNTIME_API) this ID will equal the correlation ID in the CUpti_ActivityAPI record corresponding to the CUDA runtime function call. Within the callback, this ID can be recorded to correlate user data with the activity record. This field is new in 4.1.

Type:

int

function_name#

Name of the runtime or driver API function which issued the callback. This string is a global constant and so may be accessed outside of the callback.

Type:

str

function_return_value#

Pointer to the return value of the runtime or driver API call. This field is only valid within the exit (CUPTI_API_EXIT) callback. For a runtime API `functionReturnValue` points to a `cudaError_t`. For a driver API `functionReturnValue` points to a `CUresult`.

Type:

int

ptr#

Get the pointer address to the data as Python int.

symbol_name#

Name of the symbol operated on by the runtime or driver API function which issued the callback. This entry is valid only for driver and runtime launch callbacks, where it returns the name of the kernel.

Type:

str
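
One way the callback_site, correlation_id and function_name members might be combined is to pair each entry callback with its exit callback and record the elapsed host time. The sketch below is illustrative only. The handler signature is hypothetical and is not the callback signature CUPTI requires, the callback data objects are SimpleNamespace stand-ins, and ApiCallbackSite is a local copy of the listed values.

```python
import time
from enum import IntEnum
from types import SimpleNamespace


class ApiCallbackSite(IntEnum):
    """Local stand-in mirroring the values listed in this reference."""
    API_ENTER = 0
    API_EXIT = 1


def on_api_callback(cbdata, pending, finished, clock=time.perf_counter_ns):
    """Pair enter/exit callbacks by correlation ID and record the duration.

    pending maps correlation_id -> entry timestamp.
    finished collects (function_name, elapsed_ns) tuples.
    """
    if cbdata.callback_site == ApiCallbackSite.API_ENTER:
        pending[cbdata.correlation_id] = clock()
    elif cbdata.callback_site == ApiCallbackSite.API_EXIT:
        started = pending.pop(cbdata.correlation_id, None)
        if started is not None:  # ignore exits with no recorded entry
            finished.append((cbdata.function_name, clock() - started))


# Usage with stand-in callback data and a fake clock for repeatability.
pending, finished = {}, []
ticks = iter([100, 250])
fake_clock = lambda: next(ticks)
enter = SimpleNamespace(callback_site=0, correlation_id=7,
                        function_name="cuLaunchKernel")
leave = SimpleNamespace(callback_site=1, correlation_id=7,
                        function_name="cuLaunchKernel")
on_api_callback(enter, pending, finished, fake_clock)
on_api_callback(leave, pending, finished, fake_clock)
```

Keying on correlation_id rather than function_name keeps concurrent invocations of the same function apart, since each invocation is assigned its own correlation ID.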

class cupti.cupti.CallbackDomain(value)#

Bases: IntEnum

See CUpti_CallbackDomain.

DRIVER_API = 1#
FORCE_INT = 2147483647#
INVALID = 0#
NVTX = 5#
RESOURCE = 3#
RUNTIME_API = 2#
SIZE = 7#
STATE = 6#
SYNCHRONIZE = 4#
class cupti.cupti.CallbackIdResource(value)#

Bases: IntEnum

See CUpti_CallbackIdResource.

CONTEXT_CREATED = 1#
CONTEXT_DESTROY_STARTING = 2#
CU_INIT_FINISHED = 5#
FORCE_INT = 2147483647#
GRAPHEXEC_CREATED = 18#
GRAPHEXEC_CREATE_STARTING = 17#
GRAPHEXEC_DESTROY_STARTING = 19#
GRAPHNODE_CLONED = 20#
GRAPHNODE_CREATED = 13#
GRAPHNODE_CREATE_STARTING = 12#
GRAPHNODE_DEPENDENCY_CREATED = 15#
GRAPHNODE_DEPENDENCY_DESTROY_STARTING = 16#
GRAPHNODE_DESTROY_STARTING = 14#
GRAPH_CLONED = 11#
GRAPH_CREATED = 9#
GRAPH_DESTROY_STARTING = 10#
GRAPH_NODE_SET_PARAMS = 23#
GRAPH_NODE_UPDATED = 22#
INVALID = 0#
MODULE_LOADED = 6#
MODULE_PROFILED = 8#
MODULE_UNLOAD_STARTING = 7#
SIZE = 24#
STREAM_ATTRIBUTE_CHANGED = 21#
STREAM_CREATED = 3#
STREAM_DESTROY_STARTING = 4#
class cupti.cupti.CallbackIdState(value)#

Bases: IntEnum

See CUpti_CallbackIdState.

ERROR = 2#
FATAL_ERROR = 1#
FORCE_INT = 2147483647#
INVALID = 0#
SIZE = 4#
WARNING = 3#
class cupti.cupti.CallbackIdSync(value)#

Bases: IntEnum

See CUpti_CallbackIdSync.

CONTEXT_SYNCHRONIZED = 2#
FORCE_INT = 2147483647#
INVALID = 0#
SIZE = 3#
STREAM_SYNCHRONIZED = 1#
class cupti.cupti.ChannelType(value)#

Bases: IntEnum

See CUpti_ChannelType.

ASYNC_MEMCPY = 2#
COMPUTE = 1#
DECOMP = 3#
FORCE_INT = 2147483647#
INVALID = 0#
class cupti.cupti.ConfidentialComputeRotationEventType(value)#

Bases: IntEnum

See CUpti_ConfidentialComputeRotationEventType.

EVENT_TYPE_FORCE_INT = 2147483647#
INVALID_ROTATION_EVENT = 0#
KEY_ROTATION_ACKNOWLEGED = 2#
KEY_ROTATION_CHANNEL_BLOCKED = 1#
KEY_ROTATION_CHANNEL_DRAINED = 2#
KEY_ROTATION_CHANNEL_UNBLOCKED = 3#
KEY_ROTATION_COMPLETED = 4#
KEY_ROTATION_REQUESTED = 1#
KEY_ROTATION_STARTED = 3#
class cupti.cupti.ContextCigMode(value)#

Bases: IntEnum

See CUpti_ContextCigMode.

CIG = 1#
CIG_FALLBACK = 2#
FORCE_INT = 2147483647#
NONE = 0#
class cupti.cupti.DevType(value)#

Bases: IntEnum

See CUpti_DevType.

FORCE_INT = 2147483647#
GPU = 1#
INVALID = 0#
NPU = 2#
class cupti.cupti.DeviceAttribute(value)#

Bases: IntEnum

See CUpti_DeviceAttribute.

ATTR_CLASS = 10#
ATTR_FLOP_DP_PER_CYCLE = 12#
ATTR_FLOP_HP_PER_CYCLE = 17#
ATTR_FLOP_SP_PER_CYCLE = 11#
ATTR_FORCE_INT = 2147483647#
ATTR_GLOBAL_MEMORY_BANDWIDTH = 3#
ATTR_INSTRUCTION_PER_CYCLE = 4#
ATTR_INSTRUCTION_THROUGHPUT_SINGLE_PRECISION = 5#
ATTR_MAX_EVENT_DOMAIN_ID = 2#
ATTR_MAX_EVENT_ID = 1#
ATTR_MAX_FRAME_BUFFERS = 6#
ATTR_MAX_L2_UNITS = 13#
ATTR_MAX_SHARED_MEMORY_CACHE_CONFIG_PREFER_EQUAL = 16#
ATTR_MAX_SHARED_MEMORY_CACHE_CONFIG_PREFER_L1 = 15#
ATTR_MAX_SHARED_MEMORY_CACHE_CONFIG_PREFER_SHARED = 14#
ATTR_NVSWITCH_PRESENT = 20#
ATTR_PCIE_GEN = 9#
class cupti.cupti.DeviceVirtualizationMode(value)#

Bases: IntEnum

See CUpti_DeviceVirtualizationMode.

FORCE_INT = 2147483647#
NONE = 0#
PASS_THROUGH = 1#
VIRTUAL_GPU = 2#
class cupti.cupti.EnvironmentClocksThrottleReason(value)#

Bases: IntEnum

See CUpti_EnvironmentClocksThrottleReason.

FORCE_INT = 2147483647#
GPU_IDLE = 1#
HW_SLOWDOWN = 8#
NONE = 0#
SW_POWER_CAP = 4#
UNKNOWN = 2147483648#
UNSUPPORTED = 1073741824#
USER_DEFINED_CLOCKS = 2#
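
Most of the values above are single bits, which suggests that a field such as clocks_throttle_reasons holds a bitwise OR of these reasons. Under that assumption, a mask might be decoded as in the following sketch. The enum is a local stand-in copying the listed values, and throttle_reasons is a hypothetical helper.

```python
from enum import IntEnum
from typing import List


class EnvironmentClocksThrottleReason(IntEnum):
    """Local stand-in mirroring the values listed in this reference."""
    NONE = 0
    GPU_IDLE = 1
    USER_DEFINED_CLOCKS = 2
    SW_POWER_CAP = 4
    HW_SLOWDOWN = 8
    UNSUPPORTED = 1073741824
    FORCE_INT = 2147483647
    UNKNOWN = 2147483648


# Keep only members that are a single set bit (skips NONE and FORCE_INT).
_SINGLE_BIT_REASONS = [
    reason for reason in EnvironmentClocksThrottleReason
    if reason.value != 0 and (reason.value & (reason.value - 1)) == 0
]


def throttle_reasons(mask: int) -> List[str]:
    """Return the names of all reasons whose bit is set in mask."""
    names = [reason.name for reason in _SINGLE_BIT_REASONS if mask & reason]
    return names or [EnvironmentClocksThrottleReason.NONE.name]
```
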
class cupti.cupti.ExternalCorrelationKind(value)#

Bases: IntEnum

See CUpti_ExternalCorrelationKind.

CUSTOM0 = 3#
CUSTOM1 = 4#
CUSTOM2 = 5#
FORCE_INT = 2147483647#
INVALID = 0#
OPENACC = 2#
SIZE = 6#
UNKNOWN = 1#
class cupti.cupti.FuncShmemLimitConfig(value)#

Bases: IntEnum

See CUpti_FuncShmemLimitConfig.

DEFAULT = 0#
FORCE_INT = 2147483647#
OPTIN = 1#
class cupti.cupti.GraphData#

Bases: object

Empty-initialize an instance of CUpti_GraphData.

See also

CUpti_GraphData

dependency#

The dependent graph node

Type:

int

graph#

CUDA graph

Type:

int

graph_exec#

CUDA executable graph

Type:

int

node#

CUDA graph node

Type:

int

node_type#

Type of the node

Type:

int

original_graph#

The original CUDA graph from which graph is cloned

Type:

int

original_node#

The original CUDA graph node from which node is cloned

Type:

int

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti.MetricValue#

Bases: object

Empty-initialize an instance of CUpti_MetricValue.

See also

CUpti_MetricValue

metric_value_double#

float:

metric_value_int64#

int:

metric_value_nvtx_extended_payload#

Value for CUPTI_METRIC_VALUE_KIND_NVTX_EXTENDED_PAYLOAD.

Type:

int

metric_value_percent#

float:

metric_value_throughput#

int:

metric_value_uint64#

int:

metric_value_utilization_level#

int:

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti.MetricValueKind(value)#

Bases: IntEnum

See CUpti_MetricValueKind.

DOUBLE = 0#
FORCE_INT = 2147483647#
INT64 = 4#
NVTX_EXTENDED_PAYLOAD = 6#
PERCENT = 2#
THROUGHPUT = 3#
UINT64 = 1#
UTILIZATION_LEVEL = 5#
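
Because MetricValue exposes one member per value kind, a reader has to pick the member that matches the kind reported for the metric. A minimal sketch of that selection follows. The pairing of kinds to members is inferred from the matching names in this reference, the enum is a local stand-in, the metric value is a SimpleNamespace placeholder, and read_metric_value is a hypothetical helper.

```python
from enum import IntEnum
from types import SimpleNamespace


class MetricValueKind(IntEnum):
    """Local stand-in mirroring the values listed in this reference."""
    DOUBLE = 0
    UINT64 = 1
    PERCENT = 2
    THROUGHPUT = 3
    INT64 = 4
    UTILIZATION_LEVEL = 5
    NVTX_EXTENDED_PAYLOAD = 6


# Assumed pairing of each kind with the MetricValue member of the same name.
_MEMBER_BY_KIND = {
    MetricValueKind.DOUBLE: "metric_value_double",
    MetricValueKind.UINT64: "metric_value_uint64",
    MetricValueKind.PERCENT: "metric_value_percent",
    MetricValueKind.THROUGHPUT: "metric_value_throughput",
    MetricValueKind.INT64: "metric_value_int64",
    MetricValueKind.UTILIZATION_LEVEL: "metric_value_utilization_level",
    MetricValueKind.NVTX_EXTENDED_PAYLOAD: "metric_value_nvtx_extended_payload",
}


def read_metric_value(metric_value, kind: int):
    """Return the member of metric_value that corresponds to kind.

    Raises ValueError if kind is not a known MetricValueKind.
    """
    member = _MEMBER_BY_KIND[MetricValueKind(kind)]
    return getattr(metric_value, member)


# Stand-in object carrying only the member needed for this example.
sample = SimpleNamespace(metric_value_percent=87.5)
percent = read_metric_value(sample, MetricValueKind.PERCENT)
```
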
class cupti.cupti.MetricValueUtilizationLevel(value)#

Bases: IntEnum

See CUpti_MetricValueUtilizationLevel.

FORCE_INT = 2147483647#
HIGH = 8#
IDLE = 0#
LOW = 2#
MAX = 10#
MID = 5#
class cupti.cupti.ModuleResourceData#

Bases: object

Empty-initialize an instance of CUpti_ModuleResourceData.

cubin_size#

The size of the cubin.

Type:

int

module_id#

Identifier to associate with the CUDA module.

Type:

int

p_cubin#

Pointer to the associated cubin.

Type:

str

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti.OpenAccConstructKind(value)#

Bases: IntEnum

See CUpti_OpenAccConstructKind.

ATOMIC = 8#
DATA = 4#
DECLARE = 9#
ENTER_DATA = 5#
EXIT_DATA = 6#
FORCE_INT = 2147483647#
HOST_DATA = 7#
INIT = 10#
KERNELS = 2#
LOOP = 3#
PARALLEL = 1#
ROUTINE = 14#
RUNTIME_API = 16#
SET = 12#
SHUTDOWN = 11#
UNKNOWN = 0#
UPDATE = 13#
WAIT = 15#
class cupti.cupti.OpenAccEventKind(value)#

Bases: IntEnum

See CUpti_OpenAccEventKind.

ALLOC = 15#
COMPUTE_CONSTRUCT = 9#
CREATE = 13#
DELETE = 14#
DEVICE_INIT = 1#
DEVICE_SHUTDOWN = 2#
ENQUEUE_DOWNLOAD = 6#
ENQUEUE_LAUNCH = 4#
ENQUEUE_UPLOAD = 5#
ENTER_DATA = 11#
EXIT_DATA = 12#
FORCE_INT = 2147483647#
FREE = 16#
IMPLICIT_WAIT = 8#
INVALID = 0#
RUNTIME_SHUTDOWN = 3#
UPDATE = 10#
WAIT = 7#
class cupti.cupti.OpenMpEventKind(value)#

Bases: IntEnum

See CUpti_OpenMpEventKind.

FORCE_INT = 2147483647#
IDLE = 4#
INVALID = 0#
PARALLEL = 1#
TASK = 2#
THREAD = 3#
WAIT_BARRIER = 5#
WAIT_TASKWAIT = 6#
class cupti.cupti.PcieDeviceType(value)#

Bases: IntEnum

See CUpti_PcieDeviceType.

BRIDGE = 1#
FORCE_INT = 2147483647#
GPU = 0#
class cupti.cupti.ResourceData#

Bases: object

Empty-initialize an instance of CUpti_ResourceData.

context#

For CUPTI_CBID_RESOURCE_CONTEXT_CREATED and CUPTI_CBID_RESOURCE_CONTEXT_DESTROY_STARTING, the context being created or destroyed. For CUPTI_CBID_RESOURCE_STREAM_CREATED and CUPTI_CBID_RESOURCE_STREAM_DESTROY_STARTING, the context containing the stream being created or destroyed.

Type:

int

ptr#

Get the pointer address to the data as Python int.

resource_descriptor#

Reserved for future use.

Type:

int

resource_handle#

_py_anon_pod0:

class cupti.cupti.Result(value)#

Bases: IntEnum

See CUptiResult.

ERROR_API_NOT_IMPLEMENTED = 11#
ERROR_CDP_TRACING_NOT_SUPPORTED = 32#
ERROR_CMP_DEVICE_NOT_SUPPORTED = 42#
ERROR_CONFIDENTIAL_COMPUTING_NOT_SUPPORTED = 41#
ERROR_CUDA_COMPILER_NOT_COMPATIBLE = 34#
ERROR_DISABLED = 23#
ERROR_FORCE_INT = 2147483647#
ERROR_HARDWARE = 9#
ERROR_HARDWARE_BUSY = 26#
ERROR_INSUFFICIENT_PRIVILEGES = 35#
ERROR_INVALID_CHIP_NAME = 46#
ERROR_INVALID_CONTEXT = 3#
ERROR_INVALID_DEVICE = 2#
ERROR_INVALID_EVENT_DOMAIN_ID = 4#
ERROR_INVALID_EVENT_ID = 5#
ERROR_INVALID_EVENT_NAME = 6#
ERROR_INVALID_EVENT_VALUE = 22#
ERROR_INVALID_HANDLE = 19#
ERROR_INVALID_KIND = 21#
ERROR_INVALID_METRIC_ID = 16#
ERROR_INVALID_METRIC_NAME = 17#
ERROR_INVALID_METRIC_VALUE = 25#
ERROR_INVALID_MODULE = 24#
ERROR_INVALID_OPERATION = 7#
ERROR_INVALID_PARAMETER = 1#
ERROR_INVALID_STREAM = 20#
ERROR_LEGACY_PROFILER_NOT_SUPPORTED = 38#
ERROR_MAX_LIMIT_REACHED = 12#
ERROR_MIG_DEVICE_NOT_SUPPORTED = 43#
ERROR_MULTIPLE_SUBSCRIBERS_NOT_SUPPORTED = 39#
ERROR_NOT_COMPATIBLE = 14#
ERROR_NOT_INITIALIZED = 15#
ERROR_NOT_READY = 13#
ERROR_NOT_SUPPORTED = 27#
ERROR_OLD_PROFILER_API_INITIALIZED = 36#
ERROR_OPENACC_UNDEFINED_ROUTINE = 37#
ERROR_OUT_OF_MEMORY = 8#
ERROR_PARAMETER_SIZE_NOT_SUFFICIENT = 10#
ERROR_QUEUE_EMPTY = 18#
ERROR_SLI_DEVICE_NOT_SUPPORTED = 44#
ERROR_UM_PROFILING_NOT_SUPPORTED = 28#
ERROR_UM_PROFILING_NOT_SUPPORTED_ON_DEVICE = 29#
ERROR_UM_PROFILING_NOT_SUPPORTED_ON_NON_P2P_DEVICES = 30#
ERROR_UM_PROFILING_NOT_SUPPORTED_WITH_MPS = 31#
ERROR_UNKNOWN = 999#
ERROR_VIRTUALIZED_DEVICE_INSUFFICIENT_PRIVILEGES = 40#
ERROR_VIRTUALIZED_DEVICE_NOT_SUPPORTED = 33#
ERROR_WSL_DEVICE_NOT_SUPPORTED = 45#
SUCCESS = 0#
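
Since Result is an IntEnum, an integer status code (such as the status carried by cuptiError) can be turned into a readable name by constructing the enum from it. The sketch below uses a local stand-in with only a few of the listed members so that it runs without CUPTI. status_name is a hypothetical helper.

```python
from enum import IntEnum


class Result(IntEnum):
    """Small subset of cupti.cupti.Result (values as listed above)."""
    SUCCESS = 0
    ERROR_INVALID_PARAMETER = 1
    ERROR_NOT_INITIALIZED = 15
    ERROR_MULTIPLE_SUBSCRIBERS_NOT_SUPPORTED = 39
    ERROR_UNKNOWN = 999


def status_name(status: int) -> str:
    """Map an integer status to its Result member name, if it has one."""
    try:
        return Result(status).name
    except ValueError:
        return f"unrecognized status {status}"
```
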
class cupti.cupti.StateData#

Bases: object

Empty-initialize an instance of CUpti_StateData.

See also

CUpti_StateData

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti.StreamAttrData#

Bases: object

Empty-initialize an instance of CUpti_StreamAttrData.

attr#

The type of the CUDA stream attribute

Type:

int

ptr#

Get the pointer address to the data as Python int.

stream#

The CUDA stream handle for the attribute

Type:

int

value#

The value of the CUDA stream attribute

Type:

int

class cupti.cupti.SubscriberParams#

Bases: object

Empty-initialize an instance of CUpti_SubscriberParams.

old_subscriber_name#

The name of the incompatible tool or the existing CUPTI subscriber, if cupti.cupti.subscribe_v2 fails with the CUPTI_ERROR_MULTIPLE_SUBSCRIBERS_NOT_SUPPORTED return code. None otherwise.

Type:

Union[str, None]

ptr#

Get the pointer address to the data as Python int.

struct_size#

Size of the data structure. CUPTI client should set the size of the structure. It will be used in CUPTI to check what fields are available in the structure. Used to preserve backward compatibility.

Type:

int

subscriber_name#

Name given to the subscriber. The subscriber name need not include the “CUPTI” prefix, as the CUPTI library automatically adds it as “CUPTI for <subscriberName>”. Can be None. An internal copy is created. Size must not exceed cupti.cupti.SUBSCRIBER_NAME_MAX_LEN to avoid truncation.

Type:

str

class cupti.cupti.SynchronizeData#

Bases: object

Empty-initialize an instance of CUpti_SynchronizeData.

context#

The context of the stream being synchronized.

Type:

int

ptr#

Get the pointer address to the data as Python int.

stream#

The stream being synchronized.

Type:

int

class cupti.cupti._py_anon_pod0#

Bases: object

Empty-initialize an instance of _anon_pod0.

See also

_anon_pod0

ptr#

Get the pointer address to the data as Python int.

stream#

int:

class cupti.cupti._py_anon_pod1#

Bases: object

Empty-initialize an instance of _anon_pod1.

See also

_anon_pod1

notification#

_py_anon_pod2:

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti._py_anon_pod10#

Bases: object

Empty-initialize an instance of _anon_pod10.

See also

_anon_pod10

ptr#

Get the pointer address to the data as Python int.

v_double#

float:

v_int32#

int:

v_int64#

int:

v_uint32#

int:

v_uint64#

int:

class cupti.cupti._py_anon_pod11#

Bases: object

Empty-initialize an instance of _anon_pod11.

See also

_anon_pod11

cooling#

_py_anon_pod15:

power#

_py_anon_pod14:

ptr#

Get the pointer address to the data as Python int.

speed#

_py_anon_pod12:

temperature#

_py_anon_pod13:

class cupti.cupti._py_anon_pod12#

Bases: object

Empty-initialize an instance of _anon_pod12.

See also

_anon_pod12

clocks_throttle_reasons#

int:

memory_clock#

int:

int:

int:

ptr#

Get the pointer address to the data as Python int.

sm_clock#

int:

class cupti.cupti._py_anon_pod13#

Bases: object

Empty-initialize an instance of _anon_pod13.

See also

_anon_pod13

gpu_temperature#

int:

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti._py_anon_pod14#

Bases: object

Empty-initialize an instance of _anon_pod14.

See also

_anon_pod14

power#

int:

power_limit#

int:

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti._py_anon_pod15#

Bases: object

Empty-initialize an instance of _anon_pod15.

See also

_anon_pod15

fan_speed#

int:

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti._py_anon_pod2#

Bases: object

Empty-initialize an instance of _anon_pod2.

See also

_anon_pod2

message#

str:

ptr#

Get the pointer address to the data as Python int.

result#

int:

class cupti.cupti._py_anon_pod24#

Bases: object

Empty-initialize an instance of _anon_pod24.

See also

_anon_pod24

both#

int:

config#

_py_anon_pod25:

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti._py_anon_pod25#

Bases: object

Empty-initialize an instance of _anon_pod25.

See also

_anon_pod25

executed#

int:

ptr#

Get the pointer address to the data as Python int.

requested#

int:

class cupti.cupti._py_anon_pod3#

Bases: object

Empty-initialize an instance of _anon_pod3.

See also

_anon_pod3

process_id#

int:

ptr#

Get the pointer address to the data as Python int.

thread_id#

int:

class cupti.cupti._py_anon_pod4#

Bases: object

Empty-initialize an instance of _anon_pod4.

See also

_anon_pod4

context_id#

int:

device_id#

int:

ptr#

Get the pointer address to the data as Python int.

stream_id#

int:

class cupti.cupti._py_anon_pod5#

Bases: object

Empty-initialize an instance of _anon_pod5.

See also

_anon_pod5

address#

int:

memory_pool_type#

int:

pad2#

int:

pool#

_py_anon_pod6:

ptr#

Get the pointer address to the data as Python int.

release_threshold#

int:

utilized_size#

int:

class cupti.cupti._py_anon_pod6#

Bases: object

Empty-initialize an instance of _anon_pod6.

See also

_anon_pod6

process_id#

int:

ptr#

Get the pointer address to the data as Python int.

size_#

int:

class cupti.cupti._py_anon_pod7#

Bases: object

Empty-initialize an instance of _anon_pod7.

See also

_anon_pod7

both#

int:

config#

_py_anon_pod8:

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti._py_anon_pod8#

Bases: object

Empty-initialize an instance of _anon_pod8.

See also

_anon_pod8

executed#

int:

ptr#

Get the pointer address to the data as Python int.

requested#

int:

class cupti.cupti._py_anon_pod9#

Bases: object

Empty-initialize an instance of _anon_pod9.

See also

_anon_pod9

cu#

int:

cupti#

int:

ptr#

Get the pointer address to the data as Python int.

class cupti.cupti.driver_api_trace_cbid(value)#

Bases: IntEnum

See CUpti_driver_api_trace_cbid.

FORCE_INT = 2147483647#
INVALID = 0#
SIZE = 807#
cu64Array3DCreate = 230#
cu64Array3DGetDescriptor = 231#
cu64ArrayCreate = 228#
cu64ArrayGetDescriptor = 229#
cu64D3D10ResourceGetMappedPitch = 200#
cu64D3D10ResourceGetMappedPointer = 198#
cu64D3D10ResourceGetMappedSize = 199#
cu64D3D10ResourceGetSurfaceDimensions = 201#
cu64D3D9MapVertexBuffer = 206#
cu64D3D9ResourceGetMappedPitch = 205#
cu64D3D9ResourceGetMappedPointer = 203#
cu64D3D9ResourceGetMappedSize = 204#
cu64D3D9ResourceGetSurfaceDimensions = 202#
cu64DeviceTotalMem = 197#
cu64GLMapBufferObject = 207#
cu64GLMapBufferObjectAsync = 208#
cu64GraphicsResourceGetMappedPointer = 131#
cu64MemAlloc = 30#
cu64MemAllocPitch = 32#
cu64MemFree = 34#
cu64MemGetAddressRange = 36#
cu64MemGetInfo = 28#
cu64MemHostAlloc = 215#
cu64MemHostGetDevicePointer = 41#
cu64Memcpy2D = 232#
cu64Memcpy2DAsync = 234#
cu64Memcpy2DUnaligned = 233#
cu64Memcpy3D = 59#
cu64Memcpy3DAsync = 70#
cu64MemcpyAtoD = 52#
cu64MemcpyDtoA = 50#
cu64MemcpyDtoD = 48#
cu64MemcpyDtoDAsync = 65#
cu64MemcpyDtoH = 46#
cu64MemcpyDtoHAsync = 63#
cu64MemcpyHtoD = 44#
cu64MemcpyHtoDAsync = 61#
cu64MemsetD16 = 74#
cu64MemsetD16Async = 219#
cu64MemsetD2D16 = 80#
cu64MemsetD2D16Async = 225#
cu64MemsetD2D32 = 82#
cu64MemsetD2D32Async = 227#
cu64MemsetD2D8 = 78#
cu64MemsetD2D8Async = 223#
cu64MemsetD32 = 76#
cu64MemsetD32Async = 221#
cu64MemsetD8 = 72#
cu64MemsetD8Async = 217#
cu64ModuleGetGlobal = 25#
cu64TexRefGetAddress = 104#
cu64TexRefSetAddress = 96#
cu64TexRefSetAddress2D = 98#
cuArray3DCreate = 90#
cuArray3DCreate_v2 = 274#
cuArray3DGetDescriptor = 91#
cuArray3DGetDescriptor_v2 = 275#
cuArrayCreate = 87#
cuArrayCreate_v2 = 272#
cuArrayDestroy = 89#
cuArrayGetDescriptor = 88#
cuArrayGetDescriptor_v2 = 273#
cuArrayGetMemoryRequirements = 654#
cuArrayGetPlane = 597#
cuArrayGetSparseProperties = 582#
cuBinaryFree = 376#
cuCheckpointProcessCheckpoint = 771#
cuCheckpointProcessGetRestoreThreadId = 768#
cuCheckpointProcessGetState = 769#
cuCheckpointProcessLock = 770#
cuCheckpointProcessRestore = 772#
cuCheckpointProcessUnlock = 773#
cuCompilePtx = 375#
cuCoredumpGetAttribute = 701#
cuCoredumpGetAttributeGlobal = 702#
cuCoredumpSetAttribute = 703#
cuCoredumpSetAttributeGlobal = 704#
cuCtxAttach = 12#
cuCtxCreate = 10#
cuCtxCreate_v2 = 235#
cuCtxCreate_v3 = 645#
cuCtxCreate_v4 = 757#
cuCtxDestroy = 11#
cuCtxDestroy_v2 = 322#
cuCtxDetach = 13#
cuCtxDisablePeerAccess = 314#
cuCtxEnablePeerAccess = 313#
cuCtxFromGreenCtx = 753#
cuCtxGetApiVersion = 296#
cuCtxGetCacheConfig = 299#
cuCtxGetCurrent = 304#
cuCtxGetDevResource = 746#
cuCtxGetDevice = 16#
cuCtxGetDevice_v2 = 795#
cuCtxGetExecAffinity = 646#
cuCtxGetFlags = 391#
cuCtxGetId = 695#
cuCtxGetLimit = 137#
cuCtxGetSharedMemConfig = 337#
cuCtxGetStreamPriorityRange = 370#
cuCtxPopCurrent = 15#
cuCtxPopCurrent_v2 = 324#
cuCtxPushCurrent = 14#
cuCtxPushCurrent_v2 = 323#
cuCtxRecordEvent = 755#
cuCtxResetPersistingL2Cache = 568#
cuCtxSetCacheConfig = 300#
cuCtxSetCurrent = 303#
cuCtxSetFlags = 705#
cuCtxSetLimit = 136#
cuCtxSetSharedMemConfig = 336#
cuCtxSynchronize = 17#
cuCtxSynchronize_v2 = 800#
cuCtxWaitEvent = 756#
cuD3D10CtxCreate = 139#
cuD3D10CtxCreateOnDevice = 212#
cuD3D10CtxCreate_v2 = 236#
cuD3D10GetDevice = 138#
cuD3D10GetDevices = 211#
cuD3D10GetDirect3DDevice = 297#
cuD3D10MapResources = 143#
cuD3D10RegisterResource = 141#
cuD3D10ResourceGetMappedArray = 146#
cuD3D10ResourceGetMappedPitch = 149#
cuD3D10ResourceGetMappedPitch_v2 = 262#
cuD3D10ResourceGetMappedPointer = 147#
cuD3D10ResourceGetMappedPointer_v2 = 260#
cuD3D10ResourceGetMappedSize = 148#
cuD3D10ResourceGetMappedSize_v2 = 261#
cuD3D10ResourceGetSurfaceDimensions = 150#
cuD3D10ResourceGetSurfaceDimensions_v2 = 263#
cuD3D10ResourceSetMapFlags = 145#
cuD3D10UnmapResources = 144#
cuD3D10UnregisterResource = 142#
cuD3D11CtxCreate = 152#
cuD3D11CtxCreateOnDevice = 210#
cuD3D11CtxCreate_v2 = 237#
cuD3D11GetDevice = 151#
cuD3D11GetDevices = 209#
cuD3D11GetDirect3DDevice = 298#
cuD3D9Begin = 168#
cuD3D9CtxCreate = 155#
cuD3D9CtxCreateOnDevice = 214#
cuD3D9CtxCreate_v2 = 238#
cuD3D9End = 169#
cuD3D9GetDevice = 154#
cuD3D9GetDevices = 213#
cuD3D9GetDirect3DDevice = 157#
cuD3D9MapResources = 160#
cuD3D9MapVertexBuffer = 171#
cuD3D9MapVertexBuffer_v2 = 268#
cuD3D9RegisterResource = 158#
cuD3D9RegisterVertexBuffer = 170#
cuD3D9ResourceGetMappedArray = 164#
cuD3D9ResourceGetMappedPitch = 167#
cuD3D9ResourceGetMappedPitch_v2 = 267#
cuD3D9ResourceGetMappedPointer = 165#
cuD3D9ResourceGetMappedPointer_v2 = 265#
cuD3D9ResourceGetMappedSize = 166#
cuD3D9ResourceGetMappedSize_v2 = 266#
cuD3D9ResourceGetSurfaceDimensions = 163#
cuD3D9ResourceGetSurfaceDimensions_v2 = 264#
cuD3D9ResourceSetMapFlags = 162#
cuD3D9UnmapResources = 161#
cuD3D9UnmapVertexBuffer = 172#
cuD3D9UnregisterResource = 159#
cuD3D9UnregisterVertexBuffer = 173#
cuDestroyExternalMemory = 488#
cuDestroyExternalSemaphore = 494#
cuDevResourceGenerateDesc = 748#
cuDevSmResourceSplitByCount = 751#
cuDeviceCanAccessPeer = 312#
cuDeviceComputeCapability = 6#
cuDeviceGet = 3#
cuDeviceGetAttribute = 9#
cuDeviceGetByPCIBusId = 331#
cuDeviceGetCount = 4#
cuDeviceGetDefaultMemPool = 606#
cuDeviceGetDevResource = 745#
cuDeviceGetExecAffinitySupport = 644#
cuDeviceGetGraphMemAttribute = 641#
cuDeviceGetHostAtomicCapabilities = 805#
cuDeviceGetLuid = 532#
cuDeviceGetMemPool = 610#
cuDeviceGetName = 5#
cuDeviceGetNvSciSyncAttributes = 542#
cuDeviceGetP2PAtomicCapabilities = 804#
cuDeviceGetP2PAttribute = 454#
cuDeviceGetPCIBusId = 332#
cuDeviceGetProperties = 8#
cuDeviceGetTexture1DLinearMaxWidth = 579#
cuDeviceGetUuid = 482#
cuDeviceGetUuid_v2 = 647#
cuDeviceGraphMemTrim = 640#
cuDevicePrimaryCtxGetState = 392#
cuDevicePrimaryCtxRelease = 387#
cuDevicePrimaryCtxRelease_v2 = 544#
cuDevicePrimaryCtxReset = 389#
cuDevicePrimaryCtxReset_v2 = 545#
cuDevicePrimaryCtxRetain = 386#
cuDevicePrimaryCtxSetFlags = 388#
cuDevicePrimaryCtxSetFlags_v2 = 546#
cuDeviceRegisterAsyncNotification = 735#
cuDeviceSetGraphMemAttribute = 642#
cuDeviceSetMemPool = 609#
cuDeviceTotalMem = 7#
cuDeviceTotalMem_v2 = 259#
cuDeviceUnregisterAsyncNotification = 736#
cuDriverGetGpuCodeIsaVersion = 806#
cuDriverGetVersion = 2#
cuEGLStreamConsumerAcquireFrame = 395#
cuEGLStreamConsumerConnect = 393#
cuEGLStreamConsumerConnectWithFlags = 470#
cuEGLStreamConsumerDisconnect = 394#
cuEGLStreamConsumerReleaseFrame = 396#
cuEGLStreamProducerConnect = 446#
cuEGLStreamProducerDisconnect = 447#
cuEGLStreamProducerPresentFrame = 448#
cuEGLStreamProducerReturnFrame = 453#
cuEventCreate = 118#
cuEventCreateFromEGLSync = 479#
cuEventCreateFromNVNSync = 469#
cuEventDestroy = 122#
cuEventDestroy_v2 = 325#
cuEventElapsedTime = 123#
cuEventElapsedTime_v2 = 780#
cuEventQuery = 120#
cuEventRecord = 119#
cuEventRecordWithFlags = 587#
cuEventRecordWithFlags_ptsz = 588#
cuEventRecord_ptsz = 441#
cuEventSynchronize = 121#
cuExternalMemoryGetMappedBuffer = 486#
cuExternalMemoryGetMappedMipmappedArray = 487#
cuFlushGPUDirectRDMAWrites = 627#
cuFuncGetAttribute = 85#
cuFuncGetModule = 566#
cuFuncGetName = 718#
cuFuncGetParamInfo = 733#
cuFuncIsLoaded = 741#
cuFuncLoad = 742#
cuFuncSetAttribute = 481#
cuFuncSetBlockShape = 83#
cuFuncSetCacheConfig = 86#
cuFuncSetSharedMemConfig = 338#
cuFuncSetSharedSize = 84#
cuGLCtxCreate = 174#
cuGLCtxCreate_v2 = 239#
cuGLGetDevices = 333#
cuGLGetDevices_v2 = 385#
cuGLInit = 178#
cuGLMapBufferObject = 180#
cuGLMapBufferObjectAsync = 184#
cuGLMapBufferObjectAsync_v2 = 270#
cuGLMapBufferObjectAsync_v2_ptsz = 445#
cuGLMapBufferObject_v2 = 269#
cuGLMapBufferObject_v2_ptds = 417#
cuGLRegisterBufferObject = 179#
cuGLSetBufferObjectMapFlags = 183#
cuGLUnmapBufferObject = 181#
cuGLUnmapBufferObjectAsync = 185#
cuGLUnregisterBufferObject = 182#
cuGetErrorName = 373#
cuGetErrorString = 372#
cuGetExportTable = 135#
cuGetProcAddress = 626#
cuGetProcAddress_v2 = 677#
cuGraphAddBatchMemOpNode = 669#
cuGraphAddChildGraphNode = 525#
cuGraphAddDependencies = 518#
cuGraphAddDependencies_v2 = 727#
cuGraphAddEmptyNode = 526#
cuGraphAddEventRecordNode = 589#
cuGraphAddEventWaitNode = 590#
cuGraphAddExternalSemaphoresSignalNode = 618#
cuGraphAddExternalSemaphoresWaitNode = 621#
cuGraphAddHostNode = 530#
cuGraphAddKernelNode = 502#
cuGraphAddKernelNode_v2 = 689#
cuGraphAddMemAllocNode = 638#
cuGraphAddMemFreeNode = 639#
cuGraphAddMemcpyNode = 504#
cuGraphAddMemsetNode = 506#
cuGraphAddNode = 712#
cuGraphAddNode_v2 = 723#
cuGraphBatchMemOpNodeGetParams = 670#
cuGraphBatchMemOpNodeSetParams = 671#
cuGraphChildGraphNodeGetGraph = 529#
cuGraphClone = 523#
cuGraphConditionalHandleCreate = 722#
cuGraphCreate = 501#
cuGraphDebugDotPrint = 628#
cuGraphDestroy = 517#
cuGraphDestroyNode = 522#
cuGraphEventRecordNodeGetEvent = 591#
cuGraphEventRecordNodeSetEvent = 593#
cuGraphEventWaitNodeGetEvent = 592#
cuGraphEventWaitNodeSetEvent = 594#
cuGraphExecBatchMemOpNodeSetParams = 672#
cuGraphExecChildGraphNodeSetParams = 586#
cuGraphExecDestroy = 516#
cuGraphExecEventRecordNodeSetEvent = 595#
cuGraphExecEventWaitNodeSetEvent = 596#
cuGraphExecExternalSemaphoresSignalNodeSetParams = 624#
cuGraphExecExternalSemaphoresWaitNodeSetParams = 625#
cuGraphExecGetFlags = 658#
cuGraphExecHostNodeSetParams = 564#
cuGraphExecKernelNodeSetParams = 538#
cuGraphExecKernelNodeSetParams_v2 = 692#
cuGraphExecMemcpyNodeSetParams = 562#
cuGraphExecMemsetNodeSetParams = 563#
cuGraphExecNodeSetParams = 714#
cuGraphExecUpdate = 561#
cuGraphExecUpdate_v2 = 696#
cuGraphExternalSemaphoresSignalNodeGetParams = 619#
cuGraphExternalSemaphoresSignalNodeSetParams = 620#
cuGraphExternalSemaphoresWaitNodeGetParams = 622#
cuGraphExternalSemaphoresWaitNodeSetParams = 623#
cuGraphGetEdges = 535#
cuGraphGetEdges_v2 = 724#
cuGraphGetNodes = 534#
cuGraphGetRootNodes = 510#
cuGraphHostNodeGetParams = 531#
cuGraphHostNodeSetParams = 533#
cuGraphInstantiate = 513#
cuGraphInstantiateWithFlags = 643#
cuGraphInstantiateWithParams = 656#
cuGraphInstantiateWithParams_ptsz = 657#
cuGraphInstantiate_v2 = 578#
cuGraphKernelNodeCopyAttributes = 569#
cuGraphKernelNodeGetAttribute = 570#
cuGraphKernelNodeGetParams = 503#
cuGraphKernelNodeGetParams_v2 = 690#
cuGraphKernelNodeSetAttribute = 571#
cuGraphKernelNodeSetParams = 521#
cuGraphKernelNodeSetParams_v2 = 691#
cuGraphLaunch = 514#
cuGraphLaunch_ptsz = 515#
cuGraphMemAllocNodeGetParams = 648#
cuGraphMemFreeNodeGetParams = 649#
cuGraphMemcpyNodeGetParams = 505#
cuGraphMemcpyNodeSetParams = 520#
cuGraphMemsetNodeGetParams = 507#
cuGraphMemsetNodeSetParams = 508#
cuGraphNodeFindInClone = 524#
cuGraphNodeGetDependencies = 511#
cuGraphNodeGetDependencies_v2 = 725#
cuGraphNodeGetDependentNodes = 512#
cuGraphNodeGetDependentNodes_v2 = 726#
cuGraphNodeGetEnabled = 651#
cuGraphNodeGetType = 509#
cuGraphNodeSetEnabled = 650#
cuGraphNodeSetParams = 713#
cuGraphReleaseUserObject = 637#
cuGraphRemoveDependencies = 519#
cuGraphRemoveDependencies_v2 = 728#
cuGraphRetainUserObject = 636#
cuGraphUpload = 580#
cuGraphUpload_ptsz = 581#
cuGraphicsD3D10RegisterResource = 140#
cuGraphicsD3D11RegisterResource = 153#
cuGraphicsD3D9RegisterResource = 156#
cuGraphicsEGLRegisterImage = 390#
cuGraphicsGLRegisterBuffer = 175#
cuGraphicsGLRegisterImage = 176#
cuGraphicsMapResources = 133#
cuGraphicsMapResources_ptsz = 443#
cuGraphicsResourceGetMappedEglFrame = 449#
cuGraphicsResourceGetMappedMipmappedArray = 360#
cuGraphicsResourceGetMappedPointer = 130#
cuGraphicsResourceGetMappedPointer_v2 = 258#
cuGraphicsResourceSetMapFlags = 132#
cuGraphicsResourceSetMapFlags_v2 = 380#
cuGraphicsSubResourceGetMappedArray = 129#
cuGraphicsUnmapResources = 134#
cuGraphicsUnmapResources_ptsz = 444#
cuGraphicsUnregisterResource = 128#
cuGraphicsVDPAURegisterOutputSurface = 189#
cuGraphicsVDPAURegisterVideoSurface = 188#
cuGreenCtxCreate = 743#
cuGreenCtxDestroy = 744#
cuGreenCtxGetDevResource = 747#
cuGreenCtxGetId = 782#
cuGreenCtxRecordEvent = 749#
cuGreenCtxStreamCreate = 758#
cuGreenCtxWaitEvent = 750#
cuImportExternalMemory = 485#
cuImportExternalSemaphore = 489#
cuInit = 1#
cuIpcCloseMemHandle = 330#
cuIpcGetEventHandle = 334#
cuIpcGetMemHandle = 328#
cuIpcOpenEventHandle = 335#
cuIpcOpenMemHandle = 329#
cuIpcOpenMemHandle_v2 = 567#
cuKernelGetAttribute = 686#
cuKernelGetFunction = 683#
cuKernelGetLibrary = 754#
cuKernelGetName = 719#
cuKernelGetParamInfo = 734#
cuKernelSetAttribute = 687#
cuKernelSetCacheConfig = 688#
cuLaunch = 115#
cuLaunchCooperativeKernel = 477#
cuLaunchCooperativeKernelMultiDevice = 480#
cuLaunchCooperativeKernel_ptsz = 478#
cuLaunchGrid = 116#
cuLaunchGridAsync = 117#
cuLaunchHostFunc = 527#
cuLaunchHostFunc_ptsz = 528#
cuLaunchKernel = 307#
cuLaunchKernelEx = 652#
cuLaunchKernelEx_ptsz = 653#
cuLaunchKernel_ptsz = 442#
cuLibraryEnumerateKernels = 740#
cuLibraryGetGlobal = 684#
cuLibraryGetKernel = 681#
cuLibraryGetKernelCount = 739#
cuLibraryGetManaged = 685#
cuLibraryGetModule = 682#
cuLibraryGetUnifiedFunction = 700#
cuLibraryLoadData = 678#
cuLibraryLoadFromFile = 679#
cuLibraryUnload = 680#
cuLinkAddData = 363#
cuLinkAddData_v2 = 382#
cuLinkAddFile = 364#
cuLinkAddFile_v2 = 383#
cuLinkComplete = 365#
cuLinkCreate = 362#
cuLinkCreate_v2 = 381#
cuLinkDestroy = 366#
cuLogsCurrent = 765#
cuLogsDumpToFile = 766#
cuLogsDumpToMemory = 767#
cuLogsRegisterCallback = 763#
cuLogsUnregisterCallback = 764#
cuMemAddressFree = 548#
cuMemAddressReserve = 547#
cuMemAdvise = 457#
cuMemAdvise_v2 = 715#
cuMemAlloc = 29#
cuMemAllocAsync = 598#
cuMemAllocAsync_ptsz = 599#
cuMemAllocFromPoolAsync = 611#
cuMemAllocFromPoolAsync_ptsz = 612#
cuMemAllocHost = 37#
cuMemAllocHost_v2 = 294#
cuMemAllocManaged = 371#
cuMemAllocPitch = 31#
cuMemAllocPitch_v2 = 244#
cuMemAlloc_v2 = 243#
cuMemBatchDecompressAsync = 761#
cuMemBatchDecompressAsync_ptsz = 762#
cuMemCreate = 549#
cuMemDiscardAndPrefetchBatchAsync = 791#
cuMemDiscardAndPrefetchBatchAsync_ptsz = 792#
cuMemDiscardBatchAsync = 789#
cuMemDiscardBatchAsync_ptsz = 790#
cuMemExportToShareableHandle = 554#
cuMemFree = 33#
cuMemFreeAsync = 600#
cuMemFreeAsync_ptsz = 601#
cuMemFreeHost = 38#
cuMemFree_v2 = 245#
cuMemGetAccess = 558#
cuMemGetAddressRange = 35#
cuMemGetAddressRange_v2 = 246#
cuMemGetAllocationGranularity = 556#
cuMemGetAllocationPropertiesFromHandle = 557#
cuMemGetDefaultMemPool = 801#
cuMemGetHandleForAddressRange = 674#
cuMemGetInfo = 27#
cuMemGetInfo_v2 = 242#
cuMemGetMemPool = 802#
cuMemHostAlloc = 39#
cuMemHostAlloc_v2 = 271#
cuMemHostGetDevicePointer = 40#
cuMemHostGetDevicePointer_v2 = 247#
cuMemHostGetFlags = 42#
cuMemHostRegister = 301#
cuMemHostRegister_v2 = 379#
cuMemHostUnregister = 302#
cuMemImportFromShareableHandle = 555#
cuMemMap = 551#
cuMemMapArrayAsync = 584#
cuMemMapArrayAsync_ptsz = 585#
cuMemPeerGetDevicePointer = 317#
cuMemPeerRegister = 315#
cuMemPeerUnregister = 316#
cuMemPoolCreate = 607#
cuMemPoolDestroy = 608#
cuMemPoolExportPointer = 615#
cuMemPoolExportToShareableHandle = 613#
cuMemPoolGetAccess = 617#
cuMemPoolGetAttribute = 604#
cuMemPoolImportFromShareableHandle = 614#
cuMemPoolImportPointer = 616#
cuMemPoolSetAccess = 605#
cuMemPoolSetAttribute = 603#
cuMemPoolTrimTo = 602#
cuMemPrefetchAsync = 467#
cuMemPrefetchAsync_ptsz = 468#
cuMemPrefetchAsync_v2 = 716#
cuMemPrefetchAsync_v2_ptsz = 717#
cuMemPrefetchBatchAsync = 784#
cuMemPrefetchBatchAsync_ptsz = 785#
cuMemRangeGetAttribute = 471#
cuMemRangeGetAttributes = 472#
cuMemRelease = 550#
cuMemRetainAllocationHandle = 565#
cuMemSetAccess = 553#
cuMemSetMemPool = 803#
cuMemUnmap = 552#
cuMemcpy = 305#
cuMemcpy2D = 56#
cuMemcpy2DAsync = 68#
cuMemcpy2DAsync_v2 = 289#
cuMemcpy2DAsync_v2_ptsz = 424#
cuMemcpy2DUnaligned = 57#
cuMemcpy2DUnaligned_v2 = 288#
cuMemcpy2DUnaligned_v2_ptds = 406#
cuMemcpy2D_v2 = 287#
cuMemcpy2D_v2_ptds = 405#
cuMemcpy3D = 58#
cuMemcpy3DAsync = 69#
cuMemcpy3DAsync_v2 = 291#
cuMemcpy3DAsync_v2_ptsz = 425#
cuMemcpy3DBatchAsync = 778#
cuMemcpy3DBatchAsync_ptsz = 779#
cuMemcpy3DBatchAsync_v2 = 798#
cuMemcpy3DBatchAsync_v2_ptsz = 799#
cuMemcpy3DPeer = 320#
cuMemcpy3DPeerAsync = 321#
cuMemcpy3DPeerAsync_ptsz = 427#
cuMemcpy3DPeer_ptds = 410#
cuMemcpy3D_v2 = 290#
cuMemcpy3D_v2_ptds = 407#
cuMemcpyAsync = 306#
cuMemcpyAsync_ptsz = 418#
cuMemcpyAtoA = 55#
cuMemcpyAtoA_v2 = 286#
cuMemcpyAtoA_v2_ptds = 404#
cuMemcpyAtoD = 51#
cuMemcpyAtoD_v2 = 284#
cuMemcpyAtoD_v2_ptds = 401#
cuMemcpyAtoH = 54#
cuMemcpyAtoHAsync = 67#
cuMemcpyAtoHAsync_v2 = 283#
cuMemcpyAtoHAsync_v2_ptsz = 420#
cuMemcpyAtoH_v2 = 282#
cuMemcpyAtoH_v2_ptds = 403#
cuMemcpyBatchAsync = 776#
cuMemcpyBatchAsync_ptsz = 777#
cuMemcpyBatchAsync_v2 = 796#
cuMemcpyBatchAsync_v2_ptsz = 797#
cuMemcpyDtoA = 49#
cuMemcpyDtoA_v2 = 285#
cuMemcpyDtoA_v2_ptds = 400#
cuMemcpyDtoD = 47#
cuMemcpyDtoDAsync = 64#
cuMemcpyDtoDAsync_v2 = 281#
cuMemcpyDtoDAsync_v2_ptsz = 423#
cuMemcpyDtoD_v2 = 280#
cuMemcpyDtoD_v2_ptds = 399#
cuMemcpyDtoH = 45#
cuMemcpyDtoHAsync = 62#
cuMemcpyDtoHAsync_v2 = 279#
cuMemcpyDtoHAsync_v2_ptsz = 422#
cuMemcpyDtoH_v2 = 278#
cuMemcpyDtoH_v2_ptds = 398#
cuMemcpyHtoA = 53#
cuMemcpyHtoAAsync = 66#
cuMemcpyHtoAAsync_v2 = 293#
cuMemcpyHtoAAsync_v2_ptsz = 419#
cuMemcpyHtoA_v2 = 292#
cuMemcpyHtoA_v2_ptds = 402#
cuMemcpyHtoD = 43#
cuMemcpyHtoDAsync = 60#
cuMemcpyHtoDAsync_v2 = 277#
cuMemcpyHtoDAsync_v2_ptsz = 421#
cuMemcpyHtoD_v2 = 276#
cuMemcpyHtoD_v2_ptds = 397#
cuMemcpyPeer = 318#
cuMemcpyPeerAsync = 319#
cuMemcpyPeerAsync_ptsz = 426#
cuMemcpyPeer_ptds = 409#
cuMemcpy_ptds = 408#
cuMemcpy_v2 = 248#
cuMemsetD16 = 73#
cuMemsetD16Async = 218#
cuMemsetD16Async_ptsz = 429#
cuMemsetD16_v2 = 250#
cuMemsetD16_v2_ptds = 412#
cuMemsetD2D16 = 79#
cuMemsetD2D16Async = 224#
cuMemsetD2D16Async_ptsz = 432#
cuMemsetD2D16_v2 = 253#
cuMemsetD2D16_v2_ptds = 415#
cuMemsetD2D32 = 81#
cuMemsetD2D32Async = 226#
cuMemsetD2D32Async_ptsz = 433#
cuMemsetD2D32_v2 = 254#
cuMemsetD2D32_v2_ptds = 416#
cuMemsetD2D8 = 77#
cuMemsetD2D8Async = 222#
cuMemsetD2D8Async_ptsz = 431#
cuMemsetD2D8_v2 = 252#
cuMemsetD2D8_v2_ptds = 414#
cuMemsetD32 = 75#
cuMemsetD32Async = 220#
cuMemsetD32Async_ptsz = 430#
cuMemsetD32_v2 = 251#
cuMemsetD32_v2_ptds = 413#
cuMemsetD8 = 71#
cuMemsetD8Async = 216#
cuMemsetD8Async_ptsz = 428#
cuMemsetD8_v2 = 249#
cuMemsetD8_v2_ptds = 411#
cuMipmappedArrayCreate = 347#
cuMipmappedArrayDestroy = 349#
cuMipmappedArrayGetLevel = 348#
cuMipmappedArrayGetMemoryRequirements = 655#
cuMipmappedArrayGetSparseProperties = 583#
cuModuleEnumerateFunctions = 738#
cuModuleGetFunction = 23#
cuModuleGetFunctionCount = 737#
cuModuleGetGlobal = 24#
cuModuleGetGlobal_v2 = 241#
cuModuleGetLoadingMode = 673#
cuModuleGetSurfRef = 190#
cuModuleGetTexRef = 26#
cuModuleLoad = 18#
cuModuleLoadData = 19#
cuModuleLoadDataEx = 20#
cuModuleLoadFatBinary = 21#
cuModuleUnload = 22#
cuMultiKernelCooperativeDomainCreate = 793#
cuMultiKernelCooperativeDomainDestroy = 794#
cuMulticastAddDevice = 707#
cuMulticastBindAddr = 709#
cuMulticastBindMem = 708#
cuMulticastCreate = 706#
cuMulticastGetGranularity = 711#
cuMulticastUnbind = 710#
cuNNSetAllocator = 466#
cuNVNbufferGetPointer = 464#
cuNVNtextureGetArray = 465#
cuOccupancyAvailableDynamicSMemPerBlock = 543#
cuOccupancyMaxActiveBlocksPerMultiprocessor = 374#
cuOccupancyMaxActiveBlocksPerMultiprocessorWithFlags = 451#
cuOccupancyMaxActiveClusters = 676#
cuOccupancyMaxPotentialBlockSize = 384#
cuOccupancyMaxPotentialBlockSizeWithFlags = 452#
cuOccupancyMaxPotentialClusterSize = 675#
cuParamSetSize = 110#
cuParamSetTexRef = 114#
cuParamSetf = 112#
cuParamSeti = 111#
cuParamSetv = 113#
cuPointerGetAttribute = 310#
cuPointerGetAttributes = 450#
cuPointerSetAttribute = 378#
cuProfilerInitialize = 311#
cuProfilerStart = 308#
cuProfilerStop = 309#
cuSemaphoreCreate = 786#
cuSemaphoreDestroy = 788#
cuSemaphoreExport = 787#
cuSignalExternalSemaphoresAsync = 490#
cuSignalExternalSemaphoresAsync_ptsz = 491#
cuStreamAddCallback = 346#
cuStreamAddCallback_ptsz = 437#
cuStreamAttachMemAsync = 377#
cuStreamAttachMemAsync_ptsz = 438#
cuStreamBatchMemOp = 462#
cuStreamBatchMemOp_ptsz = 463#
cuStreamBatchMemOp_v2 = 667#
cuStreamBatchMemOp_v2_ptsz = 668#
cuStreamBeginCapture = 495#
cuStreamBeginCaptureToGraph = 720#
cuStreamBeginCaptureToGraph_ptsz = 721#
cuStreamBeginCapture_ptsz = 496#
cuStreamBeginCapture_v2 = 539#
cuStreamBeginCapture_v2_ptsz = 540#
cuStreamCopyAttributes = 572#
cuStreamCopyAttributes_ptsz = 573#
cuStreamCreate = 124#
cuStreamCreateForCaptureToCig = 783#
cuStreamCreateWithPriority = 367#
cuStreamDestroy = 127#
cuStreamDestroy_v2 = 326#
cuStreamEndCapture = 497#
cuStreamEndCapture_ptsz = 498#
cuStreamGetAttribute = 574#
cuStreamGetAttribute_ptsz = 575#
cuStreamGetCaptureInfo = 536#
cuStreamGetCaptureInfo_ptsz = 537#
cuStreamGetCaptureInfo_v2 = 629#
cuStreamGetCaptureInfo_v2_ptsz = 630#
cuStreamGetCaptureInfo_v3 = 729#
cuStreamGetCaptureInfo_v3_ptsz = 730#
cuStreamGetCtx = 483#
cuStreamGetCtx_ptsz = 484#
cuStreamGetCtx_v2 = 759#
cuStreamGetCtx_v2_ptsz = 760#
cuStreamGetDevice = 774#
cuStreamGetDevice_ptsz = 775#
cuStreamGetFlags = 369#
cuStreamGetFlags_ptsz = 435#
cuStreamGetGreenCtx = 752#
cuStreamGetId = 693#
cuStreamGetId_ptsz = 694#
cuStreamGetPriority = 368#
cuStreamGetPriority_ptsz = 434#
cuStreamIsCapturing = 499#
cuStreamIsCapturing_ptsz = 500#
cuStreamQuery = 125#
cuStreamQuery_ptsz = 439#
cuStreamSetAttribute = 576#
cuStreamSetAttribute_ptsz = 577#
cuStreamSetFlags = 559#
cuStreamSetFlags_ptsz = 560#
cuStreamSynchronize = 126#
cuStreamSynchronize_ptsz = 440#
cuStreamUpdateCaptureDependencies = 631#
cuStreamUpdateCaptureDependencies_ptsz = 632#
cuStreamUpdateCaptureDependencies_v2 = 731#
cuStreamUpdateCaptureDependencies_v2_ptsz = 732#
cuStreamWaitEvent = 295#
cuStreamWaitEvent_ptsz = 436#
cuStreamWaitValue32 = 458#
cuStreamWaitValue32_ptsz = 459#
cuStreamWaitValue32_v2 = 659#
cuStreamWaitValue32_v2_ptsz = 660#
cuStreamWaitValue64 = 473#
cuStreamWaitValue64_ptsz = 474#
cuStreamWaitValue64_v2 = 661#
cuStreamWaitValue64_v2_ptsz = 662#
cuStreamWriteValue32 = 460#
cuStreamWriteValue32_ptsz = 461#
cuStreamWriteValue32_v2 = 663#
cuStreamWriteValue32_v2_ptsz = 664#
cuStreamWriteValue64 = 475#
cuStreamWriteValue64_ptsz = 476#
cuStreamWriteValue64_v2 = 665#
cuStreamWriteValue64_v2_ptsz = 666#
cuSurfObjectCreate = 343#
cuSurfObjectDestroy = 344#
cuSurfObjectGetResourceDesc = 345#
cuSurfRefCreate = 191#
cuSurfRefDestroy = 192#
cuSurfRefGetArray = 196#
cuSurfRefGetFormat = 195#
cuSurfRefSetArray = 194#
cuSurfRefSetFormat = 193#
cuTensorMapEncodeIm2col = 698#
cuTensorMapEncodeIm2colWide = 781#
cuTensorMapEncodeTiled = 697#
cuTensorMapReplaceAddress = 699#
cuTexObjectCreate = 339#
cuTexObjectDestroy = 340#
cuTexObjectGetResourceDesc = 341#
cuTexObjectGetResourceViewDesc = 361#
cuTexObjectGetTextureDesc = 342#
cuTexRefCreate = 92#
cuTexRefDestroy = 93#
cuTexRefGetAddress = 103#
cuTexRefGetAddressMode = 106#
cuTexRefGetAddress_v2 = 257#
cuTexRefGetArray = 105#
cuTexRefGetBorderColor = 456#
cuTexRefGetFilterMode = 107#
cuTexRefGetFlags = 109#
cuTexRefGetFormat = 108#
cuTexRefGetMaxAnisotropy = 359#
cuTexRefGetMipmapFilterMode = 356#
cuTexRefGetMipmapLevelBias = 357#
cuTexRefGetMipmapLevelClamp = 358#
cuTexRefGetMipmappedArray = 355#
cuTexRefSetAddress = 95#
cuTexRefSetAddress2D = 97#
cuTexRefSetAddress2D_v2 = 256#
cuTexRefSetAddress2D_v3 = 327#
cuTexRefSetAddressMode = 100#
cuTexRefSetAddress_v2 = 255#
cuTexRefSetArray = 94#
cuTexRefSetBorderColor = 455#
cuTexRefSetFilterMode = 101#
cuTexRefSetFlags = 102#
cuTexRefSetFormat = 99#
cuTexRefSetMaxAnisotropy = 354#
cuTexRefSetMipmapFilterMode = 351#
cuTexRefSetMipmapLevelBias = 352#
cuTexRefSetMipmapLevelClamp = 353#
cuTexRefSetMipmappedArray = 350#
cuThreadExchangeStreamCaptureMode = 541#
cuUserObjectCreate = 633#
cuUserObjectRelease = 635#
cuUserObjectRetain = 634#
cuVDPAUCtxCreate = 187#
cuVDPAUCtxCreate_v2 = 240#
cuVDPAUGetDevice = 186#
cuWGLGetDevice = 177#
cuWaitExternalSemaphoresAsync = 492#
cuWaitExternalSemaphoresAsync_ptsz = 493#
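Many driver callback IDs above come in families: a base name plus `_v2`/`_v3`, `_ptds` and `_ptsz` variants, each with its own integer value. When aggregating trace data per function, it can help to fold these variants back onto the base name. The following is a minimal sketch of that idea, not part of the CUPTI Python API. `DriverCbid` is a hypothetical stand-in enum holding a few values copied from the list above, so the snippet runs without CUPTI installed, and `base_name` and `group_by_base` are illustrative helpers. Reading `_ptds`/`_ptsz` as the per-thread default stream variants is an assumption that this listing does not state.

```python
import re
from collections import defaultdict
from enum import IntEnum


class DriverCbid(IntEnum):
    """Hypothetical stand-in with a few values copied from the list above."""
    cuMemcpyHtoD = 43
    cuMemcpyHtoD_v2 = 276
    cuMemcpyHtoD_v2_ptds = 397
    cuMemcpyHtoDAsync_v2_ptsz = 421
    cuLaunchKernel = 307
    cuLaunchKernel_ptsz = 442
    cuStreamGetCaptureInfo_v3_ptsz = 730


# Trailing _vN, _ptds and _ptsz markers, in any combination.
_SUFFIX = re.compile(r"(?:_v\d+|_ptds|_ptsz)+$")


def base_name(member: IntEnum) -> str:
    """Return the member name with its version/stream suffixes removed."""
    return _SUFFIX.sub("", member.name)


def group_by_base(members):
    """Map each base function name to the callback IDs of its variants."""
    groups = defaultdict(list)
    for member in members:
        groups[base_name(member)].append(int(member))
    return dict(groups)


groups = group_by_base(DriverCbid)
for name, ids in groups.items():
    print(name, ids)
```

With the real enum, the same helpers could presumably be applied by passing `cupti.cupti.driver_api_trace_cbid` instead of the stand-in. This relies only on it being an `IntEnum` with the member names shown above.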
class cupti.cupti.runtime_api_trace_cbid(value)#

Bases: IntEnum

Callback IDs for the CUDA runtime API functions. See CUpti_runtime_api_trace_cbid in the CUPTI C documentation.
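Two properties of the members of this class are worth keeping in mind. First, each name ends in a `_vNNNN` suffix. Second, several names share one integer value (for example `cudaEventElapsedTime_v12080` and `cudaEventElapsedTime_v2_v12080` are both 486), which makes them aliases in a Python `IntEnum`. The sketch below illustrates both with a hypothetical stand-in enum, `RuntimeCbid`, whose values are copied from the member list of this class. `split_name` is an illustrative helper, not a CUPTI function. Decoding the suffix as a CUDA version (major * 1000 + minor * 10, as in `CUDART_VERSION`) is an assumption, and which alias is the canonical member in the real enum depends on its definition order, which this listing does not show.

```python
import re
from enum import IntEnum


class RuntimeCbid(IntEnum):
    """Hypothetical stand-in; values copied from this class's member list."""
    cudaMalloc_v3020 = 20
    cudaLaunchKernel_v7000 = 211
    cudaLibraryLoadData_v12060 = 470
    cudaEventElapsedTime_v12080 = 486
    cudaEventElapsedTime_v2_v12080 = 486  # same value, so this is an alias


_NAME = re.compile(r"^(?P<func>.+)_v(?P<ver>\d{4,5})$")


def split_name(member: IntEnum):
    """Split a member name into (function, (major, minor)).

    The version decoding is an assumption; see the text above.
    """
    match = _NAME.match(member.name)
    if match is None:
        return member.name, None
    ver = int(match.group("ver"))
    return match.group("func"), (ver // 1000, (ver % 1000) // 10)


# Looking up by value returns the first-defined name for that value.
canonical = RuntimeCbid(486)
print(canonical.name)

# Iteration skips aliases; __members__ includes them.
print(len(list(RuntimeCbid)), len(RuntimeCbid.__members__))

print(split_name(RuntimeCbid.cudaLibraryLoadData_v12060))
```

Because value lookup returns only one name per value, code that compares callback IDs is usually more robust when it compares integer values (or enum members) rather than name strings.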

FORCE_INT = 2147483647#
INVALID = 0#
SIZE = 523#
cuda470_v12060 = 470#
cuda471_v12060 = 471#
cuda472_v12060 = 472#
cuda473_v12060 = 473#
cuda474_v12060 = 474#
cuda475_v12060 = 475#
cuda476_v12060 = 476#
cuda477_v12060 = 477#
cuda478_v12060 = 478#
cuda479_v12060 = 479#
cudaArrayGetInfo_v4010 = 181#
cudaArrayGetMemoryRequirements_v11060 = 428#
cudaArrayGetPlane_v11020 = 381#
cudaArrayGetSparseProperties_v11010 = 359#
cudaBindSurfaceToArray_v3020 = 61#
cudaBindTexture2D_v3020 = 56#
cudaBindTextureToArray_v3020 = 57#
cudaBindTextureToMipmappedArray_v5000 = 195#
cudaBindTexture_v3020 = 55#
cudaChooseDevice_v3020 = 5#
cudaConfigureCall_v3020 = 8#
cudaCreateChannelDesc_v3020 = 7#
cudaCreateSurfaceObject_v5000 = 189#
cudaCreateTextureObject_v2_v11080 = 434#
cudaCreateTextureObject_v5000 = 185#
cudaCtxResetPersistingL2Cache_v11000 = 337#
cudaD3D10GetDevice_v3020 = 88#
cudaD3D10GetDevices_v3020 = 89#
cudaD3D10GetDirect3DDevice_v3020 = 149#
cudaD3D10MapResources_v3020 = 94#
cudaD3D10RegisterResource_v3020 = 92#
cudaD3D10ResourceGetMappedArray_v3020 = 98#
cudaD3D10ResourceGetMappedPitch_v3020 = 101#
cudaD3D10ResourceGetMappedPointer_v3020 = 99#
cudaD3D10ResourceGetMappedSize_v3020 = 100#
cudaD3D10ResourceGetSurfaceDimensions_v3020 = 97#
cudaD3D10ResourceSetMapFlags_v3020 = 96#
cudaD3D10SetDirect3DDevice_v3020 = 90#
cudaD3D10UnmapResources_v3020 = 95#
cudaD3D10UnregisterResource_v3020 = 93#
cudaD3D11GetDevice_v3020 = 84#
cudaD3D11GetDevices_v3020 = 85#
cudaD3D11GetDirect3DDevice_v3020 = 148#
cudaD3D11SetDirect3DDevice_v3020 = 86#
cudaD3D9Begin_v3020 = 117#
cudaD3D9End_v3020 = 118#
cudaD3D9GetDevice_v3020 = 102#
cudaD3D9GetDevices_v3020 = 103#
cudaD3D9GetDirect3DDevice_v3020 = 105#
cudaD3D9MapResources_v3020 = 109#
cudaD3D9MapVertexBuffer_v3020 = 121#
cudaD3D9RegisterResource_v3020 = 107#
cudaD3D9RegisterVertexBuffer_v3020 = 119#
cudaD3D9ResourceGetMappedArray_v3020 = 113#
cudaD3D9ResourceGetMappedPitch_v3020 = 116#
cudaD3D9ResourceGetMappedPointer_v3020 = 114#
cudaD3D9ResourceGetMappedSize_v3020 = 115#
cudaD3D9ResourceGetSurfaceDimensions_v3020 = 112#
cudaD3D9ResourceSetMapFlags_v3020 = 111#
cudaD3D9SetDirect3DDevice_v3020 = 104#
cudaD3D9UnmapResources_v3020 = 110#
cudaD3D9UnmapVertexBuffer_v3020 = 122#
cudaD3D9UnregisterResource_v3020 = 108#
cudaD3D9UnregisterVertexBuffer_v3020 = 120#
cudaDestroyExternalMemory_v10000 = 277#
cudaDestroyExternalSemaphore_v10000 = 283#
cudaDestroySurfaceObject_v5000 = 190#
cudaDestroyTextureObject_v5000 = 186#
cudaDeviceCanAccessPeer_v4000 = 154#
cudaDeviceDisablePeerAccess_v4000 = 156#
cudaDeviceEnablePeerAccess_v4000 = 155#
cudaDeviceFlushGPUDirectRDMAWrites_v11030 = 405#
cudaDeviceGetAttribute_v5000 = 200#
cudaDeviceGetByPCIBusId_v4010 = 173#
cudaDeviceGetCacheConfig_v3020 = 168#
cudaDeviceGetDefaultMemPool_v11020 = 372#
cudaDeviceGetGraphMemAttribute_v11040 = 424#
cudaDeviceGetHostAtomicCapabilities_v13000 = 521#
cudaDeviceGetLimit_v3020 = 166#
cudaDeviceGetMemPool_v11020 = 386#
cudaDeviceGetNvSciSyncAttributes_v10020 = 328#
cudaDeviceGetP2PAtomicCapabilities_v13000 = 522#
cudaDeviceGetP2PAttribute_v8000 = 255#
cudaDeviceGetPCIBusId_v4010 = 174#
cudaDeviceGetSharedMemConfig_v4020 = 183#
cudaDeviceGetStreamPriorityRange_v5050 = 205#
cudaDeviceGetTexture1DLinearMaxWidth_v11010 = 347#
cudaDeviceGraphMemTrim_v11040 = 423#
cudaDeviceRegisterAsyncNotification_v12040 = 465#
cudaDeviceReset_v3020 = 164#
cudaDeviceSetCacheConfig_v3020 = 169#
cudaDeviceSetGraphMemAttribute_v11040 = 425#
cudaDeviceSetLimit_v3020 = 167#
cudaDeviceSetMemPool_v11020 = 385#
cudaDeviceSetSharedMemConfig_v4020 = 184#
cudaDeviceSynchronize_v3020 = 165#
cudaDeviceUnregisterAsyncNotification_v12040 = 466#
cudaDriverGetVersion_v3020 = 1#
cudaEGLStreamConsumerAcquireFrame_v7000 = 259#
cudaEGLStreamConsumerConnectWithFlags_v7000 = 268#
cudaEGLStreamConsumerConnect_v7000 = 257#
cudaEGLStreamConsumerDisconnect_v7000 = 258#
cudaEGLStreamConsumerReleaseFrame_v7000 = 260#
cudaEGLStreamProducerConnect_v7000 = 261#
cudaEGLStreamProducerDisconnect_v7000 = 262#
cudaEGLStreamProducerPresentFrame_v7000 = 263#
cudaEGLStreamProducerReturnFrame_v7000 = 264#
cudaEventCreateFromEGLSync_v9000 = 271#
cudaEventCreateWithFlags_v3020 = 134#
cudaEventCreate_v3020 = 133#
cudaEventDestroy_v3020 = 136#
cudaEventElapsedTime_v12080 = 486#
cudaEventElapsedTime_v2_v12080 = 486#
cudaEventElapsedTime_v3020 = 139#
cudaEventQuery_v3020 = 138#
cudaEventRecordWithFlags_ptsz_v11010 = 371#
cudaEventRecordWithFlags_v11010 = 370#
cudaEventRecord_ptsz_v7000 = 242#
cudaEventRecord_v3020 = 135#
cudaEventSynchronize_v3020 = 137#
cudaExternalMemoryGetMappedBuffer_v10000 = 275#
cudaExternalMemoryGetMappedMipmappedArray_v10000 = 276#
cudaFreeArray_v3020 = 24#
cudaFreeAsync_ptsz_v11020 = 376#
cudaFreeAsync_v11020 = 375#
cudaFreeHost_v3020 = 26#
cudaFreeMipmappedArray_v5000 = 194#
cudaFree_v3020 = 22#
cudaFuncGetAttributes_v3020 = 15#
cudaFuncGetName_v12030 = 451#
cudaFuncGetParamInfo_v12040 = 467#
cudaFuncSetAttribute_v9000 = 273#
cudaFuncSetCacheConfig_v3020 = 14#
cudaFuncSetSharedMemConfig_v4020 = 182#
cudaGLGetDevices_v4010 = 175#
cudaGLMapBufferObjectAsync_v3020 = 69#
cudaGLMapBufferObject_v3020 = 65#
cudaGLRegisterBufferObject_v3020 = 64#
cudaGLSetBufferObjectMapFlags_v3020 = 68#
cudaGLSetGLDevice_v3020 = 63#
cudaGLUnmapBufferObjectAsync_v3020 = 70#
cudaGLUnmapBufferObject_v3020 = 66#
cudaGLUnregisterBufferObject_v3020 = 67#
cudaGetChannelDesc_v3020 = 6#
cudaGetDeviceCount_v3020 = 3#
cudaGetDeviceFlags_v7000 = 212#
cudaGetDeviceProperties_v12000 = 440#
cudaGetDeviceProperties_v2_v12000 = 440#
cudaGetDeviceProperties_v3020 = 4#
cudaGetDevice_v3020 = 17#
cudaGetDriverEntryPointByVersion_ptsz_v12050 = 469#
cudaGetDriverEntryPointByVersion_v12050 = 468#
cudaGetDriverEntryPoint_ptsz_v11030 = 407#
cudaGetDriverEntryPoint_v11030 = 406#
cudaGetErrorName_v6050 = 209#
cudaGetErrorString_v3020 = 12#
cudaGetExportTable_v13000 = 493#
cudaGetFuncBySymbol_v11000 = 336#
cudaGetKernel_v12000 = 439#
cudaGetLastError_v3020 = 10#
cudaGetMipmappedArrayLevel_v5000 = 193#
cudaGetSurfaceObjectResourceDesc_v5000 = 191#
cudaGetSurfaceReference_v3020 = 62#
cudaGetSymbolAddress_v3020 = 53#
cudaGetSymbolSize_v3020 = 54#
cudaGetTextureAlignmentOffset_v3020 = 59#
cudaGetTextureObjectResourceDesc_v5000 = 187#
cudaGetTextureObjectResourceViewDesc_v5000 = 199#
cudaGetTextureObjectTextureDesc_v2_v11080 = 435#
cudaGetTextureObjectTextureDesc_v5000 = 188#
cudaGetTextureReference_v3020 = 60#
cudaGraphAddChildGraphNode_v10000 = 298#
cudaGraphAddDependencies_v10000 = 307#
cudaGraphAddDependencies_v12030 = 458#
cudaGraphAddDependencies_v2_v12030 = 458#
cudaGraphAddEmptyNode_v10000 = 300#
cudaGraphAddEventRecordNode_v11010 = 362#
cudaGraphAddEventWaitNode_v11010 = 365#
cudaGraphAddExternalSemaphoresSignalNode_v11020 = 397#
cudaGraphAddExternalSemaphoresWaitNode_v11020 = 400#
cudaGraphAddHostNode_v10000 = 296#
cudaGraphAddKernelNode_v10000 = 289#
cudaGraphAddMemAllocNode_v11040 = 419#
cudaGraphAddMemFreeNode_v11040 = 421#
cudaGraphAddMemcpyNode1D_v11010 = 352#
cudaGraphAddMemcpyNodeFromSymbol_v11010 = 351#
cudaGraphAddMemcpyNodeToSymbol_v11010 = 350#
cudaGraphAddMemcpyNode_v10000 = 290#
cudaGraphAddMemsetNode_v10000 = 293#
cudaGraphAddNode_v12020 = 445#
cudaGraphAddNode_v12030 = 460#
cudaGraphAddNode_v2_v12030 = 460#
cudaGraphChildGraphNodeGetGraph_v10000 = 299#
cudaGraphClone_v10000 = 301#
cudaGraphConditionalHandleCreate_v12030 = 454#
cudaGraphCreate_v10000 = 286#
cudaGraphDebugDotPrint_v11030 = 408#
cudaGraphDestroyNode_v10000 = 309#
cudaGraphDestroy_v10000 = 314#
cudaGraphEventRecordNodeGetEvent_v11010 = 363#
cudaGraphEventRecordNodeSetEvent_v11010 = 364#
cudaGraphEventWaitNodeGetEvent_v11010 = 366#
cudaGraphEventWaitNodeSetEvent_v11010 = 367#
cudaGraphExecChildGraphNodeSetParams_v11010 = 361#
cudaGraphExecDestroy_v10000 = 313#
cudaGraphExecEventRecordNodeSetEvent_v11010 = 368#
cudaGraphExecEventWaitNodeSetEvent_v11010 = 369#
cudaGraphExecExternalSemaphoresSignalNodeSetParams_v11020 = 403#
cudaGraphExecExternalSemaphoresWaitNodeSetParams_v11020 = 404#
cudaGraphExecGetFlags_v12000 = 438#
cudaGraphExecHostNodeSetParams_v10020 = 334#
cudaGraphExecKernelNodeSetParams_v10010 = 326#
cudaGraphExecMemcpyNodeSetParams1D_v11010 = 358#
cudaGraphExecMemcpyNodeSetParamsFromSymbol_v11010 = 357#
cudaGraphExecMemcpyNodeSetParamsToSymbol_v11010 = 356#
cudaGraphExecMemcpyNodeSetParams_v10020 = 332#
cudaGraphExecMemsetNodeSetParams_v10020 = 333#
cudaGraphExecNodeSetParams_v12020 = 447#
cudaGraphExecUpdate_v10020 = 335#
cudaGraphExternalSemaphoresSignalNodeGetParams_v11020 = 398#
cudaGraphExternalSemaphoresSignalNodeSetParams_v11020 = 399#
cudaGraphExternalSemaphoresWaitNodeGetParams_v11020 = 401#
cudaGraphExternalSemaphoresWaitNodeSetParams_v11020 = 402#
cudaGraphGetEdges_v10000 = 323#
cudaGraphGetEdges_v12030 = 455#
cudaGraphGetEdges_v2_v12030 = 455#
cudaGraphGetNodes_v10000 = 322#
cudaGraphGetRootNodes_v10000 = 304#
cudaGraphHostNodeGetParams_v10000 = 297#
cudaGraphHostNodeSetParams_v10000 = 321#
cudaGraphInstantiateWithFlags_v11040 = 418#
cudaGraphInstantiateWithParams_ptsz_v12000 = 437#
cudaGraphInstantiateWithParams_v12000 = 436#
cudaGraphInstantiate_v10000 = 310#
cudaGraphInstantiate_v12000 = 443#
cudaGraphKernelNodeCopyAttributes_v11000 = 338#
cudaGraphKernelNodeGetAttribute_v11000 = 339#
cudaGraphKernelNodeGetParams_v10000 = 287#
cudaGraphKernelNodeSetAttribute_v11000 = 340#
cudaGraphKernelNodeSetParams_v10000 = 288#
cudaGraphLaunch_ptsz_v10000 = 312#
cudaGraphLaunch_v10000 = 311#
cudaGraphMemAllocNodeGetParams_v11040 = 420#
cudaGraphMemFreeNodeGetParams_v11040 = 422#
cudaGraphMemcpyNodeGetParams_v10000 = 291#
cudaGraphMemcpyNodeSetParams1D_v11010 = 355#
cudaGraphMemcpyNodeSetParamsFromSymbol_v11010 = 354#
cudaGraphMemcpyNodeSetParamsToSymbol_v11010 = 353#
cudaGraphMemcpyNodeSetParams_v10000 = 292#
cudaGraphMemsetNodeGetParams_v10000 = 294#
cudaGraphMemsetNodeSetParams_v10000 = 295#
cudaGraphNodeFindInClone_v10000 = 302#
cudaGraphNodeGetDependencies_v10000 = 305#
cudaGraphNodeGetDependencies_v12030 = 456#
cudaGraphNodeGetDependencies_v2_v12030 = 456#
cudaGraphNodeGetDependentNodes_v10000 = 306#
cudaGraphNodeGetDependentNodes_v12030 = 457#
cudaGraphNodeGetDependentNodes_v2_v12030 = 457#
cudaGraphNodeGetEnabled_v11060 = 427#
cudaGraphNodeGetType_v10000 = 303#
cudaGraphNodeSetEnabled_v11060 = 426#
cudaGraphNodeSetParams_v12020 = 446#
cudaGraphReleaseUserObject_v11030 = 417#
cudaGraphRemoveDependencies_v10000 = 308#
cudaGraphRemoveDependencies_v12030 = 459#
cudaGraphRemoveDependencies_v2_v12030 = 459#
cudaGraphRetainUserObject_v11030 = 416#
cudaGraphUpload_ptsz_v10000 = 349#
cudaGraphUpload_v10000 = 348#
cudaGraphicsD3D10RegisterResource_v3020 = 91#
cudaGraphicsD3D11RegisterResource_v3020 = 87#
cudaGraphicsD3D9RegisterResource_v3020 = 106#
cudaGraphicsEGLRegisterImage_v7000 = 256#
cudaGraphicsGLRegisterBuffer_v3020 = 73#
cudaGraphicsGLRegisterImage_v3020 = 72#
cudaGraphicsMapResources_v3020 = 76#
cudaGraphicsResourceGetMappedEglFrame_v7000 = 265#
cudaGraphicsResourceGetMappedMipmappedArray_v5000 = 196#
cudaGraphicsResourceGetMappedPointer_v3020 = 78#
cudaGraphicsResourceSetMapFlags_v3020 = 75#
cudaGraphicsSubResourceGetMappedArray_v3020 = 79#
cudaGraphicsUnmapResources_v3020 = 77#
cudaGraphicsUnregisterResource_v3020 = 74#
cudaGraphicsVDPAURegisterOutputSurface_v3020 = 83#
cudaGraphicsVDPAURegisterVideoSurface_v3020 = 82#
cudaHostAlloc_v3020 = 27#
cudaHostGetDevicePointer_v3020 = 28#
cudaHostGetFlags_v3020 = 29#
cudaHostRegister_v4000 = 152#
cudaHostUnregister_v4000 = 153#
cudaImportExternalMemory_v10000 = 274#
cudaImportExternalSemaphore_v10000 = 278#
cudaInitDevice_v12000 = 444#
cudaIpcCloseMemHandle_v4010 = 180#
cudaIpcGetEventHandle_v4010 = 176#
cudaIpcGetMemHandle_v4010 = 178#
cudaIpcOpenEventHandle_v4010 = 177#
cudaIpcOpenMemHandle_v4010 = 179#
cudaKernelSetAttributeForDevice_v12060 = 479#
cudaLaunchCooperativeKernelMultiDevice_v9000 = 272#
cudaLaunchCooperativeKernel_ptsz_v9000 = 270#
cudaLaunchCooperativeKernel_v9000 = 269#
cudaLaunchHostFunc_ptsz_v10000 = 285#
cudaLaunchHostFunc_v10000 = 284#
cudaLaunchKernelExC_ptsz_v11060 = 431#
cudaLaunchKernelExC_v11060 = 430#
cudaLaunchKernel_ptsz_v7000 = 214#
cudaLaunchKernel_v7000 = 211#
cudaLaunch_ptsz_v7000 = 213#
cudaLaunch_v3020 = 13#
cudaLibraryEnumerateKernels_v12060 = 478#
cudaLibraryGetGlobal_v12060 = 474#
cudaLibraryGetKernelCount_v12060 = 477#
cudaLibraryGetKernel_v12060 = 473#
cudaLibraryGetManaged_v12060 = 475#
cudaLibraryGetUnifiedFunction_v12060 = 476#
cudaLibraryLoadData_v12060 = 470#
cudaLibraryLoadFromFile_v12060 = 471#
cudaLibraryUnload_v12060 = 472#
cudaLogsCurrent_v13000 = 515#
cudaLogsDumpToFile_v13000 = 516#
cudaLogsDumpToMemory_v13000 = 517#
cudaLogsRegisterCallback_v13000 = 513#
cudaLogsUnregisterCallback_v13000 = 514#
cudaMalloc3DArray_v3020 = 141#
cudaMalloc3D_v3020 = 140#
cudaMallocArray_v3020 = 23#
cudaMallocAsync_ptsz_v11020 = 374#
cudaMallocAsync_v11020 = 373#
cudaMallocFromPoolAsync_ptsz_v11020 = 392#
cudaMallocFromPoolAsync_v11020 = 391#
cudaMallocHost_v3020 = 25#
cudaMallocManaged_v6000 = 206#
cudaMallocMipmappedArray_v5000 = 192#
cudaMallocPitch_v3020 = 21#
cudaMalloc_v3020 = 20#
cudaMemAdvise_v12020 = 448#
cudaMemAdvise_v2_v12020 = 448#
cudaMemAdvise_v8000 = 254#
cudaMemDiscardAndPrefetchBatchAsync_ptsz_v13000 = 492#
cudaMemDiscardAndPrefetchBatchAsync_v13000 = 491#
cudaMemDiscardBatchAsync_ptsz_v13000 = 490#
cudaMemDiscardBatchAsync_v13000 = 489#
cudaMemGetDefaultMemPool_v13000 = 518#
cudaMemGetInfo_v3020 = 30#
cudaMemGetMemPool_v13000 = 519#
cudaMemPoolCreate_v11020 = 383#
cudaMemPoolDestroy_v11020 = 384#
cudaMemPoolExportPointer_v11020 = 389#
cudaMemPoolExportToShareableHandle_v11020 = 387#
cudaMemPoolGetAccess_v11020 = 382#
cudaMemPoolGetAttribute_v11020 = 379#
cudaMemPoolImportFromShareableHandle_v11020 = 388#
cudaMemPoolImportPointer_v11020 = 390#
cudaMemPoolSetAccess_v11020 = 380#
cudaMemPoolSetAttribute_v11020 = 378#
cudaMemPoolTrimTo_v11020 = 377#
cudaMemPrefetchAsync_ptsz_v12020 = 450#
cudaMemPrefetchAsync_ptsz_v8000 = 253#
cudaMemPrefetchAsync_v12020 = 449#
cudaMemPrefetchAsync_v2_ptsz_v12020 = 450#
cudaMemPrefetchAsync_v2_v12020 = 449#
cudaMemPrefetchAsync_v8000 = 252#
cudaMemPrefetchBatchAsync_ptsz_v13000 = 488#
cudaMemPrefetchBatchAsync_v13000 = 487#
cudaMemRangeGetAttribute_v8000 = 266#
cudaMemRangeGetAttributes_v8000 = 267#
cudaMemSetMemPool_v13000 = 520#
cudaMemcpy2DArrayToArray_ptds_v7000 = 222#
cudaMemcpy2DArrayToArray_v3020 = 38#
cudaMemcpy2DAsync_ptsz_v7000 = 228#
cudaMemcpy2DAsync_v3020 = 44#
cudaMemcpy2DFromArrayAsync_ptsz_v7000 = 230#
cudaMemcpy2DFromArrayAsync_v3020 = 46#
cudaMemcpy2DFromArray_ptds_v7000 = 220#
cudaMemcpy2DFromArray_v3020 = 36#
cudaMemcpy2DToArrayAsync_ptsz_v7000 = 229#
cudaMemcpy2DToArrayAsync_v3020 = 45#
cudaMemcpy2DToArray_ptds_v7000 = 218#
cudaMemcpy2DToArray_v3020 = 34#
cudaMemcpy2D_ptds_v7000 = 216#
cudaMemcpy2D_v3020 = 32#
cudaMemcpy3DAsync_ptsz_v7000 = 246#
cudaMemcpy3DAsync_v3020 = 145#
cudaMemcpy3DBatchAsync_ptsz_v12080 = 485#
cudaMemcpy3DBatchAsync_ptsz_v13000 = 512#
cudaMemcpy3DBatchAsync_v12080 = 484#
cudaMemcpy3DBatchAsync_v13000 = 511#
cudaMemcpy3DPeerAsync_ptsz_v7000 = 250#
cudaMemcpy3DPeerAsync_v4000 = 163#
cudaMemcpy3DPeer_ptds_v7000 = 249#
cudaMemcpy3DPeer_v4000 = 162#
cudaMemcpy3D_ptds_v7000 = 245#
cudaMemcpy3D_v3020 = 144#
cudaMemcpyArrayToArray_ptds_v7000 = 221#
cudaMemcpyArrayToArray_v3020 = 37#
cudaMemcpyAsync_ptsz_v7000 = 225#
cudaMemcpyAsync_v3020 = 41#
cudaMemcpyBatchAsync_ptsz_v12080 = 483#
cudaMemcpyBatchAsync_ptsz_v13000 = 510#
cudaMemcpyBatchAsync_v12080 = 482#
cudaMemcpyBatchAsync_v13000 = 509#
cudaMemcpyFromArrayAsync_ptsz_v7000 = 227#
cudaMemcpyFromArrayAsync_v3020 = 43#
cudaMemcpyFromArray_ptds_v7000 = 219#
cudaMemcpyFromArray_v3020 = 35#
cudaMemcpyFromSymbolAsync_ptsz_v7000 = 232#
cudaMemcpyFromSymbolAsync_v3020 = 48#
cudaMemcpyFromSymbol_ptds_v7000 = 224#
cudaMemcpyFromSymbol_v3020 = 40#
cudaMemcpyPeerAsync_v4000 = 161#
cudaMemcpyPeer_v4000 = 160#
cudaMemcpyToArrayAsync_ptsz_v7000 = 226#
cudaMemcpyToArrayAsync_v3020 = 42#
cudaMemcpyToArray_ptds_v7000 = 217#
cudaMemcpyToArray_v3020 = 33#
cudaMemcpyToSymbolAsync_ptsz_v7000 = 231#
cudaMemcpyToSymbolAsync_v3020 = 47#
cudaMemcpyToSymbol_ptds_v7000 = 223#
cudaMemcpyToSymbol_v3020 = 39#
cudaMemcpy_ptds_v7000 = 215#
cudaMemcpy_v3020 = 31#
cudaMemset2DAsync_ptsz_v7000 = 236#
cudaMemset2DAsync_v3020 = 52#
cudaMemset2D_ptds_v7000 = 234#
cudaMemset2D_v3020 = 50#
cudaMemset3DAsync_ptsz_v7000 = 244#
cudaMemset3DAsync_v3020 = 143#
cudaMemset3D_ptds_v7000 = 243#
cudaMemset3D_v3020 = 142#
cudaMemsetAsync_ptsz_v7000 = 235#
cudaMemsetAsync_v3020 = 51#
cudaMemset_ptds_v7000 = 233#
cudaMemset_v3020 = 49#
cudaMipmappedArrayGetMemoryRequirements_v11060 = 429#
cudaMipmappedArrayGetSparseProperties_v11010 = 360#
cudaOccupancyAvailableDynamicSMemPerBlock_v10200 = 329#
cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags_v7000 = 251#
cudaOccupancyMaxActiveBlocksPerMultiprocessor_v6000 = 207#
cudaOccupancyMaxActiveBlocksPerMultiprocessor_v6050 = 210#
cudaOccupancyMaxActiveClusters_v11070 = 433#
cudaOccupancyMaxPotentialClusterSize_v11070 = 432#
cudaPeekAtLastError_v3020 = 11#
cudaPeerGetDevicePointer_v4000 = 159#
cudaPeerRegister_v4000 = 157#
cudaPeerUnregister_v4000 = 158#
cudaPointerGetAttributes_v4000 = 151#
cudaProfilerInitialize_v4000 = 170#
cudaProfilerStart_v4000 = 171#
cudaProfilerStop_v4000 = 172#
cudaRuntimeGetVersion_v3020 = 2#
cudaSetDeviceFlags_v3020 = 19#
cudaSetDevice_v3020 = 16#
cudaSetDoubleForDevice_v3020 = 124#
cudaSetDoubleForHost_v3020 = 125#
cudaSetValidDevices_v3020 = 18#
cudaSetupArgument_v3020 = 9#
cudaSignalExternalSemaphoresAsync_ptsz_v10000 = 280#
cudaSignalExternalSemaphoresAsync_ptsz_v11020 = 394#
cudaSignalExternalSemaphoresAsync_v10000 = 279#
cudaSignalExternalSemaphoresAsync_v11020 = 393#
cudaSignalExternalSemaphoresAsync_v2_ptsz_v11020 = 394#
cudaSignalExternalSemaphoresAsync_v2_v11020 = 393#
cudaStreamAddCallback_ptsz_v7000 = 248#
cudaStreamAddCallback_v5000 = 197#
cudaStreamAttachMemAsync_ptsz_v7000 = 241#
cudaStreamAttachMemAsync_v6000 = 208#
cudaStreamBeginCaptureToGraph_ptsz_v12030 = 453#
cudaStreamBeginCaptureToGraph_v12030 = 452#
cudaStreamBeginCapture_ptsz_v10000 = 316#
cudaStreamBeginCapture_v10000 = 315#
cudaStreamCopyAttributes_ptsz_v11000 = 342#
cudaStreamCopyAttributes_v11000 = 341#
cudaStreamCreateWithFlags_v5000 = 198#
cudaStreamCreateWithPriority_v5050 = 202#
cudaStreamCreate_v3020 = 129#
cudaStreamDestroy_v3020 = 130#
cudaStreamDestroy_v5050 = 201#
cudaStreamEndCapture_ptsz_v10000 = 320#
cudaStreamEndCapture_v10000 = 319#
cudaStreamGetAttribute_ptsz_v11000 = 344#
cudaStreamGetAttribute_v11000 = 343#
cudaStreamGetCaptureInfo_ptsz_v10010 = 325#
cudaStreamGetCaptureInfo_ptsz_v12030 = 462#
cudaStreamGetCaptureInfo_v10010 = 324#
cudaStreamGetCaptureInfo_v12030 = 461#
cudaStreamGetCaptureInfo_v2_ptsz_v11030 = 410#
cudaStreamGetCaptureInfo_v2_v11030 = 409#
cudaStreamGetCaptureInfo_v3_ptsz_v12030 = 462#
cudaStreamGetCaptureInfo_v3_v12030 = 461#
cudaStreamGetDevice_ptsz_v12080 = 481#
cudaStreamGetDevice_v12080 = 480#
cudaStreamGetFlags_ptsz_v7000 = 238#
cudaStreamGetFlags_v5050 = 204#
cudaStreamGetId_ptsz_v12000 = 442#
cudaStreamGetId_v12000 = 441#
cudaStreamGetPriority_ptsz_v7000 = 237#
cudaStreamGetPriority_v5050 = 203#
cudaStreamIsCapturing_ptsz_v10000 = 318#
cudaStreamIsCapturing_v10000 = 317#
cudaStreamQuery_ptsz_v7000 = 240#
cudaStreamQuery_v3020 = 132#
cudaStreamSetAttribute_ptsz_v11000 = 346#
cudaStreamSetAttribute_v11000 = 345#
cudaStreamSetFlags_ptsz_v10200 = 331#
cudaStreamSetFlags_v10200 = 330#
cudaStreamSynchronize_ptsz_v7000 = 239#
cudaStreamSynchronize_v3020 = 131#
cudaStreamUpdateCaptureDependencies_ptsz_v11030 = 412#
cudaStreamUpdateCaptureDependencies_ptsz_v12030 = 464#
cudaStreamUpdateCaptureDependencies_v11030 = 411#
cudaStreamUpdateCaptureDependencies_v12030 = 463#
cudaStreamUpdateCaptureDependencies_v2_ptsz_v12030 = 464#
cudaStreamUpdateCaptureDependencies_v2_v12030 = 463#
cudaStreamWaitEvent_ptsz_v7000 = 247#
cudaStreamWaitEvent_v3020 = 147#
cudaThreadExchangeStreamCaptureMode_v10010 = 327#
cudaThreadExit_v3020 = 123#
cudaThreadGetCacheConfig_v3020 = 150#
cudaThreadGetLimit_v3020 = 127#
cudaThreadSetCacheConfig_v3020 = 146#
cudaThreadSetLimit_v3020 = 128#
cudaThreadSynchronize_v3020 = 126#
cudaUnbindTexture_v3020 = 58#
cudaUserObjectCreate_v11030 = 413#
cudaUserObjectRelease_v11030 = 415#
cudaUserObjectRetain_v11030 = 414#
cudaVDPAUGetDevice_v3020 = 80#
cudaVDPAUSetVDPAUDevice_v3020 = 81#
cudaWGLGetDevice_v3020 = 71#
cudaWaitExternalSemaphoresAsync_ptsz_v10000 = 282#
cudaWaitExternalSemaphoresAsync_ptsz_v11020 = 396#
cudaWaitExternalSemaphoresAsync_v10000 = 281#
cudaWaitExternalSemaphoresAsync_v11020 = 395#
cudaWaitExternalSemaphoresAsync_v2_ptsz_v11020 = 396#
cudaWaitExternalSemaphoresAsync_v2_v11020 = 395#
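Several names in the list above share a numeric value; for example, cudaMemAdvise_v12020 and cudaMemAdvise_v2_v12020 are both 448. The sketch below uses a stand-in enum.IntEnum, with four values copied from the list, to show how Python treats such duplicates as aliases of one member. RuntimeCbidSample is a hypothetical name, and it is an assumption that the generated CUPTI enumeration behaves like a standard IntEnum.

```python
from enum import IntEnum


class RuntimeCbidSample(IntEnum):
    """Hypothetical stand-in; the values are copied from the list above."""

    cudaMalloc_v3020 = 20
    cudaMemAdvise_v8000 = 254
    cudaMemAdvise_v12020 = 448
    cudaMemAdvise_v2_v12020 = 448  # same value, so this name becomes an alias


# Looking up by value yields the first name defined for that value.
canonical = RuntimeCbidSample(448)

# Aliases are skipped during iteration but remain visible in __members__.
iterated_names = [member.name for member in RuntimeCbidSample]
all_names = list(RuntimeCbidSample.__members__)
```

When mapping a callback ID back to a name, it is therefore safer to compare values than names, since which alias is reported depends on definition order.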
cupti.cupti.activity_configure_unified_memory_counter(config: int, count: int)#

Set Unified Memory Counter configuration.

Parameters:
  • config (intptr_t) – A pointer to CUpti_ActivityUnifiedMemoryCounterConfig structures containing Unified Memory counter configuration.

  • count (uint32_t) – Number of Unified Memory counter configuration structures.

cupti.cupti.activity_disable(kind: int)#

Disable collection of a specific kind of activity record.

Parameters:

kind (ActivityKind) – The kind of activity record to stop collecting.

cupti.cupti.activity_disable_context(context: int, kind: int)#

Disable collection of a specific kind of activity record for a context.

Parameters:
  • context (intptr_t) – The context for which activity is to be disabled.

  • kind (ActivityKind) – The kind of activity record to stop collecting.

cupti.cupti.activity_enable(kind: int)#

Enable collection of a specific kind of activity record.

Parameters:

kind (ActivityKind) – The kind of activity record to collect.
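Because activity_enable and activity_disable come in pairs, one way to keep them balanced is a small context manager. The following is a minimal sketch, not part of the CUPTI API: activity_kinds_enabled is a hypothetical helper, and cupti_api stands for any object exposing the two functions documented above (for example, the cupti.cupti module).

```python
from contextlib import contextmanager


@contextmanager
def activity_kinds_enabled(cupti_api, kinds):
    """Enable each activity kind on entry and disable it again on exit.

    cupti_api is assumed to expose activity_enable(kind) and
    activity_disable(kind) with the signatures documented above.
    """
    enabled = []
    try:
        for kind in kinds:
            cupti_api.activity_enable(kind)
            enabled.append(kind)
        yield enabled
    finally:
        # Disable in reverse order, including after a partial failure.
        for kind in reversed(enabled):
            cupti_api.activity_disable(kind)
```

The kinds passed in would be values taken from cupti.cupti.ActivityKind.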

cupti.cupti.activity_enable_all_sync_records(enable: int)#

Enables collecting records for all synchronization operations.

Parameters:

enable (uint8_t) – A boolean denoting whether to enable or disable the collection of all CUDA event query and stream query records.

cupti.cupti.activity_enable_allocation_source(enable: int)#

Enables tracking the source library for memory allocation requests.

Parameters:

enable (uint8_t) – A boolean denoting whether the source library of the memory allocation request should be tracked.

cupti.cupti.activity_enable_and_dump(kind: int)#

Enable collection of a specific kind of activity record. For certain activity kinds it dumps existing records.

Parameters:

kind (ActivityKind) – The kind of activity record to collect.

cupti.cupti.activity_enable_context(context: int, kind: int)#

Enable collection of a specific kind of activity record for a context.

Parameters:
  • context (intptr_t) – The context for which activity is to be enabled.

  • kind (ActivityKind) – The kind of activity record to collect.

cupti.cupti.activity_enable_cuda_event_device_timestamps(enable: int)#

Enable or disable collection of device timestamps for CUPTI_ACTIVITY_KIND_CUDA_EVENT records.

Parameters:

enable (uint8_t) – A boolean denoting whether to enable or disable the collection of CUDA event device timestamps.

cupti.cupti.activity_enable_device_graph(enable: int)#

Controls the collection of records for device launched graphs.

Parameters:

enable (uint8_t) – A boolean denoting whether these records should be collected.

cupti.cupti.activity_enable_driver_api(cbid: int, enable: int)#

Controls the collection of activity records for specific CUDA Driver APIs.

Parameters:
  • cbid (uint32_t) – Callback ID of the CUDA Driver API. This can be found in the header cupti_driver_cbid.h.

  • enable (uint8_t) – A boolean denoting whether to enable or disable the collection.

cupti.cupti.activity_enable_hw_trace(enable: int)#

Enables the collection of CUDA kernel timestamps through the Hardware Event System (HES).

Parameters:

enable (uint8_t) – A boolean denoting whether to enable or disable the collection through HW events.

cupti.cupti.activity_enable_latency_timestamps(enable: int)#

Controls the collection of queued and submitted timestamps for kernels.

Parameters:

enable (uint8_t) – A boolean denoting whether these timestamps should be collected.

cupti.cupti.activity_enable_launch_attributes(enable: int)#

Controls the collection of launch attributes for kernels.

Parameters:

enable (uint8_t) – A boolean denoting whether these launch attributes should be collected.

cupti.cupti.activity_enable_runtime_api(cbid: int, enable: int)#

Controls the collection of activity records for specific CUDA Runtime APIs.

Parameters:
  • cbid (uint32_t) – Callback ID of the CUDA Runtime API. This can be found in the header cupti_runtime_cbid.h.

  • enable (uint8_t) – A boolean denoting whether to enable or disable the collection.

cupti.cupti.activity_flush_all(flag: int)#

Request to deliver activity records via the buffer completion callback.

Parameters:

flag (uint32_t) – The flag can be set to indicate a forced flush. See CUpti_ActivityFlag.

cupti.cupti.activity_flush_period(time: int)#

Sets the flush period for the worker thread.

Parameters:

time (uint32_t) – The flush period, in milliseconds (ms).

cupti.cupti.activity_get_attribute(attr: int, value_size: int, value: int)#

Read an activity API attribute.

Parameters:
  • attr (ActivityAttribute) – The attribute to read.

  • value_size (intptr_t) – On input, the size in bytes of the buffer pointed to by value; on return, the number of bytes written to value.

  • value (intptr_t) – Returns the value of the attribute.
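Since value_size and value are typed as intptr_t, one plausible way to call activity_get_attribute from Python is to allocate the buffers with ctypes and pass their addresses. The sketch below rests on assumptions: that both arguments are addresses of caller-owned memory (mirroring the size_t* and void* parameters of the C function cuptiActivityGetAttribute), and that the attribute being read holds a size_t. read_size_t_attribute is a hypothetical helper, and cupti_api stands for the cupti.cupti module or a compatible object.

```python
import ctypes


def read_size_t_attribute(cupti_api, attr):
    """Read an activity attribute whose value is assumed to be a C size_t."""
    value = ctypes.c_size_t(0)
    value_size = ctypes.c_size_t(ctypes.sizeof(value))
    # Both parameters are documented as intptr_t, so raw addresses are passed
    # (assumption: they point to caller-owned buffers, as in the C API).
    cupti_api.activity_get_attribute(
        attr,
        ctypes.addressof(value_size),
        ctypes.addressof(value),
    )
    return value.value
```

The ctypes objects must stay alive for the duration of the call, which is why they are bound to local names rather than created inline.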

cupti.cupti.activity_get_num_dropped_records(
context: int,
stream_id: int,
dropped: int,
)#

Get the number of activity records that were dropped because of insufficient buffer space.

Parameters:
  • context (intptr_t) – The context, or NULL to get dropped count from global queue.

  • stream_id (uint32_t) – The stream ID.

  • dropped (intptr_t) – The number of records that were dropped since the last call to this function.

cupti.cupti.activity_pop_external_correlation_id(kind: int) int#

Pop an external correlation id for the calling thread.

Parameters:

kind (ExternalCorrelationKind) – The kind of external API that activities should be correlated with.

Returns:

On success, the last external correlation id for this kind.

Return type:

uint64_t

cupti.cupti.activity_push_external_correlation_id(kind: int, id: int)#

Push an external correlation id for the calling thread.

Parameters:
  • kind (ExternalCorrelationKind) – The kind of external API that activities should be correlated with.

  • id (uint64_t) – External correlation id.
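The push and pop functions above are meant to bracket a region of code on the calling thread. A minimal sketch of that pairing follows; external_correlation is a hypothetical helper, and cupti_api stands for any object exposing the two functions with the documented signatures (for example, the cupti.cupti module).

```python
from contextlib import contextmanager


@contextmanager
def external_correlation(cupti_api, kind, correlation_id):
    """Push an external correlation id on entry and pop it on exit."""
    cupti_api.activity_push_external_correlation_id(kind, correlation_id)
    try:
        yield correlation_id
    finally:
        # Pop on the same thread that pushed, even if the body raised.
        cupti_api.activity_pop_external_correlation_id(kind)
```

Here kind would be a value from ExternalCorrelationKind, and correlation_id any integer that fits in a uint64_t.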

cupti.cupti.activity_register_callbacks(
func_buffer_requested,
func_buffer_completed,
)#

Registers callback functions with CUPTI for activity buffer handling.

Parameters:
  • func_buffer_requested (function) – callback which is invoked when an empty buffer is requested by CUPTI.

  • func_buffer_completed (function) – callback which is invoked when a buffer containing activity records is available from CUPTI.
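This reference does not spell out the signatures of the two callbacks. The sketch below assumes a contract in which func_buffer_requested takes no arguments and returns a (buffer size in bytes, maximum number of records) pair, and func_buffer_completed receives an iterable of activity records; both are assumptions to be checked against the CUPTI Python samples. make_buffer_callbacks and sink are hypothetical names.

```python
def make_buffer_callbacks(sink, buffer_size=8 * 1024 * 1024):
    """Build a (requested, completed) callback pair that appends records to sink."""

    def func_buffer_requested():
        # Assumed contract: return (buffer size in bytes, max records),
        # where 0 means the record count is not limited.
        max_num_records = 0
        return buffer_size, max_num_records

    def func_buffer_completed(activities):
        # Assumed contract: CUPTI passes an iterable of activity records.
        sink.extend(activities)

    return func_buffer_requested, func_buffer_completed
```

The returned pair would then be passed to activity_register_callbacks, and a later call to activity_flush_all would request delivery of any pending records to func_buffer_completed.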

cupti.cupti.activity_register_timestamp_callback(func_timestamp)#

Registers a callback function with CUPTI for providing timestamps.

Parameters:

func_timestamp (function) – callback which is invoked when a timestamp is needed by CUPTI.

cupti.cupti.activity_set_attribute(attr: int, value_size: int, value: int)#

Write an activity API attribute.

Parameters:
  • attr (ActivityAttribute) – The attribute to write.

  • value_size (intptr_t) – The size, in bytes, of the value.

  • value (intptr_t) – The attribute value to write.

cupti.cupti.compute_capability_supported(major: int, minor: int) int#

Check support for a compute capability.

Parameters:
  • major (int) – The major revision number of the compute capability.

  • minor (int) – The minor revision number of the compute capability.

Returns:

The support status, as an integer.

Return type:

int

cupti.cupti.device_supported(dev: int) int#

Check support for a compute device.

Parameters:

dev (int) – The device handle returned by CUDA Driver API cuDeviceGet.

Returns:

The support status, as an integer.

Return type:

int

cupti.cupti.device_virtualization_mode(dev: int) int#

Query the virtualization mode of the device.

Parameters:

dev (int) – The device handle returned by CUDA Driver API cuDeviceGet.

Returns:

The virtualization mode of the device, as a CUpti_DeviceVirtualizationMode value.

Return type:

int

cupti.cupti.enable_all_domains(enable: int, subscriber: int)#

Enable or disable all callbacks in all domains.

Parameters:
  • enable (uint32_t) – New enable state for all callbacks in all domains. Zero disables all callbacks, non-zero enables all callbacks.

  • subscriber (intptr_t) – Handle to callback subscription.

cupti.cupti.enable_callback(enable: int, subscriber: int, domain: int, cbid: int)#

Enable or disable callbacks for a specific domain and callback ID.

Parameters:
  • enable (uint32_t) – New enable state for the callback. Zero disables the callback, non-zero enables the callback.

  • subscriber (intptr_t) – Handle to callback subscription.

  • domain (CallbackDomain) – The domain of the callback.

  • cbid (uint32_t) – The ID of the callback.

cupti.cupti.enable_domain(enable: int, subscriber: int, domain: int)#

Enable or disable all callbacks for a specific domain.

Parameters:
  • enable (uint32_t) – New enable state for all callbacks in the domain. Zero disables all callbacks, non-zero enables all callbacks.

  • subscriber (intptr_t) – Handle to callback subscription.

  • domain (CallbackDomain) – The domain of the callback.

cupti.cupti.finalize()#

Detach CUPTI from the running process.

See also

cuptiFinalize

cupti.cupti.get_auto_boost_state(context: int, state: int)#

Get auto boost state.

Parameters:
  • context (intptr_t) – A valid CUcontext.

  • state (intptr_t) – A pointer to CUpti_ActivityAutoBoostState structure which contains the current state and the id of the process that has requested the current state.

cupti.cupti.get_callback_name(domain: int, cbid: int)#

Get the name of a callback for a specific domain and callback ID.

Parameters:
  • domain (CallbackDomain) – The domain of the callback.

  • cbid (uint32_t) – The ID of the callback.

Returns:

Returns the name of the callback for the specified domain and callback ID.

Return type:

str

cupti.cupti.get_callback_state(subscriber: int, domain: int, cbid: int) int#

Get the current enabled/disabled state of a callback for a specific domain and function ID.

Parameters:
  • subscriber (intptr_t) – Handle to the initialized subscriber.

  • domain (CallbackDomain) – The domain of the callback.

  • cbid (uint32_t) – The ID of the callback.

Returns:

Returns non-zero if callback enabled, zero if not enabled.

Return type:

uint32_t

cupti.cupti.get_context_id(context: int) int#

Get the ID of a context.

Parameters:

context (intptr_t) – The context.

Returns:

Returns a process-unique ID for the context.

Return type:

uint32_t

cupti.cupti.get_device_id(context: int) int#

Get the ID of a device.

Parameters:

context (intptr_t) – The context, or NULL to indicate the current context.

Returns:

Returns the ID of the device that is current for the calling thread.

Return type:

uint32_t

See also

cuptiGetDeviceId

cupti.cupti.get_graph_exec_id(graph_exec: int) int#

Get the unique ID of an executable graph.

Parameters:

graph_exec (intptr_t) – The executable graph.

Returns:

Returns the unique ID of the executable graph.

Return type:

uint32_t

cupti.cupti.get_graph_id(graph: int) int#

Get the unique ID of a graph.

Parameters:

graph (intptr_t) – The graph.

Returns:

Returns the unique ID of the graph.

Return type:

uint32_t

See also

cuptiGetGraphId

cupti.cupti.get_graph_node_id(node: int) int#

Get the unique ID of a graph node.

Parameters:

node (intptr_t) – The graph node.

Returns:

Returns the unique ID of the node.

Return type:

uint64_t

cupti.cupti.get_last_error() int#

Returns the last error from a CUPTI call or callback.

cupti.cupti.get_stream_id_ex(
context: int,
stream: int,
per_thread_stream: int,
) int#

Get the ID of a stream.

Parameters:
  • context (intptr_t) – If non-NULL, the stream is checked to ensure that it belongs to this context. Typically this parameter should be NULL.

  • stream (intptr_t) – The stream.

  • per_thread_stream (uint8_t) – Flag to indicate if program is compiled for per-thread streams.

Returns:

Returns a context-unique ID for the stream.

Return type:

uint32_t

cupti.cupti.get_thread_id_type() int#

Get the thread-id type.

Returns:

The thread-id type, as an ActivityThreadIdType value.

Return type:

int

cupti.cupti.get_timestamp() int#

Get the CUPTI timestamp.

Returns:

Returns the CUPTI timestamp.

Return type:

uint64_t
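As an illustration of get_timestamp, the sketch below brackets a piece of host code with two timestamp reads and reports the difference. elapsed_ns and work are hypothetical names, cupti_api stands for the cupti.cupti module or a compatible object, and the result is assumed to be in nanoseconds, in line with the ns units used for activity record timestamps elsewhere in this reference.

```python
def elapsed_ns(cupti_api, work):
    """Run work() and return (result, elapsed CUPTI time)."""
    start = cupti_api.get_timestamp()
    result = work()
    end = cupti_api.get_timestamp()
    # Assumption: timestamps are in ns, so the difference is a duration in ns.
    return result, end - start
```

Because both reads use the CUPTI clock, the interval can be compared directly against the start and end fields of activity records.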

cupti.cupti.set_thread_id_type(type: int)#

Set the thread-id type.

Parameters:

type (ActivityThreadIdType) – The thread-id type to set.

cupti.cupti.subscribe(callback, userdata) int#

Initialize a callback subscriber with a callback function and user data.

Parameters:
  • callback (function) – The callback function.

  • userdata (intptr_t) – A pointer to user data. This data will be passed to the callback function via the userdata parameter.

Returns:

Returns a handle to the initialized subscriber.

Return type:

intptr_t

See also

cuptiSubscribe

cupti.cupti.subscribe_v2(callback, userdata, p_params: int) int#

Initialize a callback subscriber with a callback function and user data.

Parameters:
  • callback (function) – The callback function.

  • userdata (intptr_t) – A pointer to user data. This data will be passed to the callback function via the userdata parameter.

  • p_params (intptr_t) – A pointer to CUpti_SubscriberParams. Can be NULL.

Returns:

Returns a handle to the initialized subscriber.

Return type:

intptr_t

cupti.cupti.supported_domains()#

Get the available callback domains.

Returns:

List of all available callback domains

Return type:

list[CallbackDomain]

cupti.cupti.unsubscribe(subscriber: int)#

Unregister a callback subscriber.

Parameters:

subscriber (intptr_t) – Handle to the initialized subscriber.

See also

cuptiUnsubscribe
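The subscribe, enable_domain, and unsubscribe functions documented above form a natural lifecycle: subscribe, enable a domain, run the workload, disable the domain, unsubscribe. The following is a minimal sketch of that sequence. domain_callbacks is a hypothetical helper, cupti_api stands for the cupti.cupti module or a compatible object, and the callback's own signature is not covered here because this reference does not specify it.

```python
from contextlib import contextmanager


@contextmanager
def domain_callbacks(cupti_api, callback, userdata, domain):
    """Subscribe, enable one callback domain, and undo both on exit."""
    subscriber = cupti_api.subscribe(callback, userdata)
    try:
        cupti_api.enable_domain(1, subscriber, domain)  # non-zero enables
        try:
            yield subscriber
        finally:
            cupti_api.enable_domain(0, subscriber, domain)  # zero disables
    finally:
        # Always release the subscription, even if enabling failed.
        cupti_api.unsubscribe(subscriber)
```

The nested try blocks ensure that unsubscribe still runs if enable_domain raises, and that the domain is only disabled if it was actually enabled.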