rosenrodt

Examples demonstrating available options to program multiple GPUs in a single node or a cluster

Cuda 823 143 Updated Sep 26, 2025

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Python 33,895 3,230 Updated Nov 5, 2025

Fast and memory-efficient exact attention

Python 20,353 2,113 Updated Nov 5, 2025

Modern C++ Programming Course (C++03/11/14/17/20/23/26)

HTML 14,002 974 Updated Sep 16, 2025

MLX: An array framework for Apple silicon

C++ 22,709 1,378 Updated Nov 5, 2025

Performance Tuning Tutorial given at Oak Ridge National Laboratory

C++ 183 20 Updated May 19, 2021

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

C++ 12,042 1,839 Updated Nov 5, 2025

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,043 3,177 Updated Nov 5, 2025

gProfiler is a system-wide profiler, combining multiple sampling profilers to produce unified visualization of what your CPU is spending time on.

Python 793 70 Updated Oct 31, 2025

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 152,116 31,045 Updated Nov 5, 2025

A very fast and expressive template engine.

Python 11,243 1,680 Updated Jun 14, 2025

Use pytest's runner to discover and execute C++ tests

C++ 140 26 Updated Oct 27, 2025

An efficient C++17 GPU numerical computing library with Python-like syntax

C++ 1,358 107 Updated Nov 5, 2025

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

Python 9,260 385 Updated Aug 12, 2025

Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators

C++ 482 248 Updated Nov 5, 2025

[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl

C++ 2,309 191 Updated Feb 7, 2024

C++ Insights - See your source code with the eyes of a compiler

C++ 4,407 258 Updated Jun 26, 2025

Run GUI applications and desktops in docker and podman containers. Focus on security.

Shell 6,062 404 Updated Apr 4, 2024

An MLIR-based toolchain for AMD AI Engine-enabled devices.

MLIR 517 158 Updated Nov 5, 2025

Instant neural graphics primitives: lightning fast NeRF and more

Cuda 17,029 2,017 Updated Oct 8, 2025

Development repository for the Triton language and compiler

MLIR 17,469 2,359 Updated Nov 5, 2025

Multiple cursors plugin for vim/neovim

Vim Script 4,678 94 Updated Sep 1, 2024

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

LLVM 35,235 15,079 Updated Nov 5, 2025

GPU texture/buffer performance tester

C++ 655 32 Updated Nov 19, 2020

A client/server indexer for c/c++/objc[++] with integration for Emacs based on clang.

C++ 1,838 256 Updated Sep 23, 2025

⭐ Vim for Visual Studio Code

TypeScript 14,895 1,418 Updated Nov 5, 2025

MSVC's implementation of the C++ Standard Library.

C++ 10,857 1,587 Updated Nov 5, 2025

VexCL is a C++ vector expression template library for OpenCL/CUDA/OpenMP

C++ 717 84 Updated Jul 19, 2025

Assembler for NVIDIA Maxwell architecture

Sass 1,046 172 Updated Jan 3, 2023

HIP: C++ Heterogeneous-Compute Interface for Portability

C++ 4,221 573 Updated Nov 4, 2025