Anthony Chang rosenrodt

7 followers · 0 following

Stars

NVIDIA / multi-gpu-programming-models

Examples demonstrating available options to program multiple GPUs in a single node or a cluster

Cuda 823 143 Updated Sep 26, 2025

jax-ml / jax

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Python 33,895 3,230 Updated Nov 5, 2025

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python 20,353 2,113 Updated Nov 5, 2025

federico-busato / Modern-CPP-Programming

Modern C++ Programming Course (C++03/11/14/17/20/23/26)

HTML 14,002 974 Updated Sep 16, 2025

ml-explore / mlx

MLX: An array framework for Apple silicon

C++ 22,709 1,378 Updated Nov 5, 2025

NAThompson / performance_tuning_tutorial

Performance Tuning Tutorial given at Oak Ridge National Laboratory

C++ 183 20 Updated May 19, 2021

NVIDIA / TensorRT-LLM

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

C++ 12,042 1,839 Updated Nov 5, 2025

NVIDIA-NeMo / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,043 3,177 Updated Nov 5, 2025

intel / gprofiler

gProfiler is a system-wide profiler, combining multiple sampling profilers to produce unified visualization of what your CPU is spending time on.

Python 793 70 Updated Oct 31, 2025

huggingface / transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 152,116 31,045 Updated Nov 5, 2025

pallets / jinja

A very fast and expressive template engine.

Python 11,243 1,680 Updated Jun 14, 2025

pytest-dev / pytest-cpp

Use pytest's runner to discover and execute C++ tests

C++ 140 26 Updated Oct 27, 2025

NVIDIA / MatX

An efficient C++17 GPU numerical computing library with Python-like syntax

C++ 1,358 107 Updated Nov 5, 2025

arogozhnikov / einops

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

Python 9,260 385 Updated Aug 12, 2025

ROCm / composable_kernel

Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators

C++ 482 248 Updated Nov 5, 2025

NVIDIA / libcudacxx

[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl

C++ 2,309 191 Updated Feb 7, 2024

andreasfertig / cppinsights

C++ Insights - See your source code with the eyes of a compiler

C++ 4,407 258 Updated Jun 26, 2025

mviereck / x11docker

Run GUI applications and desktops in docker and podman containers. Focus on security.

Shell 6,062 404 Updated Apr 4, 2024

Xilinx / mlir-aie

An MLIR-based toolchain for AMD AI Engine-enabled devices.

MLIR 517 158 Updated Nov 5, 2025

NVlabs / instant-ngp

Instant neural graphics primitives: lightning fast NeRF and more

Cuda 17,029 2,017 Updated Oct 8, 2025

triton-lang / triton

Development repository for the Triton language and compiler

MLIR 17,469 2,359 Updated Nov 5, 2025

mg979 / vim-visual-multi

Multiple cursors plugin for vim/neovim

Vim Script 4,678 94 Updated Sep 1, 2024

llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

LLVM 35,235 15,079 Updated Nov 5, 2025

sebbbi / perftest

GPU texture/buffer performance tester

C++ 655 32 Updated Nov 19, 2020

Andersbakken / rtags

A client/server indexer for c/c++/objc[++] with integration for Emacs based on clang.

C++ 1,838 256 Updated Sep 23, 2025

VSCodeVim / Vim

⭐ Vim for Visual Studio Code

TypeScript 14,895 1,418 Updated Nov 5, 2025

microsoft / STL

MSVC's implementation of the C++ Standard Library.

C++ 10,857 1,587 Updated Nov 5, 2025

ddemidov / vexcl

VexCL is a C++ vector expression template library for OpenCL/CUDA/OpenMP

C++ 717 84 Updated Jul 19, 2025

NervanaSystems / maxas

Assembler for NVIDIA Maxwell architecture

Sass 1,046 172 Updated Jan 3, 2023

ROCm / hip

HIP: C++ Heterogeneous-Compute Interface for Portability

C++ 4,221 573 Updated Nov 4, 2025