Python 107 15 Updated Feb 26, 2026

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

75,748 8,731 Updated Feb 5, 2026

AMD RAD's multi-GPU Triton-based framework for seamless multi-GPU programming

Python 175 33 Updated Feb 26, 2026

Modular RDMA Interface

C++ 84 22 Updated Feb 26, 2026

Online resources for Python Crash Course, 3rd edition, from No Starch Press.

Python 2,155 884 Updated Dec 1, 2025

AI Tensor Engine for ROCm

Python 359 218 Updated Feb 26, 2026

Repository hosting code for "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).

Python 1,753 354 Updated Feb 20, 2026

Mamba SSM architecture

Python 17,249 1,598 Updated Feb 18, 2026
C++ 9 2 Updated Dec 19, 2023

JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

Jupyter Notebook 4,681 413 Updated Apr 3, 2024

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 39,417 4,778 Updated Jun 2, 2025

Accessible large language models via k-bit quantization for PyTorch.

Python 7,988 827 Updated Feb 26, 2026

A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators

Python 127 19 Updated Nov 14, 2025
Python 30 16 Updated Feb 11, 2026

A collection of libraries to optimise AI model performances

Python 8,354 625 Updated Jul 22, 2024

Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.

Jupyter Notebook 1,585 98 Updated Jan 28, 2026

AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python 4,706 382 Updated Jan 12, 2026

This repository contains demos I made with the Transformers library by HuggingFace.

Jupyter Notebook 11,508 1,717 Updated Jan 13, 2026
60 11 Updated Sep 15, 2023

openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 300+ supported cars.

Python 60,166 10,658 Updated Feb 26, 2026

An MLIR-based compiler framework that bridges DSLs (domain-specific languages) to DSAs (domain-specific architectures).

Python 695 235 Updated Feb 26, 2026

Models and examples built with TensorFlow

Python 77,688 45,273 Updated Feb 25, 2026

Benchmarks to capture important workloads.

Python 32 24 Updated Feb 5, 2026

Install guide of ROCm and Tensorflow on Ubuntu for the RX580

128 17 Updated Sep 30, 2024

A curated list of awesome C++ (or C) frameworks, libraries, resources, and shiny things. Inspired by awesome-... stuff.

69,916 8,231 Updated Feb 23, 2026

A cheatsheet of modern C++ language and library features.

21,488 2,259 Updated Feb 22, 2026
Cuda 1 Updated Feb 4, 2021

cuDF - GPU DataFrame Library

C++ 9,495 1,012 Updated Feb 26, 2026

My Python Examples

Python 34,773 12,906 Updated Feb 16, 2026

Tool to run rccl-tests/nccl-tests from an application and gather performance data.

Python 3 10 Updated Nov 21, 2020