gigit0000
  • Kim Baksa's Lab, South Korea
  • 11:43 (UTC +09:00)

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,389 204 Updated Dec 19, 2025

📚 A curated list of Awesome LLM/VLM Inference Papers with Code: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc. 🎉

Python 4,847 327 Updated Nov 28, 2025

Easy, Fast, and Scalable Multimodal AI

Python 81 6 Updated Dec 19, 2025

Nvidia Instruction Set Specification Generator

Python 304 16 Updated Jul 9, 2024

Cataloging released Triton kernels.

277 14 Updated Sep 9, 2025

Learning Deep Representations of Data Distributions

TeX 717 59 Updated Dec 18, 2025

Small scale distributed training of sequential deep learning models, built on Numpy and MPI.

Python 153 7 Updated Oct 19, 2023

Python pdb for multiple processes

Python 72 9 Updated May 24, 2025

Large-scale LLM inference engine

C++ 1,610 178 Updated Nov 24, 2025

Memray is a memory profiler for Python

Python 14,685 432 Updated Dec 15, 2025

Python 7 Updated Jul 26, 2025

Triton Support in Compiler Explorer

TypeScript 5 Updated Aug 5, 2025

Run compilers interactively from your web browser and interact with the assembly

TypeScript 18,354 1,969 Updated Dec 19, 2025

This repo provides several classic attention variant implementations based on the FlexAttention API.

Python 2 1 Updated May 18, 2025

A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM

Python 169 22 Updated Dec 19, 2025

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.

C 7,175 1,624 Updated Dec 19, 2025

Hacker News

HTML 13 5 Updated Dec 20, 2025

Distribute and run LLMs with a single file.

C++ 1 Updated Jul 23, 2024

Distribute and run LLMs with a single file.

C 23,534 1,250 Updated Dec 19, 2025

CUDA on non-NVIDIA GPUs

Rust 13,673 879 Updated Dec 19, 2025

GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm

C 9,875 356 Updated Oct 25, 2025

A .NET MAUI app for displaying the top posts on Hacker News that demonstrates text sentiment analysis gathered using artificial intelligence

C# 280 40 Updated Nov 24, 2025

A curated list of awesome C frameworks, libraries, resources and other shiny things. Inspired by all the other awesome-... projects out there.

10,903 908 Updated Nov 7, 2025

Local AI voice assistant stack for Home Assistant (GPU-accelerated) with persistent memory, follow-up conversation, and Ollama model recommendations - settings designed for low VRAM systems.

221 19 Updated Jul 27, 2025

Debug Module for Embedded Systems

C 1 Updated May 3, 2025

A TTS model capable of generating ultra-realistic dialogue in one pass.

Python 1 Updated Jul 6, 2025

📝 A curated list of awesome Raspberry Pi tools, projects, images and resources

Shell 15,582 1,071 Updated Nov 10, 2025

Inference Llama 2 in one file of pure C & one file with CUDA

C 31 1 Updated Oct 14, 2023

A TTS model capable of generating ultra-realistic dialogue in one pass.

Python 18,983 1,651 Updated Nov 19, 2025

V-lang API wrapper for llm-inference chatllm.cpp

C 6 Updated Nov 20, 2024