ranjiewwen
🎯 Focusing
  • Algorithmic engineer
  • Chengdu

Organizations

@DIP-ML-AI


Accelerate inference without tears

Python · 372 stars · 22 forks · Updated Jan 23, 2026

Efficient AI Inference & Serving

Python · 480 stars · 31 forks · Updated Jan 8, 2024

📰 Must-read papers and blogs on Speculative Decoding ⚡️

1,102 stars · 63 forks · Updated Jan 24, 2026
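
As background for the papers listed above: speculative decoding has a small draft model propose several tokens that the larger target model then verifies in one pass, accepting or rejecting each proposal so the output distribution still matches the target model exactly. Below is a minimal, purely illustrative sketch with toy softmax distributions standing in for the two models (all names are made up, not taken from any listed paper's code):

```python
import numpy as np

VOCAB = 8  # toy vocabulary size

def toy_probs(prefix, temperature):
    """Deterministic toy 'model': a softmax distribution over VOCAB given the prefix."""
    seed = hash(tuple(prefix)) % (2**32)
    logits = np.random.default_rng(seed).normal(size=VOCAB) / temperature
    e = np.exp(logits - logits.max())
    return e / e.sum()

def draft_probs(prefix):
    return toy_probs(prefix, temperature=1.5)   # cheap "draft" model

def target_probs(prefix):
    return toy_probs(prefix, temperature=1.0)   # expensive "target" model we want to match

def speculative_step(prefix, k=4, seed=0):
    """Draft k tokens, then verify them so accepted samples follow the target model."""
    rng = np.random.default_rng(seed)

    # 1) The draft model proposes k tokens autoregressively, remembering each distribution q.
    ctx, proposals = list(prefix), []
    for _ in range(k):
        q = draft_probs(ctx)
        token = int(rng.choice(VOCAB, p=q))
        proposals.append((token, q))
        ctx.append(token)

    # 2) The target model scores the same positions (a single batched forward pass in practice).
    ctx, accepted = list(prefix), []
    for token, q in proposals:
        p = target_probs(ctx)
        if rng.random() < min(1.0, p[token] / q[token]):
            accepted.append(token)               # accept with probability min(1, p/q)
            ctx.append(token)
        else:
            # 3) On rejection, resample from the residual max(p - q, 0) and stop.
            residual = np.maximum(p - q, 0.0)
            accepted.append(int(rng.choice(VOCAB, p=residual / residual.sum())))
            break
    return accepted

print(speculative_step([1, 2, 3]))
```

A real implementation batches step 2 into one forward pass of the target model and samples one extra target-model token when every proposal is accepted.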

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Python · 6,180 stars · 568 forks · Updated Aug 22, 2025

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python · 24,382 stars · 2,716 forks · Updated Aug 12, 2024

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python · 7,176 stars · 394 forks · Updated Jul 11, 2024

Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)

Python · 629 stars · 43 forks · Updated Dec 30, 2024

TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones

Python · 1,306 stars · 79 forks · Updated Apr 18, 2024

llm-export can export LLM models to ONNX.

Python · 342 stars · 38 forks · Updated Oct 24, 2025
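
I have not verified llm-export's own command line, so as a generic illustration of the kind of ONNX export such a tool automates, here is plain `torch.onnx.export` on a tiny stand-in module (the module, tensor names, and file name are invented for the example):

```python
import torch
import torch.nn as nn

# Tiny stand-in model; a real LLM export would trace the decoder with KV-cache inputs as well.
class TinyModel(nn.Module):
    def __init__(self, vocab=100, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.proj = nn.Linear(dim, vocab)

    def forward(self, ids):
        return self.proj(self.embed(ids))

model = TinyModel().eval()
dummy_ids = torch.randint(0, 100, (1, 8))            # example input used for tracing

torch.onnx.export(
    model, (dummy_ids,), "tiny_model.onnx",
    input_names=["input_ids"], output_names=["logits"],
    dynamic_axes={"input_ids": {0: "batch", 1: "seq"}},  # allow variable batch/sequence length
    opset_version=17,
)
```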

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook · 2,697 stars · 192 forks · Updated Jun 25, 2024

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).

Python · 2,156 stars · 250 forks · Updated Jan 13, 2026

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python · 4,945 stars · 337 forks · Updated Jan 18, 2026

Official inference library for Mistral models

Jupyter Notebook · 10,640 stars · 1,006 forks · Updated Nov 21, 2025

Inference Llama 2 in one file of pure C

C · 19,136 stars · 2,441 forks · Updated Aug 6, 2024

High-speed Large Language Model Serving for Local Deployment

C++ · 8,602 stars · 479 forks · Updated Jan 24, 2026

Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052

C++ · 476 stars · 37 forks · Updated Mar 15, 2024

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Python · 1,316 stars · 78 forks · Updated Mar 6, 2025

A lightweight LLM inference framework.

C++ · 748 stars · 94 forks · Updated Apr 7, 2024

A series of large language models developed by Baichuan Intelligent Technology

Python · 4,117 stars · 293 forks · Updated Nov 8, 2024

Chinese translation of llm-numbers

129 stars · 6 forks · Updated Dec 25, 2023

Numbers every LLM developer should know

4,279 stars · 140 forks · Updated Jan 16, 2024
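
The back-of-envelope figures these lists collect can be reproduced with a few lines of arithmetic; a worked example, assuming Llama-2-7B-like shapes (32 layers, hidden size 4096, fp16) purely for illustration:

```python
# Rough memory math for a 7B-parameter decoder in fp16 (2 bytes per value).
params = 7e9
weight_bytes = params * 2                      # ~14 GB of weights
print(f"weights: {weight_bytes / 1e9:.1f} GB")

# KV cache per token: 2 (K and V) * layers * hidden_size * bytes_per_element.
layers, hidden, dtype_bytes = 32, 4096, 2      # Llama-2-7B-like shapes (assumed)
kv_per_token = 2 * layers * hidden * dtype_bytes
print(f"KV cache: {kv_per_token / 1e6:.2f} MB per token")                # ~0.5 MB/token
print(f"KV cache for 4096 tokens: {kv_per_token * 4096 / 1e9:.1f} GB")   # ~2.1 GB
```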

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python · 12,733 stars · 2,043 forks · Updated Jan 27, 2026
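
A sketch of how the Python API mentioned above is typically used, assuming the project's high-level `LLM` interface; the import path and argument names are my assumption and have changed across releases, so check the current TensorRT-LLM docs before relying on this:

```python
# Assumption: TensorRT-LLM's high-level "LLM API"; exact names and arguments vary by release.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")        # HF id or local checkpoint (placeholder)
sampling = SamplingParams(temperature=0.8, max_tokens=64)  # argument names are an assumption

for output in llm.generate(["What is paged attention?"], sampling):
    print(output.outputs[0].text)
```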

Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)

Python · 2,698 stars · 294 forks · Updated Aug 14, 2024

The official Python library for the OpenAI API

Python · 29,777 stars · 4,521 forks · Updated Jan 27, 2026
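
For reference, a basic chat-completion call with this library looks roughly like this (the model name and prompt are placeholders; an OPENAI_API_KEY must be set in the environment):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Summarize speculative decoding in two sentences."}],
)
print(response.choices[0].message.content)
```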

Simple, safe way to store and distribute tensors

Python · 3,602 stars · 293 forks · Updated Jan 14, 2026
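
A minimal save/load round trip with the library's PyTorch helpers (the tensor names and file path are arbitrary):

```python
import torch
from safetensors.torch import save_file, load_file

tensors = {"embedding.weight": torch.zeros(4, 8), "lm_head.weight": torch.ones(8, 4)}
save_file(tensors, "model.safetensors")        # safe, mmap-friendly on-disk format

restored = load_file("model.safetensors")
print(restored["embedding.weight"].shape)      # torch.Size([4, 8])
```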

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python · 3,424 stars · 289 forks · Updated Jul 17, 2025

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python · 7,554 stars · 649 forks · Updated Jan 26, 2026

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python · 3,861 stars · 296 forks · Updated Jan 27, 2026