University of Electronic Science and Technology of China
Chengdu, Sichuan, China
Stars
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
High-speed Large Language Model Serving for Local Deployment
The official GitHub page for the survey paper "A Survey of Large Language Models".
Chinese translation of Hands-On-Large-Language-Models (hands-on-llms): learn large language models hands-on.
Consolidated Clash policy-group rules for streaming media and more. Clash proxy rules. Route a website or media service through a server in a specific country.
✔ (Completed) The most comprehensive deep learning notes [Tudui PyTorch] [Mu Li, Dive into Deep Learning] [Andrew Ng, Deep Learning].
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
My learning notes and code for ML systems (ML SYS).
A curated list to learn about distributed systems
Papers from the computer science community to read and discuss.
LLM notes, covering model inference, transformer model structure, and LLM framework code analysis.
A cheatsheet of modern C++ language and library features.
Master programming by recreating your favorite technologies from scratch.
The Linux Kernel Module Programming Guide (updated for 5.0+ kernels)
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
This project shares the technical principles behind large language models along with hands-on experience (LLM engineering and deploying LLM applications in production).
A 2023 compilation of 1,000 high-quality blog posts on C++ backend development, covering memory, networking, architecture design, high performance, data structures, basic components, middleware, and distributed systems.
llama3 implementation, one matrix multiplication at a time.
Deep learning systems notes, covering the mathematical foundations of deep learning, detailed explanations of basic neural network components, model training strategies, and model compression algorithms.
Low-Level Software Security for Compiler Developers
A minimal GPU design in Verilog to learn how GPUs work from the ground up