tokenizers Public
Forked from huggingface/tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Rust · Apache License 2.0 · Updated May 13, 2025
BetterChatGPT Public
Forked from ztjhz/BetterChatGPT
An amazing UI for OpenAI's ChatGPT (Website + Windows + MacOS + Linux)
TypeScript · Creative Commons Zero v1.0 Universal · Updated Aug 14, 2024
vllm Public
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
djl-serving Public
Forked from deepjavalibrary/djl-serving
A universal scalable machine learning model deployment solution
Java · Apache License 2.0 · Updated May 22, 2024
flash-attention Public
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
lm-evaluation-harness Public
Forked from EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
Python · MIT License · Updated Feb 2, 2024
PipeEdge Public
Forked from usc-isi/PipeEdge
PipeEdge: Pipeline Parallelism for Large-Scale Model Inference on Heterogeneous Edge Devices
Python · BSD 3-Clause "New" or "Revised" License · Updated Jan 31, 2024
ColossalAI-Documentation Public
Forked from hpcaitech/ColossalAI-Documentation
Documentation for Colossal-AI
JavaScript · Apache License 2.0 · Updated Jan 16, 2024
llm-awq Public
Forked from mit-han-lab/llm-awq
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Python · MIT License · Updated Sep 14, 2023
llama.cpp Public
Forked from ggml-org/llama.cpp
Port of Facebook's LLaMA model in C/C++
C · MIT License · Updated Sep 14, 2023
pytorch Public
Forked from pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Python · Other · Updated Sep 8, 2023
text-generation-webui Public
Forked from oobabooga/text-generation-webui
A Gradio web UI for Large Language Models. Supports transformers, GPTQ, llama.cpp (ggml/gguf), Llama models.
Python · GNU Affero General Public License v3.0 · Updated Aug 29, 2023
AITemplate Public
Forked from facebookincubator/AITemplate
AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Python · Apache License 2.0 · Updated Jul 31, 2023
ColossalAI Public
Forked from hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
Python · Apache License 2.0 · Updated Jul 20, 2023
alpa Public
Forked from alpa-projects/alpa
Training and serving large-scale neural networks
Python · Apache License 2.0 · Updated May 19, 2023
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for Pytorch, TensorFlow, and JAX.
accelerate Public
Forked from huggingface/accelerate
🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision
Python · Apache License 2.0 · Updated Feb 28, 2023
tensorflow-fork Public
Forked from tensorflow/tensorflow
An Open Source Machine Learning Framework for Everyone
C++ · Apache License 2.0 · Updated Oct 20, 2022
detr Public
Forked from facebookresearch/detr
End-to-End Object Detection with Transformers
maskrcnn-benchmark Public
Forked from facebookresearch/maskrcnn-benchmark
Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.
Python · MIT License · Updated Mar 3, 2022
tvm Public
Forked from apache/tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
Python · Apache License 2.0 · Updated Jan 25, 2022