Stars
HISIM introduces a suite of system-level analytical models to speed up performance prediction for AI models, covering logic-on-logic architectures across 2D, 2.5D, 3D, and 3.5D integration.
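As a rough illustration of what such a system-level analytical model computes (this is not HISIM's actual formulation; every name and number below is hypothetical), a first-order latency estimate can balance compute time against die-to-die communication time:

```python
def layer_latency_s(flops, d2d_bytes, peak_flops=256e12, d2d_bw=512e9):
    """First-order latency estimate for one layer mapped across stacked dies.

    flops:      arithmetic work in the layer
    d2d_bytes:  activation traffic crossing the die-to-die interface
    peak_flops: aggregate compute throughput (hypothetical 256 TFLOP/s)
    d2d_bw:     die-to-die bandwidth in bytes/s (hypothetical 512 GB/s)
    """
    t_compute = flops / peak_flops
    t_comm = d2d_bytes / d2d_bw
    # Assume compute and communication overlap perfectly, so the slower
    # of the two dominates; summing them would model zero overlap.
    return max(t_compute, t_comm)

# Example: a 2 GFLOP layer moving 64 MiB across the interface
print(layer_latency_s(2e9, 64 * 2**20))  # communication-bound in this case
```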
Open-source framework for the HPCA 2024 paper "Gemini: Mapping and Architecture Co-exploration for Large-scale DNN Chiplet Accelerators".
A curated collection of papers on MoE model inference.
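For readers new to the area, the operation that makes MoE inference distinctive is sparse expert routing: each token activates only its top-k experts. A minimal PyTorch sketch of top-2 gating (illustrative only; production systems add load balancing and batched dispatch):

```python
import torch
import torch.nn.functional as F

def top2_gate(x, gate_w):
    """x: (tokens, d_model); gate_w: (d_model, n_experts)."""
    probs = F.softmax(x @ gate_w, dim=-1)
    weights, experts = probs.topk(2, dim=-1)           # top-2 experts per token
    weights = weights / weights.sum(-1, keepdim=True)  # renormalize the pair
    return weights, experts

def moe_forward(x, gate_w, expert_mlps):
    weights, experts = top2_gate(x, gate_w)
    out = torch.zeros_like(x)
    for e, mlp in enumerate(expert_mlps):
        for k in range(2):
            mask = experts[:, k] == e        # tokens routed to expert e
            if mask.any():
                out[mask] += weights[mask, k:k+1] * mlp(x[mask])
    return out

# Tiny example: 8 tokens, 4 experts
torch.manual_seed(0)
d = 16
expert_mlps = [torch.nn.Linear(d, d) for _ in range(4)]
y = moe_forward(torch.randn(8, d), torch.randn(d, 4), expert_mlps)
print(y.shape)  # torch.Size([8, 16])
```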
Repository to host and maintain SCALE-Sim code
A processing-in-memory simulator that models 3D-stacked memory within gem5. Also includes the workloads used for IMPICA (In-Memory PoInter Chasing Accelerator), an ICCD 2016 paper by Hsieh et al.
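To see why pointer chasing benefits from processing-in-memory, note that each hop is a dependent load whose address comes from the previous load, so the CPU cannot hide the round-trip latency. A small Python model of such a workload (illustrative, not IMPICA code):

```python
import random

def build_chain(n, seed=0):
    """Return (next_index, head) for a random singly linked chain of n nodes."""
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)                    # random layout defeats prefetchers
    nxt = [0] * n
    for a, b in zip(order, order[1:] + order[:1]):
        nxt[a] = b
    return nxt, order[0]

def chase(nxt, head, hops):
    i = head
    for _ in range(hops):
        i = nxt[i]   # the next address depends on this load's result
    return i

nxt, head = build_chain(1 << 20)
print(chase(nxt, head, 1000))
```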
End-to-end SoC simulation: integrating the gem5 system simulator with the Aladdin accelerator simulator.
The official repository for the gem5 computer-system architecture simulator.
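gem5 simulations are themselves configured with Python scripts. Below is a minimal SE-mode configuration in the spirit of the Learning gem5 tutorial; class and port names match recent releases (roughly v21+) but may drift between versions, and it assumes an X86 build of gem5:

```python
import m5
from m5.objects import *

system = System()
system.clk_domain = SrcClockDomain(clock="1GHz",
                                   voltage_domain=VoltageDomain())
system.mem_mode = "timing"
system.mem_ranges = [AddrRange("512MB")]

system.cpu = X86TimingSimpleCPU()          # simple in-order timing CPU
system.membus = SystemXBar()
system.cpu.icache_port = system.membus.cpu_side_ports
system.cpu.dcache_port = system.membus.cpu_side_ports
system.cpu.createInterruptController()
system.cpu.interrupts[0].pio = system.membus.mem_side_ports
system.cpu.interrupts[0].int_requestor = system.membus.cpu_side_ports
system.cpu.interrupts[0].int_responder = system.membus.mem_side_ports
system.system_port = system.membus.cpu_side_ports

system.mem_ctrl = MemCtrl()                # DDR3 memory controller
system.mem_ctrl.dram = DDR3_1600_8x8()
system.mem_ctrl.dram.range = system.mem_ranges[0]
system.mem_ctrl.port = system.membus.mem_side_ports

binary = "tests/test-progs/hello/bin/x86/linux/hello"
system.workload = SEWorkload.init_compatible(binary)
process = Process()
process.cmd = [binary]
system.cpu.workload = process
system.cpu.createThreads()

root = Root(full_system=False, system=system)
m5.instantiate()
exit_event = m5.simulate()
print(f"Exited @ tick {m5.curTick()}: {exit_event.getCause()}")
```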
Ramulator 2.0 is a modern, modular, extensible, and fast cycle-accurate DRAM simulator. It provides support for agile implementation and evaluation of new memory system designs (e.g., new DRAM standards).
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models spanning text, vision, audio, and multimodal tasks, for both inference and training.
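A minimal usage example with the library's `pipeline` API, which bundles tokenizer, model, and post-processing (a default checkpoint is downloaded on first run):

```python
from transformers import pipeline

# Task-level API: one object handles preprocessing, inference, decoding
classifier = pipeline("sentiment-analysis")
print(classifier("This simulator is wonderfully fast."))
# e.g. [{'label': 'POSITIVE', 'score': 0.9998...}]
```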
Unofficial implementation of LSQ-Net, a neural network quantization framework
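For context, LSQ (Learned Step Size Quantization, Esser et al., ICLR 2020) learns the quantization step size by gradient descent, using a straight-through estimator for rounding and a per-layer gradient scale. A minimal PyTorch sketch of the core quantizer (not this repository's code):

```python
import torch

class LSQQuantizer(torch.nn.Module):
    """Minimal LSQ sketch: learnable step size `s`, straight-through rounding."""

    def __init__(self, bits=4, signed=True):
        super().__init__()
        self.qn = -(2 ** (bits - 1)) if signed else 0
        self.qp = (2 ** (bits - 1)) - 1 if signed else (2 ** bits) - 1
        self.s = torch.nn.Parameter(torch.tensor(1.0))  # init from data in practice

    def forward(self, x):
        # Gradient scale g = 1 / sqrt(numel * Qp), as prescribed by the paper
        g = 1.0 / float(x.numel() * self.qp) ** 0.5
        s = self.s * g + (self.s * (1 - g)).detach()  # forward: s; grad scaled by g
        q = torch.clamp(x / s, self.qn, self.qp)
        q = q + (q.round() - q).detach()              # straight-through round
        return q * s                                  # dequantized output

x = torch.randn(8, requires_grad=True)
quant = LSQQuantizer(bits=4)
quant(x).sum().backward()
print(quant.s.grad)  # the step size receives a (scaled) gradient
```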
A reading list for SRAM-based Compute-In-Memory (CIM) research.
An implementation of YOLO using the LSQ network quantization method.
NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing
AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.
A comprehensive tool for system-level performance estimation of chiplet-based in-memory computing (IMC) architectures.
The PULP Ara is a 64-bit Vector Unit, compatible with the RISC-V Vector Extension Version 1.0, working as a coprocessor to CORE-V's CVA6 core
Verilog AXI components for FPGA implementation
Vision Transformer (ViT) in PyTorch
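Assuming this is the lucidrains/vit-pytorch package, README-style usage looks like the following (the hyperparameters here are arbitrary):

```python
import torch
from vit_pytorch import ViT

model = ViT(
    image_size=256,   # input resolution
    patch_size=32,    # 8x8 grid of patches
    num_classes=1000,
    dim=1024,         # token embedding width
    depth=6,          # number of transformer blocks
    heads=16,
    mlp_dim=2048,
)

img = torch.randn(1, 3, 256, 256)
logits = model(img)   # shape: (1, 1000)
print(logits.shape)
```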