THUNLP

259 repositories

StateX
Public
The official implementation of the paper "StateX: Enhancing RNN Recall via Post-training State Expansion".
machine-learning memory rnn recall ssm mamba linear-attention llm long-context
Python
•0•1•0•0•Updated Oct 24, 2025
KG-Infused-RAG
Public
Official implementation for the paper "KG-Infused RAG: Augmenting Corpus-Based RAG with External Knowledge Graphs"
Python
•0•10•0•0•Updated Oct 21, 2025
AgentRM
Public
[ACL 2025 main] AgentRM: Enhancing Agent Generalization with Reward Modeling
Python
•0•4•0•0•Updated Sep 29, 2025
stuffed-mamba
Public
The code for the paper "Stuffed Mamba: Oversized States Lead to the Inability to Forget"
machine-learning rnn mamba long-context
Python
•0•1•0•0•Updated Sep 28, 2025
BurstEngine
Public
BurstEngine is an efficient framework designed to train LLMs on long-sequence data.
Python
•2•7•0•0•Updated Sep 25, 2025
cost-optimal-gqa
Public
The code for the paper "Cost-Optimal Grouped-Query Attention for Long-Context Modeling"
natural-language-processing transformer attention long-context llms
Python
•1•3•1•0•Updated Sep 14, 2025
LLMxMapReduce
Public
Python
•
Apache License 2.0
•63•826•1•0•Updated Sep 12, 2025
SIR-Bench
Public
Python
•
Apache License 2.0
•0•3•1•0•Updated Sep 12, 2025
Seq1F1B
Public
Sequence-level 1F1B schedule for LLMs.
Python
•
Other
•3.2k•32•1•0•Updated Aug 26, 2025
ProactiveAgent
Public
An LLM-based agent that predicts its tasks proactively.
Python
•
Apache License 2.0
•36•432•5•0•Updated Aug 22, 2025
ChartCoder
Public
[ACL'25 Main] ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation
Python
•2•64•2•0•Updated Jul 30, 2025
FR-Spec
Public
[ACL 2025 main] FR-Spec: Frequency-Ranked Speculative Sampling
C++
•2•44•3•0•Updated Jul 15, 2025
BlockFFN
Public
Source code for the paper "BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity".
Python
•5•17•0•0•Updated Jul 14, 2025
TritonBench
Public
TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators
Python
•
Apache License 2.0
•9•88•4•1•Updated Jun 14, 2025
ClueAnchor
Public
Python
•0•8•0•0•Updated Jun 11, 2025
DeepPerception
Public
DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding
Python
•
MIT License
•1•65•1•0•Updated Jun 10, 2025
TAADpapers
Public
Must-read Papers on Textual Adversarial Attack and Defense
nlp natural-language-processing adversarial-learning adversarial-attacks paper-list adversarial-defense
Python
•
MIT License
•195•1.6k•3•0•Updated Jun 4, 2025
LongPiBench
Public
Python
•
MIT License
•0•0•0•0•Updated May 28, 2025
DIET
Public
Official code for "The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training"
Python
•0•0•0•0•Updated May 27, 2025
Migician
Public
[ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models
Python
•
MIT License
•4•80•0•0•Updated May 20, 2025
ToLeaP
Public
Python
•
MIT License
•1•5•1•1•Updated May 17, 2025
SICOG
Public
Will Pre-Training Ever End? A First Step Toward Next-Generation Foundation MLLMs via Self-Improving Systematic Cognition
Python
•
GNU General Public License v3.0
•2•31•1•0•Updated May 14, 2025
LLaVA-UHD
Public
LLaVA-UHD v2: an MLLM Integrating High-Resolution Semantic Pyramid via Hierarchical Window Transformer
Python
•
Apache License 2.0
•20•388•10•0•Updated Apr 20, 2025
Dynamics-of-Zero-Shot-Generalization
Public
Code for the paper "The Right Time Matters: Data Arrangement Affects Zero-Shot Generalization in Instruction Tuning"
Python
•0•5•0•0•Updated Apr 8, 2025
DeepNote
Public
Python
•8•130•1•0•Updated Apr 7, 2025
Ouroboros
Public
Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main)
Python
•
Apache License 2.0
•9•110•4•0•Updated Mar 20, 2025
SchemaReinforcementLearning
Public
Learning to Generate STRUCTURED Output with Schema Reinforcement Learning
Python
•
Apache License 2.0
•4•18•0•0•Updated Mar 2, 2025
APB
Public
Official Implementation of APB (ACL 2025 main Oral)
C++
•3•31•0•0•Updated Feb 22, 2025
EmbodiedEval
Public
Evaluate Multimodal LLMs as Embodied Agents
Python
•
MIT License
•4•54•2•0•Updated Feb 14, 2025
LEGENT
Public
Open Platform for Embodied Agents
physics-engine robot-simulator language-grounding embodied-ai large-multimodal-models
Python
•
Apache License 2.0
•23•331•9•1•Updated Jan 12, 2025