Stars
A curated list of products, benchmarks, and research papers on autonomous code agents. Beyond coding — they're redefining how software changes the world.
ValueCell is a community-driven, multi-agent platform for financial applications.
Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation.
Automatically fetches diffusion NLP papers from arXiv. More information on these papers can be found in another repository, "Diffusion-LM-Papers".
Learning to Parallel: Accelerating Diffusion Large Language Models via Learnable Parallel Decoding
The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search
[EMNLP2025] From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery
When Agent Becomes the Scientist – Building Closed-Loop System from Hypothesis to Verification
Live-streamed development of RL tuning for LLM agents
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning
[TMLR] A curated list of language modeling research for code (and other software engineering activities), plus related datasets.
From Zero to Hero: Cold-Start Anomaly Detection (ACL 2024)
The dataset and code for the paper "SciReplicate-Bench: Benchmarking LLMs in Agent-driven Algorithmic Reproduction from Research Papers"
Official implementation code for "Bridging Local Details and Global Context in Text-Attributed Graphs" (EMNLP 2024 Main)
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
[EMNLP 2024] Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths
[EMNLP 2024 Findings] A radiology report generation metric that leverages the natural language understanding of language models to identify and explain clinically significant errors in candidate r…
[EMNLP 2024 Findings] Official PyTorch Implementation of "Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation"
Tongyi Deep Research, the Leading Open-source Deep Research Agent
KEN: Unleash the power of large language models with the easiest and universal non-parametric pruning algorithm