Awesome-AI-Papers

A collection of up to 10 of the most impressive AI papers from each year. Actively updated.

Papers

2025

[2025] Qwen3 Technical Report (Qwen3). [paper]
[2025] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (DeepSeek-R1). [paper]
[2025] Kimi K1.5: Scaling Reinforcement Learning with LLMs (Kimi K1.5). [paper]
[2025] Scaling In-the-Wild Training for Diffusion-based Illumination Harmonization and Editing (IC-Light, ICLR 2025). [paper]
[2025] Learning to (Learn at Test Time): RNNs with Expressive Hidden States (TTT). [paper]

2024

[2024] The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (2024). [paper]
[2024] Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (2024). [paper]
[2024] The Llama 3 Herd of Models (Llama 3). [paper]
[2024] Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context (Gemini 1.5). [paper]
[2024] Mixtral of Experts (SMoE). [paper]
[2024] Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone (Phi-3). [paper]
[2024] A survey on evaluation of large language models (Survey). [paper]
[2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Models (Vim). [paper]
[2024] DeepSeek-V3 Technical Report (DeepSeek-V3). [paper]

2023

[2023] GPT-4 Technical Report (GPT-4). [paper]
[2023] Llama 2: Open Foundation and Fine-Tuned Chat Models (Llama 2). [paper]
[2023] Mamba: Linear-time sequence modeling with selective state spaces (Mamba). [paper]
[2023] LLaMA: Open and Efficient Foundation Language Models (LLaMA). [paper]
[2023] QLoRA: Efficient Finetuning of Quantized LLMs (QLoRA). [paper]
[2023] Gemini: A Family of Highly Capable Multimodal Models (Gemini). [paper]
[2023] Qwen Technical Report (Qwen). [paper]
[2023] PaLM: Scaling Language Modeling with Pathways (PaLM, JMLR 2023). [paper]
[2023] Visual Instruction Tuning (LLaVA, NeurIPS 2023). [paper]

2022

[2022] Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (CoT). [paper]
[2022] Training language models to follow instructions with human feedback (InstructGPT, GPT-3.5). [paper]
[2022] Masked Autoencoders Are Scalable Vision Learners (MAE, CVPR 2022). [paper]
[2022] High-Resolution Image Synthesis with Latent Diffusion Models (StableDiffusion, CVPR 2022). [paper]
[2022] LoRA: Low-Rank Adaptation of Large Language Models (ICLR 2022). [paper]
[2022] Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding (Imagen, NeurIPS 2022). [paper]

2021

[2021] Emerging Properties in Self-Supervised Vision Transformers (DINO). [paper]
[2021] Highly accurate protein structure prediction with AlphaFold (Nature 2021). [paper]
[2021] Hierarchical Vision Transformer using Shifted Windows (Swin, ICCV 2021). [paper]
[2021] An image is worth 16x16 words: Transformers for image recognition at scale (ViT, ICLR 2021). [paper]
[2021] Learning Transferable Visual Models From Natural Language Supervision (CLIP, ICML 2021). [paper]
[2021] Zero-Shot Text-to-Image Generation (DALL-E, ICML 2021). [paper]
[2021] Evaluating Large Language Models Trained on Code (Codex). [paper]

2020

[2020] End-to-end object detection with transformers (DETR, ECCV 2020). [paper]
[2020] Language Models are Few-Shot Learners (GPT-3, NeurIPS 2020). [paper]
[2020] Denoising Diffusion Probabilistic Models (Diffusion, NeurIPS 2020). [paper]
[2020] YOLOv4: Optimal Speed and Accuracy of Object Detection (YOLOv4). [paper]
[2020] Exploring the limits of transfer learning with a unified text-to-text transformer (T5). [paper]
[2020] EfficientDet: Scalable and efficient object detection (CVPR 2020). [paper]
[2020] A Simple Framework for Contrastive Learning of Visual Representations (ICML 2020). [paper]
[2020] ALBERT: A Lite BERT for Self-supervised Learning of Language Representations (ICLR 2020). [paper]

2000-2019

[2019] Language Models are Unsupervised Multitask Learners (GPT-2). [paper]
[2019] Decoupled Weight Decay Regularization (AdamW, ICLR 2019). [paper]
[2019] BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension (BART). [paper]
[2018] Improving language understanding by generative pre-training (GPT-1). [paper]
[2018] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (BERT). [paper]
[2017] Mastering the game of Go without human knowledge (AlphaGo Zero, Nature 2017). [paper]
[2017] Attention Is All You Need (Transformer, NeurIPS 2017). [paper]
[2017] PointNet: Deep learning on point sets for 3D classification and segmentation (PointNet, CVPR 2017). [paper]
[2017] Mask R-CNN (ICCV 2017). [paper]
[2016] Neural Architecture Search with Reinforcement Learning (NAS). [paper]
[2016] Mastering the game of Go with deep neural networks and tree search (AlphaGo, Nature 2016). [paper]
[2016] Deep Residual Learning for Image Recognition (ResNet, CVPR 2016). [paper]
[2016] You only look once: Unified, real-time object detection (YOLO, CVPR 2016). [paper]
[2015] Deep learning (Nature 2015). [paper]
[2015] Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (BN, ICML 2015). [paper]
[2015] Adam: A Method for Stochastic Optimization (Adam). [paper]
[2015] U-Net: Convolutional Networks for Biomedical Image Segmentation (U-Net). [paper]
[2015] Very deep convolutional networks for large-scale image recognition (VGG, ICLR 2015). [paper]
[2014] Generative Adversarial Nets (GAN, NeurIPS 2014). [paper]
[2014] Neural Machine Translation by Jointly Learning to Align and Translate (Attention). [paper]
[2014] Dropout: a simple way to prevent neural networks from overfitting (Dropout). [paper]
[2014] Sequence to Sequence Learning with Neural Networks (Seq2seq, NeurIPS 2014). [paper]
[2014] Distilling the Knowledge in a Neural Network (Knowledge Distillation). [paper]
[2013] Distributed Representations of Words and Phrases and their Compositionality (word2vec, NeurIPS 2013). [paper]
[2013] Playing Atari with Deep Reinforcement Learning (DQN). [paper]
[2013] Auto-Encoding Variational Bayes (VAE). [paper]
[2012] ImageNet classification with deep convolutional neural networks (AlexNet, NeurIPS 2012). [paper]
[2011] Deep Sparse Rectifier Neural Networks (ReLU). [paper]
[2009] ImageNet: A large-scale hierarchical image database (CVPR 2009). [paper]

1986-2000

[1998] Gradient-based learning applied to document recognition (CNN, Proceedings of the IEEE 1998). [paper]
[1997] Long Short-Term Memory (LSTM, Neural Computation 1997). [paper]
[1990] Finding Structure in Time (RNN). [paper]
[1986] Learning internal representations by error propagation (BP, Parallel Distributed Processing 1986). [paper]

Notice

This collection is somewhat subjective and limited by the maintainers' knowledge. Apologies for any omissions.
An earlier list also exists: [another awesome deep learning list].
