Combining CV with NLP tasks, with a focus on Medical Report Generation, Image/Video Captioning, VQA, Anchor-free Object Detection, and Weakly Supervised Segmentation.
- Image/Video Captioning
- Paragraph Description Generation
- Visual Question Answering
- Medical Report Generation
- Medical Image Processing
- Object Detection
- Segmentation
- Weakly Supervised Segmentation
- Metrics
- Others
- CNN-RNN
- Show and Tell: A Neural Image Caption Generator, Oriol Vinyals et al, CVPR 2015, Google(pdf)
- Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, Kelvin Xu et al, ICML 2015(pdf)(code)
- Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge, PAMI 2016(pdf)(code)
- Areas of Attention for Image Captioning, ICCV 2017(pdf)
- Rethinking the Form of Latent States in Image Captioning, ECCV 2018, CUHK(pdf)
- Recurrent Fusion Network for Image Captioning, ECCV 2018, Tencent AI Lab, Fudan University(pdf)
- Move Forward and Tell: A Progressive Generator of Video Descriptions, ECCV 2018, CUHK(pdf)
- Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks, CVPR 2016(pdf)
 
- CNN-CNN
- Reinforcement Learning
- Others
- A Neural Compositional Paradigm for Image Captioning, NIPS 2018, CUHK(pdf)
 
- CNN-RNN
- DenseCap: Fully Convolutional Localization Networks for Dense Captioning, Justin Johnson et al, CVPR 2016, Stanford(homepage)(code)
- A Hierarchical Approach for Generating Descriptive Image Paragraphs, Jonathan Krause et al, CVPR 2017, Stanford(homepage)(dense-caption code)
- Recurrent Topic-Transition GAN for Visual Paragraph Generation, ICCV 2017
- Diverse and Coherent Paragraph Generation from Images, ECCV 2018(code)
 
- CNN-RNN
- Multi-level Attention Networks for Visual Question Answering, CVPR 2017
- Motion-Appearance Co-Memory Networks for Video Question Answering, 2018
- Deep Attention Neural Tensor Network for Visual Question Answering, ECCV 2018, HIT
- Question-Guided Hybrid Convolution for Visual Question Answering, Peng Gao et al, ECCV 2018, CUHK(pdf)
 
- CNN-RNN
- Learning to Read Chest X-Rays: Recurrent Neural Cascade Model for Automated Image Annotation, CVPR 2016(pdf)
- TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays, Xiaosong Wang et al, CVPR 2018, NIH(pdf)(author's homepage)
- On the Automatic Generation of Medical Imaging Reports, Baoyu Jing et al., ACL 2018, CMU(pdf)(author's homepage)
- Multimodal Recurrent Model with Attention for Automated Radiology Report Generation, Yuan Xue et al., MICCAI 2018, PSU(pdf)
- Attention-Based Abnormal-Aware Fusion Network for Radiology Report Generation, Xiancheng Xie et al., 2019, Fudan University
- Addressing Data Bias Problems for Chest X-ray Image Report Generation, Philipp Harzig et al., 2019, University of Augsburg(pdf)
 
- Reinforcement Learning
- Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation, Christy Y. Li et al, NIPS 2018, CMU(pdf)(author's homepage)
 
- Knowledge Graph
- Knowledge-Driven Encode, Retrieve, Paraphrase for Medical Image Report Generation, Christy Y. Li et al, AAAI 2019, Duke University(pdf)
 
- Other
- TextRay: Mining Clinical Reports to Gain a Broad Understanding of Chest X-rays, MICCAI 2018(pdf)
 
- Blogs
- NIH Chest X-ray8/14(download link)(Kaggle's download link)
- ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases, CVPR 2017, NIH(pdf)
 
- Open-i Chest X-Ray(download link)
- Radiology Objects in COntext(ROCO)
- Radiology Objects in COntext (ROCO): A Multimodal Image Dataset, MICCAI 2018(intro)(pdf)(download)
 
- Detection
- CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning, 2018, Andrew Ng
- Attention-Guided Curriculum Learning for Weakly Supervised Classification and Localization of Thoracic Diseases on Chest Radiographs, Yuxing Tang et al, MICCAI-MLMI oral 2018, NIH(pdf)
- DeepRadiologyNet: Radiologist Level Pathology Detection in CT Head Images
- A Method for Detecting Lesion Regions in Lung CT Images (original title: 肺部CT图像病变区域检测方法)
- A Quantitative Radiomics-Based Method for Predicting Benign versus Malignant Lung Tumors (original title: 基于定量影像组学的肺肿瘤良恶性预测方法)
 
- Enhance
- Super Resolution
- Image Super-Resolution Using Deep Convolutional Networks
- Deeply-Recursive Convolutional Network for Image Super-Resolution
 
 
- Super Resolution
- Segmentation
- U-Net: Convolutional Networks for Biomedical Image Segmentation, MICCAI 2015
- A 3D Coarse-to-Fine Framework for Automatic Pancreas Segmentation
 
- Weakly-supervised
- Anchor-based
- Anchor-free
- YOLO, You Only Look Once: Unified, Real-Time Object Detection, Joseph Redmon et al, CVPR 2016(pdf)(note)
- CornerNet, CornerNet: Detecting Objects as Paired Keypoints, Hei Law et al, ECCV 2018, University of Michigan(pdf)(code)(blog)
- FCOS, FCOS: Fully Convolutional One-Stage Object Detection, Zhi Tian et al, ICCV 2019, University of Adelaide(pdf)(code)(blog)
- CenterNet, Objects as Points, Xingyi Zhou et al, 2019, UT Austin(pdf)(code)
 
- Others
- Semantic Segmentation
- Instance Segmentation
- Bounding Box Supervision
- Weakly- and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation, Liang-Chieh Chen et al., ICCV 2015, UCLA(pdf)(deeplab-v1-code)(model)(note)
- BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation, Jifeng Dai et al., ICCV 2015, Microsoft Research(pdf)
- Simple Does It: Weakly Supervised Instance and Semantic Segmentation, Anna Khoreva et al., CVPR 2017, Max Planck Institute for Informatics(pdf)(code)(tf-code)
- Box-driven Class-wise Region Masking and Filling Rate Guided Loss for Weakly Supervised Semantic Segmentation, Chunfeng Song et al, CVPR 2019, CASIA(pdf)
 
- Image Label Supervision
- Fully Convolutional Multi-Class Multiple Instance Learning, Deepak Pathak et al., ICLR 2015, UC Berkeley(pdf)(note)
- From Image-level to Pixel-level Labeling with Convolutional Networks, Pedro O. Pinheiro et al., CVPR 2015, Idiap Research Institute, Martigny(pdf)(note)
- DSRG, Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing, Zilong Huang et al., CVPR 2018, HUST(pdf)(code)
- SSENet, Self-supervised Scale Equivariant Network for Weakly Supervised Semantic Segmentation, Yude Wang et al., 2019, CAS(pdf)(code)
 
- Others
- DenseCRF, Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials, Philipp Krähenbühl et al., NIPS 2011, Stanford University(pdf)(homepage)(code)
- A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains, Lyndon Chan et al., 2019(pdf)
 
- Good References
- BLEU
- BLEU: a method for automatic evaluation of machine translation, Kishore Papineni et al, ACL 2002(pdf)
 
- CIDEr
- Visual Commonsense Reasoning (VCR)
- From Recognition to Cognition: Visual Commonsense Reasoning, Rowan Zellers et al, 2018, Paul G. Allen School(homepage)(pdf)
 
- Language Model
- Transformer: Attention Is All You Need, Ashish Vaswani et al, NIPS 2017, Google Brain/Research(pdf)(code)(blog)
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Jacob Devlin et al, 2018, Google AI Language(pdf)(code)(slides)
- ELMo: Deep contextualized word representations, Matthew E. Peters et al, NAACL 2018, Paul G. Allen School(homepage)(pdf)(code-tf)
 
- Teacher Forcing Policy
- Classification
- VGG, Very Deep Convolutional Networks for Large-Scale Image Recognition, Karen Simonyan et al., ICLR 2015(pdf)
- Inception, Going Deeper with Convolutions, Christian Szegedy et al, CVPR 2015, Google(pdf)
- ResNet, Deep Residual Learning for Image Recognition, Kaiming He et al, CVPR 2016, Microsoft Research(pdf)(code)(blog)
- SENet: Squeeze-and-Excitation Networks, Jie Hu et al, CVPR 2018, Momenta (a Chinese autonomous-driving company) and Oxford University(pdf)(code)(blog)