- Handbook of Graphical Models. online
- Deep Learning. online
- Neural Networks and Deep Learning. online
- Speech and Language Processing. online

- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. paper
- GPT-2: Language Models are Unsupervised Multitask Learners. paper
- Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context. paper
- XLNet: Generalized Autoregressive Pretraining for Language Understanding. paper
- RoBERTa: A Robustly Optimized BERT Pretraining Approach. paper
- DistilBERT: a distilled version of BERT: smaller, faster, cheaper and lighter. paper
- ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. paper
- T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. paper
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. paper
- GPT-3: Language Models are Few-Shot Learners. paper

- LSTM (Long Short-Term Memory). paper
- Sequence to Sequence Learning with Neural Networks. paper
- Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. paper
- Residual Network (Deep Residual Learning for Image Recognition). paper
- Dropout (Improving neural networks by preventing co-adaptation of feature detectors). paper
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. paper

- An overview of gradient descent optimization algorithms. paper
- Analysis Methods in Neural Language Processing: A Survey. paper
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. paper
- A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications. paper
- A Gentle Introduction to Deep Learning for Graphs. paper
- A Survey on Deep Learning for Named Entity Recognition. paper
- More Data, More Relations, More Context and More Openness: A Review and Outlook for Relation Extraction. paper
- Deep Learning Based Text Classification: A Comprehensive Review. paper
- Pre-trained Models for Natural Language Processing: A Survey. paper
- A Survey on Contextual Embeddings. paper
- A Survey on Knowledge Graphs: Representation, Acquisition and Applications. paper
- Knowledge Graphs. paper

- A Neural Probabilistic Language Model. paper
- word2vec Parameter Learning Explained. paper
- Language Models are Unsupervised Multitask Learners. paper
- An Empirical Study of Smoothing Techniques for Language Modeling. paper
- Efficient Estimation of Word Representations in Vector Space. paper
- Distributed Representations of Sentences and Documents. paper
- Enriching Word Vectors with Subword Information (FastText). paper
- GloVe: Global Vectors for Word Representation. online
- ELMo (Deep contextualized word representations). paper
- Pre-Training with Whole Word Masking for Chinese BERT. paper

- Bag of Tricks for Efficient Text Classification (FastText). paper
- Convolutional Neural Networks for Sentence Classification. paper
- Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification. paper

- A Deep Ensemble Model with Slot Alignment for Sequence-to-Sequence Natural Language Generation. paper
- SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient. paper

- Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks. paper
- Learning Text Similarity with Siamese Recurrent Networks. paper
- A Deep Architecture for Matching Short Texts. paper

- A Question-Focused Multi-Factor Attention Network for Question Answering. paper
- The Design and Implementation of XiaoIce, an Empathetic Social Chatbot. paper
- A Knowledge-Grounded Neural Conversation Model. paper
- Neural Generative Question Answering. paper
- Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots. paper
- Modeling Multi-turn Conversation with Deep Utterance Aggregation. paper
- Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network. paper
- Deep Reinforcement Learning For Modeling Chit-Chat Dialog With Discrete Attributes. paper

- Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. paper
- Neural Machine Translation by Jointly Learning to Align and Translate. paper
- Transformer (Attention Is All You Need). paper

- Get To The Point: Summarization with Pointer-Generator Networks. paper
- Deep Recurrent Generative Decoder for Abstractive Text Summarization. paper

- Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks. paper
- Neural Relation Extraction with Multi-lingual Attention. paper
- FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation. paper
- End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures. paper

- Training language models to follow instructions with human feedback. paper
- LLaMA: Open and Efficient Foundation Language Models. paper