简体中文 | English

Features | Installation | Quick Start | API Reference | Community

News

🔥 2022.5.18-19 We will introduce UIE (Universal Information Extraction) and the ERNIE 3.0 light-weight model. Welcome to join us!
🔥 2022.5.16 PaddleNLP v2.3 Released!🎉
- 🔥 Released the ERNIE 3.0 light-weight model, which achieves better results than ERNIE 2.0 on the CLUE benchmark; ERNIE-Health, a SOTA biomedical pretrained model on CBLUE; and PLATO-XL with FasterGeneration, which enables fast parallel inference with an 11B large-scale model.
- 🔥 Released UIE (Universal Information Extraction), a technique in which a single model supports NER, Relation Extraction, Event Extraction and Sentiment Analysis simultaneously.

Features

PaddleNLP is an easy-to-use and high-performance NLP library with awesome pre-trained Transformer models, supporting a wide range of NLP tasks from research to industrial applications.

Off-the-shelf NLP Pre-built Task

Awesome Chinese Pre-trained Model Zoo

Industrial End-to-end NLP System

High Performance Distributed Training and Inference

Off-the-shelf NLP Pre-built Task

Taskflow aims to provide off-the-shelf pre-built NLP tasks covering NLU and NLG scenarios, with extremely fast inference that satisfies industrial applications.

For more usage, please refer to the Taskflow Docs.
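As a rough illustration of the calling pattern, the sketch below builds a task once by name and applies it to several texts. It assumes the `paddlenlp.Taskflow` class and the `word_segmentation` task name. The `run_task` helper and its `factory` parameter are hypothetical, added only so the pattern can be tried without downloading a model.

```python
from typing import Callable, List, Optional


def run_task(task_name: str, texts: List[str],
             factory: Optional[Callable[[str], Callable]] = None) -> list:
    """Build a task once, then apply it to each text in turn.

    `factory` is a hypothetical hook for this sketch; when it is omitted,
    the real `paddlenlp.Taskflow` class is used instead.
    """
    if factory is None:
        # Imported lazily so this module loads even without PaddleNLP installed.
        from paddlenlp import Taskflow
        factory = Taskflow
    task = factory(task_name)  # e.g. Taskflow("word_segmentation")
    return [task(text) for text in texts]


# A stand-in factory that splits on whitespace, used to show the call shape.
def fake_factory(name: str) -> Callable[[str], List[str]]:
    return lambda text: text.split()


print(run_task("word_segmentation", ["natural language processing"],
               factory=fake_factory))
```

With PaddleNLP installed, calling `run_task("word_segmentation", [...])` without `factory` should construct the real task, which typically downloads its default model on first use.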

Awesome Chinese Pre-trained Model Zoo

Comprehensive Chinese Transformer Models

We provide 45+ network architectures and 500+ pretrained models. These include not only all the SOTA models released by Baidu, such as ERNIE, PLATO and SKEP, but also most of the high-quality Chinese pretrained models developed by other organizations. Use the AutoModel API to ⚡FAST⚡ download pretrained models of different architectures. We welcome all developers to contribute their Transformer models to PaddleNLP!

from paddlenlp.transformers import AutoModel, AutoModelForPretraining

# Each call below loads a pretrained model by its name.

ernie = AutoModel.from_pretrained('ernie-3.0-base-zh')
bert = AutoModel.from_pretrained('bert-wwm-chinese')
albert = AutoModel.from_pretrained('albert-chinese-tiny')
roberta = AutoModel.from_pretrained('roberta-wwm-ext')
electra = AutoModel.from_pretrained('chinese-electra-small')
gpt = AutoModelForPretraining.from_pretrained('gpt-cpm-large-cn')

A unified API experience for NLP tasks such as semantic representation, text classification, sentence matching, sequence labeling, and question answering.

import paddle
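from paddlenlp.transformers import (
    AutoModel,
    AutoModelForQuestionAnswering,
    AutoModelForSequenceClassification,
    AutoModelForTokenClassification,
    AutoTokenizer,
)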

tokenizer = AutoTokenizer.from_pretrained('ernie-3.0-medium-zh')
text = tokenizer('natural language processing')

# Semantic Representation
model = AutoModel.from_pretrained('ernie-3.0-medium-zh')
sequence_output, pooled_output = model(input_ids=paddle.to_tensor([text['input_ids']]))
# Text Classification and Matching
model = AutoModelForSequenceClassification.from_pretrained('ernie-3.0-medium-zh')
# Sequence Labeling
model = AutoModelForTokenClassification.from_pretrained('ernie-3.0-medium-zh')
# Question Answering
model = AutoModelForQuestionAnswering.from_pretrained('ernie-3.0-medium-zh')

PaddleNLP Transformer model summary:

Model	Sequence Classification	Token Classification	Question Answering	Text Generation	Multiple Choice
ALBERT	✅	✅	✅	❌	✅
BART	✅	✅	✅	✅	❌
BERT	✅	✅	✅	❌	✅
BigBird	✅	✅	✅	❌	✅
BlenderBot	❌	❌	❌	✅	❌
ChineseBERT	✅	✅	✅	❌	❌
ConvBERT	✅	✅	✅	❌	✅
CTRL	✅	❌	❌	❌	❌
DistilBERT	✅	✅	✅	❌	❌
ELECTRA	✅	✅	✅	❌	✅
ERNIE	✅	✅	✅	❌	✅
ERNIE-CTM	❌	✅	❌	❌	❌
ERNIE-Doc	✅	✅	✅	❌	❌
ERNIE-GEN	❌	❌	❌	✅	❌
ERNIE-Gram	✅	✅	✅	❌	❌
ERNIE-M	✅	✅	✅	❌	❌
FNet	✅	✅	✅	❌	✅
Funnel-Transformer	✅	✅	✅	❌	❌
GPT	✅	✅	❌	✅	❌
LayoutLM	✅	✅	❌	❌	❌
LayoutLMv2	❌	✅	❌	❌	❌
LayoutXLM	❌	✅	❌	❌	❌
LUKE	❌	✅	✅	❌	❌
mBART	✅	❌	✅	❌	✅
MegatronBERT	✅	✅	✅	❌	✅
MobileBERT	✅	❌	✅	❌	❌
MPNet	✅	✅	✅	❌	✅
NEZHA	✅	✅	✅	❌	✅
PP-MiniLM	✅	❌	❌	❌	❌
ProphetNet	❌	❌	❌	✅	❌
Reformer	✅	❌	✅	❌	❌
RemBERT	✅	✅	✅	❌	✅
RoBERTa	✅	✅	✅	❌	✅
RoFormer	✅	✅	✅	❌	❌
SKEP	✅	✅	❌	❌	❌
SqueezeBERT	✅	✅	✅	❌	❌
T5	❌	❌	❌	✅	❌
TinyBERT	✅	❌	❌	❌	❌
UnifiedTransformer	❌	❌	❌	✅	❌
XLNet	✅	✅	✅	❌	✅
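To make the table easier to query, here is a minimal sketch that encodes a few of its rows as booleans and looks up which architectures support a given task. Only five rows are copied from the table for brevity. The names `TASKS`, `SUPPORT` and `models_supporting` are hypothetical and not part of PaddleNLP.

```python
# Column order follows the table header above.
TASKS = [
    "Sequence Classification",
    "Token Classification",
    "Question Answering",
    "Text Generation",
    "Multiple Choice",
]

# A subset of rows copied from the table (True = supported).
SUPPORT = {
    "BART":  (True, True, True, True, False),
    "BERT":  (True, True, True, False, True),
    "ERNIE": (True, True, True, False, True),
    "GPT":   (True, True, False, True, False),
    "T5":    (False, False, False, True, False),
}


def models_supporting(task: str) -> list:
    """Return the architectures in SUPPORT that are marked for `task`."""
    column = TASKS.index(task)  # raises ValueError for an unknown task
    return sorted(name for name, row in SUPPORT.items() if row[column])


print(models_supporting("Text Generation"))   # ['BART', 'GPT', 'T5']
print(models_supporting("Multiple Choice"))   # ['BERT', 'ERNIE']
```

Extending `SUPPORT` with the remaining rows would cover the full table.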

For more pretrained model usage, please refer to the Transformer API Docs.

Wide-range NLP Task Support

PaddleNLP provides rich application examples covering mainstream NLP tasks to help developers accelerate problem solving. You can find our powerful Transformer Model Zoo and a wide range of NLP application examples with detailed instructions.

You can also run our interactive Notebook tutorials on AI Studio, a powerful platform with FREE computing resources.

Industrial End-to-end System Cases

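We provide end-to-end system examples for high-frequency NLP scenarios including information extraction, semantic retrieval, question answering, and sentiment analysis. They cover the full workflow of data annotation, model training, tuning, and inference deployment, and continuously lower the barrier to applying NLP technology in industry. For more details on industrial cases, please refer to Applications.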

Speech Command Analysis

Integrating an ASR model with information extraction, we provide a speech command analysis pipeline that shows how to use PaddleNLP and PaddleSpeech to solve real-world Speech + NLP scenarios.

For more details, please refer to Speech Command Analysis.

Semantic Retrieval System

For more details, please refer to Neural Search.

Question Answering System

We provide a question answering pipeline, based on the RocketQA technique, that supports FAQ systems and document-level visual question answering systems.

For more details, please refer to Question Answering.

Opinion Extraction and Sentiment Analysis

We build an opinion extraction system for product reviews and fine-grained sentiment analysis based on the SKEP model.

For more details, please refer to Sentiment Analysis.

High Performance Distributed Training and Inference

Fleet API: 4D Hybrid Distributed Training

For more on super large-scale model training, please refer to GPT-3.

FasterTokenizers: High Performance Text Preprocessing Library

For more usage, please refer to FasterTokenizers.

FasterGeneration: High Performance Generation Utilities

For more usage, please refer to FasterGeneration.

Community👬

Special Interest Group (SIG)

Welcome to join the PaddleNLP SIG to contribute, e.g. datasets, models and toolkits.

Slack

To connect with other users and contributors, join our Slack channel.

WeChat

Scan the QR code below with WeChat⬇️ to join the official technical exchange group. We look forward to your participation.

Installation

Prerequisites

python >= 3.6
paddlepaddle >= 2.2

For more information about installing PaddlePaddle, please refer to PaddlePaddle's website.

Python pip Installation

pip install --upgrade paddlenlp
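As a quick sanity check of the prerequisites, the following sketch compares the running Python and the installed `paddlepaddle` distribution against the minimum versions listed above. The helper names are hypothetical. It assumes the package is installed under the distribution name `paddlepaddle` (a GPU build may use a different name), and it needs Python 3.8+ itself because it relies on `importlib.metadata`.

```python
import re
import sys
from importlib import metadata

MIN_PYTHON = (3, 6)   # python >= 3.6
MIN_PADDLE = "2.2"    # paddlepaddle >= 2.2


def parse_version(text: str) -> tuple:
    """Turn '2.3.0rc0' into (2, 3, 0), keeping only leading numeric parts."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two version strings numerically, padding with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment() -> list:
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python is older than 3.6")
    try:
        installed = metadata.version("paddlepaddle")
    except metadata.PackageNotFoundError:
        problems.append("paddlepaddle distribution not found")
    else:
        if not meets_minimum(installed, MIN_PADDLE):
            problems.append(f"paddlepaddle {installed} is older than {MIN_PADDLE}")
    return problems


print(check_environment())
```

Numeric comparison matters here: a plain string comparison would rank `2.10.0` below `2.2`.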

More API Usage

Please find the full API Reference on our readthedocs.

Citation

If you find PaddleNLP useful in your research, please consider citing:

@misc{paddlenlp,
    title={PaddleNLP: An Easy-to-use and High Performance NLP Library},
    author={PaddleNLP Contributors},
    howpublished = {\url{https://github.com/PaddlePaddle/PaddleNLP}},
    year={2021}
}

Acknowledgements

We have borrowed from the excellent design of Hugging Face's Transformers🤗 for pretrained model usage, and we would like to express our gratitude to the authors of Hugging Face and its open source community.

License

PaddleNLP is provided under the Apache-2.0 License.
