🔥 2021.5.18-19 We will introduce UIE (Universal Information Extraction) and ERNIE 3.0 light-weight model. Welcome to join us!
🔥 2022.5.16 PaddleNLP v2.3 Released!🎉
- 🔥 Release of the ERNIE 3.0 lightweight models, which outperform ERNIE 2.0 on the CLUE benchmark; ERNIE-Health, a SOTA biomedical pretrained model on CBLUE; and PLATO-XL with FasterGeneration, which enables fast parallel inference with an 11B-parameter model.
- 🔥 Release of UIE (Universal Information Extraction), in which a single model supports NER, Relation Extraction, Event Extraction, and Sentiment Analysis simultaneously.
PaddleNLP is an easy-to-use, high-performance NLP library with a rich collection of pre-trained Transformer models, supporting a wide range of NLP tasks from research to industrial applications.
Taskflow provides off-the-shelf, pre-built NLP tasks covering NLU and NLG scenarios, with extremely fast inference suited to industrial applications.
For more usage, please refer to the Taskflow Docs.
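As a minimal sketch of how Taskflow is typically called, the helper below builds a task by name and runs it on a list of texts. The function names `run_task` and `demo` are hypothetical and exist only for this illustration; the task names `word_segmentation` and `information_extraction` and the `schema` argument are assumed to match the Taskflow release you have installed, and the schema labels are illustrative.

```python
from typing import Any, List


def run_task(task: str, texts: List[str], **task_kwargs: Any) -> Any:
    """Build a Taskflow for `task` and run it on `texts`.

    PaddleNLP is imported lazily so this module can be loaded on a machine
    where the library is not installed yet.
    """
    from paddlenlp import Taskflow  # requires `pip install paddlenlp`

    # The default model for the task is downloaded on first use.
    flow = Taskflow(task, **task_kwargs)
    return flow(texts)


def demo() -> None:
    """Run two tasks end to end (needs PaddleNLP and network access)."""
    # Chinese word segmentation with the default model.
    print(run_task("word_segmentation", ["第十四届全运会在西安举办"]))

    # UIE: the schema lists the entity types to extract (illustrative labels).
    sentence = "2月8日上午北京冬奥会自由式滑雪女子大跳台决赛中中国选手谷爱凌以188.25分获得金牌！"
    print(run_task("information_extraction", [sentence], schema=["时间", "选手", "赛事名称"]))
```

Calling `demo()` triggers the model downloads, so the first run may take a while.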
We provide 45+ network architectures and 500+ pretrained models. These include all the SOTA models released by Baidu, such as ERNIE, PLATO, and SKEP, as well as most of the high-quality Chinese pretrained models developed by other organizations. Use the AutoModel API to ⚡FAST⚡ download pretrained models of different architectures. We welcome all developers to contribute their Transformer models to PaddleNLP!
```python
from paddlenlp.transformers import AutoModel, AutoModelForPretraining

# Each call downloads (and caches) the named pretrained weights.
ernie = AutoModel.from_pretrained('ernie-3.0-base-zh')
bert = AutoModel.from_pretrained('bert-wwm-chinese')
albert = AutoModel.from_pretrained('albert-chinese-tiny')
roberta = AutoModel.from_pretrained('roberta-wwm-ext')
electra = AutoModel.from_pretrained('chinese-electra-small')
gpt = AutoModelForPretraining.from_pretrained('gpt-cpm-large-cn')
```

PaddleNLP offers a unified API experience for NLP tasks such as semantic representation, text classification, sentence matching, sequence labeling, and question answering.
```python
import paddle
from paddlenlp.transformers import (
    AutoModel,
    AutoModelForQuestionAnswering,
    AutoModelForSequenceClassification,
    AutoModelForTokenClassification,
    AutoTokenizer,
)

tokenizer = AutoTokenizer.from_pretrained('ernie-3.0-medium-zh')
text = tokenizer('natural language processing')

# Semantic Representation
model = AutoModel.from_pretrained('ernie-3.0-medium-zh')
# Wrap input_ids in a list to add the batch dimension.
sequence_output, pooled_output = model(input_ids=paddle.to_tensor([text['input_ids']]))

# Text Classification and Matching
model = AutoModelForSequenceClassification.from_pretrained('ernie-3.0-medium-zh')

# Sequence Labeling
model = AutoModelForTokenClassification.from_pretrained('ernie-3.0-medium-zh')

# Question Answering
model = AutoModelForQuestionAnswering.from_pretrained('ernie-3.0-medium-zh')
```

PaddleNLP Transformer model summary
| Model | Sequence Classification | Token Classification | Question Answering | Text Generation | Multiple Choice |
|---|---|---|---|---|---|
| ALBERT | ✅ | ✅ | ✅ | ❌ | ✅ |
| BART | ✅ | ✅ | ✅ | ✅ | ❌ |
| BERT | ✅ | ✅ | ✅ | ❌ | ✅ |
| BigBird | ✅ | ✅ | ✅ | ❌ | ✅ |
| BlenderBot | ❌ | ❌ | ❌ | ✅ | ❌ |
| ChineseBERT | ✅ | ✅ | ✅ | ❌ | ❌ |
| ConvBERT | ✅ | ✅ | ✅ | ❌ | ✅ |
| CTRL | ✅ | ❌ | ❌ | ❌ | ❌ |
| DistilBERT | ✅ | ✅ | ✅ | ❌ | ❌ |
| ELECTRA | ✅ | ✅ | ✅ | ❌ | ✅ |
| ERNIE | ✅ | ✅ | ✅ | ❌ | ✅ |
| ERNIE-CTM | ❌ | ✅ | ❌ | ❌ | ❌ |
| ERNIE-Doc | ✅ | ✅ | ✅ | ❌ | ❌ |
| ERNIE-GEN | ❌ | ❌ | ❌ | ✅ | ❌ |
| ERNIE-Gram | ✅ | ✅ | ✅ | ❌ | ❌ |
| ERNIE-M | ✅ | ✅ | ✅ | ❌ | ❌ |
| FNet | ✅ | ✅ | ✅ | ❌ | ✅ |
| Funnel-Transformer | ✅ | ✅ | ✅ | ❌ | ❌ |
| GPT | ✅ | ✅ | ❌ | ✅ | ❌ |
| LayoutLM | ✅ | ✅ | ❌ | ❌ | ❌ |
| LayoutLMv2 | ❌ | ✅ | ❌ | ❌ | ❌ |
| LayoutXLM | ❌ | ✅ | ❌ | ❌ | ❌ |
| LUKE | ❌ | ✅ | ✅ | ❌ | ❌ |
| mBART | ✅ | ❌ | ✅ | ❌ | ✅ |
| MegatronBERT | ✅ | ✅ | ✅ | ❌ | ✅ |
| MobileBERT | ✅ | ❌ | ✅ | ❌ | ❌ |
| MPNet | ✅ | ✅ | ✅ | ❌ | ✅ |
| NEZHA | ✅ | ✅ | ✅ | ❌ | ✅ |
| PP-MiniLM | ✅ | ❌ | ❌ | ❌ | ❌ |
| ProphetNet | ❌ | ❌ | ❌ | ✅ | ❌ |
| Reformer | ✅ | ❌ | ✅ | ❌ | ❌ |
| RemBERT | ✅ | ✅ | ✅ | ❌ | ✅ |
| RoBERTa | ✅ | ✅ | ✅ | ❌ | ✅ |
| RoFormer | ✅ | ✅ | ✅ | ❌ | ❌ |
| SKEP | ✅ | ✅ | ❌ | ❌ | ❌ |
| SqueezeBERT | ✅ | ✅ | ✅ | ❌ | ❌ |
| T5 | ❌ | ❌ | ❌ | ✅ | ❌ |
| TinyBERT | ✅ | ❌ | ❌ | ❌ | ❌ |
| UnifiedTransformer | ❌ | ❌ | ❌ | ✅ | ❌ |
| XLNet | ✅ | ✅ | ✅ | ❌ | ✅ |
For more pretrained model usage, please refer to Transformer API Docs.
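The support matrix above can also be kept as a small lookup structure, for example to check a model against a task before loading it. The sketch below copies only a handful of rows from the table; the names `SUPPORT`, `AUTO_CLASS`, `models_supporting`, and `supports` are hypothetical helpers rather than part of PaddleNLP, and the Auto class names are the ones used in the snippets earlier in this README.

```python
from typing import Dict, List, Set

# A few rows copied from the support table; extend with the remaining rows as needed.
SUPPORT: Dict[str, Set[str]] = {
    "BART": {"sequence_classification", "token_classification",
             "question_answering", "text_generation"},
    "ERNIE": {"sequence_classification", "token_classification",
              "question_answering", "multiple_choice"},
    "GPT": {"sequence_classification", "token_classification", "text_generation"},
    "SKEP": {"sequence_classification", "token_classification"},
    "T5": {"text_generation"},
}

# Auto classes that appear in the README snippets, keyed by task.
AUTO_CLASS: Dict[str, str] = {
    "sequence_classification": "AutoModelForSequenceClassification",
    "token_classification": "AutoModelForTokenClassification",
    "question_answering": "AutoModelForQuestionAnswering",
}


def models_supporting(task: str) -> List[str]:
    """Return the model families (alphabetically) whose row has a check mark for `task`."""
    return sorted(name for name, tasks in SUPPORT.items() if task in tasks)


def supports(model: str, task: str) -> bool:
    """Return True if `model` is listed as supporting `task`; unknown models yield False."""
    return task in SUPPORT.get(model, set())
```

For instance, `supports("SKEP", "question_answering")` is false because the SKEP row has no check mark in the Question Answering column.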
PaddleNLP provides rich application examples covering mainstream NLP tasks to help developers solve problems faster. You can find our Transformer Model Zoo and a wide range of NLP application examples with detailed instructions.
You can also run our interactive notebook tutorials on AI Studio, a platform with free computing resources.
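For high-frequency NLP scenarios such as information extraction, semantic retrieval, question answering, and sentiment analysis, PaddleNLP provides end-to-end system examples that cover the full workflow of data annotation, model training, tuning, and inference deployment, continuously lowering the barrier to industrial adoption of NLP technology. For more details on industrial cases, please refer to Applications.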
By integrating an ASR model with information extraction, we provide a speech command analysis pipeline that shows how to use PaddleNLP and PaddleSpeech to solve real Speech + NLP scenarios.
For more details, please refer to Speech Command Analysis.
For more details, please refer to Neural Search.
We provide a question answering pipeline that supports FAQ systems and document-level visual question answering systems, based on RocketQA.
For more details, please refer to Question Answering.
We build an opinion extraction system for product reviews and fine-grained sentiment analysis based on the SKEP model.
For more details, please refer to Sentiment Analysis.
For more on super-large-scale model training, please refer to GPT-3.
For more usage, please refer to FasterTokenizers.
For more usage, please refer to FasterGeneration.
You are welcome to join the PaddleNLP SIG and contribute, e.g. datasets, models, and toolkits.
To connect with other users and contributors, join our Slack channel.
Scan the QR code below with WeChat⬇️ to join the official technical exchange group. We look forward to your participation.
- python >= 3.6
- paddlepaddle >= 2.2
For more information about installing PaddlePaddle, please refer to PaddlePaddle's website.

```shell
pip install --upgrade paddlenlp
```
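The requirements above can be checked before installing. The following is a minimal sketch using only the standard library; the helper names (`parse_version`, `meets_minimum`, `installed_version`, `check_environment`) are hypothetical, and `importlib.metadata` itself needs Python 3.8 or newer, so on older interpreters a different lookup would be required. It also assumes the GPU build is distributed under the separate package name `paddlepaddle-gpu`.

```python
import re
import sys
from importlib import metadata
from typing import Dict, Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '2.3.0' or '2.3.0rc1' into (2, 3, 0), ignoring any non-numeric suffix."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, required: str) -> bool:
    """Compare two dotted versions after padding them to equal length with zeros."""
    have, need = parse_version(installed), parse_version(required)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment() -> Dict[str, bool]:
    """Check the two requirements listed above: python >= 3.6 and paddlepaddle >= 2.2."""
    # Accept either the CPU or the GPU distribution of PaddlePaddle.
    paddle = installed_version("paddlepaddle") or installed_version("paddlepaddle-gpu")
    return {
        "python": sys.version_info >= (3, 6),
        "paddlepaddle": paddle is not None and meets_minimum(paddle, "2.2"),
    }
```

`check_environment()` returns one boolean per requirement, so a setup script can report exactly which dependency is missing or too old.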
For the full API reference, please see our Read the Docs site.
If you find PaddleNLP useful in your research, please consider citing it:

```bibtex
@misc{paddlenlp,
    title={PaddleNLP: An Easy-to-use and High Performance NLP Library},
    author={PaddleNLP Contributors},
    howpublished = {\url{https://github.com/PaddlePaddle/PaddleNLP}},
    year={2021}
}
```
We have borrowed from the excellent design of Hugging Face's Transformers🤗 for pretrained model usage, and we would like to express our gratitude to the authors of Hugging Face and its open source community.
PaddleNLP is provided under the Apache-2.0 License.










