简体中文 | English


News

  • 🔥 2021.5.18-19 We will introduce UIE (Universal Information Extraction) and ERNIE 3.0 light-weight model. Welcome to join us!

  • 🔥 2022.5.16 PaddleNLP v2.3 Released!🎉

    • 🔥 Released the ERNIE 3.0 light-weight model, which achieves better results than ERNIE 2.0 on the CLUE benchmark; released ERNIE-Health, a SOTA biomedical pretrained model on CBLUE; released PLATO-XL with FasterGeneration, which enables fast parallel inference with the 11B large-scale model.
    • 🔥 Released the UIE (Universal Information Extraction) technique, in which a single model supports NER, Relation Extraction, Event Extraction, and Sentiment Analysis simultaneously.

Features

PaddleNLP is an easy-to-use, high-performance NLP library with awesome pre-trained Transformer models, supporting a wide range of NLP tasks from research to industrial applications.

Off-the-shelf NLP Pre-built Task

Taskflow aims to provide off-the-shelf, pre-built NLP tasks covering NLU and NLG scenarios, with extremely fast inference that meets the needs of industrial applications.


For more usage please refer to Taskflow Docs
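As a rough illustration of the off-the-shelf workflow described above, the sketch below wraps `Taskflow` in a small helper. The helper `run_task`, the `REQUESTS` list, and `main` are hypothetical names introduced only for this example; the task names and the `schema` argument follow the Taskflow documentation as we understand it, so check them against the Taskflow Docs for your installed version. `paddlenlp` is imported lazily, and the first call to each task downloads its default model.

```python
def run_task(task_name, text, **task_kwargs):
    """Build a Taskflow pipeline for `task_name` and apply it to `text`."""
    # Lazy import: requires `pip install paddlenlp`; the first call for a task
    # downloads that task's default model.
    from paddlenlp import Taskflow

    pipeline = Taskflow(task_name, **task_kwargs)
    return pipeline(text)


# Example requests: (task name, input text, extra keyword arguments).
# The inputs are Chinese because the default Taskflow models target Chinese text.
REQUESTS = [
    # "The 14th National Games were held in Xi'an"
    ("word_segmentation", "第十四届全运会在西安举办", {}),
    # "This product is really smooth to use"
    ("sentiment_analysis", "这个产品用起来真的很流畅", {}),
    # UIE takes a schema naming what to extract; here two entity types,
    # "time" and "athlete".
    ("information_extraction", "2月8日上午谷爱凌获得金牌", {"schema": ["时间", "选手"]}),
]


def main():
    """Run every example request and print the raw Taskflow output."""
    for task_name, text, task_kwargs in REQUESTS:
        print(task_name, run_task(task_name, text, **task_kwargs))
```

Call `main()` once PaddleNLP is installed. In a real application you would typically build each `Taskflow` object once and reuse it, since construction loads the model.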

Awesome Chinese Pre-trained Model Zoo

Comprehensive Chinese Transformer Models

We provide 45+ network architectures and 500+ pretrained models. The collection not only includes all the SOTA models released by Baidu, such as ERNIE, PLATO, and SKEP, but also integrates most of the high-quality Chinese pretrained models developed by other organizations. Use the AutoModel API to ⚡FAST⚡ download pretrained models of different architectures. We welcome all developers to contribute their Transformer models to PaddleNLP!

from paddlenlp.transformers import AutoModel, AutoModelForPretraining

# Each call downloads (and caches) the named pretrained weights
ernie = AutoModel.from_pretrained('ernie-3.0-base-zh')
bert = AutoModel.from_pretrained('bert-wwm-chinese')
albert = AutoModel.from_pretrained('albert-chinese-tiny')
roberta = AutoModel.from_pretrained('roberta-wwm-ext')
electra = AutoModel.from_pretrained('chinese-electra-small')
gpt = AutoModelForPretraining.from_pretrained('gpt-cpm-large-cn')

A unified API experience for NLP tasks such as semantic representation, text classification, sentence matching, sequence labeling, and question answering.

import paddle
from paddlenlp.transformers import (
    AutoModel,
    AutoModelForQuestionAnswering,
    AutoModelForSequenceClassification,
    AutoModelForTokenClassification,
    AutoTokenizer,
)

tokenizer = AutoTokenizer.from_pretrained('ernie-3.0-medium-zh')
text = tokenizer('natural language processing')

# Semantic Representation
model = AutoModel.from_pretrained('ernie-3.0-medium-zh')
# Wrap the token ids in a list to add the batch dimension
sequence_output, pooled_output = model(input_ids=paddle.to_tensor([text['input_ids']]))
# Text Classification and Matching
model = AutoModelForSequenceClassification.from_pretrained('ernie-3.0-medium-zh')
# Sequence Labeling
model = AutoModelForTokenClassification.from_pretrained('ernie-3.0-medium-zh')
# Question Answering
model = AutoModelForQuestionAnswering.from_pretrained('ernie-3.0-medium-zh')
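Because every task above differs only in which Auto class is used, the choice can be expressed as a lookup table. The following is a minimal sketch of that idea, not part of PaddleNLP itself: the task keys, `TASK_TO_AUTO_CLASS`, and `load_for_task` are hypothetical names chosen for illustration, while the class names are the ones used in the snippet above.

```python
# Hypothetical task keys mapped to the Auto class names used above.
TASK_TO_AUTO_CLASS = {
    "semantic_representation": "AutoModel",
    "sequence_classification": "AutoModelForSequenceClassification",
    "token_classification": "AutoModelForTokenClassification",
    "question_answering": "AutoModelForQuestionAnswering",
}


def load_for_task(task, model_name="ernie-3.0-medium-zh"):
    """Return (tokenizer, model) for `task`, both loaded from `model_name`."""
    if task not in TASK_TO_AUTO_CLASS:
        supported = ", ".join(sorted(TASK_TO_AUTO_CLASS))
        raise ValueError(f"unknown task {task!r}; expected one of: {supported}")

    # Imported lazily so the mapping can be inspected without paddlenlp installed.
    import paddlenlp.transformers as transformers

    auto_class = getattr(transformers, TASK_TO_AUTO_CLASS[task])
    tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
    model = auto_class.from_pretrained(model_name)
    return tokenizer, model
```

For example, `tokenizer, model = load_for_task("token_classification")` would load the same checkpoint with a sequence labeling head, assuming PaddleNLP is installed and the checkpoint can be downloaded.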
PaddleNLP Transformer model summary
Model Sequence Classification Token Classification Question Answering Text Generation Multiple Choice
ALBERT
BART
BERT
BigBird
BlenderBot
ChineseBERT
ConvBERT
CTRL
DistilBERT
ELECTRA
ERNIE
ERNIE-CTM
ERNIE-Doc
ERNIE-GEN
ERNIE-Gram
ERNIE-M
FNet
Funnel-Transformer
GPT
LayoutLM
LayoutLMv2
LayoutXLM
LUKE
mBART
MegatronBERT
MobileBERT
MPNet
NEZHA
PP-MiniLM
ProphetNet
Reformer
RemBERT
RoBERTa
RoFormer
SKEP
SqueezeBERT
T5
TinyBERT
UnifiedTransformer
XLNet

For more pretrained model usage, please refer to Transformer API Docs.

Wide-range NLP Task Support

PaddleNLP provides rich application examples covering mainstream NLP tasks to help developers accelerate problem solving. You can find our powerful Transformer Model Zoo and a wide range of NLP application examples with detailed instructions.

You can also run our interactive Notebook tutorials on AI Studio, a powerful platform with FREE computing resources.

Industrial End-to-end System Cases

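PaddleNLP provides end-to-end system examples for high-frequency NLP scenarios such as information extraction, semantic retrieval, intelligent question answering, and sentiment analysis. These examples cover the full workflow of data annotation, model training, tuning, and inference deployment, continuously lowering the barrier to industrial adoption of NLP technology. For more details on industrial cases, please refer to Applications.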

Speech Command Analysis

By integrating an ASR model with information extraction, we provide a speech command analysis pipeline that shows how to use PaddleNLP and PaddleSpeech to solve real-world Speech + NLP scenarios.

For more details please refer to Speech Command Analysis

Semantic Retrieval System

For more details please refer to Neural Search

Question Answering System

We provide a question answering pipeline, based on the RocketQA technique, that supports FAQ systems and document-level visual question answering systems.

For more details please refer to Question Answering

Opinion Extraction and Sentiment Analysis

We build an opinion extraction system for product reviews and fine-grained sentiment analysis based on the SKEP model.

For more details please refer to Sentiment Analysis

High Performance Distributed Training and Inference

Fleet API: 4D Hybrid Distributed Training

For more on super-large-scale model training, please refer to GPT-3.

FasterTokenizers: High Performance Text Preprocessing Library

For more usage please refer to FasterTokenizers

FasterGeneration: High Performance Generation Utilities

For more usage please refer to FasterGeneration

Community👬

Special Interest Group (SIG)

You are welcome to join the PaddleNLP SIG and contribute, e.g., datasets, models, and toolkits.

Slack

To connect with other users and contributors, you are welcome to join our Slack channel.

WeChat

Scan the QR code below with WeChat ⬇️ to join the official technical exchange group. We look forward to your participation.

Installation

Prerequisites

  • python >= 3.6
  • paddlepaddle >= 2.2

For more information about PaddlePaddle installation, please refer to PaddlePaddle's website.

Python pip Installation

pip install --upgrade paddlenlp
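To confirm that the prerequisites listed above are met, a short check along the following lines may help. This is a sketch, not an official tool: the function names are hypothetical, the script itself needs Python 3.8+ for `importlib.metadata`, and it assumes the packages are published under the distribution names `paddlepaddle` (or `paddlepaddle-gpu` for GPU builds) and `paddlenlp`.

```python
import re
import sys
from importlib import metadata  # standard library since Python 3.8


def parse_version(text):
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    return tuple(int(part) for part in match.group(0).split(".")) if match else ()


def installed_version(distribution):
    """Return the installed version of `distribution`, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Collect problems as strings; an empty list means the checks passed."""
    problems = []
    if sys.version_info < (3, 6):
        problems.append("Python >= 3.6 is required")
    # Assumption: GPU builds are installed under the name paddlepaddle-gpu.
    paddle_version = installed_version("paddlepaddle") or installed_version("paddlepaddle-gpu")
    if paddle_version is None:
        problems.append("paddlepaddle is not installed")
    elif parse_version(paddle_version) < (2, 2):
        problems.append(f"paddlepaddle >= 2.2 is required, found {paddle_version}")
    if installed_version("paddlenlp") is None:
        problems.append("paddlenlp is not installed")
    return problems


for problem in check_environment():
    print(problem)
```

The script prints one line per unmet prerequisite and nothing when all checks pass.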

More API Usage

Please find the full API reference in our documentation on Read the Docs.

Citation

If you find PaddleNLP useful in your research, please consider citing:

@misc{paddlenlp,
    title={PaddleNLP: An Easy-to-use and High Performance NLP Library},
    author={PaddleNLP Contributors},
    howpublished = {\url{https://github.com/PaddlePaddle/PaddleNLP}},
    year={2021}
}

Acknowledgements

We have borrowed from the excellent design of Hugging Face's Transformers🤗 for pretrained model usage, and we would like to express our gratitude to the authors of Hugging Face and its open source community.

License

PaddleNLP is provided under the Apache-2.0 License.