- data/total_word_feature_extractor_zh.dat
Trained from a Chinese corpus with the MITIE wordrep tool (training takes 2-3 days).
To train your own, build the MITIE wordrep tool first. Note that the Chinese corpus must be tokenized before it is fed to the tool; a closed-domain corpus that closely matches your use case works best (see the tokenization sketch below).
A model trained on the Chinese Wikipedia dump and Baidu Baike can be downloaded from the accompanying Chinese blog post (中文Blog).
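For example, the tokenization pass could be done with Jieba; a minimal sketch (the file names here are placeholders, not part of this project):

```python
# Tokenize a raw Chinese corpus line by line with Jieba so the MITIE
# wordrep tool sees space-separated words. File names are hypothetical.
import jieba

with open("corpus_raw.txt", encoding="utf-8") as src, \
     open("corpus_tokenized.txt", "w", encoding="utf-8") as dst:
    for line in src:
        dst.write(" ".join(jieba.cut(line.strip())) + "\n")
```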
- data/examples/rasa/demo-rasa_zh.json
Add as many examples as possible; a sample entry is sketched below.
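For reference, a single entry in the rasa NLU JSON training format looks roughly like this (the text and entity values mirror the demo query at the end of this section):

```json
{
  "rasa_nlu_data": {
    "common_examples": [
      {
        "text": "我发烧了该吃什么药?",
        "intent": "medical",
        "entities": [
          { "start": 1, "end": 3, "value": "发烧", "entity": "disease" }
        ]
      }
    ]
  }
}
```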
- Clone this project, and run
python setup.py install
- Modify the configuration.
Currently, for Chinese we provide two pipelines (a sample config file is sketched after this list):
Use MITIE+Jieba (config_jieba_mitie.json):
["nlp_mitie", "tokenizer_jieba", "ner_mitie", "ner_synonyms", "intent_classifier_mitie"]
Use MITIE+Jieba+sklearn (config_jieba_mitie_sklearn.json):
["nlp_mitie", "tokenizer_jieba", "ner_mitie", "ner_synonyms", "intent_featurizer_mitie", "intent_classifier_sklearn"]
- Train model by running:
python -m rasa_nlu.train -c config_jieba_mitie.json
or
python -m rasa_nlu.train -c config_jieba_mitie_sklearn.json
This will save your trained model under /models.
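Training can also be driven from Python; this is a rough sketch assuming the pre-1.0 rasa_nlu API:

```python
# Programmatic equivalent of the training command above,
# assuming the pre-1.0 rasa_nlu API.
from rasa_nlu.converters import load_data
from rasa_nlu.config import RasaNLUConfig
from rasa_nlu.model import Trainer

config = RasaNLUConfig("config_jieba_mitie.json")
trainer = Trainer(config)
trainer.train(load_data("data/examples/rasa/demo-rasa_zh.json"))
# persist() writes a timestamped model directory under the configured path
model_directory = trainer.persist("./models")
```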
- Run the rasa_nlu server:
python -m rasa_nlu.server -c config_jieba_mitie.json --server_model_dirs=./model_20170701_mitie_chi
or
python -m rasa_nlu.server -c config_jieba_mitie_sklearn.json --server_model_dirs=./model_20170701_mitie_sklearn_chi
Replace the config JSON file and the model path with your own.
- Open a new terminal; you can now curl results from the server, for example:
$ curl -XPOST localhost:5000/parse -d '{"q":"我发烧了该吃什么药?"}' | python -mjson.tool
{
    "entities": [
        {
            "end": 3,
            "entity": "disease",
            "extractor": "ner_jieba_mitie",
            "start": 1,
            "value": "\u53d1\u70e7"
        }
    ],
    "intent": {
        "confidence": 0.02073156639321614,
        "name": "medical"
    },
    "text": "\u6211\u53d1\u70e7\u4e86\u8be5\u5403\u4ec0\u4e48\u836f\uff1f"
}
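The escaped strings decode back to Chinese: the query text is 我发烧了该吃什么药? ("I have a fever; what medicine should I take?") and the extracted entity value is 发烧 ("fever"). The same request can be sent from Python, assuming the requests package is installed:

```python
# Query the running rasa_nlu server from Python instead of curl.
import requests

resp = requests.post("http://localhost:5000/parse",
                     json={"q": "我发烧了该吃什么药?"})
print(resp.json())
```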