Janome is a Japanese morphological analysis engine written in pure Python.
General documentation:
https://janome.mocobeta.dev/en/ (English)
https://janome.mocobeta.dev/ja/ (Japanese)
Python 3.7+ is required.
[Note] The build consumes about 500 MB of memory.
(venv) $ pip install janome
(venv) $ python
>>> from janome.tokenizer import Tokenizer
>>> t = Tokenizer()
>>> for token in t.tokenize('すもももももももものうち'):
...     print(token)
...
すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も    助詞,係助詞,*,*,*,*,も,モ,モ
もも  名詞,一般,*,*,*,*,もも,モモ,モモ
も    助詞,係助詞,*,*,*,*,も,モ,モ
もも  名詞,一般,*,*,*,*,もも,モモ,モモ
の    助詞,連体化,*,*,*,*,の,ノ,ノ
うち  名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ

Licensed under Apache License 2.0 and uses the MeCab-IPADIC dictionary/statistical model.
See LICENSE.txt and NOTICE.txt for license details.
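
Returning to the tokenizer session shown earlier: each printed line is a token object, and the fields can also be read as attributes instead of parsing the printed string. The following is a minimal sketch, not part of the official documentation. It assumes tokens expose `surface` and `part_of_speech` attributes (as Janome's Token objects do) and that `part_of_speech` is a comma-separated string whose first field is the top-level category. The helper names `count_nouns` and `demo` are hypothetical and introduced only for illustration.

```python
from collections import Counter


def count_nouns(tokens):
    """Count surface forms of tokens whose top-level part of speech is 名詞 (noun).

    `tokens` is any iterable of objects with `surface` and `part_of_speech`
    attributes; Janome's Token objects are assumed to fit this shape.
    """
    counts = Counter()
    for token in tokens:
        # part_of_speech is assumed to look like '名詞,一般,*,*';
        # the first comma-separated field is the top-level category.
        if token.part_of_speech.split(',')[0] == '名詞':
            counts[token.surface] += 1
    return counts


def demo(text='すもももももももものうち'):
    """Tokenize `text` with Janome and count its nouns (hypothetical helper)."""
    # Imported lazily so count_nouns stays usable without Janome installed.
    from janome.tokenizer import Tokenizer
    return count_nouns(Tokenizer().tokenize(text))
```

With Janome installed, calling `demo()` on the sample sentence should count もも twice and すもも and うち once each, matching the noun lines in the output shown earlier.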
Special thanks to @ikawaha, @takuyaa, @nakagami and @janome_oekaki.
Copyright(C) 2015-2025, Tomoko Uchida. All rights reserved.