refactor!: Remove DictionaryFetcher and DictionaryStore, and move to Tokenizer #397
Purpose
Decouple jpreprocess and lindera, providing users with a way to use tokenizers other than lindera.
Main Breaking Changes
`jpreprocess` crate:
- The type parameter of `JPreprocess` has been changed from `DictionaryFetcher` to `Tokenizer`. If you are using `JPreprocess<DefaultFetcher>`, replace it with `JPreprocess<DefaultTokenizer>`.
- `JPreprocess::with_dictionary_fetcher` has been removed. For advanced use cases that previously required `with_dictionary_fetcher`, use `from_tokenizer` instead, though some modifications are required.

`jpreprocess-dictionary` crate:
- The `DictionaryFetcher` and `DictionaryStore` traits have been removed. The `Tokenizer` and `Token` traits now serve a similar purpose, allowing precise control over dictionary loading behavior, but they require additional implementation for the tokenization step.
- `DefaultFetcher` has been removed. `DefaultTokenizer` provides almost the same functionality but does not detect older dictionaries.

`jpreprocess-njd` crate:
- `NJD::from_tokens` has been changed to accept any type implementing the `Token` trait, and it no longer requires `DictionaryFetcher` as an argument.
- The `NJDNode::load` method now accepts `WordEntry` by immutable borrow.
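
To illustrate the shape of the decoupling described above, here is a minimal, self-contained sketch. It does not use the real crates: the `Token` and `Tokenizer` traits, `WordEntry`, `Node`, `Preprocessor`, `nodes_from_tokens`, and `WhitespaceTokenizer` below are all hypothetical stand-ins, and their signatures are assumptions rather than the actual API introduced by this PR. Only the general idea is taken from the description: the preprocessor is generic over a tokenizer, node construction accepts any token type, and the entry is passed by immutable borrow.

```rust
// Hypothetical stand-in for a dictionary entry (the real `WordEntry` differs).
#[derive(Debug, Clone, PartialEq)]
struct WordEntry {
    pos: String,
}

// Hypothetical stand-in for the `Token` trait: each token can produce
// its surface text together with a dictionary entry.
trait Token {
    fn fetch(&mut self) -> Result<(String, WordEntry), String>;
}

// Hypothetical stand-in for the `Tokenizer` trait: text in, tokens out.
trait Tokenizer {
    type Token: Token;
    fn tokenize(&self, text: &str) -> Result<Vec<Self::Token>, String>;
}

// Hypothetical stand-in for an NJD node.
#[derive(Debug, PartialEq)]
struct Node {
    surface: String,
    pos: String,
}

impl Node {
    // The entry is taken by immutable borrow, so the caller keeps ownership.
    fn load(surface: &str, entry: &WordEntry) -> Node {
        Node {
            surface: surface.to_string(),
            pos: entry.pos.clone(),
        }
    }
}

// Accepts any token type; no dictionary fetcher argument is needed.
fn nodes_from_tokens<T: Token>(tokens: Vec<T>) -> Result<Vec<Node>, String> {
    let mut nodes = Vec::new();
    for mut token in tokens {
        let (surface, entry) = token.fetch()?;
        nodes.push(Node::load(&surface, &entry));
    }
    Ok(nodes)
}

// The preprocessor is generic over the tokenizer instead of a fetcher.
struct Preprocessor<T: Tokenizer> {
    tokenizer: T,
}

impl<T: Tokenizer> Preprocessor<T> {
    fn from_tokenizer(tokenizer: T) -> Self {
        Self { tokenizer }
    }

    fn text_to_nodes(&self, text: &str) -> Result<Vec<Node>, String> {
        let tokens = self.tokenizer.tokenize(text)?;
        nodes_from_tokens(tokens)
    }
}

// A toy tokenizer that is not backed by lindera: it splits on whitespace
// and tags every token with a placeholder part of speech.
struct WhitespaceToken {
    surface: String,
}

impl Token for WhitespaceToken {
    fn fetch(&mut self) -> Result<(String, WordEntry), String> {
        let entry = WordEntry { pos: "unknown".to_string() };
        Ok((self.surface.clone(), entry))
    }
}

struct WhitespaceTokenizer;

impl Tokenizer for WhitespaceTokenizer {
    type Token = WhitespaceToken;

    fn tokenize(&self, text: &str) -> Result<Vec<WhitespaceToken>, String> {
        Ok(text
            .split_whitespace()
            .map(|s| WhitespaceToken { surface: s.to_string() })
            .collect())
    }
}
```

In this sketch, `Preprocessor::from_tokenizer(WhitespaceTokenizer)` builds a preprocessor with no lindera dependency, and `text_to_nodes` turns input text into nodes through whatever tokenizer was supplied. The real `Tokenizer` and `Token` traits may use different method names, lifetimes, and error types, so consult the crate documentation when migrating.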