phenylshima
Member

Purpose

Decouple jpreprocess and lindera, providing users with a way to use tokenizers other than lindera.

Main Breaking Changes

jpreprocess Crate:

  • The trait bound on the type parameter of JPreprocess has been changed from DictionaryFetcher to Tokenizer. If you are using JPreprocess<DefaultFetcher>, replace it with JPreprocess<DefaultTokenizer>.
  • The method JPreprocess::with_dictionary_fetcher has been removed. For advanced use cases that previously required with_dictionary_fetcher, use from_tokenizer instead. This is not a drop-in replacement, so the calling code will need some changes.

jpreprocess-dictionary Crate:

  • The DictionaryFetcher and DictionaryStore traits have been removed. The Tokenizer and Token traits now serve a similar purpose and allow precise control over dictionary loading behavior, but implementors must also provide the tokenization step.
  • DefaultFetcher has been removed. DefaultTokenizer provides almost the same functionality but does not detect older dictionaries.
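
The decoupling described above can be pictured with a small, self-contained sketch. Every name below (`Token`, `Tokenizer`, `WhitespaceTokenizer`, `SimpleToken`, `Preprocessor`, `surfaces`) is a hypothetical stand-in written for illustration. These are not the actual definitions or signatures in jpreprocess, and the real traits likely carry more than a surface string. The point is only that a front end generic over a tokenizer trait no longer needs to know about lindera.

```rust
// Hypothetical stand-ins for illustration only; NOT the jpreprocess API.

/// One unit of tokenizer output.
pub trait Token {
    /// Surface text of the token.
    fn surface(&self) -> &str;
}

/// Anything that can split text into tokens.
pub trait Tokenizer {
    type Tok: Token;
    fn tokenize(&self, text: &str) -> Vec<Self::Tok>;
}

/// A toy tokenizer that splits on whitespace instead of using lindera.
pub struct WhitespaceTokenizer;

/// The token type produced by `WhitespaceTokenizer`.
pub struct SimpleToken(String);

impl Token for SimpleToken {
    fn surface(&self) -> &str {
        &self.0
    }
}

impl Tokenizer for WhitespaceTokenizer {
    type Tok = SimpleToken;

    fn tokenize(&self, text: &str) -> Vec<SimpleToken> {
        text.split_whitespace()
            .map(|s| SimpleToken(s.to_string()))
            .collect()
    }
}

/// A front end that is generic over the tokenizer rather than over a
/// dictionary fetcher.
pub struct Preprocessor<T: Tokenizer> {
    tokenizer: T,
}

impl<T: Tokenizer> Preprocessor<T> {
    /// Build the front end from any tokenizer implementation.
    pub fn from_tokenizer(tokenizer: T) -> Self {
        Self { tokenizer }
    }

    /// Collect the surface strings of all tokens in `text`.
    pub fn surfaces(&self, text: &str) -> Vec<String> {
        self.tokenizer
            .tokenize(text)
            .iter()
            .map(|t| t.surface().to_string())
            .collect()
    }
}
```

With this shape, swapping tokenizers only changes the type parameter and the value passed to the constructor, e.g. `Preprocessor::from_tokenizer(WhitespaceTokenizer)`.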

jpreprocess-njd Crate:

  • The function signature of NJD::from_tokens has been changed to accept any type implementing the Token trait, and it no longer requires DictionaryFetcher as an argument.
  • The NJDNode::load method now accepts WordEntry by immutable borrow.
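
As a rough illustration of the two njd changes, the sketch below shows a constructor that is generic over a token trait and takes no dictionary handle, plus a `load` function that borrows its entry immutably. All names here (`Entry`, `TokenLike`, `Node`, `nodes_from_tokens`, `PairToken`) are hypothetical and simplified. The actual signatures of NJD::from_tokens and NJDNode::load are not reproduced here.

```rust
// Hypothetical stand-ins for illustration only; NOT the jpreprocess-njd API.

/// Simplified dictionary data attached to a token.
#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub reading: String,
}

/// Minimal token interface: surface text plus its dictionary entry.
pub trait TokenLike {
    fn text(&self) -> &str;
    fn entry(&self) -> Entry;
}

/// Simplified node built from one token.
#[derive(Debug, PartialEq)]
pub struct Node {
    pub text: String,
    pub reading: String,
}

impl Node {
    /// Takes the entry by immutable borrow, so the caller keeps ownership
    /// and can reuse the entry afterwards.
    pub fn load(text: &str, entry: &Entry) -> Node {
        Node {
            text: text.to_string(),
            reading: entry.reading.clone(),
        }
    }
}

/// Accepts any token type implementing `TokenLike`; no dictionary handle
/// is passed in, because each token supplies its own entry.
pub fn nodes_from_tokens<T: TokenLike>(tokens: impl IntoIterator<Item = T>) -> Vec<Node> {
    tokens
        .into_iter()
        .map(|t| {
            let entry = t.entry();
            Node::load(t.text(), &entry)
        })
        .collect()
}

/// Toy token holding a (surface, reading) pair.
pub struct PairToken(pub &'static str, pub &'static str);

impl TokenLike for PairToken {
    fn text(&self) -> &str {
        self.0
    }

    fn entry(&self) -> Entry {
        Entry {
            reading: self.1.to_string(),
        }
    }
}
```

Because `load` only borrows the entry, a caller can build several nodes from the same entry without cloning it first.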

@phenylshima phenylshima marked this pull request as ready for review February 2, 2025 07:01
@phenylshima phenylshima merged commit 73561a8 into main Feb 2, 2025
15 checks passed
@phenylshima phenylshima deleted the fetcher-to-tokenizer branch February 2, 2025 09:26