Description
Hi Andrew,
I was trying the keyword extraction API with TF-IDF. The code is:
bert_kws = extract_kws(
    method="TFIDF",  # "BERT", "LDA", "TFIDF", "frequency"
    bert_st_model="xlm-r-bert-base-nli-stsb-mean-tokens",
    text_corpus=corpus_no_ngrams,  # automatically tokenized if using LDA
    input_language=input_language,
    output_language=None,  # allows the output to be translated
    num_keywords=num_keywords,
    num_topics=num_topics,
    corpuses_to_compare=None,  # for TFIDF
    ignore_words=ignore_words,
    prompt_remove_words=True,  # check words with user
    show_progress_bar=True,
    batch_size=5,
)
This raises the following error:
AssertionError: TFIDF requires another text corpus to be passed to the corpuses_to_compare argument.
Why does TF-IDF keyword extraction require a second corpus to compare against? Thanks!
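For context, here is a minimal sketch of what I understand about TF-IDF with a single document, which may be related to the assertion. It uses the plain textbook formula `idf = log(N / df)` with no smoothing. The `tfidf` helper and the example sentences are my own illustration, not kwx code, and I don't know whether kwx computes it this way.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: score} dict per document, using idf = log(N / df)."""
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores.append(
            {
                term: (count / len(tokens)) * math.log(n_docs / df[term])
                for term, count in tf.items()
            }
        )
    return scores


# One document: every term has df == N == 1, so idf == log(1) == 0 everywhere.
single = tfidf(["the cat sat on the mat"])

# Two documents: only terms unique to a document keep a positive score.
pair = tfidf(["the cat sat on the mat", "the dog sat on the log"])
keywords_doc0 = sorted(term for term, score in pair[0].items() if score > 0)
```

With one document every score comes out as zero, so there is nothing to rank. With two documents, only the words that appear in the first sentence but not the second get a positive score for the first one.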