A library of translation-based text similarity measures.
To learn more about how these measures work, have a look at Jannis' blog post. Also, read our paper, "NMTScore: A Multilingual Analysis of Translation-based Text Similarity Measures".
- Requires Python >= 3.7 and PyTorch
pip install nmtscore
- Extra requirements for the Prism model:
pip install nmtscore[prism]
Instantiate a scorer and start scoring short sentence pairs.
from nmtscore import NMTScorer
scorer = NMTScorer()
scorer.score("This is a sentence.", "This is another sentence.")
# 0.45192727655379844
The library implements three different measures:
# Translation cross-likelihood (default)
scorer.score_cross_likelihood(a, b, tgt_lang="en", normalize=True, both_directions=True)
# Direct translation probability
scorer.score_direct(a, b, a_lang="en", b_lang="en", normalize=True, both_directions=True)
# Pivot translation probability
scorer.score_pivot(a, b, a_lang="en", b_lang="en", pivot_lang="en", normalize=True, both_directions=True)
The score method is a shortcut for cross-likelihood.
The scoring methods also accept lists of strings:
scorer.score(
["This is a sentence.", "This is a sentence.", "This is another sentence."],
["This is another sentence.", "This sentence is completely unrelated.", "This is another sentence."],
)
# [0.4519273529250307, 0.13127038689469997, 1.0000000000000102]
The sentences in the first list are compared element-wise to the sentences in the second list.
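To make the element-wise pairing concrete, here is a minimal sketch that does not use the library at all. The helper `score_pairs` and the stand-in scorer `toy_overlap` are hypothetical names introduced for illustration only; `toy_overlap` is a simple token-overlap ratio and is not one of the NMTScore measures. The point is only that the i-th sentence of the first list is compared with the i-th sentence of the second list, so both lists are expected to have the same length.

```python
from typing import Callable, List, Sequence


def toy_overlap(x: str, y: str) -> float:
    """Hypothetical stand-in scorer: Jaccard overlap of lowercased tokens.

    This is NOT an NMTScore measure; it only makes the sketch runnable
    without downloading a translation model.
    """
    tokens_x, tokens_y = set(x.lower().split()), set(y.lower().split())
    if not tokens_x and not tokens_y:
        return 1.0
    return len(tokens_x & tokens_y) / len(tokens_x | tokens_y)


def score_pairs(
    a: Sequence[str],
    b: Sequence[str],
    score_fn: Callable[[str, str], float],
) -> List[float]:
    """Score a[i] against b[i] for every index i (element-wise pairing)."""
    if len(a) != len(b):
        raise ValueError(f"Expected lists of equal length, got {len(a)} and {len(b)}")
    return [score_fn(x, y) for x, y in zip(a, b)]


a = ["This is a sentence.", "This is a sentence.", "This is another sentence."]
b = [
    "This is another sentence.",
    "This sentence is completely unrelated.",
    "This is another sentence.",
]
scores = score_pairs(a, b, toy_overlap)  # one score per pair, in input order
```

The result has one entry per input pair, in the same order as the inputs. It is not a full similarity matrix between all sentences of both lists.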
The default batch size is 8. An alternative batch size can be specified as follows (independently for translating and scoring):
scorer.score_direct(
a, b, a_lang="en", b_lang="en",
score_kwargs={"batch_size": 16}
)
scorer.score_cross_likelihood(
a, b,
translate_kwargs={"batch_size": 16},
score_kwargs={"batch_size": 16}
)
This library currently supports three NMT models:
- m2m100_418M
- m2m100_1.2B
- prism
By default, the leanest model (m2m100_418M) is loaded. The main results in the paper are based on the Prism model.
scorer = NMTScorer("m2m100_418M", device=None) # default
scorer = NMTScorer("m2m100_1.2B", device=None)
scorer = NMTScorer("prism", device=None)
It can make sense to cache the translations and scores if they are needed repeatedly, e.g. in reference-based evaluation.
scorer.score_direct(
a, b, a_lang="en", b_lang="en",
score_kwargs={"use_cache": True} # default: False
)
scorer.score_cross_likelihood(
a, b,
translate_kwargs={"use_cache": True}, # default: False
score_kwargs={"use_cache": True} # default: False
)
Activating this option will create an SQLite database in the ~/.cache directory. The directory can be overridden via the NMTSCORE_CACHE environment variable.
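As a rough illustration of how such an override usually behaves, the sketch below resolves a cache directory from an environment mapping. The function `resolve_cache_dir` is a hypothetical helper written for this example; the library's own lookup may be implemented differently. Only the variable name NMTSCORE_CACHE and the ~/.cache default are taken from the description above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_cache_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the cache directory, preferring NMTSCORE_CACHE if it is set.

    Hypothetical helper for illustration; not part of the nmtscore API.
    """
    env = os.environ if env is None else env
    override = env.get("NMTSCORE_CACHE")
    if override:
        # An explicit override wins over the default location.
        return Path(override).expanduser()
    # Default described in the text above: the ~/.cache directory.
    return Path.home() / ".cache"


default_dir = resolve_cache_dir({})
custom_dir = resolve_cache_dir({"NMTSCORE_CACHE": "/tmp/nmtscore-cache"})
```

In practice, exporting the variable in the shell before starting Python (or setting it in `os.environ` before the scorer is created) is the cautious choice, since it is not stated here at which point the library reads it.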
Print a version signature (à la SacreBLEU)
scorer.score(a, b, print_signature=True)
# NMTScore-cross|tgt-lang:en|model:facebook/m2m100_418M|normalized|both-directions|v0.1.0|hf4.17.0
The NMT models also provide a direct interface for translating and scoring.
from nmtscore.models import load_translation_model
model = load_translation_model("m2m100_418M")
model.translate("de", ["This is a test."])
# ["Das ist ein Test."]
model.score("de", ["This is a test."], ["Das ist ein Test."])
# [0.5148844122886658]
@article{vamvas2022nmtscore,
title={{NMTScore}: A Multilingual Analysis of Translation-based Text Similarity Measures},
author={Vamvas, Jannis and Sennrich, Rico},
journal={arXiv preprint arXiv:2204.13692},
year={2022}
}
- Code: MIT License
- Data: See data subdirectories
