This Python package computes the X-ParEval score proposed in X-Pareval: A Language-Agnostic Metric for Paraphrase Evaluation.
While working on paraphrase generation, we ran into the problem of evaluating the results of different approaches. Looking into published articles on paraphrase generation, we found a lack of standardisation and a constant use of metrics that were not quite fit for the task. Following the example of Matt Post with SacreBLEU, we decided to design an accessible, easy-to-use metric fit for the task. That is how X-ParEval came to be.
- Compute both semantic and surface similarity of sentences
- Compute the X-ParEval score
- Report all three values
- Keep track of the configuration used to obtain the values
We plan to make the code publicly available and installable through pip. In the meantime, you can install the package by following these steps:
- Clone the repository to your local computer
- Install the packages in `requirements_dev.txt`
- Run `python setup.py sdist bdist_wheel` in the main directory
- Move to the `dist` directory
- Run `pip install xpareval-0.0.1-py3-none-any.whl` from there
To use the package, first import it:
>>> import xpareval
Then create an instance of the Scorer class. The scorer loads the semantic similarity model and computes all the required scores.
>>> s = xpareval.Scorer()
Then call the `get_xpareval()` method of the scorer object. It receives two lists of sentences as arguments:
>>> xpareval_score = s.get_xpareval(['I love cookies.', 'I live in China.'], ['I like biscuits.', 'I sleep at night.'])
This will generate an XParEval object:
>>> xpareval_score
xpareval.XParEval(settings=sentence-transformers/paraphrase-xlm-r-multilingual-v1+nltk.translate.gleu_score, semantic_similarity=[0.7359247 0.2674271], surface_similarity=[0.07142857 0.07142857], xpareval=[0.73404735 0.26674488])
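If your data is stored as a list of sentence pairs rather than as two parallel lists, a small helper can split it before calling `get_xpareval()`. The sketch below is not part of the package: `score_pairs` is a hypothetical name, and it assumes only that the scorer exposes `get_xpareval()` taking two lists of sentences, as in the call above.

```python
def score_pairs(scorer, pairs):
    """Split (sentence, paraphrase) pairs into two parallel lists and score them.

    `scorer` is assumed to be an object with a get_xpareval(list, list) method,
    such as an xpareval.Scorer instance. This helper is illustrative only.
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    # zip(*pairs) transposes [(a1, b1), (a2, b2)] into (a1, a2) and (b1, b2).
    first, second = (list(column) for column in zip(*pairs))
    return scorer.get_xpareval(first, second)
```

With the sentences from the example above, this would be called as `score_pairs(s, [('I love cookies.', 'I like biscuits.'), ('I live in China.', 'I sleep at night.')])`.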
You can then easily access the different values of the score:
>>> xpareval_score.xpareval
array([0.73404735, 0.26674488], dtype=float32)
>>> xpareval_score.surface_similarity
array([0.07142857, 0.07142857], dtype=float32)
>>> xpareval_score.semantic_similarity
array([0.7359247, 0.2674271], dtype=float32)
>>> xpareval_score.settings
'sentence-transformers/paraphrase-xlm-r-multilingual-v1+nltk.translate.gleu_score'
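The settings string names `nltk.translate.gleu_score` as the surface similarity component. To give an idea of what that kind of score measures, here is a minimal standard-library sketch of a sentence-level GLEU computation: the number of matching n-grams (lengths 1 to 4) divided by the larger of the two n-gram totals. It is not the package's code; the function names and the whitespace tokenization are assumptions made for illustration, so its values will not necessarily match the ones reported above.

```python
from collections import Counter


def ngram_counts(tokens, min_len=1, max_len=4):
    """Count every n-gram of length min_len..max_len in a token list."""
    counts = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def gleu(reference_tokens, hypothesis_tokens, min_len=1, max_len=4):
    """Illustrative GLEU: matching n-grams over the larger n-gram total."""
    ref = ngram_counts(reference_tokens, min_len, max_len)
    hyp = ngram_counts(hypothesis_tokens, min_len, max_len)
    # Counter intersection keeps the minimum count of each shared n-gram.
    matches = sum((ref & hyp).values())
    total = max(sum(ref.values()), sum(hyp.values()))
    return matches / total if total else 0.0


# Whitespace tokenization is an assumption made for this example only.
score = gleu("I love cookies.".split(), "I like biscuits.".split())
```

A low value here means the two sentences share few word sequences, which is what you would expect from a paraphrase that rewords the original rather than copying it.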