chore: remove XSUM dataset from example notebook and integration tests #192
Description of changes:
This PR is a follow-up to #191 and removes the last traces of the XSUM dataset from the codebase. The integration tests that used XSUM now use Gigaword, and their expected values have been updated accordingly.
This PR also updates all of the integration tests so that `ray.shutdown()` is called between the tests for each evaluation algorithm. This cleans up resources between tests and has reduced the maximum disk usage during testing from ~18 GB to ~6 GB.

Lastly, this PR moves the initialization of the `SummarizationAccuracy` object in `test_summarization_accuracy.py` from the top of the file into the test method. This is required because code at the top level of every file runs at the very start of testing, before any tests are executed. As a result, the `BertscoreHelperModel` actor created by the `SummarizationAccuracy` object is also created right at the beginning. The first call to `ray.shutdown()` cleans up that `BertscoreHelperModel` actor, so by the time the summarization accuracy integration test runs, the actor no longer exists and the test fails.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
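
For reviewers, the ordering problem with top-level initialization can be illustrated with a small, self-contained sketch. It does not use Ray or the real test code: `FakeRuntime` and `FakeSummarizationAccuracy` are hypothetical stand-ins, and the only assumption carried over from the description is that a shutdown call discards every actor created before it.

```python
# Toy stand-in for the runtime: a registry of named "actors" that shutdown() clears.
# FakeRuntime and FakeSummarizationAccuracy are hypothetical names for illustration
# only; they are not part of Ray or of this repository.


class FakeRuntime:
    def __init__(self):
        self.actors = set()

    def create_actor(self, name):
        self.actors.add(name)

    def shutdown(self):
        # Mirrors the behavior described above: every existing resource is dropped.
        self.actors.clear()


class FakeSummarizationAccuracy:
    ACTOR = "BertscoreHelperModel"

    def __init__(self, runtime):
        self.runtime = runtime
        # The helper actor is created as a side effect of construction.
        runtime.create_actor(self.ACTOR)

    def evaluate(self):
        if self.ACTOR not in self.runtime.actors:
            raise RuntimeError("helper actor no longer exists")
        return "ok"


def run_with_top_level_init():
    runtime = FakeRuntime()
    algo = FakeSummarizationAccuracy(runtime)  # "top of file": runs before any test
    runtime.shutdown()                         # cleanup after an earlier test
    return algo.evaluate()                     # raises: the actor was cleaned up


def run_with_init_inside_test():
    runtime = FakeRuntime()
    runtime.shutdown()                         # cleanup after an earlier test
    algo = FakeSummarizationAccuracy(runtime)  # created inside the test itself
    return algo.evaluate()                     # actor exists, evaluation succeeds
```

In this sketch `run_with_top_level_init()` raises `RuntimeError`, while `run_with_init_inside_test()` completes, which is the same reason the object is now constructed inside the test method.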