Make unpacking archives optional #2215

Open
anjackson wants to merge 7 commits into Yelp:master from ukwa:make-unpacking-archives-optional

Conversation

@anjackson

We are using MrJob to process WARC files, in a similar manner to the example given in the Writing Jobs guide.

For our use case, it is crucial that the .gz compressed file is not automatically decompressed before use.

This PR proposes a new setting that allows this to be controlled via an unpack_archives option passed to the MrJob runner. The option defaults to True to preserve the existing behaviour, while allowing us to set it to False when needed. We have tested this locally and it appears to work as intended.
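To make the intended semantics concrete, here is a minimal standalone sketch of a boolean flag that defaults to True and, when set to False, leaves a .gz file byte-for-byte intact. It assumes that "unpacking" means gzip decompression at staging time. The names `stage_file` and `demo` are hypothetical; this is not the mrjob implementation, nor the code in this PR.

```python
import gzip
import os
import shutil
import tempfile


def stage_file(src_path, dest_dir, unpack_archives=True):
    """Hypothetical helper (not part of mrjob): copy a file into dest_dir.

    With unpack_archives=True (the default), a .gz file is decompressed
    and the suffix dropped. With unpack_archives=False it is copied as-is.
    """
    name = os.path.basename(src_path)
    if unpack_archives and name.endswith(".gz"):
        dest_path = os.path.join(dest_dir, name[:-len(".gz")])
        with gzip.open(src_path, "rb") as src, open(dest_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
    else:
        dest_path = os.path.join(dest_dir, name)
        shutil.copyfile(src_path, dest_path)
    return dest_path


def demo():
    """Stage one gzipped file both ways and report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "example.warc.gz")
        with gzip.open(src, "wb") as f:
            f.write(b"WARC/1.0\r\n")

        unpacked_dir = os.path.join(tmp, "unpacked")
        kept_dir = os.path.join(tmp, "kept")
        os.mkdir(unpacked_dir)
        os.mkdir(kept_dir)

        unpacked = stage_file(src, unpacked_dir)  # default: decompress
        kept = stage_file(src, kept_dir, unpack_archives=False)

        # Compare the untouched copy with the original compressed bytes.
        with open(src, "rb") as a, open(kept, "rb") as b:
            identical = a.read() == b.read()

        return os.path.basename(unpacked), os.path.basename(kept), identical


if __name__ == "__main__":
    print(demo())
```

Keeping True as the default means existing jobs see no change in behaviour; only callers that explicitly opt out receive the compressed file untouched.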

I've attempted to document this new option, as per the contributing guidelines, but I'm not sure I've covered everything. Is there any other documentation I should add?

2 participants